Regular Expressions or ‘regex’ are extremely powerful and utterly invaluable.

Their primary use case is to find patterns in text. And working in cyber security, text is everywhere - inside logs, artefacts and documents. There can be an endless supply of artefacts and documents, and different vendors or products may have their own proprietary formats.

This article is going to cover the use of regular expressions with plenty of examples.

This article will not explain the meaning of each specific character (see the Glossary), nor will it cover how to open or read certain artefacts or documents to obtain plain text data, as the means to do so are endless.

Some artefacts and logs are well structured, such as syslog, Windows Event logs, IIS web server logs or even CSV and JSON. In these cases, we may want to extract subsections of fields that have already been parsed. Regex truly comes into its own, however, when we encounter unstructured or inconsistently formatted logs.
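As a rough illustration of extracting a subsection from an already parsed field, here is a minimal Python sketch. The JSON record, its field names and the message wording are all invented for this example and do not come from any real product; the pattern would need adjusting to whatever your own logs look like.

```python
import json
import re

# Hypothetical JSON log record, invented purely for illustration.
record = json.loads(
    '{"timestamp": "2024-10-22T03:39:43Z", '
    '"message": "Failed login for user alice from 203.0.113.7 port 52144"}'
)

# The JSON parser has already isolated the "message" field; the regex
# then pulls the user, source IP and port out of that free-text field.
MESSAGE_RE = re.compile(
    r"user (?P<user>\S+) from (?P<ip>\d{1,3}(?:\.\d{1,3}){3}) port (?P<port>\d+)"
)


def extract_login_details(message):
    """Return a dict of the named groups, or None if the message does not match."""
    match = MESSAGE_RE.search(message)
    return match.groupdict() if match else None


details = extract_login_details(record["message"])
print(details)
```

Named groups keep the extracted values self-describing, which helps when the same pattern is later reused across many records.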


When applying regex to a log or blob of text, I would recommend reading through the data first, or at least having a sample log open alongside your regex. As stated, some data is more forthcoming in structure (and/or supporting documentation) than others, so how easily you can understand the schema, or lack thereof, will vary.


To validate your regex, use https://regex101.com/. It explains the construction and extraction as you write the expression and has saved my skin more than once!
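Alongside an interactive tester, it can help to keep a few sample lines next to the expression and check them in code. The sketch below is one possible way to do that in Python; the sample lines are made up, and the IPv4 pattern is deliberately loose (it does not check that each octet is 255 or below).

```python
import re

# Verbose mode lets each part of the pattern carry its own comment.
IPV4_RE = re.compile(
    r"""
    \b
    (?:\d{1,3}\.){3}   # three groups of 1-3 digits, each followed by a dot
    \d{1,3}            # final group of 1-3 digits
    \b
    """,
    re.VERBOSE,
)

# Hypothetical sample lines mapped to the matches we expect from each.
samples = {
    "Accepted password for bob from 198.51.100.23 port 22": ["198.51.100.23"],
    "Connection closed by 192.0.2.10 and 192.0.2.11": ["192.0.2.10", "192.0.2.11"],
    "No address on this line": [],
}


def check_samples(pattern, expected):
    """Return (line, wanted, got) tuples for every sample that did not match as expected."""
    failures = []
    for line, want in expected.items():
        got = pattern.findall(line)
        if got != want:
            failures.append((line, want, got))
    return failures


failures = check_samples(IPV4_RE, samples)
print("all samples matched" if not failures else failures)
```

Including a line that should not match is as useful as the lines that should, since it catches patterns that are too greedy.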

Without further ado, let’s get started…

Analysis

scanme.nmap.org.nmap


SyncReporter-2024-10-22-033943.log


com.apple.Terminal.plist


la-lsaR


Security.evtx