Satisnet Ltd, Basepoint Innovation Centre, 110 Butterfield Great Marlings, Luton, Bedfordshire, LU2 8DL
+44 (0) 1582 434320

Filtering Data within Splunk


A typical business Splunk instance indexes many forms of data at high volume. Examples include Windows registry data, event logs, application web logs, Linux configuration and syslog data, and database audits.

Forwarding every available log into Splunk can greatly improve visibility. In some cases, however, a user is only interested in specific logs and does not want to index the rest. A common scenario is compliance reporting on Windows Security events, where a Splunk administrator may only need to log and report on user log-on and/or log-off activity. Any other events are not needed, so it is preferable to filter them out.

In Splunk’s back-end configuration, an administrator can modify two configuration files, props.conf and transforms.conf, to filter out unwanted data before it is indexed. This blog provides an example of how to achieve pre-index filtering in Splunk using props.conf and transforms.conf.

Props.conf and Transforms.conf

Some of the most common uses for props.conf are as follows:

  • When experiencing multiline events, props.conf can be configured for linebreaking
  • Configuration to recognise timestamps
  • Create segmentation between events
  • A way of overriding the automated host and source type matching built into Splunk
  • Advanced regex overriding based on host and source type configuration
  • Renaming source types
  • Ability to anonymise particular types of data feed such as bank card details, etc
  • Re-routing of particular events when a user may have multiple indexes
  • And many more
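
As a rough illustration of a few of these uses, the props.conf sketch below configures line breaking, timestamp recognition, and anonymisation for a single source type. The source type name my_app_log, the assumed log layout (each event starts with a YYYY-MM-DD HH:MM:SS timestamp), and the 16-digit card pattern are hypothetical and chosen only for this example.

```ini
# props.conf (illustrative sketch; stanza name and patterns are hypothetical)
[my_app_log]

# Line breaking: start a new event wherever a newline is followed by a date
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}\s

# Timestamp recognition: the timestamp sits at the very start of each event
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19

# Anonymisation: mask all but the last four digits of a 16-digit number
SEDCMD-mask_card = s/\d{12}(\d{4})/XXXXXXXXXXXX\1/g
```

Settings of this kind apply at parse time, so they affect only data indexed after the change and a restart.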

Basic search-time extractions can be defined entirely in props.conf. However, a search-time extraction that involves one or more of the following also requires transforms.conf:

  • Reuse of the same field-extracting regular expression across multiple sources, source types, or hosts
  • Application of more than one regex to the same source, source type, or host
  • Delimiter-based field extractions (they involve field-value pairs that are separated by commas, colons, semicolons, bars, or something similar)
  • Extraction of multiple values for the same field (multivalued field extraction)
  • Extraction of fields with names that begin with numbers or underscores
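
A minimal sketch of two of these cases, a delimiter-based extraction and a multivalued extraction, both referenced from props.conf through a REPORT setting. The source type my_app_log, the stanza names, and the recipient field are hypothetical.

```ini
# props.conf (hypothetical source type)
[my_app_log]
REPORT-kv = my_app_kv_pairs, my_app_recipients

# transforms.conf
# Delimiter-based: pairs separated by commas, field and value separated by '='
[my_app_kv_pairs]
DELIMS = ",", "="

# Multivalued: append each additional 'to=<address>' value to 'recipient'
# instead of discarding it
[my_app_recipients]
REGEX = to=(\S+)
FORMAT = recipient::$1
MV_ADD = true
```

Because the extraction logic lives in transforms.conf, the same stanzas can be referenced from several source, source type, or host stanzas in props.conf.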

Steps to take when filtering events

Create props.conf and transforms.conf files within the following directory:


Within props.conf:


TRANSFORMS-sec = WinEventDrop,WinEventPass

# Lists the names of the stanzas defined in transforms.conf, in the order in which the transforms should be applied.

Within transforms.conf:

# transforms.conf defines the regex that matches the events to be allowed (white-listed) or dropped (black-listed)

# To allow only a small set of events, it is easiest to first ‘drop’ all events and then ‘allow’ only the required ones

[WinEventDrop]

REGEX = .

# The regex ‘.’ matches all events

DEST_KEY = queue

FORMAT = nullQueue

# nullQueue drops all events matching the specified regex

# Now that all events have been routed to the nullQueue, the next transform overrides that routing for the events of interest

[WinEventPass]

REGEX = (?m)^EventCode=(4624|4625)[^0-9]

# The specified regex matches all events containing a line that begins with ‘EventCode=’ followed by either ‘4624’ or ‘4625’ and then a non-digit character

# Event code 4624 = An account was successfully logged on

# Event code 4625 = An account failed to log on

DEST_KEY = queue

FORMAT = indexQueue

# indexQueue indexes all events matching the specified regex
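
Put together, the two files might look like the sketch below. The props.conf stanza name [WinEventLog:Security] is an assumption (a source type commonly used for Windows Security events); substitute the source type, source, or host that applies to your own data.

```ini
# props.conf
# Assumed stanza: the source type carrying Windows Security events
[WinEventLog:Security]
TRANSFORMS-sec = WinEventDrop,WinEventPass

# transforms.conf
# Runs first: route every event to the nullQueue
[WinEventDrop]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

# Runs second: route log-on events (4624, 4625) back to the indexQueue
[WinEventPass]
REGEX = (?m)^EventCode=(4624|4625)[^0-9]
DEST_KEY = queue
FORMAT = indexQueue
```

The order matters: the transforms run in the order listed, and a later match overwrites the queue chosen by an earlier one. Listing WinEventPass before WinEventDrop would therefore send every event to the nullQueue.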

Once these configuration changes are complete, restart all Splunk services so that the changes take effect.

Ways to Check If Filtering Is Taking Effect

  • Assess the license volume on the Splunk interface; this should start to reduce
  • Check the indexes on the Splunk interface for the number of events being indexed; this should start to reduce
  • Run a Splunk search across all events being indexed into the source type of interest; only the allowed events should appear among newly indexed data
