Developing a Parser

Detailed guide on implementing the interface function and mapping results.

requirements.txt File

This file stores the dependencies used by the parser. Once the parser is created, the system creates a specific Python virtual environment for the parser and installs all the dependencies in requirements.txt to that virtual environment.

The virtual environment prevents dependency conflicts between libraries used by different parsers.

configuration.json File

This is an optional file that stores the parser configuration and metadata in JSON format for use when the parser is imported again into the system. By default, you do not need to create one.

The following is an example of a configuration.json file created by the system:

{
  "name": "BrowserHistory",
  "description": "This parse the browser history artifacts.",
  "author": "SandsBytes",
  "version": "1.0.1",
  "parser_type": "BrowserUsage",
  "identification_field": "File_Attributes",
  "id": "6d09b702-0e4a-47c8-89b5-8087e03a74f9",
  "identification_values": [
    "name:\"WebCacheV01.dat\" OR name:\"History\" OR name:\"places.sqlite\""
  ]
}

The interface.py File

The system executes a function named interface in this file and passes several parameters:

Parameter      Description
parser         The name of the parser.
input_file     Full path to the artifact file being processed.
output_file    Full path where the results should be written.
utils          Utility functions (defaults to None in the function definition).

Implementation Example

The interface function must parse the provided input_file artifact and write the results as JSONL to output_file (one JSON record per line).

Example output:

{"path": "C:\\windows\\temp\\a.exe", "size": "500", "user": "saleh"}
{"path": "C:\\windows\\temp\\b.exe", "size": "100", "user": "ahmid"}

Tip

Always use try/except in your script so that failures do not crash the system.

import json
import sys

def interface(parser, input_file, output_file, utils=None):
    try:
        # Parse input_file here; this placeholder record stands in for real results.
        parsed_data = {"path": "C:\\temp\\example.exe", "size": "500", "user": "saleh"}
        # Write JSONL to output_file: one JSON record per line.
        with open(output_file, "w", encoding="utf-8") as ofile:
            ofile.write(f"{json.dumps(parsed_data)}\n")
    except Exception as e:
        # Report the failing line number and keep the original exception as the cause.
        _, exc_obj, exc_tb = sys.exc_info()
        msg = f"Exception: {exc_obj}, Line No. {exc_tb.tb_lineno}"
        raise Exception(msg) from e
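
The example above writes a single placeholder record. A real parser usually emits many records, so the following is a minimal sketch of streaming them to output_file one line at a time. The helper names write_jsonl and parse_artifact, and the tab-separated input format, are hypothetical and only serve the illustration.

```python
import json


def parse_artifact(input_file):
    """Hypothetical parser: reads tab-separated 'path<TAB>size<TAB>user' lines."""
    with open(input_file, "r", encoding="utf-8", errors="replace") as ifile:
        for line in ifile:
            parts = line.rstrip("\n").split("\t")
            if len(parts) == 3:
                path, size, user = parts
                yield {"path": path, "size": size, "user": user}


def write_jsonl(records, output_file):
    """Write each record as one JSON object per line; return the record count."""
    count = 0
    with open(output_file, "w", encoding="utf-8") as ofile:
        for record in records:
            ofile.write(json.dumps(record) + "\n")
            count += 1
    return count


def interface(parser, input_file, output_file, utils=None):
    try:
        write_jsonl(parse_artifact(input_file), output_file)
    except Exception as exc:
        # Re-raise with context so the failure is reported instead of passing silently.
        raise Exception(f"{parser}: failed to parse {input_file}: {exc}") from exc
```

Using a generator keeps memory use flat for large artifacts, because records are written as they are produced rather than collected in a list first.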

Note

You must define the interface function inside the interface.py file.

Timestamp

Each record should include an @timestamp field of type date in ISO 8601 format (yyyy-mm-ddThh:mm:ss); the main timeline is built from this field. If a record has no @timestamp, it is stored with the default value 1700-01-01T00:00:00.000000.
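
As a sketch of producing that format, the snippet below converts a Unix epoch value to an ISO 8601 string and attaches it to a record. It assumes the artifact stores times as epoch seconds and that UTC is the desired time zone; neither is stated in this guide, and the helper names are hypothetical.

```python
from datetime import datetime, timezone


def to_iso8601(epoch_seconds):
    """Convert Unix epoch seconds to yyyy-mm-ddThh:mm:ss (UTC is an assumption)."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S")


def add_timestamp(record, epoch_seconds):
    """Return a copy of record with an @timestamp field added."""
    stamped = dict(record)
    stamped["@timestamp"] = to_iso8601(epoch_seconds)
    return stamped


record = add_timestamp({"path": "C:\\temp\\a.exe"}, 86400)
print(record["@timestamp"])  # 1970-01-02T00:00:00
```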

Data types

Use a single data type for each field across all records; a record whose field type conflicts with earlier records may be dropped entirely. For example, when path is indexed as a string, {"path": "C:\\windows\\temp\\b.exe"} indexes correctly, but {"path": {"filename": "a.exe", "folder_path": "C:\\windows\\temp\\"}} fails because path is an object instead of a string.
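
One way to guard against this inside a parser is to normalize the field before writing the record. The sketch below forces path to a string; the function name normalize_path and the rule of joining folder_path and filename are assumptions made for illustration.

```python
def normalize_path(record):
    """Return a copy of record whose 'path' field is always a string."""
    normalized = dict(record)
    path = normalized.get("path")
    if isinstance(path, dict):
        # Join the folder and file name into the single string form.
        folder = path.get("folder_path", "")
        filename = path.get("filename", "")
        normalized["path"] = f"{folder}{filename}"
    elif path is not None and not isinstance(path, str):
        # Any other non-string value is converted with str().
        normalized["path"] = str(path)
    return normalized


mixed = {"path": {"filename": "a.exe", "folder_path": "C:\\windows\\temp\\"}}
print(normalize_path(mixed))  # {'path': 'C:\\windows\\temp\\a.exe'}
```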

Lists

Avoid a field of type list with a large number of elements (e.g. {"files": ["file1.exe", "file2.exe", "file3.exe", ...]}). Instead, split into multiple records, each with a single element: {"files": "file1.exe"}, {"files": "file2.exe"}, {"files": "file3.exe"}, and so on.
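
The split described above can be sketched as a small generator that emits one record per list element and copies the remaining fields into each. The name explode_field is hypothetical.

```python
def explode_field(record, field):
    """Yield one copy of record per element of record[field].

    Records where the field is missing or not a list are yielded unchanged.
    An empty list yields no records at all.
    """
    value = record.get(field)
    if not isinstance(value, list):
        yield dict(record)
        return
    for item in value:
        row = dict(record)
        row[field] = item
        yield row


source = {"host": "pc1", "files": ["file1.exe", "file2.exe", "file3.exe"]}
for row in explode_field(source, "files"):
    print(row)
```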

ECS Mapping (ecs_mapper.yaml)

The Elastic Common Schema (ECS) defines a standard schema for your results across all parsed records. ECS mappings are stored in ecs_mapper.yaml and are used to normalize parser output into ECS fields.

The following sections describe the YAML specification used in the ECS mapper.

<mapper>

Mapper name. You can create multiple mappers in a single ecs_mapper.yaml file with different mapper names.

<mapper>

Example: Chrome_Mapper, Firefox_Mapper.

<condition>

The mapper is applied only to records that match the provided Lucene query.

<condition>

Example: event.module:"RecycleBin".

<categories>

One or more of: api, authentication, configuration, database, driver, email, file, host, iam, intrusion_detection, library, malware, network, package, process, registry, session, threat, vulnerability, web.

<categories>

Example: ["file"].

<types>

One or more of: access, admin, allowed, change, connection, creation, deletion, denied, end, error, group, indicator, info, installation, protocol, start, user.

<types>

Example: ["deletion"].
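
Because both lists are closed sets, a mapper can be checked before it is imported. The following is a minimal sketch of such a check; the allowed values are copied from the lists above, while the function name invalid_values is hypothetical and not part of the system.

```python
ALLOWED_CATEGORIES = {
    "api", "authentication", "configuration", "database", "driver", "email",
    "file", "host", "iam", "intrusion_detection", "library", "malware",
    "network", "package", "process", "registry", "session", "threat",
    "vulnerability", "web",
}

ALLOWED_TYPES = {
    "access", "admin", "allowed", "change", "connection", "creation",
    "deletion", "denied", "end", "error", "group", "indicator", "info",
    "installation", "protocol", "start", "user",
}


def invalid_values(values, allowed):
    """Return the entries of values that are not in the allowed set."""
    return [value for value in values if value not in allowed]


print(invalid_values(["file"], ALLOWED_CATEGORIES))          # []
print(invalid_values(["deletion", "removed"], ALLOWED_TYPES))  # ['removed']
```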

<action>

Describes the action of the event (more specific than event.category).

<action>

Example: group-add, process-started, file-created.

<message>

Message describing the event. Placeholders use Python string.Template syntax (${name}).

<message>

Example: File ${Data.Name} deleted.
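
To see how such a placeholder resolves, the sketch below renders the example message against a nested record. Note that the standard library's string.Template does not accept dots in placeholder names by default, so this sketch assumes a subclass with a wider braceidpattern and a flattened record. How the system itself resolves dotted names is not documented here, and DottedTemplate, flatten, and render are hypothetical names.

```python
from string import Template


class DottedTemplate(Template):
    """Template variant whose ${...} placeholders may contain dots."""
    braceidpattern = r"[_a-z][_a-z0-9.]*"


def flatten(record, prefix=""):
    """Flatten nested dicts into dotted keys, e.g. {'Data.Name': 'a.exe'}."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def render(template, record):
    """Fill placeholders from record; unknown placeholders are left untouched."""
    return DottedTemplate(template).safe_substitute(flatten(record))


print(render("File ${Data.Name} deleted.", {"Data": {"Name": "a.exe"}}))
# File a.exe deleted.
```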

<field>: "<value>"

Inserts a field into the record. <field> can be a top-level field (e.g. Data) or a nested field (e.g. file.name). <value> is any string; placeholders use Python string.Template syntax.

<field>: "<value>"

Example: file.type: "file", file.path: "${Data.Path}", process.executable: '${Data.FolderPath}${Data.ExplorerFileName}'.

<field> with source and expression

Inserts a computed field. <field-source> (a Python string.Template string) is resolved first, and the Python expression <field-expression> is then evaluated with the result available as source.

<field>:
  source: "<field-source>"
  expression: "<field-expression>"

Example: file.drive_letter: source: "${Data.Path}", expression: "source.split(':')[0].upper()".
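
The effect of the example expression can be reproduced in plain Python. The sketch below assumes the expression is evaluated with the resolved source string bound to the name source. Using eval for this is an assumption about the mechanism, and an empty __builtins__ is not a real sandbox.

```python
def apply_expression(source, expression):
    """Evaluate expression with the resolved source string bound to 'source'."""
    scope = {"source": source}
    # Built-ins are withheld here; string methods on 'source' still work.
    return eval(expression, {"__builtins__": {}}, scope)


drive = apply_expression("c:\\users\\saleh\\a.txt", "source.split(':')[0].upper()")
print(drive)  # C
```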

<field> with source, expression, type, and delimiter

Same as above, but for list-valued fields: set type to list and define delimiter. The system splits the expression result on the delimiter to produce the list.

<field>:
  source: "<field-source>"
  expression: "<field-expression>"
  type: <field-type>
  delimiter: "<field-delimiter>"

Example: file.attributes with type: list, delimiter: "|".
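
A minimal sketch of the splitting step follows. The function name finalize_value and the sample attribute string are hypothetical; only the split-on-delimiter behavior comes from the description above.

```python
def finalize_value(result, field_type=None, delimiter=None):
    """Split the expression result into a list when field_type is 'list'."""
    if field_type == "list":
        if not delimiter:
            raise ValueError("a delimiter is required when type is list")
        return str(result).split(delimiter)
    return result


print(finalize_value("READONLY|HIDDEN|SYSTEM", field_type="list", delimiter="|"))
# ['READONLY', 'HIDDEN', 'SYSTEM']
```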

<field> with source, params, and expression

<field-params> passes named parameters to <field-expression>, where they are available as params.

<field>:
  source: <field-source>
  params: <field-params>
  expression: <field-expression>

Example: winlog.level with params: levels: ['LogAlways','Critical','Error','Warning','Information','Verbose'], expression: "params['levels'][int(source)]".
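
The lookup in this example can be traced with a short sketch. It assumes the expression sees source and params as local names and has access to a small set of built-ins such as int. The evaluation mechanism and the ALLOWED_BUILTINS set are assumptions, not the system's documented behavior.

```python
# Hypothetical whitelist of built-ins exposed to expressions.
ALLOWED_BUILTINS = {"int": int, "str": str, "len": len}


def apply_expression(source, expression, params=None):
    """Evaluate expression with 'source' and 'params' in scope."""
    scope = {"source": source, "params": params or {}}
    return eval(expression, {"__builtins__": ALLOWED_BUILTINS}, scope)


levels = ["LogAlways", "Critical", "Error", "Warning", "Information", "Verbose"]
level = apply_expression("2", "params['levels'][int(source)]", {"levels": levels})
print(level)  # Error
```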

<field> with expression_conf

Set allow_spec_char: True to allow special characters in the value (for example, to keep \n unescaped so the text displays on multiple lines).

<field>:
  source: <field-source>
  expression: <field-expression>
  expression_conf:
    allow_spec_char: True

Example: message with expression_conf: allow_spec_char: True.

file.path with required_inputs

The field is added only if the specified fields exist in the record; otherwise it is ignored.

file.path:
  source: <field-source>
  required_inputs:
    - '<field-required-inputs>'

Example: file.path with source: "${file.directory}\\\\${file.name}", required_inputs: ['file.directory', 'file.name'].
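
The existence check can be sketched as follows. It assumes records are nested dictionaries addressed by dotted names and that "exists" means the key is present; has_field and should_add are hypothetical names.

```python
def has_field(record, dotted_name):
    """Return True if dotted_name resolves to a key in the nested record."""
    current = record
    for part in dotted_name.split("."):
        if not isinstance(current, dict) or part not in current:
            return False
        current = current[part]
    return True


def should_add(record, required_inputs):
    """The mapped field is added only when every required input is present."""
    return all(has_field(record, name) for name in required_inputs)


complete = {"file": {"directory": "C:\\temp", "name": "a.exe"}}
partial = {"file": {"directory": "C:\\temp"}}
print(should_add(complete, ["file.directory", "file.name"]))  # True
print(should_add(partial, ["file.directory", "file.name"]))   # False
```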

ecs_mapper.yaml Example

<mapper>:
  condition: <condition>
  event.category: <categories>
  event.type: <types>
  event.action: <action>
  mapping:
    message: "<message>"
    <field>: "<value>"
    <field>:
      source: "<field-source>"
      expression: "<field-expression>"