[Question] Alter JSON output #208

Open
OlafHaalstra opened this issue Jan 31, 2022 · 5 comments

Comments

@OlafHaalstra

OlafHaalstra commented Jan 31, 2022

Dear Omer,

Awesome work on this library, it is really blazing fast.

I hope you can help me with the following question about the JSON serializer. I would like to alter the JSON data that the parser outputs, and I am looking for the best way to do it.

By default it outputs something like this:

{
        "Event": {
            "EventData": {
                "Binary": null,
        ...
        "Event_attributes": {
            "xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
        }
}

I would like to append a few properties to it, e.g.:

{
        "Event": {
            "EventData": {
                "Binary": null,
        ...
        "Event_attributes": {
            "xmlns": "http://schemas.microsoft.com/win/2004/08/events/event"
        },
    "fields": {
        "host": "WIN-TEST",
        "source": "Setup.evtx",
        "time": 1623066248.0
    }
}

This should happen somewhere around the following snippet, which yields a record whose data field is already a string (produced by the into_json function):

            EvtxOutputFormat::JSON => {
                for record in parser.records_json() {
                    self.dump_record(record)?;
                }
            }

These are the solutions I could think of:

  1. Alter the string to insert the fields part.
     • Advantages
       • Easy to implement
       • Fast?
     • Disadvantages
       • Not flexible
       • Error prone
  2. Parse the record.data string to object with serde_json, alter it, and convert it to string again.
     • Advantages
       • Easy to implement
       • Flexible
       • Not prone to errors
     • Disadvantages
       • Compromises performance due to inherent inefficiency
  3. Implement own records_json function
     • Advantages
       • Fast?
       • Flexible
     • Disadvantages
       • I'm a terrible rust developer
       • Introduces a lot of code from your library which will be outdated
  4. insert even better solution here
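
As a rough illustration of option (1), here is a minimal sketch that uses only the standard library. The names append_fields and escape_json and their parameters are hypothetical and not part of the evtx crate. It assumes the input is a serialized JSON object (such as record.data) and splices a "fields" object in front of the final closing brace.

```rust
/// Escape a string so it can sit inside a JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters must be \u-escaped in JSON.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Splice a `"fields"` object in front of the closing brace of `record_json`.
/// Returns `None` if the input does not look like a JSON object or if
/// `time` cannot be written as a JSON number. (Hypothetical helper.)
fn append_fields(record_json: &str, host: &str, source: &str, time: f64) -> Option<String> {
    if !time.is_finite() {
        return None; // NaN and infinity are not valid JSON numbers
    }
    let trimmed = record_json.trim();
    if !trimmed.starts_with('{') {
        return None;
    }
    // Drop the final `}` and any whitespace in front of it.
    let body = trimmed.strip_suffix('}')?.trim_end();
    // An empty object needs no comma before the new key.
    let sep = if body == "{" { "" } else { "," };
    Some(format!(
        "{}{}\"fields\":{{\"host\":\"{}\",\"source\":\"{}\",\"time\":{:?}}}}}",
        body,
        sep,
        escape_json(host),
        escape_json(source),
        time
    ))
}
```

Calling append_fields(&record.data, "WIN-TEST", "Setup.evtx", 1623066248.0) for each record would give the shape shown above without re-parsing. It remains error prone in the sense listed: the function trusts that the input is valid JSON and only checks the outer braces.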

I'm asking for your advice because I wasn't able to figure out how to do this properly in Rust. Performance is also important to me, so I want to find a very efficient solution.

For solution (3) I already tried to implement something, but it doesn't work. Maybe you can provide some guidance, or you might even have a much better solution in mind.

// Stable shim until https://github.com/rust-lang/rust/issues/59359 is merged.
// Taken from proposed std code.
pub trait ReadSeek: Read + Seek {
    fn tell(&mut self) -> io::Result<u64> {
        self.seek(SeekFrom::Current(0))
    }
    fn stream_len(&mut self) -> io::Result<u64> {
        let old_pos = self.tell()?;
        let len = self.seek(SeekFrom::End(0))?;

        // Avoid seeking a third time when we were already at the end of the
        // stream. The branch is usually way cheaper than a seek operation.
        if old_pos != len {
            self.seek(SeekFrom::Start(old_pos))?;
        }

        Ok(len)
    }
}

impl<T: Read + Seek> ReadSeek for T {}

pub struct JsonSerialize<'a, T: ReadSeek> {
    settings: ParserSettings,
    parser: &'a mut EvtxParser<T>,
}


impl<T: ReadSeek> JsonSerialize<'_, T> {

    /// Return an iterator over all the records.
    /// Records will be JSON-formatted.
    pub fn records_json(
        &mut self,
    ) -> impl Iterator<Item = Result<SerializedEvtxRecord<String>, EvtxError>> + '_ {
        EvtxParser::serialized_records(self.parser, |record| record.and_then(|record| self.into_json(record)))
    }

    /// Consumes the record and parse it, producing a JSON serialized record.
    fn into_json(self, record: EvtxRecord) -> Result<SerializedEvtxRecord<String>, EvtxError> {
        let indent = self.settings.should_indent();
        let mut record_with_json_value = EvtxRecord::into_json_value(record)?;

        let data = if indent {
            serde_json::to_string_pretty(&record_with_json_value.data)
                .map_err(SerializationError::from)?
        } else {
            serde_json::to_string(&record_with_json_value.data).map_err(SerializationError::from)?
        };

        Ok(SerializedEvtxRecord {
            event_record_id: record_with_json_value.event_record_id,
            timestamp: record_with_json_value.timestamp,
            data,
        })
    }
}
@omerbenamram
Owner

omerbenamram commented Jan 31, 2022

I would probably use jq for this. https://stackoverflow.com/questions/49632521/how-to-add-a-field-to-a-json-object-with-the-jq-command.

It can also handle streams if that's an issue https://stackoverflow.com/questions/62825963/improving-performance-when-using-jq-to-process-large-files.

@OlafHaalstra
Author

Preferably I want to have it baked into the code. Not sure where to start. Running into problems with option (2): apparently renaming fields is not trivial.

Replacing values is quite easy with: *v.get_mut("name").unwrap() = json!("Alice");
As well as adding something:

    let new_data = r#"{"name":"Alice"}"#;
    let new_value: JSONValue = serde_json::from_str(new_data)?;
    v["new"] = new_value;
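
On renaming: serde_json::Map (the type behind Value::as_object_mut) has remove and insert methods with the same shape as the standard library maps, so a rename can be written as "remove the old key, insert the value under the new key". The sketch below shows the pattern on std's BTreeMap to stay dependency-free; rename_key is a hypothetical helper name, not something from serde_json or evtx.

```rust
use std::collections::BTreeMap;

/// Rename `from` to `to`, keeping the value.
/// Returns `false` and leaves the map untouched when `from` is absent.
/// An existing entry under `to` is overwritten.
fn rename_key<V>(map: &mut BTreeMap<String, V>, from: &str, to: &str) -> bool {
    match map.remove(from) {
        Some(value) => {
            map.insert(to.to_string(), value);
            true
        }
        None => false,
    }
}
```

With serde_json the same two calls would be applied to the map returned by v.as_object_mut(), assuming v is a JSON object.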

@forensicmatt
Contributor

forensicmatt commented Jan 31, 2022

@OlafHaalstra, you will want to create a custom tool around the evtx library and do something like this:

    let mut evtx_parser = match EvtxParser::from_path(path) {
        Ok(p) => p.with_configuration(parser_settings),
        Err(e) => {
            eprintln!("Error handling {}; {}", path.display(), e);
            return;
        }
    };

    for result in evtx_parser.records_json_value() {
        let record = match result {
            Ok(r) => r,
            Err(e) => {
                eprintln!("Error serializing event record: {}", e);
                continue;
            }
        };

        let mut json_value = record.data;
        json_value["source_file"] = json!(path.to_string_lossy());

        println!("{}", json_value);
    }

I am actually planning to make a YouTube video this week that will showcase just this along with things like recursing and parsing files in parallel. Subscribe and hit the bell so it will alert you when this video comes out (https://www.youtube.com/channel/UCudIWnSPimNaqMyGoKbaneQ)

@forensicmatt
Contributor

> Preferably I want to have it baked into the code. Not sure where to start. Running into problems with option (2): apparently renaming fields is not trivial.
>
> Replacing values is quite easy with: *v.get_mut("name").unwrap() = json!("Alice"); As well as adding something:
>
>     let new_data = r#"{"name":"Alice"}"#;
>     let new_value: JSONValue = serde_json::from_str(new_data)?;
>     v["new"] = new_value;

Baking this into the library is not a good idea. It's better to augment the data after you have parsed the raw data, since how you structure metadata around the parsed entry is a matter of personal preference.

@forensicmatt
Contributor

@OlafHaalstra I made a video that I think will answer your question on how to do this. It also gives you an example of how to create a CLI around this library and tweak the JSON values. https://www.youtube.com/watch?v=yVeCAMQ5fZo


3 participants