- a work in progress.

Python re-implementation of LogDissector.

In a nutshell, the LogDissector idea includes:

  • regular-expression-based recognition of entities in the logs - examples: dates, times, filenames, URLs
  • two things happen when a particular entity is recognized:
    • the count of times we've seen that entity is bumped in the appropriate dictionary
    • the abstract term <entity>, rather than the specific entity detail, is appended to the pattern.
      example: <date> replaces "10/20/30" in the resulting pattern.
  • as a result, each log line is abstracted into an underlying "pattern", and each pattern keeps track of the lines that match it
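
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual script: the regular expressions, the names ENTITY_REGEXES, COMBINED and dissect, and the choice of one counter per entity type keyed by the matched text are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical recognizers; the real LogDissector expressions are not shown here.
# Only non-capturing groups are used inside each expression, so that
# match.lastgroup always names the entity type that matched.
ENTITY_REGEXES = {
    "url": r"https?://\S+",
    "date": r"\b\d{1,2}/\d{1,2}/\d{2,4}\b",
    "time": r"\b\d{1,2}:\d{2}(?::\d{2})?\b",
    "filename": r"(?:/[\w.-]+)+",
}

# One combined expression with a named group per entity type.
COMBINED = re.compile(
    "|".join(f"(?P<{name}>{regex})" for name, regex in ENTITY_REGEXES.items())
)


def dissect(lines):
    """Return (entity_counts, pattern_lines) for an iterable of log lines."""
    entity_counts = defaultdict(Counter)  # entity type -> matched text -> count
    pattern_lines = defaultdict(list)     # abstract pattern -> original lines

    def record(match):
        kind = match.lastgroup
        entity_counts[kind][match.group()] += 1  # bump the count for this entity
        return f"<{kind}>"                       # abstract term goes into the pattern

    for line in lines:
        line = line.rstrip("\n")
        pattern = COMBINED.sub(record, line)
        pattern_lines[pattern].append(line)
    return entity_counts, pattern_lines


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "10/20/30 12:34:56 GET http://example.com/index.html wrote /var/log/app.log",
        "11/21/30 01:02:03 GET http://example.com/other.html wrote /var/log/app.log",
    ]
    counts, patterns = dissect(sample)
    for kind, counter in counts.items():
        print(kind, dict(counter))
    for pattern, matched in patterns.items():
        print(len(matched), pattern)
```

Both sample lines reduce to the same pattern, so the pattern dictionary ends up with a single key holding two lines. A single combined expression is used so that each line is scanned once and a placeholder such as <date> is never re-examined by a later recognizer.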

Currently, the script simply dumps the resulting dictionaries to stdout. The output step is an obvious candidate for improvement.

Topic revision: r3 - 2016.10.06 - PaulReiber