OpenLogParser, an unsupervised log parsing approach using open-source LLMs, improves accuracy, privacy, and cost-efficiency in large-scale data processing.
Approach:
- Log grouping: Clusters logs based on shared syntactic features.
- Unsupervised LLM-based parsing: Uses retrieval-augmented approach to separate static and dynamic components.
- Log template memory: Stores parsed templates for future use, minimizing LLM queries.
Results:
- Processes logs 2.7 times faster than other LLM-based parsers.
- Improves average parsing accuracy by 25% over existing parsers.
- Handles over 50 million logs from the LogHub-2.0 dataset.
- Achieves high grouping accuracy (87.2%) and parsing accuracy (85.4%).
- Outperforms other state-of-the-art parsers like LILAC and LLMParserT5Base in processing speed and accuracy.