OpenLogParser, an unsupervised log parsing approach using open-source LLMs, improves accuracy, privacy, and cost-efficiency in large-scale log processing.
Approach:
- Log grouping: Clusters logs based on shared syntactic features.
- Unsupervised LLM-based parsing: Uses a retrieval-augmented approach to separate static and dynamic log components.
- Log template memory: Stores parsed templates for reuse, minimizing repeated LLM queries (see the sketch after this list).
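The interplay between grouping, retrieval-augmented parsing, and the template memory can be illustrated with a rough Python sketch. All names here are hypothetical and not taken from the OpenLogParser codebase: the LLM call is stubbed out, and simple digit masking stands in for the actual syntactic features and retrieval step.

```python
import re
from collections import defaultdict

def syntactic_key(log_line: str) -> str:
    """Group logs by a coarse syntactic signature: token count plus tokens
    with digits masked, so lines sharing a template tend to collide."""
    tokens = log_line.split()
    masked = ["<*>" if re.search(r"\d", t) else t for t in tokens]
    return f"{len(tokens)}|{' '.join(masked)}"

def parse_with_llm(log_line: str, similar_examples: list[str]) -> str:
    """Placeholder for the retrieval-augmented LLM call: a real prompt would
    include a few similar raw logs and ask the model to mark dynamic fields
    as <*>. Here the masked form simply stands in for the returned template."""
    return " ".join("<*>" if re.search(r"\d", t) else t for t in log_line.split())

def parse_logs(log_lines: list[str]) -> dict[str, str]:
    # 1) Log grouping by shared syntactic features.
    groups: dict[str, list[str]] = defaultdict(list)
    for line in log_lines:
        groups[syntactic_key(line)].append(line)

    # 2)+3) Parse one representative per group; cache templates so repeated
    # groups never trigger another LLM query.
    template_memory: dict[str, str] = {}
    results: dict[str, str] = {}
    for key, lines in groups.items():
        if key not in template_memory:
            template_memory[key] = parse_with_llm(lines[0], lines[:3])
        for line in lines:
            results[line] = template_memory[key]
    return results

if __name__ == "__main__":
    logs = [
        "Connection from 10.0.0.1 closed after 42 ms",
        "Connection from 10.0.0.2 closed after 7 ms",
        "Disk /dev/sda1 usage at 91 percent",
    ]
    for raw, template in parse_logs(logs).items():
        print(template, "<-", raw)
```

In this toy version the first two lines share one template, so only two stubbed "LLM" calls are made for three logs; the real system applies the same cache-first idea at the scale of millions of logs.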
Results:
- Processes logs 2.7 times faster than other LLM-based parsers.
- Improves average parsing accuracy by 25% over existing parsers.
- Handles over 50 million logs from the LogHub-2.0 dataset.
- Achieves high grouping accuracy (87.2%) and parsing accuracy (85.4%).
- Outperforms other state-of-the-art parsers like LILAC and LLMParserT5Base in processing speed and accuracy.
GPU-accelerated LLMs on the Orange Pi 5, which features a Mali-G610 GPU: the authors used Machine Learning Compilation (MLC) techniques to reach 2.3 tok/sec for Llama3-8b, 2.5 tok/sec for Llama2-7b, and 5 tok/sec for RedPajama-3b. They also ran a Llama-2 13b model at 1.5 tok/sec on the 16GB version of the Orange Pi 5+.
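For context, running a quantized model through MLC LLM's OpenAI-style Python API looks roughly like the sketch below. The model identifier and quantization suffix are assumptions, and on the Orange Pi the model would need to be compiled for the Mali/OpenCL backend rather than the default desktop targets; the API surface can also differ across mlc_llm versions.

```python
from mlc_llm import MLCEngine

# Assumed model id; an Orange Pi 5 setup would use a build targeting the
# Mali-G610 (OpenCL) backend instead of CUDA/Metal.
model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"
engine = MLCEngine(model)

# Stream a completion and print tokens as they arrive.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "Summarize why log parsing matters."}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content or "", end="", flush=True)
print()

engine.terminate()
```

Timing the streamed tokens in a loop like this is one straightforward way to reproduce the tok/sec figures quoted above.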