A guided series of tutorials/notebooks to build a PDF to Podcast workflow using Llama models for text processing, transcript writing, dramatization, and text-to-speech conversion.
OpenLogParser, an unsupervised log parsing approach built on open-source LLMs, improves parsing accuracy while preserving data privacy and reducing cost in large-scale log processing.
Approach:
- Log grouping: Clusters logs based on shared syntactic features.
- Unsupervised LLM-based parsing: Uses a retrieval-augmented approach to separate the static and dynamic components of each log.
- Log template memory: Stores parsed templates for future use, minimizing LLM queries.
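The grouping-plus-memory idea above can be sketched in a few lines. This is a hypothetical illustration, not OpenLogParser's actual implementation: logs are bucketed by a masked syntactic signature, and only the first log of each bucket reaches the (expensive) LLM; later logs reuse the cached template. The `fake_llm` stand-in here simply returns the masked line as its template.

```python
import re

def signature(log_line: str) -> str:
    # Mask hex values and digit runs so logs that differ only in
    # dynamic values (IDs, counts, addresses) share one signature.
    return re.sub(r"\b0x[0-9a-fA-F]+\b|\b\d+\b", "<*>", log_line)

template_memory: dict[str, str] = {}

def parse(log_line: str, llm_parse) -> str:
    # Query the LLM only for signatures we have not seen before.
    sig = signature(log_line)
    if sig not in template_memory:
        template_memory[sig] = llm_parse(log_line)
    return template_memory[sig]

# Stand-in for a real LLM call; records how often it is invoked.
calls = []
def fake_llm(line: str) -> str:
    calls.append(line)
    return signature(line)

logs = [
    "Connected to 10.0.0.1 port 8080",
    "Connected to 10.0.0.2 port 9090",
    "Disk quota exceeded for user 1002",
]
templates = [parse(line, fake_llm) for line in logs]
```

The first two logs collapse to one signature, so three logs cost only two "LLM" calls; at scale this is what keeps query volume (and cost) down.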
Results:
- Processes logs 2.7 times faster than other LLM-based parsers.
- Improves average parsing accuracy by 25% over existing parsers.
- Handles over 50 million logs from the LogHub-2.0 dataset.
- Achieves high grouping accuracy (87.2%) and parsing accuracy (85.4%).
- Outperforms other state-of-the-art parsers such as LILAC and LLMParserT5Base in both processing speed and accuracy.
The TextWrapper class provides functionality for wrapping long pieces of text into multiple shorter lines while applying configurable initial and subsequent indents.
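Assuming this refers to Python's standard-library `textwrap.TextWrapper`, a minimal example of wrapping with distinct first-line and continuation indents looks like this:

```python
import textwrap

# Wrap to 40 columns; the first line gets a bullet, continuation
# lines get a matching two-space hanging indent.
wrapper = textwrap.TextWrapper(
    width=40,
    initial_indent="* ",
    subsequent_indent="  ",
)

text = ("The TextWrapper class breaks a long paragraph into lines "
        "no wider than the configured width.")
lines = wrapper.wrap(text)  # list of wrapped lines, indents included
```

Every returned line respects the width limit, and the indents count toward that limit, which makes this handy for bulleted output in CLIs.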