Tags: parsing*


  1. Virgil.Dev is a tool that parses GitHub repositories into structured code graphs, extracting crucial elements like functions, classes, imports, and cross-file references across ten programming languages. It differs from traditional text-based search by providing exact structural results from an indexed code graph, enabling faster and more accurate code understanding. Users can explore their code via the Model Context Protocol (MCP), an AI chat interface with built-in tools, or a dedicated CLI for local parsing and querying. Pricing tiers range from free to developer plans.
    2026-03-16 by klotz
  2. Invisible XML (ixml) is a method for treating non-XML documents as if they were XML, enabling authors to write documents and data in a format they prefer while providing XML for processes that are more effective with XML content.
  3. This post explores a new idea for parallelizing a simplified parsing task using a "stack monoid" and scan operations, potentially enabling efficient GPU implementation of parsing algorithms.
  4. Docling is a tool that parses documents and exports them to desired formats like Markdown and JSON. It supports various document formats and provides advanced PDF understanding, metadata extraction, and integration with LlamaIndex and LangChain for RAG / QA applications.
    2024-11-01 by klotz
  5. Docling is a tool that parses documents and exports them to desired formats like Markdown and JSON. It supports various document formats including PDF, DOCX, PPTX, Images, HTML, AsciiDoc, and Markdown.
    2024-11-01 by klotz
  6. This post discusses new techniques for parsing and searching PDFs, focusing on turning them into a hierarchical structure for RAG search. The approach generates chunks dynamically for each search and sends headers and sub-headers to the language model along with the relevant chunks.
    2024-06-27 by klotz
  7. The llmsherpa project provides APIs to accelerate Large Language Model (LLM) projects. It includes LayoutPDFReader for PDF text parsing, smart chunking for vector search and Retrieval Augmented Generation (RAG), and table analysis. It is open source under the Apache 2.0 license.
  8. 2017-02-13 by klotz
  9. 2017-02-09 by klotz
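
The "stack monoid" idea mentioned in item 3 can be sketched roughly as follows. This is a minimal illustration, not code from the linked post: the names `StackOp`, `combine`, `token_to_op`, and `inclusive_scan` are hypothetical, the task is assumed to be simple parenthesis matching, and a sequential loop stands in for the parallel scan. What matters is that `combine` is associative, which is the property a parallel (for example, GPU) scan would rely on.

```python
from functools import reduce
from typing import List, NamedTuple, Tuple


class StackOp(NamedTuple):
    """Net effect of a run of tokens on a stack (hypothetical name)."""
    pops: int                 # pops that reach below this segment
    pushes: Tuple[int, ...]   # values this segment leaves on the stack


IDENTITY = StackOp(0, ())


def combine(a: StackOp, b: StackOp) -> StackOp:
    """Associative combination: apply segment a, then segment b."""
    if b.pops <= len(a.pushes):
        # b's pops only consume values that a pushed.
        kept = a.pushes[:len(a.pushes) - b.pops]
        return StackOp(a.pops, kept + b.pushes)
    # b pops everything a pushed and reaches further down.
    return StackOp(a.pops + b.pops - len(a.pushes), b.pushes)


def token_to_op(index: int, ch: str) -> StackOp:
    """Map one character to its stack effect (assumed toy grammar)."""
    if ch == "(":
        return StackOp(0, (index,))   # push the position of the opener
    if ch == ")":
        return StackOp(1, ())         # pop one opener
    return IDENTITY


def inclusive_scan(ops: List[StackOp]) -> List[StackOp]:
    """Sequential stand-in for a parallel prefix scan over combine."""
    out, acc = [], IDENTITY
    for op in ops:
        acc = combine(acc, op)
        out.append(acc)
    return out


text = "(a(b)c)"
ops = [token_to_op(i, ch) for i, ch in enumerate(text)]
prefix = inclusive_scan(ops)
total = reduce(combine, ops, IDENTITY)
```

In this sketch, `prefix[i].pushes` holds the positions of the parentheses still open after character `i`, and `total == IDENTITY` indicates that the input is balanced. Because `combine` is associative, the same prefix values could in principle be computed by a tree-shaped parallel scan rather than the loop shown here.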
