An open-source web crawler and minimal, real-time web search CLI: enter a query and get search results back as JSON (title, url, published_date), sorted by recency.
The article highlights eight Python libraries that can save time, reduce bugs, and simplify coding tasks.
| Library | Purpose | Key Feature |
|-----------|-----------------------------------------------------------------------|----------------------------------------------------------------------------|
| Rich | Enhance CLI output | Styling, tables, syntax-highlighted tracebacks, progress bars |
| Typer | Build CLIs quickly | Simple CLI creation using function signatures and type hints |
| Pendulum | Handle datetime operations | Time zone handling, formatting, arithmetic, and human-readable time parsing |
| Pydantic | Validate data with type hints | Automated validation, documentation, and parsing of input data |
| Faker | Generate fake data | Create realistic dummy data for testing and development |
| tqdm | Add progress bars | Monitor loop progress and spot loops that stall or never finish |
| Requests-HTML | Web scraping with JavaScript support | Parse modern web pages with JavaScript rendering |
| Loguru | Simplify logging | Easy logging configuration with levels, file rotation, and colorful output |
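For a sense of how these fit together, here is a minimal sketch (not from the article) that combines Typer, Pydantic, Faker, Rich, and Loguru into a small CLI that prints a table of fake users. The command name and fields are illustrative only.

```python
# Minimal sketch: Typer builds the CLI, Pydantic validates each record,
# Faker generates dummy data, Rich renders the table, Loguru logs progress.
from faker import Faker
from loguru import logger
from pydantic import BaseModel
from rich.console import Console
from rich.table import Table
import typer

app = typer.Typer()
console = Console()


class User(BaseModel):
    name: str
    email: str


@app.command()
def users(count: int = 5):
    """Generate and display `count` fake users."""
    fake = Faker()
    logger.info("Generating {} fake users", count)
    table = Table("Name", "Email")
    for _ in range(count):
        user = User(name=fake.name(), email=fake.email())  # validated by Pydantic
        table.add_row(user.name, user.email)
    console.print(table)


if __name__ == "__main__":
    app()
```

Running `python fake_users.py --count 10` prints a styled ten-row table; swapping in tqdm around the loop or Pendulum for timestamps follows the same pattern.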
A popular, actively maintained open-source web crawling library built for LLM and data-extraction workflows, offering structured data extraction, browser control, and markdown generation.
Browser Use is a library that enables AI agents to interact with web browsers, making websites accessible for automated tasks. It includes features for browser automation, agent memory, and various demos showcasing its capabilities.
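As a rough illustration, the sketch below follows the quickstart pattern from the project's README: an Agent is given a natural-language task and an LLM (here OpenAI via LangChain, which is an assumption, as are the task string and model name). The API evolves quickly, so check the repository for the current form.

```python
# Rough sketch of the Browser Use quickstart pattern (task and model are
# placeholders; verify against the repo's README, as the API changes often).
import asyncio

from browser_use import Agent
from langchain_openai import ChatOpenAI  # assumes OPENAI_API_KEY is set


async def main():
    agent = Agent(
        task="Find the top post on Hacker News and summarize it",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    await agent.run()  # the agent drives a real browser to complete the task


asyncio.run(main())
```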
LlamaExtract is a powerful, easy-to-use tool that allows users to extract structured data from unstructured documents with minimal effort, available through LlamaCloud’s web UI and Python SDK.
Scraperr is a self-hosted web application for scraping data from web pages using XPath. It supports queuing URLs and managing scrape elements, and provides features such as job management, user login, and integration with AI services.
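Scraperr itself is driven through its web UI rather than a Python API, but the XPath-based extraction it performs looks roughly like the standalone sketch below, which uses requests and lxml (not Scraperr's own code) to pull links from a page.

```python
# Standalone illustration of XPath scraping with requests + lxml.
# This is not Scraperr's code; it only shows the kind of extraction
# an XPath selector like //a/@href performs.
import requests
from lxml import html

response = requests.get("https://example.com", timeout=10)
tree = html.fromstring(response.content)

# Grab every link target on the page via XPath.
for href in tree.xpath("//a/@href"):
    print(href)
```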
emailFinder is a Python-based web scraping tool that extracts email addresses from a website or from a list of URLs in a file, crawling through pages and parsing their content as it goes.
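The general technique is simple enough to sketch without the tool itself: fetch each page, then pull out anything that looks like an email address with a regular expression. The sketch below (not emailFinder's actual code) shows that idea with requests and re; the URLs are placeholders.

```python
# Generic email-harvesting sketch (illustrative only, not emailFinder's code):
# fetch each URL, then extract anything matching an email-address pattern.
import re
import requests

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def extract_emails(url: str) -> set[str]:
    response = requests.get(url, timeout=10)
    return set(EMAIL_RE.findall(response.text))


urls = ["https://example.com/contact", "https://example.com/about"]
for url in urls:
    for email in sorted(extract_emails(url)):
        print(url, email)
```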
Parsera is a simple and fast Python library for scraping websites using Large Language Models (LLMs). It's designed to be lightweight and minimize token usage for speed and cost efficiency.
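Usage follows a describe-what-you-want pattern: you pass a URL and a dict mapping field names to plain-English descriptions, and the LLM fills them in. The sketch below mirrors the README's example as I recall it (field names and URL are placeholders, and an LLM API key is assumed to be configured); confirm the exact call signature against the Parsera docs.

```python
# Sketch of Parsera's prompt-style extraction (placeholders throughout;
# assumes an LLM API key such as OPENAI_API_KEY is set in the environment).
from parsera import Parsera

elements = {
    "Title": "Title of the article",
    "Date": "Publication date",
    "Author": "Name of the author",
}

scraper = Parsera()
result = scraper.run(url="https://example.com/blog/post", elements=elements)
print(result)
```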
Scrapegraph-ai is a Python library for AI-driven web scraping. Its SmartScraper pipeline extracts information from a website based on a natural-language prompt, with LLM backends such as Ollama, OpenAI, Azure, and Gemini handling the extraction.
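A typical invocation builds a graph from a prompt, a source URL, and an LLM config. The sketch below follows the documented SmartScraperGraph pattern with a local Ollama backend; the model name, URLs, and prompt are placeholders, and the config keys may differ between versions.

```python
# Sketch of a Scrapegraph-ai SmartScraper run against a local Ollama model.
# Model, URLs, and prompt are placeholders; config keys vary by version.
from scrapegraphai.graphs import SmartScraperGraph

graph_config = {
    "llm": {
        "model": "ollama/llama3",
        "temperature": 0,
        "base_url": "http://localhost:11434",  # local Ollama server
    },
}

scraper = SmartScraperGraph(
    prompt="List every article title on the page",
    source="https://example.com/blog",
    config=graph_config,
)

print(scraper.run())
```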