The article discusses Sosse, a self-hosted web scraper that allows users to archive their favorite websites. It highlights the tool's simplicity, ease of installation via Docker, and its ability to create full HTML snapshots of web pages, including stylesheets and assets. The author integrates Sosse into their workflow for archiving articles and technical documentation, praising its minimal interface and reliability.
The author details their transition from Pocket to Karakeep, a self-hosted, open-source alternative for saving and reading articles later. They discuss the benefits of owning your data and the features of Karakeep, including RSS integration and AI-powered tagging.
Notte is an open-source browser-using agent framework designed to improve the speed, cost, and reliability of web-agent tasks through a perception layer that structures webpages for LLM consumption. It offers a full-stack framework with customizable browser infrastructure, web scripting, and scraping endpoints.
A discussion of the Clipboard API and the differences between the clipboard-read permission (required for reading the clipboard programmatically) and paste events.
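As a rough TypeScript sketch of that difference (not taken from the discussion itself): the asynchronous read path is gated by the clipboard-read permission and a user gesture, while a paste handler only sees data at the moment the user pastes.

```typescript
// Reading the clipboard programmatically: gated by the "clipboard-read"
// permission (permission-query support varies by browser) and typically
// only allowed in response to a user gesture.
async function readClipboardText(): Promise<string> {
  const status = await navigator.permissions.query({
    name: "clipboard-read" as PermissionName,
  });
  if (status.state === "denied") {
    throw new Error("clipboard-read permission denied");
  }
  return navigator.clipboard.readText();
}

// Handling a paste event: no permission prompt, but data is only
// available when the user actually pastes into the page.
document.addEventListener("paste", (event: ClipboardEvent) => {
  const text = event.clipboardData?.getData("text/plain") ?? "";
  console.log("pasted:", text);
});
```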
An exploration of different methods and designs for implementing a paste-file feature that lets users quickly copy and send files between devices.
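A common building block for such a feature, sketched here purely as an assumption-laden illustration (the `/api/paste` upload endpoint is hypothetical), is pulling File objects out of the paste event and uploading them so another device can fetch them:

```typescript
// Grab files from a paste event and upload them.
// The "/api/paste" endpoint is a hypothetical example.
document.addEventListener("paste", async (event: ClipboardEvent) => {
  const files = Array.from(event.clipboardData?.files ?? []);
  if (files.length === 0) return; // plain-text paste, ignore here

  for (const file of files) {
    const body = new FormData();
    body.append("file", file, file.name);
    await fetch("/api/paste", { method: "POST", body });
  }
});
```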
Justin Garrison demonstrates how to use a Raspberry Pi or other single-board computer to run a local Personal Data Server (PDS) for the microblogging platform Bluesky, allowing users to store and manage their own data.
This project provides an LLM Websearch Agent that uses a local SearXNG server for search and includes Python scripts and a bash script for interacting with an LLM to summarize the search results.
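The project itself ships Python and bash; purely as a TypeScript illustration of the flow (assuming SearXNG's JSON output format is enabled and a local OpenAI-compatible endpoint such as Ollama is running; URLs and model name are assumptions), the core loop amounts to a search call followed by a summarization prompt:

```typescript
// Minimal sketch: query a local SearXNG instance, then ask a local
// OpenAI-compatible LLM endpoint to summarize the top results.
async function searchAndSummarize(query: string): Promise<string> {
  const searchUrl = `http://localhost:8080/search?q=${encodeURIComponent(query)}&format=json`;
  const results = (await (await fetch(searchUrl)).json()).results ?? [];

  const context = results
    .slice(0, 5)
    .map((r: { title: string; url: string; content?: string }) =>
      `${r.title} (${r.url}): ${r.content ?? ""}`)
    .join("\n");

  const llm = await fetch("http://localhost:11434/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3",
      messages: [
        { role: "system", content: "Summarize these search results." },
        { role: "user", content: `Query: ${query}\n\n${context}` },
      ],
    }),
  });
  return (await llm.json()).choices[0].message.content;
}
```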
Scraperr is a self-hosted web application for scraping data from web pages using XPath. It supports queuing URLs and managing scrape elements, and provides features such as job management, user login, and integration with AI services.
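For a sense of what XPath-based extraction looks like (a generic sketch using the browser's built-in document.evaluate, not Scraperr's own code; the example expression is made up):

```typescript
// Collect the text of every element matched by an XPath expression.
function selectText(xpath: string, root: Document = document): string[] {
  const snapshot = root.evaluate(
    xpath,
    root,
    null,
    XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
    null,
  );
  const values: string[] = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    values.push(snapshot.snapshotItem(i)?.textContent?.trim() ?? "");
  }
  return values;
}

// Example usage: selectText('//h2[@class="product-name"]');
```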
FlowScraper is a powerful web scraper with an intuitive FlowBuilder, enabling effortless website automation and data extraction without coding. It features customizable AI actions and automatic anti-bot protection.
The crawl-delay directive is an unofficial robots.txt directive that asks crawlers to slow down so they do not overload the web server. However, support for the directive varies among search engines.
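A robots.txt entry using it looks like `User-agent: *` followed by `Crawl-delay: 10`. A crawler that chooses to honor the directive could parse the value and pause between requests, roughly as in this simplified sketch (it ignores per-user-agent grouping):

```typescript
// Fetch robots.txt and return the Crawl-delay value in seconds, if any.
// Simplified: takes the first Crawl-delay line, regardless of which
// User-agent group it belongs to.
async function getCrawlDelay(origin: string): Promise<number | null> {
  const res = await fetch(new URL("/robots.txt", origin));
  if (!res.ok) return null;
  const text = await res.text();
  const match = text.match(/^\s*crawl-delay:\s*([\d.]+)/im);
  return match ? parseFloat(match[1]) : null;
}

// A polite crawler would then sleep that long between requests:
// const delay = (await getCrawlDelay("https://example.com")) ?? 1;
// await new Promise((r) => setTimeout(r, delay * 1000));
```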