klotz: web*

  1. Notte is an open-source browser-using agent framework designed to improve the speed, cost, and reliability of web agent tasks through a perception layer that structures webpages for LLM consumption. It offers a full-stack framework with customizable browser infrastructure, web scripting, and scraping endpoints.

  2. Discussion about the Clipboard API and the differences between clipboardRead and paste events.

    2025-02-11 by klotz
  3. Exploring different methods and designs in implementing a Paste file feature that lets users quickly copy and send files between devices.

    2025-02-11 by klotz
  4. Justin Garrison demonstrates how to use a Raspberry Pi or other single-board computer to run a local Personal Data Server (PDS) for the microblogging platform Bluesky, allowing users to store and manage their own data.

  5. This project provides an LLM Websearch Agent using a local SearXNG server for search functionality and includes Python scripts and a bash script for interacting with an LLM to summarize search results.

    2024-11-30 by klotz
  6. Scraperr is a self-hosted web application for scraping data from web pages using XPath. It supports queuing URLs, managing scrape elements, and provides features such as job management, user login, and integration with AI services.

    2024-11-17 by klotz
  7. FlowScraper is a powerful web scraper with an intuitive FlowBuilder, enabling effortless website automation and data extraction without coding. It features customizable AI actions and automatic anti-bot protection.

  8. The crawl-delay directive is an unofficial robots.txt directive that asks crawlers to slow their crawling so they do not overload the web server. However, support for this directive varies among search engines.

    2024-10-07 by klotz
  9. Crawl4AI is an open-source web crawling tool designed to efficiently collect and curate high-quality, structured data from the web for large language model training. It handles multiple URLs simultaneously and supports various data formats, including JSON and Markdown.

    2024-09-28 by klotz
  10. Google's Martin Splitt shares how to defend against malicious bots and improve site performance. SEO expert Roger Montti explains why contacting resource providers won't work and offers alternative solutions.

    2024-08-26 by klotz
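
As a rough illustration of the crawl-delay directive described in item 8, the sketch below reads a Crawl-delay value with Python's standard urllib.robotparser module. The robots.txt text, the "example-bot" user agent, the crawl_delay_for helper, and the 1-second fallback are all assumptions made for this example; none of them come from the bookmarked article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for this example.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def crawl_delay_for(robots_txt: str, user_agent: str, default: float = 1.0) -> float:
    """Return the Crawl-delay that applies to user_agent, or default if none is set."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)
    # crawl_delay() returns None when the directive is absent or invalid.
    return float(delay) if delay is not None else default


# Seconds a polite crawler would wait between requests to this site.
delay = crawl_delay_for(ROBOTS_TXT, "example-bot")
print(delay)
```

A crawler honoring the directive would sleep for that many seconds between requests to the same host. Because the directive is unofficial, a fallback delay is still needed for sites that omit it.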
