0 bookmark(s) - Sort by: Date ↓ / Title /
A popular and actively maintained open-source web crawling library for LLMs and data extraction, offering advanced features like structured data extraction, browser control, and markdown generation.
Browser Use is a library that enables AI agents to interact with web browsers, making websites accessible for automated tasks. It includes features for browser automation, agent memory, and various demos showcasing its capabilities.
ReaderLM-v2 is a 1.5B parameter language model developed by Jina AI, designed for converting raw HTML into clean markdown and JSON with high accuracy and improved handling of longer contexts. It supports multilingual text in 29 languages and offers advanced features such as direct HTML-to-JSON extraction. The model improves upon its predecessor by addressing issues like repetition in long sequences and enhancing markdown syntax generation.
The author records a screen capture of their Gmail account and uses Google Gemini to extract numeric values from the video.
Cloudflare plans to launch a marketplace where website owners can sell AI model providers access to scrape their content. This move aims to give publishers more control over their content and monetization opportunities in the AI era.
Parsera is a simple and fast Python library for scraping websites using Large Language Models (LLMs). It's designed to be lightweight and minimize token usage for speed and cost efficiency.
Reworkd is a platform that simplifies web data extraction, using LLM code generation to help businesses scale their web data pipelines. No coding skills required.
Mariya Mansurova explores using CrewAI's multi-agent framework to create a solution for writing documentation based on tables and answering related questions.
AI Helps Make Web Scraping Faster And Easier: Scrapegraph-ai is a new tool that uses large language models (LLMs) to automate the process of web scraping and data processing.
Scrapegraph-ai is a Python library for web scraping using AI. It provides a SmartScraper class that allows users to extract information from websites using a prompt. The library uses LLM models like Ollama, OpenAI, Azure, Gemini, and others for information extraction.
First / Previous / Next / Last
/ Page 1 of 0