An open source, minimal, real-time web search CLI that searches the internet for you. Enter a query and get search results as JSON (title, url, published_date), sorted by recency.
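As a rough illustration of the output shape described above, here is a minimal Python sketch that sorts a list of results by recency. Only the field names title, url, and published_date come from the description; the sample records, the ISO 8601 date format, and the helper names parse_date and sort_by_recency are assumptions made for this example and are not part of the tool.

```python
import json
from datetime import datetime, timezone

# Hypothetical sample output; only the field names come from the description.
RAW = """[
  {"title": "A", "url": "https://example.com/a", "published_date": "2024-01-15"},
  {"title": "B", "url": "https://example.com/b", "published_date": "2024-06-02"},
  {"title": "C", "url": "https://example.com/c", "published_date": null}
]"""

# Fallback for results without a usable date, so they sort last.
OLDEST = datetime.min.replace(tzinfo=timezone.utc)


def parse_date(value):
    """Parse an ISO 8601 date string (assumed format); return None on failure."""
    try:
        dt = datetime.fromisoformat(value)
    except (TypeError, ValueError):
        return None
    # Treat naive timestamps as UTC so all values are comparable.
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def sort_by_recency(results):
    """Return results newest first; undated results go to the end."""
    return sorted(
        results,
        key=lambda r: parse_date(r.get("published_date")) or OLDEST,
        reverse=True,
    )


if __name__ == "__main__":
    for item in sort_by_recency(json.loads(RAW)):
        print(item["published_date"], item["title"], item["url"])
```

Placing undated results last is a design choice of this sketch; the tool itself may handle missing dates differently.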
Extract data from websites in LLM-ready JSON or CSV format. Crawl or scrape an entire website with Website Crawler.
Browser Use is a library that enables AI agents to interact with web browsers, making websites accessible for automated tasks. It includes features for browser automation, agent memory, and various demos showcasing its capabilities.
Mariya Mansurova explores using CrewAI's multi-agent framework to create a solution for writing documentation based on tables and answering related questions.
Generative agents empowered by large language models suffer from poor performance and reusability in open-world scenarios. This work introduces a crawler generation task for vertical information web pages and proposes the paradigm of combining LLMs with crawlers, which supports the adaptability of traditional methods and enhances the performance of generative agents in open-world scenarios. AutoCrawler is a two-stage framework that leverages the hierarchical structure of HTML for progressive understanding and aims to help crawlers handle diverse and changing web environments more efficiently.