An article on the capabilities of Manus AI, a general AI agent that can think, plan, and execute tasks independently. According to the article, Manus differs from other AI assistants in that it delivers results directly.
Browser Use is a library that enables AI agents to interact with web browsers, making websites accessible for automated tasks. It includes features for browser automation, agent memory, and various demos showcasing its capabilities.
LlamaExtract is a powerful, easy-to-use tool that allows users to extract structured data from unstructured documents with minimal effort, available through LlamaCloud’s web UI and Python SDK.
ReaderLM-v2 is a 1.5B parameter language model developed by Jina AI, designed for converting raw HTML into clean markdown and JSON with high accuracy and improved handling of longer contexts. It supports multilingual text in 29 languages and offers advanced features such as direct HTML-to-JSON extraction. The model improves upon its predecessor by addressing issues like repetition in long sequences and enhancing markdown syntax generation.
New release of shot-scraper CLI tool for taking screenshots and scraping web pages with support for HTTP Archive (HAR) files.
Scraperr is a self-hosted web application for scraping data from web pages using XPath. It supports URL queuing, management of scrape elements, job management, user login, and integration with AI services.
Karishma Shukla announces the open-sourcing of Maxun, a no-code web data extraction platform. Maxun lets users build custom data-scraping robots and bypass geolocation restrictions, captchas, and anti-bot measures. The project aims to democratize access to web data and offer a simple API.
emailFinder is a Python-based web scraping tool designed to extract email addresses from websites or URLs listed in a file. It can crawl through website pages, parse content, and efficiently extract email addresses.
The author records a screen capture of their Gmail account and uses Google Gemini to extract numeric values from the video.
Crawl4AI is an open-source web crawling tool designed to efficiently collect and curate high-quality, structured data from the web for large language model training. It handles multiple URLs simultaneously and supports various data formats, including JSON and Markdown.