The article discusses Sosse, a self-hosted web scraper that allows users to archive their favorite websites. It highlights the tool's simplicity, ease of installation via Docker, and its ability to create full HTML snapshots of web pages, including stylesheets and assets. The author integrates Sosse into their workflow for archiving articles and technical documentation, praising its minimal interface and reliability.
Scraperr is a self-hosted web application for scraping data from web pages using XPath. It supports queuing URLs and managing scrape elements, and it provides job management, user login, and integration with AI services.
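To make the idea of XPath-based scrape elements concrete, here is a minimal sketch in Python. It is not Scraperr's code or API. It uses only the standard library's `xml.etree.ElementTree`, which supports a limited XPath subset and requires well-formed markup. The page fragment, the `extract` helper, and the element names are all hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
PAGE = """
<html>
  <body>
    <div class="post">
      <h2 class="title">First post</h2>
      <a href="/posts/1">Read more</a>
    </div>
    <div class="post">
      <h2 class="title">Second post</h2>
      <a href="/posts/2">Read more</a>
    </div>
  </body>
</html>
"""


def extract(page, elements):
    """Apply each named XPath expression to the page and collect the
    stripped text of every matching node."""
    root = ET.fromstring(page.strip())
    results = {}
    for name, xpath in elements.items():
        results[name] = [(node.text or "").strip() for node in root.findall(xpath)]
    return results


# Hypothetical "scrape elements": a label mapped to an XPath expression.
ELEMENTS = {
    "titles": ".//h2[@class='title']",
    "link_text": ".//div[@class='post']/a",
}

scraped = extract(PAGE, ELEMENTS)
print(scraped)
```

Each entry in `ELEMENTS` pairs a label with an XPath expression, which is roughly what "managing scrape elements" suggests. Real-world HTML is often not well-formed, so a tolerant HTML parser with full XPath support would likely be needed in practice.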