Clauncher — Simple Web Scraper

This project can be found on GitHub.

The main purpose was to create a simple scraper that uses the same functionality as a browser’s reader mode, a feature found in most modern browsers:

Firefox
Chrome
Arc
etc.

I use the Readability library that was open-sourced by Mozilla.

It simply uses axios to fetch the content of a page, then sanitizes it and loads it into a virtual DOM using jsdom and DOMPurify. Finally, it returns the Readability payload as an API response.
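As a rough illustration of that flow (not the actual Clauncher source), here is a minimal sketch of the fetch, sanitize, and parse steps. The names `scrapeArticle` and `defaultSteps` are hypothetical, and the order of sanitizing before building the DOM is an assumption; the real project may wire these libraries together differently.

```javascript
// Hypothetical sketch of the pipeline: fetch -> sanitize -> Readability.
// Function and step names are illustrative, not taken from Clauncher.

// Builds the real steps lazily, so the libraries load only when needed.
function defaultSteps() {
  const axios = require("axios");
  const { JSDOM } = require("jsdom");
  const createDOMPurify = require("dompurify");
  const { Readability } = require("@mozilla/readability");

  // DOMPurify needs a window object when it runs outside a browser.
  const DOMPurify = createDOMPurify(new JSDOM("").window);

  return {
    // Download the raw HTML of the page.
    fetchHtml: async (url) => (await axios.get(url)).data,
    // Strip scripts and other unsafe markup.
    sanitize: (html) => DOMPurify.sanitize(html),
    // Load the HTML into a virtual DOM and run Readability on it.
    extract: (html, url) => {
      const dom = new JSDOM(html, { url });
      return new Readability(dom.window.document).parse();
    },
  };
}

// Runs the three steps in order and returns the Readability payload.
async function scrapeArticle(url, steps = defaultSteps()) {
  const html = await steps.fetchHtml(url);
  const clean = steps.sanitize(html);
  return steps.extract(clean, url);
}
```

An API route could call `scrapeArticle(url)` and send the result back as JSON. Readability's `parse()` returns `null` when it cannot find an article, so a handler would probably want to treat that case as an error response.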

It definitely isn’t the most optimized scraper ever and probably won’t work too well for most webpages, but it’s a cheap and dirty option for self-hosting or for testing with other projects.

🤔 Vineeth's Thoughts
