symfony/dom-crawler
Symfony DomCrawler Component
Latest release v5.2.0-RC2 - Updated - 3.17K stars
Scrapy
A high-level Web Crawling and Web Scraping framework
Latest release 2.4.1 - Updated - 39.8K stars
jaybizzle/crawler-detect
CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent
Latest release v1.2.103 - Updated - 1.38K stars
crawler
Crawler is a web spider written with Nodejs. It gives you the full power of jQuery on the server ...
Latest release 1.2.2 - Updated - 5.42K stars
simplecrawler
Very straightforward, event driven web crawler. Features a flexible queue interface and a basic c...
Latest release 1.1.9 - Updated - 2K stars
github.com/gocolly/colly
Elegant Scraper and Crawler Framework for Golang
Latest release v2.1.0 - Updated - 13.1K stars
osmosis
Web scraper for NodeJS
Latest release 1.1.10 - Updated - 3.73K stars
jaeger/querylist
Simple, elegant, extensible PHP Web Scraper (crawler/spider), use the CSS3 DOM selector, based on p...
Latest release V4.2.7 - Updated - 2.22K stars
puppeteer-extra-plugin-stealth
Stealth mode: Applies various techniques to make detection of headless puppeteer harder.
Latest release 2.7.5 - Updated - 2.17K stars
github.com/gocolly/colly/v2
Elegant Scraper and Crawler Framework for Golang
Latest release v2.1.1-0.20210217135629-a888c12a4e4d - Updated - 13.1K stars
newspaper3k
Simplified python article discovery & extraction.
Latest release 0.2.8 - Updated - 10.7K stars
us.codecraft:webmagic-core
A crawler framework. It covers the whole lifecycle of crawler: downloading, url management, cont...
Latest release 0.7.4 - Updated - 9.61K stars
spatie/crawler
Crawl all internal links found on a website
Latest release 6.0.1 - Updated - 1.83K stars
wombat
Generic Web crawler with a DSL that parses structured data from web pages
Latest release 2.10.0 - Updated - 1.16K stars
wa72/htmlpagedom
jQuery-inspired DOM manipulation extension for Symfony's Crawler
Latest release v2.0.1 - Updated - 299 stars
@nodelib/fs.walk
A library for efficiently walking a directory recursively
Latest release 1.2.5 - Updated - 24 stars
scrapy
Scrapy is an open source and collaborative framework for extracting the data you need from websit...
Latest release 2.4.0 - Updated - 39.8K stars
us.codecraft:webmagic-selenium
A crawler framework. It covers the whole lifecycle of crawler: downloading, url management, cont...
Latest release 0.7.4 - Updated - 9.61K stars
us.codecraft:webmagic-extension
A crawler framework. It covers the whole lifecycle of crawler: downloading, url management, cont...
Latest release 0.7.4 - Updated - 9.61K stars
Abot
Abot is an open source C# web crawler built for speed and flexibility. It takes care of the low l...
Latest release 2.0.69 - Updated - 1.84K stars
google-play-scraper
Scrapes app data from the Google Play store
Latest release 8.0.2 - Updated - 1.4K stars
org.codelibs.fess:fess
Fess is Full tExt Search System.
Latest release 13.10.1 - Updated - 538 stars
cheerio-httpcli
http client module with cheerio & iconv(-lite) & promise
Latest release 0.8.1 - Updated - 240 stars
spatie/robots-txt
Determine if a page may be crawled from robots.txt and robots meta tags
Latest release 1.0.9 - Updated - 125 stars
HtmlAgilityPack
This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you a...
Latest release 1.11.30 - Updated
headless-chrome-crawler
Distributed web crawler powered by Headless Chrome
Latest release 1.8.0 - Updated - 4.87K stars
scrapy-redis
Redis-based components for Scrapy.
Latest release 0.6.8 - Updated - 4.79K stars
DotnetSpider
DotnetSpider, a .NET Standard web crawling library. It is lightweight, efficient and fast high-le...
Latest release 5.0.8 - Updated - 2.9K stars
apify
The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of dat...
Latest release 1.0.2-beta.0 - Updated - 2.72K stars
zhihu-api
Unofficial API for zhihu (https://www.zhihu.com)
Latest release 3.0.0 - Updated - 255 stars