Table of Contents
Prologue
Never was there such a pure crawler framework as nacf.

Although I often write crawlers, I don't like to use huge frameworks such as Scrapy; I prefer simple requests + bs4, or the more general requests_html. However, these two are inconvenient for a crawler: things like error retrying or parallel crawling have to be handwritten every time. None of it is very difficult to write, but writing it over and over is tedious. Hence I started nacf (Nasy Crawler Framework), hoping to simplify the error-retrying and parallel parts of writing crawlers.
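To make concrete what the prologue calls tedious, here is what the hand-written retrying-plus-parallel-crawling boilerplate typically looks like without a framework. This is a minimal sketch using only the standard library (urllib in place of requests, so it is self-contained); `fetch`, `with_retry`, and `crawl` are illustrative names, not part of nacf:

```python
# A sketch of the boilerplate the prologue describes: hand-written retrying
# and parallel crawling without a framework.  The function names here are
# illustrative and not part of nacf.
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional
from urllib.request import urlopen


def fetch(url: str) -> str:
    """Download one page as text; raises URLError/HTTPError on failure."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def with_retry(fetcher: Callable[[str], str],
               times: int = 3) -> Callable[[str], str]:
    """Wrap a fetcher so each call is retried up to `times` times."""
    def wrapped(url: str) -> str:
        last_err: Exception = RuntimeError("no attempts made")
        for _ in range(times):
            try:
                return fetcher(url)
            except Exception as err:
                last_err = err
        raise last_err
    return wrapped


def crawl(urls: List[str],
          fetcher: Optional[Callable[[str], str]] = None,
          workers: int = 4) -> List[str]:
    """Fetch many URLs in parallel threads, retrying each a few times."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(with_retry(fetcher or fetch), urls))
```

Every crawler ends up re-growing some variant of `with_retry` and `crawl`; nacf's aim, as described above, is to provide these pieces ready-made.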
Packages
| Package | Version | Description |
|---|---|---|
| requests-html | 0.10.0 | HTML Parsing for Humans. |
| nalude | 0.3.0 | A standard module, inspired by Haskell's Prelude. |
Usage
See the tests.
Development Process
DONE Http Functions
CLOSED: <Thu Feb 28 20:51:00 2019>
DONE Get
CLOSED: <Tue Dec 25 17:36:00 2018>
DONE Post
CLOSED: <Thu Feb 28 20:44:00 2019>
DONE Bugs
CLOSED: <Thu Feb 28 20:51:00 2019>
DONE Fix an error from inspect.Parameter which caused the parallel function to fail. :err:1:
CLOSED: <Wed Dec 26 20:26:00 2018>
NEXT Docs
NEXT Usage
Epilogue
History
Version 1.0.2
- Date: <Sun Mar 10, 2019>
- Changes: Update nalude.
Version 1.0.1
- Date: <Sun Mar 10, 2019>
- Changes: Update requests-html.
Version 1.0.0
- Date: <Thu Feb 28, 2019>
- Changes: Now, the old HTTP methods (get and post) cannot accept multiple URLs. Instead, use gets and posts.
- Adds:
  - nacf.html
  - nacf.json
  - nacf.gets
  - nacf.posts
- Includes:
  - nalude
Version 0.1.2
- Date: <Wed Dec 26, 2018>
- Fixed: the inspect.Parameter error from the last version.
Version 0.1.1
- Date: <Wed Dec 26, 2018>
- Ignored: an error caused by inspect.Parameter.
- Help Wanted: Can someone help me with the Parameter?
Version 0.1.0
- Date: <Sun Dec 23, 2018>
- Commemorative Version: First version.
- Basic functions.