Gollum

Robots.txt parser with caching. Modelled after Kryten. Docs can be found here.

Usage

Call Gollum.crawlable?/3 to check whether a URL may be crawled by the specified user agent. The user agent is the first argument and the URL is the second.

iex> Gollum.crawlable?("hello", "https://google.com/")
:crawlable
iex> Gollum.crawlable?("hello", "https://google.com/m/")
:uncrawlable
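
The return values above can be turned into a boolean with a small wrapper. This is only a sketch: the module name `MyCrawler.Politeness`, the function `allowed?/1`, and the `"MyCrawler"` user agent are hypothetical and not part of Gollum. It assumes only the `:crawlable` and `:uncrawlable` results shown above, and conservatively treats any other result as not allowed.

```elixir
defmodule MyCrawler.Politeness do
  # Hypothetical user agent string; replace with the one your crawler sends.
  @user_agent "MyCrawler"

  # Returns true only when Gollum reports the URL as :crawlable.
  def allowed?(url) do
    case Gollum.crawlable?(@user_agent, url) do
      :crawlable -> true
      :uncrawlable -> false
      # Any other result is treated as "do not crawl" (an assumption of this sketch).
      _other -> false
    end
  end
end
```

A caller could then guard each fetch with `MyCrawler.Politeness.allowed?(url)` before issuing the request.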

Gollum is an OTP app (for the cache), so remember to list it under the extra_applications key in your mix.exs to ensure it is started.
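
As a rough illustration, a mix.exs along these lines would list Gollum under extra_applications. The project name `MyApp`, the app atom `:my_app`, and the `"~> 0.1"` version requirement are placeholders assumed for this sketch; check the package page for the version you actually want.

```elixir
# mix.exs (sketch; names and versions are placeholders)
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      deps: deps()
    ]
  end

  def application do
    [
      # Listing :gollum here ensures its cache process is started with your app.
      extra_applications: [:logger, :gollum]
    ]
  end

  defp deps do
    [
      # Assumed version requirement; adjust to the release you depend on.
      {:gollum, "~> 0.1"}
    ]
  end
end
```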

Gollum can be configured in your config.exs file. All options are optional; they are shown below with their default values.

config :gollum,
  name: Gollum.Cache, # Name of the Cache GenServer
  refresh_secs: 86_400, # Seconds before the robots.txt is refetched (86_400 = 24 hours)
  lazy_refresh: false, # false: set up a timer that refetches automatically; true: refetch only when requested
  user_agent: "Gollum" # User agent to use when sending the GET request for the robots.txt

Author

Ravern Koh - <ravern.koh.dev@gmail.com>

gollum
Release 0.1.0
