hqurlfind3r
A passive reconnaissance tool for discovering known URLs.
Features
- Fetches known URLs:
    - ... from AlienVault's OTX, Common Crawl, URLScan, GitHub, Intelligence X and the Wayback Machine.
    - ... from parsing robots.txt snapshots on the Wayback Machine for disallowed paths.
- Reduces noise:
    - ... by regex filtering URLs.
    - ... by removing duplicate pages, in the sense of URL patterns that are probably repetitive and point to the same web template.
- Outputs to stdout, for piping, or to a file.
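The pattern-based deduplication mentioned above can be pictured with a small Go sketch. This is not hqurlfind3r's actual implementation: the helpers patternKey and dedupeByPattern are hypothetical, and the rule used here (treating all-digit path segments as interchangeable and ignoring the query string) is only one assumed way to decide that two URLs point to the same template.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// numericSegment matches a path segment made only of digits.
var numericSegment = regexp.MustCompile(`^[0-9]+$`)

// patternKey is a hypothetical helper: it reduces a URL to a coarse pattern
// by replacing all-digit path segments with a placeholder and dropping the
// query string and fragment.
func patternKey(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	segments := strings.Split(u.Path, "/")
	for i, s := range segments {
		if numericSegment.MatchString(s) {
			segments[i] = "{n}"
		}
	}
	return u.Host + strings.Join(segments, "/"), nil
}

// dedupeByPattern keeps the first URL seen for each pattern, preserving order.
func dedupeByPattern(urls []string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, raw := range urls {
		key, err := patternKey(raw)
		if err != nil {
			continue // skip URLs that cannot be parsed
		}
		if _, ok := seen[key]; ok {
			continue // same pattern already kept
		}
		seen[key] = struct{}{}
		out = append(out, raw)
	}
	return out
}

func main() {
	urls := []string{
		"https://example.com/blog/1",
		"https://example.com/blog/2",
		"https://example.com/about",
	}
	for _, u := range dedupeByPattern(urls) {
		fmt.Println(u)
	}
}
```

With the sample input, /blog/1 and /blog/2 collapse to the same key, so only the first of the two is kept alongside /about.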
Installation
From Binary
You can download the pre-built binary for your platform from this repository's releases page, extract it, then move it to a directory in your $PATH and you're ready to go.
From Source
hqurlfind3r requires go1.20+ to install successfully. Run the following command to install the latest version:
go install -v github.com/hueristiq/hqurlfind3r/v2/cmd/hqurlfind3r@latest
From GitHub
git clone https://github.com/hueristiq/hqurlfind3r.git && \
cd hqurlfind3r/cmd/hqurlfind3r/ && \
go build && \
mv hqurlfind3r /usr/local/bin/ && \
hqurlfind3r -h
Post Installation
hqurlfind3r will work after installation. However, to use certain services (currently GitHub), you will need to set up API keys. The API keys are stored in the $HOME/.config/hqurlfind3r/conf.yaml file, which is created upon first run and uses the YAML format. Multiple API keys can be specified for each of these services.
Example:
version: 2.0.0
sources:
    - commoncrawl
    - github
    - intelx
    - otx
    - urlscan
    - wayback
    - waybackrobots
keys:
    github:
        - d23a554bbc1aabb208c9acfbd2dd41ce7fc9db39
        - asdsd54bbc1aabb208c9acfbd2dd41ce7fc9db39
    intelx:
        - 2.intelx.io:00000000-0000-0000-0000-000000000000
Usage
DISCLAIMER: fetching URLs from GitHub is a bit slow.
hqurlfind3r -h
This will display help for the tool.
 _                      _  __ _           _ _____
| |__   __ _ _   _ _ __| |/ _(_)_ __   __| |___ / _ __
| '_ \ / _` | | | | '__| | |_| | '_ \ / _` | |_ \| '__|
| | | | (_| | |_| | |  | |  _| | | | | (_| |___) | |
|_| |_|\__, |\__,_|_|  |_|_| |_|_| |_|\__,_|____/|_| v2.0.0
          |_|

USAGE:
  hqurlfind3r [OPTIONS]

OPTIONS:
  -d, --domain string            target domain
      --include-subdomains       include subdomains
  -f, --filter string            URL filtering regex
      --use-sources strings      comma(,) separated sources to use
      --exclude-sources strings  comma(,) separated sources to exclude
      --list-sources             list all the available sources
  -m, --monochrome               no colored output mode
  -s, --silent                   silent output mode
  -o, --output string            output file
Examples
Basic
hqurlfind3r -d tesla.com
Regex filter URLs
hqurlfind3r -d tesla.com -f '\.(jpg|jpeg|gif|png|ico|css|eot|tif|tiff|ttf|woff|woff2)'
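In a regular expression an unescaped . matches any character, so a pattern meant to match file extensions is usually written with \. instead. The Go sketch below uses the standard regexp package to show the difference on two sample URLs. It only illustrates regex matching: matching is a hypothetical helper, and the sketch makes no claim about how hqurlfind3r applies the -f pattern internally.

```go
package main

import (
	"fmt"
	"regexp"
)

// matching returns the URLs that match the given pattern.
func matching(pattern string, urls []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, u := range urls {
		if re.MatchString(u) {
			out = append(out, u)
		}
	}
	return out, nil
}

func main() {
	urls := []string{
		"https://example.com/logo.png",
		"https://example.com/pngtools",
	}

	// Unescaped dot: "/png" in the second URL also matches.
	loose, _ := matching(`.(jpg|png)`, urls)
	// Escaped dot: only a literal ".png" or ".jpg" matches.
	strict, _ := matching(`\.(jpg|png)`, urls)

	fmt.Println(len(loose), len(strict))
}
```

The loose pattern matches both URLs, while the escaped one matches only the image.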
Include Subdomains' URLs
hqurlfind3r -d tesla.com --include-subdomains
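One way to picture the effect of this flag is as a host check on each discovered URL. The Go sketch below is an assumption about the semantics, not code from hqurlfind3r: inScope is a hypothetical helper that accepts the exact domain always, and accepts hosts ending in "." plus the domain only when subdomains are included.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// inScope reports whether rawURL belongs to domain. Subdomains count only
// when includeSubdomains is true (assumed semantics of the flag).
func inScope(rawURL, domain string, includeSubdomains bool) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	domain = strings.ToLower(domain)
	if host == domain {
		return true
	}
	// The leading dot prevents "nottesla.com" from matching "tesla.com".
	return includeSubdomains && strings.HasSuffix(host, "."+domain)
}

func main() {
	fmt.Println(inScope("https://shop.tesla.com/cart", "tesla.com", true))
	fmt.Println(inScope("https://shop.tesla.com/cart", "tesla.com", false))
	fmt.Println(inScope("https://nottesla.com/", "tesla.com", true))
}
```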
Contribution
Issues and Pull Requests are welcome!