@cityssm/get-site-urls

Get all of the URLs from a website.


Keywords
crawl, urls, get, site, links, index, hacktoberfest, indexing, website
License
MIT
Install
npm install @cityssm/get-site-urls@3.4.1

Forked from alex-page/get-site-urls.

Install

npm install @cityssm/get-site-urls

Usage

import { getSiteUrls } from "@cityssm/get-site-urls";

// Using promises
getSiteUrls( 'https://saultstemarie.ca' )
	.then( links => console.log( links ) );

// Using async/await
( async () => {
	const links = await getSiteUrls( 'https://saultstemarie.ca' );
	console.log( links );

/*
{
	pages: [
		'https://saultstemarie.ca',
		'https://saultstemarie.ca/City-Hall.aspx',
		'https://saultstemarie.ca/City-Hall/City-Council.aspx',
		...,
		'https://saultstemarie.ca/Contact-Us.aspx',
		'https://saultstemarie.ca/Site-Map.aspx'
	],
	errors: [
		'https://saultstemarie.ca/Broken-Link.aspx'
	]
}
*/
})();

Parameters

The function getSiteUrls() takes two parameters:

getSiteUrls( url, maxDepth );
  1. url - The URL to search.
  2. maxDepth - (Optional) The maximum depth to search. Defaults to 1.
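
To give a feel for what maxDepth controls, here is a minimal sketch of a depth-limited crawl over an in-memory link map. It is an illustration only, not this package's implementation: crawlSketch, fakeSite, and the example.com URLs are all hypothetical, and the way depth is counted here (the start URL is depth 0) is an assumption that may differ from how getSiteUrls() counts it. The returned object mirrors the { pages, errors } shape shown under Usage.

```javascript
// Hypothetical in-memory "site": each URL maps to the links found on that page.
// A value of null stands in for a page that fails to load.
const fakeSite = {
	'https://example.com': [ 'https://example.com/a', 'https://example.com/b' ],
	'https://example.com/a': [ 'https://example.com/a/deep' ],
	'https://example.com/b': [ 'https://example.com/broken' ],
	'https://example.com/a/deep': [],
	'https://example.com/broken': null
};

// Breadth-first crawl that follows links at most maxDepth levels below startUrl.
// Assumption: the start URL counts as depth 0.
function crawlSketch( startUrl, maxDepth = 1, site = fakeSite ) {
	const pages = [];
	const errors = [];
	const seen = new Set( [ startUrl ] );
	let frontier = [ startUrl ];

	for ( let depth = 0; depth <= maxDepth && frontier.length > 0; depth++ ) {
		const next = [];
		for ( const url of frontier ) {
			const links = site[ url ];
			if ( links === null || links === undefined ) {
				// Page could not be loaded: record it and move on.
				errors.push( url );
				continue;
			}
			pages.push( url );
			// Queue links that have not been seen yet for the next level.
			for ( const link of links ) {
				if ( !seen.has( link ) ) {
					seen.add( link );
					next.push( link );
				}
			}
		}
		frontier = next;
	}

	return { pages, errors };
}

const shallow = crawlSketch( 'https://example.com', 1 );
const deeper = crawlSketch( 'https://example.com', 2 );
console.log( shallow.pages.length, deeper.pages.length, deeper.errors );
```

In this sketch, raising maxDepth from 1 to 2 reaches one more working page and also surfaces the broken link, which suggests why a deeper crawl of a real site can take noticeably longer and report more errors.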