WebScrapa

DISCLAIMER

This package was put together for learning purposes only.

A simple PHP web scraper package.

Most websites do not offer a way to save a copy of the data they display. The only option is then to copy and paste the data from your browser into a local file by hand, a tedious job that can take hours or even days. Web scraping is the technique of automating this process.

WebScrapa is a simple web scraper package written in PHP. It uses cURL to request and download a webpage. The downloaded webpage is converted to an XML DOM object, and XPath is used to navigate through the elements of that object.
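The download, parse, and query pipeline described above can be sketched with PHP's built-in cURL and DOM extensions. This is a minimal illustration and not WebScrapa's actual source: the function names `fetchPage` and `queryHtml` and the sample markup are assumptions made for the example.

```php
<?php
// Hypothetical helper: download a page with cURL.
// Returns the response body, or null if the request failed.
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Hypothetical helper: parse HTML into a DOM and return the value
// of every node (element or attribute) matched by the XPath query.
function queryHtml(string $html, string $query): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely well formed, so collect parser
    // warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query($query);
    if ($nodes === false) {
        return []; // the XPath expression was invalid
    }

    $values = [];
    foreach ($nodes as $node) {
        $values[] = trim($node->nodeValue);
    }
    return $values;
}

// Sample markup stands in for a downloaded page so the example runs offline.
$html = '<html><body><ul class="links">'
      . '<li><a href="https://example.com/a">A</a></li>'
      . '<li><a href="https://example.com/b">B</a></li>'
      . '</ul></body></html>';

$links = queryHtml($html, '//ul[@class="links"]//a/@href');
print_r($links);
```

With a live page, the two helpers would be chained as `queryHtml(fetchPage($url), $query)` after checking that `fetchPage` did not return null.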

Installation

composer require "florence/scrapa: v1.0"

Usage

Create an instance of the Scrap class:

$url = 'https://www.youtube.com/JustinBieber/about';
$query = '//ul[@class="about-custom-links"]//a[@class="about-channel-link "]/@href';

$scrap = new Scrap($url, $query);

To learn about XPath and how to select elements by their tags and attributes, such as CSS classes and IDs, see https://goo.gl/Gjd3R3
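As a rough illustration of selecting by tag, ID, and class, the sketch below runs a few XPath queries against a small inline document using PHP's `DOMXPath` directly. The markup and the helper name `xpathValues` are made up for the example. Note that `@class="..."` compares against the whole attribute string, so spacing and multiple class names matter; the `contains(concat(...))` form is a common way to match a single class name.

```php
<?php
// Sample markup, invented for this example.
$html = <<<HTML
<html><body>
  <div id="profile">
    <h1 class="title">Jane Doe</h1>
    <ul class="links">
      <li><a class="link external" href="https://example.com/blog">Blog</a></li>
      <li><a class="link" href="/contact">Contact</a></li>
    </ul>
  </div>
</body></html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Hypothetical helper: return the trimmed value of every matched node.
function xpathValues(DOMXPath $xpath, string $query): array
{
    $values = [];
    foreach ($xpath->query($query) as $node) {
        $values[] = trim($node->nodeValue);
    }
    return $values;
}

// Select by tag name.
$byTag = xpathValues($xpath, '//h1');

// Select by ID, then take the href attribute of every link inside it.
$byId = xpathValues($xpath, '//div[@id="profile"]//a/@href');

// Select by exact class attribute: "link external" does not match "link".
$byExactClass = xpathValues($xpath, '//a[@class="link"]/@href');

// Select by a single class name among several.
$byClassName = xpathValues(
    $xpath,
    '//a[contains(concat(" ", normalize-space(@class), " "), " external ")]/@href'
);

print_r([$byTag, $byId, $byExactClass, $byClassName]);
```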

Use the toArrayScrapDOM method to get the results of your query in array format:
```
print_r($scrap->toArrayScrapDOM());
```
Use the toStringScrapDOM method to get the results of your query in string format:
```
print_r($scrap->toStringScrapDOM());
```

Run the example file

Clone the repo:

git clone https://github.com/andela-fokosun/webscrapa

Then, from the cloned directory, install the dependencies:

composer install

From your terminal, run the example:

php example.php

Run the tests:

vendor/bin/phpunit

