twitographer

A parallelized web crawler to traverse the Twitter graph


License
Unlicense
Install
pip install twitographer==0.0.4

Documentation

twitographer -- a parallelized web crawler to traverse the twitter graph


this is a tiny script that uses pyppeteer, miyakogi's python port of google's puppeteer, to drive a headless browser across the twitter graph, logging the friends of each account.
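
the traversal described above can be pictured as a breadth-first walk. this is only a rough sketch, not the script's actual code: `crawl`, `fetch_friends` and `max_accounts` are hypothetical names, and the toy dictionary stands in for the pages a headless browser would scrape.

```python
from collections import deque


def crawl(seed, fetch_friends, max_accounts=100):
    """breadth-first walk of the friend graph starting at `seed`.

    `fetch_friends` is a hypothetical callable mapping an account name
    to an iterable of friend names. returns a dict that maps each
    visited account to the list of its friends.
    """
    graph = {}
    seen = {seed}
    queue = deque([seed])
    while queue and len(graph) < max_accounts:
        account = queue.popleft()
        friends = list(fetch_friends(account))
        graph[account] = friends  # log the friends of this account
        for friend in friends:
            if friend not in seen:  # never queue an account twice
                seen.add(friend)
                queue.append(friend)
    return graph


# toy graph standing in for twitter, for illustration only
toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
result = crawl("a", lambda name: toy.get(name, []))
```

the `seen` set keeps cycles in the graph from being crawled forever, and `max_accounts` bounds the walk, since the real graph is far too large to exhaust.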

i wrote this for a big data analysis project to infer twitter communities from aggregated social relationships.

add your twitter username and password to credentials.py before running.

twitographer spawns one process per entry in creds, capped at the number of cpu cores:

import multiprocessing

from credentials import creds

# one process per credential, never more than the number of cpu cores
num_processes = min(len(creds), multiprocessing.cpu_count())
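
to show how that process count might be used, here is a minimal sketch that starts one worker per credential. it is an assumption about the overall shape, not the script's real code: `pick_num_processes`, `worker` and `main` are hypothetical names, and the list of (username, password) tuples is a guessed layout for creds that may not match the real credentials.py.

```python
import multiprocessing


def pick_num_processes(num_creds, cpu_count):
    # one process per credential, never more than the number of cores
    return min(num_creds, cpu_count)


def worker(index, cred):
    # placeholder: a real worker would log in with `cred` and crawl
    print(f"worker {index} started")


def main(creds):
    num_processes = pick_num_processes(len(creds), multiprocessing.cpu_count())
    procs = [
        multiprocessing.Process(target=worker, args=(i, creds[i]))
        for i in range(num_processes)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()  # wait for every worker to finish
    return num_processes


if __name__ == "__main__":
    # hypothetical credentials layout; the real format may differ
    main([("user_one", "password_one"), ("user_two", "password_two")])
```

giving each process its own credential means no two workers share a login session, which is presumably why the count is tied to len(creds).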

installation

(requires python >= 3.0, redis >= 1.0.0)

via pip

pip install twitographer

livebeef <livebeef (@t) protonmail (d.t) com>