SocialCrawler is a Python package that helps you collect data from Twitter and Foursquare. It was created to facilitate data mining from both services. (Linux only)
Install (generic way)
$ python3 -m pip install SocialCrawler
Requirements
- Python >= 3
- Foursquare developer credentials (if you want to use Foursquare)
- Twitter developer credentials (if you want to use Twitter)
- geckodriver installed and in $PATH (we hit this problem when running on Linux Mint and Kali)
- Download from https://github.com/mozilla/geckodriver/releases
$ export PATH=$PATH:<geckodriver-path>
- As the package uses tweepy to connect to Twitter, you can use the Twitter Stream API and search by the criteria described in the Stream Overview.
- Getting check-ins shared on Twitter, or the check-ins of the last week.
- If you have Foursquare credentials, you will also be able to track data from specific locations, among other things.
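Check-in tweets typically carry a Swarm short URL. A minimal sketch of pulling the check-in code out of a tweet's text (the regex and helper name are illustrative, not part of SocialCrawler's actual API):

```python
import re

# Swarm check-in links look like https://www.swarmapp.com/c/<code>;
# the code can later be resolved against the Foursquare API.
SWARM_URL = re.compile(r"https?://(?:www\.)?swarmapp\.com/c/(\w+)")

def extract_swarm_codes(tweet_text):
    """Return every Swarm check-in code found in the tweet text."""
    return SWARM_URL.findall(tweet_text)

codes = extract_swarm_codes(
    "Lunch! (@ Central Park) https://www.swarmapp.com/c/aBcD123"
)
# codes == ["aBcD123"]
```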
- Fixed module class declaration
- Fixed a syntax error and the hacked method `dir` output
- Added selenium as a requirement for Foursquare browser requests (to avoid the rate limit); may not work
- Updated ExtractorData to a full version (NewExtractorData) that retrieves (almost) all VENUE info
- Removed urllib2 from the requirements
- Updated the run flow: there is now always a return value; just check whether a field is NULL, which means the data is missing
- When a VENUE or FOURSQUARE request fails, the program thread now waits 15 minutes before requesting again
- Added new exception handling
- Separated the Foursquare request and the venue request into two try-except blocks
- Fixed a bug when writing categorie_id: a missing int-to-str conversion
- Still pending in ExtractorData: the possibility of using another file (not one created by Collector or CollectorV2) to query Foursquare (not available yet)
- Formatted to PEP 257 and PEP 8 (almost)
- Implemented ExtractorData: a simple way to get data from Foursquare using the Swarm URL code
- Added HistoricalCollector.CollectorV2, which gets all data from the tweet JSON and saves it as a TSV file
- Added to ExtractorData the possibility of using another file (not one created by Collector or CollectorV2) to query Foursquare (not available yet)
- Added urllib2 to the requirements
- Fixed a bug in the getStoredData function that allowed some parameters to be None
- Updated the format of the generated file name
- Increased the request wait time from 15 minutes to 16 (sometimes when a request was retried after 15 minutes, the server responded that the 15 minutes had not yet elapsed)
- Updated the saved fields. All fields are now saved to a tab-separated file, as shown in the Wiki.
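The tab-separated output described above can be sketched with the standard csv module (the field names below are illustrative; SocialCrawler's actual schema is documented in its Wiki):

```python
import csv

# Illustrative tweet fields -- not SocialCrawler's real schema.
FIELDS = ["tweet_id", "user", "created_at", "text"]

def save_as_tsv(tweets, path):
    """Write one tweet per row, tab-separated, with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, delimiter="\t")
        writer.writeheader()
        for tweet in tweets:
            # Missing fields become empty strings rather than being dropped,
            # matching the "check whether a field is NULL" convention above.
            writer.writerow({k: tweet.get(k, "") for k in FIELDS})
```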