by Chris Lindgren (chris.a.lindgren@gmail.com)
Distributed under the BSD 3-clause license. See LICENSE.txt or http://opensource.org/licenses/BSD-3-Clause for details.
Documentation: https://nttc.readthedocs.io/en/latest/
A set of functions that process and create topic models from a sample of tweets drawn from community-detected Twitter networks. It also checks for potentially persistent community hubs, identified by top mentioned users, top retweeters (RTers), or both.
It assumes you seek an answer to the following questions:
- What communities persist or are ephemeral across periods in the corpora, and when?
- What can these communities be named, based on their top RTs and users, top mentioned users, as well as generated topic models?
- Of these communities, what are their topics over time?
Accordingly, it assumes you want to investigate communities, and the tweets from each detected community, across already-defined periodic episodes, with the goal of naming each community and examining its topics over time in the corpus.
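To illustrate the first question, one way to think about persistence is to compare each community's top users (hubs) between consecutive periods. The following is a minimal sketch using only the standard library; it is not nttc's own implementation. The data layout, the `jaccard` and `persistent_pairs` helpers, and the 0.5 threshold are all assumptions made for illustration.

```python
from itertools import product


def jaccard(a, b):
    """Return the Jaccard similarity of two collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def persistent_pairs(hubs_by_period, threshold=0.5):
    """Match communities in consecutive periods whose hub sets overlap.

    hubs_by_period: hypothetical layout {period: {community: [top users]}},
    with periods inserted in chronological order.
    Returns a list of (period_a, community_a, period_b, community_b, score).
    """
    periods = list(hubs_by_period)
    matches = []
    # Compare every community in one period with every community in the next.
    for p_a, p_b in zip(periods, periods[1:]):
        for (c_a, hubs_a), (c_b, hubs_b) in product(
            hubs_by_period[p_a].items(), hubs_by_period[p_b].items()
        ):
            score = jaccard(hubs_a, hubs_b)
            if score >= threshold:
                matches.append((p_a, c_a, p_b, c_b, score))
    return matches


# Made-up example data: two periods, two detected communities each.
hubs = {
    "1": {"a": ["u1", "u2", "u3"], "b": ["u7", "u8"]},
    "2": {"c": ["u1", "u2", "u4"], "d": ["u9"]},
}
matches = persistent_pairs(hubs)
```

Communities that never appear in `matches` would be candidates for "ephemeral" under this particular definition; a different overlap measure or threshold may suit your corpus better.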
It functions only with Python 3.x and is not backwards-compatible (although one could probably branch off a 2.x port with minimal effort).
Warning: nttc performs no custom error-handling, so make sure your inputs are formatted properly! If you have questions, please let me know via email.
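Since nttc does no error-handling of its own, it can help to check records before handing them over. The sketch below is one possible pre-flight check. It assumes, hypothetically, that each tweet is a dict with `community`, `period`, and `tweet` keys; these names are illustrative and not nttc's required schema, so adapt them to the columns your data actually uses.

```python
REQUIRED_FIELDS = ("community", "period", "tweet")  # hypothetical field names


def find_bad_records(records, required=REQUIRED_FIELDS):
    """Return (index, reason) pairs for records that look malformed."""
    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append((i, "not a dict"))
            continue
        for field in required:
            value = rec.get(field)
            # Treat absent values and blank strings as missing.
            if value is None or (isinstance(value, str) and not value.strip()):
                problems.append((i, f"missing or empty '{field}'"))
    return problems


# Made-up sample: the second and third records are deliberately malformed.
sample = [
    {"community": "1", "period": "1", "tweet": "hello world"},
    {"community": "2", "period": "1", "tweet": "   "},
    {"community": "3", "tweet": "no period given"},
]
problems = find_bad_records(sample)
```

An empty `problems` list suggests the records at least have the expected shape; it does not guarantee that nttc will accept them.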
- arrow
- tsm
- nltk
- networkx
- matplotlib
- pandas
- numpy
- emoji
- pprint
- gensim
- spacy
- tqdm
- sklearn
- joblib
- MulticoreTSNE
- hdbscan
- seaborn
- stop_words
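Before running anything, it may be useful to confirm that the packages listed above can be found in your environment. The following is a small sketch using the standard library's `importlib.util.find_spec`. The module names are assumed import names, which can differ from the names used to install a package (for example, "matplot" is taken here to mean `matplotlib`); `missing_modules` is a hypothetical helper, not part of nttc.

```python
from importlib.util import find_spec

# Assumed import names for the requirements listed above.
MODULES = [
    "arrow", "tsm", "nltk", "networkx", "matplotlib", "pandas", "numpy",
    "emoji", "pprint", "gensim", "spacy", "tqdm", "sklearn", "joblib",
    "MulticoreTSNE", "hdbscan", "seaborn", "stop_words",
]


def missing_modules(names=MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
print("Missing:", ", ".join(missing) if missing else "none")
```

Note that `pprint` ships with Python itself, so it should never appear in the missing list.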
pip install nttc
- Please contact me if you discover any issues.
- See the assets/examples folder for example uses.
# Create new distribution of code for archiving
sudo python setup.py sdist bdist_wheel
# Distribute to Python Package Index
python -m twine upload --repository-url https://upload.pypi.org/legacy/ dist/*