News title sentiment analysis


License
MIT
Install
pip install senti-news==0.0.39

Documentation

What

The goal of this project is to predict the popularity of each candidate in the United States 2020 presidential election. As of October 1, there are 19 Democrats and 4 Republicans in contention for the presidential nomination. Other parties and candidates may be considered in the future as an extension.

How

Use news text data to predict the popularity of presidential candidates in the United States 2020 election. The news sources will have different biases and be based in different regions. Initial approaches can rely on lexicons, tokenization, and other rule-based methods. Next, sentiment analysis and other NLP techniques will be used to determine whether articles are positive or negative.
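As an illustration of the rule-based starting point, here is a minimal sketch of lexicon scoring for a news title. It is not the senti-news API: the word lists and the `tokenize` and `lexicon_score` names are hypothetical, and a real run would presumably swap in a published sentiment lexicon.

```python
import re

# Hypothetical toy lexicon for illustration only; not shipped with senti-news.
POSITIVE = {"wins", "surges", "leads", "praised", "strong"}
NEGATIVE = {"loses", "drops", "scandal", "criticized", "weak"}


def tokenize(title):
    """Lowercase the title and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", title.lower())


def lexicon_score(title):
    """Return a score in [-1, 1]: (positive - negative) / matched words.

    Titles with no lexicon matches score 0.0 (treated as neutral).
    """
    tokens = tokenize(title)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


if __name__ == "__main__":
    for headline in (
        "Warren surges in new poll",
        "Biden criticized after weak debate",
        "Candidates meet in Ohio",
    ):
        print(headline, lexicon_score(headline))
```

Normalizing by the number of matched words keeps long and short titles comparable, at the cost of ignoring negation and context, which is where the later NLP techniques would come in.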

Why

Predicting the popularity of a presidential candidate is generally regarded as very difficult, and it could be useful for a candidate or their active supporters to know what makes that candidate’s popularity increase.

Goals

  1. Create a pipeline to take text data from the news sources, clean the text data, and then analyze it using lexicons, tokenization, and other rule-based approaches.
  2. Start with only the current top 5 Democratic candidates (Biden, Warren, Sanders, Buttigieg, Harris).
  3. Build more complicated models that account for the news outlet’s bias, using NLP approaches like sentiment analysis, vectorization, neural networks, and transformers.
  4. Have an interactive user-facing product to show the latest results.
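The first two goals could be sketched roughly as follows: clean raw titles, attribute each one to the candidates it mentions, and average a sentiment score per candidate. Everything here is an assumption made for illustration (the `clean_title`, `mentioned_candidates`, and `mean_score_by_candidate` names, the surname matching, and the pluggable `score_fn`); it does not describe how senti-news is actually implemented.

```python
import html
import re
from collections import defaultdict

# The five candidates named in the goals above.
CANDIDATES = ("Biden", "Warren", "Sanders", "Buttigieg", "Harris")


def clean_title(raw):
    """Decode HTML entities, drop tags, and collapse whitespace."""
    text = html.unescape(raw)
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def mentioned_candidates(title):
    """Return the candidates whose surname appears as a whole word."""
    lowered = title.lower()
    return [c for c in CANDIDATES if re.search(rf"\b{c.lower()}\b", lowered)]


def mean_score_by_candidate(raw_titles, score_fn):
    """Average score_fn(title) over the titles that mention each candidate.

    score_fn is any callable mapping a cleaned title to a number,
    for example a lexicon-based or model-based sentiment scorer.
    """
    scores = defaultdict(list)
    for raw in raw_titles:
        title = clean_title(raw)
        value = score_fn(title)
        for candidate in mentioned_candidates(title):
            scores[candidate].append(value)
    return {c: sum(v) / len(v) for c, v in scores.items()}


if __name__ == "__main__":
    # Placeholder scorer, assumed for the demo only.
    def toy_score(title):
        return 1.0 if "wins" in title else -1.0

    titles = ["<b>Harris</b> wins endorsement", "Harris slips in poll"]
    print(mean_score_by_candidate(titles, toy_score))
```

Passing the scorer in as a callable keeps the cleaning and aggregation steps unchanged when a rule-based scorer is later replaced by a more complicated model.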