Inspiration
Primer dataset and news
What it does
How we built it
Colab notebook, Python, and qlikcloud.com for data visualization.
Challenges we ran into
Processing power; the high cost of analyzing the entire news corpus with LLMs; poor sentiment analysis results from off-the-shelf tools; and selecting a subset that can tell a good story.
Accomplishments that we're proud of
We extracted sentiment for People and Organization entities on a smallish sample of news across all three regions. The results look good and provide a springboard for further analysis. Both sets of results can be visualized with Qlik Cloud, though we didn't have time to build a UI for that. Tableau would be even cooler.
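As a rough illustration of the aggregation step, here is a minimal sketch that averages mention-level scores into one number per entity and region. Everything in it is an assumption for illustration: the record layout, the `mean_sentiment` helper, the spaCy-style `PERSON`/`ORG`/`GPE` labels, and the sample values, which are made up and are not results from the project.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mention-level records: (region, entity, entity_type, score).
# The values are invented for illustration and are not project results.
MENTIONS = [
    ("US", "Acme Corp", "ORG", -2),
    ("US", "Acme Corp", "ORG", -4),
    ("China", "Acme Corp", "ORG", 1),
    ("Russia", "Jane Doe", "PERSON", 3),
    ("Russia", "Jane Doe", "PERSON", 5),
    ("US", "Springfield", "GPE", 0),  # neither a person nor an organization
]


def mean_sentiment(mentions, keep_types=("PERSON", "ORG")):
    """Average the scores for each (entity, region) pair of the kept types."""
    buckets = defaultdict(list)
    for region, entity, entity_type, score in mentions:
        if entity_type in keep_types:
            buckets[(entity, region)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


table = mean_sentiment(MENTIONS)
for (entity, region), score in sorted(table.items()):
    print(f"{entity:10} {region:7} {score:+.1f}")
```

A flat table like this (one row per entity and region) is also a convenient shape to export for a visualization tool.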
What we learned
Sentiment is too high level. We need to look deeper into sub-sentence parts, such as noun phrases, to understand the flavours of sentiment. E.g., Trump scores -4 to -5 across all 3 regions. However, the reasons differ:
- China hates Trump because of the trade tariffs
- Russia speaks poorly of him because of 'falsifying business records'
- US: 'civil fraud case', 'sexual abuse and defamation'
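One way to keep the "why" next to the score is to store the phrases that carried the sentiment and group them by region. The sketch below does only the grouping, using the phrases quoted above; the record layout and the `reasons_by_region` helper are hypothetical, and how the phrases would be extracted (for example with a noun-phrase chunker) is left out.

```python
from collections import defaultdict

# Records built from the example above: (region, entity, phrase).
# The extraction step that would produce these phrases is not shown here.
REASONS = [
    ("China", "Trump", "trade tariffs"),
    ("Russia", "Trump", "falsifying business records"),
    ("US", "Trump", "civil fraud case"),
    ("US", "Trump", "sexual abuse and defamation"),
]


def reasons_by_region(records, entity):
    """Collect the phrases attached to one entity, keyed by region."""
    grouped = defaultdict(list)
    for region, name, phrase in records:
        if name == entity:
            grouped[region].append(phrase)
    return dict(grouped)


breakdown = reasons_by_region(REASONS, "Trump")
for region, phrases in breakdown.items():
    print(f"{region}: {', '.join(phrases)}")
```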
What's next for Propaganda deep dive
Iterate with the final users, ask questions, find ways to sample the data, and cluster topics so we can provide higher fidelity for the content that bears a sentiment. For example, if Elon Musk got a negative sentiment, why is that? Which part? What exactly do Russia, China, and the US appreciate or hate about the statement?
We can occasionally see glimpses of ideas in the current dataset: Russia is +5 on space and NASA, while China prefers its own tech over NASA and SpaceX. SpaceX is neutral (not negative!).
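For the sampling question, one simple option is a stratified sample that caps the number of articles per region, so that no region dominates and LLM cost stays bounded. This is a sketch under assumptions, not what the project did: the `stratified_sample` helper, the `region` field name, and the toy corpus are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(articles, per_region, seed=0):
    """Draw up to `per_region` articles from each region, reproducibly."""
    rng = random.Random(seed)
    by_region = defaultdict(list)
    for article in articles:
        by_region[article["region"]].append(article)
    sample = []
    for region in sorted(by_region):
        pool = by_region[region]
        # Take the whole pool when a region has fewer articles than the cap.
        sample.extend(rng.sample(pool, min(per_region, len(pool))))
    return sample


# Hypothetical corpus: ids and regions only, no real article content.
regions = ["US"] * 50 + ["China"] * 30 + ["Russia"] * 5
corpus = [{"id": i, "region": region} for i, region in enumerate(regions)]
subset = stratified_sample(corpus, per_region=10)
print(len(subset), "articles selected")
```

Fixing the seed keeps the subset stable between runs, which makes it easier to compare sentiment results as the analysis changes.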
Built With
- colab
- openai
- python
- radiant
- scikit-learn
- spacy