Propaganda deep dive

Inspiration

Primer dataset and news

What it does

How we built it

Colab notebook, Python, and qlikcloud.com for data visualization.

Challenges we ran into

Limited processing power; the high cost of analyzing the entire news corpus with LLMs; poor sentiment analysis results from off-the-shelf tools; and selecting a subset that can tell a good story.

Accomplishments that we're proud of

We extracted the sentiment for People and Organization entities on a smallish sample of news across all three regions. The results look good and provide a springboard for further analysis. Both sets of results can be visualized with Qlik Cloud, though we didn't have time to build a UI for that. Tableau would be even cooler.
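
One way per-entity scores like these could be rolled up before loading them into a visualization tool is sketched below. This is a minimal illustration, not the project's notebook code: the record layout, the -5 to +5 score scale, the `aggregate` and `to_rows` helpers, and the placeholder entities and scores are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mention-level records: (region, entity, entity_type, score).
# Entities and scores are placeholders on an assumed -5..+5 scale,
# not results from the project.
MENTIONS = [
    ("US", "Entity A", "PERSON", -4),
    ("US", "Entity A", "PERSON", -2),
    ("Russia", "Entity A", "PERSON", -5),
    ("China", "Org B", "ORG", 1),
    ("Russia", "Org B", "ORG", 5),
]


def aggregate(mentions, keep_types=("PERSON", "ORG")):
    """Average the scores per (entity, region), keeping only the given entity types."""
    buckets = defaultdict(list)
    for region, entity, entity_type, score in mentions:
        if entity_type in keep_types:
            buckets[(entity, region)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


def to_rows(averages):
    """Flatten the averages into sorted (entity, region, mean_score) rows."""
    return sorted(
        (entity, region, score) for (entity, region), score in averages.items()
    )


rows = to_rows(aggregate(MENTIONS))
```

Rows in this shape can be written out with the standard `csv` module and loaded into a dashboard as a flat table, with one row per entity and region.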

What we learned

Sentiment alone is too high-level. We need to look deeper into the sub-sentence parts, such as noun phrases, to understand the flavours of sentiment. E.g., Trump scores -4 to -5 across all 3 regions. However, the reasons differ:

China hates Trump because of the trade tariffs
Russia speaks poorly of him because of 'falsifying business records'
US: 'civil fraud case', 'sexual abuse and defamation'
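
The regional breakdown above can be kept alongside the score by storing the driving phrases per region. The sketch below is only an illustration: the triple layout and the `reasons_by_region` and `distinctive` helpers are hypothetical, the phrases are the ones quoted above, and the noun-phrase extraction step that would produce them is not shown.

```python
from collections import Counter, defaultdict

# (region, entity, phrase) triples. The phrases are the ones quoted above;
# in practice they would come from a noun-phrase extractor (not shown here).
REASONS = [
    ("China", "Trump", "trade tariffs"),
    ("Russia", "Trump", "falsifying business records"),
    ("US", "Trump", "civil fraud case"),
    ("US", "Trump", "sexual abuse and defamation"),
]


def reasons_by_region(triples, entity):
    """Group the phrases attached to one entity by region, preserving order."""
    grouped = defaultdict(list)
    for region, name, phrase in triples:
        if name == entity:
            grouped[region].append(phrase)
    return dict(grouped)


def distinctive(grouped):
    """Keep only the phrases that occur in exactly one region."""
    counts = Counter(p for phrases in grouped.values() for p in phrases)
    return {
        region: [p for p in phrases if counts[p] == 1]
        for region, phrases in grouped.items()
    }


grouped = reasons_by_region(REASONS, "Trump")
unique_reasons = distinctive(grouped)
```

With the phrases above, every region keeps all of its phrases, which matches the observation that a similar negative score has different drivers in each region.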

What's next for Propaganda deep dive

Iterate with the end users, ask questions, find ways to sample the data, and cluster topics so we can provide higher fidelity for the content that carries a sentiment. For example, if Elon Musk gets a negative sentiment, why is that? Which part of the statement drives it? What exactly do Russia, China, and the US appreciate or hate about the statement?

On the current dataset we can occasionally see glimpses of such ideas: Russia is +5 on space and NASA, while China prefers its own tech over NASA and SpaceX. SpaceX is neutral (not negative!).

Built With

colab
openai
python
radiant
scikit-learn
spacy

Updates

Victor Paraschiv started this project — May 05, 2024 02:39 PM EDT
