Inspiration

The spread of misinformation, especially in politics and other critical areas, has a significant impact on public opinion and decision-making. With the rise of social media platforms like TikTok, Instagram, and YouTube, misinformation can quickly reach a wide audience. Inspired by the need for reliable information and the ability to detect and counteract misinformation effectively, we created PerspectivePlus.

What it does

PerspectivePlus is a comprehensive tool designed to detect and analyze misinformation across various platforms. It leverages advanced natural language processing models to evaluate the credibility of content and provides users with reliable insights. By embedding video analysis, caption chunking, and clustering techniques, PerspectivePlus aims for accurate detection and offers explanations to help users understand the context and validity of the information.

How we built it

We built PerspectivePlus using a combination of state-of-the-art NLP models and machine learning techniques. Key components include:

Data Collection: We gathered a large dataset focusing on political and general misinformation.
Model Selection: Initially, we used SciBERT for embedding and similarity analysis, but we plan to switch to a more suitable model for improved performance.
Processing Pipeline: We implemented caption chunking and clustering to handle longer videos efficiently.
Platform Integration: PerspectivePlus currently supports content analysis for YouTube. In the future we hope to add YouTube Shorts, Instagram Reels, and TikTok (and maybe Xiaohongshu).
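To make the caption chunking step concrete, here is a minimal sketch of one way it could work. It assumes captions arrive as (start time, text) pairs and that chunks are fixed-size word windows with a small overlap; the function name `chunk_captions` and the parameters `max_words` and `overlap` are illustrative, not the project's actual code.

```python
def chunk_captions(segments, max_words=50, overlap=10):
    """Group caption segments into overlapping word-window chunks.

    segments: list of (start_seconds, text) tuples in playback order.
    Returns a list of dicts holding each chunk's start time and text.
    The window size and overlap are assumed values for illustration.
    """
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")

    # Flatten to (start_time, word) pairs so every chunk keeps a timestamp.
    words = [(start, w) for start, text in segments for w in text.split()]

    chunks = []
    step = max_words - overlap
    for i in range(0, len(words), step):
        window = words[i:i + max_words]
        chunks.append({
            "start": window[0][0],
            "text": " ".join(w for _, w in window),
        })
        if i + max_words >= len(words):
            break  # this window already reached the end of the captions
    return chunks
```

The overlap keeps a claim that straddles a chunk boundary visible in at least one chunk, at the cost of some repeated text.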

Challenges we ran into

Data Diversity: The dataset included a lot of political content, making it challenging to generalize across different misinformation categories.
Model Performance: BERT worked well for initial testing, but we faced limitations in processing speed and accuracy for long videos.
Integration: Embedding and analyzing multimedia content from various platforms required extensive customization and optimization.
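One way clustering can ease the long-video problem is to group near-duplicate caption chunks so that only one representative per group needs a full check. The sketch below is an assumption about how that might look, not our exact implementation: it uses a simple greedy pass over chunk embeddings with cosine similarity, and the `threshold` value is hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def cluster_chunks(embeddings, threshold=0.8):
    """Greedy clustering of chunk embeddings.

    Each vector joins the first cluster whose representative (the
    cluster's first member) is at least `threshold` similar; otherwise
    it starts a new cluster. Returns lists of indices into embeddings.
    """
    clusters = []
    for idx, vec in enumerate(embeddings):
        for cluster in clusters:
            if cosine(vec, embeddings[cluster[0]]) >= threshold:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])  # no existing cluster was close enough
    return clusters
```

A greedy pass like this depends on input order and is cruder than k-means or hierarchical clustering, but it needs no preset cluster count and runs in one sweep over the chunks.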

Accomplishments that we're proud of

Robust Data Pipeline: Successfully implemented a pipeline that handles large-scale data, including video captions and text content.
Platform Versatility: Enabled PerspectivePlus to work across multiple social media platforms.
User-Friendly Explanations: Integrated an explainer inspired by PUBHEALTH to provide clear and concise explanations of misinformation detection results.

What we learned

Model Suitability: The importance of choosing the right model for specific tasks. We learned that while BERT is powerful, it might not be the best fit for all aspects of our project.
Data Normalization: Proper normalization techniques are crucial for handling diverse datasets and avoiding biased recommendations.
Iterative Development: The value of continuously testing and iterating on our methods to improve accuracy and efficiency.
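As an illustration of the normalization point, the sketch below assumes the bias comes from comparing raw embedding vectors with a dot product, where vectors with large magnitude can outscore better-aligned ones. That cause is an assumption on our part, and the helper names are hypothetical; L2-normalizing both sides turns the dot product into cosine similarity.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def best_match(query, candidates, normalize=True):
    """Return the index of the candidate with the highest dot product."""
    if normalize:
        query = l2_normalize(query)
        candidates = [l2_normalize(c) for c in candidates]
    scores = [dot(query, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)

query = [1.0, 0.0]
candidates = [[1.0, 0.1], [5.0, 5.0]]  # the second vector is much larger
raw_winner = best_match(query, candidates, normalize=False)
normalized_winner = best_match(query, candidates, normalize=True)
```

Here the raw dot product picks the large second vector, while the normalized comparison picks the first, which points in nearly the same direction as the query.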

What's next for PerspectivePlus

Add more data covering both political and general misinformation, since the current dataset skews heavily political.
Switch to a different model.
Build out the explainer, based on an idea from PUBHEALTH.
Implement testing metrics.
Support more platforms: TikTok, Instagram, and YouTube embeds.
Split longer videos into multiple requests to speed up processing, and also use clustering.
Improve caption chunking.
Add normalization, which will help with the large examples being recommended far too frequently.
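The idea of splitting a long video into multiple requests could look roughly like the following sketch. It assumes each caption chunk can be analyzed independently and that the work is dominated by waiting on a model or API call; `analyze_chunk` is a hypothetical stand-in for that call, not a real endpoint.

```python
from concurrent.futures import ThreadPoolExecutor

def analyze_chunk(chunk_text):
    """Hypothetical stand-in for one model or API request per chunk."""
    return {"text": chunk_text, "n_words": len(chunk_text.split())}

def analyze_video(chunks, max_workers=4):
    """Send one request per chunk concurrently.

    Executor.map yields results in the same order as the input,
    so the output lines up with the original chunk order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(analyze_chunk, chunks))
```

Threads suit this case because the time is mostly spent waiting on network responses; a CPU-bound local model would need a different approach, such as batching.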
