Social Media’s Guardian

Inspiration

The rise of misinformation and toxic behaviour on social media platforms inspired us to create an online Guardian. We aimed to develop a tool that could help users discern the truth and foster a more positive and respectful online community.

What Motivated Us

We were motivated by the need to combat the spread of false information and improve the quality of interactions on social media. Our goal was to create a solution that could provide accurate information and promote civility.

What We Learned

We explored integrating Microsoft Fabric Notebooks and Lakehouse with data from social media APIs. We learned how to use multi-agent frameworks such as CrewAI and LangGraph effectively. Using GPT-4o as the LLM and two embedding models (sentence-transformers/all-MiniLM-L6-v2 and text-embedding-3-small), we worked through the complexities of online content moderation. The project gave us insight into those challenges and showed how much innovative solutions matter in creating a safer online environment.

How We Built the Project

We designed the system to ingest social media comments and video content from popular platforms such as Instagram, Facebook, and YouTube via their APIs. The ingested data is stored securely for further analysis, then analyzed, cleaned, and transformed. In this step, Azure OpenAI models flag offensive comments and categorize each comment as "POSITIVE", "NEGATIVE", "NEUTRAL", "ABUSE", "RACIST", or "SEXUAL". The results are stored in a Lakehouse table and analyzed by a multi-agent AI system built with CrewAI, which generates a summary report for each category along with detailed recommendations. We also use Power BI to build reports on top of this data. Separately, video transcripts are checked for factual accuracy using Azure OpenAI, and a multi-agent AI system built with LangGraph generates a report on the results. These outputs provide insights that can inform moderation actions on the respective social media platforms.
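The comment-categorization step can be illustrated with a minimal Python sketch. It is not the project's actual code: the function names, the prompt wording, and the choice to fall back to "NEUTRAL" for an unrecognized model reply are all assumptions made for illustration. The `llm` argument is a stand-in for whatever function sends a prompt to an Azure OpenAI deployment and returns its text reply; only the six category labels come from the description above.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List

# The six labels named in the project description.
CATEGORIES = ("POSITIVE", "NEGATIVE", "NEUTRAL", "ABUSE", "RACIST", "SEXUAL")


def build_prompt(comment: str) -> str:
    """Build a classification prompt (wording is hypothetical)."""
    return (
        "Classify the social media comment into exactly one of: "
        + ", ".join(CATEGORIES)
        + ". Reply with the label only.\n\nComment: "
        + comment
    )


def normalize_label(raw: str, fallback: str = "NEUTRAL") -> str:
    """Map a raw model reply onto a known category.

    Replies outside the label set fall back to `fallback`
    (an assumption; a real pipeline might retry or flag for review).
    """
    cleaned = raw.strip().strip(".\"'").upper()
    return cleaned if cleaned in CATEGORIES else fallback


def classify_comments(
    comments: Iterable[str], llm: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Return one row per comment, shaped for a Lakehouse-style table."""
    return [
        {"comment": c, "category": normalize_label(llm(build_prompt(c)))}
        for c in comments
    ]


def count_by_category(rows: Iterable[Dict[str, str]]) -> Dict[str, int]:
    """Count rows per category, including categories with zero rows."""
    counts = Counter(row["category"] for row in rows)
    return {cat: counts.get(cat, 0) for cat in CATEGORIES}


if __name__ == "__main__":
    # Stub model used only so the sketch runs offline.
    def stub_llm(prompt: str) -> str:
        return "positive" if "love" in prompt.lower() else "neutral"

    rows = classify_comments(["I love this channel", "First!"], stub_llm)
    print(count_by_category(rows))
```

Keeping the model call behind a plain callable means the normalization and counting logic can be tested without network access, and the per-category counts are the kind of input a summary-report step or a Power BI report could consume.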

Challenges We Faced

The major challenges were accessing data from the various social media sites and obtaining the necessary permissions. Addressing ethical concerns around AI moderation, such as bias and fairness, was also essential; we implemented rigorous testing and validation processes to minimize bias and ensure fair treatment of all users. We also ran into many dependency issues with LangChain, LangGraph, and CrewAI that had to be resolved in Fabric.

Built With

agenticframework
azureopenai
chromadb
crewai
lakehouse
langchain
langgraph
microsoftfabric
powerbi
python
rag
sqlite
youtubeapis

Updates

Jovin Mathias started this project — Nov 12, 2024 10:37 AM EST
