Inspiration
We are motivated by the pressing need to combat online harassment and its disproportionate harm to marginalized communities, including women, people of color, and LGBTQI+ individuals. Addressing these challenges protects mental well-being and preserves freedom of expression in digital spaces.
What it does
Emakia detects and filters toxic content by validating both training labels and model outputs. The broader project includes an iPhone app built on Core ML text classifiers, which we evaluated with Google's Vertex AI. For this hackathon, we focused on validating labels and model outputs using Gemini and LangChain OpenAI to improve the system's reliability and effectiveness.
How we built it
Before developing the app, we evaluated Core ML text classifiers on Google Cloud using Vertex AI. We then used LangChain OpenAI and LangChain Gemini for label and model-output validation, though so far we have completed only label validation. We imported components such as langchain.prompts and ChatGoogleGenerativeAI.
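The label-validation loop described above can be sketched as follows. In our real pipeline the judge is a Gemini call through LangChain's ChatGoogleGenerativeAI; here a tiny keyword heuristic stands in for that call so the flow runs offline. The prompt wording, label names, and keyword list are illustrative assumptions, not our production values.

```python
# Sketch of the label-validation flow: an LLM "judge" re-checks each
# human label and we collect the rows where it disagrees.

JUDGE_PROMPT = (
    "You are validating a toxicity label.\n"
    "Text: {text}\n"
    "Assigned label: {label}\n"
    "Answer 'agree' or 'disagree'."
)

TOXIC_HINTS = {"idiot", "hate", "stupid"}  # stand-in vocabulary for the demo

def mock_judge(prompt: str, text: str) -> str:
    """Stand-in for the Gemini call: predicts a label from keywords."""
    return "toxic" if any(w in text.lower() for w in TOXIC_HINTS) else "nontoxic"

def validate_labels(rows):
    """Return the (text, label) rows where the judge disagrees with the label."""
    disagreements = []
    for text, label in rows:
        prompt = JUDGE_PROMPT.format(text=text, label=label)
        if mock_judge(prompt, text) != label:
            disagreements.append((text, label))
    return disagreements

rows = [
    ("you are an idiot", "toxic"),
    ("have a nice day", "toxic"),      # deliberately mislabeled
    ("great work everyone", "nontoxic"),
]
print(validate_labels(rows))  # → [('have a nice day', 'toxic')]
```

Swapping the mock judge for a real ChatGoogleGenerativeAI invocation keeps the surrounding loop unchanged, which is what let us apply the same method to both OpenAI and Gemini backends.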
Challenges we ran into
A significant challenge was accurately validating labels and model outputs, which required integrating several tools and working through their incompatibilities. Library conflicts and large datasets complicated our evaluations. We also discovered that Gemini 1.5 Flash-8B is not compatible with LangChain Gemini, limiting us to Gemini-Pro, and it was not always obvious when to use ChatGoogleGenerativeAI, ChatPromptTemplate, or PromptTemplate. Finally, large files with many rows exceeded the Gemini request quota.
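One way around the quota problem is to split large files into small batches and pause between requests, which is the approach sketched below. The batch size, delay, and the stand-in `call` function are illustrative assumptions; in practice `call` would be the actual Gemini request and the parameters would be tuned to the real quota.

```python
# Sketch: process a large dataset in fixed-size batches with a pause
# between calls so a per-request quota is not exceeded.
import time

def batched(rows, size):
    """Yield successive chunks of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

def process_in_batches(rows, size=50, delay=0.0, call=len):
    """Apply `call` (stand-in for the Gemini request) to each batch."""
    results = []
    for batch in batched(rows, size):
        results.append(call(batch))
        time.sleep(delay)  # back off between requests to respect the quota
    return results

print(process_in_batches(list(range(120)), size=50))  # → [50, 50, 20]
```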
Accomplishments that we're proud of
We successfully integrated LangChain OpenAI and LangChain Gemini, overcoming significant technical obstacles.
What we learned
We learned the importance of setting up a virtual environment for our libraries to prevent compatibility issues. By applying the same methods to both OpenAI and Gemini, we developed a stronger understanding of components like ChatGoogleGenerativeAI, ChatPromptTemplate, and PromptTemplate. We also explored AI Studio for testing various types of content, noting that its results are not always consistent.
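The virtual-environment setup we settled on looks roughly like this. The package names are the standard PyPI distributions for the LangChain integrations we mention; the install command is shown as a comment because it requires network access.

```shell
# Create an isolated environment so the LangChain, OpenAI, and Gemini
# client libraries cannot clash with system-wide packages.
python3 -m venv .venv
./.venv/bin/python -m pip --version   # pip is available inside the venv
# Inside the venv we then installed the integrations (PyPI names):
#   ./.venv/bin/pip install langchain langchain-openai langchain-google-genai
```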
What's next for Emakia: Toxicity Filtering & Validation for Social Media
We plan to keep improving the system's reliability and effectiveness, including completing model-output validation and refining our methods for detecting and filtering toxic content.