About the Project: Emakia – Observability for Digital Integrity

Inspiration

The digital space is awash with content — some enlightening, some harmful, and some deeply misleading. Emakia was born from a vision to empower individuals and communities with tools that critically evaluate the truthfulness, bias, and toxicity of online narratives. Inspired by the increasing polarization and manipulation in media ecosystems, I set out to build a modular and ethical AI framework that could identify harmful patterns across platforms like Reddit, MaxNews, and Twitter data stored in BigQuery.
The model's weights are learned through feedback-based tuning; this logistic formulation gives us the flexibility to tune sensitivity dynamically.
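As a minimal sketch of what such a logistic formulation could look like: the function below combines per-signal scores into a single probability. The feature names, weights, and bias here are illustrative assumptions, not the project's actual tuned values.

```python
import math

def logistic_score(features, weights, bias=0.0):
    """Combine signal scores (e.g. toxicity, bias, misinformation) into a
    0-1 probability. `weights` stands in for the feedback-tuned weights
    mentioned above; the values used here are purely illustrative."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical signals and weights; raising a weight sharpens sensitivity
# to that signal, which is what makes the formulation tunable.
score = logistic_score([0.8, 0.3, 0.6], [2.0, 1.0, 1.5])
```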
What it does
How I Built It
This project integrates multiple systems and principles:

- Streamlit Dashboard: A lightweight, interactive interface where users enter DQL queries and review analysis results.
- Dynatrace DQL API: Enables real-time observability by querying metrics and logs.
- Gemini-Powered Agents: Three language agents (ToxicityAnalyst, BiasAnalyst, and MisinformationAnalyst) built on Google's Gemini LLMs via parallel orchestration.
- BigQuery Integration: Tweets and public statements are ingested and scored using custom queries.
- MongoDB Storage: Analysis results are stored for auditing, longitudinal tracking, and potential retraining cycles.
- Modular Codebase: Split into input scrapers (Reddit, MaxNews, BigQuery), analysis agents, and storage utilities, ensuring scalability and flexibility.
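To illustrate the fan-out pattern behind the three analysts, here is a simplified sketch using stub functions and a thread pool. The real project routes these calls through Gemini via google.adk's parallel orchestration; the stub scores and agent functions below are assumptions for demonstration only.

```python
from concurrent.futures import ThreadPoolExecutor

# Stub analysts standing in for the Gemini-backed agents; in the actual
# system each would issue an LLM call with its own specialized prompt.
def toxicity_analyst(text):
    return {"agent": "ToxicityAnalyst", "score": 0.1}

def bias_analyst(text):
    return {"agent": "BiasAnalyst", "score": 0.4}

def misinformation_analyst(text):
    return {"agent": "MisinformationAnalyst", "score": 0.2}

def analyze(text):
    """Fan the same input out to all three agents concurrently and
    collect their independent verdicts."""
    agents = [toxicity_analyst, bias_analyst, misinformation_analyst]
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = [pool.submit(agent, text) for agent in agents]
        return [f.result() for f in futures]

results = analyze("sample post")
```

Running the agents concurrently keeps end-to-end latency close to the slowest single agent rather than the sum of all three.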
What I Learned
- How to orchestrate multiple LLM agents in parallel while preserving context and keeping evaluations consistent.
- The nuances of building privacy-aware observability systems with Dynatrace and integrating real-time pipelines.
- Managing credentials securely through secrets.toml and maintaining reproducibility without exposing sensitive tokens.
- How bias can manifest subtly in phrasing and narrative structure, requiring careful prompt tuning and interpretability work.
Challenges I Faced
- Token Scope & API Compatibility: Dynatrace's DQL queries required specific token scopes; simple connectivity worked but broke under full integration until the scopes were correctly matched.
- Agent Overlap & Contradictions: Bias and misinformation often intertwine, and agent outputs sometimes contradicted each other, requiring additional logic to reconcile and interpret results.
- BigQuery Credentials: Parsing multiline private keys from secrets required edge-case handling to prevent decode failures.
- MongoDB Insertion Errors: Handling nested structures and missing content required robust sanitization before writes.
- Scaling: Real-time sentiment and bias detection demands performance optimizations and thoughtful caching when scaling beyond test volumes.
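The multiline private-key issue above is a common pitfall: keys copied through environment variables or TOML often arrive with literal backslash-n sequences instead of real newlines, and the credentials parser then fails to decode the PEM block. A hedged sketch of one fix (the key material shown is fabricated filler, not a real key):

```python
def normalize_private_key(raw: str) -> str:
    """Convert literal '\\n' sequences into real newlines so the PEM
    block parses. This is one plausible handling, not necessarily the
    project's exact implementation."""
    return raw.replace("\\n", "\n")

# Illustrative escaped key as it might appear after a copy-paste or
# env-var round trip (contents are placeholder text, not a usable key).
raw_key = "-----BEGIN PRIVATE KEY-----\\nMIIEvPLACEHOLDER\\n-----END PRIVATE KEY-----\\n"
key = normalize_private_key(raw_key)
```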
What's Next for Toxicity and Misinformation Detection
Built With
- bigquery-loader (get-tweets-from-bigquery)
- custom-scraper-modules (reddit-scraper, maxnews-scraper)
- db-dtypes
- db-utils
- dynatrace
- dynatrace-dql-client
- google-cloud-bigquery
- google-cloud-platform (bigquery, vertex-ai)
- google-genai (gemini)
- google.adk (parallelagent, runner, sessionservice)
- google.oauth2
- maxnews-rss-or-rest
- minimax
- mongodb
- python
- python-dotenv
- sql
- store-result
- streamlit