About the Project: Emakia – Observability for Digital Integrity

Inspiration

The digital space is awash with content — some enlightening, some harmful, and some deeply misleading. Emakia was born from a vision to empower individuals and communities with tools that critically evaluate the truthfulness, bias, and toxicity of online narratives. Inspired by the increasing polarization and manipulation in media ecosystems, I set out to build a modular, ethical AI framework that identifies harmful patterns across platforms such as Reddit, MaxNews, and Twitter data stored in BigQuery.

Each item is scored with a logistic function:

score(x) = σ(w · x + b) = 1 / (1 + e^(−(w · x + b)))

where w and b are learned weights from feedback-based tuning. This logistic formulation gives us flexibility to tune sensitivity dynamically.
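As a minimal sketch of how such a logistic score could be computed (the feature vector and weight values below are illustrative placeholders, not the project's tuned parameters):

```python
import math

def logistic_score(features, weights, bias):
    """Squash a weighted feature sum into (0, 1) with the sigmoid.

    Re-tuning `weights`/`bias` from user feedback shifts the decision
    curve, which is what makes the sensitivity dynamically adjustable.
    """
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative features (e.g. slur density, negation ratio, caps ratio)
score = logistic_score([0.8, 0.1, 0.3], weights=[2.0, -1.0, 0.5], bias=-0.5)
```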

What it does

How I Built It

This project integrates multiple systems and principles:

  • Streamlit Dashboard: a lightweight, interactive interface where users enter DQL queries and review analysis results.
  • Dynatrace DQL API: enables real-time observability by querying metrics and logs.
  • Gemini-Powered Agents: three language agents — ToxicityAnalyst, BiasAnalyst, and MisinformationAnalyst — built on Google's Gemini LLMs via parallel orchestration.
  • BigQuery Integration: tweets and public statements are ingested and scored using custom queries.
  • MongoDB Storage: analysis results are stored for auditing, longitudinal tracking, and potential retraining cycles.
  • Modular Codebase: split into input scrapers (Reddit, MaxNews, BigQuery), analysis agents, and storage utilities — ensuring scalability and flexibility.
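The fan-out/fan-in shape of the parallel orchestration can be sketched with a thread pool and a stub standing in for the real Gemini calls (the project itself builds on google.adk's ParallelAgent, so this is only an illustration of the pattern):

```python
from concurrent.futures import ThreadPoolExecutor

AGENTS = ["ToxicityAnalyst", "BiasAnalyst", "MisinformationAnalyst"]

def run_agent(agent_name, text):
    # Stub in place of a Gemini call; the real agents return
    # model-generated verdicts rather than a fixed placeholder score.
    return {"agent": agent_name, "score": 0.0, "rationale": ""}

def analyze(text):
    """Fan one piece of text out to all three analysts concurrently
    and collect their verdicts under a single record."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {name: pool.submit(run_agent, name, text) for name in AGENTS}
        return {name: f.result() for name, f in futures.items()}

verdicts = analyze("Example public statement to evaluate.")
```

Running the three analysts concurrently keeps latency close to that of a single agent call while still producing one verdict per dimension.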

What I Learned

  • How to orchestrate multiple LLM agents in parallel while preserving context and keeping evaluations consistent.
  • The nuances of building privacy-aware observability systems with Dynatrace and integrating real-time pipelines.
  • Managing credentials securely through secrets.toml and maintaining reproducibility without exposing sensitive tokens.
  • How bias can manifest subtly in phrasing and narrative structure, requiring sophisticated prompt tuning and interpretability.

Challenges I Faced

  • Token Scope & API Compatibility: Dynatrace's DQL queries required specific token scopes — simple connectivity worked but broke under full integration until the scopes were correctly matched.
  • Agent Overlap & Contradictions: bias and misinformation often intertwine, and agent outputs sometimes contradicted each other, requiring additional logic to resolve and interpret results.
  • BigQuery Credentials: parsing multiline private keys from secrets involved edge-case handling to prevent decode failures.
  • MongoDB Insertion Errors: handling nested structures and missing content required robust sanitization before writes.
  • Scaling: real-time sentiment and bias detection demands performance optimizations and thoughtful caching when moving beyond test volumes.
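The sanitization before MongoDB writes can be sketched like this (a minimal version; the project's actual record schema and handling may be more involved):

```python
def sanitize(doc):
    """Recursively drop None values and coerce dict keys to strings so
    nested analysis records insert cleanly into MongoDB."""
    if isinstance(doc, dict):
        return {str(k): sanitize(v) for k, v in doc.items() if v is not None}
    if isinstance(doc, list):
        return [sanitize(v) for v in doc if v is not None]
    return doc

record = sanitize({
    "tweet_id": 123,
    "content": None,  # missing content is dropped rather than stored as null
    "analysis": {"toxicity": 0.82, "bias": None},
})
# record is now safe to pass to collection.insert_one(record)
```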

What's next for Toxicity and Misinformation Detection

Built With

  • bigquery-loader (get tweets from BigQuery)
  • custom scraper modules (reddit-scraper, maxnews-scraper)
  • db-dtypes
  • db-utils
  • dynatrace
  • dynatrace-dql-client
  • google-cloud-bigquery
  • google-cloud-platform (BigQuery, Vertex AI)
  • google-genai (Gemini)
  • google.adk (ParallelAgent, Runner, SessionService)
  • google.oauth2
  • maxnews (RSS or REST)
  • minimax
  • mongodb
  • python
  • python-dotenv
  • reddit
  • sql
  • store-result
  • streamlit
