DependenceDoc

💡 Inspiration

As a 13-year-old Class 8 student developing a private Python-based AI assistant ecosystem named Project Bankai, I quickly hit the ultimate rite of passage for every backend engineer: Dependency Hell. Missing packages, clashing version requirements, and silent native C-toolchain pipeline crashes would freeze my development loop for hours.

When I looked for existing tools, many of the developer utilities I found were thin AI wrappers: they send raw terminal dumps to an LLM and hope it guesses a fix. That approach risks hallucinated commands, security vulnerabilities, and broken local environments. I was inspired to build DependenceDoc, a completely open-source, production-grade system health engine that reverses this paradigm. DependenceDoc shows that software remediation does not need to rely blindly on generative AI. The core engine operates entirely on deterministic logic, compiled regular expressions, and strict version-constraint mathematics. AI is deployed purely as an executive auditor that explains the underlying system mechanics to the user; it never chooses the fix, so the commands the engine produces do not depend on what the model says.

🛠️ How We Built It & Core Architecture

DependenceDoc is built as a transparent, fully open-source project: the entire codebase can be viewed directly from the UI panel. The application features a clean, interactive Streamlit frontend (main.py) running inside a containerized Replit environment, with API credentials kept in the platform's secret storage rather than in the code.

The pipeline processes raw logs through a set of decoupled, object-oriented modules:

app.py (Master Orchestrator): Instantiates all core engines, wraps the execution pipeline inside a secure background worker thread monitored by a SandboxWatcher, computes an explainable confidence score, and bundles the final payload.
core/sentinel.py (Signal Detection Gate): Performs rapid, keyword-based multi-domain classification of the incoming log, instantly generating a boolean signal map across four separate fault domains: System, Environment, Runtime, and Dependency.
core/scout.py (Constraint Extraction Parser): Scans raw logs line-by-line using three independent regex engines (Classic Inline, Modern Multi-Line, and Requested Version Detectors) to output structured dependency specifiers.
core/detective.py (Conflict Analysis Engine): Implements strict PEP 440 SpecifierSet mathematics to analyze package versions, grouping multi-line constraints into unified sets and classifying errors as version violations or unresolvable deadlocks.
core/healer.py (Recovery Plan Compiler): Evaluates conflict reports alongside live metadata fetched via the PyPI Registry Client (services/pypi_client.py), running a constraint satisfaction algorithm to find optimal pinned install points and output safe, executable --force-reinstall commands.
core/system_resolver.py (OS-Layer Fault Resolver): Maintains regular expression rule arrays to catch native bare-metal toolchain failure signatures—such as missing GCC compilers or absent OpenSSL headers (openssl/ssl.h)—and instantly builds the required native terminal command arrays.
core/environment_resolver.py & core/runtime_resolver.py: Detect and resolve environment path configurations (like missing virtual environments or secret tokens) and infrastructure layer collisions (like ghost processes or database connection errors using targeted lsof and systemctl commands).
core/auditor.py (AI Insight Gateway): Routes the mathematically verified payload to Gemini 2.5 Flash or Groq LLaMA-3.3-70B using a dual-provider credential resolution chain. The AI acts strictly as an auditor, translating the hard data into natural explanations and side-effect warnings.
core/watcher.py (Sandbox Execution Warden): Enforces defensive guardrails in real time, policing a 90-second CPU thread ceiling and a maximum threshold of 10 cumulative API calls to guarantee a clean runtime exit.
models/package.py (Structured Data Models): Enforces strict type safety throughout the entire application data cycle using dataclass blueprints.
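
The first two stages above (signal detection, then constraint extraction) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the keyword table, the `Constraint` dataclass, and the function names are all hypothetical, and the regex covers only the single-line "X requires Y, but you have Z" message that pip prints for installed-package conflicts.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword table: the real sentinel's keywords are not listed in this write-up.
DOMAIN_KEYWORDS = {
    "system": ("gcc", "openssl/ssl.h"),
    "environment": ("virtualenv", "no module named"),
    "runtime": ("address already in use", "connection refused"),
    "dependency": ("requires", "incompatible"),
}


@dataclass(frozen=True)
class Constraint:
    requirer: str   # package that declares the requirement
    package: str    # package being constrained
    specifier: str  # raw specifier text, e.g. "<2,>=1.19.3"


# Matches lines such as:
#   streamlit 1.30.0 requires numpy<2,>=1.19.3, but you have numpy 2.0.0 ...
REQUIRES_RE = re.compile(
    r"^(?P<requirer>[A-Za-z0-9_.-]+)\s+\S+\s+requires\s+"
    r"(?P<package>[A-Za-z0-9_.-]+)(?P<spec>[<>=!~]\S*),\s+but you have\b"
)


def classify(log: str) -> dict:
    """Return a boolean signal map with one entry per fault domain."""
    lowered = log.lower()
    return {
        domain: any(keyword in lowered for keyword in keywords)
        for domain, keywords in DOMAIN_KEYWORDS.items()
    }


def extract_constraints(log: str) -> list:
    """Scan the log line by line and collect structured dependency constraints."""
    found = []
    for line in log.splitlines():
        match = REQUIRES_RE.match(line.strip())
        if match:
            found.append(
                Constraint(match["requirer"], match["package"], match["spec"])
            )
    return found


# Illustrative log; the version numbers are made up for the example.
SAMPLE_LOG = (
    "error: command 'gcc' failed: No such file or directory\n"
    "streamlit 1.30.0 requires numpy<2,>=1.19.3, "
    "but you have numpy 2.0.0 which is incompatible.\n"
)

signals = classify(SAMPLE_LOG)
constraints = extract_constraints(SAMPLE_LOG)
```

Keeping classification separate from extraction means a log that trips both the System and Dependency signals can be handed to both resolvers, instead of being forced into a single category.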

📊 Analytics, Verification & Seamless Exports

To track real-world application performance, the frontend includes a Novus Analytics event emitter (services/pendo_tracker.py). It posts structured telemetry events at five system lifecycle milestones (from log loading to report generation) back to our database without interrupting the user.
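
One way to keep telemetry from ever disrupting the user is to swallow transport failures inside the emitter. The sketch below assumes that design; `make_emitter`, the injected `send` callable, and the event names are hypothetical and do not reflect the actual contents of services/pendo_tracker.py or any real analytics API.

```python
import json
import time
from typing import Callable


def make_emitter(send: Callable[[str], None]) -> Callable[..., bool]:
    """Wrap a transport function so that telemetry failures never reach the UI.

    `send` is whatever actually delivers the payload (an HTTP POST in
    production, a list append in tests). It is injected so that the emitter
    itself carries no network code.
    """

    def emit(event: str, **properties) -> bool:
        payload = json.dumps(
            {"event": event, "timestamp": time.time(), "properties": properties}
        )
        try:
            send(payload)
            return True
        except Exception:
            # Telemetry is best-effort: report failure to the caller, never raise.
            return False

    return emit


# Usage with an in-memory transport (hypothetical event name).
sent = []
emit = make_emitter(sent.append)
ok = emit("log_loaded", line_count=120)
```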

Once a diagnostic run finishes, the platform bypasses manual copying by providing two robust export options managed by the Executive PDF Report Engine (services/pdf_generator.py):

An executable .sh terminal shell script to run the calculated fixes instantly.
A comprehensive, professionally styled three-page ReportLab PDF documentation report complete with root-cause insights, a predictive risk matrix, and monospace remediation command blocks.
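
As a rough idea of what the shell-script export involves, the sketch below turns a list of remediation commands into a runnable script. The function name `render_fix_script` and the example commands are assumptions made for illustration; the project's real exporter may be structured differently.

```python
import shlex


def render_fix_script(commands, title="DependenceDoc remediation script"):
    """Render remediation commands as a bash script that stops at the first failure."""
    lines = [
        "#!/usr/bin/env bash",
        f"# {title}",
        "set -euo pipefail",  # abort on errors, unset variables, and failed pipes
        "",
    ]
    total = len(commands)
    for index, command in enumerate(commands, start=1):
        # Quote the progress message so that characters in the command
        # cannot break the echo line.
        lines.append("echo " + shlex.quote(f"[{index}/{total}] {command}"))
        lines.append(command)
    return "\n".join(lines) + "\n"


# Illustrative commands only; the package pin is a made-up example.
script = render_fix_script(
    [
        "sudo apt-get install -y libssl-dev",
        'python -m pip install --force-reinstall "numpy==1.26.4"',
    ]
)
```

Using `set -euo pipefail` is a deliberate choice here: if an early fix fails (for example, the system package cannot be installed), later commands that depend on it do not run against a half-repaired environment.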

⚠️ Challenges We Faced

The greatest challenge of this build was handling multi-domain environment crashes, where local library dependencies and low-level operating system toolchains fail at the same time. Testing the system with a massive crash log, in which a Python cryptography wheel compilation broke due to a missing GCC toolchain on the host while Streamlit and Subaligner fought over version constraints, initially caused chaotic telemetry routing.

By optimizing the regex patterns and refining the rule tables inside core/system_resolver.py, I tuned the multi-layered routing until it handled this case. The system now parses compound system crashes, isolates the root cause, and generates exact bare-metal command solutions. On that test log, the full run took 14.05 seconds and reported 100% resolution confidence.
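
A rule table of the kind described for core/system_resolver.py could look roughly like this. It is a sketch under assumptions: the rule names, the exact patterns, and the Debian/Ubuntu `apt-get` package names are illustrative choices, not the project's actual rules.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    name: str
    pattern: re.Pattern
    commands: tuple  # shell commands to suggest when the pattern matches


# Assumes a Debian/Ubuntu host; other distributions need different packages.
RULES = (
    Rule(
        "missing_gcc",
        re.compile(
            r"(?:unable to execute|command) '?gcc'?.*No such file or directory",
            re.IGNORECASE,
        ),
        ("sudo apt-get install -y build-essential",),
    ),
    Rule(
        "missing_openssl_headers",
        re.compile(
            r"fatal error: openssl/ssl\.h: No such file or directory",
            re.IGNORECASE,
        ),
        ("sudo apt-get install -y libssl-dev",),
    ),
)


def resolve_system_faults(log: str) -> list:
    """Return the de-duplicated commands for every rule whose pattern matches."""
    plan = []
    for rule in RULES:
        if rule.pattern.search(log):
            for command in rule.commands:
                if command not in plan:
                    plan.append(command)
    return plan


compound_log = (
    "error: command 'gcc' failed: No such file or directory\n"
    "fatal error: openssl/ssl.h: No such file or directory\n"
)
plan = resolve_system_faults(compound_log)
```

Because each rule is matched independently, a compound log produces one command per detected fault, in table order, rather than stopping at the first match.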

🧠 What We Learned & The Future Vision

Building DependenceDoc taught me that true, production-grade reliability is achieved by anchoring software in deterministic mathematics and structural logic, using large language models as contextual narrators rather than primary decision-makers.

Moving into Phase 2, the vision is to leverage Google AI Studio’s 2-million-token context window to expand DependenceDoc beyond individual terminals. The next evolution will allow it to ingest, parse, and map entire multi-container enterprise cloud environments and to heal complex microservice infrastructures automatically. The app is live, the code is completely transparent for community audit, and Project Bankai is officially deployed!

Built With

gemini-api
groq
linux
novus-analytics
pendo-api
pep-440-mathematics
python
regular-expressions
replit
reportlab-pdf-engine
shell-scripting
streamlit

Updates

Krishiv J started this project — Jun 20, 2026 09:15 AM EDT
