Inspiration
Traditional Chinese Medicine (TCM) involves incredibly complex, multi-layered systems—ranging from classical herb pairings and formula compositions to modern network pharmacology and molecular targets. Researchers often have to manually bounce between scattered databases (like TCMSP, PubMed, and KEGG) and overcome significant language barriers between Chinese source texts and English scientific literature. The inspiration was to build a unified, intelligent "agentic OS" that could modernize this process, autonomously bridging ancient medical frameworks with modern bioinformatics through natural language.
What it does
tcm-cli is an open-source, autonomous AI agent for TCM research and discovery. You can ask it complex questions in natural language (e.g., "Build a network pharmacology analysis for 补中益气汤 against diabetes targets"). The agent then plans a multi-step research workflow, selects from over 30 built-in domain tools, executes the analysis, validates the results, and returns data-backed conclusions. It covers syndrome differentiation, formula analysis, herb-drug interaction safety checks, and literature reviews, and it supports both English and Chinese terminology.
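The plan, execute, and validate cycle described above can be sketched roughly as follows. This is a minimal illustration, not tcm-cli's actual code: `Step`, `run_plan`, the `"$prev"` placeholder, and both stub tools are hypothetical names, and the stub outputs are placeholders rather than real pharmacology data.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Step:
    """One planned tool call (hypothetical structure)."""
    tool: str
    args: Dict[str, Any]


# Stub tools returning placeholder values, NOT real data.
def lookup_formula(name: str) -> List[str]:
    return ["herb_A", "herb_B"]


def predict_targets(herbs: List[str]) -> List[str]:
    return [f"target_of_{h}" for h in herbs]


TOOLS: Dict[str, Callable[..., Any]] = {
    "lookup_formula": lookup_formula,
    "predict_targets": predict_targets,
}


def non_empty(output: Any) -> bool:
    """Simplest possible validation: the tool returned something."""
    return bool(output)


def run_plan(
    plan: List[Step],
    tools: Dict[str, Callable[..., Any]] = TOOLS,
    validate: Callable[[Any], bool] = non_empty,
) -> List[Tuple[str, Any]]:
    """Run each step in order, feeding "$prev" args from the prior output."""
    results: List[Tuple[str, Any]] = []
    previous: Any = None
    for step in plan:
        if step.tool not in tools:
            raise KeyError(f"unknown tool: {step.tool}")
        # Replace the "$prev" placeholder with the previous step's output.
        args = {
            key: (previous if isinstance(value, str) and value == "$prev" else value)
            for key, value in step.args.items()
        }
        output = tools[step.tool](**args)
        if not validate(output):
            raise ValueError(f"validation failed after {step.tool}")
        results.append((step.tool, output))
        previous = output
    return results


example_plan = [
    Step("lookup_formula", {"name": "补中益气汤"}),
    Step("predict_targets", {"herbs": "$prev"}),
]
```

Calling `run_plan(example_plan)` returns one `(tool_name, output)` pair per step. In a real agent the plan would come from the LLM and validation would be considerably stricter than a non-empty check.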
How we built it
We built the tool as an interactive, terminal-based Python CLI.
- Core Intelligence: It is powered by a multi-model reasoning engine that lets users plug in their preferred LLM (supporting Anthropic, OpenAI, Google Gemini, DeepSeek, Kimi, and more).
- Tool & Data Integrations: We built an ecosystem of 30+ specialized tools that connect live to 10+ databases, including PubMed, TCMSP, UniProt, STRING, KEGG, ClinicalTrials.gov, and Open Targets.
- Local Datasets: We engineered a data-pulling system to download and manage heavy bioinformatics datasets (like TCMID, BATMAN-TCM, and SymMap) locally to boost offline accuracy.
- UX: We wrapped it all in an interactive terminal featuring slash commands, session exports, token tracking, and customizable agent profiles (research, clinical, or education).
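One common way to let users plug in their preferred LLM is a small registry that maps a provider name to a callable with a shared signature. The sketch below assumes that design; `register_provider`, `get_provider`, and the two toy backends are hypothetical and make no network calls, so they stand in for real provider SDKs rather than reproduce them.

```python
from typing import Callable, Dict

# Shared interface assumed for this sketch: prompt in, completion text out.
ModelFn = Callable[[str], str]

_PROVIDERS: Dict[str, ModelFn] = {}


def register_provider(name: str) -> Callable[[ModelFn], ModelFn]:
    """Decorator that registers a backend under a case-insensitive name."""
    def decorator(fn: ModelFn) -> ModelFn:
        _PROVIDERS[name.lower()] = fn
        return fn
    return decorator


@register_provider("echo")
def echo_model(prompt: str) -> str:
    # Toy backend: returns the prompt with a tag. A real backend would
    # call the provider's API here.
    return f"[echo] {prompt}"


@register_provider("reverse")
def reverse_model(prompt: str) -> str:
    # Second toy backend, only to show that switching works.
    return prompt[::-1]


def get_provider(name: str) -> ModelFn:
    """Look up a backend, listing the known names on failure."""
    try:
        return _PROVIDERS[name.lower()]
    except KeyError:
        known = ", ".join(sorted(_PROVIDERS))
        raise ValueError(f"unknown provider {name!r}; known: {known}") from None
```

With this shape, switching models is a one-line change: `get_provider("echo")("hello")` and `get_provider("reverse")("hello")` go through the same call site.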
Challenges we ran into
(Inferred from the architecture)
- Orchestrating Complex Workflows: Teaching the LLM to accurately sequence a pipeline out of 30+ tools—for example, mapping a symptom to a syndrome, finding the right classical formula, extracting the active compounds, and then running an ADMET prediction—required robust prompt engineering and agent planning logic.
- Bilingual Data Alignment: TCM research requires strict fidelity to historical Chinese texts while simultaneously mapping that data to modern English biomedical terminology. Maintaining this context without losing scientific accuracy was a significant hurdle.
- Data Fragmentation: Consolidating highly fragmented, multi-format bioinformatics databases (some requiring manual registration and extraction) into a streamlined CLI tool was an engineering challenge.
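The bilingual alignment problem above can be illustrated with a small glossary that resolves either a Chinese or a romanized name to one canonical record, then renders it per language mode. This is a sketch under assumptions: `Term`, `GLOSSARY`, `lookup`, and `render` are hypothetical names, the two entries are only examples, and the `bi` output format is a guess at a sensible layout rather than the tool's actual one.

```python
import unicodedata
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Term:
    zh: str  # Chinese name as it appears in source texts
    en: str  # romanized or English name used in the literature


# Illustrative entries only; a real glossary would be far larger.
GLOSSARY = [
    Term(zh="君臣佐使", en="Jun-Chen-Zuo-Shi"),
    Term(zh="补中益气汤", en="Buzhong Yiqi Tang"),
]


def _normalize(text: str) -> str:
    """Fold width variants, surrounding whitespace, and letter case."""
    return unicodedata.normalize("NFKC", text).strip().casefold()


# Index every term under both of its names.
_INDEX: Dict[str, Term] = {}
for _term in GLOSSARY:
    _INDEX[_normalize(_term.zh)] = _term
    _INDEX[_normalize(_term.en)] = _term


def lookup(name: str) -> Optional[Term]:
    """Return the canonical term for a Chinese or romanized name."""
    return _INDEX.get(_normalize(name))


def render(term: Term, mode: str) -> str:
    """Render a term for one of the language modes: en, zh, or bi."""
    if mode == "en":
        return term.en
    if mode == "zh":
        return term.zh
    if mode == "bi":
        return f"{term.en} ({term.zh})"
    raise ValueError(f"unsupported language mode: {mode!r}")
```

Keeping a single record per concept means the Chinese source name is never overwritten by its English rendering; the mode only changes how the record is displayed.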
Accomplishments that we're proud of
- Comprehensive Tool Ecosystem: We integrated a broad range of TCM research tools, from classical frameworks like Jun-Chen-Zuo-Shi (君臣佐使) formula analysis to modern network pharmacology, into a single interface.
- Seamless Multi-Model Support: The architecture supports several major LLM providers out of the box (Anthropic, OpenAI, Google Gemini, DeepSeek, Kimi, and more), so researchers can switch models based on their needs or API availability.
- True Bilingual Capabilities: We are proud of the robust language modes (en, zh, and bi) that seamlessly align bullet points and headings, making the tool highly accessible to both traditional practitioners and Western researchers.
What we learned
Building tcm-cli reinforced the power of combining deterministic bioinformatics tools with non-deterministic LLM reasoning. We learned how to structure an "agentic OS" that doesn't just confidently generate text, but actively grounds its conclusions in hard data fetched from external APIs and local pharmacological databases. It also highlighted the massive potential of AI in accelerating cross-disciplinary research in alternative and traditional medicine.
What's next for TCM-CLI
Moving forward, we plan to refine our experimental Python sandbox, which currently allows the agent to write and execute custom code for data analysis. We also intend to expand our integration with more local clinical datasets, improve the accuracy of our pathway enrichment visualizations, and further optimize the agent's multi-step planning capabilities for highly complex, longitudinal clinical trial research.
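As a rough idea of what executing agent-written code involves, the sketch below runs a snippet in a separate Python process with a timeout. It is an assumption-laden illustration, not the project's sandbox: `run_snippet` is a hypothetical helper, and a child process with a timeout is only a first layer of containment. It does not restrict file or network access.

```python
import subprocess
import sys
from typing import Tuple


def run_snippet(code: str, timeout_s: float = 5.0) -> Tuple[int, str, str]:
    """Run code in a child interpreter; return (exit_code, stdout, stderr).

    -I starts Python in isolated mode, which ignores PYTHON* environment
    variables and the user site directory. This is NOT a security
    boundary on its own.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # Report a timeout with a sentinel exit code of -1.
        return -1, "", f"timed out after {timeout_s} s"
    return proc.returncode, proc.stdout, proc.stderr
```

Returning the exit code and both output streams lets the agent see failures and decide whether to revise the code and retry.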
Built With
- antigravity
- gemini