HealthTest AI
Automating IEC 62304-Compliant Test Case Generation for Healthcare Software
The Problem
Healthcare software development operates under strict regulatory frameworks — IEC 62304, FDA 21 CFR Part 11, ISO 13485, and HIPAA.
Quality assurance (QA) teams in medical device companies spend 40–60% of their time manually converting requirements into compliant test cases.
A single software module can take weeks to generate hundreds of test cases — each carefully mapped to requirements, compliance tags, and risk classifications.
We’ve seen QA engineers spending 2–3 hours per requirement just to write documentation, often using large spreadsheets that are difficult to maintain during audits.
This creates bottlenecks that delay product launches, increase costs, and limit scalability for fast-growing healthtech startups.
Modern LLMs can understand complex regulatory language — so why not build a system that generates IEC 62304-compliant test cases instantly?
What It Does
HealthTest AI automates the entire test case generation workflow for healthcare software.
Key Features
Multi-Format Requirements Processing Accepts requirements in PDF, Word, or plain text format.
AI-Powered Test Case Generation Uses advanced LLMs to generate comprehensive, compliance-ready test cases covering:
Positive and negative scenarios
Boundary conditions
Security validations
Approval Workflow
QA teams can approve, reject, or enhance AI-generated test cases before finalization.
Compliance-First Output Each test case includes:
Traceability IDs linking to source requirements
Risk levels per IEC 62304 classification
Compliance tags (FDA 21 CFR Part 11, HIPAA, ISO 27001)
Detailed preconditions, test steps, and expected results
Jira Integration Mock export functionality that generates ticket IDs for immediate QA workflow integration.
How We Built It
Backend Architecture
FastAPI web framework with async support for efficient processing
Document parsing pipeline using PyPDF2 and python-docx
JSON-based persistence for requirements, test cases, exports, and logs
SHA-256 hashing for duplicate requirement detection
AI Prompt Engineering
Our custom system prompt deeply understands:
IEC 62304 software safety classifications
FDA electronic record requirements
ISO 13485 design controls
HIPAA security safeguards
The prompt instructs the LLM to generate structured JSON output with realistic, audit-ready test cases that would pass regulatory review.
Frontend Stack
Vanilla JavaScript for simplicity and performance
CSS Grid/Flexbox for responsive layouts
Hash-based routing for navigation
Real-time polling for generation status updates
Challenges We Ran Into
API Cost Management
Claude API costs $15 per million tokens.
One large document could cost $2–3 to process.
Implemented prompt caching (90% cost reduction) and multi-LLM fallback to free tiers.
Empty Response Handling
Some providers returned empty strings or malformed JSON.
Added comprehensive error handling and fallback test cases to ensure system reliability.
Windows Environment Variable Loading
.env file failed to load due to UTF-8 BOM encoding issues.
Implemented manual file parsing as a fallback solution.
Accomplishments
Technical
Achieved 90% cost reduction through caching and fallback strategies
Generated realistic test cases with proper compliance terminology
Built a robust system compatible with multiple LLMs
Business Value
Reduced test case creation time from hours to seconds
Estimated 60% QA time savings
Ensured comprehensive compliance coverage
Made enterprise-grade QA automation accessible to startups
User Experience
Clean and professional UI suitable for healthcare enterprises
Approval workflow integrated into existing QA processes
Mock Jira integration demonstrating real-world applicability
What We Learned
Provider Abstraction is Critical: Avoid hardcoding API calls to prevent failures from rate limits.
Healthcare Compliance is Complex: Test cases must reflect regulatory terminology and structure.
Error Handling Over the Happy Path: Robust fallback handling ensures production stability.
JSON Parsing is Non-Trivial: Each LLM formats responses differently; a multi-stage parser was necessary.
What’s Next for HealthTest AI
Short-Term (Next 3 Months)
Real Jira/Azure DevOps integration with OAuth authentication
Traceability matrix visualization linking requirements to test cases
Batch processing for multiple requirement documents
Excel export with regulatory audit formatting
Medium-Term (6–12 Months)
PostgreSQL backend replacing JSON file storage
Role-based access control for multi-user collaboration
Test execution automation integration
Custom compliance templates (add regulations beyond FDA/IEC)
Version control for requirements with automatic test case regeneration
Long-Term Vision
End-to-end QA automation: requirements → test cases → execution → defect tracking
AI-powered maintenance that auto-updates test cases when requirements change
Compliance dashboard showing regulatory coverage metrics
GDPR-compliant enterprise deployment for EU healthcare companies
CI/CD integration for continuous compliance validation
Future Enhancement: RAG Integration
In future releases, Retrieval-Augmented Generation (RAG) will be integrated to enhance robustness and reduce hallucinations.
RAG will:
Retrieve verified compliance and regulatory documents dynamically
Improve factual accuracy and contextual grounding
Ensure all generated test cases reference authoritative standards
Provide audit-proof traceability from requirement → standard → test case
This will make HealthTest AI more reliable, explainable, and production-ready for enterprise healthcare QA teams.
Built With
- 4.5
- anthropic
- claude
- css
- fastapi
- git
- html
- javascript
- pydantic
- pypdf2
- python
- sonnet
- uvicorn
Log in or sign up for Devpost to join the conversation.