Inspiration

After reviewing the starter code provided, our team decided to take on the Closed Challenge topic. As beginners, we felt a more structured and guided approach would benefit us, since we would have an existing program to build on rather than starting from scratch.

What it does

TheTale receives biological evidence (DNA samples and fingerprints, as FASTA and BMP files) from crime scenes and compares them against a database of suspects.

  • Fingerprint Analysis: Uses ORB keypoint matching to evaluate visual similarities and account for smudging or distortion.
  • DNA Alignment: Utilizes the Needleman-Wunsch dynamic programming algorithm to calculate global alignment scores, even handling degraded samples with missing base pairs.
  • Additional Forensics: Processes secondary clues such as algor mortis (time-of-death estimation) to refine the suspect ranking. The model we integrated, Glaister's equation, estimates the hours since death ($\Delta t$) from the victim's current temperature, normal body temperature, and a cooling rate constant.
  • The AI Investigator: Uses the Gemini API to analyze the raw scores and generate a case report explaining the forensic significance of each match. The final output is a ranked list of suspects with confidence scores and a detailed analysis of the reasoning.
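As a rough illustration of the time-of-death estimate, Glaister's equation is commonly written as $\Delta t = (T_{normal} - T_{current}) / k$, with temperatures in degrees Fahrenheit. The sketch below assumes the commonly quoted defaults of 98.4 °F for normal body temperature and 1.5 °F per hour for the cooling rate; the function name and defaults are illustrative and may differ from what the project actually uses.

```python
def hours_since_death(current_temp_f, normal_temp_f=98.4, cooling_rate_f_per_hour=1.5):
    """Estimate hours since death with a linear cooling model (Glaister's equation).

    The defaults are the commonly quoted textbook values and are assumptions here.
    """
    if cooling_rate_f_per_hour <= 0:
        raise ValueError("cooling rate must be positive")
    delta_t = (normal_temp_f - current_temp_f) / cooling_rate_f_per_hour
    # A body warmer than the reference temperature gives a negative estimate;
    # clamp it to zero rather than report a negative elapsed time.
    return max(delta_t, 0.0)
```

For example, a measured temperature of 92.4 °F gives an estimate of about four hours under these assumed constants.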

How we built it

We split the team in two. One half worked on optimizing the starter code and handling edge cases: OpenCV handles feature extraction from the fingerprint files, Biopython parses the DNA files, and a Needleman-Wunsch dynamic programming matrix scores the DNA comparison.
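To make the DNA scoring step concrete, here is a minimal sketch of a Needleman-Wunsch global alignment score in plain Python. The match, mismatch, and gap values are placeholder assumptions, not the scoring scheme used in the project, and the function returns only the final score rather than the alignment itself.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty sequence costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diagonal, up, left)

    return score[-1][-1]
```

With these placeholder weights, a sample missing one base still scores close to a full match: aligning "ACGT" against "AGT" costs a single gap penalty instead of failing outright.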

The other half worked on developing the front-end and connecting it to the OpenCV/DP logic through a Flask backend and RESTful API calls. The user interface was rapidly prototyped and designed in Figma to ensure a sleek, dark-mode, professional crime-lab aesthetic.

To tie it all together, we integrated the Google Gemini API to read the algorithmic outputs and generate human-readable forensic reports. Because our requests were being rate limited, we also implemented a fallback: a manually generated case report in PDF form that can be previewed through the dashboard.
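The fallback logic described above can be sketched as a simple try/except around the LLM call. Everything named here is hypothetical: `RateLimitError` stands in for whatever exception the API client raises, `llm_call` is any function that takes the scores and returns report text, and the sketch produces plain text rather than rendering a PDF.

```python
class RateLimitError(Exception):
    """Hypothetical stand-in for the client's rate-limit exception."""


def build_manual_report(scores):
    """Build a plain-text report from a {suspect: score} mapping."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    lines = ["Case report (generated without the LLM)"]
    for rank, (suspect, score) in enumerate(ranked, start=1):
        lines.append(f"{rank}. {suspect}: combined score {score:.2f}")
    return "\n".join(lines)


def generate_report(scores, llm_call):
    """Try the LLM first; fall back to the manual report when rate limited."""
    try:
        return llm_call(scores)
    except RateLimitError:
        return build_manual_report(scores)
```

Passing the LLM call in as a parameter keeps the fallback path easy to exercise without making a network request.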

Challenges we ran into

For frontend development, the biggest challenge was making sure the API calls worked and did not return 500 Internal Server Error responses. We also had to figure out how to place uploaded files directly into their respective folders so they would connect seamlessly with the backend logic. Once that was resolved, the rest was smooth sailing! We did run into issues integrating the Gemini API: it worked initially, but then we all ended up getting rate limited, so we resorted to generating our own case report PDF.
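One way to route uploads into their folders is to pick the destination from the file extension. The sketch below is only an assumption about how this could look: the folder names are hypothetical, and in a Flask app the filename would typically also be passed through Werkzeug's `secure_filename` before saving.

```python
from pathlib import PurePosixPath

# Hypothetical folder layout; the project's real paths may differ.
UPLOAD_FOLDERS = {
    ".fasta": "data/dna",
    ".bmp": "data/fingerprints",
}


def destination_for(filename):
    """Return the relative path an uploaded file should be saved to."""
    # Keep only the final path component so "../" cannot escape the folder.
    name = PurePosixPath(filename.replace("\\", "/")).name
    suffix = PurePosixPath(name).suffix.lower()
    if suffix not in UPLOAD_FOLDERS:
        raise ValueError(f"unsupported file type: {filename!r}")
    return f"{UPLOAD_FOLDERS[suffix]}/{name}"
```

Rejecting unknown extensions up front gives the frontend a clear error to display instead of a generic server failure.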

For the backend, we realized we could simplify the DNA starter code considerably, which took some time because we had to understand the code first. On the fingerprint side, the hardest part was implementing and debugging Lowe's ratio test for partial fingerprints, which filters matches with a relative distance threshold rather than a fixed one.
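The core of Lowe's ratio test is small enough to show on plain numbers. In this sketch each entry holds the distances to a keypoint's two nearest neighbours, best first; with OpenCV these would typically come from `cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des1, des2, k=2)` via each match's `.distance`. The 0.75 ratio is a commonly used default and an assumption here, not necessarily the value the project settled on.

```python
def ratio_test(neighbor_distances, ratio=0.75):
    """Return the indices of keypoints whose best match passes Lowe's ratio test.

    neighbor_distances: sequence of (best_distance, second_best_distance) pairs.
    """
    kept = []
    for index, pair in enumerate(neighbor_distances):
        # Partial prints can yield fewer than two neighbours; skip those.
        if len(pair) < 2:
            continue
        best, second = pair[0], pair[1]
        # Keep the match only if it is clearly better than the runner-up.
        if best < ratio * second:
            kept.append(index)
    return kept
```

Because the threshold is relative to the second-best distance, a smudged print with uniformly larger distances can still produce accepted matches, which a fixed cutoff would reject.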

What we learned

We learned how to bridge the gap between UI/UX design (Figma) and functional backend engineering (Flask). We also gained a deep appreciation for the power of LLMs, which we used to debug code and to help us learn! Attending the workshop on ORB with OpenCV and on how the Needleman-Wunsch algorithm works also encouraged us to keep pursuing biotech-related concepts!
