# Inspiration

The inspiration for SafeQuery DP came from a critical bottleneck in healthcare that we call the "Privacy Paradox." Data is the fuel for medical breakthroughs, yet strict regulations such as GDPR and HIPAA often keep it locked away to protect patient identities. Traditional anonymization is no longer enough, because an attacker can re-identify individuals by linking "anonymized" records with auxiliary data. We set out to build a solution that lets researchers "see" the patterns in the data without ever "touching" individual identities.

# What it does

SafeQuery DP is a Zero-Knowledge Privacy Middleware that acts as a secure gateway between sensitive databases and researchers. It uses Differential Privacy to add calibrated Laplace noise to the results of SQL queries before they are returned.

  • It provides a Privacy-Accuracy Dashboard where users can tune the Epsilon ($\epsilon$) budget.
  • It features an Attack Simulator that detects and blocks "Linkage Attacks" in real time.
  • It generates Synthetic Data previews and one-click Compliance Reports for HIPAA auditing.

# How we built it

We built the core engine using FastAPI for high-concurrency performance and React/Tailwind CSS for a professional cybersecurity-themed interface. The mathematical backbone is the Laplace Mechanism:

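$$f_{\text{noisy}}(x) = f(x) + \text{Laplace}\left(\frac{\Delta f}{\epsilon}\right)$$

Here $\Delta f$ is the sensitivity of the query (the most its result can change when one individual's record is added or removed), $\epsilon$ is the privacy budget spent on the query, and $\frac{\Delta f}{\epsilon}$ is the scale of the zero-mean Laplace noise.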
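The formula above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the project's actual code: the function name `laplace_mechanism` and its parameters are assumptions made for this example.

```python
import numpy as np


def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Return true_value plus zero-mean Laplace noise of scale sensitivity / epsilon.

    Hypothetical helper for illustration; not SafeQuery DP's real API.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if sensitivity <= 0:
        raise ValueError("sensitivity must be positive")
    rng = rng if rng is not None else np.random.default_rng()
    scale = sensitivity / epsilon  # smaller epsilon -> larger noise
    return float(true_value + rng.laplace(loc=0.0, scale=scale))


# Example: a COUNT query has sensitivity 1, because adding or removing
# one patient changes the count by at most 1.
noisy_count = laplace_mechanism(true_value=128, sensitivity=1.0, epsilon=0.5)
```

Each call draws fresh noise, so repeating the same query gives a different answer every time. That is why every call also has to be charged against a privacy budget.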

We use PostgreSQL to track each user's privacy budget, together with custom logic that monitors "Epsilon Consumption" per session so that no researcher exceeds their privacy quota.
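Per-session budget tracking can be sketched as follows. To keep the example self-contained, an in-memory dictionary stands in for the PostgreSQL table. The class and method names (`EpsilonBudgetTracker`, `charge`, `remaining`) are hypothetical. The sketch relies on basic sequential composition, under which the epsilons of successive queries add up.

```python
class BudgetExceededError(Exception):
    """Raised when a query would push a session past its privacy quota."""


class EpsilonBudgetTracker:
    """In-memory stand-in for a per-session epsilon ledger (illustrative only)."""

    def __init__(self, total_budget):
        if total_budget <= 0:
            raise ValueError("total_budget must be positive")
        self.total_budget = total_budget
        self._spent = {}  # session_id -> epsilon consumed so far

    def remaining(self, session_id):
        return self.total_budget - self._spent.get(session_id, 0.0)

    def charge(self, session_id, epsilon):
        """Record epsilon for one query, or refuse if the quota would be exceeded."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        spent = self._spent.get(session_id, 0.0)
        # Small tolerance so floating-point rounding does not reject an exact fit.
        if spent + epsilon > self.total_budget + 1e-12:
            raise BudgetExceededError(
                f"session {session_id!r} has only {self.remaining(session_id):.4f} epsilon left"
            )
        self._spent[session_id] = spent + epsilon
        return self.remaining(session_id)


tracker = EpsilonBudgetTracker(total_budget=1.0)
tracker.charge("session-a", 0.4)  # allowed, 0.6 left for this session
```

Charging the budget before running the query means a rejected request never touches the sensitive data.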

# Challenges we ran into

The primary challenge was the "Accuracy Paradox": calibrating the noise so that it is strong enough to hide individuals but light enough to keep the research valid. We also faced significant environment configuration hurdles while integrating privacy libraries, which led us to develop custom, lightweight NumPy-based DP functions so that the system remains portable and production-ready.
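The trade-off can be made concrete with a short worked example. For Laplace noise with scale $b = \frac{\Delta f}{\epsilon}$, the expected absolute error is exactly the scale:

$$\mathbb{E}\left[\,|f_{\text{noisy}}(x) - f(x)|\,\right] = b = \frac{\Delta f}{\epsilon}$$

Taking a counting query with $\Delta f = 1$ as an illustrative case (the values of $\epsilon$ below are examples, not the project's settings):

$$\epsilon = 0.1 \Rightarrow \text{expected error} = 10, \qquad \epsilon = 1 \Rightarrow \text{expected error} = 1$$

An error of 10 is negligible for a cohort of 10,000 patients but swamps a cohort of 20, so the same $\epsilon$ can be acceptable for one query and unusable for another.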

# Accomplishments that we're proud of

We are incredibly proud of our Real-time Linkage Attack Detection. Successfully creating a middleware that can distinguish between a legitimate aggregate query and a malicious attempt to isolate a single patient record was a major milestone. Furthermore, achieving a 95%+ accuracy rate on noisy data while maintaining a high safety score is a testament to our optimized algorithm.
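One simple way to separate aggregate queries from attempts to isolate a single record is sketched below. It is not the project's actual detector, whose logic is not described here. It is an assumed heuristic that refuses cohorts smaller than a threshold and cohorts that differ from an earlier query in the same session by only a few records. The names `QueryGuard`, `check`, and `min_cohort_size` are hypothetical.

```python
class QueryGuard:
    """Illustrative guard against queries that single out individuals."""

    def __init__(self, min_cohort_size=5):
        self.min_cohort_size = min_cohort_size
        self._history = {}  # session_id -> list of previously allowed cohorts

    def check(self, session_id, row_ids):
        """Return 'allowed' or a 'blocked: ...' reason for the matched row ids."""
        cohort = frozenset(row_ids)
        if len(cohort) < self.min_cohort_size:
            return "blocked: cohort too small"
        for previous in self._history.get(session_id, []):
            # Records present in exactly one of the two cohorts.
            difference = cohort ^ previous
            if 0 < len(difference) < self.min_cohort_size:
                return "blocked: differs from an earlier query by too few records"
        self._history.setdefault(session_id, []).append(cohort)
        return "allowed"


guard = QueryGuard(min_cohort_size=3)
first = guard.check("session-a", range(10))   # broad aggregate
second = guard.check("session-a", range(9))   # same cohort minus one patient
```

In this sketch the second query is refused, because subtracting its result from the first would reveal the value of a single patient.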

# What we learned

Building SafeQuery DP taught us that Privacy is a mathematical guarantee, not just a policy. We learned the intricacies of Epsilon budgeting and how to transform abstract data privacy theories into a functional Web UI. We also realized the power of Synthetic Data in enabling safe, open innovation in the medical field.

# What's next for SafeQuery DP

The next phase for SafeQuery DP involves expanding beyond the Laplace mechanism to include Exponential Mechanisms for non-numeric data. We also plan to implement Federated Learning support, allowing the middleware to protect data across multiple hospitals simultaneously without the data ever leaving their local servers. Our goal is to make SafeQuery DP the industry standard for ethical data sharing.
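For reference, the standard Exponential Mechanism mentioned above can be sketched as follows. It picks one candidate from a set with probability proportional to $\exp\left(\frac{\epsilon \cdot u}{2 \Delta u}\right)$, where $u$ is a utility score and $\Delta u$ its sensitivity. The function name, parameters, and diagnosis counts are illustrative assumptions, not part of the current system.

```python
import numpy as np


def exponential_mechanism(candidates, utility, sensitivity, epsilon, rng=None):
    """Select one candidate, favouring high utility (hypothetical helper)."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng if rng is not None else np.random.default_rng()
    scores = np.array([utility(c) for c in candidates], dtype=float)
    # Subtracting the maximum score leaves the probabilities unchanged
    # but prevents overflow in exp().
    logits = epsilon * (scores - scores.max()) / (2.0 * sensitivity)
    probabilities = np.exp(logits)
    probabilities /= probabilities.sum()
    index = rng.choice(len(candidates), p=probabilities)
    return candidates[index]


# Made-up example: privately report the most common diagnosis.
diagnosis_counts = {"flu": 50, "asthma": 5, "gout": 1}
choice = exponential_mechanism(
    candidates=list(diagnosis_counts),
    utility=lambda name: diagnosis_counts[name],
    sensitivity=1.0,  # one patient changes any count by at most 1
    epsilon=2.0,
)
```

Unlike the Laplace mechanism, the output is always one of the original candidates, which is what makes it suitable for non-numeric answers.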
