Inspiration

A new class of cyber attacks is emerging in which inputs that look completely normal, such as everyday audio, secretly carry malicious, structured perturbations. These signals may include frequencies beyond the range of human hearing, so they sound like harmless noise to us while still influencing how downstream systems process them.

As quantum computing enters the Noisy Intermediate-Scale Quantum (NISQ) era, this becomes even more critical. Most systems still assume noise is just a hardware limitation. But in reality, noise can now be intentionally crafted and injected, turning it into a stealth attack vector that mimics natural disturbances while subtly manipulating outcomes.

The challenge is that existing defenses, including Quantum Error Correction, are designed to fix errors, not determine whether those errors are natural or malicious. This creates a dangerous blind spot where systems may continue operating on compromised inputs without any indication of attack.

To make this practical beyond a demo upload flow, we also built an MVP continuous monitoring pipeline: Quanterp monitors a Gmail inbox for new audio attachments, automatically runs the full detection algorithm on each incoming file, and triggers alert/report actions when malicious patterns are detected.

We built Quanterp to close that gap.

Quanterp doesn't just ask: Is this signal noisy?

It asks: What kind of noise is this, and does it look natural or adversarial?


What It Does

Quanterp is a hybrid quantum-classical audio security system that detects suspicious noise and malicious manipulation in audio files.

It classifies audio as:

  • Clean audio
  • 🌫️ Natural noise (e.g., microphone hiss, environmental distortion)
  • 🚨 Malicious audio manipulation, including:
    • Bit flip-style distortion
    • Phase flip-style disruption
    • Ultrasonic injection
    • Depolarizing mixed attack

Quanterp provides:

  • Real-time audio upload and analysis
  • Controlled attack generation/injection for testing
  • Chunk-level classical feature extraction
  • Quantum feature encoding with Qiskit ZZFeatureMap
  • Aer simulation across bit-flip, phase-flip, and depolarizing channels
  • Jensen-Shannon divergence–based quantum distribution comparison
  • Hybrid fusion of quantum fingerprints + classical signal evidence
  • Final output: label, attack type, confidence, and chunk evidence
  • Automated alerts and incident-style reports
  • Continuous Gmail attachment scanning for suspicious audio

How We Built It

We built Quanterp as a hybrid detection pipeline using:

Python · Streamlit · Qiskit · Qiskit Aer · NumPy · SciPy · Pandas · Matplotlib

The pipeline runs in six stages:

Stage 1 — Audio Ingestion and Normalization

Audio is converted to mono, resampled to a consistent rate, normalized in amplitude, and split into chunks across the full file duration — not just the beginning. This matters because some attacks (e.g., ultrasonic injection) may be hidden in the middle of a recording.

$$x(t) \rightarrow \{x_1, x_2, \dots, x_n\}$$
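
The normalization and chunking step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `split_into_chunks`, the 1-second chunk length, peak normalization, and zero-padding of the final chunk are all assumptions, and mono conversion and resampling are taken to have happened already.

```python
import numpy as np

def split_into_chunks(samples, sample_rate, chunk_seconds=1.0):
    """Peak-normalize a mono signal and split it into equal-length chunks.

    Hypothetical helper: chunk length and padding policy are assumptions.
    """
    x = np.asarray(samples, dtype=np.float64)
    peak = np.max(np.abs(x)) if x.size else 0.0
    if peak > 0:
        x = x / peak  # scale so the loudest sample has magnitude 1
    chunk_len = int(round(chunk_seconds * sample_rate))
    n_chunks = int(np.ceil(x.size / chunk_len))
    # Zero-pad the tail so the whole file is covered, not just full chunks.
    padded = np.zeros(n_chunks * chunk_len)
    padded[: x.size] = x
    return padded.reshape(n_chunks, chunk_len)

# A 2.5 s, 440 Hz tone at 16 kHz becomes three 1 s chunks.
sr = 16000
t = np.arange(int(2.5 * sr)) / sr
chunks = split_into_chunks(0.3 * np.sin(2 * np.pi * 440 * t), sr)
print(chunks.shape)  # (3, 16000)
```

Covering the tail with a padded chunk keeps the end of the recording in scope, which matches the goal of sampling the full file duration.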

Stage 2 — Classical Feature Extraction

For each chunk $x_i$, we extract a feature vector:

$$\mathbf{f}_i = [f_{rms},\ f_{centroid},\ f_{high},\ f_{flatness},\ f_{polarity},\ f_{discontinuity}]$$

| Feature | What It Captures |
| --- | --- |
| RMS energy | Signal loudness |
| Spectral centroid | Frequency center of mass |
| High-frequency ratio | Ultrasonic upper-band energy |
| Spectral flatness | Broadband / noise-like behavior |
| Phase instability | Irregular phase behavior (phase flip) |
| Discontinuity score | Sharp sample-level spikes (bit flip) |
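
As a rough illustration of these features, the sketch below computes five of them with NumPy. The exact definitions used in Quanterp are not given here, so everything in it is an assumption: the function name `chunk_features`, the 18 kHz cutoff for the high-frequency ratio, and the choice of "largest jump between adjacent samples" as the discontinuity score. The phase/polarity feature is left out because its definition is not specified.

```python
import numpy as np

def chunk_features(chunk, sample_rate, high_cutoff_hz=18000.0):
    """Return a dict of per-chunk features (chunk needs at least 2 samples).

    All definitions are illustrative assumptions, not the project's own.
    """
    x = np.asarray(chunk, dtype=np.float64)
    power = np.abs(np.fft.rfft(x)) ** 2            # power spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sample_rate)
    eps = 1e-12                                    # avoids log(0) and 0/0
    total = power.sum() + eps
    return {
        "rms": float(np.sqrt(np.mean(x ** 2))),
        # Power-weighted mean frequency.
        "centroid": float(np.sum(freqs * power) / total),
        # Share of spectral energy at or above the cutoff.
        "high_ratio": float(power[freqs >= high_cutoff_hz].sum() / total),
        # Geometric mean over arithmetic mean of the power spectrum.
        "flatness": float(np.exp(np.mean(np.log(power + eps)))
                          / (np.mean(power) + eps)),
        # Largest jump between adjacent samples.
        "discontinuity": float(np.max(np.abs(np.diff(x)))),
    }

# Compare a clean tone with the same tone plus a quiet 20 kHz component.
sr = 48000
t = np.arange(sr) / sr
clean = 0.5 * np.sin(2 * np.pi * 440 * t)
injected = clean + 0.1 * np.sin(2 * np.pi * 20000 * t)
f_clean = chunk_features(clean, sr)
f_injected = chunk_features(injected, sr)
print(f_clean["high_ratio"] < f_injected["high_ratio"])  # True
```

In this toy case the added 20 kHz tone carries only a few percent of the total energy, yet it clearly separates the two chunks on the high-frequency ratio.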

Stage 3 — Quantum Feature Encoding

The feature vector is encoded into a quantum circuit using a Qiskit ZZFeatureMap:

$$|\psi_i\rangle = U_{\phi}(\mathbf{f}_i)|0\rangle^{\otimes q}$$

Each audio chunk becomes a quantum state whose structure depends on the audio signal. Entangling gates in the feature map allow subtle cross-feature interactions to emerge in a higher-dimensional quantum state space.

Stage 4 — Quantum Noise-Channel Simulation

The same circuit is run under four conditions via Qiskit Aer:

$$P_{ideal}^{(i)},\quad P_{bit}^{(i)},\quad P_{phase}^{(i)},\quad P_{depol}^{(i)}$$

Each run yields a measurement distribution — a quantum fingerprint for that audio chunk.

Stage 5 — Jensen-Shannon Divergence

We compare each noisy distribution against the ideal:

$$JSD_{bit}^{(i)} = JSD\left(P_{ideal}^{(i)},\ P_{bit}^{(i)}\right)$$

$$S_i = \max\left(JSD_{bit}^{(i)},\ JSD_{phase}^{(i)},\ JSD_{depol}^{(i)}\right)$$

The quantum shift $S_i$ is the largest of the three divergences, so it measures how strongly the encoded audio chunk responds to the most disruptive noise model.
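
A minimal sketch of this comparison is shown below, assuming the measurement results arrive as bitstring-to-count dictionaries. The helper names `to_distribution` and `jsd` are hypothetical, and the counts are made-up illustrative numbers rather than Quanterp results.

```python
import numpy as np

def to_distribution(counts, num_qubits):
    """Convert a counts dict such as {'010': 37} into a probability vector."""
    probs = np.zeros(2 ** num_qubits)
    for bitstring, n in counts.items():
        probs[int(bitstring, 2)] = n
    return probs / probs.sum()

def jsd(p, q):
    """Jensen-Shannon divergence with base-2 logs, so 0 <= JSD <= 1."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        mask = a > 0   # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Illustrative (invented) counts for one 2-qubit chunk.
ideal = to_distribution({"00": 900, "11": 100}, 2)
noisy = {
    "bit": to_distribution({"00": 700, "01": 150, "10": 100, "11": 50}, 2),
    "phase": to_distribution({"00": 880, "11": 120}, 2),
    "depol": to_distribution({"00": 600, "01": 130, "10": 130, "11": 140}, 2),
}
divergences = {name: jsd(ideal, dist) for name, dist in noisy.items()}
S = max(divergences.values())  # quantum shift for this chunk
print(divergences, S)
```

Note that SciPy's `scipy.spatial.distance.jensenshannon` returns the Jensen-Shannon distance, which is the square root of the divergence, so its output needs squaring to match this definition.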

Stage 6 — Hybrid Classification

We fuse quantum evidence with classical audio evidence:

$$Score_{bit} = \alpha_1 \cdot \text{discontinuity} + \alpha_2 \cdot \text{flatness} + \alpha_3 \cdot JSD_{bit}$$

$$Score_{phase} = \beta_1 \cdot \text{phase\_instability} + \beta_2 \cdot \text{polarity\_flip} + \beta_3 \cdot JSD_{phase}$$

$$Score_{ultrasonic} = \gamma_1 \cdot \text{high\_frequency\_ratio} + \gamma_2 \cdot \text{quantum\_shift}$$

$$Score_{depol} = \delta_1 \cdot \text{flatness} + \delta_2 \cdot \text{mixed\_carrier} + \delta_3 \cdot JSD_{depol}$$

The full detection input for each chunk is:

$$\mathbf{z}_i = [\mathbf{f}_i,\ \mathbf{q}_i]$$

where

$$\mathbf{q}_i = [JSD_{bit}^{(i)},\ JSD_{phase}^{(i)},\ JSD_{depol}^{(i)},\ S_i]$$
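
The four weighted scores can be sketched as a small lookup of weights applied to the fused vector. The weight values, the dictionary layout, and the assumption that every input is scaled to [0, 1] are placeholders chosen for illustration; the actual coefficients and the rule that separates clean audio from natural noise are not specified here.

```python
# Hypothetical weights standing in for the alpha, beta, gamma, delta terms.
WEIGHTS = {
    "bit": {"discontinuity": 0.4, "flatness": 0.2, "jsd_bit": 0.4},
    "phase": {"phase_instability": 0.4, "polarity_flip": 0.2, "jsd_phase": 0.4},
    "ultrasonic": {"high_frequency_ratio": 0.6, "quantum_shift": 0.4},
    "depol": {"flatness": 0.3, "mixed_carrier": 0.3, "jsd_depol": 0.4},
}

def attack_scores(z):
    """Compute one weighted score per attack type from the fused vector z."""
    return {
        attack: sum(weight * z[name] for name, weight in terms.items())
        for attack, terms in WEIGHTS.items()
    }

# Invented example chunk: sharp sample-level spikes, little ultrasonic energy.
z = {
    "discontinuity": 0.9, "flatness": 0.5,
    "phase_instability": 0.1, "polarity_flip": 0.0,
    "high_frequency_ratio": 0.05, "mixed_carrier": 0.2,
    "jsd_bit": 0.3, "jsd_phase": 0.1, "jsd_depol": 0.2,
    "quantum_shift": 0.3,
}
scores = attack_scores(z)
best = max(scores, key=scores.get)
print(best)  # bit
```

Keeping the weights in one table makes it easy to tune how much each attack score leans on quantum evidence versus classical evidence.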


Why Traditional Quantum Error Correction Is Not Enough

Traditional QEC is designed to preserve quantum information:

$$|\psi\rangle \rightarrow \mathcal{E}(|\psi\rangle) \rightarrow |\psi\rangle$$

It asks: Did a qubit experience an error, and how do we fix it?

Quanterp asks a fundamentally different question:

Was this disturbance natural or adversarial — and what type of attack was it?

Traditional QEC may treat a malicious bit flip and natural hiss identically, both as noise to be corrected. Quanterp instead treats noise as evidence and classifies its origin and intent.

Traditional QEC:      detect → repair
Quanterp:             detect → classify → explain → alert

Challenges We Ran Into

  • Separating bit flip from natural hiss — both appear broadband in spectrograms; required discontinuity scoring as the tiebreaker
  • Preventing depolarizing attacks from being confused with ultrasonic injection — both have elevated high-frequency signatures
  • Stopping phase flip from being mislabeled as bit flip — required dedicated phase instability features
  • Sampling audio across the full file rather than only the beginning — attacks injected mid-file would be missed otherwise
  • Mapping classical audio features into meaningful quantum states
  • Designing divergence thresholds that generalized across both generated and uploaded audio
  • Balancing false positive rates on realistic audio inputs
  • Fusing quantum and classical evidence without over-relying on either

Accomplishments We're Proud Of

  • Built a fully working quantum-classical hybrid audio security pipeline
  • Successfully differentiated BitFlip.wav → Malicious · Bit Flip from naturalHiss.wav → Natural · Natural
  • Encoded audio features into Qiskit ZZFeatureMap quantum circuits
  • Simulated ideal, bit-flip, phase-flip, and depolarizing quantum noise channels
  • Compared quantum measurement distributions using Jensen-Shannon divergence
  • Classified all four attack types plus clean audio and natural noise
  • Built full-file chunk sampling so mid-file attacks are detected
  • Added confidence scoring, chunk-level evidence tables, and incident-style reporting
  • Extended detection to email attachment scanning

What We Learned

  • Noise is not always random — in security contexts, noise can carry intent
  • Spectrograms alone are insufficient for attack classification
  • Quantum feature maps can amplify subtle differences between structured signal patterns
  • Quantum simulation is most powerful when fused with classical signal evidence
  • Traditional quantum error correction is not the same as adversarial intent detection
  • Hybrid systems separate edge cases better than either approach alone
  • Real-world audio requires aggregate reasoning across the full file, not only per-chunk decisions

What's Next for Quanterp

  • Train a dedicated ML model on fused classical + quantum feature vectors
  • Expand attack coverage: replay attacks, spoofing, compression artifacts, adversarial audio injection
  • Test the feature-map pipeline on real quantum hardware
  • Add real-time streaming detection for live audio monitoring
  • Improve explainability of quantum response fingerprints
  • Benchmark against classical signal-processing and ML baselines
  • Extend beyond audio to network packets, IoT sensor streams, and communication signals
  • Build a production-ready monitoring dashboard for security teams

Built With

  • docker
  • ffmpeg/pydub
  • gmail-smtp/imap
  • google-cloud-run
  • jensen-shannon-divergence
  • matplotlib
  • numpy
  • pandas
  • python
  • qiskit
  • qiskit-aer
  • quantum-noise-channel-simulation
  • scipy
  • streamlit
  • zzfeaturemap