QWave

Inspiration

Today, a new class of cyber attacks is emerging where inputs that look completely normal, like everyday audio, can secretly carry malicious, structured perturbations. These signals may even include inaudible frequencies beyond human perception, making them indistinguishable from harmless noise to us, while still influencing how downstream systems process them.

As quantum computing enters the Noisy Intermediate-Scale Quantum (NISQ) era, this becomes even more critical. Most systems still assume noise is just a hardware limitation. But in reality, noise can now be intentionally crafted and injected, turning it into a stealth attack vector that mimics natural disturbances while subtly manipulating outcomes.

The challenge is that existing defenses, including Quantum Error Correction, are designed to fix errors, not determine whether those errors are natural or malicious. This creates a dangerous blind spot where systems may continue operating on compromised inputs without any indication of attack.

To make this practical beyond a demo upload flow, we also built an MVP continuous monitoring pipeline: Quanterp monitors a Gmail inbox for new audio attachments, automatically runs the full detection algorithm on each incoming file, and triggers alert/report actions when malicious patterns are detected.

We built Quanterp to close that gap.

Quanterp doesn't just ask: Is this signal noisy?

It asks: What kind of noise is this, and does it look natural or adversarial?

What It Does

Quanterp is a hybrid quantum-classical audio security system that detects suspicious noise and malicious manipulation in audio files.

It classifies audio as:

✅ Clean audio
🌫️ Natural noise (e.g., microphone hiss, environmental distortion)
🚨 Malicious audio manipulation, including:
- Bit flip-style distortion
- Phase flip-style disruption
- Ultrasonic injection
- Depolarizing mixed attack

Quanterp provides:

Real-time audio upload and analysis
Controlled attack generation/injection for testing
Chunk-level classical feature extraction
Quantum feature encoding with Qiskit ZZFeatureMap
Aer simulation across bit-flip, phase-flip, and depolarizing channels
Jensen-Shannon divergence–based quantum distribution comparison
Hybrid fusion of quantum fingerprints + classical signal evidence
Final output: label, attack type, confidence, and chunk evidence
Automated alerts and incident-style reports
Continuous Gmail attachment scanning for suspicious audio

How We Built It

We built Quanterp as a hybrid detection pipeline using:

Python · Streamlit · Qiskit · Qiskit Aer · NumPy · SciPy · Pandas · Matplotlib

The pipeline runs in six stages:

Stage 1 — Audio Ingestion and Normalization

Audio is converted to mono, resampled to a consistent rate, normalized in amplitude, and split into chunks across the full file duration — not just the beginning. This matters because some attacks (e.g., ultrasonic injection) may be hidden in the middle of a recording.

$$x(t) \rightarrow \{x_1, x_2, \dots, x_n\}$$
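As a minimal sketch of this stage, the snippet below converts a sample array to mono, peak-normalizes it, and splits it into fixed-length chunks that cover the whole file. The function name `normalize_and_chunk`, the `(n_samples, n_channels)` input layout, and the zero-padding of the last chunk are assumptions for illustration. Resampling is left out here and would need a separate step.

```python
import numpy as np

def normalize_and_chunk(samples, chunk_len):
    """Return an array of shape (n_chunks, chunk_len) covering the full signal."""
    x = np.asarray(samples, dtype=np.float64)
    if x.ndim == 2:
        # Assumed layout: (n_samples, n_channels). Average channels to get mono.
        x = x.mean(axis=1)
    peak = np.max(np.abs(x)) if x.size else 0.0
    if peak > 0:
        x = x / peak  # peak-normalize to the range [-1, 1]
    n_chunks = int(np.ceil(x.size / chunk_len))
    # Zero-pad the tail so the final partial chunk is kept rather than dropped.
    padded = np.zeros(n_chunks * chunk_len)
    padded[: x.size] = x
    return padded.reshape(n_chunks, chunk_len)
```

Keeping the padded final chunk means an attack placed at the very end of a recording still lands in some chunk.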

Stage 2 — Classical Feature Extraction

For each chunk $x_i$, we extract a feature vector:

$$\mathbf{f}_i = [f_{rms},\ f_{centroid},\ f_{high},\ f_{flatness},\ f_{polarity},\ f_{discontinuity}]$$

| Feature | What It Captures |
| --- | --- |
| RMS energy | Signal loudness |
| Spectral centroid | Frequency center of mass |
| High-frequency ratio | Ultrasonic upper-band energy |
| Spectral flatness | Broadband / noise-like behavior |
| Phase instability | Irregular phase behavior (phase flip) |
| Discontinuity score | Sharp sample-level spikes (bit flip) |
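The features in the table can be sketched with NumPy as follows. This is an illustrative version using common textbook definitions, not necessarily the exact formulas used in the project. The function name `chunk_features` and the `high_cutoff_hz` parameter are assumptions, and the phase/polarity feature is omitted because its definition is not given here.

```python
import numpy as np

def chunk_features(chunk, sample_rate, high_cutoff_hz):
    """Compute five per-chunk features from a mono chunk of samples."""
    x = np.asarray(chunk, dtype=np.float64)
    power = np.abs(np.fft.rfft(x)) ** 2               # power spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sample_rate)
    eps = 1e-12                                       # guards against division by zero and log(0)
    total = power.sum() + eps
    return {
        "rms": float(np.sqrt(np.mean(x ** 2))),
        # Power-weighted mean frequency.
        "centroid": float((freqs * power).sum() / total),
        # Share of spectral power at or above the cutoff.
        "high_ratio": float(power[freqs >= high_cutoff_hz].sum() / total),
        # Geometric mean over arithmetic mean of the power spectrum.
        "flatness": float(np.exp(np.mean(np.log(power + eps))) / (np.mean(power) + eps)),
        # Largest jump between neighboring samples.
        "discontinuity": float(np.max(np.abs(np.diff(x)))) if x.size > 1 else 0.0,
    }
```

A pure tone gives a flatness near 0 and a centroid at the tone frequency, while a single injected sample spike raises the discontinuity score without changing the centroid much.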

Stage 3 — Quantum Feature Encoding

The feature vector is encoded into a quantum circuit using a Qiskit ZZFeatureMap:

$$|\psi_i\rangle = U_{\phi}(\mathbf{f}_i)|0\rangle^{\otimes q}$$

Each audio chunk becomes a quantum state whose structure depends on the audio signal. Entangling gates in the feature map allow subtle cross-feature interactions to emerge in a higher-dimensional quantum state space.
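For reference, the general form of a ZZ feature map is written out below, assuming Qiskit's default data mapping and ignoring implementation-specific sign and global-phase conventions. The repetition count $r$ is a symbol introduced here for illustration (Qiskit's default is 2 repetitions). The map uses one qubit per feature, and features are typically rescaled to an angular range before encoding.

$$U_{\phi}(\mathbf{f}_i) = \left(U_{\Phi}(\mathbf{f}_i)\, H^{\otimes q}\right)^{r}, \qquad U_{\Phi}(\mathbf{f}_i) = \exp\left(i \sum_{j} \phi_j\, Z_j + i \sum_{j<k} \phi_{jk}\, Z_j Z_k\right)$$

$$\phi_j = f_{i,j}, \qquad \phi_{jk} = (\pi - f_{i,j})(\pi - f_{i,k})$$

The pairwise terms $\phi_{jk}$ are where the cross-feature interactions come from: each one depends on two features at once and is applied through an entangling $Z_j Z_k$ rotation.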

Stage 4 — Quantum Noise-Channel Simulation

The same circuit is run under four conditions via Qiskit Aer, one ideal (noiseless) run and one run each under bit-flip, phase-flip, and depolarizing noise:

$$P_{ideal}^{(i)},\quad P_{bit}^{(i)},\quad P_{phase}^{(i)},\quad P_{depol}^{(i)}$$

Each run yields a measurement distribution — a quantum fingerprint for that audio chunk.
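For context, the standard single-qubit forms of the three noise channels are given below. The error probability $p$ is a placeholder, since no values are stated here, and simulator libraries may parameterize the depolarizing channel with a differently scaled parameter.

$$\mathcal{E}_{bit}(\rho) = (1-p)\,\rho + p\, X \rho X$$

$$\mathcal{E}_{phase}(\rho) = (1-p)\,\rho + p\, Z \rho Z$$

$$\mathcal{E}_{depol}(\rho) = (1-p)\,\rho + \frac{p}{3}\left(X \rho X + Y \rho Y + Z \rho Z\right)$$

Bit flip applies a random $X$, phase flip a random $Z$, and depolarizing noise an equal mix of all three Pauli errors, which is why the three runs can shift the measurement distribution in different ways.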

Stage 5 — Jensen-Shannon Divergence

We compare each noisy distribution against the ideal:

$$JSD_{bit}^{(i)} = JSD\left(P_{ideal}^{(i)},\ P_{bit}^{(i)}\right)$$

$$S_i = \max\left(JSD_{bit}^{(i)},\ JSD_{phase}^{(i)},\ JSD_{depol}^{(i)}\right)$$

The quantum shift $S_i$ is the largest of the three divergences, so it measures how strongly the encoded chunk's measurement distribution responds to the most disruptive noise model.
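The comparison in this stage can be sketched in plain Python, working directly on measurement-count dictionaries of the kind a simulator returns. The function names `to_probs`, `jsd`, and `quantum_shift` are hypothetical. The sketch uses base-2 logarithms, so the divergence lies between 0 (identical distributions) and 1 (no shared outcomes), and it assumes each counts dictionary is non-empty.

```python
import math

def to_probs(counts):
    """Turn raw measurement counts into a probability distribution."""
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()}

def jsd(counts_p, counts_q):
    """Jensen-Shannon divergence (base 2) between two count dictionaries."""
    p, q = to_probs(counts_p), to_probs(counts_q)
    result = 0.0
    for outcome in set(p) | set(q):
        pk, qk = p.get(outcome, 0.0), q.get(outcome, 0.0)
        mk = 0.5 * (pk + qk)  # mixture distribution M = (P + Q) / 2
        if pk > 0:
            result += 0.5 * pk * math.log2(pk / mk)
        if qk > 0:
            result += 0.5 * qk * math.log2(qk / mk)
    return result

def quantum_shift(ideal_counts, noisy_counts_by_channel):
    """Return per-channel divergences and their maximum (the quantum shift)."""
    divergences = {
        name: jsd(ideal_counts, counts)
        for name, counts in noisy_counts_by_channel.items()
    }
    return divergences, max(divergences.values())
```

If SciPy is used instead, note that `scipy.spatial.distance.jensenshannon` returns the Jensen-Shannon distance, which is the square root of the divergence.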

Stage 6 — Hybrid Classification

We fuse quantum evidence with classical audio evidence:

$$\text{Score}_{bit} = \alpha_1 \cdot \text{discontinuity} + \alpha_2 \cdot \text{flatness} + \alpha_3 \cdot JSD_{bit}$$

$$\text{Score}_{phase} = \beta_1 \cdot \text{phase instability} + \beta_2 \cdot \text{polarity flip} + \beta_3 \cdot JSD_{phase}$$

$$\text{Score}_{ultrasonic} = \gamma_1 \cdot \text{high-frequency ratio} + \gamma_2 \cdot \text{quantum shift}$$

$$\text{Score}_{depol} = \delta_1 \cdot \text{flatness} + \delta_2 \cdot \text{mixed carrier} + \delta_3 \cdot JSD_{depol}$$

The full detection input for each chunk is:

$$\mathbf{z}_i = [\mathbf{f}_i,\ \mathbf{q}_i]$$

where

$$\mathbf{q}_i = \left[JSD_{bit}^{(i)},\ JSD_{phase}^{(i)},\ JSD_{depol}^{(i)},\ S_i\right]$$
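The four score formulas are weighted sums, so they could be sketched as below. The weight values in `WEIGHTS` are placeholders chosen for illustration, not the project's tuned coefficients. The evidence key names and the rule of picking the highest-scoring attack type are also assumptions, and a real classifier would still need thresholds to separate clean audio and natural noise.

```python
# Placeholder weights (alpha, beta, gamma, delta). Real values would be tuned.
WEIGHTS = {
    "bit": {"discontinuity": 0.5, "flatness": 0.2, "jsd_bit": 0.3},
    "phase": {"phase_instability": 0.4, "polarity_flip": 0.3, "jsd_phase": 0.3},
    "ultrasonic": {"high_frequency_ratio": 0.7, "quantum_shift": 0.3},
    "depol": {"flatness": 0.3, "mixed_carrier": 0.3, "jsd_depol": 0.4},
}

def attack_scores(evidence, weights=WEIGHTS):
    """Weighted sum of classical and quantum evidence for each attack type."""
    return {
        attack: sum(w * evidence.get(name, 0.0) for name, w in terms.items())
        for attack, terms in weights.items()
    }

def top_attack(evidence, weights=WEIGHTS):
    """Return the attack type with the highest score, and that score."""
    scores = attack_scores(evidence, weights)
    best = max(scores, key=scores.get)
    return best, scores[best]

# Example: strong upper-band energy with a modest quantum shift.
evidence = {"high_frequency_ratio": 0.9, "quantum_shift": 0.2, "flatness": 0.1}
label, score = top_attack(evidence)
```

Missing evidence keys default to 0, so a chunk with only classical features still gets a score for every attack type.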

Why Traditional Quantum Error Correction Is Not Enough

Traditional QEC is designed to preserve quantum information:

$$|\psi\rangle \rightarrow \mathcal{E}(|\psi\rangle) \rightarrow |\psi\rangle$$

It asks: Did a qubit experience an error, and how do we fix it?

Quanterp asks a fundamentally different question:

Was this disturbance natural or adversarial — and what type of attack was it?

Traditional QEC may treat a malicious bit flip and natural hiss identically, both as noise to be corrected. Quanterp instead treats noise as evidence and classifies its origin and intent.

Traditional QEC:      detect → repair
Quanterp:             detect → classify → explain → alert

Challenges We Ran Into

Separating bit flip from natural hiss — both appear broadband in spectrograms; required discontinuity scoring as the tiebreaker
Preventing depolarizing attacks from being confused with ultrasonic injection — both have elevated high-frequency signatures
Stopping phase flip from being mislabeled as bit flip — required dedicated phase instability features
Sampling audio across the full file rather than only the beginning — attacks injected mid-file would be missed otherwise
Mapping classical audio features into meaningful quantum states
Designing divergence thresholds that generalized across both generated and uploaded audio
Balancing false positive rates on realistic audio inputs
Fusing quantum and classical evidence without over-relying on either

Accomplishments We're Proud Of

Built a fully working quantum-classical hybrid audio security pipeline
Successfully differentiated BitFlip.wav → Malicious · Bit Flip from naturalHiss.wav → Natural · Natural
Encoded audio features into Qiskit ZZFeatureMap quantum circuits
Simulated ideal, bit-flip, phase-flip, and depolarizing quantum noise channels
Compared quantum measurement distributions using Jensen-Shannon divergence
Classified all four attack types plus clean audio and natural noise
Built full-file chunk sampling so mid-file attacks are detected
Added confidence scoring, chunk-level evidence tables, and incident-style reporting
Extended detection to email attachment scanning

What We Learned

Noise is not always random — in security contexts, noise can carry intent
Spectrograms alone are insufficient for attack classification
Quantum feature maps can amplify subtle differences between structured signal patterns
Quantum simulation is most powerful when fused with classical signal evidence
Traditional quantum error correction is not the same as adversarial intent detection
Hybrid systems separate edge cases better than either approach alone
Real-world audio requires aggregate reasoning across the full file, not only per-chunk decisions

What's Next for Quanterp

Train a dedicated ML model on fused classical + quantum feature vectors
Expand attack coverage: replay attacks, spoofing, compression artifacts, adversarial audio injection
Test the feature-map pipeline on real quantum hardware
Add real-time streaming detection for live audio monitoring
Improve explainability of quantum response fingerprints
Benchmark against classical signal-processing and ML baselines
Extend beyond audio to network packets, IoT sensor streams, and communication signals
Build a production-ready monitoring dashboard for security teams