Inspiration
Conservationists use infrasound (deep rumbles below the limit of human hearing) to track elephants. But in the wild, this signal is drowned out by the roar of airplanes, trucks, and generators. We built TEMBI to surgically remove the "noise of man" and let the rumbles be heard.
What it does
TEMBI is a deep-learning restoration suite. It uses specialized neural networks to isolate elephant rumbles from heavy industrial interference.
Specialist "Brains": Custom models for specific noise types including Airplane, Vehicle, and Generator.
Infrasonic Translator: Because frequencies below 20 Hz are inaudible to humans, we pitch-shift the cleaned rumbles up by three octaves for human verification.
Live API: A production-ready FastAPI backend that cleans data in real-time.
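To illustrate the translator idea: raising a signal by n octaves multiplies every frequency by 2^n, and the crudest way to achieve that is to read the samples back 2^n times faster. The helper below is a hypothetical NumPy sketch (not our production code, which would use a phase vocoder to keep the duration intact):

```python
import numpy as np

def pitch_up_octaves(signal: np.ndarray, octaves: int = 3) -> np.ndarray:
    """Naive pitch shift: keep every 2**octaves-th sample, so playback at
    the same rate sounds 2**octaves times higher (and 2**octaves shorter)."""
    factor = 2 ** octaves
    return signal[::factor]

# A 14 Hz rumble sampled at 2000 Hz becomes a 112 Hz tone -- comfortably
# inside human hearing -- after a 3-octave shift.
sr = 2000
t = np.arange(sr) / sr
rumble = np.sin(2 * np.pi * 14 * t)
audible = pitch_up_octaves(rumble, 3)
```

The duration shrinks by the same factor of 8; for verification listening that trade-off is acceptable, which is why this decimation trick is a reasonable first approximation.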
How we built it
We developed a U-Net CNN architecture in PyTorch.
Signal Processing: Audio is downsampled to 2000 Hz to target the infrasound spectrum.
Surgical Masking: The network predicts a "mask" that multiplies the noisy spectrogram, zeroing out interference while leaving the original phase intact so the elephant signal reconstructs cleanly.
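In sketch form (hypothetical helper names, not our exact code), the masking step scales each time-frequency bin's magnitude by the predicted mask while reusing the noisy phase:

```python
import numpy as np

def apply_mask(noisy_stft: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Scale each bin's magnitude by a [0, 1] mask while keeping the
    original phase, so the inverse transform stays coherent."""
    magnitude = np.abs(noisy_stft)
    phase = np.angle(noisy_stft)
    return magnitude * mask * np.exp(1j * phase)
```

A mask value of 1 keeps a bin (elephant energy), 0 silences it (engine energy); the network's entire job is learning which is which.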
Challenges we ran into
The biggest hurdle was dimensionality collapse. In early versions our spectrograms were too "thin": repeated downsampling inside the U-Net drove a spatial dimension to zero and crashed training. We solved this with a 16,000-sample temporal window, which gave the model the context it needed to distinguish a rumble from a jet engine.
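This failure mode is easy to sanity-check before training: each 2x2 max-pool in a U-Net encoder halves both spatial dimensions, so the spectrogram must be large enough in both frequency and time to survive the network's depth. A minimal checker (the depth and example dimensions are illustrative assumptions, not our exact configuration):

```python
def survives_pooling(freq_bins: int, time_frames: int, depth: int = 4) -> bool:
    """Return True if both spatial dims stay nonzero through
    `depth` rounds of 2x2 max-pooling."""
    f, t = freq_bins, time_frames
    for _ in range(depth):
        f, t = f // 2, t // 2
        if f == 0 or t == 0:
            return False
    return True
```

A "thin" spectrogram with only a handful of time frames collapses partway down the encoder, while a window long enough to yield a couple hundred frames passes comfortably.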
Accomplishments that we're proud of
We successfully moved from a one-size-fits-all model to Specialist Brains. We also learned that trusting the math is vital: when you are working with sound you cannot hear, your only eyes are the loss curves and spectrogram plots. Finally, we mastered the art of "overlap-add" reconstruction to ensure zero clicking at chunk boundaries in the final audio.
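Overlap-add itself is small: processed chunks are summed back into the output at hop-spaced offsets, and when the analysis window's shifted copies sum to a constant (e.g. a Hann window at 50% overlap), the seams carry no clicks. A minimal sketch with assumed chunking parameters:

```python
import numpy as np

def overlap_add(chunks: list, hop: int) -> np.ndarray:
    """Sum equal-length chunks into one signal at hop-spaced offsets."""
    n = len(chunks[0])
    out = np.zeros(hop * (len(chunks) - 1) + n)
    for i, chunk in enumerate(chunks):
        out[i * hop : i * hop + n] += chunk
    return out
```

The overlapping regions blend neighboring chunks instead of butting them together, which is what removes the discontinuities a naive concatenation would produce.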
What we learned
We learned how to build and train a custom U-Net architecture from scratch using PyTorch. This process taught us the core mechanics of machine learning, from managing backpropagation to optimizing loss functions. We discovered that a model is only as good as its data, which required us to master Short-Time Fourier Transforms (STFT) to turn raw audio into processable tensors.
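The STFT step the paragraph refers to can be sketched in a few lines: slice the audio into overlapping windowed frames and FFT each one. (The frame sizes here are illustrative assumptions, not our training configuration.)

```python
import numpy as np

def stft(signal: np.ndarray, n_fft: int = 256, hop: int = 64) -> np.ndarray:
    """Hann-windowed short-time Fourier transform.
    Returns a complex (n_fft // 2 + 1, n_frames) spectrogram."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T

# With these settings, a 16,000-sample window becomes a 129 x 247 tensor.
```

That 2-D complex array is what the U-Net consumes: magnitude as the image-like input, phase held aside for reconstruction.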
We also gained experience in Reconstructive Logic. We learned how to use AI-generated masks to surgically clean spectrograms without destroying the original audio phase. Most importantly, we found that specialization is key. By training focused "Specialist Brains" for different noise types, we achieved a level of clarity that a general model could not produce.
What's next for TEMBI
We are taking TEMBI from the laptop to the edge.
Mobile Rangers: Porting the models to mobile devices for real-time field use.
Triangulation: Using multiple API nodes to locate elephants from rumble arrival times.
Species Scaling: Training "Brains" for blue whales and other infrasonic giants.