Sebenzai

Inspiration

Sebenza means "work" in isiXhosa and isiZulu, two of the most widely spoken languages in South Africa. Our mission is to create 1 million jobs in Africa. Unemployment in South Africa currently stands at 28%. Most unemployed people have a smartphone, and they all have free time.

There is no isiXhosa voice-to-text model, yet many use cases, e.g. an automated medical voice doctor, would benefit greatly from voice-to-text in isiXhosa. If you get a human to listen to an hour of audio and transcribe it, you have to trust them with the potentially private content of the audio, and it will take them an hour. Instead, you can "digitally shred" the audio into many 5-second pieces and play them to separate people, so that no one person ever hears more than a 5-second slice of audio, stripped of any context. Doing this in parallel also makes it much quicker.
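The "digital shredding" step can be sketched roughly as follows. This is a minimal illustration, not our production code: it assumes the audio is already available as raw PCM bytes, and the function name `shred_pcm` and its parameters are hypothetical.

```python
from typing import List


def shred_pcm(pcm: bytes, frame_rate: int, bytes_per_frame: int,
              slice_seconds: float = 5.0) -> List[bytes]:
    """Split raw PCM audio into consecutive slices of at most `slice_seconds`.

    `bytes_per_frame` is sample width times channel count (assumed known).
    The final slice may be shorter than the others.
    """
    if frame_rate <= 0 or bytes_per_frame <= 0 or slice_seconds <= 0:
        raise ValueError("frame_rate, bytes_per_frame and slice_seconds must be positive")
    frames_per_slice = int(frame_rate * slice_seconds)
    if frames_per_slice == 0:
        raise ValueError("slice_seconds is too short for this frame rate")
    # Cut on frame boundaries so no sample is split across two slices.
    step = frames_per_slice * bytes_per_frame
    return [pcm[i:i + step] for i in range(0, len(pcm), step)]


# Example: 12 seconds of silent 8 kHz, 16-bit mono audio gives three slices.
audio = bytes(12 * 8000 * 2)
slices = shred_pcm(audio, frame_rate=8000, bytes_per_frame=2)
```

Each slice can then be sent to a different worker, so that nobody hears two neighbouring pieces of the same recording.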

Each worker takes a test on signup, from which we calculate their "skill". We cannot trust any single worker to be correct, so we play each slice to several workers and use their skills to calculate a "skill-weighted consensus" over their transcriptions to get an accurate result.
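One simple way to picture a skill-weighted consensus is a weighted vote over the transcriptions submitted for a slice. The sketch below is an assumption-laden illustration rather than our actual mechanism: the function name `skill_weighted_consensus`, the whitespace and case normalisation, and the `power` exponent are all hypothetical choices (setting `power=2` is one possible reading of a quadratic weighting).

```python
from collections import defaultdict
from typing import Iterable, Tuple


def skill_weighted_consensus(votes: Iterable[Tuple[str, float]],
                             power: float = 1.0) -> Tuple[str, float]:
    """Return the winning transcription and its share of the total weight.

    `votes` holds (transcription, worker_skill) pairs for one audio slice.
    Each vote counts with weight skill ** power (hypothetical weighting).
    """
    totals = defaultdict(float)
    for text, skill in votes:
        # Treat transcriptions that differ only in case or spacing as equal.
        key = " ".join(text.lower().split())
        totals[key] += max(skill, 0.0) ** power
    total_weight = sum(totals.values())
    if total_weight == 0:
        raise ValueError("no votes with positive skill")
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / total_weight


# One highly skilled worker can outweigh two low-skill workers who agree.
label, share = skill_weighted_consensus(
    [("molo", 0.9), ("molweni", 0.3), ("molweni", 0.4)]
)
```

In this example "molo" wins even though it received fewer votes, because its single vote carries more skill weight than the two opposing votes combined. The returned share could be used as a rough confidence score when deciding whether to play a slice to more workers.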

This skill is stored on a smart contract as a sort of verified public CV (to demonstrate literacy etc.). We pay workers in Dai stablecoin to a wallet we set up for them on registration.

What it does

See above

How we built it

Django, Python, Ruby, Solidity

Challenges we ran into

We ran out of time. Getting Dai on the testnet was difficult.

Accomplishments that we're proud of

It actually works!

What we learned

loads

What's next for Sebenzai

We have final-round Y Combinator interviews in May. We are working on every type of data labeling for machine learning, not just audio. We are on a mission to create a million jobs in Africa.

Built With

Submitted to

ETHCapeTown
- Winner ETHCapeTown Finalists
- Winner NuCypher
- Winner MakerDAO

Created by

Worked on splitting audio input data into small snippets that do not leak private data because they contain no context. Worked on the skill-weighted consensus mechanism that combines several worker annotations per snippet into an accurate label for that audio snippet. Wrote code to output a waveform graphic colour-coded by snippet. Ran Monte Carlo simulations and ended up using quadratic skill voting with payments in quadratic skill space.

Alex Conway
I built the Django web app and the async task interactions with the Ethereum-focused API built by Armand.

Gavin Wiener
Worked on the AWS infrastructure, backend services/APIs, and ERC20 contracts and integration. It was a massive learning curve, but super fun as a newcomer to the blockchain space.

Armand du Plessis
Ongeziwe Mbana