Inspiration

South Africa has a high unemployment rate, coupled with rising organized crime in the form of cash-in-transit heists, ATM bombings, looting and other socio-economic crimes. In addition, the private security sector in South Africa is larger than the public sector's police force: there are 2.7 million registered security officers compared with 150,000 police officers for a population of ~60 million.

What it does

The platform is a VSaaS tool that caters for managers, camera technicians, tactical responders and, most importantly, the CCTV operators in control rooms. It ingests CCTV feeds and runs automations on them in real time, such as streaming and recording. We use Gemini Vision to automate the detection of suspicious activity in CCTV video streams; we named this feature OperatorGPT. We also use embeddings to build a data lake of surveillance data, so that operators investigating on our platform can run vector search across all platform images within a 7-day retention period. This feature is called CropSearch.
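As a rough illustration of the idea behind CropSearch, the sketch below ranks stored image embeddings by cosine similarity to a query embedding and ignores anything older than the 7-day retention window. It is a minimal in-memory sketch, not the platform's implementation: the `CropRecord` shape, the `searchCrops` function and the `topK` parameter are all hypothetical names, and a production system would more likely query a dedicated vector store.

```typescript
// Hypothetical record shape; the platform's real schema is not described here.
interface CropRecord {
  id: string;
  embedding: number[]; // image embedding, e.g. from a multimodal embeddings API
  capturedAt: number;  // Unix epoch time in milliseconds
}

// 7-day retention window, as stated in the write-up.
const RETENTION_MS = 7 * 24 * 60 * 60 * 1000;

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // treat zero vectors as matching nothing
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the topK records inside the retention window, best match first.
function searchCrops(
  query: number[],
  records: CropRecord[],
  now: number,
  topK: number = 5
): { id: string; score: number }[] {
  return records
    .filter((r) => now - r.capturedAt <= RETENTION_MS)
    .map((r) => ({ id: r.id, score: cosineSimilarity(query, r.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Because text and images can share one embedding space, the same `searchCrops` call would work whether the query vector came from an uploaded image or from a text description, assuming both were embedded by the same model.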

How we built it

The frontend of the web platform is built using Ionic, Angular, Firebase, Firestore, RealtimeDB, Cloud Storage, Extensions, AppCheck and Google Analytics, with a Google Cloud backend. Our backend microservices are built using NodeJS, Chokidar, AppCheck, LangChainJS, FFMPEG, Firebase and various Google Cloud APIs such as Gemini and the Multimodal Embeddings API. We have a private codebase on Google Cloud Repositories, and I am happy to share it with the judging team privately.

Challenges we ran into

Converting CCTV video streams into a format suitable for the browser. Converting CCTV video streams into images that can be analyzed as a trigger. Prompting Gemini Pro Vision so that it performs consistently on various platforms. Choosing between multithreading and microservices, given the real-time nature of the app.
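The first two challenges can be sketched as two FFmpeg invocations: one that repackages an RTSP feed as HLS for browser playback, and one that samples still frames for analysis. The helpers below only build argument lists and are an illustrative sketch, not our production configuration. The function names, the segment settings and the example URL are assumptions, and copying the video stream without re-encoding assumes the camera already emits a browser-compatible codec such as H.264.

```typescript
// Builds FFmpeg arguments that repackage an RTSP camera feed as an HLS playlist.
// Assumption: the camera emits H.264, so the video is copied rather than re-encoded.
function rtspToHlsArgs(
  rtspUrl: string,
  playlistPath: string,
  segmentSeconds: number = 2
): string[] {
  return [
    "-rtsp_transport", "tcp",        // TCP tends to be more reliable than UDP for RTSP
    "-i", rtspUrl,
    "-c:v", "copy",                  // pass the video stream through untouched
    "-an",                           // drop audio
    "-f", "hls",
    "-hls_time", String(segmentSeconds),
    "-hls_list_size", "5",           // keep a short rolling playlist for live viewing
    "-hls_flags", "delete_segments", // remove segments that fall out of the playlist
    playlistPath,
  ];
}

// Builds FFmpeg arguments that sample still images from an RTSP feed.
// outputPattern is a numbered file pattern such as "frames/frame_%04d.png".
function rtspToFramesArgs(
  rtspUrl: string,
  outputPattern: string,
  framesPerSecond: number = 1
): string[] {
  return [
    "-rtsp_transport", "tcp",
    "-i", rtspUrl,
    "-vf", `fps=${framesPerSecond}`, // emit this many frames per second of video
    outputPattern,
  ];
}
```

In a Node.js service, either list could be passed to `spawn("ffmpeg", args)` from `child_process`, and a file watcher on the frames directory could then hand each new image to the analysis step.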

Accomplishments that we're proud of

Configuring microservices that can encode and decode CCTV video media using FFMPEG. Creating a consistent OperatorGPT prompt structure that allows us to use Gemini Pro Vision to mimic the role of a CCTV operator at a price that is just shy of the average CCTV operator salary in South Africa. In terms of AI development, I am really glad that we took on the challenge of learning about Gemini and LangChainJS early on. If we had not started so early, we would not be in the position we are in now, where we can bring agents like OperatorGPT and search functionality like CropSearch into a video management platform.

What we learned

Media encoding and decoding with FFMPEG. We had to master the following protocols and media types: RTSP, HLS, RTMP, PNG, MP4 and spectrogram. How vision language models such as LLaVA work in comparison to a multimodal GenAI model like Gemini, and how various forms of data can be represented in a single semantic embedding space. Working with the VertexAI embeddings API and building image-based data lakes in the form of vector stores.

What's next for Barrier CCTV Technology

We want to build the best VSaaS platform for surveillance teams, one that helps them work better and smarter with the help of web technologies, artificial intelligence, mapping intelligence and big data. When these technologies are put together correctly, the end goal is safer communities and better responses to crime.

We also want to use what we have learned to build a B2C platform at https://people.barrier.technology where users can leverage various local features, with the option to use OperatorGPT and other cloud-based automations at their home or small business. Our strategy is to launch a B2B product for the secondary market, which consists of security companies, offsite or on-site monitoring companies and public sector departments. While doing so, we will also begin to release our B2C offerings targeted at the primary market of homeowners, facility managers and so forth.
