Inspiration

Public records should be easier for non-technical citizens to search and interpret. Through conversational interfaces, LLMs can help people analyze neighborhood safety and discover systematic bias in law enforcement.

What it does

Lets users query public police stop records through conversational question answering.

How we built it

We were very interested in learning LangChain, so it was our primary framework. In particular, we used the RetrievalQA chain to query OpenAI (for generating embeddings and answering questions), a FAISS vector store for retrieval, and the DirectoryLoader for ingesting multiple text files.
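To illustrate the retrieve-then-answer flow described above, here is a minimal sketch in plain Python. It is not our project code and does not call LangChain, OpenAI, or FAISS: the bag-of-words `embed` function stands in for an embedding model, `ToyVectorStore` stands in for a vector index, and the three records are made up for illustration. All names here are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real embedding call)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory index (stand-in for a real vector store)."""

    def __init__(self, documents):
        self.documents = list(documents)
        self.vectors = [embed(d) for d in self.documents]

    def search(self, query, k=2):
        """Return the k documents most similar to the query."""
        q = embed(query)
        ranked = sorted(
            range(len(self.documents)),
            key=lambda i: cosine(q, self.vectors[i]),
            reverse=True,
        )
        return [self.documents[i] for i in ranked[:k]]


def build_prompt(question, context_docs):
    """Assemble the text a language model would be asked to answer from."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Made-up records, for illustration only.
docs = [
    "Stop 1: traffic stop on Main Street, warning issued.",
    "Stop 2: pedestrian stop near Oak Park, no action taken.",
    "Stop 3: traffic stop on Elm Avenue, citation issued.",
]
question = "Which stops happened on Main Street?"
store = ToyVectorStore(docs)
top = store.search(question, k=1)
prompt = build_prompt(question, top)
```

In a real pipeline, the prompt would be sent to a language model; the sketch stops at prompt construction so it runs without an API key.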

Challenges we ran into

OpenAI rate limiting required us to preprocess our source data in a way that minimized the total number of calls to their API.
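One possible way to cut the number of API calls is to pack many short records into fewer, larger chunks before embedding, so each request carries more text. The sketch below shows that idea under assumptions of our choosing: `pack_records` and the `max_chars` budget are hypothetical, and this is not necessarily the exact preprocessing we used.

```python
def pack_records(records, max_chars=1000, sep="\n"):
    """Greedily group records into chunks of at most max_chars characters.

    A single record longer than max_chars is kept whole in its own chunk.
    """
    chunks, current, size = [], [], 0
    for rec in records:
        # Account for the separator only when the chunk already has content.
        extra = len(rec) + (len(sep) if current else 0)
        if current and size + extra > max_chars:
            chunks.append(sep.join(current))
            current, size = [rec], len(rec)
        else:
            current.append(rec)
            size += extra
    if current:
        chunks.append(sep.join(current))
    return chunks


# Ten 40-character records packed under a 100-character budget.
records = ["x" * 40 for _ in range(10)]
chunks = pack_records(records, max_chars=100)
```

With one embedding request per chunk, the request count drops from the number of records to the number of chunks. The trade-off is coarser retrieval, since each chunk now mixes several records.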

Accomplishments that we're proud of

Getting a working end-to-end example using LangChain that simplifies publicly available government data.

What we learned

  • Rate limiting is a huge challenge, so data processing is a key step for cost/compute optimization.
  • The ability to try different language models is very helpful for experimentation.
  • Modular development of an end-to-end LLM application helps improve quality and enables more comparisons between vendors.

What's next for safe street

Add more data sources from open government data. Extend the current architecture with LangChain's map-reduce techniques. Explore systematic bias across neighborhoods by incorporating personal attributes such as gender and age.

Built With

  • chatgpt
  • langchain
  • openai
  • streamlit