Semantic Search Engine

Semantic search retrieves documents from a corpus by matching the meaning of a search query. This means that the search engine looks not only for exact text matches, but also for overlapping semantic meaning (e.g. synonyms and periphrases).
This repository contains the code for my semantic search engine, which runs user-provided queries against user-provided text or documents and can also search the internet.

How did I go about it?

I built the app with Streamlit. The text blob is split into sentences using a regex found on StackOverflow. The sentences are then encoded into tensors using the all-MiniLM-L6-v2 model from SentenceTransformers.
Cosine-similarity scores are then calculated between the user-provided query (also encoded as a tensor) and each sentence.
A Pandas dataframe stores the sentences with their scores, and the sentences with the top 20 scores are displayed.
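The pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the splitting regex is an assumption (the StackOverflow pattern used in the app is not shown here), and the function names `split_sentences` and `rank_sentences` are hypothetical. The heavy libraries are imported inside `rank_sentences` so that the splitter can be tried on its own.

```python
import re


def split_sentences(text):
    """Split a text blob into sentences.

    Assumption: a simple pattern that splits on whitespace following
    '.', '!' or '?'. The app's own regex may differ.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def rank_sentences(query, sentences, top_k=20, model_name="all-MiniLM-L6-v2"):
    """Return a DataFrame with the top_k sentences most similar to the query."""
    # Imported lazily: these need pandas and sentence-transformers installed,
    # and the model is downloaded on first use.
    import pandas as pd
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer(model_name)

    # Encode the sentences and the query as tensors.
    sentence_emb = model.encode(sentences, convert_to_tensor=True)
    query_emb = model.encode(query, convert_to_tensor=True)

    # cos_sim returns a (1, n_sentences) matrix; take its only row.
    scores = util.cos_sim(query_emb, sentence_emb)[0]

    df = pd.DataFrame({"sentence": sentences, "score": scores.cpu().tolist()})
    return df.sort_values("score", ascending=False).head(top_k).reset_index(drop=True)
```

With the dependencies installed, `rank_sentences("pets that make noise", split_sentences(text))` would return the best-matching sentences first, which could then be shown in a Streamlit table.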

Google Search Results

I used SerpApi's Google Search API to fetch the titles, links and snippets of the top results, and displayed them as a table. This was added mostly as something to look into further in the future, and it may be useful in some scenarios.
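As a rough sketch of this step, assuming the `google-search-results` Python package (which provides `serpapi.GoogleSearch`), the title, link and snippet of each organic result could be collected like this. The helper names `extract_rows` and `search_google` and the `limit` parameter are hypothetical and not taken from the repository.

```python
def extract_rows(response, limit=10):
    """Pick title, link and snippet from a SerpApi response dictionary."""
    rows = []
    for item in response.get("organic_results", [])[:limit]:
        rows.append(
            {
                "title": item.get("title", ""),
                "link": item.get("link", ""),
                # Some results come without a snippet, so default to "".
                "snippet": item.get("snippet", ""),
            }
        )
    return rows


def search_google(query, api_key, limit=10):
    """Query Google through SerpApi and return table-ready rows."""
    # Imported lazily: requires `pip install google-search-results`
    # and a valid SerpApi key; the call goes over the network.
    from serpapi import GoogleSearch

    search = GoogleSearch({"engine": "google", "q": query, "api_key": api_key})
    return extract_rows(search.get_dict(), limit)
```

The returned list of dictionaries can be passed straight to `pandas.DataFrame` or to Streamlit's `st.table` for display.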

Template

The Streamlit template I used was made by Fabio Chiusano (link). Huge thanks to him!

Deployment

The app is deployed here: Semantic-Search-Engine.

To run it locally

Clone the repository.
Set up a Python virtual environment.
Log in at SerpApi and get an API key.
Create a file called config.py and add api_key="ENTER YOUR KEY HERE" to it.
Run $ pip install -r requirements.txt
Run $ streamlit run "Path-to-Repo\Semantic-Search-Engine\app.py"
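For illustration, here is one way an app could read the key from config.py and fail with a clear message when the file is missing or still holds the placeholder. How app.py actually loads the key is not shown here, so `load_api_key` is a hypothetical helper and the error messages are assumptions.

```python
import importlib


def load_api_key(module_name="config"):
    """Return api_key from <module_name>.py, or raise a descriptive error."""
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        raise RuntimeError(
            f"{module_name}.py was not found; create it next to app.py"
        ) from exc

    key = getattr(module, "api_key", "")
    # Reject an empty value or the untouched placeholder from the setup steps.
    if not key or key == "ENTER YOUR KEY HERE":
        raise RuntimeError(f"Set api_key in {module_name}.py to your SerpApi key")
    return key
```

Checking the key once at startup surfaces a setup mistake immediately, instead of as a failed search request later.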

Video Demo

The video demo can be seen here: link

Built With

huggingface
keras
numpy
python
semantic-engines-semantic
sentence-transformers
serpapi
streamlit

Updates

Siddhartha Mahajan started this project — Jun 16, 2023 04:23 AM EDT
