Semantic Search Engine
Semantic search retrieves documents from a corpus by matching the meaning of a search query rather than only its wording. The search engine looks not only for exact text matches, but also for overlapping semantic meaning (e.g. synonyms and periphrases).
This repository contains the code for my semantic search engine, which can run user-provided queries against user-provided text or documents and also against the internet.
How did I go about it?
I built the app with Streamlit. The text blob is broken down into sentences using a regex (from StackOverflow). The sentences are then encoded into tensors using the all-MiniLM-L6-v2 model from SentenceTransformers.
Cosine-similarity scores are then calculated between the user-provided query (also encoded as a tensor) and each of these sentences.
A Pandas dataframe stores the sentences with their scores, and the sentences with the top 20 scores are displayed.
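The steps above (split into sentences, encode, score by cosine similarity, keep the top 20) can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the function names are hypothetical, the splitting regex is a simple stand-in for the one actually used, and `toy_encode` is a bag-of-words placeholder so the block runs without downloading a model. A bag-of-words encoder only matches shared words, so in the real app the encoding would come from the all-MiniLM-L6-v2 model, and the scoring would work on its dense tensors.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Split a text blob into sentences at '.', '!' or '?' followed by whitespace.

    Assumption: a simple stand-in regex, not the one used in the repository.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def toy_encode(sentence):
    """Placeholder encoder (assumption): word counts instead of model embeddings."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query, text, k=20):
    """Return up to k (score, sentence) pairs, highest score first."""
    query_vec = toy_encode(query)
    scored = [(cosine(query_vec, toy_encode(s)), s) for s in split_sentences(text)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    blob = "Cats sleep a lot. Dogs love to play fetch. The stock market fell today."
    for score, sentence in top_matches("dogs play", blob, k=2):
        print(f"{score:.3f}  {sentence}")
```

The list of (score, sentence) pairs maps directly onto a two-column dataframe, which is one way the top 20 rows could be shown in Streamlit.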
Google Search Results
I used SerpApi's Google Search API to fetch the titles, links, and snippets of the top results and displayed them as a table. This was added mostly as something to look into further in the future, and it may be useful in some scenarios.
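As a rough sketch of this step, the snippet below pulls the title, link, and snippet out of a SerpApi response and turns them into table rows. It assumes the `google-search-results` Python package (which provides `serpapi.GoogleSearch`) and that results arrive under the `organic_results` key; the function names are hypothetical and the repository's actual code may differ.

```python
def fetch_google_results(query, api_key):
    """Query SerpApi's Google Search API and return the raw response dict.

    Assumption: the google-search-results package is installed. It is imported
    lazily so the rest of this sketch runs without it.
    """
    from serpapi import GoogleSearch

    search = GoogleSearch({"engine": "google", "q": query, "api_key": api_key})
    return search.get_dict()


def results_to_rows(response):
    """Keep only title, link and snippet for each organic result."""
    rows = []
    for item in response.get("organic_results", []):
        rows.append(
            {
                "title": item.get("title", ""),
                "link": item.get("link", ""),
                "snippet": item.get("snippet", ""),  # some results have no snippet
            }
        )
    return rows
```

The resulting list of dicts can be passed to `pandas.DataFrame(rows)` and shown with `st.table`, with the key read from the `api_key` variable in config.py.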
Template
The Streamlit template I used was made by Fabio Chiusano (link). Huge thanks to him!
Deployment
The app is deployed here: Semantic-Search-Engine.
To run it locally
- Clone the repository
- Set up a virtual env (Python)
- Log in at SerpApi and get an API key.
- Make a file called config.py and add api_key="ENTER YOUR KEY HERE" in the file.
- Run $ pip install -r requirements.txt
- Run $ streamlit run "Path-to-Repo\Semantic-Search-Engine\app.py"
Video Demo
The video demo can be seen here: link
Built With
- huggingface
- keras
- numpy
- python
- semantic-engines-semantic
- sentence-transformers
- serpapi
- streamlit