Searchin

Antisocial Engineering

https://github.com/Samuel-Nathanson/searchin/tree/master

Inspiration & What it does

People of different ages and education levels may not have the same goals when using search engines on the Internet. A few weeks ago, I wanted to learn more about how too much cholesterol affects my health and well-being. I asked my virtual assistant, Siri, to tell me more about cholesterol, and Siri responded with this definition:

"Cholesterol is an organic molecule. It is a sterol, a type of lipid. Cholesterol is biosynthesized by all animal cells and is an essential structural component of animal cell membranes. Cholesterol also serves as a precursor for the biosynthesis of steroid hormones, bile acid, and vitamin D."

Did you understand Siri's definition of cholesterol? If you're not a life sciences professor, the answer is probably "no".

We recognized a global problem here. People, particularly children, spend a lot of time learning from the internet. Modern search engines do a great job of returning relevant results. However, as far as we know, no major search engine tailors its results to the user's age or education level.

Searchin is a custom search engine that delivers easy-to-understand search results. Its primary purpose is to help educate children by giving them age-appropriate, easy-to-understand information from the internet. For example, a child who uses Searchin to learn more about cholesterol would receive easy-to-digest public health information instead of a lecture on cholesterol's molecular structure.

How we built it

Source Control - We used GitHub for version control. This let us work both independently and together without any version control issues. We also used branches and pull requests to keep the code clearly organized.

Infrastructure - Hosted on Google Cloud’s App Engine with managed DNS and SSL security. We utilized Domain.com to register our domain and configured a few subdomains with CNAME records (https://iam.searchin.online, https://im.searchin.online, https://www.searchin.online).

Application - The search engine is a Flask web app hosted on Google Cloud that serves a search page, a results page, and a contact page. When the user submits the search form, the user’s reading level and search query are used to build a Google Custom Search API call, which gathers a set of Google search results. These results are passed, along with the reading level, to a set of algorithms that estimate the readability of each page and generate a relevance level for it based on its original index in the search results and its distance from the user’s reading level. The search results are then sorted and labeled by relevance and passed back to the web server to be rendered.
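The ranking step described above can be illustrated with a minimal sketch. This is not the project's actual scoring code: the `Result` class, the `relevance` and `rerank` functions, and the two weights are hypothetical names and values chosen for illustration. It only assumes, as the text says, that relevance combines a result's original index with its distance from the user's reading level.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    rank: int      # 0-based position in the original search results
    grade: float   # estimated reading grade level of the page


def relevance(result: Result, user_grade: float,
              rank_weight: float = 1.0, grade_weight: float = 2.0) -> float:
    """Penalty score (lower is better). Weights are illustrative assumptions."""
    reading_gap = abs(result.grade - user_grade)
    return rank_weight * result.rank + grade_weight * reading_gap


def rerank(results: list[Result], user_grade: float) -> list[Result]:
    """Sort results so the smallest combined penalty comes first."""
    return sorted(results, key=lambda r: relevance(r, user_grade))


results = [
    Result("https://example.com/biochem", rank=0, grade=14.0),
    Result("https://example.com/kids-health", rank=2, grade=5.0),
    Result("https://example.com/clinic", rank=1, grade=9.0),
]
ordered = rerank(results, user_grade=5.0)
print([r.url for r in ordered])
```

With a fifth-grade reader, the page that Google ranked first but that reads at a college level drops to the bottom, while the page matching the reader's level moves to the top. Changing the two weights shifts the balance between Google's ordering and readability.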

Challenges we ran into

Unfamiliarity with Google Cloud - None of us had experience with Google Cloud before, so we had a slower start, but App Engine’s setup was intuitive.

Performance issues - Response time is important for a search engine. We needed to rewrite parts of the code to be asynchronous or generally more efficient.

Relevance Scoring - We developed our own technique for generating a relevance score and experimented with its parameters to best model search result relevance.

Readability Scoring - We needed to compute a numerical readability score for each website. This required balancing different factors in a way that is flexible and robust enough to handle the diverse content on the internet.
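As an illustration of what a numerical readability score can look like, here is a small sketch of the Flesch-Kincaid grade level formula, one common readability metric. Whether Searchin used this exact formula is an assumption; the project lists TextSTAT, which offers several such metrics. The `count_syllables` helper is a rough, hypothetical heuristic written for this example.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels, minimum one."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # A trailing "e" is often silent, so do not count it as a syllable.
    if lowered.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Estimate the U.S. school grade needed to read the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0  # nothing to score
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


simple = "The cat sat on the mat. It was a good cat."
dense = ("Cholesterol is biosynthesized by all animal cells and is an "
         "essential structural component of animal cell membranes.")
print(flesch_kincaid_grade(simple), flesch_kincaid_grade(dense))
```

Longer sentences and longer words both push the grade up, so the dense definition scores well above the simple sentences. Real web pages also need their HTML stripped first (for example with BeautifulSoup) before a formula like this is meaningful.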

Accomplishments that we’re proud of

About halfway through the competition, we realized that we could improve the response time of our algorithm by redesigning a major part of our back-end to use asynchronous programming. None of our team members had any experience writing asynchronous code in Python, but we took on the challenge because we realized how important speed is to our users. We started learning asynchronous programming in Python at 1:34 AM, and we finished before 4:00 AM!
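To show why asynchronous code helps here, the following sketch scores several pages concurrently with `asyncio.gather`. It is only an illustration under stated assumptions: `score_page` and `score_all` are hypothetical names, the download is simulated with `asyncio.sleep`, and the returned score is a placeholder rather than a real readability value.

```python
import asyncio
import time


async def score_page(url: str) -> tuple[str, float]:
    # Stand-in for downloading a page and computing its readability.
    await asyncio.sleep(0.1)
    return url, float(len(url))  # placeholder score, not a real metric


async def score_all(urls: list[str]) -> list[tuple[str, float]]:
    # gather() runs the coroutines concurrently and preserves input order.
    return await asyncio.gather(*(score_page(u) for u in urls))


urls = [f"https://example.com/page{i}" for i in range(10)]
start = time.perf_counter()
scores = asyncio.run(score_all(urls))
elapsed = time.perf_counter() - start
print(f"scored {len(scores)} pages in {elapsed:.2f}s")
```

Because the ten simulated downloads wait at the same time, the total is close to the duration of one wait rather than the sum of all ten, which is the kind of gain that matters when each search has to inspect many result pages.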

What we learned

Projects on the scale of Searchin can be completed very quickly given the right environment.

Voice/text channels are very helpful in development.

Learned about readability/text complexity and how it can be computed.

What’s next for the project?

Looking towards the future, we plan to polish our readability and ranking algorithms. Although Searchin started as a project for this hackathon, we believe it can go much further and change the future of search engines. We plan to continue the work and possibly partner with companies in the search engine field, such as Google and Bing.

Built with

Searchin Online was built with Google Cloud’s App Engine, Flask, BeautifulSoup, Asyncio, TextSTAT, Google Custom Search API, and GitHub. It was written in Python, HTML, CSS, and JavaScript.

Check out some of these photos of our hack in action!!

Not a video of our hack itself, but we think this represents a world with Searchin in it! https://youtu.be/qxSikKDj_3c
