Inspiration

As a student, I struggled in high school to find a college that would be right for me. My goal is to create things that would have helped me when I was younger. If this project is implemented, it could help many students find the right college.

What it does

College Fit comes into the picture in high school, when students are choosing colleges. A very proactive student might look at 30 colleges, but most look at fewer. This is a problem because there are hundreds, if not thousands, of colleges that he or she never considers. With College Fit, a student tells the application what they want in an ideal college, and the application returns a list of colleges that best match it. College Fit streamlines the process by using modern machine learning and web scraping of first-person sources and reviews to build models that estimate how well a college experience would 'fit' a person.

How I built it

On the back end, a script scrapes review data from sites like niche.com or U.S. News College Rankings. After the reviews are gathered, the data is combined and put through a program that uses a natural language processing API, such as Google Cloud Natural Language or IBM’s Watson. By running sentiment analysis, the program can find the specific advantages and disadvantages of each aspect of a college. From that point, another program is needed to weight the salience (importance/relevance) of each general area (sports, academics, extracurriculars, etc.) as well as more specific groups (within sports: baseball, tennis, etc.). After all the information is sorted and weighted, it is compared to the provided student data. The student data could be gathered in multiple ways, including a few paragraphs about what they want in their ideal college, or a survey with items such as “A good physics department is important to me: rate 1-5, with 1 meaning it is not important”. By using this data to create a profile of the student, we can use the differences between the college’s and the student’s weights and sentiments to arrive at a score that predicts how well a college would match a given student.
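As a rough illustration of the profile-building step, here is a minimal sketch in Python. It assumes the NLP stage has already produced one record per mention with a category, a sentiment in [-1, 1], and a salience in [0, 1] (the ranges Google Cloud Natural Language reports). The record layout, the function names, and the salience-weighted average are assumptions made for this example, not the project's actual code.

```python
from collections import defaultdict

# Hypothetical NLP output for one college: one record per mention in a review.
mentions = [
    {"category": "academics", "sentiment": 0.8, "salience": 0.6},
    {"category": "academics", "sentiment": 0.2, "salience": 0.2},
    {"category": "sports", "sentiment": -0.4, "salience": 0.5},
]


def build_college_profile(mentions):
    """Return {category: salience-weighted mean sentiment} for one college."""
    weighted_sum = defaultdict(float)
    total_salience = defaultdict(float)
    for m in mentions:
        weighted_sum[m["category"]] += m["sentiment"] * m["salience"]
        total_salience[m["category"]] += m["salience"]
    # Skip categories whose mentions all had zero salience.
    return {
        category: weighted_sum[category] / total_salience[category]
        for category in weighted_sum
        if total_salience[category] > 0
    }


def survey_to_weights(ratings):
    """Turn 1-5 importance ratings into weights that sum to 1.

    A rating of 1 ("not important") maps to a weight of 0.
    """
    shifted = {category: rating - 1 for category, rating in ratings.items()}
    total = sum(shifted.values())
    if total == 0:
        return {category: 0.0 for category in ratings}
    return {category: value / total for category, value in shifted.items()}


college_profile = build_college_profile(mentions)
student_weights = survey_to_weights({"academics": 5, "sports": 2})
```

Weighting by salience means that a passing mention of a topic counts for less than a review that is mostly about it. A simple mean would also work if salience turns out to be noisy.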

Web scraper to create a local data source => Input converted into a usable form for natural language processing => Data math to find universities with the best fit.
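The final "data math" stage could look something like the sketch below. It assumes each college already has a per-category sentiment profile and the student has per-category importance weights that sum to 1. The college names, the numbers, and the weighted-sum score are illustrative assumptions. The writeup describes comparing differences in weights and sentiments, so this is a simpler stand-in for that comparison.

```python
def fit_score(student_weights, college_profile):
    """Weighted sum of a college's sentiment over the student's categories.

    Categories missing from the college profile count as neutral (0.0).
    """
    return sum(
        weight * college_profile.get(category, 0.0)
        for category, weight in student_weights.items()
    )


def rank_colleges(student_weights, profiles, top_n=10):
    """Return the top_n (name, score) pairs, best fit first."""
    scored = [
        (name, fit_score(student_weights, profile))
        for name, profile in profiles.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Made-up profiles for two hypothetical colleges.
profiles = {
    "College A": {"academics": 0.65, "sports": -0.4},
    "College B": {"academics": 0.1, "sports": 0.7},
}
student = {"academics": 0.8, "sports": 0.2}

ranking = rank_colleges(student, profiles)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

With these made-up numbers, the student who cares mostly about academics sees College A ranked first, even though College B has the stronger sports reviews.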

Challenges I ran into

- Writing a web scraper with no prior knowledge of building one or of pulling info from web elements.
- Working with Selenium to build a working prototype.
- Setting up a natural language processing algorithm with Google Cloud and creating usable data.

Accomplishments that I'm proud of

- Building a working web scraper.
- Learning about sentiment analysis and natural language processing.
- Using web APIs and tools to accelerate the machine learning framework.
- Learning to use cloud platforms' NLP and ML services with minimally explanatory documentation.

What I learned

- Honestly, I learned that I knew nothing.
- It was a hefty and tedious learning curve, filled with confusion, pain, and the will to give up.

What's next for College Fit

Further expansion of this idea could integrate factors such as scholarships available to specific demographics.

Here's the link to the GitHub repository: https://github.com/Osaila/CollegeFit/tree/master/HackPSU2018Fall
