Stats Engine

Initial Unloaded GUI for StatsGine

Inspiration

While in statistics class, we realized that entering data into calculators is menial work that can easily be automated. Furthermore, students from lower socioeconomic backgrounds sometimes lack access to tools like the TI-84 calculator, which would let them carry out statistical procedures, even though they may have access to computers at school. We combined these two observations to inspire the statisticians of tomorrow.

What it does

Our Java application has the user specify a URL along with a search term, that is, something they want to find on the web page. A Java web crawler built on Jsoup then finds URLs containing that search term and returns any relevant quantitative data to a statistics engine. The statistics engine performs the calculations the user requests and presents the results in an appropriate format. It can perform basic statistical analysis of one or two links, Z-test confidence intervals and hypothesis tests, and the corresponding T-test procedures. It can also calculate various forms of regression and, once the optimal equation is chosen, display it graphically.
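To make the confidence-interval step concrete, here is a minimal sketch of a 95% Z-interval for a mean. It is not the project's actual code: the class name `ZIntervalSketch`, the method names, the sample values, and the assumption that the population standard deviation is known are all hypothetical, chosen only for illustration.

```java
import java.util.Arrays;

class ZIntervalSketch {
    // Critical value z* for a two-sided 95% interval under the standard normal.
    static final double Z_95 = 1.959964;

    /** Arithmetic mean of the sample (NaN for an empty sample). */
    static double mean(double[] data) {
        return Arrays.stream(data).average().orElse(Double.NaN);
    }

    /**
     * Returns {lower, upper} for a 95% Z-interval around the sample mean.
     * Assumes the population standard deviation sigma is known.
     */
    static double[] zInterval95(double[] data, double sigma) {
        double m = mean(data);
        double margin = Z_95 * sigma / Math.sqrt(data.length);
        return new double[] { m - margin, m + margin };
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for data pulled from a web page.
        double[] sample = { 4.0, 6.0, 5.0, 7.0, 3.0 };
        double[] ci = zInterval95(sample, 2.0);
        System.out.printf("95%% CI: [%.3f, %.3f]%n", ci[0], ci[1]);
    }
}
```

The matching T-interval would swap the fixed critical value for one drawn from Student's t-distribution with n - 1 degrees of freedom, and use the sample standard deviation in place of sigma.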

How we built it

Jsoup: With the help of Jsoup, we created a Java web crawler that finds the first instance of the entered search term. The program then parses the page where the search term appears and extracts the quantitative data relevant to it.

Statistical Engine: With the Jsoup web crawler in place and Bing search queries integrated, we quickly found functions for calculations such as normal distributions, Student's t-distributions, and various regressions.
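The team credits further down mention using Riemann sums to evaluate the normal and t curves. A minimal sketch of that idea for the standard normal curve might look like the following. The midpoint rule, the class name `RiemannNormalSketch`, and the number of rectangles are assumptions made for illustration; the original write-up does not say which variant of the Riemann sum was used.

```java
import java.util.function.DoubleUnaryOperator;

class RiemannNormalSketch {
    /** Standard normal probability density function. */
    static double normalPdf(double x) {
        return Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI);
    }

    /** Midpoint Riemann sum of f over [a, b] using n equal-width rectangles. */
    static double midpointSum(DoubleUnaryOperator f, double a, double b, int n) {
        double width = (b - a) / n;
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            // Evaluate f at the midpoint of the i-th rectangle.
            total += f.applyAsDouble(a + (i + 0.5) * width);
        }
        return total * width;
    }

    public static void main(String[] args) {
        // Approximate P(-1.96 <= Z <= 1.96) for a standard normal variable.
        double p = midpointSum(RiemannNormalSketch::normalPdf, -1.96, 1.96, 10_000);
        System.out.printf("P(-1.96 <= Z <= 1.96) ~ %.4f%n", p);
    }
}
```

Because `midpointSum` accepts any `DoubleUnaryOperator`, the same routine could be pointed at a t-distribution density to cover the T-test procedures.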

Challenges we ran into

Because we only found out about Jsoup halfway through the hackathon, our original web crawler checked page content with a rudimentary HTML parser we wrote ourselves, which broke on markup that did not follow standard formatting conventions. To solve this, we did some research and replaced our approach with methods from Jsoup that reversed our process and produced the same result without compromising memory.

Accomplishments that we're proud of

We are quite proud of the integration with Bing: we had to study HTTP request protocols to integrate Bing search queries with Java so that our algorithm returns results fine-tuned to what the user requests. We are also quite proud of our dysfunctional, self-built HTML parser, though it needs refinement before we can use it in an algorithm such as this.

What we learned

We learned how to integrate search engine queries, how complex many of the statistical formulas we often took for granted really are, and, of course, the importance of teamwork!

What's next for Stats Engine

We hope to improve the GUI and add more statistical analysis options. We are also looking into refining the web crawler so that results load more efficiently and are presented in a way that is easy to view and understand.

Built With

bing-query
java
json
jsoup

Submitted to

Los Altos Hacks
- Winner Third Place

Created by

I created a Content Reader that searched the HTML for elements containing the search text, then walked up the hierarchy to the ancestor node directly below the body tag, extracted that node from the HTML file, and displayed it to the user. It also separately evaluated any infographics for relevant data. In addition, I integrated Bing Web Search with our Java program and designed the front end.

Rehan Durrani
I worked on the back end using Jsoup and Java. Primarily, I worked on the web crawler portion of the program, using Jsoup to retrieve the document at a given URL and collect the links in that document.

Andrey Pluzhnik
I used Riemann sums to approximate integrals for the normal and t curves (everything pertaining to statistics). I also organized information from the web crawler into data tables to make it easier to read.

Aniket Mandalik

Updates

Rehan Durrani started this project — Jan 31, 2016 12:59 PM EST
