Inspiration

Most of our team has taken an AP Statistics class at school. We always found it tedious to type in data values by hand, and occasional typing errors made it worse. We built PhotoGraph to overcome this issue, which has always bothered students.

What it does

PhotoGraph takes a picture of a data table from a textbook and processes it to create a scatterplot, regression line, and residual plot, while calculating the standard deviation and average of each variable.

How I built it

We created three modules:

  • imageToText.py: uses pytesseract OCR to extract the data points from an image
  • StatsWindow.py: the GUI applet, made using Tkinter
  • plot.py: uses Matplotlib to plot the graphs and performs all the statistical calculations
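As a rough illustration of the kind of calculation plot.py performs, here is a minimal sketch of a least-squares regression line with residuals and per-variable summary statistics. It is not our actual code: the function name `summarize` and the returned dictionary keys are made up for this example, and it uses only the Python standard library rather than Matplotlib or sklearn.

```python
from statistics import mean, stdev


def summarize(xs, ys):
    """Least-squares line, residuals, and summary statistics for paired data.

    Hypothetical helper for illustration; not taken from plot.py.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    x_bar, y_bar = mean(xs), mean(ys)

    # Sum of squared deviations in x, and of cross deviations in x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual = observed y minus the y predicted by the regression line.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

    return {
        "slope": slope,
        "intercept": intercept,
        "residuals": residuals,
        "mean_x": x_bar,
        "mean_y": y_bar,
        "sd_x": stdev(xs),  # sample standard deviation (n - 1 denominator)
        "sd_y": stdev(ys),
    }
```

The slope, intercept, and residuals returned here are the values a scatterplot, regression line, and residual plot would be drawn from.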

Challenges I ran into

The OCR wasn't very accurate at extracting just the data points. We ended up with a mix of letters, words, and x and y values, and we only wanted the x and y values.

Accomplishments that I'm proud of

We were able to separate the x and y values from the rest of the text in the image and place them in a CSV file.
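One way this kind of filtering could work is sketched below. It assumes the OCR output is plain text with one table row per line, and that a data row is exactly two numbers separated by whitespace or a common delimiter. The names `extract_pairs` and `write_csv` and the regular expression are invented for this example and may differ from what imageToText.py actually does.

```python
import csv
import re

# A signed integer or decimal number, e.g. "12", "-3.5", ".25".
NUMBER = r"[-+]?(?:\d+\.\d*|\.\d+|\d+)"

# A data row: exactly two numbers separated by spaces, commas,
# semicolons, or pipes (assumed delimiters), and nothing else.
ROW = re.compile(rf"^\s*({NUMBER})[\s,;|]+({NUMBER})\s*$")


def extract_pairs(ocr_text):
    """Keep only the lines that look like an (x, y) row; drop everything else."""
    pairs = []
    for line in ocr_text.splitlines():
        match = ROW.match(line)
        if match:
            pairs.append((float(match.group(1)), float(match.group(2))))
    return pairs


def write_csv(pairs, handle):
    """Write the pairs to an open text file handle with an x,y header row."""
    writer = csv.writer(handle)
    writer.writerow(["x", "y"])
    writer.writerows(pairs)
```

Lines such as titles, column headers, and captions fail the pattern and are skipped. A row where the OCR misread a digit as a letter is skipped too, so a stricter or looser pattern may suit a given textbook layout better.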

What we learned

Some of the group members were new to the Python programming language. As a result, we faced many challenges at the start of the project, but we soon overcame them by teaching everyone the essentials of Python. Many members also learned how to use Tkinter, pytesseract, Matplotlib, and computer vision modules.

What's next for PhotoGraph

As of now, you can only upload a dataset of two quantitative variables. Later on, we'll try to support categorical variables and improve text detection.

Built With

  • argparse
  • csv
  • matplotlib
  • pil
  • pytesseract
  • python
  • sklearn
  • tkinter