Most of our team has taken an AP Statistics class at school, and we have always found it tedious and error-prone to type in data values by hand. PhotoGraph is our attempt to solve this problem, which has long bothered students.
What it does
It takes a picture of a data table from a textbook and processes it to create a scatterplot, regression line, and residual plot, while also calculating the average and standard deviation of each variable.
How I built it
We created three modules:
imageToText.py : uses pytesseract OCR to extract the data points from the image
StatsWindow.py : the GUI applet, built with Tkinter
plot.py : uses Matplotlib to plot the graphs and performs all the statistical calculations
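The statistical calculations behind the plots can be sketched as follows. This is a minimal illustration using only the Python standard library, not the actual contents of plot.py; the function name summarize and the keys of the returned dictionary are assumptions made for this example. It computes the average and sample standard deviation of each variable, the least-squares regression line, and the residuals that a residual plot would show.

```python
import statistics


def summarize(xs, ys):
    """Return summary statistics, the least-squares line, and residuals.

    Hypothetical helper for illustration; not taken from plot.py.
    """
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)

    # Sample standard deviations (n - 1 in the denominator).
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)

    # Least-squares slope: sum of cross-deviations over sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx

    # The regression line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x

    # Residual = observed y minus predicted y.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "sd_x": sd_x,
        "sd_y": sd_y,
        "slope": slope,
        "intercept": intercept,
        "residuals": residuals,
    }


if __name__ == "__main__":
    result = summarize([1, 2, 3, 4], [3, 5, 7, 9])
    print(result["slope"], result["intercept"])
```

The residuals list is what would be plotted against the x values to produce the residual plot, and the slope and intercept define the line drawn over the scatterplot.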
Challenges I ran into
The OCR was not very accurate at extracting just the data points. We ended up with a mix of letters, words, and x and y values, when we only wanted the x and y values.
Accomplishments that I'm proud of
We were able to separate the x and y values from the rest of the text in the image and place them in a CSV file.
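One way to do this kind of filtering is sketched below. It is an assumption about the approach, not the code in imageToText.py: it keeps only the lines of OCR output that consist of exactly two numbers and drops everything else. The names extract_pairs and write_csv, the accepted separators (whitespace, commas, or vertical bars), and the x,y header row are all hypothetical choices for this example.

```python
import csv
import io
import re

# A signed integer or decimal number, e.g. "3", "-1", "4.25".
NUMBER = r"[-+]?\d+(?:\.\d+)?"

# A line that contains exactly two numbers separated by spaces, commas, or bars.
ROW = re.compile(rf"^\s*({NUMBER})[\s,|]+({NUMBER})\s*$")


def extract_pairs(ocr_text):
    """Return (x, y) float pairs from raw OCR text, skipping non-data lines."""
    pairs = []
    for line in ocr_text.splitlines():
        match = ROW.match(line)
        if match:
            pairs.append((float(match.group(1)), float(match.group(2))))
    return pairs


def write_csv(pairs, handle):
    """Write the pairs to an open text handle as CSV with an x,y header."""
    writer = csv.writer(handle)
    writer.writerow(["x", "y"])
    writer.writerows(pairs)


if __name__ == "__main__":
    # Made-up text standing in for OCR output from a textbook table.
    sample = "Table 3.1 Height vs Weight\nx y\n1.5 2\n3, 4.25\nFigure 2\n-1 | 7\n"
    buffer = io.StringIO()
    write_csv(extract_pairs(sample), buffer)
    print(buffer.getvalue())
```

A line-based filter like this is strict on purpose: a row where the OCR misreads a digit as a letter is dropped rather than guessed at, so the resulting file may be missing points but should not contain invented ones.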
What we learned
Some of the group members were new to the Python programming language. As a result, we faced many challenges at the start of the project, but we soon overcame them by teaching everyone the essentials of Python. Many members also learned how to use Tkinter, pytesseract, Matplotlib, and computer vision modules.
What's next for PhotoGraph
As of now, you can only upload a dataset of two quantitative variables. Later on, we'll try to support categorical variables and improve the text detection.