This is my fifth semester as a TA for undergrad CS classes. As our department grows, so does our reliance on the popular Q&A platform Piazza. Piazza was envisioned as a place where students could ask questions and receive answers from other students and TAs. However, the number of posts is growing at a nearly exponential rate, which puts a heavy burden on TAs to keep up and makes students less motivated to read other people's questions. The result is a lot of duplicate questions.
This tool is intended both to help students see more easily what other students are struggling with and to help TAs improve assignments from semester to semester.
What it does
We use the unofficial Piazza API to pull the data for the course, then build a graph of questions that reference each other. As TAs, we often flag a duplicate question by tagging it with a reference to the original (using an @question_number), so I built this on the idea that those references give us dead simple clustering of the data. We visualize it as an ordinary node-and-edge graph, and you can very easily see which areas of the homework gave students a lot of trouble!
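To make the clustering idea concrete, here is a minimal sketch of how @question_number references could be turned into clusters. It is not the project's actual code: the `posts` dictionary, the function names, and the assumption that references show up in post text as a plain `@123` are all mine for illustration.

```python
import re
from collections import defaultdict

# Assumption: a reference appears in post text as "@" followed by digits.
# The lookbehind skips things like email addresses ("ta@3").
REF_PATTERN = re.compile(r"(?<!\w)@(\d+)")


def build_reference_graph(posts):
    """Build an undirected graph from a dict of post number -> post text."""
    graph = defaultdict(set)
    for number, text in posts.items():
        graph[number]  # touch the key so unreferenced posts still become nodes
        for match in REF_PATTERN.findall(text):
            target = int(match)
            # Ignore self-references and references to posts we don't have.
            if target != number and target in posts:
                graph[number].add(target)
                graph[target].add(number)
    return graph


def clusters(graph):
    """Return connected components, each as a sorted list of post numbers."""
    seen = set()
    result = []
    for start in sorted(graph):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        result.append(sorted(component))
    return result
```

Each connected component is one group of questions that point at each other, so a large component suggests a part of the homework that tripped up a lot of people.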
How I built it
I parsed the course data and loaded it into the Cytoscape graph API.
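As a rough sketch of that hand-off, the snippet below turns reference edges into the elements JSON that Cytoscape.js accepts (nodes and edges, each wrapped in a `data` object). I'm assuming Cytoscape.js specifically; the function name, the `label` field, and the input shapes are hypothetical and not taken from the project.

```python
import json


def to_cytoscape_elements(edges, labels):
    """Convert reference edges to Cytoscape.js-style elements.

    edges:  iterable of (source, target) post numbers
    labels: dict of post number -> post subject (hypothetical input shape)
    """
    nodes = [{"data": {"id": str(n), "label": labels[n]}} for n in sorted(labels)]
    seen = set()
    edge_elements = []
    for source, target in edges:
        # Skip repeated edges and edges pointing at posts we have no node for.
        if (source, target) in seen or source not in labels or target not in labels:
            continue
        seen.add((source, target))
        edge_elements.append(
            {"data": {"id": f"{source}-{target}", "source": str(source), "target": str(target)}}
        )
    return {"nodes": nodes, "edges": edge_elements}


if __name__ == "__main__":
    elements = to_cytoscape_elements([(2, 1), (3, 1)], {1: "Part 2 setup", 2: "Dup?", 3: "Same issue"})
    print(json.dumps(elements, indent=2))
```

The resulting JSON can be passed as the `elements` option when creating the graph on the front end, and a style rule can read the `label` field to show the post subject on each node.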
What I learned
The graph makes it immediately clear which areas of the assignment were difficult. Even as a TA, I learned quite a bit from playing with the UI, and I found it a really efficient way to see at a glance what students are struggling with.
What's next for Piazza Plz Help Me
A long time ago, I made a Chrome extension that injects additional functionality into Piazza. I'd like to integrate that into this project to allow students and TAs to "tag" posts with more meaningful categories. It would be great to identify clusters of questions in a more automated way, and being able to tag posts is one step closer to using true machine learning algorithms. I thought about various models that could group questions better than links alone, and I think a supervised learning method would perform much better in the long run. An extension with easy tagging buttons would not be hard to implement or to get people to use.
There's a lot more data that I am interested in getting out of this that can answer interesting questions. For example, are there students who consistently ask duplicate questions? Can we determine the rate at which students move through assignments by associating certain types of questions with parts of the assignment? Is there a correlation between the rate of instructor response and the average number of questions asked on Piazza? These are all things that are answerable using this infrastructure, and questions that have come up before.
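As one example, the first question could be approached with something like the sketch below. Nothing here is implemented in the project: the record keys `author`, `number`, and `duplicate_of` are hypothetical, and it assumes each question has already been matched to the earlier post it duplicates (and that the author is known, which anonymous posts would break).

```python
from collections import Counter


def duplicate_rates(questions):
    """Return, per author, the fraction of their questions marked as duplicates.

    questions: iterable of dicts with hypothetical keys 'author', 'number',
    and 'duplicate_of' (an earlier post number, or None).
    """
    asked = Counter()
    duplicates = Counter()
    for q in questions:
        asked[q["author"]] += 1
        # Only count it when the referenced post came first.
        if q["duplicate_of"] is not None and q["duplicate_of"] < q["number"]:
            duplicates[q["author"]] += 1
    return {author: duplicates[author] / total for author, total in asked.items()}
```

Sorting the result by rate would surface the students who most often re-ask existing questions, which is a starting point rather than a verdict.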
Also, the tool doesn't currently incorporate follow-up posts.