Finding out exactly what happens during parliamentary sessions is a pain. The transcripts of Parliament meetings are publicly available, however, so we were curious whether we could extract interesting insights from them, reduce the time spent reading transcripts, and improve the transparency of the democratic process.
What Parlamenticon does
The app analyses transcripts of the meetings of the Czech parliament and creates a visual summary. It gives immediate insights about the most important topics discussed. Moreover, we show an overview of the most active speakers in each meeting.
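To illustrate the speaker overview, here is a minimal sketch of ranking speakers by number of speaking turns. The `Name: text` line format, the names, and the `most_active_speakers` helper are assumptions made for this example; the real transcripts are formatted differently and need their own parsing.

```python
from collections import Counter

# Hypothetical excerpt in an assumed "Name: text" format (not the real page layout).
TRANSCRIPT = """\
Novak: I open the session.
Svoboda: I propose an amendment to the budget.
Novak: Thank you. Any objections?
Dvorak: None from our side.
Novak: Then we proceed to the vote.
"""


def most_active_speakers(transcript, n=3):
    """Count speaking turns per speaker and return the n most frequent."""
    turns = Counter()
    for line in transcript.splitlines():
        # Everything before the first colon is treated as the speaker name.
        speaker, sep, _ = line.partition(":")
        if sep and speaker.strip():
            turns[speaker.strip()] += 1
    return turns.most_common(n)


ranking = most_active_speakers(TRANSCRIPT)
print(ranking)
```

Counting turns is only one possible measure of activity; total words spoken would be an equally reasonable choice.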
The solution & tech-stack
Scraping, parsing and cleaning of the parliamentary pages (Unix tools, Python), grouping & text analysis (Python, lemmatization, keyword analysis, topic analysis - LDA, LSI), API, frontend & visualization (Flask, Jinja, Bootstrap, D3.js).
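The keyword analysis step can be sketched roughly as follows. This is a minimal TF-IDF example using only the Python standard library, not the project's actual code: the `MEETINGS` data, the `tokenize` and `top_keywords` helpers, and the plain regex tokenizer (standing in for real lemmatization) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: meeting id -> already cleaned transcript text.
MEETINGS = {
    "meeting_1": "budget tax budget pension reform tax budget",
    "meeting_2": "health hospital budget nurse hospital health",
    "meeting_3": "school teacher education budget school",
}


def tokenize(text):
    """Lowercase and split into word tokens (a stand-in for lemmatization)."""
    return re.findall(r"\w+", text.lower())


def top_keywords(meetings, n=2):
    """Return the n highest-scoring TF-IDF terms for each meeting."""
    tokens = {mid: tokenize(text) for mid, text in meetings.items()}
    # Document frequency: in how many meetings each term appears.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    total = len(tokens)
    result = {}
    for mid, toks in tokens.items():
        tf = Counter(toks)
        scores = {
            term: (count / len(toks)) * math.log(total / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result[mid] = ranked[:n]
    return result


keywords = top_keywords(MEETINGS)
print(keywords)
```

A term such as "budget" that appears in every meeting gets a score of zero, which is why TF-IDF tends to surface the words that distinguish one meeting from the others.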
Challenges we ran into
The Wi-Fi in the lobby forced us to rely on the limited bandwidth of mobile hotspots; frequently four laptops were connected to the internet through one phone. On the implementation side, we had some issues with styling in the d3.cloud library.
Accomplishments that we're proud of
We believe that, despite the limited time, we were able to finish the pipeline and deliver meaningful insights through a beautiful visualization.
What we learned
We enjoyed the collaborative atmosphere at this event; it was really something special. We improved our ability to speed up development through communication. Pair programming can actually be useful and even fun!
What's next for Parlamenticon
We would like to develop the idea further: enable incremental processing of this evolving dataset, create more visualizations as embeddable components, publish the API, and connect to other open data initiatives.