The human eye is quick to pick out patterns. A picture is worth a thousand words, as the saying goes, and data presented as a graph or chart is far easier to comprehend and analyze than the same data sitting in a spreadsheet or list. Visuals hold an audience's attention, pique curiosity, and communicate an idea effectively. Data visualization helps teams make swift data-driven decisions and gain better insights, whether they are choosing a product optimization approach, measuring business growth, or making other critical decisions.
Our team repeatedly found data visualization to be a cumbersome and laborious process. Even plotting a simple histogram took coding or manual effort. We wanted a solution that could assist our colleagues and increase their efficiency: a tool that could solve the problem within seconds and offer the user almost all the available options, while keeping the output easy to comprehend and shareable on Slack. Hence, we created Autoplotter.
What it does
Autoplotter is a Slack app where users can drag and drop any dataset file (CSV, TXT, JSON, or NPY format) and start Exploratory Data Analysis in any Slack channel. It supports different types of visualizations and statistical analysis, choosing plots according to the data supplied in the file. Autoplotter can produce a wide variety of charts, histograms, count plots, 3D scatter plots, subplots, and so on. The best thing about Autoplotter is that you don't have to spend time writing code for data analysis and visualization: Autoplotter does all of this within a few clicks, a dramatic improvement over manual processing. It also has a user-friendly interface that lets users download files according to their requirements. The output consists of separate zip files for Univariate, Bivariate, and Miscellaneous plots. Automatically generated plots are shared instantly with others in Slack channels!
How we built it
We split the development of Autoplotter into a few sections.
The first section covers the plotting of graphs, which was done mainly using matplotlib, seaborn, and pandas. We used modules to find the subsets of columns suited to each type of graph.
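The idea of matching column subsets to graph types can be sketched as follows. This is only an illustration, not Autoplotter's actual code: the function name plan_plots, the "numeric"/"categorical" labels, and the choice to file 3D scatter plots under "miscellaneous" are all assumptions made for the example. Each resulting spec would then be handed to a matplotlib or seaborn plotting function.

```python
from itertools import combinations


def plan_plots(columns):
    """Group columns into plot specs (hypothetical helper).

    `columns` maps a column name to either "numeric" or "categorical".
    Returns a dict with "univariate", "bivariate" and "miscellaneous" lists.
    """
    numeric = [c for c, kind in columns.items() if kind == "numeric"]
    categorical = [c for c, kind in columns.items() if kind == "categorical"]

    plan = {"univariate": [], "bivariate": [], "miscellaneous": []}

    # One histogram per numeric column, one count plot per categorical one.
    for col in numeric:
        plan["univariate"].append(("histogram", col))
    for col in categorical:
        plan["univariate"].append(("countplot", col))

    # Every pair of numeric columns gets a 2D scatter plot.
    for x, y in combinations(numeric, 2):
        plan["bivariate"].append(("scatter", x, y))

    # Assumption: 3D scatter plots over numeric triples count as miscellaneous.
    for x, y, z in combinations(numeric, 3):
        plan["miscellaneous"].append(("scatter3d", x, y, z))

    return plan
```

Planning the plots separately from drawing them keeps the column-selection logic easy to test without rendering any figures.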
The second section parses the data according to the file format uploaded by the user. For CSV files, it checks whether the file has a header and converts the file to a data frame for ease of plotting. For JSON files, it normalizes the file and converts it to a table that can easily be ported to a data frame. For NPY/NPZ files, it reshapes the contents into a data frame that can be supplied to the plotting files from the first section. For TXT files, it uses a parameter to detect the delimiter and split the given data. This file also uses a function to remove all rows that contain empty fields, to avoid unnecessary errors while plotting. All these data frames are then plotted using the functions mentioned in the first section.
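A rough sketch of the delimited-text steps above (delimiter detection, header check, and dropping rows with empty fields) is shown below. It uses only Python's standard csv module rather than a data frame, and the helper names and the header heuristic are assumptions made for this example, not the project's actual implementation.

```python
import csv
import io


def _is_number(cell):
    """Return True if the cell text parses as a float."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def parse_delimited(text, candidates=",;\t|"):
    """Parse delimited text into (header, rows) (hypothetical helper)."""
    # Let csv.Sniffer pick the delimiter from the candidate characters.
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    rows = [row for row in csv.reader(io.StringIO(text), dialect) if row]
    if not rows:
        return [], []

    # Assumed heuristic: the file has a header when the first row holds no
    # numeric cells while at least one later row does.
    first_is_text = not any(_is_number(c) for c in rows[0])
    later_has_numbers = any(_is_number(c) for row in rows[1:] for c in row)
    if first_is_text and later_has_numbers:
        header, body = rows[0], rows[1:]
    else:
        header = [f"col_{i}" for i in range(len(rows[0]))]
        body = rows

    # Drop rows that are short or contain blank fields, so that later
    # plotting code never sees missing values.
    width = len(header)
    clean = [r for r in body if len(r) == width and all(c.strip() for c in r)]
    return header, clean
```

For example, parse_delimited("name;age;city\nAnn;31;Oslo\nBob;;Rome\n") would detect the semicolon, treat the first row as the header, and drop Bob's row because its age field is empty.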
The third section integrates all the backend tasks with the Slack app using slack_sdk (Web client) and flask. The next file contains the function responsible for displaying the output in the Slack channel. This function takes 4 arguments, including filename. We have integrated Slack's latest UI functionality, Block Kit, to display the desired files in a format the user can easily understand. The main file uses slackeventsapi for the Slack signing secret and Slack token. It triggers a function, send_message, on app mention (in this case @autoplotter). This payload gets the file using the ngrok server, uses the functions written in the previous file, and sends appropriate messages according to the file. It checks the number of plots and sends a message if there are no visualizations or the shared file was corrupted.
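The final check on the number of plots might look something like the sketch below, which builds a Block Kit payload as plain Python dictionaries. The function name build_result_blocks, its parameters, and the message wording are assumptions for illustration only. The block structure follows Slack's documented "header" and "section" block types.

```python
def build_result_blocks(filename, plot_count, corrupted=False):
    """Return Block Kit blocks describing the outcome (hypothetical helper)."""
    if corrupted:
        # The uploaded file could not be parsed at all.
        text = f":warning: `{filename}` could not be read. It may be corrupted."
    elif plot_count == 0:
        # The file parsed, but nothing in it could be plotted.
        text = f"No visualizations could be generated for `{filename}`."
    else:
        text = f"Generated *{plot_count}* plots for `{filename}`."

    return [
        # Header blocks only accept plain_text.
        {"type": "header", "text": {"type": "plain_text", "text": "Autoplotter"}},
        # Section blocks may use Slack's mrkdwn formatting.
        {"type": "section", "text": {"type": "mrkdwn", "text": text}},
    ]
```

A list like this could then be passed as the blocks argument of the slack_sdk WebClient's chat_postMessage call, alongside the channel ID and a plain-text fallback.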
Challenges we ran into
We faced numerous challenges:
- The application needed to be generic and give an appropriate response for every request. Handling every case across different file formats while still visualizing meaningful data was the major challenge in this project.
- Another challenge was integrating Slack with the Python backend code. Receiving a request, processing it, and then sending a response required a certain level of synchronization, which became manageable with the features provided by the Slack API and the engineering we applied on top of it.
- In addition, ensuring that the outputs from Autoplotter can be easily digested and distributed in Slack was also a challenge.
Accomplishments that we're proud of
We are very proud of having built a comprehensive solution for a problem that we faced on a regular basis. We achieved all of our goals and have been using Autoplotter internally on a daily basis. We are also very happy with the integration of Autoplotter into Slack, which makes our lives just a heck of a lot easier!
What we learned
Throughout this project, we gained a lot of knowledge about how to use the Slack API and Slack Block Kit (front-end) for building Slack apps. With this, we are excited to craft new Slack apps in the near future.
What's next for Autoplotter
This first version of Autoplotter outputs all possible plots without requiring any user input (except the data file). The next version of Autoplotter will cater more to users' needs by taking the required inputs and allowing users to further edit and customize the plots.