Please note: We have two project submissions and were told by the host to list both projects in this section. We describe RAD-V first, followed by GithubSearchMadeEasy.

########RAD-Visualizer

Inspiration

As the human race stepped into the information age, the explosion of data became one of the defining developments of the past few decades. It came as no surprise when the data scientist role was called the sexiest job of the 21st century. The number of roles in the field has reached an all-time high, yet there is still a dire need for good data scientists. So one has to ask: why haven't the abundance of data and the massive interest in data science projects been enough to meet that need? We believe the answer is simple. Most data science roles in industry require a strong background in statistics coupled with a strong background in programming in Python, R, Java and so on. This means that many people cannot match the required skill set. Consider, for instance, statisticians who have no background in coding, or Business Intelligence majors who could identify key patterns that the usual CS/ECE developers might miss.

RAD-V is a web application built with exactly this goal in mind: bridging the gap between non-coders and data science.

What it does

The application provides a set of features that we believe could help many people enter the data science realm. The user can upload data files to the web app, receive a quick summary of the data, visualize it, and run machine learning models, all without writing a single line of code! Examples of the functionality include scatter plots, bar plots, mapping of geospatial data, and machine learning algorithms such as linear regression and decision trees.

How we built it

Like every successful project, RAD-V was designed with one main goal in mind: the end user. We wanted users to get the feel and performance of one of the most popular languages among data scientists without being held back by a lack of expertise in it. We therefore chose the R programming language and used an R Shiny dashboard to host a web application that provides data science functionality to the user. Internally, we also made use of several open source R packages such as ggplot2 and leaflet, along with some machine learning packages.

Challenges we ran into

One of the biggest challenges was that two of our three team members were new to the R programming language, let alone to hosting a web application in it. Another challenge was making a customer-centric product. The product aims to provide a comprehensive set of tools that a user can work with without getting confused. After many iterations of designing and implementing a minimalistic web UI, we were able to create a tool suited to both beginner and intermediate users of R in the data science realm.

Accomplishments that we're proud of

RAD-V is a one-of-a-kind open source web app that can be used by anyone who wants to enter the data science world. The user can use both data visualization and machine learning tools to generate reports, identify patterns and much more! Letting the user work without code is not meant to avoid teaching them to code. On the contrary, we want to provide an interactive learning experience through the application. This is the primary reason every tool in the web app has a code block that displays the code used to generate the plot or run the algorithm. It is the opposite of how a notebook works. Usually, the user writes the code and then observes its output. Here, we let the user generate the output using a GUI and then display the back-end code, which is generated dynamically from the user's inputs. We aim to create a GUI that exposes a comprehensive set of the options available in the underlying packages, so the user can have a truly unique and complete learning experience.
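The idea of turning GUI inputs into displayable code can be sketched as follows. This is only an illustration written in Python (the app itself is built in R), and the function name, its parameters, and the template are hypothetical rather than taken from the RAD-V source. It builds the text of a ggplot2 scatter plot call from the choices a user might make in a form.

```python
def ggplot_scatter_code(data_name, x, y, color=None, title=None):
    """Return R source code (as text) for a ggplot2 scatter plot.

    Hypothetical helper: the arguments stand in for values a user
    would pick in the GUI (dataset, columns, optional colour, title).
    """
    aes_args = [f"x = {x}", f"y = {y}"]
    if color:
        aes_args.append(f"colour = {color}")

    # Each layer becomes one line of the displayed R code.
    layers = [f"ggplot({data_name}, aes({', '.join(aes_args)}))", "geom_point()"]
    if title:
        # Assumes the title contains no double quotes.
        layers.append(f'ggtitle("{title}")')

    return "library(ggplot2)\n" + " +\n  ".join(layers)


if __name__ == "__main__":
    print(ggplot_scatter_code("iris", "Sepal.Length", "Sepal.Width", color="Species"))
```

The same string that is shown to the user could also be the code that is evaluated to draw the plot, which keeps the displayed code and the output in sync.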

What we learned

We learnt that hackathons are about much more than prizes, food and giveaways. They are about coming together to learn, build and collaborate. We realize we have a long way to go to create a complete project. But there is a pressing need to develop and promote open source tools that help beginners interested in data science use data technologies effectively. RAD-V is here to help the current open source community and, more importantly, to support its growth to all corners of the world. Lastly, being new to open source, we realize how much we have taken this community for granted, given all that it has provided us.

What's next for RAD Visualizer

As mentioned above, we have a long way to go to realize the full potential of RAD-V. The most immediate additions are support for more data types and more algorithms. The Machine Learning section also needs more models, such as unsupervised learning, along with cross-validation, feature selection and bootstrapping. In addition, we envision making this project available as a web browser plugin, so users all over the world could use it for any task they like, such as finding patterns in public datasets, understanding their own financial trends, or simply mapping different locations around the globe!

###################################GithubSearchMadeEasy

Inspiration

GitHub is one of the world's leading software development platforms, used by individuals and teams to work through problems, share ideas and move ahead. Popular throughout the developer world, it supports many different programming languages and projects and is considered one of the biggest repositories of code. Naturally, such a huge code base requires a comprehensive and advanced search feature for finding projects and more, to support and promote the GitHub open source community. GitHub provides this with its advanced search. But what about those who are relatively new to coding or to the concept of open source? As the mentors and keynote speakers mentioned in their talks, it is usually the first step that scares a new open source developer. Not knowing enough about GitHub or how to use it, or not knowing where to start, is the biggest hurdle in their path. For them, the basic search in the nav bar is not enough to find projects they can work on, and the advanced search is just too complex. (We must confess, even we never used the advanced search to its full potential when searching for issues, projects or code, in spite of being developers for years!) With this project, we aim to bridge this gap. Our focus is on people who are looking for a start in the open source world.

What it does

With GithubSearchMadeEasy, we provide just the right amount of information and input options that a newcomer needs to find beginner-level issues in open source projects to start with.

How we built it

Using R Shiny, we built a web app around the GitHub API that sets some of the search parameters implicitly in the web app's back end. After researching which search parameters retrieve the best results for a new open source developer, we concluded that specific labels, the popularity of the repository and the timestamps associated with the issues could be crucial in helping a newcomer. For example, GitHub suggests labels such as 'good first issue' and 'help wanted' for inviting new developers to help. We also found other labels that are popular in the community for addressing the same audience, such as 'beginner' and 'open for grabs'. Using these findings, we query the API and sort the results in the order of relevance that we believe helps a newcomer the most.
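A minimal sketch of how such implicit search parameters could be assembled is shown below. It is written in Python for illustration (the app itself uses R Shiny) and is not the project's actual code: the function names, the label list, and the choice to sort by most recently updated are assumptions. It only builds request URLs for GitHub's issue search endpoint and makes no network calls. Repository popularity is not covered here.

```python
from urllib.parse import urlencode

# GitHub REST endpoint for searching issues and pull requests.
SEARCH_URL = "https://api.github.com/search/issues"

# Labels named in the write-up; exact spellings vary between repositories.
BEGINNER_LABELS = ["good first issue", "help wanted", "beginner", "open for grabs"]


def build_query(label, language=None, created_after=None):
    """Build a search query string for open issues carrying one label."""
    parts = ["is:issue", "is:open", f'label:"{label}"']
    if language:
        parts.append(f"language:{language}")
    if created_after:
        # Date in YYYY-MM-DD form, e.g. "2018-01-01".
        parts.append(f"created:>={created_after}")
    return " ".join(parts)


def build_urls(language=None, created_after=None, per_page=20):
    """Return one request URL per beginner label (hypothetical helper)."""
    urls = []
    for label in BEGINNER_LABELS:
        params = {
            "q": build_query(label, language, created_after),
            "sort": "updated",  # assumption: surface recently active issues first
            "order": "desc",
            "per_page": per_page,
        }
        urls.append(f"{SEARCH_URL}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    for url in build_urls(language="r", created_after="2018-01-01"):
        print(url)
```

Issuing one query per label and merging the results keeps each query simple; the user only ever picks a language or a date, while the labels stay hidden in the back end.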

Challenges we ran into

GitHub's advanced search is not open source, so we could not contribute to it directly. There is a limit on the number of API calls that can be made per minute. Not all of us were proficient in R and the GitHub APIs (some of us had never even used them before!). Finding the various tags that open source projects use to mark beginner-friendly issues took a fair amount of manual effort.
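One common way to stay under a per-minute call limit is to space requests evenly. The sketch below is a generic illustration in Python, not the approach used in the project; the class name is hypothetical and the limit of 10 calls per minute in the usage example is an assumed figure that should be checked against GitHub's current documentation.

```python
import time


class Throttle:
    """Space calls so that at most `max_calls` happen per `period` seconds."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = period / max_calls
        self.clock = clock    # injectable for testing
        self.sleep = sleep    # injectable for testing
        self.last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


if __name__ == "__main__":
    throttle = Throttle(max_calls=10)  # assumed limit, see lead-in
    for label in ["good first issue", "help wanted"]:
        throttle.wait()
        print("would query the API for label:", label)
```

Calling `wait()` before each API request is enough; the first call returns immediately and later calls pause only when requests arrive too quickly.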

Accomplishments that we're proud of

We made an open source tool to help the open source community! We simplified the very process we went through to discover beginner tags on GitHub, so that others do not need to repeat it.

What we learned

We developed our skills in R! We also learned about the ethics of open source and why it is needed.

What's next for GitHubSearchMadeEasy

We plan to provide more information to the user while finding the crossover point between simplicity and information overload. We also want to integrate a forum where beginners can learn more about open source and find help, become the hub for beginners in open source, and add a section for finding events like HackIllinois as a way to dive into open source.

Built With

  • r
  • shiny
  • ggplot
  • esrileaflet