This project was inspired by the habit students at our school have of checking their grades every other second, and of having to get past that pesky login screen every time. Likewise, when a news story is still unfolding, the best way to keep up is often to keep an eye on a single page. The problem is that it quickly becomes impractical to keep track of every page you would like to follow. Scrapdash solves this by letting you follow as many pages as you like and displaying them all in one place.

What it does

The extension is made up of two parts. The first is an interactive element selector that works on any web page. It lets you select any piece of information, or any whole page, that you want to keep an eye on. The second is a stylish, personalized new-tab page that gives you a bird’s-eye view of all the latest updates you care about. Together, the two components keep you informed and up to date at minimal cost to your time.

How we built it

The extension was built with a combination of technologies. To take advantage of the many libraries that exist for JavaScript, we used Webpack to bundle the extension; this also let us use Babel for more recent ECMAScript features and Vue for the frontend interface. For web page rendering, we used headless Chrome with the Puppeteer library to get the best results. In addition, we developed a small local server that serves as the bridge between our browser extension and the headless instance.

Challenges we encountered

As it turns out, extracting content from web pages while preserving their style and appearance was unexpectedly difficult, especially given all the JavaScript frameworks and CSS stylesheets used on the modern web. We got around this by rendering the web pages the same way users see them. A headless instance of Chrome runs in the background and renders each page with the user’s cookies, then produces a screenshot that retains all the style and detail of the original page.
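The rendering approach described above might look roughly like the following. This is a minimal sketch, not the actual Scrapdash code. It assumes the `puppeteer` npm package is installed and that cookies arrive in the shape returned by the browser's `chrome.cookies` extension API. The names `toPuppeteerCookie` and `captureElement` are hypothetical.

```javascript
// Sketch only: assumes the `puppeteer` npm package and cookies shaped like
// the objects returned by the chrome.cookies extension API.

// Map chrome.cookies sameSite values onto the names Puppeteer expects.
const SAME_SITE = { strict: "Strict", lax: "Lax", no_restriction: "None" };

// Convert one extension-style cookie into a Puppeteer cookie parameter.
function toPuppeteerCookie(cookie) {
  const out = {
    name: cookie.name,
    value: cookie.value,
    domain: cookie.domain,
    path: cookie.path || "/",
    secure: Boolean(cookie.secure),
    httpOnly: Boolean(cookie.httpOnly),
  };
  const sameSite = SAME_SITE[cookie.sameSite];
  if (sameSite) out.sameSite = sameSite;
  // Session cookies have no expirationDate, so `expires` stays unset for them.
  if (typeof cookie.expirationDate === "number") {
    out.expires = cookie.expirationDate;
  }
  return out;
}

// Render `url` with the user's cookies and screenshot a single element.
// Resolves to PNG image data, or null if the selector matches nothing.
async function captureElement(url, selector, cookies = []) {
  const puppeteer = require("puppeteer"); // loaded lazily, only when rendering
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    if (cookies.length > 0) {
      await page.setCookie(...cookies.map(toPuppeteerCookie));
    }
    // Wait for network activity to settle so script-rendered content appears.
    await page.goto(url, { waitUntil: "networkidle2" });
    const element = await page.$(selector);
    return element ? await element.screenshot({ type: "png" }) : null;
  } finally {
    await browser.close(); // always release the headless instance
  }
}
```

Taking a screenshot of the selected element, rather than copying its HTML, sidesteps the problem of reproducing the page's scripts and stylesheets elsewhere. The trade-off is that the result is an image, not live content.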

While creating the container image for the Scrapdash server, we also ran into a few issues with Chromium support inside containers being ... strange. Running a sandboxed Chromium instance inside a container meant making sure that a number of additional dependencies were installed and that the proper capabilities were added to the container.

Accomplishments that we're proud of

Our team was able to produce a fully functional product rather than just a demo. Looking back at what we’ve created, we are proud to say that it is an application that we would personally use and that solves a genuine problem in people’s lives (however silly the original inspiration was).

What we learned

Our team had to overcome many different problems while building this extension. We gained a much deeper understanding of how the native messaging API for extensions works as we created the host part of the extension, which pulls information from remote sources.
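For readers unfamiliar with native messaging: the browser and the host process exchange JSON messages over standard input and output, and each message is preceded by a 32-bit length in native byte order. The sketch below shows that framing in Node.js. It is an illustration under those assumptions, not the Scrapdash host itself, and the names `encodeMessage` and `decodeMessages` are hypothetical.

```javascript
// Sketch of native messaging framing: a 4-byte length header in native
// byte order, followed by the UTF-8 encoded JSON body.
const os = require("os");

const LITTLE_ENDIAN = os.endianness() === "LE";

// Frame one JSON-serializable value as header + body.
function encodeMessage(message) {
  const body = Buffer.from(JSON.stringify(message), "utf8");
  const header = Buffer.alloc(4);
  if (LITTLE_ENDIAN) header.writeUInt32LE(body.length, 0);
  else header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
}

// Decode every complete message in `buffer`. Bytes belonging to a message
// that has not fully arrived yet are returned as `rest`.
function decodeMessages(buffer) {
  const messages = [];
  let offset = 0;
  while (buffer.length - offset >= 4) {
    const length = LITTLE_ENDIAN
      ? buffer.readUInt32LE(offset)
      : buffer.readUInt32BE(offset);
    if (buffer.length - offset - 4 < length) break; // body still incomplete
    const start = offset + 4;
    messages.push(JSON.parse(buffer.toString("utf8", start, start + length)));
    offset = start + length;
  }
  return { messages, rest: buffer.subarray(offset) };
}
```

A host built on this would append each chunk read from `process.stdin` to the leftover `rest`, call `decodeMessages` on the result, and write replies to `process.stdout` with `encodeMessage`. Keeping the framing in two pure functions makes it easy to test without launching a browser.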

What's next for Scrapdash?

We are on track to release the extension for both Chrome and Firefox, and we have several more features planned, including server-side rendering and built-in OCR. We also plan to open-source the project and share it with the wider community.
