I love reading scary stories every October, and the best ones always come from regular users sharing their experiences on forums. The only downside is that they are unorganized and hard to search. My goal is a website that makes discovering these spooky tales easier by centralizing them in a single place and categorizing them.
What it does
A Google Cloud Function runs automatically, scraping user-submitted scary stories from two sources: Reddit's r/nosleep subreddit and Jezebel's scary story contest.
It then analyzes the posts it finds using Google's Cloud Natural Language API. Finally, the extracted information is saved to Firebase's Realtime Database.
These posts are then visible and searchable on the website.
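The pipeline boils down to three steps per story: scrape, analyze, store. As a minimal sketch of the record that ends up in the Realtime Database (the field names and the `buildStoryRecord` helper are my assumptions, not the project's actual schema), it might look like:

```javascript
// Hypothetical shape of the record stored per story. In the real
// pipeline, `entities` would come from the Cloud Natural Language API
// and the write would go through the firebase-admin SDK.
function buildStoryRecord(source, post, entities) {
  return {
    source,              // e.g. "nosleep" or "jezebel"
    title: post.title,
    author: post.author,
    body: post.body,
    entities,            // entity names, reused later for the word cloud
    scrapedAt: Date.now(),
  };
}

const record = buildStoryRecord(
  'nosleep',
  { title: 'The House', author: 'u/anon', body: 'It started at night...' },
  ['house', 'night']
);
console.log(record.source, record.entities.length);
```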
How I built it
The website is built with React, using the Firebase libraries to access the data it displays. The scraper is built with Node.js, using cheerio for HTML handling and the Firebase Admin SDK to store the information.
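With cheerio, pulling story titles out of a page is just a selector query. A dependency-free stand-in using a regex shows the same idea (the HTML snippet and the `story-link` class are invented; the real scraper would use cheerio selectors instead):

```javascript
// Dependency-free sketch of scraping titles out of fetched HTML.
// The actual project uses cheerio for this; the markup is made up.
const html = `
  <div class="stories">
    <a class="story-link" href="/s/1">The Whistler</a>
    <a class="story-link" href="/s/2">Room 9</a>
  </div>`;

// Capture the text content of each story link.
const titles = [...html.matchAll(/<a class="story-link"[^>]*>([^<]+)<\/a>/g)]
  .map((m) => m[1]);

console.log(titles);
```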
Challenges I ran into
- Figuring out the basics of web scraping
- In one particular case, some information I needed was inside an iframe, which required a workaround to extract.
- It took me a while to understand Google's Natural Language API and to find a use case for it.
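The iframe workaround amounts to a two-step scrape: the content isn't in the page itself, so you first pull the iframe's src attribute, then fetch and parse that embedded document separately. A minimal sketch (the HTML and URL are invented):

```javascript
// Step 1 of the iframe workaround: find the embedded document's URL.
// The outer page only contains a pointer, not the content itself.
const page =
  '<article><iframe src="https://example.com/embed/story-42"></iframe></article>';

const match = page.match(/<iframe[^>]*\bsrc="([^"]+)"/);
const iframeSrc = match ? match[1] : null;

console.log(iframeSrc);
// Step 2 (not shown): fetch iframeSrc and scrape the embedded page.
```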
Accomplishments that I'm proud of
- I added a word cloud built from the entities that the Natural Language API returns.
- I was able to extract all the information I wanted from the sites I originally planned to scrape.
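The Natural Language API's entity analysis returns each entity with a name and a salience score, and summing salience per name gives a natural word-cloud weight. A small sketch (the sample entities are invented, not real API output):

```javascript
// Aggregate NL-API-style entity results into word-cloud weights by
// summing salience per entity name. Sample data below is invented.
function wordCloudWeights(entities) {
  const weights = {};
  for (const { name, salience } of entities) {
    weights[name] = (weights[name] || 0) + salience;
  }
  return weights;
}

const weights = wordCloudWeights([
  { name: 'basement', salience: 0.5 },
  { name: 'mirror', salience: 0.25 },
  { name: 'basement', salience: 0.25 },
]);
console.log(weights); // { basement: 0.75, mirror: 0.25 }
```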
What I learned
- Google Cloud Functions
- Web scraping techniques and libraries
- Google's Natural Language API
What's next for Scary Story Scraper
- More sophisticated categorization of the stories.
- Extracting user stories from other sites/sources.