Inspiration

Writing abstracts can be boring, so we wanted a funnier and more convenient way of creating them.

What it does

Ever wanted to impress your friends with scientific research, but don't have the time or resources to actually conduct it yourself? Well then, look no further! The Abstract Abstract Maker is the product for you!

The main script implements an nth-order Markov chain, where the order n is set at the beginning of the program. The script parses the dataset given by H1 Insights and "reads" through all of the publication abstracts. It then constructs its own fake abstract and writes this paragraph to a text file.

How we built it

Using Pandas, the publications.csv file provided by H1 Insights is imported and parsed into a list of all of the abstracts as strings. This list is then used to build a hash map that counts the occurrences of every sequence of n consecutive words in each abstract (where n is the specified order of the Markov chain). Finally, two words are given to start building the new abstract; these words are compared to the first two words of every dictionary key, and the script chooses the next n - 2 words at random from a matching key, based on how often that key occurs. The program ends for one of two reasons: 1) no more keys match the two given words, or 2) the script has run the number of loops specified at the beginning of the program.
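The steps above can be sketched roughly as follows. This is a minimal illustration, not our actual script: the function names, the `max_loops` default, and the tiny sample abstracts are all made up for the example, and it assumes n is at least 3 so that each chosen key adds at least one new word. In the real project the list of abstracts would come from the publications.csv file via Pandas.

```python
import random
from collections import Counter


def build_ngram_counts(abstracts, n):
    """Count every run of n consecutive words across all abstracts."""
    counts = Counter()
    for text in abstracts:
        words = text.split()
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts


def generate(counts, seed, max_loops=50, rng=random):
    """Grow a text from two seed words until no key matches or max_loops is hit."""
    output = list(seed)
    for _ in range(max_loops):
        last_two = tuple(output[-2:])
        # Compare the current two words against the first two words of every key.
        matches = [(key, c) for key, c in counts.items() if key[:2] == last_two]
        if not matches:
            break  # stop condition 1: no key starts with these two words
        keys, weights = zip(*matches)
        # Pick one matching key at random, weighted by how often it occurred.
        chosen = rng.choices(keys, weights=weights, k=1)[0]
        output.extend(chosen[2:])  # append the remaining n - 2 words
    return " ".join(output)  # stop condition 2 is the loop limit above


# Hypothetical sample data standing in for the parsed CSV column.
abstracts = [
    "we study the effect of sleep on memory",
    "we study the role of diet in memory",
]
counts = build_ngram_counts(abstracts, n=3)
result = generate(counts, ("we", "study"))
print(result)
```

Scanning every key on each step mirrors the description above and is fine for a sketch; grouping keys by their first two words ahead of time would likely be faster on a dataset of 100,000 rows.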

Challenges we ran into

We attempted to use the H1 Insights API to collect the data via a request library in Python. However, since each request returned only 10,000 rows at a time and we were able to run the program on over 100,000 rows at once, it was easier to implement the project by parsing the whole dataset as a .csv file.
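For context, collecting the full dataset through a paged API would have needed a loop along these lines. This is only a sketch of the general pattern: `fetch_page` is a hypothetical callable standing in for a real request, and the offset and limit style of paging is an assumption, since the actual H1 Insights API parameters are not described here.

```python
def fetch_all_rows(fetch_page, page_size=10_000):
    """Collect rows page by page until a short or empty page signals the end."""
    rows = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        rows.extend(page)
        if len(page) < page_size:
            break  # last page reached
        offset += len(page)
    return rows


# Stand-in for a real API call: serves slices of an in-memory list.
fake_data = list(range(25_000))


def fake_fetch(offset, limit):
    return fake_data[offset:offset + limit]


rows = fetch_all_rows(fake_fetch)
print(len(rows))
```

Reading the .csv file in one go avoids this bookkeeping entirely, which is why we went that way.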

Accomplishments that we're proud of

It's pretty neat, and produces some hilarious abstracts.

What's next for Abstract Abstract Maker

Implementing a UI, and potentially filtering the abstracts based on keywords or the journal they were published in, so that each generated abstract has content that is topically consistent.
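One possible shape for that filtering step is sketched below. Nothing here is implemented yet: the `filter_abstracts` name and the `abstract` and `journal` field names are assumptions made for illustration, and the real column names in the dataset may differ.

```python
def filter_abstracts(records, keywords=(), journal=None):
    """Return abstract texts matching any keyword and, optionally, a journal."""
    lowered = [k.lower() for k in keywords]
    selected = []
    for record in records:
        text = record["abstract"]
        if journal is not None and record["journal"] != journal:
            continue  # wrong journal
        if lowered and not any(k in text.lower() for k in lowered):
            continue  # none of the keywords appear
        selected.append(text)
    return selected


# Hypothetical records; field names and values are invented for the example.
records = [
    {"abstract": "Sleep improves memory in mice", "journal": "Journal A"},
    {"abstract": "Diet affects gut bacteria", "journal": "Journal B"},
    {"abstract": "Memory decline with age", "journal": "Journal B"},
]
memory_only = filter_abstracts(records, keywords=["memory"])
print(memory_only)
```

The filtered list could then be fed to the Markov chain in place of the full set of abstracts.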
