Inspiration

We need "all hands on deck" for the COVID-19 pandemic. As a humanities researcher, I know that not everyone is comfortable working with JSON objects. Because the COVID-19 Open Research Dataset (CORD-19) made available to the public is a collection of JSON files, I wanted to give other researchers a way to access it and run searches for multiple keywords (not "this" OR "that", but "this" AND "that") so they can explore possible relationships between ideas.

The research dataset is here: https://pages.semanticscholar.org/coronavirus-research. My PowerShell script is here: https://github.com/lmhouston/COVID19-KeyCon-Viewer.

What it does

Based on user input, this PowerShell script runs a multiple-keyword search on the complete dataset after the user has downloaded it. It outputs an Excel spreadsheet of the hits it returns, compiles all articles into one text document, and cleans up the JSON markings for ease of reading. Researchers already have options to search for one word at a time (CTRL-F), but this search function helps them see the relationships between the key terms that interest them.
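To make the "this" AND "that" idea concrete, here is a minimal sketch of an AND-style keyword search over a folder of JSON files. It is written in Python for illustration only and is not the PowerShell script from the repository. The function names (collect_text, matches_all, search_folder) are hypothetical, and the sketch assumes each article is a single JSON file whose text lives in string values somewhere inside it. It only lists matching file names; it does not produce the Excel spreadsheet or the combined text document.

```python
import json
from pathlib import Path


def collect_text(node):
    """Recursively gather every string value found in a parsed JSON object."""
    if isinstance(node, str):
        return [node]
    if isinstance(node, dict):
        return [s for value in node.values() for s in collect_text(value)]
    if isinstance(node, list):
        return [s for item in node for s in collect_text(item)]
    return []  # numbers, booleans, and nulls carry no searchable text


def matches_all(text, keywords):
    """Return True only if every keyword appears in the text (case-insensitive AND)."""
    lowered = text.lower()
    return all(keyword.lower() in lowered for keyword in keywords)


def search_folder(folder, keywords):
    """Return the names of the JSON files in `folder` that contain all keywords."""
    hits = []
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            article = json.load(handle)
        text = " ".join(collect_text(article))
        if matches_all(text, keywords):
            hits.append(path.name)
    return hits
```

For example, calling search_folder on the folder that holds the JSON files with the keywords ["masks", "ventilation"] would return only the articles that mention both terms, whereas a CTRL-F style search finds one term at a time.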

Instructions

Once the dataset is downloaded, expand the zipped folders. The research article files will be in JSON format. Copy the JSON files into a folder on your desktop named "Covid19_Keyword_Search," and then run this PowerShell script. NOTE: New to PowerShell or on a government computer? Run this script in PowerShell by copying and pasting the text into PowerShell, selecting all text, and then clicking "run selection."

How I built it

I used some pieces of existing scripts I'd created in PowerShell and researched the rest on the internet! I then tested it a few times on the dataset, trying out multiple scenarios.

Challenges I ran into

I'm used to working with PowerShell on the PC I have back at the office. I was on leave visiting family with my Mac laptop when the pandemic hit, so I've just stayed put and am working remotely. I had to find a way to install PowerShell on my Mac so that I could make this script. This is my first time working with PowerShell in Visual Studio Code!

Accomplishments that I'm proud of

I only started working with PowerShell last year. I'm really proud of the way this gives researchers an easy workflow to get their jobs done.

What I learned

I learned a lot about what works (and what doesn't) in the latest version of PowerShell, given that much of the sample code posted on the internet is for older versions. I'm sure that more senior PowerShell coders could have written this script more elegantly, but I wanted to get it into the hands of researchers quickly, and it gets the job done!

What's next for COVID-19 KeyCon Viewer (Keyword Convergence Viewer)

I hope people use it to figure out important things from the data we have so that we can end the pandemic and prevent future occurrences!

Built With

  • other
  • powershell