Imagine you just watched a TigerGraph Graph Gurus episode and are really excited to try out the implementation for yourself. Well, there are multiple steps before you get to start working with the demo you saw. You have to create an account, create a blank box, copy all of the code you just saw being used, upload all of the data, and install all of the queries. Only after all of that is done do you get to play around with the graph. What if there was an easier way? What if you could install the demo without having to touch the code yourself? What if you could install that demo with the click of a few buttons? This is what inspired the automatic demo loader: an interactive Jupyter notebook that lets you grab demos from the TigerGraph GitHub and automatically upload them to your server.
What it does
This code creates an interactive Jupyter notebook environment (using Google Colab) that allows a user to automatically upload TigerGraph demos onto their own server. After the user enters their server information (hostname, username, password, etc.), the code connects to the user’s graph. It then pulls up a list of demos for the user to choose from. The user chooses one, and all of the scripts corresponding to that demo (create schema, load data, create queries, etc.) are run. The user then has access to the full demo without having to do any work on their own (besides typing in their info and clicking some buttons). This solution is handy because it automates the process of creating a graph, which makes getting started much easier. Also, each user has their own copy of the interface, so there is no need to worry about storing private data or handling user requests. Each user gets their own personal package to work with. Finally, all of the code is provided on the interface itself, so anyone who is curious can click a button and see all of it.
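As a rough illustration of the "run all of the scripts for a demo" step, here is a minimal sketch of how the run order could be recovered. It assumes each demo ships a shell script that names its GSQL files in order (the convention described later in this writeup); the function name `scripts_in_order` and the example script contents are hypothetical, not taken from the notebook.

```python
import shlex


def scripts_in_order(sh_text):
    """Return the .gsql files named in a shell script, in the order they appear."""
    scripts = []
    for line in sh_text.splitlines():
        line = line.strip()
        # Skip blank lines, the shebang, and full-line comments.
        if not line or line.startswith("#"):
            continue
        # shlex handles quoting and drops trailing "# ..." comments.
        for token in shlex.split(line, comments=True):
            if token.endswith(".gsql"):
                scripts.append(token)
    return scripts


# Hypothetical install script for a demo folder.
example_sh = """#!/bin/bash
gsql schema.gsql
gsql load_data.gsql
gsql queries.gsql
"""

print(scripts_in_order(example_sh))
```

Each file in the returned list could then be read and sent to the server in turn, so the schema exists before the loading job runs and the data exists before the queries are installed.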
How I built it
The UI and the entire backend were written in Python. The UI was created using Python widgets, and the backend (connecting to the TigerGraph server) was built with the pyTigerGraph package for Python.
Challenges I ran into
One of the biggest challenges I ran into was actually loading the data. I used pyTigerGraph, but there was no method or functionality available for directly loading data via the ddl REST endpoint. So, I had to engineer it myself. Using the Python requests library along with the TigerGraph Docs, I managed to figure out how to attach the appropriate headers and filenames to upload the data based on a given loading query. However, this only works if the loading script uses the format
Define filename f=‘some/file’. This brings me to the biggest challenge I faced. The demos available in the TigerGraph ecosys are very different, and there is no common pattern between them. Additionally, some of them just provide the graph tar file and not the scripts, which to my knowledge you cannot upload by remotely accessing the server (if it is possible, I’d really like to know how). That said, the demo loader I built can be generalized to work on any demo (not just the one I included in the sample code) if every demo folder has the following:
- A README detailing what the demo is and how it works (not important for uploading, but important for giving information to the user)
- A folder with all of the data files
- A bash .sh file listing all of the scripts needed to be run (in order)
- Loading jobs that all use the format
Define filename f=‘some/file’. So, no scripts that load the file like this:
load f=‘some/file’ to …. In actuality, the loading files just need to be consistent; either format works as long as every demo uses the same one. But the first (define) format is much easier to work with.
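To make the loading step more concrete, here is a minimal sketch of the idea, not the notebook's actual code. It pulls the file variables out of a loading script that uses the define format, then builds a POST request for the ddl endpoint. The query parameters (`tag`, `filename`, `sep`, `eol`), the port 9000, and the bearer-token header reflect my reading of the TigerGraph REST docs and should be treated as assumptions; the host, graph name, job name, and token are hypothetical. It uses the standard library's `urllib` rather than requests so that it runs without extra installs, and it expects straight quotes around the path.

```python
import re
import urllib.request
from urllib.parse import quote, urlencode

# Matches: DEFINE FILENAME f = "some/file"  (case-insensitive, ' or " quotes)
DEFINE_RE = re.compile(
    r"""DEFINE\s+FILENAME\s+(\w+)\s*=\s*["']([^"']+)["']""",
    re.IGNORECASE,
)


def filename_vars(loading_script):
    """Map each DEFINE FILENAME variable to the path assigned to it."""
    return {name: path for name, path in DEFINE_RE.findall(loading_script)}


def ddl_url(host, graph, job, file_var, sep=",", eol="\n", port=9000):
    """Build the ddl endpoint URL for one file variable of a loading job."""
    query = urlencode({"tag": job, "filename": file_var, "sep": sep, "eol": eol})
    return f"{host}:{port}/ddl/{quote(graph)}?{query}"


def build_upload_request(url, file_bytes, token):
    """Create (but do not send) the POST request carrying the file contents."""
    return urllib.request.Request(
        url,
        data=file_bytes,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )


# Hypothetical loading script and server details.
script = 'CREATE LOADING JOB load_people FOR GRAPH MyGraph {\n  DEFINE FILENAME f="data/people.csv";\n}'
for var, path in filename_vars(script).items():
    url = ddl_url("https://example.i.tgcloud.io", "MyGraph", "load_people", var)
    req = build_upload_request(url, b"id,name\n1,Ada\n", token="MY_TOKEN")
    print(path, "->", req.get_method(), req.full_url)
    # urllib.request.urlopen(req) would actually send the upload.
```

The define format is easier to work with here because the variable name (`f`) gives the request something stable to reference, while the path tells the loader which file in the demo's data folder to read.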
Accomplishments that I'm proud of
My biggest accomplishment was figuring out the direct ddl load. When I first encountered that problem, I was fully expecting to rely on pyTigerGraph, and I was shocked to see that the functionality wasn’t even available. So, once I finally figured it out, I felt really proud of myself for semi-inventing the functionality I was looking for.
What I learned
The main thing I learned was how to use Python widgets. These are tools I never even knew existed until I started working on this project. They are really handy for creating proofs of concept and showing potential workflows for products. Also, since it’s entirely coded in Python, I didn’t have to learn any new languages (just a new library).
What's next for Automatic Demo Loader
The next step for the Auto Demo Loader is to add more of the functionality available in Graph Studio. For example, we could add a query creator/editor, a schema editor (that runs schema-change jobs in the background), or even a 3D visualizer. I actually made a sample query editor (with the help of Jon Here) which you can view here. In essence, we could recreate Graph Studio, with potentially more functionality (thanks to the flexibility of Python), in a Jupyter notebook environment. This will especially help data scientists (one of TigerGraph’s biggest user groups) get introduced to the basics of graphs and GSQL while staying in the comfortable language of Python. It also helps automate the process, so there’s very little work needed on the user's side (no need to find the demo, copy it or recreate it in Graph Studio, etc.), which will help in the transition to using graphs and GSQL. Additionally, I think it really helps reinforce that the graph database is, at the end of the day, a database, and that Graph Studio is not the actual database but a visual representation that makes the process easier. This is a concept that I struggled with when first introduced to TigerGraph, so hopefully seeing graphs in this new environment will help reinforce that idea.