Inspiration

The inspiration for Aister came from MeiliSearch and Elasticsearch, both open-source projects that any individual can use and self-host. However, instead of implementing a search-oriented RESTful API, we wanted Aister to let developers integrate pre-trained machine learning models into their projects.

Challenges we're responding to

Best Solo Hack

I am submitting for Best Solo Hack because I believe my project is a strong contender, although obviously no one can say for sure. I think I deserve this award because I connected an open-source project idea to a real application and tried to learn concepts I had not ventured into before, such as machine learning.

What it does

Aister is an API that can be integrated into virtually any website or program a developer wishes. Aister uses rust-bert to handle the pre-trained machine learning models and model loading. Currently, Aister can hold a message dialogue and translate between numerous languages through its WebSocket endpoints.

How we built it

We built Aister using actix-web and rust-bert. All of our endpoints use WebSockets to allow bi-directional data flow, lower latency, and a better developer experience.

Challenges we ran into

Building on Windows

I could not get my Rust server to build on Windows until twelve hours before the hackathon ended, due to an executable conflict hidden in one of my PATH entries. I needed G++ to compile torch-sys, the C++ bindings for PyTorch, which is a dependency of rust-bert.

Sharing Data between Threads

Rust is an extremely safe language with an ownership-and-borrowing model for tracking the state of data. Borrowing lets Rust avoid a runtime-costly garbage collector, which is the mechanism that discards data a program no longer needs. The models loaded into Rust are extremely large, so the author of the crate (the Cargo ecosystem's term for a library) did not allow the model structs to be cloned. As a result, a model's data was restricted to a single thread. This thread limitation made it impossible to keep a static reference to the models between requests, so the pre-built machine learning models had to be loaded on every request. The API's performance suffered greatly from this caveat, as each request could now take over two seconds even with WebSockets.

Accomplishments that we're proud of

UI

I am quite proud of the UI I sculpted out with Vue because I think it looked modern with a matching color scheme.

Caching

I am very proud of the caching system I made for the WebSocket integration, which let me easily store structs in a sized array and replace the entries that were no longer being used. In the end, though, I could not use the code in this project because of the threading issue.

What I learned

Rust Ownership

The project definitely gave me a better grasp of how Rust ownership works for functions and structs. I tried many approaches (lazy_static, custom caching, etc.) to get around it, but to no avail. In the end, I read a number of articles on how ownership works and came to understand why my approach could not work.

WebSockets

I had never used WebSockets before, so it was cool to learn how they work. The native browser APIs are really handy; I never knew they even existed.

Stronger grasp on Serde

I have always been decent with Serde, but building this API definitely improved my skills, because I had to learn more of the attributes used to customize the serialization/deserialization steps.

What's next for Aister

I hope to make a branch for the hackathon version and continue developing and adding features on the main branch. These features would most likely include the rest of the pipelines/models featured in rust-bert, and perhaps other crates as well. I would love to grow it into something like MeiliSearch, but I definitely need to find a way around sharing static machine learning models between threads.

Citations (required by rust-bert)

Becquin, Guillaume. Proceedings of the Second Workshop for NLP Open Source Software (NLP-OSS), pages 20-25. Association for Computational Linguistics, 2020. https://www.aclweb.org/anthology/2020.nlposs-1.4
