Understanding the complex protein structures in scientific journals can be a challenge. It is also hard for students to visualize complex 3D molecular structures in a classroom. We wanted to build something on the new ARKit that is both useful and fun to make. Since we did not know each other beforehand, we had to focus on our overlapping skillsets and decided to build a data-driven iOS application.
What it does
The app opens the camera, and the user points it at a scientific journal or college textbook. The app parses the text, extracts chemical compounds and presents them as a list. The user can then choose the molecule or protein they want to render in the AR environment. The complex 3D structure is then presented, and the user can physically interact with it.
How we built it
The app extracts chemical names from large blocks of text in real time and renders the molecules/proteins in an AR environment.
- Get real-time text from the iOS camera.
- Find chemical compounds/molecules/proteins in the text.
- A locally pre-trained CoreML model (chemical named-entity recognition) and AbbyyRtrSDK.
- Wit.AI for processing keywords/traits and extracting entities.
- Fetch the chemical name and search for information on ChemSpider.
- The backend service takes a .MOL file containing the molecular data and renders it into a 3D scene with OpenGL+PyMOL+MeshTool.
- The rendered .DAE (Collada) files are used to project the chemical structures as an AR scene.
We had multiple scripts to parse .MOL files, and a Flask-based service to serve the rendered files and the processed text to the app. And of course, an iOS app.
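To give a feel for what parsing a .MOL file involves, here is a minimal sketch in Python. It is not our actual script: it assumes the common V2000 fixed-width layout (three header lines, a counts line, then atom and bond blocks), and the names `Atom`, `Bond`, `parse_molfile` and the `WATER_MOL` sample are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Atom:
    symbol: str
    x: float
    y: float
    z: float


@dataclass
class Bond:
    begin: int  # 0-based index into the atom list
    end: int    # 0-based index into the atom list
    order: int  # 1 = single, 2 = double, 3 = triple


def parse_molfile(text: str) -> Tuple[List[Atom], List[Bond]]:
    """Parse the atom and bond blocks of a V2000 molfile (assumed layout)."""
    lines = text.splitlines()
    if len(lines) < 4 or "V2000" not in lines[3]:
        raise ValueError("expected a V2000 molfile with a counts line on line 4")

    # Counts line: atom count in columns 1-3, bond count in columns 4-6.
    n_atoms = int(lines[3][0:3])
    n_bonds = int(lines[3][3:6])

    atoms = []
    for line in lines[4:4 + n_atoms]:
        # x, y, z are 10 characters wide each; the element symbol follows.
        atoms.append(Atom(
            symbol=line[31:34].strip(),
            x=float(line[0:10]),
            y=float(line[10:20]),
            z=float(line[20:30]),
        ))

    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        # The file numbers atoms from 1; convert to 0-based indices.
        bonds.append(Bond(
            begin=int(line[0:3]) - 1,
            end=int(line[3:6]) - 1,
            order=int(line[6:9]),
        ))
    return atoms, bonds


# Hypothetical hand-written sample (approximate water geometry).
WATER_MOL = "\n".join([
    "water",
    "  hypothetical sample",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.9572    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.2400    0.9266    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0  0  0  0",
    "  1  3  1  0  0  0  0",
    "M  END",
])

atoms, bonds = parse_molfile(WATER_MOL)
```

The atom coordinates and bond pairs are the raw material a renderer needs to place spheres and sticks in a 3D scene.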
Challenges we ran into
Going from idea to product in 30 hours was challenging enough. :P
We had to find a way to automate the .MOL to .DAE conversion, and there's no direct way to do it: .MOL is a chemical table file with structure information, while .DAE is a 3D scene. The conversion was done with PyMOL scripts, along with an AppleScript that automated the whole process and ran throughout the night.
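For an overnight batch like this, the bookkeeping matters as much as the conversion itself. Below is a rough sketch, under our own assumptions, of how the pending work could be listed so that a restarted run skips files that were already converted. The function name `pending_conversions` and the directory layout are hypothetical, and the actual PyMOL/AppleScript conversion step is deliberately left out.

```python
from pathlib import Path
from typing import List, Tuple


def pending_conversions(mol_dir: Path, dae_dir: Path) -> List[Tuple[Path, Path]]:
    """Pair each .mol file with its target .dae path, skipping finished ones."""
    jobs = []
    # Note: this pattern is case-sensitive on most file systems, so files
    # named *.MOL would need their own pattern.
    for mol_path in sorted(mol_dir.glob("*.mol")):
        dae_path = dae_dir / (mol_path.stem + ".dae")
        if not dae_path.exists():
            jobs.append((mol_path, dae_path))
    return jobs
```

Each returned pair would then be handed to whatever performs the real conversion, one file at a time.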
Another challenge was that iOS did not have an OCR API. We had to draw bounding boxes and extract the text separately.
Accomplishments that we're proud of
We are proud to have learned how complex file conversions can be. We learned more about CoreML and ARKit, the new toolkits from Apple. We formed the team and the idea at the venue, and it has been an amazing 30 hours turning that into a product.
What we learned
This was our first AR project.
What's next for ChemStreet
Better MOL rendering, more interactivity in the 3D scene and better UX. We also plan to port it to Oculus.