Inspiration

Having worked at large corporations where data was siloed in accounting systems, SQL databases, and documents (spreadsheets, PDFs), I knew there had to be a way to bring that data together. Not to mention, storing logins and passwords for these siloed systems on a Post-it under my keyboard was not the most secure approach to data access.

What it does

Our application uses OpenAI's LLM and its novel "function_call" functionality to chain together disparate data sources. It also acts as an ACL (access control list) by allowing access to functions based on a user's role within the company. Data that can be accessed in our demo includes a Health Care Benefits guide stored in Pinecone for semantic search, sales/human resources/production/purchasing data in a SQL database, and news and weather information via APIs.
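As a rough illustration of the role-based access idea, here is a minimal sketch that filters the function definitions offered to the model by the user's role. The role names, function names, and schemas below are hypothetical and were made up for this example; they are not taken from our codebase.

```python
# Hypothetical mapping of company roles to the functions they may use.
ROLE_FUNCTIONS = {
    "sales": {"query_sales_db", "get_news", "get_weather"},
    "hr": {"search_benefits_guide", "get_news", "get_weather"},
}

# Hypothetical function definitions in the JSON-schema style that
# "function_call" expects; names and descriptions are illustrative only.
FUNCTION_SCHEMAS = [
    {
        "name": "query_sales_db",
        "description": "Answer a question from the sales tables.",
        "parameters": {
            "type": "object",
            "properties": {"question": {"type": "string"}},
            "required": ["question"],
        },
    },
    {
        "name": "search_benefits_guide",
        "description": "Semantic search over the benefits guide.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "get_news",
        "description": "Fetch current news headlines.",
        "parameters": {"type": "object", "properties": {}},
    },
    {
        "name": "get_weather",
        "description": "Fetch the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
]


def functions_for_role(role):
    """Return only the function schemas the given role is allowed to call."""
    allowed = ROLE_FUNCTIONS.get(role, set())  # unknown roles get nothing
    return [schema for schema in FUNCTION_SCHEMAS if schema["name"] in allowed]
```

Because the model only ever sees the filtered list, it has no way to request a function outside the user's role.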

Particularly impressive was ChatCompletion's "SQL Translation" ability. It generates SQL from a user's prompt when provided the corporate SQL schema. Because the schema is passed in with each prompt, the approach adapts as corporate schemas evolve over time:

### MS SQL Server tables, with their properties:
# [Sales].[SalesOrderDetail](SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal, rowguid, ModifiedDate)
# [Sales].[SalesOrderHeader](SalesOrderID, RevisionNumber, OrderDate, DueDate, ShipDate, Status, OnlineOrderFlag, SalesOrderNumber, PurchaseOrderNumber, AccountNumber, CustomerID, SalesPersonID, TerritoryID, BillToAddressID, ShipToAddressID, ShipMethodID, CreditCardID, CreditCardApprovalCode, CurrencyRateID, SubTotal, TaxAmt, Freight, TotalDue, Comment, rowguid, ModifiedDate)
# [Sales].[SalesPerson](BusinessEntityID, TerritoryID, SalesQuota, Bonus, CommissionPct, SalesYTD, SalesLastYear, rowguid, ModifiedDate)
#
### What were the sales in 2018?
SELECT
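A prompt like the one above can be assembled from whatever schema is current. The sketch below is one way to do that; `build_sql_prompt` and its arguments are hypothetical names for this example, and the schema dictionary is assumed to be loaded elsewhere.

```python
def build_sql_prompt(schema, question):
    """Build a SQL-translation prompt from a {table: [columns]} mapping."""
    lines = ["### MS SQL Server tables, with their properties:"]
    for table, columns in schema.items():
        lines.append(f"# {table}({', '.join(columns)})")
    lines.append("#")
    lines.append(f"### {question}")
    # Ending on SELECT nudges the model to complete a query.
    lines.append("SELECT")
    return "\n".join(lines)


# Usage with a shortened, illustrative slice of the schema.
schema = {
    "[Sales].[SalesPerson]": ["BusinessEntityID", "TerritoryID", "SalesYTD"],
}
prompt = build_sql_prompt(schema, "What were the sales in 2018?")
```

When a table or column changes, only the schema mapping needs updating; the prompt text follows automatically.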

We then made authentication a breeze by applying Pinecone's vector storage and HuggingFace's facial recognition models to create a biometric security system (literally, the Pinecone similarity score decides whether a user is authenticated or not!!!).

How we built it

Standard Python Django framework. Used an MS SQL Server as our datastore for AdventureWorks2022 (our fictitious company). Used the fictitious Northwinds Healthcare Plan and chunked it into Pinecone for semantic search (used the HuggingFace all-MiniLM-L6-v2 model for sentence-transformer encoding). Then implemented facial recognition with face-api.js (HuggingFace face_landmark_68, face_recognition, and tiny_face_detector models) and a separate Pinecone vector index. Implemented a "function_call" stack that allowed us to make as many function calls as OpenAI's ChatCompletion required to adequately answer a user's request. Added fun stuff like current news and weather API feeds.
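The "function_call" stack can be pictured as a loop that keeps calling the model until it answers in plain text. This is a minimal sketch of that idea, not our actual code: `ask_model` is a hypothetical callable standing in for the ChatCompletion request, and the message shapes are assumed to follow the "function_call" format (an assistant message with a `function_call` holding a name and JSON-encoded arguments, answered by a `function` role message).

```python
import json


def run_function_call_loop(ask_model, messages, functions, max_rounds=5):
    """Call the model until it replies without requesting a function.

    ask_model: hypothetical callable taking the message list and returning
               the assistant message as a dict.
    functions: dict mapping function names to Python callables.
    """
    for _ in range(max_rounds):
        reply = ask_model(messages)
        call = reply.get("function_call")
        if call is None:
            # No further function requested: this is the final answer.
            return reply["content"]
        handler = functions[call["name"]]
        result = handler(**json.loads(call["arguments"]))
        # Record the request and its result so the model sees both next round.
        messages.append(reply)
        messages.append(
            {"role": "function", "name": call["name"], "content": json.dumps(result)}
        )
    raise RuntimeError("model kept requesting functions past max_rounds")
```

The `max_rounds` cap is an added safeguard for this sketch, so a model that never stops requesting functions cannot loop forever.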

Challenges we ran into

Function_Call chaining was a learning curve. A Pinecone content creator had a great initial YouTube tutorial that inspired us. Facial recognition was another learning curve, as the library was new and we were uncertain whether the information that came out could be used successfully as a biometric security matching system. Initially, since face-api uses Euclidean distance, we set up our Pinecone vector index with the Euclidean metric, but that was HORRIBLE. I dropped the index and rebuilt it with cosine similarity and it was FRICKING AMAZING!!!
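For anyone comparing the two metrics, the sketch below shows how differently they score the same pair of vectors. Euclidean distance is unbounded and lower means closer, while cosine similarity ignores vector length and higher means closer. The write-up doesn't establish why cosine worked better for us, and the `is_match` helper and its 0.9 threshold are hypothetical values for illustration, not the ones we used.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two vectors; 0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_match(similarity, threshold=0.9):
    """Hypothetical authentication rule: accept when similarity is high enough."""
    return similarity >= threshold


# Two vectors pointing the same way but with different lengths.
stored = [1.0, 0.0]
probe = [2.0, 0.0]
distance = euclidean_distance(stored, probe)   # penalizes the length difference
similarity = cosine_similarity(stored, probe)  # sees the same direction
```

A threshold tuned for one metric does not carry over to the other, since one is a distance and the other a similarity.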

Saturday until midnight was spent setting up AWS. I initially wanted to use an RDS instance but found it too difficult to access, so I moved to Lightsail, where I installed MS SQL Server on Linux and then configured an NGINX server with Gunicorn as the WSGI server for my Django project.

Accomplishments that we're proud of

All of the above!

What we learned

AI is insanely powerful! Also, two areas we didn't get to touch on but are really excited to try are using Pinecone's vector database to implement recommendations and anomaly detection.

What's next for Seamless Access and Insight to your Corporate Information

Continuing to push forward on implementing additional data source capabilities! And perhaps a little sleep... :-)
