Alex - Your personal assistant at work

Inspiration

We always wished there were more hours in a day.

To get more things done at work.

To not waste time on boring tasks that don't add value to companies and that decrease your productivity.

For decades, executives have relied heavily on personal assistants to run errands.

Now, why should only executives have assistants?

What if everyone at your company could have their own personal assistant to help them save time?

Can you imagine the financial impact this could have on your company?

Well, now there is a solution for this.

Let me introduce you to Alex Bot.

Alex is your personal assistant at work that helps you focus on what matters: getting work done!

What it does

Alex is a chatbot that helps you at work, saving your time and boosting your productivity.

He can help you request expense refunds, add tasks, create tickets, find documents, and answer questions.

One more thing: Alex is a friend of Alexa.

How I built it

Features

Feature | Message | Code | AWS Curated Models used | AWS Services used
--- | --- | --- | --- | ---
Send passport | Alex, here is my passport | lambda_send_passport.py | Passport Data Page Detection | Lex, S3, SageMaker
Add task | Alex, can you add a task | lambda_add_task.py | Mphasis Autocode WireframeToCode | Lex, S3, SageMaker, Alexa Skills
Add expense | Alex, can you add this expense | lambda_track_expense.py | Mphasis DeepInsights Address Extraction | Lex, S3, SageMaker, Textract, Comprehend
Open ticket | Alex, can you open a ticket | lambda_open_ticket.py | Mphasis Optimize.AI Expert Identifier | Lex, S3, SageMaker, Kendra
Find document | Alex, can you find a document | lambda_find_document.py | Mphasis DeepInsights Text Summarizer | Lex, S3, SageMaker, Kendra
Ask question | Alex, {how, what, why} ... | lambda_ask_question.py | - | S3, Kendra, Alexa Skills, Lex

Architecture

AWS Services used

  • Amazon Lex
  • Amazon S3
  • Amazon Textract
  • Amazon Comprehend
  • Amazon SageMaker
  • Amazon Kendra
  • Alexa Skills

AWS Curated models used

  • Passport Data Page Detection
  • Mphasis Autocode WireframeToCode
  • Mphasis DeepInsights Address Extraction
  • Mphasis Optimize.AI Expert Identifier
  • Mphasis DeepInsights Text Summarizer

Sequence diagrams

Send Passport

Add task

Extract expense

Open Ticket

Find document

Ask question

Inputs and Outputs

Feature | Input | Output
--- | --- | ---
Send passport | data-input/Passport.jpg | {'ExpirationDate': '17/01/1985','BirthDate': '31/01/2016','PassportNumber': '107185703'}
Add task | data-input/wireframe.jpg | data-output/wireframe.html
Add expense | data-input/expense.jpg | {'Price': '13.54', 'Location': 'Guildford', 'Store': 'Co-op'}
Open ticket | {'query':'My internet is not working'} and this file | {'Summary': 'Wait 2-5 minutes before plugging it back in.\n\n\n3. Wait 5 more minutes and retry the connection.\n\n\nIn most cases, this should x your issue and allow you to get back online. If you go through\nthese steps and something still isnt working, you may need to contact your internet\nservice provider for assistance.\n\n\nUnderstanding Your Routers Icons\n\n\nMost routers have a series of icons that illuminate to convey dierent status messages at a\nglance. Though these can vary from brand to brand, most manufacturers include at least\nthree primary status indicators:\n\n\nWiFi not working\n\n\nWiFi slowed down\n\n\nWiFi network disappearing\n\n\nDevices that wont connect to Wi\n\n\nGlobe icon: solid when modem is connected to the Internet.'}
Find document | {'Description': 'Privacy policy'} and this file | {'Summary': 'We collect your personal information in order to provide and continually improve our products and services. What personal information about customers does amazon europe collect ?provide , troubleshoot , and improve amazon services.'}
Ask question | {'Description': 'How many vacation weeks I have on my first year?'} and this file | {'Summary': "Amazon.\ncom's salaried employees earn two weeks of vacation time in their first year of employment and three weeks of vacation in their\nsecond year"}

Testing

There are three levels of testing you can follow, with increasing levels of difficulty.

You can test only the models, test the use-cases, or test the full deployment.

Models

If you just want to test the deployment of the curated models and perform inference, follow the instructions in this Jupyter notebook.

Use-cases

Now, if you want to test the use-cases, i.e. test the models integrated with the business logic and AWS services, use this Jupyter notebook.

Please note that for this you will need to have deployed the models and, depending on the feature you want to test, you may need to configure S3 buckets and a Kendra index, and get a Trello API key/token.

Deploying

As I said, doing a full deploy is not exactly easy.

But bear with me and let's follow these steps:

  1. Config
  2. Deploying Models
  3. Deploying AWS services
  4. Building Mobile App

Config

  1. Open the file config.py
  2. Edit with the configurations from your AWS account

Deploying Models

Run this command in the command line and wait (it will take some time to deploy all models):

python models_deploy.py

Skip if you have already deployed the models with the Jupyter Notebook.
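If you are curious what deploying one curated model involves, here is a minimal sketch using boto3. It is not the content of models_deploy.py; the function names, the "AllTraffic" variant name, the instance type, and the ARNs you pass in are all assumptions for illustration. The real model package ARN comes from your own AWS Marketplace subscription, and each model supports only certain instance types.

```python
def build_deploy_requests(name, model_package_arn, role_arn,
                          instance_type="ml.m5.xlarge"):
    """Build the three SageMaker requests needed to host one model package.

    The instance type default is an assumption; check which types the
    model you subscribed to actually supports.
    """
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        # Marketplace models are referenced by their model package ARN.
        "PrimaryContainer": {"ModelPackageName": model_package_arn},
        "EnableNetworkIsolation": True,
    }
    endpoint_config = {
        "EndpointConfigName": name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
        }],
    }
    endpoint = {"EndpointName": name, "EndpointConfigName": name}
    return model, endpoint_config, endpoint


def deploy(name, model_package_arn, role_arn, region="us-east-1"):
    """Create the model, endpoint config and endpoint, then wait for it."""
    import boto3  # imported here so the builder above works without boto3

    sm = boto3.client("sagemaker", region_name=region)
    model, endpoint_config, endpoint = build_deploy_requests(
        name, model_package_arn, role_arn)
    sm.create_model(**model)
    sm.create_endpoint_config(**endpoint_config)
    sm.create_endpoint(**endpoint)
    # Endpoints take several minutes to start; block until it is in service.
    sm.get_waiter("endpoint_in_service").wait(EndpointName=name)
```

Calling deploy() once per model creates one endpoint per model, which is why a new AWS account can hit the endpoint limit.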

Deploying AWS services

S3 Buckets

  1. Create a bucket for images and HTML files (Remember to replace the bucket in your config.py file)
  2. Create a bucket to store the documents for Kendra

Amazon Kendra

I'm not going to explain in detail how to deploy a Kendra index because the AWS docs do a good job of this.

You need to follow two steps:

  1. Configure prerequisites
  2. Create index

You just need to make sure you use the bucket you created above as the data source for the Kendra index.
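Once the index has finished syncing, you can check that it answers questions with a short boto3 call. This is a rough sketch, not the code from the Lambda functions: the helper names are hypothetical and the index ID is the one shown in your Kendra console.

```python
def top_excerpt(response):
    """Return the excerpt text of the first Kendra result, or None."""
    items = response.get("ResultItems", [])
    if not items:
        return None
    return items[0].get("DocumentExcerpt", {}).get("Text")


def ask_kendra(index_id, question, region="us-east-1"):
    """Send a natural-language question to the index and return the top excerpt."""
    import boto3  # imported here so top_excerpt works without boto3

    kendra = boto3.client("kendra", region_name=region)
    response = kendra.query(IndexId=index_id, QueryText=question)
    return top_excerpt(response)
```

If ask_kendra returns None, the index is probably still syncing or the bucket has no documents yet.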

Lambda functions

Now it is time to create our Lambda functions.

It is very simple to create a Lambda function via the console; you can follow this tutorial.

You will need to create seven functions, namely:

  1. alex_flow
  2. ask_question
  3. find_document
  4. add_task
  5. open_ticket
  6. send_passport
  7. upload_file

Click on the links above to copy the source code for each Lambda function and paste it into the Lambda configuration. All of them use the Python 3.6 runtime.

Important: You will need to set the timeout to at least 60 seconds, otherwise the functions will fail. This is because some of the models are slow to perform inference.
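Setting the timeout on seven functions by hand is tedious, so here is a small sketch that does it with boto3. It assumes your deployed function names match the list above exactly; adjust FUNCTION_NAMES if you named them differently.

```python
# Assumed to match the names you gave the functions in the console.
FUNCTION_NAMES = [
    "alex_flow", "ask_question", "find_document", "add_task",
    "open_ticket", "send_passport", "upload_file",
]


def timeout_requests(function_names, timeout_seconds=60):
    """Build one update_function_configuration request per function."""
    return [{"FunctionName": name, "Timeout": timeout_seconds}
            for name in function_names]


def apply_timeouts(function_names=FUNCTION_NAMES, region="us-east-1"):
    """Raise the timeout of every listed Lambda function."""
    import boto3  # imported here so timeout_requests works without boto3

    client = boto3.client("lambda", region_name=region)
    for request in timeout_requests(function_names):
        client.update_function_configuration(**request)
```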

Amazon Lex Bot

  1. Go to the Amazon Lex service page in AWS Console
  2. Open the file intent_config_lex.json and replace all instances of {AWS_ACCOUNT_ID} with your AWS account id
  3. Click import and select the file intent_config_lex.json (you will actually need to zip it first)
  4. Click Build and then Publish
  5. Done! Your bot is created
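Steps 2 and 3 (replace the account ID, then zip the file) can be done in one go with a few lines of Python. This is just a convenience sketch; the function name is hypothetical and it leaves the original JSON file untouched.

```python
import zipfile
from pathlib import Path


def prepare_lex_import(json_path, account_id, zip_path):
    """Fill in the account ID and write the result into a zip for Lex import."""
    source = Path(json_path)
    text = source.read_text(encoding="utf-8")
    filled = text.replace("{AWS_ACCOUNT_ID}", account_id)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Keep the original file name inside the archive.
        archive.writestr(source.name, filled)
    return zip_path
```

For example, prepare_lex_import("intent_config_lex.json", "123456789012", "intent_config_lex.zip") produces a zip you can select in the import dialog (the account ID here is a placeholder).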

Alexa Skill

Creating Lambda

We need to create a Lambda to handle the requests from Alexa. For this, we are going to use a starter application from the AWS Serverless Application Repository, because it already creates the necessary Alexa Skill trigger for us and we just need to update the Lambda code.

  1. Go to the same page that you used to create a Lambda function
  2. Choose Browse serverless app repository
  3. Type alexa-skills-kit-python36-facts-skill and select the app
  4. Change the application name to alex-alexa-skill
  5. Change the Lambda name to alex-alexa-skill
  6. Click Deploy
  7. Now select the Lambda function and paste this code
  8. Click save

Important: You will need to set the timeout to at least 60 seconds.

Creating Bot

  1. Go to the Alexa developer dashboard
  2. Select Create Skill
  3. Choose Custom and provision your own
  4. Select Start from scratch
  5. Click JSON Editor and drag and drop this file into the editor
  6. Go to Endpoint and copy and paste the ARN of your application from the previous step
  7. Click Save model and Build model

Configuring permissions and IAM

First, let's create a custom Policy named InvokeLambda. This is needed because we need to allow Lambda functions to call other Lambda functions (such as lambda_alex_flow.py).

Go to this link to create the policy and add InvokeAsync and InvokeFunction actions.
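For reference, this is roughly what the InvokeLambda policy document looks like when written out, plus a boto3 call to create it. Treat it as a sketch: the helper names are hypothetical, and the "*" resource is an assumption that keeps things simple but allows invoking any function in the account. You can pass specific function ARNs instead.

```python
import json


def invoke_lambda_policy(resource="*"):
    """Build an IAM policy document that allows invoking Lambda functions."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["lambda:InvokeFunction", "lambda:InvokeAsync"],
            "Resource": resource,
        }],
    }


def create_invoke_lambda_policy():
    """Create the customer-managed policy named InvokeLambda."""
    import boto3  # imported here so the builder above works without boto3

    iam = boto3.client("iam")
    return iam.create_policy(
        PolicyName="InvokeLambda",
        PolicyDocument=json.dumps(invoke_lambda_policy()),
    )
```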

Adding policies to Lambda's roles

Now we need to attach a number of policies to each Lambda role. You can attach policies to roles at this link.

  1. alex_flow
    • InvokeLambda
  2. ask_question
    • AmazonSageMakerFullAccess
    • AmazonKendraFullAccess
  3. find_document
    • AmazonSageMakerFullAccess
    • AmazonKendraFullAccess
    • AmazonS3FullAccess
    • InvokeLambda
  4. add_task
    • AmazonSageMakerFullAccess
    • AmazonS3FullAccess
  5. open_ticket
    • AmazonKendraFullAccess
    • AmazonSageMakerFullAccess
  6. send_passport
    • AmazonSageMakerFullAccess
  7. upload_file
    • AmazonS3FullAccess

You should also add InvokeLambda to the role for your Alexa Skill app, which should be named something like serverlessrepo-alexa-skil-(....)

Mobile App

There are two projects for the mobile app: the Cordova wrapper and the Lex bot UI (which is based on this project from Amazon).

Lex bot UI

First, you need to create a Cognito pool and add the Lex policy to it.

Next, open these three files:

  • lex-ui/src/config.dev.json
  • lex-ui/src/config.prod.json
  • lex-ui/src/store/action.js

In each of them, replace {YOUR_API_GATEWAY_TO_UPLOAD_FILE_LAMBDA} with https://{YOU_ENDPOINT}.execute-api.us-east-1.amazonaws.com/Production/upload_file

Also, replace YOUR_COGNITO_POOL_ID with the Cognito pool ID you created above. Then run:

  1. npm install
  2. npm run build
  3. python move_files.py (this is necessary to copy the built UI to the Cordova wrapper)

Cordova App

  1. Install Ionic: npm install -g ionic
  2. Install Cordova: npm install -g cordova
  3. Run ionic cap add ios
  4. Run ionic build
  5. Run ionic cap copy ios
  6. Run ionic cap sync ios
  7. Run ionic cap open ios and now you can deploy the app to your device using Xcode

Challenges I ran into

The first issue I faced was a bureaucratic one. I reached the limit of running endpoints because I'm using 5 different models and I have a new AWS account. I had to open a ticket with Amazon support, and only then was I able to test the full application.

The second main issue was recording the user's audio. My initial idea was to use the MediaDevices.getUserMedia() API, a JavaScript API that lets you stream audio over WebRTC.

The problem is that it is supported on mobile Safari but not in WKWebView (the web view used by mobile apps), and it took me some time to figure this out.

In the end, I had to use a Cordova plugin to record the audio and send it to Amazon Lex.

Accomplishments that I'm proud of

I'm really proud of being able to build all this in only 10 days! I only found out about the hackathon (via an email from Devpost) on the 13th of April, so I had to rush to finish everything.

What I learned

This was a great learning opportunity, as I had never used any Machine Learning services from Amazon.

For this project, I had to learn how to use all of the following:

  • Amazon SageMaker
  • Amazon Textract
  • Amazon Kendra
  • Amazon Lex
  • Alexa Skill

What's next for Alex - your personal assistant at work

As for next steps:

  • Launch Android App
  • Launch Web Version
  • Develop scan document capability
  • Develop extract info from table (scan)
  • Make Alex talk
  • And more ... =)

Thank you!

Congratulations on making it this far!
