C05-Library-Chat-Bot
Team and patrons
- felixvdh, Christopher Hlubek, Jan Peter P.
- Anastasia Kazakova (ZBW), Tamara Pianos (ZBW)
Challenge
The library's chat service handles a wide range of user needs. These range from rather simple questions that recur in similar forms, such as “Why is my account blocked?”, “When do I need to return books?” or “I cannot access an electronic full text online. Why is that?”, to very specific questions about publications on particular research topics. Currently, all chat communication is answered by library staff and is restricted to the library's office hours.
Desired Solution
Prepare the existing data for further use and for defining intents. Prototype a chat bot with the help of Rasa X.
Dataset
The dataset comprises chat transcripts from the last couple of years. The chats took place between library users (mostly students or researchers) and library staff. The chats are in German and English. While some user questions are unique, there are probably many recurring questions that could be answered automatically in the future, such as “Why is my account blocked?”, “When do I need to return books?” or “I cannot access an electronic full text online. Why is that?”
Provider
ZBW - Leibniz-Informationszentrum Wirtschaft
Characteristics
Format: XML
Number of Cases: 4000
Type of Cases: complete chats (1-40 per day)
Number of Variables: 10
Variable Names: questionId, status, type, wait_time, session_time, language, browser_betriebssystem, author (Patron or Library), time_stamp, text
Data processing and analysis
- Transformation of XML data into a linear format (CSV) and basic cleanup of data by patterns
- Identification of questions asked in chat transcripts
- Manual annotation of transcripts that begin with greetings ("Hi", "hello", "moin")
- Training of a separate Rasa NLU to classify transcript items into "greeting", "question" or "test"
- Idea: Use the resulting questions to validate the chat bot and to identify the distribution of intents over recent years
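The XML-to-CSV step listed above could look roughly like the following sketch. The element and attribute layout of the sample XML (`chats`, `chat`, `message`) is an assumption made for illustration, since the structure of the actual export is not described here; only the column names are taken from the variable list under Characteristics.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample: the real export layout is not documented here,
# so the <chats>/<chat>/<message> structure is an assumption.
SAMPLE_XML = """<chats>
  <chat questionId="1" language="en">
    <message author="Patron" time_stamp="2019-01-07T10:00:00">Hi</message>
    <message author="Library" time_stamp="2019-01-07T10:00:30">Hello, how can I help?</message>
  </chat>
  <chat questionId="2" language="de">
    <message author="Patron" time_stamp="2019-01-08T09:15:00">Warum ist mein Konto gesperrt?</message>
  </chat>
</chats>"""

# Subset of the variables listed in the dataset description.
FIELDS = ["questionId", "language", "author", "time_stamp", "text"]


def chats_to_rows(xml_text):
    """Flatten nested chats into one row per message."""
    root = ET.fromstring(xml_text)
    rows = []
    for chat in root.iter("chat"):
        for msg in chat.iter("message"):
            rows.append({
                "questionId": chat.get("questionId", ""),
                "language": chat.get("language", ""),
                "author": msg.get("author", ""),
                "time_stamp": msg.get("time_stamp", ""),
                # Basic cleanup: collapse runs of whitespace and line breaks.
                "text": " ".join((msg.text or "").split()),
            })
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(chats_to_rows(SAMPLE_XML)))
```

One row per message keeps the author and time stamp next to each text, which makes it easy to filter for patron messages before annotating them.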
Concept
- Identifying the most important use cases by analysing chat logs and consulting patrons
- Defining/identifying intents
- Retrieving training data (so far manually)
- Elaborating training data
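As an illustration of the manual training-data step, the sketch below groups annotated questions by intent and writes them in the Markdown layout used for NLU training data in Rasa 1.x (`## intent:<name>` followed by example lines). The intent names and the paraphrased example are hypothetical and are not the intents defined in this project.

```python
# Hypothetical annotations as (intent, example) pairs. The intent names are
# made up for illustration; the first three questions come from the
# challenge description, the last one is an invented paraphrase.
EXAMPLES = [
    ("account_blocked", "Why is my account blocked?"),
    ("return_date", "When do I need to return books?"),
    ("fulltext_access", "I cannot access an electronic full text online. Why is that?"),
    ("account_blocked", "My account is blocked, what can I do?"),
]


def to_rasa_markdown(examples):
    """Group examples by intent and render them as Rasa 1.x Markdown."""
    grouped = {}
    for intent, text in examples:
        grouped.setdefault(intent, []).append(text)
    blocks = []
    for intent, texts in grouped.items():
        lines = ["## intent:" + intent] + ["- " + t for t in texts]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(to_rasa_markdown(EXAMPLES))
```

Keeping the annotations as simple pairs means the same list can later be filled from the CSV data rather than typed by hand.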
Development notes
How to run XML transformations
Start the Rasa API server:
cd xml-transform/rasa-nlu
rasa run --enable-api
Run conversion:
cd xml-transform
./convert.sh
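Once the Rasa API server is running, a single transcript item can be classified over HTTP. The sketch below is not taken from convert.sh; it assumes the server listens on Rasa's default port 5005 and uses the `/model/parse` endpoint, which returns the predicted intent for a text. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: the server started with "rasa run --enable-api" uses the
# default port 5005 on the local machine.
RASA_URL = "http://localhost:5005"


def build_parse_request(text, base_url=RASA_URL):
    """Build the POST request for Rasa's /model/parse endpoint."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/model/parse",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def top_intent(parse_result):
    """Extract (intent name, confidence) from a parse response dict."""
    intent = parse_result.get("intent") or {}
    return intent.get("name"), intent.get("confidence")


def classify(text, base_url=RASA_URL):
    """Send one text to the running server and return its top intent."""
    with urllib.request.urlopen(build_parse_request(text, base_url)) as resp:
        return top_intent(json.load(resp))


if __name__ == "__main__":
    # Requires the Rasa API server to be running.
    print(classify("moin"))
```

With the model described above, the returned intent name would be one of "greeting", "question" or "test".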
Results
- Scripts for transforming data into CSV format
- Training data for several intents
Possible further work
- Find a way to retrieve training data semi-automatically
- Define more intents and define/use more sophisticated actions
- Evaluate intents and training data with EconDesk staff
- Provide UI