C05-Library-Chat-Bot

Team and patrons

Challenge

The library's chat service handles a wide range of patron needs. These include fairly simple questions that recur in similar forms, such as “Why is my account blocked?”, “When do I need to return books?”, or “I cannot access an electronic full text online. Why is that?”, as well as very specific questions about publications on particular research topics. Currently, all chat communication is answered by library staff and is restricted to the library's office hours.

Desired Solution

Prepare the existing data for further use and for defining intents. Prototype a chat bot with the help of Rasa X.

Dataset

The dataset comprises chat transcripts from the last couple of years. The chats took place between library users (mostly students or researchers) and library staff, in German and English. While some user questions are unique, there are likely many recurring questions that could be answered automatically in the future, e.g. “Why is my account blocked?”, “When do I need to return books?”, “I cannot access an electronic full text online. Why is that?”

Provider

ZBW - Leibniz-Informationszentrum Wirtschaft

Characteristics

Format: XML
Number of Cases: 4000
Type of Cases: complete chats (1-40 per day)
Number of Variables: 10
Variable Names: questionId, status, type, wait_time, session_time, language, browser_betriebssystem, author (Patron or Library), time_stamp, text

Data processing and analysis

  • Transformation of XML data into a linear format (CSV) and basic cleanup of data by patterns
  • Identification of questions asked in chat transcripts
    • Manual annotation of transcripts that begin with greetings ("Hi", "hello", "moin")
    • Training of a separate Rasa NLU model to classify transcript items into "greeting", "question" or "test"
  • Idea: Use the resulting questions to validate the chatbot and identify the distribution of intents over the last few years
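
The first step above, flattening the XML into CSV with a basic cleanup, could look roughly like the following sketch. The column names come from the dataset description; the XML layout (a `<chat>` element per transcript with `<message>` children), the sample content and the helper names are assumptions made for illustration only, not the structure of the real export or of the project's own scripts.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Column names from the dataset description; everything about the XML layout is assumed.
FIELDS = ["questionId", "status", "type", "wait_time", "session_time",
          "language", "browser_betriebssystem", "author", "time_stamp", "text"]

# Hypothetical sample: one <chat> per transcript, one <message> per utterance.
SAMPLE_XML = """<chats>
  <chat questionId="1" status="closed" type="chat" wait_time="12"
        session_time="340" language="en" browser_betriebssystem="Firefox/Linux">
    <message author="Patron" time_stamp="2019-05-02T10:15:00">Hi,   why is my
account blocked?</message>
    <message author="Library" time_stamp="2019-05-02T10:16:10">Hello, let me check.</message>
  </chat>
</chats>"""


def clean_text(text):
    """Basic pattern-based cleanup: collapse whitespace runs and trim the ends."""
    return re.sub(r"\s+", " ", text or "").strip()


def xml_to_rows(xml_string):
    """Flatten every message into one row that repeats its chat-level fields."""
    rows = []
    root = ET.fromstring(xml_string)
    for chat in root.iter("chat"):
        for message in chat.iter("message"):
            # Chat-level attributes first; missing ones become empty strings.
            row = {name: chat.get(name, "") for name in FIELDS}
            # Message-level values overwrite the placeholders.
            row["author"] = message.get("author", "")
            row["time_stamp"] = message.get("time_stamp", "")
            row["text"] = clean_text(message.text)
            rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = xml_to_rows(SAMPLE_XML)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Writing one row per message keeps the format linear while still allowing a transcript to be regrouped later by questionId.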

Concept

  • Identifying the most important use cases by analysing chat logs and consulting patrons
  • Defining/identifying intents
  • Retrieving training data (so far manually)
  • Elaborating training data
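
To illustrate how analysing chat logs can point to the most important use cases, the sketch below counts how often each intent occurs in a set of labelled questions. The intent names and the labelled sample are hypothetical; only the example questions themselves are taken from the challenge description.

```python
from collections import Counter

# Hypothetical labelled sample: (question text, intent name) pairs.
LABELLED_QUESTIONS = [
    ("Why is my account blocked?", "account_blocked"),
    ("My account seems to be locked, why?", "account_blocked"),
    ("When do I need to return books?", "loan_period"),
    ("I cannot access an electronic full text online. Why is that?", "fulltext_access"),
]


def intent_distribution(labelled_questions):
    """Return (intent, count, share) tuples, most frequent intent first."""
    counts = Counter(intent for _, intent in labelled_questions)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(intent, n, n / total) for intent, n in counts.most_common()]


distribution = intent_distribution(LABELLED_QUESTIONS)
for intent, count, share in distribution:
    print(f"{intent}: {count} ({share:.0%})")
```

Intents with a large share are natural candidates for the first prototype, while rare ones can stay with library staff.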

Development notes

How to run XML transformations

Start the Rasa API server:

cd xml-transform/rasa-nlu
rasa run --enable-api

Run conversion:

cd xml-transform
./convert.sh
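
For orientation, this is roughly how a script could ask the running Rasa server to classify a single transcript item. It is a sketch and not the content of convert.sh: it assumes the server listens on Rasa's default port 5005 and uses the `/model/parse` HTTP endpoint; the confidence threshold, the "unknown" fallback label and the function names are illustrative assumptions.

```python
import json
import urllib.request

# Assumed default host and port of a locally started `rasa run --enable-api`.
RASA_PARSE_URL = "http://localhost:5005/model/parse"


def build_parse_request(text, url=RASA_PARSE_URL):
    """Build the POST request that asks the NLU model to parse one message."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_intent(parse_result, min_confidence=0.6):
    """Pick the predicted intent name, or "unknown" below the (assumed) threshold."""
    intent = parse_result.get("intent") or {}
    name = intent.get("name")
    confidence = intent.get("confidence", 0.0)
    if name is None or confidence < min_confidence:
        return "unknown"
    return name


def classify(text):
    """Send one transcript item to the server; requires the API server to be running."""
    with urllib.request.urlopen(build_parse_request(text)) as response:
        return extract_intent(json.load(response))


if __name__ == "__main__":
    # Only works while the Rasa API server from the step above is up.
    print(classify("moin"))
```

Keeping the response handling in `extract_intent` separate from the network call makes it possible to check the labelling logic without a running server.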

Results

  • Scripts for transforming data into CSV format
  • Training data for several intents

Possible further work

  • Find a way to retrieve training data semi-automatically
  • Define more intents and define/use more sophisticated actions
  • Evaluate intents and training data with EconDesk staff
  • Provide UI
