Wroclaw University of Technology

 

Chatbox implemented with AIML

which offers automatic customer assistance

in the search of products

in an osCommerce-based Internet shop

 

Subject:

Methods and Algorithms of Artificial Intelligence (ARE 3513)

http://sequoia.ict.pwr.wroc.pl/~witold/aiarr/indexe.html

 


Teacher:

dr inż. Witold Paluszyński

 

June 2008

 

 

Authors:

Adolfo Escolano Luján

Enrique Ros Carrión

Javier Maíllo Saralegui

 

 

 

 

 

 

 

 

 

 

Abstract

 

This report, written in the field of Artificial Intelligence, explains how a chatterbot called MATIAS (Modeled Artificial Totally Intelligent Assistant Softbot) was developed and how the Case-Based Reasoning (CBR) technique was used.

 

The context chosen for MATIAS' conversations is the uniform shopping web portal http://www.uniformesmastia.es/tienda/. The chatbot acts as a virtual assistant that offers help to visitors in such a way that users think they are speaking with a human being instead of a chatbot.

 

Internally, the chatterbot analyzes the user's input and searches its knowledge base for the most appropriate answer using the CBR (Case-Based Reasoning) technique.

 

The real importance of this project lies in showing the reader how many different uses chatterbots can have: they could be used in entertainment, education (e-learning), training, man-machine interfaces, etc. Great advantages can be obtained by implementing this type of application.

 

Key Words: Artificial Intelligence, AIML, Artificial Intelligence Markup Language, Case Based Reasoning, Chatbot, Customer Assistance, osCommerce.

 

 

 

1. Introduction

 

One of the most promising areas of research in Computer Science is the field of Artificial Intelligence [4], in which application techniques have been developed within specific branches of science, transforming it from a purely academic science into an experimental one. Case-Based Reasoning plays an important role in the AI modus operandi.

 

CBR studies the mental mechanisms needed to repeat what was done previously, either by oneself or guided by concrete cases collected in the literature or in folk wisdom [2].

 

It is also important to say that in this project we did not try to solve all problems related to natural language; we aimed to create a chatter softbot able to talk in English within a specific domain.

 

 

 

2. Case-Based Reasoning

 

The nature of human thought is generally a mixture of different kinds of processes, such as induction, deduction, etc. This kind of human reasoning is based on analogies, with which people make inferences that allow them to acquire new knowledge. One example of this type of reasoning is the following: a doctor knows about different kinds of sicknesses, symptoms and treatments, learned through his studies and especially through the experience accumulated during his years of practice. When a patient tells him about his illness, the doctor observes the patient's symptoms and uses the information about diseases and symptoms he previously learned in order to give a diagnosis. If the symptoms do not correspond to any sickness he already knows, he takes the most similar cases in order to offer a solution. New information about the disease, symptoms, possible solution and patient results will be stored in the doctor's memory in order to be applied in the future.

 

Case-Based Systems (CBS) follow the same process as the example presented above, with the advantage that machines can manage thousands of information sources in a very short period of time. A CBS resolves a problem this way: given the description of a new problem, a previous related case is retrieved; the solution of that solved case is adapted in order to obtain a solution to the new problem; finally, the solution is reviewed and the new case is learned [1].

 

Among the advantages of CBR systems we can mention:

 

 

However, this technique also presents some difficulties in its implementation, such as:

 

 

 

3. Chatbots

 

The very beginning of all this is the Turing Test, which Turing called the Imitation Game and which was a test intended to answer the question of whether machines can think.

 

Another important step was taken by Hugh Loebner in 1991, with the creation of a contest based on the Turing Test. The contest awards medals and cash prizes for the "most human computer", and this has boosted the creation of such softbots.

 

But before this we find the first system and a pioneer in the field of chat softbots, "Eliza", which was the first program in this area. It was a functional softbot for the field of psychology, created in 1966 by Joseph Weizenbaum [3]. Its purpose was to make people speak with it about their problems as if they were talking to a psychologist. Even the secretaries and non-technical staff at MIT (Massachusetts Institute of Technology) thought that the machine was a real therapist, and spent hours revealing their personal problems to the program. Eliza is still the most widely circulated program in the history of artificial intelligence.

However, Eliza was not only the first chatterbot but also the cornerstone for almost all of them, since they are based primarily on the creation of patterns that simulate human behaviour.

 

In 1995, Dr. Richard Wallace wrote Alice. The first version of this chat softbot was developed using SETL (a language based on set theory and mathematical logic). From this first attempt came what was called "Program A", the first version of Alice using AIML (Artificial Intelligence Markup Language) and Java. A while later the next version, "Program B", was developed; about 300 developers took part in it, and AIML evolved into an XML grammar. This resulted in the development of editors and tools for managing AIML, and it was this version that won the Loebner Prize in 2000.

 

Two other versions of Alice were subsequently created: "Program C" (developed using C/C++) and "Program D" (based on Java 2 technology). In 2001, "The Alice AI Foundation" was created, whose objectives include the distribution, promotion, development and maintenance of Alice and AIML technology. Our bot MATIAS runs on Program D.

 

Other softbots that have been developed are: "Theresa" (whose main topics of conversation are music and Greek mythology), "Mimic" (which learns while holding conversations), "Brian" (which won the bronze medal in the Loebner Prize in 1998), "Sofbot Rock Critic" (which makes up reviews of records and artists that do not exist), "ELVIS" (a chat softbot that simulates being Elvis Presley), and "Jlaip" (which tries to replicate the personality of the former Beatle John Lennon). The following table presents some of these works with their respective authors:

 

SOFTBOT NAME  AUTHOR                                      LANGUAGE
Alice         Richard S Wallace and his team              English
Shampage      Rich Waugh                                  English
Eliza         Joseph Weizenbaum                           English
Parry         Kenneth Colby                               English
Fred          Robby Garner in Robitron Software Research  English
Jabberwock    Juergen Pirner                              English
Yeti          Webolutions New Media                       English
Al Alex       John Precedo                                English
Elbot         Fred Roberts                                German

 

Many of these softbots are, in principle, general, that is, they do not specialize in any particular subject, while others focus on particular themes such as Elvis Presley, music, rock and roll, John Lennon, and so on.

 

Chatterbots can be created as a means of entertainment or as part of interactive games, Internet information services, directories of Web sites, e-commerce players and more. Some applications that can be built using chat softbots are:

 

 

 

4. MATIAS Knowledge Base Structure

 

This section explains the structure of the knowledge base of the softbot MATIAS, built so that knowledge can be entered, modified, updated and retrieved in the simplest way possible. As in any CBR system, the structure of the cases and their storage in the knowledge base is one of the most important parts.

AIML (Artificial Intelligence Markup Language) is a language structured to make it easier to introduce knowledge into a bot. It was developed by Dr. Richard S. Wallace and the Alicebot free software community during the years 1995-2000.

 

This language is composed of a series of XML tags that help organize the knowledge of the robot. The operation of AIML is based on a stimulus-response model.

 

MATIAS' knowledge is organized into AIML files, text files that contain tags with a defined structure: the first line is a standard XML declaration, and the contents of the file must be enclosed within the tags <aiml> </aiml>. In AIML, the <category> tags are the ones that mainly contain knowledge. The <pattern> tags hold the patterns of user questions, and the answers are tagged as <template>.


MATIAS knowledge consists of units with the following structure:

 

<category>
  <pattern>
    <!-- user question model -->
  </pattern>
  <template>
    <!-- bot reaction produced by the question model -->
  </template>
</category>

 

 

 

 

 


Category example

 

Here is a very basic AIML file:

<?xml version="1.0"?>
<aiml version="1.0">
  <category>
    <pattern>HELLO</pattern>
    <template>Dzien Dobry</template>
  </category>
</aiml>

 

 

 

 

 


The basic unit of knowledge in AIML is called a category. Each category consists of an input question, an output answer, and an optional context. The question, or stimulus, is called the pattern. The answer, or response, is called the template. There are two optional methods for defining context using the <that> and <topic> markup. The <that> tag appears inside a category, and its pattern must match the robot's last utterance. Remembering one last utterance is important if the robot asks a question. The <topic> tag appears outside the category, and collects a group of categories together. The topic may be set inside any template [6].

The pattern language within AIML is simple, consisting only of words, spaces, and the wildcard symbols "_" and "*". The words may consist of letters and numerals, but no other characters. The pattern language is case insensitive. Words within the pattern are separated by a single space, and the wildcard characters function like words.
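As a minimal sketch of the two wildcards, the file below holds one category per symbol. The patterns and replies are invented for illustration and are not taken from the MATIAS knowledge base; <star/> is the AIML tag that inserts the words captured by the wildcard, and the version attribute simply follows the basic file shown earlier.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
  <!-- "*" stands for one or more words; here it captures the product name -->
  <category>
    <pattern>DO YOU SELL *</pattern>
    <template>Let me check whether we have <star/> in our catalogue.</template>
  </category>
  <!-- "_" also stands for one or more words; here it captures the words before UNIFORM -->
  <category>
    <pattern>_ UNIFORM</pattern>
    <template>So you are interested in a uniform of this kind: <star/>.</template>
  </category>
</aiml>
```

With these hypothetical categories, the input "do you sell aprons" would fill the first reply with "aprons", because matching is case insensitive and the wildcard behaves like a word in the pattern.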

The template language is also designed to represent the response as simply as possible for the task at hand. In its simplest form, the template consists of only plain, unmarked text. More generally, AIML tags allow the reply to save data, activate other programs, give conditional responses, and recursively call the pattern matcher to insert the responses from other categories. Most AIML tags belong to this template side sublanguage.

AIML supports two ways to interface other languages and systems. The <system> tag executes any program accessible as an operating system shell command, and inserts the results in the reply. The <javascript> tag allows arbitrary scripting inside the templates.
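The following sketch shows where the two tags sit inside a template. It assumes an interpreter that has both tags enabled and a host system that provides a `date` shell command; both are assumptions for illustration, and the categories are not part of the MATIAS knowledge base.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
  <!-- <system> runs a shell command and inserts its output in the reply -->
  <category>
    <pattern>WHAT DAY IS IT</pattern>
    <template>Today is <system>date</system></template>
  </category>
  <!-- <javascript> hands its content to a script engine and inserts the result -->
  <category>
    <pattern>WHAT IS TWO PLUS TWO</pattern>
    <template>The answer is <javascript>2 + 2</javascript>.</template>
  </category>
</aiml>
```

Because <system> gives the knowledge base access to the operating system, a public shop assistant would probably keep it disabled unless a category really needs it.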

AIML processing is similar to querying a simple database of questions and answers. However, the pattern matching "query" language is much simpler than something like SQL. A category template may contain the recursive <srai> tag, so that the output depends not only on one matched category, but also any others recursively reached through <srai>.
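A small sketch of this recursion, using invented categories that are not taken from the MATIAS knowledge base: two different ways of asking for a product are reduced to one canonical pattern, so the actual answer is written only once.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
  <!-- canonical category that holds the real answer -->
  <category>
    <pattern>DO YOU SELL *</pattern>
    <template>Let me check whether we have <star/> in our catalogue.</template>
  </category>
  <!-- alternative wordings are rewritten and matched again through <srai> -->
  <category>
    <pattern>I AM LOOKING FOR *</pattern>
    <template><srai>DO YOU SELL <star/></srai></template>
  </category>
  <category>
    <pattern>HAVE YOU GOT *</pattern>
    <template><srai>DO YOU SELL <star/></srai></template>
  </category>
</aiml>
```

For the input "I am looking for aprons", the second category matches first, and its <srai> sends "DO YOU SELL aprons" back to the pattern matcher, so the output comes from the first category.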

The CBR "cases" are the categories in AIML. The algorithm finds the best-matching pattern for each input. The category ties the response template directly to the stimulus pattern. MATIAS is conceptually not much more complicated than Weizenbaum's ELIZA chat robot; the main differences are the much larger case base and the tools for creating new content by dialog analysis [5].

 

The template represents the answer given by the robot to the user. In its simplest form, the template consists of plain, unmarked text (i.e. without tags).

 

More generally, the template may be composed of AIML tags, such as random, get, set, think, srai, etc., which turn the chatbot response into a computer program that can save data on disk, activate other programs, give conditional answers, and even recursively call other patterns.
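To make some of these tags concrete, here is a hedged sketch with two invented categories (not taken from the MATIAS knowledge base). The predicate name "name" is an arbitrary choice for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
  <!-- <set> stores the captured words in a predicate; <think> hides the output of <set> -->
  <category>
    <pattern>MY NAME IS *</pattern>
    <template>
      <think><set name="name"><star/></set></think>
      Nice to meet you, <get name="name"/>.
    </template>
  </category>
  <!-- <random> picks one of its <li> items each time the category fires -->
  <category>
    <pattern>THANK YOU</pattern>
    <template>
      <random>
        <li>You are welcome, <get name="name"/>.</li>
        <li>Glad I could help.</li>
      </random>
    </template>
  </category>
</aiml>
```

Storing the name once and reading it back with <get> lets later replies stay personal without asking the user again.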

 

The optional context portion of a category comes in two variants, given by the tags <that> and <topic>. The use of context can be very useful for the chatterbot to behave properly in conversations.

 

The <that> tag, which appears within categories, refers to the last answer given by the bot in the conversation. Remembering the chatbot's last answer is important when the softbot has asked a question.

 

The <topic> tag appears outside the category and is used to group categories by theme. This tag refers to the subject or topic of the current conversation.
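The two kinds of context can be sketched as follows. The dialogue, the topic name UNIFORMS and the replies are hypothetical and only illustrate the mechanism; they are not categories from MATIAS.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
  <!-- the bot asks a question and sets the current topic as a side effect -->
  <category>
    <pattern>I NEED A UNIFORM</pattern>
    <template>
      <think><set name="topic">UNIFORMS</set></think>
      Is it for a school?
    </template>
  </category>
  <!-- YES is only understood here if the bot's last utterance was that question -->
  <category>
    <pattern>YES</pattern>
    <that>IS IT FOR A SCHOOL</that>
    <template>I will show you our school uniforms.</template>
  </category>
  <!-- categories inside <topic> are only considered while the topic is UNIFORMS -->
  <topic name="UNIFORMS">
    <category>
      <pattern>WHAT SIZES *</pattern>
      <template>I can look up the available uniform sizes for you.</template>
    </category>
  </topic>
</aiml>
```

Without the <that> element, a bare "yes" would be ambiguous, since the same word can answer many different questions.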

 

 

5. MATIAS, The "Program D" bot

 

Program D is the most widely used free ("open source") AIML bot platform in the world. It is the most feature-complete, best-tested implementation of the current AIML specification. It supports unlimited multiple bots in a single server instance, and has an open-ended architecture for interacting via any interface imaginable. The standard release provides a J2EE web application implementation that can be deployed as a .war file. Drop-in listeners are available for IRC, AIM, and Yahoo. It includes an automated testing framework for testing knowledge bases, and is packaged with an AIML Test Suite that verifies that the program itself complies with the AIML specification.

 

Program D is known to work with many different languages / character sets. Its component-oriented architecture allows it to be integrated into any application framework desirable. It is implemented in Java, and uses many features of the latest JDK to provide optimum code reliability. It is actively maintained and supported at http://aitools.org/Program_D

 

 

 

6. Knowledge Recovery

 

Knowledge recovery in MATIAS is a process that extends from the user input, through the search for the pattern corresponding to that input by the matching algorithm, to the generation of the answer.

 

The inputs are the words entered by the user to the chatterbot. Since these are written in natural language, they must be normalized, i.e. go through a transformation process before being sent to the matching algorithm.

 

This normalization process is conducted in the following order:

 

When the user enters a question to the bot, it is first normalized, and then the knowledge base is searched to find the category whose pattern best matches the question.
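What "best matches" means can be sketched with a few overlapping categories. The sketch assumes the usual AIML matching order, in which at each word the matcher tries "_" first, then the exact word, then "*"; it also assumes that normalization has already turned the input into uppercase words without punctuation. The categories themselves are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
  <!-- tried first: "_" has the highest priority, so HELLO MATIAS ends up here -->
  <category>
    <pattern>_ MATIAS</pattern>
    <template>You called me by my name.</template>
  </category>
  <!-- an exact word is tried before "*", so HELLO THERE ends up here -->
  <category>
    <pattern>HELLO *</pattern>
    <template>Hello! How can I help you?</template>
  </category>
  <!-- last resort: any input that no other category matches -->
  <category>
    <pattern>*</pattern>
    <template>Sorry, I did not understand. Which product are you looking for?</template>
  </category>
</aiml>
```

Under these assumptions, the input "Good morning" falls through to the catch-all category, which is a common way to keep the conversation inside the shop domain.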

 

 

 

Tree nodes

 

 

7. Chat examples

 

Let us remember that our system offers search help in an Internet uniform shop, in order to find different products in an osCommerce-based shop. This search help is offered through a chat box implemented with AIML.

Here we show some typical chats with MATIAS:

Dialogue history 1

 

Dialogue history 2

 

 

Dialogue history 3

 

 

 

Bibliography

 

 

[1] AAMODT, A., PLAZA, E. Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. Artificial Intelligence Communications. 1994.

 

[2] SCHANK, R. Dynamic Memory: A Theory of Reminding and Learning in Computers and People. Cambridge University Press. 1982.

 

[3] WEIZENBAUM, J. "ELIZA - A Computer Program for the Study of Natural Language Communication between Man and Machine," Communications of the Association for Computing Machinery. 1966.

 

[4] WINSTON, P. Inteligencia Artificial. Addison-Wesley. 1994.

 

[5] http://list.alicebot.org/pipermail/alicebot-general/2001-September/000758.html

 

[6] http://www.idealliance.org/papers/extreme/proceedings/html/2007/Freese01/EML2007Freese01.html

 

 

 

Gathering further information

 

A.L.I.C.E.
http://www.alicebot.org

http://alicebot.blogspot.com/

 

A.I. TOOLS (ProgramD Website)

http://aitools.org/Main_Page


AIML Reference

http://home.tampabay.rr.com/ringo/aimlrm.html

 

Bug tracker:

http://bugs.aitools.org/


Java
http://www.javasoft.com


JDOM
http://www.jdom.org


Jakarta Tomcat

http://jakarta.apache.org

