Report on a Chatbox implemented with AIML which offers automatic customer assistance in the search for products in an osCommerce-based Internet shop

Wroclaw University of Technology

Chatbox implemented with AIML

which offers automatic customer assistance

in the search for products

in an osCommerce-based Internet shop

Subject:

Methods and Algorithms of Artificial Intelligence (ARE 3513)

http://sequoia.ict.pwr.wroc.pl/~witold/aiarr/indexe.html

Teacher:

dr inż. Witold Paluszyński

June 2008

Authors:

Adolfo Escolano Luján

Enrique Ros Carrión

Javier Maíllo Saralegui

Abstract

This report belongs to the field of Artificial Intelligence. It explains how a chatterbot called MATIAS (Modeled Artificial Totally Intelligent Assistant Softbot) was developed and how the Case-Based Reasoning technique was used in it.

The context chosen for MATIAS's conversations is the uniform shopping web portal http://www.uniformesmastia.es/tienda/. The chatbot acts as a virtual assistant that offers help to visitors in such a way that users think they are speaking with a human being instead of a chatbot.

Internally, the chatterbot analyzes the human's dialogue and searches its knowledge base for the most appropriate answer using the CBR (Case-Based Reasoning) technique.

The real importance of this project lies in showing the reader how many different uses chatterbots can have: they could be used in entertainment, education (e-learning), training, man-machine interfaces, etc., and great advantages can be obtained by implementing this type of application.

Key Words: Artificial Intelligence, AIML, Artificial Intelligence Markup Language, Case Based Reasoning, Chatbot, Customer Assistance, osCommerce.

1. Introduction

One of the most promising areas of research in Computer Science is the field of Artificial Intelligence [4], in which application techniques have been developed within specific branches of science, transforming it from a purely academic science into an experimental one. Case-Based Reasoning plays an important role in the way AI operates.

CBR studies the mental mechanisms needed to repeat what was done previously, either by oneself or guided by concrete cases collected in the literature or in folk wisdom [2].

It is also important to say that in this project we did not try to solve all natural-language problems; we aimed to create a chatter softbot able to talk in English within a specific domain.

2. Case-Based Reasoning

Human thought is generally a mixture of different kinds of processes, such as induction and deduction. This way of reasoning is based on analogies, with which people make inferences that allow them to acquire new knowledge. Here is one example of this type of reasoning. A doctor knows a number of diseases, symptoms and treatments, learned through his studies and especially through the experience accumulated during his years of practice. When a patient tells him about his illness, the doctor observes the patient's symptoms and uses what he previously learned about diseases and symptoms to give a diagnosis. If the symptoms do not correspond to any disease he already knows, he takes the most similar cases in order to offer a solution. New information about the disease, the symptoms, the possible solution and the patient's results is stored in the doctor's memory so that it can be applied in the future.

Case-Based Systems (CBS) follow the same process as the example above, with the advantage that machines can manage thousands of information sources in a very short period of time. A CBS solves a problem as follows: given the description of a new problem, a related previous case is retrieved; the solution of that solved case is adapted to obtain a solution to the new problem; finally, the solution is reviewed and the new case is learned [1].
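The cycle described above (retrieve, reuse, revise, retain) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Case` and `CaseBase`, the word-overlap similarity measure and the sample cases are hypothetical choices made for the sketch, not part of MATIAS or of any CBR library.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Case:
    problem: str   # description of a previously seen problem
    solution: str  # solution that worked for it


def similarity(a: str, b: str) -> float:
    """Word overlap (Jaccard index) between two descriptions; an assumed metric."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


@dataclass
class CaseBase:
    cases: list = field(default_factory=list)

    def retrieve(self, problem: str) -> Optional[Case]:
        """Retrieve: return the most similar stored case, or None if there are no cases."""
        if not self.cases:
            return None
        return max(self.cases, key=lambda c: similarity(c.problem, problem))

    def solve(self, problem: str, revise=lambda solution: solution) -> Optional[str]:
        best = self.retrieve(problem)
        if best is None:
            return None                    # without cases there is nothing to reason from
        proposed = best.solution           # Reuse: start from the old solution
        confirmed = revise(proposed)       # Revise: a reviewer may correct it
        if similarity(best.problem, problem) < 1.0:
            # Retain: store the new problem with its confirmed solution
            self.cases.append(Case(problem, confirmed))
        return confirmed
```

In this sketch the `revise` argument stands in for the review step; by default it accepts the reused solution unchanged.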

Among the advantages of CBR systems we can mention:

Knowledge acquisition. Cases, not rules, are the basic units of knowledge. Humans manage their knowledge through examples of previous problems and solutions, not through abstract rules.
The way the system learns. Unlike conventional rule-based systems, if the same situation is repeated, it does not have to build the same solution from scratch every time the same problem appears.
Learning is simple because it does not require a deep understanding of the domain. CBR can propose solutions in domains that are not so well understood.
It is possible to evaluate solutions when there is no algorithmic method for doing so.
It is easier to acquire new cases than to discover new rules and generalizations.
It provides performance improvements: it is faster than reasoning from scratch, it can anticipate and avoid past mistakes, and it can focus first on the most important parts of a problem.

But this technique also presents some implementation difficulties, such as:

The problem of how to represent the cases.
Structuring the relations between cases.
Very large case bases are hard to manage.
It has no general problem solver; that is, without cases we cannot solve problems.

3. Chatbots

The very beginning of all this lies in the Turing Test, which Turing called the Imitation Game and which was a test intended to answer the question of whether "machines may think".

Another important step was taken by Hugh Loebner in 1991, with the creation of a contest based on the Turing test. The contest awards medals and cash prizes for the "most human computer", and this has boosted the creation of such softbots.

But before this we find the pioneering system in the field of chat softbots, "Eliza", which was the first program in this area. It was a functional softbot for the field of psychology, created in 1966 by Joseph Weizenbaum [3]. Its purpose was to make people talk to it about their problems as if they were talking to a psychologist. Even the secretaries and non-technical staff at MIT (Massachusetts Institute of Technology) thought that the machine was a real therapist, and spent hours revealing their personal problems to the program. Eliza is still the most widely circulated program in the history of artificial intelligence.

However, Eliza was not only the first chatterbot but also the cornerstone for almost all of them, since they are based primarily on the creation of patterns that simulate human behaviour.

In 1995, Dr. Richard Wallace wrote Alice. The development of this chat softbot started, in its first version, with SETL (a language based on set theory and mathematical logic). From this first attempt came what was called "Program A", the first version of Alice, which used AIML (Artificial Intelligence Markup Language) and Java. A while later the next version, "Program B", was developed with the participation of about 300 developers, and AIML evolved into an XML grammar. This led to the development of editors and tools for managing AIML, and it was this version that won the Loebner Prize in 2000.

Two other versions of Alice were subsequently created: "Program C" (developed in C/C++) and "Program D" (based on Java 2 technology). Then, in 2001, "The Alice AI Foundation" was created, whose objectives include the distribution, promotion, development and maintenance of Alice and AIML technology. Our bot MATIAS runs on Program D.

Other softbots that have been developed are: "Theresa" (whose main topics of conversation are music and Greek mythology), "Mimic" (which learns from the conversations held with it), "Brian" (which won the bronze medal in the Loebner Prize in 1998), "Softbot Rock Critic" (which makes up reviews of albums and artists that do not exist), "ELVIS" (a chat softbot that pretends to be Elvis Presley) and "Jlaip" (which tries to replicate the personality of the former Beatle John Lennon). The following table presents some of these works with their respective authors:

SOFTBOT NAME	AUTHOR	LANGUAGE
Alice	Richard S Wallace and his team	English
Shampage	Rich Waugh	English
Eliza	Joseph Weizenbaum	English
Parry	Kenneth Colby	English
Fred	Robby Garner in Robitron Software Research	English
Jabberwock	Juergen Pirner	English
Yeti	Webolutions New Media	English
Al Alex	John Precedo	English
Elbot	Fred Roberts	German

Many of these softbots are, in principle, general, that is, they do not specialize in any particular subject, while others focus on specific themes such as Elvis Presley, music, rock and roll, John Lennon, and so on.

Chatterbots can be created as a means of entertainment or as part of interactive games, Internet information services, website directories, e-commerce applications and more. Some applications that can be built using chat softbots are presented here:

Forms Assistant: A chatbot working as an interactive form assistant can complement a digital form, inviting the visitor to fill it in. It can answer questions, help resolve errors and ensure that the visitor answers all questions. The softbot can also replace the form completely, "interviewing" the visitor in an entirely natural process. The information is collected through a pleasant conversation.
Sales Agent: An interactive sales bot is a good salesperson that helps customers make purchases on the Internet. It can behave like any real salesperson: finding out what the client is looking for, suggesting different products, steering sales towards other interesting or higher-range products, answering questions, demonstrating products and assisting the customer while he completes the order form. It can also follow up on the sale and find out whether the client has been satisfied, afterwards making suggestions for other products.
Navigation Assistant: An interactive navigation assistant complements the typical actions of the mouse. Rather than moving around the website and clicking on places of interest, the visitor asks the softbot for help in his own words. After a brief dialogue, the softbot knows exactly what the visitor needs and offers a direct response, showing the correct web page at the same time.
Chat Assistant: Chat assistants hold fun and entertaining conversations with users. Their responses are often full of humor and surprises, to keep the user's interest.
Game Opponents: Chatbots can also be very useful in game programs, playing the role of "the opponent". For the user it will be like competing against a real person, since he can go on conversing with the opponent while playing.
Educational Support: We can build the softbot to look like a tutor, storing in its knowledge base all the information it must provide to students when they ask it questions, thus making education more interactive and attractive to users.

4. MATIAS Knowledge Base Structure

This section explains the structure of the knowledge base of the softbot MATIAS, which was built so that knowledge can be entered, modified, updated and retrieved in the simplest possible way. As in any CBR system, the structure of the cases and their storage in the knowledge base is one of the most important parts.

AIML (Artificial Intelligence Markup Language) is a language structured to make it easier to enter a bot's knowledge. It was developed by Dr. Richard S. Wallace and the Alicebot free software community during the years 1995-2000.

This language is composed of a series of XML tags that help organize the knowledge of the robot. AIML operates on a stimulus-response model.

MATIAS's knowledge is organized into AIML files, text files that contain tags with a defined structure: the first line is the standard XML declaration, and the contents of the file must be enclosed within the tags <aiml> </aiml>. In AIML, <category> tags are the main containers of knowledge. <pattern> tags hold the patterns of user questions, and the answers appear inside <template> tags.

MATIAS knowledge consists of units with the following structure:

<category>

<pattern>

<!-- Question user model -->

</pattern>

<template>

<!-- bot reaction produced by the question model -->

</template>

</category>

Category example

A very basic AIML file is shown here:

<?xml version="1.0"?>

<aiml version="1.0">

<category>

<pattern>HELLO</pattern>

<template>Dzien Dobry</template>

</category>

</aiml>

The basic unit of knowledge in AIML is called a category. Each category consists of an input question, an output answer, and an optional context. The question, or stimulus, is called the pattern. The answer, or response, is called the template. There are two optional methods for defining context using the <that> and <topic> markup. The <that> tag appears inside a category, and its pattern must match the robot's last utterance. Remembering one last utterance is important if the robot asks a question. The <topic> tag appears outside the category, and collects a group of categories together. The topic may be set inside any template [6].

The pattern language within AIML is simple, consisting only of words, spaces, and the wildcard symbols "_" and "*". The words may consist of letters and numerals, but no other characters. The pattern language is case insensitive. Words within the pattern are separated by a single space, and the wildcard characters function like words.
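The matching rule described above can be sketched as follows. This is a minimal sketch, assuming that each wildcard stands for one or more whole words and that the input has already been reduced to words separated by spaces; the function name `matches` is hypothetical, and the sketch only decides whether one pattern matches, not which of several matching patterns should win.

```python
def matches(pattern: str, sentence: str) -> bool:
    """Return True if an AIML-style pattern matches a sentence.

    Both wildcards, "_" and "*", stand for one or more whole words.
    Comparison is case insensitive.
    """
    p = pattern.upper().split()
    s = sentence.upper().split()

    def walk(i: int, j: int) -> bool:
        if i == len(p):
            return j == len(s)      # pattern used up: the input must be used up too
        if j == len(s):
            return False            # input used up but pattern words remain
        if p[i] in ("_", "*"):
            # try every non-empty run of words for the wildcard
            return any(walk(i + 1, k) for k in range(j + 1, len(s) + 1))
        return p[i] == s[j] and walk(i + 1, j + 1)

    return walk(0, 0)
```

For instance, under these assumptions the pattern `I NEED *` matches "I need a blue uniform" but not "I need", because the wildcard must absorb at least one word.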

The template language is also designed to represent the response as simply as possible for the task at hand. In its simplest form, the template consists of only plain, unmarked text. More generally, AIML tags allow the reply to save data, activate other programs, give conditional responses, and recursively call the pattern matcher to insert the responses from other categories. Most AIML tags belong to this template side sublanguage.

AIML supports two ways to interface other languages and systems. The <system> tag executes any program accessible as an operating system shell command, and inserts the results in the reply. The <javascript> tag allows arbitrary scripting inside the templates.

AIML processing is similar to querying a simple database of questions and answers. However, the pattern matching "query" language is much simpler than something like SQL. A category template may contain the recursive <srai> tag, so that the output depends not only on one matched category, but also any others recursively reached through <srai>.
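The recursive behaviour of <srai> can be illustrated with a small Python sketch. It assumes a toy knowledge base with exact-match patterns only; the dictionary `CATEGORIES`, the function `respond`, the depth limit and the sample sentences are all hypothetical and are not taken from MATIAS or Program D.

```python
import re

# Hypothetical miniature knowledge base: exact pattern -> template.
CATEGORIES = {
    "HELLO": "Hello! How can I help you?",
    "HI": "<srai>HELLO</srai>",
    "GOOD MORNING": "Good morning. <srai>HI</srai>",
}

SRAI = re.compile(r"<srai>(.*?)</srai>")


def respond(text: str, depth: int = 0, max_depth: int = 10) -> str:
    """Answer `text`, expanding every <srai> by matching its content again."""
    if depth > max_depth:
        return ""  # guard against categories that redirect to each other forever
    template = CATEGORIES.get(text.strip().upper(), "")
    # Replace each <srai>...</srai> with the response to its content.
    return SRAI.sub(lambda m: respond(m.group(1), depth + 1, max_depth), template)
```

Here "GOOD MORNING" reaches "HELLO" through "HI", so the final output depends on three categories rather than one.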

The CBR "cases" are the categories in AIML. The algorithm finds the best-matching pattern for each input. The category ties the response template directly to the stimulus pattern. MATIAS is conceptually not much more complicated than Weizenbaum's ELIZA chat robot; the main differences are the much larger case base and the tools for creating new content by dialog analysis [5].

The template represents the answer given by the robot to the user. In its simplest form, the template consists of plain, unmarked text (i.e. without tags).

More generally, the template may contain AIML tags such as random, get, set, think, srai, etc., which turn the chatbot's response into a computer program that can save data to disk, activate other programs, give conditional answers, and even recursively call other patterns.

The optional context portion of a category has two variants, given by the tags <that> and <topic>. Context can be very useful for making the chatterbot's conversations flow properly.

The <that> tag, which appears within a category, refers to the last answer given by the bot in the conversation. Remembering the chatbot's last answer is important when the softbot asks a question and the user's reply must be understood in that light.

The <topic> tag appears outside the category and is used to group categories by theme. It refers to the subject or topic of the current conversation.
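The effect of the two context mechanisms can be sketched in Python as a lookup keyed by input, last bot answer and topic. This is a simplified sketch under stated assumptions: the table `CATEGORIES`, the function `answer`, the sample dialogue lines and the fallback order (a specific `that` is preferred over a specific `topic`) are hypothetical and only illustrate the idea.

```python
# (pattern, that, topic) -> template; "*" means "any" for that and topic.
# All entries are made-up examples.
CATEGORIES = {
    ("YES", "DO YOU WANT TO SEE OUR CATALOGUE", "*"): "Here is the catalogue.",
    ("YES", "*", "*"): "Yes what?",
    ("WHICH SIZES", "*", "SHIRTS"): "Shirts come in several sizes.",
}


def answer(user_input, last_reply="", topic=""):
    """Pick a template using the bot's last reply and the current topic as context."""
    key_input = user_input.upper()
    # Try the most specific context first, then fall back to "any".
    for that in (last_reply.upper(), "*"):
        for current_topic in (topic.upper(), "*"):
            template = CATEGORIES.get((key_input, that, current_topic))
            if template is not None:
                return template
    return None
```

The same input "yes" therefore gets a different answer depending on whether the bot has just asked about the catalogue.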

5. MATIAS, The "Program D" bot

Program D is the most widely used free ("open source") AIML bot platform in the world. It is the most feature-complete, best-tested implementation of the current AIML specification. It supports unlimited multiple bots in a single server instance, and has an open-ended architecture for interacting via any interface imaginable. The standard release provides a J2EE web application implementation that can be deployed as a .war file. Drop-in listeners are available for IRC, AIM, and Yahoo. It includes an automated testing framework for testing knowledge bases, and is packaged with an AIML Test Suite that verifies that the program itself complies with the AIML specification.

Program D is known to work with many different languages and character sets. Its component-oriented architecture allows it to be integrated into any application framework desired. It is implemented in Java, and uses many features of the latest JDK to provide optimum code reliability. It is actively maintained and supported at http://aitools.org/Program_D.

6. Knowledge Recovery

Knowledge recovery in MATIAS is a process that runs from the user input, through the search by the matching algorithm for the pattern corresponding to that input, to the generation of the answer.

The input consists of the word or words typed by the user to the chatterbot. Since these are written in natural language, they must be normalized, i.e. go through a transformation process, before being matched.

This normalization process is conducted in the following order:

Replacements: Applied to the user input in order to adjust its words to the patterns stored in the knowledge base before matching. These replacements are very useful for handling spelling mistakes, shortened words, jargon, acronyms, abbreviations, exceptions, etc.
Complex Entries Division: A series of heuristics applied to the input to split it into two or more simple sentences.
Fitting: The fitting process removes from the input any characters that are not allowed in patterns.
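The three normalization steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the substitution table, the rule of splitting at full stops, exclamation marks and question marks, and the function names are hypothetical, and a real bot would use much longer tables and finer heuristics.

```python
import re

# Hypothetical substitution table; a real bot would load a much longer one.
SUBSTITUTIONS = {
    "i'm": "i am",
    "what's": "what is",
    "pls": "please",
    "u": "you",
}


def substitute(text: str) -> str:
    """Step 1 (replacements): rewrite contractions, abbreviations and jargon word by word."""
    return re.sub(
        r"[a-z']+",
        lambda m: SUBSTITUTIONS.get(m.group(0), m.group(0)),
        text.lower(),
    )


def split_sentences(text: str) -> list:
    """Step 2 (complex entries division): split the input at '.', '!' and '?'."""
    return [part.strip() for part in re.split(r"[.!?]+", text) if part.strip()]


def fit(sentence: str) -> str:
    """Step 3 (fitting): drop characters not allowed in patterns, then upper-case."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", sentence.lower())
    return " ".join(cleaned.split()).upper()


def normalize(user_input: str) -> list:
    """Run the three steps in order and return one normalized string per sentence."""
    return [fit(s) for s in split_sentences(substitute(user_input))]
```

Each string returned by `normalize` contains only upper-case words, digits and single spaces, which is the form the pattern language expects.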

When the user puts a question to the bot, it is first normalized, and then the knowledge base is searched for the category whose pattern best matches the question.
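One way to picture this search is a tree of pattern words in which every node is tried in a fixed order of priority. The sketch below assumes the order "_" first, then the exact word, then "*", and assumes that a wildcard absorbs one or more words; the class `Node`, the functions `add` and `find` and the sample categories are hypothetical and are not taken from Program D.

```python
class Node:
    def __init__(self):
        self.children = {}    # word or wildcard -> Node
        self.template = None  # set on the node where a pattern ends


def add(root, pattern, template):
    """Store a pattern word by word along one path of the tree."""
    node = root
    for word in pattern.upper().split():
        node = node.children.setdefault(word, Node())
    node.template = template


def find(node, words):
    """Return the template of the best-matching pattern, or None."""
    if not words:
        return node.template
    head, tail = words[0], words[1:]
    # Priority at every node: "_" first, then the exact word, then "*".
    for key in ("_", head, "*"):
        child = node.children.get(key)
        if child is None:
            continue
        if key in ("_", "*"):
            # a wildcard absorbs one or more words
            for n in range(1, len(words) + 1):
                found = find(child, words[n:])
                if found is not None:
                    return found
        else:
            found = find(child, tail)
            if found is not None:
                return found
    return None
```

With the patterns `I NEED *` and `I NEED A SHIRT` both stored, the input "I NEED A SHIRT" reaches the more specific category, while "I NEED A JACKET" backtracks to the wildcard.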

Tree nodes

7. Chat examples

Recall that our system offers search help in an Internet uniform shop based on osCommerce, so that customers can find the various products on sale. This help is offered through a chat box implemented with AIML.

Here we show some typical chats with MATIAS:

Dialogue history 1

Dialogue history 2

Dialogue history 3

Bibliography

[1] AAMODT, A., PLAZA, E. Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. Artificial Intelligence Communications. 1994.

[2] SCHANK, R. Dynamic Memory: A Theory of Reminding and Learning in Computers and People. Cambridge University Press. 1982.

[3] WEIZENBAUM, J. ELIZA - A Computer Program for the Study of Natural Language Communication between Man and Machine. Communications of the Association for Computing Machinery. 1966.

[4] WINSTON, P. Inteligencia Artificial. Addison-Wesley. 1994.

[5] http://list.alicebot.org/pipermail/alicebot-general/2001-September/000758.html

[6] http://www.idealliance.org/papers/extreme/proceedings/html/2007/Freese01/EML2007Freese01.html

Gathering further information

A.L.I.C.E.
http://www.alicebot.org

http://alicebot.blogspot.com/

A.I. TOOLS (ProgramD Website)

http://aitools.org/Main_Page

AIML Reference

http://home.tampabay.rr.com/ringo/aimlrm.html

Bug tracker:

http://bugs.aitools.org/

Java
http://www.javasoft.com

JDOM
http://www.jdom.org

Jakarta Tomcat

http://jakarta.apache.org