For a computer to be able to conduct a conversation like a human, it needs to master four elements of dialogue:
Speech recognition is a process in which the computer listens to a human utterance and converts it to written form. The process can be seen on certain news channels, for example, where the words being spoken by a journalist appear on the screen as subtitles almost immediately.
In recent years, systems of this type have made huge progress in recognising speech signals in practically any of the world’s natural languages. Not long ago it was close to a miracle if a car navigation system understood a spoken destination address correctly; today we are surprised if even the smallest error occurs in understanding that kind of spoken text.
Just a few years ago, a computer-generated voice would jar with its unvarying rhythm and unnatural intonation. Today, you need an especially sensitive ear to distinguish a synthetic voice from a human one.
Text generation is a process of creating a text in a natural language based on knowledge encoded in the computer. For example, based on information of the type: “<departure station: Kings Cross; operator: LNER; destination station: Newcastle; delay: 30>” the computer system can generate the text: “The LNER train from Kings Cross to Newcastle is delayed by thirty minutes”.
Text generation can be done using a human-designed set of rules or templates; it is a much easier task than language comprehension, which we discuss next.
We can say that a computer system understands a natural language if, in response to a human utterance, it takes action in accordance with the speaker’s intention. For example, in response to a question, the system should provide a relevant piece of information; in response to a command, it should perform the requested action; and in response to an expression of opinion, it should continue the dialogue in the way a human would.
Understanding natural language is the hardest problem in computerised conversation, and is still far from being solved. This is because it is very hard to develop a finite set of rules or principles that will take care of the infinite number of possible human utterances.
However, if we accept that the system is to operate exclusively within a certain simplified reality, in which the user’s utterances are limited in vocabulary and structure, it then becomes possible – and not even especially complicated – to write a computer program that will understand natural language.
In this blog post I will describe, step by step, how such a system can be created. The example given will be written in Python, but even without knowledge of that or any other programming language, the reader will be able to learn how to build effective systems of this type.
The system in this example will be a virtual trader, doing business in accordance with the mission statement: “I buy in order to sell.” It will be an uncomplicated system, responding to short sentences with simple syntax. To broaden its potential market, our trader will communicate in English, and its partner in conversation will give it commands of the type:
Buy 5 large blue metal boxes.
Sell 3 tiny red plastic rings.
The trader will do everything to make a suitable deal in every case. It will accept any purchase instruction, recording the relevant information in its inventory. It will carry out a sale instruction only if it has previously bought the required quantity of goods.
We will accept that the system understands what is said to it if its inventory is always in agreement with the purchases made and the sale transactions concluded.
Our program is expected to work in the following way:
The user gives a command in a natural language, e.g.:
Buy 5 large blue metal boxes.
We will have to create a dictionary containing all of the words that the system is expected to understand. These will be called tokens. First, however, we need to define the permissible meanings of words, called token types. The corresponding Python source code takes the following form:
tokens = ( 'OPERATE', 'NUMBER', 'SIZE', 'COLOR', 'MATERIAL' 'KIND', )
We have defined six token types, corresponding to different meanings of tokens. In a moment we will determine, for example, that the dictionary can contain a token “Buy” belonging to the OPERATE type.
Guess to which of the above types the tokens in this sentence belong:
“Buy 5 large blue metal boxes.”
Answers in a moment…
At the next step, we define for each type what tokens belong to it and what the system has to do when a specified token appears in the user’s command:
def t_OPERATE(t): r'Buy | Sell' return t
The above code assigns to the OPERATE type the two words Buy and Sell (second line), and instructs the system to remember the given operation when such a token appears in the user’s command (third line).
def t_NUMBER(t): r'\d+' t.value = int(t.value) return t
The above code assigns to the NUMBER type any text consisting purely of digits (second line), transforms that text into the corresponding numerical value (third line), and instructs the system to remember that value whenever such a token appears in the user’s command (fourth line).
def t_MATERIAL(t): r'metal | plastic' if t.value =='metal': t.value = 1 elif t.value == 'plastic': t.value = 2 return t
The above code assigns to the MATERIAL type the words metal and plastic. Next (and this is something new!) it instructs the system to store either 1 (if the material is metal) or 2 (if the material is plastic).
We can construct the other token types analogously to MATERIAL:
def t_COLOR(t): r'(black | white | red | green | blue)' if t.value =='black': t.value = 1 elif t.value == 'white': t.value = 2 elif t.value == 'red': t.value = 3 elif t.value == 'green': t.value = 4 elif t.value == 'blue': t.value = 5 return t def t_SIZE(t): r'tiny | small | big | large' if t.value =='tiny': t.value = 1 elif t.value =='small': t.value = 2 elif t.value =='big': t.value = 3 elif t.value =='large': t.value = 4 return t
def t_KIND(t): r'box(es)? | ring(s)?' if t.value[0] =='b': t.value = 1 else: t.value = 2 return t
In the definition of the KIND type we ensure that the dictionary contains the words box, boxes, ring and rings. When a token referring to boxes appears, the value 1 is stored; and when the token refers to rings, the value 2 is stored.
How does our computer trader identify the index of the goods mentioned in a command? This is made clear by the following table, showing the values defined for particular tokens.
Value | KIND | SIZE | MATERIAL | COLOR |
1 | box(es) | tiny | metal | black |
2 | ring(s) | small | plastic | white |
3 | big | red | ||
4 | large | green | ||
5 | blue |
When the system has to determine the index of an item being bought or sold, it does so in the way indicated by the above table – where the ordering of the columns is significant.
For example, to determine the index of the good referred to as:
large blue metal boxes
the system must first arrange the given attributes in an order corresponding to the columns of the table: KIND, SIZE, MATERIAL, COLOR. This gives in this instance:
boxes large metal blue
It then assigns digits to the tokens as given in the table:
1415
(boxes → 1; large → 4; metal → 1; blue → 5)
The index of the good large blue metal boxes is therefore 1415. Notice that even if the user had stated the attributes in a different order, the resulting index would still be the same.
The virtual trader will only understand commands given in accordance with the rules of grammar defined by the programmer. Every rule indicates how smaller parts of a sentence can be put together to make a larger whole. For example, the following rule says that a command must contain, in order, a token denoting an operation, a token denoting a number, and a component denoting the article:
command: OPERATE NUMBER article
A grammar rule can also contain instructions which the program is to perform when it finds that the analysed text conforms to that rule.
Let us begin with the simplest rules, concerning the attributes of goods:
def p_attribute_color(p): 'attribute : COLOR' p[0] = p[1]
The rule denoted p_attribute_color (the name of a rule must begin with “p_”, but the rest of the name is freely chosen by the programmer) says that a part of the sentence with the name attribute may consist exclusively of a token of COLOR type (second line). The third line says that in this situation the value of the attribute component (denoted p[0]) is to be identical to the value of the COLOR token (p[1] denotes the first, and in this case only, component).
Thus, if the attribute black appears in the command, it will have the value 1, the attribute white will have the value 2, and so on.
def p_attribute_material(p): 'attribute : MATERIAL' p[0] = 10 * p[1]
The rule with the name p_attribute_material says that a part of the sentence with the name attribute may alternatively consist exclusively of a token of MATERIAL type (second line). The third line says that in this situation the value of the attribute component is to be the value of the MATERIAL token multiplied by 10.
Thus the attribute metal has the value 10, and the attribute plastic has the value 20.
Analogously:
def p_attribute_size(p): 'attribute : SIZE' p[0] = 100 * p[1]
The description of an article may consist only of the kind of good:
def p_article_kind(p): 'article : KIND' p[0] = 1000 * p[1]
The value will be 1000 for boxes and 2000 for rings.
The description of an article may be preceded by an attribute of any type:
def p_article_attribute(p): 'article : attribute article' p[0] = p[1] + p[2]
Adding an attribute to the description of an article increases its value. For example, the description “boxes” has the value 1000, “metal boxes” has the value 1010, “blue metal boxes” has the value 1015, and “large blue metal boxes” has the value 1415.
The most important rule concerns the entire command:
# Main parser rule (command) def p_command(p): 'command : OPERATE NUMBER article' index = p[3] #Buy article if p[1] == 'Buy': tab[index] += p[2] print('OK. I am buying ' + str(p[2])+ ' new articles indexed as ' + str(index) +'.') print('No of articles in shop: '+ str(tab[index])) #Sell article elif p[1] == 'Sell': if p[2] > tab[index]: print('I do not have as many articles as you want.') else: tab[index] -= p[2] print('OK. I am selling ' + str(p[2])+ ' articles indexed as ' + str(index) + '.') print('No of articles in shop: ' + str(tab[index]))
The first line states that the analysed command must consist of, in order, the operation (an OPERATE token), the number of articles (a NUMBER token), and a description of those articles (article).
The index of the good referred to in the command is taken to be the value of the article (p[3]). In the case of a purchase, the inventory record for the good is increased by the value of the NUMBER token, and an appropriate message is returned. In the case of a sale, the system additionally checks whether the required quantity of the good is available.
What can I do for you?
Buy 10 large blue metal boxes
OK. I am buying 10 new articles indexed as 1415.
No of articles in shop: 10
What can I do for you?
Buy 5 tiny white plastic rings
OK. I am buying 5 new articles indexed as 2122.
No of articles in shop: 5
What can I do for you?
Sell 5 large blue metal boxes
OK. I am selling 5 articles indexed as 1415.
No of articles in shop: 5
What can I do for you?
Sell 3 tiny white plastic rings
OK. I am selling 3 articles indexed as 2122.
No of articles in shop: 2
What can I do for you?
Sell 3 tiny white plastic rings
I do not have as many articles as you want.
What can I do for you?
Bye
>>>
The full code of the program is given below (with essential “decorations”).
# Import lexer and parser from ply module import ply.lex as lex import ply.yacc as yacc # List of token types. tokens = ( 'NUMBER', 'OPERATE', 'SIZE', 'KIND', 'COLOR', 'MATERIAL' ) # Token types may be defined as regular expressions, e.g. r'Buy | Sell' def t_OPERATE(t): r'Buy | Sell' return t def t_MATERIAL(t): r'metal | plastic' if t.value =='metal': t.value = 1 elif t.value == 'plastic': t.value = 2 return t def t_COLOR(t): r'(black | white | red | green | blue)' if t.value =='black': t.value = 1 elif t.value == 'white': t.value = 2 elif t.value == 'red': t.value = 3 elif t.value == 'green': t.value = 4 elif t.value == 'blue': t.value = 5 return t def t_SIZE(t): r'tiny | small | big | large' if t.value =='tiny': t.value = 1 elif t.value =='small': t.value = 2 elif t.value =='big': t.value = 3 elif t.value =='large': t.value = 4 return t def t_KIND(t): r'box(es)? | ring(s)?' if t.value[0] =='b': t.value = 1 else: t.value = 2 return t def t_NUMBER(t): r'\d+' t.value = int(t.value) return t # Lexer error handling rule (Handle words out of vocabulary) def t_error(t): print("Illegal character '%s'" % t.value[0]) t.lexer.skip(1) # Ignore white spaces t_ignore = ' \t' # Main parser rule (command) def p_command(p): 'command : OPERATE NUMBER article' index = p[3] #Buy article if p[1] == 'Buy': tab[index] += p[2] print('OK. I am buying ' + str(p[2])+ ' new articles indexed as ' + str(index) +'.') print('No of articles in shop: '+ str(tab[index])) #Sell article elif p[1] == 'Sell': if p[2] > tab[index]: print('I do not have as many articles as you want.') else: tab[index] -= p[2] print('OK. I am selling ' + str(p[2])+ ' articles indexed as ' + str(index) + '.') print('No of articles in shop: ' + str(tab[index])) # Parser rule (attribute) def p_attribute_color(p): 'attribute : COLOR' p[0] = p[1] # Parser rule (attribute) def p_attribute_material(p): 'attribute : MATERIAL' p[0] = 10 * p[1] # Parser rule (attribute) def p_attribute_size(p): 'attribute : SIZE' p[0] = 100 * p[1] # Parser rule (article - stop) def p_article_kind(p): 'article : KIND' p[0] = 1000 * p[1] # Parser rule (article - recursion) def p_article_attribute(p): 'article : attribute article' p[0] = p[1] + p[2] # Syntax error handling rule def p_error(p): print("Syntax error in input!") ####################################### #Main program #Initialize table of articles (zero articles at the beginning) tab = [] for index in range(3000): tab.append(0) #Build the lexer lexer = lex.lex() #Tokenize (short version) # for tok in lexer: # print(tok) #Build the parser parser = yacc.yacc() #Main loop while True: s = input('What can I do for you? \n') if s == 'Bye': break parser.parse(s)