Implementing Finite State Machines in Python with popmatches – Part 1

Introduction

Chosing the right representation for your data is at the heart of efficient AI programming as we are reminded in the book edited by Daniel G. Bobrow and Allan Collins, titled “Representation and Understanding: Studies in Cognitive Science”. However, quite often, the good old list will do or at least is a good place to start. Writing programs to manipulate lists structures of varying complexity can be tedious and error prone. A pattern matcher simplifies this process immensely. In fact, one of the main reasons for using a Pattern Matcher is that programmers can produce more readable and intuitive code.

One could argue that Pattern Matching is an essential component of writing intelligent programs and there are many books on this topic. My favorite being the book titled “Pattern Directed Inference Systems” by Donald Arthur Waterman, and Rick Hayes-Roth. Practically speaking, Finite State Machines, can be viewed as Pattern Directed Inference Systems, and their implementation can be greatly simplified with a pattern matcher.

Pattern Matching

Pattern matching is a powerful tool which is available in many functional programming languages. The most notable application areas for Pattern Matchers are Computer Algebra Systems, Natural Language Processing and Finite State Machines.

Text editors, were among the first computer programs to use pattern matching. Ken Thompson, who designed and implemented the original Unix operating system, extended the search and replace features of the QED editor to accept regular expressions. Two of the first well known programming languages to include pattern matching functionality were SNOBOL and LISP.

Nowadays, pattern matching is an advanced art. Mathematica offers very expressive pattern matching which is similar to Perl Compatible Regular Expressions (PCRE), but for symbolic tree structures instead of strings. TGREP2 is also a very powerful pattern matcher for parse trees containing linguistic information and was implemented around 2004. More recently, in 2017, a python library for pattern matching on symbolic expressions was made available on github.

The pattern matcher described here, called popmatch, is based on the POPLOG pattern matcher which was available at the beginning of the 1980s. I was lucky enough to have learnt to program with this Pattern Matcher while at Sussex university during this time. POPLOG appears to have been one of the first systems with a powerfull pattern matcher for tree like structures.

The Matcher

Popmatches can be used for checking the correspondence of nested lists with a pattern. The matcher, while quite simple, provides a powerful tool for operating on lists, and makes it much easier to write list processing programs. The alternative is to use the “lower level” facilities for manipulating lists which is tedious. For instance, in python the head and tail of a list can be obtained with:

l = [1,2,3,4]; hd = l[0]; tl = l[1:]

With the matcher described here we would write:

matches([1,2,3,4], [?hd, ??tl])

For both cases, the head of the list will be assigned to the variable hd and the tail to tl:

hd = 1, tl = [2,3,4]

The pattern expression “?hd”, will match one item and assign the match to the variable hd. The pattern expression “??tl“, will match zero or more items and assign the result to the variable tl. Even for this simple example, we can see that the code is a lot easier to understand and follow. Quite often, the information needed to complete a task can be extracted from a complex list with just a single call to match.

The python code to implement the above example would be:

from popmatches import matches, mexp

if __name__ == '__main__':

    matches([1,2,3,4], [mexp("?hd"), mexp("??tl")])
    print(hd, tl)

The table below lists the pattern elements that can form part of a pattern.

OperatorFunction
=match one item
=:procedure nameBefore allowing the match, apply the procedure to the matching item.
If the result is FALSE, don’t allow the match to succeed.
==match an arbitrary number of items (possibly none)
==:procedure nameBefore allowing the match, apply the procedure to the matching items.
If the result is FALSE, don’t allow the match to succeed.
==:integerRestrict the match to a given number of items
==:procedure name:integerRestrict the match to a given number of items and then apply the procedure.
If the result is FALSE, don’t allow the match to succeed.
?variablematch one item and bind it to the variable
?variable:procedure nameBefore allowing the match, apply the procedure to the matching item.
If the result is FALSE, don’t allow the match to succeed.
If the result is TRUE, allow it and bind the result to the variable.
??variablematch any number of items (possibly none), make a list of them, and bind the list (possibly ) to the variable.
??variable:procedure nameBefore allowing the match, apply the procedure to the matching items.
If the result is FALSE, don’t allow the match to succeed.
If the result is TRUE, allow it and bind the result to the variable.
??variable:integerRestrict the variable to match the given number of items
??variable:procedure name:integerRestrict the variable to match the given number of items and then apply the procedure.
If the result is FALSE, don’t allow the match to succeed.
If the result is TRUE, allow it and bind the result to the variable.

Finite State Machines

The Finite State Machines here are based on those given in the book “Natural Language Processing in Pop-11” by Gerard Gazdar, Chris Mellish. But are writen in Python and not Pop-11. There are another two versions of this book for PROLOG and LISP. While the languages covered are not so common these days, these three books are still an excellent place to start because they provide insight into the programming techniques used during the early days of Natural Language Processing.

The first example shown above, is a FSM to generate laughter!  A single recursive procedure, traverse_network, is needed to implement the FSM. This procedure calls matches to find the valid transitions from the start node (S1) to the final node (S4). When the final node is reached, the function exit_from_fsm is called to exit from the recursive call to traverse_network. This is done by generating an exception which is caught by the function traverse. The code for the finite state machine is:

from popmatches import mexp, matches
import random

class FSM(object):
    output = ''

    def __init__(self, name, initial, final, arcs):
        self.name = name
        self.initial = initial
        self.final = final
        self.arcs = arcs
        self.finished = False

    def exit_from_fsm(self):
        raise Exception('exiting from FSM')

    def traverse_network(self, start_node):
        for arc in self.arcs:
            if matches(arc, [start_node, mexp('?to'), mexp('?symbol')]):
                if to == self.final:
                    self.output += symbol
                    self.finished = True
                    self.exit_from_fsm()
                elif symbol == '#' and random.random() < 0.8:
                    self.traverse_network(to)
                elif symbol != '#':
                    self.output += symbol
                    self.traverse_network(to)

    def traverse(self):
        try:
            self.traverse_network(self.initial)
        except:
            pass
        self.pr()

    def pr(self):
        if self.finished:
            print(self.output)
        else:
            print("FSM did not complete!")

if __name__ == '__main__':

    arcs = [[1, 2, 'h'],
            [2, 3, 'a'],
            [3, 1, '#'],
            [3, 4, '!']]

    start_node = 1
    final_node = 4

    myfsm = FSM('to laugh', start_node, final_node, arcs)

    myfsm.traverse()

In the second example shown below, an FSM is defined to check if a sentence, input to the machine, conforms to a simple grammar. For example “Kim was a consumer and Lee was stupid”. The procedure recognise, shown in the code below, calls the recursive procedure recognise_next to find valid transitions from the start node (S1) to the final node (S9) that are consistent with the input. All but one of the procedures called by recognise_next calls  matches.

The flow of control is as follows: Firstly, recognised_next is called with the current node and input sentence. Secondly, recognise_next checks to see if a final node has been reached and the tape is empty. If so the FSM exits with success. Then the validity of the transition is checked, that is if the variable tape is not equal to None. If the tape is None then the function returns. Thirdly, recognise_next is called recursively for each of the possible transistions from the current node which are retrieved by get_transitions. Recognise_next, is called with two arguments, 1) a new node and 2) the input tape advanced by one word (computed by recognise_move). The procedture recognise_move, is responsible for advancing the input tape if a transition to the new node is compatible with the word on the front of the tape. The procedure returns the tape advanced by one word, or None if a transition is not valid.

The code for the second FSM is:

from popmatches import mexp, matches
import random
from pprint import pprint

A__ = mexp("==")

class FSTN(object):

    output = []

    def __init__(self, name, arcs, abbreviations, tape):
        self.name = name
        self.arcs = arcs
        self.network = arcs
        self.abbreviations = abbreviations
        self.tape = tape
        self.finished = False

    def exit_from_fstn(self):
        raise Exception('exiting from FSM')

    def initial_nodes(self):
        global v, A__
        matches(self.network, [A__, ['Initial', mexp('??v')], A__])
        return v

    def final_nodes(self):
        global v, A__
        matches(self.network, [A__, ['Final', mexp('??v')], A__])
        return v

    def valid_move(self, label, s):
        global A__
        return matches(self.abbreviations, [A__, [label, 'abbreviates', A__, s, A__], A__])

    def get_transitions(self, node):
        global newnode, label
        pattern = ['From', node, 'to', mexp('?newnode'), 'by', mexp('?label')]
        return [(newnode, label) for entry in self.network if matches(entry, pattern)]

    def recognise_move(self, label, tape):
        if tape is not [] and self.valid_move(label, tape[0]):
            self.output.append((label, tape[0]))
            return tape[1:]
        elif label == '#':
            return tape
        return None

    def recognise_next(self, node, tape):
        if tape == [] and node in self.final_nodes():
            self.finished = True
            self.exit_from_fstn()
        elif tape is None:
            return

        for newnode, label  in self.get_transitions(node):
            self.recognise_next(newnode, self.recognise_move(label, tape))

    def recognise(self):
        try:
            for node in self.initial_nodes():
                self.recognise_next(node, self.tape)
        except:
            pass
        self.pr()

    def pr(self):
        if self.finished:
            pprint(self.output)
        else:
            print("FSM did not complete!")


if __name__ == '__main__':

    arcs = [['Initial', 1],
            ['Final', 9],
            ['From', 1, 'to', 3, 'by', 'NP'],
            ['From', 1, 'to', 2, 'by', 'DET'],
            ['From', 2, 'to', 3, 'by', 'N'],
            ['From', 3, 'to', 4, 'by', 'BV'],
            ['From', 4, 'to', 5, 'by', 'ADV'],
            ['From', 4, 'to', 5, 'by', '#'],
            ['From', 5, 'to', 6, 'by', 'DET'],
            ['From', 5, 'to', 7, 'by', 'DET'],
            ['From', 5, 'to', 8, 'by', '#'],
            ['From', 6, 'to', 6, 'by', 'MOD'],
            ['From', 6, 'to', 7, 'by', 'ADJ'],
            ['From', 7, 'to', 9, 'by', 'N'],
            ['From', 8, 'to', 8, 'by', 'MOD'],
            ['From', 8, 'to', 9, 'by', 'ADJ'],
            ['From', 9, 'to', 4, 'by', 'CNJ'],
            ['From', 9, 'to', 1, 'by', 'CNJ']]

    abbreviations = [['NP', 'abbreviates', 'kim', 'sany', 'lee'],
                     ['DET', 'abbreviates', 'a', 'the', 'her'],
                     ['N', 'abbreviates', 'consumer', 'man', 'woman'],
                     ['BV', 'abbreviates', 'is', 'was'],
                     ['CNJ', 'abbreviates', 'and', 'or'],
                     ['ADJ', 'abbreviates', 'happy', 'stupid'],
                     ['MOD', 'abbreviates', 'very'],
                     ['ADV', 'abbreviates', 'often', 'always', 'sometimes']]

    tape = 'kim was a consumer and lee was stupid'.split()

    myfstn = FSTN('english_1', arcs, abbreviations, tape)
    myfstn.recognise()

Leave a comment