FB Pixel Joshua Cole: Chat Bots Chat Bots | Joshua Cole’s Personal Website

Chat Bots

Published: 23 Jul 2022

I remember when I was young my mom had me in her lap and I was reading a book at her request. It wasn’t the first time we had done this, but this time went different to many of the other times. My mom did something that humans do commonly - she errored. Often people act like this sort of thing is bad. It isn’t. Her error was to our benefit. Accidentally, she turned multiple pages, rather than the one page she had intended to turn.

This mistake changed the course of my life dramatically. To her astonishment I read the page she skipped not the page she was on. Or rather - I didn’t read the page. I had memorized the book and was repeating it from my memory.

What followed was a dedicated effort on her part to teach me phonetics with hooked on phonics. Over the course of what I remember as a winter break I went from a reading level appropriate to me - I was in early grade school - to one much higher. The school wished to move me forward three grades in response to my enormous jump in proficiency. My mother declined for the sake of my social development.

I now recognize what happened to me as overfitting a small training dataset. I read in text, memorized it, spat back out that same text. So the time spent learning to read was time spent learning to generalize from the small datasets I was reading to the majority of English writing.

Talking Programs

My first ever contribution to an open source project that wasn’t my own was a plugin for an IRC chat bot in #clojure on freenode. The plugin was called findfn, because that is what it did. When someone types something in IRC along the lines of findfn 2 2 4 it would return a list of all the functions which would return 4 after being passed 2 twice. In that example it would probably return the operators for both addition and multiplication. Other times it might return first, last, map, or any number of other functions.

To do this the plugin took a brute force approach. There weren’t any clever means of figuring out what would return what, though it would be probably be possible to do that given functions with the right metadata. Instead it went through all the functions in the core namespaces and tested each of them to see if applying all but the last argument would return a result which matched the last argument.

It wasn’t long after that I submitted the plugin that I saw it being rolled out in #clojure. Then only a little while after that I saw the plugin being merged into the core of the bot. For a first time contribution to open source I was pretty happy with the result. People actually found something I had helped to build to be useful.

I’m really glad that I had the chance to give back to the IRC community in #clojure. For those of you who program in Clojure, but haven’t been there, you’re really missing out. It is home to some of the nicest and most helpful people I’ve met online. I’m pretty sure that I ended up having more questions answered in the IRC chat and through searching its logs then I did on StackOverflow.

I know that I would never have submitted the plugin if it weren’t for wonderful people like amalloy and raynes. amalloy helped me fix a bug in the initial impementation of the function and also guided me through submitting the plugin to lazybot.

This early chat bot didn’t think. It did, however, help people. It thought for them. It had utility.

It wasn’t the only chat bot I wrote.

When my uncle came out from South Carolina to visit us for Christmas. He brought along a laptop which had a license on it which allowed him to run chat bots on Camfrog, Camfrog is a video conferencing platform geared around chat rooms. He wanted me to create a bot for the program and I was happy to do so.

At first we tried to find a way to copy the dll of an existing bot, because it already did something really close to what we needed. This didn’t work, as the bots architecture didn’t allow it to have two separate questions files. So I was forced to put on the programming belt and hammer out some code.

I wrote an event based C++ dll which integrated against the closed source C++ bot that loads and sends events to the dll. At the time this was one of the most complicated programs I’d ever written, but it was actually a pretty trivial program. Given datasets about questions and answer pairs it let people race to answer the quesitons.

Players loved it. It taught them language. It helped them to think more quickly, because it was a race.

When I finished the bot, the week was over and my uncle left for home. When I woke up in the morning my uncles laptop was gone and a question had been left in its place. How would my untested code fare in the wild? Thinking about it helps me to understand why it is that developers love web apps so much. It will be such a pain to push out updates if this bot doesn’t work.

Another program that outputs text. Another program that generates utility. Yet there is nothing there like what a human does. There is no generalization from the examples to the underlying meaning.

Smarter programs

I’ve been reading and enjoying Paradigms of Artificial Intelligence: Case Studies in Common Lisp. It is an amazing book written by Peter Norvig, formerly of NASA and now head of research at Google. Norvig has a way of talking about AI programs that shaped the field in their time in a way that educates rather than confuses.

The programs he talks about start out seeming like magic, but because he is good at giving explanations the magic tricks quickly fade, replaced by explanation. They seem baffling and astounding. Then you find out what is really going on and it might still be cool, but its a different kind of cool. Once you know the trick behind it, the entire program is seen in a new light.

This feeling, the opposite of magic, is actually at the root of what it even means to communicate explanations well. Magic hides the explanation for events. Norvig doesn’t.

To show you what I mean I’m going to go over an AI that Norvig talks about in the book called Eliza. Eliza is a chat bot built way back when and it tricked a lot of people into thinking there was actually something intelligent behind it. That people took it seriously actually freaked out the person who coded the AI.

So what is Eliza? Eliza is a rule based translator. What does that mean? Well I’ll give you an example of a set of rules to show you. These rules are taken directly from an implementation of a subset of the features of the actual Eliza that I wrote in Clojure, one of my favorite programming languages. I’;m hoping it will be pretty readable even if you aren’t familiar with the language:

("?*x I want ?*y"
  ("What would it mean if you got ?*y"
   "Suppose you got ?*y soon?"
   "Why do you want ?*y"))

So what we have above is a list of two items. The first item is a string and the second item is a list. That first string, it is a pattern. Whenever Eliza is told something it will check this rule and look at the pattern. It will then try to see if there is any point in what it was told where ‘I want’ appears. If there is it will save segments of text which will be called ?*x and ?*y for use. The second item in the list is another list. This list contains strings, each of which is a response. If Eliza managed to save anything to ?*x and?*y it will select one of these response strings at random. Then using the saved variables it will create a new string. There is also this stage where Eliza converts things like you to me and me to you in the values it saves. So given this rule if I were to tell Eliza, “Hey you, I want to make you look simple.” It might respond, “What would it mean if you got to make me look simple.”

Eliza has a lot of rules and most of them are kind of vague. It allows you to become convinced that it knows more than it really does by being vague. It parrots things back, but leaves things open ended. Sometimes it is going to mess up and give really weird responses, but with enough patterns it can lead you into fooling yourself for a while.

Fun fact: at one point Eliza almost got a job. Prestigious medical journals were actually wondering if using Eliza to fill in for actual psychiatry might be a good idea. They wouldn’t have to pay it much after all. Remember when I said the author was a little bothered by how seriously people were taking the program? I wonder what sort of things might have brought that on.

Despite all of this, it was actually pretty cool to implement. Usually I would use regular expressions to match a pattern in a string, but this time I got to see how matching patterns can be done at a lower level. Now if you’re interested in all of that you can check out some of the code in a github gist. Be warned…. I’m using mutually recursive functions and there are a whole lot of parentheses.

Eliza isn’t the only program in that book, but they tend to be the one people know about. Student is another program I learned about through PAIP. It is a clever little bit of code that can solve word problems that wouldn’t be out of place in a high-school algebra class.

The root of most equations in which you are solving for one variable is isolating that one variable. You just keep moving things around using simple rules until the unknown is the only thing on one side of the equals sign. At that point things are solved. So once you type out exactly how to invert each of the mathematical operations that Student is likely to encounter, solving the equations becomes trivial.

However, we don’t give Student equations. We give it paragraphs. It is solving word problems.

A modern reader might suppose this means that advanced AI is being used to understand language and then on the basis of that understanding the text is parsed into equations. Actually, it is far simpler than that. The parsing of word problems into equations in Student is the same technique I’ve talked about before in Eliza. They both use extremely basic pattern matching.

There are a few differences between the two programs. For example, the pattern matching code of Eliza is ready to quit after finding its first match, but Student recursively matches every part of its input. If you give it a paragraph it will keep on breaking the input up until it has a series of equations which model the problem that was given.

Like Eliza, the short fall of the program is that if you give it a bit of math that doesn’t fit into the patterns that were supplied for it, it won’t behave properly.

Yet again, no generalization.

NLU

Now I’m working at Amazon on Alexa. The world has transformer architectures that actually are trying to generalize such that they understand language.

If you want to be notified when I post new content you can subscribe to the site's Radio waves; RSS/Atom feed or

If you think this content might valuable to someone else please consider sharing it with them.