Writing a Natural Language Parser in C# Part 3–CommandProcessor and ConversationContext

March 25, 2012 at 6:51 AM, Administrator

This post is part of a series on creating a natural language processor in C#. The other entries in this series are:

Writing a Natural Language Parser in C# Part 1–Why?
Writing a Natural Language Parser in C# Part 2 - Architecture
Writing a Natural Language Parser in C# Part 4–Tokens 
Writing a Natural Language Parser in C# Part 5 - Questions and Rules

The command processor is the class that controls the flow of parsing a request and replying to it.  Below is a full listing of the class.

CommandProcessor class
    using System;
    using System.Collections.Generic;
    using System.ComponentModel.Composition;   // MEF: [Export], [Import]

    [Export]
    [PartCreationPolicy(CreationPolicy.NonShared)]
    public class CommandProcessor
    {
        [Import]
        private QuestionManager _questionManager { get; set; }

        public void ProcessCommand(string command, Guid userId,
            SmartHome.Global.ConversationMode mode,
            string address,
            Action<string> callback)
        {
            // Normalize the input: lower case, apostrophes removed.
            string localCommand = command.ToLower().Replace("'", "");

            var context = ServiceLocator.GetInstance<ConversationContext>();
            context.Init(userId, mode, address, callback);

            // Tokenize the normalized text; log the request as the user typed it.
            var tokenManager = ServiceLocator.GetInstance<TokenManager>();
            var buckets = tokenManager.TokenizeInput(localCommand, userId);
            context.LogRequest(command, buckets);

            // Before we check the tokens against rules, let's see if we match any questions.
            Question question = null;
            List<Token> tokens = new List<Token>();

            _questionManager.CheckForMatchingQuestion(buckets, mode, userId,
                address, ref question, ref tokens);

            if (question != null)
            {
                // The request answers a pending question: run its handler and retire it.
                question.ExecuteIfAnswered.Invoke(context,
                    question.State, tokens);

                _questionManager.RemoveQuestion(question);

                return;
            }

            RuleMethod ruleMethod = RuleManager.LocateMatchingRule(buckets, context);

            if (ruleMethod != null)
            {
                // Rules are static methods, so there is no target instance.
                ruleMethod.Rule.Invoke(null, ruleMethod.PassIns);
            }
            else
            {
                context.Say("I didn't understand your request", null);
            }
        }
    }

First of all, you can see that MEF is being used as an IoC container.  The only method in this class is ProcessCommand, which takes the following parameters:

  • A string that is the command the user sent in, “What’s the weather” for example.
  • The user ID.  This is a Guid because we are using the Membership provider for authentication.
  • The conversation mode, an enumeration whose values identify the channel the command arrived on (email, IM, etc.).
  • An address, as a string.  We can tell from the mode what channel the user is on, but if they have multiple IM accounts or email addresses, we also need the address to reply to them on the right account.
  • A callback to be invoked after the command is processed.  This is primarily here for commands sent to the system over a duplex WCF service.  To reply to the user on such a service, we have to respond on the callback interface defined by the WCF service, and this parameter makes that possible.

To start off, we clean the command up a bit by making it lower case and removing apostrophes.  Next, we create an instance of ConversationContext and initialize it.  We’ll look at this class in just a minute.  Then we create the TokenManager and pass the user’s command to it.  What it passes back is a Dictionary<int, List<TokenResult>> where the int is a start position at which tokens were found and the List contains all the tokens that were found there.  We’ll look at the tokenization of the string in detail in the next part of this series.
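To make the shape of that return value concrete, here is a minimal sketch of grouping token results by start position.  The TokenResult class below is a hypothetical stand-in with only the fields needed for illustration, and BucketSketch.GroupIntoBuckets is an illustrative name, not part of the real TokenManager.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the series' TokenResult; the fields are assumptions.
public class TokenResult
{
    public int Start { get; set; }      // index in the input where the match begins
    public int Length { get; set; }     // number of characters matched
    public string Kind { get; set; }    // e.g. "TokenList", "TokenWeather"
}

public static class BucketSketch
{
    // Groups results so that each key is a start position and each value
    // holds every token result found at that position.
    public static Dictionary<int, List<TokenResult>> GroupIntoBuckets(
        IEnumerable<TokenResult> results)
    {
        var buckets = new Dictionary<int, List<TokenResult>>();
        foreach (var result in results)
        {
            List<TokenResult> bucket;
            if (!buckets.TryGetValue(result.Start, out bucket))
            {
                bucket = new List<TokenResult>();
                buckets[result.Start] = bucket;
            }
            bucket.Add(result);
        }
        return buckets;
    }
}
```

Two results that both start at position 0 end up in the same bucket, which is how one stretch of text can carry several interpretations at once.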

Since the system saves all messages sent by the system or the user, we next log the user’s request.  The buckets are passed in because they will be serialized and saved along with the command.  When future requests come in, this will allow us to pull any past request from the database, deserialize it and inspect it.

In some cases, the system will ask clarifying questions of the user.  For example, if you ask the system to remind you of something, it will need to know when to remind you and how.  So, it will ask you these questions and wait for your reply.  In the next code block we use the QuestionManager to check whether the current request is actually the answer to a previously asked question.
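As a rough illustration of that check, the sketch below keeps a list of open questions and looks for one that the current reply could answer.  Everything here is an assumption made for the example: PendingQuestion, QuestionManagerSketch and FindMatch are hypothetical names, expected replies are reduced to token type names, and the real QuestionManager (covered in Part 5) works on buckets and ref parameters instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified question: expected replies are plain token type names.
public class PendingQuestion
{
    public Guid UserId { get; set; }
    public string Address { get; set; }
    public List<string> ExpectedReplyKinds { get; set; }
    public Action<string> ExecuteIfAnswered { get; set; }
}

public class QuestionManagerSketch
{
    private readonly List<PendingQuestion> _questions = new List<PendingQuestion>();

    public void AddQuestion(PendingQuestion question) { _questions.Add(question); }
    public void RemoveQuestion(PendingQuestion question) { _questions.Remove(question); }

    // Returns the newest open question for this user and address that accepts
    // the kind of token found in the reply, or null if the reply answers nothing.
    public PendingQuestion FindMatch(Guid userId, string address, string replyKind)
    {
        return _questions.LastOrDefault(q =>
            q.UserId == userId &&
            q.Address == address &&
            q.ExpectedReplyKinds.Contains(replyKind));
    }
}
```

When FindMatch returns a question, the caller would invoke ExecuteIfAnswered and then remove the question, mirroring the flow in ProcessCommand.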

If the command is not the answer to a question, we need to check whether it matches one of the rules we have defined.  Rules are how you specify the sentences the system should recognize and what it should do when it receives one.  The RuleManager class compares our buckets with the rules to find a match.  If one is found, it is invoked.  Otherwise, the system replies that it did not understand the command.

Notice that the response is sent via the Say method on the ConversationContext class.  Since this functionality completes the processing cycle, let’s look at that class next.

ConversationContext

As has been said before, the purpose of this class is to manage the conversation’s details such as the user it belongs to, its mode, address and history.  Below is a listing of this class.

ConversationContext Class
    using System;
    using System.Collections.Generic;
    using System.ComponentModel.Composition;   // MEF: [Export], [Import]
    using System.Data.Objects.DataClasses;     // IEntityWithChangeTracker
    using System.IO;
    using System.Runtime.Serialization;        // DataContractSerializer
    using System.Text;
    using Microsoft.Practices.Prism.Events;    // IEventAggregator

    [Export]
    [PartCreationPolicy(CreationPolicy.NonShared)]
    public class ConversationContext
    {
        public User ConversationUser { get; set; }
        public Conversation Conversation { get; set; }
        public List<ConversationHistory> ConversationHistory { get; set; }
        public SmartHome.Global.ConversationMode Mode { get; set; }
        public Action<string> Callback { get; set; }
        public string Address { get; set; }

        [Import]
        private IEventAggregator EventAggregator { get; set; }

        [Import]
        public QuestionManager QuestionManagerReference { get; set; }

        // Loads the user, the conversation and its history for this request.
        public void Init(Guid userId, SmartHome.Global.ConversationMode mode,
            string address, Action<string> callback)
        {
            ConversationUser = UserData.GetUserByUserId(userId);
            Conversation = ConversationData.GetConversationByUserAndMode(
                userId, mode, address);
            ConversationHistory = ConversationData.GetConversationHistory(
                Conversation.ConversationId);
            Mode = mode;
            Callback = callback;
            Address = address;
        }

        // Saves the user's request, with its serialized tag, to the history.
        public void LogRequest(string request, object tag)
        {
            string tagString = SerializeTag(tag);
            var history = ConversationData.CreateConversationHistory(
                Conversation.ConversationId, request, tagString, tag, true);
            ConversationHistory.Add(history);
        }

        // Publishes a reply; the channel classes listening for the event deliver it.
        public void Say(string comment, object tag)
        {
            string tagString = SerializeTag(tag);

            EventAggregator.GetEvent<ReplyToChannelEvent>().Publish(
                new ReplyToChannelEventArgs
                {
                    Mode = (SmartHome.Global.ConversationMode)Conversation.Mode,
                    Reply = comment,
                    TagString = tagString,
                    Tag = tag,
                    ConversationId = Conversation.ConversationId,
                    Callback = Callback,
                    UserId = ConversationUser.UserId,
                    Address = Address
                });
        }

        // Serializes a tag to XML so it can be stored with the history entry.
        private string SerializeTag(object tag)
        {
            if (tag == null)
                return string.Empty;

            // Detach change tracking before serializing an entity.
            var tracked = tag as IEntityWithChangeTracker;
            if (tracked != null)
                tracked.SetChangeTracker(null);

            var ser = new DataContractSerializer(tag.GetType());
            using (var ms = new MemoryStream())
            {
                ser.WriteObject(ms, tag);
                // WriteObject emits UTF-8 when writing to a stream, so decode as UTF-8.
                return Encoding.UTF8.GetString(ms.ToArray());
            }
        }

        // Registers a question with the QuestionManager and sends its text to the user.
        public void AskQuestion(string text, List<Token> expectedReplies, object state,
            Action<ConversationContext, object, List<Token>> executeOnAnswer)
        {
            Question question = new Question
            {
                Address = Address,
                ExecuteIfAnswered = executeOnAnswer,
                ExpectedReplys = expectedReplies,
                Mode = Mode,
                PosedDateTime = DateTime.Now,
                UserId = ConversationUser.UserId,
                State = state,
                QuestionText = text
            };

            QuestionManagerReference.AddQuestion(question);

            Say(text, null);
        }
    }

At the top of this class are properties that hold the instance’s state.  The Init method initializes them.

Next is the LogRequest method we saw earlier that serializes the tag that is passed in (the buckets) and writes the request to the database.

The next method is Say().  We saw this method being used in the CommandProcessor class to reply to the user.  It’s also used by the various rules for the same purpose.  If you ask the system about the weather, it will send the details back to you with the Say method.  As you can see, I’m using the EventAggregator from the Prism library to make the components of the system more loosely coupled.  The Say method serializes the tag parameter just like the LogRequest method does and then raises an event asking someone to reply to the user.  There are various classes responsible for communicating with the user on different channels.  These classes listen for this event and, when they receive it, inspect the event args to see if they should respond.  If so, they send the response to the user and log the communication to the database.
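The listener side of that event is not shown in this post, so here is a small sketch of what a channel class might do.  To keep it self-contained it does not use Prism at all: ReplyHub is a hand-rolled stand-in for an event aggregator, and EmailChannel, the ConversationMode values and the trimmed-down ReplyToChannelEventArgs are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public enum ConversationMode { Email, InstantMessage }   // hypothetical values

// Trimmed-down event args; the real class carries more fields.
public class ReplyToChannelEventArgs
{
    public ConversationMode Mode { get; set; }
    public string Reply { get; set; }
    public string Address { get; set; }
}

// Minimal stand-in for an event aggregator: publishers and subscribers
// share only this hub and the event args type.
public class ReplyHub
{
    private readonly List<Action<ReplyToChannelEventArgs>> _handlers =
        new List<Action<ReplyToChannelEventArgs>>();

    public void Subscribe(Action<ReplyToChannelEventArgs> handler)
    {
        _handlers.Add(handler);
    }

    public void Publish(ReplyToChannelEventArgs args)
    {
        foreach (var handler in _handlers)
            handler(args);
    }
}

// A channel class that only reacts to replies meant for its own mode.
public class EmailChannel
{
    public List<string> Sent { get; private set; }

    public EmailChannel(ReplyHub hub)
    {
        Sent = new List<string>();
        hub.Subscribe(OnReply);
    }

    private void OnReply(ReplyToChannelEventArgs args)
    {
        if (args.Mode != ConversationMode.Email)
            return;                                  // another channel's reply; ignore it

        // Real code would send the mail and log it; here we just record it.
        Sent.Add(args.Address + ": " + args.Reply);
    }
}
```

The point of the design is that the publisher never references a channel class directly, so adding a new channel means adding a new subscriber and nothing else.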

Context

Looking at the listing, the next method is a helper that serializes objects into XML so they can be written to the database.  This is a very important piece of functionality.  When a rule is invoked by the system, it is passed an instance of the ConversationContext class.  You can see the ConversationHistory property on this class, which contains the entire history of the conversation.  Most of these entries have a piece of context that helps either to explain why the system responded the way it did or to store a piece of information that may be useful later.  For example, suppose you ask the system to list all the reminders it has for you.  These will be returned to you in a numbered list.  Suppose you’d like to delete the third item in the list.  You could do that by telling the system to “delete 3”.  How does the system know what was listed, much less which item in the list was number three?  It does this by walking backward through the conversation history until it finds a list that was stored in the tag.  It then uses that list.
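The backward walk can be sketched roughly as follows.  This is an illustration under assumptions, not the system’s actual code: HistoryEntry is a hypothetical, trimmed-down history record, the list is assumed to be a List<string>, and a failed deserialization is used as the signal that a tag holds some other type.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Hypothetical, trimmed-down history entry; only the serialized tag matters here.
public class HistoryEntry
{
    public string Text { get; set; }
    public string TagString { get; set; }
}

public static class HistorySketch
{
    // Serializes a tag to an XML string with DataContractSerializer.
    public static string Serialize(object tag)
    {
        var ser = new DataContractSerializer(tag.GetType());
        using (var ms = new MemoryStream())
        {
            ser.WriteObject(ms, tag);
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }

    // Walks the history from newest to oldest and returns the first tag
    // that deserializes as a list of strings, or null if there is none.
    public static List<string> FindMostRecentList(IList<HistoryEntry> history)
    {
        var ser = new DataContractSerializer(typeof(List<string>));
        for (int i = history.Count - 1; i >= 0; i--)
        {
            if (string.IsNullOrEmpty(history[i].TagString))
                continue;                            // entry has no tag

            try
            {
                var bytes = Encoding.UTF8.GetBytes(history[i].TagString);
                using (var ms = new MemoryStream(bytes))
                {
                    return (List<string>)ser.ReadObject(ms);
                }
            }
            catch (SerializationException)
            {
                // The tag holds some other type; keep walking backward.
            }
        }
        return null;
    }
}
```

With the list in hand, “delete 3” becomes a lookup of the item at index 2.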

As another example, suppose you ask the system to remind you to meet with Phil.  The system will ask you what time you would like to be reminded.  You reply that you’d like to be reminded at 4:00 pm.  How does the system know what you’re talking about?  You guessed it: by storing the reminder as a tag.  This is a powerful idea and allows the system to have a stateful conversation over a stateless protocol, much as ASP.NET does with its serialized state.

The last method in the ConversationContext class is AskQuestion().  This is similar to Say() in that it also raises the event to have the reply sent back to the user.  However, it also adds the question to a collection maintained by the QuestionManager.

Next time, we’ll look at the TokenManager and how the commands are tokenized.

Posted in: .Net | Natural Language Processing


Writing a Natural Language Parser in C# Part 2 - Architecture

March 18, 2012 at 11:16 AM, Administrator

This post is part of a series on creating a natural language processor in C#. The other entries in this series are:

Writing a Natural Language Parser in C# Part 1–Why? 
Writing a Natural Language Parser in C# Part 3–CommandProcessor and ConversationContext 
Writing a Natural Language Parser in C# Part 4–Tokens
Writing a Natural Language Parser in C# Part 5 - Questions and Rules

In our last post we discussed why one might be interested in building and using a natural language processor in a business or home project.  This week, I’d like to look at the architecture of the processor I’ve written.  This will be a high-level look at the various pieces of the system, the flow of processing a single sentence and how each piece contributes to the process.

An Activity Diagram

Below is an activity diagram depicting the interaction of the major parts of the system and how they work together to process a sentence sent to the system by the user.

[Figure: activity diagram]

The steps involved in this process are as follows:

  • The string the user submits, along with a user ID and other bits of information, is submitted to the command processor.
  • The command processor creates an instance of a conversation context.  This is a very important part of the system as a whole.  It keeps track of which user made the request and what method was used to communicate it (email, IM, etc.), and it has a history of all statements that have occurred in both directions for the duration of the conversation.
  • The system contains a collection of Tokens that know how to inspect the input for certain strings or conditions.  They generate a TokenResult instance for each interesting piece of the statement and record where the interesting part begins in the string and how long it is.  The token result also records a strong type that indicates the kind of thing that was found, such as a certain phrase, a date or the name of something it knows about.
  • After all the tokens have had a chance to process the string, the resulting TokenResults all exist in a single collection.  This collection is then organized into buckets where each bucket corresponds to a start position in the string.  For example, all interesting things found to begin at position zero would be in a bucket together.  How can there be more than one token result for the same phrase?  Well, consider a string that contains the numeral “1”.  This could represent an integer, a long, a decimal, an ordinal (think “first” as in the first day of the month or first day of the week), etc.
  • Rules are methods that return void and have any number of parameters, each of which is a type of token.  These rules are all defined in classes that are marked with a particular interface and are obtained via reflection.  After all the token result instances have been organized into buckets, each of these methods is inspected and its parameters are compared to the contents of each bucket, in order, to see if a match for the parameter is found in the corresponding bucket.
  • Once a matching rule is found, it is executed by passing in the conversation context and all the matching tokens.  The context can be used from within the method to inspect the history of the conversation and also to send responses back to the user.

An Example

Let’s look at a simple example.  Even if this example does not make complete sense to you right now, it will become clear as we look into each piece of the system in more detail.

Let’s say the user sends in a request, “What’s the weather”.  Once the tokens have all had a chance to look at this string, we will have a collection of token results.  In fact, we will have two results.  The first will say that “What’s the” can be tokenized into a TokenList beginning at position 0 in our string.  The second will say that the remainder of the string can be tokenized into a TokenWeather beginning at position 10 (I know this seems like an incorrect position, but I will explain in a future post).

Now, these two tokens will be put into a couple of buckets and the system will begin to go through all the defined rules.  One of these rules has a signature like this:

Weather Rule
    public static void GetWeather(ConversationContext cContext, TokenList list, TokenWeather weather)

Since TokenList matches what’s in our first bucket and TokenWeather matches what’s in the second, this will be the matching rule and will be executed.
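The matching step for this example can be sketched with a little reflection.  The types below are hypothetical stand-ins that share names with the real ones, and RuleMatchSketch.TryMatch is an illustrative helper that assumes exactly one token parameter per bucket; the real matcher is covered later in the series and may well be more flexible.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical stand-ins for types from the series; the real ones carry more state.
public class ConversationContext
{
    public string LastReply { get; private set; }
    public void Say(string comment) { LastReply = comment; }
}
public abstract class Token { }
public class TokenList : Token { }
public class TokenWeather : Token { }

public static class WeatherRules
{
    public static void GetWeather(ConversationContext cContext, TokenList list, TokenWeather weather)
    {
        cContext.Say("Sunny");   // placeholder reply for the sketch
    }
}

public static class RuleMatchSketch
{
    // Returns the arguments to invoke the rule with, or null if the rule
    // does not match.  The buckets must be ordered by start position.
    public static object[] TryMatch(MethodInfo rule, ConversationContext context,
        IList<List<Token>> buckets)
    {
        // Skip the leading ConversationContext parameter; the rest are token types.
        var tokenParams = rule.GetParameters().Skip(1).ToArray();
        if (tokenParams.Length != buckets.Count)
            return null;

        var args = new object[tokenParams.Length + 1];
        args[0] = context;
        for (int i = 0; i < tokenParams.Length; i++)
        {
            // Look in the i-th bucket for a token of the i-th parameter's type.
            var wanted = tokenParams[i].ParameterType;
            var match = buckets[i].FirstOrDefault(t => wanted.IsInstanceOfType(t));
            if (match == null)
                return null;
            args[i + 1] = match;
        }
        return args;
    }
}
```

If TryMatch returns a non-null array, the rule can be run with rule.Invoke(null, args), since rules are static methods.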

Threading

It should be noted that the entire process diagrammed above executes on a single thread.  However, much like a web server, each request is handled on a separate thread that is initiated by classes that listen on different protocols.
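As a rough picture of that arrangement, a protocol listener could hand each incoming request to its own worker and let the whole pipeline run there.  This sketch assumes a thread-pool task per request, which may not be how the actual listeners create their threads; ListenerSketch and HandleRequestAsync are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class ListenerSketch
{
    // Runs one request from start to finish on a single thread-pool thread.
    // 'process' stands in for the call into the command processor.
    public static Task HandleRequestAsync(string command, Action<string> process)
    {
        return Task.Run(() => process(command));
    }
}
```

Two requests that arrive together are processed independently, while each one still sees a simple, single-threaded flow.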

Posted in: .Net | Natural Language Processing


Writing a Natural Language Parser in C# Part 1–Why?

March 9, 2012 at 11:35 AM, Administrator

Recently, I did a refactor of several key pieces of the natural language processor I created for my smart home.  Spending the time refocusing on how it works convinced me to write a series of posts explaining how it works and how it was built.  In this first installment, I'll look at why an NLP engine is a useful piece of software and other things it could be used for.

Originally Written for Smart House

Obviously, I created this engine for use in my smart house.  As I've said many times, the original idea was inspired by similar work done by Ian Mercer.  By making this a part of my smart house, I can communicate with the system to request information and ask it to carry out work for me using natural language.  What's more, since the interface to my system is nothing more than text, I can communicate through any channel that supports text: email, instant messaging, SMS, etc.  By using text to speech and speech recognition functionality available on multiple platforms, I can even have a conversation with the system through speech.

Is There Room for NLP in Business?

Is there a place for this interface in business? I think so.  UX technologies are changing rapidly.  Touch is now an expected interface on most devices.  Certainly mobile phones are expected to support touch, as are tablet computers.  The Kinect introduced motion gestures for gaming, and this method of interacting with systems is coming to PCs.  The point is that interfaces not involving a keyboard and mouse are becoming more commonplace, and natural language certainly has a place at the table.

Some examples of business use might include:

  1. An ad-hoc query system that lets users ask for what they want from the data source using sentences. 
  2. Personal Assistant software that could ask the user for direction on what to do with emails or documents and allow the user to specify answers in natural language.
  3. Conference room software that controls displays and / or whiteboards via natural language requests.

I've found working with this technology to be a lot of fun and the results have been satisfying.  I invite all of you to follow this series to see how it all works and to contribute your own ideas.

In part 2, we'll look at the architecture of the system.  From there, we'll take deep dives into the individual parts.

Posted in: .Net | Natural Language Processing
