The confounding characteristics of language—an introduction to understanding what happens when computers look at language.
Many years ago, I took a shortcut while walking to class. I passed through part of a building that was primarily used for drafting classes. The halls were very quiet… so quiet that I actually tried to silence my footfalls so as not to disturb the students in various rooms as they toiled at large drafting tables (this was before the days of AutoCAD, I’m afraid). My thoughts wandered as I walked. I casually glanced at a sign, one that we all see so often even today that we scarcely give it a second thought. It read “Warning: Keep This Door Closed at All Times!” Unfortunately, in my pensive mood, a thought occurred to me that I found uncontrollably hilarious. It was the exclamation point that did me in, as if the sign were not emphatic enough. This sign meant business. That’s what made me wonder: If the door is really to be kept closed at ALL TIMES, why didn’t they just have a wall there? I burst out laughing at my private joke, which promptly caused a professor to step into the hallway and ask me to please go somewhere else. I apologized and sort of snickered myself out of the building. Of course, the sign wasn’t really that funny. It was my interpretation that made me laugh.
A memory of this fun episode was sparked during a recent CXOTalk I had the pleasure of doing with Dr. Stephen Wolfram about the Future of Computing and Artificial Intelligence, as we attempted to break down emerging trends in artificial intelligence, natural language processing, and the rapid digitization of nearly every aspect of life.
As a follow-up to that conversation, and considering all of the recent hype around these topics, I thought it might be a good time to resurface the concept so wonderfully highlighted by bad grammar, missing punctuation, and alternative interpretations everywhere… something I refer to as the confounding characteristics of language.
Natural Language Processing: A very unnatural science
The field of Computational Linguistics gives us some amazing capabilities. We have developed the ability to take strings of spoken or written words and make them consumable to all sorts of algorithmic interpretation. The range of capabilities is well beyond the scope of this document, but let us consider a few basic tools of the trade. First and foremost, just like we did when we were in school learning to read and write, we teach computers the ability to recognize words, punctuation marks, and basic sentence structures. Consider the sentence “See Spot run.” This is one of the first sentences many of us learned to read in the famous “Dick and Jane” readers by William Gray and Zerna Sharp. This sentence appears to be fairly simple. On further examination, however, it has some nuance. In computational linguistics, one of the initial tricks is to find the verb first – so in this case we find the verb “see.” The subject of this verb is implied (you), so our sentence is not so simple. The word “Spot” could be confused with a verb, but it is capitalized, which gives us a clue that it is actually a proper noun. Thus, with a little magic, we organize the sentence into a command where the author is speaking to the reader, directing the reader to observe someone or something named “Spot” as it “runs.”
The process above is not too dissimilar from how a parsing algorithm works. If an algorithm sees the sentence “Apple announced a new product today,” the algorithm can find the verb “announced,” assign the actor of the verb to a proper noun called “Apple,” and further distinguish the company from the fruit based on the fact that a fruit cannot plausibly be the actor of a verb like “announced.” Chalk up one point for science.
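To make the verb-first trick concrete, here is a minimal sketch in Python. It is a toy, not how a production parser works: the verb list `KNOWN_VERBS` and the function `rough_parse` are hypothetical names invented for this illustration, and the only clues used are the two described above (a list of known verbs, and capitalization as a hint for proper nouns).

```python
# Toy sketch of "find the verb first"; KNOWN_VERBS is an illustrative assumption.
KNOWN_VERBS = {"see", "run", "spot", "announced"}

def rough_parse(sentence):
    """Return (subject, verb, rest) using a verb list and capitalization only."""
    words = sentence.strip(".!?").split()
    for i, word in enumerate(words):
        # A capitalized word after the first position is treated as a
        # proper noun, so "Spot" is not mistaken for the verb "spot".
        is_proper_noun = i > 0 and word[0].isupper()
        if word.lower() in KNOWN_VERBS and not is_proper_noun:
            # Nothing before the verb means the subject is the implied "you".
            subject = " ".join(words[:i]) or "(you)"
            return subject, word.lower(), words[i + 1:]
    return None, None, words

print(rough_parse("See Spot run."))
print(rough_parse("Apple announced a new product today."))
```

Note how fragile the capitalization clue is: a sentence that begins with “Spot” gives the sketch no way to tell the dog from the verb.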
Computational linguistics has advanced to the point where very sophisticated parsing can take place. A sentence like this one, which has a parenthetical expression embedded within it, is no problem for modern capabilities, even when the sentence is quite long and suddenly contains subsequent subordinated clauses such as this example illustrates. Using modern techniques, as long as the grammar is well understood and the text is reasonably well written, all sorts of fascinating capabilities are possible.
Of course, sometimes, these very same algorithms fail rather spectacularly. There is an apocryphal tale about one of the original digital assistants being activated by a user who said “Call me an ambulance!” to which the assistant allegedly responded, “Very well, from now on, I will refer to you as an ambulance.” What makes this sort of thing happen is not the recognition of the words, but the ambiguity of the articulation itself. Language, it turns out, with all of its rules and grammar, is a notoriously nuanced way of communicating.
Computational linguistics and machine interpretation of language are very sophisticated, but the ways in which human speakers use language contain tremendous nuance, much of which can be very difficult to detect completely. This difficulty is one of the fundamental challenges in dealing with written and spoken articulations.
The Tools: Turning words into math
There are many reasons for making language computable, or consumable to an algorithm. Initially, the capability was designed to help us interface with computers themselves. For example, by using certain keywords, we could write instructions that later could be transformed into machine language that controlled the functioning of a computer. As the capability matured, we moved from simply using language to talk to computers to allowing the computers to do basic interpretation of language in order to understand human-to-human interaction.
Early examples of this human-to-human interaction included basic translation capabilities, keyword searching for similar concepts in large bodies of text, and grammar analyzers that attempted to help writers avoid common grammar errors. Today, these tools have become significantly more sophisticated, including capabilities such as sentiment analysis, where algorithms attempt to quantify the mean mood or sentiment of a body of language.
Consider the following three sentences:
- This is an apple.
- This is a good apple.
- This is an excellent apple.
We can easily imagine an algorithm assigning a score of 0, for neutral, to the first articulation, and 1 and 2 respectively to the remaining articulations, allowing us to rank-order these three statements by how good the speaker feels. Furthermore, using entity extraction (the assignment of grammar and parts of speech explained above), we can conclude that all three sentences might be referring to the same subject, an apple.
We could even further allow for negative sentiment, such as assigning a score of -1 to “This is a bad apple.” or -2 to “This apple is horrible.”
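A minimal sketch of such a scoring scheme, in Python, might look like the following. The word list `SENTIMENT_LEXICON` and the function `score` are hypothetical, chosen only to reproduce the imaginary scale used in this article; real sentiment tools are considerably more elaborate.

```python
# Hypothetical word scores matching the imaginary scale in the text.
SENTIMENT_LEXICON = {"good": 1, "excellent": 2, "bad": -1, "horrible": -2}

def score(sentence):
    """Sum the scores of known sentiment words; unknown words count as 0."""
    words = sentence.lower().strip(".!?").split()
    return sum(SENTIMENT_LEXICON.get(word, 0) for word in words)

for text in ["This is an apple.", "This is a good apple.",
             "This is an excellent apple.", "This is a bad apple.",
             "This apple is horrible."]:
    print(score(text), text)
```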
By looking at thousands or millions of articulations, it is easy to imagine coming up with mean sentiment, histograms of potential subject matter, lists of synonyms, etc. Consider doing something like that over time in social media, for example, and you have the essence of the science of “social listening.” Such capabilities are crucial to modern approaches to detecting shifts in sentiment about organizational image, products, and other more nuanced concerns such as ethics and social responsibility.
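As a rough illustration of that aggregation step, the Python sketch below assumes each articulation has already been reduced to a (subject, score) pair. The sample data and the function name `summarize` are invented for this example.

```python
from collections import Counter
from statistics import mean

# Invented sample: (extracted subject, sentiment score) per articulation.
scored_posts = [("apple", 0), ("apple", 1), ("apple", 2), ("pear", -1)]

def summarize(posts):
    """Return the mean sentiment and a histogram of subjects mentioned."""
    mean_sentiment = mean([post_score for _, post_score in posts])
    subject_histogram = Counter(subject for subject, _ in posts)
    return mean_sentiment, subject_histogram

mean_sentiment, subject_histogram = summarize(scored_posts)
print(mean_sentiment)
print(subject_histogram.most_common())
```

Running the same summary over successive time windows is one simple way to watch for the shifts described above.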
Before we get too comfortable, however, let’s consider another example. “Apples are good if they are fresh.” You can see how an algorithm might score such a sentence as positive, but there is an important nuance. Without looking at the rest of the articulation that follows, we might not know if this is a positive statement or not. In fact, sentiment attribution often reaches different conclusions depending on the length of text subsumed into a single articulation. Many tools are not sophisticated enough to consider such nuance, and are prone to oversimplification. A typical counter-argument is that if you look at enough language, such small errors wash out in the math. Reality, however, shows that a number of preconditions must be met before we can ignore complexity so cavalierly. (For example, if you have an extremely large collection of very short articulations, such as tweets, such an argument would be more valid than if you had an extremely large collection of hotel reviews.)
Beware of using simple tools. Understand the preconditions that must be met before it is appropriate to assume that such tools are valid.
Confounding Characteristics: Run, Spot, run!
Consider another sentence that many of us learned to read: “Run, Spot, run!” Is this three verbs? Are we being asked to run, to spot, and to run? Of course not. By now, we know that the dog is named Spot (largely because of the animation), and we are telling the dog to run. As I look back on this first experience with reading, part of me wonders how we figured it all out. How are we not confused by a dog named Spot? Wouldn’t it have been easier if his name was Fido, which couldn’t be confused with a verb? It turns out, our brains are great at fixing context to avoid such confusion, even at the tender age of 4 or 5.
Unfortunately, modern computer methods of understanding language aren’t always so adept. Consider this articulation: “ABC Megatron is an excellent company.” We can imagine an algorithm finding the verb “is” and deciding therefore that this is a comparison between “ABC Megatron” and “company.” ABC Megatron is a company. What kind of company? Excellent company. Assign a score of 2 (using a similar scale to the imaginary one described above).
However, the articulation continues: “ABC Megatron is an excellent company, if you like.”
We might imagine that we have a set of qualifying expressions with rules. The phrase “if you like” is found in a list of qualifying expressions along with “more or less” and “so to speak” and taken to be an equivocation. Equivocation reduces sentiment slightly, so we will reduce our sentiment from very positive, or 2, to slightly less positive, or 1.
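A sketch of that rule in Python could look like this. The list `QUALIFIERS` and the function `apply_equivocation` are hypothetical, and the sketch only checks whether the articulation ends with a qualifier, which matches the sentence as we have read it so far.

```python
# Hypothetical list of qualifying expressions taken from the text.
QUALIFIERS = ("if you like", "more or less", "so to speak")

def apply_equivocation(base_score, sentence):
    """Move the score one step toward neutral if the sentence ends in a qualifier."""
    text = sentence.lower().rstrip(".!?")
    if any(text.endswith(qualifier) for qualifier in QUALIFIERS):
        if base_score > 0:
            return base_score - 1
        if base_score < 0:
            return base_score + 1
    return base_score

print(apply_equivocation(2, "ABC Megatron is an excellent company, if you like."))
```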
However, the articulation continues: “ABC Megatron is an excellent company, if you like destroying the environment!”
Immediately our human brains see this statement as sarcasm. To get an algorithm to do the same magic as our intuitive brains requires quite a bit of sophistication:
- Recognize that “if” sets up two clauses. The initial clause (starting with “ABC Megatron”) is independent, while the latter (starting with “if you like destroying”) is dependent on it as a condition.
- Recognize that the conditional clause is very negative.
- Do this by realizing that “destroying” is bad if you destroy something good, or good when you destroy something bad.
- The environment is good, so the conditional clause is very bad.
- Recognize the juxtaposition of the two clauses, a very positive claim that holds only under a very negative condition, as a form of sarcasm.
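The steps above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the tiny word lists, the weight of 2 given to “destroying,” and the function names `clause_score` and `looks_sarcastic`. The sketch simply scores the text before and after “if” and flags the sentence when a positive claim is paired with a negative condition.

```python
# Illustrative assumptions: tiny word lists and an arbitrary weight of 2.
POSITIVE_WORDS = {"excellent": 2, "good": 1}
OBJECT_VALUE = {"environment": 1}   # +1 means "generally considered good"
INVERTING_VERBS = {"destroying"}

def clause_score(clause):
    """Score a clause; an inverting verb flips the value of what follows it."""
    words = clause.lower().strip(".!?,").split()
    score = sum(POSITIVE_WORDS.get(word, 0) for word in words)
    for i, word in enumerate(words):
        if word in INVERTING_VERBS:
            # Destroying something good is bad; destroying something bad is good.
            target_value = sum(OBJECT_VALUE.get(w, 0) for w in words[i + 1:])
            score -= 2 * target_value
    return score

def looks_sarcastic(sentence):
    """Flag a positive claim followed by a negative 'if' condition."""
    before_if, separator, after_if = sentence.partition(", if ")
    if not separator:
        return False
    return clause_score(before_if) > 0 and clause_score(after_if) < 0

print(looks_sarcastic(
    "ABC Megatron is an excellent company, if you like destroying the environment!"))
```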
This sort of analysis will pick up sarcasm in only this one form, which, although common, is far from the only one. There are many other types of sarcasm (“nice shoes!”) that depend on tone, context, or other nuance beyond the current state of linguistic inference capabilities.
Sarcasm is only one confounding characteristic of language. There are others, such as neologism (making up new words, such as hashtags), intentional misspellings (such as “ABC Megagroan” instead of “ABC Megatron”), and inclusion of foreign language (such as saying “he had a moment of déjà vu”). All of these confounding usages of language require special consideration. It turns out, we have a way to go before we can declare victory on fully understanding something so sophisticated as sentiment. Furthermore, sentiment is only one of the many things we assess to understand language in all of its many varied uses.
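As one small example of that special consideration, intentional misspellings can sometimes be mapped back to a known name with fuzzy string matching. The Python sketch below uses the standard library’s `difflib.get_close_matches`; the entity list, the function name `normalize_entity`, and the 0.8 cutoff are assumptions chosen for illustration, not recommended settings.

```python
import difflib

# Hypothetical list of names we already know how to recognize.
KNOWN_ENTITIES = ["abc megatron", "apple"]

def normalize_entity(mention, cutoff=0.8):
    """Return the closest known entity, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(
        mention.lower(), KNOWN_ENTITIES, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(normalize_entity("ABC Megagroan"))
print(normalize_entity("nice shoes"))
```

Recovering the name is only half the job, of course: the misspelling itself carries sentiment that a simple lookup throws away.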
We use language tools to do all sorts of things with language today. We do so knowing that these tools are imperfect, that language continues to change, and that our assumptions must always be tempered with the reality that our brains are much better at understanding language than the tools we use to synthesize language.
I hope that this short introduction to the finer points of understanding what happens when computers look at language helps to inspire. If we get better at this capability, we can do a better job of understanding shifts in sentiment. We can get better at understanding the impact that leaders have on organizations, or perhaps at understanding when important messages are not well received. We can get better at recognizing important minorities speaking in vast crowds with nearly undetectable, but crucially important, things to say. The science of understanding unstructured data, such as collections of words in all of their varied forms, is one that promises to help us understand ourselves better, and maybe to open a few very important new doors.