AI, and the data used to advance its learning, is permeating every industry and all facets of enterprise.
In the Temple of Apollo at Delphi, there was an inscription that read γνῶθι σεαυτόν (pronounced “Gnothi se afton”). It means “Know yourself.” These simple words have been written about many times and taken to mean many things. For example, know yourself before you come here seeking answers that you may not be ready to hear. Know yourself and your weaknesses so that you might find greater strength. Knowing has been around for a very long time indeed. Lately, there has been a lot of attention in the media to machines coming to “know” things through artificial intelligence (AI). Wonderful terms, like machine learning, deep learning, and cognitive computing, bring forth all sorts of anthropomorphic wonder at the capabilities of modern computer science. But are these terms realistic? What is really going on when machines learn?
The Basics: Training, recursion, and learning
When I was in grammar school, I did well in many subjects. Spelling, however, wasn’t one of them. Doing my homework, I would call out to my mother “Ma…how do you spell aardvark?” (insert any word for aardvark, I just thought it was funny). “Look it up in the dictionary!” would come the expected reply. “If I knew how to spell it,” I would answer, “then I could look it up in the dictionary!” How was I supposed to know that aardvark starts with two a’s? Later on, as the words got more complicated, people would tell me to write the word down, because if it was spelled right, it would “look” right. This “helpful hint” was honestly never very helpful to me.
This little vignette from my childhood highlights one of the first ways that machines “learn” – through training. Give a machine algorithm authoritative sources, such as dictionaries or large collections of words in common usage, and then ask the machine to tell you if something “looks” right. It will be highly accurate, according to the creators of such methods. If you ask how they measure accuracy, they will tell you that they compare the results of the algorithm against the training sets. If the training is “right,” then accuracy can be measured by comparison. Of course, usage changes. If, in this simple example, we also wanted to accept alternative spellings (British vs. American spelling of “color,” for example), we would need to include training examples of the alternate spellings.
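To make the spelling example concrete, here is a minimal sketch in Python. The word list, the function names, and the labeled examples are all hypothetical, and a real system would load a full dictionary rather than four words. It shows a membership check standing in for “looks right,” a suggestion step for words that do not, and accuracy measured purely by comparison against reference answers.

```python
from difflib import get_close_matches

# Hypothetical, tiny "authoritative source" (assumption for illustration only).
TRAINING_WORDS = {"aardvark", "color", "dictionary", "spelling"}


def looks_right(word, known_words):
    """Return True when the word appears in the training vocabulary."""
    return word.lower() in known_words


def suggest(word, known_words, n=3):
    """Offer the closest known spellings for a word that does not look right."""
    return get_close_matches(word.lower(), sorted(known_words), n=n, cutoff=0.6)


def accuracy(labeled_examples, known_words):
    """Share of (word, expected) pairs where the check agrees with the label."""
    hits = sum(looks_right(word, known_words) == expected
               for word, expected in labeled_examples)
    return hits / len(labeled_examples)


if __name__ == "__main__":
    examples = [("aardvark", True), ("ardvark", False), ("colour", True)]
    print(suggest("ardvark", TRAINING_WORDS))
    print(accuracy(examples, TRAINING_WORDS))               # "colour" is rejected
    print(accuracy(examples, TRAINING_WORDS | {"colour"}))  # after adding it
```

The British spelling only “looks right” once it has been added to the training words, which is the limitation described above: the method knows nothing that its sources do not contain.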
Many web crawling and text mining approaches are based on training. We create a template by having human users look at examples of the thing we are looking for, such as a properly formatted telephone number or a picture of an elephant. Next, we look at millions of pages, comparing what we find to the training set. If the training is good, the results are essentially just as good as people doing the same thing, but machines don’t get tired and they can work as long as we wish. As long as the thing we are looking for doesn’t change much, and the places we are looking at resemble the training sets, these approaches work well. (There are, of course, many other considerations, like the permissible use of such information, but we are only talking about learning here.)
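The simplest kind of template is a hand-written pattern rather than one derived from examples, and that is what the sketch below uses. It assumes North American style numbers such as 555-123-4567 or (555) 123-4567; the pattern and the function name are illustrative assumptions, not a description of any particular crawler.

```python
import re

# Assumed template: three digits (optionally in parentheses), three digits,
# then four digits, separated by a hyphen, dot, or space.
PHONE_TEMPLATE = re.compile(
    r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
)


def find_phone_numbers(page_text):
    """Return every substring of the page that matches the template."""
    return PHONE_TEMPLATE.findall(page_text)


if __name__ == "__main__":
    page = "Call (555) 123-4567 or 555-987-6543 today. Order 1234567890 shipped."
    print(find_phone_numbers(page))
```

A number written in a format the template never anticipated, such as an international one, is silently missed. That is the same caveat as above: the approach works while the pages resemble the training examples.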
Such approaches can even be enhanced with the concept of recursion, allowing subsequent steps to be informed by the output of previous steps. Imagine we had a very large database of text that came from dialog with our customer service organization. Our task is to extract articulations that hint at happy customers. This task is not as simple as looking at a training set, because people express happiness in different ways (using positive words, saying thank you at the end of the call, including key phrases such as “helpful” for example). If we simply tried to create a training set for a task like this, we would only find situations where people said very specific things. (Some might suggest we simply ask the customers if they are happy at the end of the call, but remember we are looking at previously-captured information here, and there are other issues with asking people directly how they feel.)
In this example, a recursive approach might work very well. We could start with asking an algorithm to isolate examples of positive and negative articulations by using long lists (training sets) of words typically taken to be positive or negative. Next, we could cluster those articulations into categories, such as all positive, mostly positive, neutral, mostly negative, and all negative. Having created subsets of the previously very large database, we could now show representative examples of the highly dispositive articulations (all positive, all negative) to human experts for rating (correct/incorrect). Such users could highlight key phrases that they find to reinforce their attribution. Those phrases could then be added to the training set and the process could be repeated until the overall performance approached a useful level for the task at hand. Real-world processes are significantly more complex than this example, but the concepts are essentially the same.
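The loop described above can be sketched in a few lines of Python. Everything here is a simplifying assumption: the seed word lists are tiny, scoring is by single words rather than phrases, and `review` is a hypothetical callback standing in for the human expert, returning whether the label was correct and which words the expert highlighted.

```python
import string

# Hypothetical seed lexicons; real lists would hold thousands of entries.
SEED_POSITIVE = {"thanks", "great", "helpful"}
SEED_NEGATIVE = {"terrible", "angry", "cancel"}


def tokens(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    return [w.strip(string.punctuation) for w in text.lower().split()]


def categorize(text, positive, negative):
    """Place one articulation into one of the five categories."""
    words = tokens(text)
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    if pos and not neg:
        return "all positive"
    if neg and not pos:
        return "all negative"
    if pos > neg:
        return "mostly positive"
    if neg > pos:
        return "mostly negative"
    return "neutral"


def refine(calls, positive, negative, review, rounds=3):
    """Repeat categorize -> expert review -> extend the word lists."""
    positive, negative = set(positive), set(negative)
    for _ in range(rounds):
        new_pos, new_neg = set(), set()
        for text in calls:
            label = categorize(text, positive, negative)
            if label not in ("all positive", "all negative"):
                continue  # only the clear-cut cases go to the expert
            is_correct, highlighted = review(text, label)
            if is_correct:
                target = new_pos if label == "all positive" else new_neg
                target.update(highlighted)
        if new_pos <= positive and new_neg <= negative:
            break  # the expert added nothing new, so stop early
        positive |= new_pos
        negative |= new_neg
    return positive, negative
```

Each pass can only learn from calls the current word lists already catch, so a call that says “That was wonderful” is found only after an expert has highlighted “wonderful” in some other call that also contained a seed word.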
For data that is numerical in nature, we can even skip the step of asking humans to rate the results. Machine learning algorithms can take sufficiently large samples of numerical data and divine equations that predict, with measurable accuracy, the future behavior of a system as long as there is sufficient stability and as long as future experience reasonably resembles the past.
As an example, imagine that you had many years of performance data on a large fleet of the same vehicle. Imagine that you had data including maintenance history, weight on board, topography covered, weather conditions, speed, braking, and other performance data. Imagine also that you had many years of data about fuel prices including market conditions, commodities pricing, and other related factors. A machine learning algorithm could ingest thousands of such attributes and come up with a very dependable estimate of fuel costs for a similar fleet of vehicles under similar future conditions. As long as attributes were updated as key performance factors and market factors change, such a “machine learning” method would perform very well.
Machine learning algorithms continue to help us in many ways, including malware detection, sales optimization, and medical diagnostics. If we keep in mind when such approaches are appropriate, they can be critical tools in the AI toolbox.
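As a drastically reduced sketch of the fleet example, the Python below fits a straight line to one attribute (weight on board) against fuel cost using ordinary least squares. The numbers are made up for illustration and are not real fleet data, the function names are hypothetical, and a real model would use thousands of attributes rather than one.

```python
# Made-up illustrative history: load carried (tonnes) and fuel cost per trip.
LOAD_TONNES = [1.0, 2.0, 3.0, 4.0]
FUEL_COST = [3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Ordinary least squares for one attribute: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply the fitted equation to a new observation."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Measure accuracy on observations the model has not seen."""
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    model = fit_line(LOAD_TONNES, FUEL_COST)
    print(predict(model, 5.0))
    print(mean_absolute_error(model, [5.0], [11.5]))
```

The error is only meaningful while new trips resemble the historical ones. If conditions shift, the fitted equation keeps producing confident numbers that no longer describe the system, which is why the attributes must be kept up to date.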
Learning methods based on observation have been around for a long time. As computers become more powerful and data becomes richer and more ubiquitous, the ability to use machine learning techniques to observe and predict will become increasingly valuable to modern computer and data science.
Other Kids On The Block: Deep learning, cognitive computing
It might be simpler if our story ended here, but basic machine learning is not all there is to AI. In fact, modern methods, especially ones that are based on more nuanced types of “learning,” are adding significantly to the AI toolbox. We should be careful of any approach that only uses any single method. There is an old saying that if you only have a hammer, everything looks like a nail. I suppose one could use a hammer to hammer in a screw, but wouldn’t a screwdriver be better?
There are so many methods in AI that one could write a book on them (some people have, in fact!).
Newer methods, especially non-regressive methods which learn from a combination of observation and proposition in a forward-looking approach, are helping with some very challenging applications such as autonomous vehicles, military, and cyber-crime applications.
Deep learning (including deep belief networks and certain neural network architectures) is a type of advanced machine learning. Deep learning methods have been around for about two decades, but have recently become significantly more impressive with the introduction of specialized hardware and advanced data curation methods. These methods can either be supervised (where human experts or their attributions participate in the ongoing tuning of results) or unsupervised (where the methods themselves provide their own improvement through recursion as introduced above, often in unstructured data). These approaches are particularly useful in speech recognition and natural language processing, among other applications. The learning that occurs in such approaches is quite complex, involving basic observation and higher-level abstraction.
Another approach, which is fascinating but not universally defined, is the concept of cognitive computing. In cognitive approaches, the machine works alongside a human expert, helping to refine the problem and learning from observation of the human expert to inform future iterations. These approaches are showing great promise in fields such as medical diagnosis, where doctors face the overwhelming prospect of keeping up with the thousands of publications issued daily that contribute to a growing body of knowledge. It is exciting to think that cognitive approaches could evolve to provide better advice in context than other AI methods (or possibly by working in conjunction with other AI methods).
As machines become faster, data becomes more prevalent, and methods become better understood by a broader set of practitioners, the possibilities for AI to advance and learn in new ways are permeating every industry and all facets of enterprise.
The Future of Learning: Goal modification and autonomy
If we were to stop the story here, we might feel happy and full of hope. Unfortunately, there are some very big challenges to consider in the future of machines and their ability to learn.
- Regulatory – Regulators will never keep pace with technological evolution. However, in modern times, with machines learning in new ways, there has been important debate about the degree to which such learning should be regulated. Arguments in favor of regulation include the concern that if we allow AI to advance unfettered, privacy violations and unintended future capabilities could eventually allow AI to have a very negative impact on mankind. Arguments against regulation center on the seeming impossibility of defining the space adequately and the near certainty that evolution of AI will continue no matter what.
- Ethical – There are many ethical questions that emerge from advancing capability in AI. The divide between what we can do and what we should do is being called into question in many areas of society. For example, is it appropriate to make future actuarial predictions based on an enhanced ability to learn from the human genome? Should police have the ability to take over the operation of autonomous self-driving vehicles? Should pharmaceutical companies be able to market to you based on the prediction that you have, or will soon have, a certain medical condition? AI is indeed bringing many ethical questions to light.
- Malfeasance – We should not ever assume that technology will be inherently good or evil. Technology has no such notions. AI can be used to provide much benefit. It also can be used as an instrument of harm. Consider malware bots that learn from their failure with recursive methods that eventually find weaknesses in the cyber-defenses of our systems. Consider also computer viruses that could lie hidden inside autonomous vehicles or other equipment waiting for certain conditions or signals to trigger their actions. All of these scenarios and many more are possible if we are not vigilant.
- Provenance – As AI becomes more capable, we humans have an increasingly difficult time understanding why and how decisions are made. The provenance, or history, of the information used to reach a conclusion, and the reasoning behind that conclusion, are often difficult or impossible to discern in certain modern AI methods. Essentially, we can wind up with increasingly smart agents that have an increasingly difficult time explaining themselves. If we are not careful, our only recourse (horrifying, to me) would be to simply trust them.
- Marginalization – As with any technology, advances in AI create marginalization. There will always be classes of people, regions of the world, and various groupings of “others” who have reduced access to the benefits that come with such advances. We must be careful to understand this marginalization and to address it in ways that are thoughtful and intentional.
There is no doubt that advances in AI and various types of computer learning are causing us to learn as humans as well. Our continued thoughtful advances in this field can provide great benefit and also bring about great risk. We must be good stewards of the learning agents that we create.
It seems that knowing ourselves, and knowing our AI agents, will always involve learning. As we continue to apply learning methods to machines to make things more autonomous, more predictive, and more performant, we have as much to learn as our digital agents.