By Dr. Lance Eliot, the AI Trends Insider
I am very active in my alma mater. I serve on several alumni boards and committees, and try to support the youth of today who will soon be the inventors and innovators of tomorrow. As a mentor for our campus start-up incubator, I find it exciting and refreshing to see so many entrepreneurs who are eager to launch the next Facebook or Google. Hurrah for them, and let’s all hope that they keep their spirits high and remain determined in their quest.
Since I just said hurrah, it reminds me of cheering, and cheering reminds me of sports, so perhaps it would be timely for me to mention that I also enjoy going to our exciting football games and our boisterous basketball games. Our football team is generally more highly ranked than our basketball team, but either way, whether we win or lose, it’s fun to be a supportive alumnus that cheers on his teams.
Which brings up a question for you. Let’s suppose that we recently managed to snag an incoming student that will be a great addition to our football team, and another student that will be a superb addition to our basketball team. The two students we’ll say are named A and B. I’d like you to try and guess which went to which of our two sports teams.
The student A is 6 foot 2 inches tall and weighs about 260 pounds.
The student B is 6 foot 8 inches tall and weighs about 220 pounds.
Did A go to the football team or the basketball team?
Did B go to the football team or the basketball team?
You might at first glance say that there’s no way to tell which went to which. I didn’t provide enough information, such as how high each can jump or whether one of them made a ton of touchdowns last year. But I did give you some info that would be helpful, namely, their height and weight.
By now, I assume that you have deduced that A went to the football team, while B went to the basketball team. This seems logical since B is rather tall and we’d expect a basketball player to be relatively tall. A is shorter than B, and a bit heavier, which makes sense for a football player.
You used your awareness of the physical characteristics normally required for each of those two sports to try and figure out the most likely classification that matched to the description of A and that matched to the description of B. Sure, you could still be wrong and maybe the football team wanted a really tall football player to block kicks, and maybe the shorter of the two would be really wily and fast on a basketball court, but by-and-large you made a pretty reasonable guess that A should be tossed into the football classification and that B should be put into the basketball classification.
Congratulations, as you are now a Support Vector Machine (SVM).
Well, kind of. Allow me a moment to explain what a Support Vector Machine is.
An SVM is a statistical method that aids in classifying things. You typically feed various training examples into the SVM mathematical algorithm, and it tries to identify how best to classify the data. As an example, I might have data about football players and basketball players, let’s say their height and weight, and I enter that into an SVM. The algorithm analyzes the data and tries to come up with a mathematical classification for the two.
Henceforth, if you were to have an A or B come along (two sports players), it could try to tell you into which classification each belongs.
Notice that I mentioned that we provided the SVM with training data. We had two classes, namely football players and basketball players. The training data consisted of data points. Suppose I had data describing one hundred football players and one hundred basketball players, so I provided two hundred instances and gave the height and weight for each of those instances. Each data point can be considered a p-dimensional vector (here p is 2, since we have height and weight), and the SVM develops a (p-1)-dimensional hyperplane, which for two features is simply a line, that has the largest separation or margin between the two classes.
The hyperplane is essentially a means to divide the two classes from each other. It is a mathematical construct that aims to ensure that the widest separation between the two classes is found. In this manner, when a new data point comes along, such as our student A, the algorithm can look to see if A is in the football classification or on the other side of the hyperplane and actually in the basketball classification. It’s kind of like having a dividing wall between the two classes, which can be used to decide whether a new data point is on one side or the other side of the wall. This is formally called the maximum-margin hyperplane.
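To make the dividing wall a bit more tangible, here is a minimal sketch in plain Python. It is not how a production SVM library works internally; it approximates the maximum-margin line by taking subgradient steps on the hinge loss with a small regularization term. The twelve training players are invented purely for illustration, and the function names and settings (lam, lr, epochs) are my own assumptions.

```python
# Toy linear SVM trained by subgradient descent on the hinge loss.
# All player data below is invented for illustration only.

# (height in inches, weight in pounds)
FOOTBALL = [(72, 250), (74, 265), (73, 240), (75, 280), (71, 230), (74, 255)]
BASKETBALL = [(79, 215), (81, 225), (78, 205), (80, 220), (82, 235), (77, 200)]


def fit_scaler(points):
    """Return per-feature (mean, standard deviation) used to standardize."""
    n = len(points)
    stats = []
    for j in range(len(points[0])):
        column = [p[j] for p in points]
        mean = sum(column) / n
        std = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
        stats.append((mean, std))
    return stats


def scale(point, stats):
    """Standardize one point so height and weight are on comparable scales."""
    return [(v - mean) / std for v, (mean, std) in zip(point, stats)]


def train_linear_svm(xs, ys, lam=0.01, lr=0.01, epochs=5000):
    """Minimize lam/2 * ||w||^2 + mean hinge loss with full-batch subgradient steps."""
    w = [0.0] * len(xs[0])
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for x, y in zip(xs, ys):
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            if y * score < 1:  # point is inside the margin or misclassified
                for j, xj in enumerate(x):
                    grad_w[j] -= y * xj / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


points = FOOTBALL + BASKETBALL
labels = [1] * len(FOOTBALL) + [-1] * len(BASKETBALL)  # +1 football, -1 basketball
stats = fit_scaler(points)
w, b = train_linear_svm([scale(p, stats) for p in points], labels)


def classify(height_in, weight_lb):
    """Report which side of the learned hyperplane a player falls on."""
    x = scale((height_in, weight_lb), stats)
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return "football" if score >= 0 else "basketball"


print(classify(74, 260))  # student A: 6 ft 2 in, 260 lb
print(classify(80, 220))  # student B: 6 ft 8 in, 220 lb
```

The features are standardized first because weight in pounds would otherwise swamp height in inches. In practice you would likely reach for an established SVM library rather than hand-rolling the training loop.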
I’ll add some more jargon into this.
The SVM is known as a non-probabilistic binary linear classifier.
The “binary” part means that it usually determines whether something is in a class or not, or whether it is in one of two classes. So, we have our example already of using SVM to determine whether someone is in one of two classes (football versus basketball).
We could also have used the SVM by providing only, say, football players (just one class), and then asked the algorithm to indicate whether someone seemed to fit into the football player classification or not. For our purposes, this would have indicated that A was in the football player class. For B, though, it would only have indicated that B did not seem to be in the football class, since it would not have known anything about the basketball class.
SVM is usually “non-probabilistic,” meaning that it tells us which class A or B falls into but does not give us a probability that the classification is correct. There are special versions of SVM that do add a probabilistic capability, such as Platt scaling, which fits a sigmoid function to the SVM’s output scores.
SVM is usually “linear,” which is the easier and more straightforward way to find the hyperplane. There is a more advanced version of SVM that provides for a non-linear approach, often using something referred to as the kernel trick. This can be handy if your data points aren’t amenable to the easier linear approach.
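For the curious, here is a small sketch of what the kernel trick buys you. A kernel function computes the same number you would get by first mapping each point into a higher-dimensional space and then taking a dot product, but without ever building the higher-dimensional vectors. The example uses the degree-2 polynomial kernel on two-dimensional points, and the sample points and function names are my own illustrative choices.

```python
import math


def poly_kernel(x, z):
    """Degree-2 homogeneous polynomial kernel: k(x, z) = (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def feature_map(x):
    """Explicit 3-D feature map whose dot product equals poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Two arbitrary sample points, chosen only for illustration.
x, z = (1.0, 2.0), (3.0, -1.0)

via_kernel = poly_kernel(x, z)                        # stays in 2-D
via_features = dot(feature_map(x), feature_map(z))    # goes through 3-D

print(via_kernel, via_features)
```

Both routes give the same value (up to floating-point rounding), which is why an SVM can find a curved boundary in the original space while still doing what amounts to linear math.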
The SVM is mainly used with training examples, and therefore it is considered a “supervised” learning model. The supervision aspect is that we are providing known examples and thus giving direct guidance to the SVM as to the data points and the classes into which they are supposed to fit.
Suppose though that we weren’t sure of what the classifications should be.
If we had two hundred sports players and fed in the examples, but didn’t say whether they belonged to a football classification or a basketball classification, we might instead want the SVM to come up with whatever classifications it might find. We could then look at how the data had been classified by the SVM, and try to ascribe some logical basis to the mathematical classes that it found. It could be that we’d say that the classes were for football and basketball, or we might decide the classes are for something else instead.
This would then be considered an unsupervised learning model, and the SVM tries to find a “naturally occurring” way to group or classify the data. Since this is quite a bit different from the traditional SVM, the unsupervised version is often referred to as Support Vector Clustering rather than being called a Support Vector Machine.
SVMs have been used quite successfully in a variety of disciplines. In the biochemical sciences, for example, they have been utilized for classifying proteins. Another area that SVM is especially known for aiding is the classification of images. Suppose you had hundreds or thousands of pictures of lions and of elephants. You could feed those images into an SVM and have it identify a mathematical categorization for the lions and for the elephants, and then when a new image comes along it could be fed into the SVM to have it indicate whether the image is in the lion category or in the elephant category.
What does this have to do with AI self-driving cars?
At the Cybernetic Self-Driving Car Institute, we are using SVM as an integral part of the AI self-driving car software that we are developing.
Indeed, anyone doing AI self-driving car development should be either using SVM or at least considering whether and when to use SVM. More generally, SVM is considered an important tool in the AI toolkit of learning models.
How does SVM get involved in AI self-driving cars, you might ask.
Suppose a self-driving car has a camera that is able to capture images of what’s ahead of the self-driving car. The AI of the self-driving car might want to know whether there is a vehicle up ahead, and could feed the image into an SVM that’s been trained on what vehicles look like. The SVM could do its mathematical analysis and report back to the AI system whether the image does or does not contain a vehicle. The AI system can then use this result in combination with other sensors and whatever those sensors are capturing, such as radar signals, LIDAR images, etc.
You might know enough about AI to wonder why the image analysis isn’t being done by a neural network. Well, you are right that usually we would be using a neural network to do the image analysis. But suppose that we also thought it prudent to use SVM.
In essence, you can use SVM to do an initial analysis, and then do a double-check with, say, a trained neural network. Or, you could use the neural network as the first analysis, and then use an SVM as a double-check on the neural network. This kind of double-checking can be quite useful, and some might argue it is even a necessity.
Why would it be considered a necessity?
Suppose the neural network was our only image analyzer on the AI self-driving car. Suppose further that the neural network got confused and thought there was a vehicle in the image, but there really wasn’t. Or, suppose the neural network thought there was not a vehicle in the image, but there really was. Either way, the AI of the self-driving car could be horribly misled and make a maneuver based on a mistaken analysis by the neural network. If the SVM was acting as a double-check, the AI could then consider the result from the SVM and also consult the result from the neural network, and decide what to do if the two different image analyzers had two different interpretations of what the image contains.
Thus, you can use SVM for AI self-driving cars as:
— Standalone SVM
— SVM as initial analysis, double-checked by some other approach
— SVM as a double-check upon some other approach which has been first used
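To make the double-check idea concrete, here is a minimal sketch of how the AI might reconcile the two image analyzers. The function name, the three return labels, and the policy of escalating any disagreement are all hypothetical choices of mine for illustration; a real system would weigh in the other sensors too.

```python
def fuse_vehicle_checks(svm_says_vehicle, nn_says_vehicle):
    """Combine two independent vehicle detectors into one decision.

    Hypothetical policy: agreement is accepted as-is, while any
    disagreement is flagged so the AI can act cautiously and consult
    other sensors before committing to a maneuver.
    """
    if svm_says_vehicle and nn_says_vehicle:
        return "vehicle"
    if not svm_says_vehicle and not nn_says_vehicle:
        return "no_vehicle"
    return "uncertain"


# The neural network sees a vehicle but the SVM does not.
print(fuse_vehicle_checks(svm_says_vehicle=False, nn_says_vehicle=True))
```

Note that the same function covers both orderings in the list above, since it does not matter which analyzer ran first, only whether they agree.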
You might be wondering whether the computational processing time of using the SVM might be prohibitive to use it for an AI self-driving car. Whatever AI learning models are used on a self-driving car need to be fast enough to deal with the real-time needs of guiding a self-driving car. The self-driving car might be going 70 miles per hour, and so the sensory analyses need to be fast enough to make sure that the AI gets informed in time to make prudent decisions about controlling the car.
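A little arithmetic shows why speed matters. At 70 miles per hour the car covers roughly 31 meters every second, so each millisecond the classifier spends thinking is distance traveled blind. The sketch below works that out for a few latencies; the latency figures are hypothetical placeholders, not measurements of any actual SVM.

```python
METERS_PER_MILE = 1609.344
SECONDS_PER_HOUR = 3600


def distance_during_latency(speed_mph, latency_ms):
    """Meters traveled while a classifier takes latency_ms to answer."""
    speed_m_per_s = speed_mph * METERS_PER_MILE / SECONDS_PER_HOUR
    return speed_m_per_s * latency_ms / 1000


# Hypothetical classifier latencies, in milliseconds.
for latency_ms in (10, 50, 100):
    meters = distance_during_latency(70, latency_ms)
    print(f"{latency_ms} ms -> {meters:.2f} m")
```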
SVM is pretty quick after having been trained, and so it is a suitable candidate for use on a self-driving car. That being said, if you want to also do further training of the SVM while it is immersed in the AI of the self-driving car, you’d need to be cautious in doing the training while the self-driving car is otherwise involved in maneuvering in traffic.
We use any time when the self-driving car is not in traffic or transporting anyone (such as when it is parked) to have the SVM do additional training. Furthermore, you can push the SVM training off into the cloud: if the self-driving car is connected to a cloud-based over-the-air updating system, the SVM updates can occur elsewhere so as to not bog down the self-driving car’s own processing. Once the SVM has been updated in the cloud, it can be pushed back down into the local AI system of the self-driving car.
Besides doing vehicle versus non-vehicle image analysis classifications, an SVM can be used for a wide array of other aspects on an AI self-driving car. We’ve found it especially handy for doing pedestrian classifications, for example, whether an image does or does not contain a pedestrian. Even more involved would be a classification of whether a pedestrian poses a “threat” to the self-driving car or does not pose a threat. By the use of the word “threat” we mean that the pedestrian might be darting into the street in front of the self-driving car. This constitutes a form of threat in that the self-driving car might need to take some radical evasive maneuver to try and avoid hitting the pedestrian.
How would the classifier realize whether a pedestrian is a threat or not? We train the SVM on images of pedestrians. In one set, we have pictures of pedestrians that are in a standing posture or otherwise in a posture that does not suggest dramatic movement. A stance of a pedestrian that suggests they are running would be considered a more dramatic movement. The distance of the pedestrian is another factor, since someone might be in a running posture but so far away from the self-driving car that they are not considered an imminent threat. On the other hand, if the image shows a pedestrian in a running stance who is very near to the self-driving car, the AI would want to know to be on the alert.
One of the budding areas of self-driving car capabilities involves being able to discern the intent of pedestrians. Right now, most of the AI systems for self-driving cars merely detect whether a pedestrian exists somewhere within a near virtual bubble of the self-driving car. The latest advances go further and try to guess what the intent of the pedestrian might be. Is the pedestrian moving toward the self-driving car or away from it? Are they going to end up in front of the self-driving car or behind it? Do they seem to be looking at the self-driving car or looking elsewhere? All of these aspects help to try and gauge the intent of the pedestrian. We human drivers are continually scanning around us, looking at pedestrians and trying to guess what each pedestrian is going to do. That’s what the AI of the self-driving car should also be doing.
With an SVM, the AI needs to be cautious about being possibly led down a primrose path, so to speak. The SVM might say that something is classified as X, but it could be a false positive. For our earlier example about the sports players, suppose that the SVM had indicated that B was a football player. That’s a false positive. Suppose the SVM had indicated that A was not a football player. That’s a false negative. The AI of the self-driving car needs to consider whether to believe the SVM classifier, which will depend on a variety of facets, such as in the case of images whether the image is a clear image or a noisy image, and so on.
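Here is a minimal sketch of how you might tally those mistakes, treating “football” as the positive class just as in the example above. The function name and the made-up predictions are my own for illustration.

```python
def confusion_counts(actual, predicted, positive="football"):
    """Count true/false positives and negatives for one 'positive' class."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, guess in zip(actual, predicted):
        if guess == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


# Student A really plays football, student B really plays basketball.
actual = ["football", "basketball"]
# A hypothetical bad run: the classifier gets both students wrong.
predicted = ["basketball", "football"]

result = confusion_counts(actual, predicted)
print(result)
```

Calling B a football player lands in the false positive bucket, and saying A is not a football player lands in the false negative bucket.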
For SVM, it is wise to be cautious of the results whenever the target classes tend to be very close to each other or even overlap. I am guessing that if we tried to train an SVM on baseball players and soccer players, we might find that, based on height and weight alone, the two classifications are very close to each other. This means that when a sports player C presents themselves to us and we ask the SVM to classify them, whatever the SVM says, whether baseball player or soccer player, will need to be carefully reviewed or double-checked.
The SVM can also inadvertently overfit to the training data. Overfitting is a common problem in most learning models, including neural networks. The aspect of overfitting means that the learning becomes overly fixated on the training data and has not been able to generalize beyond the training data. Imagine a baby that is learning about blocks. Suppose the baby is given a bunch of blocks and they are all the color green and are one inch square in size. The baby might believe that all blocks are green, and can only be green, and must be one inch square in size. If you handed the baby a new block which was red, the baby might not realize it is a block. That’s overfitting.
Another issue for SVM involves outliers. Suppose we happen to find a really great football player that is nearly seven feet tall. If we had included just one such instance in our training set, the SVM might have considered the outlier as irrelevant and ignored it. This could be okay, or it might be bad in that maybe we really could have football players of that size. Thus, when we later on do look at a football player of that height, the SVM might mistakenly insist that the player must be a basketball player.
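For those who like to see the math, one common way an SVM ends up tolerating or ignoring an outlier is the so-called soft-margin formulation. The notation below is the standard textbook form rather than anything specific to our software: each training point x_i with label y_i (either +1 or -1) gets a slack value ξ_i that lets it sit inside the margin or on the wrong side of the hyperplane, and the tuning parameter C sets how heavily such violations are penalized.

```latex
\min_{w,\,b,\,\xi} \; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i \,(w \cdot x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, n
```

With a small C, a lone seven-foot football player can simply be absorbed as slack. With a large C, the hyperplane bends more to accommodate that player, at the risk of a narrower margin for everyone else.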
So, the SVM, like other learning models, must be taken with a grain of salt. It can be prone to overfitting to the training data. It can be computationally costly to do the training. It can have difficulty with outliers. It assumes that the characteristics or features being used are generally relevant to the classification. And so on.
I don’t want you to feel, though, like I am saying don’t use SVM. I assure you that any learning model, including neural networks, will have the same kinds of limitations and issues that need to be considered. The SVM is a very valuable tool in the AI toolkit and one that we believe deserves due attention for AI self-driving cars.
This content is originally posted on AITrends.com.