By Lance Eliot, the AI Trends Insider
Machine Learning (ML) is essential for the advent and further progress of AI self-driving cars. The nature of how Machine Learning is being undertaken today for AI self-driving cars will undoubtedly evolve and become more sophisticated over time. One crucial aspect for Machine Learning in the context of AI self-driving cars is whether or not to distribute out the Machine Learning aspects, and if so to what degree the ML should be distributed.
This aspect of distributing ML is often referred to as Federated Machine Learning (FML). You can think of the word “federated” in the same sense that it is used for the governmental arrangement of the United States. The United States is a collection of distributed States that are collectively part of an overarching federation. We are continually striving in the United States to ascertain what is the appropriate balance of States’ rights versus Federal, and there are ongoing debates about how much autonomy the States should have versus how much control the federal government should have. This applies in the same manner to Federated Machine Learning, as will be further explored herein.
Allow me to first provide a quick story that perhaps helps illustrate the notion of federated learning.
I had done some high-tech consulting work for Snap-on Tools a number of years ago. You might be aware that Snap-on Tools is a famous brand of high-end tools that are often used by car mechanics in automotive repair and maintenance shops. Most car mechanics love their Snap-on Tools. They tend to be very passionate about how great the tools are, how durable they are, and so on. Indeed, many car mechanics are essentially fans of Snap-on Tools and enjoy wearing a Snap-on branded cap or shirt, and are proud to proclaim that they buy and use Snap-on Tools.
What is especially interesting about Snap-on Tools here is that they are primarily a franchise based business and make use of “dealers” to actually sell the tools. You’ve perhaps seen the Snap-on Tools trucks driving around town (they are very distinctive in appearance). The franchise network consists of over 4,000 such dealers. They drive around town, visiting the automotive repair and maintenance shops. During a visit, the dealer will open the back of the truck, pull down a ramp, and try to get the mechanics to take a break from work and come into the truck to see the tools that are being sold.
When I did my high-tech work at the firm, I actually went on several rides with some of the dealers. It was amazing to see the car mechanics get as excited to see the truck as they would if it were an ice cream truck or a food truck. The mechanics would usually rush out of the repair bays and relish coming to see the tools. They would salivate at the new tools and would often dream that someday they could buy a full set of Snap-on Tools. The enthusiasm for the brand was intense. You might liken this to people that are avid fans of Apple products and gush at the sight of a new iPad or iPhone.
I was putting in place an intranet that would allow the franchisees to readily communicate with each other. Up until then, they had no easy means to communicate with each other, beyond the use of email and some crude email distribution lists. By putting in place an intranet, it would enable all the franchisees to become more aware of what was taking place at the firm and in the field. It would also aid franchisees in terms of assisting each other. They could be in their truck, driving around town, and when they came to a stop at a site visit, they could check the intranet to see if there were any useful announcements or other aspects posted there.
To try and showcase the value of the new intranet site, I went with one of the dealers just as we had rolled it out. He did his usual thing of driving to an auto repair facility, put down the ramp, and invited the mechanics to come on in. But, he did something else that I hadn’t seen done before. He had taped a competitor’s cap to the ramp. At first, this seemed odd to me, since I figured why in the world would he want the mechanics to see another brand name and be thinking about anything other than Snap-on. Well, there was a method to his madness.
One by one, the car mechanics came to the ramp, and as they walked up the ramp, each one stopped for a moment, took their boot and smashed down on that cap, moving their foot back and forth like they were trying to squash a bug. They delighted in doing this. I realized that the dealer had found an easy way to reinforce the devotion to Snap-on. As loyalists, they were able to showcase their support by walking all over a competitor brand. Every mechanic that walked up the ramp did so. It was also clever because some of them came into the back of the truck simply because they wanted to have their turn smashing down on the cap. They otherwise might not have come into the truck and instead stood outside and just bemoaned the fact that they couldn’t afford another tool just then. By getting them into the truck, the allure of the tools would potentially get them into a buying mood and overlook the price.
I suggested to the dealer that he share this trick with the other franchisees. He was easily able to do so via the intranet. He posted his approach and right away got some comments that it seemed like a good idea. Within two days, a large portion of the franchisees had adopted this simple technique. The odds are that without the intranet to allow for convenient communication electronically, almost no one else would have known about the trick. Some might have discovered it on their own, and maybe a few might have learned about it via word-of-mouth, but otherwise it would not have become so widespread.
This is the potential power of federated learning. When something is learned at an outside edge, it can be conveyed to the larger federation, and the federation can possibly propagate it out to the rest of the collective. There are learnings that can occur solely at the federation that are then shared with the edges. There are learnings at the edges that can be shared with the federation, and then possibly embraced throughout. This prevents an otherwise isolated learning from becoming “trapped” within an edge, and never seeing the light of day beyond its being used at that particular edge.
What does this have to do with AI self-driving cars?
At the Cybernetic Self-Driving Car Institute, we are in the midst of developing and advancing the use of Federated Machine Learning for AI self-driving cars.
Allow me to elaborate.
As shown in Figure 1, the conventional approach right now to Machine Learning for most AI self-driving cars is that the Machine Learning happens in the cloud. The AI self-driving car, we’ll consider it an edge device, provides data that is uploaded to the cloud (typically this cloud would be set up by the auto maker or tech firm providing the AI capabilities for the self-driving car). The Machine Learning takes place in the cloud and then the resultant updated ML model is pushed down into the AI self-driving car. This happens via the OTA (Over The Air) capabilities of the AI self-driving car.
For this conventional approach, there really isn’t much of a federation taking place per se. It is simply that each of the AI self-driving cars that are included in this collective are dutifully uploading their collected data, and the real action of using that data for ML purposes occurs in the cloud. Presumably, all of the AI self-driving cars then get the resultant updated ML model and are obligated to use it, once it has been pushed down into the AI self-driving car locally.
See my framework for AI self-driving cars.
There have been some concerns raised that with essentially all of the data being uploaded from the AI self-driving car, there are perhaps private details being shared into the cloud that otherwise don’t need to be (well, at least don’t need to be uploaded for the purposes of the Machine Learning that is going to take place). Suppose the self-driving car is keeping track of how many times you’ve visited your local emergency room because you have some ailment. Does that kind of data really need to be shared with the cloud for purposes of doing Machine Learning on how to enhance the AI driving capability? Some would argue that there’s a lot of data collected by the AI self-driving car that does not and should not be shared into the cloud.
See my article on privacy and AI self-driving cars.
Take a look at Figure 2. If you believe in this concern for privacy, it could be the case that the data uploaded would only be summary data and also only data that’s pertinent directly to the purposes of doing Machine Learning or for other intended and identified legitimate purposes. Besides the privacy aspects, this also would substantially cut down on the transmission time of conveying the data and would presumably be less taxing on any electronic communications established for cloud connecting elements.
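To make the summary-data idea a bit more concrete, here is a minimal sketch in Python. The field names, the notion of which fields are pertinent to driving, and the choice of simple averaging are all my assumptions for illustration, not a description of what any auto maker actually uploads.

```python
from statistics import mean

# Hypothetical field names; which fields count as driving-relevant is an assumption.
ML_RELEVANT_FIELDS = {"speed_mph", "brake_events", "lane_departures"}

def summarize_for_upload(trip_records):
    """Reduce raw per-trip records to per-field averages, dropping all other fields."""
    summary = {}
    for field in sorted(ML_RELEVANT_FIELDS):
        values = [record[field] for record in trip_records if field in record]
        if values:
            summary[field] = mean(values)
    summary["trip_count"] = len(trip_records)
    return summary

# Raw local data includes a destination that never needs to leave the car.
trips = [
    {"speed_mph": 30, "brake_events": 2, "lane_departures": 0, "destination": "emergency room"},
    {"speed_mph": 50, "brake_events": 4, "lane_departures": 1, "destination": "home"},
]
upload = summarize_for_upload(trips)
```

The upload carries only the averaged driving fields and a trip count, so the destination stays on the car, and the payload is smaller than the raw records.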
The next evolution of this ML would be to actually become more federated and work in some kind of collaborative mode with the edges and the cloud.
Take a look at Figure 3.
As shown, the Machine Learning in the AI self-driving car is also doing actual Machine Learning, and it then provides an updated ML up to the cloud. The cloud then has to figure out what to do with this updated ML, along with having the summarized data from the AI self-driving car too. The cloud-based Machine Learning can potentially use the now-provided updated ML model from the AI self-driving car, further expand or refine it, and then ultimately push it back down to the AI self-driving car, which would replace the prior ML model with the new one. Alternatively, the push could be just the changes of a new version of the updated ML model that has been modified via efforts in the cloud.
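The option of pushing down just the changes, rather than the whole model, can be sketched roughly as follows. I am assuming here that a model can be treated as a dictionary of named numeric parameters, which is a simplification, and the function names are hypothetical.

```python
def compute_delta(old_weights, new_weights, tolerance=1e-9):
    """Return only the parameters that are new or whose values changed."""
    return {
        name: value
        for name, value in new_weights.items()
        if name not in old_weights or abs(value - old_weights[name]) > tolerance
    }

def apply_delta(current_weights, delta):
    """Return a new weight dictionary with the changed parameters overwritten."""
    updated = dict(current_weights)
    updated.update(delta)
    return updated

# The cloud refined one parameter and added another; only those two are sent.
on_car = {"w_brake": 1.0, "w_steer": 2.0}
from_cloud = {"w_brake": 1.0, "w_steer": 2.5, "w_signal": 0.1}
delta = compute_delta(on_car, from_cloud)
refreshed = apply_delta(on_car, delta)
```

This sketch does not handle parameters that were removed in the new version; a fuller scheme would need to convey deletions too.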
It will be unlikely that there is only one AI self-driving car in the collective or federation. Instead, it is assumed that there will be lots of AI self-driving cars in a particular federation.
Take a look at Figure 4.
Here, you can see that there are multiple AI self-driving cars, each of which has its own respective ML model, each of which is collecting its own local data, and each of which is providing up to the cloud its local summarized data and its latest updated ML model. The cloud now has the “collective wisdom” from the set of AI self-driving cars.
You might call this the wisdom of the crowd approach, or more formally it is referred to as a Federated Machine Learning Architecture.
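One commonly described way for the cloud to combine the models it receives is a weighted average of their parameters, often called federated averaging. The sketch below assumes that every car reports the same set of named parameters along with how many samples it trained on. It is only one possible aggregation rule, and the parameter names are made up for illustration.

```python
def federated_average(car_updates):
    """Average per-car weights, weighting each car by its training sample count.

    car_updates: list of (weights_dict, sample_count) pairs; all cars are
    assumed to share the same parameter names.
    """
    total_samples = sum(count for _, count in car_updates)
    if total_samples == 0:
        raise ValueError("no training samples reported")
    names = car_updates[0][0].keys()
    return {
        name: sum(weights[name] * count for weights, count in car_updates) / total_samples
        for name in names
    }

# Two cars; the second trained on three times as much local data.
updates = [
    ({"w_brake": 0.2, "w_steer": 1.0}, 100),
    ({"w_brake": 0.4, "w_steer": 2.0}, 300),
]
global_weights = federated_average(updates)
```

Weighting by sample count lets a car that has seen more driving data pull the combined model further in its direction, which may or may not be what you want.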
Implementing this is a lot harder than it looks. You need to consider a wide array of aspects of how you want to architect this. There is a myriad of trade-offs.
Let’s start with some fundamentals. Will all of the AI self-driving cars in the federation be ending up with the same ML model, upon each of the refresh cycles? In other words, we could force all of the AI self-driving cars to have the same ML model, which would provide consistency across the AI self-driving cars and greatly simplify matters. On the other hand, this might also then negate some localized ML model aspects that would otherwise be helpful or maybe even crucial to a particular AI self-driving car or subset of the AI self-driving cars.
I know you could say that everyone gets everything, regardless of any localized aspects, and that the AI self-driving car in a localized context would then use that localized element but otherwise would not care about it. This though also suggests that we can actually isolate the localized difference and prevent it from being invoked when perhaps it would be best to not have it invoked. There is also the concern that the ML model becomes excessively large and unwieldy, and also might become a transmission hog if it is entirely being sent out with each refresh cycle.
This dovetails into a related question, namely how will the localized ML models be commingled into a single comprehensive ML model? Some say that there should be a consensus approach. If enough of the localized ML models seem to have reached a new state that offers value, it gets included. Meanwhile, if only one localized ML model, or fewer than a consensus worth of them, has something new, it does not get included.
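A consensus rule of this kind might be sketched as follows. I am assuming each car can report a set of identifiers for the things its local model newly learned, and that a simple fraction of cars serves as the quorum. Both are simplifications, and the identifiers are hypothetical.

```python
from collections import Counter

def consensus_changes(car_reports, quorum=0.5):
    """Return the proposed changes reported by at least `quorum` of the cars.

    car_reports: list of sets, one per car, each holding hypothetical
    identifiers for behaviors that car's local model newly learned.
    """
    if not car_reports:
        return set()
    counts = Counter(change for report in car_reports for change in report)
    needed = quorum * len(car_reports)
    return {change for change, seen in counts.items() if seen >= needed}

reports = [
    {"slow_for_wet_leaves", "yield_to_buses"},
    {"slow_for_wet_leaves"},
    {"slow_for_wet_leaves", "hug_left_on_elm_street"},
    {"yield_to_buses"},
]
included = consensus_changes(reports, quorum=0.5)
```

With four cars and a quorum of one half, the learning reported by a single car is left out, which is exactly the kind of isolated insight a strict consensus rule can suppress.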
Returning to my earlier story about Snap-on Tools, the adoption of the “smashing the cap” originated with one dealer. If, at first, the rule was to not allow an isolated instance to be propagated, it might not have gotten an opportunity for traction with the other dealers. After a few dealers started to embrace it, there was a Yelp-like review scoring that caught the attention of the other dealers, and helped propel it into the lexicon of many of the dealers.
So, we need to figure out whether something new or interesting at a local ML model is worthwhile to possibly include into the federated ML model, or whether it should be extinguished, or whether it is retained but only for the contributing ML model or some subset of the AI self-driving cars. This is not easy to programmatically do with aplomb.
We next need to consider the security aspects. Suppose somehow a localized ML model gets infiltrated with something bad, such as an indication that once the AI self-driving car goes over 50 miles per hour and sees a green street sign it should then direct the self-driving car to crash into the nearest wall. Let’s pretend this gets pushed up to the cloud. Let’s further assume that the cloud realizes this is something new, but doesn’t have any particular means to determine the reasonableness of it, and opts to then include it into the federated ML model. The federated model gets pushed out to the AI self-driving cars. Voila, this is like allowing a malware virus to readily be shared, doing so via the handy mechanism put in place for more valid purposes.
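One crude way for the cloud to gauge the reasonableness of an incoming model is to reject any update that strays too far from the current federated model. The sketch below assumes models are dictionaries of named numeric parameters and that a fixed distance threshold is meaningful, neither of which is a given in practice. A small, carefully targeted change could slip under such a threshold, so treat this as one filter among several rather than a defense on its own.

```python
import math

def update_distance(global_weights, local_weights):
    """Euclidean distance between a local model and the current global model."""
    return math.sqrt(
        sum((local_weights[name] - global_weights[name]) ** 2 for name in global_weights)
    )

def screen_updates(global_weights, local_updates, max_distance):
    """Split car IDs into accepted and rejected lists by distance from the global model."""
    accepted, rejected = [], []
    for car_id, weights in sorted(local_updates.items()):
        if update_distance(global_weights, weights) <= max_distance:
            accepted.append(car_id)
        else:
            rejected.append(car_id)
    return accepted, rejected

current = {"w_speed": 1.0, "w_sign": 0.5}
incoming = {
    "car_a": {"w_speed": 1.1, "w_sign": 0.4},
    "car_b": {"w_speed": 9.0, "w_sign": -7.0},  # suspiciously large jump
}
accepted, rejected = screen_updates(current, incoming, max_distance=1.0)
```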
Thus, the security of what goes up, how it is encrypted, what comes down, and how the ML model and the data are stored in the AI self-driving car, in the cloud, and in transit, are all essential to this federated approach working properly. Ensuring tight security will be crucial. The advantage of having the OTA capability can be readily turned into a huge disadvantage if it is overtaken for evil purposes.
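As a small illustration of protecting a model while in transit, here is a sketch that attaches a keyed hash (HMAC) to a serialized model so the receiver can detect tampering. It uses Python's standard hmac, hashlib, and json modules. The shared secret key and the JSON serialization are my assumptions for illustration, and a production system would more likely rely on asymmetric signatures and proper key management. Note that this only detects changes made after signing; it does nothing about a model that was already poisoned before it was signed.

```python
import hashlib
import hmac
import json

def sign_model(weights, secret_key):
    """Serialize a weight dictionary and return (payload_bytes, hex_tag)."""
    payload = json.dumps(weights, sort_keys=True).encode("utf-8")
    tag = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return payload, tag

def verify_model(payload, tag, secret_key):
    """Return True only if the tag matches the payload under the shared key."""
    expected = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

key = b"hypothetical-shared-secret"  # placeholder; never hard-code real keys
payload, tag = sign_model({"w_brake": 0.35, "w_steer": 1.75}, key)
intact = verify_model(payload, tag, key)
tampered = verify_model(payload.replace(b"0.35", b"9.99"), tag, key)
```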
Another aspect involves conflicts among the ML models that are provided to the cloud. Let’s say that one ML model indicates to never go faster than the speed limit, while another ML model has indicated that sometimes going faster than the speed limit is warranted, such as in an emergency to rush the occupants to a hospital. How, again, will the cloud ascertain what stays and what goes, and if model updates are in contradiction, how will it resolve the contradiction?
If you are further interested in this topic, you might find it of interest that Google is doing some fascinating work on Federated Machine Learning as it relates to mobile devices such as smartphones. They have also considered aspects of optimization algorithms, such as the Stochastic Gradient Descent approach, and how it can be utilized in this Federated ML structure.
How much ML should be taking place in the AI self-driving car itself? Should it be getting data from the cloud that was collected from other AI self-driving cars, so it can do further ML? How much processing do we want to take place in an AI self-driving car and how much will that raise the cost and complexity of the AI self-driving car? These and many other questions are now being explored, and the future of AI self-driving cars as being part of a Federated Machine Learning approach is a necessary and still-open matter of intense and vital focus.
Copyright 2018 Dr. Lance Eliot
This content is originally posted to AI Trends.