AI and machine learning help provide security to users online and with their IoT devices.
When I think of the unique challenge faced by AI researchers in security, I am reminded of an excerpt from the Harry Potter series. At the beginning of Book 6, the Minister of Magic pays a visit to the Muggle Prime Minister to warn him about evil deeds being carried out by dark wizards. The Prime Minister is understandably scared and confused. In frustration, he implores: “But for heaven’s sake – you’re wizards! You can do magic! Surely you can sort out – well – anything!” The Minister of Magic replies pragmatically, “The trouble is, the other side can do magic too.”
These days, the hype around AI makes it almost seem like magic. While the breathtaking pace of advancement in AI has brought great results in vision, audio, and NLP, none of those use cases involve an opponent deliberately trying to evade the algorithms. Security is the only domain for AI with a truly adversarial opponent. I will claim that this is a unique challenge, since the other side can do AI too! This arms race between the blackhats and the whitehats continues to fascinate security practitioners.
In truth, AI used for defensive security has to overcome several inherent handicaps. First, a good AI system usually relies on a large, well-labeled dataset, which is rare in our domain. Second, a machine learning (ML) system can increase its accuracy at the cost of a few cases of mistaken identity (‘false positives’ in the lingo). But the software world is unforgiving to an antivirus that incorrectly blocks even a few good applications, and demands false positive rates far below 1%. Third, sophisticated ML algorithms like neural networks are often difficult to explain in human-readable terms – yet the security domain requires results to be “explainable”. Thus the arms race feels skewed in the opposite direction, since it is often easier to apply AI to offense than to defense. As such, it is worth analyzing the areas of security where AI has found effective use – past, present, and future.
Over the last decade, email servers have all deployed efficient spam detection algorithms, to the point that spam rarely reaches our inboxes today – a successful deployment story for AI. In the financial world, credit card companies automatically identify anomalous charges on your card in a fraction of a second and block the fraudulent activity. These AI algorithms are deployed across the world and protect millions of credit card consumers on a daily basis.
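To make the spam example concrete, here is a minimal sketch of the classic approach – a naive Bayes filter that scores a message by the log-odds of its words appearing in spam versus legitimate mail. The training messages below are invented for illustration; production filters train on vastly larger corpora and use many more signals than word counts.

```python
import math
from collections import Counter

# Toy training data: (message, is_spam) pairs -- purely illustrative.
TRAIN = [
    ("win a free prize now", True),
    ("claim your free money", True),
    ("lowest price guaranteed buy now", True),
    ("meeting agenda for tomorrow", False),
    ("lunch with the team today", False),
    ("project status and next steps", False),
]

def train(examples):
    """Count word frequencies per class for a naive Bayes model."""
    spam, ham = Counter(), Counter()
    n_spam = n_ham = 0
    for text, is_spam in examples:
        for w in text.split():
            (spam if is_spam else ham)[w] += 1
        if is_spam:
            n_spam += 1
        else:
            n_ham += 1
    return spam, ham, n_spam, n_ham

def spam_score(text, spam, ham, n_spam, n_ham):
    """Log-odds that `text` is spam, with add-one smoothing."""
    vocab = set(spam) | set(ham)
    score = math.log(n_spam / n_ham)  # class prior
    for w in text.split():
        p_spam = (spam[w] + 1) / (sum(spam.values()) + len(vocab))
        p_ham = (ham[w] + 1) / (sum(ham.values()) + len(vocab))
        score += math.log(p_spam / p_ham)
    return score

model = train(TRAIN)
print(spam_score("free prize now", *model) > 0)     # True: spam-like
print(spam_score("team meeting today", *model) > 0) # False: ham-like
```

A positive score means the words lean toward spam; the same log-odds structure underlies real filters, which add sender reputation, URLs, and header features on top.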
A domain that has seen considerable success in AI deployment is the detection of malware in the cloud. Top security providers like Avast deploy AI engines that consistently detect over 99% of the malware that appears in the wild. In the mobile space, Google Play and the Apple App Store deploy effective cloud-based AI engines to deter malware; indeed, the centralized app store model has done well in protecting mobile users. Fuzzing – the automatic discovery of security bugs in software – has also seen broad usage of ML techniques. With the explosion in lines of code written each day, these automated tools are the only realistic way for companies to keep finding new bugs. Moreover, once a new bug has been detected, they can quickly and reliably identify other instances of the same bug.
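To illustrate the core loop behind fuzzing, here is a minimal sketch: mutate known-good inputs at random and record any input that makes the target fail in an unexpected way. The `parse_record` target and its bug are invented for illustration; real fuzzers such as AFL or libFuzzer add coverage feedback and far smarter mutation strategies.

```python
import random

def parse_record(data: bytes):
    """Hypothetical target: expects `key=value` records."""
    key, _, value = data.partition(b"=")
    if not key:
        raise ValueError("empty key")   # a rejection the author anticipated
    return key, value[0]                # bug: assumes the value is non-empty

def mutate(seed: bytes) -> bytes:
    """Randomly flip bits, insert bytes, or delete bytes in a seed input."""
    data = bytearray(seed)
    for _ in range(random.randint(1, 4)):
        op = random.choice(("flip", "insert", "delete"))
        if op == "flip":
            i = random.randrange(len(data))
            data[i] ^= 1 << random.randrange(8)
        elif op == "insert":
            data.insert(random.randrange(len(data) + 1), random.randrange(256))
        elif op == "delete" and len(data) > 1:
            del data[random.randrange(len(data))]
    return bytes(data)

def fuzz(target, seeds, rounds=2000):
    """Return inputs that made `target` raise an unexpected exception."""
    crashes = []
    for _ in range(rounds):
        case = mutate(random.choice(seeds))
        try:
            target(case)
        except ValueError:
            pass                        # expected rejection of bad input
        except Exception:
            crashes.append(case)        # unexpected failure worth triaging
    return crashes

random.seed(1234)  # deterministic run for this demo
found = fuzz(parse_record, [b"name=alice", b"port=8080"])
print(len(found) > 0)  # True: the missing-value bug is found quickly
```

Every crashing input here is a variant with the `=` separator mangled or the value deleted – exactly the kind of edge case a human tester rarely writes down, and a fuzzer finds in seconds.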
Devices have been getting more efficient and powerful for a while, so it is now feasible to implement an entire ML pipeline (observation – feature extraction – analysis) on the device itself. By putting the brain on the device, we ensure that this layer of protection is omnipresent: it can see every action a malicious program undertakes and evaluate it in real time. The key challenge here is overhead – not only must an ML engine run all the time, but the dominant cost stems from observing enough events. The solution is to run a super-efficient but shallow (few features) model constantly, while launching a more accurate model only as needed. This accurate model can also be context-aware, incorporating the immediate context of the device into its decision making. This is a promising area with interesting results, but there is still a lot of work to be done.
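The shallow-model/deep-model cascade described above can be sketched as follows. The feature names, weights, and thresholds are all invented for illustration – a real engine would learn them from labeled data rather than hard-code them.

```python
# Two-tier on-device detection: a cheap always-on filter scores every
# event, and only suspicious events are escalated to a heavier model.

def cheap_score(event: dict) -> float:
    """Constantly-running shallow model: a handful of cheap features."""
    score = 0.0
    if event.get("writes_to_system_dir"):
        score += 0.4
    if event.get("spawns_shell"):
        score += 0.4
    if event.get("network_connections", 0) > 20:
        score += 0.3
    return score

def expensive_score(event: dict) -> float:
    """Stand-in for a heavier, context-aware model, run only on demand."""
    base = cheap_score(event)
    if event.get("off_hours"):   # device context: off-hours activity
        base *= 1.5              # is weighted as more suspicious
    return min(base, 1.0)

def classify(event: dict, escalate_at=0.3, block_at=0.7) -> str:
    if cheap_score(event) < escalate_at:
        return "allow"           # the common case: shallow model only
    return "block" if expensive_score(event) >= block_at else "allow"

print(classify({}))                                # allow: nothing suspicious
print(classify({"spawns_shell": True}))            # allow: escalated, cleared
print(classify({"spawns_shell": True,
                "writes_to_system_dir": True}))    # block
```

The design point is that the expensive model's cost is paid only on the rare events the shallow model cannot clear, which is what keeps the always-on layer affordable.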
The Internet of Things is the new frontier for security, with millions of IoT devices connected together and accessible on the internet. Many of these devices run old software that is rarely, if ever, patched. Furthermore, these devices do not allow security software to be installed on them. So the challenge for the security provider is to protect these vulnerable devices while only observing them from the network! That is a tough ask, but also a prime opportunity for AI to provide maximum benefit. The key insight is that most IoT devices have a very limited range of actions, and are hence easy to model. As such, efficient anomaly detection techniques deployed in the network can help protect these devices.
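As a concrete illustration of how a limited range of actions makes IoT devices easy to model, here is a minimal sketch of network-side anomaly detection: learn the (protocol, port) pairs a device normally uses, then flag anything outside that baseline. The device and its traffic are hypothetical, and a real system would also model rates, timing, and destinations.

```python
from collections import Counter

class DeviceProfile:
    """Learns which (protocol, port) pairs are normal for one device,
    then flags traffic outside that baseline."""

    def __init__(self):
        self.baseline = Counter()

    def observe(self, proto: str, port: int):
        """Called during a learning window on the device's live traffic."""
        self.baseline[(proto, port)] += 1

    def is_anomalous(self, proto: str, port: int) -> bool:
        """After learning: a camera suddenly speaking telnet is a flag."""
        return (proto, port) not in self.baseline

# Hypothetical IP camera: during learning it only talks HTTPS and NTP.
profile = DeviceProfile()
for _ in range(100):
    profile.observe("tcp", 443)   # its cloud endpoint
    profile.observe("udp", 123)   # NTP time sync

print(profile.is_anomalous("tcp", 443))  # False: normal traffic
print(profile.is_anomalous("tcp", 23))   # True: telnet, likely a probe
```

This is exactly the property that makes the network vantage point workable: a thermostat or camera has so few legitimate behaviors that even a simple baseline separates normal from compromised, whereas the same approach on a general-purpose laptop would drown in false positives.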
As you can see, security is a unique challenge for AI, since it is the one truly adversarial use case, with the hackers having the same AI tools at their disposal. Yet the domain has seen significant successes like spam prevention and credit card fraud detection. Malware detection is also a fairly advanced art, with impressive success rates (over 99%). On-device ML is rare, but has shown significant potential. And finally, IoT security in the network is the next big frontier, where AI is the best defensive bet.