Event Recording

Alexei Balaganski - AI in Cybersecurity: Between Hype and Reality

Name: Alexei Balaganski - AI in Cybersecurity: Between Hype and Reality
Uploaded: 2019-12-04T12:00:00+01:00
Duration: 25 min 23 s

Posted on Dec 04, 2019

Artificial Intelligence is surely one of the hottest topics in nearly every industry nowadays, and not without reason. Some of its practical applications have already become an integral part of our daily lives – both at home and in offices; others, like driverless cars, are expected to arrive within a few years. With AIs beating humans not just in chess, but even in public debating, surely, they’ve already matured enough to replace security analysts as well?

Speaker

Alexei Balaganski

Lead Analyst & CTO
KuppingerCole

Transcript

Right. So welcome to our AI and cybersecurity track. My name is Alexei Balaganski. I am a lead analyst at KuppingerCole, and besides being a speaker, I'll also be moderating the rest of this track for the day. So let's just start with the first presentation.

We will definitely have more experienced, more deeply technical presentations about the applications of AI in cybersecurity later today. So I would really like to start with a general introduction and, in a sense, try to dispel some of the myths with regard to applications of artificial intelligence in security. And obviously there are a lot of those myths, because AI and machine learning are among the top buzzwords in the IT industry nowadays. Those old enough will remember all the previous buzzwords.

We had big data and the cloud, and of course blockchain, but AI probably beats them all, because AI is now everywhere, in every industry, not just in cybersecurity, but at home, in the office, anywhere. So needless to say, the market responds very favorably to buzzwords. AI is now built into anything, just like blockchain used to be a few years ago. Now nothing comes without an AI label on top. And of course the market predictions for AI development within the next decade or so are amazing. There is lots of money in this market.

So this diagram shows that within the next decade or so, the overall AI market size will grow tenfold, whereas the market for AI in cybersecurity applications will probably grow even more than that, almost like 20 times, 15 times, if you will.

And again, there are some really common misconceptions about AI. I have listed some of them on this slide, and I will go into details about some of them in my presentation. So yeah, some believe that AI is a cutting-edge technology that just appeared suddenly on the market a couple of years ago, and that it is promoted as the ultimate solution for any of your problems. And of course, in those two years, AI has supposedly matured enough to have a solution for any application; it will completely replace humans and rule the world. And of course it will completely revolutionize cybersecurity.

So let's look into those misconceptions a little bit deeper. This is just a short timeline to remind you when it all started. I would not claim that I lived through all of this. My personal experience with AI started probably in the eighties, when, as a teenager, I was trying to build an image recognition robot from blueprints in a Soviet technical magazine for teenagers, which was cool. And even back then, it wasn't actually considered new. It was stuff for kids to play with a little bit.

So yeah, AI started almost, oh, 70 years ago. It has grown steadily since then, and apart from the dark times of AI in the seventies and early eighties, when nearly nothing happened, it has then picked up the pace and has been basically exploding in recent years, when we finally got enough computing power to run it on a large scale. So AI has been with us for a long time, but it has only become a commodity, if you will, within the last decade or so, along with the cloud. This slide, I have to confess, I have quite shamelessly stolen from a really nice research paper.

It's just to illustrate that there is actually no such thing as AI as a single concept. It's a hugely complicated and large research field where multiple technologies, even multiple philosophies, if you will, are developed simultaneously and separately by different teams, by different research institutes and corporations. So anyone who claims to be an AI expert, you should always ask: which AI is your field of expertise? So let's talk a little bit about applications of AI in cybersecurity.

If you look at the market at the moment, you will find hundreds, if not thousands, of products which are offered with an AI or ML label on the packaging. The problem is that not all of those products are created the same. And on this slide I've listed, I believe, five major current and future application fields, if you will, for machine learning and general AI in cybersecurity.

So it all starts with simple stuff: data correlation. This used to be purely statistics, a bunch of methods which mathematicians like myself studied in the early university years. Sometimes there isn't even any machine learning in it. It's just the same statistics packaged in a shiny box and sold to you as a solution to correlate your security data, to find patterns or anomalies in your network traffic, or to look for user behavior anomalies, for example.
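As an illustration of the purely statistical kind of anomaly detection described here, the following is a minimal sketch using a z-score over event counts. The function name, the threshold and the sample data are assumptions made up for this example; no specific product is claimed to work this way.

```python
from statistics import mean, stdev

def find_anomalies(counts, threshold=3.0):
    """Return indices of values more than `threshold` sample standard
    deviations away from the mean (a plain z-score test, no ML involved)."""
    mu = mean(counts)
    sigma = stdev(counts)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, c in enumerate(counts) if abs(c - mu) / sigma > threshold]

# Hypothetical hourly counts of failed logins for one user account.
hourly_failed_logins = [2, 3, 1, 2, 4, 3, 2, 3, 2, 1, 3, 2, 40, 2, 3, 1]
anomalous_hours = find_anomalies(hourly_failed_logins)
print(anomalous_hours)
```

With this made-up data only the hour with 40 failed logins is reported. A single extreme value also inflates the standard deviation, which is one reason such simple statistics need careful tuning in practice.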

So yeah, this is probably where 90, if not 95% of all existing AI-based cybersecurity products fall. The next important development is decision support. Again, we are not talking about replacing the human analyst. We're talking about augmenting his daily job with, obviously, forensic analysis, just helping a human analyst to dig through huge amounts of security data faster and to make a decision, hopefully the right decision, faster.

Again, this has already found some applications in current-generation security products. I would argue it's still somewhat rudimentary. Basically, it's looking back at your history of resolved cases and suggesting that maybe the current case looks somewhat similar to the one you did three weeks ago, so let's look in that direction, for example. However, there are interesting developments here as well, all focusing on improving your productivity in different fields. Expert systems and next-gen expert systems are a really interesting development.
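The "this case looks like one you resolved three weeks ago" idea can be sketched, under strong simplifying assumptions, as a nearest-neighbor lookup over sets of indicators. The case IDs, the indicator names and the choice of Jaccard similarity are all hypothetical and only meant to show the principle.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def most_similar_case(current, resolved_cases):
    """Return (case_id, score) of the resolved case closest to the current one."""
    return max(
        ((case_id, jaccard(current, indicators))
         for case_id, indicators in resolved_cases.items()),
        key=lambda pair: pair[1],
    )

# Hypothetical history of resolved cases, each described by observed indicators.
resolved_cases = {
    "case-101": {"phishing_email", "macro_document", "powershell_spawn"},
    "case-102": {"brute_force_login", "vpn_access", "new_country"},
    "case-103": {"usb_device", "large_file_copy"},
}
current_case = {"phishing_email", "powershell_spawn", "outbound_c2"}

best_id, best_score = most_similar_case(current_case, resolved_cases)
print(best_id, best_score)
```

The analyst still makes the decision; the lookup only suggests where to start looking.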

It's basically summarizing your existing knowledge in a slightly less rule-based and more semantic form. Again, expert systems have existed for decades, but now, with the help of AI, they are evolving somewhat. The next step is intelligent automation.

Again, automation has been at the core of the IT industry and of course at the core of cybersecurity. It used to be mostly rule-based automation. Now, with machine learning applications, you can extend this automation to more generalized problem fields. So we are talking about security orchestration and incident response (by the way, if anyone attended our bootcamp yesterday, this is what we've been talking about, and if not, you will find the slides online), robotic process automation, or security testing. You name it.

It's all about even further augmenting your productivity as a human analyst, or augmenting a simple rule-based engine you had in place with something more sophisticated. However, where it comes to the really interesting stuff is what we call cognitive technologies. It's all the stuff around language processing, image recognition and others, less about the things which can be expressed in numbers.

And more about words, if you will. This is where it has already been applied for obvious things such as threat intelligence, collecting data from the dark web, research publications, you name it, detecting business email compromise and anti solutions. And again, BEC is becoming increasingly sophisticated, which isn't something you can detect with rules anymore.

And again, I am speaking from my own experience. Sorry, I'm not allowed to share any details, but yeah, this is what we are doing at the moment, and I do not wish any company to deal with stuff like that. So here it's looking for patterns beyond just security events. It's helping to improve your risk analysis and to optimize your policies, because policies are more often than not also expressed in words and not in numbers or rules.

Finally, the next gen, if you will, is what I would tentatively call autonomous AI. It's AI which is, in theory, able to make its own decisions without any human involvement. Again, there are some interesting developments. There are some companies which are already using the word autonomous in different contexts. You have probably heard about autonomous databases, or autonomous threat mitigation from companies like Darktrace, for example. More of that is coming in the future. So floating security systems, self-mitigating AI systems. It's something, sorry, excuse me.

That's something which we will be discussing in detail later. Now, security event correlation. This can happen in any field of cybersecurity, starting with your firewall logs, or anti-malware, dealing with the huge variety of modern malware patterns. Threat hunting: how do you dig through your logs manually to find ongoing threats? Data governance: how do you deal with huge amounts of data and decide which data is worth securing before any other data? Access control.

How do you make your sensitive data and resources more secure by protecting them not just from illegitimate access, but from access patterns which look legitimate but are coming from a malicious third party? Of course, AI and machine learning are applied widely in anti-malware solutions. Again, not all anti-malware tools with AI are created equal. Some claim to do fine with just AI; I will talk about it later. Some use machine learning to augment existing signature-based or behavior-based detection engines. And then we are coming again to the topic of cognitive analysis.

And I'm really sorry, something in the air here, I guess. Now, security policies. They always start as Word documents and not as some kind of applicable rules or scenarios. And before those security policies can be applied to a specific technology stack, they have to be validated and tested and approved as to whether they comply with regulations and contractual obligations, industry standards and so on. And this is where the real, the true cognitive AI technologies come to help. Deep learning: that's another misused word.

On this slide, I just randomly found a really nice illustration which shows how deep learning actually works for computer vision. The whole point of deep learning is that you have not one neural network, but multiple, which start with large and rough features, if you will, in your input data and then refine them gradually with further layers of analysis. And of course this works great with recognition, but it can also be applied to really interesting cybersecurity-related developments. For example, there are companies which say: we will take a snapshot of your binary code.

We will look at it as if it's an image, and we will be able, just by looking at your code, to understand what the application does and whether it does something malicious. Really nice stuff, still more or less academic, but definitely something to look for in the near future. Hidden Markov models.
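To make the "look at binary code as if it's an image" idea a bit more concrete, here is a minimal sketch of only the first step: arranging raw bytes into rows of grayscale pixel values that an image model could then consume. The function name and the row width are assumptions for illustration, and the classification step itself is not shown.

```python
def bytes_to_grayscale_rows(data: bytes, width: int = 16):
    """Arrange raw bytes into rows of `width` pixel intensities (0-255).
    The last row is padded with zeros so that all rows have equal length."""
    if width <= 0:
        raise ValueError("width must be positive")
    padded_len = -(-len(data) // width) * width  # round up to a multiple of width
    padded = data.ljust(padded_len, b"\x00")
    return [list(padded[i:i + width]) for i in range(0, padded_len, width)]

# Stand-in for the contents of an executable file (hypothetical sample data).
sample = bytes(range(40))
image = bytes_to_grayscale_rows(sample, width=16)
print(len(image), "rows of", len(image[0]), "pixels")
```

Each byte becomes one pixel, so structural regions of a file (headers, code, padding, packed data) show up as visually distinct textures.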

Technically, this has nothing to do with machine learning. These are more or less traditional statistical methods, which work well with data where you have to predict an outcome from some process without actually seeing all the existing data, all the existing steps of the process. When combined with proper machine learning, it yields some really interesting applications in, for example, dealing with business email compromise, or overall analysis of your employee mood, if you will.
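A small sketch of what inferring a hidden state from partial observations looks like with a hidden Markov model: the standard forward (filtering) recursion, written in plain Python. The two states, the observation labels and every probability below are invented for this example and do not come from the talk.

```python
def forward_filter(observations, states, start_p, trans_p, emit_p):
    """Return P(hidden state | all observations so far) after the last
    observation, using the HMM forward recursion."""
    # Initial step: prior probability times likelihood of the first observation.
    belief = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        # Predict via the transition model, then weight by the emission likelihood.
        belief = {
            s: emit_p[s][obs] * sum(belief[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    total = sum(belief.values())
    return {s: p / total for s, p in belief.items()}

# Hypothetical model: is a mailbox behaving normally or is it compromised?
states = ["normal", "compromised"]
start_p = {"normal": 0.95, "compromised": 0.05}
trans_p = {
    "normal": {"normal": 0.95, "compromised": 0.05},
    "compromised": {"normal": 0.10, "compromised": 0.90},
}
emit_p = {
    "normal": {"usual": 0.9, "unusual": 0.1},
    "compromised": {"usual": 0.3, "unusual": 0.7},
}

posterior = forward_filter(["usual", "unusual", "unusual", "unusual"],
                           states, start_p, trans_p, emit_p)
print(posterior)
```

The hidden state is never observed directly; the model only sees "usual" or "unusual" activity and updates its belief step by step.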

So this whole topic of dealing with insider threats: how do you predict whether an employee would go rogue for any reason? These are the technologies which can be used to do this. So, artificial intelligence. Again, I don't really like the term. Machine learning doesn't fit at all, artificial intelligence is a little bit too big, but let's stick to it. So AI/ML is definitely something which is already here. Its development is ongoing. It definitely helps a lot within cybersecurity, just like in other aspects of IT.

So it reduces the time spent on mundane tasks and thus improves your security staff productivity. It really can power better, smarter tools, not necessarily by definition, but if implemented properly, it can improve your overall security posture, just like any other good security tool. But it will definitely not replace all of them. It will definitely not make your human security analysts redundant. And it will definitely introduce additional challenges. Let's talk about those. Obviously, AI is still by far not a commodity.

It's still very costly, both in terms of running it (without the cloud, it's pretty expensive) and the skills needed to actually make it work properly, to train it, to produce the data. It's still really complicated. So there are definitely not enough people in the world who understand how it's supposed to work and the implications of that misunderstanding, a kind of AI skills gap. You have heard about the ethical and legal challenges; it's not just about technology. Then the lack of transparency and explainability.

So almost all those tools are just black boxes. You buy, you deploy, and you expect the right answer from them, but how can you verify that it's actually right, that it's not failing due to some adversarial attack or just some misconfiguration or weakness in the original threat model? And of course the data used to train those algorithms has many issues. Bias is just one thing which is mentioned quite a lot.

You know the popular saying: garbage in, garbage out. If you train your security tool on a very narrow data sample, it will never be able to go beyond that and just magically start detecting other types of malware which were never in your training set, if you will. But probably the biggest challenge specifically for cybersecurity is the lack of proven models.

So yeah, AI in cybersecurity, machine learning in cybersecurity, relies on designing specific models for specific applications, training those models with huge amounts of properly selected data, and then somehow testing whether they work properly. Unfortunately, for many tools on the market, there is no way for us customers to actually verify that those models are working as intended. And I will show you a couple of negative examples later. Data scarcity and sensitivity.

So yeah, all data is gold now, but security data is doubly so, because security data is by definition sensitive data, often something like PII, personal data, which has very strict limitations on how you can share it with third parties, including that security vendor who is striving to give you the best AI-powered security tool. And of course the quality of the data: if you are limited to only a small sample of potential or current customers, how can you be sure that the algorithm trained on that data will work outside of the sample?

AI models are vulnerable. Not just that, all implementations of those AI models are vulnerable in their own way. And I think we already heard some examples of this earlier.

So yes, there are definitely ways to exploit any machine-learning-based model by creating those adversarial examples. For instance, you can wear a special kind of colorful glasses which make face recognition technology think that you are someone else, or you can put a sticker on a traffic sign and make your Tesla car drive into a wall. The same applies to cybersecurity. There are already known examples of successful adversarial examples being implemented against AI-based security tools, which make them go crazy, if you will. Unfortunately, some of those tools have internal weaknesses in implementation.

Probably the most famous one is the antivirus from Cylance, which is one of those vendors who claim that they can do all the security stuff just with machine learning, nothing else. And it turns out that to do so, sorry.

Yes, to be able to claim this, they had to tweak their engine to include some hidden whitelist to prevent false positives on popular games and other applications, and security researchers were able to turn their malware bits into impersonators of that gaming software just by sticking some strings dumped from that game at the end of the malware file. So it's really easy to break your AI security solution if even the developers themselves do not trust their AI and have to resort to workarounds.

But perhaps the biggest sign of developers' lack of trust in their own autonomous tools is that those tools actually never come in a box. They always come as a managed service, because the developers always know that those supposedly autonomous solutions have their limitations and they will always need some tweaking from behind the scenes. If something breaks, there has to be an engineer to intervene and fix the problem immediately.
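Why appending benign-looking strings can flip a verdict is easy to see on a deliberately naive toy model. The sketch below is not the vendor's actual engine; the weights, the string names and the averaging rule are all invented to show the general weakness of a score that benign features can dilute.

```python
# Hypothetical per-string weights of a toy model:
# positive = suspicious, negative = typical for a trusted application.
WEIGHTS = {
    "CreateRemoteThread": 2.0,
    "VirtualAllocEx": 1.5,
    "keylog": 2.5,
    "GameEngine::Render": -2.0,
    "LoadTexture": -1.5,
    "PlayerScore": -1.5,
}

def score(strings):
    """Average weight over all known strings found in a file."""
    known = [WEIGHTS[s] for s in strings if s in WEIGHTS]
    return sum(known) / len(known) if known else 0.0

def is_flagged(strings):
    """The toy model flags a file when its average score is above zero."""
    return score(strings) > 0

original = ["CreateRemoteThread", "VirtualAllocEx", "keylog"]
# Same file, with strings typical for a trusted game appended twice at the end.
padded = original + ["GameEngine::Render", "LoadTexture", "PlayerScore"] * 2

print(is_flagged(original), is_flagged(padded))
```

The behavior of the file has not changed at all, only the statistics the model looks at, which is why detection that relies on a single model is fragile without additional layers.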

Now, these are just a couple of examples. The first one I already mentioned: Cylance. It happened a few months ago, and it was probably the first widely publicized example of subverting an AI-based security tool. Another one happened just recently. It's not that exciting. It's just that the developers of some emulation software complained that every time they release a new update of their product, it immediately gets banned by Microsoft Defender as potential malware.

Of course, they file a whitelisting request and Microsoft engineers go and delist it. But the next time they release another build, it will be banned automatically.

Again, it turns out that Microsoft simply has no way to shut their ML engine up once and for all, and you can never predict how many additional data samples you have to feed into the model to make that switch in the black box. So they had to deal with this for weeks, and it will probably continue for months. You never know. And there is always the relevant xkcd, that popular online comic, talking about, yeah.

You know, garbage in, garbage out, and how do you just tweak it? You can't just tell your machine learning model to make a new decision. You have to coax it step by step. And in the field of cybersecurity, where the threats are changing daily, if not hourly, it's just impossible. And of course there is the topic of AI-driven cyber attacks. So if you are not using AI or machine learning, someone else will, against you. This is still largely speculation, but there are some researchers reporting that it is theoretically possible. So researchers have done it.

And there are already allegations that, for example, AI and machine learning are used for creating deepfakes of voice or images to use in social engineering or business email compromise, for example. It's something else to think about. And this would be my last slide. So just a few key takeaways.

So yeah, again, AI and machine-learning-based security tools exist, they can help you, they will improve your efficiency, but they will definitely not replace it completely. I mean, not replace your human IT security completely. And of course, hype around the buzzword should really be the last thing for you to consider during purchase decisions. It's not a silver bullet; it has its own limitations, and those limitations go way beyond IT.

And other speakers will definitely talk about those limitations today. On the other hand, do not expect to sit it out and not adopt machine learning to a certain extent, because if you don't, someone else will, against you. And this is it. Thank you. I think we have just enough time for one question, perhaps. Thank you. So does any.
