Donnie Wendt, Principal Security Researcher at MasterCard, will give a presentation entitled Machine Learning: Cybersecurity’s Friend & Foe on Wednesday, November 10 from 14:20 to 14:40 at Cybersecurity Leadership Summit 2021.

To give you a sneak preview of what to expect, we asked Donnie some questions about his presentation.

Could you give us a sneak peek into your presentation: “Machine Learning: Cybersecurity’s Friend & Foe”?

Sure. My presentation is designed to help cybersecurity professionals understand what machine learning is and how it's used for cybersecurity. When you look at the marketing material for most cybersecurity products today, machine learning and artificial intelligence appear to be this magical fairy dust. In reality, employing machine learning for cybersecurity does offer tremendous benefit and great promise. However, there are many risks that security professionals must understand.

So my presentation really seeks to demystify machine learning and its underlying concepts so that cybersecurity professionals can harness its true potential. I'll discuss some of the possible vulnerabilities and risks, including attacks on the process, data, and models, and also explore using adversarial machine learning to test and protect the machine learning models we use in cybersecurity.

How can Machine Learning improve Cybersecurity?

Well, one of the challenges in cybersecurity is dealing with the extremely high volumes of data and the rate of emerging threats out there. I've also done a lot of work in security automation, and quite simply, the amount of data and the rate of attack variants make manual analysis and detection impossible. This is where machine learning comes in: it can effectively and efficiently deal with massive amounts of data while discovering subtle changes and patterns in that data. So employing machine learning offers tremendous benefits to cybersecurity.

It has become a vital component in many of our security solutions. There are many cybersecurity use cases that are perfect for machine learning, including spam filtering, malware and phishing detection, intrusion detection, biometric authentication, user behavior analysis, and anomaly detection. The list goes on; almost every cybersecurity use case we have is a good fit for machine learning. The power of machine learning to analyze large amounts of data efficiently and accurately ensures that its use in cybersecurity is going to continue to expand rapidly.
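To make the anomaly-detection use case concrete, here is a minimal toy sketch (my illustration, not from the talk): flag any event whose value deviates from the mean by more than a chosen number of standard deviations. The login-count data and the 2.5-sigma threshold are hypothetical; real systems use far richer features and models.

```python
# Toy anomaly detector: flag values far from the mean (hypothetical example).
from statistics import mean, stdev

def flag_anomalies(values, threshold=2.5):
    """Return indices of values more than `threshold` std-devs from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Hypothetical daily login counts for one account; the final spike is the anomaly.
logins = [12, 9, 11, 10, 13, 8, 11, 10, 9, 250]
print(flag_anomalies(logins))  # → [9]: only the spike stands out
```

A real deployment would learn per-user baselines and use more robust statistics, but the core idea — modeling "normal" and surfacing deviations — is the same.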

What risks are there to consider?

Researchers have proven that machine learning models are vulnerable to adversarial inputs and attacks. We often classify those attacks based on the machine learning phase, such as training or inference, or based on whether they're causative or exploratory. For example, attacks during the training phase are often what we call poisoning attacks, which can cause the models to learn the wrong behavior.
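A toy illustration of a training-phase poisoning attack (my sketch, not the speaker's): slipping a few mislabeled points into the benign training set shifts a simple nearest-centroid classifier so that a malicious sample is scored as benign. All data here is hypothetical.

```python
# Label-flipping poisoning against a toy nearest-centroid classifier.

def centroid(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def classify(x, benign, malicious):
    """Nearest-centroid: return 'benign' or 'malicious' for feature vector x."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    cb, cm = centroid(benign), centroid(malicious)
    return "benign" if dist2(x, cb) <= dist2(x, cm) else "malicious"

# Clean training data (hypothetical two-feature samples).
benign = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1]]
malicious = [[5.0, 5.0], [5.2, 4.8], [4.9, 5.1]]

sample = [3.5, 3.5]
print(classify(sample, benign, malicious))           # → malicious

# Poisoning: the attacker injects mislabeled points into the benign set,
# dragging the benign centroid toward the attack sample.
poisoned_benign = benign + [[3.5, 3.5], [3.4, 3.6], [3.6, 3.4]]
print(classify(sample, poisoned_benign, malicious))  # → benign
```

Real poisoning attacks target far more complex models, but the mechanism is the same: corrupted training data teaches the model the wrong decision boundary.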

In the inference stage, we look for exploratory attacks that seek to understand how our model works, which can then be a precursor to evasion or integrity attacks on our models. In addition to the attacks, we also have to understand the risks associated with bias and the inherent integrity issues within the data or the model we're using. That's why I'm a very big supporter of transparency and explainability in our machine learning models, so we know what they're doing.

Which attacks are particularly dangerous?

Interestingly, I'd say the most dangerous attack on machine learning is the one we don't detect. One of the leading areas of research right now is how to detect attacks during the phases of the machine learning process. We have to understand where the weaknesses are in a given machine learning model or its data. For instance, if you're doing online learning, you are more susceptible to a training-phase poisoning attack because you're using live data to train your systems.

Another example: if you're using clustering algorithms for classification, you're much more susceptible to an evasion attack, because it's easy to create something that looks similar to what you're clustering. So really, the main weakness I see is cybersecurity professionals not necessarily understanding how the models within the tools they're implementing work. That's why I came up with this presentation: to try to bridge that gap. It's not to make our cybersecurity professionals data scientists, but to make sure they understand how these tools are leveraging machine learning.
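The evasion attack on a distance-based (clustering-style) detector can be sketched in a few lines (again, my toy illustration with hypothetical numbers, not material from the talk): the attacker nudges a malicious feature vector toward the benign cluster center until the detector accepts it.

```python
# Toy evasion attack: perturb a sample toward the benign centroid until a
# distance-threshold detector classifies it as benign.
from math import dist  # Python 3.8+

BENIGN_CENTROID = [1.0, 1.0]
THRESHOLD = 1.0  # accept anything within this distance of the benign centroid

def is_malicious(x):
    return dist(x, BENIGN_CENTROID) > THRESHOLD

def evade(x, target_centroid, detector, step=0.1, max_steps=100):
    """Move x a little toward the target centroid until the detector accepts it."""
    x = list(x)
    for _ in range(max_steps):
        if not detector(x):
            return x
        x = [xi + step * (ci - xi) for xi, ci in zip(x, target_centroid)]
    return x

evasive = evade([5.0, 5.0], BENIGN_CENTROID, is_malicious)
print(is_malicious(evasive))  # → False: the perturbed sample now slips past
```

The evasive sample still started out malicious; it simply "looks like" the benign cluster, which is exactly the weakness of clustering-based classification described above.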