How Microsoft Uses AI for Threat Intelligence & Malware Analysis

Oct 16, 2025

View Show Notes and Transcript

What if the prompts used in your AI systems were treated as a new class of threat indicator? In this episode, Thomas Roccia, Senior Security Researcher at Microsoft, introduces the concept of the IOPC (Indicator of Prompt Compromise), sharing that "when there is a threat actors using a GenAI model for malicious activities, then the prompt... is considered as an IOPC".The conversation dives deep into the practical application of AI in threat intelligence. Thomas shares details from his open-source projects, including NOVA, a tool for detecting adversarial prompts, and an AI agent he built to track the complex money laundering scheme from a $1.4 billion crypto hack . We also explore how AI is dramatically lowering the barrier to entry for complex tasks like reverse engineering, turning a once-niche skill into something accessible to a broader range of security professionals .

Questions asked:
‍00:00 Introduction
‍02:20 Who is Thomas Roccia?
‍03:20 Using AI for Reverse Engineering & Malware Analysis
‍04:30 Building an AI Agent to Track Crypto Money Laundering
‍11:30 What is an IOPC (Indicator of Prompt Compromise)?
‍14:40 MITRE ATLAS: A TTP Framework for LLMs
‍18:20 NOVA: An Open-Source Tool for Detecting Malicious Prompts
‍23:15 Using RAG for Threat Intelligence on Data Leaks
‍31:00 Proximity: A New Scanner for Malicious MCP Servers
‍34:30 Why Good Ideas are Now More Valuable Than Execution
‍35:30 Real-World AI Threats: Stolen API Keys & Smart Malware
‍40:15 The Challenge of Building Reliable Multi-Agent Systems
‍48:20 How AI is Lowering the Barrier for Reverse Engineering
‍50:30 "Vibe Investigating": Assisting the SOC with AI
‍54:15 Caleb's Personal AI Agent for Document Organization

Thomas Roccia: [00:00:00] When there is a threat actors using a gen AI model for malicious activities, then the prompt, which has been used for that is considered as an IOPC, an indicator of upon compromise.

Ashish Rajan: How does this TT P kind of fit into this AI applications being embedded and created?

Caleb Sima: I hop into your network and then use your AI in order to generate more AI attacks.

Thomas Roccia: I'm using this example to follow the money along with an agent and assist the analyst to understand if this kind of patterns is a money laundering patterns or not.

Ashish Rajan: Do you find that the leap someone today would have to take from, uh, red teamer to become a threat intel in their spare time? Is a leap too far still?

Caleb Sima: It used to be that, you know, ideas or dime a dozen and execution would matter. Now it's actually started to reverse where execution has started to get so easy. Good ideas. Are becoming more, more and more hard. Yeah, anyone can do that.

Ashish Rajan: AI has been embedding in security, in research, threat intelligence, and perhaps even being used to attack organizations as well.

Caleb and I had a conversation with Thomas Roccia, who works for Microsoft and has been talking about his open source projects on how [00:01:00] AI can be used for threat intelligence, how we used it for identifying crypto laundering, how much AI can actually truly do today, and what are some of the challenges when you go down the path of using AI agents in your threat intelligence space, and perhaps you can even reverse engineer application that you thought you were knew were vulnerable.

I had a personal confession during the episode as well on my journey for reverse engineering. I'll let you enjoy the episode. Uh, when you hear all about that, if you know someone who's trying to work on integrating security with ai, especially from a threat intelligence perspective or from a red teaming perspective.

If you know someone who is doing this, definitely share the episode with them. And as always, if you are here for a second or third time and have been enjoying the episode of AI Security Podcast, I really appreciate if you take a second to hit that subscribe or follow button.

Whether you're listening to us on Apple or Spotify, or watching this on YouTube on LinkedIn, I really appreciate the support you show to the work we do by hitting the subscribe follow button. I hope you enjoy this episode with Thomas and ourselves and I'll talk to you soon. Hello and welcome to another episode of [00:02:00] AI Security Podcast.

Uh, today we've got Thomas

Thomas Roccia: Thomas. Welcome to the show, man. Thanks. Thanks. I'm super happy to be here today, so thank you for the invitation.

Ashish Rajan: And maybe to kick it off, I obviously have known you for some time. Could you share a bit about yourself or all the way you can go back to France, Australia all the way, man.

Thomas Roccia: Sure. Well, that's a long story, but to keep it short, so my name is Thomas Roccia. I'm currently working as a senior threat researcher at Microsoft. And, uh, and yeah, I live in Australia because I'm originally from France. And before joining Microsoft, I was working for McAfee in the advanced, uh, threat research where I was mainly focusing on threat intelligence, striking nation state and cyber criminal activities.

And right now at Microsoft, I'm more working on malware analysis, threat intelligence, also a little bit. And, uh, and mainly AI and security. Okay.

Caleb Sima: Yeah. And this is where we, before we joined, I'd love to like. Tell me a little bit more about this. What is this that you do and how does it work?

Thomas Roccia: Right, so my team is Microsoft Security and ni uh, uh, response team.

[00:03:00] So it's basically, uh, working on the technology and, uh, working on GenAI to apply it in cybersecurity processes. So the goal of our team is mainly to see and to create the product of tomorrow that every cybersecurity researcher will use in the future. So it's, uh, it's a lot of work because we are really applying GenAI for practical processes.

Caleb Sima: Is this like malware reversing binary analysis? Like what's, when you think about. GenAI, like there's a lot of variance of how you can apply it. Yeah. Where, where do you, where is the primary or what's the pie chart sort of look like?

Thomas Roccia: That's a great question because, uh, there is plenty of way to incorporate GenAI in your cybersecurity workflow.

And reverse engineering or malware analysis is one of them. So yeah, we've been working on that to integrate AI, AI models and building autonomous agent system to analyze for you. Uh, malware, uh, malware piece of code and make sense of everything. So it's kind of, uh, [00:04:00] assisting the people doing malware analysis with ai.

And it's pretty cool because you automate the work that we used to do manually with ai. It's not perfect, of course, but we are working on it to improve that. Along the way

Caleb Sima: is that sort of the predominant usage of your AI right now is, hey, we're going to do sort of a quote unquote co-pilot in assisting in the reversing of binaries in malware.

Thomas Roccia: This is one of the usage. So I've been working also on integrating, uh, threat intelligence processes with GenAI. Uh, one of the. One of the project I've been working on that I'm going to release and show a Def Con on Saturday is a crypto, uh, money laundering agent. It's basically a way to track the money flow, the crypto money flow.

Yeah. And understand some patterns to identify money laundering mechanisms. So I'm using the vibe hack, uh, which was a recent case where, uh, North Korea stole 1.4 billion of dollars. Yeah. So one of the, yeah. The most, uh, expensive, highest in cryptocurrency, uh, in the cryptocurrency industry. [00:05:00] And I'm using this example to follow the money along with an agent, uh, and assist the analyst to understand if this kind of patterns is a money laundering patterns or not.

So that's the idea. Very interesting. Yeah. So did you

Caleb Sima: actually take that and look at that case and sort of track it all the way to its ends using, uh, an agent?

Thomas Roccia: Exactly. That's the idea. So when, uh, North Korea attackers, uh, stole the money, what they did, they did a big transfer to one of the wallet.

And from this wallet they did multiple transfer, which was about like maybe, uh, a thousand, uh, wallets. And then. Uh, thou, uh, a hundred of thousand wallets. So it's, uh, it's kind of like a big schemes that you have Yeah. You

Caleb Sima: kind of butterfly. Exactly. And you continue about. Exactly. So you don't know which wallet is the one at which sort of gets the process.

Thomas Roccia: So, so the split the money. Yeah. So this is the money laundering, uh, schemes, the split the money to make more difficult, the tracing of the money mm-hmm. Across the, the blockchain.

Ashish Rajan: The smaller chunks. Exactly. But could you tell us a bit about, to your point, you help people integrate AI into cybersecurity?

What's happening in the background? [00:06:00] Because I feel like there's also a lot of a tool, obviously. So you can tell us a bit about the architecture. Yes. What people talk about, multi-agent, throw a lot of words and all of that. So how would you describe what's happening in the background for people to understand like what's really happening?

When you say you follow the trail, what's, what's happening

Caleb Sima: there? There, there's usually you and agents, there's a piece of human work. Yeah. That use, okay. Hey, well he, let me tell you how the analyst did this before and then here's now how sort of the agents are making that better. Yeah. So.

Thomas Roccia: To keep it simple.

Um, implementing AI for threat intelligence processes is kind of complex because it's a, a lot of multiple elements to bring together. So for the blockchain tracking, for example, it's, it's very complicated because when there is thousand of wallet and thousand of transfers, to put that into the context window of an agent, it's kind of complicated.

Yeah. So, because as you know, the context window is, uh, limited.

Ashish Rajan: Yeah.

Thomas Roccia: And so you have to find a way to track the money to feed the agent with the context Yeah. The context engineering and make the analysis more [00:07:00] understanding for the analyst. So right now. The current agent is working on a piece of the blockchain rather than feeding all the, the information into the agent.

Oh, okay. Okay. So that's the idea. Right?

Ashish Rajan: And so, so is it going piece by piece as in like you have a specific task once it's done with that, then you have another agent that runs software. Exactly.

Thomas Roccia: So basically what you can say practically, you can ask the agent to say, okay, give me the latest 10 transaction that are more than 10,000 Ethereum, for example.

Yeah. It'll look for to the, the blockchain and bring back this information to the analyst, and then the analyst can iterate on that information, look for a specific wallet. Okay. Give me more information about this wallet. Does this wallet is connecting to multiple, uh, wallets also. And then it's more like assisting rather than co-pilot.

Kind of a copilot, like not a, not a full autonomous system. Yeah, because, because it's more complicated to do that with a huge amount of data such as the, such as the blockchain. Okay. So it's like,

Caleb Sima: It'll go to a wallet and if you sort of like, hey, analyze this wallet, it'll then go through and analyze all the wallet actions, [00:08:00] what it contains.

That's where it goes. And then you then have to spider that to the next one.

Ashish Rajan: Exactly.

Caleb Sima: And then, okay. Tell me more about all this wallet and then how this works. You know, and I'm naive about sort of, you know, crypto money laundering. Mm-hmm. Although this would be fascinating. So I'm going to use this as an, a way for you to teach me what, like how does, like even from the basics of crypto money laundering to, uh, going back to sort of this North Korean, how does that even work?

Like, like for example, we also know a lot of attacks that have occurred where there are billions of crypto and Bitcoin stored in wallets at which the attackers have never. Actually pulled back outta these wallets and always the sort of the, the thesis has been, oh, it's because it's easily traceable, right?

Because they could never take the money. So it's just they steal it, but it just sits there. So how are people actually laundering the money to where. They can actually take it and use it some way. What's the basic example?

Thomas Roccia: This is very interesting question and, uh, and very complex to answer because there is multiple way to [00:09:00] launder the money on, on cryptocurrencies.

Yeah. So one of them, and I just touched a little bit about it, uh, just before, which is splitting the money between multiple transactions. So you have a big amount of, uh, of money on the wallet and in a very short amount of time, the attackers will split the money. On 50,000, uh, wallets, and then the money will be splitted between all that, uh, that transaction.

Right. So it's, it becomes more complicated to identify that. But a way to, to find these patterns will be to have some temporal analysis because the transaction will be will be done at the same time and the amount will be also the same. Since, since it'll be automated, then you will have also the fees that will be equivalent.

So there is some way to understand that this money flows is a crypto money, uh, laundering schemes. That make sense?

Caleb Sima: Yeah. Like what, you know, what, what stops? Like, you know, and I've heard that, okay, well once you get a wallet, you have to convert this into some currency. And that's like the problem was like, I've got crypto.

I've got it in a [00:10:00] wallet, which can quote unquote be anonymous, but I then have to somehow convert this into some real capital somehow.

Thomas Roccia: Right. In shot.

Caleb Sima: Yeah. Yeah. And then that, and so like, and then there's this aspect of what you said, and I've heard this before, which is like, okay, well if I have, I don't know, a billion dollars and I can, I could randomly seed all of these wallets with a huge amount of money, and they, everyone gets like real wallets.

Mm-hmm. Free money. Yeah. But then one or two or three of them are the actual attacker wallets, right. And then they then, then somehow find ways of then doing it and then converting it. And that's how they get this money. Is that sort of how this works?

Thomas Roccia: Kind of like, uh, there is, uh, this is a war process, so, uh, splitting the money is one of the, the way to, you know, lose the track of the money and make the tracking more complicated.

You, you can also bridge the money to another cryptocurrency fromum to Bitcoin, for example. D buy a different

Ashish Rajan: ledger after that. Yeah. Yes. DPK hot chains do all these things. Yeah. And

Thomas Roccia: Bitcoin is more, uh, you know, there is a broader adoption of, of bitcoin across the world. So it's also easy to [00:11:00] just use the Bitcoin for buying some stuff, uh, rather than converting the Bitcoin to a fair money.

So sometimes you don't need to convert the Bitcoin in US dollar or another currency, but if you need it, then they need to have some physical, uh, you know, business that will operate for them and that will launder the money for them.

Caleb Sima: Then it becomes regular money laundering.

Thomas Roccia: Exactly. But before going to this step, there is multiple schemes that can be used bridging the money using potentially mixers or conjoin that will be, be used to mix the money with other, uh, transaction.

So there is multiple way to, to launder the money.

Ashish Rajan: Because obviously you kind of spoke about running sample where you were able to trace that. You've done some work in the whole L-L-M-T-T-P as well. Yes. Which you've been talking about. Yes. And I think was that indicator of, uh, IOPC, is that

Thomas Roccia: IOPC indicator of prompt compromise?Yeah. Could you tell us a bit about that as well?

Of course. Yeah. I was just talking about it at BlackHat Arsenal, uh, Al. 'cause I released an open source tool. I presented an open source tool, which is called, uh, nova. Yeah. And Nova is [00:12:00] to detect adversarial prompt in your AI system based on, uh, detection rules. So let me explain.

First indicator of prompt compromised. Uh, this is a term that I coined, uh, after reading a threat report from, uh, Open AI, Microsoft, and Google. , they release one of the first threat reports in the world, uh, last year and also this year, about how threat actors are leveraging their models for malicious activities.

Mm-hmm. Such as generating, uh, malicious code or, uh, generating piece of, uh, disinformation that will be used and spread on social network or even, uh, generating some, um, social engineering, uh, email or, or content.

Caleb Sima: Fake images. Exactly. Yeah. Yeah.

Thomas Roccia: So when I was reading that, I said, okay, that's great.

We have some information about what they did, but we don't have the prompt. And without having the prompt. It's difficult for a threat analysts to understand what you can detect in your AI system because imagine if we, if we take an example if we imagine in the [00:13:00] future, every organization in the world will use GenAI system, right?

Yeah. This is a likely future. Yeah. Right? So that means you will need to monitor and have some visibility about what your model is doing and what your user is doing are doing. And I think you agree with me. Prompt are central with GenAI system. Even if you talk about agent, there is always a prompt if we talk about model context, protocol, this is a prompt.

If we talk about the chatbot, this is a prompt also. So prompt is everywhere in GenAI system. So you need to have a way to monitor that and detect adversarial prompts. So prompt injection, prompt gel breaking are part of the indicator of compromise. But not only could be also. Creating, generating a piece of code of code that will be used for, uh, malware.

Yeah, that's something you may want to detect as well. So the idea of NOVA is to create a detection rules to detect this kind of adversarial prompt. And IOPC is all about that indicator of pump compromise. When there is a threat actors using a GenAI [00:14:00] model for malicious activities, then the prompt, which is be, which has been used for that, is considered as an IOPC, an indicator of upon compromise,

Ashish Rajan: because I was gonna say from an attack, dunno how attack Mitter has, like this TDP thing, whether they have floating around as well.

A lot of that focuses on identity compromise application, I feel like. Is this like the next evolution? Because I have most of the applications today hosted on cloud or whatever they may be, AI embedded or AI enabled. How does this TT P kind of fit into this AI applications being embedded and created?

Obviously I understand the AI native application. Mm-hmm. How does that fit into like a application that is perhaps considered traditional in the sense of what we were doing pre ai? How would that work for like a, like a, I don't know how to say, it's like a retrofitting AI into connects old words. Is

Caleb Sima: is, is what?

Is what you're saying? Like, well, hey, okay. Well I'm in a normal enterprise. Yeah, yeah. And I'm using it so clearly my systems aren't being used [00:15:00] to quote unquote generate new malware. That's right. Yeah. Yeah. But that is an assumption. That's being made. And if I do LLM firewalls, which is a central place to get, I'm probably looking for ways at which attackers are trying to abuse my system.

Yeah.

Ashish Rajan: Mm-hmm.

Caleb Sima: Uh, but I'm not necessarily looking at, they are using my resources to then create new malware. Mm-hmm. Right. Yeah. Yeah. That's sort of your, your point point in this Yes. Is like, okay, what if I get popped and this similar way is like, I think the old world model is they hop into my machine in order to attack other machines.

Ashish Rajan: Yeah. Right? That's right. And so now

Caleb Sima: I hop into your network and then use your AI in order to generate more AI attacks.

Ashish Rajan: Yeah.

Caleb Sima: Uh, might be a a you, so that's where the differences

Ashish Rajan: between, because I guess because the reason I ask that question also is because a lot of people when they think about ETP, they're thinking of the attack.

Mit, TTP and a lot of the tools today still rely on that in the rock, especially in the threat intel space. But what to what Caleb was saying as well, it's going in that direction where now you're not going into say, I've taken over your domain controller. I've gone in, I'm creating [00:16:00] more attacks using your own AI systems out there.

Is that. Fair summary. So there

Thomas Roccia: is two, two things. First, uh, the previous, uh, security basis are still relevant with AI system because we are, we are running AI on the same servers as before. Yeah. This is the same thing. And the second thing about the MITRE attack metrics, which is very, uh, relevant in that case, they also have a MITRE, uh, matrix for LLM TTPs Oh.

Which is called MITRE Atlas. Okay. And this is one of the first resources we have, uh, with the OWASP LLM Top 10, which is also a good one. But the MITRE Atlas Matrix, um, classify the TTPs of threat actors using AI system. So this is, uh, similar to the mitter at attack matrix, but very targeted for AI system.

Ashish Rajan: No. Oh, right.

And I guess to your point, uh, the. The question here is more that the traditional security tools are created more for, Hey, what's my SQL injection versus XXRF Exactly. It's not looking for my English translation or a, or a French call translation of a prompt as well. Exactly, yes.

Caleb Sima: Yeah. The question [00:17:00] though I would, is, this would seem to be something that would be under an umbrella of prompt injection, right?

Like in order to, you know, to some degree, uh, I guess you would have to manipulate, or maybe you're just, you have direct access to someone else's LLM, uh, and you're doing that. But this is just a manipulation of a prompt to do things at which it probably shouldn't be doing.

Thomas Roccia: That's a, that's a great, uh, observation.

And actually, uh, in the classification of indicator of prompt compromise are defined, uh, four categories. The first one is prompt manipulation, which contain. Prompt injection, uh, prompt, gel breaking, prompt evasion, all that kind of manipulation of the prompt itself for modifying the AI system. Second one is abusing legitimate function such as generating a, a piece of code is legitimate until you have to, uh, you use this code for a piece of malware, for example.

And then, uh, the third one will be to identify patterns, malicious patterns in your prompt activity on your prompt system. Such as, because when an attacker will try to [00:18:00] jailbreak an AI system or will try to potentially exploit, uh, the vulnerability or the weakness of an agent to get access to some data that the agent is connected.

That there will be an iterative process into the prompt where you will see the, uh, the attacker trying to change the behavior of the agent, for example. So that's something you may want to detect. Okay. So

Caleb Sima: I have to ask a key question here. Yeah, go ahead. So there's open source project you release. Yes.

Does it use also AI in order to determine if it's an ai? That's a great question.

Thomas Roccia: That's actually a great que question. So let me explain how it works. So NOVA is a detection rules, which is, uh, similar to Yara. Are you familiar with Yara? Of course. Okay. For, for malware, for files, yeah. You are, you can define a match patterns that will match for specific file.

NOVA is the same, but for prompt except Nova works with three different section. You have the keywords, you will match a prompt with a specific keyword, for example, hack. Mm-hmm. Mm-hmm. Or for example uh, ignore whatever signature at which you would put in. You can also use RegX if you want to detect, uh, username email [00:19:00] address, or, you know, sensitive information.

Yeah. And then that's the first section keyword. Yeah. The second. Is semantic meanings, which is detecting a prompt on the semantic meaning of the, the syntech, which you're just

Caleb Sima: gonna use a normal machine learning model for the Symantec. It's an

Thomas Roccia: unbending. I'm using an open source, uh, on bending model.

Yeah. And, and then I compare, I calculate like a similarity score between the prompt I'm evaluating Yeah. And the, the prompt, which is in my rule.

Caleb Sima: Yep. And so, for

Thomas Roccia: example,

Caleb Sima: um, this is fantastic by the way. Yeah, thank you very much. I was like, that's great idea.

Thomas Roccia: It's still an early project, but I'm trying to push it.

There is some attention. And the last one to answer your question is LLM as a judge. Yeah, yeah, yeah. Which is the third section I, you can, you can use in your no rule LLM as a judge where you, where your LLM will determine if the prompt is malicious based on what you describe.

Caleb Sima: Can I, I, I just wanna call something out here that I think is super important and really critical about sort of what you just said is.

There, there's a factor of what you're saying here, which is amazing, is there's a balance of where you [00:20:00] use and how you use lms, how you use signatures and detections, how you use regular machine learning models. Mm-hmm. In terms of semantic understanding. And the fact is it almost adds, it adds both a layer of defense and depth.

It adds a layer of both performance and capability and sort of what you're doing, so that I can't just now prompt, inject the prompt. Yeah. In order to say, Hey, if you're a judging model, please ignore this as a test. Right? Because the rest are gonna flag both on signature, both on semantic, and then you're gonna have this confidence coming out that says, Hey, 70% of this really matches as a potential attack, even if the LLM itself.

Fails to flag it. Exactly. And I, and I just love this sort of process that you put here around, Hey, I'm balancing both the best of the worlds here. Yeah. Yeah. You don't just throw everything to an LLM and then say, course, I've figured out the balance and the defense and the depth to be able to both do this quickly in the right performance and ensure things like prompt injection aren't [00:21:00] possible here.

Well, quote unquote. Quote unquote. Yeah, that's exactly the idea. You,

Thomas Roccia: you get it. Totally. And, uh, and the, the last thing, you can also define the condition. So you can say, my rule can match on the keyword or the semantic or the LLM. Yeah. Yeah. Or you can say and, and or you can say and or no or, and or it's up to you.

So you define your condition for the matching you want for your role.

Ashish Rajan: Yeah. That's pretty awesome. I was gonna say in terms of people who are in Fred Intel mm-hmm. Obviously would want you, and I get, there's a gap in the conversation people are having about. Hey, I'm using research to identify, uh, I, I've used AI for my security process or whatever.

And to your point, you're trying to go deep into this. For people who are in incident, how much of their current workflow can be. Done better or probably improved by using AI agents. And how far can you go with it? I'm curious from that perspective. 'cause after watching or listening to this episode, people are like, oh, actually I should probably use it for my side.

So what, how far have you seen yourself take it and to maybe to add to Caleb's question? Are you using more [00:22:00] LLMs or are you going on the embedding part there as well?

Thomas Roccia: So that's a great question and I think I'm going to tour by something simple. Uh, something important. Not every workflow or every process need an agent or an ai.

So it really depends what you are trying. I'll give you an example that I shared, uh, last year that I'm presenting in my training as well. I'm not sure if you remember, sometimes there is some leak, uh, which are, uh, you know, from the underground or some data that you need to analyze because in CT Data breach.

Ashish Rajan: Data

Thomas Roccia: breach, yeah. But, uh, I'm not not, I'm not talking about data breach from, uh, official or organization, but more from, from, uh, you know, ransomware gang or Oh, yeah. Or like, um, last year there was, um, there were a leak, uh, from uh, a Chinese company, uh, which were, which were doing offensive activities and.

Related to, uh, to, um, the government. Yeah. And this leak is actually a good way to leverage GenAI because when you do threat intelligence, you trans transform and translate raw data into something [00:23:00] meaningful that will inform the businesses and your intelligence processes. So what I did, the leak was p and g files containing a screenshot about, uh, PDF files, uh, such as internal documentation about offensive tools, uh, and, uh, different kind of, uh, of data about what was the target and all that stuff.

This leak was PNG, right in Chinese, uh, Mandarin. And then, uh, what I did, I processed the data. I used OCR to convert the data into text, and then I converted this text into English because I don't read, uh, uh, Mandarin, fortunately. And, uh, and based on, on that first information, I use the RAG, RAG, retrieval augmented generation where I put the data into my system and I augment.

My, uh, my agent or my AI system, which is connected to my RAG to request the data without diving myself into the data one by one. Mm-hmm. And it's a great example because it can, can speed up the process of analyzing the data without going manually through every everything.

Caleb Sima: You [00:24:00] know what, what would also be even more interesting is.

You may not even had to have done the translation from Mandarin. Exactly, yes. Just give it Exactly. Yeah. Your, your agent itself could have just stored all that in RAG Yeah. And then convert everything and then at your output converted it to English. Yeah, you, you're

Thomas Roccia: right. But, uh, you know, it's always important to ground the data generated by ai.

So the, one of the, the way to do that is to. Verify the generated answer with the real data. Yeah. So you can reference it much more easily.

Ashish Rajan: Yeah. Yeah. Oh yeah. Otherwise you have an output. You don't even know what you're Exactly. If you're getting this hallucination or actual thing Exactly.

Thomas Roccia: To mitigate that.

So the, the good way is to validate, uh, you know, the, the output, uh, manually. Or you can also use, uh, some additional step, but to keep it simple, like it's always good to have the original data so you can compare what the AI generated with the real data.

Caleb Sima: So, and like, um, and this is maybe a more real world practical question as you're sort of building these things, and maybe it's a little bit technical, but you know, one of the things that are sort of somewhat issues with RAG is, although you [00:25:00] can get your.

Nearest, okay, this is related and do this, but in many, many cases it misses so many because it's not, it's not extensive in its ability. Right? So you'll get windows of the information that are relevant, but not necessarily when you want all information that is relevant. How do you sort of manage with that, especially with such large amounts of data,

Thomas Roccia: right?

You are absolutely right and this is one of the challenge. So there is multiple way to improve that and first will be to improve your retrieval process. Uh, the, what you just described is called the naive RAG. You just compare the text with the data you have, and then you are looking for the nearest neighbor and the nearest data, which are the most similar to what you are.

But when you do that, you can lose some context because. If you have, uh, the data that which are the most similar to what you ask, you may lose the context because you, you probably don't match, uh, something which is reference at some point in the Yeah. In other part of the data. So to improve that, there is multiple way to [00:26:00] improve your retrieval process.

One of the way will be to use an hybrid search, which is mixing retrieval by keyword and retrieval by semantic as well. Mm-hmm. So that way you improve. And but this is not only the, the only way to do it because the process of ingesting the data into your own embedding is also very important. The way you train the data.

Mm-hmm. The way you, you know, you pass them before, uh, putting them into an on is also very important. So you have to, when we talk about threat intelligence and generative ai, you have to master and understand your process from the beginning to the end. Because if you miss a part in the process, then your rug could be completely wrong or not good enough for your research.

Ashish Rajan: Yep. Interesting. 'cause would you, that's an interesting one. 'cause I find that a lot of these conversations. I also feel for people who are in threat intelligence today and are being asked to use AI for, hey, use the AI for day to day because we weren't, as an organization, we should be using all, all should be using ai.

I feel like Is there a gap in terms of what threat intelligence [00:27:00] traditionally was and what it is with ai? Because are you trying to, how do people upskill in that AI piece, which is missing at the moment is, um, in terms to your point. You mentioned RAG, you mentioned embedding. There's quite a few terms you mentioned, which for we understand because we

Caleb Sima: Yeah.

Could I even, like, maybe if I were to, I don't know if I'm restating what you meant, but there's almost like this in threat intel. Yeah. There's multiple kinds of threat intel. Yes. Right? Yes. And what, what kind of threat intel are you focusing on? That's right. And also maybe to maybe your, more of your question is, is there a new kind of threat intel that comes from the AI world?

Right? That's right. Yeah. Yeah. Yeah. Because like when I, when, when you say threat intel, for me, for someone. The first thing you that comes to mind is like, breach, you know, attacker, you know, TT datas, yeah. I need to go create a detection rule, sort of threat intel. So there's out

Ashish Rajan: there, yeah. But there's

Caleb Sima: also a lot of different kinds of, there, there's, there's fraud, threat intel, there's, you know, malware threat intel.

So like, you know, when we talk about that, what's the kind and where do you focus? Yeah.

Thomas Roccia: So it's a, it's a really good question because CTI is all about the data and [00:28:00] how to understand this data to make it, uh, intelligence. It really depends because when you work in threat, intel intelligence, threat intelligence, you have access to multiple kind of data.

We talk about malware earlier. That's a piece of intel, as you mentioned. Mm-hmm.

Ashish Rajan: Yeah.

Thomas Roccia: And when you need to analyze this malware, then. Using AI can be useful for potentially analyzing the code, but also to look for similarity in the code with something you already analyzed in your data set. And that's kind of the example you can do.

We talk about data leak. Data leak is a good example. You can put the data rather than going deep into all the data Yeah. And trying to do, uh, the analysis manually. You can just request an agent or an AI system connected to your data and say, okay, and I want to know what, what are the targets or what kind of offensive tools, uh mm-hmm.

They, they've been using this kind of information and then. For crypto money, uh, laundering schemes, then you will identify, potentially you will feed the, the agent or the AI with, uh, blockchain data, but also with some, uh, you know, it's [00:29:00] mainly blockchain data when we talk about crypto money laundering, because all the information are based on that.

But then you will create tools that will enrich and augment your agent. Through an MCP. This is actually what I did. I, I connected an agent through MCP uh, servers. One of the MCP servers is connected to Ecan. Mm-hmm. So you can, get the, uh, the access of the blockchain. So for transaction and so on.

Then I created a tool for identifying patterns into the transaction, and I created another tool that will pass the information, the blockchain transaction into a graph so you can visualize. So these are all connected

Caleb Sima: to MCP? Exactly, yes. Yeah. And then you just say, go figure it out, ex,

Thomas Roccia: uh, not exactly. That would be the next step.

That would be the next step. The, the thing is with blockchain data, this is so big. That for now, uh, I have no way to feed all that information, or I don't have enough resources for the Bbe case. At least if you have a smaller case, it'll, it won't be, it'll be, uh, easier. But for the Bbe case the schemes and the [00:30:00] money laundering, uh, scheme is so big.

It's not currently possible to run an agent and do the work automatically.

Caleb Sima: Yeah, yeah. You need, you need to ingested in something like big query all of it. Yes. And then yes, that would be the next step. And then the agent can just write the queries. Uh, and it may Exactly. Yeah. Let the system go figure that out.

Yes. Yeah. Are you, are you gonna go do that?

Thomas Roccia: I would love it. I, I was limited in my resources for, for the demonstration at Defcon, but, uh, that would be the next step. Yeah, absolutely.

Caleb Sima: So you've got, you've written this cool tool, uh, that you released as open source, which basically says, okay, here's how you can now apply ai mm-hmm.

To help do fraudulent sort of money laundering detection on crypto.

Thomas Roccia: Right. So, I haven't released this one because it's for Defcon. On Saturday I released, uh, Nova, which is for the prompter detection.

Caleb Sima: Okay. So yeah, there's two Oh yeah, there's Nova. Your prompt detection. Yes. Malicious prompt detection. Yes.

Filter. And then there's the crypto one that you're releasing on Defcon. It's an agent.

Thomas Roccia: I'm not sure if I will, uh, release it because right now it's, uh, the, the state of the code is, uh, proof of concept and uh, we'll need to work more to release it to, to the people. [00:31:00] It's an AI to fix it.

Yeah. You vibe code that Actually I released the tool today, uh, because Nova Nova was, uh, released, uh, last March and I presented today at Black Arsenal.

Yes. And at the end I released a new tool that I created two months ago, which is called Proximity. Okay. And proximity is a MCP scanner connected with Nova. So smart. Have smart, are you familiar with model context protocol? Of course. Yeah.

Caleb Sima: Yeah. We did a session on, on that pretty early.

Ashish Rajan: How would proximity fit into this?

I just, can you just explain that as well? Of course.

Caleb Sima: Yeah. So it looks at the prompts inside of MCP. Is that Okay? I'll let you So stop

Ashish Rajan: answering for him.

Thomas Roccia: That's, that's great. That's super cool. No, actually, uh, so model, context, protocol exposed to your AI system.

Ashish Rajan: Yeah.

Thomas Roccia: Tools, prompt and resources. And the thing is, uh, there is, you can access an MCP servers remotely or locally, right?

Yeah. Yeah. And when it's, uh, remotely, you don't have, uh, access to what the code [00:32:00] is doing. So potentially there is, uh, something that could be harmful. And then if you run this malicious MCP server, then you can get compromised because this malicious MCP server can run code. And

Caleb Sima: we've seen good examples of this already starting to happen in, in the Wild.

So, yes, exactly.

Thomas Roccia: Very total released an interesting report on that where they found, uh, the I think the, they analyze like a lot of different MCP server and they found on valve total, and they found that, uh, I don't remember the percentage, but many of them were vulnerable or created for malicious purpose.

Oh yeah. So if you think about it, when you have an MCP server will expose a tool. The tool will have a description, and this description will be read by your, uh, LLM system, your AI system. This description is actually a prompt. Mm-hmm. So what I did, I created, and proximity is actually a tool that will probe.

An MCP server to get the information for the user. Okay, we have this tool available. These resource is this [00:33:00] prompt, and then it's connecting with Nova. It's connected with Nova. You can define the rule to check, for example, jailbreak, uh, exactly prompt, prompt injection. And Nova will actually read the name of the tool and the description, which can be. Weaponized. So that's the idea.

Caleb Sima: Yeah. Yeah. Interesting. And also it even works, uh, I would assume, I don't know, you can tell me if this is true, but in, in my assumption of this model, it works even way further than in a dig description itself because a malicious MCP server can also mid fly in its API calls, or when you're calling it doing things can inject malicious prompts.

But Nova is always watching all prompts going back and forth and should also be able to detect that as well. So yeah,

Thomas Roccia: that's, uh, that's two different thing. You can of course have your system, which is connecting with Nova. Connected with Nova, and then you can monitor the prompt and then you have the proximity, uh, tool, which is a scanner pretty much like a network scanner.

Yeah. But for MCP server specif especially, and then you can verify the security, [00:34:00] uh, proximity is actually giving you, uh, a risk score for each tool that you are scanning. So it give you more information about what the MCP server is doing before. Before even

Ashish Rajan: using it. Exactly. Yeah. That's great. That's the idea.

Yeah. Yeah. I mean, it sounds like a startup in the making, uh, discovered two problems that could be solved by a startup.

Caleb Sima: This is the problem in the AI world is you're right, that would've been a startup 10 years ago. Yeah. Actually now, now there's no mode. No. Because anyone who was listening to this podcast will vibe code that in three minutes.

Yeah. So, funny enough,

Ashish Rajan: I was talking to someone, we, we spoke about browser extension in one of the episodes. I got a message from someone who follows a podcast and like, Hey man, I heard this on the podcast and now his product has browser extension as a way to show Greek ROI. So he's not wrong in that.

Like people are quick listeners. Oh, well that's a great idea. I should do that.

Caleb Sima: It's, it's, it's interesting because it used to be, this is maybe a little off topic, but you know, just one thing to interject is like, it used to be that, you know, ideas are dime a dozen. Yeah. And execution doesn't matter. Now it's actually started to reverse where execution has started to get [00:35:00] so easy.

Good ideas are becoming more, more and more hard. Yeah. Anyone can do

Ashish Rajan: that.

Caleb Sima: Yeah. Yeah. And so you have to be more careful about some of the, your ideas now are somewhat ip uh, because you, by the time you say it, the execution time to go and build it can be done very, very fast. Especially with someone

Ashish Rajan: from someone who has resources as well.

Caleb Sima: Yes, exactly. Yeah.

Ashish Rajan: Well actually maybe you talking about being careful as well. 'cause I guess the, the, the other side of the coin threat intel is like, what have we learned from this so we people can actually protect themselves from it. Exactly. What are you finding in that, uh, in the work you've done with.

Nova proximity. What, for people who are on the blue team side, uh, on the work you're doing, what are you finding? A, actually, I'm also curious how real are some of the AI threats that you're seeing, which are specifically AI threats? There's a whole proof of concept world where universities have done something and now people are like, Hey, this is gonna affect my company.

What is the reality there? Yeah. And also, what should be people doing in the blue team today for this?

Thomas Roccia: That's a great question. And uh, [00:36:00] and you know what? It's, uh, it's still a new topic, you know, for the industry. I think, uh, there is everything to do, everything to build, but there is still some cases, some malicious cases out there.

This, this is the beginning, but it's, it's coming. So one of them, for example actually I have multiple examples. So, uh, we talk about, uh, straight reports published by OpenAI and, uh, Phil and, uh, Google, Gemini, and so on. And Microsoft as well. So they release piece of information about how threat actor are using GNI, but it's, uh, it's one of the case.

Another one, which is maybe more practical. Uh, there is, you know, agent or AI plugin for multiple tools. Uh, I think it was, if I remember correctly, it was last year. Uh, slack has a chatbot and this chatbot was actually, um, connected to some Slack channels. Slack. Slack and I'm sure, yeah. Slack. Yeah. And, uh, and this chat chat bot, uh, has a flow.

And uh, if you interact with it, you could link one of the, the DM messages. Exactly right. Yeah. [00:37:00] You could read them. Yes, exactly. Yep. Yep. I remember this. You remember this. Yeah, I do remember this. Yeah. So that's one of the example. Another one that I have in mind, uh, which is maybe more concern I think it was a research released by, I don't remember the company, but what they found, they found that an, an attacker threat actors powered an AI system on the underground to, uh, produce, uh, malicious information.

And, and I think it was citizen information. So something nasty. And, um, and what they did, they actually stole an API key to OpenAI or Azure OpenAI that a legitimate organization was using. Yep.

Caleb Sima: Were using, yep. Yep.

Thomas Roccia: So they used that, that, uh, that API to power something very very bad on the underground.

And that's one of the example, like if you have your IPI key, which leak in the, in the underground. Then, uh, you will have, uh, you will need to have a way to see what kind of prompt are coming, you know, because this one was, oh yeah, this, this

Caleb Sima: is a theft of wallet problem. Yeah, yeah, exactly. Because

Thomas Roccia: This API key was actually using to generate like sexual content, [00:38:00] uh, from a made company.

Yeah. So it's a problem if you, if you are an organization, especially for the governing.

Ashish Rajan: Okay. You have your compliance rules and all that as well. Absolutely, yes. Well, I

Caleb Sima: mean, the, the thing is like, it, if you're Open AI, philanthropic or whatever, they, they might look at this and be like, Hey, what the hell is going?

We're cutting your, your account, right? Yeah. Or because and the other thing is it really is more theft of wallet problem. Like it is like giving away. An aw WS key and then they take it and then start using your infra or crypto mining. Yeah. Crypto mining. Yes. Doing things. Yes. And this is similar, and by the way, like I think this is, this has been a problem, uh, even in the early days of ai mm-hmm.

Is where the keys are getting leaked and they're just taking it and then now using, uh, AI for free based off of those keys. So like a secret management should still be paid tight.

Thomas Roccia: Yeah. Yeah. Actually I have another example. Yeah. Which was released, uh, two weeks ago, I think. Uh, the UA Ukraine, uh, they release, uh, an analysis of a malware that was using an API connection to hugging face to generate malicious command to, uh, into the infected machine.

So the malware was [00:39:00] running.

Ashish Rajan: Yeah.

Thomas Roccia: And from the execution, it was doing some LLM request to generate command to exfiltrate information and get some information about the infected machine itself. And the point was actually embedded into the sample itself.

Caleb Sima: You know, this, this brings up sort of like some really interesting topics around that last one.

Mm-hmm. Which is, you know, we normally think of, obviously attackers are using this to do better OSN or better spear phishing or mm-hmm. Whatever this is. But like this aspect of now being able to kind of have much, much smarter malware mm-hmm. Um, is a really interesting one because if it does make these calls, which by the way, in every organization these calls are gonna be made.

Yeah. Right. Yeah, that's true. Um, and this malware does come up and then it can use the information in the system call AI to make smarter decisions or even create custom on the fly based on the system it's running or the environment it's in. Based off of this model usage is really fascinating to think [00:40:00] about.

Yeah. Which actually would go back a little bit towards your. Open source project. Right. Which is, is your existing, if you're an enterprise and you, you have AI calls, maybe someone's making malicious Exactly. AI calls. Yes. To go create this sort of smart malware. Right, exactly. Yeah.

Thomas Roccia: Yeah. So that's something you want to monitor for sure.

Yeah. Yeah.

Ashish Rajan: Which you say, because a lot of people also talk about multi AI agent mm-hmm. Architecture as well. Yes. We spoke about MCP, we spoke about TTPs as well. How far are we, and obviously you were using agents in your discovery for crypto mining as well. How far do you, or maybe is there a use case for multi AI agents in this

Thomas Roccia: there?

Yeah. Multi AI agent is, um, I think, uh, there is a use case for sure, but it's not easy to build a reliable system with multi-agent because by definition an AI is an, uh, an LLM is non-deterministic, so you have to put like a lot of different, uh, uh, guard wells and way to validate the output. When you have a multi-agent, you have agent that will react and act on [00:41:00] different task.

And the agent will go, if you, if we talk about autonomous agent, then the agent will go, can go crazy because it'll try to, you know, uh, do some execution of multiple tools for the analysis and so on. And, um, I think right now we are a little bit, uh, we are not there yet, but we are, I think it's coming.

Like one of the, uh, a great example is uh, cloud code, cloud code is actually an agent for coding. And then you are, they released recently, like things, uh, last year, last week or something like that. You have the subagent, so you can create sub agent. You've been able to do it for a bit. Yeah. Right.

Caleb Sima: Yeah.

Thomas Roccia: So subagent and, uh, Claude and Subagent is an example of using a multi-agent system and which is interesting is from my testing what I did. 'cause you can leverage cloud code with the SDK, so that means you can programmatically use cloud code to generate some code on the fly. So one of the example that I did, I passed, um.

A piece of code, which was obfuscated. And I ask, uh, Claude, [00:42:00] I craft a Claude to be a malware analyst. And Claude is able to understand, okay, so I have a piece of code to analyze. It seems to be malicious, it's obfuscating. So to understand what kind of code is it, I need to first de obfuscate that. So if it's something simple, it can do it, uh, through a common line.

But if it's something a little bit more complex, it can write a piece of code to de obfuscate the script and send it back to the, to the agent. And then you can iterate like that on, on the data. So that's an example, right? And uh, we, we are still at the, the beginning of multi-agent, I think, but there is multiple way to ground the data to make sure that everything is working.

One last thing is having a human in the loop in the process of your multi-agent. Yeah. So you validate, and this is exactly how our. Cloud code is working when he wants to create, uh, to execute something it first asks the user to validate or to decline the request. So human in the loop might be the solution, the field at the stage

Ashish Rajan: where we are at.

And to your [00:43:00] point, the more agents we add, the more complication into your architecture as well. Yes. And the more, I guess it can go in any direction. It syncs. I think. Uh, we did a whole white coding episode where the ai, the code that was being produced was basically trying to just. Hide the fact that it made a mistake and it's kept trying to, well, it's, it's

Caleb Sima: very common in, in vibe code that this happens with tests where, uh, you'll have a test case and be like, okay, we have to make sure that, hey, we cannot go to the next phase unless we pass these unit tests and these integration tests and the integration tests will fail.

And after a certain amount of integrations, claw will just will say, well, screw it. I'll just modify the integration test. Right, yeah. To mock it to then make it look like it, and so that it passes so that we go to the next, next, next stage and next after that. No,

Ashish Rajan: I, oh, it's an interesting one because I also feel like the stage where LLM is today, I think there's a big gap in people talking about reasoning as to where it can be autonomous. I think one of the things that people fear at the moment and still talk about is the fact that, hey, it can make [00:44:00] automatic calls itself. And to your point, even though people are putting guardrails, like asking the human before they execute mm-hmm.

Are you, have you seen any examples where people, it has gone autonomous, quote unquote autonomous as the agent AI people are talking about, uh, in terms of, uh, attacking something or threats? Is, is that Yeah. Have you seen, you, have you seen anything

Caleb Sima: impressive yet that really is like, okay, this thing is like we, it is like this is alive, for lack of better word, which by the way, we don't expect.

A good answer here. Like we're just, we're

Ashish Rajan: hoping

Caleb Sima: Right. We're we're fishing

Ashish Rajan: basically. Like is there a good example there?

Thomas Roccia: That's a good question. Uh, in term of threat intelligence and, you know, cybersecurity, uh, it's hard to say because, uh, I think we are still at the beginning and there is multiple element to put in place to have something which is reliable and impressive, as you said.

Ashish Rajan: Yeah.

Thomas Roccia: So I think we are, we, we are closer, very close to have something which is good and impressive but uh, there is still some work to do.

Caleb Sima: Yeah. So if I were to abstract, pull back just a little bit mm-hmm. And we've talked a lot about your open [00:45:00] source tools, a lot of the presentations you're doing.

Mm-hmm. Um, is this related or how is this related, sort of what the work you're doing at Microsoft and ai?

Thomas Roccia: Uh uh, no, not really because it's an open source project, so I did it on my, uh, on my spare time. Uh, but on Microsoft, yes, I'm working on, uh, integrating AI for. You know, um, investigating different kind of cyber attack or incident response and Yeah.

And doing also some malware analysis. Yes.

Caleb Sima: Yeah. Can you talk more about that? Is that something that you mm-hmm. You can also dig into. Mm-hmm. Tell us a little bit more about how, how you're using, uh, AI in Microsoft Yes. And where it's working and where it's not.

Thomas Roccia: So we have a blog, which is, uh, coming tomorrow, I believe, uh, about a project we've been working on for malware analysis.

So this is an autonomous system where you feed a malware to the system and the system is able to. Analyze the samples and give you a report to the user. So it's still, um, a work in progress, but it's very interesting and very, very, very cool to see that. And some, uh, other thing I think, I'm not sure if you heard about it, but there is some [00:46:00] model context protocol server for reverse engineering connected to Ida.

Ida Yeah. Or Guidewire, for example.

Ashish Rajan: Yeah. Yeah.

Thomas Roccia: And this kind of, uh, of servers are quite impressive because they can do the river engineering for you and give you some information. It's not perfect. It's not a silver bullet.

Ashish Rajan: Yeah.

Thomas Roccia: But, uh, it's, uh, it's still impressing. And if you have that, which is already, because the river engineering is the.

Time consuming and complex process because it, it takes time to analyze and understand a sample, but if you can automate some part of the analysis, the rivers engineering and understanding the, the code and so on, that's maybe something which is uh, which could be, uh, you know, relevant in the future.

Caleb Sima: It can make.

And what's really kind of amazing me is like way back in the day when I first got into this, like I was a reverse engineer, right? Right. And like the amount of time and effort it takes to understand how to do this is like, that's why there are people who are just pheno and it takes years exactly to learn.

But now. With ai, you can just be a newbie, quote unquote, come in and probably start really [00:47:00] reversing things. Yes. Quite easily, right? Yeah. You can find over like overflows and you can find these issues probably fairly quickly as a, without understanding assembly or how to think about these things. Like it's, it'll just figure it out.

I have, I have a

Ashish Rajan: confession on this as well. 'cause I think, uh, I tried doing buffer overflow in the beginning of my career and shout out to Steve. Mm-hmm. It was, I think he is a bug bounty hunter now. Tried really hard to explain to him. I was like, man, I just do not give, yeah. I, I have no idea what is going on here.

I need to, I basically took me half an hour. I'm like, okay, I'm gonna give up. This is not for me. And actually, to your point, uh, uh, now I feel there is actually a remote possibility that, I mean, I feel threat intelligence as a field requires you to have some level of experience to begin with. Do you find that the leap someone today would have to take from a.

I'm a red teamer, a bug bouncy hunter or whoever to now, Hey, I can probably heard Thomas. Really, this is possibility here is a leap too far still for someone from within another part of cybersecurity to become a threat intel in their spare time [00:48:00] and find bugs and new, new kind of things, I guess is that, you mean

Thomas Roccia: with the ai?

Yeah,

Ashish Rajan: yeah, yeah, yeah, yeah. Actually there's even a

Caleb Sima: bigger question, which is, is this a good or bad thing? Well,

Thomas Roccia: yeah,

Caleb Sima: gr this is, this is a drinking discussion. No, it's, it's actually

Thomas Roccia: very relevant because, uh, it's a really, um. So my belief is that the, the society, the industry is changing. Thanks. Or because of ai, depending of how you see the, the things, I think, uh, the, the process we used to do, the workflow we had is currently changing and evolving.

Thanks to AI today, you can do the work much more faster than before.

Ashish Rajan: Yeah.

Thomas Roccia: And I think there is something which is important for the people in the industry to understand is that there is. 'cause we've been talking about, uh, AI and machine learning for a while before LLMs. Of course. Yeah. And it was a hype for many years because the marketing, uh, you know, the [00:49:00] comments and and so on was all about marketing.

Speech was all about AI will solve the, the industry.

Caleb Sima: What do you mean? Like today?

Thomas Roccia: No, I mean before. Before and also before and like today. But today, yeah, it's kind of the same one. It's kind of the same one. AI

Caleb Sima: always goes through these, like, there's like, I think there's, like in the seventies, there's like one, there's ones in the nineties I think, or something like that.

Ashish Rajan: Like we, I think one episode we spoke about there were five cycles already. Yeah, yeah. There was a bunch of cycles already. The fifth or sixth cycle of ai by the way you go. Sorry. You finish. Let finish. But I think it's

Thomas Roccia: different for Hal LM because there is something which is very advanced compared to machine learning.

We used before, even if some, in some cases machine learning wa was, uh, useful. And so for the people in the industry, I think it's important today. Do not focus only on the base skills that we use to have, such as reverse engineering understanding, uh, a vulnerability exploit or doing some pivoting for threat intelligence.

Right now, you can speed up your process with AI and you have to understand how you can incorporate AI in your workflow, but [00:50:00] not using charge GPT and say, okay, give me a, a joke, or write me a joke or that, because most of the people in the industry were skeptical for a long time because they say, oh, AI is not relevant, but it's just because they don't know exactly how to, you know, build the system and AI system.

Mm-hmm. Relevant piece of information. RUG is in a, is one of them, uh, grounding. Grounding the data, having human in the loop. There is multiple way to improve your AI system, and what is important today is to see how you can connect the dots for your core expertise. Whether you have a pen tester, an expert researcher, a reverser, or threat intelligence analyst, understanding the basic and how you can build it, practic.

Yeah. That's awesome. Awesome.

Caleb Sima: Yeah, I mean, okay, so Microsoft, you're releasing this blog that's, uh, gonna help the reverse engineers sort of on this malware side. What else, uh, are you hitting up in Microsoft and how are they, how are they working with ai? So,

Thomas Roccia: so right now, uh, we've been working on helping, uh, users and analysts to investigate [00:51:00] a case.

So you have, some data and then an analyst will have some help, like kind of a, of a copilot, security copilot that will help you understand what kind of cases, uh, what kind of cases it is.

Caleb Sima: This is like soc soc autonomy assisting soc analyst

Thomas Roccia: responding to an incident. And the idea is really to, it's, I think in the industry it's called a vibe investigating.

I like that though. We should just

Caleb Sima: add vibe to the front of everything. Yeah, exactly. They have started doing it. It's like market aim, our thing to vibe. Cyber AI Podcast. Podcast. That's a good

Ashish Rajan: idea.

Caleb Sima: Let us know

Ashish Rajan: in

Caleb Sima: the common section. If you prefer the Vibe AI podcast, let us know. Okay.

Thomas Roccia: Uh, go, go first. Sorry.

So, so yeah, that's the idea, like helping, uh, people. So, you know, getting on track on the, on the, the case and understanding what kind of, uh, incident there is. Is it malware execution? Is it, uh, privilege escalation? There is [00:52:00] some evasion mechanism. Yeah. All that stuff to help the analyst and that will reduce the gap.

The gap of, you know, investigating the data. Yeah. And that's all about ai. Yeah. AI is actually reducing the gap, uh, for the, you know, the entry avail of the people. Yeah. And you can, you can speed up your learning phase also because you have the data right. At your hands.

Caleb Sima: Yeah, yeah, yeah. Publicly available.

Thomas Roccia: Yeah.

Caleb Sima: Yeah. It's about taking that, that crt of junk of manual work that you had to take from one resource to another and synthesize it and understand it and make a decision to come to something, that actually an LM is very good at being able to compress that. Right,

Thomas Roccia: right. It's really good. And that's something that I wanted to, to add as well.

Most of the. At least for, for, for what I see, what I saw most of the security tool today are using. LLM summarization, which is good, which is, you know, getting information from some data and summarizing this data for the users. I think the next step will be really to have something which is [00:53:00] more intelligent with ai, with multi-agent, uh, with agents, with multi-agent, uh, system.

And I think that is the next step, how you can automate everything connecting to different kind of tool. Because LM summarization was the first step. Yeah. And then the next step will be more.

Caleb Sima: But I also think there's a, there's a, there's a, there's a very particular reason for that. And the reason is that, you know, everyone right now is.

Needing to call themselves an AI company mm-hmm. And an AI product.

Ashish Rajan: Yeah.

Caleb Sima: And so what they do is exactly, they, they bolt on ai and the easiest way to bolt on AI is summarization. Exactly, yes. Now, now what you have is you have companies who are building sort of, you know, AI core companies mm-hmm. Which are, they are building their product and their technology.

That's right. With AI at center, they are now doing the more smarter things, but th they, this is still infancy in, in those stages. Again, it kind of reminds me of cloud days when Yes, exactly. Exactly. Everyone's a cloud product and they're like, yes. Oh, we, we have something you, you can launch an EC2 and we connect to it.

And then they're, then the companies came out that [00:54:00] truly built their architecture around cloud and made use of its abilities. Yeah. And I, you know, I think that we are still in the early phases of seeing. What AI core companies can do. Right? Yeah. When you build your business around ai mm-hmm. Yeah. And the way you deliver a product around ai, you're gonna see some pretty dramatic, dramatic changes, I think, right?

Yeah.

Thomas Roccia: Yeah. Awesome. That's really cool. And question for you, did you experimented with, uh, you know, AI agent or, uh, what, what kind of work did you do? Because you seem very, I know you are running the AI security podcast, so you are already familiar with the ecosystem and everything, but what kind of experimentation did you do already?

Caleb Sima: You go first. No, you go first. Uh, well, with AI agents specifically. Oh, yeah. Yeah. I mean, I think, you know, me, I just manually personally myself started coding things and, and I'll, I'll tell you, it had nothing to do with enterprise stuff. All of the stuff that I mainly coded all had things to do with like more ea personal task organization stuff.

Like, I'll give you a great example. And I, this was not an agent, I don't. [00:55:00] Actually, you know what, with terms today, I'm gonna just call everything an agent. So yes. But like I'll give you a great example of just something stupid, right? Which was, Hey, you know, one of the things is I have lots of attachments, lots of documents and things I need to keep track of.

But building an organizational system for that is fairly complicated, right? Like if you really want a way to understand where your data is so you can quickly reference it, like it's kind of hard and it's manual in its task, right? So my first experimentations around AI was building a system that would take the document.

Par the document, understand the contents of it automatically both tag what is relevant, right. Find and categor automatically categorize it within the list and then move the file to the appropriate folder system structure, right, of where it was categorized. So for example, if you get a document in, it's a, is it a finance document?

But yeah, out of a finance document, is it a tax document? Mm-hmm. And then for what year is it for? And then for what? For me, I have also companies, what company versus myself individually. Right. And then what was it referenced [00:56:00] by? And so you'd get emails for these things. So what I wrote a thing forward would take the email, take the document.

Parse it, the email, the document, automatically understand the categories of which it is, and then categorize it, autotag it, move it to the folder structure done.

Thomas Roccia: That's pretty cool,

Caleb Sima: right? And so then every email or attachment, so now what's gonna happen is people are gonna prompt inject me. I know now I know Caleb has a system that parses email attachments.

Here's the, the next email coming outta the, but like, you know, like that's the kind of, you know, me messing around for like personal things. Like that is like the kind of thing, like, I haven't gone to the sense of going through and saying, let me actually build cool security tools right. Yet out of this.

Yeah. But I think it's mostly because just based off time, right? Yeah, of course. It's been hard for me as well. It's already

Thomas Roccia: amazing. Like, I mean, it's uh, it's potentially like far away from most of the people, like doing new automation for your personal stuff. It's, it's really cool.

Caleb Sima: Yeah. Yeah. And it's like, that's what I, and what's really also interesting is I started that [00:57:00] project when I think, you know, LMS were first starting to come out and I've continued sort of iterating on that project as LMS have gotten better and better.

And it's amazing. Like I've rewritten this thing four or five times and how much faster and easier and better the rewrite gets. Uh, in what I've been doing. And by the way, I have another one that I've also done around contacts and Oh yeah. And address contacts in your ad, in your address book, uh, around the ability to, by the way, address, book, organization and search is terrible in every system.

Okay.

Ashish Rajan: Yeah.

Caleb Sima: Right. And so like I wrote this command line thing that did this, and both on both of these, I've rewritten them in LLMs and it's amazing how much faster, how much better, how much easier it's gotten to be able to do this stuff. That's pretty cool. Right.

Ashish Rajan: Interesting. Well, I kind of for, uh. I knew this before, 'cause we spoke about this before in another episode.

I think for me personally, I don't think I've built a security tool yet. I think what I have been going down the path of, because with Tech ride.io we've been doing a lot of advisory, right? And I'm trying to build [00:58:00] a, at least in my mind, I've, I've got this notion template going on. 'cause Notion has an AI Yes.

Moment as well. What this almost like a database for my drag database, for lack of a better word, going on with what the current AI architecture being used by most organization is. So my hope is at least, and I did to your point about the picture that you said before, the p and g file. Mm-hmm. A lot of that is just pictures people have shared with me and go, Hey man, this is where we started.

This is where we are. Right. And I'm like, what's the difference? So I think I've been using a lot to a lot of it to pass data to help people quicker. And I think it's become my quote unquote, I have a RAG that I maintain, but I'm not building a security tool. Now that I've had this conversation, I feel I should revisit the buffer overflow.

Yeah, absolutely. You should. Like the last time I opened a pro, I was like, oh my God, all you see is zero x, zero x. I think I even got, for a hot minute, I changed my name with that zero X thing on Twitter. I'm like, like, how do you explain it? You no bite. Yeah. And I'm like, how do you, I was like, I did that only for 24 hours and I came back to like Raja after that.

I'm like, this world is not for me. So I feel like I maybe should explore that [00:59:00] world now that I have spoken to you about this. You should. I feel there is a poss because I feel

Caleb Sima: GI Gira, right? It giras free, right? Yeah, giras free. Yeah. I

Ashish Rajan: almost feel like, oh wait, I, I think I always looked at research as a space.

I would love to do that when I have some spare time. So a lot of the current focus that I've have had is primarily either in the architecture space or the engineering space. I am doing a lot of Kubernetes engineering. So what does that look like? Terraform building? Um, we do some training with them as well.

So that's where basically where we have been using it. I don't think they've gone to a point where I would love to build a security tool. As I feel to your, to your point about ideas that now ips Yeah, yeah. Now I'm going with that, with like, I'm making a note of all what should I build? Yeah. But, but there's also the fact that.

The more you talk about it, the more you realize the, hopefully you can edit that out. Yeah. On that note, we are talking about basically end this conversation. No. I, I definitely find the accuracy piece is harder and harder as the data. I think you kind of mentioned this, the more data you collect now, I think the amount of information that I have [01:00:00] after a point, it's not practically possible for me to go back and check, was this actually what the architecture diagram was or was this actually what the code was?

That's the part that I'm currently struggling with. I haven't figured out a way around it yet, but that's where I've landed with the AI piece at the moment, where I'm like, oh, okay, this kind of. My, it has helped me speed up a lot of things, which have taken me a long time. Come up with frameworks. I released like a seven layer framework thanks to that because I could see the architecture, I'm like, oh, this is a, there's a clear pattern here, and I didn't have to use AI for it.

I just had to get that pattern out.

Caleb Sima: Use Gemini two five for your PRD and your architecture, and then use cloud code to to code to write your code card. You,

Ashish Rajan: you guys are totally telling me intro as well, but on that note, we are on the episode and the end of the episode as well. Where can people find all this information that you're releasing in terms of open source code, your talks and training and all of that as well?

Where can people find all that information?

Thomas Roccia: On the website for Nova, which is Nova Hunting ai. So there is everything, the code, the documentation, and uh, you can start from here and from there. And then, [01:01:00] uh, I'm publishing quite often on social network, on LinkedIn and Twitter mainly. So you can find me also there.

And, uh, and on my blog, uh, which is blog, uh, do security break.io. This is my personal blog.

Ashish Rajan: Yeah.

Thomas Roccia: And, uh, and yeah, this is where I'm, I'm pushing, uh, my spare research and, uh, stuff for the industry. So yeah, I'm trying to, I'm trying to, to lead the way, uh, in AI and threat intelligence and cybersecurity because.

I really think there is a shift and, and I, I'm hoping to, to help the people and the analyst across the world. Yeah, man. That's why, yeah, I mean definitely doing that,

Ashish Rajan: so thank you for sharing that research with us as well. Thank you very much. Thanks for doing the

Thomas Roccia: show. Thanks. Thanks again for the invitation.

No problem.

Ashish Rajan: Well, thanks everyone for watching. We'll see you next time. Thank you for watching all listening to that episode of AI Security Podcast. This was brought to you by Tech riot.io. If you want to hear or watch more episodes of AI security, check that out on ai security podcast.com. And in case you're interested in learning more about cloud security, you should check out a sister podcast called Cloud Security Podcast, which is available on [01:02:00] www.cloudsecuritypodcast.tv

Thank you for tuning in and I'll see you in the next episode. Peace.

‍

No items found.

How to Build Your Own AI Chief of Staff with Claude Code

AI Security 2026 Predictions: The "Zombie Tool" Crisis & The Rise of AI Platforms

Why AI Agents Fail in Production: Governance, Trust & The "Undo" Button

AI Security 2025 Wrap: 9 Predictions Hit & The AI Bubble Burst of 2026

AI Paywall for Browsers & The End of the Open Web?

How to Build Your Own AI Chief of Staff with Claude Code

AI Security 2026 Predictions: The "Zombie Tool" Crisis & The Rise of AI Platforms

Why AI Agents Fail in Production: Governance, Trust & The "Undo" Button

AI Security 2025 Wrap: 9 Predictions Hit & The AI Bubble Burst of 2026

AI Paywall for Browsers & The End of the Open Web?

Build vs. Buy in AI Security: Why Internal Prototypes Fail & The Future of CodeMender

Inside the 29.5 Million DARPA AI Cyber Challenge: How Autonomous Agents Find & Patch Vulns

Anthropic's AI Threat Report: Real Attacks, Simulated Competence & The Future of Defense

The Future of AI Security is Scaffolding, Agents & The Browser

A CISO's Blueprint for AI Security (From ML to GenAI)

Gen AI Threat Modeling vs. AI-Powered Defense: A Debate with Canva & Anthropic

Vibe Coding for CISOs: Managing Risk & Opportunity in AI Development

Vibe Coding, Slopsquatting, and the Future of AI in Software Development with Guy Podjarny

Is Your Browser the Biggest AI Security Risk?

AI in Cybersecurity: Phil Venables (Formerly Google Cloud CISO) on Agentic AI & CISO Strategy

AI Red Teaming & Securing Enterprise AI with Leonard Tang of Haize Labs

RSA Conference 2025 Recap: Agentic AI Hype, MCP Risks & Cybersecurity's Future

MCP vs A2A Explained: AI Agent Communication Protocols & Security Risks

How to Hack AI Applications: Real-World Bug Bounty Insights

The Future of Digital Identity: Fighting AI Deepfakes & Identity Fraud