How to Build Your Own AI Chief of Staff with Claude Code

Feb 11, 2026


What if you could automate your entire work life with a personal AI Chief of Staff? In this episode, Caleb Sima reveals "Pepper," his custom-built AI agent that manages emails, schedules meetings, and even hires other AI experts to solve problems for him. Using Claude Code and a "vibe coding" approach, Caleb built a multi-agent system over a single holiday weekend, without writing a single line of Rust code himself. We discuss how he used this same method to build a black-box testing agent that auto-files bugs on GitHub and even designed the branding for his venture fund, White Rabbit. We explore why "intelligence is becoming a commodity," and how you can survive by becoming an architect of AI agents rather than just a worker.

Questions asked:
00:00 Introduction
03:20 Meet "Pepper": Caleb's AI Chief of Staff
05:40 How Pepper Dynamically Hires "Expert" Agents
07:30 Pepper Builds its Own Tools (MCP Servers)
11:50 Do You Need to Be a Coder to Do This?
12:50 Using "Claude Superpowers" to Orchestrate Agents
16:50 Automating a Venture Fund: Branding White Rabbit with AI
20:50 Building a "Black Box" Testing Agent in Rust (Without Knowing Rust)
28:50 The Developer Who Went Skiing While AI Did His Job
32:20 The Coming "App Sprawl" Crisis in Enterprise Security
36:00 Security Risks: Managing Shared Memory & Context
41:20 The Future of Work: Is Intelligence Becoming a Commodity?
44:50 Why Plumbers are Safe from AI

Caleb Sima: [00:00:00] I really want a chief of staff. Iron Man's Pepper Potts handled everything, so I created Pepper.

Ashish Rajan: He wrote a bunch of agents to fill out his time sheet. He's up somewhere skiing while his Claude Code submitted code for pull requests, with a manager thinking that the person's actually working.

Caleb Sima: I should vibe code a black box agent tester.

I don't know anything about Rust. Yeah, and like

Ashish Rajan: it

Caleb Sima: works.

Ashish Rajan: There'll be a lot of infrastructure creeping into AppSec. You are almost making the AppSec person multi-skill, creating memory infrastructure behind it.

Caleb Sima: Intelligence is what AI has now made a commodity. That means you spend your whole life learning to process to be intelligent, and when that is now a commodity, that's just gonna eliminate things.

Ashish Rajan: Yeah, at one point in time, the most respected job you could get was chariot rider. That job doesn't even exist. It would transform into something.

Most mornings, if you're like me, you probably open up your phone or laptop and you go through the news for the industry. [00:01:00] You go through news of what's happening in your organization, what things you should be looking at. Most of the time, this could be done by someone else, and usually it is someone called a chief of staff, which, fortunately or unfortunately, used to be accessible mostly to chief executives of companies. But with AI, what we've been finding is that you can actually build your own chief of staff. In this episode, Caleb was sharing what he did as his holiday experiment to build a chief of staff for himself.

By the way, if you want to make a version for yourself, comment "Pepper" on this video, wherever you're watching or listening to this. If we hit a hundred comments, we would be open sourcing Pepper to the world as well, which is the chief of staff that Caleb ended up building, all vibe coded. We spoke about the changes that are coming in 2026 and how you can leverage the Claude Code version that is there today, not just as something you talk to in an interface for making your angry email not so angry anymore, but also to be a little more advanced, even as someone who doesn't come from a technical background.

You could totally be someone who just uses the regular version of AI just to talk [00:02:00] to it and use that as a starting point to build your own chief of staff. In this episode, we talk about some of the things that it can do and how it would work as a multi-agent system. And if you are thinking of building an AI startup, there are definitely a few ideas here that I would encourage you to steal and make your own today.

I hope you enjoy this episode of AI Security Podcast. As always, if you have been listening to or watching an AI Security Podcast episode for the second or third time and have been finding it valuable, take a quick second to hit that subscribe or follow button on whichever podcast platform, like LinkedIn, YouTube, Apple, or Spotify, you may be listening or watching this on. It's a free thing for you, but it means a lot as support for the work that we do here.

So thank you so much for taking that one second to do that. Enjoy this episode and I look forward to hearing all the AI capabilities that you build yourself after watching this episode. Talk to you soon. Peace.

Hello and welcome to another episode of AI Security Podcast. This is probably one of the first few episodes of 2026.

And if you are like us, you're probably wondering what you can do with AI [00:03:00] in 2026. So Caleb figured out a chief of staff for himself, and that's what we're gonna talk about today. Do we have a name for the chief of staff that you've made over the long weekend or the holiday period, man? This is, by the way, an AI, not an actual human chief of staff.

Caleb Sima: Yeah, okay. So what I did over the holidays is basically just code and do really fun stuff. I actually completed several tools that I really needed and wanted. And then I started with saying, well, I wanna understand in a lot more detail how, in my mind it was like a swarm of agents, right?

Yeah. What does it take to orchestrate or command hundreds of different agents that accomplish different things? And then that got me into a problem space I already have, which is chief of staff. A chief of staff basically manages tons of different things going on simultaneously so that I don't have to figure this out. [00:04:00]

Yeah. And so I was like, okay, you know what? I really want a chief of staff. And the person who I think of in my mind as the perfect chief of staff is Iron Man's Pepper Potts. Right? Like, if you remember, she handled everything, like, it doesn't matter. Yeah. Iron Man just focused on cool tech nerdy stuff.

Yeah. And Pepper just handled everything else. And so I needed that. So I created Pepper. Basically, Pepper is your orchestrator or manager agent, and it can then control different types of expert agents. But the thing is, you know, when you think about what a chief of staff does:

The chief of staff needs to have access to tools, so access to your calendar, email, all that stuff is easy, right? But you also need a chief of staff to know experts, to get in contact with the right people. So for example, if I'm talking to Pepper and I'm like, man, my foot hurts when I'm running on the treadmill at this [00:05:00] specific thing.

Like, I need to figure out what's going on with this. She'll go, okay, let me go figure out who the world's foremost expert is on feet and running. And what she'll do is she'll notice that there doesn't exist an agent expert for this. She will dynamically create a new agent that is an expert in foot health and running, create the prompt, automatically provide it access to the data that it needs about my health, and provide access to the tools it needs. She auto-generates all of that.

Then Pepper will then have a conversation with this doctor and say, Hey, my boss, Caleb has a problem, and describe the issue. The doctor will then say. Hey, you know, I need to know this information to be able to verify more, to do a diagnosis. And Pepper will then say, oh, that information I already have because I have memory and knowledge.

Yeah. Or I'll just raise it up and ask Caleb himself, and then Pepper will come [00:06:00] up and say, Caleb, I've talked to the doctor. Here's what the doctor needs to know. Do you want to communicate with him directly or should I just pass the information on and I'll say, oh, just pass the information on. Yeah, here's the answers.

Pepper will then go back, continue the entire conversation, come up with the doctor's answers, interrupt me and say, Hey, here's what the doctor said about the scenario. And I can say, oh, okay. That's interesting. Let me talk to the doctor directly. Then she'll connect to me and then I'll just have a one-on-one conversation with the doctor.

Get that. And by the way, the cool part about all of this is that in any interaction I'm having with Pepper or with these agents, memory is being stored. Yeah. So I don't have to say "save this"; it'll remember automatically. I have a watcher: Pepper watches and says, oh, I've stated my age. I've stated facts.

I've stated locations. These are all things that should be remembered. And so it will save it in memory and it will reference it every time. If it needs it again, [00:07:00] Pepper can just reference it. And so now Pepper can just dynamically create and then manage a bunch of different agents as she goes around and does her stuff.
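The create-on-demand and remember-automatically behavior described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Pepper is actually built: the `Orchestrator` and `Expert` names, the regex-based watcher, and the sample age are all made up for the example, and a real watcher would presumably ask the model what is worth remembering.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Expert:
    """A hypothetical expert agent: just a topic plus a generated prompt."""
    topic: str
    prompt: str


@dataclass
class Orchestrator:
    """Toy stand-in for Pepper: reuses experts and records stated facts."""
    experts: dict[str, Expert] = field(default_factory=dict)
    memory: dict[str, str] = field(default_factory=dict)

    def get_expert(self, topic: str) -> Expert:
        # Create the expert only if one does not already exist, then keep it.
        if topic not in self.experts:
            prompt = f"You are a leading expert on {topic}. Known facts: {self.memory}"
            self.experts[topic] = Expert(topic, prompt)
        return self.experts[topic]

    def watch(self, message: str) -> None:
        # Stand-in for the memory watcher. A regex is used here only to keep
        # the sketch self-contained.
        match = re.search(r"\bI am (\d+) years old\b", message)
        if match:
            self.memory["age"] = match.group(1)


pepper = Orchestrator()
pepper.watch("I am 40 years old and my foot hurts on the treadmill.")  # made-up fact
doctor = pepper.get_expert("foot health and running")
same_doctor = pepper.get_expert("foot health and running")  # reused, not recreated
```

The second `get_expert` call returns the saved expert rather than building a new one, which mirrors the "saves those experts once created" idea.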

So I did this one cool thing where I was like, okay, one of the things I need is I have digital mail and so I need to access digital mail. There's no real MCP server, which is what I really need for this digital mail. It does have an API. So I basically told Pepper, hey Pepper, I need to build an MCP server for this thing so that you can have access to my mail.

Pepper goes, I already have a software engineer agent role. She goes and talks to the software engineer agent role, creates a CTO agent role dynamically, and passes it to the CTO. The CTO then works with the software engineer to create an MCP server. And actually I have a Ralph Wiggum agent, which is the thing that just kind of continuously keeps doing it until it's done. They defined the spec and the success criteria.

[00:08:00] And then they just ran the Ralph Wiggum agent and it created the MCP server successfully. And so then it goes and accesses my mail. Pepper now has access to my digital mail. All of this, by the way, I would say my interaction with that whole scenario was probably about five or six messages. And all that got done.
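The keep-going-until-the-criteria-pass idea behind the Ralph Wiggum agent can be sketched as a plain loop. This is a rough illustration, not the actual agent: `run_until_done`, `fake_agent`, and the iteration cap are hypothetical names and choices, and a real attempt would invoke a coding agent and check its output against the spec's tests.

```python
from typing import Callable


def run_until_done(
    attempt: Callable[[int], str],
    meets_criteria: Callable[[str], bool],
    max_iterations: int = 10,
) -> tuple[str, int]:
    """Call `attempt` repeatedly until its result passes `meets_criteria`.

    Returns the accepted result and the number of iterations used.
    Raises RuntimeError if the cap is reached, so the loop cannot run forever.
    """
    for iteration in range(1, max_iterations + 1):
        result = attempt(iteration)
        if meets_criteria(result):
            return result, iteration
    raise RuntimeError(f"criteria not met after {max_iterations} iterations")


# Hypothetical stand-in: pretends the third attempt finally satisfies the spec.
def fake_agent(iteration: int) -> str:
    return "server passes spec" if iteration >= 3 else "server still failing"


result, used = run_until_done(fake_agent, lambda r: "passes" in r)
```

The success check is passed in separately from the attempt, which reflects defining the spec and success criteria up front and only then letting the loop run.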

It was like, so like once you, and by the way, I use Claude Code for everything. So Claude Code is sort of the center of this whole thing.

Ashish Rajan: Yeah,

Caleb Sima: And it just creates agents. Pepper manages the conversations, the memory gets stored appropriately, tools get run appropriately. I've just come away from this thing like, agents and managing agents at scale is getting to reality.

This thing is, like, I don't know if you remember this, but like a year ago, maybe it was, how long have we been doing this podcast?

Ashish Rajan: Two, three years. Yeah. Three years.

Caleb Sima: Yeah. So it has to be at least three years ago that I said the phases that at [00:09:00] some point AI will manage at scale, right?

Where you can manage your production infrastructure for issues and it will auto resolve these issues like. I think we're getting pretty close to this. Like,

Ashish Rajan: yeah,

Caleb Sima: it's really crazy.

Ashish Rajan: I think a few things that I took away from this, and I think it's pretty awesome that what you did. First of all, I think it's a great project.

I think, and I a hundred percent agree, man, developers are already doing this. A lot of developers that I speak to who use AI for coding, some of them have gotten to a point where they basically start the code generation piece later in the night or just before they go to bed. In the morning, it's been running the entire night, creating all these requests.

A pull request comes in, and all the developer has to do is get up, review the pull request, and go: happy, not happy, push through into production or not. And the competition is more around, I think I was watching, I don't know if it was a tweet thread or something, where the competition was: how long can you keep running [00:10:00] an AI agent?

How long or complex a task can you give them? Because, I don't know, with the example that you gave of the foot, or the electronic mail and creating an MCP server, how long did any of that take? I guess one part is the feature component, where, you know, amazing feature, they build all of that.

The other part is the tokens being used to produce this. Everyone who's listening or watching this will be like, oh, how much did he end up spending on this? Or was a Max plan enough just to do this?

Caleb Sima: Oh yeah. Max plan. Well, I mean, so far I have not hit the limits of the Max plan yet. I think my token needs,

Ashish Rajan: Is this the one, isn't there like levels to the Max plan as well?

There's like a 2000 and a 20,000 or something. There's like a large Max, or

Caleb Sima: I don't, there's like a supermax.

Ashish Rajan: Yeah, basically. So you were, you're saying that the max plan was enough,

Caleb Sima: max plan

Ashish Rajan: was enough. And obviously you're kind of like all of us, you're a bit, would you say you're a developer? I guess, 'cause a lot of people would look at this and go, man, I wasn't a developer.

[00:11:00] Maybe Caleb was a developer. Like, I don't look at myself as a developer. I'm a quasi script kiddie: copy, paste code, all of that, and maybe scripting. But what level of skill set should a person have before they even walk this path of vibe coding their way into building a chief of staff for their life, whether you are a principal engineer, architect, whoever?

In my mind when you say all this, I'm already going with: if I'm an architect, all my workflow for, hey, my project requires me to build infrastructure for AWS with AI, whatever, I just give it to Pepper in this case. Or I'm a CISO, I give Pepper my reporting for the month. There are all these thoughts that are coming to my mind. How much of a coding expert do I need to be to get to, like, a stage one of what you're saying?

Caleb Sima: I don't think you need to be any coding expert at all. I don't think you need to even be a technology expert. I think as long as you have Claude or any AI per se,

Ashish Rajan: like a CLI version not the,

Caleb Sima: not

Ashish Rajan: the [00:12:00] user.

Caleb Sima: Like if I were to tell someone who didn't know anything about tech, I do think in this scenario, if you want to make something as advanced as a multi-agent orchestrator or something along those lines, I think that using Claude Code is sort of the way to do this.

Ashish Rajan: Yeah.

Caleb Sima: Um, therefore I would have to ask, you would use AI to ask AI to teach you at least the basics of Claude Code. But like, for example, in order for me to make the chief of staff you know, I, I'm a little bit more on the ground.

So for example, I took. A plugin that was a multi-agent orchestrator that I think does a really good job. Have you ever heard of Claude Superpowers before?

Ashish Rajan: No. I,

Caleb Sima: Yeah, it's an open source project, it's a plugin for Claude that does a really amazing job at building software projects, right?

Like by far. I think it's way better than anything I've used. And so [00:13:00] Superpowers is an agent that controls other agents. And so I just went through the prompts, 'cause all it is is prompts. It's not really source code, it's just prompts. And so I kind of went through to understand a little bit about how it works, and then I just went to Claude Code and I just said, this is what I want.

I want a chief of staff who controls other experts, who creates the experts on demand when they don't exist, and saves those experts once created. I need a shared memory layer and a shared document retrieval area, you know, some vector database of some sort. Yeah. You know, in order to be able to do this.

And then it just walked me through the planning session, back and forth on: okay, what tech stack do we use? What features does it require? What observability, logging and debugging, like all of that. It just freaking holds your hand through the whole thing and then it creates a PRD, right? So then you get this PRD.

Ashish Rajan: Yeah.

Caleb Sima: And you're like, great, alright, let's go [00:14:00] execute it. And it breaks it up into testable chunks, executes each one, and tests to make sure it passes. And then, you know, six hours later, maybe eight hours, I think six to eight hours, you have a Claude plugin called Pepper. And then I just started running it.

And then, oh, I would come up with,

Ashish Rajan: It's a plugin now. It's not an interface you're talking to. It's a plugin.

Caleb Sima: It's a plugin, yeah. So in Claude, I install the plugin and then I just do forward slash pepper, and then I just do my thing.

Ashish Rajan: Interesting. 'Cause I guess this is kinda where, I think a few episodes ago, we spoke about how Jony Ive and Sam Altman have been talking about that always-on piece of AI, I guess whatever the necklace is that they're making, where, yeah,

they're making it so easy that you no longer have to go on a browser, open Claude, and put in the plugin. It's basically always on around you. Maybe it's on your Apple Watch or Apple phone or whatever. It's always listening and just going, hey, [00:15:00] recording a memory of it. And I guess where I'm going with this is that I,

Caleb Sima: yeah, think about it.

Claude Code being a forever persistent session.

Ashish Rajan: Yeah.

Caleb Sima: So Claude Code becomes the sort of command and control that is always running. And so Claude Code in and of itself is an agent that then controls Pepper, that then controls subagents, right? So it controls a subagent, which then controls sub-subagents.

And it's always running. So my integrations, all of these things, are all coming into Claude Code. So Claude Code has been sort of this central interface which then controls everything else. And it's amazing because, you know, it can, in effect, given the right tools and access, do anything: anything on your normal PC, anything on the web. It has that capability and it has the automatic reasoning, and then everything is,

[00:16:00] not everything, but like a large part of everything you do is just prompts.

Ashish Rajan: Yeah.

Caleb Sima: Right. So your expertise is prompts, your decisions are prompts, your actions are prompts. And then it's just access to tools. And only the tools have the code part, right? Which you either download existing ones or, you know, vibe code the ones you need, like I vibe coded ones for my electronic mail.

And then they just have access to it and then everything is routed that way. It's pretty amazing. Like the more I've used Claude Code, the more I see Claude Code is not just for code, right? Yeah. Like I think you can build your central command console for everything out of that.

Ashish Rajan: Yeah. Wow. I think as you say that, in my mind, and maybe this is a startup idea for anyone who might be working on this as well, 'cause in my mind, if I was the CEO of a company, let's just say you're the CEO of your life. You have just hired a chief of staff, which is an AI chief of staff.

Now, you obviously have White Rabbit, your venture [00:17:00] fund, and everything. If this can basically extend onto that as well, the chief of staff is no longer just a personal chief of staff. It's also for your work context as an individual.

Caleb Sima: Yeah, yeah, yeah. Pepper does everything.

So like, yeah, this is a good example. I also vibe coded White Rabbit's messaging, branding, and colors. You know, what does that look like? What are the color themes we need to use? Yeah. What does the website itself need to look like? And by the way, all of this was done through Claude Code.

Uh oh, okay. Similar, yeah. All Claude Code. I basically just, you know, using the same thing, created an expert. Yeah. In the sense of, hey, I want to go through what my branding and messaging for White Rabbit should be, right? And you go through Q&A, it asks you questions, just like any, like, well, what's most important about your branding is it needs to be personal to you.

So [00:18:00] give me your background. So I fed it both my and my partner's background, resume, who we are and what we enjoy, what we are like personally. And it deduced things. It's like, Caleb, you've done a lot in your career, but at your core, you're an engineer. You love hacking, you love coding, you know, that's what you're good at.

And what it basically told me is you're good at finding the problems.

Ashish Rajan: Yeah.

Caleb Sima: Right. And then my partner is really good at being able to create big category winners out of solutions to those problems. Interesting. That's what he's good at. Yeah. And it created all this: it created the messaging, created the branding, and then from there fed it into what are the right colors for you guys, the theme, and then the right kind of representation of the website. And then I took all of these documents.

Ashish Rajan: Yeah,

Caleb Sima: I went to Lovable, and by the way, I think I could automate this [00:19:00] too, and then just dumped it into Lovable, and it just generated the website using the themes, the design, the brand, the messaging, everything. And it was largely 85% there. Wow. And then everything else was some tweaking.

Ashish Rajan: Yeah.

Caleb Sima: Around it. And it was done. So I did this over the holiday. And it looks like, I don't know, Ashish, you've seen it, like it looks professional, right?

Ashish Rajan: Yeah, yeah. A hundred percent. I mean, I actually thought you had a few developers, web developers in the background.

When I looked through it, I'm like, oh, that's pretty good. And I guess it's one of those ones, man. My hope is when people listen to or watch this episode, they get to open their mind to the possibility. Like, you know, I definitely find that in 2025, a lot of the conversation was more around, we don't know what we can do with AI.

Like, what's the use case? A lot of CISOs and CTOs were trying to encourage the use of AI in their organization and say, hey, I want my team to use more AI. [00:20:00] At this point in time, three years down the track, what you've shared is that now you can not just talk to ChatGPT and Claude in a console, in a consumer way; given enough time,

even if you were to take one weekend, you can literally automate a lot more parts of your personal life, professional life, wherever you want to take this. And you mentioned it.

Caleb Sima: Can I give you another example?

Ashish Rajan: A hundred percent. Go for it, man.

Caleb Sima: So here's another example. So I'm building these other tools.

They're like stupid tools. For example, there's no easy way to manage your Google contacts, right? I have a lot of Google contacts, so I vibe coded an AI contact manager for Google Contacts just because I need it. It's a tool, right? But my problem was,

I hated testing. Every time, you know, it would vibe code and finish, you could run tests with Claude Code. But the problem with Claude Code is it loves doing unit tests and mock tests. [00:21:00] And what I want is a real black box tester that goes through like a user, uses the thing, and then says, oh, it works or it doesn't work.

I don't want it to have access to code. I don't want it to do any of that stuff. And my problem is I'm the one doing that, so I'm like, this is dumb. I should vibe code a black box agent tester. So I vibe coded a black box tester. And here's what's amazing about it. It's an AI black box tester, and you don't need to tell it anything about your project.

Nothing. Wow. All you do is you point it at the project and it will learn automatically what it is, what the purpose of the software is, how to test it, and then go through all of the variations. It will first say, oh, I'm gonna test it primarily like a user would: install it, configure it, run through it.

Does it work? Does it not work? Then it will go through use cases, then it will go through edge cases. Then it will go through like brute force performance testing, like [00:22:00] it goes, and then it automatically will look at all the bugs, reconcile them, do analysis on them, determine which ones are really pertinent and at what criticality, and then auto-file the tickets as issues on the GitHub repo.
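The reconcile-and-prioritize step before filing can be sketched as follows. This is a minimal illustration under stated assumptions, not the tester's actual logic: the `Finding` type, the three severity labels, and the sample findings are invented for the example, and duplicates are detected by a simple case-insensitive title match. The output uses `title`, `body`, and `labels` fields, which are the fields a GitHub issue is created with; actually sending the request is left out.

```python
from dataclasses import dataclass

# Assumed severity scale for the sketch; lower number means more severe.
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}


@dataclass(frozen=True)
class Finding:
    phase: str      # e.g. "install", "use case", "edge case", "performance"
    title: str
    severity: str   # one of the SEVERITY_ORDER keys


def reconcile(findings: list[Finding]) -> list[Finding]:
    """Collapse duplicate titles, keeping the most severe report of each."""
    best: dict[str, Finding] = {}
    for f in findings:
        key = f.title.strip().lower()
        current = best.get(key)
        if current is None or SEVERITY_ORDER[f.severity] < SEVERITY_ORDER[current.severity]:
            best[key] = f
    # Most severe first, so the important issues are filed at the top.
    return sorted(best.values(), key=lambda f: SEVERITY_ORDER[f.severity])


def to_issue(f: Finding) -> dict:
    """Build a title/body/labels payload; filing it is deliberately omitted."""
    return {
        "title": f.title,
        "body": f"Found during {f.phase} testing.",
        "labels": ["bug", f.severity],
    }


# Made-up findings from different test phases, including one duplicate.
findings = [
    Finding("install", "Crash on first run", "major"),
    Finding("edge case", "crash on first run", "critical"),
    Finding("use case", "Search ignores last name", "minor"),
]
issues = [to_issue(f) for f in reconcile(findings)]
```

Here the two crash reports collapse into one critical issue, so the repo gets two issues rather than three.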

So I've got this thing called Contact AI, which is the Google contact thing, and in the issues it will auto-create my bugs as one. And then what I've got

Ashish Rajan: Endless, man.

Caleb Sima: And then it'll take, so now I've almost got it. It works, but I've gotta work on the how many iterations.

Ashish Rajan: Yep.

Caleb Sima: You know, Claude Code will pick up that a new issue was filed, take that issue, vibe code the solution, test it, come back and say resolved. And when it comes back and says resolved, the black box tester will get the issue, go and look at it, test to make sure that it works, [00:23:00] and then confirm whether the issue was resolved.

Like this is now all automated. I vibe coded that to do that. Think about this. This is amazing. And I did all of this, by the way, in the weekend. I mean the holiday.
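One way to picture this handoff is as a small state machine in which the coding agent can only claim a fix and only the black box tester can confirm or reject it. The sketch below is an assumption about how such a workflow could be modeled, not a description of the actual tool; the `Status` values, actor names, and `advance` function are all hypothetical.

```python
from enum import Enum


class Status(Enum):
    OPEN = "open"
    FIX_CLAIMED = "fix claimed"
    VERIFIED = "verified"


# Allowed moves: the coder can only claim a fix; only the tester can verify
# it or send the issue back to OPEN.
TRANSITIONS = {
    ("coder", Status.OPEN): {Status.FIX_CLAIMED},
    ("tester", Status.FIX_CLAIMED): {Status.VERIFIED, Status.OPEN},
}


def advance(status: Status, actor: str, target: Status) -> Status:
    """Return the new status, or raise if this actor may not make this move."""
    if target not in TRANSITIONS.get((actor, status), set()):
        raise ValueError(f"{actor} cannot move {status.value} -> {target.value}")
    return target


status = Status.OPEN
status = advance(status, "coder", Status.FIX_CLAIMED)   # coder says "resolved"
status = advance(status, "tester", Status.OPEN)         # retest failed, reopen
status = advance(status, "coder", Status.FIX_CLAIMED)   # second attempt
status = advance(status, "tester", Status.VERIFIED)     # retest passed
```

Keeping verification with a separate actor means the agent that wrote the fix never closes its own issue.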

Ashish Rajan: Yeah. Oh, wow. Wait,

Caleb Sima: so I mean it just by myself. Nobody else. Just me.

Ashish Rajan: Yeah. Well, I will say this as well: for people who are watching and listening, if you want us to do a workshop on this so you can do it for yourself, just comment "Pepper"

wherever you're watching or listening to this, and we'll do a workshop for you on how to do this so you can make a Pepper for yourself. Yeah, but I guess I do wanna call out, there's obviously an enterprise use case that we can take this to as well. I mean, you've mentioned the example where you've used this for your venture fund, White Rabbit.

You've used it for personal purposes. You said you don't have to be a technical person. So I'm gonna reiterate that fact for people who are watching and listening, thinking that, oh, I have to be a techie to begin with, born an engineer who loves hacking. You don't need to do that as long as you know what you're [00:24:00] trying to solve and you're clear about it.

And you can write a PRD. Go back to the vibe coding episode where

Caleb Sima: The black box agent one. Yeah. Written in Rust. I don't know anything about Rust. Never even looked at the code.

Ashish Rajan: But that's what, actually, I don't know, man. Because you know, when I talk about these things where I did not know a language and I coded something, it gets people a bit nervous, 'cause they automatically think, does this mean my job is in jeopardy?

And obviously we are in the beginning of 2026.

Caleb Sima: W Yeah.

Ashish Rajan: How now knowing what you've done.

Yeah.

Caleb Sima: Yeah. I think software engineers' jobs are definitely in jeopardy.

Ashish Rajan: Software engineers.

Caleb Sima: Yeah. Listen we still have a long way to go. It's not like these things are magic, but at the rate of capability and execution and I, yeah.

I, I, look man, I. [00:25:00] I don't know anything about Rust.

Ashish Rajan: Yeah.

Caleb Sima: And I built this black box ai, which

Ashish Rajan: clearly works as well, by the way. It really works.

Caleb Sima: Yeah, it works. And it does a great job. Now listen, I think the, the problem here is, is like I can build something like that that would normally have taken a pretty decent engineering team, a good amount of time to make in Yeah.

In the pre AI world. Right. By any, yeah. Cause and I did that by myself. Screwing around and it took me, you know, four, four to six days to get it into a really good state.

Ashish Rajan: Yeah.

Caleb Sima: Right. And this was sort of part-time, not, you know, I was working on, I, you know, 'cause you're sort of multitasking, a multitasking between different projects, right?

Ashish Rajan: Yeah. You keep letting it run in the background while you do something else.

Caleb Sima: Yeah, that's right. And so I like to think of it as like, you know, when you play online poker, you're doing multi-game poker. And it's really good. Now, I think where the software engineering part becomes problematic, and I wrote a blog post about this, is: you can build these.

And [00:26:00] I actually think in the next six months or a year, the size, the complexity and the refinement of projects that you by yourself can just auto-code is going to, you know, double or triple in capability. Yeah. That is absolutely true. The maintenance and operation of these software things is where it becomes much harder.

Like for example, say I open sourced this project for some reason, and people then start submitting bugs. Now someone has to sit there and maintain: okay, what is this bug? Did it really fix this bug? How do I fix this bug without it creating 14 other bugs, right? And this is the biggest problem with AI.

Yeah. Or changing the functionality of the software. And if I don't know anything about Rust, I don't know what it's doing or what it won't do, minus the fact that I have a test suite, and I need to validate that it shouldn't touch any of these things.

All these [00:27:00] tests need to pass correctly.

Ashish Rajan: Yeah.

Caleb Sima: And so I think getting into the maintenance and operation of software is still, I think, harder with AI, 'cause it doesn't have sort of the managing-operations mindset yet. But that's sort of where Pepper comes in, where I was testing that a little bit at a small scale with a chief of staff.

And if a chief of staff can constantly be monitoring my conversations, determining when to interject and say, let me help you with that, go and run parallel processes to go manage and do things, like every day, you know, Pepper needs to look at my mail, Pepper needs to check my calendar, Pepper needs to do these things,

then you're starting to get into operations and management, and how well will it do? And you know, honestly, I think it's pretty decent. So far it's looking like it's very capable. There's a guy who posted on Hacker News, I don't know, [00:28:00] a couple days ago or something like that, about how he is now using Claude Code to monitor his production infrastructure issues in dashboards and then alerting and investigating potential resolutions.

Yeah, and like it works.

Ashish Rajan: Yeah. Yeah,

Caleb Sima: so

Ashish Rajan: I was talking to developers as well. Funny, I wouldn't even name the developer, 'cause you know how not all countries have forced leave during the holiday period. This person, unfortunately, was the on-call person. A week before the thing started, he wrote, let's just say, a bunch of agents to fill out his time sheet, quote unquote, for when he logs in and logs out.

He's got a laptop at home, by the way. He's up somewhere, I don't know, middle of Europe, skiing or whatever he is doing, while his Claude Code is, I can't remember. Obviously it would jeopardize his job, so I would not call out specifics, but essentially I would say this: he had code being [00:29:00] submitted for pull requests, which he could review in the evening if he wanted to.

Then he would submit that. So, quote unquote, the manager or whoever is thinking that there's actually code being submitted, the person's actually working. And they also had unit tests being created as well. And I find it really fascinating: we are moving towards a world where a company which would just say, hey, no AI, we want the good old-school experienced software developer to come and solve the problems and challenges,

define standards and everything else. I wonder how much of it is what Sam Altman said about how we are entering the era of fast fashion SaaS, where we'll have a lot of these tools. I think it was a tweet from him some time ago, where a lot of these tools will be created, discarded, perhaps created again, and spread across.

Or maybe I may have my own toolkit that I walk around with. I switch jobs, but I walk in with my Claude Code Pepper version and go, oh, what are you guys coding in? Oh, Rust. I have no idea what to do with Rust. Let me just, hey [00:30:00] Pepper, do you mind just, it's like, you know how in The Matrix there was the, what's the, "now I know kung fu" moment? It's like one of those ones. That's in my mind.

That's where I'm going with this.

Caleb Sima: Yeah. You know, let's take my black box testing agent.

Ashish Rajan: Yeah.

Caleb Sima: Um, I rewrote that project three times. First I did it in TypeScript. Then I did it in Go, and then I just said, just for, you know, giggles,

Ashish Rajan: Yeah,

Caleb Sima: let's just do it in Rust.

Ashish Rajan: Yeah.

Caleb Sima: And did it in Rust, and that was the final version.

And so I just kept it. But, like, that was over a period of six days, man.

Ashish Rajan: Yeah.

Caleb Sima: Like,

Ashish Rajan: So you asked it to refactor your code? This is actually a refactoring of code?

Caleb Sima: No. Like restart, like from scratch.

Ashish Rajan: Ah,

Caleb Sima: yeah, completely from scratch.

Ashish Rajan: So obviously we spoke about software engineers' jobs evolving, and there'll be a split between people who adopt AI and not operate AI.

What about security? I mean, obviously, and I do wanna clarify, I think we've mentioned this in a few episodes before: there's obviously a split happening [00:31:00] in companies, where there's the maintenance mode for the non-AI applications that's gonna continue. I'm not saying that; unfortunately, AI or no AI, we still have to do the whole

AI transformation projects for that. That's what a lot of people are going through. We are talking about the newer features and products being built. That's what we are referring to with this, not, I believe, the legacy ones. I mean, mainframe is gonna be mainframe forever, as long as it wants to be. What would that new world mean for security?

Because most of our audience is security. So I'm curious, where do you see that go? Because obviously you're investing in companies as well, where a lot of these roles are being replaced by AI, but then if those components themselves won't exist, where we are using AI for this new world that we are moving towards...

Have you got some thoughts there as well?

Caleb Sima: I mean, you know, I will repeat this: what is going to come, and I think this references a little bit the Sam Altman quote that you mentioned, is that we're gonna [00:32:00] create app sprawl. Right. Yeah. Which is the same as when we had cloud, and everyone and their brother spun up everything they could in cloud and was experimenting, right?

Yeah. And prototyping, and, oh, I can just spin up a server for this. I can spin up an instance for that. I can spin up a Lambda for this. Everyone was spinning up all these services and it became a massive problem, right? Yeah. And then you had to have cleanup and maintenance. Cloud sprawl became a massive problem.

Yeah. Now, uh, attackers today take advantage of that because of all the loose permissions, sprawl, and configurations that are everywhere, and that's how keys, tokens, et cetera, get moved, and that's what happens. We have not had that in the application space, right?

Applications have been pretty well-defined. In any enterprise, you have an engineering group that has a generally set number of applications on which an entire team is [00:33:00] dedicated and focused, both building and maintaining. And you have a developer pipeline that is focused on being able to produce, manage, and operate these things.

And so now, with this technology, the sprawl, I think, is crazy, because not only will you have software engineers building the primary product from which your company makes money,

Ashish Rajan: Yeah,

Caleb Sima: that primary product is gonna spawn 50 new features that it couldn't have done in, you know, the same amount of time the year before AI, right?

Before, you would get four features, but now you could do 30 features that get produced. And also I think your internal employees, your corporate employees, are gonna wanna start building applications too, right? This is way, way bigger than, you know... Let's think about what we do today: customized apps for every flow, for HR and people, for finance, for [00:34:00] marketing, for sales.

Everyone builds their own custom apps on Salesforce or Oracle. And now, if you can get to this level of capability, everyone is gonna wanna produce their own versions of technologies, routines, and applications inside the corporate employee base. That's gonna be crazy nuts. And all of this is gonna create a way bigger app sprawl with bad configurations, bad security measures, all of those things.

And so I think enterprises, you know, when it gets to this point, will have to start planning: what does an internal corporate engineering development pipeline look like, right? Mm-hmm. Not a software engineering group, but now corporate employees producing and building their own apps to work more efficiently.

You have to now build pipelines for this. You have to build standards for that. That, I think, will be very, very big in terms of what we do. And AppSec, by the way,

Ashish Rajan: Yeah.

Caleb Sima: which has [00:35:00] always been, I think, in my personal view, one of the smaller teams inside of InfoSec, is just gonna blow up, because the problem space is gonna be way, way bigger, at a much bigger scale.

Ashish Rajan: And more complex as well.

Caleb Sima: Right. Way more complex. Yeah.

Ashish Rajan: And I guess your point there would be, there'll be a lot of infrastructure creeping into AppSec as well. I mean, I guess, to what you're saying as well, the example is no longer just that I can produce Rust code, but I'm also creating memory infrastructure behind it.

The domains being available on the internet, so exposure. You're almost making the AppSec person multi-skilled, for lack of a better word. It doesn't really matter whether they know Rust or not.

Caleb Sima: Access to context and data is going to be crazy, right? Like, you know, even in Pepper, think about my personal life.

In my personal life, I had to have access control to some degree. I kind of implemented it in a little bit of a wonky way. But, like, my health person, my health expert, doesn't need to have shared memory access to my financial data. [00:36:00] Right. And so how do I make sure that that is done? And I did do it. Well, actually, Claude Code did it.

I didn't really have to, I just kind of discussed it with them.

Yeah. Uh, but, like, okay, now, in this memory storage, my health expert only has access to health data. But the thing is that sometimes my health expert may want access to financial data. And this is where this collection of memory and data, knowledge per se, has to be amassed and then has to be shared across all these executors. And so in an enterprise that'll be very dangerous. Like, how do you collect your data? How do you amass it? How do you, you know, distribute it, share it? Because everyone's gonna want it. Every agent is gonna need it.

Um, integration has become super key.
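The per-expert memory scoping Caleb describes here could be sketched roughly as below. This is a minimal illustration in Python, not Pepper's actual implementation, which the conversation does not detail. The class `ScopedMemory`, its methods, the agent name `health_expert`, the domain labels, and the stored facts are all hypothetical placeholders. The idea is simply that each expert reads only the memory domains it has been granted, and that a grant can be added and revoked when an expert occasionally needs data from another domain.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ScopedMemory:
    """Hypothetical shared memory store: each agent reads only granted domains."""

    # domain name -> remembered facts for that domain
    records: Dict[str, List[str]] = field(default_factory=dict)
    # agent name -> set of domains the agent is allowed to read
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def remember(self, domain: str, fact: str) -> None:
        """Store a fact under a domain such as 'health' or 'finance'."""
        self.records.setdefault(domain, []).append(fact)

    def grant(self, agent: str, domain: str) -> None:
        """Allow an agent to read one domain."""
        self.grants.setdefault(agent, set()).add(domain)

    def revoke(self, agent: str, domain: str) -> None:
        """Remove an agent's access to one domain (no-op if never granted)."""
        self.grants.get(agent, set()).discard(domain)

    def read(self, agent: str, domain: str) -> List[str]:
        """Return a copy of a domain's facts, or refuse if access was not granted."""
        if domain not in self.grants.get(agent, set()):
            raise PermissionError(f"{agent} has no access to {domain} memory")
        return list(self.records.get(domain, []))


# Made-up example data, purely for illustration.
memory = ScopedMemory()
memory.remember("health", "resting heart rate: 58")
memory.remember("finance", "monthly budget: 4000")
memory.grant("health_expert", "health")
```

In this sketch, `memory.read("health_expert", "finance")` raises `PermissionError` until someone calls `memory.grant("health_expert", "finance")`, and `revoke` takes the access away again afterwards. Making cross-domain access an explicit, reversible step is one way to handle the "sometimes my health expert may want access to financial data" case; an enterprise version would presumably also need to record who granted what and when.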

Ashish Rajan: Yeah, I mean, especially in a regulated environment as well, where regulation requires you to have data lineage, as they call it. Like, you're able to see who accessed data, when, and where it was moved around, [00:37:00] everything, all the way from the origin to the place it is sitting today.

How many people or how many applications accessed it? There's so much complexity to data even before AI. Uh, yeah, it'll be interesting. So, to your point then, I guess there are two parts to this. Obviously we were talking about building a chief of staff for people. There is another school of thought, and this, at least I believe, would happen eventually, where the playing field would level up for a Pepper in the future as well.

Someone would make a startup for this, or a product version of it.

Caleb Sima: I'm sure there's many of them already. Sure. There's a lot of these, like, EA assistant things.

Ashish Rajan: Yeah,

Caleb Sima: But they don't do, you know, what I need it to do. Um,

Ashish Rajan: Oh yeah. Yeah. I mean, your personal thing. But then I guess it's what we already do in organizations, where you may have a cloud or an AI security product, or AppSec product or whatever. It's a generalized product for, hey, 10 things on the [00:38:00] internet that people care about, but somehow you have an 11 and 12 that you have to customize yourself for your unique environment.

So same would apply here as well,

Caleb Sima: which gets into the next theme of really personalized software. Right? Mm-hmm. I think it's gonna be a given.

Ashish Rajan: That's kind of where I was going with this, and I guess maybe we don't see it today, but that's where Sam Altman and Jony Ive and Apple and the world are going, 'cause they realized there would need to be a device that would be your, quote unquote, Pepper.

Whether it's your iPhone with Apple Intelligence, or whether it's a Sam Altman piece with, uh, whatever they were creating, something would have to become your personalized Pepper who walks around with you and is with you every time, whenever you need information.

Caleb Sima: I, I think that, you know, that's part of just, you know, an always on recording device.

Yeah. Yeah. And the other thing, when I think of personalized software, um, to some extent, and I'm going a little bit out on a limb here, but [00:39:00] like, how do I purchase a product from a vendor and then use AI to be able to customize it to the exact needs of my requirements? Right. Yeah. And so there's an aspect here of how does the vendor landscape change. Like, for example, vendors as SaaS, or vendors with non-open-source code,

how do they adapt to a potential future where personalized software is a requirement? And right now you can do this with open source, right? Because I can take any open source project and clearly personalize it with AI very quickly.

Ashish Rajan: Yep.

Caleb Sima: To get it to do what I want. But, you know, what does that look like for vendors in the future?

How do you offer the ability to do this? Is it just APIs now that are exposed through MCP that then make it your [00:40:00] personalized version? I don't know if that's good enough, or what level of exposure the next generation of software has to have in order to make this a reality.

Ashish Rajan: That's good food for thought, uh, personalized software and a potential chief of staff. For most people out there who watch and listen to this, I would definitely love to hear people talk about what they're doing with AI. Maybe in the comment section, if you wanna drop that as well. Was there anything else that you wanted to cover from a chief of staff, build-your-AI perspective?

I think we have kind of given people an insight into how far they can go with this and... Well, I don't think that

Caleb Sima: actually the recording of this episode wasn't even intended to be this, it just our, our

Ashish Rajan: pre-talk.

Caleb Sima: Our

Ashish Rajan: pre-talk. It'll be hilarious if people comment pepper and then we have to do a workshop on this.

That would be really hilarious. But hey, oh

Caleb Sima: man. Yeah,

Ashish Rajan: we'll find out what

Caleb Sima: happens. Forcing pepper either, but,

Ashish Rajan: Yeah, let's open source Pepper and just have, uh, people just solve their own bugs. And AI can solve its [00:41:00] own bugs as well. But I guess, um, just to kind of wrap this up as well, then: I guess 2026 is the year that people should be continuing to pay attention to AI developments.

Maybe not the tooling set, but definitely the code updates because I guess as much as I would like to sit here and go, Hey, jobs are not at risk, perhaps some jobs definitely are.

Caleb Sima: Yeah. I mean, man, just going through this, this weekend or this holiday, not this weekend, but this holiday. Yeah, I mean, I, I am definitely on this page of, uh, if we continue at this rate, which there's nothing saying that we wouldn't

Ashish Rajan: Yeah.

Caleb Sima: There's just no question jobs are gonna be eliminated. And I just think, just across the board, in some sense, intelligence is what AI has now made a commodity.

Ashish Rajan: Yeah.

Caleb Sima: And that [00:42:00] is pretty interesting and scary to think about. So that means you, you spend your whole life learning to process to be intelligent, to be an expert at something.

And that's where your value comes from and when that is now a commodity. Yeah, I mean, that's just, that's just gonna eliminate things.

Ashish Rajan: Yeah. Maybe that's where, I think, people talk about the whole architect thing, where you almost become... and I do wanna, uh, first of all, I agree with what you said, but I think it's an interesting thought exercise to kind of imagine that most of the projects that we have worked on were

people going out on a limb: this would be a great idea, a bank would be a great idea, housing would be a great idea. So that used to be the gut feeling of most people, that, hey, I think it's a great idea, we should try and build a bank. Because someone would come, like, you know, I'm gonna find a problem, make a solution for the problem, sell the problem.

Yeah. I think, in my mind, what happens to that layer is where I'm coming from, 'cause that's what [00:43:00] created the experts. Once I created the bank and I had a solution, I wanted someone to maintain the solution for me. So I think my mind is going, what does that mean for the new architects, as to how we design and where it's going?

But I'm definitely, um, more hopeful for the people who are experienced. I'm more nervous for the people who are coming in now, 'cause they don't have the experience to kind of make a judgment call on, this is good and this is bad. But people who are experienced, at least they are able to say, oh, that's bad code.

It should have some testing. This should run without any bugs. Hey, to what he was saying, I would want my finance and health memory to be separate. Yeah, there is no scenario where they should be talking. There are things that you, me, and others who are watching and listening who have experience would make a call on that

others would not, who are just starting out. I think my nervousness comes more for them than for the experienced people. I think that they will hopefully figure it out. I'm optimistic in that context, but I think I'm more nervous for the people in university. [00:44:00] But with that said, I do wanna give some hope to people as well,

not, like, lower the bar of the podcast and walk away. Uh,

Caleb Sima: but you, you, but you gotta be, you know, realistic, you know, too, like, you know, there are no answers to, to some of this Today.

Ashish Rajan: Today,

Caleb Sima: I was gonna give some positive, I mean, the younger people. The one thing that they do have is they have time to adapt.

So we don't know what the adaption is, per se. Yeah. But at least they've got the time to adapt versus older people. Set in their careers, may not have any time to adapt

Ashish Rajan: mm-hmm.

Caleb Sima: Depending on what the next two to three years look like.

Ashish Rajan: Yeah.

Caleb Sima: Um, yeah. I don't know, man, like it's, uh, you know, the one thing that has always been said that continues to stand true though, is being a plumber or a general contractor is always a safe job.

Ashish Rajan: So if you wanna invest in a plumbing business today, this is a great idea. This is a reminder, people: find a plumbing business close to you, or actual real estate. Like, you know, you start building [00:45:00] houses if you want. That would be a great skill, 'cause people would still need houses to live in.

Now, it's like super cold here in London, so people would still need warm houses to live in. There are so many jobs that are still out there. Maybe the job profile changes. One manager that I used to have ages ago used to say that at one point in time, the most respected job you could get was the chariot rider for the queen.

That job doesn't even exist. Yeah.

Caleb Sima: No, no.

Ashish Rajan: Yeah. No one even knows what that job is or if there is one, when there is no queen to begin with. So, but I guess that's kind of what happens here, is what I wanna leave people with, that it would transform into something, but it would not be like a snap of a finger transformation.

You would see it coming as someone who's experienced. I think that's kind of where I'm going with this. It's not gonna be like you snap a finger. The entire organization doesn't exist tomorrow.

Caleb Sima: Man, I don't know. I'm, I'm, I'm starting to lean,

Ashish Rajan: I'm trying to leave them with something positive here.

Caleb Sima: I, I know, and I'm just [00:46:00] leaning on.

I, the more I, I screw it. This, the more I get negative, more pessimistic. I, I get

Ashish Rajan: Well, let people comment on what they feel about this. I mean, I'm with you, and I would definitely say you can't be delusional. Definitely there is a reality to this, which is why people should keep an eye on the whole AI thing as it evolves.

But most things do take time to kind of evolve and be slapped in your face. Many people perhaps say no to AI for that reason. But I think it'll be interesting, man. I'm obviously still very optimistic, and balancing on the realist, uh, yeah, tightrope that we are walking.

Caleb Sima: Your job in this discussion is to be the positive person.

Ashish Rajan: I mean, it's like the glass-half-full versus glass-half-empty conversation. No, but I think we definitely have, uh, well, people can let us know your thoughts in the comment section. Or just feel free to reach out, and we're more than happy to have more conversations about this.

And again, if you want a workshop on Pepper, definitely drop the word "pepper" and we'll give you a workshop. But yeah, thank you so much for your [00:47:00] time and, uh, we'll see you next episode. See you. Thank you for watching or listening to that episode of AI Security Podcast. This was brought to you by Tech riot.io.

If you want to hear or watch more episodes of AI security, check that out on ai security podcast.com. And in case you're interested in learning more about cloud security, you should check out a sister podcast called Cloud Security Podcast, which is available on Cloud Security Podcast tv. Thank you for tuning in, and I'll see you in the next episode.

Peace.


