Why Hashes, Filenames, and URLs Don’t See What AI Apps Actually Do

Dedi Shindler

Installing an app used to mean unlocking the filesystem. Today it can mean handing over the filesystem, the browser, the user’s logged-in accounts, and the right to act on company data — all from the same kind of double-click that installed a screenshot tool ten years ago.
Employees are now building their own apps with AI tools like Lovable, Replit, and Base44 — faster than IT can list them. Red Access found roughly 5,000 such apps across customer environments, with about 40% reachable by anyone on the open internet.
The Copilot pane inside Outlook is one of the most-used AI surfaces in the enterprise — and the one most security tools see least clearly. Browser-side tools watch the tab; network-side tools watch the traffic; neither watches what actually happens inside the pane.


For two decades, controlling what software ran inside a corporate environment meant controlling identifiers. File hashes, executable names, URL allowlists, domain reputation, signed publishers: the stack catalogued the known-bad, permitted the known-good, and trusted the perimeter to do the rest. That model worked when the unit of governance was the artifact: something you installed, ran, and could fingerprint. It works less well now that much of the software entering enterprises is never installed, run, or fingerprinted in the old sense. What looks like a detection failure is something else: a layer mismatch.

Identifier-based controls were built for a category that’s shrinking

Hash blocking, application allowlisting, URL filtering, IP reputation — all of these are premised on the idea that the thing you’re trying to govern has a name, a hash, a publisher, a destination. You catalogue the bad ones, allow the good ones, and the perimeter enforces the list. For a long time that worked, because corporate software was a relatively static set of installable executables pointing at a known set of SaaS destinations.

None of these controls are dead. They catch real things every day, and any honest argument about modern security has to start there. The point isn’t that identifier-based controls don’t work. It’s that the category of things they work on is shrinking relative to the things that now matter most, and the shrinkage is accelerating.

The techniques aren’t new. What they now reach is.

Installing software in a user’s home directory — without admin rights, without a system-wide trace — is a pattern that has existed for as long as Windows has had a Users folder. Renaming and repackaging an installer to break a hash blocklist is older than that. Search-result poisoning is older still. Nothing on this list is novel. What’s new is that the most consequential software now entering organizations is the software that least resembles a fingerprintable artifact — and it’s arriving at a rate the cataloguing model was never designed to keep up with.

The same AI tool ships in multiple shapes. Microsoft Copilot is a desktop app, a browser tab on copilot.microsoft.com, a pane embedded inside Outlook, a pane embedded inside Teams, and an agent reachable through Microsoft 365 — five surfaces, one assistant, and the user moves between them inside a single workflow. A hash that catches the desktop client says nothing about the same vendor’s web client running in a Chrome tab two seconds later, or the embedded pane that rendered inside Outlook while the user was reading mail. An executable allowlist applied to the install path doesn’t reach the four surfaces that never installed anything.

The blast radius of a user-context install has grown by orders of magnitude. A productivity utility installing into a user’s AppData folder ten years ago could touch the filesystem. An AI assistant installing the same way today can touch the filesystem, the browser, the user’s tokens, and — increasingly, through MCP and similar agentic interfaces — take actions against corporate data on the user’s behalf. The install pattern is unchanged. The consequences of it aren’t.

AI-built mini-apps are being produced faster than any allowlist can catalogue them. Recent Red Access research on the vibe-coding category — the AI-driven app builders that let employees describe an application and get a working URL — examined hundreds of thousands of web assets surfaced across customer environments, identifying roughly 5,000 unique applications, with roughly 40% reachable on the open internet and a meaningful share carrying real business data. Any model that depends on a finite list of known things falls behind a process that generates new things continuously.

Repackaging and rewrapping installers used to be a slow loop. That loop has been industrialised: search-result placements are bought, alternative hosting is cheap, and a popular AI desktop app can be republished with a trojan loader and a fresh hash in hours. Block by name, and the renamed file gets through. Block by hash, and the next build evades it. What’s different now is the speed at which the techniques reproduce. And in the cases that matter most, what the rewrapped installer does after it runs (the prompts it sends, the tokens it touches, the data it pulls) happens at a layer the hash never reached anyway.
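As a rough illustration of why name and hash blocklists lose to repackaging, the sketch below appends a single byte to a placeholder "installer" and shows that both checks stop matching. The byte strings, the file names, and the helper functions are hypothetical stand-ins chosen for this example, not real binaries or any vendor's API.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Placeholder bytes standing in for an installer; not a real binary.
original_build = b"example installer contents, build 1"
# A repackaged build: the same contents plus one padding byte.
repackaged_build = original_build + b"\x00"

# Blocklists keyed on identifiers (both entries are hypothetical).
hash_blocklist = {sha256_hex(original_build)}
name_blocklist = {"assistant-setup.exe"}


def blocked_by_hash(data: bytes) -> bool:
    """True only if the exact bytes were catalogued beforehand."""
    return sha256_hex(data) in hash_blocklist


def blocked_by_name(filename: str) -> bool:
    """True only if the exact file name was catalogued beforehand."""
    return filename.lower() in name_blocklist


print(blocked_by_hash(original_build))            # True
print(blocked_by_hash(repackaged_build))          # False
print(blocked_by_name("assistant-setup.exe"))     # True
print(blocked_by_name("assistant-setup-v2.exe"))  # False
```

Neither check says anything about what the program does once it runs, which is the part of the problem the rest of this article is concerned with.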

None of these techniques is individually new. What’s new is the rate at which they now matter — and the fact that the most consequential software entering organizations is no longer the kind of thing identifier-based controls were built to recognize.

The gap isn’t a detection problem. It’s a layer problem.

What the industry has been calling a detection gap is actually a layer mismatch.

Identifier-based controls inspect the artifact and the destination. The thing that matters about an AI app — what it accesses, what it sends, what it acts on — happens inside an interactive session. Network controls see packets. Endpoint controls see process behavior. Neither reaches inside the interaction itself — what was typed, what was pasted, what the model returned, what the user did with it. The session layer — what happens inside the tab, the embedded webview, the AI client, after the network has delivered the bytes and before the user sees the result — is where the behavior actually lives, and historically it’s been the layer with the least native instrumentation.

The firewall isn’t broken. The EDR isn’t broken. The CASB isn’t broken. Each one is doing exactly what it was designed to do, on the layer it was designed to do it on — and what each was designed to inspect, it inspects well. What none of them inspects is the thing happening inside the session after the bytes arrive: the prompt typed into the model, the file pasted into the chat, the credential entered into a form rendered inside an embedded webview. The enforcement point is still valid. The layer being inspected is the problem.

Where the controls have to move to

Once governance reaches the session layer, the unit of policy stops being the artifact and starts being the interaction. A control at that layer doesn’t ask whether an AI tool loaded. It asks what the tool did after it loaded — what was typed into it, what was pasted out of it, what was uploaded, what was downloaded, what credential was passed, what extension touched the page. The questions change because the vantage point changed.

That vantage point covers any web session, regardless of where it renders: a Chrome tab, an Edge window, the Copilot pane inside Outlook, the embedded webview inside Slack or Teams, the ChatGPT or Claude desktop client. The session still takes place even when no browser window is visible. A managed enterprise browser sees the tab it owns; a network agent sees the packets it routes; neither reaches deeply or consistently into the Copilot pane embedded inside the Outlook desktop app, which is where a meaningful share of AI activity now actually happens.

The industry is collectively moving from visibility at this layer to action at it, and the bar is rising fast. Real-time enforcement on the session is a shipped capability today: blocking a paste of source code into a public model, stopping an upload to a vibe-coded app reachable from the internet, preventing a credential from being typed into a phishing page rendered inside an embedded webview. Automated containment that spans the broader stack is the next layer of work.
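To make "the interaction as the unit of policy" concrete, here is a minimal sketch of how the three enforcement examples above might be expressed as rules over interaction events. Everything in it is an assumption made for illustration: the `Interaction` fields, the label strings such as `"public_model"`, and the `evaluate` function are hypothetical and do not describe any product's schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One user action inside a session (all fields are hypothetical)."""
    surface: str        # where the session rendered, e.g. "outlook_copilot_pane"
    action: str         # what the user did, e.g. "paste", "upload"
    destination: str    # assumed label for where the content is going
    content_label: str  # assumed classifier output, e.g. "source_code"


def evaluate(event: Interaction) -> str:
    """Return "block" or "allow" based on what happened, not on which app loaded."""
    if (event.action == "paste"
            and event.content_label == "source_code"
            and event.destination == "public_model"):
        return "block"
    if event.action == "upload" and event.destination == "internet_reachable_app":
        return "block"
    if event.action == "type_credential" and event.destination == "unverified_page":
        return "block"
    return "allow"


events = [
    Interaction("chrome_tab", "paste", "public_model", "source_code"),
    Interaction("outlook_copilot_pane", "paste", "public_model", "source_code"),
    Interaction("teams_webview", "type_credential", "unverified_page", "password"),
    Interaction("chrome_tab", "paste", "public_model", "meeting_notes"),
]

for event in events:
    print(event.surface, event.action, evaluate(event))
```

Note that `surface` is recorded but never consulted by the rules: in this sketch the same paste is blocked whether it happens in a Chrome tab or in the Copilot pane inside Outlook, which is the property the preceding paragraphs argue for.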

Session-layer controls aren’t a replacement for the existing stack. They’re the layer the existing stack hasn’t been able to reach. The firewall keeps doing what it does. The EDR keeps doing what it does. The session layer adds what neither could see — and the reason hash, filename, and URL no longer carry the weight of the answer is that the answer no longer lives at the layer they were built to inspect.

The layer problem is solvable without rebuilding the stack you already own. See how session-layer enforcement extends the firewall, the EDR, and the SSE you’ve already deployed, without a managed browser, an extension, or a new network plane.
