March 7, 2026 · 7 min read

Why Point-in-Time Scans Give You a False Sense of Security for AI Agent Skills

SkillShield Research Team

Security Research

The numbers keep coming in.

Aguara scanned 31,330 AI agent skills across five registries and found security findings in 7.4% of them — 448 rated critical. Tork Network ran a tighter sample: 500 skills from ClawHub alone, flagged 10% as dangerous, and found typosquats with confirmed C2 connections. ClawCare published a Cisco finding of an OpenClaw skill exfiltrating data via a silent curl call. ClawDefend appeared on Hacker News with 277 skills scanned and counting.

Practitioners are paying attention. A detailed post on r/LocalLLaMA this week independently documented four real attack vectors inside skill files — data exfiltration, privilege escalation, Base64 shell payloads, hidden Unicode injections — and ended with a direct question: "Is there a tool that does this analysis continuously?"

The answer to that question matters more than most security teams realise.

The Problem with Scanning Once

Every one of the tools above has the same fundamental constraint: they run at a point in time.

You scan your installed skills today. You get a clean report. You move on.

But AI agent skill ecosystems don't freeze on the day you scan them. Skills are updated silently. New versions get published without changelog entries. A skill that was safe three weeks ago can pull in a new dependency with a known CVE. A maintainer account can be compromised and a malicious update pushed. A skill you vetted before you installed it can behave differently after an auto-update.

This isn't hypothetical. The Aguara data covers five registries including ClawHub — and their finding of 448 critical issues is a snapshot, not a floor. The Tork Network data found typosquats with active C2 connections. Those C2 servers don't go dark because someone ran a scan.
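The "silent update" problem can be made concrete with a small sketch: record a content hash for every installed skill in a lockfile, then flag any skill whose files change without an explicit install. All names here (`skills.lock.json`, the directory layout) are hypothetical, not part of any real skill runtime.

```python
import hashlib
import json
from pathlib import Path

LOCKFILE = Path("skills.lock.json")

def skill_hash(skill_dir: Path) -> str:
    """Hash every file in a skill directory, in a stable order."""
    h = hashlib.sha256()
    for f in sorted(skill_dir.rglob("*")):
        if f.is_file():
            h.update(f.name.encode())
            h.update(f.read_bytes())
    return h.hexdigest()

def record_baseline(skills_root: Path) -> None:
    """Write the current hash of every installed skill to the lockfile."""
    lock = {d.name: skill_hash(d) for d in skills_root.iterdir() if d.is_dir()}
    LOCKFILE.write_text(json.dumps(lock, indent=2))

def detect_silent_updates(skills_root: Path) -> list[str]:
    """Return skills whose on-disk content no longer matches the lockfile.

    Skills absent from the lockfile are treated as freshly installed,
    not as changed."""
    lock = json.loads(LOCKFILE.read_text())
    changed = []
    for d in skills_root.iterdir():
        if d.is_dir() and lock.get(d.name) not in (None, skill_hash(d)):
            changed.append(d.name)
    return changed
```

Run `record_baseline` after every deliberate install, and `detect_silent_updates` on a schedule; any non-empty result means something changed underneath you. This catches the symptom, not the cause — it tells you *that* a skill changed, not whether the change is malicious.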

The Supply Chain Analogy That Actually Fits

Think about how software dependency scanning works in a mature engineering org. You don't run npm audit once at project kickoff and call it done. You run it in CI. You get Dependabot alerts. You have SBOMs that update when your dependencies update.

Nobody argues that a point-in-time audit replaces continuous dependency monitoring. Everyone understands why: the threat surface changes daily.

AI agent skills are the same thing, except the blast radius is larger. A compromised npm package can affect one codebase. A compromised skill can affect every agent that loads it across an entire platform, reading context windows, making tool calls, and operating on behalf of users with live credentials.

The risk profile demands the same level of continuous visibility.

What Continuous Monitoring Actually Means

Running the same static analysis on a loop is not enough.

Effective continuous skill security monitoring needs to:

  1. Track the full skill lifecycle — not just installed state, but every version published to the registry. A malicious update should trigger an alert before it gets installed, not after.
  2. Maintain a behavioural baseline — what does this skill normally call? What external hosts does it contact? Deviation from baseline is often the first signal of compromise.
  3. Correlate across the ecosystem — a typosquat attack doesn't target one developer; it targets everyone using a skill catalog. A detection on one install should propagate as a signal across the platform.
  4. Integrate with the developer workflow — alerts that require someone to open a dashboard get missed. Security signals need to surface where developers are already working: in the IDE, in CI, in the agent runtime.
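Item 2 above can be sketched in a few lines: keep a per-skill allowlist of observed external hosts, and treat any host first seen after the learning window closes as drift. This is a minimal illustration under assumed names — a real monitor would also baseline tool calls, arguments, and timing.

```python
from collections import defaultdict

class BehaviourBaseline:
    """Track which external hosts each skill has been seen contacting."""

    def __init__(self):
        self.known_hosts = defaultdict(set)
        self.learning = True  # while learning, new hosts extend the baseline

    def observe(self, skill: str, host: str) -> bool:
        """Record a contact. Returns True on a deviation: a host
        first contacted after the learning window closed."""
        if host in self.known_hosts[skill]:
            return False
        if self.learning:
            self.known_hosts[skill].add(host)
            return False
        return True  # unseen host after the baseline froze: alert

baseline = BehaviourBaseline()
baseline.observe("pdf-export", "api.example.com")  # learned, no alert
baseline.learning = False
baseline.observe("pdf-export", "api.example.com")  # known host: False
baseline.observe("pdf-export", "c2.attacker.example")  # drift: True
```

The point of the sketch is the shape of the signal: the first contact with `c2.attacker.example` is an alert the moment it happens, regardless of whether any static scanner has a signature for it.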

What Four Independent Scanners Actually Proved

The wave of scan tools that appeared this month is net positive for the ecosystem. Aguara, Tork Network, ClawCare, and ClawDefend each caught real issues. The research is valuable. The community attention is warranted.

But the proliferation of one-off tools also reveals the gap they leave behind. Four independent tools scanning overlapping corpora and surfacing overlapping but different findings tells you this problem isn't solved by any single scan. It tells you the attack surface is large, dynamic, and under-monitored.

A scan is a starting point. Continuous monitoring is the answer.

The Ask

If you've run one of these scanners and found your environment clean — good. Now set a reminder to run it again next month. And the month after that. And after every skill update.

Or build something that does that automatically.
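The "automatically" version reduces to a small loop: re-run whatever scanner you use on a schedule and alert only on the diff against the last run. `run_scanner` here is a stand-in for any of the tools above, not a real API.

```python
import json
from pathlib import Path

STATE = Path("last_findings.json")

def run_scanner() -> set[str]:
    """Stand-in: invoke your scanner of choice and return finding IDs."""
    raise NotImplementedError

def rescan_and_diff(findings: set[str]) -> set[str]:
    """Compare fresh scan results against the last run; persist the new
    state and return only the findings that were not present before."""
    previous = set(json.loads(STATE.read_text())) if STATE.exists() else set()
    STATE.write_text(json.dumps(sorted(findings)))
    return findings - previous
```

Wire `rescan_and_diff(run_scanner())` into a cron job or a scheduled CI pipeline and page only when the returned set is non-empty — the same pattern Dependabot applies to package dependencies.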

SkillShield is built for teams who can't afford to rely on remembering to re-scan. Continuous monitoring, ClawHub-integrated, OpenClaw-native — the secure path with as little friction as the insecure one.


Sources

  • Aguara Go scanner — Hacker News discussion (7.4% of 31,330 skills, 448 critical findings)
  • Tork Network scan — Hacker News discussion (10% of 500 ClawHub skills dangerous, typosquats with C2)
  • ClawCare + Cisco exfil finding — Hacker News discussion
  • ClawDefend Show HN — 277 skills scanned
  • r/LocalLLaMA supply chain thread — 4 attack vectors, direct ask for continuous tooling

Stop relying on point-in-time scans

SkillShield monitors your AI agent skills continuously — catching silent updates, behavioural drift, and new threats as they emerge, not weeks after the fact.

Start monitoring