SkillSpector is an Apache-2.0 security scanner from NVIDIA for AI agent skills: instruction packages that may contain Markdown, scripts, dependencies, tool definitions and activation rules. It accepts a local directory, individual file, zip archive, Git repository or URL and produces terminal, JSON, Markdown or SARIF findings with a 0–100 risk score.

Its value is the threat model. A skill is not merely documentation: once an agent follows it, its instructions can trigger file access, shell execution, credential use, network transmission or persistent configuration changes. SkillSpector checks both conventional software risks and agent-specific problems such as prompt injection, excessive agency, memory poisoning, trigger abuse and MCP tool-description poisoning. A clean report is evidence from one pre-install analysis layer, not a guarantee that runtime behavior is safe.

Figure: screenshot of the SkillSpector repository. SkillSpector can make skill review repeatable and visible in CI, but its report must be combined with provenance checks, sandboxing, permission controls and human verification.

Where SkillSpector fits in the trust pipeline

 source identity + commit pin
             |
             v
 unpack in isolated workspace
             |
     .-------+--------.
     v                v
 static rules     dependency / OSV
 AST + taint      signatures + metadata
     '-------+--------'
             v
 optional LLM semantic review
             |
       human disposition
 accept / fix / suppress-with-reason / reject
             |
 sandbox + least privilege + runtime monitoring

This order matters. Scanning an unpinned URL today does not prove that tomorrow’s download is identical. Running an LLM over hostile instructions without isolating its credentials creates another exposure. Uploading a SARIF file without a blocking policy creates visibility but no control. Treat the scanner as one stage in a reproducible admission process.
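The pinning step can be sketched as a digest comparison before anything is unpacked or scanned. This is a minimal illustration that assumes the archive hash was recorded at review time; the function names are hypothetical and not part of SkillSpector.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path: Path, pinned_sha256: str) -> bool:
    """True only if the file on disk is byte-identical to the recorded pin."""
    return hmac.compare_digest(sha256_of_file(path), pinned_sha256.lower())
```

If `matches_pin` returns False, the download differs from what was reviewed and the pipeline should stop before unpacking.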

What the two-stage analyzer does

Stage	Mechanism	Strength	Expected weakness
Static patterns	Regex and rule matching across skill files	Fast, deterministic and explainable	Benign examples can match; novel wording can evade
Behavioral AST	Detects Python execution primitives and dangerous call chains	Finds code behavior beyond keywords	Does not execute code; language and indirection coverage is finite
Taint tracking	Looks for flows from inputs/secrets/files to execution or network sinks	Connects source and sink into a meaningful path	Dynamic dispatch and complex frameworks can defeat static flow
OSV lookup	Queries OSV.dev for declared dependency vulnerabilities	Current vulnerability intelligence when online	Offline mode uses a smaller fallback; undeclared/vendored code may be missed
YARA signatures	Matches known malware, webshell, miner and exploit patterns	Useful for known families	Unknown, packed or modified payloads may evade
Optional LLM review	Assesses intent and context, explains findings	Can reduce obvious static false positives	Probabilistic, provider-dependent and exposed to adversarial content

The project describes static analysis as high-recall with moderate precision and says optional semantic review improves precision. Do not copy a headline precision figure into policy without reproducing it on your own corpus. A security team cares about per-category false negatives, not only one aggregate number.
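To make the behavioral AST row concrete, here is a small sketch of the general technique using Python's standard `ast` module. It is not SkillSpector's implementation; the watch list and function names are assumptions chosen for illustration.

```python
import ast

# Hypothetical watch list for this sketch; not SkillSpector's rule set.
DANGEROUS_CALLS = {"eval", "exec", "os.system", "subprocess.run", "subprocess.Popen"}


def dotted_name(node: ast.AST) -> str:
    """Rebuild a dotted name such as 'subprocess.run' from a call target."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""  # target is not a plain name chain, e.g. getattr(x, "run")(...)


def find_dangerous_calls(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, name) pairs for calls whose target is on the watch list."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in DANGEROUS_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)
```

The sketch parses code without running it, which matches the "does not execute code" weakness in the table. It also shows the indirection limit: a call written as `getattr(subprocess, "run")(...)` is not reported, because its target is not a plain dotted name.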

The 16 risk categories in operational terms

Risk group	Examples covered	What a reviewer should verify
Instruction control	Prompt injection, hidden directives, system-prompt leakage	Are instructions visible, scoped and subordinate to host policy?
Data access	Environment harvesting, filesystem enumeration, context exfiltration	Which exact data classes can flow to which domains?
Authority	Privilege escalation, excessive agency, tool misuse	Does the skill request only the minimum reversible permissions?
Persistence	Memory poisoning, self-modification, startup or cron persistence	Can the skill modify future sessions or its own enforcement?
Software supply chain	Unpinned packages, remote scripts, CVEs, typosquatting, obfuscation	Are hashes, commits and dependency provenance reproducible?
MCP boundary	Wildcard permissions, undeclared capabilities, tool poisoning	Does declared metadata match code and runtime traffic?
Code behavior	exec/eval, subprocess, dynamic imports, tainted execution	Is dangerous behavior essential, constrained and safely parameterized?
Output and triggers	Unsanitized output, broad activation, shadow commands	Can ordinary text unexpectedly activate or cross a trust boundary?

SkillSpector currently documents 64 patterns across 16 categories, which the table above groups into eight risk areas. The pattern count is useful for versioning coverage, not as a security score by itself. Ten variants of a known regex do not necessarily protect against one new attack chain, while a single high-quality taint rule may prevent a critical leak.

Understanding the risk score

The documented model adds 50 points for a critical finding, 25 for high, 10 for medium and 5 for low, applies a 1.3× multiplier when executable scripts are present, and caps the result at 100. Published bands label 0–20 low, 21–50 medium, 51–80 high and 81–100 critical.

Score/band	Default recommendation	What it does not mean
0–20 / Low	Continue manual review and sandbox test	Not “safe”; blind spots may contain unobserved behavior
21–50 / Medium	Require owner disposition and remediation	Not automatically malicious; legitimate network or shell use may score
51–80 / High	Block installation unless security grants an exception	Score alone does not establish intent
81–100 / Critical	Reject and investigate source/provenance	Multiple findings may describe one root cause; deduplicate during triage

A score compresses heterogeneous evidence. Preserve rule IDs, file locations, snippets, confidence, analyzer version and source commit. Two skills with a score of 50 can have radically different risks: one critical code-execution path versus ten low-severity hygiene findings.
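The scoring arithmetic can be sketched as follows. The point values, multiplier, cap and bands come from the documented model described in this section; the rounding step and all names are assumptions, so confirm the behavior of your pinned version before relying on exact numbers.

```python
SEVERITY_POINTS = {"critical": 50, "high": 25, "medium": 10, "low": 5}
SCRIPT_MULTIPLIER = 1.3
MAX_SCORE = 100


def risk_score(severities: list[str], has_scripts: bool) -> int:
    """Sum severity points, apply the script multiplier, then cap at 100."""
    base = sum(SEVERITY_POINTS[severity] for severity in severities)
    if has_scripts:
        base *= SCRIPT_MULTIPLIER
    return min(MAX_SCORE, round(base))  # rounding is an assumption of this sketch


def band(score: int) -> str:
    """Map a 0-100 score onto the published bands."""
    if score <= 20:
        return "low"
    if score <= 50:
        return "medium"
    if score <= 80:
        return "high"
    return "critical"
```

Under these assumptions, one critical finding and ten low findings both score 50 ("medium") when no scripts are present, and the single critical finding rises to 65 ("high") once the multiplier applies. The identical score for very different evidence is why the underlying findings must be preserved.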

Static-only versus LLM-assisted scanning

Mode	Use when	Data boundary	Policy implication
`--no-llm`	Fast CI, untrusted inputs, air-gapped review or deterministic baselines	No semantic payload sent to a model provider	Expect more false positives and missed intent
Managed OpenAI/Anthropic/NVIDIA endpoint	Contextual triage is worth external processing	Skill content may leave the environment	Review provider terms, retention, region and credentials
Local OpenAI-compatible endpoint	Sensitive skills or reproducible internal evaluation	Can remain local if networking is controlled	Model quality and prompt-injection robustness become your responsibility

Never give the scanner’s semantic model more authority than it needs. The model should analyze text, not possess production secrets or installation permissions. Treat any skill content as adversarial input and log the exact provider/model because semantic dispositions can change between versions.
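One way to apply both points is to launch the semantic review with an allowlisted environment and to log what was reviewed by which model. The sketch below is illustrative only: the allowlist, function names and record fields are hypothetical and not part of SkillSpector.

```python
import hashlib
from collections.abc import Mapping
from datetime import datetime, timezone

# Hypothetical allowlist; adjust to what the review process genuinely needs.
ALLOWED_ENV_VARS = frozenset({"PATH", "LANG", "LC_ALL"})


def minimal_env(source_env: Mapping[str, str],
                extra_allowed: frozenset[str] = frozenset()) -> dict[str, str]:
    """Keep only allowlisted variables so unrelated secrets are not inherited."""
    allowed = ALLOWED_ENV_VARS | extra_allowed
    return {name: value for name, value in source_env.items() if name in allowed}


def review_audit_record(provider: str, model: str, skill_text: str) -> dict[str, str]:
    """Record which provider and model judged which exact content."""
    return {
        "provider": provider,
        "model": model,
        "content_sha256": hashlib.sha256(skill_text.encode("utf-8")).hexdigest(),
        "reviewed_at": datetime.now(timezone.utc).isoformat(),
    }
```

The dictionary returned by `minimal_env(os.environ)` can be passed as the `env` argument of `subprocess.run`, so the review process starts without production tokens in its environment.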

Known blind spots

NVIDIA explicitly lists important limitations: non-English attacks may be missed; text hidden in images is not analyzed; compiled, encrypted or binary code is outside static visibility; and runtime behavior is not executed. OSV coverage is reduced when the network is unavailable. Static analysis may also struggle with generated code, reflection, second-stage downloads, environment-conditioned behavior and abuse that appears only after a remote server changes its response.

Blind spot	Compensating control
Image or document instructions	Extract OCR/metadata and review media in a separate sandbox
Binary or encrypted artifact	Reject opaque payloads or require reproducible source and malware sandboxing
Runtime-only behavior	Execute in an instrumented container with denied-by-default network/filesystem
Non-English or obfuscated prompt	Use language-aware review, Unicode normalization and adversarial test cases
Remote dependency changes	Pin commit/hash, mirror dependencies and verify signatures/SBOM
Scanner evasion or defect	Use defense in depth, independent review and scanner regression tests
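For the Unicode normalization control, a reviewer-side preprocessing step might look like the sketch below. It uses only Python's standard `unicodedata` module; the function names are hypothetical, and treating every format character (category Cf) as suspicious is an assumption that will also flag some legitimate text, such as zero-width joiners in emoji sequences.

```python
import unicodedata


def hidden_characters(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for invisible format characters (category Cf)."""
    return [
        (index, unicodedata.name(char, f"U+{ord(char):04X}"))
        for index, char in enumerate(text)
        if unicodedata.category(char) == "Cf"
    ]


def normalize_for_review(text: str) -> str:
    """Fold compatibility forms with NFKC, then drop format characters."""
    folded = unicodedata.normalize("NFKC", text)
    return "".join(char for char in folded if unicodedata.category(char) != "Cf")
```

Running pattern rules on both the raw and the normalized text helps catch instructions split by zero-width characters or written in full-width letters, while the raw copy preserves evidence of the obfuscation.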

A CI admission policy that teams can enforce

1. Resolve immutable input. Record repository, commit SHA, archive hash, author and license before scanning.
2. Unpack without execution. Disable hooks, templates and package installation while inventorying every file.
3. Run static-only first. Save JSON and SARIF with the SkillSpector version and whether OSV was reachable.
4. Apply explicit gates. Block any unsuppressed critical/high result, credential-to-network taint, hidden instruction or opaque executable.
5. Perform semantic review separately. Send only approved content to an isolated provider and never allow the model to install the skill.
6. Require human disposition. Every suppression needs an owner, rationale, scope and expiry date.
7. Test runtime boundaries. Start with no secrets, read-only fixtures, a domain allowlist and a disposable account.
8. Re-scan updates. Trigger on source changes, dependency lock changes, rulepack releases and newly disclosed CVEs.
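The human-disposition requirement can be enforced mechanically. The sketch below models a suppression record with the four required fields and rejects entries that are incomplete or expired; the class and field names are hypothetical and do not reflect any SkillSpector file format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Suppression:
    rule_id: str
    owner: str
    rationale: str
    scope: str      # for example a file path or skill name
    expires: date


def is_active(suppression: Suppression, today: date) -> bool:
    """A suppression counts only if every field is filled in and it has not expired."""
    required = (suppression.rule_id, suppression.owner,
                suppression.rationale, suppression.scope)
    return all(value.strip() for value in required) and suppression.expires >= today
```

A gate would then ignore any finding only when a matching suppression is still active, so expired exceptions start blocking again without manual cleanup.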

Finding	Suggested gate	Exception evidence
Critical/high taint or execution chain	Hard block	Security review plus code remediation; avoid permanent suppression
External network destination	Block unless allowlisted	Business purpose, data classification, domain and payload proof
Unpinned dependency	Block release	Lockfile/hash and controlled update process
Broad trigger or wildcard permission	Require narrowing	Demonstrated need and runtime approval control
Low-severity hygiene issue	Warn with deadline	Owner and remediation ticket
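The gate table can be expressed as a small policy function. This is a sketch under stated assumptions: the `Finding` fields and the `kind` labels are hypothetical and would need to be mapped from whatever your pinned scanner version actually emits, and "require narrowing" is treated here as a block until the trigger or permission is narrowed.

```python
from dataclasses import dataclass

BLOCK, WARN, PASS = "block", "warn", "pass"


@dataclass(frozen=True)
class Finding:
    rule_id: str
    severity: str          # "critical", "high", "medium" or "low"
    kind: str              # hypothetical label, e.g. "taint", "network", "hygiene"
    destination: str = ""  # network destination, if any


def gate(finding: Finding, allowlisted_domains: frozenset[str]) -> str:
    """Apply the suggested gate for one finding."""
    if finding.severity in {"critical", "high"}:
        return BLOCK
    if finding.kind == "network" and finding.destination not in allowlisted_domains:
        return BLOCK
    if finding.kind in {"unpinned_dependency", "broad_trigger"}:
        return BLOCK
    return WARN  # remaining findings are tracked with an owner and deadline


def admission_decision(findings: list[Finding],
                       allowlisted_domains: frozenset[str]) -> str:
    """Block if any finding blocks, warn if any remain, pass if there are none."""
    decisions = [gate(finding, allowlisted_domains) for finding in findings]
    if BLOCK in decisions:
        return BLOCK
    return WARN if decisions else PASS
```

Keeping the policy in code rather than in a reviewer's head makes exceptions visible in version control alongside the evidence that justified them.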

How to evaluate detection quality

Create a labeled corpus that resembles the skills your organization installs. Include clean skills that legitimately use subprocesses or network APIs, intentionally vulnerable fixtures for every relevant rule family, multilingual instructions, encoded text, renamed tools and chained behaviors. Split the corpus so tuning examples do not become the only test set.

Metric	Question answered	Why it matters
Recall by severity/category	How many known dangerous cases were found?	A high overall rate can hide zero coverage for credential exfiltration
Precision	How many alerts were actionable?	Low precision trains developers to bypass the gate
Time to disposition	How long does human triage take?	Measures actual operational cost
Incremental scan time	Can the check run on every change?	Slow gates migrate to infrequent audits
Semantic stability	Do repeated/model-version runs agree?	Exposes nondeterministic policy outcomes
Escape rate	What risky behavior appeared at runtime after passing?	Tests the entire admission system, not just the scanner
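Per-category recall and precision are simple to compute once the corpus is labeled. The sketch below assumes each test case has an ID, that dangerous cases carry a category label, and that the scanner's output has been reduced to the set of case IDs it flagged; all names and the sample categories are illustrative.

```python
from collections import defaultdict


def recall_by_category(dangerous: dict[str, str], flagged: set[str]) -> dict[str, float]:
    """dangerous maps case ID to category; flagged is the set of case IDs the scanner reported."""
    totals: dict[str, int] = defaultdict(int)
    found: dict[str, int] = defaultdict(int)
    for case_id, category in dangerous.items():
        totals[category] += 1
        if case_id in flagged:
            found[category] += 1
    return {category: found[category] / totals[category] for category in totals}


def precision(dangerous: dict[str, str], flagged: set[str]) -> float:
    """Share of flagged cases that are truly dangerous; 0.0 when nothing was flagged."""
    if not flagged:
        return 0.0
    return len(flagged & dangerous.keys()) / len(flagged)
```

With two prompt-injection cases caught and one credential-exfiltration case missed, overall recall is two thirds, yet the per-category view shows zero recall for exfiltration, which is the gap the aggregate number hides.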

SARIF and reporting

SARIF output can be uploaded to GitHub code scanning so findings appear at source locations and participate in the repository’s normal security workflow. JSON is better suited to policy engines and longitudinal metrics, Markdown suits a human review packet, and terminal output is convenient locally. Confirm exit-code behavior in your pinned version and implement the gate explicitly rather than assuming that creating a report fails a build.
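An explicit gate over a SARIF file might look like the sketch below. It relies only on standard SARIF 2.1.0 fields (`runs`, `results`, `level`, `ruleId`, `locations`); which severities SkillSpector maps to which SARIF level is an assumption here, so inspect a real report from your pinned version before choosing the blocking levels.

```python
import json
from pathlib import Path

# Assumption: blocking findings are reported at SARIF level "error".
BLOCKING_LEVELS = frozenset({"error"})


def blocking_results(sarif: dict, blocking_levels: frozenset[str] = BLOCKING_LEVELS) -> list[dict]:
    """Collect rule ID and location for every result at a blocking level."""
    hits = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            level = result.get("level", "warning")  # treat a missing level as "warning"
            if level not in blocking_levels:
                continue
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            hits.append({
                "rule_id": result.get("ruleId", ""),
                "level": level,
                "uri": physical.get("artifactLocation", {}).get("uri", ""),
                "line": physical.get("region", {}).get("startLine"),
            })
    return hits


def exit_code(sarif: dict) -> int:
    """Return 1 when any blocking result exists, so CI fails explicitly."""
    return 1 if blocking_results(sarif) else 0


def exit_code_for_file(path: Path) -> int:
    """Load a SARIF report from disk and compute the gate's exit code."""
    return exit_code(json.loads(path.read_text(encoding="utf-8")))
```

A CI step can call `sys.exit(exit_code_for_file(Path("report.sarif")))`, where the report path is whatever your pipeline writes, so the build result depends on the policy rather than on the scanner's own exit status.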

Reports can themselves contain sensitive snippets, internal paths and suspicious instructions. Limit artifact retention and access. Do not publish a full finding containing a token or exploit payload in a public pull request.
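Before a snippet is posted anywhere public, it can be passed through a redaction step. The patterns below are illustrative shapes only and will miss many real secret formats; they are assumptions for this sketch, not a list taken from SkillSpector.

```python
import re

# Illustrative patterns only; real secret formats vary widely.
SECRET_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),   # shape of a GitHub classic personal access token
    re.compile(r"AKIA[0-9A-Z]{16}"),      # shape of an AWS access key ID
    re.compile(r"(?i)(api[_-]?key|token|secret)\s*[:=]\s*\S+"),  # generic key=value pairs
]


def redact(snippet: str, placeholder: str = "[REDACTED]") -> str:
    """Replace anything matching a known secret shape with a placeholder."""
    for pattern in SECRET_PATTERNS:
        snippet = pattern.sub(placeholder, snippet)
    return snippet
```

Redaction reduces accidental disclosure but does not replace access control on the full report, since exploit payloads and internal paths are not covered by secret patterns.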

Alternatives and complementary controls

Option	Best use	Relationship to SkillSpector
Cisco AI Defense skill-scanner	Another agent-skill-focused multi-engine scanner	Compare coverage and false positives; independent engines can diversify detection
Semgrep/CodeQL	Deep language-specific static analysis and custom organization rules	Complement agent-instruction and MCP-specific checks
OSV-Scanner/Dependabot	Dependency vulnerability and update workflows	Broader dependency operations than one embedded lookup
YARA/antimalware sandbox	Known malware and dynamic behavior	Useful for payloads outside Markdown-oriented analysis
Manual allowlist + signed catalog	High-control environments with few approved skills	Reduces supply-chain variability; still scan each signed release
Runtime sandbox/policy engine	Enforcing file, network, secret and approval boundaries	Essential compensation because pre-install scanning cannot observe everything

FAQ

Does a low score mean a skill is safe?

No. It means the pinned scanner version did not accumulate many recognized findings. Blind spots, runtime behavior and supply-chain substitution remain possible.

Does SkillSpector execute the skill?

No. It performs static analysis only and never runs the skill, which is a documented limitation. Use a separate instrumented sandbox for runtime testing.

Can it scan a remote Git repository?

Yes, along with URLs, zip archives, directories and individual files. For reproducibility, scan an immutable commit or hash rather than a mutable branch URL.

Is LLM analysis required?

No. The `--no-llm` flag runs static analysis only. Semantic analysis is optional and can use managed providers or a local OpenAI-compatible endpoint.

What happens without OSV network access?

The scanner falls back to a small built-in list, so dependency-vulnerability coverage is less current and comprehensive. Record that degraded state in the report.

Who should own the final decision?

A designated security or platform reviewer, with input from the skill’s functional owner. The scanner provides findings; it should not grant installation authority.

Sources and verification

Last reviewed July 26, 2026. Pattern counts, providers, default models and CLI behavior may change. Pin the scanner version and verify the current repository before adopting a gate.
