Skip to content
AI & Security · March 23, 2026 · 8 min read

AI Skills Are the New Phishing — Here's How to Protect Yourself

Or: how to use Claude to close the most dangerous gap in your AI workflow

Illustration for article: AI Skills Are the New Phishing — Here's How to Protect Yourself

What is an AI skill?

Claude has knowledge but remains in conversation by default. A skill changes this dynamic by providing tools — it enables action rather than (opent in nieuw venster) mere discussion.

Claude calls this an MCP server (Model Context Protocol). Cursor or VS Code use the term "extension." ChatGPT previously called them "plugins." Terminology varies, but the principle remains consistent: a small piece of software that you install and which your AI assistant can then use.

A practical example: you install a Notion skill, allowing Claude to create new pages in project folders with today's notes. Instead of just having a conversation, Claude takes action: opening Notion, creating pages, inserting text. A Google Calendar skill sets appointments during diary discussions. A Slack skill enables the retrieval of messages, summarisation and dispatch.

The advantage is substantial — skills turn chatbots into genuine digital colleagues. This explains their popularity. GitHub and specialised marketplaces overflow with free skills for virtually any conceivable service.

The problem: determining whether skills are safe.

This is where things get uncomfortable

A skill is simply code. Code written by unknown individuals, published on platforms with minimal oversight, and executed on your computer — with access to files, API keys, email and cloud storage.

The implicit trust created when installing skills is enormous. Their risk profile differs fundamentally from traditional software. Traditional malware must convince you to click something. A poisoned skill requires no convincing. The attack conceals itself in tool descriptions — text invisible to users but faithfully executed by AI. Security researchers call this "tool poisoning," and it escapes human code review.

Three documented incidents

The stolen emails

In September 2025, security firm Koi Security discovered a malicious MCP server on npm: postmark-mcp. On the surface, it imitated the legitimate Postmark email server. The difference: one line of code that sent a silent BCC of every outgoing email to an attacker's address. Password resets, invoices, internal memos, quotes — everything. Downloaded 1,643 times before removal, the package affected approximately 300 organisations.

The $500,000 stolen via a fake extension

A blockchain developer loses half a million in cryptocurrency. He searches Open VSX Marketplace for a Solidity extension for smart contract development. The extension looked legitimate — but its ranking was thanks to a simple trick: the fake version had been updated more recently than the genuine package. The fake extension silently installed remote-control software that emptied cryptocurrency wallets.

1.5 million infected VS Code installations

In early 2026, security researchers discovered two VS Code extensions posing as Chinese ChatGPT integrations, with 1.34 million and 150,000 installations respectively. Both contained identical spyware: they sent the full contents of every opened file to external servers, and logged all code changes. The code functioned exactly as advertised — and much more besides.

The scale of the problem

These are not isolated incidents.

Security firm Wiz analysed Forbes AI 50 companies and found that 65% had demonstrably leaked API keys and other sensitive data via GitHub. Equixly tested hundreds of publicly available MCP servers and concluded that 43% contained a command injection vulnerability.

OWASP ranks prompt injection first in their 2025 LLM Top 10, explicitly stating that no adequate prevention method exists — you can mitigate the risk, but not eliminate it entirely.

The picture is clear: the AI tooling ecosystem is growing faster than it is being secured.

What can you do? Option one: have Claude review the code

An obvious first step involves reviewing code before it is downloaded. Visit the GitHub page, copy the source code, paste it into Claude and ask targeted questions.

Ask Claude: "Analyse this code. What external connections does this code make? Which files or environment variables are read? Are there any hidden network requests?"

This is better than blind installation, but carries limitations: you are still trusting code from an unknown author, and thorough analysis sometimes misses subtle attack vectors — especially with dynamically loaded malware or post-update activation.

The real solution: have Claude recreate the skill

This is the heart of this article.

The strongest protection against poisoned skills is never to download them. Instead, ask Claude (opent in nieuw venster) to rebuild the functionality. Surprisingly practical — most skills remain conceptually simple. They connect one service to Claude. They read something from somewhere. They write something to somewhere.

Say you want a skill that lets Claude access a Notion database. Instead of downloading random notion-mcp packages from unknown GitHub developers, ask Claude: "Build a simple MCP server that connects to the Notion API. I want to be able to ask it to create a page and retrieve existing pages. Use the official Notion API documentation."

Claude builds the skill. You see every line. You understand what it does — or ask for an explanation until you do. No hidden dependencies. No code updates that install future malware.

With a downloaded skill, you trust the integrity of an unknown party and the controls of a platform that has been proven inadequate. With a skill built by Claude, the code is yours, transparent, and based on official API documentation.

"But I can't programme"

Precisely the point.

You don't need to code. Simply describe what you want. Claude translates descriptions into working code. If something fails, describe the error message — Claude resolves it.

Vibe coding — building with AI without formal programming knowledge — is becoming increasingly accessible. The skills required are not technical. They involve describing intentions clearly, remaining sceptical about downloads, and staying curious about mechanisms.

The irony: the AI tools that attackers seek to compromise are exactly the same tools you can use to protect yourself against those attacks.

Practical: how do you evaluate an AI skill?

When you discover interesting skills online:

1. Check the source. Is the skill available through official channels — Anthropic's MCP server list, your tool's extension marketplace, or a known organisation?

2. Check popularity. High user numbers and active maintenance suggest lower risk.

3. Unsure? Copy the source code and ask Claude for a security assessment.

4. Prefer to be safe? Describe the desired functionality to Claude and have him build it. This is always safer.

The only exception: when skills are too complex to rebuild and you fully trust the source. But that situation is rarer than you think.

The broader lesson

We are entering an era where AI skills become attack vectors comparable to email attachments in the 1990s. Back then, everyone learned: avoid opening attachments from unknown senders. Now: do not download AI skills from unknown sources.

One essential difference. Previously, alternatives did not exist. Now they do: you can ask Claude to build the functionality itself. Previous generations did not have this option.

The best skill is the skill you understand. And the fastest way to understand a skill is to have it built yourself.


Do you know someone who installs skills without thinking? Pass this on. Or respond: have you ever installed a questionable extension?