Stonefish-Labs/mcd-taxonomy

Malicious Code Detection Taxonomy

A structured, language-agnostic framework for detecting malicious behaviors in source code and compiled binaries. Every category applies equally to Python, JavaScript, Java, C, Go, Rust, shell scripts, compiled binaries, and any other artifact that can express computation.

Architecture

The taxonomy operates on two primary layers and one supporting layer:

  • Points of Investigation (POIs) — Atomic, independently detectable indicators that something warrants closer inspection. These are evidence, not verdicts.
  • Behavioral Patterns — Named compositions of POIs that, taken together, suggest a specific malicious intent. These are hypotheses built from evidence.
  • Contextual Signals — Ecosystem and metadata observations that modify confidence in findings from the first two layers. These are not detectable within the code itself.
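One way to picture the three layers is as a small data model. The sketch below is illustrative only: the class names, fields, the additive confidence adjustment, and the NETW plus EXEC composition used for BP-DROPPER are assumptions made for the example, not definitions taken from the taxonomy.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class POI:
    """An atomic indicator: evidence, not a verdict."""
    category: str   # four-letter category ID, e.g. "NETW"
    location: str   # hypothetical "file:line" locator
    note: str = ""


@dataclass(frozen=True)
class BehavioralPattern:
    """A named hypothesis built from a set of POI categories."""
    pattern_id: str
    required: frozenset  # categories that must all be present (assumed rule shape)


@dataclass(frozen=True)
class ContextualSignal:
    """An ecosystem or metadata observation that adjusts confidence."""
    name: str
    confidence_delta: float  # hypothetical additive adjustment


@dataclass
class Finding:
    pattern: BehavioralPattern
    evidence: list
    signals: list = field(default_factory=list)
    base_confidence: float = 0.5  # arbitrary starting point for this sketch

    def confidence(self) -> float:
        # Contextual signals shift confidence; the result is clamped to [0, 1].
        total = self.base_confidence + sum(s.confidence_delta for s in self.signals)
        return max(0.0, min(1.0, total))


# Assumed composition for illustration, not the taxonomy's BP-DROPPER definition.
dropper = BehavioralPattern("BP-DROPPER", frozenset({"NETW", "EXEC"}))
finding = Finding(
    pattern=dropper,
    evidence=[POI("NETW", "setup.py:12"), POI("EXEC", "setup.py:14")],
    signals=[ContextualSignal("package published yesterday", 0.25)],
)
print(finding.confidence())  # 0.75
```

Keeping the signals separate from the evidence mirrors the layering: POIs and patterns come from the artifact, while signals come from outside it and only adjust how much weight the finding carries.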

Core Principle

No single POI is proof of malice. A hardcoded URL is not malicious. A base64 decode is not malicious. But a hardcoded URL combined with base64-encoded shell commands executed at install time in a package that appeared yesterday — that demands investigation.

The power of this taxonomy is in composition.
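The composition idea can be sketched as a subset check over observed POI categories. Everything here is a hypothetical illustration: the rule contents, the function name `triage`, and the seven-day threshold are assumptions loosely based on the example above, not the taxonomy's actual BP-SUPPLY definition.

```python
# Illustrative rule only: hardcoded URL (ARTF), encoded commands (OBFS),
# execution (EXEC), and an install-time hook (PKGM), as in the example above.
SUPPLY_CHAIN_RULE = frozenset({"ARTF", "OBFS", "EXEC", "PKGM"})

NEW_PACKAGE_DAYS = 7  # hypothetical threshold for "appeared recently"


def triage(observed: set, package_age_days: int) -> str:
    """Report a match only when every category in the rule co-occurs."""
    if not SUPPLY_CHAIN_RULE <= observed:
        return "no pattern match"
    if package_age_days <= NEW_PACKAGE_DAYS:
        # The contextual signal raises confidence; it does not create the match.
        return "pattern match, raised confidence"
    return "pattern match"


everything = {"ARTF", "OBFS", "EXEC", "PKGM"}
print(triage({"ARTF"}, 1))        # no pattern match
print(triage(everything, 1))      # pattern match, raised confidence
print(triage(everything, 900))    # pattern match
```

A lone hardcoded URL produces no match regardless of package age, which is the point of the principle: individual indicators stay as evidence until they compose.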

Quick Reference

POI Categories (16)

| ID | Name | ID | Name |
|---|---|---|---|
| ARTF | Hardcoded Artifacts | PRST | Persistence |
| NETW | Network Communication | PRIV | Privilege Escalation |
| FSYS | Filesystem Operations | RECN | System Reconnaissance |
| EXEC | Code Execution | TIME | Temporal Operations |
| LOAD | Dynamic Code Loading | PKGM | Package & Build Manipulation |
| OBFS | Obfuscation | CRPT | Cryptographic Operations |
| EVSN | Evasion & Anti-Analysis | AITM | AI-Targeted Manipulation |
| CRED | Credential & Secret Access | RSRC | Resource Manipulation |

Behavioral Patterns (15)

| ID | Name | ID | Name |
|---|---|---|---|
| BP-SUPPLY | Supply Chain Payload | BP-MINER | Resource Hijacking |
| BP-CREDTHEFT | Credential Theft | BP-ROOTKIT | Rootkit / Self-Modification |
| BP-BACKDOOR | Backdoor | BP-WORM | Worm / Propagation |
| BP-DROPPER | Dropper / Downloader | BP-TROJAN | Trojan / Disguised Payload |
| BP-EXFIL | Data Exfiltration | BP-AGENTMANIP | Agent Manipulation |
| BP-RANSOM | Ransomware | BP-TYPOSQUAT | Typosquat / Dependency Confusion |
| BP-TIMEBOMB | Logic Bomb / Time Bomb | BP-LATERAL | Lateral Movement |
| BP-MITM | Traffic Interception | | |

On Binary Analysis

This taxonomy treats compiled binaries as first-class targets. Decompiled output, disassembly, import tables, string dumps, and behavioral traces are all valid surfaces for POI detection. Where a POI manifests differently in source versus binary form, the description calls this out. Source-to-binary drift — where a compiled artifact contains behaviors not present in the published source — is itself a high-value signal addressed in Contextual Signals.
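As a rough illustration of using a string dump as a detection surface, the sketch below pulls printable strings out of a byte blob, extracts URLs as candidate ARTF indicators, and reports URLs that appear in the binary but not in the published source. The function names, the URL regex, and the sample bytes are assumptions for the example; real drift analysis would compare far more than URLs.

```python
import re

# Runs of at least four printable ASCII bytes, similar in spirit to `strings`.
PRINTABLE_RE = re.compile(rb"[\x20-\x7e]{4,}")
# Deliberately simple URL matcher; an assumption, not a complete URL grammar.
URL_RE = re.compile(rb"https?://[A-Za-z0-9._~:/?#@!$&*+,;=%-]+")


def urls_in(blob: bytes) -> set:
    """Collect URLs found inside the printable strings of a blob."""
    found = set()
    for text in PRINTABLE_RE.findall(blob):
        found.update(m.decode("ascii") for m in URL_RE.findall(text))
    return found


def url_drift(source_blob: bytes, binary_blob: bytes) -> set:
    """URLs present in the binary but absent from the published source."""
    return urls_in(binary_blob) - urls_in(source_blob)


source = b"GET https://example.com/api\n"
binary = (
    b"\x00\x01https://example.com/api\x00"
    b"\x7fELF\x00http://203.0.113.5/p.sh\x00"
)
print(url_drift(source, binary))  # {'http://203.0.113.5/p.sh'}
```

An unexplained URL in the compiled artifact is still only a POI; it becomes more interesting when it composes with other indicators or with contextual signals.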

Investigation and Response

Every finding should be accompanied by structured investigation guidance. The Investigation Framework defines how to move from detection to determination: is this malicious, benign, or inconclusive?

Once a determination is made, the Response Framework defines what to do about it — six tiers from closing a benign finding to activating incident response:

| Tier | Name | Summary |
|---|---|---|
| 0 | Informational — Close | Confirmed benign. Document and close. |
| 1 | Document and Monitor | Ambiguous signal. Watch for code changes that escalate. |
| 2 | Engineering Referral | Security flaw, not malicious. Route to engineering. |
| 3 | Passive Monitoring | Instrument and observe. Track execution and code changes. |
| 4 | Active Monitoring | Real-time alerting. Containment staged and ready. |
| 5 | Immediate Response | Confirmed malicious. Contain, escalate, respond. |
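Tooling that records a determination might encode the tiers as an ordered enumeration. This is a minimal sketch: the tier numbers and names come from the table above, while the identifier spellings and the helper `containment_expected` are hypothetical.

```python
from enum import IntEnum


class ResponseTier(IntEnum):
    """The six response tiers, numbered as in the table above."""
    INFORMATIONAL_CLOSE = 0
    DOCUMENT_AND_MONITOR = 1
    ENGINEERING_REFERRAL = 2
    PASSIVE_MONITORING = 3
    ACTIVE_MONITORING = 4
    IMMEDIATE_RESPONSE = 5


def containment_expected(tier: ResponseTier) -> bool:
    """Tiers 4 and 5 are the ones whose summaries mention containment."""
    return tier >= ResponseTier.ACTIVE_MONITORING


tier = ResponseTier(4)
print(tier.name, containment_expected(tier))  # ACTIVE_MONITORING True
```

Using an integer-backed enum keeps the numeric tier values from the table available for storage and comparison while giving code a readable name for each tier.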

Contributing

This taxonomy is a living document. Contributions are welcome:

  • Open an issue to propose new POI subtypes, behavioral patterns, or contextual signals.
  • Submit examples of real-world malicious code mapped to the taxonomy. Each POI category has an examples/ directory for this purpose.
  • Challenge the model — if a category is too broad, too narrow, or missing a real-world attack pattern, say so.

The goal is a community-validated reference that detection tooling, security teams, and researchers can build against.

Version History

Current version: 2.3

| Version | Changes |
|---|---|
| 2.3 | Added three new POI subtypes: EVSN.SECDISABLE (security control disabling: firewalls, AV exclusions, exploit mitigation weakening), PRIV.ACCOUNT (account/identity manipulation: backdoor accounts, group membership changes), FSYS.HIDDEN (hidden storage mechanisms: NTFS ADS, extended attributes, resource forks). Added BP-MITM behavioral pattern for traffic interception and man-in-the-middle setup. Added Network Destination contextual signal category (jurisdictional risk, bulletproof hosting, recently registered domains, dynamic DNS). Enhanced Source-to-Binary Drift signals with binary security posture (disabled exploit mitigations, unsigned/weakly-signed binaries, debug information leakage). |
| 2.2 | Incorporated lessons from the Axios npm supply chain compromise (March 2026). Added EVSN.MASQ, PKGM.PHANTOM, expanded EVSN.FORENSIC and EXEC.PROC, added contextual signals for provenance attestation downgrade and pre-staged clean versions. |
| 2.1 | Incorporated lessons from the TeamPCP/LiteLLM supply chain campaign (March 2026). Added RECN.PROCMEM, NETW.DECENTRAL, EVSN.FORENSIC, OBFS.FILELESS, BP-LATERAL, and execution context signals. |
| 2.0 | Initial public release. 16 POI categories, 14 behavioral patterns, contextual signals, and investigation guidance framework. |

License

This work is licensed under CC BY 4.0. You are free to share and adapt this material for any purpose, with attribution.
