Merged
38 commits
69d99a3
Perform enrichment on entire IOC set, not per file
malx-labs Apr 10, 2026
a1af5d3
Implement extended PE metadata analysis: full import details, exports…
malx-labs Apr 11, 2026
5201e44
Add additional pe_parser capability to meet v0.6.0 feature requiremen…
malx-labs Apr 11, 2026
7cdade5
Make pe_parser more defensive to malformed PE internals
malx-labs Apr 11, 2026
440e418
Add tests to cover analyse_extended with defensive machine and subsys…
malx-labs Apr 11, 2026
ae7c130
Write tests for extended analysis functionality - coverage remains full
malx-labs Apr 12, 2026
68fe639
Structural refactor of pe_parser for better maintainability. Hardened…
malx-labs Apr 12, 2026
ea23cf2
Dynamic lang code decoding
malx-labs Apr 12, 2026
b86714b
Added v0.6.0 security considerations to threat model
malx-labs Apr 12, 2026
b17c99c
Alter security consideration and relationship to v0.7.0 section inden…
malx-labs Apr 12, 2026
c7dab8a
PE pipeline documentation initial commit
malx-labs Apr 12, 2026
a1b0e0e
Pe pipeline documentation 2nd draft
malx-labs Apr 13, 2026
37f19c3
Pe pipeline 3rd draft
malx-labs Apr 13, 2026
11063ce
Remove line breaks from mermaid diagram
malx-labs Apr 13, 2026
ac7f858
Remove parentheses from mermaid diagram
malx-labs Apr 13, 2026
d08f10e
Remove periods from mermaid diagram
malx-labs Apr 13, 2026
4b13901
fix mermaid diagram
malx-labs Apr 13, 2026
f014c53
fix mermaid diagram #2
malx-labs Apr 13, 2026
93aa138
fix mermaid diagram #3
malx-labs Apr 13, 2026
c3af1f6
fix mermaid diagram #4
malx-labs Apr 13, 2026
539ce72
fix mermaid diagram #5
malx-labs Apr 13, 2026
cad3b8b
Add in extended summary into pe pipeline mermaid
malx-labs Apr 13, 2026
4aa72c7
Fix typo
malx-labs Apr 13, 2026
97e119c
Core metadata remove det
malx-labs Apr 13, 2026
933e4b1
Fix typo
malx-labs Apr 13, 2026
2cef277
Fix typo #2
malx-labs Apr 13, 2026
2c092cb
Redesign pe pipeline diagram
malx-labs Apr 13, 2026
e0eeeef
Add accompanying text to the pe pipeline diagram
malx-labs Apr 13, 2026
7b1177f
Reinstate old diagram
malx-labs Apr 13, 2026
d072efc
Slight change to control flow
malx-labs Apr 13, 2026
b754192
Slight change to control flow #2
malx-labs Apr 13, 2026
ccd9ef4
Slight change to control flow #3
malx-labs Apr 13, 2026
d88ef94
Slight change to pe pipeline diagram copy
malx-labs Apr 13, 2026
617d9b9
Final tweaks to PE pipeline document
malx-labs Apr 13, 2026
071315f
Add schema contract test to catch regression in output structure
malx-labs Apr 14, 2026
b226199
Add v0.6.0 schema details to README
malx-labs Apr 14, 2026
7577e82
Update pypi readme, and performance badges in the github readme
malx-labs Apr 14, 2026
856922c
Update the version number in pyproject.toml
malx-labs Apr 14, 2026
1 change: 1 addition & 0 deletions .gitignore
@@ -47,6 +47,7 @@ pip-wheel-metadata/

!tests/integration/fixtures/bin/
!tests/integration/fixtures/bin/*.exe
!tests/integration/fixtures/bin/analysis/*.exe


*.dll
32 changes: 22 additions & 10 deletions Makefile
@@ -17,6 +17,7 @@ INTEGRATION_DIR := tests/integration
FUZZ_DIR := tests/fuzz
ROBUSTNESS_DIR := tests/robustness
PERFORMANCE_DIR := tests/performance
CONTRACT_DIR := tests/contract

.PHONY: activate
activate:
@@ -29,16 +30,19 @@ activate:
help:
@echo ""
@echo "Available commands:"
@echo " make venv Create virtual environment (only once)"
@echo " make install Install package in editable mode"
@echo " make dev Install dev tools (pytest, ruff, black)"
@echo " make test Run test suite"
@echo " make lint Run ruff linter"
@echo " make format Auto-format with black"
@echo " make run Run CLI tool"
@echo " make clean Remove build artifacts"
@echo " make dist Build wheel + sdist"
@echo " make reset Delete venv and reinstall everything"
@echo " make venv Create virtual environment (only once)"
@echo " make install Install package in editable mode"
@echo " make dev Install dev tools (pytest, ruff, black, coverage, pip-audit, bandit, pytest-timeout)"
@echo " make test Run unit test suite only"
@echo " make test-[option] Run test suite (option=contract, fuzz, integration, performance, robustness, coverage)"
@echo " make security Run security scans (pip-audit, bandit)"
@echo " make lint Run ruff linter"
@echo " make format Auto-format with black"
@echo " make run Run CLI tool"
@echo " make clean Remove build artifacts"
@echo " make clean-all Remove build artifacts and virtual environment"
@echo " make dist Build wheel + sdist"
@echo " make reset Delete venv and reinstall everything"
@echo ""


@@ -122,6 +126,14 @@ test-coverage: dev
$(PYTHON) -m coverage run -m pytest
$(PYTHON) -m coverage report -m

# ----------------------------------------
# Contract tests only
# ----------------------------------------
.PHONY: test-contract
test-contract: dev
@echo "Running contract tests..."
$(PYTEST) -m contract $(CONTRACT_DIR)

# ----------------------------------------
# Static analysis and SCA
# ----------------------------------------
26 changes: 21 additions & 5 deletions README-pypi.md
@@ -9,7 +9,7 @@ This is the **official IOCX engine** for static IOC extraction and PE analysis.
- **Organisation:** https://github.com/iocx-dev
- **Website:** https://iocx.dev

IOCX is **not** an OSINT reputation checker, HTML report generator, or IP/domain scoring tool.
IOCX is **not** an OSINT reputation checker, HTML report generator, or IP/domain scoring tool.
It is a **static analysis engine** focused on extracting Indicators of Compromise (IOCs) from binaries and text.

---
@@ -19,6 +19,22 @@ It is a **static analysis engine** focused on extracting Indicators of Compromis
IOCX is a fast, safe, deterministic engine for extracting Indicators of Compromise (IOCs) from binaries, text, and logs.
It performs **pure static analysis** — no execution, no sandboxing, no risk.

## What's new in v0.6.0

- Stable JSON schema across all analysis levels
- Deterministic PE metadata (headers, TLS, optional header, signatures)
- Guaranteed IOC categories (always present, empty arrays when no matches)
- Formalised analysis levels:
  - core behaviour → no analysis block
  - basic → section layout + entropy
  - deep → adds obfuscation heuristics
  - full → extended metadata summaries
- Schema‑contract tests to prevent drift across releases

## Schema stability

IOCX guarantees a stable JSON schema, not a guaranteed ordering of keys within objects. JSON objects are unordered by definition, so consumers should rely on field presence and structure rather than positional ordering.

## Features

- Extracts IOCs from Windows PE files and raw text
@@ -27,6 +43,7 @@ It performs **pure static analysis** — no execution, no sandboxing, no risk.
- Deterministic output suitable for automation
- Minimal dependencies and safe for enterprise environments
- CLI and Python API
- Binary-aware static analysis with multi-level depth

## Installation

@@ -58,8 +75,8 @@ print(results)

- Static‑only design (never executes untrusted code)
- Binary‑aware IOC extraction
- Stable JSON schema
- High performance (~200 MB/s throughput)
- Stable, predictable JSON schema
- High performance: ~25-30 MB/s end-to-end, with individual detectors reaching 150-450 MB/s throughput
- Ideal for DFIR, SOC automation, CI/CD, and threat‑intel pipelines

## Project identity & naming
@@ -81,8 +98,7 @@ Community tools that integrate with IOCX are encouraged to use names like:

## Extensibility

IOCX includes a lightweight plugin system that allows you to add custom detectors, parsers, and transformation rules.
Plugins can emit new IOC categories, override built-in behaviour, or integrate IOCX into larger analysis pipelines.
IOCX includes a lightweight plugin system for custom detectors, parsers, and transformation rules. Plugins can emit new IOC categories, override built‑in behaviour, or integrate IOCX into larger analysis pipelines.

See the documentation for details on writing detectors and plugins.

94 changes: 90 additions & 4 deletions README.md
@@ -3,17 +3,19 @@
<img src="https://img.shields.io/pypi/v/iocx?logo=pypi&logoColor=white" alt="PyPI Version">
</a>
<img src="https://img.shields.io/badge/coverage-100%25-brightgreen" alt="Coverage">
<img src="https://img.shields.io/badge/tests-576_passed-brightgreen" alt="Tests">
<img src="https://img.shields.io/badge/tests-633_passed-brightgreen" alt="Tests">
<img src="https://img.shields.io/badge/python-3.12-blue" alt="Python Version">
<a href="https://github.com/iocx-dev/iocx/blob/main/LICENSE">
<img src="https://img.shields.io/github/license/iocx-dev/iocx" alt="License">
</a>
<a href="https://github.com/iocx-dev/iocx/actions">
<img src="https://img.shields.io/github/actions/workflow/status/iocx-dev/iocx/ci.yml?label=build" alt="Build Status">
</a>
<img src="https://img.shields.io/badge/v0.2.0_performance-1MB_in_0.0053s-brightgreen" alt="Performance">
<img src="https://img.shields.io/badge/v0.2.0_throughput-~200MB%2Fs-brightgreen" alt="Throughput">
<img src="https://img.shields.io/badge/v0.2.0_pathological_IPv6-0.0005s-brightgreen" alt="Pathological IPv6 Timing">
<img src="https://img.shields.io/badge/v0.6.0_engine_1MB-0.0358s-brightgreen" alt="Engine Performance">
<img src="https://img.shields.io/badge/v0.6.0_engine_throughput-~28MB%2Fs-brightgreen" alt="Engine Throughput">
<img src="https://img.shields.io/badge/v0.6.0_detector_peak-150--450MB%2Fs-blue" alt="Detector Peak Throughput">
<img src="https://img.shields.io/badge/v0.6.0_pathological_IPv6-0.0004s-brightgreen" alt="Pathological IPv6 Timing">
<img src="https://img.shields.io/badge/v0.6.0_perf-28MB/s_engine_|_450MB/s_peak_|_0.0004s_path-brightgreen" alt="Performance Cluster">
</p>

# Official IOCX Project
@@ -93,6 +95,23 @@ IOCX is **static extraction only**, by design.

## Version Highlights

### v0.6.0 — Stable Output Schema, Deterministic PE Metadata, Contract‑Safe Analysis Levels

- Introduced a fully stable JSON schema across all analysis levels
- Added strict structural guarantees for `iocs`, `metadata`, and `analysis` blocks
- Normalised PE metadata fields for deterministic output (headers, TLS, optional header, signatures)
- Ensured **all IOC categories always exist** (empty arrays when no matches)
- Formalised analysis‑level behaviour:
  - core behaviour → no analysis block
  - basic → section layout + entropy
  - deep → adds obfuscation heuristics
  - full → adds extended metadata summaries
- Added **snapshot‑contract tests** to prevent schema drift across releases
- Improved PE parser consistency for imports, resources, and section metadata
- Strengthened safety guarantees for CI/CD and large‑scale automation pipelines

This release establishes the long‑term schema contract that downstream tools can rely on.

### v0.5.0 — Analysis Levels, PE Section Analysis, Obfuscation Hints

- New analysis‑level system: basic, deep (default), and full (future‑ready)
@@ -330,6 +349,73 @@ If you are building something that integrates with IOCX and want guidance on nam

Static analysis ensures **safety**, **determinism**, and **CI‑friendly operation**. No sandboxing, no execution, and no risk of triggering malware behaviour.

## Output Schema (v0.6.0)

IOCX v0.6.0 defines a stable, deterministic JSON schema designed for DFIR, SOC automation, and threat‑intel pipelines. The schema is intentionally simple, predictable, and safe for long‑term integrations.

The top‑level structure contains three blocks:

- `iocs` — extracted indicators
- `metadata` — structural information about the artifact
- `analysis` — optional deeper inspection depending on analysis level

This structure is identical across all input types, with PE‑specific fields populated only when applicable.

### IOC Categories

The `iocs` block always contains the same keys, regardless of analysis level:

- `urls`
- `domains`
- `ips`
- `hashes`
- `emails`
- `filepaths`
- `base64`
- `crypto.btc`
- `crypto.eth`

Each category is always an array. Empty categories are returned as empty arrays to ensure predictable downstream parsing.
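
As a rough illustration of this guarantee, a consumer could verify the `iocs` block along the lines below. This is a sketch, not part of IOCX: the helper `missing_ioc_categories` and the sample data are hypothetical, and it assumes that `crypto.btc` and `crypto.eth` are nested under a `crypto` object. If they are flat dotted keys instead, the lookup would need adjusting.

```python
from typing import Any

# Category names taken from the list above. Treating the two crypto entries
# as nested under a "crypto" object is an assumption, not confirmed here.
EXPECTED_IOC_CATEGORIES = (
    "urls", "domains", "ips", "hashes", "emails",
    "filepaths", "base64", "crypto.btc", "crypto.eth",
)


def _lookup(block: dict[str, Any], dotted: str) -> Any:
    """Resolve a dotted name such as 'crypto.btc' inside nested dicts."""
    current: Any = block
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def missing_ioc_categories(iocs: dict[str, Any]) -> list[str]:
    """Return the categories that are absent or not arrays (hypothetical helper)."""
    return [
        name for name in EXPECTED_IOC_CATEGORIES
        if not isinstance(_lookup(iocs, name), list)
    ]


# Hand-written sample shaped like the documented block, not real IOCX output.
sample_iocs = {
    "urls": ["http://example.test/a"], "domains": [], "ips": [], "hashes": [],
    "emails": [], "filepaths": [], "base64": [],
    "crypto": {"btc": [], "eth": []},
}

print(missing_ioc_categories(sample_iocs))
```

An empty list from the helper means every documented category is present and is an array, so downstream code can iterate over each category without guarding against missing keys.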

### Metadata Categories

The metadata block contains structural information about the file. For PE files, this includes:

- Imports and import details
- Sections
- Resources and resource strings
- TLS directory
- Header and optional header
- Rich header
- Signatures

These fields are always present, even when empty. Metadata is **independent of analysis level** and is always returned in full.

### Analysis Levels

The `analysis` block is the only part of the schema that changes based on the selected analysis level.

- **basic** — section layout + entropy
- **deep** — adds obfuscation heuristics
- **full** — adds extended metadata summaries

This tiered design allows users to trade off performance vs. depth without changing their downstream parsing logic.
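
One way a consumer might keep a single code path across levels is sketched below. The function names and both sample results are hypothetical, and the keys inside `analysis` (`sections`, `obfuscation`) are placeholders rather than documented field names. The sketch also treats a missing `analysis` key and a null value the same way, since the text above does not say which form is used when no analysis block is produced.

```python
from typing import Any


def analysis_keys(result: dict[str, Any]) -> list[str]:
    """Return the sorted top-level keys of the optional analysis block."""
    analysis = result.get("analysis")
    if not isinstance(analysis, dict):
        # No analysis block (absent or null): nothing to report.
        return []
    return sorted(analysis)


def extract_urls(result: dict[str, Any]) -> list[str]:
    """Read IOCs the same way regardless of the analysis level."""
    return list(result["iocs"]["urls"])


# Hand-written samples; the inner analysis keys are placeholders.
core_result = {"iocs": {"urls": ["http://example.test"]}, "metadata": {}}
deep_result = {
    "iocs": {"urls": ["http://example.test"]},
    "metadata": {},
    "analysis": {"sections": [], "obfuscation": []},
}

for result in (core_result, deep_result):
    print(extract_urls(result), analysis_keys(result))
```

The IOC-reading code is identical for both inputs; only the optional `analysis` handling needs to tolerate absence.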

### Deterministic Output

IOCX v0.6.0 guarantees:

- Stable keys
- Stable types
- No volatile values in minimal modes
- Deterministic behaviour across runs and platforms

This makes IOCX safe for SIEM/SOAR ingestion, CI/CD pipelines, and large‑scale batch processing.

### Schema Stability

IOCX guarantees a stable JSON schema, not a guaranteed ordering of keys within objects. JSON objects are defined as unordered maps, so consumers should rely on field presence and structure rather than positional ordering. All fields, types, and structural relationships remain consistent across versions, even if internal key order changes.
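
A minimal sketch of an order-independent structural comparison follows, assuming the consumer only cares about keys and value types. The `shape` helper and both JSON documents are illustrative and are not taken from the IOCX contract tests.

```python
import json
from typing import Any


def shape(value: Any) -> Any:
    """Reduce a JSON value to its structure: nested keys and type names."""
    if isinstance(value, dict):
        return {key: shape(child) for key, child in value.items()}
    if isinstance(value, list):
        # Arrays are summarised as "list"; element contents are ignored.
        return "list"
    return type(value).__name__


# Two hand-written documents with different key order and different values.
first = json.loads(
    '{"iocs": {"urls": ["http://example.test"], "ips": []}, "metadata": {}}'
)
second = json.loads(
    '{"metadata": {}, "iocs": {"ips": ["192.0.2.1"], "urls": []}}'
)

# Python dict equality ignores insertion order, so only structure matters.
print(shape(first) == shape(second))
```

Comparing shapes rather than serialised strings avoids false failures when key order differs between runs or versions, while still flagging a field whose type has changed.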

## Quickstart

### Install