Learning Through Noise: Why Subliminal Learning Works and When It Fails

This repository contains the code used to run controlled split-head teacher/student experiments on MNIST and EMNIST. The implementation is intentionally lightweight: each call to src/run_subliminal.py performs one experiment, writes tidy CSV/JSONL outputs, and exposes the experimental controls through command-line flags. Multi-seed manuscript sweeps are provided as shell scripts under scripts/.

Repository structure

subliminal_learning/
├── README.md                 # This file
├── LICENSE                   # BSD 3-Clause license
├── subliminal-cpu.yml        # Conda environment for CPU execution
├── src/
│   ├── run_subliminal.py     # Command-line entry point; one run per invocation
│   └── subliminal_core.py    # Models, data, training, perturbations, metrics, logging
└── scripts/
    ├── figure2/              # Multi-seed m-sweeps over layer-initialisation controls
    ├── figure3/              # Architecture and EMNIST class-count sweeps
    ├── figure4/              # Shared m/N-sweeps
    ├── figure5/              # Student-head perturbation sweeps
    └── figure6/              # Hidden-dimension and head-freezing sweeps

The scripts/*/output and scripts/*/outputs directories, if present, are generated run outputs or example output stubs. They can be deleted and regenerated by rerunning the corresponding scripts.

Installation

Create and activate the supplied CPU environment:

conda env create -f subliminal-cpu.yml
conda activate subliminal-cpu

The environment specification is:

name: subliminal-cpu
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - numpy=2.2
  - pytorch-cpu=2.8
  - torchvision=0.23

Equivalent mamba commands may also be used. GPU execution is supported by the code through --device cuda, but the included environment and scripts are CPU-oriented.

Quick start

Run a single default MNIST MLP→MLP experiment from the repository root:

python src/run_subliminal.py --device cpu --outdir ./outputs/baseline

This uses the default configuration: seed 42, two hidden layers of width 256, auxiliary dimension m=10, 5 teacher epochs, 5 student epochs, 60 synthetic-noise batches per student epoch, and uniform noise. The first run downloads MNIST into ./MNIST_DATA unless --data-dir is changed.

To inspect the full command-line interface:

python src/run_subliminal.py --help

Core experiment logic

The implementation follows a controlled split-head setup:

A teacher and a student are instantiated with separate feature extractors and two output heads.
class_head outputs classification logits and is used for supervised teacher training.
aux_head outputs auxiliary logits and is used for student distillation on task-unrelated synthetic noise.
The teacher is trained on labeled MNIST or EMNIST with cross-entropy loss.
The student is trained to match the teacher auxiliary logits on synthetic noise using mean-squared error.
The student class head receives no direct supervised gradient during the distillation phase; any change in student classification behaviour arises indirectly, through the trainable shared features.
Initial/final teacher and student weights are snapshotted and compared layer-wise.
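
The gradient-flow statement above can be checked on a toy model. The following is a minimal pure-Python sketch, not the repository's SplitHeadMLP: a single shared ReLU layer feeds two linear heads, and every name in it (features, aux_mse, class_logits, the tiny dimensions) is an illustrative assumption.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def features(W1, x):
    """Shared feature extractor: one linear layer followed by ReLU."""
    return [max(0.0, h) for h in matvec(W1, x)]

def aux_mse(student, teacher, batch):
    """Mean-squared error between student and teacher auxiliary logits."""
    total, count = 0.0, 0
    for x in batch:
        s = matvec(student["aux_head"], features(student["fc1"], x))
        t = matvec(teacher["aux_head"], features(teacher["fc1"], x))
        total += sum((si - ti) ** 2 for si, ti in zip(s, t))
        count += len(s)
    return total / count

def class_logits(model, batch):
    """Classification logits of a model for every input in the batch."""
    return [matvec(model["class_head"], features(model["fc1"], x)) for x in batch]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def rand_model(rng, d_in, d_hid, n_classes, m):
    return {"fc1": rand_matrix(rng, d_hid, d_in),
            "class_head": rand_matrix(rng, n_classes, d_hid),
            "aux_head": rand_matrix(rng, m, d_hid)}

rng = random.Random(0)
d_in, d_hid, n_classes, m = 6, 5, 3, 4
teacher = rand_model(rng, d_in, d_hid, n_classes, m)
student = rand_model(rng, d_in, d_hid, n_classes, m)
# Task-unrelated uniform noise inputs in [-1, 1].
noise = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(8)]

base_loss = aux_mse(student, teacher, noise)

# Replacing the class head leaves the distillation loss unchanged,
# so the loss carries no gradient with respect to class_head.
new_head = dict(student, class_head=rand_matrix(rng, n_classes, d_hid))
loss_after_head_change = aux_mse(new_head, teacher, noise)

# Replacing the shared layer changes both the loss and the class logits.
new_fc1 = dict(student, fc1=rand_matrix(rng, d_hid, d_in))
loss_after_fc1_change = aux_mse(new_fc1, teacher, noise)
logits_before = class_logits(student, noise)
logits_after = class_logits(new_fc1, noise)
```

Because the auxiliary loss depends only on fc1 and aux_head, any effect of distillation on classification has to pass through the shared features.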

src/subliminal_core.py contains the reusable components:

Component	Purpose
`ExperimentConfig`	Central dataclass for all experiment settings.
`SplitHeadMLP`	MLP feature extractor with `class_head` and `aux_head`.
`SplitHeadCNN`	Configurable CNN feature extractor with split heads.
Data utilities	MNIST/EMNIST loading, optional class truncation, deterministic dataloader seeds.
Noise utilities	Uniform, Gaussian, and Perlin-noise sampling.
Training utilities	Teacher cross-entropy training and student auxiliary-MSE distillation.
Perturbation utilities	Layer-wise additive Gaussian perturbations at named experiment timings.
Logging utilities	Per-run, per-layer, perturbation, and config output files.

Output files

Each invocation appends to files in a seed-specific output directory (shown here for the default seed 42):

<outdir>/seed_000042/runs.csv
<outdir>/seed_000042/layer_metrics.csv
<outdir>/seed_000042/perturbations.csv
<outdir>/seed_000042/configs.jsonl

File	Contents
`runs.csv`	One row per experiment, including configuration, seeds, parameter counts, teacher/student accuracies and losses, auxiliary MSE, and selected compact layer-metric aliases.
`layer_metrics.csv`	Per-layer tensor comparisons between teacher/student and initial/final snapshots. Metrics include cosine similarity, norm differences, relative differences, shapes, and status fields for missing or mismatched layers.
`perturbations.csv`	Perturbation diagnostics. Empty unless perturbations are requested.
`configs.jsonl`	Full effective configuration, resolved layer initialisation/trainability configs, architecture descriptions, and derived seed bookkeeping.

With --append-global, aggregate files are also appended under <outdir>:

<outdir>/runs_all_seeds.csv
<outdir>/layer_metrics_all_seeds.csv
<outdir>/perturbations_all_seeds.csv

Use global appends for sequential sweeps. For many concurrent jobs, prefer per-seed outputs unless the filesystem safely handles concurrent appends.
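
When jobs run concurrently and only per-seed outputs are written, the per-seed files can be merged afterwards. The sketch below assumes the six-digit zero-padded directory name suggested by seed_000042 and a header row in each CSV; seed_dir, collect_runs, and the added source_file column are hypothetical helpers, not part of the repository.

```python
import csv
from pathlib import Path

def seed_dir(outdir, seed):
    """Per-seed directory, following the seed_000042 pattern shown above."""
    return Path(outdir) / f"seed_{seed:06d}"

def collect_runs(outdir, filename="runs.csv"):
    """Concatenate the rows of every <outdir>/seed_*/<filename> into a list of dicts."""
    rows = []
    for path in sorted(Path(outdir).glob(f"seed_*/{filename}")):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                # Record where each row came from (illustrative extra column).
                row["source_file"] = str(path)
                rows.append(row)
    return rows
```

Calling collect_runs("./outputs/baseline") would return one dict per recorded run; the same function can read layer_metrics.csv by passing filename="layer_metrics.csv".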

Running the figure sweep scripts

The shell scripts in scripts/ are multi-seed sweeps. They use relative paths such as ../../src/run_subliminal.py, so run each script from its own directory:

cd scripts/figure2
bash same_init.sh

Do not run them from the repository root as bash scripts/figure2/same_init.sh unless you first modify the relative paths.

Most scripts use 20 seeds (seq 0 19). They are intended for figure-level experiments and can take substantially longer than the quick-start run.

Figure 2 scripts: initialisation controls over auxiliary dimension

All Figure 2 scripts run MNIST MLP→MLP sweeps over

m = 3, 10, 25, 50, 100, 250
seeds = 0, ..., 19
teacher hidden dims = 256,256
student hidden dims = 256,256
noise = uniform

For a two-hidden-layer MLP, positional layer configs follow

fc1, fc2, class_head, aux_head

Script	Student initialisation condition
`same_init.sh`	`A,A,A,A`: all student layers share source A with the teacher.
`rand_first_hid.sh`	`random,A,A,A`: random student `fc1`.
`rand_sec_hid.sh`	`A,random,A,A`: random student `fc2`.
`rand_both_hid.sh`	`random,random,A,A`: random student hidden layers.
`rand_class_head.sh`	`A,A,random,A`: random student `class_head`.
`rand_aux_head.sh`	`A,A,A,random`: random student `aux_head`.

Figure 3 scripts: architecture and class-count sweeps

Script	Sweep
`emnist_sweep.sh`	EMNIST balanced class-count sweep with `K=2,...,47`, `m=50`, random hidden layers, shared heads.
`student_first_layer_d_sweep_m10.sh`	MNIST sweep over the first student hidden width with `m=10`.
`student_first_layer_d_sweep_m50.sh`	Same first-layer-width sweep with `m=50`.
`student_minus_one_layer_m10.sh`	Teacher has two hidden layers; student has one hidden layer; `m=10`.
`student_minus_one_layer_m50.sh`	Same one-layer-smaller student setting with `m=50`.
`student_plus_one_layer_m10.sh`	Teacher has two hidden layers; student has three hidden layers; `m=10`.
`student_plus_one_layer_m50.sh`	Same one-layer-larger student setting with `m=50`.
`mlp_teacher_cnn_student_m10.sh`	Cross-architecture sweep with an MLP teacher and CNN student; `m=10`.
`mlp_teacher_cnn_student_m50.sh`	Same MLP-teacher/CNN-student setting with `m=50`.

The first-layer-width scripts use:

D1 = 8, 11, 16, 23, 32, 45, 64, 91, 128, 181, 256, 362,
     512, 724, 1024, 1448, 2048, 2896, 4096
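
The listed widths coincide with a half-octave grid, round(2^(k/2)) for k = 6, ..., 24, i.e. two widths per doubling. This is an observation about the numbers above, not a statement about how the scripts generate them; half_octave_widths is a hypothetical helper.

```python
def half_octave_widths(k_min, k_max):
    """Widths round(2 ** (k / 2)) for k = k_min, ..., k_max (two values per doubling)."""
    return [round(2 ** (k / 2)) for k in range(k_min, k_max + 1)]

d1_grid = half_octave_widths(6, 24)
print(d1_grid)
```

A grid of this kind spaces the widths evenly on a logarithmic axis, which suits plots of accuracy against layer width.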

Figure 4 scripts: aux-head capacity and noise sample sweep

Script	Sweep
`noise_m_sweep.sh`	MNIST sweep over the auxiliary head size `m` and the number of noise steps (`--noise-steps`).
`m1_noise_sweep.sh`	Noise-step sweep with a single auxiliary neuron (`m=1`).
`noise1_m_sweep.sh`	Sweep over `m` with a fixed budget of noise steps (`N=10^3`).

Figure 5 scripts: student-head perturbation sweeps

Script	Perturbed layer
`perturb_student_aux_head.sh`	Student `aux_head`.
`perturb_student_class_head.sh`	Student `class_head`.

Both scripts use MNIST MLP→MLP with all layers initialised from source A, all layers trainable, m=10, and perturb the selected student head immediately before student training:

perturbation std = 40 linearly spaced values from 0.0 to 0.2
seeds = 0, ..., 19
timing = before_student_training
include_weight = true
include_bias = true
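
The perturbation grid can be reproduced without NumPy. The sketch below assumes that both endpoints (0.0 and 0.2) are included, as with numpy.linspace, and builds one --perturb value per standard deviation in the shorthand format documented under Perturbation flags; linspace and perturb_flag are hypothetical helper names.

```python
def linspace(start, stop, num):
    """Evenly spaced values from start to stop inclusive (like numpy.linspace)."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

def perturb_flag(layer, std):
    """Build a --perturb shorthand value for one student layer and one std."""
    return (f"student:{layer},std={std:.6g},timing=before_student_training,"
            "include_weight=true,include_bias=true")

stds = linspace(0.0, 0.2, 40)
flags = [perturb_flag("aux_head", s) for s in stds]
```

Each entry of flags could then be passed to src/run_subliminal.py as the argument of --perturb, once per seed.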

Figure 6 scripts: hidden-dimension and head-freezing sweeps

All Figure 6 scripts sweep the second hidden-layer width D:

D = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 16, 23, 32, 45, 64,
    91, 128, 181, 256, 362, 512, 724, 1024, 1448, 2048, 2896,
    4096, 5793, 8192, 11585
seeds = 0, ..., 19
teacher hidden dims = 256,D
student hidden dims = 256,D
noise = uniform

Script	Dataset / condition
`d_sweep_baseline.sh`	MNIST baseline with `m=10`, all layers trainable.
`d_sweep_baseline_emnist.sh`	EMNIST baseline with `m=50`, all layers trainable.
`d_sweep_fixed_class_head.sh`	MNIST, class heads frozen for teacher and student.
`d_sweep_fixed_aux_head.sh`	MNIST, auxiliary heads frozen for teacher and student.
`d_sweep_fixed_aux_class_head.sh`	MNIST, both class and auxiliary heads frozen for teacher and student.

Command-line reference

I/O and reproducibility

Flag	Default	Meaning
`--outdir`	`./outputs`	Root directory for generated outputs.
`--data-dir`	`./MNIST`	Dataset storage directory.
`--seed`	`42`	Base seed. Deterministic derived seeds are used internally for model initialisation, training, dataloader shuffling, perturbations, and noise evaluation.
`--num-workers`	`0`	Number of PyTorch dataloader workers.
`--device`	`auto`	Device string: `auto`, `cpu`, `cuda`, `cuda:0`, etc.
`--sweep-name`	`default`	Free-form sweep label written to outputs.
`--run-label`	empty	Additional free-form label written to outputs.
`--append-global`	off	Also append to aggregate CSV files under `--outdir`.

Dataset flags

Flag	Default	Meaning
`--dataset`	`mnist`	Dataset: `mnist` or `emnist`.
`--emnist-split`	`balanced`	EMNIST split: `balanced` or `letters`. Ignored for MNIST.
`--class-count`	`None`	If set, keep only labels `0,...,K-1` after any EMNIST target transform and remap them to zero-based labels.
`--class-selection`	`first`	Class-truncation rule. Currently only `first` is implemented.

MNIST has 10 classes. EMNIST balanced has 47 classes. EMNIST letters has 26 classes after shifting labels to zero-based indexing.
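
The "first K classes" rule can be sketched as a filter over (image, label) pairs. This is an illustration under stated assumptions, not the repository's data utility: truncate_classes and label_offset are hypothetical names, and the offset of 1 reflects EMNIST letters, whose raw labels start at 1.

```python
def truncate_classes(samples, class_count, label_offset=0):
    """Keep samples whose shifted label lies in 0, ..., class_count - 1.

    samples is an iterable of (image, label) pairs; label_offset is subtracted
    from each label first (for example 1 for EMNIST letters).
    """
    kept = []
    for image, label in samples:
        shifted = label - label_offset
        if 0 <= shifted < class_count:
            kept.append((image, shifted))
    return kept

# Toy example with placeholder "images" and raw EMNIST-letters-style labels.
toy = [("a", 1), ("b", 2), ("z", 26)]
first_two = truncate_classes(toy, class_count=2, label_offset=1)
```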

Architecture flags

Flag	Default	Meaning
`--teacher-type`	`mlp`	Teacher architecture: `mlp` or `cnn`.
`--student-type`	`mlp`	Student architecture: `mlp` or `cnn`.
`--teacher-hidden-dims`	`256,256`	Comma-separated teacher MLP hidden widths. For the default CNN, the last value sets the final linear feature dimension.
`--student-hidden-dims`	`256,256`	Same for the student.
`--teacher-arch-spec`	`None`	JSON string or path to a JSON list describing teacher CNN feature layers. Ignored for MLP.
`--student-arch-spec`	`None`	JSON string or path to a JSON list describing student CNN feature layers. Ignored for MLP.
`--m`	`10`	Auxiliary head output dimension.

MLP layer names are stable and ordered as:

fc1, fc2, ..., class_head, aux_head

CNN layer names are taken from the architecture spec, followed by:

class_head, aux_head

If no CNN architecture spec is supplied, the default CNN feature extractor is:

conv1: 32 channels, 3x3, padding 1, ReLU, max-pool 2
conv2: 64 channels, 3x3, padding 1, ReLU, max-pool 2
fc1: final linear feature layer

A CNN spec can be passed either as a JSON string or as a path to a JSON file. Example:

[
  {
    "name": "conv1",
    "type": "conv2d",
    "out_channels": 32,
    "kernel_size": 3,
    "padding": 1,
    "activation": "relu",
    "pool": {"type": "max", "kernel_size": 2}
  },
  {
    "name": "fc1",
    "type": "linear",
    "out_features": 256,
    "activation": "relu"
  }
]

Supported CNN feature-layer types are conv2d and linear. Supported activations are relu, gelu, tanh, sigmoid, and identity-style values such as none. Supported pools include max pooling, average pooling, and adaptive average pooling.
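
A spec of this shape can be loaded and sanity-checked before a run. The sketch below is an assumption-laden illustration rather than the repository's loader: load_arch_spec treats any argument that does not start with "[" as a file path, and flattened_size assumes 28x28 single-channel inputs, stride-1 convolutions, and pooling with stride equal to its kernel size (the PyTorch defaults). Adaptive pooling is not handled.

```python
import json
from pathlib import Path

SUPPORTED_TYPES = {"conv2d", "linear"}

def load_arch_spec(value):
    """Parse a CNN spec given as a JSON string or as a path to a JSON file."""
    text = value.strip()
    if not text.startswith("["):  # assumption: otherwise the value is a path
        text = Path(value).read_text()
    spec = json.loads(text)
    for layer in spec:
        if layer["type"] not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported layer type: {layer['type']}")
    return spec

def flattened_size(spec, in_channels=1, height=28, width=28):
    """Number of features entering the first linear layer of the spec."""
    channels = in_channels
    for layer in spec:
        if layer["type"] != "conv2d":
            break
        k, p = layer["kernel_size"], layer.get("padding", 0)
        height = height + 2 * p - k + 1  # stride-1 convolution
        width = width + 2 * p - k + 1
        channels = layer["out_channels"]
        pool = layer.get("pool")
        if pool:
            height //= pool["kernel_size"]  # pooling stride equals kernel size
            width //= pool["kernel_size"]
    return channels * height * width
```

For the example spec above, flattened_size gives 32 x 14 x 14 = 6272 inputs to fc1; for the default two-convolution extractor it gives 64 x 7 x 7 = 3136.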

Initialisation and trainability flags

Flag	Default	Meaning
`--teacher-init`	all `A`	Per-layer teacher initialisation source.
`--student-init`	all `A`	Per-layer student initialisation source.
`--teacher-trainable`	all `true`	Per-layer teacher trainability.
`--student-trainable`	all `true`	Per-layer student trainability.

Initialisation sources:

A       deterministic source A
B       deterministic source B
random  keep the model's independently seeded random initialisation

Layer configs may be positional:

--teacher-init A,A,A,A
--student-trainable true,true,true,true

or named:

--teacher-init fc1:A,fc2:A,class_head:A,aux_head:A
--student-trainable all:true

The named form accepts all:<value> as a default override. Named configs are recommended whenever teacher and student architectures have different numbers of layers.
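
The two layer-config forms can be resolved to a per-layer mapping as sketched below. This is one plausible reading of the syntax described above, not the repository's parser; parse_layer_config and its error messages are hypothetical, and the real implementation may handle edge cases differently.

```python
def parse_layer_config(spec, layer_names, default):
    """Resolve a positional or named layer config into {layer_name: value}."""
    items = [item.strip() for item in spec.split(",") if item.strip()]
    if any(":" in item for item in items):
        # Named form: 'all:<value>' sets the default, other entries override it.
        named = dict(item.split(":", 1) for item in items)
        base = named.pop("all", default)
        unknown = set(named) - set(layer_names)
        if unknown:
            raise ValueError(f"unknown layers: {sorted(unknown)}")
        return {name: named.get(name, base) for name in layer_names}
    # Positional form: one value per layer, in layer order.
    if len(items) != len(layer_names):
        raise ValueError(f"expected {len(layer_names)} values, got {len(items)}")
    return dict(zip(layer_names, items))

MLP_LAYERS = ["fc1", "fc2", "class_head", "aux_head"]
positional = parse_layer_config("random,random,A,A", MLP_LAYERS, "A")
named = parse_layer_config("all:true,class_head:false", MLP_LAYERS, "true")
```

The named form stays valid when a layer is added or removed, which is why it is the safer choice for teacher/student pairs of different depth.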

Training flags

Flag	Default	Meaning
`--teacher-epochs`	`5`	Number of supervised teacher epochs.
`--student-epochs`	`5`	Number of student distillation epochs.
`--data-bsize`	`1024`	Batch size for supervised data loaders.
`--noise-bsize`	`1000`	Batch size for synthetic-noise samples during student training.
`--noise-steps`	`60`	Number of noise batches per student epoch. Samples per student epoch are `noise_bsize × noise_steps`.
`--teacher-lr`	`1e-3`	Teacher Adam learning rate.
`--student-lr`	`1e-3`	Student Adam learning rate.

Noise flags

Flag	Default	Meaning
`--noise-dist`	`uniform`	Synthetic noise distribution: `uniform`, `normal`, or `perlin`.
`--perlin-res`	`8`	Perlin grid resolution when `--noise-dist perlin` is used.
`--normalize-noise`	off	Apply MNIST normalisation constants to synthetic noise.
`--eval-noise-batches`	`10`	Number of noise batches for auxiliary-MSE evaluation.
`--eval-noise-bsize`	`1000`	Batch size for auxiliary-MSE evaluation.

Uniform noise is sampled in [-1, 1]. Normal noise is sampled from a standard Gaussian. Perlin noise is generated on 28×28 images and scaled by its per-sample maximum absolute value. Dataset images are normalised with MNIST constants, mean 0.1307 and standard deviation 0.3081.
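
The sampling and scaling rules above can be written out in a few lines. The sketch uses only the standard library and flat lists in place of tensors; it assumes that --normalize-noise applies the same (x - mean) / std transform used for dataset images, and it shows only the final scaling step of the Perlin pipeline, not Perlin generation itself. All function names are hypothetical.

```python
import random

MNIST_MEAN, MNIST_STD = 0.1307, 0.3081

def uniform_noise(rng, n_pixels=28 * 28):
    """One flattened noise image with pixels drawn uniformly from [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n_pixels)]

def normal_noise(rng, n_pixels=28 * 28):
    """One flattened noise image with standard Gaussian pixels."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_pixels)]

def scale_by_max_abs(image):
    """Divide an image by its own maximum absolute value (per-sample scaling)."""
    peak = max(abs(p) for p in image)
    return [p / peak for p in image] if peak > 0 else list(image)

def mnist_normalise(image):
    """Apply (x - mean) / std with the MNIST constants."""
    return [(p - MNIST_MEAN) / MNIST_STD for p in image]

rng = random.Random(0)
uniform_image = uniform_noise(rng)
scaled_image = scale_by_max_abs(normal_noise(rng))
```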

Perturbation flags

Perturbations can be specified with repeatable shorthand flags:

--perturb "student:aux_head,std=0.1,timing=before_student_training,include_weight=true,include_bias=true"

or with a JSON object/list:

--perturb-spec '[{"target":"student","layers":["aux_head"],"std":0.1,"timing":"before_student_training"}]'

Available perturbation fields:

Field	Meaning
`target`	`teacher` or `student`.
`layers`	Layer name, list of layer names, `all`, or shorthand such as `fc1+aux_head`.
`std` / `sigma`	Standard deviation of the additive Gaussian perturbation.
`timing`	When to apply the perturbation.
`include_weight`	Whether to perturb layer weights.
`include_bias`	Whether to perturb layer biases.
`distribution`	Currently `normal`.

Allowed timings:

before_teacher_training
after_teacher_training
before_student_training
after_student_training
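
For downstream tooling it can be convenient to turn the shorthand string into the same fields as the JSON form. The parser below is a hypothetical sketch based only on the syntax shown in this section; it does not validate timings and does not reproduce whatever defaults the command-line tool applies to omitted fields.

```python
def parse_perturb(shorthand):
    """Turn 'target:layers,key=value,...' into a dict of perturbation fields."""
    head, *pairs = [part.strip() for part in shorthand.split(",")]
    target, layers = head.split(":", 1)
    # 'fc1+aux_head' style shorthand becomes a list of layer names.
    spec = {"target": target, "layers": layers.split("+")}
    for pair in pairs:
        key, value = pair.split("=", 1)
        if key in ("std", "sigma"):
            spec["std"] = float(value)
        elif key in ("include_weight", "include_bias"):
            spec[key] = value.lower() == "true"
        else:
            spec[key] = value
    return spec

parsed = parse_perturb(
    "student:aux_head,std=0.1,timing=before_student_training,"
    "include_weight=true,include_bias=true"
)
```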

Minimal examples

Change only the seed:

python src/run_subliminal.py --device cpu --seed 123 --outdir ./outputs/seed_123_test

Use a larger auxiliary dimension:

python src/run_subliminal.py --device cpu --m 100 --outdir ./outputs/m100

Use EMNIST balanced with the first 20 classes:

python src/run_subliminal.py \
  --device cpu \
  --dataset emnist \
  --emnist-split balanced \
  --class-count 20 \
  --data-dir ./EMNIST_DATA \
  --outdir ./outputs/emnist_k20

Randomise the student hidden layers but keep the heads shared:

python src/run_subliminal.py \
  --device cpu \
  --teacher-init A,A,A,A \
  --student-init random,random,A,A \
  --outdir ./outputs/random_student_features

Freeze both student heads during distillation:

python src/run_subliminal.py \
  --device cpu \
  --student-trainable fc1:true,fc2:true,class_head:false,aux_head:false \
  --outdir ./outputs/frozen_student_heads

Use Perlin noise:

python src/run_subliminal.py \
  --device cpu \
  --noise-dist perlin \
  --perlin-res 8 \
  --outdir ./outputs/perlin_res8

Perturb the student auxiliary head before student training:

python src/run_subliminal.py \
  --device cpu \
  --outdir ./outputs/perturb_aux \
  --perturb "student:aux_head,std=0.1,timing=before_student_training,include_weight=true,include_bias=true"

Reproducibility notes

Each invocation performs exactly one run.
The base --seed is expanded into deterministic derived seeds for model initialisation, dataloader shuffling, teacher training, student noise training, perturbations, and noise evaluation.
The code records resolved configurations and derived seed bookkeeping in configs.jsonl.
Linear and convolutional layers use PyTorch initialisation unless reset through deterministic sources A or B.
The sweep scripts do not perform plotting or bootstrap aggregation. They generate CSV/JSONL files intended for downstream analysis.
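
The idea of expanding one base seed into several purpose-specific seeds can be illustrated as follows. This is one possible scheme, shown only to make the concept concrete; the repository's actual derivation is not described here and may differ. The purpose names and derive_seed are assumptions.

```python
import hashlib

PURPOSES = ("model_init", "dataloader", "teacher_training",
            "student_noise", "perturbations", "noise_eval")

def derive_seed(base_seed, purpose):
    """Map (base_seed, purpose) to a stable 32-bit seed via SHA-256."""
    digest = hashlib.sha256(f"{base_seed}:{purpose}".encode()).digest()
    return int.from_bytes(digest[:4], "big")

# One derived seed per purpose for the default base seed.
derived = {purpose: derive_seed(42, purpose) for purpose in PURPOSES}
```

Hashing the purpose together with the base seed keeps the streams independent of one another, so changing, say, the number of perturbations does not shift the seed used for dataloader shuffling.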

Troubleshooting

If a shell script cannot be executed directly, run it via bash script_name.sh or make it executable with chmod +x script_name.sh.
Run figure scripts from their own directory so that their relative paths resolve correctly.
The first MNIST/EMNIST run may spend additional time downloading data.
For quick implementation checks, prefer a single command such as python src/run_subliminal.py --device cpu --outdir ./outputs/smoke_test before launching a full multi-seed sweep.

License

This repository is distributed under the BSD 3-Clause license; see LICENSE.
