Group C: Typed-Species NEAT Integration

Group C is the third research stream, opened 2026-05-13 after Group B closed at commit e70bffd. Group B’s closing synthesis was a directive: the genome should evolve patch geometry, placement strategy, depth, and training schedule per task rather than lock in MNIST-derived defaults. Group C does that integration work in the main NEAT system.

The hypothesis

A NEAT system whose genome can encode patch-matcher nodes alongside scalar nodes, whose mutations can vary patch geometry and placement, and whose populations can speciate ecologically should:

Outperform pure-scalar NEAT (and dense MLPs) on the joint 4-way task.
Discover per-niche patch parameters matching the task — reproducing Group B’s per-task findings, especially KMNIST’s spatial-locality inversion.

The first claim is about compression: patches as a primitive should beat dense layers parameter-for-parameter. The second is about discovery: niching should rediscover Group B’s manually-mapped task-conditional architectures.

The strategy

Baselines — 4-way joint dense MLPs to define the floor.
Design and integrate — extend the genome so a node can be a patch matcher (K pixel indices + K weights + bias stored on the node, no incoming ConnectionGenes). Outgoing connections normal. Touch every layer of the system (genome/node.rs, phenotype compilation, forward/backward, mutation, crossover) without breaking the existing scalar-only path.
Evolve — first a single-niche run to verify, then sweeps over patch count, mutation rates, pruning, macro-mutations, and finally ecological speciation.
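
The "Design and integrate" step can be pictured with a minimal sketch of a typed node. This is not the actual contents of genome/node.rs: the names NodeKind, spatial_patch, and patch_preactivation are hypothetical, the image is assumed to be flattened row-major, and activation functions and training are left out.

```rust
/// Hypothetical node payload: a classic scalar node fed by incoming
/// connection genes, or a patch matcher that owns its own inputs.
#[derive(Debug, Clone, PartialEq)]
pub enum NodeKind {
    /// Classic NEAT node; its inputs arrive through ConnectionGenes.
    Scalar { bias: f32 },
    /// Patch matcher: K pixel indices and K weights stored on the node,
    /// so it needs no incoming ConnectionGenes.
    Patch {
        indices: Vec<usize>,
        weights: Vec<f32>,
        bias: f32,
    },
}

impl NodeKind {
    /// Builds a spatial `size x size` patch with its top-left corner at
    /// (`row`, `col`) in a row-major image `width` pixels wide.
    /// Weights and bias start at zero (an assumption of this sketch).
    pub fn spatial_patch(row: usize, col: usize, size: usize, width: usize) -> Self {
        let indices: Vec<usize> = (0..size)
            .flat_map(|dr| (0..size).map(move |dc| (row + dr) * width + (col + dc)))
            .collect();
        let weights = vec![0.0; indices.len()];
        NodeKind::Patch { indices, weights, bias: 0.0 }
    }

    /// Weighted sum of the patch's pixels plus bias. Returns `None` for
    /// scalar nodes (their input comes from connections) and for any
    /// pixel index that falls outside `image`.
    pub fn patch_preactivation(&self, image: &[f32]) -> Option<f32> {
        match self {
            NodeKind::Scalar { .. } => None,
            NodeKind::Patch { indices, weights, bias } => {
                let mut sum = *bias;
                for (&i, &w) in indices.iter().zip(weights.iter()) {
                    sum += w * image.get(i)?;
                }
                Some(sum)
            }
        }
    }
}
```

Keeping the patch payload on the node rather than in connection genes means the scalar-only path never has to look at it, which is one plausible way to leave existing scalar genomes untouched.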

The pages

Group C Journal — chronological observations and per-experiment narrative.
Group C Experiments — structured records (C1, C2, …, C8).

Headline result

C8 — ecological speciation reproduces Group B’s per-task locality directions without being told about them.

Five independent niches (pure MNIST/Fashion/KMNIST/EMNIST + a 25/25/25/25 mixed niche), each seeded identically with 128 patches (half spatial 5×5, half random-index). Each niche evolves independently for 300K training steps. Final patch geometry, measured by edge_frac (fraction of patches that touch the image border):

Niche	edge_frac	Reading
mnist	0.700	Spatial patches dominant — selection purged random-index
kmnist	1.000	Random-index dominant — selection purged spatial
fashion	1.000	Random-index dominant
emnist	0.857	Spatial-leaning
mixed	1.000	Random-index (averaged across tasks, pulled by majority)

Initial conditions had edge_frac ≈ 0.83. MNIST drifted down (preserved spatial 5×5 patches). KMNIST drifted up to ≈ 1.0 (purged spatial patches in favor of random-index). The KMNIST inversion that Group B mapped through 35 experiments is rediscovered by selection in 30 minutes of niche training, without anyone telling the system what KMNIST is.
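
For readers who want to recompute the metric, here is a minimal sketch of edge_frac as defined above (fraction of patches that touch the image border). It assumes each patch is a list of row-major pixel indices on a square image and that "touching" means at least one index lies in the outermost row or column; the function names are hypothetical and the project's own measurement code may differ.

```rust
/// True if any pixel index lies on the outermost row or column of a
/// row-major `side x side` image. An empty patch never touches.
pub fn touches_border(indices: &[usize], side: usize) -> bool {
    if side == 0 {
        return false;
    }
    indices.iter().any(|&i| {
        let (row, col) = (i / side, i % side);
        row == 0 || col == 0 || row + 1 == side || col + 1 == side
    })
}

/// Fraction of patches that touch the border; 0.0 for an empty set.
pub fn edge_frac(patches: &[Vec<usize>], side: usize) -> f64 {
    if patches.is_empty() {
        return 0.0;
    }
    let touching = patches.iter().filter(|p| touches_border(p, side)).count();
    touching as f64 / patches.len() as f64
}
```

Under this definition a random-index patch almost always contains at least one border pixel, while a spatial 5×5 patch touches the border only when it is placed at the edge, which is why a high edge_frac reads as "random-index dominant".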

EMNIST’s row_std (7.17) > col_std (6.78) anisotropy is consistent with Group B B33’s finding that EMNIST follows the rectangular wide-preference — printed characters have a vertical-stroke bias.
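
The anisotropy statistic can be sketched as follows. This assumes row_std and col_std are the population standard deviations of the row and column coordinates of every pixel referenced by a niche's patches; that reading is an assumption, and the helper names are hypothetical.

```rust
/// Population standard deviation; 0.0 for an empty slice.
fn std_dev(values: &[f64]) -> f64 {
    if values.is_empty() {
        return 0.0;
    }
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    (values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n).sqrt()
}

/// Returns (row_std, col_std) over all pixel indices in all patches,
/// treating indices as row-major positions on a `side x side` image.
pub fn row_col_std(patches: &[Vec<usize>], side: usize) -> (f64, f64) {
    if side == 0 {
        return (0.0, 0.0);
    }
    let mut rows = Vec::new();
    let mut cols = Vec::new();
    for &i in patches.iter().flatten() {
        rows.push((i / side) as f64);
        cols.push((i % side) as f64);
    }
    (std_dev(&rows), std_dev(&cols))
}
```

A pixel that appears in several patches is counted once per patch here, so heavily covered regions weigh more; whether the reported numbers were weighted this way is not stated in the source.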

Experiment table

Exp	Setup	Result
C1	4-way MLP baseline `[64]` (3 seeds)	75.8% overall test
C1	4-way MLP baseline `[128]` (3 seeds)	77.4% overall test (the floor)
C1	4-way MLP baseline `[128, 64]` (3 seeds)	35.5% ± 26.6 (U(-1,1) init saturates softmax at depth 2 — recorded as init characterization, not depth ceiling)
C2	Integration verifier: 320 spatial 5×5 patches → linear on MNIST	96.64% — matches Group B’s spatial-patch result
C3	First patch-evolved population (seed 64)	75.9% test at 5,005 conn — matches `[64]` MLP at 11× fewer params
C4	Behavior-preserving insertion (`head_weight = 0`)	avg_patches 64→64.8 over 50 gen; test ~75% — patch count barely moves
C5a	4× `add_patch_prob`	avg_patches → 65.3; same test — insertion rate is not the bottleneck
C5b	Seed 128 patches	82.4% test at 9,933 conn — beats `[128]` MLP by +5pp at 11× fewer params
C5c	Seed 256 patches	85.7% test at 19,790 conn
C5d	Seed 512 patches	87.1% test at 39,502 conn (capacity asymptote ~88%)
C6	Per-connection pruning from seed 256	85.2% / 19,053 conn — pruning real but partially undone by crossover
C7	Macro `add_patch_burst` of 8 from seed 64	avg_patches → 77, top fitness still 64-65 patches — macro adds don’t propagate to top
C8	5 niches × 128 patches × 300K steps	Group B reproduced: per-task locality directions emerge from selection
D1	Per-patch introspection: PGM mosaics + pixel coverage heatmaps	MNIST/EMNIST 37-38% center mass; Fashion/KMNIST 24% (uniform); EMNIST anisotropy `col_std<row_std` and centroid offset top-left
D2	`add_patch_prob=0.10` + `add_patch_burst_prob=0.05` inside niches	Negative — niching softens macro-mutant culling (rank-2 EMNIST individual at 132 patches) but doesn’t push top to grow. Caught a real crossover cycle bug — once per 1.5M steps with patch-add; `sanitize` handles it
D3	32-node ReLU hidden layer between patches and outputs	KMNIST +3.3pp, EMNIST −2.7pp, MNIST/Fashion null, mixed −2.8pp — clean Group B replication (B25 +2.78pp, B34 −1.11pp). Bonus: also 33% fewer connections

Three headline scientific findings

1. The patch primitive is dramatically more parameter-efficient than dense MLPs

Method	Overall test	Connections
`[64]` MLP	0.758	55,245
`[128]` MLP	0.774	110,413
C5b (128 patches)	0.824	9,933
C5d (512 patches)	0.871	39,502

Patches at 128 beat the `[128]` MLP by +5pp on the joint task with 11× fewer parameters. Returns diminish as capacity grows: each doubling of patch count gives roughly half the previous gain (+6.5pp from 64 to 128 patches, +3.3pp to 256, +1.4pp to 512), so accuracy grows more slowly than log-linearly in patch count. With patches alone, the 4-way joint task asymptotes at ~88% at this LR/budget.
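
The connection counts in the table can be reproduced with simple arithmetic. The sketch below assumes 784 input pixels (28×28), 77 outputs for the joint task, one connection per weight and per bias, and that the K pixel weights stored on a patch node are not counted as connections. That convention is inferred because it matches the reported numbers, not taken from the source code, and the function names are hypothetical.

```rust
/// Weights plus biases of a fully connected net with the given layer
/// widths, e.g. `[784, 64, 77]` for the `[64]` MLP.
pub fn dense_connections(layers: &[usize]) -> usize {
    layers.windows(2).map(|w| w[0] * w[1] + w[1]).sum()
}

/// Patch genome with a linear head: every patch connects to every
/// output, plus one bias per output. Pixel weights stored on the patch
/// nodes are deliberately left out (assumed counting convention).
pub fn patch_head_connections(patches: usize, outputs: usize) -> usize {
    dense_connections(&[patches, outputs])
}
```

With these assumptions, dense_connections(&[784, 64, 77]) gives 55,245, dense_connections(&[784, 128, 77]) gives 110,413, and patch_head_connections gives 5,005 for 64 patches and 9,933 for 128, all matching the tables. For seeds 256 and 512 the same formula lands one below the reported 19,790 and 39,502.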

2. Patch count is not evolvable through direct fitness-driven mutation

C3 (default add), C4 (behavior-preserving add), C5a (4× add rate), C7 (macro +8 bursts): in every case top-fitness individuals stayed at the initial seed count. New patches need training time before they confer fitness, but selection happens before they catch up. NEAT crossover treats new patches as disjoint genes, inherited from the fitter parent only, and a fresh patch rarely belongs to the fitter parent. So patches that don’t immediately confer fitness are bred out in the next generation.
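
The mechanism described above can be illustrated with a stripped-down crossover. This is a sketch of the classic NEAT rule, not the project's implementation: genes are reduced to a hypothetical map from innovation id to a single value, and the random choice between parents for matching genes is injected as a closure so the example needs no RNG crate.

```rust
use std::collections::BTreeMap;

/// Hypothetical genome: innovation id -> gene value.
pub type Genes = BTreeMap<u64, f32>;

/// NEAT-style crossover sketch. Matching genes (same id in both
/// parents) come from either parent, decided by `pick_fitter`. Genes
/// present in only one parent survive only if that parent is the
/// fitter one; the weaker parent's unmatched genes are dropped.
pub fn crossover(
    fitter: &Genes,
    weaker: &Genes,
    mut pick_fitter: impl FnMut() -> bool,
) -> Genes {
    let mut child = Genes::new();
    for (&id, &value) in fitter {
        let inherited = match weaker.get(&id) {
            Some(&other) => {
                if pick_fitter() {
                    value
                } else {
                    other
                }
            }
            None => value,
        };
        child.insert(id, inherited);
    }
    child
}
```

If the freshly added patch lives in the weaker parent, which is the usual case while it is still untrained, the loop never visits its id and the child simply lacks it.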

The way out is ecological speciation: niches restrict competition to similar-distribution individuals. Inside a niche, per-task index evolution does work — fitness climbs cleanly, and the population converges on a task-appropriate patch geometry. C8 demonstrates this.

This reframes the Group C charter. The interesting evolvable axes for the typed-species genome are patch indices (which pixels) and patch geometry (spatial vs distributed), not patch count. Capacity is set by initial seed, not mutation. Niching does the architectural discovery.

3. (Phase D) Ecological speciation reproduces Group B’s per-task depth findings

D3 added a 32-node ReLU hidden layer between patches and outputs (still 128 patches; depth set via Genome::new_with_patches(.., hidden_size = 32, ..)). Per-niche test accuracy vs the no-depth D1 baseline:

Niche	D1 (no depth)	D3 (depth=32)	Δ	Group B prediction
mnist	96.8%	96.78%	≈0	null on saturated MNIST (B32) ✓
fashion	86.9%	86.71%	≈0	(untested in Group B with proper schedule)
kmnist	90.2%	93.49%	+3.3pp	B25: +2.78pp ✓
emnist	78.3%	75.56%	−2.7pp	B34: −1.11pp ✓
mixed	81.4%	78.57%	−2.8pp	averaging

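All three niches with a Group B depth prediction match its sign (MNIST null, KMNIST positive, EMNIST negative); Fashion, the fourth, was untested in Group B with a proper schedule. KMNIST’s +3.3pp is within about 0.5pp of Group B’s +2.78pp (with proper schedule, B25). EMNIST’s −2.7pp matches B34’s sign (depth hurts even at proper schedule), though its magnitude is larger than B34’s −1.11pp.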

The mixed-niche regression is the ecological-speciation argument in concrete form. Adding depth uniformly hurts the mixed niche by 2.8pp. A single 32-node hidden layer is one fixed architectural decision: it helps KMNIST and hurts EMNIST, and on the mixed task with both, the net is negative. Per-task depth selection is one of the things ecological niches can do but a single network can’t. D3’s mixed niche underperforms D1’s mixed niche; D3’s KMNIST niche beats D1’s. Niching captures task-conditional architectural value that homogeneous training cannot.

Bonus: D3 networks have fewer connections (6,669) than D1 (9,933) because the 32-wide hidden bottleneck is narrower than 77 outputs. KMNIST’s depth niche gets +3.3pp accuracy at −33% connections — a Pareto win.
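
The bottleneck arithmetic can be checked directly. The sketch assumes one connection per weight and per bias and excludes the pixel weights stored on patch nodes; that convention is inferred because it reproduces both 9,933 and 6,669, and the function names are hypothetical. The break-even width it derives is a consequence of that assumption, not a result reported by the experiments.

```rust
/// Connections between patches and outputs, with an optional hidden
/// layer in between (weights plus one bias per non-patch node).
pub fn head_connections(patches: usize, hidden: Option<usize>, outputs: usize) -> usize {
    match hidden {
        None => patches * outputs + outputs,
        Some(h) => patches * h + h + h * outputs + outputs,
    }
}

/// Largest hidden width that still uses fewer connections than the
/// direct patches -> outputs head. Derived from
///   patches*h + h + h*outputs + outputs < patches*outputs + outputs
///   <=> h * (patches + outputs + 1) < patches * outputs
pub fn widest_saving_hidden(patches: usize, outputs: usize) -> usize {
    (patches * outputs).saturating_sub(1) / (patches + outputs + 1)
}
```

For 128 patches and 77 outputs, head_connections returns 9,933 without a hidden layer and 6,669 with a 32-wide one, and widest_saving_hidden returns 47: under this counting, any hidden layer up to 47 nodes wide would reduce the connection count.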

So two Group B cross-task findings now reproduce inside the integrated, niched system as emergent niche-level behaviors:

Per-task locality direction (C8/D1): MNIST→spatial, KMNIST→distributed, EMNIST→spatial-with-anisotropy, Fashion→distributed.
Per-task depth direction (D3): MNIST null, KMNIST +, EMNIST −.

What took 35 Group B experiments to map shows up as a population-level fingerprint in ~30 min of niche training.

What Phase D resolved and what’s still open

Resolved by Phase D:

Per-patch introspection (D1): each niche’s pixel-coverage signature is concrete and quantifiable, including EMNIST’s row≠col anisotropy that connects to Group B B33’s rectangular wide-preference finding.
Multi-layer patch genomes (D3): KMNIST gains +3.3pp from depth (matching B25 within 0.5pp), EMNIST loses 2.7pp (matching B34’s sign), MNIST/Fashion null. The mixed niche regresses, demonstrating why per-task niching captures architectural value single-network training can’t.
Patch-count evolution inside niches (D2): clean negative. Niching softens macro-mutant culling but doesn’t push the top of the population to grow patches.

Bugfix pass landed alongside Phase D:

add_connection excludes patch nodes as targets (they had been silent no-ops).
Sticky-disabled crossover (NEAT-classic 0.75 rule, exposed as EvolutionConfig.disable_inheritance_prob, default 0.0).
Dead-patch compilation skip.
Genome::sanitize() breaks cycles in the enabled subgraph by iteratively disabling one edge on a cycle per pass. Load-bearing for any patch-add experiment: without it the run panics in phenotype compilation when crossover combines matching genes from two individually acyclic parents whose enabled patterns together close a cycle. Rare (1 per ~1.5M training steps) but catastrophic.
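
The cycle-breaking idea behind the last item can be sketched as follows. This is not the project's Genome::sanitize(): the Conn struct and both functions are hypothetical, node ids are assumed to be smaller than node_count, and the edge chosen for disabling is simply the first back edge a depth-first search finds, which may differ from the real selection rule.

```rust
/// Hypothetical connection gene: a directed edge with an enabled flag.
#[derive(Debug, Clone, PartialEq)]
pub struct Conn {
    pub from: usize,
    pub to: usize,
    pub enabled: bool,
}

/// Index of an enabled edge that closes a cycle (a DFS back edge), or
/// `None` if the enabled subgraph is acyclic.
pub fn find_back_edge(conns: &[Conn], node_count: usize) -> Option<usize> {
    // Adjacency list over enabled edges: node -> (target, conn index).
    let mut adj = vec![Vec::new(); node_count];
    for (i, c) in conns.iter().enumerate() {
        if c.enabled {
            adj[c.from].push((c.to, i));
        }
    }
    // 0 = unvisited, 1 = on the current DFS path, 2 = finished.
    let mut state = vec![0u8; node_count];
    for start in 0..node_count {
        if state[start] != 0 {
            continue;
        }
        // Iterative DFS; each frame is (node, next child position).
        let mut stack = vec![(start, 0usize)];
        state[start] = 1;
        while let Some(&(node, pos)) = stack.last() {
            if pos < adj[node].len() {
                stack.last_mut().unwrap().1 += 1;
                let (next, edge) = adj[node][pos];
                match state[next] {
                    0 => {
                        state[next] = 1;
                        stack.push((next, 0));
                    }
                    1 => return Some(edge), // edge back into the path
                    _ => {}
                }
            } else {
                state[node] = 2;
                stack.pop();
            }
        }
    }
    None
}

/// Disables one cycle-closing edge per pass until no cycle remains.
/// Returns how many edges were disabled.
pub fn sanitize(conns: &mut [Conn], node_count: usize) -> usize {
    let mut disabled = 0;
    while let Some(edge) = find_back_edge(conns, node_count) {
        conns[edge].enabled = false;
        disabled += 1;
    }
    disabled
}
```

Each pass disables exactly one enabled edge, so the loop terminates after at most as many passes as there are enabled connections. Disabled genes are ignored during the search, which mirrors the point that only the enabled pattern matters for compilation.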

Still open:

Patch-count evolution remains structurally blocked. Every variant tried (default add, behavior-preserving, 4× rate, macro +8 bursts, in-niche) leaves top individuals at the seed count. Plausible unblocks: longer evolve intervals, pre-trained patch insertions, or a maturity-aware selection mechanism — none implemented.
Cross-niche transplant. Take MNIST’s evolved patches into a fresh genome trained on Fashion. Does the geometry transfer, or does each niche need to re-evolve from scratch?
Cycle bug root cause. sanitize() is the principled fix and works, but the precise crossover gene combinations that close cycles aren’t fully characterized.

The Group C charter — lift the typed-species primitive into the genome and prove evolution can do the architectural discovery — is firmly satisfied. Two Group B cross-task findings (locality and depth) now reproduce inside the integrated NEAT system as emergent niche-level behaviors.