Core Concepts¶

ShinkaEvolve works best when the task boundary is explicit: one initial program, one evaluator, one results directory, and a search loop that mutates code while preserving correctness.

Task Contract¶

Every runnable task centers on two files:

File	Purpose
`initial.<ext>`	Seed program with mutable code regions
`evaluate.py`	Evaluation harness that executes candidates, validates outputs, returns metrics

At minimum, the evaluator surfaces a combined_score and whether the candidate is functionally correct. This creates a stable interface for both the Hydra launcher and the task-directory CLI.

Evolution Loop¶

Each generation follows the same loop:

Sample a parent or inspiration context from the database and archive
Ask one or more LLMs to propose a patch or full candidate program
Validate and materialize the candidate
Run the evaluator and collect metrics
Persist the result into the program database
Update archive state, island state, and optional meta-memory

The async runner overlaps proposal generation and evaluation to keep worker slots busy when LLM sampling is slower than the evaluator.

Runtime Layers¶

The runtime splits into three major configuration domains:

Domain	Controls
`EvolutionConfig`	Mutation behavior, model selection, budgets, prompt evolution, async proposal controls
`DatabaseConfig`	Archive sizing, island behavior, migration, parent selection, archive ranking
`JobConfig` variants	Where and how candidate programs are executed

See Configuration for field-level reference.

Model Selection¶

When EvolutionConfig.llm_models contains multiple mutation models, Shinka can shift sampling probability across them with dynamic bandit selection.

Reward-side utility from observed candidate improvements
Exploration bonus for under-sampled models
Optional cost-aware blending to prefer cheaper models when quality is close

Use the interactive UCB1 Bandit LLM Selection to see how cost_aware_coef changes the selection posterior over time.

Archives and Islands¶

The database stores both the current population and derived state used to guide future sampling:

Global archive of high-value programs
Island-local populations to preserve diversity
Migration settings for cross-island transfer
Parent and inspiration selection strategies

This mechanism lets Shinka balance local exploitation with broader search over qualitatively different candidate programs.

Prompt Co-Evolution¶

Shinka can evolve not only programs but also the system prompts that generate them:

System-prompt archive
Prompt mutation operators
Prompt sampling controls
Prompt fitness tracking over time

Use prompt co-evolution when task quality depends heavily on mutation style or when different prompts unlock distinct improvement modes.

Execution Modes¶

Mode	Description
Local current interpreter	Default; runs in active Python env
Local sourced environment	Via `activate_script`
SLURM + Conda	Conda-backed cluster environments
SLURM + Docker	Container-backed cluster jobs

This keeps the search loop decoupled from the execution environment of the candidate code.

Results and Inspection¶

Each run writes artifacts into a results directory:

Persisted program records and metrics
Candidate code snapshots and diffs
Timing metadata
Prompt-evolution artifacts when enabled

Inspect results through the built-in WebUI, notebooks in the examples directory, or custom post-processing scripts.