🏆 ALE-Bench Leaderboard

ALE-Bench is a benchmark for evaluating AI systems on score-based algorithmic programming contests. Drawing on real-world tasks from the AtCoder Heuristic Contest (AHC), ALE-Bench presents optimization problems (e.g., routing and scheduling) that are computationally hard and admit no known exact solution.

This page displays the leaderboard for ALE-Bench, showcasing the performance of various AI models on the benchmark tasks.

The current version reflects results as of 2026-03-21. For the initial version of the leaderboard (as of 2025-06-17), please visit here.

All the result data published on this webpage can be downloaded in bulk with download_results_json.sh.



Experimental Setup: Self-refine x1 involved 15 sampling iterations, from which the response with the median score across 50 local cases was selected. Details can be found in the ALE-Bench GitHub repository (LLMs' settings and implementation code).
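The selection step can be sketched as follows. This is only one reading of the sentence above, not the benchmark's actual implementation: it assumes each of the 15 responses gets one aggregate score (here the mean over its 50 local cases, an assumption) and that the response whose aggregate sits in the middle of the ranking is kept. The function name `select_median_response` and the input layout are hypothetical; the repository's implementation code is the authoritative reference.

```python
from statistics import mean


def select_median_response(case_scores_per_response):
    """Return the index of the response whose aggregate score is the median.

    case_scores_per_response: one list of per-case scores per response
    (hypothetical layout, e.g. 15 lists of 50 scores each).
    """
    if not case_scores_per_response:
        raise ValueError("need at least one response")

    # Assumption: a response's score is the mean over its local cases.
    aggregates = [mean(scores) for scores in case_scores_per_response]

    # Rank response indices by aggregate score, lowest first.
    order = sorted(range(len(aggregates)), key=lambda i: aggregates[i])

    # Middle of the ranking; for an even count this is the lower median.
    return order[(len(order) - 1) // 2]


# Toy usage: 15 responses, 50 local cases each, constant score per response.
toy_scores = [[14 - i] * 50 for i in range(15)]
chosen = select_median_response(toy_scores)
print(f"selected response index: {chosen}")
```

With 15 responses the count is odd, so the middle of the ranking is a single, well-defined response and no tie-breaking between two middle candidates is needed.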
