Local Reasoning Mini
Official
A lightweight 10-question sanity check for locally served models. Designed for the trusted /api/evals/execute path.
Category: reasoning
Runner: Custom
Version: v1.0
Submitted by: Lottolabs
Eval Details
Direction
Higher is better
Default Run Config
TopP: 1
Temperature: 0
| Task | Dataset | Weight | Shots | Max Tokens |
|---|---|---|---|---|
| Basic Math (`basic_math`) | 5 inline items | 0.5 | Default | 16 |
| Basic Logic (`basic_logic`) | 5 inline items | 0.5 | Default | 8 |
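The page lists two equally weighted tasks but does not say how the overall score is computed. As a minimal sketch, assuming the overall score is the weight-normalized mean of per-task accuracies, the configuration and aggregation could look like the following. The names `Task`, `TASKS`, `RUN_CONFIG`, and `overall_score` are hypothetical and are not part of the eval runner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    weight: float
    max_tokens: int
    n_items: int


# Values copied from the task table on this page.
TASKS = [
    Task("basic_math", weight=0.5, max_tokens=16, n_items=5),
    Task("basic_logic", weight=0.5, max_tokens=8, n_items=5),
]

# Default run config from this page: greedy-style decoding.
RUN_CONFIG = {"temperature": 0, "top_p": 1}


def overall_score(task_accuracy: dict[str, float]) -> float:
    """Weight-normalized mean of per-task accuracies.

    ASSUMPTION: the page does not document the aggregation rule;
    a weighted mean is only the most natural reading of "Weight".
    """
    total_weight = sum(t.weight for t in TASKS)
    weighted = sum(t.weight * task_accuracy[t.name] for t in TASKS)
    return weighted / total_weight


if __name__ == "__main__":
    # 5/5 on math and 2.5/5-equivalent on logic gives 0.75 overall.
    print(overall_score({"basic_math": 1.0, "basic_logic": 0.5}))
```

Under this assumption, with only five items per task, each item is worth 0.1 of the overall score, so small differences between models on this eval should be read as a sanity check rather than a ranking.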
Leaderboard: best run per model
| # | Model | Score | Quant | Hardware |
|---|---|---|---|---|
| | Qwen3.6-27B (Qwen) | | IQ4_NL | NVIDIA GeForce RTX 3090 |
Task Breakdown: top model
basic_logic: no score · 0 samples
basic_math: no score · 0 samples