
Local Reasoning Mini

Official

A lightweight 10-question sanity check for locally served models. Designed for the trusted /api/evals/execute path.

Category: reasoning
Runner: Custom
Version: v1.0
Submitted by: Lottolabs

Eval Details

Scoring: Exact Match
Aggregation: Mean
Direction: Higher is better
Tasks: 2 tasks
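
As a rough illustration of exact-match scoring with mean aggregation, the sketch below scores each item as 1.0 or 0.0 and averages the results. The function names are hypothetical, and stripping surrounding whitespace before comparing is an assumption; the page does not say how the runner normalizes answers.

```python
def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 when the two strings are identical, else 0.0.

    Assumption: surrounding whitespace is stripped before comparing.
    The actual normalization used by the eval runner is not documented here.
    """
    return 1.0 if prediction.strip() == reference.strip() else 0.0


def mean_score(predictions: list[str], references: list[str]) -> float:
    """Average the per-item exact-match scores for one task."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty task")
    scores = [exact_match(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)
```

With five items per task, each item moves a task score by 20 percentage points, so this eval is best read as a pass/fail sanity check rather than a fine-grained benchmark.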

Default Run Config

TopP: 1
Temperature: 0

Task                       Dataset         Weight  Shots    Max Tokens
Basic Math (basic_math)    5 inline items  0.5     Default  16
Basic Logic (basic_logic)  5 inline items  0.5     Default  8
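
One way to picture the default run config is as plain data plus a weighted combination of the per-task scores. This is a minimal sketch, not the site's implementation: `TaskConfig`, `SAMPLING`, `TASKS`, and `overall_score` are hypothetical names, and combining task scores as a weight-normalized sum is an assumption based on the 0.5 weights and the Mean aggregation listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskConfig:
    """Per-task settings taken from the default run config."""
    name: str
    weight: float
    max_tokens: int


# Sampling settings from the page: greedy decoding, no nucleus truncation.
SAMPLING = {"temperature": 0.0, "top_p": 1.0}

TASKS = [
    TaskConfig(name="basic_math", weight=0.5, max_tokens=16),
    TaskConfig(name="basic_logic", weight=0.5, max_tokens=8),
]


def overall_score(task_scores: dict[str, float],
                  tasks: list[TaskConfig] = TASKS) -> float:
    """Combine per-task scores into one number, weighted by task weight.

    Assumption: the overall score is the weight-normalized sum of task scores.
    """
    total_weight = sum(t.weight for t in tasks)
    if total_weight <= 0:
        raise ValueError("task weights must sum to a positive value")
    weighted = sum(t.weight * task_scores[t.name] for t in tasks)
    return weighted / total_weight
```

Under these assumptions, a model that answers every math item correctly but only half of the logic items would get `overall_score({"basic_math": 1.0, "basic_logic": 0.5})`, which is 0.75.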

Leaderboard (best run per model)

#   Model               Score   Quant   Hardware
    Qwen3.6-27B (Qwen)  100.0%  IQ4_NL  NVIDIA GeForce RTX 3090

Task Breakdown (top model)

basic_logic: 100.0% (0 samples)
basic_math: 100.0% (0 samples)