Community benchmark suites for evaluating local LLM quality. Submit results via the API.
Approved suites will appear here. Submit a suite via the API.