Marketplace

TEST Sale 3090

Used hardware listed by Lottolabs

$900.00

Hardware details

Testing marketplace sale of 3090

Linked benchmark runs

Proof-of-performance results attached by the seller.

2 runs
Qwen3.6-27B

2x NVIDIA GeForce RTX 3090 · llama.cpp / IQ4_NL · 4/25/2026

42.1 tok/s · 229ms TTFT · VRAM n/a

Run ID: cmodt5m380002l204yqwh7l4a
Model: unsloth/Qwen3.6-27B
Display name: Qwen3.6-27B
Revision: main
Family: Qwen
Parameters: 28B
Active params: n/a
MoE: no
Output tok/s: 42.1
Prefill tok/s: n/a
Total tok/s: n/a
TTFT: 228.8ms
Peak VRAM: n/a
Prompt tokens: 25
Output tokens: 256
Prefill tokens: n/a
Context length: 131,712
Batch size: 1
Hardware class: DISCRETE_GPU
Hardware: 2x NVIDIA GeForce RTX 3090
GPU slots: n/a
GPU count: 2
VRAM: 48GB
Chip vendor: n/a
Chip family: n/a
Chip variant: n/a
Unified memory: n/a
NPU TOPS: n/a
CPU: AMD 7790
RAM: 96GB
OS: Windows 11
Power: n/a
Engine: llama.cpp
Engine version: n/a
Quantization: IQ4_NL
Backend: n/a
Tensor parallel: n/a
Pipeline parallel: n/a
GPU layers: n/a
Split mode: n/a
KV cache dtype: n/a
KV cache size: n/a
Prefix caching: n/a
Attention backend: n/a
Flash attention: n/a
Chunked prefill: n/a
Prefill chunk: n/a
Continuous batching: n/a
CPU offload: n/a
CPU layers: n/a
Rope scaling: n/a
Rope scale: n/a
Yarn ext factor: n/a
Engine quant: n/a
SGLang quant: n/a
GPU mem util: n/a
Max running seqs: n/a
Scheduler delay: n/a
Num parallel: n/a
Concurrency: n/a
Spec decoding: no
Spec method: n/a
Spec model: n/a
Spec draft model: n/a
Spec tokens: n/a
Spec ngram: n/a
Spec draft TP: n/a
MTP enabled: no
MTP draft layers: n/a
Temperature: n/a
Top P: n/a
Top K: n/a
Min P: n/a
Repeat penalty: n/a
Mirostat: n/a
Command:
Extra flags: n/a
Notes: LM Studio on Win11, 96GB RAM, dual RTX 3090 (48GB VRAM), flash_attention=true, parallel=2 instances
Submitted: 4/25/2026, 3:56:57 AM
Last edited: n/a

Qwen3.6-27B

2x NVIDIA GeForce RTX 3090 · llama.cpp / IQ4_NL · 4/28/2026

40.4 tok/s · 463ms TTFT · VRAM n/a

Run ID: cmoi5t8qc0008ib04atua9wzl
Model: unsloth/Qwen3.6-27B
Display name: Qwen3.6-27B
Revision: main
Family: Qwen
Parameters: 28B
Active params: n/a
MoE: no
Output tok/s: 40.4
Prefill tok/s: n/a
Total tok/s: n/a
TTFT: 463ms
Peak VRAM: n/a
Prompt tokens: 30
Output tokens: 256
Prefill tokens: n/a
Context length: 131,712
Batch size: 1
Hardware class: DISCRETE_GPU
Hardware: 2x NVIDIA GeForce RTX 3090
GPU slots: n/a
GPU count: 2
VRAM: 48GB
Chip vendor: n/a
Chip family: n/a
Chip variant: n/a
Unified memory: n/a
NPU TOPS: n/a
CPU: AMD 7790
RAM: 96GB
OS: Windows 11
Power: n/a
Engine: llama.cpp
Engine version: n/a
Quantization: IQ4_NL
Backend: cuda
Tensor parallel: n/a
Pipeline parallel: n/a
GPU layers: n/a
Split mode: n/a
KV cache dtype: n/a
KV cache size: n/a
Prefix caching: n/a
Attention backend: n/a
Flash attention: n/a
Chunked prefill: n/a
Prefill chunk: n/a
Continuous batching: n/a
CPU offload: n/a
CPU layers: n/a
Rope scaling: n/a
Rope scale: n/a
Yarn ext factor: n/a
Engine quant: n/a
SGLang quant: n/a
GPU mem util: n/a
Max running seqs: n/a
Scheduler delay: n/a
Num parallel: n/a
Concurrency: n/a
Spec decoding: no
Spec method: n/a
Spec model: n/a
Spec draft model: n/a
Spec tokens: n/a
Spec ngram: n/a
Spec draft TP: n/a
MTP enabled: no
MTP draft layers: n/a
Temperature: n/a
Top P: n/a
Top K: n/a
Min P: n/a
Repeat penalty: n/a
Mirostat: n/a
Command:
Extra flags: n/a
Notes: Test submission - LM Studio on Win11, 96GB RAM, 2x RTX 3090. Flash attention enabled.
Submitted: 4/28/2026, 5:02:19 AM
Last edited: n/a
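
For anyone comparing the two linked runs, the reported Output tok/s, TTFT, and Output tokens fields are enough for a rough end-to-end latency estimate. The sketch below is illustrative only: the `Run` class and its method names are hypothetical, and it assumes that Output tok/s is measured over the decode phase alone (excluding TTFT), which the listing does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """Hypothetical container for the fields reported on a linked run."""
    run_id: str
    output_tok_s: float   # "Output tok/s" field
    ttft_ms: float        # "TTFT" field, in milliseconds
    output_tokens: int    # "Output tokens" field

    def decode_seconds(self) -> float:
        # Time spent generating tokens, assuming tok/s excludes TTFT.
        return self.output_tokens / self.output_tok_s

    def total_seconds(self) -> float:
        # Estimated wall-clock time: time to first token plus decode time.
        return self.ttft_ms / 1000.0 + self.decode_seconds()


# Values copied from the two runs attached to this listing.
runs = [
    Run("cmodt5m380002l204yqwh7l4a", 42.1, 228.8, 256),
    Run("cmoi5t8qc0008ib04atua9wzl", 40.4, 463.0, 256),
]

for r in runs:
    print(f"{r.run_id}: decode ~{r.decode_seconds():.2f}s, "
          f"total ~{r.total_seconds():.2f}s")

# Relative change in output throughput from the first run to the second.
change = (runs[1].output_tok_s - runs[0].output_tok_s) / runs[0].output_tok_s
print(f"throughput change: {change:+.1%}")
```

Under that assumption the second run is about 4% slower in output throughput. Both runs are single requests with short prompts (25 and 30 tokens), so the gap could be run-to-run variance as much as a real difference in configuration.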