Mixtral 8×22B

by Mistral

open weights

Mixtral family

Parameters

141B

Context window

64K tokens

Released

2024-04-10

Input price / 1M tok

$1.20

Output price / 1M tok

$1.20

Modality

tools
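As a rough illustration of what the listed prices mean in practice, the sketch below estimates spend for a batch of requests. The two prices come from this page; the workload numbers (request count and token counts) are made up for the example, and actual provider billing may differ.

```python
# Listed prices from this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 1.20
OUTPUT_PRICE_PER_M = 1.20


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD for the given input and output token counts."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Hypothetical workload: 10,000 requests, each with 2,000 input tokens
# and 500 output tokens.
requests = 10_000
total = estimate_cost_usd(requests * 2_000, requests * 500)
print(f"${total:.2f}")
```

Because input and output are priced identically here, only the total token count matters for the estimate.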

About Mixtral 8×22B

Mistral's flagship MoE: 8 experts of nominally 22B each, with 39B parameters active per token. Fully Apache 2.0 licensed. Delivers Llama-3-70B-class quality at roughly 2x the throughput on the same hardware.
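The name suggests 8 × 22B = 176B parameters, yet the listed total is 141B, because attention and other non-expert weights are shared rather than duplicated per expert. A back-of-the-envelope sketch of that split follows. It assumes 2 of the 8 experts are routed per token (Mixtral's top-2 design, not stated on this page) and that every non-expert weight is shared; the inputs are rounded, so the results are approximate.

```python
def split_moe_params(total_b, active_b, n_experts, experts_per_token):
    """Solve for per-expert and shared parameter counts (in billions).

    Model:  total  = shared + n_experts * per_expert
            active = shared + experts_per_token * per_expert
    """
    per_expert = (total_b - active_b) / (n_experts - experts_per_token)
    shared = active_b - experts_per_token * per_expert
    return per_expert, shared


# 141B total and 39B active are from this page; top-2 routing is an assumption.
per_expert, shared = split_moe_params(141, 39, n_experts=8, experts_per_token=2)
print(f"per expert: ~{per_expert:.0f}B, shared: ~{shared:.0f}B")
# prints "per expert: ~17B, shared: ~5B"
```

Under these assumptions each token touches roughly 39/141 ≈ 28% of the weights, which is where the throughput advantage over a dense model of similar quality comes from.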

Strengths

Apache 2.0 licence
MoE throughput advantage
Strong tool use
Cheap on Together/Fireworks

Weaknesses

64K context
Superseded by newer Mistral models
No vision

Best for

Cost-sensitive production
Self-hosting on 4×H100
Code
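The 4×H100 suggestion can be sanity-checked with simple memory arithmetic. The sketch below assumes the 80 GB H100 variant and counts weights only; KV cache, activations, and framework overhead are not modeled, so the headroom figures are upper bounds. All experts must be resident in GPU memory even though only 39B parameters are active per token, so the full 141B is what counts.

```python
PARAMS_B = 141        # total parameters in billions (from this page)
GPU_COUNT = 4
GPU_MEMORY_GB = 80    # assumption: 80 GB H100 variant

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_b: float, dtype: str) -> float:
    """Weights-only footprint in GB (1e9 params x bytes each / 1e9 bytes per GB)."""
    return params_b * BYTES_PER_PARAM[dtype]


total_gpu_gb = GPU_COUNT * GPU_MEMORY_GB
for dtype in BYTES_PER_PARAM:
    need = weight_memory_gb(PARAMS_B, dtype)
    print(f"{dtype}: weights {need:.1f} GB, headroom {total_gpu_gb - need:.1f} GB")
```

At bf16 the weights take about 282 GB of the 320 GB available, leaving limited room for KV cache at long context; quantized weights free up considerably more.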
