DeepSeek vs Meta Llama Pricing

Per-million-token pricing for DeepSeek and Meta Llama, with side-by-side flagship models, cheapest tiers, and context windows. Pricing data syncs weekly from a continuously updated model catalog and was last updated May 22, 2026.

As of May 22, 2026, Meta Llama offers the lowest output-token price: $0.02/1M for Llama 3.2 3B Instruct. DeepSeek's cheapest output price is $0.100/1M, for DeepSeek R1 0528 Qwen3 8B.

Who wins on what

Cheapest input tokens

$0.02/1M

DeepSeek

DeepSeek R1 0528 Qwen3 8B at $0.02/1M input (tied with Meta Llama's Llama 3.1 8B Instruct, also listed at $0.020/1M)

Cheapest output tokens

$0.02/1M

Meta Llama

Llama 3.2 3B Instruct — $0.02/1M output

Longest context window

1.0M

Meta Llama

Llama 4 Maverick at 1.0M input tokens (tied with DeepSeek Chat, also listed at 1.0M)

Lowest average output cost

$0.67/1M

Meta Llama

Provider-wide average across 22 models

Largest model catalog

22 models

Meta Llama

More options to match cost vs capability

Most reasoning models

3 models

DeepSeek

Models with dedicated reasoning / thinking support

Open-weights available

Yes

DeepSeek

Offers open-weight models you can self-host

Side-by-side

21 models

DeepSeek

Cheapest input

$0.020

DeepSeek R1 0528 Qwen3 8B

Cheapest output

$0.100

DeepSeek R1 0528 Qwen3 8B

Longest context

1.0M

DeepSeek Chat

Avg output / 1M

$0.915

Across catalog

Cheapest cached input

$0.028

DeepSeek Chat

| Model | In/1M | Out/1M | Ctx | Features |
|---|---|---|---|---|
| DeepSeek V4 Pro | $1.74 | $3.48 | 1.0M | Reasoning, Tools, Cache |
| DeepSeek V4 Flash | $0.140 | $0.280 | 1.0M | Reasoning, Tools, Cache |
| DeepSeek Chat | $0.140 | $0.280 | 1.0M | Tools, Cache |
| DeepSeek Reasoner | $0.140 | $0.280 | 1.0M | Reasoning, Tools, Cache |
| DeepSeek R1 | $0.550 | $2.19 | 66K | |
| DeepSeek R1 0528 Qwen3 8B | $0.020 | $0.100 | 33K | |

22 models

Meta Llama

Cheapest input

$0.020

Llama 3.1 8B Instruct

Cheapest output

$0.020

Llama 3.2 3B Instruct

Longest context

1.0M

Llama 4 Maverick

Avg output / 1M

$0.674

Across catalog

| Model | In/1M | Out/1M | Ctx |
|---|---|---|---|
| Llama 3.1 405B (base) | $4.00 | $4.00 | 33K |
| Llama 3.1 405B Instruct | $3.50 | $3.50 | 131K |
| Llama 4 Maverick | $0.150 | $0.600 | 1.0M |
| Llama 3 70B Instruct | $0.300 | $0.400 | 8K |
| Llama 3.1 70B Instruct | $0.400 | $0.400 | 131K |
| Llama 3.2 3B Instruct | $0.020 | $0.020 | 131K |

All prices in USD per 1 million tokens. Showing top 6 models per provider, sorted by output cost.
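
Because every price is quoted per 1 million tokens, the cost of one request is the token count divided by 1,000,000, multiplied by the listed rate, summed over input and output. The following is a minimal sketch in Python using two rows from the tables above. The `PRICES` dictionary, the `request_cost` function, and the token counts are illustrative assumptions, not part of either provider's API.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the tables above.
# Only two models are included here for illustration.
PRICES = {
    "DeepSeek Chat": (Decimal("0.140"), Decimal("0.280")),
    "Llama 4 Maverick": (Decimal("0.150"), Decimal("0.600")),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of a single request at the listed prices."""
    in_price, out_price = PRICES[model]
    total = input_tokens * in_price + output_tokens * out_price
    return total / Decimal(1_000_000)

if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens and 500 output tokens.
    for name in PRICES:
        print(name, request_cost(name, 2_000, 500))
```

For the hypothetical 2,000-in / 500-out request, this works out to $0.00042 on DeepSeek Chat and $0.0006 on Llama 4 Maverick. `Decimal` is used instead of floats so that small per-request amounts do not pick up binary rounding noise.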

Run the numbers for your workload

Calcaas multiplies per-token costs by your real usage patterns — inputs, outputs, retries, and conversation history — across both providers in one model.
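
To make the effect of retries and conversation history concrete, here is a rough Python sketch of a workload estimate. It is not Calcaas's actual method, which this page does not describe. It assumes that each turn re-sends the full prior history as input, and that a retry re-sends the whole call once. The `Workload` class, the `monthly_cost` function, and all field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    conversations: int       # conversations per month
    turns: int               # user turns per conversation
    user_tokens: int         # tokens in each user message
    reply_tokens: int        # tokens in each model reply
    retry_rate: float = 0.0  # assumed fraction of calls re-sent once

def monthly_cost(w: Workload, in_price: float, out_price: float) -> float:
    """Estimate monthly USD cost; prices are USD per 1M tokens."""
    input_tokens = 0
    history = 0
    for _ in range(w.turns):
        # Assumption: the whole history is re-sent as input on every turn.
        input_tokens += history + w.user_tokens
        history += w.user_tokens + w.reply_tokens
    output_tokens = w.turns * w.reply_tokens
    per_conversation = (
        input_tokens * in_price + output_tokens * out_price
    ) / 1_000_000
    # Assumption: a retried call costs the same as the original call.
    return w.conversations * per_conversation * (1 + w.retry_rate)

if __name__ == "__main__":
    w = Workload(conversations=1000, turns=3, user_tokens=100, reply_tokens=200)
    print("DeepSeek Chat:", monthly_cost(w, 0.140, 0.280))
    print("Llama 4 Maverick:", monthly_cost(w, 0.150, 0.600))
```

Under these assumptions, input tokens grow roughly quadratically with the number of turns, so for long conversations the input price can matter more than the output price. Cached-input discounts, such as the $0.028/1M listed for DeepSeek Chat, are not modeled here.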