Google Vertex AI vs Meta Llama Pricing
Per-million-token pricing for Google Vertex AI and Meta Llama, with side-by-side flagship models, cheapest tiers, and context windows. Pricing data syncs weekly from a continuously updated model catalog; last updated May 22, 2026.
As of May 22, 2026, Meta Llama offers the lowest output-token price of the two providers: $0.02 per 1M tokens with Llama 3.2 3B Instruct.
Who wins on what
Cheapest input tokens
$0.017/1M (Google Vertex AI)
Gemma 3 4B: $0.017/1M input
Cheapest output tokens
$0.02/1M (Meta Llama)
Llama 3.2 3B Instruct: $0.02/1M output
Longest context window
2.0M (Google Vertex AI)
Gemini 3/3.1 (> 200k context): 2.0M input tokens
Lowest average output cost
$0.67/1M (Meta Llama)
Provider-wide average across 22 models
Largest model catalog
49 models (Google Vertex AI)
More options to match cost vs capability
Most reasoning models
10 models (Google Vertex AI)
Models with dedicated reasoning / thinking support
Most vision models
10 models (Google Vertex AI)
Models that accept image input
Side-by-side
Google Vertex AI
| Metric | Value | Model or scope |
|---|---|---|
| Cheapest input | $0.017 | Gemma 3 4B |
| Cheapest output | $0.040 | Gemma 3n 4B |
| Longest context | 2.0M | Gemini 3/3.1 (> 200k context) |
| Avg output / 1M | $6.14 | Across catalog |
| Cheapest cached input | $0.025 | Gemini 3.1 Flash Lite |

| Model | In/1M | Out/1M | Ctx |
|---|---|---|---|
| Gemini 3.5 Flash (Vision, Reasoning, Tools, Cache) | $1.50 | $9.00 | 1.0M |
| Gemini 3.1 Flash Lite (Vision, Reasoning, Tools, Cache) | $0.250 | $1.50 | 1.0M |
| Gemini 3.1 Flash Lite Preview (Vision, Reasoning, Tools, Cache) | $0.250 | $1.50 | 1.0M |
| Nano Banana 2 (Vision, Reasoning) | $0.500 | $60.00 | 66K |
| Gemini 3.1 Pro Preview (Vision, Reasoning, Tools, Cache) | $2.00 | $12.00 | 1.0M |
| Gemma 3n 4B | $0.020 | $0.040 | 33K |
Meta Llama
| Metric | Value | Model or scope |
|---|---|---|
| Cheapest input | $0.020 | Llama 3.1 8B Instruct |
| Cheapest output | $0.020 | Llama 3.2 3B Instruct |
| Longest context | 1.0M | Llama 4 Maverick |
| Avg output / 1M | $0.674 | Across catalog |

| Model | In/1M | Out/1M | Ctx |
|---|---|---|---|
| Llama 3.1 405B (base) | $4.00 | $4.00 | 33K |
| Llama 3.1 405B Instruct | $3.50 | $3.50 | 131K |
| Llama 4 Maverick | $0.150 | $0.600 | 1.0M |
| Llama 3 70B Instruct | $0.300 | $0.400 | 8K |
| Llama 3.1 70B Instruct | $0.400 | $0.400 | 131K |
| Llama 3.2 3B Instruct | $0.020 | $0.020 | 131K |
All prices in USD per 1 million tokens. Showing top 6 models per provider, sorted by output cost.
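To turn a per-million-token price into the cost of a single request, divide each token count by one million and multiply by the matching rate. The sketch below is illustrative only: the function name `request_cost` and the token counts (10,000 input, 1,000 output) are assumptions, while the prices are copied from the Llama 4 Maverick and Gemini 3.1 Flash Lite rows above.

```python
def request_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Return the USD cost of one request, given prices per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * in_price_per_m
    output_cost = output_tokens / 1_000_000 * out_price_per_m
    return input_cost + output_cost


# Hypothetical request size: 10,000 input tokens and 1,000 output tokens.
# Prices are the In/1M and Out/1M values listed in the tables.
maverick = request_cost(10_000, 1_000, in_price_per_m=0.150, out_price_per_m=0.600)
flash_lite = request_cost(10_000, 1_000, in_price_per_m=0.250, out_price_per_m=1.50)

print(f"Llama 4 Maverick:      ${maverick:.4f}")    # $0.0021
print(f"Gemini 3.1 Flash Lite: ${flash_lite:.4f}")  # $0.0040
```

Because input and output are billed at different rates, the cheaper model for a given workload depends on the ratio of input to output tokens, not on either price alone.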
Run the numbers for your workload
Calcaas multiplies per-token costs by your real usage patterns (inputs, outputs, retries, and conversation history) across both providers in one model.
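As a rough illustration of why retries and conversation history matter, the sketch below estimates the cost of one multi-turn conversation. It is not Calcaas's actual method: the names `Pricing`, `conversation_tokens`, and `conversation_cost` are hypothetical, and it assumes that the full history is re-sent as input on every turn and that each retry repeats a whole request once, so the expected cost scales by `1 + retry_rate`.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    in_per_m: float   # USD per 1M input tokens
    out_per_m: float  # USD per 1M output tokens


def conversation_tokens(turns, user_tokens, reply_tokens):
    """Total billed (input, output) tokens when history is re-sent each turn."""
    total_in = total_out = history = 0
    for _ in range(turns):
        total_in += history + user_tokens      # prior turns plus the new message
        total_out += reply_tokens
        history += user_tokens + reply_tokens  # history grows every turn
    return total_in, total_out


def conversation_cost(pricing, turns, user_tokens, reply_tokens, retry_rate=0.0):
    """Expected USD cost of one conversation, inflated by the retry rate."""
    total_in, total_out = conversation_tokens(turns, user_tokens, reply_tokens)
    base = (total_in * pricing.in_per_m + total_out * pricing.out_per_m) / 1_000_000
    return base * (1 + retry_rate)


# Hypothetical workload: 3 turns, 500-token messages, 300-token replies.
maverick = Pricing(in_per_m=0.150, out_per_m=0.600)   # Llama 4 Maverick row
print(conversation_tokens(3, 500, 300))               # (3900, 900)
print(conversation_cost(maverick, 3, 500, 300, retry_rate=0.1))
```

Under these assumptions, input tokens grow roughly quadratically with the number of turns, so input price tends to dominate the bill for long conversations even when output tokens cost more per unit.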