🤖 AI Toolset

高用量场景下的最便宜与免费 AI API (2026)

2026 年 4 月更新 · 每百万 token 计价 · 均附来源链接

🎯 本页适合谁?

需要大规模用 AI、但不必追求前沿模型质量:文本分类、情感分析、关键词/实体抽取、摘要、翻译、审核、数据 enrichment、批量处理等。也适合原型、测试与 side project 希望 $0 成本的场景。

🌐 免费模型聚合平台

下列平台聚合多家提供商的免费模型,一个 API Key 即可访问多种免费模型,多数无需信用卡。

平台 免费模型 速率限制 亮点 来源
OpenRouter 29 free models including Llama 4, GPT-oss, DeepSeek R1, Qwen3, GLM-4.5 Air, Mistral, Hermes 405B [1] 20 RPM, 200 RPD per model Largest free model selection; unified API; tools support openrouter.ai
硅基流动 SiliconFlow DeepSeek V3/R1, Qwen3-8B/32B, GLM-4-9B, Llama 4 Scout [2] 1000 RPM per model Highest free RPM; 中国直连; OpenAI-compatible API siliconflow.cn
Google AI Studio Gemini 2.5 Flash, Flash-Lite, Gemma 3 [3] 15-30 RPM, 1-2M TPM, 250 RPD Best free quality; 1M context window; vision support ai.google.dev
Groq Llama 4 Scout/Maverick, Mixtral, Gemma 2 [4] Free tier with limited RPM Fastest inference (~100ms); real-time apps groq.com
Cloudflare Workers AI Llama 3.3, Mistral 7B, Phi-4 mini [5] Free: 10K neurons/day Edge inference; no cold start; global CDN cloudflare.com
Mistral AI (La Plateforme) Mistral Small, Codestral [6] Free tier with rate limits European data residency; good coding model mistral.ai

⭐ 聚合平台上的代表免费模型

经 OpenRouter 或 SiliconFlow 等聚合平台可访问的优质免费模型(数据格与英文页一致)。

模型 免费接入途径 上下文 能力 最适合
GPT-oss-120BOpenRouter [1]131KToolsGeneral Q&A, coding, agents
Hermes 3 Llama 3.1 405BOpenRouter [1]131KLargest free model, strong reasoning
DeepSeek R1OpenRouter, SiliconFlow [1]128KReasoningChain-of-thought, math, coding
DeepSeek V3.2SiliconFlow [2]128KBest free quality for general tasks
Qwen3 32BSiliconFlow, OpenRouter [2]128KToolsMultilingual, coding, Chinese tasks
GLM-4.5 AirOpenRouter [1]128KChinese tasks, general purpose
Llama 4 Scout (17B)Groq, SiliconFlow [4]128K⚡ Fastest; real-time chat, agents
Gemini 2.5 FlashGoogle AI Studio [3]1MVisionLongest free context; image understanding
Gemini 2.5 Flash-LiteGoogle AI Studio [3]1MHighest free volume + 1M context
NVIDIA Nemotron Nano 12BOpenRouter [1]8KLightweight, fast, simple tasks
Liquid LFM 2.5 1.2BOpenRouter [1]66KUltra-lightweight; fastest free model

💰 超低价付费 API(输入 < $0.30/MTok)

模型 提供商 输入 $/MTok 输出 $/MTok 上下文 最适合 来源
GPT-4.1 Nano OpenAI $0.10 $0.40 1M Classification, extraction, simple Q&A openai.com
Gemini 3.1 Flash-Lite Google $0.10 $0.40 1M Cheapest per-token with massive context ai.google.dev
Groq (Llama 4 Scout) Groq $0.11 $0.34 128K ⚡ Fastest inference (~100ms), real-time apps groq.com
DeepSeek V3.2 DeepSeek $0.14 $0.28 128K Best quality/price ratio, coding & reasoning deepseek.com
Gemini 2.5 Flash Google $0.15 $0.60 1M Great quality at budget price, long context ai.google.dev
DeepSeek R1 DeepSeek $0.55 $2.19 128K Cheapest reasoning model (chain-of-thought) deepseek.com
o4 Mini OpenAI $0.55 $2.20 200K Reasoning tasks, coding, math openai.com
Claude Haiku 3.5 Anthropic $0.80 $4.00 200K Budget Anthropic option, fast responses anthropic.com
📊 Real Cost Comparison: 10M Input Tokens + 5M Output Tokens
GPT-4.1 Nano
$3.00
Gemini 3.1 Flash-Lite
$3.00
Groq Llama 4
$2.80
DeepSeek V3.2
$2.80
Gemini 2.5 Flash
$4.50
Claude Haiku 3.5
$28.00
GPT-4o (for comparison)
$37.50

💡 同等工作量下,预算型模型比 GPT-4o 便宜 7–13 倍;简单任务质量差距通常不大。

🎯 不同任务该选哪款便宜模型?

任务 推荐模型 原因 约每百万次请求成本
Sentiment AnalysisGPT-4.1 NanoShort input/output, simple classification~$1-3
Keyword/Entity ExtractionGemini Flash-LiteCheapest per-token, handles structured output~$2-5
Translation (high volume)DeepSeek V3.2Excellent multilingual at lowest cost~$5-15
SummarizationGemini 2.5 Flash1M context window, good quality~$5-20
Content ModerationGroq (Llama 4)Sub-100ms latency for real-time~$3-8
Data EnrichmentDeepSeek V3.2Good at structured data, cheapest~$5-15
Email Auto-ReplyClaude Haiku 3.5Best tone/quality at budget price~$10-30
Code Review (bulk)DeepSeek V3.2Strong coding ability, very cheap~$10-25

🏠 自托管:极致低价方案

若用量极大、追求最低单价,可自托管开源权重模型,仅付 GPU 租金、无按 token 计费。

模型最低显存GPU 成本(估)折合每 token
Llama 4 Scout (17B)24GB~$0.50/hr (A10G)~$0.001/MTok (effectively free)
Qwen3 14B16GB~$0.40/hr (T4)~$0.001/MTok
DeepSeek V3.2 (full)8×H100~$16/hrOnly worthwhile at extreme scale
Gemma 3 4B8GB~$0.25/hr (T4)Cheapest self-hosted option

GPU 成本参考 AWS/GCP/Spot。可用 vast.ai or runpod.io for cheaper community GPUs.

📋 官方定价来源

  1. OpenRouter Free Models Collection — 29 free models, no credit card
  2. 硅基流动 SiliconFlow 定价 — 1000 RPM free, Chinese open-source models
  3. Google AI Studio Pricing (Gemini) — Free Gemini Flash
  4. Groq Pricing — Free tier with ultra-fast inference
  5. Cloudflare Workers AI Models — Edge inference free tier
  6. Mistral AI Pricing — Free tier for Mistral Small
  7. OpenAI API Pricing — GPT-4.1 Nano at $0.10/MTok
  8. DeepSeek API Pricing — V3.2 at $0.14/MTok
  9. Anthropic API Pricing — Haiku 3.5 at $0.80/MTok

价格为截至 2026 年 4 月的公开费率;免费档限额可能变动,购买前请以官网为准。

更多对比

继续浏览