Available Models

This page is auto-generated from GET /v1/models so the catalog stays aligned with the live Mesh API inventory.
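As a rough illustration of how the same inventory could be read programmatically, the sketch below requests `GET /v1/models` and lists the model IDs that carry the `:free` suffix seen in the catalog. Only the endpoint path comes from this page. The base URL, the bearer-token header, and the `{"data": [{"id": ...}]}` response shape are assumptions made for the example, so adjust them to the real API before relying on it.

```python
import json
import urllib.request

# Hypothetical base URL: this page names only the path GET /v1/models.
BASE_URL = "https://mesh.example.com"


def fetch_models(base_url=BASE_URL, api_key=None):
    """Request GET /v1/models and return the decoded JSON body."""
    request = urllib.request.Request(f"{base_url}/v1/models")
    if api_key:
        # Bearer authentication is an assumption, not confirmed by this page.
        request.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def free_model_ids(payload):
    """Return the sorted IDs that end with the ':free' suffix."""
    # Assumed response shape: {"data": [{"id": "..."}, ...]}.
    models = payload.get("data", [])
    return sorted(m["id"] for m in models if m.get("id", "").endswith(":free"))


# Offline sample built from IDs listed in this catalog (no network call).
sample = {
    "data": [
        {"id": "qwen/qwen3-coder:free"},
        {"id": "ai21/jamba-large-1.7"},
        {"id": "google/gemma-3-4b-it:free"},
    ]
}
print(free_model_ids(sample))
```

Filtering on the ID suffix is only a convenience: some Free-tier entries in the catalog (for example `google/lyria-3-clip-preview`) have no `:free` suffix, so a tier field, if the API returns one, would be the more reliable signal.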

Total models: 345

Name | Model ID | Provider | Tier | Context | Input (USD) | Output (USD) | Description
Arcee AI: Trinity Large Preview (free)
arcee-ai/trinity-large-preview:free | Arcee Ai | Free | 131 K | Free | Free

Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as a 400B…

Arcee AI: Trinity Mini (free)
arcee-ai/trinity-mini:free | Arcee Ai | Free | 131 K | Free | Free

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 1…

Google: Gemma 3 12B (free)
google/gemma-3-12b-it:free | Google | Free | 32.8 K | Free | Free

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles…

Google: Gemma 3 27B (free)
google/gemma-3-27b-it:free | Google | Free | 131 K | Free | Free

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles…

Google: Gemma 3 4B (free)
google/gemma-3-4b-it:free | Google | Free | 32.8 K | Free | Free

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles…

Google: Gemma 3n 2B (free)
google/gemma-3n-e2b-it:free | Google | Free | 8.19 K | Free | Free

Gemma 3n E2B IT is a multimodal, instruction-tuned model developed by Google DeepMind, designed…

Google: Gemma 3n 4B (free)
google/gemma-3n-e4b-it:free | Google | Free | 8.19 K | Free | Free

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as…

Google: Lyria 3 Clip Preview
google/lyria-3-clip-preview | Google | Free | 1.05 M | Free | Free

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google’s family of music gener…

Google: Lyria 3 Pro Preview
google/lyria-3-pro-preview | Google | Free | 1.05 M | Free | Free

Full-length songs are priced at $0.08 per song. Lyria 3 is Google’s family of music generation m…

LiquidAI: LFM2.5-1.2B-Instruct (free)
liquid/lfm-2.5-1.2b-instruct:free | Liquid | Free | 32.8 K | Free | Free

LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-de…

LiquidAI: LFM2.5-1.2B-Thinking (free)
liquid/lfm-2.5-1.2b-thinking:free | Liquid | Free | 32.8 K | Free | Free

LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data…

Meta: Llama 3.2 3B Instruct (free)
meta-llama/llama-3.2-3b-instruct:free | Meta Llama | Free | 131 K | Free | Free

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced…

Meta: Llama 3.3 70B Instruct (free)
meta-llama/llama-3.3-70b-instruct:free | Meta Llama | Free | 65.5 K | Free | Free

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned…

MiniMax: MiniMax M2.5 (free)
minimax/minimax-m2.5:free | Minimax | Free | 197 K | Free | Free

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a d…

Nous: Hermes 3 405B Instruct (free)
nousresearch/hermes-3-llama-3.1-405b:free | Nousresearch | Free | 131 K | Free | Free

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced…

NVIDIA: Nemotron 3 Nano 30B A3B (free)
nvidia/nemotron-3-nano-30b-a3b:free | Nvidia | Free | 256 K | Free | Free

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and…

NVIDIA: Nemotron 3 Super (free)
nvidia/nemotron-3-super-120b-a12b:free | Nvidia | Free | 262 K | Free | Free

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameter…

NVIDIA: Nemotron Nano 12B 2 VL (free)
nvidia/nemotron-nano-12b-v2-vl:free | Nvidia | Free | 128 K | Free | Free

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for…

NVIDIA: Nemotron Nano 9B V2 (free)
nvidia/nemotron-nano-9b-v2:free | Nvidia | Free | 128 K | Free | Free

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and d…

OpenAI: gpt-oss-120b (free)
openai/gpt-oss-120b:free | Openai | Free | 131 K | Free | Free

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from Open…

OpenAI: gpt-oss-20b (free)
openai/gpt-oss-20b:free | Openai | Free | 131 K | Free | Free

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 licens…

Qwen: Qwen3 Coder 480B A35B (free)
qwen/qwen3-coder:free | Qwen | Free | 262 K | Free | Free

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by…

Qwen: Qwen3 Next 80B A3B Instruct (free)
qwen/qwen3-next-80b-a3b-instruct:free | Qwen | Free | 262 K | Free | Free

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimize…

Qwen: Qwen3.6 Plus Preview (free)
qwen/qwen3.6-plus-preview:free | Qwen | Free | 1 M | Free | Free

Qwen 3.6 Plus Preview is the next-generation evolution of the Qwen Plus series, featuring an adv…

StepFun: Step 3.5 Flash (free)
stepfun/step-3.5-flash:free | Stepfun | Free | 256 K | Free | Free

Step 3.5 Flash is StepFun’s most capable open-source foundation model. Built on a sparse Mixture…

Venice: Uncensored (free)
cognitivecomputations/dolphin-mistral-24b-venice-edition:free | Cognitivecomputations | Free | 32.8 K | Free | Free

Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24…

Z.ai: GLM 4.5 Air (free)
z-ai/glm-4.5-air:free | Z Ai | Free | 131 K | Free | Free

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built f…

AI21: Jamba Large 1.7
ai21/jamba-large-1.7 | Ai21 | Paid | 256 K | 0.002 | 0.008

Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding…

AionLabs: Aion-1.0
aion-labs/aion-1.0 | Aion Labs | Paid | 131 K | 0.004 | 0.008

Aion-1.0 is a multi-model system designed for high performance across various tasks, including r…

AionLabs: Aion-1.0-Mini
aion-labs/aion-1.0-mini | Aion Labs | Paid | 131 K | 0.0007 | 0.0014

Aion-1.0-Mini 32B parameter model is a distilled version of the DeepSeek-R1 model, designed for…

AionLabs: Aion-2.0
aion-labs/aion-2.0 | Aion Labs | Paid | 131 K | 0.0008 | 0.0016

Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It…

AionLabs: Aion-RP 1.0 (8B)
aion-labs/aion-rp-llama-3.1-8b | Aion Labs | Paid | 32.8 K | 0.0008 | 0.0016

Aion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of the RPBench-Auto b…

AlfredPros: CodeLLaMa 7B Instruct Solidity
alfredpros/codellama-7b-instruct-solidity | Alfredpros | Paid | 4.1 K | 0.0008 | 0.0012

A fine-tuned 7-billion-parameter Code LLaMA - Instruct model to generate Solidity smart contract…

AllenAI: Olmo 2 32B Instruct
allenai/olmo-2-0325-32b-instruct | Allenai | Paid | 128 K | 0.00005 | 0.0002

OLMo-2 32B Instruct is a supervised instruction-finetuned variant of the OLMo-2 32B March 2025 b…

AllenAI: Olmo 3 32B Think
allenai/olmo-3-32b-think | Allenai | Paid | 65.5 K | 0.00015 | 0.0005

Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning,…

AllenAI: Olmo 3.1 32B Instruct
allenai/olmo-3.1-32b-instruct | Allenai | Paid | 65.5 K | 0.0002 | 0.0006

Olmo 3.1 32B Instruct is a large-scale, 32-billion-parameter instruction-tuned language model en…

AllenAI: Olmo 3.1 32B Think
allenai/olmo-3.1-32b-think | Allenai | Paid | 65.5 K | 0.00015 | 0.0005

Olmo 3.1 32B Think is a large-scale, 32-billion-parameter model designed for deep reasoning, com…

Amazon: Nova 2 Lite
amazon/nova-2-lite-v1 | Amazon | Paid | 1 M | 0.0003 | 0.0025

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process te…

Amazon: Nova Lite 1.0
amazon/nova-lite-v1 | Amazon | Paid | 3 K | 0.00006 | 0.00024

Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that is focused on fast proces…

Amazon: Nova Micro 1.0
amazon/nova-micro-v1 | Amazon | Paid | 128 K | 0.000035 | 0.00014

Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Ama…

Amazon: Nova Premier 1.0
amazon/nova-premier-v1 | Amazon | Paid | 1 M | 0.0025 | 0.0125

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning task…

Amazon: Nova Pro 1.0
amazon/nova-pro-v1 | Amazon | Paid | 3 K | 0.0008 | 0.0032

Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination…

Anthropic: Claude 3 Haiku
anthropic/claude-3-haiku | Anthropic | Paid | 2 K | 0.00025 | 0.00125

Claude 3 Haiku is Anthropic’s fastest and most compact model for near-instant responsiveness. Qu…

Anthropic: Claude 3.5 Haiku
anthropic/claude-3.5-haiku | Anthropic | Paid | 2 K | 0.0008 | 0.004

Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use…

Anthropic: Claude 3.5 Sonnet
anthropic/claude-3.5-sonnet | Anthropic | Paid | 2 K | 0.006 | 0.03

New Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the…

Anthropic: Claude 3.7 Sonnet
anthropic/claude-3.7-sonnet | Anthropic | Paid | 2 K | 0.003 | 0.015

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and probl…

Anthropic: Claude 3.7 Sonnet (thinking)
anthropic/claude-3.7-sonnet:thinking | Anthropic | Paid | 2 K | 0.003 | 0.015

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and probl…

Anthropic: Claude Haiku 4.5
anthropic/claude-haiku-4.5 | Anthropic | Paid | 2 K | 0.001 | 0.005

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intel…

Anthropic: Claude Opus 4
anthropic/claude-opus-4 | Anthropic | Paid | 2 K | 0.015 | 0.075

Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sust…

Anthropic: Claude Opus 4.1
anthropic/claude-opus-4.1 | Anthropic | Paid | 2 K | 0.015 | 0.075

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performan…

Anthropic: Claude Opus 4.5
anthropic/claude-opus-4.5 | Anthropic | Paid | 2 K | 0.005 | 0.025

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineeri…

Anthropic: Claude Opus 4.6
anthropic/claude-opus-4.6 | Anthropic | Paid | 1 M | 0.005 | 0.025

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is bu…

Anthropic: Claude Sonnet 4
anthropic/claude-sonnet-4 | Anthropic | Paid | 2 K | 0.003 | 0.015

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excellin…

Anthropic: Claude Sonnet 4.5
anthropic/claude-sonnet-4.5 | Anthropic | Paid | 1 M | 0.003 | 0.015

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world ag…

Anthropic: Claude Sonnet 4.6
anthropic/claude-sonnet-4.6 | Anthropic | Paid | 1 M | 0.003 | 0.015

Sonnet 4.6 is Anthropic’s most capable Sonnet-class model yet, with frontier performance across…

Arcee AI: Coder Large
arcee-ai/coder-large | Arcee Ai | Paid | 32.8 K | 0.0005 | 0.0008

Coder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been further trained on…

Arcee AI: Maestro Reasoning
arcee-ai/maestro-reasoning | Arcee Ai | Paid | 131 K | 0.0009 | 0.0033

Maestro Reasoning is Arcee’s flagship analysis model: a 32 B‑parameter derivative of Qwen 2.5‑32…

Arcee AI: Spotlight
arcee-ai/spotlight | Arcee Ai | Paid | 131 K | 0.00018 | 0.00018

Spotlight is a 7‑billion‑parameter vision‑language model derived from Qwen 2.5‑VL and fine‑tuned…

Arcee AI: Trinity Mini
arcee-ai/trinity-mini | Arcee Ai | Paid | 131 K | 0.000045 | 0.00015

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 1…

Arcee AI: Virtuoso Large
arcee-ai/virtuoso-large | Arcee Ai | Paid | 131 K | 0.00075 | 0.0012

Virtuoso‑Large is Arcee’s top‑tier general‑purpose LLM at 72 B parameters, tuned to tackle cross…

Baidu: ERNIE 4.5 21B A3B
baidu/ernie-4.5-21b-a3b | Baidu | Paid | 12 K | 0.00007 | 0.00028

A sophisticated text-based Mixture-of-Experts (MoE) model featuring 21B total parameters with 3B…

Baidu: ERNIE 4.5 21B A3B Thinking
baidu/ernie-4.5-21b-a3b-thinking | Baidu | Paid | 131 K | 0.00007 | 0.00028

ERNIE-4.5-21B-A3B-Thinking is Baidu’s upgraded lightweight MoE model, refined to boost reasoning…

Baidu: ERNIE 4.5 300B A47B
baidu/ernie-4.5-300b-a47b | Baidu | Paid | 123 K | 0.00028 | 0.0011

ERNIE-4.5-300B-A47B is a 300B parameter Mixture-of-Experts (MoE) language model developed by Bai…

Baidu: ERNIE 4.5 VL 28B A3B
baidu/ernie-4.5-vl-28b-a3b | Baidu | Paid | 30 K | 0.00014 | 0.00056

A powerful multimodal Mixture-of-Experts chat model featuring 28B total parameters with 3B activ…

Baidu: ERNIE 4.5 VL 424B A47B
baidu/ernie-4.5-vl-424b-a47b | Baidu | Paid | 123 K | 0.00042 | 0.00125

ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE 4.5 ser…

ByteDance Seed: Seed 1.6
bytedance-seed/seed-1.6 | Bytedance Seed | Paid | 262 K | 0.00025 | 0.002

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimo…

ByteDance Seed: Seed 1.6 Flash
bytedance-seed/seed-1.6-flash | Bytedance Seed | Paid | 262 K | 0.000075 | 0.0003

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting bot…

ByteDance Seed: Seed-2.0-Lite
bytedance-seed/seed-2.0-lite | Bytedance Seed | Paid | 262 K | 0.00025 | 0.002

Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimoda…

ByteDance Seed: Seed-2.0-Mini
bytedance-seed/seed-2.0-mini | Bytedance Seed | Paid | 262 K | 0.0001 | 0.0004

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasi…

ByteDance: UI-TARS 7B
bytedance/ui-tars-1.5-7b | Bytedance | Paid | 128 K | 0.0001 | 0.0002

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, includin…

Cohere: Command A
cohere/command-a | Cohere | Paid | 256 K | 0.0025 | 0.01

Command A is an open-weights 111B parameter model with a 256k context window focused on deliveri…

Cohere: Command R (08-2024)
cohere/command-r-08-2024 | Cohere | Paid | 128 K | 0.00015 | 0.0006

command-r-08-2024 is an update of Command R with improved perfor…

Cohere: Command R+ (08-2024)
cohere/command-r-plus-08-2024 | Cohere | Paid | 128 K | 0.0025 | 0.01

command-r-plus-08-2024 is an update of Command R+ with roug…

Cohere: Command R7B (12-2024)
cohere/command-r7b-12-2024 | Cohere | Paid | 128 K | 0.0000375 | 0.00015

Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 202…

Deep Cogito: Cogito v2.1 671B
deepcogito/cogito-v2.1-671b | Deepcogito | Paid | 128 K | 0.00125 | 0.00125

Cogito v2.1 671B MoE represents one of the strongest open models globally, matching performance…

DeepSeek: DeepSeek V3
deepseek/deepseek-chat | Deepseek | Paid | 164 K | 0.00032 | 0.00089

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following…

DeepSeek: DeepSeek V3 0324
deepseek/deepseek-chat-v3-0324 | Deepseek | Paid | 164 K | 0.0002 | 0.00077

DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship…

DeepSeek: DeepSeek V3.1
deepseek/deepseek-chat-v3.1 | Deepseek | Paid | 32.8 K | 0.00015 | 0.00075

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both…

DeepSeek: DeepSeek V3.1 Terminus
deepseek/deepseek-v3.1-terminus | Deepseek | Paid | 164 K | 0.00021 | 0.00079

DeepSeek-V3.1 Terminus is an update to DeepSeek V3.1 that mainta…

DeepSeek: DeepSeek V3.2
deepseek/deepseek-v3.2 | Deepseek | Paid | 164 K | 0.00026 | 0.00038

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with…

DeepSeek: DeepSeek V3.2 Exp
deepseek/deepseek-v3.2-exp | Deepseek | Paid | 164 K | 0.00027 | 0.00041

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediat…

DeepSeek: DeepSeek V3.2 Speciale
deepseek/deepseek-v3.2-speciale | Deepseek | Paid | 164 K | 0.0004 | 0.0012

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasonin…

DeepSeek: R1
deepseek/deepseek-r1 | Deepseek | Paid | 64 K | 0.0007 | 0.0025

DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with…

DeepSeek: R1 0528
deepseek/deepseek-r1-0528 | Deepseek | Paid | 164 K | 0.00045 | 0.00215

May 28th update to the original DeepSeek R1. Performance on par with [Op…

DeepSeek: R1 Distill Llama 70B
deepseek/deepseek-r1-distill-llama-70b | Deepseek | Paid | 131 K | 0.0007 | 0.0008

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instru…

DeepSeek: R1 Distill Qwen 32B
deepseek/deepseek-r1-distill-qwen-32b | Deepseek | Paid | 32.8 K | 0.00029 | 0.00029

DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https:/…

EleutherAI: Llemma 7b
eleutherai/llemma_7b | Eleutherai | Paid | 4.1 K | 0.0008 | 0.0012

Llemma 7B is a language model for mathematics. It was initialized with Code Llama 7B weights, an…

EssentialAI: Rnj 1 Instruct
essentialai/rnj-1-instruct | Essentialai | Paid | 32.8 K | 0.00015 | 0.00015

Rnj-1 is an 8B-parameter, dense, open-weight model family developed by Essential AI and trained…

Goliath 120B
alpindale/goliath-120b | Alpindale | Paid | 6.14 K | 0.00375 | 0.0075

A large LLM created by combining two fine-tuned Llama 70B models into one 120B model. Combines X…

Google: Gemini 2.0 Flash
google/gemini-2.0-flash-001 | Google | Paid | 1.05 M | 0.0001/image | 0.0001/image

Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to [Gemini Fl…

Google: Gemini 2.0 Flash Lite
google/gemini-2.0-flash-lite-001 | Google | Paid | 1.05 M | 0.000075/image | 0.000075/image

Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) compared to [Gemi…

Google: Gemini 2.5 Flash
google/gemini-2.5-flash | Google | Paid | 1.05 M | 0.0003/image | 0.0003/image

Gemini 2.5 Flash is Google’s state-of-the-art workhorse model, specifically designed for advance…

Google: Gemini 2.5 Flash Lite
google/gemini-2.5-flash-lite | Google | Paid | 1.05 M | 0.0001/image | 0.0001/image

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for u…

Google: Gemini 2.5 Flash Lite Preview 09-2025
google/gemini-2.5-flash-lite-preview-09-2025 | Google | Paid | 1.05 M | 0.0001/image | 0.0001/image

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for u…

Google: Gemini 2.5 Pro
google/gemini-2.5-pro | Google | Paid | 1.05 M | 0.00125/image | 0.00125/image

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, ma…

Google: Gemini 2.5 Pro Preview 05-06
google/gemini-2.5-pro-preview-05-06 | Google | Paid | 1.05 M | 0.00125/image | 0.00125/image

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, ma…

Google: Gemini 2.5 Pro Preview 06-05
google/gemini-2.5-pro-preview | Google | Paid | 1.05 M | 0.00125/image | 0.00125/image

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, ma…

Google: Gemini 3 Flash Preview
google/gemini-3-flash-preview | Google | Paid | 1.05 M | 0.0005/image | 0.0005/image

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows…

Google: Gemini 3.1 Flash Lite Preview
google/gemini-3.1-flash-lite-preview | Google | Paid | 1.05 M | 0.00025/image | 0.00025/image

Gemini 3.1 Flash Lite Preview is Google’s high-efficiency model optimized for high-volume use ca…

Google: Gemini 3.1 Pro Preview
google/gemini-3.1-pro-preview | Google | Paid | 1.05 M | 0.002/image | 0.002/image

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engine…

Google: Gemini 3.1 Pro Preview Custom Tools
google/gemini-3.1-pro-preview-customtools | Google | Paid | 1.05 M | 0.002/image | 0.002/image

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection…

Google: Gemma 2 27B
google/gemma-2-27b-it | Google | Paid | 8.19 K | 0.00065 | 0.00065

Gemma 2 27B by Google is an open model built from the same research and technology used to creat…

Google: Gemma 2 9B
google/gemma-2-9b-it | Google | Paid | 8.19 K | 0.00003 | 0.00009

Gemma 2 9B by Google is an advanced, open-source language model that sets a new standard for eff…

Google: Gemma 3 12B
google/gemma-3-12b-it | Google | Paid | 131 K | 0.00004 | 0.00013

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles…

Google: Gemma 3 27B
google/gemma-3-27b-it | Google | Paid | 131 K | 0.00008 | 0.00016

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles…

Google: Gemma 3 4B
google/gemma-3-4b-it | Google | Paid | 131 K | 0.00004 | 0.00008

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles…

Google: Gemma 3n 4B
google/gemma-3n-e4b-it | Google | Paid | 32.8 K | 0.00002 | 0.00004

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as…

Google: Nano Banana (Gemini 2.5 Flash Image)
google/gemini-2.5-flash-image | Google | Paid | 32.8 K | 0.0003/image | 0.0003/image

Gemini 2.5 Flash Image, a.k.a. “Nano Banana,” is now generally available. It is a state of the a…

Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)
google/gemini-3.1-flash-image-preview | Google | Paid | 65.5 K | 0.0005 | 0.003

Gemini 3.1 Flash Image Preview, a.k.a. “Nano Banana 2,” is Google’s latest state of the art imag…

Google: Nano Banana Pro (Gemini 3 Pro Image Preview)
google/gemini-3-pro-image-preview | Google | Paid | 65.5 K | 0.002/image | 0.002/image

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3…

IBM: Granite 4.0 Micro
ibm-granite/granite-4.0-h-micro | Ibm Granite | Paid | 131 K | 0.000017 | 0.00011

Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family of models. These models are the…

Inception: Mercury
inception/mercury | Inception | Paid | 128 K | 0.00025 | 0.00075

Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discrete dif…

Inception: Mercury 2
inception/mercury-2 | Inception | Paid | 128 K | 0.00025 | 0.00075

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Inst…

Inception: Mercury Coder
inception/mercury-coder | Inception | Paid | 128 K | 0.00025 | 0.00075

Mercury Coder is the first diffusion large language model (dLLM). Applying a breakthrough discre…

Inflection: Inflection 3 Pi
inflection/inflection-3-pi | Inflection | Paid | 8 K | 0.0025 | 0.01

Inflection 3 Pi powers Inflection’s Pi chatbot, including backstory, emotional…

Inflection: Inflection 3 Productivity
inflection/inflection-3-productivity | Inflection | Paid | 8 K | 0.0025 | 0.01

Inflection 3 Productivity is optimized for following instructions. It is better for tasks requir…

Kwaipilot: KAT-Coder-Pro V2
kwaipilot/kat-coder-pro-v2 | Kwaipilot | Paid | 256 K | 0.0003 | 0.0012

KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed fo…

LiquidAI: LFM2-2.6B
liquid/lfm-2.2-6b | Liquid | Paid | 32.8 K | 0.00001 | 0.00002

LFM2 is a new generation of hybrid models developed by Liquid AI, specifically designed for edge…

LiquidAI: LFM2-24B-A2B
liquid/lfm-2-24b-a2b | Liquid | Paid | 32.8 K | 0.00003 | 0.00012

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for effici…

LiquidAI: LFM2-8B-A1B
liquid/lfm2-8b-a1b | Liquid | Paid | 32.8 K | 0.00001 | 0.00002

LFM2-8B-A1B is an efficient on-device Mixture-of-Experts (MoE) model from Liquid AI’s LFM2 famil…

Llama Guard 3 8B
meta-llama/llama-guard-3-8b | Meta Llama | Paid | 131 K | 0.00002 | 0.00006

Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification…

Magnum v4 72B
anthracite-org/magnum-v4-72b | Anthracite Org | Paid | 16.4 K | 0.003 | 0.005

This is a series of models designed to replicate the prose quality of the Claude 3 models, speci…

Mancer: Weaver (alpha)
mancer/weaver | Mancer | Paid | 8 K | 0.00075 | 0.001

An attempt to recreate Claude-style verbosity, but don’t expect the same level of coherence or m…

Meituan: LongCat Flash Chat
meituan/longcat-flash-chat | Meituan | Paid | 131 K | 0.0002 | 0.0008

LongCat-Flash-Chat is a large-scale Mixture-of-Experts (MoE) model with 560B total parameters, o…

Meta: Llama 3 70B Instruct
meta-llama/llama-3-70b-instruct | Meta Llama | Paid | 8.19 K | 0.00051 | 0.00074

Meta’s latest class of model (Llama 3) launched with a variety of sizes & flavors. This 70B inst…

Meta: Llama 3 8B Instruct
meta-llama/llama-3-8b-instruct | Meta Llama | Paid | 8.19 K | 0.00003 | 0.00004

Meta’s latest class of model (Llama 3) launched with a variety of sizes & flavors. This 8B instr…

Meta: Llama 3.1 70B Instruct
meta-llama/llama-3.1-70b-instruct | Meta Llama | Paid | 131 K | 0.0004 | 0.0004

Meta’s latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 70B in…

Meta: Llama 3.1 8B Instruct
meta-llama/llama-3.1-8b-instruct | Meta Llama | Paid | 16.4 K | 0.00002 | 0.00005

Meta’s latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 8B ins…

Meta: Llama 3.2 11B Vision Instruct
meta-llama/llama-3.2-11b-vision-instruct | Meta Llama | Paid | 131 K | 0.000049 | 0.000049

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks…

Meta: Llama 3.2 1B Instruct
meta-llama/llama-3.2-1b-instruct | Meta Llama | Paid | 60 K | 0.000027 | 0.0002

Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural l…

Meta: Llama 3.2 3B Instruct
meta-llama/llama-3.2-3b-instruct | Meta Llama | Paid | 80 K | 0.000051 | 0.00034

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced…

Meta: Llama 3.3 70B Instruct
meta-llama/llama-3.3-70b-instruct | Meta Llama | Paid | 131 K | 0.0001 | 0.00032

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned…

Meta: Llama 4 Maverick
meta-llama/llama-4-maverick | Meta Llama | Paid | 1.05 M | 0.00015 | 0.0006

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, bui…

Meta: Llama 4 Scout
meta-llama/llama-4-scout | Meta Llama | Paid | 328 K | 0.00008 | 0.0003

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta,…

Meta: Llama Guard 4 12B
meta-llama/llama-guard-4-12b | Meta Llama | Paid | 164 K | 0.00018 | 0.00018

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content saf…

Microsoft: Phi 4
microsoft/phi-4 | Microsoft | Paid | 16.4 K | 0.000065 | 0.00014

Microsoft Research Phi-4 is designed to perform well in complex reasoning tasks an…

MiniMax: MiniMax M1
minimax/minimax-m1 | Minimax | Paid | 1 M | 0.0004 | 0.0022

MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and high-…

MiniMax: MiniMax M2
minimax/minimax-m2 | Minimax | Paid | 197 K | 0.000255 | 0.001

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding an…

MiniMax: MiniMax M2-her
minimax/minimax-m2-her | Minimax | Paid | 65.5 K | 0.0003 | 0.0012

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-…

MiniMax: MiniMax M2.1
minimax/minimax-m2.1 | Minimax | Paid | 197 K | 0.00027 | 0.00095

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agent…

MiniMax: MiniMax M2.5
minimax/minimax-m2.5 | Minimax | Paid | 197 K | 0.00019 | 0.00115

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a d…

MiniMax: MiniMax M2.7
minimax/minimax-m2.7 | Minimax | Paid | 205 K | 0.0003 | 0.0012

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world produ…

MiniMax: MiniMax-01
minimax/minimax-01 | Minimax | Paid | 1 M | 0.0002 | 0.0011

MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image underst…

Mistral Large
mistralai/mistral-large | Mistralai | Paid | 128 K | 0.002 | 0.006

This is Mistral AI’s flagship model, Mistral Large 2 (version mistral-large-2407). It’s a prop…

Mistral Large 2407
mistralai/mistral-large-2407 | Mistralai | Paid | 131 K | 0.002 | 0.006

This is Mistral AI’s flagship model, Mistral Large 2 (version mistral-large-2407). It’s a propri…

Mistral Large 2411
mistralai/mistral-large-2411 | Mistralai | Paid | 131 K | 0.002 | 0.006

Mistral Large 2 2411 is an update of Mistral Large 2 released togeth…

Mistral: Codestral 2508
mistralai/codestral-2508 | Mistralai | Paid | 256 K | 0.0003 | 0.0009

Mistral’s cutting-edge language model for coding released end of July 2025. Codestral specialize…

Mistral: Devstral 2 2512
mistralai/devstral-2512 | Mistralai | Paid | 262 K | 0.0004 | 0.002

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding…

Mistral: Devstral Medium
mistralai/devstral-medium | Mistralai | Paid | 131 K | 0.0004 | 0.002

Devstral Medium is a high-performance code generation and agentic reasoning model developed join…

Mistral: Devstral Small 1.1
mistralai/devstral-small | Mistralai | Paid | 131 K | 0.0001 | 0.0003

Devstral Small 1.1 is a 24B parameter open-weight language model for software engineering agents…

Mistral: Ministral 3 14B 2512
mistralai/ministral-14b-2512 | Mistralai | Paid | 262 K | 0.0002 | 0.0002

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and pe…

Mistral: Ministral 3 3B 2512
mistralai/ministral-3b-2512 | Mistralai | Paid | 131 K | 0.0001 | 0.0001

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny langu…

Mistral: Ministral 3 8B 2512
mistralai/ministral-8b-2512 | Mistralai | Paid | 262 K | 0.00015 | 0.00015

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny languag…

Mistral: Mistral 7B Instruct v0.1
mistralai/mistral-7b-instruct-v0.1 | Mistralai | Paid | 2.82 K | 0.00011 | 0.00019

A 7.3B parameter model that outperforms Llama 2 13B on all benchmarks, with optimizations for sp…

Mistral: Mistral Large 3 2512
mistralai/mistral-large-2512 | Mistralai | Paid | 262 K | 0.0005 | 0.0015

Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-expe…

Mistral: Mistral Medium 3
mistralai/mistral-medium-3 | Mistralai | Paid | 131 K | 0.0004 | 0.002

Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliver front…

Mistral: Mistral Medium 3.1
mistralai/mistral-medium-3.1 | Mistralai | Paid | 131 K | 0.0004 | 0.002

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterp…

Mistral: Mistral Nemo
mistralai/mistral-nemo | Mistralai | Paid | 131 K | 0.00002 | 0.00004

A 12B parameter model with a 128k token context length built by Mistral in collaboration with NV…

Mistral: Mistral Small 3
mistralai/mistral-small-24b-instruct-2501 | Mistralai | Paid | 32.8 K | 0.00005 | 0.00008

Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across c…

Mistral: Mistral Small 3.1 24B
mistralai/mistral-small-3.1-24b-instruct | Mistralai | Paid | 131 K | 0.00003 | 0.00011

Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring 24 bi…

Mistral: Mistral Small 3.2 24B
mistralai/mistral-small-3.2-24b-instruct | Mistralai | Paid | 128 K | 0.000075 | 0.0002

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for…

Mistral: Mistral Small 4
mistralai/mistral-small-2603 | Mistralai | Paid | 262 K | 0.00015 | 0.0006

Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities…

Mistral: Mistral Small Creative
mistralai/mistral-small-creative | Mistralai | Paid | 32.8 K | 0.0001 | 0.0003

Mistral Small Creative is an experimental small model designed for creative writing, narrative g…

Mistral: Mixtral 8x22B Instruct
mistralai/mixtral-8x22b-instruct | Mistralai | Paid | 65.5 K | 0.002 | 0.006

Mistral’s official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22…

Mistral: Mixtral 8x7B Instruct
mistralai/mixtral-8x7b-instruct | Mistralai | Paid | 32.8 K | 0.00054 | 0.00054

Mixtral 8x7B Instruct is a pretrained generative Sparse Mixture of Experts, by Mistral AI, for c…

Mistral: Pixtral Large 2411
mistralai/pixtral-large-2411 | Mistralai | Paid | 131 K | 0.002 | 0.006

Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral Large…

Mistral: Saba
mistralai/mistral-saba | Mistralai | Paid | 32.8 K | 0.0002 | 0.0006

Mistral Saba is a 24B-parameter language model specifically designed for the Middle East and Sou…

Mistral: Voxtral Small 24B 2507
mistralai/voxtral-small-24b-2507 | Mistralai | Paid | 32 K | 0.0001 | 0.0003

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input c…

MoonshotAI: Kimi K2 0711
moonshotai/kimi-k2 | Moonshotai | Paid | 131 K | 0.00057 | 0.0023

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot…

MoonshotAI: Kimi K2 0905
moonshotai/kimi-k2-0905 | Moonshotai | Paid | 131 K | 0.0004 | 0.002

Kimi K2 0905 is the September update of Kimi K2 0711. It is a large-scale…

MoonshotAI: Kimi K2 Thinking
moonshotai/kimi-k2-thinking | Moonshotai | Paid | 131 K | 0.00047 | 0.002

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 s…

MoonshotAI: Kimi K2.5
moonshotai/kimi-k2.5 | Moonshotai | Paid | 262 K | 0.00042 | 0.0022

Kimi K2.5 is Moonshot AI’s native multimodal model, delivering state-of-the-art visual coding ca…

Morph: Morph V3 Fast
morph/morph-v3-fast | Morph | Paid | 81.9 K | 0.0008 | 0.0012

Morph’s fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy for rapid code…

Morph: Morph V3 Large
morph/morph-v3-largeMorphPaid262 K0.00090.0019

Morph’s high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accuracy fo…

MythoMax 13B
gryphe/mythomax-l2-13bGryphePaid4.1 K0.000060.00006

One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions…

Nex AGI: DeepSeek V3.1 Nex N1
nex-agi/deepseek-v3.1-nex-n1Nex AgiPaid131 K0.0001350.0005

DeepSeek V3.1 Nex-N1 is the flagship release of the Nex-N1 series — a post-trained model designe…

Nous: Hermes 3 405B Instruct
nousresearch/hermes-3-llama-3.1-405bNousresearchPaid131 K0.0010.001

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced…

Nous: Hermes 3 70B Instruct
nousresearch/hermes-3-llama-3.1-70bNousresearchPaid131 K0.00030.0003

Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresea…

Nous: Hermes 4 405B
nousresearch/hermes-4-405bNousresearchPaid131 K0.0010.003

Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Rese…

Nous: Hermes 4 70B
nousresearch/hermes-4-70bNousresearchPaid131 K0.000130.0004

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It int…

NousResearch: Hermes 2 Pro - Llama-3 8B
nousresearch/hermes-2-pro-llama-3-8bNousresearchPaid8.19 K0.000140.00014

Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, consisting of an updated and cl…

NVIDIA: Llama 3.1 Nemotron 70B Instruct
nvidia/llama-3.1-nemotron-70b-instructNvidiaPaid131 K0.00120.0012

NVIDIA’s Llama 3.1 Nemotron 70B is a language model designed for generating precise and useful r…

NVIDIA: Llama 3.1 Nemotron Ultra 253B v1
nvidia/llama-3.1-nemotron-ultra-253b-v1NvidiaPaid131 K0.00060.0018

Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced reasonin…

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5
nvidia/llama-3.3-nemotron-super-49b-v1.5NvidiaPaid131 K0.00010.0004

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model deriv…

NVIDIA: Nemotron 3 Nano 30B A3B
nvidia/nemotron-3-nano-30b-a3bNvidiaPaid262 K0.000050.0002

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and…

NVIDIA: Nemotron 3 Super
nvidia/nemotron-3-super-120b-a12bNvidiaPaid262 K0.00010.0005

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameter…

NVIDIA: Nemotron Nano 12B 2 VL
nvidia/nemotron-nano-12b-v2-vlNvidiaPaid131 K0.00020.0006

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for…

NVIDIA: Nemotron Nano 9B V2
nvidia/nemotron-nano-9b-v2NvidiaPaid131 K0.000040.00016

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and d…

OpenAI: GPT Audio
openai/gpt-audioOpenaiPaid128 K0.00250.01

The gpt-audio model is OpenAI’s first generally available audio model. The new snapshot features…

OpenAI: GPT Audio Mini
openai/gpt-audio-miniOpenaiPaid128 K0.00060.0024

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more na…

OpenAI: GPT-3.5 Turbo
openai/gpt-3.5-turboOpenaiPaid16.4 K0.00050.0015

GPT-3.5 Turbo is OpenAI’s fastest model. It can understand and generate natural language or code…

OpenAI: GPT-3.5 Turbo (older v0613)
openai/gpt-3.5-turbo-0613OpenaiPaid4.09 K0.0010.002

GPT-3.5 Turbo is OpenAI’s fastest model. It can understand and generate natural language or code…

OpenAI: GPT-3.5 Turbo 16k
openai/gpt-3.5-turbo-16kOpenaiPaid16.4 K0.0030.004

This model offers four times the context length of gpt-3.5-turbo, allowing it to support approxi…

OpenAI: GPT-3.5 Turbo Instruct
openai/gpt-3.5-turbo-instructOpenaiPaid4.09 K0.00150.002

This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-relat…

OpenAI: GPT-4
openai/gpt-4OpenaiPaid8.19 K0.030.06

OpenAI’s flagship model, GPT-4 is a large-scale multimodal language model capable of solving dif…

OpenAI: GPT-4 (older v0314)
openai/gpt-4-0314OpenaiPaid8.19 K0.030.06

GPT-4-0314 is the first version of GPT-4 released, with a context length of 8,192 tokens, and wa…

OpenAI: GPT-4 Turbo
openai/gpt-4-turboOpenaiPaid128 K0.010.03

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and…

OpenAI: GPT-4 Turbo (older v1106)
openai/gpt-4-1106-previewOpenaiPaid128 K0.010.03

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and…

OpenAI: GPT-4 Turbo Preview
openai/gpt-4-turbo-previewOpenaiPaid128 K0.010.03

The preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, pa…

OpenAI: GPT-4.1
openai/gpt-4.1OpenaiPaid1.05 M0.0020.008

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-wo…

OpenAI: GPT-4.1 Mini
openai/gpt-4.1-miniOpenaiPaid1.05 M0.00040.0016

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantiall…

OpenAI: GPT-4.1 Nano
openai/gpt-4.1-nanoOpenaiPaid1.05 M0.00010.0004

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1…

OpenAI: GPT-4o
openai/gpt-4oOpenaiPaid128 K0.00250.01

GPT-4o (“o” for “omni”) is OpenAI’s latest AI model, supporting both text and image inputs with…

OpenAI: GPT-4o (2024-05-13)
openai/gpt-4o-2024-05-13OpenaiPaid128 K0.0050.015

GPT-4o (“o” for “omni”) is OpenAI’s latest AI model, supporting both text and image inputs with…

OpenAI: GPT-4o (2024-08-06)
openai/gpt-4o-2024-08-06OpenaiPaid128 K0.00250.01

The 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with the abi…

OpenAI: GPT-4o (2024-11-20)
openai/gpt-4o-2024-11-20OpenaiPaid128 K0.00250.01

The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more natural,…

OpenAI: GPT-4o (extended)
openai/gpt-4o:extendedOpenaiPaid128 K0.0060.018

GPT-4o (“o” for “omni”) is OpenAI’s latest AI model, supporting both text and image inputs with…

OpenAI: GPT-4o Audio
openai/gpt-4o-audio-previewOpenaiPaid128 K0.00250.01

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows…

OpenAI: GPT-4o Search Preview
openai/gpt-4o-search-previewOpenaiPaid128 K0.00250.01

GPT-4o Search Preview is a specialized model for web search in Chat Completions. It is trained to…

OpenAI: GPT-4o-mini
openai/gpt-4o-miniOpenaiPaid128 K0.000150.0006

GPT-4o mini is OpenAI’s newest model after GPT-4 Omni, supporting both…

OpenAI: GPT-4o-mini (2024-07-18)
openai/gpt-4o-mini-2024-07-18OpenaiPaid128 K0.000150.0006

GPT-4o mini is OpenAI’s newest model after GPT-4 Omni, supporting both…

OpenAI: GPT-4o-mini Search Preview
openai/gpt-4o-mini-search-previewOpenaiPaid128 K0.000150.0006

GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trai…

OpenAI: GPT-5
openai/gpt-5OpenaiPaid4 K0.001250.01

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, a…

OpenAI: GPT-5 Chat
openai/gpt-5-chatOpenaiPaid128 K0.001250.01

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for en…

OpenAI: GPT-5 Codex
openai/gpt-5-codexOpenaiPaid4 K0.001250.01

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding work…

OpenAI: GPT-5 Image
openai/gpt-5-imageOpenaiPaid4 K0.010.01

GPT-5 Image combines OpenAI’s GPT-5 model with state-of-th…

OpenAI: GPT-5 Image Mini
openai/gpt-5-image-miniOpenaiPaid4 K0.00250.002

GPT-5 Image Mini combines OpenAI’s advanced language capabilities, powered by [GPT-5 Mini](https…

OpenAI: GPT-5 Mini
openai/gpt-5-miniOpenaiPaid4 K0.000250.002

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It…

OpenAI: GPT-5 Nano
openai/gpt-5-nanoOpenaiPaid4 K0.000050.0004

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tool…

OpenAI: GPT-5 Pro
openai/gpt-5-proOpenaiPaid4 K0.0150.12

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code qualit…

OpenAI: GPT-5.1
openai/gpt-5.1OpenaiPaid4 K0.001250.01

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpos…

OpenAI: GPT-5.1 Chat
openai/gpt-5.1-chatOpenaiPaid128 K0.001250.01

GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, optimized for low-l…

OpenAI: GPT-5.1-Codex
openai/gpt-5.1-codexOpenaiPaid4 K0.001250.01

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding…

OpenAI: GPT-5.1-Codex-Max
openai/gpt-5.1-codex-maxOpenaiPaid4 K0.001250.01

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-conte…

OpenAI: GPT-5.1-Codex-Mini
openai/gpt-5.1-codex-miniOpenaiPaid4 K0.000250.002

GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex.

OpenAI: GPT-5.2
openai/gpt-5.2OpenaiPaid4 K0.001750.014

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and lo…

OpenAI: GPT-5.2 Chat
openai/gpt-5.2-chatOpenaiPaid128 K0.001750.014

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-…

OpenAI: GPT-5.2 Pro
openai/gpt-5.2-proOpenaiPaid4 K0.0210.168

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and l…

OpenAI: GPT-5.2-Codex
openai/gpt-5.2-codexOpenaiPaid4 K0.001750.014

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and cod…

OpenAI: GPT-5.3 Chat
openai/gpt-5.3-chatOpenaiPaid128 K0.001750.014

GPT-5.3 Chat is an update to ChatGPT’s most-used model that makes everyday conversations smoothe…

OpenAI: GPT-5.3-Codex
openai/gpt-5.3-codexOpenaiPaid4 K0.001750.014

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software en…

OpenAI: GPT-5.4
openai/gpt-5.4OpenaiPaid1.05 M0.00250.015

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system…

OpenAI: GPT-5.4 Mini
openai/gpt-5.4-miniOpenaiPaid4 K0.000750.0045

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized…

OpenAI: GPT-5.4 Nano
openai/gpt-5.4-nanoOpenaiPaid4 K0.00020.00125

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized…

OpenAI: GPT-5.4 Pro
openai/gpt-5.4-proOpenaiPaid1.05 M0.030.18

GPT-5.4 Pro is OpenAI’s most advanced model, building on GPT-5.4’s unified architecture with enh…

OpenAI: gpt-oss-120b
openai/gpt-oss-120bOpenaiPaid131 K0.0000390.00019

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from Open…

OpenAI: gpt-oss-20b
openai/gpt-oss-20bOpenaiPaid131 K0.000030.00011

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 licens…

OpenAI: gpt-oss-safeguard-20b
openai/gpt-oss-safeguard-20bOpenaiPaid131 K0.0000750.0003

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-…

OpenAI: o1
openai/o1OpenaiPaid2 K0.0150.06

The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking be…

OpenAI: o1-pro
openai/o1-proOpenaiPaid2 K0.150.6

The o1 series of models are trained with reinforcement learning to think before they answer and…

OpenAI: o3
openai/o3OpenaiPaid2 K0.0020.008

o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science…

OpenAI: o3 Deep Research
openai/o3-deep-researchOpenaiPaid2 K0.010.04

o3-deep-research is OpenAI’s advanced model for deep research, designed to tackle complex, multi…

OpenAI: o3 Mini
openai/o3-miniOpenaiPaid2 K0.00110.0044

OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particular…

OpenAI: o3 Mini High
openai/o3-mini-highOpenaiPaid2 K0.00110.0044

OpenAI o3-mini-high is the same model as o3-mini with reasoning_effort set to…

OpenAI: o3 Pro
openai/o3-proOpenaiPaid2 K0.020.08

The o-series of models are trained with reinforcement learning to think before they answer and p…

OpenAI: o4 Mini
openai/o4-miniOpenaiPaid2 K0.00110.0044

OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient…

OpenAI: o4 Mini Deep Research
openai/o4-mini-deep-researchOpenaiPaid2 K0.0020.008

o4-mini-deep-research is OpenAI’s faster, more affordable deep research model—ideal for tackling…

OpenAI: o4 Mini High
openai/o4-mini-highOpenaiPaid2 K0.00110.0044

OpenAI o4-mini-high is the same model as o4-mini with reasoning_effort set to…

Perplexity: Sonar
perplexity/sonarPerplexityPaid127 K0.0010.001

Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the abil…

Perplexity: Sonar Deep Research
perplexity/sonar-deep-researchPerplexityPaid128 K0.0020.008

Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, an…

Perplexity: Sonar Pro
perplexity/sonar-proPerplexityPaid2 K0.0030.015

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perp

Perplexity: Sonar Pro Search
perplexity/sonar-pro-searchPerplexityPaid2 K0.0030.015

Exclusively available on the OpenRouter API, Sonar Pro’s new Pro Search mode is Perplexity’s mos…

Perplexity: Sonar Reasoning Pro
perplexity/sonar-reasoning-proPerplexityPaid128 K0.0020.008

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perp

Prime Intellect: INTELLECT-3
prime-intellect/intellect-3Prime IntellectPaid131 K0.00020.0011

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-…

Qwen2.5 72B Instruct
qwen/qwen-2.5-72b-instructQwenPaid32.8 K0.000120.00039

Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following imp…

Qwen2.5 Coder 32B Instruct
qwen/qwen-2.5-coder-32b-instructQwenPaid32.8 K0.000660.001

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known a…

Qwen: Qwen Plus 0728
qwen/qwen-plus-2025-07-28QwenPaid1 M0.000260.00078

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning mod…

Qwen: Qwen Plus 0728 (thinking)
qwen/qwen-plus-2025-07-28:thinkingQwenPaid1 M0.000260.00078

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning mod…

Qwen: Qwen VL Max
qwen/qwen-vl-maxQwenPaid131 K0.000520.00208

Qwen VL Max is a visual understanding model with 7500 tokens context length. It excels in delive…

Qwen: Qwen VL Plus
qwen/qwen-vl-plusQwenPaid131 K0.00013650.0004095

Qwen’s Enhanced Large Visual Language Model. Significantly upgraded for detailed recognition cap…

Qwen: Qwen-Max
qwen/qwen-maxQwenPaid32.8 K0.001040.00416

Qwen-Max, based on Qwen2.5, provides the best inference performance among Qwen models,…

Qwen: Qwen-Plus
qwen/qwen-plusQwenPaid1 M0.000260.00078

Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced perfor…

Qwen: Qwen-Turbo
qwen/qwen-turboQwenPaid131 K0.00003250.00013

Qwen-Turbo, based on Qwen2.5, is a 1M context model that provides fast speed and low cost, suita…

Qwen: Qwen2.5 7B Instruct
qwen/qwen-2.5-7b-instructQwenPaid32.8 K0.000040.0001

Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following impr…

Qwen: Qwen2.5 Coder 7B Instruct
qwen/qwen2.5-coder-7b-instructQwenPaid32.8 K0.000030.00009

Qwen2.5-Coder-7B-Instruct is a 7B parameter instruction-tuned language model optimized for code-…

Qwen: Qwen2.5 VL 32B Instruct
qwen/qwen2.5-vl-32b-instructQwenPaid128 K0.00020.0006

Qwen2.5-VL-32B is a multimodal vision-language model fine-tuned through reinforcement learning f…

Qwen: Qwen2.5 VL 72B Instruct
qwen/qwen2.5-vl-72b-instructQwenPaid32.8 K0.00080.0008

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects…

Qwen: Qwen3 14B
qwen/qwen3-14bQwenPaid41 K0.000060.00024

Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for b…

Qwen: Qwen3 235B A22B
qwen/qwen3-235b-a22bQwenPaid131 K0.0004550.00182

Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating…

Qwen: Qwen3 235B A22B Instruct 2507
qwen/qwen3-235b-a22b-2507QwenPaid262 K0.0000710.0001

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language m…

Qwen: Qwen3 235B A22B Thinking 2507
qwen/qwen3-235b-a22b-thinking-2507QwenPaid131 K0.00014950.001495

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) langua…

Qwen: Qwen3 30B A3B
qwen/qwen3-30b-a3bQwenPaid41 K0.000080.00028

Qwen3, the latest generation in the Qwen large language model series, features both dense and mi…

Qwen: Qwen3 30B A3B Instruct 2507
qwen/qwen3-30b-a3b-instruct-2507QwenPaid262 K0.000090.0003

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, wi…

Qwen: Qwen3 30B A3B Thinking 2507
qwen/qwen3-30b-a3b-thinking-2507QwenPaid131 K0.000080.0004

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for…

Qwen: Qwen3 32B
qwen/qwen3-32bQwenPaid41 K0.000080.00024

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for…

Qwen: Qwen3 8B
qwen/qwen3-8bQwenPaid41 K0.000050.0004

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for bot…

Qwen: Qwen3 Coder 30B A3B Instruct
qwen/qwen3-coder-30b-a3b-instructQwenPaid16 K0.000070.00027

Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 expert…

Qwen: Qwen3 Coder 480B A35B
qwen/qwen3-coderQwenPaid262 K0.000220.001

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by…

Qwen: Qwen3 Coder Flash
qwen/qwen3-coder-flashQwenPaid1 M0.0001950.000975

Qwen3 Coder Flash is Alibaba’s fast, cost-efficient version of its proprietary Qwen3 Coder…

Qwen: Qwen3 Coder Next
qwen/qwen3-coder-nextQwenPaid262 K0.000120.00075

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local d…

Qwen: Qwen3 Coder Plus
qwen/qwen3-coder-plusQwenPaid1 M0.000650.00325

Qwen3 Coder Plus is Alibaba’s proprietary version of the Open Source Qwen3 Coder 480B A35B. It i…

Qwen: Qwen3 Max
qwen/qwen3-maxQwenPaid262 K0.000780.0039

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reason…

Qwen: Qwen3 Max Thinking
qwen/qwen3-max-thinkingQwenPaid262 K0.000780.0039

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes…

Qwen: Qwen3 Next 80B A3B Instruct
qwen/qwen3-next-80b-a3b-instructQwenPaid262 K0.000090.0011

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimize…

Qwen: Qwen3 Next 80B A3B Thinking
qwen/qwen3-next-80b-a3b-thinkingQwenPaid131 K0.00009750.00078

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs…

Qwen: Qwen3 VL 235B A22B Instruct
qwen/qwen3-vl-235b-a22b-instructQwenPaid262 K0.00020.00088

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generati…

Qwen: Qwen3 VL 235B A22B Thinking
qwen/qwen3-vl-235b-a22b-thinkingQwenPaid131 K0.000260.0026

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visua…

Qwen: Qwen3 VL 30B A3B Instruct
qwen/qwen3-vl-30b-a3b-instructQwenPaid131 K0.000130.00052

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual…

Qwen: Qwen3 VL 30B A3B Thinking
qwen/qwen3-vl-30b-a3b-thinkingQwenPaid131 K0.000130.00156

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual…

Qwen: Qwen3 VL 32B Instruct
qwen/qwen3-vl-32b-instructQwenPaid131 K0.0001040.000416

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precis…

Qwen: Qwen3 VL 8B Instruct
qwen/qwen3-vl-8b-instructQwenPaid131 K0.000080.0005

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for h…

Qwen: Qwen3 VL 8B Thinking
qwen/qwen3-vl-8b-thinkingQwenPaid131 K0.0001170.001365

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, des…

Qwen: Qwen3.5 397B A17B
qwen/qwen3.5-397b-a17bQwenPaid262 K0.000390.00234

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that…

Qwen: Qwen3.5 Plus 2026-02-15
qwen/qwen3.5-plus-02-15QwenPaid1 M0.000260.00156

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that in…

Qwen: Qwen3.5-122B-A10B
qwen/qwen3.5-122b-a10bQwenPaid262 K0.000260.00208

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integr…

Qwen: Qwen3.5-27B
qwen/qwen3.5-27bQwenPaid262 K0.0001950.00156

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, de…

Qwen: Qwen3.5-35B-A3B
qwen/qwen3.5-35b-a3bQwenPaid262 K0.00016250.0013

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture…

Qwen: Qwen3.5-9B
qwen/qwen3.5-9bQwenPaid256 K0.000050.00015

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong…

Qwen: Qwen3.5-Flash
qwen/qwen3.5-flash-02-23QwenPaid1 M0.0000650.00026

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrat…

Qwen: QwQ 32B
qwen/qwq-32bQwenPaid131 K0.000150.00058

QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned mode…

Reka Edge
reka/reka-edgeRekaPaid16.4 K0.00010.0001

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video…

Reka: Flash 3
rekaai/reka-flash-3RekaaiPaid65.5 K0.00010.0002

Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parame…

Relace: Relace Apply 3
relace/relace-apply-3RelacePaid256 K0.000850.00125

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into y…

Relace: Relace Search
relace/relace-searchRelacePaid256 K0.0010.003

The relace-search model uses 4-12 view_file and grep tools in parallel to explore a codebase…

ReMM SLERP 13B
undi95/remm-slerp-l2-13bUndi95Paid6.14 K0.000450.00065

A recreation trial of the original MythoMax-L2-B13 but with updated models. #merge

Sao10K: Llama 3 8B Lunaris
sao10k/l3-lunaris-8bSao10KPaid8.19 K0.000040.00005

Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It’s a strategic me…

Sao10k: Llama 3 Euryale 70B v2.1
sao10k/l3-euryale-70bSao10KPaid8.19 K0.001480.00148

Euryale 70B v2.1 is a model focused on creative roleplay from Sao10k.

Sao10K: Llama 3.1 70B Hanami x1
sao10k/l3.1-70b-hanami-x1Sao10KPaid16 K0.0030.003

This is Sao10K’s experiment over Euryale v2.2.

Sao10K: Llama 3.1 Euryale 70B v2.2
sao10k/l3.1-euryale-70bSao10KPaid131 K0.000850.00085

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sa

Sao10K: Llama 3.3 Euryale 70B
sao10k/l3.3-euryale-70bSao10KPaid131 K0.000650.00075

Euryale L3.3 70B is a model focused on creative roleplay from Sao10k.

StepFun: Step 3.5 Flash
stepfun/step-3.5-flashStepfunPaid262 K0.00010.0003

Step 3.5 Flash is StepFun’s most capable open-source foundation model. Built on a sparse Mixture…

Switchpoint Router
switchpoint/routerSwitchpointPaid131 K0.000850.0034

Switchpoint AI’s router instantly analyzes your request and directs it to the optimal AI from an…

Tencent: Hunyuan A13B Instruct
tencent/hunyuan-a13b-instructTencentPaid131 K0.000140.00057

Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tenc…

TheDrummer: Cydonia 24B V4.1
thedrummer/cydonia-24b-v4.1ThedrummerPaid131 K0.00030.0005

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt ad…

TheDrummer: Rocinante 12B
thedrummer/rocinante-12bThedrummerPaid32.8 K0.000170.00043

Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported:…

TheDrummer: Skyfall 36B V2
thedrummer/skyfall-36b-v2ThedrummerPaid32.8 K0.000550.0008

Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine-tuned for impro…

TheDrummer: UnslopNemo 12B
thedrummer/unslopnemo-12bThedrummerPaid32.8 K0.00040.0004

UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure wri…

TNG: DeepSeek R1T2 Chimera
tngtech/deepseek-r1t2-chimeraTngtechPaid164 K0.00030.0011

DeepSeek-TNG-R1T2-Chimera is the second-generation Chimera model from TNG Tech. It is a 671 B-pa…

Tongyi DeepResearch 30B A3B
alibaba/tongyi-deepresearch-30b-a3bAlibabaPaid131 K0.000090.00045

Tongyi DeepResearch is an agentic large language model developed by Tongyi Lab, with 30 billion…

Upstage: Solar Pro 3
upstage/solar-pro-3UpstagePaid128 K0.000150.0006

Solar Pro 3 is Upstage’s powerful Mixture-of-Experts (MoE) language model. With 102B total param…

WizardLM-2 8x22B
microsoft/wizardlm-2-8x22bMicrosoftPaid65.5 K0.000620.00062

WizardLM-2 8x22B is Microsoft AI’s most advanced Wizard model. It demonstrates highly competitiv…

Writer: Palmyra X5
writer/palmyra-x5WriterPaid1.04 M0.00060.006

Palmyra X5 is Writer’s most advanced model, purpose-built for building and scaling AI agents acr…

xAI: Grok 3
x-ai/grok-3X AiPaid131 K0.0030.015

Grok 3 is the latest model from xAI. It’s their flagship model that excels at enterprise use cas…

xAI: Grok 3 Beta
x-ai/grok-3-betaX AiPaid131 K0.0030.015

Grok 3 is the latest model from xAI. It’s their flagship model that excels at enterprise use cas…

xAI: Grok 3 Mini
x-ai/grok-3-miniX AiPaid131 K0.00030.0005

A lightweight model that thinks before responding. Fast, smart, and great for logic-based tasks…

xAI: Grok 3 Mini Beta
x-ai/grok-3-mini-betaX AiPaid131 K0.00030.0005

Grok 3 Mini is a lightweight, smaller thinking model. Unlike traditional models that generate an…

xAI: Grok 4
x-ai/grok-4X AiPaid256 K0.0030.015

Grok 4 is xAI’s latest reasoning model with a 256k context window. It supports parallel tool cal…

xAI: Grok 4 Fast
x-ai/grok-4-fastX AiPaid2 M0.00020.0005

Grok 4 Fast is xAI’s latest multimodal model with SOTA cost-efficiency and a 2M token context wi…

xAI: Grok 4.1 Fast
x-ai/grok-4.1-fastX AiPaid2 M0.00020.0005

Grok 4.1 Fast is xAI’s best agentic tool calling model that shines in real-world use cases like…

xAI: Grok 4.20 Beta
x-ai/grok-4.20-betaX AiPaid2 M0.0020.006

Grok 4.20 Beta is xAI’s newest flagship model with industry-leading speed and agentic tool calli…

xAI: Grok 4.20 Multi-Agent Beta
x-ai/grok-4.20-multi-agent-betaX AiPaid2 M0.0020.006

Grok 4.20 Multi-Agent Beta is a variant of xAI’s Grok 4.20 designed for collaborative, agent-bas…

xAI: Grok Code Fast 1
x-ai/grok-code-fast-1X AiPaid256 K0.00020.0015

Grok Code Fast 1 is a speedy and economical reasoning model that excels at agentic coding. With…

Xiaomi: MiMo-V2-Flash
xiaomi/mimo-v2-flashXiaomiPaid262 K0.000090.00029

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-o…

Xiaomi: MiMo-V2-Omni
xiaomi/mimo-v2-omniXiaomiPaid262 K0.00040.002

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inpu…

Xiaomi: MiMo-V2-Pro
xiaomi/mimo-v2-proXiaomiPaid1.05 M0.0010.003

MiMo-V2-Pro is Xiaomi’s flagship foundation model, featuring over 1T total parameters and a 1M c…

Z.ai: GLM 4 32B
z-ai/glm-4-32bZ AiPaid128 K0.00010.0001

GLM 4 32B is a cost-effective foundation language model. It can efficiently perform complex task…

Z.ai: GLM 4.5
z-ai/glm-4.5Z AiPaid131 K0.00060.0022

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It…

Z.ai: GLM 4.5 Air
z-ai/glm-4.5-airZ AiPaid131 K0.000130.00085

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built f…

Z.ai: GLM 4.5V
z-ai/glm-4.5vZ AiPaid65.5 K0.00060.0018

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mix…

Z.ai: GLM 4.6
z-ai/glm-4.6Z AiPaid205 K0.000390.0019

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: T…

Z.ai: GLM 4.6V
z-ai/glm-4.6vZ AiPaid131 K0.00030.0009

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-co…

Z.ai: GLM 4.7
z-ai/glm-4.7Z AiPaid203 K0.000390.00175

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programmi…

Z.ai: GLM 4.7 Flash
z-ai/glm-4.7-flashZ AiPaid203 K0.000060.0004

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and effic…

Z.ai: GLM 5
z-ai/glm-5Z AiPaid80 K0.000720.0023

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and…

Z.ai: GLM 5 Turbo
z-ai/glm-5-turboZ AiPaid203 K0.00120.004

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent…
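
In this Markdown view, each model's ID, provider, tier, context, and two price columns run together on one line (for example `openai/gpt-4oOpenaiPaid128 K0.00250.01`). The sketch below shows one way to split such a line back into fields. It is an illustration, not part of the Mesh API: `CatalogRow`, `ROW_RE`, and `parse_row` are hypothetical names, and the pattern assumes that model IDs contain no uppercase letters, that provider names start with an uppercase letter, and that every listed price is either `Free` or a decimal below 1 (so it starts with `0.`). Rows that break these assumptions return `None`.

```python
import re
from typing import NamedTuple, Optional


class CatalogRow(NamedTuple):
    """One flattened catalog row, split into its columns (hypothetical type)."""
    model_id: str
    provider: str
    tier: str
    context: str
    input_usd: float
    output_usd: float


# Assumed layout: <model id><Provider><Tier><context><input price><output price>
ROW_RE = re.compile(
    r"^(?P<model_id>[^A-Z\s]+)"         # lowercase slug, e.g. openai/gpt-4o
    r"(?P<provider>[A-Z].*?)"           # provider label, e.g. Openai or X Ai
    r"(?P<tier>Paid|Free)"              # tier column
    r"(?P<context>\d+(?:\.\d+)? [KM])"  # context column, e.g. 128 K or 2 M
    r"(?P<input>Free|0\.\d+?)"          # lazy, so the output price can start at its own "0."
    r"(?P<output>Free|0\.\d+)$"
)


def _price(text: str) -> float:
    """Convert a price cell to a float, treating 'Free' as zero."""
    return 0.0 if text == "Free" else float(text)


def parse_row(line: str) -> Optional[CatalogRow]:
    """Return the parsed row, or None for name lines, descriptions, and blanks."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    return CatalogRow(
        model_id=match["model_id"],
        provider=match["provider"],
        tier=match["tier"],
        context=match["context"],
        input_usd=_price(match["input"]),
        output_usd=_price(match["output"]),
    )


if __name__ == "__main__":
    print(parse_row("openai/gpt-4oOpenaiPaid128 K0.00250.01"))
    print(parse_row("x-ai/grok-4-fastX AiPaid2 M0.00020.0005"))
```

Name lines (such as `OpenAI: GPT-4o`) and description lines do not match the pattern, so a caller can run `parse_row` over every line of the page and keep only the non-`None` results. Where the live inventory is reachable, reading the structured response from `GET /v1/models` is likely more robust than parsing this text.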
