Introducing Mulu 1.5

14 AI Models for Coding

Our model. Your advantage. Mulu 1.5 delivers top-tier code quality that rivals the most expensive models on the market -- at a fraction of the price. Plus, switch to Claude, GPT, or Gemini anytime.

77.8%
SWE-bench Verified
200K
Context window
~5x
Cheaper than top models
14
Models available

Built to code, priced to scale

We built our own models because we believe the best AI coding experience shouldn't cost a fortune. Mulu 1.5 matches the top performers -- and Mulu 1.5 Lite handles quick tasks for almost nothing.

Fast & Light

Mulu 1.5 Lite

Blazing fast for quick iterations. Perfect for simple fixes, questions, and rapid prototyping. Responds almost instantly so you never lose your flow.

256K
Context tokens
<1s
Avg. response time
$0.10
Per 1M input tokens
$0.30
Per 1M output tokens

Proprietary blend

Our models are purpose-built for coding tasks. We've optimized every layer for the kind of work Mulu users actually do.

Tool calling built-in

Mulu models natively support tool calling -- file edits, terminal commands, and search happen seamlessly in the agent loop.
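The agent loop described above can be sketched in a few lines. This is a toy illustration of the pattern, not Mulu's actual API: the model proposes tool calls, the host executes them and feeds results back, and the loop ends when the model emits a final answer. All names here (`run_tool`, `ScriptedModel`, `agent_loop`) are hypothetical.

```python
# Minimal sketch of a tool-calling agent loop. The model proposes tool
# calls; the host runs them and appends results to the history until the
# model returns a final answer. Names are illustrative, not Mulu's API.

def run_tool(name, args):
    """Execute a tool call requested by the model (toy implementations)."""
    tools = {
        "read_file": lambda a: f"<contents of {a['path']}>",
        "run_command": lambda a: f"<output of `{a['cmd']}`>",
    }
    return tools[name](args)

class ScriptedModel:
    """Stand-in model that replays a fixed plan of tool calls."""
    def __init__(self, steps):
        self.steps = list(steps)

    def next(self, history):
        return self.steps.pop(0)

def agent_loop(model, prompt):
    """Alternate model steps and tool executions until a final answer."""
    history = [{"role": "user", "content": prompt}]
    while True:
        step = model.next(history)
        if "final" in step:
            return step["final"]
        result = run_tool(step["tool"], step["args"])
        history.append({"role": "tool", "content": result})

plan = [
    {"tool": "read_file", "args": {"path": "auth.py"}},
    {"tool": "run_command", "args": {"cmd": "pytest"}},
    {"final": "Auth added; tests pass."},
]
answer = agent_loop(ScriptedModel(plan), "Add auth to my app")
```

A real implementation would stream model output and sandbox tool execution, but the shape of the loop stays the same.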

Scales with you

From hobby projects to production apps. Use Lite for quick tasks and the flagship for complex builds -- pay only for what you need.

How Mulu stacks up

We benchmark Mulu against the top models so you don't have to wonder. Here's how our flagship model compares on real coding tasks.

Chart: SWE-bench Verified scores -- Mulu 1.5 vs Claude vs GPT vs Gemini
Chart: Cost per 1M tokens comparison across all models

The right model for every step -- automatically

Select "Mulu" and our router analyzes each subtask in real time, then picks the optimal model. Simple question? Lite handles it instantly. Complex refactor? The flagship takes over. You get the best output without thinking about it.

  • Quick fixes and questions routed to Mulu 1.5 Lite
  • Complex builds routed to Mulu 1.5 flagship
  • Planning steps use reasoning-optimized models
  • Fully transparent -- see which model handled each step
Example: "Add auth to my app"
  • Summarize files → Mulu 1.5 Lite
  • Write auth code → Mulu 1.5
  • Verify output → Mulu 1.5 Lite
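A routing decision like this could be approximated with a small heuristic. The sketch below is purely illustrative: the keyword list, thresholds, and model identifiers are assumptions, not Mulu's real routing logic.

```python
# Toy router sketch: pick a model tier from crude prompt features.
# Keywords, thresholds, and model names are illustrative assumptions,
# not Mulu's actual routing engine.

def route(prompt: str, context_tokens: int = 0) -> str:
    heavy_hints = ("refactor", "architect", "migrate", "implement")
    # Large contexts and heavyweight verbs suggest a complex build.
    if context_tokens > 50_000 or any(w in prompt.lower() for w in heavy_hints):
        return "mulu-1.5"        # flagship for complex builds
    # Short questions are quick wins for the fast tier.
    if prompt.rstrip().endswith("?") and len(prompt) < 200:
        return "mulu-1.5-lite"
    return "mulu-1.5-lite"       # default to the cheap, fast tier
```

A production router would likely use a learned classifier rather than keywords, but the interface (prompt features in, model identifier out) is the same idea.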

How Mulu's router decides

Our routing engine analyzes your prompt in real time -- looking at task complexity, code generation requirements, and context length to pick the optimal model for each step.

  • Pattern-based complexity analysis
  • 78% cost reduction vs always using premium models
  • 66% faster average response time
  • Zero quality loss -- smart tasks still get smart models
Screenshot of routing transparency -- showing which model handled each step in the activity feed
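The savings from routing are easy to verify with back-of-envelope arithmetic. The prices below come from the spec table on this page; the 80/20 task mix and per-task token counts are made-up assumptions for illustration, so the exact savings figure depends on your workload.

```python
# Back-of-envelope: cost of 100 tasks, each ~10K input / 2K output
# tokens, with an assumed 80/20 split of simple vs complex tasks.
# Prices ($ per 1M tokens) are from the spec table; the task mix is
# a hypothetical assumption.

PRICES = {  # (input, output) per 1M tokens
    "mulu-1.5":      (0.75, 2.55),
    "mulu-1.5-lite": (0.10, 0.30),
    "claude-opus":   (5.00, 25.00),
}

def task_cost(model, in_tok, out_tok):
    pi, po = PRICES[model]
    return (in_tok * pi + out_tok * po) / 1_000_000

# Routed: 80 simple tasks on Lite, 20 complex tasks on the flagship.
routed = (80 * task_cost("mulu-1.5-lite", 10_000, 2_000)
          + 20 * task_cost("mulu-1.5", 10_000, 2_000))
# Baseline: every task on a premium model.
premium = 100 * task_cost("claude-opus", 10_000, 2_000)
savings = 1 - routed / premium
```

Under this particular mix the savings exceed the quoted 78%; a workload with more complex tasks would land lower.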

Why one model isn't enough

Every AI model has different strengths. Some are fast, some are cheap, some are brilliant at reasoning. Mulu gives you all of them so you always have the right tool.

Speed vs. quality tradeoff

Quick questions don't need a heavyweight model. Use fast models for iteration and premium models for the final build. You control the balance.

Different models, different strengths

Claude excels at nuanced reasoning. GPT is great for broad knowledge. Gemini handles massive contexts. Mulu optimizes for code. Use each where it shines.

No vendor lock-in

If one provider has an outage or changes pricing, you switch to another in one click. Your projects aren't dependent on any single AI company.

Change models mid-conversation

Start with one model and switch to another without losing context. Your full conversation history carries over seamlessly -- just click and keep going.

  • Switch models without restarting
  • Full context preserved across switches
  • Keyboard shortcut for instant access
Screenshot of model switcher in action
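Mid-conversation switching works because the conversation history can be kept in a provider-agnostic form and replayed to whichever model is active. The sketch below uses a stand-in `EchoModel` class rather than real provider clients; all names are hypothetical.

```python
# Sketch: provider-agnostic history lets you swap models mid-conversation.
# EchoModel is a stand-in; a real client would call each provider's API
# with the same message list.

class EchoModel:
    def __init__(self, name):
        self.name = name

    def reply(self, history):
        # A real model would generate text from the full history.
        return f"[{self.name}] saw {len(history)} messages"

class Conversation:
    def __init__(self, model):
        self.model = model
        self.history = []

    def send(self, text):
        self.history.append({"role": "user", "content": text})
        answer = self.model.reply(self.history)
        self.history.append({"role": "assistant", "content": answer})
        return answer

    def switch(self, model):
        # Swap the backend; history carries over untouched.
        self.model = model

convo = Conversation(EchoModel("mulu-1.5"))
first = convo.send("Add login to my app")
convo.switch(EchoModel("claude-sonnet"))
second = convo.send("Now add logout")
```

Because the second model receives the full history, it picks up exactly where the first left off.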

Run the same prompt on multiple models

Not sure which model is best? Run your prompt on two or more models side by side and compare the results before choosing one to apply.

  • Side-by-side output comparison
  • Pick the best response and apply it
  • Compare speed, quality, and cost at a glance
Screenshot of multi-model comparison view
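Side-by-side comparison amounts to fanning the same prompt out to several models concurrently and collecting the results in order. This sketch fakes the model calls with `asyncio.sleep`; model names and latencies are assumptions.

```python
# Sketch: fan one prompt out to several models concurrently and collect
# results for side-by-side comparison. The sleep stands in for a network
# call; model names and latencies are illustrative.
import asyncio
import time

async def ask(model: str, prompt: str, delay: float):
    start = time.perf_counter()
    await asyncio.sleep(delay)           # stand-in for a provider API call
    return model, f"{model} answer to {prompt!r}", time.perf_counter() - start

async def compare(prompt: str):
    tasks = [
        ask("mulu-1.5", prompt, 0.02),
        ask("claude-sonnet", prompt, 0.01),
    ]
    # gather() runs the calls concurrently and preserves input order.
    return await asyncio.gather(*tasks)

results = asyncio.run(compare("sort a list of dicts by key"))
```

Each result carries the model name, its answer, and elapsed time, which is enough to drive a speed/quality/cost comparison view.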

14 models. One app.

From ultra-cheap quick tasks to maximum-quality flagship builds, we have a model for every scenario. Switch between them anytime.

  • Mulu 1.5 -- Flagship · 200K context · 77.8% SWE-bench · $0.75 / $2.55 per 1M
  • Mulu 1.5 Lite -- Fast · 256K context · Ultra-fast · $0.10 / $0.30 per 1M
  • Mulu Beast -- Budget, Fast · 128K context · Ultra-fast · $0.04 / $0.19 per 1M
  • Claude Sonnet 4.6 -- Quality · Anthropic · 1M context · Extended thinking
  • Claude Opus 4.6 -- Quality · Anthropic · 1M context · Most capable
  • Claude Haiku 4.5 -- Fast · Anthropic · 200K context · Fast & affordable
  • GPT-5.3 Codex -- Quality · OpenAI · 400K context · Code-optimized · 25% faster
  • GPT-5.4 -- General · OpenAI · 400K context · General purpose
  • GPT-5.4 Pro -- Quality · OpenAI · Maximum reasoning · Premium tier
  • Gemini 3 Flash -- Fast · Google · 1M context · Very fast
  • Gemini 3.1 Pro -- Quality · Google · 1M context · Deep Think mode
  • MiniMax M2.5 -- Budget · MiniMax · 200K context · Ultra cheap
  • Kimi K2.5 -- General · Moonshot · 256K context · Strong coder
  • Qwen 3.5 Plus -- General · Qwen · 1M context · Reasoning support

Full technical specs

Everything you need to know to pick the right model for your use case. All models support tool calling and streaming.

| Model             | Context | Max Output | Input / 1M | Output / 1M | Thinking   | Best For                            |
|-------------------|---------|------------|------------|-------------|------------|-------------------------------------|
| Mulu 1.5          | 200K    | 65K        | $0.75      | $2.55       | --         | Complex coding, multi-file projects |
| Mulu 1.5 Lite     | 256K    | 256K       | $0.10      | $0.30       | --         | Quick fixes, rapid iteration        |
| Mulu Beast        | 128K    | 65K        | $0.04      | $0.19       | --         | Ultra-cheap simple tasks            |
| Claude Sonnet 4.6 | 1M      | 8K         | $3.00      | $15.00      | Extended   | Nuanced code review, reasoning      |
| Claude Opus 4.6   | 1M      | 8K         | $5.00      | $25.00      | Extended   | Large codebase analysis, top quality|
| Claude Haiku 4.5  | 200K    | 8K         | $1.00      | $5.00       | Extended   | Fast and affordable, quick tasks    |
| GPT-5.3 Codex     | 400K    | 128K       | $1.75      | $14.00      | Adjustable | Code generation, 25% faster         |
| GPT-5.4           | 400K    | 128K       | $1.75      | $14.00      | Adjustable | General-purpose, broad knowledge    |
| GPT-5.4 Pro       | 400K    | 128K       | $21.00     | $168.00     | Deep       | Maximum reasoning, hard problems    |
| Gemini 3 Flash    | 1M      | 65K        | $0.50      | $3.00       | --         | Large contexts, fast response       |
| Gemini 3.1 Pro    | 1M      | 65K        | $2.00      | $12.00      | Deep Think | Large contexts, flagship quality    |
| MiniMax M2.5      | 200K    | 65K        | $0.30      | $1.10       | --         | Ultra-cheap coding tasks            |
| Kimi K2.5         | 256K    | 65K        | $0.45      | $2.20       | --         | Strong coding, great value          |
| Qwen 3.5 Plus     | 1M      | 65K        | $0.40      | $2.40       | Adjustable | Large contexts, excellent value     |

No surprise bills. Ever.

Mulu shows you estimated costs before you send each message. You see exactly which model is being used and what it costs. Set spending limits per model to stay in control.

  • Real-time cost estimates per message
  • Monthly usage dashboard
  • Set spending limits per model
Screenshot of cost transparency UI showing per-model usage breakdown
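A pre-send cost estimate plus a spending cap is straightforward to model. The prices below are from the spec table on this page; the 4-characters-per-token heuristic, the expected-output default, and the limit values are assumptions for illustration, not Mulu's actual accounting.

```python
# Sketch of a pre-send cost estimate with a per-model spending cap.
# Prices are from the spec table; the ~4 chars/token heuristic and the
# cap values are illustrative assumptions.

PRICES = {  # (input, output) $ per 1M tokens
    "mulu-1.5":    (0.75, 2.55),
    "gpt-5.4-pro": (21.00, 168.00),
}

def estimate(model, prompt, expected_out_tokens=1_000):
    """Rough dollar cost of one message before it is sent."""
    in_tokens = len(prompt) / 4          # crude heuristic: ~4 chars/token
    pi, po = PRICES[model]
    return (in_tokens * pi + expected_out_tokens * po) / 1_000_000

def allowed(model, prompt, spent, limit):
    """Block the send if it would push spend for this model past its cap."""
    return spent + estimate(model, prompt) <= limit
```

For a ~1,000-token prompt, a Mulu 1.5 message estimates to fractions of a cent, while the same message on a premium-tier model can cost tens of cents, which is exactly the gap a per-model limit is meant to surface before you hit send.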

Why start with Mulu

Our models are designed specifically for the kind of coding work you do in Mulu. Here's why most users stick with them as their default.

Top-tier results, bottom-tier price

Mulu 1.5 scores 77.8% on SWE-bench Verified -- matching models that cost 4-20x more per token. You get the same code quality for a fraction of the cost.

Built for coding, not chatting

Unlike general-purpose models, Mulu is optimized for the coding workflow -- tool calling, file edits, terminal commands, and multi-step agent loops. It's not trying to write poetry.

Always a fallback

If you ever want a different perspective, Claude, GPT, and Gemini are one click away. You're never locked in. Use Mulu as default, switch when you want.

Start building with Mulu

14 models across 7 providers. Mulu, Claude, GPT, Gemini, and more. Get the best AI for every coding task.