Bentoo AI — Models · Every frontier AI in one place

Qwen3.7-Plus

Qwen

Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It accepts text and image input and produces text output. Building on the series' text-processing capabilities, it substantially upgrades vision-language capabilities while retaining full-stack agent support for coding, tool use, and productivity workflows. Its standout feature is multimodal interactive agent capability: it can perceive real-world scenes, read screens, interact with GUIs, generate code from visual references, and navigate mobile applications end to end.

Deep thinking · Visual comprehension · Text generation

In $0.360/1M

Out $1.440/1M

Save 10%
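As a rough illustration of how the per-1M-token In and Out rates on a card translate into a per-request cost, here is a minimal Python sketch. It assumes flat, linear billing at the listed Qwen3.7-Plus rates; the function name `estimate_cost` and the example token counts are hypothetical, and actual billing (caching, tiers, discounts) may differ.

```python
from decimal import Decimal

# Listed Qwen3.7-Plus rates in USD per 1M tokens, copied from the card above.
IN_PER_MILLION = Decimal("0.360")
OUT_PER_MILLION = Decimal("1.440")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  in_rate: Decimal = IN_PER_MILLION,
                  out_rate: Decimal = OUT_PER_MILLION) -> Decimal:
    """Estimate the USD cost of one request, assuming flat per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (in_rate * input_tokens + out_rate * output_tokens) / MILLION


# Hypothetical request: 12,000 input tokens and 800 output tokens.
cost = estimate_cost(12_000, 800)
print(f"${cost:.6f}")  # prints $0.005472
```

Passing different `in_rate` and `out_rate` values lets the same helper compare any two token-priced models on this page for a given workload.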

Claude Opus 4.8

Anthropic

Claude Opus 4.8 is the most powerful general-purpose model in Anthropic's Opus series. It accepts text, image, and file input, outputs text, supports reasoning, and has a context window of 1 million tokens. It is suited to highly autonomous agents, long-running agent tasks, knowledge work, and memory-driven tasks that require high session consistency.

Coding · Reasoning · Agents

In $4.50/1M

Out $22.50/1M

Save 10%

Qwen3.7-Max

Qwen

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It takes text input and produces text output, and is designed for agent-centric workloads, excelling at coding, office and productivity tasks, and long-running autonomous execution. Compared with earlier Qwen models, it significantly improves coding and agent performance, and it supports explicit prompt caching for efficient context reuse.

Deep thinking · Visual comprehension · Text generation

In $1.125/1M

Out $3.375/1M

Save 10%

GPT-5.5

OpenAI

OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and 1M+ token context.

Reasoning · Multimodal · Coding

In $4.5/1M

Out $27/1M

Save 10%

Qwen3.6-Flash

Qwen

Qwen3.6-Flash is a fast, efficient language model in Alibaba's Qwen3.6 series. It accepts text, image, and video input and has a context window of 1 million tokens. Tiered pricing applies beyond 256K tokens. It supports instant caching, and explicit caching is priced in two parts: cache reads and cache creation.

Deep thinking · Visual comprehension · Text generation

In $0.169/1M

Out $1.013/1M

Save 10%

Qwen Image Plus

Qwen

Qwen Image Plus is an enhanced multimodal model for image generation and precise editing. It supports fine-grained single-image editing, multi-image fusion, and object transformation, preserving character identity and product features without drift and matching lighting and textures naturally. It supports resolutions up to 2048×2048 and offers stronger Chinese text rendering, better instruction following, and better detail preservation, making it suitable for e-commerce product images, design drafts, and marketing materials.

Deep thinking · Visual comprehension · Text generation

In $0.027/img

Save 10%

Claude Opus 4.7

Anthropic

The next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Delivers stronger performance on complex, multi-step tasks and reliable agentic execution.

Coding · Agents · Reasoning

In $4.50/1M

Out $22.50/1M

Save 10%

Qwen3.6-Plus

Qwen

Qwen3.6-Plus is the Plus model in the Qwen3.6 native vision-language series. It delivers performance comparable to current leading models and a significant quality improvement over the 3.5 series. Its coding capabilities are significantly enhanced, including agentic coding, front-end programming, and vibe coding, as are its multimodal capabilities, such as general-purpose recognition, OCR, and object localization.

Deep thinking · Visual comprehension · Text generation

In $0.293/1M

Out $1.755/1M

Save 10%

GPT-5.4

OpenAI

GPT-5.4 is an OpenAI frontier model that unifies the Codex and GPT series in one system. It has a context window of over 1 million tokens (922K input, 128K output) and supports text and image input, enabling long-context reasoning, coding, and multimodal analysis in the same workflow.

Audio Transcription

In $2.25/1M

Out $13.5/1M

Save 10%
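The GPT-5.4 card splits its context window into an input limit and an output limit. The sketch below shows how those two figures add up and how a request could be checked against them before sending. It assumes "K" means 1,000 tokens and that the two limits apply independently; the function name `fits_window` is hypothetical and not part of any provider API.

```python
# Limits quoted in the GPT-5.4 card above; "K" is assumed to mean 1,000 tokens.
MAX_INPUT_TOKENS = 922_000
MAX_OUTPUT_TOKENS = 128_000
TOTAL_WINDOW = MAX_INPUT_TOKENS + MAX_OUTPUT_TOKENS  # 1,050,000 tokens


def fits_window(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if both the prompt and the requested output fit their limits."""
    return (0 <= prompt_tokens <= MAX_INPUT_TOKENS
            and 0 <= max_output_tokens <= MAX_OUTPUT_TOKENS)


print(TOTAL_WINDOW)                   # prints 1050000
print(fits_window(900_000, 100_000))  # prints True
print(fits_window(950_000, 50_000))   # prints False (prompt exceeds the input limit)
```

Under these assumptions the combined window is 1,050,000 tokens, which matches the card's "over 1 million tokens" description.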

Qwen3.5-Plus

Qwen

Qwen3.5-Plus is the Plus model in the Qwen3.5 native vision-language series. It is built on a hybrid architecture that combines a linear attention mechanism with a sparse mixture-of-experts (MoE) model for higher inference efficiency. Across multiple task evaluations, the 3.5 series performs comparably to current leading models. Compared with the 3 series, it is a major step up in both text-only and multimodal performance.

Deep Thinking · Visual Understanding · Text Generation

In $0.270/1M

Out $1.620/1M

Save 10%

DeepSeek V3.2 Exp

DeepSeek

DeepSeek V3.2 Exp is a large language model developed by DeepSeek. It excels at code generation, mathematical reasoning, and related tasks.

Deep thinking · Text generation · Reasoning

In $0.243/1M

Out $0.369/1M

Save 10%

Qwen3.5-Flash

Qwen

Qwen3.5-Flash is the Flash model in the Qwen3.5 native vision-language series. It is built on a hybrid architecture that combines a linear attention mechanism with a sparse mixture-of-experts (MoE) model for higher inference efficiency. It is a major step up from the 3 series in both text-only and multimodal performance, responds quickly, and balances inference speed with output quality.

Deep thinking · Visual comprehension · Text generation

In $0.0585/1M

Out $0.234/1M

Save 10%

Gemini 3.5 Flash

Google

Gemini 3.5 Flash is Google's efficient multimodal model that achieves near-professional-grade coding and reasoning capabilities at Flash-level cost and speed. It is highly optimized for coding efficiency and parallel agent execution loops, supporting text, image, video, audio, and PDF input.

Reasoning · Multimodal · Agentic

In $1.35/1M

Out $8.10/1M

Save 10%

Gemini 3.1 Pro Preview

Google

Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows.

Reasoning · Coding · Agentic

In $1.8/1M

Out $10.8/1M

Save 10%

Wan2.6-I2V-Flash

Qwen

WanXiang 2.6 image-to-video, Flash version: faster generation and better cost-effectiveness. Intelligent shot scheduling supports multi-shot narratives, stable multi-person dialogue, and more natural, realistic audio. Clips can be up to 15 seconds long.

Video Generation

In $0.07974/s

Save 10%
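For per-second video pricing, the listed rate and the 15-second maximum together bound the cost of a single clip. The following Python sketch works that out under the assumption that billing is linear in the generated duration; `clip_cost` is a hypothetical helper, and the minimum billable duration of 1 second is also an assumption.

```python
from decimal import Decimal

# Values from the Wan2.6-I2V-Flash card above.
PRICE_PER_SECOND = Decimal("0.07974")  # USD per second of generated video
MAX_DURATION_SECONDS = 15


def clip_cost(seconds: int) -> Decimal:
    """Estimate the USD cost of one clip, assuming linear per-second billing."""
    if not 1 <= seconds <= MAX_DURATION_SECONDS:
        raise ValueError(f"duration must be 1 to {MAX_DURATION_SECONDS} seconds")
    return PRICE_PER_SECOND * seconds


print(clip_cost(5))                     # prints 0.39870
print(clip_cost(MAX_DURATION_SECONDS))  # prints 1.19610
```

Under these assumptions, a maximum-length 15-second clip costs about $1.20 at the listed rate.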

Wan2.6-T2I

Qwen

WanXiang 2.6 text-to-image. Image texture, aesthetic quality, and instruction following have all been upgraded. It excels at precise control of artistic style, realistic and emotionally engaging images, image generation from long text, and coverage of a wide range of historical and cultural IP, producing high-quality, expressive visual content.

Image generation

In $0.027/img

Save 10%

Wan2.6-R2V-Flash

Qwen

WanXiang 2.6 reference-to-video, Flash version: faster generation and better cost-effectiveness. A specific person or any object can be supplied as a reference, and its appearance and voice are kept consistent. Multiple reference characters can appear together in one video.

Video Generation

In $0.07974/s

Save 10%

Wan2.6-Image

Qwen

WanXiang 2.6 image generation: an all-in-one image generation model that supports integrated text-image reasoning and generation. It offers multi-image creative fusion, commercial-grade consistency, aesthetic element transfer, and precise control of camera, lighting, and shadow, significantly improving the consistency, controllability, and expressiveness of generated images.

Image generation

In $0.02655/img

Save 10%

Wan2.6-T2V

Qwen

WanXiang 2.6 text-to-video. Intelligent shot scheduling supports multi-shot narratives, generating videos whose main subjects, scenes, and atmosphere stay consistent across shots. Clips can be up to 15 seconds long, with higher-quality audio generation, better instruction following, and better visual quality.

Video Generation

In $0.009/s

Save 10%

Qwen3-Max

Qwen

The Max model of the Qwen3 series has received targeted upgrades in agentic coding and tool calling compared with the preview version. The official release reaches state-of-the-art performance in its field and is better suited to the more complex demands of agent workloads.

Text generation · Deep thinking

In $0.702/1M

Out $3.510/1M

Save 10%

DeepSeek R1

DeepSeek

DeepSeek R1 offers performance comparable to OpenAI o1, but it is open source and its reasoning tokens are fully exposed. It has 671 billion parameters in total, of which 37 billion are active during a single inference.

Deep thinking · Text generation · Reasoning

In $0.63/1M

Out $2.25/1M

Save 10%

Z-Image-Turbo

Qwen

Z-Image-Turbo is an efficient image generation model that topped the open-source image model rankings in the Artificial Analysis evaluation. With only 6 billion parameters and 8 inference steps, it can generate photorealistic images comparable to those of large commercial models. It also excels at Chinese and English text rendering, complex semantic understanding, and generation across diverse subjects.

Image generation

In $0.0090/img

Out $0.0090/img

Save 10%

Qwen-MT-Image

Qwen

Qwen-MT-Image is a dedicated image translation model. It translates images from 11 source languages, including Chinese, English, and Japanese, into the target language while accurately reproducing the layout and content of the original image. It supports customization such as terminology definitions, sensitive-word filtering, and product subject detection, providing flexible, accurate, and efficient image localization.

Image generation

In $0.0003/img

Out $0.0003/img

Save 10%

Qwen-Image-Edit-Plus

Qwen

The Plus model in the Qwen image editing series further optimizes inference performance and system stability over the initial Edit model, significantly reducing response times for image generation and editing. It can return multiple images in a single request, which improves the user experience.

Image generation

In $0.027/img

Save 10%

Qwen-Image-2.0-Pro

Qwen

Qwen-Image-2.0-Pro is the full-capability model of the Qwen-Image-2.0 series, unifying image generation and image editing. It offers more professional text rendering with support for instructions of up to 1K tokens, finer and more realistic textures, more detailed depiction of realistic scenes, and stronger semantic adherence. It has the strongest text rendering and realistic texture capabilities in the 2.0 series.

Image generation

In $0.0675/img

Save 10%

Qwen-Image-Edit-Max

Qwen

The Max model in the Qwen image editing series offers more stable and comprehensive editing: stronger industrial design and geometric reasoning, improved character consistency, fewer offset issues, and integrated LoRA capabilities that enable more editing functions. This version is a snapshot as of January 16, 2026.

Image generation

In $0.0675/img

Save 10%

Qwen-Image-2.0

Qwen

Qwen-Image-2.0 is the accelerated model of the Qwen-Image-2.0 series, unifying image generation and image editing. It offers more professional text rendering with support for instructions of up to 1K tokens, finer and more realistic textures, more detailed depiction of realistic scenes, and stronger semantic adherence. The accelerated version balances output quality against performance.

Image generation

In $0.0315/img

Save 10%

MiniMax-M2.5

MiniMax

A state-of-the-art (SOTA) agent model designed for Agent 2.0, MiniMax-M2.5 extends coding into real-world settings, including the workplace, entertainment, and personal assistance. Model highlights: global SOTA among open-source coding and agent models; scores higher than Opus 4.6 on SWE-bench Pro and SWE-bench Verified; global SOTA in Excel, search and research, and document summarization; well suited as the main model for future workspaces; lightning-fast, with optimized thinking efficiency and 100+ TPS, 3 times faster than Opus; and highly cost-effective, supporting always-on agents.

Deep thinking · Text generation

In $0.135/1M

Out $1.035/1M

Save 10%

Qwen-Image-Max

Qwen

The Max model in the Qwen image generation series performs exceptionally well across generation tasks. Compared with the Plus series, it significantly reduces the artificial look of generated images and makes them more realistic, with more lifelike human textures, finer natural textures, and more aesthetically pleasing text rendering.

Image generation

In $0.0675/img

Save 10%

GLM-5.1

Z.ai

GLM-5.1 is a model designed by Zhipu AI for long-horizon tasks. It has 744B parameters in total, supports a 200K-token context, and can output up to 128K tokens. It offers strong logical reasoning, long-text understanding, and code generation, and balances performance with inference efficiency. It performs exceptionally well on multi-task benchmarks and is suited to scenarios such as intelligent interaction, enterprise applications, and development assistance.

Text Generation

In $0.54/1M

Out $1.872/1M

Save 10%

Kimi-K2.5

Moonshot

Kimi-K2.5 is the most versatile model released by Moonshot AI to date. It has a native multimodal architecture and supports both visual and text input, thinking and non-thinking modes, and both dialogue and agent tasks.

Deep thinking · Visual comprehension · Text generation

In $0.36/1M

Out $1.71/1M

Save 10%

Every frontier model.
One unified API.

One key.
Every model on this page.
