Unitalk
Higress
higress
Qwen

Qwen Turbo

Qwen Turbo is a large-scale language model supporting input in various languages including Chinese and English.
128K
higress
Qwen

Qwen Plus

Qwen Plus is an enhanced large-scale language model supporting input in various languages including Chinese and English.
128K
higress
Qwen

Qwen Max

Qwen Max is a hundred-billion-parameter-scale large language model that supports input in various languages including Chinese and English, and is the API model behind the current Qwen 2.5 product version.
32K
higress
Qwen

Qwen Long

Qwen Long is a large-scale language model that supports long text contexts, with dialogue capabilities grounded in long documents and multiple documents.
1M
higress
Qwen

Qwen VL Plus

The enhanced version of Tongyi Qianwen's large-scale visual language model, with significantly improved detail recognition and text recognition, supporting ultra-high-resolution images of any aspect ratio.
32K
higress
Qwen

Qwen VL Max

Tongyi Qianwen's ultra-large-scale visual language model. Compared to the enhanced version, it further improves visual reasoning and instruction-following abilities, providing a higher level of visual perception and cognition.
32K
higress
Qwen

Qwen Math Turbo

The Tongyi Qianwen Math model is specifically designed for solving mathematical problems.
4K
higress
Qwen

Qwen Math Plus

The Tongyi Qianwen Math model is specifically designed for solving mathematical problems.
4K
higress
Qwen

Qwen Coder Turbo

The Tongyi Qianwen Coder model.
128K
higress
Qwen

Qwen2.5 7B

The open-source Qwen2.5 instruction-tuned model with 7 billion parameters, supporting Chinese, English, and other languages.
128K
higress
Qwen

Qwen2.5 14B

The open-source Qwen2.5 instruction-tuned model with 14 billion parameters, supporting Chinese, English, and other languages.
128K
higress
Qwen

Qwen2.5 32B

The open-source Qwen2.5 instruction-tuned model with 32 billion parameters, supporting Chinese, English, and other languages.
128K
higress
Qwen

Qwen2.5 72B

The open-source Qwen2.5 instruction-tuned model with 72 billion parameters, supporting Chinese, English, and other languages.
128K
higress
Qwen

Qwen2.5 Math 1.5B

The open-source Qwen2.5 Math instruction-tuned model with 1.5 billion parameters, specifically designed for solving mathematical problems.
4K
higress
Qwen

Qwen2.5 Math 7B

The open-source Qwen2.5 Math instruction-tuned model with 7 billion parameters, specifically designed for solving mathematical problems.
4K
higress
Qwen

Qwen2.5 Math 72B

The open-source Qwen2.5 Math instruction-tuned model with 72 billion parameters, specifically designed for solving mathematical problems.
4K
higress
Qwen

Qwen2.5 Coder 1.5B

The open-source Qwen2.5 Coder instruction-tuned model with 1.5 billion parameters, specialized for code generation and completion.
128K
higress
Qwen

Qwen2.5 Coder 7B

The open-source Qwen2.5 Coder instruction-tuned model with 7 billion parameters, specialized for code generation and completion.
128K
higress
Qwen

Qwen VL

Initialized from the Qwen-7B language model, this pre-trained model adds a visual encoder with an input image resolution of 448.
8K
higress
Qwen

Qwen VL Chat

Qwen VL supports flexible interaction methods, including multi-image, multi-turn Q&A, and creative capabilities.
8K
higress
MoonshotAI

Moonshot V1 8K

Moonshot V1 8K is designed for generating short text tasks, featuring efficient processing performance, capable of handling 8,192 tokens, making it ideal for brief dialogues, note-taking, and rapid content generation.
8K
higress
MoonshotAI

Moonshot V1 32K

Moonshot V1 32K offers medium-length context processing capabilities, able to handle 32,768 tokens, particularly suitable for generating various long documents and complex dialogues, applicable in content creation, report generation, and dialogue systems.
32K
higress
MoonshotAI

Moonshot V1 128K

Moonshot V1 128K is a model with ultra-long context processing capabilities, suitable for generating extremely long texts, meeting the demands of complex generation tasks, capable of handling up to 128,000 tokens, making it ideal for research, academia, and large document generation.
128K
higress
Baichuan

Baichuan 4

The top-ranked model in China, surpassing mainstream foreign models on Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also offers industry-leading multimodal capabilities and excels on multiple authoritative evaluation benchmarks.
32K
higress
Baichuan

Baichuan 4 Turbo

The leading model in the country, surpassing mainstream foreign models in Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also possesses industry-leading multimodal capabilities, excelling in multiple authoritative evaluation benchmarks.
--
higress
Baichuan

Baichuan 4 Air

The leading model in the country, surpassing mainstream foreign models in Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also possesses industry-leading multimodal capabilities, excelling in multiple authoritative evaluation benchmarks.
--
higress
Baichuan

Baichuan 3 Turbo

Optimized for high-frequency enterprise scenarios, significantly improving performance and cost-effectiveness. Compared to the Baichuan2 model, content creation improves by 20%, knowledge Q&A by 17%, and role-playing ability by 40%. Overall performance is superior to GPT-3.5.
32K
higress
Baichuan

Baichuan 3 Turbo 128k

Features a 128K ultra-long context window, optimized for high-frequency enterprise scenarios, significantly improving performance and cost-effectiveness. Compared to the Baichuan2 model, content creation improves by 20%, knowledge Q&A by 17%, and role-playing ability by 40%. Overall performance is superior to GPT-3.5.
128K
higress
Baichuan

Baichuan 2 Turbo

Utilizes search enhancement technology to achieve comprehensive links between large models and domain knowledge, as well as knowledge from the entire web. Supports uploads of various documents such as PDF and Word, and URL input, providing timely and comprehensive information retrieval with accurate and professional output.
32K
higress
Yi

Yi Lightning

The latest high-performance model, ensuring high-quality output while significantly improving reasoning speed.
16K
higress
Yi

Yi Spark

Small yet powerful, lightweight and fast model. Provides enhanced mathematical computation and coding capabilities.
16K
higress
Yi

Yi Medium

An upgraded and fine-tuned medium-sized model with balanced capabilities and a high cost-performance ratio; instruction-following has been deeply optimized.
16K
higress
Yi

Yi Medium 200K

200K ultra-long context window, providing deep understanding and generation capabilities for long texts.
200K
higress
Yi

Yi Large Turbo

Exceptional performance with a high cost-performance ratio, tuned with high precision to balance performance, inference speed, and cost.
16K
higress
Yi

Yi Large RAG

An advanced service built on the powerful yi-large model, combining retrieval and generation techniques to provide precise answers and real-time information retrieval.
16K
higress
Yi

Yi Large FC

Based on the yi-large model, supports and enhances tool invocation capabilities, suitable for various business scenarios requiring agent or workflow construction.
32K
higress
Yi

Yi Large

A new hundred-billion-parameter model, providing exceptionally strong question-answering and text generation capabilities.
32K
higress
Yi

Yi Vision

Model for complex visual tasks, providing high-performance image understanding and analysis capabilities.
16K
higress
Yi

Yi Large Preview

Initial version, recommended to use yi-large (new version).
16K
higress
Yi

Yi Lightning Lite

A lightweight version, recommended to use yi-lightning.
16K
higress
ChatGLM

GLM-4-Flash

GLM-4-Flash is the ideal choice for handling simple tasks, being the fastest and most cost-effective.
128K
higress
ChatGLM

GLM-4-FlashX

GLM-4-FlashX is an enhanced version of Flash, featuring ultra-fast inference speed.
128K
higress
ChatGLM

GLM-4-Long

GLM-4-Long supports ultra-long text inputs, suitable for memory-based tasks and large-scale document processing.
1M
higress
ChatGLM

GLM-4-Air

GLM-4-Air is a cost-effective version with performance close to GLM-4, offering fast speed at an affordable price.
128K
higress
ChatGLM

GLM-4-AirX

GLM-4-AirX provides an efficient version of GLM-4-Air, with inference speeds up to 2.6 times faster.
8K
higress
ChatGLM

GLM-4-AllTools

GLM-4-AllTools is a multifunctional intelligent agent model optimized to support complex instruction planning and tool invocation, such as web browsing, code interpretation, and text generation, suitable for multitasking.
128K
higress
ChatGLM

GLM-4-Plus

GLM-4-Plus, as a high-intelligence flagship, possesses strong capabilities for processing long texts and complex tasks, with overall performance improvements.
128K
higress
ChatGLM

GLM-4-0520

GLM-4-0520 is the latest model version designed for highly complex and diverse tasks, demonstrating outstanding performance.
128K
higress
ChatGLM

GLM-4

GLM-4 is the previous flagship version, released in January 2024; it has since been superseded by the more powerful GLM-4-0520.
128K
higress
ChatGLM

GLM-4V-Plus

GLM-4V-Plus has the ability to understand video content and multiple images, suitable for multimodal tasks.
8K
higress
ChatGLM

GLM-4V

GLM-4V provides strong image understanding and reasoning capabilities, supporting various visual tasks.
2K
higress
ChatGLM

CharGLM-3

CharGLM-3 is designed for role-playing and emotional companionship, supporting ultra-long multi-turn memory and personalized dialogue, with wide applications.
4K
higress
ChatGLM

Emohaa

Emohaa is a psychological model with professional counseling capabilities, helping users understand emotional issues.
8K
higress
AI360

360GPT2 Pro

360GPT2 Pro is an advanced natural language processing model launched by 360, featuring exceptional text generation and understanding capabilities, particularly excelling in generation and creative tasks, capable of handling complex language transformations and role-playing tasks.
8K
higress
AI360

360GPT Pro

360GPT Pro, as an important member of the 360 AI model series, meets diverse natural language application scenarios with efficient text processing capabilities, supporting long text understanding and multi-turn dialogue.
8K
higress
AI360

360GPT Turbo

360GPT Turbo offers powerful computation and dialogue capabilities, with excellent semantic understanding and generation efficiency, making it an ideal intelligent assistant solution for enterprises and developers.
8K
higress
AI360

360GPT Turbo Responsibility 8K

360GPT Turbo Responsibility 8K emphasizes semantic safety and responsibility, designed specifically for applications with high content safety requirements, ensuring accuracy and robustness in user experience.
8K
higress
Wenxin

ERNIE 3.5 8K

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 3.5 8K Preview

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 3.5 128K

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information.
128K
higress
Wenxin

ERNIE 4.0 8K

Baidu's self-developed flagship ultra-large-scale language model, which has achieved a comprehensive upgrade in model capabilities compared to ERNIE 3.5, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 4.0 8K Preview

Baidu's self-developed flagship ultra-large-scale language model, which has achieved a comprehensive upgrade in model capabilities compared to ERNIE 3.5, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 4.0 Turbo 8K

Baidu's self-developed flagship ultra-large-scale language model, demonstrating excellent overall performance, suitable for complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information. It offers better performance compared to ERNIE 4.0.
8K
higress
Wenxin

ERNIE 4.0 Turbo 8K Preview

Baidu's self-developed flagship ultra-large-scale language model, demonstrating excellent overall performance, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information. It outperforms ERNIE 4.0 in performance.
8K
higress
Wenxin

ERNIE Lite Pro 128K

Baidu's self-developed lightweight large language model, balancing excellent model performance with inference efficiency, offering better results than ERNIE Lite, suitable for inference on low-power AI acceleration cards.
128K
higress
Wenxin

ERNIE Speed Pro 128K

Baidu's latest self-developed high-performance large language model released in 2024, with outstanding general capabilities, providing better results than ERNIE Speed, suitable as a base model for fine-tuning, effectively addressing specific scenario issues while also exhibiting excellent inference performance.
128K
higress
Wenxin

ERNIE Speed 128K

Baidu's latest self-developed high-performance large language model released in 2024, with outstanding general capabilities, suitable as a base model for fine-tuning, effectively addressing specific scenario issues while also exhibiting excellent inference performance.
128K
higress
Wenxin

ERNIE Character 8K

Baidu's self-developed vertical scene large language model, suitable for applications such as game NPCs, customer service dialogues, and role-playing conversations, featuring more distinct and consistent character styles, stronger adherence to instructions, and superior inference performance.
8K
higress
Hunyuan

Hunyuan Lite

Upgraded to an MoE structure with a 256K context window, leading many open-source models in various NLP, coding, mathematics, and industry benchmarks.
256K
higress
Hunyuan

Hunyuan Standard

Utilizes a superior routing strategy while alleviating issues of load balancing and expert convergence. For long texts, the needle-in-a-haystack metric reaches 99.9%. MOE-32K offers a relatively higher cost-performance ratio, balancing effectiveness and price while enabling processing of long text inputs.
32K
higress
Hunyuan

Hunyuan Standard 256K

Utilizes a superior routing strategy while alleviating issues of load balancing and expert convergence. For long texts, the needle-in-a-haystack metric reaches 99.9%. MOE-256K further breaks through in length and effectiveness, greatly expanding the input length capacity.
256K
higress
Hunyuan

Hunyuan Turbo

The preview version of the next-generation Hunyuan large language model, featuring a brand-new mixture-of-experts (MoE) structure that offers faster inference and stronger performance than Hunyuan Pro.
32K
higress
Hunyuan

Hunyuan Pro

A trillion-parameter scale MOE-32K long text model. Achieves absolute leading levels across various benchmarks, capable of handling complex instructions and reasoning, with advanced mathematical abilities, supporting function calls, and optimized for applications in multilingual translation, finance, law, and healthcare.
32K
higress
Hunyuan

Hunyuan Large

--
--
higress
Hunyuan

Hunyuan Vision

The latest multimodal model from Hunyuan, supporting image + text input to generate textual content.
8K
higress
Hunyuan

Hunyuan Code

The latest code generation model from Hunyuan, trained on a base model with 200B tokens of high-quality code data and iteratively trained for six months on high-quality SFT data, with the context window length increased to 8K. It ranks among the top in automatic code generation evaluation metrics across five major programming languages, and in the first tier in comprehensive human evaluations across ten aspects of coding tasks.
8K
higress
Hunyuan

Hunyuan FunctionCall

The latest MoE-architecture FunctionCall model from Hunyuan, trained on high-quality FunctionCall data, with a 32K context window, leading across multiple evaluation dimensions (a tool-invocation request sketch follows this entry).
32K
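
To make the FunctionCall capability concrete, the sketch below shows what a tool-invocation request generally looks like when such a model is reached through an OpenAI-compatible gateway. This is a minimal illustration, not the vendor's official usage: the gateway base URL, API key, model identifier, and the get_weather tool are placeholders chosen for this example, and exact parameter support depends on the gateway and upstream provider.

    # Minimal sketch of an OpenAI-compatible tool-invocation request.
    # Placeholders: gateway base URL, API key, model name, and the example tool.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://your-gateway.example.com/v1",  # placeholder gateway address
        api_key="YOUR_API_KEY",                          # placeholder credential
    )

    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration only
            "description": "Look up the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="hunyuan-functioncall",  # placeholder model identifier
        messages=[{"role": "user", "content": "What's the weather in Shenzhen?"}],
        tools=tools,
    )

    # When the model decides to call the tool, the call arrives as structured
    # JSON (name + arguments) instead of free text.
    print(resp.choices[0].message.tool_calls)
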
higress
Hunyuan

Hunyuan Role

The latest role-playing model from Hunyuan, fine-tuned and trained by Hunyuan's official team, based on the Hunyuan model combined with role-playing scenario datasets for enhanced foundational performance in role-playing contexts.
8K
higress
Stepfun

Step 1 Flash

High-speed model, suitable for real-time dialogues.
8K
higress
Stepfun

Step 1 8K

Small model, suitable for lightweight tasks.
8K
higress
Stepfun

Step 1 32K

Supports medium-length dialogues, applicable to various application scenarios.
32K
higress
Stepfun

Step 1 128K

Balances performance and cost, suitable for general scenarios.
128K
higress
Stepfun

Step 1 256K

Equipped with ultra-long context processing capabilities, especially suitable for long document analysis.
256K
higress
Stepfun

Step 2 16K

Supports large-scale context interactions, suitable for complex dialogue scenarios.
16K
higress
Stepfun

Step 1V 8K

A small visual model suitable for basic text and image tasks.
8K
higress
Stepfun

Step 1V 32K

Supports visual input, enhancing multimodal interaction experiences.
32K
higress
Stepfun

Step 1.5V Mini

This model has powerful video understanding capabilities.
32K
higress
Spark

Spark Lite

Spark Lite is a lightweight large language model with extremely low latency and efficient processing, completely free and open, and supporting real-time online search. Its fast responses make it excel at inference and model fine-tuning on low-compute devices, delivering excellent cost-effectiveness and an intelligent experience, particularly in knowledge Q&A, content generation, and search scenarios.
8K
higress
Spark

Spark Pro

Spark Pro is a high-performance large language model optimized for professional fields such as mathematics, programming, healthcare, and education, supporting online search and built-in plugins for weather, dates, and more. The optimized model demonstrates excellent performance and efficiency in complex knowledge Q&A, language understanding, and high-level text creation, making it an ideal choice for professional application scenarios.
8K
higress
Spark

Spark Pro 128K

Spark Pro 128K is equipped with an extra-large context processing capability, able to handle up to 128K of contextual information, making it particularly suitable for long-form content that requires comprehensive analysis and long-term logical connections, providing smooth and consistent logic and diverse citation support in complex text communication.
128K
higress
Spark

Spark Max

Spark Max is a higher-tier version of the Spark large language model, offering stronger core capabilities than Spark Pro and suitable for complex scenarios such as knowledge Q&A, logical reasoning, and text creation.
8K
higress
Spark

Spark Max 32K

Spark Max 32K is configured with large context processing capabilities, enhanced contextual understanding, and logical reasoning abilities, supporting text input of 32K tokens, suitable for long document reading, private knowledge Q&A, and other scenarios.
32K
higress
Spark

Spark 4.0 Ultra

Spark 4.0 Ultra is the most powerful version in the Spark model series, upgrading the online search chain while enhancing text understanding and summarization. It is a comprehensive solution for boosting office productivity and responding accurately to real needs, and an industry-leading intelligent product.
8K
higress
OpenAI

OpenAI o1-mini

o1-mini is a fast and cost-effective reasoning model designed for programming, mathematics, and scientific applications. This model features a 128K context and has a knowledge cutoff date of October 2023.
128K
higress
OpenAI

OpenAI o1-preview

o1 is OpenAI's new reasoning model, suitable for complex tasks that require extensive general knowledge. This model features a 128K context and has a knowledge cutoff date of October 2023.
128K
higress
OpenAI

GPT-4o mini

GPT-4o mini is the latest model released by OpenAI after GPT-4 Omni, supporting both image and text input while outputting text. As their most advanced small model, it is significantly cheaper than other recent cutting-edge models, costing over 60% less than GPT-3.5 Turbo. It maintains state-of-the-art intelligence while offering remarkable cost-effectiveness. GPT-4o mini scored 82% on the MMLU test and currently ranks higher than GPT-4 in chat preferences.
128K
higress
OpenAI

GPT-4o

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
higress
OpenAI

GPT-4o 0806

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
higress
OpenAI

GPT-4o 0513

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
higress
OpenAI

ChatGPT-4o

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
higress
OpenAI

GPT-4 Turbo

The latest GPT-4 Turbo model features visual capabilities; visual requests can now be made using JSON mode and function calling. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction (a minimal vision-request sketch follows this entry).
128K
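
As a rough illustration of the visual-request capability mentioned in the entry above, the sketch below sends an image URL alongside a text prompt using the OpenAI chat-completions message format. It is a sketch only: the gateway base URL, API key, model identifier, and image URL are placeholders, and whether a given route accepts image content depends on the deployment.

    # Minimal sketch of a vision request in the OpenAI chat-completions format.
    # Placeholders: gateway base URL, API key, model name, and image URL.
    from openai import OpenAI

    client = OpenAI(base_url="https://your-gateway.example.com/v1", api_key="YOUR_API_KEY")

    resp = client.chat.completions.create(
        model="gpt-4-turbo",  # assumed identifier for this entry
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is in this image."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }],
    )

    print(resp.choices[0].message.content)
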
higress
OpenAI

GPT-4 Turbo Vision 0409

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
higress
OpenAI

GPT-4 Turbo Preview

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
higress
OpenAI

GPT-4 Turbo Preview 0125

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
higress
OpenAI

GPT-4 Turbo Preview 1106

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
higress
OpenAI

GPT-4

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
8K
higress
OpenAI

GPT-4 0613

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
8K
higress
OpenAI

GPT-4 32K

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
32K
higress
OpenAI

GPT-4 32K 0613

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
32K
higress
OpenAI

GPT-3.5 Turbo

GPT 3.5 Turbo is suitable for various text generation and understanding tasks. Currently points to gpt-3.5-turbo-0125.
16K
higress
OpenAI

GPT-3.5 Turbo 0125

GPT 3.5 Turbo is suitable for various text generation and understanding tasks. This is the gpt-3.5-turbo-0125 snapshot.
16K
higress
OpenAI

GPT-3.5 Turbo 1106

GPT 3.5 Turbo is suitable for various text generation and understanding tasks. This is the fixed gpt-3.5-turbo-1106 snapshot.
16K
higress
OpenAI

GPT-3.5 Turbo Instruct

GPT 3.5 Turbo Instruct is suited to completion-style text generation and understanding tasks.
4K
higress
OpenAI

GPT 3.5 Turbo

GPT 3.5 Turbo is an efficient model provided by OpenAI, suitable for chat and text generation tasks, supporting parallel function calls.
16K
higress
OpenAI

GPT 3.5 Turbo 16K

GPT 3.5 Turbo 16k is a high-capacity text generation model suitable for complex tasks.
16K
higress
OpenAI

GPT 4 Turbo

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
128K
higress
OpenAI

GPT 4 Turbo with Vision Preview

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
higress
OpenAI

GPT 4o Mini

GPT-4o mini is the latest model released by OpenAI after GPT-4 Omni, supporting both image and text input while outputting text. As their most advanced small model, it is significantly cheaper than other recent cutting-edge models, costing over 60% less than GPT-3.5 Turbo. It maintains state-of-the-art intelligence while offering remarkable cost-effectiveness. GPT-4o mini scored 82% on the MMLU test and currently ranks higher than GPT-4 in chat preferences.
128K
higress
OpenAI

GPT 4o

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
higress
OpenAI

OpenAI o1-mini

o1-mini is a fast and cost-effective reasoning model designed for programming, mathematics, and scientific applications. This model features a 128K context and has a knowledge cutoff date of October 2023.
128K
higress
OpenAI

OpenAI o1-preview

o1 is OpenAI's new reasoning model, suitable for complex tasks that require extensive general knowledge. This model features a 128K context and has a knowledge cutoff date of October 2023.
128K
higress
OpenAI

OpenAI GPT-4o mini

GPT-4o mini is the latest model released by OpenAI after GPT-4 Omni, supporting both image and text input while outputting text. As their most advanced small model, it is significantly cheaper than other recent cutting-edge models, costing over 60% less than GPT-3.5 Turbo. It maintains state-of-the-art intelligence while offering remarkable cost-effectiveness. GPT-4o mini scored 82% on the MMLU test and currently ranks higher than GPT-4 in chat preferences.
128K
higress
OpenAI

OpenAI GPT-4o

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
higress
AI21

AI21 Jamba 1.5 Mini

A 52B parameter (12B active) multilingual model, offering a 256K long context window, function calling, structured output, and grounded generation.
262K
higress
AI21

AI21 Jamba 1.5 Large

A 398B parameter (94B active) multilingual model, offering a 256K long context window, function calling, structured output, and grounded generation (a structured-output request sketch follows this entry).
262K
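
To make "structured output" concrete, the sketch below requests a JSON-only reply using the OpenAI-style response_format parameter. This is an assumption-laden illustration: the model identifier and gateway address are placeholders, and whether response_format is honored depends on the gateway and the upstream deployment.

    # Minimal sketch of a structured (JSON) output request.
    # Assumption: the gateway forwards the OpenAI-style response_format parameter.
    import json
    from openai import OpenAI

    client = OpenAI(base_url="https://your-gateway.example.com/v1", api_key="YOUR_API_KEY")

    resp = client.chat.completions.create(
        model="ai21-jamba-1.5-mini",  # placeholder model identifier
        messages=[
            {"role": "system",
             "content": 'Reply only with JSON of the form {"city": ..., "country": ...}.'},
            {"role": "user", "content": "Where is the Eiffel Tower?"},
        ],
        response_format={"type": "json_object"},  # JSON mode
    )

    print(json.loads(resp.choices[0].message.content))
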
higress
Cohere

Cohere Command R

Command R is a scalable generative model targeting RAG and Tool Use, enabling production-scale AI for enterprises (a minimal RAG sketch follows this entry).
128K
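
As a sketch of what "targeting RAG" means in practice, the example below grounds the model on retrieved passages by placing them directly in the prompt. This is the simplest prompt-stuffing pattern, not Cohere's native grounded-generation API; the model name, gateway address, and sample documents are placeholders.

    # Minimal retrieval-augmented generation sketch: retrieved passages are
    # concatenated into the prompt and the model is asked to answer from them.
    from openai import OpenAI

    client = OpenAI(base_url="https://your-gateway.example.com/v1", api_key="YOUR_API_KEY")

    # Passages that some retriever (vector store, search engine, ...) returned.
    retrieved = [
        "Doc 1: Higress is a cloud-native API gateway built on Istio and Envoy.",
        "Doc 2: The ai-proxy plugin lets Higress route requests to many LLM providers.",
    ]

    context = "\n\n".join(retrieved)
    resp = client.chat.completions.create(
        model="command-r",  # placeholder model identifier
        messages=[
            {"role": "system",
             "content": "Answer using only the provided documents and cite the doc number."},
            {"role": "user",
             "content": f"{context}\n\nQuestion: What does the ai-proxy plugin do?"},
        ],
    )
    print(resp.choices[0].message.content)
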
higress
Cohere

Cohere Command R+

Command R+ is a state-of-the-art RAG-optimized model designed to tackle enterprise-grade workloads.
128K
higress
Mistral

Mistral Nemo

Mistral Nemo, developed by Mistral AI in collaboration with NVIDIA, is a high-performance 12B model.
128K
higress
Mistral

Mistral Small

Mistral Small can be used for any language-based task that requires high efficiency and low latency.
128K
higress
Mistral

Mistral Large

Mistral Large is Mistral's flagship model, combining capabilities in code generation, mathematics, and reasoning, and supporting a 128K context window.
128K
higress
Meta

Llama 3.2 11B Vision

Excellent image reasoning capabilities on high-resolution images, suitable for visual understanding applications.
128K
higress
Meta

Llama 3.2 90B Vision

Advanced image reasoning capabilities suitable for visual understanding agent applications.
128K
higress
Meta

Meta Llama 3.1 8B

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
higress
Meta

Meta Llama 3.1 70B

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
higress
Meta

Meta Llama 3.1 405B

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
higress
Meta

Meta Llama 3 8B

A versatile 8-billion parameter model optimized for dialogue and text generation tasks.
8K
higress
Meta

Meta Llama 3 70B

A powerful 70-billion parameter model excelling in reasoning, coding, and broad language applications.
8K
higress
Azure

Phi-3.5-mini 128K

An updated version of the Phi-3-mini model.
128K
higress
Azure

Phi-3.5-vision 128K

An updated version of the Phi-3-vision model.
128K
higress
Azure

Phi-3-mini 4K

The smallest member of the Phi-3 family, optimized for both quality and low latency.
4K
higress
Azure

Phi-3-mini 128K

The same Phi-3-mini model, but with a larger context size for RAG or few-shot prompting.
128K
higress
Azure

Phi-3-small 8K

A 7B parameter model that provides better quality than Phi-3-mini, focusing on high-quality, reasoning-dense data.
8K
higress
Azure

Phi-3-small 128K

The same Phi-3-small model, but with a larger context size for RAG or few-shot prompting.
128K
higress
Azure

Phi-3-medium 4K

A 14B parameter model that provides better quality than Phi-3-mini, focusing on high-quality, reasoning-dense data.
4K
higress
Azure

Phi-3-medium 128K

The same Phi-3-medium model, but with a larger context size for RAG or few-shot prompting.
128K
higress
Meta

Llama 3.2 11B Vision (Preview)

Llama 3.2 is designed to handle tasks that combine visual and textual data. It excels in tasks such as image description and visual question answering, bridging the gap between language generation and visual reasoning.
8K
higress
Meta

Llama 3.2 90B Vision (Preview)

Llama 3.2 is designed to handle tasks that combine visual and textual data. It excels in tasks such as image description and visual question answering, bridging the gap between language generation and visual reasoning.
8K
higress
Meta

Llama 3.1 8B

Llama 3.1 8B is a high-performance model that offers rapid text generation capabilities, making it ideal for applications requiring large-scale efficiency and cost-effectiveness.
128K
higress
Meta

Llama 3.1 70B

Llama 3.1 70B provides enhanced AI reasoning capabilities, suitable for complex applications, supporting extensive computational processing while ensuring efficiency and accuracy.
128K
higress
Meta

Llama 3 Groq 8B Tool Use (Preview)

Llama 3 Groq 8B Tool Use is a model optimized for efficient tool usage, supporting fast parallel computation.
8K
higress
Meta

Llama 3 Groq 70B Tool Use (Preview)

Llama 3 Groq 70B Tool Use offers powerful tool invocation capabilities, supporting efficient processing of complex tasks.
8K
higress
Meta

Meta Llama 3 8B

Meta Llama 3 8B delivers high-quality reasoning performance, suitable for diverse application needs.
8K
higress
Meta

Meta Llama 3 70B

Meta Llama 3 70B provides unparalleled complexity handling capabilities, tailored for high-demand projects.
8K
higress
Gemma

Gemma 2 9B

Gemma 2 9B is a model optimized for specific tasks and tool integration.
8K
higress
Gemma

Gemma 7B

Gemma 7B is suitable for medium to small-scale task processing, offering cost-effectiveness.
8K
higress
Mistral

Mixtral 8x7B

Mixtral 8x7B provides highly fault-tolerant parallel computing capabilities, suitable for complex tasks.
32K
higress
LLaVA

LLaVA 1.5 7B

LLaVA 1.5 7B offers integrated visual processing capabilities, generating complex outputs from visual information inputs.
4K
higress
DeepSeek

DeepSeek V2.5

A new open-source model that integrates general and coding capabilities, retaining the general conversational abilities of the original Chat model and the powerful code handling capabilities of the Coder model, while better aligning with human preferences. Additionally, DeepSeek-V2.5 has achieved significant improvements in writing tasks, instruction following, and more.
128K
higress
Claude

Claude 3.5 Haiku

Claude 3.5 Haiku is Anthropic's fastest next-generation model. Compared to Claude 3 Haiku, Claude 3.5 Haiku has improved in various skills and has surpassed the previous generation's largest model, Claude 3 Opus, in many intelligence benchmark tests.
200K
higress
Claude

Claude 3.5 Sonnet

Claude 3.5 Sonnet surpasses Claude 3 Opus in capability and is faster than Claude 3 Sonnet, while keeping the same pricing as Sonnet. It excels particularly at programming, data science, visual processing, and agentic tasks.
200K
higress
Claude

Claude 3.5 Sonnet 0620

Claude 3.5 Sonnet surpasses Claude 3 Opus in capability and is faster than Claude 3 Sonnet, while keeping the same price as Sonnet. It excels particularly at programming, data science, visual processing, and agentic tasks.
200K
higress
Claude

Claude 3 Haiku

Claude 3 Haiku is Anthropic's fastest and most compact model, designed for near-instantaneous responses with fast, accurate, targeted performance.
200K
higress
Claude

Claude 3 Sonnet

Claude 3 Sonnet provides an ideal balance of intelligence and speed for enterprise workloads. It offers maximum utility at a lower price and is reliable enough for large-scale deployment.
200K
higress
Claude

Claude 3 Opus

Claude 3 Opus is Anthropic's most powerful model for handling highly complex tasks. It excels in performance, intelligence, fluency, and comprehension.
200K
higress
Claude

Claude 2.1

Claude 2.1 delivers key capability advancements for enterprises, including an industry-leading 200K token context window, significantly reduced model hallucination, system prompts, and a new beta feature: tool use.
200K
higress
Claude

Claude 2.0

Claude 2.0 delivers key capability advancements for enterprises, including a 100K token context window and reduced model hallucination compared with earlier Claude models.
97K
higress
Gemini

Gemini 1.5 Flash

Gemini 1.5 Flash is Google's latest multimodal AI model, featuring fast processing capabilities and supporting text, image, and video inputs, making it suitable for efficient scaling across various tasks.
1M
higress
Gemini

Gemini 1.5 Flash 002

Gemini 1.5 Flash 002 is an efficient multimodal model that supports a wide range of applications.
1M
higress
Gemini

Gemini 1.5 Flash 001

Gemini 1.5 Flash 001 is an efficient multimodal model that supports extensive application scaling.
1M
higress
Gemini

Gemini 1.5 Flash 0827

Gemini 1.5 Flash 0827 provides optimized multimodal processing capabilities, suitable for various complex task scenarios.
1M
higress
Gemini

Gemini 1.5 Flash 8B

Gemini 1.5 Flash 8B is an efficient multimodal model that supports a wide range of applications.
1M
higress
Gemini

Gemini 1.5 Flash 8B 0924

Gemini 1.5 Flash 8B 0924 is the latest experimental model, showcasing significant performance improvements in both text and multimodal use cases.
1M
higress
Gemini

Gemini 1.5 Pro

Gemini 1.5 Pro supports up to 2 million tokens of context, making it an ideal mid-sized multimodal model choice with versatile support for complex tasks.
2M
higress
Gemini

Gemini 1.5 Pro 002

Gemini 1.5 Pro 002 is the latest production-ready model, delivering higher quality outputs, with notable enhancements in mathematics, long-context, and visual tasks.
2M
higress
Gemini

Gemini 1.5 Pro 001

Gemini 1.5 Pro 001 is a scalable multimodal AI solution that supports a wide range of complex tasks.
2M
higress
Gemini

Gemini 1.5 Pro 0827

Gemini 1.5 Pro 0827 combines the latest optimization technologies for more efficient multimodal data processing.
2M
higress
Gemini

Gemini 1.5 Pro 0801

Gemini 1.5 Pro 0801 offers excellent multimodal processing capabilities, providing greater flexibility for application development.
2M
higress
Gemini

Gemini 1.0 Pro

Gemini 1.0 Pro is Google's high-performance AI model, designed for extensive task scaling.
32K
higress
Gemini

Gemini 1.0 Pro 001 (Tuning)

Gemini 1.0 Pro 001 (Tuning) offers stable and tunable performance, making it an ideal choice for complex task solutions.
32K
higress
Gemini

Gemini 1.0 Pro 002 (Tuning)

Gemini 1.0 Pro 002 (Tuning) provides excellent multimodal support, focusing on effective solutions for complex tasks.
32K
higress
Mistral

Mistral Nemo

Mistral Nemo is a 12B model developed in collaboration with Nvidia, offering outstanding reasoning and coding performance, easy to integrate and replace.
128K
higress
Mistral

Mistral Small

Mistral Small is a cost-effective, fast, and reliable option suitable for use cases such as translation, summarization, and sentiment analysis.
128K
higress
Mistral

Mistral Large

Mistral Large is the flagship model, excelling in multilingual tasks, complex reasoning, and code generation, making it an ideal choice for high-end applications.
128K
higress
Mistral

Codestral

Codestral is a cutting-edge generative model focused on code generation, optimized for fill-in-the-middle and code completion tasks.
32K
higress
Mistral

Pixtral 12B

The Pixtral model demonstrates strong capabilities in tasks such as chart and image understanding, document question answering, multimodal reasoning, and instruction following. It can ingest images at natural resolutions and aspect ratios and handle an arbitrary number of images within a long context window of up to 128K tokens.
128K
higress
Mistral

Ministral 3B

Ministral 3B is Mistral's top-tier edge model.
128K
higress
Mistral

Ministral 8B

Ministral 8B is Mistral's cost-effective edge model.
128K
higress
Mistral

Mistral 7B

Mistral 7B is a compact yet high-performance model, excelling in batch processing and simple tasks such as classification and text generation, with good reasoning capabilities.
32K
higress
Mistral

Mixtral 8x7B

Mixtral 8x7B is a sparse mixture-of-experts model that uses multiple experts to improve inference speed, well suited to multilingual and code generation tasks.
32K
higress
Mistral

Mixtral 8x22B

Mixtral 8x22B is a larger mixture-of-experts model focused on complex tasks, offering excellent reasoning capabilities and higher throughput.
64K
higress
Mistral

Codestral Mamba

Codestral Mamba is a language model focused on code generation, providing strong support for advanced coding and reasoning tasks.
256K
higress
Minimax

abab6.5s

Suitable for a wide range of natural language processing tasks, including text generation and dialogue systems.
245K
higress
Minimax

abab6.5g

Designed for multilingual persona dialogue, supporting high-quality dialogue generation in English and other languages.
8K
higress
Minimax

abab6.5t

Optimized for Chinese persona dialogue scenarios, providing smooth dialogue generation that aligns with Chinese expression habits.
8K
higress
Minimax

abab5.5

Targeted at productivity scenarios, supporting complex task processing and efficient text generation, suitable for professional applications.
16K
higress
Minimax

abab5.5s

Designed for Chinese persona dialogue scenarios, providing high-quality Chinese dialogue generation capabilities, suitable for various application contexts.
8K
higress
Cohere

command-r

Command R is an LLM optimized for dialogue and long context tasks, particularly suitable for dynamic interactions and knowledge management.
128K
higress
Cohere

command-r-plus

Command R+ is a high-performance large language model designed for real enterprise scenarios and complex applications.
128K
higress
Cohere

command-light

--
--
higress
Doubao

Doubao-lite-4k

The lightweight version of the Doubao model, with a 4K context window.
4K
higress
Doubao

Doubao-lite-32k

The lightweight version of the Doubao model, with a 32K context window.
32K
higress
Doubao

Doubao-lite-128k

The lightweight version of the Doubao model, with a 128K context window.
128K
higress
Doubao

Doubao-pro-4k

The professional version of the Doubao model, with a 4K context window.
4K
higress
Doubao

Doubao-pro-32k

The professional version of the Doubao model, with a 32K context window.
32K
higress
Doubao

Doubao-pro-128k

The professional version of the Doubao model, with a 128K context window.
128K
higress
ByteDance

Skylark2-pro-character-4k

The character role-playing version of the Skylark2 Pro model, with a 4K context window.
4K
higress
ByteDance

Skylark2-pro-32k

The professional version of the Skylark2 model, with a 32K context window.
32K
higress
ByteDance

Skylark2-pro-4k

The professional version of the Skylark2 model, with a 4K context window.
4K
higress
ByteDance

Skylark2-pro-turbo-8k

The turbo version of the Skylark2 Pro model, with an 8K context window.
8K
higress
ByteDance

Skylark2-lite-8k

The lightweight version of the Skylark2 model, with an 8K context window.
8K