Unitalk
All Providers
unitalk
OpenAI

OpenAI o1-mini

o1-mini is a fast and cost-effective reasoning model designed for programming, mathematics, and scientific applications. This model features a 128K context and has a knowledge cutoff date of October 2023.
--
unitalk
OpenAI

OpenAI o1-preview

o1 is OpenAI's new reasoning model, suitable for complex tasks that require extensive general knowledge. This model features a 128K context and has a knowledge cutoff date of October 2023.
--
unitalk
OpenAI

GPT-4o

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
--
unitalk
OpenAI

GPT-4o mini

GPT-4o mini is the latest model released by OpenAI after GPT-4 Omni, supporting both image and text input while outputting text. As their most advanced small model, it is significantly cheaper than other recent cutting-edge models, costing over 60% less than GPT-3.5 Turbo. It maintains state-of-the-art intelligence while offering remarkable cost-effectiveness. GPT-4o mini scored 82% on the MMLU test and currently ranks higher than GPT-4 in chat preferences.
--
unitalk
Gemini

Gemini 1.5 Pro

Gemini 1.5 Pro supports a context of up to 2 million tokens, making it an ideal mid-sized multimodal model that provides versatile support for complex tasks.
--
unitalk
Gemini

Gemini 1.5 Flash

Gemini 1.5 Flash is Google's latest multimodal AI model, featuring fast processing capabilities and supporting text, image, and video inputs, making it suitable for efficient scaling across various tasks.
--
unitalk
Claude

Claude 3.5 Sonnet

Claude 3.5 Sonnet offers capabilities beyond Claude 3 Opus and faster speeds than Claude 3 Sonnet, at the same price as Sonnet. It excels particularly at programming, data science, visual processing, and agent tasks.
--
unitalk
Claude

Claude 3 Haiku

Claude 3 Haiku is Anthropic's fastest and most compact model, designed for near-instantaneous responses with fast, accurate, targeted performance.
--
unitalk
Claude

Claude 3 Opus

Claude 3 Opus is Anthropic's most powerful model for handling highly complex tasks. It excels in performance, intelligence, fluency, and comprehension.
--
unitalk
Mistral

Mistral Nemo

Mistral Nemo is a 12B model developed in collaboration with NVIDIA, offering outstanding reasoning and coding performance and serving as an easy drop-in replacement.
--
unitalk
Mistral

Mistral Small

Mistral Small is a cost-effective, fast, and reliable option suitable for use cases such as translation, summarization, and sentiment analysis.
--
unitalk
Mistral

Mistral Large

Mistral Large is the flagship model, excelling in multilingual tasks, complex reasoning, and code generation, making it an ideal choice for high-end applications.
--
unitalk
Mistral

Codestral

Codestral is a cutting-edge generative model focused on code generation, optimized for fill-in-the-middle and code completion tasks.
--
unitalk
Mistral

Pixtral 12B

The Pixtral model demonstrates strong capabilities in tasks such as chart and image understanding, document question answering, multimodal reasoning, and instruction following. It can ingest images at natural resolutions and aspect ratios and handle an arbitrary number of images within a long context window of up to 128K tokens.
--
unitalk
Meta

Llama 3.1 Sonar Small Online

The Llama 3.1 Sonar Small Online model has 8B parameters and supports a context length of approximately 127,000 tokens. Designed for online chat, it handles a variety of text interactions efficiently.
--
unitalk
Meta

Llama 3.1 Sonar Large Online

The Llama 3.1 Sonar Large Online model has 70B parameters and supports a context length of approximately 127,000 tokens, making it suitable for high-capacity and diverse chat tasks.
--
unitalk
Meta

Llama 3.1 Sonar Huge Online

The Llama 3.1 Sonar Huge Online model has 405B parameters and supports a context length of approximately 127,000 tokens, designed for complex online chat applications.
--
unitalk
DeepSeek

DeepSeek V3

A new open-source model that integrates general and coding capabilities, retaining the general conversational abilities of the original Chat model and the powerful code handling of the Coder model while aligning better with human preferences. It also achieves significant improvements in writing tasks, instruction following, and more.
64K
unitalk
DeepSeek

DeepSeek R1

R1 is the reasoning model launched by DeepSeek. Before outputting the final answer, the model first provides a chain of thought to enhance the accuracy of the final response.
64K
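Consumers of an R1-style completion often want the chain of thought and the final answer separated. A minimal sketch, assuming the reasoning is delimited by <think>...</think> tags in the raw completion text (a common convention for R1 outputs, not something this catalog specifies):

```python
# Split an R1-style completion into (chain_of_thought, final_answer).
# Assumption: reasoning is wrapped in <think>...</think> tags.
def split_reasoning(raw: str) -> tuple[str, str]:
    start, end = raw.find("<think>"), raw.find("</think>")
    if start == -1 or end == -1:
        # No reasoning block found; treat the whole text as the answer.
        return "", raw.strip()
    thought = raw[start + len("<think>"):end].strip()
    answer = raw[end + len("</think>"):].strip()
    return thought, answer

sample = "<think>2+2 is basic arithmetic.</think>The answer is 4."
thought, answer = split_reasoning(sample)
```

Depending on the serving stack, the reasoning may instead arrive in a separate response field, in which case no parsing is needed.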
openai
OpenAI

OpenAI o4 Mini

o4-mini.description
200K
openai
OpenAI

OpenAI o3

o3.description
200K
openai
OpenAI

GPT-4.1 mini

gpt-4.1-mini.description
1M
openai
OpenAI

GPT-4.1 nano

gpt-4.1-nano.description
1M
openai
OpenAI

GPT-4o 1120

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
openai
OpenAI

GPT-4.1

gpt-4.1.description
1M
openai
OpenAI

GPT-4o 0806

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
openai
OpenAI

GPT-4o 0513

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
openai
OpenAI

ChatGPT-4o

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
openai
OpenAI

GPT-4 Turbo

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
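The JSON-format visual request mentioned above can be sketched as an OpenAI-compatible chat-completions payload with an image content part. The model id and image URL below are placeholders, and nothing is sent over the network; this only shows the request shape:

```python
import json

# Build (but do not send) a vision request body for /v1/chat/completions.
payload = {
    "model": "gpt-4-turbo",  # placeholder model id
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
}

body = json.dumps(payload)  # this JSON body would be POSTed to the API
```

Function calling works the same way: a `tools` array is added alongside `messages` in the same payload.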
openai
OpenAI

GPT-4 Turbo Vision 0409

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
openai
OpenAI

GPT-4 Turbo Preview

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
openai
OpenAI

GPT-4 Turbo Preview 0125

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
openai
OpenAI

GPT-4 Turbo Preview 1106

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
openai
OpenAI

GPT-4

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
8K
openai
OpenAI

GPT-4 0613

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
8K
openai
OpenAI

GPT-4 32K

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
32K
openai
OpenAI

GPT-4 32K 0613

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
32K
openai
OpenAI

GPT-3.5 Turbo

GPT-3.5 Turbo is suitable for various text generation and understanding tasks. Currently points to gpt-3.5-turbo-0125.
16K
openai
OpenAI

GPT-3.5 Turbo 0125

GPT-3.5 Turbo is suitable for various text generation and understanding tasks. This is the gpt-3.5-turbo-0125 snapshot.
16K
openai
OpenAI

GPT-3.5 Turbo 1106

GPT-3.5 Turbo is suitable for various text generation and understanding tasks. This is the gpt-3.5-turbo-1106 snapshot.
16K
openai
OpenAI

GPT-3.5 Turbo Instruct

GPT-3.5 Turbo Instruct is suitable for various text generation and understanding tasks.
4K
ollama
Meta

Llama 3.1 8B

Llama 3.1 is a leading model family launched by Meta with variants of up to 405B parameters, applicable to complex dialogue, multilingual translation, and data analysis.
128K
ollama
Meta

Llama 3.1 70B

Llama 3.1 is a leading model family launched by Meta with variants of up to 405B parameters, applicable to complex dialogue, multilingual translation, and data analysis.
128K
ollama
Meta

Llama 3.1 405B

Llama 3.1 is a leading model family launched by Meta with variants of up to 405B parameters, applicable to complex dialogue, multilingual translation, and data analysis.
128K
ollama
Meta

Code Llama 7B

Code Llama is an LLM focused on code generation and discussion, combining extensive programming language support, suitable for developer environments.
16K
ollama
Meta

Code Llama 13B

Code Llama is an LLM focused on code generation and discussion, combining extensive programming language support, suitable for developer environments.
16K
ollama
Meta

Code Llama 34B

Code Llama is an LLM focused on code generation and discussion, combining extensive programming language support, suitable for developer environments.
16K
ollama
Meta

Code Llama 70B

Code Llama is an LLM focused on code generation and discussion, combining extensive programming language support, suitable for developer environments.
16K
ollama
Qwen

QwQ 32B

QwQ is an experimental research model focused on improving AI reasoning capabilities.
128K
ollama
Qwen

Qwen2.5 0.5B

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
128K
ollama
Qwen

Qwen2.5 1.5B

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
128K
ollama
Qwen

Qwen2.5 7B

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
128K
ollama
Qwen

Qwen2.5 72B

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
128K
ollama
Qwen

CodeQwen1.5 7B

CodeQwen1.5 is a large language model trained on extensive code data, specifically designed to solve complex programming tasks.
64K
ollama
Qwen

Qwen2 0.5B

Qwen2 is Alibaba's next-generation large-scale language model, supporting diverse application needs with excellent performance.
128K
ollama
Qwen

Qwen2 1.5B

Qwen2 is Alibaba's next-generation large-scale language model, supporting diverse application needs with excellent performance.
128K
ollama
Qwen

Qwen2 7B

Qwen2 is Alibaba's next-generation large-scale language model, supporting diverse application needs with excellent performance.
128K
ollama
Qwen

Qwen2 72B

Qwen2 is Alibaba's next-generation large-scale language model, supporting diverse application needs with excellent performance.
128K
ollama
Gemma

Gemma 2 2B

Gemma 2 is an efficient model launched by Google, covering a variety of application scenarios from small applications to complex data processing.
8K
ollama
Gemma

Gemma 2 9B

Gemma 2 is an efficient model launched by Google, covering a variety of application scenarios from small applications to complex data processing.
8K
ollama
Gemma

Gemma 2 27B

Gemma 2 is an efficient model launched by Google, covering a variety of application scenarios from small applications to complex data processing.
8K
ollama
Gemma

CodeGemma 2B

CodeGemma is a lightweight language model dedicated to various programming tasks, supporting rapid iteration and integration.
8K
ollama
Gemma

CodeGemma 7B

CodeGemma is a lightweight language model dedicated to various programming tasks, supporting rapid iteration and integration.
8K
ollama
Azure

Phi-3 3.8B

Phi-3 is a lightweight open model launched by Microsoft, suitable for efficient integration and large-scale knowledge reasoning.
128K
ollama
Azure

Phi-3 14B

Phi-3 is a lightweight open model launched by Microsoft, suitable for efficient integration and large-scale knowledge reasoning.
128K
ollama
Azure

WizardLM 2 7B

WizardLM 2 is a language model provided by Microsoft AI, excelling in complex dialogues, multilingual capabilities, reasoning, and intelligent assistant applications.
32K
ollama
Azure

WizardLM 2 8x22B

WizardLM 2 is a language model provided by Microsoft AI, excelling in complex dialogues, multilingual capabilities, reasoning, and intelligent assistant applications.
64K
ollama
Mistral

MathΣtral 7B

MathΣtral is designed for scientific research and mathematical reasoning, providing effective computational capabilities and result interpretation.
32K
ollama
Mistral

Mistral 7B

Mistral is a 7B model released by Mistral AI, suitable for diverse language processing needs.
32K
ollama
Mistral

Mixtral 8x7B

Mixtral is an expert model from Mistral AI, featuring open-source weights and providing support in code generation and language understanding.
32K
ollama
Mistral

Mixtral 8x22B

Mixtral is an expert model from Mistral AI, featuring open-source weights and providing support in code generation and language understanding.
64K
ollama
Mistral

Mistral Large 123B

Mistral Large is Mistral's flagship model, combining capabilities in code generation, mathematics, and reasoning, and supporting a 128K context window.
128K
ollama
Mistral

Mistral Nemo 12B

Mistral Nemo is a high-performance 12B model developed by Mistral AI in collaboration with NVIDIA.
128K
ollama
Mistral

Codestral 22B

Codestral is Mistral AI's first code model, providing excellent support for code generation tasks.
32K
ollama
Aya

Aya 23 8B

Aya 23 is a multilingual model launched by Cohere, supporting 23 languages, facilitating diverse language applications.
8K
ollama
Aya

Aya 23 35B

Aya 23 is a multilingual model launched by Cohere, supporting 23 languages, facilitating diverse language applications.
8K
ollama
Cohere

Command R 35B

Command R is an LLM optimized for dialogue and long context tasks, particularly suitable for dynamic interactions and knowledge management.
128K
ollama
Cohere

Command R+ 104B

Command R+ is a high-performance large language model designed for real enterprise scenarios and complex applications.
128K
ollama
DeepSeek

DeepSeek V2 16B

DeepSeek V2 is an efficient Mixture-of-Experts language model, suitable for cost-effective processing needs.
32K
ollama
DeepSeek

DeepSeek V2 236B

DeepSeek V2 is an efficient Mixture-of-Experts language model, suitable for cost-effective processing needs.
128K
ollama
DeepSeek

DeepSeek Coder V2 16B

DeepSeek Coder V2 is an open-source Mixture-of-Experts code model that performs excellently on coding tasks, comparable to GPT-4 Turbo.
128K
ollama
DeepSeek

DeepSeek Coder V2 236B

DeepSeek Coder V2 is an open-source Mixture-of-Experts code model that performs excellently on coding tasks, comparable to GPT-4 Turbo.
128K
ollama
LLaVA

LLaVA 7B

LLaVA is a multimodal model that combines a visual encoder with Vicuna for powerful visual and language understanding.
4K
ollama
LLaVA

LLaVA 13B

LLaVA is a multimodal model that combines a visual encoder with Vicuna for powerful visual and language understanding.
4K
ollama
LLaVA

LLaVA 34B

LLaVA is a multimodal model that combines a visual encoder with Vicuna for powerful visual and language understanding.
4K
ollama

MiniCPM-V 8B

MiniCPM-V is a next-generation multimodal large model launched by OpenBMB, boasting exceptional OCR recognition and multimodal understanding capabilities, supporting a wide range of application scenarios.
128K
anthropic
Claude

Claude 3.7 Sonnet

0.description
200K
anthropic
Claude

Claude 3.5 Haiku

Claude 3.5 Haiku is Anthropic's fastest next-generation model. Compared with Claude 3 Haiku, it improves across a range of skills and surpasses the previous generation's largest model, Claude 3 Opus, on many intelligence benchmarks.
200K
anthropic
Claude

Claude 3.5 Sonnet

Claude 3.5 Sonnet offers capabilities beyond Claude 3 Opus and faster speeds than Claude 3 Sonnet, at the same price as Sonnet. It excels particularly at programming, data science, visual processing, and agent tasks.
200K
anthropic
Claude

Claude 3 Sonnet

Claude 3 Sonnet provides an ideal balance of intelligence and speed for enterprise workloads. It offers maximum utility at a lower price, reliable and suitable for large-scale deployment.
200K
anthropic
Claude

Claude 2.1

Claude 2.1 delivers advancements in key capabilities for enterprises, including an industry-leading 200K token context window, significantly reduced rates of model hallucination, system prompts, and a new beta feature: tool use.
200K
anthropic
Claude

Claude 2.0

Claude 2 demonstrates high capability across a wide range of tasks, from complex conversations and creative content generation to detailed instruction following.
97K
bedrock

Nova Pro

0.description
300K
bedrock

Nova Lite

0.description
300K
bedrock

Nova Micro

0.description
128K
bedrock
Claude

Claude 3.5 Sonnet v2 (Inference profile)

0.description
200K
bedrock
Claude

Claude 3.5 Sonnet 0620

Claude 3.5 Sonnet offers capabilities beyond Claude 3 Opus and faster speeds than Claude 3 Sonnet, at the same price as Sonnet. It excels particularly at programming, data science, visual processing, and agent tasks.
200K
bedrock
Claude

Claude 3 Haiku

Claude 3 Haiku is Anthropic's fastest and most compact model, designed for near-instantaneous responses with fast, accurate, targeted performance.
200K
bedrock
Claude

Claude 3 Sonnet

Claude 3 Sonnet provides an ideal balance of intelligence and speed for enterprise workloads. It offers maximum utility at a lower price, reliable and suitable for large-scale deployment.
200K
bedrock
Claude

Claude 3 Opus

Claude 3 Opus is Anthropic's most powerful model for handling highly complex tasks. It excels in performance, intelligence, fluency, and comprehension.
200K
bedrock
Claude

Claude 2.1

Claude 2.1 delivers advancements in key capabilities for enterprises, including an industry-leading 200K token context window, significantly reduced rates of model hallucination, system prompts, and a new beta feature: tool use.
200K
bedrock
Claude

Claude 2.0

Anthropic's model demonstrates high capability across a wide range of tasks, from complex conversations and creative content generation to detailed instruction following.
97K
bedrock
Claude

Claude Instant

A fast, economical, yet still highly capable model that can handle a range of tasks, including everyday conversations, text analysis, summarization, and document Q&A.
97K
bedrock
Meta

Llama 3.1 8B Instruct

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
bedrock
Meta

Llama 3.1 70B Instruct

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
bedrock
Meta

Llama 3.1 405B Instruct

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
bedrock
Meta

Llama 3 8B Instruct

A versatile 8-billion parameter model optimized for dialogue and text generation tasks.
8K
bedrock
Meta

Llama 3 70B Instruct

A powerful 70-billion parameter model excelling in reasoning, coding, and broad language applications.
8K
google
Gemini

Gemini 2.5 Pro

gemini-2.5-pro-preview-03-25.description
1M
google
Gemini

Gemini 2.0 Flash Thinking

gemini-2.0-flash-thinking-exp-01-21.description
1M
google
Gemini

Gemini 2.0 Flash Thinking Experimental 1219

gemini-2.0-flash-thinking-exp-1219.description
40K
google
Gemini

Gemini 2.0 Flash

gemini-2.0-flash.description
1M
google
Gemini

Gemini 2.5 Flash

gemini-2.5-flash-standard.description
1M
google
Gemini

Gemini 2.5 Flash Thinking

gemini-2.5-flash-thinking.description
1M
google
Gemini

Gemini Experimental 1206

gemini-exp-1206.description
2M
google
Gemini

Gemini Experimental 1121

gemini-exp-1121.description
40K
google
Gemini

Gemini Experimental 1114

Gemini Exp 1114 is Google's latest experimental multimodal AI model, featuring rapid processing capabilities and supporting text, image, and video inputs, making it suitable for efficient scaling across various tasks.
40K
google

LearnLM 1.5 Pro Experimental

learnlm-1.5-pro-experimental.description
40K
google
Gemini

Gemini 1.5 Flash 002

Gemini 1.5 Flash 002 is an efficient multimodal model that supports a wide range of applications.
1M
google
Gemini

Gemini 1.5 Flash 001

Gemini 1.5 Flash 001 is an efficient multimodal model that supports extensive application scaling.
1M
google
Gemini

Gemini 1.5 Pro 002

Gemini 1.5 Pro 002 is the latest production-ready model, delivering higher quality outputs, with notable enhancements in mathematics, long-context, and visual tasks.
2M
google
Gemini

Gemini 1.5 Pro 001

Gemini 1.5 Pro 001 is a scalable multimodal AI solution that supports a wide range of complex tasks.
2M
google
Gemini

Gemini 1.5 Flash 8B

Gemini 1.5 Flash 8B is an efficient multimodal model that supports a wide range of applications.
1M
google
Gemini

Gemini 1.5 Flash 8B 0924

Gemini 1.5 Flash 8B 0924 is the latest experimental model, showcasing significant performance improvements in both text and multimodal use cases.
1M
google
Gemini

Gemini 1.0 Pro

Gemini 1.0 Pro is Google's high-performance AI model, designed for extensive task scaling.
32K
google
Gemini

Gemini 1.0 Pro 001 (Tuning)

Gemini 1.0 Pro 001 (Tuning) offers stable and tunable performance, making it an ideal choice for complex task solutions.
32K
google
Gemini

Gemini 1.0 Pro 002 (Tuning)

Gemini 1.0 Pro 002 (Tuning) provides excellent multimodal support, focusing on effective solutions for complex tasks.
32K
huggingface
Mistral

Mistral 7B Instruct v0.3

Mistral (7B) Instruct v0.3 offers efficient computational power and natural language understanding, suitable for a wide range of applications.
32K
huggingface
Gemma

Gemma 2 2B Instruct

Google's lightweight instruction-tuning model.
8K
huggingface
Qwen

Qwen 2.5 72B Instruct

A large language model developed by Alibaba Cloud's Tongyi Qianwen team.
32K
huggingface
Qwen

Qwen 2.5 Coder 32B Instruct

Qwen2.5-Coder focuses on code writing.
32K
huggingface
Qwen

QwQ 32B Preview

QwQ-32B-Preview is Qwen's latest experimental research model, focusing on enhancing AI reasoning capabilities. By exploring complex mechanisms such as language mixing and recursive reasoning, its main strengths include powerful analytical reasoning, mathematics, and programming. It still faces challenges such as language-switching issues, reasoning loops, safety considerations, and gaps in other capabilities.
32K
huggingface

Phi 3.5 mini instruct

microsoft/Phi-3.5-mini-instruct.description
32K
huggingface
Meta

Hermes 3 Llama 3.1 8B

NousResearch/Hermes-3-Llama-3.1-8B.description
16K
huggingface
Qwen

DeepSeek R1

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B.description
16K
openrouter
OpenRouter

Auto (best for prompt)

Based on context length, topic, and complexity, your request will be sent to Llama 3 70B Instruct, Claude 3.5 Sonnet (self-moderated), or GPT-4o.
128K
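The routing above is selected by model id rather than by any client-side logic. A minimal sketch, assuming the router is addressed as "openrouter/auto" in an OpenAI-compatible request (only the request body is built here; nothing is sent):

```python
# Target OpenRouter's Auto router: the service, not the caller,
# picks the underlying model based on the prompt.
request = {
    "model": "openrouter/auto",  # assumed router id
    "messages": [
        {"role": "user", "content": "Explain vector clocks briefly."}
    ],
}
```

The response then reports which underlying model actually served the request.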
openrouter
OpenAI

OpenAI o1

o1 is OpenAI's reasoning model, suitable for complex tasks that require extensive general knowledge.
200K
openrouter
Claude

Claude 3 Haiku

Claude 3 Haiku is Anthropic's fastest and most compact model, designed for near-instantaneous responses with fast, accurate, targeted performance.
200K
openrouter
Claude

Claude 3.5 Sonnet

Claude 3.5 Sonnet offers capabilities beyond Claude 3 Opus and faster speeds than Claude 3 Sonnet, at the same price as Sonnet. It excels particularly at programming, data science, visual processing, and agent tasks.
200K
openrouter
Claude

Claude 3 Opus

Claude 3 Opus is Anthropic's most powerful model for handling highly complex tasks. It excels in performance, intelligence, fluency, and comprehension.
200K
openrouter
Gemini

Gemini 1.5 Flash

Gemini 1.5 Flash offers optimized multimodal processing capabilities, suitable for various complex task scenarios.
1M
openrouter
Gemini

Gemini 1.5 Pro

Gemini 1.5 Pro combines the latest optimization technologies to deliver more efficient multimodal data processing capabilities.
2M
openrouter
Meta

Llama 3.2 11B Vision

LLaMA 3.2 is designed to handle tasks that combine visual and textual data. It excels in tasks such as image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
openrouter
Meta

Llama 3.2 90B Vision

LLaMA 3.2 is designed to handle tasks that combine visual and textual data. It excels in tasks such as image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
openrouter
Qwen

Qwen2 7B (Free)

Qwen2 is Alibaba's next-generation large-scale language model, supporting diverse application needs with excellent performance.
32K
openrouter
Meta

Llama 3.1 8B (Free)

Llama 3.1 is a leading model family launched by Meta with variants of up to 405B parameters, applicable to complex dialogue, multilingual translation, and data analysis.
32K
openrouter
Gemma

Gemma 2 9B (Free)

Gemma 2 is Google's lightweight open-source text model series.
8K
cloudflare
DeepSeek

deepseek-coder-6.7b-instruct-awq

@hf/thebloke/deepseek-coder-6.7b-instruct-awq.description
16K
cloudflare
Gemma

gemma-7b-it

@hf/google/gemma-7b-it.description
2K
cloudflare
Mistral

hermes-2-pro-mistral-7b

@hf/nousresearch/hermes-2-pro-mistral-7b.description
4K
cloudflare
Meta

llama-3-8b-instruct-awq

@cf/meta/llama-3-8b-instruct-awq.description
8K
cloudflare

openchat-3.5-0106

@cf/openchat/openchat-3.5-0106.description
8K
cloudflare
Qwen

qwen1.5-14b-chat-awq

@cf/qwen/qwen1.5-14b-chat-awq.description
32K
cloudflare

starling-lm-7b-beta

@hf/nexusflow/starling-lm-7b-beta.description
4K
cloudflare
Meta

meta-llama-3-8b-instruct

Generation over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.
--
github
DeepSeek

DeepSeek R1

R1 is the reasoning model launched by DeepSeek. Before outputting the final answer, the model first provides a chain of thought to enhance the accuracy of the final response.
128K
github
AI21

AI21 Jamba 1.5 Mini

A 52B parameter (12B active) multilingual model, offering a 256K long context window, function calling, structured output, and grounded generation.
262K
github
AI21

AI21 Jamba 1.5 Large

A 398B parameter (94B active) multilingual model, offering a 256K long context window, function calling, structured output, and grounded generation.
262K
github
Cohere

Cohere Command R

Command R is a scalable generative model targeting RAG and Tool Use to enable production-scale AI for enterprises.
128K
github
Cohere

Cohere Command R+

Command R+ is a state-of-the-art RAG-optimized model designed to tackle enterprise-grade workloads.
128K
github
Mistral

Mistral Small

Mistral Small can be used for any language-based task that requires high efficiency and low latency.
128K
github
Mistral

Codestral

Codestral is a cutting-edge generative model focused on code generation, optimized for fill-in-the-middle and code completion tasks.
262K
github
Meta

Meta Llama 3.1 8B

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
github
Meta

Meta Llama 3.1 70B

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
github
Meta

Meta Llama 3.1 405B

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
github
Meta

Meta Llama 3 8B

A versatile 8-billion parameter model optimized for dialogue and text generation tasks.
8K
github
Meta

Meta Llama 3 70B

A powerful 70-billion parameter model excelling in reasoning, coding, and broad language applications.
8K
github
Azure

Phi 4

Phi-4.description
16K
github
Azure

Phi 3.5 MoE

Phi-3.5-MoE-instruct.description
128K
github
Azure

Phi-3.5-vision 128K

An updated version of the Phi-3-vision model.
128K
github
Azure

Phi-3-mini 4K

The smallest member of the Phi-3 family, optimized for both quality and low latency.
4K
github
Azure

Phi-3-mini 128K

The same Phi-3-mini model, but with a larger context size for RAG or few-shot prompting.
128K
github
Azure

Phi-3-small 8K

A 7B parameter model that provides better quality than Phi-3-mini, focusing on high-quality, reasoning-dense data.
8K
github
Azure

Phi-3-small 128K

The same Phi-3-small model, but with a larger context size for RAG or few-shot prompting.
128K
github
Azure

Phi-3-medium 4K

A 14B parameter model that provides better quality than Phi-3-mini, focusing on high-quality, reasoning-dense data.
4K
github
Azure

Phi-3-medium 128K

The same Phi-3-medium model, but with a larger context size for RAG or few-shot prompting.
128K
novita
Meta

Llama 3.1 8B Instruct

Llama 3.1 8B Instruct is the latest version released by Meta, optimized for high-quality conversational scenarios, outperforming many leading closed-source models.
8K
novita
Meta

Llama 3.1 70B Instruct

Llama 3.1 70B Instruct is designed for high-quality conversations, excelling in human evaluations, particularly in highly interactive scenarios.
128K
novita
Meta

Llama 3.1 405B Instruct

Llama 3.1 405B Instruct is the latest version from Meta, optimized for generating high-quality dialogues, surpassing many leading closed-source models.
32K
novita
Meta

Llama 3 8B Instruct

Llama 3 8B Instruct is optimized for high-quality conversational scenarios, performing better than many closed-source models.
8K
novita
Meta

Llama 3 70B Instruct

Llama 3 70B Instruct is optimized for high-quality conversational scenarios, demonstrating excellent performance in various human evaluations.
8K
novita
Gemma

Gemma 2 9B

Gemma 2 is Google's lightweight open-source text model series.
8K
novita
Mistral

Mistral 7B Instruct

Mistral 7B Instruct is a high-performance industry-standard model optimized for speed and long context support.
32K
novita
Azure

WizardLM 2 7B

WizardLM 2 7B is Microsoft's latest lightweight AI model, achieving performance comparable to leading open-source models nearly ten times its size.
32K
novita
Azure

WizardLM-2 8x22B

WizardLM-2 8x22B is Microsoft's state-of-the-art Wizard model, demonstrating extremely competitive performance.
63K
novita
Mistral

Dolphin Mixtral 8x22B

Dolphin Mixtral 8x22B is a model designed for instruction following, dialogue, and programming.
16K
novita
Meta

Hermes 2 Pro Llama 3 8B

Hermes 2 Pro Llama 3 8B is an upgraded version of Nous Hermes 2, featuring the latest internally developed datasets.
8K
novita
Mistral

Hermes 2 Mixtral 8x7B DPO

Hermes 2 Mixtral 8x7B DPO is a highly flexible multi-model merge designed to deliver an exceptional creative experience.
32K
novita

MythoMax L2 13B

MythoMax L2 13B is a language model that merges multiple top models to combine creativity and intelligence.
4K
novita
OpenChat

OpenChat 7B

OpenChat 7B is an open-source language model family fine-tuned with the C-RLFT (Conditioned Reinforcement Learning Fine-Tuning) strategy.
4K
togetherai
Meta

Llama 3.2 3B Instruct Turbo

LLaMA 3.2 is designed for tasks involving both visual and textual data. It excels in tasks like image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
togetherai
Meta

Llama 3.2 11B Vision Instruct Turbo (Free)

LLaMA 3.2 is designed for tasks involving both visual and textual data. It excels in tasks like image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
togetherai
Meta

Llama 3.2 11B Vision Instruct Turbo

LLaMA 3.2 is designed for tasks involving both visual and textual data. It excels in tasks like image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
togetherai
Meta

Llama 3.2 90B Vision Instruct Turbo

LLaMA 3.2 is designed for tasks involving both visual and textual data. It excels in tasks like image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
togetherai
Meta

Llama 3.1 8B Instruct Turbo

The Llama 3.1 8B model uses FP8 quantization and supports up to 131,072 context tokens, making it a standout among open-source models that excels at complex tasks and outperforms many industry benchmarks.
128K
togetherai
Meta

Llama 3.1 70B Instruct Turbo

The Llama 3.1 70B model is fine-tuned for high-load applications and quantized to FP8 for greater computational efficiency and accuracy, ensuring outstanding performance in complex scenarios.
128K
togetherai
Meta

Llama 3.1 405B Instruct Turbo

The 405B Llama 3.1 Turbo model provides massive context support for big data processing, excelling in large-scale AI applications.
130K
togetherai
Meta

Llama 3.1 Nemotron 70B

Llama 3.1 Nemotron 70B is a large language model customized by NVIDIA, designed to enhance the helpfulness of LLM-generated responses to user queries. The model has excelled in benchmark tests such as Arena Hard, AlpacaEval 2 LC, and GPT-4-Turbo MT-Bench, ranking first in all three automatic alignment benchmarks as of October 1, 2024. The model is trained using RLHF (specifically REINFORCE), Llama-3.1-Nemotron-70B-Reward, and HelpSteer2-Preference prompts based on the Llama-3.1-70B-Instruct model.
32K
togetherai
Meta

Llama 3 8B Instruct Turbo

Llama 3 8B Instruct Turbo is a high-performance large language model, supporting a wide range of application scenarios.
8K
togetherai
Meta

Llama 3 70B Instruct Turbo

Llama 3 70B Instruct Turbo offers exceptional language understanding and generation capabilities, suitable for the most demanding computational tasks.
8K
togetherai
Meta

Llama 3 8B Instruct Lite

Llama 3 8B Instruct Lite is designed for resource-constrained environments, providing excellent balanced performance.
8K
togetherai
Meta

Llama 3 70B Instruct Lite

Llama 3 70B Instruct Lite is suitable for environments requiring high performance and low latency.
8K
togetherai
Meta

Llama 3 8B Instruct Reference

LLaMA-3 Chat (8B) provides multilingual support, covering a rich array of domain knowledge.
8K
togetherai
Meta

Llama 3 70B Instruct Reference

LLaMA-3 Chat (70B) is a powerful chat model that supports complex conversational needs.
8K
togetherai
Meta

LLaMA-2 Chat (13B)

LLaMA-2 Chat (13B) offers excellent language processing capabilities and outstanding interactive experiences.
4K
togetherai
Meta

LLaMA-2 (70B)

LLaMA-2 provides excellent language processing capabilities and outstanding interactive experiences.
4K
togetherai
Meta

CodeLlama 34B Instruct

Code Llama is an LLM focused on code generation and discussion, with extensive support for various programming languages, suitable for developer environments.
16K
togetherai
Gemma

Gemma 2 27B

Gemma 2 continues the design philosophy of being lightweight and efficient.
8K
togetherai
Gemma

Gemma Instruct (2B)

Gemma Instruct (2B) provides basic instruction processing capabilities, suitable for lightweight applications.
8K
togetherai
Mistral

Mistral (7B) Instruct v0.2

Mistral (7B) Instruct v0.2 is an improved instruction-tuned version of Mistral 7B, providing optimized responses across a range of tasks.
32K
togetherai
Mistral

Mistral (7B) Instruct

Mistral (7B) Instruct is known for its high performance, suitable for various language tasks.
8K
togetherai
Mistral

Mistral (7B)

Mistral 7B is a compact yet high-performance model, adept at handling batch processing and simple tasks like classification and text generation, featuring good reasoning capabilities.
8K
togetherai
Mistral

Mixtral-8x7B Instruct (46.7B)

Mixtral 8x7B is a pre-trained sparse mixture of experts model for general text tasks.
32K
togetherai
Mistral

Mixtral-8x7B (46.7B)

Mixtral 8x7B is a sparse expert model that utilizes multiple parameters to enhance reasoning speed, suitable for multilingual and code generation tasks.
32K
togetherai
Mistral

Mixtral-8x22B Instruct (141B)

Mixtral-8x22B Instruct (141B) is a super large language model that supports extremely high processing demands.
64K
togetherai
Azure

WizardLM-2 8x22B

WizardLM 2 is a language model provided by Microsoft AI, excelling in complex dialogues, multilingual capabilities, reasoning, and intelligent assistant tasks.
64K
togetherai
DeepSeek

DeepSeek LLM Chat (67B)

DeepSeek 67B is an advanced model trained for highly complex conversations.
4K
togetherai
Qwen

Qwen 2.5 7B Instruct Turbo

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
32K
togetherai
Qwen

Qwen 2.5 72B Instruct Turbo

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
32K
togetherai
Qwen

Qwen 2 Instruct (72B)

Qwen2 is an advanced general-purpose language model that supports various types of instructions.
32K
togetherai
DBRX

DBRX Instruct

DBRX Instruct provides highly reliable instruction processing capabilities, supporting applications across multiple industries.
32K
togetherai
Upstage

Upstage SOLAR Instruct v1 (11B)

Upstage SOLAR Instruct v1 (11B) is suitable for refined instruction tasks, offering excellent language processing capabilities.
4K
togetherai

MythoMax-L2 (13B)

MythoMax-L2 (13B) is an innovative model suitable for multi-domain applications and complex tasks.
4K
togetherai

StripedHyena Nous (7B)

StripedHyena Nous (7B) provides enhanced computational capabilities through efficient strategies and model architecture.
32K
fireworksai
Meta

Llama 3.3 70B Instruct

Llama 3.3 70B Instruct is Meta's multilingual large language model, instruction-tuned and optimized for multilingual dialogue use cases, outperforming many available open-source and closed chat models on common industry benchmarks.
128K
fireworksai
Yi

Yi-Large

Yi-Large model, featuring exceptional multilingual processing capabilities, suitable for various language generation and understanding tasks.
32K
groq
Meta

Llama 3.3 70B

Meta Llama 3.3 is a multilingual large language model (LLM) with 70 billion parameters (text input/text output), featuring pre-training and instruction-tuning. The instruction-tuned pure text model of Llama 3.3 is optimized for multilingual conversational use cases and outperforms many available open-source and closed chat models on common industry benchmarks.
128K
groq
Meta

Llama 3.2 11B Vision (Preview)

Llama 3.2 is designed to handle tasks that combine visual and textual data. It excels in tasks such as image description and visual question answering, bridging the gap between language generation and visual reasoning.
8K
groq
Meta

Llama 3.2 90B Vision (Preview)

Llama 3.2 is designed to handle tasks that combine visual and textual data. It excels in tasks such as image description and visual question answering, bridging the gap between language generation and visual reasoning.
8K
groq
Meta

Llama 3.1 8B

Llama 3.1 8B is a high-performance model that offers rapid text generation capabilities, making it ideal for applications requiring large-scale efficiency and cost-effectiveness.
128K
groq
Meta

Llama 3.1 70B

Llama 3.1 70B provides enhanced AI reasoning capabilities, suitable for complex applications, supporting extensive computational processing while ensuring efficiency and accuracy.
128K
groq
Meta

Llama 3 Groq 8B Tool Use (Preview)

Llama 3 Groq 8B Tool Use is a model optimized for efficient tool usage, supporting fast parallel computation.
8K
groq
Meta

Llama 3 Groq 70B Tool Use (Preview)

Llama 3 Groq 70B Tool Use offers powerful tool invocation capabilities, supporting efficient processing of complex tasks.
8K
groq
Meta

Meta Llama 3 8B

Meta Llama 3 8B delivers high-quality reasoning performance, suitable for diverse application needs.
8K
groq
Meta

Meta Llama 3 70B

Meta Llama 3 70B provides unparalleled complexity handling capabilities, tailored for high-demand projects.
8K
groq
Gemma

Gemma 2 9B

Gemma 2 9B is a model optimized for specific tasks and tool integration.
8K
groq
Gemma

Gemma 7B

Gemma 7B is suitable for medium to small-scale task processing, offering cost-effectiveness.
8K
groq
Mistral

Mixtral 8x7B

Mixtral 8x7B provides high fault-tolerant parallel computing capabilities, suitable for complex tasks.
32K
groq
LLaVA

LLaVA 1.5 7B

LLaVA 1.5 7B offers integrated visual processing capabilities, generating complex outputs from visual information inputs.
4K
perplexity
Perplexity

Sonar Reasoning Pro

Sonar Reasoning Pro is Perplexity's advanced reasoning model with built-in real-time web search, designed for complex, multi-step questions that benefit from both reasoning and up-to-date information.
124K
perplexity
Perplexity

Sonar Reasoning

Sonar Reasoning is a fast reasoning model with built-in real-time web search, producing answers grounded in current information.
124K
perplexity
Perplexity

Sonar Pro

Sonar Pro is Perplexity's advanced search model, supporting complex queries and follow-ups with real-time web grounding and citations.
200K
perplexity
Perplexity

Sonar

Sonar is Perplexity's lightweight search model, providing fast, cost-effective answers grounded in real-time web search.
124K
mistral
Mistral

Pixtral Large

Pixtral Large is an open-source multimodal model with 124 billion parameters, built on Mistral Large 2. This is the second model in our multimodal family, showcasing cutting-edge image understanding capabilities.
128K
mistral
Mistral

Ministral 3B

Ministral 3B is Mistral's top-tier edge model.
128K
mistral
Mistral

Ministral 8B

Ministral 8B is Mistral's cost-effective edge model.
128K
mistral
Mistral

Mistral 7B

Mistral 7B is a compact yet high-performance model, excelling in batch processing and simple tasks such as classification and text generation, with good reasoning capabilities.
32K
mistral
Mistral

Mixtral 8x7B

Mixtral 8x7B is a sparse expert model that leverages multiple parameters to enhance reasoning speed, suitable for handling multilingual and code generation tasks.
32K
mistral
Mistral

Mixtral 8x22B

Mixtral 8x22B is a larger expert model focused on complex tasks, providing excellent reasoning capabilities and higher throughput.
64K
mistral
Mistral

Codestral Mamba

Codestral Mamba is a language model focused on code generation, providing strong support for advanced coding and reasoning tasks.
256K
ai21
AI21

Jamba 1.5 Mini

Jamba 1.5 Mini is AI21's hybrid SSM-Transformer model, offering a 256K context window along with function calling and structured (JSON) output.
256K
ai21
AI21

Jamba 1.5 Large

Jamba 1.5 Large is AI21's most powerful hybrid SSM-Transformer model, combining a 256K context window with strong long-context reasoning performance.
256K
upstage
Upstage

Solar Mini

Solar Mini is a compact LLM that outperforms GPT-3.5, featuring strong multilingual capabilities, supporting English and Korean, and providing an efficient and compact solution.
32K
upstage
Upstage

Solar Mini (Ja)

Solar Mini (Ja) extends the capabilities of Solar Mini, focusing on Japanese while maintaining efficiency and excellent performance in English and Korean usage.
32K
upstage
Upstage

Solar Pro

Solar Pro is a highly intelligent LLM launched by Upstage, focusing on single-GPU instruction-following capabilities, with an IFEval score above 80. Currently supports English, with a formal version planned for release in November 2024, which will expand language support and context length.
32K
xai
Grok

Grok 3 Thinking

Grok 3 Thinking is a lightweight model that reasons before responding. It is fast and capable, suited to logical tasks that do not require deep domain knowledge, and exposes its raw thinking traces.
128K
xai
Grok

Grok 3

Grok 3 is xAI's flagship model, excelling at enterprise tasks such as data extraction, programming, and text summarization, with deep knowledge of finance, healthcare, law, and science.
128K
xai
Grok

Grok Vision

Grok Vision is xAI's image understanding model, capable of processing a wide variety of visual information, including documents, charts, screenshots, and photos.
32K
qwen
Qwen

Qwen Turbo

Tongyi Qianwen is a large-scale language model that supports input in various languages, including Chinese and English.
1M
qwen
Qwen

Qwen Plus

Tongyi Qianwen Plus is an enhanced version of the large-scale language model, supporting input in various languages, including Chinese and English.
128K
qwen
Qwen

Qwen Max

Tongyi Qianwen Max is a large-scale language model with hundreds of billions of parameters, supporting input in various languages, including Chinese and English. It is the API model behind the current Tongyi Qianwen 2.5 product version.
32K
qwen
Qwen

Qwen Long

Qwen is a large-scale language model that supports long text contexts and dialogue capabilities based on long documents and multiple documents.
1M
qwen
Qwen

Qwen VL Plus

Tongyi Qianwen's enhanced large-scale visual language model, with greatly improved detail and text recognition capabilities, supporting high-resolution images with arbitrary aspect ratios.
32K
qwen
Qwen

Qwen VL Max

Tongyi Qianwen's ultra-large-scale visual language model. Compared to the enhanced version, it further improves visual reasoning and instruction-following abilities, providing a higher level of visual perception and cognition.
32K
qwen
Qwen

Qwen VL OCR

Tongyi Qianwen's OCR-focused visual model, specialized in extracting text from images of documents, tables, exam questions, and handwriting.
34K
qwen
Qwen

Qwen Math Turbo

The Tongyi Qianwen Math model is specifically designed for solving mathematical problems.
4K
qwen
Qwen

Qwen Math Plus

The Tongyi Qianwen Math model is specifically designed for solving mathematical problems.
4K
qwen
Qwen

Qwen Coder Turbo

The Tongyi Qianwen Coder model.
128K
qwen
Qwen

Qwen Coder Plus

The Tongyi Qianwen Coder model.
128K
qwen
Qwen

QwQ 32B Preview

The QwQ model is an experimental research model developed by the Qwen team, focusing on enhancing AI reasoning capabilities.
32K
qwen
Qwen

QVQ 72B Preview

The QVQ model is an experimental research model developed by the Qwen team, focusing on enhancing visual reasoning capabilities.
32K
qwen
Qwen

Qwen2.5 7B

The open-source version of the Qwen2.5 series, designed to optimize the handling of instruction-based tasks.
128K
qwen
Qwen

Qwen2.5 14B

The open-source version of the Qwen2.5 series, designed to optimize the handling of instruction-based tasks.
128K
qwen
Qwen

Qwen2.5 32B

The open-source version of the Qwen2.5 series, designed to optimize the handling of instruction-based tasks.
128K
qwen
Qwen

Qwen2.5 72B

The open-source version of the Qwen2.5 series, designed to optimize the handling of instruction-based tasks.
128K
qwen
Qwen

Qwen2.5 14B 1M

The open-source version of the Qwen2.5 series, supporting an ultra-long context of up to 1M tokens.
1M
qwen
Qwen

Qwen2.5 Math 7B

The open-source version of the Tongyi Qianwen Math model, specifically designed for solving mathematical problems.
4K
qwen
Qwen

Qwen2.5 Math 72B

The open-source version of the Tongyi Qianwen Math model, specifically designed for solving mathematical problems.
4K
qwen
Qwen

Qwen2.5 Coder 7B

The open-source version of the Tongyi Qianwen Coder model, focused on code generation, reasoning, and repair.
128K
qwen
Qwen

Qwen2.5 Coder 32B

The open-source version of the Tongyi Qianwen Coder model, focused on code generation, reasoning, and repair.
128K
qwen
Qwen

Qwen VL

Initialized with the Qwen-7B language model, this pre-trained model adds an image model with an input resolution of 448.
8K
qwen
Qwen

Qwen VL Chat

Qwen VL supports flexible interaction methods, including multi-image, multi-turn Q&A, and creative capabilities.
8K
qwen
Qwen

Qwen2.5 VL 72B

Qwen2.5-VL is the latest generation of the Qwen visual language model, with comprehensive improvements in instruction following, mathematics, and coding, as well as enhanced abilities in object recognition, document parsing, and long-video understanding.
128K
qwen
DeepSeek

DeepSeek R1

DeepSeek R1 is DeepSeek's reasoning model, driven by large-scale reinforcement learning, performing comparably to OpenAI o1 in mathematics, coding, and reasoning tasks.
64K
qwen
DeepSeek

DeepSeek V3

DeepSeek V3 is a Mixture-of-Experts (MoE) language model developed by DeepSeek, delivering outstanding results across evaluations and surpassing other open-source models in mainstream benchmarks.
64K
wenxin
Wenxin

ERNIE 3.5 8K

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpora. It offers strong general capabilities that meet most requirements for dialogue Q&A, creative generation, and plugin applications, and supports automatic integration with the Baidu Search plugin to keep Q&A information current.
8K
wenxin
Wenxin

ERNIE 3.5 8K Preview

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpora. It offers strong general capabilities that meet most requirements for dialogue Q&A, creative generation, and plugin applications, and supports automatic integration with the Baidu Search plugin to keep Q&A information current.
8K
wenxin
Wenxin

ERNIE 3.5 128K

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpora. It offers strong general capabilities that meet most requirements for dialogue Q&A, creative generation, and plugin applications, and supports automatic integration with the Baidu Search plugin to keep Q&A information current.
128K
wenxin
Wenxin

ERNIE 4.0 8K

Baidu's self-developed flagship ultra-large-scale language model. Compared to ERNIE 3.5, it delivers a comprehensive upgrade in model capabilities and is widely applicable to complex tasks across various fields; it supports automatic integration with the Baidu Search plugin to keep Q&A information current.
8K
wenxin
Wenxin

ERNIE 4.0 8K Preview

Baidu's self-developed flagship ultra-large-scale language model. Compared to ERNIE 3.5, it delivers a comprehensive upgrade in model capabilities and is widely applicable to complex tasks across various fields; it supports automatic integration with the Baidu Search plugin to keep Q&A information current.
8K
wenxin
Wenxin

ERNIE 4.0 Turbo 8K

Baidu's self-developed flagship ultra-large-scale language model with excellent overall performance, widely applicable to complex tasks across various fields; it supports automatic integration with the Baidu Search plugin to keep Q&A information current, and outperforms ERNIE 4.0.
8K
wenxin
Wenxin

ERNIE 4.0 Turbo 128K

Baidu's self-developed flagship ultra-large-scale language model with excellent overall performance, widely applicable to complex tasks across various fields; it supports automatic integration with the Baidu Search plugin to keep Q&A information current, and outperforms ERNIE 4.0.
128K
wenxin
Wenxin

ERNIE 4.0 Turbo 8K Preview

Baidu's self-developed flagship ultra-large-scale language model with excellent overall performance, widely applicable to complex tasks across various fields; it supports automatic integration with the Baidu Search plugin to keep Q&A information current, and outperforms ERNIE 4.0.
8K
wenxin
Wenxin

ERNIE Lite 8K

Baidu's self-developed lightweight large language model, balancing excellent model performance with inference efficiency, suitable for inference on low-power AI acceleration cards.
8K
wenxin
Wenxin

ERNIE Lite Pro 128K

Baidu's self-developed lightweight large language model, offering better performance than ERNIE Lite, suitable for inference on low-power AI acceleration cards.
128K
wenxin
Wenxin

ERNIE Tiny 8K

Baidu's self-developed ultra-high-performance large language model, with the lowest deployment and fine-tuning costs among the Wenxin series models.
8K
wenxin
Wenxin

ERNIE Speed 128K

Baidu's latest self-developed high-performance large language model, with excellent general capabilities. It works well as a base model for fine-tuning to better address specific scenarios, while also delivering excellent inference performance.
128K
wenxin
Wenxin

ERNIE Speed Pro 128K

Baidu's latest self-developed high-performance large language model, offering better performance than ERNIE Speed. It works well as a base model for fine-tuning to better address specific scenarios, while also delivering excellent inference performance.
128K
wenxin
Wenxin

ERNIE Character 8K

Baidu's self-developed vertical-scenario large language model, suited to applications such as game NPCs, customer service dialogue, and role-playing, with a more distinct and consistent character style, stronger instruction-following, and superior inference performance.
8K
wenxin
Wenxin

ERNIE Character Fiction 8K

A version of Baidu's self-developed vertical-scenario large language model tailored to fictional dialogue and role-playing scenarios, with a distinct, consistent character style and strong instruction-following.
8K
wenxin
Wenxin

ERNIE Novel 8K

Baidu's self-developed general-purpose large language model, with clear advantages in novel continuation; it is also applicable to scenarios such as short plays and films.
8K
hunyuan
Hunyuan

Hunyuan Lite

Upgraded to a MOE structure with a context window of 256k, leading many open-source models in various NLP, coding, mathematics, and industry benchmarks.
256K
hunyuan
Hunyuan

Hunyuan Standard

Utilizes a superior routing strategy while alleviating issues of load balancing and expert convergence. For long texts, the needle-in-a-haystack metric reaches 99.9%. MOE-32K offers a relatively higher cost-performance ratio, balancing effectiveness and price while enabling processing of long text inputs.
32K
hunyuan
Hunyuan

Hunyuan Standard 256K

Utilizes a superior routing strategy while alleviating issues of load balancing and expert convergence. For long texts, the needle-in-a-haystack metric reaches 99.9%. MOE-256K further breaks through in length and effectiveness, greatly expanding the input length capacity.
256K
hunyuan
Hunyuan

Hunyuan Turbo

The preview version of the next-generation Hunyuan large language model, featuring a brand-new mixed expert model (MoE) structure, which offers faster inference efficiency and stronger performance compared to Hunyuan Pro.
32K
hunyuan
Hunyuan

Hunyuan Pro

A trillion-parameter scale MOE-32K long text model. Achieves absolute leading levels across various benchmarks, capable of handling complex instructions and reasoning, with advanced mathematical abilities, supporting function calls, and optimized for applications in multilingual translation, finance, law, and healthcare.
32K
hunyuan
Hunyuan

Hunyuan Vision

The latest multimodal model from Hunyuan, supporting image + text input to generate textual content.
8K
hunyuan
Hunyuan

Hunyuan Code

The latest code generation model from Hunyuan, trained on a base model with 200B high-quality code data, iteratively trained for six months with high-quality SFT data, increasing the context window length to 8K. It ranks among the top in automatic evaluation metrics for code generation across five major programming languages, and performs in the first tier for comprehensive human quality assessments across ten aspects of coding tasks.
8K
hunyuan
Hunyuan

Hunyuan FunctionCall

The latest MOE architecture FunctionCall model from Hunyuan, trained on high-quality FunctionCall data, with a context window of 32K, leading in multiple dimensions of evaluation metrics.
32K
hunyuan
Hunyuan

Hunyuan Role

The latest role-playing model from Hunyuan, fine-tuned and trained by Hunyuan's official team, based on the Hunyuan model combined with role-playing scenario datasets for enhanced foundational performance in role-playing contexts.
8K
zhipu
ChatGLM

GLM-Zero-Preview

GLM-Zero-Preview possesses strong complex reasoning capabilities, excelling in logical reasoning, mathematics, programming, and related fields.
16K
zhipu
ChatGLM

GLM-4-Flash

GLM-4-Flash is the ideal choice for handling simple tasks, being the fastest and most cost-effective.
128K
zhipu
ChatGLM

GLM-4-FlashX

GLM-4-FlashX is an enhanced version of Flash, featuring ultra-fast inference speed.
128K
zhipu
ChatGLM

GLM-4-Long

GLM-4-Long supports ultra-long text inputs, suitable for memory-based tasks and large-scale document processing.
1M
zhipu
ChatGLM

GLM-4-Air

GLM-4-Air is a cost-effective version with performance close to GLM-4, offering fast speed at an affordable price.
128K
zhipu
ChatGLM

GLM-4-AirX

GLM-4-AirX provides an efficient version of GLM-4-Air, with inference speeds up to 2.6 times faster.
8K
zhipu
ChatGLM

GLM-4-AllTools

GLM-4-AllTools is a multifunctional intelligent agent model optimized to support complex instruction planning and tool invocation, such as web browsing, code interpretation, and text generation, suitable for multitasking.
128K
zhipu
ChatGLM

GLM-4-Plus

GLM-4-Plus, as a high-intelligence flagship, possesses strong capabilities for processing long texts and complex tasks, with overall performance improvements.
128K
zhipu
ChatGLM

GLM-4-0520

GLM-4-0520 is the latest model version designed for highly complex and diverse tasks, demonstrating outstanding performance.
128K
zhipu
ChatGLM

GLM-4

GLM-4 is the old flagship version released in January 2024, currently replaced by the more powerful GLM-4-0520.
128K
zhipu
ChatGLM

GLM-4V-Flash

GLM-4V-Flash focuses on efficient single image understanding, suitable for scenarios that require rapid image parsing, such as real-time image analysis or batch image processing.
8K
zhipu
ChatGLM

GLM-4V-Plus

GLM-4V-Plus has the ability to understand video content and multiple images, suitable for multimodal tasks.
8K
zhipu
ChatGLM

GLM-4V

GLM-4V provides strong image understanding and reasoning capabilities, supporting various visual tasks.
2K
zhipu
CodeGeeX

CodeGeeX-4

CodeGeeX-4 is a powerful AI programming assistant that supports intelligent Q&A and code completion in various programming languages, enhancing development efficiency.
128K
zhipu
ChatGLM

CharGLM-3

CharGLM-3 is designed for role-playing and emotional companionship, supporting ultra-long multi-turn memory and personalized dialogue, with wide applications.
4K
zhipu

Emohaa

Emohaa is a psychological model with professional counseling capabilities, helping users understand emotional issues.
8K
siliconcloud
DeepSeek

DeepSeek V3

DeepSeek V3 is a Mixture-of-Experts (MoE) language model developed by DeepSeek, delivering outstanding results across evaluations and surpassing other open-source models in mainstream benchmarks.
64K
siliconcloud
Meta

DeepSeek R1 Distill Llama 70B

A distilled model based on Llama-3.3-70B-Instruct, fine-tuned on reasoning samples generated by DeepSeek R1 to transfer its reasoning capabilities.
32K
siliconcloud
Qwen

DeepSeek R1 Distill Qwen 14B

A distilled model based on Qwen2.5-14B, fine-tuned on reasoning samples generated by DeepSeek R1 to transfer its reasoning capabilities.
32K
siliconcloud
Meta

DeepSeek R1 Distill Llama 8B (Free)

A distilled model based on Llama-3.1-8B, fine-tuned on reasoning samples generated by DeepSeek R1 to transfer its reasoning capabilities.
32K
siliconcloud
Qwen

DeepSeek R1 Distill Qwen 7B (Free)

A distilled model based on Qwen2.5-Math-7B, fine-tuned on reasoning samples generated by DeepSeek R1 to transfer its reasoning capabilities.
32K
siliconcloud
Qwen

DeepSeek-R1-Distill-Qwen-1.5B (Free)

A distilled model based on Qwen2.5-Math-1.5B, fine-tuned on reasoning samples generated by DeepSeek R1 to transfer its reasoning capabilities.
32K
siliconcloud
DeepSeek

DeepSeek V2.5

DeepSeek V2.5 combines the excellent features of previous versions, enhancing general and coding capabilities.
32K
siliconcloud
DeepSeek

DeepSeek VL2

DeepSeek-VL2 is a mixture-of-experts visual language model built on DeepSeekMoE-27B. Its sparsely activated MoE architecture achieves outstanding performance while activating only 4.5B parameters.
4K
siliconcloud
Qwen

QVQ 72B Preview

QVQ is an experimental research model developed by the Qwen team, focusing on enhancing visual reasoning capabilities.
32K
siliconcloud
Qwen

Qwen2.5 7B Instruct (Free)

Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks.
32K
siliconcloud
Qwen

Qwen2.5 7B Instruct (LoRA)

Qwen2.5-7B-Instruct is one of the latest large language models released by Alibaba Cloud. This 7B model shows significant improvements in coding and mathematics. It also provides multilingual support, covering over 29 languages, including Chinese and English. The model has made notable advancements in instruction following, understanding structured data, and generating structured outputs, especially JSON.
32K
siliconcloud
Qwen

Qwen2.5 14B Instruct

Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks.
32K
siliconcloud
Qwen

Qwen2.5 32B Instruct

Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks.
32K
siliconcloud
Qwen

Qwen2.5 72B Instruct 128K

Qwen2.5 is a new large language model series with enhanced understanding and generation capabilities.
128K
siliconcloud
Qwen

Qwen2.5 Coder 7B Instruct (Free)

Qwen2.5-Coder-7B-Instruct is the latest version in Alibaba Cloud's series of code-specific large language models. This model significantly enhances code generation, reasoning, and repair capabilities based on Qwen2.5, trained on 5.5 trillion tokens. It not only improves coding abilities but also maintains advantages in mathematics and general capabilities, providing a more comprehensive foundation for practical applications such as code agents.
32K
siliconcloud
Qwen

Qwen2 1.5B Instruct (Free)

Qwen2-1.5B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 1.5B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and group query attention. It excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models. Compared to Qwen1.5-1.8B-Chat, Qwen2-1.5B-Instruct shows significant performance improvements in tests such as MMLU, HumanEval, GSM8K, C-Eval, and IFEval, despite having slightly fewer parameters.
32K
siliconcloud
Qwen

Qwen2 7B Instruct (Free)

Qwen2-7B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 7B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and group query attention. It excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models of comparable size.
32K
siliconcloud
Qwen

Qwen2 VL 72B Instruct

Qwen2-VL is the latest iteration of the Qwen-VL model, achieving state-of-the-art performance in visual understanding benchmarks.
32K
siliconcloud
InternLM

InternLM2.5 7B Chat (Free)

InternLM2.5 offers intelligent dialogue solutions across multiple scenarios.
32K
siliconcloud
InternLM

InternLM2.5 20B Chat

The innovative open-source model InternLM2.5 enhances dialogue intelligence through a large number of parameters.
32K
siliconcloud
InternLM

InternVL2 8B (Pro)

InternVL2 demonstrates exceptional performance across various visual language tasks, including document and chart understanding, scene text understanding, OCR, and solving scientific and mathematical problems.
32K
siliconcloud
InternLM

InternVL2 26B

InternVL2 demonstrates exceptional performance across various visual language tasks, including document and chart understanding, scene text understanding, OCR, and solving scientific and mathematical problems.
32K
siliconcloud
ChatGLM

GLM-4 9B Chat (Free)

GLM-4 9B is an open-source version that provides an optimized conversational experience for chat applications.
128K
siliconcloud
ChatGLM

GLM-4 9B Chat (Pro)

GLM-4-9B-Chat is the open-source version of the GLM-4 series pre-trained models launched by Zhipu AI. This model excels in semantics, mathematics, reasoning, code, and knowledge. In addition to supporting multi-turn dialogues, GLM-4-9B-Chat also features advanced capabilities such as web browsing, code execution, custom tool invocation (Function Call), and long-text reasoning. The model supports 26 languages, including Chinese, English, Japanese, Korean, and German. In multiple benchmark tests, GLM-4-9B-Chat has demonstrated excellent performance, such as in AlignBench-v2, MT-Bench, MMLU, and C-Eval. The model supports a maximum context length of 128K, making it suitable for academic research and commercial applications.
128K
siliconcloud
ChatGLM

ChatGLM3 6B (Free)

ChatGLM3-6B is an open-source model in the ChatGLM series developed by Zhipu AI. It retains the smooth dialogue and low deployment threshold of its predecessors while introducing a more diverse training dataset, more training steps, and a more reasonable training strategy.
32K
siliconcloud
Yi

Yi-1.5 6B Chat (Free)

Yi-1.5-6B-Chat is a variant of the Yi-1.5 series, belonging to the open-source chat model. Yi-1.5 is an upgraded version of Yi, continuously pre-trained on 500B high-quality corpora and fine-tuned on over 3M diverse samples. Compared to Yi, Yi-1.5 demonstrates stronger capabilities in coding, mathematics, reasoning, and instruction following, while maintaining excellent language understanding, common sense reasoning, and reading comprehension abilities. The model is available in context length versions of 4K, 16K, and 32K, with a total pre-training volume reaching 3.6T tokens.
4K
siliconcloud
Yi

Yi-1.5 9B Chat 16K (Free)

Yi-1.5 9B supports 16K tokens, providing efficient and smooth language generation capabilities.
16K
siliconcloud
Yi

Yi-1.5 34B Chat 16K

Yi-1.5 34B delivers superior performance in industry applications with a wealth of training samples.
16K
siliconcloud
Meta

Llama 3.1 8B Instruct (Free)

LLaMA 3.1 provides multilingual support and is one of the industry's leading generative models.
32K
siliconcloud
Meta

Llama 3.1 70B Instruct

LLaMA 3.1 70B offers efficient conversational support in multiple languages.
32K
siliconcloud
Meta

Llama 3.1 405B Instruct

LLaMA 3.1 405B is a powerful model for pre-training and instruction tuning.
32K
siliconcloud
Meta

Llama 3.3 70B Instruct

Llama 3.3 is Meta's 70B-parameter multilingual large language model, instruction-tuned and optimized for multilingual dialogue use cases.
32K
siliconcloud

TeleChat2

TeleChat2 is a large language model developed from scratch by China Telecom, supporting capabilities such as encyclopedia Q&A, code generation, and long-text generation.
8K
siliconcloud

TeleMM

TeleMM is a multimodal large model developed by China Telecom, capable of handling inputs such as text and images, and supporting image understanding and chart analysis.
32K
zeroone
Yi

Yi Lightning

The latest high-performance model, ensuring high-quality output while significantly improving reasoning speed.
16K
zeroone
Yi

Yi Spark

Small yet powerful, lightweight and fast model. Provides enhanced mathematical computation and coding capabilities.
16K
zeroone
Yi

Yi Medium

Medium-sized model upgraded and fine-tuned, balanced capabilities, and high cost-performance ratio. Deeply optimized instruction-following capabilities.
16K
zeroone
Yi

Yi Medium 200K

200K ultra-long context window, providing deep understanding and generation capabilities for long texts.
200K
zeroone
Yi

Yi Large Turbo

Exceptional performance at a high cost-performance ratio. Conducts high-precision tuning based on performance, inference speed, and cost.
16K
zeroone
Yi

Yi Large RAG

High-level service based on the yi-large super strong model, combining retrieval and generation techniques to provide precise answers and real-time information retrieval services.
16K
zeroone
Yi

Yi Large FC

Based on the yi-large model, supports and enhances tool invocation capabilities, suitable for various business scenarios requiring agent or workflow construction.
32K
zeroone
Yi

Yi Large

A new hundred-billion-parameter model, providing super strong question-answering and text generation capabilities.
32K
zeroone
Yi

Yi Vision

Model for complex visual tasks, providing high-performance image understanding and analysis capabilities.
16K
zeroone
Yi

Yi Large Preview

The initial preview version; the newer yi-large is recommended instead.
16K
zeroone
Yi

Yi Lightning Lite

A lightweight version; yi-lightning is recommended instead.
16K
spark
Spark

Spark Lite

Spark Lite is a lightweight large language model with extremely low latency and efficient processing capabilities, completely free and open, supporting real-time online search functionality. Its quick response feature makes it excel in inference applications and model fine-tuning on low-power devices, providing users with excellent cost-effectiveness and intelligent experiences, particularly in knowledge Q&A, content generation, and search scenarios.
8K
spark
Spark

Spark Pro

Spark Pro is a high-performance large language model optimized for professional fields, focusing on mathematics, programming, healthcare, education, and more, supporting online search and built-in plugins for weather, dates, etc. Its optimized model demonstrates excellent performance and efficiency in complex knowledge Q&A, language understanding, and high-level text creation, making it an ideal choice for professional application scenarios.
8K
spark
Spark

Spark Pro 128K

Spark Pro 128K is equipped with an extra-large context processing capability, able to handle up to 128K of contextual information, making it particularly suitable for long-form content that requires comprehensive analysis and long-term logical connections, providing smooth and consistent logic and diverse citation support in complex text communication.
128K
spark
Spark

Spark Max

Spark Max is the most comprehensive version, supporting online search and numerous built-in plugins. Its fully optimized core capabilities, system role settings, and function calling make it highly effective in a variety of complex application scenarios.
8K
spark
Spark

Spark Max 32K

Spark Max 32K is configured with large context processing capabilities, enhanced contextual understanding, and logical reasoning abilities, supporting text input of 32K tokens, suitable for long document reading, private knowledge Q&A, and other scenarios.
32K
spark
Spark

Spark 4.0 Ultra

Spark4.0 Ultra is the most powerful version in the Spark large model series, enhancing text content understanding and summarization capabilities while upgrading online search links. It is a comprehensive solution for improving office productivity and accurately responding to demands, leading the industry as an intelligent product.
8K
sensenova
SenseNova

SenseChat 5.5

The latest version model (V5.5) with a context length of 128K shows significant improvements in mathematical reasoning, English conversation, instruction following, and long text comprehension, comparable to GPT-4o.
128K
sensenova
SenseNova

SenseChat 5.0 Turbo

Suitable for fast question answering and model fine-tuning scenarios.
32K
sensenova
SenseNova

SenseChat 5.0 Cantonese

With a context length of 32K, it surpasses GPT-4 in Cantonese conversation comprehension and is competitive with GPT-4 Turbo in knowledge, reasoning, mathematics, and code writing across multiple domains.
32K
sensenova
SenseNova

SenseChat 4.0 128K

Basic version model (V4) with a context length of 128K, excelling in long text comprehension and generation tasks.
128K
sensenova
SenseNova

SenseChat 4.0 32K

Basic version model (V4) with a context length of 32K, flexibly applicable to various scenarios.
32K
sensenova
SenseNova

SenseChat 4.0 4K

Basic version model (V4) with a context length of 4K, featuring strong general capabilities.
4K
sensenova
SenseNova

SenseChat Character

Standard version model with an 8K context length and high response speed.
8K
sensenova
SenseNova

SenseChat Character Pro

Advanced version model with a context length of 32K, offering comprehensive capability enhancements and supporting both Chinese and English conversations.
32K
stepfun
Stepfun

Step 1 Flash

High-speed model, suitable for real-time dialogues.
8K
stepfun
Stepfun

Step 1 8K

Small model, suitable for lightweight tasks.
8K
stepfun
Stepfun

Step 1 32K

Supports medium-length dialogues, applicable to various application scenarios.
32K
stepfun
Stepfun

Step 1 128K

Balances performance and cost, suitable for general scenarios.
128K
stepfun
Stepfun

Step 1 256K

Equipped with ultra-long context processing capabilities, especially suitable for long document analysis.
256K
stepfun
Stepfun

Step 2 16K

Supports large-scale context interactions, suitable for complex dialogue scenarios.
16K
stepfun
Stepfun

Step 2 Mini

A high-speed model based on Stepfun's new self-developed MFA attention architecture, achieving results similar to Step 1 at very low cost while maintaining higher throughput and faster response latency.
8K
stepfun
Stepfun

Step 2 16K Exp

An experimental version of the Step 2 model, containing the latest features and updated on a rolling basis.
16K
stepfun
Stepfun

Step 1V 8K

A small visual model suitable for basic text and image tasks.
8K
stepfun
Stepfun

Step 1V 32K

Supports visual input, enhancing multimodal interaction experiences.
32K
stepfun
Stepfun

Step 1o Vision 32K

A visual model with powerful image understanding capabilities, offering stronger visual performance than the 1V series models.
32K
stepfun
Stepfun

Step 1.5V Mini

This model has powerful video understanding capabilities.
32K
moonshot
MoonshotAI

Moonshot V1 8K

Moonshot V1 8K is designed for generating short text tasks, featuring efficient processing performance, capable of handling 8,192 tokens, making it ideal for brief dialogues, note-taking, and rapid content generation.
8K
moonshot
MoonshotAI

Moonshot V1 32K

Moonshot V1 32K offers medium-length context processing capabilities, able to handle 32,768 tokens, particularly suitable for generating various long documents and complex dialogues, applicable in content creation, report generation, and dialogue systems.
32K
moonshot
MoonshotAI

Moonshot V1 128K

Moonshot V1 128K is a model with ultra-long context processing capabilities, suitable for generating extremely long texts, meeting the demands of complex generation tasks, capable of handling up to 128,000 tokens, making it ideal for research, academia, and large document generation.
128K
baichuan
Baichuan

Baichuan 4

Ranked first in capability among domestic models, it surpasses mainstream foreign models in Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also boasts industry-leading multimodal capabilities, excelling in multiple authoritative evaluation benchmarks.
32K
baichuan
Baichuan

Baichuan 4 Turbo

The leading model in the country, surpassing mainstream foreign models in Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also possesses industry-leading multimodal capabilities, excelling in multiple authoritative evaluation benchmarks.
32K
baichuan
Baichuan

Baichuan 4 Air

The leading model in the country, surpassing mainstream foreign models in Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also possesses industry-leading multimodal capabilities, excelling in multiple authoritative evaluation benchmarks.
32K
baichuan
Baichuan

Baichuan 3 Turbo

Optimized for high-frequency enterprise scenarios, significantly improving performance and cost-effectiveness. Compared to the Baichuan2 model, content creation improves by 20%, knowledge Q&A by 17%, and role-playing ability by 40%. Overall performance is superior to GPT-3.5.
32K
baichuan
Baichuan

Baichuan 3 Turbo 128k

Features a 128K ultra-long context window, optimized for high-frequency enterprise scenarios, significantly improving performance and cost-effectiveness. Compared to the Baichuan2 model, content creation improves by 20%, knowledge Q&A by 17%, and role-playing ability by 40%. Overall performance is superior to GPT-3.5.
128K
baichuan
Baichuan

Baichuan 2 Turbo

Utilizes search enhancement technology to achieve comprehensive links between large models and domain knowledge, as well as knowledge from the entire web. Supports uploads of various documents such as PDF and Word, and URL input, providing timely and comprehensive information retrieval with accurate and professional output.
32K
minimax
Minimax

abab6.5s

Suitable for a wide range of natural language processing tasks, including text generation and dialogue systems.
245K
minimax
Minimax

abab6.5g

Designed for multilingual persona dialogue, supporting high-quality dialogue generation in English and other languages.
8K
minimax
Minimax

abab6.5t

Optimized for Chinese persona dialogue scenarios, providing smooth dialogue generation that aligns with Chinese expression habits.
8K
minimax
Minimax

abab5.5

Targeted at productivity scenarios, supporting complex task processing and efficient text generation, suitable for professional applications.
16K
minimax
Minimax

abab5.5s

Designed for Chinese persona dialogue scenarios, providing high-quality Chinese dialogue generation capabilities, suitable for various application contexts.
8K
internlm
InternLM

InternLM2.5

Our latest model series, featuring exceptional reasoning performance, supporting a context length of 1M, and enhanced instruction following and tool invocation capabilities.
32K
internlm
InternLM

InternLM2 Pro Chat

An older version of the model that we still maintain, available in 7B and 20B parameter sizes.
32K
higress
Qwen

Qwen Turbo

Qwen Turbo is a large-scale language model supporting input in various languages including Chinese and English.
128K
higress
Qwen

Qwen Plus

Qwen Plus is an enhanced large-scale language model supporting input in various languages including Chinese and English.
128K
higress
Qwen

Qwen Max

Qwen Max is a trillion-parameter-scale large language model that supports input in various languages including Chinese and English, and is the API model behind the current Qwen 2.5 product version.
32K
higress
Qwen

Qwen VL Plus

An enhanced version of Tongyi Qianwen's large-scale visual language model, with significantly improved detail recognition and text recognition, supporting ultra-high-resolution images of any aspect ratio.
32K
higress
Qwen

Qwen2.5 Math 1.5B

--
4K
higress
Qwen

Qwen2.5 Coder 1.5B

--
128K
higress
AI360

360GPT2 Pro

360GPT2 Pro is an advanced natural language processing model launched by 360, featuring exceptional text generation and understanding capabilities, particularly excelling in generation and creative tasks, capable of handling complex language transformations and role-playing tasks.
8K
higress
AI360

360GPT Pro

360GPT Pro, as an important member of the 360 AI model series, meets diverse natural language application scenarios with efficient text processing capabilities, supporting long text understanding and multi-turn dialogue.
8K
higress
AI360

360GPT Turbo

360GPT Turbo offers powerful computation and dialogue capabilities, with excellent semantic understanding and generation efficiency, making it an ideal intelligent assistant solution for enterprises and developers.
8K
higress
AI360

360GPT Turbo Responsibility 8K

360GPT Turbo Responsibility 8K emphasizes semantic safety and responsibility, designed specifically for applications with high content safety requirements, ensuring accuracy and robustness in user experience.
8K
higress
Wenxin

ERNIE 3.5 8K

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 3.5 8K Preview

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 3.5 128K

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information.
128K
higress
Wenxin

ERNIE 4.0 8K

Baidu's self-developed flagship ultra-large-scale language model, which has achieved a comprehensive upgrade in model capabilities compared to ERNIE 3.5, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 4.0 8K Preview

Baidu's self-developed flagship ultra-large-scale language model, which has achieved a comprehensive upgrade in model capabilities compared to ERNIE 3.5, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 4.0 Turbo 8K

Baidu's self-developed flagship ultra-large-scale language model, demonstrating excellent overall performance, suitable for complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information. It offers better performance compared to ERNIE 4.0.
8K
higress
Wenxin

ERNIE 4.0 Turbo 8K Preview

Baidu's self-developed flagship ultra-large-scale language model, demonstrating excellent overall performance, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information. It outperforms ERNIE 4.0.
8K
higress
Wenxin

ERNIE Lite Pro 128K

Baidu's self-developed lightweight large language model, balancing excellent model performance with inference efficiency, offering better results than ERNIE Lite, suitable for inference on low-power AI acceleration cards.
128K
higress
Wenxin

ERNIE Speed Pro 128K

Baidu's latest self-developed high-performance large language model released in 2024, with outstanding general capabilities, providing better results than ERNIE Speed, suitable as a base model for fine-tuning, effectively addressing specific scenario issues while also exhibiting excellent inference performance.
128K
higress
Wenxin

ERNIE Speed 128K

Baidu's latest self-developed high-performance large language model released in 2024, with outstanding general capabilities, suitable as a base model for fine-tuning, effectively addressing specific scenario issues while also exhibiting excellent inference performance.
128K
higress
Wenxin

ERNIE Character 8K

Baidu's self-developed vertical scene large language model, suitable for applications such as game NPCs, customer service dialogues, and role-playing conversations, featuring more distinct and consistent character styles, stronger adherence to instructions, and superior inference performance.
8K
higress
Hunyuan

Hunyuan Large

--
higress
OpenAI

GPT 3.5 Turbo

GPT 3.5 Turbo is an efficient model provided by OpenAI, suitable for chat and text generation tasks, supporting parallel function calls.
16K
higress
OpenAI

GPT 3.5 Turbo 16K

GPT 3.5 Turbo 16k is a high-capacity text generation model suitable for complex tasks.
16K
higress
OpenAI

GPT 4 Turbo with Vision Preview

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
higress
Claude

Claude 3.5 Sonnet

Claude 3.5 Sonnet offers capabilities that surpass Opus and faster speeds than Sonnet, while maintaining the same pricing as Sonnet. Sonnet excels particularly in programming, data science, visual processing, and agent tasks.
200K
higress
Gemini

Gemini 1.5 Flash 0827

Gemini 1.5 Flash 0827 provides optimized multimodal processing capabilities, suitable for various complex task scenarios.
1M
higress
Gemini

Gemini 1.5 Pro 0827

Gemini 1.5 Pro 0827 combines the latest optimization technologies for more efficient multimodal data processing.
2M
higress
Gemini

Gemini 1.5 Pro 0801

Gemini 1.5 Pro 0801 offers excellent multimodal processing capabilities, providing greater flexibility for application development.
2M
higress
Cohere

command-light

--
higress
Doubao

Doubao-lite-4k

--
higress
Doubao

Doubao-lite-32k

--
higress
Doubao

Doubao-lite-128k

--
higress
Doubao

Doubao-pro-4k

--
higress
Doubao

Doubao-pro-32k

--
higress
Doubao

Doubao-pro-128k

--
higress
ByteDance

Skylark2-pro-character-4k

--
higress
ByteDance

Skylark2-pro-32k

--
higress
ByteDance

Skylark2-pro-4k

--
higress
ByteDance

Skylark2-pro-turbo-8k

--
higress
ByteDance

Skylark2-lite-8k

--
giteeai
Qwen

Qwen2.5 Coder 14B Instruct

--
24K
giteeai
Qwen

Qwen2 VL 72B

--
32K
giteeai
InternLM

InternVL2.5 26B

InternVL2.5-26B is a powerful visual language model that supports multimodal processing of images and text, capable of accurately recognizing image content and generating relevant descriptions or answers.
32K
giteeai
InternLM

InternVL2 8B

InternVL2-8B is a powerful visual language model that supports multimodal processing of images and text, capable of accurately recognizing image content and generating relevant descriptions or answers.
32K
giteeai
Yi

Yi 34B Chat

--
4K
giteeai
DeepSeek

DeepSeek Coder 33B Instruct

DeepSeek Coder 33B is a code language model trained on 2 trillion tokens, of which 87% are code and 13% are natural language in Chinese and English. The model introduces a 16K window size and a fill-in-the-middle objective, providing project-level code completion and snippet-filling capabilities.
8K
giteeai
CodeGeeX

CodeGeeX4 All 9B

CodeGeeX4-ALL-9B is a multilingual code generation model that supports comprehensive functions including code completion and generation, code interpretation, web search, function calls, and repository-level code Q&A, covering various scenarios in software development. It is a top-tier code generation model with fewer than 10B parameters.
32K
taichu
AiMass

Taichu 2.0

The ZD Taichu language model possesses strong language understanding capabilities and excels in text creation, knowledge Q&A, code programming, mathematical calculations, logical reasoning, sentiment analysis, and text summarization. It innovatively combines large-scale pre-training with rich knowledge from multiple sources, continuously refining its algorithms and absorbing new vocabulary, structure, grammar, and semantics from vast text data, so that model performance steadily improves. It provides users with more convenient information and services, as well as a more intelligent experience.
32K
taichu
AiMass

Taichu 2.0V

--
4K
ai360
AI360

360GPT2 o1

--
8K