Unitalk
All Providers
unitalk
OpenAI

OpenAI o1-mini

o1-mini is a fast and cost-effective reasoning model designed for programming, mathematics, and scientific applications. This model features a 128K context and has a knowledge cutoff date of October 2023.
--
unitalk
OpenAI

OpenAI o1-preview

o1 is OpenAI's new reasoning model, suitable for complex tasks that require extensive general knowledge. This model features a 128K context and has a knowledge cutoff date of October 2023.
--
unitalk
OpenAI

GPT-4o

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
--
unitalk
OpenAI

GPT-4o mini

GPT-4o mini is the latest model released by OpenAI after GPT-4 Omni, supporting both image and text input while outputting text. As their most advanced small model, it is significantly cheaper than other recent cutting-edge models, costing over 60% less than GPT-3.5 Turbo. It maintains state-of-the-art intelligence while offering remarkable cost-effectiveness. GPT-4o mini scored 82% on the MMLU test and currently ranks higher than GPT-4 in chat preferences.
--
unitalk
Gemini

Gemini 1.5 Pro

Gemini 1.5 Pro supports a context of up to 2 million tokens, making it an ideal mid-sized multimodal model that provides versatile support for complex tasks.
--
unitalk
Gemini

Gemini 1.5 Flash

Gemini 1.5 Flash is Google's latest multimodal AI model, featuring fast processing capabilities and supporting text, image, and video inputs, making it suitable for efficient scaling across various tasks.
--
unitalk
Claude

Claude 3.5 Sonnet

Claude 3.5 Sonnet offers capabilities beyond Claude 3 Opus and faster speeds than Claude 3 Sonnet, at the same price as Sonnet. It excels particularly at programming, data science, visual processing, and agent tasks.
--
unitalk
Claude

Claude 3 Haiku

Claude 3 Haiku is Anthropic's fastest and most compact model, designed for near-instantaneous responses with fast, accurate, targeted performance.
--
unitalk
Claude

Claude 3 Opus

Claude 3 Opus is Anthropic's most powerful model for handling highly complex tasks. It excels in performance, intelligence, fluency, and comprehension.
--
unitalk
Mistral

Mistral Nemo

Mistral Nemo is a 12B model developed in collaboration with NVIDIA, offering outstanding reasoning and coding performance and serving as an easy drop-in replacement.
--
unitalk
Mistral

Mistral Small

Mistral Small is a cost-effective, fast, and reliable option suitable for use cases such as translation, summarization, and sentiment analysis.
--
unitalk
Mistral

Mistral Large

Mistral Large is the flagship model, excelling in multilingual tasks, complex reasoning, and code generation, making it an ideal choice for high-end applications.
--
unitalk
Mistral

Codestral

Codestral is a cutting-edge generative model focused on code generation, optimized for fill-in-the-middle and code completion tasks.
--
unitalk
Mistral

Pixtral 12B

The Pixtral model demonstrates strong capabilities in tasks such as chart and image understanding, document question answering, multimodal reasoning, and instruction following. It can ingest images at natural resolutions and aspect ratios and handle an arbitrary number of images within a long context window of up to 128K tokens.
--
unitalk
Meta

Llama 3.1 Sonar Small Online

The Llama 3.1 Sonar Small Online model has 8B parameters and supports a context length of approximately 127,000 tokens. Designed for online chat, it handles a variety of text interactions efficiently.
--
unitalk
Meta

Llama 3.1 Sonar Large Online

The Llama 3.1 Sonar Large Online model has 70B parameters and supports a context length of approximately 127,000 tokens, making it suitable for high-capacity and diverse chat tasks.
--
unitalk
Meta

Llama 3.1 Sonar Huge Online

The Llama 3.1 Sonar Huge Online model has 405B parameters and supports a context length of approximately 127,000 tokens, designed for complex online chat applications.
--
unitalk
DeepSeek

DeepSeek V3

A new open-source model that integrates general and coding capabilities, retaining the general conversational abilities of the original Chat model and the powerful code handling of the Coder model while aligning better with human preferences. It also achieves significant improvements in writing tasks, instruction following, and more.
64K
unitalk
DeepSeek

DeepSeek R1

R1 is the reasoning model launched by DeepSeek. Before outputting the final answer, the model first provides a chain of thought to enhance the accuracy of the final response.
64K
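Consumers of an R1-style completion often want the chain of thought and the final answer separated. A minimal sketch, assuming the reasoning is delimited by <think>...</think> tags in the raw completion text (a common convention for R1 outputs, not something this catalog specifies):

```python
# Split an R1-style completion into (chain_of_thought, final_answer).
# Assumption: reasoning is wrapped in <think>...</think> tags.
def split_reasoning(raw: str) -> tuple[str, str]:
    start, end = raw.find("<think>"), raw.find("</think>")
    if start == -1 or end == -1:
        # No reasoning block found; treat the whole text as the answer.
        return "", raw.strip()
    thought = raw[start + len("<think>"):end].strip()
    answer = raw[end + len("</think>"):].strip()
    return thought, answer

sample = "<think>2+2 is basic arithmetic.</think>The answer is 4."
thought, answer = split_reasoning(sample)
```

Depending on the serving stack, the reasoning may instead arrive in a separate response field, in which case no parsing is needed.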
openai
OpenAI

OpenAI o4 Mini

o4-mini.description
200K
openai
OpenAI

OpenAI o3

o3.description
200K
openai
OpenAI

GPT-4.1 mini

gpt-4.1-mini.description
1M
openai
OpenAI

GPT-4.1 nano

gpt-4.1-nano.description
1M
openai
OpenAI

GPT-4o 1120

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
openai
OpenAI

GPT-4.1

gpt-4.1.description
1M
openai
OpenAI

GPT-4o 0806

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
openai
OpenAI

GPT-4o 0513

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
openai
OpenAI

ChatGPT-4o

ChatGPT-4o is a dynamic model that updates in real-time to stay current with the latest version. It combines powerful language understanding and generation capabilities, making it suitable for large-scale applications, including customer service, education, and technical support.
128K
openai
OpenAI

GPT-4 Turbo

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
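The JSON-format visual request mentioned above can be sketched as an OpenAI-compatible chat-completions payload with an image content part. The model id and image URL below are placeholders, and nothing is sent over the network; this only shows the request shape:

```python
import json

# Build (but do not send) a vision request body for /v1/chat/completions.
payload = {
    "model": "gpt-4-turbo",  # placeholder model id
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
}

body = json.dumps(payload)  # this JSON body would be POSTed to the API
```

Function calling works the same way: a `tools` array is added alongside `messages` in the same payload.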
openai
OpenAI

GPT-4 Turbo Vision 0409

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
openai
OpenAI

GPT-4 Turbo Preview

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
openai
OpenAI

GPT-4 Turbo Preview 0125

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
openai
OpenAI

GPT-4 Turbo Preview 1106

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
openai
OpenAI

GPT-4

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
8K
openai
OpenAI

GPT-4 0613

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
8K
openai
OpenAI

GPT-4 32K

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
32K
openai
OpenAI

GPT-4 32K 0613

GPT-4 offers a larger context window, capable of handling longer text inputs, making it suitable for scenarios that require extensive information integration and data analysis.
32K
openai
OpenAI

GPT-3.5 Turbo

GPT-3.5 Turbo is suitable for various text generation and understanding tasks. Currently points to gpt-3.5-turbo-0125.
16K
openai
OpenAI

GPT-3.5 Turbo 0125

GPT-3.5 Turbo is suitable for various text generation and understanding tasks. This is the gpt-3.5-turbo-0125 snapshot.
16K
openai
OpenAI

GPT-3.5 Turbo 1106

GPT-3.5 Turbo is suitable for various text generation and understanding tasks. This is the gpt-3.5-turbo-1106 snapshot.
16K
openai
OpenAI

GPT-3.5 Turbo Instruct

GPT-3.5 Turbo Instruct is suitable for various text generation and understanding tasks.
4K
ollama
Meta

Llama 3.1 8B

Llama 3.1 is a leading model family launched by Meta with variants of up to 405B parameters, applicable to complex dialogue, multilingual translation, and data analysis.
128K
ollama
Meta

Llama 3.1 70B

Llama 3.1 is a leading model family launched by Meta with variants of up to 405B parameters, applicable to complex dialogue, multilingual translation, and data analysis.
128K
ollama
Meta

Llama 3.1 405B

Llama 3.1 is a leading model family launched by Meta with variants of up to 405B parameters, applicable to complex dialogue, multilingual translation, and data analysis.
128K
ollama
Meta

Code Llama 7B

Code Llama is an LLM focused on code generation and discussion, combining extensive programming language support, suitable for developer environments.
16K
ollama
Meta

Code Llama 13B

Code Llama is an LLM focused on code generation and discussion, combining extensive programming language support, suitable for developer environments.
16K
ollama
Meta

Code Llama 34B

Code Llama is an LLM focused on code generation and discussion, combining extensive programming language support, suitable for developer environments.
16K
ollama
Meta

Code Llama 70B

Code Llama is an LLM focused on code generation and discussion, combining extensive programming language support, suitable for developer environments.
16K
ollama
Qwen

QwQ 32B

QwQ is an experimental research model focused on improving AI reasoning capabilities.
128K
ollama
Qwen

Qwen2.5 0.5B

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
128K
ollama
Qwen

Qwen2.5 1.5B

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
128K
ollama
Qwen

Qwen2.5 7B

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
128K
ollama
Qwen

Qwen2.5 72B

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
128K
ollama
Qwen

CodeQwen1.5 7B

CodeQwen1.5 is a large language model trained on extensive code data, specifically designed to solve complex programming tasks.
64K
ollama
Qwen

Qwen2 0.5B

Qwen2 is Alibaba's next-generation large-scale language model, supporting diverse application needs with excellent performance.
128K
ollama
Qwen

Qwen2 1.5B

Qwen2 is Alibaba's next-generation large-scale language model, supporting diverse application needs with excellent performance.
128K
ollama
Qwen

Qwen2 7B

Qwen2 is Alibaba's next-generation large-scale language model, supporting diverse application needs with excellent performance.
128K
ollama
Qwen

Qwen2 72B

Qwen2 is Alibaba's next-generation large-scale language model, supporting diverse application needs with excellent performance.
128K
ollama
Gemma

Gemma 2 2B

Gemma 2 is an efficient model launched by Google, covering a variety of application scenarios from small applications to complex data processing.
8K
ollama
Gemma

Gemma 2 9B

Gemma 2 is an efficient model launched by Google, covering a variety of application scenarios from small applications to complex data processing.
8K
ollama
Gemma

Gemma 2 27B

Gemma 2 is an efficient model launched by Google, covering a variety of application scenarios from small applications to complex data processing.
8K
ollama
Gemma

CodeGemma 2B

CodeGemma is a lightweight language model dedicated to various programming tasks, supporting rapid iteration and integration.
8K
ollama
Gemma

CodeGemma 7B

CodeGemma is a lightweight language model dedicated to various programming tasks, supporting rapid iteration and integration.
8K
ollama
Azure

Phi-3 3.8B

Phi-3 is a lightweight open model launched by Microsoft, suitable for efficient integration and large-scale knowledge reasoning.
128K
ollama
Azure

Phi-3 14B

Phi-3 is a lightweight open model launched by Microsoft, suitable for efficient integration and large-scale knowledge reasoning.
128K
ollama
Azure

WizardLM 2 7B

WizardLM 2 is a language model provided by Microsoft AI, excelling in complex dialogues, multilingual capabilities, reasoning, and intelligent assistant applications.
32K
ollama
Azure

WizardLM 2 8x22B

WizardLM 2 is a language model provided by Microsoft AI, excelling in complex dialogues, multilingual capabilities, reasoning, and intelligent assistant applications.
64K
ollama
Mistral

MathΣtral 7B

MathΣtral is designed for scientific research and mathematical reasoning, providing effective computational capabilities and result interpretation.
32K
ollama
Mistral

Mistral 7B

Mistral is a 7B model released by Mistral AI, suitable for diverse language processing needs.
32K
ollama
Mistral

Mixtral 8x7B

Mixtral is an expert model from Mistral AI, featuring open-source weights and providing support in code generation and language understanding.
32K
ollama
Mistral

Mixtral 8x22B

Mixtral is an expert model from Mistral AI, featuring open-source weights and providing support in code generation and language understanding.
64K
ollama
Mistral

Mistral Large 123B

Mistral Large is Mistral's flagship model, combining capabilities in code generation, mathematics, and reasoning, and supporting a 128K context window.
128K
ollama
Mistral

Mistral Nemo 12B

Mistral Nemo is a high-performance 12B model developed by Mistral AI in collaboration with NVIDIA.
128K
ollama
Mistral

Codestral 22B

Codestral is Mistral AI's first code model, providing excellent support for code generation tasks.
32K
ollama
Aya

Aya 23 8B

Aya 23 is a multilingual model launched by Cohere, supporting 23 languages, facilitating diverse language applications.
8K
ollama
Aya

Aya 23 35B

Aya 23 is a multilingual model launched by Cohere, supporting 23 languages, facilitating diverse language applications.
8K
ollama
Cohere

Command R 35B

Command R is an LLM optimized for dialogue and long context tasks, particularly suitable for dynamic interactions and knowledge management.
128K
ollama
Cohere

Command R+ 104B

Command R+ is a high-performance large language model designed for real enterprise scenarios and complex applications.
128K
ollama
DeepSeek

DeepSeek V2 16B

DeepSeek V2 is an efficient Mixture-of-Experts language model, suitable for cost-effective processing needs.
32K
ollama
DeepSeek

DeepSeek V2 236B

DeepSeek V2 is an efficient Mixture-of-Experts language model, suitable for cost-effective processing needs.
128K
ollama
DeepSeek

DeepSeek Coder V2 16B

DeepSeek Coder V2 is an open-source Mixture-of-Experts code model that performs excellently on coding tasks, comparable to GPT-4 Turbo.
128K
ollama
DeepSeek

DeepSeek Coder V2 236B

DeepSeek Coder V2 is an open-source Mixture-of-Experts code model that performs excellently on coding tasks, comparable to GPT-4 Turbo.
128K
ollama
LLaVA

LLaVA 7B

LLaVA is a multimodal model that combines a visual encoder with Vicuna for powerful visual and language understanding.
4K
ollama
LLaVA

LLaVA 13B

LLaVA is a multimodal model that combines a visual encoder with Vicuna for powerful visual and language understanding.
4K
ollama
LLaVA

LLaVA 34B

LLaVA is a multimodal model that combines a visual encoder with Vicuna for powerful visual and language understanding.
4K
ollama

MiniCPM-V 8B

MiniCPM-V is a next-generation multimodal large model launched by OpenBMB, boasting exceptional OCR recognition and multimodal understanding capabilities, supporting a wide range of application scenarios.
128K
anthropic
Claude

Claude 3.7 Sonnet

0.description
200K
anthropic
Claude

Claude 3.5 Haiku

Claude 3.5 Haiku is Anthropic's fastest next-generation model. Compared with Claude 3 Haiku, it improves across a range of skills and surpasses the previous generation's largest model, Claude 3 Opus, on many intelligence benchmarks.
200K
anthropic
Claude

Claude 3.5 Sonnet

Claude 3.5 Sonnet offers capabilities beyond Claude 3 Opus and faster speeds than Claude 3 Sonnet, at the same price as Sonnet. It excels particularly at programming, data science, visual processing, and agent tasks.
200K
anthropic
Claude

Claude 3 Sonnet

Claude 3 Sonnet provides an ideal balance of intelligence and speed for enterprise workloads. It offers maximum utility at a lower price, reliable and suitable for large-scale deployment.
200K
anthropic
Claude

Claude 2.1

Claude 2.1 delivers advancements in key capabilities for enterprises, including an industry-leading 200K token context window, significantly reduced rates of model hallucination, system prompts, and a new beta feature: tool use.
200K
anthropic
Claude

Claude 2.0

Claude 2 demonstrates high capability across a wide range of tasks, from complex conversations and creative content generation to detailed instruction following.
97K
bedrock

Nova Pro

0.description
300K
bedrock

Nova Lite

0.description
300K
bedrock

Nova Micro

0.description
128K
bedrock
Claude

Claude 3.5 Sonnet v2 (Inference profile)

0.description
200K
bedrock
Claude

Claude 3.5 Sonnet 0620

Claude 3.5 Sonnet offers capabilities beyond Claude 3 Opus and faster speeds than Claude 3 Sonnet, at the same price as Sonnet. It excels particularly at programming, data science, visual processing, and agent tasks.
200K
bedrock
Claude

Claude 3 Haiku

Claude 3 Haiku is Anthropic's fastest and most compact model, designed for near-instantaneous responses with fast, accurate, targeted performance.
200K
bedrock
Claude

Claude 3 Sonnet

Claude 3 Sonnet provides an ideal balance of intelligence and speed for enterprise workloads. It offers maximum utility at a lower price, reliable and suitable for large-scale deployment.
200K
bedrock
Claude

Claude 3 Opus

Claude 3 Opus is Anthropic's most powerful model for handling highly complex tasks. It excels in performance, intelligence, fluency, and comprehension.
200K
bedrock
Claude

Claude 2.1

Claude 2.1 delivers advancements in key capabilities for enterprises, including an industry-leading 200K token context window, significantly reduced rates of model hallucination, system prompts, and a new beta feature: tool use.
200K
bedrock
Claude

Claude 2.0

Anthropic's model demonstrates high capability across a wide range of tasks, from complex conversations and creative content generation to detailed instruction following.
97K
bedrock
Claude

Claude Instant

A fast, economical, yet still highly capable model that can handle a range of tasks, including everyday conversations, text analysis, summarization, and document Q&A.
97K
bedrock
Meta

Llama 3.1 8B Instruct

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
bedrock
Meta

Llama 3.1 70B Instruct

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
bedrock
Meta

Llama 3.1 405B Instruct

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
bedrock
Meta

Llama 3 8B Instruct

A versatile 8-billion parameter model optimized for dialogue and text generation tasks.
8K
bedrock
Meta

Llama 3 70B Instruct

A powerful 70-billion parameter model excelling in reasoning, coding, and broad language applications.
8K
google
Gemini

Gemini 2.5 Pro

gemini-2.5-pro-preview-03-25.description
1M
google
Gemini

Gemini 2.0 Flash Thinking

gemini-2.0-flash-thinking-exp-01-21.description
1M
google
Gemini

Gemini 2.0 Flash Thinking Experimental 1219

gemini-2.0-flash-thinking-exp-1219.description
40K
google
Gemini

Gemini 2.0 Flash

gemini-2.0-flash.description
1M
google
Gemini

Gemini 2.5 Flash

gemini-2.5-flash-standard.description
1M
google
Gemini

Gemini 2.5 Flash Thinking

gemini-2.5-flash-thinking.description
1M
google
Gemini

Gemini Experimental 1206

gemini-exp-1206.description
2M
google
Gemini

Gemini Experimental 1121

gemini-exp-1121.description
40K
google
Gemini

Gemini Experimental 1114

Gemini Exp 1114 is Google's latest experimental multimodal AI model, featuring rapid processing capabilities and supporting text, image, and video inputs, making it suitable for efficient scaling across various tasks.
40K
google

LearnLM 1.5 Pro Experimental

learnlm-1.5-pro-experimental.description
40K
google
Gemini

Gemini 1.5 Flash 002

Gemini 1.5 Flash 002 is an efficient multimodal model that supports a wide range of applications.
1M
google
Gemini

Gemini 1.5 Flash 001

Gemini 1.5 Flash 001 is an efficient multimodal model that supports extensive application scaling.
1M
google
Gemini

Gemini 1.5 Pro 002

Gemini 1.5 Pro 002 is the latest production-ready model, delivering higher quality outputs, with notable enhancements in mathematics, long-context, and visual tasks.
2M
google
Gemini

Gemini 1.5 Pro 001

Gemini 1.5 Pro 001 is a scalable multimodal AI solution that supports a wide range of complex tasks.
2M
google
Gemini

Gemini 1.5 Flash 8B

Gemini 1.5 Flash 8B is an efficient multimodal model that supports a wide range of applications.
1M
google
Gemini

Gemini 1.5 Flash 8B 0924

Gemini 1.5 Flash 8B 0924 is the latest experimental model, showcasing significant performance improvements in both text and multimodal use cases.
1M
google
Gemini

Gemini 1.0 Pro

Gemini 1.0 Pro is Google's high-performance AI model, designed for extensive task scaling.
32K
google
Gemini

Gemini 1.0 Pro 001 (Tuning)

Gemini 1.0 Pro 001 (Tuning) offers stable and tunable performance, making it an ideal choice for complex task solutions.
32K
google
Gemini

Gemini 1.0 Pro 002 (Tuning)

Gemini 1.0 Pro 002 (Tuning) provides excellent multimodal support, focusing on effective solutions for complex tasks.
32K
huggingface
Mistral

Mistral 7B Instruct v0.3

Mistral (7B) Instruct v0.3 offers efficient computational power and natural language understanding, suitable for a wide range of applications.
32K
huggingface
Gemma

Gemma 2 2B Instruct

Google's lightweight instruction-tuning model.
8K
huggingface
Qwen

Qwen 2.5 72B Instruct

A large language model developed by Alibaba Cloud's Tongyi Qianwen team.
32K
huggingface
Qwen

Qwen 2.5 Coder 32B Instruct

Qwen2.5-Coder focuses on code writing.
32K
huggingface
Qwen

QwQ 32B Preview

QwQ-32B-Preview is Qwen's latest experimental research model, focusing on enhancing AI reasoning capabilities. By exploring complex mechanisms such as language mixing and recursive reasoning, its main strengths include powerful analytical reasoning, mathematics, and programming. It still faces challenges such as language-switching issues, reasoning loops, safety considerations, and gaps in other capabilities.
32K
huggingface

Phi 3.5 mini instruct

microsoft/Phi-3.5-mini-instruct.description
32K
huggingface
Meta

Hermes 3 Llama 3.1 8B

NousResearch/Hermes-3-Llama-3.1-8B.description
16K
huggingface
Qwen

DeepSeek R1

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B.description
16K
openrouter
OpenRouter

Auto (best for prompt)

Based on context length, topic, and complexity, your request will be sent to Llama 3 70B Instruct, Claude 3.5 Sonnet (self-moderated), or GPT-4o.
128K
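The routing above is selected by model id rather than by any client-side logic. A minimal sketch, assuming the router is addressed as "openrouter/auto" in an OpenAI-compatible request (only the request body is built here; nothing is sent):

```python
# Target OpenRouter's Auto router: the service, not the caller,
# picks the underlying model based on the prompt.
request = {
    "model": "openrouter/auto",  # assumed router id
    "messages": [
        {"role": "user", "content": "Explain vector clocks briefly."}
    ],
}
```

The response then reports which underlying model actually served the request.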
openrouter
OpenAI

OpenAI o1

o1 is OpenAI's reasoning model, suitable for complex tasks that require extensive general knowledge.
200K
openrouter
Claude

Claude 3 Haiku

Claude 3 Haiku is Anthropic's fastest and most compact model, designed for near-instantaneous responses with fast, accurate, targeted performance.
200K
openrouter
Claude

Claude 3.5 Sonnet

Claude 3.5 Sonnet offers capabilities beyond Claude 3 Opus and faster speeds than Claude 3 Sonnet, at the same price as Sonnet. It excels particularly at programming, data science, visual processing, and agent tasks.
200K
openrouter
Claude

Claude 3 Opus

Claude 3 Opus is Anthropic's most powerful model for handling highly complex tasks. It excels in performance, intelligence, fluency, and comprehension.
200K
openrouter
Gemini

Gemini 1.5 Flash

Gemini 1.5 Flash offers optimized multimodal processing capabilities, suitable for various complex task scenarios.
1M
openrouter
Gemini

Gemini 1.5 Pro

Gemini 1.5 Pro combines the latest optimization technologies to deliver more efficient multimodal data processing capabilities.
2M
openrouter
Meta

Llama 3.2 11B Vision

LLaMA 3.2 is designed to handle tasks that combine visual and textual data. It excels in tasks such as image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
openrouter
Meta

Llama 3.2 90B Vision

LLaMA 3.2 is designed to handle tasks that combine visual and textual data. It excels in tasks such as image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
openrouter
Qwen

Qwen2 7B (Free)

Qwen2 is Alibaba's next-generation large-scale language model, supporting diverse application needs with excellent performance.
32K
openrouter
Meta

Llama 3.1 8B (Free)

Llama 3.1 is a leading model family launched by Meta with variants of up to 405B parameters, applicable to complex dialogue, multilingual translation, and data analysis.
32K
openrouter
Gemma

Gemma 2 9B (Free)

Gemma 2 is Google's lightweight open-source text model series.
8K
cloudflare
DeepSeek

deepseek-coder-6.7b-instruct-awq

@hf/thebloke/deepseek-coder-6.7b-instruct-awq.description
16K
cloudflare
Gemma

gemma-7b-it

@hf/google/gemma-7b-it.description
2K
cloudflare
Mistral

hermes-2-pro-mistral-7b

@hf/nousresearch/hermes-2-pro-mistral-7b.description
4K
cloudflare
Meta

llama-3-8b-instruct-awq

@cf/meta/llama-3-8b-instruct-awq.description
8K
cloudflare

openchat-3.5-0106

@cf/openchat/openchat-3.5-0106.description
8K
cloudflare
Qwen

qwen1.5-14b-chat-awq

@cf/qwen/qwen1.5-14b-chat-awq.description
32K
cloudflare

starling-lm-7b-beta

@hf/nexusflow/starling-lm-7b-beta.description
4K
cloudflare
Meta

meta-llama-3-8b-instruct

Generation over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.
--
github
DeepSeek

DeepSeek R1

R1 is the reasoning model launched by DeepSeek. Before outputting the final answer, the model first provides a chain of thought to enhance the accuracy of the final response.
128K
github
AI21

AI21 Jamba 1.5 Mini

A 52B parameter (12B active) multilingual model, offering a 256K long context window, function calling, structured output, and grounded generation.
262K
github
AI21

AI21 Jamba 1.5 Large

A 398B parameter (94B active) multilingual model, offering a 256K long context window, function calling, structured output, and grounded generation.
262K
github
Cohere

Cohere Command R

Command R is a scalable generative model targeting RAG and Tool Use to enable production-scale AI for enterprises.
128K
github
Cohere

Cohere Command R+

Command R+ is a state-of-the-art RAG-optimized model designed to tackle enterprise-grade workloads.
128K
github
Mistral

Mistral Small

Mistral Small can be used for any language-based task that requires high efficiency and low latency.
128K
github
Mistral

Codestral

Codestral is a cutting-edge generative model focused on code generation, optimized for fill-in-the-middle and code completion tasks.
262K
github
Meta

Meta Llama 3.1 8B

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
github
Meta

Meta Llama 3.1 70B

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
github
Meta

Meta Llama 3.1 405B

The Llama 3.1 instruction-tuned text-only models are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
128K
github
Meta

Meta Llama 3 8B

A versatile 8-billion parameter model optimized for dialogue and text generation tasks.
8K
github
Meta

Meta Llama 3 70B

A powerful 70-billion parameter model excelling in reasoning, coding, and broad language applications.
8K
github
Azure

Phi 4

Phi-4.description
16K
github
Azure

Phi 3.5 MoE

Phi-3.5-MoE-instruct.description
128K
github
Azure

Phi-3.5-vision 128K

An updated version of the Phi-3-vision model.
128K
github
Azure

Phi-3-mini 4K

The smallest member of the Phi-3 family, optimized for both quality and low latency.
4K
github
Azure

Phi-3-mini 128K

The same Phi-3-mini model, but with a larger context size for RAG or few-shot prompting.
128K
github
Azure

Phi-3-small 8K

A 7B parameter model that provides better quality than Phi-3-mini, focusing on high-quality, reasoning-dense data.
8K
github
Azure

Phi-3-small 128K

The same Phi-3-small model, but with a larger context size for RAG or few-shot prompting.
128K
github
Azure

Phi-3-medium 4K

A 14B parameter model that provides better quality than Phi-3-mini, focusing on high-quality, reasoning-dense data.
4K
github
Azure

Phi-3-medium 128K

The same Phi-3-medium model, but with a larger context size for RAG or few-shot prompting.
128K
novita
Meta

Llama 3.1 8B Instruct

Llama 3.1 8B Instruct is the latest version released by Meta, optimized for high-quality conversational scenarios, outperforming many leading closed-source models.
8K
novita
Meta

Llama 3.1 70B Instruct

Llama 3.1 70B Instruct is designed for high-quality conversations, excelling in human evaluations, particularly in highly interactive scenarios.
128K
novita
Meta

Llama 3.1 405B Instruct

Llama 3.1 405B Instruct is the latest version from Meta, optimized for generating high-quality dialogues, surpassing many leading closed-source models.
32K
novita
Meta

Llama 3 8B Instruct

Llama 3 8B Instruct is optimized for high-quality conversational scenarios, performing better than many closed-source models.
8K
novita
Meta

Llama 3 70B Instruct

Llama 3 70B Instruct is optimized for high-quality conversational scenarios, demonstrating excellent performance in various human evaluations.
8K
novita
Gemma

Gemma 2 9B

Gemma 2 is Google's lightweight open-source text model series.
8K
novita
Mistral

Mistral 7B Instruct

Mistral 7B Instruct is a high-performance industry-standard model optimized for speed and long context support.
32K
novita
Azure

WizardLM 2 7B

WizardLM 2 7B is Microsoft's latest lightweight AI model, achieving performance comparable to leading open-source models nearly ten times its size.
32K
novita
Azure

WizardLM-2 8x22B

WizardLM-2 8x22B is Microsoft's state-of-the-art Wizard model, demonstrating extremely competitive performance.
63K
novita
Mistral

Dolphin Mixtral 8x22B

Dolphin Mixtral 8x22B is a model designed for instruction following, dialogue, and programming.
16K
novita
Meta

Hermes 2 Pro Llama 3 8B

Hermes 2 Pro Llama 3 8B is an upgraded version of Nous Hermes 2, featuring the latest internally developed datasets.
8K
novita
Mistral

Hermes 2 Mixtral 8x7B DPO

Hermes 2 Mixtral 8x7B DPO is a highly flexible multi-model merge designed to deliver an exceptional creative experience.
32K
novita

MythoMax L2 13B

MythoMax L2 13B is a language model that merges multiple top models to combine creativity and intelligence.
4K
novita
OpenChat

OpenChat 7B

OpenChat 7B is an open-source language model family fine-tuned with the C-RLFT (Conditioned Reinforcement Learning Fine-Tuning) strategy.
4K
togetherai
Meta

Llama 3.2 3B Instruct Turbo

LLaMA 3.2 is designed for tasks involving both visual and textual data. It excels in tasks like image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
togetherai
Meta

Llama 3.2 11B Vision Instruct Turbo (Free)

LLaMA 3.2 is designed for tasks involving both visual and textual data. It excels in tasks like image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
togetherai
Meta

Llama 3.2 11B Vision Instruct Turbo

LLaMA 3.2 is designed for tasks involving both visual and textual data. It excels in tasks like image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
togetherai
Meta

Llama 3.2 90B Vision Instruct Turbo

LLaMA 3.2 is designed for tasks involving both visual and textual data. It excels in tasks like image description and visual question answering, bridging the gap between language generation and visual reasoning.
128K
togetherai
Meta

Llama 3.1 8B Instruct Turbo

The Llama 3.1 8B model uses FP8 quantization and supports up to 131,072 context tokens, making it a standout among open-source models that excels at complex tasks and outperforms many industry benchmarks.
128K
togetherai
Meta

Llama 3.1 70B Instruct Turbo

The Llama 3.1 70B model is fine-tuned for high-load applications and quantized to FP8 for greater computational efficiency and accuracy, ensuring outstanding performance in complex scenarios.
128K
togetherai
Meta

Llama 3.1 405B Instruct Turbo

The 405B Llama 3.1 Turbo model provides massive context support for big data processing, excelling in large-scale AI applications.
130K
togetherai
Meta

Llama 3.1 Nemotron 70B

Llama 3.1 Nemotron 70B is a large language model customized by NVIDIA, designed to enhance the helpfulness of LLM-generated responses to user queries. The model has excelled in benchmark tests such as Arena Hard, AlpacaEval 2 LC, and GPT-4-Turbo MT-Bench, ranking first in all three automatic alignment benchmarks as of October 1, 2024. The model is trained using RLHF (specifically REINFORCE), Llama-3.1-Nemotron-70B-Reward, and HelpSteer2-Preference prompts based on the Llama-3.1-70B-Instruct model.
32K
togetherai
Meta

Llama 3 8B Instruct Turbo

Llama 3 8B Instruct Turbo is a high-performance large language model, supporting a wide range of application scenarios.
8K
togetherai
Meta

Llama 3 70B Instruct Turbo

Llama 3 70B Instruct Turbo offers exceptional language understanding and generation capabilities, suitable for the most demanding computational tasks.
8K
togetherai
Meta

Llama 3 8B Instruct Lite

Llama 3 8B Instruct Lite is designed for resource-constrained environments, providing excellent balanced performance.
8K
togetherai
Meta

Llama 3 70B Instruct Lite

Llama 3 70B Instruct Lite is suitable for environments requiring high performance and low latency.
8K
togetherai
Meta

Llama 3 8B Instruct Reference

LLaMA-3 Chat (8B) provides multilingual support, covering a rich array of domain knowledge.
8K
togetherai
Meta

Llama 3 70B Instruct Reference

LLaMA-3 Chat (70B) is a powerful chat model that supports complex conversational needs.
8K
togetherai
Meta

LLaMA-2 Chat (13B)

LLaMA-2 Chat (13B) offers excellent language processing capabilities and outstanding interactive experiences.
4K
togetherai
Meta

LLaMA-2 (70B)

LLaMA-2 provides excellent language processing capabilities and outstanding interactive experiences.
4K
togetherai
Meta

CodeLlama 34B Instruct

Code Llama is an LLM focused on code generation and discussion, with extensive support for various programming languages, suitable for developer environments.
16K
togetherai
Gemma

Gemma 2 27B

Gemma 2 continues the design philosophy of being lightweight and efficient.
8K
togetherai
Gemma

Gemma Instruct (2B)

Gemma Instruct (2B) provides basic instruction processing capabilities, suitable for lightweight applications.
8K
togetherai
Mistral

Mistral (7B) Instruct v0.2

Mistral (7B) Instruct v0.2 is an improved instruction-tuned version of Mistral 7B, providing optimized responses across a range of tasks.
32K
togetherai
Mistral

Mistral (7B) Instruct

Mistral (7B) Instruct is known for its high performance, suitable for various language tasks.
8K
togetherai
Mistral

Mistral (7B)

Mistral 7B is a compact yet high-performance model, adept at handling batch processing and simple tasks like classification and text generation, featuring good reasoning capabilities.
8K
togetherai
Mistral

Mixtral-8x7B Instruct (46.7B)

Mixtral 8x7B is a pre-trained sparse mixture of experts model for general text tasks.
32K
togetherai
Mistral

Mixtral-8x7B (46.7B)

Mixtral 8x7B is a sparse expert model that utilizes multiple parameters to enhance reasoning speed, suitable for multilingual and code generation tasks.
32K
togetherai
Mistral

Mixtral-8x22B Instruct (141B)

Mixtral-8x22B Instruct (141B) is a super large language model that supports extremely high processing demands.
64K
togetherai
Azure

WizardLM-2 8x22B

WizardLM 2 is a language model provided by Microsoft AI, excelling in complex dialogues, multilingual capabilities, reasoning, and intelligent assistant tasks.
64K
togetherai
DeepSeek

DeepSeek LLM Chat (67B)

DeepSeek 67B is an advanced model trained for highly complex conversations.
4K
togetherai
Qwen

Qwen 2.5 7B Instruct Turbo

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
32K
togetherai
Qwen

Qwen 2.5 72B Instruct Turbo

Qwen2.5 is a new large language model series designed to optimize instruction-based task processing.
32K
togetherai
Qwen

Qwen 2 Instruct (72B)

Qwen2 is an advanced general-purpose language model that supports various types of instructions.
32K
togetherai
DBRX

DBRX Instruct

DBRX Instruct provides highly reliable instruction processing capabilities, supporting applications across multiple industries.
32K
togetherai
Upstage

Upstage SOLAR Instruct v1 (11B)

Upstage SOLAR Instruct v1 (11B) is suitable for refined instruction tasks, offering excellent language processing capabilities.
4K
togetherai

MythoMax-L2 (13B)

MythoMax-L2 (13B) is an innovative model suitable for multi-domain applications and complex tasks.
4K
togetherai

StripedHyena Nous (7B)

StripedHyena Nous (7B) provides enhanced computational capabilities through efficient strategies and model architecture.
32K
fireworksai
Meta

Llama 3.3 70B Instruct

Llama 3.3 70B Instruct is Meta's multilingual large language model, instruction-tuned and optimized for multilingual dialogue use cases, outperforming many available open-source and closed chat models on common industry benchmarks.
128K
fireworksai
Yi

Yi-Large

Yi-Large model, featuring exceptional multilingual processing capabilities, suitable for various language generation and understanding tasks.
32K
groq
Meta

Llama 3.3 70B

Meta Llama 3.3 is a multilingual large language model (LLM) with 70 billion parameters (text input/text output), featuring pre-training and instruction-tuning. The instruction-tuned pure text model of Llama 3.3 is optimized for multilingual conversational use cases and outperforms many available open-source and closed chat models on common industry benchmarks.
128K
groq
Meta

Llama 3.2 11B Vision (Preview)

Llama 3.2 is designed to handle tasks that combine visual and textual data. It excels in tasks such as image description and visual question answering, bridging the gap between language generation and visual reasoning.
8K
groq
Meta

Llama 3.2 90B Vision (Preview)

Llama 3.2 is designed to handle tasks that combine visual and textual data. It excels in tasks such as image description and visual question answering, bridging the gap between language generation and visual reasoning.
8K
groq
Meta

Llama 3.1 8B

Llama 3.1 8B is a high-performance model that offers rapid text generation capabilities, making it ideal for applications requiring large-scale efficiency and cost-effectiveness.
128K
groq
Meta

Llama 3.1 70B

Llama 3.1 70B provides enhanced AI reasoning capabilities, suitable for complex applications, supporting extensive computational processing while ensuring efficiency and accuracy.
128K
groq
Meta

Llama 3 Groq 8B Tool Use (Preview)

Llama 3 Groq 8B Tool Use is a model optimized for efficient tool usage, supporting fast parallel computation.
8K
groq
Meta

Llama 3 Groq 70B Tool Use (Preview)

Llama 3 Groq 70B Tool Use offers powerful tool invocation capabilities, supporting efficient processing of complex tasks.
8K
groq
Meta

Meta Llama 3 8B

Meta Llama 3 8B delivers high-quality reasoning performance, suitable for diverse application needs.
8K
groq
Meta

Meta Llama 3 70B

Meta Llama 3 70B provides unparalleled complexity handling capabilities, tailored for high-demand projects.
8K
groq
Gemma

Gemma 2 9B

Gemma 2 9B is a model optimized for specific tasks and tool integration.
8K
groq
Gemma

Gemma 7B

Gemma 7B is suitable for medium to small-scale task processing, offering cost-effectiveness.
8K
groq
Mistral

Mixtral 8x7B

Mixtral 8x7B provides high fault-tolerant parallel computing capabilities, suitable for complex tasks.
32K
groq
LLaVA

LLaVA 1.5 7B

LLaVA 1.5 7B offers integrated visual processing capabilities, generating complex outputs from visual information inputs.
4K
perplexity
Perplexity

Sonar Reasoning Pro

Sonar Reasoning Pro is Perplexity's advanced reasoning model with built-in real-time web search, designed for complex, multi-step questions that benefit from both reasoning and up-to-date information.
124K
perplexity
Perplexity

Sonar Reasoning

Sonar Reasoning is a fast reasoning model with built-in real-time web search, producing answers grounded in current information.
124K
perplexity
Perplexity

Sonar Pro

Sonar Pro is Perplexity's advanced search model, supporting complex queries and follow-ups with real-time web grounding and citations.
200K
perplexity
Perplexity

Sonar

Sonar is Perplexity's lightweight search model, providing fast, cost-effective answers grounded in real-time web search.
124K
mistral
Mistral

Pixtral Large

Pixtral Large is an open-source multimodal model with 124 billion parameters, built on Mistral Large 2. This is the second model in our multimodal family, showcasing cutting-edge image understanding capabilities.
128K
mistral
Mistral

Ministral 3B

Ministral 3B is Mistral's top-tier edge model.
128K
mistral
Mistral

Ministral 8B

Ministral 8B is Mistral's cost-effective edge model.
128K
mistral
Mistral

Mistral 7B

Mistral 7B is a compact yet high-performance model, excelling in batch processing and simple tasks such as classification and text generation, with good reasoning capabilities.
32K
mistral
Mistral

Mixtral 8x7B

Mixtral 8x7B is a sparse expert model that leverages multiple parameters to enhance reasoning speed, suitable for handling multilingual and code generation tasks.
32K
mistral
Mistral

Mixtral 8x22B

Mixtral 8x22B is a larger expert model focused on complex tasks, providing excellent reasoning capabilities and higher throughput.
64K
mistral
Mistral

Codestral Mamba

Codestral Mamba is a language model focused on code generation, providing strong support for advanced coding and reasoning tasks.
256K
ai21
AI21

Jamba 1.5 Mini

Jamba 1.5 Mini is AI21's hybrid SSM-Transformer model, offering a 256K context window along with function calling and structured (JSON) output.
256K
ai21
AI21

Jamba 1.5 Large

Jamba 1.5 Large is AI21's most powerful hybrid SSM-Transformer model, combining a 256K context window with strong long-context reasoning performance.
256K
upstage
Upstage

Solar Mini

Solar Mini is a compact LLM that outperforms GPT-3.5, featuring strong multilingual capabilities, supporting English and Korean, and providing an efficient and compact solution.
32K
upstage
Upstage

Solar Mini (Ja)

Solar Mini (Ja) extends the capabilities of Solar Mini, focusing on Japanese while maintaining efficiency and excellent performance in English and Korean usage.
32K
upstage
Upstage

Solar Pro

Solar Pro is a highly intelligent LLM launched by Upstage, focusing on single-GPU instruction-following capabilities, with an IFEval score above 80. Currently supports English, with a formal version planned for release in November 2024, which will expand language support and context length.
32K
xai
Grok

Grok 3 Thinking

Grok 3 Thinking is a lightweight model that reasons before responding. It is fast and capable, suited to logical tasks that do not require deep domain knowledge, and exposes its raw thinking traces.
128K
xai
Grok

Grok 3

Grok 3 is xAI's flagship model, excelling at enterprise tasks such as data extraction, programming, and text summarization, with deep knowledge of finance, healthcare, law, and science.
128K
xai
Grok

Grok Vision

Grok Vision is xAI's image understanding model, capable of processing a wide variety of visual information, including documents, charts, screenshots, and photos.
32K
qwen
Qwen

Qwen Turbo

Tongyi Qianwen is a large-scale language model that supports input in various languages, including Chinese and English.
1M
qwen
Qwen

Qwen Plus

Tongyi Qianwen Plus is an enhanced version of the large-scale language model, supporting input in various languages, including Chinese and English.
128K
qwen
Qwen

Qwen Max

Tongyi Qianwen Max is a large-scale language model with hundreds of billions of parameters, supporting input in various languages, including Chinese and English. It is the API model behind the current Tongyi Qianwen 2.5 product version.
32K
qwen
Qwen

Qwen Long

Qwen is a large-scale language model that supports long text contexts and dialogue capabilities based on long documents and multiple documents.
1M
qwen
Qwen

Qwen VL Plus

Tongyi Qianwen's enhanced large-scale visual language model, with greatly improved detail and text recognition capabilities, supporting high-resolution images with arbitrary aspect ratios.
32K
qwen
Qwen

Qwen VL Max

Tongyi Qianwen's ultra-large-scale visual language model. Compared to the enhanced version, it further improves visual reasoning and instruction-following abilities, providing a higher level of visual perception and cognition.
32K
qwen
Qwen

Qwen VL OCR

Tongyi Qianwen's OCR-focused visual model, specialized in extracting text from images of documents, tables, exam questions, and handwriting.
34K
qwen
Qwen

Qwen Math Turbo

The Tongyi Qianwen Math model is specifically designed for solving mathematical problems.
4K
qwen
Qwen

Qwen Math Plus

The Tongyi Qianwen Math model is specifically designed for solving mathematical problems.
4K
qwen
Qwen

Qwen Coder Turbo

The Tongyi Qianwen Coder model.
128K
qwen
Qwen

Qwen Coder Plus

The Tongyi Qianwen Coder model.
128K
qwen
Qwen

QwQ 32B Preview

The QwQ model is an experimental research model developed by the Qwen team, focusing on enhancing AI reasoning capabilities.
32K
qwen
Qwen

QVQ 72B Preview

The QVQ model is an experimental research model developed by the Qwen team, focusing on enhancing visual reasoning capabilities.
32K
qwen
Qwen

Qwen2.5 7B

The open-source version of the Qwen2.5 series, designed to optimize the handling of instruction-based tasks.
128K
qwen
Qwen

Qwen2.5 14B

The open-source version of the Qwen2.5 series, designed to optimize the handling of instruction-based tasks.
128K
qwen
Qwen

Qwen2.5 32B

The open-source version of the Qwen2.5 series, designed to optimize the handling of instruction-based tasks.
128K
qwen
Qwen

Qwen2.5 72B

The open-source version of the Qwen2.5 series, designed to optimize the handling of instruction-based tasks.
128K
qwen
Qwen

Qwen2.5 14B 1M

The open-source version of the Qwen2.5 series, supporting an ultra-long context of up to 1M tokens.
1M
qwen
Qwen

Qwen2.5 Math 7B

The open-source version of the Tongyi Qianwen Math model, specifically designed for solving mathematical problems.
4K
qwen
Qwen

Qwen2.5 Math 72B

The open-source version of the Tongyi Qianwen Math model, specifically designed for solving mathematical problems.
4K
qwen
Qwen

Qwen2.5 Coder 7B

The open-source version of the Tongyi Qianwen Coder model, focused on code generation, reasoning, and repair.
128K
qwen
Qwen

Qwen2.5 Coder 32B

The open-source version of the Tongyi Qianwen Coder model, focused on code generation, reasoning, and repair.
128K
qwen
Qwen

Qwen VL

Initialized with the Qwen-7B language model, this pre-trained model adds an image model with an input resolution of 448.
8K
qwen
Qwen

Qwen VL Chat

Qwen VL supports flexible interaction methods, including multi-image, multi-turn Q&A, and creative capabilities.
8K
qwen
Qwen

Qwen2.5 VL 72B

Qwen2.5-VL is the latest generation of the Qwen visual language model, with comprehensive improvements in instruction following, mathematics, and coding, as well as enhanced abilities in object recognition, document parsing, and long-video understanding.
128K
qwen
DeepSeek

DeepSeek R1

DeepSeek R1 is DeepSeek's reasoning model, driven by large-scale reinforcement learning, performing comparably to OpenAI o1 in mathematics, coding, and reasoning tasks.
64K
qwen
DeepSeek

DeepSeek V3

DeepSeek V3 is a Mixture-of-Experts (MoE) language model developed by DeepSeek, delivering outstanding results across evaluations and surpassing other open-source models in mainstream benchmarks.
64K
wenxin
Wenxin

ERNIE 3.5 8K

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpora. It offers strong general capabilities that meet most requirements for dialogue Q&A, creative generation, and plugin applications, and supports automatic integration with the Baidu Search plugin to keep Q&A information current.
8K
wenxin
Wenxin

ERNIE 3.5 8K Preview

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpora. It offers strong general capabilities that meet most requirements for dialogue Q&A, creative generation, and plugin applications, and supports automatic integration with the Baidu Search plugin to keep Q&A information current.
8K
wenxin
Wenxin

ERNIE 3.5 128K

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpora. It offers strong general capabilities that meet most requirements for dialogue Q&A, creative generation, and plugin applications, and supports automatic integration with the Baidu Search plugin to keep Q&A information current.
128K
wenxin
Wenxin

ERNIE 4.0 8K

Baidu's self-developed flagship ultra-large-scale language model. Compared to ERNIE 3.5, it delivers a comprehensive upgrade in model capabilities and is widely applicable to complex tasks across various fields; it supports automatic integration with the Baidu Search plugin to keep Q&A information current.
8K
wenxin
Wenxin

ERNIE 4.0 8K Preview

Baidu's self-developed flagship ultra-large-scale language model. Compared to ERNIE 3.5, it delivers a comprehensive upgrade in model capabilities and is widely applicable to complex tasks across various fields; it supports automatic integration with the Baidu Search plugin to keep Q&A information current.
8K
wenxin
Wenxin

ERNIE 4.0 Turbo 8K

Baidu's self-developed flagship ultra-large-scale language model with excellent overall performance, widely applicable to complex tasks across various fields; it supports automatic integration with the Baidu Search plugin to keep Q&A information current, and outperforms ERNIE 4.0.
8K
wenxin
Wenxin

ERNIE 4.0 Turbo 128K

Baidu's self-developed flagship ultra-large-scale language model with excellent overall performance, widely applicable to complex tasks across various fields; it supports automatic integration with the Baidu Search plugin to keep Q&A information current, and outperforms ERNIE 4.0.
128K
wenxin
Wenxin

ERNIE 4.0 Turbo 8K Preview

Baidu's self-developed flagship ultra-large-scale language model with excellent overall performance, widely applicable to complex tasks across various fields; it supports automatic integration with the Baidu Search plugin to keep Q&A information current, and outperforms ERNIE 4.0.
8K
wenxin
Wenxin

ERNIE Lite 8K

Baidu's self-developed lightweight large language model, balancing excellent model performance with inference efficiency, suitable for inference on low-power AI acceleration cards.
8K
wenxin
Wenxin

ERNIE Lite Pro 128K

Baidu's self-developed lightweight large language model, offering better performance than ERNIE Lite, suitable for inference on low-power AI acceleration cards.
128K
wenxin
Wenxin

ERNIE Tiny 8K

Baidu's self-developed ultra-high-performance large language model, with the lowest deployment and fine-tuning costs among the Wenxin series models.
8K
wenxin
Wenxin

ERNIE Speed 128K

Baidu's latest self-developed high-performance large language model, with excellent general capabilities. It works well as a base model for fine-tuning to better address specific scenarios, while also delivering excellent inference performance.
128K
wenxin
Wenxin

ERNIE Speed Pro 128K

Baidu's latest self-developed high-performance large language model, offering better performance than ERNIE Speed. It works well as a base model for fine-tuning to better address specific scenarios, while also delivering excellent inference performance.
128K
wenxin
Wenxin

ERNIE Character 8K

Baidu's self-developed vertical-scenario large language model, suited to applications such as game NPCs, customer service dialogue, and role-playing, with a more distinct and consistent character style, stronger instruction-following, and superior inference performance.
8K
wenxin
Wenxin

ERNIE Character Fiction 8K

A version of Baidu's self-developed vertical-scenario large language model tailored to fictional dialogue and role-playing scenarios, with a distinct, consistent character style and strong instruction-following.
8K
wenxin
Wenxin

ERNIE Novel 8K

Baidu's self-developed general-purpose large language model, with clear advantages in novel continuation; it is also applicable to scenarios such as short plays and films.
8K
hunyuan
Hunyuan

Hunyuan Lite

Upgraded to a MOE structure with a context window of 256k, leading many open-source models in various NLP, coding, mathematics, and industry benchmarks.
256K
hunyuan
Hunyuan

Hunyuan Standard

Utilizes a superior routing strategy while alleviating issues of load balancing and expert convergence. For long texts, the needle-in-a-haystack metric reaches 99.9%. MOE-32K offers a relatively higher cost-performance ratio, balancing effectiveness and price while enabling processing of long text inputs.
32K
hunyuan
Hunyuan

Hunyuan Standard 256K

Utilizes a superior routing strategy while alleviating issues of load balancing and expert convergence. For long texts, the needle-in-a-haystack metric reaches 99.9%. MOE-256K further breaks through in length and effectiveness, greatly expanding the input length capacity.
256K
hunyuan
Hunyuan

Hunyuan Turbo

The preview version of the next-generation Hunyuan large language model, featuring a brand-new mixed expert model (MoE) structure, which offers faster inference efficiency and stronger performance compared to Hunyuan Pro.
32K
hunyuan
Hunyuan

Hunyuan Pro

A trillion-parameter scale MOE-32K long text model. Achieves absolute leading levels across various benchmarks, capable of handling complex instructions and reasoning, with advanced mathematical abilities, supporting function calls, and optimized for applications in multilingual translation, finance, law, and healthcare.
32K
hunyuan
Hunyuan

Hunyuan Vision

The latest multimodal model from Hunyuan, supporting image + text input to generate textual content.
8K
hunyuan
Hunyuan

Hunyuan Code

The latest code generation model from Hunyuan, trained on a base model with 200B high-quality code data, iteratively trained for six months with high-quality SFT data, increasing the context window length to 8K. It ranks among the top in automatic evaluation metrics for code generation across five major programming languages, and performs in the first tier for comprehensive human quality assessments across ten aspects of coding tasks.
8K
hunyuan
Hunyuan

Hunyuan FunctionCall

The latest MOE architecture FunctionCall model from Hunyuan, trained on high-quality FunctionCall data, with a context window of 32K, leading in multiple dimensions of evaluation metrics.
32K
hunyuan
Hunyuan

Hunyuan Role

The latest role-playing model from Hunyuan, fine-tuned and trained by Hunyuan's official team, based on the Hunyuan model combined with role-playing scenario datasets for enhanced foundational performance in role-playing contexts.
8K
zhipu
ChatGLM

GLM-Zero-Preview

GLM-Zero-Preview possesses strong complex reasoning capabilities, excelling in logical reasoning, mathematics, programming, and related fields.
16K
zhipu
ChatGLM

GLM-4-Flash

GLM-4-Flash is the ideal choice for handling simple tasks, being the fastest and most cost-effective.
128K
zhipu
ChatGLM

GLM-4-FlashX

GLM-4-FlashX is an enhanced version of Flash, featuring ultra-fast inference speed.
128K
zhipu
ChatGLM

GLM-4-Long

GLM-4-Long supports ultra-long text inputs, suitable for memory-based tasks and large-scale document processing.
1M
zhipu
ChatGLM

GLM-4-Air

GLM-4-Air is a cost-effective version with performance close to GLM-4, offering fast speed at an affordable price.
128K
zhipu
ChatGLM

GLM-4-AirX

GLM-4-AirX provides an efficient version of GLM-4-Air, with inference speeds up to 2.6 times faster.
8K
zhipu
ChatGLM

GLM-4-AllTools

GLM-4-AllTools is a multifunctional intelligent agent model optimized to support complex instruction planning and tool invocation, such as web browsing, code interpretation, and text generation, suitable for multitasking.
128K
zhipu
ChatGLM

GLM-4-Plus

GLM-4-Plus, as a high-intelligence flagship, possesses strong capabilities for processing long texts and complex tasks, with overall performance improvements.
128K
zhipu
ChatGLM

GLM-4-0520

GLM-4-0520 is the latest model version designed for highly complex and diverse tasks, demonstrating outstanding performance.
128K
zhipu
ChatGLM

GLM-4

GLM-4 is the old flagship version released in January 2024, currently replaced by the more powerful GLM-4-0520.
128K
zhipu
ChatGLM

GLM-4V-Flash

GLM-4V-Flash focuses on efficient single image understanding, suitable for scenarios that require rapid image parsing, such as real-time image analysis or batch image processing.
8K
zhipu
ChatGLM

GLM-4V-Plus

GLM-4V-Plus has the ability to understand video content and multiple images, suitable for multimodal tasks.
8K
zhipu
ChatGLM

GLM-4V

GLM-4V provides strong image understanding and reasoning capabilities, supporting various visual tasks.
2K
zhipu
CodeGeeX

CodeGeeX-4

CodeGeeX-4 is a powerful AI programming assistant that supports intelligent Q&A and code completion in various programming languages, enhancing development efficiency.
128K
zhipu
ChatGLM

CharGLM-3

CharGLM-3 is designed for role-playing and emotional companionship, supporting ultra-long multi-turn memory and personalized dialogue, with wide applications.
4K
zhipu

Emohaa

Emohaa is a psychological model with professional counseling capabilities, helping users understand emotional issues.
8K
siliconcloud
DeepSeek

DeepSeek V3

DeepSeek V3 is a Mixture-of-Experts (MoE) language model developed by DeepSeek, delivering outstanding results across evaluations and surpassing other open-source models in mainstream benchmarks.
64K
siliconcloud
Meta

DeepSeek R1 Distill Llama 70B

A distilled model based on Llama-3.3-70B-Instruct, fine-tuned on reasoning samples generated by DeepSeek R1 to transfer its reasoning capabilities.
32K
siliconcloud
Qwen

DeepSeek R1 Distill Qwen 14B

A distilled model based on Qwen2.5-14B, fine-tuned on reasoning samples generated by DeepSeek R1 to transfer its reasoning capabilities.
32K
siliconcloud
Meta

DeepSeek R1 Distill Llama 8B (Free)

A distilled model based on Llama-3.1-8B, fine-tuned on reasoning samples generated by DeepSeek R1 to transfer its reasoning capabilities.
32K
siliconcloud
Qwen

DeepSeek R1 Distill Qwen 7B (Free)

A distilled model based on Qwen2.5-Math-7B, fine-tuned on reasoning samples generated by DeepSeek R1 to transfer its reasoning capabilities.
32K
siliconcloud
Qwen

DeepSeek-R1-Distill-Qwen-1.5B (Free)

A distilled model based on Qwen2.5-Math-1.5B, fine-tuned on reasoning samples generated by DeepSeek R1 to transfer its reasoning capabilities.
32K
siliconcloud
DeepSeek

DeepSeek V2.5

DeepSeek V2.5 combines the excellent features of previous versions, enhancing general and coding capabilities.
32K
siliconcloud
DeepSeek

DeepSeek VL2

DeepSeek-VL2 is a mixture-of-experts visual language model built on DeepSeekMoE-27B. Its sparsely activated MoE architecture achieves outstanding performance while activating only 4.5B parameters.
4K
siliconcloud
Qwen

QVQ 72B Preview

QVQ is an experimental research model developed by the Qwen team, focusing on enhancing visual reasoning capabilities.
32K
siliconcloud
Qwen

Qwen2.5 7B Instruct (Free)

Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks.
32K
siliconcloud
Qwen

Qwen2.5 7B Instruct (LoRA)

Qwen2.5-7B-Instruct is one of the latest large language models released by Alibaba Cloud. This 7B model shows significant improvements in coding and mathematics. It also provides multilingual support, covering over 29 languages, including Chinese and English. The model has made notable advancements in instruction following, understanding structured data, and generating structured outputs, especially JSON.
32K
siliconcloud
Qwen

Qwen2.5 14B Instruct

Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks.
32K
siliconcloud
Qwen

Qwen2.5 32B Instruct

Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks.
32K
siliconcloud
Qwen

Qwen2.5 72B Instruct 128K

Qwen2.5 is a new large language model series with enhanced understanding and generation capabilities.
128K
siliconcloud
Qwen

Qwen2.5 Coder 7B Instruct (Free)

Qwen2.5-Coder-7B-Instruct is the latest version in Alibaba Cloud's series of code-specific large language models. This model significantly enhances code generation, reasoning, and repair capabilities based on Qwen2.5, trained on 5.5 trillion tokens. It not only improves coding abilities but also maintains advantages in mathematics and general capabilities, providing a more comprehensive foundation for practical applications such as code agents.
32K
siliconcloud
Qwen

Qwen2 1.5B Instruct (Free)

Qwen2-1.5B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 1.5B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and group query attention. It excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models. Compared to Qwen1.5-1.8B-Chat, Qwen2-1.5B-Instruct shows significant performance improvements in tests such as MMLU, HumanEval, GSM8K, C-Eval, and IFEval, despite having slightly fewer parameters.
32K
siliconcloud
Qwen

Qwen2 7B Instruct (Free)

Qwen2-7B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 7B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and group query attention. It excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models of comparable size.
32K
siliconcloud
Qwen

Qwen2 VL 72B Instruct

Qwen2-VL is the latest iteration of the Qwen-VL model, achieving state-of-the-art performance in visual understanding benchmarks.
32K
siliconcloud
InternLM

InternLM2.5 7B Chat (Free)

InternLM2.5 offers intelligent dialogue solutions across multiple scenarios.
32K
siliconcloud
InternLM

InternLM2.5 20B Chat

The innovative open-source model InternLM2.5 enhances dialogue intelligence through a large number of parameters.
32K
siliconcloud
InternLM

InternVL2 8B (Pro)

InternVL2 demonstrates exceptional performance across various visual language tasks, including document and chart understanding, scene text understanding, OCR, and solving scientific and mathematical problems.
32K
siliconcloud
InternLM

InternVL2 26B

InternVL2 demonstrates exceptional performance across various visual language tasks, including document and chart understanding, scene text understanding, OCR, and solving scientific and mathematical problems.
32K
siliconcloud
ChatGLM

GLM-4 9B Chat (Free)

GLM-4 9B is an open-source version that provides an optimized conversational experience for chat applications.
128K
siliconcloud
ChatGLM

GLM-4 9B Chat (Pro)

GLM-4-9B-Chat is the open-source version of the GLM-4 series pre-trained models launched by Zhipu AI. This model excels in semantics, mathematics, reasoning, code, and knowledge. In addition to supporting multi-turn dialogues, GLM-4-9B-Chat also features advanced capabilities such as web browsing, code execution, custom tool invocation (Function Call), and long-text reasoning. The model supports 26 languages, including Chinese, English, Japanese, Korean, and German. In multiple benchmark tests, GLM-4-9B-Chat has demonstrated excellent performance, such as in AlignBench-v2, MT-Bench, MMLU, and C-Eval. The model supports a maximum context length of 128K, making it suitable for academic research and commercial applications.
128K
siliconcloud
ChatGLM

ChatGLM3 6B (Free)

ChatGLM3-6B is an open-source model in the ChatGLM series developed by Zhipu AI. It retains the smooth dialogue and low deployment threshold of its predecessors while introducing a more diverse training dataset, more training steps, and a more reasonable training strategy.
32K
siliconcloud
Yi

Yi-1.5 6B Chat (Free)

Yi-1.5-6B-Chat is a variant of the Yi-1.5 series, belonging to the open-source chat model. Yi-1.5 is an upgraded version of Yi, continuously pre-trained on 500B high-quality corpora and fine-tuned on over 3M diverse samples. Compared to Yi, Yi-1.5 demonstrates stronger capabilities in coding, mathematics, reasoning, and instruction following, while maintaining excellent language understanding, common sense reasoning, and reading comprehension abilities. The model is available in context length versions of 4K, 16K, and 32K, with a total pre-training volume reaching 3.6T tokens.
4K
siliconcloud
Yi

Yi-1.5 9B Chat 16K (Free)

Yi-1.5 9B supports 16K tokens, providing efficient and smooth language generation capabilities.
16K
siliconcloud
Yi

Yi-1.5 34B Chat 16K

Yi-1.5 34B delivers superior performance in industry applications with a wealth of training samples.
16K
siliconcloud
Meta

Llama 3.1 8B Instruct (Free)

LLaMA 3.1 provides multilingual support and is one of the industry's leading generative models.
32K
siliconcloud
Meta

Llama 3.1 70B Instruct

LLaMA 3.1 70B offers efficient conversational support in multiple languages.
32K
siliconcloud
Meta

Llama 3.1 405B Instruct

LLaMA 3.1 405B is a powerful model for pre-training and instruction tuning.
32K
siliconcloud
Meta

Llama 3.3 70B Instruct

Llama 3.3 is Meta's 70B-parameter multilingual large language model, instruction-tuned and optimized for multilingual dialogue use cases.
32K
siliconcloud

TeleChat2

TeleChat2 is a large language model developed from scratch by China Telecom, supporting capabilities such as encyclopedia Q&A, code generation, and long-text generation.
8K
siliconcloud

TeleMM

TeleMM is a multimodal large model developed by China Telecom, capable of handling inputs such as text and images, and supporting image understanding and chart analysis.
32K
zeroone
Yi

Yi Lightning

The latest high-performance model, ensuring high-quality output while significantly improving reasoning speed.
16K
zeroone
Yi

Yi Spark

Small yet powerful, lightweight and fast model. Provides enhanced mathematical computation and coding capabilities.
16K
zeroone
Yi

Yi Medium

Medium-sized model upgraded and fine-tuned, balanced capabilities, and high cost-performance ratio. Deeply optimized instruction-following capabilities.
16K
zeroone
Yi

Yi Medium 200K

200K ultra-long context window, providing deep understanding and generation capabilities for long texts.
200K
zeroone
Yi

Yi Large Turbo

Exceptional performance at a high cost-performance ratio. Conducts high-precision tuning based on performance, inference speed, and cost.
16K
zeroone
Yi

Yi Large RAG

High-level service based on the yi-large super strong model, combining retrieval and generation techniques to provide precise answers and real-time information retrieval services.
16K
zeroone
Yi

Yi Large FC

Based on the yi-large model, supports and enhances tool invocation capabilities, suitable for various business scenarios requiring agent or workflow construction.
32K
zeroone
Yi

Yi Large

A new hundred-billion-parameter model, providing super strong question-answering and text generation capabilities.
32K
zeroone
Yi

Yi Vision

Model for complex visual tasks, providing high-performance image understanding and analysis capabilities.
16K
zeroone
Yi

Yi Large Preview

The initial preview version; the newer yi-large is recommended instead.
16K
zeroone
Yi

Yi Lightning Lite

A lightweight version; yi-lightning is recommended instead.
16K
spark
Spark

Spark Lite

Spark Lite is a lightweight large language model with extremely low latency and efficient processing capabilities, completely free and open, supporting real-time online search functionality. Its quick response feature makes it excel in inference applications and model fine-tuning on low-power devices, providing users with excellent cost-effectiveness and intelligent experiences, particularly in knowledge Q&A, content generation, and search scenarios.
8K
spark
Spark

Spark Pro

Spark Pro is a high-performance large language model optimized for professional fields, focusing on mathematics, programming, healthcare, education, and more, supporting online search and built-in plugins for weather, dates, etc. Its optimized model demonstrates excellent performance and efficiency in complex knowledge Q&A, language understanding, and high-level text creation, making it an ideal choice for professional application scenarios.
8K
spark
Spark

Spark Pro 128K

Spark Pro 128K is equipped with an extra-large context processing capability, able to handle up to 128K of contextual information, making it particularly suitable for long-form content that requires comprehensive analysis and long-term logical connections, providing smooth and consistent logic and diverse citation support in complex text communication.
128K
spark
Spark

Spark Max

Spark Max is the most comprehensive version, supporting online search and numerous built-in plugins. Its fully optimized core capabilities, system role settings, and function calling make it highly effective in a variety of complex application scenarios.
8K
spark
Spark

Spark Max 32K

Spark Max 32K is configured with large context processing capabilities, enhanced contextual understanding, and logical reasoning abilities, supporting text input of 32K tokens, suitable for long document reading, private knowledge Q&A, and other scenarios.
32K
spark
Spark

Spark 4.0 Ultra

Spark4.0 Ultra is the most powerful version in the Spark large model series, enhancing text content understanding and summarization capabilities while upgrading online search links. It is a comprehensive solution for improving office productivity and accurately responding to demands, leading the industry as an intelligent product.
8K
sensenova
SenseNova

SenseChat 5.5

The latest version model (V5.5) with a context length of 128K shows significant improvements in mathematical reasoning, English conversation, instruction following, and long text comprehension, comparable to GPT-4o.
128K
sensenova
SenseNova

SenseChat 5.0 Turbo

Suitable for fast question answering and model fine-tuning scenarios.
32K
sensenova
SenseNova

SenseChat 5.0 Cantonese

With a context length of 32K, it surpasses GPT-4 in Cantonese conversation comprehension and is competitive with GPT-4 Turbo in knowledge, reasoning, mathematics, and code writing across multiple domains.
32K
sensenova
SenseNova

SenseChat 4.0 128K

Basic version model (V4) with a context length of 128K, excelling in long text comprehension and generation tasks.
128K
sensenova
SenseNova

SenseChat 4.0 32K

Basic version model (V4) with a context length of 32K, flexibly applicable to various scenarios.
32K
sensenova
SenseNova

SenseChat 4.0 4K

Basic version model (V4) with a context length of 4K, featuring strong general capabilities.
4K
sensenova
SenseNova

SenseChat Character

Standard version model with an 8K context length and high response speed.
8K
sensenova
SenseNova

SenseChat Character Pro

Advanced version model with a context length of 32K, offering comprehensive capability enhancements and supporting both Chinese and English conversations.
32K
stepfun
Stepfun

Step 1 Flash

High-speed model, suitable for real-time dialogues.
8K
stepfun
Stepfun

Step 1 8K

Small model, suitable for lightweight tasks.
8K
stepfun
Stepfun

Step 1 32K

Supports medium-length dialogues, applicable to various application scenarios.
32K
stepfun
Stepfun

Step 1 128K

Balances performance and cost, suitable for general scenarios.
128K
stepfun
Stepfun

Step 1 256K

Equipped with ultra-long context processing capabilities, especially suitable for long document analysis.
256K
stepfun
Stepfun

Step 2 16K

Supports large-scale context interactions, suitable for complex dialogue scenarios.
16K
stepfun
Stepfun

Step 2 Mini

A high-speed model based on Stepfun's new self-developed MFA attention architecture, achieving results similar to Step 1 at very low cost while maintaining higher throughput and faster response latency.
8K
stepfun
Stepfun

Step 2 16K Exp

An experimental version of the Step 2 model, containing the latest features and updated on a rolling basis.
16K
stepfun
Stepfun

Step 1V 8K

A small visual model suitable for basic text and image tasks.
8K
stepfun
Stepfun

Step 1V 32K

Supports visual input, enhancing multimodal interaction experiences.
32K
stepfun
Stepfun

Step 1o Vision 32K

A visual model with powerful image understanding capabilities, offering stronger visual performance than the 1V series models.
32K
stepfun
Stepfun

Step 1.5V Mini

This model has powerful video understanding capabilities.
32K
moonshot
MoonshotAI

Moonshot V1 8K

Moonshot V1 8K is designed for generating short text tasks, featuring efficient processing performance, capable of handling 8,192 tokens, making it ideal for brief dialogues, note-taking, and rapid content generation.
8K
moonshot
MoonshotAI

Moonshot V1 32K

Moonshot V1 32K offers medium-length context processing capabilities, able to handle 32,768 tokens, particularly suitable for generating various long documents and complex dialogues, applicable in content creation, report generation, and dialogue systems.
32K
moonshot
MoonshotAI

Moonshot V1 128K

Moonshot V1 128K is a model with ultra-long context processing capabilities, suitable for generating extremely long texts, meeting the demands of complex generation tasks, capable of handling up to 128,000 tokens, making it ideal for research, academia, and large document generation.
128K
baichuan
Baichuan

Baichuan 4

Ranked first in capability among domestic models, it surpasses mainstream foreign models in Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also boasts industry-leading multimodal capabilities, excelling in multiple authoritative evaluation benchmarks.
32K
baichuan
Baichuan

Baichuan 4 Turbo

The leading model in the country, surpassing mainstream foreign models in Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also possesses industry-leading multimodal capabilities, excelling in multiple authoritative evaluation benchmarks.
32K
baichuan
Baichuan

Baichuan 4 Air

The leading model in the country, surpassing mainstream foreign models in Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also possesses industry-leading multimodal capabilities, excelling in multiple authoritative evaluation benchmarks.
32K
baichuan
Baichuan

Baichuan 3 Turbo

Optimized for high-frequency enterprise scenarios, significantly improving performance and cost-effectiveness. Compared to the Baichuan2 model, content creation improves by 20%, knowledge Q&A by 17%, and role-playing ability by 40%. Overall performance is superior to GPT-3.5.
32K
baichuan
Baichuan

Baichuan 3 Turbo 128k

Features a 128K ultra-long context window, optimized for high-frequency enterprise scenarios, significantly improving performance and cost-effectiveness. Compared to the Baichuan2 model, content creation improves by 20%, knowledge Q&A by 17%, and role-playing ability by 40%. Overall performance is superior to GPT-3.5.
128K
baichuan
Baichuan

Baichuan 2 Turbo

Utilizes search enhancement technology to achieve comprehensive links between large models and domain knowledge, as well as knowledge from the entire web. Supports uploads of various documents such as PDF and Word, and URL input, providing timely and comprehensive information retrieval with accurate and professional output.
32K
minimax
Minimax

abab6.5s

Suitable for a wide range of natural language processing tasks, including text generation and dialogue systems.
245K
minimax
Minimax

abab6.5g

Designed for multilingual persona dialogue, supporting high-quality dialogue generation in English and other languages.
8K
minimax
Minimax

abab6.5t

Optimized for Chinese persona dialogue scenarios, providing smooth dialogue generation that aligns with Chinese expression habits.
8K
minimax
Minimax

abab5.5

Targeted at productivity scenarios, supporting complex task processing and efficient text generation, suitable for professional applications.
16K
minimax
Minimax

abab5.5s

Designed for Chinese persona dialogue scenarios, providing high-quality Chinese dialogue generation capabilities, suitable for various application contexts.
8K
internlm
InternLM

InternLM2.5

Our latest model series, featuring exceptional reasoning performance, supporting a context length of 1M, and enhanced instruction following and tool invocation capabilities.
32K
internlm
InternLM

InternLM2 Pro Chat

An older version of the model that we still maintain, available in 7B and 20B parameter sizes.
32K
higress
Qwen

Qwen Turbo

Qwen Turbo is a large-scale language model supporting input in various languages including Chinese and English.
128K
higress
Qwen

Qwen Plus

Qwen Plus is an enhanced large-scale language model supporting input in various languages including Chinese and English.
128K
higress
Qwen

Qwen Max

Qwen Max is a trillion-parameter-scale large language model that supports input in various languages including Chinese and English, and is the API model behind the current Qwen 2.5 product version.
32K
higress
Qwen

Qwen VL Plus

An enhanced version of Tongyi Qianwen's large-scale visual language model, with significantly improved detail recognition and text recognition, supporting ultra-high-resolution images of any aspect ratio.
32K
higress
Qwen

Qwen2.5 Math 1.5B

--
4K
higress
Qwen

Qwen2.5 Coder 1.5B

--
128K
higress
AI360

360GPT2 Pro

360GPT2 Pro is an advanced natural language processing model launched by 360, featuring exceptional text generation and understanding capabilities, particularly excelling in generation and creative tasks, capable of handling complex language transformations and role-playing tasks.
8K
higress
AI360

360GPT Pro

360GPT Pro, as an important member of the 360 AI model series, meets diverse natural language application scenarios with efficient text processing capabilities, supporting long text understanding and multi-turn dialogue.
8K
higress
AI360

360GPT Turbo

360GPT Turbo offers powerful computation and dialogue capabilities, with excellent semantic understanding and generation efficiency, making it an ideal intelligent assistant solution for enterprises and developers.
8K
higress
AI360

360GPT Turbo Responsibility 8K

360GPT Turbo Responsibility 8K emphasizes semantic safety and responsibility, designed specifically for applications with high content safety requirements, ensuring accuracy and robustness in user experience.
8K
higress
Wenxin

ERNIE 3.5 8K

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 3.5 8K Preview

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 3.5 128K

Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information.
128K
higress
Wenxin

ERNIE 4.0 8K

Baidu's self-developed flagship ultra-large-scale language model, which has achieved a comprehensive upgrade in model capabilities compared to ERNIE 3.5, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 4.0 8K Preview

Baidu's self-developed flagship ultra-large-scale language model, which has achieved a comprehensive upgrade in model capabilities compared to ERNIE 3.5, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information.
8K
higress
Wenxin

ERNIE 4.0 Turbo 8K

Baidu's self-developed flagship ultra-large-scale language model, demonstrating excellent overall performance, suitable for complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information. It offers better performance compared to ERNIE 4.0.
8K
higress
Wenxin

ERNIE 4.0 Turbo 8K Preview

Baidu's self-developed flagship ultra-large-scale language model, demonstrating excellent overall performance, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information. It outperforms ERNIE 4.0.
8K
higress
Wenxin

ERNIE Lite Pro 128K

Baidu's self-developed lightweight large language model, balancing excellent model performance with inference efficiency, offering better results than ERNIE Lite, suitable for inference on low-power AI acceleration cards.
128K
higress
Wenxin

ERNIE Speed Pro 128K

Baidu's latest self-developed high-performance large language model released in 2024, with outstanding general capabilities, providing better results than ERNIE Speed, suitable as a base model for fine-tuning, effectively addressing specific scenario issues while also exhibiting excellent inference performance.
128K
higress
Wenxin

ERNIE Speed 128K

Baidu's latest self-developed high-performance large language model released in 2024, with outstanding general capabilities, suitable as a base model for fine-tuning, effectively addressing specific scenario issues while also exhibiting excellent inference performance.
128K
higress
Wenxin

ERNIE Character 8K

Baidu's self-developed vertical scene large language model, suitable for applications such as game NPCs, customer service dialogues, and role-playing conversations, featuring more distinct and consistent character styles, stronger adherence to instructions, and superior inference performance.
8K
higress
Hunyuan

Hunyuan Large

--
higress
OpenAI

GPT 3.5 Turbo

GPT 3.5 Turbo is an efficient model provided by OpenAI, suitable for chat and text generation tasks, supporting parallel function calls.
16K
higress
OpenAI

GPT 3.5 Turbo 16K

GPT 3.5 Turbo 16k is a high-capacity text generation model suitable for complex tasks.
16K
higress
OpenAI

GPT 4 Turbo with Vision Preview

The latest GPT-4 Turbo model features visual capabilities. Now, visual requests can be made using JSON format and function calls. GPT-4 Turbo is an enhanced version that provides cost-effective support for multimodal tasks. It strikes a balance between accuracy and efficiency, making it suitable for applications requiring real-time interaction.
128K
higress
Claude

Claude 3.5 Sonnet

Claude 3.5 Sonnet offers capabilities that surpass Opus and faster speeds than Sonnet, while maintaining the same pricing as Sonnet. Sonnet excels particularly in programming, data science, visual processing, and agent tasks.
200K
higress
Gemini

Gemini 1.5 Flash 0827

Gemini 1.5 Flash 0827 provides optimized multimodal processing capabilities, suitable for various complex task scenarios.
1M
higress
Gemini

Gemini 1.5 Pro 0827

Gemini 1.5 Pro 0827 combines the latest optimization technologies for more efficient multimodal data processing.
2M
higress
Gemini

Gemini 1.5 Pro 0801

Gemini 1.5 Pro 0801 offers excellent multimodal processing capabilities, providing greater flexibility for application development.
2M
higress
Cohere

command-light

--
higress
Doubao

Doubao-lite-4k

--
higress
Doubao

Doubao-lite-32k

--
higress
Doubao

Doubao-lite-128k

--
higress
Doubao

Doubao-pro-4k

--
higress
Doubao

Doubao-pro-32k

--
higress
Doubao

Doubao-pro-128k

--
higress
ByteDance

Skylark2-pro-character-4k

--
higress
ByteDance

Skylark2-pro-32k

--
higress
ByteDance

Skylark2-pro-4k

--
higress
ByteDance

Skylark2-pro-turbo-8k

--
higress
ByteDance

Skylark2-lite-8k

--
giteeai
Qwen

Qwen2.5 Coder 14B Instruct

--
24K
giteeai
Qwen

Qwen2 VL 72B

--
32K
giteeai
InternLM

InternVL2.5 26B

InternVL2.5-26B is a powerful visual language model that supports multimodal processing of images and text, capable of accurately recognizing image content and generating relevant descriptions or answers.
32K
giteeai
InternLM

InternVL2 8B

InternVL2-8B is a powerful visual language model that supports multimodal processing of images and text, capable of accurately recognizing image content and generating relevant descriptions or answers.
32K
giteeai
Yi

Yi 34B Chat

--
4K
giteeai
DeepSeek

DeepSeek Coder 33B Instruct

DeepSeek Coder 33B is a code language model trained on 2 trillion tokens, of which 87% are code and 13% are natural language in Chinese and English. The model introduces a 16K window size and a fill-in-the-middle objective, providing project-level code completion and snippet-filling capabilities.
8K
giteeai
CodeGeeX

CodeGeeX4 All 9B

CodeGeeX4-ALL-9B is a multilingual code generation model that supports comprehensive functions including code completion and generation, code interpretation, web search, function calls, and repository-level code Q&A, covering various scenarios in software development. It is a top-tier code generation model with fewer than 10B parameters.
32K
taichu
AiMass

Taichu 2.0

The ZD Taichu language model possesses strong language understanding capabilities and excels in text creation, knowledge Q&A, code programming, mathematical calculations, logical reasoning, sentiment analysis, and text summarization. It innovatively combines large-scale pre-training with rich knowledge from multiple sources, continuously refining its algorithms and absorbing new vocabulary, structure, grammar, and semantics from vast text data, so that model performance steadily improves. It provides users with more convenient information and services, as well as a more intelligent experience.
32K
taichu
AiMass

Taichu 2.0V

--
4K
ai360
AI360

360GPT2 o1

--
8K