AI Model Database

Comprehensive database of open-source AI models from leading providers. Compare specifications, architectures, and find optimal deployment configurations.

Total models: 73 | Providers: 15 | Largest model: 671B | Max context: 10,000,000 tokens
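
The quantization labels used throughout this database (Q4_K_M, Q5_K_M, Q8_0) are llama.cpp GGUF quantization levels. A rough way to check whether a model fits your hardware is to multiply its parameter count by the quantization's average bits per weight; the sketch below does exactly that with approximate bits-per-weight figures, so treat the result as a ballpark before KV cache and runtime overhead, not an exact requirement.

```python
# Rough size estimate for a quantized GGUF build of a model in this database.
# The bits-per-weight figures are approximate averages for llama.cpp K-quants
# and vary slightly between models; treat the result as a ballpark only.

APPROX_BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
}

def estimate_weight_gb(params_billions: float, quant: str) -> float:
    """Approximate weight storage in GB: parameters * bits-per-weight / 8."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return params_billions * bits / 8  # billions of params * bits / 8 bits-per-byte = GB

# Example: Jamba 1.5 Large (94B) at its two recommended quantizations.
for quant in ("Q4_K_M", "Q5_K_M"):
    print(quant, f"~{estimate_weight_gb(94, quant):.0f} GB")
```

For example, a 94B model at Q4_K_M works out to roughly 56 GB of weights, while the same model at Q8_0 would need around 100 GB.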

AI21 Labs (2 models)

Jamba 1.5 Large
Mamba-Transformer Hybrid, 94B parameters
Hybrid Mamba-Transformer with 256K context window.
Context: 256,000 tokens | Released: 2024-08 | Tags: long-context, hybrid, efficient
Recommended Quantization: Q4_K_M, Q5_K_M

Jamba 1.5 Mini
Mamba-Transformer Hybrid, 12B parameters
Compact hybrid model with 256K context window.
Context: 256,000 tokens | Released: 2024-08 | Tags: long-context, hybrid, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0
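
Most dense transformer models listed here have community GGUF builds that load directly with llama.cpp or its Python bindings; support for hybrid and MoE architectures varies by llama.cpp version, so check before downloading. Below is a minimal sketch using the llama-cpp-python bindings, with a placeholder model path; it is an illustration of the workflow, not a recipe tied to any specific model.

```python
# Minimal sketch: loading a GGUF build at one of the recommended quantizations
# with the llama-cpp-python bindings (pip install llama-cpp-python).
# The model path below is a placeholder; download a GGUF file for the model
# you want first.
from llama_cpp import Llama

llm = Llama(
    model_path="path/to/model.Q4_K_M.gguf",  # placeholder
    n_ctx=8192,       # request only the context you need; the KV cache grows with it
    n_gpu_layers=-1,  # offload all layers to GPU if VRAM allows; 0 runs on CPU only
)

out = llm("Explain the trade-off between Q4_K_M and Q8_0 in one sentence.", max_tokens=96)
print(out["choices"][0]["text"])
```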

Alibaba (11 models)

Qwen 2.5 72B
Transformer, 72B parameters
Alibaba's flagship model with strong multilingual and coding capabilities.
Context: 128,000 tokens | Released: 2024-09 | Tags: multilingual, coding, mathematics
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Qwen 2.5 32B
Transformer, 32B parameters
Mid-size Qwen model with excellent coding and reasoning capabilities.
Context: 128,000 tokens | Released: 2024-09 | Tags: multilingual, coding, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Qwen 2.5 14B
Transformer, 14B parameters
Consumer GPU-friendly model with strong multilingual support.
Context: 128,000 tokens | Released: 2024-09 | Tags: multilingual, consumer-gpu
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Qwen 2.5 7B
Transformer, 7B parameters
Efficient multilingual model suitable for edge deployment.
Context: 128,000 tokens | Released: 2024-09 | Tags: multilingual, edge-friendly
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Qwen 2.5 3B
Transformer, 3B parameters
Compact multilingual model for mobile and edge devices.
Context: 128,000 tokens | Released: 2024-09 | Tags: multilingual, edge, mobile
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Qwen 2.5 0.5B
Transformer, 0.5B parameters
Ultra-lightweight multilingual model for minimal hardware.
Context: 32,000 tokens | Released: 2024-09 | Tags: multilingual, ultra-light, mobile
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Qwen 2.5 Coder 32B
Transformer, 32B parameters
Specialized coding model with state-of-the-art code generation capabilities.
Context: 128,000 tokens | Released: 2024-11 | Tags: coding, specialized, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Qwen 2.5 Coder 14B
Transformer, 14B parameters
Efficient coding model suitable for consumer hardware.
Context: 128,000 tokens | Released: 2024-11 | Tags: coding, specialized, consumer-gpu
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Qwen 2.5 Coder 7B
Transformer, 7B parameters
Lightweight coding model for edge deployment.
Context: 128,000 tokens | Released: 2024-11 | Tags: coding, specialized, edge-friendly
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Qwen 2.5 Math 72B
Transformer, 72B parameters
Specialized mathematical reasoning model with exceptional problem-solving capabilities.
Context: 4,096 tokens | Released: 2024-09 | Tags: mathematics, reasoning, specialized
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Qwen2 VL 72B
Vision Transformer, 72B parameters
Vision-language model with exceptional document and image understanding capabilities.
Context: 32,000 tokens | Released: 2024-08 | Tags: vision, multimodal, document-understanding
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

BigCode/ServiceNow (3 models)

StarCoder2 15B
Transformer, 15B parameters
Open code generation model with fill-in-the-middle capabilities.
Context: 16,384 tokens | Released: 2024-02 | Tags: coding, fill-in-middle, open
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

StarCoder2 7B
Transformer, 7B parameters
Efficient code generation model.
Context: 16,384 tokens | Released: 2024-02 | Tags: coding, fill-in-middle, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

StarCoder2 3B
Transformer, 3B parameters
Compact code generation model for edge deployment.
Context: 16,384 tokens | Released: 2024-02 | Tags: coding, fill-in-middle, edge
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0
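
The fill-in-middle tag on the StarCoder2 family means the model is prompted with sentinel tokens marking the code before and after a gap, rather than with a plain instruction. The sketch below assembles such a prompt using the token names from the original StarCoder convention; the exact token strings are an assumption here and should be verified against the specific model's tokenizer config.

```python
# Sketch of a fill-in-the-middle (FIM) prompt. The sentinel token names below
# follow the StarCoder family convention and are an assumption; check the
# tokenizer config of the model you use before relying on them.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """The model is asked to generate the code that belongs between prefix and suffix."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

prompt = build_fim_prompt(
    prefix="def mean(xs):\n    ",
    suffix="\n    return total / len(xs)\n",
)
# Feed `prompt` to the completion endpoint; the model fills in the missing body,
# e.g. "total = sum(xs)".
```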

Cohere (3 models)

Command R+
Transformer, 104B parameters
Cohere's most capable model with advanced RAG and tool use capabilities.
Context: 128,000 tokens | Released: 2024-04 | Tags: rag, tool-use, multilingual
Recommended Quantization: Q4_K_M, Q5_K_M

Command R
Transformer, 35B parameters
Efficient model with strong RAG and tool use capabilities.
Context: 128,000 tokens | Released: 2024-03 | Tags: rag, tool-use, multilingual
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Command R7B
Transformer, 7B parameters
Compact model with RAG and tool use capabilities, Apache 2.0 licensed.
Context: 128,000 tokens | Released: 2024-12 | Tags: rag, tool-use, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Databricks (1 model)

DBRX
Mixture of Experts, 132B parameters
Databricks' open MoE model with strong coding and reasoning capabilities.
Context: 32,000 tokens | Released: 2024-03 | Tags: moe, open, coding
Recommended Quantization: Q4_K_M, Q5_K_M

DeepSeek (7 models)

DeepSeek V3
Mixture of Experts, 671B parameters
DeepSeek V3 is a 671B parameter MoE model with only 37B activated per token, delivering exceptional efficiency.
Context: 64,000 tokens | Released: 2024-12 | Tags: moe, reasoning, coding
Recommended Quantization: Q4_K_M, Q5_K_M
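
The practical consequence of the MoE layout: memory footprint follows the total parameter count (all experts must be resident), while per-token compute follows the activated parameter count. A rough illustration for DeepSeek V3, assuming roughly 4.8 bits per weight for Q4_K_M:

```python
# Back-of-the-envelope for an MoE model such as DeepSeek V3: all experts must be
# stored (memory follows total parameters), but each token only runs through the
# activated subset (compute follows active parameters).
total_params_b = 671    # total parameters, billions
active_params_b = 37    # parameters activated per token, billions
bits_per_weight = 4.8   # assumed approximate average for Q4_K_M

weight_memory_gb = total_params_b * bits_per_weight / 8
print(f"Weights at Q4_K_M: ~{weight_memory_gb:.0f} GB")                       # ~403 GB
print(f"Active fraction per token: ~{active_params_b / total_params_b:.0%}")  # ~6%
```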

DeepSeek R1
Mixture of Experts, 671B parameters
Open-source reasoning model matching OpenAI o1 performance on math, code, and logic tasks.
Context: 64,000 tokens | Released: 2025-01 | Tags: reasoning, moe, open-reasoning
Recommended Quantization: Q4_K_M, Q5_K_M

DeepSeek R1 Distill (Llama 70B)
Transformer, 70B parameters
Reasoning capabilities distilled into a 70B model, offering excellent reasoning at a manageable size.
Context: 128,000 tokens | Released: 2025-01 | Tags: distilled, reasoning, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

DeepSeek R1 Distill (Qwen 32B)
Transformer, 32B parameters
Compact reasoning model with impressive performance for its size.
Context: 128,000 tokens | Released: 2025-01 | Tags: distilled, reasoning, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

DeepSeek R1 Distill (Qwen 14B)
Transformer, 14B parameters
Consumer GPU-friendly reasoning model with strong performance.
Context: 128,000 tokens | Released: 2025-01 | Tags: distilled, reasoning, consumer-gpu
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

DeepSeek R1 Distill (Qwen 7B)
Transformer, 7B parameters
Edge-friendly reasoning model suitable for consumer hardware.
Context: 128,000 tokens | Released: 2025-01 | Tags: distilled, reasoning, edge-friendly
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

DeepSeek R1 Distill (Qwen 1.5B)
Transformer, 1.5B parameters
Ultra-lightweight reasoning model for minimal hardware.
Context: 128,000 tokens | Released: 2025-01 | Tags: distilled, reasoning, ultra-light
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

EleutherAI (3 models)

GPT-J 6B
Transformer, 6B parameters
One of the first truly open large language models.
Context: 2,048 tokens | Released: 2021-06 | Tags: open, foundational, legacy
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

GPT-NeoX 20B
Transformer, 20B parameters
20B parameter open model from EleutherAI.
Context: 2,048 tokens | Released: 2022-02 | Tags: open, foundational, legacy
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Pythia 12B
Transformer, 12B parameters
Research-focused model designed for interpretability studies.
Context: 2,048 tokens | Released: 2023-02 | Tags: open, interpretability, research
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Google (7 models)

Gemma 2 27B
Transformer, 27B parameters
Google's 27B model with knowledge distillation from larger models.
Context: 8,192 tokens | Released: 2024-06 | Tags: efficient, open, knowledge-distillation
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Gemma 2 9B
Transformer, 9B parameters
Efficient 9B model with impressive performance for its size.
Context: 8,192 tokens | Released: 2024-06 | Tags: efficient, open, knowledge-distillation
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Gemma 2 2B
Transformer, 2B parameters
Ultra-lightweight model for edge and mobile deployment.
Context: 8,192 tokens | Released: 2024-06 | Tags: edge, mobile, ultra-light
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Gemma 3 27B
Transformer, 27B parameters
Multimodal model with vision capabilities and extended context.
Context: 128,000 tokens | Released: 2025-03 | Tags: multimodal, vision, long-context
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Gemma 3 12B
Transformer, 12B parameters
Efficient multimodal model with vision capabilities.
Context: 128,000 tokens | Released: 2025-03 | Tags: multimodal, vision, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Gemma 3 4B
Transformer, 4B parameters
Compact multimodal model suitable for edge deployment.
Context: 128,000 tokens | Released: 2025-03 | Tags: multimodal, vision, edge-friendly
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Gemma 3 1B
Transformer, 1B parameters
Ultra-lightweight, text-only Gemma 3 model for minimal hardware.
Context: 32,000 tokens | Released: 2025-03 | Tags: ultra-light, edge, mobile
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

IBM (4 models)

Granite 3.1 8B
Transformer, 8B parameters
IBM's enterprise-focused model with strong tool use and safety features.
Context: 128,000 tokens | Released: 2024-12 | Tags: enterprise, tool-use, open
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Granite 3.1 2B
Transformer, 2B parameters
Compact enterprise model for edge deployment.
Context: 128,000 tokens | Released: 2024-12 | Tags: enterprise, edge, open
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Granite Code 20B
Transformer, 20B parameters
Specialized code model with enterprise-grade capabilities.
Context: 8,192 tokens | Released: 2024-05 | Tags: coding, specialized, enterprise
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Granite Code 8B
Transformer, 8B parameters
Efficient code model suitable for enterprise deployment.
Context: 8,192 tokens | Released: 2024-05 | Tags: coding, specialized, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Meta (12 models)

Llama 4 Maverick
Mixture of Experts, 400B parameters
Llama 4 Maverick is a 400B parameter Mixture-of-Experts model with native multimodal capabilities, supporting text, image, and video understanding.
Context: 256,000 tokens | Released: 2025-04 | Tags: multimodal, reasoning, moe
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Llama 4 Scout
Mixture of Experts, 109B parameters
Llama 4 Scout features an industry-leading 10M token context window, making it ideal for long-document analysis and code understanding.
Context: 10,000,000 tokens | Released: 2025-04 | Tags: long-context, moe, multimodal
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0
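
A context window this large is only usable if the key-value cache fits in memory, and that cache grows linearly with the number of tokens held. The sketch below applies the standard KV-cache size formula; the layer and head counts are illustrative placeholders, not Llama 4 Scout's published configuration.

```python
# KV cache memory grows linearly with context length:
#   bytes = 2 (K and V) * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_tokens
# The architecture numbers used below are illustrative placeholders, not the
# published Llama 4 Scout configuration.

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                n_tokens: int, bytes_per_elem: int = 2) -> float:
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_tokens / 1e9

# A hypothetical 48-layer model with 8 KV heads of dimension 128, cached in fp16.
for ctx in (128_000, 1_000_000, 10_000_000):
    print(f"{ctx:>10,} tokens -> ~{kv_cache_gb(48, 8, 128, ctx):,.0f} GB")
```

At multi-million-token contexts the cache, not the weights, tends to dominate memory, which is why long-context deployments usually quantize or offload it.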

Llama 3.3 70B
Transformer, 70B parameters
Llama 3.3 70B delivers comparable performance to Llama 3.1 405B with significantly reduced computational requirements.
Context: 128,000 tokens | Released: 2024-12 | Tags: multilingual, instruction-tuned, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Llama 3.2 90B Vision
Vision Transformer, 90B parameters
Llama 3.2 90B Vision is Meta's most capable vision model, supporting image understanding and visual reasoning tasks.
Context: 128,000 tokens | Released: 2024-09 | Tags: vision, multimodal, instruction-tuned
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Llama 3.2 11B Vision
Vision Transformer, 11B parameters
Efficient vision-language model optimized for edge deployment while maintaining strong multimodal capabilities.
Context: 128,000 tokens | Released: 2024-09 | Tags: vision, multimodal, edge-friendly
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Llama 3.2 3B
Transformer, 3B parameters
Lightweight model designed for mobile and edge devices with surprisingly strong performance.
Context: 128,000 tokens | Released: 2024-09 | Tags: edge, mobile, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Llama 3.2 1B
Transformer, 1B parameters
Ultra-lightweight model for the most constrained environments.
Context: 128,000 tokens | Released: 2024-09 | Tags: edge, mobile, ultra-light
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Llama 3.1 405B
Transformer, 405B parameters
Meta's largest open model with state-of-the-art reasoning, tool use, and multilingual capabilities.
Context: 128,000 tokens | Released: 2024-07 | Tags: frontier, multilingual, reasoning
Recommended Quantization: Q4_K_M, Q5_K_M

Llama 3.1 70B
Transformer, 70B parameters
Highly capable model balancing performance and efficiency. One of the most popular open models.
Context: 128,000 tokens | Released: 2024-07 | Tags: multilingual, instruction-tuned, popular
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Llama 3.1 8B
Transformer, 8B parameters
Accessible model with strong performance for its size and full multilingual support.
Context: 128,000 tokens | Released: 2024-07 | Tags: efficient, multilingual, accessible
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Llama 3 70B
Transformer, 70B parameters
Meta's third generation Llama model with significant improvements in reasoning and coding.
Context: 8,192 tokens | Released: 2024-04 | Tags: instruction-tuned, popular
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Llama 3 8B
Transformer, 8B parameters
The most popular small open model, excellent for fine-tuning and edge deployment.
Context: 8,192 tokens | Released: 2024-04 | Tags: efficient, accessible, popular
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Microsoft (6 models)

Phi 4
Transformer, 14B parameters
Microsoft's Phi 4 with exceptional reasoning and math capabilities from synthetic data training.
Context: 16,000 tokens | Released: 2024-12 | Tags: reasoning, math, synthetic-data
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Phi 3.5 MoE
Mixture of Experts, 42B parameters
Mixture of Experts model with 42B parameters and 128K context.
Context: 128,000 tokens | Released: 2024-08 | Tags: moe, long-context, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Phi 3.5 Mini
Transformer, 3.8B parameters
Compact model with 128K context window for edge deployment.
Context: 128,000 tokens | Released: 2024-08 | Tags: long-context, edge, mobile
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Phi 3 Medium
Transformer, 14B parameters
Mid-size Phi model with strong reasoning capabilities.
Context: 128,000 tokens | Released: 2024-05 | Tags: reasoning, long-context
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Phi 3 Small
Transformer, 7B parameters
Efficient Phi model with strong performance for its size.
Context: 128,000 tokens | Released: 2024-05 | Tags: reasoning, long-context, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Phi 3 Mini
Transformer, 3.8B parameters
Compact model that started the Phi 3 series with impressive capabilities.
Context: 128,000 tokens | Released: 2024-04 | Tags: reasoning, long-context, edge
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Mistral AI (9 models)

Mistral Large 2
Transformer, 123B parameters
Mistral's most capable model with strong multilingual and coding performance.
Context: 128,000 tokens | Released: 2024-07 | Tags: multilingual, coding, reasoning
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Mistral Small 3
Transformer, 24B parameters
Latency-optimized model with excellent performance for its size. Apache 2.0 licensed.
Context: 32,000 tokens | Released: 2025-01 | Tags: efficient, latency-optimized, open
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Pixtral Large
Vision Transformer, 124B parameters
Mistral's flagship vision-language model with advanced document understanding.
Context: 128,000 tokens | Released: 2024-11 | Tags: vision, multimodal, document
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Pixtral 12B
Vision Transformer, 12B parameters
Efficient vision-language model with strong multimodal capabilities.
Context: 128,000 tokens | Released: 2024-09 | Tags: vision, multimodal, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Mixtral 8x22B
Mixture of Experts, 141B parameters
Sparse MoE model with 141B total parameters but only 39B active per token.
Context: 64,000 tokens | Released: 2024-04 | Tags: moe, open, reasoning
Recommended Quantization: Q4_K_M, Q5_K_M

Mixtral 8x7B
Mixture of Experts, 47B parameters
Groundbreaking MoE model that started the open MoE revolution.
Context: 32,000 tokens | Released: 2023-12 | Tags: moe, open, popular
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Codestral 22B
Transformer, 22B parameters
Specialized coding model with fill-in-the-middle capabilities.
Context: 32,000 tokens | Released: 2024-05 | Tags: coding, fill-in-middle, specialized
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Mistral NeMo 12B
Transformer, 12B parameters
12B model with 128K context and strong multilingual capabilities.
Context: 128,000 tokens | Released: 2024-07 | Tags: multilingual, efficient, open
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Mistral 7B
Transformer, 7B parameters
The model that proved small models can be highly capable.
Context: 32,000 tokens | Released: 2023-09 | Tags: efficient, open, foundational
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

NVIDIA (2 models)

Nemotron-4 340B
Transformer, 340B parameters
NVIDIA's largest open model designed for synthetic data generation and reward modeling.
Context: 4,096 tokens | Released: 2024-06 | Tags: synthetic-data, reward-modeling, frontier
Recommended Quantization: Q4_K_M, Q5_K_M

Nemotron-4 15B
Transformer, 15B parameters
Efficient model for synthetic data generation.
Context: 4,096 tokens | Released: 2024-06 | Tags: synthetic-data, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Snowflake (1 model)

Snowflake Arctic
Mixture of Experts, 480B parameters
Enterprise-focused MoE model optimized for SQL and coding tasks.
Context: 4,096 tokens | Released: 2024-04 | Tags: moe, enterprise, sql
Recommended Quantization: Q4_K_M, Q5_K_M

Stability AI (2 models)

Stable LM 2 12B
Transformer, 12B parameters
Stability AI's language model with multilingual support.
Context: 4,096 tokens | Released: 2024-01 | Tags: open, multilingual
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0

Stable Code 3B
Transformer, 3B parameters
Lightweight coding model with fill-in-the-middle support.
Context: 16,384 tokens | Released: 2024-01 | Tags: coding, fill-in-middle, efficient
Recommended Quantization: Q4_K_M, Q5_K_M, Q8_0