Variant Name | VRAM Requirement | Recommended Configuration | Best Use Case |
---|---|---|---|
70b | 43GB | Mac Studio (M2 Ultra 128GB) | General-purpose inference |
70b-instruct-fp16 | 141GB | Mac Studio Cluster (2x M2 Ultra 192GB) | High-precision fine-tuning and training |
70b-instruct-q2_K | 26GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight inference with reduced precision |
70b-instruct-q3_K_M | 34GB | Mac Studio (M1/M2 Ultra 64GB) | Balanced performance and efficiency
70b-instruct-q3_K_S | 31GB | Mac Studio (M1/M2 Ultra 64GB) | Lower memory, faster inference tasks
70b-instruct-q4_0 | 40GB | Mac Studio (M1/M2 Ultra 64GB) | High-speed, mid-precision inference |
70b-instruct-q4_1 | 44GB | Mac Studio (M2 Ultra 128GB) | Precision-critical inference tasks |
70b-instruct-q4_K_M | 43GB | Mac Studio (M2 Ultra 128GB) | Precision-optimized inference for larger models
70b-instruct-q4_K_S | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Standard-performance inference tasks
70b-instruct-q5_0 | 49GB | Mac Studio (M2 Ultra 128GB) | High-efficiency inference tasks |
70b-instruct-q5_1 | 53GB | Mac Studio (M2 Ultra 128GB) | Complex inference and light training |
70b-instruct-q5_K_M | 50GB | Mac Studio (M2 Ultra 128GB) | Memory-intensive inference tasks
70b-instruct-q6_K | 58GB | Mac Studio (M2 Ultra 128GB) | Large-scale precision and training |
70b-instruct-q8_0 | 75GB | Mac Studio (M2 Ultra 128GB) | Heavy-duty inference and fine-tuning |
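
These suffixes are GGUF quantization levels: lower bit-widths (q2, q3) trade accuracy for memory, while q6/q8 and fp16 approach full precision at several times the footprint. As a rough sanity check before pulling a 70b variant, compare its requirement against the share of unified memory the GPU can actually wire. The sketch below is illustrative only; the ~65% usable fraction is an assumption chosen because it reproduces this table's own 64GB-vs-128GB cutoffs, not a measured macOS limit.

```python
# Sketch: which 70b variants plausibly fit on a given Mac.
# Requirements are copied from the table above; usable_fraction = 0.65
# is an assumed rule of thumb for GPU-wirable unified memory.
REQUIREMENTS_GB = {
    "70b-instruct-q2_K": 26, "70b-instruct-q3_K_S": 31,
    "70b-instruct-q3_K_M": 34, "70b-instruct-q4_0": 40,
    "70b-instruct-q4_K_S": 40, "70b": 43, "70b-instruct-q4_K_M": 43,
    "70b-instruct-q4_1": 44, "70b-instruct-q5_0": 49,
    "70b-instruct-q5_K_M": 50, "70b-instruct-q5_1": 53,
    "70b-instruct-q6_K": 58, "70b-instruct-q8_0": 75,
    "70b-instruct-fp16": 141,
}

def variants_that_fit(unified_memory_gb: float,
                      usable_fraction: float = 0.65) -> list[str]:
    """Variants whose requirement fits the GPU-visible memory budget,
    highest-fidelity (largest) first."""
    budget = unified_memory_gb * usable_fraction
    fits = [(req, name) for name, req in REQUIREMENTS_GB.items() if req <= budget]
    return [name for _, name in sorted(fits, reverse=True)]

print(variants_that_fit(64))   # 64GB Ultra -> up to the 40GB q4 variants
print(variants_that_fit(128))  # 128GB Ultra -> up to q8_0 at 75GB
```
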
Variant Name | VRAM Requirement | Recommended Configuration | Best Use Case |
---|---|---|---|
1b | 1.3GB | Any M-series (8GB+) | Lightweight inference tasks |
3b | 2.0GB | Any M-series (8GB+) | General-purpose inference |
1b-instruct-fp16 | 2.5GB | Any M-series (8GB+) | Fine-tuning and precision-critical tasks |
1b-instruct-q2_K | 581MB | Any M-series (8GB+) | Reduced precision, memory-efficient inference |
1b-instruct-q3_K_L | 733MB | Any M-series (8GB+) | Efficient inference with balanced precision
1b-instruct-q3_K_M | 691MB | Any M-series (8GB+) | Smaller, balanced precision tasks
1b-instruct-q3_K_S | 642MB | Any M-series (8GB+) | Lower memory, lightweight inference
1b-instruct-q4_0 | 771MB | Any M-series (8GB+) | Mid-precision inference tasks |
1b-instruct-q4_1 | 832MB | Any M-series (8GB+) | Precision-critical small models |
1b-instruct-q4_K_M | 808MB | Any M-series (8GB+) | Balanced, memory-optimized tasks
1b-instruct-q4_K_S | 776MB | Any M-series (8GB+) | Lightweight inference with precision
1b-instruct-q5_0 | 893MB | Any M-series (8GB+) | Higher-efficiency inference tasks |
1b-instruct-q5_1 | 953MB | Any M-series (8GB+) | Small models with complex inference |
1b-instruct-q5_K_M | 912MB | Any M-series (8GB+) | Memory-optimized, efficient inference
1b-instruct-q5_K_S | 893MB | Any M-series (8GB+) | Low memory, efficient inference
1b-instruct-q6_K | 1.0GB | Any M-series (8GB+) | Medium memory, balanced inference |
1b-instruct-q8_0 | 1.3GB | Any M-series (8GB+) | Standard inference for small models |
3b-instruct-fp16 | 6.4GB | Any M-series (8GB+) | Fine-tuning and precision-critical tasks |
3b-instruct-q2_K | 1.4GB | Any M-series (8GB+) | Reduced precision, lightweight inference |
3b-instruct-q3_K_L | 1.8GB | Any M-series (8GB+) | Balanced precision inference tasks
3b-instruct-q3_K_M | 1.7GB | Any M-series (8GB+) | Efficient, memory-optimized inference
3b-instruct-q3_K_S | 1.5GB | Any M-series (8GB+) | Lightweight, small-batch inference
3b-instruct-q4_0 | 1.9GB | Any M-series (8GB+) | Mid-precision general inference |
3b-instruct-q4_1 | 2.1GB | Any M-series (8GB+) | Higher precision, small tasks |
3b-instruct-q4_K_M | 2.0GB | Any M-series (8GB+) | Memory-optimized small models
3b-instruct-q4_K_S | 1.9GB | Any M-series (8GB+) | Mid-memory general inference
3b-instruct-q5_0 | 2.3GB | Any M-series (8GB+) | High-efficiency inference tasks |
3b-instruct-q5_1 | 2.4GB | Any M-series (8GB+) | Fine-tuned, higher complexity tasks |
3b-instruct-q5_K_M | 2.3GB | Any M-series (8GB+) | Efficient, memory-optimized inference
3b-instruct-q5_K_S | 2.3GB | Any M-series (8GB+) | High-efficiency, balanced-memory tasks
3b-instruct-q6_K | 2.6GB | Any M-series (8GB+) | Balanced precision for small tasks |
3b-instruct-q8_0 | 3.4GB | Any M-series (8GB+) | High-memory, high-precision inference
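
Every 1b and 3b variant fits comfortably on an 8GB machine, so they are the natural starting point for local experimentation. Below is a minimal sketch with the official ollama Python client (`pip install ollama`); the `llama3.2` registry name in the tag is an assumption, so substitute whichever name these variants are published under.

```python
# Minimal sketch using the ollama Python client (pip install ollama).
# The model tag is an assumption -- adjust it to the registry name these
# 1b/3b variants are actually published under.
import ollama

response = ollama.chat(
    model="llama3.2:3b-instruct-q4_K_M",  # assumed tag; ~2.0GB per the table
    messages=[{"role": "user", "content": "Explain unified memory in one paragraph."}],
)
print(response["message"]["content"])
```
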
Variant Name | VRAM Requirement | Recommended Configuration | Best Use Case |
---|---|---|---|
8b | 4.9GB | Any M-series (8GB+) | General-purpose inference |
70b | 43GB | Mac Studio (M2 Ultra 128GB) | Large-scale inference |
405b | 243GB | Mac Studio Cluster (4x M2 Ultra 192GB) | Large-scale model training |
405b-instruct-fp16 | 812GB | Mac Studio Cluster (11x M2 Ultra 192GB) | Precision-critical, fine-tuning tasks |
405b-instruct-q2_K | 149GB | Mac Studio Cluster (2x M2 Ultra 192GB) | Memory-optimized inference |
405b-instruct-q3_K_L | 213GB | Mac Studio Cluster (3x M2 Ultra 192GB) | Balanced precision for large-scale tasks
405b-instruct-q3_K_M | 195GB | Mac Studio Cluster (3x M2 Ultra 192GB) | High-efficiency large-scale inference
405b-instruct-q3_K_S | 175GB | Mac Studio Cluster (2x M2 Ultra 192GB) | Efficient inference with lower precision
405b-instruct-q4_0 | 229GB | Mac Studio Cluster (3x M2 Ultra 192GB) | Mid-precision for large models |
405b-instruct-q4_1 | 254GB | Mac Studio Cluster (4x M2 Ultra 192GB) | High-precision inference |
405b-instruct-q4_K_M | 243GB | Mac Studio Cluster (4x M2 Ultra 192GB) | Optimized precision for large models
405b-instruct-q4_K_S | 231GB | Mac Studio Cluster (3x M2 Ultra 192GB) | Balanced memory and precision for inference
405b-instruct-q5_0 | 279GB | Mac Studio Cluster (4x M2 Ultra 192GB) | High-efficiency large-scale tasks |
405b-instruct-q5_1 | 305GB | Mac Studio Cluster (4x M2 Ultra 192GB) | Complex inference and fine-tuning |
405b-instruct-q5_K_M | 287GB | Mac Studio Cluster (4x M2 Ultra 192GB) | Memory-intensive training and inference
405b-instruct-q5_K_S | 279GB | Mac Studio Cluster (4x M2 Ultra 192GB) | Efficient training with lower memory
405b-instruct-q6_K | 333GB | Mac Studio Cluster (5x M2 Ultra 192GB) | High-performance training for large models |
405b-instruct-q8_0 | 431GB | Mac Studio Cluster (6x M2 Ultra 192GB) | Heavy-duty, precision-critical training |
70b-instruct-fp16 | 141GB | Mac Studio Cluster (2x M2 Ultra 192GB) | Fine-tuning and high-precision inference |
70b-instruct-q2_K | 26GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight inference |
70b-instruct-q3_K_L | 37GB | Mac Studio (M1/M2 Ultra 64GB) | Balanced precision inference
70b-instruct-q3_K_M | 34GB | Mac Studio (M1/M2 Ultra 64GB) | Efficient inference with memory savings
70b-instruct-q3_K_S | 31GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight, low-memory inference
70b-instruct-q4_0 | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Mid-precision general inference |
70b-instruct-q4_K_M | 43GB | Mac Studio (M2 Ultra 128GB) | Precision-critical large models
70b-instruct-q4_K_S | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Memory-optimized mid-scale inference
70b-instruct-q5_0 | 49GB | Mac Studio (M2 Ultra 128GB) | Efficient high-memory tasks |
70b-instruct-q5_1 | 53GB | Mac Studio (M2 Ultra 128GB) | Complex inference tasks |
70b-instruct-q5_K_M | 50GB | Mac Studio (M2 Ultra 128GB) | Memory-efficient inference
70b-instruct-q5_K_S | 49GB | Mac Studio (M2 Ultra 128GB) | Efficient, large-scale inference
70b-instruct-q6_K | 58GB | Mac Studio (M2 Ultra 128GB) | High-efficiency precision tasks |
70b-instruct-q8_0 | 75GB | Mac Studio (M2 Ultra 128GB) | Heavy-duty, large-scale inference |
8b-instruct-fp16 | 16GB | M-series Pro (16GB+) | Fine-tuning tasks |
8b-instruct-q2_K | 3.2GB | Any M-series (8GB+) | Lightweight precision tasks |
8b-instruct-q3_K_L | 4.3GB | Any M-series (8GB+) | Balanced precision and memory tasks
8b-instruct-q3_K_M | 4.0GB | Any M-series (8GB+) | Efficient small-scale inference
8b-instruct-q3_K_S | 3.7GB | Any M-series (8GB+) | Lightweight, low-memory inference
8b-instruct-q4_0 | 4.7GB | Any M-series (8GB+) | Mid-scale inference |
8b-instruct-q4_1 | 5.1GB | Any M-series (8GB+) | Precision-critical small models |
8b-instruct-q4_K_M | 4.9GB | Any M-series (8GB+) | Balanced memory and precision for inference
8b-instruct-q4_K_S | 4.7GB | Any M-series (8GB+) | Mid-precision small-scale inference
8b-instruct-q5_0 | 5.6GB | Any M-series (8GB+) | Efficient mid-scale inference tasks |
8b-instruct-q5_1 | 6.1GB | Any M-series (8GB+) | Complex, small-scale inference |
8b-instruct-q6_K | 6.6GB | Any M-series (8GB+) | Balanced precision and memory tasks |
8b-instruct-q8_0 | 8.5GB | M-series (16GB+) | Large-scale, memory-intensive inference |
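
Note that the requirement column mixes MB and GB figures, so normalize units before comparing variants programmatically. A small, self-contained sketch (the helper name is hypothetical):

```python
# Sketch: normalize "VRAM Requirement" strings ("581MB", "4.9GB", "812GB")
# to a single GB figure so rows can be compared numerically.
def requirement_to_gb(value: str) -> float:
    value = value.strip().upper()
    if value.endswith("MB"):
        return float(value[:-2]) / 1024.0
    if value.endswith("GB"):
        return float(value[:-2])
    raise ValueError(f"unrecognized requirement: {value!r}")

assert requirement_to_gb("581MB") < requirement_to_gb("1.3GB")
print(requirement_to_gb("812GB"))  # 812.0
```
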
Variant Name | VRAM Requirement | Recommended Configuration | Best Use Case |
---|---|---|---|
8b | 4.7GB | Any M-series (8GB+) | General-purpose inference |
70b | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Large-scale inference |
70b-instruct | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Instruction-tuned inference tasks |
70b-instruct-fp16 | 141GB | Mac Studio Cluster (2x M2 Ultra 192GB) | Precision-critical, fine-tuning tasks |
70b-instruct-q2_K | 26GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight inference |
70b-instruct-q3_K_L | 37GB | Mac Studio (M1/M2 Ultra 64GB) | Balanced precision inference
70b-instruct-q3_K_M | 34GB | Mac Studio (M1/M2 Ultra 64GB) | Efficient inference with memory savings
70b-instruct-q3_K_S | 31GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight, low-memory inference
70b-instruct-q4_0 | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Mid-precision general inference |
70b-instruct-q4_1 | 44GB | Mac Studio (M2 Ultra 128GB) | High-precision inference tasks |
70b-instruct-q4_K_M | 43GB | Mac Studio (M2 Ultra 128GB) | Precision-optimized inference for larger models
70b-instruct-q4_K_S | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Memory-optimized mid-scale inference
70b-instruct-q5_0 | 49GB | Mac Studio (M2 Ultra 128GB) | High-efficiency inference tasks |
70b-instruct-q5_1 | 53GB | Mac Studio (M2 Ultra 128GB) | Complex inference tasks |
70b-instruct-q5_K_M | 50GB | Mac Studio (M2 Ultra 128GB) | Memory-efficient inference
70b-instruct-q5_K_S | 49GB | Mac Studio (M2 Ultra 128GB) | Efficient, large-scale inference
70b-instruct-q6_K | 58GB | Mac Studio (M2 Ultra 128GB) | High-efficiency precision tasks |
70b-instruct-q8_0 | 75GB | Mac Studio (M2 Ultra 128GB) | Heavy-duty, large-scale inference |
8b-instruct-fp16 | 16GB | M-series Pro (16GB+) | Fine-tuning tasks |
8b-instruct-q2_K | 3.2GB | Any M-series (8GB+) | Lightweight precision tasks |
8b-instruct-q3_K_L | 4.3GB | Any M-series (8GB+) | Balanced precision and memory tasks
8b-instruct-q3_K_M | 4.0GB | Any M-series (8GB+) | Efficient small-scale inference
8b-instruct-q3_K_S | 3.7GB | Any M-series (8GB+) | Lightweight, low-memory inference
8b-instruct-q4_0 | 4.7GB | Any M-series (8GB+) | Mid-scale inference |
8b-instruct-q4_1 | 5.1GB | Any M-series (8GB+) | Precision-critical small models |
8b-instruct-q4_K_M | 4.9GB | Any M-series (8GB+) | Balanced memory and precision for inference
8b-instruct-q4_K_S | 4.7GB | Any M-series (8GB+) | Mid-precision small-scale inference
8b-instruct-q5_0 | 5.6GB | Any M-series (8GB+) | Efficient mid-scale inference tasks |
8b-instruct-q5_1 | 6.1GB | Any M-series (8GB+) | Complex, small-scale inference |
8b-instruct-q6_K | 6.6GB | Any M-series (8GB+) | Balanced precision and memory tasks |
8b-instruct-q8_0 | 8.5GB | M-series (16GB+) | Large-scale, memory-intensive inference |
70b-text | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Text-specific large-scale inference |
70b-text-fp16 | 141GB | Mac Studio Cluster (2x M2 Ultra 192GB) | Text fine-tuning with high precision |
70b-text-q2_K | 26GB | Mac Studio (M1/M2 Ultra 64GB) | Text inference with reduced precision |
70b-text-q3_K_L | 37GB | Mac Studio (M1/M2 Ultra 64GB) | Balanced text inference
70b-text-q3_K_M | 34GB | Mac Studio (M1/M2 Ultra 64GB) | Efficient text inference
70b-text-q3_K_S | 31GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight, low-memory text tasks
70b-text-q4_0 | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Text inference with mid-precision |
70b-text-q4_1 | 44GB | Mac Studio (M2 Ultra 128GB) | Precision-critical text tasks |
70b-text-q4_K_M | 43GB | Mac Studio (M2 Ultra 128GB) | Memory-efficient text inference
70b-text-q4_K_S | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Optimized text inference
70b-text-q5_0 | 49GB | Mac Studio (M2 Ultra 128GB) | Efficient text inference |
70b-text-q5_1 | 53GB | Mac Studio (M2 Ultra 128GB) | Complex text-specific inference tasks |
70b-text-q6_K | 58GB | Mac Studio (M2 Ultra 128GB) | High-efficiency text tasks |
70b-text-q8_0 | 75GB | Mac Studio (M2 Ultra 128GB) | Heavy-duty, precision text inference |
8b-text | 4.7GB | Any M-series (8GB+) | Text-specific general-purpose inference |
instruct | 4.7GB | Any M-series (8GB+) | Instruction-tuned general-purpose inference
text | 4.7GB | Any M-series (8GB+) | General-purpose text tasks |
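
For the 40GB-and-up variants, generation speed on a single Mac Studio is modest, so streaming the response gives a much better interactive feel than waiting for the full completion. A sketch with the same ollama client; the tag is again an assumption:

```python
# Sketch: stream tokens as they are generated instead of blocking on the
# full completion. The model tag is an assumption -- use the name your
# registry actually exposes for these 70b variants.
import ollama

stream = ollama.chat(
    model="llama3:70b-instruct-q4_0",  # assumed tag; 40GB per the table
    messages=[{"role": "user", "content": "Draft release notes for v1.2."}],
    stream=True,
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)
print()
```
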