@antarr · Created July 2, 2025
# Llama 3 System Requirements Tables


## Llama 3.3 Requirements

| Variant Name | VRAM Requirement | Recommended Configuration | Best Use Case |
| --- | --- | --- | --- |
| 70b | 43GB | Mac Studio (M2 Ultra 128GB) | General-purpose inference |
| 70b-instruct-fp16 | 141GB | Mac Studio Cluster (2x M2 Ultra 192GB) | High-precision fine-tuning and training |
| 70b-instruct-q2_K | 26GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight inference with reduced precision |
| 70b-instruct-q3_K_M | 34GB | Mac Studio (M1/M2 Ultra 64GB) | Balanced performance and efficiency |
| 70b-instruct-q3_K_S | 31GB | Mac Studio (M1/M2 Ultra 64GB) | Lower memory, faster inference tasks |
| 70b-instruct-q4_0 | 40GB | Mac Studio (M1/M2 Ultra 64GB) | High-speed, mid-precision inference |
| 70b-instruct-q4_1 | 44GB | Mac Studio (M2 Ultra 128GB) | Precision-critical inference tasks |
| 70b-instruct-q4_K_M | 43GB | Mac Studio (M2 Ultra 128GB) | Optimized for larger models with precision |
| 70b-instruct-q4_K_S | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Standard performance inference tasks |
| 70b-instruct-q5_0 | 49GB | Mac Studio (M2 Ultra 128GB) | High-efficiency inference tasks |
| 70b-instruct-q5_1 | 53GB | Mac Studio (M2 Ultra 128GB) | Complex inference and light training |
| 70b-instruct-q5_K_M | 50GB | Mac Studio (M2 Ultra 128GB) | Memory-intensive inference tasks |
| 70b-instruct-q6_K | 58GB | Mac Studio (M2 Ultra 128GB) | Large-scale precision and training |
| 70b-instruct-q8_0 | 75GB | Mac Studio (M2 Ultra 128GB) | Heavy-duty inference and fine-tuning |
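
The table can be read mechanically: pick the most precise variant whose VRAM figure fits your memory budget, leaving headroom for the OS and KV cache. A minimal sketch of that selection logic, using a small excerpt of the table above as data (the names `largest_fit` and `LLAMA_33_70B`, and the 8GB headroom figure, are illustrative assumptions, not from the source):

```python
# Excerpt of the Llama 3.3 70B table above: (variant, VRAM requirement in GB).
LLAMA_33_70B = [
    ("70b-instruct-q2_K", 26),
    ("70b-instruct-q3_K_M", 34),
    ("70b-instruct-q4_K_M", 43),
    ("70b-instruct-q5_K_M", 50),
    ("70b-instruct-q6_K", 58),
    ("70b-instruct-q8_0", 75),
    ("70b-instruct-fp16", 141),
]

def largest_fit(variants, budget_gb, headroom_gb=8.0):
    """Return the most precise variant that fits after reserving headroom
    for the OS and KV cache (the 8GB default is a guess, not a measured figure)."""
    usable = budget_gb - headroom_gb
    fitting = [(name, gb) for name, gb in variants if gb <= usable]
    return max(fitting, key=lambda item: item[1]) if fitting else None

print(largest_fit(LLAMA_33_70B, budget_gb=64))  # -> ('70b-instruct-q5_K_M', 50)
```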

## Llama 3.2 Requirements

| Variant Name | VRAM Requirement | Recommended Configuration | Best Use Case |
| --- | --- | --- | --- |
| 1b | 1.3GB | Any M-series (8GB+) | Lightweight inference tasks |
| 3b | 2.0GB | Any M-series (8GB+) | General-purpose inference |
| 1b-instruct-fp16 | 2.5GB | Any M-series (8GB+) | Fine-tuning and precision-critical tasks |
| 1b-instruct-q2_K | 581MB | Any M-series (8GB+) | Reduced precision, memory-efficient inference |
| 1b-instruct-q3_K_L | 733MB | Any M-series (8GB+) | Efficient inference with balanced precision |
| 1b-instruct-q3_K_M | 691MB | Any M-series (8GB+) | Smaller, balanced precision tasks |
| 1b-instruct-q3_K_S | 642MB | Any M-series (8GB+) | Lower memory, lightweight inference |
| 1b-instruct-q4_0 | 771MB | Any M-series (8GB+) | Mid-precision inference tasks |
| 1b-instruct-q4_1 | 832MB | Any M-series (8GB+) | Precision-critical small models |
| 1b-instruct-q4_K_M | 808MB | Any M-series (8GB+) | Balanced, memory-optimized tasks |
| 1b-instruct-q4_K_S | 776MB | Any M-series (8GB+) | Lightweight inference with precision |
| 1b-instruct-q5_0 | 893MB | Any M-series (8GB+) | Higher-efficiency inference tasks |
| 1b-instruct-q5_1 | 953MB | Any M-series (8GB+) | Small models with complex inference |
| 1b-instruct-q5_K_M | 912MB | Any M-series (8GB+) | Memory-optimized, efficient inference |
| 1b-instruct-q5_K_S | 893MB | Any M-series (8GB+) | Low memory, efficient inference |
| 1b-instruct-q6_K | 1.0GB | Any M-series (8GB+) | Medium memory, balanced inference |
| 1b-instruct-q8_0 | 1.3GB | Any M-series (8GB+) | Standard inference for small models |
| 3b-instruct-fp16 | 6.4GB | Any M-series (8GB+) | Fine-tuning and precision-critical tasks |
| 3b-instruct-q2_K | 1.4GB | Any M-series (8GB+) | Reduced precision, lightweight inference |
| 3b-instruct-q3_K_L | 1.8GB | Any M-series (8GB+) | Balanced precision inference tasks |
| 3b-instruct-q3_K_M | 1.7GB | Any M-series (8GB+) | Efficient, memory-optimized inference |
| 3b-instruct-q3_K_S | 1.5GB | Any M-series (8GB+) | Lightweight, small batch inference |
| 3b-instruct-q4_0 | 1.9GB | Any M-series (8GB+) | Mid-precision general inference |
| 3b-instruct-q4_1 | 2.1GB | Any M-series (8GB+) | Higher precision, small tasks |
| 3b-instruct-q4_K_M | 2.0GB | Any M-series (8GB+) | Memory-optimized small models |
| 3b-instruct-q4_K_S | 1.9GB | Any M-series (8GB+) | Mid-memory general inference |
| 3b-instruct-q5_0 | 2.3GB | Any M-series (8GB+) | High-efficiency inference tasks |
| 3b-instruct-q5_1 | 2.4GB | Any M-series (8GB+) | Fine-tuned, higher complexity tasks |
| 3b-instruct-q5_K_M | 2.3GB | Any M-series (8GB+) | Efficient inference with optimization |
| 3b-instruct-q5_K_S | 2.3GB | Any M-series (8GB+) | High efficiency, balanced memory tasks |
| 3b-instruct-q6_K | 2.6GB | Any M-series (8GB+) | Balanced precision for small tasks |
| 3b-instruct-q8_0 | 3.4GB | Any M-series (8GB+) | High-memory inference and tasks |
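
On Apple Silicon the GPU has no separate VRAM: it draws from the same unified memory pool as the rest of the system, so the budget for the sketch above is simply the machine's total RAM. One way to read it on macOS (`hw.memsize` is a standard sysctl key reporting physical memory in bytes; the helper name `total_ram_gb` is illustrative):

```python
import subprocess

def total_ram_gb() -> float:
    """Total physical memory on macOS in GiB; sysctl hw.memsize reports bytes."""
    out = subprocess.check_output(["sysctl", "-n", "hw.memsize"])
    return int(out) / 2**30

# Feed the result into the largest_fit() sketch above, e.g.:
# largest_fit(LLAMA_33_70B, budget_gb=total_ram_gb())
```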

## Llama 3.1 Requirements

| Variant Name | VRAM Requirement | Recommended Configuration | Best Use Case |
| --- | --- | --- | --- |
| 8b | 4.9GB | Any M-series (8GB+) | General-purpose inference |
| 70b | 43GB | Mac Studio (M2 Ultra 128GB) | Large-scale inference |
| 405b | 243GB | Mac Studio Cluster (4x M2 Ultra 192GB) | Large-scale model training |
| 405b-instruct-fp16 | 812GB | Mac Studio Cluster (11x M2 Ultra 192GB) | Precision-critical, fine-tuning tasks |
| 405b-instruct-q2_K | 149GB | Mac Studio Cluster (2x M2 Ultra 192GB) | Memory-optimized inference |
| 405b-instruct-q3_K_L | 213GB | Mac Studio Cluster (3x M2 Ultra 192GB) | Balanced precision for large-scale tasks |
| 405b-instruct-q3_K_M | 195GB | Mac Studio Cluster (3x M2 Ultra 192GB) | High-efficiency large-scale inference |
| 405b-instruct-q3_K_S | 175GB | Mac Studio Cluster (2x M2 Ultra 192GB) | Efficient inference with lower precision |
| 405b-instruct-q4_0 | 229GB | Mac Studio Cluster (3x M2 Ultra 192GB) | Mid-precision for large models |
| 405b-instruct-q4_1 | 254GB | Mac Studio Cluster (4x M2 Ultra 192GB) | High-precision inference |
| 405b-instruct-q4_K_M | 243GB | Mac Studio Cluster (4x M2 Ultra 192GB) | Optimized precision for large models |
| 405b-instruct-q4_K_S | 231GB | Mac Studio Cluster (3x M2 Ultra 192GB) | Balanced memory with precision inference |
| 405b-instruct-q5_0 | 279GB | Mac Studio Cluster (4x M2 Ultra 192GB) | High-efficiency large-scale tasks |
| 405b-instruct-q5_1 | 305GB | Mac Studio Cluster (4x M2 Ultra 192GB) | Complex inference and fine-tuning |
| 405b-instruct-q5_K_M | 287GB | Mac Studio Cluster (4x M2 Ultra 192GB) | Memory-intensive training and inference |
| 405b-instruct-q5_K_S | 279GB | Mac Studio Cluster (4x M2 Ultra 192GB) | Efficient training with lower memory |
| 405b-instruct-q6_K | 333GB | Mac Studio Cluster (5x M2 Ultra 192GB) | High-performance training for large models |
| 405b-instruct-q8_0 | 431GB | Mac Studio Cluster (6x M2 Ultra 192GB) | Heavy-duty, precision-critical training |
| 70b-instruct-fp16 | 141GB | Mac Studio Cluster (2x M2 Ultra 192GB) | Fine-tuning and high-precision inference |
| 70b-instruct-q2_K | 26GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight inference |
| 70b-instruct-q3_K_L | 37GB | Mac Studio (M1/M2 Ultra 64GB) | Balanced precision inference |
| 70b-instruct-q3_K_M | 34GB | Mac Studio (M1/M2 Ultra 64GB) | Efficient inference with memory savings |
| 70b-instruct-q3_K_S | 31GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight, low-memory inference |
| 70b-instruct-q4_0 | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Mid-precision general inference |
| 70b-instruct-q4_K_M | 43GB | Mac Studio (M2 Ultra 128GB) | Precision-critical large models |
| 70b-instruct-q4_K_S | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Memory-optimized mid-scale inference |
| 70b-instruct-q5_0 | 49GB | Mac Studio (M2 Ultra 128GB) | Efficient high-memory tasks |
| 70b-instruct-q5_1 | 53GB | Mac Studio (M2 Ultra 128GB) | Complex inference tasks |
| 70b-instruct-q5_K_M | 50GB | Mac Studio (M2 Ultra 128GB) | Memory-efficient inference |
| 70b-instruct-q5_K_S | 49GB | Mac Studio (M2 Ultra 128GB) | Efficient, large-scale inference |
| 70b-instruct-q6_K | 58GB | Mac Studio (M2 Ultra 128GB) | High-efficiency precision tasks |
| 70b-instruct-q8_0 | 75GB | Mac Studio (M2 Ultra 128GB) | Heavy-duty, large-scale inference |
| 8b-instruct-fp16 | 16GB | M-series Pro (16GB+) | Fine-tuning tasks |
| 8b-instruct-q2_K | 3.2GB | Any M-series (8GB+) | Lightweight precision tasks |
| 8b-instruct-q3_K_L | 4.3GB | Any M-series (8GB+) | Balanced precision and memory tasks |
| 8b-instruct-q3_K_M | 4.0GB | Any M-series (8GB+) | Efficient small-scale inference |
| 8b-instruct-q3_K_S | 3.7GB | Any M-series (8GB+) | Lightweight low-memory inference |
| 8b-instruct-q4_0 | 4.7GB | Any M-series (8GB+) | Mid-scale inference |
| 8b-instruct-q4_1 | 5.1GB | Any M-series (8GB+) | Precision-critical small models |
| 8b-instruct-q4_K_M | 4.9GB | Any M-series (8GB+) | Balanced memory with precision inference |
| 8b-instruct-q4_K_S | 4.7GB | Any M-series (8GB+) | Mid-precision small-scale inference |
| 8b-instruct-q5_0 | 5.6GB | Any M-series (8GB+) | Efficient mid-scale inference tasks |
| 8b-instruct-q5_1 | 6.1GB | Any M-series (8GB+) | Complex, small-scale inference |
| 8b-instruct-q6_K | 6.6GB | Any M-series (8GB+) | Balanced precision and memory tasks |
| 8b-instruct-q8_0 | 8.5GB | M-series (16GB+) | Large-scale, memory-intensive inference |
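
Most of the VRAM figures above track a simple rule of thumb: parameter count times bytes per weight. fp16 stores 2 bytes per weight (70.6B parameters ≈ 141GB, 405B ≈ 812GB), while q8_0 and q4_0 land near 8.5 and 4.5 bits per weight once quantization metadata is included. A quick sanity check against the table (the bits-per-weight values are approximations and the function name is illustrative):

```python
# Approximate bits per weight for common quantization formats (assumed values).
BITS_PER_WEIGHT = {"fp16": 16.0, "q8_0": 8.5, "q5_0": 5.5, "q4_0": 4.5}

def approx_size_gb(params_billion: float, quant: str) -> float:
    """Rough weight footprint: params x (bits per weight / 8) bytes each,
    ignoring KV cache and runtime overhead."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8

print(round(approx_size_gb(70.6, "fp16")))  # ~141, matching the 70b fp16 row
print(round(approx_size_gb(70.6, "q4_0")))  # ~40
print(round(approx_size_gb(405, "q8_0")))   # ~430; the table lists 431GB
```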

## Llama 3 Requirements

| Variant Name | VRAM Requirement | Recommended Configuration | Best Use Case |
| --- | --- | --- | --- |
| 8b | 4.7GB | Any M-series (8GB+) | General-purpose inference |
| 70b | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Large-scale inference |
| 70b-instruct | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Instruction-tuned inference tasks |
| 70b-instruct-fp16 | 141GB | Mac Studio Cluster (2x M2 Ultra 192GB) | Precision-critical, fine-tuning tasks |
| 70b-instruct-q2_K | 26GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight inference |
| 70b-instruct-q3_K_L | 37GB | Mac Studio (M1/M2 Ultra 64GB) | Balanced precision inference |
| 70b-instruct-q3_K_M | 34GB | Mac Studio (M1/M2 Ultra 64GB) | Efficient inference with memory savings |
| 70b-instruct-q3_K_S | 31GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight, low-memory inference |
| 70b-instruct-q4_0 | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Mid-precision general inference |
| 70b-instruct-q4_1 | 44GB | Mac Studio (M2 Ultra 128GB) | High-precision inference tasks |
| 70b-instruct-q4_K_M | 43GB | Mac Studio (M2 Ultra 128GB) | Optimized for larger models with precision |
| 70b-instruct-q4_K_S | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Memory-optimized mid-scale inference |
| 70b-instruct-q5_0 | 49GB | Mac Studio (M2 Ultra 128GB) | High-efficiency inference tasks |
| 70b-instruct-q5_1 | 53GB | Mac Studio (M2 Ultra 128GB) | Complex inference tasks |
| 70b-instruct-q5_K_M | 50GB | Mac Studio (M2 Ultra 128GB) | Memory-efficient inference |
| 70b-instruct-q5_K_S | 49GB | Mac Studio (M2 Ultra 128GB) | Efficient, large-scale inference |
| 70b-instruct-q6_K | 58GB | Mac Studio (M2 Ultra 128GB) | High-efficiency precision tasks |
| 70b-instruct-q8_0 | 75GB | Mac Studio (M2 Ultra 128GB) | Heavy-duty, large-scale inference |
| 8b-instruct-fp16 | 16GB | M-series Pro (16GB+) | Fine-tuning tasks |
| 8b-instruct-q2_K | 3.2GB | Any M-series (8GB+) | Lightweight precision tasks |
| 8b-instruct-q3_K_L | 4.3GB | Any M-series (8GB+) | Balanced precision and memory tasks |
| 8b-instruct-q3_K_M | 4.0GB | Any M-series (8GB+) | Efficient small-scale inference |
| 8b-instruct-q3_K_S | 3.7GB | Any M-series (8GB+) | Lightweight low-memory inference |
| 8b-instruct-q4_0 | 4.7GB | Any M-series (8GB+) | Mid-scale inference |
| 8b-instruct-q4_1 | 5.1GB | Any M-series (8GB+) | Precision-critical small models |
| 8b-instruct-q4_K_M | 4.9GB | Any M-series (8GB+) | Balanced memory with precision inference |
| 8b-instruct-q4_K_S | 4.7GB | Any M-series (8GB+) | Mid-precision small-scale inference |
| 8b-instruct-q5_0 | 5.6GB | Any M-series (8GB+) | Efficient mid-scale inference tasks |
| 8b-instruct-q5_1 | 6.1GB | Any M-series (8GB+) | Complex, small-scale inference |
| 8b-instruct-q6_K | 6.6GB | Any M-series (8GB+) | Balanced precision and memory tasks |
| 8b-instruct-q8_0 | 8.5GB | M-series (16GB+) | Large-scale, memory-intensive inference |
| 70b-text | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Text-specific large-scale inference |
| 70b-text-fp16 | 141GB | Mac Studio Cluster (2x M2 Ultra 192GB) | Text fine-tuning with high precision |
| 70b-text-q2_K | 26GB | Mac Studio (M1/M2 Ultra 64GB) | Text inference with reduced precision |
| 70b-text-q3_K_L | 37GB | Mac Studio (M1/M2 Ultra 64GB) | Balanced text inference |
| 70b-text-q3_K_M | 34GB | Mac Studio (M1/M2 Ultra 64GB) | Efficient text inference |
| 70b-text-q3_K_S | 31GB | Mac Studio (M1/M2 Ultra 64GB) | Lightweight, low-memory text tasks |
| 70b-text-q4_0 | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Text inference with mid-precision |
| 70b-text-q4_1 | 44GB | Mac Studio (M2 Ultra 128GB) | Precision-critical text tasks |
| 70b-text-q4_K_M | 43GB | Mac Studio (M2 Ultra 128GB) | Memory-efficient text inference |
| 70b-text-q4_K_S | 40GB | Mac Studio (M1/M2 Ultra 64GB) | Optimized text inference |
| 70b-text-q5_0 | 49GB | Mac Studio (M2 Ultra 128GB) | Efficient text inference |
| 70b-text-q5_1 | 53GB | Mac Studio (M2 Ultra 128GB) | Complex text-specific inference tasks |
| 70b-text-q6_K | 58GB | Mac Studio (M2 Ultra 128GB) | High-efficiency text tasks |
| 70b-text-q8_0 | 75GB | Mac Studio (M2 Ultra 128GB) | Heavy-duty, precision text inference |
| 8b-text | 4.7GB | Any M-series (8GB+) | Text-specific general-purpose inference |
| instruct | 4.7GB | Any M-series (8GB+) | Instruction-tuned general-purpose inference |
| text | 4.7GB | Any M-series (8GB+) | General-purpose text tasks |