@oneryalcin
Created September 9, 2025 06:56
UV Scripts + HF Jobs: Workflow Guide

Introduction: The Problem This Solves

Imagine you're a data scientist with a powerful script that processes images using machine learning. Locally, it works perfectly on your laptop with 10 sample images. But now you need to process 10,000 images, and you need serious GPU power.

The traditional path is painful:

  1. Set up cloud infrastructure (AWS/GCP)
  2. Configure Docker containers
  3. Manage dependencies and environments
  4. Upload data to cloud storage
  5. Write deployment scripts
  6. Handle output collection
  7. Debug networking and permissions

UV Scripts + HF Jobs removes most of this complexity.

Part 1: Understanding UV - The Standalone Script Revolution

What UV Is

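UV is a fast Python package manager written in Rust. One of its most useful features is support for self-contained scripts: a script declares its own dependencies in an inline metadata block (the format standardized in PEP 723), and UV resolves and installs them at run time. Think of it as turning Python scripts into something closer to compiled binaries - they carry their own dependency list.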

The Inline Dependencies Magic

Traditional Python script:

# requirements.txt needed separately
# Virtual environment setup required
# Hope dependencies don't conflict

import requests
import pandas as pd
from transformers import pipeline

def process_data():
    # Your code here
    pass

UV-enabled script:

# /// script
# dependencies = [
#     "requests>=2.28.0",
#     "pandas>=2.0.0", 
#     "transformers>=4.30.0",
#     "torch>=2.0.0"
# ]
# requires-python = ">=3.9"
# ///

import requests
import pandas as pd
from transformers import pipeline

def process_data():
    # Exact same code, but now self-contained
    pass

The Execution Model

Local execution:

uv run my_script.py

What happens:

  1. UV reads the inline metadata block
  2. Creates an isolated environment for the script (cached, so later runs start faster)
  3. Resolves and installs versions that satisfy the declared constraints
  4. Executes your script
  5. Leaves your project and system Python untouched

No manual virtual environment management, and no conflicts with other projects' dependencies.

Why This Is Revolutionary

Before UV:

  • python script.py ➜ Import errors, version conflicts
  • Need separate requirements.txt, setup instructions
  • "Works on my machine" syndrome

With UV:

  • uv run script.py ➜ Dependencies are resolved and installed for you on any machine with UV
  • Script is self-documenting and self-contained
  • Reproducible runs, provided you pin versions (==) rather than use open-ended ranges (>=)

Part 2: HF Jobs - Cloud Compute Made Simple

What HF Jobs Provides

HF Jobs is Hugging Face's answer to "I need GPU compute without the DevOps nightmare." It's Docker-in-the-cloud, but optimized for AI/ML workloads.

The Core Architecture

Your Command → Docker Container → GPU Hardware → Results

Available Hardware Flavors:

  • cpu-basic: Standard CPU processing
  • t4-small: NVIDIA T4 GPU (entry-level ML)
  • a10g-small/large: NVIDIA A10G GPU (solid mid-range)
  • a100-large: NVIDIA A100 GPU (high-end training)

Execution Models

CLI-based:

hf jobs run python:3.12 python -c "print('Hello from the cloud!')"
hf jobs run --flavor a10g-small pytorch/pytorch:latest python train.py

Python API:

from huggingface_hub import run_job

job = run_job(
    image="python:3.12",
    command=["python", "my_script.py"],
    flavor="a100-large",
    timeout="60m"
)
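
A job started from Python keeps running after run_job returns, so a script usually needs to wait for it. The sketch below is one way to do that under stated assumptions: wait_for_stage is a hypothetical helper, the terminal stage names are assumptions to confirm against the huggingface_hub documentation, and the status lookup is passed in as a callable so the helper itself has no network dependency.

```python
import time
from typing import Callable

# Stage names assumed to mark a finished job; confirm them against the
# huggingface_hub documentation for your installed version.
TERMINAL_STAGES = {"COMPLETED", "ERROR", "CANCELED", "DELETED"}


def wait_for_stage(
    get_stage: Callable[[], object],
    poll_seconds: float = 10.0,
    max_polls: int = 360,
) -> str:
    """Poll `get_stage` until it reports a terminal stage, then return it."""
    for _ in range(max_polls):
        stage = get_stage()
        # Accept either a plain string or an enum member with a `.value`.
        stage_name = str(getattr(stage, "value", stage))
        if stage_name in TERMINAL_STAGES:
            return stage_name
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job still not finished after {max_polls} polls")


# Hypothetical wiring with the job started above (requires network access,
# and assumes inspect_job is available in your huggingface_hub version):
#   from huggingface_hub import inspect_job
#   final = wait_for_stage(lambda: inspect_job(job_id=job.id).status.stage)
```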

Key Characteristics

  • Ephemeral: No persistent infrastructure
  • Pay-per-second: Only pay for actual compute time
  • Docker-based: Containers ensure consistency
  • Pro-only: Requires $9/month HF Pro subscription

Part 3: The Powerful Combination - UV Scripts on HF Jobs

The Integration Magic

The hf jobs uv run command bridges local development and cloud execution:

# Test locally first
uv run my_ml_script.py --sample-data

# Same exact script on GPU cloud
hf jobs uv run --flavor a10g-large my_ml_script.py --full-dataset

What makes this special:

  • Zero modification: Same script works locally and in cloud
  • No Dockerfiles: UV handles the environment automatically
  • Close parity: The same dependency declarations are resolved in both places (resolved versions can still differ by platform)

The Technical Flow

When you run hf jobs uv run script.py:

  1. HF Jobs spins up a container with UV pre-installed
  2. UV reads your script's inline dependencies
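  3. UV installs packages matching the declared constraints in an isolated environment
  4. Script executes with access to the requested hardware
  5. Your script pushes its results to HF Hub (nothing is saved automatically, since the container is ephemeral)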
  6. Resources automatically cleaned up

Part 4: The Storage Architecture - Why HF Hub Is Essential

The Local vs Cloud Storage Problem

Local development pattern:

my_project/
├── images/           # 1000 local image files
│   ├── img001.jpg
│   └── img002.jpg
├── script.py         # Reads from ./images/
└── results/          # Outputs to ./results/

Cloud reality:

# In HF Jobs container:
ls images/  # ❌ Directory doesn't exist
            # Your local files aren't magically available

The HF Hub Solution

HF Hub serves as the universal data layer:

Local Machine ←→ HF Hub ←→ HF Jobs Container

Data flows:

  1. Upload: Local data → HF Hub dataset
  2. Process: HF Jobs reads from Hub, writes to Hub
  3. Download: Results downloaded from Hub

Practical Storage Patterns

Pattern 1: Hub-First Workflow

Step 1 - Upload data locally:

# upload_data.py
import os

from datasets import Dataset
from PIL import Image

def create_dataset():
    images = []
    for filename in os.listdir("./images/"):
        img = Image.open(f"./images/{filename}")
        images.append({"image": img, "filename": filename})
    
    dataset = Dataset.from_list(images)
    dataset.push_to_hub("username/my-images")

create_dataset()

Step 2 - Create processing script:

# process_images.py
# /// script
# dependencies = [
#     "datasets>=2.0",
#     "transformers>=4.30",
#     "torch>=2.0",
#     "pillow>=9.0"
# ]
# ///

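import sys
from datasets import load_dataset, Dataset

def extract_text_from_image(image):
    # Placeholder: plug in your own OCR or captioning model here
    # (Example 1 in Part 5 shows one option using easyocr)
    raise NotImplementedError("extract_text_from_image is a placeholder")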

def main(input_dataset, output_dataset):
    # Load from Hub
    dataset = load_dataset(input_dataset)['train']
    
    results = []
    for item in dataset:
        # Process each image
        text = extract_text_from_image(item['image'])
        results.append({
            "filename": item['filename'],
            "extracted_text": text,
            "image": item['image']
        })
    
    # Save back to Hub
    results_dataset = Dataset.from_list(results)
    results_dataset.push_to_hub(output_dataset)

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])

Step 3 - Execute anywhere:

# Test locally
uv run process_images.py username/my-images username/results-test

# Production run on GPU
hf jobs uv run --flavor a10g-large \
  process_images.py username/my-images username/results-production
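
Once the job has pushed its output, the last leg of the data flow is pulling the results back to your machine. The sketch below writes result rows to a CSV file and skips the image column; rows_to_csv is a hypothetical helper, and the commented usage assumes the output dataset name from the command above.

```python
import csv
from typing import Iterable, Mapping


def rows_to_csv(rows: Iterable[Mapping], path: str, skip_columns=("image",)) -> int:
    """Write dict-like rows to a CSV file, leaving out binary columns."""
    written = 0
    writer = None
    with open(path, "w", newline="", encoding="utf-8") as handle:
        for row in rows:
            record = {k: v for k, v in row.items() if k not in skip_columns}
            if writer is None:
                # Use the first row's keys as the CSV header.
                writer = csv.DictWriter(handle, fieldnames=list(record))
                writer.writeheader()
            writer.writerow(record)
            written += 1
    return written


# Hypothetical usage after the job finishes (requires network access):
#   from datasets import load_dataset
#   results = load_dataset("username/results-production", split="train")
#   rows_to_csv(results, "results.csv")
```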

Part 5: Complete Workflow Examples

Example 1: Image OCR Pipeline

The Script:

# /// script
# dependencies = [
#     "datasets>=2.0",
#     "transformers>=4.30",
#     "torch>=2.0",
#     "pillow>=9.0",
#     "easyocr>=1.7.0",
#     "numpy"
# ]
# ///

import sys
import easyocr
import numpy as np
from datasets import load_dataset, Dataset

def ocr_pipeline(input_dataset_id, output_dataset_id):
    # Initialize OCR reader (GPU-accelerated if available)
    reader = easyocr.Reader(['en'])
    
    # Load dataset from Hub
    dataset = load_dataset(input_dataset_id)['train']
    
    results = []
    for i, item in enumerate(dataset):
        print(f"Processing {i+1}/{len(dataset)}: {item['filename']}")
        
        # Convert PIL Image to an RGB numpy array for easyocr
        img_array = np.array(item['image'].convert("RGB"))
        
        # Extract text
        ocr_results = reader.readtext(img_array)
        extracted_text = ' '.join([result[1] for result in ocr_results])
        
        results.append({
            "filename": item['filename'],
            "extracted_text": extracted_text,
            # Cast to plain floats so the scores serialize cleanly
            "confidence_scores": [float(result[2]) for result in ocr_results],
            "original_image": item['image']
        })
    
    # Save results to Hub
    output_dataset = Dataset.from_list(results)
    output_dataset.push_to_hub(output_dataset_id)
    print(f"Saved {len(results)} processed images to {output_dataset_id}")

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: uv run ocr_script.py <input_dataset> <output_dataset>")
        sys.exit(1)
    
    ocr_pipeline(sys.argv[1], sys.argv[2])

Usage:

# Development: Test with small dataset
uv run ocr_script.py username/sample-images username/ocr-test

# Production: Full dataset on GPU
hf jobs uv run --flavor a10g-large \
  ocr_script.py username/production-images username/ocr-results-final

Example 2: Model Fine-tuning Pipeline

The Script:

# /// script
# dependencies = [
#     "datasets>=2.0",
#     "transformers>=4.30",
#     "torch>=2.0", 
#     "accelerate>=0.20",
#     "peft>=0.4.0"
# ]
# ///

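import sys
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model

def fine_tune_model(dataset_id, model_output_id):
    # Load training data from Hub
    dataset = load_dataset(dataset_id)
    
    # Load base model
    model_name = "microsoft/DialoGPT-medium"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # GPT-2 style tokenizers have no pad token, so reuse EOS for padding
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_name)
    
    # Tokenize the raw text (assumes the dataset has a "text" column)
    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)
    
    train_dataset = dataset['train'].map(
        tokenize, batched=True, remove_columns=dataset['train'].column_names
    )
    
    # Configure LoRA for efficient fine-tuning
    peft_config = LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.1,
        target_modules=["c_attn"],
        task_type="CAUSAL_LM"
    )
    model = get_peft_model(model, peft_config)
    
    # Training configuration
    training_args = TrainingArguments(
        output_dir="./results",
        num_train_epochs=3,
        per_device_train_batch_size=4,
        save_steps=500,
        logging_steps=100,
    )
    
    # Train model; the collator pads each batch and builds labels from input_ids
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    
    trainer.train()
    
    # Save the LoRA adapter weights and tokenizer to Hub
    model.push_to_hub(model_output_id)
    tokenizer.push_to_hub(model_output_id)

if __name__ == "__main__":
    fine_tune_model(sys.argv[1], sys.argv[2])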

Usage:

# Long training job on high-end GPU
hf jobs uv run --flavor a100-large --timeout 240m \
  finetune_script.py username/training-data username/my-finetuned-model

Part 6: Best Practices and Patterns

Development Workflow

  1. Start Local: Always test with small datasets locally first
     uv run script.py sample-dataset test-output
  2. Scale Gradually: Move to cloud with progressively larger datasets
     hf jobs uv run --flavor cpu-basic script.py medium-dataset test-cloud
     hf jobs uv run --flavor a10g-small script.py full-dataset production
  3. Monitor and Debug: Use HF Jobs monitoring
     hf jobs logs <job-id>
     hf jobs ps  # List running jobs

Data Management Patterns

Small Datasets (<1GB): Upload directly to Hub

dataset.push_to_hub("username/my-data")

Large Datasets (1GB+): Use streaming

dataset = load_dataset("username/huge-dataset", streaming=True)

Incremental Processing: Check for existing results

try:
    existing = load_dataset(output_dataset_id)
    processed_files = set(existing['train']['filename'])
except Exception:
    # No usable results yet (for example, the output dataset does not exist)
    processed_files = set()

# Only process new files
for item in dataset:
    if item['filename'] not in processed_files:
        # Process item
        pass

Error Handling and Resilience

# /// script
# dependencies = ["datasets>=2.0", "huggingface_hub>=0.20"]
# ///

import sys
from datasets import load_dataset, Dataset

def process_item(item):
    # Placeholder: returns the item unchanged; replace with your own processing
    return item

def resilient_processing(input_id, output_id, checkpoint_interval=100):
    dataset = load_dataset(input_id)['train']
    
    results = []
    for i, item in enumerate(dataset):
        try:
            result = process_item(item)
            results.append(result)
            
            # Checkpoint after every N successfully processed items
            # (each checkpoint is pushed to its own dataset repo)
            if len(results) % checkpoint_interval == 0:
                checkpoint_dataset = Dataset.from_list(results)
                checkpoint_dataset.push_to_hub(f"{output_id}-checkpoint-{i}")
                
        except Exception as e:
            print(f"Error processing item {i}: {e}")
            continue
    
    # Final save
    final_dataset = Dataset.from_list(results)
    final_dataset.push_to_hub(output_id)
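
Skipping bad items covers errors in your own processing, but the final upload can also fail for transient reasons such as a network hiccup. A minimal sketch of retrying with exponential backoff follows; with_retries is a hypothetical helper, not part of datasets or huggingface_hub, and the retry counts are arbitrary defaults.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action`, retrying on any exception with a doubling delay."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            sleep(delay)
    raise ValueError("attempts must be at least 1")


# Hypothetical usage for the final save in the script above:
#   with_retries(lambda: final_dataset.push_to_hub(output_id))
```

Because the job container is ephemeral, a failed final push loses the whole run, which is why the upload is worth wrapping even when the processing loop already tolerates errors.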

Part 7: Advanced Patterns

Multi-Stage Pipelines

# Stage 1: Data preprocessing
hf jobs uv run --flavor cpu-basic \
  preprocess.py raw-data preprocessed-data

# Stage 2: GPU processing  
hf jobs uv run --flavor a100-large \
  process.py preprocessed-data processed-results

# Stage 3: Post-processing
hf jobs uv run --flavor cpu-basic \
  postprocess.py processed-results final-results
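
The three stages can also be driven from one small Python script. This is a sketch under assumptions: it presumes the CLI call blocks until the job finishes and exits non-zero on failure (if that does not hold in your version, poll the job status between stages instead), and build_command, check_chain and run_pipeline are hypothetical names.

```python
import subprocess

# (flavor, script, input dataset, output dataset) for each stage, in order.
STAGES = [
    ("cpu-basic", "preprocess.py", "raw-data", "preprocessed-data"),
    ("a100-large", "process.py", "preprocessed-data", "processed-results"),
    ("cpu-basic", "postprocess.py", "processed-results", "final-results"),
]


def build_command(flavor, script, source, target):
    """Build the argv list for one `hf jobs uv run` stage."""
    return ["hf", "jobs", "uv", "run", "--flavor", flavor, script, source, target]


def check_chain(stages):
    """Verify that every stage reads what the previous stage wrote."""
    for (_, _, _, produced), (_, script, consumed, _) in zip(stages, stages[1:]):
        if produced != consumed:
            raise ValueError(f"{script} reads {consumed!r} but got {produced!r}")


def run_pipeline(stages=STAGES, runner=subprocess.run):
    """Run the stages in order, stopping at the first one that fails."""
    check_chain(stages)
    for stage in stages:
        runner(build_command(*stage), check=True)


if __name__ == "__main__":
    run_pipeline()
```

Checking the chain up front catches a mistyped dataset name before any compute is paid for.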

Parameterized Scripts

# /// script
# dependencies = ["datasets>=2.0", "click>=8.0"]
# ///

import click
from datasets import load_dataset

@click.command()
@click.argument('input_dataset')
@click.argument('output_dataset')
@click.option('--batch-size', default=32, help='Processing batch size')
@click.option('--max-samples', default=None, type=int, help='Limit number of samples')
@click.option('--model-name', default='default-model', help='Model to use')
def process_data(input_dataset, output_dataset, batch_size, max_samples, model_name):
    # Processing logic with parameters
    pass

if __name__ == "__main__":
    process_data()

Usage:

hf jobs uv run --flavor a10g-large \
  script.py input-data output-data --batch-size 64 --max-samples 1000
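
If you would rather not add click as a dependency, the same command-line interface can be sketched with argparse from the standard library. This is an alternative to the click version above, not a replacement the original workflow requires; build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the click interface above using only the standard library."""
    parser = argparse.ArgumentParser(description="Process a Hub dataset")
    parser.add_argument("input_dataset")
    parser.add_argument("output_dataset")
    parser.add_argument("--batch-size", type=int, default=32,
                        help="Processing batch size")
    parser.add_argument("--max-samples", type=int, default=None,
                        help="Limit number of samples")
    parser.add_argument("--model-name", default="default-model",
                        help="Model to use")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```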

Part 8: Economics and Optimization

Cost Optimization Strategies

Development Phase:

# Use CPU for debugging
hf jobs uv run --flavor cpu-basic script.py small-sample debug-output

Scaling Phase:

# Use mid-tier GPU for validation
hf jobs uv run --flavor a10g-small script.py medium-sample validation-output

Production Phase:

# Use high-end GPU for full processing
hf jobs uv run --flavor a100-large script.py full-dataset production-output

Cost Examples (approximate; check current HF pricing before budgeting):

  • CPU debugging: ~$0.01 for 10-minute test
  • A10G validation: ~$0.25 for 30-minute run
  • A100 production: ~$8.25 for 2-hour job
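
With per-second billing, a job's cost is just the hourly rate scaled by its runtime. The sketch below shows that arithmetic; the hourly rates are the ones implied by the three examples above, not official pricing, and estimate_cost is a hypothetical helper.

```python
def estimate_cost(rate_per_hour: float, seconds: float) -> float:
    """Cost in dollars for per-second billing at a given hourly rate."""
    return round(rate_per_hour * seconds / 3600, 2)


# Hourly rates implied by the examples above (illustrative, not official pricing).
IMPLIED_RATES = {"cpu-basic": 0.06, "a10g-small": 0.50, "a100-large": 4.125}

if __name__ == "__main__":
    for flavor, minutes in [("cpu-basic", 10), ("a10g-small", 30), ("a100-large", 120)]:
        cost = estimate_cost(IMPLIED_RATES[flavor], minutes * 60)
        print(f"{flavor}: {minutes} min costs about ${cost:.2f}")
```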

Performance Optimization

Batch Processing:

def process_in_batches(dataset, batch_size=32):
    for i in range(0, len(dataset), batch_size):
        batch = dataset[i:i+batch_size]
        # Process batch efficiently
        yield process_batch(batch)

Memory Management:

# Use streaming for large datasets
dataset = load_dataset(dataset_id, streaming=True)
for item in dataset:
    result = process_item(item)
    # Process one item at a time to avoid memory issues
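
A streamed dataset is a plain iterable without len() or slicing, so the index-based batch helper above does not apply to it. The sketch below batches any iterable lazily using only the standard library; iter_batches is a hypothetical helper.

```python
from itertools import islice
from typing import Iterable, Iterator


def iter_batches(items: Iterable, batch_size: int = 32) -> Iterator[list]:
    """Yield lists of up to `batch_size` items from any iterable, lazily."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls at most batch_size items without reading further ahead.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical usage with a streamed dataset:
#   for batch in iter_batches(dataset, batch_size=32):
#       results = [process_item(item) for item in batch]
```

Only one batch is held in memory at a time, which keeps the memory benefit of streaming while still letting a model process several items per call.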

Conclusion: The Paradigm Shift

UV Scripts + HF Jobs represents a fundamental shift in how we think about ML workflows:

Old Paradigm:

  • Write script → Set up infrastructure → Deploy → Debug → Scale
  • Days of setup, brittle configurations, "works on my machine"

New Paradigm:

  • Write self-contained script → Test locally → Scale to cloud
  • Minutes of setup, guaranteed reproducibility, seamless scaling

The Key Insights:

  1. Self-contained scripts eliminate environment hell
  2. Hub-centric storage enables universal data access
  3. Serverless execution removes infrastructure complexity
  4. Pay-per-use makes GPU compute economically accessible

This combination makes complex ML workflows as simple as running any command-line tool, while maintaining the power and flexibility needed for production workloads.

The future of ML development is: write once, run anywhere, scale instantly.
