
Idiomatic Go Programming Guide 2025+ - Gold Standard Edition

"Go is about making software engineering more effective, not just making programmers more productive." - The Go Team

Target Go Version: Go 1.23+ (examples align with 1.23–1.24 features)


Executive Summary

This guide synthesizes production wisdom from Google, Uber, and the broader Go community into actionable patterns for building maintainable, performant Go applications. It emphasizes explicit over implicit, composition over inheritance, and simplicity over cleverness.

Prerequisites: Basic Go syntax knowledge and familiarity with standard library concepts.

TL;DR: Code Review Cheat Sheet

| Topic | ✅ Do This | ❌ Never This | Why (What Goes Wrong) |
|---|---|---|---|
| Dependencies | Inject via constructors | Global variables | Testing impossible, hidden coupling, debugging nightmare |
| Errors | errors.Is/As checking | err == ErrFoo | Wrapped errors break equality, silent failures in production |
| Context | Pass as first parameter | Store in structs | Stale contexts, goroutine leaks, lifecycle chaos |
| Concurrency | Use errgroup | Bare sync.WaitGroup | No error handling, no cancellation, silent failures |
| Interfaces | Small, focused, consumer-defined | God interfaces | Impossible to test, violates ISP, tight coupling |
| Logging | Structured with log/slog | fmt.Printf debugging | No searchability, impossible monitoring/alerting |
| Performance | Pre-allocate with capacity | Growing slices | Multiple reallocations, memory churn, GC pressure |
| Naming | Intention-revealing | Cryptic abbreviations | Code becomes write-only, maintenance nightmare |

Key Principles

  • Dependencies are explicit and injectable
  • Errors are handled, never ignored
  • Concurrency uses errgroup, never bare sync.WaitGroup
  • Context flows through parameters, never stored in structs
  • Interfaces are small and defined where they're used

Quick Navigation:

  • Foundations: §1-3 (Philosophy, Structure, Style)
  • Core Patterns: §4-7 (Errors, Concurrency, Architecture, Observability)
  • Advanced Topics: §8-11 (Performance, Generics, Patterns, Footguns)
  • Reference: §12 (Anti-Patterns Checklist)



1. Go's Design Philosophy & Why It Matters

The Power of Constraints

Go's apparent simplicity is actually sophisticated constraint. By limiting language features, Go reduces cognitive load and forces developers to write explicit, maintainable code.

Decision Rationale: These constraints aren't limitations—they're design enablers. When you can't hide complexity, you're forced to address it directly, leading to better architecture.

What Makes Go Different:

| Aspect | Go's Approach | Why This Matters | What Goes Wrong Without It |
|---|---|---|---|
| Formatting | Single style (gofmt) | Eliminates bike-shedding, enables automatic formatting | Teams waste hours debating tabs vs. spaces; inconsistent codebases become unreadable |
| Error Handling | Explicit returns | No hidden control flow, all error paths visible | Exceptions hide failure paths, leading to unhandled edge cases and mysterious crashes |
| Inheritance | Composition via interfaces | Prevents deep hierarchies, enables flexible design | Deep inheritance chains become unmaintainable; a change in a base class breaks everything |
| Concurrency | Goroutines + channels | Makes concurrent programming accessible | Thread management becomes complex; deadlocks and race conditions proliferate |
| Dependencies | Explicit imports | No hidden dependencies or global state | Hidden dependencies make testing impossible; circular imports; mysterious side effects |

What Goes Wrong Without These Principles

Production Disaster #1: Hidden Global State

// BAD: Hidden global state - The debugging nightmare
var DB *sql.DB  // Where does this come from? How do I test it?

func GetUser(id string) (*User, error) {
    return DB.Query(...) // Cannot mock, cannot test, cannot trace failures
}

What Happened in Production: A major e-commerce platform spent 6 months debugging intermittent database errors. The root cause? A global DB variable was being shared across goroutines without proper synchronization. During peak traffic, connection pool exhaustion caused random query failures that were impossible to trace because the dependency was hidden throughout the codebase.

Why This Is a Nightmare:

  • Testing Impossible: Cannot inject mock database for unit tests
  • Debugging Hell: No way to trace which code path caused the DB failure
  • Hidden Coupling: Every package secretly depends on database initialization order
  • Race Conditions: Global state shared across goroutines without synchronization
  • Environment Issues: Cannot easily switch between dev/staging/prod databases

Production Disaster #2: Exception-Driven Error Handling

// What developers coming from Java/C# try to do:
func processOrder(orderID string) {
    order := getOrder(orderID) // What if this fails?
    payment := processPayment(order) // What if payment gateway is down?
    shipment := createShipment(order, payment) // What if shipping service fails?
    // Three points of failure, zero error handling
}

What Went Wrong: A payment processing service had 47 different points where external API calls could fail. Without explicit error handling, failures bubbled up as generic "internal server error" messages. Customer support couldn't troubleshoot issues because there was no way to determine if the problem was with payment gateway, inventory service, or shipping calculator.

With Explicit Dependencies & Error Handling:

// GOOD: Clear, testable, traceable design
type UserService struct {
    db     *sql.DB
    logger *slog.Logger
}

func NewUserService(db *sql.DB, logger *slog.Logger) *UserService {
    return &UserService{
        db:     db,
        logger: logger,
    }
}

func (s *UserService) GetUser(ctx context.Context, id string) (*User, error) {
    var user User
    err := s.db.QueryRowContext(ctx, "SELECT id, email FROM users WHERE id = $1", id).
        Scan(&user.ID, &user.Email)
    if err != nil {
        s.logger.ErrorContext(ctx, "database query failed",
            "user_id", id,
            "error", err)
        return nil, fmt.Errorf("failed to get user %s: %w", id, err)
    }
    return &user, nil // Explicit success path
}

Why This Works:

  • Mockable: Can inject test database for unit tests
  • Traceable: Clear error context with structured logging
  • Explicit: All dependencies visible in constructor
  • Contextual: Request tracing through context parameter
  • Debuggable: Each error path provides specific failure information

When Go Shines vs. When to Choose Alternatives

✅ Perfect for Go:

  • Network Services: HTTP/gRPC APIs, microservices
  • CLI Tools: Single binary deployment, fast startup
  • System Tools: Container orchestration, infrastructure tools
  • Data Pipelines: Concurrent processing, stream handling
  • Backend Services: API servers, background processors

❌ Consider Alternatives:

  • GUI Applications: Limited ecosystem (use web tech for UI)
  • Scientific Computing: NumPy/SciPy ecosystem in Python is mature
  • Game Development: GC pauses affect frame rates (use C++/Rust)
  • Embedded Systems: Runtime overhead may be prohibitive

Decision Rationale: Go optimizes for network-connected, concurrent, server-side applications. It's the infrastructure language of the cloud-native era.


2. Project Organization & Package Design

Project Layout: Community Convention, Not Standard

⚠️ CRITICAL CLARIFICATION: The widely-referenced golang-standards/project-layout is a community proposal, not an official Go standard. The Go team advocates for starting simple.

Official Go Advice: Start with a flat structure and only add directories when complexity demands them.

What Goes Wrong with Premature Organization:

  • Over-Engineering: Creating cmd/, internal/, pkg/ for a 500-line project
  • Artificial Complexity: Forcing abstractions before understanding the domain
  • Maintenance Burden: More directories mean more places to look for code
  • Cognitive Overhead: Developers spend time navigating structure instead of solving problems

Simple Project (Start Here):

github.com/ashokallu/myservice/
├── main.go           # Single binary? Keep it simple
├── user.go           # Domain logic
├── handler.go        # HTTP handlers  
├── repository.go     # Data access
├── go.mod
└── go.sum

When to Evolve to Complex Structure:

github.com/ashokallu/myservice/
├── cmd/                    # Multiple binaries
│   ├── server/main.go      # Main application server
│   └── migrator/main.go    # Database migration tool
├── internal/               # Private application code
│   ├── user/              # Domain packages
│   │   ├── service.go
│   │   ├── repository.go
│   │   └── models.go
│   ├── order/
│   └── shared/            # Shared internal utilities
├── pkg/                    # Public library code
│   └── client/            # Client SDK for this service
│       ├── client.go
│       └── models.go
├── api/                    # API definitions
│   ├── proto/             # Protocol buffer definitions
│   └── openapi/           # OpenAPI specifications
├── go.mod
└── go.sum

When to Add Directories:

  • cmd/ - Only when you have multiple binaries. Use go run ./cmd/server pattern.
  • internal/ - When you need to prevent external imports
  • pkg/ - Only when other projects will import your code as a library
  • api/ - When you have API contracts (protobuf, OpenAPI)
  • tools/ - For development tools with //go:build tools pattern

Tools Directory Pattern:

// tools/tools.go
//go:build tools

package tools

import (
    _ "github.com/google/wire/cmd/wire"
    _ "github.com/golang/mock/mockgen"
    _ "github.com/golangci/golangci-lint/cmd/golangci-lint"
)

Multi-Architecture Builds:

// internal/platform/cpu_amd64.go
//go:build amd64

package platform

const Arch = "amd64"

// internal/platform/cpu_arm64.go  
//go:build arm64

package platform

const Arch = "arm64"

Decision Rationale: Premature organization is as harmful as premature optimization. Start simple, refactor when complexity demands structure.

Package Design Principles

✅ DO: Design packages around capabilities, not data structures

// GOOD: Package organized by capability
package user

type Service struct {
    repo   Repository
    logger *slog.Logger
}

type Repository interface {
    GetUser(ctx context.Context, id string) (*User, error)
    SaveUser(ctx context.Context, user *User) error
}

func (s *Service) CreateUser(ctx context.Context, req *CreateUserRequest) (*User, error) {
    // Business logic here
}

❌ DON'T: Create god packages or organize by technical layers

// BAD: Layer-based organization creates tight coupling
package models  // Everything in one package

type User struct {...}
type Order struct {...}
type UserService struct {...}    // Business logic mixed with data
type OrderService struct {...}
// ... 50 more types creating circular dependencies

What Goes Wrong with God Packages:

  • Circular Dependencies: User needs Order, Order needs User, import cycle hell
  • Compilation Bottlenecks: Changes to any type force recompilation of everything
  • Testing Nightmare: Cannot test individual components in isolation
  • Merge Conflicts: Multiple developers editing the same massive file
  • Cognitive Load: 2000-line files where finding anything requires scrolling

Production War Story: A fintech startup had a models package with 127 types. Every change required 15+ minute rebuilds, developers spent more time resolving merge conflicts than writing features, and the codebase became so tightly coupled that simple changes broke unrelated functionality.

The Dependency Rule

Critical Principle: Your package dependencies must form a Directed Acyclic Graph (DAG). Dependencies should always point inward, from outer layers to inner layers.

┌─────────────┐
│   handler   │ ──────┐
└─────────────┘       │
                      ▼
┌─────────────┐   ┌─────────────┐
│  transport  │──▶│   service   │ 
│  (HTTP/gRPC)│   │ (business)  │
└─────────────┘   └─────────────┘
                      │
                      ▼
                 ┌──────────────┐
                 │ repository   │
                 │ (interface)  │ 
                 └──────────────┘
                      ▲
                      │
              ┌──────────────────┐
              │ repository/mysql │
              │ (implementation) │
              └──────────────────┘

Dependency Flow Rules:

  • handler depends on service
  • service depends on repository (interface)
  • repository/mysql (implementation) depends on repository (interface)
  • The service package knows nothing about HTTP or MySQL
  • The repository interface knows nothing about specific implementations

Why This Works: This inversion of control, facilitated by interfaces, makes the architecture testable and flexible. You can swap MySQL for PostgreSQL without changing business logic.
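
For example, the composition root wires the concrete adapter at startup while the business logic stays untouched. A minimal sketch, assuming hypothetical pgrepo and mysqlrepo adapter packages and a user.NewService constructor:

// main.go (composition root) - adapter packages here are hypothetical
var repo user.Repository
switch cfg.Driver { // cfg is assumed application configuration
case "postgres":
    repo = pgrepo.New(db) // hypothetical PostgreSQL adapter
default:
    repo = mysqlrepo.New(db) // hypothetical MySQL adapter
}
svc := user.NewService(repo) // business logic never changes when the adapter swaps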

What Goes Wrong with Circular Dependencies:

// DISASTER: Circular dependency hell
package user
import "myapp/order"  // User needs Order

type User struct {
    Orders []order.Order  // User knows about orders
}

package order
import "myapp/user"  // Order needs User

type Order struct {
    Customer user.User  // Order knows about user
}

// Result: go build fails with "import cycle not allowed"

How to Break Circular Dependencies:

// SOLUTION 1: Define interfaces in consuming package
package user

// Order is the user package's own minimal view of an order,
// so user never has to import the order package.
type Order struct{ ID string }

type OrderService interface {
    GetOrdersForUser(ctx context.Context, userID string) ([]Order, error)
}

type Service struct {
    orderService OrderService  // Depends on interface, not concrete type
}

// SOLUTION 2: Create shared domain package
package domain

type User struct { ID string }
type Order struct { UserID string }

// Both user and order packages import domain, no cycle

Go Modules and Workspaces

Module Toolchain Directive: Ensure reproducible builds with explicit toolchain versions:

// go.mod
module github.com/ashokallu/myservice

go 1.23

toolchain go1.24.5 // Guarantees exact Go toolchain for reproducible builds

require (
    golang.org/x/sync v0.7.0
)

Why Toolchain Matters: Different Go versions can produce different binaries. The toolchain directive prevents subtle bugs from compiler differences between development and production environments.

Monorepo with Workspaces (Go 1.18+):

// go.work
go 1.23

use (
    ./service-a
    ./service-b
    ./shared
)

Run go mod tidy -go=1.23 to maintain version consistency across workspace modules.

Package Naming Rules

| Type | Convention | Example | Rationale | What Goes Wrong |
|---|---|---|---|---|
| Package | Short, lowercase, no underscores | user, http, time | Clear, concise, no stuttering | userManagement creates verbose imports |
| Interface | Behavior-focused, often ends in -er | Reader, UserRepository | Describes what it does | IUser brings Java naming to Go |
| Struct | Clear noun, avoid stuttering | User (not UserStruct) | Package context provides clarity | user.UserStruct is redundant |
| Method | Verb or verb phrase | GetUser, ProcessPayment | Action-oriented | RetrieveUserDataFromDatabase is verbose |

3. Naming Conventions & Code Style

The Art of Go Naming

Decision Rationale: Good names are documentation. They should reveal intent and make incorrect usage difficult.

What Goes Wrong with Poor Naming:

  • Write-Only Code: Code becomes impossible to maintain
  • Context Loss: Future developers (including you) can't understand intent
  • Bug Introduction: Cryptic names lead to wrong assumptions
  • Onboarding Nightmare: New team members can't contribute effectively

Quick Reference Table

| Element | Convention | Example | Anti-Pattern | What Goes Wrong |
|---|---|---|---|---|
| Package | Short, descriptive | user, payment | userManagement, utils | Verbose imports, unclear purpose |
| Interface | Behavior + -er (if single method) | Reader, UserFinder | IUser, UserInterface | Java-style naming |
| Struct | Clear noun | User, PaymentRequest | UserStruct, UserData | Redundant suffixes |
| Method | Verb phrase | GetUser, ProcessPayment | RetrieveUserData | Overly verbose |
| Variable | Concise in narrow scope | u for user in a short function | userData in a 2-line scope | Unnecessary verbosity |
| Constant | Descriptive | DefaultTimeout, MaxRetries | TIMEOUT_VALUE | C-style naming |
| Error Variable | Err prefix | ErrUserNotFound | ErrorUserNotFound | Non-standard pattern |

Context-Aware Naming

✅ DO: Use intention-revealing names with proper scope

// EXCELLENT: Clear intent and appropriate scope
func (s *PaymentService) ProcessTransaction(ctx context.Context, amount decimal.Decimal, cardToken string) (*Transaction, error) {
    // Short names in narrow scope are perfectly fine
    tx := &Transaction{
        ID:     generateID(),
        Amount: amount,
        Status: StatusPending,
    }
    
    // Descriptive names for important operations
    if err := s.validatePaymentDetails(ctx, amount, cardToken); err != nil {
        return nil, fmt.Errorf("payment validation failed: %w", err)
    }
    
    return tx, nil
}

// Helper function to generate unique IDs
func generateID() string {
    return fmt.Sprintf("tx_%d", time.Now().UnixNano())
}

❌ DON'T: Use cryptic abbreviations or overly verbose names

// BAD: Cryptic and unclear
func (s *PaymentService) ProcTx(c context.Context, amt decimal.Decimal, ct string) (*Transaction, error) {
    t := &Transaction{...} // What is 't'?
    vr, e := s.valPmtDtls(c, amt, ct) // Impossible to understand
    return t, e
}

// ALSO BAD: Overly verbose
func (s *PaymentService) ProcessPaymentTransactionWithCardTokenValidation(
    applicationContext context.Context, 
    paymentAmountInDecimalFormat decimal.Decimal, 
    encryptedCardTokenString string) (*PaymentTransactionResult, error) {
    // Too verbose for simple concepts
}

What Goes Wrong with Bad Naming:

// Production debugging nightmare:
func (s *Service) p(c context.Context, d []byte) error {
    r, e := s.r.g(c, string(d)) // What does this do?
    if e != nil {
        return e // Which error? From where?
    }
    return s.pr(c, r) // What is 'pr'? What is 'r'?
}

Real Production Story: A team inherited code where functions were named p1, p2, p3 (for "process 1", "process 2", etc.). A critical bug required understanding the business logic, but with names like r := s.p2(d.f, d.s), debugging took 3 weeks instead of 3 hours.

Error Variable Patterns

✅ DO: Follow standard library conventions

// Error variables: Err prefix, clear message
var (
    ErrUserNotFound      = errors.New("user not found")
    ErrInvalidEmail      = errors.New("invalid email address")
    ErrPaymentDeclined   = errors.New("payment declined")
    ErrInsufficientFunds = errors.New("insufficient funds")
    ErrEmailRequired     = errors.New("email is required")
    ErrEmailInvalid      = errors.New("email format is invalid")
    ErrNameRequired      = errors.New("name is required")
)

// Constants: Clear, descriptive
const (
    DefaultTimeout      = 30 * time.Second
    MaxRetryAttempts    = 3
    APIVersion          = "v1"
)

Why This Matters: Consistent error naming enables reliable error handling with errors.Is and makes error types instantly recognizable in logs and debugging.

Note: Package-level sentinel errors (e.g., var ErrX = errors.New("x")) are idiomatic and expected. What's not acceptable is mutable global state, or comparing errors with == instead of errors.Is.


4. Error Handling Excellence

Modern Error Handling Philosophy

Go's explicit error handling is a feature, not a bug. It forces you to think about failure modes and makes error paths visible.

Decision Rationale: Hidden exceptions create invisible control flow. Explicit errors make failure modes obvious and force developers to handle them consciously.

The Golden Rules

✅ ALWAYS Use errors.Is and errors.As:

// MODERN: Works with wrapped errors
if errors.Is(err, ErrUserNotFound) {
    return handleNotFound(ctx, userID)
}

var validationErr *ValidationError
if errors.As(err, &validationErr) {
    return handleValidationError(ctx, validationErr)
}

❌ NEVER Use Equality Operators:

// LEGACY: Breaks with wrapped errors
if err == ErrUserNotFound {  // ❌ Fragile
    return handleNotFound(ctx, userID)
}

if err.Error() == "user not found" {  // ❌ Extremely fragile
    return handleNotFound(ctx, userID)
}

What Goes Wrong with Error Equality:

// The Production Disaster Scenario
func (s *UserService) GetUser(ctx context.Context, id string) (*User, error) {
    user, err := s.repo.GetUser(ctx, id)
    if err != nil {
        // This wrapping breaks == comparison
        return nil, fmt.Errorf("failed to get user %s: %w", id, err)
    }
    return user, nil
}

func (h *HTTPHandler) GetUser(w http.ResponseWriter, r *http.Request) {
    user, err := h.service.GetUser(r.Context(), userID)
    if err != nil {
        if err == ErrUserNotFound { // ❌ This will NEVER be true now!
            http.Error(w, "User not found", http.StatusNotFound)
            return
        }
        // User not found becomes 500 Internal Server Error instead of 404
        http.Error(w, "Internal error", http.StatusInternalServerError)
        return
    }
    json.NewEncoder(w).Encode(user)
}

Production War Story: An e-commerce API was returning 500 errors for "product not found" instead of 404s. The issue was error wrapping in the repository layer broke == comparisons in HTTP handlers. This caused search engines to stop indexing their product pages because they were seen as server errors instead of missing products.

Why errors.Is Works:

  • Unwraps Error Chains: Traverses the entire error chain looking for matches
  • Handles Wrapping: Works even when errors are wrapped multiple times
  • Semantic Equality: Compares error identity, not string representation
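
A minimal sketch of the difference, using the ErrUserNotFound sentinel from above:

err := fmt.Errorf("service: %w", fmt.Errorf("repo: %w", ErrUserNotFound))

fmt.Println(err == ErrUserNotFound)          // false: == sees only the outermost wrapper
fmt.Println(errors.Is(err, ErrUserNotFound)) // true: Is unwraps the whole chain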

Error Wrapping with Rich Context

✅ DO: Provide rich context while preserving error chains

func (s *UserService) CreateUser(ctx context.Context, req *CreateUserRequest) (*User, error) {
    logger := s.logger.With("operation", "create_user", "email", req.Email)
    
    if err := s.validateUser(req); err != nil {
        logger.WarnContext(ctx, "user validation failed", "error", err)
        return nil, fmt.Errorf("CreateUser: validation failed for %s: %w", req.Email, err)
    }
    
    user := &User{
        ID:        generateID(),
        Email:     req.Email,
        CreatedAt: time.Now(),
    }
    
    if err := s.repo.SaveUser(ctx, user); err != nil {
        logger.ErrorContext(ctx, "failed to save user", "error", err, "user_id", user.ID)
        return nil, fmt.Errorf("CreateUser: failed to save user %s: %w", user.ID, err)
    }
    
    logger.InfoContext(ctx, "user created successfully", "user_id", user.ID)
    return user, nil
}

Why This Pattern Is Critical:

  • Debugging Speed: Full error context enables fast problem resolution
  • Error Chain Preservation: %w verb preserves the original error for errors.Is/As
  • Operation Context: Clear operation names help identify where failures occur
  • Structured Logging: Consistent logging enables automated alerting

Custom Error Types for Rich Context

✅ DO: Create structured error types when you need rich error information

// ServiceError provides structured error context
type ServiceError struct {
    Code      string                 `json:"code"`
    Message   string                 `json:"message"`
    Service   string                 `json:"service"`
    Operation string                 `json:"operation"`
    RequestID string                 `json:"request_id,omitempty"`
    Details   map[string]interface{} `json:"details,omitempty"`
    Cause     error                  `json:"-"` // Don't serialize wrapped error
}

func (e *ServiceError) Error() string {
    if e.Cause != nil {
        return fmt.Sprintf("%s.%s failed: %s (caused by: %v)", 
            e.Service, e.Operation, e.Message, e.Cause)
    }
    return fmt.Sprintf("%s.%s failed: %s", e.Service, e.Operation, e.Message)
}

func (e *ServiceError) Unwrap() error {
    return e.Cause
}

// Builder pattern for cleaner error creation
func NewServiceError(code, service, operation, message string) *ServiceError {
    return &ServiceError{
        Code:      code,
        Service:   service,
        Operation: operation,
        Message:   message,
        Details:   make(map[string]interface{}),
    }
}

func (e *ServiceError) WithCause(err error) *ServiceError {
    e.Cause = err
    return e
}

func (e *ServiceError) WithRequestID(id string) *ServiceError {
    e.RequestID = id
    return e
}

func (e *ServiceError) WithDetail(key string, value interface{}) *ServiceError {
    e.Details[key] = value
    return e
}

When to Use Custom Error Types:

  • API Responses: When errors need to be serialized to clients
  • Retry Logic: When different error types require different retry strategies
  • Monitoring: When you need to categorize errors for metrics
  • Debugging: When you need rich context for production troubleshooting
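
For example, retry logic can recover the structured error with errors.As and branch on its code. A sketch using the builder above (the "RATE_LIMITED" code is illustrative):

func shouldRetry(err error) bool {
    var svcErr *ServiceError
    if errors.As(err, &svcErr) {
        return svcErr.Code == "RATE_LIMITED" // retry only categorized transient failures
    }
    return false
}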

Multi-Error Handling (Go 1.20+)

✅ DO: Use errors.Join for combining multiple errors

import (
    "errors"
    "fmt"
    "strings"
)

// Define sentinel errors for specific validation failures
var (
    ErrEmailRequired = errors.New("email is required")
    ErrEmailInvalid  = errors.New("email format is invalid")
    ErrNameRequired  = errors.New("name is required")
)

func isValidEmail(email string) bool {
    return strings.Contains(email, "@") // Simplified validation
}

func (s *UserService) ValidateAndCreate(ctx context.Context, req *CreateUserRequest) (*User, error) {
    var errs []error
    
    // Collect all validation errors using sentinel values
    if req.Email == "" {
        errs = append(errs, ErrEmailRequired)
    } else if !isValidEmail(req.Email) {
        errs = append(errs, ErrEmailInvalid)
    }
    
    if req.Name == "" {
        errs = append(errs, ErrNameRequired)
    }
    
    // Join all errors if any exist
    if len(errs) > 0 {
        return nil, errors.Join(errs...)
    }
    
    return s.createUser(ctx, req)
}

// Handle multi-errors with proper unwrapping
func handleValidationErrors(err error) {
    // Check for specific errors using errors.Is (works with Join chains)
    if errors.Is(err, ErrEmailRequired) {
        fmt.Println("Email is required for user creation")
    }
    if errors.Is(err, ErrEmailInvalid) {
        fmt.Println("Email format is invalid")
    }

    // Correctly traverse joined errors (Unwrap() []error for Join)
    type unwrapper interface{ Unwrap() []error }
    var stack []error
    stack = append(stack, err)
    for len(stack) > 0 {
        // pop
        e := stack[len(stack)-1]
        stack = stack[:len(stack)-1]
        fmt.Printf("Validation error: %v\n", e)
        if u, ok := e.(unwrapper); ok {
            stack = append(stack, u.Unwrap()...)
        } else if ue := errors.Unwrap(e); ue != nil {
            stack = append(stack, ue)
        }
    }
}

Why Multi-Error Handling Matters:

  • Better User Experience: Show all validation issues at once, not one at a time
  • API Design: RESTful APIs can return all field errors in a single response
  • Batch Processing: Collect all failures when processing multiple items

Note: errors.Join returns an error whose Unwrap() method yields []error. Do not use a simple errors.Unwrap loop; rely on errors.Is/As or explicitly walk the unwrapped slice as shown.
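
As a payoff, a handler can flatten a joined validation error into one response body. A sketch (toFieldErrors is a hypothetical helper over the sentinels defined above):

func toFieldErrors(err error) []string {
    var msgs []string
    for _, sentinel := range []error{ErrEmailRequired, ErrEmailInvalid, ErrNameRequired} {
        if errors.Is(err, sentinel) { // errors.Is sees through errors.Join
            msgs = append(msgs, sentinel.Error())
        }
    }
    return msgs
}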

Error Handling Anti-Patterns Checklist

❌ Critical Mistakes to Avoid:

| Anti-Pattern | Why It's Wrong | Production Impact | What Goes Wrong |
|---|---|---|---|
| if err != nil { panic(err) } | Crashes entire service | Service downtime, cascading failures | Customer-facing 500 errors, data loss |
| _, err := fn(); err != nil { return } | Ignores errors silently | Silent failures, data corruption | Money transfers fail silently, user data lost |
| err.Error() == "some string" | Breaks with wrapped errors | Missed error conditions | Wrong HTTP status codes, broken retry logic |
| Global mutable variables | Hidden coupling, test flakiness | Difficult to test, race conditions | Hidden state corrupts behavior; inject via constructors instead |
| log.Printf("Error: %v", err) | Unstructured, no context | Difficult to debug, no alerting | Cannot search logs, no automated incident response |

Production Horror Story: A payment service used panic(err) for database connection failures. During a brief database outage, the entire payment system crashed instead of gracefully degrading. This caused a 2-hour service outage during Black Friday, resulting in $2M in lost revenue.


5. Concurrency Mastery

The errgroup Revolution

Decision Rationale: sync.WaitGroup is too primitive for most production systems. errgroup provides error handling, context cancellation, and resource management that real applications need. However, sync.WaitGroup is acceptable for fire-and-forget operations where errors can be safely ignored.

Why errgroup is Superior

| Feature | sync.WaitGroup | errgroup | What Goes Wrong with WaitGroup |
|---|---|---|---|
| Error Handling | Manual collection, race-prone | Automatic error propagation | Errors lost, silent failures, no way to stop other goroutines when one fails |
| Cancellation | Manual context checking | Built-in context cancellation | Goroutines keep running even after the parent context is cancelled |
| Resource Limits | No built-in limiting | SetLimit() prevents resource exhaustion | Unbounded goroutine creation leads to OOM crashes |
| First Error | Must collect all errors | First error cancels all operations | Cannot implement fail-fast behavior; wastes resources on doomed operations |
| API Cleanliness | Manual Add/Done counting | Clean function-based API | Easy to forget Done() calls, leading to deadlocks |
| Best For | Fire-and-forget fan-out (logging, metrics, where failures don't matter) | Operations needing error handling | — |
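
Acceptable WaitGroup use, per the table's last row: best-effort fan-out where failures are ignored by design. A sketch (MetricsSink is a hypothetical interface):

type MetricsSink interface{ Flush() }

func flushAll(sinks []MetricsSink) {
    var wg sync.WaitGroup
    for _, sink := range sinks {
        wg.Add(1)
        go func(s MetricsSink) {
            defer wg.Done()
            s.Flush() // best-effort: nothing to collect, nothing to cancel
        }(sink)
    }
    wg.Wait()
}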

Production Disaster with sync.WaitGroup:

// DANGEROUS: Real production bug that caused service outage
func processUserBatch(userIDs []string) error {
    var wg sync.WaitGroup
    var errors []error
    var mu sync.Mutex
    
    for _, userID := range userIDs {
        wg.Add(1)
        go func(id string) {
            defer wg.Done()
            if err := processUser(id); err != nil {
                mu.Lock()
                errors = append(errors, err) // safe only because of manual mutex bookkeeping
                mu.Unlock()
            }
        }(userID)
    }
    
    wg.Wait()
    
    if len(errors) > 0 {
        return fmt.Errorf("processing failed: %v", errors[0])
    }
    return nil
}

What Went Wrong:

  1. No Resource Limits: With 10,000 user IDs, created 10,000 goroutines simultaneously
  2. Memory Exhaustion: Server ran out of memory and crashed
  3. No Cancellation: Even when first user processing failed, all other goroutines continued
  4. Error Collection Boilerplate: Manual mutex bookkeeping around the shared error slice is easy to get wrong; drop the lock once and you have a data race
  5. No Context: Impossible to implement timeouts or cancellation

Basic errgroup Pattern

✅ DO: Use errgroup for all concurrent operations

import "golang.org/x/sync/errgroup"

func (s *UserService) ProcessUserBatch(ctx context.Context, userIDs []string) error {
    g, ctx := errgroup.WithContext(ctx)
    g.SetLimit(10) // ALWAYS set limits to prevent resource exhaustion
    
    for _, userID := range userIDs {
        userID := userID // Capture loop variable (required before Go 1.22; redundant but harmless in Go 1.22+)
        g.Go(func() error {
            return s.processUser(ctx, userID)
        })
    }
    
    return g.Wait() // Returns first error, cancels all other operations
}

Why This Works:

  • Resource Control: SetLimit(10) prevents creating thousands of goroutines
  • Fail Fast: First error cancels context, stopping other operations
  • Error Propagation: First error is returned to caller
  • Context Integration: Respects parent context cancellation
  • Clean API: No manual Add/Done counting

flowchart LR
    A["Start parent context"]:::context
    B["errgroup.WithContext"]:::context
    C{"Spawn goroutines"}:::decision
    D["Go fn (task 1)"]:::task
    E["Go fn (task 2)"]:::task
    F["Go fn (task 3)"]:::task
    G{"Error occurred?"}:::decision
    H{"Error occurred?"}:::decision
    I{"Error occurred?"}:::decision
    J["Return first error"]:::error
    K["Cancel context"]:::context
    L["Stop remaining work"]:::completion
    M["All tasks completed"]:::completion
    N["g.Wait returns nil"]:::success
    
    A --> B
    B --> C
    C -->|task 1| D
    C -->|task 2| E
    C -->|task 3| F
    D --> G
    E --> H
    F --> I
    G -->|Yes| J
    H -->|Yes| J
    I -->|Yes| J
    J --> K
    K --> L
    G -->|No| M
    H -->|No| M
    I -->|No| M
    M --> N
    
    classDef context fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#0d47a1;
    classDef decision fill:#e0f7fa,stroke:#00838f,stroke-width:2px,color:#006064;
    classDef task fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#4a148c;
    classDef error fill:#ffebee,stroke:#c62828,stroke-width:2px,color:#b71c1c;
    classDef completion fill:#fff8e1,stroke:#ffa000,stroke-width:2px,color:#e65100;
    classDef success fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#1b5e20;

🚨 CRITICAL: Panic Recovery in errgroup

🚨 THE BIGGEST GOTCHA: Panics in errgroup goroutines are NOT caught and will crash your entire service!

Decision Rationale: Go's panic/recover mechanism doesn't cross goroutine boundaries. A panic in one goroutine crashes the entire program unless explicitly handled.

❌ DANGEROUS:

func processItems(ctx context.Context, items []Item) error {
    g, ctx := errgroup.WithContext(ctx)
    
    for _, item := range items {
        item := item
        g.Go(func() error {
            // If this panics, your entire service crashes!
            result := item.Data["key"].(string) // Potential panic
            return processResult(ctx, result)
        })
    }
    
    return g.Wait() // Panic bubbles up and crashes the program
}

Production War Story: A high-traffic API service had a panic in one goroutine due to a malformed JSON response from an external service. This single panic crashed the entire service, causing a 15-minute outage affecting millions of users.

✅ PRODUCTION SOLUTION: Panic-safe errgroup wrapper

import (
    "context"
    "fmt"
    "runtime/debug"
    "golang.org/x/sync/errgroup"
    "log/slog"
)

// SafeGroup wraps errgroup.Group with panic recovery
type SafeGroup struct {
    group  *errgroup.Group
    logger *slog.Logger
}

func NewSafeGroup(ctx context.Context, logger *slog.Logger) (*SafeGroup, context.Context) {
    g, ctx := errgroup.WithContext(ctx)
    return &SafeGroup{
        group:  g,
        logger: logger,
    }, ctx
}

func (sg *SafeGroup) Go(ctx context.Context, fn func() error) {
    sg.group.Go(func() (err error) {
        defer func() {
            if r := recover(); r != nil {
                stack := debug.Stack()
                
                // Convert panic to error
                panicErr := fmt.Errorf("goroutine panic: %v", r)
                
                sg.logger.ErrorContext(ctx, "goroutine panic recovered",
                    "panic", r,
                    "stack_trace", string(stack))
                
                err = panicErr
            }
        }()
        
        return fn()
    })
}

func (sg *SafeGroup) SetLimit(n int) {
    sg.group.SetLimit(n)
}

func (sg *SafeGroup) Wait() (err error) {
    // CRITICAL: Handle panics during Wait() itself
    defer func() {
        if r := recover(); r != nil {
            sg.logger.Error("panic during SafeGroup.Wait()",
                "panic", r,
                "stack_trace", string(debug.Stack()))
            err = fmt.Errorf("wait panic: %v", r)
        }
    }()
    
    return sg.group.Wait()
}

// Usage: Production-safe concurrent processing
func (s *PaymentService) ProcessBatchPayments(ctx context.Context, paymentIDs []string) error {
    logger := s.logger.With("operation", "process_batch_payments")
    
    g, ctx := NewSafeGroup(ctx, logger)
    g.SetLimit(5) // Limit concurrent operations
    
    for _, paymentID := range paymentIDs {
        paymentID := paymentID
        g.Go(ctx, func() error {
            // Now safe from panics that could crash the service
            return s.processPayment(ctx, paymentID)
        })
    }
    
    return g.Wait()
}

Channel Fan-Out/Fan-In Pattern

✅ DO: Use channels for elegant coordination patterns

Input Channel ──┬─► Worker 1 ──┐
                ├─► Worker 2 ──┼─► Collector ──► Output Channel
                └─► Worker 3 ──┘

Simple Example: Here's a basic illustration before diving into the full implementation:

// Simple fan-out to 2 workers, fan-in to a single result channel
func simpleFanOutFanIn(ctx context.Context, input <-chan int) <-chan int {
    // Step 1: Fan-out - distribute work
    worker1Chan := make(chan int)
    worker2Chan := make(chan int)
    
    go func() {
        defer close(worker1Chan)
        defer close(worker2Chan)
        for item := range input {
            // Distribute items alternately
            select {
            case worker1Chan <- item:
            case worker2Chan <- item:
            case <-ctx.Done():
                return
            }
        }
    }()
    
    // Step 2: Worker processing (simple operation)
    processWorker := func(in <-chan int, out chan<- int) {
        for item := range in {
            select {
            case out <- item * 2: // Simple processing
            case <-ctx.Done():
                return
            }
        }
    }
    
    worker1Out := make(chan int)
    worker2Out := make(chan int)
    go processWorker(worker1Chan, worker1Out)
    go processWorker(worker2Chan, worker2Out)
    
    // Step 3: Fan-in - collect results
    output := make(chan int)
    go func() {
        defer close(output)
        for {
            select {
            case res1, ok1 := <-worker1Out:
                if ok1 {
                    output <- res1
                } else {
                    worker1Out = nil // Channel closed, stop selecting from it
                }
            case res2, ok2 := <-worker2Out:
                if ok2 {
                    output <- res2
                } else {
                    worker2Out = nil
                }
            case <-ctx.Done():
                return
            }
            // Exit loop if both worker channels are closed
            if worker1Out == nil && worker2Out == nil {
                break
            }
        }
    }()
    return output
}

Full Example with Error Handling & SafeGroup: This is a more robust implementation suitable for production.

// Fan-out/Fan-in pattern with proper error handling
func (s *DataService) processDataPipeline(ctx context.Context, input <-chan DataItem) <-chan Result {
    const numWorkers = 3
    
    // Create worker channels
    workChan := make(chan DataItem, numWorkers*2) // Buffered for better throughput
    resultChan := make(chan Result, numWorkers)
    
    // Start workers using SafeGroup
    g, ctx := NewSafeGroup(ctx, s.logger)
    
    // Fan-out: Start workers to process items
    for i := 0; i < numWorkers; i++ {
        workerID := i
        g.Go(ctx, func() error {
            for {
                select {
                case item, ok := <-workChan:
                    if !ok {
                        return nil // Work channel closed
                    }
                    
                    result, err := s.processDataItem(ctx, item)
                    if err != nil {
                        return fmt.Errorf("worker %d failed: %w", workerID, err)
                    }
                    
                    select {
                    case resultChan <- result:
                    case <-ctx.Done():
                        return ctx.Err()
                    }
                    
                case <-ctx.Done():
                    return ctx.Err()
                }
            }
        })
    }
    
    // Fan-in: Collect and forward results
    output := make(chan Result)
    go func() {
        defer close(output)
        
        // Forward input to workers
        go func() {
            defer close(workChan)
            for {
                select {
                case item, ok := <-input:
                    if !ok {
                        return // Input closed
                    }
                    select {
                    case workChan <- item:
                    case <-ctx.Done():
                        return
                    }
                case <-ctx.Done():
                    return
                }
            }
        }()
        
        // Wait for workers and collect results
        go func() {
            defer close(resultChan)
            _ = g.Wait() // workers done; in production, surface this error instead of discarding it
        }()
        
        // Forward results to output
        for result := range resultChan {
            select {
            case output <- result:
            case <-ctx.Done():
                return
            }
        }
    }()
    
    return output
}

Why This Pattern Works:

  • Backpressure: Buffered channels prevent goroutine blocking
  • Context Integration: Proper cancellation throughout the pipeline
  • Error Propagation: SafeGroup handles worker failures correctly
  • Resource Cleanup: Channels are properly closed to prevent goroutine leaks

Context Propagation Mastery

Decision Rationale: Context enables request tracing, cancellation, and timeout control across service boundaries. Without proper context propagation, debugging distributed systems becomes impossible.

Context Keys: The Idiomatic Way

✅ ALWAYS: Use unexported key types as context keys

// CORRECT: Unexported key types cannot collide across packages.
// (Note: pointers to zero-size values like &struct{}{} are NOT guaranteed
// to be distinct by the Go spec, so distinct unexported types are the safe pattern.)
type requestIDKey struct{}
type correlationIDKey struct{}
type userIDKey struct{}

func WithRequestID(parent context.Context, id string) context.Context {
    return context.WithValue(parent, requestIDKey{}, id)
}

func GetRequestID(ctx context.Context) string {
    if id, ok := ctx.Value(requestIDKey{}).(string); ok && id != "" {
        return id
    }
    return "unknown-request"
}

❌ NEVER: Use string types as context keys

// WRONG: String keys can collide across packages
const userIDKey = "user_id"  // ❌ Risk of collision

func StoreUserID(ctx context.Context, userID string) context.Context {
    return context.WithValue(ctx, userIDKey, userID) // ❌ Fragile
}

What Goes Wrong with String Keys:

// Package A
const requestIDKey = "request_id"
func SetRequestID(ctx context.Context, id string) context.Context {
    return context.WithValue(ctx, requestIDKey, id)
}

// Package B (different team, unaware of Package A) 
const requestIDKey = "request_id"  // Same string!
func SetRequestData(ctx context.Context, data string) context.Context {
    return context.WithValue(ctx, requestIDKey, data) // Overwrites Package A's value!
}

// Result: Silent data corruption, impossible to debug

Production Disaster: A microservice had two teams using "user_id" as context keys. Team A stored the authenticated user ID, Team B stored the target user ID for admin operations. The values overwrote each other, causing admin operations to be performed on the wrong users, including deleting accounts belonging to other customers.

flowchart TB
    subgraph Root["Root Context"]
        R["context.Background()"]:::background
    end
    
    R --> A["WithTimeout 2m (batchCtx)"]:::timeout
    A --> B["WithValue operation=process_user_batch"]:::value
    A --> C["WithCancel (worker 1)"]:::cancel
    A --> D["WithCancel (worker 2)"]:::cancel
    A --> E["WithCancel (worker 3)"]:::cancel
    
    classDef background fill:#e0f2f1,stroke:#00897b,stroke-width:2px,color:#00695c;
    classDef timeout fill:#fff8e1,stroke:#ffa000,stroke-width:2px,color:#e65100;
    classDef value fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#0d47a1;
    classDef cancel fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#4a148c;

Context Anti-Patterns (Critical to Avoid)

❌ ANTI-PATTERN #1: Storing context in struct fields

// WRONG: Context stored in struct
type Service struct {
    ctx    context.Context   // ❌ NEVER DO THIS
    logger *slog.Logger
}

func (s *Service) ProcessData(data string) error {
    return s.doWork(s.ctx, data) // ❌ Using potentially stale/cancelled context
}

Why This Is Catastrophically Wrong:

1. Lifecycle Mismatch Nightmare:

// The disaster scenario
func main() {
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    
    service := &Service{ctx: ctx} // Context with 5 second timeout
    
    cancel() // Context is now cancelled
    
    // 10 minutes later...
    service.ProcessData("important data") // Uses CANCELLED context!
    // All operations fail immediately with "context canceled"
}

2. Stale Context Hell:

func (h *HTTPHandler) CreateService(w http.ResponseWriter, r *http.Request) {
    // Request-scoped context
    service := &Service{ctx: r.Context()}
    
    // Store service globally (common pattern)  
    GlobalServices["user"] = service
    
    // Request ends, context is cancelled
    
}

// Later, different request:
func (h *HTTPHandler) ProcessUser(w http.ResponseWriter, r *http.Request) {
    service := GlobalServices["user"] // Service has STALE context
    service.ProcessData("user data") // Fails with cancelled context!
}

3. Race Conditions:

type Service struct {
    ctx context.Context // Shared across goroutines
}

func (s *Service) HandleRequests() {
    for i := 0; i < 100; i++ {
        go func() {
            // Multiple goroutines reading/writing s.ctx
            s.ctx = context.WithValue(s.ctx, "key", "value") // Race condition!
        }()
    }
}

4. Testing Nightmare:

func TestService_ProcessData(t *testing.T) {
    service := &Service{ctx: context.Background()}
    
    // Cannot inject different contexts for different test cases
    // Cannot test timeout behavior
    // Cannot test cancellation behavior
    // Cannot test with different request IDs
}

Production War Story: A background job service stored context in a struct field. The context came from an HTTP request with a 30-second timeout. When the HTTP request finished, the context was cancelled, but the background jobs kept running for hours using the cancelled context. All database operations failed with "context canceled" errors, but the jobs didn't realize they were failing and kept retrying forever, creating an infinite loop of database connection attempts.

✅ CORRECT: Context flows through method parameters

// RIGHT: Context passed as parameter
type Service struct {
    logger *slog.Logger      // ✅ No context stored
}

func (s *Service) ProcessData(ctx context.Context, data string) error {
    return s.doWork(ctx, data) // ✅ Fresh context for each operation
}

Why This Works:

  • Lifecycle Control: Each operation gets appropriate context
  • Testability: Can inject different contexts per test
  • Cancellation: Operations can be cancelled independently
  • Tracing: Request-specific tracing information flows correctly
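
Because the context arrives per call, tests can exercise cancellation directly. A sketch, assuming doWork respects ctx:

func TestService_ProcessData_Cancelled(t *testing.T) {
    service := &Service{logger: slog.Default()}

    ctx, cancel := context.WithCancel(context.Background())
    cancel() // inject an already-cancelled context

    if err := service.ProcessData(ctx, "data"); err == nil {
        t.Fatal("expected error from cancelled context")
    }
}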

Context Derivation & Lifecycle Management (Concurrency Context)

Understanding Context Derivation: Every derived context shares certain properties with its parent while having its own lifecycle. This is fundamental to Go's context system.

What Derived Contexts Share with Parent:

  • Values: All key-value pairs from parent context
  • Deadline: The most restrictive deadline (parent or child)
  • Done Channel: Parent cancellation cascades to children
  • Error: Cancellation reason propagates

What Derived Contexts Don't Share:

  • Cancellation Functions: Each has its own cancel function
  • Lifecycle: Child can be cancelled without affecting parent
  • Memory: Child context doesn't hold parent alive unnecessarily
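
"Most restrictive deadline" in practice: a child context can tighten, but never extend, its parent's deadline. A minimal sketch:

parent, cancel := context.WithTimeout(context.Background(), 1*time.Second)
defer cancel()

child, cancel2 := context.WithTimeout(parent, 10*time.Second) // request 10s...
defer cancel2()

deadline, _ := child.Deadline()
fmt.Println(time.Until(deadline)) // ...but ~1s remains: the parent's earlier deadline wins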

✅ DO: Master context derivation patterns

// Context derivation hierarchy
func (s *UserService) ProcessUserBatch(ctx context.Context, userIDs []string) error {
    // 1. Operation-level context with timeout
    batchCtx, batchCancel := context.WithTimeout(ctx, 2*time.Minute)
    defer batchCancel()
    
    // Add operation-specific values
    batchCtx = context.WithValue(batchCtx, operationKey{}, "process_user_batch")
    batchCtx = context.WithValue(batchCtx, batchSizeKey{}, len(userIDs))
    
    logger := s.logger.With(
        "operation", "process_user_batch",
        "batch_size", len(userIDs),
        "correlation_id", getCorrelationID(batchCtx),
    )
    
    logger.InfoContext(batchCtx, "batch processing started")
    
    // 2. Individual user processing contexts
    g, groupCtx := NewSafeGroup(batchCtx, logger)
    g.SetLimit(10)
    
    for _, userID := range userIDs {
        userID := userID
        g.Go(groupCtx, func() error {
            // 3. Per-user context with individual timeout
            userCtx, userCancel := context.WithTimeout(groupCtx, 30*time.Second)
            defer userCancel()
            
            // Add user-specific values
            userCtx = context.WithValue(userCtx, userIDKey{}, userID)
            
            return s.processUserWithContext(userCtx, userID)
        })
    }
    
    if err := g.Wait(); err != nil {
        logger.ErrorContext(batchCtx, "batch processing failed", "error", err)
        return fmt.Errorf("batch processing failed: %w", err)
    }
    
    logger.InfoContext(batchCtx, "batch processing completed successfully")
    return nil
}

// Unexported context key types and helpers
type operationKey struct{}
type batchSizeKey struct{}
type userIDKey struct{}

func getCorrelationID(ctx context.Context) string {
    if id, ok := ctx.Value(correlationIDKey{}).(string); ok {
        return id
    }
    return "unknown-correlation"
}

Context Derivation Footguns (Concurrency Context)

❌ FOOTGUN #1: Storing large objects in context values

// WRONG: Large objects in context cause memory retention
func processUser(ctx context.Context, userData []byte) error {
    // ❌ This keeps userData alive for the entire context tree lifetime
    ctx = context.WithValue(ctx, userDataKey, userData)
    return longRunningOperation(ctx)
}

What Goes Wrong: Context values are copied to all derived contexts. Large objects (slices, structs) can cause memory leaks if the context tree is long-lived.

✅ The Fix: Pass data as function parameters

// GOOD: Pass data explicitly
func processUser(ctx context.Context, userData []byte) error {
    return longRunningOperation(ctx, userData) // ✅ Explicit parameter
}

❌ FOOTGUN #2: Not cancelling derived contexts

// WRONG: Context leak - derived context never cancelled
func performOperation(ctx context.Context) error {
    opCtx, _ := context.WithTimeout(ctx, time.Minute) // ❌ Cancel function ignored
    
    return doWork(opCtx) // Context may leak resources
}

✅ The Fix: Always defer cancel

// GOOD: Proper context cleanup
func performOperation(ctx context.Context) error {
    opCtx, cancel := context.WithTimeout(ctx, time.Minute)
    defer cancel() // ✅ Always clean up
    
    return doWork(opCtx)
}

Context Best Practices Summary (Concurrency Context)

✅ Context Do's:

  • Always pass context as first parameter
  • Use context.WithTimeout for operations with time limits
  • Derive contexts for adding values or deadlines
  • Always defer cancel() when creating cancellable contexts
  • Use context for cancellation, deadlines, and request-scoped values

❌ Context Don'ts:

  • Never store context in struct fields
  • Don't use context for passing optional parameters
  • Don't put large objects in context values
  • Don't ignore context cancellation in loops
  • Don't create contexts without cancel functions

6. Dependency Injection & Clean Architecture

Interface-Driven Design Philosophy

Go's interfaces enable true dependency inversion without complex frameworks. The key principle: define interfaces where they're used, not where they're implemented.

Decision Rationale: Small, focused interfaces enable testing, flexibility, and clear API boundaries. They're the foundation of maintainable systems.

flowchart LR
    subgraph Web[HTTP Layer]
      H[Handler]
    end
    subgraph Biz[Domain/Service]
      S[UserService]
    end
    subgraph Ports[Interfaces]
      I1[(UserRepository)]
      I2[(EmailSender)]
    end
    subgraph Adapters[Adapters]
      A1[PostgresRepo]
      A2[SendgridSender]
    end
    H --> S
    S --> I1
    S --> I2
    I1 -.implemented by.-> A1
    I2 -.implemented by.-> A2
    classDef layer fill:#f5f5f5,stroke:#777;
    class Web,Biz,Ports,Adapters layer;

What Goes Wrong Without Interface-Driven Design:

// DISASTER: Tight coupling nightmare
package handler

import "myapp/mysql" // Direct dependency on MySQL

type HTTPHandler struct {
    db *mysql.Database // Tightly coupled to MySQL
}

func (h *HTTPHandler) GetUser(w http.ResponseWriter, r *http.Request) {
    // Direct MySQL calls in HTTP handler
    user := h.db.QueryUser("SELECT * FROM users WHERE id = ?", userID)
    json.NewEncoder(w).Encode(user)
}

Production Nightmare: A team needed to migrate from MySQL to PostgreSQL. Because HTTP handlers directly imported MySQL packages, they had to:

  1. Change 47 import statements across 23 files
  2. Rewrite all SQL queries (MySQL → PostgreSQL syntax differences)
  3. Update all database connection handling
  4. Rewrite all unit tests to use PostgreSQL test containers
  5. Deal with compilation errors for 2 weeks while migration was in progress

The Fix: Interface-driven design

// SOLUTION: Interface-driven design
package handler

// Small, focused interfaces defined WHERE THEY'RE USED
type UserService interface {
    GetUser(ctx context.Context, id string) (*User, error)
    CreateUser(ctx context.Context, req *CreateUserRequest) (*User, error)
}

type HTTPHandler struct {
    userService UserService // Depends on interface, not concrete type
    logger      *slog.Logger
}

func (h *HTTPHandler) GetUser(w http.ResponseWriter, r *http.Request) {
    userID := r.PathValue("id") // Go 1.22+ path parameter
    user, err := h.userService.GetUser(r.Context(), userID)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    json.NewEncoder(w).Encode(user)
}

Result: Database migration required zero changes to HTTP handlers. Swapped implementation at startup, everything worked perfectly.

The Interface Segregation Principle

✅ DO: Define small, focused interfaces in consuming packages

// Define interfaces where they're used
package handler

// Small, focused interfaces
type UserService interface {
    GetUser(ctx context.Context, id string) (*User, error)
    CreateUser(ctx context.Context, req *CreateUserRequest) (*User, error)
}

type EmailSender interface {
    SendEmail(ctx context.Context, to, subject, body string) error
}

// Handler depends on interfaces, not concrete types
type HTTPHandler struct {
    userService UserService
    emailSender EmailSender
    logger      *slog.Logger
}

func NewHTTPHandler(userService UserService, emailSender EmailSender, logger *slog.Logger) *HTTPHandler {
    return &HTTPHandler{
        userService: userService,
        emailSender: emailSender,
        logger:      logger,
    }
}

Why This Pattern Is Critical:

  • Testing Simplicity: Easy to mock individual interfaces
  • Clear Dependencies: You can see exactly what each component needs
  • Interface Segregation: Small interfaces are easier to implement
  • Flexible Implementation: Swap implementations without changing business logic
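
The testing payoff in practice: a handler test needs only a tiny stub of the one interface it consumes. A sketch (stubUserService is hypothetical):

type stubUserService struct{ user *User }

func (s *stubUserService) GetUser(ctx context.Context, id string) (*User, error) {
    return s.user, nil
}

func (s *stubUserService) CreateUser(ctx context.Context, req *CreateUserRequest) (*User, error) {
    return s.user, nil
}

func TestHTTPHandler_GetUser(t *testing.T) {
    h := NewHTTPHandler(&stubUserService{user: &User{ID: "42"}}, nil, slog.Default())

    rec := httptest.NewRecorder()
    h.GetUser(rec, httptest.NewRequest(http.MethodGet, "/users/42", nil))

    if rec.Code != http.StatusOK {
        t.Fatalf("status = %d, want 200", rec.Code)
    }
}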

What Goes Wrong with God Interfaces:

// BAD: God interface violates Interface Segregation Principle
type UserRepository interface {
    // User operations
    GetUser(ctx context.Context, id string) (*User, error)
    CreateUser(ctx context.Context, user *User) error
    UpdateUser(ctx context.Context, user *User) error
    DeleteUser(ctx context.Context, id string) error
    ListUsers(ctx context.Context, filter UserFilter) ([]*User, error)
    
    // Order operations (why is this here?)
    GetOrdersForUser(ctx context.Context, userID string) ([]*Order, error)
    CreateOrder(ctx context.Context, order *Order) error
    
    // Email operations (this is getting ridiculous)
    SendWelcomeEmail(ctx context.Context, userID string) error
    SendPasswordResetEmail(ctx context.Context, email string) error
    
    // Analytics (seriously?)
    TrackUserEvent(ctx context.Context, userID, event string) error
    GetUserMetrics(ctx context.Context, userID string) (*UserMetrics, error)
}

Production Consequences:

  • Impossible to Test: Must mock 12 methods for simple user creation test
  • Violates SRP: One interface doing four different jobs
  • Hard to Implement: New implementations must implement ALL methods
  • Tight Coupling: Changes to email functionality force user repository changes

Constructor Injection Pattern

✅ DO: Use explicit constructor injection for all dependencies

// UserService with explicit dependencies
type UserService struct {
    repo      UserRepository
    validator Validator
    logger    *slog.Logger
    clock     Clock // Even time is injectable for testing
}

// Clock interface makes time testable
type Clock interface {
    Now() time.Time
}

type RealClock struct{}
func (RealClock) Now() time.Time { return time.Now() }

func NewUserService(repo UserRepository, validator Validator, logger *slog.Logger, clock Clock) *UserService {
    return &UserService{
        repo:      repo,
        validator: validator,
        logger:    logger,
        clock:     clock,
    }
}

func (s *UserService) CreateUser(ctx context.Context, req *CreateUserRequest) (*User, error) {
    if err := s.validator.ValidateCreateUserRequest(req); err != nil {
        return nil, fmt.Errorf("validation failed: %w", err)
    }
    
    user := &User{
        ID:        generateID(),
        Email:     req.Email,
        CreatedAt: s.clock.Now(), // Testable time
    }
    
    return user, s.repo.SaveUser(ctx, user)
}

// Helper function for ID generation
func generateID() string {
    return fmt.Sprintf("user_%d_%d", time.Now().UnixNano(), rand.Intn(1000))
}

Why This Works:

  • Testable: Every dependency can be mocked (see the fixed-clock test below)
  • Explicit: All dependencies visible in constructor signature
  • Flexible: Can swap implementations easily
  • No Globals: No hidden dependencies or side effects
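
As a payoff of the Clock seam, tests can pin time to a fixed instant. A minimal sketch; fixedClock, stubRepo, and stubValidator are hypothetical no-op doubles for the interfaces above:

// fixedClock is a hypothetical test double for the Clock interface.
type fixedClock struct{ t time.Time }

func (c fixedClock) Now() time.Time { return c.t }

func TestCreateUser_SetsCreatedAt(t *testing.T) {
    want := time.Date(2025, 1, 15, 10, 0, 0, 0, time.UTC)
    svc := NewUserService(stubRepo{}, stubValidator{}, slog.Default(), fixedClock{t: want})
    
    user, err := svc.CreateUser(context.Background(), &CreateUserRequest{Email: "a@example.com"})
    if err != nil {
        t.Fatal(err)
    }
    if !user.CreatedAt.Equal(want) {
        t.Fatalf("CreatedAt = %v, want %v", user.CreatedAt, want)
    }
}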

Composition Over Inheritance

Decision Rationale: Go's embedding enables composition without inheritance complexity. Use it consciously to add meaningful functionality.

✅ DO: Use embedding to extend functionality

// Good: Embedding adds meaningful functionality
type LoggingDB struct {
    *sql.DB
    logger *slog.Logger
}

func (db *LoggingDB) QueryContext(ctx context.Context, query string, args ...interface{}) (*sql.Rows, error) {
    start := time.Now()
    rows, err := db.DB.QueryContext(ctx, query, args...)
    duration := time.Since(start)
    
    db.logger.InfoContext(ctx, "database query executed",
        "query", query,
        "duration_ms", duration.Milliseconds(),
        "error", err)
    
    return rows, err
}

⚠️ Embedding Limitation: This pattern is effective but has limits. For example, if you call db.Begin(), it returns a standard *sql.Tx, not a wrapped, logging-aware transaction. Any subsequent calls on that transaction object will bypass your logging.

// The embedding footgun
func (db *LoggingDB) processTransaction() error {
    tx, err := db.Begin() // Returns *sql.Tx, NOT *LoggingTx
    if err != nil {
        return err
    }
    defer tx.Rollback()
    
    // These operations are NOT logged!
    _, err = tx.Exec("INSERT INTO users...")
    _, err = tx.Exec("UPDATE accounts...")
    
    return tx.Commit()
}

A truly comprehensive logging solution for database/sql requires a more invasive wrapper or a different driver. This example illustrates the principle of embedding, but be aware of its boundaries.
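
One direction such a wrapper can take is returning a logging-aware transaction from a dedicated method, so statement-level logging survives the transaction boundary. A sketch only: LoggingTx and BeginLogged are illustrative names, and a complete solution would wrap every *sql.Tx method (QueryContext, Prepare, and so on):

// LoggingTx wraps *sql.Tx so statements inside the transaction are logged too.
type LoggingTx struct {
    *sql.Tx
    logger *slog.Logger
}

func (db *LoggingDB) BeginLogged(ctx context.Context) (*LoggingTx, error) {
    tx, err := db.DB.BeginTx(ctx, nil)
    if err != nil {
        return nil, err
    }
    return &LoggingTx{Tx: tx, logger: db.logger}, nil
}

func (tx *LoggingTx) ExecContext(ctx context.Context, query string, args ...interface{}) (sql.Result, error) {
    start := time.Now()
    res, err := tx.Tx.ExecContext(ctx, query, args...)
    tx.logger.InfoContext(ctx, "tx statement executed",
        "query", query,
        "duration_ms", time.Since(start).Milliseconds(),
        "error", err)
    return res, err
}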

❌ DON'T: Embed types that expose unwanted functionality

// Bad: Exposes mutex methods to external users
type Counter struct {
    sync.Mutex  // ❌ Now Counter.Lock() and Counter.Unlock() are public
    value int
}

// Good: Use composition instead
type Counter struct {
    mu    sync.Mutex  // Private field
    value int
}

func (c *Counter) Increment() {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.value++
}

7. Structured Logging & Observability

Production Logging with log/slog

Go 1.21's log/slog brings enterprise-grade structured logging to the standard library. Unstructured logs are useless at scale.

Decision Rationale: Structured logging is essential for production systems. Unstructured logs cannot be searched, aggregated, or alerted on effectively.

What Goes Wrong with Unstructured Logging:

// DISASTER: Unstructured logging nightmare
func (s *UserService) CreateUser(ctx context.Context, req *CreateUserRequest) (*User, error) {
    fmt.Printf("Creating user: %s\n", req.Email) // No timestamp, no context
    
    user, err := s.repo.SaveUser(ctx, &User{Email: req.Email})
    if err != nil {
        fmt.Printf("Error: %v\n", err) // No context, no correlation ID
        return nil, err
    }
    
    fmt.Printf("User created successfully\n") // Which user? When?
    return user, nil
}

Production Horror Story: A payment service processed millions of transactions daily. When customers reported missing payments, debugging meant manually scanning 50GB of log files for patterns like "Error: database timeout". Discovering why payment ID "pay_12345" failed took 3 days, because the only option was grepping unstructured text files with no correlation between related log entries.

After implementing structured logging, the same investigation took 3 minutes:

# Find all events for specific payment
kubectl logs | jq 'select(.payment_id == "pay_12345")'

# Find all database timeouts in the last hour
kubectl logs | jq 'select(.error_type == "database_timeout" and .timestamp > "2025-01-15T10:00:00Z")'

The Three Pillars of Production Logging

  1. Structured: Machine-readable format (JSON)
  2. Contextual: Request tracing and correlation
  3. Actionable: Logs that help solve problems

✅ DO: Implement production-grade logging from day one

import (
    "context"
    "log/slog"
    "os"
    "time"
)

func setupLogger(level string) (*slog.Logger, error) {
    var logLevel slog.Level
    switch level {
    case "debug":
        logLevel = slog.LevelDebug
    case "info":
        logLevel = slog.LevelInfo
    case "warn":
        logLevel = slog.LevelWarn
    case "error":
        logLevel = slog.LevelError
    default:
        logLevel = slog.LevelInfo
    }
    
    handler := slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
        Level:     logLevel,
        AddSource: true, // Add file:line for debugging
        ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
            if a.Key == slog.TimeKey {
                return slog.Attr{
                    Key:   "timestamp",
                    Value: slog.StringValue(time.Now().UTC().Format(time.RFC3339)),
                }
            }
            return a
        },
    })
    
    return slog.New(handler), nil
}

Correlation IDs and Request Tracing

Decision Rationale: In distributed systems, a single request spans multiple services. Without correlation IDs, debugging becomes impossible.

What Goes Wrong Without Correlation IDs:

# Distributed system without correlation - debugging nightmare
service-a: "Processing user creation"
service-b: "Database connection timeout"  
service-c: "Email sent successfully"
service-a: "User creation failed"
service-b: "Database connection restored"
service-c: "Email delivery failed"

# Which logs are related? Which user creation failed? 
# Which email failed? Impossible to tell!

✅ DO: Implement comprehensive request tracing

// Context keys for request tracking. A named struct with a field is
// non-zero-size, so each pointer below is guaranteed unique; &struct{}{}
// pointers are NOT, because zero-size allocations may share an address.
type contextKey struct{ name string }

var (
    correlationIDKey = &contextKey{"correlation_id"}
    requestIDKey     = &contextKey{"request_id"}
    userIDKey        = &contextKey{"user_id"}
)

func getCorrelationID(ctx context.Context) string {
    if id, ok := ctx.Value(correlationIDKey).(string); ok {
        return id
    }
    return "unknown-correlation"
}

func getRequestID(ctx context.Context) string {
    if id, ok := ctx.Value(requestIDKey).(string); ok {
        return id
    }
    return "unknown-request"
}

// Service demonstrates structured logging with context
func (s *UserService) ProcessUser(ctx context.Context, userID string) error {
    // Create operation-scoped logger with consistent context
    logger := s.logger.With(
        "service", "user_service",
        "operation", "process_user",
        "user_id", userID,
        "correlation_id", getCorrelationID(ctx),
        "request_id", getRequestID(ctx),
    )
    
    logger.InfoContext(ctx, "processing user started")
    
    start := time.Now()
    defer func() {
        duration := time.Since(start)
        logger.InfoContext(ctx, "processing user completed",
            "duration_ms", duration.Milliseconds())
    }()
    
    user, err := s.repo.GetUser(ctx, userID)
    if err != nil {
        logger.ErrorContext(ctx, "failed to fetch user",
            "error", err,
            "step", "fetch_user")
        return fmt.Errorf("failed to fetch user: %w", err)
    }
    
    logger.InfoContext(ctx, "user processing successful")
    return nil
}

Result with Correlation IDs:

{"timestamp":"2025-01-15T10:15:30Z","level":"INFO","msg":"processing user started","correlation_id":"req_12345","service":"user_service","user_id":"user_789"}
{"timestamp":"2025-01-15T10:15:31Z","level":"ERROR","msg":"database timeout","correlation_id":"req_12345","service":"database","query":"SELECT * FROM users"}
{"timestamp":"2025-01-15T10:15:32Z","level":"ERROR","msg":"failed to fetch user","correlation_id":"req_12345","service":"user_service","user_id":"user_789"}

Now you can trace the entire request flow: user service → database timeout → user service failure, all linked by correlation_id: req_12345.

Log Level Strategy

Decision Rationale: Different log levels serve different purposes. Understanding when to use each level is crucial for effective monitoring.

| Level | Purpose | Examples | Production Volume | What Goes Wrong If Misused |
|-------|---------|----------|-------------------|----------------------------|
| DEBUG | Development only | Variable values, flow control | Disabled | Performance impact, storage costs |
| INFO | Business operations | User created, payment processed | Normal | Log spam if overused |
| WARN | Recoverable issues | Retry attempts, fallback used | Low | Alert fatigue if not actionable |
| ERROR | Service failures | Database errors, external API failures | Very low | Ignored if overused |

✅ Production Logging Example:

func (s *PaymentService) ProcessPayment(ctx context.Context, req *PaymentRequest) (*PaymentResult, error) {
    logger := s.logger.With(
        "service", "payment_service",
        "operation", "process_payment",
        "payment_id", req.PaymentID,
        "amount", req.Amount.String(),
        "correlation_id", getCorrelationID(ctx),
    )
    
    // INFO: Business operations
    logger.InfoContext(ctx, "payment processing started")
    
    if err := s.validatePaymentRequest(req); err != nil {
        // WARN: Recoverable errors - user can fix this
        logger.WarnContext(ctx, "payment validation failed",
            "error", err,
            "step", "validation")
        return nil, &ValidationError{Field: "payment_request", Message: err.Error()}
    }
    
    result, err := s.gateway.ProcessPayment(ctx, req)
    if err != nil {
        // ERROR: Critical failures - requires investigation
        logger.ErrorContext(ctx, "payment gateway failed",
            "error", err,
            "gateway", "stripe",
            "step", "gateway_call")
        return nil, fmt.Errorf("payment gateway failed: %w", err)
    }
    
    // INFO: Successful completion
    logger.InfoContext(ctx, "payment processed successfully",
        "transaction_id", result.TransactionID,
        "status", result.Status)
    
    return result, nil
}

8. Signal Handling & Graceful Shutdown

The Production Reality: Why Signal Handling Matters

Decision Rationale: Production services must handle shutdown signals gracefully to prevent data corruption, lost requests, and inconsistent state. Kubernetes and Docker send SIGTERM signals that must be handled properly.

What Goes Wrong Without Proper Signal Handling:

// DISASTER: Abrupt shutdown nightmare
func main() {
    server := &http.Server{Addr: ":8080"}
    
    // Start server
    log.Fatal(server.ListenAndServe()) // Just dies on SIGTERM
}

Production Consequences:

  • Data Loss: In-flight transactions are lost during abrupt shutdown
  • Connection Leaks: Open database connections aren't closed properly
  • Inconsistent State: Partially completed operations leave system in bad state
  • Client Errors: Clients receive connection errors instead of graceful responses
  • Kubernetes Issues: Pods get killed forcefully after grace period

Real Production Disaster: An e-commerce platform deployed to Kubernetes without graceful shutdown. During a deployment, 200 concurrent checkout operations were abruptly terminated when pods were killed. This resulted in:

  • $50,000 in lost sales (customers got errors during checkout)
  • 150 orders in inconsistent state (payment charged but order not created)
  • 3 hours of manual data recovery
  • Customer trust issues
sequenceDiagram
    participant A as Kubernetes
    participant B as Application
    participant C as HTTP Server
    
    A->>B: SIGTERM
    B->>C: Shutdown
    C-->>B: Draining connections
    B-->>A: exit 0
    
    Note right of A: Graceful shutdown sequence

Context Derivation & Lifecycle Management (Signal Handling Context)

Understanding Context Derivation: Every derived context shares certain properties with its parent while having its own lifecycle. This is fundamental to Go's context system.

Context Keys: The Idiomatic Way:

// ALWAYS use unexported, non-zero-size struct pointer keys to prevent collisions.
// (&struct{}{} is a footgun: the Go spec permits distinct zero-size variables
// to share an address, so such keys may compare equal.)
type contextKey struct{ name string }

var (
    requestIDKey     = &contextKey{"request_id"}
    correlationIDKey = &contextKey{"correlation_id"}
    userIDKey        = &contextKey{"user_id"}
    operationKey     = &contextKey{"operation"}
    batchSizeKey     = &contextKey{"batch_size"}
    userDataKey      = &contextKey{"user_data"}
)

// Helper functions for type-safe context access
func WithRequestID(parent context.Context, id string) context.Context {
    return context.WithValue(parent, requestIDKey, id)
}

func GetRequestID(ctx context.Context) string {
    if id, ok := ctx.Value(requestIDKey).(string); ok && id != "" {
        return id
    }
    return "unknown-request"
}

func WithCorrelationID(parent context.Context, id string) context.Context {
    return context.WithValue(parent, correlationIDKey, id)
}

func GetCorrelationID(ctx context.Context) string {
    if id, ok := ctx.Value(correlationIDKey).(string); ok && id != "" {
        return id
    }
    return "unknown-correlation"
}

func WithUserID(parent context.Context, id string) context.Context {
    return context.WithValue(parent, userIDKey, id)
}

func GetUserID(ctx context.Context) string {
    if id, ok := ctx.Value(userIDKey).(string); ok && id != "" {
        return id
    }
    return "anonymous"
}

Why Struct Pointer Keys Work: Because contextKey has a field, each allocation is non-zero-size, so every variable points to a distinct address and collisions are impossible across packages. This mirrors the standard library (e.g., http.ServerContextKey). Pointers to empty structs do not carry this guarantee: the spec allows distinct zero-size variables to share an address.

What Derived Contexts Share with Parent:

  • Values: All key-value pairs from parent context
  • Deadline: The most restrictive deadline (parent or child)
  • Done Channel: Parent cancellation cascades to children (demonstrated in the sketch below)
  • Error: Cancellation reason propagates

What Derived Contexts Don't Share:

  • Cancellation Functions: Each has its own cancel function
  • Lifecycle: Child can be cancelled without affecting parent
  • Memory: Child context doesn't hold parent alive unnecessarily
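
A runnable sketch of both lists, using the WithRequestID/GetRequestID helpers defined above:

func demoDerivation() {
    parent, cancelParent := context.WithCancel(context.Background())
    defer cancelParent()
    
    // Values flow down: the child sees the parent's request ID.
    child, cancelChild := context.WithCancel(WithRequestID(parent, "req_1"))
    fmt.Println(GetRequestID(child)) // "req_1"
    
    // Cancelling the child does NOT affect the parent...
    cancelChild()
    fmt.Println(parent.Err()) // <nil>
    
    // ...but cancelling the parent cascades to its children.
    grandchild, cancelGrandchild := context.WithCancel(parent)
    defer cancelGrandchild()
    cancelParent()
    <-grandchild.Done()
    fmt.Println(grandchild.Err()) // context.Canceled
}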

Context Derivation Patterns

✅ DO: Master context derivation patterns

// Context derivation hierarchy
func (s *UserService) ProcessUserBatch(ctx context.Context, userIDs []string) error {
    // 1. Operation-level context with timeout
    batchCtx, batchCancel := context.WithTimeout(ctx, 2*time.Minute)
    defer batchCancel()
    
    // Add operation-specific values
    batchCtx = context.WithValue(batchCtx, operationKey, "process_user_batch")
    batchCtx = context.WithValue(batchCtx, batchSizeKey, len(userIDs))
    
    logger := s.logger.With(
        "operation", "process_user_batch",
        "batch_size", len(userIDs),
        "correlation_id", GetCorrelationID(batchCtx),
    )
    
    logger.InfoContext(batchCtx, "batch processing started")
    
    // 2. Individual user processing contexts
    g, groupCtx := NewSafeGroup(batchCtx, logger)
    g.SetLimit(10)
    
    for _, userID := range userIDs {
        // No `userID := userID` copy needed: Go 1.22+ loop variables are per-iteration.
        g.Go(groupCtx, func() error {
            // 3. Per-user context with individual timeout
            userCtx, userCancel := context.WithTimeout(groupCtx, 30*time.Second)
            defer userCancel()
            
            // Add user-specific values
            userCtx = context.WithValue(userCtx, userIDKey, userID)
            
            return s.processUserWithContext(userCtx, userID)
        })
    }
    
    if err := g.Wait(); err != nil {
        logger.ErrorContext(batchCtx, "batch processing failed", "error", err)
        return fmt.Errorf("batch processing failed: %w", err)
    }
    
    logger.InfoContext(batchCtx, "batch processing completed successfully")
    return nil
}

// Placeholder for processUserWithContext
func (s *UserService) processUserWithContext(ctx context.Context, userID string) error {
    // Implementation would go here
    return nil
}

// Example service struct
type UserService struct {
    logger *slog.Logger
}

Context Derivation Footguns (Signal Handling Context)

❌ FOOTGUN #1: Storing large objects in context values

// WRONG: Large objects in context cause memory retention
func processUser(ctx context.Context, userData []byte) error {
    // ❌ This keeps userData alive for the entire context tree lifetime
    ctx = context.WithValue(ctx, userDataKey, userData)
    return longRunningOperation(ctx)
}

✅ The Fix: Pass data as function parameters

// GOOD: Pass data explicitly
func processUser(ctx context.Context, userData []byte) error {
    return longRunningOperation(ctx, userData) // ✅ Explicit parameter
}

// Placeholder function
func longRunningOperation(ctx context.Context, data []byte) error {
    // Implementation would process the data
    return nil
}

❌ FOOTGUN #2: Not cancelling derived contexts

// WRONG: Context leak - derived context never cancelled
func performOperation(ctx context.Context) error {
    opCtx, _ := context.WithTimeout(ctx, time.Minute) // ❌ Cancel function ignored
    
    return doWork(opCtx) // Context may leak resources
}

✅ The Fix: Always defer cancel

// GOOD: Proper context cleanup
func performOperation(ctx context.Context) error {
    opCtx, cancel := context.WithTimeout(ctx, time.Minute)
    defer cancel() // ✅ Always clean up
    
    return doWork(opCtx)
}

// Placeholder function
func doWork(ctx context.Context) error {
    // Implementation would do the work
    return nil
}

Advanced Context Patterns

Context Builder for Complex Scenarios (Use Sparingly):

// ContextBuilder for request processing (only when needed)
type RequestContextBuilder struct {
    parent context.Context
}

func NewRequestContext(parent context.Context) *RequestContextBuilder {
    if parent == nil {
        parent = context.Background()
    }
    return &RequestContextBuilder{parent: parent}
}

func (b *RequestContextBuilder) WithTimeout(timeout time.Duration) (*RequestContextBuilder, context.CancelFunc) {
    ctx, cancel := context.WithTimeout(b.parent, timeout)
    b.parent = ctx
    return b, cancel
}

func (b *RequestContextBuilder) WithRequestID(id string) *RequestContextBuilder {
    b.parent = context.WithValue(b.parent, requestIDKey, id)
    return b
}

func (b *RequestContextBuilder) WithUserID(id string) *RequestContextBuilder {
    b.parent = context.WithValue(b.parent, userIDKey, id)
    return b
}

func (b *RequestContextBuilder) Build() context.Context {
    return b.parent
}

// Usage for complex request setup
func (h *HTTPHandler) setupRequestContext(r *http.Request) (context.Context, context.CancelFunc) {
    builder := NewRequestContext(r.Context())
    
    // Add request timeout
    builder, cancel := builder.WithTimeout(30 * time.Second)
    
    // Add request metadata
    requestID := r.Header.Get("X-Request-ID")
    if requestID == "" {
        requestID = generateRequestID()
    }
    
    userID := extractUserIDFromAuth(r)
    
    ctx := builder.
        WithRequestID(requestID).
        WithUserID(userID).
        Build()
    
    return ctx, cancel
}

// Placeholder functions
func generateRequestID() string {
    return fmt.Sprintf("req_%d", time.Now().UnixNano())
}

func extractUserIDFromAuth(r *http.Request) string {
    // Implementation would extract user ID from auth header
    return "user_123"
}

// Example HTTP handler struct
type HTTPHandler struct{}

Context Cancellation with Cause (Go 1.20+)

💡 Advanced: Context cancellation reasons for better debugging

import "context"

// WithCancelCause enables rich cancellation reasons
func processWithDetailedCancellation(ctx context.Context) error {
    ctx, cancel := context.WithCancelCause(ctx)
    defer cancel(nil) // Call with nil if no specific cause
    
    go func() {
        // Background operation that might fail
        if err := riskyOperation(); err != nil {
            // Cancel with specific cause
            cancel(fmt.Errorf("risky operation failed: %w", err))
        }
    }()
    
    // Wait for completion or cancellation
    <-ctx.Done()
    
    // Check cancellation cause
    if cause := context.Cause(ctx); cause != nil {
        return fmt.Errorf("operation cancelled: %w", cause)
    }
    
    return ctx.Err() // Standard cancellation
}

// Placeholder function
func riskyOperation() error {
    // Simulate a risky operation that might fail
    return nil
}

When to Use: When you need to distinguish between different cancellation reasons for debugging or error handling.

Graceful Shutdown Implementation

✅ DO: Implement comprehensive signal handling

import (
    "context"
    "os"
    "os/signal"
    "syscall" 
    "time"
    "log/slog"
    "fmt"
)

// GracefulShutdown manages clean application termination
type GracefulShutdown struct {
    logger   *slog.Logger
    timeout  time.Duration
    handlers []ShutdownHandler
}

type ShutdownHandler interface {
    Shutdown(ctx context.Context) error
    Name() string
}

func NewGracefulShutdown(logger *slog.Logger, timeout time.Duration) *GracefulShutdown {
    return &GracefulShutdown{
        logger:  logger,
        timeout: timeout,
    }
}

func (gs *GracefulShutdown) RegisterHandler(handler ShutdownHandler) {
    gs.handlers = append(gs.handlers, handler)
}

func (gs *GracefulShutdown) WaitForSignal(ctx context.Context) {
    sigChan := make(chan os.Signal, 1)
    
    // Register for shutdown signals
    signal.Notify(sigChan,
        syscall.SIGINT,  // Ctrl+C
        syscall.SIGTERM, // Kubernetes/Docker termination
        syscall.SIGQUIT, // Quit signal
    )
    
    select {
    case sig := <-sigChan:
        gs.logger.InfoContext(ctx, "shutdown signal received", "signal", sig.String())
        gs.performShutdown(ctx)
    case <-ctx.Done():
        gs.logger.InfoContext(ctx, "context cancelled, initiating shutdown")
        gs.performShutdown(ctx)
    }
}

func (gs *GracefulShutdown) performShutdown(ctx context.Context) {
    shutdownCtx, cancel := context.WithTimeout(context.Background(), gs.timeout)
    defer cancel()
    
    gs.logger.InfoContext(shutdownCtx, "graceful shutdown initiated",
        "timeout", gs.timeout,
        "handlers_count", len(gs.handlers))
    
    // Use SafeGroup for concurrent shutdown
    g, gCtx := NewSafeGroup(shutdownCtx, gs.logger)
    
    for _, handler := range gs.handlers {
        // Go 1.22+ loop variables are per-iteration, so no explicit copy is needed.
        g.Go(gCtx, func() error {
            start := time.Now()
            err := handler.Shutdown(gCtx)
            duration := time.Since(start)
            
            if err != nil {
                gs.logger.ErrorContext(gCtx, "component shutdown failed",
                    "component", handler.Name(),
                    "duration", duration,
                    "error", err)
                return fmt.Errorf("shutdown failed for %s: %w", handler.Name(), err)
            }
            
            gs.logger.InfoContext(gCtx, "component shutdown completed",
                "component", handler.Name(),
                "duration", duration)
            
            return nil
        })
    }
    
    if err := g.Wait(); err != nil {
        gs.logger.ErrorContext(shutdownCtx, "shutdown completed with errors", "error", err)
    } else {
        gs.logger.InfoContext(shutdownCtx, "graceful shutdown completed successfully")
    }
}

Shutdown Handlers for Different Components

import (
    "net/http"
    "database/sql"
)

// Shutdown handlers for different components
type HTTPServerHandler struct {
    server *http.Server
    logger *slog.Logger
}

func (h *HTTPServerHandler) Shutdown(ctx context.Context) error {
    h.logger.InfoContext(ctx, "shutting down HTTP server")
    return h.server.Shutdown(ctx)
}

func (h *HTTPServerHandler) Name() string {
    return "http_server"
}

type DatabaseHandler struct {
    db     *sql.DB
    logger *slog.Logger
}

func (h *DatabaseHandler) Shutdown(ctx context.Context) error {
    h.logger.InfoContext(ctx, "closing database connections")
    return h.db.Close()
}

func (h *DatabaseHandler) Name() string {
    return "database"
}

type BackgroundWorkerHandler struct {
    cancel context.CancelFunc
    done   <-chan struct{}
    logger *slog.Logger
}

func (h *BackgroundWorkerHandler) Shutdown(ctx context.Context) error {
    h.logger.InfoContext(ctx, "stopping background workers")
    h.cancel() // Signal workers to stop
    
    select {
    case <-h.done:
        return nil
    case <-ctx.Done():
        return fmt.Errorf("background workers shutdown timeout: %w", ctx.Err())
    }
}

func (h *BackgroundWorkerHandler) Name() string {
    return "background_workers"
}

// Cache handler for in-memory caches
type CacheHandler struct {
    cache  map[string]interface{}
    logger *slog.Logger
}

func (h *CacheHandler) Shutdown(ctx context.Context) error {
    h.logger.InfoContext(ctx, "clearing cache")
    // Clear cache to free memory
    for k := range h.cache {
        delete(h.cache, k)
    }
    return nil
}

func (h *CacheHandler) Name() string {
    return "cache"
}

Production Application with Signal Handling

✅ Complete Production Example:

// Application demonstrates signal handling integration
type Application struct {
    httpServer  *http.Server
    db          *sql.DB
    logger      *slog.Logger
    shutdown    *GracefulShutdown
    // NOTE: NO context stored - contexts flow through method parameters
}

func (app *Application) Run(ctx context.Context) error {
    // Setup shutdown handlers
    app.shutdown.RegisterHandler(&HTTPServerHandler{
        server: app.httpServer,
        logger: app.logger,
    })
    
    app.shutdown.RegisterHandler(&DatabaseHandler{
        db:     app.db,
        logger: app.logger,
    })
    
    // Start services with proper context management
    g, appCtx := NewSafeGroup(ctx, app.logger)
    
    // Start HTTP server
    g.Go(appCtx, func() error {
        app.logger.InfoContext(appCtx, "starting HTTP server", "addr", app.httpServer.Addr)
        if err := app.httpServer.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
            return fmt.Errorf("HTTP server failed: %w", err)
        }
        return nil
    })
    
    // Start background workers
    workerCtx, workerCancel := context.WithCancel(appCtx)
    workerDone := make(chan struct{})
    
    app.shutdown.RegisterHandler(&BackgroundWorkerHandler{
        cancel: workerCancel,
        done:   workerDone,
        logger: app.logger,
    })
    
    g.Go(appCtx, func() error {
        defer close(workerDone)
        return app.runBackgroundWorkers(workerCtx)
    })
    
    // Wait for shutdown signal
    g.Go(appCtx, func() error {
        app.shutdown.WaitForSignal(appCtx)
        return nil
    })
    
    return g.Wait()
}

func (app *Application) runBackgroundWorkers(ctx context.Context) error {
    workerCtx := context.WithValue(ctx, operationKey, "background_workers")
    
    app.logger.InfoContext(workerCtx, "starting background workers")
    
    g, gCtx := NewSafeGroup(workerCtx, app.logger)
    
    // Email worker
    g.Go(gCtx, func() error {
        return app.runEmailWorker(gCtx)
    })
    
    // Data sync worker
    g.Go(gCtx, func() error {
        return app.runDataSyncWorker(gCtx)
    })
    
    // Metrics collection worker
    g.Go(gCtx, func() error {
        return app.runMetricsWorker(gCtx)
    })
    
    return g.Wait()
}

func (app *Application) runEmailWorker(ctx context.Context) error {
    emailCtx := context.WithValue(ctx, operationKey, "email_worker")
    
    ticker := time.NewTicker(30 * time.Second)
    defer ticker.Stop()
    
    for {
        select {
        case <-ticker.C:
            // Process emails with operation-specific timeout
            processCtx, cancel := context.WithTimeout(emailCtx, 25*time.Second)
            
            if err := app.processEmailQueue(processCtx); err != nil {
                app.logger.ErrorContext(emailCtx, "email processing failed", "error", err)
                // Continue processing despite errors
            }
            
            cancel()
            
        case <-ctx.Done():
            app.logger.InfoContext(emailCtx, "email worker stopping")
            return ctx.Err()
        }
    }
}

func (app *Application) runDataSyncWorker(ctx context.Context) error {
    syncCtx := context.WithValue(ctx, operationKey, "data_sync_worker")
    
    ticker := time.NewTicker(5 * time.Minute)
    defer ticker.Stop()
    
    for {
        select {
        case <-ticker.C:
            syncOpCtx, cancel := context.WithTimeout(syncCtx, 4*time.Minute)
            
            if err := app.syncData(syncOpCtx); err != nil {
                app.logger.ErrorContext(syncCtx, "data sync failed", "error", err)
            }
            
            cancel()
            
        case <-ctx.Done():
            app.logger.InfoContext(syncCtx, "data sync worker stopping")
            return ctx.Err()
        }
    }
}

func (app *Application) runMetricsWorker(ctx context.Context) error {
    metricsCtx := context.WithValue(ctx, operationKey, "metrics_worker")
    
    ticker := time.NewTicker(10 * time.Second)
    defer ticker.Stop()
    
    for {
        select {
        case <-ticker.C:
            metricsOpCtx, cancel := context.WithTimeout(metricsCtx, 5*time.Second)
            
            if err := app.collectMetrics(metricsOpCtx); err != nil {
                app.logger.ErrorContext(metricsCtx, "metrics collection failed", "error", err)
            }
            
            cancel()
            
        case <-ctx.Done():
            app.logger.InfoContext(metricsCtx, "metrics worker stopping")
            return ctx.Err()
        }
    }
}

// Placeholder methods
func (app *Application) processEmailQueue(ctx context.Context) error {
    // Implementation would process email queue
    return nil
}

func (app *Application) syncData(ctx context.Context) error {
    // Implementation would sync data with external systems
    return nil
}

func (app *Application) collectMetrics(ctx context.Context) error {
    // Implementation would collect and export metrics
    return nil
}

// Production main function with signal handling
func main() {
    logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
        Level: slog.LevelInfo,
    }))
    
    app, err := NewApplication(logger)
    if err != nil {
        logger.Error("failed to initialize application", "error", err)
        os.Exit(1)
    }
    defer app.Close()
    
    // Setup graceful shutdown with 30-second timeout
    app.shutdown = NewGracefulShutdown(logger, 30*time.Second)
    
    // Run application with root context
    ctx := context.Background()
    if err := app.Run(ctx); err != nil {
        logger.Error("application failed", "error", err)
        os.Exit(1)
    }
    
    logger.Info("application stopped successfully")
}

// Placeholder constructor and cleanup
func NewApplication(logger *slog.Logger) (*Application, error) {
    // Implementation would initialize HTTP server, database, etc.
    server := &http.Server{
        Addr:         ":8080",
        ReadTimeout:  15 * time.Second,
        WriteTimeout: 15 * time.Second,
    }
    
    return &Application{
        httpServer: server,
        logger:     logger,
    }, nil
}

func (app *Application) Close() {
    // Cleanup any remaining resources
    if app.db != nil {
        app.db.Close()
    }
}

// SafeGroup placeholder (would be imported from previous sections)
type SafeGroup struct {
    // Implementation details would be here
}

func NewSafeGroup(ctx context.Context, logger *slog.Logger) (*SafeGroup, context.Context) {
    // Implementation would return panic-safe errgroup wrapper
    return &SafeGroup{}, ctx
}

func (sg *SafeGroup) Go(ctx context.Context, fn func() error) {
    // Implementation would execute function with panic recovery
}

func (sg *SafeGroup) Wait() error {
    // Implementation would wait for all goroutines to complete
    return nil
}

Context Best Practices Summary (Signal Handling Context)

✅ Context Do's:

  • Always pass context as first parameter
  • Use context.WithTimeout for operations with time limits
  • Derive contexts for adding values or deadlines
  • Always defer cancel() when creating cancellable contexts
  • Use context for cancellation, deadlines, and request-scoped values
  • Use struct pointer keys to prevent collisions

❌ Context Don'ts:

  • Never store context in struct fields
  • Don't use context for passing optional parameters
  • Don't put large objects in context values
  • Don't ignore context cancellation in loops
  • Don't create contexts without cancel functions
  • Don't use string keys for context values

Signal Handling Best Practices

✅ Graceful Shutdown Do's:

  • Handle SIGTERM, SIGINT, and SIGQUIT signals (see the signal.NotifyContext sketch below)
  • Use timeout contexts for shutdown operations
  • Shut down components concurrently when possible
  • Log shutdown progress for debugging
  • Clean up resources (database connections, file handles)
  • Allow in-flight requests to complete
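
For small services that don't need per-component handlers, the standard library covers the first two do's directly via os/signal.NotifyContext. A minimal sketch:

import (
    "context"
    "net/http"
    "os/signal"
    "syscall"
    "time"
)

func main() {
    // ctx is cancelled on the first matching signal.
    ctx, stop := signal.NotifyContext(context.Background(),
        syscall.SIGINT, syscall.SIGTERM, syscall.SIGQUIT)
    defer stop()
    
    srv := &http.Server{Addr: ":8080"}
    go srv.ListenAndServe()
    
    <-ctx.Done() // block until a shutdown signal arrives
    
    // Bounded shutdown window; in-flight requests get 30 seconds to drain.
    shutdownCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()
    _ = srv.Shutdown(shutdownCtx)
}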

❌ Graceful Shutdown Don'ts:

  • Don't ignore shutdown signals
  • Don't use infinite timeouts for shutdown
  • Don't forget to close background workers
  • Don't block shutdown on non-critical operations
  • Don't leak goroutines during shutdown

Production Impact: Proper signal handling prevents:

  • Data corruption from interrupted transactions
  • Resource leaks from unclosed connections
  • Client errors from abrupt disconnections
  • SLA violations from ungraceful restarts
  • Manual cleanup work after deployments

9. Performance & Memory Optimization

Understanding Go's Memory Model

Decision Rationale: Go's garbage collector is optimized for low-latency applications. Working with it, rather than against it, is crucial for performance.

The Four Pillars of Go Memory Performance

  1. Stack allocation is free - avoid heap escapes
  2. Object pooling - reuse expensive objects
  3. Pre-allocation - avoid slice/map growth
  4. Cache-friendly layouts - minimize pointer chasing

Escape Analysis Mastery

Understanding What Escapes to Heap:

// STAYS ON STACK: Returning values
func createUser(name string) User {
    return User{Name: name} // ✅ Stack allocated
}

// ESCAPES TO HEAP: Returning pointers
func createUserPtr(name string) *User {
    return &User{Name: name} // ❌ Escapes to heap
}

// SOPHISTICATED: Control escape behavior
func processUsers(users []User) []Result {
    // Pre-allocate prevents multiple heap allocations
    results := make([]Result, 0, len(users)) // ✅ Single allocation
    
    for _, user := range users {
        result := processUser(user) // ✅ Value processing on stack
        results = append(results, result)
    }
    
    return results
}

// Placeholder for processUser function
func processUser(user User) Result {
    return Result{UserID: user.Name} // Simplified processing
}

// Example types for demonstration
type User struct {
    Name string
}

type Result struct {
    UserID string
}

How to Check Escape Analysis: You can use the Go compiler flags to see escape analysis decisions:

go build -gcflags="-m" your_file.go
# For more detailed output:
go build -gcflags="-m -m" your_file.go

This will print lines like ./file.go:10:2: moved to heap: varName indicating what escapes.

Why This Matters:

  • Performance: Stack allocation is ~100x faster than heap allocation (see the benchmark sketch below)
  • GC Pressure: Stack allocations don't contribute to garbage collection
  • Memory Usage: Stack memory is automatically managed
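
A quick way to observe the gap is a benchmark pair over the createUser/createUserPtr functions above. A sketch: absolute numbers vary by CPU and Go version, but go test -bench=. -benchmem should report 0 allocs/op for the value version. The package-level sinks defeat inlining-based optimizations that could otherwise keep the pointer on the stack:

var (
    sinkUser    User
    sinkUserPtr *User
)

func BenchmarkCreateUserValue(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        sinkUser = createUser("alice") // stack allocated: 0 allocs/op
    }
}

func BenchmarkCreateUserPtr(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        sinkUserPtr = createUserPtr("alice") // escapes: 1 alloc/op
    }
}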

What Goes Wrong with Heap Escapes:

// DISASTER: Unnecessary heap allocations
func processRequests(requests []Request) []*Response {
    var responses []*Response
    
    for _, req := range requests {
        // Every response escapes to heap!
        response := &Response{
            ID:   req.ID,
            Data: processRequestData(req),
        }
        responses = append(responses, response) // Growing slice causes more allocations
    }
    
    return responses
}

Production Impact: A high-traffic API was allocating 50,000 response objects per second on the heap. Each allocation required garbage collection, causing 99th percentile latency to spike from 50ms to 300ms during GC pauses.

The Fix: Use values instead of pointers when possible

// BETTER: Stack-based processing
func processRequests(requests []Request) []Response {
    // Pre-allocate with known capacity
    responses := make([]Response, 0, len(requests))
    
    for _, req := range requests {
        // Value stays on stack
        response := Response{
            ID:   req.ID,
            Data: processRequestData(req),
        }
        responses = append(responses, response)
    }
    
    return responses
}

// Placeholder functions
type Request struct {
    ID string
}

type Response struct {
    ID   string
    Data string
}

func processRequestData(req Request) string {
    return "processed_" + req.ID
}

The Slice Memory Leak Footgun

🚨 CRITICAL PRODUCTION FOOTGUN: Taking a small slice from a very large one can prevent the large slice's underlying array from being garbage collected.

❌ The Silent Memory Leak:

func getFirstN(largeSlice []byte, n int) []byte {
    // Returns a slice header pointing to the original large array.
    // If largeSlice is 1GB, the GC cannot reclaim that 1GB
    // as long as the small returned slice is alive.
    return largeSlice[:n]
}

func processLogFile(filename string) ([]byte, error) {
    // Read entire 100MB log file
    data, err := os.ReadFile(filename)
    if err != nil {
        return nil, err
    }
    
    // Extract just the header (first 1KB)
    header := data[:1024] // ❌ Keeps entire 100MB in memory!
    return header, nil
}

Production War Story: A data processing service reading large files (1GB+) and extracting small headers (1KB) suffered from massive memory usage. The small header slices were preventing garbage collection of the entire large files, causing out-of-memory crashes. The service was using 50GB of RAM to store what should have been 50KB of actual data.

What Happens:

  1. os.ReadFile allocates 100MB array for file contents
  2. data[:1024] creates slice header pointing to same 100MB array
  3. Even though you only need 1KB, entire 100MB stays in memory
  4. GC cannot free the 100MB because slice header still references it
  5. Multiply by 1000 files = 100GB memory usage for 1MB of actual data

✅ The Fix: Explicitly Copy the Data:

func getFirstNSafe(largeSlice []byte, n int) []byte {
    // Create a new slice of the exact size needed
    result := make([]byte, n)
    // Copy only the data we need
    copy(result, largeSlice[:n])
    // Now the reference to largeSlice's array is gone,
    // and it can be garbage collected
    return result
}

func processLogFileSafe(filename string) ([]byte, error) {
    data, err := os.ReadFile(filename)
    if err != nil {
        return nil, err
    }
    
    // Extract header with explicit copy
    header := make([]byte, 1024)
    copy(header, data[:1024]) // ✅ Only keeps 1KB, not 100MB
    return header, nil
}

When This Matters: Any time you're sub-slicing large data structures and keeping the small slice around long-term.

Additional Slice Footgun: Slice truncation with pointers

// MEMORY LEAK: Truncating slice with pointers
func removeFirst(items []*Item) []*Item {
    truncated := items[1:] // Still references first item!
    return truncated
}

// FIX: Nil out the reference
func removeFirstSafe(items []*Item) []*Item {
    items[0] = nil // Allow GC to collect the first item
    return items[1:]
}

// Example type for demonstration
type Item struct {
    Data string
}

Object Pooling for High-Frequency Allocations

✅ DO: Use sync.Pool for expensive, reusable objects

import "sync"

// BufferPool manages reusable byte buffers
type BufferPool struct {
    pool sync.Pool
}

func NewBufferPool() *BufferPool {
    return &BufferPool{
        pool: sync.Pool{
            New: func() interface{} {
                // Note: pooling a bare slice re-boxes it on every Put
                // (staticcheck SA6002); pooling *bytes.Buffer avoids that cost.
                return make([]byte, 0, 1024) // Reasonable initial capacity
            },
            },
        },
    }
}

func (p *BufferPool) Get() []byte {
    buf := p.pool.Get().([]byte)
    return buf[:0] // Reset length but keep capacity
}

func (p *BufferPool) Put(buf []byte) {
    // Only pool buffers within reasonable size limits
    if cap(buf) > 64*1024 {
        return // Don't pool very large buffers
    }
    
    // CRITICAL: Reset slice and prevent hidden capacity retention
    p.pool.Put(buf[:0:cap(buf)])
}

// Usage in high-throughput service
func (h *HTTPHandler) processRequest(w http.ResponseWriter, r *http.Request) {
    buf := h.bufferPool.Get()
    
    // Use buffer for processing
    buf = processRequestData(r, buf)
    w.Write(buf)
    
    // Put the grown buffer back. A `defer h.bufferPool.Put(buf)` at the top
    // would capture the empty pre-append slice and lose the reuse benefit.
    h.bufferPool.Put(buf)
}

// Example HTTP handler struct and method
type HTTPHandler struct {
    bufferPool *BufferPool
}

func processRequestData(r *http.Request, buf []byte) []byte {
    // Simulate processing request data into buffer
    data := "processed: " + r.URL.Path
    return append(buf, data...)
}

Production Impact: High-traffic APIs report 50-70% reduction in garbage generation and significantly improved 99th percentile latency.

What Goes Wrong Without Object Pooling:

// DISASTER: Constant allocation pressure
func handleRequests(requests <-chan Request) {
    for req := range requests {
        // Creates new buffer every time!
        buf := make([]byte, 1024) // ❌ Constant heap allocation
        processRequest(req, buf)
        // Buffer is garbage collected, creating GC pressure
    }
}

Production War Story: A message processing service was creating 100,000 temporary buffers per second. Each buffer allocation triggered garbage collection, causing periodic 100ms latency spikes that violated SLA requirements. Implementing buffer pooling reduced allocation rate by 95% and eliminated latency spikes.

Memory-Efficient Data Structures

✅ DO: Optimize struct field ordering for memory efficiency

// BAD: Poor field alignment wastes memory
type BadUser struct {
    IsActive bool      // 1 byte + 7 bytes padding
    ID       int64     // 8 bytes
    Name     string    // 16 bytes
    Age      int32     // 4 bytes + 4 bytes padding
}
// Memory layout: [bool•••••••][int64---][string--------][int32••••]
// Total: 40 bytes with padding waste

// GOOD: Optimized field alignment  
type GoodUser struct {
    ID       int64     // 8 bytes
    Name     string    // 16 bytes
    Age      int32     // 4 bytes
    IsActive bool      // 1 byte + 3 bytes natural padding
}
// Memory layout: [int64---][string--------][int32][bool•••]
// Total: 32 bytes, 20% memory savings (• = padding)

Why This Matters: In high-volume applications, struct alignment can save significant memory. With millions of structs, 20% memory savings means better cache performance and lower memory costs.

Production Impact: A trading system reduced memory usage by 30% just by reordering struct fields in their high-frequency data structures. This improved L1 cache hit rates and reduced latency by 15%.
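
You don't have to count padding by hand: unsafe.Sizeof reports the actual layout, and the fieldalignment analyzer in golang.org/x/tools can flag wasteful orderings automatically. A minimal check for the structs above (sizes are for 64-bit platforms):

import (
    "fmt"
    "unsafe"
)

func main() {
    fmt.Println("BadUser: ", unsafe.Sizeof(BadUser{}))  // 40
    fmt.Println("GoodUser:", unsafe.Sizeof(GoodUser{})) // 32
}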

What Goes Wrong with Poor Alignment:

// WASTEFUL: Poor memory layout
type WastefulOrder struct {
    IsProcessed bool     // 1 byte + 7 bytes padding
    OrderID     int64    // 8 bytes  
    UserID      int64    // 8 bytes
    IsPaid      bool     // 1 byte + 7 bytes padding
    Amount      float64  // 8 bytes
    IsActive    bool     // 1 byte + 7 bytes padding
}
// Total: 48 bytes (21 bytes wasted on padding!)

// EFFICIENT: Optimized memory layout
type EfficientOrder struct {
    OrderID     int64    // 8 bytes
    UserID      int64    // 8 bytes  
    Amount      float64  // 8 bytes
    IsProcessed bool     // 1 byte
    IsPaid      bool     // 1 byte
    IsActive    bool     // 1 byte + 5 bytes padding
}
// Total: 32 bytes (16 bytes saved per struct)

String Operations and Performance

✅ DO: Use strings.Builder for efficient string construction

import "strings"

// EXCELLENT: Pre-calculated string building
func buildQuery(table string, fields []string, conditions []string) string {
    var builder strings.Builder
    
    // Calculate capacity to avoid reallocations
    estimatedSize := len("SELECT ") + len(" FROM ") + len(table)
    for _, field := range fields {
        estimatedSize += len(field) + 2 // ", "
    }
    for _, condition := range conditions {
        estimatedSize += len(condition) + 5 // " AND "  
    }
    
    builder.Grow(estimatedSize) // Pre-allocate
    
    builder.WriteString("SELECT ")
    for i, field := range fields {
        if i > 0 {
            builder.WriteString(", ")
        }
        builder.WriteString(field)
    }
    
    builder.WriteString(" FROM ")
    builder.WriteString(table)
    
    if len(conditions) > 0 {
        builder.WriteString(" WHERE ")
        for i, condition := range conditions {
            if i > 0 {
                builder.WriteString(" AND ")
            }
            builder.WriteString(condition)
        }
    }
    
    return builder.String()
}

What Goes Wrong with String Concatenation:

// DISASTER: Quadratic performance nightmare
func buildQueryBad(fields []string) string {
    query := "SELECT "
    for i, field := range fields {
        if i > 0 {
            query += ", " // Creates new string every iteration!
        }
        query += field // Another new string!
    }
    return query
}

Production Horror: A report generation service used string concatenation to build SQL queries with 1000+ fields. Each field addition created a new string, copying all previous data. For 1000 fields, this performed 500,000+ string copies, making query building take longer than query execution.

Why String Concatenation Is Quadratic:

Iteration 1: "SELECT " (7 chars) → copy 7 chars
Iteration 2: "SELECT field1, " (15 chars) → copy 15 chars  
Iteration 3: "SELECT field1, field2, " (23 chars) → copy 23 chars
...
Total copies: 7 + 15 + 23 + ... = O(n²) complexity

strings.Builder Performance:

func demonstrateStringPerformance() {
    // BAD: O(n²) time complexity
    start := time.Now()
    result := ""
    for i := 0; i < 10000; i++ {
        result += fmt.Sprintf("field%d, ", i) // Copies entire string each time
    }
    fmt.Printf("Concatenation: %v\n", time.Since(start))
    
    // GOOD: O(n) time complexity
    start = time.Now()
    var builder strings.Builder
    builder.Grow(200000) // Pre-allocate capacity
    for i := 0; i < 10000; i++ {
        builder.WriteString(fmt.Sprintf("field%d, ", i)) // Appends to buffer
    }
    result = builder.String()
    fmt.Printf("Builder: %v\n", time.Since(start))
}

Managing Memory with GOMEMLIMIT (Go 1.19+)

Decision Rationale: GOMEMLIMIT enables soft memory limits, allowing applications to stay within memory budgets while maintaining performance. This is crucial for containerized environments with memory constraints.

✅ DO: Set memory limits for production applications

import (
    "runtime/debug"
    "os"
    "strconv"
    "fmt"
)

func setupMemoryLimit() {
    // Option 1: Environment variable (recommended for containers)
    // export GOMEMLIMIT=1GiB
    
    // Option 2: Programmatic setting
    if limit := os.Getenv("MEMORY_LIMIT_MB"); limit != "" {
        if mb, err := strconv.Atoi(limit); err == nil {
            // Set limit with 10% headroom for non-Go memory
            limitBytes := int64(mb) * 1024 * 1024 * 9 / 10
            debug.SetMemoryLimit(limitBytes)
            
            fmt.Printf("Memory limit set to %d MB\n", mb*9/10)
        }
    }
}

// Production example: Container startup
func main() {
    setupMemoryLimit()
    
    // Your application logic...
}

Container Integration:

# Set memory limit in container
ENV GOMEMLIMIT=1GiB
# Container memory limit should be higher (e.g., 1.2GB) to account for non-Go memory

Why This Matters: Without GOMEMLIMIT, Go applications can consume all available memory before triggering garbage collection, leading to OOM kills in containers.

Production War Story: A microservice running in Kubernetes pods with 2GB memory limits was getting OOM killed during traffic spikes. Setting GOMEMLIMIT=1.8GiB allowed the GC to kick in earlier, preventing container kills and maintaining stable performance.

What Goes Wrong Without Memory Limits:

// DANGEROUS: Unbounded memory growth
func processLargeDataset() {
    var cache [][]byte
    
    for i := 0; i < 1000000; i++ {
        // Each iteration adds 1MB to memory
        data := make([]byte, 1024*1024)
        cache = append(cache, data)
        
        // Without GOMEMLIMIT, this keeps growing until OOM
        // GC doesn't kick in until it's too late
    }
}

The Fix with GOMEMLIMIT:

  • GC triggers more aggressively as memory approaches limit
  • Application stays within container memory boundaries
  • Prevents sudden OOM kills during traffic spikes
  • Maintains predictable performance characteristics
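
A related detail worth knowing: passing a negative value to debug.SetMemoryLimit queries the current limit without changing it, which is handy for logging the effective configuration at startup. A small sketch:

func logMemoryLimit(logger *slog.Logger) {
    // A negative input does not adjust the limit; it returns the current one.
    current := debug.SetMemoryLimit(-1)
    logger.Info("effective soft memory limit", "gomemlimit_bytes", current)
}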

Performance Profiling Integration

✅ DO: Use Go's built-in profiling tools

import (
    _ "net/http/pprof" // Enable pprof endpoints
    "net/http"
    "log"
)

// Add to your main function or setup
func enableProfiling() {
    go func() {
        log.Println("pprof server starting on :6060")
        log.Println(http.ListenAndServe(":6060", nil))
    }()
}

// Usage:
// go tool pprof http://localhost:6060/debug/pprof/heap
// go test -bench=. -cpuprofile=cpu.prof
// go tool pprof cpu.prof

// Example benchmark with memory profiling
func BenchmarkDataProcessing(b *testing.B) {
    data := generateLargeDataset(10000)
    
    b.ResetTimer()
    b.ReportAllocs() // Show allocation stats
    
    for i := 0; i < b.N; i++ {
        result := processData(data)
        _ = result // Prevent optimization
    }
}

// Placeholder functions for profiling example
func generateLargeDataset(size int) []int {
    data := make([]int, size)
    for i := range data {
        data[i] = i
    }
    return data
}

func processData(data []int) []int {
    result := make([]int, len(data))
    for i, v := range data {
        result[i] = v * v
    }
    return result
}

Production Profiling Workflow:

# 1. Capture heap profile from running service
go tool pprof http://your-service:6060/debug/pprof/heap

# 2. Analyze memory usage
(pprof) top10
(pprof) list functionName
(pprof) web

# 3. Capture CPU profile during load test
go tool pprof http://your-service:6060/debug/pprof/profile?seconds=30

# 4. Find CPU hotspots
(pprof) top10
(pprof) web

What Profiling Reveals:

// Example: Memory leak discovered through profiling
func leakyFunction() {
    cache := make(map[string][]byte)
    
    for i := 0; i < 1000000; i++ {
        key := fmt.Sprintf("key_%d", i)
        // This cache grows without bounds!
        cache[key] = make([]byte, 1024)
    }
    
    // Profile shows this function consuming 1GB+ memory
    // Solution: Add cache eviction or size limits
}

Performance Anti-Patterns Checklist

❌ Critical Performance Mistakes:

| Anti-Pattern | Why It's Slow | Better Approach | Production Impact |
|--------------|---------------|-----------------|-------------------|
| Growing slices without capacity | Multiple reallocations | make([]T, 0, expectedSize) | 10x slower allocation performance |
| String concatenation with + | Creates new strings each time | strings.Builder | Quadratic performance degradation |
| Not pooling expensive objects | Constant allocation/GC pressure | sync.Pool | High latency spikes during GC |
| Unnecessary pointer escapes | Heap allocation overhead | Return values, not pointers | 100x allocation performance penalty |
| Large struct copying | Expensive memory copies | Pass pointers for large structs | CPU cache misses, memory bandwidth waste |
| Poor struct field alignment | Memory waste, cache misses | Order fields by size (large to small) | 20-40% memory waste |
| Sub-slicing large arrays | Prevents GC of large backing arrays | Explicit copying with make + copy | Memory leaks, OOM crashes |
| Ignoring GOMEMLIMIT | OOM kills in containers | Set appropriate memory limits | Service instability, container restarts |

Real-World Performance Case Studies

Case Study 1: API Response Serialization:

// BEFORE: Slow JSON marshaling
func (h *Handler) getUsers(w http.ResponseWriter, r *http.Request) {
    users := h.service.GetUsers(r.Context())
    
    // Encodes straight to the response with no buffer reuse
    json.NewEncoder(w).Encode(users) // ❌ Fresh allocations on every request
}

// AFTER: Pooled response buffers. Note: *json.Encoder has no Reset method,
// so pool the bytes.Buffer instead — constructing an encoder per request is cheap.
type Handler struct {
    // service (providing GetUsers, as used below) elided for brevity
    bufferPool sync.Pool // holds *bytes.Buffer; New: func() any { return new(bytes.Buffer) }
}

func (h *Handler) getUsersOptimized(w http.ResponseWriter, r *http.Request) {
    users := h.service.GetUsers(r.Context())
    
    buf := h.bufferPool.Get().(*bytes.Buffer)
    buf.Reset()
    defer h.bufferPool.Put(buf)
    
    if err := json.NewEncoder(buf).Encode(users); err != nil {
        http.Error(w, "encoding failed", http.StatusInternalServerError)
        return
    }
    
    w.Write(buf.Bytes())
}

Result: 60% reduction in response time, 80% reduction in allocations.

Case Study 2: Database Query Building

// BEFORE: String concatenation nightmare
func buildComplexQuery(filters []Filter) string {
    query := "SELECT * FROM users WHERE 1=1"
    for _, filter := range filters {
        query += " AND " + filter.Column + " = '" + filter.Value + "'" // ❌ Quadratic
    }
    return query
}

// AFTER: Pre-allocated builder
// (values are still inlined here only to keep the performance comparison
// like-for-like — real code should use parameterized queries; see below)
func buildComplexQueryOptimized(filters []Filter) string {
    const baseQuery = "SELECT * FROM users WHERE 1=1"
    
    // Estimate capacity
    capacity := len(baseQuery)
    for _, filter := range filters {
        capacity += len(" AND ") + len(filter.Column) + len(" = '") + len(filter.Value) + len("'")
    }
    
    var builder strings.Builder
    builder.Grow(capacity)
    builder.WriteString(baseQuery)
    
    for _, filter := range filters {
        builder.WriteString(" AND ")
        builder.WriteString(filter.Column)
        builder.WriteString(" = '")
        builder.WriteString(filter.Value)
        builder.WriteString("'")
    }
    
    return builder.String()
}

// Example filter type
type Filter struct {
    Column string
    Value  string
}

Result: O(n²) → O(n) complexity, 95% performance improvement for large filter sets.
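
The same builder technique carries over to parameterized queries, which real code should prefer: the version above inlines values into the SQL text and is therefore injectable. A sketch assuming PostgreSQL-style $n placeholders; column names still cannot be parameterized and must come from an allow-list:

func buildParameterizedQuery(filters []Filter) (string, []any) {
    var builder strings.Builder
    builder.Grow(64 + len(filters)*24) // rough capacity estimate
    builder.WriteString("SELECT * FROM users WHERE 1=1")
    
    args := make([]any, 0, len(filters))
    for i, f := range filters {
        // Validate f.Column against an allow-list before this point.
        fmt.Fprintf(&builder, " AND %s = $%d", f.Column, i+1)
        args = append(args, f.Value)
    }
    return builder.String(), args
}

// Usage: query, args := buildParameterizedQuery(filters)
//        rows, err := db.QueryContext(ctx, query, args...)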

🚨 WARNING: Manual Garbage Collection

runtime.GC() is Almost Always Wrong

// BAD: Fighting the runtime instead of working with it
func processLargeFile(filename string) error {
    // ... processing logic
    
    // Force GC "to help performance"
    runtime.GC() // ❌ Usually makes things worse
    
    return nil
}

Why Manual GC is Dangerous:

  • Masks Memory Leaks: Hides the real problem
  • Unpredictable Latency: Introduces pause spikes
  • Runtime Conflict: Fights against optimized GC algorithms
  • Code Smell: Usually indicates poor allocation patterns

What Happens with Manual GC:

  1. Stop-the-World: All goroutines pause during GC
  2. Poor Timing: GC runs when you force it, not when optimal
  3. Wasted Work: May run GC when little garbage exists
  4. Performance Regression: Often makes performance worse

✅ Better Solutions:

  • Reduce Allocations: Use object pooling and pre-allocation
  • Profile First: Use go tool pprof to understand allocation patterns
  • Batch Processing: Control memory usage through batching
  • Let GC Work: Trust the optimized garbage collector
  • Use GOMEMLIMIT: Set soft memory limits instead

Production Example of Manual GC Gone Wrong:

// DISASTER: Manual GC causing latency spikes
func processUserBatch(users []User) {
    for _, user := range users {
        processUser(user)
        
        // "Helping" the GC every 100 users
        if len(users)%100 == 0 {
            runtime.GC() // ❌ Causes 50ms pause every 100 users
        }
    }
}

// SOLUTION: Let GC work naturally
func processUserBatchOptimized(users []User) []Result {
    // Pre-allocate if needed
    results := make([]Result, 0, len(users))
    
    for _, user := range users {
        result := processUser(user) // Stack allocation when possible
        results = append(results, result)
    }
    
    // GC runs automatically at optimal times
    return results
}

The Rule: If you're calling runtime.GC(), you probably have an allocation problem, not a GC problem. Fix the allocations instead.


10. Modern Go: Generics & Advanced Features

Generics in Go 1.18+ (Production Ready)

Decision Rationale: Generics enable type-safe, reusable code without the performance overhead of interface{}. They're essential for modern Go applications.

Generics Decision Framework

Decision Rationale: Generics are powerful but can add complexity. Use this framework to decide when they're appropriate.

// GENERICS DECISION TREE
// ✅ Use generics when:
//   - Building reusable data structures (trees, caches, queues)
//   - Implementing type-safe algorithms (sort, filter, map, reduce)
//   - Avoiding interface{} type assertions in performance-critical paths
//   - Creating type-safe APIs where compile-time checking prevents bugs
//
// ❌ Avoid generics when:
//   - A simple interface suffices (e.g., io.Reader, fmt.Stringer)
//   - Only one concrete type is needed (YAGNI principle)
//   - They obscure readability without tangible benefits
//   - The abstraction is forced rather than natural

// GOOD: Natural abstraction
func Map[T, U any](items []T, fn func(T) U) []U {
    result := make([]U, len(items))
    for i, item := range items {
        result[i] = fn(item)
    }
    return result
}

// BAD: Unnecessary complexity
func Print[T any](value T) {
    fmt.Println(value) // fmt.Println already accepts any value; the type parameter adds nothing
}

// GOOD: Type-safe cache
type Cache[K comparable, V any] struct {
    data map[K]V
    mu   sync.RWMutex
}

func (c *Cache[K, V]) Get(key K) (V, bool) {
    c.mu.RLock()
    defer c.mu.RUnlock()
    val, ok := c.data[key]
    return val, ok
}

// BAD: Over-genericized simple operation
func Add[T int | float64](a, b T) T {
    return a + b // Just write separate functions if only 2 types needed
}

When in Doubt: Start with concrete types or interfaces. Refactor to generics only when you have multiple concrete implementations and proven need for type safety.
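
A quick usage sketch of the Map helper above (User is the example type defined later in this section):

users := []User{{Name: "Ada"}, {Name: "Grace"}}
names := Map(users, func(u User) string { return u.Name })  // []string{"Ada", "Grace"}
lengths := Map(names, func(s string) int { return len(s) }) // []int{3, 5}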

Generic Repository Pattern

✅ DO: Use generics for type-safe repositories

import (
    "context"
    "database/sql"
    "errors"
    "fmt"
    "time"
)

// Generic repository interface
type Repository[T any, ID comparable] interface {
    Create(ctx context.Context, entity T) error
    GetByID(ctx context.Context, id ID) (T, error)
    Update(ctx context.Context, entity T) error
    Delete(ctx context.Context, id ID) error
    List(ctx context.Context, filter Filter) ([]T, error)
}

// Filter for listing operations
type Filter struct {
    Limit  int
    Offset int
}

// Generic repository implementation
type SQLRepository[T any, ID comparable] struct {
    db        *sql.DB
    tableName string
    mapper    EntityMapper[T]
}

func NewSQLRepository[T any, ID comparable](db *sql.DB, tableName string, mapper EntityMapper[T]) *SQLRepository[T, ID] {
    return &SQLRepository[T, ID]{
        db:        db,
        tableName: tableName,
        mapper:    mapper,
    }
}

func (r *SQLRepository[T, ID]) GetByID(ctx context.Context, id ID) (T, error) {
    var zero T
    
    query := fmt.Sprintf("SELECT * FROM %s WHERE id = $1", r.tableName)
    row := r.db.QueryRowContext(ctx, query, id)
    
    entity, err := r.mapper.ScanFrom(row)
    if err != nil {
        if errors.Is(err, sql.ErrNoRows) {
            return zero, fmt.Errorf("%w: entity %v", sql.ErrNoRows, id)
        }
        return zero, fmt.Errorf("failed to get entity: %w", err)
    }
    
    return entity, nil
}

// EntityMapper defines contract for mapping database rows.
// Note: sql.Row.Scan and sql.Rows.Scan are standard methods for reading
// query results into Go variables. The mapper encapsulates this logic
// for a specific type T.
type EntityMapper[T any] interface {
    ScanFrom(row *sql.Row) (T, error)
    ScanFromRows(rows *sql.Rows) (T, error)
}

// Concrete implementation for User
type UserMapper struct{}

func (m UserMapper) ScanFrom(row *sql.Row) (User, error) {
    var user User
    err := row.Scan(&user.ID, &user.Email, &user.Name, &user.CreatedAt)
    return user, err
}

func (m UserMapper) ScanFromRows(rows *sql.Rows) (User, error) {
    var user User
    err := rows.Scan(&user.ID, &user.Email, &user.Name, &user.CreatedAt)
    return user, err
}

// Usage: Type-safe repositories
func setupRepositories(ctx context.Context, db *sql.DB) {
    userRepo := NewSQLRepository[User, string](db, "users", UserMapper{})
    orderRepo := NewSQLRepository[Order, int64](db, "orders", OrderMapper{})
    
    // Type-safe usage: the compiler enforces the correct ID type per repository
    user, err := userRepo.GetByID(ctx, "user-123")
    order, err2 := orderRepo.GetByID(ctx, int64(456))
    _, _, _, _ = user, err, order, err2 // values would feed the rest of the app
}

// Example types for demonstration
type User struct {
    ID        string    `json:"id"`
    Email     string    `json:"email"`
    Name      string    `json:"name"`
    CreatedAt time.Time `json:"created_at"`
}

type Order struct {
    ID     int64  `json:"id"`
    UserID string `json:"user_id"`
    Amount int    `json:"amount"`
}

type OrderMapper struct{}

func (m OrderMapper) ScanFrom(row *sql.Row) (Order, error) {
    var order Order
    err := row.Scan(&order.ID, &order.UserID, &order.Amount)
    return order, err
}

func (m OrderMapper) ScanFromRows(rows *sql.Rows) (Order, error) {
    var order Order
    err := rows.Scan(&order.ID, &order.UserID, &order.Amount)
    return order, err
}

Generic Constraints and Type Sets

✅ DO: Use constraints for type-safe generic functions

// Ordered constraint for sortable types.
// (Since Go 1.21 the standard library ships the equivalent cmp.Ordered and
// built-in min/max; the hand-written version is shown to illustrate type sets.)
type Ordered interface {
    ~int | ~int8 | ~int16 | ~int32 | ~int64 |
    ~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 |
    ~float32 | ~float64 | ~string
}

// Generic min/max functions
func Min[T Ordered](a, b T) T {
    if a < b {
        return a
    }
    return b
}

func Max[T Ordered](a, b T) T {
    if a > b {
        return a
    }
    return b
}

// Numeric constraint for mathematical operations
type Numeric interface {
    ~int | ~int8 | ~int16 | ~int32 | ~int64 |
    ~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 |
    ~float32 | ~float64
}

// Generic mathematical functions
func Sum[T Numeric](values []T) T {
    var sum T
    for _, v := range values {
        sum += v
    }
    return sum
}

func Average[T Numeric](values []T) T {
    if len(values) == 0 {
        return T(0)
    }
    return Sum(values) / T(len(values))
}
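
Usage is direct; note that Average over an integer element type truncates, so prefer a float type when the mean matters:

total := Sum([]int{1, 2, 3, 4})        // 10
mean := Average([]float64{1.5, 2.5})   // 2.0
truncated := Average([]int{1, 2})      // 1 (integer division truncates)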

Functional Options with Generics

✅ DO: Combine generics with functional options for flexible APIs

// Generic configuration with options
type Config[T any] struct {
    timeout     time.Duration
    retries     int
    validator   Validator[T]
    transformer Transformer[T]
}

type Option[T any] func(*Config[T])

func WithTimeout[T any](timeout time.Duration) Option[T] {
    return func(c *Config[T]) {
        c.timeout = timeout
    }
}

func WithRetries[T any](retries int) Option[T] {
    return func(c *Config[T]) {
        c.retries = retries
    }
}

func WithValidator[T any](validator Validator[T]) Option[T] {
    return func(c *Config[T]) {
        c.validator = validator
    }
}

// Interface definitions for type safety
type Validator[T any] interface {
    Validate(T) error
}

type Transformer[T any] interface {
    Transform(T) T
}

// Generic service with options
type Service[T any] struct {
    config Config[T]
}

func NewService[T any](options ...Option[T]) *Service[T] {
    config := Config[T]{
        timeout: 30 * time.Second,
        retries: 3,
    }
    
    for _, option := range options {
        option(&config)
    }
    
    return &Service[T]{config: config}
}

// Example validators for concrete types
type UserValidator struct{}

func (v UserValidator) Validate(user User) error {
    if user.Email == "" {
        return errors.New("email is required")
    }
    return nil
}

type OrderValidator struct{}

func (v OrderValidator) Validate(order Order) error {
    if order.Amount <= 0 {
        return errors.New("amount must be positive")
    }
    return nil
}

// Usage: Type-safe service configuration
func main() {
    userService := NewService[User](
        WithTimeout[User](10 * time.Second),
        WithRetries[User](5),
        WithValidator[User](UserValidator{}),
    )
    
    orderService := NewService[Order](
        WithTimeout[Order](15 * time.Second),
        WithValidator[Order](OrderValidator{}),
    )
    
    _, _ = userService, orderService // wired into the application from here
}

Generic Performance Considerations

⚠️ Important: Generics can impact compilation time and binary size

  • Monomorphization: Each type instantiation creates separate code
  • Compilation Time: Many generic instantiations slow compilation
  • Binary Size: Multiple instantiations increase binary size
  • Runtime Performance: Generally same as hand-written code

✅ Best Practices:

  • Use generics for genuine type safety, not just to avoid typing
  • Prefer interface{} for simple cases where type safety isn't critical
  • Profile before optimizing - measure actual impact
  • Consider code generation for very performance-critical paths

Go 1.23+ Features Update

Range over Functions (stable since Go 1.23):

// Iterator pattern with range over func.
// This function signature is exactly iter.Seq[int] from the standard iter package.
func Numbers(start, end int) func(func(int) bool) {
    return func(yield func(int) bool) {
        for i := start; i < end; i++ {
            if !yield(i) {
                return
            }
        }
    }
}

// Usage
for n := range Numbers(1, 10) {
    fmt.Println(n)
}
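
A hedged sketch of how the iter and slices packages (Go 1.23) compose with this shape; Evens and demo are illustrative names:

import (
    "fmt"
    "iter"
    "slices"
)

// Evens filters an integer sequence, yielding only even values.
func Evens(src iter.Seq[int]) iter.Seq[int] {
    return func(yield func(int) bool) {
        for v := range src {
            if v%2 == 0 && !yield(v) {
                return
            }
        }
    }
}

func demo() {
    // slices.Values adapts a slice into an iter.Seq
    for v := range Evens(slices.Values([]int{1, 2, 3, 4})) {
        fmt.Println(v) // prints 2, then 4
    }
}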

Enhanced Error Handling:

  • Improved error messages in generics
  • Better type inference in complex scenarios
  • Performance improvements in error wrapping

11. Production-Ready Patterns

The Functional Options Pattern

Decision Rationale: Functional options provide flexible configuration without breaking API compatibility. Essential for libraries and services with many configuration parameters.

What Goes Wrong Without Functional Options:

// NIGHTMARE: Constructors with many parameters
func NewServer(addr string, readTimeout time.Duration, writeTimeout time.Duration, 
    maxHeaderBytes int, enableCORS bool, enableGzip bool, enableMetrics bool, 
    logLevel string, middleware []Middleware) *Server {
    // 9 parameters is already unmanageable
    // Adding new config options breaks all existing calls
}

// Caller nightmare
server := NewServer(":8080", 15*time.Second, 15*time.Second, 1024*1024, 
    true, true, false, "info", []Middleware{}) // What do these booleans mean?

✅ DO: Use functional options for complex configuration

// Server with many optional configurations
type Server struct {
    addr            string
    readTimeout     time.Duration
    writeTimeout    time.Duration
    maxHeaderBytes  int
    middleware      []Middleware
    errorHandler    ErrorHandler
}

type Option func(*Server)

func WithAddr(addr string) Option {
    return func(s *Server) {
        s.addr = addr
    }
}

func WithTimeouts(read, write time.Duration) Option {
    return func(s *Server) {
        s.readTimeout = read
        s.writeTimeout = write
    }
}

func WithMiddleware(middleware ...Middleware) Option {
    return func(s *Server) {
        s.middleware = append(s.middleware, middleware...)
    }
}

func WithErrorHandler(handler ErrorHandler) Option {
    return func(s *Server) {
        s.errorHandler = handler
    }
}

func NewServer(options ...Option) *Server {
    // Sensible defaults
    server := &Server{
        addr:           ":8080",
        readTimeout:    15 * time.Second,
        writeTimeout:   15 * time.Second,
        maxHeaderBytes: 1 << 20, // 1MB
        errorHandler:   DefaultErrorHandler{},
    }
    
    // Apply options (nil slice is safe - range over nil slice is no-op)
    for _, option := range options {
        if option != nil { // Defensive check for nil functional options
            option(server)
        }
    }
    
    return server
}

// Usage: Clean and extensible
server := NewServer(
    WithAddr(":3000"),
    WithTimeouts(10*time.Second, 30*time.Second),
    WithMiddleware(
        corsMiddleware(),
        requestIDMiddleware(),
        recoveryMiddleware(),
    ),
    WithErrorHandler(CustomErrorHandler{}),
)

// Example middleware and handler types
type Middleware func(http.Handler) http.Handler

type ErrorHandler interface {
    HandleError(w http.ResponseWriter, r *http.Request, err error)
}

type DefaultErrorHandler struct{}

func (h DefaultErrorHandler) HandleError(w http.ResponseWriter, r *http.Request, err error) {
    http.Error(w, "Internal Server Error", http.StatusInternalServerError)
}

type CustomErrorHandler struct{}

func (h CustomErrorHandler) HandleError(w http.ResponseWriter, r *http.Request, err error) {
    // Custom error handling logic
    http.Error(w, "Custom Error Response", http.StatusInternalServerError)
}

// Example middleware functions
func corsMiddleware() Middleware {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            w.Header().Set("Access-Control-Allow-Origin", "*")
            next.ServeHTTP(w, r)
        })
    }
}

func requestIDMiddleware() Middleware {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            requestID := generateRequestID()
            w.Header().Set("X-Request-ID", requestID)
            next.ServeHTTP(w, r)
        })
    }
}

func recoveryMiddleware() Middleware {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            defer func() {
                if err := recover(); err != nil {
                    http.Error(w, "Internal Server Error", http.StatusInternalServerError)
                }
            }()
            next.ServeHTTP(w, r)
        })
    }
}

func generateRequestID() string {
    return fmt.Sprintf("req_%d", time.Now().UnixNano())
}
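
generateRequestID above uses a nanosecond timestamp for brevity, which can collide under concurrent load; a hedged alternative using crypto/rand (generateRequestIDSecure is an illustrative name):

import (
    "crypto/rand"
    "encoding/hex"
)

// generateRequestIDSecure returns a random 128-bit hex ID.
func generateRequestIDSecure() string {
    buf := make([]byte, 16)
    if _, err := rand.Read(buf); err != nil {
        // crypto/rand failing means the platform RNG is unavailable
        panic(err)
    }
    return "req_" + hex.EncodeToString(buf)
}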

The Repository Pattern

Decision Rationale: Repository pattern separates data access from business logic, enabling testability and flexibility in storage implementations.

✅ DO: Implement clean repository interfaces

// Domain model (pure business logic)
type User struct {
    ID        string    `json:"id"`
    Email     string    `json:"email"`
    Name      string    `json:"name"`
    CreatedAt time.Time `json:"created_at"`
    UpdatedAt time.Time `json:"updated_at"`
}

// Repository interface (defined in domain layer)
type UserRepository interface {
    Create(ctx context.Context, user *User) error
    GetByID(ctx context.Context, id string) (*User, error)
    GetByEmail(ctx context.Context, email string) (*User, error)
    Update(ctx context.Context, user *User) error
    Delete(ctx context.Context, id string) error
    List(ctx context.Context, filter UserFilter) ([]*User, error)
}

type UserFilter struct {
    Email  string
    Limit  int
    Offset int
}

// SQL implementation (infrastructure layer)
type sqlUserRepository struct {
    db     *sql.DB
    logger *slog.Logger
}

func NewSQLUserRepository(db *sql.DB, logger *slog.Logger) UserRepository {
    return &sqlUserRepository{
        db:     db,
        logger: logger,
    }
}

func (r *sqlUserRepository) Create(ctx context.Context, user *User) error {
    query := `
        INSERT INTO users (id, email, name, created_at, updated_at)
        VALUES ($1, $2, $3, $4, $5)`
    
    _, err := r.db.ExecContext(ctx, query,
        user.ID, user.Email, user.Name, user.CreatedAt, user.UpdatedAt)
    if err != nil {
        r.logger.ErrorContext(ctx, "failed to create user",
            "error", err,
            "user_id", user.ID)
        return fmt.Errorf("failed to create user: %w", err)
    }
    
    return nil
}

func (r *sqlUserRepository) GetByID(ctx context.Context, id string) (*User, error) {
    query := `
        SELECT id, email, name, created_at, updated_at
        FROM users
        WHERE id = $1`
    
    var user User
    err := r.db.QueryRowContext(ctx, query, id).Scan(
        &user.ID, &user.Email, &user.Name, &user.CreatedAt, &user.UpdatedAt)
        
    if err != nil {
        if errors.Is(err, sql.ErrNoRows) {
            return nil, ErrUserNotFound
        }
        return nil, fmt.Errorf("failed to get user: %w", err)
    }
    
    return &user, nil
}

Connecting the Dots: Remember to inject the concrete repository implementation into your services. For example:

// userService := NewUserService(NewSQLUserRepository(db, logger), logger)

This ties back to our dependency injection principles from Section 6.


12. Go Footguns & Critical Gotchas

Loop Variable Capture (The Classic)

The Most Common Go Bug: Loop variable capture affected even experienced developers for years. Note: since Go 1.22, each loop iteration gets its own copy of the variable (for modules declaring go 1.22 or later), which eliminates this bug in new code; it still bites older modules and legacy codebases, so it remains worth understanding.

❌ The Bug:

// WRONG: All goroutines process the same item!
for _, item := range items {
    go func() {
        process(item) // BUG: captures the loop variable reference
    }()
}

What Goes Wrong: All goroutines process the last item in the slice because the loop variable item is reused across iterations.

Why This Happens:

Iteration 1: item points to items[0] ──┐
Iteration 2: item points to items[1] ──┼─► All goroutines read this
Iteration 3: item points to items[2] ──┘    same memory location!

Production War Story: A major streaming service had recommendation bugs where all users saw the same content. The issue was loop variable capture in goroutines processing user preferences. 10 million users received identical recommendations because all goroutines captured the last user's preferences.

✅ The Fix:

// CORRECT: Each goroutine gets its own copy
for _, item := range items {
    item := item // Explicit copy (required before Go 1.22; harmless after)
    go func() {
        process(item) // Safe: each goroutine has its own item
    }()
}

// Better: Use SafeGroup (recommended)
g, ctx := NewSafeGroup(ctx, logger)
for _, item := range items {
    item := item
    g.Go(ctx, func() error {
        return process(ctx, item)
    })
}
return g.Wait()

Interface Nil Gotcha

The Surprise: An interface is only nil if both its type and value are nil.

❌ The Problem:

func returnsTypedNil() error {
    var err *MyError = nil
    return err // This interface is NOT nil!
}

func main() {
    if err := returnsTypedNil(); err != nil {
        fmt.Println("Error occurred") // This will print!
    }
}

type MyError struct {
    msg string
}

func (e *MyError) Error() string {
    return e.msg
}

Why This Happens: The interface contains a non-nil type (*MyError) even though the value is nil.

Interface Structure:

interface{} = (Type: *MyError, Value: nil)  // NOT nil interface
interface{} = (Type: nil,     Value: nil)  // nil interface

Production War Story: A logging service reported errors even when none occurred because functions returned typed nil errors that were treated as real errors. Alert systems fired continuously, causing on-call fatigue and masking real issues.

✅ The Fix:

func returnsNil() error {
    var err *MyError = nil
    if err != nil {
        return err
    }
    return nil // Explicitly return untyped nil
}

// Better: Return concrete errors directly
func betterError() error {
    if someCondition {
        return &MyError{msg: "something went wrong"}
    }
    return nil
}

// Example condition for demonstration
var someCondition = false
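
If you must defend against typed nils at an API boundary, a reflection-based check works; this helper is a sketch, not a standard-library function:

import "reflect"

// isNil reports whether v is a nil interface or an interface wrapping a
// nil pointer, map, slice, channel, or function.
func isNil(v any) bool {
    if v == nil {
        return true
    }
    rv := reflect.ValueOf(v)
    switch rv.Kind() {
    case reflect.Pointer, reflect.Map, reflect.Slice, reflect.Chan, reflect.Func:
        return rv.IsNil()
    }
    return false
}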

Slice Header Gotchas

The Confusion: Slice behavior depends on capacity, leading to unexpected mutations.

❌ The Problem:

func modifySlice(s []int) {
    s[0] = 999        // Always modifies original
    s = append(s, 42) // May or may not affect original
}

original := []int{1, 2, 3}
modifySlice(original)
// original[0] is now 999, but length unchanged!

What Goes Wrong: Index assignment always affects the underlying array, but append behavior depends on capacity.

Why This Happens:

Original: [1, 2, 3] (len=3, cap=3)
s[0] = 999: [999, 2, 3] (modifies original array)
append(s, 42): Creates new array because cap is full
original is still [999, 2, 3], append had no effect

✅ The Solution:

// Be explicit about mutation intentions
func modifySliceInPlace(s []int) {
    if len(s) > 0 {
        s[0] = 999 // Caller expects modification
    }
}

func appendToSlice(s []int, value int) []int {
    return append(s, value) // Return new slice
}

// Defensive copying when needed
func processSliceSafely(s []int) []int {
    result := make([]int, len(s)) // Create copy
    copy(result, s)
    
    // Safe to modify result
    for i := range result {
        result[i] *= 2
    }
    
    return result
}
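
Since Go 1.21, the standard slices package expresses the defensive copy directly:

import "slices"

func processSliceSafelyModern(s []int) []int {
    result := slices.Clone(s) // fresh backing array, same elements
    for i := range result {
        result[i] *= 2
    }
    return result
}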

Map Concurrency Gotcha

❌ The Problem: Maps are not thread-safe

// DANGEROUS: Concurrent map access
var cache = make(map[string]string)

func getValue(key string) string {
    return cache[key] // Race condition!
}

func setValue(key, value string) {
    cache[key] = value // Race condition!
}

What Goes Wrong: Concurrent reads and writes to maps cause race conditions that can corrupt data or crash the program.

Production Crash: A cache service crashed with "fatal error: concurrent map writes" during peak traffic. Multiple goroutines were updating the same map simultaneously, causing the Go runtime to detect the race condition and terminate the program.

✅ The Fix:

// Use sync.Map for concurrent access
var cache sync.Map

func getValue(key string) (string, bool) {
    value, ok := cache.Load(key)
    if !ok {
        return "", false
    }
    return value.(string), true
}

func setValue(key, value string) {
    cache.Store(key, value)
}

// Or use a mutex for more complex operations
type SafeCache struct {
    mu    sync.RWMutex
    cache map[string]string
}

func NewSafeCache() *SafeCache {
    return &SafeCache{
        cache: make(map[string]string),
    }
}

func (c *SafeCache) Get(key string) (string, bool) {
    c.mu.RLock()
    defer c.mu.RUnlock()
    value, ok := c.cache[key]
    return value, ok
}

func (c *SafeCache) Set(key, value string) {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.cache[key] = value
}

Networking Footguns

❌ The Problem: Default HTTP client has no timeouts

// DANGEROUS: Can hang forever
func callAPI() error {
    resp, err := http.Get("https://api.example.com/data")
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    // Process response...
    return nil
}

What Goes Wrong: Without timeouts, HTTP requests can hang indefinitely, causing goroutine leaks and service degradation.

✅ The Fix: Always set timeouts

import (
    "net"
    "net/http"
    "time"
)

// SAFE: Proper timeout configuration
func callAPISafe() error {
    client := &http.Client{
        Timeout: 30 * time.Second,
        Transport: &http.Transport{
            DialContext: (&net.Dialer{
                Timeout: 10 * time.Second,
            }).DialContext,
            TLSHandshakeTimeout:   10 * time.Second,
            ResponseHeaderTimeout: 10 * time.Second,
            IdleConnTimeout:       60 * time.Second,
            MaxIdleConns:          10,
            MaxIdleConnsPerHost:   2,
            DisableKeepAlives:     false, // Reuse connections for performance
        },
    }
    
    resp, err := client.Get("https://api.example.com/data")
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    // Process response...
    return nil
}
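
Client-level timeouts cap every request identically; for per-request deadlines, pair the same client with a request context. A minimal sketch (endpoint URL reused from above):

import (
    "context"
    "net/http"
    "time"
)

func callAPIWithDeadline(ctx context.Context, client *http.Client) error {
    // Per-request deadline layered on top of the client's own timeouts
    ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
    defer cancel()
    
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://api.example.com/data", nil)
    if err != nil {
        return err
    }
    
    resp, err := client.Do(req)
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    // Process response...
    return nil
}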

Additional Critical Footguns

❌ The time.After Leak in Loops

// BAD: Allocates a new timer on every iteration (and, before Go 1.23,
// each timer stayed alive until it fired)
for {
    select {
    case <-time.After(1 * time.Second): // ❌ Timer leak!
        doWork()
    case <-ctx.Done():
        return
    }
}

What Goes Wrong: Before Go 1.23, time.After created a timer that was not garbage collected until it fired; in a hot loop this accumulated thousands of live timers. Since Go 1.23 unreferenced timers are collected promptly, but the loop still allocates a fresh timer and channel on every iteration, so reusing one timer remains the better pattern.

✅ The Fix: Reuse timers:

// GOOD: Reuse timer to prevent memory leaks
timer := time.NewTimer(1 * time.Second)
defer timer.Stop()

for {
    select {
    case <-timer.C:
        doWork()
        timer.Reset(1 * time.Second) // ✅ Safe reuse
    case <-ctx.Done():
        return
    }
}

// Placeholder for doWork function
func doWork() {
    // Implementation would go here
}

❌ Ticker Not Stopped:

// BAD: time.Tick gives you no handle to stop the ticker
// (before Go 1.23 it could never be stopped or collected; since Go 1.23 an
// unreferenced ticker is garbage-collected, but NewTicker + Stop is clearer)
ticker := time.Tick(5 * time.Second) // ❌ Unstoppable by design
for range ticker {
    if shouldStop() {
        return // Ticker keeps firing until GC reclaims it (Go 1.23+)
    }
    doPeriodicWork()
}

✅ The Fix: Use NewTicker:

// GOOD: NewTicker can be stopped
ticker := time.NewTicker(5 * time.Second)
defer ticker.Stop() // ✅ Properly cleaned up

for {
    select {
    case <-ticker.C:
        if shouldStop() {
            return
        }
        doPeriodicWork()
    case <-ctx.Done():
        return
    }
}

// Placeholder functions
func shouldStop() bool {
    return false // Implementation would determine when to stop
}

func doPeriodicWork() {
    // Implementation would go here
}

Defer in Loops Gotcha

❌ The Problem: Deferred functions accumulate

// BAD: Defers accumulate, not executed per iteration
func processFiles(filenames []string) error {
    for _, filename := range filenames {
        file, err := os.Open(filename)
        if err != nil {
            return err
        }
        defer file.Close() // ❌ All files stay open until function returns!
        
        // Process file...
    }
    return nil
}

What Goes Wrong: All defer statements are executed when the function returns, not when the loop iteration ends. This keeps all files open simultaneously.

Production Impact: A log processing service tried to process 10,000 files using this pattern. It ran out of file descriptors and crashed because all files remained open.

✅ The Fix:

// GOOD: Use anonymous function or separate function
func processFiles(filenames []string) error {
    for _, filename := range filenames {
        if err := processFile(filename); err != nil {
            return err
        }
    }
    return nil
}

func processFile(filename string) error {
    file, err := os.Open(filename)
    if err != nil {
        return err
    }
    defer file.Close() // ✅ Closes immediately after processing
    
    // Process file...
    return nil
}

// Alternative: Anonymous function
func processFilesAlternative(filenames []string) error {
    for _, filename := range filenames {
        err := func() error {
            file, err := os.Open(filename)
            if err != nil {
                return err
            }
            defer file.Close() // ✅ Executes at end of anonymous function
            
            // Process file...
            return nil
        }()
        
        if err != nil {
            return err
        }
    }
    return nil
}

13. Testing Excellence in Go

Go's Testing Philosophy

Go's built-in testing framework embodies the same principles as the language itself: simple, explicit, and effective. Writing testable code is a core theme throughout this guide—here are the canonical patterns for writing the tests themselves.

Decision Rationale: Go treats testing as a first-class citizen. The testing package provides everything needed for unit tests, benchmarks, and fuzzing without external dependencies.

Table-Driven Tests: The Idiomatic Pattern

✅ DO: Use table-driven tests for multiple test cases

import (
    "testing"
)

// Example function being tested
func Min(a, b int) int {
    if a < b {
        return a
    }
    return b
}

func TestMin(t *testing.T) {
    testCases := []struct {
        name     string
        a, b     int
        expected int
    }{
        {"a is smaller", 5, 10, 5},
        {"b is smaller", 10, 5, 5},
        {"equal numbers", 5, 5, 5},
        {"negative numbers", -5, -10, -10},
        {"zero values", 0, 0, 0},
    }
    
    for _, tc := range testCases {
        t.Run(tc.name, func(t *testing.T) {
            got := Min(tc.a, tc.b)
            if got != tc.expected {
                t.Errorf("Min(%d, %d) = %d; want %d", tc.a, tc.b, got, tc.expected)
            }
        })
    }
}

Why This Pattern Works:

  • Maintainable: Easy to add new test cases
  • Clear: Each case is self-documenting with descriptive names
  • Isolated: t.Run creates subtests that run independently (see the parallel sketch after this list)
  • Debuggable: Can run specific cases with go test -run TestMin/equal
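
When cases are independent, subtests also parallelize cleanly; a minimal sketch reusing the TestMin table above:

for _, tc := range testCases {
    t.Run(tc.name, func(t *testing.T) {
        t.Parallel() // cases now run concurrently; tc is per-iteration in Go 1.22+
        got := Min(tc.a, tc.b)
        if got != tc.expected {
            t.Errorf("Min(%d, %d) = %d; want %d", tc.a, tc.b, got, tc.expected)
        }
    })
}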

Test Helpers with t.Helper()

✅ DO: Use t.Helper() in test utility functions

// Test helper that reports failures at the correct line  
func assertEqual[T comparable](t *testing.T, got, want T) {
    t.Helper() // This is the magic!
    if got != want {
        t.Errorf("want %v, got %v", want, got) // Standard library convention: want first
    }
}

func assertNoError(t *testing.T, err error) {
    t.Helper()
    if err != nil {
        t.Fatalf("unexpected error: %v", err)
    }
}

func TestUserService_CreateUser(t *testing.T) {
    service := setupTestService(t)
    
    user, err := service.CreateUser(context.Background(), &CreateUserRequest{
        Email: "[email protected]",
        Name:  "Test User",
    })
    
    // If these fail, the error points to THIS line, not inside the helpers
    assertNoError(t, err)
    assertEqual(t, user.Email, "test@example.com")
    assertEqual(t, user.Name, "Test User")
}

// Placeholder function for test setup
func setupTestService(t *testing.T) *UserService {
    // Would return a properly configured service for testing
    return &UserService{}
}

// Example types for testing
type CreateUserRequest struct {
    Email string
    Name  string
}

type UserService struct{}

func (s *UserService) CreateUser(ctx context.Context, req *CreateUserRequest) (*User, error) {
    return &User{
        Email: req.Email,
        Name:  req.Name,
    }, nil
}

Why t.Helper() Is Critical: Without it, test failures are reported inside the helper function, making debugging much harder.

Testing with Dependency Injection

✅ DO: Create test doubles using interfaces

import (
    "context"
    "errors"
    "io"
    "log/slog"
    "testing"
)

// Mock implementation for testing
type mockUserRepository struct {
    users map[string]*User
    err   error
}

func (m *mockUserRepository) GetUser(ctx context.Context, id string) (*User, error) {
    if m.err != nil {
        return nil, m.err
    }
    user, exists := m.users[id]
    if !exists {
        return nil, ErrUserNotFound
    }
    return user, nil
}

func (m *mockUserRepository) SaveUser(ctx context.Context, user *User) error {
    if m.err != nil {
        return m.err
    }
    m.users[user.ID] = user
    return nil
}

// Test using the mock
func TestUserService_GetUser(t *testing.T) {
    tests := []struct {
        name           string
        userID         string
        setupMock      func(*mockUserRepository)
        expectedUser   *User
        expectedError  error
    }{
        {
            name:   "user exists",
            userID: "user-123",
            setupMock: func(mock *mockUserRepository) {
                mock.users["user-123"] = &User{
                    ID:    "user-123",
                    Email: "[email protected]",
                    Name:  "Test User",
                }
            },
            expectedUser: &User{
                ID:    "user-123",
                Email: "[email protected]", 
                Name:  "Test User",
            },
        },
        {
            name:   "user not found",
            userID: "nonexistent",
            setupMock: func(mock *mockUserRepository) {
                // No setup needed - empty map
            },
            expectedError: ErrUserNotFound,
        },
    }
    
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            mock := &mockUserRepository{users: make(map[string]*User)}
            tt.setupMock(mock)
            
            // Avoid global defaults in tests: build a logger explicitly
            logger := slog.New(slog.NewTextHandler(io.Discard, &slog.HandlerOptions{Level: slog.LevelDebug}))
            service := NewUserService(mock, logger)
            
            user, err := service.GetUser(context.Background(), tt.userID)
            
            if tt.expectedError != nil {
                if !errors.Is(err, tt.expectedError) {
                    t.Fatalf("expected error %v, got %v", tt.expectedError, err)
                }
                return // Error was expected, test passes
            }
            
            assertNoError(t, err)
            assertEqual(t, user.ID, tt.expectedUser.ID)
            assertEqual(t, user.Email, tt.expectedUser.Email)
        })
    }
}
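
For completeness, a minimal sketch of the service under test; the constructor and repository interface follow the dependency-injection patterns from Section 6 (these exact names are assumptions):

type userRepository interface {
    GetUser(ctx context.Context, id string) (*User, error)
    SaveUser(ctx context.Context, user *User) error
}

type userService struct {
    repo   userRepository
    logger *slog.Logger
}

func NewUserService(repo userRepository, logger *slog.Logger) *userService {
    return &userService{repo: repo, logger: logger}
}

func (s *userService) GetUser(ctx context.Context, id string) (*User, error) {
    return s.repo.GetUser(ctx, id)
}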

Benchmarking for Performance

✅ DO: Write benchmarks for performance-critical code

func BenchmarkUserProcessing(b *testing.B) {
    service := setupBenchmarkService() // Setup test service with mocks
    users := generateTestUsers(1000)   // Create test data
    
    b.ResetTimer()      // Don't count setup time
    b.ReportAllocs()    // Report memory allocations
    
    for i := 0; i < b.N; i++ {
        for _, user := range users {
            _ = service.ProcessUser(context.Background(), user)
        }
    }
}

// Benchmark different implementations
func BenchmarkStringBuilding(b *testing.B) {
    tests := []struct {
        name string
        fn   func([]string) string
    }{
        {"concat", concatenateStrings},
        {"builder", buildStringsWithBuilder}, 
        {"join", strings.Join},
    }
    
    words := []string{"hello", "world", "this", "is", "a", "benchmark"}
    
    for _, test := range tests {
        b.Run(test.name, func(b *testing.B) {
            b.ReportAllocs()
            for i := 0; i < b.N; i++ {
                _ = test.fn(words)
            }
        })
    }
}

// Example string building functions for benchmarking
func concatenateStrings(words []string) string {
    result := ""
    for _, word := range words {
        result += word + " "
    }
    return result
}

func buildStringsWithBuilder(words []string) string {
    var builder strings.Builder
    for _, word := range words {
        builder.WriteString(word)
        builder.WriteString(" ")
    }
    return builder.String()
}

// Placeholder functions for benchmarking setup
func setupBenchmarkService() *UserService {
    return &UserService{}
}

func generateTestUsers(count int) []User {
    users := make([]User, count)
    for i := 0; i < count; i++ {
        users[i] = User{
            ID:    fmt.Sprintf("user-%d", i),
            Email: fmt.Sprintf("user%[email protected]", i),
            Name:  fmt.Sprintf("User %d", i),
        }
    }
    return users
}

func (s *UserService) ProcessUser(ctx context.Context, user User) error {
    // Placeholder processing logic
    return nil
}

Running Benchmarks:

# Run benchmarks with memory stats
go test -bench=. -benchmem

# Run with race detection
go test -race -run=^$ -bench=.

# Use benchstat for statistical comparison
go test -bench=. -count=10 > old.txt
# Make changes...
go test -bench=. -count=10 > new.txt
benchstat old.txt new.txt

Fuzzing for Edge Cases (Go 1.18+)

✅ DO: Use fuzzing to discover edge cases

func FuzzParseEmail(f *testing.F) {
    // Seed corpus with known inputs
    f.Add("[email protected]")
    f.Add("[email protected]")
    f.Add("invalid-email")
    
    f.Fuzz(func(t *testing.T, email string) {
        result, err := ParseEmail(email)
        
        // Invariants that should always hold
        if err == nil {
            // Valid email should have @ symbol
            if !strings.Contains(result.Address, "@") {
                t.Errorf("valid email missing @: %s", result.Address)
            }
            
            // Should be able to round-trip
            if result.String() != email {
                t.Errorf("round-trip failed: %s != %s", result.String(), email)
            }
        }
        
        // Function should never panic
        // (if it panics, the fuzzer will catch it)
    })
}

// Example email parsing function for fuzzing
type EmailAddress struct {
    Address string
}

func (e EmailAddress) String() string {
    return e.Address
}

func ParseEmail(email string) (*EmailAddress, error) {
    if !strings.Contains(email, "@") {
        return nil, errors.New("invalid email format")
    }
    return &EmailAddress{Address: email}, nil
}

Running Fuzzing:

# Run fuzzing for 10 seconds (good for CI)
go test -fuzz=FuzzParseEmail -fuzztime=10s

# Run until failure is found
go test -fuzz=FuzzParseEmail

Testing Patterns Reference

| Pattern | When to Use | Example |
|---------|-------------|---------|
| Table-Driven | Multiple similar test cases | Validation functions, algorithms |
| Subtests | Grouping related tests | t.Run() for test organization |
| Mocks/Stubs | External dependencies | Database, HTTP clients, time |
| Benchmarks | Performance-critical code | Parsers, serialization, algorithms |
| Fuzzing | Edge case discovery | Input validation, parsers |
| Integration | End-to-end flows | HTTP handlers, database operations |

Testing Anti-Patterns

❌ Common Testing Mistakes:

| Anti-Pattern | Why It's Wrong | Better Approach |
|--------------|----------------|-----------------|
| No t.Helper() | Confusing error locations | Always use in test utilities |
| Testing Implementation | Brittle, breaks on refactoring | Test behavior, not internals |
| Shared Mutable State | Tests affect each other | Create fresh test data per test |
| No Error Testing | Missing unhappy path coverage | Test both success and failure cases |
| Slow Tests | Developers skip running them | Use mocks, avoid real I/O |

14. Anti-Patterns Reference & Final Checklist

Critical Anti-Patterns (Instant PR Rejection)

| Anti-Pattern | Why It's Wrong | Production Impact | Fix |
|--------------|----------------|-------------------|-----|
| Global Variables | Hidden dependencies, testing nightmare | Cannot test in isolation, race conditions, debugging hell | Dependency injection |
| Context in Structs | Stale contexts, lifecycle issues | Goroutine leaks, cancelled operations, memory leaks | Context as parameters |
| String Context Keys | Key collisions across packages | Silent data corruption, debugging nightmare | Unexported struct pointer keys |
| sync.WaitGroup without error handling | No error handling, no cancellation | Silent failures, resource leaks, no fail-fast | Use errgroup |
| Error Equality == | Breaks with wrapped errors | Wrong HTTP status codes, broken retry logic | Use errors.Is/As |
| Ignoring Errors | Silent failures, data corruption | Money transfers fail silently, data loss | Always handle errors |
| Hard-coded Dependencies | Cannot test, no flexibility | Impossible unit testing, deployment issues | Interface injection |

Production Readiness Checklist

✅ Architecture:

  • All dependencies injected through constructors
  • Interfaces defined where they're used
  • No global state (except errors/constants)
  • Context flows through parameters
  • Structured logging with correlation IDs

✅ Error Handling:

  • All errors handled explicitly
  • Use errors.Is/As for error checking
  • Rich error context with wrapping
  • Custom error types for structured handling
  • No panics in production paths

✅ Concurrency:

  • Use errgroup instead of sync.WaitGroup
  • Panic recovery in concurrent code
  • Context cancellation respected
  • Resource limits set (g.SetLimit)
  • Loop variables captured correctly

✅ Performance:

  • Pre-allocate slices with known capacity
  • Use object pooling for expensive allocations
  • Understand escape analysis implications
  • Avoid unnecessary pointer escapes
  • Profile before optimizing

✅ Code Quality:

  • Intention-revealing names
  • Small, focused functions
  • Single responsibility principle
  • No god packages or interfaces
  • Consistent error handling patterns

Quick Reference: Naming Conventions

| Type | Pattern | Example | Notes |
|------|---------|---------|-------|
| Package | Short, lowercase | user, http | No underscores |
| Interface | Behavior + -er | Reader, UserFinder | Single method = -er suffix |
| Struct | Clear noun | User, HTTPClient | No stuttering |
| Method | Verb phrase | GetUser, Close | Action-oriented |
| Variable | Scope-appropriate | u (short), currentUser (long) | Short scope = short name |
| Constant | Descriptive | DefaultTimeout | Clear intent |
| Error Variable | Err prefix | ErrNotFound | Standard pattern |

Why It Works: Consistent error naming enables reliable error handling with errors.Is and makes error types instantly recognizable in logs and debugging.

Final Words: The Go Philosophy

Go succeeds because it optimizes for the right things:

  • Reading over Writing: Code is read 10x more than written
  • Clarity over Cleverness: Simple solutions beat complex ones
  • Explicit over Implicit: Make dependencies and error paths visible
  • Composition over Inheritance: Build from small, composable parts
  • Practicality over Purity: Choose what works in production

Remember: Following these patterns doesn't just make your Go code better—it makes you a better engineer. You learn to think in systems, value simplicity, design for testing, handle failures gracefully, and build for observability.

Go teaches us that great software engineering is about building reliable, maintainable systems—not showing off how clever we are.



"The key to making programs fast is to make them do practically nothing." - Mike Haertel

This guide represents the collective wisdom of the Go community, distilled into actionable patterns for building production-grade applications. Master these patterns, and you'll write Go code that is not only correct and performant, but maintainable, testable, and joyful to work with.


Contributing: This guide evolves with the Go community. Share your production experiences, war stories, and improvements to help fellow engineers build better Go applications.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment