@cybersiddhu
Created October 24, 2025 15:36
Create a database using Pebble, a RocksDB-inspired key-value store written in Go

Comprehensive Guide to Building Searchable Databases with Pebble

Let me break down everything step-by-step with detailed explanations for beginners.

Table of Contents

  1. Understanding Key-Value Databases
  2. Basic Concepts
  3. Simple Examples with Explanations
  4. Common Patterns and Best Practices
  5. Performance Tips
  6. Summary

Understanding Key-Value Databases

What is a Key-Value Store?

Think of a key-value store like a giant dictionary or HashMap:

  • Key: A unique identifier (like "user:123")
  • Value: The data you want to store (like user information)

Key: "user:1"  →  Value: {"name": "John", "age": 30}
Key: "user:2"  →  Value: {"name": "Jane", "age": 25}

Why Pebble?

Pebble is like a super-fast filing cabinet that:

  • Stores data on disk (persistent)
  • Retrieves data very quickly
  • Handles millions of records efficiently
  • Is written in pure Go, so unlike RocksDB (which is C++) it needs no cgo or C/C++ dependencies

Basic Concepts

1. Keys Are Sorted

Pebble stores keys in sorted byte order: keys are compared byte by byte, which looks alphabetical for plain ASCII keys. This is crucial for searching:

"user:001"
"user:002"
"user:003"
"user:100"

This ordering allows us to:

  • Find ranges of data quickly
  • Search by prefix
  • Iterate through related records
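To make the ordering concrete, here is a small sketch that uses only the standard library. sort.Strings compares byte by byte, which matches how Pebble's default comparer orders keys. The helper names userKey and sortedKeys, and the three-digit padding width, are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// userKey is a hypothetical helper: it zero-pads the ID so that
// byte-wise order matches numeric order (valid for IDs 0..999).
func userKey(id int) string {
	return fmt.Sprintf("user:%03d", id)
}

// sortedKeys returns a sorted copy of keys, leaving the input untouched.
func sortedKeys(keys []string) []string {
	out := append([]string(nil), keys...)
	sort.Strings(out)
	return out
}

func main() {
	// Unpadded IDs: "user:100" sorts before "user:2" because '1' < '2'.
	fmt.Println(sortedKeys([]string{"user:2", "user:100", "user:1"}))

	// Padded IDs sort in numeric order.
	fmt.Println(sortedKeys([]string{userKey(2), userKey(100), userKey(1)}))
}
```

This is why the keys in the listing above are written as "user:001" rather than "user:1": the padding keeps numeric IDs in the order you expect.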

2. Bytes, Not Strings

Pebble works with []byte (byte slices), not strings directly:

// Converting between string and bytes
key := []byte("user:1")        // string to bytes
value := []byte("John Doe")    // string to bytes
str := string(key)             // bytes to string

Simple Examples with Explanations

Example 1: Your First Pebble Database

Let's start with the absolute basics:

package main

import (
    "fmt"
    "log"
    
    "github.com/cockroachdb/pebble"
)

func main() {
    // Step 1: Open (or create) a database
    // "./mydb" is the folder where data will be stored
    db, err := pebble.Open("./mydb", &pebble.Options{})
    if err != nil {
        log.Fatal(err)
    }
    // Always close the database when done
    defer db.Close()
    
    // Step 2: Store some data
    // Think: "In the 'apple' drawer, put 'A red fruit'"
    key := []byte("apple")
    value := []byte("A red fruit")
    
    // pebble.Sync means: make sure it's written to disk before continuing
    err = db.Set(key, value, pebble.Sync)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("✓ Stored: apple → A red fruit")
    
    // Step 3: Retrieve the data
    // Think: "Open the 'apple' drawer and read what's inside"
    data, closer, err := db.Get(key)
    if err != nil {
        log.Fatal(err)
    }
    // closer is like closing a file - always do it!
    defer closer.Close()
    
    fmt.Printf("✓ Retrieved: %s → %s\n", key, data)
    
    // Step 4: Update the data
    // Same key, new value - it overwrites the old value
    newValue := []byte("A delicious red fruit")
    err = db.Set(key, newValue, pebble.Sync)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("✓ Updated the value")
    
    // Step 5: Delete the data
    err = db.Delete(key, pebble.Sync)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("✓ Deleted: apple")
    
    // Step 6: Try to get deleted data
    _, _, err = db.Get(key)
    if err == pebble.ErrNotFound {
        fmt.Println("✓ Confirmed: apple is gone")
    }
}

What's happening here?

  1. We open a database (creates a folder with database files)
  2. We store a key-value pair
  3. We retrieve it back
  4. We update it
  5. We delete it
  6. We verify it's gone

Example 2: Storing Multiple Items and Searching

Now let's store multiple items and learn to search through them:

package main

import (
    "fmt"
    "log"
    
    "github.com/cockroachdb/pebble"
)

func main() {
    db, err := pebble.Open("./fruitdb", &pebble.Options{})
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()
    
    // Store multiple fruits
    // Notice the key pattern: "fruit:name"
    fruits := map[string]string{
        "fruit:apple":  "Red, crunchy",
        "fruit:banana": "Yellow, soft",
        "fruit:cherry": "Red, small",
        "fruit:date":   "Brown, sweet",
    }
    
    fmt.Println("=== Storing Fruits ===")
    for key, value := range fruits {
        err := db.Set([]byte(key), []byte(value), pebble.Sync)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Printf("✓ Stored: %s\n", key)
    }
    
    // Now let's search for all fruits
    // We use an Iterator - think of it as a cursor that moves through data
    fmt.Println("\n=== All Fruits ===")
    
    // Create an iterator that only looks at keys starting with "fruit:"
    iter, err := db.NewIter(&pebble.IterOptions{
        LowerBound: []byte("fruit:"),      // Start here
        UpperBound: []byte("fruit:\xff"),  // End here (\xff is the highest byte)
    })
    if err != nil {
        log.Fatal(err)
    }
    defer iter.Close()
    
    // Move to the first item
    for iter.First(); iter.Valid(); iter.Next() {
        // iter.Key() gives us the key
        // iter.Value() gives us the value
        fmt.Printf("%s → %s\n", iter.Key(), iter.Value())
    }
    
    // Check if there were any errors during iteration
    if err := iter.Error(); err != nil {
        log.Fatal(err)
    }
}

Key Concepts:

  1. Key Naming Pattern: We use "fruit:apple" instead of just "apple". This groups related items together.

  2. Iterator: Like a bookmark that moves through your data:

    • iter.First(): Go to the first item
    • iter.Valid(): Check if we're still in valid data
    • iter.Next(): Move to the next item
    • iter.Key(): Get the current key
    • iter.Value(): Get the current value
  3. Bounds:

    • LowerBound: Where to start searching
    • UpperBound: Where to stop searching
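    • \xff is the highest possible byte value (0xFF). It never appears in valid UTF-8 text, so "fruit:\xff" sorts after every text key that starts with "fruit:". The UpperBound itself is exclusive.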
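Appending \xff works for text keys, but it would miss a key whose next byte is itself 0xFF (possible with binary keys). A more general approach is to compute the smallest key that is greater than every key with the prefix. The sketch below shows one way to do that with the standard library only; the name prefixUpperBound is an assumption, not a Pebble function.

```go
package main

import "fmt"

// prefixUpperBound returns the smallest key that is greater than every key
// starting with prefix. It returns nil when no such key exists (the prefix
// is empty or consists only of 0xff bytes).
func prefixUpperBound(prefix []byte) []byte {
	end := make([]byte, len(prefix))
	copy(end, prefix)
	for i := len(end) - 1; i >= 0; i-- {
		end[i]++
		if end[i] != 0 {
			// No overflow: cut the key off after the incremented byte.
			return end[:i+1]
		}
		// The byte was 0xff and wrapped to 0x00; carry into the previous byte.
	}
	return nil
}

func main() {
	fmt.Printf("%q\n", prefixUpperBound([]byte("fruit:")))
	fmt.Printf("%q\n", prefixUpperBound([]byte("a\xff")))
}
```

The result can be passed as UpperBound in pebble.IterOptions; a nil UpperBound means the iterator has no upper limit.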

Example 3: Building a Simple Contact Book

Let's build something practical - a contact book:

package main

import (
    "encoding/json"
    "fmt"
    "log"
    
    "github.com/cockroachdb/pebble"
)

// Contact represents a person's information
type Contact struct {
    ID    string `json:"id"`
    Name  string `json:"name"`
    Phone string `json:"phone"`
    Email string `json:"email"`
}

// ContactBook manages our contacts database
type ContactBook struct {
    db *pebble.DB
}

// NewContactBook creates a new contact book
func NewContactBook(path string) (*ContactBook, error) {
    db, err := pebble.Open(path, &pebble.Options{})
    if err != nil {
        return nil, err
    }
    return &ContactBook{db: db}, nil
}

// Close closes the database
func (cb *ContactBook) Close() error {
    return cb.db.Close()
}

// AddContact adds a new contact to the database
func (cb *ContactBook) AddContact(contact Contact) error {
    // Step 1: Convert the Contact struct to JSON
    // This turns our struct into a string we can store
    contactJSON, err := json.Marshal(contact)
    if err != nil {
        return fmt.Errorf("failed to convert contact to JSON: %w", err)
    }
    
    // Step 2: Create the key
    // Pattern: "contact:ID"
    key := fmt.Sprintf("contact:%s", contact.ID)
    
    // Step 3: Store it
    err = cb.db.Set([]byte(key), contactJSON, pebble.Sync)
    if err != nil {
        return fmt.Errorf("failed to store contact: %w", err)
    }
    
    fmt.Printf("✓ Added contact: %s (%s)\n", contact.Name, contact.ID)
    return nil
}

// GetContact retrieves a contact by ID
func (cb *ContactBook) GetContact(id string) (*Contact, error) {
    // Step 1: Build the key
    key := fmt.Sprintf("contact:%s", id)
    
    // Step 2: Get the data
    data, closer, err := cb.db.Get([]byte(key))
    if err == pebble.ErrNotFound {
        return nil, fmt.Errorf("contact %s not found", id)
    }
    if err != nil {
        return nil, err
    }
    defer closer.Close()
    
    // Step 3: Convert JSON back to Contact struct
    var contact Contact
    err = json.Unmarshal(data, &contact)
    if err != nil {
        return nil, fmt.Errorf("failed to parse contact data: %w", err)
    }
    
    return &contact, nil
}

// ListAllContacts returns all contacts
func (cb *ContactBook) ListAllContacts() ([]Contact, error) {
    // Create iterator for all contacts
    iter, err := cb.db.NewIter(&pebble.IterOptions{
        LowerBound: []byte("contact:"),
        UpperBound: []byte("contact:\xff"),
    })
    if err != nil {
        return nil, err
    }
    defer iter.Close()
    
    var contacts []Contact
    
    // Loop through all contacts
    for iter.First(); iter.Valid(); iter.Next() {
        var contact Contact
        
        // Parse each contact
        err := json.Unmarshal(iter.Value(), &contact)
        if err != nil {
            // Skip invalid entries
            continue
        }
        
        contacts = append(contacts, contact)
    }
    
    return contacts, iter.Error()
}

// DeleteContact removes a contact
func (cb *ContactBook) DeleteContact(id string) error {
    key := fmt.Sprintf("contact:%s", id)
    err := cb.db.Delete([]byte(key), pebble.Sync)
    if err != nil {
        return err
    }
    fmt.Printf("✓ Deleted contact: %s\n", id)
    return nil
}

func main() {
    // Create a new contact book
    book, err := NewContactBook("./contacts")
    if err != nil {
        log.Fatal(err)
    }
    defer book.Close()
    
    // Add some contacts
    contacts := []Contact{
        {ID: "1", Name: "Alice Johnson", Phone: "555-0101", Email: "[email protected]"},
        {ID: "2", Name: "Bob Smith", Phone: "555-0102", Email: "[email protected]"},
        {ID: "3", Name: "Charlie Brown", Phone: "555-0103", Email: "[email protected]"},
    }
    
    fmt.Println("=== Adding Contacts ===")
    for _, contact := range contacts {
        if err := book.AddContact(contact); err != nil {
            log.Printf("Error: %v", err)
        }
    }
    
    // Get a specific contact
    fmt.Println("\n=== Getting Contact #2 ===")
    contact, err := book.GetContact("2")
    if err != nil {
        log.Printf("Error: %v", err)
    } else {
        fmt.Printf("Found: %s - %s - %s\n", contact.Name, contact.Phone, contact.Email)
    }
    
    // List all contacts
    fmt.Println("\n=== All Contacts ===")
    allContacts, err := book.ListAllContacts()
    if err != nil {
        log.Printf("Error: %v", err)
    } else {
        for _, c := range allContacts {
            fmt.Printf("- %s: %s (%s)\n", c.ID, c.Name, c.Phone)
        }
    }
    
    // Delete a contact
    fmt.Println("\n=== Deleting Contact #2 ===")
    if err := book.DeleteContact("2"); err != nil {
        log.Printf("Error: %v", err)
    }
    
    // Verify deletion
    fmt.Println("\n=== Contacts After Deletion ===")
    allContacts, _ = book.ListAllContacts()
    for _, c := range allContacts {
        fmt.Printf("- %s: %s (%s)\n", c.ID, c.Name, c.Phone)
    }
}

What We Learned:

  1. Structs: We use Go structs to organize data
  2. JSON Marshaling: Converting structs to/from JSON for storage
  3. Error Handling: Proper error checking and messages
  4. Encapsulation: Wrapping database operations in a struct with methods
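The marshaling step can be tried on its own, without a database. Here is a minimal sketch of the round trip that AddContact and GetContact perform; the helper names encodeContact and decodeContact are assumptions introduced for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Contact struct {
	ID    string `json:"id"`
	Name  string `json:"name"`
	Phone string `json:"phone"`
	Email string `json:"email"`
}

// encodeContact builds the key/value pair that would be stored for c.
func encodeContact(c Contact) (key, value []byte, err error) {
	value, err = json.Marshal(c)
	if err != nil {
		return nil, nil, err
	}
	return []byte("contact:" + c.ID), value, nil
}

// decodeContact turns a stored JSON value back into a Contact.
func decodeContact(value []byte) (Contact, error) {
	var c Contact
	err := json.Unmarshal(value, &c)
	return c, err
}

func main() {
	key, value, err := encodeContact(Contact{ID: "1", Name: "Alice Johnson", Phone: "555-0101"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s → %s\n", key, value)

	c, err := decodeContact(value)
	fmt.Println(c.Name, err)
}
```

Keeping encoding and decoding in two small functions makes it easier to switch to another format later without touching the database code.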

Example 4: Adding Search by Name (Secondary Index)

Now let's add the ability to search contacts by name:

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "strings"
    
    "github.com/cockroachdb/pebble"
)

type Contact struct {
    ID    string `json:"id"`
    Name  string `json:"name"`
    Phone string `json:"phone"`
    Email string `json:"email"`
}

type ContactBook struct {
    db *pebble.DB
}

func NewContactBook(path string) (*ContactBook, error) {
    db, err := pebble.Open(path, &pebble.Options{})
    if err != nil {
        return nil, err
    }
    return &ContactBook{db: db}, nil
}

func (cb *ContactBook) Close() error {
    return cb.db.Close()
}

// AddContact now creates TWO entries:
// 1. The main contact data: "contact:ID" → contact JSON
// 2. A name index: "name:LOWERCASE_NAME:ID" → ID
func (cb *ContactBook) AddContact(contact Contact) error {
    contactJSON, err := json.Marshal(contact)
    if err != nil {
        return err
    }
    
    // Use a batch to write multiple keys atomically
    // This means: either ALL writes succeed, or NONE do
    batch := cb.db.NewBatch()
    defer batch.Close()
    
    // Write #1: Main contact data
    contactKey := fmt.Sprintf("contact:%s", contact.ID)
    if err := batch.Set([]byte(contactKey), contactJSON, nil); err != nil {
        return err
    }
    
    // Write #2: Name index
    // We store the name in lowercase for case-insensitive search.
    // The ID is appended so that two contacts with the same name
    // get separate index entries instead of overwriting each other.
    nameKey := fmt.Sprintf("name:%s:%s", strings.ToLower(contact.Name), contact.ID)
    // The value is just the ID - we'll use it to look up the full contact
    if err := batch.Set([]byte(nameKey), []byte(contact.ID), nil); err != nil {
        return err
    }
    
    // Commit both writes together
    if err := batch.Commit(pebble.Sync); err != nil {
        return err
    }
    
    fmt.Printf("✓ Added contact with name index: %s\n", contact.Name)
    return nil
}

// SearchByName finds contacts whose names start with the given prefix
func (cb *ContactBook) SearchByName(namePrefix string) ([]Contact, error) {
    // Convert to lowercase for case-insensitive search
    prefix := strings.ToLower(namePrefix)
    
    // Create bounds for our search
    lowerBound := fmt.Sprintf("name:%s", prefix)
    upperBound := fmt.Sprintf("name:%s\xff", prefix)
    
    iter, err := cb.db.NewIter(&pebble.IterOptions{
        LowerBound: []byte(lowerBound),
        UpperBound: []byte(upperBound),
    })
    if err != nil {
        return nil, err
    }
    defer iter.Close()
    
    var contacts []Contact
    
    // For each matching name...
    for iter.First(); iter.Valid(); iter.Next() {
        // The value is the contact ID
        contactID := string(iter.Value())
        
        // Look up the full contact using the ID
        contact, err := cb.GetContact(contactID)
        if err != nil {
            continue // Skip if we can't find it
        }
        
        contacts = append(contacts, *contact)
    }
    
    return contacts, iter.Error()
}

func (cb *ContactBook) GetContact(id string) (*Contact, error) {
    key := fmt.Sprintf("contact:%s", id)
    data, closer, err := cb.db.Get([]byte(key))
    if err != nil {
        return nil, err
    }
    defer closer.Close()
    
    var contact Contact
    if err := json.Unmarshal(data, &contact); err != nil {
        return nil, err
    }
    return &contact, nil
}

func main() {
    book, err := NewContactBook("./searchable_contacts")
    if err != nil {
        log.Fatal(err)
    }
    defer book.Close()
    
    // Add contacts
    contacts := []Contact{
        {ID: "1", Name: "Alice Anderson", Phone: "555-0101", Email: "[email protected]"},
        {ID: "2", Name: "Alice Baker", Phone: "555-0102", Email: "[email protected]"},
        {ID: "3", Name: "Bob Smith", Phone: "555-0103", Email: "[email protected]"},
        {ID: "4", Name: "Charlie Brown", Phone: "555-0104", Email: "[email protected]"},
    }
    
    fmt.Println("=== Adding Contacts ===")
    for _, contact := range contacts {
        if err := book.AddContact(contact); err != nil {
            log.Printf("Error: %v", err)
        }
    }
    
    // Search for names starting with "alice"
    fmt.Println("\n=== Searching for 'alice' ===")
    results, err := book.SearchByName("alice")
    if err != nil {
        log.Printf("Error: %v", err)
    } else {
        fmt.Printf("Found %d contacts:\n", len(results))
        for _, c := range results {
            fmt.Printf("  - %s: %s\n", c.Name, c.Phone)
        }
    }
    
    // Search for names starting with "bob"
    fmt.Println("\n=== Searching for 'bob' ===")
    results, err = book.SearchByName("bob")
    if err != nil {
        log.Printf("Error: %v", err)
    } else {
        fmt.Printf("Found %d contacts:\n", len(results))
        for _, c := range results {
            fmt.Printf("  - %s: %s\n", c.Name, c.Phone)
        }
    }
}

Understanding Secondary Indexes:

Think of it like a library:

  • Primary Index (contact:ID): The actual book on the shelf
  • Secondary Index (name:NAME): A card catalog that tells you where to find the book

When you search by name:

  1. Look in the name index → find the ID
  2. Use the ID to get the full contact data
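The two-step lookup can be sketched without Pebble at all. The toy memStore type below stands in for an ordered key-value store (it is an assumption for illustration, not a Pebble type), and scanPrefix plays the role of a bounded iterator.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// memStore is a toy stand-in for an ordered key-value store.
type memStore map[string]string

// scanPrefix returns the values of all keys starting with prefix, in key order.
func (s memStore) scanPrefix(prefix string) []string {
	keys := make([]string, 0, len(s))
	for k := range s {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	values := make([]string, 0, len(keys))
	for _, k := range keys {
		values = append(values, s[k])
	}
	return values
}

// searchByName performs the two steps: read IDs from the name index,
// then fetch each full record through the primary key.
func searchByName(s memStore, namePrefix string) []string {
	var records []string
	for _, id := range s.scanPrefix("name:" + strings.ToLower(namePrefix)) {
		if rec, ok := s["contact:"+id]; ok {
			records = append(records, rec)
		}
	}
	return records
}

func main() {
	s := memStore{
		"contact:1":           "Alice Anderson, 555-0101",
		"contact:2":           "Alice Baker, 555-0102",
		"contact:3":           "Bob Smith, 555-0103",
		"name:alice anderson": "1",
		"name:alice baker":    "2",
		"name:bob smith":      "3",
	}
	fmt.Println(searchByName(s, "Alice"))
}
```

Notice that the index only stores IDs. The full record lives in exactly one place, so updating a phone number never requires touching the index.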

Why use batches?

batch := cb.db.NewBatch()
batch.Set(key1, value1, nil)
batch.Set(key2, value2, nil)
batch.Commit(pebble.Sync)

This ensures that BOTH the contact and the name index are written together. If one fails, neither is written (atomicity).


Example 5: Range Queries - Finding Products by Price

Let's build a product catalog where we can search by price range:

package main

import (
    "encoding/binary"
    "encoding/json"
    "fmt"
    "log"
    "math"
    
    "github.com/cockroachdb/pebble"
)

type Product struct {
    ID    string  `json:"id"`
    Name  string  `json:"name"`
    Price float64 `json:"price"`
}

type ProductCatalog struct {
    db *pebble.DB
}

func NewProductCatalog(path string) (*ProductCatalog, error) {
    db, err := pebble.Open(path, &pebble.Options{})
    if err != nil {
        return nil, err
    }
    return &ProductCatalog{db: db}, nil
}

func (pc *ProductCatalog) Close() error {
    return pc.db.Close()
}

// encodePrice converts a price to bytes that sort correctly
// This is CRUCIAL for range queries to work!
func encodePrice(price float64) []byte {
    // Convert float to uint64 bits
    // (Float64bits lives in the math package, not encoding/binary)
    bits := math.Float64bits(price)
    
    // Flip bits so negative numbers sort before positive
    if price >= 0 {
        bits ^= 0x8000000000000000
    } else {
        bits ^= 0xffffffffffffffff
    }
    
    // Convert to bytes in big-endian format (most significant byte first)
    buf := make([]byte, 8)
    binary.BigEndian.PutUint64(buf, bits)
    return buf
}

// decodePrice converts bytes back to a price
func decodePrice(buf []byte) float64 {
    bits := binary.BigEndian.Uint64(buf)
    
    // Reverse the bit flipping
    if (bits & 0x8000000000000000) != 0 {
        bits ^= 0x8000000000000000
    } else {
        bits ^= 0xffffffffffffffff
    }
    
    return math.Float64frombits(bits)
}

// AddProduct stores a product with a price index
func (pc *ProductCatalog) AddProduct(product Product) error {
    productJSON, err := json.Marshal(product)
    if err != nil {
        return err
    }
    
    batch := pc.db.NewBatch()
    defer batch.Close()
    
    // Store #1: Main product data
    productKey := fmt.Sprintf("product:%s", product.ID)
    if err := batch.Set([]byte(productKey), productJSON, nil); err != nil {
        return err
    }
    
    // Store #2: Price index
    // Key format: "price:" + encoded_price + ":" + product_id
    // This allows us to find all products in a price range
    priceBytes := encodePrice(product.Price)
    priceKey := fmt.Sprintf("price:%s:%s", string(priceBytes), product.ID)
    if err := batch.Set([]byte(priceKey), []byte(product.ID), nil); err != nil {
        return err
    }
    
    if err := batch.Commit(pebble.Sync); err != nil {
        return err
    }
    
    fmt.Printf("✓ Added product: %s ($%.2f)\n", product.Name, product.Price)
    return nil
}

// GetProduct retrieves a product by ID
func (pc *ProductCatalog) GetProduct(id string) (*Product, error) {
    key := fmt.Sprintf("product:%s", id)
    data, closer, err := pc.db.Get([]byte(key))
    if err != nil {
        return nil, err
    }
    defer closer.Close()
    
    var product Product
    if err := json.Unmarshal(data, &product); err != nil {
        return nil, err
    }
    return &product, nil
}

// FindProductsByPriceRange finds all products within a price range
func (pc *ProductCatalog) FindProductsByPriceRange(minPrice, maxPrice float64) ([]Product, error) {
    // Encode the price boundaries
    minPriceBytes := encodePrice(minPrice)
    maxPriceBytes := encodePrice(maxPrice)
    
    // Create search bounds
    lowerBound := fmt.Sprintf("price:%s", string(minPriceBytes))
    upperBound := fmt.Sprintf("price:%s\xff", string(maxPriceBytes))
    
    iter, err := pc.db.NewIter(&pebble.IterOptions{
        LowerBound: []byte(lowerBound),
        UpperBound: []byte(upperBound),
    })
    if err != nil {
        return nil, err
    }
    defer iter.Close()
    
    var products []Product
    
    for iter.First(); iter.Valid(); iter.Next() {
        // The value in the price index is the product ID
        productID := string(iter.Value())
        
        // Look up the full product
        product, err := pc.GetProduct(productID)
        if err != nil {
            continue
        }
        
        products = append(products, *product)
    }
    
    return products, iter.Error()
}

func main() {
    catalog, err := NewProductCatalog("./products")
    if err != nil {
        log.Fatal(err)
    }
    defer catalog.Close()
    
    // Add products with various prices
    products := []Product{
        {ID: "1", Name: "Budget Mouse", Price: 9.99},
        {ID: "2", Name: "Standard Keyboard", Price: 29.99},
        {ID: "3", Name: "Gaming Headset", Price: 79.99},
        {ID: "4", Name: "Wireless Mouse", Price: 49.99},
        {ID: "5", Name: "Mechanical Keyboard", Price: 129.99},
        {ID: "6", Name: "4K Monitor", Price: 399.99},
        {ID: "7", Name: "USB Cable", Price: 5.99},
    }
    
    fmt.Println("=== Adding Products ===")
    for _, product := range products {
        if err := catalog.AddProduct(product); err != nil {
            log.Printf("Error: %v", err)
        }
    }
    
    // Search for products under $50
    fmt.Println("\n=== Products Under $50 ===")
    affordable, err := catalog.FindProductsByPriceRange(0, 50)
    if err != nil {
        log.Printf("Error: %v", err)
    } else {
        for _, p := range affordable {
            fmt.Printf("  - %s: $%.2f\n", p.Name, p.Price)
        }
    }
    
    // Search for mid-range products ($50-$150)
    fmt.Println("\n=== Mid-Range Products ($50-$150) ===")
    midRange, err := catalog.FindProductsByPriceRange(50, 150)
    if err != nil {
        log.Printf("Error: %v", err)
    } else {
        for _, p := range midRange {
            fmt.Printf("  - %s: $%.2f\n", p.Name, p.Price)
        }
    }
    
    // Search for premium products (over $150)
    fmt.Println("\n=== Premium Products (Over $150) ===")
    premium, err := catalog.FindProductsByPriceRange(150, 10000)
    if err != nil {
        log.Printf("Error: %v", err)
    } else {
        for _, p := range premium {
            fmt.Printf("  - %s: $%.2f\n", p.Name, p.Price)
        }
    }
}

Why Encode Prices?

Imagine storing prices as strings:

"10.00"
"100.00"
"20.00"
"5.00"

Alphabetically, they sort as: "10.00", "100.00", "20.00", "5.00" ❌ Wrong!

By encoding as bytes, they sort numerically: 5.00, 10.00, 20.00, 100.00 ✓ Correct!
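You can check this claim without a database. The sketch below applies the same bit-flipping idea as encodePrice, using math.Float64bits from the standard library, then sorts a few prices by their encoded bytes. The names sortableFloat and sortByEncoding are assumptions for this illustration.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"math"
	"sort"
)

// sortableFloat encodes f so that bytes.Compare order matches numeric order.
func sortableFloat(f float64) []byte {
	bits := math.Float64bits(f)
	if bits&(1<<63) == 0 {
		bits ^= 1 << 63 // sign bit clear (non-negative): set the sign bit
	} else {
		bits = ^bits // sign bit set (negative): invert every bit
	}
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, bits)
	return buf
}

// sortByEncoding sorts prices by their encoded bytes, the way an ordered
// key-value store would sort the corresponding keys.
func sortByEncoding(prices []float64) []float64 {
	out := append([]float64(nil), prices...)
	sort.Slice(out, func(i, j int) bool {
		return bytes.Compare(sortableFloat(out[i]), sortableFloat(out[j])) < 0
	})
	return out
}

func main() {
	fmt.Println(sortByEncoding([]float64{100, 5, -2.5, 20, 10}))
}
```

Big-endian byte order matters here: the most significant byte comes first, so comparing keys left to right compares the largest part of the number first.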


Example 6: Building a Blog System with Multiple Indexes

Let's build a complete blog system with posts that can be searched by:

  • Author
  • Tag
  • Date range

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "strings"
    "time"
    
    "github.com/cockroachdb/pebble"
)

type BlogPost struct {
    ID        string    `json:"id"`
    Title     string    `json:"title"`
    Content   string    `json:"content"`
    Author    string    `json:"author"`
    Tags      []string  `json:"tags"`
    CreatedAt time.Time `json:"created_at"`
}

type Blog struct {
    db *pebble.DB
}

func NewBlog(path string) (*Blog, error) {
    db, err := pebble.Open(path, &pebble.Options{})
    if err != nil {
        return nil, err
    }
    return &Blog{db: db}, nil
}

func (b *Blog) Close() error {
    return b.db.Close()
}

// PublishPost creates multiple indexes for a blog post
func (b *Blog) PublishPost(post BlogPost) error {
    postJSON, err := json.Marshal(post)
    if err != nil {
        return err
    }
    
    batch := b.db.NewBatch()
    defer batch.Close()
    
    // Index 1: Main post data
    // Key: "post:ID"
    postKey := fmt.Sprintf("post:%s", post.ID)
    if err := batch.Set([]byte(postKey), postJSON, nil); err != nil {
        return err
    }
    
    // Index 2: Author index
    // Key: "author:AUTHOR_NAME:TIMESTAMP:ID"
    // This allows us to find all posts by an author, sorted by date
    authorKey := fmt.Sprintf("author:%s:%d:%s",
        strings.ToLower(post.Author),
        post.CreatedAt.Unix(),
        post.ID)
    if err := batch.Set([]byte(authorKey), []byte(post.ID), nil); err != nil {
        return err
    }
    
    // Index 3: Tag indexes (one for each tag)
    // Key: "tag:TAG_NAME:TIMESTAMP:ID"
    for _, tag := range post.Tags {
        tagKey := fmt.Sprintf("tag:%s:%d:%s",
            strings.ToLower(tag),
            post.CreatedAt.Unix(),
            post.ID)
        if err := batch.Set([]byte(tagKey), []byte(post.ID), nil); err != nil {
            return err
        }
    }
    
    // Index 4: Date index
    // Key: "date:TIMESTAMP:ID"
    // This allows us to find posts in a date range
    dateKey := fmt.Sprintf("date:%d:%s", post.CreatedAt.Unix(), post.ID)
    if err := batch.Set([]byte(dateKey), []byte(post.ID), nil); err != nil {
        return err
    }
    
    if err := batch.Commit(pebble.Sync); err != nil {
        return err
    }
    
    fmt.Printf("✓ Published: '%s' by %s\n", post.Title, post.Author)
    return nil
}

// GetPost retrieves a post by ID
func (b *Blog) GetPost(id string) (*BlogPost, error) {
    key := fmt.Sprintf("post:%s", id)
    data, closer, err := b.db.Get([]byte(key))
    if err != nil {
        return nil, err
    }
    defer closer.Close()
    
    var post BlogPost
    if err := json.Unmarshal(data, &post); err != nil {
        return nil, err
    }
    return &post, nil
}

// FindPostsByAuthor finds all posts by a specific author
func (b *Blog) FindPostsByAuthor(author string) ([]BlogPost, error) {
    prefix := fmt.Sprintf("author:%s:", strings.ToLower(author))
    return b.findPostsByPrefix(prefix)
}

// FindPostsByTag finds all posts with a specific tag
func (b *Blog) FindPostsByTag(tag string) ([]BlogPost, error) {
    prefix := fmt.Sprintf("tag:%s:", strings.ToLower(tag))
    return b.findPostsByPrefix(prefix)
}

// FindPostsByDateRange finds posts within a date range
func (b *Blog) FindPostsByDateRange(start, end time.Time) ([]BlogPost, error) {
    lowerBound := fmt.Sprintf("date:%d:", start.Unix())
    upperBound := fmt.Sprintf("date:%d:\xff", end.Unix())
    
    iter, err := b.db.NewIter(&pebble.IterOptions{
        LowerBound: []byte(lowerBound),
        UpperBound: []byte(upperBound),
    })
    if err != nil {
        return nil, err
    }
    defer iter.Close()
    
    var posts []BlogPost
    seen := make(map[string]bool) // Prevent duplicates
    
    for iter.First(); iter.Valid(); iter.Next() {
        postID := string(iter.Value())
        
        if seen[postID] {
            continue
        }
        seen[postID] = true
        
        post, err := b.GetPost(postID)
        if err != nil {
            continue
        }
        posts = append(posts, *post)
    }
    
    return posts, iter.Error()
}

// Helper function to find posts by prefix
func (b *Blog) findPostsByPrefix(prefix string) ([]BlogPost, error) {
    iter, err := b.db.NewIter(&pebble.IterOptions{
        LowerBound: []byte(prefix),
        UpperBound: []byte(prefix + "\xff"),
    })
    if err != nil {
        return nil, err
    }
    defer iter.Close()
    
    var posts []BlogPost
    seen := make(map[string]bool)
    
    for iter.First(); iter.Valid(); iter.Next() {
        postID := string(iter.Value())
        
        if seen[postID] {
            continue
        }
        seen[postID] = true
        
        post, err := b.GetPost(postID)
        if err != nil {
            continue
        }
        posts = append(posts, *post)
    }
    
    return posts, iter.Error()
}

func main() {
    blog, err := NewBlog("./blog")
    if err != nil {
        log.Fatal(err)
    }
    defer blog.Close()
    
    // Create some blog posts
    posts := []BlogPost{
        {
            ID:        "1",
            Title:     "Getting Started with Go",
            Content:   "Go is a great language...",
            Author:    "Alice",
            Tags:      []string{"go", "programming", "tutorial"},
            CreatedAt: time.Now().Add(-48 * time.Hour),
        },
        {
            ID:        "2",
            Title:     "Advanced Go Patterns",
            Content:   "Let's explore advanced patterns...",
            Author:    "Alice",
            Tags:      []string{"go", "advanced", "patterns"},
            CreatedAt: time.Now().Add(-24 * time.Hour),
        },
        {
            ID:        "3",
            Title:     "Introduction to Databases",
            Content:   "Databases are essential...",
            Author:    "Bob",
            Tags:      []string{"database", "tutorial"},
            CreatedAt: time.Now().Add(-12 * time.Hour),
        },
        {
            ID:        "4",
            Title:     "Go and Databases",
            Content:   "Using databases in Go...",
            Author:    "Alice",
            Tags:      []string{"go", "database"},
            CreatedAt: time.Now(),
        },
    }
    
    fmt.Println("=== Publishing Posts ===")
    for _, post := range posts {
        if err := blog.PublishPost(post); err != nil {
            log.Printf("Error: %v", err)
        }
    }
    
    // Search by author
    fmt.Println("\n=== Posts by Alice ===")
    alicePosts, err := blog.FindPostsByAuthor("Alice")
    if err != nil {
        log.Printf("Error: %v", err)
    } else {
        for _, p := range alicePosts {
            fmt.Printf("  - %s (published %s)\n", p.Title, p.CreatedAt.Format("2006-01-02"))
        }
    }
    
    // Search by tag
    fmt.Println("\n=== Posts tagged 'go' ===")
    goPosts, err := blog.FindPostsByTag("go")
    if err != nil {
        log.Printf("Error: %v", err)
    } else {
        for _, p := range goPosts {
            fmt.Printf("  - %s by %s\n", p.Title, p.Author)
        }
    }
    
    // Search by date range (last 36 hours)
    fmt.Println("\n=== Posts from last 36 hours ===")
    recentPosts, err := blog.FindPostsByDateRange(
        time.Now().Add(-36*time.Hour),
        time.Now(),
    )
    if err != nil {
        log.Printf("Error: %v", err)
    } else {
        for _, p := range recentPosts {
            fmt.Printf("  - %s (%s ago)\n", p.Title, time.Since(p.CreatedAt).Round(time.Hour))
        }
    }
}

Key Concepts in This Example:

  1. Multiple Indexes: Each post is indexed in 4 different ways
  2. Composite Keys: Keys like "author:alice:1234567890:post1" combine multiple pieces of information
  3. Deduplication: We use a seen map to avoid returning the same post multiple times
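  4. Time-based Sorting: Including timestamps in keys sorts entries by date within each prefix. This relies on every timestamp having the same number of digits, which holds for 10-digit Unix seconds (dates from 2001 through 2286)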
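One detail about timestamps in keys is worth a closer look: a number written with %d only sorts correctly while every value has the same number of digits. The sketch below contrasts the unpadded form with a zero-padded variant. The helper names dateKey and unpaddedDateKey and the 20-digit width are assumptions for illustration; in the blog example you would pass post.CreatedAt.Unix().

```go
package main

import "fmt"

// unpaddedDateKey mirrors the "date:TIMESTAMP:ID" format used above.
func unpaddedDateKey(unixSeconds int64, id string) string {
	return fmt.Sprintf("date:%d:%s", unixSeconds, id)
}

// dateKey is a hypothetical variant that zero-pads the timestamp to a fixed
// width of 20 digits. It assumes the timestamp is not negative.
func dateKey(unixSeconds int64, id string) string {
	return fmt.Sprintf("date:%020d:%s", unixSeconds, id)
}

func main() {
	older := int64(999999999)  // 9 digits
	newer := int64(1000000000) // 10 digits, one second later

	// Unpadded: the older key sorts AFTER the newer one, because '9' > '1'.
	fmt.Println(unpaddedDateKey(older, "a") < unpaddedDateKey(newer, "b"))

	// Padded: byte order matches chronological order.
	fmt.Println(dateKey(older, "a") < dateKey(newer, "b"))
}
```

Fixed-width keys cost a few extra bytes per entry, but they remove a whole class of ordering surprises.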

Common Patterns and Best Practices

1. Key Design Patterns

// Pattern 1: Simple ID lookup
"user:123" → user data

// Pattern 2: Secondary index
"email:[email protected]" → "123"  // Then lookup "user:123"

// Pattern 3: Sorted index with timestamp
"posts:2024-01-15:post123" → post ID

// Pattern 4: Composite index
"author:alice:2024-01-15:post123" → post ID

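Composite keys are easiest to manage when one function builds them and another takes them apart. Here is a minimal sketch for Pattern 4; the names authorKey and parseAuthorKey are assumptions, and the code assumes that no component contains the ":" separator.

```go
package main

import (
	"fmt"
	"strings"
)

// authorKey builds "author:AUTHOR:DATE:POSTID" with the author lowercased.
// It assumes none of the parts contains ':'.
func authorKey(author, date, postID string) string {
	return strings.Join([]string{"author", strings.ToLower(author), date, postID}, ":")
}

// parseAuthorKey splits a key built by authorKey back into its parts.
func parseAuthorKey(key string) (author, date, postID string, err error) {
	parts := strings.Split(key, ":")
	if len(parts) != 4 || parts[0] != "author" {
		return "", "", "", fmt.Errorf("malformed author key %q", key)
	}
	return parts[1], parts[2], parts[3], nil
}

func main() {
	key := authorKey("Alice", "2024-01-15", "post123")
	fmt.Println(key)

	author, date, postID, err := parseAuthorKey(key)
	fmt.Println(author, date, postID, err)
}
```

If a component can contain ":" (a timestamp such as 10:30, for example), pick a separator that cannot appear in the data or escape it before building the key.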
2. Always Use Batches for Multiple Writes

// ❌ Bad: Multiple separate writes
db.Set(key1, value1, pebble.Sync)
db.Set(key2, value2, pebble.Sync)

// ✓ Good: Atomic batch
batch := db.NewBatch()
batch.Set(key1, value1, nil)
batch.Set(key2, value2, nil)
batch.Commit(pebble.Sync)

3. Always Close Resources

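// Check the error first, then defer Close()
db, err := pebble.Open("./db", &pebble.Options{})
if err != nil {
    log.Fatal(err)
}
defer db.Close()

iter, err := db.NewIter(nil)
if err != nil {
    log.Fatal(err)
}
defer iter.Close()

data, closer, err := db.Get(key)
if err != nil {
    log.Fatal(err)
}
defer closer.Close() // data is only valid until closer is closed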

4. Handle Errors Properly

// Check for specific errors
data, closer, err := db.Get(key)
if err == pebble.ErrNotFound {
    // Handle missing key
    return nil, fmt.Errorf("not found")
}
if err != nil {
    // Handle other errors
    return nil, err
}
defer closer.Close()
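A direct comparison with == only works while the error is returned unwrapped. Once a function adds context with %w, errors.Is is the safer check. The sketch below uses a local errNotFound as a stand-in for a sentinel such as pebble.ErrNotFound; getUser and isNotFound are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a sentinel error such as pebble.ErrNotFound.
var errNotFound = errors.New("not found")

// getUser is a hypothetical lookup that always misses and wraps the
// sentinel with extra context using %w.
func getUser(id string) error {
	return fmt.Errorf("get user %q: %w", id, errNotFound)
}

// isNotFound reports whether err is, or wraps, errNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, errNotFound)
}

func main() {
	err := getUser("123")
	fmt.Println(err)
	fmt.Println(err == errNotFound) // direct comparison misses the wrapped sentinel
	fmt.Println(isNotFound(err))    // errors.Is walks the wrap chain
}
```

Wrapping with %w keeps the original cause available to callers while still producing a readable message.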

Performance Tips

  1. Use Batches: Group multiple writes together
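  2. Reuse Iterators: Creating an iterator has a cost, so reposition an open one (for example with SeekGE or SetBounds) instead of creating a new one for every lookup
  3. Choose Sync Wisely: pebble.Sync waits until the write has reached disk; pebble.NoSync returns sooner, but the most recent writes can be lost if the process or machine crashes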
  4. Index Strategically: Only create indexes you'll actually use
  5. Keep Keys Short: Shorter keys = less storage and faster lookups

Summary

Pebble is a powerful key-value store that becomes a searchable database through clever key design:

  1. Primary Keys: Direct access to data
  2. Secondary Indexes: Additional ways to find data
  3. Composite Keys: Combine multiple attributes for complex queries
  4. Range Queries: Use sorted keys to find ranges of data
  5. Batches: Ensure consistency across multiple writes

The key to success is designing your key schema carefully based on how you need to search your data!
