Technical Implementation of RAG System for Sri Lankan Tourism & Governance

Comprehensive technical guide for developers and implementers


System Architecture Overview

Complete RAG Workflow

┌─────────────────────────────────────────────────────────────────────────────────┐
│                          RAG System Architecture                                │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    │
│  │   Tourist   │    │   Citizen   │    │ Government  │    │   Business  │    │
│  │   Query     │    │   Query     │    │    Query    │    │   Query     │    │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘    └──────┬──────┘    │
│         │                  │                  │                  │            │
│         └──────────────────┼──────────────────┼──────────────────┘            │
│                            │                  │                               │
│                            ▼                  ▼                               │
│                    ┌─────────────────────────────┐                            │
│                    │     Query Processing        │                            │
│                    │  • Language Detection       │                            │
│                    │  • Text Normalization       │                            │
│                    │  • Vector Embedding         │                            │
│                    └─────────────┬───────────────┘                            │
│                                  │                                            │
│                                  ▼                                            │
│                    ┌─────────────────────────────┐                            │
│                    │    Qdrant Vector Search     │                            │
│                    │  • Semantic Similarity      │                            │
│                    │  • Cosine Distance          │                            │
│                    │  • Multi-source Retrieval   │                            │
│                    └─────────────┬───────────────┘                            │
│                                  │                                            │
│                                  ▼                                            │
│  ┌─────────────────────────────────────────────────────────────────────────┐  │
│  │                      Knowledge Base                                     │  │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐    │  │
│  │  │   Tourism   │ │ Government  │ │  Cultural   │ │ Emergency   │    │  │
│  │  │    Data     │ │  Services   │ │   Content   │ │   Info      │    │  │
│  │  │• Hotels     │ │• Procedures │ │• Festivals  │ │• Contacts   │    │  │
│  │  │• Attractions│ │• Documents  │ │• Traditions │ │• Hotlines   │    │  │
│  │  │• Transport  │ │• Policies   │ │• Languages  │ │• Hospitals  │    │  │
│  │  └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘    │  │
│  └─────────────────────────────────────────────────────────────────────────┘  │
│                                  │                                            │
│                                  ▼                                            │
│                    ┌─────────────────────────────┐                            │
│                    │    Context Assembly         │                            │
│                    │  • Top 6 Relevant Chunks    │                            │
│                    │  • Score-based Ranking      │                            │
│                    │  • Deduplication            │                            │
│                    └─────────────┬───────────────┘                            │
│                                  │                                            │
│                                  ▼                                            │
│                    ┌─────────────────────────────┐                            │
│                    │    AI Response Generation   │                            │
│                    │  • Groq/Llama3 API         │                            │
│                    │  • Context + Query          │                            │
│                    │  • Fallback to Ollama       │                            │
│                    └─────────────┬───────────────┘                            │
│                                  │                                            │
│                                  ▼                                            │
│                    ┌─────────────────────────────┐                            │
│                    │   Intelligent Response      │                            │
│                    │  • Accurate Information     │                            │
│                    │  • Cultural Context         │                            │
│                    │  • Multi-language Support   │                            │
│                    └─────────────────────────────┘                            │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘

Simple Flow: User Query → Vector Search → Relevant Documents → AI Generation → Contextual Response
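
The "Context Assembly" and "AI Response Generation" stages combine the retrieved chunks with the user's question before the model is called. A minimal sketch of that step, assuming each chunk carries `content` and `score` fields. The helper name `buildPrompt` and the instruction wording are illustrative and not taken from the project:

```javascript
// Hypothetical helper: assemble an LLM prompt from retrieved chunks.
// The chunk shape ({ content, score }) and the prompt wording are assumptions.
function buildPrompt(query, chunks, maxChunks = 6) {
  const context = [...chunks]
    .sort((a, b) => b.score - a.score)   // highest-scoring chunks first
    .slice(0, maxChunks)                 // the diagram specifies the top 6 chunks
    .map((chunk, i) => `[${i + 1}] ${chunk.content}`)
    .join('\n');

  return [
    'Answer the question using only the context below.',
    'If the context does not contain the answer, say so.',
    '',
    'Context:',
    context,
    '',
    `Question: ${query}`
  ].join('\n');
}

const prompt = buildPrompt('Where can I stay in Kandy?', [
  { content: 'Restaurants in Kandy', score: 0.31 },
  { content: 'Hotels in Kandy', score: 0.92 }
]);
console.log(prompt);
```

Numbering the chunks makes it possible to ask the model to cite which passage it relied on, which helps when auditing answers about government procedures.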

Qdrant Vector Database Implementation

Database Configuration

Our implementation uses Qdrant, a high-performance vector database that enables semantic search capabilities:

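// Qdrant Configuration
// Assumes the official REST client package, which matches the calls used in this guide
import { QdrantClient } from '@qdrant/js-client-rest';

const client = new QdrantClient({ 
  host: process.env.QDRANT_HOST || 'localhost', 
  port: 6333 
});

// Collection Setup
const collection = {
  name: 'tourism_kb',
  vectors: {
    size: 384,  // all-MiniLM-L6-v2 embeddings
    distance: 'Cosine'
  }
};

// Create the collection once, before ingesting any documents
await client.createCollection(collection.name, { vectors: collection.vectors });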

Understanding Cosine Similarity

Cosine similarity measures how similar two documents are by comparing the direction of their vectors rather than their exact words. Imagine two arrows pointing in space: if they point in the same direction, the documents are similar (score near 1.0); if they are perpendicular, the documents are unrelated (score near 0.0); if they point in opposite directions, the score approaches -1.0.

In our RAG system:

  • Each document becomes a 384-dimensional arrow (vector) based on its meaning
  • User queries also become arrows in the same space
  • Cosine similarity finds documents pointing in the same direction as the query
  • This works even when different words are used - "hotel" and "accommodation" point in similar directions
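
The arrow comparison can be written out directly. The sketch below is a plain JavaScript version of the standard formula, dot(a, b) / (|a| · |b|); Qdrant computes this internally, so the function is for illustration only and is not part of the project's code:

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|). The result lies in [-1, 1].
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // convention for zero vectors
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 2], [2, 4]));   // same direction, close to 1
console.log(cosineSimilarity([1, 0], [0, 1]));   // perpendicular
console.log(cosineSimilarity([1, 0], [-1, 0]));  // opposite direction
```

Because the score depends only on direction, scaling a vector (as in the first call) does not change the result.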

RAG Search Implementation

RAG System Example:

User Query: "Where can I stay in Kandy?"
                    |
                    ↓
            [Vector Embedding]
                    |
                    ↓
         Query Vector: [0.2, 0.8, 0.1, ...]
                    |
                    ↓
      [Cosine Similarity Search in Tourism Database]
                    |
                    ↓
    ┌─────────────────────────────────────────────────┐
    │ Document Vectors & Similarity Scores:           │
    │                                                 │
    │ "Hotels in Kandy"           → 0.92 (Very High) │
    │ "Guesthouses near Kandy"    → 0.87 (High)      │
    │ "Accommodation options"     → 0.84 (High)      │
    │ "Restaurants in Kandy"      → 0.31 (Low)       │
    │ "Beaches in Galle"          → 0.12 (Very Low)  │
    └─────────────────────────────────────────────────┘
                    |
                    ↓
    Top 3 matches passed to the LLM as context
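
The ranking step in this example can be sketched in a few lines. When vectors are normalized to unit length, cosine similarity reduces to a dot product, so ranking is a sort by dot product. The 3-dimensional vectors below are made up for illustration (real embeddings have 384 dimensions), and `topK` is a hypothetical helper, not a Qdrant API:

```javascript
// Toy vectors invented for this example; they are not real embeddings.
const normalize = (v) => {
  const norm = Math.hypot(...v);
  return v.map((x) => x / norm);
};
const dot = (a, b) => a.reduce((sum, x, i) => sum + x * b[i], 0);

// Rank documents by cosine similarity to the query and keep the best k.
function topK(queryVector, documents, k = 3) {
  const q = normalize(queryVector);
  return documents
    .map((doc) => ({ title: doc.title, score: dot(q, normalize(doc.vector)) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

const documents = [
  { title: 'Hotels in Kandy',        vector: [0.2, 0.8, 0.1] },
  { title: 'Guesthouses near Kandy', vector: [0.3, 0.7, 0.2] },
  { title: 'Restaurants in Kandy',   vector: [0.9, 0.2, 0.1] },
  { title: 'Beaches in Galle',       vector: [0.1, 0.1, 0.9] }
];

console.log(topK([0.2, 0.8, 0.1], documents, 3));
```

A vector database avoids this brute-force scan by using an approximate nearest-neighbor index, but the scores it returns have the same meaning.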

Document Processing Pipeline

Multi-Format Support

// Document Processing Pipeline
const supportedFormats = {
  csv: processCSV,      // Government databases, tourism data
  pdf: processPDF,      // Official documents, brochures
  markdown: processMD,  // Tourism guides, policies
  excel: processXLS     // Budget sheets, statistics
};

Complete Processing Workflow

┌─────────────────────────────────────────────────────────────────────────────────┐
│                        Document Ingestion Pipeline                              │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐           │
│  │   CSV       │  │    PDF      │  │  Markdown   │  │   Excel     │           │
│  │  Files      │  │   Files     │  │   Files     │  │   Files     │           │
│  │             │  │             │  │             │  │             │           │
│  │• Hotels.csv │  │• Brochures  │  │• Guides.md  │  │• Data.xlsx  │           │
│  │• Attractions│  │• Policies   │  │• Culture.md │  │• Stats.xls  │           │
│  │• Transport  │  │• Procedures │  │• History.md │  │• Budget.xlsx│           │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘           │
│         │                │                │                │                  │
│         └────────────────┼────────────────┼────────────────┘                  │
│                          │                │                                   │
│                          ▼                ▼                                   │
│                    ┌─────────────────────────────┐                            │
│                    │    Format Detection         │                            │
│                    │  • MIME Type Analysis       │                            │
│                    │  • Extension Validation     │                            │
│                    │  • Content Verification     │                            │
│                    └─────────────┬───────────────┘                            │
│                                  │                                            │
│                                  ▼                                            │
│                    ┌─────────────────────────────┐                            │
│                    │    Content Extraction       │                            │
│                    │  • Text Extraction          │                            │
│                    │  • Structure Preservation   │                            │
│                    │  • Metadata Capture         │                            │
│                    └─────────────┬───────────────┘                            │
│                                  │                                            │
│                                  ▼                                            │
│                    ┌─────────────────────────────┐                            │
│                    │    Smart Chunking           │                            │
│                    │  • Structured Data: By Row  │                            │
│                    │  • Text: 800 chars + overlap│                            │
│                    │  • Paragraph-aware splits   │                            │
│                    └─────────────┬───────────────┘                            │
│                                  │                                            │
│                                  ▼                                            │
│                    ┌─────────────────────────────┐                            │
│                    │    Vector Generation        │                            │
│                    │  • all-MiniLM-L6-v2 Model  │                            │
│                    │  • 384-dimensional vectors  │                            │
│                    │  • Mean pooling + normalize │                            │
│                    └─────────────┬───────────────┘                            │
│                                  │                                            │
│                                  ▼                                            │
│                    ┌─────────────────────────────┐                            │
│                    │    Qdrant Storage           │                            │
│                    │  • Vector indexing          │                            │
│                    │  • Metadata association     │                            │
│                    │  • Similarity optimization  │                            │
│                    └─────────────────────────────┘                            │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘
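
The "Smart Chunking" stage splits free text into pieces of about 800 characters with some overlap, preferring paragraph boundaries. A minimal sketch of one way to do this follows. The function name `chunkText` and the overlap size of 100 characters are assumptions, since the guide does not state them:

```javascript
// Hypothetical chunker: sliding window that prefers to cut at paragraph breaks.
// maxChars follows the 800-character figure above; overlap = 100 is an assumption.
function chunkText(text, maxChars = 800, overlap = 100) {
  if (overlap >= maxChars) {
    throw new RangeError('overlap must be smaller than maxChars');
  }
  const chunks = [];
  let start = 0;

  while (start < text.length) {
    let end = Math.min(start + maxChars, text.length);

    if (end < text.length) {
      // Prefer the last blank line inside the window as the cut point,
      // as long as the chunk still advances past the overlap region.
      const breakAt = text.lastIndexOf('\n\n', end);
      if (breakAt > start + overlap) end = breakAt;
    }

    const chunk = text.slice(start, end).trim();
    if (chunk) chunks.push(chunk);

    if (end >= text.length) break;
    start = end - overlap; // repeat the tail so context spans chunk borders
  }
  return chunks;
}

const sample = 'A'.repeat(500) + '\n\n' + 'B'.repeat(500);
console.log(chunkText(sample).map((c) => c.length));
```

The overlap means a sentence that straddles a cut still appears whole in at least one chunk, at the cost of storing some text twice.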


Docker Containerization Strategy

Container Architecture

# docker-compose.yml
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports: ["6333:6333"]
    volumes: ["qdrant_data:/qdrant/storage"]
    
  backend:
    build: .
    ports: ["3000:3000"]
    depends_on: [qdrant]
    environment:
      - QDRANT_HOST=qdrant
      - GROQ_API_KEY=${GROQ_API_KEY}

# Named volumes must be declared at the top level
volumes:
  qdrant_data:

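The compose file passes `QDRANT_HOST` and `GROQ_API_KEY` to the backend through the environment. A small startup check can make a missing value visible early. This is a sketch only: `loadConfig` is a hypothetical helper, and the warning text assumes the Ollama fallback mentioned in the architecture diagram is available:

```javascript
// Hypothetical startup check for the variables set in docker-compose.yml.
function loadConfig(env) {
  const config = {
    qdrantHost: env.QDRANT_HOST || 'localhost', // same default as the client setup
    qdrantPort: 6333,                           // port published by the qdrant service
    groqApiKey: env.GROQ_API_KEY || null
  };

  if (!config.groqApiKey) {
    // Assumption: without a Groq key, only the local fallback model can answer
    console.warn('GROQ_API_KEY is not set; only the fallback model will be available');
  }
  return config;
}

const config = loadConfig(process.env);
console.log(`Qdrant at ${config.qdrantHost}:${config.qdrantPort}`);
```

Passing `env` as a parameter rather than reading `process.env` inside the function keeps the check easy to test.
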
Advanced RAG Scoring Algorithm

// Advanced Hybrid Search Implementation
async function performHybridSearch(query, limit = 6) {
  // Primary semantic search
  const primaryResults = await searchVectors(query, limit * 2);
  
  // Query preprocessing for better matches
  const preprocessedQuery = preprocessQuery(query);
  const secondaryResults = await searchVectors(preprocessedQuery, limit);
  
  // Advanced scoring with multiple factors
  const scoredResults = enhanceScoring(
    [...primaryResults, ...secondaryResults],
    query
  );
  
  return deduplicateAndFilter(scoredResults, limit);
}
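
`performHybridSearch` relies on a `deduplicateAndFilter` helper that is not shown in this guide. One possible implementation is sketched below, assuming each result carries `content` and `enhanced_score` fields. The default `minScore` of 0.3 is an assumption borrowed from the `score_threshold` used in the service class later in this document:

```javascript
// Hypothetical implementation of deduplicateAndFilter.
// Assumes results shaped like { content, enhanced_score, ... }.
function deduplicateAndFilter(results, limit, minScore = 0.3) {
  const seen = new Set();
  return [...results]
    .sort((a, b) => b.enhanced_score - a.enhanced_score) // best first
    .filter((result) => {
      if (result.enhanced_score < minScore) return false;
      const key = result.content.trim().toLowerCase();
      if (seen.has(key)) return false; // same chunk returned by both search passes
      seen.add(key);
      return true;
    })
    .slice(0, limit);
}

const merged = [
  { content: 'Hotels in Kandy', enhanced_score: 0.92 },
  { content: 'hotels in kandy ', enhanced_score: 1.32 },
  { content: 'Beaches in Galle', enhanced_score: 0.12 }
];
console.log(deduplicateAndFilter(merged, 6));
```

Sorting before deduplicating ensures that when the two passes return the same chunk, the copy with the higher score is the one kept.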

// Enhanced Scoring Algorithm
function enhanceScoring(results, originalQuery) {
  return results.map(result => {
    let score = result.score; // Base cosine similarity
    
    // Exact match boost
    if (result.content.toLowerCase().includes(originalQuery.toLowerCase())) {
      score += 0.4;
    }
    
    // Structured data boost (CSV sources)
    if (result.metadata.source_type === 'csv') {
      score += 0.3;
    }
    
    // Field-specific boosts
    if (result.metadata.field === 'name') {
      score += 0.5;
    } else if (result.metadata.field === 'description') {
      score += 0.3;
    }
    
    // Word overlap scoring
    const overlap = calculateWordOverlap(originalQuery, result.content);
    score += overlap * 0.2;
    
    return { ...result, enhanced_score: score };
  });
}
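
The scoring function calls `calculateWordOverlap`, which is not defined in this guide. A minimal sketch of one reasonable definition, the fraction of distinct query words that also appear in the content, is shown here. This definition is an assumption; with it, the overlap term adds at most 0.2 to the score:

```javascript
// Hypothetical implementation: share of distinct query words found in the content (0 to 1).
function calculateWordOverlap(query, content) {
  // Split on anything that is not a letter, combining mark, or digit,
  // so Sinhala and Tamil text is tokenized as well as English.
  const tokenize = (text) =>
    new Set(text.toLowerCase().split(/[^\p{L}\p{M}\p{N}]+/u).filter(Boolean));

  const queryWords = tokenize(query);
  const contentWords = tokenize(content);
  if (queryWords.size === 0) return 0;

  let matches = 0;
  for (const word of queryWords) {
    if (contentWords.has(word)) matches++;
  }
  return matches / queryWords.size;
}

// Two of the three query words ("hotels", "kandy") appear in the content.
console.log(calculateWordOverlap('Hotels in Kandy', 'Best hotels near Kandy lake'));
```

Dividing by the number of query words rather than content words keeps long documents from being penalized for containing extra text.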

Code Examples and Implementation Details

Complete RAG Service Implementation

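// services/ragService.js
// Package names are inferred from the calls used below
import { QdrantClient } from '@qdrant/js-client-rest';
import { pipeline } from '@xenova/transformers';
// RAGCacheManager is assumed to be defined elsewhere in the project

class RAGService {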
  constructor() {
    this.qdrantClient = new QdrantClient({
      host: process.env.QDRANT_HOST || 'localhost',
      port: 6333
    });
    this.embedder = null; // Lazy loaded
    this.cacheManager = new RAGCacheManager();
  }
  
  async initialize() {
    // Load embedding model
    this.embedder = await pipeline(
      'feature-extraction',
      'Xenova/all-MiniLM-L6-v2'
    );
  }
  
  async searchRelevantContext(query, limit = 6) {
    // Generate query embedding
    const queryEmbedding = await this.generateEmbedding(query);
    
    // Search Qdrant
    const searchResults = await this.qdrantClient.search('tourism_kb', {
      vector: queryEmbedding,
      limit: limit * 2,
      score_threshold: 0.3
    });
    
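    // Enhanced scoring and filtering
    const enhancedResults = this.enhanceScoring(searchResults, query);
    // Re-rank by the boosted score; Qdrant's ordering reflects only the raw similarity
    enhancedResults.sort((a, b) => b.enhanced_score - a.enhanced_score);
    const deduplicated = this.deduplicateResults(enhancedResults);
    
    return deduplicated.slice(0, limit);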
  }
  
  async generateEmbedding(text) {
    if (!this.embedder) await this.initialize();
    
    const output = await this.embedder(text, {
      pooling: 'mean',
      normalize: true
    });
    
    return Array.from(output.data);
  }
  
  enhanceScoring(results, originalQuery) {
    return results.map(result => {
      let score = result.score;
      
      // Exact match bonus
      if (result.payload.content.toLowerCase().includes(originalQuery.toLowerCase())) {
        score += 0.4;
      }
      
      // Source type bonuses
      if (result.payload.source_type === 'csv') {
        score += 0.3;
      }
      
      // Field-specific bonuses
      if (result.payload.field === 'name') {
        score += 0.5;
      } else if (result.payload.field === 'description') {
        score += 0.3;
      }
      
      return { ...result, enhanced_score: score };
    });
  }
  
  deduplicateResults(results) {
    const seen = new Set();
    return results.filter(result => {
      const signature = result.payload.content.slice(0, 50);
      if (seen.has(signature)) return false;
      seen.add(signature);
      return true;
    });
  }
}
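
The `pooling: 'mean'` and `normalize: true` options passed in `generateEmbedding` correspond to the "Mean pooling + normalize" step of the ingestion pipeline. The sketch below shows what those two operations compute, using made-up 2-dimensional token vectors. Library implementations typically also use the attention mask to ignore padding tokens, which this simplified version omits:

```javascript
// Mean pooling: average the per-token vectors into a single sentence vector.
function meanPool(tokenVectors) {
  const dims = tokenVectors[0].length;
  const sums = new Array(dims).fill(0);
  for (const vector of tokenVectors) {
    for (let i = 0; i < dims; i++) sums[i] += vector[i];
  }
  return sums.map((total) => total / tokenVectors.length);
}

// L2 normalization: scale the vector to unit length.
function l2Normalize(vector) {
  const norm = Math.hypot(...vector);
  return norm === 0 ? vector : vector.map((x) => x / norm);
}

// Three toy "token" vectors, invented for this example.
const tokens = [[1, 2], [3, 4], [5, 0]];
const pooled = meanPool(tokens);          // [3, 2]
const sentenceVector = l2Normalize(pooled);
console.log(pooled, sentenceVector);
```

Normalizing at embedding time is what lets cosine similarity be computed as a plain dot product during search.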

Document Ingestion Implementation

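// services/documentIngestion.js
import path from 'node:path';
import crypto from 'node:crypto';
import csv from 'csvtojson';
// RAGService is the class from services/ragService.js.
// Not shown in this guide but required: processPDF, processMarkdown, processExcel,
// getFilesRecursively, and createBatches (the bind() calls below throw if a processor is missing).

class DocumentIngestionService {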
  constructor() {
    this.ragService = new RAGService();
    this.supportedFormats = {
      '.csv': this.processCSV.bind(this),
      '.pdf': this.processPDF.bind(this),
      '.md': this.processMarkdown.bind(this),
      '.xlsx': this.processExcel.bind(this)
    };
  }
  
  async ingestDocuments(directoryPath) {
    const files = await this.getFilesRecursively(directoryPath);
    const batches = this.createBatches(files, 10);
    
    for (const batch of batches) {
      await Promise.all(
        batch.map(file => this.processFile(file))
      );
    }
  }
  
  async processFile(filePath) {
    const extension = path.extname(filePath).toLowerCase(); // match '.CSV' as well as '.csv'
    const processor = this.supportedFormats[extension];
    
    if (!processor) {
      console.warn(`Unsupported file format: ${extension}`);
      return;
    }
    
    try {
      const chunks = await processor(filePath);
      await this.storeChunks(chunks);
    } catch (error) {
      console.error(`Error processing ${filePath}:`, error);
    }
  }
  
  async processCSV(filePath) {
    const records = await csv().fromFile(filePath);
    const chunks = [];
    
    for (const [rowIndex, record] of records.entries()) {
      for (const [field, value] of Object.entries(record)) {
        if (typeof value === 'string' && value.trim()) {
          chunks.push({
            content: value,
            metadata: {
              source: filePath,
              source_type: 'csv',
              field: field,
              // Use the row index so every field of a row shares one record_id
              record_id: record.id || `row_${rowIndex}`
            }
          });
        }
      }
    }
    
    return chunks;
  }
  
  async storeChunks(chunks) {
    if (chunks.length === 0) return; // nothing to embed or upsert
    const points = [];
    
    for (const chunk of chunks) {
      const embedding = await this.ragService.generateEmbedding(chunk.content);
      
      points.push({
        id: crypto.randomUUID(),
        vector: embedding,
        payload: {
          content: chunk.content,
          ...chunk.metadata
        }
      });
    }
    
    await this.ragService.qdrantClient.upsert('tourism_kb', {
      wait: true,
      points: points
    });
  }
}
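
`ingestDocuments` processes files in groups of 10 via `createBatches`, which is not shown above. A minimal sketch of such a helper, written as a standalone function for illustration (in the class it would be a method):

```javascript
// Hypothetical implementation of createBatches: split an array into groups of `size`.
function createBatches(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

console.log(createBatches(['a.csv', 'b.pdf', 'c.md'], 2));
// → [['a.csv', 'b.pdf'], ['c.md']]
```

Batching bounds how many files are parsed and embedded concurrently, which keeps memory use predictable when ingesting large directories.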

This technical document provides comprehensive implementation details for developers who need to build, deploy, and maintain the RAG system, while the main article focuses on the business impact and vision for decision makers.
