Comprehensive technical guide for developers and implementers
┌─────────────────────────────────────────────────────────────────────────────┐
│                           RAG System Architecture                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│    ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐    │
│    │ Tourist     │   │ Citizen     │   │ Government  │   │ Business    │    │
│    │ Query       │   │ Query       │   │ Query       │   │ Query       │    │
│    └──────┬──────┘   └──────┬──────┘   └──────┬──────┘   └──────┬──────┘    │
│           │                 │                 │                 │           │
│           └─────────────────┼─────────────────┼─────────────────┘           │
│                             │                 │                             │
│                             ▼                 ▼                             │
│                       ┌─────────────────────────────┐                       │
│                       │      Query Processing       │                       │
│                       │  • Language Detection       │                       │
│                       │  • Text Normalization       │                       │
│                       │  • Vector Embedding         │                       │
│                       └─────────────┬───────────────┘                       │
│                                     │                                       │
│                                     ▼                                       │
│                       ┌─────────────────────────────┐                       │
│                       │    Qdrant Vector Search     │                       │
│                       │  • Semantic Similarity      │                       │
│                       │  • Cosine Distance          │                       │
│                       │  • Multi-source Retrieval   │                       │
│                       └─────────────┬───────────────┘                       │
│                                     │                                       │
│                                     ▼                                       │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                            Knowledge Base                             │  │
│  │ ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐ │  │
│  │ │ Tourism     │   │ Government  │   │ Cultural    │   │ Emergency   │ │  │
│  │ │ Data        │   │ Services    │   │ Content     │   │ Info        │ │  │
│  │ │• Hotels     │   │• Procedures │   │• Festivals  │   │• Contacts   │ │  │
│  │ │• Attractions│   │• Documents  │   │• Traditions │   │• Hotlines   │ │  │
│  │ │• Transport  │   │• Policies   │   │• Languages  │   │• Hospitals  │ │  │
│  │ └─────────────┘   └─────────────┘   └─────────────┘   └─────────────┘ │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                     │                                       │
│                                     ▼                                       │
│                       ┌─────────────────────────────┐                       │
│                       │      Context Assembly       │                       │
│                       │  • Top 6 Relevant Chunks    │                       │
│                       │  • Score-based Ranking      │                       │
│                       │  • Deduplication            │                       │
│                       └─────────────┬───────────────┘                       │
│                                     │                                       │
│                                     ▼                                       │
│                       ┌─────────────────────────────┐                       │
│                       │   AI Response Generation    │                       │
│                       │  • Groq/Llama3 API          │                       │
│                       │  • Context + Query          │                       │
│                       │  • Fallback to Ollama       │                       │
│                       └─────────────┬───────────────┘                       │
│                                     │                                       │
│                                     ▼                                       │
│                       ┌─────────────────────────────┐                       │
│                       │    Intelligent Response     │                       │
│                       │  • Accurate Information     │                       │
│                       │  • Cultural Context         │                       │
│                       │  • Multi-language Support   │                       │
│                       └─────────────────────────────┘                       │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
Simple Flow: User Query → Vector Search → Relevant Documents → AI Generation → Contextual Response
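The last two stages of this flow (retrieved context plus query in, generated answer out, with a second provider as a safety net) might look roughly like the sketch below. The names `buildPrompt`, `generateWithFallback`, `primary`, and `fallback` are hypothetical and not taken from the project; the two provider functions stand in for thin wrappers around Groq and a local Ollama server.

```javascript
// Sketch only: `primary` and `fallback` are hypothetical async functions
// that take a prompt string and resolve to the generated text.

// Combine the retrieved chunks and the user query into one prompt.
function buildPrompt(query, chunks) {
  const context = chunks
    .map((chunk, i) => `[${i + 1}] ${chunk}`)
    .join('\n');
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${query}`;
}

// Try the primary provider first; on any failure, send the same prompt
// to the fallback provider instead.
async function generateWithFallback(prompt, primary, fallback) {
  try {
    return { text: await primary(prompt), provider: 'primary' };
  } catch (error) {
    // Timeouts, rate limits, and outages all end up here.
    return { text: await fallback(prompt), provider: 'fallback' };
  }
}
```

Returning the provider name alongside the text makes it easy to log how often the fallback path is actually taken.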
Our implementation uses Qdrant, a high-performance vector database that enables semantic search capabilities:
// Qdrant Configuration
import { QdrantClient } from '@qdrant/js-client-rest';

const client = new QdrantClient({
  host: process.env.QDRANT_HOST || 'localhost',
  port: 6333
});

// Collection Setup
const collection = {
  name: 'tourism_kb',
  vectors: {
    size: 384, // all-MiniLM-L6-v2 embeddings
    distance: 'Cosine'
  }
};
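The collection settings above still have to be applied to the server. A minimal sketch of an idempotent setup step follows; the helper name `ensureCollection` is hypothetical, and `client` is assumed to be a `QdrantClient` from `@qdrant/js-client-rest` (or any object exposing the same `getCollections` and `createCollection` methods).

```javascript
// Sketch: create the collection only if it does not exist yet.
// `vectors` is the { size, distance } object from the collection setup.
async function ensureCollection(client, name, vectors) {
  const { collections } = await client.getCollections();
  if (collections.some((c) => c.name === name)) {
    return false; // already present, nothing to do
  }
  await client.createCollection(name, { vectors });
  return true; // collection was created by this call
}
```

Under these assumptions it would be called once at startup, for example `await ensureCollection(client, collection.name, collection.vectors)`.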
Cosine similarity measures how similar two documents are by comparing the direction of their vectors rather than their exact words. Imagine two arrows pointing in space: if they point in the same direction, the documents are similar (score near 1.0); if they are perpendicular, the documents are unrelated (score near 0.0); if they point in opposite directions, the score approaches -1.0.
In our RAG system:
- Each document becomes a 384-dimensional arrow (vector) based on its meaning
- User queries also become arrows in the same space
- Cosine similarity finds documents pointing in the same direction as the query
- This works even when different words are used - "hotel" and "accommodation" point in similar directions
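To make the "arrows" picture concrete, here is a small self-contained sketch of the cosine similarity formula (dot product divided by the product of the vector lengths). Qdrant computes this internally; the function name and the choice to return 0 for zero-length vectors are assumptions made for illustration.

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|)
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // convention chosen for zero vectors
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [2, 0]));  // 1  (same direction)
console.log(cosineSimilarity([1, 0], [0, 3]));  // 0  (perpendicular)
console.log(cosineSimilarity([1, 0], [-1, 0])); // -1 (opposite direction)
```

Note that vector length does not matter: `[1, 0]` and `[2, 0]` score 1 because only direction is compared.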
RAG System Example:

     User Query: "Where can I stay in Kandy?"
                         |
                         ↓
                [Vector Embedding]
                         |
                         ↓
        Query Vector: [0.2, 0.8, 0.1, ...]
                         |
                         ↓
  [Cosine Similarity Search in Tourism Database]
                         |
                         ↓
┌─────────────────────────────────────────────────┐
│ Document Vectors & Similarity Scores:           │
│                                                 │
│ "Hotels in Kandy"        → 0.92 (Very High)     │
│ "Guesthouses near Kandy" → 0.87 (High)          │
│ "Accommodation options"  → 0.84 (High)          │
│ "Restaurants in Kandy"   → 0.31 (Low)           │
│ "Beaches in Galle"       → 0.12 (Very Low)      │
└─────────────────────────────────────────────────┘
                         |
                         ↓
          Top 3 matches returned to user
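The final step of this example, picking the best-scoring documents, can be sketched in a few lines. The scores are the illustrative values from the box above; the helper name `topMatches` and its `minScore` parameter are hypothetical.

```javascript
// Illustrative scores from the example, deliberately listed out of order.
const scored = [
  { title: 'Restaurants in Kandy', score: 0.31 },
  { title: 'Hotels in Kandy', score: 0.92 },
  { title: 'Beaches in Galle', score: 0.12 },
  { title: 'Accommodation options', score: 0.84 },
  { title: 'Guesthouses near Kandy', score: 0.87 }
];

// Keep results at or above minScore, best first, and cut to `limit`.
function topMatches(results, limit, minScore = 0) {
  return results
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Logs the three accommodation-related titles, best first.
console.log(topMatches(scored, 3).map((r) => r.title));
```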
// Document Processing Pipeline
const supportedFormats = {
  csv: processCSV,     // Government databases, tourism data
  pdf: processPDF,     // Official documents, brochures
  markdown: processMD, // Tourism guides, policies
  excel: processXLS    // Budget sheets, statistics
};
┌─────────────────────────────────────────────────────────────────────────────┐
│                         Document Ingestion Pipeline                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│    ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐    │
│    │ CSV         │   │ PDF         │   │ Markdown    │   │ Excel       │    │
│    │ Files       │   │ Files       │   │ Files       │   │ Files       │    │
│    │             │   │             │   │             │   │             │    │
│    │• Hotels.csv │   │• Brochures  │   │• Guides.md  │   │• Data.xlsx  │    │
│    │• Attractions│   │• Policies   │   │• Culture.md │   │• Stats.xls  │    │
│    │• Transport  │   │• Procedures │   │• History.md │   │• Budget.xlsx│    │
│    └──────┬──────┘   └──────┬──────┘   └──────┬──────┘   └──────┬──────┘    │
│           │                 │                 │                 │           │
│           └─────────────────┼─────────────────┼─────────────────┘           │
│                             │                 │                             │
│                             ▼                 ▼                             │
│                       ┌─────────────────────────────┐                       │
│                       │      Format Detection       │                       │
│                       │  • MIME Type Analysis       │                       │
│                       │  • Extension Validation     │                       │
│                       │  • Content Verification     │                       │
│                       └─────────────┬───────────────┘                       │
│                                     │                                       │
│                                     ▼                                       │
│                       ┌─────────────────────────────┐                       │
│                       │     Content Extraction      │                       │
│                       │  • Text Extraction          │                       │
│                       │  • Structure Preservation   │                       │
│                       │  • Metadata Capture         │                       │
│                       └─────────────┬───────────────┘                       │
│                                     │                                       │
│                                     ▼                                       │
│                       ┌─────────────────────────────┐                       │
│                       │       Smart Chunking        │                       │
│                       │  • Structured Data: By Row  │                       │
│                       │  • Text: 800 chars + overlap│                       │
│                       │  • Paragraph-aware splits   │                       │
│                       └─────────────┬───────────────┘                       │
│                                     │                                       │
│                                     ▼                                       │
│                       ┌─────────────────────────────┐                       │
│                       │      Vector Generation      │                       │
│                       │  • all-MiniLM-L6-v2 Model   │                       │
│                       │  • 384-dimensional vectors  │                       │
│                       │  • Mean pooling + normalize │                       │
│                       └─────────────┬───────────────┘                       │
│                                     │                                       │
│                                     ▼                                       │
│                       ┌─────────────────────────────┐                       │
│                       │       Qdrant Storage        │                       │
│                       │  • Vector indexing          │                       │
│                       │  • Metadata association     │                       │
│                       │  • Similarity optimization  │                       │
│                       └─────────────────────────────┘                       │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
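The "Smart Chunking" stage for free text (800 characters with overlap, paragraph-aware) could be implemented along the lines of the sketch below. The function name `chunkText` and the 100-character overlap are assumptions; the diagram states only the 800-character window and that some overlap is used.

```javascript
// Sketch: split text into windows of at most `maxChars` characters,
// preferring to cut at a paragraph break, with `overlap` characters
// shared between neighbouring chunks. The overlap size is an assumed value.
function chunkText(text, maxChars = 800, overlap = 100) {
  if (overlap >= maxChars) {
    throw new Error('overlap must be smaller than maxChars');
  }
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + maxChars, text.length);
    if (end < text.length) {
      // Prefer the last paragraph break inside the window, as long as
      // the chunk still moves forward past the overlap region.
      const breakAt = text.lastIndexOf('\n\n', end);
      if (breakAt > start + overlap) end = breakAt;
    }
    const piece = text.slice(start, end).trim();
    if (piece) chunks.push(piece);
    if (end >= text.length) break;
    start = end - overlap; // step back so neighbouring chunks share context
  }
  return chunks;
}
```

Structured sources such as CSV files skip this step, since they are chunked by row and field instead.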
## Docker Containerization Strategy
### Container Architecture
```yaml
# docker-compose.yml
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports: ["6333:6333"]
    volumes: ["qdrant_data:/qdrant/storage"]
  backend:
    build: .
    ports: ["3000:3000"]
    depends_on: [qdrant]
    environment:
      - QDRANT_HOST=qdrant
      - GROQ_API_KEY=${GROQ_API_KEY}

# Named volumes must be declared at the top level
volumes:
  qdrant_data:
```
// Advanced Hybrid Search Implementation
async function performHybridSearch(query, limit = 6) {
  // Primary semantic search
  const primaryResults = await searchVectors(query, limit * 2);

  // Query preprocessing for better matches
  const preprocessedQuery = preprocessQuery(query);
  const secondaryResults = await searchVectors(preprocessedQuery, limit);

  // Advanced scoring with multiple factors
  const scoredResults = enhanceScoring(
    [...primaryResults, ...secondaryResults],
    query
  );

  return deduplicateAndFilter(scoredResults, limit);
}
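`deduplicateAndFilter` is called above but not shown. Because the primary and secondary searches can return the same chunk twice, one plausible shape for it is sketched here. This is an assumption about its behaviour, not the project's code: the `minScore` default of 0.3 and the use of normalized content as the duplicate key are illustrative choices.

```javascript
// Hypothetical sketch: keep the best-scoring copy of each distinct chunk,
// drop weak results, and return the top `limit` by enhanced score.
function deduplicateAndFilter(results, limit, minScore = 0.3) {
  const best = new Map();
  for (const result of results) {
    const key = result.content.trim().toLowerCase();
    const existing = best.get(key);
    if (!existing || result.enhanced_score > existing.enhanced_score) {
      best.set(key, result);
    }
  }
  return [...best.values()]
    .filter((r) => r.enhanced_score >= minScore)
    .sort((a, b) => b.enhanced_score - a.enhanced_score)
    .slice(0, limit);
}
```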
// Enhanced Scoring Algorithm
function enhanceScoring(results, originalQuery) {
  return results.map(result => {
    let score = result.score; // Base cosine similarity

    // Exact match boost
    if (result.content.toLowerCase().includes(originalQuery.toLowerCase())) {
      score += 0.4;
    }

    // Structured data boost (CSV sources)
    if (result.metadata.source_type === 'csv') {
      score += 0.3;
    }

    // Field-specific boosts
    if (result.metadata.field === 'name') {
      score += 0.5;
    } else if (result.metadata.field === 'description') {
      score += 0.3;
    }

    // Word overlap scoring
    const overlap = calculateWordOverlap(originalQuery, result.content);
    score += overlap * 0.2;

    return { ...result, enhanced_score: score };
  });
}
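The scoring function relies on `calculateWordOverlap`, which is not shown. A minimal sketch follows, assuming it returns the fraction of distinct query words that also appear in the content (a value between 0 and 1, so the boost above is at most 0.2). The tokenization rule is an assumption chosen to work for non-Latin scripts as well.

```javascript
// Hypothetical sketch: share of distinct query words found in the content.
function calculateWordOverlap(query, content) {
  // Letters, combining marks, and digits form a word; everything else splits.
  const tokenize = (s) =>
    new Set(s.toLowerCase().match(/[\p{L}\p{M}\p{N}]+/gu) || []);

  const queryWords = tokenize(query);
  if (queryWords.size === 0) return 0;

  const contentWords = tokenize(content);
  let shared = 0;
  for (const word of queryWords) {
    if (contentWords.has(word)) shared++;
  }
  return shared / queryWords.size;
}
```

For the query "hotels in Kandy" and the content "Luxury hotels near Kandy lake", two of the three query words match, giving an overlap of 2/3.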
// services/ragService.js
import { QdrantClient } from '@qdrant/js-client-rest';
import { pipeline } from '@xenova/transformers';
// RAGCacheManager is defined elsewhere in the project (not shown here)

class RAGService {
  constructor() {
    this.qdrantClient = new QdrantClient({
      host: process.env.QDRANT_HOST || 'localhost',
      port: 6333
    });
    this.embedder = null; // Lazy loaded
    this.cacheManager = new RAGCacheManager();
  }

  async initialize() {
    // Load embedding model
    this.embedder = await pipeline(
      'feature-extraction',
      'Xenova/all-MiniLM-L6-v2'
    );
  }

  async searchRelevantContext(query, limit = 6) {
    // Generate query embedding
    const queryEmbedding = await this.generateEmbedding(query);

    // Search Qdrant (over-fetch so re-ranking has candidates to choose from)
    const searchResults = await this.qdrantClient.search('tourism_kb', {
      vector: queryEmbedding,
      limit: limit * 2,
      score_threshold: 0.3
    });

    // Enhanced scoring, re-ranking by the boosted score, and filtering
    const enhancedResults = this.enhanceScoring(searchResults, query)
      .sort((a, b) => b.enhanced_score - a.enhanced_score);
    const deduplicated = this.deduplicateResults(enhancedResults);
    return deduplicated.slice(0, limit);
  }

  async generateEmbedding(text) {
    if (!this.embedder) await this.initialize();
    const output = await this.embedder(text, {
      pooling: 'mean',
      normalize: true
    });
    return Array.from(output.data);
  }

  enhanceScoring(results, originalQuery) {
    return results.map(result => {
      let score = result.score;

      // Exact match bonus
      if (result.payload.content.toLowerCase().includes(originalQuery.toLowerCase())) {
        score += 0.4;
      }

      // Source type bonuses
      if (result.payload.source_type === 'csv') {
        score += 0.3;
      }

      // Field-specific bonuses
      if (result.payload.field === 'name') {
        score += 0.5;
      } else if (result.payload.field === 'description') {
        score += 0.3;
      }

      return { ...result, enhanced_score: score };
    });
  }

  deduplicateResults(results) {
    // Treat results whose first 50 characters match as duplicates
    const seen = new Set();
    return results.filter(result => {
      const signature = result.payload.content.slice(0, 50);
      if (seen.has(signature)) return false;
      seen.add(signature);
      return true;
    });
  }
}
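The `pooling: 'mean'` and `normalize: true` options passed to the embedder collapse the per-token vectors into a single unit-length sentence vector. The sketch below shows that arithmetic on plain arrays. It is an illustration rather than the library's implementation, and it assumes a single unpadded input (real mean pooling also skips padding tokens); the function name is hypothetical.

```javascript
// Sketch: average the token vectors, then scale the result to length 1.
function meanPoolAndNormalize(tokenVectors) {
  if (tokenVectors.length === 0) {
    throw new Error('At least one token vector is required');
  }
  const dim = tokenVectors[0].length;
  const pooled = new Array(dim).fill(0);
  for (const vec of tokenVectors) {
    for (let i = 0; i < dim; i++) {
      pooled[i] += vec[i] / tokenVectors.length; // running mean
    }
  }
  const norm = Math.sqrt(pooled.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? pooled : pooled.map((x) => x / norm); // L2 normalization
}

// Two token vectors [3, 0] and [3, 8] average to [3, 4], which has length 5,
// so the normalized result is approximately [0.6, 0.8].
console.log(meanPoolAndNormalize([[3, 0], [3, 8]]));
```

Because the stored vectors have unit length, cosine similarity between them reduces to a plain dot product.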
// services/documentIngestion.js
import path from 'node:path';
import { randomUUID } from 'node:crypto';
import csv from 'csvtojson';
// RAGService comes from services/ragService.js. processPDF, processMarkdown,
// processExcel, getFilesRecursively and createBatches are not shown here.

class DocumentIngestionService {
  constructor() {
    this.ragService = new RAGService();
    this.supportedFormats = {
      '.csv': this.processCSV.bind(this),
      '.pdf': this.processPDF.bind(this),
      '.md': this.processMarkdown.bind(this),
      '.xlsx': this.processExcel.bind(this)
    };
  }

  async ingestDocuments(directoryPath) {
    const files = await this.getFilesRecursively(directoryPath);
    const batches = this.createBatches(files, 10);

    // Files within a batch are processed concurrently, batches sequentially
    for (const batch of batches) {
      await Promise.all(
        batch.map(file => this.processFile(file))
      );
    }
  }

  async processFile(filePath) {
    // Normalize case so that "DATA.CSV" is handled like "data.csv"
    const extension = path.extname(filePath).toLowerCase();
    const processor = this.supportedFormats[extension];

    if (!processor) {
      console.warn(`Unsupported file format: ${extension}`);
      return;
    }

    try {
      const chunks = await processor(filePath);
      await this.storeChunks(chunks);
    } catch (error) {
      console.error(`Error processing ${filePath}:`, error);
    }
  }

  async processCSV(filePath) {
    const records = await csv().fromFile(filePath);
    const chunks = [];

    for (const [rowIndex, record] of records.entries()) {
      // One ID per row, so every field chunk of the same record shares it
      const recordId = record.id || `row_${rowIndex}`;

      for (const [field, value] of Object.entries(record)) {
        if (value && value.trim()) {
          chunks.push({
            content: value,
            metadata: {
              source: filePath,
              source_type: 'csv',
              field: field,
              record_id: recordId
            }
          });
        }
      }
    }
    return chunks;
  }

  async storeChunks(chunks) {
    if (chunks.length === 0) return; // nothing to embed or store

    const points = [];
    for (const chunk of chunks) {
      const embedding = await this.ragService.generateEmbedding(chunk.content);
      points.push({
        id: randomUUID(),
        vector: embedding,
        payload: {
          content: chunk.content,
          ...chunk.metadata
        }
      });
    }

    await this.ragService.qdrantClient.upsert('tourism_kb', {
      wait: true,
      points: points
    });
  }
}
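`ingestDocuments` depends on a `createBatches` helper that is not shown. A minimal sketch of what such a helper might look like is given below, written as a standalone function rather than a class method; the validation of `batchSize` is an added assumption.

```javascript
// Hypothetical sketch: split an array into consecutive groups of at most
// `batchSize` items, preserving order.
function createBatches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

With a batch size of 10, a directory of 25 files yields batches of 10, 10, and 5, which caps the number of files processed concurrently at 10.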
This technical document provides comprehensive implementation details for developers who need to build, deploy, and maintain the RAG system, while the main article focuses on the business impact and vision for decision makers.