@boddhisattva
Last active January 18, 2026 10:16
Benchmark script and results showing why the learnings search uses lower(lesson) in: https://github.com/boddhisattva/learner-web
#!/usr/bin/env ruby
# frozen_string_literal: true
# COMPARISON: ILIKE vs lower()+LIKE Performance at 200k Records
# WITH INDEX ON lower(lesson) AND lower() IN SEARCH CODE
# This script compares ILIKE vs lower()+LIKE when index is on lower(lesson)
# Run with: bin/rails runner app/docs/benchmarks/compare_ilike_vs_lower_like_with_lower_index_at_200k.rb
# Output saved to: ilike_vs_lower_like_comparison_200k_with_lower_index_and_lower_search.txt
require 'benchmark'
# Use stdout for immediate visibility
$stdout.sync = true
puts '=' * 100
puts 'PERFORMANCE COMPARISON: ILIKE vs lower()+LIKE at 200k Records'
puts 'WITH INDEX ON lower(lesson) AND lower() IN SEARCH CODE'
puts '=' * 100
puts
TARGET_RECORD_COUNT = 200_000
TEST_QUERY = 'rest' # Selective query so the planner favors the trigram index for any query that can use it
# Check current data count and create test data if needed
puts '=' * 100
puts 'STEP 1: Ensuring 200k records exist for comparison'
puts '=' * 100
puts
original_count = Learning.count
puts "Current learnings: #{original_count}"
records_to_create = [TARGET_RECORD_COUNT - original_count, 0].max
if records_to_create > 0
  puts "⚠️ Need to create #{records_to_create} records to reach #{TARGET_RECORD_COUNT} total"
  puts ' Creating test data now...'
  puts
  # Get sample data for creating test records
  sample_learning = Learning.first
  unless sample_learning
    puts '❌ ERROR: No existing learnings found. Please create at least one learning first.'
    exit 1
  end
  creator_id = sample_learning.creator_id
  organization_id = sample_learning.organization_id
  last_modifier_id = sample_learning.last_modifier_id
  test_lessons = [
    'What is delayed is not denied',
    'Never give up on your dreams',
    'Practice makes perfect',
    'Learning from mistakes is growth',
    'Consistency beats intensity',
    'never rest till the end',
    'rest at the end, not in the middle',
    'leave things better than you found out',
    'never leave things in half',
    'Small steps lead to big changes',
    'Focus on progress not perfection',
    'Embrace the journey of learning',
    'Challenges are opportunities in disguise',
    'Growth happens outside comfort zone'
  ]
  created_ids = []
  batch_size = 5000
  start_time = Time.zone.now
  ActiveRecord::Base.transaction do
    (records_to_create.to_f / batch_size).ceil.times do |batch_num|
      batch_data = []
      batch_limit = [batch_size, records_to_create - created_ids.length].min
      batch_limit.times do |i|
        batch_data << {
          lesson: test_lessons.sample,
          description: "Test learning #{original_count + batch_num * batch_size + i}",
          creator_id: creator_id,
          organization_id: organization_id,
          last_modifier_id: last_modifier_id,
          created_at: Time.zone.now,
          updated_at: Time.zone.now
        }
      end
      inserted = Learning.insert_all(batch_data, returning: [:id])
      created_ids.concat(inserted.pluck('id'))
      progress = ((created_ids.length.to_f / records_to_create) * 100).round(1)
      print "\r📊 Progress: #{created_ids.length}/#{records_to_create} records (#{progress}%) - Creating..."
      $stdout.flush
    end
  end
  creation_time = Time.zone.now - start_time
  puts
  puts "✅ Created #{created_ids.length} test learnings in #{creation_time.round(2)}s"
else
  puts "✅ Already have #{original_count} records (>= #{TARGET_RECORD_COUNT}) - no creation needed"
  created_ids = []
end
final_count = Learning.count
puts
puts "📊 TOTAL RECORDS NOW: #{final_count}"
puts " ✓ Comparison will run on #{final_count} records"
puts
unless final_count >= TARGET_RECORD_COUNT
  puts "❌ ERROR: Still only have #{final_count} records, need #{TARGET_RECORD_COUNT}"
  exit 1
end
# Store for cleanup
$created_ids_for_cleanup = created_ids
# Update statistics
puts '=' * 100
puts 'STEP 2: Updating PostgreSQL statistics'
puts '=' * 100
puts 'Updating statistics for accurate query planning...'
ActiveRecord::Base.connection.execute('ANALYZE learnings;')
puts '✅ Statistics updated'
puts
# Verify indexes
puts '=' * 100
puts 'STEP 3: Verifying indexes'
puts '=' * 100
puts 'Checking trigram indexes...'
indexes = ActiveRecord::Base.connection.execute(
  "SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'learnings' AND indexname LIKE '%trgm%';"
)
trgm_index_on_lesson = nil
trgm_index_on_lower_lesson = nil
indexes.each do |idx|
  if idx['indexdef'].include?('lower(')
    trgm_index_on_lower_lesson = idx['indexname']
    puts "✅ Found index on lower(lesson): #{trgm_index_on_lower_lesson}"
  else
    trgm_index_on_lesson = idx['indexname']
    puts "✅ Found index on lesson: #{trgm_index_on_lesson}"
  end
end
unless trgm_index_on_lesson || trgm_index_on_lower_lesson
  puts '❌ ERROR: No trigram index found!'
  puts ' Need index on either "lesson" or "lower(lesson)"'
  exit 1
end
if trgm_index_on_lower_lesson && !trgm_index_on_lesson
  puts '✅ Primary index is on lower(lesson) - this is the current configuration'
  puts ' lower()+LIKE queries will use this index efficiently'
elsif trgm_index_on_lesson && !trgm_index_on_lower_lesson
  puts '⚠️ WARNING: Index is on lesson (raw column), not lower(lesson)'
  puts ' This script expects index on lower(lesson) for optimal lower()+LIKE performance'
end
puts
# Calculate selectivity (using the actual search method)
matching_count = if Learning.respond_to?(:search)
                   Learning.search(TEST_QUERY).count
                 else
                   Learning.where('lower(lesson) LIKE lower(?)', "%#{TEST_QUERY}%").count
                 end
selectivity = (matching_count.to_f / final_count * 100).round(2)
puts "Query: '#{TEST_QUERY}'"
puts "Selectivity: #{matching_count}/#{final_count} rows (#{selectivity}%)"
puts
# Query plan analysis
puts '=' * 100
puts "STEP 4: Query Plan Analysis (Running on #{final_count} records)"
puts '=' * 100
puts
puts '1. ILIKE Query Plan:'
ilike_plan = ActiveRecord::Base.connection.execute(
  "EXPLAIN (ANALYZE, BUFFERS, VERBOSE) SELECT * FROM learnings WHERE lesson ILIKE '%#{TEST_QUERY}%';"
)
ilike_plan_lines = []
ilike_plan.each do |row|
  line = row['QUERY PLAN']
  ilike_plan_lines << line
  puts " #{line}"
end
puts
# 'Index Scan' is also a substring of 'Bitmap Index Scan', so one check covers both
ilike_uses_index = ilike_plan_lines.join("\n").include?('Index Scan')
puts "Index Used: #{ilike_uses_index ? '✅ YES' : '❌ NO'}"
puts
puts '2. lower() + LIKE Query Plan:'
lower_plan = ActiveRecord::Base.connection.execute(
  "EXPLAIN (ANALYZE, BUFFERS, VERBOSE) SELECT * FROM learnings WHERE lower(lesson) LIKE lower('%#{TEST_QUERY}%');"
)
lower_plan_lines = []
lower_plan.each do |row|
  line = row['QUERY PLAN']
  lower_plan_lines << line
  puts " #{line}"
end
puts
# 'Index Scan' is also a substring of 'Bitmap Index Scan', so one check covers both
lower_uses_index = lower_plan_lines.join("\n").include?('Index Scan')
puts "Index Used: #{lower_uses_index ? '✅ YES' : '❌ NO'}"
puts
# Extract execution times from plans
ilike_exec_time = nil
ilike_exec_time = Regexp.last_match(1).to_f if ilike_plan_lines.join("\n") =~ /Execution Time: ([\d.]+) ms/
lower_exec_time = nil
lower_exec_time = Regexp.last_match(1).to_f if lower_plan_lines.join("\n") =~ /Execution Time: ([\d.]+) ms/
# Performance benchmark
puts '=' * 100
puts "STEP 5: Performance Benchmark (Running on #{final_count} records, 50 queries each)"
puts '=' * 100
puts 'Benchmarking performance...'
puts
results = Benchmark.bm(40) do |x|
  x.report('ILIKE:') do
    50.times { Learning.where('lesson ILIKE ?', "%#{TEST_QUERY}%").to_a }
  end
  x.report('lower() + LIKE:') do
    50.times { Learning.where('lower(lesson) LIKE lower(?)', "%#{TEST_QUERY}%").to_a }
  end
end
ilike_time = results[0].real
lower_like_time = results[1].real
# Determine which approach is faster
# If ilike_time > lower_like_time, then lower() is faster
# If ilike_time < lower_like_time, then ILIKE is faster
speedup_ratio = (ilike_time / lower_like_time).round(2)
lower_is_faster = ilike_time > lower_like_time
speedup_factor = lower_is_faster ? speedup_ratio : (lower_like_time / ilike_time).round(2)
puts
# Per-query time in ms below: total seconds * 1000 ms / 50 queries = total * 20
puts 'Results:'
puts " ILIKE total: #{ilike_time.round(3)}s (#{(ilike_time * 1000).round(2)}ms)"
puts " ILIKE per query: #{(ilike_time * 20).round(2)}ms"
puts " lower()+LIKE total: #{lower_like_time.round(3)}s (#{(lower_like_time * 1000).round(2)}ms)"
puts " lower()+LIKE per query: #{(lower_like_time * 20).round(2)}ms"
puts " Winner: #{lower_is_faster ? 'lower()+LIKE' : 'ILIKE'} is #{speedup_factor}x faster"
puts
if ilike_exec_time && lower_exec_time
  exec_lower_is_faster = ilike_exec_time > lower_exec_time
  exec_speedup_ratio = (ilike_exec_time / lower_exec_time).round(2)
  exec_speedup_factor = exec_lower_is_faster ? exec_speedup_ratio : (lower_exec_time / ilike_exec_time).round(2)
  puts 'Execution Time from EXPLAIN ANALYZE:'
  puts " ILIKE: #{ilike_exec_time}ms"
  puts " lower()+LIKE: #{lower_exec_time}ms"
  puts " Winner: #{exec_lower_is_faster ? 'lower()+LIKE' : 'ILIKE'} is #{exec_speedup_factor}x faster"
  puts
end
# Calculate performance difference
if lower_is_faster
  performance_diff_percent = ((speedup_ratio - 1) * 100).round(1)
  puts "Performance Difference: lower()+LIKE is #{performance_diff_percent}% faster than ILIKE"
else
  performance_diff_percent = ((1 / speedup_ratio - 1) * 100).round(1)
  puts "Performance Difference: ILIKE is #{performance_diff_percent}% faster than lower()+LIKE"
end
puts
# Save results to file
output_file = Rails.root.join('app/docs/benchmarks/ilike_vs_lower_like_comparison_200k_with_lower_index_and_lower_search.txt')
File.write(output_file, <<~OUTPUT)
ILIKE vs lower()+LIKE Performance Comparison at #{final_count} Records
WITH INDEX ON lower(lesson) AND lower() IN SEARCH CODE
Generated: #{Time.zone.now}
================================================================================
Test Configuration:
- Records: #{final_count} (#{records_to_create > 0 ? "#{records_to_create} created for this test" : 'existing data'})
- Test Query: '#{TEST_QUERY}'
- Selectivity: #{selectivity}% (#{matching_count}/#{final_count} rows)
- Index on lesson: #{trgm_index_on_lesson || 'NONE'}
- Index on lower(lesson): #{trgm_index_on_lower_lesson || 'NONE'}
================================================================================
QUERY PLAN ANALYSIS
================================================================================
ILIKE Query Plan:
Index Used: #{ilike_uses_index ? 'YES ✅' : 'NO ❌'}
#{ilike_plan_lines.join("\n")}
lower() + LIKE Query Plan:
Index Used: #{lower_uses_index ? 'YES ✅' : 'NO ❌'}
#{lower_plan_lines.join("\n")}
================================================================================
PERFORMANCE BENCHMARK (50 queries)
================================================================================
ILIKE:
Total Time: #{ilike_time.round(3)}s (#{(ilike_time * 1000).round(2)}ms)
Per Query: #{(ilike_time * 20).round(2)}ms
Execution Time: #{ilike_exec_time ? "#{ilike_exec_time}ms" : 'N/A'}
lower() + LIKE:
Total Time: #{lower_like_time.round(3)}s (#{(lower_like_time * 1000).round(2)}ms)
Per Query: #{(lower_like_time * 20).round(2)}ms
Execution Time: #{lower_exec_time ? "#{lower_exec_time}ms" : 'N/A'}
================================================================================
SUMMARY
================================================================================
Winner: #{lower_is_faster ? 'lower()+LIKE' : 'ILIKE'} is #{speedup_factor}x faster
Performance Difference:
#{if lower_is_faster
"lower()+LIKE is #{((speedup_ratio - 1) * 100).round(1)}% faster than ILIKE"
else
"ILIKE is #{((1 / speedup_ratio - 1) * 100).round(1)}% faster than lower()+LIKE"
end}
Execution Time Comparison:
#{if lower_exec_time && ilike_exec_time
exec_lower_is_faster = ilike_exec_time > lower_exec_time
exec_speedup = exec_lower_is_faster ? (ilike_exec_time / lower_exec_time).round(2) : (lower_exec_time / ilike_exec_time).round(2)
"#{exec_lower_is_faster ? 'lower()+LIKE' : 'ILIKE'} is #{exec_speedup}x faster (#{lower_exec_time}ms vs #{ilike_exec_time}ms)"
else
'N/A'
end}
Verdict:
#{if lower_is_faster
"lower()+LIKE performs better with index on lower(lesson). The performance advantage is #{((speedup_ratio - 1) * 100).round(1)}%."
else
"ILIKE performs better despite index on lower(lesson). The performance advantage is #{((1 / speedup_ratio - 1) * 100).round(1)}%."
end}
================================================================================
NOTE
================================================================================
This comparison uses index on 'lower(lesson)' configuration.
lower()+LIKE queries can use this index efficiently, while ILIKE cannot.
OUTPUT
puts "✅ Results saved to: #{output_file}"
puts
# Cleanup test data if we created any
if $created_ids_for_cleanup&.any?
  puts '=' * 100
  puts 'STEP 6: Cleaning up test data'
  puts '=' * 100
  puts "Removing #{$created_ids_for_cleanup.length} test learnings..."
  start_time = Time.zone.now
  ActiveRecord::Base.transaction do
    Learning.where(id: $created_ids_for_cleanup).delete_all
  end
  cleanup_time = Time.zone.now - start_time
  final_count_after_cleanup = Learning.count
  puts "✅ Cleanup complete in #{cleanup_time.round(2)}s"
  puts " Final learnings count: #{final_count_after_cleanup}"
  puts
end
puts '=' * 100
puts 'COMPARISON COMPLETE'
puts "📊 Tested with #{final_count} records"
puts "📄 Results saved to: #{output_file}"
puts '=' * 100
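The plan parsing in step 4 of the script (detecting an index scan and extracting the execution time) can be tried without a database. The following is a minimal sketch, not part of the original script: the helper names `plan_uses_index?` and `plan_execution_time_ms` are hypothetical, and the sample lines are copied from the 200k-record lower()+LIKE plan in this gist. Unlike the script's substring check, the sketch also treats an `Index Only Scan` node as index usage.

```ruby
# frozen_string_literal: true

# Sample lines copied from the 200k-record lower()+LIKE plan in this gist.
SAMPLE_PLAN = [
  'Bitmap Heap Scan on public.learnings (cost=1906.77..6067.37 rows=29040 width=108) ' \
  '(actual time=3.890..14.817 rows=28421 loops=1)',
  '-> Bitmap Index Scan on index_learnings_on_lesson_trgm (cost=0.00..1899.51 rows=29040 width=0) ' \
  '(actual time=3.565..3.565 rows=28421 loops=1)',
  'Planning Time: 0.095 ms',
  'Execution Time: 15.446 ms'
].freeze

# Hypothetical helper: true when any plan line mentions an index scan node.
# The pattern matches 'Index Scan', 'Bitmap Index Scan' and 'Index Only Scan'.
def plan_uses_index?(plan_lines)
  plan_lines.any? { |line| line.match?(/Index (Only )?Scan/) }
end

# Hypothetical helper: the 'Execution Time' value in ms, or nil when absent.
def plan_execution_time_ms(plan_lines)
  match = plan_lines.join("\n").match(/Execution Time: ([\d.]+) ms/)
  match && match[1].to_f
end

puts "Index used: #{plan_uses_index?(SAMPLE_PLAN)}"
puts "Execution time: #{plan_execution_time_ms(SAMPLE_PLAN)} ms"
```

Returning nil for a missing execution time mirrors how the script leaves `ilike_exec_time` and `lower_exec_time` unset when the pattern does not match.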
ILIKE vs lower()+LIKE Performance Comparison at 1000 Records
WITH INDEX ON lower(lesson) AND lower() IN SEARCH CODE
Generated: 2026-01-18 10:08:55 UTC
================================================================================
Test Configuration:
- Records: 1000 (899 created for this test)
- Test Query: 'rest'
- Selectivity: 12.7% (127/1000 rows)
- Index on lesson: NONE
- Index on lower(lesson): index_learnings_on_lesson_trgm
================================================================================
QUERY PLAN ANALYSIS
================================================================================
ILIKE Query Plan:
Index Used: NO ❌
Seq Scan on public.learnings (cost=0.00..1622.76 rows=7149 width=107) (actual time=0.003..18.842 rows=7190 loops=1)
Output: id, created_at, creator_id, deleted_at, description, last_modifier_id, lesson, organization_id, public_visibility, updated_at
Filter: ((learnings.lesson)::text ~~* '%rest%'::text)
Rows Removed by Filter: 43911
Buffers: shared hit=984
Query Identifier: -6720959310802970193
Planning:
Buffers: shared hit=34
Planning Time: 0.102 ms
Execution Time: 19.001 ms
lower() + LIKE Query Plan:
Index Used: YES ✅
Bitmap Heap Scan on public.learnings (cost=449.76..1541.00 rows=7149 width=107) (actual time=0.807..3.141 rows=7190 loops=1)
Output: id, created_at, creator_id, deleted_at, description, last_modifier_id, lesson, organization_id, public_visibility, updated_at
Recheck Cond: (lower((learnings.lesson)::text) ~~ '%rest%'::text)
Heap Blocks: exact=984
Buffers: shared hit=1092
-> Bitmap Index Scan on index_learnings_on_lesson_trgm (cost=0.00..447.98 rows=7149 width=0) (actual time=0.740..0.740 rows=7190 loops=1)
Index Cond: (lower((learnings.lesson)::text) ~~ '%rest%'::text)
Buffers: shared hit=108
Query Identifier: -9216490675865545704
Planning:
Buffers: shared hit=1
Planning Time: 0.049 ms
Execution Time: 3.302 ms
================================================================================
PERFORMANCE BENCHMARK (50 queries)
================================================================================
ILIKE:
Total Time: 0.041s (40.58ms)
Per Query: 0.81ms
Execution Time: 19.001ms
lower() + LIKE:
Total Time: 0.029s (28.97ms)
Per Query: 0.58ms
Execution Time: 3.302ms
================================================================================
SUMMARY
================================================================================
Winner: lower()+LIKE is 1.4x faster
Performance Difference:
lower()+LIKE is 40.0% faster than ILIKE
Execution Time Comparison:
lower()+LIKE is 5.75x faster (3.302ms vs 19.001ms)
Verdict:
lower()+LIKE performs better with index on lower(lesson). The performance advantage is 40.0%.
================================================================================
NOTE
================================================================================
This comparison uses index on 'lower(lesson)' configuration.
lower()+LIKE queries can use this index efficiently, while ILIKE cannot.
ILIKE vs lower()+LIKE Performance Comparison at 200000 Records
WITH INDEX ON lower(lesson) AND lower() IN SEARCH CODE
Generated: 2026-01-18 06:48:32 UTC
================================================================================
Test Configuration:
- Records: 200000 (199900 created for this test)
- Test Query: 'rest'
- Selectivity: 14.21% (28421/200000 rows)
- Index on lesson: NONE
- Index on lower(lesson): index_learnings_on_lesson_trgm
================================================================================
QUERY PLAN ANALYSIS
================================================================================
ILIKE Query Plan:
Index Used: NO ❌
Seq Scan on public.learnings (cost=0.00..6225.00 rows=29040 width=108) (actual time=0.045..74.085 rows=28421 loops=1)
Output: id, created_at, creator_id, deleted_at, description, last_modifier_id, lesson, organization_id, public_visibility, updated_at
Filter: ((learnings.lesson)::text ~~* '%rest%'::text)
Rows Removed by Filter: 171579
Buffers: shared hit=3725
Query Identifier: -6720959310802970193
Planning:
Buffers: shared hit=32
Planning Time: 0.093 ms
Execution Time: 74.686 ms
lower() + LIKE Query Plan:
Index Used: YES ✅
Bitmap Heap Scan on public.learnings (cost=1906.77..6067.37 rows=29040 width=108) (actual time=3.890..14.817 rows=28421 loops=1)
Output: id, created_at, creator_id, deleted_at, description, last_modifier_id, lesson, organization_id, public_visibility, updated_at
Recheck Cond: (lower((learnings.lesson)::text) ~~ '%rest%'::text)
Heap Blocks: exact=3723
Buffers: shared hit=4146
-> Bitmap Index Scan on index_learnings_on_lesson_trgm (cost=0.00..1899.51 rows=29040 width=0) (actual time=3.565..3.565 rows=28421 loops=1)
Index Cond: (lower((learnings.lesson)::text) ~~ '%rest%'::text)
Buffers: shared hit=423
Query Identifier: -9216490675865545704
Planning:
Buffers: shared hit=1
Planning Time: 0.095 ms
Execution Time: 15.446 ms
================================================================================
PERFORMANCE BENCHMARK (50 queries)
================================================================================
ILIKE:
Total Time: 2.187s (2186.89ms)
Per Query: 43.74ms
Execution Time: 74.686ms
lower() + LIKE:
Total Time: 1.905s (1905.15ms)
Per Query: 38.1ms
Execution Time: 15.446ms
================================================================================
SUMMARY
================================================================================
Winner: lower()+LIKE is 1.15x faster
Performance Difference:
lower()+LIKE is 15.0% faster than ILIKE
Execution Time Comparison:
lower()+LIKE is 4.84x faster (15.446ms vs 74.686ms)
Verdict:
lower()+LIKE performs better with index on lower(lesson). The performance advantage is 15.0%.
================================================================================
NOTE
================================================================================
This comparison uses index on 'lower(lesson)' configuration.
lower()+LIKE queries can use this index efficiently, while ILIKE cannot.
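The summary figures in both result files can be rechecked from the reported timings. The sketch below is an illustration only: the helper names `speedup` and `percent_faster` are hypothetical, and the inputs are the rounded millisecond values printed above, so the ratios are approximate. It also shows why the reported percentages read 40.0% and 15.0%: the script rounds the ratio to two decimals before converting it to a percentage, whereas the unrounded ratios give roughly 40.1% and 14.8%.

```ruby
# frozen_string_literal: true

# Timings in milliseconds, copied from the two result files in this gist.
RUNS = {
  1_000 => { ilike_total: 40.58, lower_total: 28.97, ilike_exec: 19.001, lower_exec: 3.302 },
  200_000 => { ilike_total: 2186.89, lower_total: 1905.15, ilike_exec: 74.686, lower_exec: 15.446 }
}.freeze

# Hypothetical helper: how many times faster the fast variant is.
def speedup(slow_ms, fast_ms)
  slow_ms / fast_ms
end

# Hypothetical helper: percentage advantage computed from the unrounded ratio.
def percent_faster(slow_ms, fast_ms)
  ((speedup(slow_ms, fast_ms) - 1) * 100).round(1)
end

RUNS.each do |records, t|
  wall = speedup(t[:ilike_total], t[:lower_total]).round(2)
  exec = speedup(t[:ilike_exec], t[:lower_exec]).round(2)
  pct = percent_faster(t[:ilike_total], t[:lower_total])
  puts "#{records} records: wall clock #{wall}x (#{pct}%), EXPLAIN ANALYZE #{exec}x"
end
```

The wall-clock ratio covers 50 full ActiveRecord round trips, while the EXPLAIN ANALYZE ratio covers a single execution inside PostgreSQL, so the two ratios are not expected to agree.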