@chriselgee
Last active February 27, 2025 04:08
### This assumes you have a credential file called BreachData.txt in the format of email:password, one per line
# Set up and start ClickHouse
mkdir clickhouse && cd clickhouse
curl https://clickhouse.com/ | sh
./clickhouse server
# IN A NEW TERMINAL
# Create and populate the database (assumes the cred file is one level up)
./clickhouse client --query 'CREATE DATABASE creds'
./clickhouse client --query 'CREATE TABLE creds.logins(`email` String,`password` String) ENGINE = MergeTree ORDER BY email'
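# Optional pre-check (a sketch, not part of the original steps): count the lines that
# would not fit the email:password pattern used by the import, i.e. lines with no colon
# or with nothing before the first colon. count_malformed is a hypothetical helper name.
# Expect this to take a while on a file of this size.
count_malformed() {
  # -F: splits on colons; NF < 2 means no colon, $1 == "" means an empty email part
  awk -F: 'NF < 2 || $1 == "" { n++ } END { print n + 0 }' "$1"
}
if [ -f ../BreachData.txt ]; then
  count_malformed ../BreachData.txt
fi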
# this next one took about 30 minutes on my NUC (for ~110GB of data!)
./clickhouse client --query="INSERT INTO creds.logins FORMAT Regexp" --format_regexp='^(?<email>[^:]+):(?<password>.*)$' < ../BreachData.txt
# To query
./clickhouse client --query "SELECT * FROM creds.logins WHERE email LIKE '%@client.org'"
./clickhouse client --query "SELECT * FROM creds.logins WHERE email LIKE 'specific.victim%'"
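# Case-insensitive variant (a sketch, not from the original): LIKE is case-sensitive in
# ClickHouse, so addresses stored as User@Client.ORG would be missed by the queries above.
# ILIKE matches regardless of case; it will likely scan the whole table, so expect it to be slower.
# domain_query is a hypothetical helper name that only builds the query text for a trusted domain string.
domain_query() {
  # %% prints a literal % for the SQL wildcard; %s is replaced by the domain argument
  printf "SELECT email, password FROM creds.logins WHERE email ILIKE '%%@%s'" "$1"
}
if [ -x ./clickhouse ]; then
  ./clickhouse client --query "$(domain_query client.org)"
fi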