@sughodke
Last active February 28, 2025 00:07
Read Mail.app Database

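If you want to dig into your email and you use the native macOS Mail app, check out these queries.

Mail.app uses SQLite as its datastore for messages, recipients, and so on.

Open the database with the sqlite3 command-line shell. The V5 directory name varies with the macOS version, so adjust the path if yours differs.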

cd ~/Library/Mail/V5/MailData
sqlite3 "Envelope Index"
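If you would rather poke at the database from Python, here is a minimal sketch that opens the file read-only and lists its tables. The `DEFAULT_DB` constant and the helper names `open_readonly` and `list_tables` are made up for illustration, and the path assumes the same V5 layout as above. Opening with `mode=ro` is a precaution so nothing in Mail's data can be changed by accident.

```python
import sqlite3
from pathlib import Path

# Assumed location; the "V5" component differs between macOS releases.
DEFAULT_DB = Path.home() / "Library/Mail/V5/MailData/Envelope Index"


def open_readonly(db_path):
    """Open a SQLite file read-only via a file: URI (hypothetical helper)."""
    # as_uri() percent-encodes the space in "Envelope Index"; SQLite decodes it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def list_tables(conn):
    """Return the names of all tables in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]
```

Calling `list_tables(open_readonly(DEFAULT_DB))` should show the tables available to query, such as the `messages` and `addresses` tables used below.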

Looking around, you can run some interesting queries. A few examples:

-- Spammy users in the last 180 days
SELECT address, count(sender) AS cnt FROM messages
JOIN addresses ON sender = addresses.rowid
WHERE date_sent > CAST(strftime('%s', 'now', '-180 days') AS INTEGER)
GROUP BY sender
ORDER BY cnt DESC
LIMIT 25;
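The sender count can also be run from Python with the time window passed in as a parameter instead of being baked into the SQL. This is only a sketch: `top_senders` is a hypothetical helper, and it assumes the `messages.sender`, `messages.date_sent` and `addresses.address` columns used in the query above, with `date_sent` stored as Unix epoch seconds.

```python
import sqlite3
import time


def top_senders(conn, days=180, limit=25, now=None):
    """Return (address, count) rows for the most frequent senders.

    Only messages sent within the last `days` days are counted.
    `now` (epoch seconds) can be supplied to make the result reproducible.
    """
    now = int(time.time()) if now is None else int(now)
    cutoff = now - days * 86400  # 86400 seconds per day
    sql = """
        SELECT a.address, count(*) AS cnt
        FROM messages m
        JOIN addresses a ON m.sender = a.rowid
        WHERE m.date_sent > ?
        GROUP BY m.sender
        ORDER BY cnt DESC
        LIMIT ?
    """
    return conn.execute(sql, (cutoff, limit)).fetchall()
```

For example, `top_senders(conn, days=30, limit=10)` narrows the window to the last 30 days, where `conn` is any `sqlite3` connection to the Envelope Index file.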
-- Unique domains from known addresses
SELECT r.name
     , count(*) AS cnt
FROM (
  SELECT
    lower(
      substr(address,
        instr(address, '@') + 1  -- start just after the '@'
      )
    ) AS name
  FROM addresses
) r
GROUP BY r.name
ORDER BY cnt DESC;
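The same domain count can be wrapped in a small Python function. This is a sketch under the assumption that the `addresses` table has an `address` column as above; `domain_counts` and the `mail_domain` alias are names invented for the example. Here the `+ 1` offset starts the substring just after the `@`, and rows without an `@` are skipped.

```python
import sqlite3


def domain_counts(conn):
    """Return (domain, count) rows, most common domain first."""
    sql = """
        SELECT lower(substr(address, instr(address, '@') + 1)) AS mail_domain,
               count(*) AS cnt
        FROM addresses
        WHERE instr(address, '@') > 0      -- ignore entries with no '@'
        GROUP BY mail_domain
        ORDER BY cnt DESC, mail_domain     -- alphabetical order breaks ties
    """
    return conn.execute(sql).fetchall()
```

Lowercasing before grouping means `Example.com` and `example.com` land in the same bucket.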
"""
Plots the date last opened against the date received.
"""
import os
import time

import pandas as pd
from matplotlib import pyplot as plt
from sqlalchemy import create_engine

# SQLAlchemy does not expand '~', so build an absolute path first.
db_path = os.path.expanduser('~/Library/Mail/V5/MailData/Envelope Index')
engine = create_engine(f'sqlite:///{db_path}')

with engine.connect() as conn, conn.begin():
    msg = pd.read_sql_table('messages', conn)
    adr = pd.read_sql_table('addresses', conn)

# Convert the Unix-epoch columns to datetimes (dt_sent, dt_received, ...).
for x in ['date_sent', 'date_received', 'date_created', 'date_last_viewed']:
    msg[x.replace('date_', 'dt_')] = pd.to_datetime(msg[x], unit='s')

# Plot opened emails in the last 180 days
cutoff = time.time() - 180 * 24 * 60 * 60
recent = msg[msg['date_sent'] > cutoff]
fig, ax = plt.subplots(figsize=(10, 10))
ax.scatter(recent['dt_received'], recent['dt_last_viewed'])
ax.set_xlabel('date received')
ax.set_ylabel('date last viewed')
plt.title('Re-opened emails in the last 180 days')
plt.show()

zrlram commented Dec 11, 2023

Here is another query that might be useful ... thought I'd share if you want to add:

SELECT
m.date_received, DATETIME(ROUND(m.date_sent), 'unixepoch') AS date ,
su.subject, r.address as _from,
GROUP_CONCAT( a.address , ',') as _to_multi, mail.url
FROM recipients s
JOIN addresses a on s.address = a.rowid
JOIN messages m on s.message = m.rowid
JOIN subjects su on m.subject = su.rowid
JOIN addresses r on m.sender = r.rowid
JOIN mailboxes mail ON m.mailbox = mail.ROWID
GROUP BY s.message, m.subject, su.subject, r.address
ORDER BY date desc
;
