@justinabrahms
Last active October 10, 2025 22:43

GitHub Audit Log Filter for SOX Compliance

This tool filters GitHub audit log exports to show only events from repositories with the sox:true custom property.

Requirements

  • Python 3.7+ (the script calls subprocess.run with capture_output and text, both added in 3.7)
  • GitHub CLI (gh) installed and authenticated
  • Organization admin access to view audit logs and custom properties

Usage

Step 1: Export Audit Log from GitHub

  1. Go to your organization settings → Audit log
  2. Apply filters (see "Recommended Filters" below)
  3. Export as CSV

Alternatively, use the GitHub CLI. Note that this writes JSON, while the filter script in Step 2 reads a CSV export:

gh api "/orgs/YOUR_ORG/audit-log?phrase=created:2025-07-01%20-action:team_sync_tenant.enabled%20..." --paginate > audit.json
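Because the command above saves JSON and the filter script reads CSV, a conversion step may be needed. The following is a minimal sketch, not part of the original tool. It assumes each event is a JSON object with `action` and `repo` keys (the column names are an assumption based on the `repo` column the script reads), and it accepts either one JSON array or several written back to back, since the exact shape of the paginated output is not shown here. The helper names `iter_events` and `events_to_csv` are hypothetical.

```python
import csv
import io
import json
from typing import Dict, Iterator, List


def iter_events(text: str) -> Iterator[Dict]:
    """Yield events from one or more JSON arrays written back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it here.
        if text[pos].isspace():
            pos += 1
            continue
        page, pos = decoder.raw_decode(text, pos)
        yield from page


def events_to_csv(text: str, columns: List[str]) -> str:
    """Render the chosen event fields as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for event in iter_events(text):
        # Missing fields (e.g. org-level events with no repo) become empty cells.
        writer.writerow({col: event.get(col, "") for col in columns})
    return out.getvalue()


# Hypothetical sample: two pages, the second event has no repo.
sample = '[{"action": "repo.create", "repo": "myorg/a"}]\n[{"action": "org.add_member"}]'
print(events_to_csv(sample, ["action", "repo"]))
```

Writing the result to a file would give the filter script an input with the `repo` column it looks for; which other columns to carry over depends on your reporting needs.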

Step 2: Filter by SOX Repos

python filter_audit_log.py input.csv --org YOUR_ORG -o filtered.csv

Example

# Filter the audit log export
python filter_audit_log.py ~/Downloads/audit-export.csv \
  --org myorg \
  -o sox_audit_filtered.csv

Recommended Audit Log Filters

When querying the audit log, use these filters to exclude noisy events:

-action:team_sync_tenant.enabled
-action:pull_request.*
-action:org.register_self_hosted_runner
-action:issue_comment.*
-action:pull_request_review_comment.*
-action:pull_request_review.*
-action:workflows.*
-action:repo.download_zip
-action:protected_branch.rejected_ref_update
-action:custom_property_value.*
-action:integration_installation.*
-action:org_credential_authorization.*
-action:*.*actions_secret
-action:environment.create
-action:repo.create_actions_secret
-action:repository_vulnerability_alert.*
-action:repo.remove_actions_secret
-action:repo.update_actions_secret
-action:team.add_repository

These filters remove:

  • Pull request and issue comment activity
  • Workflow runs and secrets management
  • Integration installations
  • Other high-volume, low-risk events

You can customize these filters based on your compliance requirements.
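If you query through the CLI rather than the web UI, the exclusions have to be joined into a single URL-encoded `phrase` value, as in the Step 1 command. Here is a small sketch of one way to build it. The function name `build_phrase` is hypothetical, and the encoding choices (spaces as `%20`, with `:` and `*` left literal) simply mirror the example command rather than any documented requirement.

```python
from typing import Iterable
from urllib.parse import quote


def build_phrase(created: str, excluded_actions: Iterable[str]) -> str:
    """Join a created: qualifier and -action: exclusions into one encoded phrase."""
    terms = [f"created:{created}"]
    terms += [f"-action:{action}" for action in excluded_actions]
    # Keep ':' and '*' literal; spaces are encoded as %20.
    return quote(" ".join(terms), safe=":*")


phrase = build_phrase("2025-07-01", ["team_sync_tenant.enabled", "pull_request.*"])
print(f"/orgs/YOUR_ORG/audit-log?phrase={phrase}")
```

Keeping the exclusions in a Python list makes it easier to review and adjust them than editing one long query string by hand.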

How It Works

  1. Queries GitHub for all repositories with the sox:true custom property
  2. Reads the CSV audit log export
  3. Keeps only events whose repo column names a sox:true repo (events with no repo, such as SSO and org-level events, are dropped)
  4. Writes a new CSV containing only the kept events
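The steps above can be illustrated on in-memory data, without calling GitHub. This is a simplified sketch rather than the script itself: the property response shape follows the comment in the script, the sample data is made up, and the helper names `select_sox_repos` and `keep_sox_rows` are hypothetical.

```python
import csv
import io
from typing import Dict, List, Set


def select_sox_repos(property_values: List[Dict]) -> Set[str]:
    """Step 1: keep repos whose 'sox' property has the value 'true'."""
    return {
        entry["repository_full_name"]
        for entry in property_values
        if any(
            p.get("property_name") == "sox" and p.get("value") == "true"
            for p in entry.get("properties", [])
        )
    }


def keep_sox_rows(csv_text: str, sox_repos: Set[str]) -> List[Dict[str, str]]:
    """Steps 2-3: read CSV rows and keep those whose 'repo' is a SOX repo."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if (row.get("repo") or "").strip() in sox_repos]


# Made-up sample data for illustration only.
properties = [
    {"repository_full_name": "myorg/billing",
     "properties": [{"property_name": "sox", "value": "true"}]},
    {"repository_full_name": "myorg/website",
     "properties": [{"property_name": "sox", "value": "false"}]},
]
audit_csv = "action,repo\nrepo.access,myorg/billing\nrepo.access,myorg/website\norg.add_member,\n"

sox = select_sox_repos(properties)
print(keep_sox_rows(audit_csv, sox))
```

Matching is an exact string comparison on the full repository name, so a row is kept only when its `repo` value equals a name returned by the property query.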

Output

The script provides a summary:

Found 42 repositories with sox:true
Processed 15000 events
Skipped 8000 events without a repo
Kept 2500 events from sox:true repos
Filtered 4500 events from non-sox repos
Filtered output written to: sox_audit_filtered.csv
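The four counts are related: every processed event is either skipped for lacking a repo, kept, or filtered out. A quick sketch for sanity-checking a run against the sample figures above (the function name is hypothetical):

```python
def filtered_non_sox(processed: int, skipped_no_repo: int, kept: int) -> int:
    """Events that named a repo but were dropped because it is not sox:true."""
    return processed - skipped_no_repo - kept


print(filtered_non_sox(15000, 8000, 2500))  # 4500
```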

Sharing Configuration

To share your configuration files (README, filter script, etc.) as a public gist:

make gist

This will:

  • Create a new public gist with all non-JSON, non-CSV files (first run)
  • Save the gist ID to .gist_id (committed to version control)
  • Update the existing gist on subsequent runs

Everyone using this repo will update the same gist since .gist_id is tracked in version control.
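The Makefile behind `make gist` is not shown here, so the following only sketches the file selection rule described above (all non-JSON, non-CSV files). The function name `gist_files` is hypothetical, and whether the real target also excludes files such as `.gist_id` is unknown; this sketch applies the extension rule and nothing else.

```python
from pathlib import Path
from typing import List

# Extensions the description says are left out of the gist.
EXCLUDED_SUFFIXES = {".json", ".csv"}


def gist_files(directory: str) -> List[str]:
    """Return names of regular files in `directory` that are not JSON or CSV."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() not in EXCLUDED_SUFFIXES
    )


if __name__ == "__main__":
    print(gist_files("."))
```

Excluding JSON and CSV keeps audit exports, which may contain sensitive event data, out of a public gist.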

#!/usr/bin/env python3
"""
Filter GitHub audit log CSV to only include events from repos with sox:true custom property.

Usage:
    python filter_audit_log.py input.csv --org myorg -o output.csv
"""
import argparse
import csv
import subprocess
import sys
from typing import Set


def get_sox_repos(org: str) -> Set[str]:
    """
    Get all repositories with sox:true custom property using gh CLI.
    Returns a set of full repository names (e.g., "org/repo").
    """
    try:
        # Query custom properties for all repos
        # The API returns: [{"repository_full_name": "org/repo", "properties": [{"property_name": "sox", "value": "true"}]}]
        cmd = [
            "gh", "api",
            f"/orgs/{org}/properties/values",
            "--paginate",
            "-q", '.[] | select(.properties[] | select(.property_name == "sox" and .value == "true")) | .repository_full_name'
        ]
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        # One repository name per output line; the set also removes duplicates
        repos = set(line.strip() for line in result.stdout.strip().split('\n') if line.strip())
        print(f"Found {len(repos)} repositories with sox:true", file=sys.stderr)
        return repos
    except subprocess.CalledProcessError as e:
        print(f"Error querying GitHub API: {e.stderr}", file=sys.stderr)
        sys.exit(1)
    except FileNotFoundError:
        print("Error: 'gh' CLI not found. Please install GitHub CLI.", file=sys.stderr)
        sys.exit(1)


def filter_audit_log(input_file: str, output_file: str, sox_repos: Set[str], org: str):
    """
    Filter audit log CSV to only include events from sox repos.
    """
    filtered_count = 0
    total_count = 0
    skipped_no_repo = 0
    with open(input_file, 'r', encoding='utf-8') as infile, \
            open(output_file, 'w', encoding='utf-8', newline='') as outfile:
        reader = csv.DictReader(infile)
        if not reader.fieldnames:
            print("Error: Input CSV has no headers", file=sys.stderr)
            sys.exit(1)
        writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            total_count += 1
            # Extract repo name from the event
            # The 'repo' column contains full names like "org/repo-name"
            repo_full_name = row.get('repo', '').strip()
            if not repo_full_name:
                # Skip events without a repo (SSO, org-level events, etc.)
                skipped_no_repo += 1
                continue
            if repo_full_name in sox_repos:
                writer.writerow(row)
                filtered_count += 1
    print(f"Processed {total_count} events", file=sys.stderr)
    print(f"Skipped {skipped_no_repo} events without a repo", file=sys.stderr)
    print(f"Kept {filtered_count} events from sox:true repos", file=sys.stderr)
    print(f"Filtered {total_count - skipped_no_repo - filtered_count} events from non-sox repos", file=sys.stderr)
    print(f"Filtered output written to: {output_file}", file=sys.stderr)


def main():
    parser = argparse.ArgumentParser(
        description="Filter GitHub audit log CSV to only sox:true repos",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Example workflow:
  1. Export audit log from GitHub UI (Settings → Audit log → Export)
  2. Run: python filter_audit_log.py audit.csv --org MYORG -o filtered.csv
"""
    )
    parser.add_argument('input', help='Input CSV file (audit log export)')
    parser.add_argument('-o', '--output', required=True, help='Output CSV file')
    parser.add_argument('--org', required=True, help='GitHub organization name')
    args = parser.parse_args()

    # Get repos with sox:true
    sox_repos = get_sox_repos(args.org)
    if not sox_repos:
        print("Warning: No repositories found with sox:true custom property", file=sys.stderr)
        sys.exit(0)

    # Filter audit log
    filter_audit_log(args.input, args.output, sox_repos, args.org)


if __name__ == '__main__':
    main()