@asaf400
Created June 15, 2022 16:30
Kafka Topic Statistics
#!/bin/bash
kafkajson='{"bootstrapserver":"kafka-staging-us-east-1..com:9092","sets":[{"name":"abc","topic":"def"}]}'
# Read the broker address from the JSON config instead of hard-coding it.
bootstrap=$(jq -r '.bootstrapserver' <<<"$kafkajson")
for topic in $(jq -r '.sets[].topic' <<<"$kafkajson"); do
echo "Working on Topic: $topic"
echo "running: kcat -C -e -b $bootstrap -t $topic -f '%T %S\n' -o beginning"
# Dump one "<timestamp_ms> <payload_bytes>" line per message and exit at the end of the topic.
kcat -C -e -b "$bootstrap" -t "$topic" -f '%T %S\n' -o beginning > "$topic"
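# Split the dump into payload sizes (field 2) and timestamps (field 1).
cut -d' ' -f2 <"$topic" > "${topic}_bytes"
cut -d' ' -f1 <"$topic" > "${topic}_timestamps"
# Average payload size, truncated to a whole number of bytes.
avg=$(jq -s 'add / length | floor' <"${topic}_bytes")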
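# Messages from different partitions are interleaved, so take the min/max timestamp rather than the first/last line.
earliest=$(sort -n "${topic}_timestamps" | head -n 1)
latest=$(sort -n "${topic}_timestamps" | tail -n 1)
# Round milliseconds to the nearest whole second.
earliest_s=$(( (earliest + 500) / 1000 ))
latest_s=$(( (latest + 500) / 1000 ))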
seconds=$((latest_s-earliest_s))
length=$(wc -l "$topic""_timestamps" | awk '{print $1}')
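# Guard against division by zero: if all messages fall within the same second, treat the span as one second.
if [ "$seconds" -gt 0 ]; then
per_s=$(echo "$length / $seconds" | bc)
else
per_s=$length
fi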
echo "Topic: $topic averages $avg bytes per message"
echo "Topic: $topic has $per_s messages per second"
echo "Topic: $topic starts @ $(date -d @$earliest_s) and ends @ $(date -d @$latest_s)"
done
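
The per-topic arithmetic above (average payload size, time span, messages per second) can also be done in a single pass over the dump file. The following is only a sketch under stated assumptions: `topic_stats` is a hypothetical helper name that is not part of the original script, and it expects a file of `<timestamp_ms> <payload_bytes>` lines such as the one produced by `kcat -f '%T %S\n'`. It prints the average bytes, messages per second, and span in seconds, and it does not assume the lines are sorted by timestamp.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original gist):
# summarize a file of "<timestamp_ms> <payload_bytes>" lines.
# Prints: "<avg_bytes> <messages_per_second> <span_seconds>"
topic_stats() {
  awk '
    NR == 1 { min = max = $1 }          # seed min/max with the first timestamp
    {
      if ($1 < min) min = $1            # track the earliest timestamp seen
      if ($1 > max) max = $1            # track the latest timestamp seen
      bytes += $2                       # accumulate payload sizes
    }
    END {
      if (NR == 0) { print "0 0 0"; exit }   # empty topic: nothing to report
      # Round each timestamp from milliseconds to the nearest second.
      span = int((max + 500) / 1000) - int((min + 500) / 1000)
      # If everything falls within one second, treat the span as one second.
      rate = (span > 0) ? int(NR / span) : NR
      print int(bytes / NR), rate, span
    }
  ' "$1"
}
```

With this in place, something like `read -r avg per_s seconds < <(topic_stats "$topic")` could stand in for the separate `cut`, `jq`, `sed`, and `bc` steps, at the cost of moving the logic into awk.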