@austinlparker
Last active October 31, 2024 06:59
OpenTelemetry Collector Log Parser for ATProto PDS

Parsing PDS Logs With OpenTelemetry

This configuration should parse your ATProto/Bluesky PDS logs into nicely formatted and structured OpenTelemetry Logs.

You'll need OpenTelemetry Collector Contrib, which you can either install on your host or run as a container. If you run it as a container, mount the Docker log directory (/var/lib/docker/containers) into it as a read-only volume.
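If you go the container route, a Docker Compose service along these lines is one way to wire it up. This is a minimal sketch, not part of the original gist: the service name `otelcol` and the local file name `otel-config.yaml` are assumptions, and running as root is assumed because the Docker log files are normally readable only by root.

```yaml
services:
  otelcol:  # hypothetical service name
    image: otel/opentelemetry-collector-contrib:latest
    user: "0:0"  # assumption: root is needed to read /var/lib/docker/containers
    volumes:
      # Collector config (the local file name is an assumption)
      - ./otel-config.yaml:/etc/otelcol-contrib/config.yaml:ro
      # Docker's JSON log files, mounted read-only
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
```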

This also assumes that your Docker daemon.json has "tag": "{{.Name}}|{{.ImageName}}|{{.ID}}" set under log-opts. The filelog operators split this tag to recover the container name, image name, and ID.
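The same tag can also be set per service in Docker Compose rather than globally in daemon.json. A sketch, assuming the PDS runs as a Compose service named `pds` (a hypothetical name) and uses the json-file log driver:

```yaml
services:
  pds:  # hypothetical service name
    logging:
      driver: json-file
      options:
        # Same template as the daemon.json setting
        tag: "{{.Name}}|{{.ImageName}}|{{.ID}}"
```

Either way, log options only apply to containers created after the change, so existing containers need to be recreated before the tag shows up in their logs.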

As an exercise for the reader, you can use the count connector to turn the log stream into Prometheus or OTLP metrics. You can also add the hostmetrics receiver to get memory, CPU, disk, and other utilization metrics.
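One way to approach that exercise is sketched here. It is a fragment meant to be merged into the full config, not used on its own. The scraper list and the 30s interval are assumptions, and the count connector is left at its defaults, which should emit a `log.record.count` metric.

```yaml
receivers:
  hostmetrics:
    collection_interval: 30s  # assumption: choose your own interval
    scrapers:
      cpu:
      memory:
      disk:
      filesystem:

connectors:
  count:  # defaults: counts log records passing through the logs pipeline

service:
  pipelines:
    logs:
      receivers: [otlp, filelog]
      processors: [transform, batch]
      exporters: [debug, count]  # the connector acts as an exporter here
    metrics:
      receivers: [otlp, prometheus, hostmetrics, count]  # and as a receiver here
      processors: [batch]
      exporters: [debug]
```

If the Collector runs in a container, hostmetrics will likely report the container's view of the system unless the host filesystem is mounted in and the receiver is pointed at it.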

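To actually read the data stream, add an exporter for your favorite OTLP destination (such as honeycomb.io) and list it in the relevant pipelines. As written, the config only sends data to the debug exporter.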
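As an illustration, an OTLP exporter pointed at Honeycomb might look like the fragment below. Treat it as a sketch: the environment variable name `HONEYCOMB_API_KEY` is an assumption, and other OTLP destinations will need their own endpoint and headers.

```yaml
exporters:
  otlp:
    endpoint: api.honeycomb.io:443
    headers:
      x-honeycomb-team: ${env:HONEYCOMB_API_KEY}  # assumed env var name

service:
  pipelines:
    logs:
      receivers: [otlp, filelog]
      processors: [transform, batch]
      exporters: [debug, otlp]  # keep debug while testing, drop it later
```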

extensions:
  health_check:
  pprof:
    endpoint: localhost:1777
  zpages:
    endpoint: localhost:55679
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:4317
      http:
        endpoint: localhost:4318
  # Collect own metrics
  prometheus:
    config:
      scrape_configs:
        - job_name: "otel-collector"
          scrape_interval: 10s
          static_configs:
            - targets: ["localhost:8888"]
  filelog:
    include:
      - /var/lib/docker/containers/*/*-json.log
    encoding: utf-8
    force_flush_period: "0"
    include_file_name: false
    include_file_path: true
    max_concurrent_files: 1024
    max_log_size: 1MiB
    operators:
      - id: parser-docker
        timestamp:
          layout: "%Y-%m-%dT%H:%M:%S.%LZ"
          parse_from: attributes.time
        type: json_parser
      - id: extract_metadata_from_docker_tag
        parse_from: attributes.attrs.tag
        regex: ^(?P<name>[^\|]+)\|(?P<image_name>[^\|]+)\|(?P<id>[^$]+)$
        type: regex_parser
        if: "attributes?.attrs?.tag != nil"
      - from: attributes.name
        to: resource["docker.container.name"]
        type: move
        if: "attributes?.name != nil"
      - from: attributes.image_name
        to: resource["docker.image.name"]
        type: move
        if: "attributes?.image_name != nil"
      - from: attributes.id
        to: resource["docker.container.id"]
        type: move
        if: "attributes?.id != nil"
      - from: attributes.stream
        to: resource["log.io.stream"]
        type: move
      - field: attributes.attrs.tag
        type: remove
        if: "attributes?.attrs?.tag != nil"
      - from: attributes.log
        to: body
        type: move
    poll_interval: 200ms
    start_at: beginning
processors:
  batch:
  transform:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          - set(body, ParseJSON(body)) where IsMatch(body, "^\\{")
exporters:
  debug:
  debug/verbose:
    verbosity: detailed
service:
  telemetry:
    logs:
      level: debug
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
    metrics:
      receivers: [otlp, prometheus]
      processors: [batch]
      exporters: [debug]
    logs:
      receivers: [otlp, filelog]
      processors: [transform, batch]
      exporters: [debug]
  extensions: [health_check, pprof, zpages]
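
While testing, it can help to see the parsed records in full. The config defines a `debug/verbose` exporter, but no pipeline uses it. Swapping it into the logs pipeline, as in this fragment, should print each record with its resource attributes and parsed body:

```yaml
service:
  pipelines:
    logs:
      receivers: [otlp, filelog]
      processors: [transform, batch]
      exporters: [debug/verbose]  # defined under exporters with verbosity: detailed
```

Switch back to `debug` once the output looks right, since detailed output is noisy.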