@kaapstorm
Last active October 25, 2024 11:42
Adding metrics to Datadog

commcare-hq supports both Datadog and Prometheus (a FOSS alternative to Datadog).

I will discuss Datadog.

Types

From Prometheus documentation:

Counter

A counter is a cumulative metric that represents a single monotonically increasing counter whose value can only increase or be reset to zero on restart. For example, you can use a counter to represent the number of requests served, tasks completed, or errors.

Do not use a counter to expose a value that can decrease. For example, do not use a counter for the number of currently running processes; instead use a gauge.

Gauge

A gauge is a metric that represents a single numerical value that can arbitrarily go up and down.

Gauges are typically used for measured values like temperatures or current memory usage, but also "counts" that can go up and down, like the number of concurrent requests.
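
To make the difference between the two types concrete, here is a minimal sketch in plain Python. The Counter and Gauge classes are toy stand-ins written for this note; they are not the Datadog, Prometheus, or commcare-hq implementations.

```python
class Counter:
    """Toy counter (hypothetical): the value can only go up."""

    def __init__(self):
        self.value = 0

    def inc(self, amount=1):
        if amount < 0:
            raise ValueError("a counter cannot decrease; use a gauge instead")
        self.value += amount


class Gauge:
    """Toy gauge (hypothetical): the value is whatever was last measured."""

    def __init__(self):
        self.value = 0

    def set(self, value):
        self.value = value


# Counting events: each request adds one, and the total never goes down.
requests_served = Counter()
for _ in range(3):
    requests_served.inc()

# Measuring a level: the number of running processes can rise and fall.
running_processes = Gauge()
running_processes.set(5)
running_processes.set(2)  # going down is fine for a gauge
```

After this runs, requests_served.value is the total number of events seen, while running_processes.value is only the most recent measurement.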

Histogram

A histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets.

https://en.wikipedia.org/wiki/Histogram#Examples
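
As a rough illustration of what "counts them in configurable buckets" means, the sketch below sorts a few made-up durations into buckets by upper bound. The bucket_label() helper and the "le_" / "over_" label format are assumptions made for this example; the labels that Datadog and Prometheus actually produce may differ.

```python
from bisect import bisect_left
from collections import Counter

# Upper bounds of the buckets, in milliseconds.
BUCKETS = (100, 500, 1000, 5000)


def bucket_label(value, buckets=BUCKETS):
    """Return a label for the smallest bucket whose bound is >= value."""
    i = bisect_left(buckets, value)
    if i == len(buckets):
        # Larger than every bound: goes into an overflow bucket.
        return f"over_{buckets[-1]}"
    return f"le_{buckets[i]}"


# Made-up request durations in milliseconds.
observations = [42, 100, 250, 999, 1200, 7000]

# A histogram keeps one count per bucket, not the raw observations.
counts = Counter(bucket_label(v) for v in observations)
```

The point of the design is that only the per-bucket counts are stored and sent, which is why the choice of bucket bounds matters.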

Histograms work slightly differently in Datadog and Prometheus.

Code

corehq/util/metrics/__init__.py

https://github.com/dimagi/commcare-hq/blob/master/corehq/util/metrics/__init__.py

When naming your metrics, the name must start with "commcare.".
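
The naming rule can be pictured with a small check like the one below. check_metric_name() is a hypothetical helper written for this note, not the function commcare-hq uses, and the error it raises is an assumption.

```python
def check_metric_name(name, prefix="commcare"):
    """Return name unchanged if it starts with "<prefix>.", else raise."""
    if not name.startswith(prefix + "."):
        raise ValueError(f'metric name {name!r} must start with "{prefix}."')
    return name


# A name taken from the examples in this document passes the check.
ok_name = check_metric_name("commcare.repeaters.check.locked_out")

# A name without the prefix is rejected.
try:
    check_metric_name("repeaters.check.locked_out")
    rejected = False
except ValueError:
    rejected = True
```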

Counter

metrics_counter()

commcare.repeaters.check.locked_out

commcare-hq:

metrics_counter("commcare.repeaters.check.locked_out", tags={'partition': partition})
Datadog

commcare.repeaters.check.attempt_forward

commcare-hq:

metrics_counter("commcare.repeaters.check.attempt_forward")
Datadog

Gauge

metrics_gauge()

commcare-hq:

metrics_gauge(
    'commcare.pillowtop.error_queue',
    row['num_errors'],
    tags={
        'pillow_name': row['pillow'],
        'host': 'celery',
        'group': 'celery'
    },
    multiprocess_mode=MPM_MAX  # (Prometheus only)
)

metrics_gauge_task()

metrics_gauge_task() wraps metrics_gauge() and defines a task.
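
The idea of wrapping a gauge in a task can be sketched without any scheduler. In the example below, record_gauge() stands in for metrics_gauge() and make_gauge_task() stands in for metrics_gauge_task(); both are hypothetical names, and the real function also registers the task to run on a schedule, which this sketch leaves out.

```python
recorded = {}


def record_gauge(name, value):
    """Stand-in for metrics_gauge(): remember the latest value per name."""
    recorded[name] = value


def make_gauge_task(name, fn):
    """Return a zero-argument task that measures fn() and records it."""
    def task():
        record_gauge(name, fn())
    return task


# Hypothetical thing to measure: the length of an in-memory queue.
queue = ["a", "b", "c"]
report_queue_length = make_gauge_task(
    "commcare.example.queue_length",  # made-up metric name
    lambda: len(queue),
)

report_queue_length()  # a scheduler would call this periodically
queue.pop()
report_queue_length()  # the gauge now reflects the shorter queue
```

Note that the function to measure is passed uncalled, as in the commcare-hq example, so that it is evaluated afresh each time the task runs.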

commcare-hq:

metrics_gauge_task(
    'commcare.repeaters.overdue',
    RepeatRecord.objects.count_overdue,
    run_every=crontab(),  # every minute
    multiprocess_mode=MPM_MAX  # (Prometheus only)
)

Datadog

Histogram

metrics_histogram()

commcare-hq:

metrics_histogram(
    'commcare.repeaters.repeat_record_processing.timing',
    processing_time * 1000,
    buckets=(100, 500, 1000, 5000),
    bucket_tag='duration',
    bucket_unit='ms',
    tags={
        'domain': repeat_record.domain,
        'action': action,
    },
)

Datadog: Tags allow you to filter your data

Click the "Edit" tab on each of those graphs. Compare their entries under "Graph your data". Notice that in the "from" field, data is filtered by specific values of the "action" tag.

Datadog: Tags allow users to filter the graph

Click the "Edit" tab on each of those graphs. Compare their entries under "Graph your data". Notice that the metric and the "from" field are the same. One graph has "sum by" set to the "duration" tag, and the other graph has "sum by" set to the "domain" tag. This difference allows users to filter the same graph by different tags.
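
The effect of "from" and "sum by" can be imitated in a few lines of Python. This is only a sketch of the idea: the data points, the tag values, and the sum_by() helper are all made up for illustration, and Datadog's own query engine is not involved.

```python
from collections import defaultdict

# Made-up data points for one metric, each carrying the same three tags.
points = [
    {"value": 3, "tags": {"domain": "alpha", "action": "attempted", "duration": "lt_100ms"}},
    {"value": 2, "tags": {"domain": "alpha", "action": "attempted", "duration": "lt_500ms"}},
    {"value": 4, "tags": {"domain": "beta", "action": "attempted", "duration": "lt_100ms"}},
    {"value": 1, "tags": {"domain": "beta", "action": "waited", "duration": "lt_100ms"}},
]


def sum_by(points, tag, where=None):
    """Total the values per value of `tag`, keeping points that match `where`."""
    where = where or {}
    totals = defaultdict(int)
    for point in points:
        tags = point["tags"]
        # The "from" field: keep only points whose tags match the filter.
        if all(tags.get(key) == wanted for key, wanted in where.items()):
            # The "sum by" field: one total per value of the chosen tag.
            totals[tags[tag]] += point["value"]
    return dict(totals)


by_duration = sum_by(points, "duration")
by_domain = sum_by(points, "domain")
attempted_by_domain = sum_by(points, "domain", where={"action": "attempted"})
```

The same points produce different series depending on which tag they are grouped by, which is why it pays to choose tags with the graphs you want in mind.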

metrics_histogram_timer()

metrics_histogram_timer() is a context manager for timing a block of code, and organizing the timings into buckets.
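
As a rough picture of what such a context manager does, here is a minimal sketch using only the standard library. histogram_timer() and the timings store are hypothetical, the label format is made up, and treating the bucket bounds as seconds is an assumption; the real implementation reports to the metrics backend instead.

```python
import time
from bisect import bisect_left
from collections import Counter
from contextlib import contextmanager

# Stand-in for the metrics backend: one count per (name, bucket label).
timings = Counter()


@contextmanager
def histogram_timer(name, timing_buckets):
    """Time the enclosed block and count it in the matching bucket."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the timing even if the block raised an exception.
        elapsed = time.perf_counter() - start
        i = bisect_left(timing_buckets, elapsed)
        if i < len(timing_buckets):
            label = f"le_{timing_buckets[i]}s"
        else:
            label = f"over_{timing_buckets[-1]}s"
        timings[(name, label)] += 1


# Bucket bounds are assumed to be in seconds for this sketch.
with histogram_timer("commcare.example.timing", timing_buckets=(1, 10, 60)):
    sum(range(1000))  # the block of code being timed
```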

commcare-hq:

with metrics_histogram_timer(
    "commcare.repeaters.check.processing",
    timing_buckets=_check_repeaters_buckets,
):
    ...  # the block of code being timed

Datadog

Adding a new widget

  1. In your editor or IDE, copy the name of the metric from the code.

  2. In Datadog, scroll to the group for the new widget.

  3. Click "+ Add Widgets".

  4. Drag "Timeseries" into the group.

  5. The graph initially shows the "system.cpu.user" metric. Click in the field, paste the name of the metric over "system.cpu.user", and select the metric.

  6. Scroll to the bottom of the modal, and give your widget a name.

  7. Click "Save".
