An experienced operator's guide to streaming Kubernetes workload logs into Quickwit.
Every Kubernetes pod has its STDOUT and STDERR streams written to the kube node filesystem. Ever wondered how kubectl logs ... works? Well, the container logs are always written to disk and automatically trimmed before they can fill up the kube node's filesystem. The node's kubelet streams them to you on-demand -- going from filesystem to kubelet to kube api and finally over to your kubectl! Read more about Kubernetes standard log management.
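If you've never poked at it, the on-disk layout is easy to see for yourself. A quick sketch, assuming node shell access and the containerd runtime (the paths follow the standard kubelet layout; the placeholders are whatever pod you pick) -- these are exactly the files we'll have vector tail later:

% ls /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container>/
0.log  1.log  2.log                          # rotated by the kubelet before they can fill the disk
% kubectl logs <pod-name> -c <container>     # the same bytes, streamed via kubelet and the kube api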
We want to create an alternative path for these logs -- they will still be available for use with commands such as kubectl logs ... but we want to tail the files and ship their contents off to Quickwit as soon as possible. Getting to the point where everything works can be an odyssey, so this is my attempt at writing the pamphlet rather than the book.
I have been accumulating data with Quickwit+Vector for about two days. The visibility has already helped me identify orphaned workloads and misconfigured logging in important cronjobs. And it has done so with a minimal Kubernetes footprint and negligible S3 usage.
1.9GB of compressed data in S3:

and the kube footprint is minuscule (and also not configured for high-availability just yet!)
% kubectl -n vector top pods
NAME                 CPU(cores)   MEMORY(bytes)
vector-agent-2fw7s   2m           25Mi
vector-agent-448zq   1m           26Mi
vector-agent-45m98   2m           27Mi
vector-agent-4b4r4   1m           21Mi
vector-agent-67lkg   1m           18Mi
[[[ SNIP ]]]
and
% kubectl -n quickwit top pods
NAME                                           CPU(cores)   MEMORY(bytes)
quickwit-logs-control-plane-7447dfb4d9-xgb2m   2m           11Mi
quickwit-logs-indexer-0                        25m          166Mi
quickwit-logs-janitor-9cb844987-whg7g          2m           19Mi
quickwit-logs-metastore-54d7c68f59-gnfjx       2m           15Mi
quickwit-logs-searcher-0                       2m           249Mi
Around these parts, we use ArgoCD to manage our apps. We don't manage helm repos locally and we don't run helm release updates by hand. We rely on gitops and active state sync to keep our many apps in sync. Kubernetes is complex and ArgoCD adds just a smidge more complexity in order to give unified visibility.
The best part of ArgoCD: it's really easy to share definitive Application specifications!
You have to create a tenancy with some guard rails. We want to limit which helm charts are installed into which namespaces (and what objects the charts can and cannot install in the cluster). Here are the relevant AppProjects, edited for clarity:
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: kube-system
  namespace: argo
spec:
  destinations:
    - namespace: 'kube-system'
      server: '*'
    - namespace: 'prometheus' # creates a ton of metric-gathering pods
      server: '*'
    - namespace: 'vector' # creates a ton of vector agent pods
      server: '*'
  sourceRepos:
    - 'https://prometheus-community.github.io/helm-charts'
    - 'https://helm.vector.dev'
  clusterResourceWhitelist: # required to install the Prometheus CRDs
    - group: "*"
      kind: "*"
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: quickwit
  namespace: argo
spec:
  destinations:
    - namespace: "quickwit"
      server: "*"
  sourceRepos:
    - "https://helm.quickwit.io"
    - "https://github.com/xrl/quickwit-helm-charts.git"
These AppProject definitions are naturally managed by Git, but that whole state-sync-webhooktastic architecture is out of scope.
Vector runs in agent mode:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vector-agent
  namespace: argo
spec:
  project: kube-system
  syncPolicy:
    automated:
      prune: true
  source:
    repoURL: https://helm.vector.dev
    targetRevision: 0.37.0
    chart: vector
    helm:
      releaseName: vector-agent
      values: |
        fullnameOverride: "vector-agent"
        role: Agent
        customConfig:
          data_dir: /vector-data-dir
          api:
            enabled: true
            address: 0.0.0.0:8686
          sources:
            kubernetes_logs:
              type: kubernetes_logs
          transforms:
            filtered_logs:
              type: remap
              inputs: ["kubernetes_logs"]
              source: |
                .message = string!(.message)
                if contains(.message, "GET /ready HTTP/1.1") {
                  abort # we don't care RX health messages
                }
            kube_logs_to_otel:
              type: remap
              inputs: ["filtered_logs"]
              source: |
                .timestamp_nanos = to_unix_timestamp!(.timestamp, unit: "nanoseconds")
                .severity_text = "INFO"
                .body = {
                  "message": .message,
                  "stream": .stream
                }
                .attributes = .kubernetes
                del(.file)
                del(.timestamp)
                del(.source_type)
                del(.stream)
                del(.kubernetes)
                del(.message)
          sinks:
            quickwit_logs:
              type: http
              method: post
              inputs: ["kube_logs_to_otel"]
              encoding:
                codec: "json"
              framing:
                method: "newline_delimited"
              uri: "http://quickwit-logs-indexer.quickwit.svc.cluster.local:7280/api/v1/otel-logs-v0_7/ingest"
        # livenessProbe -- Override default liveness probe settings, if customConfig is used requires customConfig.api.enabled true
        ## Requires Vector's API to be enabled
        livenessProbe:
          httpGet:
            path: /health
            port: api
        # readinessProbe -- Override default readiness probe settings, if customConfig is used requires customConfig.api.enabled true
        ## Requires Vector's API to be enabled
        readinessProbe:
          httpGet:
            path: /health
            port: api
  destination:
    server: https://kubernetes.default.svc
    namespace: vector
good to notice in this config:
- kubernetes logs are filtered to remove nuisance messages
- a remap transform is used to painfully convert to an OTel-compatible log format
- use kubectl -n vector exec -it $vector_pod -- vector tap kubernetes_logs to see what a kube log looks like (I passed this through printf '$json' | jq to format it):
{
"file": "/var/log/pods/argo_argo-cd-repo-server-869d695dc8-fgmqc_eab867f0-389b-4b6b-9b7f-69c7c3474c45/repo-server/2.log",
"kubernetes": {
"container_id": "containerd://d3e051e44f0fe97790d4615998b4747abe3a1f3adae0d7a9395934d526386615",
"container_image": "quay.io/argoproj/argocd:v2.11.7",
"container_image_id": "quay.io/argoproj/argocd@sha256:47e3e00dc501680e77b2496c67ed2e6bff8de1c71e55b56b37b9b11fc34f2ed4",
"container_name": "repo-server",
"namespace_labels": {
"kubernetes.io/metadata.name": "argo"
},
"node_labels": {
"arch": "amd64",
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/instance-type": "r5.xlarge",
"beta.kubernetes.io/os": "linux",
"eks.amazonaws.com/capacityType": "ON_DEMAND",
"eks.amazonaws.com/nodegroup": "ondemand-1b-2024083115310830840000000d",
"eks.amazonaws.com/nodegroup-image": "ami-039bdded3573af90a",
"failure-domain.beta.kubernetes.io/region": "eu-central-1",
"failure-domain.beta.kubernetes.io/zone": "eu-central-1b",
"k8s.io/cloud-provider-aws": "3a3320977962e39cf45d0123eecd5f54",
"kubernetes.io/arch": "amd64",
"kubernetes.io/hostname": "ip-172-30-50-168.eu-central-1.compute.internal",
"kubernetes.io/os": "linux",
"lifecycle": "ondemand",
"node.kubernetes.io/instance-type": "r5.xlarge",
"nodegroup": "ondemand-eu-central-1b",
"topology.ebs.csi.aws.com/zone": "eu-central-1b",
"topology.k8s.aws/zone-id": "euc1-az3",
"topology.kubernetes.io/region": "eu-central-1",
"topology.kubernetes.io/zone": "eu-central-1b"
},
"pod_annotations": {
"checksum/cm": "860c7d2900972fc99c6d7059e06a25d9646dcbf74da82484611321c8cce79377",
"checksum/cmd-params": "4c016fc0004793cf74267de6a9da23ad69fb79f0f9cd503ffae016297898f41d"
},
"pod_ip": "172.30.34.204",
"pod_ips": [
"172.30.34.204"
],
"pod_labels": {
"app.kubernetes.io/component": "repo-server",
"app.kubernetes.io/instance": "argo-cd",
"app.kubernetes.io/managed-by": "Helm",
"app.kubernetes.io/name": "argocd-repo-server",
"app.kubernetes.io/part-of": "argocd",
"app.kubernetes.io/version": "v2.11.7",
"helm.sh/chart": "argo-cd-7.3.11",
"pod-template-hash": "869d695dc8"
},
"pod_name": "argo-cd-repo-server-869d695dc8-fgmqc",
"pod_namespace": "argo",
"pod_node_name": "ip-172-30-50-168.eu-central-1.compute.internal",
"pod_owner": "ReplicaSet/argo-cd-repo-server-869d695dc8",
"pod_uid": "eab867f0-389b-4b6b-9b7f-69c7c3474c45"
},
"message": "time=\"2024-10-31T02:28:03Z\" level=info msg=\"finished unary call with code OK\" grpc.code=OK grpc.method=Check grpc.service=grpc.health.v1.Health grpc.start_time=\"2024-10-31T02:28:03Z\" grpc.time_ms=0.019 span.kind=server system=grpc",
"source_type": "kubernetes_logs",
"stream": "stderr",
"timestamp": "2024-10-31T02:28:03.185047076Z"
}
- use kubectl -n vector exec -it $vector_pod -- vector top to see how many messages are moving through the agent
- quickwit is hit with the generic http output plugin, values taken from the quickwit vector docs
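Since the sink is plain HTTP posting newline-delimited JSON, you can sanity-check the ingest endpoint by hand before trusting the pipeline. A quick sketch, assuming a port-forward to the indexer service; the document shape mirrors what the VRL above produces and the field values here are made up:

% kubectl -n quickwit port-forward svc/quickwit-logs-indexer 7280:7280 &
% printf '%s\n' '{"timestamp_nanos": 1730342779504241000, "severity_text": "INFO", "body": {"message": "hello from curl", "stream": "stdout"}, "attributes": {"pod_namespace": "default"}}' \
    | curl -s -XPOST 'http://localhost:7280/api/v1/otel-logs-v0_7/ingest' --data-binary @-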
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: quickwit-logs
  namespace: argo
spec:
  project: quickwit
  syncPolicy:
    automated:
      prune: true
  source:
    repoURL: "https://github.com/xrl/quickwit-helm-charts.git"
    path: charts/quickwit
    targetRevision: per-service-env-from
    helm:
      releaseName: quickwit-logs
      values: |
        fullnameOverride: quickwit-logs
        config:
          default_index_root_uri: s3://quickwit-logs
          storage:
            s3:
              region: eu-central-1
        metastore:
          extraEnv:
            - name: QW_METASTORE_URI
              valueFrom:
                secretKeyRef:
                  name: quickwitlogs-secret
                  key: POSTGRES_URL
        searcher:
          replicaCount: 1
        serviceAccount:
          create: true
          annotations:
            eks.amazonaws.com/role-arn: "arn:aws:iam::1234567890:role/quickwit-logs"
  destination:
    server: https://kubernetes.default.svc
    namespace: quickwit
good to notice in this config:
- uses a fork of the helm chart until this PR can be addressed
- I have an RDS postgres instance I want to connect to. I will never put postgres credentials in my helm values 🫡
- I only need one searcher for now
- uses the EKS service account mechanism to inject an AWS session token into the pod
- I will never roll/manage AWS service account credentials again 🫡
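One prerequisite worth calling out: the secretKeyRef above assumes a quickwitlogs-secret already exists in the quickwit namespace. I create it out-of-band; a sketch with placeholder credentials and a placeholder RDS endpoint:

% kubectl -n quickwit create secret generic quickwitlogs-secret \
    --from-literal=POSTGRES_URL='postgres://<user>:<password>@<rds-endpoint>:5432/<dbname>'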
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kube-prometheus-stack
  namespace: argo
spec:
  project: kube-system
  syncPolicy:
    automated:
      prune: true
    syncOptions:
      - ServerSideApply=true
  source:
    repoURL: https://prometheus-community.github.io/helm-charts
    targetRevision: 65.3.2
    chart: kube-prometheus-stack
    helm:
      releaseName: prometheus
      values: |
        fullnameOverride: "prometheus"
        grafana:
          env:
            GF_INSTALL_PLUGINS: "quickwit-quickwit-datasource"
          persistence:
            enabled: true
          additionalDataSources:
            - name: Quickwit Logs
              type: quickwit-quickwit-datasource
              url: http://quickwit-logs-searcher.quickwit.svc.cluster.local:7280/api/v1
              jsonData:
                index: otel-logs-v0_7
                logMessageField: body
                logLevelField: severity_text
          grafana.ini:
            auth:
              disable_login_form: true
              disable_signout_menu: true
            auth.anonymous:
              enabled: true
              org_name: Main Org.
              org_role: Editor
            # database:
            #   type: postgres
            #   url: "${POSTGRES_URL}"
        prometheusOperator:
          kubeletService:
            enabled: false
        prometheus:
          prometheusSpec:
            resources:
              requests:
                memory: "28Gi"
                cpu: "2000m"
            ## Prometheus StorageSpec for persistent data
            ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/user-guides/storage.md
            ##
            storageSpec:
              volumeClaimTemplate:
                spec:
                  storageClassName: gp2
                  accessModes: ["ReadWriteOnce"]
                  resources:
                    requests:
                      storage: 200Gi
        kube-state-metrics:
          podSecurityPolicy:
            enabled: false
  destination:
    server: https://kubernetes.default.svc
    namespace: prometheus
good to notice in this config:
- this configures a prometheus instance AND a daemonset which forwards metrics from every kube node
- the GF_INSTALL_PLUGINS ENV var lets us install the quickwit plugin on every container boot (that was new to me!)
- we configure the data source right in the helm chart values (also new to me, I usually did clickops for that)
- the persistence configuration kind of stinks, I have problems with the rollout strategy creating a deadlock over the PVC. deleting replicasets works
- the ultimate goal should be to use postgres to store my grafana dashboard data, no PVCs
- I disable the grafana auth machinery because I have an OIDC gateway in front of the service -- out of scope for this documentation
- you can go straight to the grafana service with kubectl -n prometheus port-forward svc/prometheus-grafana 8080:80, then open http://localhost:8080 in your browser. no login required.
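To double-check that GF_INSTALL_PLUGINS actually did its job, the grafana container logs should mention the quickwit datasource plugin being installed/loaded at boot. A rough sketch; the deployment and container names are what I'd expect from this chart with the fullnameOverride above:

% kubectl -n prometheus logs deploy/prometheus-grafana -c grafana | grep -i quickwit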
resource "aws_iam_role" "quickwit-logs" {
name = "quickwit-logs"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRoleWithWebIdentity"
Effect = "Allow"
Principal = {
Federated = format("arn:aws:iam::%s:oidc-provider/%s", var.aws_account_id, var.oidc_provider_id)
}
Condition = {
StringLike = {
"${var.oidc_provider_id}:sub" : "system:serviceaccount:quickwit:quickwit-logs",
"${var.oidc_provider_id}:aud" : "sts.amazonaws.com"
}
}
}
]
})
inline_policy {
name = "s3-access"
policy = jsonencode({
"Version" : "2012-10-17",
"Statement" : [
{
"Effect" : "Allow",
"Action" : [
"s3:ListBucket"
],
"Resource" : [
"arn:aws:s3:::quickwit-logs"
]
},
{
"Effect" : "Allow",
"Action" : [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:ListMultipartUploadParts",
"s3:AbortMultipartUpload"
],
"Resource" : [
"arn:aws:s3:::quickwit-logs/*"
]
}
]
})
}
managed_policy_arns = []
}
good to notice in this terraform:
- OIDC trust relationship with the Kubernetes cluster (see the link to the IAM docs above)
- limited access to kubernetes service accounts in the quickwit namespace
- grants a variety of S3 permissions to the quickwit-logs bucket
- the S3 permissions were found in quickwit's AWS S3 storage docs
- remember you want to follow the principle of least privilege, grant only what is strictly necessary
the exact mechanism of running this terraform is out of scope -- but take comfort in knowing that it was applied through a gitops workflow.
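A quick way to confirm the IAM/service-account wiring is behaving: the EKS pod identity webhook injects the role ARN and a web identity token file into the pods that use the annotated service account. A sketch of what I'd expect to see, assuming the chart created the quickwit-logs service account as configured above:

% kubectl -n quickwit get serviceaccount quickwit-logs -o yaml | grep role-arn
% kubectl -n quickwit exec quickwit-logs-indexer-0 -- env | grep AWS_
AWS_ROLE_ARN=arn:aws:iam::1234567890:role/quickwit-logs
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token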
Port-forward into the quickwit searcher service with kubectl -n quickwit port-forward svc/quickwit-logs-searcher 7280:7280, then open your browser to http://localhost:7280 and you'll see this:
The Quickwit UI homepage showing the search interface
and if you go to look at the automatic otel logs index, you'll see this:

things to notice:
- compression is a great thing. we pay for ~250MB of S3 storage for almost 5GB of JSON (there's a lot of junk in there I plan to strip out with vector's VRL)
- at the time of writing this incarnation of quickwit had been running for less than 8 hours. this is a production kube cluster with ~150 nodes handling an enterprise workload.
- the constant object writes, reads, write-backs might add up. I'll keep an eye on things best I can.
- the number of splits changes often as the quickwit service opens, merges, and garbage collects them
- a split is a single file with all the contents of a tantivy segment in one compressed, seekable blob. check out the contents in S3 like this:
% aws s3 ls s3://quickwit-logs/otel-logs-v0_7/
2024-10-30 16:24:59 7336474 01JBFHMNRV7XQJAF3B4PHBJ2CY.split
2024-10-30 16:26:38 9817784 01JBFHQNS1J8KGC50MJ3BP6S9V.split
2024-10-30 16:27:33 29905109 01JBFHSATR3TXDDDD65R3K4RHN.split
2024-10-30 17:25:12 63586 01JBFN2RTR4Y10873NBMP0A3NQ.split
2024-10-30 17:25:17 80415 01JBFN2XQ4PB5PJJBD2G8SKMCQ.split
2024-10-30 19:22:47 149153365 01JBFVT4BJ6DYAB86SK8V6B439.split
[[[ SNIP ]]]
- the quickwit project has not tackled the unenviable task of authentication at the quickwit level. don't expose quickwit to the open internet.
- grafana is the full-fledged dashboard builder so it's probably best to leave the auth to them. when I get grafana OIDC working I'll update this document.
read more:
- https://github.com/quickwit-oss/tantivy/blob/main/ARCHITECTURE.md
- https://github.com/quickwit-oss/tantivy/wiki/Life-of-a-Segment
- https://docs.rs/tantivy/latest/tantivy/
When visiting http://localhost:7280, I can query the Kubernetes logs and just make sure I understand what a document looks like. The Query Editor panel of the Quickwit UI looks like this when our index has data flowing:

and one log's JSON looks like:
{
"attributes": {
"container_id": "containerd://9f6e5be434e97b7e37628b5f7a2423c4ec293939fbf58b22a66446ebff54ba87",
"container_image": "registry.k8s.io/ingress-nginx/controller:v1.11.3@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7",
"container_image_id": "registry.k8s.io/ingress-nginx/controller@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7",
"container_name": "controller",
"namespace_labels": {
"kubernetes.io/metadata.name": "ingress-nginx"
},
"node_labels": {
"arch": "amd64",
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/instance-type": "r5.xlarge",
"beta.kubernetes.io/os": "linux",
"eks.amazonaws.com/capacityType": "ON_DEMAND",
"eks.amazonaws.com/nodegroup": "ondemand-1a-20240831153108298900000007",
"eks.amazonaws.com/nodegroup-image": "ami-039bdded3573af90a",
"failure-domain.beta.kubernetes.io/region": "eu-central-1",
"failure-domain.beta.kubernetes.io/zone": "eu-central-1a",
"k8s.io/cloud-provider-aws": "3a3320977962e39cf45d0123eecd5f54",
"kubernetes.io/arch": "amd64",
"kubernetes.io/hostname": "ip-172-30-22-198.eu-central-1.compute.internal",
"kubernetes.io/os": "linux",
"lifecycle": "ondemand",
"node.kubernetes.io/instance-type": "r5.xlarge",
"nodegroup": "ondemand-eu-central-1a",
"topology.ebs.csi.aws.com/zone": "eu-central-1a",
"topology.k8s.aws/zone-id": "euc1-az2",
"topology.kubernetes.io/region": "eu-central-1",
"topology.kubernetes.io/zone": "eu-central-1a"
},
"pod_annotations": {
"kubectl.kubernetes.io/restartedAt": "2023-12-06T01:04:59Z"
},
"pod_ip": "172.30.11.35",
"pod_ips": [
"172.30.11.35"
],
"pod_labels": {
"app.kubernetes.io/component": "controller",
"app.kubernetes.io/instance": "ingress-nginx",
"app.kubernetes.io/managed-by": "Helm",
"app.kubernetes.io/name": "ingress-nginx",
"app.kubernetes.io/part-of": "ingress-nginx",
"app.kubernetes.io/version": "1.11.3",
"helm.sh/chart": "ingress-nginx-4.11.3",
"pod-template-hash": "6bc959cb88"
},
"pod_name": "ingress-nginx-controller-6bc959cb88-fp97t",
"pod_namespace": "ingress-nginx",
"pod_node_name": "ip-172-30-22-198.eu-central-1.compute.internal",
"pod_owner": "ReplicaSet/ingress-nginx-controller-6bc959cb88",
"pod_uid": "3d301f4e-b13a-45b3-8853-99b836e464a1"
},
"body": {
"message": "172.30.9.182 - - [31/Oct/2024:02:46:19 +0000] \"GET /inventory?id=8934812a-40c7-4df9-8b79-32a02f358282 HTTP/1.1\" 200 11645 \"-\" \"Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)\" 392 0.159 [production-rx-production-80] [] 172.30.55.162:80 11624 0.160 200 2d56343d741dbdbbc2a2d0dfbbcbe7f8",
"stream": "stdout"
},
"severity_text": "INFO",
"timestamp_nanos": 1730342779504241000
}
Things to note:
- the object structure is dictated by my VRL from the vector-agent Application definition above, but copied here:
kube_logs_to_otel:
  type: remap
  inputs: ["filtered_logs"]
  source: |
    .timestamp_nanos = to_unix_timestamp!(.timestamp, unit: "nanoseconds")
    .severity_text = "INFO"
    .body = {
      "message": .message,
      "stream": .stream
    }
    .attributes = .kubernetes
    del(.file)
    del(.timestamp)
    del(.source_type)
    del(.stream)
    del(.kubernetes)
    del(.message)
- This is the first time I have tried to match the otel-logs schema
- The layout of the log object may not make sense but I'm glad it passes schema validation from quickwit's otel-logs-v0_7 index. Perhaps the opentelemetry-collector, which is well documented by quickwit, would fare better.
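If you want to see exactly what that schema expects, the index management API will describe the index, including its doc mapping. A sketch, assuming the searcher port-forward from earlier is still running:

% curl -s 'http://localhost:7280/api/v1/indexes/otel-logs-v0_7' | jq .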
My must-haves to perform field-based search:
- a field must be an exact value: attributes.container_id:"containerd://9f6e5be434e97b7e37628b5f7a2423c4ec293939fbf58b22a66446ebff54ba87"
- a field must be one of a list of values
- a field must not be a value: -attributes.container_id:"containerd://9f6e5be434e97b7e37628b5f7a2423c4ec293939fbf58b22a66446ebff54ba87" (note the leading minus)
- a field should be present (no specific value in mind)
- a field should not be present
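The same field-based queries can be issued straight against the searcher's REST search endpoint, which is handy for checking syntax before wiring anything into Grafana. A sketch, assuming the searcher port-forward from earlier and that attributes.container_name is indexed the way the sample document above suggests:

% curl -s --get 'http://localhost:7280/api/v1/otel-logs-v0_7/search' \
    --data-urlencode 'query=attributes.container_name:"controller"' \
    --data-urlencode 'max_hits=5' | jq '.num_hits'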
The hits just keep coming!

How do we tweak the otel logs retention? 🤔
You don't want to use the Quickwit UI for day-to-day observability/incident response. It's handy, for sure, but you want to build dashboards that lay out all the data at once.
This tutorial focuses on Grafana 12.
Remember that we're building on top of the popular kube-prometheus-stack helm chart. This chart injects a whole suite of prometheus-centric dashboards into grafana, creating what is essentially a stdlib of monitoring. And we're totally able to add more dashboards.
- Make a Logs folder where we can store our quickwit-backed dashboards:
Creating a new folder to organize Quickwit log dashboards
- Select "New Dashboard" from the dropdown menu:
Selecting the New Dashboard option from Grafana menu
- And we'll create a new dashboard to lay out some smart widgets:
Empty dashboard ready for new panels
- Add your first visualization panel:
Adding a new visualization panel to the dashboard
- Let's just start by saving the empty dashboard:
Saving the dashboard with initial configuration
- Confirm the save operation:
Confirming dashboard save operation
Aggregations, or bucketing, are used for generating summary statistics of a dataset. The dataset is split into multiple buckets and we can ask Quickwit to generate summary statistics for each bucket. The usual suspects for summarizing a bucket: count, average, min, max, sum, percentiles, etc. The docs will tell you that aggregations are only performed on fast fields -- stats are calculated from the columnar portion of the quickwit split without having to read all the data.
Let's use aggregations to identify the noisiest kube cluster namespaces. We want to group by kube namespace and emit a count metric for each group. We'll further aggregate our data by time so we get a sense for the trends.
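Before clicking through Grafana, it can help to see the raw request this boils down to. A sketch of roughly what the plugin asks Quickwit for on our behalf, assuming the searcher port-forward from earlier, the Elasticsearch-style aggregation syntax from the Quickwit docs, and grouping on attributes.pod_namespace here for brevity (Grafana will use whichever field you pick):

% curl -s -XPOST 'http://localhost:7280/api/v1/otel-logs-v0_7/search' \
    -H 'Content-Type: application/json' \
    -d '{
          "query": "*",
          "max_hits": 0,
          "aggs": {
            "over_time": {
              "date_histogram": { "field": "timestamp_nanos", "fixed_interval": "5m" },
              "aggs": {
                "by_namespace": { "terms": { "field": "attributes.pod_namespace", "size": 10 } }
              }
            }
          }
        }' | jq .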
- Add a visualization, choose the Quickwit Logs data source (remember we configured this as part of the helm values for the kube-prometheus-stack):
Selecting the Quickwit Logs data source for the visualization
- You'll then be met with an intimidating blank panel editing screen:
The initial blank panel editing interface
- Be sure your panel is the Metric type and specify the timestamp_nanos field for the aggregation:
Configuring the metric type and timestamp field
I don't like the line graph. In the top-right you can change the visualization type to Bar Chart:
Switching the visualization type to Bar Chart
Set the first group by statement to build a date histogram on the timestamp_nanos field, then hit grafana's dashboard refresh icon and you'll see:
Initial date histogram visualization
You'll need to hit the grafana refresh button often; it doesn't look like the quickwit plugin reissues the query after changes in the grafana UI. This button:
Location of the refresh button for updating visualizations
Make the bar chart more intelligible by increasing the bucketing interval; set it to 5m:
Setting the bucket interval to 5 minutes
And we'll further subdivide those buckets by adding another term aggregation; let's group by attributes.namespace_labels.kubernetes.io/metadata.name. Click on the + icon to the right of the Group By expression builder:
Configuring the group by settings
And thankfully you can type-ahead to discover relevant fields:
Using type-ahead to find relevant fields
And you'll get something like this, with the bars side-by-side:
Visualization showing side-by-side bar chart
Search the options on the right-side for stacking:
Locating the stacking option in settings
And then choose normal:
Selecting normal stacking mode
Give the panel a smart title:
Adding a descriptive title to the panel
And that's enough of the screenshot parade. Hit Save on the page and now you have your first starter dashboard:
The completed dashboard with configured visualization
Aggregations are great for summarizing what's going on. But what about when it's time to dig in to specifics? Thankfully, Grafana has a built in panel type for displaying log data. It has a few gotchas but let's go ahead and add a new panel to our dashboard:
Note: I am filtering to quickwit's pods
- Add another visualization
Adding a new visualization panel to the dashboard
- Change the visualization type from "Time series" to "Logs"
Switching visualization type to Logs view
- Change query type to "Logs"
Setting the query type to Logs
- Click "refresh dashboard" to fetch data and populate the panel
Initial view after refreshing the dashboard
Notice the little chevrons and the lack of log textual data. Grafana is definitely fetching the data from quickwit, but the data is not coming back in a format matching the panel's conventions. It's not really documented anywhere, but the logs panel presents data based on its position in the dataset returned by the datasource plugin. The logs panel does not depend on the name of the field, just the position. You can see the data using the table view toggle:
Table view showing the raw data structure
Notice how $qw_message is the second column and it's blank. I'm not sure what the $qw_message template variable is used for, but we want to reorder the dataset and put body.message as the second column. Good news for everyone: Grafana has data transforms (they were new to me). Data transforms seem like a powerful feature, so let's try them out here. Switch from Query to Transform data (0):
Accessing the data transform options
Click "Add transformation" and search for "organize":
Adding the organize transformation
This will then show the columns as they are currently sorted:
Current order of columns before reorganization
Scroll down, find the body.message column and drag it up. You'll be rewarded with an instantaneous re-rendering showing the logs:
Logs displaying correctly after column reordering
Good time to hit "save" -- click on the dashboard's name in the breadcrumbs ("Kube logs") to leave the editor:
Using breadcrumb navigation to exit editor
Which now shows:
Dashboard view after saving changes
Let's drag the logs panel below the aggregations and make them full width:
Final dashboard layout with full-width logs panel
Looking good folks!