Logging with a Vector log aggregator
This tutorial teaches you how to deploy a Vector aggregator together with a product - in this case ZooKeeper - and how to configure both of them so the logs are sent from the product to the aggregator. Logging on the Stackable Data Platform is always configured in the same way, so you can use this knowledge to configure logging in any product that you want to deploy.
Prerequisites:
- a k8s cluster available, or kind installed
- stackablectl installed
- Helm installed to deploy Vector
- basic knowledge of how to create resources in Kubernetes (e.g. kubectl apply -f <filename>.yaml) and inspect them (kubectl get or a tool like k9s)
Install the ZooKeeper operator
Install the Stackable Operator for Apache ZooKeeper and its dependencies, so you can deploy a ZooKeeper instance later.
stackablectl release install -i secret -i commons -i listener -i zookeeper 23.11
Install the Vector aggregator
Install the Vector aggregator using Helm.
First, create a vector-aggregator-values.yaml file with the Helm values:
role: Aggregator
customConfig:
  sources:
    vector: # (1)
      address: 0.0.0.0:6000
      type: vector
      version: "2"
  sinks:
    console: # (2)
      type: console
      inputs:
        - vector
      encoding:
        codec: json
      target: stderr
(1) Define a source of type vector which listens for incoming log messages on port 6000.
(2) Define a console sink which logs all received logs to stderr.
Deploy Vector with these values using Helm:
helm install \
--wait \
--values vector-aggregator-values.yaml \
vector-aggregator vector/vector
This is a minimal working configuration. Define the source as shown, but configure whichever sinks suit your needs. The Vector documentation has an overview of all sinks. The Elasticsearch sink in particular might be useful, and it also works when configured with OpenSearch.
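The helm install command above refers to the chart as vector/vector, which only resolves if a Helm repository is registered under the alias vector. The following is a minimal sketch of registering it before the install, assuming the official Vector chart repository at https://helm.vector.dev. The variable names are illustrative and not part of the tutorial.

```shell
# Assumption: the alias must match the "vector/" prefix of the chart reference,
# and the URL is the public Vector Helm repository.
repo_alias="vector"
repo_url="https://helm.vector.dev"

if command -v helm >/dev/null 2>&1; then
  # Register the repository and refresh the local chart index.
  helm repo add "$repo_alias" "$repo_url" && helm repo update \
    || echo "could not add or update the Helm repository" >&2
else
  echo "helm not found on PATH; see the prerequisites" >&2
fi
```

If the repository is already registered under that alias, this step can be skipped.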
To make the Vector aggregator discoverable to ZooKeeper, deploy a discovery ConfigMap called vector-aggregator-discovery.
Create a file called vector-aggregator-discovery.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-aggregator-discovery
data:
  ADDRESS: vector-aggregator:6000
and apply it:
kubectl apply -f vector-aggregator-discovery.yaml
Install ZooKeeper
Now that the aggregator is running, you can install a ZooKeeper cluster which is configured to send logs to the aggregator.
Create a file called zookeeper.yaml with the following ZookeeperCluster definition:
---
apiVersion: zookeeper.stackable.tech/v1alpha1
kind: ZookeeperCluster
metadata:
  name: simple-zk
spec:
  image:
    productVersion: 3.8.0
    stackableVersion: "0.0.0-dev"
  clusterConfig:
    vectorAggregatorConfigMapName: vector-aggregator-discovery # (1)
  servers:
    roleGroups:
      default:
        replicas: 3
        config:
          logging: # (2)
            enableVectorAgent: true
            containers:
              vector:
                file:
                  level: WARN
              zookeeper:
                console:
                  level: INFO
                file:
                  level: INFO
                loggers:
                  ROOT:
                    level: INFO
                  org.apache.zookeeper.server.NettyServerCnxn:
                    level: NONE
(1) This is the reference to the discovery ConfigMap created in the previous step.
(2) This is the logging configuration, where logging is first enabled and then a few settings are made.
and apply it:
kubectl apply -f zookeeper.yaml
You can learn more about how to configure logging in a product in the logging concept documentation.
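Before looking at the logs, it can help to wait until the ZooKeeper pods are up. The following sketch assumes the operator creates a StatefulSet named simple-zk-server-default. That name is inferred from the pod name simple-zk-server-default-0 used later in this tutorial and is not stated explicitly, so adjust it if your cluster differs.

```shell
# Assumption: StatefulSet name follows <cluster>-<role>-<roleGroup>,
# inferred from the pod name simple-zk-server-default-0.
sts_name="simple-zk-server-default"

if command -v kubectl >/dev/null 2>&1; then
  # Block until all replicas are ready, or give up after five minutes.
  kubectl rollout status "statefulset/${sts_name}" --timeout=300s \
    || echo "rollout not finished; inspect with: kubectl get pods" >&2
else
  echo "kubectl not found on PATH; see the prerequisites" >&2
fi
```

Right after applying zookeeper.yaml the StatefulSet may not exist yet, in which case the command reports an error and can simply be rerun a few seconds later.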
Watch the logs
During startup, ZooKeeper already prints out log messages.
Vector was configured to print the aggregated logs to stderr, so if you look at the logs of the Vector pod, you will see the ZooKeeper logs:
kubectl logs vector-aggregator-0 | grep "zookeeper.version=" | jq
You should see one JSON object per ZooKeeper replica that looks like this:
{
  "cluster": "simple-zk",
  "container": "zookeeper",
  "file": "zookeeper.log4j.xml",
  "level": "INFO",
  "logger": "org.apache.zookeeper.server.ZooKeeperServer",
  "message": "Server environment:zookeeper.version=3.8.0-5a02a05eddb59aee6ac762f7ea82e92a68eb9c0f, built on 2022-02-25 08:49 UTC",
  "namespace": "default",
  "pod": "simple-zk-server-default-0",
  "role": "server",
  "roleGroup": "default",
  "source_type": "vector",
  "timestamp": "2023-11-06T10:30:40.223Z"
}
The JSON object contains a timestamp, the log message, log level and some additional information.
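Because every event is a JSON object with fields such as level and pod, jq can filter on those fields instead of grepping the message text. The sketch below runs against a small inline sample so that it works without a cluster. The WARN event and its logger name are made up for illustration, and the first sample line stands in for any non-JSON output that might appear in the pod log.

```shell
# Sample input: one non-JSON line and two events shaped like the object above.
# The WARN event and the logger org.example.Hypothetical are invented for this example.
sample_logs=$(cat <<'EOF'
some line that is not JSON
{"container":"zookeeper","level":"INFO","logger":"org.apache.zookeeper.server.ZooKeeperServer","message":"Server environment:zookeeper.version=3.8.0","pod":"simple-zk-server-default-0"}
{"container":"zookeeper","level":"WARN","logger":"org.example.Hypothetical","message":"hypothetical warning","pod":"simple-zk-server-default-1"}
EOF
)

# -R reads each line as a raw string; fromjson? parses it and silently
# skips lines that are not valid JSON. -r prints the result without quotes.
filtered=$(printf '%s\n' "$sample_logs" \
  | jq -R -r 'fromjson? | select(.level == "WARN") | "\(.pod) \(.message)"')
echo "$filtered"
```

On the sample this prints simple-zk-server-default-1 hypothetical warning. Against a running cluster, the same jq filter can be fed from kubectl logs vector-aggregator-0 in place of the sample.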
You can see the same log line in the log output of the ZooKeeper container:
kubectl logs \
--container=zookeeper simple-zk-server-default-0 \
| grep "zookeeper.version="
2023-11-06 10:30:40,223 [myid:1] - INFO [main:o.a.z.Environment@98] - Server environment:zookeeper.version=3.8.0-5a02a05eddb59aee6ac762f7ea82e92a68eb9c0f, built on 2022-02-25 08:49 UTC
Congratulations, this concludes the tutorial!
What’s next?
Look into sink configurations that are better suited to production use in the sinks overview documentation, or learn more about how logging works on the platform in the concepts documentation.