Configuration
The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).
Do not override port numbers. This will lead to faulty installations.
Configuration Properties
For a role or role group, at the same level as config, you can specify configOverrides for:
- config.properties
- node.properties
- log.properties
- password-authenticator.properties
For a list of possible configuration properties consult the Trino Properties Reference.
workers:
  roleGroups:
    default:
      config: {}
      replicas: 1
      configOverrides:
        config.properties:
          query.max-memory-per-node: "2GB"
Just as for config, it is possible to specify this at role level as well:
workers:
  configOverrides:
    config.properties:
      query.max-memory-per-node: "2GB"
  roleGroups:
    default:
      config: {}
      replicas: 1
All override property values must be strings. The properties will be passed on without any escaping or formatting.
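As a minimal sketch of both rules (precedence and string values), the following fragment sets the same property at role level and at role group level. The role group name bigmem and all values are illustrative assumptions, not recommendations; task.concurrency is a Trino property picked only as an example of a numeric value.

```yaml
workers:
  configOverrides:
    config.properties:
      # Role level: applies to every worker role group unless overridden below
      query.max-memory-per-node: "2GB"
      # Numeric values are quoted so that YAML parses them as strings
      task.concurrency: "16"
  roleGroups:
    default:
      config: {}
      replicas: 1
    bigmem: # hypothetical role group name
      config: {}
      replicas: 1
      configOverrides:
        config.properties:
          # Role group level: takes precedence over the role-level "2GB"
          query.max-memory-per-node: "4GB"
```

With this fragment, workers in the default group should end up with 2GB and workers in the bigmem group with 4GB.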
Environment Variables
Environment variables can be (over)written by adding the envOverrides property.
For example per role group:
workers:
  roleGroups:
    default:
      config: {}
      replicas: 1
      envOverrides:
        JAVA_HOME: "path/to/java"
or per role:
workers:
  envOverrides:
    JAVA_HOME: "path/to/java"
  roleGroups:
    default:
      config: {}
      replicas: 1
Here too, overriding properties such as http-server.https.port will lead to broken installations.
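A minimal sketch of the precedence rule for environment variables, with the same variable set at both levels. The second path is a placeholder introduced purely for illustration.

```yaml
workers:
  envOverrides:
    JAVA_HOME: "path/to/java" # role level
  roleGroups:
    default:
      config: {}
      replicas: 1
      envOverrides:
        # Role group level: takes precedence over the role-level value
        JAVA_HOME: "path/to/other/java" # hypothetical path
```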
Resources
Storage for data volumes
You can mount a volume where data (config and logs of Trino) is stored by specifying PersistentVolumeClaims for each individual role or role group:
workers:
  config:
    resources:
      storage:
        data:
          capacity: 2Gi
  roleGroups:
    default:
      config:
        resources:
          storage:
            data:
              capacity: 3Gi
In the above example, all Trino workers in the default group will store data (the location of the property --data-dir) on a 3Gi volume. Additional role groups not specifying any resources will inherit the config provided on the role level (2Gi volume). This works the same for memory or CPU requests.
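To illustrate the inheritance, the sketch below extends the example with a second role group that sets no storage of its own. The role group name inherits-from-role is hypothetical.

```yaml
workers:
  config:
    resources:
      storage:
        data:
          capacity: 2Gi # role-level value
  roleGroups:
    default:
      config:
        resources:
          storage:
            data:
              capacity: 3Gi # overrides the role-level 2Gi
    inherits-from-role: # hypothetical role group name
      replicas: 1 # no resources set, so the role-level 2Gi applies
```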
By default, if nothing is configured in the custom resource for a given role group, each Pod will have a 2Gi local volume mounted at the data location, which mainly contains logs.
Resource Requests
Stackable operators handle resource requests in a slightly different manner than Kubernetes. Resource requests are defined on role or role group level. See Roles and role groups for details on these concepts. On a role level this means that, for example, all workers will use the same resource requests and limits. This can be further specified on role group level (which takes priority over the role level) to apply different resources.
This is an example of how to specify CPU and memory resources using the Stackable Custom Resources:
---
apiVersion: example.stackable.tech/v1alpha1
kind: ExampleCluster
metadata:
  name: example
spec:
  workers: # role-level
    config:
      resources:
        cpu:
          min: 300m
          max: 600m
        memory:
          limit: 3Gi
    roleGroups: # role-group-level
      resources-from-role: # role-group 1
        replicas: 1
      resources-from-role-group: # role-group 2
        replicas: 1
        config:
          resources:
            cpu:
              min: 400m
              max: 800m
            memory:
              limit: 4Gi
In this case, the role group resources-from-role will inherit the resources specified on the role level, resulting in a maximum of 3Gi memory and 600m CPU resources.
The role group resources-from-role-group has a maximum of 4Gi memory and 800m CPU resources (which overrides the role-level resources).
For Java products, the heap memory actually used is lower than the specified memory limit, because other processes in the Container require memory to run as well. Currently, 80% of the specified memory limit is passed to the JVM.
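As a rough worked example of the 80% rule, the comments below apply it to the two memory limits from the example above. How the operator rounds the result is not covered here, so treat the heap figures as approximate.

```yaml
workers:
  config:
    resources:
      memory:
        # 3Gi = 3072Mi for the Container;
        # about 0.8 * 3072Mi = 2457.6Mi of that goes to the JVM heap
        limit: 3Gi
  roleGroups:
    resources-from-role-group:
      config:
        resources:
          memory:
            # 4Gi = 4096Mi for the Container;
            # about 0.8 * 4096Mi = 3276.8Mi of that goes to the JVM heap
            limit: 4Gi
```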
For memory, only a limit can be specified, which will be set as both the memory request and the memory limit in the Container. This guarantees the Container the full amount of memory during Kubernetes scheduling.
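As a sketch of what this means for the role group resources-from-role from the example above, the resources section of the resulting Container should look roughly like the following Kubernetes fragment. It assumes that cpu.min maps to the CPU request and cpu.max to the CPU limit; the manifest actually generated by the operator may differ in detail.

```yaml
# Excerpt of a Kubernetes Container spec (illustrative, not operator output)
resources:
  requests:
    cpu: 300m   # from cpu.min
    memory: 3Gi # from memory.limit
  limits:
    cpu: 600m   # from cpu.max
    memory: 3Gi # same value as the request
```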
If no resource requests are configured explicitly, the Trino operator uses the following defaults:
workers:
  roleGroups:
    default:
      config:
        resources:
          requests:
            cpu: 200m
            memory: 2Gi
          limits:
            cpu: "4"
            memory: 2Gi
          storage:
            data:
              capacity: 2Gi
The default values are most likely not sufficient to run a proper cluster in production. Please adapt according to your requirements.