Creating a Trino cluster
Define an insecure cluster (testing)
Create an insecure single-node Trino cluster for testing. It can be accessed with the UI/CLI via HTTP without either user/password credentials or authorization.
For testing purposes we use the Trino CLI.
First, ensure all necessary operators have been deployed:
stackablectl operator install \
  secret commons hive trino
The Trino cluster can now be deployed:
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: 396
    stackableVersion: 0.3.0
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: 0.2.0
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1
We have defined a single catalog - Hive - which uses an embedded database (derby).
To interact with Trino, first obtain the host and port for the Trino coordinator service (in this and following examples, https://172.18.0.3:31748):
stackablectl services list
PRODUCT  NAME               NAMESPACE  ENDPOINTS                                      EXTRA INFOS
hive     simple-hive-derby  default    hive                 172.18.0.4:32186
                                       metrics              172.18.0.4:30109
trino    simple-trino       default    coordinator-metrics  172.18.0.3:32123
                                       coordinator-https    https://172.18.0.3:31748
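When the endpoint is needed in a script, the coordinator address can be pulled out of that listing. The following is a minimal sketch, assuming the plain whitespace-separated layout shown in the sample output (the real stackablectl output may be formatted differently). The variable names services_output and coordinator_url are illustrative only.

```shell
# Sample output copied from this page so the sketch runs without a cluster.
# In practice, replace it with: services_output=$(stackablectl services list)
services_output='PRODUCT  NAME               NAMESPACE  ENDPOINTS                                      EXTRA INFOS
hive     simple-hive-derby  default    hive                 172.18.0.4:32186
                                       metrics              172.18.0.4:30109
trino    simple-trino       default    coordinator-metrics  172.18.0.3:32123
                                       coordinator-https    https://172.18.0.3:31748'

# On each row the endpoint name is the second-to-last field and the
# address is the last one; take the first coordinator-https match.
coordinator_url=$(printf '%s\n' "$services_output" |
  awk 'NF >= 2 && $(NF-1) == "coordinator-https" { print $NF; exit }')

echo "$coordinator_url"
```

The resulting value can then be passed to the --server option of the CLI calls that follow.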
Next, download the Trino CLI tool (it can be obtained from the Stackable repository, as shown below) and make it executable:
curl --output trino.jar https://repo.stackable.tech/repository/packages/trino-cli/trino-cli-396-executable.jar
chmod +x trino.jar
Execute some CLI commands to verify operation, such as returning the names of all catalogs. Note that an insecure connection is specified:
./trino.jar --insecure --debug --server https://172.18.0.3:31748 --user=admin --execute "SHOW CATALOGS" --output-format=CSV_UNQUOTED
hive
system
Define a secure cluster (production)
For secure connections the following steps must be taken:
- Enable authentication
- Enable TLS between the clients and coordinator
- Enable internal TLS for communication between coordinators and workers
Via authentication
If authentication is enabled, TLS for the coordinator must be configured, along with a shared secret for internal communications (this secret is base64-encoded, not encrypted).
Securing the Trino cluster disables all HTTP ports, including the web interface on the HTTP port. In the definition below, authentication uses the trino-users secret, and TLS communication uses a certificate signed by the Secret Operator (indicated by autoTls).
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: 396
    stackableVersion: 0.3.0
  config:
    tls:
      secretClass: trino-tls # (1)
    authentication:
      method:
        multiUser:
          userCredentialsSecret:
            name: trino-users # (2)
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino # (3)
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: trino-tls # (1)
spec:
  backend:
    autoTls: # (4)
      ca:
        secret:
          name: secret-provisioner-trino-tls-ca
          namespace: default
        autoGenerate: true
---
apiVersion: v1
kind: Secret
metadata:
  name: trino-users # (2)
type: kubernetes.io/opaque
stringData:
  # admin:admin
  admin: $2y$10$89xReovvDLacVzRGpjOyAOONnayOgDAyIS2nW9bs5DJT98q17Dy5i
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: 0.2.0
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1
(1) The name of (and reference to) the SecretClass
(2) The name of (and reference to) the Secret
(3) TrinoCatalog reference
(4) TLS mechanism
The CLI now requires that a path to the keystore and a password be provided:
./trino.jar --debug --server https://172.18.0.3:31748 \
  --user=admin --keystore-path=<path-to-keystore.p12> --keystore-password=<password>
Via TLS only
This will disable the HTTP port and UI access and encrypt client-server communications.
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: 396
    stackableVersion: 0.3.0
  config:
    tls:
      secretClass: trino-tls # (1)
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino # (2)
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: trino-tls # (1)
spec:
  backend:
    autoTls: # (3)
      ca:
        secret:
          name: secret-provisioner-trino-tls-ca
          namespace: default
        autoGenerate: true
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: 0.2.0
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1
(1) The name of (and reference to) the SecretClass
(2) TrinoCatalog reference
(3) TLS mechanism
CLI callout:
./trino.jar --debug --server https://172.18.0.3:31748 --keystore-path=<path-to-keystore.p12> --keystore-password=<password>
Via internal TLS
Internal TLS provides encrypted and authenticated communication between coordinators and workers. Since it applies to all data sent and processed between these processes, it may reduce performance significantly.
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: 396
    stackableVersion: 0.3.0
  config:
    internalTls:
      secretClass: trino-internal-tls # (1)
    authentication:
      method:
        multiUser:
          userCredentialsSecret:
            name: trino-users # (2)
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: trino-internal-tls # (1)
spec:
  backend:
    autoTls: # (3)
      ca:
        secret:
          name: secret-provisioner-trino-internal-tls-ca
          namespace: default
        autoGenerate: true
---
apiVersion: v1
kind: Secret
metadata:
  name: trino-users # (2)
type: kubernetes.io/opaque
stringData:
  # admin:admin
  admin: $2y$10$89xReovvDLacVzRGpjOyAOONnayOgDAyIS2nW9bs5DJT98q17Dy5i
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: 0.2.0
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1
(1) The name of (and reference to) the SecretClass
(2) The name of (and reference to) the Secret
(3) TLS mechanism
Since Trino has internal and external communications running over a single port, this will enable the HTTPS port but not expose it. Cluster access is only possible via HTTP.
./trino.jar --debug --server http://172.18.0.3:31748 --user=admin
S3 connection specification
You can specify S3 connection details directly inside the TrinoCatalog specification or by referring to an external S3Connection custom resource.
To specify S3 connection details directly as part of the TrinoCatalog resource, add an inline connection configuration as shown below:
s3: # (1)
  inline:
    host: test-minio # (2)
    port: 9000 # (3)
    pathStyleAccess: true # (4)
    secretClass: minio-credentials # (5)
    tls:
      verification:
        server:
          caCert:
            secretClass: minio-tls-certificates # (6)
(1) Entry point for the connection configuration
(2) Connection host
(3) Optional connection port
(4) Optional flag indicating whether path-style URLs should be used; this defaults to false, which means virtual hosted-style URLs are used.
(5) Name of the Secret object expected to contain the following keys: accessKey and secretKey
(6) Optional TLS settings for encrypted traffic. The secretClass can be provided by the Secret Operator or yourself.
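The credentials referenced by minio-credentials need to exist before the catalog can use them. As a rough sketch only, the snippet below writes a manifest that mirrors the k8sSearch SecretClass pattern used for the TLS secret on this page. Whether the credentials are resolved this way in your operator version is an assumption, and the file name minio-credentials.yaml and the placeholder values are hypothetical.

```shell
# Write a hypothetical credentials manifest to the current directory.
# The quoted 'EOF' keeps the shell from expanding anything in the body.
cat > minio-credentials.yaml <<'EOF'
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: minio-credentials
spec:
  backend:
    k8sSearch:
      searchNamespace:
        pod: {}
---
apiVersion: v1
kind: Secret
metadata:
  name: minio-credentials
  labels:
    secrets.stackable.tech/class: minio-credentials
stringData:
  # The two keys named in callout (5); replace the placeholders.
  accessKey: <your-access-key>
  secretKey: <your-secret-key>
EOF

# Once the placeholders are filled in, the manifest could be applied with:
# kubectl apply -f minio-credentials.yaml
```

Using stringData avoids having to base64-encode the two values by hand.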
A self-provided S3 TLS secret can be specified like this:
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: minio-tls-certificates
spec:
  backend:
    k8sSearch:
      searchNamespace:
        pod: {}
---
apiVersion: v1
kind: Secret
metadata:
  name: minio-tls-certificates
  labels:
    secrets.stackable.tech/class: minio-tls-certificates
data:
  ca.crt: <your-base64-encoded-ca>
  tls.crt: <your-base64-encoded-public-key>
  tls.key: <your-base64-encoded-private-key>
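Values under data must be base64-encoded on a single line. The following sketch shows one way to produce such a value. The helper name b64_oneline is illustrative, and the stand-in file only exists so that the snippet runs anywhere; point it at your real ca.crt, tls.crt and tls.key instead.

```shell
# Encode a file as single-line base64, suitable for a Secret's data field.
# Stripping newlines with tr sidesteps the differing line-wrap defaults
# of the GNU and BSD base64 tools.
b64_oneline() {
  base64 < "$1" | tr -d '\n'
}

# Stand-in for a real certificate file (assumption for this sketch).
workdir=$(mktemp -d)
printf 'dummy certificate bytes\n' > "$workdir/ca.crt"

ca_b64=$(b64_oneline "$workdir/ca.crt")
echo "ca.crt: $ca_b64"
```

The printed line can be pasted into the Secret in place of the corresponding placeholder.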
It is also possible to configure the bucket connection details as a separate Kubernetes resource and only refer to that object from the TrinoCatalog specification like this:
s3:
  reference: my-connection-resource # (1)
(1) Name of the connection resource with connection details
The resource named my-connection-resource is then defined as shown below:
---
apiVersion: s3.stackable.tech/v1alpha1
kind: S3Connection
metadata:
  name: my-connection-resource
spec:
  host: test-minio
  port: 9000
  accessStyle: Path
  credentials:
    secretClass: minio-credentials
This has the advantage that the connection configuration can be shared across applications and reduces the cost of updating these details.
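To show how the pieces might fit together, the sketch below writes a complete TrinoCatalog that points at my-connection-resource. It assumes that the s3 block sits under the hive connector next to metastore, which this page does not state explicitly, so check the CRD of your operator version. The file name hive-catalog.yaml is hypothetical.

```shell
# Write a hypothetical catalog manifest that refers to the shared S3Connection.
cat > hive-catalog.yaml <<'EOF'
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
      # Assumed placement of the s3 block (not confirmed by this page).
      s3:
        reference: my-connection-resource
EOF

# The manifest could then be applied alongside the S3Connection:
# kubectl apply -f hive-catalog.yaml
```

Other catalogs or products can reference the same S3Connection by name, so a change to host or credentials only has to be made in one place.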