Discovery
The Stackable Operator for Apache HDFS publishes a discovery ConfigMap, which exposes a client configuration bundle that allows access to the Apache HDFS cluster.
Example
Given the following HDFS cluster:
apiVersion: hdfs.stackable.tech/v1alpha1
kind: HdfsCluster
metadata:
  name: {clusterName} (1)
  namespace: {namespace} (2)
spec:
  namenode:
    roleGroups:
      default: (3)
[…]
(1) The name of the HDFS cluster, which is also the name of the created discovery ConfigMap.
(2) The namespace of the discovery ConfigMap.
(3) A role group name of the namenode role.
The resulting discovery ConfigMap is located at {namespace}/{clusterName}.
Contents
The ConfigMap data values are formatted as Hadoop XML files, so the ConfigMap can be mounted directly into pods that require access to HDFS.
core-site.xml
- Contains the fs.defaultFS property, which defaults to hdfs://{clusterName}/.
hdfs-site.xml
- Contains the dfs.namenode.* properties for the rpc and http addresses of the namenodes, as well as the dfs.nameservices property, which defaults to hdfs://{clusterName}/.
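To use these files outside a pod, they can be copied out of the ConfigMap with kubectl. The following is a minimal sketch, not something the operator provides: the function name fetch_discovery_config and its three arguments are hypothetical, and it assumes kubectl is already configured for the cluster that runs HDFS.

```shell
# Sketch: copy the two Hadoop XML files out of the discovery ConfigMap.
# Hypothetical arguments: namespace, cluster name, target directory.
fetch_discovery_config() {
  fdc_namespace="$1"
  fdc_cluster="$2"
  fdc_target="$3"

  mkdir -p "$fdc_target" || return 1

  # Dots inside a key name must be escaped in kubectl's JSONPath syntax.
  kubectl get configmap "$fdc_cluster" --namespace "$fdc_namespace" \
    --output 'jsonpath={.data.core-site\.xml}' > "$fdc_target/core-site.xml" || return 1
  kubectl get configmap "$fdc_cluster" --namespace "$fdc_namespace" \
    --output 'jsonpath={.data.hdfs-site\.xml}' > "$fdc_target/hdfs-site.xml" || return 1
}
```

For example, fetch_discovery_config default simple-hdfs ./hdfs-conf (both names are placeholders) would write the two files into ./hdfs-conf, which a Hadoop client can then pick up by setting HADOOP_CONF_DIR to that directory.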
Kerberos
If Kerberos is enabled as described in the security documentation, the discovery ConfigMap also tells clients that they must authenticate using Kerberos.
Some Kerberos-related configuration settings require the environment variable KERBEROS_REALM to be set (e.g. using export KERBEROS_REALM=$(grep -oP 'default_realm = \K.*' /stackable/kerberos/krb5.conf)).
If you want to use the discovery ConfigMap outside Stackable services, you need to provide this environment variable yourself.
As an alternative, you can substitute ${env.KERBEROS_REALM} with your actual realm (e.g. by using sed -i -e 's/${env.KERBEROS_REALM}/'"$KERBEROS_REALM/g" core-site.xml).
For example, the property dfs.namenode.kerberos.principal is set to nn/hdfs.default.svc.cluster.local@${env.KERBEROS_REALM}.
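The substitution described above can be sketched as a small shell function that reads the realm from a krb5.conf file and rewrites copies of the discovery files in place. The function name substitute_kerberos_realm and its arguments are hypothetical, and the sketch assumes GNU sed (for the -i option) and a realm that contains no "/" characters.

```shell
# Sketch: replace the ${env.KERBEROS_REALM} placeholder in Hadoop XML files.
# Hypothetical arguments: path to krb5.conf, then one or more files to rewrite.
substitute_kerberos_realm() {
  skr_krb5_conf="$1"
  shift

  # Take the first default_realm entry, without surrounding whitespace.
  skr_realm=$(sed -n 's/^[[:space:]]*default_realm[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*/\1/p' \
    "$skr_krb5_conf" | head -n 1)
  if [ -z "$skr_realm" ]; then
    echo "no default_realm found in $skr_krb5_conf" >&2
    return 1
  fi

  for skr_file in "$@"; do
    # "\$" and "\." match a literal dollar sign and dot; the braces are
    # literal in a basic regular expression.
    sed -i -e 's/\${env\.KERBEROS_REALM}/'"$skr_realm/g" "$skr_file" || return 1
  done
}
```

Called as substitute_kerberos_realm /stackable/kerberos/krb5.conf core-site.xml hdfs-site.xml, this would turn a principal such as nn/hdfs.default.svc.cluster.local@${env.KERBEROS_REALM} into one that ends in the realm found in krb5.conf. Run it on copies of the files, since the rewrite happens in place.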