First steps

After installing the Operator and its dependencies as described in Installation, follow the steps below to set up a Superset instance and connect to it.

Database for the Superset metadata

Superset metadata (slices, connections, tables, dashboards etc.) is stored in an SQL database.

For testing purposes, you can spin up a PostgreSQL database with the following commands:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install --wait superset bitnami/postgresql \
    --set auth.username=superset \
    --set auth.password=superset \
    --set auth.database=superset

Secret with Superset credentials

Create a secret with the necessary credentials. It contains the database connection credentials as well as an admin account for Superset itself. Create a file called superset-credentials.yaml:

---
apiVersion: v1
kind: Secret
metadata:
  name: simple-superset-credentials
type: Opaque
stringData:
  adminUser.username: admin
  adminUser.firstname: Superset
  adminUser.lastname: Admin
  adminUser.email: admin@superset.com
  adminUser.password: admin
  connections.secretKey: thisISaSECRET_1234
  connections.sqlalchemyDatabaseUri: postgresql://superset:superset@superset-postgresql.default.svc.cluster.local/superset

And apply it:

kubectl apply -f superset-credentials.yaml

The connections.secretKey is used to securely sign the session cookies and can also be used by extensions for any other security-related needs. It should be a long random string of bytes.
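
One way to produce such a value is to read random bytes and base64-encode them. The following is a minimal sketch, assuming a shell with head, base64 and /dev/urandom available; the variable name SECRET_KEY is only illustrative and is not something the operator reads:

```shell
# Read 42 random bytes and base64-encode them.
# 42 bytes encode to exactly 56 characters without padding.
SECRET_KEY="$(head -c 42 /dev/urandom | base64 | tr -d '\n')"
echo "$SECRET_KEY"
```

The printed value can then be pasted into connections.secretKey in place of the example value.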

connections.sqlalchemyDatabaseUri must contain the connection string to the SQL database storing the Superset metadata.
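
To make the parts of this connection string explicit, the sketch below assembles it from the values used in the Helm command and the secret above. The variable names are illustrative only; the host name assumes the PostgreSQL release is called superset and runs in the default namespace, as in this guide:

```shell
# Values taken from the helm install command earlier in this guide.
DB_USER=superset
DB_PASSWORD=superset
DB_NAME=superset
# Service name of the PostgreSQL release "superset" in the "default" namespace.
DB_HOST=superset-postgresql.default.svc.cluster.local

# General shape: postgresql://<user>:<password>@<host>/<database>
SQLALCHEMY_URI="postgresql://${DB_USER}:${DB_PASSWORD}@${DB_HOST}/${DB_NAME}"
echo "$SQLALCHEMY_URI"
```

If the database runs in a different namespace or under a different release name, only DB_HOST needs to change.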

The adminUser fields are used to create an admin user. Please note that the admin user will be disabled if you use a non-default authentication mechanism like LDAP.

Creation of a Superset node

A Superset node must be created as a custom resource. Create a file called superset.yaml:

---
apiVersion: superset.stackable.tech/v1alpha1
kind: SupersetCluster
metadata:
  name: simple-superset
spec:
  image:
    productVersion: 2.1.0
    stackableVersion: 23.7.0
  clusterConfig:
    credentialsSecret: simple-superset-credentials
    listenerClass: external-unstable
    loadExamplesOnInit: true
  nodes:
    roleGroups:
      default:
        config:
          rowLimit: 10000
          webserverTimeout: 300

And apply it:

kubectl apply -f superset.yaml

metadata.name contains the name of the Superset cluster.

The version of the Docker image provided by Stackable must be set in spec.image, using the productVersion and stackableVersion fields.

Please note that the version you need to specify is not only the version of Apache Superset which you want to roll out, but has to also include the Stackable version as shown. This Stackable version is the version of the underlying container image which is used to execute the processes. For a list of available versions please check our image registry. It should generally be safe to simply use the latest image version that is available.

spec.statsdExporterVersion must contain the tag of a statsd-exporter Docker image in the Stackable repository.

The previously created secret must be referenced in spec.clusterConfig.credentialsSecret.

The spec.clusterConfig.loadExamplesOnInit key is optional and defaults to false. Set it to true to load example data into Superset when the database is initialized.

The rowLimit configuration option defines the row limit when requesting chart data.

The webserverTimeout configuration option defines the maximum number of seconds a Superset request can take before timing out. This setting affects the maximum duration a query to an underlying datasource can take. If you get timeout errors before your query returns a result, you may need to increase this timeout.

Initialization of the Superset database

The first time the cluster is created, the operator creates a SupersetDB resource with the same name as the cluster. It ensures that the database is initialized (with the creation of a schema and admin user).

A Kubernetes job is created which starts a pod to initialize the database. This can take a while.

You can use kubectl to wait on the resource:

time kubectl wait supersetdb/simple-superset \
  --for jsonpath='{.status.condition}'=Ready \
  --timeout 600s

Using Superset

When the Superset node is created and the database is initialized, Superset can be opened in the browser.

The Superset port, which defaults to 8088, can be forwarded to the local host:

kubectl port-forward service/simple-superset-external 8088 > /dev/null 2>&1 &

Superset can then be opened in the browser at http://localhost:8088.
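
Because the port-forward runs in the background, it can be handy to check from the command line that Superset answers before switching to the browser. The helper below is a rough sketch and not part of the operator: the function name wait_for_superset is made up for this example, it assumes curl is installed, and it assumes Superset's /health endpoint is reachable through the forwarded port:

```shell
# Poll a URL until it answers successfully or the attempts run out.
# Arguments: URL, number of attempts (default 30), seconds between attempts (default 2).
wait_for_superset() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each request.
    if curl -fsS --max-time 5 "$url" > /dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (assumes the port-forward is already running):
#   wait_for_superset http://localhost:8088/health && echo "Superset is up"
```

The function returns a non-zero exit code if no attempt succeeds, so it can also be used in scripts.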

Enter the admin credentials from the Kubernetes secret:

Login screen of Superset

If the examples were loaded then some dashboards are already available:

Superset dashboard showing birth names

Great! You have set up a Superset instance and connected to it!

What’s next

Look at the Usage guide to find out more about configuring your Superset instance or have a look at the Superset documentation to create your first dashboard.