Testing Trino with Hive and S3

Create a schema and a table for the Iris data located in S3 and query data. This assumes to have the Iris data set in the PARQUET format available in the S3 bucket which can be downloaded here.

Create schema

CREATE SCHEMA IF NOT EXISTS hive.iris
WITH (location = 's3a://iris/');

which should return:

CREATE SCHEMA

Create table

CREATE TABLE IF NOT EXISTS hive.iris.iris_parquet (
  sepal_length DOUBLE,
  sepal_width  DOUBLE,
  petal_length DOUBLE,
  petal_width  DOUBLE,
  class        VARCHAR
)
WITH (
  external_location = 's3a://iris/parq',
  format = 'PARQUET'
);

which should return:

CREATE TABLE

Query data

SELECT
    sepal_length,
    class
FROM hive.iris.iris_parquet
LIMIT 10;

which should return something like this:

 sepal_length |    class
--------------+-------------
          5.1 | Iris-setosa
          4.9 | Iris-setosa
          4.7 | Iris-setosa
          4.6 | Iris-setosa
          5.0 | Iris-setosa
          5.4 | Iris-setosa
          4.6 | Iris-setosa
          5.0 | Iris-setosa
          4.4 | Iris-setosa
          4.9 | Iris-setosa
(10 rows)

Query 20220210_161615_00000_a8nka, FINISHED, 1 node
https://172.18.0.5:30299/ui/query.html?20220210_161615_00000_a8nka
Splits: 18 total, 18 done (100.00%)
CPU Time: 0.7s total,    20 rows/s, 11.3KB/s, 74% active
Per Node: 0.3 parallelism,     5 rows/s, 3.02KB/s
Parallelism: 0.3
Peak Memory: 0B
2.67 [15 rows, 8.08KB] [5 rows/s, 3.02KB/s]