
[Tracker] Findings of demos #15

Open
sbernauer opened this issue Oct 3, 2022 · 0 comments

Comments

@sbernauer sbernauer changed the title [Tracker] Findings of nifi-kafka-druid-water-level-data demo [Tracker] Findings of demos Oct 12, 2022
bors bot referenced this issue in stackabletech/stackablectl Nov 3, 2022
## Description

Needs a larger k8s cluster! I use IONOS k8s with 9 nodes, each with 4 cores (8 threads), 20 GB RAM, and a 30 GB HDD.
Maybe we can also offer a smaller variant later on.

Otherwise business as usual. From the feature branch, run `stackablectl --additional-stacks-file stacks/stacks-v1.yaml --additional-releases-file releases.yaml --additional-demos-file demos/demos-v1.yaml demo install data-warehouse-iceberg-trino-spark`

I'm not happy with some parts, but I think an iterative approach is best:
* Shared bikes are currently not streamed into Kafka (it's a one-time job instead)
* Some high-volume real-time data source would be great. Currently we use the water levels and duplicate them to get higher volumes.
* Some sort of upsert or deletion use case would be great, but probably not on the large datasets, for the sake of our wallet ^^
* Better dashboards. The current ones were thrown together quickly.
* I would like to partition the water_level measurements by day but ran into apache/iceberg#5625. There might be a way around it by using a dedicated Spark context for compaction, but we can easily adopt the partitioning once the issue is fixed. Sorting during rewrites cost performance during compaction and did not provide real benefits in my initial measurements, so it is disabled for now.
* As always, I tracked my findings in https://github.com/stackabletech/stackablectl/issues/128
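For reference, the day-partitioning mentioned above uses Iceberg's `days(...)` partition transform in Spark SQL DDL. A rough sketch of what that could look like (the table and column names here are assumptions, not the demo's actual schema):

```sql
-- Hypothetical table/column names, only to illustrate the partition transform.
-- days(ts) is Iceberg's built-in day partition transform.
CREATE TABLE warehouse.water_level_measurements (
    station_uuid STRING,
    ts           TIMESTAMP,
    value        DOUBLE
) USING iceberg
PARTITIONED BY (days(ts));
```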

To get to the Spark UI, run `kubectl port-forward $(kubectl get pod -o name | grep 'spark-ingest-into-warehouse-.*-driver') 4040`
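The grep in that command selects only the driver pod. With a made-up pod listing (the pod names below are illustrative, not real), the pattern behaves like this:

```shell
# Made-up pod names, only to demonstrate the pattern used above:
# the regex matches the driver pod but not the executor pod.
printf 'pod/spark-ingest-into-warehouse-abc123-driver\npod/spark-ingest-into-warehouse-abc123-exec-1\n' \
  | grep 'spark-ingest-into-warehouse-.*-driver'
```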
@fhennig fhennig transferred this issue from stackabletech/stackablectl Feb 1, 2024