Hacker News

Orchestera (https://orchestera.com/) - Fully managed Apache Spark clusters in your own AWS account with no additional compute markups, unlike EMR and Databricks.

Implemented so far:

- Automated scale in / scale out of nodes for Spark executors and drivers via Karpenter

- Jupyter notebook integration that works as a Spark driver for quick iteration and prototyping

- Simple JSON-based IAM permissions management via AWS Parameter Store
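
To give a rough feel for the Parameter Store item above, here is a minimal sketch of reading a JSON permissions document from AWS Systems Manager Parameter Store with boto3. The parameter name, the JSON layout (principal name mapped to a list of IAM actions), and both helper functions are hypothetical illustrations, not Orchestera's actual format or API; only `get_parameter` is a real boto3 SSM call.

```python
import json


def fetch_permissions(ssm_client, parameter_name):
    """Read a JSON permissions document from Parameter Store and parse it.

    `ssm_client` is expected to behave like boto3.client("ssm").
    The document layout is an assumption made for this sketch:
    {"<principal>": ["<iam-action>", ...], ...}
    """
    response = ssm_client.get_parameter(Name=parameter_name, WithDecryption=True)
    document = json.loads(response["Parameter"]["Value"])
    if not isinstance(document, dict):
        raise ValueError("permissions document must be a JSON object")
    return document


def allowed_actions(document, principal):
    """Return the sorted IAM actions listed for a principal (empty if absent)."""
    return sorted(document.get(principal, []))
```

With real AWS credentials this could be called as `fetch_permissions(boto3.client("ssm"), "/spark/permissions")`, where `/spark/permissions` is a made-up parameter name. Passing the client in as an argument keeps the parsing logic testable without touching AWS.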

Work-in-progress this month:

- JupyterHub-based Spark notebook provisioning

- Spark History Server

- Spark History Server MCP support with a chat interface for debugging and diagnosing Spark pipelines

Open to feedback and connecting. Docs at https://docs.orchestera.com/