Yep. Heavy ETL often adds latency; a staging table plus COPY into Postgres, then idempotent upserts, is usually enough. Keep it incremental and observable: checksums, counts, and replayable loads. For bigger scales, add CDC (logical decoding like Debezium) and parallelize ingestion across partitions; minimize in-Python transforms and push work into SQL.
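A minimal sketch of the staging-plus-upsert step, assuming psycopg2; the table and column names (`staging_events`, `events`, `id`, `payload`) are hypothetical:

```python
# Sketch of the staging + COPY + idempotent-upsert pattern.
# Table/column names are hypothetical, not from a real schema.

def upsert_sql(staging: str, target: str, cols: list[str], key: str) -> str:
    """Build an idempotent INSERT ... ON CONFLICT upsert from staging to target."""
    col_list = ", ".join(cols)
    updates = ", ".join(f"{c} = EXCLUDED.{c}" for c in cols if c != key)
    return (
        f"INSERT INTO {target} ({col_list}) "
        f"SELECT {col_list} FROM {staging} "
        f"ON CONFLICT ({key}) DO UPDATE SET {updates}"
    )

# With a live connection (psycopg2 assumed), the load would look roughly like:
#
#   with conn, conn.cursor() as cur:                      # one transaction
#       cur.execute("TRUNCATE staging_events")
#       with open("batch.csv") as f:
#           cur.copy_expert(
#               "COPY staging_events (id, payload) FROM STDIN WITH CSV", f)
#       cur.execute(upsert_sql("staging_events", "events",
#                              ["id", "payload"], "id"))

print(upsert_sql("staging_events", "events", ["id", "payload"], "id"))
```

Because the upsert is keyed `ON CONFLICT`, replaying the same batch is a no-op on already-loaded rows, which is what makes the load idempotent and safely retryable.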
Yep, that's the pattern I follow.
Unfortunately, the "ETL pipeline" I mentioned didn't even use transactions and was opening a new connection for every insert. No wonder it was slow.
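For contrast, the fix for that anti-pattern is one connection, one transaction, and batched statements. A small sketch, with psycopg2 assumed and the DSN/table names hypothetical; only the batching helper runs standalone here:

```python
# Replace "new connection + implicit commit per row" with one reused
# connection, one transaction, and batched INSERTs.
from itertools import islice

def chunked(rows, size):
    """Yield lists of up to `size` rows, for batched INSERTs."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch

# With psycopg2 (assumed), the corrected loader is roughly:
#
#   conn = psycopg2.connect(dsn)          # one connection, reused
#   with conn, conn.cursor() as cur:      # one transaction for the load
#       for batch in chunked(rows, 1000):
#           execute_values(cur,
#               "INSERT INTO events (id, payload) VALUES %s", batch)

print([len(b) for b in chunked(range(2500), 1000)])  # → [1000, 1000, 500]
```

Opening a TCP connection and committing a transaction per row each cost a network round trip plus fsync; batching amortizes both, which is usually where most of the slowness goes.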