Xata: Postgres at scale, with copy-on-write branching and anonymization

https://news.ycombinator.com/rss Hits: 5
Summary

Here at Xata, we've been quietly busy (re)building a new PostgreSQL platform from scratch. We've incorporated everything we learned from operating a Postgres data platform for over four years, combined with feedback from our customers and our analysis of the gaps in the current market.

The result is an entirely new Postgres service that:

- Has instant Copy-on-Write branches with data.
- Includes data anonymization so that developer branches don't accidentally contain PII.
- Is cloud-agnostic, installable in your own AWS/GCP/Azure account, or even on-prem.
- Separates storage from compute with a distributed storage system accessed over NVMe/TCP.
- Offers a performance/cost ratio that is very competitive at scale.

We'll dive into the technical details shortly, but let's first talk about who this Postgres platform is for.

Postgres at scale

Our goal is to address the challenges teams face when using Postgres at scale. When we say "scale", we don't only mean technical dimensions like data volume, number of CPUs, GB of RAM, or vertical/horizontal scaling capabilities. While these are included in our definition of "at scale", we also mean organizational challenges such as:

- Zero-downtime requirements when applying schema changes or major version upgrades.
- Testing database changes or training AI models on realistic data sets that resemble the production data, but without any PII or sensitive data in them.
- Fast, cost-efficient ephemeral environments for developers.
- Security and compliance for companies that deal with sensitive data.

Let's start by looking at staging and dev use cases, and then we'll cover prod use cases as well.

Staging and dev use cases

Copy-on-Write branches are a great way to quickly spin up preview or development environments with production data, even when dealing with very large datasets. But on their own, they come with two major limitations:

- By design, the branches contain exactly the same data as the parent database, including private PII/PHI or any other sensitive ...
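
To illustrate why a Copy-on-Write branch can be created instantly yet still contains exactly the parent's data (sensitive values included), here is a minimal in-memory sketch. It is not Xata's implementation: the `Layer` and `Database` names are hypothetical, and a real system would work on storage blocks rather than a Python dict.

```python
class Layer:
    """A set of written blocks stacked on top of an optional parent layer."""

    def __init__(self, parent=None):
        self.parent = parent
        self.blocks = {}  # block_id -> data written in this layer only


class Database:
    """Toy database whose branches share unmodified blocks (hypothetical API)."""

    def __init__(self, base=None):
        # Every database writes into its own private top layer.
        self._top = Layer(base)

    def read(self, block_id):
        # Walk down the layer chain until some layer holds the block.
        layer = self._top
        while layer is not None:
            if block_id in layer.blocks:
                return layer.blocks[block_id]
            layer = layer.parent
        return None

    def write(self, block_id, data):
        # Writes never touch shared layers, only this database's top layer.
        self._top.blocks[block_id] = data

    def branch(self):
        # Freeze the current state as a shared snapshot. No data is copied:
        # parent and child each get a fresh empty layer above the snapshot.
        frozen = self._top
        self._top = Layer(frozen)
        return Database(base=frozen)


prod = Database()
prod.write(0, "alice@example.com")

dev = prod.branch()               # constant-time, regardless of data size
branch_value = dev.read(0)        # the branch sees the parent's data, PII included

dev.write(0, "redacted")          # diverges only in the branch
prod.write(1, "new-row")          # later parent writes stay out of the branch
```

Because the branch initially reads through to the parent's snapshot, any PII in the parent is visible in the branch until it is overwritten, which is the limitation described above.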

First seen: 2025-05-17 21:48

Last seen: 2025-05-18 01:49