How to design a DBMS for Telco requirements

https://news.ycombinator.com/rss Hits: 5
Summary

My colleague Zhao Song presented a walkthrough of the evolution of the DBMSs and how it relates to Google Spanner, Aurora, PolarDB and MySQL NDB Cluster.I had some interesting discussions with him on the topic and it makes sense to return to the 1990s when I designed NDB Cluster and the impact on the recovery algorithms from the requirements for a Telco DBMS.A Telco DBMS is a DBMS that operates in a Telco environment, this DBMS is involved in each interaction with the Telco system through smartphones such as call setup, location updates, SMS, Mobile Data. If the DBMS is down it means no service available for smartphones. Obviously there is no time of day or night when it is ok to be down. Thus even a few seconds of downtime is important to avoid.Thus in the design of NDB Cluster I had to take into account the following events:DBMS Software UpgradeApplication Software UpgradeSW Failure in DBMS NodeSW Failure in Application ServiceHW Failure in DBMS NodeHW Failure in Application ServiceSW Failure in DBMS ClusterRegion FailureIt was clear that the design had to be a distributed DBMS, in Telcos it was not uncommon to build HW Redundant solutions with a single node but redundant HW. But this solution will obviously have difficulties with SW failures. Also it requires very specialised HW which costs hundreds of million of dollars to develop. Today this solution is very rarely used.One of the first design decisions would be to choose between a disk-based DBMS and an in-memory DBMS. This was settled by the fact that latency requirement was to handle transactions involving tens of rows within around 10 milliseconds, thus with the hard drives of those days not really possible. Today with the introduction of SSDs and NVMe drives there is still a latency impact of at least 3x in using disk drives compared to using an in-memory DBMS.If we play with the thought of using a Shared Disk DBMS using modern HW we still have a problem. The Shared Disk solution requires a storage solutio...

First seen: 2025-08-30 15:40

Last seen: 2025-08-30 19:41