Decreasing GitLab repo backup times from 48 hours to 41 minutes

https://news.ycombinator.com/rss Hits: 21
Summary

Repository backups are a critical component of any robust disaster recovery strategy. However, as repositories grow in size, the process of creating reliable backups becomes increasingly challenging. Our own Rails repository was taking 48 hours to back up, forcing impossible choices between backup frequency and system performance. We wanted to tackle this issue for our customers and for our own users internally.

Ultimately, we traced the issue to a 15-year-old Git function with O(N²) complexity and fixed it with an algorithmic change, cutting backup times from 48 hours to 41 minutes. The result: lower costs, reduced risk, and backup strategies that actually scale with your codebase.

This turned out to be a Git scalability issue that affects anyone with large repositories. Here's how we tracked it down and fixed it.

Backup at scale

First, let's look at the problem. As organizations scale their repositories and backups grow more complex, here are some of the challenges they can face:

Time-prohibitive backups: For very large repositories, creating a repository backup could take several hours, which can hinder the ability to schedule regular backups.

Resource intensity: Extended backup processes can consume substantial server resources, potentially impacting other operations.

Backup windows: Finding adequate maintenance windows for such lengthy processes can be difficult for teams running 24/7 operations.

Increased failure risk: Long-running processes are more susceptible to interruptions from network issues, server restarts, and system errors, which can force teams to restart the entire lengthy backup from scratch.

Race conditions: Because creating a backup takes a long time, the repository might have changed a lot during the process, potentially producing an invalid backup or interrupting the backup because objects are no longer available.

These challenges can lead to compromising on backup frequency or completeness, an unacceptable trade-off when it comes to data p...
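The summary does not say which Git function was involved or what the algorithmic change was. As a generic illustration only of how an O(N²) routine can be made roughly linear, here is a minimal Python sketch. It assumes the work is removing duplicate reference names from a list; the function names and sample refs are hypothetical and are not taken from Git's source.

```python
def dedupe_quadratic(refs):
    """Keep the first occurrence of each ref using a list scan: O(N^2) overall."""
    unique = []
    for ref in refs:
        # `ref not in unique` walks the list built so far, so the total
        # number of comparisons grows quadratically with len(refs).
        if ref not in unique:
            unique.append(ref)
    return unique


def dedupe_linear(refs):
    """Same result, but membership checks hit a hash set: O(N) on average."""
    seen = set()
    unique = []
    for ref in refs:
        if ref not in seen:
            seen.add(ref)
            unique.append(ref)
    return unique


if __name__ == "__main__":
    # Hypothetical reference names, for illustration only.
    sample = ["refs/heads/main", "refs/tags/v1", "refs/heads/main"]
    print(dedupe_linear(sample))
```

Both functions return the same list in the same order; only the cost of the membership check differs, which is the kind of change that matters once a repository has a very large number of references.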

First seen: 2025-06-06 16:07

Last seen: 2025-06-07 12:11