Post-Mortem: Backup Failure via Filesystem Limit Exhaustion
**Incident Summary:**
A website backup process failed after a single 1.6MB GIF (a "Friends" reaction animation) was replicated 246,173 times within the storage system, inflating its size to 377GB and triggering filesystem limits.
**Root Cause:**
The issue stemmed from a security policy in the Discourse platform's "secure uploads" feature. To maintain security contexts, when a file moves between private and public scopes (e.g., PM to public post), the system generates a new copy with a randomized SHA1 hash rather than referencing the original. High community usage of a specific GIF caused massive data duplication.
**Technical Failure Chain:**
1. **Data Inflation:** Excessive duplication led to 377GB of backup bloat from a single asset.
2. **Failed Mitigation (Hardlinks):** An initial attempt to use hardlinks to group identical hashes failed because the number of duplicates exceeded the **ext4 filesystem limit of ~65,000 hardlinks per inode**. This resulted in approximately 181,000 fallback downloads, failing to resolve the bloat.
**Resolution:**
New logic was implemented for the backup process: The system attempts to create hardlinks first; if the filesystem returns an `EMLINK` error (too many hardlinks), it automatically switches to creating a local copy and designates that as the "primary" file until the limit is reached again. This ensures compatibility across all filesystems without manual configuration.
Anna's Archive has backed up Spotify's metadata and music files (~300TB), creating the largest publicly available music metadata database (256 million tracks, 186 million ISRCs) and a "preservation archive" for music.