Introduction: The Hidden Cost of Deletion in ScyllaDB
In ScyllaDB, deleting data doesn’t immediately remove it. Instead, a tombstone — a special marker — is written to indicate that a deletion has occurred. These tombstones are essential for maintaining consistency across replicas in a distributed environment. But if left unmanaged, they can become silent performance killers, especially in workloads with frequent deletes or upserts, typically those exceeding 1,000-10,000 deletions per second, such as user profile normalization or reconciliation jobs.
At Datanised, we’ve repeatedly seen teams struggle with latency spikes and compaction pressure, only to discover tombstone buildup as the root cause. In this article, we unpack how tombstones work, their impact on performance, and the schema and operational strategies you can use to mitigate their effects.
How Tombstones Work in ScyllaDB
ScyllaDB stores data in immutable SSTables (Sorted String Tables). When a delete occurs, Scylla doesn’t rewrite files — it appends a tombstone that marks the data as deleted. These markers persist until they are purged by compaction, but only after a grace period defined by `gc_grace_seconds` (default: 10 days).
Tombstones are propagated during repairs to ensure all replicas honour the deletion. Until they’re compacted away, tombstones are scanned during reads, increasing CPU and I/O load.
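As a minimal sketch of this behaviour, the CQL below uses a hypothetical `user_events` table (the table, column names, and values are illustrative, not from any particular production schema). The `DELETE` writes a tombstone rather than rewriting data, and the grace period can be tuned per table:

```cql
-- Hypothetical table used for the sketches in this article.
CREATE TABLE IF NOT EXISTS user_events (
    user_id  uuid,
    event_ts timestamp,
    payload  text,
    PRIMARY KEY (user_id, event_ts)
);

-- Deleting a row does not rewrite any SSTable; a tombstone is appended instead.
DELETE FROM user_events
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
  AND event_ts = '2024-01-15 10:00:00+0000';

-- The tombstone only becomes eligible for purge by compaction once
-- gc_grace_seconds has elapsed (default 864000 seconds, i.e. 10 days).
-- Lowering it is safe only if full repairs reliably finish within that window.
ALTER TABLE user_events WITH gc_grace_seconds = 259200;  -- 3 days
```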
Types of Tombstones
- Cell tombstones – Mark individual column deletions.
- Row tombstones – Represent full row deletions.
- Range tombstones – Result from deleting a range of clustering keys (e.g., `DELETE … WHERE key = ? AND ts > ?`).
Tombstones accumulate naturally, but when data models require frequent rewrites or range deletes, their impact compounds quickly.
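To make the three types concrete, here is a hedged sketch against the hypothetical `user_events` table from the earlier example; each statement produces the corresponding tombstone type:

```cql
-- Cell tombstone: deletes a single column from one row.
DELETE payload FROM user_events
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
  AND event_ts = '2024-01-15 10:00:00+0000';

-- Row tombstone: deletes a full row identified by its complete primary key.
DELETE FROM user_events
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
  AND event_ts = '2024-01-15 10:00:00+0000';

-- Range tombstone: deletes a range of clustering keys within one partition.
DELETE FROM user_events
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
  AND event_ts > '2024-01-01 00:00:00+0000';
```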
When Tombstones Are Not a Problem
It’s important to understand that tombstones are not inherently bad. They are a fundamental mechanism in ScyllaDB (and other similar distributed databases) for ensuring data consistency across replicas, especially during operations like repairs. In many typical workloads, tombstones are created and then efficiently purged through the normal compaction process within the `gc_grace_seconds` period without causing significant performance issues.
Tombstones generally do not become a major concern in scenarios such as:
- Low Deletion/Upsert Workloads: Applications that primarily involve appending new data or making infrequent updates and deletions will naturally generate fewer tombstones. The rate of tombstone creation is low enough that compaction can easily keep up.
- Read-Heavy Workloads with Stable Data: If your application is mostly reading data that is not frequently changed or deleted, the impact of tombstones on read performance will be minimal. Reads will encounter fewer tombstones in the SSTables they scan.
- Effective Compaction and Repair Strategies: With a properly tuned compaction strategy (such as Leveled Compaction Strategy, LCS, in appropriate scenarios) and regular, successful repair operations, tombstones are actively managed and removed from the system before they can accumulate to problematic levels.
- Well-Designed Data Models: Schemas that are designed to minimize the need for frequent data rewrites or range deletes will naturally reduce the conditions that lead to excessive tombstone generation. Append-only or TTL-based designs, where applicable, can be very effective.
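As a sketch of that last point, one common pattern is a TTL-based, append-only time-series table combined with time-window compaction. The table name, TTL, and window settings below are illustrative assumptions, not prescriptions:

```cql
-- Hypothetical metrics table: rows expire via a default TTL, and
-- TimeWindowCompactionStrategy groups data by time window so that entire
-- expired SSTables can be dropped, leaving few tombstones for reads to scan.
CREATE TABLE IF NOT EXISTS metrics_by_day (
    metric_id  text,
    day        date,
    reading_ts timestamp,
    value      double,
    PRIMARY KEY ((metric_id, day), reading_ts)
)
WITH default_time_to_live = 604800          -- 7 days
 AND compaction = {
     'class': 'TimeWindowCompactionStrategy',
     'compaction_window_unit': 'DAYS',
     'compaction_window_size': 1
 };
```

Because all rows in a given window expire together, compaction can discard whole expired SSTables instead of scanning and purging tombstones row by row.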