
Cassandra: how is data written?

seungh0 2023. 3. 4. 21:20

How is data written?

Cassandra processes data at several stages on the write path, starting with the immediate logging of a write and ending with a write of data to disk:

  • Logging data in the commit log
  • Writing data to the memtable
  • Flushing data from the memtable
  • Storing data on disk in SSTables

 

Logging writes and memtable storage

  • When a write occurs, Cassandra stores the data in an in-memory structure called the memtable and, to provide configurable durability, also appends the write to the commit log on disk.
    • The commit log receives every write made to a Cassandra node, and these durable writes survive even if power fails on the node.
  • The memtable is a write-back cache of data partitions that Cassandra looks up by key.
    • The memtable stores writes in sorted order until it reaches a configurable limit, and then it is flushed.
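
The two steps above can be sketched in a few lines of Python. This is only a toy model, not Cassandra's implementation: the class name ToyNode, the memtable_limit parameter (counted in keys rather than bytes), and the use of plain string keys instead of tokens are all assumptions made for illustration.

```python
import bisect


class ToyNode:
    """Toy model of the write path: commit log append, then memtable update."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []      # stands in for the append-only log on disk
        self.memtable_keys = []   # partition keys, kept in sorted order
        self.memtable = {}        # partition key -> latest value
        self.memtable_limit = memtable_limit  # hypothetical flush threshold (in keys)

    def write(self, key, value):
        # 1. Append the write to the commit log for durability.
        self.commit_log.append((key, value))
        # 2. Update the memtable, inserting new keys at their sorted position.
        if key not in self.memtable:
            bisect.insort(self.memtable_keys, key)
        self.memtable[key] = value
        # 3. Tell the caller whether the memtable has reached its limit.
        return len(self.memtable_keys) >= self.memtable_limit


node = ToyNode(memtable_limit=3)
node.write("k2", "b")
node.write("k1", "a")
needs_flush = node.write("k3", "c")
```

After these three writes the commit log holds the writes in arrival order, while the memtable keys are held in sorted order and the last call reports that a flush is due.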

 

Flushing data from the memtable

  • To flush the data, Cassandra writes it to disk in the memtable-sorted order. A partition index that maps tokens to locations on disk is also created on disk.
  • When the memtable content exceeds the configurable threshold, or the commit log space exceeds commitlog_total_space_in_mb, the memtable is put in a queue that is flushed to disk.
  • The queue can be configured with the memtable_heap_space_in_mb or memtable_offheap_space_in_mb settings in cassandra.yaml.
  • If the data to be flushed exceeds the memtable_cleanup_threshold, Cassandra blocks writes until the next flush succeeds.
  • You can manually flush a table using nodetool flush, or flush every memtable on a node using nodetool drain (which flushes the memtables and stops listening for connections from clients and other nodes).
    • To reduce commit log replay time, the recommended best practice is to flush the memtable before you restart a node. If a node stops working, replaying the commit log restores to the memtable the writes that were there before it stopped.
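
As a rough illustration of flushing and commit log replay, the sketch below serializes a memtable in sorted key order, builds an index of byte offsets, and rebuilds a memtable from a log. The function names, the JSON-lines layout, and indexing by key instead of token are assumptions for this example; the real SSTable format is different.

```python
import json


def flush_memtable(memtable):
    """Serialize a memtable (dict) in sorted key order and build an index.

    Returns (data, index): data is the bytes of the toy SSTable, and index
    maps each key to the byte offset where its row starts.
    """
    data = bytearray()
    index = {}
    for key in sorted(memtable):
        index[key] = len(data)  # the row starts at the current end of the file
        row = json.dumps({"key": key, "value": memtable[key]}) + "\n"
        data.extend(row.encode("utf-8"))
    return bytes(data), index


def read_row(data, index, key):
    """Jump straight to a row using the index instead of scanning the data."""
    offset = index[key]
    end = data.index(b"\n", offset)
    return json.loads(data[offset:end].decode("utf-8"))


def replay_commit_log(commit_log):
    """Rebuild a memtable by re-applying logged writes in order."""
    memtable = {}
    for key, value in commit_log:
        memtable[key] = value
    return memtable


# Writes that were logged but not yet flushed when the node stopped.
memtable = replay_commit_log([("k2", "b"), ("k1", "a"), ("k2", "b2")])
data, index = flush_memtable(memtable)
row = read_row(data, index, "k2")
```

The longer the unflushed log, the more entries replay_commit_log has to re-apply, which is why flushing before a restart shortens startup.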


Storing data on disk in SSTables

  • Memtables and SSTables are maintained per table.
    • The commit log is shared among tables.
  • SSTables are immutable and are not written to again after the memtable is flushed.
    • Consequently, a partition is typically stored across multiple SSTable files. A number of other SSTable structures exist to assist read operations.
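
Because SSTables are never rewritten in place, a later write to the same partition lands in a newer file, and a read has to reconcile the copies. The sketch below shows one simplified way to picture that, assuming each SSTable is a dict that maps a key to a (write timestamp, value) pair. The function name and data layout are hypothetical, and the real read path uses additional structures that are not modeled here.

```python
def read_partition(key, sstables):
    """Return the newest value for key across immutable SSTables, or None."""
    newest = None
    for sstable in sstables:
        cell = sstable.get(key)  # (timestamp, value), or None if absent
        if cell is not None and (newest is None or cell[0] > newest[0]):
            newest = cell
    return None if newest is None else newest[1]


# Two flushes produced two immutable files; k2 was rewritten between them.
sstable_1 = {"k1": (100, "a"), "k2": (100, "b")}
sstable_2 = {"k2": (200, "b2")}

latest_k2 = read_partition("k2", [sstable_1, sstable_2])
latest_k1 = read_partition("k1", [sstable_1, sstable_2])
missing = read_partition("k9", [sstable_1, sstable_2])
```

The older value of k2 still exists in sstable_1, and the read simply prefers the copy with the higher timestamp.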