DBMS/Cassandra
Cassandra compression
Compression: Compression maximizes the storage capacity of Cassandra nodes by reducing the volume of data on disk and disk I/O, particularly for read-dominated workloads. Cassandra quickly finds the location of rows in the SSTable index and decompresses the relevant row chunks. New storage engine (Cassandra 3.0+): Compression is important for Cassandra 2.2, but Cassandra 3.0 and later use a new ..
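A minimal sketch of why chunked compression keeps reads cheap: data is compressed in fixed-size chunks, so a read only decompresses the chunks that cover the requested bytes. The chunk size, the function names, and the use of zlib are assumptions for illustration, not Cassandra's actual on-disk format or compressor.

```python
import zlib
from typing import List

CHUNK_SIZE = 16  # bytes; deliberately tiny, hypothetical value for the demo


def compress_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Compress data as independent fixed-size chunks."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def read_range(chunks: List[bytes], offset: int, length: int,
               chunk_size: int = CHUNK_SIZE) -> bytes:
    """Return `length` uncompressed bytes starting at `offset`,
    decompressing only the chunks that overlap that range."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    raw = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return raw[start:start + length]
```

With 100 bytes and 16-byte chunks, reading bytes 30 to 39 touches only chunks 1 and 2 out of 7.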
Cassandra, how is data maintained?
How is data maintained? The Cassandra write process stores data in files called SSTables. SSTables are immutable. Instead of overwriting existing rows with inserts or updates, Cassandra writes new timestamped versions of the inserted or updated data in new SSTables. Cassandra does not delete data in place; instead, it marks deleted data with tombstones. Over time, Cassandra m..
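The combination of immutable tables, timestamped versions, and tombstones can be illustrated with a small last-write-wins merge. This is a sketch under stated assumptions: the dictionary model, the `Cell` type, and the function names are hypothetical, and `None` stands in for a tombstone.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: an "SSTable" maps a row key to (timestamp, value).
# A value of None plays the role of a tombstone.
Cell = Tuple[int, Optional[str]]
SSTable = Dict[str, Cell]


def merge_sstables(sstables: List[SSTable]) -> SSTable:
    """Keep only the newest version of each key (last write wins)."""
    merged: SSTable = {}
    for table in sstables:
        for key, (ts, value) in table.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return merged


def read(sstables: List[SSTable], key: str) -> Optional[str]:
    """Return the live value for key, or None if absent or deleted."""
    cell = merge_sstables(sstables).get(key)
    if cell is None or cell[1] is None:
        return None
    return cell[1]
```

No table is ever modified: an update is a newer cell in a later table, and a delete is a newer cell whose value is the tombstone marker.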
Cassandra, how is data written?
How is data written? Cassandra processes data at several stages on the write path, starting with the immediate logging of a write and ending with a write of data to disk: logging data in the commit log, writing data to the memtable, flushing data from the memtable, and storing data on disk in SSTables. Logging writes and memtable storage: When a write occurs, Cassandra stores the data in a memory st..
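The four stages of the write path can be sketched as a toy in-memory model. The class name, the `flush_threshold` parameter, and the plain Python lists standing in for the commit log and SSTables are all assumptions for illustration; real flush triggers and file formats are more involved.

```python
from typing import Dict, List, Tuple


class WritePathSketch:
    """Toy model: commit log -> memtable -> flush -> SSTable."""

    def __init__(self, flush_threshold: int = 2) -> None:
        self.commit_log: List[Tuple[str, str]] = []   # append-only log
        self.memtable: Dict[str, str] = {}            # in-memory writes
        self.sstables: List[Tuple[Tuple[str, str], ...]] = []  # immutable
        self.flush_threshold = flush_threshold        # hypothetical trigger

    def write(self, key: str, value: str) -> None:
        self.commit_log.append((key, value))  # 1. log the write first
        self.memtable[key] = value            # 2. then store it in memory
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # 3. and 4. write the memtable out as a sorted, immutable table
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable.clear()
        # Simplification: logged entries are dropped once they are flushed.
        self.commit_log.clear()
```

The sketch never reads existing data before accepting a write; each write is an append to the log plus an in-memory update.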
Cassandra Storage Engine
Storage Engine: Cassandra uses a storage structure similar to a Log-Structured Merge Tree. Cassandra avoids reading before writing. Read-before-write, especially in a large distributed system, can result in large read latencies and other problems. For example, two clients read at the same time; one overwrites the row to make update A, and the other overwrites the row to make update ..
Cassandra Snitch
A snitch determines which datacenters and racks nodes belong to. Snitches inform Cassandra about the network topology so that requests are routed efficiently, and they allow Cassandra to distribute replicas by grouping machines into datacenters and racks. Strategy: Dynamic snitching. By default, all snitches also use a dynamic snitch layer that monitors read latency and, where possible, routes requests away from poor..
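The idea of routing reads by observed latency can be sketched as follows. This is only an illustration of the principle: the `LatencyRouter` class, the sliding window, and the mean-based score are assumptions, and they do not reproduce how Cassandra's dynamic snitch actually scores nodes.

```python
from collections import defaultdict, deque
from statistics import mean
from typing import Deque, Dict, List


class LatencyRouter:
    """Pick the replica with the lowest recent mean read latency."""

    def __init__(self, window: int = 5) -> None:
        # Keep only the last `window` latency samples per node.
        self.samples: Dict[str, Deque[float]] = defaultdict(
            lambda: deque(maxlen=window)
        )

    def record(self, node: str, latency_ms: float) -> None:
        self.samples[node].append(latency_ms)

    def score(self, node: str) -> float:
        # Nodes with no samples score 0.0 so they still get tried.
        history = self.samples[node]
        return mean(history) if history else 0.0

    def pick(self, replicas: List[str]) -> str:
        return min(replicas, key=self.score)
```

Because old samples fall out of the window, a node that recovers is eventually preferred again.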