Cassandra’s advanced architecture offers powerful features for managing distributed data. In this set of 30 multiple-choice questions (MCQs), you will learn about key concepts like tunable consistency levels, hinted handoff, read repair, anti-entropy, Merkle trees, write and read paths, compaction strategies, and tombstones. This knowledge is crucial for optimizing Cassandra clusters for performance, fault tolerance, and consistency.
1. Tunable Consistency Levels
In Cassandra, the consistency level defines: a) The number of nodes that need to acknowledge a read or write request for it to be considered successful b) The maximum latency allowed for a read or write operation c) The type of database used d) The type of query to be executed
Which of the following is NOT a consistency level in Cassandra? a) ONE b) QUORUM c) ALL d) TRANSACTIONAL
In the QUORUM consistency level, how many replicas must acknowledge a write or read? a) A majority of replicas b) At least one replica c) All replicas d) No replicas
What is the default consistency level for a write operation in Cassandra? a) ONE b) ALL c) QUORUM d) LOCAL_QUORUM
In Cassandra, which consistency level is best suited for high availability over consistency? a) ALL b) ONE c) LOCAL_QUORUM d) EACH_QUORUM
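As a study aid for this section, the arithmetic behind tunable consistency can be sketched in a few lines. This is an illustration rather than Cassandra code: `quorum` and `is_strongly_consistent` are hypothetical helper names, and the sketch assumes the usual rules that a quorum is floor(RF / 2) + 1 replicas and that a read is guaranteed to see the latest acknowledged write when read acks plus write acks exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def is_strongly_consistent(read_acks: int, write_acks: int, replication_factor: int) -> bool:
    """True when every read replica set overlaps every write replica set."""
    return read_acks + write_acks > replication_factor


rf = 3
print(quorum(rf))                                          # 2
print(is_strongly_consistent(quorum(rf), quorum(rf), rf))  # True
print(is_strongly_consistent(1, 1, rf))                    # False
```

With a replication factor of 3, reading and writing at QUORUM overlaps in at least one replica, while reading and writing at ONE trades that guarantee for availability and latency.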
2. Hinted Handoff and Read Repair
Hinted Handoff in Cassandra is used to: a) Ensure that data is replicated when a node is down b) Clean up expired data c) Improve query performance d) Merge tombstones during compaction
When is a hinted handoff stored in Cassandra? a) When a node is unreachable during a write operation b) When a read operation fails c) During scheduled maintenance of nodes d) After every read operation
What happens during the Read Repair process in Cassandra? a) Data on inconsistent nodes is repaired after a read request b) A write operation is rejected if a node is down c) Data is deleted from stale nodes d) Data is transferred between nodes during compaction
In Cassandra, what is the main purpose of Read Repair? a) To prevent write failures during network partitions b) To ensure all replicas have the most recent data c) To clean up data from tombstones d) To improve query performance during high traffic
Which of the following consistency levels supports Read Repair? a) ONE b) QUORUM c) LOCAL_QUORUM d) ALL
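The two mechanisms in this section can be modelled with a toy in-memory cluster. The sketch below is a simplification under stated assumptions: `Replica`, `Coordinator`, `replay_hints`, and `read_with_repair` are hypothetical names, conflicts are resolved by last-write-wins on an integer timestamp, and details such as hint expiry or digest reads are left out.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, str]  # (write timestamp, value)


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        self.data: Dict[str, Cell] = {}

    def apply(self, key: str, cell: Cell) -> None:
        # Last write wins: keep whichever cell has the newer timestamp.
        if key not in self.data or cell[0] > self.data[key][0]:
            self.data[key] = cell


class Coordinator:
    def __init__(self, replicas: List[Replica]) -> None:
        self.replicas = replicas
        self.hints: List[Tuple[Replica, str, Cell]] = []

    def write(self, key: str, cell: Cell) -> None:
        for replica in self.replicas:
            if replica.up:
                replica.apply(key, cell)
            else:
                # Hinted handoff: remember the write for the unreachable replica.
                self.hints.append((replica, key, cell))

    def replay_hints(self) -> None:
        pending = []
        for replica, key, cell in self.hints:
            if replica.up:
                replica.apply(key, cell)
            else:
                pending.append((replica, key, cell))
        self.hints = pending

    def read_with_repair(self, key: str) -> Optional[str]:
        live = [r for r in self.replicas if r.up]
        cells = [r.data[key] for r in live if key in r.data]
        if not cells:
            return None
        newest = max(cells)  # tuples compare by timestamp first
        for replica in live:
            replica.apply(key, newest)  # read repair: update stale replicas
        return newest[1]


a, b, c = Replica("a"), Replica("b"), Replica("c")
coordinator = Coordinator([a, b, c])

c.up = False
coordinator.write("k", (1, "v1"))  # c misses the write, so a hint is kept
c.up = True
coordinator.replay_hints()         # c now holds (1, "v1")

a.apply("k", (2, "v2"))            # simulate b and c missing a later write
print(coordinator.read_with_repair("k"))  # v2
print(b.data["k"], c.data["k"])           # (2, 'v2') (2, 'v2')
```

Hinted handoff covers writes a replica missed while it was unreachable; read repair fixes whatever divergence is noticed later, at read time.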
3. Anti-Entropy and Merkle Trees
Anti-entropy is a process used in Cassandra to: a) Repair and synchronize data between replicas b) Maintain the cluster’s consistency during network partitions c) Optimize read performance d) Encrypt data at rest
Which of the following is used during the Anti-Entropy process in Cassandra? a) Merkle Trees b) Time-series analysis c) Data compaction d) Backup snapshots
What is the role of Merkle Trees in Cassandra’s Anti-Entropy process? a) To maintain real-time analytics b) To compare and synchronize data between replicas c) To store historical data snapshots d) To perform joins between different tables
Merkle Trees help in identifying: a) The most recent data b) Data inconsistencies across replicas c) The correct schema version d) The status of disk I/O operations
During Anti-Entropy repair, Merkle Trees are used to: a) Identify which data needs to be repaired between replicas b) Sort data for more efficient queries c) Backup data for recovery d) Merge tombstones
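To make the Merkle tree idea concrete, here is a minimal sketch of comparing two replicas by hash tree. It is not Cassandra's implementation: `build_tree` and `diff_leaves` are hypothetical names, each leaf stands in for one range of data, and the number of leaves is assumed to be a power of two.

```python
import hashlib
from typing import List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves: List[bytes]) -> List[List[bytes]]:
    """Return tree levels, from leaf hashes (index 0) up to the root (last)."""
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        # Each parent hashes the concatenation of its two children.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def diff_leaves(a, b, level=None, index=0) -> List[int]:
    """Descend only into subtrees whose hashes differ; return differing leaf indexes."""
    if level is None:
        level = len(a) - 1  # start at the root
    if a[level][index] == b[level][index]:
        return []  # identical subtree, nothing to repair
    if level == 0:
        return [index]
    return (diff_leaves(a, b, level - 1, 2 * index)
            + diff_leaves(a, b, level - 1, 2 * index + 1))


replica_1 = build_tree([b"range0", b"range1", b"range2", b"range3"])
replica_2 = build_tree([b"range0", b"range1", b"STALE", b"range3"])
print(diff_leaves(replica_1, replica_2))  # [2]
```

Because matching subtrees are skipped after a single hash comparison, only the ranges that actually differ need to be exchanged between replicas.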
4. Internals of Write and Read Paths
The write path in Cassandra includes which of the following components? a) Commit Log b) Memtable c) SSTable d) All of the above
In Cassandra, data is first written to: a) SSTable b) Memtable c) Disk d) Commit Log
Which of the following is responsible for flushing data from the Memtable to SSTables? a) Read Repair b) Hinted Handoff c) Compaction d) Write Flush
The read path in Cassandra involves all of the following EXCEPT: a) Querying the Memtable b) Searching the Commit Log c) Searching the SSTable d) Checking the Bloom filter
When a read request is issued in Cassandra, it first checks: a) Commit Log b) Memtable c) Data Centers d) Tombstone markers
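The storage components named in this section fit together roughly as in the toy node below. It is a sketch under simplifying assumptions, not Cassandra's storage engine: `ToyNode` and `flush_threshold` are hypothetical, the flush trigger is a simple key count, and real reads also involve caches, Bloom filters, and timestamp-based merging.

```python
class ToyNode:
    def __init__(self, flush_threshold: int = 2) -> None:
        self.commit_log = []   # append-only record used for crash recovery
        self.memtable = {}     # in-memory, mutable
        self.sstables = []     # flushed snapshots, oldest first, never modified
        self.flush_threshold = flush_threshold

    def write(self, key, value) -> None:
        self.commit_log.append((key, value))  # 1. durable append
        self.memtable[key] = value            # 2. in-memory update
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # 3. Persist the memtable as a sorted, immutable SSTable.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs the log

    def read(self, key):
        # Reads never consult the commit log.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest SSTable first
            if key in sstable:
                return sstable[key]
        return None


node = ToyNode(flush_threshold=2)
node.write("a", "1")
node.write("b", "2")   # second key triggers a flush
node.write("a", "3")   # newer value lives in the memtable
print(node.read("a"))  # 3
print(node.read("b"))  # 2
print(len(node.sstables), node.commit_log)  # 1 [('a', '3')]
```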
5. Compaction Strategies and Tombstones
In Cassandra, compaction is the process of: a) Reorganizing SSTables to improve read performance and reclaim disk space b) Synchronizing data between replicas c) Encrypting data d) Cleaning up expired data during read requests
Tombstones in Cassandra refer to: a) Deleted data markers that ensure consistency b) Backup data during compaction c) Data stored in an expired state d) A method of indexing large data sets
Which of the following is NOT a valid compaction strategy in Cassandra? a) Size-Tiered Compaction b) Leveled Compaction c) Time-Based Compaction d) Time-Window Compaction
Which compaction strategy is most suitable for write-heavy workloads? a) Size-Tiered Compaction b) Leveled Compaction c) Time-Based Compaction d) None of the above
Leveled Compaction in Cassandra helps to: a) Reduce disk usage by organizing SSTables into levels b) Store data in compressed form c) Handle high write throughput d) Optimize read repair
Tombstones are used in Cassandra to: a) Mark deleted data for removal in future compactions b) Indicate missing data during a read request c) Improve consistency during network partitions d) Store backup copies of data
How does Size-Tiered Compaction work in Cassandra? a) It merges small SSTables into larger ones when the number of files exceeds a threshold b) It organizes data into fixed-sized chunks c) It optimizes disk space for read-heavy applications d) It focuses on time-based data management
In which scenario would Leveled Compaction be most effective? a) In systems with high write-to-read ratios b) In systems with low storage requirements c) In systems with consistent read-heavy workloads d) In systems with burst traffic patterns
What is the main advantage of compaction in Cassandra? a) It ensures data integrity b) It reduces the number of SSTables and improves read performance c) It increases write throughput d) It accelerates replication
Tombstones can negatively affect Cassandra performance if: a) They are not handled properly during compaction b) They are used too frequently in the data model c) They are stored in multiple replicas d) They cause data redundancy
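The interaction between compaction and tombstones can be sketched as a merge that keeps the newest cell per key and drops tombstones once they are older than a grace period. This is a simplified model, not Cassandra's code: `compact` and `maybe_compact` are hypothetical names, timestamps are plain seconds, and the size bucketing that Size-Tiered Compaction uses to pick candidates is omitted, so the trigger here is only the number of SSTables. The defaults mirror Cassandra's usual `min_threshold` of 4 and `gc_grace_seconds` of 864000 (10 days).

```python
TOMBSTONE = None  # a deleted cell is stored as (timestamp, TOMBSTONE)


def compact(sstables, now, gc_grace_seconds):
    """Merge SSTables, keeping the newest cell per key and purging old tombstones."""
    merged = {}
    for table in sstables:
        for key, (ts, value) in table.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return {
        key: (ts, value)
        for key, (ts, value) in merged.items()
        if not (value is TOMBSTONE and now - ts > gc_grace_seconds)
    }


def maybe_compact(sstables, now, min_threshold=4, gc_grace_seconds=864000):
    """Compact only once enough SSTables have accumulated."""
    if len(sstables) < min_threshold:
        return sstables
    return [compact(sstables, now, gc_grace_seconds)]


tables = [
    {"a": (1, "x")},
    {"a": (2, TOMBSTONE)},  # "a" was deleted at t=2
    {"b": (3, "y")},
    {"c": (4, "z")},
]

# Inside the grace period the tombstone survives and keeps shadowing "a".
print(maybe_compact(tables, now=10))
# [{'a': (2, None), 'b': (3, 'y'), 'c': (4, 'z')}]

# After the grace period both the old value and the tombstone are gone.
print(maybe_compact(tables, now=2 + 864001))
# [{'b': (3, 'y'), 'c': (4, 'z')}]
```

Until a tombstone is purged, reads still have to scan past it, which is why delete-heavy data models can slow reads down.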
Answer Key
Q No. | Answer
1 | a) The number of nodes that need to acknowledge a read or write request for it to be considered successful
2 | d) TRANSACTIONAL
3 | a) A majority of replicas
4 | a) ONE
5 | b) ONE
6 | a) Ensure that data is replicated when a node is down
7 | a) When a node is unreachable during a write operation
8 | a) Data on inconsistent nodes is repaired after a read request
9 | b) To ensure all replicas have the most recent data
10 | b) QUORUM
11 | a) Repair and synchronize data between replicas
12 | a) Merkle Trees
13 | b) To compare and synchronize data between replicas
14 | b) Data inconsistencies across replicas
15 | a) Identify which data needs to be repaired between replicas
16 | d) All of the above
17 | d) Commit Log
18 | d) Write Flush
19 | b) Searching the Commit Log
20 | b) Memtable
21 | a) Reorganizing SSTables to improve read performance and reclaim disk space
22 | a) Deleted data markers that ensure consistency
23 | c) Time-Based Compaction
24 | a) Size-Tiered Compaction
25 | a) Reduce disk usage by organizing SSTables into levels
26 | a) Mark deleted data for removal in future compactions
27 | a) It merges small SSTables into larger ones when the number of files exceeds a threshold
28 | c) In systems with consistent read-heavy workloads
29 | b) It reduces the number of SSTables and improves read performance
30 | a) They are not handled properly during compaction