Cassandra is a highly scalable, distributed NoSQL database designed to handle large amounts of data across many commodity servers. Chapter 2 covers its key concepts: peer-to-peer architecture, data distribution, replication, and consistency, along with supporting mechanisms such as the gossip protocol, commit logs, and SSTables. Together, these principles give Cassandra its reliability and scalability.
Peer-to-Peer Distributed System
What is the fundamental architecture of Cassandra? a) Master-Slave b) Peer-to-Peer c) Client-Server d) Tree-based
In Cassandra, data is distributed across: a) Master nodes b) Slave nodes c) Peer nodes d) Database clusters
What does the peer-to-peer architecture in Cassandra ensure? a) High availability and fault tolerance b) Centralized management c) Low storage requirements d) Limited scalability
In a peer-to-peer system, every node in Cassandra is: a) A master node b) Responsible for a specific data range c) Equal and independent d) A slave node
Which of the following is true about the communication between nodes in Cassandra? a) Communication happens through a single central server b) Nodes exchange data in a peer-to-peer manner c) Each node is connected to only one other node d) Data transfer only happens on the master node
Partitions and Data Distribution
Cassandra stores data in what form? a) Tables b) Rows and columns c) Partitions d) Data blocks
How does Cassandra distribute data across nodes? a) By using a hashing mechanism b) By using a time-based approach c) By distributing data randomly d) By storing data in specific data centers
Which component is used to determine where data should be stored on a node? a) Consistent Hashing b) Data Masking c) Distributed Indexing d) Sharding Algorithm
What is a partition key in Cassandra? a) A key used for encryption b) A key that determines the distribution of data c) A key for data compression d) A key used for indexing
The process of dividing data into partitions in Cassandra helps with: a) Improved data security b) Scalability and performance c) Data backup management d) Faster queries
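Study aid: the idea behind partition keys and consistent hashing can be sketched in a few lines of Python. This is a minimal illustration, not Cassandra's implementation: the class HashRing, the function token_for, the node names, and the use of MD5 are all assumptions made for the example (Cassandra's default partitioner is Murmur3-based, and real clusters typically assign many tokens per node).

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Hash a key to an integer token (MD5 is an illustrative stand-in)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class HashRing:
    """Toy ring: each node owns the token range that ends at its own token."""

    def __init__(self, nodes):
        # One token per node, derived from the node name (hypothetical scheme).
        self._ring = sorted((token_for(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def owner(self, partition_key: str) -> str:
        # Walk clockwise to the first node token >= the key's token,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self._tokens, token_for(partition_key))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
placement = {key: ring.owner(key) for key in ("user:1", "user:2", "user:3")}
```

The same partition key always maps to the same node, and adding a node to the ring only moves the keys that fall into the new node's range, which is why this scheme supports scaling out.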
Replication and Consistency
Cassandra uses replication to: a) Ensure data consistency b) Improve query performance c) Handle system failures d) All of the above
What replication factor is commonly recommended for a production Cassandra keyspace? a) 1 b) 2 c) 3 d) 5
In Cassandra, consistency level refers to: a) The number of replicas of data b) The freshness of the data c) The level of agreement required between nodes for an operation d) The speed of queries
Which of the following consistency levels in Cassandra ensures that a write is acknowledged by all nodes? a) ONE b) QUORUM c) ALL d) LOCAL_QUORUM
What happens when a node in Cassandra becomes unavailable? a) Data becomes inconsistent b) The node is removed permanently c) Replication ensures data availability d) Data is lost
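Study aid: the relationship between replication factor and consistency level comes down to simple arithmetic. The sketch below assumes a single data center and covers only the ONE, QUORUM, and ALL levels (LOCAL_QUORUM depends on the per-data-center replication factor and is left out). The function names are hypothetical helpers, not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """A quorum is a strict majority of the replicas."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must acknowledge an operation at this level."""
    level = level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"level not covered by this sketch: {level}")


def tolerated_failures(level: str, replication_factor: int) -> int:
    """Replicas that can be down while the operation still succeeds."""
    return replication_factor - replicas_required(level, replication_factor)


def reads_see_latest_write(write_level: str, read_level: str, rf: int) -> bool:
    """Read and write replica sets must overlap: R + W > RF."""
    return replicas_required(write_level, rf) + replicas_required(read_level, rf) > rf
```

With a replication factor of 3, QUORUM needs 2 acknowledgements and tolerates one unavailable replica, while ALL needs all 3 and tolerates none. Writing and reading at QUORUM satisfies R + W > RF; writing and reading at ONE does not.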
Gossip Protocol and Failure Detection
What is the Gossip Protocol in Cassandra used for? a) Data encryption b) Node communication and failure detection c) Query optimization d) Load balancing
In Cassandra, the Gossip Protocol helps nodes to: a) Track each other’s state b) Share data between regions c) Generate encryption keys d) Cache query results
Failure detection in Cassandra is managed by: a) Centralized monitoring tools b) Gossip Protocol c) Manual node inspection d) Hardware redundancy
What happens when a node fails in Cassandra? a) Data is lost permanently b) The node is immediately replaced c) The system continues to work with replicas d) All nodes go down
How does Cassandra detect failed nodes? a) Using a heartbeat signal b) Through a leader election process c) By querying the system logs d) Using the Gossip Protocol
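Study aid: the following toy simulation shows how gossip lets nodes track each other's state and notice a silent peer. It is a deliberately simplified sketch: the Node class, its methods, and the fixed timeout are assumptions made for illustration. Cassandra itself uses an accrual-style failure detector that adapts to observed heartbeat intervals rather than a fixed timeout.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    heartbeat: int = 0
    # peer name -> (highest heartbeat seen, local tick when it last increased)
    view: dict = field(default_factory=dict)

    def tick(self, now: int) -> None:
        """Advance this node's own heartbeat counter."""
        self.heartbeat += 1
        self.view[self.name] = (self.heartbeat, now)

    def gossip_with(self, peer: "Node", now: int) -> None:
        """Exchange views; each side keeps the newer heartbeat per node."""
        for src, dst in ((self, peer), (peer, self)):
            for name, (beat, _) in list(src.view.items()):
                known_beat, _ = dst.view.get(name, (-1, now))
                if beat > known_beat:
                    dst.view[name] = (beat, now)

    def suspected_down(self, now: int, timeout: int) -> set:
        """Peers whose heartbeat has not advanced within the timeout."""
        return {name for name, (_, seen) in self.view.items()
                if name != self.name and now - seen > timeout}


a, b, c = Node("a"), Node("b"), Node("c")
for now in range(1, 11):
    a.tick(now)
    b.tick(now)
    if now <= 3:
        c.tick(now)            # c stops heartbeating after tick 3
        a.gossip_with(c, now)
    a.gossip_with(b, now)

down = b.suspected_down(now=10, timeout=5)
```

Node b never talks to c directly. It learns c's heartbeat second-hand from a, and once that heartbeat stops advancing, b marks c as suspected down without any central monitor.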
Commit Logs and SSTables
What is the purpose of the commit log in Cassandra? a) To store query results b) To log every write operation for durability c) To manage node communication d) To manage backups
Commit logs are written to disk before: a) Data is stored in memory b) Data is read by users c) Data is replicated d) Data is written to SSTables
What is the structure used by Cassandra to store data on disk? a) Filesystems b) SSTables c) Tablespaces d) Data partitions
What is an SSTable in Cassandra? a) A file containing immutable data b) A temporary storage format c) A file that stores commit logs d) A type of memory structure
SSTables are used in Cassandra because they allow for: a) Frequent data updates b) Efficient data reads and writes c) In-memory data caching d) Enhanced encryption
After a write operation in Cassandra, data is first stored in: a) SSTables b) Commit logs c) Memory tables (Memtables) d) Backup files
When does Cassandra flush data from memory to an SSTable? a) When the memory table exceeds a certain size b) Every 10 minutes c) When the system reaches a consistent state d) Only during backups
What happens to an SSTable during compaction in Cassandra? a) It is automatically deleted b) It is merged with other SSTables c) It is backed up immediately d) It becomes read-only
What is a key benefit of using SSTables in Cassandra? a) Quick node replacement b) Efficient disk storage and read access c) Faster network replication d) Simple encryption of data
How does Cassandra manage updates to data in SSTables? a) It directly overwrites old data b) It marks old data for deletion and writes new data c) It moves old data to a backup file d) It compresses old data
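Study aid: the write path covered in this section (commit log, then memtable, then flush to an immutable SSTable, then compaction) can be modeled with a small in-memory class. Everything here is a hypothetical sketch: TinyStore, TOMBSTONE, and the entry-count flush threshold are invented for illustration. Real Cassandra resolves versions by write timestamp, flushes based on memory usage, and only purges deletion markers after a grace period.

```python
from types import MappingProxyType

TOMBSTONE = object()  # marker meaning "this key was deleted"


class TinyStore:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of unflushed writes
        self.memtable = {}     # mutable, in-memory
        self.sstables = []     # immutable snapshots, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log first, for durability
        self.memtable[key] = value            # 2. then update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. flush once the limit is hit

    def delete(self, key):
        self.write(key, TOMBSTONE)  # a delete is a write of a marker

    def flush(self):
        snapshot = dict(sorted(self.memtable.items()))
        self.sstables.append(MappingProxyType(snapshot))  # read-only view
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs the log

    def read(self, key):
        # Newest data wins: memtable first, then SSTables newest to oldest.
        for table in [self.memtable, *reversed(self.sstables)]:
            if key in table:
                value = table[key]
                return None if value is TOMBSTONE else value
        return None

    def compact(self):
        merged = {}
        for table in self.sstables:  # oldest first, so newer values overwrite
            merged.update(table)
        live = {k: v for k, v in sorted(merged.items()) if v is not TOMBSTONE}
        self.sstables = [MappingProxyType(live)]


store = TinyStore(memtable_limit=2)
store.write("k1", "v1")
store.write("k2", "v2")      # memtable full, flushed to the first SSTable
store.write("k1", "v1-new")  # newer version, never overwrites the old file
store.delete("k2")           # tombstone; memtable full, second SSTable
store.compact()              # merge both SSTables, drop obsolete data
```

Updates and deletes never modify an existing SSTable. They are written as newer entries, and the old versions disappear only when compaction merges the files.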
Answers
Qno. Answer
1. b) Peer-to-Peer
2. c) Peer nodes
3. a) High availability and fault tolerance
4. c) Equal and independent
5. b) Nodes exchange data in a peer-to-peer manner
6. c) Partitions
7. a) By using a hashing mechanism
8. a) Consistent Hashing
9. b) A key that determines the distribution of data
10. b) Scalability and performance
11. d) All of the above
12. c) 3
13. c) The level of agreement required between nodes for an operation
14. c) ALL
15. c) Replication ensures data availability
16. b) Node communication and failure detection
17. a) Track each other’s state
18. b) Gossip Protocol
19. c) The system continues to work with replicas
20. d) Using the Gossip Protocol
21. b) To log every write operation for durability
22. d) Data is written to SSTables
23. b) SSTables
24. a) A file containing immutable data
25. b) Efficient data reads and writes
26. c) Memory tables (Memtables)
27. a) When the memory table exceeds a certain size
28. b) It is merged with other SSTables
29. b) Efficient disk storage and read access
30. b) It marks old data for deletion and writes new data