Apache Kafka is a powerful platform for building real-time data pipelines. Chapter 7 focuses on Kafka Connect, a framework that simplifies integration with external systems. Learn about source and sink connectors, configuring and managing connectors, data transformations with Single Message Transformations (SMTs), and integrating Kafka with databases, Hadoop, and cloud services. These Apache Kafka MCQs will help you understand key concepts in integration and data movement, preparing you for exams, certifications, or practical implementation.
Multiple-Choice Questions (MCQs)
Introduction to Kafka Connect API
1. What is Kafka Connect primarily used for?
a) Managing Kafka partitions
b) Building and running data pipelines
c) Monitoring Kafka clusters
d) Writing producer and consumer applications

2. Which component of Kafka does Kafka Connect interact with?
a) Topics
b) Brokers
c) Zookeeper
d) Consumer groups

3. What programming language is Kafka Connect written in?
a) Python
b) Scala
c) Java
d) C++

4. Kafka Connect supports:
a) Only source connectors
b) Only sink connectors
c) Both source and sink connectors
d) Neither source nor sink connectors

5. What is a major advantage of Kafka Connect?
a) It simplifies integration with external systems
b) It improves Kafka partitioning
c) It enhances Zookeeper performance
d) It supports Kafka topic replication
Source and Sink Connectors
6. What is a source connector used for?
a) Reading data from external systems into Kafka
b) Writing data from Kafka to external systems
c) Transforming messages within Kafka
d) Monitoring Kafka topics

7. Which of the following is an example of a sink connector?
a) JDBC Source Connector
b) Elasticsearch Sink Connector
c) Kafka Streams Processor
d) Producer API

8. How are connectors deployed in Kafka Connect?
a) As standalone or distributed workers
b) Using Zookeeper configuration
c) Via the Kafka producer API
d) Through a consumer group

9. What format is typically used for connector configuration?
a) XML
b) JSON
c) YAML
d) CSV

10. What ensures fault tolerance in Kafka Connect?
a) Distributed worker mode
b) Standalone worker mode
c) Topic compaction
d) Consumer offset reset
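To make questions 6-10 concrete, here is a minimal sketch of a source connector configuration in JSON, in the form accepted by a distributed Connect cluster. It uses the FileStreamSource connector that ships with Apache Kafka; the connector name, file path, and topic are illustrative placeholders.

{
  "name": "local-file-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/input.txt",
    "topic": "file-lines"
  }
}

A sink connector configuration looks the same, except that it consumes from Kafka (a "topics" property) and writes to the external system.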
Connector Configuration and Management
11. Which property is essential in a connector configuration?
a) group.id
b) key.converter
c) zookeeper.connect
d) auto.offset.reset

12. How are connectors scaled in distributed mode?
a) By adding more brokers
b) By adding more worker nodes
c) By increasing topic partitions
d) By using consumer groups

13. What is the REST API used for in Kafka Connect?
a) Managing and monitoring connectors
b) Creating Kafka topics
c) Configuring producer and consumer settings
d) Allocating broker resources

14. What is the primary difference between standalone and distributed mode in Kafka Connect?
a) Distributed mode supports scaling and fault tolerance
b) Standalone mode allows multiple connectors
c) Standalone mode uses Zookeeper for management
d) Distributed mode is limited to single-node deployments

15. What happens when a connector fails in distributed mode?
a) It automatically restarts on another worker node
b) Kafka topics are recreated
c) All workers shut down
d) Producers stop sending messages
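Questions 11-15 revolve around the Connect REST API, which every worker serves (port 8083 by default). Connectors are created by POSTing a JSON payload to /connectors, inspected with GET /connectors/<name>/status, and removed with DELETE /connectors/<name>. A sketch of such a payload, using the FileStreamSink connector bundled with Kafka and illustrative converter settings, might look like this:

{
  "name": "local-file-sink",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "tasks.max": "1",
    "topics": "file-lines",
    "file": "/tmp/output.txt",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false"
  }
}

In distributed mode this configuration is stored in internal Kafka topics, so if the worker running its tasks fails, another worker in the same group picks them up.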
Data Transformation and SMT (Single Message Transformations)
16. What is the purpose of Single Message Transformations (SMTs)?
a) Modify individual messages before they are sent to the destination
b) Create Kafka topics dynamically
c) Change the structure of topic partitions
d) Configure Zookeeper nodes

17. Which of the following is an example of an SMT transformation?
a) InsertField
b) DeleteTopic
c) CreateBroker
d) OffsetReset

18. How are SMTs configured in Kafka Connect?
a) As part of the connector configuration
b) Through Zookeeper settings
c) Using Kafka consumer properties
d) By modifying the topic schema

19. What is the role of key.converter in Kafka Connect?
a) Converts the key format of Kafka records
b) Manages topic partitioning
c) Tracks consumer group offsets
d) Configures replication factor

20. What type of data transformation does the InsertField SMT perform?
a) Adds fields to the record key or value
b) Removes fields from Kafka topics
c) Modifies partition configurations
d) Replicates data across brokers
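For questions 16-20, remember that SMTs are declared inside the connector configuration itself. Below is a minimal sketch of an InsertField transformation that adds a static field to every record value; the transform alias (addSource) and the field name and value are illustrative.

{
  "transforms": "addSource",
  "transforms.addSource.type": "org.apache.kafka.connect.transforms.InsertField$Value",
  "transforms.addSource.static.field": "source_system",
  "transforms.addSource.static.value": "orders-db"
}

These keys are merged into the connector's "config" section. The $Value suffix applies the transformation to the record value, while $Key would target the record key.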
Integrating Kafka with Databases, Hadoop, and Cloud Services
21. What is required to integrate Kafka with a relational database?
a) A JDBC connector
b) A topic replication policy
c) A Zookeeper instance
d) A Kafka consumer API

22. How does Kafka integrate with Hadoop?
a) Using HDFS sink connectors
b) Through Zookeeper synchronization
c) By directly modifying YARN configurations
d) Via Spark streaming jobs

23. What is a common cloud service used with Kafka for analytics?
a) Amazon Redshift
b) Kafka Streams
c) Zookeeper Nodes
d) Elasticsearch

24. How does Kafka Connect support cloud-based storage systems like S3?
a) Through sink connectors
b) By configuring topic partitions
c) Using the Kafka producer API
d) With a standalone worker mode

25. What is the role of Kafka in IoT data processing?
a) Collecting and streaming data from IoT devices
b) Performing in-memory computation
c) Handling Zookeeper state
d) Configuring broker logs

26. Which configuration helps in connecting Kafka with a NoSQL database like MongoDB?
a) MongoDB Sink Connector
b) Elasticsearch Source Connector
c) Kafka Streams API
d) Consumer Group Offset

27. What is the primary benefit of using Kafka with cloud services?
a) Scalability and integration flexibility
b) Reduced Zookeeper dependency
c) Elimination of brokers
d) Lower replication costs

28. What is the purpose of a sink connector for Google BigQuery?
a) Write data from Kafka topics into BigQuery tables
b) Stream data directly to Zookeeper
c) Monitor producer activity
d) Backup broker logs

29. How does Kafka Connect ensure data integrity during integration?
a) By tracking offsets in Kafka topics
b) Using consumer group rebalancing
c) Through Zookeeper failover
d) With topic compaction

30. What is the benefit of Kafka Connect's REST API for integration?
a) Simplified management of connectors
b) Enhanced producer performance
c) Improved Zookeeper configuration
d) Lower message latency
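As a sketch of the database integration asked about in questions 21-30, the JSON below configures a JDBC source connector that streams rows from a relational table into Kafka. The class and property names follow the widely used Confluent JDBC connector, which is a separately installed plugin rather than part of Apache Kafka itself, and the connection URL, credentials, column, and topic prefix are placeholders.

{
  "name": "jdbc-orders-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "tasks.max": "1",
    "connection.url": "jdbc:postgresql://db-host:5432/orders",
    "connection.user": "kafka_connect",
    "connection.password": "secret",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "orders-db-"
  }
}

Sink connectors for HDFS, S3, MongoDB, or BigQuery follow the same pattern: install the connector plugin, point it at the external system, and let Connect track delivery progress through offsets stored in Kafka.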
Answers
1. b) Building and running data pipelines
2. a) Topics
3. c) Java
4. c) Both source and sink connectors
5. a) It simplifies integration with external systems
6. a) Reading data from external systems into Kafka
7. b) Elasticsearch Sink Connector
8. a) As standalone or distributed workers
9. b) JSON
10. a) Distributed worker mode
11. b) key.converter
12. b) By adding more worker nodes
13. a) Managing and monitoring connectors
14. a) Distributed mode supports scaling and fault tolerance
15. a) It automatically restarts on another worker node
16. a) Modify individual messages before they are sent to the destination
17. a) InsertField
18. a) As part of the connector configuration
19. a) Converts the key format of Kafka records
20. a) Adds fields to the record key or value
21. a) A JDBC connector
22. a) Using HDFS sink connectors
23. a) Amazon Redshift
24. a) Through sink connectors
25. a) Collecting and streaming data from IoT devices
26. a) MongoDB Sink Connector
27. a) Scalability and integration flexibility
28. a) Write data from Kafka topics into BigQuery tables
29. a) By tracking offsets in Kafka topics
30. a) Simplified management of connectors