Let's dive into 50 scenario-based multiple-choice questions (MCQs) on Azure Data Factory.
1. Working with Data Sources and Sinks
You need to copy data from an Azure SQL Database to a Blob storage using Azure Data Factory. What is the correct source and sink configuration?
a) Azure SQL Database as source, Azure Data Lake as sink
b) Azure SQL Database as source, Azure Blob Storage as sink
c) Azure Blob Storage as source, Azure SQL Database as sink
d) Azure Data Lake as source, Azure SQL Database as sink
In Azure Data Factory, which connector should you use when copying data from an on-premises SQL Server to Azure Blob Storage?
a) SQL Server connector
b) ODBC connector
c) Azure Blob Storage connector
d) On-premises data gateway
You are tasked with loading data from a CSV file stored in Azure Blob Storage into an Azure SQL Database. What activity will you use in your pipeline?
a) Copy data activity
b) Data flow activity
c) Execute SQL activity
d) Stored procedure activity
You want to read data from an Azure Cosmos DB container and write it to an Azure Data Lake Store. Which is the best sink to use?
a) Cosmos DB sink
b) Azure Data Lake Store sink
c) Blob Storage sink
d) SQL Database sink
When using the Copy activity in Azure Data Factory, what can be used to connect a source and sink if they have different structures?
a) Data Flow
b) Mapping Data Flow
c) Pipeline Activity
d) Data Lake Storage Gen2 connector
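For reference, the source/sink pairings in this section map directly onto a Copy activity definition. Below is a minimal sketch of that JSON expressed as a Python dict; the dataset names are hypothetical placeholders, and the exact source/sink properties vary by connector.

```python
import json

# Minimal sketch of a Copy activity (Azure SQL source -> Blob sink), mirroring
# ADF pipeline JSON. "SqlInputDataset" and "BlobOutputDataset" are hypothetical.
copy_activity = {
    "name": "CopySqlToBlob",
    "type": "Copy",
    "inputs": [{"referenceName": "SqlInputDataset", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "BlobOutputDataset", "type": "DatasetReference"}],
    "typeProperties": {
        "source": {"type": "AzureSqlSource"},  # reads from Azure SQL Database
        "sink": {"type": "BlobSink"},          # writes to Azure Blob Storage
    },
}

print(json.dumps(copy_activity, indent=2))
```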
2. Pipelines and Activities
You are building a pipeline that requires parallel execution of several activities. Which Azure Data Factory feature should you use?
a) ForEach activity
b) Execute Pipeline activity
c) Parallel activity
d) Wait activity
You need to implement conditional logic within an Azure Data Factory pipeline. Which activity can you use?
a) Execute Pipeline
b) If Condition activity
c) Wait activity
d) Until activity
Which of the following is true when configuring a ForEach activity in a pipeline?
a) It executes only once
b) It executes multiple times based on a list or array
c) It can execute a sub-pipeline only
d) It requires a timeout setting
Which type of activity is used to run a stored procedure as part of an Azure Data Factory pipeline?
a) Execute SQL Activity
b) Stored Procedure Activity
c) Data Flow Activity
d) Web Activity
Which activity in Azure Data Factory can be used to call external APIs during pipeline execution?
a) Web Activity
b) Execute Pipeline Activity
c) Data Flow Activity
d) Wait Activity
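As background for this section, here is a rough sketch of the two control-flow activities the questions lean on: a ForEach that fans out in parallel, and an If Condition that branches on an expression. Activity and parameter names are hypothetical.

```python
import json

# ForEach: iterates over a list; isSequential=False runs iterations in parallel.
for_each = {
    "name": "ProcessEachFile",
    "type": "ForEach",
    "typeProperties": {
        "items": {"value": "@pipeline().parameters.fileList", "type": "Expression"},
        "isSequential": False,  # False => parallel execution of iterations
        "activities": [
            # inner activity shown without its inputs/outputs for brevity
            {"name": "CopyOneFile", "type": "Copy", "typeProperties": {}}
        ],
    },
}

# If Condition: evaluates an expression and runs one of two activity branches.
if_condition = {
    "name": "CheckRowCount",
    "type": "IfCondition",
    "typeProperties": {
        "expression": {
            "value": "@greater(activity('LookupRows').output.count, 0)",
            "type": "Expression",
        },
        "ifTrueActivities": [],   # runs when the expression is true
        "ifFalseActivities": [],  # runs otherwise
    },
}

print(json.dumps([for_each, if_condition], indent=2))
```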
3. Data Flows and Transformations (Basic)
You need to perform basic transformations like filtering, aggregating, and joining data in Azure Data Factory. Which tool should you use?
a) Mapping Data Flow
b) Data Lake Storage
c) SQL Query Activity
d) Custom Activity
In a Mapping Data Flow, which transformation would you use to combine data from two different datasets?
a) Aggregate
b) Join
c) Filter
d) Derived Column
What transformation should you use in Azure Data Factory to apply business rules such as changing column values based on conditions?
a) Select
b) Derived Column
c) Aggregate
d) Filter
Which activity can be used to transform data in Azure Data Factory before loading it into the sink?
a) Data Flow Activity
b) Execute SQL Activity
c) Copy Activity
d) Pipeline Activity
When performing transformations in Mapping Data Flow, which transformation would you use to remove duplicate rows from your data?
a) Aggregate
b) Deduplicate
c) Filter
d) Select
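To make these transformations concrete, here are a few illustrative snippets in the Mapping Data Flow expression language. The column names (amount, region) are hypothetical, and the strings are shown rather than executed.

```python
# Example expressions of the kind each transformation would carry.
expressions = {
    # Filter: keep only rows matching a condition
    "filter": "amount >= 50",
    # Derived Column: apply a business rule conditionally
    "derived_column": "iif(isNull(region), 'unknown', region)",
    # Aggregate: summarize after a group-by on a key column
    "aggregate": "sum(amount)",
}

for transformation, expr in expressions.items():
    print(f"{transformation}: {expr}")
```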
4. Data Flows and Transformations (Advanced)
To optimize large-scale transformations in Azure Data Factory, which type of transformation should you use?
a) Data Flow with pushdown computation
b) Mapping Data Flow with Spark clusters
c) Data Flow with custom code
d) Copy activity with transformations
You need to scale the execution of transformations in Azure Data Factory. Which feature allows you to scale your data flows?
a) Data Flow with scaling options
b) Data Lake Gen2 Storage
c) Azure SQL Database scaling
d) Parallel activity
Which transformation in Azure Data Factory can be used to add new columns to a data flow without modifying the original source data?
a) Join
b) Derived Column
c) Filter
d) Select
When applying transformations in Azure Data Factory, which mode should you choose if you want to run transformations on your data using Spark clusters?
a) Data Flow
b) Mapping Data Flow
c) Data Lake Analytics
d) Batch Processing Mode
Which of the following transformations can be used to change the data type of a column in Azure Data Factory Data Flows?
a) Derived Column transformation
b) Select transformation
c) Aggregate transformation
d) Join transformation
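Since this section turns on scaling data flows, here is a sketch of the Execute Data Flow activity with the compute settings that size the Spark cluster the flow runs on. The data flow name and core count are hypothetical; valid sizes depend on the Azure integration runtime.

```python
import json

# Execute Data Flow activity: the "compute" block controls the Spark cluster.
execute_data_flow = {
    "name": "RunTransformations",
    "type": "ExecuteDataFlow",
    "typeProperties": {
        "dataFlow": {"referenceName": "TransformSales", "type": "DataFlowReference"},
        "compute": {
            "computeType": "General",  # e.g. General or MemoryOptimized
            "coreCount": 16,           # more cores => more Spark parallelism
        },
    },
}

print(json.dumps(execute_data_flow, indent=2))
```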
5. Orchestration and Workflow Management
You want to trigger a pipeline in Azure Data Factory every time new data is added to a specific folder in Azure Blob Storage. Which type of trigger should you use?
a) Schedule Trigger
b) Tumbling Window Trigger
c) Event-based Trigger
d) Manual Trigger
What is the best way to handle complex dependencies and workflow execution in an Azure Data Factory pipeline?
a) Use the Execute Pipeline activity
b) Use triggers and dependency conditions
c) Use the If Condition activity
d) Use ForEach activity
You want to ensure that a pipeline in Azure Data Factory runs on a specific schedule. Which trigger should you use?
a) Event-based Trigger
b) Schedule Trigger
c) Tumbling Window Trigger
d) Custom Trigger
Which feature in Azure Data Factory allows you to handle the execution flow of multiple pipelines and activities?
a) Data Flows
b) Pipeline Activities
c) Control Flow
d) Data Lake Analytics
You need to chain multiple pipelines to execute in sequence in Azure Data Factory. What activity can help with this orchestration?
a) Execute Pipeline activity
b) ForEach activity
c) If Condition activity
d) Wait activity
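For orientation, here are minimal sketches of the two trigger types this section contrasts. The folder path, storage account reference, and schedule values are hypothetical placeholders.

```python
import json

# Schedule trigger: fires on a fixed recurrence.
schedule_trigger = {
    "name": "DailyAt6",
    "type": "ScheduleTrigger",
    "typeProperties": {
        "recurrence": {
            "frequency": "Day",  # Minute | Hour | Day | Week | Month
            "interval": 1,
            "startTime": "2024-01-01T06:00:00Z",
            "timeZone": "UTC",
        }
    },
}

# Blob event trigger: fires when a blob is created under a path.
event_trigger = {
    "name": "OnNewBlob",
    "type": "BlobEventsTrigger",
    "typeProperties": {
        "blobPathBeginsWith": "/incoming/blobs/",
        "events": ["Microsoft.Storage.BlobCreated"],
        "scope": "<storage-account-resource-id>",  # hypothetical placeholder
    },
}

print(json.dumps([schedule_trigger, event_trigger], indent=2))
```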
6. Real-Time and Incremental Data Processing
To implement real-time data ingestion in Azure Data Factory, which trigger type should you use?
a) Event-based Trigger
b) Schedule Trigger
c) Tumbling Window Trigger
d) Custom Trigger
For an incremental data load scenario, which property should you configure in Azure Data Factory?
a) Change Data Capture (CDC)
b) Data Flow parameters
c) Batch processing
d) Event-based Trigger
Which of the following methods can be used to efficiently process large amounts of incremental data in Azure Data Factory?
a) Use Tumbling Window Triggers
b) Use Change Data Capture (CDC)
c) Use Data Flow
d) Use a combination of Manual Triggers
What strategy can you use to update only the modified data from an Azure SQL Database to Azure Blob Storage?
a) Use incremental loads with Change Data Capture (CDC)
b) Full data refresh
c) Use data flows with filter transformations
d) Use event-based triggers
In Azure Data Factory, how can you monitor and ensure that real-time data processing is functioning properly?
a) Use Data Flow Debugging
b) Use pipeline monitoring and logs
c) Use Event-based Triggers
d) Use the Custom Trigger feature
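A common way to implement the incremental loads this section describes is the high-watermark pattern: look up the last processed watermark, copy only rows modified since, then update the watermark. A minimal sketch follows; the table, column, and activity names are hypothetical.

```python
# Source query that interpolates the previous watermark fetched by a Lookup
# activity (the @{...} block is an ADF expression evaluated at run time).
watermark_query = (
    "SELECT * FROM dbo.Sales "
    "WHERE LastModifiedDate > "
    "'@{activity('LookupOldWatermark').output.firstRow.Watermark}'"
)

copy_incremental = {
    "name": "CopyChangedRows",
    "type": "Copy",
    "typeProperties": {
        "source": {
            "type": "AzureSqlSource",
            "sqlReaderQuery": watermark_query,  # only rows past the watermark
        },
        "sink": {"type": "BlobSink"},
    },
}

print(copy_incremental["typeProperties"]["source"]["sqlReaderQuery"])
```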
7. Monitoring and Debugging
In Azure Data Factory, which feature allows you to troubleshoot and monitor your pipeline runs and data flow execution?
a) Debug activity
b) Monitoring dashboard
c) Data Flow Debugging
d) Logging and error handling
You have encountered a failed pipeline execution. Which tool would you use to investigate the cause of the failure?
a) Data Flow
b) Monitoring and Debugging section in ADF portal
c) Data Lake
d) Pipeline Trigger history
Which of the following actions should you take when debugging a pipeline in Azure Data Factory?
a) Check the Monitoring tab for failed activities
b) Manually rerun the pipeline without logs
c) Delete the pipeline and recreate it
d) Ignore the error and continue
In Azure Data Factory, what can be used to capture the detailed logs of activity execution during pipeline runs?
a) Monitoring
b) Data Flow Debugging
c) Pipeline execution logs
d) Custom code
What is the purpose of the “Fault Tolerance” option in Azure Data Factory?
a) To retry failed activities automatically
b) To scale transformations
c) To enable parallel execution of activities
d) To log pipeline errors
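To illustrate the retry and fault-tolerance behavior referenced above, here is a sketch of an activity "policy" block plus the Copy activity setting that skips incompatible rows instead of failing the run. The values shown are hypothetical.

```python
import json

copy_with_retries = {
    "name": "ResilientCopy",
    "type": "Copy",
    "policy": {
        "retry": 3,                    # retry a failed run up to 3 times
        "retryIntervalInSeconds": 60,  # wait between retries
        "timeout": "0.02:00:00",       # d.hh:mm:ss
    },
    "typeProperties": {
        "source": {"type": "AzureSqlSource"},
        "sink": {"type": "BlobSink"},
        "enableSkipIncompatibleRow": True,  # fault tolerance: skip bad rows
    },
}

print(json.dumps(copy_with_retries, indent=2))
```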
8. Security and Compliance
Which feature can help you control access to resources in Azure Data Factory?
a) Azure Active Directory (AAD) authentication
b) Managed Identity
c) Role-Based Access Control (RBAC)
d) All of the above
Which of the following is a best practice to ensure secure data transfers in Azure Data Factory?
a) Use encrypted data stores
b) Use Managed Identity for authentication
c) Use Virtual Networks and Private Endpoints
d) All of the above
How can you secure sensitive information such as connection strings in Azure Data Factory pipelines?
a) Use Azure Key Vault integration
b) Use environment variables
c) Store them in plain text in the pipeline
d) Use local files
What is the role of a managed identity in Azure Data Factory?
a) To authenticate and authorize data access
b) To enable event-based triggers
c) To scale data flows
d) To debug pipeline errors
How can you ensure that only authorized users can execute pipelines in Azure Data Factory?
a) Configure role-based access control (RBAC)
b) Use encryption keys for execution
c) Use firewall rules to restrict access
d) Disable all triggers
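Tying this section together, here is a sketch of the Key Vault integration the questions point to: a linked service that pulls its connection string from Azure Key Vault instead of embedding it. The linked service and secret names are hypothetical.

```python
import json

sql_linked_service = {
    "name": "AzureSqlLS",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "KeyVaultLS",  # Key Vault linked service
                    "type": "LinkedServiceReference",
                },
                "secretName": "SqlConnectionString",
            }
        },
    },
}

print(json.dumps(sql_linked_service, indent=2))
```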
Answer Table

1. b) Azure SQL Database as source, Azure Blob Storage as sink
2. a) SQL Server connector (used with a self-hosted integration runtime)
3. a) Copy data activity
4. b) Azure Data Lake Store sink
5. b) Mapping Data Flow
6. a) ForEach activity
7. b) If Condition activity
8. b) It executes multiple times based on a list or array
9. b) Stored Procedure Activity
10. a) Web Activity
11. a) Mapping Data Flow
12. b) Join
13. b) Derived Column
14. a) Data Flow Activity
15. b) Deduplicate
16. b) Mapping Data Flow with Spark clusters
17. a) Data Flow with scaling options
18. b) Derived Column
19. b) Mapping Data Flow
20. a) Derived Column transformation
21. c) Event-based Trigger
22. b) Use triggers and dependency conditions
23. b) Schedule Trigger
24. c) Control Flow
25. a) Execute Pipeline activity
26. a) Event-based Trigger
27. a) Change Data Capture (CDC)
28. b) Use Change Data Capture (CDC)
29. a) Use incremental loads with Change Data Capture (CDC)
30. b) Use pipeline monitoring and logs
31. b) Monitoring dashboard
32. b) Monitoring and Debugging section in ADF portal
33. a) Check the Monitoring tab for failed activities
34. c) Pipeline execution logs
35. a) To retry failed activities automatically
36. d) All of the above
37. d) All of the above
38. a) Use Azure Key Vault integration
39. a) To authenticate and authorize data access
40. a) Configure role-based access control (RBAC)
Next, 50 scenario-based questions on Mapping Data Flow transformations and activities.
You want to filter out all records where the transaction amount is less than $50 from a dataset. Which transformation should you use?
a) Select
b) Filter
c) Aggregate
d) Join
In your data flow, you need to exclude rows that contain null values in a specific column. Which transformation will you apply?
a) Filter
b) Select
c) Derived Column
d) Union
You have a dataset with many columns, but you only need a few. Which transformation should you use to select specific columns?
a) Filter
b) Select
c) Derived Column
d) Lookup
You need to rename a few columns in a dataset before passing it to the next stage in the data flow. Which transformation would you use?
a) Pivot
b) Select
c) Lookup
d) Join
You want to sort customer data by their registration date in descending order. Which transformation should you use?
a) Sort
b) Filter
c) Select
d) Lookup
In a data flow, you need to sort products by price and category. Which transformation will you apply to accomplish this?
a) Sort
b) Join
c) Aggregate
d) Derived Column
You have two datasets: one with customer details and the other with their purchase history. You need to combine these datasets based on customer ID. Which transformation will you use?
a) Union
b) Join
c) Filter
d) Lookup
You need to join data from two different files with a common column but want to keep all rows from the first dataset and only matching rows from the second. Which join type should you use?
a) Left Outer Join
b) Inner Join
c) Right Outer Join
d) Cross Join
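To pin down the join semantics these two questions test, here is a plain-Python illustration: an inner join keeps only matching customer IDs, while a left outer join keeps every row from the first dataset. The sample data is hypothetical.

```python
customers = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Ben"}]
orders = [{"id": 1, "total": 120}]

orders_by_id = {o["id"]: o for o in orders}

# Inner join: only customers with a matching order survive.
inner = [{**c, **orders_by_id[c["id"]]} for c in customers if c["id"] in orders_by_id]

# Left outer join: every customer survives; non-matches get a null total.
left_outer = [{**c, **orders_by_id.get(c["id"], {"total": None})} for c in customers]

print(inner)       # only customer 1 (has a matching order)
print(left_outer)  # both customers; Ben gets total=None
```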
You have two datasets with the same schema, and you want to combine them into one dataset. Which transformation will you use?
a) Join
b) Union
c) Filter
d) Select
You are merging data from two different regions, each with its own file, into a single dataset. Which transformation is best for this scenario?
a) Union
b) Join
c) Pivot
d) Aggregate
You need to calculate the total sales for each product in your dataset. Which transformation will you use?
a) Lookup
b) Aggregate
c) Select
d) Join
In your dataset, you need to find the average salary for employees in each department. Which transformation would you apply?
a) Aggregate
b) Filter
c) Pivot
d) Union
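What the Aggregate transformation computes is a group-by followed by a summary function. A plain-Python equivalent of "total sales per product", with hypothetical sample rows:

```python
from collections import defaultdict

rows = [
    {"product": "A", "sales": 100},
    {"product": "B", "sales": 50},
    {"product": "A", "sales": 75},
]

# Group by product, then sum each group's sales.
totals = defaultdict(int)
for row in rows:
    totals[row["product"]] += row["sales"]

print(dict(totals))  # {'A': 175, 'B': 50}
```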
You need to create a new column that calculates the profit margin by subtracting the cost from the price. Which transformation will you use?
a) Derived Column
b) Select
c) Filter
d) Aggregate
In your dataset, you want to concatenate first name and last name into a full name. Which transformation will you use?
a) Join
b) Derived Column
c) Select
d) Lookup
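For the two Derived Column scenarios above, the expressions would look roughly like the following (data flow expression language; column names are hypothetical, and the strings are shown rather than evaluated):

```python
derived_columns = {
    # profit margin: subtract cost from price, relative to price
    "profitMargin": "(price - cost) / price",
    # full name: concatenate first and last name with a space
    "fullName": "concat(firstName, ' ', lastName)",
}

for name, expr in derived_columns.items():
    print(f"{name} = {expr}")
```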
You need to look up additional details, such as country code and currency, from a reference dataset based on the country name. Which transformation would you use?
a) Filter
b) Lookup
c) Aggregate
d) Derived Column
To enrich data with customer status from a lookup table, which transformation will you use in your data flow?
a) Lookup
b) Union
c) Aggregate
d) Pivot
You have sales data with one column for each month, but you need to convert these columns into a row-based format, where each row represents a month. Which transformation will you use?
a) Unpivot
b) Pivot
c) Join
d) Filter
You need to pivot customer data where each customer’s region becomes a new column. Which transformation should you use?
a) Pivot
b) Lookup
c) Join
d) Aggregate
You have a dataset with yearly sales data in separate columns, and you need to convert them into individual rows for each year. Which transformation will you use?
a) Unpivot
b) Pivot
c) Join
d) Aggregate
Your dataset has separate columns for quarters (Q1, Q2, Q3, Q4), but you want to unpivot them into rows for better analysis. Which transformation would you apply?
a) Pivot
b) Unpivot
c) Lookup
d) Join
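A quick plain-Python picture of what Unpivot does: month columns become (month, value) rows. The sample record is hypothetical.

```python
wide = {"product": "A", "Jan": 100, "Feb": 120, "Mar": 90}

# One output row per month column.
long_rows = [
    {"product": wide["product"], "month": m, "sales": wide[m]}
    for m in ("Jan", "Feb", "Mar")
]

print(long_rows)
# [{'product': 'A', 'month': 'Jan', 'sales': 100}, ...]
```

Pivot is the inverse: distinct values in a column become new columns.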
You need to route data based on a customer’s region: records from Europe go to one sink, and records from Asia go to another. Which transformation should you use?
a) Conditional Split
b) Filter
c) Aggregate
d) Join
You want to split your data based on age: people under 18 go to one pipeline, and those above 18 go to another. Which transformation will you use?
a) Conditional Split
b) Derived Column
c) Lookup
d) Filter
You need to mark records for deletion if the status is “inactive” before moving them to the destination. Which transformation will you use?
a) Alter Row
b) Select
c) Filter
d) Aggregate
You want to update specific rows in a dataset that have missing or invalid values. Which transformation is suitable for this task?
a) Alter Row
b) Select
c) Filter
d) Union
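The distinction these four questions draw can be sketched in plain Python: Conditional Split routes each row down one output stream, while Alter Row tags rows with an insert/update/delete policy for the sink to apply. Sample data is hypothetical.

```python
rows = [
    {"region": "Europe", "status": "active"},
    {"region": "Asia", "status": "inactive"},
]

# Conditional Split: each row goes to exactly one stream.
europe = [r for r in rows if r["region"] == "Europe"]  # -> Europe sink
asia = [r for r in rows if r["region"] == "Asia"]      # -> Asia sink

# Alter Row: mark inactive records for deletion, upsert the rest.
for r in rows:
    r["_policy"] = "delete" if r["status"] == "inactive" else "upsert"

print(europe, asia, rows)
```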
You want to copy transformed data to an Azure SQL Database. Which sink should you use?
a) Azure SQL Database Sink
b) Azure Blob Storage Sink
c) Data Lake Sink
d) Cosmos DB Sink
To store the output of a data flow in a CSV format in Azure Blob Storage, which sink should you configure?
a) Azure SQL Database Sink
b) Blob Storage Sink
c) Data Lake Sink
d) Cosmos DB Sink
You want to scale a data flow to improve performance for a large dataset by distributing the computation. Which feature should you enable?
a) Data Flow Debugging
b) Scale
c) Monitoring
d) Fault Tolerance
When running a complex transformation on a large data set, you need to scale the resources used for computation. What should you do in Azure Data Factory?
a) Enable Data Flow Debugging
b) Use Scale option in Data Flow
c) Enable Monitoring
d) Use a larger VM
You have a nested JSON file, and you need to flatten the hierarchy to convert it into a tabular format for analysis. Which transformation will you use?
a) Flatten
b) Select
c) Join
d) Aggregate
You need to flatten a hierarchical structure in a JSON dataset before writing it to a relational database. Which transformation should you apply?
a) Flatten
b) Select
c) Pivot
d) Join
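What Flatten does to a nested JSON document, in plain Python: each element of the inner array becomes its own row, with the parent fields repeated. The sample document is hypothetical.

```python
doc = {
    "customer": "Ana",
    "orders": [{"sku": "X", "qty": 2}, {"sku": "Y", "qty": 1}],
}

# One flat row per element of the nested "orders" array.
flat = [{"customer": doc["customer"], **order} for order in doc["orders"]]

print(flat)
# [{'customer': 'Ana', 'sku': 'X', 'qty': 2},
#  {'customer': 'Ana', 'sku': 'Y', 'qty': 1}]
```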
You need to shift all dates in a dataset forward by 5 days. Which transformation will you use?
a) Shift Date
b) Derived Column
c) Select
d) Alter Row
You want to shift the sales data date by a specific number of days for each record. Which transformation would you use?
a) Shift Date
b) Alter Row
c) Derived Column
d) Lookup
You need to create a rolling window function that calculates a moving average for sales. Which transformation would you apply?
a) Window
b) Join
c) Pivot
d) Aggregate
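To make the rolling-window idea concrete, here is a plain-Python moving average over a trailing window of three values. The sales series is hypothetical.

```python
sales = [10, 20, 30, 40, 50]
window = 3

# Average each value with up to two predecessors (a trailing window).
moving_avg = []
for i in range(len(sales)):
    window_vals = sales[max(0, i - window + 1): i + 1]
    moving_avg.append(sum(window_vals) / len(window_vals))

print(moving_avg)  # [10.0, 15.0, 20.0, 30.0, 40.0]
```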
You want to join every row of one dataset with every row of another dataset (a Cartesian product). Which join type should you use?
a) Cross Join
b) Inner Join
c) Left Join
d) Full Outer Join
You need to evaluate complex expressions in your data flow based on conditions. Which transformation should you use?
a) Expression
b) Derived Column
c) Select
d) Filter
You want to sample 10% of your data for analysis in a dataset. Which transformation should you use?
a) Sample
b) Filter
c) Join
d) Pivot
You want to cleanse data by removing unnecessary spaces, special characters, and null values. Which transformation should you use?
a) Cleanse
b) Derived Column
c) Select
d) Lookup
You need to copy data from one dataset to another while ensuring the destination schema matches the source. Which activity will you use?
a) Copy Activity
b) Data Flow Activity
c) Lookup Activity
d) Move Data Activity
You need to apply a transformation that combines multiple transformations into a single pipeline. Which activity is best suited for this?
a) Data Flow Activity
b) Lookup Activity
c) Execute Pipeline Activity
d) Copy Activity
You want to copy data from a CSV file into an Azure SQL database, transforming the data during the process. Which activity will you use?
a) Copy Activity
b) Data Flow Activity
c) Lookup Activity
d) Execute Pipeline Activity
To run a batch process that reads and writes data incrementally, which activity is suitable?
a) Copy Activity
b) Execute Pipeline Activity
c) Data Flow Activity
d) Lookup Activity
You need to handle errors during data transformation and track changes. Which feature of Azure Data Factory should you enable?
a) Monitoring
b) Data Flow Debugging
c) Logging
d) Error Handling
You need to monitor the success or failure of a data pipeline in Azure Data Factory. Which feature will you use?
a) Data Flow Debugging
b) Monitoring
c) Logs
d) Alerts
You need to implement a data pipeline that adjusts to real-time data. Which feature of Azure Data Factory will help you achieve this?
a) Real-Time Data Processing
b) Monitoring
c) Logging
d) Data Flow Debugging
You need to ensure that data privacy regulations are met when processing data in Azure Data Factory. Which feature should you use?
a) Security
b) Monitoring
c) Data Flow Debugging
d) Real-Time Data Processing
You need to ensure that your data pipeline complies with specific industry standards such as HIPAA or GDPR. Which feature of Azure Data Factory should you implement?
a) Security and Compliance
b) Monitoring
c) Data Flow Debugging
d) Real-Time Data Processing
To maintain data privacy and security, which role should you assign to users in Azure Data Factory?
a) Data Contributor
b) Pipeline Operator
c) Data Analyst
d) Data Engineer
You need to ensure secure data transfers between Azure services. Which feature of Azure Data Factory should you enable?
a) Managed Identity
b) Encryption at Rest
c) Secure Data Transfers
d) VPN Integration
To protect sensitive data while running a pipeline, which feature of Azure Data Factory will you use?
a) Managed Identity
b) Data Flow Debugging
c) Encryption
d) Data Monitoring
You need to monitor and manage large-scale data processing jobs to avoid failures. Which feature should you use in Azure Data Factory?
a) Monitoring
b) Logging
c) Data Flow Debugging
d) Alerts
Answer Key:

1. b) Filter
2. a) Filter
3. b) Select
4. b) Select
5. a) Sort
6. a) Sort
7. b) Join
8. a) Left Outer Join
9. b) Union
10. a) Union
11. b) Aggregate
12. a) Aggregate
13. a) Derived Column
14. b) Derived Column
15. b) Lookup
16. a) Lookup
17. a) Unpivot
18. a) Pivot
19. a) Unpivot
20. b) Unpivot
21. a) Conditional Split
22. a) Conditional Split
23. a) Alter Row
24. a) Alter Row
25. a) Azure SQL Database Sink
26. b) Blob Storage Sink
27. b) Scale
28. b) Use Scale option in Data Flow
29. a) Flatten
30. a) Flatten
31. a) Shift Date
32. a) Shift Date
33. a) Window
34. a) Cross Join
35. a) Expression
36. a) Sample
37. a) Cleanse
38. a) Copy Activity
39. a) Data Flow Activity
40. b) Data Flow Activity
41. a) Copy Activity
42. a) Monitoring
43. b) Monitoring
44. a) Real-Time Data Processing
45. a) Security
46. a) Security and Compliance
47. a) Data Contributor
48. c) Secure Data Transfers
49. c) Encryption
50. a) Monitoring
Finally, combined scenario questions on Azure Data Factory transformations and activities:
You need to read data from an Azure SQL database, join it with data from a CSV file in Azure Blob Storage, apply a filter to remove null records, and then aggregate the results before saving them to a data warehouse. Which transformations and activities will you use?
a) Copy Activity, Join, Filter, Aggregate
b) Data Flow Activity, Join, Filter, Aggregate
c) Pipeline Activity, Lookup, Filter, Aggregate
d) Copy Activity, Pivot, Filter, Aggregate
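These combined scenarios chain several transformations inside one Mapping Data Flow. As a rough sketch of the graph the first scenario describes (step names hypothetical), the flow reads two sources, joins them, filters nulls, aggregates, and writes to a sink:

```python
# Ordered sketch of the data-flow graph for the first combined scenario;
# this just prints the pipeline's shape.
steps = [
    ("source: sqlSales", "read the table from Azure SQL Database"),
    ("source: csvBlob", "read the CSV file from Azure Blob Storage"),
    ("join", "join the two streams on a shared key"),
    ("filter", "drop null records, e.g. !isNull(key)"),
    ("aggregate", "group and summarize the joined rows"),
    ("sink: warehouse", "write the results to the data warehouse"),
]

for name, note in steps:
    print(f"{name:<20} {note}")
```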
You have a large dataset in Azure Data Lake, and you need to process it by pivoting the data, then unpivoting it back, and finally writing the results to an Azure SQL database. What sequence of transformations will you use?
a) Pivot, Unpivot, Sink
b) Unpivot, Pivot, Data Flow Activity
c) Pivot, Unpivot, Filter, Sink
d) Pivot, Filter, Unpivot, Sink
You need to read data from an Azure Blob Storage container, clean the data by removing special characters and trimming whitespace, apply a conditional split to route data into different streams, and then load it into a Cosmos DB. What transformations and activities will you use?
a) Copy Activity, Cleanse, Conditional Split
b) Data Flow Activity, Cleanse, Conditional Split, Sink
c) Data Flow Activity, Cleanse, Conditional Split, Data Lake Sink
d) Data Flow Activity, Cleanse, Alter Row, Sink
You are given a dataset containing monthly sales data, and you need to aggregate it by year, calculate the total sales per region, and store the results in an Azure SQL Database. What combination of activities and transformations will you use?
a) Copy Activity, Aggregate, Sink
b) Data Flow Activity, Aggregate, Sink
c) Data Flow Activity, Filter, Aggregate, Sink
d) Copy Activity, Aggregate, Filter, Sink
You want to enrich your data by looking up values from a reference table stored in an Azure SQL Database, then pivot the data to create columns for each region, followed by saving the results to Azure Blob Storage. Which transformations should you use?
a) Lookup, Pivot, Sink
b) Join, Pivot, Sink
c) Lookup, Unpivot, Sink
d) Join, Unpivot, Sink
You need to process data from a CSV file in Azure Blob Storage by filtering out records with invalid data, deriving a new column for profit margin, and then writing the final data to a Data Lake. What combination of transformations and activities will you use?
a) Filter, Derived Column, Sink
b) Filter, Aggregate, Sink
c) Copy Activity, Derived Column, Sink
d) Data Flow Activity, Filter, Derived Column, Sink
Your task is to shift all dates in a dataset by 30 days, clean the data by removing invalid characters, then join this data with another dataset from a different source and load the results into an Azure SQL Database. What transformations and activities will you use?
a) Shift Date, Cleanse, Join, Sink
b) Shift Date, Lookup, Join, Sink
c) Cleanse, Shift Date, Join, Sink
d) Lookup, Cleanse, Shift Date, Sink
You have sales data for multiple products across several months, and you need to unpivot the data, perform a conditional split based on sales amount, and then aggregate the results by product. What transformations should you use?
a) Unpivot, Conditional Split, Aggregate
b) Pivot, Conditional Split, Aggregate
c) Unpivot, Filter, Aggregate
d) Unpivot, Lookup, Aggregate
You want to process customer data by deriving a new column for loyalty points, applying a filter for active customers, and then aggregating the data by customer segment before loading it into an Azure SQL Database. Which transformations will you use?
a) Derived Column, Filter, Aggregate, Sink
b) Filter, Aggregate, Sink
c) Derived Column, Lookup, Aggregate, Sink
d) Derived Column, Filter, Join, Sink
You need to join two datasets, one containing customer information and another containing their order history, apply a transformation to remove duplicates, and then write the results to an Azure Data Lake. What transformations will you use?
a) Join, Remove Duplicates, Sink
b) Join, Aggregate, Sink
c) Join, Filter, Sink
d) Join, Conditional Split, Sink
You have sales data with multiple columns for each region, and you need to unpivot the data, apply a filter to remove records where sales are below $1000, and then aggregate the results by region before saving it to an Azure SQL Database. What combination of transformations should you use?
a) Unpivot, Filter, Aggregate, Sink
b) Pivot, Filter, Aggregate, Sink
c) Unpivot, Conditional Split, Aggregate, Sink
d) Unpivot, Filter, Join, Sink
You need to join two datasets based on product ID, apply a derived column to calculate the profit margin, and then shift the date by 7 days before saving the results to Azure Blob Storage. What transformations will you use?
a) Join, Derived Column, Shift Date, Sink
b) Join, Shift Date, Derived Column, Sink
c) Derived Column, Shift Date, Join, Sink
d) Derived Column, Join, Shift Date, Sink
You need to process a dataset containing customer feedback, remove records with invalid ratings, pivot the data to show ratings by product, and then load it into Azure SQL Database. Which transformations and activities will you use?
a) Filter, Pivot, Sink
b) Filter, Pivot, Join, Sink
c) Aggregate, Pivot, Sink
d) Join, Filter, Pivot, Sink
You want to read data from an Azure Blob Storage container, apply a lookup to enrich the data with region codes, and then aggregate the data by product category before saving the results to a Data Lake. Which transformations will you use?
a) Lookup, Aggregate, Sink
b) Filter, Lookup, Aggregate, Sink
c) Join, Lookup, Aggregate, Sink
d) Lookup, Derived Column, Aggregate, Sink
You need to copy data from a relational database to a Data Lake, pivot the data based on product category, and then cleanse the data by removing null values. What combination of transformations and activities should you use?