MCQs on Setting Up and Configuring AWS Glue

AWS Glue is a fully managed ETL (Extract, Transform, Load) service that simplifies data preparation for analytics. Chapter 2 focuses on configuring AWS Glue: setting up the environment, using the AWS Glue Console and CLI, and understanding IAM roles and permissions. Work through these AWS Glue multiple-choice questions and answers to test your knowledge.


MCQs: Prerequisites and Environment Setup

  1. What is a prerequisite for using AWS Glue?
    a) A VPC configuration
    b) S3 bucket for data storage
    c) Enabling DynamoDB streams
    d) Setting up EC2 instances
  2. Which programming languages does AWS Glue support for writing ETL scripts?
    a) Java and Python
    b) Python and Scala
    c) Python and Ruby
    d) Scala and JavaScript
  3. What is the default location for storing AWS Glue scripts?
    a) AWS Glue Console
    b) Amazon DynamoDB
    c) Amazon S3
    d) AWS CloudFormation
  4. How do you set up a connection for accessing data sources in AWS Glue?
    a) Through AWS Lambda
    b) By creating a connection in the AWS Glue Console
    c) By enabling AWS Secrets Manager
    d) Using the AWS Glue CLI
  5. What is the purpose of a data catalog in AWS Glue?
    a) To store raw data
    b) To maintain metadata about your data
    c) To host ETL jobs
    d) To handle schema migration
  6. Before running AWS Glue jobs, what must be configured in the environment?
    a) S3 buckets and Lambda functions
    b) IAM roles and Glue Data Catalog
    c) EC2 instances and EBS volumes
    d) RDS databases
  7. What network configuration is necessary for AWS Glue to connect to on-premises databases?
    a) Elastic Load Balancer setup
    b) Direct Connect or VPN setup
    c) Lambda integration
    d) API Gateway configuration
  8. Which AWS service is commonly used with AWS Glue for storing extracted data?
    a) Amazon RDS
    b) Amazon S3
    c) Amazon DynamoDB
    d) Amazon EC2
  9. What type of database is required for enabling Glue Data Catalog integration with Athena?
    a) NoSQL database
    b) Relational database
    c) MySQL-compatible database
    d) None; it uses Glue Data Catalog directly
  10. How is the Glue Python library installed for local development?
    a) Using pip to install glue-python
    b) By downloading from the AWS Glue Console
    c) Through the AWS CLI
    d) By setting up a Lambda layer
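
Questions 2, 3, and 6 above touch on the pieces a job definition pulls together: a script language, a script location in Amazon S3, and an IAM role. As a minimal sketch, the Python below assembles the arguments one might pass to boto3's `create_job` call without contacting AWS. The job name, role ARN, and bucket path are hypothetical placeholders, and the helper `build_job_definition` is only an illustration, not part of any AWS library.

```python
import json


def build_job_definition(name: str, role_arn: str, script_s3_uri: str) -> dict:
    """Return keyword arguments shaped for boto3's glue create_job call."""
    # Glue reads the ETL script from Amazon S3, so reject local paths early.
    if not script_s3_uri.startswith("s3://"):
        raise ValueError("Glue job scripts are read from Amazon S3 (s3://...)")
    return {
        "Name": name,
        "Role": role_arn,  # IAM role the job assumes at run time
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_s3_uri,
            "PythonVersion": "3",
        },
    }


# Hypothetical values, chosen only for illustration.
job_definition = build_job_definition(
    name="orders-etl",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueJobRole",
    script_s3_uri="s3://example-etl-bucket/scripts/orders_etl.py",
)
print(json.dumps(job_definition, indent=2))
```

With boto3 installed and credentials configured, the dictionary could be passed along as `boto3.client("glue").create_job(**job_definition)`.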

MCQs: AWS Glue Console and CLI

  1. Which interface is used for managing AWS Glue resources?
    a) AWS Lambda Console
    b) AWS Glue Console
    c) CloudFormation Console
    d) Elastic Beanstalk Console
  2. What does the “Jobs” section in the AWS Glue Console allow you to do?
    a) Manage Glue Data Catalog
    b) Configure IAM roles
    c) Create, edit, and run ETL jobs
    d) Monitor EC2 instances
  3. How can you start an AWS Glue job from the command line?
    a) Using aws-glue-cli
    b) With the start-job-run command in the AWS CLI
    c) By running a Lambda function
    d) Through an S3 event trigger
  4. What is the command to list Glue Data Catalog tables using the AWS CLI?
    a) aws glue get-databases
    b) aws glue list-tables
    c) aws glue list-databases
    d) aws glue get-tables
  5. How do you monitor the progress of AWS Glue jobs in the Console?
    a) Check logs in Amazon CloudWatch
    b) Use the “Triggers” section
    c) Access the AWS Lambda Console
    d) View job status in the “Jobs” section
  6. What type of scripts can be edited in the AWS Glue Console?
    a) CloudFormation templates
    b) Python or Scala ETL scripts
    c) Java Lambda functions
    d) Shell scripts
  7. What does the AWS Glue CLI allow you to do?
    a) Deploy applications on EC2
    b) Perform all tasks that can be done in the AWS Glue Console
    c) Set up serverless databases
    d) Monitor S3 events
  8. How can you trigger a Glue job manually using the Console?
    a) By configuring IAM policies
    b) By selecting the job and clicking “Run”
    c) By setting up an EventBridge rule
    d) Using the AWS SDK
  9. What is required to access the AWS Glue CLI?
    a) Access to an EC2 instance
    b) AWS IAM user credentials with Glue permissions
    c) Setting up a Glue endpoint
    d) A CloudFormation template
  10. Which AWS Glue resource can be created using both the Console and CLI?
    a) Virtual Private Cloud (VPC)
    b) Data Catalog table
    c) Lambda function
    d) CloudTrail event
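
Questions 3 and 4 above concern two AWS CLI operations: starting a job run (`aws glue start-job-run`) and listing the tables of a Data Catalog database (`aws glue get-tables`). The sketch below only builds and prints the command lines rather than executing them, so it runs without AWS credentials. The helper `glue_cli_command`, the job name `nightly-etl`, and the database name `sales_db` are assumptions made for illustration.

```python
import shlex


def glue_cli_command(operation: str, **options: str) -> list:
    """Build the argv list for an `aws glue <operation>` invocation."""
    argv = ["aws", "glue", operation]
    for key, value in options.items():
        # Python keyword job_name becomes the CLI flag --job-name.
        argv += ["--" + key.replace("_", "-"), value]
    return argv


# Hypothetical job and database names.
start_job = glue_cli_command("start-job-run", job_name="nightly-etl")
list_tables = glue_cli_command("get-tables", database_name="sales_db")

print(shlex.join(start_job))    # aws glue start-job-run --job-name nightly-etl
print(shlex.join(list_tables))  # aws glue get-tables --database-name sales_db
```

On a machine with the AWS CLI installed and credentials configured, either list could be handed to `subprocess.run(argv, check=True)` to run the command.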

MCQs: IAM Roles and Permissions

  1. What is the primary purpose of IAM roles in AWS Glue?
    a) To schedule Glue jobs
    b) To provide permissions for AWS Glue jobs to access resources
    c) To create Glue Data Catalogs
    d) To manage AWS billing
  2. What is the default policy required for AWS Glue to read and write data in S3?
    a) AmazonS3FullAccess
    b) AWSGlueServiceRole
    c) GlueS3ReadWritePolicy
    d) AWSDataPipelineRole
  3. Which policy ensures Glue jobs can access AWS Glue Data Catalog?
    a) AWSGlueConsoleFullAccess
    b) AWSGlueServiceRole
    c) GlueCatalogReadWritePolicy
    d) AmazonDynamoDBFullAccess
  4. How do you restrict a Glue job from accessing a specific S3 bucket?
    a) Remove the Glue service role
    b) Attach a deny policy in IAM
    c) Use SCPs in AWS Organizations
    d) Revoke Glue Console permissions
  5. What permission is needed to create triggers for Glue jobs?
    a) CloudWatch Logs permissions
    b) IAM role with glue:CreateTrigger
    c) AdministratorAccess policy
    d) GlueCatalogPolicy
  6. What does the glue:BatchGetJobs permission allow?
    a) Access to Data Catalog metadata
    b) Retrieve details of specific Glue jobs
    c) Execute Glue jobs
    d) Monitor Glue job metrics
  7. What must be included in a Glue service role to allow integration with Amazon Redshift?
    a) RedshiftDataPolicy
    b) S3ReadWriteRole
    c) GlueServiceRole for Redshift
    d) AmazonRedshiftFullAccess policy
  8. Which IAM feature can restrict Glue job access to a specific VPC?
    a) IAM inline policies
    b) Resource-based policies
    c) VPC endpoint policies
    d) Glue trigger configurations
  9. What AWS Glue-specific managed policy grants full Console access?
    a) AWSGlueServicePolicy
    b) AWSGlueFullAccess
    c) GlueDataAdminPolicy
    d) AWSGlueConsoleFullAccess
  10. How can you ensure least privilege for AWS Glue jobs?
    a) Use the AWS Glue Administrator role
    b) Assign Glue-specific managed policies
    c) Grant only the permissions required for each task
    d) Allow full access to S3
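
Questions 4 and 10 above can be illustrated with a small IAM policy document. The sketch below builds one in Python: it allows object reads and writes in a single bucket and explicitly denies all S3 actions on another, relying on the IAM rule that an explicit deny overrides any allow. Both bucket names and the helper `build_job_policy` are hypothetical, and a real job role would typically need further permissions (for example, those in the AWSGlueServiceRole managed policy) on top of this.

```python
import json


def build_job_policy(allowed_bucket: str, denied_bucket: str) -> dict:
    """Return an IAM policy document scoped to one bucket, denying another."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Least privilege: only object read/write, only one bucket.
                "Sid": "AllowEtlBucketObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{allowed_bucket}/*",
            },
            {
                # Explicit deny on the bucket itself and on its objects.
                "Sid": "DenyRestrictedBucket",
                "Effect": "Deny",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{denied_bucket}",
                    f"arn:aws:s3:::{denied_bucket}/*",
                ],
            },
        ],
    }


# Hypothetical bucket names.
policy = build_job_policy("example-etl-bucket", "example-restricted-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON could be attached to the Glue job's role as an inline policy or saved as a customer managed policy.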

Answers Table

Answers 1-10 cover Prerequisites and Environment Setup, 11-20 cover AWS Glue Console and CLI, and 21-30 cover IAM Roles and Permissions.

Qno   Answer (Option with Text)
1     b) S3 bucket for data storage
2     b) Python and Scala
3     c) Amazon S3
4     b) By creating a connection in the AWS Glue Console
5     b) To maintain metadata about your data
6     b) IAM roles and Glue Data Catalog
7     b) Direct Connect or VPN setup
8     b) Amazon S3
9     d) None; it uses Glue Data Catalog directly
10    a) Using pip to install glue-python
11    b) AWS Glue Console
12    c) Create, edit, and run ETL jobs
13    b) With the start-job-run command in the AWS CLI
14    d) aws glue get-tables
15    a) Check logs in Amazon CloudWatch
16    b) Python or Scala ETL scripts
17    b) Perform all tasks that can be done in the AWS Glue Console
18    b) By selecting the job and clicking “Run”
19    b) AWS IAM user credentials with Glue permissions
20    b) Data Catalog table
21    b) To provide permissions for AWS Glue jobs to access resources
22    b) AWSGlueServiceRole
23    b) AWSGlueServiceRole
24    b) Attach a deny policy in IAM
25    b) IAM role with glue:CreateTrigger
26    b) Retrieve details of specific Glue jobs
27    d) AmazonRedshiftFullAccess policy
28    c) VPC endpoint policies
29    d) AWSGlueConsoleFullAccess
30    c) Grant only the permissions required for each task

Use a blank sheet to note your answers, then tally them against the answers table and give yourself a score.
