Learn the essential concepts of Amazon SageMaker with these multiple-choice questions and answers. This set focuses on data preparation and management, covering data import and processing, feature engineering, and data security and access control. The questions are designed to reinforce SageMaker’s role in machine learning workflows and in maintaining data integrity.
MCQs
Data Import and Processing
1. What is the primary use of Amazon SageMaker Data Wrangler?
a) Manage large-scale database queries
b) Simplify data preparation workflows
c) Automate machine learning model deployment
d) Store training datasets securely
2. Which of the following is a supported data source for SageMaker?
a) Google Drive
b) AWS Glue Data Catalog
c) Microsoft OneDrive
d) Dropbox
3. Which type of data storage is commonly used for training in SageMaker?
a) Amazon S3
b) Amazon RDS
c) AWS Lambda
d) Amazon Aurora
4. How can you preprocess data in SageMaker?
a) Using predefined templates in Amazon Redshift
b) By writing custom scripts in Jupyter Notebooks
c) Through Elastic Beanstalk
d) Using AWS CloudTrail
5. What role does AWS Glue play in SageMaker data preparation?
a) Deploying machine learning models
b) Extracting, transforming, and loading data
c) Monitoring SageMaker endpoints
d) Scaling training instances automatically
6. Which SageMaker tool can be used for automated data labeling?
a) Ground Truth
b) Autopilot
c) Data Wrangler
d) Feature Store
7. How is data imported into SageMaker for training?
a) Directly from Amazon DynamoDB
b) By uploading datasets to S3 and linking them
c) Using AWS IAM policies
d) Through AWS Snowball devices
8. What is the typical file format for input data in SageMaker?
a) .docx
b) .xlsx
c) .csv
d) .exe
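The questions above on importing data describe staging a dataset in Amazon S3 and then pointing a training job at it. As a rough sketch of what that "linking" can look like, the snippet below builds one input channel in the shape that the SageMaker CreateTrainingJob API accepts in its `InputDataConfig` list. The helper name `build_csv_channel`, the bucket `example-ml-bucket`, and the prefix are hypothetical and chosen only for illustration; nothing here contacts AWS.

```python
def build_csv_channel(bucket: str, prefix: str, channel_name: str = "train") -> dict:
    """Return one InputDataConfig entry describing CSV files under an S3 prefix.

    Hypothetical helper: it only assembles a dictionary and makes no AWS calls.
    """
    if not bucket or "/" in bucket:
        raise ValueError("bucket must be a bare bucket name, without slashes")
    key_prefix = prefix.strip("/")
    # Point at the whole prefix so every object under it is part of the channel.
    s3_uri = f"s3://{bucket}/{key_prefix}/" if key_prefix else f"s3://{bucket}/"
    return {
        "ChannelName": channel_name,
        "ContentType": "text/csv",
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
    }


# Example values are assumptions, not real resources.
channel = build_csv_channel("example-ml-bucket", "datasets/churn/train")
```

In practice a dictionary like `channel` would be passed inside the `InputDataConfig` list of a `create_training_job` call on a boto3 SageMaker client, which requires AWS credentials and is left out of this sketch.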
Feature Engineering
9. What is the purpose of feature engineering in SageMaker?
a) Automating model deployment
b) Extracting meaningful information from raw data
c) Visualizing model predictions
d) Scaling machine learning infrastructure
10. Which SageMaker component is used to store and retrieve machine learning features?
a) AWS Glue Data Catalog
b) SageMaker Feature Store
c) SageMaker Data Wrangler
d) AWS Lambda
11. What technique is commonly used for handling missing values in datasets?
a) Model tuning
b) Imputation
c) Instance scaling
d) Data mirroring
12. How does SageMaker ensure that feature engineering is scalable?
a) By integrating with on-premises databases
b) Through support for distributed processing
c) By offering auto-scaling for RDS instances
d) Using direct connections to Amazon CloudFront
13. Which method helps in reducing dimensionality in datasets?
a) Normalization
b) Principal Component Analysis (PCA)
c) Label encoding
d) Hyperparameter tuning
14. What is the primary benefit of using SageMaker Feature Store?
a) Automates instance scaling for training jobs
b) Enables real-time access to precomputed features
c) Visualizes machine learning workflows
d) Provides real-time security monitoring
15. How can you monitor the quality of engineered features in SageMaker?
a) Using AWS Config
b) By integrating with Amazon CloudWatch
c) Through SageMaker Clarify
d) Using AWS Cost Explorer
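Imputation, mentioned in the feature engineering questions above, can be illustrated without any SageMaker API. The following is a minimal sketch of mean imputation over a small CSV using only the Python standard library. The function name `impute_column_mean` and the sample data are made up for this example; a real pipeline would more likely use a dataframe library or a Data Wrangler transform.

```python
import csv
import io
from statistics import mean


def impute_column_mean(rows, column):
    """Replace empty strings in `column` with the mean of the observed values."""
    observed = [float(r[column]) for r in rows if r[column] != ""]
    if not observed:
        raise ValueError(f"column {column!r} has no observed values to average")
    fill = mean(observed)
    # Build new row dicts so the input rows are left unmodified.
    return [
        {**r, column: float(r[column]) if r[column] != "" else fill}
        for r in rows
    ]


# Made-up sample: the second customer has no recorded age.
raw_csv = "customer_id,age\n1,30\n2,\n3,50\n"
rows = list(csv.DictReader(io.StringIO(raw_csv)))
imputed = impute_column_mean(rows, "age")
```

Here the missing age is filled with the mean of 30 and 50. Mean imputation is only one option; median or model-based imputation may suit skewed data better.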
Data Security and Access Control
16. How does SageMaker ensure data security during training?
a) By storing all data in public S3 buckets
b) By encrypting data in transit and at rest
c) Using unencrypted local storage
d) Through predefined network policies
17. What is a key practice for access control in SageMaker?
a) Using AWS Identity and Access Management (IAM)
b) Creating root user accounts
c) Disabling logging for endpoints
d) Sharing access keys publicly
18. Which feature of SageMaker enables network isolation for data security?
a) Elastic Load Balancing
b) VPC Endpoints
c) AWS Lambda
d) Amazon DynamoDB
19. How can you audit access to SageMaker resources?
a) Using AWS CloudTrail
b) Through SageMaker Autopilot logs
c) By monitoring with Amazon GuardDuty
d) Using SageMaker Ground Truth
20. What is the default encryption state for data stored in Amazon S3 when used with SageMaker?
a) Data is encrypted by default
b) Data is unencrypted by default
c) Encryption depends on S3 bucket policies
d) Encryption is not supported
21. Which encryption option is available in SageMaker?
a) Key Pair Encryption
b) AWS Key Management Service (KMS)
c) SSL/TLS certificates only
d) RSA public key encryption
22. What type of access policy is recommended for SageMaker training jobs?
a) Broad permissions for all users
b) Least privilege access policies
c) No access restrictions
d) Publicly shared IAM roles
23. Which of the following ensures secure API calls to SageMaker?
a) Using AWS CloudFormation templates
b) Signing API requests with AWS Signature Version 4
c) Enabling step functions
d) Using unencrypted connections
24. How can SageMaker resources be restricted to specific regions?
a) By disabling multi-region replication
b) Through service control policies (SCPs)
c) Using Amazon Athena queries
d) By enabling cross-region logging
25. Which security mechanism prevents unauthorized access to SageMaker notebooks?
a) Amazon Inspector
b) Multi-Factor Authentication (MFA)
c) AWS Trusted Advisor
d) SageMaker Debugger
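Two ideas from the security questions above, least privilege access and region restriction through service control policies, are easier to picture as policy documents. The sketch below builds both as plain Python dictionaries in the standard IAM JSON policy shape. The function names, bucket, prefix, and region list are hypothetical and for illustration only; the policies are not attached to anything and would need review before real use.

```python
import json


def training_data_read_policy(bucket: str, prefix: str) -> dict:
    """Identity policy granting read-only access to a single S3 prefix."""
    key_prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListTrainingPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                # Listing is limited to keys under the training prefix.
                "Condition": {"StringLike": {"s3:prefix": [f"{key_prefix}/*"]}},
            },
            {
                "Sid": "ReadTrainingObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{key_prefix}/*",
            },
        ],
    }


def sagemaker_region_scp(allowed_regions: list) -> dict:
    """Service control policy denying SageMaker actions outside the given regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenySageMakerOutsideAllowedRegions",
                "Effect": "Deny",
                "Action": "sagemaker:*",
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": allowed_regions}
                },
            }
        ],
    }


# Example values are assumptions, not real resources.
policy = training_data_read_policy("example-ml-bucket", "datasets/churn/train")
scp = sagemaker_region_scp(["us-east-1", "eu-west-1"])
print(json.dumps(scp, indent=2))
```

The first policy allows only listing and reading under one prefix, which is the spirit of least privilege. The second uses an explicit deny keyed on the `aws:RequestedRegion` condition, a common pattern for region guardrails in AWS Organizations.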
Answer Key
| Qno | Answer |
| --- | --- |
| 1 | b) Simplify data preparation workflows |
| 2 | b) AWS Glue Data Catalog |
| 3 | a) Amazon S3 |
| 4 | b) By writing custom scripts in Jupyter Notebooks |
| 5 | b) Extracting, transforming, and loading data |
| 6 | a) Ground Truth |
| 7 | b) By uploading datasets to S3 and linking them |
| 8 | c) .csv |
| 9 | b) Extracting meaningful information from raw data |
| 10 | b) SageMaker Feature Store |
| 11 | b) Imputation |
| 12 | b) Through support for distributed processing |
| 13 | b) Principal Component Analysis (PCA) |
| 14 | b) Enables real-time access to precomputed features |
| 15 | c) Through SageMaker Clarify |
| 16 | b) By encrypting data in transit and at rest |
| 17 | a) Using AWS Identity and Access Management (IAM) |
| 18 | b) VPC Endpoints |
| 19 | a) Using AWS CloudTrail |
| 20 | a) Data is encrypted by default |
| 21 | b) AWS Key Management Service (KMS) |
| 22 | b) Least privilege access policies |
| 23 | b) Signing API requests with AWS Signature Version 4 |