MCQs on Loading and Unloading Data | Snowflake

Efficiently handling data is a cornerstone of working with Snowflake. Chapter 4 delves into bulk loading data using the powerful COPY command, explores supported file formats like CSV, JSON, and Parquet, and guides you on importing data from cloud storage solutions (S3, GCS, Azure). Additionally, it covers strategies for unloading and exporting data to external destinations. Here’s a set of 30 multiple-choice questions (MCQs) to test your understanding of these concepts.


Bulk Loading Data Using COPY Command

  1. What is the primary purpose of the COPY INTO command in Snowflake?
    a) To export data to a local machine
    b) To load data into Snowflake tables
    c) To perform analytics on data
    d) To delete rows from a table
  2. Which keyword in the COPY INTO command is used to specify the source data file location?
    a) FROM
    b) USING
    c) LOCATION
    d) INTO
  3. When using the COPY command, which option helps validate the data without loading it?
    a) VALIDATE_ONLY
    b) VALIDATION_MODE
    c) PREVIEW
    d) TEST
  4. Which of the following is true about the ON_ERROR parameter in the COPY command?
    a) It defines how errors are logged
    b) It specifies what to do when errors occur during data loading
    c) It determines the format of the output
    d) It is optional and not recommended
  5. The COPY command supports which types of files for loading data?
    a) Only CSV
    b) CSV and Parquet
    c) Various formats including CSV, JSON, and Parquet
    d) None of the above
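
The ideas above can be illustrated with a minimal COPY INTO statement; the table, stage, and file format names below are placeholders:

```sql
-- Load CSV files from a named stage into a table.
-- my_table, my_stage, and my_csv_format are illustrative names.
COPY INTO my_table
  FROM @my_stage/data/
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
  ON_ERROR = 'SKIP_FILE';  -- skip any file that contains errors

-- Validate the same files without loading them:
COPY INTO my_table
  FROM @my_stage/data/
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
  VALIDATION_MODE = 'RETURN_ERRORS';
```

Note that VALIDATION_MODE returns error details instead of loading rows, which is useful as a dry run before the real load.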

Data File Formats (CSV, JSON, Parquet)

  6. Which of the following is a columnar storage format?
    a) CSV
    b) JSON
    c) Parquet
    d) XML
  7. What does the FIELD_OPTIONALLY_ENCLOSED_BY parameter do in a CSV file format?
    a) Specifies the delimiter for columns
    b) Indicates the character used for enclosing fields
    c) Specifies the encoding standard
    d) Defines the maximum row size
  8. JSON files are primarily used for storing:
    a) Tabular data
    b) Key-value pairs and hierarchical structures
    c) Binary data
    d) Images
  9. Which file format is most suitable for analytics on large datasets in Snowflake?
    a) CSV
    b) JSON
    c) Parquet
    d) Excel
  10. Parquet files store data:
    a) Row-wise
    b) Column-wise
    c) As a binary tree
    d) In plain text
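
As a reference, named file formats for the three types discussed can be created once and reused across COPY statements; the format names here are illustrative:

```sql
-- CSV: comma-delimited, fields optionally wrapped in double quotes,
-- first row treated as a header.
CREATE OR REPLACE FILE FORMAT csv_fmt
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  SKIP_HEADER = 1;

-- JSON: strip the outer array so each element loads as one row.
CREATE OR REPLACE FILE FORMAT json_fmt
  TYPE = 'JSON'
  STRIP_OUTER_ARRAY = TRUE;

-- Parquet: columnar and self-describing, so few options are needed.
CREATE OR REPLACE FILE FORMAT parquet_fmt
  TYPE = 'PARQUET';
```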

Loading from Cloud Storage (S3, GCS, Azure)

  11. What is required to load data from AWS S3 into Snowflake?
    a) A storage integration or access credentials
    b) A Google Cloud Platform account
    c) Azure Data Lake access
    d) Local file system permissions
  12. Which Snowflake command creates a named location for staged data files?
    a) LOAD_DATA
    b) PUT
    c) CREATE STAGE
    d) IMPORT
  13. To load data from GCS, you need:
    a) A Google Cloud Storage bucket and credentials
    b) An S3 bucket
    c) Azure Blob Storage integration
    d) A REST API
  14. Azure Blob Storage integration with Snowflake uses:
    a) SAS tokens
    b) AWS CLI
    c) OAuth
    d) Snowpipe
  15. Which file format supports efficient compression for loading from cloud storage?
    a) CSV
    b) JSON
    c) Parquet
    d) XML
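
A typical S3 setup pairs a storage integration with an external stage; the integration name, role ARN, and bucket below are all placeholders:

```sql
-- Grant Snowflake access to an S3 bucket via an IAM role
-- (the ARN and bucket path are illustrative, not real values).
CREATE STORAGE INTEGRATION s3_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my_snowflake_role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://my-bucket/data/');

-- External stage that points at the bucket through the integration.
CREATE STAGE s3_stage
  URL = 's3://my-bucket/data/'
  STORAGE_INTEGRATION = s3_int;
```

Equivalent stages for GCS or Azure Blob Storage follow the same pattern, with a GCS or Azure URL and the matching integration or SAS-token credentials.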

Data Unloading and Exporting

  16. The COPY INTO command for unloading data is used to:
    a) Load data into a table
    b) Export data from a Snowflake table to an external location
    c) Create new tables
    d) Merge tables
  17. Data can be unloaded to:
    a) Only local storage
    b) External cloud storage like S3, GCS, and Azure
    c) Only to Excel files
    d) Only to a local database
  18. Which parameter is critical for defining the file format while unloading data?
    a) FILE_FORMAT
    b) STORAGE_LOCATION
    c) ACCESS_ROLE
    d) DATA_COMPRESSION
  19. During unloading, the OVERWRITE parameter is used to:
    a) Append data to existing files
    b) Overwrite existing files in the destination
    c) Optimize file size
    d) Generate summary reports
  20. What is the default file format for unloaded data?
    a) Parquet
    b) CSV
    c) JSON
    d) XML
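
An unload is simply COPY INTO pointed at a stage location instead of a table; the stage path and table name here are illustrative:

```sql
-- Unload query results to a stage as gzip-compressed CSV files.
-- @my_stage and the orders table are placeholder names.
COPY INTO @my_stage/exports/orders_
  FROM (SELECT * FROM orders)
  FILE_FORMAT = (TYPE = 'CSV' COMPRESSION = 'GZIP')
  HEADER = TRUE        -- include column names in the first row
  OVERWRITE = TRUE;    -- replace files with matching names at the destination
```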

General Concepts

  21. What is a Snowflake stage?
    a) A temporary table
    b) A data loading area
    c) A database schema
    d) A predefined set of queries
  22. Internal stages in Snowflake are:
    a) Managed by Snowflake for file storage
    b) User-defined
    c) Third-party tools
    d) Deprecated
  23. To secure data during unloading, you should use:
    a) SSL/TLS encryption
    b) Password protection
    c) Encryption keys provided by the cloud provider
    d) All of the above
  24. Multi-part files in Snowflake unloading are useful for:
    a) Small datasets
    b) Optimizing load balancing and parallel processing
    c) Preventing data duplication
    d) Manual editing of files
  25. Metadata for unloading files can be managed through:
    a) Snowflake streams
    b) Stage listings
    c) File headers
    d) External cataloging tools
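
Internal stages and stage listings can be seen with PUT and LIST; the local path and table name below are illustrative, and PUT must be run from a client that supports file transfer (such as SnowSQL):

```sql
-- Upload a local file to the sales table's internal stage,
-- compressing it automatically on the way up.
PUT file:///tmp/sales.csv @%sales AUTO_COMPRESS = TRUE;

-- Inspect the staged files (name, size, checksum, last modified).
LIST @%sales;
```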

Advanced Techniques

  26. Using Snowpipe for loading data enables:
    a) Real-time or near real-time data ingestion
    b) Archiving of historical data
    c) Batch processing of files
    d) Manual data loading
  27. Which option improves performance during data loading?
    a) Splitting large files into smaller chunks
    b) Compressing the files using gzip
    c) Storing files closer to Snowflake regions
    d) All of the above
  28. Unloading compressed data can help:
    a) Save storage costs
    b) Decrease processing speed
    c) Add redundant information
    d) Make the data human-readable
  29. The COMPRESSION = AUTO file format option when unloading with COPY INTO:
    a) Enables automatic compression of unloaded data
    b) Automatically decompresses files during loading
    c) Specifies error-handling modes
    d) Deletes unwanted rows
  30. Partitioning data during unloading is used for:
    a) Reducing file sizes for small datasets
    b) Optimizing storage and query efficiency
    c) Creating a data backup
    d) Enforcing access controls
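
Snowpipe itself is defined as a pipe wrapping a COPY statement; the pipe, table, and stage names below are illustrative, and AUTO_INGEST additionally requires cloud-event notifications to be configured on the bucket:

```sql
-- Continuously ingest new files as they land in the external stage.
-- sales_pipe, sales, and s3_stage are placeholder names.
CREATE PIPE sales_pipe
  AUTO_INGEST = TRUE
  AS
  COPY INTO sales
    FROM @s3_stage/sales/
    FILE_FORMAT = (TYPE = 'CSV');
```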

Answers Table

1. b) To load data into Snowflake tables
2. a) FROM
3. b) VALIDATION_MODE
4. b) It specifies what to do when errors occur during data loading
5. c) Various formats including CSV, JSON, and Parquet
6. c) Parquet
7. b) Indicates the character used for enclosing fields
8. b) Key-value pairs and hierarchical structures
9. c) Parquet
10. b) Column-wise
11. a) A storage integration or access credentials
12. c) CREATE STAGE
13. a) A Google Cloud Storage bucket and credentials
14. a) SAS tokens
15. c) Parquet
16. b) Export data from a Snowflake table to an external location
17. b) External cloud storage like S3, GCS, and Azure
18. a) FILE_FORMAT
19. b) Overwrite existing files in the destination
20. b) CSV
21. b) A data loading area
22. a) Managed by Snowflake for file storage
23. d) All of the above
24. b) Optimizing load balancing and parallel processing
25. b) Stage listings
26. a) Real-time or near real-time data ingestion
27. d) All of the above
28. a) Save storage costs
29. a) Enables automatic compression of unloaded data
30. b) Optimizing storage and query efficiency

Use a blank sheet to note your answers, then tally them against the answers table above and give yourself a score.
