BigQuery, Google’s serverless data warehouse, enables scalable data analysis with exceptional speed and cost efficiency. Chapter 2 explores the basics of BigQuery, including its data model (datasets, tables, and views), loading data from various file formats, and executing SQL queries in the BigQuery Console. Test your knowledge with 30 MCQs.
Topic 1: BigQuery Data Model (Datasets, Tables, Views)
What is a dataset in BigQuery? a) A collection of tables and views b) A data pipeline for processing files c) A single table in BigQuery d) A storage bucket in Cloud Storage
How is a table defined in BigQuery? a) A virtual representation of structured data b) A collection of rows and columns in a dataset c) A schema-less collection of objects d) A key-value store for unstructured data
Which of the following can be used to query data in BigQuery without storing it permanently? a) Tables b) Views c) Datasets d) Temporary Tables
What is a key feature of BigQuery views? a) They automatically update to reflect changes in the underlying data b) They store the result of a query permanently c) They are stored in Cloud Storage d) They are used for streaming data
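To make the view behavior concrete: a logical view is a saved query, not a copy of the data, so every read reflects the current state of the underlying table. A minimal sketch in BigQuery Standard SQL, using hypothetical `my_project.sales` names:

```sql
-- Create a logical view; no data is copied.
CREATE VIEW `my_project.sales.big_orders` AS
SELECT order_id, amount
FROM `my_project.sales.orders`
WHERE amount > 100;

-- Querying the view re-runs the saved query, so rows
-- newly inserted into `orders` appear automatically.
SELECT COUNT(*) FROM `my_project.sales.big_orders`;
```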
What is the maximum nesting depth for repeated fields in a BigQuery table? a) 3 b) 15 c) 10 d) 5
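Nested and repeated fields are declared with STRUCT and ARRAY types in the schema. A short sketch with hypothetical table and column names:

```sql
-- Nested (STRUCT) and repeated (ARRAY) fields in a table schema.
-- BigQuery allows records to be nested up to 15 levels deep.
CREATE TABLE `my_project.demo.customers` (
  customer_id STRING,
  address STRUCT<city STRING, zip STRING>,               -- nested field
  orders ARRAY<STRUCT<order_id STRING, amount NUMERIC>>  -- repeated nested field
);
```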
Which of the following statements about datasets is true? a) Datasets must reside in a specific region. b) Datasets can span multiple projects. c) Datasets are limited to one table only. d) Datasets cannot include views.
What does a BigQuery schema define? a) Query performance metrics b) The structure of the data in a table c) The region where the dataset is stored d) The cost of storage per table
How do you make a table public in BigQuery? a) Enable public sharing in the Google Cloud Console b) Add the “allUsers” member in the table’s permissions c) Create a new dataset with public access d) Share the table’s URL directly
What is the storage format for tables in BigQuery? a) Parquet b) Columnar storage c) JSON d) Tabular storage
Can you join data from two different datasets in BigQuery? a) Yes, using cross-dataset queries b) No, datasets cannot interact c) Yes, but only within the same project d) Yes, but only if both datasets are public
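Cross-dataset (and cross-project) joins work by fully qualifying each table as `project.dataset.table`. A sketch with hypothetical project and dataset names:

```sql
-- Join tables that live in different datasets and projects,
-- referenced by their fully qualified names.
SELECT o.order_id, c.name
FROM `project_a.sales.orders` AS o
JOIN `project_b.crm.customers` AS c
  ON o.customer_id = c.customer_id;
```

The querying user needs read access to both tables; the datasets do not have to share a project.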
Topic 2: Loading Data into BigQuery
Which file formats are supported for loading data into BigQuery? a) JSON, CSV, Avro, Parquet, and ORC b) TXT, XML, and YAML c) PDF, DOCX, and PPTX d) PNG, JPEG, and TIFF
What is the purpose of using Cloud Storage when loading data into BigQuery? a) To encrypt data before loading b) To provide a scalable staging area for files c) To convert data to BigQuery tables d) To monitor data quality
What is the maximum size of a file that can be loaded into BigQuery from Cloud Storage? a) 5 GB b) 50 GB c) 1 TB d) 10 GB
Which option is required when loading CSV data into BigQuery? a) Schema definition b) Data streaming enabled c) File compression d) Encryption keys
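A CSV load with an explicit schema can be expressed directly in SQL with the `LOAD DATA` statement. A sketch, assuming a hypothetical `mydataset.mytable` table and `gs://my-bucket` staging bucket:

```sql
-- Load CSV files from Cloud Storage with an explicit schema.
LOAD DATA INTO mydataset.mytable (
  name STRING,
  amount INT64
)
FROM FILES (
  format = 'CSV',
  skip_leading_rows = 1,          -- skip the header row
  uris = ['gs://my-bucket/data/*.csv']
);
```

Schema auto-detection is also available for CSV, but an explicit schema avoids surprises with ambiguous column types.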
How does BigQuery handle NULL values in loaded data? a) It ignores rows with NULL values. b) It replaces NULL with default values. c) It preserves NULL values as-is. d) It converts NULL values to empty strings.
What is the advantage of loading Parquet files into BigQuery? a) Smaller storage size and faster queries b) Automatic data transformation c) Simplified schema creation d) Direct integration with APIs
What happens if the schema of the file does not match the table during a data load? a) BigQuery loads the file with warnings. b) The load operation fails. c) BigQuery automatically adjusts the schema. d) It skips the unmatched fields.
Can you append new data to an existing table during a load operation? a) No, you must overwrite the table. b) Yes, by using the append write disposition. c) No, tables are immutable. d) Yes, but only for JSON files.
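The write disposition is visible in the SQL syntax itself: `LOAD DATA INTO` appends to an existing table, while `LOAD DATA OVERWRITE` replaces its contents. A sketch with hypothetical names:

```sql
-- Appends rows to mydataset.events (WRITE_APPEND behavior).
LOAD DATA INTO mydataset.events
FROM FILES (format = 'CSV', uris = ['gs://my-bucket/new_events.csv']);

-- Replaces the table's contents instead (WRITE_TRUNCATE behavior).
LOAD DATA OVERWRITE mydataset.events
FROM FILES (format = 'CSV', uris = ['gs://my-bucket/full_reload.csv']);
```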
What is the purpose of a partitioned table in BigQuery? a) To encrypt data for security b) To divide data based on specific columns for faster queries c) To merge data from different datasets d) To replicate data across regions
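A partitioned table can be declared in DDL, and queries that filter on the partitioning column scan only the matching partitions. A sketch with hypothetical names:

```sql
-- A table partitioned by a DATE column.
CREATE TABLE `my_project.demo.events` (
  event_date DATE,
  user_id STRING
)
PARTITION BY event_date;

-- Filtering on event_date prunes the scan to a single partition,
-- reducing both query time and bytes billed.
SELECT COUNT(*)
FROM `my_project.demo.events`
WHERE event_date = '2024-01-01';
```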
How do you schedule a recurring data load into BigQuery? a) Use Cloud Scheduler with a Dataflow job b) Manually trigger uploads daily c) Create a cron job in your local machine d) Enable auto-load in the BigQuery Console
Topic 3: Query Basics
Which SQL dialect does BigQuery use? a) Standard SQL b) T-SQL c) MySQL SQL d) HiveQL
What is the purpose of the LIMIT clause in a BigQuery query? a) To restrict the number of columns b) To filter rows based on conditions c) To limit the number of rows returned d) To define the maximum query cost
Which function is used to count rows in a BigQuery table? a) SUM() b) COUNT() c) ROWS() d) LENGTH()
What happens when a query exceeds the BigQuery free tier limit? a) The query fails automatically. b) BigQuery prompts for upgrade approval. c) The user is charged based on the excess usage. d) The query is delayed until the next month.
How do you run a query in the BigQuery Console? a) Write the query and click “Run” b) Select the “Execute” tab in Cloud Storage c) Save the query to a dataset and schedule it d) Write SQL in the CLI
Which clause is used to filter data in a BigQuery query? a) SELECT b) FILTER c) WHERE d) GROUP BY
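The query basics above combine naturally in one statement: `WHERE` filters rows, `COUNT(*)` aggregates, and `LIMIT` caps the rows returned. A sketch against a hypothetical `orders` table:

```sql
SELECT status, COUNT(*) AS n
FROM `my_project.sales.orders`
WHERE amount > 100        -- filter rows
GROUP BY status
ORDER BY n DESC
LIMIT 10;                 -- cap the rows returned
```

Note that `LIMIT` only trims the result set; it does not reduce the bytes scanned, so it does not lower the query's cost.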
Can BigQuery queries join tables across different projects? a) Yes, using fully qualified table names b) No, tables must be in the same project c) Only if both projects have the same billing account d) No, this feature is not supported
What is the result of using a wildcard (*) in the SELECT clause? a) It retrieves all columns from the table. b) It fetches distinct rows only. c) It limits the rows fetched to 100. d) It causes the query to fail.
What is the purpose of the BigQuery “Preview” feature? a) To view the results of a query without incurring costs b) To analyze data with sample queries c) To view the schema and a sample of the data d) To visualize query performance
Which BigQuery feature estimates query costs before execution? a) Cost Advisor b) Query Estimator c) Query Validator d) Query Pricing Tool
Answers
1. a) A collection of tables and views
2. b) A collection of rows and columns in a dataset
3. b) Views
4. a) They automatically update to reflect changes in the underlying data
5. b) 15
6. a) Datasets must reside in a specific region
7. b) The structure of the data in a table
8. b) Add the “allUsers” member in the table’s permissions
9. b) Columnar storage
10. a) Yes, using cross-dataset queries
11. a) JSON, CSV, Avro, Parquet, and ORC
12. b) To provide a scalable staging area for files
13. c) 1 TB
14. a) Schema definition
15. c) It preserves NULL values as-is
16. a) Smaller storage size and faster queries
17. b) The load operation fails
18. b) Yes, by using the append write disposition
19. b) To divide data based on specific columns for faster queries