BigQuery is Google’s powerful cloud-based data warehouse that simplifies large-scale data analysis. With advanced features like analytical functions, window functions, and statistical tools, BigQuery enables deep exploratory data analysis (EDA). Learn how to optimize queries, understand query plans and costs, and handle nested data using arrays and structs for efficient analytics.
BigQuery MCQs
Analytical Functions
What does COUNT(*) return in BigQuery? a) Number of distinct rows b) Total number of rows c) Number of null values d) Average of rows
Which function calculates the sum of a numeric column? a) COUNT b) AVG c) SUM d) MEDIAN
What is the purpose of the GROUP BY clause in BigQuery? a) Aggregating data into groups b) Filtering rows based on conditions c) Joining two tables d) Sorting the data
Which analytical function is used to find the nth largest value in a dataset? a) RANK b) NTILE c) PERCENTILE_CONT d) ROW_NUMBER
Which clause is necessary when aggregate functions like SUM or COUNT are selected alongside non-aggregated columns? a) HAVING b) WHERE c) GROUP BY d) ORDER BY
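The counting and grouping behaviour covered in this section can be checked with a small emulation. This is a Python sketch rather than BigQuery code: the `rows` data and variable names are hypothetical, and `None` stands in for SQL NULL. The comments show the SQL expression each line is meant to mirror.

```python
from collections import defaultdict

# Hypothetical sample table: one row per sale, with a NULL amount in one row.
rows = [
    {"region": "east", "amount": 10},
    {"region": "east", "amount": None},
    {"region": "west", "amount": 10},
    {"region": "west", "amount": 30},
]

# COUNT(*): every row counts, including rows with NULLs.
count_star = len(rows)

# COUNT(amount): only rows where amount is not NULL.
count_amount = sum(1 for r in rows if r["amount"] is not None)

# COUNT(DISTINCT amount): distinct non-NULL values.
count_distinct = len({r["amount"] for r in rows if r["amount"] is not None})

# SELECT region, SUM(amount) ... GROUP BY region
# (simplification: a group whose amounts are all NULL would be NULL in SQL,
# but is simply absent from this dictionary).
sum_by_region = defaultdict(int)
for r in rows:
    if r["amount"] is not None:
        sum_by_region[r["region"]] += r["amount"]

print(count_star, count_amount, count_distinct, dict(sum_by_region))
```

The three counts differ on the same data, which is why the form of COUNT matters when a column contains NULLs.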
Window Functions
What is a window function in BigQuery? a) A function used for aggregating data globally b) A function that operates over a subset of rows related to the current row c) A visualization tool d) A function for data filtering
Which keyword introduces the window specification of a window function? a) PARTITION BY b) OVER c) WITHIN d) LIMIT
What does the ROW_NUMBER window function do? a) Assigns ranks to rows with ties b) Assigns a unique number to each row in a partition c) Groups rows into buckets d) Returns cumulative sums
How does the RANK function differ from ROW_NUMBER? a) RANK skips numbers for tied values b) ROW_NUMBER groups rows into buckets c) RANK uses partitioning, while ROW_NUMBER does not d) ROW_NUMBER includes nulls, while RANK excludes them
What does the NTILE function in window operations do? a) Splits rows into a specified number of groups b) Sorts rows in ascending order c) Calculates cumulative sums d) Counts rows per partition
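The tie handling and bucketing behaviour asked about in this section can be compared side by side with a small emulation. This is a Python sketch, not BigQuery code: the function names and sample `scores` are made up for illustration, and each function assumes a window of the form OVER (ORDER BY score DESC) with no PARTITION BY.

```python
def row_number(values):
    """Emulate ROW_NUMBER(): a unique, consecutive number per row."""
    ordered = sorted(values, reverse=True)
    return [(v, i) for i, v in enumerate(ordered, start=1)]


def rank(values):
    """Emulate RANK(): tied rows share a rank and the following ranks are skipped."""
    ordered = sorted(values, reverse=True)
    out = []
    for i, v in enumerate(ordered, start=1):
        if out and out[-1][0] == v:
            out.append((v, out[-1][1]))  # tie: reuse the previous rank
        else:
            out.append((v, i))  # new value: rank equals the row position
    return out


def ntile(values, buckets):
    """Emulate NTILE(buckets): leftover rows go to the first buckets, one each."""
    ordered = sorted(values, reverse=True)
    base, extra = divmod(len(ordered), buckets)
    out, idx = [], 0
    for b in range(1, buckets + 1):
        size = base + (1 if b <= extra else 0)
        for _ in range(size):
            out.append((ordered[idx], b))
            idx += 1
    return out


scores = [90, 85, 85, 70, 60]
print(row_number(scores))  # [(90, 1), (85, 2), (85, 3), (70, 4), (60, 5)]
print(rank(scores))        # [(90, 1), (85, 2), (85, 2), (70, 4), (60, 5)]
print(ntile(scores, 2))    # [(90, 1), (85, 1), (85, 1), (70, 2), (60, 2)]
```

In a real query the order of tied rows under ROW_NUMBER is not guaranteed unless the ORDER BY includes a tie-breaking column.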
Statistical Functions
Which function calculates the standard deviation of a column? a) STDDEV b) VARIANCE c) AVG d) MEDIAN
What does the CORR function compute in BigQuery? a) Variance b) Standard deviation c) Correlation between two columns d) Median
Which statistical function calculates the percentile of a dataset? a) NTILE b) PERCENTILE_CONT c) STDDEV d) VAR_POP
What is the purpose of the VAR_POP function? a) Calculate sample variance b) Calculate population variance c) Calculate mean d) Calculate median
Which statistical function is best for analyzing the relationship between numeric columns? a) AVG b) STDDEV c) CORR d) SUM
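The difference between population and sample statistics, and what a correlation coefficient measures, can be worked through by hand. The following Python sketch uses hypothetical helper names and made-up data. It mirrors the textbook formulas behind VAR_POP, VAR_SAMP, STDDEV_SAMP, and CORR, not BigQuery's internal implementation. In BigQuery, VARIANCE and STDDEV are aliases of the sample versions.

```python
import math


def var_pop(xs):
    """Population variance: divide the squared deviations by n."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def var_samp(xs):
    """Sample variance: divide the squared deviations by n - 1."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def stddev_samp(xs):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(var_samp(xs))


def corr(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]  # exactly 2 * xs, so the correlation is 1
print(var_pop(xs), var_samp(xs), stddev_samp(xs), corr(xs, ys))
```

On this data the population variance is 2.0 and the sample variance is 2.5. The gap narrows as the number of rows grows.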
Exploratory Data Analysis (EDA)
What is the main goal of exploratory data analysis? a) Building predictive models b) Visualizing trends and patterns in data c) Cleaning and transforming data d) Compressing datasets
Which BigQuery feature is best for examining the distribution of data? a) ARRAY functions b) Statistical functions c) Window functions d) Aggregate functions
What function would you use to get a quick summary of data? a) EXPLAIN b) INFORMATION_SCHEMA c) SUMMARY d) COUNT
Which clause filters rows during EDA? a) GROUP BY b) ORDER BY c) WHERE d) HAVING
What is an example of a categorical variable in EDA? a) Employee ID b) Annual income c) Sales growth percentage d) Age in years
Unnesting and Flattening Nested Data
Which function is used to flatten nested data in BigQuery? a) UNNEST b) FLATTEN c) SPLIT d) JOIN
Arrays in BigQuery can store: a) Only numeric data b) Only string data c) Multiple values of the same type d) Mixed data types
How do you access a specific element in an array? a) ARRAY_INDEX b) Using OFFSET inside the array subscript, as in my_array[OFFSET(0)] c) ACCESS function d) SELECT function
What is a struct in BigQuery? a) A single value container b) A collection of ordered values c) A collection of named fields d) A dynamic data type
Which clause is required when working with nested data? a) GROUP BY b) CROSS JOIN c) WITH OFFSET d) UNNEST
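Flattening an array of structs is easier to picture with concrete rows. The Python sketch below emulates a query of the form `SELECT order_id, pos, item.sku, item.qty FROM orders, UNNEST(items) AS item WITH OFFSET AS pos`. The `orders` table, its field names, and the `unnest_items` helper are hypothetical: lists stand in for ARRAY columns and dictionaries for STRUCT values.

```python
# Hypothetical table: each order holds an array of item structs.
orders = [
    {"order_id": 1, "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]},
    {"order_id": 2, "items": []},
    {"order_id": 3, "items": [{"sku": "C", "qty": 5}]},
]


def unnest_items(rows):
    """Produce one output row per array element, as a cross join with UNNEST does."""
    flat = []
    for row in rows:
        # WITH OFFSET exposes the zero-based position of each element.
        for pos, item in enumerate(row["items"]):
            flat.append(
                {
                    "order_id": row["order_id"],
                    "pos": pos,
                    "sku": item["sku"],  # struct field accessed by name
                    "qty": item["qty"],
                }
            )
    return flat


for flat_row in unnest_items(orders):
    print(flat_row)
```

Order 2 has an empty array and therefore produces no output rows, which matches the cross join form. A LEFT JOIN against UNNEST would keep that order with NULL item fields.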
Optimizing Query Performance
What is the main purpose of query optimization in BigQuery? a) Enhance query execution speed b) Improve data quality c) Ensure data privacy d) Increase query size
How can you reduce query costs in BigQuery? a) Use SELECT * for all queries b) Partition tables c) Avoid indexes d) Increase table size
What does the query plan (shown in a job's execution details) let you do in BigQuery? a) Visualize query results b) Analyze how the query was executed, stage by stage c) Execute the query immediately d) Combine datasets
What are BigQuery slots? a) Data storage units b) Virtual CPUs used for query processing c) Memory blocks for caching results d) Predefined templates for queries
Which caching method helps improve query performance? a) Result caching b) Object caching c) Disk caching d) Data compression
Answers
Qno  Answer
1    b) Total number of rows
2    c) SUM
3    a) Aggregating data into groups
4    a) RANK
5    c) GROUP BY
6    b) A function that operates over a subset of rows related to the current row
7    b) OVER
8    b) Assigns a unique number to each row in a partition