Enhance your data manipulation skills in R with complex joins in dplyr, efficient use of data.table, and advanced reshaping techniques with tidyr.
1. Which function in dplyr is used to combine two data frames by a common column?
   - a) merge()
   - b) join()
   - c) left_join()
   - d) bind_rows()
2. […] dplyr?
   - a) left_join()
   - b) right_join()
   - c) full_join()
   - d) inner_join()
3. Which function in dplyr is used to join data frames by multiple columns?
   - a) multi_join()
   - b) full_join()
   - c) inner_join()
   - d) by()
4. What does the summarise() function do in dplyr?
5. […] dplyr?
   - a) arrange()
   - b) group_by()
   - c) select()
   - d) filter()
6. […] dplyr?
   - a) summarize(sum())
   - b) group_by() %>% sum()
   - c) summarize(total())
   - d) group_by() %>% summarize(sum(column))
7. In dplyr, which function can be used to combine data frames vertically?
   - a) bind_rows()
   - b) full_join()
   - c) left_join()
   - d) merge()
8. Which of the dplyr joins returns all rows from the left data frame and matching rows from the right data frame?
   - a) full_join()
   - b) left_join()
   - c) right_join()
   - d) inner_join()
9. […] dplyr?
   - a) by()
   - b) on()
   - c) column_names()
   - d) matching()
10. […] dplyr?
    - a) mean_by()
    - b) summarize(mean())
    - c) mutate(mean())
    - d) group_by() %>% mean()
11. […]
    - a) tidyverse
    - b) data.table
    - c) dplyr
    - d) ggplot2
12. […] data.table object?
    - a) as.data.table()
    - b) data.table()
    - c) convert()
    - d) to.data.table()
13. […] data.table by reference?
    - a) dt[, "column_name"]
    - b) dt[, column_name]
    - c) dt$column_name
    - d) dt["column_name"]
14. […] data.table?
    - a) dt[column_name > value]
    - b) filter(dt, column_name > value)
    - c) subset(dt, column_name > value)
    - d) dt[filter(column_name > value)]
15. […] data.table by reference?
    - a) dt[, column_name := new_value]
    - b) dt$column_name <- new_value
    - c) update(dt, column_name, new_value)
    - d) dt[column_name] <- new_value
16. What does the setkey() function in data.table do?
17. In data.table, how would you calculate the sum of a column grouped by another column?
    - a) dt[, sum(column), by = group_column]
    - b) dt$sum(column) %>% group_by(group_column)
    - c) group_by(dt, group_column) %>% sum(column)
    - d) aggregate(dt, by = group_column, FUN = sum)
18. […] data.table?
    - a) merge()
    - b) inner_join()
    - c) setkey()
    - d) join()
19. […] data.table over a regular data.frame?
20. […] data.table objects?
    - a) merge()
    - b) left_join()
    - c) setkey()
    - d) merge.data.table()
21. Which function in tidyr is used to convert a wide-format data frame into long format?
    - a) spread()
    - b) gather()
    - c) pivot_wider()
    - d) pivot_longer()
22. What does the pivot_wider() function do in tidyr?
23. […] tidyr?
    - a) separate()
    - b) split()
    - c) extract()
    - d) subseparate()
24. […] tidyr?
    - a) fill()
    - b) na.fill()
    - c) replace_na()
    - d) complete()
25. […]
    - a) gather()
    - b) spread()
    - c) pivot_wider()
    - d) pivot_longer()
26. Which tidyr function is used to convert a data frame into a more complete form by filling missing combinations of data?
    - a) expand()
    - b) complete()
    - c) fill()
    - d) expand_grid()
27. […] unnest() function in tidyr?
28. […] tidyr?
    - a) spread()
    - b) pivot_wider()
    - c) gather()
    - d) separate()
29. Which function in tidyr is used to make a data frame with all possible combinations of a set of columns?
    - a) expand()
    - b) complete()
    - c) spread()
    - d) nest()
30. What does the separate() function do in tidyr?

| QNo | Answer (Option with text) |
|---|---|
| 1 | c) left_join() |
| 2 | c) full_join() |
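| 3 | d) by() (in practice this is the by argument of the join functions, e.g. by = c("col1", "col2"), not a standalone dplyr function) |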
| 4 | c) Summarizes data by calculating aggregates |
| 5 | b) group_by() |
| 6 | d) group_by() %>% summarize(sum(column)) |
| 7 | a) bind_rows() |
| 8 | b) left_join() |
| 9 | a) by() |
| 10 | b) group_by(column) %>% summarize(mean()) |
| 11 | b) data.table |
| 12 | a) as.data.table() |
| 13 | b) dt[, column_name] |
| 14 | a) dt[column_name > value] |
| 15 | a) dt[, column_name := new_value] |
| 16 | b) Creates a key for indexing |
| 17 | a) dt[, sum(column), by = group_column] |
| 18 | a) merge() |
| 19 | a) Smaller memory usage and faster computation |
| 20 | d) merge.data.table() |
| 21 | b) gather() (superseded in current tidyr by pivot_longer(), which performs the same wide-to-long conversion) |
| 22 | a) Converts long-format data into wide format |
| 23 | a) separate() |
| 24 | c) replace_na() |
| 25 | a) gather() |
| 26 | b) complete() |
| 27 | a) Unwraps nested data frames or lists into separate columns |
| 28 | b) pivot_wider() |
| 29 | a) expand() |
| 30 | c) Splits a single column into multiple columns |
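
Several of the answers above can be checked directly in an R session. The following is a minimal sketch, assuming dplyr, tidyr, and data.table are installed; the `orders`, `targets`, and `wide` data frames and all their column names are hypothetical toy data invented for illustration, not part of the quiz.

```r
library(dplyr)
library(tidyr)
library(data.table)

# Hypothetical toy data, invented for this sketch
orders <- data.frame(region = c("N", "N", "S"),
                     year   = c(2023, 2024, 2023),
                     sales  = c(10, 20, 5))
targets <- data.frame(region = c("N", "S", "S"),
                      year   = c(2023, 2023, 2024),
                      target = c(12, 6, 7))

# dplyr: the `by` argument names the key columns (two of them here)
joined   <- left_join(orders, targets, by = c("region", "year"))  # keeps all rows of orders
all_rows <- full_join(orders, targets, by = c("region", "year"))  # keeps rows from both sides
stacked  <- bind_rows(orders, orders)                             # vertical combination
totals   <- orders %>%
  group_by(region) %>%
  summarize(total = sum(sales))                                   # one total per region

# data.table: convert, update by reference, filter, aggregate, key, merge
dt <- as.data.table(orders)
dt[, sales_k := sales / 1000]                          # adds a column by reference
big       <- dt[sales > 8]                             # row filter
dt_totals <- dt[, .(total = sum(sales)), by = region]  # grouped sum
setkey(dt, region, year)                               # sorts dt and marks key columns
dt_merged <- merge(dt, as.data.table(targets),
                   by = c("region", "year"))           # inner join by default

# tidyr: reshape, fill missing values, complete combinations, split a column
wide <- data.frame(id = c("a", "b"), q1 = c(1, 3), q2 = c(2, NA))
long <- pivot_longer(wide, cols = c(q1, q2),
                     names_to = "quarter", values_to = "sales")
long_filled <- replace_na(long, list(sales = 0))
wide_again  <- pivot_wider(long_filled, names_from = quarter, values_from = sales)
grid  <- complete(orders, region, year)                # adds the missing S/2024 row
parts <- separate(data.frame(key = c("a_1", "b_2")), key,
                  into = c("name", "num"), sep = "_")
```

Printing `joined`, `dt_merged`, and `grid` side by side shows how the left join, the default inner merge, and `complete()` treat the region and year combinations that appear in only one table.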