Learn the fundamentals of data manipulation in R, including efficient techniques for indexing, subsetting, filtering, sorting, combining, and reshaping data to handle large datasets in your analysis.
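1. …
   - a) `vector[]`
   - b) `vector(index)`
   - c) `index(vector)`
   - d) `vector[[]]`
2. … `v` in R?
   - a) `v[2]`
   - b) `v[1]`
   - c) `v(2)`
   - d) `v[[2]]`
3. … `df[1:3,]` in R?
   - a) The first three rows of `df`
   - b) … `df`
   - c) … `df`
   - d) … `df`
4. … `df` to select rows where Age > 30?
   - a) `df[df$Age > 30, ]`
   - b) `df[Age > 30, ]`
   - c) `df[Age > 30]`
   - d) `df[Age = 30, ]`
5. … `df` in R?
   - a) `df[, 3]`
   - b) `df[3, ]`
   - c) `df[3]`
   - d) `df$3`
6. What does the `drop=FALSE` parameter do when subsetting data in R?
7. … `X` equals 5?
   - a) `df[df$X == 5, ]`
   - b) `df[X == 5, ]`
   - c) `df[X == 5]`
   - d) `df[X = 5, ]`
8. What does `subset()` do in R?
9. … `df` in R?
   - a) `df[, c(1, 2)]`
   - b) `df[c(1, 2)]`
   - c) `df[1:2]`
   - d) `df[, 1:2]`
10. What does `df[ , -1]` do to a data frame `df`?
11. …
    - a) `filter()`
    - b) `select()`
    - c) `subset()`
    - d) `sort()`
12. … `df` by the column `Age` in ascending order?
    - a) `df[order(df$Age), ]`
    - b) `df$Age[sort()]`
    - c) `sort(df$Age)`
    - d) `df$Age[order()]`
13. …
    - a) `order()`
    - b) `sort()`
    - c) `rank()`
    - d) `filter()`
14. … `df` by multiple columns, `Age` and `Salary`, in R?
    - a) `df[order(df$Age, df$Salary), ]`
    - b) `df[sort(df$Age, df$Salary), ]`
    - c) `df[order(df$Age + df$Salary), ]`
    - d) `df[sort(df$Age, df$Salary)]`
15. What does the `arrange()` function do in R?
16. … `v` in descending order in R?
    - a) `sort(v, decreasing = TRUE)`
    - b) `sort(v, TRUE)`
    - c) `v[sort()]`
    - d) `order(v, decreasing = TRUE)`
17. … `df` where Age is greater than 30 and Salary is less than 50000?
    - a) `df[df$Age > 30 & df$Salary < 50000, ]`
    - b) `df[Age > 30 & Salary < 50000]`
    - c) `df[Age > 30 | Salary < 50000]`
    - d) `df[Age > 30, Salary < 50000]`
18. … `Age` column and then by the `Salary` column in ascending order?
    - a) `df[order(df$Age, df$Salary), ]`
    - b) `df[sort(df$Age, df$Salary), ]`
    - c) `df[order(df$Age, df$Salary)]`
    - d) `df[sort(df$Age + df$Salary)]`
19. …
    - a) `filter()`
    - b) `subset()`
    - c) `select()`
    - d) `order()`
20. … `NA` values in a column?
    - a) `df[!is.na(df$Age), ]`
    - b) `df[is.na(df$Age), ]`
    - c) `df[Age != NA, ]`
    - d) `df[na.omit(df$Age), ]`
21. …
    - a) `rbind()`
    - b) `cbind()`
    - c) `merge()`
    - d) `concat()`
22. … `df1` and `df2` by columns in R?
    - a) `cbind(df1, df2)`
    - b) `rbind(df1, df2)`
    - c) `merge(df1, df2)`
    - d) `concat(df1, df2)`
23. …
    - a) `merge()`
    - b) `rbind()`
    - c) `cbind()`
    - d) `join()`
24. … `rbind(df1, df2)` in R?
    - a) Combines `df1` and `df2` by adding rows
    - b) … `df1` and `df2` by adding columns
    - c) … `df1` and `df2` by a common key
    - d) … `df1` and `df2`
25. …
    - a) `reshape()`
    - b) `melt()`
    - c) `spread()`
    - d) `pivot()`
26. …
    - a) `melt()`
    - b) `reshape()`
    - c) `spread()`
    - d) `pivot_longer()`
27. What does the `pivot_wider()` function do in R?
28. …
    - a) `pivot_wider()`
    - b) `melt()`
    - c) `spread()`
    - d) `reshape()`
29. What does the `cbind()` function do in R?
30. …
    - a) `data.matrix()`
    - b) `as.matrix()`
    - c) `matrix()`
    - d) `df_to_matrix()`

| Qno | Answer |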
|---|---|
| 1 | a) vector[] |
| 2 | a) v[2] |
| 3 | a) The first three rows of df |
| 4 | a) df[df$Age > 30, ] |
| 5 | a) df[, 3] |
| 6 | a) It prevents R from dropping dimensions when selecting a single row or column |
| 7 | a) df[df$X == 5, ] |
| 8 | a) It subsets a data frame based on a condition |
| 9 | a) df[, c(1, 2)] |
| 10 | a) Removes the first column |
| 11 | a) filter() |
| 12 | a) df[order(df$Age), ] |
| 13 | b) sort() |
| 14 | a) df[order(df$Age, df$Salary), ] |
| 15 | a) Sorts data in ascending or descending order |
| 16 | a) sort(v, decreasing = TRUE) |
| 17 | a) df[df$Age > 30 & df$Salary < 50000, ] |
| 18 | a) df[order(df$Age, df$Salary), ] |
| 19 | a) filter() |
| 20 | a) df[!is.na(df$Age), ] |
| 21 | a) rbind() |
| 22 | a) cbind(df1, df2) |
| 23 | a) merge() |
| 24 | a) Combines df1 and df2 by adding rows |
| 25 | b) melt() |
| 26 | a) melt() |
| 27 | a) Converts long-format data into wide-format data |
| 28 | a) pivot_wider() |
| 29 | a) Combines data frames by columns |
| 30 | b) as.matrix() |
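
Several of the answers above can be checked directly in an R session. The following is a minimal sketch using base R only; the sample values in `df`, `df1`, and `df2` are invented for illustration and are not part of the quiz, and the comments note which answer each line corresponds to. The reshaping answers (`melt()`, `pivot_wider()`) are not exercised here because they come from add-on packages (reshape2 and tidyr, respectively).

```r
# Hypothetical sample data: the object names mirror the quiz, the values are invented.
df <- data.frame(
  Name   = c("Ana", "Ben", "Cy", "Dee", "Eli"),
  Age    = c(25, 41, NA, 35, 30),
  Salary = c(40000, 45000, 52000, 60000, 48000)
)
v <- c(10, 30, 20)

# Indexing and subsetting
second       <- v[2]                  # Q2: single brackets index a vector
first_three  <- df[1:3, ]             # Q3: first three rows
third_col    <- df[, 3]               # Q5: third column, returned as a plain vector
third_col_df <- df[, 3, drop = FALSE] # Q6: drop = FALSE keeps a one-column data frame
no_first_col <- df[, -1]              # Q10: negative index removes the first column

# Filtering (rows with a missing Age are removed first, so the comparisons never see NA)
complete_age <- df[!is.na(df$Age), ]                                  # Q20
over_30      <- complete_age[complete_age$Age > 30, ]                 # Q4
target       <- complete_age[complete_age$Age > 30 &
                             complete_age$Salary < 50000, ]           # Q17

# Sorting
v_desc        <- sort(v, decreasing = TRUE)      # Q16
by_age        <- df[order(df$Age), ]             # Q12: order() puts NA last by default
by_age_salary <- df[order(df$Age, df$Salary), ]  # Q14, Q18: Salary breaks ties in Age

# Combining
df1 <- data.frame(id = 1:2, x = c("a", "b"))
df2 <- data.frame(id = 2:3, x = c("c", "d"))
stacked <- rbind(df1, df2)            # Q21, Q24: rows of df2 appended below df1
side    <- cbind(df1, df2)            # Q22, Q29: columns placed side by side
joined  <- merge(df1, df2, by = "id") # Q23: join on the shared key column

# Converting
m <- as.matrix(df[, c("Age", "Salary")]) # Q30: numeric columns give a numeric matrix
```

Note that `df[df$Age > 30, ]` applied to the unfiltered `df` would also return a row of `NA` values for the record with a missing age, which is why the sketch filters on `!is.na(df$Age)` first.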