Learn efficient techniques for manipulating string and text data in R: handling strings with the stringr package, using regular expressions, and cleaning and tokenizing text for data analysis.
MCQs on String and Text Data Manipulation in R
1. String Handling with stringr
What is the function used in stringr to find the length of a string in R? a) str_length() b) length() c) nchar() d) str_size()
Which stringr function is used to replace a substring in a string? a) str_sub() b) str_replace() c) str_extract() d) str_split()
How do you concatenate two strings str1 and str2 in R using stringr? a) str_c(str1, str2) b) concatenate(str1, str2) c) combine(str1, str2) d) str_merge(str1, str2)
Which stringr function can be used to check if a string matches a regular expression pattern? a) str_match() b) str_detect() c) str_which() d) str_extract()
How do you convert a string to uppercase in stringr? a) str_upcase() b) str_to_upper() c) str_to_uppercase() d) str_toUpper()
Which function from stringr is used to extract the first match of a pattern from a string? a) str_extract() b) str_sub() c) str_split() d) str_find()
How do you remove leading and trailing spaces from a string in R using stringr? a) str_trim() b) trim() c) remove_spaces() d) str_clean()
Which stringr function is used to split a string into individual words? a) str_split() b) str_word() c) str_words() d) split()
Which function would you use in stringr to detect if a string contains a pattern? a) str_detect() b) str_match() c) str_count() d) str_find()
What is the correct function to replace the first occurrence of a pattern in a string in stringr? a) str_replace() b) str_replace_all() c) str_sub() d) str_modify()
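For readers who want to try the functions named in this section, the following is a minimal sketch, assuming the stringr package is installed. The sample values in str1 and str2 are hypothetical and chosen only for illustration.

```r
library(stringr)

# Hypothetical sample values, chosen only for illustration
str1 <- "  Hello"
str2 <- "World  "

# Concatenate the pieces, then trim leading and trailing whitespace
greeting <- str_trim(str_c(str1, " ", str2))    # "Hello World"

n_chars <- str_length(greeting)                 # number of characters
shouted <- str_to_upper(greeting)               # upper-case copy
has_o   <- str_detect(greeting, "o")            # TRUE if the pattern occurs
first_o <- str_replace(greeting, "o", "0")      # replaces the first match only
all_o   <- str_replace_all(greeting, "o", "0")  # replaces every match
word1   <- str_extract(greeting, "[A-Za-z]+")   # first run of letters
by_pos  <- str_sub(greeting, 1, 5)              # characters 1 to 5, by position
words   <- str_split(greeting, " ")[[1]]        # character vector of pieces
```

Note the difference between str_extract(), which selects by pattern, and str_sub(), which selects by character position.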
2. Regular Expressions in R
Which of the following regular expression symbols in R is used to match any character except a newline? a) . b) * c) + d) ?
In a regular expression, what does ^ signify? a) Matches the beginning of a string b) Matches the end of a string c) Matches any single character d) Matches a specific character class
What does the \\d pattern in a regular expression match? a) Any digit (0-9) b) Any non-digit character c) Any whitespace d) Any alphanumeric character
How would you match a string that consists of exactly the word “apple” and nothing else, using a regular expression in R? a) ^apple$ b) apple c) \\bapple\\b d) \\dapple\\d
What is the purpose of the regular expression \\b in R? a) Matches a word boundary b) Matches a non-word character c) Matches any alphanumeric character d) Matches the beginning of a string
How do you use a regular expression to match one or more digits in R? a) \\d+ b) \\d* c) \\d{2} d) \\d?
Which regular expression character is used to escape special characters in R? a) \\ b) . c) * d) ^
What does \\w+ match in a regular expression in R? a) A sequence of one or more word characters b) A single whitespace character c) A sequence of one or more non-word characters d) Any number of alphanumeric characters
How do you match a whitespace character in a regular expression in R? a) \\s b) \\w c) \\d d) \\b
What does the $ symbol represent in a regular expression in R? a) End of a string b) Start of a string c) Any whitespace character d) Any digit
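The patterns in this section can be checked with a short sketch like the one below, assuming the stringr package is installed. The vector x and the other sample strings are hypothetical. In R string literals each regex backslash is written twice, which is why the patterns appear as "\\d" and "\\b".

```r
library(stringr)

# Hypothetical sample strings for illustration
x <- c("apple", "pineapple", "apple pie", "order 66", "no digits")

# ^ and $ anchor the pattern to the start and end of the whole string,
# so only the string that is exactly "apple" matches
exact <- str_detect(x, "^apple$")

# \\b marks a word boundary, so "apple" must stand alone as a word
whole_word <- str_detect(x, "\\bapple\\b")

# \\d+ matches a run of one or more digits (NA where there is none)
digits <- str_extract(x, "\\d+")

# \\w+ matches a run of word characters; \\s matches one whitespace character
word_runs <- str_extract_all("tidy text, 2 words", "\\w+")[[1]]
n_space   <- str_count("a b\tc", "\\s")

# A backslash escapes a metacharacter: "\\." matches a literal dot
has_dot <- str_detect(c("file.txt", "filetxt"), "\\.")
```

Comparing exact with whole_word shows the difference between anchoring to the whole string and matching a whole word inside longer text: "apple pie" matches only the second pattern.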
3. Text Cleaning and Tokenization
Which base R function is used to convert a string to lowercase? a) str_to_lower() b) tolower() c) str_lower() d) convert_to_lower()
How can you remove punctuation from a text string in R? a) str_remove_all("[[:punct:]]") b) remove_punctuation() c) str_trim("[[:punct:]]") d) clean_text()
What function is used to remove numbers from a text string in R? a) str_remove_all("\\d+") b) remove_numbers() c) str_extract_all("[0-9]") d) remove_digits()
Which stringr function can be used for basic text tokenization in R? a) str_split() b) word_tokenize() c) tokenize_text() d) str_tokenize()
In text cleaning, which R function is used to replace all instances of a specific character with another? a) str_replace_all() b) str_replace() c) str_sub() d) str_modify_all()
Which of the following is used to remove stop words from a text string in R? a) removeWords() b) str_remove() c) str_trim() d) clean_text()
What does the function str_detect() do in R during text analysis? a) Detects the presence of a pattern in a string b) Replaces a pattern in a string c) Splits a string into tokens d) Extracts a substring from a string
How would you remove leading and trailing whitespace and also collapse repeated internal whitespace into single spaces in R? a) str_squish() b) str_trim() c) remove_spaces() d) str_clean()
Which R package provides a dedicated text-mining framework for advanced text processing, including tokenization and text cleaning? a) stringr b) tm c) dplyr d) tidyr
What is the primary purpose of text tokenization in R? a) To split text into smaller units like words or sentences b) To remove special characters c) To analyze the sentiment of text d) To convert text into lowercase
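The cleaning and tokenization steps covered in this section can be chained into a small pipeline. The sketch below assumes the stringr package is installed. The input string raw and the two-word stop_words list are hypothetical, and the %in% filter is a simple stand-in for tm's removeWords(), not that function itself.

```r
library(stringr)

# Hypothetical raw text for illustration
raw <- "  The 3 quick  brown foxes, and the 2 lazy dogs!  "

clean <- tolower(raw)                           # lowercase (base R)
clean <- str_remove_all(clean, "[[:punct:]]")   # drop punctuation
clean <- str_remove_all(clean, "\\d+")          # drop runs of digits
clean <- str_squish(clean)                      # trim and collapse whitespace

# Tokenize on single spaces; str_split() returns a list, so take element 1
tokens <- str_split(clean, " ")[[1]]

# Hypothetical tiny stop-word list; a stand-in for tm::removeWords()
stop_words <- c("the", "and")
tokens <- tokens[!tokens %in% stop_words]
```

Calling str_squish() before str_split() matters here: without it, the double spaces left behind by the removed digits would produce empty tokens.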
Answers
Qno  Answer
1    a) str_length()
2    b) str_replace()
3    a) str_c(str1, str2)
4    b) str_detect()
5    b) str_to_upper()
6    a) str_extract()
7    a) str_trim()
8    a) str_split()
9    a) str_detect()
10   a) str_replace()
11   a) .
12   a) Matches the beginning of a string
13   a) Any digit (0-9)
14   a) ^apple$
15   a) Matches a word boundary
16   a) \\d+
17   a) \\
18   a) A sequence of one or more word characters
19   a) \\s
20   a) End of a string
21   b) tolower()
22   a) str_remove_all("[[:punct:]]")
23   a) str_remove_all("\\d+")
24   a) str_split()
25   a) str_replace_all()
26   a) removeWords()
27   a) Detects the presence of a pattern in a string
28   a) str_squish()
29   b) tm
30   a) To split text into smaller units like words or sentences