This is primarily a personal learning project. It uses wrappers around base R string operations to provide a comprehensive and convenient set of functions for working with strings in R. More specifically, for:
- Combining strings.
- Cleaning and transforming strings.
- Manipulating strings based on regular expression patterns.
The interface is built with a consistent naming scheme and argument structure. The first argument will be the string or strings to work on, which makes it particularly convenient to work with pipes. Functions start with prefixes for easy identification and auto-completion:
-
str_
for vectorised operations on strings. -
chr_
for vector-wide operations over strings.
You can read more at vignette("suitestrings-conventions")
.
Installation
You can install suitestrings from GitHub with:
# install.packages("devtools")
devtools::install_github("davidfoord1/suitestrings", build_vignettes = TRUE)
# Load the package to your library
library(suitestrings)
This package is built with base R, so there are no additional dependencies to install.
Examples
Combine strings
str_concat(c("Hello", "How are"), c("world!", "you?"), separator = " ")
#> [1] "Hello world!" "How are you?"
With R expressions
str_glue()
treats text in braces {}
like R code
Clean and transform strings
Shorten strings
# Remove leading/trailing spaces and reduce middle spaces down to one
str_squish(" Too much space ")
#> [1] "Too much space"
# Reduce a string to a specified length
str_truncate("This string is far too long, let's make it shorter", 20)
#> [1] "This string is fa..."
Extend strings
# Append a specific number of characters
str_indent(c("Hello", "World"), 3)
#> [1] " Hello" " World"
# Repeat the contents of strings
str_repeat("hello", 3, ", ")
#> [1] "hello, hello, hello"
Reformat strings
# Convert strings to different cases
str_to_snake_case(" This /IS/ a ->>!!GREAT!!<<-STriNg!!")
#> [1] "this_is_a_great_string"
Manipulate strings based on regular expression patterns
# Prepare an example character vector.
strings <- c("flat-hat", "backpack", "roll", "cat-sat-on-a-mat")
# Define a pattern for a three letter sequence with "a" in the middle.
pattern <- "\\wa\\w"
Detect, extract and replace patterns in strings
# Does a string contain a match?
str_detect(strings, pattern)
#> [1] TRUE TRUE FALSE TRUE
# Get the first match in a string
str_extract_first(strings, pattern)
#> [1] "lat" "bac" NA "cat"
# Replace the first match in a string
str_replace_first(strings, pattern, "rat")
#> [1] "frat-hat" "ratkpack" "roll" "rat-sat-on-a-mat"
Work with specific occurrences of patterns
Using suffixes _first, _nth and _last
# Get the second match for each string
str_extract_nth(strings, pattern, 2)
#> [1] "hat" "pac" NA "sat"
# Remove the last match in each string
str_remove_last(strings, pattern)
#> [1] "flat-" "backk" "roll" "cat-sat-on-a-"
Work with every occurrence of a pattern
With suffix _all
str_locate_all(strings, pattern)
#> [[1]]
#> start end
#> [1,] 2 4
#> [2,] 6 8
#>
#> [[2]]
#> start end
#> [1,] 1 3
#> [2,] 5 7
#>
#> [[3]]
#> start end
#> [1,] NA NA
#>
#> [[4]]
#> start end
#> [1,] 1 3
#> [2,] 5 7
#> [3,] 14 16
# Get a list of every match for each string
str_extract_all(strings, pattern)
#> [[1]]
#> [1] "lat" "hat"
#>
#> [[2]]
#> [1] "bac" "pac"
#>
#> [[3]]
#> character(0)
#>
#> [[4]]
#> [1] "cat" "sat" "mat"
Work on the character vector as a whole
With prefix chr_
# Get every match from `strings` into one character vector
chr_extract_all(strings, pattern)
#> [1] "lat" "hat" "bac" "pac" "cat" "sat" "mat"
# Which elements of the vector contain a match?
chr_which(strings, pattern)
#> [1] 1 2 4
# Get the subset of `strings` that contain a match
chr_subset(strings, pattern)
#> [1] "flat-hat" "backpack" "cat-sat-on-a-mat"