
These functions find occurrences of a pattern in strings.

str_locate_first(), str_locate_nth() and str_locate_last() find the specified occurrence of a pattern in each string.

str_locate_all() finds all occurrences of a pattern in each string.

Usage

str_locate_first(strings, pattern, fixed = FALSE)

str_locate_all(strings, pattern, fixed = FALSE)

str_locate_nth(strings, pattern, n, fixed = FALSE)

str_locate_last(strings, pattern, fixed = FALSE)

Arguments

strings

A character vector, where each element of the vector is a character string.

pattern

A single character string to be searched for in each element of strings. By default, pattern is interpreted as a regular expression (regex). If the fixed argument is set to TRUE, pattern will be treated as a literal string to be matched exactly.

fixed

Logical; whether pattern should be matched exactly, treating regex special characters as regular string characters. Default FALSE.

n

str_locate_nth() only: Integer, the occurrence of the pattern to locate (the nth match). Negative values count back from the end.

Value

str_locate_first(), str_locate_nth() and str_locate_last() return a two-column matrix with the start and end positions of the first, nth and last match respectively. There is a row for each string.

str_locate_all() returns a list of matrices. There is a matrix for each string and a row for each match.

If no match is found, NA values are returned.
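
As a rough illustration of how n and the NA rule fit together, the sketch below picks the nth match of a single string using base R's gregexpr(). The helper name locate_nth_sketch() is hypothetical and is not part of the package; the handling of out-of-range n is an assumption based on the description above, not the package's actual source.

```r
# Hypothetical helper (not the package's implementation): locate the nth
# match in ONE string, counting back from the end when n is negative.
locate_nth_sketch <- function(string, pattern, n) {
  m <- gregexpr(pattern, string, perl = TRUE)[[1]]
  start <- as.integer(m)
  len <- attr(m, "match.length")
  # gregexpr() signals "no match" with a single -1
  count <- if (start[1] == -1L) 0L else length(start)
  # Map a negative n onto a position from the front: -1 is the last match
  i <- if (n < 0) count + n + 1L else n
  # Assumption: an n that points outside the matches yields NA positions
  if (i < 1 || i > count) {
    return(c(start = NA_integer_, end = NA_integer_))
  }
  c(start = start[i], end = start[i] + len[i] - 1L)
}

locate_nth_sketch("Goodbye world", "o", 2)    # second "o"
locate_nth_sketch("Goodbye world", "o", -1)   # last "o"
locate_nth_sketch("Hello world", "world", 2)  # no second match
```

Under this reading, n = -1 selects the same match as the last occurrence, and asking for an occurrence that does not exist gives NA for both start and end.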

Details

These functions are built on the base R regular expression functions. {suitestrings} uses Perl-compatible Regular Expressions (PCRE), which it enables by setting perl = TRUE in the underlying base functions. See R's regexp documentation for information on the regex implementation. For complete syntax details, see https://www.pcre.org/current/doc/html/
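
To make the relationship with base R concrete, here is a minimal sketch of a first-match locator built on regexpr(). The name locate_first_sketch() is hypothetical, and this is an assumption about the general approach, not the package's actual code. Note that base R ignores perl = TRUE (with a warning) when fixed = TRUE, so the sketch switches PCRE off for literal matching.

```r
# Hypothetical helper (not the package's implementation): first match per
# string, returned as a start/end matrix with NA where nothing matches.
locate_first_sketch <- function(strings, pattern, fixed = FALSE) {
  # PCRE for regex patterns; plain literal matching when fixed = TRUE
  m <- regexpr(pattern, strings, perl = !fixed, fixed = fixed)
  start <- as.integer(m)
  end <- start + attr(m, "match.length") - 1L
  # regexpr() reports "no match" as -1; convert those rows to NA
  no_match <- which(start == -1L)
  start[no_match] <- NA_integer_
  end[no_match] <- NA_integer_
  cbind(start = start, end = end)
}

locate_first_sketch(c("Hello world", "Goodbye world"), "o")
locate_first_sketch("a.b", ".")                # regex: "." matches any character
locate_first_sketch("a.b", ".", fixed = TRUE)  # literal: matches the dot itself
```

The last two calls show why fixed matters: as a regex, "." matches the first character, while as a literal string it matches only the dot at position 2.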

See also

regexpr() and gregexpr() to locate matches in base R. They return results in a different form: an integer vector of start positions, with the match lengths stored in the match.length attribute.
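
For comparison, the sketch below shows the base R form and one way to convert gregexpr() output into start/end matrices. The helper to_start_end() is hypothetical, and returning a one-row NA matrix for a string with no match is an assumption made for illustration.

```r
x <- c("Hello world", "Goodbye world")

# Base R form: start positions, with lengths in the "match.length" attribute
m <- regexpr("o", x, perl = TRUE)
as.integer(m)            # start of the first match in each string
attr(m, "match.length")  # length of each of those matches

# gregexpr() returns one such vector per string, covering every match
all_matches <- gregexpr("o", x, perl = TRUE)

# Hypothetical converter: start + length - 1 gives the end position
to_start_end <- function(m) {
  start <- as.integer(m)
  # Assumption: represent "no match" (or an NA input) as a single NA row
  if (is.na(start[1]) || start[1] == -1L) {
    return(cbind(start = NA_integer_, end = NA_integer_))
  }
  cbind(start = start, end = start + attr(m, "match.length") - 1L)
}

lapply(all_matches, to_start_end)
```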

Examples

str_locate_first("Hello world", "world")
#>      start end
#> [1,]     7  11
str_locate_first(c("Hello world", "Goodbye world"), "o")
#>      start end
#> [1,]     5   5
#> [2,]     2   2

str_locate_all(c("Hello world", "Goodbye world"), "world")
#> [[1]]
#>      start end
#> [1,]     7  11
#>
#> [[2]]
#>      start end
#> [1,]     9  13
str_locate_all(c("Hello world", "Goodbye world"), "o")
#> [[1]]
#>      start end
#> [1,]     5   5
#> [2,]     8   8
#>
#> [[2]]
#>      start end
#> [1,]     2   2
#> [2,]     3   3
#> [3,]    10  10

str_locate_nth("Hello world", "world", 2)
#>      start end
#> [1,]    NA  NA
str_locate_nth(c("Hello world", "Goodbye world"), "o", 2)
#>      start end
#> [1,]     8   8
#> [2,]     3   3

str_locate_last("Hello world", "world")
#>      start end
#> [1,]     7  11
str_locate_last(c("Hello world", "Goodbye world"), "o")
#>      start end
#> [1,]     8   8
#> [2,]    10  10