String Functions
- reference
This topic describes the builtin SQL++ for Enterprise Analytics string functions.
concat
-
Syntax:
concat(string1, string2, ...)
-
Returns a concatenated string from arguments.
-
Arguments:
-
string1: a string value, -
string2: a string value, -
….
-
-
Return Value:
-
a concatenated string from arguments,
-
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, -
any other non-string input value causes a type error.
-
-
Example:
concat("test ", "driven ", "development"); -
The expected result is:
"test driven development"
contains
-
Syntax:
contains(string, substring_to_contain)
-
Checks whether the string
stringcontains the stringsubstring_to_contain -
Arguments:
-
string: astringthat might contain the given substring, -
substring_to_contain: a targetstringthat might be present in thestring.
-
-
Return Value:
-
a
Booleanvalue,trueifstringcontainssubstring_to_contain, -
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, -
any other non-string input value causes a type error,
-
falseotherwise.
-
-
Example:
{ "v1": contains("I like x-phone", "phone"), "v2": contains("one", "phone") }; -
The expected result is:
{ "v1": true, "v2": false }
ends_with
-
Syntax:
ends_with(string, substring_to_end_with)
-
Checks whether the string
stringends with the stringsubstring_to_end_with. -
Arguments:
-
string: astringthat might end with the given string, -
substring_to_end_with: astringthat might be the ending substring.
-
-
Return Value:
-
a
Booleanvalue,trueifstringcontainssubstring_to_contain, -
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, -
any other non-string input value causes a type error,
-
falseotherwise.
-
-
Example:
{ "v1": ends_with(" love product-b its shortcut_menu is awesome:)", ":)"), "v2": ends_with(" awsome:)", ":-)") }; -
The expected result is:
{ "v1": true, "v2": false }
initcap (or title)
-
Syntax:
initcap(string)
-
Converts a given string
stringso that the first letter of each word is uppercase and every other letter is lowercase. The function has an alias called "title". -
Arguments:
-
string: astringto convert.
-
-
Return Value:
-
a
stringas the title form of the givenstring, -
missingif the argument is amissingvalue, -
nullif the argument is anullvalue, -
any other non-string input value causes a type error.
-
-
Example:
{ "v1": initcap("ANALYTICS is here!"), "v2": title("ANALYTICS is here!") }; -
The expected result is:
{ "v1": "Analytics Is Here!", "v2": "Analytics Is Here!" }
length
-
Syntax:
length(string)
-
Returns the length of the string
stringin code point units. See the following examples for more details. -
Arguments:
-
string: astringornullthat represents the string to check.
-
-
Return Value:
-
an
bigintthat represents the length ofstring, -
missingif the argument is amissingvalue, -
nullif the argument is anullvalue, -
any other non-string input value causes a type error.
-
-
Example:
length("test string"); -
The expected result is:
11
-
Example:
length("π©βπ©βπ§βπ¦"); -
The expected result for this emoji character π©βπ©βπ§βπ¦ with 7 code points:
7
lower
-
Syntax:
lower(string)
-
Converts a given string
stringto its lowercase form. -
Arguments:
-
string: astringto convert.
-
-
Return Value:
-
a
stringas the lowercase form of the givenstring, -
missingif the argument is amissingvalue, -
nullif the argument is anullvalue, -
any other non-string input value causes a type error.
-
-
Example:
lower("ANALYTICS"); -
The expected result is:
"analytics"
ltrim
-
Syntax:
ltrim(string[, chars]);
-
Returns a new string with all leading characters that appear in
charsremoved. By default, white space is the character to trim. Here, one character means one code point. For example, the emoji 4-people-family notation π©βπ©βπ§βπ¦ contains 7 code points, and it’s possible to trim a few code points (such as a 2-people-family π¨βπ¦) from it. See the following example for more details. -
Arguments:
-
string: astringto trim, -
chars: astringthat contains characters that are used to trim.
-
-
Return Value:
-
a trimmed, new
string, -
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, -
any other non-string input value causes a type error.
-
-
Related functions: see
trim(),rtrim() -
Example:
ltrim("me like x-phone", "eml"); -
The expected result is:
" like x-phone"
-
Example with multi-codepoint notation (trim the man and boy from the family of man, woman, girl and boy):
ltrim("π¨βπ©βπ§βπ¦", "π¨βπ¦") -
The expected result is (only woman, girl and boy are left in the family):
"π©βπ§βπ¦"
position
-
Syntax:
position(string, string_pattern)
-
Returns the first position of
string_patternwithinstring. The result is counted in the unit of code points. See the following example for more details. -
The function returns the 0-based position. Another version of the function returns the 1-based position. Below are the aliases for each version:
-
0-based:
position,pos,position0,pos0. -
1-based:
position1,pos1.
-
-
Arguments:
-
string: astringthat might contain the pattern. -
string_pattern: a patternstringto match.
-
-
Return Value:
-
the first position that
string_patternappears withinstring, starting at 0, or -1 if it does not appear, -
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, -
any other non-string input value returns a
null.
-
-
Example:
{ "v1": position("ppphonepp", "phone"), "v2": position("hone", "phone"), "v3": position1("ppphonepp", "phone"), "v4": position1("hone", "phone") }; -
The expected result is:
{ "v1": 2, "v2": -1, v3": 3, "v4": -1 } -
Example of multi-code-point character:
position("π©βπ©βπ§βπ¦π", "π"); -
The expected result is (the emoji family character has 7 code points):
7
regexp_contains
-
Syntax:
regexp_contains(string, string_pattern[, string_flags])
-
Checks whether the strings
stringcontains the regular expression patternstring_pattern(a Java regular expression pattern). -
Aliases:
-
regexp_contains,regex_contains,contains_regexp,contains_regex.
-
-
Arguments:
-
string: astringthat might contain the pattern. -
string_pattern: a patternstringto match. -
string_flag: Optional. Astringwith flags to use during regular expression matching.-
The following modes are enabled with these flags: dotall (s), multiline (m), case_insensitive (i), and comments and whitespace (x).
-
-
-
Return Value:
-
a
Boolean, returnstrueifstringcontains the patternstring_pattern,falseotherwise. -
missingif any argument is amissingvalue. -
nullif any argument is anullvalue but no argument is amissingvalue. -
any other non-string input value returns a
null.
-
-
Example:
{ "v1": regexp_contains("pphonepp", "p*hone"), "v2": regexp_contains("hone", "p+hone") }; -
The expected result is:
{ "v1": true, "v2": false }
regexp_like
-
Syntax:
regexp_like(string, string_pattern[, string_flags])
-
Checks whether the string
stringexactly matches the regular expression patternstring_pattern(a Java regular expression pattern). -
Aliases:
-
regexp_like,regex_like.
-
-
Arguments:
-
string: astringthat might contain the pattern. -
string_pattern: a patternstringthat might be contained. -
string_flag: Optional. Astringwith flags to use during regular expression matching.-
The following modes are enabled with these flags: dotall (s), multiline (m), case_insensitive (i), and comments and whitespace (x).
-
-
-
Return Value:
-
a
Booleanvalue,trueifstringcontains the patternstring_pattern,falseotherwise. -
missingif any argument is amissingvalue. -
nullif any argument is anullvalue but no argument is amissingvalue. -
any other non-string input value returns a
null.
-
-
Example:
{ "v1": regexp_like(" can't stand acast the network is horrible:(", ".*acast.*"), "v2": regexp_like("acast", ".*acst.*") }; -
The expected result is:
{ "v1": true, "v2": false }
regexp_position
-
Syntax:
regexp_position(string, string_pattern[, string_flags])
-
Returns first position of the regular expression
string_pattern(a Java regular expression pattern) withinstring. The function returns the 0-based position. Another version of the function returns the 1-based position. Below are the aliases for each version: -
Aliases:
-
0-Based:
regexp_position,regexp_pos,regexp_position0,regexp_pos0,regex_position,regex_pos,regex_position0,regex_pos0. -
1-Based:
regexp_position1,regexp_pos1,regex_position1regex_pos1.
-
-
Arguments:
-
string: astringthat might contain the pattern. -
string_pattern: a patternstringto be matched. -
string_flag: Optional. Astringwith flags to use during regular expression matching.-
The following modes are enabled with these flags: dotall (s), multiline (m), case_insensitive (i), and comments and whitespace (x).
-
-
-
Return Value:
-
the first position that the regular expression
string_patternappears instring(starting at 0), or -1 if it does not appear. -
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, -
any other non-string input value returns a
null.
-
-
Example:
{ "v1": regexp_position("pphonepp", "p*hone"), "v2": regexp_position("hone", "p+hone"), "v3": regexp_position1("pphonepp", "p*hone"), "v4": regexp_position1("hone", "p+hone") }; -
The expected result is:
{ "v1": 0, "v2": -1, "v3": 1, "v4": -1 }
regexp_replace
-
Syntax:
regexp_replace(string, string_pattern, string_replacement[, string_flags]) regexp_replace(string, string_pattern, string_replacement[, replacement_limit])
-
Checks whether the string
stringmatches the given regular expression patternstring_pattern(a Java regular expression pattern), and replaces the matched patternstring_patternwith the new patternstring_replacement. -
Aliases:
-
regexp_replace,regex_replace.
-
-
Arguments:
-
string: astringthat might contain the pattern. -
string_pattern: a patternstringto be matched. -
string_replacement: a patternstringto use as the replacement. -
string_flag: Optional. Astringwith flags to use during replace.-
The following modes are enabled with these flags: dotall (s), multiline (m), case_insensitive (i), and comments and whitespace (x).
-
-
replacement_limit: Optional. Anintegerspecifying the maximum number of replacements to make (if negative then all occurrences will be replaced)
-
-
Return Value:
-
Returns a
stringthat is obtained after the replacements. -
missingif any argument is amissingvalue. -
nullif any argument is anullvalue but no argument is amissingvalue. -
any other non-string input value returns a
null.
-
-
Example:
regexp_replace(" like x-phone the voicemail_service is awesome", " like x-phone", "like product-a"); -
The expected result is:
"like product-a the voicemail_service is awesome"
repeat
-
Syntax:
repeat(string, n)
-
Returns a string formed by repeating the input
stringntimes. -
Arguments:
-
string: astringto be repeated, -
n: antinyint/smallint/integer/bigintvalue - how many times the string should be repeated.
-
-
Return Value:
-
a string that repeats the input
stringntimes, -
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, -
a type error will be raised if:
-
the first argument is any other non-string value,
-
or, the second argument is not a
tinyint,smallint,integer, orbigint.
-
-
-
Example:
repeat("test", 3); -
The expected result is:
"testtesttest"
replace
-
Syntax:
replace(string, search_string, replacement_string[, limit])
-
Finds occurrences of the given substring
search_stringin the input stringstringand replaces them with the new substringreplacement_string. -
Arguments:
-
string: an inputstring, -
search_string: astringsubstring to be searched for, -
replacement_string: astringto use as the replacement, -
limit: Optional. Aninteger- maximum number of occurrences to be replaced. If not specified or negative then all occurrences will be replaced
-
-
Return Value:
-
Returns a
stringthat is obtained after the replacements, -
missingif any argument is amissingvalue, -
any other non-string input value or non-integer
limitcauses a type error, -
nullif any argument is anullvalue but no argument is amissingvalue.
-
-
Example:
{ "v1": replace(" like x-phone the voicemail_service is awesome", " like x-phone", "like product-a"), "v2": replace("x-phone and x-phone", "x-phone", "product-a", 1) }; -
The expected result is:
{ "v1": "like product-a the voicemail_service is awesome", "v2": "product-a and x-phone" }
reverse
-
Syntax:
reverse(string)
-
Returns a string formed by reversing characters in the input
string. For characters of multiple code points, code point is the minimal unit to reverse. See the following examples for more details. -
Arguments:
-
string: astringto be reversed
-
-
Return Value:
-
a string containing characters from the the input
stringin the reverse order, -
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, -
a type error will be raised if:
-
the first argument is any other non-string value
-
-
-
Example:
reverse("hello"); -
The expected result is:
"olleh"
-
Example of multi-code-point character (Korean):
reverse("αα ‘α«αα ³α―"); -
The expected result, in which the Korean characters are split into code points and then the code points are reversed:
"α―α ³αα«α ‘α"
rtrim
-
Syntax:
rtrim(string[, chars]);
-
Returns a new string with all trailing characters that appear in
charsremoved. By default, white space is the character to trim. Note that here one character means one code point. For example, the emoji 4-people-family notation "π©βπ©βπ§βπ¦" contains 7 code points, and it is possible to trim a few code points (such as a 2-people-family "π¨βπ¦") from it. See the following example for more details. -
Arguments:
-
string: astringto be trimmed, -
chars: astringthat contains characters that are used to trim.
-
-
Return Value:
-
a trimmed, new
string, -
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, -
any other non-string input value causes a type error.
-
-
Related functions: see
trim(),ltrim() -
Example:
{ "v1": rtrim("i like x-phone", "x-phone"), "v2": rtrim("i like x-phone", "onexph") }; -
The expected result is:
{ "v1": "i like ", "v2": "i like x-" } -
Example with multi-codepoint notation (trim the man and boy from the family of man, woman, girl and boy):
rtrim("π¨βπ©βπ§βπ¦", "π¨βπ¦") -
The expected result is (only man, woman and girl are left in the family):
"π¨βπ©βπ§"
split
-
Syntax:
split(string, sep)
-
Splits the input
stringinto an array of substrings separated by the stringsep. -
Arguments:
-
string: astringto be split.
-
-
Return Value:
-
an array of substrings by splitting the input
stringbysep, -
in case of two consecutive
seps in thestring, the result of splitting the two consecutiveseps will be the empty string"", -
missingif the argument is amissingvalue, -
nullif the argument is anullvalue, -
any other non-string input value causes a type error.
-
-
Example:
split("test driven development", " "); -
The expected result is:
[ "test", "driven", "development" ]
-
Example with two consecutive
seps in thestring:split("123//456", "/"); -
The expected result is:
[ "123", "", "456" ]
starts_with
-
Syntax:
starts_with(string, substring_to_start_with)
-
Checks whether the string
stringstarts with the stringsubstring_to_start_with. -
Arguments:
-
string: astringthat might start with the given string. -
substring_to_start_with: astringthat might be contained as the starting substring.
-
-
Return Value:
-
a
Boolean, returnstrueifstringstarts with the stringsubstring_to_start_with, -
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, -
any other non-string input value causes a type error,
-
falseotherwise.
-
-
Example:
{ "v1" : starts_with(" like the plan, amazing", " like"), "v2" : starts_with("I like the plan, amazing", " like") }; -
The expected result is:
{ "v1": true, "v2": false }
substr
-
Syntax:
substr(string, offset[, length])
-
Returns the substring from the given string
stringbased on the given start offsetoffsetwith the optionallength. Note that both of theoffsetandlengthare in the unit of code point (e.g. the emoji family π¨βπ©βπ§βπ¦ has 7 code points). The function uses the 0-based position. Another version of the function uses the 1-based position. Below are the aliases for each version: -
Aliases:
-
0-Based:
substring,substr,substring0,substr0. -
1-Based:
substring1,substr1.
-
-
Arguments:
-
string: astringto extract. -
offset: antinyint/`sm, starting at 0. If negative then counted from the end of the string. -
length: Optional. Atinyint/smallint/integer/bigintvalue as the length of the substring.
-
-
Return Value:
-
a
stringthat represents the substring, -
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, or if the substring could not be obtained because the starting offset is not within string bounds orlengthis negative. -
a
nullwill be returned if:-
the first argument is any other non-string value.
-
the second argument is not a
tinyint,smallint,integer, orbigint. -
the third argument is not a
tinyint,smallint,integer, orbigintif the argument is present.
-
-
-
Example:
{ "v1": substr("test string", 6, 3), "v2": substr1("test string", 6, 3) }; -
The expected result is:
{ "v1": "tri", "v2": "str" }
The function has an alias substring.
trim
-
Syntax:
trim(string[, chars]);
-
Returns a new string with all leading and trailing characters that appear in
charsremoved. By default, white space is the character to trim. Note that here one character means one code point. For example, the emoji 4-people-family notation "π©βπ©βπ§βπ¦" contains 7 code points, and it is possible to trim a few code points (such as a 2-people-family "π¨βπ¦") from it. See the following example for more details. -
Arguments:
-
string: astringto be trimmed, -
chars: astringthat contains characters that are used to trim.
-
-
Return Value:
-
a trimmed, new
string, -
missingif any argument is amissingvalue, -
nullif any argument is anullvalue but no argument is amissingvalue, -
any other non-string input value causes a type error.
-
-
Related functions: see
ltrim(),rtrim() -
Example:
trim("i like x-phone", "xphoen"); -
The expected result is:
" like "
-
Example with multi-codepoint notation (trim the man and boy from the family of man, woman, girl and boy):
trim("π¨βπ©βπ§βπ¦", "π¨βπ¦") -
The expected result is (only woman and girl are left in the family):
"π©βπ§"
upper
-
Syntax:
upper(string)
-
Converts a given string
stringto its uppercase form. -
Arguments:
-
string: astringto be converted.
-
-
Return Value:
-
a
stringas the uppercase form of the givenstring, -
missingif the argument is amissingvalue, -
nullif the argument is anullvalue, -
any other non-string input value causes a type error.
-
-
Example:
upper("hello") -
The expected result is:
"HELLO"
concat2
-
Syntax:
concat2(string_separator, string_or_array_of_strings1, string_or_array_of_strings2, ...)
-
This function takes the input strings, or arrays of strings, and concatenates them with the specified separator between each input string. Arrays of strings are flattened and concatenated in the same order. If there is only a single string argument, the separator is not used.
-
Arguments:
-
string_separator: a string value representing the separator to use between the concatenated strings. -
string_or_array_of_strings1: a string or an array of strings to be concatenated. -
….
-
-
Return Value:
-
a concatenated string from arguments with separator between each argument.
-
missingif any argument is amissingvalue. -
missingif an array is provided, and any element in the array ismissing. -
nullif any argument (or any array element) is anullbut no argument (or any array element) is amissingvalue. -
any other non-string input value returns a
null.
-
-
Example:
concat2("-", "ab", "cd", ["e", "f", "g"]); -
The expected result is:
"ab-cd-e-f-g"