User-Defined Functions
- Capella Analytics
- reference
In SQL++ for Capella Analytics, user-defined functions enable you to name and reuse complex or repetitive expressions, including subqueries, in order to simplify your queries.
Each user-defined function belongs to a scope. Within a scope, a function is uniquely identified by its function signature: that is, its name and the number of parameters it takes. You can create more than one function with the same name in the same scope, as long as each function has a different number of parameters.
You can query the Function collection within the System.Metadata scope to get metadata about all existing user-defined functions.
FUNCTION is a reserved keyword, so you need to delimit the identifier for the Function collection with backticks (``).
|
Creating a Function
CreateFunction
The CREATE FUNCTION command enables you to create a user-defined function.
At present, Couchbase supports Internal Functions, which use SQL++ expressions.
This syntax enables you to create an internal user-defined function. You can think of this kind of SQL++ user-defined function as being a parameterized view.
Function Name
QualifiedName
DatabaseAndScopeName
The QualifiedName specifies the name of the function to create.
For information about how Capella Analytics organizes entities into a database.scope.database_object hierarchy and resolves names, see Entities in Capella Analytics Services.
Parameter List
FunctionParameters
The parameter list specifies parameters for the function.
You delimit the list with parentheses ().
You can specify named parameters for the function using a comma-separated list of identifiers.
If you specify named parameters for the function, then you must call the function with exactly the same number of arguments in your queries. If you specify no parameters, then you must call the function with no arguments.
To create a variadic function, that is, a function that you can call with any number of arguments or none, specify ... as the only parameter.
Function Body
Query
The function body defines the function.
You delimit the function body with braces {}.
It can contain any valid expression or subquery.
If you specified named parameters for the function, you can use them in the function body to represent arguments passed to the function at execution time.
If the function is variadic, an array named args holds any arguments passed to the function at execution time.
The function body can refer to other collections or views, or to other functions in the same or other scopes. If a scope name was explicitly specified when creating the function, then that scope is the default scope for function calls or collection references within the function body.
Recursive function invocation is not permitted. The function body cannot refer to itself or to another user-defined function that calls this function indirectly.
Checking for an Existing Function and Replacing a Function
The optional IF NOT EXISTS keywords enable you to verify whether a user-defined function exists before creating it, and the optional OR REPLACE keywords enable you to redefine the function.
If a function with the same signature already exists within the specified scope, then:
-
If the
OR REPLACEkeywords are present, replace the existing function. -
If the
OR REPLACEkeywords are not present, then:-
If the
IF NOT EXISTSkeywords are present, the statement does nothing and completes without error. -
If the
IF NOT EXISTSkeywords are not present, generate an error.
-
If the statement contains both the OR REPLACE keywords and the IF NOT EXISTS keywords, an error results.
Internal Function Examples
For simplicity, none of these examples implement any data validation.
Example 1: Function with expression body
This statement creates a function called rstr, which returns the specified number of characters from the right of a string.
The function expects two named arguments: vString, which is the string to work with, and vLen, which is the number of characters to return.
The body of the function is an expression based on the two arguments.
CREATE FUNCTION rstr(vString, vLen)
{ substr(vString, length(vString) - vLen, vLen) };
Test
rstr("Couchbase", 4);
Result
[ "base" ]
Example 2: Function with subquery body
This statement creates a function called total_spending based on the Commerce example data.
The function takes one parameter, a customer id.
Using a subquery, the function returns the total spending of all orders placed by that customer.
CREATE FUNCTION total_spending(id)
{ SELECT VALUE SUM(i.qty * i.price)
FROM orders AS o UNNEST o.items AS i
WHERE o.custid = id
};
Test
total_spending("C13");
Result
[ 13036.8 ]
Example 3: Variadic function with expression body
This statement creates a function that can take any number of arguments. Using an expression, the function returns the number of arguments that are passed to it.
CREATE FUNCTION count_my_args( ... )
{ array_count(args) };
Test
count_my_args("Hello", "Goodbye");
Result
[ 2 ]
Example 4: Variadic function containing a subquery
The body of this function is an expression that contains a subquery. The function takes a variable number of strings and returns the total length of all the strings.
CREATE FUNCTION total_length(...)
{ array_sum(
(SELECT VALUE length(a) FROM args AS a)
)
};
Test
total_length("Hello", "Goodbye");
Result
[ 12 ]
Example 5: Function with no parameters
This statement creates a function which returns the mathematical constant φ. The function takes no arguments.
CREATE FUNCTION phi() { 2 * sin(radians(54)) };
phi();
[ 1.618033988749895 ]
Example 6: Replace a function
The following statement redefines the function so that it calculates φ using a different method.
CREATE OR REPLACE FUNCTION phi() { (1 + sqrt(5)) / 2 };
phi();
[ 1.618033988749895 ]
TRANSFORM FUNCTION
Capella Analytics supports lightweight transformations on incoming data destined for remote collections.
You can use SQL++ TRANSFORM User-Defined Functions (UDFs) to declaratively transform data objects before they’re stored in remote collections.
The supported common transformations through UDFs include:
-
Field operations and record filtering, such as:
-
Renaming fields.
-
Adding fields with static values.
-
Excluding specific fields.
-
Filtering records.
-
Flattening nested structures, using single-row nested aggregation.
-
-
Value conversions, such as:
-
Type conversions including string to integer and timestamp formatting.
-
String manipulations including concatenation, lower/upper case, and substring.
-
Arithmetic transformations and creation of derived fields.
-
Create TRANSFORM FUNCTION
A TRANSFORM FUNCTION is a specialized type of user-defined function (UDF) that’s created using the CREATE TRANSFORM FUNCTION statement.
Its purpose is to perform an automatic, on-the-fly transformation of incoming data.
For an UDF to be a valid TRANSFORM FUNCTION, you must have the following conditions:
-
Only has a single argument: this argument is the incoming document, received from the data service/Kafka, which is passed to the function.
-
Returns at most one value: the function should return at most one value for each input document. If for an input document, the function does not return any value, that document is skipped. In case the function returns a value, the value should be a valid document. The returned/transformed document is stored.
-
Does not touch any collection/view: the function works only with the input document.
See the following examples:
CREATE TRANSFORM FUNCTION project_fields(input) {
SELECT VALUE doc FROM (
SELECT d.name, d.address FROM [input] d
) as doc
}
This function extracts only the name and address fields from each incoming document.
The transformed document contains only these two fields.
CREATE OR REPLACE TRANSFORM FUNCTION CustomerTransform (cust)
{
SELECT VALUE doc FROM (
SELECT c.*,
(c.address.street || ', ' || c.address.city || ', ' || c.address.zipcode) AS location
FROM [cust] AS c ) as doc
};
This function transforms the customers mailing address into a single-line string.
The transformed document contains all the original fields, plus a new field called location.
Applying TRANSFORM FUNCTION to Collections
A TRANSFORM FUNCTION is applied to a remote collection to automatically process data as it is ingested from an external source. You do this by adding the APPLY FUNCTION clause to the CREATE COLLECTION statement.
See the following example:
CREATE COLLECTION remote_hotel ON hotel AT my_remote_link APPLY FUNCTION project_fields;
You cannot modify or drop a TRANSFORM FUNCTION while it’s applied to any collection.
|
Calling a Function
OrdinaryFunctionCall
You can invoke a user-defined function in the same way as any other ordinary function. You can optionally prefix the name of the function with the names of the database and scope containing the function.
For information about how Capella Analytics organizes entities into a database.scope.database_object hierarchy and resolves names, see Entities in Capella Analytics Services.
After determining the database and scope, Capella Analytics tries to find a user-defined function with the same function signature within that scope. If a user-defined function cannot be found, Capella Analytics tries to find a built-in function with the same function name. If Capella Analytics cannot find a built-in function, the function call fails.
Examples
The examples in this section assume that you’re using the database and scope sampleAnalytics.Commerce.
See Example Data to install the Commerce dataset.
You can use USE Statements to set the default scope for the statement that follows it.
USE sampleAnalytics.Commerce;
If you’re using the Capella Analytics UI, you can alternatively use the query editor’s Query Context lists to set the database and scope.
Select sampleAnalytics as the database and Commerce as the scope for the following examples.
Example 9: Function with subquery
The following statement creates a function called nameSearch, which selects the customer name from all documents with the specified ID in the customers collection.
CREATE FUNCTION nameSearch(customerId) {
(SELECT VALUE c.name
FROM customers AS c
WHERE c.custid = customerId)[0]
};
SELECT VALUE nameSearch("C25");
[ "M. Sinclair" ]
Example 10: Call a user-defined function
The following query uses the nameSearch function as a projection expression in a SELECT query.
Compare this with example Q3.29 in the section on Subqueries.
SELECT o.orderno, o.custid,
nameSearch(o.custid) AS name
FROM orders AS o, o.items AS i
WHERE i.itemno = 120;
[
{
"orderno": 1003,
"custid": "C31",
"name": "B. Pruitt"
},
{
"orderno": 1006,
"custid": "C41",
"name": "R. Dodge"
}
]
Dropping a Function
DropFunction
The DROP FUNCTION statement enables you to delete a user-defined function.
You cannot delete a user-defined function if there are any other user-defined functions that call this function, in this scope or any other scope.
Function Name
QualifiedName
DatabaseAndScopeName
The QualifiedName specifies the name of the function to delete.
For information about how Capella Analytics organizes entities into a database.scope.database_object hierarchy and resolves names, see Entities in Capella Analytics Services.
Parameter List
FunctionParameters
When you drop a function, you must specify the same number of parameters that you specified when you created the function.
When dropping a variadic function, specify ....
Checking for an Existing Function
The optional IF EXISTS keywords enable you to verify whether the specified function or scope exists before dropping it.
-
If you include these keywords, and the function name or scope is not found, the statement does nothing and completes without error.
-
If you do not include these keywords, and the function name or scope is not found, Capella Analytics generates an error.
Examples
The examples that follow assume that you have set sampleAnalytics.Commerce as the default database and scope.
Example 11: Drop a Function
This statement drops the function named nameSearch, if it exists.
DROP FUNCTION nameSearch(customerId) IF EXISTS;
Run the following query to verify the availability of a function:
SELECT * FROM Metadata.`Function`;