Gathr provides different kind of functions like Date, Lookup, String, Math, Array and Miscellaneous functions. Using these functions, you can directly apply computations on the dataset columns and get desired result.
The functions takes columns and/or literals as arguments.
In order to pass column “A” of a dataset “D” as a function argument, pass for example
@{column.D.A}
There are some exceptions where you need to pass the column name directly.
Following are the functions that take the column name directly.
expr(A)
rename_column(A)
In order to pass a “string literal” as argument to function pass it in double quotes for example
"IST"
In order to pass a “number literal” to a function, pass it as it is.
For example, 2 or 1.345
If we have a dataset named schemaName having a string/date/timestamp column date holding the value 2009-03-01 for a particular row, then
add_months@{column.schemaName.date},2) will return 2009-05-01
Note:
All the examples presented in this guide have column values instead of the entire expression i.e. @{column.schemaName.columnName}
add_months@{column.schemaName.date},2) will return 2009-05-01
add_months is a function, @{column.schemaName.date} is an expression and 2009-03-01 is the value of the expression.
add_months(@{column.schemaName.date},2) will be shown as:
add_months(2009-03-01,2)
A detailed description of these functions is given below:
Enables to perform calculations involving dates.
Calculates the new date by adding the given number of months.
Input Parameters: Column arg0, int arg1.
• arg0 (required) – The string/date/timestamp column to which months are to be added.
• arg1 (required)– Number of months to be added. Output Type Date Column Configuration Parameters No Returns Returns the date that is arg1 after arg0. Throws Application Exception
Use Case:
Example
add_months("2009-03-01",2) will return "2009-05-01"
Computes the current date as a date column.
Input Parameters: There are no function arguments.
Output Type | Date Column |
Configuration Parameters | No |
Returns | Returns the current date as date column. |
Throws | Application Exception |
Use Case:
Example
current_date() will return the current date.
If used on 2018-06-15 will return 2018-06-15
Computes the current timestamp as a timestamp column.
Input Parameters: There are no function arguments.
Output Type | Timestamp Column. |
Configuration Parameters | No. |
Returns | Returns the current timestamp as a timestamp column. |
Throws | Application Exception |
Use Case:
Example
current timestamp() will return the current timestamp.
Adds the specified number of months in ‘arg1’ to a date time value in ‘arg0’.
Input Parameters: Column arg0, int arg1
• arg0 (required) – The string/date/timestamp column to which days to be added.
• arg1 (required)– Number of days to be added.
Output Type | Date Column. |
Configuration Parameters | No. |
Returns | Returns the date that is arg0 after arg1. |
Throws | Application Exception |
Use Case:
Example
date_add("2009-03-01",2) will return "2009-03-03".
Extracts a json object from path.
Input Parameters: Column arg0, int arg1
• arg0:The json txt column
• arg1: The path
Output Type | Date Column |
Configuration Parameters | No |
Returns | Returns the extracted json object. |
Throws | Application Exception |
Example
get_json_object('{"a":"b"}', '$.a') will return b
If timestamp1 is later than timestamp2, then the result is positive. If timestamp1 and timestamp2 are on the same day of month, or both are the last day of month, time of day will be ignored. Otherwise, the difference is calculated based on 31 days per month, and rounded to 8 digits unless roundOff=false.
Input Parameters: Column arg0, int arg1
• arg0:timestamp1 column.
• arg1:timestamp2 column.
• arg2:roundoff boolean
Output Type | Date Column |
Configuration Parameters | No |
Returns | Returns the months difference |
Throws | Application Exception |
Example
months_between('1997-02-28 10:30:00', '1996-10-30', false) will return 3.9495967741935485
Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC.
Input Parameters:
arg0:The timestamp column.
arg1:The timezone column.
Output Type | Date Column |
Configuration Parameters | No |
Returns | Returns the timestamp. |
Throws | Application Exception |
Example
to_utc_timestamp('2016-08-31', 'Asia/Seoul') will return 2016-08-30 15:00:00
Returns timestamp ts truncated to the unit specified by the format model fmt. fmt should be one of ["YEAR" "YYYY" "YY" "MON" "MONTH" "MM" "DAY" "DD" "HOUR" "MINUTE" "SECOND" "WEEK" "QUARTER"]
Input Parameters:
arg0: The FMT format.
arg1: The ts timestamp
Configuration Parameters | NO |
Returns | Returns timestamp ts truncated to the unit specified by the format model fmt. |
Throws | ApplicationException |
Example
date_trunc('YEAR', '2015-03-05T09:32:05.359') will return 2015-01-01 00:00:00
Returns the day of the week for date/timestamp (1 = Sunday).
Input Parameters:
arg0: The date column.
Configuration Parameters | NO |
Returns | Returns the day of the week for date/timestamp |
Throws | ApplicationException |
Example:
Dayofweek('2009-07-30') will return 5
Parses the date_str expression with the fmt expression to a date. Returns null with invalid input. By default, it follows casting rules to a date if the fmt is omitted.
Input Parameters:
arg0: str date_str column column.
arg1:the format string.
Configuration Parameters | NO |
Returns | Returns the formatted date. |
Throws | ApplicationException |
Example:
to_date('2016-12-31', 'yyyy-MM-dd') will return 2016-12-31
Parses the timestamp expression with the fmt expression to a timestamp. Returns null with invalid input. By default, it follows casting rules to a timestamp if the fmt is omitted.
Input Parameters:
arg0:The timestamp column.
arg1:The format string.
Configuration Parameters | NO |
Returns | Returns the formated timestamp. |
Throws | ApplicationException |
Example:
to_timestamp('2016-12-31', 'yyyy-MM-dd') will return 2016-12-31 00:00:00
Converts a date/timestamp/string column 'arg0' to a value of string in the format specified by the date format given by the 'arg1'.
Input Parameters: Column arg0, String arg1
• arg0 (required) – The date/timestamp/string column to be converted.
• arg1 (required)– The format to which the date/timestamp/string to be converted.
Output Type | String Column |
Configuration Parameters | No |
Returns | The formatted date as string. |
Throws | Application Exception |
Use Case:
Example
date_format("2009-03-01","MM-dd-yyyy") will return "03-01-2009".
Computes the date that is 'arg1' days before 'arg0'.
Input Parameters: Column arg0, int arg1
• arg0 (required) – The string/date/timestamp column from which days to be subtracted.
• arg1 (required)– Number of days to be subtracted.
Output Type | Date Column |
Configuration Parameters | No |
Returns | The formatted date as string. |
Throws | Application Exception |
Use Case:
Example
date_format("2009-03-01","MM-dd-yyyy") will return "03-01-2009".
Calculates the difference between the dates 'arg1' and 'arg0' date and returns the number of days.
Input Parameters: Column arg0, Column arg1
• arg0 (required) – The end string/date/timestamp column.
• arg1 (required)– The start string/date/timestamp column.
Output Type | Integer Column |
Configuration Parameters | No |
Returns | The number of days from 'arg1' date to 'arg0' date. |
Throws | Application Exception |
Use Case:
Example
datediff("2009-03-01","2009-02-27") will return 2.
Extracts the day of the year as an integer from a given date/timestamp/string.
Input Parameters: Column arg0
• arg0 (required) – The date/timestamp/string from which the day of year to be extracted.
Output Type | Integer Column |
Configuration Parameters | No |
Returns | The extracted day as an integer. |
Throws | Application Exception |
Use Case:
Example
dayofyear("2017-12-15") will return 349
Convert time string with given pattern to Unix time stamp (in seconds), returns null if fails.
https://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html
Input Parameters: Column arg0,Object arg1
• arg0 (required) – The string/date/timestamp column to be converted.
• arg1 (required)– The string/date/timestamp format.
Output Type | Long Column |
Configuration Parameters | No |
Returns | The converted Unix time stamp |
Throws | Application Exception |
Use Case:
Example
unix_timestamp("2017-12-15 11:56","yyyy-MM-dd hh:mm") will return 1513339008.
Gets current Unix timestamp in seconds.
Input Parameters: There are no function arguments.
Output Type | Long Column |
Configuration Parameters | No |
Returns | The current Unix time stamp |
Throws | Application Exception |
Use Case:
Example
default_unix_timestamp() will return current time in long format.
Given a timestamp, which corresponds to a certain time of day in UTC, returns another timestamp that corresponds to the same time of day in the given timezone.
Input Parameters: Column arg0, Object arg1
• arg0 (required) – The string/date/timestamp column in UTC.
• arg1 (required) – The timezone to which the timestamp has to converted.
Output Type | Timestamp Column |
Configuration Parameters | No |
Returns | The timestamp in the given timezone. |
Throws | Application Exception |
Use Case:
Example
from_utc_timestamp("2017-12-15 11:40",IST) will return "2017-12-15 17:10"
Extracts the hours as an integer from a given date/timestamp/string.
Input Parameters: Column arg0
• arg0 (required) – The date/timestamp/string column from which the hours to be extracted..
Output Type | Integer Column |
Configuration Parameters | No |
Returns | Returns the extracted hours as an integer. |
Throws | Application Exception |
Use Case:
Example
date_format("2009-03-01","MM-dd-yyyy") will return "03-01-2009".
The date/timestamp/string column, returns the last day of the month which the given date belongs to.
Input Parameters: Column arg0
• arg0 (required) – The date/timestamp/string column from which last day of month to be extracted...
Output Type | Date Column |
Configuration Parameters | No |
Returns | Returns the computed last day. |
Throws | Application Exception |
Use Case:
Example
last_day("2017-12-15") will return "2017-12-31".
These functions help to look up data beyond the application flow.
It is used to fetch the records from Hbase table based on the following input parameters.
Input parameters: String tableName,String rowID, String columnFamily, int resultRow, String resultColumnName.
• tableName (required): Hbase Table name with name space as string type.
• rowID (required): String representation of unique key for record.
• columnFamily (required): Column family name is in string format.
• resultRow (required): Result row number.
• resultColumnName (required): Result column name.
Configuration Parameters: Below are the configuration parameters of lookupHBase function.
• ConnectionName (required): Hbase cluster connection name.
• RowidCaching (required): Caching will be applied on the basis of row id.
Output Type | Column |
Returns | Returns the value for the given table, rowId, columnFamily, result row number, result column name. |
Throws | Application Exception |
Use Case:
Example:
lookupHBase("tableName","rowID","columnFamily",1,"age") will return value of age from row 1 of result for given rowId and columnFamily.
It is used to fetch the record from RDBMS table. The application supports database lookup for Postgres, MySQL and Oracle.
Input Parameters: String sqlQuery, int resultRow, String resultColumnName
• sqlQuery(required) – string type, select sql query specific to database.
• resultRow (required): Result row number.
• resultColumnName (required): Result column name.
Configuration Parameters:
• ConnectionName(required)– Complete jdbc URL specific to database.
• FunctionReturnType (required)– Return type of the function based on the result column being queried.
Output Type | Column |
Returns | Returns the value for the given query, row number and column name. |
Throws | Application Exception |
Use Case:
Example:
lookupRDBMS(“select * from tableName",1,"age") will return value of age for row 1 of query result.
Returns the result with respect to query from ElasticSearch index, outputs the given column for given row.
Input Parameters: String stringQuery, int resultRow, String resultColumnName
• stringQuery(required) – string Elastic search query.
• resultRow (required): Result row number
• resultColumnName (required): Result column name.
Configuration Parameters:
• ConnectionName(required)– Elastic search connection name.
• indexName(required)- Elastic search index name.
• indexType(required)- Elastic search index type.
• functionReturnType (required)– Return type of the function based on the result column being queried.
. Output Type Column Returns Return the result of Elastic search query with respect to row number and column name. Throws Application Exception
Use Case:
Example:
lookupES("{\'query\':{\'match_all\':{}}}",1,"age") will return row 1 and column named age from search result of ES index.
This method is used to get the response of web service. Gathr supports REST based web service.
Input Parameters: There is no input parameter.
Configuration Parameters:
• endpoint(required): The end point of URL.
• methodType(required): Web service method like GET, POST, etc.
• Content-type(required): The content type like application/JSON, etc.
• wsparams(required): The web service parameters.
. Output Type Column Returns Return the response of web service Throws Application Exception
Use Case:
Returns the result, with respect to query, from Solr Index.
Input Parameters: String queryString, int resultRow, String resultColumnName
• queryString (required): Solr search query.
• resultRow (required): Result row number.
• resultColumnName (required): Result column name.
Configuration Parameters:
• connectionName(required)- Solr connection name.
• indexName(required)- Solr index name.
Output Type | Column |
Returns | Return the result of solr search query with respect to index name, row number and column name. |
Throws | Application Exception |
Use Case:
Example:
lookupSOLR(''*:*'',1,”age”) will return you search result from Solr index for row 1 and column age of query result.
Returns the boolean result, with respect to path exits or not, from HDFS.
Input Parameters:
• hdfsPath (required) – Path on HDFS.
Configuration Parameters:
• ConnectionName(required)– HDFS Connection name.
• FunctionReturnType (required)– Return type of the function based on the result column being queried.
Output Type | Boolean Column |
Returns | Returns Boolean as result from HDFS. |
Throws | Application Exception |
Use Case:
Example:
lookupHDFSPATH(''hdfsPath'') will return whether the path exists or not on HDFS.
Returns the Base64 encoded result, with respect to row and column-family passed as argument to the function, from HBase table, outputs the given column for given row.
Input parameters: String tableName,String rowID, String columnFamily, int resultRow, String resultColumnName
• tableName (required): Hbase Table name with name space as string type.
• rowID (required): String representation of unique key for record.
• columnFamily (required): Column family name is in string format.
• resultRow (required): Result row number.
• resultColumnName (required): Result column name.
Configuration Parameters: Below are the configuration parameters of lookupHBase function.
• ConnectionName(required)– Hbase Connection name.
• RowIdCaching (required)– Caching will be applied on the basis of row id.
. Output Type Column Returns Returns the value for the given table, rowId, columnFamily result row number, result column name. Throws Application Exception
Use Case:
Example:
lookupHBase64("tableName","rowID","columnFamily",1,"age") will return value of age from row 1 of result for given rowId and columnFamily.
Computes the numeric value of the first character of the string column, and returns the result as an int column.
Input Parameters: Column arg0
• arg0(required) - The string column for which the first character's numeric value to be calculated.
Output Type | String Column |
Configuration Parameters | No |
Returns | The ASCII value as an int column. |
Throws | Application Exception |
Use Case:
Example:
ascii(“An apple”) will return 65
Converts the argument from a binary bin to a base 64 string.
Input Parameters:
arg0: The Column to be converted to base64 string.
Output Type | String Column |
Configuration Parameters | No |
Returns | Returns the base64 string. |
Throws | Application Exception |
Example:
base64('Spark SQL') will return U3BhcmsgU1FM
Returns the string representation of the long value expr represented in binary.
Input Parameters:
arg0: The numerical Column to be converted to represented in binary.
Output Type | String Column |
Configuration Parameters | No |
Returns | Returns the binary representation. |
Throws | Application Exception |
Example:
bin(13) will return 1101
Decodes the first argument using the second argument character set.
Input Parameters:
arg0:Column to be decoded.
arg1: The charset Output Type String Column Configuration Parameters No Returns Returns the decoded column. Throws Application Exception
Example:
decode(encode('abc', 'utf-8'), 'utf-8') will return abc
Encodes the first argument using the second argument character set.
Input Parameters:
arg0:Column to be encoded.
arg1: The charset
Output Type | String Column |
Configuration Parameters | No |
Returns | Returns the encoded column. |
Throws | Application Exception |
Example:
encode('abc', 'utf-8') will return abc
Removes the leading string contains the characters from the trim string.
Input Parameters:
arg0:the trim string characters to trim, the default value is a single space.
arg1: a string expression.
Output Type | String Column |
Configuration Parameters | No |
Returns | Returns the trimed string. |
Throws | Application Exception |
Example:
ltrim('Sp','SsparkSQLS')will return ArkSQLS
Returns Soundex code of the string.
Input Parameters:
arg0:String column
Output Type | String Column |
Configuration Parameters | No |
Returns | Returns Soundex code of the string. |
Throws | Application Exception |
Example:
soundex('Miller') will return M460
Splits str around occurrences that match regex
Input Parameters:
arg0: str string column.
arg1:the regex string
Output Type | String Column |
Configuration Parameters | No |
Returns | Returns the spilts. |
Throws | Application Exception |
Example:
split('oneAtwoBthreeC', '[ABC]')will return ["one","two","three",""]
Remove the leading and trailing trimStr characters from str
Input Parameters:
arg0:The trimStr String column.
arg1:The str string
Output Type | String Column |
Configuration Parameters | No |
Returns | Returns the trimed string. |
Throws | Application Exception |
Example:
trim('SL', 'SsparkSQLS') will return parkSQ
Converts the argument from a base 64 string str to a binary.
Input Parameters:
arg0:The base 64 String column
Output Type | String Column |
Configuration Parameters | No |
Returns | Returns the unbase64 of string. |
Throws | Application Exception |
Example:
unbase64('U3BhcmsgU1FM') will return Spark SQL
Concatenates multiple input string columns together into a single string column.
Input Parameters: Column... arg0
arg0 (required) - The String columns to be concatenated.
Output Type | String Column |
Configuration Parameters | No |
Returns | The concatenated string as a single string column. |
Throws | Application Exception |
Use Case:
Example:
concat("format","string") will return "formatstring"
Concatenates multiple input string columns together into a single string column, using the given separator.
Input Parameters: Object arg0, Column... arg1
• arg0(required) - The separator to be used.
• arg1 (required) - The String columns to be concatenated.
Output Type | Column |
Configuration Parameters | No |
Returns | The concatenated strings using the given separator as a single string column |
Throws | Application Exception |
Example:
concat_ws("-","format","string") will return "format-string".
Parses the expression string into the column that it represents.
Input Parameters: Object arg0
• arg0 (required) - The expression string to be parsed Output Type Column Configuration Parameters No Returns The column return by the expression evaluation. Throws Application Exception
Use Case:
Example:
expr("colA * 3") will return 6 if the value of column 'colA' is 2 for particular row.
Formats the arguments in printf-style and returns the result as a string column.
Input Parameters: Object arg0, Column arg1
• arg0 (required) - The printf style format.
• arg1 (required) - The columns to be formatted.
Output Type | String Column |
Configuration Parameters | No |
Returns | The formatted arguments as a string column. |
Throws | Application Exception |
Example:
format_string("the %s jumped over the %s, %d times","cow","moon",2) will return "the cow jumped over the moon 2 times".
Computes a new string column by converting the first letter of each word to uppercase.
Input Parameters: Column arg0
• arg0 (required) - The input string column Output Type String Column Configuration Parameters No Returns The converted string column. Throws ApplicationException
Use Case:
Example:
initcap("apple") will return "Apple")
Locate the position of the first occurrence of given substring in the given string column.
Input Parameters: Column arg0, Object arg1
• arg0 (required) - The string column in which the location to be determined.
• arg1 (required) - The substring for which the position to be determined.
Output Type | Integer Column |
Configuration Parameters | No |
Returns | Returns the computed last day. |
Throws | Application Exception |
Example:
instr("apple","le") will return 4
Computes the length of a given string or binary column.
Input Parameters: Column arg0
arg0 (required) - The string column for which the length to be determined.
Output Type | Integer Column |
Configuration Parameters | No |
Returns | The computed length. |
Use Case:
Example:
length("apple") will return 5
Computes the Levenshtein distance of the two given string columns.
Input Parameters: Column arg0, Column arg1
• arg0 (required) - The first string column from which the Levenshtein distance from the second string column to be determined.
• arg1 (required) - The second string column
Output Type | Integer Column |
Configuration Parameters | No |
Returns | The computed Levenshtein distance. |
Throws | Application Exception |
Use Case:
Example:
levenshtein("kitten", "sitting") will return 3
Locate the position of the first occurrence of given substring in a string column, after the given position.
Input Parameters: Object arg0, Column arg1, int arg2
• arg0 (required) - The String for which the location to be determined.
• arg1 (required) - The String column in which the location to be determined.
• arg2 (required) - The position after which the location to be determined Output Type Integer Column Configuration Parameters No Returns The position of the first occurrence of substring. Throws Application Exception
Use Case:
Example:
locate("apple","An apple",1) will return 3
Converts a string column to lower case.
Input Parameters: Column arg0
• arg0 (required) - The string column to be converted to lower case Output Type String Column Configuration Parameters No Returns String in lower case. Throws Application Exception
Use Case:
Example:
lower("Apple") will return "apple"
Left-pad the string column with the given string, to a given length.
Input Parameters: Column arg0, int arg1, Object arg2
• arg0 (required) - The string column to be left-padded.
• arg1 (required) - The length for the padding.
• arg2 (required) - The string to use for left-pad.
Output Type | String Column |
Configuration Parameters | No |
Returns | The Left-padded string. |
Throws | Application Exception |
Use Case:
Example:
lpad("SQL Tutorial", 20, "ABC") will return "ABCABCABSQL Tutorial"
Trim the spaces from left end for the specified string column.
Input Parameters: Column arg0
• arg0 (required) - The string column from which left spaces to be trimmed.
Output Type | String Column |
Configuration Parameters | No |
Returns | The trimmed string column. |
Use Case:
Example:
ltrim(“apple”) will return “apple”
Extracts a specific group matched by a Java regex, from the specified string column.
Arguments: Column arg0, Object arg1, int arg2
• arg0 (required) - The string column from which the group to be extracted.
• arg1 (required) - The string specifying the regex.
• arg2 (required) - The regex group id.
Output Type | String Column |
Configuration Parameters | No |
Returns | The extracted group. |
Throws | Application Exception |
Use Case:
Example:
regexp_extract("foothebar","foo(.*?)(bar)", 2) will return "bar"
Replace all substrings of the specified string column that match regexp with the given replacement.
Input Parameters: Column arg0, Object arg1, Object arg2
arg0 (required) - The string column from which substrings to be replaced.
arg1 (required) - The pattern to be used.
arg2 (required) - The replacement. Output Type String Column Configuration Parameters No Returns The string column after replacement of substrings that match regexp. Throws Application Exception
Use Case:
Example:
regexp_replace("foobar", "oo|ar", "") will return "fb"
Rename the given column.
Input Parameters: Object arg0
arg0 (required) - The column name to be renamed.
Output Type | Column |
Configuration Parameters | No |
Returns | The renamed column. |
Throws | Application Exception |
Use Case:
Example:
rename_column("age") will rename "age" to given output column name.
Repeats each value in 'arg0' column 'arg1' times.
Input Parameters: Column arg0,int arg1
arg0 (required) - A column needs to be repeated
arg1 (required) - Integer value representing no of times arg0 is to be repeated. Output Type Column Configuration Parameters No Returns A repeated value. Throws Application Exception
Use Case:
Example:
repeat("str",2) will return 'strstr'
Reverses the string column and returns it as a new string column.
Input Parameters: Column arg0
arg0 (required) - The string column to be reversed.
Output Type | String Column |
Configuration Parameters | No |
Returns | The reversed string column. |
Throws | Application Exception |
Use Case:
Example:
reverse("apple") will return "elppa"
Right-pad the string column with the given string, to a given length.
Input Parameters: Column arg0, int arg1, Object arg2
• arg0 (required) - The string column to be right-padded.
• arg1 (required) - The length for the padding.
• arg2 (required) - The string to use for right-pad.
Output Type | StringColumn |
Configuration Parameters | No |
Returns | The Right-padded string. |
Throws | Application Exception |
Use Case:
Example:
rpad("SQL Tutorial", 20, "ABC") will return "SQL TutorialABCABCAB"
Trim the spaces from right end for the specified string column.
Input Parameters: Column arg0
arg0 (required) - The string column from which right spaces to be trimmed.
Output Type | String Column |
Configuration Parameters | No |
Returns | The trimmed string column. |
Use Case:
Example:
rtrim("apple ") will return "apple"
Substring starts at given position and is of given length when given argument is String type or returns the slice of byte array that starts at given position in byte and is of given length when given argument is Binary type.
Input Parameters: Column arg0, int arg1, int arg2
• arg0 (required) - The String column from which substring to be extracted.
• arg1 (required) - The start position for the substring.
• arg2 (required) - The end position for the substring.
Output Type | String Column |
Configuration Parameters | No |
Returns | Returns the computed last day. |
Use Case:
Example:
substring("foo bar",4,6) will return "bar"
Computes the substring from given string column before given count occurrences of the given delimiter.
Input Parameters: Column arg0, Object arg1, int arg2
• arg0 (required) - The String column from which substring to be extracted.
• arg1 (required) - The delimiter.
• arg2 (required) - The count occurrences for the delimiter. Output Type Column Configuration Parameters No Returns The result substring.
Use Case:
Example:
substring_index("www.xyz.com",".",2) will return www.xyz
Translate any character in the given string by a given character in given replaceString.
Input Parameters: Column arg0, Object arg1, Object arg2
• arg0 (required) - The string column in which the translation to be done.
• arg1 (required) - The matching character.
• arg2 (required) - The replacement character.
Output Type | String Column |
Configuration Parameters | No |
Returns | The translated string. |
Example:
translate("The foo bar","f","t") will return "The too bar"
Trim the spaces from both ends for the specified string column.
Input Parameters: Column arg0
arg0 (required) - The string column from which spaces to be trimmed.
Output Type | String Column |
Configuration Parameters | No |
Returns | The trimmed string value. |
Throws | Application Exception |
Use Case:
Example:
trim(" An apple ") will return "An apple"
Converts a string column to upper case.
Input Parameters: Column arg0
arg0 (required) - The string column to be converted to upper case. Output Type String Column Configuration Parameters No Returns The converted string column.
Use Case:
Example:
upper("aPPle") will return "APPLE"
Input Parameters: Column arg0
arg0 (required) - The column for which absolute value to be calculated. Output Type Number Column Configuration Parameters No Returns The computed absolute value. Throws Application Exception
Use Case:
Example:
abs(77.76) will return 77.76
Computes the cosine inverse of the given value; the returned angle is in the range 0.0 through pi.
Input Parameters: Column arg0
arg0 (required) - The column for which cosine inverse to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | The computed cosine inverse in the range 0.0 through pi. |
Throws | Application Exception |
Use Case:
Example:
acos(0.45) will return 1.104031001096478
Computes the sine inverse of the given value; the returned angle is in the range -pi/2 through pi/2.
Input Parameters: Column arg0
arg0 (required) - The column for which sine inverse to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | The computed sine inverse in the range -pi/2 through pi/2. |
Throws | Application Exception |
Use Case:
Example:
asin(0.45) will return 0.4667653256984187
Computes the tangent inverse of the given value.
Input Parameters: Column arg0
arg0 (required) - The column for which tangent inverse to be calculated. Output Type Number Column Configuration Parameters No Returns The computed the tangent inverse. Throws Application Exception
Example:
atan(0.45) will return 0.42285391621948626
Computes the angle theta from the conversion of rectangular coordinates (arg0, arg1) to polar coordinates (arg1, theta).
Input Parameters: Column arg0, Column arg1
• arg0 (required) - The column for the x rectangular coordinate.
• arg1 (required) - The column for the y rectangular coordinate. Output Type Number Column Configuration Parameters No Returns The computed angle theta. Throws Application Exception
Use Case:
Example:
atan2(12, 71.21) will return 1.403849169952035
Computes the angle theta from the conversion of rectangular coordinates (arg0, arg1) to polar coordinates (arg1, theta).
Input Parameters: double arg0, Column arg1
• arg0 (required) - The x rectangular coordinate.
• arg1 (required) - The column for the y rectangular coordinate. Output Type Number Column Configuration Parameters No Returns The computed angle theta. Throws Application Exception
Use Case:
Example:
atan2_left_arg_double(12, 71.21) will return 1.403849169952035
Computes the angle theta from the conversion of rectangular coordinates (arg0, arg1) to polar coordinates (arg1, theta).
Input Parameters: Column arg0, double arg1
• arg0 (required) - The column for the x rectangular coordinate.
• arg1 (required) - The y rectangular coordinate. Output Type Number Column Configuration Parameters No Returns The computed angle theta. Throws Application Exception
Use Case:
Example:
atan2_right_arg_double(12, 71.21) will return 1.403849169952035
Computes the value of the column arg0 rounded to 0 decimal places with HALF_EVEN round mode.
Input Parameters: Column arg0
arg0 (required) - The column for which value rounded to 0 decimal places with HALF_EVEN round mode to be calculated
Output Type | Number Column |
Configuration Parameters | No |
Returns | The computed value |
Throws | Application Exception |
Use Case:
Example:
bround(71.21) will return 71.0
Round the value of arg0 to arg1 decimal places with HALF_EVEN round mode if arg1 is greater than or equal to 0 or at integral part when arg1 is less than 0.
Input Parameters: Column arg0, int arg1
• arg0 (required) - The column for which value rounded to 0 decimal places with HALF_EVEN round mode to be calculated.
• arg1 (required) - The scale to use
Output Type | Number Column |
Configuration Parameters | No |
Returns | The computed value. |
Throws | Application Exception |
Use Case:
Example:
bround_with_scale(71.21, 1) will return 71.2
Computes the cube-root of the given column.
Input Parameters: Column arg0
arg0 (required) - The column for which cube-root to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | The computed cube-root. |
Throws | Application Exception |
Use Case:
Example:
cbrt(80.89) will return 4.324789202233814
Computes the ceiling of the given column.
Input Parameters: Column arg0
• arg0 (required) - The column for which ceiling to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | The computed ceiling. |
Throws | Application Exception |
Example:
ceil(77.76) will return 4.2682720044742055
Converts a number column from a given base to another.
Input Parameters: Column arg0,Column args1,Column args2
• arg0 (required) - A number/String number column
• arg1 (required) - Integer column of base from which a number is to be converted
• arg2 (required) - Integer column of base to which a number is to be converted.
Output Type | String Column |
Configuration Parameters | No |
Returns | The string in the target base. |
Throws | Application Exception |
Use Case:
Example:
conv(258,10,2) will return 100000010
Computes the cosine of the given value.
Input Parameters: Column arg0
• arg0 (required) - The column for which cosine to be calculated. Output Type Number Column Configuration Parameters No Returns The computed cosine. Throws Application Exception
Use Case:
Example:
cos(76.56) will return 0.3977126102073901
Computes the hyperbolic cosine of the given value.
Input Parameters: Column arg0
• arg0 (required) -The column for which hyperbolic cosine to be calculated. Output Type Number Column Configuration Parameters No Returns The computed hyperbolic cosine. Throws Application Exception
Use Case:
Example:
cos(76.56) will return 0.3977126102073901
Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
Input Parameters: Column arg0
• arg0 (required) -The column for which the equivalent angle measured in degrees to be calculated. Output Type Number Column Configuration Parameters No Returns The converted angle measured in degrees. Throws Application Exception
Use Case:
Example:
degrees(71.21) will return 4080.0324066707394
Computes the exponential of the given value.
Input Parameters: Column arg0
• arg0 (required) -The column for which exponential to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | The computed exponential. |
Throws | Application Exception |
Use Case:
Example:
exp(0.78) will return 2.18147220308578
Computes the exponential of the given value minus one.
Input Parameters: Column arg0
• arg0 (required) -The column for which exponential minus one to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | The computed value. |
Throws | Application Exception |
Use Case:
Example:
expm1(0.23) will return 0.2586000151807663
Computes the factorial of the given value.
Input Parameters: Column arg0
• arg0 (required) -The column for which factorial to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | The computed factorial. |
Throws | Application Exception |
Use Case:
Example:
factorial(11) will return 39916800
Computes the floor of the given column.
Input Parameters: Column arg0
• arg0 (required) -The column for which floor to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed floor. |
Throws | Application Exception |
Use Case:
Example:
floor(71.21) will return 71
Formats numeric column arg0 to a format like '#,###,###.##', rounded to arg1 decimal places, and returns the result as a string column.
Input Parameters: Column arg0, int arg1
• arg0 (required) -The column to be formated.
• arg1 (required) -The integer specifying the decimal places to be used for rounding.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the formatted result as a string column. |
Throws | Application Exception |
Example:
format_number(7120.12, 1) will return 7,120.1
It gives the greatest value of the list of values. This function takes at least 2 parameters.
Input Parameters: Column arg0,Column args1,...
• arg0 (required) -A column from the schema
• arg1 (required) -A column from the schema
Output Type | Number Column |
Configuration Parameters | No |
Returns | The greatest column |
Throws | Application Exception |
Use Case:
Example:
greatest(258,259) will return 259
If the argument is an INT, hex returns the number as a STRING in hexadecimal format. Otherwise if the number is a STRING, it converts each character into its hexadecimal representation and returns the resulting STRING.
Input Parameters: Column arg0
• arg0 (required) -A int/string column
Output Type | String Column |
Configuration Parameters | No |
Returns | A string value. |
Throws | Application Exception |
Use Case:
Example:
hex(258) will return 102
Computes sqrt(arg0^2^ + arg1^2^) without intermediate overflow or underflow.
Input Parameters: Column arg0, Column arg1
• arg0 (required) -Will be used while computing sqrt.
• arg1 (required) -Will be used while computing sqrt.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed sqrt(arg0^2^ + arg1^2^). |
Throws | Application Exception |
Use Case:
Example:
hypot(71.21, 10.5) will return 71.97995533209642
Computes sqrt(arg0^2^ + arg1^2^) without intermediate overflow or underflow.
Input Parameters: double arg0, Column arg1
• arg0 (required) -Will be used while computing sqrt.
• arg1 (required) -Will be used while computing sqrt.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed sqrt(arg0^2^ + arg1^2^). |
Throws | Application Exception |
Use Case:
Example:
hypot_left_arg_double(71.21, 10.5) will return 71.97995533209642
Computes sqrt(arg0^2^ + arg1^2^) without intermediate overflow or underflow.
Input Parameters: Column arg0, double arg1
• arg0 (required) -Will be used while computing sqrt.
• arg1 (required) -Will be used while computing sqrt.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed sqrt(arg0^2^ + arg1^2^). |
Throws | Application Exception |
Use Case:
Example:
hypot_right_arg_double(71.21, 10.5) will return 71.97995533209642
It gives the least value of the list of values. This function takes at least 2 parameters.
Input Parameters: Column arg0,Column args1,...
• arg0 (required) - A column from the schema.
• arg1 (required) - A column from the schema.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the least column |
Throws | Application Exception |
Use Case:
Example:
least(258,259) will return 259
Computes the natural logarithm of the given value.
Input Parameters: Column arg0
• arg0 (required) - The column for which natural logarithm to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed natural logarithm. |
Throws | Application Exception |
Use Case:
Example:
log(20) will return 2.995732273553991
Computes the natural logarithm of the given value plus one.
Input Parameters: Column arg0
• arg0 (required) - The column for which natural logarithm plus one to be calculated. Output Type Number Column Configuration Parameters No Returns Returns the computed natural logarithm plus one. Throws Application Exception
Example:
log1p(20) will return 3.044522437723423
Computes the first argument-base logarithm of the second argument.
Input Parameters: double arg0, Column arg1
• arg0 (required) - The base to used.
• arg1 (required) - The column for which logarithm to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed algorithm. |
Throws | Application Exception |
Use Case:
Example:
log_with_base(10, 20) will return 1.301029995663981
Computes the Unary minus, i.e. -(arg0)
Input Parameters: Column arg0
• arg0 (required) - The column for which unary minus to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed Unary minus, i.e. -(arg0). |
Throws | Application Exception |
Example:
negate(20) will return -20
Computes the positive value of arg0 mod arg1.
Input Parameters: Column arg0, Column arg1
• arg0 (required) - The dividend.
• arg1 (required) - The divisor. Output Type Number Column Configuration Parameters No Returns Returns the computed positive value of arg0 mod arg1. Throws Application Exception
Use Case:
Example:
pmod(19, 0.78) will return 0.2800007
Computes the value of the first argument raised to the power of the second argument.
Input Parameters: Column arg0, Column arg1
• arg0 (required) - The base.
• arg1 (required) - The exponent.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed value |
Throws | Application Exception |
Use Case:
Example:
pow(20, 2) will return 400
Computes the value of the first argument raised to the power of the second argument.
Input Parameters: double arg0, Column arg1
• arg0 (required) - The base.
• arg1 (required) - The exponent Output Type Number Column Configuration Parameters No Returns Returns the compute value Throws Application Exception
Use Case:
Example:
pow_left_arg_double(20, 2) will return 400
Computes the value of the first argument raised to the power of the second argument.
Input Parameters: Column arg0, double arg1
• arg0 (required) - The base.
• arg1 (required) - The exponent Output Type Column Configuration Parameters No Returns Returns the computed value. Throws Application Exception
Use Case:
Example:
pow_right_arg_double(20, 2) will return 400
Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
Input Parameters: Column arg0
• arg0 (required) - The column for which equivalent angle measured in radians to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the converted angle measured in radians. |
Throws | Application Exception |
Use Case:
Example:
radians(20) will return 0.3490658503988659
Generate a random column with independent and identically distributed (i.i.d.) samples from 0.0 to 1.0.
Input Parameters: No input arguments.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the generated column. |
Throws | Application Exception |
Example:
rand() will return a new column with independent and identically distributed (i.i.d.) samples from 0.0 to 1.0.
Generate a random column with independent and identically distributed (i.i.d.) samples from 0.0 to 1.0.
Input Parameters: long arg0
• arg0 (required) - The seed
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the generated column. |
Throws | Application Exception |
Use Case:
Example:
rand_with_seed(2) will return a new column with independent and identically distributed (i.i.d.) samples from 0.0 to 1.0.
Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
Input Parameters: No input arguments. Output Type Number Column Configuration Parameters No Returns Returns the generated column. Throws Application Exception
Use Case:
Example:
randn() will return a new column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
Input Parameters: long arg0
• arg0 (required) - The seed Output Type Number Column Configuration Parameters No Returns Returns the generated column. Throws Application Exception
Use Case:
Example:
randn_with_seed(2) will return a new column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
Computes the double value that is closest in value to the argument and is equal to a mathematical integer.
Input Parameters: Column arg0
• arg0 (required) - The column for which double value to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed last day. |
Throws | Application Exception |
Use Case:
Example:
rint(80.89) will return 81.0
Computes the value of the column arg0 rounded to 0 decimal places.
Input Parameters: Column arg0
arg0 (required) - The column for which value rounded to 0 decimal places to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed value. |
Throws | Application Exception |
Use Case:
Example:
round(80.89) will return 81.0
Computes the value of arg0 to scale decimal places if arg1 is greater than or equal to 0 or at integral part when arg1 is less than 0.
Input Parameters: Column arg0, int arg1
• arg0 (required) - The column for which value rounded to 0 decimal places to be calculated.
• arg0 (required) - The seed Output Type Number Column Configuration Parameters No Returns Returns the computed value. Throws Application Exception
Use Case:
Example:
round_with_scale(80.89,1) will return 80.9
Computes the signum of the given value.
Input Parameters: Column arg0
• arg0 (required) - The column for which signum to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed signum. |
Throws | Application Exception |
Example:
signum(20) will return 1.0
Computes the sine of the given value.
Input Parameters: Column arg0
• arg0 (required) - The column for which sine to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed sine |
Throws | Application Exception |
Example:
sin(20) will return 0.9129452507276277
Computes the hyperbolic sine of the given value.
Input Parameters: Column arg0
• arg0 (required) - The column for which sine to be calculated. Output Type Number Column Configuration Parameters No Returns Returns the computed hyperbolic sine. Throws Application Exception
Example:
sinh(20) will return 2.4258259770489514E8
Computes the square root of the specified float value.
Input Parameters: Column arg0
• arg0 (required) - The column for which square root to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed square root. |
Throws | Application Exception |
Use Case:
Example:
sqlSqrt(20) will return 4.47213595499958
Computes the tangent of the given value.
Input Parameters: Column arg0
arg0 (required) - The column for which tangent to be calculated. Output Type Number Column Configuration Parameters No Returns Returns the computed tangent Throws Application Exception
Use Case:
Example:
tan(20) will return 2.237160944224742
Computes the hyperbolic tangent of the given value.
Input Parameters: Column arg0
• arg0 (required) - The column for which hyperbolic tangent to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the computed hyperbolic tangent. |
Throws | Application Exception |
Example:
tanh(20) will return 1.0
Computes the angle measured in radians to an approximately equivalent angle measured in degrees.
Input Parameters: Column arg0
• arg0 (required) - The column for which degree to be calculated
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the angle measured in degrees. |
Throws | Application Exception |
Use Case:
Example:
toDegrees(3.14159) will return 180
Computes the angle measured in degrees to an approximately equivalent angle measured in radians.
Input Parameters: Column arg0
• arg0 (required) - A column for which radians to be calculated.
Output Type | Number Column |
Configuration Parameters | No |
Returns | Returns the angle measured in radians. |
Throws | Application Exception |
Example:
toRadians(180) will return 3.14159
Returns an array with the given elements.
Input Parameters:
arg0: The given columns to create array column. Configuration Parameters No Returns Returns an array with the given elements. Throws Application Exception
Example:
array(1, 2, 3) will return [1,2,3]
Returns an array of the elements in the union of array1 and array2, without duplicates.
Input Parameters:
arg0: The first array column.
arg1: The second array column.
Configuration Parameters | No |
Returns | Returns an array of the elements in the union of array1 and array2, without duplicates. |
Throws | Application Exception |
Example:
array_union(array(1, 2, 3), array(1, 3, 5)) will return [1,2,3,5]
Removes duplicate values from the array.
Input Parameters:
arg0: The given array column. Configuration Parameters No Returns Returns the array with duplicate values removed. Throws Application Exception
Example:
array_distinct(array(1, 2, 3, null, 3)) will return [1,2,3,null]
Returns an array of the elements in array1 but not in array2, without duplicates.
Input Parameters:
arg0: First array column.
arg1: Second array column. Configuration Parameters No Returns Returns an array of the elements in array1 but not in array2, without duplicates. Throws Application Exception
Example:
array_except(array(1, 2, 3), array(1, 3, 5)) will return [2]
Performs intersection of array1 and array2, without duplicates.
Input Parameters:
arg0: First array column.
arg1: Second array column. Configuration Parameters No Returns Returns an array of the elements in the intersection of array1 and array2, without duplicates. Throws Application Exception
Example:
array_intersect(array(1, 2, 3), array(1, 3, 5)) will return [1,3]
Concatenates the elements of the given array using the delimiter and an optional string to replace nulls. If no value is set for nullReplacement, any null value is filtered.
Input Parameters:
arg0: array column.
arg1: delimiter.
arg2: nullReplacement. Configuration Parameters No Returns Returns the concatenated array. Throws Application Exception
Example:
array_join(array('hello', null ,'world'), ' ', ',') will return hello , world
Returns the maximum value in the array. NULL elements are skipped.
Input Parameters:
arg0: The array column. Configuration Parameters No Returns Returns the maximum value in the array. NULL elements are skipped. Throws Application Exception
Example:
array_max(array(1, 20, null, 3)) will return 20
Returns the minimum value in the array. NULL elements are skipped.
Input Parameters:
arg0: The array column. Configuration Parameters No Returns Returns the minimum value in the array. NULL elements are skipped. Throws Application Exception
Example:
array_min(array(1, 20, null, 3)) will return 1
Returns the (1-based) index of the first element of the array as long.
Input Parameters:
arg0: The array column.
arg1: The position.
Configuration Parameters | No |
Returns | Returns the (1-based) index of the first element of the array as long. |
Throws | Application Exception |
Example:
array_position(array(3, 2, 1), 1) will return 3
Remove all elements that equal to element from array.
Input Parameters:
arg0: The array column.
arg1: The position.
Configuration Parameters | No |
Returns | Returns the array with elements removed. |
Throws | Application Exception |
Example:
array_remove(array(1, 2, 3, null, 3), 3) will return [1,2,null]
Returns the array containing element count times.
Input Parameters:
arg0: The array column.
arg1: The count
Configuration Parameters | No |
Returns | Returns the array containing element count times. |
Throws | Application Exception |
Example:
array_repeat('123', 2) will return ["123","123"]
Sorts the input array in ascending order. The elements of the input array must be order-able. Null elements will be placed at the end of the returned array.
Input Parameters:
arg0: The array column. Configuration Parameters No Returns Returns the sorted array. Throws Application Exception
Example:
array_sort(array('b', 'd', null, 'c', 'a')) will return ["a","b","c","d",null]
Returns an array of the elements in the union of array1 and array2, without duplicates.
Input Parameters:
arg0: The first array column.
arg1: The second array column.
Configuration Parameters | No |
Returns | Returns an array of the elements in the union of array1 and array2, without duplicates. |
Throws | Application Exception |
Example:
array_union(array(1, 2, 3), array(1, 3, 5)) will return [1,2,3,5]
Returns true if a1 contains at least a non-null element present also in a2. If the arrays have no common element and they are both non-empty and either of them contains a null element null is returned, false otherwise.
Input Parameters:
arg0: The first array column.
arg1: The second array column.
Configuration Parameters | No |
Returns | Returns true or false. |
Throws | Application Exception |
Example:
arrays_overlap(array(1, 2, 3), array(3, 4, 5)) will return true.
Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.
Input Parameters:
arg0: The Columns to be zipped.
Configuration Parameters | No |
Returns | Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays. |
Throws | Application Exception |
Example:
arrays_zip(array(1, 2, 3), array(2, 3, 4)) Will return [{"0":1,"1":2},{"0":2,"1":3},{"0":3,"1":4}]
Returns TRUE if the array contains value.
Input Parameters: Column arg0, Object arg1
• arg0 (required) - An array column.
• arg1 (required) - A value to be checked. Output Type Boolean Column Configuration Parameters No Returns A boolean true/false Throws Application Exception
Use Case:
Example:
array_contains(["black","red"] ,"red") will return true
Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.
Input Parameters:
arg0: The expr column. Configuration Parameters No Returns Returns the exploded column. Throws Application Exception
Example:
explode(array(10, 20)) will return 10, 20 in a new column.
Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.
Input Parameters:
arg0: The expr column.
Configuration Parameters | No |
Returns | Returns the exploded column. |
Throws | Application Exception |
Example:
explode_outer(array(10, 20)) will return 10, 20.
Computes bitwise NOT of the given integer.
Input Parameters: Column arg0
• arg0 (required) - An integer or a double column.
Output Type | Integer Column |
Configuration Parameters | No |
Returns | The one's complement of the integer |
Throws | Application Exception |
Use Case:
Example:
bitwiseNOT(7) will return 8
Returns the first non-null argument if exists. Otherwise, null.
Input Parameters:
arg0:columns representing expressions.
Output Type | Integer Column |
Configuration Parameters | No |
Returns | Returns the first non-null argument if exists. Otherwise, null. |
Throws | Application Exception |
Example:
coalesce(NULL, 1, NULL) will return 1
Returns a struct value with the given jsonStr and schema.
Input Parameters:
arg0: The Json string column.
arg1: The schema column.
arg2: The properties map.
Configuration Parameters | No |
Returns | Returns the struct value. |
Throws | Application Exception |
Example:
from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE') will return {"a":1, "b":0.8} and from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'))will return {"time":"2015-08-26 00:00:00.0"}
Returns a hash value of the arguments.
Input Parameters:
arg0:The columns for which hash to be calculated.
Configuration Parameters | No |
Returns | Returns a hash value of the arguments. |
Throws | Application Exception |
Example:
hash('Spark', array(123), 2) will return -1321691492
Returns the name of the file being read, or empty string if not available.
Configuration Parameters | No |
Returns | Returns the name of the file being read, or empty string if not available. |
Throws | Application Exception |
Example:
input_file_name() - will return the name of the file being read
Creates a map with the given key/value pairs.
Input Parameters:
arg0:The columns for key and value.
Configuration Parameters | No |
Returns | Returns the map. |
Throws | Application Exception |
Example:
map(1.0, '2', 3.0, '4') will return {1.0:"2",3.0:"4"}
Returns the union of all the given maps
Input Parameters:
arg0:The map columns.
Configuration Parameters | No |
Returns | Returns the union of all the given maps. |
Throws | Application Exception |
Example:
map_concat(map(1, 'a', 2, 'b'), map(2, 'c', 3, 'd')) will return {1:"a",2:"c",3:"d"}
Creates a map with a pair of the given key/value arrays. All elements in keys should not be null.
Input Parameters:
arg0:Array of keys.
arg1:Array of values.
Configuration Parameters | No |
Returns | Returns the map. |
Throws | Application Exception |
Example:
map_from_arrays(array(1.0, 3.0), array('2', '4')) will return {1.0:"2",3.0:"4"}
Returns a map created from the given array of entries.
Input Parameters:
arg0:Array of entries.
Configuration Parameters | No |
Returns | Returns the map. |
Throws | Application Exception |
Example:
map_from_entries(array(struct(1, 'a'), struct(2, 'b'))) will return {1:"a",2:"b"}
Returns an unordered array containing the keys of the map.
Input Parameters:
arg0:Map column.
Configuration Parameters | No |
Returns | Returns the array. |
Throws | Application Exception |
Example:
map_keys(map(1, 'a', 2, 'b')) will return [1,2]
Returns an unordered array containing the values of the map.
Input Parameters:
arg0:Map column.
Configuration Parameters | No |
Returns | Returns the array. |
Throws | Application Exception |
Example:
map_values(map(1, 'a', 2, 'b')) will return ["a","b"]
R Returns expr1 if it's not NaN, or expr2 otherwise.
Input Parameters:
arg0:expr1 column.
arg1:expr2 column.
Configuration Parameters | No |
Returns | Returns expr1 if it's not NaN, or expr2 otherwise. |
Throws | Application Exception |
Example:
nanvl(cast('NaN' as double), 123) will return 123.0
Perform logical not of given column.
Input Parameters:
arg0:Given boolean column
Configuration Parameters | No |
Returns | Returns logical not of given column. |
Throws | Application Exception |
Example:
not(false) will return true
Returns schema in the DDL format of JSON string.
Input Parameters:
arg0:Given json string column
Configuration Parameters | No |
Returns | Returns schema of the json. |
Throws | Application Exception |
Example:
schema_of_json('[{"col":0}]') will return array>
Generates an array of elements from start to stop (inclusive), incrementing by step. The type of the returned elements is the same as the type of argument expressions.Supported types are: byte, short, integer, long, date, timestamp.The start and stop expressions must resolve to the same type. If start and stop expressions resolve to the 'date' or 'timestamp' type then the step expression must resolve to the 'interval' type, otherwise to the same type as the start and stop expressions.
Input Parameters:
arg0:start - an expression. The start of the range.
arg1:stop - an expression. The end the range (inclusive).
arg2:step - an optional expression. The step of the range. By default step is 1 if start is less than or equal to stop, otherwise -1. For the temporal sequences it's 1 day and -1 day respectively. If start is greater than stop then the step must be negative, and vice versa.
Configuration Parameters | No |
Returns | Returns the sequence |
Throws | Application Exception |
Example:
sequence(to_date('2018-01-01'), to_date('2018-03-01'), interval 1 month) will return [2018-01-01,2018-02-01,2018-03-01]
Returns the size of an array or a map. The function returns -1 if its input is null and spark.sql.legacy.sizeOfNull is set to true. If spark.sql.legacy.sizeOfNull is set to false, the function returns null for null input. By default, the spark.sql.legacy.sizeOfNull parameter is set to true.
Input Parameters:
arg0: array or map column.
Configuration Parameters | No |
Returns | Returns the size |
Throws | Application Exception |
Example:
size(array('b', 'd', 'c', 'a')) will return 4
Creates a struct with the given field values.
Input Parameters:
arg0:columns using which the struct will be created.
Configuration Parameters | No |
Returns | Returns the struct column |
Throws | Application Exception |
Returns a JSON string with a given struct value.
Input Parameters:
arg0:struct column.
arg1:additional options map.
Configuration Parameters | No |
Returns | Returns a JSON string with a given struct value. |
Throws | Application Exception |
Example:
to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy')) will return {"time":"26/08/2015"}
Converts hexadecimal expr to binary.
Input Parameters:
arg0:The hexadecimal column.
Configuration Parameters | No |
Returns | Returns the binary. |
Throws | Application Exception |
Example:
decode(unhex('537061726B2053514C'), 'UTF-8') will return Spark SQL
Computes a cyclic redundancy check value for string.
Input Parameters: Column arg0
• arg0 (required) - A string column. Output Type Integer Column Configuration Parameters No Returns Bigint value. Throws Application Exception
Use Case:
Example:
crc32("ABC") will return 2743272264
Returns true if the input is Not a Number.
Input Parameters: Column arg0
• arg0 (required) - A column whose values needs to be checked.
Output Type | Boolean Column |
Configuration Parameters | No |
Returns | A boolean true if the value is not a number. |
Throws | Application Exception |
Use Case:
Example:
isnan("abc") will return true
Returns true if “a” is NULL and false otherwise.
Input Parameters: Column arg0
• arg0 (required) - A column whose values needs to be checked.
Output Type | Boolean Column |
Configuration Parameters | No |
Returns | A boolean true if the value is null |
Throws | Application Exception |
Use Case:
Example:
isnull("abc") will return false
Creates a Column of literal value.
Input Parameters: Object arg0
• arg0 (required) - A constant value. Output Type Column Configuration Parameters No Returns A column with the constant value Throws Application Exception
Use Case:
Example:
lit(1) will return a column having 1 in all the rows of the column.
Calculates an MD5 128-bit checksum for the string.
Input Parameters: Column arg0
• arg0 (required) - A string column
Output Type | String Column |
Configuration Parameters | No |
Returns | The value is returned as a string of 32 hex digits, or NULL if the argument was NULL. |
Throws | Application Exception |
Use Case:
Example:
md5('ABC') will give you an output '902fbdd2b1df0c4f70b4a5d23525e932'.
A column expression that generates monotonically increasing 64-bit integers.
Input Parameters: No input arguments.
Output Type | Integer Column |
Configuration Parameters | No |
Returns | Monotonically increasing integers. |
Throws | Application Exception |
Example:
monotonically_increasing_id() will return rows as 0,1,2...
Calculates the SHA-1 digest for string and returns the value as a hex string
Input Parameters: Column arg0
• arg0 (requird) - A string column
Output Type | String Column |
Configuration Parameters | No |
Returns | A hex string. |
Throws | Application Exception |
Use Case:
Example:
sha1("ABC") will return '3c01bdbb26f358bab27f267924aa2c9a03fcfdb8'
Calculates the SHA-2 family of hash functions of a string value and returns the value as a hex string. NumBits controls the number of bits in the message digest.
Input Parameters: Column arg0,int numBits
• arg0 (required) - A string column
• arg1 (required) - one of 224, 256, 384, or 512.
Output Type | String Column |
Configuration Parameters | No |
Returns | A hex String. |
Throws | Application Exception |
Use Case:
Example:
sha2("Sam",256) will return '4ecde249d747d51d8..'
Bitwise left shift, Shifts a b positions to the left.
Input Parameters: Column arg0,int numBits
• arg0 (required) - A number column
• arg1 (required) - An integer column Output Type Number Column Configuration Parameters No Returns If the given value is a long value, this function will return a long value else it will return an integer value Throws Application Exception
Use Case:
Example:
shiftLeft(258,2) will return 1032
Bitwise right shift, Shifts a b positions to the right.
Input Parameters: Column arg0,int numBits
• arg0 (required) - A number column
• arg1 (required) - An integer column Output Type Number Column Configuration Parameters No Returns If the given value is a long value, this function will return a long value else it will return an integer value Throws Application Exception
Use Case:
Example:
shiftRight(258,2) will return 64
Bitwise unsigned right shift, Shifts a b positions to the right.
Input Parameters: Column arg0,int numBits
• arg0 (required) - A number column
• arg1 (required) - An integer column. Output Type Number Column Configuration Parameters No Returns If the given value is a long value, this function will return a long value else it will return an integer value Throws Application Exception
Use Case:
Example:
shiftRightUnsigned(258,2) will return 64