- Print
- DarkLight
Functions and Operators
- Print
- DarkLight
Logical Operators
Comparison Operators
Operator Description Example AND True if both values are true a AND b OR True if either value is true a OR b NOT True if the value is false NOT a Comparison Operators
Operator Description < Less than > Greater than <= Less than or equal to >= More than or equal to = Equal <> Not equal != Not equal
Conversion Functions
cast
(value AS type) → typeExplicitly cast a value as a type. This can be used to cast a varchar to a numeric value type and vice versa.
try_cast
(value AS type) → typeLike
cast()
, but returns null if the cast fails.
Data and Time Functions/Operators
Operators
Operator Example Result + date '2012-08-08' + interval '2' day 10/08/2012 + time '01:00' + interval '3' hour 04:00.0 + timestamp '2012-08-08 01:00' + interval '29' hour 2012-08-09 06:00:00.000 + timestamp '2012-10-31 01:00' + interval '1' month 2012-11-30 01:00:00.000 + interval '2' day + interval '3' hour 2 03:00:00.000 + interval '3' year + interval '5' month 03-May - date '2012-08-08' - interval '2' day 06/08/2012 - time '01:00' - interval '3' hour 22:00.0 - timestamp '2012-08-08 01:00' - interval '29' hour 2012-08-06 20:00:00.000 - timestamp '2012-10-31 01:00' - interval '1' month 2012-09-30 01:00:00.000 - interval '2' day - interval '3' hour 1 21:00:00.000 - interval '3' year - interval '5' month 02-Jul Functions
date_trunc
function supports the following units:Unit Example Truncated Value second 2001-08-22 03:04:05.000 minute 2001-08-22 03:04:00.000 hour 2001-08-22 03:00:00.000 day 2001-08-22 00:00:00.000 week 2001-08-20 00:00:00.000 month 2001-08-01 00:00:00.000 quarter 2001-07-01 00:00:00.000 year 2001-01-01 00:00:00.000 The above examples use the timestamp
2001-08-22 03:04:05.321
as the input.date_trunc
(unit, x) → [same as input]Returns
x
truncated tounit
.
Aggregate/Statistical
avg
(x) → doubleReturns the average (arithmetic mean) of all input values.
count
(***) → bigintReturns the number of input rows.
max
(x, n) → array<[same as x]>Returns
n
largest values of all input values ofx
.min
(x) → [same as input]Returns the minimum value of all input values.
min
(x, n) → array<[same as x]>Returns
n
smallest values of all input values ofx
.sum
(x) → [same as input]Returns the sum of all input values.
histogram
(x) -> map(K, bigint)Returns a map containing the count of the number of times each input value occurs.
corr
(y, x) → doubleReturns correlation coefficient of input values.
Example:
// Getting trend of crime in a particular postcode SELECT a.postcode, corr(a.period, a.crime_count) FROM (SELECT postcode, period, count(*) as crime_count FROM eng_area_reported_crime where postcode = 'E3 4RL' GROUP BY postcode, period) a group by a.postcode;
covar_pop
(y, x) → doubleReturns the population covariance of input values.
covar_samp
(y, x) → doubleReturns the sample covariance of input values.
kurtosis
(x) → doubleReturns the excess kurtosis of all input values.
regr_intercept
(y, x) → doubleReturns linear regression intercept of input values.
y
is the dependent value.x
is the independent value.regr_slope
(y, x) → doubleReturns linear regression slope of input values.
y
is the dependent value.x
is the independent value.skewness
(x) → doubleReturns the skewness of all input values.
stddev_pop
(x) → doubleReturns the population standard deviation of all input values.
stddev_samp
(x) → doubleReturns the sample standard deviation of all input values.
var_pop
(x) → doubleReturns the population variance of all input values.
var_samp
(x) → doubleReturns the sample variance of all input values.
Array
Subscript Operator:
[]
The
[]
operator is used to access an element of an array and is indexed starting from one:SELECT my_array[1] AS first_element
Concatenation Operator:
||
The
||
operator is used to concatenate an array with an array or an element of the same type:SELECT ARRAY [1] || ARRAY [2]; -- [1, 2] SELECT ARRAY [1] || 2; -- [1, 2] SELECT 2 || ARRAY [1]; -- [2, 1]
Functions
array_distinct
(x) → arrayRemove duplicate values from the array
x
.array_intersect
(x, y) → arrayReturns an array of the elements in the intersection of
x
andy
, without duplicates.array_union
(x, y) → arrayReturns an array of the elements in the union of
x
andy
, without duplicates.array_except
(x, y) → arrayReturns an array of elements in
x
but not iny
, without duplicates.array_join
(x, delimiter, null_replacement) → varcharConcatenates the elements of the given array using the delimiter and an optional string to replace nulls.
array_max
(x) → xReturns the maximum value of input array.
array_min
(x) → xReturns the minimum value of input array.
array_position
(x, element) → bigintReturns the position of the first occurrence of the
element
in arrayx
(or 0 if not found).array_remove
(x, element) → arrayRemove all elements that equal
element
from arrayx
.array_sort
(x) → arraySorts and returns the array
x
. The elements ofx
must be orderable. Null elements will be placed at the end of the returned array.cardinality
(x) → bigintReturns the cardinality (size) of the array
x
.contains
(x, element) → booleanReturns true if the array
x
contains theelement
.
Map
Subscript Operator:
[]
Retrieves value from given key:
SELECT name_to_age_map['Bob'] AS bob_age;
Functions
cardinality
(x) → bigintReturns the cardinality (size) of the map
x
.element_at
(map(K, V), key) → VReturns value for given
key
, orNULL
if the key is not contained in the map.map
(array(K), array(V)) -> map(K, V)Returns a map created using the given key/value arrays.
SELECT map(ARRAY[1,3], ARRAY[2,4]); -- {1 -> 2, 3 -> 4}
map_keys
(x(K, V)) -> array(K)Returns all the keys in the map
x
.map_values
(x(K, V)) -> array(V)Returns all the values in the map
x
.
Regular Expression
regexp_extract_all(string,pattern) ->
array(varchar)
Returns the substrings matched by regex
pattern
instring
SELECT regexp_extract_all('1a 2b 14m', '\d+'); -- [1, 2, 14]
regexp_extract_all(string,pattern, group) ->
array(varchar)
Finds all occurrences of the regex
pattern
instring
and returnsgroup
SELECT regexp_extract_all('1a 2b 14m', '(\d+)([a-z]+)', 2); -- ['a', 'b', 'm']
regexp_extract(string, pattern) ->
varchar
Returns the first substring matched by the regex
pattern
instring
SELECT regexp_extract('1a 2b 14m', '\d+'); -- 1
regexp_extract(string, pattern, group) ->
varchar
Finds the first occurrence of the regex
pattern
instring
and returns thegroup
SELECT regexp_extract('1a 2b 14m', '(\d+)([a-z]+)', 2); -- 'a'
regexp_like(string, pattern) ->
boolean
Evaluates the regex
pattern
and determines if it is contained withinstring
.This is similar to the
LIKE
operator but performs acontains
operationrather than a
match` operation.SELECT regexp_like('1a 2b 14m', '\d+b'); -- true
regexp_replace(string, pattern) ->
varchar
Removes every instance of the substring matched by the regex
pattern
fromstring
:SELECT regexp_replace('1a 2b 14m', '\d+[ab] '); -- '14m'
regexp_replace(string, pattern, replacement) ->
varchar
Removes every instance of the substring matched by the regex
pattern
fromstring
withreplacement
SELECT regexp_replace('1a 2b 14m', '(\d+)([ab]) ', '3c$2 '); -- '3ca 3cb 14m'
regexp_replace(string, pattern, function) ->
varchar
Removes every instance of the substring matched by the regex
pattern
fromstring
usingfunction
SELECT regexp_replace('new york', '(\w)(\w*)', x -> upper(x[1]) || lower(x[2])); --'New York'
regexp_split(string, pattern) ->
array(varchar)
Splits
string
using the regexpattern
and returns an array. Trailing empty strings are preserved.SELECT regexp_split('1a 2b 14m', '\s*[a-z]+\s*'); -- [1, 2, 14]
Window Functions
Window functions perform calculations across rows of the query result. They run after the HAVING
clause but before the ORDER BY
clause. Invoking a window function requires special syntax using the OVER
clause to specify the window. A window has three components:
- The partition specification, which separates the input rows into different partitions. This is analogous to how the
GROUP BY
clause separates rows into different groups for aggregate functions. - The ordering specification, which determines the order in which input rows will be processed by the window function.
- The window frame, which specifies a sliding window of rows to be processed by the function for a given row. If the frame is not specified, it defaults to RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This frame contains all rows from the start of the partition up to the last peer of the current row.
For example, the following query ranks orders for each clerk by price:
SELECT orderkey, clerk, totalprice,
rank() OVER (PARTITION BY clerk
ORDER BY totalprice DESC) AS rnk
FROM orders
ORDER BY clerk, rnk
Aggregate Functions
All Aggregate Functions can be used as a window function by adding the OVER
clause.
For example, the following query produces a rolling sum of order prices by day for each clerk:
SELECT clerk, orderdate, orderkey, totalprice,
sum(totalprice) OVER (PARTITION BY clerk
ORDER BY orderdate) AS rolling_sum
FROM orders
ORDER BY clerk, orderdate, orderkey
Ranking Functions
row_number() ->
bigint
Returns a unique, sequential number for each row, starting with one, according to the ordering of rows within the window partition.
rank() ->
bigint
Returns the rank of a value in a group of values. The rank is one plus the number of rows preceding the row that are not peer with the row. Thus, tie values in the ordering will produce gaps in the sequence. The ranking is performed for each window partition.
dense_rank() ->
bigint
Returns the rank of a value in a group of values. This is similar to rank(), except that tie values do not produce gaps in the sequence.