Functions and Operators
    • Dark
      Light

    Functions and Operators

    • Dark
      Light

    Article summary

    Logical Operators

    1. Comparison Operators

      OperatorDescriptionExample
      ANDTrue if both values are truea AND b
      ORTrue if either value is truea OR b
      NOTTrue if the value is falseNOT a
    2. Comparison Operators

      OperatorDescription
      <Less than
      >Greater than
      <=Less than or equal to
      >=More than or equal to
      =Equal
      <>Not equal
      !=Not equal

    Conversion Functions

    • cast(value AS type) → type

      Explicitly cast a value as a type. This can be used to cast a varchar to a numeric value type and vice versa.

    • try_cast(value AS type) → type

      Like cast(), but returns null if the cast fails.

    Data and Time Functions/Operators

    1. Operators

      OperatorExampleResult
      +date '2012-08-08' + interval '2' day10/08/2012
      +time '01:00' + interval '3' hour04:00.0
      +timestamp '2012-08-08 01:00' + interval '29' hour2012-08-09 06:00:00.000
      +timestamp '2012-10-31 01:00' + interval '1' month2012-11-30 01:00:00.000
      +interval '2' day + interval '3' hour2 03:00:00.000
      +interval '3' year + interval '5' month03-May
      -date '2012-08-08' - interval '2' day06/08/2012
      -time '01:00' - interval '3' hour22:00.0
      -timestamp '2012-08-08 01:00' - interval '29' hour2012-08-06 20:00:00.000
      -timestamp '2012-10-31 01:00' - interval '1' month2012-09-30 01:00:00.000
      -interval '2' day - interval '3' hour1 21:00:00.000
      -interval '3' year - interval '5' month02-Jul
    2. Functions

      date_trunc function supports the following units:

      UnitExample Truncated Value
      second2001-08-22 03:04:05.000
      minute2001-08-22 03:04:00.000
      hour2001-08-22 03:00:00.000
      day2001-08-22 00:00:00.000
      week2001-08-20 00:00:00.000
      month2001-08-01 00:00:00.000
      quarter2001-07-01 00:00:00.000
      year2001-01-01 00:00:00.000

      The above examples use the timestamp 2001-08-22 03:04:05.321 as the input.

      • date_trunc(unit, x) → [same as input]

        Returns x truncated to unit.

    Aggregate/Statistical

    • avg(x) → double

      Returns the average (arithmetic mean) of all input values.

    • count(***) → bigint

      Returns the number of input rows.

    • max(x, n) → array<[same as x]>

      Returns n largest values of all input values of x.

    • min(x) → [same as input]

      Returns the minimum value of all input values.

    • min(x, n) → array<[same as x]>

      Returns n smallest values of all input values of x.

    • sum(x) → [same as input]

      Returns the sum of all input values.

    • histogram(x) -> map(K, bigint)

      Returns a map containing the count of the number of times each input value occurs.

    • corr(y, x) → double

      Returns correlation coefficient of input values.

      Example:

        // Getting trend of crime in a particular postcode
        SELECT a.postcode, corr(a.period, a.crime_count)
        FROM (SELECT postcode, period, count(*) as crime_count FROM eng_area_reported_crime where postcode = 'E3 4RL' GROUP BY postcode, period) a
        group by a.postcode;
      
    • covar_pop(y, x) → double

      Returns the population covariance of input values.

    • covar_samp(y, x) → double

      Returns the sample covariance of input values.

    • kurtosis(x) → double

      Returns the excess kurtosis of all input values.

    • regr_intercept(y, x) → double

      Returns linear regression intercept of input values. y is the dependent value. x is the independent value.

    • regr_slope(y, x) → double

      Returns linear regression slope of input values. y is the dependent value. x is the independent value.

    • skewness(x) → double

      Returns the skewness of all input values.

    • stddev_pop(x) → double

      Returns the population standard deviation of all input values.

    • stddev_samp(x) → double

      Returns the sample standard deviation of all input values.

    • var_pop(x) → double

      Returns the population variance of all input values.

    • var_samp(x) → double

      Returns the sample variance of all input values.

    Array

    1. Subscript Operator: []

      The [] operator is used to access an element of an array and is indexed starting from one:

       SELECT my_array[1] AS first_element
      
    2. Concatenation Operator: ||

      The || operator is used to concatenate an array with an array or an element of the same type:

       SELECT ARRAY [1] || ARRAY [2]; -- [1, 2]
       SELECT ARRAY [1] || 2; -- [1, 2]
       SELECT 2 || ARRAY [1]; -- [2, 1]
      
    3. Functions

      • array_distinct(x) → array

        Remove duplicate values from the array x.

      • array_intersect(x, y) → array

        Returns an array of the elements in the intersection of x and y, without duplicates.

      • array_union(x, y) → array

        Returns an array of the elements in the union of x and y, without duplicates.

      • array_except(x, y) → array

        Returns an array of elements in x but not in y, without duplicates.

      • array_join(x, delimiter, null_replacement) → varchar

        Concatenates the elements of the given array using the delimiter and an optional string to replace nulls.

      • array_max(x) → x

        Returns the maximum value of input array.

      • array_min(x) → x

        Returns the minimum value of input array.

      • array_position(x, element) → bigint

        Returns the position of the first occurrence of the element in array x (or 0 if not found).

      • array_remove(x, element) → array

        Remove all elements that equal element from array x.

      • array_sort(x) → array

        Sorts and returns the array x. The elements of x must be orderable. Null elements will be placed at the end of the returned array.

      • cardinality(x) → bigint

        Returns the cardinality (size) of the array x.

      • contains(x, element) → boolean

        Returns true if the array x contains the element.

    Map

    1. Subscript Operator: []

      Retrieves value from given key:

       SELECT name_to_age_map['Bob'] AS bob_age;
      
    2. Functions

      • cardinality(x) → bigint

        Returns the cardinality (size) of the map x.

      • element_at(map(K, V), key) → V

        Returns value for given key, or NULL if the key is not contained in the map.

      • map(array(K), array(V)) -> map(K, V)

        Returns a map created using the given key/value arrays.

          SELECT map(ARRAY[1,3], ARRAY[2,4]); -- {1 -> 2, 3 -> 4}
        
      • map_keys(x(K, V)) -> array(K)

        Returns all the keys in the map x.

      • map_values(x(K, V)) -> array(V)

        Returns all the values in the map x.

    Regular Expression

    • regexp_extract_all(string,pattern) -> array(varchar)

      Returns the substrings matched by regex pattern in string

      SELECT regexp_extract_all('1a 2b 14m', '\d+'); -- [1, 2, 14]
    • regexp_extract_all(string,pattern, group) -> array(varchar)

      Finds all occurrences of the regex pattern in string and returns group

      SELECT regexp_extract_all('1a 2b 14m', '(\d+)([a-z]+)', 2); -- ['a', 'b', 'm']
    • regexp_extract(string, pattern) -> varchar

      Returns the first substring matched by the regex pattern in string

      SELECT regexp_extract('1a 2b 14m', '\d+'); -- 1
    • regexp_extract(string, pattern, group) -> varchar

      Finds the first occurrence of the regex pattern in string and returns the group

      SELECT regexp_extract('1a 2b 14m', '(\d+)([a-z]+)', 2); -- 'a'
    • regexp_like(string, pattern) -> boolean

      Evaluates the regex pattern and determines if it is contained within string.

      This is similar to the LIKE operator but performs a contains operationrather than amatch` operation.

      SELECT regexp_like('1a 2b 14m', '\d+b'); -- true
    • regexp_replace(string, pattern) -> varchar

      Removes every instance of the substring matched by the regex pattern from string:

      SELECT regexp_replace('1a 2b 14m', '\d+[ab] '); -- '14m'
    • regexp_replace(string, pattern, replacement) -> varchar

      Removes every instance of the substring matched by the regex pattern from string with replacement

      SELECT regexp_replace('1a 2b 14m', '(\d+)([ab]) ', '3c$2 '); -- '3ca 3cb 14m'
    • regexp_replace(string, pattern, function) -> varchar

      Removes every instance of the substring matched by the regex pattern from string using function

      SELECT regexp_replace('new york', '(\w)(\w*)', x -> upper(x[1]) || lower(x[2])); --'New York'
    • regexp_split(string, pattern) -> array(varchar)

      Splits string using the regex pattern and returns an array. Trailing empty strings are preserved.

      SELECT regexp_split('1a 2b 14m', '\s*[a-z]+\s*'); -- [1, 2, 14]

    Window Functions

    Window functions perform calculations across rows of the query result. They run after the HAVING clause but before the ORDER BY clause. Invoking a window function requires special syntax using the OVER clause to specify the window. A window has three components:

    • The partition specification, which separates the input rows into different partitions. This is analogous to how the GROUP BY clause separates rows into different groups for aggregate functions.
    • The ordering specification, which determines the order in which input rows will be processed by the window function.
    • The window frame, which specifies a sliding window of rows to be processed by the function for a given row. If the frame is not specified, it defaults to RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This frame contains all rows from the start of the partition up to the last peer of the current row.

    For example, the following query ranks orders for each clerk by price:

    SELECT orderkey, clerk, totalprice,
       rank() OVER (PARTITION BY clerk
                    ORDER BY totalprice DESC) AS rnk
    FROM orders
    ORDER BY clerk, rnk

    Aggregate Functions

    All Aggregate Functions can be used as a window function by adding the OVER clause.

    For example, the following query produces a rolling sum of order prices by day for each clerk:

    SELECT clerk, orderdate, orderkey, totalprice,
           sum(totalprice) OVER (PARTITION BY clerk
                                 ORDER BY orderdate) AS rolling_sum
    FROM orders
    ORDER BY clerk, orderdate, orderkey

    Ranking Functions

    • row_number() -> bigint

      Returns a unique, sequential number for each row, starting with one, according to the ordering of rows within the window partition.

    • rank() -> bigint

      Returns the rank of a value in a group of values. The rank is one plus the number of rows preceding the row that are not peer with the row. Thus, tie values in the ordering will produce gaps in the sequence. The ranking is performed for each window partition.

    • dense_rank() -> bigint

      Returns the rank of a value in a group of values. This is similar to rank(), except that tie values do not produce gaps in the sequence.


    What's Next
    Changing your password will log you out immediately. Use the new password to log back in.
    First name must have atleast 2 characters. Numbers and special characters are not allowed.
    Last name must have atleast 1 characters. Numbers and special characters are not allowed.
    Enter a valid email
    Enter a valid password
    Your profile has been successfully updated.