hiltdocu.blogg.se - Redshift rank function

REDSHIFT RANK FUNCTION FULL

Redshift doesn't support DISTINCT aggregates in its window functions. The AWS documentation for COUNT states this, and DISTINCT isn't supported for any of the window functions.

My use case: count customers over varying time intervals and traffic channels

I want monthly and YTD unique customer counts for the current year, split by traffic channel as well as totaled across all channels. Since a customer can visit more than once, I need to count only distinct customers, and therefore the Redshift window aggregates won't help.

I can count distinct customers using count(distinct customer_id) ... group by, but this gives me only one of the four results I need.

I don't want to get into the habit of running a full query for each desired count, piled up between a bunch of union all.

This is what I would write in postgres (or Oracle for that matter):

    select order_month
    , count(distinct customer_id) over(partition by order_month, traffic_channel) as customers_by_channel_and_month
    , count(distinct customer_id) over(partition by traffic_channel) as ytd_customers_by_channel
    , count(distinct customer_id) over(partition by order_month) as monthly_customers_all_channels
    , count(distinct customer_id) over() as ytd_total_customers
    from otc
    /* otc is a table of dated transactions of customers, channels, and month of order */

The result needs to work on a Redshift cluster. Furthermore, this is a simplified problem: the actual desired result also has product category and customer type, which multiplies the number of partitions needed. Therefore a stack of union all rollups is not a nice solution.

A blog post from 2016 calls out this problem and provides a rudimentary workaround, so thank you Mark D.
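One workaround often used when count(distinct ...) is unavailable as a window function is to rank the customers inside each partition with dense_rank() and then take the maximum rank over the same partition. The sketch below is only an illustration under stated assumptions, and is not necessarily the workaround from the blog post mentioned in the question. It runs the SQL against an in-memory SQLite database as a stand-in for Redshift. The table name otc and the column names come from the question. The sample rows are hypothetical, and the select distinct (to collapse the output to one row per month and channel) is an addition.

```python
import sqlite3

# Hypothetical sample data: (customer_id, order_month, traffic_channel).
ROWS = [
    (1, "2017-01", "email"),
    (1, "2017-01", "email"),   # repeat visit, must not be counted twice
    (2, "2017-01", "email"),
    (2, "2017-01", "search"),
    (3, "2017-02", "search"),
    (1, "2017-02", "search"),
]

# dense_rank() numbers the distinct customer_id values within a partition
# 1, 2, 3, ... so the highest rank in the partition equals the number of
# distinct customers. Caveat: a NULL customer_id would receive a rank too,
# whereas count(distinct ...) ignores NULLs.
QUERY = """
with ranked as (
    select order_month
         , traffic_channel
         , dense_rank() over (partition by order_month, traffic_channel
                              order by customer_id) as dr_channel_month
         , dense_rank() over (partition by traffic_channel
                              order by customer_id) as dr_channel
         , dense_rank() over (partition by order_month
                              order by customer_id) as dr_month
         , dense_rank() over (order by customer_id) as dr_all
    from otc
)
select distinct
       order_month
     , traffic_channel
     , max(dr_channel_month) over (partition by order_month, traffic_channel)
           as customers_by_channel_and_month
     , max(dr_channel) over (partition by traffic_channel)
           as ytd_customers_by_channel
     , max(dr_month) over (partition by order_month)
           as monthly_customers_all_channels
     , max(dr_all) over () as ytd_total_customers
from ranked
order by order_month, traffic_channel
"""


def distinct_customer_counts(rows):
    """Load rows into an in-memory otc table and run the workaround query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table otc ("
        "customer_id integer, order_month text, traffic_channel text)"
    )
    conn.executemany("insert into otc values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in distinct_customer_counts(ROWS):
        print(row)
```

All four counts come from a single pass over otc, so adding dimensions such as product category or customer type means adding more dense_rank()/max() pairs, not more union all branches. The query uses only dense_rank() and max() as window functions; it is worth checking it against the Redshift documentation and a real cluster before relying on it.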