Rank vs Row_Number in Teradata
Anyone working on Teradata would be aware of the two most common OLAP functions in Teradata which are used for common use i.e.
- Rank()
- Row_Number()
In one of my posts I have mentioned about how to generate sequence number (ex. Surrogate Key) using Row_Number. Today let us find out the difference in usage of rank and row number functions by taking following scenario on tables (sales_daily)
SELECT
item_id,
sales_date,
Actual_sales,
RANK() OVER (ORDER BY Actual_sales DESC) , -- RANK function
ROW_NUMBER() OVER (ORDER BY Actual_sales DESC) -- ROW_NUMBER function
FROM Sales_daily
WHERE
sales_date BETWEEN DATE '2010-01-01' AND DATE '2010-03-01';
The result set is as follows:
Item_id | Sales_date | Actual_sales | Rank | Row_Number() |
120 | 4/1/2010 | 550 | 1 | 1 |
120 | 4/2/2017 | 550 | 1 | 2 |
120 | 4/2/2020 | 450 | 3 | 3 |
120 | 4/2/2006 | 350 | 4 | 4 |
120 | 4/2/2027 | 350 | 4 | 5 |
120 | 4/1/2005 | 350 | 4 | 6 |
120 | 4/1/2003 | 250 | 7 | 7 |
120 | 4/2/2003 | 250 | 7 | 8 |
120 | 4/1/2025 | 200 | 9 | 9 |
120 | 4/1/2002 | 200 | 9 | 10 |
120 | 4/1/2021 | 150 | 11 | 11 |
120 | 4/2/2001 | 150 | 11 | 12 |
120 | 4/1/2001 | 150 | 11 | 13 |
120 | 4/1/1931 | 100 | 14 | 14 |
In the result set, we can observe the following,
1) Rows 1 and 2 have same rank assigned since that have same actual_sales value
2) Rows 1 and 2 have different row numbers assigned even though same actual_sales value
3) Row 3 has rank “3” assigned as ranking positions
Hence ,
rows are always assigned an incremented sequence number with ROW_NUMBER.
Rows are always assigned the same number with RANK function.
No comments:
Post a Comment