Tuesday, March 8, 2011

ROW NUMBER

Rank vs Row_Number in Teradata

Anyone working on Teradata would be aware of the two most common OLAP functions in Teradata which are used for common use i.e.
  1. Rank()
  2. Row_Number()

  In one of my posts I have mentioned about how to generate sequence number (ex. Surrogate Key) using Row_Number.   Today let us find out the difference in usage of rank and row number functions by taking following scenario on tables (sales_daily)


SELECT    
item_id,
sales_date,
Actual_sales,
RANK() OVER (ORDER  BY Actual_sales DESC)  ,                                  -- RANK function
ROW_NUMBER() OVER (ORDER     BY Actual_sales DESC)                   -- ROW_NUMBER function
FROM   Sales_daily
WHERE    
sales_date BETWEEN DATE  '2010-01-01'     AND DATE '2010-03-01';


The result set is as follows:
Item_id
Sales_date
Actual_sales
Rank
Row_Number()
120
4/1/2010
550
1
1
120
4/2/2017
550
1
2
120
4/2/2020
450
3
3
120
4/2/2006
350
4
4
120
4/2/2027
350
4
5
120
4/1/2005
350
4
6
120
4/1/2003
250
7
7
120
4/2/2003
250
7
8
120
4/1/2025
200
9
9
120
4/1/2002
200
9
10
120
4/1/2021
150
11
11
120
4/2/2001
150
11
12
120
4/1/2001
150
11
13
120
4/1/1931
100
14
14


In the result set, we can observe the following,  
1)       Rows 1 and 2 have same rank assigned  since that have same actual_sales value
2)       Rows 1 and 2 have different row numbers assigned even though same actual_sales value
      3)   Row 3 has rank “3” assigned as ranking positions

Hence , 
rows are always assigned an incremented sequence number with ROW_NUMBER.
Rows are always assigned the same number with RANK function.

No comments:

Post a Comment