Tuesday, March 8, 2011

How to generate row numbers in teradata?

We have come across rownum function in oracle . This function gives number for every row/record in Oracle. In teradata there is no direct approach like in oracle. There are various approaches suggested around.

Approach 1:
Here Sum function is used  over rows preceding in the SOURCE_TABLE

select
sum(1) over( rows unbounded preceding ),
columnA,
columnB
from
SOURCE_TABLE;

Approach 2:
Here ROW_NUMBER function is used to generate row_number on columnA
select
ROW_NUMBER() over( ORDER BY columnA ),
columnA,
columnB
from
SOURCE_TABLE;


If you have  to use the  row number concept  in target table  as well, then  following  approach using "identity column" (from V2R6 onwards )  be used :

CREATE MULTISET TABLE TARGET_TABLE
  (
   columnA INTEGER GENERATED BY DEFAULT AS IDENTITY
       (START WITH 1
        INCREMENT BY 20) ,
   columnB VARCHAR(20) NOT NULL
  )
UNIQUE PRIMARY INDEX pidx (ColA);

P.S:  Identity columns does differ from sequence concept in oracle. The numbers assigned in these columns are not guaranteed to be sequenctial. The Identity column in Teradata is used to guaranteed row-uniqueness.

This  works without use of Identity approach.
create TABLE TARGET_TABLE as
(
   select
   ROW_NUMBER() over( ORDER BY columnA ) NUMBER_SEQ ,
   columnA,
   columnB,
   columnC
   from a join b on a.id=b.id
) with data ;

How to select Nth row from a table ?

how to select a particular row from the table?
Ans:

By using  ROW_NUMBER() Function , we can mark all the selected rows with numbers .
Then use QUALIFY clause to get excat row number.


select
columnA,columnB
from
source_table
qualify row_number() over (order by columnA ) = Nth record ;
Here 'N' being particular row number.

P.S:  The same query can be used to select Top N records;
select
columnA,columnB
from 
source_table 
qualify row_number() over (order by columnA ) <= N;

ROW NUMBER

Rank vs Row_Number in Teradata

Anyone working on Teradata would be aware of the two most common OLAP functions in Teradata which are used for common use i.e.
  1. Rank()
  2. Row_Number()

  In one of my posts I have mentioned about how to generate sequence number (ex. Surrogate Key) using Row_Number.   Today let us find out the difference in usage of rank and row number functions by taking following scenario on tables (sales_daily)


SELECT    
item_id,
sales_date,
Actual_sales,
RANK() OVER (ORDER  BY Actual_sales DESC)  ,                                  -- RANK function
ROW_NUMBER() OVER (ORDER     BY Actual_sales DESC)                   -- ROW_NUMBER function
FROM   Sales_daily
WHERE    
sales_date BETWEEN DATE  '2010-01-01'     AND DATE '2010-03-01';


The result set is as follows:
Item_id
Sales_date
Actual_sales
Rank
Row_Number()
120
4/1/2010
550
1
1
120
4/2/2017
550
1
2
120
4/2/2020
450
3
3
120
4/2/2006
350
4
4
120
4/2/2027
350
4
5
120
4/1/2005
350
4
6
120
4/1/2003
250
7
7
120
4/2/2003
250
7
8
120
4/1/2025
200
9
9
120
4/1/2002
200
9
10
120
4/1/2021
150
11
11
120
4/2/2001
150
11
12
120
4/1/2001
150
11
13
120
4/1/1931
100
14
14


In the result set, we can observe the following,  
1)       Rows 1 and 2 have same rank assigned  since that have same actual_sales value
2)       Rows 1 and 2 have different row numbers assigned even though same actual_sales value
      3)   Row 3 has rank “3” assigned as ranking positions

Hence , 
rows are always assigned an incremented sequence number with ROW_NUMBER.
Rows are always assigned the same number with RANK function.

LOCKS

Difference between read lock and access lock?

For access lock to be placed we have to specify it explicitly as LOCKING FOR ACCESS.
Access lock is one wherein the table is not locked means you can do insert/update/delete on the table while access lock will access the table ,in this the dirty reads may happen , which  means you will not get the latest changes whatever happened on table to be reflected in your answer set.

 READ lock will lock the table wherein you can not do insert/update and structural changes in table.
It is placed by simple SELECT statement for by explicitly specifying as LOCKING FOR READ.

Syntax:
read lock :
Locking table for Access;

access lock :
Locking table for Read Access

The main difference between read lock and access lock is data Integrity.On placing a read lock the user expects data integrity, while as for access lock the user cannot expect data integrity.

Consider following scenarios
1. User A places READ lock and User B WRITE places lock
User B will have to wait for User A to complete its read in order to start insert/updates/deletes on the data.

2. User A places ACCESS lock and User B WRITE places lock.
User A & B access the data simultaneously, hence User A cannot expect to get consistent results.