Powered By Blogger

Saturday 15 September 2012

Merge Query (More than two tables)


MERGE

CREATE EMP1 TABLE WITH FOLLOWING ROWS AND COLUMNS.


    CREATE TABLE EMP1
       (
        EMPNO NUMBER(4,0),
ENAME VARCHAR2(10),
JOB VARCHAR2(9),
MGR NUMBER(4,0),
HIREDATE DATE,
SAL NUMBER(10,5),
COMM NUMBER(7,2),
DEPTNO NUMBER(2,0)
       );


Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
values (7566,'JONES','sr_MANGER',7839,to_date('02-APR-81','DD-MON-RR'),2975,null,60);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
values (7788,'SCOTT','accountt',7566,to_date('09-DEC-82','DD-MON-RR'),3000,null,60);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
values (7902,'FORD','accountt',7566,to_date('03-DEC-81','DD-MON-RR'),3000,null,60);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
values (7369,'SMITH','SR_CLERK',7902,to_date('17-DEC-80','DD-MON-RR'),800,null,60);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
values (7782,'CLARK','sr_MANGER',7839,to_date('09-JUN-81','DD-MON-RR'),2450,null,50);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
values (7934,'MILLER','SR_CLERK',7782,to_date('23-JAN-82','DD-MON-RR'),1300,null,50);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
values (7698,'BLAKE','sr_MANGER',7839,to_date('01-MAY-81','DD-MON-RR'),2850,null,70);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
values (7521,'WARD','accountt',7698,to_date('22-FEB-81','DD-MON-RR'),1250,500,70);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
values (7654,'MARTIN','accountt',7698,to_date('28-SEP-81','DD-MON-RR'),1250,1400,70);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
values (7844,'TURNER','accountt',7698,to_date('08-SEP-81','DD-MON-RR'),1500,0,70);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
 values (7900,'JAMES','SR_CLERK',7698,to_date('03-DEC-81','DD-MON-RR'),950,null,70);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
 values (7876,'ADAMS','SR_CLERK',7788,to_date('12-JAN-83','DD-MON-RR'),1100,null,60);
Insert into EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
values (7499,'ALLEN','accountt',7698,to_date('20-FEB-81','DD-MON-RR'),1600,300,70);
INSERT INTO EMP1 (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO)
VALUES (7839,'KING','PRESIDENT',NULL,TO_DATE('17-NOV-81','DD-MON-RR'),0.5,NULL,10);


EMPNO   ENAME    JOB            MGR      HIREDATE        SAL    COMM    DEPTNO
--------------------------------------------------------------------------------  
7566 JONES sr_MANGER 7839 02-APR-81 2975 60
7788 SCOTT accountt 7566 09-DEC-82 3000 60
7902 FORD accountt 7566 03-DEC-81 3000 60
7369 SMITH SR_CLERK 7902 17-DEC-80 800 60
7782 CLARK sr_MANGER 7839 09-JUN-81 2450 50
7934 MILLER SR_CLERK 7782 23-JAN-82 1300 50
7698 BLAKE sr_MANGER 7839 01-MAY-81 2850 70
7521 WARD accountt 7698 22-FEB-81 1250 500 70
7654 MARTIN accountt 7698 28-SEP-81 1250 1400 70
7844 TURNER accountt 7698 08-SEP-81 1500 0 70
7900 JAMES SR_CLERK 7698 03-DEC-81 950 70
7876 ADAMS SR_CLERK 7788 12-JAN-83 1100 60
7499 ALLEN accountt 7698 20-FEB-81 1600 300 70
7839 KING PRESIDENT 17-NOV-81 0.5 10


--firstly create the table mybonus with given syntax
CREATE TABLE MYBONUS
(
EMPNO NUMBER,
BONUS NUMBER DEFAULT 100
);

--insert the data into the table
insert into mybonus(empno,sal) select e.empno,e.sal from emp1 e where job='SR_CLERK';

After insert our MYBONUS table becomes like.....

empno   bonus   sal
---------------------
7369 100 800
7934 100 1300
7900 100 950
7876 100 1100

NOTE:= Bonus has value 100 because it is define default at the time of table creation.

Now when we are fire the below merge query then the result like that

EXAMPLE 1
-------------
MERGE INTO MYBONUS B
USING (SELECT EMPNO,SAL,DEPTNO FROM EMP1 WHERE DEPTNO=70) S
ON (B.EMPNO=S.EMPNO)
WHEN MATCHED THEN
UPDATE SET B.BONUS=B.BONUS+200
WHEN NOT MATCHED THEN
INSERT (B.EMPNO,B.BONUS,B.SAL) VALUES (S.EMPNO,S.SAL*.10,S.SAL) WHERE (S.SAL<=4000);

then the result like that

OUTPUT:=  6 rows merged.
-------

and result become like that in MYBONUS Table....


EMPNO   BONUS   SAL
--------------------
7369 100 800
7934 100 1300
7900 300 950
7876 100 1100
7521 125 1250
7844 150 1500
7698 285 2850
7654 125 1250
7499 160 1600

EXPLANATION:= 
when merge will fire then using clause will fire that contain the query....

SQL>SELECT EMPNO,SAL,DEPTNO FROM EMP1 WHERE DEPTNO=70;

and result like.......

EMPNO   SAL    DEPTNO
--------------------
7521 1250 70
7698 2850 70
7654 1250 70
7844 1500 70
7900 950 70
7499 1600 70


then it will check the on(B.EMPNO=S.EMPNO) condition.

if the ON condtion will meet, then it will  update the existing data in MYBONUS table. Other wise insert.
In given example EMPNO (7369,7934,7876) are not matched any condition of output of second table (means S table which is
output of >>SELECT EMPNO,SAL,DEPTNO FROM EMP1 WHERE DEPTNO=70;).
Only EMPNO 7900 is satisfying the condition, that means only 7900 empno is updated in MYBONUS table.

Apart of this all the data of second table S (result of query >>SELECT EMPNO,SAL,DEPTNO FROM EMP1 WHERE DEPTNO=70;)
will insert into the MYBONUS table.
and result become such as....


EMPNO   BONUS   SAL
--------------------
7369 100 800
7934 100 1300
7900 300 950
7876 100 1100
7521 125 1250
7844 150 1500
7698 285 2850
7654 125 1250
7499 160 1600
 


EXAMPLE 2
-----------

MERGE INTO MYBONUS B
USING (SELECT EMPNO,SAL,DEPTNO FROM EMP1 WHERE DEPTNO=70) S
ON (B.EMPNO=S.EMPNO)
WHEN MATCHED THEN
UPDATE SET B.BONUS=B.BONUS+200
DELETE WHERE b.SAL in (800,950)
WHEN NOT MATCHED THEN
INSERT (B.EMPNO,B.BONUS,B.SAL) VALUES (S.EMPNO,S.SAL*.10,S.SAL) WHERE (S.SAL<=4000);

The difference Between first merge and second merge is only that, the DELETE statement is added in second one.

and the result become like that...


EMPNO   BONUS   SAL
--------------------
7369 100 800
7934 100 1300
7876 100 1100
7521 125 1250
7844 150 1500
7698 285 2850
7654 125 1250
7499 160 1600



EXPLANATION:=
--------------

The Delete statement is delete the record  which satisfy the ON condition as well as DELETE condition.
For example, merge query delete the EMPNO 7900 (has sal 950) because if we notice in first merge query
only EMPNO 7900 is satisfied the on condition as well as DELETE condition here.EMPNO 7369 which has sal 800
is not  deleted from the table bescause empno 7369 is not satisfied the on condition of the merge query.


NOTE:=   If your EMP1 table contain the duplicate row in table then it will not merge and give the error
-------

ERROR: IT WILL COME WHEN WE ARE FIRE THE QUERY AND WHEN WE ARE GETTING THE DUPLICATE RECORD IN SOURCE TABLE for above query.
here if emp1 table contain the duplicate records.


ORA-30926: unable to get a stable set of rows in the source tables
30926. 00000 -  "unable to get a stable set of rows in the source tables"
*Cause:    A stable set of rows could not be got because of large dml
           ACTIVITY OR A NON-DETERMINISTIC WHERE CLAUSE.
*ACTION:   REMOVE ANY NON-DETERMINISTIC WHERE CLAUSES AND REISSUE THE DML.


SOLUTION OF ERROR:
-------------------

Only we have to write the distinct keywords in using clause after the.
and our merge query become like that.....

MERGE INTO MYBONUS B
USING (SELECT distinct EMPNO,SAL,DEPTNO FROM EMP1 WHERE DEPTNO=70) S
ON (B.EMPNO=S.EMPNO)
WHEN MATCHED THEN
UPDATE SET B.BONUS=B.BONUS+200
DELETE WHERE b.SAL>2000
WHEN NOT MATCHED THEN
INSERT (B.EMPNO,B.BONUS,B.SAL) VALUES (S.EMPNO,S.SAL*.10,S.SAL) WHERE (S.SAL<=4000);


--create table by following syntax

CREATE TABLE EXAMTIMETABLE(EXAMNAME VARCHAR2(40),EXAMTIME VARCHAR2(12),CONSTRAINT EXAMNAMEPK PRIMARY KEY(EXAMNAME));
insert into EXAMTIMETABLE values('PHYSICAL SCIENCE','9:00 AM');


--merge the table one table TO itself.

MERGE INTO  EXAMTIMETABLE E1
USING EXAMTIMETABLE E2
ON (E2.EXAMNAME=E1.EXAMNAME AND E1.EXAMNAME='PHYSICAL SCIENCE')
WHEN MATCHED THEN
UPDATE SET E1.EXAMTIME='10:30 AM'
WHEN NOT MATCHED THEN
insert (e1.EXAMNAME,e1.EXAMTIME) values ('PHYSICAL SCIENCE','10:00 AM');


Inline view:=
--------------
An inline view is term given to sub query in FROM clause of query which can be used as table.
Inline view effectively is a named sub query.

EXAMPLE:=
---------

SELECT DNAME,ENAME,SAL FROM EMP,
(SELECT DNAME, DEPTNO FROM DEPT) D
WHERE A.DEPTNO = B.DEPTNO;

In the above query (SELECT DNAME, DEPTNO FROM DEPT) D is the inline view.

Inline views are determined at runtime, and in contrast to normal view they are not stored in the data dictionary.


Advantage of using inline views:
--------------------------------

1. Better query performance
2. Better visibility of code


EXAMPLE 2:
-----------

select emp.empno,
EMP.ENAME,
DEPT.DEPTNO
from
(SELECT DEPTNO FROM DEPT) DEPT,
(SELECT EMPNO, ENAME,DEPTNO FROM EMP) EMP
WHERE EMP.DEPTNO = DEPT.DEPTNO;

In the above example,first the compiler forms 2 inline views emp1 and dept and then
joins these 2 in the where clause.

Hashing Algorithm:
-------------------
Difference Between Hash Join & Merge Join
-------------------------------------------
Merge Join :
-------------
Oracle performs a join between two sets of row data  using the merge
join algorithm. The inputs are two separate sets of row data. Output is
the results of the join.  Oracle reads rows from both inputs in an
alternating fashion and merges together matching rows in order to
generate output. The two inputs are sorted on join column.

Hash Join :
-------------

Oracle performs a join between two sets of row data using hash join
algorithm.  Input and Output same as Merge Join.  Oracle reads all rows
from the second input and builds a hash structure (like has table in
java), before reading each row from the first input one at a time. For
each row from the first input, the hash structure is probed and matching
rows generate output.


Hashing and Hash table

Before seeing javaĆ¢€™s Hashtable in detail you should understand hashing in general.
 Assume that, v is a value to be stored and k is the key used for storage / retrieval,
 then h is a hash function where v is stored at h(k) of table. To retrieve a value compute
 h(k) so that you can directly get the position of v. So in a key-value pair table, you need
 not sequentially scan through the keys to identify a value.

h(k) is the hashing function and it is used to find the location to store the corresponding value v.
h(k) cannot compute to a indefinite space. Storage allocated for a Hashtable is limited within a
program. So, the hasing function h(k) should return a
number within that allocated spectrum (logical address space).

1. Definition of a Hash Table

Before we get into the definition of Hash Tables, it is good to introduce WHY to use Hash tables.

Hash tables are good for doing a quick search on things.

For instance if we have an array full of data (say 100 items). If we knew the position that a specific item is stored in an array, then we could quickly access it. For instance, we just happen to know that the item we want it is at position 3; I can apply:
myitem=myarray[3];

With this, we don't have to search through each element in the array, we just access position 3.

The question is, how do we know that position 3 stores the data that we are interested in?

This is where hashing comes in handy. Given some key, we can apply a hash function to it to find an index or position that we want to access.



1.1 What is the hash function?

There are many different hash functions. Some hash functions will take an integer key and turn it into an index. A common one is the division method.

Let's learn through an example:

1.2 Division method (one hash method for integers)

Let's say you had the following numbers or keys that you wanted to map into an array of 10 elements:

123456
123467
123450
To apply the division method, you could divide the number by 10 (or the maximum number of elements in the array) and use the remainder (the modulo) as an index. The following would result:

123456 % 10 = 6 (the remainder is 6 when dividing by 10)
123467 % 10 = 7 (the remainder is 7)
123450 % 10 = 0 (the remainder is 0)
These numbers would be inserted into the array at positions 6, 7, and 0 respectively. It might look something like this:

The important thing with the division method is that the keys are integers.

1.3 What happens when the keys aren't integers?

You have to apply another hash function to turn them into integers. Effectively, you get two hash functions in one:

function to get an integer
function to apply a hash method from above to get an index to an array


What do we mean that the keys aren't integers? Well, let's say that the keys are people's names. Such as:

Sarah Jones
Tony Balognie
Tom Katz
The goal is to type in one of these names and get an index to an array in order to access that information. How do we do this?
The hash function has to do two things:

Convert the names into integers
For instance, we have a function which turns a string into an integer. The results will be as follows:
Sarah Jones --> 1038
Tony Balognie --> 1259
Tom Katz --> 746
Apply a hash method to get an index
We can now apply the division method to get an index for an array of 10 elements
Sarah Jones --> 1038 % 10 --> 8
Tony Balognie --> 1259 % 10 --> 9
Tom Katz --> 746 % 10 --> 6
1.4 What would that look like in the array?

The array is known as a hash table. It stores the key (used to find the index) along with associated values. In the above example, we might have a hash table that looked something like this:



Again, the idea is that we will insert items into the hash table using the key and applying the hash function(s) to get the index.

A problem occurs when two keys yield the same index. For Instance, say we wanted to include:
John Smith --> 948 % 10 --> 8

We have a collision because Sarah Jones is already stored at array index 8.

We need a method to resolve this. The resolution comes in how you create your hash table. There two major approaches given in the book:

Linear Probing
Chaining
The approach used in this lab is referred to as chaining.
The details are left as class material, but recognize that in chaining you have an array of linked lists. All the data in the "same link", have colliding index values.

Consider a diagram of the above example. Remember, there was a collision with Sarah Jones and John Smith. Notice that John Smith is "chained" or "linked" after Sarah Jones.







Saturday 4 August 2012

Important Query



TO GET THE DATA_LENGTH AND DATA_TYPE OF TABLE

 SELECT  COLUMN_NAME, DATA_TYPE, DATA_LENGTH, DATA_PRECISION, DATA_SCALE FROM USER_TAB_COLUMNS WHERE COLUMN_NAME='INTERCHANGE' AND TABLE_NAME='EMI_TRAN_DETAIL';


 > SELECT * FROM USER_OBJECTS
WHERE OBJECT_NAME = 'APP_MAST_CURRENCYUPDATE';



 SELECT DBMS_METADATA.GET_DDL('VIEW','APP_MAST_CURRENCYUPDATE') FROM DUAL;


CONVERT FROM TIMESTAMP TO DATE FORMAT

> SELECT TO_DATE(TO_CHAR (SYSTIMESTAMP, 'YYYY-MON-DD HH24:MI:SS'),'YYYY-MON-DD HH24:MI:SS') AS MY_DATE FROM DUAL;


CONVERT FROM VARCHAR2 TO DATE FORMAT

>SELECT TO_DATE(LOG_DATETIME, 'DDMMYYYYHH24MISS') FROM app_switchin;



DIFFERENCES OF THE COLUMN

>SELECT COLUMN_NAME FROM USER_TAB_COLS WHERE TABLE_NAME='EMP'
MINUS
SELECT COLUMN_NAME FROM USER_TAB_COLS WHERE TABLE_NAME='EMP1';



CONCAT QUERY FOR SELECT STATEMENT

SELECT  EMPNO||'
        '||      ENAME   ||'
        '||       SAL    ||'
        '||      DEPTNO  AS EMPLOYEE FROM EMP;




>SELECT 'EMPNO ='|| EMPNO ||',
        ENAME ='|| ENAME ||',
        SAL   ='|| SAL   ||', ,
        DEPTNO='|| DEPTNO  AS EMPLOYEE FROM EMP;  

PASS THE DBMS_OUTPUT.PUT_LINE ARGUMENT

> DBMS_OUTPUT.PUT_LINE('v_id = ' || v_id || ', v_name = ' || v_name ||', v_salary = ' || v_salary);


ALTER THE TYPE IN TABLE 

> ALTER TABLE EMP ADD ADRESS ADDRESS_TYPE;

 REPLACE THE SECOND POSITION OF ENAME BY OTHER CHARACTER 

SELECT ENAME,SUBSTR(ENAME,2,1),REPLACE(ENAME,SUBSTR(ENAME,2,1),'T') FROM EMP1;

INSERT ONE COLUMN'S DATA INTO ANOTHER COLUMN WITHIN THE SAME TABLE


UPDATE EMP1 SET DESTINATION_COLUMN_NAME = SOURCE_COLUMN_NAME;

INSERT INTO TABLE WITH ALIAS NAME
  
>INSERT INTO EMP1 (EMPLOYEE_ID,SALARY)

SELECT EMPNO AS EMPLOYEE_ID, SAL AS SALARY FROM EMP;

 SUB QUERY IN SELECT CLAUSE

>SELECT
(SELECT SUM(SAL) FROM EMP) TOT_SAL,
(SELECT COUNT(*) FROM EMP) CN,
(SELECT MAX(SAL) FROM EMP) MSAL,
(SELECT MIN(SAL) FROM EMP) LEAST_SAL,
(SELECT AVG(SAL) FROM EMP) AVERAGE_SAL
FROM

EMP;


 >SELECT INCR_SAL,MODIFIED_ENAME,LPAD_EMPNO,HIREDATE FROM
(SELECT SAL+1000 AS INCR_SAL, RPAD(ENAME,10,'@') AS MODIFIED_ENAME, LPAD(EMPNO,5,'0') AS LPAD_EMPNO,HIREDATE FROM EMP)
EMPLOYEE WHERE INCR_SAL>3000;


>SELECT E.ENAME, E.SAL,
(SELECT AVG(SAL) FROM EMP WHERE DEPTNO = E.DEPTNO) AVG_SAL_DEPT,
(SELECT MAX(SAL) FROM EMP WHERE DEPTNO =E.DEPTNO) MAX_SAL
FROM EMP E
ORDER BY 1;


>SELECT AVG(SUM_COLUMN1)
  FROM (SELECT SUM(SAL) AS SUM_COLUMN1

        FROM EMP GROUP BY DEPTNO) A ;


>SELECT ENAME,
(SELECT MAX(SAL) FROM EMP) MAXSAL ,
SAL,
((SELECT MAX(SAL) FROM EMP ) - SAL ) DIFFERENCE
FROM EMP

ORDER BY ((SELECT MAX(SAL) FROM EMP ) - SAL);


TOP 2 SAL FROM EMP TABLE 

>SELECT * FROM (SELECT * FROM EMP  ORDER BY SAL DESC) WHERE ROWNUM <3;

 NTH HIGHEST SAL FROM EMP   


>SELECT SAL FROM (SELECT ROWNUM AS SALINDEX, SAL FROM (SELECT DISTINCT SAL FROM EMP ORDER BY SAL DESC)) WHERE SALINDEX=&N;

 MERGE THREE TABLES

MERGE INTO APP_MAST_MDSFC FC USING (SELECT FC.ROWID FC_RW, EPST.ADJ_AMT_SETT, CARDMASTER.CARD_NUMBER
          from
          APP_MAST_MDSFC FC,APP_MAST_MDSEPST EPST,APP_CARDMASTER CARDMASTER
          where
          FC.PROCESSOR='I'
          AND
          EPST.ADJ_SETT_IND = 'c'
          AND
          FC.TTUM_FLAG is NULL
          AND
          FC.CPD BETWEEN to_date('10-07-2012','dd-mm-rrrr') AND to_date('10-07-2012','dd-mm-rrrr')
          AND
          FC.SWT_SRL_NO = EPST.ORG_SWT_SRL_NO
          AND
          CARDMASTER.CARD_NUMBER(+) = FC.CRD_NUM)
ON (FC.ROWID=FC_RW)
WHEN MATCHED THEN
UPDATE SET EPST_AMOUNT = (ADJ_AMT_SETT / 100);


COUNT THE EMP WHO IS WORKING IN ALL (10,20,30) DEPT

SELECT D.DEPTNO, D.DNAME,
(SELECT COUNT(*) FROM EMP E
WHERE E.DEPTNO = D.DEPTNO) EMPL_CNT

FROM DEPT D;


COUNT THE EMP OF 10TH DEPT

SELECT D.DEPTNO, D.DNAME,
(SELECT COUNT(*) FROM EMP E
WHERE E.DEPTNO = D.DEPTNO) EMPL_CNT

FROM DEPT D WHERE D.DEPTNO=10;







Saturday 2 June 2012

Query for getting Create Script.

1)))))This query give the create script of table/view/synonyms/sequences.

>>>>>SELECT DBMS_METADATA.GET_DDL('TABLE','TABLE_NAME') FROM DUAL;

if you wants VIEW then use ('VIEW','VIEW_NAME') like that you can use for all objects

2)))For separate the particular column SUPPOSE your column name is  WORD

WORD
------------
ABC
EFG
XYZ
PQR


IN THIS FORMAT

'ABC','EFG','XYZ','PQR'


>>> SELECT WM_CONCAT(''''||WORD||'''') FROM TABLE_NAME;

Saturday 5 May 2012

Find the Difference amongs the column in different tables.

INTRODUCTION

Let us suppose you have two table. One is yearly (Permanent) table and other is temporary table ( daily transaction table). If your  yearly table has 450 columns and your temporary table has 446 columns.If
you want to move temporary table data into the yearly table that time oracle give the error """"NOT ENOUGH VALUE"""". That time you  need to know which column is not present in your temporary table. For that you write the query.....


> select column_name from user_tab_columns where table_name='EMP'
   MINUS
   select column_name from user_tab_columns where table_name='EMPTEMP'

It will show 4 columns that was not present in your temporary table. you can alter the column.


""here EMP is a yearly(permanent) table and EMPTEMP is a temporary table"".

Wednesday 2 May 2012

Cluster and Non Cluster Index With Example

1. Introduction

We all know that data entered in the tables are persisted in the physical drive in the form of database files. Think about a table, say Customer (For any leading bank India), that has around 16 million records. When we try to retrieve records for two or three customers based on their customer id, all 16 million records are taken and comparison is made to get a match on the supplied customer ids. Think about how much time that will take if it is a web application and there are 25 to 30 customers that want to access their data through internet. Does the database server do 16 million x 30 searches? The answer is no because all modern databases use the concept of index.

2. What is an Index

Index is a database object, which can be created on one or more columns (16 Max column combination). When creating the index will read the column(s) and forms a relevant data structure to minimize the number of data comparisons. The index will improve the performance of data retrieval and adds some overhead on data modification such as create, delete and modify. So it depends on how much data retrieval can be performed on table versus how much of DML (Insert, Delete and Update) operations.
In this article, we will see creating the Index. The below two sections are taken from my previous article as it is required here. If your database has changes for the next two sections, you can directly go to section 5.

3. First Create Two Tables

To explain these constraints, we need two tables. First, let us create these tables. Run the below scripts to create the tables. Copy paste the code on the new Query Editor window, then execute it.

CREATE TABLE Student(StudId Number(4), StudName varchar(50), Class varchar2(15));
CREATE TABLE TotalMarks(StudId Number(5), TotalMarks number(5));
Go

Note that there are no constraints at present on these tables. We will add the constraints one by one.

4. Primary Key Constraint

A table column with this constraint is called as the key column for the table. This constraint helps the table to make sure that the value is not repeated and also no null entries. We will mark the StudId column of the Student table as primary key. Follow these steps:
  1. Right click the student table and click on the modify button.
  2. From the displayed layout, select the StudId row by clicking the Small Square like button on the left side of the row.
  3. Click on the Set Primary Key toolbar button to set the StudId column as primary key column.
Pic01.JPG
Now this column does not allow null values and duplicate values. You can try inserting values to violate these conditions and see what happens. A table can have only one Primary key. Multiple columns can participate on the primary key column. Then, the uniqueness is considered among all the participant columns by combining their values.

5. Clustered Index

The primary key created for the StudId column will create a clustered index for the Studid column. A table can have only one clustered index on it.
When creating the clustered index, Oracle reads the Studid column and forms a Binary tree on it. This binary tree information is then stored separately in the disc. Expand the table Student and then expand the Indexes. You will see the following index created for you when the primary key is created:

Pic02.jpg

With the use of the binary tree, now the search for the student based on the studid decreases the number of comparisons to a large amount. Let us assume that you had entered the following data in the table student:
Pic03.jpg The index will form the below specified binary tree. Note that for a given parent, there are only one or two Childs. The left side will always have a lesser value and the right side will always have a greater value when compared to parent. The tree can be constructed in the reverse way also. That is, left side higher and right side lower.
Pic04.JPG Now let us assume that we had written a query like below:
Select * from student where studid = 103;
Select * from student where studid = 107;

Execution without index will return value for the first query after third comparison.
Execution without index will return value for the second query at eights comparison.

Execution of first query with index will return value at first comparison.
Execution of second query with index will return the value at the third comparison. Look below:
  1. Compare 107 vs 103 : Move to right node
  2. Compare 107 vs 106 : Move to right node
  3. Compare 107 vs 107 : Matched, return the record
If numbers of records are less, you cannot see a different one. Now apply this technique with a Yahoo email user accounts stored in a table called say YahooLogin. Let us assume there are 33 million users around the world that have Yahoo email id and that is stored in the YahooLogin. When a user logs in by giving the user name and password, the comparison required is 1 to 25, with the binary tree that is clustered index.
Look at the above picture and guess yourself how fast you will reach into the level 25. Without Clustered index, the comparison required is 1 to 33 millions.
Got the usage of Clustered index? Let us move to Non-Clustered index.

6. Non Clustered Index

A non-clustered index is useful for columns that have some repeated values. Say for example, AccountType column of a bank database may have 10 million rows. But, the distinct values of account type may be 10-15. A clustered index is automatically created when we create the primary key for the table. We need to take care of the creation of the non-clustered index.

7. How Does a Non-Clustered Index Work?

A table can have more than one Non-Clustered index. But, it should have only one clustered index that works based on the Binary tree concept. Non-Clustered column always depends on the Clustered column on the database.
This can be easily explained with the concept of a book and its index page at the end. Let us assume that you are going to a bookshop and found a big 1500 pages of C# book that says all about C#. When you glanced at the book, it has all beautiful color pages and shiny papers. But, that is not only the eligibility for a good book right? One you are impressed, you want to see your favorite topic of Regular Expressions and how it is explained in the book. What will you do? I just peeped at you from behind and recorded what you did as below:
  1. You went to the Index page (it has total 25 pages). It is already sorted and hence you easily picked up Regular Expression that comes on page Number 17.
  2. Next, you noted down the number displayed next to it which is 407, 816, 1200-1220.
  3. Your first target is Page 407. You opened a page in the middle, the page is greater than 500.
  4. Then you moved to a somewhat lower page. But it still reads 310.
  5. Then you moved to a higher page. You are very lucky you exactly got page 407. [Yes man you got it. Otherwise I need to write more. OK?]
  6. That’s all, you started exploring what is written about Regular expression on that page, keeping in mind that you need to find page 816 also.
In the above scenario, the Index page is Non-Clustered index and the page numbers are clustered index arranged in a binary tree. See how you came to the page 407 very quickly. Your mind actually traversed the binary tree way left and right to reach the page 407 quickly.

Tuesday 1 May 2012

Normalization with Example

Why do we need to do normalization?

To eliminate redundancy of data i.e. having same information stored at multiple places, which eventually be difficult to maintain and will also increase the size of our database.
With normalization we will have tables with fewer columns which will make data retrieval and insert, update and delete operations more efficient.


What do we mean when we say a table is not in normalized form?


Let’s take an example to understand this,
Say I want to create a database which stores my friends name and their top three favorite artists.
This database would be quite a simple so initially I’ll be having only one table in it say friends table. Here FID is the primary key.


FID FNAME FavoriteArtist
1 Srihari Akon, The Corrs, Robbie Williams.
2 Arvind Enigma, Chicane, Shania Twain

This table is not in normal form why?

FavoriteArtist column is not atomic or doesn’t have scalar value i.e. it has having more that one value.
Let’s modify this table

FID FNAME FavoriteArtist1 FavoriteArtist2 FavoriteArtist3
1 Srihari Akon. The Corrs Robbie Williams.
2 Arvind Enigma Chicane Shania Twain

This table is also not in normal form why?

We have now changed our table and now each column has only one value!! (So what’s left?)
Because here we are having multiple columns with same kind of value.

I.e. repeating group of data or repeating columns.

So what we need to do to make it normal or at least bring it in First Normal Form?
  1. We’ll first break our single table into two.
  2. Each table should have information about only one entity so it would be nice if we store our friend’s information in one table and his favorite artists’ information in another
(For simplicity we are working with few columns but in real world scenario there could be column like friend’s phone no, email , address and favorites artists albums, awards received by them, country etc. So in that case having two different tables would make complete sense)


FID FNAME
1 Srihari
2 Arvind
FID Favorite Artist
1 Akon.
1 The Corrs
1 Robbie Williams
2 Enigma
2 Chicane
2 Shania Twain

FID foreign key in FavoriteArtist table which refers to FID in our Friends Table.

Now we can say that our table is in first normal form.

Remember For First Normal Form

1...Column values should be atomic, scalar or should be holding single value
2...No repetition of information or values in multiple columns.

3...So what does Second Normal Form means?


 Second normal form our database should already be in first normal form and every non-key column must depend on entire primary key.

Here we can say that our Friend database was already in second normal form l.
Why?

Because we don’t have composite primary key in our friends and favorite artists table.

Composite primary keys are- primary keys made up of more than one column. But there is no such thing in our database.
But still let’s try to understand second normal form with another example
This is our new table
Gadgets Supplier Cost Supplier Address
Headphone Abaci 123$ New York
Mp3 Player Sagas 250$ California
Headphone Mayas 100$ London

In about table ITEM+SUPPLIER together form a composite primary key.

Let’s check for dependency

If I know gadget can I know the cost?

No same gadget is provided my different supplier at different rate.

If I know supplier can I know about the cost?

No because same supplier can provide me with different gadgets.

If I know both gadget and supplier can I know cost?

Yes than we can.

So cost is fully dependent (functionally dependent) on our composite primary key (Gadgets+Supplier)

Let’s start with another non-key column Supplier Address.

If I know gadget will I come to know about supplier address?

Obviously no.

If I know who the supplier is can I have it address?

Yes.

So here supplier is not completely dependent on (partial dependent) on our composite primary key (Gadgets+Supplier).

This table is surely not in Second Normal Form.

So what do we need to do to bring it in second normal form?

Here again we’ll break the table in two.
Gadgets Supplier Cost
Headphone Abaci 123$
Mp3 Player Sagas 250$
Headphone Mayas 100$
Supplier Supplier Address
Abaci New York
Sagas California
Mayas London

We now how to normalize till second normal form.

But let’s take a break over here and learn some definitions and terms.

Composite Key: -Composite key is a primary key composed of multiple columns.
Functional Dependency – When value of one column is dependent on another column.

So that if value of one column changes the value of other column changes as well.

e.g. Supplier Address is functionally dependent on supplier name. If supplier’s name is changed in a record we need to change the supplier address as well.

S.Supplier–Ć S.SupplierAddress

“In our s table supplier address column is functionally dependent on the supplier column”

Partial Functional DependencyA non-key column is dependent on some, but not all the columns in a composite primary key.

In our above example Supplier Address was partially dependent on our composite key columns (Gadgets+Supplier).

Transitive Dependency- A transitive dependency is a type of functional dependency in which the value in a non-key column is determined by the value in another non-key column.

With these definitions in mind let’s move to Third Normal Form.
For a table in third normal form
  • It should already be in Second Normal Form.
  • There should be no transitive dependency, i.e. we shouldn’t have any non-key column depending on any other non-key column.
Again we need to make sure that the non-key columns depend upon the primary key and not on any other non-key column.

Album Artist No. of tracks Country
Come on over Shania Twain 11 Canada
History Michael Jackson 15 USA
Up Shania Twain 11 Canada
MCMXC A.D. Enigma 8 Spain
The cross of changes Enigma 10 Spain

Although the above table looks fine but still there is something in it because of which we will normalize it further.

Album is the primary key of the above table.

Artist and No. of tracks are functionally dependent on the Album(primary key).

But can we say the same of Country as well?

In the above table Country value is getting repeated because of artist.

So in our above table Country column is depended on Artist column which is a non-key column.

So we will move that information in another table and could save table from redundancy i.e. repeating values of Country column.

Album Artist No. of tracks
Come on over Shania Twain 11
History Michael Jackson 15
Up Shania Twain 11
MCMXC A.D. Enigma 8
The cross of changes Enigma 10
Artist Country
Shania Twain Canada
Michael Jackson USA
Enigma Spain


Normally this is considered enough and we don’t really go on applying the other normal forms.

Most of real-world application has databases which are in third normal forms.

Sunday 22 April 2012

First and Last Row of the TABLE


First row of table 
SQL>select * from EMP where rownum = 1 order by rowid desc;
Last Row of table
SQL>select * from EMP where rowid IN (select max (rowid) from EMP);
SQL>select  * from (select * from emp order by rowid desc) where rownum = 1;

How to get SECOND HIGHEST SALARY


SQL> select ename , sal from emp a where 1 = ( select count(*) from emp b
where b.sal > a.sal);
 
ENAME             SAL
---------- ----------
JONES            2975

We can generalized this query for other highest salary.

SQL> select ename , sal from emp a where N-1 = ( select count(*) from emp b where b.sal > a.sal);

where N-1 is the any number which you want to select rank wise salary.
for example if we want to select third highest salary then N-1 becomes 2 and
if we want fourth then it becomes 3.