Popular Posts

Sunday, 29 September 2013

How do database indexes work/How to Create database index

How do database indexes work/How to Create database index


Let’s start out our tutorial and explanation of why you would need a database index by going through a very simple example
SUPPOSE we have a database table named EMP as:
desc emp
Name     Null     Type        
-------- -------- ------------
EMPNO    NOT NULL NUMBER(4)   
ENAME             VARCHAR2(10)
JOB               VARCHAR2(9) 
MGR               NUMBER(4)   
HIREDATE          DATE        
SAL               NUMBER(7,2) 
COMM              NUMBER(7,2) 
DEPTNO            NUMBER(2) 


And we query as:

SELECT * FROM emp WHERE ename = 'KING'
OUTPUT:
EMPNO                  ENAME      JOB       MGR                    HIREDATE                  SAL                    COMM                   DEPTNO                
---------------------- ---------- --------- ---------------------- ------------------------- ---------------------- ---------------------- ----------------------
7839                   KING       PRESIDENT                        17-11-81                  5000                                          10                    


What would happen without an index on the table?

Once we run that query, what exactly goes on behind the scenes to find employees who are named KING? Well, the database software would literally have to look at every single row in the Employee table to see if the Enamefor that row is ‘KING’. And, because we want every row with the name ‘KING’ inside it, we can not just stop looking once we find just one row with the name ‘KING’, because there could be other rows with the name KING. So, every row up until the last row must be searched – which means thousands of rows in this scenario will have to be examined by the database to find the rows with the name ‘KING’. This is what is called a FULL TABLE SCAN....

 

How a database index can help performance..


The whole point of having an index is to speed up search queries by essentially cutting down the number of records/rows in a table that need to be examined.

What is an index?

So, what is an index? Well, an index is a data structure (most commonly a B- tree) that stores the values for a specific column in a table. An index is created on a column of a table. So, the key points to remember are that an index consists of column values from one table, and that those values are stored in a data structure. The index is a data structure – remember that.

How does a hash table index work?

Hash tables are another data structure that you may see being used as indexes – these indexes are commonly referred to as hash indexes. The reason hash indexes are used is because hash tables are extremely efficient when it comes to just looking up values. So, queries that compare for equality to a string can retrieve values very fast if they use a hash index. For instance, the query we discussed earlier (SELECT * FROM Employee WHERE Ename= ‘KING’) could benefit from a hash index created on the Ename column. The way a hash index would work is that the column value will be the key into the hash table and the actual value mapped to that key would just be a pointer to the row data in the table. Since a hash table is basically an associative array, a typical entry would look something like “KING => 0×28939″, where 0×28939 is a reference to the table row where KING is stored in memory. Looking up a value like “KING” in a hash table index and getting back a reference to the row in memory is obviously a lot faster than scanning the table to find all the rows with a value of “KING” in the Ename column.

The disadvantages of a hash index

Hash tables are not sorted data structures, and there are many types of queries which hash indexes can not even help with. For instance, suppose you want to find out all of the employees who are less than 40 years old. How could you do that with a hash table index? Well, it’s not possible because a hash table is only good for looking up key value pairs – which means queries that check for equality (like “WHERE name = ‘KING’”). What is implied in the key value mapping in a hash table is the concept that the keys of a hash table are not sorted or stored in any particular order.

What are some other types of indexes?

Indexes that use a R- tree data structure are commonly used to help with spatial problems. For instance, a query like “Find all of the Starbucks within 2 kilometers of me” would be the type of query that could show enhanced performance if the database table uses a R- tree index.


What exactly is inside a database index?

So, now you know that a database index is created on a column in a table, and that the index stores the values in that specific column. But, it is important to understand that a database index does not store the values in the other columns of the same table. For example, if we create an index on the Ename column, this means that the Job and emp_no or dept_no column values are not also stored in the index. If we did just store all the other columns in the index, then it would be just like creating another copy of the entire table – which would take up way too much space and would be very inefficient.

How does a database know when to use an index?

When a query like “SELECT * FROM Employee WHERE Ename= ‘KING’ ” is run, the database will check to see if there is an index on the column(s) being queried. Assuming the Ename column does have an index created on it, the database will have to decide whether it actually makes sense to use the index to find the values being searched – because there are some scenarios where it is actually less efficient to use the database index, and more efficient just to scan the entire table.

Can you force the database to use an index on a query?

Generally, you will not tell the database when to actually use an index – that decision will be made by the database itself. Although it is worth noting that in most databases (like Oracle and MySQL), you can actually specify that you want the index to be used.


How to create an index in SQL:

Here’s what the actual SQL would look like to create an index on the Employee_Name column from our example earlier:
CREATE INDEX name_index
ON Emp (Ename)

How to create a multi-column index in SQL:

We could also create an index on two of the columns in the Employee table , as shown in this SQL:
CREATE INDEX name_index
ON Employee (ENAME, HIREDATE)


What is the cost of having a database index?

So, what are some of the disadvantages of having a database index? Well, for one thing it takes up space – and the larger your table, the larger your index. Another performance hit with indexes is the fact that whenever you add, delete, or update rows in the corresponding table, the same operations will have to be done to your index. Remember that an index needs to contain the same up to the minute data as whatever is in the table column(s) that the index covers.
As a general rule, an index should only be created on a table if the data in the indexed column will be queried frequently.


No comments :

Post a Comment