Importance of Indexing and efficiency of indexing

Glad to present something useful to many. I have tried to cover the topic of “What is indexing and where to use it?”.

“A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure. Indices are used to quickly locate data without having to search every row in a database table every time a database table is accessed.”

– Wikipedia

An index is a specific structure that organizes a reference to your data that makes it easier to look up. When accessing data, Postgres will either use some form of an index if it exists or a sequential scan.

Sequential scan – is when it searches over all of the data before returning the results.

Advantages:

• Adding an index to a column will allow you to query based on a column faster.
• Indices like primary key index and unique index help to avoid duplicate row data.

Disadvantages:

• They decrease performance on DML Commands.
• They take up space (this increases with the number of fields used and the length of the fields).

The whole point of having an index is to speed up search queries by essentially cutting down the number of records/rows in a table that need to be examined.

By default the Primary Key on every table will be indexed.

To create index on SQL:

CREATE INDEX index_name ON table_name (column_name);

Example:

For example lets take User table which contains 10k records. and we need to find an user with user’s Phone Number which is saved under “phone_number” column.

The query will be,

SELECT * FROM employees WHERE phone_number = ‘1234567890’;

Without index on Employees Table the Query will take some time to process. We can get the query details by using “EXPLAIN” command.

EXPLAIN SELECT * FROM employees WHERE phone_number = ‘1234567890’;

Seq Scan on employees (cost=0.00..32769.75 rows=1 width=297)
Filter: (“phone_number” = 1234567890)
(2 rows)

Come lets create index on “phone_number” for our Employees Table.

CREATE INDEX index_employees_on_phone_number ON employees (phone_numbers);

Congrats… We have created index on Employees table. Now lets see our Query’s performance with Index.

EXPLAIN SELECT * FROM employees WHERE phone_number = ‘1234567890’;

Index Scan using index_employees_on_phone_number on employees (cost=0.42..8.44 rows=1 width=297)
Index Cond: (“phone_number” = 1234567890)
(2 rows)

You can see the impact of Index on Employees table. The query time is radically decreased.

To remove index:

DROP INDEX index_name;

Index will use some disk space to store the values. For every DML action(INSERT/UPDATE/DELETE) made on a indexed table it also do the action on index.

If we create many indices for a heavy-write table the DML query cost will be increased but the query cost will be minimal.

Create indices based on the frequency of the Index accessed.

To know more about Indices have a look at this  Importance of Indexing and efficiency of indexing – Part 2

Surender,
ROR Developer,
Mallow Technologies.

2 Comments

  1. kumaravel

    super…..

  2. ebasinajosh

    Good and perfect

Leave a Comment

Your email address will not be published. Required fields are marked *