Category Archives: Sql Server Internals

SQL Server: How Local Variables Can Reduce Query Performance

It's a common practice among database developers to use local variables in stored procedures and scripts and to filter queries on the basis of these variables. YES, these local variables can slow down your queries. Let's prove it.
Create a new table and insert dummy rows.


CREATE TABLE TempTable
      (tempID UNIQUEIDENTIFIER, tempMonth INT, tempDateTime DATETIME)
GO

INSERT INTO TempTable (tempID, tempMonth, tempDateTime)
SELECT NEWID(), (CAST(100000 * RAND() AS INT) % 12) + 1, GETDATE()
GO 100000 -- (EXECUTE THIS BATCH 100000 TIMES)

-- Create an index to support our query

CREATE NONCLUSTERED INDEX [IX_tempDateTime] ON [dbo].[TempTable]
([tempDateTime] ASC)
Now let's execute a simple query with a hard-coded value in the WHERE clause:

SELECT * FROM TempTable
WHERE tempDateTime > '2012-07-10 03:18:01.640'

Table 'TempTable'. Scan count 1, logical reads 80, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Check out its execution plan and the index seek properties. You can see that the estimated number of rows is about double the actual number, but that difference is not big enough to affect the plan choice, and consequently the optimizer has selected a proper plan to execute this query.

The query optimizer estimated the number of rows from its base statistics histogram, i.e. EQ_ROWS + AVG_RANGE_ROWS (77 + 88.64286).

DBCC SHOW_STATISTICS ('dbo.TempTable', IX_tempDateTime)

Now, let's modify our SELECT query to use a local variable and execute it. You will find that the query optimizer has selected a different plan this time, a more costly plan. WHY??

DECLARE @RequiredDate DATETIME
SET @RequiredDate = '2012-07-10 03:18:01.640'

SELECT * FROM TempTable
WHERE tempDateTime > @RequiredDate

Table 'TempTable'. Scan count 1, logical reads 481, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The huge difference between the estimated and actual number of rows clearly indicates that the query optimizer was unable to properly estimate the row count, and with this wrong estimate it selected a more costly execution plan. Basically, the query optimizer does not know the value of a local variable at optimization time and consequently cannot use the statistics histogram. It behaves differently with inequality and equality operators.

In Case of Inequality Operator:
With an inequality operator in the query, as in our case, the query optimizer falls back on a simple formula: 30% of total rows.

Estimated Rows = (Total Rows * 30)/100 = (100000 * 30)/100 = 30000

In Case of Equality Operator:

DECLARE @RequiredDate DATETIME
SET @RequiredDate = '2012-07-10 03:18:01.640'

SELECT * FROM TempTable
WHERE tempDateTime = @RequiredDate

If an equality operator is used with a local variable, the query optimizer derives its estimated row count from a different formula, i.e. Density * Total Number of Table Rows. Execute the following query to get the density value.

DBCC SHOW_STATISTICS ('dbo.TempTable', IX_tempDateTime)

All Density = 0.0007358352

Total Number of Rows in Table = 100000
Estimated Rows = Density * Total Number = 0.0007358352 * 100000 = 73.5835
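Since the optimizer cannot see the variable's value at compile time, a common workaround (a sketch, not part of the original test) is to force a statement-level recompile with OPTION (RECOMPILE), so the statement is compiled at execution time when the value of @RequiredDate is known and the histogram can be used as if the value were hard-coded:

```sql
-- Sketch of a common workaround: OPTION (RECOMPILE) defers compilation
-- to execution time, when @RequiredDate has a known value, letting the
-- optimizer use the statistics histogram instead of a fixed guess.
DECLARE @RequiredDate DATETIME
SET @RequiredDate = '2012-07-10 03:18:01.640'

SELECT * FROM TempTable
WHERE tempDateTime > @RequiredDate
OPTION (RECOMPILE)
```

Passing the value as a stored procedure parameter (so it can be sniffed) has a similar effect, with the usual parameter-sniffing caveats.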

Drop the table when it is no longer required:

DROP TABLE TempTable


Sql Server : Common Misconceptions Regarding IDENTITY Columns

  • A small group of developers thinks that IDENTITY is a data type.
IDENTITY isn't a data type; it's a column property that you can declare on a whole-number data type such as tinyint, smallint, int, bigint, or numeric/decimal.
  • The IDENTITY property can be assigned to as many columns in a table as you want.
Only ONE column in a table can be assigned the IDENTITY property. That's why the following query is also valid for retrieving the identity column (even without naming the column in the SELECT statement).
USE AdventureWorks
SELECT IDENTITYCOL FROM HumanResources.Department
  • The IDENTITY property ensures uniqueness among column values.
An IDENTITY column never guarantees unique values in a column, though the values generated by IDENTITY itself are always unique. Uniqueness must be enforced separately, e.g. by a PRIMARY KEY or UNIQUE constraint.
  • One cannot insert an explicit value for an identity column in a table.
An explicit value for an identity column is possible. How? Find it here in an earlier post.
  • @@IDENTITY contains the last identity value used by that table/column.
It's not true. @@IDENTITY actually contains the last identity value generated by that CONNECTION. If you have multiple tables with IDENTITY columns and multiple INSERT statements are carried out in a batch on the same or different tables, @@IDENTITY holds the value from the last statement only.
IDENT_CURRENT(tablename) is the best alternative. It will return the accurate value last used by the IDENTITY column of your desired table.
SELECT IDENT_CURRENT('HumanResources.Department')
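The points above can be illustrated with a short sketch (the demo table is hypothetical, not from the original post):

```sql
-- Hypothetical demo table with an IDENTITY column and no unique constraint
CREATE TABLE dbo.DemoIdentity (ID INT IDENTITY(1,1), Val CHAR(1))

-- Explicit value for the identity column: possible via SET IDENTITY_INSERT
SET IDENTITY_INSERT dbo.DemoIdentity ON
INSERT INTO dbo.DemoIdentity (ID, Val) VALUES (100, 'A')
SET IDENTITY_INSERT dbo.DemoIdentity OFF

-- IDENTITY alone does not guarantee uniqueness: after a reseed, the next
-- generated value can collide with an existing row, because nothing
-- enforces uniqueness on ID
DBCC CHECKIDENT ('dbo.DemoIdentity', RESEED, 99)
INSERT INTO dbo.DemoIdentity (Val) VALUES ('B')  -- generates ID = 100 again

-- Last identity value: per connection vs. per table
SELECT @@IDENTITY                         -- last value on this connection
SELECT IDENT_CURRENT('dbo.DemoIdentity')  -- last value for this table
```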

Sql Server Performance: Does Your Sql Server Machine Need More RAM? Buffer Cache Hit Ratio Can Inform You.

Buffer Cache Hit Ratio counter indicates how often SQL Server goes to the buffer, not the hard disk, to get the data. 
In OLTP applications, this ratio should exceed 90% (most DBAs say it should exceed 95%), and ideally be over 99%. If your buffer cache hit ratio is lower than 90%, you should add more RAM. If the ratio is between 90% and 99%, consider it a final warning that more RAM is needed.
In OLAP applications, a lower ratio is acceptable, though it still depends on the nature of the OLAP workload.

Following is the query I like to use to measure buffer cache hit ratio:
SELECT (CAST(SUM(CASE LTRIM(RTRIM(counter_name))
                   WHEN 'Buffer cache hit ratio'
                     THEN CAST(cntr_value AS INTEGER)
                   ELSE NULL END) AS FLOAT) /
        CAST(SUM(CASE LTRIM(RTRIM(counter_name))
                   WHEN 'Buffer cache hit ratio base'
                     THEN CAST(cntr_value AS INTEGER)
                   ELSE NULL END) AS FLOAT)) * 100
       AS BufferCacheHitRatio
FROM sys.dm_os_performance_counters
WHERE LTRIM(RTRIM([object_name])) LIKE '%:Buffer Manager' AND
      [counter_name] LIKE 'Buffer Cache Hit Ratio%'

Sql Server Internal: Negative Side Effects of Altering Tables

What do you think happens when you alter a database table in SQL Server? Most of the time SQL Server just makes changes in the metadata and NO physical changes are made. This commonly happens when you:
o   Drop a column
o   Add a new nullable column (NULL is assumed as the value for all existing rows)
o   Increase the length of a variable-length column
o   Change a non-nullable column to allow NULLs
Due to this behavior, a few major negative side effects can be observed.
1.       When a fixed-length column (i.e. a column with data type CHAR, NCHAR, INT, SMALLINT, etc.) is altered to increase its length, the old column is not actually replaced. Rather, a new column is added to the table, and DBCC PAGE shows you that the old data is still there.
2.       When you try to decrease the length of a fixed-length column, SQL Server just changes the metadata so that column values conform to the new length; NO physical change occurs, which means the size is never decreased. For example, if you change a CHAR(15) column to CHAR(10), SQL Server never physically shrinks it to 10. The column physically remains 15 characters long, and it is only the metadata that restricts you to 10 characters.
3.       One more drawback: when you drop a column from your table, no physical action is performed (the column is not dropped physically); only the metadata changes, so you can no longer see or use the column. No space is released, and the column still exists.

               Make your desired changes through ALTER TABLE and then reclaim the table space by recreating the table or simply rebuilding the clustered index.
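As a sketch (table, column, and index names here are hypothetical), reclaiming the space could look like this:

```sql
-- Drop the column: only metadata changes, no space is released yet
ALTER TABLE dbo.MyTable DROP COLUMN OldColumn

-- Rebuilding the clustered index rewrites the rows and reclaims the space
-- (assumes a clustered index named PK_MyTable exists)
ALTER INDEX PK_MyTable ON dbo.MyTable REBUILD

-- For dropped variable-length columns, DBCC CLEANTABLE can also
-- reclaim the space without a full rebuild
DBCC CLEANTABLE ('MyDatabase', 'dbo.MyTable')
```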

Sql Server: Why We Can’t Add New Column in Between Existing Columns

Internally, SQL Server maintains a COLUMN_ID for each column of a table. Suppose we have two columns in a table and we want to add one more. The newly added column will get the COLUMN_ID after the last created column, i.e. if the last COLUMN_ID was 2, the new column will have COLUMN_ID 3, and the column will be placed at the end of the existing columns. That's why we can't add a column in between the existing columns of a table.
Though column order doesn't really matter, if we still need a specific order we have to drop and recreate the table with the new column sequence.
From the title alone, most readers may conclude that WE CAN insert a new column between existing columns. Yes, it is possible through SQL Server Management Studio, but the reality is different, as SSMS follows the same drop-and-recreate steps, i.e.:
      I.      Creates a temporary table according to existing structure of table
     II.      Shifts data from original table to temporary one
    III.      Drops original table
    IV.      Creates new table according to NEWLY GIVEN SEQUENCE of columns
     V.      Shifts data from temporary table to newly created table
    VI.      Drops temporary table
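The steps above can be sketched in T-SQL (table and column names are hypothetical; suppose dbo.MyTable(Col1, Col3) exists and we want Col2 in the middle):

```sql
-- I. Temporary table according to the existing structure
CREATE TABLE dbo.Tmp_MyTable (Col1 INT, Col3 INT)
-- II. Shift data from the original table to the temporary one
INSERT INTO dbo.Tmp_MyTable (Col1, Col3)
SELECT Col1, Col3 FROM dbo.MyTable
-- III. Drop the original table
DROP TABLE dbo.MyTable
-- IV. Recreate it with the NEWLY GIVEN SEQUENCE of columns
CREATE TABLE dbo.MyTable (Col1 INT, Col2 INT NULL, Col3 INT)
-- V. Shift data back from the temporary table
INSERT INTO dbo.MyTable (Col1, Col3)
SELECT Col1, Col3 FROM dbo.Tmp_MyTable
-- VI. Drop the temporary table
DROP TABLE dbo.Tmp_MyTable
```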
It is the COLUMN_ID which is responsible for displaying columns in a fixed sequence whenever we execute a SELECT * statement on a given table.
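You can inspect these COLUMN_ID values through the sys.columns catalog view, for example against the TempTable created earlier:

```sql
-- Columns are returned by SELECT * in ascending column_id order;
-- sys.columns exposes that internal ordering
SELECT name, column_id
FROM sys.columns
WHERE object_id = OBJECT_ID('dbo.TempTable')
ORDER BY column_id
```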