Table Scans and Index Scans affects more than the table they access

SQL Server only queries data in memory (data cache). If the data needed is not cached, SQL Server will retrieve the data from disk and load it to data cache, and then SQL Server will use the data from the cache.

I have a general guide line that Table Scans and Index Scans are bad. This may not be an issue for small tables, but for large tables scans can cause significant performance issues. For example, if a query accesses a table that is 20 GB in size and a scan occurs, then there is a good chance that all data for that entity will be loaded in memory. If this data is not in memory, then SQL must fetch the data from disk and load it into memory. Fetching data from disk is usually an expensive IO process. If there is not enough available space in the data cache, SQL Server will remove (flush) data from the cache to make room for data that was retrieved from disk.

Data that is used often and cached can be removed from the cache due to poor queries or the lack of an index. Here’s a contrived example. We have 2 tables. The 1st table is Orders and it contains 10 million records and requires 3 GB of disk space. The Orders table is extremely important and is used in most queries and queries that access this table must return very quickly. The 2nd table TaskLog; contains 200 million records and requires 7 GB of disk space. For simplicity, neither table has any non-cluster indexes.

Let’s presume that the server has 8 GB memory. If all queries are executed on the Orders table, eventually most of the data from the Orders table would be in the data cache. There would be little need for SQL to access the disk. Queries would execute fairly fast.

Now, UserA queries the TaskLog table. The query gets counts of TaskType(see example query below). When the user executes this query a table scan is used. Since the data is not in memory, SQL Server will transfer the data from disk to memory. The problem is that there is not enough memory to contain both the Orders and TaskLog table. Since there’s not enough memory SQL Server will flush Orders data from memory and replace it with data with the TaskLog data.

Now the issue is that any queries that need to access Ordres will be retrieve from disk. This will incur a penalty in performance.

There are many options to solve this problem; indexes could be created on both the Orders and TaskLog table, more memory could be added, and there are probably other options.

But how do you identify if memory allocation is a problem. Below is a query that retrieves space used by all Cluster Indexes and Non-Cluster Indexes. It will show the size of the entity on disk and how much of the entity is in memory.

Here sample result from the query (names have been changed to protect the innocent)

Table Name Index Name Index MB Buffer MB Index In Buffer Percent
Table1 PK_Table1 211875 20586 10%
Table2 PK_Table2 3711 3348 90%
Table3 PK_Table3 27689 2246 8%
Table4 IX_Table4_A 52181 1675 3%
Table5 PK_Table5 278409 1436 1%
Table4 IX_Table4_B 28585 1418 5%
Table2 IX_Table2_A 725 745 103%
Table6 PK_Table6 572 572 100%
Table3 IX_Table3_A 15701 493 3%
Table3 IX_Table3_B 17756 467 3%
Table7 PK_Table7 461 461 100%

Table2 is equivalent to our Orders table in the example. It’s very important that results from this table are returned fairly fast. As we can see 90% of data for PK_Table2 is stored in memory; this is good.

PK_Table1 is 211 GB and 20 GB are in memory. For this example speed in retrieving data from this table isn’t that important and 20GB in memory seems too much. This could be an indication that a scan is being used to access this data, or someone is running a query that they shouldn’t. This provides some good information to further my investigation.

Having one bad query can affect not just the performance of 1 table but the performance of the system as a whole.

SQL Server Create Index – sys.dm_exec_requests Percent Completed

I had an observation today in regards to creating an index that you may find interesting.  Per SQL Server 2008 (not R2) documentation for sys.dm_exec_requests it mentions that percent_completed works for the following commands:

  • ALTER INDEX REORGANIZE
  • AUTO_SHRINK option with ALTER DATABASE
  • BACKUP DATABASE
  • CREATE INDEX
  • DBCC CHECKDB
  • DBCC CHECKFILEGROUP
  • DBCC CHECKTABLE
  • DBCC INDEXDEFRAG
  • DBCC SHRINKDATABASE
  • DBCC SHRINKFILE
  • KILL (Transact-SQL)
  • RESTORE DATABASE,
  • UPDATE STATISTICS.

sys.dm_exec_requests (Transact-SQL)
http://msdn.microsoft.com/en-us/library/ms177648(v=sql.100).aspx

I was creating an index and I wanted to know how much time was remaining until the index creation finished.

So I executed the following query. I constantly ran this query, but percent_complete never changed from 0.

This seemed odd. Based on SQL Server documentation percent_completed should have been provided for Create Index.

Here’s an individual that talks about the discrepancies for sys.dm_exec_requests
Differences in documentation for sys.dm_exec_requests
http://sqlblog.com/blogs/aaron_bertrand/archive/2011/02/07/differences-in-documentation-for-sys-dm-exec-requests.aspx

SQL Server documentation for 2008 R2 and 2012 does not include Create Index for percent_complete.

The conclusion here is that for sys.dm_exec_requests, percent_complete is not provided for Create Index.  And I’m not sure if it ever was supported.

Other links
DOC : sys.dm_exec_requests topic is inconsistent between versions
http://connect.microsoft.com/SQLServer/feedback/details/641790/doc-sys-dm-exec-requests-topic-is-inconsistent-between-versions

SQL Server Objects In Memory VS Disk

For a few months I’ve been meaning to do an extensive post on a SQL Server query that would show the amount of each index/table in memory (Data Cache). I haven’t had the time to do this write-up, so I’m jut providing the query at this time. Hopefully in the future I will have time to describe this better.

The following query has been extremely beneficial at my work. It has help to identify indexes that should not be in memory and performance issues. This query doesn’t usually identify a root-cause, but it does seem to point you into the correct direction.

In the past there was a situation where a Cluster Index Scan was being used on a 20 GB table. This caused major performance problems for our system. The Cluster Index was loaded into memory (data cache) which forced other important data out of the data cache. By using this query to show indexes in memory, it was easy to identify the table that was being queried and inturn find the query that was causing the problem.

IF OBJECT_ID('tempdb..#PhysicalBufferSize') IS NOT NULL
BEGIN
PRINT 'DROP TABLE tempdb..#PhysicalBufferSize'
DROP TABLE #PhysicalBufferSize
END

SELECT
PhysicalSize.TableName
,PhysicalSize.IndexName
,PhysicalSize.Index_MB
,BufferSize.Buffer_MB
,CASE
WHEN Index_MB != 0 AND Buffer_MB != 0 THEN
CAST(Buffer_MB AS Float) / CAST(Index_MB AS Float)
ELSE 0
END IndexInBuffer_Percent
INTO #PhysicalBufferSize
FROM
(
--Index Disk Allocation per file
SELECT
OBJECT_NAME(i.OBJECT_ID) AS TableName,
i.name AS IndexName,
i.index_id AS IndexID,
SUM(a.used_pages) / 128 AS 'Index_MB'
FROM sys.indexes AS i
JOIN sys.partitions AS p ON
p.OBJECT_ID = i.OBJECT_ID
AND p.index_id = i.index_id
JOIN sys.allocation_units AS a ON
a.container_id = p.partition_id
WHERE
i.object_id > 100
GROUP BY i.OBJECT_ID,i.index_id,i.name
--ORDER BY 8 * SUM(a.used_pages) DESC--OBJECT_NAME(i.OBJECT_ID),i.index_id
) PhysicalSize

LEFT JOIN
(
--Index Memory Allocations per file
SELECT
obj.[name] TableName,
i.[name] IndexName,
obj.[index_id] IndexID,
i.[type_desc],
count_BIG(*)AS Buffered_Page_Count ,
count_BIG(*) /128 as Buffer_MB --8192 / (1024 * 1024)
FROM sys.dm_os_buffer_descriptors AS bd
INNER JOIN
(
SELECT object_name(object_id) AS name
,index_id ,allocation_unit_id, object_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p ON
au.container_id = p.hobt_id
AND (au.type = 1 OR au.type = 3 OR au.type = 2)
) AS obj ON
bd.allocation_unit_id = obj.allocation_unit_id
LEFT JOIN sys.indexes i on
i.object_id = obj.object_id
AND i.index_id = obj.index_id
WHERE database_id = db_id()
GROUP BY obj.name, obj.index_id , i.[name],i.[type_desc]
--ORDER BY Buffered_Page_Count DESC
) BufferSize ON
PhysicalSize.TableName = BufferSize.TableName
AND PhysicalSize.IndexID = BufferSize.IndexID
ORDER BY Buffer_MB DESC

--*****************************************************************
-- Queries that work of Temp Table
--*****************************************************************
SELECT *
FROM #PhysicalBufferSize
ORDER By Buffer_MB DESC--TableName, IndexName DESC

SELECT TableName, SUM(Index_MB) Index_MB, SUM(Buffer_MB) Buffer_MB,
CASE
WHEN SUM(Index_MB) != 0 AND SUM(Buffer_MB) != 0 THEN
CAST(SUM(Buffer_MB) AS Float) / CAST(SUM(Index_MB) AS Float)
ELSE 0
END IndexInBuffer_Percent
FROM #PhysicalBufferSize
GROUP BY TableName
ORDER BY Buffer_MB DESC

SELECT SUM(Index_MB) Index_MB, SUM(Buffer_MB) Buffer_MB,
CASE
WHEN SUM(Index_MB) != 0 AND SUM(Buffer_MB) != 0 THEN
CAST(SUM(Buffer_MB) AS Float) / CAST(SUM(Index_MB) AS Float)
ELSE 0
END IndexInBuffer_Percent
FROM #PhysicalBufferSize

SELECT *
FROM #PhysicalBufferSize
ORDER By Index_MB DESC

The following grid shows an example output from AdventureWorks database. The grid identifes the size of the index on disk (Index MB) and the amount of the index in the Data Cache.

Table
Name
Index Name Index MB Buffer MB Percent
Person PK_Person_BusinessEntityID 29 29 1.00
SalesOrderDetail PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID 9 9 1.00
TransactionHistory PK_TransactionHistory_TransactionID 6 6 1.00
DatabaseLog NULL 6 6 1.00
WorkOrderRouting PK_WorkOrderRouting_WorkOrderID_ProductID_OperationSequence 5 5 1.00
SalesOrderHeader PK_SalesOrderHeader_SalesOrderID 5 5 1.00
WorkOrder PK_WorkOrder_WorkOrderID 4 4 1.00
TransactionHistoryArchive PK_TransactionHistoryArchive_TransactionID 5 4 0.80
Address PK_Address_AddressID 2 2 1.00
CreditCard PK_CreditCard_CreditCardID 1 1 1.00
EmailAddress PK_EmailAddress_BusinessEntityID_EmailAddressID 1 1 1.00
Password PK_Password_BusinessEntityID 1 1 1.00
PersonPhone PK_PersonPhone_BusinessEntityID_PhoneNumber_PhoneNumberTypeID 1 1 1.00
Address IX_Address_AddressLine1_AddressLine2_City_StateProvinceID_PostalCode 1 1 1.00
Shift AK_Shift_StartTime_EndTime 0 0 0.00

Resource:
SQLTEAM - What Data is in SQL Server's Memory?
MSDN - sys.dm_os_buffer_descriptors (Transact-SQL)
Glenn Berry - SQL Server Performance 1
Glenn Berry - SQL Server Performance 2

Find Parent Table

I’m in the need of a function that will find the parent table based on a child table name and a column name.  The column should be a foreign key.  I tried to find a solution on the web, but couldn’t find anything.  I probably used the wrong terminology in the searches.  So this is what I have come up with.

'CREATED FUNCTION fn_GetParentTable'
GO
CREATE
FUNCTION dbo.fn_GetParentTable(
     @ChildTable VARCHAR(200)
     ,@ChildColumn VARCHAR(200)
RETURNS
VARCHAR(MAX)
BEGIN
     DECLARE @ParentTableName VARCHAR(MAX)
     SELECT @ParentTableName = Parent_Table_Constraints.Table_Name
     FROM Information_Schema.Table_Constraints Table_Constraints
     INNER JOIN Information_Schema.CONSTRAINT_COLUMN_USAGE CONSTRAINT_COLUMN_USAGE ON
     Table_Constraints.Constraint_Name = CONSTRAINT_COLUMN_USAGE.Constraint_Name
     INNER JOIN Information_Schema.REFERENTIAL_CONSTRAINTS REFERENTIAL_CONSTRAINTS ON
  CONSTRAINT_COLUMN_USAGE.Constraint_Name = REFERENTIAL_CONSTRAINTS.Constraint_Name
     INNER JOIN Information_Schema.Table_Constraints Parent_Table_Constraints ON
  REFERENTIAL_CONSTRAINTS.Unique_Constraint_Name = Parent_Table_Constraints.Constraint_Name
     WHERE
  CONSTRAINT_COLUMN_USAGE.Table_Name = @ChildTable
  AND CONSTRAINT_COLUMN_USAGE.Column_Name = @ChildColumn
  AND Table_Constraints.Constraint_Type = 'FOREIGN KEY'
     RETURN @ParentTableName
END
GO

SELECT dbo.fn_GetParentTable ('tableName', 'ColumnName')

 

Remove Multiple Spaces

I have a need to replace multiple space with a single space in varchar during a SQL statement.
 
On the web there were many solutions, but these two seemed to be the more simple ones.
 

DECLARE

@strValue VARCHAR(MAX)

SET @strValue = sf    ds     fds f fds fd sf    ds    

 
While

CharIndex(‘  ‘, @strValue)>0

    Select

@strValue = Replace(@strValue, ‘  ‘, ‘ ‘)

 

–********************************

DECLARE

@strValue VARCHAR(MAX)

SET

@strValue = ‘     sf ds fds f ds       fds         fd sf ds ‘

SELECT

LTRIM(REPLACE(REPLACE(REPLACE(@strValue,‘  ‘, ‘ ‘), ‘  ‘, ‘ ‘), ‘  ‘, ‘ ‘))

 
Resources:
 
Removing unwanted spaces within a string …
 
Squeeze Function

SQL Function for Parsing or Spliting String

I have a table that stores a delimited string in a column.  I need this string to be in a table format.  So instead of trying to create this myself, I decided to Google and see if I could find any SQL Server function that could parse or split a delimited string into a table.   I found many good example and below in resource are links to the better results that I found.
 
This is probably the best one I found.  Since it’s the most complex, it has to be the best. Right?
 
Clayton Groom
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS OFF
GO
create   function fn_ParseText2Table
(
@p_SourceText  varchar(8000)
,@p_Delimeter varchar(100) = ,’ default to comma delimited.
)
RETURNS @retTable TABLE
(
  Position  int identity(1,1)
,Int_Value int
,Num_value Numeric(18,3)
,txt_value varchar(2000)
)
AS
/*
********************************************************************************
Purpose: Parse values from a delimited string
  & return the result as an indexed table
Copyright 1996, 1997, 2000, 2003 Clayton Groom (Clayton_Groom@hotmail.com)
Posted to the public domain Aug, 2004
06-17-03 Rewritten as SQL 2000 function.
Reworked to allow for delimiters > 1 character in length
and to convert Text values to numbers
********************************************************************************
*/

BEGIN
DECLARE @w_Continue  int
  ,@w_StartPos  int
  ,@w_Length  int
  ,@w_Delimeter_pos int
  ,@w_tmp_int  int
  ,@w_tmp_num  numeric(18,3)
  ,@w_tmp_txt   varchar(2000)
  ,@w_Delimeter_Len tinyint
if len(@p_SourceText) = 0
begin
  SET  @w_Continue = 0 force early exit
end
else
begin
parse the original @p_SourceText array into a temp table
  SET  @w_Continue = 1
  SET @w_StartPos = 1
  SET @p_SourceText = RTRIM( LTRIM( @p_SourceText))
  SET @w_Length   = DATALENGTH( RTRIM( LTRIM( @p_SourceText)))
  SET @w_Delimeter_Len = len(@p_Delimeter)
end
WHILE @w_Continue = 1
BEGIN
  SET @w_Delimeter_pos = CHARINDEX( @p_Delimeter
      ,(SUBSTRING( @p_SourceText, @w_StartPos
      ,((@w_Length – @w_StartPos) + @w_Delimeter_Len)))
      )

  IF @w_Delimeter_pos > 0  delimeter(s) found, get the value
  BEGIN
   SET @w_tmp_txt = LTRIM(RTRIM( SUBSTRING( @p_SourceText, @w_StartPos
        ,(@w_Delimeter_pos – 1)) ))
   if isnumeric(@w_tmp_txt) = 1
   begin
    set @w_tmp_int = cast( cast(@w_tmp_txt as numeric) as int)
    set @w_tmp_num = cast( @w_tmp_txt as numeric(18,3))
   end
   else
   begin
    set @w_tmp_int =  null
    set @w_tmp_num =  null
   end
   SET @w_StartPos = @w_Delimeter_pos + @w_StartPos + (@w_Delimeter_Len- 1)
  END
  ELSE No more delimeters, get last value
  BEGIN
   SET @w_tmp_txt = LTRIM(RTRIM( SUBSTRING( @p_SourceText, @w_StartPos
      ,((@w_Length – @w_StartPos) + @w_Delimeter_Len)) ))
   if isnumeric(@w_tmp_txt) = 1
   begin
    set @w_tmp_int = cast( cast(@w_tmp_txt as numeric) as int)
    set @w_tmp_num = cast( @w_tmp_txt as numeric(18,3))
   end
   else
   begin
    set @w_tmp_int =  null
    set @w_tmp_num =  null
   end
   SELECT @w_Continue = 0
  END
  INSERT INTO @retTable VALUES( @w_tmp_int, @w_tmp_num, @w_tmp_txt )
END
RETURN
END
GO
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS ON
GO

 
  
Resource:
SQL User Defined Function to Parse a Delimited String
 
Parse String into Table
 
Function to Split a Delimited String into a Table
 
Parse delimited string in a Stored Procedure
 
 

SQL – How to use (not use) GO within Batches

“GO” is not a T-SQL statement, it’s a command specific to utilities such as SQLCmd, OSQL, Query Analyzer and SQL Server Management Studio.  Go is not part of ANSI-SQL or Transaction-SQL (TSQL).  The GO command is interpreted by these tools to trigger a batch of Transaction-SQL.

 

In the past I’ve ran into problems using GO when creating scripts that are used for moving data or structure to another database.  Usually these scripts include multiple Creates, Alters, Inserts DROP and etc.  A bad habit I had was to add the GO command after every statement, which is not needed.  A caveat pertaining to Stored Procedures, in a script that contains multiple statements, a GO command must be the last line of the SP.  If a GO command is not include as the last line of the SP then all then the code following the SP will be compiled and save in the SP.  I’ve seen this problem many times when a person creates a Stored Procedure and in the same script add permissions to that SP without a GO command separating the two statements. Example:

 

 

CREATE PROCEDURE DoSomeThing

  @Var1 VARCHAR(50)

AS

  SELECT TOP 10 Col1, Col2

  FROM Table

  WHERE Col2 = @Car1

–GO –Go is needed hter

GRANT EXECUTE ON DoSomeThing TO User

 

Now every time the DoSomeThing SP is executed the “Grant Excute…” statement will be ran also.

 

Also the scope of local (user-defined) variables is limited to a batch, and cannot be referenced after a GO command.

 

 

 

Resources:

SQL Server Books -ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/b2ca6791-3a07-4209-ba8e-2248a92dd738.htm

http://msdn.microsoft.com/en-us/library/ms188037.aspx

 

SQL Server batch rules

http://searchsqlserver.techtarget.com/tip/0,289483,sid87_gci1161826,00.html

 

Errors in T-SQL Batches and the GO Statement in ETL process

http://www.siusic.com/wphchen/errors-in-t-sql-batches-and-the-go-statement-in-etl-process-63.html

 

IF and TRANSACTION problem

http://www.eggheadcafe.com/software/aspnet/32705839/if-and-transaction-proble.aspx

 

 

SQL Server Dynamic Query SQL Server without using Dynamic SQL but with XML

There always seems to be  a need for dynamic SQL, but security seems too limit where it can be used.  I found the article and it seemed very intriguing, but have not been able to deeply look into it, so I’m going to put it here for future reading.
 
"Running a Dynamic Query against SQL Server without using Dynamic SQL"

SQL Server Searching Object Text

I had a situation where soft deletes were going to be implemented in the database, but for the past year hard deleted were being implemented.   On every table the IsDeleted column was created.  The newer Stored Procedures, Views, and Functions included IsDeleted in there WHERE statement, but all the db object that were created prior to implementing the IsDeleted needed to be changed to include the IsDeleted. 
 
The following was my first attempt to identify DB objects that did not include IsDeleted.  This seemed to return the correct data. But, there seem to be a problem.  There were situations where the results that were returned included duplicate names.  Come to find out the [Text] column that is returned from syscomments is NVARCHAR(4000).  If the [Text] of DB object exceeded 4000 characters an additional records was created every additional 4000 characters.   There were a few stored procedures that surpassed 4000 characters and in-turn had duplicate Object_names returned in the results .  This is not that big of a deal, but I wasn’t getting back the results I expected. 
 

SELECT OBJECT_NAME(id), *

FROM syscomments

WHERE

(

OBJECTPROPERTY(id, ‘IsScalarFunction’) = 1

OR OBJECTPROPERTY(id, ‘IsTableFunction’) = 1

OR OBJECTPROPERTY(id, ‘IsProcedure’) = 1

OR OBJECTPROPERTY(id, ‘IsView’) = 1

)

AND [Text] NOT LIKE ‘%IsDeleted%’

 
I found a solution to my problem by referencing sys.objects table and using the function OBJECT_DEFINITION, which returns the Transact-SQL source text of the definition of a specified object.  OBJECT_DEFINITION is not limited to NVARCHAR(4000), but instead is NVARCHAR(MAX).  I probably could have referenced INFORMATION_SCHEMA.Tables, but decided not to mess with it today and just wanted to get it to work.
 

SELECT

Name, *

FROM

sys.objects

WHERE

(

  OBJECTPROPERTY(object_id, ‘IsScalarFunction’) = 1

  OR OBJECTPROPERTY(object_id, ‘IsTableFunction’) = 1

  OR OBJECTPROPERTY(object_id, ‘IsProcedure’) = 1

  OR OBJECTPROPERTY(object_id, ‘IsView’) = 1

)

AND OBJECT_DEFINITION(object_id) NOT LIKE ‘%IsDeleted%’

 
 
 
References:
SQL Server OBJECT_DEFINITION (Transact-SQL)
 

SQL Server sysobjects

 

Searching Syscomments Accurately (View the Comments section)

 

SQL Server Custom Shortcuts

Querying the INFORMATION_SCHEMA  (But has info about SQL Server Shortcuts)
This is where I originally found info about SQL Server shortcuts.
 
Assigning shortcuts to commonly run tasks in SQL Server 2005 Management Studio
 
SELECT * FROM, sp_helptext ShortCut for SQL Query Analyzer
 
 
SQL SERVER – Query Analyzer Shortcuts (SQL Server default shortcut keys)
This is probably my favorite SQL Server web site