SQL Server: Does Unwanted Tables in a Query or View Affect Performance


Recently a friend of mine asked, that is it true that presence of extra tables in joins section of a query, will affect query performance. Extra tables means, tables which can be skipped from query without affecting query result. For example following query has extra tables (other than vendor and contact tables) in join section
USE AdventureWorks
GO

SELECT  Vendor.Name,
        Contact.Title,
        Contact.FirstName,
        Contact.MiddleName
FROM    Person.Address AS a
        INNER JOIN Purchasing.VendorAddress AS VendorAddress 
        ON a.AddressID = VendorAddress.AddressID
        INNER JOIN Person.StateProvince AS StateProvince 
        ON StateProvince.StateProvinceID = a.StateProvinceID
        INNER JOIN Person.CountryRegion AS CountryRegion 
        ON CountryRegion.CountryRegionCode = StateProvince.CountryRegionCode
        INNER JOIN Purchasing.Vendor AS Vendor
        INNER JOIN Purchasing.VendorContact AS VendorContact 
        ON VendorContact.VendorID = Vendor.VendorID
        INNER JOIN Person.Contact AS Contact 
        ON Contact.ContactID = VendorContact.ContactID
        INNER JOIN Person.ContactType AS ContactType 
        ON VendorContact.ContactTypeID = ContactType.ContactTypeID 
        ON VendorAddress.VendorID = Vendor.VendorID
Though this is NOT common to have extra tables in our usual queries but it could be possible in views. A view can be created with multiple tables and selecting columns from each joined table. And later on when we will query this view we can use only few columns in our select statement. So when we will execute above query SQL Server Query Analyzer will skip all those tables which are not part of game. Here is execution plan of above query.

Same query with more columns, pushing all tables in action.

SELECT  Vendor.Name,
        ContactType.Name AS ContactType,
        Contact.Title,
        Contact.FirstName,
        Contact.MiddleName,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        StateProvince.Name AS StateProvinceName,
        a.PostalCode,
        CountryRegion.Name AS CountryRegionName,
        Vendor.VendorID
FROM    Person.Address AS a
        INNER JOIN Purchasing.VendorAddress AS VendorAddress 
        ON a.AddressID = VendorAddress.AddressID
        INNER JOIN Person.StateProvince AS StateProvince 
        ON StateProvince.StateProvinceID = a.StateProvinceID
        INNER JOIN Person.CountryRegion AS CountryRegion 
        ON CountryRegion.CountryRegionCode = StateProvince.CountryRegionCode
        INNER JOIN Purchasing.Vendor AS Vendor
        INNER JOIN Purchasing.VendorContact AS VendorContact 
        ON VendorContact.VendorID = Vendor.VendorID
        INNER JOIN Person.Contact AS Contact 
        ON Contact.ContactID = VendorContact.ContactID
        INNER JOIN Person.ContactType AS ContactType 
        ON VendorContact.ContactTypeID = ContactType.ContactTypeID 
        ON VendorAddress.VendorID = Vendor.VendorID

If we create a view using our second query and use our view in following style then execution plan will be same to our first query.

SELECT  Name,
        Title,
        FirstName,
        MiddleName
FROM    vw_MyView

Hence, SQL Server Query Analyser is quite smart and work on only those tables which are part of actual game and it doesn’t matter that extra tables are part of your query or a view.

Advertisements

Posted on June 8, 2011, in Daily Dose, Performance Tuning, Sql Server 2005, Sql Server 2008, TSQL Tips n Tricks, View. Bookmark the permalink. 5 Comments.

  1. Eliel Mamousette

    Aasim, proceeding here without much context, I would say that the rule of thumb to follow here is “Yes, there should be no extra tables included in a join statement”.

    Depending on the quality of the query as written, and the optimizer being used, additional tables can have an exponential effect on the number of IO requests performed to process a given query.

    Having said that, most modern DBMS systems will “do the right thing” and hence the net effect of including those tables should not be significant. But I have seen instances where the optimizer did not do the right thing, so I always caution against assuming that a generalized parser can do a better job than the DBA who knows the content of the tables being queried.

  2. Sometimes this is true, but not always.

    I recently discovered a subquery that was something like:
    SELECT DISTINCT
    p.PK
    FROM
    dbo.Parent p
    INNER JOIN
    dbo.Child c
    ON c.ParentPK = p.PK

    and found that the child table wasn’t removed.

  3. Tom Rigg via LinkedIn

    joins always effect performance, lets say you want 1,000 rows of 100,000 the join is done on all 100,000 rows. Of course if you structure of your query can limit those joins.

  4. Optimizer can ignore all these extra tables ONLY because they have FKs.

    Try to do delete one:
    ALTER TABLE [Purchasing].[VendorContact] DROP CONSTRAINT [FK_VendorContact_ContactType_ContactTypeID]

    then run your first query and look at the plan.

  1. Pingback: DotNetShoutout

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: