Indexing Advice

**Simon Hayes** · Jul 20 '05, 02:37 AM

Re: Indexing Advice

"Daniel Roy" <danielroy10jun k@hotmail.com> wrote in message
news:3722db.040 1190652.41d33a3 0@posting.googl e.com...[color=blue]
> Hi guys,
> I've been playing with this query for about a week now, and I still
> can't find ways to speed it up. It runs for about 5 minutes. The only
> table (among the 41) with significant amount of data is dbo.S_EVT_ACT,
> which contains about 900,000 records. The query is automatically
> generated by Siebel, and therefore not really modifiable. And hints
> are not really possible. While looking at the exec plan, we can see
> that what takes so long is a clustered index scan of T1. This
> clustered index is defined on a column ROW_ID (a Siebel-generated
> primary key used for every table). I tried to define all kinds of
> indexes in order to bypass this clustered index scan, but no success
> so far ... Here's the query:
>
> SELECT ...
> FROM
> dbo.S_EVT_ACT T1
> LEFT OUTER JOIN dbo.S_EVT_ACT_F NX T2 ON T1.ROW_ID =
> T2.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_FN_APPR T3 ON T2.AMS_ACT_ID =
> T3.ROW_ID
> LEFT OUTER JOIN dbo.S_ORG_EXT T4 ON T1.PERFRM_BY_OU _ID =
> T4.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_PRI_LST T5 ON T1.PRI_LST_ID =
> T5.ROW_ID
> LEFT OUTER JOIN dbo.S_CONTACT T6 ON T1.PR_CON_ID =
> T6.PAR_ROW_ID
> AND T1.PR_CON_ID = T6.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_PRI_LST T7 ON T1.RATE_LST_ID =
> T7.ROW_ID
> LEFT OUTER JOIN dbo.S_ACT_PRDIN T T8 ON T1.ROW_ID =
> T8.ACTIVITY_ID
> LEFT OUTER JOIN dbo.S_ASSET T9 ON T1.ASSET_ID = T9.ROW_ID
> LEFT OUTER JOIN dbo.S_CONTACT T10 ON T1.PR_CON_ID =
> T10.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_DOC_AGREE T11 ON T1.AGREEMENT_ID =
> T11.ROW_ID
> LEFT OUTER JOIN dbo.S_EXP_RPT T12 ON T1.PR_EXP_RPT_I D =
> T12.ROW_ID
> LEFT OUTER JOIN dbo.S_INS_CLAIM T13 ON T1.INSCLM_ID =
> T13.ROW_ID
> LEFT OUTER JOIN dbo.S_EVT_ACT_S S T14 ON T1.ROW_ID =
> T14.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_INT_INSTA NCE T15 ON T14.OWN_INST_ID =
> T15.ROW_ID
> LEFT OUTER JOIN dbo.S_ME_EVT_LS T16 ON T1.ME_EVT_ID =
> T16.ROW_ID
> LEFT OUTER JOIN dbo.S_OPTY T17 ON T1.OPTY_ID = T17.ROW_ID
> LEFT OUTER JOIN dbo.S_ORG_EXT T18 ON T1.TARGET_OU_ID =
> T18.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_PART_RPR T19 ON T1.PART_RPR_ID =
> T19.ROW_ID
> LEFT OUTER JOIN dbo.S_POSTN T20 ON T1.OWNER_POSTN_ ID =
> T20.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_PROD_DEFE CT T21 ON T1.SRA_DEFECT_I D =
> T21.ROW_ID
> LEFT OUTER JOIN dbo.S_PROJ T22 ON T1.PROJ_ID = T22.ROW_ID
> LEFT OUTER JOIN dbo.S_PROJITEM T23 ON T1.PROJ_ITEM_ID =
> T23.ROW_ID
> LEFT OUTER JOIN dbo.S_SRC T24 ON T1.SRC_ID = T24.ROW_ID
> LEFT OUTER JOIN dbo.S_SRV_REQ T25 ON T1.SRA_SR_ID =
> T25.ROW_ID
> LEFT OUTER JOIN dbo.S_TMPL_PLAN ITEM T26 ON T1.ASSESS_TMPL_ ID
> =
> T26.ROW_ID
> LEFT OUTER JOIN dbo.S_TMSHT_LIN E T27 ON T1.PR_TMSHT_LIN E_ID
> =
> T27.ROW_ID
> LEFT OUTER JOIN dbo.S_EVT_ACT T28 ON T1.TEMPLATE_ID =
> T28.ROW_ID
> LEFT OUTER JOIN dbo.S_SUSP_ACT T29 ON T1.ROW_ID =
> T29.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_EVT_MAIL T30 ON T1.ROW_ID =
> T30.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_SRV_ACT T31 ON T1.ROW_ID =
> T31.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_ACT_EMP T32 ON T1.OWNER_PER_ID =
> T32.EMP_ID AND
> T1.ROW_ID = T32.ACTIVITY_ID
> LEFT OUTER JOIN dbo.S_PARTY T33 ON T32.EMP_ID = T33.ROW_ID
> LEFT OUTER JOIN dbo.S_CONTACT_F NX T34 ON T32.EMP_ID =
> T34.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_USER T35 ON T32.EMP_ID =
> T35.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_PARTY T36 ON T1.PR_CON_ID = T36.ROW_ID
> LEFT OUTER JOIN dbo.S_CONTACT T37 ON T1.PR_CON_ID =
> T37.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_CONTACT_F NX T38 ON T1.PR_CON_ID =
> T38.PAR_ROW_ID
> LEFT OUTER JOIN dbo.S_PROD_INT T39 ON T1.PR_PRDINT_ID =
> T39.ROW_ID
> LEFT OUTER JOIN dbo.S_ACT_PROD_ APPL T40 ON T1.PR_PRDINT_ID =
> T40.ROW_ID
> LEFT OUTER JOIN dbo.S_PROD_INT T41 ON T40.PRDINT_ID =
> T41.ROW_ID
> WHERE
> (
> (T1.PAR_EVT_ID = T1.ROW_ID OR T1.PAR_EVT_ID IS NULL OR
> T1.PAR_EVT_ID !=
> T1.ROW_ID AND T1.OPTY_ID IS NOT NULL OR T1.SUBTYPE_CD =
> 'General' AND
> T1.TODO_CD != 'Marketing eEvent') AND
> (T1.APPT_REPT_R EPL_CD IS NULL) AND
> (T1.TEMPLATE_FL G != 'Y' AND T1.TEMPLATE_FLG != 'P' AND
> (T1.OPTY_ID IS NULL OR T17.SECURE_FLG = 'N' OR T1.OPTY_ID IN
> (
> SELECT SQ1_T2.OPTY_ID
> FROM dbo.S_PARTY SQ1_T1
> INNER JOIN dbo.S_OPTY_POST N SQ1_T2
> ON SQ1_T2.POSITION _ID = SQ1_T1.ROW_ID
> INNER JOIN dbo.S_POSTN SQ1_T3
> ON SQ1_T1.ROW_ID = SQ1_T3.PAR_ROW_ ID
> LEFT OUTER JOIN dbo.S_CONTACT SQ1_T4
> ON SQ1_T3.PR_EMP_I D = SQ1_T4.PAR_ROW_ ID
> WHERE
> (SQ1_T4.ROW_ID = '1-23NDP')
> )
> )
> )
> AND
> (T1.PRIV_FLG = 'N' OR T1.PRIV_FLG IS NULL OR T1.OWNER_PER_ID =
> '1-23NDP')) AND
> (T1.APPT_START_ DT > DATEADD
> (SECOND, ROUND (-183 * 86400, 0), '01/15/2004 00:00:00')
> OR
> T1.TODO_ACTL_EN D_DT >
> DATEADD (SECOND, ROUND (-183 * 86400, 0), '01/15/2004
> 00:00:00'))
>
> Anyone's got an idea of which index on T1 might help me?
>
> Daniel[/color]

According to your query, it looks like you will always have either an index
or table scan of dbo.S_EVT_ACT - it is on the left side of a LEFT JOIN, and
that means that all rows from the table will always be returned. The rows
may be filtered further by the WHERE clause, but MSSQL will require all the
rows as part of the join.

If you had control over the query, you might be able to re-write it with the
filter conditions in the join, not the WHERE clause, or possibly change the
join order to put a smaller table first (although both of those would change
the logic of your query, so might not be straightforward ). You could
investigate if Siebel has some facility for changing the join order of a
query, if you can find another join order which is logically equivalent.

Other than that, there probably isn't much you can do, if you don't control
the query itself. One thing that might be possible would be to create
indexed views of common joins, then use the views instead of tables in your
queries. But there are lots of conditions attached to creating indexed views
(including no outer joins), so you would need to do some research and
testing.

Simon

**Erland Sommarskog** · Jul 20 '05, 02:37 AM

Re: Indexing Advice

Simon Hayes (sql@hayes.ch) writes:[color=blue]
> According to your query, it looks like you will always have either an
> index or table scan of dbo.S_EVT_ACT - it is on the left side of a LEFT
> JOIN, and that means that all rows from the table will always be
> returned. The rows may be filtered further by the WHERE clause, but
> MSSQL will require all the rows as part of the join.[/color]

I agree that the query is likely to be a hopeless case, but I don't
think you are right on your assement of the left join. Of course SQL Server
can apply the WHERE conditions on S_EVT_ACT, before it does all the
left joins. Consider for instance this query:

select *
from Customers C
left join Orders O ON C.CustomerID = O.CustomerID
where C.City = 'Berlin'

This query performs an Index Seek on C.City.

The problem with Daniel's query is that the WHERE conditions are very
complex. The simplest condition is that T1.APPT_REPT_RE PL_CD must be
NULL. And if this condition filters most of the rows, we have an index
candidate here. I'm not holding my breath, though.

Other possibilities would be to create some combined index on some
combinations of columns, or create separate indexes and hope for
index intersection. But it definitely looks like an uphil battle.

--
Erland Sommarskog, SQL Server MVP, sommar@algonet. se

Books Online for SQL Server SP3 at

SQL Server 2025 | Microsoft Evaluation Center

http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp

Accelerate your AI application's time to market by harnessing the power of your own data and the built-in AI capabilities of SQL Server 2025, the enterprise database with best-in-class security, performance and availability.

**Daniel Roy** · Jul 20 '05, 02:37 AM

Re: Indexing Advice

Thank you Simon for your response. Isn't there any way that the
optimizer can decide first to filter the rows from S_EVT_ACT (with a
non-clustered index and a clustered index lookup), and only afterwards
use the resulting rows to address the remaining 40 tables?

Daniel

**Erland Sommarskog** · Jul 20 '05, 02:37 AM

Re: Indexing Advice

Daniel Roy (danielroy10jun k@hotmail.com) writes:[color=blue]
> Thank you Simon for your response. Isn't there any way that the
> optimizer can decide first to filter the rows from S_EVT_ACT (with a
> non-clustered index and a clustered index lookup), and only afterwards
> use the resulting rows to address the remaining 40 tables?[/color]

As you might have seen from my post, there is. And in fact it is
likely that what SQL Server will do. However, the conditions are
so complex, that most likely it will have to scan the table anyway.

One of the left joined-tables are special, and that is T17 which
appears in the WHERE clause, so it is possible this has to be
examined first.

--
Erland Sommarskog, SQL Server MVP, sommar@algonet. se

Books Online for SQL Server SP3 at

SQL Server 2025 | Microsoft Evaluation Center

http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp

Accelerate your AI application's time to market by harnessing the power of your own data and the built-in AI capabilities of SQL Server 2025, the enterprise database with best-in-class security, performance and availability.

**Daniel Roy** · Jul 20 '05, 02:38 AM

Re: Indexing Advice

> Daniel Roy (danielroy10jun k@hotmail.com) writes:[color=blue][color=green]
> > Thank you Simon for your response. Isn't there any way that the
> > optimizer can decide first to filter the rows from S_EVT_ACT (with a
> > non-clustered index and a clustered index lookup), and only afterwards
> > use the resulting rows to address the remaining 40 tables?[/color]
>
> As you might have seen from my post, there is. And in fact it is
> likely that what SQL Server will do. However, the conditions are
> so complex, that most likely it will have to scan the table anyway.
>
> One of the left joined-tables are special, and that is T17 which
> appears in the WHERE clause, so it is possible this has to be
> examined first.[/color]

Thanx Erland and Simon for your kind advice. Now that we know that a
(non-clustered) index could possibly be used first before dealing with
the joins, anyone could tell me which index to T1 should be created,
according to the WHERE clause? I agree with you, Erland, when you say
it will be an uphill battle, but I think it's worth a try. Here are
some specific questions which stop me currently:
- can a comparison between 2 columns of the same table be resolved
with a non-clustered index (T1.PAR_EVT_ID = T1.ROW_ID)?
- can an IS NULL condition resolved with a non-clustered index
(T1.PAR_EVT_ID IS NULL)?
- can an inequality between 2 columns of the same table be resolved
with a non-clustered index (T1.PAR_EVT_ID != T1.ROW_ID)?
- can an IS NOT NULL condition resolved with a non-clustered index
(T1.OPTY_ID IS NOT NULL)?
....

As you can see, I'm just looking for guidance on the way that the SQL
Server's optimizer can make use of non-clustered indexes, in order to
help my chances of creating one which would make the clustered index
scan not necessary.

Daniel

**Erland Sommarskog** · Jul 20 '05, 02:38 AM

Re: Indexing Advice

Daniel Roy (danielroy10jun k@hotmail.com) writes:[color=blue]
> Thanx Erland and Simon for your kind advice. Now that we know that a
> (non-clustered) index could possibly be used first before dealing with
> the joins, anyone could tell me which index to T1 should be created,
> according to the WHERE clause? I agree with you, Erland, when you say
> it will be an uphill battle, but I think it's worth a try. Here are
> some specific questions which stop me currently:
> - can a comparison between 2 columns of the same table be resolved
> with a non-clustered index (T1.PAR_EVT_ID = T1.ROW_ID)?
> - can an IS NULL condition resolved with a non-clustered index
> (T1.PAR_EVT_ID IS NULL)?
> - can an inequality between 2 columns of the same table be resolved
> with a non-clustered index (T1.PAR_EVT_ID != T1.ROW_ID)?
> - can an IS NOT NULL condition resolved with a non-clustered index
> (T1.OPTY_ID IS NOT NULL)?
> ...[/color]

All sorts of conditions can be resolved by the index, this is not the
problem. The problem is whether the index can be selective enough or not.

Consider these two queries:

select * from Northwind..Orde rs WHERE CustomerID like 'N%'
select * from Northwind..Orde rs WHERE CustomerID like 'A%'

If you run them from Query Analyzer after having pressed CTRL-K
to get a tab with the execution plan, you will find that the
first query uses the non-clustered index on CustomerID, while
the second performs a clustered index scan.

Why is this? When SQL Server uses a non-clustered index to retrieve
data, and the query includes columns not present in the index (or
in the clustered index, of which the keys constitute the row pointers
of the non-clustered index) SQL Server must access the data pages.
This is known as "bookmark lookup". That is one access to a page
for each row, so the more rows you find this way, the more likely
you will access the same page more than once. Thus, at some level,
a table scan is cheaper. In the example above there are very few
orders from Customers whose IDs start with N, but there are a bunch
for A.

Now consider this query:

select *
from Northwind..Cust omers
WHERE CompanyName LIKE 'N%'
OR City = 'Berlin'

There is one index on CompanyName and one City, and if you run
the query in QA, you will find that SQL Server uses both indexes.
Again, if you change 'N%' to 'A%', SQL Server instead opts for a
table scan.

Another important thing to consider is the order of the columns
in a multi-columns index. Say that you have a non-clustered index
on the columns (a, b), and you issue the query:

SELECT * FROM tbl WHERE b = 3

Will SQL Server use the index? Maybe. But it would have to scan the
entire index, since the entries with b = 3 are scattered all over
the index. But if SQL Server has statistics that gives an estimate
that this query will only hit 3 of a million rows, SQL Server will
scan the non-clustered index, since scanning the index require
few page reads than scanning the data pages. (Assuming that the
table has a coupld of more columns.)

In these examples, I have used examples of equality. Of course
conditions with != can also be used, but as you may have realized
by now, they are not very likely to be selective enough.

Note that I say selective, this is not a question about actual
facts, but estimates drawn from the statistics SQL Server have
availble. In these examples, input value has been constants.
When the input value comes from a variable or another column,
SQL Server still can use the distribution, but the plan will be
the same no matter the input value, beause the optimizer does not
that value when it builds the plan.

This should give you some idea. It is possible that you can create
one or more non-clustered indexes, to resolve this, but I would
not hold high hopes.

Maybe I should add one more thing. You could lump about all
columns that appear in the WHERE clause in no particular order.
If there are only one or two more small columns in the table, this
is not likely to have any effect. But there are lot of more columns,
or the remaining columns are big, then you can make some gains
this way. Then SQL Server can evaluate the WHERE clause by scanning
the index, and as I noted above, this is cheaper than scanning
the entire table. (Because there are fewer pages to read.)

--
Erland Sommarskog, SQL Server MVP, sommar@algonet. se

Books Online for SQL Server SP3 at

SQL Server 2025 | Microsoft Evaluation Center

http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp

Accelerate your AI application's time to market by harnessing the power of your own data and the built-in AI capabilities of SQL Server 2025, the enterprise database with best-in-class security, performance and availability.

**Erland Sommarskog** · Jul 20 '05, 02:39 AM

Re: Indexing Advice

Daniel Roy (danielroy10jun k@hotmail.com) writes:[color=blue]
> As you can see, I'm just looking for guidance on the way that the SQL
> Server's optimizer can make use of non-clustered indexes, in order to
> help my chances of creating one which would make the clustered index
> scan not necessary.[/color]

And, oh, there is one more thing you definitely should try if you have
not already, and that is DBCC DBREINDEX. It is not going to remove
the clustered index scan, but if the table is heavily fragmented, reindexing
can make that scan a lot cheaper.

--
Erland Sommarskog, SQL Server MVP, sommar@algonet. se

Books Online for SQL Server SP3 at

SQL Server 2025 | Microsoft Evaluation Center

http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp

Accelerate your AI application's time to market by harnessing the power of your own data and the built-in AI capabilities of SQL Server 2025, the enterprise database with best-in-class security, performance and availability.

Indexing Advice

Indexing Advice

Comment

Comment

Comment

Comment

Comment

Comment

Comment