Re: index bloat?
I updated statistics on the loan table and it had no effect.
Was that a typo and you meant to say it sounds like a good plan?
Because I forced this plan with the following query:
select * from (
select * from loan
where loan.deal_no='M L4W1') as x
inner join loan_history as lh
on x.exloan_id=lh. exloan_id
where time_period=196
option (force order, loop join)
and it is so fast it makes me cry - 1 second if cached, 5 seconds if
not cached and a large deal. Here's the plan:
StmtText
|--Nested Loops(Inner Join, OUTER REFERENCES:([loan].[EXLOAN_ID])
WITH PREFETCH)
|--Bookmark Lookup(BOOKMARK :([Bmk1000]),
OBJECT:([LoanPerformance].[dbo].[loan]))
| |--Index
Seek(OBJECT:([LoanPerformance].[dbo].[loan].[IX_loan_deal_no]),
SEEK:([loan].[DEAL_NO]='ML4W1') ORDERED FORWARD)
|--Compute
Scalar(DEFINE:([lh].[YYYYMM]=(floor([lh].[TIME_PERIOD]/12)+1989)*100+[lh].[TIME_PERIOD]%12+1))
|--Clustered Index
Seek(OBJECT:([LoanPerformance].[dbo].[loan_history].[PK_loan_history]
AS [lh]), SEEK:([lh].[TIME_PERIOD]=196 AND
[lh].[EXLOAN_ID]=[loan].[EXLOAN_ID]) ORDERED FORWARD)
I tried this query:
select * from loan
inner join loan_history lh
on loan.exloan_id = lh.exloan_id
where loan.deal_no='M L4W1'
and time_period='19 6'
option (force order)
and it takes a couple minutes if not cached, 5 seconds if cached -
definitely better than a couple hours, but nowhere near my
corrected-nested-loop. Here's the plan:
StmtText
|--Merge Join(Inner Join,
MERGE:([loan].[EXLOAN_ID])=([lh].[EXLOAN_ID]),
RESIDUAL:([lh].[EXLOAN_ID]=[loan].[EXLOAN_ID]))
|--Bookmark Lookup(BOOKMARK :([Bmk1000]),
OBJECT:([LoanPerformance].[dbo].[loan]))
| |--Index
Seek(OBJECT:([LoanPerformance].[dbo].[loan].[IX_loan_deal_no]),
SEEK:([loan].[DEAL_NO]='ML4W1') ORDERED FORWARD)
|--Compute
Scalar(DEFINE:([lh].[YYYYMM]=(floor([lh].[TIME_PERIOD]/12)+1989)*100+[lh].[TIME_PERIOD]%12+1))
|--Clustered Index
Seek(OBJECT:([LoanPerformance].[dbo].[loan_history].[PK_loan_history]
AS [lh]), SEEK:([lh].[TIME_PERIOD]=196) ORDERED FORWARD)
And now for the obvious/hard question: now that I know what the right
execution plan is, how can I get the query optimizer to generate it -
*without the hints*? Because some users of the DB are using report
designing software that will just generate the join query with no
hints, and I don't want them sitting there for hours when they don't
have to. Does the optimizer not choose the merge join because it
requires a bookmark lookup? Does it not choose my nested loop join
because the statistics are wrong somewhere?
Thanks again,
Seth
I updated statistics on the loan table and it had no effect.
Was that a typo and you meant to say it sounds like a good plan?
Because I forced this plan with the following query:
select * from (
select * from loan
where loan.deal_no='M L4W1') as x
inner join loan_history as lh
on x.exloan_id=lh. exloan_id
where time_period=196
option (force order, loop join)
and it is so fast it makes me cry - 1 second if cached, 5 seconds if
not cached and a large deal. Here's the plan:
StmtText
|--Nested Loops(Inner Join, OUTER REFERENCES:([loan].[EXLOAN_ID])
WITH PREFETCH)
|--Bookmark Lookup(BOOKMARK :([Bmk1000]),
OBJECT:([LoanPerformance].[dbo].[loan]))
| |--Index
Seek(OBJECT:([LoanPerformance].[dbo].[loan].[IX_loan_deal_no]),
SEEK:([loan].[DEAL_NO]='ML4W1') ORDERED FORWARD)
|--Compute
Scalar(DEFINE:([lh].[YYYYMM]=(floor([lh].[TIME_PERIOD]/12)+1989)*100+[lh].[TIME_PERIOD]%12+1))
|--Clustered Index
Seek(OBJECT:([LoanPerformance].[dbo].[loan_history].[PK_loan_history]
AS [lh]), SEEK:([lh].[TIME_PERIOD]=196 AND
[lh].[EXLOAN_ID]=[loan].[EXLOAN_ID]) ORDERED FORWARD)
I tried this query:
select * from loan
inner join loan_history lh
on loan.exloan_id = lh.exloan_id
where loan.deal_no='M L4W1'
and time_period='19 6'
option (force order)
and it takes a couple minutes if not cached, 5 seconds if cached -
definitely better than a couple hours, but nowhere near my
corrected-nested-loop. Here's the plan:
StmtText
|--Merge Join(Inner Join,
MERGE:([loan].[EXLOAN_ID])=([lh].[EXLOAN_ID]),
RESIDUAL:([lh].[EXLOAN_ID]=[loan].[EXLOAN_ID]))
|--Bookmark Lookup(BOOKMARK :([Bmk1000]),
OBJECT:([LoanPerformance].[dbo].[loan]))
| |--Index
Seek(OBJECT:([LoanPerformance].[dbo].[loan].[IX_loan_deal_no]),
SEEK:([loan].[DEAL_NO]='ML4W1') ORDERED FORWARD)
|--Compute
Scalar(DEFINE:([lh].[YYYYMM]=(floor([lh].[TIME_PERIOD]/12)+1989)*100+[lh].[TIME_PERIOD]%12+1))
|--Clustered Index
Seek(OBJECT:([LoanPerformance].[dbo].[loan_history].[PK_loan_history]
AS [lh]), SEEK:([lh].[TIME_PERIOD]=196) ORDERED FORWARD)
And now for the obvious/hard question: now that I know what the right
execution plan is, how can I get the query optimizer to generate it -
*without the hints*? Because some users of the DB are using report
designing software that will just generate the join query with no
hints, and I don't want them sitting there for hours when they don't
have to. Does the optimizer not choose the merge join because it
requires a bookmark lookup? Does it not choose my nested loop join
because the statistics are wrong somewhere?
Thanks again,
Seth
Comment