Saturday, February 25, 2012

Question : How SQL chooses an index for a process

I have just tested 3 queries using QA. The complete
test information :

--
CREATE TABLE agls1
(fyear char(4) NULL ,
fprefix char(3) NULL ,
fvcno char(20) NULL ,
fdate datetime NULL ,
fid char(15) NULL ,
fiddate datetime NULL ,
fdesc char(60) NULL ,
facc char(12) NULL ,
fval decimal(18, 2) NULL ,
fcrc char(5) NULL ,
fsub char(1) NULL ,
fmaster char(9) NULL ,
fcode char(15) NULL )
CREATE CLUSTERED INDEX a ON agls1 (fyear, fprefix,
fdate, fvcno)
CREATE INDEX b ON agls1 (fyear, facc, fdate,
fprefix, fvcno)
CREATE INDEX c ON agls1(fyear,fsub, fmaster, fcode)

insert into agls1
( fyear,fsub,fmaster,fcode,fprefix,fdate,fvcno,facc )
values
( '2004','A','B','123','inv','20040101','01','111' )

--query-1
select * from agls1
where fyear = '2004' and fprefix = 'inv' and
fdate = '20040101' and fvcno = '01'

--query-2
select * from agls1
where fyear = '2004' and facc = '111' and
fdate = '20040101' and fprefix = 'inv' and fvcno = '01'

--query-3
select * from agls1
where fyear = '2004' and fsub = 'A' and fmaster = 'B'
and fcode = '123'
--

The execution plan shows that the index a
is always used for all 3 select queries above.

I have 3 questions for you :
a. Why does SQL not choose index b for query-2 ?
Why does SQL not choose index c for query-3 ?
b. Is it right that query-2 does not benefit from
index b and query-3 does not benefit from index c ?
c. How does SQL choose an index for a process ?

Could anyone help me

Thanks in advance

Anita Hery

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!Anita (anonymous@.devdex.com) writes:
> The execution plan shows that the index a
> is always used for all 3 select queries above.
> I have 3 questions for you :
> a. Why does SQL not choose index b for query-2 ?
> Why does SQL not choose index c for query-3 ?

There is only one row in the table, that makes the test somewhat
meaningless.

But the general is that there is always a tradeoff whether to use a
non-clustered index or not. When SQL Server finds row through a non-
clustered index, it has to go to the data page and get data requested
in the query which is not present in the index. This means that it
can be more expensive to use the index than to scan table, if the
optimizer estimates that the index will find many rows.

> b. Is it right that query-2 does not benefit from
> index b and query-3 does not benefit from index c ?

No, that depends on how the data looks like. Let's first take query/index
c. Say that there is over a million rows with year = 2004. In this case,
without the index, query c would have to scan all those rows in the
clustered index, whereas with the non-clustered index can find the
matching rows faster. But if there are say, 10000 rows that matches
query c, I would execpt SQL Server to use the clustered index.

As for index b, there are situations where this index could help, but
in this case, there must be many duplicates in the clustered index,
so that you actually make the query significantly more precise by adding
that extra column.

> c. How does SQL choose an index for a process ?

SQL Server uses a cost-based optimizer which makes its decisions from
statistics about the table column. Therefore the same query can get
different query plans with different data.

There is material in Books Online you can study. I can also recommend
Kalen Delaney's "Inside SQL Server 2000", which covers this topic
in detail.

--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||Hi Erland,

Thanks a lot for the lucid explanation.
Very helpful.

I will continue testing using minimum amount of rows
to see that SQL Server uses index c for query-3.
The data must easily force SQL Server to use
index c. If you do not mind, could you advice me
how data looks like that I should create.

Anita Hery

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!|||
Anita wrote:

> I have just tested 3 queries using QA. The complete
> test information :
> --
> CREATE TABLE agls1
> (fyear char(4) NULL ,
> fprefix char(3) NULL ,
> fvcno char(20) NULL ,
> fdate datetime NULL ,
> fid char(15) NULL ,
> fiddate datetime NULL ,
> fdesc char(60) NULL ,
> facc char(12) NULL ,
> fval decimal(18, 2) NULL ,
> fcrc char(5) NULL ,
> fsub char(1) NULL ,
> fmaster char(9) NULL ,
> fcode char(15) NULL )
> CREATE CLUSTERED INDEX a ON agls1 (fyear, fprefix,
> fdate, fvcno)
> CREATE INDEX b ON agls1 (fyear, facc, fdate,
> fprefix, fvcno)
> CREATE INDEX c ON agls1(fyear,fsub, fmaster, fcode)
> insert into agls1
> ( fyear,fsub,fmaster,fcode,fprefix,fdate,fvcno,facc )
> values
> ( '2004','A','B','123','inv','20040101','01','111' )
> --query-1
> select * from agls1
> where fyear = '2004' and fprefix = 'inv' and
> fdate = '20040101' and fvcno = '01'
> --query-2
> select * from agls1
> where fyear = '2004' and facc = '111' and
> fdate = '20040101' and fprefix = 'inv' and fvcno = '01'
> --query-3
> select * from agls1
> where fyear = '2004' and fsub = 'A' and fmaster = 'B'
> and fcode = '123'
> --
> The execution plan shows that the index a
> is always used for all 3 select queries above.
> I have 3 questions for you :
> a. Why does SQL not choose index b for query-2 ?
> Why does SQL not choose index c for query-3 ?
> b. Is it right that query-2 does not benefit from
> index b and query-3 does not benefit from index c ?
> c. How does SQL choose an index for a process ?

The last question in particular has lots of book chapters on it.
The fact is that the table is so small that no index can really
help much. A blind table-scan is fastest with a one-row table.
To see more intuitive index use, you should probably test with
a table having thousands of well-distributed rows.
Joe Weinstein at BEA
> Could anyone help me
> Thanks in advance
> Anita Hery
> *** Sent via Developersdex http://www.developersdex.com ***
> Don't just participate in USENET...get rewarded for it!|||Anita (anonymous@.devdex.com) writes:
> I will continue testing using minimum amount of rows
> to see that SQL Server uses index c for query-3.
> The data must easily force SQL Server to use
> index c. If you do not mind, could you advice me
> how data looks like that I should create.

You can always use an index hint to convince SQL Server to use an index:

SELECT * FROM tbl WITH (INDEX = c)

To make it simple you should have fyear = 2004 in all rows you create.
But the values in fsub, fmastser and fcode should vary.

--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||"Erland Sommarskog" <sommar@.algonet.se> wrote in message
news:Xns949BF26CFB825Yazorman@.127.0.0.1...
> Anita (anonymous@.devdex.com) writes:
> > I will continue testing using minimum amount of rows
> > to see that SQL Server uses index c for query-3.
> > The data must easily force SQL Server to use
> > index c. If you do not mind, could you advice me
> > how data looks like that I should create.
> You can always use an index hint to convince SQL Server to use an index:
> SELECT * FROM tbl WITH (INDEX = c)

Just to jump in, you can of course do that. However (and this is to Anita,
not Erland since I know he's aware of this), it's generally a fairly bad
idea to force an index hint, since your data may later change in such a way
to make the index less useful.

I'd recommend finding some of the papers Kalen Delany has written on this
subject as it may help.

> To make it simple you should have fyear = 2004 in all rows you create.
> But the values in fsub, fmastser and fcode should vary.
>
> --
> Erland Sommarskog, SQL Server MVP, sommar@.algonet.se
> Books Online for SQL Server SP3 at
> http://www.microsoft.com/sql/techin.../2000/books.asp|||I have inserted 810 rows by following Erland's advice.
With these rows, SQL Server uses index c when executes
query :
select * from agls1
where fyear = '2004' and fsub = 'A' and fmaster = 'B'
and fcode = '123'

Thanks again to all of you that sent the replies

Anita Hery

Note :
Below is my insert test :

declare @.sub as int, @.master as int, @.code as int
set @.sub = 0 --max 15
set @.master = 0 --max 6
set @.code = 0 --max 15 digit

lsub:
set @.sub = @.sub + 1
set @.master = 0
set @.code = 0
lmaster:
set @.master = @.master + 1
set @.code = 0
lcode:
set @.code = @.code + 1
insert into agls1
(fyear,fsub,fmaster,fcode,fprefix,fdate,fvcno,facc )
values
('2004',str(@.sub,1),str(@.master,9),
str(@.code,15),'inv','20040101','01','111')

if @.code < 15 goto lcode
if @.master < 6 goto lmaster
if @.sub < 9 goto lsub

select * from agls1
where fyear = '2004' and fsub = 'A' and
fmaster = 'B' and fcode = '123'

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!

No comments:

Post a Comment