Searching question....

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • aaj

    Searching question....

    This is a simple question compared to some of the stuff that gets asked (and
    answered) here, but sometimes its easy to over look the simpler things.

    I've been working with databases for a few years now, but have no formal
    training, so some times you just get on and if it works dont worry about it.
    But sometimes I wonder just how it works underneath the skin e.g.

    If I have 100000 records in a complicated view and say each record, amongst
    other stuff, contains a name and an archived flag represented by a bit,
    which are pulled together from individual tables.

    Of the 100000 records say there are 25 that have the bit set to 0 and
    remainder have the bit set to 1 (for instance '0' this may mean data that is
    in the current month and so not archived)

    If I'm looking for the name 'fred' which is I know amongst the 25 (current
    data), I would use some thing like ' select name from view where name =
    'fred' and bit = 0'

    So in my own mind I'm thinking well the sensible thing for the database to
    do is to discard the mass of records where the archieved bit = 1 and then
    only search the remaining 25.

    But what I then think is there has to be a time overhead in disgarding the
    many thousands I'm not interested in. Does sql have to generate the view for
    all the data in the view, process each record in turn, saying I'm not
    interested in this one, or this one etc and then just pulling out the ones
    it is interested in to a second list , then processing these in the second
    list.

    Or does it process the relevant sub tables first i.e finding the matching
    record for fred and the matching ones for bit = 0 and then applying the
    rules for the view afterwards.

    Or ( as I suspect) does it do it a much cleverer way ??

    thanks for any help

    Andy



  • Hugo Kornelis

    #2
    Re: Searching question....

    On Wed, 15 Sep 2004 08:37:54 +0100, aaj wrote:
    [color=blue]
    >This is a simple question compared to some of the stuff that gets asked (and
    >answered) here, but sometimes its easy to over look the simpler things.
    >
    >I've been working with databases for a few years now, but have no formal
    >training, so some times you just get on and if it works dont worry about it.
    >But sometimes I wonder just how it works underneath the skin e.g.
    >
    >If I have 100000 records in a complicated view and say each record, amongst
    >other stuff, contains a name and an archived flag represented by a bit,
    >which are pulled together from individual tables.
    >
    >Of the 100000 records say there are 25 that have the bit set to 0 and
    >remainder have the bit set to 1 (for instance '0' this may mean data that is
    >in the current month and so not archived)
    >
    >If I'm looking for the name 'fred' which is I know amongst the 25 (current
    >data), I would use some thing like ' select name from view where name =
    >'fred' and bit = 0'
    >
    >So in my own mind I'm thinking well the sensible thing for the database to
    >do is to discard the mass of records where the archieved bit = 1 and then
    >only search the remaining 25.
    >
    >But what I then think is there has to be a time overhead in disgarding the
    >many thousands I'm not interested in. Does sql have to generate the view for
    >all the data in the view, process each record in turn, saying I'm not
    >interested in this one, or this one etc and then just pulling out the ones
    >it is interested in to a second list , then processing these in the second
    >list.
    >
    > Or does it process the relevant sub tables first i.e finding the matching
    >record for fred and the matching ones for bit = 0 and then applying the
    >rules for the view afterwards.
    >
    >Or ( as I suspect) does it do it a much cleverer way ??
    >
    >thanks for any help
    >
    >Andy[/color]

    Hi Andy,

    This question is not as simple as it may seem.

    SQL Server has many tricks up it's sleeve to help improve the performance
    of a query. So the optimizer *might* choose (as you put it) "a much
    cleverer way".

    A very important thing to consider here is the use of indexes. If there is
    an index defined for the bit column, it will be considered for use in this
    query. Normally, indexes on bit columns have little use as they'll match
    about 50% of all rows. In your case, the number matched is considerably
    less. If the index statistics reflect that almost all values in your bit
    column are 1 and if thhe text of your SQL clearly shows you only want the
    rows with a value of 0, then the optimizer might decide that using this
    index will prove cheaper than using another index or no index at all.

    If you run with query in Query Analyzer with the option to show the
    execution plan turned on (optien Query / Shoe Execution Plan), you can see
    how the query was carried out. You'll also see that the optimizer will
    sometimes shuffle elements from your view definition(s) and the select
    where the view is used to the point where you wouldn't even recognise your
    own query - though this extensive reshuffling will only occur if you have
    large tables (for small sets of data, the optimizer doesn't keep searching
    for the optimal plan - when the expected execution time is less than the
    time it would take to optimize further, the optimizer stops and decides on
    the best plan so far. Good is good enough).

    Best, Hugo
    --

    (Remove _NO_ and _SPAM_ to get my e-mail address)

    Comment

    • Simon Hayes

      #3
      Re: Searching question....


      "aaj" <a.b@c.com> wrote in message
      news:4147f155$0 $27401$afc38c87 @news.easynet.c o.uk...[color=blue]
      > This is a simple question compared to some of the stuff that gets asked
      > (and
      > answered) here, but sometimes its easy to over look the simpler things.
      >
      > I've been working with databases for a few years now, but have no formal
      > training, so some times you just get on and if it works dont worry about
      > it.
      > But sometimes I wonder just how it works underneath the skin e.g.
      >
      > If I have 100000 records in a complicated view and say each record,
      > amongst
      > other stuff, contains a name and an archived flag represented by a bit,
      > which are pulled together from individual tables.
      >
      > Of the 100000 records say there are 25 that have the bit set to 0 and
      > remainder have the bit set to 1 (for instance '0' this may mean data that
      > is
      > in the current month and so not archived)
      >
      > If I'm looking for the name 'fred' which is I know amongst the 25 (current
      > data), I would use some thing like ' select name from view where name =
      > 'fred' and bit = 0'
      >
      > So in my own mind I'm thinking well the sensible thing for the database to
      > do is to discard the mass of records where the archieved bit = 1 and then
      > only search the remaining 25.
      >
      > But what I then think is there has to be a time overhead in disgarding the
      > many thousands I'm not interested in. Does sql have to generate the view
      > for
      > all the data in the view, process each record in turn, saying I'm not
      > interested in this one, or this one etc and then just pulling out the ones
      > it is interested in to a second list , then processing these in the second
      > list.
      >
      > Or does it process the relevant sub tables first i.e finding the matching
      > record for fred and the matching ones for bit = 0 and then applying the
      > rules for the view afterwards.
      >
      > Or ( as I suspect) does it do it a much cleverer way ??
      >
      > thanks for any help
      >
      > Andy
      >
      >[/color]

      Assuming your view is not indexed, then the view name is replaced with the
      view definition before the query is executed, so MSSQL is only looking at
      the base tables. The easiest way to see this is to look at the query plan in
      Query Analyzer, which will show you exactly how MSSQL processes your query.
      Exactly how the query is optimized and executed depends on the indexes and
      statistics which MSSQL can use to find a good plan. Since the optimizer is
      cost-based, it will not always find a 'perfect' plan, as the time required
      to find the plan might be greater than the time required to execute the
      query with a less than optimal plan.

      If the view is indexed, then MSSQL can read the data directly from the view,
      without going to the base tables. This can make queries much faster, but any
      modifications to data in the underlying tables will be much slower, so it's
      probably best to use indexed views mainly for reporting, where the data is
      usually static.

      I'd suggest you get a copy of "Inside SQL Server 2000" by Kalen Delaney if
      you're interested in MSSQL internals.

      Simon


      Comment

      • aaj

        #4
        Re: Searching question....

        Thanks for the info guys

        I had a tinker with the analyser and it all looks extremely interesting.

        Andy

        "aaj" <a.b@c.com> wrote in message
        news:4147f155$0 $27401$afc38c87 @news.easynet.c o.uk...[color=blue]
        > This is a simple question compared to some of the stuff that gets asked[/color]
        (and[color=blue]
        > answered) here, but sometimes its easy to over look the simpler things.
        >
        > I've been working with databases for a few years now, but have no formal
        > training, so some times you just get on and if it works dont worry about[/color]
        it.[color=blue]
        > But sometimes I wonder just how it works underneath the skin e.g.
        >
        > If I have 100000 records in a complicated view and say each record,[/color]
        amongst[color=blue]
        > other stuff, contains a name and an archived flag represented by a bit,
        > which are pulled together from individual tables.
        >
        > Of the 100000 records say there are 25 that have the bit set to 0 and
        > remainder have the bit set to 1 (for instance '0' this may mean data that[/color]
        is[color=blue]
        > in the current month and so not archived)
        >
        > If I'm looking for the name 'fred' which is I know amongst the 25 (current
        > data), I would use some thing like ' select name from view where name =
        > 'fred' and bit = 0'
        >
        > So in my own mind I'm thinking well the sensible thing for the database to
        > do is to discard the mass of records where the archieved bit = 1 and then
        > only search the remaining 25.
        >
        > But what I then think is there has to be a time overhead in disgarding the
        > many thousands I'm not interested in. Does sql have to generate the view[/color]
        for[color=blue]
        > all the data in the view, process each record in turn, saying I'm not
        > interested in this one, or this one etc and then just pulling out the ones
        > it is interested in to a second list , then processing these in the second
        > list.
        >
        > Or does it process the relevant sub tables first i.e finding the matching
        > record for fred and the matching ones for bit = 0 and then applying the
        > rules for the view afterwards.
        >
        > Or ( as I suspect) does it do it a much cleverer way ??
        >
        > thanks for any help
        >
        > Andy
        >
        >
        >[/color]


        Comment

        Working...