How to speed up query that returns nulls and dates within a range

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • Vikki McCormick
    New Member
    • Aug 2010
    • 46

    How to speed up query that returns nulls and dates within a range

    Okay here is another piece of query optimization for you guys. I have this stored procedure as you know which joins many tables. It's purpose is to be a search for an online application. So they go to the page and they have 2 fields for Age. They want all data between 0 and 10 days old. The table that joins for age has to be a left join because there are sometimes no records but we want to include all records from the base table. So I need to figure out the aged dates like Getdate() & Getdate() - 10. Then I have to say give me all the records that between 0 & 10. That works fine, but if you don't enter an Age you should get all records including those with a Null in the date column. So I have to set a range and then say, "OR IS NULL". Or I have to use a function in the where clause to replace the null - no!!, or I should be able to use a case statement. I am not sure what the best way is to handle this. If I use a case statement it runs in 8 seconds but it does not include the Nulls. If I use an "OR". It more than doubles the run time.

    Please see my example.
    Code:
    Create Table DateTest1
    (ID int,
    name varchar(10),
    datetest datetime
    )
    
    Create Table DateTest2
    (
    ID int,
    Descripttion varchar(35)
    )
    
    Insert into DateTest1  Values (1,'Joe',Getdate() -1) , (2,'Jim',NULL), (3,'Jake',Getdate() -7), (4,'John',NULL), (5,'Jan',Getdate() -10), (6,'Juno',Getdate() -5)
    
    Insert into DateTest2  Values (1,'Descr 1') , (2,'Descr 2'), (3,'Descr 3'), (4,'Descr 4'), (5,'Descr 5'), (6,'Descr 6')
    
    Select a.id, a.name, a.datetest, b.descripttion from DateTest2 b
    Inner Join DateTest1 a on a.id = b.id
    where a.datetest  between GETDATE()- 11 and GETDATE() or 
    		                 a.datetest is null 
    
    --Case statement does not include the Nulls
    --where a.datetest = case  When a.datetest is null then a.datetest
    --                         When a.datetest  between GETDATE()- 11 and 
    --                         GETDATE() then a.datetest end
    
    --Or I can use something like this.
    --a.datetest >= GETDATE()- 11 and a.datetest <= GETDATE() or 
    --a.datetest is null
    Well you get the idea. Should I make another thread?
  • Rabbit
    Recognized Expert MVP
    • Jan 2007
    • 12517

    #2
    Since this is about a specific query, I've moved it to its own thread.

    You can use the the ISNULL() function to change the null values to GETDATE(). If that's too slow because it operates on one of the fields, then try a union all approach.

    On a side note, the case statement doesn't do what you think it does. A null does not = null. For null comparions, you have to use IS NULL. You can't use NULL = NULL.

    Comment

    • Vikki McCormick
      New Member
      • Aug 2010
      • 46

      #3
      Thanks. That doesn't work. I am not sure what's going here. Does CASE exclude nulls in the same way that an IN statement would?

      I feel I am not understanding how this is working internally.

      Okay I am totally struggling with being positive with this thing. It's because I have to use the isNull on the left side of the predicate, but is that a performance disaster?

      Comment

      • Rabbit
        Recognized Expert MVP
        • Jan 2007
        • 12517

        #4
        It's not the CASE statement that's excluding NULLs. It's being excluded because of how you're trying to compare them. You misunderstand what NULL means. NULL doesn't mean that there's an absence of a value. A NULL can be anything. It is unlikely for two NULLs to be equal.
        NULL = NULL will always return FALSE. This is not a true statement. Your CASE statement is basically doing that comparison on NULLs.

        I wouldn't call it a disaster per se. But performance will definitely suffer because of the calculation that needs to be done. Did you try the UNION ALL method? You didn't say whether or not you did.

        Comment

        • ck9663
          Recognized Expert Specialist
          • Jun 2007
          • 2878

          #5
          When you say a user input is optional, that means there's a variable involve. You cannot hard code the getdate() function because it will always calculate the age based on today's date. What if the date that your user want is yesterday? Also, is your date column time sensitive? You also need to consider that.

          Just modify your code to check the variable, not the column. If the variable is NULL return everything.

          Something like...

          Code:
          declare @DateVar datetime
          
          --set @DateVar = getdate()
           
          Select a.id, a.name, a.datetest, b.descripttion,
             DateFrom = dateadd(dd, -10, DATEADD(dd,DATEDIFF(dd,0,@DateVar),0)),
             DateTo = dateadd(ms,-2,DATEADD(dd,DATEDIFF(dd,0,DATEADD(d,+1,@DateVar)),0)) 
          from #DateTest2 b
          left Join #DateTest1 a on a.id = b.id
          where (a.datetest  between dateadd(dd, -10, DATEADD(dd,DATEDIFF(dd,0,@DateVar),0)) and
                                    dateadd(ms,-2,DATEADD(dd,DATEDIFF(dd,0,DATEADD(d,+1,@DateVar)),0)))
                or @DateVar is null
          Just comment out the set if you need to test it for specific date.

          Happy Coding!!!


          ~~ CK

          Comment

          • Vikki McCormick
            New Member
            • Aug 2010
            • 46

            #6
            Ugh.. this works. I lost 6 seconds... grr.. still 20 seconds is better than 13 minutes.

            Code:
            isNull(Requests.ProcessingComplete,'01/01/1900') = case when Requests.ProcessingComplete is Null then '01/01/1900'
            											when Requests.ProcessingComplete between @AgeFromDate and  @AgeToDate then Requests.ProcessingComplete
            											end
            Yes ck9663 it's a variable. I didn't want to post the whole thing, because it's supposed to be proprietary, and because it's huge and to explain all that is going on would take a long time. The sproc basically is a catch all query that allows the user to go to a form and add in different variables to search a ton of information. It has to allow for the value or the null. I need to get all possible rows if there are all nulls being passed from the form.

            Thanks for the Happy Coding reminder. :)

            Comment

            • Rabbit
              Recognized Expert MVP
              • Jan 2007
              • 12517

              #7
              This will probably be faster.
              Code:
              ISNULL(Requests.ProcessingComplete, @AgeFromDate) BETWEEN @AgeFromDate AND @AgeToDate
              I still think the UNION ALL could be faster. You should try that as well.

              Comment

              • Vikki McCormick
                New Member
                • Aug 2010
                • 46

                #8
                I found this example. Does isNull replace the COELESCE function?

                Code:
                SELECT * 
                FROM MyTable 
                WHERE  
                     MyTable.StartDate >= COALESCE(MyTable.StartDate, "1/1/1900")  
                     /* Date selected as earliest plausible constant to avoid min() lookup */ 
                 
                 AND MyTable.EndDate <= COALESCE(MyTable.EndDate, "1/1/3001") 
                     /* Date selected as latest plausible constant to avoid max() lookup */
                Rabbit,

                I tried UNION ALL yesterday, and I am getting the same time frame as using the isNull on the left predicate.

                There are 4 Left Joined tables. Unavoidable if the form is going to work. I wonder if possibly there is a better way to design this type of process. Are there standard good practice designs for online catch all queries?

                Comment

                • Rabbit
                  Recognized Expert MVP
                  • Jan 2007
                  • 12517

                  #9
                  COALESCE is slightly different but very similar to ISNULL. ISNULL is probably a little faster. But you can do the same thing with either function.

                  Comment

                  • Rabbit
                    Recognized Expert MVP
                    • Jan 2007
                    • 12517

                    #10
                    As long as you need all the records from the base table and some of those fields on the joined tables, I'm afraid there's no way around it. The only thing left to do at this point is to reevaluate what you're showing the user and see whether or not they actually need all those tables.

                    I'm assuming you have an index (collated if you don't have one already) on at least the date field?

                    Comment

                    • Vikki McCormick
                      New Member
                      • Aug 2010
                      • 46

                      #11
                      Yes. I indexed. I will reveiw them again though to make absolutely sure I didn't put the order in wrong or something.

                      Comment

                      • Vikki McCormick
                        New Member
                        • Aug 2010
                        • 46

                        #12
                        Rabbit using

                        Code:
                        1.ISNULL(Requests.ProcessingComplete, @AgeFromDate) BETWEEN @AgeFromDate AND @AgeToDate
                        I gained one second. Huzzah!

                        Comment

                        • Rabbit
                          Recognized Expert MVP
                          • Jan 2007
                          • 12517

                          #13
                          I think at this point it's about as optimized as you can get it.

                          You can try adding the WHERE and JOIN fields into the same index as your date field and see if that gains you anything. And also, create indexes on the other tables on the JOIN fields. And if that nets you some speed, try adding in any ORDER BY fields. And if that works, start adding in the SELECT fields. At some point though, the index might become too large and start costing you performance. At which point you should pull it back a little.

                          So do some trial and error on the indices and you might be able to net a few more seconds.

                          If you want to post your full SQL, we can take a look and see if there are any other areas for optimization.
                          Last edited by Rabbit; Mar 16 '12, 05:23 PM.

                          Comment

                          Working...