Performance tuning for a stored procedure

  • serge

    Performance tuning for a stored procedure

    I have an SP that is big, huge, 700-800 lines.

    I am not an expert, but I need to figure out every possible way that
    I can improve the performance of this SP.

    In the next couple of weeks I will work on preparing SQL statements
    that will create the tables, insert sample records and run the SP.
    I hope people will look at my SP and give me hints on how
    I can write the SP better.

    In the meantime, after looking at the SP briefly, my first observations are:

    1- use SET NOCOUNT ON
    2- avoid using UNION statements
    3- use WITH (NOLOCK) with each SELECT statement
    4- avoid using NESTED Select statements
    5- use #temp tables
    6- avoid renaming tables in SELECT statements, for example
    SELECT * FROM tblClients C

    Am I correct to consider the above 6 points as valid causes of
    performance problems?


    I would appreciate any comments/help

    Thank you very much


  • Hugo Kornelis

    #2
    Re: Performance tuning for a stored procedure

    Hi Serge,

    On Thu, 9 Sep 2004 00:13:31 -0400, serge wrote:
    >I have an SP that is big, huge, 700-800 lines.
    >
    >I am not an expert but I need to figure out every possible way that
    >I can improve the performance speed of this SP.
    >
    >In the next couple of weeks I will work on preparing SQL statements
    >that will create the tables, insert sample record and run the SP.
    >I would hope people will look at my SP and give me any hints on how
    >I can better write the SP.
    >
    >In the meantime, after looking at the SP briefly, my first observations are:
    >
    >1- use SET NOCOUNT ON

    This will not affect performance, but it is good practice. Many clients
    choke on the numerous "(n row(s) affected)" messages returned by an SP
    without this setting.
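
    For illustration, here is a minimal sketch of where the setting goes
    (the procedure and table names are made up):

    ```sql
    -- Hypothetical procedure; the point is SET NOCOUNT ON as the first statement
    CREATE PROCEDURE dbo.usp_GetClientOrders
        @ClientID int
    AS
    SET NOCOUNT ON   -- suppress the "(n row(s) affected)" messages

    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE ClientID = @ClientID
    GO
    ```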

    >2- avoid using UNION statements

    Depends. Sometimes, a UNION can be quicker than the alternative. I've seen
    cases where a query with a complicated WHERE clause that was evaluated
    with a table scan was rewritten as two (or more) queries with UNION; each
    of those queries was resolved with an index and the UNION version ran a
    lot quicker.

    You can gain performance by using UNION ALL instead of UNION where
    possible. Sometimes, this can be made possible by extending the WHERE
    clause of one of the queries in the UNION (though you have to be careful
    that the cost of this extension doesn't outweigh the gain of UNION ALL
    over UNION!).

    Run the following examples and check execution plans and statistics on I/O
    and time used for execution:

    USE pubs
    SELECT au_id, au_lname, au_fname
    FROM authors
    WHERE au_id LIKE '2%'
    OR au_lname = 'Ringer'
    GO
    SELECT au_id, au_lname, au_fname
    FROM authors
    WHERE au_id LIKE '2%'
    UNION
    SELECT au_id, au_lname, au_fname
    FROM authors
    WHERE au_lname = 'Ringer'
    GO
    SELECT au_id, au_lname, au_fname
    FROM authors
    WHERE au_id LIKE '2%'
    UNION ALL
    SELECT au_id, au_lname, au_fname
    FROM authors
    WHERE au_lname = 'Ringer'
    AND au_id NOT LIKE '2%'
    GO

    >3- use WITH (NOLOCK) with each SELECT statement

    This can gain you some performance (especially if concurrency is high on
    your database), but you run the risk of getting dirty reads. If that risk
    is acceptable, go ahead. If you don't know what a dirty read is, then
    don't use this option.

    >4- avoid using NESTED Select statements

    Again: depends. If you can safely and easily replace the nested select (or
    subquery, as it's usually called) with other code, do so, then test both
    versions to see if performance actually has improved (it often won't
    improve, as SQL Server's optimizer already uses the same execution plan).

    Similarly, if you can safely replace a correlated subquery with a
    non-correlated one, do so and test both versions.

    But if removing the subquery means that you have to code lots more SQL, it
    might hurt performance instead of improving it. And if you can gain some
    performance by replacing an intuitive subquery with a contrived and hard
    to understand query, then you might want to reconsider if you really value
    performance higher than maintainability. One day, you will find yourself
    staring at that query, wondering what the %$# that ^%#$&%# query is
    supposed to do.
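
    As a sketch of the correlated vs. non-correlated rewrite (tables and
    columns are made up; as always, test both versions):

    ```sql
    -- Correlated subquery: logically re-evaluated for each row of Orders
    SELECT o.OrderID
    FROM dbo.Orders o
    WHERE o.Amount > (SELECT AVG(o2.Amount)
                      FROM dbo.Orders o2
                      WHERE o2.ClientID = o.ClientID)

    -- Non-correlated alternative: compute all averages once, then join
    SELECT o.OrderID
    FROM dbo.Orders o
    JOIN (SELECT ClientID, AvgAmount = AVG(Amount)
          FROM dbo.Orders
          GROUP BY ClientID) a
      ON a.ClientID = o.ClientID
    WHERE o.Amount > a.AvgAmount
    ```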

    >5- use #temp tables

    At the risk of repeating myself: depends. If you find the same subquery
    used over and over in the procedure, it MIGHT help performance if you
    execute that subquery into a #temp table and use that for the rest of the
    execution. It MIGHT also help further to index the temp table. But, again,
    it might also hurt performance - creating the temp table and storing the
    data induces some overhead as well and if you're not careful, you might be
    faced with numerous recompilations of the stored procedure that wouldn't
    be needed without the temp table.
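
    A sketch of the pattern (names are made up): materialize the repeated
    subquery once, optionally index it, and reuse it for the rest of the
    procedure:

    ```sql
    -- Store the result of a subquery that the procedure uses repeatedly
    SELECT ClientID, TotalAmount = SUM(Amount)
    INTO #ClientTotals
    FROM dbo.Orders
    GROUP BY ClientID

    -- Indexing the temp table MIGHT help if it is joined to several times
    CREATE CLUSTERED INDEX ix_ClientTotals ON #ClientTotals (ClientID)
    ```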

    If you use a temp table to break a complicated query down in steps, you
    have a good chance of degrading performance. In one complicated query, the
    optimizer may choose an execution plan that you would never think of but
    that's faster than the obvious way to execute it; if you dictate the steps
    by executing them separately with a temp table for intermediate results,
    you take a lot of options from the optimizer. Of course, there is also the
    consideration of maintainability and readability of your code, so you
    might choose to accept the performance degradation, just so that you will
    understand your code when (not if!!) you (or someone else) have to get
    back to it later.

    >6- avoid renaming tables in SELECT statements, for example SELECT * FROM
    >tblClients C

    I've never heard that using a table alias (as this is called) would hurt
    performance. If you have any evidence of this, please point me to it. I
    would be highly surprised.

    In fact, using an alias is absolutely needed when you use the same table
    more than once in a query and when you use derived tables; in all other
    cases (except for single-table queries or very short table names) I'd also
    heartily recommend using an alias. Do choose a mnemonic alias, not just a
    random one or two letter combination!
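
    For example, a self-join is impossible to write without aliases
    (the table here is hypothetical):

    ```sql
    -- The same table appears twice; aliases tell the two uses apart
    SELECT e.EmpName, m.EmpName AS ManagerName
    FROM dbo.Employees AS e        -- e = employee (mnemonic)
    JOIN dbo.Employees AS m        -- m = manager (mnemonic)
      ON e.ManagerID = m.EmpID
    ```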

    >Am i correct to consider the above 6 points as valid things in terms of
    >causing
    >performance problems?

    See above. And you might also want to take a look at this site:



    Best, Hugo
    --

    (Remove _NO_ and _SPAM_ to get my e-mail address)


    • Gert-Jan Strik

      #3
      Re: Performance tuning for a stored procedure

      As always, Hugo has already given an excellent response. Just some
      additions here...
      > 3- use WITH (NOLOCK) with each SELECT statement

      Don't use this, unless there is a specific performance problem with
      respect to the statement in question. And even then you should only use
      it after you have assessed whether the risks are acceptable.
      > 6- avoid renaming tables in SELECT statements, for example SELECT * FROM
      > tblClients C

      Nonsense.
      > 2- avoid using UNION statements

      This is not a goal. Your goal should be to write statements that are
      easy to read and understand and are effective. It starts there. In
      general, you should not use GROUP BY, DISTINCT or UNION (without ALL) if
      you don't need it. But if you do need it, then go for it, SQL-Server is
      optimized for their use.
      > 4- avoid using NESTED Select statements

      Start with a set orientation in mind. That way you will automatically
      avoid most unnecessary nested selects. The ones that remain are probably
      your best choice.
      > 5- use #temp tables

      Don't use intermediate tables unless you have to. If you have to then
      #temp tables are usually better than creating a permanent table and
      dropping it at the end.

      Gert-Jan
      --
      (Please reply only to the newsgroup)


      • Erland Sommarskog

        #4
        Re: Performance tuning for a stored procedure

        serge (sergea@nospam.ehmail.com) writes:
        > I have an SP that is big, huge, 700-800 lines.

        Hey, you should see some of our procedures. We have one that is 3000 lines
        long!

        But, OK, 700 lines is no small size for a stored procedure.

        Hugo and Gert-Jan have already pointed out weaknesses in your observations,
        but I'd like to make some supplemental comments.
        > 2- avoid using UNION statements

        This is a bad rule. Sometimes UNION may be the wrong solution. Sometimes
        it is the right one. Even if we are talking from a performance perspective.
        > 3- use WITH (NOLOCK) with each SELECT statement

        Only do this if there are unavoidable table scans, and you really
        are experiencing contention problems - *and* if you can accept that the
        results are not consistent.
        > 4- avoid using NESTED Select statements

        Again, not a very good rule. But it depends a little on what you mean.
        Say that you have:

        SELECT a, b, c = (SELECT SUM(c) FROM B WHERE B.col = A.col)
        FROM A
        WHERE col2 BETWEEN 12 AND 19

        My observation is that a rewrite using a derived table often gives better
        performance:

        SELECT A.a, A.b, B.c
        FROM A
        JOIN (SELECT c = SUM(c), col FROM B GROUP BY col) AS B ON A.col = B.col
        WHERE col2 BETWEEN 12 AND 19

        This may look expensive if B is large and there are only a handful of
        values between 12 and 19 in A. But the above is only a logical description
        of the query. The optimizer may recast computation order, as long as the
        result is the same, and often does with a very good result.

        The same thing applies to update queries:

        UPDATE A
        SET c = (SELECT SUM(c) FROM B WHERE A.col = B.col)
        WHERE col2 BETWEEN 12 AND 19

        While the above is ANSI-compliant, this is usually more effective:

        UPDATE A
        SET c = B.c
        FROM A
        JOIN (SELECT c = SUM(c), col FROM B GROUP BY col) AS B ON A.col = B.col
        WHERE col2 BETWEEN 12 AND 19

        You must benchmark all such changes. It may not always be the best thing
        to do.
        > 5- use #temp tables

        This is a very complex topic. Yes, it can sometimes be a good thing to
        save intermediate results in a temp table. But temp tables can also
        cause performance problems, since if you fill up a temp table, SQL Server
        may opt to recompile the procedure. And recompiling a 700 line stored
        procedure can easily take a few seconds. This can be evaded, by using
        table variables instead. Table variables never cause recompilations.
        But then again, you may want those recompilations, because it can slash
        the execution time of the procedure from three hours to two minutes.
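
        Using the same made-up names as in the queries earlier in this post,
        a table-variable version would look like this:

        ```sql
        -- Table variable: never triggers recompilation, but has no statistics,
        -- so the optimizer may make poor choices for large row counts
        DECLARE @totals TABLE (col int PRIMARY KEY, c int)

        INSERT @totals (col, c)
        SELECT col, SUM(c)
        FROM B
        GROUP BY col
        ```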

        Here is a real-life story about a procedure optimization that I did
        some time ago. The code originated from a stored procedure that I had
        written in 1997 for MS SQL 6.0, and used a couple of temp tables on
        which a bunch of operations were performed. At a customer site, this
        procedure took too long. I tried a lot of tricks in the book,
        but few gave any effect.

        Eventually, I replaced most of the temp-table manipulation with a
        50+ line SELECT statement which performed a FULL JOIN of three table
        sources, whereof at least one was a derived table. (And a three-way
        full join requires at least one derived table in itself, because you
        have to full-join two by two.)

        So why did I not write it this way in 1997? Well, at that time SQL Server
        did not have FULL JOIN or derived tables. Also, the requirements of the
        original procedure were different, and more complex than the current ones.
        > 6- avoid renaming tables in SELECT statements, for example SELECT * FROM
        > tblClients C

        No, use aliases. The impact on execution performance is negligible, but
        the impact on developer performance should not be ignored.

        --
        Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

        Books Online for SQL Server SP3 at


        • Dan Guzman

          #5
          Re: Performance tuning for a stored procedure

          >> 1- use SET NOCOUNT ON
          >
          > This will not affect performance, but it is good practice. Many clients
          > choke on the numerous "(n row(s) affected)" messages returned by an SP
          > without this setting.


          From the SQL 2000 BOL:

          <Excerpt href="tsqlref.chm::/ts_set-set_3ed0.htm">
          For stored procedures that contain several statements that do not return
          much actual data, this can provide a significant performance boost because
          network traffic is greatly reduced.
          </Excerpt>

          --
          Hope this helps.

          Dan Guzman
          SQL Server MVP



          • Hugo Kornelis

            #6
            Re: Performance tuning for a stored procedure

            On Fri, 10 Sep 2004 11:30:32 GMT, Dan Guzman wrote:
            >>> 1- use SET NOCOUNT ON
            >>
            >> This will not affect performance, but it is good practice. Many clients
            >> choke on the numerous "(n row(s) affected)" messages returned by an SP
            >> without this setting.
            >
            >
            >From the SQL 2000 BOL:
            >
            ><Excerpt href="tsqlref.chm::/ts_set-set_3ed0.htm">
            >For stored procedures that contain several statements that do not return
            >much actual data, this can provide a significant performance boost because
            >network traffic is greatly reduced.
            ></Excerpt>

            Hi Dan,

            Yes, you're right. My bad. Thanks for pointing out my error!

            Best, Hugo
            --

            (Remove _NO_ and _SPAM_ to get my e-mail address)


            • serge

              #7
              Re: Performance tuning for a stored procedure

              Thank you all for your responses.

              A lot of new things for me in your answers. I'll have to read them many
              times and take the time to fully understand them.

              Thanks again.



              • serge

                #8
                Re: Performance tuning for a stored procedure

                > 3- use WITH (NOLOCK) with each SELECT statement

                This post is for everyone who gave me answers about the WITH (NOLOCK).

                I've been using WITH (NOLOCK) in my own applications for a few years.
                And I do know about Dirty Reads. Basically, the SELECT statement will not
                wait for a transaction holding a lock to finish before getting the data.
                For example, if a transaction is in the middle of modifying a column from
                'USA' to 'Canada', the SELECT statement will fetch me 'USA' even though
                the record is locked and the column's value is in the process of being
                changed to 'Canada'.

                Am I correct with my explanation?

                Now, I have an SP with 300 lines with a lot of tables, joins, left joins
                and one or a few unions.

                If I run this SP with parameters in the SQL Query Analyzer, it takes 1
                second to return 11 rows. However, when the front-end application, which
                is ASP based, calls this SP, it takes 23-25 seconds for the ASP page to
                load!

                If I use WITH (NOLOCK) on all the tables, the ASP page loads in 4-5
                seconds!

                I am planning on understanding this SP and all the other SPs and finding
                all the ways possible to write them better, in order to improve the code
                and get better performance results. This will be a slow process.

                Until then, if my explanation of WITH (NOLOCK) is valid and there
                is only one side effect, then maybe we can temporarily use WITH (NOLOCK)
                for the short term.


                As always, I appreciate any feedback.

                Thank you very much






                • Hugo Kornelis

                  #9
                  Re: Performance tuning for a stored procedure

                  Hi Serge,

                  On Tue, 14 Sep 2004 00:23:04 -0400, serge wrote:
                  >I've been using WITH (NOLOCK) in my own applications for a few years.
                  >And i do know about Dirty Reads. It's basically the SELECT statement will
                  >not
                  >wait for a transaction with a lock to finish before getting the data. For
                  >example,
                  >if a transaction is at the stage of modifying a column from 'USA' to
                  >'Canada',
                  >the SELECT statement will fetch me USA even though the record
                  >is locked and in the process of having the column's value being changed to
                  >Canada.
                  >
                  >Am i correct with my explanation?

                  Only partially. It depends on how "far" the processing of the other
                  transaction has gone. If the update has already been performed but the
                  transaction is not yet finished, the SELECT statement with the nolock
                  hint will return the value 'Canada'.

                  This can cause quite unexpected side effects when the other transaction
                  later has to be rolled back - you'll have read a value that logically
                  never even existed in the database. It might even be that the rollback
                  occurred BECAUSE the value 'Canada' violates a business rule - it's
                  inserted in the table first, then the trigger starts that checks the
                  business rules and a rollback is initiated if the business rules were
                  violated. Normal locking behaviour ensures that nobody ever sees the
                  "illegal" value 'Canada', as this row is locked until the transaction
                  finishes; performing dirty read means that you run the risk of returning
                  this "illegal" value.
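
                  A sketch of the scenario with two sessions (table and column
                  names are illustrative):

                  ```sql
                  -- Session 1: change the value inside a transaction that will be rolled back
                  BEGIN TRAN
                  UPDATE dbo.Clients SET Country = 'Canada' WHERE ClientID = 1
                  -- ... transaction still open ...

                  -- Session 2, meanwhile: the dirty read sees the uncommitted 'Canada'
                  SELECT Country FROM dbo.Clients WITH (NOLOCK) WHERE ClientID = 1

                  -- Session 1: after the rollback, 'Canada' never officially existed
                  ROLLBACK TRAN
                  ```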

                  >If i run this SP with parameters in the SQL Query Analyzer, it takes 1
                  >second
                  >to return 11 rows. However, the front-end application which is ASP based,
                  >when calls this SP, it takes 23-25 seconds for the ASP page to load!
                  >
                  >If i use WITH (NOLOCK) on all the tables, the ASP page loads in 4-5
                  >seconds!

                  If the SP executes in 1 second from QA, then you should be able to see
                  comparable performance from any other client. I suspect that this
                  difference is caused by something in the ASP code. As I have never used
                  ASP myself, I'll leave it to others to comment on this.

                  Your solution to use dirty reads looks more like a workaround than a
                  fix. I hope someone more ASP-savvy than me can help you find the real
                  cause of the delay.

                  And I suggest you think very long and very hard about what harm might be
                  caused if your SP reads and uses data changed by an unfinished transaction
                  that might even be rolled back.

                  Best, Hugo
                  --

                  (Remove _NO_ and _SPAM_ to get my e-mail address)


                  • serge

                    #10
                    Re: Performance tuning for a stored procedure

                    Hi Hugo,

                    Your reply was very informative for me.

                    I liked your explanation of the "illegal" value. It's true, I didn't
                    think about Dirty Reads the way you explained them, where the value is
                    actually 'Canada' but hasn't yet been finalized and committed.

                    As for ASP having a problem: yes, the ASP page has a serious problem.
                    When I run the Profiler, a bunch of SPs get executed THREE straight
                    times!!! This I find very bad and I'll have to investigate the repeats
                    (I guess the page is being refreshed 3 times, thus triggering all the
                    SPs to be executed three times).

                    And yes I am considering WITH (NOLOCK) as a workaround for now and
                    not a fix.

                    I'll keep investigating.

                    Thank you very much Hugo.




                    • Hugo Kornelis

                      #11
                      Re: Performance tuning for a stored procedure

                      On Tue, 14 Sep 2004 08:02:05 -0400, serge wrote:
                      >As far ASP having a problem, yes it has a serious problem the ASP page.
                      (snip)
                      >I'll keep investigating.

                      Hi Serge,

                      You might consider looking in the microsoft.public hierarchy of
                      newsgroups. There are lots of them with "asp" somewhere in the name, so
                      there's a good chance you'll find an answer to this in one of those
                      groups. I know too little (read: nothing <g>) about asp to recognise which
                      group might be the best suited.

                      Best, Hugo
                      --

                      (Remove _NO_ and _SPAM_ to get my e-mail address)


                      • Erland Sommarskog

                        #12
                        Re: Performance tuning for a stored procedure

                        serge (sergea@nospam.ehmail.com) writes:
                        > If i run this SP with parameters in the SQL Query Analyzer, it takes 1
                        > second to return 11 rows. However, the front-end application which is
                        > ASP based, when calls this SP, it takes 23-25 seconds for the ASP page
                        > to load!

                        If the ASP code calls it three times, there is something to be fixed.

                        But exactly how was it run? Was any of the calls preceded by
                        SET FMTONLY ON? FMTONLY is a sort of NOEXEC thing; SQL Server only
                        returns information about result sets.

                        Assuming that your ASP code uses ADO, make sure that you use
                        CommandType = adStoredProcedure. Do not use adCommandText with an
                        EXEC statement or an ODBC call syntax.

                        If there is an indexed view or indexed computed column involved
                        somewhere, this can explain the difference between ASP and QA. In QA,
                        the setting ARITHABORT is ON by default, but it's off when you use
                        ADO. Thus, issuing SET ARITHABORT ON (or setting it as the default for
                        the database with ALTER DATABASE, or for the server with sp_configure)
                        could give some effect.
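
                        One way to test this theory (the procedure name here is a
                        placeholder) is to mimic the ADO default in Query Analyzer
                        before the call:

                        ```sql
                        SET ARITHABORT OFF   -- QA defaults to ON; ADO connections default to OFF
                        EXEC dbo.YourBigProc @SomeParam = 1
                        ```

                        If the call is suddenly as slow as from ASP, the setting is
                        the culprit.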

                        Also, SET NOCOUNT ON is good.
                        > If i use WITH (NOLOCK) on all the tables, the ASP page loads in 4-5
                        > seconds!

                        This could indicate that there is some blocking. Are you alone on the
                        server when it takes 25 seconds to run? Unless there are concurrency
                        issues, I find it difficult to believe that (NOLOCK) has that drastic
                        effect. There is some cost for locks, but I find it difficult to believe
                        that it is that huge.


                        --
                        Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

                        Books Online for SQL Server SP3 at


                        • serge

                          #13
                          Re: Performance tuning for a stored procedure

                          A lot of good information you've provided me here. I will have to
                          look at the ASP code and investigate all your points.

                          To answer your question: yes, I am alone on the server when it takes
                          25 seconds to run. Checking SQL Profiler, I see 3 sets of the same
                          SPs being called, and the main SP that is slow takes about 6.5
                          seconds to run; multiply that by 3, that's 20 seconds. The ASP page
                          is also calling other SPs, totaling 23-25 seconds.

                          I'll have to investigate much further, go through your points and
                          the other points I've received, and find out what I can fix/improve.

                          Thanks again!



