Performance Question: Cursor-based vs Set-based

  • klalonde
    • Apr 2006


    I was researching opinions on using cursors in stored procedures and found a thread http://www.thescripts.com/forum/thread143091.html discussing (generally) why not to use them.

    I took a cursor-based solution and took a stab at writing the equivalent set-based procedure. I then timed each process using Query Profiler by executing each 20 consecutive times in Query Analyzer (i.e. EXECUTE sp_CalculateRewards and EXECUTE sp_CalculateRewards2). With respect to data size and row counts, of the approximately 12,000 rows in the customer table, only 85 records meet the criteria for processing.
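    Roughly, the timing runs looked like the loop below (just a sketch of the shape of the test; the actual durations came from Profiler, since GETDATE()-based timing is only accurate to a few milliseconds):

    ```sql
    -- Rough shape of the 20-iteration timing run in Query Analyzer.
    -- Procedure names as above; PRINT output is per-execution elapsed ms.
    DECLARE @i INT, @t0 DATETIME
    SET @i = 0
    WHILE @i < 20
    BEGIN
        SET @t0 = GETDATE()
        EXECUTE sp_CalculateRewards      -- cursor-based version
        PRINT DATEDIFF(ms, @t0, GETDATE())

        SET @t0 = GETDATE()
        EXECUTE sp_CalculateRewards2     -- set-based version
        PRINT DATEDIFF(ms, @t0, GETDATE())

        SET @i = @i + 1
    END
    ```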

    Perhaps my set-based solution is way off, but the cursor-based solution consistently ran in the 500ms range (avg 525ms), while my set-based one ran (progressively better) in cycles of 4000ms, 2000ms, 600ms, 4000ms, etc., repeating this pattern over the 20 consecutive executions.

    My two questions are these: First, does my set-based approach seem reasonably correct, or is there a better way? Second, has anyone seen the same anomaly with respect to the timings, and is there a reason for it?

    Thanks for your input!

    The cursor-based solution: delete all records from the reporting table...open a cursor over those customers from the customer table that have an enrollment date...calculate sales for the 365 days prior to the enrollment date, and then year-to-date sales since the enrollment date...build an INSERT statement and EXECUTE it, inserting the customer_id, customer_name, and respective sales figures back into the reporting table...and continue for each customer_id in the cursor resultset.
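    A sketch of that cursor loop, for comparison (table and column names are assumed from the description above, and the dynamic-SQL INSERT is shown as a plain INSERT for clarity, so details may differ from the real procedure):

    ```sql
    -- Hypothetical sketch of the cursor-based procedure described above.
    DECLARE @customer_id INT, @customer_name VARCHAR(100), @cert_date DATETIME
    DECLARE @sales_base MONEY, @ytd_sales MONEY

    DECLARE cust_cur CURSOR FOR
        SELECT customer_id, customer_name, cert_date
        FROM Customer
        WHERE ISDATE(cert_date) = 1

    OPEN cust_cur
    FETCH NEXT FROM cust_cur INTO @customer_id, @customer_name, @cert_date
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- sales for the 365 days prior to enrollment
        SELECT @sales_base = COALESCE(SUM(invoice_amount), 0)
        FROM invoice_view
        WHERE customer_id = @customer_id
          AND invoice_date >= @cert_date - 365
          AND invoice_date <= @cert_date

        -- sales for the 365 days since enrollment
        SELECT @ytd_sales = COALESCE(SUM(invoice_amount), 0)
        FROM invoice_view
        WHERE customer_id = @customer_id
          AND invoice_date >= @cert_date
          AND invoice_date <= @cert_date + 365

        INSERT RewardsTable (customer_id, customer_name, enrollment_date, sales_base, ytd_sales, percent_to_goal)
        VALUES (@customer_id, @customer_name, @cert_date, @sales_base, @ytd_sales, 0)

        FETCH NEXT FROM cust_cur INTO @customer_id, @customer_name, @cert_date
    END
    CLOSE cust_cur
    DEALLOCATE cust_cur
    ```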

    The set-based version:

    DELETE FROM RewardsTable

    INSERT RewardsTable (customer_id, customer_name, enrollment_date, sales_base, ytd_sales, percent_to_goal) SELECT customer_id, customer_name, cert_date, 0, 0, 0 FROM Customer WHERE ISDATE(cert_date) = 1

    SELECT rt.customer_id, CASE WHEN SUM(inv.invoice_amount) IS NULL THEN 0 ELSE SUM(inv.invoice_amount) END [sales_base] INTO #rt_sales_base FROM RewardsTable rt LEFT OUTER JOIN invoice_view inv WITH (NOLOCK) ON rt.customer_id = inv.customer_id AND inv.invoice_date >= rt.enrollment_date - 365 AND inv.invoice_date <= rt.enrollment_date GROUP BY rt.customer_id

    SELECT rt.customer_id, CASE WHEN SUM(inv.invoice_amount) IS NULL THEN 0 ELSE SUM(inv.invoice_amount) END [ytd_sales] INTO #rt_ytd_sales FROM RewardsTable rt LEFT OUTER JOIN invoice_view inv WITH (NOLOCK) ON rt.customer_id = inv.customer_id AND inv.invoice_date >= rt.enrollment_date AND inv.invoice_date <= rt.enrollment_date + 365 GROUP BY rt.customer_id

    UPDATE RewardsTable SET sales_base = tmp.sales_base FROM RewardsTable INNER JOIN #rt_sales_base tmp ON RewardsTable.customer_id = tmp.customer_id

    UPDATE RewardsTable SET ytd_sales = tmp.ytd_sales FROM RewardsTable INNER JOIN #rt_ytd_sales tmp ON RewardsTable.customer_id = tmp.customer_id

    DROP TABLE #rt_sales_base
    DROP TABLE #rt_ytd_sales
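    For what it's worth, the whole sequence above can also be collapsed into a single pass over invoice_view using conditional aggregation, avoiding the temp tables and the two UPDATEs entirely. This is only a sketch under the same table/column assumptions as the code above, not tested against the real schema:

    ```sql
    -- Hypothetical single-pass rewrite: both aggregates in one scan of invoice_view.
    DELETE FROM RewardsTable

    INSERT RewardsTable (customer_id, customer_name, enrollment_date, sales_base, ytd_sales, percent_to_goal)
    SELECT c.customer_id,
           c.customer_name,
           c.cert_date,
           -- 365 days prior to enrollment; CASE yields NULL for rows outside
           -- the window, and SUM ignores NULLs
           COALESCE(SUM(CASE WHEN inv.invoice_date >= c.cert_date - 365
                              AND inv.invoice_date <= c.cert_date
                             THEN inv.invoice_amount END), 0) AS sales_base,
           -- 365 days since enrollment
           COALESCE(SUM(CASE WHEN inv.invoice_date >= c.cert_date
                              AND inv.invoice_date <= c.cert_date + 365
                             THEN inv.invoice_amount END), 0) AS ytd_sales,
           0 AS percent_to_goal
    FROM Customer c
    LEFT OUTER JOIN invoice_view inv
        ON inv.customer_id = c.customer_id
    WHERE ISDATE(c.cert_date) = 1
    GROUP BY c.customer_id, c.customer_name, c.cert_date
    ```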
    Last edited by klalonde; Apr 21 '06, 11:46 PM.
  • klalonde
    • Apr 2006

    #2
    I did some additional testing. I populated an enrollment_date for all customers, so that all ~12,000 customer records would be processed, and noticed that the cursor-based stored procedure was no longer faster than the set-based one. So I now have a third question. At some point the dataset became large enough that the set-based execution plan overtook the cursor-based solution. Is it fair to say that (if you're not worried about running in a non-MS SQL environment) cursors hold a performance advantage on smaller sets of data?
