extended stored proc programming

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Bonj

    extended stored proc programming

    Hi
    I'm looking to write an extended stored procedure, the job of which will
    basically to read data from one table, process it using a COM object, and
    write (insert) rows out to another table.
    I don't really know where to start with finding the functions necessary to
    read from an existing table. Can anyone be so kind as to point me in the
    right direction?

    Thanks very much
  • Adam Machanic

    #2
    Re: extended stored proc programming


    I'm curious as to why you want to retrieve and insert the rows within the
    extended proc rather than retrieve the rows, pass them into the extended
    proc, then pass the data back for re-insertion? IMO, that would be a more
    streamlined approach than having the proc do all of that work itself...


    "Bonj" <Bonj@discussio ns.microsoft.co m> wrote in message
    news:728C9170-206E-4575-A905-45C84F46F0CD@mi crosoft.com...[color=blue]
    > Hi
    > I'm looking to write an extended stored procedure, the job of which will
    > basically to read data from one table, process it using a COM object, and
    > write (insert) rows out to another table.
    > I don't really know where to start with finding the functions necessary to
    > read from an existing table. Can anyone be so kind as to point me in the
    > right direction?
    >
    > Thanks very much[/color]


    Comment

    • Toby Herring

      #3
      Re: extended stored proc programming

      Rows inserted by an XP can be inserted outside the transaction context, so
      that if the transaction is rolled back, the XP-inserted rows remain. Good
      for auditing/logging/troubleshooting .

      --
      Toby Herring
      MCDBA, MCSD, MCP+SB
      Need a Second Life?



      "Adam Machanic" <amachanic@hotm ail._removetoem ail_.com> wrote in message
      news:u%23C5e7Ur EHA.3172@TK2MSF TNGP10.phx.gbl. ..[color=blue]
      >
      > I'm curious as to why you want to retrieve and insert the rows within the
      > extended proc rather than retrieve the rows, pass them into the extended
      > proc, then pass the data back for re-insertion? IMO, that would be a more
      > streamlined approach than having the proc do all of that work itself...
      >
      >
      > "Bonj" <Bonj@discussio ns.microsoft.co m> wrote in message
      > news:728C9170-206E-4575-A905-45C84F46F0CD@mi crosoft.com...[color=green]
      >> Hi
      >> I'm looking to write an extended stored procedure, the job of which will
      >> basically to read data from one table, process it using a COM object, and
      >> write (insert) rows out to another table.
      >> I don't really know where to start with finding the functions necessary
      >> to
      >> read from an existing table. Can anyone be so kind as to point me in the
      >> right direction?
      >>
      >> Thanks very much[/color]
      >
      >[/color]


      Comment

      • Bonj

        #4
        Re: extended stored proc programming

        I don't know, I'm curious as to learn what the best way would be.
        For me, 'best' == fastest.
        I'm not bothered in the slightest about rollbacks or auditing.

        However, I am still open to suggestions about what the best method might be.
        My intial perception is that reading the data in the xp will be faster than
        reading it via SQL and passing it into the xp for processing and then
        inserting when back in SQL, as it's not having to constantly cross (marshal?)
        between the SQL script and the xp. I intially tried this approach, but
        instead of an xp, I wrote a load of SQL code that instantiates the COM object
        that does the processing via sp_OAcreate, sp_OAmethod, etc., but it was
        horrendously slow. Any ideas why this might be, and why an xp might be faster?

        I just get the impression that if it's just the same block of C++ code that
        does the entire lot of the work, then it will be faster.

        If you still think your approach will be faster, please tell me how to 'pass
        a row' to the xp - but please someone tell me how to read data using the xp
        so I can test the methods against each other for speed.



        "Adam Machanic" wrote:
        [color=blue]
        >
        > I'm curious as to why you want to retrieve and insert the rows within the
        > extended proc rather than retrieve the rows, pass them into the extended
        > proc, then pass the data back for re-insertion? IMO, that would be a more
        > streamlined approach than having the proc do all of that work itself...
        >
        >
        > "Bonj" <Bonj@discussio ns.microsoft.co m> wrote in message
        > news:728C9170-206E-4575-A905-45C84F46F0CD@mi crosoft.com...[color=green]
        > > Hi
        > > I'm looking to write an extended stored procedure, the job of which will
        > > basically to read data from one table, process it using a COM object, and
        > > write (insert) rows out to another table.
        > > I don't really know where to start with finding the functions necessary to
        > > read from an existing table. Can anyone be so kind as to point me in the
        > > right direction?
        > >
        > > Thanks very much[/color]
        >
        >
        >[/color]

        Comment

        • Adam Machanic

          #5
          Re: extended stored proc programming


          "Bonj" <Bonj@discussio ns.microsoft.co m> wrote in message
          news:27A5203C-A57B-469F-AE68-15194762251C@mi crosoft.com...[color=blue]
          >
          > However, I am still open to suggestions about what the best method might[/color]
          be.[color=blue]
          > My intial perception is that reading the data in the xp will be faster[/color]
          than[color=blue]
          > reading it via SQL and passing it into the xp for processing and then[/color]

          I guess, like anything else, it depends. Is it faster to cursor through
          the data in C++ or T-SQL? How expensive is each initialization of the XP?
          Will it be more expensive for the XP to have to make new database
          connections to read/write, rather than just using the connection that
          spawned the XP?
          [color=blue]
          > instead of an xp, I wrote a load of SQL code that instantiates the COM[/color]
          object[color=blue]
          > that does the processing via sp_OAcreate, sp_OAmethod, etc., but it was
          > horrendously slow. Any ideas why this might be, and why an xp might be[/color]
          faster?

          There is, IMO, no reason to believe that an XP would be that much
          faster. Did you profile your COM objects to figure out where the
          performance drain was?
          [color=blue]
          > If you still think your approach will be faster, please tell me how to[/color]
          'pass[color=blue]
          > a row' to the xp - but please someone tell me how to read data using the[/color]
          xp[color=blue]
          > so I can test the methods against each other for speed.[/color]

          I apologize if I implied that you can pass a row as-is -- you can't.
          You can only pass scalar datatypes. Reading data using an XP is the same as
          reading it from any other code -- set up a connection to the DB and query
          the data.


          Comment

          • Bonj

            #6
            Re: extended stored proc programming

            [color=blue]
            > I guess, like anything else, it depends. Is it faster to cursor through
            > the data in C++ or T-SQL?[/color]

            I'm guessing C++. That's why I want to check it out with the whole lot
            running in C++.
            [color=blue]
            > How expensive is each initialization of the XP?[/color]

            Don't know. Probably average relatively... but the COM object has to be
            called ~78m times, at the most recent count of the output rows. It currently
            takes about 30 hours to do all that lot with a VB program. But that includes
            all the time it takes it to send the data to SQL server. So I'm guessing that
            to keep one connection open with the code that's calling the COM object all
            the way through would be the best.

            I just can't accept that initializing a stored procedure, or initalizing and
            opening a connection, or pretty much initializing *anything* 78 million times
            can be a good thing. That's why I want to do it all within the C++ code of an
            extended proc.

            [color=blue]
            > Will it be more expensive for the XP to have to make new database
            > connections to read/write, rather than just using the connection that
            > spawned the XP?[/color]

            That's what I'm thinking, yes.
            [color=blue]
            >[color=green]
            > > instead of an xp, I wrote a load of SQL code that instantiates the COM[/color]
            > object[color=green]
            > > that does the processing via sp_OAcreate, sp_OAmethod, etc., but it was
            > > horrendously slow. Any ideas why this might be, and why an xp might be[/color]
            > faster?
            >
            > There is, IMO, no reason to believe that an XP would be that much
            > faster. Did you profile your COM objects to figure out where the
            > performance drain was?[/color]

            The COM object isn't under our control. But it is at the very very very
            centre of all the loops. It is the absolute bottleneck. And before you ask,
            no we can't write it in SQL because the company that wrote it have protected
            it with a dongle, so I would think they'll be unlikely to do anything that
            might be tantamount to giving the algorithm away.
            [color=blue]
            > I apologize if I implied that you can pass a row as-is -- you can't.
            > You can only pass scalar datatypes. Reading data using an XP is the same as
            > reading it from any other code -- set up a connection to the DB and query
            > the data.[/color]

            I'm afraid I insist. Can you please tell me how to read a table from within
            an XP, even if just to satisfy my own curiousity regarding testing how fast
            it actually will be?


            [color=blue]
            >
            >
            >[/color]

            Comment

            • Toby Herring

              #7
              Re: extended stored proc programming

              Depends on what version of C++.

              If it's MS C++ 6, then you can simply use the ADO libraries/objects. If
              it's C++.Net, then you can either use ADO with unmanaged C++, or you can
              write mixed managed/unmanaged code, and use ADO.Net libraries for data
              access.

              If you're using another version of C++, then you'll either have to use the
              ADO COM libraries or use whatever data access is built into the version of
              C++ you're using. (You can use the SQL API, but I wouldn't recommend that
              approach. It can be faster, but it's a royal PITA to write & troubleshoot,
              and it's nearly impossible to find decent documentation on its
              implementation. )

              If you were to write a front-end application that needed to access the
              database, and wrote it in C++, how would you access the database? It's the
              same method, no matter whether it's an executable or a dll or an xp.

              --
              Toby Herring
              MCDBA, MCSD, MCP+SB
              Need a Second Life?



              "Bonj" <Bonj@discussio ns.microsoft.co m> wrote in message
              news:762BAA7F-A494-48F0-A8A8-BFAF15844FA2@mi crosoft.com...[color=blue]
              >
              > I'm afraid I insist. Can you please tell me how to read a table from
              > within
              > an XP, even if just to satisfy my own curiousity regarding testing how
              > fast
              > it actually will be?
              >[/color]


              Comment

              • Adam Machanic

                #8
                Re: extended stored proc programming


                "Bonj" <Bonj@discussio ns.microsoft.co m> wrote in message
                news:762BAA7F-A494-48F0-A8A8-BFAF15844FA2@mi crosoft.com...[color=blue]
                >
                > Don't know. Probably average relatively... but the COM object has to be
                > called ~78m times, at the most recent count of the output rows. It[/color]
                currently[color=blue]
                >
                > centre of all the loops. It is the absolute bottleneck. And before you[/color]
                ask,[color=blue]
                > no we can't write it in SQL because the company that wrote it have[/color]
                protected[color=blue]
                > it with a dongle, so I would think they'll be unlikely to do anything that
                > might be tantamount to giving the algorithm away.[/color]

                Okay, I'm going to suggest that if you're going to be looping 78 million
                times, you don't use an XP at all for this! It will be much more efficient
                if you pull the data out to a flat file, manipulate it however you need to,
                then BULK INSERT or BCP the data back in. The time saved by the BULK INSERT
                alone as compared with a row-by-row reinsertion will be gigantic...
                [color=blue]
                > I'm afraid I insist. Can you please tell me how to read a table from[/color]
                within[color=blue]
                > an XP, even if just to satisfy my own curiousity regarding testing how[/color]
                fast[color=blue]
                > it actually will be?[/color]

                As I said, open a connection to the database and query it. Just as you
                would from any client app. I don't know what connection library you're
                using in C++ -- ADO, MFC's ODBC connection library, ATL's OLE DB Consumer
                Templates library, etc -- use that however you normally would. There is no
                "loopback" feature in the XP architecture, if that's what you're looking
                for.


                Comment

                • Bonj

                  #9
                  Re: extended stored proc programming

                  hey now that's a good idea.
                  I might have to learn how to do that.

                  Can I just add a managed class and as long as I do the #import <mscorlib>
                  thing and put, like "__gc class...{}" rather than just "class...{} " then
                  it'll be able to use the managed ADO.NET?

                  Thanks for that!


                  "Toby Herring" wrote:
                  [color=blue]
                  > Depends on what version of C++.
                  >
                  > If it's MS C++ 6, then you can simply use the ADO libraries/objects. If
                  > it's C++.Net, then you can either use ADO with unmanaged C++, or you can
                  > write mixed managed/unmanaged code, and use ADO.Net libraries for data
                  > access.
                  >
                  > If you're using another version of C++, then you'll either have to use the
                  > ADO COM libraries or use whatever data access is built into the version of
                  > C++ you're using. (You can use the SQL API, but I wouldn't recommend that
                  > approach. It can be faster, but it's a royal PITA to write & troubleshoot,
                  > and it's nearly impossible to find decent documentation on its
                  > implementation. )
                  >
                  > If you were to write a front-end application that needed to access the
                  > database, and wrote it in C++, how would you access the database? It's the
                  > same method, no matter whether it's an executable or a dll or an xp.
                  >
                  > --
                  > Toby Herring
                  > MCDBA, MCSD, MCP+SB
                  > Need a Second Life?
                  > http://secondlife.com/ss/?u=03e0e5b3...e80ee40119a65e
                  >
                  >
                  > "Bonj" <Bonj@discussio ns.microsoft.co m> wrote in message
                  > news:762BAA7F-A494-48F0-A8A8-BFAF15844FA2@mi crosoft.com...[color=green]
                  > >
                  > > I'm afraid I insist. Can you please tell me how to read a table from
                  > > within
                  > > an XP, even if just to satisfy my own curiousity regarding testing how
                  > > fast
                  > > it actually will be?
                  > >[/color]
                  >
                  >
                  >[/color]

                  Comment

                  • Bonj

                    #10
                    Re: extended stored proc programming

                    > As I said, open a connection to the database and query it. Just as you[color=blue]
                    > would from any client app. I don't know what connection library you're
                    > using in C++ -- ADO, MFC's ODBC connection library, ATL's OLE DB Consumer
                    > Templates library, etc -- use that however you normally would. There is no
                    > "loopback" feature in the XP architecture, if that's what you're looking
                    > for.[/color]

                    Yes unfortunately that is what I was hoping for. What about "DB-Library for
                    C" - is that fast?

                    Also any suggestions about how to implement a managed class inside this
                    would be helpful!


                    [color=blue]
                    >
                    >
                    >[/color]

                    Comment

                    • Bonj

                      #11
                      Re: extended stored proc programming

                      The reason i don't want to use ADO is because it's COM - the overhead might
                      be small but when multiplied by 70-odd mill it's quite significant.

                      I've just had another idea actually, which is to challenge the company that
                      wrote the COM object to write it as an extended stored proc, and they would
                      know full well they would still be able to protect it with their dongle and
                      yet it could run fast on SQL server.

                      Comment

                      • Adam Machanic

                        #12
                        Re: extended stored proc programming


                        "Bonj" <Bonj@discussio ns.microsoft.co m> wrote in message
                        news:AC973B5E-6558-48C7-B56E-86AA7EAA1FAC@mi crosoft.com...[color=blue]
                        >
                        > Yes unfortunately that is what I was hoping for. What about "DB-Library[/color]
                        for[color=blue]
                        > C" - is that fast?[/color]

                        No, don't touch the DB Library, it's very old and, I believe,
                        deprecated. The fastest, IMO, is the ATL Consumer Templates OLE DB
                        implementation. But it's also a b*tch to work with. Easiest IMO is ADO,
                        followed by the MFC ODBC implementation.


                        Comment

                        • Adam Machanic

                          #13
                          Re: extended stored proc programming


                          "Bonj" <Bonj@discussio ns.microsoft.co m> wrote in message
                          news:E4B0FAE3-C3A5-4B4D-BDE6-97BB0F193826@mi crosoft.com...[color=blue]
                          >
                          > I've just had another idea actually, which is to challenge the company[/color]
                          that[color=blue]
                          > wrote the COM object to write it as an extended stored proc, and they[/color]
                          would[color=blue]
                          > know full well they would still be able to protect it with their dongle[/color]
                          and[color=blue]
                          > yet it could run fast on SQL server.[/color]

                          As I said, an extended stored proc is not going to help you here,
                          performance wise. If you want performance, you're going to have to do this
                          operation in bulk. Any row-by-row solution that requires marshalling of
                          data across processes/components is going to slow you down. If this were 78
                          rows instead of 78 million it might not be a problem... But even just
                          invoking the xp_hello example extended procedure 78 million times would
                          probably take quite a while (I'll leave that test to you :) )


                          Comment

                          • Toby Herring

                            #14
                            Re: extended stored proc programming

                            I'm not really sure how it's done (mixed C++). Anything I write managed I
                            do in C#, and anything I need pure C++ for I use MSC++6.

                            I do know that you can write mixed code in C++.Net, I'm just not sure about
                            all the semantics, or restrictions.

                            --
                            Toby Herring
                            MCDBA, MCSD, MCP+SB
                            Need a Second Life?



                            "Bonj" <Bonj@discussio ns.microsoft.co m> wrote in message
                            news:B5B803E2-5718-46FE-8ADD-3EF816AEA19F@mi crosoft.com...[color=blue]
                            > hey now that's a good idea.
                            > I might have to learn how to do that.
                            >
                            > Can I just add a managed class and as long as I do the #import <mscorlib>
                            > thing and put, like "__gc class...{}" rather than just "class...{} " then
                            > it'll be able to use the managed ADO.NET?
                            >
                            > Thanks for that![/color]


                            Comment

                            • Bonj

                              #15
                              Re: extended stored proc programming

                              [color=blue]
                              > As I said, an extended stored proc is not going to help you here,
                              > performance wise. If you want performance, you're going to have to do this
                              > operation in bulk.[/color]

                              Yes - what I was hoping to do is have an extended stored procedure do the
                              bulk operation, I thought being as it's operating directly within SQL
                              server's own process space, there wouldn't be any cross-process marshalling
                              going on all the time between SQL and COM.

                              [color=blue]
                              > Any row-by-row solution that requires marshalling of
                              > data across processes/components is going to slow you down. If this were 78
                              > rows instead of 78 million it might not be a problem... But even just
                              > invoking the xp_hello example extended procedure 78 million times would
                              > probably take quite a while (I'll leave that test to you :) )[/color]

                              That's exactly why I want an extended proc that will do all the 78m rows
                              from one invocation, called once, reads the rows in one fell swoop and spits
                              them out to a table, by doing "insert outputtable(... .) exec xp_myproc"

                              Anyhow, I've written a test project to practice all the techniques I'll
                              need, and it was a lot easier than I imagined actually. I have used a static
                              library project to contain the managed code, and it seems to work. I'll do
                              speed testing on it, and if it's significantly faster then so be it, but at
                              least I'll be able to try it out.

                              Thanks

                              Comment

                              Working...