Representation for Deleted Entities: difficult question

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Robert Brown

    Representation for Deleted Entities: difficult question

    Our customer (of our ecommerce system) wants to be able to preserve
    deleted entities in the database so that they can do reporting,
    auditing etc.

    The system is quite complex where each end user can belong to multiple
    institutional affiliations (which can purchase on behalf of the user).
    The end user also has a rich trail of past transactions affiliations
    etc. Thus in the schema each user entity is related to many others
    which in turn relate to yet others and so on.

    In the past when a user was deleted all of his complex relationships
    were also deleted in a cascading fashion. But now the customer wants
    us to add a "deleted" flag to each user so that a user is never
    _really_ deleted but instead his "deleted" flag is set to true. The
    system subsequently behaves as if the user did not exist but the
    customer can still do reports on deleted users.

    I pointed out that it is not as simple as that because the user entity
    is related to many, many others so we would have to add this "deleted"
    flag to every relationship and every other entity and thus have
    "deleted" past purchases, "deleted" affiliations - a whole shadow
    schema full of such ghost entities. This would overtime degrade
    performance since now each query in the system has to add a clause:
    "where deleted = 0".

    I assume this is a standard problem since many organizations must have
    this need of preserving deleted records (for legal or other reasons).
    I tried to talk them into creating a simple audit file where all the
    deletions will be recorded in XML but they were not too happy with
    that.

    Is there a more satisfying solution to this than have this "deleted"
    flag?

    Thanks for your help,

    - robert
  • NetComrade

    #2
    Re: Representation for Deleted Entities: difficult question

    You didn't post database you're using.

    In Oracle you could partition the main table (from where all
    'cascaded' is coming from) into deleted/undeleted records. Put a view
    on top of the table just as original table would look like with where
    clause 'undeleted' and the undeleted partition would always be used.
    Only certain versions of oracle allow rows to migrate from partition
    to partition, u can look it up.

    However, you would still suffer from 'performance' issues on the other
    tables, if your data is truly huge.

    Best solution would probably be to separate all deleted/undeleted data
    into separate tables, and built a union view on top of them for
    reporting purposes.

    my2c
    Someone might have a better idea.



    On 13 May 2004 11:32:21 -0700, robertbrown1971 @yahoo.com (Robert
    Brown) wrote:
    [color=blue]
    >Our customer (of our ecommerce system) wants to be able to preserve
    >deleted entities in the database so that they can do reporting,
    >auditing etc.
    >
    >The system is quite complex where each end user can belong to multiple
    >institutiona l affiliations (which can purchase on behalf of the user).
    >The end user also has a rich trail of past transactions affiliations
    >etc. Thus in the schema each user entity is related to many others
    >which in turn relate to yet others and so on.
    >
    >In the past when a user was deleted all of his complex relationships
    >were also deleted in a cascading fashion. But now the customer wants
    >us to add a "deleted" flag to each user so that a user is never
    >_really_ deleted but instead his "deleted" flag is set to true. The
    >system subsequently behaves as if the user did not exist but the
    >customer can still do reports on deleted users.
    >
    >I pointed out that it is not as simple as that because the user entity
    >is related to many, many others so we would have to add this "deleted"
    >flag to every relationship and every other entity and thus have
    >"deleted" past purchases, "deleted" affiliations - a whole shadow
    >schema full of such ghost entities. This would overtime degrade
    >performance since now each query in the system has to add a clause:
    >"where deleted = 0".
    >
    >I assume this is a standard problem since many organizations must have
    >this need of preserving deleted records (for legal or other reasons).
    >I tried to talk them into creating a simple audit file where all the
    >deletions will be recorded in XML but they were not too happy with
    >that.
    >
    >Is there a more satisfying solution to this than have this "deleted"
    >flag?
    >
    >Thanks for your help,
    >
    >- robert[/color]

    ........
    We use Oracle 8.1.7.4 on Solaris 2.7 boxes
    remove NSPAM to email

    Comment

    • Mikito Harakiri

      #3
      Re: Representation for Deleted Entities: difficult question

      "Robert Brown" <robertbrown197 1@yahoo.com> wrote in message
      news:240a4d09.0 405131032.6c2e9 802@posting.goo gle.com...[color=blue]
      > I pointed out that it is not as simple as that because the user entity
      > is related to many, many others so we would have to add this "deleted"
      > flag to every relationship and every other entity and thus have
      > "deleted" past purchases, "deleted" affiliations - a whole shadow
      > schema full of such ghost entities. This would overtime degrade
      > performance since now each query in the system has to add a clause:
      > "where deleted = 0".[/color]

      Blanket statements like this are rarely true.
      [color=blue]
      > I assume this is a standard problem since many organizations must have
      > this need of preserving deleted records (for legal or other reasons).
      > I tried to talk them into creating a simple audit file where all the
      > deletions will be recorded in XML but they were not too happy with
      > that.[/color]

      So you think file is better than DBMS?
      [color=blue]
      > Is there a more satisfying solution to this than have this "deleted"
      > flag?[/color]

      Can I suggest that there is a modelling problem? I can imagine customer
      having multiple timestamp columns, for example

      table customer (
      ...
      DOB DATE,
      married DATE,
      divorced DATE,
      died DATE
      )

      but can't possibly see why you need "is_alive", "is_married " boolean
      columns.


      Comment

      • Robert Brown

        #4
        Re: Representation for Deleted Entities: difficult question

        andreyNSPAM@boo kexchange.net (NetComrade) wrote in message news:<40a3cbf2. 428366208@local host>...
        [color=blue]
        > You didn't post database you're using.[/color]

        Thanks for your answer. This particular customer is using Oracle but
        our software is supported on SQL server as well.

        [color=blue]
        > In Oracle you could partition the main table (from where all
        > 'cascaded' is coming from) into deleted/undeleted records. Put a view
        > on top of the table just as original table would look like with where
        > clause 'undeleted' and the undeleted partition would always be used.
        > Only certain versions of oracle allow rows to migrate from partition
        > to partition, u can look it up.
        >
        > However, you would still suffer from 'performance' issues on the other
        > tables, if your data is truly huge.
        >
        > Best solution would probably be to separate all deleted/undeleted data
        > into separate tables, and built a union view on top of them for
        > reporting purposes.
        >
        > my2c
        > Someone might have a better idea.
        >
        >
        >
        > On 13 May 2004 11:32:21 -0700, robertbrown1971 @yahoo.com (Robert
        > Brown) wrote:
        >[color=green]
        > >Our customer (of our ecommerce system) wants to be able to preserve
        > >deleted entities in the database so that they can do reporting,
        > >auditing etc.
        > >
        > >The system is quite complex where each end user can belong to multiple
        > >institutiona l affiliations (which can purchase on behalf of the user).
        > >The end user also has a rich trail of past transactions affiliations
        > >etc. Thus in the schema each user entity is related to many others
        > >which in turn relate to yet others and so on.
        > >
        > >In the past when a user was deleted all of his complex relationships
        > >were also deleted in a cascading fashion. But now the customer wants
        > >us to add a "deleted" flag to each user so that a user is never
        > >_really_ deleted but instead his "deleted" flag is set to true. The
        > >system subsequently behaves as if the user did not exist but the
        > >customer can still do reports on deleted users.
        > >
        > >I pointed out that it is not as simple as that because the user entity
        > >is related to many, many others so we would have to add this "deleted"
        > >flag to every relationship and every other entity and thus have
        > >"deleted" past purchases, "deleted" affiliations - a whole shadow
        > >schema full of such ghost entities. This would overtime degrade
        > >performance since now each query in the system has to add a clause:
        > >"where deleted = 0".
        > >
        > >I assume this is a standard problem since many organizations must have
        > >this need of preserving deleted records (for legal or other reasons).
        > >I tried to talk them into creating a simple audit file where all the
        > >deletions will be recorded in XML but they were not too happy with
        > >that.
        > >
        > >Is there a more satisfying solution to this than have this "deleted"
        > >flag?
        > >
        > >Thanks for your help,
        > >
        > >- robert[/color]
        >
        > .......
        > We use Oracle 8.1.7.4 on Solaris 2.7 boxes
        > remove NSPAM to email[/color]

        Comment

        • Steve Jorgensen

          #5
          Re: Representation for Deleted Entities: difficult question

          Note that SQL Server 2000 EE does support partitioned views.

          On 14 May 2004 09:14:47 -0700, robertbrown1971 @yahoo.com (Robert Brown) wrote:
          [color=blue]
          >andreyNSPAM@bo okexchange.net (NetComrade) wrote in message news:<40a3cbf2. 428366208@local host>...
          >[color=green]
          >> You didn't post database you're using.[/color]
          >
          >Thanks for your answer. This particular customer is using Oracle but
          >our software is supported on SQL server as well.
          >
          >[color=green]
          >> In Oracle you could partition the main table (from where all
          >> 'cascaded' is coming from) into deleted/undeleted records. Put a view
          >> on top of the table just as original table would look like with where
          >> clause 'undeleted' and the undeleted partition would always be used.
          >> Only certain versions of oracle allow rows to migrate from partition
          >> to partition, u can look it up.
          >>
          >> However, you would still suffer from 'performance' issues on the other
          >> tables, if your data is truly huge.
          >>
          >> Best solution would probably be to separate all deleted/undeleted data
          >> into separate tables, and built a union view on top of them for
          >> reporting purposes.
          >>
          >> my2c
          >> Someone might have a better idea.
          >>
          >>
          >>
          >> On 13 May 2004 11:32:21 -0700, robertbrown1971 @yahoo.com (Robert
          >> Brown) wrote:
          >>[color=darkred]
          >> >Our customer (of our ecommerce system) wants to be able to preserve
          >> >deleted entities in the database so that they can do reporting,
          >> >auditing etc.
          >> >
          >> >The system is quite complex where each end user can belong to multiple
          >> >institutiona l affiliations (which can purchase on behalf of the user).
          >> >The end user also has a rich trail of past transactions affiliations
          >> >etc. Thus in the schema each user entity is related to many others
          >> >which in turn relate to yet others and so on.
          >> >
          >> >In the past when a user was deleted all of his complex relationships
          >> >were also deleted in a cascading fashion. But now the customer wants
          >> >us to add a "deleted" flag to each user so that a user is never
          >> >_really_ deleted but instead his "deleted" flag is set to true. The
          >> >system subsequently behaves as if the user did not exist but the
          >> >customer can still do reports on deleted users.
          >> >
          >> >I pointed out that it is not as simple as that because the user entity
          >> >is related to many, many others so we would have to add this "deleted"
          >> >flag to every relationship and every other entity and thus have
          >> >"deleted" past purchases, "deleted" affiliations - a whole shadow
          >> >schema full of such ghost entities. This would overtime degrade
          >> >performance since now each query in the system has to add a clause:
          >> >"where deleted = 0".
          >> >
          >> >I assume this is a standard problem since many organizations must have
          >> >this need of preserving deleted records (for legal or other reasons).
          >> >I tried to talk them into creating a simple audit file where all the
          >> >deletions will be recorded in XML but they were not too happy with
          >> >that.
          >> >
          >> >Is there a more satisfying solution to this than have this "deleted"
          >> >flag?
          >> >
          >> >Thanks for your help,
          >> >
          >> >- robert[/color]
          >>
          >> .......
          >> We use Oracle 8.1.7.4 on Solaris 2.7 boxes
          >> remove NSPAM to email[/color][/color]

          Comment

          • Mike Nicewarner

            #6
            Re: Representation for Deleted Entities: difficult question

            This solution is not specific to Oracle. Most DBMS support partitions and
            views.

            --
            Mike Nicewarner [TeamSybase]

            mike@nospam!dat amodel.org
            Sybase product enhancement requests:


            "NetComrade " <andreyNSPAM@bo okexchange.net> wrote in message
            news:40a3cbf2.4 28366208@localh ost...[color=blue]
            > You didn't post database you're using.
            >
            > In Oracle you could partition the main table (from where all
            > 'cascaded' is coming from) into deleted/undeleted records. Put a view
            > on top of the table just as original table would look like with where
            > clause 'undeleted' and the undeleted partition would always be used.
            > Only certain versions of oracle allow rows to migrate from partition
            > to partition, u can look it up.
            >
            > However, you would still suffer from 'performance' issues on the other
            > tables, if your data is truly huge.
            >
            > Best solution would probably be to separate all deleted/undeleted data
            > into separate tables, and built a union view on top of them for
            > reporting purposes.
            >
            > my2c
            > Someone might have a better idea.
            >
            >
            >
            > On 13 May 2004 11:32:21 -0700, robertbrown1971 @yahoo.com (Robert
            > Brown) wrote:
            >[color=green]
            > >Our customer (of our ecommerce system) wants to be able to preserve
            > >deleted entities in the database so that they can do reporting,
            > >auditing etc.
            > >
            > >The system is quite complex where each end user can belong to multiple
            > >institutiona l affiliations (which can purchase on behalf of the user).
            > >The end user also has a rich trail of past transactions affiliations
            > >etc. Thus in the schema each user entity is related to many others
            > >which in turn relate to yet others and so on.
            > >
            > >In the past when a user was deleted all of his complex relationships
            > >were also deleted in a cascading fashion. But now the customer wants
            > >us to add a "deleted" flag to each user so that a user is never
            > >_really_ deleted but instead his "deleted" flag is set to true. The
            > >system subsequently behaves as if the user did not exist but the
            > >customer can still do reports on deleted users.
            > >
            > >I pointed out that it is not as simple as that because the user entity
            > >is related to many, many others so we would have to add this "deleted"
            > >flag to every relationship and every other entity and thus have
            > >"deleted" past purchases, "deleted" affiliations - a whole shadow
            > >schema full of such ghost entities. This would overtime degrade
            > >performance since now each query in the system has to add a clause:
            > >"where deleted = 0".
            > >
            > >I assume this is a standard problem since many organizations must have
            > >this need of preserving deleted records (for legal or other reasons).
            > >I tried to talk them into creating a simple audit file where all the
            > >deletions will be recorded in XML but they were not too happy with
            > >that.
            > >
            > >Is there a more satisfying solution to this than have this "deleted"
            > >flag?
            > >
            > >Thanks for your help,
            > >
            > >- robert[/color]
            >
            > .......
            > We use Oracle 8.1.7.4 on Solaris 2.7 boxes
            > remove NSPAM to email[/color]


            Comment

            • Mike Nicewarner

              #7
              Re: Representation for Deleted Entities: difficult question

              As Leandro and Mikito point out, you have flaws in your design.
              First, deleting the user entity is the only thing that is logically deleted,
              but rather than make it an indicator, use a date, as in DELETE_DATE as
              nullable. Non-null entities are to be ignored.
              In addition, all relationships to the user entity should be evaluated to
              determine if they need to be sensitive to the user entity's status. Some
              may, and others may not. This is a business question, not a technical
              question. For instance, if there are invoices and inventory tables linked
              in some way to the user entity, would you really want to *not* display that
              information just because an associated user entity had been deleted?

              I'd really need to see your design and talk to your business to know exactly
              what impact this design change would have on your database.

              --
              Mike Nicewarner [TeamSybase]

              mike@nospam!dat amodel.org
              Sybase product enhancement requests:


              "Robert Brown" <robertbrown197 1@yahoo.com> wrote in message
              news:240a4d09.0 405131032.6c2e9 802@posting.goo gle.com...[color=blue]
              > Our customer (of our ecommerce system) wants to be able to preserve
              > deleted entities in the database so that they can do reporting,
              > auditing etc.
              >
              > The system is quite complex where each end user can belong to multiple
              > institutional affiliations (which can purchase on behalf of the user).
              > The end user also has a rich trail of past transactions affiliations
              > etc. Thus in the schema each user entity is related to many others
              > which in turn relate to yet others and so on.
              >
              > In the past when a user was deleted all of his complex relationships
              > were also deleted in a cascading fashion. But now the customer wants
              > us to add a "deleted" flag to each user so that a user is never
              > _really_ deleted but instead his "deleted" flag is set to true. The
              > system subsequently behaves as if the user did not exist but the
              > customer can still do reports on deleted users.
              >
              > I pointed out that it is not as simple as that because the user entity
              > is related to many, many others so we would have to add this "deleted"
              > flag to every relationship and every other entity and thus have
              > "deleted" past purchases, "deleted" affiliations - a whole shadow
              > schema full of such ghost entities. This would overtime degrade
              > performance since now each query in the system has to add a clause:
              > "where deleted = 0".
              >
              > I assume this is a standard problem since many organizations must have
              > this need of preserving deleted records (for legal or other reasons).
              > I tried to talk them into creating a simple audit file where all the
              > deletions will be recorded in XML but they were not too happy with
              > that.
              >
              > Is there a more satisfying solution to this than have this "deleted"
              > flag?
              >
              > Thanks for your help,
              >
              > - robert[/color]


              Comment

              • ctcgag@hotmail.com

                #8
                Re: Representation for Deleted Entities: difficult question

                robertbrown1971 @yahoo.com (Robert Brown) wrote:[color=blue]
                > Our customer (of our ecommerce system) wants to be able to preserve
                > deleted entities in the database so that they can do reporting,
                > auditing etc.
                >
                > The system is quite complex where each end user can belong to multiple
                > institutional affiliations (which can purchase on behalf of the user).
                > The end user also has a rich trail of past transactions affiliations
                > etc. Thus in the schema each user entity is related to many others
                > which in turn relate to yet others and so on.
                >
                > In the past when a user was deleted all of his complex relationships
                > were also deleted in a cascading fashion.[/color]

                Users are not deleted. They may die, they may be incarcerated, they may
                be fired, but they cannot be deleted. Only data is deleted.
                [color=blue]
                > But now the customer wants
                > us to add a "deleted" flag to each user so that a user is never
                > _really_ deleted but instead his "deleted" flag is set to true. The
                > system subsequently behaves as if the user did not exist but the
                > customer can still do reports on deleted users.
                >
                > I pointed out that it is not as simple as that because the user entity
                > is related to many, many others so we would have to add this "deleted"
                > flag to every relationship and every other entity and thus have
                > "deleted" past purchases,[/color]

                If I am hit by a bus tomorrow, will the office furniture I ordered (but
                which is owned by the company) disappear? Will the patents I've generated
                for the company no longer be valid?
                [color=blue]
                > "deleted" affiliations - a whole shadow
                > schema full of such ghost entities.[/color]

                Why would each of these need it's own deleted flag? If they are being
                joined against the user table, then they can just rely on the user table to
                provide the necessary logic. If they are not being joined against the user
                table, then why would you want them to be excluded based on a condition of
                a user?
                [color=blue]
                > This would overtime degrade
                > performance since now each query in the system has to add a clause:
                > "where deleted = 0".[/color]

                Do they want the reporting to go back indefinately, or would they like
                the data to be "really" deleted after a certain time of "virtual" deletion?

                [color=blue]
                >
                > I assume this is a standard problem since many organizations must have
                > this need of preserving deleted records (for legal or other reasons).[/color]

                It is a standard problem, but there is not a standard solution.

                Xho

                --
                -------------------- http://NewsReader.Com/ --------------------
                Usenet Newsgroup Service $9.95/Month 30GB

                Comment

                • RSMEINER

                  #9
                  Re: Representation for Deleted Entities: difficult question

                  >It is a standard problem, but there is not a standard solution.[color=blue]
                  >
                  >Xho[/color]

                  Put a bit column in there and flag it if this record is "deleted".
                  Or write a trigger and dump the deleted records into an archive
                  table.


                  Randy
                  Discover the latest breaking news in the U.S. and around the world — politics, weather, entertainment, lifestyle, finance, sports and much more.

                  Comment

                  • Erland Sommarskog

                    #10
                    Re: Representation for Deleted Entities: difficult question

                    Robert Brown (robertbrown197 1@yahoo.com) writes:[color=blue]
                    > Our customer (of our ecommerce system) wants to be able to preserve
                    > deleted entities in the database so that they can do reporting,
                    > auditing etc.
                    >
                    > The system is quite complex where each end user can belong to multiple
                    > institutional affiliations (which can purchase on behalf of the user).
                    > The end user also has a rich trail of past transactions affiliations
                    > etc. Thus in the schema each user entity is related to many others
                    > which in turn relate to yet others and so on.
                    >
                    > In the past when a user was deleted all of his complex relationships
                    > were also deleted in a cascading fashion. But now the customer wants
                    > us to add a "deleted" flag to each user so that a user is never
                    > _really_ deleted but instead his "deleted" flag is set to true. The
                    > system subsequently behaves as if the user did not exist but the
                    > customer can still do reports on deleted users.
                    >
                    > I pointed out that it is not as simple as that because the user entity
                    > is related to many, many others so we would have to add this "deleted"
                    > flag to every relationship and every other entity and thus have
                    > "deleted" past purchases, "deleted" affiliations - a whole shadow
                    > schema full of such ghost entities. This would overtime degrade
                    > performance since now each query in the system has to add a clause:
                    > "where deleted = 0".[/color]

                    Indeed, this is quite a big change if this was not in the system from
                    the beginning.

                    In our system, many items are not deletable, because even if an item
                    goes away, there might still be plenty of references to it. So in these
                    tables - accounts, customers, instruments, currencies, to name a few - there
                    is a deregdate column. This column serves the double purpose of telling
                    us if the item is still active, and if it is not, when the entity was
                    deregistered. And, yes it happens that deregistered entities are revived
                    too!

                    A non-trivial issue here is to know when a deregistered item should be
                    included and when it should not. If you are producing a list of last
                    month's tranactions, it obviously should. But if you are populating a
                    list of available products to order, deregistered products should not be
                    included. So when you introduce this concept in your system, you have
                    a lot to write in your functional specification too.

                    Another issue which becomes complicated, is referential integrity.
                    accounts.accres ponsible may refer to the users table, but if the account
                    is active, the user must be too. Currently we do this in triggers, which
                    is a bit complex, and more difficult than foreign-key constraints. One
                    thought I've been playing with is to have tables like active_accounts ,
                    active_users etc. This would not be the partition suggested by others,
                    since active_accounts would only hold the account number, and foreign
                    keys to other deregisterable items. Thus, active_accounts .accresponsible
                    would refer to active_users.us erid. The full data would still be in
                    acconts and users, for both active and deregistered items.

                    I have never considered the performance cost for "AND deregdate IS NULL",
                    but I would suggest that if you need to access that column, you probably
                    already access some column which is not in any index, so there is
                    already a bookmark lookup, so I would not expect any particular penalty.
                    (This applies to MS SQL Server. Not that I really expect Oracle to be
                    different, but I don't know Oracle.)

                    --
                    Erland Sommarskog, SQL Server MVP, sommar@algonet. se

                    Books Online for SQL Server SP3 at
                    SQL Server 2025 redefines what's possible for enterprise data. With developer-first features and integration with analytics and AI models, SQL Server 2025 accelerates AI innovation using the data you already have.

                    Comment

                    Working...