checksum

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Federico Balbi

    checksum

    Hi,
    I was wondering if PGSQL has a function similar to binary_checksum () of
    MS SQL Server 2000. It is pretty handy when it comes to compare rows of
    data instead of having to write long boolean expressions.
    binary_checksum () takes a list of fields and it returns an integer value
    which sumarize the row content.

    Thanks,
    Fed


    ---------------------------(end of broadcast)---------------------------
    TIP 4: Don't 'kill -9' the postmaster

  • David Helgason

    #2
    Re: checksum

    > I was wondering if PGSQL has a function similar to binary_checksum ()[color=blue]
    > of
    > MS SQL Server 2000. It is pretty handy when it comes to compare rows of
    > data instead of having to write long boolean expressions.
    > binary_checksum () takes a list of fields and it returns an integer
    > value
    > which sumarize the row content.[/color]

    On a similar note, I've found myself wanting an extended '=' operator
    meaning
    (a = b or (a is null and b is null))

    same goal of course, for more general comparisons...

    d.
    --
    David Helgason,
    Business Development et al.,
    Over the Edge I/S (http://otee.dk)
    Direct line +45 2620 0663
    Main line +45 3264 5049
    On 26. sep 2004, at 19:58, Federico Balbi wrote:


    ---------------------------(end of broadcast)---------------------------
    TIP 3: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to majordomo@postg resql.org so that your
    message can get through to the mailing list cleanly

    Comment

    • Joshua D. Drake

      #3
      Re: checksum

      David Helgason wrote:
      [color=blue][color=green]
      >> I was wondering if PGSQL has a function similar to
      >> binary_checksum () of
      >> MS SQL Server 2000. It is pretty handy when it comes to compare rows of
      >> data instead of having to write long boolean expressions.
      >> binary_checksum () takes a list of fields and it returns an integer value
      >> which sumarize the row content.[/color]
      >
      >[/color]
      You could use the md5 function.... such as :

      select md5(foo) from bar where baz = 2;

      J



      [color=blue]
      > On a similar note, I've found myself wanting an extended '=' operator
      > meaning
      > (a = b or (a is null and b is null))
      >
      > same goal of course, for more general comparisons...
      >
      > d.[/color]



      --
      Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
      Postgresql support, programming shared hosting and dedicated hosting.
      +1-503-667-4564 - jd@commandpromp t.com - http://www.commandprompt.com
      PostgreSQL Replicator -- production quality replication for PostgreSQL


      ---------------------------(end of broadcast)---------------------------
      TIP 1: subscribe and unsubscribe commands go to majordomo@postg resql.org

      Comment

      • Federico Balbi

        #4
        Re: checksum

        > David Helgason wrote:[color=blue]
        >[color=green][color=darkred]
        > >> I was wondering if PGSQL has a function similar to
        > >> binary_checksum () of
        > >> MS SQL Server 2000. It is pretty handy when it comes to compare rows of
        > >> data instead of having to write long boolean expressions.
        > >> binary_checksum () takes a list of fields and it returns an integer value
        > >> which sumarize the row content.[/color]
        > >
        > >[/color]
        > You could use the md5 function.... such as :
        >
        > select md5(foo) from bar where baz = 2;[/color]

        Looks like md5() takes only a string. I need to pass alist of fields
        instead. I was looking at the documentattion and I think I can write
        soemthing like:

        field1, field2, ..., fieldn = expr1, expr2, ..., exprn

        This way one operator will check all the fields for equality.

        Fed


        ---------------------------(end of broadcast)---------------------------
        TIP 7: don't forget to increase your free space map settings

        Comment

        • Joshua D. Drake

          #5
          Re: checksum

          [color=blue][color=green]
          >>You could use the md5 function.... such as :
          >>
          >>select md5(foo) from bar where baz = 2;
          >>
          >>[/color]
          >
          >Looks like md5() takes only a string. I need to pass alist of fields
          >instead. I was looking at the documentattion and I think I can write
          >soemthing like:
          >
          >field1, field2, ..., fieldn = expr1, expr2, ..., exprn
          >
          >This way one operator will check all the fields for equality.
          >
          >
          >[/color]

          Maybe I am missing what you are saying, but you can md5() the data
          column... So you could do:

          select one,two from foo where md5(one) = 'e4da3b7fbbce23 45d7772b0674a31 8d5';

          or

          select one,two from foo where md5(one) = $1; or whatever.

          for example....

          Sincerely,

          Joshua D. Drake


          [color=blue]
          >Fed
          >
          >
          >---------------------------(end of broadcast)---------------------------
          >TIP 7: don't forget to increase your free space map settings
          >
          >[/color]


          --
          Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
          Postgresql support, programming shared hosting and dedicated hosting.
          +1-503-667-4564 - jd@commandpromp t.com - http://www.commandprompt.com
          PostgreSQL Replicator -- production quality replication for PostgreSQL


          Comment

          • Mike Rylander

            #6
            Re: checksum

            On Sun, 26 Sep 2004 20:16:52 +0200, David Helgason <david@uti.is > wrote:[color=blue][color=green]
            > > I was wondering if PGSQL has a function similar to binary_checksum ()
            > > of
            > > MS SQL Server 2000. It is pretty handy when it comes to compare rows of
            > > data instead of having to write long boolean expressions.
            > > binary_checksum () takes a list of fields and it returns an integer
            > > value
            > > which sumarize the row content.[/color]
            >[/color]

            As noted, you can use the md5(text) function with the || (concat) operator
            [color=blue]
            > On a similar note, I've found myself wanting an extended '=' operator
            > meaning
            > (a = b or (a is null and b is null))
            >[/color]

            Setting 'transform_null _equals' to true in postgresql.conf should do
            what you want.

            --miker
            [color=blue]
            > same goal of course, for more general comparisons...
            >
            > d.
            > --
            > David Helgason,
            > Business Development et al.,
            > Over the Edge I/S (http://otee.dk)
            > Direct line +45 2620 0663
            > Main line +45 3264 5049
            > On 26. sep 2004, at 19:58, Federico Balbi wrote:
            >
            > ---------------------------(end of broadcast)---------------------------
            > TIP 3: if posting/reading through Usenet, please send an appropriate
            > subscribe-nomail command to majordomo@postg resql.org so that your
            > message can get through to the mailing list cleanly
            >[/color]

            ---------------------------(end of broadcast)---------------------------
            TIP 7: don't forget to increase your free space map settings

            Comment

            • Stephan Szabo

              #7
              Null comparisons (was Re: checksum)


              On Mon, 27 Sep 2004, Mike Rylander wrote:
              [color=blue]
              > On Sun, 26 Sep 2004 20:16:52 +0200, David Helgason <david@uti.is > wrote:[/color]
              [color=blue][color=green]
              > > On a similar note, I've found myself wanting an extended '=' operator
              > > meaning
              > > (a = b or (a is null and b is null))
              > >[/color]
              >
              > Setting 'transform_null _equals' to true in postgresql.conf should do
              > what you want.[/color]

              Unfortunately, it probably won't. That only changes the explicit token
              sequence = NULL into an IS NULL, it won't help if you're doing a=b where a
              or b may be NULL.

              The original does appear to be equivalent to "not(a is distinct from b)",
              although I'm not sure that's necessarily easier to use than the above.

              ---------------------------(end of broadcast)---------------------------
              TIP 6: Have you searched our list archives?



              Comment

              • Greg Stark

                #8
                Re: Null comparisons (was Re: checksum)


                Stephan Szabo <sszabo@megazon e.bigpanda.com> writes:
                [color=blue][color=green]
                > > On Sun, 26 Sep 2004 20:16:52 +0200, David Helgason <david@uti.is > wrote:[/color]
                >[color=green][color=darkred]
                > > > On a similar note, I've found myself wanting an extended '=' operator
                > > > meaning
                > > > (a = b or (a is null and b is null))[/color][/color]
                >
                > The original does appear to be equivalent to "not(a is distinct from b)",
                > although I'm not sure that's necessarily easier to use than the above.[/color]

                I often do things like "coalesce(a ,0) = coalesce(b,0)".
                (Or whatever value you know won't appear)

                Though for pretty small values of "often". It always makes me think twice
                about my data model when I find myself doing this. But there are definitely
                still cases where it's useful and as clean as anything else I could think of.

                --
                greg


                ---------------------------(end of broadcast)---------------------------
                TIP 9: the planner will ignore your desire to choose an index scan if your
                joining column's datatypes do not match

                Comment

                • Dean Gibson

                  #9
                  Re: Null comparisons (was Re: checksum)

                  Even simpler: COALESCE( a = b, a IS NULL AND b IS NULL )

                  -- Dean

                  Greg Stark wrote on 2004-09-27 08:17:
                  [color=blue]
                  >Stephan Szabo <sszabo@megazon e.bigpanda.com> writes:
                  >[color=green][color=darkred]
                  > > > On Sun, 26 Sep 2004 20:16:52 +0200, David Helgason <david@uti.is > wrote:[/color]
                  > >[color=darkred]
                  > > > > On a similar note, I've found myself wanting an extended '=' operator
                  > > > > meaning
                  > > > > (a = b or (a is null and b is null))[/color]
                  > >
                  > > The original does appear to be equivalent to "not(a is distinct from b)",
                  > > although I'm not sure that's necessarily easier to use than the above.[/color]
                  >
                  >I often do things like "coalesce(a ,0) = coalesce(b,0)".
                  >(Or whatever value you know won't appear)
                  >
                  >Though for pretty small values of "often". It always makes me think twice
                  >about my data model when I find myself doing this. But there are definitely
                  >still cases where it's useful and as clean as anything else I could think of.
                  >
                  >--
                  >greg
                  >
                  >
                  >---------------------------(end of broadcast)---------------------------
                  >TIP 9: the planner will ignore your desire to choose an index scan if your
                  > joining column's datatypes do not match[/color]


                  ---------------------------(end of broadcast)---------------------------
                  TIP 1: subscribe and unsubscribe commands go to majordomo@postg resql.org

                  Comment

                  • David Helgason

                    #10
                    Re: Null comparisons (was Re: checksum)

                    On 27. sep 2004, at 22:08, Dean Gibson (DB Administrator) wrote:[color=blue]
                    > Greg Stark wrote on 2004-09-27 08:17:[color=green]
                    >> Stephan Szabo <sszabo@megazon e.bigpanda.com> writes:[color=darkred]
                    >> >> On Sun, 26 Sep 2004 20:16:52 +0200, David Helgason <david@uti.is >[/color]
                    >> wrote:[color=darkred]
                    >> >>> On a similar note, I've found myself wanting an extended '='[/color]
                    >> operator[color=darkred]
                    >> >>> meaning
                    >> >>> (a = b or (a is null and b is null))
                    >> >
                    >> > The original does appear to be equivalent to "not(a is distinct[/color]
                    >> from b)",[color=darkred]
                    >> > although I'm not sure that's necessarily easier to use than the[/color]
                    >> above.
                    >>
                    >> I often do things like "coalesce(a ,0) = coalesce(b,0)".
                    >> (Or whatever value you know won't appear)
                    >>[/color]
                    > Even simpler: COALESCE( a = b, a IS NULL AND b IS NULL )[/color]

                    I'm not quite sure what is being accomplished here... My original
                    expression wasn't that bad, just clunky. I'd prefer a === b or (a
                    samevalue b), but the above just complicates matters. Also, a 'set'
                    command outside the expression goes completely against the idea, that
                    certain fields have 'null' as a legal, comparable value, while others
                    do not.

                    Anyway, idle speculation :)

                    d.


                    ---------------------------(end of broadcast)---------------------------
                    TIP 9: the planner will ignore your desire to choose an index scan if your
                    joining column's datatypes do not match

                    Comment

                    • Marco Colombo

                      #11
                      Re: Null comparisons (was Re: checksum)

                      On Mon, 27 Sep 2004, David Helgason wrote:
                      [color=blue]
                      > On 27. sep 2004, at 22:08, Dean Gibson (DB Administrator) wrote:[color=green]
                      >> Greg Stark wrote on 2004-09-27 08:17:[color=darkred]
                      >>> Stephan Szabo <sszabo@megazon e.bigpanda.com> writes:
                      >>> >> On Sun, 26 Sep 2004 20:16:52 +0200, David Helgason <david@uti.is >
                      >>> wrote:
                      >>> >>> On a similar note, I've found myself wanting an extended '=' operator
                      >>> >>> meaning
                      >>> >>> (a = b or (a is null and b is null))
                      >>> >
                      >>> > The original does appear to be equivalent to "not(a is distinct from
                      >>> b)",
                      >>> > although I'm not sure that's necessarily easier to use than the above.
                      >>>
                      >>> I often do things like "coalesce(a ,0) = coalesce(b,0)".
                      >>> (Or whatever value you know won't appear)
                      >>>[/color]
                      >> Even simpler: COALESCE( a = b, a IS NULL AND b IS NULL )[/color]
                      >
                      > I'm not quite sure what is being accomplished here... My original expression
                      > wasn't that bad, just clunky. I'd prefer a === b or (a samevalue b), but the
                      > above just complicates matters. Also, a 'set' command outside the expression
                      > goes completely against the idea, that certain fields have 'null' as a legal,
                      > comparable value, while others do not.
                      >
                      > Anyway, idle speculation :)
                      >
                      > d.[/color]

                      (a = b or (a is null and b is null))

                      that raises a flag for me. It seems that NULL is used as a special value,
                      which is not. NULL just means 'unknown', nothing more, nothing less.
                      That's why any boolean expression involving a NULL is NULL.

                      'unknown' when compared to anything else just gives 'unknown'.
                      It means: "I can't tell whether the two expressions are the same, because
                      I don't know the value of one of them."
                      Note that that's different from "I know they're different".
                      They _could_ be equal, we just don't know.

                      'unknown' compared to 'unknown' gives of course 'unknown', since it's
                      just a special case of the above. Not knowing the value of both the
                      expressions doesn't help much. The answer can never be 'I know they
                      are the same.' Not knowing one is enough to say you don't know the
                      result of the comparison.

                      I can hardly imagine why you may want to select all rows that you
                      are certain have equal values, plus others that have potentially
                      different values because they are both unknown. It smells like a
                      design problem.


                      BTW,

                      coalesce(a,0) = coalesce(b,0)

                      is wrong, since it assumes 0 is a special value, never used in the
                      table. If so, it's better use it from the start instead of NULL for
                      those special rows. That espression is true for the following rows:
                      a | b
                      ---+---
                      1 | 1
                      2 | 2
                      |
                      0 |
                      | 0

                      the last two rows are wrongly selected.


                      coalesce(a = b, a is null and b is null)

                      is correct, and maybe slightly better than the original

                      (a = b) or (a is null and b is null)

                      if the implementation is smart enough to evaluate its arguments only
                      when needed. The or operator needs to evaluate the right side when
                      the left side is either false or null, COALESCE only when it's null.
                      I think the docs mention that.

                      ..TM.
                      --
                      ____/ ____/ /
                      / / / Marco Colombo
                      ___/ ___ / / Technical Manager
                      / / / ESI s.r.l.
                      _____/ _____/ _/ Colombo@ESI.it

                      ---------------------------(end of broadcast)---------------------------
                      TIP 7: don't forget to increase your free space map settings

                      Comment

                      • Greg Stark

                        #12
                        Re: Null comparisons (was Re: checksum)


                        Marco Colombo <pgsql@esiway.n et> writes:
                        [color=blue]
                        > (a = b or (a is null and b is null))
                        >
                        > that raises a flag for me. It seems that NULL is used as a special value,
                        > which is not.[/color]

                        Well, as I said, it raised a flag for me too. However, it's not good to be too
                        dogmatic about things. General rules are useful guiding principles but you
                        have to recognize when it's worth it to break them. We don't know enough about
                        his problem to say another approach would be any better.

                        For example, in one application I have a table that *does* have "unknown"
                        values. However I do need to look up records that match criteria including
                        having "unknown" values in specific positions. For most queries using NULL is
                        convenient and it's perfectly appropriate. But I have queries like this user
                        does and I use coalesce since I find the resulting expression much clearer
                        than using the three-way logical expression above.

                        Incidentally, coalesce(a,0)=c oalesce(b,0) has the advantage over all the other
                        suggestions that you can build an index on coalesce(a,0) and/or coalesce(b,0)
                        and use them for the join or for individual record lookups.
                        [color=blue]
                        > BTW,
                        >
                        > coalesce(a,0) = coalesce(b,0)
                        >
                        > is wrong, since it assumes 0 is a special value, never used in the
                        > table. If so, it's better use it from the start instead of NULL for
                        > those special rows.[/color]

                        I specifically said you had to use a special value in my suggestion. Saying
                        something is "wrong" when it does what's needed just because it violates some
                        abstract design principle is just short-sighted.

                        Using 0 in the table might violate unique constraints or foreign key
                        constraints. I try to avoid having a single quirky table propagate its
                        quirkiness to the rest of the system.

                        For example, creating a bogus "0" record in some other table just to satisfy
                        the foreign key constraint then having the rest of the application have to
                        work around this bogus record results in a much less workable system than
                        simply using NULL instead of 0 for the special value.
                        [color=blue]
                        > coalesce(a = b, a is null and b is null)
                        >
                        > is correct, and maybe slightly better than the original
                        >
                        > (a = b) or (a is null and b is null)
                        >
                        > if the implementation is smart enough to evaluate its arguments only
                        > when needed. The or operator needs to evaluate the right side when
                        > the left side is either false or null, COALESCE only when it's null.
                        > I think the docs mention that.[/color]

                        Actually it's only the latter expression that will be able to avoid evaluating
                        the extra expression, not the coalesce example. In any case the time to
                        evaluate the "a is null and b is null" part is negligible. And the fact that
                        neither can use any indexes is the only relevant performance question. It's
                        possible that's not a concern, but if it is they both lose.

                        That's one possible argument in favour of a === operator. It would be easy (I
                        think?) to make === use a btree index without even having to build a
                        functional index like with coalesce(a,0).

                        --
                        greg


                        ---------------------------(end of broadcast)---------------------------
                        TIP 5: Have you checked our extensive FAQ?



                        Comment

                        • Marco Colombo

                          #13
                          Re: Null comparisons (was Re: checksum)

                          On Tue, 28 Sep 2004, Greg Stark wrote:
                          [color=blue]
                          >
                          > Marco Colombo <pgsql@esiway.n et> writes:
                          >[color=green]
                          >> (a = b or (a is null and b is null))
                          >>
                          >> that raises a flag for me. It seems that NULL is used as a special value,
                          >> which is not.[/color]
                          >
                          > Well, as I said, it raised a flag for me too. However, it's not good to be too
                          > dogmatic about things. General rules are useful guiding principles but you
                          > have to recognize when it's worth it to break them. We don't know enough about
                          > his problem to say another approach would be any better.[/color]

                          The way SQL treats NULLs in boolean expressions is not a "general rule".
                          It's just the way it is. I'm being pragmatic not dogmatic.
                          [color=blue]
                          > For example, in one application I have a table that *does* have "unknown"
                          > values. However I do need to look up records that match criteria including
                          > having "unknown" values in specific positions. For most queries using NULL is
                          > convenient and it's perfectly appropriate. But I have queries like this user
                          > does and I use coalesce since I find the resulting expression much clearer
                          > than using the three-way logical expression above.[/color]

                          Using special values is a mistake, but I agree that's being dogmatic.

                          Using NULL as a special value is a way worse mistake. Three way logic
                          it's there, there's nothing you can do about that. NULL can't be a value.
                          That's why you can't use it in comparisons.

                          As for the "dogma" part, I bet you're using special values only to
                          model states, not values, and are using the same columns you're using
                          to model values. Just add another column, life will be better.
                          This has nothing to do with NULLs, BTW.
                          [color=blue]
                          > Incidentally, coalesce(a,0)=c oalesce(b,0) has the advantage over all the other
                          > suggestions that you can build an index on coalesce(a,0) and/or coalesce(b,0)
                          > and use them for the join or for individual record lookups.
                          >[color=green]
                          >> BTW,
                          >>
                          >> coalesce(a,0) = coalesce(b,0)
                          >>
                          >> is wrong, since it assumes 0 is a special value, never used in the
                          >> table. If so, it's better use it from the start instead of NULL for
                          >> those special rows.[/color]
                          >
                          > I specifically said you had to use a special value in my suggestion. Saying
                          > something is "wrong" when it does what's needed just because it violates some
                          > abstract design principle is just short-sighted.
                          >
                          > Using 0 in the table might violate unique constraints or foreign key
                          > constraints. I try to avoid having a single quirky table propagate its
                          > quirkiness to the rest of the system.[/color]

                          _Unique_ constraints? I don't get this. 0 has no special meaning.
                          1, -1, 1000000 might violate unique constraints as well. Any value might.
                          What's the point here? Same goes for foreign key constraints.
                          Actually same goes for _any_ constraint.

                          I agree that using a value you _know_ it's invalid due to some contraints
                          allows you to do the "coalesce trick" safely. But this assumes there is
                          at least _one_ invalid value. This is not true in general. In general,

                          coalesce(a,0) = coalesce(b,0)

                          is not the same of

                          (a = b) or (a is null and b is null).

                          I've even provided a fine example. It can't be a general equivalence.
                          [color=blue]
                          > For example, creating a bogus "0" record in some other table just to satisfy
                          > the foreign key constraint then having the rest of the application have to
                          > work around this bogus record results in a much less workable system than
                          > simply using NULL instead of 0 for the special value.[/color]

                          I'm lost here. I've never proposed to add 0 anywhere. I just wonder:
                          "what if someone does?". In order to use 0 in coalesce(a,0) = coalesce(b,0)
                          you have to make sure 0 is invalid for both a and b. If you knew -1 is
                          invalid instead, you would use coalesce(a,-1) = coalesce(b,-1).
                          But what if there's no invalid value?

                          It might seem a "dogmatic" question, but my point is: why bother?
                          Just get the model right. Mapping NULLs to 0 or -1 or whatever is
                          meaningless, if the model is right.

                          That's again the whole point. What you're proposing sounds like this:
                          "I'd use 0 (or other special value) in the table, but that's not good
                          cause it may break some constraint. So I use NULLs in place of my
                          special value, and convert them at later time with coalesce(), so that
                          I can compare them again."

                          Now, that's abusing of NULLs. There's a reason why NULLs don't break
                          foreign key constraints, and a reason why you can't compare them.
                          Your use of coalesce(), your functional index, it's just placing a
                          brown paperbag on the real problem, which is a wrong model. Don't play
                          with NULLs, fix your schema. A badly designed model is not a matter
                          of "general principles". It's a plain real-world mistake. And leads
                          to any kind of acrobatic exercises in SQL to get the system work.
                          One day you'll run into another query you've got a hard time to write
                          or to make run efficently.

                          NULLs are not meant to be "values" and should never be used as such.
                          Anytime you feel the need of an index on them, or to compare them,
                          you're treating them as values, and that's plain wrong. Not because
                          of any dogma, but because that's how the system works. The _only_
                          way to use them as values is to abuse of them.
                          [color=blue]
                          >[color=green]
                          >> coalesce(a = b, a is null and b is null)
                          >>
                          >> is correct, and maybe slightly better than the original
                          >>
                          >> (a = b) or (a is null and b is null)
                          >>
                          >> if the implementation is smart enough to evaluate its arguments only
                          >> when needed. The or operator needs to evaluate the right side when
                          >> the left side is either false or null, COALESCE only when it's null.
                          >> I think the docs mention that.[/color]
                          >
                          > Actually it's only the latter expression that will be able to avoid evaluating
                          > the extra expression, not the coalesce example. In any case the time to[/color]
                          ^^^^^^^^^^^^^^^ ^^^^^^^^^
                          Why not?
                          [color=blue]
                          > evaluate the "a is null and b is null" part is negligible. And the fact that
                          > neither can use any indexes is the only relevant performance question. It's
                          > possible that's not a concern, but if it is they both lose.[/color]

                          We agree here. I wrote "maybe slightly better".
                          [color=blue]
                          > That's one possible argument in favour of a === operator. It would be easy (I
                          > think?) to make === use a btree index without even having to build a
                          > functional index like with coalesce(a,0).[/color]

                          A new, non standard operator just to support badly modeled schemas?
                          No, thanks.

                          In C, you can store a (small) string into a pointer type variable,
                          if you're careful enough, and know what you're doing. It might work
                          for you, but it's still abusing of the language: you can't
                          expect the language to _support_ that. And for sure, you can't sell
                          it as a general solution.

                          I'm not against abusing of the db, nor playing dirty tricks, if that fits
                          your needs. You're free to design your db the way you like and face
                          the cost of a careful design or of later SQL gymnastics. I'm fine,
                          as long as you don't ask for syntactic sugar to support those "features".

                          ..TM.
                          --
                          ____/ ____/ /
                          / / / Marco Colombo
                          ___/ ___ / / Technical Manager
                          / / / ESI s.r.l.
                          _____/ _____/ _/ Colombo@ESI.it


                          ---------------------------(end of broadcast)---------------------------
                          TIP 4: Don't 'kill -9' the postmaster

                          Comment

                          • Greg Stark

                            #14
                            Re: Null comparisons (was Re: checksum)


                            Marco Colombo <pgsql@esiway.n et> writes:
                            [color=blue]
                            > I'm not against abusing of the db, nor playing dirty tricks, if that fits
                            > your needs. You're free to design your db the way you like and face
                            > the cost of a careful design or of later SQL gymnastics. I'm fine,
                            > as long as you don't ask for syntactic sugar to support those "features".[/color]

                            My point is that you're making judgements about his schema without actually
                            knowing what you're talking about. For all we know his schema is entirely
                            reasonable and it's the query that has unusual requirements. Spouting general
                            design principles that may or may not apply as being iron-clad rules and
                            saying it's just wrong to break them is wilful blindness.

                            Database modelling is not something you can do by holding up some textbook and
                            screaming "third normal form" until the developers trying to get work done
                            cower in submission. There are lots of times when breaking or bending the
                            rules is entirely reasonable and blindly following them is simply a waste of
                            time.

                            For example, I have a table that uses NULLs to represent absent data. In 90%
                            of the queries three value logic is just exactly what's needed. In any case
                            they each have foreign key dependencies and having special values to represent
                            the absent values would be a major pain. It would require satisfying the
                            foreign keys with bogus records.

                            However I have queries that have to match provided data with other records,
                            including having missing data in the same position. For such a query I need to
                            break the usual model of three value logic and write something similar to what
                            this user needs. In my case no index would really be reasonable since there
                            are half a dozen such fields, but in general there's no reason an index
                            shouldn't be available for such cases.

                            --
                            greg


                            ---------------------------(end of broadcast)---------------------------
                            TIP 7: don't forget to increase your free space map settings

                            Comment

                            Working...