left join is strange

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Andrei Ivanov

    left join is strange


    Hello,
    I have 2 tables:

    CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL
    );

    CREATE TABLE products_daily_ compacted_views (
    product INTEGER NOT NULL REFERENCES products,
    date DATE NOT NULL DEFAULT ('NOW'::TEXT):: DATE,
    count INTEGER NOT NULL
    );

    The table products has 1785 rows, the table products_daily_ compacted_views
    has 768 rows with date = current_date;

    I want to list all the products and the number of times each product has
    been viewed:

    SELECT p.id, p.name, COALESCE(v.coun t, 0) AS views
    FROM products p LEFT JOIN products_daily_ compacted_views v ON p.id = v.product
    WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC

    The problem with this query is that it doesn't return all the products,
    instead of 1785 rows, it returns 1077 rows

    This modified query seems to be correct, it returns all the products...

    SELECT p.id, p.name, COALESCE(v.coun t, 0) AS views
    FROM products p LEFT JOIN products_daily_ compacted_views v
    ON p.id = v.product AND v.date = current_date
    ORDER BY views DESC

    Could anybody explain to me why does this happen ?

    Thank you.

    ---------------------------(end of broadcast)---------------------------
    TIP 8: explain analyze is your friend

  • Arjen van der Meijden

    #2
    Re: left join is strange

    > Andrei Ivanov wrote:[color=blue]
    >
    > I want to list all the products and the number of times each
    > product has
    > been viewed:
    >
    > SELECT p.id, p.name, COALESCE(v.coun t, 0) AS views
    > FROM products p LEFT JOIN products_daily_ compacted_views v ON
    > p.id = v.product
    > WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC
    >
    > The problem with this query is that it doesn't return all the
    > products,
    > instead of 1785 rows, it returns 1077 rows[/color]
    And that is exactly as it should be.
    You will get the left joined combination of p and v, but the filter in
    the where is applied afterwards on all those combinations.
    [color=blue]
    >
    > This modified query seems to be correct, it returns all the
    > products...
    >
    > SELECT p.id, p.name, COALESCE(v.coun t, 0) AS views
    > FROM products p LEFT JOIN products_daily_ compacted_views v
    > ON p.id = v.product AND v.date = current_date
    > ORDER BY views DESC
    >
    > Could anybody explain to me why does this happen ?[/color]
    Here you apply your filter to the elements of v, before joining them to
    the elements of p.

    Best regards,

    Arjen




    ---------------------------(end of broadcast)---------------------------
    TIP 3: if posting/reading through Usenet, please send an appropriate
    subscribe-nomail command to majordomo@postg resql.org so that your
    message can get through to the mailing list cleanly

    Comment

    • Andrei Ivanov

      #3
      Re: left join is strange



      On Mon, 8 Dec 2003, Arjen van der Meijden wrote:
      [color=blue][color=green]
      > > Andrei Ivanov wrote:
      > >
      > > I want to list all the products and the number of times each
      > > product has
      > > been viewed:
      > >
      > > SELECT p.id, p.name, COALESCE(v.coun t, 0) AS views
      > > FROM products p LEFT JOIN products_daily_ compacted_views v ON
      > > p.id = v.product
      > > WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC
      > >
      > > The problem with this query is that it doesn't return all the
      > > products,
      > > instead of 1785 rows, it returns 1077 rows[/color]
      > And that is exactly as it should be.
      > You will get the left joined combination of p and v, but the filter in
      > the where is applied afterwards on all those combinations.
      >[/color]

      I kinda figured that out, but still, being a left join, it should return
      all the rows in the table products, which I then filter with
      v.date = current_date OR v.date IS NULL.

      v.date has 3 possible values: current_date, some other date or NULL, if
      there is no corresponding row in products_daily_ compacted_views for that
      product.

      I filter out only 1 value, and I still should get 1785 rows...

      [color=blue][color=green]
      > >
      > > This modified query seems to be correct, it returns all the
      > > products...
      > >
      > > SELECT p.id, p.name, COALESCE(v.coun t, 0) AS views
      > > FROM products p LEFT JOIN products_daily_ compacted_views v
      > > ON p.id = v.product AND v.date = current_date
      > > ORDER BY views DESC
      > >
      > > Could anybody explain to me why does this happen ?[/color]
      > Here you apply your filter to the elements of v, before joining them to
      > the elements of p.
      >
      > Best regards,
      >
      > Arjen
      >
      >
      >
      >
      > ---------------------------(end of broadcast)---------------------------
      > TIP 3: if posting/reading through Usenet, please send an appropriate
      > subscribe-nomail command to majordomo@postg resql.org so that your
      > message can get through to the mailing list cleanly
      >[/color]

      ---------------------------(end of broadcast)---------------------------
      TIP 7: don't forget to increase your free space map settings

      Comment

      • Arjen van der Meijden

        #4
        Re: left join is strange

        > Andrei Ivanov wrote:[color=blue]
        >
        > On Mon, 8 Dec 2003, Arjen van der Meijden wrote:
        >[color=green][color=darkred]
        > > > Andrei Ivanov wrote:
        > > >
        > > > I want to list all the products and the number of times each
        > > > product has
        > > > been viewed:
        > > >
        > > > SELECT p.id, p.name, COALESCE(v.coun t, 0) AS views
        > > > FROM products p LEFT JOIN products_daily_ compacted_views v ON
        > > > p.id = v.product
        > > > WHERE v.date = current_date OR v.date IS NULL ORDER BY views DESC
        > > >
        > > > The problem with this query is that it doesn't return all the
        > > > products,
        > > > instead of 1785 rows, it returns 1077 rows[/color]
        > > And that is exactly as it should be.
        > > You will get the left joined combination of p and v, but[/color]
        > the filter in[color=green]
        > > the where is applied afterwards on all those combinations.
        > >[/color]
        >
        > I kinda figured that out, but still, being a left join, it
        > should return
        > all the rows in the table products, which I then filter with
        > v.date = current_date OR v.date IS NULL.
        >
        > v.date has 3 possible values: current_date, some other date
        > or NULL, if
        > there is no corresponding row in
        > products_daily_ compacted_views for that
        > product.
        >
        > I filter out only 1 value, and I still should get 1785 rows...[/color]

        No, you combine two table using a left join (and yes, you get 1785 rows
        from that left join), which then (after the joining) get filtered using
        your where.
        The values that have the current_date (which are probably none, since
        that is taken at the moment of the selection, not at the moment of the
        insert) or the NULL will get through, resulting in less than your 1785
        rows.

        Regards,

        Arjen




        ---------------------------(end of broadcast)---------------------------
        TIP 8: explain analyze is your friend

        Comment

        • Tom Lane

          #5
          Re: left join is strange

          Andrei Ivanov <andrei.ivanov@ ines.ro> writes:[color=blue]
          > I kinda figured that out, but still, being a left join, it should return
          > all the rows in the table products, which I then filter with
          > v.date = current_date OR v.date IS NULL.
          > v.date has 3 possible values: current_date, some other date or NULL, if
          > there is no corresponding row in products_daily_ compacted_views for that
          > product.[/color]

          Right. Your first query will show products for which (1) there is a v
          row with date = current_date, or (2) there is *no* v row at all. If
          there is a v row with the wrong date, it will get through the left join
          and then be eliminated at WHERE. Because it gets through the left join,
          no null-extended row is generated for that product, and so your OR
          v.date IS NULL doesn't help.

          In your second query, the date condition is considered part of the LEFT
          JOIN condition, meaning that if no v rows pass the date condition, a
          null-extended row will be emitted.

          regards, tom lane

          ---------------------------(end of broadcast)---------------------------
          TIP 4: Don't 'kill -9' the postmaster

          Comment

          Working...