Trouble With Counting New Documents With Complex Query

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Adam i Agnieszka Gasiorowski FNORD

    Trouble With Counting New Documents With Complex Query


    I need help width formulating the most
    effective (in terms of processing time)
    SQL query to count all the "new"
    documents in the repository, where "new" is
    defined as "from 00:00:01 up to 23:59:59
    today". My current query does not give me
    satisfactory results, it creates a visible
    delay in rendering of the main page of one of
    the departments (Drugs) :8[[[
    (at least I, for now, think it's the culprit).
    It's for the <url: https://hyperreal.info >
    site, see for yourself, notice the delay
    <url: https://hyperreal.info/drugs/go.to/index >.

    Currently I ask MySQL to (offending
    PHP fragment follows, I hope it is self-
    explanatory).

    <?
    $suma = 0;
    $pytanie = "SELECT COUNT(DISTINCT x_article.ID) AS CNT ";
    $pytanie .= "FROM x_article ";
    $pytanie .= "LEFT JOIN x_instance ";
    $pytanie .= "ON x_article.ID = x_instance.Arti cle ";
    $pytanie .= "LEFT JOIN x_section ";
    $pytanie .= "ON x_instance.Sect ion = x_section.ID ";
    $pytanie .= "WHERE (x_section.Stat us & 1) = 0 "; // not empty
    $pytanie .= "AND (x_section.Dept = 2 OR x_section.Dept = 5) "; // Drugs, NeuroGroove
    $pytanie .= "AND (x_instance.Sta tus & 255) = 0 "; // not hidden, etc
    $pytanie .= "AND UNIX_TIMESTAMP( x_article.Date) BETWEEN " . mktime(0, 0, 1, date('m'), date('d'), date('Y')) . " AND UNIX_TIMESTAMP( NOW()) ";
    $pytanie .= "GROUP BY x_article.ID";
    $wynik = mysql_query($py tanie);
    while ($tmp = mysql_fetch_arr ay($wynik))
    {
    $suma += $tmp['CNT'];
    }
    if ($suma)
    {
    // pretty-printing of the result
    $dzisdodano = str_pad((string )(int)$suma, 4, '0', STR_PAD_LEFT);
    }
    else $dzisdodano = '0000';
    ?>

    The table layout is as follows:

    mysql> DESC x_article;
    +-------------+------------------+------+-----+----------------------+----------------+
    | Field | Type | Null | Key | Default | Extra |
    +-------------+------------------+------+-----+----------------------+----------------+
    | ID | int(10) unsigned | | PRI | NULL | auto_increment |
    | Name | varchar(255) | YES | MUL | NULL | |
    | Description | varchar(255) | YES | | NULL | |
    | Keywords | varchar(255) | YES | | NULL | |
    | Content | mediumtext | | | | |
    | Date | datetime | | | 2001-01-01 00:00:00 | |
    | Author | varchar(100) | | | unknown@hyperre al.pl | |
    | Feedback | varchar(100) | YES | | NULL | |
    | Size | int(32) | YES | | NULL | |
    | Words | int(32) | YES | | NULL | |
    | Images | int(32) | YES | | NULL | |
    +-------------+------------------+------+-----+----------------------+----------------+

    mysql> DESC x_instance;
    +----------+------------------+------+-----+---------+-------+
    | Field | Type | Null | Key | Default | Extra |
    +----------+------------------+------+-----+---------+-------+
    | Article | mediumint(9) | | MUL | 0 | |
    | Section | mediumint(9) | | MUL | 0 | |
    | Priority | tinyint(4) | | | 0 | |
    | Status | int(16) unsigned | | | 0 | |
    +----------+------------------+------+-----+---------+-------+

    mysql> DESC x_section;
    +----------+----------------------+------+-----+---------------+----------------+
    | Field | Type | Null | Key | Default | Extra |
    +----------+----------------------+------+-----+---------------+----------------+
    | ID | mediumint(9) | | PRI | NULL | auto_increment |
    | Name | varchar(100) | | MUL | | |
    | Parent | mediumint(9) | | MUL | 0 | |
    | Dept | smallint(6) | | MUL | 0 | |
    | Priority | tinyint(4) | | | 3 | |
    | Keywords | varchar(255) | YES | | NULL | |
    | Sorting | varchar(255) | | | Priority DESC | |
    | OrderBy | varchar(255) | YES | | NULL | |
    | SplitAt | smallint(5) unsigned | | | 25 | |
    | Status | int(16) unsigned | | | 0 | |
    +----------+----------------------+------+-----+---------------+----------------+

    Tell me if you need any additional information.
    Thank you for all your help.

    MySQL version is 4.0.17, PLD Linux Distribution MySQL RPM.

    --
    Seks, seksiæ, seksolatki... news:pl.soc.sek s.moderowana
    http://hyperreal.info { iWanToDie } WiNoNa ) (
    http://szatanowskie-ladacznice.0-700.pl foReVeR( * )
    Poznaj jej zwiewne kszta³ty... http://www.opera.com 007
  • Agelmar

    #2
    Re: Trouble With Counting New Documents With Complex Query

    A few things...
    1. Your AND clause for the date is making the server do way more work than
    it has to. Try
    $pytanie .= "AND x_article.Date BETWEEN CURDATE() AND NOW() ";
    (curdate() will return the date, and will then get promoted to datetime with
    the time part set to 00:00:00.)
    2. Build an index on x_instance.Arti cle and on x_article.Date (you can do
    this via phpmyadmin, or you can issue the following mysql query: ALTER TABLE
    x_instance ADD INDEX('Article' );

    That should speed it up a bit, let me know if it helps...

    Adam i Agnieszka Gasiorowski FNORD wrote:[color=blue]
    > I need help width formulating the most
    > effective (in terms of processing time)
    > SQL query to count all the "new"
    > documents in the repository, where "new" is
    > defined as "from 00:00:01 up to 23:59:59
    > today". My current query does not give me
    > satisfactory results, it creates a visible
    > delay in rendering of the main page of one of
    > the departments (Drugs) :8[[[
    > (at least I, for now, think it's the culprit).
    > It's for the <url: https://hyperreal.info >
    > site, see for yourself, notice the delay
    > <url: https://hyperreal.info/drugs/go.to/index >.
    >
    > Currently I ask MySQL to (offending
    > PHP fragment follows, I hope it is self-
    > explanatory).
    >
    > <?
    > $suma = 0;
    > $pytanie = "SELECT COUNT(DISTINCT x_article.ID) AS CNT ";
    > $pytanie .= "FROM x_article ";
    > $pytanie .= "LEFT JOIN x_instance ";
    > $pytanie .= "ON x_article.ID = x_instance.Arti cle ";
    > $pytanie .= "LEFT JOIN x_section ";
    > $pytanie .= "ON x_instance.Sect ion = x_section.ID ";
    > $pytanie .= "WHERE (x_section.Stat us & 1) = 0 "; // not empty
    > $pytanie .= "AND (x_section.Dept = 2 OR x_section.Dept = 5) "; //
    > Drugs, NeuroGroove $pytanie .= "AND (x_instance.Sta tus & 255) = 0 ";
    > // not hidden, etc $pytanie .= "AND UNIX_TIMESTAMP( x_article.Date)
    > BETWEEN " . mktime(0, 0, 1, date('m'), date('d'), date('Y')) . " AND
    > UNIX_TIMESTAMP( NOW()) "; $pytanie .= "GROUP BY x_article.ID"; $wynik
    > = mysql_query($py tanie);
    > while ($tmp = mysql_fetch_arr ay($wynik))
    > {
    > $suma += $tmp['CNT'];
    > }
    > if ($suma)
    > {
    > // pretty-printing of the result
    > $dzisdodano = str_pad((string )(int)$suma, 4, '0', STR_PAD_LEFT);
    > }
    > else $dzisdodano = '0000';[color=green]
    >>[/color]
    >
    > The table layout is as follows:
    >
    > mysql> DESC x_article;
    >[/color]
    +-------------+------------------+------+-----+----------------------+------
    ----------+[color=blue][color=green]
    >> Field | Type | Null | Key | Default |
    >> Extra |
    >>[/color][/color]
    +-------------+------------------+------+-----+----------------------+------
    ----------+[color=blue][color=green]
    >> ID | int(10) unsigned | | PRI | NULL |
    >> auto_increment |
    >> Name | varchar(255) | YES | MUL | NULL |
    >> |
    >> Description | varchar(255) | YES | | NULL |
    >> |
    >> Keywords | varchar(255) | YES | | NULL |
    >> |
    >> Content | mediumtext | | | |
    >> |
    >> Date | datetime | | | 2001-01-01 00:00:00 |
    >> |
    >> Author | varchar(100) | | | unknown@hyperre al.pl |
    >> |
    >> Feedback | varchar(100) | YES | | NULL |
    >> |
    >> Size | int(32) | YES | | NULL |
    >> |
    >> Words | int(32) | YES | | NULL |
    >> |
    >> Images | int(32) | YES | | NULL |
    >> |
    >>[/color][/color]
    +-------------+------------------+------+-----+----------------------+------
    ----------+[color=blue]
    >
    > mysql> DESC x_instance;
    > +----------+------------------+------+-----+---------+-------+[color=green]
    >> Field | Type | Null | Key | Default | Extra |[/color]
    > +----------+------------------+------+-----+---------+-------+[color=green]
    >> Article | mediumint(9) | | MUL | 0 | |
    >> Section | mediumint(9) | | MUL | 0 | |
    >> Priority | tinyint(4) | | | 0 | |
    >> Status | int(16) unsigned | | | 0 | |[/color]
    > +----------+------------------+------+-----+---------+-------+
    >
    > mysql> DESC x_section;
    >[/color]
    +----------+----------------------+------+-----+---------------+------------
    ----+[color=blue][color=green]
    >> Field | Type | Null | Key | Default | Extra
    >> |
    >>[/color][/color]
    +----------+----------------------+------+-----+---------------+------------
    ----+[color=blue][color=green]
    >> ID | mediumint(9) | | PRI | NULL |
    >> auto_increment |
    >> Name | varchar(100) | | MUL | |
    >> |
    >> Parent | mediumint(9) | | MUL | 0 |
    >> |
    >> Dept | smallint(6) | | MUL | 0 |
    >> |
    >> Priority | tinyint(4) | | | 3 |
    >> |
    >> Keywords | varchar(255) | YES | | NULL |
    >> |
    >> Sorting | varchar(255) | | | Priority DESC |
    >> |
    >> OrderBy | varchar(255) | YES | | NULL |
    >> |
    >> SplitAt | smallint(5) unsigned | | | 25 |
    >> |
    >> Status | int(16) unsigned | | | 0 |
    >> |
    >>[/color][/color]
    +----------+----------------------+------+-----+---------------+------------
    ----+[color=blue]
    >
    > Tell me if you need any additional information.
    > Thank you for all your help.
    >
    > MySQL version is 4.0.17, PLD Linux Distribution MySQL RPM.[/color]


    Comment

    • Chung Leong

      #3
      Re: Trouble With Counting New Documents With Complex Query

      Try breaking the query into two. First, retrieve a list of article ids that
      are "new". Then in the second query, join x_instance and x_section and count
      the number of unique ids that matches the additional criteria.

      U¿ytkownik "Adam i Agnieszka Gasiorowski FNORD"
      <agquarx@sieg-heil-szatan.org> napisa³ w wiadomo¶ci
      news:3FF5B186.1 60FBC14@venus.c i.uw.edu.pl...[color=blue]
      >
      > I need help width formulating the most
      > effective (in terms of processing time)
      > SQL query to count all the "new"
      > documents in the repository, where "new" is
      > defined as "from 00:00:01 up to 23:59:59
      > today". My current query does not give me
      > satisfactory results, it creates a visible
      > delay in rendering of the main page of one of
      > the departments (Drugs) :8[[[
      > (at least I, for now, think it's the culprit).
      > It's for the <url: https://hyperreal.info >
      > site, see for yourself, notice the delay
      > <url: https://hyperreal.info/drugs/go.to/index >.
      >
      > Currently I ask MySQL to (offending
      > PHP fragment follows, I hope it is self-
      > explanatory).
      >
      > <?
      > $suma = 0;
      > $pytanie = "SELECT COUNT(DISTINCT x_article.ID) AS CNT ";
      > $pytanie .= "FROM x_article ";
      > $pytanie .= "LEFT JOIN x_instance ";
      > $pytanie .= "ON x_article.ID = x_instance.Arti cle ";
      > $pytanie .= "LEFT JOIN x_section ";
      > $pytanie .= "ON x_instance.Sect ion = x_section.ID ";
      > $pytanie .= "WHERE (x_section.Stat us & 1) = 0 "; // not empty
      > $pytanie .= "AND (x_section.Dept = 2 OR x_section.Dept = 5) "; // Drugs,[/color]
      NeuroGroove[color=blue]
      > $pytanie .= "AND (x_instance.Sta tus & 255) = 0 "; // not hidden, etc
      > $pytanie .= "AND UNIX_TIMESTAMP( x_article.Date) BETWEEN " . mktime(0, 0,[/color]
      1, date('m'), date('d'), date('Y')) . " AND UNIX_TIMESTAMP( NOW()) ";[color=blue]
      > $pytanie .= "GROUP BY x_article.ID";
      > $wynik = mysql_query($py tanie);
      > while ($tmp = mysql_fetch_arr ay($wynik))
      > {
      > $suma += $tmp['CNT'];
      > }
      > if ($suma)
      > {
      > // pretty-printing of the result
      > $dzisdodano = str_pad((string )(int)$suma, 4, '0', STR_PAD_LEFT);
      > }
      > else $dzisdodano = '0000';
      > ?>
      >
      > The table layout is as follows:
      >
      > mysql> DESC x_article;
      >[/color]
      +-------------+------------------+------+-----+----------------------+------
      ----------+[color=blue]
      > | Field | Type | Null | Key | Default |[/color]
      Extra |[color=blue]
      >[/color]
      +-------------+------------------+------+-----+----------------------+------
      ----------+[color=blue]
      > | ID | int(10) unsigned | | PRI | NULL |[/color]
      auto_increment |[color=blue]
      > | Name | varchar(255) | YES | MUL | NULL |[/color]
      |[color=blue]
      > | Description | varchar(255) | YES | | NULL |[/color]
      |[color=blue]
      > | Keywords | varchar(255) | YES | | NULL |[/color]
      |[color=blue]
      > | Content | mediumtext | | | |[/color]
      |[color=blue]
      > | Date | datetime | | | 2001-01-01 00:00:00 |[/color]
      |[color=blue]
      > | Author | varchar(100) | | | unknown@hyperre al.pl |[/color]
      |[color=blue]
      > | Feedback | varchar(100) | YES | | NULL |[/color]
      |[color=blue]
      > | Size | int(32) | YES | | NULL |[/color]
      |[color=blue]
      > | Words | int(32) | YES | | NULL |[/color]
      |[color=blue]
      > | Images | int(32) | YES | | NULL |[/color]
      |[color=blue]
      >[/color]
      +-------------+------------------+------+-----+----------------------+------
      ----------+[color=blue]
      >
      > mysql> DESC x_instance;
      > +----------+------------------+------+-----+---------+-------+
      > | Field | Type | Null | Key | Default | Extra |
      > +----------+------------------+------+-----+---------+-------+
      > | Article | mediumint(9) | | MUL | 0 | |
      > | Section | mediumint(9) | | MUL | 0 | |
      > | Priority | tinyint(4) | | | 0 | |
      > | Status | int(16) unsigned | | | 0 | |
      > +----------+------------------+------+-----+---------+-------+
      >
      > mysql> DESC x_section;
      >[/color]
      +----------+----------------------+------+-----+---------------+------------
      ----+[color=blue]
      > | Field | Type | Null | Key | Default | Extra[/color]
      |[color=blue]
      >[/color]
      +----------+----------------------+------+-----+---------------+------------
      ----+[color=blue]
      > | ID | mediumint(9) | | PRI | NULL |[/color]
      auto_increment |[color=blue]
      > | Name | varchar(100) | | MUL | |[/color]
      |[color=blue]
      > | Parent | mediumint(9) | | MUL | 0 |[/color]
      |[color=blue]
      > | Dept | smallint(6) | | MUL | 0 |[/color]
      |[color=blue]
      > | Priority | tinyint(4) | | | 3 |[/color]
      |[color=blue]
      > | Keywords | varchar(255) | YES | | NULL |[/color]
      |[color=blue]
      > | Sorting | varchar(255) | | | Priority DESC |[/color]
      |[color=blue]
      > | OrderBy | varchar(255) | YES | | NULL |[/color]
      |[color=blue]
      > | SplitAt | smallint(5) unsigned | | | 25 |[/color]
      |[color=blue]
      > | Status | int(16) unsigned | | | 0 |[/color]
      |[color=blue]
      >[/color]
      +----------+----------------------+------+-----+---------------+------------
      ----+[color=blue]
      >
      > Tell me if you need any additional information.
      > Thank you for all your help.
      >
      > MySQL version is 4.0.17, PLD Linux Distribution MySQL RPM.
      >
      > --
      > Seks, seksiæ, seksolatki... news:pl.soc.sek s.moderowana
      > http://hyperreal.info { iWanToDie } WiNoNa ) (
      > http://szatanowskie-ladacznice.0-700.pl foReVeR( * )
      > Poznaj jej zwiewne kszta³ty... http://www.opera.com 007[/color]


      Comment

      • Agelmar

        #4
        Re: Trouble With Counting New Documents With Complex Query

        Chung Leong wrote:[color=blue]
        > Try breaking the query into two. First, retrieve a list of article
        > ids that are "new". Then in the second query, join x_instance and
        > x_section and count the number of unique ids that matches the
        > additional criteria.[/color]

        This should not make any improvement. Actually, it should slow it down. You
        always want to use as few queries as possible. The SQL server has what's
        called a "query optimizer." It looks at its attribute and relation catalogs,
        and sees which conditions are the most selective (match the fewest results),
        calculates the cost of various access plans etc. Given his query, and the
        fact that MySQL builds an index on all primary keys, the most likely access
        plan for his query would be a B+ tree traversal on x_article. This will
        yield sorted data, and if an index is built on x_instance.arti cle as I
        suggested, will make for a very fast and efficient join. Breaking the query
        into two would require a much more complicated selection in the second
        query, and would also lose the advantage of having the articleid's
        pre-sorted.


        Comment

        • Agelmar

          #5
          Re: Trouble With Counting New Documents With Complex Query

          Agelmar wrote:[color=blue]
          > A few things...
          > 1. Your AND clause for the date is making the server do way more work
          > than it has to. Try
          > $pytanie .= "AND x_article.Date BETWEEN CURDATE() AND NOW() ";
          > (curdate() will return the date, and will then get promoted to
          > datetime with the time part set to 00:00:00.)
          > 2. Build an index on x_instance.Arti cle and on x_article.Date (you
          > can do this via phpmyadmin, or you can issue the following mysql
          > query: ALTER TABLE x_instance ADD INDEX('Article' );[/color]

          hmm, just noticed that there was already an index on x_instance.Arti cle...
          just make sure that it's an index on x_instance.Arti cle and not an index on
          (x_instance.Som ething, x_instance.Arti cle) (e.g. that it's the first
          attribute in the index)


          Comment

          • Adam i Agnieszka Gasiorowski FNORD

            #6
            Re: Trouble With Counting New Documents With Complex Query

            Agelmar wrote:
            [color=blue]
            > $pytanie .= "AND x_article.Date BETWEEN CURDATE() AND NOW() ";[/color]

            Good idea. Thank you.

            --
            Seks, seksiæ, seksolatki... news:pl.soc.sek s.moderowana
            http://hyperreal.info { iWanToDie } WiNoNa ) (
            http://szatanowskie-ladacznice.0-700.pl foReVeR( * )
            Poznaj jej zwiewne kszta³ty... http://www.opera.com 007

            Comment

            Working...