STD of multiple columns

  • stefaan.lhermitte@agr.kuleuven.ac.be

    STD of multiple columns

    Dear Mysql-ians,

    I want to calculate the standard deviation of data that are in multiple
    columns. I know how to calculate the STD of 1 column (e.g. X1 of
    table_X) using:

    SELECT STD(X1) FROM table_X;

    but now I want to calculate the STD of the union of the data in several
    columns (e.g. X1, X2, ..., X100 of table_X).

    Does anyone have any suggestions on how to do that? I hoped something like
    SELECT STD(X1,X2,...,X100) FROM table_X existed, but apparently it does
    not.

    Thanks for your suggestions in advance!

    Kind regards,
    Stef
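
A minimal sketch of one way to get the STD of the pooled values of several columns without a multi-argument STD: collect COUNT, SUM and the sum of squares of every column in a single SELECT, then combine them as sqrt(E[x^2] - E[x]^2). It is written in Python against an in-memory SQLite database standing in for MySQL. The three-column table and its values are made up for illustration, and the result is the population standard deviation, which is what MySQL's STD() returns.

```python
import math
import sqlite3
import statistics

# Hypothetical miniature of table_X with three columns instead of 100.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_X (X1 REAL, X2 REAL, X3 REAL)")
rows = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 10.0)]
conn.executemany("INSERT INTO table_X VALUES (?, ?, ?)", rows)

columns = ["X1", "X2", "X3"]
# One SELECT returns the pooled count, sum and sum of squares.
# COUNT(col) and SUM(col) skip NULLs; COALESCE guards an all-NULL column.
count_expr = " + ".join(f"COUNT({c})" for c in columns)
sum_expr = " + ".join(f"COALESCE(SUM({c}), 0)" for c in columns)
sq_expr = " + ".join(f"COALESCE(SUM({c} * {c}), 0)" for c in columns)
n, total, total_sq = conn.execute(
    f"SELECT {count_expr}, {sum_expr}, {sq_expr} FROM table_X"
).fetchone()

mean = total / n
# Population variance E[x^2] - (E[x])^2; clamp tiny negative rounding errors.
pooled_std = math.sqrt(max(total_sq / n - mean * mean, 0.0))

# Cross-check against Python's population standard deviation.
expected = statistics.pstdev([v for row in rows for v in row])
print(pooled_std, expected)
```

The same three expressions should carry over to a MySQL SELECT, with SQRT() applied in the query itself. The sum-of-squares form can lose precision when the values are large relative to their spread, so treat it as a starting point rather than a drop-in replacement for STD().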

  • Jerry Stuckle

    #2
    Re: STD of multiple columns

    Stef,

    I don't *think* it's possible from your current design.

    Perhaps a redesign is in order. Having 100 columns containing basically the
    same information is not a good design.

    For instance, in the case of student test scores - you could do something like:

    (table) studentid name test1score test2score test3score test4score

    (Of course there would be more info)

    A better design would be:

    (table 1) studentid name

    (table 2) studentid testid score

    Such a design is more versatile - and cures your problem along the way.


    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    JDS Computer Training Corp.
    jstucklex@attglobal.net
    ==================
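
To make the two-table design above concrete, here is a minimal sketch in Python using an in-memory SQLite database as a stand-in for MySQL. The table names `students` and `scores`, the sample rows, and the `PopulationStd` class are assumptions for illustration. SQLite has no built-in STD, so the sketch registers one that mimics MySQL's population standard deviation; on MySQL the same GROUP BY query would use the native STD().

```python
import math
import sqlite3


class PopulationStd:
    """Stand-in for MySQL's STD(): population standard deviation, NULLs ignored."""

    def __init__(self):
        self.values = []

    def step(self, value):
        if value is not None:
            self.values.append(value)

    def finalize(self):
        if not self.values:
            return None
        mean = sum(self.values) / len(self.values)
        return math.sqrt(sum((v - mean) ** 2 for v in self.values) / len(self.values))


conn = sqlite3.connect(":memory:")
conn.create_aggregate("STD", 1, PopulationStd)
conn.executescript("""
    CREATE TABLE students (studentid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE scores (studentid INTEGER, testid INTEGER, score REAL,
                         PRIMARY KEY (studentid, testid));
    INSERT INTO students VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO scores VALUES (1, 1, 70), (1, 2, 80), (1, 3, 90),
                              (2, 1, 60), (2, 2, 60), (2, 3, 60);
""")

# One row per (student, test) lets a plain GROUP BY do the work.
per_student = conn.execute(
    "SELECT studentid, STD(score) FROM scores GROUP BY studentid ORDER BY studentid"
).fetchall()
print(per_student)
```

Adding a fifth test in this layout means inserting rows, not altering the table, and the query stays the same.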

    • stefaan.lhermitte@agr.kuleuven.ac.be

      #3
      Re: STD of multiple columns

      Thanks for your help!

      If I look at my design, it looks like:
      (table) id, obs_time1, obs_time2, ..., obs_time100
      where:
      obs_timeX = observation at "time X"

      I have over 2 million records (each with a unique id) in this
      table, and I want to compute the STD of all observations of each id
      through time.

      If I understand your design correctly, you suggest restructuring it as:
      (table_1) studentid + other info
      (table_2) studentid obs_time obs_value
      where:
      obs_time = "time X" of obs_timeX
      obs_value = value of obs_timeX with corresponding "time X"

      If I am correct, table_2 will become very long, since I create 2
      million (ids) * 100 records (one for every obs_time). Doesn't that
      create much more redundant information, given that I now have only 1 row
      of obs_timeX values per id and the ids are unique?

      I hope I made myself clear.

      Thanks again for your help!

      Regards
      Stef
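
For the per-id case described above, a rough sketch of computing the STD across the obs_timeX columns of each row without restructuring the table. It uses the identity variance = mean of squares minus squared mean. The code is Python with an in-memory SQLite table named `obs` (a hypothetical name), cut down to four time columns with made-up values, and it assumes none of the obs_timeX columns contain NULL.

```python
import math
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE obs (id INTEGER PRIMARY KEY, "
    "obs_time1 REAL, obs_time2 REAL, obs_time3 REAL, obs_time4 REAL)"
)
# Made-up observations: id -> values at time 1..4.
data = {1: (2.0, 4.0, 4.0, 6.0), 2: (1.0, 1.0, 1.0, 1.0)}
conn.executemany(
    "INSERT INTO obs VALUES (?, ?, ?, ?, ?)",
    [(key, *values) for key, values in data.items()],
)

cols = [f"obs_time{i}" for i in range(1, 5)]
n = len(cols)
# Row-wise mean and mean of squares, built from the column names.
mean_expr = "(" + " + ".join(cols) + f") / {n}.0"
mean_sq_expr = "(" + " + ".join(f"{c} * {c}" for c in cols) + f") / {n}.0"
query = f"SELECT id, {mean_expr}, {mean_sq_expr} FROM obs ORDER BY id"

row_std = {}
for id_, mean, mean_sq in conn.execute(query):
    # Population std per row; clamp tiny negative rounding errors to zero.
    row_std[id_] = math.sqrt(max(mean_sq - mean * mean, 0.0))
print(row_std)
```

With 100 columns the generated expressions get long, and missing values would need extra handling, which is part of why the normalized layout is easier to work with.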

      • Jerry Stuckle

        #4
        Re: STD of multiple columns

        Stef,

        Yep, that's exactly what I'm suggesting. Do some reading up on "Database
        Normalization" - it can help you understand why this is potentially a better
        solution.

        And yes, the new table will be quite long. But your existing table is quite
        wide! 200M rows (the max you could have) isn't that different from what you
        have now - 2M rows with > 100 columns in each row.

        Also, as you normalize your tables, you can potentially end up storing more
        data, if the majority of the fields are filled. But you may also store less,
        if only a small number are filled. And normalizing your tables makes things
        more flexible.
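
The long-versus-wide trade-off can be put into rough numbers. The sketch below is a back-of-the-envelope cell count in Python. The function name `storage_cells` and the fill fractions are hypothetical, and it ignores indexes, row overhead and column types, so it only shows the direction of the effect.

```python
def storage_cells(n_ids, n_times, fill_fraction):
    """Count stored cells for the wide and the long (normalized) layout."""
    wide_cells = n_ids * (1 + n_times)      # id plus one cell per time column
    long_rows = round(n_ids * n_times * fill_fraction)
    long_cells = long_rows * 3              # (id, obs_time, obs_value) per row
    return wide_cells, long_rows, long_cells


# Every observation present: the long table holds more cells than the wide one.
print(storage_cells(2_000_000, 100, 1.0))
# Only 10% of the observations present: the long table holds far fewer.
print(storage_cells(2_000_000, 100, 0.1))
```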




        • stefaan.lhermitte@agr.kuleuven.ac.be

          #5
          Re: STD of multiple columns

          Thanks for your suggestion! I will try to reorganize my data.

          I was thinking of making a query to reorganize my data.
          E.g.:
          (SELECT id, "name(obs_time1)" AS obs_time, obs_time1 AS obs_value FROM
          table_1)
          UNION
          (SELECT id, "name(obs_time2)" AS obs_time, obs_time2 AS obs_value FROM
          table_1)
          UNION
          .......
          UNION
          (SELECT id, "name(obs_time100)" AS obs_time, obs_time100 AS obs_value
          FROM table_1)
          ORDER BY id

          with:
          "name(obs_timeX)" = the name of the column from which I currently
          extract obs_timeX

          Hopefully that will work!

          Regards,
          Stef
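
A small sketch of the reorganizing query above, generated programmatically instead of typed out 100 times. It is Python against an in-memory SQLite database standing in for MySQL, with a three-column `table_1` and made-up values. Two deliberate departures from the post: it uses UNION ALL, since plain UNION also removes duplicate rows and there are none to remove when every branch carries a different obs_time, and it stores the time index as a number instead of a column-name string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_1 (id INTEGER PRIMARY KEY, "
    "obs_time1 REAL, obs_time2 REAL, obs_time3 REAL)"
)
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?, ?)",
    [(1, 0.5, 0.7, 0.9), (2, 1.5, None, 1.1)],
)

# One SELECT branch per source column; i comes from range(), not user input.
branches = [
    f"SELECT id, {i} AS obs_time, obs_time{i} AS obs_value FROM table_1"
    for i in range(1, 4)
]
unpivot_sql = " UNION ALL ".join(branches) + " ORDER BY id, obs_time"

long_rows = conn.execute(unpivot_sql).fetchall()
for row in long_rows:
    print(row)
```

Each branch scans the source table once, so on the full table this means 100 passes over 2 million rows whichever way the statement is written.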

          • tommaso.gastaldi@uniroma1.it

            #6
            Re: STD of multiple columns


            Stef,

            do you really want to run a union of 100 selects on a table with 2
            million records?!?

            Actually, I don't see where the problem is. Why don't you just apply the
            function std to each single column: select std(v1), std(v2), ...? Why
            do you feel you need a multivariate function?

            -tom


            • Jerry Stuckle

              #7
              Re: STD of multiple columns

              Stef,

              Actually, I think I'd do it in PHP or some other language and let it loop, i.e.

              (Assuming you're using a version which can insert from a select statement)

              for ($i = 1; $i <= 100; $i++) {
                  // INSERT ... SELECT copies one source column per pass.
                  // Adjust the names to your schema (e.g. id, obs_time1..obs_time100).
                  $query = "INSERT INTO newtable (studentid, obs_time, obs_value) " .
                           "SELECT studentid, $i, obs_value" . $i . " FROM oldtable";
                  $result = mysql_query($query);
                  if (!$result) {
                      echo "MySQL Error: " . mysql_error();
                      break;
                  }
              }

              Also, if not every time has a value and you don't need to insert the empty
              ones, you can do this in two queries - select the value; if it's null (or
              blank) then you don't need to insert it into the new table.
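
The loop described above, together with the idea of skipping empty values, can be sketched as follows. This is Python with an in-memory SQLite database standing in for MySQL. The tables are cut down to three time columns with made-up data, the source columns are assumed to be named id and obs_time1..obs_time3, and the NULL check is folded into the INSERT ... SELECT with a WHERE clause instead of a separate query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE oldtable (id INTEGER PRIMARY KEY,
                           obs_time1 REAL, obs_time2 REAL, obs_time3 REAL);
    CREATE TABLE newtable (studentid INTEGER, obs_time INTEGER, obs_value REAL,
                           PRIMARY KEY (studentid, obs_time));
    INSERT INTO oldtable VALUES (1, 0.5, 0.7, 0.9), (2, 1.5, NULL, 1.1);
""")

N_TIMES = 3  # 100 in the thread; 3 keeps the example small

# One INSERT ... SELECT per source column, all inside a single transaction.
with conn:
    for i in range(1, N_TIMES + 1):
        # The column name is built from a loop counter, never from user input.
        conn.execute(
            "INSERT INTO newtable (studentid, obs_time, obs_value) "
            f"SELECT id, ?, obs_time{i} FROM oldtable WHERE obs_time{i} IS NOT NULL",
            (i,),
        )

migrated = conn.execute(
    "SELECT studentid, obs_time, obs_value FROM newtable "
    "ORDER BY studentid, obs_time"
).fetchall()
print(migrated)
```

Running the loop inside one transaction means a failure partway through leaves the new table empty rather than half filled.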



              • stefaan.lhermitte@agr.kuleuven.ac.be

                #8
                Re: STD of multiple columns

                Thanks, Jerry. I followed your advice and it worked wonderfully!
