Alternative to self-joins??

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • nuked@bway.net

    Alternative to self-joins??

    I have a table that has values of variables for certain entities. The
    columns of interest are targetID, variableID, and valueID. A row (1, 5,
    9) means that target number 1 has a value of 9 for variable 5. Being
    denormalized, target number one will have many possible rows in this
    table, one for each variable for which it has a value.

    My problem occurs when I want to find out what targets match a certain
    set of variable values. For instance, I want to find out what targets
    have a value of 9 for variable 5 and a value of 25 for variable 10. I'm
    thinking that this can be a simple self-join:

    SELECT mya.targetID from mytable as mya
    LEFT JOIN mytable as myb
    ON mya.targetID=my b.targetID
    WHERE (mya.variableID =5 AND mya.valueID=9)
    AND (myb.variableID =10 AND myb.valueID=25)

    Does this make sense so far? The problem is that this doesn't scale.
    When I have more than 31 variables and I need to evaluate them all,
    MySQL breaks: I can't do more than 31 joins.

    My design calls for perhaps 80-100 variables, so even 64-bit
    architecture with a limit of 64 joins won't get me there.

    I need another data structure that won't get me stuck on too many
    joins. Any suggestions? If I have to scrap this approach in favor of
    another, I can do that; even some clues on what direction to head out
    on would be helpful. I'm stuck at the present. Thanks.

  • Bill Karwin

    #2
    Re: Alternative to self-joins??

    nuked@bway.net wrote:[color=blue]
    > My design calls for perhaps 80-100 variables, so even 64-bit
    > architecture with a limit of 64 joins won't get me there.
    >
    > I need another data structure that won't get me stuck on too many
    > joins. Any suggestions?[/color]

    Not every SQL task can be done in a single statement. Or at least,
    doing it in a single statement is so difficult that it's not worth the
    time spent to develop and maintain the solution.

    You could do this in a straightforward and scalable way by iterating
    through the tests:

    CREATE TEMPORARY TABLE joinSat (
    targetID INT NOT NULL PRIMARY KEY,
    join_satisfied TINYINT NOT NULL DEFAULT 1
    );

    Prime the temp table with the list of all targetID's:

    INSERT INTO joinSatisfied (targetID, join_satisfied)
    SELECT DISTINCT targetID, 1 FROM mytable;

    Then for each variable test, change the join_satisfied field to 0 if you
    can't find matching targetID's associated with the value you're looking
    for. You could use MySQL's multi-table UPDATE syntax.

    UPDATE joinSatisfied LEFT OUTER JOIN myTable
    ON (j.targetID = m.targetID AND m.variableID = 5 AND m.valueID = 9);
    SET joinSatisfied.b ool = 0
    WHERE myTable.targetI D IS NULL;

    UPDATE joinSatisfied LEFT OUTER JOIN myTable
    ON (j.targetID = m.targetID AND m.variableID = 10 AND m.valueID = 25);
    SET joinSatisfied.b ool = 0
    WHERE myTable.targetI D IS NULL;

    ....etc.

    Repeat the UPDATE statement for each of your variables that you're
    looking for, and in the end your temp table has a 1 for each targetID
    that satisfied all the tests, and 0 otherwise.

    SELECT t.*
    FROM targetMasterTab le AS t INNER JOIN joinSatisfied as j
    ON (t.targetID = j.targetID)
    WHERE j.bool = 1;

    Regards,
    Bill K.

    Comment

    Working...