User Information Privacy Levels DB design

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • bilibytes
    New Member
    • Jun 2008
    • 128

    User Information Privacy Levels DB design

    Hi everyone,

    I'm facing a database design problem.
    I want to make a sort of networking solution for the clients of my site in which they would be able to share or keep private some of their contact information.

    so if i have a table like this:

    Code:
    CREATE TABLE `users_retailers_account_info` (
      `id` bigint(15) unsigned NOT NULL auto_increment,
      `email` varchar(255) NOT NULL default 'N/A',
      `password` varchar(255) NOT NULL default 'N/A',
      `name_first_owner` varchar(255) NOT NULL default 'N/A',
      `name_second_owner` varchar(255) NOT NULL default 'N/A',
      `company_name` varchar(255) NOT NULL default 'N/A',
      `country_id` mediumint(8) NOT NULL default 'N/A',
      `city_id` varchar(255) NOT NULL default 'N/A',
      `street_address` varchar(255) NOT NULL default 'N/A',
      `lang_iso` varchar(5) NOT NULL default 'en',    
      PRIMARY KEY  (`id`)
    )
    i would like to be able to set some of the columns as having different privacy levels.

    The problem is that as far as i know, there is no possibility to put metadata to a column like:
    Code:
    [public] street_address | 4th avenue, Manhatan
    [protected] email | johnleehooker@hotmail.com
    note that i dont want it to be permanent, i want to let the user decide which privacy levels give to any personal acount information (except for password...).

    My first approach to solve this, was to make a big table, with the columns names, field values and privacy levels like this :
    Code:
     CREATE TABLE `privacy_levels` (
      `id` bigint(15) NOT NULL auto_increment,
      `user_id` bigint(15) NOT NULL auto_increment,
      `column_name` varchar(255) NOT NULL default 'N/A',
      `field_value` varchar(255) NOT NULL default 'N/A',
      `privacy_level` varchar(20) NOT NULL default 'private',
      PRIMARY KEY  (`id`)
    )
    whenever a user sets a different privacy setting than the default, the name of the column and the field value would be stored in this privacy levels table with the desired privacy.
    However this approach has a lot lot lot of overhead. and would make the table very large and slow to query.

    Then i thought of another approach which was to give the users_retailers _account_info column a privacy_level pair column like this:
    Code:
    CREATE TABLE `users_retailers_account_info` (
      `id` bigint(15) unsigned NOT NULL auto_increment,
      `email` varchar(255) NOT NULL default 'N/A',
     [B] `privacy_email` [/B]varchar(20) NOT NULL default 'private',
      `password` varchar(255) NOT NULL default 'N/A',
      `name_first_owner` varchar(255) NOT NULL default 'N/A',
     [B] `privacy_nfo` [/B]varchar(20) NOT NULL default 'private',
      `name_second_owner` varchar(255) NOT NULL default 'N/A',
     [B] `privacy_nso`[/B] varchar(20) NOT NULL default 'private',
      `company_name` varchar(255) NOT NULL default 'N/A',
      [B]`privacy_company` [/B]varchar(20) NOT NULL default 'private',
      `country_id` mediumint(8) NOT NULL default 'N/A',
      `city_id` varchar(255) NOT NULL default 'N/A',
      `street_address` varchar(255) NOT NULL default 'N/A',
     [B] `privacy_street_address`[/B] varchar(20) NOT NULL default 'private',
      `lang_iso` varchar(5) NOT NULL default 'en',    
      PRIMARY KEY  (`id`)
    )
    so when a user requests to get info from a certain retailer, the application would query all the table fields and only show those whose privacy is set to public. but I don't know if this is the right approach.

    If you have any suggestions, please let me know.

    Thank you for reading this far,

    best regards


    bilibytes
  • chemlight
    New Member
    • Jan 2009
    • 33

    #2
    I like the idea of metadata. You could pre-pend the metadata with the public/private value, use whatever delimeter, and then split the value with your script, limiting it to only one split.

    so, for example, email would look like
    private;myemail @email.com

    then, when you queried the table, you would run something like this (I'm using php as an example, not sure what script you're using)
    Code:
    $email = mysql_query("SELECT email FROM users_retailers_account_info WHERE id = "$userid" ");
    $email = mysql_fetch_array(explode(";", $email, 2));
    if($email[0] == "private"){
    //do not display email
    }else{
    //display email
    }
    You could conceivably cut down even more on overhead by reducing private/public to 0/1.

    You would have to make sure though that every time you manipulated the data, you would need to split the string. You could probably just create a function for that...

    Anyway, hope this helps...

    Comment

    • bilibytes
      New Member
      • Jun 2008
      • 128

      #3
      Originally posted by chemlight
      I like the idea of metadata. You could pre-pend the metadata with the public/private value, use whatever delimeter, and then split the value with your script, limiting it to only one split.

      so, for example, email would look like
      private;myemail @email.com

      then, when you queried the table, you would run something like this (I'm using php as an example, not sure what script you're using)
      Code:
      $email = mysql_query("SELECT email FROM users_retailers_account_info WHERE id = "$userid" ");
      $email = mysql_fetch_array(explode(";", $email, 2));
      if($email[0] == "private"){
      //do not display email
      }else{
      //display email
      }
      You could conceivably cut down even more on overhead by reducing private/public to 0/1.

      You would have to make sure though that every time you manipulated the data, you would need to split the string. You could probably just create a function for that...

      Anyway, hope this helps...
      Thank you very much for your help!
      You think it's the most efficient way? even if I have to subtract the privacy value from each fetched column?

      thank you again

      by the way, you exploded the string in the wrong order. you should first use mysq_fetch_arra y($result) and to the result of that function you should explode the value.
      (I know you know how to do it and it was a typing mistake, but i point it out so that new coders don't get misled)

      regards

      Comment

      • chemlight
        New Member
        • Jan 2009
        • 33

        #4
        Not sure if its the best. I don't know how to test script speeds, but thats how I would think of doing it...(note the "newbie" under Chemlight :-) )

        Comment

        • bilibytes
          New Member
          • Jun 2008
          • 128

          #5
          Originally posted by chemlight
          Not sure if its the best. I don't know how to test script speeds, but thats how I would think of doing it...(note the "newbie" under Chemlight :-) )
          lol that newbee just means you are new to the site... :) not necessarily newbee to programming.
          I just posted a thread on php script testing under php section. if you wait that someone answers we'll be able to test our scripts!

          i'll dig deeper in the content privacy table design. thanks for your help anyway

          regards

          Comment

          • chemlight
            New Member
            • Jan 2009
            • 33

            #6
            Originally posted by bilibytes
            I just posted a thread on php script testing under php section. if you wait that someone answers we'll be able to test our scripts!
            regards
            Lol. I just posted a thread on that too...

            :-)

            Comment

            • chemlight
              New Member
              • Jan 2009
              • 33

              #7
              Here's the code that I speed tested:

              Code:
              function microtime_float()
              		{
              			list($usec, $sec) = explode(" ", microtime());
              			return ((float)$usec + (float)$sec);
              		}
              		
              		$time_start = microtime_float();
              		
              		$i = 0;
              		while($i < 1000){
              			$info = mysql_query("SELECT email, email_privacy FROM testtable2 WHERE id = '1'");
              			while($row = mysql_fetch_array($info)){
              				$email = $row['email'];
              				$privacy = $row['email_privacy'];
              			}
              			$i++;
              		}
              		
              		$time_end = microtime_float();
              		$time = $time_end - $time_start;
              		
              		echo "<br />Did nothing in $time seconds\n";
              		
              		$time_start = microtime_float();
              		
              		$i = 0;
              		while($i < 1000){
              			$info = mysql_query("SELECT email FROM testtable1 WHERE id = '1'");
              			while($row = mysql_fetch_array($info)){
              				$emailsplit = explode(";", $row['email'], 2);
              				$email =  $emailsplit[0];
              				$privacy =  $emailsplit[1];
              			}
              			$i++;
              		}
              		
              		$time_end = microtime_float();
              		$time = $time_end - $time_start;
              		
              		echo "<br />Did nothing in $time seconds\n";
              The two differences in the scripts are that the first must query a result from two fields, and doesn't need to explode the variables, whereas the second only needs to query from one field, but has to use explode to get the variables.

              Here were the results from Multiple tests:
              Did nothing in 0.121823072433 seconds
              Did nothing in 0.110490083694 seconds

              Did nothing in 0.120975017548 seconds
              Did nothing in 0.112523078918 seconds

              Did nothing in 0.119637012482 seconds
              Did nothing in 0.11141705513 seconds

              There are 21,000 of the same entry in each table. The table formats/data are
              testtable1
              id | email
              1 | 0;myemail@email .com

              testtable2
              id | email | email_privacy
              1 | myemail@email.c om | 0

              The table sizes (again, with 21,000 entries) are:
              testtable1 = 786.2KiB
              testtable2 = 786.3KiB

              I don't know what is considered worthwhile with speed and overhead, and I'm only testing the difference in overhead with very few fields(although very many records). But, the speed and overhead of using split seems to be better and more efficient.

              Comment

              • bilibytes
                New Member
                • Jun 2008
                • 128

                #8
                wow, i really appreciate what you did!! that is great!
                thank you very very much!

                tomorrow i'll fill my tables with some content and i'll do some tests re adapting your code. and i'll post the verdict here if it is different to yours.

                Comment

                • chemlight
                  New Member
                  • Jan 2009
                  • 33

                  #9
                  I look forward to it!

                  Comment

                  Working...