A BILLION pictures...how do they do it?

  • cbmeeks

    A BILLION pictures...how do they do it?

    I am writing my own family photo sharing site that I hope to take
    public (like so many others). Anyway, currently, when the user
    uploads a picture, I store the picture outside my htdocs folder and
    record the image details in a MySQL db. When you browse the picture,
    I read the record and build the image by sending an image/jpeg header.
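
The flow described above (look up the record, read the file from outside the web root, send it with an image header) can be sketched roughly as follows. This is a minimal sketch in Python rather than PHP; `STORAGE_ROOT`, `build_image_response`, and the `relative_path` column are hypothetical names chosen for illustration, not anything from the original setup.

```python
import mimetypes
from pathlib import Path

# Hypothetical storage root that lives outside the web server's document root.
STORAGE_ROOT = Path("/var/photos")

def build_image_response(record, storage_root=STORAGE_ROOT):
    """Return (headers, body) for the image described by a database record.

    `record` stands in for a row fetched from the images table; the
    `relative_path` key is an assumed column name.
    """
    root = Path(storage_root).resolve()
    path = (root / record["relative_path"]).resolve()
    # Refuse paths that escape the storage root (e.g. "../x.jpg").
    if root not in path.parents:
        raise ValueError("path escapes storage root")
    body = path.read_bytes()
    # Fall back to a generic type when the extension is unknown.
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    headers = {"Content-Type": content_type, "Content-Length": str(len(body))}
    return headers, body
```

Every request handled this way passes through application code, which is one plausible reason it feels slower than letting the web server deliver a static file.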

    Seems to work but I am a little disappointed with performance.
    Granted I am running on a really old machine which might be the
    reason. lol

    Seriously though, if I take this public and get extremely lucky and
    millions of photos are uploaded, would this be the best method?

    I've read pros and cons of storing images in a database. I've read
    about Flickr, SmugMug, Photobucket having HUNDREDS of millions to over
    a BILLION images stored!

    Obviously, load balancing plays into this but what other secrets do
    you think they use?

    One thing I worry about is my file system. I have something like:

    pix
    -----user1
    -------------thumbs
    -----user2
    -------------thumbs

    etc...

    Any pointers would be appreciated.
    Thanks

    cbmeeks

  • Jerry Stuckle

    #2
    Re: A BILLION pictures...how do they do it?

    First of all, you should be asking this in a database newsgroup, not a
    PHP one. And preferably a newsgroup aimed at the database you're using.

    I store pictures in databases. It works quite well. Takes some tuning,
    but I find it provides good performance.

    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    JDS Computer Training Corp.
    jstucklex@attglobal.net
    ==================


    • cbmeeks

      #3
      Re: A BILLION pictures...how do they do it?

      > First of all, you should be asking this in a database newsgroup, not a
      > PHP one. And preferably a newsgroup aimed at the database you're using.

      Well, that's assuming I would only use MySQL and not PHP to serve my
      files. :-)

      > I store pictures in databases. It works quite well. Takes some tuning,
      > but I find it provides good performance.

      Yeah, I'm not surprised you replied. I have been reading some of your
      posts about images in db's. You really have me thinking about images
      in db's. I have to admit, I am walking on top of the fence and could
      jump to either side when it comes to file system/db for storing
      images. I agree with your postings about actually doing it instead of
      quoting theories.

      Scalability is very important but it's not the only thing.
      Portability is also important. I am thinking of using Amazon's S3
      (which I believe is a flat file system). But the bad thing about
      using Amazon is that I put all of my eggs in one basket. They just
      recently had a price change that made a lot of people happy but not
      all...point is, they did that because they can.

      I would love to be the fly on the wall at Amazon, eBay, Google, etc
      and see how they store images. I know Google has their BigTable.

      I guess I should follow by example. SmugMug uses their own internal
      system that is helped along with S3. But I have no idea of how much
      they serve from S3 or if they just use S3 as a backup.

      Oh well, sorry for the rambling.

      cbmeeks



      • max.schulze@googlemail.com

        #4
        Re: A BILLION pictures...how do they do it?

        You should read the database docs. In the case of MySQL, if you index your
        table and use the right MySQL database type, you will get better
        performance by storing images in the database.
        Also, if you run a very large site, your database servers will usually run
        on SCSI machines, which means the database often has faster hard drives
        than your web server.
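
As a rough illustration of storing images in an indexed table, here is a minimal sketch. It assumes SQLite (from Python's standard library) as a stand-in for MySQL so that it runs without a server; the table name, column names, and helper functions are all hypothetical.

```python
import sqlite3

def open_store(db_path=":memory:"):
    """Open the database and create the (hypothetical) images table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        " id INTEGER PRIMARY KEY,"
        " owner TEXT NOT NULL,"
        " mime TEXT NOT NULL,"
        " data BLOB NOT NULL)"
    )
    # Index the column used for browsing so lookups by owner avoid a full scan.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_images_owner ON images(owner)")
    return conn

def save_image(conn, owner, mime, data):
    """Insert the raw image bytes and return the new row id."""
    cur = conn.execute(
        "INSERT INTO images (owner, mime, data) VALUES (?, ?, ?)",
        (owner, mime, sqlite3.Binary(data)),
    )
    conn.commit()
    return cur.lastrowid

def load_image(conn, image_id):
    """Return (mime, bytes) for the image, or None if it does not exist."""
    row = conn.execute(
        "SELECT mime, data FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        return None
    mime, data = row
    return mime, bytes(data)
```

Whether this outperforms the file system depends on tuning and hardware, so it is worth benchmarking both on your own data.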


        • Jerry Stuckle

          #5
          Re: A BILLION pictures...how do they do it?

          cbmeeks wrote:
          > Well, that's assuming I would only use MySQL and not PHP to serve my
          > files. :-)

          Either way you're going to have to use PHP (or Perl or some other language) to
          serve the images up. But the database design and configuration is the
          more important thing here. That's why I suggested a database newsgroup.
          It's a better place to discuss these things.



          • cbmeeks

            #6
            Re: A BILLION pictures...how do they do it?

            Max/Jerry:

            Oh believe me, I would like to use the DB and I will certainly try it
            and run some performance testing.

            > serve the images up. But the database design and configuration is the
            > more important thing here. That's why I suggested a database newsgroup.
            > It's a better place to discuss these things.

            Agreed. I just don't like to cross post and I knew that PHP and MySQL
            would be involved. That's why I started here first.


            • NC

              #7
              Re: A BILLION pictures...how do they do it?

              On Jun 12, 5:27 am, cbmeeks <cbme...@gmail.com> wrote:
              > I am writing my own family photo sharing site that
              > I hope to take public (like so many others). Anyway,
              > currently, when the user uploads a picture, I store
              > the picture outside my htdocs folder and record the
              > image details in a MySQL db. When you browse the
              > picture, I read the record and build the image by
              > sending an image/jpeg header.
              >
              > Seriously though, if I take this public and get
              > extremely lucky and millions of photos are uploaded,
              > would this be the best method?
              >
              > I've read pros and cons of storing images in a database.
              > I've read about Flickr, SmugMug, Photobucket having
              > HUNDREDS of millions to over a BILLION images stored!
              >
              > Obviously, load balancing plays into this but what
              > other secrets do you think they use?

              Separating (static) pictures from other (dynamic) content. Say you
              have two servers, one with PHP/MySQL (let's call it www.yoursite.com),
              another with nothing but Apache (content.yoursite.com), optimized for
              serving static images. The application residing on www.yoursite.com
              saves images onto content.yoursite.com and records their full URLs
              (http://content.yoursite.com/path/file.jpg) in its database. When
              content.yoursite.com gets low on available disk space, you put up a
              new server (content2.yoursite.com) for writing and start filling it up
              with pictures, while content.yoursite.com still remains accessible for
              reading. You can continue to add new content*.yoursite.com servers as
              you go. Dynamically generated HTML gets served from www.yoursite.com
              (which may eventually outgrow a single server and morph into a server
              cluster); static images, from content*.yoursite.com.
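
The fill-one-server-then-open-the-next scheme above could look roughly like this. It is a minimal in-memory sketch in Python, not a real deployment: `ContentServer`, `ImageStore`, the capacity figures, and the dictionaries standing in for disks and the database table are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ContentServer:
    host: str                 # e.g. "content.yoursite.com"
    capacity_bytes: int
    used_bytes: int = 0
    files: dict = field(default_factory=dict)  # stands in for the server's disk

    def has_room(self, size):
        return self.used_bytes + size <= self.capacity_bytes

class ImageStore:
    def __init__(self, servers):
        self.servers = list(servers)
        self.write_index = 0        # server currently open for writing
        self.url_by_image_id = {}   # stands in for the database table

    def save(self, image_id, path, data):
        # Move on to the next server once the current one is full;
        # earlier servers stay online and keep serving reads.
        while not self.servers[self.write_index].has_room(len(data)):
            if self.write_index + 1 >= len(self.servers):
                raise RuntimeError("no content server has room; add another")
            self.write_index += 1
        server = self.servers[self.write_index]
        server.files[path] = data
        server.used_bytes += len(data)
        # Record the full URL so pages can link straight to the static host.
        url = f"http://{server.host}/{path}"
        self.url_by_image_id[image_id] = url
        return url
```

Because the full URL is stored, existing records keep pointing at the old server after a new one is opened for writing.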

              A slight variation of this approach is that multiple servers are open
              for writing at any given time; images are written onto a randomly
              chosen server. This helps ensure that highly popular content will be
              spread between multiple servers and can thus be served faster.
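
The random-server variation can be sketched in a few lines. This is only an illustration in Python; `pick_write_server` and `image_url` are hypothetical helper names, and the host list is assumed to come from your own configuration.

```python
import random

def pick_write_server(open_hosts, rng=random):
    """Pick one of the hosts currently open for writing, uniformly at random."""
    if not open_hosts:
        raise ValueError("no content servers are open for writing")
    return rng.choice(open_hosts)

def image_url(host, path):
    """Build the full URL that would be recorded in the database."""
    return f"http://{host}/{path}"
```

The chosen host still has to be recorded with each image (for example as part of its full URL), since the location can no longer be inferred from upload order.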

              Yet another possibility is to hide your application behind a layer of
              caching proxies...

              > One thing I worry about is my file system. I have
              > something like:
              >
              > pix
              > -----user1
              > -------------thumbs
              > -----user2
              > -------------thumbs

              There's absolutely no need for the file structure to replicate your
              database structure...
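
One common way to decouple the directory layout from the database structure is to derive the path from a hash of the image ID. NC does not spell out a layout, so treat the following Python sketch as an assumption; `sharded_path` and its parameters are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath

def sharded_path(image_id, extension="jpg", levels=2, width=2):
    """Map an image ID to a nested directory path derived from its hash.

    With levels=2 and width=2 this yields 256 * 256 possible directories,
    which keeps any single directory from growing very large.
    """
    digest = hashlib.sha1(str(image_id).encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, f"{image_id}.{extension}")
```

The same ID always maps to the same path, so the database only needs to store the ID (plus the host, if images are spread over several servers).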

              Cheers,
              NC


              • Jerry Stuckle

                #8
                Re: A BILLION pictures...how do they do it?

                cbmeeks wrote:
                > Agreed. I just don't like to cross post and I knew that PHP and MySQL
                > would be involved. That's why I started here first.

                Ah, but cross-posting is the ONLY way to fly! :-)


                • cbmeeks

                  #9
                  Re: A BILLION pictures...how do they do it?

                  That makes sense. I see many of the big sites use
                  "static123.example.com".



