Dealing with large amounts of data

  • Digety

    Dealing with large amounts of data

    We are looking to store a large amount of user data that will be
    changed and accessed daily by a large number of people. We expect
    around 6-8 million subscribers to our service with each record being
    approximately 2000-2500 bytes. The system needs to be running 24/7
    and therefore cannot be shut down. What is the best way to implement
    this? We were thinking of setting up a cluster of servers to hold the
    information and another cluster to back up the information. Is this
    practical?
    Also, what software is available out there that can distribute query
    calls across different servers and manage large amounts of query
    requests?

    Thank you in advance.

    Ben
  • nib

    #2
    Re: Dealing with large amounts of data

    Digety wrote:
    > We are looking to store a large amount of user data that will be
    > changed and accessed daily by a large number of people. We expect
    > around 6-8 million subscribers to our service with each record being
    > approximately 2000-2500 bytes. The system needs to be running 24/7
    > and therefore cannot be shut down. What is the best way to implement
    > this? We were thinking of setting up a cluster of servers to hold the
    > information and another cluster to backup the information. Is this
    > practical?
    > Also, what software is available out there that can distribute query
    > calls across different servers and to manage large amounts of query
    > requests?
    >
    > Thank you in advance.
    >
    > Ben

    You need to consult with a professional firm that has experience with this
    kind of thing. I wouldn't trust us yahoos! :D

    Zach

    • David Portas

      #3
      Re: Dealing with large amounts of data

      This question is really too big for a newsgroup. The obvious point is
      that if you want to build and run a 24/7 server farm for 8 million
      subscribers, then you'll need a team of architecture, development and
      operations staff with many years of experience in high-volume,
      high-availability environments. So why not ask them this question, since
      they'll be in a better position to understand your requirements?

      SQL Server's clustering support is failover clustering, which addresses
      high availability rather than load balancing, so it may be part of your
      solution. To spread query load across several servers, SQL Server offers
      distributed queries implemented through partitioned views.
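
      To illustrate the idea behind partitioning for load balancing: rows are
      split across servers by ranges of a key, and each request goes to the
      server that owns the matching range. The Python below is only a sketch of
      that routing idea, not how SQL Server implements it. The server names,
      the ID ranges and the function server_for are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical layout: each server owns a contiguous range of subscriber IDs.
# Each entry is (exclusive upper bound of the range, server name).
PARTITIONS = [
    (3_000_000, "server_a"),  # IDs 0 .. 2,999,999
    (6_000_000, "server_b"),  # IDs 3,000,000 .. 5,999,999
    (9_000_000, "server_c"),  # IDs 6,000,000 .. 8,999,999
]
UPPER_BOUNDS = [bound for bound, _ in PARTITIONS]


def server_for(subscriber_id: int) -> str:
    """Return the name of the server that owns the given subscriber ID."""
    if subscriber_id < 0 or subscriber_id >= UPPER_BOUNDS[-1]:
        raise ValueError(f"subscriber_id {subscriber_id} is outside all partitions")
    # bisect_right finds the first range whose upper bound exceeds the ID.
    return PARTITIONS[bisect_right(UPPER_BOUNDS, subscriber_id)][1]


if __name__ == "__main__":
    for sid in (42, 3_000_000, 8_999_999):
        print(sid, "->", server_for(sid))
```

      In a real partitioned view the database does this routing itself, based
      on constraints declared on the partitioning column.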

      You may find some useful information on Microsoft's SQL Server
      scalability site.


      --
      David Portas
      SQL Server MVP
      --


      • Greg D. Moore (Strider)

        #4
        Re: Dealing with large amounts of data


        "Digety" <beeno524@hotmail.com> wrote in message
        news:82d12e6b.0410181158.51e21079@posting.google.com...
        > We are looking to store a large amount of user data that will be
        > changed and accessed daily by a large number of people. We expect
        > around 6-8 million subscribers to our service with each record being
        > approximately 2000-2500 bytes. The system needs to be running 24/7
        > and therefore cannot be shut down. What is the best way to implement
        > this? We were thinking of setting up a cluster of servers to hold the
        > information and another cluster to backup the information. Is this
        > practical?

        Practical? What's your budget? What are your response time requirements?

        Clustering ain't cheap.

        6-8 million rows isn't all that much btw. What's more important is how much
        it changes.
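
        As a rough back-of-envelope check, here is the raw data volume implied
        by the figures in the question. This is only a sketch: it assumes the
        quoted record sizes are accurate and ignores indexes, logs and storage
        overhead. The function name is made up for the example.

```python
def dataset_size_gib(rows: int, bytes_per_row: int) -> float:
    """Raw data size in GiB, ignoring indexes and storage overhead."""
    return rows * bytes_per_row / 2**30


# Figures from the original question: 6-8 million records of 2000-2500 bytes.
low = dataset_size_gib(6_000_000, 2000)
high = dataset_size_gib(8_000_000, 2500)

if __name__ == "__main__":
    print(f"Raw data: {low:.1f} to {high:.1f} GiB")  # 11.2 to 18.6 GiB
```

        So the raw data comes to somewhere between roughly 11 and 19 GiB.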

        > Also, what software is available out there that can distribute query
        > calls across different servers and to manage large amounts of query
        > requests?

        If you're just doing queries, my guess is you won't need this.

        To give you an example, I've got a quad-CPU Xeon box (700 MHz) that runs at
        about 50% CPU these days (some new code just helped). It INSERTs, I think,
        17 million rows a day (which then get moved to another server overnight).
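
        For scale, that daily figure can be turned into an average rate. A small
        sketch, assuming the inserts were spread evenly over 24 hours (real load
        is usually burstier); the function name is made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rows_per_second(rows_per_day: int) -> float:
    """Average insert rate, assuming an even spread over the whole day."""
    return rows_per_day / SECONDS_PER_DAY


rate = average_rows_per_second(17_000_000)

if __name__ == "__main__":
    print(f"About {rate:.0f} rows per second on average")  # About 197
```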

        >
        > Thank you in advance.
        >
        > Ben


        • Digety

          #5
          Re: Dealing with large amounts of data

          Obviously you can tell that I don't know much about this subject, so
          I'm sorry for my ignorance. What types of firms out there can handle
          something like this? If you could give me some examples, that would
          be great. Thank you so much.

          [Ben]

          • David Portas

            #6
            Re: Dealing with large amounts of data

            Have you already chosen SQL Server as your database platform? If so,
            contact Microsoft Sales, explain your requirements and ask them to suggest a
            vendor in your area. I suggest you also hire someone with experience in
            database systems implementation to liaise with the vendor.

            If you don't have any given constraints as to what hardware and software
            platform to use then you may want to take some advice from an independent IT
            consultant to help you decide on the right technology before you talk to
            vendors.

            --
            David Portas
            SQL Server MVP
            --


            • filesiteguy

              #7
              Re: Dealing with large amounts of data

              David Portas scratched out in the sand
              > Have you already chosen SQL Server as your database platform? If so then
              > contact Microsoft Sales, explain your requirements and ask them to suggest
              > a vendor in your area. I suggest you also hire someone with experience in
              > database systems implementation to liaise with the vendor.
              >
              > If you don't have any given constraints as to what hardware and software
              > platform to use then you may want to take some advice from an independent
              > IT consultant to help you decide on the right technology before you talk
              > to vendors.
              >

              As someone else mentioned, your volume really isn't that bad. SQL Server (or
              Oracle, MySQL, PostgreSQL) could handle it on one server very easily. I've
              done this many times over on MSSQL and MySQL. Both handle the volume on one
              server without a hitch.

              If you're worried about 24/7 operations, however, I'd shy away from
              MS-based solutions, as they get pricy quickly and are difficult to
              maintain. Though Windows is a decent departmental-level solution, I've seen
              too many cases where $$$ was thrown at it to make it 24/7 and it still
              failed.

              I'd lean more towards a Unix-based system, simply because they're better
              designed for datacenter operations.
              --
              kai - kai at 3gproductions dot com
              www.gamephreakz.com || www.filesite.org
              "friends don't let friends use windows xp"

              • Greg D. Moore (Strider)

                #8
                Re: Dealing with large amounts of data


                "filesiteguy" <abuse@127.0.0.1> wrote in message
                news:10na2n33jaj678c@corp.supernews.com...
                >
                > As someone else mentioned, your volume really isn't that bad. SQL Server
                > (or Oracle, MySQL, PostgreSQL) could handle it on one server very easily.
                > I've done this many times over on MSSQL and MySQL. Both handle the volume
                > on one server without a hitch.
                >

                I'd agree with this.

                > If you're worried about 24/7 operations, however, I'd shy away from
                > MS-based solutions, as they get pricy quickly and are difficult to
                > maintain. Though Windows is a decent departmental level solution, I've
                > seen too many cases where $$$ was thrown at it to make it 24/7 and still
                > had it fail.

                But not this. Over the past few years our main production SQL Server has
                had 100% uptime (except for a planned move and a few planned
                maintenance windows).

                A lot of achieving 24/7 really comes down to planning, whether on Unix,
                Windows or anything else.
                >
                > I'd lean more towards a Unix-based system, simply because they're better
                > designed for datacenter operations.
                > --
                > kai - kai at 3gproductions dot com
                > www.gamephreakz.com || www.filesite.org
                > "friends don't let friends use windows xp"


                • filesiteguy

                  #9
                  Re: Dealing with large amounts of data

                  Greg D. Moore (Strider) scratched out in the sand

                  >> If you're worried about 24/7 operations, however, I'd shy away from
                  >> MS-based solutions, as they get pricy quickly and are difficult to
                  >> maintain. Though Windows is a decent departmental level solution, I've
                  >> seen too many cases where $$$ was thrown at it to make it 24/7 and still
                  >> had it fail.
                  >
                  > But not this. Our main production SQL Server had over the course of the
                  > past few years a 100% uptime (except for a planned move and a few planned
                  > maintenances).
                  >
                  > A lot of 24/7 really goes into planning, Unix, Windows or otherwise.
                  >

                  If you were down due to maintenance, then it isn't 100% uptime. Yes,
                  planning is VERY important. I have just seen that Windows works better
                  when planned as a small departmental solution and doesn't fit as an
                  enterprise system.

                  Eventually you'll learn. You'll also learn not to use Outlook Express,
                  someday.

                  --
                  kai - kai at 3gproductions dot com
                  www.gamephreakz.com || www.filesite.org
                  "friends don't let friends use windows xp"

                  • David Portas

                    #10
                    Re: Dealing with large amounts of data

                    > I just have seen that Windows is better being
                    > planned as a small departmental solution and doesn't fit as an enterprise
                    > system.

                    That tells us more about your experience than about Windows. Your
                    generalization is belied by the reality of thousands of major enterprises
                    whose experience apparently differs from yours.

                    --
                    David Portas
                    SQL Server MVP
                    --


                    • Greg D. Moore (Strider)

                      #11
                      Re: Dealing with large amounts of data


                      "filesiteguy" <abuse@127.0.0.1> wrote in message
                      news:10nhuntd4pj6d1e@corp.supernews.com...
                      > Greg D. Moore (Strider) scratched out in the sand
                      >
                      > >> If you're worried about 24/7 operations, however, I'd shy away from
                      > >> MS-based solutions, as they get pricy quickly and are difficult to
                      > >> maintain. Though Windows is a decent departmental level solution, I've
                      > >> seen too many cases where $$$ was thrown at it to make it 24/7 and still
                      > >> had it fail.
                      > >
                      > > But not this. Our main production SQL Server had over the course of the
                      > > past few years a 100% uptime (except for a planned move and a few
                      > > planned maintenances).
                      > >
                      > > A lot of 24/7 really goes into planning, Unix, Windows or otherwise.
                      > >
                      >
                      > If you were down due to maintenance, then it isn't 100% uptime.

                      Depends on your definition. And your budget (i.e. whether you have the
                      budget for the hardware and software solutions that allow zero-downtime
                      maintenance, same as in a Unix shop).
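
                      To put numbers on the definition question, here is a small sketch of how
                      an availability percentage follows from downtime hours. The 8 hours of
                      maintenance per year is a hypothetical figure chosen for illustration, not
                      a number from this thread, and the function name is made up.

```python
HOURS_PER_YEAR = 365 * 24


def availability_percent(downtime_hours: float,
                         period_hours: float = HOURS_PER_YEAR) -> float:
    """Share of the period, in percent, during which the system was up."""
    return 100.0 * (period_hours - downtime_hours) / period_hours


# Hypothetical: 8 hours of maintenance in a year, counted as downtime.
with_maintenance = availability_percent(8)
# The same year if planned maintenance is excluded from the definition.
excluding_planned = availability_percent(0)

if __name__ == "__main__":
    print(f"Counting maintenance:  {with_maintenance:.2f}%")
    print(f"Excluding maintenance: {excluding_planned:.2f}%")
```

                      Whether planned windows count as downtime is exactly the definitional
                      choice being argued over here.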

                      And let's put it this way, the Windows solution has had far better uptime
                      than the Unix/Oracle solution used in a different division.
                      > Yes,
                      > planning is VERY important. I just have seen that Windows is better being
                      > planned as a small departmental solution and doesn't fit as an enterprise
                      > system.
                      >
                      > Eventually you'll learn. You'll also learn not to use Outlook Express,
                      > someday.

                      And perhaps someday you'll learn to not be so condescending.

                      >
                      > --
                      > kai - kai at 3gproductions dot com
                      > www.gamephreakz.com || www.filesite.org
                      > "friends don't let friends use windows xp"

