Comparison of Database performance

  • waqasahmd
    New Member
    • Nov 2009
    • 13


    Hi all,

    I need to propose a DBMS. The requirement: the database holds 1.6 million rows with 15 simple columns, and at worst we need to compare 10,000 rows (transactions) one by one against those 1.6 million. Only 5 columns are compared, and the comparison filters out some transactions.

    We need the results back within two days. We are using Java, most probably on some Linux flavour, and assume indexing etc. is applied.

    So, which database would you suggest we use? Can MySQL or PostgreSQL handle this type of transaction processing? And is there a link where I can get calculations on transaction throughput and performance for MySQL and PostgreSQL?

    Thanx
  • dgreenhouse
    Recognized Expert Contributor
    • May 2008
    • 250

    #2
    Well, that's 80 billion column comparisons, so it will take a while:
    1.6M X 10,000 X 5 = 80,000,000,000

    If you have a machine already set up, I'd just create some dummy data and 'let her rip!' :-)

    i.e.

    1- Create two tables
    a- Table A - 15 columns
    b- Table B - 5-15 columns

    2- Generate some dummy data in tables A & B

    3- Create a comparison algorithm

    4- Have timing code embedded in the script

    5- PostgreSQL would be my first choice for this as you'd need to segment the results if MySQL is used.
    Postgres database specs:

    Limit                       Value
    Maximum Database Size       Unlimited
    Maximum Table Size          32 TB
    Maximum Row Size            1.6 TB
    Maximum Field Size          1 GB
    Maximum Rows per Table      Unlimited
    Maximum Columns per Table   250-1600, depending on column types
    Maximum Indexes per Table   Unlimited

    6- If you have a beefy enough machine, you could possibly create a comparison matrix table from a single select statement using Postgres or some other large db engine, but you'd probably end up with connection timeouts.

    Regardless, this will run for a while.
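If the 5 compared columns only need exact matching, steps 2-4 can even be prototyped in plain Java before picking a database engine. A minimal sketch, assuming made-up integer columns and a hypothetical `key` helper, that builds table A as an in-memory hash set so each of the 10,000 transactions costs one lookup instead of a 1.6M-row scan:

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class DummyFilter {
    // Hypothetical key helper: join the 5 compared columns into one string.
    static String key(int c1, int c2, int c3, int c4, int c5) {
        return c1 + "|" + c2 + "|" + c3 + "|" + c4 + "|" + c5;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);          // fixed seed: repeatable dummy data
        int dbRows = 1_600_000, txRows = 10_000;

        // Step 2: dummy data for table A, keyed on its 5 compared columns.
        Set<String> tableA = new HashSet<>(dbRows * 2);
        for (int i = 0; i < dbRows; i++) {
            tableA.add(key(rnd.nextInt(1000), rnd.nextInt(1000), rnd.nextInt(1000),
                           rnd.nextInt(1000), rnd.nextInt(1000)));
        }

        // Steps 3-4: filter the 10,000 transactions with one hash lookup each,
        // timing the whole pass.
        long t0 = System.nanoTime();
        int matched = 0;
        for (int i = 0; i < txRows; i++) {
            if (tableA.contains(key(rnd.nextInt(1000), rnd.nextInt(1000), rnd.nextInt(1000),
                                    rnd.nextInt(1000), rnd.nextInt(1000)))) {
                matched++;
            }
        }
        long elapsedMs = (System.nanoTime() - t0) / 1_000_000;
        System.out.println(matched + " of " + txRows + " transactions matched in "
                           + elapsedMs + " ms");
    }
}
```

Whether a real DBMS gets the same behaviour depends on it choosing a hash or index join across the 5 columns, which is exactly what the dummy-data run would reveal.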

    Finally, you should be able to do a rough estimate from a single comparison unit.
    i.e.
    1- Set up timing (in microseconds)
    2- Compare five columns
    3- Update timing and output the timing results
    4- Multiply this time by 16,000,000,000 (1.6M rows X 10,000 transactions = 1.6e10 row pairs)
    5- It won't take into consideration the operating system, disk access, and other network issues that will arise during a full run, but it'll give you a rough estimate of what you're in for.
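The five steps above can be sketched in Java (the row data and repetition count are made up, and JIT warm-up and disk/network effects are ignored per step 5, so treat the result as a rough floor):

```java
public class CompareEstimate {
    // One comparison unit: compare five columns of two rows (made-up data).
    static boolean compareFive(int[] a, int[] b) {
        for (int c = 0; c < 5; c++) {
            if (a[c] != b[c]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 3, 4, 5};
        int[] y = {1, 2, 3, 4, 9};   // differs only in the last column: worst case

        // Steps 1-3: time many comparison units and take the average.
        int reps = 10_000_000;
        boolean sink = false;        // keep the JIT from eliding the loop
        long t0 = System.nanoTime();
        for (int i = 0; i < reps; i++) sink ^= compareFive(x, y);
        double nsPerUnit = (System.nanoTime() - t0) / (double) reps;

        // Step 4: extrapolate to all 1.6e10 row pairs (1.6M x 10,000).
        long pairs = 1_600_000L * 10_000L;
        double hours = nsPerUnit * pairs / 1e9 / 3600.0;
        System.out.printf("~%.2f ns per comparison -> ~%.1f hours of raw compare time (sink=%b)%n",
                          nsPerUnit, hours, sink);
    }
}
```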
