text file vs. cPickle vs sqlite a design question

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Dag

    text file vs. cPickle vs sqlite a design question

    I have an application which works with lists of tuples of the form
    (id_nr,'text',' more text',1 or 0). I'll have maybe 20-50 or so of these
    lists containing anywhere from 3 to over 30000 tuples. The actions I
    need to do is either append a new tuple to the end of the list, display
    all the tuples or display all the tuples where the last element is a 1

    Basically what I'm wondering is the best way to store these data stuctures
    to disc. As the subject mentioned I've basically got three approaches.
    Store each list as a text file, pickle each list to file or shove the
    whole thing into a bunch of database tables. I can see pros and cons
    with each approach. Does anybody have any advice as to whether any of
    these approaches is obviously better than any other? On one hand I like
    the text file approach since it lets me append without loading
    everything into memory, on the other hand the sqlite approach makes it
    easy to select stuff with SELECT * FROM foo WHERE... which could be
    handy if ever need to add more advanced filtering.

    Dag
  • Gabriel Genellina

    #2
    Re: text file vs. cPickle vs sqlite a design question

    En Wed, 11 Apr 2007 13:40:02 -0300, Dag <f98dawa@dd.cha lmers.seescribi ó:
    I have an application which works with lists of tuples of the form
    (id_nr,'text',' more text',1 or 0). I'll have maybe 20-50 or so of these
    lists containing anywhere from 3 to over 30000 tuples. The actions I
    need to do is either append a new tuple to the end of the list, display
    all the tuples or display all the tuples where the last element is a 1
    >
    Basically what I'm wondering is the best way to store these data
    stuctures
    to disc. As the subject mentioned I've basically got three approaches.
    Store each list as a text file, pickle each list to file or shove the
    whole thing into a bunch of database tables. I can see pros and cons
    with each approach. Does anybody have any advice as to whether any of
    From your description, none of these three approaches is obviously better.
    Try to isolate the data from its storage, and use the easiest way now
    (pickle perhaps?).
    This way you can change it later easily - maybe to use sqlite if you need
    more difficult queries.

    --
    Gabriel Genellina

    Comment

    • Bruno Desthuilliers

      #3
      Re: text file vs. cPickle vs sqlite a design question

      Dag a écrit :
      I have an application which works with lists of tuples of the form
      (id_nr,'text',' more text',1 or 0). I'll have maybe 20-50 or so of these
      lists containing anywhere from 3 to over 30000 tuples. The actions I
      need to do is either append a new tuple to the end of the list, display
      all the tuples or display all the tuples where the last element is a 1
      >
      Basically what I'm wondering is the best way to store these data stuctures
      to disc. As the subject mentioned I've basically got three approaches.
      Store each list as a text file, pickle each list to file or shove the
      whole thing into a bunch of database tables. I can see pros and cons
      with each approach. Does anybody have any advice as to whether any of
      these approaches is obviously better than any other? On one hand I like
      the text file approach since it lets me append without loading
      everything into memory, on the other hand the sqlite approach makes it
      easy to select stuff with SELECT * FROM foo WHERE... which could be
      handy if ever need to add more advanced filtering.
      Given your specs, I'd go for SQLite without any hesitation. Your data
      structure is obviously relational (a list of tuples is a pretty good
      definition of a relation), so a relational DBMS is the obvious solution,
      and you'll get lots of other benefits from it (SQL being only one of
      them - you can also think about free optimization, scalability, and
      interoperabilit y). And if you don't like raw SQL and prefer something
      more pythonic, then you have SQLAlchemy and Elixir.

      My 2 cents...

      Comment

      • John Machin

        #4
        Re: text file vs. cPickle vs sqlite a design question

        On Apr 12, 7:09 am, Bruno Desthuilliers
        <bdesth.quelque ch...@free.quel quepart.frwrote :
        Dag a écrit :
        >
        >
        >
        I have an application which works with lists of tuples of the form
        (id_nr,'text',' more text',1 or 0). I'll have maybe 20-50 or so of these
        lists containing anywhere from 3 to over 30000 tuples. The actions I
        need to do is either append a new tuple to the end of the list, display
        all the tuples or display all the tuples where the last element is a 1
        >
        Basically what I'm wondering is the best way to store these data stuctures
        to disc. As the subject mentioned I've basically got three approaches.
        Store each list as a text file, pickle each list to file or shove the
        whole thing into a bunch of database tables. I can see pros and cons
        with each approach. Does anybody have any advice as to whether any of
        these approaches is obviously better than any other? On one hand I like
        the text file approach since it lets me append without loading
        everything into memory, on the other hand the sqlite approach makes it
        easy to select stuff with SELECT * FROM foo WHERE... which could be
        handy if ever need to add more advanced filtering.
        s/if/when/
        >
        Given your specs, I'd go for SQLite without any hesitation. Your data
        structure is obviously relational (a list of tuples is a pretty good
        definition of a relation), so a relational DBMS is the obvious solution,
        and you'll get lots of other benefits from it (SQL being only one of
        them - you can also think about free optimization, scalability, and
        interoperabilit y). And if you don't like raw SQL and prefer something
        more pythonic, then you have SQLAlchemy and Elixir.
        >
        My 2 cents...
        .... and a few more cents:

        There are *two* relations/tables involved (at least): a "tuple" table
        and a "list" table. The 20-50 or so lists need a unique name or number
        each, and other attributes of a list are sure to come out of the
        woodwork later. Each tuple will need a column containing the ID of the
        list it belongs to. It's a bit boggling that (1) each tuple has an
        id_nr but there's no requirement to query on it (2) req. only to
        "append" new tuples w/o checking id_nr already exists (3) req. to
        "display" all of 30,000 tuples ...


        Comment

        • Bruno Desthuilliers

          #5
          Re: text file vs. cPickle vs sqlite a design question

          John Machin a écrit :
          (snip)
          ... and a few more cents:
          >
          There are *two* relations/tables involved (at least): a "tuple" table
          and a "list" table.
          <ot mode="thinking out loud">
          Mmm... From a purely technical POV, not necessarily. If there's no need
          for anything else than distinguishing between different lists, a single
          table with a compound key (list_id, tuple_id) could be enough...
          </ot>

          Comment

          • Paddy

            #6
            Re: text file vs. cPickle vs sqlite a design question

            On Apr 11, 5:40 pm, Dag <f98d...@dd.cha lmers.sewrote:
            I have an application which works with lists of tuples of the form
            (id_nr,'text',' more text',1 or 0). I'll have maybe 20-50 or so of these
            lists containing anywhere from 3 to over 30000 tuples. The actions I
            need to do is either append a new tuple to the end of the list, display
            all the tuples or display all the tuples where the last element is a 1
            >
            Basically what I'm wondering is the best way to store these data stuctures
            to disc. As the subject mentioned I've basically got three approaches.
            Store each list as a text file, pickle each list to file or shove the
            whole thing into a bunch of database tables. I can see pros and cons
            with each approach. Does anybody have any advice as to whether any of
            these approaches is obviously better than any other? On one hand I like
            the text file approach since it lets me append without loading
            everything into memory, on the other hand the sqlite approach makes it
            easy to select stuff with SELECT * FROM foo WHERE... which could be
            handy if ever need to add more advanced filtering.
            >
            Dag
            If you have enough resources to keep all the lists comfortably in
            memory, and you have enough disk space then I would save your data as
            python text. Something like:

            print "# <What the data is and how it is formatted>"
            print "all_lists = []"
            for i,l in enumerate(all_l ists):
            print "all_lists.appe nd( [ #", i
            for tpl in l:
            print " ", tpl, ","
            print " ]) #", i

            You would then have your data saved in a format that could easily
            be re-used by other programs at a later date, and that can be
            examined in any text editor.

            - Paddy.

            Comment

            • Bruno Desthuilliers

              #7
              Re: text file vs. cPickle vs sqlite a design question

              Dag a écrit :
              I have an application which works with lists of tuples of the form
              (id_nr,'text',' more text',1 or 0). I'll have maybe 20-50 or so of these
              lists containing anywhere from 3 to over 30000 tuples. The actions I
              need to do is either append a new tuple to the end of the list, display
              all the tuples or display all the tuples where the last element is a 1
              >
              Basically what I'm wondering is the best way to store these data stuctures
              to disc. As the subject mentioned I've basically got three approaches.
              Store each list as a text file, pickle each list to file or shove the
              whole thing into a bunch of database tables. I can see pros and cons
              with each approach. Does anybody have any advice as to whether any of
              these approaches is obviously better than any other?
              Seems that so far, you get as many different opinion as answers - not
              sure this will help much :-/

              Comment

              Working...