Are lists at least as efficient as dictionaries?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Narendra C. Tulpule

    Are lists at least as efficient as dictionaries?

    Hi,
    if you know the Python internals, here is a newbie question for you.
    If I have a list with 100 elements, each element being a long string,
    is it more efficient to maintain it as a dictionary (with a key = a
    string from the list and value = None) for the purpose of insertion
    and removal?
    Basically, if Python really implements lists as linked lists but
    dictionaries as hash tables, it may well be that hashing a key takes
    negligible time as compared to comparing it against every list
    element.
    Oh and here is (as a non-sequiter) something I don't understand
    either:[color=blue][color=green][color=darkred]
    >>> x = ([],)
    >>> x[0] += ['something'][/color][/color][/color]
    Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    TypeError: object doesn't support item assignment[color=blue][color=green][color=darkred]
    >>> x[/color][/color][/color]
    (['something'],) <---- complained but did it anyway??[color=blue][color=green][color=darkred]
    >>> x[0].append('and another thing') <------- no complaint!
    >>> x[/color][/color][/color]
    (['something', 'and another thing'],)
  • Peter Otten

    #2
    Re: Are lists at least as efficient as dictionaries?

    Narendra C. Tulpule wrote:
    [color=blue]
    > If I have a list with 100 elements, each element being a long string,
    > is it more efficient to maintain it as a dictionary (with a key = a
    > string from the list and value = None) for the purpose of insertion
    > and removal?[/color]

    Wild guess: I suppose that both implementations will not significantly
    affect the overall speed of your application, e.g. if the strings are
    *really* large (as opposed to the list of *only* 100 elements), reading
    from disk will take much longer than inserting into the list, even at
    arbitrary positions.

    Also, note that no particular order is preserved for the keys in a
    dictionary.

    And now for something completely different:
    [color=blue][color=green][color=darkred]
    >>>> x = ([],)
    >>>> x[0] += ['something'][/color][/color]
    > Traceback (most recent call last):
    > File "<stdin>", line 1, in ?
    > TypeError: object doesn't support item assignment[/color]

    += calls the list.__iadd__(s elf, other) method, which seems to be
    implemented as

    def __iadd__(self, other):
    self.append(oth er)
    return self

    The call of this method succeds, but the following assignment fails, because
    tuples are immutable. This could only be remedied if all assignments

    a = a

    were silently ignored, or, if the += operator would not perform an
    assignment, which has the disadvantage that it would no longer work for
    immutables, so that[color=blue][color=green][color=darkred]
    >>> i = 1
    >>> i += 1
    >>> print i[/color][/color][/color]
    1 # admittedly faked[color=blue][color=green][color=darkred]
    >>>[/color][/color][/color]

    You could change __iadd__() to

    def __iadd__(self, other):
    return self + other

    but then sane behaviour in one special case comes at the cost of always
    creating a copy of a potentially large (say more than 100 items :-) list.

    By the way, one topic per post is always a good idea :-)

    Peter

    Comment

    • Alex Martelli

      #3
      Re: Are lists at least as efficient as dictionaries?

      Narendra C. Tulpule wrote:
      [color=blue]
      > Hi,
      > if you know the Python internals, here is a newbie question for you.
      > If I have a list with 100 elements, each element being a long string,
      > is it more efficient to maintain it as a dictionary (with a key = a
      > string from the list and value = None) for the purpose of insertion
      > and removal?[/color]

      If you use a dictionary, there will be no "intrinsic" ordering -- you
      may as well use sets, then. "Insertion" for a list would thus just
      be an .append, very fast. "removal" may be O(N) for a list, if you
      need to search through it for occurrences.
      [color=blue]
      > Basically, if Python really implements lists as linked lists but
      > dictionaries as hash tables, it may well be that hashing a key takes
      > negligible time as compared to comparing it against every list
      > element.[/color]

      Python's lists are actually vectors, dicts are indeed hash tables.
      Hashing a long string does take some time, but the value may be
      cached (depending on the Python implementation) so that said time
      need be spent only once for a given string object (but separate
      string objects will spend the time multiply, even if it so happens
      that their values coincide).

      All in all, there's no serious alternative to benchmarking both
      possibilities in a realistic scenario. Quite likely you may find
      out that -- for sensible frequencies of insertiom / removal --
      the time doesn't matter for your application overall (dominated
      by I/O or other issues), and then you can use the simplest rather
      than the fastest approach. At least 9 times out of 10 you will
      be happiest with simplicity. If you don't care about ordering,
      Python 2.3's sets are likely the simplest data structure (they
      are implemented in terms of dictionaries, thus pretty fast too).


      Alex


      Comment

      • Jp Calderone

        #4
        Re: Are lists at least as efficient as dictionaries?

        On Fri, Aug 29, 2003 at 10:07:20AM +0200, Peter Otten wrote:[color=blue]
        > Narendra C. Tulpule wrote:
        >[/color]
        [ snip][color=blue]
        >
        > And now for something completely different:
        >[color=green][color=darkred]
        > >>>> x = ([],)
        > >>>> x[0] += ['something'][/color]
        > > Traceback (most recent call last):
        > > File "<stdin>", line 1, in ?
        > > TypeError: object doesn't support item assignment[/color]
        >
        > += calls the list.__iadd__(s elf, other) method, which seems to be
        > implemented as
        >
        > def __iadd__(self, other):
        > self.append(oth er)
        > return self[/color]

        self.extend(oth er), actually. But that's the basic idea, yea.
        [color=blue]
        >
        > The call of this method succeds, but the following assignment fails, because
        > tuples are immutable. This could only be remedied if all assignments
        >
        > a = a
        >
        > were silently ignored, or, if the += operator would not perform an
        > assignment, which has the disadvantage that it would no longer work for
        > immutables, so that[color=green][color=darkred]
        > >>> i = 1
        > >>> i += 1
        > >>> print i[/color][/color]
        > 1 # admittedly faked[color=green][color=darkred]
        > >>>[/color][/color][/color]

        += could simply be syntactic sugar for a call to __add__ and then an
        assignment. This would work for mutable and immutable objects.
        [color=blue]
        >
        > You could change __iadd__() to
        >
        > def __iadd__(self, other):
        > return self + other
        >
        > but then sane behaviour in one special case comes at the cost of always
        > creating a copy of a potentially large (say more than 100 items :-) list.[/color]

        True, except that list.extend() exists.

        Jp

        --
        "There is no reason for any individual to have a computer in their
        home."
        -- Ken Olson, President of DEC, World Future Society
        Convention, 1977

        Comment

        • John Roth

          #5
          Re: Are lists at least as efficient as dictionaries?


          "Narendra C. Tulpule" <naren@trillium .com> wrote in message
          news:781faf41.0 308281604.51e48 f45@posting.goo gle.com...[color=blue]
          > Hi,
          > if you know the Python internals, here is a newbie question for you.
          > If I have a list with 100 elements, each element being a long string,
          > is it more efficient to maintain it as a dictionary (with a key = a
          > string from the list and value = None) for the purpose of insertion
          > and removal?[/color]

          Lists are more efficient for lookup by index, dictionaries are
          more efficient for insertion (except at the end, where Python
          maintains extra slots for just this case), removal and lookup
          by key. Lists are also much more efficient for sequential
          traversal.

          John Roth


          Comment

          • Jp Calderone

            #6
            Re: Are lists at least as efficient as dictionaries?

            On Fri, Aug 29, 2003 at 10:24:37AM -0700, Chad Netzer wrote:[color=blue]
            > On Fri, 2003-08-29 at 07:54, Jp Calderone wrote:
            >[color=green]
            > > += could simply be syntactic sugar for a call to __add__ and then an
            > > assignment. This would work for mutable and immutable objects.[/color]
            >
            > But it loses the advantage that some objects would otherwise have of
            > being able to mutate in place, without allocating a new object (ie. very
            > large matrix additions).
            >[/color]

            But as you removed from my original post, list.extend() exists. All one
            has to do to retain the existing functionality of __iadd__ is name the
            method something else, then call it. All the advantages, none of the
            confusing or difficult to track semantics.

            Jp

            Comment

            • Chad Netzer

              #7
              Re: Are lists at least as efficient as dictionaries?

              On Sat, 2003-08-30 at 00:03, Jp Calderone wrote:
              [color=blue]
              > But as you removed from my original post, list.extend() exists. All one
              > has to do to retain the existing functionality of __iadd__ is name the
              > method something else, then call it. All the advantages, none of the
              > confusing or difficult to track semantics.[/color]

              True, however when working with large Matrices, I much prefer A += B, to
              A.add(B), and the performance advantages with __iadd__ can be
              substantial.

              I prefer the original posters suggestion that self assignment be
              ignored. But I see your point about the more difficult semantics of
              __iadd__.


              --
              Chad Netzer


              Comment

              Working...