How do I get a reference to a KEY value of a dictionary?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Andy C

    How do I get a reference to a KEY value of a dictionary?

    I am new to python, so please bear with me if I am making some
    conceptual error.

    Basically I want to create a graph with an adjacency list
    representation, but I don't want any of the adjacency lists to have
    duplicate strings when it is avoidable. I have a function createEdge
    that adds an edge to the graph. The arguments will be distinct since
    they are read from text files. But basically I want to use the
    dictionary as a string pool, and if the argument string equals
    something in the pool already, don't use the argument string, just a
    use a reference to something in the string pool already.

    Is this a waste of time? Because I know in python I cannot be certain
    that the argument strings that are read from files are even garbage
    collected anyway. I could certainly do the job with duplicate
    strings, but it would seem wasteful. I am a C programmer, and old
    habits die hard.

    The following code works -- I tested it and entries in the adjacency
    list that are equivalent are in fact identical. But it seems rather
    stupid to have a dictionary of the form {'alice': 'alice', 'bob':
    'bob'}, etc. i.e. the keys and values are the same. It would be nice
    if I can get a reference to the key in the same time as a hash lookup.
    I know I can do dictionary.keys ().index( 'string' ) or something but
    that would be pretty inefficient, I believe.

    class GraphAccumulato r:
    def __init__( self, fileFunction ):
    self.fileFuncti on = fileFunction
    self.adjList = {} # adjacency list
    self.nodes = {} # list of nodes for
    preventing string dupes

    def createEdge( self, node1, node2 ):

    # another way of doing by using a dictionary

    nodes = self.nodes
    if nodes.has_key( node2 ):
    node2 = nodes[ node2 ]
    else:
    nodes[ node2 ] = node2

    adjList = self.adjList
    if adjList.has_key ( node1 ):
    adjList[ node1 ].append( node2 );
    else:
    adjList[ node1 ] = [ node2 ]; # singleton edge list
  • Ben Finney

    #2
    Re: How do I get a reference to a KEY value of a dictionary?

    On 31 Jul 2003 17:36:41 -0700, Andy C wrote:[color=blue]
    > Is this a waste of time? Because I know in python I cannot be certain
    > that the argument strings that are read from files are even garbage
    > collected anyway. I could certainly do the job with duplicate
    > strings, but it would seem wasteful. I am a C programmer, and old
    > habits die hard.[/color]

    Python dynamically binds names ("variables" ) to objects; many types,
    including immutable strings, will not be duplicated if an identical one
    is already stored.

    The upshot is: store an identical string as many times as you like,
    because PYthon will simply keep references to the one string object for
    each duplicate.

    --
    \ "Friendship is born at that moment when one person says to |
    `\ another, 'What! You too? I thought I was the only one!'" -- |
    _o__) C.S. Lewis |
    Ben Finney <http://bignose.squidly .org/>

    Comment

    • Terry Reedy

      #3
      Re: How do I get a reference to a KEY value of a dictionary?


      "Andy C" <andychup@yahoo .com> wrote in message
      news:645db655.0 307311636.71923 378@posting.goo gle.com...[color=blue]
      > I am new to python, so please bear with me if I am making some
      > conceptual error.
      >
      > Basically I want to create a graph with an adjacency list
      > representation, but I don't want any of the adjacency lists to have
      > duplicate strings when it is avoidable. I have a function[/color]
      createEdge[color=blue]
      > that adds an edge to the graph. The arguments will be distinct[/color]
      since[color=blue]
      > they are read from text files. But basically I want to use the
      > dictionary as a string pool, and if the argument string equals
      > something in the pool already, don't use the argument string, just a
      > use a reference to something in the string pool already.[/color]

      Thinking in terms of 'references' can both help and hinder. (There
      have been some long recent discussion.)

      Are you familiar with this?
      [color=blue][color=green][color=darkred]
      >>> help('intern')[/color][/color][/color]

      Help on built-in function intern:

      intern(...)
      intern(string) -> string

      ``Intern'' the given string. This enters the string in the
      (global)
      table of interned strings whose purpose is to speed up dictionary
      lookups.
      Return the string itself or the previously interned string object
      with the
      same value.

      TJR


      Comment

      • Andy C

        #4
        Re: How do I get a reference to a KEY value of a dictionary?

        I don't think this is true in many cases. For example:
        [color=blue][color=green][color=darkred]
        >>> x = 'hi'
        >>> y = 'hi'
        >>> x is y[/color][/color][/color]
        True[color=blue][color=green][color=darkred]
        >>> a = 'hi'
        >>> b = 'h'
        >>> b += 'i'
        >>> b[/color][/color][/color]
        'hi'[color=blue][color=green][color=darkred]
        >>> a is b[/color][/color][/color]
        False

        Also, when the strings come from different files read in using readline(),
        it appears that duplicates aren't eliminated. I tried my code without the
        hash table duplicate elimination, and the adjacency lists has non-identical
        strings. After I constructed the adjacency lists, I compared them with 'is'
        and they were not identical. After I used my hash table hack solution, all
        equivalent strings were identical.

        "Ben Finney" <bignose-hates-spam@and-benfinney-does-too.id.au> wrote in
        message news:slrnbije2k .ejq.bignose-hates-spam@iris.polar .local...[color=blue]
        > On 31 Jul 2003 17:36:41 -0700, Andy C wrote:[color=green]
        > > Is this a waste of time? Because I know in python I cannot be certain
        > > that the argument strings that are read from files are even garbage
        > > collected anyway. I could certainly do the job with duplicate
        > > strings, but it would seem wasteful. I am a C programmer, and old
        > > habits die hard.[/color]
        >
        > Python dynamically binds names ("variables" ) to objects; many types,
        > including immutable strings, will not be duplicated if an identical one
        > is already stored.
        >
        > The upshot is: store an identical string as many times as you like,
        > because PYthon will simply keep references to the one string object for
        > each duplicate.
        >
        > --
        > \ "Friendship is born at that moment when one person says to |
        > `\ another, 'What! You too? I thought I was the only one!'" -- |
        > _o__) C.S. Lewis |
        > Ben Finney <http://bignose.squidly .org/>[/color]


        Comment

        • Andy C

          #5
          Re: How do I get a reference to a KEY value of a dictionary?

          Thanks, I think that is exactly what I'm looking for. I guess if you do a =
          intern(a), then if a is equivalent but not identical to something in the
          global string table, it will assign a to reference the object in the global
          string table, and thus the old "a" value can be garbage collected.

          Well, for a C/C++ programmer, they ARE references, and I can't imagine
          thinking otherwise. They are implemented as references/pointers in the
          interpreter. Thinking in value semantics can simplify things, but there
          will always be some cases where it doesn't work.

          I think my example is a good one. If you have a graph in an adjacency list
          representation, you don't want the nodes to be copies of each other. They
          should just point to a global list of nodes. This can definitely matter...
          the worst case is when you have a complete graph, where you would have
          n(n-1)/2 (or n(n-1) for a complete directed graph) node objects where you
          only need n. Instead of 100,000 nodes (not an unreasonable size), you could
          have 10 billion (totally unreasonable).

          But it is a good question whether the garbage collection will make the
          difference worth it, in most cases. I could reassign my references, but I
          don't know when the old objects get deleted. That is, I can do a =
          intern(a), but the old value of a could hang around in memory for who knows
          how long. If anyone could shed some light on this, I would appreciate it.


          "Terry Reedy" <tjreedy@udel.e du> wrote in message
          news:Qf-cnYqDQejEJbSiXT WJjw@comcast.co m...[color=blue]
          >
          > "Andy C" <andychup@yahoo .com> wrote in message
          > news:645db655.0 307311636.71923 378@posting.goo gle.com...[color=green]
          > > I am new to python, so please bear with me if I am making some
          > > conceptual error.
          > >
          > > Basically I want to create a graph with an adjacency list
          > > representation, but I don't want any of the adjacency lists to have
          > > duplicate strings when it is avoidable. I have a function[/color]
          > createEdge[color=green]
          > > that adds an edge to the graph. The arguments will be distinct[/color]
          > since[color=green]
          > > they are read from text files. But basically I want to use the
          > > dictionary as a string pool, and if the argument string equals
          > > something in the pool already, don't use the argument string, just a
          > > use a reference to something in the string pool already.[/color]
          >
          > Thinking in terms of 'references' can both help and hinder. (There
          > have been some long recent discussion.)
          >
          > Are you familiar with this?
          >[color=green][color=darkred]
          > >>> help('intern')[/color][/color]
          >
          > Help on built-in function intern:
          >
          > intern(...)
          > intern(string) -> string
          >
          > ``Intern'' the given string. This enters the string in the
          > (global)
          > table of interned strings whose purpose is to speed up dictionary
          > lookups.
          > Return the string itself or the previously interned string object
          > with the
          > same value.
          >
          > TJR
          >
          >[/color]


          Comment

          • Andrew Dalke

            #6
            Re: How do I get a reference to a KEY value of a dictionary?

            Andy C:[color=blue]
            > Basically I want to create a graph with an adjacency list
            > representation, but I don't want any of the adjacency lists to have
            > duplicate strings when it is avoidable.[/color]

            Took me a while but I finally figured out what you're asking.

            Look in the docs for 'intern', which is a builtin. It's rarely used
            because most people don't worry about this sort of duplication.
            [color=blue]
            > Is this a waste of time? Because I know in python I cannot be certain
            > that the argument strings that are read from files are even garbage
            > collected anyway. I could certainly do the job with duplicate
            > strings, but it would seem wasteful. I am a C programmer, and old
            > habits die hard.[/color]

            You can never be certain that anything is every garbage collected.
            Python-the-language makes no guarantees on that. Python-in-C
            does make a stronger statement, but it's an implementation issue. As
            a C programmer you'll be happy that it's a reference counted scheme
            and so long as there are no cycles (which can't happen with a string),
            the string will be deallocated when your code lets go of it.

            This is true even of interned strings, at least in newer Pythons. If
            no one references an interned string, it too goes away. (Older
            Pythons make the strings immortal.)

            Here's how you might change your code.

            class GraphAccumulato r:
            def __init__( self, fileFunction ):
            self.fileFuncti on = fileFunction
            self.adjList = {} # adjacency list

            def createEdge( self, node1, node2 ):

            # another way of doing by using a dictionary
            node1 = intern(node1)
            node2 = intern(node2)
            adjList = self.adjList
            if adjList.has_key ( node1 ):
            adjList[ node1 ].append( node2 );
            else:
            adjList[ node1 ] = [ node2 ]; # singleton edge list

            But personally I would put the intern outside of the graph
            code - either in the reader or the code between the reader
            and this one. Otherwise, consider:

            # fully connect the graph
            nodes = graph.adjList.k eys()
            for n1 in nodes:
            for n2 in nodes:
            if n1 != n2:
            graph.createEdg e(n1, n1)

            This does extra interning even though it isn't needed.

            But then again, normally I wouldn't worry about the interning
            at all. ... Perhaps I should... Hmmm...

            Andrew
            dalke@dalkescie ntific.com


            Comment

            • Aahz

              #7
              Re: How do I get a reference to a KEY value of a dictionary?

              In article <645db655.03073 11636.71923378@ posting.google. com>,
              Andy C <andychup@yahoo .com> wrote:[color=blue]
              >
              >Basically I want to create a graph with an adjacency list
              >representation , but I don't want any of the adjacency lists to have
              >duplicate strings when it is avoidable. I have a function createEdge
              >that adds an edge to the graph. The arguments will be distinct since
              >they are read from text files. But basically I want to use the
              >dictionary as a string pool, and if the argument string equals
              >something in the pool already, don't use the argument string, just a
              >use a reference to something in the string pool already.[/color]

              That makes some sense, but why use the string object for both key and
              value? Just do

              d[key] = True

              You could optimize your coding style (if not your speed) in Python 2.3
              by using sets. I'd recommend against using intern(), because in Python
              2.2 and earlier, interned strings *never* get garbage collected. Even
              in Python 2.3, you may end up with worse memory behavior.
              --
              Aahz (aahz@pythoncra ft.com) <*> http://www.pythoncraft.com/

              This is Python. We don't care much about theory, except where it intersects
              with useful practice. --Aahz

              Comment

              • Andy C

                #8
                Re: How do I get a reference to a KEY value of a dictionary?

                I don't see how I can do this and let me eliminate duplicates. I need to
                assign the old duplicate string to the unique string that already exists.
                Hence the question, how do I get a reference to the KEY value? I know I can
                use keys() and do a linear search, but that is much more inefficient. I
                would like to get a reference to the key value in the same time that it
                takes to do a hash lookup (constant time).

                Basically I would want to rewrite this section of the code I posted:

                nodes = self.nodes
                if nodes.has_key( node2 ):
                node2 = nodes[ node2 ]
                else:
                nodes[ node2 ] = node2

                This dictionary seems stupid, I agree. The keys and values are the same.
                But in the first part of the if, I want to reassign node2 to an equivalent
                string that already exists. How can I do that?

                The intern solution seems reasonable, and it appears that it was designed
                specifically for this problem. I wasn't aware of the implementation
                problems. But I'm still curious about different ways to do it.
                [color=blue]
                > That makes some sense, but why use the string object for both key and
                > value? Just do
                >
                > d[key] = True
                >
                > You could optimize your coding style (if not your speed) in Python 2.3
                > by using sets. I'd recommend against using intern(), because in Python
                > 2.2 and earlier, interned strings *never* get garbage collected. Even
                > in Python 2.3, you may end up with worse memory behavior.
                > --
                > Aahz (aahz@pythoncra ft.com) <*>[/color]
                http://www.pythoncraft.com/[color=blue]
                >
                > This is Python. We don't care much about theory, except where it[/color]
                intersects[color=blue]
                > with useful practice. --Aahz[/color]


                Comment

                • Bengt Richter

                  #9
                  Re: How do I get a reference to a KEY value of a dictionary?

                  On 2 Aug 2003 10:16:13 -0400, aahz@pythoncraf t.com (Aahz) wrote:
                  [color=blue]
                  >In article <w1MWa.133$Tz4. 17019035@newssv r21.news.prodig y.com>,
                  >Andy C <ayc8NOSPAM@cor nell.edu> wrote:[color=green]
                  >>
                  >>I don't see how I can do this and let me eliminate duplicates. I need
                  >>to assign the old duplicate string to the unique string that already
                  >>exists. Hence the question, how do I get a reference to the KEY value?
                  >>I know I can use keys() and do a linear search, but that is much more
                  >>inefficient . I would like to get a reference to the key value in the
                  >>same time that it takes to do a hash lookup (constant time).[/color]
                  >
                  >Ahhhh.... Right. Hmmmmm.... <thinks hard> You're correct, you do
                  >need to set the value to the key. I think using a dict is better than
                  >using intern(). Here's an optimization:
                  >
                  > node2 = node_cache.setd efault(node2, node2)[/color]

                  I would sure be more comfortable with a change in nomenclature, e.g.,

                  node2name = node_name_cache .setdefault(nod e2name, node2name)

                  so that it is clear that you are dealing with name strings, not typical graph
                  vertex/node objects.

                  BTW, reading from a file, you can probably not in general avoid forward references,
                  (i.e., names of adjacent nodes yet to be defined), but they should be resolvable at the end,
                  so you could make nodes that refer to adjacent nodes directly instead of by name.
                  Nodes would carry a ref to their own (string) name, and go by that same name string in a graph dict,
                  so there should be no duplication of strings, just references to them. E.g.,
                  (I started to just sketch something, and ... well, it's hard to stop programming Python
                  before something is running at least preliminarily ;-) I subclassed dict to keep nodes in by name.

                  ====< graph.py >============== =============== =============== =============
                  def boxit(s):
                  lines = s.splitlines()
                  wid = max(map(len,lin es))
                  ret = ['+-%s-+' % ('-'*wid)]
                  fmt = '| %%-%ds |'%wid
                  for line in lines: ret.append(fmt% line)
                  ret.append(ret[0])
                  return '\n'.join(ret)

                  def errep(Ex, *rest):
                  if __debug__: print boxit('%s: %s'%(Ex.__name_ _, rest and rest[0])) # and blunder on
                  else: raise Ex(*rest)

                  class Node(object):
                  __slots__ = 'name', 'adj_list'
                  def __init__(self, name):
                  self.name = name
                  self.adj_list = None # signify undefined, vs [] for no adjacent nodes

                  class Graph(dict):
                  def load(self, infile, print_lines=Fal se):
                  for line in infile:
                  if print_lines: print line.rstrip()
                  sline = line.strip()
                  if not sline or sline.startswit h('#'): continue
                  node_def_names = line.split() # assume line like 'nodename adj_node1_name adj_node2_name ...'
                  node_name = node_def_names[0]
                  if node_name in self:
                  # don't allow re-definition
                  errep(ValueErro r, 'Node %r being re-defined:\n new adj: %r\n old adj: %r' %(
                  node_name, node_def_names[1:], self[node_name].adj_list))
                  else:
                  self[node_name] = Node(node_name)
                  # set adj list, making making this a defined node
                  self[node_name].adj_list = node_def_names[1:]
                  # change adj lists to direct references to adjacent nodes
                  for node in self.values():
                  adj_list = node.adj_list
                  assert adj_list is not None
                  for i in xrange(len(adj_ list)):
                  adj_name = adj_list[i]
                  try:
                  adj_list[i] = self[adj_name]
                  except KeyError:
                  errep( ValueError, 'node %r refers to undefined adj node %r' % (node.name, adj_name))
                  # continuing if debugging
                  adj_list[i] = Node(adj_name) # adj_list None (vs []) signifies undefined
                  def show(self):
                  node_names = self.keys()
                  node_names.sort ()
                  visited = {}
                  ret = []
                  def prtree(node, indent=0):
                  if node.name in visited:
                  if indent: ret.append('%s% s (see previous)'%(ind ent*' ',node.name))
                  else:
                  visited[node.name]=True
                  if node.adj_list is None:
                  ret.append('%s% s (undefined) '%(indent*' ',node.name))
                  return
                  ret.append('%s% s'%(indent*' ',node.name))
                  for adj in node.adj_list:
                  prtree(adj, indent+1)
                  for node_name in node_names:
                  prtree(self[node_name])
                  ret.append('')
                  return '\n'.join(ret)

                  def test(filepath=' '):
                  graph = Graph() # name-string -> Node instance
                  if not filepath:
                  import StringIO
                  f_graph = StringIO.String IO("""
                  A B C D
                  B E F
                  C G H
                  G A
                  D B
                  E
                  F
                  H
                  D E F
                  X E Y
                  """)
                  else:
                  f_graph = file(filepath)

                  print 'Loading graph from %s ...\n--' % (filepath or '(test)')
                  graph.load(f_gr aph, True) # print the lines read
                  print '--'
                  f_graph.close()

                  print graph.show()

                  if __name__ == '__main__':
                  import sys
                  args = sys.argv[1:]
                  test(args and args[0] or '')
                  =============== =============== =============== =============== =============

                  Result:

                  [16:33] C:\pywk\clp\pp> graph.py
                  Loading graph from (test) ...
                  --

                  A B C D
                  B E F
                  C G H
                  G A
                  D B
                  E
                  F
                  H
                  D E F
                  +----------------------------------------+
                  | ValueError: Node 'D' being re-defined: |
                  | new adj: ['E', 'F'] |
                  | old adj: ['B'] |
                  +----------------------------------------+
                  X E Y

                  +-------------------------------------------------------+
                  | ValueError: node 'X' refers to undefined adj node 'Y' |
                  +-------------------------------------------------------+
                  --
                  A
                  B
                  E
                  F
                  C
                  G
                  A (see previous)
                  H
                  D
                  B (see previous)
                  X
                  E (see previous)
                  Y (undefined)

                  (Not tested beyond what you see. NTBWYS? ;-)

                  Regards,
                  Bengt Richter

                  Comment

                  • Bengt Richter

                    #10
                    Re: How do I get a reference to a KEY value of a dictionary?

                    On Sat, 02 Aug 2003 23:34:20 GMT, "Andy C" <ayc8NOSPAM@cor nell.edu> wrote:
                    [color=blue][color=green]
                    >> Google("intern-like memory saver").[/color]
                    >
                    >Well, that seems very sensical, so how come it hasn't made it into the
                    >language? And what's wrong with intern? (Though intern only works on
                    >strings, not for immutable objects in general. I believe someone was asking
                    >a pretty much identical question here, and someone replied with the
                    >'memoize' pattern).
                    >
                    >Can this be done without C, now that you can subclass the built-in
                    >dictionary?
                    >[/color]
                    For a subclass of dict that may be of interest, see my post in this thread
                    timestamped less than 2 minutes before this post of yours ;-)

                    Regards,
                    Bengt Richter

                    Comment

                    • Andy C

                      #11
                      Re: How do I get a reference to a KEY value of a dictionary?

                      Thanks, looks interesting... but it actually doesn't really address the
                      question I was asking. It doesn't matter since I think the intern solution
                      is fine for me. But if I'm not mistaken, your solution still stores the
                      node names in the values of the dictionary (via the Node objects). So in
                      effect you do still have a dictionary of the form { 'node1name':
                      ('node1name', other node1 data), 'node2name': ('node2name', other node2
                      data)). It doesn't seem as silly when you have a real node object, though.
                      In my application the nodes are really just strings; I don't need anything
                      else.

                      Andy

                      "Bengt Richter" <bokr@oz.net> wrote in message
                      news:bgho7i$790 $0@216.39.172.1 22...[color=blue]
                      > On Sat, 02 Aug 2003 23:34:20 GMT, "Andy C" <ayc8NOSPAM@cor nell.edu> wrote:
                      >[color=green][color=darkred]
                      > >> Google("intern-like memory saver").[/color]
                      > >
                      > >Well, that seems very sensical, so how come it hasn't made it into the
                      > >language? And what's wrong with intern? (Though intern only works on
                      > >strings, not for immutable objects in general. I believe someone was[/color][/color]
                      asking[color=blue][color=green]
                      > >a pretty much identical question here, and someone replied with the
                      > >'memoize' pattern).
                      > >
                      > >Can this be done without C, now that you can subclass the built-in
                      > >dictionary?
                      > >[/color]
                      > For a subclass of dict that may be of interest, see my post in this thread
                      > timestamped less than 2 minutes before this post of yours ;-)
                      >
                      > Regards,
                      > Bengt Richter[/color]


                      Comment

                      • Bengt Richter

                        #12
                        Re: How do I get a reference to a KEY value of a dictionary?

                        On Sun, 03 Aug 2003 08:51:40 GMT, "Andy C" <ayc8NOSPAM@cor nell.edu> wrote:
                        [color=blue]
                        >Thanks, looks interesting... but it actually doesn't really address the
                        >question I was asking. It doesn't matter since I think the intern solution[/color]
                        I thought it did ;-)
                        [color=blue]
                        >is fine for me. But if I'm not mistaken, your solution still stores the
                        >node names in the values of the dictionary (via the Node objects). So in[/color]
                        No, just references, not duplicate strings.
                        [color=blue]
                        >effect you do still have a dictionary of the form { 'node1name':
                        >('node1name' , other node1 data), 'node2name': ('node2name', other node2
                        >data)). It doesn't seem as silly when you have a real node object, though.
                        >In my application the nodes are really just strings; I don't need anything
                        >else.[/color]
                        Well, don't you need the adjacency lists you were talking about? I assumed they were
                        all lists of out-edges associated with graph vertices. So I defined a Node class
                        with the two things: a name and an adj_list. You can eliminate the name reference
                        and just let the graph dict keys be the only reference, but then you will not
                        so conveniently have the name available to algorithms that process the nodes.
                        E.g., for some node myNode you might have to search the dict something like

                        for key,value in graph.items(): # untested!
                        if value is myNode: myName=key; break
                        else:
                        myName = "?? can't find name of %r ??'%myNode

                        One caveat though, if your graph has cycles, changing the adjacency lists to direct node
                        references instead of names will create reference cycles that may have to be broken for
                        the gc. I have to look into that. You can comment that fixup loop out if you want to
                        keep adjacency lists as name string references instead of node references.

                        BTW, have you looked at the sets module of 2.3? That would let you have a set of strings
                        with no associated values, and no duplicates (but how do you represent adjacency lists then?)

                        Anyway, unless I goofed, the { 'node1name':('n ode1name', other node 1 data) } analogy is
                        not using two separate strings. It's more like the result of (untested!):

                        s = 'node1name'
                        graph.update({ s:(s, other node 1 data)}) # pseudo-tuple-element at graph[s][1] ;-)
                        del s

                        To check, I added the assert below, where it scans the node names in the
                        graph.show() method, and it didn't complain:

                        def show(self):
                        node_names = self.keys()
                        8< snip >8
                        for node_name in node_names:
                        assert node_name is self[node_name].name
                        prtree(self[node_name])
                        ret.append('')
                        return '\n'.join(ret)

                        Regards,
                        Bengt Richter

                        Comment

                        • Andy C

                          #13
                          Re: How do I get a reference to a KEY value of a dictionary?

                          > >is fine for me. But if I'm not mistaken, your solution still stores the[color=blue][color=green]
                          > >node names in the values of the dictionary (via the Node objects). So in[/color]
                          > No, just references, not duplicate strings.[/color]

                          OK, you're right -- I assumed that the key and value were duplicated, but
                          that's not the case. I don't know why I thought that, maybe since the key
                          must be immutable and the value is mutable. So I guess having a dictionary
                          of the form { 'A': 'A', 'B': 'B' } is not as stupid as it first seems, since
                          only a reference is stored. But still I wonder why the language doesn't
                          have a facility for getting a reference to the key value in constant time.
                          Apparently someone did it by modifying the C source for the dictionary to
                          add a ref_key() accessor. It seems like it would be useful quite often.
                          [color=blue]
                          > One caveat though, if your graph has cycles, changing the adjacency lists[/color]
                          to direct node[color=blue]
                          > references instead of names will create reference cycles that may have to[/color]
                          be broken for[color=blue]
                          > the gc. I have to look into that. You can comment that fixup loop out if[/color]
                          you want to[color=blue]
                          > keep adjacency lists as name string references instead of node references.[/color]

                          Good point... I thought though that the new python (2.3?) will GC data with
                          cycles.
                          [color=blue]
                          > BTW, have you looked at the sets module of 2.3? That would let you have a[/color]
                          set of strings[color=blue]
                          > with no associated values, and no duplicates (but how do you represent[/color]
                          adjacency lists then?)

                          Actually, I did run across the sets type recently. It is actually built *on
                          top of* dict, so it's no better. Ex:
                          [color=blue][color=green][color=darkred]
                          >>> a = Set([1,2,3])
                          >>> a[/color][/color][/color]
                          Set([1, 2, 3])[color=blue][color=green][color=darkred]
                          >>> a._data[/color][/color][/color]
                          {1: True, 2: True, 3: True}

                          It just stores a dummy value as the value of the dict.

                          BTW I am using a regular dictionary for the adjacency list, with the list of
                          destination edges. I was using another dictionary for the 'cache' of names.

                          Thanks for the help.

                          Andy


                          Comment

                          • Aahz

                            #14
                            Re: How do I get a reference to a KEY value of a dictionary?

                            In article <jyFXa.342$TC.2 2669471@newssvr 13.news.prodigy .com>,
                            Andy C <ayc8NOSPAM@cor nell.edu> wrote:[color=blue]
                            >
                            >OK, you're right -- I assumed that the key and value were duplicated,
                            >but that's not the case. I don't know why I thought that, maybe since
                            >the key must be immutable and the value is mutable. So I guess having
                            >a dictionary of the form { 'A': 'A', 'B': 'B' } is not as stupid as it
                            >first seems, since only a reference is stored. But still I wonder why
                            >the language doesn't have a facility for getting a reference to the key
                            >value in constant time. Apparently someone did it by modifying the C
                            >source for the dictionary to add a ref_key() accessor. It seems like
                            >it would be useful quite often.[/color]

                            Well, you're the first person I recall caring about this specific issue.
                            Of course, general caching issues come up quite frequently. All the
                            people I've seen wanting to use intern() come at it from a performance
                            rather than memory perspective, for which a dict would be no use. ;-)
                            --
                            Aahz (aahz@pythoncra ft.com) <*> http://www.pythoncraft.com/

                            This is Python. We don't care much about theory, except where it intersects
                            with useful practice. --Aahz

                            Comment

                            • Andy C

                              #15
                              Re: How do I get a reference to a KEY value of a dictionary?

                              > These days I have an extension which merely subclasses dict to add on[color=blue]
                              > a key reference method and an increment method which does something
                              > like:[/color]

                              You did this in C or python? If in python, how did you do it? (without
                              keys()?) If in C, why don't you submit it to the source tree?

                              thanks,
                              Andy


                              Comment

                              Working...