Delete items in nested dictionary based on value.

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Brian L. Troutwine

    Delete items in nested dictionary based on value.

    I've got a problem that I can't seem to get my head around and hoped
    somebody might help me out a bit:

    I've got a dictionary, A, that is arbitarily large and may contains
    ints, None and more dictionaries which themselves may contain ints,
    None and more dictionaries. Each of the sub-dictionaries is also
    arbitrarily large. When pretty printing A, in the context I'm using A
    for, it's rather helpful to remove all key:value pairs where value is
    None. So I'd like to be able to define a function dict_sweep to recurse
    through A and del all the keys with None as a value.

    I feel like the solution is right on the tip of my tounge, so to speak,
    but I just can't quite get it. Anyway, here's the best I've come up
    with:

    def dict_sweep(dict ionary, key_value):
    """Sweep through dictionary, deleting any key:value where
    value is None.

    """
    # Find key at position key_value.
    key = dictionary.keys ()[key_value]

    # If the key value is a dictionary we want to move into it.
    # Therefore we call rec_rem on dictionary[key] with a
    # key_value of 0
    if type(dictionary[key]) is type({}):
    dict_sweep(dict ionary[key], 0)

    # If the value isn't a dictionary we care only that it's not None.
    # If it is None we nuke it and call rec_rem on the current
    # dictionary with key_value incrimented by one.
    elif dictionary[key] is None:
    del dictionary[key]
    dict_sweep(dict ionary, key_value+1)

    # If the value isn't a dictionary and the value isn't None then
    # we still want to walk through to the next value in the
    # dictionary, so call rec_rem with key_value+1
    else:
    dict_sweep(dict ionary, key_value+1)

    Assuming dict_sweep worked perfectly it would take input like this:

    A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}

    B_in = {1: {1: {1: None, 2: {1: None}}, 2: 2, 3: None}

    and output this:

    A_out = {1: {2: 2, 3: {2: 2}}, 2: 2}

    B_out = {2:2}

    This dict_sweep above obviously doesn't work and I'm rather afraid of
    hitting python's recursion limit. Does anyone see a way to modify
    dict_sweep in it's current state to perform dictionary sweeps like
    this? What about a non-recursive implementation of dict_sweep?

  • bearophileHUGS@lycos.com

    #2
    Re: Delete items in nested dictionary based on value.

    My first try, not much tested:

    def clean(d):
    for key,val in d.items():
    if isinstance(val, dict):
    val = clean(val)
    if not val:
    del d[key]
    return d

    a = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}
    print clean(a) # Out: {1: {2: 2, 3: {2: 2}}, 2: 2}
    b = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: None}
    print clean(b) # Out: {2: 2}

    Recursivity overflow problem: you can increase the recursivity limit,
    of if you think you need it after some tests on your real data, then
    you can use a stack (a python list, using append and pop methods) to
    simulate recursivity. You probably don't have 300+ levels of nesting.

    Bye,
    bearophile

    Comment

    • bearophileHUGS@lycos.com

      #3
      Re: Delete items in nested dictionary based on value.

      Better:

      def clean(d):
      for key,val in d.items():
      if isinstance(val, dict):
      val = clean(val)
      if val is None or val == {}:
      del d[key]
      return d

      a = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}
      print clean(a) # Out: {1: {2: 2, 3: {2: 2}}, 2: 2}
      b = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: None}
      print clean(b) # Out: {2: 2}
      c = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: 0}
      print clean(c) # Out: {2: 2, 3: 0}

      Comment

      • Dale Strickland-Clark

        #4
        Re: Delete items in nested dictionary based on value.

        Brian L. Troutwine wrote:
        I've got a problem that I can't seem to get my head around and hoped
        somebody might help me out a bit:
        >
        I've got a dictionary, A, that is arbitarily large and may contains
        ints, None and more dictionaries which themselves may contain ints,
        None and more dictionaries. Each of the sub-dictionaries is also
        arbitrarily large. When pretty printing A, in the context I'm using A
        for, it's rather helpful to remove all key:value pairs where value is
        None. So I'd like to be able to define a function dict_sweep to recurse
        through A and del all the keys with None as a value.
        >
        I feel like the solution is right on the tip of my tounge, so to speak,
        but I just can't quite get it. Anyway, here's the best I've come up
        with:
        How about this:

        def dict_sweep(dict ionary):
        for k, v in dictionary.item s():
        if v is None:
        del dictionary[k]
        if type(v) == type({}):
        dict_sweep(v)
        #if len(v) == 0:
        # del dictionary[k]


        A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}

        B_in = {1: {1: {1: None, 2: {1: None}}, 2: 2, 3: None}}

        dict_sweep(A_in )
        dict_sweep(B_in )
        print A_in
        print B_in
        >>{1: {2: 2, 3: {2: 2}}, 2: 2}
        >>{1: {2: 2}}

        It doesn't produce the output you expect for B_in, but your brackets didn't
        match and I'm not sure where the extra close should be. Also, I suspect
        you're wanting to delete empty dictionaries but I don't see that mentioned
        in the text. If that is so, include the two commented statements.
        --
        Dale Strickland-Clark
        Riverhall Systems - www.riverhall.co.uk

        Comment

        • Fuzzyman

          #5
          Re: Delete items in nested dictionary based on value.


          bearophileHUGS@ lycos.com wrote:
          Better:
          >
          def clean(d):
          for key,val in d.items():
          if isinstance(val, dict):
          val = clean(val)
          if val is None or val == {}:
          del d[key]
          return d
          Can you delete values from a dictionary whilst iterating over its items
          ?

          Fuzzyman

          >
          a = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}
          print clean(a) # Out: {1: {2: 2, 3: {2: 2}}, 2: 2}
          b = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: None}
          print clean(b) # Out: {2: 2}
          c = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: 0}
          print clean(c) # Out: {2: 2, 3: 0}

          Comment

          • bearophileHUGS@lycos.com

            #6
            Re: Delete items in nested dictionary based on value.

            Fuzzyman:
            Can you delete values from a dictionary whilst iterating over its items ?
            Try the code, it works. I am not iterating on the dict, I am not using
            dict.iteritems( ), that produces a lazy iterable, I am using
            dict.items() that produces a list of (key,value) pairs. And I am not
            removing elements from that list :-)

            Hugs,
            bearophile

            Comment

            • John McMonagle

              #7
              Re: Delete items in nested dictionary based on value.

              >
              Assuming dict_sweep worked perfectly it would take input like this:
              >
              A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}
              >
              B_in = {1: {1: {1: None, 2: {1: None}}, 2: 2, 3: None}
              >
              and output this:
              >
              A_out = {1: {2: 2, 3: {2: 2}}, 2: 2}
              >
              B_out = {2:2}
              >
              This dict_sweep above obviously doesn't work and I'm rather afraid of
              hitting python's recursion limit. Does anyone see a way to modify
              dict_sweep in it's current state to perform dictionary sweeps like
              this? What about a non-recursive implementation of dict_sweep?
              >
              How about this:

              A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}

              def dict_sweep(dict in):

              for key in dictin.keys():
              if dictin.get(key) == None:
              del dictin[key]
              elif type(dictin[key]) is dict:
              dictin[key] = dict_sweep(dict in[key])
              return dictin

              A_out = dict_sweep(A_in )
              print A_out




              Running the above returns:

              {1: {2: 2, 3: {2: 2}}, 2: 2}

              Regards,

              John




              --
              This message has been scanned for viruses and
              dangerous content by MailScanner, and is
              believed to be clean.

              Comment

              • Brian L. Troutwine

                #8
                Re: Delete items in nested dictionary based on value.

                Would it not be easier to modify the pretty-printer to ignore such
                entries rather than have to make a duplicate (I presume you have reason
                to need the original dictionary unchanged) dictionary and then delete
                entries that match your "ignore" criteria?
                Good question. The data I'm handling relates to books: their ISBN,
                pricing information, weight, statistical data for each page, citations
                in other books to certain pages, those books statistical data, so on
                and so forth. It's too much data, really, but this is what I'm tasked
                with handling.

                Currently I'm just pretty printing the data, but I'm really cheating
                right now. The tool I'm writing is supposed to just collect the data,
                strip out all the extranious data based on filters defined at the start
                of the collection process and then output all that data to stdout,
                where it will be caught by a tool to properly format and display it.

                Yesterday morning I got into work, went into deep hack mode, wrote the
                entire tool and at the end of the day could not, for the life of me,
                figure out the recursion I needed. I suppose I'd spent myself. Who
                knows?

                Anyway, you are right in that if I did indeed need the original
                dictionary unchanged it would be much, much easier to modify the
                pretty-printer.

                Dennis Lee Bieber wrote:
                On 13 Sep 2006 16:08:37 -0700, "Brian L. Troutwine"
                <goofyheadedpun k@gmail.comdecl aimed the following in comp.lang.pytho n:
                >
                arbitrarily large. When pretty printing A, in the context I'm using A
                for, it's rather helpful to remove all key:value pairs where value is
                None. So I'd like to be able to define a function dict_sweep to recurse
                through A and del all the keys with None as a value.

                I feel like the solution is right on the tip of my tounge, so to speak,
                but I just can't quite get it. Anyway, here's the best I've come up
                with:
                Would it not be easier to modify the pretty-printer to ignore such
                entries rather than have to make a duplicate (I presume you have reason
                to need the original dictionary unchanged) dictionary and then delete
                entries that match your "ignore" criteria?
                --
                Wulfraed Dennis Lee Bieber KD6MOG
                wlfraed@ix.netc om.com wulfraed@bestia ria.com

                (Bestiaria Support Staff: web-asst@bestiaria. com)
                HTTP://www.bestiaria.com/

                Comment

                • Brian L. Troutwine

                  #9
                  Re: Delete items in nested dictionary based on value.

                  This is a general reply to all.

                  Thanks for all the suggestions. I hadn't really thought about filtering
                  empty dictionaries because the data I'm processing won't have them, but
                  it does make for a much nicer, more general filter. I did end up using
                  bearophileH's code, but that's mostly because he got there first. Each
                  solution functioned as advertised.

                  Again, thanks for all the help.

                  Comment

                  Working...