Complex Nested Dictionaries

  • T. Earle

    Complex Nested Dictionaries

    To list,

    I'm trying to figure out the best approach to the following problem:

    I have four variables:
    1) headlines
    2) times
    3) states
    4) zones

    At this time, I'm thinking of creating a dictionary, headlinesDB, that
    stores different headlines and their associated time(s), state(s), and
    zone(s). The complexity is that each headline can have one or more times,
    one or more states, and one or more zones. However, there can only be 1
    zone per time, and 1 zone per state. What is the best way to tackle this
    particular problem?

    Here's an example of the complexity:

    Let's say we have a "High Wind Warning" for our headline or hazard. In
    addition, there are currently two "High Wind Warnings" in effect. The first
    goes from Tonight through Friday morning (i.e., I'll probably store the
    begin/end times in seconds from 1/1/1970). It affects three counties all in
    the state of Oregon: ORZ047, ORZ048, and ORZ049. The second High Wind
    Warning is in effect from Friday at Noon through Friday evening. It affects
    two counties in two separate states: ORZ044 in Oregon and WAZ028 in
    Washington. Here's the flow chart:

    High Wind Warning --> time1 --> state1 --> zone1, zone2, zone3
                      |
                      --> time2 --> state1 --> zone4
                                --> state2 --> zone5

    Keep in mind, each headline or hazard can have multiple times. Each time
    will have one or more states with each state containing one or more zones.
    Is there a better way than a dictionary? As mentioned above, the headline
    or hazard is the key I'll be extracting all the information from.

    Thanks in advance,

    Tom


  • omission9

    #2
    Re: Complex Nested Dictionaries


    I'd recommend the mx.DateTime package for storing the times instead of
    seconds. That module includes many useful functions you may need, so give
    it a look.

    Secondly, although I am not 100% sure about the stated problem, I would
    recommend that instead of nested dictionaries you use a tuple as a
    key, say, headlines[(time, state, zone)] = someValue.
    From what you say above, it would seem that this would create a unique
    key for all the mentioned situations.
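
A minimal sketch of the tuple-key idea, assuming the times are stored as (begin, end) pairs in epoch seconds and using the zone codes from the original example. The sample timestamps and the choice to store the hazard name as the value are illustrative assumptions, not something specified in the thread.

```python
# Flat dictionary keyed by (time, state, zone) tuples.
# The time values are hypothetical (begin, end) pairs in epoch seconds.
time1 = (1078178400, 1078214400)
time2 = (1078228800, 1078257600)

headlines = {}
for time, state, zone in [
    (time1, "OR", "ORZ047"),
    (time1, "OR", "ORZ048"),
    (time1, "OR", "ORZ049"),
    (time2, "OR", "ORZ044"),
    (time2, "WA", "WAZ028"),
]:
    headlines[(time, state, zone)] = "High Wind Warning"

# Testing for one exact combination is a single lookup ...
print((time2, "WA", "WAZ028") in headlines)

# ... but collecting everything for one state means scanning all keys.
oregon_zones = sorted(zone for (_, state, zone) in headlines if state == "OR")
print(oregon_zones)
```

The trade-off is that a flat key makes duplicate detection trivial, while any query on part of the key has to walk the whole dictionary.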


    • Russell E. Owen

      #3
      Re: Complex Nested Dictionaries

      In article <40354c35@news.bmi.net>, "T. Earle" <tnospamwade@bmi.net>
      wrote:

      >...
      >High Wind Warning --> time1 --> state1 --> zone1, zone2, zone3
      >                  |
      >                  --> time2 --> state1 --> zone4
      >                            --> state2 --> zone5
      >
      >Keep in mind, each headline or hazard can have multiple times. Each time
      >will have one or more states with each state containing one or more zones.
      >Is there a better way than a dictionary? As mentioned above, the headline
      >or hazard is the key I'll be extracting all the information from.

      If you really only want to look up data by headline, then a dictionary
      of dictionaries or nested lists or some other kind of collection is easy
      and should suffice. For instance:
      warndict["High Wind Warning"] = (
          (time1, {
              state1: (zone1, zone2, zone3),
              state2: (zone1, zone3),
          }),
          (time2, {...}),
      )

      However, I suspect you will also want to be able to locate data by
      state, time or zone. If that is true, I really think you should consider
      storing the data in a relational database. It sounds like a perfect
      match to your problem. Python has some nice interfaces to various
      databases (including PostgreSQL and MySQL).
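
As a rough illustration of the relational approach, here is a sketch using Python's standard-library sqlite3 module with an in-memory database. This is a stand-in chosen for brevity, not one of the databases named above, and the table name, column names, and sample timestamps are all assumptions.

```python
import sqlite3

# In-memory database: it exists only for the lifetime of this connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warnings ("
    " headline TEXT, begin_time INTEGER, end_time INTEGER,"
    " state TEXT, zone TEXT,"
    " UNIQUE (headline, begin_time, end_time, zone))"
)

rows = [
    ("High Wind Warning", 1078178400, 1078214400, "OR", "ORZ047"),
    ("High Wind Warning", 1078178400, 1078214400, "OR", "ORZ048"),
    ("High Wind Warning", 1078178400, 1078214400, "OR", "ORZ049"),
    ("High Wind Warning", 1078228800, 1078257600, "OR", "ORZ044"),
    ("High Wind Warning", 1078228800, 1078257600, "WA", "WAZ028"),
    # Deliberate duplicate: the UNIQUE constraint plus OR IGNORE skips it.
    ("High Wind Warning", 1078228800, 1078257600, "WA", "WAZ028"),
]
conn.executemany("INSERT OR IGNORE INTO warnings VALUES (?, ?, ?, ?, ?)", rows)

# Look the data up by state rather than by headline.
oregon = [
    row[0]
    for row in conn.execute(
        "SELECT zone FROM warnings WHERE state = ? ORDER BY zone", ("OR",)
    )
]
total = conn.execute("SELECT COUNT(*) FROM warnings").fetchone()[0]
print(oregon, total)
```

The same table answers queries by time, state, or zone with a different WHERE clause, which is the main thing the nested dictionary cannot do cheaply.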

      -- Russell

      P.S. if you do go with the dictionary, note that it is very easy to make
      a variant dictionary that defines
          a[key] = foo
      to mean "if list a[key] exists, then append foo to that list, otherwise
      create a new list with foo as its only element" (in fact my RO package
      contains just such a class: RO.Alg.MultiDict -- see
      <http://www.astro.washington.edu/owen/ROPython.html>)
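
A variant dictionary of the kind described in the P.S. might look like the sketch below. This is not the RO.Alg.MultiDict source; the class name `AppendDict` is made up for illustration.

```python
class AppendDict(dict):
    """Dictionary whose values are lists; assignment appends to the list."""

    def __setitem__(self, key, value):
        if key in self:
            # Key already present: extend its existing list.
            self[key].append(value)
        else:
            # First value for this key: store a new one-element list.
            dict.__setitem__(self, key, [value])


a = AppendDict()
a["High Wind Warning"] = "time1"
a["High Wind Warning"] = "time2"
print(a["High Wind Warning"])
```

One caveat with this design: because `a[key] = foo` no longer overwrites, replacing a list wholesale needs an explicit `dict.__setitem__(a, key, new_list)` or a `del a[key]` first.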


      • T. Earle

        #4
        Re: Complex Nested Dictionaries

        Russell,
        > If you really only want to look up data by headline, then a dictionary
        > of dictionaries or nested lists or some other kind of collection is easy
        > and should suffice. For instance:
        > warndict["High Wind Warning"] = (
        >     (time1, {
        >         state1: (zone1, zone2, zone3),
        >         state2: (zone1, zone3),
        >     }),
        >     (time2, {...}),
        > )

        This definitely seems to be the structure I've been looking for, or at least
        have in mind. Since I'm no expert, could you offer some code examples on how
        to create this structure on the fly?
        > However, I suspect you will also want to be able to locate data by
        > state, time or zone. If that is true, I really think you should consider
        > storing the data in a relational database. It sounds like a perfect
        > match to your problem. Python has some nice interfaces to various
        > databases (including PostgreSQL and MySQL).

        My first inclination was to go with a database; however, I thought about it
        and concluded there may be too much variability each time the program is
        executed. For example, there will be times when there are no headlines;
        other times, there will be numerous headlines. Because of this variability,
        the database would have to be created from scratch each time the program is
        run. As a result, would a database still be the right choice?

        I really appreciate your help and suggestions.

        T. Earle



        • Scott David Daniels

          #5
          Re: Complex Nested Dictionaries

          T. Earle wrote:
          > Russell,
          >
          >>If you really only want to look up data by headline, then a dictionary
          >>of dictionaries or nested lists or some other kind of collection is easy
          >>and should suffice. For instance:
          >>warndict["High Wind Warning"] = (
          >>    (time1, {
          >>        state1: (zone1, zone2, zone3),
          >>        state2: (zone1, zone3),
          >>    }),
          >>    (time2, {...}),
          >>)
          > This definitely seems to be the structure I've been looking for, or at least
          > have in mind. Since I'm no expert, could you offer some code examples on how
          > to create this structure on the fly?
          For something very much (but not quite) like the above:

          warndict['High Wind Warning'] = {
              time1: {
                  state1: [zone1, zone2, zone3],
                  state2: [zone1, zone3]},
              time2: {...},
              ...}

          can be built with something like:
          warndict = {}
          for headline, time, state, zone in somesource:
              timedict = warndict.setdefault(headline, {})
              statedict = timedict.setdefault(time, {})
              stateentry = statedict.setdefault(state, [])
              stateentry.append(zone)
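
For comparison, the same headline -> time -> state -> zones structure can also be built with `collections.defaultdict`, a different technique from the `setdefault` chain above. In this sketch the contents of `somesource` and the helper name `times_to_states` are placeholders invented for the example.

```python
from collections import defaultdict

# Hypothetical input: one (headline, time, state, zone) record per zone.
somesource = [
    ("High Wind Warning", "time1", "OR", "ORZ047"),
    ("High Wind Warning", "time1", "OR", "ORZ048"),
    ("High Wind Warning", "time1", "OR", "ORZ049"),
    ("High Wind Warning", "time2", "OR", "ORZ044"),
    ("High Wind Warning", "time2", "WA", "WAZ028"),
]


def times_to_states():
    # time -> state -> list of zones; missing levels are created on access.
    return defaultdict(lambda: defaultdict(list))


warndict = defaultdict(times_to_states)
for headline, time, state, zone in somesource:
    warndict[headline][time][state].append(zone)

print(warndict["High Wind Warning"]["time2"]["WA"])
```

Note that a defaultdict also creates entries when you merely read a missing key, so use `in` or `.get()` for lookups that should not modify the structure.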

          > My first inclination was to go with a database; however, I thought about it
          > and concluded there may be too much variability each time the program is
          > executed. For example, there will be times when there are no headlines;
          > other times, there will be numerous headlines. Because of this variability,
          > the database would have to be created from scratch each time the program is
          > run. As a result, would a database still be the right choice?
          It really depends on the volume of data and the kinds of searches.
          Anything under a thousand or so entries will be searchable by simple
          brute force in reasonable time, so internal data structures may well
          be the way to go.
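
A brute-force search over the nested structure could be as simple as the sketch below. The function name `find_by_zone` and the sample data are assumptions for illustration.

```python
# Sample data in the headline -> time -> state -> zones layout.
warndict = {
    "High Wind Warning": {
        "time1": {"OR": ["ORZ047", "ORZ048", "ORZ049"]},
        "time2": {"OR": ["ORZ044"], "WA": ["WAZ028"]},
    },
}


def find_by_zone(warndict, wanted_zone):
    """Return (headline, time, state) for every entry containing the zone."""
    return [
        (headline, time, state)
        for headline, timedict in warndict.items()
        for time, statedict in timedict.items()
        for state, zones in statedict.items()
        if wanted_zone in zones
    ]


print(find_by_zone(warndict, "WAZ028"))
```

Every query visits every entry, which is fine at the scale discussed here and needs no extra index to keep in sync.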

          --
          -Scott David Daniels
          Scott.Daniels@Acm.Org


          • T. Earle

            #6
            Re: Complex Nested Dictionaries

            Scott,

            I really appreciate your help and code. It really helps me to understand
            the underlying solution to my problem. I have another question though:
            what's the best way to test if the headline already exists? If it does not,
            I need to create it along with the required associated data; however, if it
            already exists, I need to test to ensure I'm not adding data that's
            already there (e.g., time and/or state already exists). Basically, I
            envision that if the state already exists, all I need to do is add the new
            zone. I probably should check to make sure the zone doesn't already exist
            too. Any help would be greatly appreciated. I believe it would be similar to
            what Russell mentioned in his previous response:

            "if list a[key] exists, then append; otherwise, create a new list"

            Would it be possible to supply a code snippet of this logic to get me
            started? What are state and time? Is it possible to use the "key" keyword
            on these variables to test for their existence? I apologize for my lack of
            knowledge in this particular realm of programming in Python. Nested
            dictionaries have always given me trouble.

            Thanks,

            T. Earle
            >
            > warndict['High Wind Warning'] = {
            >     time1: {
            >         state1: [zone1, zone2, zone3],
            >         state2: [zone1, zone3]},
            >     time2: {...},
            >     ...}
            >
            > can be built with something like:
            > warndict = {}
            > for headline, time, state, zone in somesource:
            >     timedict = warndict.setdefault(headline, {})
            >     statedict = timedict.setdefault(time, {})
            >     stateentry = statedict.setdefault(state, [])
            >     stateentry.append(zone)



            • Edward C. Jones

              #7
              Re: Complex Nested Dictionaries


              Check out "MultiDict.py" at "http://members.tripod.com/~edcjones/".

              Ed Jones


              • Irmen de Jong

                #8
                Re: Complex Nested Dictionaries

                T. Earle wrote:
                > "if list a[key] exists, then append; otherwise, create a new list"

                When a is a dict, key is the required key, and object is the value
                you want to insert:

                a.setdefault(key, []).append(object)
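
Extending that one-liner to the nested case, with a check so the same zone is not added twice, might look like this sketch. The function name `add_zone` and its return convention are assumptions; existence is tested with the `in` operator, since Python has no "key" keyword.

```python
def add_zone(warndict, headline, time, state, zone):
    """Add a zone, creating missing levels; return False if already present."""
    zones = (
        warndict.setdefault(headline, {})
        .setdefault(time, {})
        .setdefault(state, [])
    )
    if zone in zones:
        return False  # duplicate, nothing added
    zones.append(zone)
    return True


warndict = {}
first = add_zone(warndict, "High Wind Warning", "time1", "OR", "ORZ047")
again = add_zone(warndict, "High Wind Warning", "time1", "OR", "ORZ047")
print(first, again, warndict)

# Plain existence tests use "in" at each level.
print("High Wind Warning" in warndict)
print("WA" in warndict["High Wind Warning"]["time1"])
```

If order does not matter, storing each state's zones in a set instead of a list would make the duplicate check unnecessary.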

                --Irmen


                • has

                  #9
                  Re: Complex Nested Dictionaries

                  "T. Earle" <tnospamwade@bm i.net> wrote in message news:<40354c35@ news.bmi.net>.. .[color=blue]
                  > To list,
                  >
                  > I'm trying to figure out the best approach to the following problem:
                  >
                  > I have four variables:
                  > 1) headlines
                  > 2) times
                  > 3) states
                  > 4) zones
                  >
                  > At this time, I'm thinking of creating a dictionary, headlinesDB, that
                  > stores different headlines and their associated time(s), state(s), and
                  > zone(s). The complexity is that each headline can have one or more times,
                  > one or more states, and one or more zones. However, there can only be 1
                  > zone per time, and 1 zone per state. What is the best way to tackle this
                  > particular problem?[/color]

                  Shake out non-essential complexity first. Not really up on relational
                  DBs and stuff, so take my attempts at table design with a pinch of
                  salt, but think I'd break your problem down something like this:


                  - Hazard Type Table
                  TYPE
                  High Wind Warning
                  Tornado Warning
                  Blizzard Warning

                  - Hazard Event Table
                  ID  TYPE               START                END                  ZONES
                  1   High Wind Warning  2004-03-01-22-00-00  2004-03-02-08-00-00  [ORZ047, ORZ048, ORZ049]
                  2   High Wind Warning  2004-03-02-12-00-00  2004-03-02-20-00-00  [ORZ044, WAZ028]

                  - Zone Table
                  ZONE    STATE
                  ORZ044  Oregon
                  ORZ047  Oregon
                  ORZ048  Oregon
                  ORZ049  Oregon
                  WAZ028  Washington


                  Note this organises around individual hazard 'events', rather than
                  hazard types, making it easier to see what's going on. Also,
                  because Zones already identify their States, there's no need to put
                  State info into hazard events. (State names, if you need them, can be
                  looked up separately.)

                  How you actually implement it - as a relational DB/a list of
                  HazardEvent instances stuffed into a list and brute-force searched via
                  list comprehensions/nested dicts and lists - really depends on how
                  you're going to manipulate it, how much flexibility/simplicity you
                  need, etc.
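
The "list of HazardEvent instances searched via list comprehensions" option could be sketched as follows, using the rows from the tables above. The field names, the `ZONE_STATE` lookup, and the use of `dataclasses` and `datetime` are illustrative assumptions rather than anything prescribed in the post.

```python
from dataclasses import dataclass
from datetime import datetime

# Zone table: each zone already identifies its state.
ZONE_STATE = {
    "ORZ044": "Oregon",
    "ORZ047": "Oregon",
    "ORZ048": "Oregon",
    "ORZ049": "Oregon",
    "WAZ028": "Washington",
}


@dataclass
class HazardEvent:
    hazard_type: str
    start: datetime
    end: datetime
    zones: list


events = [
    HazardEvent("High Wind Warning", datetime(2004, 3, 1, 22), datetime(2004, 3, 2, 8),
                ["ORZ047", "ORZ048", "ORZ049"]),
    HazardEvent("High Wind Warning", datetime(2004, 3, 2, 12), datetime(2004, 3, 2, 20),
                ["ORZ044", "WAZ028"]),
]

# Brute-force queries: which events touch Washington, and which are
# in effect at a given moment?
in_washington = [
    e for e in events
    if any(ZONE_STATE[z] == "Washington" for z in e.zones)
]
moment = datetime(2004, 3, 2, 0)
active = [e for e in events if e.start <= moment < e.end]
print(len(in_washington), len(active))
```

Because each event is a flat record, adding a new kind of query is one more comprehension rather than a change to the nesting.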

                  HTH

                  has


                  • Russell E. Owen

                    #10
                    Re: Complex Nested Dictionaries

                    In article <4036a303@news.bmi.net>, "T. Earle" <tnospamwade@bmi.net>
                    wrote:

                    >Russell,
                    >
                    >> If you really only want to look up data by headline, then a dictionary
                    >> of dictionaries or nested lists or some other kind of collection is easy
                    >> and should suffice. For instance:
                    >> warndict["High Wind Warning"] = (
                    >>     (time1, {
                    >>         state1: (zone1, zone2, zone3),
                    >>         state2: (zone1, zone3),
                    >>     }),
                    >>     (time2, {...}),
                    >> )
                    >
                    >This definitely seems to be the structure I've been looking for, or at least
                    >have in mind. Since I'm no expert, could you offer some code examples on how
                    >to create this structure on the fly?
                    >
                    >> However, I suspect you will also want to be able to locate data by
                    >> state, time or zone. If that is true, I really think you should consider
                    >> storing the data in a relational database. It sounds like a perfect
                    >> match to your problem. Python has some nice interfaces to various
                    >> databases (including PostgreSQL and MySQL).
                    >
                    >My first inclination was to go with a database; however, I thought about it
                    >and concluded there may be too much variability each time the program is
                    >executed. For example, there will be times when there are no headlines;
                    >other times, there will be numerous headlines. Because of this variability,
                    >the database would have to be created from scratch each time the program is
                    >run. As a result, would a database still be the right choice?
                    >
                    >I really appreciate your help and suggestions

                    Regarding a database: if you are mainly interested in fairly current
                    events (rather than being able to go back and search for old events) and
                    you don't have a huge # of events, then a database does seem "overkill".

                    However, if you have a lot of events or want to do a lot of searching,
                    it may be worth keeping a database around. If you use a database, I
                    recommend creating only one of them. Just add new events, and
                    occasionally purge old data if you don't care about it anymore.

                    Here is some sample code (untested) to create the structure shown above.
                    I assume a simple (for me) structure for the input data; modify
                    addHeadline accordingly if your data needs more massaging first.

                    This code exposes the internal data, because the class is itself the
                    dictionary of data. Whether or not this is a good idea depends on how
                    you want to search for data. If the built-in dict methods are of
                    interest, then you are all set. If not, I would make HeadDict *contain*
                    a dict instead of *being* a dict, then write your own methods to
                    retrieve data.

                    - Download the RO package from http://astro.washington.edu/owen and
                    install it in site-packages or anywhere on your PythonPath. RO includes
                    RO.Alg.ListDict, which supports a dictionary whose values are lists and
                    for which the expression md[key] = value appends "value" to the list
                    associated with "key", creating a new list if "key" doesn't already exist.

                    import RO.Alg

                    class HeadDict(RO.Alg.ListDict):
                        def addHeadline(self, headline, time, stateZoneList):
                            """Add a headline for a given time. stateZoneList is of the form:
                            ((state1, zones_for_state1), (state2, zones_for_state2), ...)
                            """
                            stateZoneDict = dict(stateZoneList)
                            # With ListDict, assignment appends rather than overwrites
                            self[headline] = (time, stateZoneDict)

                    warnDict = HeadDict()
                    warnDict.addHeadline("High Wind Warning", time1, stateZoneList1)
                    warnDict.addHeadline("High Wind Warning", time2, stateZoneList2)
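
The alternative mentioned above, a class that contains a dict rather than being one, might be sketched like this without the RO package. The class name `HeadlineStore` and its method names are made up for illustration.

```python
class HeadlineStore:
    """Keeps the headline data private and exposes only chosen queries."""

    def __init__(self):
        # headline -> list of (time, {state: zones}) entries
        self._data = {}

    def add_headline(self, headline, time, state_zone_list):
        entry = (time, dict(state_zone_list))
        self._data.setdefault(headline, []).append(entry)

    def times_for(self, headline):
        """Return every time recorded for a headline (empty if unknown)."""
        return [time for time, _ in self._data.get(headline, [])]

    def zones_for(self, headline, state):
        """Return all zones of one state across every time of a headline."""
        return [
            zone
            for _, states in self._data.get(headline, [])
            for zone in states.get(state, ())
        ]


store = HeadlineStore()
store.add_headline("High Wind Warning", "time1",
                   [("OR", ["ORZ047", "ORZ048", "ORZ049"])])
store.add_headline("High Wind Warning", "time2",
                   [("OR", ["ORZ044"]), ("WA", ["WAZ028"])])
print(store.times_for("High Wind Warning"))
print(store.zones_for("High Wind Warning", "OR"))
```

Callers only see the query methods, so the internal layout can change later without touching the code that uses it.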

                    -- Russell
