Need some help speeding up this loop

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • erikcw

    Need some help speeding up this loop

    Hi all,

    I'm trying to write a loop that will build a list of "template
    strings".

    My current implementation is *really slow*. It took 15 minutes to
    finish. (final len(list) was about 16k entries.)

    #combinations = 12 small template strings ie "{{ city }},
    {{ state }}..."
    #states = either a django model or a list of 50 states
    #cities = either a django model of 400 cities or a smaller list of
    cities.

    templates = []
    for c in combinations:
    if len(states):
    for state in states:
    if type(geo) is City:
    cities = state.city_set. all()
    else:
    cities = geo
    for city in cities:
    if type(city) is City:
    city = city.city
    templates.appen d(c.template.re place('{{ city }}',
    city))
    templates.appen d(c.template) #just in case there are no
    cities
    templates = [k.replace('{{ state }}',
    state.state).re place('{{ state_abbr }}', state.abbreviat ion) for k in
    templates]
    elif len(geo):
    for city in geo:
    templates.appen d(c.template.re place('{{ city }}', city))
    else:
    #no cities or states so add roots
    templates.appen d(c.template)

    The final output needs to be a list of the templates combined with all
    the states and cities (some templates will only have the city, some
    only the state).

    Any ideas how I can optimize this?

    Thanks!
  • Marc 'BlackJack' Rintsch

    #2
    Re: Need some help speeding up this loop

    On Wed, 29 Oct 2008 19:24:32 -0700, erikcw wrote:
    I'm trying to write a loop that will build a list of "template strings".
    >
    My current implementation is *really slow*. It took 15 minutes to
    finish. (final len(list) was about 16k entries.)
    What is `list` here? Do you mean ``len(templates )``?
    templates = []
    for c in combinations:
    if len(states):
    for state in states:
    if type(geo) is City:
    cities = state.city_set. all()
    else:
    cities = geo
    for city in cities:
    if type(city) is City:
    city = city.city
    templates.appen d(c.template.re place('{{ city }}',
    city))
    templates.appen d(c.template) #just in case there are no
    cities
    templates = [k.replace('{{ state }}',
    state.state).re place('{{ state_abbr }}', state.abbreviat ion) for k in
    templates]
    It seems you are iterating over *all* accumulated templates so far, over
    and over again, even those which don't have those place holders anymore.
    Looks like the source of quadratic runtime for me.

    Ciao,
    Marc 'BlackJack' Rintsch

    Comment

    • Arnaud Delobelle

      #3
      Re: Need some help speeding up this loop

      On Oct 30, 2:24 am, erikcw <erikwickst...@ gmail.comwrote:
      Hi all,
      >
      I'm trying to write a loop that will build a list of "template
      strings".
      >
      My current implementation is *really slow*.  It took 15 minutes to
      finish. (final len(list) was about 16k entries.)
      >
      #combinations = 12 small template strings ie "{{ city }},
      {{ state }}..."
      #states = either a django model or a list of 50 states
      #cities = either a django model of 400 cities or a smaller list of
      cities.
      >
      templates = []
      for c in combinations:
          if len(states):
              for state in states:
                  if type(geo) is City:
                      cities = state.city_set. all()
                  else:
                      cities = geo
                  for city in cities:
                      if type(city) is City:
                          city = city.city
                      templates.appen d(c.template.re place('{{ city }}',
      city))
                  templates.appen d(c.template) #just in case there are no
      cities
                  templates = [k.replace('{{ state }}',
      state.state).re place('{{ state_abbr }}', state.abbreviat ion) for k in
      templates]
      The line above does a lot of unnecessary work as you iterate over lots
      of templates which have already been completely substituted.
          elif len(geo):
              for city in geo:
                  templates.appen d(c.template.re place('{{ city }}',city))
          else:
              #no cities or states so add roots
              templates.appen d(c.template)
      >
      The final output needs to be a list of the templates combined with all
      the states and cities (some templates will only have the city, some
      only the state).
      >
      Any ideas how I can optimize this?
      >
      Thanks!
      I would suggest this (untested):

      def template_replac e(template, values):
      for var, val in values.iteritem s():
      template = template.replac e('{{ %s }}' % var, val)
      return template

      templates = []
      for c in combinations:
      if len(states):
      for state in states:
      values = { 'state': state.state,
      'state_abbr': state.abbreviat ion }
      #just in case there are no cities :
      templates.appen d(template_repl ace(c.template, values)
      if type(geo) is City:
      cities = state.city_set. all()
      else:
      cities = geo
      for city in cities:
      if type(city) is City:
      values['city'] = city.city
      # Do all the replacing in one go:
      templates.appen d(template_repl ace(c.template,
      values))
      elif len(geo):
      for city in geo:
      templates.appen d(template_repl ace(c.template,
      {'city':city}))
      else:
      #no cities or states so add roots
      templates.appen d(c.template)

      HTH

      Even better would be to iterate over states, then cities, then only
      templates.

      --
      Arnaud

      Comment

      Working...