zip distance in a square, not radius

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Xenophobe

    zip distance in a square, not radius

    I have successfully created a zip code radius search, but the performance is
    unacceptable.

    I have two tables. The first is 52K zip codes w/lat and long. The second is
    3K national business addresses.

    Using the zip code 10010, it takes almost 60 seconds to return 122
    businesses within a 25 mile radius. This slowdown is almost entirely due to
    the query radius distance formula.

    I wonder if someone knows the formula to calculate the distance within a
    given square. Would this increase performance? I presume the results would
    include zips in the "corners", but performance boost might be worth it.

    Thanks in advance.


  • Chung Leong

    #2
    Re: zip distance in a square, not radius

    The trick is to use the box to quickly filter out the ones that can't
    possibly satistfy the condition, before doing the more expensive radius
    calculation.

    First of all, add the latitude and longtitude of each business to the second
    table. Since these are going to stay the same through difference searches,
    we can do the look up in advance. Be sure to put indexes on the columns.

    Using the lat., long. of the zip code provide by the user, calculate the box

    $left = $long - $dist; $right = $long + $dist;
    $top = $lat + $dist; $bottom = $lat - $dist;

    and use it in the query to exclude those businesses outside the box.

    $result = mysql_query("SE LECT * FROM business
    WHERE ($left < long) AND (long < $right)
    AND ($bottom < lat) AND (lat < $top);

    Now, if you want, you can do a radius check one each row. It's probably
    unnecessary though, since you don't actually know precise location of the
    user and the businesses.


    Uzytkownik "Xenophobe" <xenophobe@plan etx.com> napisal w wiadomosci
    news:sSAZb.9126 6$jk2.439590@at tbi_s53...[color=blue]
    > I have successfully created a zip code radius search, but the performance[/color]
    is[color=blue]
    > unacceptable.
    >
    > I have two tables. The first is 52K zip codes w/lat and long. The second[/color]
    is[color=blue]
    > 3K national business addresses.
    >
    > Using the zip code 10010, it takes almost 60 seconds to return 122
    > businesses within a 25 mile radius. This slowdown is almost entirely due[/color]
    to[color=blue]
    > the query radius distance formula.
    >
    > I wonder if someone knows the formula to calculate the distance within a
    > given square. Would this increase performance? I presume the results would
    > include zips in the "corners", but performance boost might be worth it.
    >
    > Thanks in advance.
    >
    >[/color]


    Comment

    • Steven C. Gallafent

      #3
      Re: zip distance in a square, not radius

      "Xenophobe" <xenophobe@plan etx.com> wrote in message
      news:sSAZb.9126 6$jk2.439590@at tbi_s53...[color=blue]
      > Using the zip code 10010, it takes almost 60 seconds to return 122
      > businesses within a 25 mile radius. This slowdown is almost entirely due[/color]
      to[color=blue]
      > the query radius distance formula.
      >
      > I wonder if someone knows the formula to calculate the distance within a
      > given square. Would this increase performance? I presume the results would
      > include zips in the "corners", but performance boost might be worth it.[/color]

      I've got a system where I do something like this, but I precalculate the
      distances. Whenever I update the ZIP database, I do something similar to the
      following:

      1) Calculate the distance (straight-line according to the particular formula
      I use) between every pair of ZIP codes.

      In my case, I'm only using ZIP codes for Long Island and the five boroughs,
      so I've only got a few thousand, resulting in a manageable table. If you did
      this for the entire national database, you're talking about LOTS of rows.

      The table looks like this:

      CREATE TABLE ZIPDistance (
      ZIPCode1 DECIMAL(5),
      ZIPCode2 DECIMAL(5),
      Distance DECIMAL(3)
      )

      By convention, I always store the "lower" of the two ZIP codes in ZIPCode1.

      2) Delete everything in the table where the distance is over my maximum
      lookup distance.

      You can now quickly get all of the businesses within a certain distance with
      a query along these lines:

      SELECT * FROM Business WHERE ZIPCode IN
      (
      SELECT ZIPCode2 FROM ZIPDistance WHERE ZIPCode1 = 11787 AND Distance <= 10
      UNION
      SELECT ZIPCode1 FROM ZIPDistance WHERE ZIPCode2 = 11787 AND Distance <= 10
      UNION
      SELECT 11787
      )

      Note: I'm running on SQL Server for this project, so I can write these sorts
      of queries. You'll need to come up with a working method for MySQL. That is
      left as an exercise for the interested student. :)

      Steve
      --
      Steven C. Gallafent - The Computer Guy
      steve@compguy.c om - http://www.compguy.com/


      Comment

      • Xenophobe

        #4
        Re: zip distance in a square, not radius

        Thank you for your reply.

        I discovered the root of my problem was the lack of proper indexing. This
        was THE reason for the poor performance, not the query radius calculation
        math.

        A 25 mile radius search around 10010 that took 60 seconds before, but now
        only take 2 seconds.

        The moral of this story is index, index, index.

        "Chung Leong" <chernyshevsky@ hotmail.com> wrote in message
        news:M_CdnfhyoP Yo_qrdRVn-vw@comcast.com. ..[color=blue]
        > The trick is to use the box to quickly filter out the ones that can't
        > possibly satistfy the condition, before doing the more expensive radius
        > calculation.
        >
        > First of all, add the latitude and longtitude of each business to the[/color]
        second[color=blue]
        > table. Since these are going to stay the same through difference searches,
        > we can do the look up in advance. Be sure to put indexes on the columns.
        >
        > Using the lat., long. of the zip code provide by the user, calculate the[/color]
        box[color=blue]
        >
        > $left = $long - $dist; $right = $long + $dist;
        > $top = $lat + $dist; $bottom = $lat - $dist;
        >
        > and use it in the query to exclude those businesses outside the box.
        >
        > $result = mysql_query("SE LECT * FROM business
        > WHERE ($left < long) AND (long < $right)
        > AND ($bottom < lat) AND (lat < $top);
        >
        > Now, if you want, you can do a radius check one each row. It's probably
        > unnecessary though, since you don't actually know precise location of the
        > user and the businesses.
        >
        >
        > Uzytkownik "Xenophobe" <xenophobe@plan etx.com> napisal w wiadomosci
        > news:sSAZb.9126 6$jk2.439590@at tbi_s53...[color=green]
        > > I have successfully created a zip code radius search, but the[/color][/color]
        performance[color=blue]
        > is[color=green]
        > > unacceptable.
        > >
        > > I have two tables. The first is 52K zip codes w/lat and long. The second[/color]
        > is[color=green]
        > > 3K national business addresses.
        > >
        > > Using the zip code 10010, it takes almost 60 seconds to return 122
        > > businesses within a 25 mile radius. This slowdown is almost entirely due[/color]
        > to[color=green]
        > > the query radius distance formula.
        > >
        > > I wonder if someone knows the formula to calculate the distance within a
        > > given square. Would this increase performance? I presume the results[/color][/color]
        would[color=blue][color=green]
        > > include zips in the "corners", but performance boost might be worth it.
        > >
        > > Thanks in advance.
        > >
        > >[/color]
        >
        >[/color]


        Comment

        • Jochen Daum

          #5
          Re: zip distance in a square, not radius

          Hi,

          On Sat, 21 Feb 2004 04:10:00 GMT, "Xenophobe" <xenophobe@plan etx.com>
          wrote:
          [color=blue]
          >I have successfully created a zip code radius search, but the performance is
          >unacceptable .
          >
          >I have two tables. The first is 52K zip codes w/lat and long. The second is
          >3K national business addresses.
          >
          >Using the zip code 10010, it takes almost 60 seconds to return 122
          >businesses within a 25 mile radius. This slowdown is almost entirely due to
          >the query radius distance formula.
          >[/color]
          What about the suggestion I made last time you asked? It would be
          heaps faster, because it doesn't do any calculation

          HTH, Jochen


          --
          Jochen Daum - Cabletalk Group Ltd.
          PHP DB Edit Toolkit -- PHP scripts for building
          database editing interfaces.
          Download PHP DB Edit Toolkit for free. PHP DB Edit Toolkit is a set of PHP classes makes the generation of database edit interfaces easier and faster. The main class builds tabular and form views based on a data dictionary and takes over handling of insert/update/delete and user input.

          Comment

          Working...