Street Address Clean up using sql server

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • wamap
    New Member
    • Dec 2010
    • 6

    Street Address Clean up using sql server

    Hello ,

    Can any one assist. I have a Street Address file and i want to split the information in the file into different columns.

    For example "100 SW 14th Street N, Suite 120"

    First. The script will capture the 100 into a new column called StreetNo. Was thinking of a scenario where the script does a count until it finds a space then stops.

    Second. The script will extract "SW" into another column called DirectionPrefix . So if it find any N, E, W, S, NE, NW, SE, SW immediately after the streetNo, it should capture that value and store it the the DirectionPrefix column.

    Thirdly. All the value found immediately after the N, E, W, S, NE, NW, SE, SW, if there are any, should be store in a new column (StreetName). The scripts does the count until it find a Street Type (Road, Avenue, Blvd). but this value will be include a street type.As for the example above - Street.

    Fourth. The script will capture the street type for that road. as for the example above - Street.

    Fifth. Search if any N, E, W, S, NE, NW, SE, SW exist immediately after the street type. if yes, capture and place that into another column.

    Sixth. Everything after the Fifth step should be capture and placed into the sixth columm.

    Thank you.
  • E11esar
    New Member
    • Nov 2008
    • 132

    #2
    Use of a substring with start and end parameters should cater for what you need, with end parameter being the length of the current string part, e.g 100 would have length 3 and so on.

    Remember to start at position 0.

    Comment

    • wamap
      New Member
      • Dec 2010
      • 6

      #3
      Thanks. But the issue is the address number vary from one nemeric value to six numeric values.

      Comment

      • E11esar
        New Member
        • Nov 2008
        • 132

        #4
        How about using a SQL split function with a space character as the delimeter?

        Comment

        • ck9663
          Recognized Expert Specialist
          • Jun 2007
          • 2878

          #5
          It's actually more than that. You have to consider the St (for Street) or misspelled Stret or St. (with period). The Suite, Ste, Unit, # can all mean the same, etc...

          Once you got the algorithm done, why don't you post it here ;)

          Happy Coding!!!

          ~~ CK

          Comment

          • wamap
            New Member
            • Dec 2010
            • 6

            #6
            I hope this explains in details my aims.

            Dataset sample -
            609 14th st. NW
            6115 NW Campbell River Blvd Suite 108
            61346 John Doe NE

            a. First column with only contain the street Number until it find a space that separate the number and the strings in the address column. As in the case above , the result will be
            StreetNo
            1. 609
            2. 6115
            3. 61346

            b. Second. The script will extract all possible combinations of directions where applicable. So if it find any N, E, W, S, NE, NW, SE, S.W etc immediately after the streetNo, it should capture that value and store it the Prefix column. I believe all combination would have to be pre-defined in a variable or kind of a loop of something.
            Prefix
            1. null
            2. NW
            3. null

            c. the string found immediately the pre-defined combinations in step b, should be store in a new column (StreetName). The scripts does the search until it find a Street Type (Road, Avenue, Blvd, Street, St) or a combination type as defind in step b above. The street type will also be pre-defined. In this case,
            1. 14th
            2. Camp Bowie
            3. John Doe
            d. the fourth column will capture the street type for that road. In this case,
            1. st.
            2. blvd
            3. null
            e. the fifth column will capture any direction type or combination found after the street type defined above. The direction type/combination used here are similar to those defined in step b above. In this case we would have,
            1. NW
            2. Null
            3. NE

            f. the last column captures the other strings/value right after the e step. The format are Suite 123, 233, Unit 4a etc. In this case
            1. Null
            2.Suite 108
            3.Null

            Comment

            • E11esar
              New Member
              • Nov 2008
              • 132

              #7
              Are you doing this within SQl (as a function or stored procedure) or in a piece of code, such as C#? If in code then you can split each address line into a string array, and use the .replace method to change the likes of St or St. to Street.
              Again SQL offers a "replace" method for the same purpose.

              Comment

              • wamap
                New Member
                • Dec 2010
                • 6

                #8
                I intend to do this with SQL Server Management Studio.

                Comment

                • E11esar
                  New Member
                  • Nov 2008
                  • 132

                  #9
                  Take a look at the following, it might help:

                  Has it really been over a week since last I blogged? Oh well ... I've been busy both at work and at home. Something I've been working on of late is a better routine to split a string into individual words in T-SQL. You might remember my previous effort , which


                  In effect it is just using substring to build / extract the words but ultimately returning a table. You can then use that table to populate your own in the database.

                  Comment

                  • wamap
                    New Member
                    • Dec 2010
                    • 6

                    #10
                    Thanks but the link you sent is either missing, deleted or cant be found.

                    Comment

                    • E11esar
                      New Member
                      • Nov 2008
                      • 132

                      #11
                      "http://madprops.org/blog/Splitting-Text-Into-Words-in-SQL-Revisited/"

                      I just tried that and it worked.

                      Comment

                      • wamap
                        New Member
                        • Dec 2010
                        • 6

                        #12
                        Yes the link to the website works. Thanks.

                        Comment

                        • SmartyMike
                          New Member
                          • Dec 2011
                          • 1

                          #13
                          Parsing a street address is tricky business. There are many components to an address, with many exceptional situations requiring practical knowledge of USPS addressing standards. Once that hurdle is crossed it's usually in your best interest to know if that address is actually real and deliverable (mail would get there). This can only be accomplished if you have access to DPV-certified software (Delivery Point Validation). Combine this with the fact that address data changes all the time and you have a difficult problem.

                          The good news is that it's a solved problem.

                          In the interest of full disclosure I'm a software developer at SmartyStreets, an address verification software company. We have a few services that could help, one for processing pre-existing lists and another for live data entry that accomplish all of the above. Feel free to give the products a try (we offer a free trial with each product)--you don't pay until you are comfortable with what you are getting. Drop us a line (support at smartystreets dot com)!

                          Comment

                          Working...