parsing names and businesses with asterisk

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • James Armstrong
    New Member
    • Oct 2011
    • 12

    parsing names and businesses with asterisk

    Hello. I'm trying to parse out a large list of names and businesses I have but am running into trouble. I'd like to filter it into First, Middle and Last name fields, and if it has an asterisk in front of it, remove the asterisk and enter everything following it into the last name field. I've had some mild success doing this in excel but am having problems in MS Access. Here is an example of the data:

    Code:
    John Smith
    *Acme Paint
    John Lee Smith
    John D Van Gogh
    *Google
    *Random Business
    Can anybody lend a helping hand?
  • Rabbit
    Recognized Expert MVP
    • Jan 2007
    • 12517

    #2
    What have you tried so far?

    Comment

    • NeoPa
      Recognized Expert Moderator MVP
      • Oct 2006
      • 32645

      #3
      Your example data illustrates that you understand that you don't even know what question you're asking. I suggest that would be the place to start.

      What logic are you hoping to implement here? I assume you appreciate that hoping for a computer to recognise commonly used surnames in the way a human would is a forlorn hope. What, then, are you thinking would be the logic that you are asking us to help you to implement?

      Comment

      • zmbd
        Recognized Expert Moderator Expert
        • Mar 2012
        • 5501

        #4
        James Armstrong:
        Why for this project? Sounds a lot like a homework issue given by the Demon CompSci Profs.

        (I had really evil CompSci Profs, good guys, just vary devious - and this was a real pain in FORTRAN!).

        However, it seems that most people don't have Demons for CompSci professors, let's just think this thing thru:

        Using the data in your example, the parsing will be a pain.

        Just some pseudo/air code:
        Lets start with the "*" that will be simple enough.
        Pull the LEFT 1 of the string compare for "*" and if true then MID string from the 2nd position thru the length and as you want all of this in the last name field - well, put it there.

        As for 3 and 4... here I'd locate the spaces.
        Logic behind normal English based names is that he first space following the leading string is the deliminator separating between the first and either the middle or last names. SO, now you have a logical seperation for the first name thus LEFT pull thru to the space (or MID up to you) and stuff into the First name field.

        Now look for a second separation within the context. A second separation will more than likely be the separation between a middle initial and the last name. Take the length of the MID string between the first separation and the second separation. A single character will normally be a middle initial - Put it there and take the remaining string following the second separation and parse to last name.

        HOWEVER, as in line 3 of the example you have more than one character we'll have to make a few more guesses... check for any more spaces (ie line 4) in which case, the MID string between the first and second spaces is highly likely to be a middle name - compound middle names are not very common; however, they do happen so look for another space. If found then we're dealing with compound middle name. Taking a small leap of faith: as a majority of compound middle names from a Englished base data-set will have only one space we can then use the MID string between the first space and the third spaces and push that data into the middle name field with the remainder pushed into the last name field.

        Now two things:
        You will have to prove that this isn't a homework assignment
        You will have to show what you have for code.
        Once done, I'll go dig my old FORTRAN code out (that may have been in the box that the fire destroyed... I hope not, it was a really brilliant piece of code - it got me out of the mid-term exam!) and translate it to VBA - otherwise, you should have enough to write the code.
        Last edited by zmbd; Jan 26 '13, 10:47 PM.

        Comment

        • ADezii
          Recognized Expert Expert
          • Apr 2006
          • 8834

          #5
          I would take a slightly different approach, such as:
          Code:
          Public Function fAnalyzeName(strName As String) As String
          Dim varName As Variant
          Dim intNumOfSpaces As Integer
          
          varName = Split(strName, " ")
          
          If Left$(varName(0), 1) = "*" Then
            'If Name begins with a '*' remove Asterisk and enter as Last Name
            fAnalyzeName = Replace(varName(0), "*", "")
          Else
            'Determine the Number of Spaces in the Name
            intNumOfSpaces = UBound(varName)
            
            'Do the parsing depending on the Number of Spaces
            Select Case intNumOfSpaces
              Case 0      'No Space in Name
              Case 1      '1 Space in Name, First Name is varName(0),
                          'Last Name will be in varName(1)
              Case 2
                          '1 Space in Name, First Name is varName(0),
                          'MI is in varName(1), Last Name will be in varName(2)
              Case 3
              Case 4
              Case 5
              Case 6
              Case Else
            End Select
          End If
          End Function

          Comment

          • NeoPa
            Recognized Expert Moderator MVP
            • Oct 2006
            • 32645

            #6
            That doesn't deal with the fundamental problem inherent in the question though ADezii. Data of "Edwin Van Gogh" would be mistreated in your code. The point I made was that the question itself hasn't been properly considered. To apply logic, that logic must first be defined and/or understood. That is not the case here.

            Comment

            • James Armstrong
              New Member
              • Oct 2011
              • 12

              #7
              Hi guys, thanks for all of your input. I am pretty sure I know what I'm asking. With a little help I believe I have a working excel macro that does account for names such as "Edwin Van Gogh". Sadly, I need this to work in access and am having a devil of a time translating it. Below is the code:

              Code:
              Sub Macro1()
              
              startrow = 1
              startcol = 1
              
              While Cells(startrow, startcol) <> ""
                  work_name = Cells(startrow, startcol)
                  space1 = ""
                  space2 = ""
                  space3 = ""
                  space4 = ""
                  first_name = ""
                  middle_name = ""
                  last_name = ""
                  If Mid(work_name, 1, 1) <> "*" Then
                      On Error GoTo -1
                      On Error GoTo no_more_spaces
                      space1 = InStr(1, work_name, " ")
                      If space1 <> 0 And space1 <> "" Then
                          space2 = InStr(space1 + 1, work_name, " ")
                      End If
                      If space2 <> 0 And space2 <> "" Then
                          space3 = InStr(space2 + 1, work_name, " ")
                      End If
                      If space3 <> 0 And space3 <> "" Then
                          space4 = InStr(space3 + 1, work_name, " ")
                      End If
                  End If
                  
              no_more_spaces:
                      
                      If space4 > 0 And space4 <= 99 Then
                                      last_name = Mid(work_name, space2 + 1, Len(work_name) - space2)
                                      first_name = Left(work_name, space1 - 1)
                                      middle_name = Mid(work_name, space1 + 1, space2 - space1 - 1)
                      Else
                      If space3 > 0 And space3 <= 99 Then
                                      last_name = Mid(work_name, space2 + 1, Len(work_name) - space2)
                                      first_name = Left(work_name, space1 - 1)
                                      middle_name = Mid(work_name, space1 + 1, space2 - space1 - 1)
                      Else
                      If space2 > 0 And space2 <= 99 Then
                                      last_name = Mid(work_name, space2 + 1, Len(work_name) - space2)
                                      first_name = Left(work_name, space1 - 1)
                                      middle_name = Mid(work_name, space1 + 1, space2 - space1 - 1)
                      Else
                      If space1 > 0 And space1 <= 99 Then
                                      last_name = Mid(work_name, space1 + 1, Len(work_name) - space1)
                                      first_name = Left(work_name, space1 - 1)
                      Else
                                      last_name = work_name
                                      middle_name = ""
                                      first_name = ""
                      End If
                      End If
                      End If
                      End If
                      
                      Cells(startrow, 2) = first_name
                      Cells(startrow, 3) = middle_name
                      Cells(startrow, 4) = last_name
                      
                      
                  startrow = startrow + 1
              Wend
                                  
              
              
              End Sub

              Comment

              • James Armstrong
                New Member
                • Oct 2011
                • 12

                #8
                Definately not a homework assignment, more of a challenge I am facing to make my life easier at work. More of a self teaching moment.

                Comment

                • NeoPa
                  Recognized Expert Moderator MVP
                  • Oct 2006
                  • 32645

                  #9
                  Originally posted by James
                  James:
                  I am pretty sure I know what I'm asking. ... Below is the code:
                  We need you to specify the question James. Dropping a bunch of code in and expecting us to work it out for you is not acceptable. Not very respectful TBF. Please specify your question properly as you have already been requested to do.

                  I appreciate that it is probably not a homework question. That should no longer be an issue. We still need a properly posed question to work from though. I'll leave that with you.

                  Comment

                  • Rabbit
                    Recognized Expert MVP
                    • Jan 2007
                    • 12517

                    #10
                    You can simplify this to:
                    Code:
                    Dim last_name As String
                    Dim middle_name As String
                    Dim first_name As String
                    Dim arrNames
                    
                    If Left(work_name, 1) = "*" Then
                       last_name = work_name 
                    Else
                       arrNames = Split(work_name, " ")
                       first_name = arrNames(0)
                    
                       If UBound(arrNames) = 1 Then
                          last_name = arrNames(1)
                       Else
                          middle_name = arrNames(1)
                          last_name = arrNames(2)
                    
                          If UBound(arrNames) = 3 Then
                             last_name = arrNames(2) & " " & arrNames(3)
                          End If
                       End If
                    End If

                    Comment

                    • James Armstrong
                      New Member
                      • Oct 2011
                      • 12

                      #11
                      I definately don't mean to be rude. I apologize if I have been. I have limited understanding of code that prevents me from explaining further than what you require. I will rephrase in light of the excel code I posted.

                      How do I take the code I posted above and modify it to work in an ms access table? Assuming I have tables of Unparsed - FirstName - MiddleName - LastName.

                      Sorry if I'm not being very clear.

                      Rabbit what is arrNames in your code referring to?

                      Comment

                      • Rabbit
                        Recognized Expert MVP
                        • Jan 2007
                        • 12517

                        #12
                        It holds the array from the split function.

                        Comment

                        • zmbd
                          Recognized Expert Moderator Expert
                          • Mar 2012
                          • 5501

                          #13
                          @James:
                          First the devil of a time you are having between excel and access is probably along the lines of how they relate. One very simple way to think of the relationship is as: Worksheets = tables; Rows = records; Columns = Fields; cells = a specific field within a record.
                          The next mindwarp is that where excel tends to be focused on the cell level up... access is focused from the table level down.
                          With this in mind along with the logic in my post #4, followed by both very good outlines given by ADezii and Rabbit - you should be able to write the VBA code.
                          Last edited by zmbd; Jan 14 '14, 06:45 PM.

                          Comment

                          • James Armstrong
                            New Member
                            • Oct 2011
                            • 12

                            #14
                            Thanks for all of your input guys, I really appreciate it. I'm going to give it a go with all of the information provided and see what I can come up with. I'll update this thread if I can get it working.

                            Comment

                            • NeoPa
                              Recognized Expert Moderator MVP
                              • Oct 2006
                              • 32645

                              #15
                              Originally posted by James
                              James:
                              I definately don't mean to be rude. I apologize if I have been. I have limited understanding of code that prevents me from explaining further than what you require.
                              You can't say fairer than that, and I accept what you say at face value. Unfortunate, possibly, but that's as good a reason as you could want.

                              I won't chime in now, as the other chaps have already much for you to be working with. I will if required though.

                              PS. For what it's worth, your Excel code (Post #7) is fairly basic and treats the separate words in the text as discrete items (I'm not trying to disparage the code, but merely translate the routine into an explanation for you). It would also not handle a surname of "Van Gogh". When you say that this routine works for "Edwin Van Gogh", I assume you mean it separates the items out, but it would certainly not identify the surname as "Van Gogh" (which is, after all, the accurate position). I read your post to indicate this had been correctly handled in this routine. This is the point I was trying to bring your attention to. Until it is addressed you will have a routine that only approximates to your actual stated requirement.

                              Comment

                              Working...