Question about concatenation error

  • colonel

    Question about concatenation error

    I am new to python and I am confused as to why when I try to
    concatenate 3 strings, it isn't working properly.

    Here is the code:

    ------------------------------------------------------------------------------------------
    import string
    import sys
    import re
    import urllib

    linkArray = []
    srcArray = []
    website = sys.argv[1]

    urllib.urlretrieve(website, 'getfile.txt')

    filename = "getfile.txt"
    input = open(filename, 'r')
    reg1 = re.compile('href=".*"')
    reg3 = re.compile('".*?"')
    reg4 = re.compile('http')
    Line = input.readline()

    while Line:
        searchstring1 = reg1.search(Line)
        if searchstring1:
            rawlink = searchstring1.group()
            link = reg3.search(rawlink).group()
            link2 = link.split('"')
            cleanlink = link2[1:2]
            fullink = reg4.search(str(cleanlink))
            if fullink:
                linkArray.append(cleanlink)
            else:
                cleanlink2 = str(website) + "/" + str(cleanlink)
                linkArray.append(cleanlink2)
        Line = input.readline()

    print linkArray
    -----------------------------------------------------------------------------------------------

    I get this:

    ["http://www.slugnuts.com/['index.html']",
    "http://www.slugnuts.com/['movies.html']",
    "http://www.slugnuts.com/['ramblings.html']",
    "http://www.slugnuts.com/['sluggies.html']",
    "http://www.slugnuts.com/['movies.html']"]

    instead of this:

    ["http://www.slugnuts.com/index.html",
    "http://www.slugnuts.com/movies.html",
    "http://www.slugnuts.com/ramblings.html",
    "http://www.slugnuts.com/sluggies.html",
    "http://www.slugnuts.com/movies.html"]

    The concatenation isn't working the way I expected it to. I suspect
    that I am screwing up by mixing types, but I can't see where...

    I would appreciate any advice or pointers.

    Thanks.
  • colonel

    #2
    Re: Question about concatenation error

    On Wed, 07 Sep 2005 16:34:25 GMT, colonel <thecamel@camelrichard.org>
    wrote:
    > I am new to python and I am confused as to why when I try to
    > concatenate 3 strings, it isn't working properly.
    > [...]


    Okay. It works if I change:

        fullink = reg4.search(str(cleanlink))
        if fullink:
            linkArray.append(cleanlink)
        else:
            cleanlink2 = str(website) + "/" + str(cleanlink)

    to

        fullink = reg4.search(cleanlink[0])
        if fullink:
            linkArray.append(cleanlink[0])
        else:
            cleanlink2 = str(website) + "/" + cleanlink[0]


    So can anyone tell me why "cleanlink" gets converted to a list? Is it
    during the slicing?


    Thanks.


    • Steve Holden

      #3
      Re: Question about concatenation error

      colonel wrote:
      > [...]
      > so can anyone tell me why "cleanlink" gets converted to a list? Is it
      > during the slicing?

      The statement

      cleanlink = link2[1:2]

      results in a list of one element. If you want to access element one
      (the second in the list) then use

      cleanlink = link2[1]
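
A minimal sketch of the difference, written in Python 3 syntax (the code in this thread is Python 2). The sample string is an assumption standing in for whatever `reg3` matched:

```python
# Sample value standing in for the quoted text matched by reg3 (an assumption).
link = '"index.html"'
link2 = link.split('"')      # ['', 'index.html', '']

sliced = link2[1:2]          # a slice always yields a list, even for one element
indexed = link2[1]           # an index yields the element itself

print(sliced)                # ['index.html']
print(indexed)               # index.html
```

The slice keeps the container type of `link2`, which is why `str()` later produced the brackets and quotes.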

      regards
      Steve
      --
      Steve Holden +44 150 684 7255 +1 800 494 3119
      Holden Web LLC http://www.holdenweb.com/


      • Terry Reedy

        #4
        Re: Question about concatenation error


        "colonel" <thecamel@camelrichard.org> wrote in message
        news:cu5uh1h6nubs7mm0f2fmdc2de58f9nrq87@4ax.com...
        > so can anyone tell me why "cleanlink" gets converted to a list?
        > Is it during the slicing?

        Steve answered for you, but for next time, you could find out faster by
        either using the all-purpose debugging tool known as 'print' or, with
        Python, the handy-dandy interactive window:

        >>> [1,2,3][1:2]
        [2]
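
For the 'print' route, a small sketch of what such a check might look like, in Python 3 syntax. The helper name `describe` and the sample list are hypothetical, introduced only for illustration:

```python
def describe(value):
    """Return the type name and repr of a value, for quick debugging."""
    return "%s: %r" % (type(value).__name__, value)

link2 = ['', 'index.html', '']   # sample data, assumed for illustration
print(describe(link2[1:2]))      # list: ['index.html']
print(describe(link2[1]))        # str: 'index.html'
```

Printing the type next to the value makes a list-versus-string mix-up visible straight away.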

        Terry J. Reedy




        • Terry Hancock

          #5
          Re: Question about concatenation error

          On Wednesday 07 September 2005 11:34 am, colonel wrote:
          > I am new to python and I am confused as to why when I try to
          > concatenate 3 strings, it isn't working properly.
          >
          > Here is the code:

          I'm not taking the time to really study it, but at first
          glance, the code looks like it's probably much more
          complicated than it needs to be.
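
One possible simplification, sketched in Python 3 rather than the Python 2 of the original post. It is only a guess at a shorter shape for the task, not necessarily what was meant above. `extract_links` and `HREF_RE` are hypothetical names, and the sample HTML is made up. The regex group captures the text inside each `href="..."`, and `urllib.parse.urljoin` leaves absolute links alone while resolving relative ones against the base URL:

```python
import re
from urllib.parse import urljoin

# Non-greedy group captures just the text between the quotes of each href.
HREF_RE = re.compile(r'href="(.*?)"')

def extract_links(html, base_url):
    """Return every href in html as an absolute URL (hypothetical helper)."""
    return [urljoin(base_url, href) for href in HREF_RE.findall(html)]

# Made-up sample page: one relative link and one absolute link.
sample = '<a href="index.html">Home</a> <a href="http://example.org/x">X</a>'
print(extract_links(sample, "http://www.slugnuts.com/"))
```

A regex is a fragile way to read HTML; the standard library's `html.parser` module would likely be more robust for real pages.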

          > ["http://www.slugnuts.com/['index.html']",
          > "http://www.slugnuts.com/['movies.html']",
          > "http://www.slugnuts.com/['ramblings.html']",
          > "http://www.slugnuts.com/['sluggies.html']",
          > "http://www.slugnuts.com/['movies.html']"]

          The tail end of that is the string representation of
          a list containing one string, not of that string. I
          suspect you needed to use ''.join() somewhere. Or,
          you could, in principle, have indexed the list, since
          you only want one member of it, e.g.:

          >>> ['index.html'][0]
          'index.html'
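
The three options side by side, as a sketch in Python 3 syntax. The variable names `via_str`, `via_join` and `via_index` are made up for illustration, and the values are taken from the output quoted above:

```python
website = "http://www.slugnuts.com"   # base URL from the thread's output
cleanlink = ['index.html']            # one-element list, as produced by link2[1:2]

via_str = website + "/" + str(cleanlink)        # str() renders the list's repr
via_join = website + "/" + ''.join(cleanlink)   # join yields the bare string
via_index = website + "/" + cleanlink[0]        # indexing picks out the element

print(via_str)     # http://www.slugnuts.com/['index.html']
print(via_join)    # http://www.slugnuts.com/index.html
print(via_index)   # http://www.slugnuts.com/index.html
```

With a one-element list, `''.join()` and `[0]` give the same result. They differ when the list is empty: `''.join([])` returns an empty string, while `[][0]` raises `IndexError`.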

          --
          Terry Hancock ( hancock at anansispacework s.com )
          Anansi Spaceworks http://www.anansispaceworks.com
