I am new to Python and am confused about why concatenating three
strings isn't working properly.
Here is the code:
------------------------------------------------------------------------------------------
import string
import sys
import re
import urllib
linkArray = []
srcArray = []
website = sys.argv[1]
urllib.urlretrieve(website, 'getfile.txt')
filename = "getfile.txt"
input = open(filename, 'r')
reg1 = re.compile('href=".*"')
reg3 = re.compile('".*?"')
reg4 = re.compile('http')
Line = input.readline()
while Line:
    searchstring1 = reg1.search(Line)
    if searchstring1:
        rawlink = searchstring1.group()
        link = reg3.search(rawlink).group()
        link2 = link.split('"')
        cleanlink = link2[1:2]
        fullink = reg4.search(str(cleanlink))
        if fullink:
            linkArray.append(cleanlink)
        else:
            cleanlink2 = str(website) + "/" + str(cleanlink)
            linkArray.append(cleanlink2)
    Line = input.readline()
print linkArray
-----------------------------------------------------------------------------------------------
I get this:
["http://www.slugnuts.com/['index.html']",
"http://www.slugnuts.com/['movies.html']",
"http://www.slugnuts.com/['ramblings.html']",
"http://www.slugnuts.com/['sluggies.html']",
"http://www.slugnuts.com/['movies.html']"]
instead of this:
["http://www.slugnuts.com/index.html",
"http://www.slugnuts.com/movies.html",
"http://www.slugnuts.com/ramblings.html",
"http://www.slugnuts.com/sluggies.html",
"http://www.slugnuts.com/movies.html"]
The concatenation isn't working the way I expected it to. I suspect
that I am screwing up by mixing types, but I can't see where...
I would appreciate any advice or pointers.
Thanks.
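For reference, here is a minimal sketch that isolates the step where the types appear to get mixed. It reuses the names link2, cleanlink and website from the code above. The sample value of link is an assumption, standing in for one matched href attribute. A slice such as link2[1:2] returns a one-element list, and str() of a list gives its bracketed representation, which would explain the ['...'] in the output. Indexing with link2[1] returns the string itself.

```python
# Hypothetical sample value standing in for one matched href attribute.
link = '"index.html"'
website = "http://www.slugnuts.com"

link2 = link.split('"')        # ['', 'index.html', '']

as_slice = link2[1:2]          # slicing returns a list: ['index.html']
as_item = link2[1]             # indexing returns the string: 'index.html'

# str() of a list yields its printed form, brackets and quotes included.
wrong = website + "/" + str(as_slice)

# A plain string concatenates cleanly, with no str() call needed.
right = website + "/" + as_item

print(wrong)   # http://www.slugnuts.com/['index.html']
print(right)   # http://www.slugnuts.com/index.html
```

If this is the cause, the same slice also affects the branch that appends cleanlink directly, since that branch would be storing a list rather than a string in linkArray.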