Get the filename from a path (URL) sorted without repeated result

**dev7060** · Jul 14 '20, 10:08 PM

What have you done so far? I can think of many ways.

- Store the part of the link after the final '/' in a string and check if the extension is .pdf.
- Store the whole link in a string, read it from the back, and check whether it starts with "fdp." (".pdf" reversed).
- Search for ".pdf" in the entire file. For each occurrence, copy the characters before it backward into a string until a '/' is found (filenames can't contain a slash).
etc...
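
The first approach above could be sketched in Python roughly like this. The function name `pdf_name` and the sample URLs are made up for illustration, and treating the extension case-insensitively is an assumption, not something the question specifies.

```python
def pdf_name(url):
    """Return the part after the final '/' if it ends in .pdf, else None."""
    # rpartition splits at the last '/'; if there is no '/', the whole
    # string ends up in the last field
    name = url.rpartition('/')[2]
    # Assumption: treat .PDF and .pdf alike
    if name.lower().endswith('.pdf'):
        return name
    return None


# Hypothetical sample links
print(pdf_name('http://example.com/docs/report.pdf'))  # report.pdf
print(pdf_name('http://example.com/docs/index.html'))  # None
```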

> sorted without repeated result

Keep storing the required names in a string array. When it's all done, remove the duplicate elements (or check whether a value is already present in the array before inserting it), and then sort the array.
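
A minimal Python sketch of the "check before insertion, then sort" variant, assuming the names have already been extracted. The list `found` and its contents are hypothetical.

```python
# Hypothetical extracted names, with one repeat
found = ['b.pdf', 'a.pdf', 'b.pdf', 'c.pdf']

names = []
for name in found:
    # Check before insertion so a repeated name is stored only once
    if name not in names:
        names.append(name)

# Sort in place once everything has been collected
names.sort()
print(names)  # ['a.pdf', 'b.pdf', 'c.pdf']
```

The `in` test on a list scans it from the start each time, which is fine for a small file; for a large one, a set does the same check faster.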

**SioSio** · Jul 15 '20, 01:42 AM

The only annoyance with this process is that there are multiple URLs on one line.
After reading the entire text, treat the newlines like spaces: split the text on whitespace and store the pieces in a list.
After that, write the code as shown by dev7060.

Code:

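# The with-block closes the file automatically (the bare f.close never called it)
with open(r'url.txt', 'r') as f:
    line = f.read()
# split() with no argument splits on any run of whitespace, so newlines and
# spaces are handled together and no empty trailing entry needs to be popped
url_list = line.split()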

There are three ways to extract the file name from the URL.

Code:

    # 'parts' avoids shadowing the built-in list type
    parts = url.split('/')
    fname = parts[-1]

Code:

    fname = url.rsplit('/',1)[1]

Code:

    fname = url[url.rfind('/')+1:]
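
As a quick check, the three variants can be run side by side. This is only a sketch: the URL is a made-up example, and the second half shows how the variants behave when the string contains no '/' at all.

```python
url = 'http://example.com/files/manual.pdf'  # hypothetical URL

by_split = url.split('/')[-1]
by_rsplit = url.rsplit('/', 1)[1]
by_rfind = url[url.rfind('/') + 1:]
print(by_split, by_rsplit, by_rfind)  # manual.pdf manual.pdf manual.pdf

# With no '/' in the string the variants differ
bare = 'manual.pdf'
print(bare.split('/')[-1])         # manual.pdf
print(bare[bare.rfind('/') + 1:])  # manual.pdf (rfind returns -1, so the slice starts at 0)
try:
    bare.rsplit('/', 1)[1]
except IndexError:
    # rsplit returns a one-element list here, so index 1 does not exist
    print('rsplit variant raises IndexError')
```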

For sorting (sorted() or list.sort()) and for deleting duplicate elements (set()), use the built-in functions.

Code:

new_file_list = sorted(set(file_list))
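
Putting the steps together, a rough end-to-end sketch might look like this. To keep it self-contained it uses an in-memory string (`text`, with made-up URLs) in place of url.txt, and keeping only names ending in .pdf is an assumption carried over from dev7060's reply.

```python
# Hypothetical file contents: several URLs per line, with repeats
text = (
    'http://example.com/a/zeta.pdf http://example.com/b/alpha.pdf\n'
    'http://example.com/c/zeta.pdf http://example.com/d/page.html\n'
)

# Split on any whitespace so URLs sharing a line are separated too
url_list = text.split()

# Take everything after the last '/' of each URL
file_list = [url[url.rfind('/') + 1:] for url in url_list]

# Assumption: only .pdf names are wanted
file_list = [name for name in file_list if name.endswith('.pdf')]

# set() drops the repeats, sorted() returns a sorted list
new_file_list = sorted(set(file_list))
print(new_file_list)  # ['alpha.pdf', 'zeta.pdf']
```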
