Scraping Just Images in C#

**Bassem** · Nov 13 '09, 01:52 PM

You have solved one case out of three, not one out of two!

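Pay attention to this:
The src attribute of the img element holds a URL, and its value takes one of three forms:
1. Fully qualified URL (scheme and host name included).
2. Absolute path (begins with "/" and is resolved against the site root).
3. Relative path (resolved against the directory of the current page).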

You have solved the first type; two more remain.

Anyway, consider this method:
1. You have "url_searchBox_txt.Text". It contains the page URL, which always includes the domain name (host name), so you can split the domain name out of it.
2. Extract the img's src property and check how the value begins:
   If it begins with the scheme and a host name (for example "http://www.domain.com/"), it is type #1.
   Else if it begins with a "/" slash, it is type #2.
   Else, it is type #3.
3. Handle each type:
   For type #1: use the value as it is.
   For type #2: insert the scheme and domain name at the start of the value. That's it, very simple.
   For type #3: Oh, now you have a problem. The value is relative to the directory of the page that contains it, so the domain name alone is not enough; you have to combine the value with the path of the page URL.
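
The steps above could be sketched roughly as follows. This is only a minimal sketch, assuming the page URL is available as a string and that fully qualified values use http or https. The names SrcKind, ImageUrlResolver, Classify and Resolve are made up for illustration and do not come from the original code. It relies on the System.Uri constructor that takes a base Uri and a relative string, which covers all three types, including type #3.

```csharp
using System;

// Hypothetical names for illustration; none of them come from the original posts.
public enum SrcKind { FullyQualified, AbsolutePath, RelativePath }

public static class ImageUrlResolver
{
    // Classifies a raw src value into the three types from the list above.
    // Assumption: fully qualified values start with http:// or https://.
    public static SrcKind Classify(string src)
    {
        if (src.StartsWith("http://", StringComparison.OrdinalIgnoreCase) ||
            src.StartsWith("https://", StringComparison.OrdinalIgnoreCase))
        {
            return SrcKind.FullyQualified; // type #1
        }
        if (src.StartsWith("/", StringComparison.Ordinal))
        {
            return SrcKind.AbsolutePath;   // type #2
        }
        return SrcKind.RelativePath;       // type #3
    }

    // Resolves a src value of any type against the URL of the page it was found on.
    // Uri(Uri, string) keeps a fully qualified value unchanged, replaces the whole
    // path for a "/" value, and replaces only the last path segment for a relative value.
    public static string Resolve(string pageUrl, string src)
    {
        var pageUri = new Uri(pageUrl);
        return new Uri(pageUri, src).AbsoluteUri;
    }
}
```

For a page at http://www.example.com/gallery/index.html (an example address), Resolve turns thumbs/01.jpg into http://www.example.com/gallery/thumbs/01.jpg and /images/01.jpg into http://www.example.com/images/01.jpg, so Classify is only needed if you want to treat the types differently.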

Thanks,
Bassem

**swapan das** · Sep 29 '10, 08:06 AM

The problem is simple. A web page can load an image or media file from its own server or from a remote server. When the page loads an image from an external server, the image URL looks like:
<img src="http://www.domain.com/01.jpg"></img>
But when the page loads an image from its own server, the image reference looks like:
<img src="/images/01.jpg">
So to fix the problem, just add the site's http URL at the beginning, for example: "http://www.google.com" + img_result
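
Instead of hard-coding the site address, the prefix could be taken from the page URL itself. The following is a small sketch under that assumption; HostPrefixer and PrefixWithHost are hypothetical names, and only values that begin with "/" are changed, as in the example above.

```csharp
using System;

// Hypothetical helper for illustration; not part of the original code.
public static class HostPrefixer
{
    // Prepends the scheme and host of the page URL to a src value that starts with "/".
    // Any other value is returned unchanged.
    public static string PrefixWithHost(string pageUrl, string imgSrc)
    {
        var page = new Uri(pageUrl);

        // GetLeftPart(UriPartial.Authority) yields scheme + host (and any
        // non-default port) without a trailing slash, e.g. "http://www.google.com".
        string origin = page.GetLeftPart(UriPartial.Authority);

        return imgSrc.StartsWith("/", StringComparison.Ordinal)
            ? origin + imgSrc
            : imgSrc;
    }
}
```

Because the origin has no trailing slash and the image path already starts with one, the concatenation does not produce a double slash.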
