Hi All,
(I originally posted this in the .NET Programming Languages forum, but I realised that the fact I'm using C# is irrelevant, as I think the problem I'm having is with DHTML / JavaScript.)
So here goes again:
I'm writing an app in C# that will be doing a bit of web scraping. I've got a fair bit of experience with this, but I've come across an issue with the HTML I'm getting back from the pages.
When I view the page I'm trying to scrape in a browser, all is well, and I can see a 'Services' section at the bottom of the page. However, when I do a View Source on the page, the corresponding markup for that part of the page is missing. I've realised that the reason may be that the page has a JavaScript function call where the 'Services' section should be. I've found the corresponding function in an included .js file, and it comes from www.dhtmlgoodies.com. Here is the call from the page:
Code:
<script type='text/javascript'>initTabs('dhtmlgoodies_tabView',Array('Current Services','Services Overview','Notes<img src="temp_files/rosettes/notes_0.png" style="vertical-align: text-bottom;margin-top:4px;padding-top:0;margin-left:5px;float:none">','Photos','Attachments','Services Tree','Services Detail','Organisation Tree','Sales Tasks','Incidents<img src="temp_files/rosettes/openincidents_0.png" style="vertical-align: text-bottom;margin-top:4px;padding-top:0;margin-left:5px;float:none">','Orders<img src="temp_files/rosettes/openorders_1.png" style="vertical-align: text-bottom;margin-top:4px;padding-top:0;margin-left:5px;float:none">','Ceases','POs'),0,'100%','');</script>
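One thing the raw response does contain is the list of tab labels, since they are passed as string arguments to initTabs. A minimal sketch, assuming the call always has the shape shown above (single-quoted labels inside Array(...)); the helper name extractTabLabels and the shortened sample string are made up for illustration, and this only recovers the labels, not the content of the 'Services' section itself. The same pattern could presumably be ported to a C# Regex.

```javascript
// Hypothetical helper (not part of the dhtmlgoodies script): pulls the tab
// labels out of an initTabs(...) call found in raw, un-executed HTML.
function extractTabLabels(html) {
  // Capture the argument list of Array(...) inside the initTabs call.
  // Assumes no label contains a ")" directly followed by a comma.
  const call = html.match(/initTabs\(\s*'[^']*'\s*,\s*Array\(([\s\S]*?)\)\s*,/);
  if (!call) return [];

  const labels = [];
  // Each label is a single-quoted string; embedded markup uses double quotes.
  const item = /'([^']*)'/g;
  let m;
  while ((m = item.exec(call[1])) !== null) {
    // Drop embedded tags such as the <img> badges and keep only the text.
    labels.push(m[1].replace(/<[^>]*>/g, '').trim());
  }
  return labels;
}

// Shortened version of the call from the page, for illustration only.
const sample =
  "<script type='text/javascript'>initTabs('dhtmlgoodies_tabView'," +
  "Array('Current Services','Notes<img src=\"temp_files/rosettes/notes_0.png\">','POs')," +
  "0,'100%','');</script>";

// Logs the three labels: Current Services, Notes, POs
console.log(extractTabLabels(sample));
```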
So my problem is this: when viewing the page in a browser (and doing a View Source), I can't see the actual HTML that creates this 'Services' section, and thus when I make my HttpWebRequest I don't get that HTML either.
In summary, how can I get ALL the source information from a WebRequest? Do I somehow get the WebRequest to run the JavaScript function and return the data? If so, how?
I'd very much appreciate any help!
Regards,
Andy