I am using the HTML Agility Pack to parse data from a few web pages. I now need to parse a secure web page that requires a login. When I go to the original HTTPS page, I am redirected to a login page; once I log in correctly, I am sent back to the original URL. My thought was that I could go to the login page first, submit the login form, and then scrape the secure page once logged in.
Does anyone have experience with this library, in particular with URLs that require login info?
Code:
string strUserName = "username";
string strPassword = "password";
// Fetch the login form and fill in the credentials
FormProcessor fp = new FormProcessor();
Form form = fp.GetForm("https://example.company.com/login.aspx", "//form[@name='form1']", FormQueryModeEnum.Nested);
form["LoginBox$UserName"].SetAttributeValue("value", strUserName);
form["LoginBox$Password"].SetAttributeValue("value", strPassword);
HtmlDocument login = fp.SubmitForm(form);
// Load the secure page after logging in
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load("https://example.company.com/Orders/view/Default.aspx?GUID=1234567");
// begin code to grab data here