HTML Agility Pack Help

  • mcfly1204
    New Member
    • Jul 2007
    • 233

    I am using the HTML Agility Pack to parse data from a few web pages. I now need to parse a secure web page that requires a login. When I go to the original HTTPS page, I am redirected to a login page; once I log in successfully, I am sent back to the original URL. My thought was that I could go to the login page first, and then scrape the secure page once logged in.

    Code:
    string strUserName = "username";
    string strPassword = "password";

    // Fetch the login form and fill in the credentials
    FormProcessor fp = new FormProcessor();

    Form form = fp.GetForm("https://example.company.com/login.aspx", "//form[@name='form1']", FormQueryModeEnum.Nested);

    form["LoginBox$UserName"].SetAttributeValue("value", strUserName);
    form["LoginBox$Password"].SetAttributeValue("value", strPassword);

    // Submit the login form
    HtmlDocument login = fp.SubmitForm(form);

    // Load the protected page with a separate HtmlWeb instance
    HtmlWeb web = new HtmlWeb();
    HtmlDocument doc = web.Load("https://example.company.com/Orders/view/Default.aspx?GUID=1234567");

    // Begin code to grab data here
    Does anyone have experience with this library, in particular with URLs that require login info?
  • anijos
    New Member
    • Nov 2008
    • 52

    #2
    Hi,

    Did you check this?



    AniJos


    • MisterC
      New Member
      • Nov 2008
      • 2

      #3
      I use a web browser wrapper called csEXWB.

      It can handle all sorts of login methods.
      csEXWB | Google Groups


      • mcfly1204
        New Member
        • Jul 2007
        • 233

        #4
        Yes, notice any similarities?


        • mcfly1204
          New Member
          • Jul 2007
          • 233

          #5
          I should also note that a cookie keeps the session valid. I think I understand how to store the cookie using the HttpWebRequest class, but I am not certain how, or if, I can use that CookieContainer when I am performing a screen scrape with the HTML Agility Pack classes.
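
One way to approach the cookie question above is to do both requests with HttpWebRequest, share a single CookieContainer between them, and only then hand the HTML to the parser. The following is a minimal sketch under assumptions, not a tested solution for this site: the class and method names (CookieSessionSketch, BuildFormBody, Login, FetchHtml) are made up for illustration, and it assumes the login form accepts a plain form-encoded POST. An ASP.NET login page will often also expect its hidden fields (for example __VIEWSTATE) to be posted back, which this sketch does not do.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class; none of these names come from the HTML Agility Pack.
public static class CookieSessionSketch
{
    // URL-encodes form fields into an application/x-www-form-urlencoded body.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var sb = new StringBuilder();
        foreach (var field in fields)
        {
            if (sb.Length > 0) sb.Append('&');
            sb.Append(Uri.EscapeDataString(field.Key))
              .Append('=')
              .Append(Uri.EscapeDataString(field.Value));
        }
        return sb.ToString();
    }

    // POSTs the login form. Cookies set by the server are stored in `cookies`.
    public static void Login(string loginUrl, string formBody, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(loginUrl);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookies;

        byte[] payload = Encoding.UTF8.GetBytes(formBody);
        request.ContentLength = payload.Length;
        using (Stream body = request.GetRequestStream())
        {
            body.Write(payload, 0, payload.Length);
        }

        // The response body is not needed here; sending the request is enough
        // for the container to pick up the session cookie.
        using (var response = (HttpWebResponse)request.GetResponse())
        {
        }
    }

    // GETs a page, sending whatever cookies the container already holds.
    public static string FetchHtml(string url, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = cookies;
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With this in place, the flow would be: create one CookieContainer, call Login with the login URL and a body built from the LoginBox$UserName and LoginBox$Password fields, call FetchHtml for the Orders URL with the same container, and pass the returned string to HtmlDocument.LoadHtml. If you would rather keep using HtmlWeb.Load, assigning the same container to the request inside an HtmlWeb.PreRequest handler should have a similar effect, though I have not verified that against this particular site.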
