Grab Data Displayed On Other Web Page Using document.write

  • '69 Camaro

    Grab Data Displayed On Other Web Page Using document.write

    Perhaps I'm Googling for the wrong terms. Does anyone have links to
    examples of the syntax necessary to read the HTML on another Web page when
    that HTML is produced from JavaScript using the document.write() method?

    For a simplified example, I have two Web pages. Page 1 uses JavaScript with
    the following:

    htmlData = "<B>This is bold text.</B>";
    document.write(htmlData);

    Page 1 displays in bold text:

    This is bold text.

    Page 2 needs to get the markup for page 1, i.e., just "<B>This is bold
    text.</B>" (which includes the tags), not the JavaScript code listed above.
    I've tried using the responseText property of the MSXML2.XMLHTTP.3.0 object,
    but it gives me the JavaScript used for rendering page 1, not the markup
    (that's stored in the htmlData variable).

    The ultimate goal is to grab the data displayed on a Web page and display
    only the items needed on another Web page. I can parse the HTML based upon
    the tags to target exactly the data I want. That's why I need to read the
    tags.
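A minimal sketch of the tag-based parsing described above, assuming the markup is already available as a string. The function name extractTagContents is hypothetical, and the regular expression is only a rough stand-in for a real HTML parser: it assumes a plain alphanumeric tag name and does not handle nested tags of the same name.

```javascript
// Collects the inner content of every <tagName>...</tagName> pair in a
// markup string. Matching is case-insensitive, so "b" also finds <B>.
function extractTagContents(markup, tagName) {
  var pattern = new RegExp(
    '<' + tagName + '\\b[^>]*>([\\s\\S]*?)</' + tagName + '>', 'gi');
  var results = [];
  var match;
  while ((match = pattern.exec(markup)) !== null) {
    results.push(match[1]); // capture group 1 is the text between the tags
  }
  return results;
}
```

Called as extractTagContents("<B>This is bold text.</B>", "b"), it returns an array holding the single string "This is bold text.".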

    Suggestions for other approaches are welcome. I'm the author of both Web
    pages, so I have some leeway. Preference is for client-side JavaScript. I
    have some experience in JavaScript, but I'm a C/Java programmer.
    Cross-platform compatibility is preferred, and the majority of browsers will
    be IE6. I can also run Perl scripts on the Web server, but I have little
    experience with Perl, so this would be an opportunity to learn more. In
    case it matters, the Web server is Apache on Linux.

    Thanks.

    Gunny


  • Martin Honnen

    #2
    Re: Grab Data Displayed On Other Web Page Using document.write



    '69 Camaro wrote:

    > For a simplified example, I have two Web pages. Page 1 uses JavaScript with
    > the following:
    >
    > htmlData = "<B>This is bold text.</B>";
    > document.write(htmlData);
    >
    > Page 1 displays in bold text:
    >
    > This is bold text.
    >
    > Page 2 needs to get the markup for page 1, i.e., just "<B>This is bold
    > text.</B>" (which includes the tags), not the JavaScript code listed above.
    > I've tried using the responseText property of the MSXML2.XMLHTTP.3.0 object,
    > but it gives me the JavaScript used for rendering page 1, not the markup
    > (that's stored in the htmlData variable).

    That script code is being executed by a browser/user agent when it
    renders the HTML document, so one way to access that content with script
    is to use script to automate a browser. On Windows, Microsoft IE can be
    automated so you use a Windows Script Host script to create an IE
    instance, load a URL and then read out the innerHTML or outerHTML of an
    element in the browser's document object model. Of course if you have
    the complete document tree object model then I am not sure you need the
    serialized markup to reparse it yourself. And you should be aware that
    the browser object model will usually include both the contents created
    by script and the script element itself.
    That way you have a script application to be run on a system where IE is
    installed and IE loads a remote URL.

    Another approach might be HTTP Unit
    <http://www.httpunit.org/>
    although I don't know how good their script support is and whether they
    allow access to elements and/or serialized markup.

    None of that however allows you to have browser-side script in one HTML
    document access the DOM created by script in another HTML document. With
    the same origin policy that is only possible if you have two documents
    on the same server, then you can use frames or windows and use cross
    frame or cross window script techniques.
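For reference, browsers generally compare scheme, host, and port when applying the same origin policy. A rough sketch of that comparison follows, assuming an environment that provides the standard URL constructor (IE6 does not). The function name and the example URLs are hypothetical.

```javascript
// Returns true when two absolute URLs share scheme, host, and port,
// which is the usual test for "same origin".
function sameOrigin(urlA, urlB) {
  var a = new URL(urlA);
  var b = new URL(urlB);
  return a.protocol === b.protocol &&
         a.hostname === b.hostname &&
         a.port === b.port; // default ports are normalized to ""
}
```

With this check, http://www.example.com/page1.html and http://www.example.com/stats/page2.html count as the same origin, while a different subdomain, scheme, or port does not.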


    --

    Martin Honnen


    • Thomas 'PointedEars' Lahn

      #3
      Re: Grab Data Displayed On Other Web Page Using document.write

      Martin Honnen wrote:
      > None of that however allows you to have browser-side script in one HTML
      > document access the DOM created by script in another HTML document. With
      > the same origin policy that is only possible if you have two documents
      > on the same server, [...]

      It is still only the same second-level domain, will you recognize that?


      PointedEars


      • '69 Camaro

        #4
        Re: Grab Data Displayed On Other Web Page Using document.write

        "Martin Honnen" wrote in message
        news:43c54ac2$0$20773$9b4e6d93@newsread4.arcor-online.net...
        > On Windows Microsoft IE can be automated so you use a Windows Script Host
        > script to create an IE instance, load a URL and then read out the
        > innerHTML or outerHTML of an element in the browser's document object
        > model.

        Thanks for that info. I now have an option for visitors with IE browsers.

        > And you should be aware that the browser object model will usually include
        > both the contents created by script and the script element itself.

        So the contents created by the script on Web page 1 can be accessed using
        the innerHTML property of the document.body element of Web page 1? If so,
        does this apply just to IE or does it also apply to other common browsers,
        such as Firefox, Netscape, and Opera?

        > Another approach might be HTTP Unit
        > <http://www.httpunit.org/>
        > although I don't know how good their script support is and whether they
        > allow access to elements and/or serialized markup.

        Thanks for that info. It can help with the automated testing of Web
        applications which I'd recently been wondering about, so thanks for
        answering another question of mine!

        > With the same origin policy that is only possible if you have two
        > documents on the same server, then you can use frames or windows and use
        > cross frame or cross window script techniques.

        Both HTML documents are in the same subdomain on the same Web server, so I
        think this satisfies the same origin policy. I've been Googling for
        examples of cross-window scripting syntax, but all I've seen so far are
        examples that spawn a new window and then write to it. I'd rather avoid
        a new window unless I can make it invisible to the user (which I don't
        know how to do), and I need to read the window's contents, not write to
        them.

        Do you have any links to example syntax for accessing a Web page's DOM
        without spawning a new window (if this is even possible), or for hiding a
        spawned window, or for reading the window's contents if my guess on the
        innerHTML property of the document.body element of Web page 1 is incorrect?

        Thanks.

        Gunny



        • Randy Webb

          #5
          Re: Grab Data Displayed On Other Web Page Using document.write

          '69 Camaro said the following on 1/11/2006 3:45 PM:

          <snip>
          >
          > Do you have any links to example syntax for accessing a Web page's DOM
          > without spawning a new window (if this is even possible), or for hiding a
          > spawned window, or for reading the window's contents if my guess on the
          > innerHTML property of the document.body element of Web page 1 is incorrect?

          If all you are wanting to do is get the contents of a page, load it in a
          hidden IFrame and access it from there using the Frames collection.

          window.frames['IFrameNAMEnotID'].property
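One possible way to set up such a hidden IFrame from script is sketched below. The function name createHiddenFrame and its arguments are hypothetical; doc is assumed to be the main page's document object, and the frame is hidden with a display style rather than any method named in this thread.

```javascript
// Creates an invisible iframe, points it at `src`, and appends it to the
// body of `doc`. Returns the new iframe element.
function createHiddenFrame(doc, name, src) {
  var frame = doc.createElement('iframe');
  frame.name = name;              // the name used with window.frames[...]
  frame.style.display = 'none';   // keep the frame out of the visible layout
  frame.src = src;
  doc.body.appendChild(frame);
  return frame;
}
```

In a browser this might be called as createHiddenFrame(document, 'StatsFrame', 'page1.html'). The returned element's contentWindow property refers to the frame's window object, so it can be used as an alternative to the window.frames lookup once the frame has finished loading.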

          --
          Randy
          comp.lang.javascript FAQ - http://jibbering.com/faq & newsgroup weekly
          Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/


          • '69 Camaro

            #6
            Re: Grab Data Displayed On Other Web Page Using document.write

            "Randy Webb" wrote in message news:ipmdnScDCZBK5VjeRVn-hA@comcast.com...
            > If all you are wanting to do is get the contents of a page, load it in a
            > hidden IFrame and access it from there using the Frames collection.
            >
            > window.frames['IFrameNAMEnotID'].property

            Thanks. I'm not familiar with Frames or IFrames, but I figured out how to
            create an IFrame and make it invisible. Now I'm stuck on the syntax for
            trying to read the HTML produced by the document.write() method of the Web
            page loaded into that IFrame. To give a very simplified example, say the
            following Web page was loaded into the IFrame:

            <HTML>
            <BODY>
            <SCRIPT Language = "JavaScript" TYPE = "text/javascript">
            <!--
            var htmlData;
            htmlData = "<B>This is bold text.</B>";
            document.write(htmlData);
            //-->
            </SCRIPT>
            </BODY>
            </HTML>

            . . . and I want to read the HTML produced by the document.write() method.
            In the main window, I'd like to use something like the following:

            html = window.frames['StatsFrame'].whateverGoesHere;

            . . . where the variable, html, receives "<B>This is bold text.</B>".

            Any suggestions on the correct syntax for "whateverGoesHere"?

            Thanks.

            Gunny



            • Randy Webb

              #7
              Re: Grab Data Displayed On Other Web Page Using document.write

              '69 Camaro said the following on 1/11/2006 7:12 PM:
              > "Randy Webb" wrote in message news:ipmdnScDCZBK5VjeRVn-hA@comcast.com...
              >
              >>If all you are wanting to do is get the contents of a page, load it in a
              >>hidden IFrame and access it from there using the Frames collection.
              >>
              >>window.frames['IFrameNAMEnotID'].property
              >
              >
              > Thanks. I'm not familiar with Frames or IFrames, but I figured out how to
              > create an IFrame and make it invisible. Now I'm stuck on the syntax for
              > trying to read the HTML produced by the document.write() method of the Web
              > page loaded into that IFrame. To give a very simplified example, say the
              > following Web page was loaded into the IFrame:
              >
              > <HTML>
              > <BODY>
              > <SCRIPT Language = "JavaScript" TYPE = "text/javascript">
              > <!--
              > var htmlData;
              > htmlData = "<B>This is bold text.</B>";
              > document.write(htmlData);
              > //-->
              > </SCRIPT>
              > </BODY>
              > </HTML>
              >
              > . . . and I want to read the HTML produced by the document.write() method.
              > In the main window, I'd like to use something like the following:
              >
              > html = window.frames['StatsFrame'].whateverGoesHere;
              >
              > . . . where the variable, html, receives "<B>This is bold text.</B>".
              >
              > Any suggestions on the correct syntax for "whateverGoesHere"?


              IFrame Code:

              <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
              "http://www.w3.org/TR/REC-html40/strict.dtd">
              <html>
              <head>
              <title>Form Test Page</title>
              <script type="text/javascript">
              var htmlData;
              htmlData = "<B>This is bold text.</B>";
              document.write(htmlData);
              </script>
              </head>
              <body>
              </body>
              </html>

              Main Page Code:
              <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
              "http://www.w3.org/TR/html4/strict.dtd">
              <html>
              <head>
              <meta http-equiv="content-type" content="text/html; charset=UTF-8">
              <title>Frame test</title>
              </head>
              <body>
              <iframe src="blank.html" name="myIFrame">Test text here</iframe>
              <button
              onclick="alert(window.frames['myIFrame'].document.body.innerHTML)">Show
              Title</button>
              </body>
              </html>

              Opera 8, Firefox and IE all give me simply the HTML that was generated.
              Namely, "<B>This is bold text.</B>"

              So it appears the above should be close to what you want or at least
              close enough to get you started on it.
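If the script block sits inside the body rather than the head, as in the simplified page quoted above, the body's innerHTML will usually contain the SCRIPT element along with the generated markup. A rough sketch of filtering it out follows. The function name stripScripts is hypothetical, and the regular expression is a heuristic rather than a full HTML parser (it would be confused by a script whose own text contains a closing script tag).

```javascript
// Removes every <script>...</script> element from a markup string and
// returns what is left. Matching is case-insensitive.
function stripScripts(markup) {
  return markup.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, '');
}
```

In the main page this could wrap the value read from the frame, for example stripScripts(window.frames['myIFrame'].document.body.innerHTML).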

              --
              Randy
              comp.lang.javascript FAQ - http://jibbering.com/faq & newsgroup weekly
              Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/


              • '69 Camaro

                #8
                Re: Grab Data Displayed On Other Web Page Using document.write

                "Randy Webb" wrote in message
                news:_u-dnacVzri3bljenZ2dnUVZ_tadnZ2d@comcast.com...
                > So it appears the above should be close to what you want or at least close
                > enough to get you started on it.

                Thank you! That gets me much closer to where I want to be. However, I'm
                still trying to get over the hurdle of determining from the second Web page
                what's in the string variable used for the document.write() method on
                the first Web page. While I gave a simplified example of the JavaScript on
                my Web page, the value passed to document.write() is actually the value of
                the property of an object, not a string literal. The object is returned
                from a function that I pass variables to, and the function dynamically
                creates the HTML needed to display a Web page with all of the data. For
                example:

                stats = getStats(uid, mid, gid, lg, crit, sUrl);
                htmlData = stats.Text;
                document.write(htmlData);

                . . . and the value in the stats.Text property would be the string
                containing the HTML with the data and the tags I need to parse, like
                "<B>This is bold text.</B>". This markup never actually appears on Web page
                1, so I don't know how to read it from Web page 2 and extract only the data
                I need.

                Any suggestions on how to read from Web page 2 either the value stored in
                htmlData or the Web page contents created by my script on Web Page 1? I've
                tried using getElementByName() and getElementByID() to retrieve the
                htmlData variable (or value) in the frame object, but I don't have the
                correct syntax because it bombs.

                Thanks.

                Gunny



                • Randy Webb

                  #9
                  Re: Grab Data Displayed On Other Web Page Using document.write

                  '69 Camaro said the following on 1/12/2006 10:02 AM:
                  > "Randy Webb" wrote in message
                  > news:_u-dnacVzri3bljenZ2dnUVZ_tadnZ2d@comcast.com...
                  >
                  >>So it appears the above should be close to what you want or at least close
                  >>enough to get you started on it.
                  >
                  >
                  > Thank you! That gets me much closer to where I want to be. However, I'm
                  > still trying to get over the hurdle of determining from the second Web page
                  > what's in the string variable used for the document.write() method on
                  > the first Web page. While I gave a simplified example of the JavaScript on
                  > my Web page, the value passed to document.write() is actually the value of
                  > the property of an object, not a string literal. The object is returned
                  > from a function that I pass variables to, and the function dynamically
                  > creates the HTML needed to display a Web page with all of the data. For
                  > example:
                  >
                  > stats = getStats(uid, mid, gid, lg, crit, sUrl);
                  > htmlData = stats.Text;
                  > document.write(htmlData);

                  If the name of that variable is always htmlData, then you can access it
                  by window.frames['IFrameNAMEnotID'].htmlData
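This works because a variable declared with var at the top level of a page's script (or assigned there without a declaration) becomes a property of that page's window object. A small sketch of the lookup with a guard for a missing frame follows. The function name readFrameGlobal is hypothetical, win is assumed to be the main page's window, and the frame must be from the same origin and already loaded.

```javascript
// Reads a global variable from a named frame. Returns undefined when the
// frame is not present or the variable has not been set.
function readFrameGlobal(win, frameName, varName) {
  var frame = win.frames[frameName];
  if (!frame) {
    return undefined; // no frame registered under that name
  }
  return frame[varName];
}
```

From the main page, a call such as readFrameGlobal(window, 'StatsFrame', 'htmlData') would then return the string that the framed page passed to document.write().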

                  --
                  Randy
                  comp.lang.javascript FAQ - http://jibbering.com/faq & newsgroup weekly
                  Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/
