Extracting and replacing url within href tag

  • Adnan Siddiqi

    Extracting and replacing url within href tag

    Hi
    Suppose I have the following URLs coming from an HTML document

    <a href="http://mydomain1.com"> Domain1</a>
    <a
    href="http://subdomain.domain.com/myfile.anyext"> http://subdomain.domain.com/myfile.anyext</a>


    <a href="http://subdomain.domain2.com/myfile.anyext"> Domain2</a>

    Now, what I want is to search for the URL pattern within href only, as well as check
    if it contains a particular domain, for instance "domain2.com"; if yes,
    then replace it with the following URL.

    "http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"

    Can anyone shed light on this?

    Thank you

    -adnan

  • VK

    #2
    Re: Extracting and replacing url within href tag


    Adnan Siddiqi wrote:
    > Hi
    > Suppose I have the following URLs coming from an HTML document
    >
    > <a href="http://mydomain1.com"> Domain1</a>
    > <a
    > href="http://subdomain.domain.com/myfile.anyext"> http://subdomain.domain.com/myfile.anyext</a>
    >
    >
    > <a href="http://subdomain.domain2.com/myfile.anyext"> Domain2</a>
    >
    > Now, what I want is to search for the URL pattern within href only, as well as check
    > if it contains a particular domain, for instance "domain2.com"; if yes,
    > then replace it with the following URL.
    >
    > "http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"

    <script type="text/javascript">
    function patchLinks() {
      var len = document.links.length;
      var lnk = null;
      for (var i=0; i<len; i++) {
        lnk = document.links[i];
        if (lnk.href.indexOf('domain2.com') != -1) {
          lnk.href = 'http://redirectUrl.com/' + lnk.href;
        }
      }
    }

    window.onload = patchLinks;
    </script>


    • Thomas 'PointedEars' Lahn

      #3
      Re: Extracting and replacing url within href tag

      VK wrote:

      > Adnan Siddiqi wrote:
      >> Now, what I want is to search for the URL pattern within href only, as well as check
      >> if it contains a particular domain, for instance "domain2.com"; if yes,
      >> then replace it with the following URL.
      >>
      >> "http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"
      >
      > <script type="text/javascript">
      > function patchLinks() {
      > var len = document.links.length;
      > var lnk = null;
      > for (var i=0; i<len; i++) {
      > lnk = document.links[i];
      > if (lnk.href.indexOf('domain2.com') != -1) {
      > lnk.href = 'http://redirectUrl.com/' + lnk.href;
      > }
      > }
      > }
      >
      > window.onload = patchLinks;
      > </script>

      Please /think/ before you code.


      PointedEars
      --
      Homer: I have changed the world. Now I know how it feels to be God!
      Marge: Do you want turkey sausage or ham?
      Homer: Thou shalt send me *two*, one of each kind.
      (Santa's Little Helper [dog] and Snowball [cat] run away :))


      • Thomas 'PointedEars' Lahn

        #4
        Re: Extracting and replacing url within href tag

        Adnan Siddiqi wrote:

        > Suppose I have the following URLs coming from an HTML document
        >
        > <a href="http://mydomain1.com"> Domain1</a>
        > <a
        > href="http://subdomain.domain.com/myfile.anyext"> http://subdomain.domain.com/myfile.anyext</a>
        >
        >
        > <a href="http://subdomain.domain2.com/myfile.anyext"> Domain2</a>
        >
        > Now, what I want is to search for the URL pattern within href only, as well as check
        > if it contains a particular domain, for instance "domain2.com"; if yes,
        > then replace it with the following URL.
        >
        > "http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"

        This is not a valid URL/URI. See RFC3986 and below.

        > Can anyone shed light on this?

        First of all, you want to do this server-side, not client-side.

        However, the language used then may be an ECMAScript implementation
        as well. The only difference from the solution presented here is that
        you will need to determine what is a link differently, and that you
        have to parse the source code instead (unless you can make use of an
        existing markup parser implementation).
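
As a rough illustration of the server-side variant described above, here is a minimal sketch for an ECMAScript implementation running outside the browser (Node.js is assumed). The function name `rewriteHrefs` and the restriction to double-quoted `href` attributes on `a` elements are assumptions of this sketch, not something stated in the thread. It reuses the host pattern from `patchLinks()` further down in this post. A real markup parser would also cope with unquoted or single-quoted attributes and character references, which this does not.

```javascript
// Hypothetical helper: rewrites matching href values in an HTML source string.
// Assumes double-quoted href attributes on <a> elements only.
function rewriteHrefs(html, sDomains, sRedirectBase) {
  var rxHost = new RegExp(
    "^(ht|f)tps?:\\/\\/([^.]+\\.)*("
    + sDomains.replace(/\./g, "\\.")
    + ")(\\/|$)");

  return html.replace(
    /(<a\b[^>]*?\bhref\s*=\s*")([^"]*)(")/gi,
    function (match, before, url, after) {
      // Only touch links whose host matches; leave everything else as it was.
      return rxHost.test(url)
        ? before + sRedirectBase + encodeURIComponent(url) + after
        : match;
    });
}

var source =
  '<a href="http://mydomain1.com">Domain1</a>\n'
  + '<a\nhref="http://subdomain.domain2.com/myfile.anyext">Domain2</a>';

console.log(rewriteHrefs(source, "domain2.com", "http://redirectUrl.com/"));
```

Only the second link is rewritten here; the first one is passed through untouched.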

        Second, use Regular Expressions.

        ....
        <html>
        <head>
        ...
        <meta http-equiv="Content-Script-Type" content="text/javascript">
        <script type="text/javascript">
        var _global = this;

        /**
        * Patches links referring to specific domains so that their target
        * URL is appended to another URL.
        *
        * @param sDomains: string
        * Links with domains to be redirected, delimited by <tt>|</tt>.
        * @param sRedirectBase: string
        * Base URI (prefix) for the redirection.
        */
        function patchLinks(sDomains, sRedirectBase)
        {
        /**
        * Tries hard to escape a string according to the query component
        * specification in RFC3986.
        *
        * @partof
        * http://pointedears.de/scripts/string.js
        * @param s: string
        * @return type string
        * <code>s</code> escaped, or unescaped if escaping through
        * <code>encodeURIComponent()</code> or <code>escape()</code>
        * is not possible.
        */
        function esc(s)
        {
        /**
        * @author
        * (C) 2003-2006 Thomas Lahn &lt;types.js@PointedEars.de&gt;
        * Distributed under the GNU GPL v2.
        * @partof
        * http://pointedears.de/scripts/types.js
        * @argument s
        * String to be determined a method type, i.e. "object" for
        * IE DOM methods, "function" otherwise. The type must have
        * been retrieved with the `typeof' operator.
        *
        * Note that in contrast to @link{#isMethod()}, this
        * method may also return <code>true</code> if the value of
        * the <code>typeof</code> operand is <code>null</code>; to be
        * sure that the operand is a method reference, you have to
        * && (AND)-combine the <code>isMethodType(...)</code>
        * expression with the method reference identifier.
        *
        * Use this method instead of <code>isMethod()</code> if
        * you want to avoid warnings in case the property to be
        * tested is not defined, or errors in case the property
        * cannot be read.
        * @return
        * <code>true</code> if <code>s</code> is a method type,
        * <code>false</code> otherwise.
        * @type boolean
        * @see #isMethod()
        */
        function isMethodType(s)
        {
        return /\s*(function|object)\s*/.test(s);
        }

        return (isMethodType(typeof encodeURIComponent)
        && encodeURIComponent
        ? encodeURIComponent(s)
        : (isMethodType(typeof escape) && escape
        ? escape(s)
        : s));
        }

        for (var links = document.links, i = links && links.length; i--;)
        {
          var
            link = links[i],
            rx = new RegExp(
              "^(ht|f)tps?:\\/\\/([^.]+\\.)*("
              + sDomains.replace(/\./g, "\\.")
              + ")(\\/|$)");

          if (rx.test(link.href))
          {
            link.href = sRedirectBase + esc(link.href);
          }
        }
        }
        </script>
        </head>

        <body onload="patchLinks('domain2.com', 'http://redirectUrl.com/');">
        ...
        </body>
        </html>
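
To make the difference between a plain substring test and the anchored pattern built in `patchLinks()` above more concrete, here is a small sketch. The helper names `matchesBySubstring` and `matchesByHost` and the sample URLs are made up for illustration. Note that the pattern is still only a pattern, not a full URI parser, so it should not be read as a guarantee against every false positive.

```javascript
// Hypothetical helpers; the pattern mirrors the one built in patchLinks().
function matchesBySubstring(href, domain) {
  return href.indexOf(domain) !== -1;
}

function matchesByHost(href, sDomains) {
  var rx = new RegExp(
    "^(ht|f)tps?:\\/\\/([^.]+\\.)*("
    + sDomains.replace(/\./g, "\\.")
    + ")(\\/|$)");
  return rx.test(href);
}

var samples = [
  "http://subdomain.domain2.com/myfile.anyext", // intended target
  "http://notdomain2.com/",                     // a different host
  "http://example.com/?ref=domain2.com"         // domain only in the query
];

for (var i = 0; i < samples.length; i++) {
  console.log(
    samples[i],
    matchesBySubstring(samples[i], "domain2.com"),
    matchesByHost(samples[i], "domain2.com"));
}
```

The substring test reports all three samples as matches, while the anchored pattern accepts only the first.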


        PointedEars
        --





        • Michael Winter

          #5
          Re: Extracting and replacing url within href tag

          On 26/05/2006 20:06, Thomas 'PointedEars' Lahn wrote:

          > Adnan Siddiqi wrote:

          [snip]

          >> "http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"
          >
          > This is not a valid URL/URI. See RFC3986 and below.

          To a point; the path doesn't contain hierarchical information. For that
          reason, it's certainly a questionable URI - it would be more
          conventional to include the embedded URI in the query string - but
          nevertheless it does match the grammar expressed in RFC 3986.
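
For illustration, the more conventional query-string form could be sketched as follows. The parameter name `target` and the helper name `toRedirectQuery` are hypothetical; the thread does not specify either, and the `URL` object is assumed to be available in the environment.

```javascript
// Hypothetical parameter name "target"; the thread does not specify one.
function toRedirectQuery(sRedirectBase, href) {
  return sRedirectBase + "?target=" + encodeURIComponent(href);
}

var redirected = toRedirectQuery(
  "http://redirectUrl.com/",
  "http://subdomain.domain2.com/myfile.anyext");
console.log(redirected);

// The receiving side can recover the embedded URI unchanged.
var original = new URL(redirected).searchParams.get("target");
console.log(original);
```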

          Assuming your gripe is syntactic, rather than semantic (and I would
          agree on the latter - no debate there), then I can only see two possible
          causes: the colon and the empty segment.

          Path segments may contain colons,

          path-abempty  = *( "/" segment )
          segment       = *pchar
          pchar         = unreserved / pct-encoded / sub-delims
                        / ":" / "@"

          as long as it isn't within the first path segment in a relative-path
          reference:

          relative-part = "//" authority path-abempty
                        / path-absolute
                        / path-noscheme
                        / path-empty
          path-noscheme = segment-nz-nc *( "/" segment )
          segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims
                        / "@" )
                        ; non-zero-length segment without any colon ":"

          Neither the prose nor the grammar prohibits empty path segments within a
          path (and both easily could if the intention was there). In fact, the
          prose alludes to the possibility of empty path segments as it states
          that a path cannot begin "//" when there is no authority component, but
          it doesn't say that such a sequence can't appear elsewhere.

          [snip]

          Mike

          --
          Michael Winter
          Prefix subject with [News] before replying by e-mail.


          • Thomas 'PointedEars' Lahn

            #6
            Re: Extracting and replacing url within href tag

            Michael Winter wrote:

            > On 26/05/2006 20:06, Thomas 'PointedEars' Lahn wrote:
            >> Adnan Siddiqi wrote:
            >>> "http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"
            >> This is not a valid URL/URI. See RFC3986 and below.
            >
            > To a point; the path doesn't contain hierarchical information.

            It does not have to:

            | 3.3. Path
            |
            | The path component contains data, usually organized in hierarchical
            | form [...]
            |
            | A path consists of a sequence of path segments separated by a slash
            | ("/") character. A path is always defined for a URI, though the
            | defined path may be empty (zero length). Use of the slash character
            | to indicate hierarchy is only required when a URI will be used as the
            | context for relative references.

            > For that reason, it's certainly a questionable URI -

            There is a better reason for calling it questionable at best.

            > it would be more conventional to include the embedded URI in the query
            > string -

            I did/do not care about conventions just because they exist. Unquestioned
            conventions can lead to unfounded traditions, and unfounded traditions tend
            to lead to a standstill in development. There is no need for a query part
            here if the appended URI is properly escaped.
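
A short sketch of what "properly escaped" can look like when the URI is appended to the path rather than placed in a query part. It uses `encodeURIComponent()` directly, as `esc()` in the earlier post does when that function is available; the sample `href` with a query and a fragment is made up for illustration.

```javascript
var sRedirectBase = "http://redirectUrl.com/";
// Hypothetical link target containing a query and a fragment.
var href = "http://subdomain.domain2.com/myfile.anyext?a=1#top";

var escaped = encodeURIComponent(href);
var redirected = sRedirectBase + escaped;
console.log(redirected);

// None of the delimiters ":", "/", "?" or "#" survive unescaped ...
console.log(/[:\/?#]/.test(escaped));

// ... and decoding the appended segment gives back the original URI.
var lastSegment = redirected.slice(sRedirectBase.length);
console.log(decodeURIComponent(lastSegment) === href);
```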

            > but nevertheless it does match the grammar expressed in RFC 3986.

            To a point, yes.

            > Assuming your gripe is syntactic, rather than semantic (and I would
            > agree on the latter - no debate there), then I can only see two possible
            > causes: the colon and the empty segment.
            >
            > Path segments may contain colons,
            >
            > path-abempty  = *( "/" segment )
            > segment       = *pchar
            > pchar         = unreserved / pct-encoded / sub-delims
            >               / ":" / "@"

            These productions apply, but see below.

            > as long as it isn't within the first path segment in a relative-path
            > reference:
            >
            > relative-part = "//" authority path-abempty
            >               / path-absolute
            >               / path-noscheme
            >               / path-empty
            > path-noscheme = segment-nz-nc *( "/" segment )
            > segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims
            >               / "@" )
            >               ; non-zero-length segment without any colon ":"

            These productions do not apply here. The URI retrieved from the `href'
            property will be an absolute URI, even if the value of the corresponding
            `href' attribute is a URI reference that could produce this
            (URI-reference :: relative-ref :: relative-part [ "?" query ] [ "#"
            fragment ]).

            > Neither the prose nor the grammar prohibits empty path segments within a
            > path (and both easily could if the intention was there). In fact, the
            > prose alludes to the possibility of empty path segments as it states
            > that a path cannot begin "//" when there is no authority component, but
            > it doesn't say that such a sequence can't appear elsewhere.

            You are misunderstanding the RFC, and your logic is flawed. For an
            /(ht|f)tps?:/ URI/URL (see subsection 1.1.3) must contain an authority
            component because of the need for a host (a general URI does not need to,
            as it may be a URN).

            Let's start with the initial production:

            | URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
            | [...]
            | hier-part   = "//" authority path-abempty
            |             / path-absolute
            |             / path-rootless
            |             / path-empty

            Subsequent productions are:

            | authority = [ userinfo "@" ] host [ ":" port ]
            | [...]
            | path-abempty = *( "/" segment )
            | path-absolute = "/" [ segment-nz *( "/" segment ) ]
            | path-rootless = segment-nz *( "/" segment )
            | path-empty = 0<pchar>

            Obviously (for the reasons given above), of those for /(ht|f)tps?:/
            URIs/URLs only the production

            | hier-part = "//" authority path-abempty

            applies. And I concur that path-abempty can produce both `:' and `//'.

            But: the character sequence `scheme:' (here: `http:') is clearly defined.
            As per

            | 2.2. Reserved Characters

            `:' is such a character:

            | reserved = gen-delims / sub-delims
            |
            | gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"

            Furthermore (in the same subsection),

            | URI producing applications should percent-encode data octets that
            | correspond to characters in the reserved set unless these characters
            | are specifically allowed by the URI scheme to represent data in that
            | component.

            The reason for this recommendation (SHOULD) is that another `scheme:'
            character sequence within the URI can render the URI ambiguous.

            So the example given by the OP may be a syntactically valid URI, but one
            that is at least unwise to use. And considering that this was only an
            example, and that the URI reference to be found in the `href' attribute
            value (that is resolved to an absolute URI upon property access) can
            contain _any_ character valid in a URI _iff it is not part of the path
            component_, especially reserved ones, one can rightfully say that simple
            concatenation of a base absolute (redirect) URI with the retrieved absolute
            URI of the hyperlink (most certainly) will not result in a (valid) URI.
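
One way to see the problem with simple concatenation is to take a link target that has a query and a fragment and look at how the result is split into components. The sample `href` below is hypothetical, and the sketch assumes an environment that provides the `URL` object; it shows how such a parser reads the result, not whether RFC 3986 would call it valid.

```javascript
var sRedirectBase = "http://redirectUrl.com/";
// Hypothetical link target with a query and a fragment.
var href = "http://subdomain.domain2.com/myfile.anyext?id=1#part2";

var naive = new URL(sRedirectBase + href);

console.log(naive.pathname); // "/http://subdomain.domain2.com/myfile.anyext"
console.log(naive.search);   // "?id=1"
console.log(naive.hash);     // "#part2"
```

The query and the fragment now belong to the redirect URI itself rather than to the embedded one, so the redirect target would have to guess which part was meant for whom.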


            PointedEars
            --
            Bill Gates isn't the devil -- Satan made sure hell
            _worked_ before he opened it to the damned ...


            • Michael Winter

              #7
              Re: Extracting and replacing url within href tag

              On 26/05/2006 23:09, Thomas 'PointedEars' Lahn wrote:

              > Michael Winter wrote:

              [snip]

              >> [...] the path doesn't contain hierarchical information.
              >
              > It does not have to:

              Yes, in general, URIs do not need to have hierarchical paths, but HTTP
              URIs are hierarchical. That said, the hierarchy can be arbitrary; it
              certainly doesn't need to follow a directory structure, for example.
              Yes, I know you know that, I'm just trying to eliminate an unnecessary
              response. :-)

              [snip]

              >> as long as it isn't within the first path segment in a relative-path
              >> reference:
              >>
              >> relative-part = "//" authority path-abempty
              >>               / path-absolute
              >>               / path-noscheme
              >>               / path-empty
              >> path-noscheme = segment-nz-nc *( "/" segment )
              >> segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims
              >>               / "@" )
              >>               ; non-zero-length segment without any colon ":"
              >
              > These productions do not apply here.

              I never said they did. You seemed to miss the part where I wrote, "in a
              relative-path reference" (a term defined in 4.2 Relative Reference). The
              URI suggested by the OP isn't a relative-path reference, but an absolute
              URI (4.3 - though a fragment might not be prohibited). The information
              above was just a qualification to prevent misinterpretation of the
              preceding statement. That is, a colon can appear in path segments, but
              not /all/ path segments.

              [snip]

              >> Neither the prose nor the grammar prohibits empty path segments
              >> within a path (and both easily could if the intention was there).
              >> In fact, the prose alludes to the possibility of empty path
              >> segments as it states that a path cannot begin "//" when there is
              >> no authority component, but it doesn't say that such a sequence
              >> can't appear elsewhere.
              >
              > You are misunderstanding the RFC, and your logic is flawed.

              I disagree. I have simply stated facts. However, it's hard to refute
              conclusively unless you identify what you think I have misunderstood, or
              where exactly logic has apparently failed me.

              > For an /(ht|f)tps?:/ URI/URL (see subsection 1.1.3) must contain an
              > authority component because of the need for a host (a general URI
              > does not need to, as it may be a URN).

              I already knew that. I think you misunderstood why I wrote what I did,
              though that is perhaps my fault. I started by making comments specific
              to HTTP URIs, but then shifted to a more generic treatment without
              explicitly noting it.

              Your problem with what I wrote would seem to revolve around my mention
              of the authority component. It only occurred in relation to empty path
              segments in general, and not specifically to the URI suggested by the OP
              (so the specifics of HTTP URIs are irrelevant, in this instance).

              To some, allowing empty path segments might seem to be an oversight, or
              a simplification of the grammar. Given the number of revisions to the
              URI syntax RFCs, the former is unlikely, but the latter isn't entirely
              unreasonable. However, even if that were the case, the RFC needn't have
              limited itself to stating:

              If a URI does not contain an authority component, then the path
              cannot begin with two slash characters ("//").
              -- 3.3 Path

              It could have just forbidden empty segments, instead.

              As I didn't know what the grounds were for your objection to the
              proposed URI, I hoped to cover the two obvious syntactic possibilities.
              On reflection, I should have just asked. :-)

              [snipped well-intentioned quotation of the grammar]

              > As per
              >
              > | 2.2. Reserved Characters
              >
              > `:' is such a character:
              >
              > | reserved = gen-delims / sub-delims
              > |
              > | gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
              >
              > Furthermore (in the same subsection),
              >
              > | URI producing applications should percent-encode data octets that
              > | correspond to characters in the reserved set unless these characters
              > | are specifically allowed by the URI scheme to represent data in that
              > | component.
              >
              > The reason for this recommendation (SHOULD) is that another `scheme:'
              > character sequence within the URI can render the URI ambiguous.

              I don't see why. A scheme can only occur at the start of a URI, and a
              URI may only start in five ways. Three of those require an unambiguous
              delimiter: authority (//), query (?), and fragment (#). The remaining
              two are the scheme itself, or a path. If the path begins with a
              slash, that too is unambiguous. If it doesn't, then a colon in the first
              segment will be confusing, but this can be resolved by adding a leading
              dot segment (foo:bar -> ./foo:bar).

              So, if a colon occurs before any slash, question mark, or hash
              characters, it delimits the scheme. Anywhere else and it is part of some
              (sub-)component.
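
The rule stated above can be tried out with the regular expression that RFC 3986 gives in Appendix B for splitting a URI reference into its components. The wrapper `splitUriReference` is a hypothetical helper for this sketch. Appendix B is about parsing a well-formed reference, so this separates components but does not validate them.

```javascript
// Regular expression from RFC 3986, Appendix B.
var rxUriRef = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// Hypothetical helper: splits a URI reference into its five components.
function splitUriReference(s) {
  var m = rxUriRef.exec(s); // every part is optional, so this always matches
  return {
    scheme: m[2],
    authority: m[4],
    path: m[5],
    query: m[7],
    fragment: m[9]
  };
}

console.log(splitUriReference("foo:bar").scheme);
console.log(splitUriReference("./foo:bar").scheme);
console.log(splitUriReference(
  "http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext").path);
```

In the first case the colon comes before any slash and is read as the scheme delimiter; in the second and third it is treated as part of the path.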

              > So the example given by the OP may be a syntactically valid URI, but one
              > that is at least unwise to use.

              I agree.

              [snip]

              Mike

              --
              Michael Winter
              Prefix subject with [News] before replying by e-mail.


              • Thomas 'PointedEars' Lahn

                #8
                Re: Extracting and replacing url within href tag

                Michael Winter wrote:

                > On 26/05/2006 23:09, Thomas 'PointedEars' Lahn wrote:
                >> Michael Winter wrote:
                >>> as long as it isn't within the first path segment in a relative-path
                >>> reference:
                >>>
                >>> relative-part = "//" authority path-abempty
                >>>               / path-absolute
                >>>               / path-noscheme
                >>>               / path-empty
                >>> path-noscheme = segment-nz-nc *( "/" segment )
                >>> segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims
                >>>               / "@" )
                >>>               ; non-zero-length segment without any colon ":"
                >>
                >> These productions do not apply here.
                >
                > I never said they did. You seemed to miss the part where I wrote, "in a
                > relative-path reference" (a term defined in 4.2 Relative Reference). The
                > URI suggested by the OP isn't a relative-path reference, but an absolute
                > URI (4.3 - though a fragment might not be prohibited). The information
                > above was just a qualification to prevent misinterpretation of the
                > preceding statement. That is, a colon can appear in path segments, but
                > not /all/ path segments.

                Referring to irrelevant parts of the grammar does not strike me as being
                reasonable or helpful. The matter is complicated enough already, more
                noise will rather hinder its clarification.

                >> For an /(ht|f)tps?:/ URI/URL (see subsection 1.1.3) must contain an
                >> authority component because of the need for a host (a general URI
                >> does not need to, as it may be a URN).
                >
                > I already knew that. I think you misunderstood why I wrote what I did,
                > though that is perhaps my fault. I started by making comments specific
                > to HTTP URIs, but then shifted to a more generic treatment without
                > explicitly noting it.

                ACK

                >> As per
                >>
                >> | 2.2. Reserved Characters
                >>
                >> `:' is such a character:
                >>
                >> | reserved = gen-delims / sub-delims
                >> |
                >> | gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
                >>
                >> Furthermore (in the same subsection),
                >>
                >> | URI producing applications should percent-encode data octets that
                >> | correspond to characters in the reserved set unless these characters
                >> | are specifically allowed by the URI scheme to represent data in that
                >> | component.
                >>
                >> The reason for this recommendation (SHOULD) is that another `scheme:'
                >> character sequence within the URI can render the URI ambiguous.
                >
                > I don't see why.

                | The purpose of reserved characters is to provide a set of delimiting
                | characters that are distinguishable from other data within a URI.

                > A scheme can only occur at the start of a URI, and a
                > URI may only start in five ways. Three of those require an unambiguous
                > delimiter: authority (//), query (?), and fragment (#). The remaining
                > two are the scheme itself, or a path. If the path begins with a
                > slash, that too is unambiguous.

                | A subset of the reserved characters (gen-delims) is used as
                | delimiters of the generic URI components described in Section 3. A
                | component's ABNF syntax rule will not use the reserved or gen-delims
                | rule names directly; instead, each syntax rule lists the characters
                | allowed within that component (i.e., not delimiting it), and any of
                | those characters that are also in the reserved set are "reserved"
                | for use as subcomponent delimiters within the component.

                > If it doesn't, then a colon in the first segment will be confusing, but
                > this can be resolved by adding a leading dot segment
                > (foo:bar -> ./foo:bar).

                That would, however, require it to be a URI reference instead of a URI.

                > So, if a colon occurs before any slash, question mark, or hash
                > characters, it delimits the scheme. Anywhere else and it is part of some
                > (sub-)component.

                It is not that simple.

                >> So the example given by the OP may be a syntactically valid URI, but one
                >> that is at least unwise to use.
                >
                > I agree.

                I hope you also agree with what you have snipped below because *that*
                contained the key point of this paragraph.

                > [snip]


                PointedEars
                --
                I hear, and I forget; I see, and I remember; I do, and I understand.
                -- Chinese proverb


                • Adnan Siddiqi

                  #9
                  Re: Extracting and replacing url within href tag

                  VK you rock!

                  Thanks a lot


                  -adnan

                  VK wrote:
                  > Adnan Siddiqi wrote:
                  > > Hi
                  > > Suppose I have the following URLs coming from an HTML document
                  > >
                  > > <a href="http://mydomain1.com"> Domain1</a>
                  > > <a
                  > > href="http://subdomain.domain.com/myfile.anyext"> http://subdomain.domain.com/myfile.anyext</a>
                  > >
                  > >
                  > > <a href="http://subdomain.domain2.com/myfile.anyext"> Domain2</a>
                  > >
                  > > Now, what I want is to search for the URL pattern within href only, as well as check
                  > > if it contains a particular domain, for instance "domain2.com"; if yes,
                  > > then replace it with the following URL.
                  > >
                  > > "http://redirectUrl.com/http://subdomain.domain2.com/myfile.anyext"
                  >
                  > <script type="text/javascript">
                  > function patchLinks() {
                  > var len = document.links.length;
                  > var lnk = null;
                  > for (var i=0; i<len; i++) {
                  > lnk = document.links[i];
                  > if (lnk.href.indexOf('domain2.com') != -1) {
                  > lnk.href = 'http://redirectUrl.com/' + lnk.href;
                  > }
                  > }
                  > }
                  >
                  > window.onload = patchLinks;
                  > </script>


                  • VK

                    #10
                    Re: Extracting and replacing url within href tag


                    Thomas 'PointedEars' Lahn wrote:
                    > Please /think/ before you code.

                    Maybe a bit too late to ask but just noticed this post, so:

                    - Eh?


                    • Thomas 'PointedEars' Lahn

                      #11
                      Re: Extracting and replacing url within href tag

                      Adnan Siddiqi wrote:

                      > VK you rock!

                      No (his solution is error-prone at best). And neither do you:

                      > [Top post]

                      <URL:http://jibbering.com/faq/>


                      PointedEars


                      • VK

                        #12
                        Re: Extracting and replacing url within href tag


                        Thomas 'PointedEars' Lahn wrote:
                        > No (his solution is error-prone at best).

                        This solution uses DOM 0 - thus it works for all ever produced browsers
                        with JavaScript/JScript support starting with Netscape 2.

                        If you have yet more universal solution I'm anxious to see it.


                        • Thomas 'PointedEars' Lahn

                          #13
                          Re: Extracting and replacing url within href tag

                          VK wrote:

                          > Thomas 'PointedEars' Lahn wrote:
                          >> No (his solution is error-prone at best).
                          >
                          > This solution uses DOM 0

                          It uses features that are also available in DOM Level 0.

                          > - thus it works for all ever produced browsers
                          > with JavaScript/JScript support starting with Netscape 2.

                          Wrong. If you knew what you are talking about, you would also know that
                          "DOM Level 0" refers to features common to Netscape 3.0 and IE 3.0. But
                          even if we ignore that, your statement is still wrong.

                          > If you have yet more universal solution I'm anxious to see it.

                          I have posted it already.


                          PointedEars
                          --
                          A man who works with his hands is a laborer; a man who works with his
                          hands and his brain is a craftsman; but a man who works with his hands
                          and his brain and his heart is an artist.
                          -- Louis Nizer, lawyer (1902-1994)


                          • VK

                            #14
                            Re: Extracting and replacing url within href tag


                            Thomas 'PointedEars' Lahn wrote:
                            > > If you have yet more universal solution I'm anxious to see it.
                            >
                            > I have posted it already.

                            When you need to replace a bulb, are you turning around the lamp
                            yourself or are you using your hand only? With this code I'm not sure
                            anymore... :-)


                            • Thomas 'PointedEars' Lahn

                              #15
                              Re: Extracting and replacing url within href tag

                              VK wrote:

                              > Thomas 'PointedEars' Lahn wrote:
                              >> > If you have yet more universal solution I'm anxious to see it.
                              >> I have posted it already.
                              >
                              > When you need to replace a bulb, are you turning around the lamp
                              > yourself or are you using your hand only? With this code I'm not sure
                              > anymore... :-)

                              Obviously you have not understood my code, which is hardly surprising.
                              FWIW, the main issues that my approach covers and yours does not, are:

                              1. Only the domain of the link's URL should matter.
                              2. The resulting URL must be properly escaped.

                              How this is achieved is a different matter.
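
For readers following along, the two points above could be sketched roughly as below. This is not the code posted earlier in the thread, only a minimal illustration under some assumptions: the helper names (`getHostName`, `hostMatchesDomain`, `buildRedirectUrl`) are made up for this example, only absolute http/https URLs are considered, and the redirect service is assumed to accept the target URL percent-encoded with `encodeURIComponent` after the prefix from the original question.

```javascript
// Redirect prefix taken from the original question.
var REDIRECT_PREFIX = 'http://redirectUrl.com/';

// Hypothetical helper: return the lower-cased host name of an absolute
// http(s) URL, skipping optional userinfo, or '' if there is no match.
function getHostName(url) {
  var m = /^https?:\/\/(?:[^\/?#@]*@)?([^\/?#:]+)/i.exec(String(url));
  return m ? m[1].toLowerCase() : '';
}

// Hypothetical helper: true if host is the domain itself or one of its
// subdomains. A plain substring test would also accept "notdomain2.com".
function hostMatchesDomain(host, domain) {
  domain = domain.toLowerCase();
  if (host === domain) {
    return true;
  }
  var suffix = '.' + domain;
  return host.length > suffix.length &&
    host.slice(host.length - suffix.length) === suffix;
}

// Hypothetical helper: return the redirect URL for links whose host matches,
// with the target escaped (assumption: the redirect service decodes it).
// URLs on other hosts are returned unchanged.
function buildRedirectUrl(url, domain) {
  if (!hostMatchesDomain(getHostName(url), domain)) {
    return url;
  }
  return REDIRECT_PREFIX + encodeURIComponent(url);
}

// Browser-only part: rewrite every matching link in document.links.
function patchLinks(domain) {
  var links = document.links;
  for (var i = 0; i < links.length; i++) {
    links[i].href = buildRedirectUrl(links[i].href, domain);
  }
}
```

In a page this would be hooked up with something like `window.onload = function () { patchLinks('domain2.com'); };`. Because only the host part is compared, a link such as `http://example.org/?ref=domain2.com` is left alone, and a link already pointing at the redirect host is not wrapped a second time.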


                              PointedEars
                              --
                              But he had not that supreme gift of the artist, the knowledge of
                              when to stop.
                              -- Sherlock Holmes in Sir Arthur Conan Doyle's
                              "The Adventure of the Norwood Builder"
