Tricks to help prevent site ripping

This topic is closed.
  • Dave Turner

    Tricks to help prevent site ripping

    I know that it's impossible to completely prevent somebody from ripping a
    site (or cracking software) if that person has the skills and the
    time/patience, but there are tricks that can be employed in software which
    slow crackers down, from things like self-decrypting code to anti-debug
    tricks, and most people have a breaking point - if you can slow them down
    and waste enough of their time they usually move on to easier targets (and
    as there are so many easy targets out there most people wouldn't waste too
    much time on hard targets).

    I was wondering if there are any such tricks that can be used to slow down
    people who want to rip your website to modify and use for their own? The
    dynamic server-side nature of PHP would suggest that something could be
    done, because obviously when somebody rips a site they only get the HTML
    from the PHP and not its source code.

    Any ideas?


  • anjanesh
    New Member
    • Aug 2005
    • 2

    #2
    You can use different templates (simple and plain) for your site - so when someone goes to another page it doesn't have to be the exact design and colour.
    This way there won't be a pattern for site extraction.


    • thehuby

      #3
      Re: Tricks to help prevent site ripping

      Only thing I can think of is to write your PHP code and when complete
      remove all the new lines - so the entire page is on one line. However
      a simple search and replace could undo your work (replace all ; with
      ;\n).

      Rick



      • Malcolm Dew-Jones

        #4
        Re: Tricks to help prevent site ripping

        Dave Turner (not@dave) wrote:
        : I know that its impossible to completely prevent somebody from ripping a
        : site (or cracking software) if that person has the skills and the
        : time/patience, but there are tricks that can be employed in software which
        : slow crackers down, from things like self-decrypting code to anti-debug
        : tricks, and most people have a breaking point - if you can slow them down
        : and waste enough of their time they usually move on to easier targets (and
        : as there are so many easy targets out there most people wouldn't waste too
        : much time on hard targets).

        : I was wondering if there are any such tricks that can be used to slow down
        : people who want to rip your website to modify and use for their own? The
        : dynamic server-side nature of PHP would suggest that something could be
        : done, because obviously when somebody rips a site they only get the HTML
        : from the PHP and not its source code

        : Any ideas?

        When somebody rips a site they only get the HTML from the PHP and not its
        source code.


        --

        This space not for rent.


        • Jacob Atzen

          #5
          Re: Tricks to help prevent site ripping

          On 2005-08-15, Dave Turner <not@dave> wrote:
          > I was wondering if there are any such tricks that can be used to slow down
          > people who want to rip your website to modify and use for their own? The
          > dynamic server-side nature of PHP would suggest that something could be
          > done, because obviously when somebody rips a site they only get the HTML
          > from the PHP and not its source code

          So you want to prevent people from reading your HTML?

          --
          Cheers,
          - Jacob Atzen


          • Colin McKinnon

            #6
            Re: Tricks to help prevent site ripping

            Dave Turner wrote:
            > I know that its impossible to completely prevent somebody from ripping a
            > site (or cracking software) if that person has the skills and the
            > time/patience, but there are tricks that can be employed in software which
            > slow crackers down, from things like self-decrypting code to anti-debug
            > tricks, and most people have a breaking point - if you can slow them down
            > and waste enough of their time they usually move on to easier targets (and
            > as there are so many easy targets out there most people wouldn't waste too
            > much time on hard targets).
            >
            > I was wondering if there are any such tricks that can be used to slow down
            > people who want to rip your website to modify and use for their own? The
            > dynamic server-side nature of PHP would suggest that something could be
            > done, because obviously when somebody rips a site they only get the HTML
            > from the PHP and not its source code
            >
            > Any ideas?

            Yes, lots (maybe I should write a book). The first problem you're going
            to have is discriminating between searchbots (which you probably want on
            your site) and harvesters. But you'd also need to be a lot clearer about
            the architecture of your site and what exactly you want to protect. Is it
            really the HTML?

            You should also be addressing the issue of *why* people might want to rip
            your site. Perhaps providing syndicated content might be a better solution.

            To get you started on anti-harvesting - think honeypot.
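Colin's honeypot hint can be made concrete. Below is a minimal sketch in JavaScript (the trick itself is language-neutral and ports directly to PHP); the path `/trap.php` and the function names are invented for illustration:

```javascript
// Honeypot sketch: publish a link no human ever sees (hidden with CSS
// and disallowed in robots.txt) and flag any client that requests it.
// Only harvesters follow invisible links.

const flagged = new Set(); // clients that have taken the bait

// markup for the trap: invisible to users, tempting to rippers
const trapLink = '<a href="/trap.php" style="display:none">archive</a>';

function handleRequest(ip, path) {
  if (path === '/trap.php') {
    flagged.add(ip); // remember this client...
    return '403';    // ...and refuse the page
  }
  return flagged.has(ip) ? '403' : '200';
}
```

Remember to `Disallow: /trap.php` in robots.txt, so well-behaved searchbots are never caught by it.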

            HTH

            C.


            • Erwin Moller

              #7
              Re: Tricks to help prevent site ripping

              Jacob Atzen wrote:
              > On 2005-08-15, Dave Turner <not@dave> wrote:
              >> I was wondering if there are any such tricks that can be used to slow
              >> down people who want to rip your website to modify and use for their own?
              >> The dynamic server-side nature of PHP would suggest that something could
              >> be done, because obviously when somebody rips a site they only get the
              >> HTML from the PHP and not its source code
              >
              > So you want to prevent people from reading your HTML?
              >

              Exactly, Jacob.

              What is the OP trying to achieve?
              I do not see the point...

              Regards,
              Erwin Moller


              • Will Woodhull

                #8
                Re: Tricks to help prevent site ripping

                Dave Turner wrote:
                > I know that its impossible to completely prevent somebody from ripping a
                > site (or cracking software) if that person has the skills and the
                > time/patience, but there are tricks that can be employed in software which
                > slow crackers down,

                If you use sessions, you can track the number of page requests in a
                time interval. If you see an unreasonable amount of requests per
                second, you could stop serving pages to that session. That would
                prevent many harvesters from obtaining more than a slice of your site
                at any one time.

                Is that the kind of impediment you had in mind?
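The per-session counting described above might be sketched like this (JavaScript for illustration; the thresholds are invented, and in real PHP the timestamps would live in $_SESSION):

```javascript
// Sliding-window limit: allow at most MAX_REQUESTS page requests from
// one session in any WINDOW_MS span; refuse the session beyond that.
const MAX_REQUESTS = 10;
const WINDOW_MS = 1000;
const history = new Map(); // session id -> recent request timestamps

function allowRequest(sessionId, now) {
  // keep only timestamps still inside the window
  const times = (history.get(sessionId) || []).filter(
    (t) => now - t < WINDOW_MS
  );
  times.push(now);
  history.set(sessionId, times);
  return times.length <= MAX_REQUESTS; // false -> stop serving this session
}
```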


                • Gordon Burditt

                  #9
                  Re: Tricks to help prevent site ripping

                  >> I know that its impossible to completely prevent somebody from ripping a
                  >> site (or cracking software) if that person has the skills and the
                  >> time/patience, but there are tricks that can be employed in software which
                  >> slow crackers down,
                  >
                  >If you use sessions, you can track the number of page requests in a
                  >time interval.

                  If you use sessions, the harvesters probably don't use cookies. Or
                  they have a bunch of harvesters running in parallel with different
                  sessions and different session cookies. Or, if you're using trans_sid,
                  there's a bunch of harvesters using different session IDs in the URLs.
                  >If you see an unreasonable amount of requests per
                  >second, you could stop serving pages to that session. That would
                  >prevent many harvesters from obtaining more than a slice of your site
                  >at any one time.
                  >
                  >Is that the kind of impediment you had in mind?

                  I don't think it's much of an impediment. You might try detecting
                  a lot of requests from the same IP, but that means you will slow down
                  or deny service to proxies like the ones AOL and other large ISPs use.

                  Gordon L. Burditt


                  • Dave Turner

                    #10
                    Re: Tricks to help prevent site ripping

                    > So you want to prevent people from reading your HTML?

                    err, no ... (obviously). Re-read my question. I didn't ask how to stop
                    people reading HTML, I asked if anyone knew of any tricks (of which there
                    are at least several) which can be used to slow people down from ripping
                    your site's content (i.e. to use your website design for themselves).
                    Obviously this is something which can't be achieved 100% - if somebody has
                    enough time, skill and patience then they can copy/rip any site, but
                    slowing them down will deter most people.



                    • Andrew DeFaria

                      #11
                      Re: Tricks to help prevent site ripping

                      Dave Turner wrote:
                      >> So you want to prevent people from reading your HTML?
                      >
                      > err, no ... (obviously). Re-read my question. I didn't ask how to stop
                      > people reading HTML, I asked if anyone knew of any tricks (of which
                      > there are at least several) which can be used to slow people down from
                      > ripping your site's content (i.e. to use your website design for
                      > themselves). Obviously this is something which can't be achieved 100%
                      > - if somebody has enough time, skill and patience then they can
                      > copy/rip any site, but slowing them down will deter most people.

                      Your question is still not clear. Can you give an example of "ripping
                      your site's content"? Do you mean copy the text of a site? All browsers
                      support copy and paste of text as do most windowed programs. Do you mean
                      copy the images you might have? Hey if the browser knows how to get the
                      image (and it must) then so too can the user. Heck browsers support File:
                      Save As so people can just do that. Not achievable 100%? Try not
                      achievable at all! The best you can do is copyright your material as
                      many do already. Then again the whole concept of copyright and web pages
                      never made much sense to me. The mere act of rendering a page is indeed
                      violating the copyright in that a copy has been made! Besides the net is
                      a place where information is freely exchanged. If you participate in the
                      net then you are participating in the free exchange of ideas and
                      material. That's the way it works!

                      If you mean copy your PHP code, well they can't. Your page is served out
                      to them as HTML, not the PHP code that produced the HTML so at least
                      there you are safe.

                      --
                      A Messy Kitchen Is A Happy Kitchen And This Kitchen Is Delirious


                      • Will Woodhull

                        #12
                        Re: Tricks to help prevent site ripping

                        I agree that "professional" harvester programs aren't going to be
                        stopped by session lock-ups if some bandwidth limit is exceeded (lock up
                        the session if more than X number of page requests are sent per second,
                        etc).

                        However I'm pretty sure this approach would effectively stall casual
                        site ripping with packages like WebLeech or Web Stripper. Haven't tried
                        it though-- not yet at least.

                        Much depends on what Dave Turner (the OP) actually needs. So far it
                        isn't clear whether he's trying to protect the content of a commercial
                        site from being undercut by an unscrupulous competitor or if he's trying
                        to keep his wawoo-neat design ideas from showing up on half of Yahoo's
                        free web sites.


                        • Peter Fox

                          #13
                          Re: Tricks to help prevent site ripping

                          Following on from Will Woodhull's message...
                          >I agree that "professional" harvester programs aren't going to be
                          >stopped by session lock-ups if some bandwidth limit is exceeded (lock up
                          >the session if more than X number of page requests are sent per second,
                          >etc).
                          >
                          >However I'm pretty sure this approach would effectively stall casual
                          >site ripping with packages like WebLeech or Web Stripper. Haven't tried
                          >it though-- not yet at least.
                          >
                          >Much depends on what Dave Turner (the OP) actually needs. So far it
                          >isn't clear whether he's trying to protect the content of a commercial
                          >site from being undercut by an unscrupulous competitor or if he's trying
                          >to keep his wawoo-neat design ideas from showing up on half of Yahoo's
                          >free web sites.
                          >
                          THOUGHT!
                          Obviously if a plain browser can see all the stuff on your pages they
                          are in the wild and fair game.

                          You could /try/ the following but beware of the effect of caching.

                          The object is to put a poison pill into the pages which triggers when it
                          isn't being shown live and on your site. I can think of two ways of
                          doing this - both purely theoretical and both require you to dynamically
                          tweak some javascript.

                          1. Your javascript (which might be loading images or other OnLoad()
                          activities) tests the date and time (client) against the 'now' on your
                          server. The 'now' on your server is hard coded into the javascript -
                          hence the need for dynamic creation. Your js code might say "if client
                          time is two weeks behind the time the page was created then open a
                          'this site has been ripped' window".

                          2. Have js ask for some resource from your server which has to be
                          dynamic. For example an image that gives today's date or a news feed
                          extract. This will look a bit weird when looked at statically.
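The date check in (1) boils down to a single comparison the server bakes into the generated script. A sketch (the two-week threshold is from the post; the function name is invented):

```javascript
// The server writes its "now" into the page at generation time; the
// viewer's browser compares it with the local clock. A large gap means
// the page is being viewed as a stale ripped copy, not served live.
const TWO_WEEKS_MS = 14 * 24 * 60 * 60 * 1000;

function isStale(serverNowMs, clientNowMs) {
  // true -> open the "this site has been ripped" warning window
  return clientNowMs - serverNowMs > TWO_WEEKS_MS;
}
```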

                          So my conclusion is: you can't stop ripping, but you might be able
                          to flag it to an unsuspecting viewer of the HTML.


                          In a slightly different vein.
                          How about getting js to call some resources in a way where it is not
                          easy to deduce automatically from looking at the code what the URL
                          is? Basically some low level encryption. I don't know how rippers
                          work but surely one of the things they would do is try to redirect
                          all <a href="my.site/page.htm"> to <a href="ripped.site/page.htm">.
                          Your js could be written to trap this sort of thing by (a) not
                          fetching stuff for a 'real time' (as in 1 above) load but having the
                          url clearly present in the js along the lines of "ha ha this will
                          break the page 'cos now you're trying to link to a resource on
                          ripped.site which doesn't exist", or (b) (say) ROT13-ing complete
                          URLs including the http bit and calling them by an OnClick(). This
                          brings them onto your site and what you do then is up to you. Perhaps
                          it is a page that has a lifetime of a week and then gets filled with
                          poisonous content. Or just examine the referrer in the header to
                          discover that the come-from page was somewhere out in cyberspace, and
                          then you could decide what to do.
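The ROT13 idea in (b) is simple to sketch; because the address only ever exists rotated in the markup, a naive search-and-replace for `http` or the domain finds nothing:

```javascript
// ROT13 a URL (letters only; dots, slashes and colons pass through) so
// the plain address never appears in the served page.
function rot13(s) {
  return s.replace(/[a-zA-Z]/g, (c) => {
    const base = c <= 'Z' ? 65 : 97; // uppercase vs lowercase alphabet
    return String.fromCharCode(((c.charCodeAt(0) - base + 13) % 26) + base);
  });
}

const encoded = rot13('http://my.site/page.htm');
// the page would decode only at click time, e.g.
// onclick="location.href = rot13(this.dataset.url)"
```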

                          I'm trying to think of a way to call a style sheet with variable (js -
                          date based) parameters. Then you could really upset the page layout
                          after a fortnight.

                          All the above is just thoughts.

                          --
                          PETER FOX Not the same since the bookshop idea was shelved
                          peterfox@eminent.demon.co.uk.not.this.bit.no.html
                          2 Tees Close, Witham, Essex.
                          Gravity beer in Essex <http://www.eminent.demon.co.uk>


                          • News KF

                            #14
                            Re: Tricks to help prevent site ripping

                            Hi Dave,

                            Another trick, that can help a little.

                            Use javascript at least for the creation of parts of the internal links.
                            Most harvesters don't execute javascript and will therefore never follow
                            links that are created by javascript.
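A minimal sketch of that trick (names invented): the full path never appears in the static markup and is only assembled when the script actually runs, which most harvesters don't do:

```javascript
// Build internal links at runtime so non-JS harvesters never see them.
function makeLink(dir, page) {
  // the pieces are joined only when the script executes
  return '<a href="/' + dir + '/' + page + '.php">' + page + '</a>';
}

// in the browser, something like:
// document.write(makeLink('articles', 'intro'));
```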

                            bye



                            nkf



                            Dave Turner wrote:
                            >> So you want to prevent people from reading your HTML?
                            >
                            >
                            > err, no ... (obviously). Re-read my question. I didn't ask how to stop
                            > people reading HTML, I asked if anyone knew of any tricks (of which there
                            > are at least several) which can be used to slow people down from ripping
                            > your site's content (i.e. to use your website design for themselves).
                            > Obviously this is something which can't be achieved 100% - if somebody
                            > has enough time, skill and patience then they can copy/rip any site, but
                            > slowing them down will deter most people.
                            >
                            >


                            • ChronoFish

                              #15
                              Re: Tricks to help prevent site ripping

                              Yes - and it's almost 100% and fairly easy to implement:

                              AJAX

                              Each page is made up of nothing but an AJAX loader. Doing view source,
                              save page, etc will view/save the AJAX loader.

                              The AJAX loader simply makes requests (one or many) to your PHP code.
                              By requiring your AJAX loader to request the page in a relatively short
                              period of time (i.e. 1 second), the URL that AJAX asks for is only valid
                              for that "short period of time"
                              (for instance
                              https://www.yourpage.com/CoolSite?ge...2&key=234A32CD) where ts
                              is a unix timestamp and key is an MD5 hash of timestamp+magic word.

                              This will allow users to copy/paste your CONTENT but not your HTML (and
                              of course your PHP is safe as long as you are not distributing it and
                              as long as you don't have a server breach).

                              -CF

