Cookie Module

  • N.K

    Cookie Module

    Hi ,

    Python's existing Cookie module doesn't support the newer cookie
    header, Set-Cookie2.
    How do I submit a patch for that? I tried emailing the person who owns
    the module, but got no response.

    Thanks,
    Nirmal
  • Peter Hansen

    #2
    Re: Cookie Module

    "N.K" wrote:
    >
    > Python's existing Cookie module doesn't support the newer cookie
    > header, Set-Cookie2.
    > How do I submit a patch for that? I tried emailing the person who owns
    > the module, but got no response.

    Have you looked at this page? http://www.python.org/dev/

    Note the reference to the "Patch Manager"...

    -Peter


    • John J. Lee

      #3
      Re: Cookie Module

      nirmalkannan@hotmail.com (N.K) writes:

      > Python's existing Cookie module doesn't support the newer cookie
      > header, Set-Cookie2.
      [...]

      What do you want this for? I'm curious, and I suspect you're unaware
      that the RFCs on cookies are *not* the standards followed by most of
      the web -- the real 'standards' aren't really written down anywhere.

      Wait a minute (my memory works slowly), didn't I write you an email
      about this a while back? Are you the masochist^H^H^H^H^H^H^H^H^Hguy
      who wrote some client-side cookie-handling code for a web-crawler
      (called harvestman, IIRC?)?


      John


      • Anand Pillai

        #4
        Re: Cookie Module

        It is him all right, and I maintain HarvestMan (capital 'H', capital
        'M').
        But I don't think Nirmal is a masochist, not in any way.

        I don't understand why you called him a "masochist". The reason
        we wrote our own cookie-handling module using Python's Cookie module
        was that the ClientCookie module was doing too many things, which
        we wanted to avoid.

        ClientCookie is exactly what it says: a module that acts as
        a web client apart from managing cookies. The ClientCookie module
        borrows the methods of urllib2, like urlopen(), and manages cookies
        under the covers, so to speak. Good design, no doubt, but not what
        we wanted.

        ClientCookie is something like a wrapper over urllib2 plus cookie
        handling. We wanted a module that works *with* urllib2 and does
        not wrap it. With all due respect to CC, it cannot be modified
        to do that without writing clunky code, which I did not want.

        Hence Nirmal read the RFCs for cookie handling and implemented
        a module which works *along with* urllib2 rather than wrapping
        it.
        The module needs special calls to set the cookie, which are part of
        the HarvestMan code (in another module), so in a way it is much
        inferior to ClientCookie, which does it transparently.

        Nirmal is talking about RFCs because we want a technically correct
        cookie implementation for our module, no matter what web servers
        do in their wheels and gears. I know that the Set-Cookie2 header comes
        from the latest cookie RFC, which almost no web server supports, but
        still the question should be taken in a spirit of technical correctness.

        Yeah, it has an academic quality to it, not of much practical use maybe
        right now, but I cannot understand how it makes the guy a masochist.

        -Anand Pillai

        jjl@pobox.com (John J. Lee) wrote in message news:<871xt3q116.fsf@pobox.com>...
        > nirmalkannan@hotmail.com (N.K) writes:
        >
        > > Python's existing Cookie module doesn't support the newer cookie
        > > header, Set-Cookie2.
        > [...]
        >
        > What do you want this for? I'm curious, and I suspect you're unaware
        > that the RFCs on cookies are *not* the standards followed by most of
        > the web -- the real 'standards' aren't really written down anywhere.
        >
        > Wait a minute (my memory works slowly), didn't I write you an email
        > about this a while back? Are you the masochist^H^H^H^H^H^H^H^H^Hguy
        > who wrote some client-side cookie-handling code for a web-crawler
        > (called harvestman, IIRC?)?
        >
        >
        > John


        • N.K

          #5
          Re: Cookie Module

          > What do you want this for? I'm curious, and I suspect you're unaware
          > that the RFCs on cookies are *not* the standards followed by most of
          > the web -- the real 'standards' aren't really written down anywhere.


          True, I was forced to use RFC 2965, and it has more reserved
          keywords, such as 'port', 'Discard', etc. Anyway, it is not a bad thing
          to update a module.

          _reserved = { "expires" : "expires",
                        "path" : "Path",
                        "comment" : "Comment",
                        "domain" : "Domain",
                        "max-age" : "Max-Age",
                        "secure" : "secure",
                        "version" : "Version",
                      }
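
For readers on Python 3, where the Cookie module lives on as http.cookies, the extra RFC 2965 attributes mentioned above could be sketched as a Morsel subclass. This is only an illustration, not a proposed patch: the names RFC2965Morsel and build_set_cookie2 are hypothetical, and the code leans on Morsel's private _reserved and _flags class attributes, which are implementation details that may change between versions.

```python
from http.cookies import Morsel


class RFC2965Morsel(Morsel):
    """Hypothetical Morsel that also accepts RFC 2965 attributes."""

    # Assumption: Morsel keeps its known attributes in the private
    # class-level dict `_reserved` (lowercase key -> header spelling).
    _reserved = dict(
        Morsel._reserved,
        port="Port",
        discard="Discard",
        commenturl="CommentURL",
    )
    # Assumption: valueless attributes are listed in the private `_flags` set.
    _flags = Morsel._flags | {"discard"}


def build_set_cookie2(name, value):
    """Return a Set-Cookie2 header line for a single cookie."""
    morsel = RFC2965Morsel()
    morsel.set(name, value, value)
    morsel["version"] = 1          # RFC 2965 requires a Version attribute
    morsel["port"] = '"80,8080"'   # quoted port list
    morsel["discard"] = True       # flag attribute, emitted without a value
    return morsel.output(header="Set-Cookie2:")


header_line = build_set_cookie2("session", "abc123")
```

Subclassing keeps the stock Morsel untouched, so ordinary Set-Cookie output elsewhere in a program is not affected.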

          > Wait a minute (my memory works slowly), didn't I write you an email
          > about this a while back? Are you the masochist^H^H^H^H^H^H^H^H^Hguy
          > who wrote some client-side cookie-handling code for a web-crawler
          > (called harvestman, IIRC?)?


          I discarded my Yahoo email ID because of spam; sorry for not
          replying. I am not that busy a guy; normally I reply to almost all
          emails :-)


          • JanC

            #6
            Re: Cookie Module

            pythonguy@Hotpop.com (Anand Pillai) wrote:

            > Yeah, it has an academic quality to it, not of much practical use maybe
            > right now, but I cannot understand how it makes the guy a masochist.

            JJL did implement it in ClientCookie, so I'm sure he knows by experience
            what a masochist is... ;-)

            --
            JanC

            "Be strict when sending and tolerant when receiving."
            RFC 1958 - Architectural Principles of the Internet - section 3.9


            • John J. Lee

              #7
              Re: Cookie Module

              nirmalkannan@hotmail.com (N.K) writes:

              > > What do you want this for? I'm curious, and I suspect you're unaware
              > > that the RFCs on cookies are *not* the standards followed by most of
              > > the web -- the real 'standards' aren't really written down anywhere.
              >
              >
              > True, I was forced to use RFC 2965.

              Who forced you?

              > And there are more reserved
              > keywords such as 'port', 'Discard', etc. Anyway, it is not a bad thing
              > to update a module.
              [...]

              What? I'm not certain I understand you, but I'll say again
              (especially since my second response to Anand seems to have vanished
              again): RFC 2965 is not only defunct as an internet protocol, it has
              *never* been used by more than a vanishing fraction of the internet.


              John


              • John J. Lee

                #8
                Re: Cookie Module

                [reposting since the original seems to have vanished]

                pythonguy@Hotpop.com (Anand Pillai) writes:

                > It is him all right, and I maintain HarvestMan (capital 'H', capital
                > 'M').
                > But I don't think Nirmal is a masochist, not in any way.
                >
                > I don't understand why you called him a "masochist". The reason
                > we wrote our own cookie-handling module using Python's Cookie module
                > was that the ClientCookie module was doing too many things, which
                > we wanted to avoid.

                Anand, first, it wasn't said with the *slightest* malice (after all, I
                meet my own criterion for masochism)!

                Still, I certainly *do* think you're duplicating effort for no obvious
                reason. You're quite at liberty to do that, of course, but so am I to
                call you a masochist for it ;-)

                > ClientCookie is exactly what it says: a module that acts as
                > a web client apart from managing cookies. The ClientCookie module
                > borrows the methods of urllib2, like urlopen(), and manages cookies
                > under the covers, so to speak. Good design, no doubt, but not what
                > we wanted.

                Well, no. It provides convenient replacements for the urllib2
                callables (that's what _urllib2_support.py is for). It certainly
                doesn't *require* you use that urllib2-wrapping stuff. All it
                requires is that you give CookieJar.extract_cookies /
                .add_cookie_header trivial request and response objects. In fact,
                since I happen to know that HarvestMan uses urllib2, I know you
                already *have* such objects.

                No wrapping involved:

                import urllib2, ClientCookie

                cj = ClientCookie.CookieJar()
                request = urllib2.Request("http://www.example.com/")
                cj.add_cookie_header(request)
                response = urllib2.urlopen(request)
                cj.extract_cookies(response, request)
                ...etc
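
For anyone reading this on Python 3: ClientCookie's code was later folded into the standard library as cookielib (Python 2.4), now http.cookiejar, and the same two calls still work. The sketch below repeats the exchange above without touching the network. FakeResponse is a hypothetical stand-in, since extract_cookies only needs an object whose info() method returns the response headers.

```python
import email.message
import urllib.request
from http.cookiejar import CookieJar


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; only info() is needed."""

    def __init__(self, set_cookie_values):
        self._headers = email.message.Message()
        for value in set_cookie_values:
            self._headers["Set-Cookie"] = value

    def info(self):
        return self._headers


cj = CookieJar()

# First exchange: the "server" sets a cookie and the jar stores it.
first_request = urllib.request.Request("http://www.example.com/")
cj.add_cookie_header(first_request)   # jar is empty, so nothing is added
response = FakeResponse(["sid=abc123; Path=/"])
cj.extract_cookies(response, first_request)

# Second exchange: the jar adds the stored cookie to the outgoing request.
second_request = urllib.request.Request("http://www.example.com/page")
cj.add_cookie_header(second_request)
cookie_header = second_request.get_header("Cookie")
```

With a real server, the FakeResponse line would simply be replaced by the object returned from urllib.request.urlopen(first_request).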


                It couldn't really be much simpler! In fact, I see your own
                CookieManager.SetCookie and CookieManager.add_cookie_header methods
                are directly analogous (if slightly less convenient, less
                standards-compliant, more ignorant of the de-facto cookie standard,
                and more buggy -- well OK, maybe not more buggy ;-).

                Well, OK, it could be simpler if you didn't even have to call those
                methods on cj. But of course, that's what the urllib2-replacement
                stuff is for, to add entirely automatic cookie handling:

                ClientCookie.urlopen(request) # no need for any tiresome method calls


                You're entirely free to use or ignore that stuff.

                Why does that optional support *wrap* urllib2 instead of *extending*
                it with a CookieHandler? Good question. If you want to have the
                urllib2 interface with automatic cookie handling in the sense above,
                you're currently *forced* to replace (parts of) urllib2 rather than
                extend it, due to the current design of urllib2. I hope to change
                that with a patch I've submitted and Jeremy Hylton plans to look at in
                his Copious Free Time.
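
As a point of comparison, later Python versions do offer cookie handling as an extension rather than a replacement: urllib2 (urllib.request in Python 3) ships an HTTPCookieProcessor handler that plugs into an opener. A minimal sketch, assuming Python 3 and making no network calls:

```python
import urllib.request
from http.cookiejar import CookieJar

cj = CookieJar()
cookie_handler = urllib.request.HTTPCookieProcessor(cj)

# build_opener() adds our handler alongside the default ones; nothing in
# urllib.request itself is replaced or wrapped.
opener = urllib.request.build_opener(cookie_handler)

# Names of the handlers the opener will consult for each request.
installed = [type(handler).__name__ for handler in opener.handlers]

# opener.open("http://www.example.com/") would now send and store cookies
# through cj automatically; it is left out here to avoid network access.
```

The jar stays an ordinary object, so code that prefers the explicit add_cookie_header / extract_cookies calls can share the same cj.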

                > ClientCookie is something like a wrapper over urllib2 plus cookie
                > handling. We wanted a module that works *with* urllib2 and does
                > not wrap it. With all due respect to CC, it cannot be modified
                > to do that without writing clunky code, which I did not want.

                That's just a misunderstanding -- it *has* been that way for a long
                time. Just ignore, or throw away, _urllib2_support.py!

                > Hence Nirmal read the RFCs for cookie handling and implemented

                Well, as I tried to tell Nirmal (unfortunately it seems that email
                address was dead), and as is explained in tiresome (though far from
                exhaustive) detail in the ClientCookie docs, browsers don't actually
                implement the RFCs. Well, lynx makes a good attempt at RFC 2109, and
                Opera does for 2965, but as long as the big browsers don't implement
                them -- and it's almost a certainty they never will -- nobody will
                actually be *using* either standard! They're just there to trip you
                up <0.7 wink>. As is the original cookie_spec.html, actually: it only
                bears a passing resemblance to the de facto standard actually used on
                the internet. Of course, as long as your userbase is sufficiently
                small (please don't feel insulted: the userbase of my code is
                doubtless pretty tiny), you may not notice (though I suspect you
                will), but people like Ronald Tschalar, author of a Java library
                called HTTPClient, have found themselves spending inordinate amounts of
                time fixing problems that arise by trying to implement only the RFCs,
                or by naively trying to combine the RFCs with the Netscape protocol.
                Of course, you can get around that to a large extent by not bothering
                to implement the security rules properly (which may be quite a
                reasonable thing to do in some cases).

                [...]
                > Nirmal is talking about RFCs because we want a technically correct
                > cookie implementation for our module, no matter what web servers
                > do in their wheels and gears. I know that the Set-Cookie2 header comes
                > from the latest cookie RFC, which almost no web server supports, but
                > still the question should be taken in a spirit of technical correctness.

                Your module claims to implement 2109, officially obsolete. Fine
                (though I suspect it's quite far from a full implementation). As for
                2965 (the standard that officially obsoletes 2109, and which
                ClientCookie does implement in addition to the Netscape protocol),
                even David Kristol seems to have given up on it, as has the single guy
                who was driving the effort to rescue it from compatibility problems
                with the Netscape protocol (Daniel Kian McKiernan). RFC 2965 now
                seems utterly dead as an internet protocol.
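
The same trade-off is visible in Python 3's http.cookiejar, where RFC 2965 handling exists but is switched off unless the policy asks for it. A rough sketch, with no network access: FakeResponse and cookies_accepted are hypothetical helpers, and the header value is made up for illustration.

```python
import email.message
import urllib.request
from http.cookiejar import CookieJar, DefaultCookiePolicy


class FakeResponse:
    """Hypothetical response stub carrying a single Set-Cookie2 header."""

    def __init__(self, set_cookie2_value):
        self._headers = email.message.Message()
        self._headers["Set-Cookie2"] = set_cookie2_value

    def info(self):
        return self._headers


def cookies_accepted(policy):
    """Count the cookies a jar with this policy keeps from a Set-Cookie2 header."""
    jar = CookieJar(policy=policy)
    request = urllib.request.Request("http://www.example.com/")
    response = FakeResponse('sid=abc123; Version="1"')
    jar.extract_cookies(response, request)
    return len(jar)


# The default policy ignores Set-Cookie2 entirely.
default_count = cookies_accepted(DefaultCookiePolicy())

# Opting in with rfc2965=True makes the jar accept the same header.
rfc2965_count = cookies_accepted(DefaultCookiePolicy(rfc2965=True))
```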

                > Yeah, it has an academic quality to it, not of much practical use maybe
                [...]

                As you say.


                John
