scanf in python

  • AMD

    scanf in python

    Hello,

    I often need to parse strings which contain a mix of characters,
    integers and floats, the C-language scanf function is very practical for
    this purpose.
    I've been looking for such a feature and I have been quite surprised to
    find that it has been discussed as far back as 2001 but never
    implemented. The recommended approach seems to be to use split and then
    atoi or atof or to use regex and then atoi and atof. Both approaches
    seem to be a lot less natural and much more cumbersome than scanf. If
    python already has a % string operator that behaves like printf, why not
    implement either a %% or << string operator to behave like scanf? Usage
    could look like the following:

    a, b, c = "%d %f %5c" %% "1 2.0 abcde"

    or

    a, b, c = "%d %f %5c" << "1 2.0 abcde"

    %% is closer to the % operator

    << seems more intuitive to me

    Either of these methods seems to me much simpler than:

    lst = "1 2.0 abcde".split()
    a = int(lst[0])
    b = float(lst[1])
    c = lst[2]

    or even worse when using regular expressions to parse such simple input.
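For comparison, here is a minimal sketch of the regular-expression route mentioned above, applied to the same sample string. The names `PATTERN` and `parse_line` are made up for illustration, and the pattern is only one plausible reading of what "%d %f %5c" would accept.

```python
import re

# An integer, a float, and a run of 1 to 5 characters:
# roughly the layout that "%d %f %5c" describes.
PATTERN = re.compile(r"\s*([-+]?\d+)\s+([-+]?\d+(?:\.\d*)?)\s+(.{1,5})")

def parse_line(line):
    """Match the line, then convert each captured field separately."""
    m = PATTERN.match(line)
    if m is None:
        raise ValueError("line does not match expected layout: %r" % line)
    a, b, c = m.groups()
    return int(a), float(b), c

print(parse_line("1 2.0 abcde"))  # (1, 2.0, 'abcde')
```

Matching and conversion are two separate steps here, which is the overhead the post is complaining about.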

    I like python because it is concise and easy to read and I really think
    it could use such an operator.

    I know this has been discussed before and many times, but all previous
    threads I found seem to be dead and I would like to invite further
    investigation of this topic.

    Cheers,

    André M. Descombes
  • Diez B. Roggisch

    #2
    Re: scanf in python

    I'm pretty certain python won't grow an additional operator for this.
    Yet you are free to create a scanf-implementation as 3rd-party-module.

    IMHO the usability of the approach is very limited though. First of all,
    the need to capture more than one input token is *very* seldom - nearly
    all commandline-tools I know that do require interactive user-input
    (like the linux kernel config tool) do so by providing either
    line-by-line value entry (including defaults, something you can't do
    with your approach), or even dialog-centric value entry with curses.

    So - I doubt you will gather much momentum on this. Good luck though.


    Diez


    • AMD

      #3
      Re: scanf in python

      Actually it is quite common, it is used for processing of files not for
      reading parameters. You can use it whenever you need to read a simple
      csv file or fixed format file which contains many lines with several
      fields per line.
      The advantage of the approach is that it combines the parsing and
      conversion of the fields into one operation.
      Another advantage of using simple formatting strings is that it allows
      for easy translation of these lines, just like you have with the %
      operator for output. I don't see why python can have an operator for
      output but it can't have one for input, it's just not symmetrical.
      I don't see why you can't use this method for line-by-line value entry,
      just add \n between your %s or %d.
      The method is quite versatile and much simpler than regular expressions
      plus conversion afterwards.
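As a rough sketch of the idea (parsing and conversion driven by one format string), the same thing can be written as a plain function instead of an operator. Everything here is hypothetical: the name `sscanf`, the directive table, and the choice to support only `%d`, `%f`, `%s` and `%Nc`. It is not the third-party scanf module mentioned elsewhere in this thread.

```python
import re

# Hypothetical mapping from a few scanf-style directives to
# (regex fragment, converter). A width is honored only for %c.
_DIRECTIVES = {
    "d": (r"([-+]?\d+)", int),
    "f": (r"([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)", float),
    "s": (r"(\S+)", str),
}
_TOKEN = re.compile(r"%(\d*)([dfsc])")

def _literal_to_regex(literal):
    # As in C, whitespace in the format matches any amount of whitespace.
    return r"\s*".join(re.escape(chunk) for chunk in re.split(r"\s+", literal))

def sscanf(fmt, text):
    """Parse text according to fmt; return a tuple of converted fields."""
    parts, converters, pos = [], [], 0
    for m in _TOKEN.finditer(fmt):
        parts.append(_literal_to_regex(fmt[pos:m.start()]))
        width, kind = m.groups()
        if kind == "c":
            # %Nc takes exactly N characters and does not skip whitespace.
            parts.append("(.{%d})" % int(width or 1))
            converters.append(str)
        else:
            pattern, convert = _DIRECTIVES[kind]
            parts.append(r"\s*" + pattern)
            converters.append(convert)
        pos = m.end()
    parts.append(_literal_to_regex(fmt[pos:]))
    match = re.match("".join(parts), text, re.DOTALL)
    if match is None:
        raise ValueError("input %r does not match format %r" % (text, fmt))
    return tuple(conv(g) for conv, g in zip(converters, match.groups()))

print(sscanf("%d %f %5c", "1 2.0 abcde"))    # (1, 2.0, 'abcde')
print(sscanf("%d,%f,%s", "7,3.5,widget"))    # (7, 3.5, 'widget')
```

The format string is translated into a regular expression on every call; a real implementation would probably cache the compiled pattern per format.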

      André


      • Robert Kern

        #4
        Re: scanf in python

        AMD wrote:
        > Hello,
        >
        > I often need to parse strings which contain a mix of characters,
        > integers and floats, the C-language scanf function is very practical for
        > this purpose.
        > I've been looking for such a feature and I have been quite surprised to
        > find that it has been discussed as far back as 2001 but never
        > implemented.

        The second Google hit is a pure Python implementation of scanf.



        --
        Robert Kern

        "I have come to believe that the whole world is an enigma, a harmless enigma
        that is made terrible by our own mad attempt to interpret it as though it had
        an underlying truth."
        -- Umberto Eco


        • AMD

          #5
          Re: scanf in python

          Robert Kern wrote:
          > AMD wrote:
          >> Hello,
          >>
          >> I often need to parse strings which contain a mix of characters,
          >> integers and floats, the C-language scanf function is very practical
          >> for this purpose.
          >> I've been looking for such a feature and I have been quite surprised
          >> to find that it has been discussed as far back as 2001 but never
          >> implemented.
          >
          > The second Google hit is a pure Python implementation of scanf.

          Hi Robert,

          I had seen this pure python implementation, but it is not as fast or as
          elegant as would be an implementation written in C directly within
          python with no need for import.

          Cheers,

          André


          • Fredrik Lundh

            #6
            Re: scanf in python

            AMD wrote:
            > I had seen this pure python implementation, but it is not as fast or as
            > elegant as would be an implementation written in C directly within
            > python with no need for import.

            maybe you should wait with disparaging comments about how Python is not
            what you want it to be until you've learned the language?

            </F>


            • AMD

              #7
              Re: scanf in python

              > AMD wrote:
              >> I had seen this pure python implementation, but it is not as fast or
              >> as elegant as would be an implementation written in C directly within
              >> python with no need for import.
              >
              > maybe you should wait with disparaging comments about how Python is not
              > what you want it to be until you've learned the language?
              >
              > </F>

              Hello Fredrik,

              I didn't think my comment would offend anyone, I'm sorry if it did. I
              have been developing in Python for about 5 years, my company uses
              Python as a scripting language for all of its products. We use Jython
              for our server products. I think I know it pretty well by now. So I
              think I have earned the right to try to suggest improvements to the
              language or at least intelligent discussion of new features without need
              for offensive comments, don't you think?

              André


              • Lawrence D'Oliveiro

                #8
                Re: scanf in python

                In message <4884e77a$0$29405$426a74cc@news.free.fr>, AMD wrote:
                > Actually it is quite common, it is used for processing of files not for
                > reading parameters. You can use it whenever you need to read a simple
                > csv file or fixed format file which contains many lines with several
                > fields per line.

                I do that all the time, in Python and C++, but I've never felt the need for
                a scanf-type function.

                For reading delimited fields in Python, you can use the .split string method.


                • AMD

                  #9
                  Re: scanf in python

                  > In message <4884e77a$0$29405$426a74cc@news.free.fr>, AMD wrote:
                  >> Actually it is quite common, it is used for processing of files not for
                  >> reading parameters. You can use it whenever you need to read a simple
                  >> csv file or fixed format file which contains many lines with several
                  >> fields per line.
                  >
                  > I do that all the time, in Python and C++, but I've never felt the need for
                  > a scanf-type function.

                  I agree scanf is not a must-have function but rather a nice-to-have
                  function.

                  > For reading delimited fields in Python, you can use .split string method.

                  Yes, that is what I use right now, but I still have to do the conversion
                  to integers, floats, dates as several separate steps. What is nice about
                  the scanf function is that it is all done in the same step. Exactly like
                  when you use % to format a string and you pass it a dictionary, it does
                  all the conversions to string for you.

                  Cheers,

                  André


                  • Fredrik Lundh

                    #10
                    Re: scanf in python

                    AMD wrote:
                    >> For reading delimited fields in Python, you can use .split string method.
                    >
                    > Yes, that is what I use right now, but I still have to do the conversion
                    > to integers, floats, dates as several separate steps. What is nice about
                    > the scanf function is that it is all done on the same step. Exactly like
                    > when you use % to format a string and you pass it a dictionary, it does
                    > all the conversions to string for you.

                    You're confusing surface syntax with processing steps. If you want to
                    do things on one line, just add a suitable helper to take care of the
                    processing. E.g. for whitespace-separated data:

                    >>> def scan(s, *types):
                    ...     return tuple(f(v) for (f, v) in zip(types, s.split()))
                    ...
                    >>> scan("1 2 3", int, int, float)
                    (1, 2, 3.0)

                    This has the additional advantage that it works with any data type that
                    provides a way to convert from string to that type, not just a small
                    number of built-in types. And you can even pass in your own local
                    helper, of course:

                    >>> def myfactory(n):
                    ...     return int(n) * "!"
                    ...
                    >>> scan("1 2 3", int, float, myfactory)
                    (1, 2.0, '!!!')
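Since dates came up earlier in the thread, here is a small sketch of the same helper with a date converter passed in. The `iso_date` function and the assumed `YYYY-MM-DD` layout are illustrative only; `scan` is repeated so the snippet runs on its own.

```python
from datetime import datetime

def scan(s, *types):
    # Same helper as in the post: one converter per whitespace-separated field.
    return tuple(f(v) for (f, v) in zip(types, s.split()))

def iso_date(text):
    # Hypothetical converter; assumes dates are written like 2008-07-22.
    return datetime.strptime(text, "%Y-%m-%d").date()

print(scan("2008-07-22 17 3.5", iso_date, int, float))
# (datetime.date(2008, 7, 22), 17, 3.5)
```

Note that `zip` stops at the shorter input, so extra fields or extra converters are silently ignored.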

                    If you're reading multiple columns of the same type, you might as well
                    inline the whole thing:

                    data = map(int, line.split())

                    For other formats, replace the split with slicing or a regexp. Or use a
                    ready-made module; there's hardly ever any reason to read standard CSV
                    files by hand when you can just do "import csv", for example.
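A minimal sketch of those two alternatives, under assumed inputs: `read_typed_rows` layers per-column conversion on top of the standard `csv` module, and `scan_fixed` replaces the split with slicing for fixed-width records. Both function names, the sample data, and the column layout are invented for the example.

```python
import csv
import io

def read_typed_rows(lines, types):
    """Yield one tuple per CSV row, converting each column with its type."""
    for row in csv.reader(lines):
        yield tuple(convert(field) for convert, field in zip(types, row))

def scan_fixed(line, layout):
    """Slice a fixed-width line; layout is a list of (start, stop, type)."""
    return tuple(convert(line[start:stop].strip())
                 for start, stop, convert in layout)

# Made-up sample data, standing in for an open file.
sample = io.StringIO('1,2.5,"hello, world"\n2,3.75,plain\n')
rows = list(read_typed_rows(sample, (int, float, str)))
print(rows)  # [(1, 2.5, 'hello, world'), (2, 3.75, 'plain')]

layout = [(0, 4, int), (4, 10, float), (10, 15, str)]
print(scan_fixed("  42  3.14abcde", layout))  # (42, 3.14, 'abcde')
```

The `csv` reader handles the quoted comma in the first row, which a plain `split(",")` would get wrong.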

                    Also note that function *creation* is relatively cheap in Python, and
                    since "def" is an executable statement, you can create them pretty much
                    anywhere; if you find that you need a helper somewhere in your code, just
                    put it there. The following is a perfectly valid pattern:

                    def myfunc(...):

                        def myhelper(...):
                            ...

                        myhelper(...)
                        myhelper(...)

                        for line in open(file):
                            myhelper(...)

                    (I'd say knowing when and how to abstract things away into a local
                    helper is an important step towards full Python fluency -- that is, the
                    point where you're able to pack "a lot of action in a small amount of
                    clear code" most of the time.)
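To make the local-helper pattern concrete, here is one possible runnable version. The function names and the "name count value" record layout are assumptions chosen for the example, and a list of strings stands in for the open file.

```python
def load_measurements(lines):
    """Parse whitespace-separated 'name count value' records."""

    def parse(line):
        # Local helper: visible only inside load_measurements.
        name, count, value = line.split()
        return name, int(count), float(value)

    records = []
    for line in lines:
        if line.strip():  # skip blank lines
            records.append(parse(line))
    return records

sample = ["alpha 3 1.5\n", "\n", "beta 10 0.25\n"]
print(load_measurements(sample))
# [('alpha', 3, 1.5), ('beta', 10, 0.25)]
```

Because `parse` lives inside the function that uses it, the record layout is described in exactly one place and does not leak into the module namespace.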

                    </F>


                    • AMD

                      #11
                      Re: scanf in python

                      Thanks Fredrik,

                      very nice examples.

                      André

                      • Lawrence D'Oliveiro

                        #12
                        Re: scanf in python

                        In message <4889ae4a$0$3726$426a74cc@news.free.fr>, AMD wrote:
                        >> In message <4884e77a$0$29405$426a74cc@news.free.fr>, AMD wrote:
                        >>> Actually it is quite common, it is used for processing of files not for
                        >>> reading parameters. You can use it whenever you need to read a simple
                        >>> csv file or fixed format file which contains many lines with several
                        >>> fields per line.
                        >>
                        >> I do that all the time, in Python and C++, but I've never felt the need
                        >> for a scanf-type function.
                        >
                        > I agree scanf is not a must have function but rather a nice to have
                        > function.

                        I've never felt that scanf would be "nice" to have. Not in Python, not in
                        C++.


                        • tutufan@gmail.com

                          #13
                          Re: scanf in python

                          On Jul 22, 2:00 pm, AMD <amdescom...@gmail.com> wrote:
                          > Hello Fredrik,
                          >
                          > I didn't think my comment would offend anyone [...]

                          I doubt that it offended anyone else. Having been the recipient of a
                          few F-bombs :-) myself, I'd just let it go by...

                          Mike


                          • castironpi

                            #14
                            Re: scanf in python

                            On Aug 3, 8:27 am, "tutu...@gmail.com" <tutu...@gmail.com> wrote:
                            > On Jul 22, 2:00 pm, AMD <amdescom...@gmail.com> wrote:
                            >> Hello Fredrik,
                            >>
                            >> I didn't think my comment would offend anyone [...]
                            >
                            > I doubt that it offended anyone else.  Having been the recipient of a
                            > few F-bombs :-) myself, I'd just let it go by...
                            >
                            > Mike

                            The regular expression module (re) should be pretty handy at this.
                            This may not be a typical case, but:

                            >>> re.match(r"([0-9]+) ([0-9]*\.?[0-9]+) (.{1,5})", '1 2.02 abcde').groups()
                            ('1', '2.02', 'abcde')

                            The float catcher doesn't catch "2." If you need that to work or
                            other classes, speak up.
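One possible way to make the float part also accept "2." is to allow digits on either side of the decimal point and then convert the groups. This is only a sketch: the names `FLOAT`, `PATTERN` and `parse` are made up, and signs and exponents are deliberately left out.

```python
import re

# Float with digits on at least one side of an optional decimal point:
# matches "2", "2.", ".5" and "2.02".
FLOAT = r"[0-9]+\.?[0-9]*|\.[0-9]+"
PATTERN = re.compile(r"([0-9]+) (%s) (.{1,5})" % FLOAT)

def parse(text):
    """Match the three fields, then convert the first two to numbers."""
    m = PATTERN.match(text)
    if m is None:
        raise ValueError("no match for %r" % text)
    n, x, s = m.groups()
    return int(n), float(x), s

print(parse("1 2. abcde"))    # (1, 2.0, 'abcde')
print(parse("1 2.02 abcde"))  # (1, 2.02, 'abcde')
```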
