regex question

  • Fred Mangusta

    regex question

    Hi,

    I would like to delete all the instances of a '.' within a number.

    In other words I'd like to replace all the instances of a '.' character
    with something (say nothing at all) when the '.' is representing a
    decimal separator. E.g.

    500.675 ----> 500675

    but also

    1.000.456.344 ----> 1000456344

    I don't care about the fact that the resulting number is difficult to
    read: as long as it remains a series of digits it's ok: the important
    thing is to get rid of the period, because I want to keep it only where
    it marks the end of a sentence.

    I was trying to do like this

    s=re.sub("[(\d+)(\.)(\d+)]","... ",s)

    but I don't know much about regular expressions, and don't know how to
    get the two groups of numbers and join them in the sub. Moreover doing
    like this I only match things like "345.000" and not "1.000.000".

    What's the correct approach?

    Thanks
    F.
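
A minimal sketch (not from the thread) of why the attempted pattern misbehaves: square brackets define a character class, so the parentheses, `+` and `\.` inside them are just members of a set of single characters rather than groups and quantifiers. The name `attempt` is only illustrative.

```python
import re

# Inside [...] nothing is grouped or repeated: this matches ONE character
# that is a digit, '(', ')', '+' or '.'.
attempt = re.compile(r"[(\d+)(\.)(\d+)]")

# Every digit and the period are matched one character at a time.
print(attempt.findall("500.675"))

# So a substitution replaces each of those characters individually
# instead of treating "500.675" as one number.
print(attempt.sub("#", "500.675"))
```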
  • Marc 'BlackJack' Rintsch

    #2
    Re: regex question

    On Tue, 05 Aug 2008 11:39:36 +0100, Fred Mangusta wrote:
    > What's the correct approach?

    In [13]: re.sub(r'(\d)\.(\d)', r'\1\2', '1.000.456.344')
    Out[13]: '1000456344'

    Ciao,
    Marc 'BlackJack' Rintsch
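
For illustration, the capture-group substitution can be wrapped in a small helper and tried on a full sentence. The function name `join_with_groups` and the sample strings are assumptions made for this sketch. It also shows one limitation: each match consumes the digits on both sides, so when single digits sit between periods (as in "1.2.3") the second period is not removed.

```python
import re

def join_with_groups(text):
    # Capture the digit on each side of the period and write both back,
    # dropping only the period between them.
    return re.sub(r"(\d)\.(\d)", r"\1\2", text)

# The sentence-ending periods are followed by a space or nothing, not a
# digit, so they survive.
print(join_with_groups("Total: 1.000.456.344. Next sentence."))

# The "2" is consumed by the first match, so the second period stays.
print(join_with_groups("1.2.3"))
```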


    • Jeff

      #3
      Re: regex question

      On Aug 5, 7:10 am, Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
      > In [13]: re.sub(r'(\d)\.(\d)', r'\1\2', '1.000.456.344')
      > Out[13]: '1000456344'

      Even faster:

      '1.000.456.344'.replace('.', '') => '1000456344'


      • Chris

        #4
        Re: regex question

        On Aug 5, 2:23 pm, Jeff <jeffo...@gmail.com> wrote:
        > Even faster:
        >
        > '1.000.456.344'.replace('.', '') => '1000456344'

        Doesn't work for his use case as he wants to keep periods marking the
        end of a sentence.
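
A quick sketch of the problem, using a made-up sample sentence: `str.replace` has no notion of context, so it removes every period, including the ones that end sentences.

```python
text = "The total is 1.000.456.344. That is a lot."

# Every '.' is removed, whether or not it sits between digits.
stripped = text.replace(".", "")
print(stripped)
```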


        • Fred Mangusta

          #5
          Re: regex question

          Chris wrote:
          > Doesn't work for his use case as he wants to keep periods marking the
          > end of a sentence.

          Exactly. Thanks to all of you anyway, now I have a better understanding
          of how to go on :)

          F.


          • MRAB

            #6
            Re: regex question

            On Aug 5, 11:39 am, Fred Mangusta <a...@bbb.it> wrote:
            > What's the correct approach?

            I would use look-behind (is it preceded by a digit?) and look-ahead
            (is it followed by a digit?):

            s = re.sub(r'(?<=\d)\.(?=\d)', '', s)
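
A small sketch of the lookaround version in use. The names `DIGIT_PERIOD` and `join_with_lookaround` and the sample text are assumptions for illustration. Because lookarounds only check the neighbouring characters without consuming them, adjacent periods such as those in "1.2.3" are all removed, while a period followed by a space or the end of the string is kept.

```python
import re

# A period that has a digit immediately before it and immediately after it.
# The lookbehind and lookahead assert the digits but do not consume them.
DIGIT_PERIOD = re.compile(r"(?<=\d)\.(?=\d)")

def join_with_lookaround(text):
    # Replace only the period itself; the surrounding digits are untouched.
    return DIGIT_PERIOD.sub("", text)

print(join_with_lookaround("It costs 500.675. Version 1.2.3 is out."))
```

One caveat with this approach: a sentence that ends in a digit and is followed directly by another digit with no space in between would also lose its period, so it relies on sentences being separated by whitespace.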


            • Tobiah

              #7
              Re: regex question

              On Tue, 05 Aug 2008 15:55:46 +0100, Fred Mangusta wrote:
              > Chris wrote:
              >
              >> Doesn't work for his use case as he wants to keep periods marking the
              >> end of a sentence.

              Doesn't it? The period has to be surrounded by digits in the
              example solution, so wouldn't periods followed by a space
              (end of sentence) always make it through?



