Remove some characters from a string

This topic is closed.
  • Julien

    Remove some characters from a string

    Hi,

    I can't seem to find the right regular expression to achieve what I
    want. I'd like to remove all characters from a string that are not
    numbers, letters or underscores.

    For example:
    >>> magic_function('si_98%u^d@.as-*gf')
    str: 'si_98udasgf'

    Would you have any hint?

    Thanks a lot!

    Julien
  • Chris

    #2
    Re: Remove some characters from a string

    On Jul 17, 10:13 am, Julien <jpha...@gmail.com> wrote:
    > Hi,
    >
    > I can't seem to find the right regular expression to achieve what I
    > want. I'd like to remove all characters from a string that are not
    > numbers, letters or underscores.
    >
    > For example:
    >
    > >>> magic_function('si_98%u^d@.as-*gf')
    > str: 'si_98udasgf'
    >
    > Would you have any hint?
    >
    > Thanks a lot!
    >
    > Julien

    One quick and dirty way would be...

    import string
    safe_chars = string.ascii_letters + string.digits + '_'
    test_string = 'si_98%u^d@.as-*gf'
    ''.join([char if char in safe_chars else '' for char in test_string])

    You could also use a translation table; see string.translate (the
    table it uses can be made with string.maketrans).
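
    A minimal sketch of the translation-table idea, assuming Python 3, where
    the helpers are str.maketrans and str.translate rather than the string
    module functions named above. The name strip_unsafe is made up for
    illustration. It builds a deletion table from the unwanted characters
    that actually occur in the input.

```python
import string

# Characters to keep: ASCII letters, digits, and the underscore.
SAFE_CHARS = string.ascii_letters + string.digits + "_"


def strip_unsafe(text, keep=SAFE_CHARS):
    """Return text with every character not in keep removed (hypothetical helper)."""
    # Collect the characters in the input that are not on the keep list.
    unwanted = set(text) - set(keep)
    # The third argument of str.maketrans lists characters to delete.
    table = str.maketrans("", "", "".join(unwanted))
    return text.translate(table)


print(strip_unsafe("si_98%u^d@.as-*gf"))  # si_98udasgf
```

    Building the table per call keeps the sketch short. If the set of
    unwanted characters is known up front, the table could be built once
    and reused.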


    • Paul Hankin

      #3
      Re: Remove some characters from a string

      On Jul 17, 9:13 am, Julien <jpha...@gmail.com> wrote:
      > Hi,
      >
      > I can't seem to find the right regular expression to achieve what I
      > want. I'd like to remove all characters from a string that are not
      > numbers, letters or underscores.
      >
      > For example:
      >
      > >>> magic_function('si_98%u^d@.as-*gf')
      > str: 'si_98udasgf'

      For speed, you can use 'string.translate', but simplest is to use a
      comprehension:

      import string

      def magic_function(s, keep=string.ascii_letters + string.digits + '_'):
          return ''.join(c for c in s if c in keep)

      --
      Paul Hankin


      • Fredrik Lundh

        #4
        Re: Remove some characters from a string

        Julien wrote:
        > I can't seem to find the right regular expression to achieve what I
        > want. I'd like to remove all characters from a string that are not
        > numbers, letters or underscores.
        >
        > For example:
        >
        > >>> magic_function('si_98%u^d@.as-*gf')
        > str: 'si_98udasgf'

        the easiest way is to replace the things you don't want with an empty
        string:

        >>> re.sub("\W", "", "si_98%u^d@.as-*gf")
        'si_98udasgf'

        ("\W" matches everything that is "not numbers, letters, or underscores",
        where the alphabet defaults to ASCII. to include non-ASCII letters, add
        "(?u)" in front of the expression, and pass in a Unicode string).

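        A small sketch of the ASCII versus Unicode distinction, assuming
        Python 3. There, str patterns match Unicode word characters by
        default, and the re.ASCII flag gives the ASCII-only behaviour
        described above. The helper name strip_non_word and its ascii_only
        parameter are made up for illustration.

```python
import re


def strip_non_word(text, ascii_only=True):
    """Remove every character matched by \\W (hypothetical helper)."""
    # re.ASCII limits \w and \W to ASCII letters, digits, and underscore.
    flags = re.ASCII if ascii_only else 0
    return re.sub(r"\W", "", text, flags=flags)


sample = "caf\u00e9_1!"  # "café_1!" with a precomposed e-acute

print(strip_non_word("si_98%u^d@.as-*gf"))       # si_98udasgf
print(strip_non_word(sample))                    # caf_1
print(strip_non_word(sample, ascii_only=False))  # café_1
```

        The pattern is written as a raw string so that the backslash reaches
        the regex engine unchanged.
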
        </F>
