regexp help

**Paul McGuire** · Jun 27 '08, 04:24 PM

Re: regexp help

On May 9, 5:19 pm, globalrev <skanem...@yahoo.se> wrote:

> i want to do a little string manipulation and im looking into regexps. i
> couldnt find out how to do:
> s = 'poprorinoncoce'
> re.sub('$o$', '$', s)
> should result in 'prince'
>
> $ is obv the wrong character to use but what i mean is the pattern is
> "consonant o consonant" and should be replaced by just "consonant".
> both consonants should be the same too.
> so mole would be mole
> mom would be m etc

from re import *
vowels = "aAeEiIoOuU"
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
encodeRe = re.compile(r"([%s])[%s]\1" % (cons,vowels))
print encodeRe.sub(r"\1",s)

This is actually a little more complex than you asked - it will search
for any consonant-vowel-same_consonant triple, and replace it with the
leading consonant. To meet your original request, change to:

from re import *
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
encodeRe = re.compile(r"([%s])o\1" % cons)
print encodeRe.sub(r"\1",s)

Both print "prince".

-- Paul

(I have a pyparsing solution too, but I just used it to prototype
the solution, then converted it to regex.)
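
For readers on Python 3, here is a minimal sketch of the two patterns described above. It assumes `import re` is used and that `s` holds the string from the question. The names `any_vowel_re` and `only_o_re` are made up for this illustration and are not from the original post.

```python
import re

s = "poprorinoncoce"
vowels = "aAeEiIoOuU"
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"

# Consonant, any vowel, then the same consonant again (backreference \1).
any_vowel_re = re.compile(r"([%s])[%s]\1" % (cons, vowels))
# Consonant, a literal "o", then the same consonant again.
only_o_re = re.compile(r"([%s])o\1" % cons)

print(any_vowel_re.sub(r"\1", s))       # prince
print(only_o_re.sub(r"\1", s))          # prince

# The two patterns differ once the middle vowel is not "o".
print(any_vowel_re.sub(r"\1", "papa"))  # pa
print(only_o_re.sub(r"\1", "papa"))     # papa
```

The broader pattern also collapses triples such as "pap", which is why the narrower one is closer to the original request.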

**Matimus** · Jun 27 '08, 04:24 PM

Re: regexp help

On May 9, 3:19 pm, globalrev <skanem...@yahoo.se> wrote:

> i want to do a little string manipulation and im looking into regexps. i
> couldnt find out how to do:
> s = 'poprorinoncoce'
> re.sub('$o$', '$', s)
> should result in 'prince'
>
> $ is obv the wrong character to use but what i mean is the pattern is
> "consonant o consonant" and should be replaced by just "consonant".
> both consonants should be the same too.
> so mole would be mole
> mom would be m etc

>>> import re
>>> s = 'poprorinoncoce'
>>> coc = re.compile(r"(.)o\1")
>>> coc.sub(r'\1', s)
'prince'

Matt
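
One thing worth noting about the `(.)` version: `.` matches any character except a newline, not only consonants. The sketch below compares it with a consonant character class. The names `any_char_re` and `consonant_re` and the test strings are invented for illustration.

```python
import re

cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"

# "(.)" captures any single character (except newline), consonant or not.
any_char_re = re.compile(r"(.)o\1")
# The character class restricts the captured character to consonants.
consonant_re = re.compile(r"([%s])o\1" % cons)

# Both agree on the string from the question.
print(any_char_re.sub(r"\1", "poprorinoncoce"))   # prince
print(consonant_re.sub(r"\1", "poprorinoncoce"))  # prince

# They disagree when an "o" sits between two identical non-consonants.
dot_result = any_char_re.sub(r"\1", "a o b")      # space-o-space collapses: "a b"
cls_result = consonant_re.sub(r"\1", "a o b")     # left alone: "a o b"
print(repr(dot_result), repr(cls_result))
```

For input that only ever contains encoded words, the shorter pattern is enough. The character class matters once spaces, digits, or runs of "o" can appear.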

**John Machin** · Jun 27 '08, 04:24 PM

Re: regexp help

Paul McGuire wrote:

> from re import *

Perhaps you intended "import re".

> vowels = "aAeEiIoOuU"
> cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
> encodeRe = re.compile(r"([%s])[%s]\1" % (cons,vowels))
> print encodeRe.sub(r"\1",s)
>
> This is actually a little more complex than you asked - it will search
> for any consonant-vowel-same_consonant triple, and replace it with the
> leading consonant. To meet your original request, change to:
>
> from re import *

And again.

> cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
> encodeRe = re.compile(r"([%s])o\1" % cons)
> print encodeRe.sub(r"\1",s)
>
> Both print "prince".

No they don't. The result is "NameError: name 're' is not defined".
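
A small sketch of why that happens, written for Python 3: `from re import *` binds the names the module exports (such as `compile`) but not the name `re` itself. The scratch `namespace` dictionary is only a device for this illustration so the star import does not touch the surrounding module.

```python
# Run the star import in a scratch namespace so this module's own names are untouched.
namespace = {}
exec("from re import *", namespace)

has_compile = "compile" in namespace  # re.compile is now available as plain `compile`
has_re = "re" in namespace            # the module name itself was never bound

try:
    # Same shape as the posted code: star import, then a call through `re.`
    exec("from re import *\nre.compile('x')", {})
    error = None
except NameError as exc:
    error = str(exc)

print(has_compile, has_re)
print(error)
```

With the star import, the posted code would have to call `compile(...)` directly. With `import re`, the `re.compile(...)` spelling works as written.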

**globalrev** · Jun 27 '08, 04:24 PM

Re: regexp help

ty. that was the decrypt function. i am also writing an encrypt
function.

def encrypt(phrase):
    pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")
    return pattern.sub(r"1\o\1", phrase)

doesnt work though, h becomes 1\\oh.

def encrypt(phrase):
    pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")
    return pattern.sub(r"o\1", phrase)

returns oh.

i want hoh.

i dont quite get it. why cant i delimit pattern with \

**John Machin** · Jun 27 '08, 04:24 PM

Re: regexp help

globalrev wrote:

> ty. that was the decrypt function. i am also writing an encrypt
> function.
>
> def encrypt(phrase):
>     pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")

The inner pair of () are not necessary.

>     return pattern.sub(r"1\o\1", phrase)
>
> doesnt work though, h becomes 1\\oh.

To be precise, "h" becomes "1\\oh", which is the same as r"1\oh". There
is only one backslash in the result.

It's doing exactly what you told it to do: replace each consonant by
(1) the character '1'
(2) a backslash
(3) the character 'o'
(4) the consonant

> def encrypt(phrase):
>     pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")
>     return pattern.sub(r"o\1", phrase)
>
> returns oh.

It's doing exactly what you told it to do: replace each consonant by
(1) the character 'o'
(2) the consonant

> i want hoh.

So tell it to do that:
    return pattern.sub(r"\1o\1", phrase)

> i dont quite get it. why cant i delimit pattern with \

Perhaps you could explain what you mean by "delimit pattern with \".

**globalrev** · Jun 27 '08, 04:24 PM

Re: regexp help

> The inner pair of () are not necessary.

yes they are?

ty anyway, got it now.

**John Machin** · Jun 27 '08, 04:24 PM

Re: regexp help

globalrev wrote:

>> The inner pair of () are not necessary.
>
> yes they are?

You are correct. I was having a flashback to a dimly remembered previous
incarnation during which I used regexp software in which something like
& or \0 denoted the whole match (like MatchObject.group(0)) :-)
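
For what it's worth, Python's `re` does have a whole-match reference for replacement strings, spelled `\g<0>`. The sketch below (Python 3; `consonant_re` and `encoded` are names invented for this example) shows the encrypt step written without a capturing group.

```python
import re

CONSONANTS = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"

# No capturing group here: the pattern is just the character class.
consonant_re = re.compile("[%s]" % CONSONANTS)

# \g<0> stands for the whole match, so each consonant becomes consonant-o-consonant.
encoded = consonant_re.sub(r"\g<0>o\g<0>", "prince")
print(encoded)  # poprorinoncoce
```

So the parentheses are needed when the replacement uses `\1`, but not when it uses `\g<0>`.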

**Paul McGuire** · Jun 27 '08, 04:24 PM

Re: regexp help

On May 9, 6:52 pm, John Machin <sjmac...@lexicon.net> wrote:

> Paul McGuire wrote:
>
>> from re import *
>
> Perhaps you intended "import re".

Indeed I did.

>> Both print "prince".
>
> No they don't. The result is "NameError: name 're' is not defined".

Dang, now how did that work in my script? I assure you I did test it
before posting.

Ah! My pyparsing prototype preceded the regex version in the same
script, and importing the pyparsing module imports re using "import
re". That is why I didn't get NameError. Sorry for sloppy posting...

Once you clean up the mistakes, you essentially get the same code as
earlier posted by Matimus.

-- Paul
