On Thu, 30 Oct 2008 13:50:47 +0300, Seid Mohammed wrote:
You have typed 9 characters but they are not encoded as 9 bytes. I guess
your environment uses UTF-8 as encoding, because mine does too and:
In [124]: abebe = 'አበበበሶ በላ'
In [125]: len(abebe)
Out[125]: 23
In [126]: s = 'አ'
In [127]: len(s)
Out[127]: 3
In [128]: s
Out[128]: '\xe1\x8a\xa0'
So that one character is encoded in three bytes. If you really want to
operate on characters instead of bytes, use `unicode` objects:
In [129]: u = abebe.decode('utf-8')
In [130]: len(u)
Out[130]: 9
In [131]: print u[0]
አ
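
For readers on Python 3, where `str` already holds Unicode characters and the separate `unicode` type no longer exists, a minimal sketch of the same byte-versus-character distinction might look like the following. The sample string `text` is a hypothetical shortened one chosen for illustration, not the exact input from the question.

```python
# Python 3 sketch: str counts characters (code points), bytes counts encoded bytes.

s = "\u12a0"  # the single Ethiopic letter 'አ' from the session above
text = "አበበ በላ"  # hypothetical sample: 5 Ethiopic letters and 1 ASCII space

# Encoding to UTF-8 yields what a Python 2 byte string would have held.
encoded = text.encode("utf-8")

print(len(s), len(s.encode("utf-8")))   # characters vs. bytes for one letter
print(len(text), len(encoded))          # characters vs. bytes for the sample

# Number of UTF-8 bytes each character needs (3 per Ethiopic letter, 1 for the space).
bytes_per_char = [len(ch.encode("utf-8")) for ch in text]
print(bytes_per_char)

# Decoding the bytes gives back the original characters.
decoded = encoded.decode("utf-8")
print(decoded == text)
```

The point is the same as in the Python 2 session: `len()` on the encoded bytes counts bytes, while `len()` on the decoded text counts characters.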
Ciao,
Marc 'BlackJack' Rintsch
ok
but I am still not clear about my problem. If I test this one
==============
>>> kk = 'how old are you'
>>> len(kk)
15
==========
but in my case
==========
>>> abebe = 'አበበበሶ በላ'
>>> len(abebe)
23
==========
Why is the length 23 when I am expecting only 9? I typed just 9
characters (including the space). There must be some kind of trick
to it.