Re: compressing short strings?

**Paul Rubin** · Jun 27 '08, 04:25 PM

"inhahe" <inhahe@gmail.com> writes:

> i don't see anybody mentioning huffman encoding. i think it just works per
> byte, so it's not as tight as gzip or whatever. but it sounds like it would
> be easy to implement and wouldn't require any corpus-wide compression
> information. except a character frequency count if you wanted to be optimal.

In principle you could do it over digraphs but I tried that once and
it didn't help much. Basically -because- it doesn't use any
corpus-wide compression information, it doesn't compress anywhere near
as well as LZ, DMC, or whatever.
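
For anyone who wants to try the per-byte approach, here is a minimal sketch of Huffman coding in Python. It is an illustration only, not code from this thread: the names `huffman_code`, `encode`, and `decode` are made up for the example, the output is a string of "0"/"1" characters rather than packed bytes, and the frequency table is taken from the sample string itself purely for convenience.

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Return {symbol: bitstring} for a mapping of symbol -> frequency."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        # Degenerate case: a lone symbol still needs one bit per occurrence.
        return {sym: "0" for sym in freqs}
    # Heap entries are (weight, tiebreak, {symbol: code-so-far}). The unique
    # integer tiebreak means Python never has to compare the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing one bit to each side.
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def encode(data, code):
    """Encode a bytes object as a string of '0'/'1' characters."""
    return "".join(code[b] for b in data)

def decode(bits, code):
    """Invert encode(); relies on the code being prefix-free."""
    inverse = {c: s for s, c in code.items()}
    out, cur = [], ""
    for bit in bits:
        cur += bit
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return bytes(out)

sample = b"compressing short strings"
code = huffman_code(Counter(sample))
bits = encode(sample, code)
assert decode(bits, code) == sample
print(len(sample) * 8, "bits raw ->", len(bits), "bits encoded")
```

Note that a table built from the message itself would have to be shipped along with the message, which for short strings would likely eat the savings. To use this the way the quoted post suggests, you would build `code` once from a character frequency count over the whole corpus and share it between encoder and decoder.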