Unicode help

  • geniuskanwal
    New Member
    • Jan 2008
    • 1

    Unicode help

    Before I begin to explain my problem, I just want to say that I can do the following two things:

    1. Using Perl, connect to an MS Access database table and perform the required operations. (The database is in English.)
    2. I can read and write UTF8 text files using Perl.

    The following code shows how I write UTF-8 files:

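    [CODE=perl]use strict;
    use warnings;

    my $infile = "out2.txt";
    #my $outfile = "out.txt";
    #open(my $out, ">:utf8", $outfile) or die "Cannot open $outfile: $!";

    open(my $in, "<:utf8", $infile) or die "Cannot open $infile: $!";

    # Set the output layer once, before anything is printed
    binmode(STDOUT, ":utf8");

    while (<$in>) {
        chomp;
        print "$_\n";

        # Pattern match needs =~ (or a bare match against $_), not ==,
        # and must run inside the loop while $_ still holds the line
        if (/\x{0932}/) {
            print "L";
        }
    }

    my $strvar = "\x{0932} \x{0917}";
    print $strvar . "done" . "\n";
    #print $out $strvar . "\n";
    close($in);[/CODE]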

    Here, I am reading from one UTF-8 file and writing to another UTF-8 file. In the new UTF-8 file I am also adding two more characters, whose hex values are in the variable "strvar", as can be seen in the program. When I open this file in WordPad or Notepad, everything is fine.

    Now I come to my question.
    I have enabled Hindi language/internationalization support on my machine. If I type Hindi directly into the tables in MS Access, it works. The problem is that when I try to insert a string from Perl into MS Access whose Unicode hex value is \x{0932}, it does not appear correctly in the table. The character corresponding to hex 0932 is ल (you can see it if you have Unicode support on your machine).


    However, instead of the above character, some garbled character is displayed on the screen. I don't know how to insert Unicode characters into MS Access from Perl (that is, how to issue an insert query from Perl that inserts Unicode strings). I need urgent help with this and would really appreciate it if someone could help me out.

    Thanks in advance.

    Kanwaljeet
    Last edited by eWish; Jan 12 '08, 08:06 PM. Reason: Added Code Tags
  • eWish
    Recognized Expert Contributor
    • Jul 2007
    • 973

    #2
    Welcome to TSDN!

    When you insert the data, are you using placeholders? And after the insert, are you expecting to see the hexadecimal representation in the database or the actual character?

    --Kevin


    • bvhoyweg
      New Member
      • Jan 2008
      • 2

      #3
      Can I hijack this thread? I'm having exactly the same problem.

      Yes, I'm using placeholders to insert data into MS Access. Inside MS Access, I see each UTF-8 character as two separate characters.

      When I do a select from Perl, I do get the characters back in UTF-8 format (when printing to a file).

      So it seems Perl treats characters as 8-bit objects, splitting each UTF-8 character into two (or more) pieces, because in MS Access each 8-bit part of a UTF-8 character is visible as a separate character.
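The byte-splitting described above can be reproduced without a database. This is only a minimal sketch using the core Encode module: it takes U+0932 from the original question, encodes it as UTF-8, and then misreads the resulting bytes one per character. ISO-8859-1 is an assumption chosen for illustration; the thread does not say which single-byte interpretation MS Access or DBD::ADO actually applies.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+0932 (DEVANAGARI LETTER LA), the character from the original question.
my $char     = "\x{0932}";
my $char_len = length $char;          # one character in a Perl character string

# Encoded as UTF-8, the same character occupies three bytes: E0 A4 B2.
my $bytes    = encode('UTF-8', $char);
my $byte_len = length $bytes;
my $hex      = join ' ', map { sprintf('%02X', ord($_)) } split(//, $bytes);

# Assumption for illustration: the receiving side reads one byte as one
# character (modelled here as ISO-8859-1), so each byte becomes its own character.
my $garbled     = decode('ISO-8859-1', $bytes);
my $garbled_len = length $garbled;

# Decoding the same bytes as UTF-8 recovers the original character.
my $restored = decode('UTF-8', $bytes);

binmode(STDOUT, ':encoding(UTF-8)');
print "characters: $char_len\n";
print "UTF-8 bytes: $byte_len ($hex)\n";
print "characters when misread byte by byte: $garbled_len\n";
print "round trip ok\n" if $restored eq $char;
```

If this matches what happens in practice, it would also explain why a select from Perl returns intact UTF-8: the same bytes come back unchanged, and only the view inside MS Access interprets them one at a time.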

      I'm using ActiveState Perl 5.10 with DBD::ADO.

      Bart
