Using detectEncodingFromByteOrderMarks while copying a text file

  • Claire

    Using detectEncodingFromByteOrderMarks while copying a text file

    After copying a text file line by line and comparing the result with the
    original, I noticed that the original has several bytes at the beginning
    denoting its encoding. How do I carry those over to my copy?
    My original code, shown below, didn't produce a perfect copy, so I used the
    StreamReader constructor that takes detectEncodingFromByteOrderMarks. But I
    need to pass the detected encoding to the constructor of my StreamWriter, so
    I need to be able to work out the encoding type somehow. How, please?

    string InputPath = Path.GetDirectoryName(Application.ExecutablePath) +
        @"\intext.txt";
    string OutputPath = Path.GetDirectoryName(Application.ExecutablePath)
        + @"\outtext.txt";
    string In;
    string Out;

    using (StreamReader Input = new StreamReader(InputPath))
    // using (StreamReader Input = new StreamReader(InputPath, true)) << construct
    {
        using (StreamWriter Output = new StreamWriter(OutputPath))
        {
            while ((In = Input.ReadLine()) != null)
            {
                Out = DoSomethingTo(In);
                Output.WriteLine(Out);
            }
        }
    }

  • Marc Gravell

    #2
    Re: Using detectEncodingFromByteOrderMarks while copying a text file

    I'm guessing - tell the writer about it?

    using (StreamWriter Output = new StreamWriter(OutputPath, false,
        Input.CurrentEncoding)) {...}

    Marc


    • Marc Gravell

      #3
      Re: Using detectEncodingFromByteOrderMarks while copying a text file

      Correction - the CurrentEncoding is not valid until it has read some
      data; perhaps something like below; note that it also can't detect every
      encoding possible...

      Marc

      using (StreamReader reader = new StreamReader(path1, true))
      {
          // Read once first so that CurrentEncoding reflects any detected BOM.
          string line = reader.ReadLine();
          using (StreamWriter writer = new StreamWriter(path2, false,
              reader.CurrentEncoding))
          {
              Console.WriteLine("Reading {0} with {1}", path1,
                  reader.CurrentEncoding.EncodingName);
              Console.WriteLine("Writing {0} with {1}", path2,
                  writer.Encoding.EncodingName);

              while (line != null)
              {
                  string t = Transform(line);
                  Console.WriteLine(t);
                  writer.WriteLine(t);
                  line = reader.ReadLine();
              }
          }
      }
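
To illustrate why only some encodings can be detected this way: detection relies on a byte order mark at the start of the file, and a file without one carries no such hint. The sketch below is not how StreamReader is implemented; it is a minimal, assumed helper (`BomSketch.DetectFromBom` is a hypothetical name) that compares the first bytes of a file against the preambles of the common Unicode encodings and returns null when none matches.

```csharp
using System;
using System.IO;
using System.Text;

static class BomSketch
{
    // Hypothetical helper: returns the encoding whose byte order mark starts
    // the file, or null when no known mark is present.
    public static Encoding DetectFromBom(string path)
    {
        // Longer marks are checked first, because the UTF-32 LE mark
        // (FF FE 00 00) begins with the UTF-16 LE mark (FF FE).
        Encoding[] candidates =
        {
            new UTF32Encoding(false, true),   // UTF-32 little-endian
            new UTF32Encoding(true, true),    // UTF-32 big-endian
            new UTF8Encoding(true),           // UTF-8 with mark
            new UnicodeEncoding(false, true), // UTF-16 little-endian
            new UnicodeEncoding(true, true)   // UTF-16 big-endian
        };

        byte[] head = new byte[4];
        int count;
        using (FileStream stream = File.OpenRead(path))
        {
            // A single Read is enough for a sketch; it may return fewer
            // bytes than requested, which the length check below tolerates.
            count = stream.Read(head, 0, head.Length);
        }

        foreach (Encoding candidate in candidates)
        {
            byte[] bom = candidate.GetPreamble();
            if (count < bom.Length) continue;

            bool match = true;
            for (int i = 0; i < bom.Length; i++)
            {
                if (head[i] != bom[i]) { match = false; break; }
            }
            if (match) return candidate;
        }
        return null; // no mark: the encoding cannot be determined this way
    }
}
```

A null result is the case the caveat above refers to: legacy code pages and mark-less UTF-8 look the same to this check, so the caller has to choose a fallback encoding.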


      • Claire

        #4
        Re: Using detectEncodingFromByteOrderMarks while copying a text file

        "Marc Gravell" <marc.gravell@gmail.com> wrote in message
        news:u4a125vxIHA.4912@TK2MSFTNGP03.phx.gbl...
        > Correction - the CurrentEncoding is not valid until it has read some data;
        > perhaps something like below; note that it also can't detect every
        > encoding possible...
        That's great! Thank you :)


        • Mihai N.

          #5
          Re: Using detectEncodingFromByteOrderMarks while copying a text file

          > Using detectEncodingFromByteOrderMarks while copying a text file
          Unless you process the text somehow, it is not worth the trouble to
          copy a text file as text file (with encoding detection, line ending,
          and so on).
          Just copy it as a binary. The routine can also be reused for any type
          of files, and there is no risk of data corruption if you "guess" the
          encoding wrong.
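
A minimal sketch of such a byte-for-byte copy, assuming no transformation of the text is needed. `BinaryCopySketch.CopyBytes` is a hypothetical name and the buffer size is arbitrary; because nothing is decoded, any byte order mark and the original line endings pass through unchanged.

```csharp
using System;
using System.IO;

static class BinaryCopySketch
{
    // Hypothetical helper: copies a file as raw bytes without decoding it.
    public static void CopyBytes(string sourcePath, string destinationPath)
    {
        using (FileStream input = File.OpenRead(sourcePath))
        using (FileStream output = File.Create(destinationPath))
        {
            byte[] buffer = new byte[81920]; // arbitrary buffer size
            int read;
            // Read returns 0 once the end of the file is reached.
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
        }
    }
}
```

When no custom loop is needed, the framework's `File.Copy(sourcePath, destinationPath, true)` does the same job in one call, overwriting an existing destination.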


          --
          Mihai Nita [Microsoft MVP, Visual C++]
          Best internationalization practices, solving internationalization problems.

          ------------------------------------------
          Replace _year_ with _ to get the real email


          • Marc Gravell

            #6
            Re: Using detectEncodingFromByteOrderMarks while copying a text file

            I very nearly said the same thing - but if you look carefully, there is
            a transform hidden in the code:

            Out = DoSomethingTo(In);
            Output.WriteLine(Out);

            Marc


            • Mihai N.

              #7
              Re: Using detectEncodingFromByteOrderMarks while copying a text file

              > I very nearly said the same thing - but if you look carefully, there is
              > a transform hidden in the code:
              Right, I missed that one. Got fooled by the subject :-)


              --
              Mihai Nita [Microsoft MVP, Visual C++]
              Best internationalization practices, solving internationalization problems.

              ------------------------------------------
              Replace _year_ with _ to get the real email
