How do you replace a hidden non-printable character in a text file?

  • Tom Brackney

    How do you replace a hidden non-printable character in a text file?

    The problem is that I have a 10 MB file that I need to read into my C# program and reformat. The file contains hidden GS, RS, US and SUB Unicode control characters. You cannot see them when the file is read in as a text file. You can see them if you read the file into a byte array; what you actually see is the byte value of the GS or RS. By the way, GS is Group Separator and RS is Record Separator. I think these are produced by an EBCDIC system. What I need to do is replace each GS or RS with a delimiter. I found a way to do that with a kind of find-and-replace on the byte array, but with a file this size it takes forever to run. I need to be able to read the file in and do a general replace, with Regex.Replace or something. I tried a regex but it doesn't seem to see the control characters. I don't know. Can anyone help?
  • Plater
    Recognized Expert
    • Apr 2007
    • 7872

    #2
    The file is "mostly" text, right?
    If you use a StreamReader, and change what it considers the end of the line, you can use ReadLine() (which removes the newline characters) and pop your new delimiter in.
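
    A minimal sketch of a streaming variant of this idea. ReadLine() itself only splits on carriage return and line feed, so this version reads fixed-size blocks of characters and swaps the separators as they pass through. The class name SeparatorStream, the 64 KB buffer and the choice of delimiter are all assumptions for illustration, not something from the original post.

```csharp
using System;
using System.IO;

static class SeparatorStream
{
    // The control characters named in the question.
    const char SUB = '\u001A', GS = '\u001D', RS = '\u001E', US = '\u001F';

    // Copies input to output, replacing each separator with `delimiter`.
    // Works on any TextReader/TextWriter pair (hypothetical helper).
    public static void Replace(TextReader input, TextWriter output, char delimiter)
    {
        char[] buffer = new char[64 * 1024];
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                char c = buffer[i];
                if (c == GS || c == RS || c == US || c == SUB)
                    buffer[i] = delimiter;
            }
            output.Write(buffer, 0, read);
        }
    }
}
```

    For a real file you would pass in a StreamReader and a StreamWriter opened with whatever encoding the file actually uses. Each character is visited once, so the run time should grow linearly with the file size.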

    You are saying reading through a byte[] takes too long, though?

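    [code=C#]
    // Replace every occurrence of ByteVal in place; needs "using System;" for Array.
    int LastLocation = Array.IndexOf<byte>(myByteArray, ByteVal, 0);
    while (LastLocation != -1)
    {
        myByteArray[LastLocation] = NewByteValue;
        // Resume after the byte just replaced so the search always advances.
        LastLocation = Array.IndexOf<byte>(myByteArray, ByteVal, LastLocation + 1);
    }
    [/code]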
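
    For the Regex.Replace route mentioned in the question, here is a rough sketch, assuming the file has already been decoded into a string (10 MB should fit in memory). The class name SeparatorRegex is made up for illustration. The pattern names the characters by code point: SUB is 0x1A, and GS, RS and US are 0x1D through 0x1F. Whether this fixes the original problem is a guess, since the post does not show the pattern that was tried.

```csharp
using System;
using System.Text.RegularExpressions;

static class SeparatorRegex
{
    // Character class covering SUB (0x1A) and the range GS..US (0x1D-0x1F).
    static readonly Regex Separators =
        new Regex(@"[\x1A\x1D-\x1F]", RegexOptions.Compiled);

    // Returns a copy of `text` with every separator replaced by `delimiter`.
    // Note: '$' in `delimiter` is treated as a substitution by Regex.Replace.
    public static string Replace(string text, string delimiter)
    {
        return Separators.Replace(text, delimiter);
    }
}
```

    If the characters still are not matched, it may be worth checking which encoding was used to read the file, since a wrong encoding can turn the separator bytes into different characters.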
    Last edited by Plater; Oct 1 '10, 09:13 PM.
