replacing a specific part of a specific line of text inside a file

  • blazedaces
    Contributor
    • May 2007
    • 284

    replacing a specific part of a specific line of text inside a file

    Alright guys, so the title explains exactly my goal. I'm going to be reading in a lot of data from an XML file. The file is too large, and there's too much data to store in ArrayLists without running out of memory, so I'm going to write to a file as I read.

    Here's the thing though: I can already do this and have it working, but I want to modify the program so you can choose which data you want to take out. To do this I would set up the text file to look something like this:

    Code:
    one:1,2,3,4,5,
    two:1,2,3,4,5,
    three:1,2,3,4,5,
    four:1,2,3,4,5,
    five:1,2,3,4,5,
    Where one, two, etc. are data names and the numbers are data separated by commas as you see.

    Here's my problem. I will not run into all the data at one time. Every second is another piece of information, so when I get to that element in the XML I want to open the file, locate the line holding the correct data, then replace that line, so it could look like this afterwards:

    Code:
    one:1,2,3,4,5,
    two:1,2,3,4,5,
    three:1,2,3,4,5,
    four:1,2,3,4,5,6,
    five:1,2,3,4,5,
    Now, I wrote something using RandomAccessFile in Java. Here's all my code (it's a test program):

    [code=java]
    import java.util.*;
    import java.io.*;

    public class testwriteMoreToFile {
        private RandomAccessFile raf;
        private static String fileToBe;

        public void writeMoreToFile(String tagName, String newData) {
            try {
                try {
                    raf = new RandomAccessFile(new File(fileToBe), "rw");
                } catch (FileNotFoundException e) {
                    raf = new RandomAccessFile(fileToBe, "rw");
                }

                StringBuffer contents = new StringBuffer();
                String line = null;
                long prevFilePointer = 0;

                while ((line = raf.readLine()) != null) {
                    if (line.substring(0, tagName.length()).equals(tagName)) {
                        contents.append(line).append(newData).append(",").append(System.getProperty("line.separator"));
                        raf.seek(prevFilePointer);
                        raf.writeChars(contents.toString());
                        break;
                    }
                    prevFilePointer = raf.getFilePointer();
                }
            } catch (IOException e2) {
                e2.printStackTrace();
            } finally {
                try {
                    raf.close();
                } catch (IOException e2) {
                    e2.printStackTrace();
                }
            }
        }

        public static void main(String args[]) {
            String[] tagNames = new String[]{ "one", "two", "three", "four", "five" };
            String line = new String("1,2,3,4,5,");
            String[] lines = new String[5];
            fileToBe = "Z:\\test.txt";
            PrintWriter out = null;
            for (int i = 0; i < lines.length; i++) {
                lines[i] = tagNames[i] + ":" + line;
            }
            try {
                out = new PrintWriter(fileToBe);
                for (int i = 0; i < lines.length; i++) {
                    out.println(lines[i]);
                }
            } catch (IOException e2) {
                e2.printStackTrace();
            } finally {
                out.close();
            }

            testwriteMoreToFile test = new testwriteMoreToFile();

            test.writeMoreToFile(tagNames[3], "6");
        }
    }
    [/code]

    I predicted this problem before I tried it, but thought maybe it could work, so I tried it anyway.

    The output looks like this in the text file:

    Code:
    one:1,2,3,4,5,
    two:1,2,3,4,5,
    three:1,2,3,4,5,
     f o u r : 1 , 2 , 3 , 4 , 5 , 6 ,
    After I add the line (and it doesn't even seem to be added correctly, why the spaces?), the bytes after it probably don't read correctly.
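A likely explanation for the spaces in the output above: `RandomAccessFile.writeChars` writes every `char` as two bytes, high byte first, so plain ASCII text gets a zero byte in front of each character, and many editors display that byte as a blank. `writeBytes` writes one byte per character instead. The sketch below only compares the number of bytes each call produces; the class and method names are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class WriteCharsDemo {
    // Writes text into an empty temporary file with the chosen method
    // and returns the resulting file length in bytes.
    static long writtenLength(String text, boolean useWriteChars) throws IOException {
        File tmp = File.createTempFile("raf-demo", ".txt");
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
            if (useWriteChars) {
                raf.writeChars(text); // two bytes per char, high byte first
            } else {
                raf.writeBytes(text); // one byte per char (low eight bits only)
            }
            return raf.length();
        } finally {
            tmp.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        String line = "four:1,2,3,4,5,6,";
        System.out.println("writeChars: " + writtenLength(line, true) + " bytes");
        System.out.println("writeBytes: " + writtenLength(line, false) + " bytes");
    }
}
```

Switching to `writeBytes` would remove the blanks, but it would not fix the second problem: a file cannot grow in the middle, so writing a longer line in place still overwrites the start of the following line.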

    Any ideas guys, just ideas, what do you think? Any better concepts?

    I've got this one that I read somewhere:

    Two files:

    Copy first line of file 1 to file 2
    repeat until you get to line you want to change
    change the line, put it in file 2
    continue till end copying to file 2

    The problem is that I'll need to do this a great many times because of all the data. Will it take a very long time? Even so, what, 300 copies of a file? I should delete the old one, right? I believe something was mentioned about a kill method... Can I delete the old one and then rename the new file to the old file's name?
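The two-file idea, including the final delete-and-rename step, could be sketched roughly as follows. This is only an illustration under assumptions: the names `LineAppender` and `appendToLine` are hypothetical, the file is assumed to be UTF-8 text, and `Files.move` with `REPLACE_EXISTING` stands in for the separate delete and rename.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class LineAppender {
    // Appends newData + "," to the line that starts with tagName + ":" by copying
    // the file line by line into a temporary file, then moving it over the original.
    static void appendToLine(Path file, String tagName, String newData) throws IOException {
        // Create the temporary file next to the original so the move stays on one disk.
        Path tmp = Files.createTempFile(file.toAbsolutePath().getParent(), "rewrite", ".tmp");
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith(tagName + ":")) {
                    line = line + newData + ","; // the one line that changes
                }
                out.write(line);
                out.newLine();
            }
        }
        // Replaces the old file with the new one; no separate delete is needed.
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Each call rewrites the whole file, so the cost grows with file size times the number of updates. Collecting several new values and applying them in one pass would cut that down.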

    Thanks for all the help guys...

    -blazed
  • Finomosec
    New Member
    • Jul 2007
    • 7

    #2
    First, there are many different XML parsers:
    - Those which load the complete file into memory (bad for big files)
    - Those which walk the XML file from top to bottom (Visitor pattern, I think),
    such as the SAX parser, if I'm not mistaken

    Maybe using another XML-parser solves your problem in the first place.
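For readers unfamiliar with the streaming style mentioned above, here is a minimal SAX sketch using the JDK's built-in parser. The class name, method name and XML element names are made up for illustration. It parses a small string, whereas a large input would go through the `parse` overload that takes a `File`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxSketch {
    // Returns "elementName:text" for every element with non-blank text,
    // produced while the parser streams through the input once.
    static List<String> readValues(String xml) throws Exception {
        List<String> values = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0); // start collecting text for the new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called several times per element
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    values.add(qName + ":" + value);
                }
                text.setLength(0);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return values;
    }
}
```

The handler only ever holds the text of the current element, which is why this style copes with inputs far larger than available memory. In a real program the `endElement` callback would write each value out instead of collecting it in a list.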

    Second, I would do something like this:
    Code:
    import java.io.*;
    import java.util.regex.*;

    public class FileTest {

        public static void main(String[] args) throws Exception {
            BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(new File("input.txt"))));
            try {
                BufferedWriter writer = new BufferedWriter(new FileWriter(new File("output.txt")));
                try {
                    // for performance compile once and use repeatedly (instead of line.matches("four.*"))
                    Pattern pattern = Pattern.compile("(four.*)"); // brackets only needed if using regex replace below ...
                    String line;
                    while ((line = reader.readLine()) != null) {
                        Matcher matcher = pattern.matcher(line);
                        if (matcher.matches()) {
                            line = line + "6,"; // your data
                            // line = matcher.replaceAll("$16,"); // or with regex-replace ... "$1" is the first "()" in the pattern above
                        }
                        writer.append(line);
                        writer.newLine();
                        // writer.flush(); // to see everything that is written immediately in the file [it's buffered otherwise]
                    }
                } finally {
                    // close the writer here (not the reader) so the buffered output is flushed to output.txt
                    writer.close();
                }
            } finally {
                reader.close();
            }
        }

    }
    Greetings Finomosec;


    • JosAH
      Recognized Expert MVP
      • Mar 2007
      • 11453

      #3
      I'd say use a (relational) database using one table with two columns: the primary
      key ("one", "two" "three" etc.) and a second column that simply contains the
      text ("1, 2, 3, 4, 5" etc.).

      It'll be easy to update/delete/insert data then and when you have to read data in
      you could do the processing on that second column. Using plain text files is an
      unmanageable burden (as you already sketched).

      kind regards,

      Jos


      • blazedaces
        Contributor
        • May 2007
        • 284

        #4
        Originally posted by JosAH
        I'd say use a (relational) database using one table with two columns: the primary
        key ("one", "two" "three" etc.) and a second column that simply contains the
        text ("1, 2, 3, 4, 5" etc.).

        It'll be easy to update/delete/insert data then and when you have to read data in
        you could do the processing on that second column. Using plain text files is an
        unmanageable burden (as you already sketched).

        kind regards,

        Jos
        Looks like I'll be looking into databases then (relational). Thank you...

        -blaze


        • blazedaces
          Contributor
          • May 2007
          • 284

          #5
          Originally posted by Finomosec
          First, there are many different XML-parsers.
          - Those which load the complete file into memory (bad for big files)
          - Those which walk the XML-file from top to bottom (Visitor-Pattern i think)
          maybe SAX-parser, if i'm not mistaken

          Greetings Finomosec;
          Sorry that I was unclear, dude. I have the XML reading part down; I'm using SAX. The thing is, I'm reading 200 MB+ files, and sometimes they can be gigabytes in size, too big to use something like DOM. Thanks though...

          As for your code suggestion, I think it might be easier to take Jos' suggestion and look up databases, then simply print the data to a file or import it into something like Excel (I have no idea if this is possible).

          Just wanted to respond to your post as well... Thanks for your input,

          -blazed


          • r035198x
            MVP
            • Sep 2006
            • 13225

            #6
            Speaking of databases, MySQL has improved a lot as of version 5.0. If you want something quick and free, you can consider using that one.


            • JosAH
              Recognized Expert MVP
              • Mar 2007
              • 11453

              #7
              Originally posted by r035198x
              Speaking of databases MySQL 5.0 + has improved the MySQL database a lot. If you want something quick and free you can consider using that one.
              Personally I use Caché; it's an object oriented database that got rid of that
              nasty OR mapping (Object - Relational). Relational database tables are not
              well suited for storing objects. Persisting objects using this database is a
              breeze; it's just like serializing Java POJOs but you can search, manipulate
              them just like they are in memory all the time. No need for all those clumsy
              DAO, DTO etc. patterns anymore. (Caché is free, google for it).

              kind regards,

              Jos

              ps. I'm just a happy user, no commercial interest in Caché at all.


              • r035198x
                MVP
                • Sep 2006
                • 13225

                #8
                Originally posted by JosAH
                Personally I use Caché; it's an object oriented database that got rid of that
                nasty OR mapping (Object - Relational). Relational database tables are not
                well suited for storing objects. Persisting objects using this database is a
                breeze; it's just like serializing Java POJOs but you can search, manipulate
                them just like they are in memory all the time. No need for all those clumsy
                DAO, DTO etc. patterns anymore. (Caché is free, google for it).

                kind regards,

                Jos

                ps. I'm just a happy user, no commercial interest in Caché at all.
                My knowledge of object oriented databases is rather limited (still in the learning phase for them, I'd say).
                Caché looks good. (First link I opened was a clothes shop).
                Have you worked with JavaDB before?


                • JosAH
                  Recognized Expert MVP
                  • Mar 2007
                  • 11453

                  #9
                  Originally posted by r035198x
                  My knowledge of object oriented databases is rather limited (still in the learning phase for them, I'd say).
                  Caché looks good. (First link I opened was a clothes shop).
                  Have you worked with JavaDB before?
                  I think I've played with it a bit once, but I'm not sure about it. Caché is made by
                  Intersystems. Those folks made "MUMPS" years ago. For reasons that are
                  beyond me MUMPS is still extremely popular in the medical equipment industry.
                  You hardly find any, say, Oracle or DB2 or whatever over there.

                  kind regards,

                  Jos


                  • r035198x
                    MVP
                    • Sep 2006
                    • 13225

                    #10
                    Originally posted by JosAH
                    I think I've played with it a bit once, but I'm not sure about it. Caché is made by
                    Intersystems. Those folks made "MUMPS" years ago. For reasons that are
                    beyond me MUMPS is still extremely popular in the medical equipment industry.
                    You hardly find any, say, Oracle or DB2 or whatever over there.

                    kind regards,

                    Jos
                    They are calling it the world's fastest database. I've requested a free CD (I do not want to download 242 MB of it on our network today).
                    I hope they'll send the CD.


                    • JosAH
                      Recognized Expert MVP
                      • Mar 2007
                      • 11453

                      #11
                      Originally posted by r035198x
                      They are calling it the world's fastest database. I've requested a free CD (I do not want to download 242 MB of it on our network today).
                      I hope they'll send the CD.
                      I didn't even request a free CD but they still sent me one: within a few weeks,
                      packed with a lot of propaganda paperwork ;-) I like that database, I still haven't
                      used it for any commercial application yet, just playtime. I like their 'jalapeno'
                      technique (JAva LAnguage PErsistency NO mapping); lousy acronym though.

                      kind regards,

                      Jos

