Script to compare files recursively using sdiff

  • optimusprime
    New Member
    • Apr 2013
    • 4

    Script to compare files recursively using sdiff

    Hi All,

    I have been searching for ideas on how to compare the same files under two different paths.

    One path holds an oldfiles directory and the other holds a newfiles directory. Each main directory contains sub-directories, and each sub-directory in turn contains *.txt files (plain text files with several lines each).


    Note: one thing that simplifies this:

    All the sub-directories in the oldfiles and newfiles main directories have the same names.

    All the text file names (*.txt) in the sub-directories of both main folders are the same as well.

    Now,

    a. The script has to accept two paths, i.e. the oldfiles and newfiles directory paths, as arguments.

    b. Read the first sub-folder name from oldfiles and look for the same sub-folder name under the newfiles path. If the same sub-folder is found, then

    c. check whether text files are present; if so, check whether the file names match; if they do, then

    d. sort the files first, then run sdiff on the two files and store the result in a separate file.

    So, to give an example of how the folder structure will look:

    Main folders:

    oldfiles path : /tmp/oldfiles/

    newfiles path : /tmp/newfiles/


    Each main folder will have sub-folders:

    oldfiles :

        subdirA
            subA.txt

        subdirB
            subB.txt

        subdirC
            subC.txt

    newfiles :

        subdirA
            subA.txt

        subdirB
            subB.txt



    Each sub-directory will have *.txt files with the same file names in both trees.


    From the above example, the script should generate sdiff results in the output folder as:

    subdirA_subA_result.txt
    subdirB_subB_result.txt

    I hope I have explained clearly what I intend to achieve.

    The script below, which I wrote, doesn't check for matching sub-folders/files and doesn't generate separate result files; instead it reads all the *.txt files and produces one single result file.

    Code:
    #!/bin/bash 
      
    # cmp_dir - program to compare two directories 
      
    # Check for required arguments 
    if [ $# -ne 2 ]; then 
        echo "usage: $0 directory_1 directory_2" 1>&2 
        exit 1 
    fi 
      
    # Make sure both arguments are directories 
    if [ ! -d $1 ]; then 
        echo "$1 is not a directory!" 1>&2 
        exit 1 
    fi 
      
    if [ ! -d $2 ]; then 
        echo "$2 is not a directory!" 1>&2 
        exit 1 
    fi 
      
    # Process each file in directory_1, comparing it to directory_2 
    find $1/ -name '*.txt' -print | while read src 
    do 
    #for filename in $1/*.txt; do 
    #echo $filename 
        fn=$(basename "$filename") 
        if [ -f "$filename" ]; then 
            #if [ ! -f "$2/$fn" ]; then 
                #echo "$fn is missing from $2" 
                #missing=$((missing + 1)) 
            #fi 
                    sort $filename 
                    #echo $filename 
                    sort $2/$fn 
                    #echo $2/$fn 
                    sdiff $filename $2/$fn | egrep '>|<|\|' > resultfile.txt 
        fi 
    #done 
    done 
    echo "File comparison done, please see resultfile"
  • Luuk
    Recognized Expert Top Contributor
    • Mar 2012
    • 1043

    #2
    It will only create one result because you did not add the path in line 37 like this:
    Code:
    sdiff $filename $2/$fn | egrep '>|<|\|' > $1/resultfile.txt
    NOTE: With this change the result file lives under $1, so on a second run the find in your script WILL pick up the resultfile.txt left by the previous run.
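    One way to avoid re-reading an earlier result, as a rough sketch: exclude result files from the search. The function name list_inputs and the '*result*.txt' pattern are assumptions for illustration, chosen to match the file names used in this thread; writing results to a directory outside both trees would work just as well.

```shell
#!/bin/bash
# Sketch: list the *.txt files under a tree while skipping earlier result files.
# list_inputs and the '*result*.txt' pattern are hypothetical, not part of the
# original script.
list_inputs() {
    # -type f keeps regular files only; "! -name" drops anything that looks
    # like a result file left behind by a previous run.
    find "$1" -type f -name '*.txt' ! -name '*result*.txt' -print
}
```

    It could then feed the loop as: list_inputs "$1" | while IFS= read -r src; do ...; done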

    NOTE2: The check in line 28
    Code:
    if [ -f "$filename" ]; then
    can be skipped if you change line 23 to:
    Code:
    find $1/ -type f -name '*.txt' -print | while read src
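    Two further details in the posted script may matter here: the loop reads each path into src while the body uses $filename, and sort only prints to standard output, so the files handed to sdiff are still unsorted. Below is a minimal sketch of the requested behaviour (one result file per matching pair). The function name compare_trees, the separate output directory passed as a third argument, and the "_result.txt" suffix are assumptions taken from the naming in the question, not a definitive implementation.

```shell
#!/bin/bash
# Sketch only: compare_trees, the third "output directory" argument and the
# "_result.txt" suffix are assumptions, not part of the original script.
compare_trees() {
    local old=${1%/} new=${2%/} out=$3
    local src rel other name a b
    mkdir -p "$out" || return 1
    # Walk every *.txt file under the old tree; IFS= and -r keep paths intact.
    find "$old" -type f -name '*.txt' -print | while IFS= read -r src; do
        rel=${src#"$old"/}                 # e.g. subdirA/subA.txt
        other=$new/$rel
        # Skip files that have no counterpart in the new tree.
        [ -f "$other" ] || continue
        # subdirA/subA.txt -> subdirA_subA
        name=$(printf '%s' "${rel%.txt}" | tr '/' '_')
        # Sort both files into temporary copies, then keep only the lines
        # that sdiff marks as differing (<, > or |), as the original does.
        a=$(mktemp); b=$(mktemp)
        sort "$src" > "$a"
        sort "$other" > "$b"
        sdiff "$a" "$b" | grep -E '>|<|\|' > "$out/${name}_result.txt"
        rm -f "$a" "$b"
    done
}
```

    Example call: compare_trees /tmp/oldfiles /tmp/newfiles /tmp/results. Keeping the results in a third directory also means a later run does not find them among the *.txt inputs. As in the original, the grep filter also matches lines whose own text contains <, > or |.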
