Need file3 with ONLY differences between file1 and file2
- laurentius77
- Posts: 82
- Joined: Wed 30 Mar 2011, 07:02
I have a problem that is simple for many, but for me it has been unsolvable until now:
file1, with only a single line of content:
01 02 03 04 05 06
file2, also with only a single line of content:
01 02 03 04 05 06 07 08 09 10
I want to get file3, also a single line, containing only the differences found between file1 and file2, like:
file3 with ONLY the difference content
07 08 09 10
I tried to find the answer to this seemingly simple problem by googling it, but I only found solutions for other, more complex situations, not for this one.
Thank you
- MochiMoppel
- Posts: 2084
- Joined: Wed 26 Jan 2011, 09:06
- Location: Japan
Many ways. As long as your real file contents are as simple as in your example, you can try the script below.
Change the path as needed.
Code: Select all
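#!/bin/bash
# $(< file) and ${var/pattern} are bash features, hence the bash shebang
CONTENT_FILE1=$(< /root/tmp/file1)
CONTENT_FILE2=$(< /root/tmp/file2)
# Remove the first occurrence of file1's content from file2's content
echo -n ${CONTENT_FILE2/$CONTENT_FILE1} > /root/tmp/file3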
- laurentius77
- Posts: 82
- Joined: Wed 30 Mar 2011, 07:02
Empty result... I will try more, maybe I misspelled something.
- laurentius77
- Posts: 82
- Joined: Wed 30 Mar 2011, 07:02
I need a method for real sorting from lower to higher values
The script is working but I need more: the results are not what I expected, because my files should first be sorted like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14
and they are like
1 10 11 12 13 14 2 3 4 5 6 7 8 9
How can I sort them in ascending order, with the lowest number first and the highest number last?
sort -n doesn't work for my single-line files.
Or I need something that tries to identify which numbers from file2 are not found in file1.
Any idea?
Thanks
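For what it's worth, sort -n most likely "doesn't work" here because sort compares whole lines, and a single-line file has only one line to sort. A minimal sketch of one way around it, assuming the values are separated by spaces or commas: put each value on its own line, sort, then join the lines again. The function name sort_line is made up for this illustration.

```shell
#!/bin/sh
# Sketch: numerically sort values that all sit on ONE line.
# sort_line is a hypothetical helper name, not something from the thread.
sort_line() {
    # $1 = the single line; values separated by spaces and/or commas (assumption)
    printf '%s\n' "$1" |
        tr -s ', ' '\n\n' |   # one value per line, repeated separators squeezed
        grep -v '^$' |        # drop an empty line left by a leading separator
        sort -n |             # numeric sort, lowest value first
        paste -sd' ' -        # join the lines back into a single line
}

sort_line "1 10 11 12 13 14 2 3 4 5 6 7 8 9"
```

With the line from the post above this should print the values in the order 1 2 3 ... 14, separated by spaces.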
- MochiMoppel
- Posts: 2084
- Joined: Wed 26 Jan 2011, 09:06
- Location: Japan
Re: I need a method for real sorting from lower to higher values
laurentius77 wrote: The script is working but I need more
Then you should have asked for more in the first place, giving a realistic example of your input files and the expected result. Now the topic changed from file differences to file sorting? Sorting what? The input or the output? Clearly I don't know anymore what you are asking for.
- laurentius77
- Posts: 82
- Joined: Wed 30 Mar 2011, 07:02
Real example
I'm so sorry that I was confusing...
Here is a real example
file1
0,1,10,11,12,13,14,15,16,17,18,19,2,20,21,22,23,24,25,26,27,28,29,3,30,31,32,33,34,35,36,37,38,39,4,40,41,42,43,44,5,6,7,8,9
file2
0,1,10,11,12,13,14,15,16,17,18,19,2,20,21,22,23,24,25,26,27,28,29,3,30,31,32,33,34,35,36,37,38,39,4,40,41,42,43,5,6,7,8,9
I expect
file3
44
Just a file with what is different between those two files
Sorry again for my mistake.
- MochiMoppel
- Posts: 2084
- Joined: Wed 26 Jan 2011, 09:06
- Location: Japan
Try the script below.
This assumes that file2 has more values than file1 (as in your first example). I don't know whether your 2nd example switched this rule(?) intentionally or by mistake. If there is no rule, you will first have to check which of the 2 files is the bigger one, then adapt the last line.
Results in file3 will be space delimited.
Code: Select all
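#!/bin/bash
# Treat commas as field separators in addition to the default whitespace
IFS=${IFS},
CONTENT_FILE1=$(< /root/tmp/file1)
# Split into one value per line and sort numerically
CONTENT_FILE1=$(printf '%s\n' $CONTENT_FILE1 | sort -n)
CONTENT_FILE2=$(< /root/tmp/file2)
CONTENT_FILE2=$(printf '%s\n' $CONTENT_FILE2 | sort -n)
# Strip file1's sorted values from the start of file2's sorted values
echo -n ${CONTENT_FILE2#$CONTENT_FILE1} > /root/tmp/file3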
- laurentius77
- Posts: 82
- Joined: Wed 30 Mar 2011, 07:02
Thank you
Thank you sir, and I'm sorry for my lack of attention.
Yes, indeed, one file has more values than the other. In the second example (in reality) file2 has fewer values than file1. But I will adapt the script to make it functional.
Thanks a lot!
Wow, I remember I last used sort -u this month 21 (or 20) years ago. I had a programming contract which started on Columbus Day (also a Monday, so I should be able to figure out whether it was 20 or 21 years ago) and I did not go to work since it was a federal holiday. MAN, were they mad that I did not go to work: I.T. did work on Columbus Day, even at the investment bank.
Looks like SORT is smarter than it once was, can't find setting to treat commas as linefeed.
@OP
I hope I'm not just muddying the waters, but following up on MochiMoppel's last post above, just what exactly are you assuming about the two files? For example,
a) is it possible for EACH file to have values that the OTHER file does not, as in:
file1:
1, 3, 4, 6, 13, 21
file2:
1, 2, 4, 6, 17
where file1 has 3, 13, and 21 but file2 does not, and conversely file2 has 2 and 17 but file1 does not.
And if this is possible, do you care which file the differences come from? If the result is
file3:
2, 3, 13, 17, 21
does it matter that there is no way to tell (from file3) which value came from which input file?
b) is it safe to assume that the values in each input file are numbers, and are sorted in some order (either numeric or lexicographic)?
I ask these things not to nitpick, but because the more precisely you state your requirements, the better the solution "we" (i.e. MochiMoppel) can provide.
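To illustrate case a), here is a rough sketch that keeps the two directions apart, using comm on the example values above. It assumes comma-separated values on a single line and no repeated values inside one file. The helper name one_per_line, the temporary directory and mktemp -d (common, but not guaranteed by POSIX) are choices made for this sketch only.

```shell
#!/bin/sh
# Sketch: list the values found in only one of two comma-separated files,
# and keep track of which file they came from.
one_per_line() {
    # Hypothetical helper: split on commas/spaces, drop empty lines,
    # sort lexically because comm expects its input in sort's default order.
    tr -s ', ' '\n\n' < "$1" | grep -v '^$' | sort -u
}

dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT

# Test data taken from the example above
printf '1, 3, 4, 6, 13, 21\n' > "$dir/file1"
printf '1, 2, 4, 6, 17\n'     > "$dir/file2"

one_per_line "$dir/file1" > "$dir/a"
one_per_line "$dir/file2" > "$dir/b"

# comm -23 keeps lines only in the first file, comm -13 only in the second.
only_in_1=$(comm -23 "$dir/a" "$dir/b" | sort -n | paste -sd' ' -)
only_in_2=$(comm -13 "$dir/a" "$dir/b" | sort -n | paste -sd' ' -)

echo "only in file1: $only_in_1"
echo "only in file2: $only_in_2"
```

For the example values this should report 3 13 21 for file1 and 2 17 for file2. Whether that matches what the OP really needs still depends on the answers to the questions above.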
Last edited by 6502coder on Mon 12 Oct 2015, 20:43, edited 1 time in total.
- MochiMoppel
- Posts: 2084
- Joined: Wed 26 Jan 2011, 09:06
- Location: Japan
Ted Dog wrote: ... on right tract.
cat file1 file2 >file3;
sort -u file3
Maybe right track, but wrong train: sort -u does not return only the unique values (which would indeed be the solution). Instead it only eliminates duplicates. In laurentius77's example you would end up with a file containing exactly the same data as the bigger of the two input files.
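A small sketch of that distinction, with made-up sample values and assuming no value is repeated inside a single file: sort -u keeps one copy of every value, while piping sorted output through uniq -u keeps only the values that occur exactly once across both inputs.

```shell
#!/bin/sh
# Sketch: `sort -u` (drop duplicates) versus `sort | uniq -u`
# (keep only values that occur exactly once).
# The sample values are invented for this illustration.
file1_values="0,1,2,3,44"
file2_values="0,1,2,3"

# Both inputs together, one value per line
all_values=$(printf '%s\n%s\n' "$file1_values" "$file2_values" | tr ',' '\n')

deduplicated=$(printf '%s\n' "$all_values" | sort -n -u | paste -sd' ' -)
seen_once=$(printf '%s\n' "$all_values" | sort -n | uniq -u | paste -sd' ' -)

echo "sort -u:        $deduplicated"
echo "sort | uniq -u: $seen_once"
```

Here the first line should show all of 0 1 2 3 44 (the same data as the bigger input), and the second only 44.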
- MochiMoppel
- Posts: 2084
- Joined: Wed 26 Jan 2011, 09:06
- Location: Japan
Ted Dog wrote: Lol did I say it was 20 or more years ago..
Yes you did. At this age it's getting hard for old dogs to remember old tricks.
Still not being sure if I'm on the right train, here is a solution that is more robust. It determines the file with the highest value, then returns all values from that file that are greater than the greatest value of the other file. Does this make sense? In the example below it returns '31 40' (from file2). It will return an empty string if both files contain the same maximum value.
Code: Select all
#!/bin/bash
# Create some test files
echo -n "11,30,5,28" > /tmp/file1
echo -n "24,7,40,9,31" > /tmp/file2
IFS=${IFS},
CONTENT_FILE1=$(< /tmp/file1)
CONTENT_FILE1=$(printf '%s\n' $CONTENT_FILE1 | sort -n)
HIGHEST_VALU1=${CONTENT_FILE1##*$'\n'}
CONTENT_FILE2=$(< /tmp/file2)
CONTENT_FILE2=$(printf '%s\n' $CONTENT_FILE2 | sort -n)
HIGHEST_VALU2=${CONTENT_FILE2##*$'\n'}
if ((HIGHEST_VALU2 > HIGHEST_VALU1)); then
LARGER=$CONTENT_FILE2
SMALLER=$CONTENT_FILE1
MIN_VAL=$HIGHEST_VALU1
else
LARGER=$CONTENT_FILE1
SMALLER=$CONTENT_FILE2
MIN_VAL=$HIGHEST_VALU2
fi
for VAL in $LARGER ;do
(($VAL > $MIN_VAL)) && RESULT="${RESULT}${VAL} "
done
RESULT=${RESULT% } #Remove trailing space
echo -n $RESULT > /tmp/file3
# Show test result
gxmessage -file /tmp/file3
There used to be a great little app called xfdiff-cut for doing just this.
Drag file1 to the left pane, the second file to the right and click apply. The differences are shown in the bottom panel.
It was present in all earlier pups but seems to have disappeared.
I have a version here compiled in slacko64 if it's of any help.
- MochiMoppel
- Posts: 2084
- Joined: Wed 26 Jan 2011, 09:06
- Location: Japan
smokey01 wrote: It was present in all earlier pups but seems to have disappeared.
Maybe for a good reason
- Attachments
- xdiff-cut.png (71.51 KiB)
- Argolance
- Posts: 3767
- Joined: Sun 06 Jan 2008, 22:57
- Location: PORT-BRILLET (Mayenne - France)
- Contact:
Hello,
This works fine in the console but not inside my script!
Regards.
Code: Select all
$ cat file1
zzzz
eeee
rrrr
yyyy
uuuu
iiii
$ cat file2
yyyy
aaaa
zzzz
rrrr
eeee
uuuu
iiii
$ comm -3 <(sort file1) <(sort file2)
aaaa
- L18L
- Posts: 3479
- Joined: Sat 19 Jun 2010, 18:56
- Location: www.eussenheim.de/
comm
Argolance wrote: This works fine in console but not inside my script!
Code: Select all
$ comm -3 <(sort file1) <(sort file2)
aaaa
This:
Code: Select all
#
# LANGUAGE=en comm -3 $(sort file1) $(sort file2)
comm: extra operand ‘rrrr’
Try 'comm --help' for more information.
#
# sort file1 > file1s
# sort file2 > file2s
# LANGUAGE=en comm -3 file1s file2s
aaaa
#
thanks for pointing to this nice tool: comm
- MochiMoppel
- Posts: 2084
- Joined: Wed 26 Jan 2011, 09:06
- Location: Japan
Argolance wrote: This works fine in console but not inside my script!
It doesn't work when your script starts with the shebang #!/bin/sh. It works when you change it to #!/bin/bash.
From the bash manual:
- "When invoked as sh, Bash enters POSIX mode after reading the startup files.
The following list is what’s changed when ‘POSIX mode’ is in effect:
<snip>
28. Process substitution is not available."
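If the script has to keep the #!/bin/sh shebang, one possible workaround is to sort into temporary files first, much like L18L did by hand above. This is only a sketch: the temporary directory, the file names and the use of mktemp -d (widely available, but not required by POSIX) are assumptions for the demo, and the sample data is Argolance's.

```shell
#!/bin/sh
# Sketch: comm on two unsorted files WITHOUT process substitution,
# so it also runs when the shell is invoked as sh.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT

# Sample data from Argolance's post
printf '%s\n' zzzz eeee rrrr yyyy uuuu iiii      > "$tmp/file1"
printf '%s\n' yyyy aaaa zzzz rrrr eeee uuuu iiii > "$tmp/file2"

# comm needs sorted input; plain temp files replace <(sort ...)
sort "$tmp/file1" > "$tmp/file1.sorted"
sort "$tmp/file2" > "$tmp/file2.sorted"

# -3 hides lines common to both files; tr removes the tab that comm
# prints in front of lines found only in the second file.
difference=$(comm -3 "$tmp/file1.sorted" "$tmp/file2.sorted" | tr -d '\t')
echo "$difference"
```

With the sample data this should print aaaa, the same result as the console command.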
MochiMoppel wrote:
smokey01 wrote: It was present in all earlier pups but seems to have disappeared.
Maybe for a good reason
I agree.
I've been doing a bit of searching and diffuse looks pretty good.
http://diffuse.sourceforge.net/
It seems to work on all file types.