file3
a
c
e
file4
a 1
b 2
c 3
d 4
e 5
The one-liner for comparing the first field of file4 with the lines of file3 (each of which is a single field here) is:
awk 'FNR==NR{a[$0];next}($1 in a)' file3 file4
The output is:
a 1
c 3
e 5
If you want to remove the matching lines instead, change the above command by adding a ! before the condition:
awk 'FNR==NR{a[$0];next}!($1 in a)' file3 file4
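To try both commands, here is a minimal sketch that recreates the two sample files and runs the matching and non-matching variants. The scratch directory created with mktemp and the variable names matched and unmatched are my own assumptions, not part of the original commands.

```shell
# Recreate the sample files in a scratch directory (location is an assumption).
dir=$(mktemp -d)
printf '%s\n' a c e > "$dir/file3"
printf '%s\n' 'a 1' 'b 2' 'c 3' 'd 4' 'e 5' > "$dir/file4"

# Lines of file4 whose first field appears as a line of file3.
matched=$(awk 'FNR==NR{a[$0];next}($1 in a)' "$dir/file3" "$dir/file4")

# Lines of file4 whose first field does not appear in file3.
unmatched=$(awk 'FNR==NR{a[$0];next}!($1 in a)' "$dir/file3" "$dir/file4")

printf 'matched:\n%s\n' "$matched"
printf 'unmatched:\n%s\n' "$unmatched"

# Remove the scratch directory again.
rm -rf "$dir"
```

With the sample data, the negated command should leave only the lines "b 2" and "d 4".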
Can you please explain how it works?
awk 'FNR==NR{a[$0];next}($1 in a)' file3 file4
FNR -> the line number within the current file.
NR -> the line number counted across all the files read so far.
So the first thing is:
FNR==NR -> this condition is true until all the lines of the first file have been processed. As soon as all the lines of file3 are done, FNR is set back to 1 for file4 while NR continues with its numbering.
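The behaviour of the two counters can be watched directly. The sketch below is only an illustration: it prints FNR, NR and a label for whether FNR==NR holds on each line of both files. The scratch directory, the counters variable and the two labels are my own assumptions.

```shell
# Recreate the sample files in a scratch directory (location is an assumption).
dir=$(mktemp -d)
printf '%s\n' a c e > "$dir/file3"
printf '%s\n' 'a 1' 'b 2' 'c 3' 'd 4' 'e 5' > "$dir/file4"

# For every input line print FNR, NR and whether FNR==NR is true.
counters=$(awk '{ print FNR, NR, (FNR==NR ? "first file" : "later file") }' \
    "$dir/file3" "$dir/file4")
printf '%s\n' "$counters"

# Remove the scratch directory again.
rm -rf "$dir"
```

For the sample data this prints:

1 1 first file
2 2 first file
3 3 first file
1 4 later file
2 5 later file
3 6 later file
4 7 later file
5 8 later file

The two counters agree only on the three lines of file3. From the first line of file4 onward FNR restarts at 1 while NR keeps counting.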
While FNR==NR holds, the array a keeps growing: simply referencing a[$0] creates a key for $0 (which is the complete line of file3 here). So at the end of file3 the array has every line of file3 as a key.
next is like continue in the C language: it tells awk to skip the remaining rules and start processing the next line.
The rest of the code, ($1 in a), is therefore applied only after all the lines of file3 have been processed (that is, from the first line of file4 onward). $1 represents the first field of the current line of file4.
($1 in a) checks whether $1 exists as a key in the array a. If it does, the line is printed, because a pattern without an action prints the line by default.
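Putting the explanation together, the one-liner can be spelled out as a commented two-rule program. This is a sketch that should behave the same way as the original command. The array name seen, the explicit { print } action, the result variable and the scratch directory are my own choices for readability.

```shell
# Recreate the sample files in a scratch directory (location is an assumption).
dir=$(mktemp -d)
printf '%s\n' a c e > "$dir/file3"
printf '%s\n' 'a 1' 'b 2' 'c 3' 'd 4' 'e 5' > "$dir/file4"

result=$(awk '
    # Rule 1: true only while reading the first file.
    # Store the whole line as a key, then skip rule 2 for this line.
    FNR == NR { seen[$0]; next }

    # Rule 2: reached only for lines of the second file.
    # Print the line if its first field was stored as a key.
    ($1 in seen) { print }
' "$dir/file3" "$dir/file4")
printf '%s\n' "$result"

# Remove the scratch directory again.
rm -rf "$dir"
```

Writing { print } explicitly only makes the default action visible. Leaving it out, as the one-liner does, gives the same result.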