I have multiple files from which I need to create a matrix of matching values.
File_1, the primary file, contains all the numbers, tab-delimited, in one row:
Sample_1 23 45 46 67 78 47 98 73 87 45 97 21
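There are several other files. For each number in File_1, I want to add 1 if that number also appears in the other file, or 0 if it does not, and append the result as a new row under File_1: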
File_2
Sample_2 23 67 47 235 87 102 97
File_3
Sample_3 67 51 78 98 52 12 21 124
Output
Sample_1 23 45 46 67 78 47 98 73 87 45 97 21
Sample_2 1 0 0 1 0 1 0 0 1 0 1 0
Sample_3 0 0 0 1 1 0 1 0 0 0 0 1
Answer
awk to the rescue!
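$ awk 'function pr() {                      # print 1/0 for each number of file1
         for(i=2;i<=n;i++) printf "%s%d", OFS, (h[i] in p)+0
         print ""
       }
       NR==1  {n=split($0,h); print; next}  # file1: store its numbers, print the row as is
       FNR==1 {if(f) pr(); delete p; printf "%s",$0}  # new file: flush previous row, print sample name
       {f=1; p[$1]}                         # record the first field of every line
       END {pr()}' file1 file2 file3 | column -t
Sample_1  23  45  46  67  78  47  98  73  87  45  97  21
Sample_2  1   0   0   1   0   1   0   0   1   0   1   0
Sample_3  0   0   0   1   1   0   1   0   0   0   0   1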
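Note that the command above stores only $1 of each line, so it appears to expect every file after the first to hold the sample name on its first line followed by one number per line. If the other files are instead single tab-delimited rows, as they are shown in the question, a variant along the following lines might work. This is only a sketch under that assumption: the temporary directory and the printf'd sample data are there purely to make the example self-contained, and the file names file1, file2, file3 are taken from the command above.

```shell
# Recreate the sample inputs as single tab-delimited rows (assumed layout).
dir=$(mktemp -d)
printf 'Sample_1\t23\t45\t46\t67\t78\t47\t98\t73\t87\t45\t97\t21\n' > "$dir/file1"
printf 'Sample_2\t23\t67\t47\t235\t87\t102\t97\n' > "$dir/file2"
printf 'Sample_3\t67\t51\t78\t98\t52\t12\t21\t124\n' > "$dir/file3"

out=$(awk -v OFS='\t' '
  # First file: remember its values in order and print the row unchanged.
  NR == 1 { n = split($0, h); print; next }
  # Every other file: one row, sample name in $1, numbers in $2..$NF.
  {
    split("", p)                        # portable way to empty the array
    for (i = 2; i <= NF; i++) p[$i]     # mark each number as present
    printf "%s", $1
    for (i = 2; i <= n; i++) printf "%s%d", OFS, (h[i] in p)
    print ""
  }
' "$dir/file1" "$dir/file2" "$dir/file3")

printf '%s\n' "$out"
rm -r "$dir"
```

The output here is tab-delimited, like the primary file. Piping it through column -t, as in the command above, would align the columns for reading.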