Building matrix with values from multiple files

I have multiple files from which I need to build a matrix of matching values.

File_1 is the primary file. It contains all the numbers in a single tab-delimited row:

  Sample_1   23   45   46   67   78   47   98   73   87   45   97   21

There are several other files. For each one, I want to add a row under the line above, with 1 where a number from File_1 appears in that file and 0 where it does not.

File_2

Sample_2
23
67
47
235
87
102
97

File_3

Sample_3
67
51
78
98
52
12
21
124

Output

Sample_1   23   45   46   67   78   47   98   73   87   45   97   21
Sample_2   1    0    0    1    0    1    0    0    1    0    1    0
Sample_3   0    0    0    1    1    0    1    0    0    0    0    1

Answer

awk to the rescue!

$ awk 'function pr() {for(i=2;i<=n;i++) printf "%s%d", OFS,(h[i] in p)+0; print ""}
       NR==1  {n=split($0,h); print; next}
       FNR==1 {if(f) pr(); delete p; printf "%s",$0} {f=1;p[$1]} 
       END    {pr()}' file1 file2 file3 | column -t

Sample_1  23  45  46  67  78  47  98  73  87  45  97  21
Sample_2  1   0   0   1   0   1   0   0   1   0   1   0
Sample_3  0   0   0   1   1   0   1   0   0   0   0   1
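
If you want to try this end to end, here is a commented sketch of the same idea. It is not a drop-in replacement for the one-liner above, and a few things in it are my own assumptions: the temporary directory, the file names file1/file2/file3, and the variable names seen and out. It also differs in two small ways: it sets OFS to a tab so the rows come out tab-delimited without needing column -t, and it clears the array with split("", p), which should also work in awks that do not accept delete on a whole array.

```shell
# Recreate the sample inputs from the question in a scratch directory
# (directory and file names are assumptions for illustration).
tmp=$(mktemp -d)
printf 'Sample_1\t23\t45\t46\t67\t78\t47\t98\t73\t87\t45\t97\t21\n' > "$tmp/file1"
printf '%s\n' Sample_2 23 67 47 235 87 102 97 > "$tmp/file2"
printf '%s\n' Sample_3 67 51 78 98 52 12 21 124 > "$tmp/file3"

out=$(awk -v OFS='\t' '
    # Print one 0/1 flag per header value, then end the row.
    function pr(   i) {
        for (i = 2; i <= n; i++)
            printf "%s%d", OFS, (h[i] in p) + 0
        print ""
    }
    # First line of the first file: store the header values in h and echo the line.
    NR == 1  { n = split($0, h); print; next }
    # First line of each later file: flush the previous row, reset p, print the label.
    FNR == 1 { if (seen) pr(); split("", p); seen = 1; printf "%s", $1; next }
    # Any other line: remember that this number occurs in the current file.
    { p[$1] }
    # Flush the row of the last file.
    END { if (seen) pr() }
' "$tmp/file1" "$tmp/file2" "$tmp/file3")

printf '%s\n' "$out"
rm -r "$tmp"
```

The order of the files matters: the primary file has to come first, because its first line is what defines the columns. Numbers that appear in a sample file but not in the header (235, 102 and so on) are simply ignored.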
User contributions licensed under: CC BY-SA