I have working R code that I want to rewrite in Python. In R, I use the apply function to calculate the minor allele frequency. What would the equivalent code look like in Python? I am using pandas to read the data.
##R-code
###Reading file
var_freq <- read_delim("./cichlid_subset.frq", delim = "\t",
                       col_names = c("chr", "pos", "nalleles", "nchr", "a1", "a2"), skip = 1)
# find minor allele frequency
var_freq$maf <- var_freq %>% select(a1, a2) %>% apply(1, function(z) min(z))
I have read the file using pandas but I am struggling with the second part.
###Python code
###Reading file
import pandas as pd

var_freq = pd.read_csv("./cichlid_subset.frq", sep='\t', header=None)
column_indices = [0, 1, 2, 3, 4, 5]
new_names = ["chr", "pos", "nalleles", "nchr", "a1", "a2"]
old_names = var_freq.columns[column_indices]
var_freq = var_freq.rename(columns=dict(zip(old_names, new_names)))
###Finding minor allele frequency
Insights will be appreciated.
Answer
Use:
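import pandas as pd

# Read file: skip the original header row and assign the column names directly
colnames = ["chr", "pos", "nalleles", "nchr", "a1", "a2"]
var_freq = pd.read_csv('./cichlid_subset.frq', sep='\t', header=None, skiprows=1, names=colnames)

# Get MAF: row-wise minimum of the two allele frequency columns
var_freq['maf'] = var_freq[['a1', 'a2']].min(axis=1)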
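To check the approach without the real file, here is a minimal sketch that runs on a small in-memory table. The sample rows and the header text are made up for illustration and only assume the same six tab-separated columns as in the question. It also shows a row-wise `apply`, which is the closest literal translation of the R `apply(1, ...)` call, although `min(axis=1)` is usually the simpler choice in pandas.

```python
import io

import pandas as pd

# Hypothetical stand-in for cichlid_subset.frq; the values are invented.
sample = (
    "CHROM\tPOS\tN_ALLELES\tN_CHR\t{FREQ}\n"
    "chr1\t101\t2\t32\t0.75\t0.25\n"
    "chr1\t205\t2\t30\t0.125\t0.875\n"
    "chr2\t310\t2\t32\t0.5\t0.5\n"
)

colnames = ["chr", "pos", "nalleles", "nchr", "a1", "a2"]
var_freq = pd.read_csv(
    io.StringIO(sample), sep="\t", header=None, skiprows=1, names=colnames
)

# Vectorised row-wise minimum over the two frequency columns
var_freq["maf"] = var_freq[["a1", "a2"]].min(axis=1)

# Literal counterpart of R's apply(1, function(z) min(z))
maf_apply = var_freq[["a1", "a2"]].apply(lambda row: row.min(), axis=1)

print(var_freq)
```

Both variants should give the same values; the `apply` version calls a Python function once per row, so it will likely be slower on large files.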