
Problem with counting times a parameter is in a certain range

I am trying to count how many times a value, "i", falls in the range i < 0.5. The values come from a CSV file. To be clear, I only want the count to be appended to a dictionary. I will post my code and the result I get. The result I want looks like this (86 is a placeholder, not the actual count): {'0.0-0.5': [86], ...}

input:

import pandas as pd

df = pd.read_csv("gen_pop.csv", index_col=0)
new_d = {'0.0-0.5':[],'0.5-1':[],'1-10':[],'10-100':[],'100-450':[]}
for i in df.values:
    count=0
    for col in df:
        if str(i) == "nan":
            continue
        if (i<0.5).any():
            count+=1
    new_d['0.0-0.5'].append(count)
print (new_d)

output: {'0.0-0.5': [0, 32, 0, 0, 32, 0, 0, 32, (... and so on a thousand times)], '0.5-1': [], '1-10': [], '10-100': [], '100-450': []}

Thanks in advance!

So I tried counting the number of times i < 0.5 is true, but it doesn't work. To be clear, the CSV file contains a table with 12,000 names as rows and 32 tissues as columns, where each cell holds the expression of that name in that tissue.


Answer

Okay, let’s say you have a toy data frame.

>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
>>> df

   A  B
0  1  4
1  2  5
2  3  6

Your comment seemed to imply something like this, where you could cut your data into bins and count.

>>> pd.cut(df.values.flatten(), bins=[0, 1, 4, 6]).value_counts()
>>> # specific bins are arbitrary etc

(0, 1]    1
(1, 4]    3
(4, 6]    2
dtype: int64
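
Applied to the ranges from the question, the same idea might look like the sketch below. The small frame and its column and row names are made up for illustration (the real data would come from gen_pop.csv), and the choice of right=False is an assumption meant to mirror the i < 0.5 test in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real CSV (names as rows, tissues as columns).
df = pd.DataFrame(
    {"tissue_a": [0.2, 0.7, np.nan], "tissue_b": [5.0, 0.1, 120.0]},
    index=["gene_1", "gene_2", "gene_3"],
)

edges = [0, 0.5, 1, 10, 100, 450]
labels = ["0.0-0.5", "0.5-1", "1-10", "10-100", "100-450"]

# Flatten every cell into one Series. NaN cells fall into no bin, so they are not counted.
flat = pd.Series(df.to_numpy().ravel())

# right=False makes each bin half-open, [left, right), so 0.5 itself lands in "0.5-1".
binned = pd.cut(flat, bins=edges, labels=labels, right=False)
counts = binned.value_counts(sort=False)

# One count per range, wrapped in a list to mirror the dictionary in the question.
new_d = {label: [int(counts.get(label, 0))] for label in labels}
print(new_d)
```

With right=False a value exactly equal to the last edge (450 here) would not be counted, so the edges may need adjusting for the real data.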

But your code seems to imply that you want to run the cut for every row separately. You could do that by applying the cut function with appropriate parameters, df.apply(lambda r: pd.cut(r, bins=[ arbitrary... bins... here... ]).value_counts()) (by default apply passes each column to the function; pass axis=1 to work on rows instead), but I'm not entirely sure what output you want anyway. To get it into the form implied by your code would just require transposing the output.
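
A minimal sketch of the per-row variant, under the same assumptions as before: the frame, the helper name count_bins, and the half-open bins (right=False) are illustrative and not taken from the question's data. Because axis=1 is used here, the result already has one row per name and one column per range, so no transpose is needed in this version.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real CSV (names as rows, tissues as columns).
df = pd.DataFrame(
    {
        "tissue_a": [0.2, 0.7, np.nan],
        "tissue_b": [5.0, 0.1, 120.0],
        "tissue_c": [0.3, 0.9, 0.4],
    },
    index=["gene_1", "gene_2", "gene_3"],
)

edges = [0, 0.5, 1, 10, 100, 450]
labels = ["0.0-0.5", "0.5-1", "1-10", "10-100", "100-450"]


def count_bins(row):
    """Count how many values of one row fall into each range (NaN is ignored)."""
    binned = pd.cut(row, bins=edges, labels=labels, right=False)
    # Convert to plain labels, count them, and reindex so that empty ranges
    # appear as 0 and every row reports the ranges in the same order.
    return binned.astype(object).value_counts().reindex(labels, fill_value=0)


# axis=1 hands each row to count_bins; the result has one column per range.
per_row = df.apply(count_bins, axis=1)

# Dictionary keyed by range, with one count per row of the original table.
new_d = per_row.to_dict("list")
print(new_d)
```

Each list in new_d has one entry per row of the table, which is the shape the loop in the question was building toward.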

User contributions licensed under: CC BY-SA