Tag: numpy

Problem: My dataset has country, state and city columns, and some state and city names are 0. I want to replace each “0” state value with the city name.

I tried this method: the code lists some cities in Israel, generates 500 random values from that list, and applies a condition to insert them into the Israel state location wherever the value is 0. Answer
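The replacement described above can be sketched with a boolean mask. This is a minimal illustration, not the asker's code: the column names `country`, `state` and `city` and the sample rows are assumptions, since the excerpt does not show the actual dataset.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions.
df = pd.DataFrame({
    "country": ["Israel", "Israel", "Israel"],
    "state": ["Tel Aviv District", "0", 0],
    "city": ["Tel Aviv", "Haifa", "Eilat"],
})

# Treat both the string "0" and the integer 0 as a missing state.
is_zero = df["state"].astype(str) == "0"

# Where the state is "0", take the city name; otherwise keep the state.
df["state"] = df["state"].mask(is_zero, df["city"])
print(df)
```

Comparing on the string form catches zeros stored either as text or as numbers, which is common after reading mixed CSV data.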

Numpy: Indexing 3D matrix using 1D array

arrays numpy python

I’m trying to index this array of shape (3,2,2) with an array of shape (3) containing the y-index of the value I want to get. I tried to make it work with a for ... in statement, but is there an elegant way to do it with numpy? Answer So you want arr[0,0,:], arr[1,1,:], arr[2,1,:]? How about
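One way to select arr[0,0,:], arr[1,1,:], arr[2,1,:] without a loop is integer array indexing. The sketch below is an illustration under assumed data (the array contents and the names `arr`, `idx` and `picked` are not from the original post).

```python
import numpy as np

arr = np.arange(12).reshape(3, 2, 2)  # assumed example array of shape (3, 2, 2)
idx = np.array([0, 1, 1])             # y-index to pick for each first-axis entry

# Pair position i on the first axis with idx[i] on the second axis.
picked = arr[np.arange(arr.shape[0]), idx]

print(picked.shape)  # (3, 2)
print(picked)
```

The two index arrays are matched element by element, so row `i` of the result is `arr[i, idx[i], :]`.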

Convert rec.array to dataframe

arrays numpy pandas python

I’ve been trying to convert a numpy rec.array into a dataframe. The current array looks like: The result should be a five-column dataframe like the following:

Weights      v_1          v_2          v_3          v_4
0.2          1.76405235   0.40015721   0.97873798   2.2408932
0.2          1.86755799   -0.97727788  0.95008842   -0.15135721
…            …            …            …            …
0.05882353   0.17742614   -0.40178094  -1.63019835  0.46278226

and so on. However, as I do pd.DataFrame(my_list), the
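The excerpt does not show the actual array, so the following is only a sketch under a loud assumption: that the record array has a scalar field (here called `weights`) and a 4-element vector field (here called `v`). Both field names and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd

# Assumed layout: one scalar field and one length-4 vector field per record.
data = np.zeros(3, dtype=[("weights", "f8"), ("v", "f8", (4,))])
data["weights"] = [0.2, 0.2, 0.05882353]
data["v"] = np.arange(12, dtype=float).reshape(3, 4)
rec = data.view(np.recarray)

# Expand the vector field into four columns, then prepend the weights.
df = pd.DataFrame(rec["v"], columns=[f"v_{i}" for i in range(1, 5)])
df.insert(0, "Weights", rec["weights"])
print(df)
```

If the real array has a different layout, the same idea applies: build the frame from the 2-D field and add the scalar fields as separate columns.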

How to change the datetime format of a column that also contains strings

numpy pandas python python-3.x

I have a data frame with a column (Start Shift) that holds mixed data types (datetime/string). I need to change the datetime values to a time-only format and keep the strings unchanged. I used the code below to solve this, but I can’t find a way to apply the change to the data frame
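A minimal sketch of one way to apply such a change to the column itself: convert only the datetime values and assign the result back. The sample values, the helper name `to_time_only` and the `%H:%M:%S` output format are assumptions for illustration.

```python
from datetime import datetime

import pandas as pd

# Hypothetical mixed column: datetimes and plain strings.
df = pd.DataFrame({
    "Start Shift": [
        datetime(2021, 5, 1, 8, 30),
        "Off",
        pd.Timestamp("2021-05-02 17:00"),
    ]
})

def to_time_only(value):
    """Format datetimes as HH:MM:SS; return anything else unchanged."""
    # pd.Timestamp is a subclass of datetime, so one check covers both.
    if isinstance(value, datetime):
        return value.strftime("%H:%M:%S")
    return value

# Assign back to the column so the change is stored in the data frame.
df["Start Shift"] = df["Start Shift"].apply(to_time_only)
print(df)
```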

Fit data with a lognormal function via Maximum Likelihood estimators

numpy python scipy.stats

Could someone help me fit the data collapse_fractions with a lognormal function whose median and standard deviation are derived via the maximum likelihood method? I tried scipy.stats.lognormal.fit(data), but I did not obtain the results I got with Excel. The Excel file can be downloaded: https://stacks.stanford.edu/file/druid:sw589ts9300/p_collapse_from_msa.xlsx Also, any reference is welcome. Answer I couldn’t figure out how to get
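For reference, when the data are plain samples from a two-parameter lognormal (location fixed at 0), the maximum likelihood estimates have a closed form: the mean and the population standard deviation of the log-data. The sketch below uses synthetic samples, not the spreadsheet data. Collapse fractions observed at several intensity levels may call for a different likelihood, so treat this as an illustration of the basic estimator only. In SciPy the distribution is named `scipy.stats.lognorm`, and `lognorm.fit(data, floc=0)` returns a shape and scale that correspond to `sigma_hat` and `median_hat` here.

```python
import numpy as np

# Synthetic samples (assumption): lognormal with mu = 0.5, sigma = 0.4.
rng = np.random.default_rng(42)
data = rng.lognormal(mean=0.5, sigma=0.4, size=10_000)

# Closed-form MLE for a lognormal with location fixed at 0.
log_data = np.log(data)
mu_hat = log_data.mean()          # MLE of the mean of log(x)
sigma_hat = log_data.std(ddof=0)  # MLE uses the 1/n (population) form
median_hat = np.exp(mu_hat)       # median of the fitted lognormal

print(mu_hat, sigma_hat, median_hat)
```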

Join two dataframes based on index where the second dataframe has repeated indexes related to the first dataframe

data-science dataframe numpy pandas python

I have two data frames. The first has an index starting from zero; the second has repeated indexes starting from zero. I want to join the two dataframes based on their indexes. First dataframe is like this The second dataframe is I want to join these two dataframes based on index, i.e. the new dataframe should look like
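A join on the index handles repeated index values directly: each row of the frame with repeated indexes picks up the matching row from the other frame. The frames below are made-up stand-ins, since the excerpt omits the originals, and the column names are assumptions.

```python
import pandas as pd

# Hypothetical frames: df1 has a unique index, df2 repeats index values.
df1 = pd.DataFrame({"name": ["a", "b", "c"]})  # index 0, 1, 2
df2 = pd.DataFrame({"value": [10, 11, 20, 30, 31]}, index=[0, 0, 1, 2, 2])

# Left join on the index: every df2 row gets the df1 row with the same index.
joined = df2.join(df1)
print(joined)
```

Calling `join` on the frame with repeated indexes keeps its row order and repeats the values of the other frame as needed.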

Pandas: str.extract() giving unexpected NaN

dataframe numpy pandas python

I have a data set which has a column that looks like this I need only the numbers. Here’s my code: I was expecting an output like: but I got Just to test, I dumped the dataframe to a .csv and read it back with pd.read_csv(). That gave me just the numbers, as I need (though of course that’s not
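The excerpt does not show the data, so the cause can only be guessed. One common source of unexpected NaN from `str.extract()` is a column with mixed types: the `.str` methods return NaN for elements that are not strings. The sketch below assumes that situation; the column name `code` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical column mixing strings with a plain integer.
df = pd.DataFrame({"code": ["AB 123", 456, "item-789"]})

# The integer element is not a string, so it comes back as NaN.
raw = df["code"].str.extract(r"(\d+)", expand=False)

# Casting to str first lets the regex see every value.
fixed = df["code"].astype(str).str.extract(r"(\d+)", expand=False)

print(raw)
print(fixed)
```

A CSV round trip turns the column into text or numbers consistently, which would be in line with the behaviour described, though that remains an assumption.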

Pandas – Count repeating values by condition

numpy pandas python

Dataframe: I have columns “group” and “val” and I don’t know how to write pandas code to get column “count”? The logic is like this, it should count the number of consecutive values that are on the same side (either positive or negative) grouped by column “group”. When side changes the counter should be reset to 1 and start counting
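The counting logic described above can be sketched by marking where the sign changes within each group, turning those marks into run ids, and numbering rows inside each run. The sample values are made up, and the sketch treats zero as its own side, which the excerpt does not specify.

```python
import numpy as np
import pandas as pd

# Hypothetical data with the two columns named in the question.
df = pd.DataFrame({
    "group": ["A", "A", "A", "A", "B", "B", "B"],
    "val": [1, 2, -1, -3, -2, -5, 4],
})

sign = np.sign(df["val"])

# A new run starts where the sign differs from the previous row in the same
# group; the first row of each group compares against NaN and starts a run.
new_run = sign != sign.groupby(df["group"]).shift()
run_id = new_run.cumsum()

# Number the rows inside each run, starting from 1.
df["count"] = df.groupby(["group", run_id]).cumcount() + 1
print(df)
```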

Update column based on grouped date values

dataframe numpy pandas python

Edited/reposted with correct sample output. I have a dataframe that looks like the following: This dataframe is split into groups by ID. I would like to make an updated combined column based on if df[‘bool’] == True, but only if df[‘bool’] == True AND there is another ‘finished’ row in the same group with a LATER (not the same) year.

How to clean survey data in pandas

data-cleaning dataframe numpy pandas python

Input: Output: here’s the data: d = {‘Morning’: [“Didn’t answer”, “Didn’t answer”, “Didn’t answer”, ‘Morning’, “Didn’t answer”], ‘Afternoon’: [“Didn’t answer”, ‘Afternoon’, “Didn’t answer”, ‘Afternoon’, “Didn’t answer”], ‘Night’: [“Didn’t answer”, ‘Night’, “Didn’t answer”, ‘Night’, ‘Night’], ‘Sporadic’: [“Didn’t answer”, “Didn’t answer”, ‘Sporadic’, “Didn’t answer”, “Didn’t answer”], ‘Constant’: [“Didn’t answer”, “Didn’t answer”, “Didn’t answer”, ‘Constant’, “Didn’t answer”]} I want the output to be: