Tag: pandas

Pandas: Date Format Changes from mmmyy to YYYY-MM-DD

I have a data frame column in mmmyy format and would like to change it to yyyy-mm-dd format. Input : Output: tried: conv = lambda x: datetime.strptime(x, "%b %y") Answer: Sept doesn’t match the %b format, as only 3 letters are valid. You could fix the string to keep only 3 letters: or .replace(r'…
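The excerpt's sample data and the answer's own code are not shown, so the following is only a minimal sketch of the idea it describes: trim the month name to three letters so that it matches %b, then parse and reformat. The sample values, the space between month and year, the regular expression, and an English locale for %b are all assumptions.

```python
import pandas as pd

# Hypothetical input: the original sample data is not shown in the excerpt.
s = pd.Series(["Sept 21", "Jan 20", "Mar 19"])

# Keep only the first three letters of the month name so it matches %b.
trimmed = s.str.replace(r"^([A-Za-z]{3})[A-Za-z]*", r"\1", regex=True)

# Parse as abbreviated month + two-digit year, then render as YYYY-MM-DD.
dates = pd.to_datetime(trimmed, format="%b %y")
formatted = dates.dt.strftime("%Y-%m-%d")
print(formatted.tolist())
```

Because the input carries no day, the parsed dates fall on the first of each month.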

Improve function that redistributes shares in a pandas dataframe column (possible to avoid nested for loops?)

pandas python

Below I have a dataframe (df) of ten rows; each row has a NAME and belongs to a GROUP. Each row has a value for SHARE that is 0.1. I want to manipulate the distribution of shares. For example, if I increase the share value for NAME=’ONE’ from 0.1 to 0.175, I want a function that simultaneously decreases s…
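The excerpt is cut off before it says which rows should absorb the change, so this is a sketch under a stated assumption: every other row is scaled proportionally so that the SHARE column keeps its original total. The function name set_share, the GROUP values, and that redistribution rule are hypothetical; the point is only that a boolean mask can replace nested loops.

```python
import pandas as pd

names = ["ONE", "TWO", "THREE", "FOUR", "FIVE",
         "SIX", "SEVEN", "EIGHT", "NINE", "TEN"]
df = pd.DataFrame({
    "NAME": names,
    "GROUP": ["A"] * 5 + ["B"] * 5,  # hypothetical grouping, unused below
    "SHARE": [0.1] * 10,
})

def set_share(frame, name, new_share):
    """Set one row's share and rescale all other rows (assumed rule)."""
    out = frame.copy()
    is_target = out["NAME"] == name
    others_total = out.loc[~is_target, "SHARE"].sum()
    remaining = out["SHARE"].sum() - new_share
    # Scale the other rows so the column total is unchanged.
    out.loc[~is_target, "SHARE"] *= remaining / others_total
    out.loc[is_target, "SHARE"] = new_share
    return out

result = set_share(df, "ONE", 0.175)
print(result)
```

If the change should instead be absorbed only within the same GROUP, the mask could be narrowed to rows sharing the target's GROUP value.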

How to sum values in a new column, based on conditions and occurrence of the same value in col (value recurrence) as a factor

dataframe finance pandas python python-3.x

I’m trying to find a way to update values in a new column. I have written a piece of code that at every step (row by row) displays the sum of buy/sell orders with the best price. Because updates may occur for a particular id, I need to find a way to use this factor to properly populate new

Pandas cumsum with keys

cumsum pandas pandas-groupby python

I have two DataFrames (first, second):

index_first  value_1  value_2
0            100      1
1            200      2
2            300      3

index_second  value_1  value_2
0             50       10
1             100      20
2             150      30

Next I concat the two DataFrames with keys: My goal is to calculate the cumulative sum of value_1 and value_2 in z considering the keys. So the final DataFrame shou…
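The expected output is truncated in the excerpt, so the following is only a sketch of one reading of "considering the keys": the cumulative sum restarts within each key block. The key labels "first" and "second" are assumptions; the values come from the two tables in the question.

```python
import pandas as pd

first = pd.DataFrame({"value_1": [100, 200, 300], "value_2": [1, 2, 3]})
second = pd.DataFrame({"value_1": [50, 100, 150], "value_2": [10, 20, 30]})

# Concatenate with keys; the keys become the outer level of a MultiIndex.
z = pd.concat([first, second], keys=["first", "second"])

# Cumulative sum computed separately for each key (outer index level).
per_key = z.groupby(level=0).cumsum()
print(per_key)
```

If the intent is instead to accumulate across keys for each row position, grouping on the inner level with `z.groupby(level=1).cumsum()` would be the corresponding variant.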

Issue w/ pandas.index.get_loc() when match is found, TypeError: ("'>' not supported between instances of 'NoneType' and 'str'", 'occurred at index 1')

dataframe indexing pandas python

Below is the example to reproduce the error: The desired output should be a list or array with the values [3,NaN,4,3]. The NaN appears because that entry does not satisfy the criteria. I checked the pandas reference, and it says that when you do not have an exact match you can change the “method” to …

Why the rank function is not working when I set axis=1?

dataframe jupyter-notebook pandas python python-3.x

I have this code: The code works as it is, but it is not returning what I want. I was trying to rank num considering only its row, so I tried to change this line: to: But it didn’t work. What am I missing here? Answer Building on what you already have here: we could add the Rank column as
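The question's code and the answer's solution are not shown in the excerpt, so this is not a reconstruction of either. It is only a small illustration, on made-up data, of what axis=1 does in DataFrame.rank: values are ranked against the other columns in the same row rather than against the other rows in the same column.

```python
import pandas as pd

# Hypothetical data; column names are placeholders.
df = pd.DataFrame({"a": [3, 1], "b": [1, 2], "c": [2, 3]})

# axis=0 (default): rank each value within its column.
by_column = df.rank(axis=0)

# axis=1: rank each value within its row, across the columns.
by_row = df.rank(axis=1)
print(by_row)
```

Ranking along axis=1 therefore only helps when the values to compare sit in different columns of one row; a single column such as num has nothing to be ranked against in that direction.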

Split column into multiple columns with unique values in pandas

dataframe pandas python python-3.x

I have the following dataframe: which needs to be turned into: Answer Using pandas.concat: For Python < 3.8: Output: NB: add fillna('') to have empty strings for missing values
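Neither the input frame nor the answer's code survives in the excerpt, so the following is a sketch under assumptions rather than the answer's method: a hypothetical key column whose unique values each become a column, holding the matching entries of a hypothetical val column. It uses pandas.concat and the fillna('') step mentioned in the note.

```python
import pandas as pd

# Hypothetical input; the original dataframe is not shown in the excerpt.
df = pd.DataFrame({"key": ["a", "b", "a", "c", "b"],
                   "val": [1, 2, 3, 4, 5]})

# One Series per unique key, re-indexed from 0 so they line up side by side.
pieces = {k: g["val"].reset_index(drop=True) for k, g in df.groupby("key")}

# Dict keys become the column names; shorter columns are padded with NaN.
wide = pd.concat(pieces, axis=1)

# Optional: show empty strings instead of NaN for the padded cells.
wide_filled = wide.fillna("")
print(wide_filled)
```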

How to get all columns except the last column in a 3D numpy array?

multidimensional-array numpy-ndarray pandas python

I have a 3D array composed of various columns. I just want to slice off the last column. The array looks like the following: I have tried the following code, but it only shows the last column, while I want to show the values of all columns except the last one. Answer IIUC, you can use: Example input: matching output:
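The answer's code is not included in the excerpt. As a minimal sketch, assuming "column" means the last axis of the array, a slice of `:-1` on that axis keeps everything except the final column, whereas an index of `-1` selects only the final column. The array below is made up for illustration.

```python
import numpy as np

# Hypothetical 3D array: 2 blocks, 3 rows, 4 columns.
arr = np.arange(24).reshape(2, 3, 4)

# Only the last column of every row (what the question reports seeing).
last_only = arr[:, :, -1]

# Every column except the last one.
all_but_last = arr[:, :, :-1]

print(all_but_last.shape)
```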

Creating a large dataframe out of 100 csv files (full join required)

csv pandas python

I need to create a dataframe from 100+ CSV files. My issue is that I have more than 100 CSVs with more than 55000 rows in each (as primary keys). The difference between the CSV files is that all columns (maybe around 1200 columns) were split across separate files. In other words, I need to do a FULL
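The excerpt ends before any answer, so this is only one possible sketch of a full (outer) join across many files: read each CSV with the primary key as its index and concatenate along the columns. The key name id, the file names, and the helper full_join_csvs are hypothetical, and the approach assumes the key is unique within each file and that column names do not repeat across files.

```python
import tempfile
from pathlib import Path

import pandas as pd

def full_join_csvs(paths, key="id"):
    """Outer-join CSV files on a shared key column (assumed unique per file)."""
    frames = [pd.read_csv(p, index_col=key) for p in paths]
    # axis=1 aligns rows on the index; join="outer" keeps keys from every file.
    return pd.concat(frames, axis=1, join="outer").sort_index()

# Tiny stand-in files so the sketch runs end to end.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    (tmp_path / "part1.csv").write_text("id,col_a\n1,10\n2,20\n")
    (tmp_path / "part2.csv").write_text("id,col_b\n2,200\n3,300\n")
    joined = full_join_csvs(sorted(tmp_path.glob("*.csv")))

print(joined)
```

A single concat over all frames avoids the repeated intermediate copies that chaining 100+ pairwise merges would create.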