I have a DataFrame column in mmmyy format and would like to convert it to yyyy-mm-dd format. Input: Output: Tried: conv = lambda x: datetime.strptime(x, "%b %y") Answer: Sept doesn't match the %b format, as only three-letter month abbreviations are valid. You could fix the string to keep only 3 letters: or .replace(r'…
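A minimal sketch of the "keep only 3 letters" idea, since the answer's own code is cut off in the excerpt. The sample values are hypothetical, the space between month and year is assumed from the "%b %y" format the asker tried, and %b is assumed to resolve to English month abbreviations in the active locale.

```python
import pandas as pd

# Hypothetical sample values; the original input is not shown in the excerpt.
s = pd.Series(["Sept 21", "Oct 21", "Jan 22"])

# Trim any leading month name to its first three letters so it matches %b.
trimmed = s.str.replace(r"^([A-Za-z]{3})[A-Za-z]*", r"\1", regex=True)

# Parse as abbreviated month + two-digit year; the day defaults to 1.
dates = pd.to_datetime(trimmed, format="%b %y")

# Render in yyyy-mm-dd form.
formatted = dates.dt.strftime("%Y-%m-%d")
print(formatted.tolist())
```

Trimming with a regular expression rather than special-casing "Sept" also covers fully spelled-out month names, though that goes beyond what the excerpt confirms about the data.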
Tag: pandas
Improve function that redistributes shares in a pandas dataframe column (possible to avoid nested for loops?)
Below I have a dataframe (df) of ten rows; each row has a NAME and belongs to a GROUP. Each row has a value for SHARE that is 0.1. I want to manipulate the distribution of shares. For example, if I increase the share value for NAME='ONE' from 0.1 to 0.175, I want a function that simultaneously decreases s…
How to sum values in a new column, based on conditions and occurrence of the same value in col (value recurrence) as a factor
I'm trying to find a way to update values in a new column, having written a piece of code that in every step (row by row) displays the sum of buy/sell orders with the best price. Because of updates that may occur for a particular id, I need to find a way to use this factor to properly populate new
Data type assigned inside nested for loop isn’t as expected
I get the error: AttributeError: 'float' object has no attribute 'lower' when trying to run this triple nested for loop: df_row_list is a list of 18 series. I am trying to iterate through it and comb through the data. How do I assign the str data type to row_item_data so that I can…
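The question's loop is not shown, so the following is only a sketch of one common cause: pandas represents missing values as float NaN, and calling .lower() on such a value raises this AttributeError. The sample series and the name lowered are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical stand-in for one of the series in df_row_list.
row = pd.Series(["Apple", float("nan"), "Banana"])

# Lower-case real values and leave missing values (float NaN) untouched.
lowered = [str(item).lower() if pd.notna(item) else item for item in row]
print(lowered)
```

Calling str(item) on every element would also silence the error, but it would turn missing values into the literal text "nan", which is why the check comes first here.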
Pandas cumsum with keys
I have two DataFrames (first, second):

index_first  value_1  value_2
0            100      1
1            200      2
2            300      3

index_second  value_1  value_2
0             50       10
1             100      20
2             150      30

Next I concat the two DataFrames with keys: My goal is to calculate the cumulative sum of value_1 and value_2 in z considering the keys. So the final DataFrame shou…
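The expected output is cut off, so this sketch covers only one reading of "considering the keys": a cumulative sum that restarts for each key. The key labels "first" and "second" are assumptions, since the concat call itself is not shown.

```python
import pandas as pd

# The two frames from the question, rebuilt from the values shown.
first = pd.DataFrame(
    {"value_1": [100, 200, 300], "value_2": [1, 2, 3]},
    index=pd.Index([0, 1, 2], name="index_first"),
)
second = pd.DataFrame(
    {"value_1": [50, 100, 150], "value_2": [10, 20, 30]},
    index=pd.Index([0, 1, 2], name="index_second"),
)

# Assumed key labels; the outer index level records which frame a row came from.
z = pd.concat([first, second], keys=["first", "second"])

# Cumulative sum computed separately within each key (outer index level).
z_cum = z.groupby(level=0).cumsum()
print(z_cum)
```

If the goal is instead a running total across the keys for each row position, grouping on the inner level (level=1) would be the corresponding variant.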
Issue w/ pandas.index.get_loc() when match is found, TypeError: ("'>' not supported between instances of 'NoneType' and 'str'", 'occurred at index 1')
Below is the example to reproduce the error: The desired output should be a list or array with the values [3,NaN,4,3]. The NaN because it does not satisfy the criteria. I checked the pandas references and it says that for cases when you do not have an exact match you can change the “method” to …
Why is the rank function not working when I set axis=1?
I have this code: The code works as it is, but it does not return what I want. I was trying to rank num considering only its row, so I tried to change this line: to: But it didn't work. What am I missing here? Answer: Building on what you already have here: we could add the Rank column as
Split column into multiple columns with unique values in pandas
I have the following dataframe: which needs to be turned into: Answer: Using pandas.concat: For python < 3.8: Output: NB. add fillna('') to have empty strings for missing values
How to get the all columns except last column in 3D numpy array?
I have a 3D array composed of various columns. I just want to slice off the last column. The array looks like the following: I have tried the following code, but it only shows the last column, while I want to show all column values except the last column. Answer: IIUC, you can use: Example input: matching output:
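The answer's code is not included in the excerpt. As a hedged sketch, slicing the last axis with :-1 keeps every column except the final one; the array below is a hypothetical example, and "column" is assumed to mean the last axis.

```python
import numpy as np

# Hypothetical 3D array: 2 blocks, 3 rows, 4 columns.
arr = np.arange(24).reshape(2, 3, 4)

# Keep everything along the first two axes and drop the last column.
without_last = arr[:, :, :-1]

# Equivalent form using Ellipsis for "all leading axes".
same_result = arr[..., :-1]

print(without_last.shape)
```

By contrast, arr[:, :, -1] selects only the last column, which matches the behaviour the question describes.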
Creating a large dataframe out of 100 csv files (full join required)
I need to create a dataframe from 100+ CSV files. My issue is that I have more than 100 CSVs with more than 55000 rows in each (as primary keys). The difference between the CSV files is that the columns (maybe around 1200 in total) were split across separate files. In other words, I need to do a FULL
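One possible approach, sketched under assumptions: each file is read with its primary key as the index, and the frames are then aligned column-wise with an outer join. The key column name id and the in-memory CSV text are hypothetical stand-ins for the real files, and the keys are assumed to be unique within each file.

```python
import io

import pandas as pd

# In-memory stand-ins for two of the CSV files; real code would pass file paths.
csv_a = io.StringIO("id,col_a\n1,10\n2,20\n")
csv_b = io.StringIO("id,col_b\n2,200\n3,300\n")
sources = [csv_a, csv_b]

# Read every file with the (assumed) primary key column as its index.
frames = [pd.read_csv(src, index_col="id") for src in sources]

# Align all frames on the index; join="outer" keeps keys found in any file.
combined = pd.concat(frames, axis=1, join="outer").sort_index()
print(combined)
```

A single concat over the whole list avoids building 100+ intermediate results, which a chain of pairwise merges would create.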