I just got started on Kaggle, and for my first project I was working on the Titanic dataset. I ran the following code block, but in the output I got, the Pclass, SibSp and Parch variables were not converted to one-hot encoded vectors, though the Sex attribute was. I didn’t understand why becau…
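The code from this question is not included in the excerpt, so the following is only a minimal sketch of one common cause, assuming the frame was passed to pd.get_dummies without the columns argument. By default pd.get_dummies encodes only object, string and category columns, so integer columns such as Pclass, SibSp and Parch pass through unchanged unless they are listed explicitly. The toy DataFrame below is hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for a few Titanic columns (not the real dataset).
df = pd.DataFrame({
    "Pclass": [1, 3, 2],                    # integer dtype
    "Sex": ["male", "female", "female"],    # object dtype
    "SibSp": [1, 0, 0],                     # integer dtype
    "Parch": [0, 0, 2],                     # integer dtype
})

# Default behaviour: only the object column "Sex" is one-hot encoded.
default_encoded = pd.get_dummies(df)

# Listing the columns explicitly encodes the integer columns as well.
explicit_encoded = pd.get_dummies(
    df, columns=["Pclass", "Sex", "SibSp", "Parch"]
)

print(list(default_encoded.columns))
print(list(explicit_encoded.columns))
```

Casting the integer columns to the category dtype before calling pd.get_dummies is another way to get the same effect.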
Tag: pandas
Transform DataFrame: place values in right columns as new rows
I am analyzing a consumer survey with both Dutch (NL) and French (FR) respondents. Depending on the answer they gave when we asked about their native language, they got the same questionnaire translated into Dutch or French. The problem is that the output of Qualtrics (the survey software) gave us t…
pandas: manage duplicated sentences on different columns
I have a dataframe as follows: I want to add the first column value to a sentence if that sentence is repeated somewhere else in the next three columns. So my desired output would be: col1 col2 col3 col4 1_a 1_aJoe waited for the train. the weather is nice the house looks amazing 2_a The train was late. the we…
Remove data time older than specific hours
I want to remove data from my dataframe that is older than, say, 2 hours from the current time, starting at 00 minutes (the datetime column is the index). When I use the code below: Current datetime: '17-03-2022 17:05:00'. Issue: my code keeps all records in df from '17-03-2022 15:05:00' to '17-03-2022 17:0…
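The question's own code is not shown in this excerpt. As a rough sketch of the idea, assuming "starting with 00 mins" means the cutoff should be measured from the current hour truncated to :00 (so 17:05 gives a cutoff of 15:00 rather than 15:05), one option is to zero out the minutes before subtracting. The helper name keep_recent and the sample data are hypothetical.

```python
import pandas as pd

def keep_recent(df, now, hours=2):
    """Keep rows whose DatetimeIndex is at or after (now truncated to the hour) - hours."""
    # Truncate the reference time to the top of the hour, e.g. 17:05 -> 17:00.
    top_of_hour = pd.Timestamp(now).replace(
        minute=0, second=0, microsecond=0, nanosecond=0
    )
    cutoff = top_of_hour - pd.Timedelta(hours=hours)
    return df[df.index >= cutoff]

# Hypothetical sample: one row every 5 minutes from 14:30 to 17:05.
idx = pd.date_range("2022-03-17 14:30", "2022-03-17 17:05", freq="5min")
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# A fixed "now" keeps the example reproducible; pd.Timestamp.now() would be used in practice.
recent = keep_recent(df, "2022-03-17 17:05:00")
print(recent.index.min(), recent.index.max())
```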
Pandas groupby, assign and to_excel – on loop/repeat
I have a dataframe as shown below. My objective is to do the following: a) Group columns based on multiple criteria (as shown in the code below) b) Assign a default value based on the target column (e.g., if target_at50, assign 50; if target_at60, assign 60; if target_at70, assign 70) c) Repeat the…
Cannot search value in dataframe although the value exists
I have a data frame with location data. I know a value for a certain location exists, and I even know its index location. When I search using the index location, the value is shown correctly, but if I search using a combination of other columns (lat and lon), the value does not show. I am attaching the screenshot be…
Control how NAs are displayed with pandas styler
I am trying to use the na_rep argument of df.style.format() to control how cells with NaN are shown in the table. Documentation: https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.format.html Reproducible code: I get this error message. TypeError: format() got an unexpected keyword ar…
How to apply function for only special cells in column pandas?
I have an issue with applying a function to a column in pandas; please see the code below: My df now looks like this: I would like to apply the function checknum to the 2 cells in column ‘new’ that have a ‘None’ value. Can someone assist with this? Thank you. Answer IIUC, you can use vectorized code: o…
pandas pivot_tables doesn’t work with date data (No numeric types to aggregate)
I have the following dataframe: I want to create a pivot table to get the following table: I have tried to do this using pivot_table (pandas): but I get this error: DataError: No numeric types to aggregate. I have read this post, but it seems to be a bit different as I don’t want to change the columns and a…
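The dataframe and the pivot_table call from this question are not shown in the excerpt. As a hedged sketch of one usual way around this error: pivot_table aggregates with the mean by default, which does not apply to date values, so passing an aggregation that works on dates (here aggfunc="first") lets the dates be reshaped as-is. The column names id, event and date are hypothetical.

```python
import pandas as pd

# Hypothetical long-format data: one date per (id, event) pair.
df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "event": ["start", "end", "start", "end"],
    "date": pd.to_datetime(
        ["2022-01-01", "2022-01-05", "2022-02-01", "2022-02-03"]
    ),
})

# aggfunc="first" keeps the (single) date in each cell instead of averaging.
wide = df.pivot_table(
    index="id", columns="event", values="date", aggfunc="first"
)
print(wide)
```

When every (index, column) pair is unique, df.pivot(index="id", columns="event", values="date") gives the same shape without any aggregation.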
How to define breaks of the bins for log scale in Seaborn
Consider the following minimal working example: The minimal working example above will produce Now, let’s use our own breaks of the bins: I expected to produce a histogram similar to the one before, but I’m getting What am I doing wrong? Or is this a bug in Seaborn 0.11.2? Answer Apparently, yo…