The dataframe is shown below. I want to change the dataframe’s value to ‘dead’ if age is more than 100. Desired outcome: I was trying something like this: Error shown: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). I am looking for a loop that works on the whole dataframe. Please correct my
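The "truth value ... is ambiguous" error typically appears when a whole column comparison such as `df["age"] > 100` is used inside a plain `if`. A minimal sketch of the usual vectorized alternative follows. The dataframe and the column names `name`, `age`, and `status` are hypothetical, since the original data is not shown in the excerpt, and which column should receive ‘dead’ is an assumption.

```python
import pandas as pd

# Hypothetical data: the original dataframe is not shown in the excerpt.
df = pd.DataFrame(
    {
        "name": ["A", "B", "C"],
        "age": [34, 120, 101],
        "status": ["alive", "alive", "alive"],
    }
)

# A boolean mask selects the rows instead of an `if` on the whole column.
mask = df["age"] > 100

# Assumption: the value to overwrite lives in a column called "status".
df.loc[mask, "status"] = "dead"

print(df)
```

No explicit loop is needed here, because `.loc` with a boolean mask applies the assignment to every matching row at once.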
Tag: data-science
How to replace NaN value in column in Dataframe based on values from another column in same dataframe
Below is the dataframe I’m working with. I want to replace NaN values in the ‘Score’ column using values from the ‘Country’ and ‘Sectors’ columns. Below is the code which I’ve tried. I want to replace only the NaN values specific to Country == ‘USA’ and Sectors == ‘CHEM’ and keep all other values as they are. Could anyone please help? Answer: You can use
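The answer text is cut off in this excerpt, so the following is only one possible approach, not necessarily the one the answer gave. It is a minimal sketch that combines the two conditions with a missing-value check in one boolean mask. The sample rows and the replacement value `fill_value` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data using the column names mentioned in the question.
df = pd.DataFrame(
    {
        "Country": ["USA", "USA", "UK", "USA"],
        "Sectors": ["CHEM", "TECH", "CHEM", "CHEM"],
        "Score": [np.nan, np.nan, np.nan, 7.0],
    }
)

# Hypothetical replacement value; the question does not say what to fill in.
fill_value = 0.0

# Only rows that are USA + CHEM *and* currently NaN are selected.
mask = (
    (df["Country"] == "USA")
    & (df["Sectors"] == "CHEM")
    & df["Score"].isna()
)
df.loc[mask, "Score"] = fill_value

print(df)
```

Existing scores and NaN values in other country/sector combinations are left untouched.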
How to remove extra quotes in between quotes for following example "Dec 01, 1999","Pocket Aquarium "Pocker" Pocket","Random : USA","USA" using python
I want to remove the extra quotes in each line of a CSV file. Example: Ideal output required: Answer: You could try this: input: code: test_modified.csv
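The code from the answer is not included in this excerpt. As a rough sketch of one way to approach it, the helper below (the name `strip_inner_quotes` is hypothetical) assumes that every field is wrapped in straight double quotes, that fields are separated by exactly `","`, and that "removing" means deleting the inner quotes rather than escaping them.

```python
def strip_inner_quotes(line: str) -> str:
    """Drop double quotes that appear inside already-quoted CSV fields.

    Assumes each field is wrapped in double quotes and fields are
    separated by the exact sequence `","`.
    """
    line = line.rstrip("\r\n")
    if len(line) < 2 or not (line.startswith('"') and line.endswith('"')):
        return line  # not in the assumed format; leave unchanged

    # Remove the outermost quotes, then split on the field separator.
    fields = line[1:-1].split('","')

    # Any quote still left inside a field is an "extra" one.
    cleaned = [field.replace('"', "") for field in fields]
    return ",".join(f'"{field}"' for field in cleaned)


sample = '"Dec 01, 1999","Pocket Aquarium "Pocker" Pocket","Random : USA","USA"'
result = strip_inner_quotes(sample)
print(result)
```

A field whose text itself contains the sequence `","` would be split incorrectly under these assumptions, so this is only suitable when the separator cannot occur inside the data.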
Can Pandas output inferred schema for a CSV file?
Is there a method I can use to output the inferred schema of a large CSV using pandas? In addition, is there any way to have it tell me, along with the type, whether a column is nullable/blank based on the CSV? The file is about 500k rows with 250 columns. With my new job, I’m constantly being handed CSV files with zero format documentation.
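One way to get at this, sketched below under the assumption that the file fits in memory, is to let `pandas.read_csv` infer the types and then combine `DataFrame.dtypes` with `DataFrame.isna()` into a small summary table. The inline CSV text stands in for the real file, and the summary column names are made up for illustration.

```python
import io

import pandas as pd

# Small stand-in for the real file; pass a file path to read_csv instead.
csv_text = "id,name,amount\n1,alice,3.5\n2,,4.0\n3,carol,\n"
df = pd.read_csv(io.StringIO(csv_text))

# One row per CSV column: inferred dtype plus whether any value is missing.
schema = pd.DataFrame(
    {
        "dtype": df.dtypes.astype(str),
        "nullable": df.isna().any(),
        "null_count": df.isna().sum(),
    }
)

print(schema)
```

Note that "nullable" here only means a blank was observed in this particular file, not that the format forbids blanks elsewhere.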
Replace grouped columns’ outliers with mean of the group based on defined zscore
I have a very large dataframe with many data points on a map, with outliers which are very close to each other in the dataset (latitudes and longitudes). I would like to group all the rows as shown below for column A, calculate their z-scores, and replace every value within a group whose z-score is > 1.5 with the mean value for
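A minimal sketch of the grouped z-score idea using `groupby(...).transform`. The data and the value column name `lat` are hypothetical, and several choices are assumptions because the excerpt is truncated: the absolute z-score is compared against 1.5, the sample standard deviation (pandas' default) is used, and the replacement is the group mean computed with the outlier included.

```python
import pandas as pd

# Hypothetical data: column "A" is the grouping key from the question.
df = pd.DataFrame(
    {
        "A": ["x"] * 6 + ["y"] * 3,
        "lat": [10.0, 10.1, 9.9, 10.0, 10.2, 30.0, 5.0, 5.1, 4.9],
    }
)

grouped = df.groupby("A")["lat"]
group_mean = grouped.transform("mean")  # per-row mean of its own group
group_std = grouped.transform("std")    # sample standard deviation (ddof=1)

zscore = (df["lat"] - group_mean) / group_std

# Keep values whose |z| <= 1.5; everything else becomes the group mean.
# Rows with an undefined z-score (NaN) also fall back to the group mean.
df["lat_clean"] = df["lat"].where(zscore.abs() <= 1.5, group_mean)

print(df)
```

Because `transform` returns a result aligned with the original rows, no explicit loop over groups is required, which matters for a large dataframe.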
How to divide one column by another where one dataframe’s column value corresponds to another dataframe’s column’s value in Python Pandas?
Consider the following data frames in Python Pandas:

DataframeA
ColA  ColB  ColC
1     dog   439
1     cat   932
1     frog  932
2     dog   2122
2     cat   454
2     frog  773
3     dog   9223
3     cat   3012
3     frog  898

DataframeB
ColD  ColE
1     101
2     314
3     124

To note, ColB just repeats its string values as ColA iterates upwards.
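The rest of the question is cut off, so the sketch below assumes the goal is to divide ColC by the ColE value whose ColD matches ColA. It builds a lookup from DataframeB with `set_index` and applies it with `Series.map`. The result column name `ColC_div_ColE` is made up for illustration.

```python
import pandas as pd

df_a = pd.DataFrame(
    {
        "ColA": [1, 1, 1, 2, 2, 2, 3, 3, 3],
        "ColB": ["dog", "cat", "frog"] * 3,
        "ColC": [439, 932, 932, 2122, 454, 773, 9223, 3012, 898],
    }
)
df_b = pd.DataFrame({"ColD": [1, 2, 3], "ColE": [101, 314, 124]})

# Look up the ColE value for each row of df_a via ColA == ColD.
divisor = df_a["ColA"].map(df_b.set_index("ColD")["ColE"])

# Assumption: ColC is the numerator and ColE the denominator.
df_a["ColC_div_ColE"] = df_a["ColC"] / divisor

print(df_a)
```

A `merge` on `ColA`/`ColD` would work as well; `map` is used here because only a single column is needed from the second dataframe.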
Jupyter Notebook ImportError: cannot import name ‘example_var’
When I change/add a variable in my config.py file and then try to import it into my Jupyter Notebook I get: ImportError: cannot import name ‘example_var’ from ‘config’ config.py: jp_notebook.ipynb: But after I restart the Jupyter kernel it works fine until I modify the config.py file again. I read somewhere that it’s because Jupyter already cached that import. Is there
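Python keeps imported modules in `sys.modules`, so a second `import` does not re-read the file. A small self-contained sketch of refreshing a module with `importlib.reload` follows. To stay runnable anywhere it writes a throwaway module named `demo_config` (a hypothetical stand-in for config.py) into a temporary directory and then simulates editing it.

```python
import importlib
import pathlib
import sys
import tempfile

# Create a throwaway module file standing in for config.py.
tmp_dir = tempfile.mkdtemp()
module_path = pathlib.Path(tmp_dir) / "demo_config.py"
module_path.write_text("first_var = 1\n")

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import demo_config  # first import: the module is now cached in sys.modules

# Simulate editing the file to add a new variable.
module_path.write_text("first_var = 1\nexample_var = 'added later'\n")
importlib.invalidate_caches()

# Re-execute the module's source so the cached module picks up the change.
importlib.reload(demo_config)

from demo_config import example_var

print(example_var)
```

In a notebook, IPython's `autoreload` extension (`%load_ext autoreload` followed by `%autoreload 2`) is another commonly used option that reloads modules before each cell runs.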
Weibull: R vs Python – slightly different results
I’m trying to replicate R’s fitdist() results (reference, cannot modify R code) in Python using scipy.stats. The results are quite close but still different (the difference is not at an acceptable level). Does anybody know why the results are different? How can I reduce the difference between the results? The scipy.stats.weibull_min definition (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.weibull_min.html) seems to be the same as R’s Weibull (https://stat.ethz.ch/R-manual/R-devel/library/stats/html/Weibull.html). Data
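The data and code are not shown in this excerpt, so the cause of the discrepancy cannot be confirmed. One difference worth checking is that `scipy.stats.weibull_min.fit` also estimates a location parameter by default, while R's Weibull distribution has only shape and scale. The sketch below, on synthetic data generated purely for illustration, compares the default fit with one where the location is fixed at zero via `floc=0`.

```python
import numpy as np
from scipy import stats

# Synthetic data for illustration only; the question's data is not shown.
rng = np.random.default_rng(0)
data = stats.weibull_min.rvs(c=1.5, scale=2.0, size=500, random_state=rng)

# Default fit: shape, location, and scale are all estimated.
shape_free, loc_free, scale_free = stats.weibull_min.fit(data)

# Two-parameter fit: location is held at 0, only shape and scale are estimated.
shape, loc, scale = stats.weibull_min.fit(data, floc=0)

print("free loc :", shape_free, loc_free, scale_free)
print("loc fixed:", shape, loc, scale)
```

Remaining small differences could still come from the optimizers and their tolerances, which is an assumption to verify against the actual R call.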
Converting a multiindex dataframe to a nested dictionary [closed]
I have a grouped dataframe as shown in this link: I want to
Convert timeseries csv in Python
I want to convert a CSV file of time-series data with multiple sensors. This is what the data currently looks like: The different sensors are described by numbers and have different numbers of axes. If a new activity is labeled, everything below belongs to this new label. The label is in the same column as the first entry of each