Calling an attribute defined in a method from another method in data science (python)

I’m learning object-oriented programming in a data science context. I want to understand what good practice is for writing methods within a class that relate to one another. When I run my code: I get the following output (only part of the output is shown due to space constraints): I am happy with the output generated by each method. But if I try to call print(data.quality_fun()) without first calling print(data.prepper_fun()), I get an error: AttributeError: 'MyData' object has no attribute 'df'. Being new to object-oriented programming, I am wondering if it is considered good practice to
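One common way to avoid this kind of AttributeError is to declare the attribute in __init__ and let the dependent method build it on demand. The following is only a minimal sketch: the class name MyData and the method names come from the question, but the constructor argument and the method bodies are assumptions, since the original code is not shown.

```python
import pandas as pd


class MyData:
    def __init__(self, records):
        # Hypothetical constructor: the original signature is not shown.
        self.records = records
        # Declare df up front so the attribute always exists.
        self.df = None

    def prepper_fun(self):
        # Assumed body: build the DataFrame from the raw records.
        self.df = pd.DataFrame(self.records)
        return self.df

    def quality_fun(self):
        # Build df on demand instead of relying on the caller
        # to run prepper_fun() first.
        if self.df is None:
            self.prepper_fun()
        # Assumed quality check: count missing values per column.
        return self.df.isna().sum()


data = MyData([{"a": 1, "b": None}, {"a": 2, "b": 3}])
missing = data.quality_fun()  # works without calling prepper_fun() first
print(missing)
```

With this layout the order in which the two methods are called no longer matters.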

Empty dataframe when filtering

I have a dataframe that looks like this: Now I’d like to filter like this: However, I am getting an empty dataframe. What am I doing wrong here? Answer Try df1.loc[df1['PZAE'] == "'HAE'"] Details: the column 'PZAE' contains strings that begin and end with a single quote ('), so the quotes have to be included in the condition.
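The effect described in the answer can be reproduced with a small sketch. The sample rows below are made up for illustration; only the column name PZAE and the value 'HAE' come from the question. Stripping the embedded quotes first is an alternative, not part of the original answer.

```python
import pandas as pd

# Hypothetical data: the stored strings include literal single quotes.
df1 = pd.DataFrame({"PZAE": ["'HAE'", "'XYZ'", "'HAE'"], "n": [1, 2, 3]})

# Comparing against the bare text matches nothing.
empty = df1.loc[df1["PZAE"] == "HAE"]

# Including the quotes in the condition, as the answer suggests.
matched = df1.loc[df1["PZAE"] == "'HAE'"]

# Alternative: strip the embedded quotes before comparing.
stripped = df1.loc[df1["PZAE"].str.strip("'") == "HAE"]

print(len(empty), len(matched), len(stripped))
```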

'BACKTICK_QUOTED_STRING__AT_key' is not defined for Pandas Querying function

Trying to get the hang of the pandas query function. Getting an UndefinedVariableError: name 'BACKTICK_QUOTED_STRING__AT_key' is not defined for the pandas code below. Where am I going wrong? Answer Try this: or
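The answer's code is not shown, so the following is not a reconstruction of it. It is a sketch of two workarounds that sidestep backtick quoting entirely, assuming the troublesome column is literally named "@key" (a guess based on the error message) and holds string values.

```python
import pandas as pd

# Hypothetical frame with a column name that is not a valid identifier.
df = pd.DataFrame({"@key": ["a", "b", "a"], "value": [1, 2, 3]})

# Option 1: plain boolean indexing, which never parses the column name.
by_mask = df[df["@key"] == "a"]

# Option 2: rename to a valid identifier, then use query as usual.
by_query = df.rename(columns={"@key": "key"}).query("key == 'a'")

print(by_mask["value"].tolist(), by_query["value"].tolist())
```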

How to combine rows into separate dataframes in Python pandas

I have the following dataset: I want to combine x, y and z into another dataframe like this: and I want one such dataframe for each occurrence of x, y and z (first, second, third and so on). How can I select and combine them? Desired output: Answer Use GroupBy.cumcount as a counter and then loop over another groupby object:
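The cumcount approach named in the answer might look like the sketch below. The column names name and value and the sample rows are assumptions, since the original dataset is not shown.

```python
import pandas as pd

# Hypothetical data: x, y and z each occur twice.
df = pd.DataFrame({
    "name": ["x", "y", "z", "x", "y", "z"],
    "value": [1, 2, 3, 4, 5, 6],
})

# Number each occurrence within its name: first -> 0, second -> 1, ...
counter = df.groupby("name").cumcount()

# Group by that counter and keep one dataframe per occurrence.
frames = {
    i: group.reset_index(drop=True)
    for i, group in df.groupby(counter)
}

for i, frame in frames.items():
    print(i)
    print(frame)
```

Here frames[0] holds the first x, y and z rows, and frames[1] holds the second set.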

Use DataFrame column as index and append duplicates as new columns

I have a DataFrame that contains a column with dates which I’d like to use as my DataFrame’s index. The dates in that column are not necessarily unique – sometimes there might be duplicates. I wish to append duplicates as new columns. Dates that are unique can just contain NaN (or whatever) for the newly appended columns. To clarify I’ll provide an example: This will yield: What I want: The naming of the newly appended columns can be arbitrary. I don’t even know whether appending would be the right way to go about it. Maybe it’s easier to create a
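One possible approach, sketched under assumptions because the excerpt is cut off and no answer is shown: number the duplicates within each date and pivot on that counter. The column names date and value and the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical data with one duplicated date.
df = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-01", "2020-01-02"],
    "value": [1, 2, 3],
})

# Position of each row among the rows that share its date.
df["n"] = df.groupby("date").cumcount()

# One row per date, one column per duplicate; missing cells become NaN.
wide = df.pivot(index="date", columns="n", values="value")

print(wide)
```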

Pandas – iloc – comparing value to the cell below

For the following table: Using Pandas, I would like to achieve the desired_output column, which is TRUE when the value below the current cell is different and FALSE otherwise. I have tried the following code, but an error occurs. Answer
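The answer text is missing, so the following is just one way to get such a column: compare each value with the column shifted up by one row. The column name value and the sample numbers are assumptions. Note that the last row has no cell below it, so it compares against NaN and comes out True; whether that matches the desired output is not known.

```python
import pandas as pd

# Hypothetical table.
df = pd.DataFrame({"value": [1, 1, 2, 2, 3]})

# shift(-1) aligns each row with the value directly below it.
df["desired_output"] = df["value"] != df["value"].shift(-1)

print(df)
```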

Create Dataframe by calling indices of df1 that are listed in df2

I’m new to Python pandas and have been struggling with the following problem for a while now. The values of the following dataframe, df1, are indices that refer to rows of df2: df2 contains the values that belong to those indices. For example, df1 shows the value '0' in column 'Name161'. Then df3 should show the value that is listed in df2 at index 0, in this case '164'. So far, I have df3 showing the first 3 values of df2, but of course that's not what I would like to
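A lookup like this can be sketched with Series.map, which translates each entry of df1 through the index of df2. The layout of df2 (a single column named value) and every number except the 0 -> 164 pairing are assumptions; the column Name162 is hypothetical as well.

```python
import pandas as pd

# Hypothetical lookup table: the index is the key, "value" is the result.
df2 = pd.DataFrame({"value": [164, 200, 300]})

# Hypothetical df1: every cell is an index into df2.
df1 = pd.DataFrame({"Name161": [0, 2], "Name162": [1, 0]})

# Map each column of df1 through df2's index -> value relationship.
df3 = df1.apply(lambda col: col.map(df2["value"]))

print(df3)
```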

Read a TSV file from a remote server

I have this function, which returns the path to the file I need to read: Later on, I try to open the file. db_file holds one of the paths above. When I execute the script I get this error: I have checked the file names and the paths; they all exist and are in the right location. I have tried the following and got the same traceback: Answer Your file is on a remote server. Use paramiko, which can read it over SFTP.
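A rough sketch of reading a remote TSV with paramiko follows. The function names, the host, the credentials and the remote path are all placeholders, and it assumes the server accepts SFTP with password authentication and that its host key is already in the local known_hosts file. Parsing with pandas is a choice made here, not something the answer specifies.

```python
import io

import pandas as pd


def parse_tsv_bytes(raw):
    """Parse tab-separated bytes into a DataFrame."""
    return pd.read_csv(io.BytesIO(raw), sep="\t")


def read_remote_tsv(host, username, password, remote_path):
    """Fetch a TSV file over SFTP and parse it (all arguments are placeholders)."""
    import paramiko  # imported here so the parser above works without it

    client = paramiko.SSHClient()
    # Assumes the server's host key is in the local known_hosts file.
    client.load_system_host_keys()
    client.connect(host, username=username, password=password)
    try:
        sftp = client.open_sftp()
        with sftp.open(remote_path, "r") as remote_file:
            raw = remote_file.read()  # SFTP files return bytes
    finally:
        client.close()
    return parse_tsv_bytes(raw)


# Local check of the parsing step only; no server is contacted here.
sample = parse_tsv_bytes(b"a\tb\n1\t2\n")
print(sample)
```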

Calculate time between two different values in the same pandas column

I have data that look like the following: I need to create a new column that holds the time between the first issue and the first resolved. I need a groupby statement that keeps the first issue and the first resolved for every issue, and then computes the time between them. When I group by Device and Condition, it keeps just one issue per device. The desired output is like the following: As grouping by Device and Condition is not enough, I thought I would create an index column and then use a pivot table for the time calculations. Answer The biggest
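Since the answer is cut off, the following is not the answerer's method. It is one sketch of the idea: keep the first row of each consecutive run of the same condition per device, then pair every issue with the start of the run that follows it. The column name Time, the condition labels "issue" and "resolved", and the sample rows are assumptions.

```python
import pandas as pd

# Hypothetical log for one device.
df = pd.DataFrame({
    "Device": ["A"] * 6,
    "Condition": ["issue", "issue", "resolved", "resolved", "issue", "resolved"],
    "Time": pd.to_datetime([
        "2021-01-01 10:00", "2021-01-01 10:05", "2021-01-01 10:20",
        "2021-01-01 10:30", "2021-01-01 11:00", "2021-01-01 11:15",
    ]),
})
df = df.sort_values(["Device", "Time"])

# A row starts a new run when its condition differs from the previous
# row of the same device (the first row per device always qualifies).
previous = df.groupby("Device")["Condition"].shift()
starts = df[df["Condition"] != previous].copy()

# Time at which the next run begins on the same device.
starts["next_time"] = starts.groupby("Device")["Time"].shift(-1)

# For each first issue, the next run start is its first resolved.
issues = starts[starts["Condition"] == "issue"].copy()
issues["elapsed"] = issues["next_time"] - issues["Time"]

print(issues[["Device", "Time", "next_time", "elapsed"]])
```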

How to print MLB data into Pandas DataFrame?

I am still learning how to web scrape and could use some help. I would like to print the MLB data into a Pandas DataFrame. It looks like the program does not run correctly but I did not receive an error. Any suggestions would be greatly appreciated. Thanks in advance for any help that you may offer. Answer That page contains a text file in CSV format. So load it with pandas like this: And that should get you what you are looking for.
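The loading step from the answer can be sketched as follows. The page URL is not given in the excerpt, so none is filled in here; the helper name load_csv and the in-memory sample text are placeholders. pandas.read_csv accepts a URL string as well as a file-like object, so the same call should work when the actual address is passed in.

```python
import io

import pandas as pd


def load_csv(source):
    """Load CSV data from a URL, a local path, or a file-like object."""
    return pd.read_csv(source)


# Placeholder CSV text standing in for the page contents.
sample_text = "name,team,hits\nplayer_a,team_x,10\nplayer_b,team_y,7\n"
df = load_csv(io.StringIO(sample_text))

print(df)
```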