Tag: pandas

Pandas: convert dtype ‘object’ to int

I’ve read an SQL query into Pandas and the values are coming in as dtype ‘object’, although they are strings, dates and integers. I am able to convert the date ‘object’ to a Pandas datetime dtype, but I’m getting an error when trying to convert the string and integers. Here is an example: Converting the df[‘date’] to a datetime works:
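A minimal sketch of the conversion being described, assuming a hypothetical frame whose `date`, `id` and `name` columns all arrive as dtype 'object' (the column names and values are invented for illustration and are not from the original question):

```python
import pandas as pd

# Hypothetical stand-in for a SQL result in which every column is dtype 'object'.
df = pd.DataFrame(
    {
        "date": ["2014-01-01", "2014-01-02", "2014-01-03"],
        "id": ["1", "2", "3"],
        "name": ["a", "b", "c"],
    },
    dtype="object",
)

# The date column converts with to_datetime, as the question says.
df["date"] = pd.to_datetime(df["date"])

# to_numeric parses the strings; errors="raise" (the default) fails loudly
# on the first value that is not a valid number.
df["id"] = pd.to_numeric(df["id"], errors="raise").astype("int64")

print(df.dtypes)
```

If some values may not be numeric, `errors="coerce"` turns them into NaN instead, which leaves the column as float rather than int.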

Stack columns above value labels in pandas pivot table

pandas python

Given a dataframe that looks like:

    Key1 Key2    Value1    Value2
0    one    A  1.405817  1.307511
1    one    B -0.037627 -0.215800
2    two    C -0.116591 -1.195066
3  three    A  2.044775 -1.207433
4    one    B -1.109636  0.031521
5    one    C -1.529597  1.761366
6    two    A -1.349865  0.321454
7  three    B  0.814374  2.285579
8    one    C  0.178702  0.479210
9    one    A  0.718921  0.504311

Convert pandas DataFrame to dict where each value is a list of values of multiple columns

dataframe dictionary pandas python

Let’s say I have the DataFrame
I want to create a dictionary in the form
Solutions I have found deal with the case of creating a dict with single values using something like
Answer: Set 'filename' as the index, take the transpose, then use to_dict with orient='list'. The resulting output:
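The steps named in the answer can be sketched as follows. Only the 'filename' column comes from the text; the other columns and all values are hypothetical:

```python
import pandas as pd

# Hypothetical data: only the 'filename' column name comes from the answer.
df = pd.DataFrame(
    {
        "filename": ["a.txt", "b.txt"],
        "x": [1, 2],
        "y": [3, 4],
    }
)

# Index by filename, transpose so each filename becomes a column,
# then collect each column's values into a list.
result = df.set_index("filename").T.to_dict(orient="list")

print(result)  # {'a.txt': [1, 3], 'b.txt': [2, 4]}
```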

Pyspark: display a spark data frame in a table format

apache-spark-sql pandas pyspark python

I am using pyspark to read a parquet file like below: Then when I do my_df.take(5), it will show [Row(…)], instead of a table format like when we use the pandas data frame. Is it possible to display the data frame in a table format like pandas data frame? Thanks!
Answer: The show method does what you’re looking for. For

Python: Convert map in kilometres to degrees

degrees pandas projection python

I have a pandas Dataframe with a few million rows, each with an X and Y attribute with their location in kilometres according to the WGS 1984 World Mercator projection (created using ArcGIS). What is the easiest way to project these points back to degrees, without leaving the Python/pandas environment?
Answer: Many years later, this is how I would do

How to filter a pandas series with a datetime index on the quarter and year

datetime datetimeindex pandas python

I have a Series, called ‘scores’, with a datetime index. I wish to subset it by quarter and year. Pseudocode: series.loc['q2 of 2013']
Attempts so far:
s.dt.quarter
AttributeError: Can only use .dt accessor with datetimelike values
s.index.dt.quarter
AttributeError: 'DatetimeIndex' object has no attribute 'dt'
This works (inspired by this answer), but I can’t believe it is the right way to
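Both errors above come from the same cause: .dt is an accessor for Series values, while a DatetimeIndex exposes quarter and year as attributes directly. A minimal sketch with made-up data (the daily index and values are assumptions, and this is not necessarily the approach the truncated excerpt goes on to show):

```python
import pandas as pd

# Hypothetical daily scores covering 2013.
idx = pd.date_range("2013-01-01", "2013-12-31", freq="D")
scores = pd.Series(range(len(idx)), index=idx)

# A DatetimeIndex has .quarter and .year directly, no .dt needed.
mask = (scores.index.quarter == 2) & (scores.index.year == 2013)
q2_2013 = scores[mask]

print(q2_2013.index.min(), q2_2013.index.max())
```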

Python Pandas dataframe reading exact specified range in an excel sheet

excel pandas python

I have a lot of different tables (and other unstructured data) in an Excel sheet. I need to create a dataframe out of range ‘A3:D20’ from ‘Sheet2’ of Excel sheet ‘data’. All examples that I come across drill down to sheet level, but do not show how to pick an exact range. Once I get this, I plan to

Could pandas use column as index?

pandas python

I have a spreadsheet like this: I don’t want to manually swap the column with the row. Is it possible to have pandas read the data into a list like this:
Answer: Yes, with pandas.DataFrame.set_index you can make ‘Locality’ your row index. If inplace=True is not provided, set_index returns the modified dataframe as a result. Example:
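A minimal sketch of set_index, assuming a hypothetical spreadsheet with a 'Locality' column and two year columns (the year columns and all values are invented for illustration):

```python
import pandas as pd

# Hypothetical data: only the 'Locality' column name comes from the answer.
df = pd.DataFrame(
    {
        "Locality": ["A", "B", "C"],
        "2005": [10, 20, 30],
        "2006": [11, 21, 31],
    }
)

# Without inplace=True, set_index returns a new frame and leaves df unchanged.
indexed = df.set_index("Locality")

# Rows can now be looked up by locality label.
value = indexed.loc["B", "2006"]
print(value)  # 21
```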

Writing large Pandas Dataframes to CSV file in chunks

dataframe export-to-csv large-data pandas python

How do I write large data files out to a CSV file in chunks? I have a set of large data files (1M rows x 20 cols). However, only 5 or so columns of the data files are of interest to me. I want to make things easier by making copies of these files with only the columns of
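One way to approach this, sketched under assumptions: read the source CSV in chunks with only the wanted columns, and append each chunk to the output file. The helper name copy_columns_in_chunks, the file names and the column names are all hypothetical, and the tiny demo file stands in for the real 1M-row data:

```python
import os
import tempfile

import pandas as pd


def copy_columns_in_chunks(src, dst, columns, chunksize=100_000):
    """Copy the selected columns from one CSV to another, one chunk at a time."""
    reader = pd.read_csv(src, usecols=columns, chunksize=chunksize)
    for i, chunk in enumerate(reader):
        # First chunk creates the file and writes the header; later chunks append.
        chunk.to_csv(dst, mode="w" if i == 0 else "a", header=(i == 0), index=False)


# Small demo with hypothetical file and column names.
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "big.csv")
dst = os.path.join(tmpdir, "subset.csv")
pd.DataFrame(
    {"a": range(10), "b": range(10), "c": range(10), "d": range(10)}
).to_csv(src, index=False)

copy_columns_in_chunks(src, dst, ["a", "c"], chunksize=3)
subset = pd.read_csv(dst)
print(subset.shape)  # (10, 2)
```

Only one chunk is held in memory at a time, so chunksize can be tuned to the memory available.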

Get HTML table into pandas Dataframe, not list of dataframe objects

dataframe html-parsing pandas python

I apologize if this question has been answered elsewhere but I have been unsuccessful in finding a satisfactory answer here or elsewhere. I am somewhat new to python and pandas and having some difficulty getting HTML data into a pandas dataframe. In the pandas documentation it says .read_html() returns a list of dataframe objects, so when I try to do