Tag: pandas

xlsxwriter formatting a range

In xlsxwriter, once a format is defined, how can you apply it to a range and not to the whole column or the whole row? For example: this gets applied to the whole “B” column, but how can this “perc_fmt” be applied to a range? For example, if I do: it says: Answer Actually I found a workaround that avoids

Groupby and lag all columns of a dataframe?

I want to lag every column in a dataframe, by group. I have a frame like this: which looks like and I want it to look like this: This question manages the result for a single column, but I have an arbitrary number of columns, and I want to lag all of them. I can use groupby and apply, but
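A minimal sketch of one way to lag every column within each group, assuming a recent pandas version. The frame from the question is not shown in this excerpt, so the column names `group`, `x`, and `y` and the `_lag1` suffix are hypothetical. Calling `shift` on the grouped object shifts all non-key columns at once, which avoids `apply`.

```python
import pandas as pd

# Hypothetical example data; the original question's frame is not shown here.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b"],
    "x": [1, 2, 3, 10, 20],
    "y": [5.0, 6.0, 7.0, 50.0, 60.0],
})

# shift(1) on a GroupBy lags every non-key column within its own group,
# so the first row of each group becomes NaN.
lagged = df.groupby("group").shift(1).add_suffix("_lag1")

# The lagged frame keeps the original index, so it can be joined back.
result = df.join(lagged)
print(result)
```

Because the shift happens per group, values never leak from the end of one group into the start of the next.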

Write Large Pandas DataFrames to SQL Server database

I have 74 relatively large Pandas DataFrames (about 34,600 rows and 8 columns each) that I am trying to insert into a SQL Server database as quickly as possible. After doing some research, I learned that the good ole pandas.to_sql function is not good for such large inserts into a SQL Server database, which was the initial approach that I took

How to read a Parquet file into Pandas DataFrame?

How to read a modestly sized Parquet data-set into an in-memory Pandas DataFrame without setting up a cluster computing infrastructure such as Hadoop or Spark? This is only a moderate amount of data that I would like to read in-memory with a simple Python script on a laptop. The data does not reside on HDFS. It is either on the

Python – Command “python setup.py egg_info” failed with error code 1 in /tmp/pip-build-21ft0H/pandas

I’m using CentOS 7 and Python 2.7.5. When I install Pandas, I get this error message I have already tried a lot of solutions without success, even yum -y update. Can’t install via pip because of egg_info error Python pip install fails: invalid command egg_info https://www.digitalocean.com/community/tutorials/how-to-set-up-python-2-7-6-and-3-3-3-on-centos-6-4 pip fails to install anything, error: invalid command ‘egg_info’ Answer I

Pandas dataframe from nested dictionary

My dictionary looks like this: I want to get a dataframe that looks like this: I tried calling pandas.from_dict(), but it did not give me the desired result. So, what is the most elegant, practical way to achieve this? EDIT: In reality, my dictionary is of depth 4, so I’d like to see a solution for that case, or ideally,
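One possible depth-independent approach, sketched under assumptions: the dictionary and the desired layout are not shown in this excerpt, so the sample data, the helper name `flatten`, and the choice to turn the innermost keys into columns are all hypothetical. The idea is to flatten each leaf into a tuple of keys, build a `MultiIndex` from those tuples, and unstack the last level. It assumes every leaf sits at the same depth.

```python
import pandas as pd

def flatten(nested, path=()):
    """Yield (key_path, leaf_value) pairs for a dict nested to any depth."""
    for key, value in nested.items():
        if isinstance(value, dict):
            yield from flatten(value, path + (key,))
        else:
            yield path + (key,), value

# Hypothetical depth-3 data; the same code handles depth 4 or more.
data = {
    "u1": {"2020": {"a": 1, "b": 2}, "2021": {"a": 3, "b": 4}},
    "u2": {"2020": {"a": 5, "b": 6}},
}

flat = dict(flatten(data))
index = pd.MultiIndex.from_tuples(list(flat.keys()))
series = pd.Series(list(flat.values()), index=index)

# The innermost keys become columns; the outer keys stay as a row MultiIndex.
frame = series.unstack()
print(frame)
```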

How to convert datetime object to milliseconds

I am parsing datetime values as follows: How can I convert these datetime objects to milliseconds? I didn’t see any mention of milliseconds in the doc of to_datetime. Update (Based on feedback): This is the current version of the code, which raises the error TypeError: Cannot convert input to Timestamp. The column Date3 must contain milliseconds (as a numeric equivalent of a
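A hedged sketch of one way to get milliseconds since the Unix epoch from a parsed datetime column. The original parsing code is not shown in this excerpt, so the input column name `Date` and the sample values are hypothetical. `Date3` is the column name from the question. The sketch assumes the timestamps are timezone-naive and should be read as UTC.

```python
import pandas as pd

# Hypothetical input; the question's own data is not shown here.
df = pd.DataFrame({"Date": ["2014-01-01 00:00:00.500", "1970-01-01 00:00:01.000"]})
df["Date"] = pd.to_datetime(df["Date"])

# Subtracting the epoch gives a Timedelta column; floor-dividing by a
# one-millisecond Timedelta yields whole milliseconds as integers.
epoch = pd.Timestamp("1970-01-01")
df["Date3"] = (df["Date"] - epoch) // pd.Timedelta(milliseconds=1)
print(df)
```

Dividing by a `Timedelta` rather than casting the datetime column to an integer keeps the result independent of the column's internal time resolution.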