How do I reduce a set of columns along another set of columns, holding all other columns fixed?

I think this is a simple operation, but for some reason I’m not finding immediate indicators in my quick perusal of the Pandas docs.

I have prototype working code below, but it seems kinda dumb IMO. I’m sure that there are much better ways to do this, and concepts to describe it.

Is there a better way? If not, is there at least a better way to describe it?

Abstract Problem

Basically, I have columns p0, p1, y0, y1, plus some others (written as ... here). The ... columns are just things I’d like held constant (they remain as separate columns in the table). p0, p1 are things I’d like to reduce against. y0, y1 are the columns I’d like to be reduced.
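
To make the column roles concrete, here is a small made-up table. It is only a sketch: the column name h0 is a hypothetical stand-in for the ... columns, and all of the values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the layout described above.
# "h0" plays the role of the "..." columns (held constant);
# p0, p1 are reduced against; y0, y1 are the values to be reduced.
df = pd.DataFrame({
    "h0": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "p0": [0, 0, 1, 1, 0, 0, 1, 1],
    "p1": [0, 1, 0, 1, 0, 1, 0, 1],
    "y0": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "y1": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0],
})

hold_cols = ["h0"]
reduce_against_cols = ["p0", "p1"]
reduce_cols = ["y0", "y1"]

# Desired result shape: one row per distinct combination of hold_cols,
# with p0/p1 gone and y0/y1 each collapsed to a single value.
n_expected_rows = len(df[hold_cols].drop_duplicates())
```

With this table, the reduced result should have one row for each distinct value of h0.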

DataFrame.groupby didn’t seem like what I wanted. When perusing the code, I wasn’t sure if anything else was what I wanted. Multi-indexing also seemed like a possible context, but I didn’t immediately see an example of what I desired.

Here’s the code that does what I want:

Background

I am running some sweeps for ML stuff; in it, I sweep on some model architecture param, as well as dataset size and the seed that controls random initialization of the model parameters.

I’d like to reduce along the seed to get a “feel” for what architectures are possibly more robust to initialization; for now, I’d like to see what dataset size helps the most. In the future, I’d like to do (heuristic) reduction along dataset size as well.

Answer

Actually, looks like DataFrame.groupby(hold_cols).agg({k: ["mean"] for k in reduce_cols}) is what I want. Source: https://jamesrledoux.com/code/group-by-aggregate-pandas
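
As a minimal sketch of how that call behaves, here it is applied to a made-up sweep table. The column names (arch, n_train, seed, loss, acc) and all values are hypothetical; only the groupby(...).agg(...) expression comes from the line above.

```python
import pandas as pd

# Hypothetical sweep results: one row per (arch, n_train, seed) run.
df = pd.DataFrame({
    "arch": ["small", "small", "small", "small", "big", "big", "big", "big"],
    "n_train": [100, 100, 1000, 1000, 100, 100, 1000, 1000],
    "seed": [0, 1, 0, 1, 0, 1, 0, 1],
    "loss": [0.9, 0.7, 0.5, 0.3, 0.8, 0.4, 0.2, 0.2],
    "acc": [0.5, 0.7, 0.8, 0.9, 0.6, 0.8, 0.9, 0.95],
})

hold_cols = ["arch", "n_train"]  # kept as separate rows
reduce_cols = ["loss", "acc"]    # averaged within each group

# "seed" is named in neither list, so it is the column reduced away.
summary = df.groupby(hold_cols).agg({k: ["mean"] for k in reduce_cols})

# The result has a MultiIndex on the rows (hold_cols) and on the columns
# (column name, aggregation name); flatten both to get a plain table.
flat = summary.copy()
flat.columns = ["_".join(col) for col in flat.columns]
flat = flat.reset_index()
```

To get a feel for sensitivity to the seed, the aggregation list could be extended to ["mean", "std"], which adds a standard deviation column per reduced column.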

User contributions licensed under: CC BY-SA