Pandas Grouping by Hostname: Average of Sessions (on Host) by Hour

The dataframe looks like this.

JavaScript

What I am trying to show is the average number of sessions per hour for each individual hostname.

So I would get something back like this.

JavaScript

I think I’m getting my grouping wrong: when I try this, what I typically end up with is only the largest hourly average for any given hostname, ordered by date and hour.

For example, I may see something like:

JavaScript

I get this rather than the full 24 hours listed for each hostname.

The code I tried was:

JavaScript

What do I need to do to get the desired result?

Answer

Here is an example based on the data you provided. I added steps to convert the dates to datetime (in case they were stored as objects) and to set that column as a DatetimeIndex, which resample requires. It would go something like this:

JavaScript

You can adapt this example to other purposes. As I understand your question, you want the hourly mean number of sessions. See the documentation for resample if you need a different grouping frequency.
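Since the original snippet is not reproduced here, the following is only a minimal sketch of the resample approach described above. The column names (`hostname`, `date`, `sessions`) and the sample values are assumptions made for illustration, not the asker's actual data.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "hostname": ["host-a", "host-a", "host-a", "host-b", "host-b", "host-b"],
    "date": [
        "2021-01-01 00:05:00", "2021-01-01 00:35:00", "2021-01-01 01:10:00",
        "2021-01-01 00:15:00", "2021-01-01 01:20:00", "2021-01-01 01:50:00",
    ],
    "sessions": [10, 20, 30, 4, 6, 8],
})

# Convert the date strings to datetime and use them as the index,
# because resample needs a datetime-like index.
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date")

# Group by hostname first, then average the sessions in 60-minute bins.
hourly = df.groupby("hostname")["sessions"].resample("60min").mean()
print(hourly)
```

The result is a Series indexed by (`hostname`, hourly timestamp), so every hostname keeps its own row for each hour instead of being collapsed into one value per hour.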

The alternative is to separate the date and the time, then take the mean:

JavaScript

which gives

enter image description here
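As a rough sketch of this second approach, one could extract the hour of day into its own column and group on it together with the hostname. Again, the column names (`hostname`, `date`, `sessions`), the helper column `hour`, and the sample values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data spanning two days; names and values are assumptions.
df = pd.DataFrame({
    "hostname": ["host-a", "host-a", "host-a", "host-b", "host-b", "host-b"],
    "date": [
        "2021-01-01 00:05:00", "2021-01-02 00:10:00", "2021-01-01 01:00:00",
        "2021-01-01 00:15:00", "2021-01-02 00:45:00", "2021-01-02 01:30:00",
    ],
    "sessions": [10, 20, 30, 4, 6, 8],
})

df["date"] = pd.to_datetime(df["date"])

# Separate the hour of day (0-23) from the full timestamp.
df["hour"] = df["date"].dt.hour

# Average sessions per hostname and hour of day, across all dates,
# then pivot hostnames into columns for a compact hour-by-host table.
by_hour = (
    df.groupby(["hostname", "hour"])["sessions"]
      .mean()
      .unstack("hostname")
)
print(by_hour)
```

Unlike the resample version, this averages the same hour of day across different dates, which gives at most 24 rows per hostname once the data covers every hour.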