Using a column of values to create a counter for a variable sequential number column

I have a pandas DataFrame and want to build a column, Sequential, that records which iteration of the current cycle each row belongs to. I’m currently doing this with itertools.cycle and a fixed number of iterations per cycle, block_cycles, like so:

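import itertools

# Fill out Sequential numbers: 1..block_cycles, repeated to cover every row
block_cycles = 330
lens = len(raw_data.index)
# range() excludes its stop value, so use block_cycles + 1 to include 330
sequential = list(itertools.islice(itertools.cycle(range(1, block_cycles + 1)), lens))
interim_output['Sequential'] = sequential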

With an output like this:

print(interim_output['Sequential'])

0    1
1    2
2    3
...
329  330
330  1
331  2
332  3

This would be fine if every cycle contained the same number of iterations. However, I’ve found that the number of iterations varies from cycle to cycle. I have another column, CycleNumber, that records which cycle each iteration belongs to. It looks like this:

print(raw_data['CycleNumber'])

0           1
1           1
2           1
3           1
4           1

51790    4936
51791    4936
51792    4936
51793    4936
51794    4936

So, for example, one cycle might contain 330 iterations and another 333 or 331; the length is not guaranteed to be the same. The values in CycleNumber increase incrementally from one cycle to the next.

I’ve built a dictionary, cycle_freq, of the number of iterations each cycle contains, like this:

# Calculate the number of iterations each cycle contains
# Assumed: cycle_number is the CycleNumber column shown above
cycle_number = raw_data['CycleNumber']
cycle_freq = {}
for item in cycle_number:
    if item in cycle_freq:
        cycle_freq[item] += 1
    else:
        cycle_freq[item] = 1

print(cycle_freq)

{1: 330, 2: 332, 3: 331, 4: 332, 5: 332, 6: 333, 7: 333, 8: 330....
4933: 331, 4934: 334, 4935: 287, 4936: 24}
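
For reference, the same tally can probably be written more compactly with collections.Counter from the standard library. This is only a sketch: the short cycle_number list below is a made-up stand-in for the real CycleNumber column, not actual data.

```python
from collections import Counter

# Hypothetical miniature of the CycleNumber column (not the real data)
cycle_number = [1, 1, 1, 2, 2, 3, 3, 3, 3]

# Counter tallies how many times each cycle number occurs,
# keeping the keys in the order they were first seen
cycle_freq = Counter(cycle_number)

print(dict(cycle_freq))  # {1: 3, 2: 2, 3: 4}
```

Counter is a dict subclass, so cycle_freq[item] lookups and iteration over its items work the same way as with the hand-built dictionary.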

How could I use this dictionary in place of the constant block_cycles, so that the Sequential column counts exactly as many iterations as each cycle actually contains? So far, this is my attempt to use the values in cycle_freq, but to no avail:

for i in cycle_freq:
    iteration = list(itertools.islice(itertools.cycle(range(1, cycle_freq[i])),lens))
    sequential.append(iteration)

My desired output would look like this:

Any help would be greatly appreciated!

Answer

I used a workaround and gave up on itertools:

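sequential = []
# Relies on cycle_freq keeping the cycles in the order they appear in the data
for cycles in cycle_freq.values():
    # Count 1..cycles for this cycle and append to the running column
    sequential.extend(range(1, cycles + 1))

interim_output['Sequential'] = sequential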
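
An alternative that skips the dictionary altogether is to number the rows directly from the CycleNumber values. The sketch below assumes that all rows of a cycle are contiguous, which the sample output suggests. The helper name sequential_numbers and the short input list are hypothetical, chosen only for illustration.

```python
from itertools import groupby


def sequential_numbers(cycle_numbers):
    """Number the rows of each consecutive run of equal cycle numbers from 1."""
    sequential = []
    for _, run in groupby(cycle_numbers):
        # enumerate(run, start=1) pairs each row of this cycle with 1, 2, 3, ...
        sequential.extend(position for position, _ in enumerate(run, start=1))
    return sequential


# Hypothetical miniature of raw_data['CycleNumber'] (not the real data)
cycle_numbers = [1, 1, 1, 2, 2, 3, 3, 3, 3]
print(sequential_numbers(cycle_numbers))  # [1, 2, 3, 1, 2, 1, 2, 3, 4]
```

With the real data this could be called as sequential_numbers(raw_data['CycleNumber']). Staying inside pandas, raw_data.groupby('CycleNumber').cumcount() + 1 should produce the same numbering without an explicit loop, since cumcount numbers the rows of each group starting from 0.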
