How do I loop column names in a pandas dataframe?

I am new to Python and have never really used Pandas, so forgive me if this doesn’t make sense. I am trying to create a df based on frontend data I am sending to a flask route. The data is looped through and appended for each row. My only problem is that I don’t know how to get the df columns to reflect that. Here is my code to build the rows and the current output:

claims = csv_data["claims"]
setups = csv_data["setups"]

rows = []
for i, claim in enumerate(claims):
  row = [i + 1, claim]
  for setup in setups:
    # Mark whether the claim is in this setup's offerings / consideration set
    offered = claim in setup["currentOfferings"]
    row.append("X" if offered else float('nan'))
    row.append("X" if claim in setup["considerationSet"] else float('nan'))
    # Reach and favorite scores are only filled in for offered claims
    if offered:
      row.append("{:.0%}".format(setup["reach"][claim]))
      row.append("{:.0%}".format(setup["favorite"][claim]))
    else:
      row.append(float('nan'))
      row.append(float('nan'))

  rows.append(row)

[image: current output]

I know that I can put columns = [“#”, “Claims”, “Setups”, etc…] in the df, but that doesn’t work because the rows loop through multiple setups, and the number of setups can change. If I don’t specify the column names (as in the image), then I just get numbers as column names. Ideally it should loop through the data it receives in the route and start with “#” and “Claims” as columns, followed for each setup by “Setup 1”, “Consideration Set 1”, “Reach”, “Favorite”, “Setup 2”, “Consideration Set 2”, and so on.

I tried to create a similar type of loop for the columns:

my_columns = []
for i, row in enumerate(rows):
  col = []
  if row[0] != None:
    col.append("#")
  else:
    pass
  if row[1] != None:
    col.append("Claims")
  else:
    pass
  if row[2] != None:
    col.append("Setup")
  else:
    pass
  if row[3] != None:
    col.append("Consideration Set")
  else:
    pass
  if row[4] != None:
    col.append("Reach")
  else:
    pass
  if row[5] != None:
    col.append("Favorite")
  else:
    pass

my_columns.append(col)

df = pd.DataFrame(
    rows,
    columns = my_columns
    )

But this didn’t work because I have the same issue of no loop over the setups: 6 columns are passed, but the data has 10 columns. I’m not sure if I am just not looping over the columns properly, or if I am making everything more complicated than it needs to be.

This is what I am trying to accomplish without having to explicitly name the columns, because this is just sample data. There could end up being 3, 4, or however many setups in the actual app.

[image: what I would like the output to look like]

Answer

I don’t know if this is the most efficient way of doing something like this but I think this is what you want to achieve.

def create_columns(df):
    fixed_cols = ['#', 'Claims']  # these appear once, at the start
    repeated_cols = 4  # number of columns repeated for every setup
    new_cols = list(fixed_cols)
    for i in range(len(df.columns) - len(fixed_cols)):
        idx = 1 + i // repeated_cols  # 1-based setup number
        basic = [f'Setup_{idx}', f'Consideration_Set_{idx}', 'Reach', 'Favorite']
        new_cols.append(basic[i % repeated_cols])
    return new_cols

# df is created without column names first, e.g. df = pd.DataFrame(rows)
df.columns = create_columns(df)
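
Another option might be to build the column names from the number of setups before creating the DataFrame, instead of renaming afterwards. The following is only a sketch, assuming the layout described in the question (two fixed columns, then four columns per setup). The name build_columns and its n_setups parameter are hypothetical and not part of the original code.

```python
def build_columns(n_setups):
    """Return '#', 'Claims', then four names per setup (assumed layout)."""
    columns = ["#", "Claims"]
    for idx in range(1, n_setups + 1):
        # One block of four names for each setup, numbered from 1
        columns += [
            f"Setup {idx}",
            f"Consideration Set {idx}",
            "Reach",
            "Favorite",
        ]
    return columns
```

With the rows from the question, the result could then be passed directly, e.g. pd.DataFrame(rows, columns=build_columns(len(setups))). Note that "Reach" and "Favorite" repeat for every setup, as requested in the question, so df["Reach"] would return several columns at once; appending the setup number to those names as well would avoid that.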
User contributions licensed under: CC BY-SA