I am new to Python and have never really used Pandas, so forgive me if this doesn’t make sense. I am trying to create a df based on frontend data I am sending to a flask route. The data is looped through and appended for each row. My only problem is that I don’t know how to get the df columns to reflect that. Here is my code to build the rows and the current output:
claims = csv_data["claims"] setups = csv_data["setups"] for setup in setups: setup = setups[0] offerings = setup["currentOfferings"] considered = setup["considerationSet"] reach_dict = setup["reach"] favorite_dict = setup["favorite"] summary_dict = setup["summaryMetrics"] rows = [] for i, claim in enumerate(claims): row = [] row.append(i + 1) row.append(claim) for setup in setups: setup = setups[0] row.append("X") if claim in setup["currentOfferings"] else row.append(float('nan')) row.append("X") if claim in setup["considerationSet"] else row.append(float('nan')) if claim in setup["currentOfferings"]: reach_score = reach_dict[claim] reach_percentage = "{:.0%}".format(reach_score) row.append(reach_percentage) else: row.append(float('nan')) if claim in setup["currentOfferings"]: favorite_score = favorite_dict[claim] fav_percentage = "{:.0%}".format(favorite_score) row.append(fav_percentage) else: row.append(float('nan')) rows.append(row)
I know that I can put columns = [“#”, “Claims”, “Setups”, etc…] in the df, but that doesn’t work because the rows are looping through multiple setups, and the number of setups can change. If I don’t specify the column names (how it is in the image), then I just have numbers as columns names. Ideally it should loop through the data it receives in the route, and would start with “#” “Claims” as columns, and then for each setup “Setup 1”, “Consideration Set 1”, “Reach”, “Favorite”, “Setup 2”, “Consideration Set 2”, and so on… etc.
I tried to create a similar type of loop for the columns:
my_columns = [] for i, row in enumerate(rows): col = [] if row[0] != None: col.append("#") else: pass if row[1] != None: col.append("Claims") else: pass if row[2] != None: col.append("Setup") else: pass if row[3] != None: col.append("Consideration Set") else: pass if row[4] != None: col.append("Reach") else: pass if row[5] != None: col.append("Favorite") else: pass my_columns.append(col) df = pd.DataFrame( rows, columns = my_columns )
But this didn’t work because I have the same issue of no loop, I have 6 columns passed and 10 data columns passed. I’m not sure if I am just not doing the loop of the columns properly, or if I am making everything more complicated than it needs to be.
This is what I am trying to accomplish without having to explicitly name the columns because this is just sample data. There could end up being 3, 4, however many setups in the actual app. what I would like the ouput to look like
Advertisement
Answer
I don’t know if this is the most efficient way of doing something like this but I think this is what you want to achieve.
def create_columns(df): new_cols=[] for i in range(len(df.columns)): repeated_cols = 6 #here is the number of columns you need to repeat for every setup idx = 1 + i // repeated_cols basic = ['#', 'Claims', f'Setup_{idx}', f'Consideration_Set_{idx}', 'Reach', 'Favorite'] new_cols.append(basic[i % len(basic)]) return new_cols df.columns = create_columns(df)