How to create a DataFrame with columns taken from parts of a CSV file path?

Question

I have a root folder with multiple folders in it, which ultimately lead to paths to CSV files:

I managed to create a DataFrame containing all the CSV files concatenated, using the following code:

Result:

But I am now struggling to add the respective date and name to the DataFrame, so it would look like this:

How can I do it?

Accepted Answer

With pathlib, you can go 1 and 2 directories up and get the name and date. Since this involves two things, an explicit for loop might be more readable than the list comprehension:

```python
from pathlib import Path

# ...above are the same

dfs = []
for csv_path in full_path:
    # generate a `Path` object and get parents
    p = Path(csv_path)
    parents = p.parents

    # get the desired values from "parent" dirs
    name = parents[0].name
    date = parents[1].name

    # read in the CSV as is
    frame = pd.read_csv(csv_path)

    # assign the `name` and `date` columns
    frame["name"] = name
    frame["date"] = date

    # store in the list
    dfs.append(frame)

# lastly, concatenate as you did
df = pd.concat(dfs)
```

Or equivalently, the list comprehension counterpart is:

```python
dfs = [pd.read_csv(csv_path).assign(name=csv_path.parents[0].name,
                                    date=csv_path.parents[1].name)
       for csv_path in map(Path, full_path)]
df = pd.concat(dfs)
```

where we use `assign` to add the new columns to each DataFrame.

It is up to you to choose between the explicit for loop and the list comprehension.
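To try the approach end to end without the original folders, here is a minimal sketch. It assumes a layout of `<root>/<date>/<name>/<file>.csv`, which is only a guess consistent with `parents[0]` giving the name and `parents[1]` giving the date. The helper `load_with_path_columns`, the folder names, and the file name `data.csv` are all hypothetical and exist only for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_with_path_columns(csv_paths):
    """Read each CSV and tag its rows with the names of its two nearest parent folders."""
    frames = []
    for p in map(Path, csv_paths):
        frame = pd.read_csv(p).assign(
            name=p.parents[0].name,  # folder that directly contains the file
            date=p.parents[1].name,  # folder one level above that
        )
        frames.append(frame)
    # ignore_index=True builds a fresh 0..n-1 index instead of repeating each file's own
    return pd.concat(frames, ignore_index=True)


# Build a throwaway tree: <root>/<date>/<name>/data.csv (assumed layout)
with tempfile.TemporaryDirectory() as root:
    for date, name in [("2021-01-01", "alice"), ("2021-01-02", "bob")]:
        folder = Path(root) / date / name
        folder.mkdir(parents=True)
        (folder / "data.csv").write_text("value\n1\n2\n")

    # Collect every CSV under the root, in a deterministic order
    csv_paths = sorted(Path(root).rglob("*.csv"))
    df = load_with_path_columns(csv_paths)

print(df)
```

Passing `ignore_index=True` is a choice made for this sketch; the original `pd.concat(dfs)` keeps each file's own row index, which may or may not be what you want.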
