In FTP, the structure looks like this:
main_folder / year / month / day / multiple csv files
For example:
main_folder / 2020 / 02 / 03 / '2020-02-03_01.csv', '2020-02-03_02.csv', '2020-02-03_03.csv', .....
main_folder / 2020 / 03 / 03 / '2020-03-03_01.csv', '2020-03-03_02.csv', '2020-03-03_03.csv', .....
main_folder / 2021 / 01 / 01 / '2021-01-01_01.csv', '2021-01-01_02.csv', '2021-01-01_03.csv', .....
So each year has 12 folders (one per month), each month contains one folder per day, and each day has multiple csv files (each filename consists of the date followed by _xx.csv).
I have a list of filenames that I want to download, for example:
example_list = ['2021-08-09_01.csv', '2021-08-09_02.csv', '2021-08-10_12.csv', '2021-08-10_03.csv']
My current code works like this: it extracts the year/month/day from the filename and then constructs the corresponding directory on the FTP server. For example, for the file '2021-08-09_01.csv' it looks at all the files under main_folder/2021/08/09. But if I pass the complete path, including the filename, so that FTP only looks at that specific file, I get the error ftplib.error_perm: 550 No such directory.
This is the code:
file_dir = "main_folder/2021/08/09/2021-08-09_01.csv"
ftp_conn = open_ftp_connection(ftp_host, ftp_username, ftp_password, file_dir)
ftp = ftplib.FTP_TLS(host)
ftp.login(username, password)
ftp.cwd(file_dir)
I’m a bit confused here. How can I tell FTP to look for those files in the corresponding directories and read their data? (The end goal is to publish them to an S3 bucket.)
Answer
This is how I would do it:
import ftplib
import os

example_list = ['2021-08-09_01.csv', '2021-08-09_02.csv', '2021-08-10_12.csv', '2021-08-10_03.csv']
FTP_IP = "1.2.3.4"
FTP_LOGIN = "username"
FTP_PASSWD = "password"
CURRENT_DIR = os.getcwd()
MAIN_DIR = "/main_folder"

with ftplib.FTP(FTP_IP, FTP_LOGIN, FTP_PASSWD) as ftp:
    for entry in example_list:
        # '2021-08-09_01.csv' -> year '2021', month '08', day '09'
        year, month, rest = entry.split("-")
        day = rest.split("_")[0]
        # Absolute path, so each cwd works regardless of the previous one
        directory = MAIN_DIR + "/" + year + "/" + month + "/" + day
        ftp.cwd(directory)
        with open(os.path.join(CURRENT_DIR, entry), 'wb') as f:
            # retrbinary expects a full RETR command, not just the filename
            ftp.retrbinary("RETR " + entry, f.write)
The files will be downloaded to the directory from which you run the Python script, with the same filenames as on the server.
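Since the end goal is to publish the data to S3 rather than keep local copies, it may be simpler to read each file into memory instead. Below is a minimal sketch under a few assumptions: `remote_path`, `fetch_bytes` and `fetch_all` are helper names made up for illustration, `main_folder` sits at the FTP root, and every filename follows the `YYYY-MM-DD_xx.csv` pattern. It passes the full path to `RETR`, so no `cwd` call is needed; if your server does not accept that, `cwd` into the directory first and retrieve by bare filename.

```python
import ftplib
import io
import posixpath


def remote_path(filename, main_dir="/main_folder"):
    """Map '2021-08-09_01.csv' to '/main_folder/2021/08/09/2021-08-09_01.csv'."""
    date_part = filename.split("_")[0]       # '2021-08-09'
    # Unpacking raises ValueError if the date does not have exactly three parts
    year, month, day = date_part.split("-")
    return posixpath.join(main_dir, year, month, day, filename)


def fetch_bytes(ftp, filename, main_dir="/main_folder"):
    """Download one file into memory and return its contents as bytes."""
    buffer = io.BytesIO()
    ftp.retrbinary("RETR " + remote_path(filename, main_dir), buffer.write)
    return buffer.getvalue()


def fetch_all(host, user, password, filenames):
    """Return {filename: bytes} for every requested file."""
    with ftplib.FTP(host, user, password) as ftp:
        return {name: fetch_bytes(ftp, name) for name in filenames}
```

Each value in the returned dict can then be handed to whatever S3 client you use (with boto3, for instance, as the `Body` of a `put_object` call). If the server requires TLS, as the `FTP_TLS` in the question suggests, swapping in `ftplib.FTP_TLS` and calling `prot_p()` after login should work the same way.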