I have the DataFrame:
link = [{'name': 'http://website.com/product76tre53932'},
        {'name': 'http://website.it/productiee8340'},
        {'name': 'http://website.de/productooi7309'}]
df = pd.DataFrame(link)
How can I trim these values so that a new column, df['name_2'], keeps only the scheme and domain of each URL (for example, http://website.com)?
Answer
You can use the urllib.parse module to parse those URLs and keep only the scheme and network location (host) of each one:
>>> from urllib.parse import urlsplit
>>>
>>> def create_url(url):
...     r = urlsplit(url)
...     return f"{r.scheme}://{r.netloc}"
...
>>> link = [{'name': 'http://website.com/product76tre53932'},
...         {'name': 'http://website.it/productiee8340'},
...         {'name': 'http://website.de/productooi7309'}]
>>>
>>> import pandas as pd
>>> df = pd.DataFrame(link)
>>> df['new_url'] = df.name.apply(create_url)
>>> df
                                   name             new_url
0  http://website.com/product76tre53932  http://website.com
1      http://website.it/productiee8340   http://website.it
2      http://website.de/productooi7309   http://website.de
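
To see why this works, it may help to look at the pieces urlsplit returns for one of the URLs. The following is a minimal standard-library sketch, not part of the original answer; the names base_url and trimmed are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit


def base_url(url):
    """Return only the scheme and host of a URL (hypothetical helper name)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


# The same records as in the question.
link = [
    {'name': 'http://website.com/product76tre53932'},
    {'name': 'http://website.it/productiee8340'},
    {'name': 'http://website.de/productooi7309'},
]

# Inspect the components urlsplit produces for the first URL.
parts = urlsplit(link[0]['name'])
print(parts.scheme)  # http
print(parts.netloc)  # website.com
print(parts.path)    # /product76tre53932

# Apply the helper to every record; everything after the host is dropped.
trimmed = [base_url(row['name']) for row in link]
print(trimmed)
```

Assuming the DataFrame from the question, the same helper could be passed to apply to fill the column the question asks for: df['name_2'] = df['name'].apply(base_url).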