I have the DataFrame:
import pandas as pd

link = [{'name': 'http://website.com/product76tre53932'}, {'name': 'http://website.it/productiee8340'}, {'name': 'http://website.de/productooi7309'}]
df = pd.DataFrame(link)
How can I trim the values so that a new column, df['name_2'], contains only the scheme and domain of each URL (for example, http://website.com)?
Answer
You can use the urllib.parse module to parse those URLs.
>>> from urllib.parse import urlsplit
>>>
>>> def create_url(url):
...     r = urlsplit(url)
...     return f"{r.scheme}://{r.netloc}"
...
>>> link = [{'name': 'http://website.com/product76tre53932'}, {'name': 'http://website.it/productiee8340'}, {'name': 'http://website.de/productooi7309'}]
>>>
>>> import pandas as pd
>>> df = pd.DataFrame(link)
>>> df['new_url'] = df.name.apply(create_url)
>>> df
                                   name             new_url
0  http://website.com/product76tre53932  http://website.com
1      http://website.it/productiee8340   http://website.it
2      http://website.de/productooi7309   http://website.de
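One case the session above does not cover is a value without a scheme, where urlsplit leaves both scheme and netloc empty and the f-string would produce just "://". The sketch below shows one possible way to guard against that. The helper name base_url and the choice to return such values unchanged are assumptions for illustration, not part of the answer.

```python
from urllib.parse import urlsplit


def base_url(url):
    """Return 'scheme://host' for a URL string (hypothetical helper name)."""
    parts = urlsplit(url)
    # Without a scheme, urlsplit puts everything into .path and leaves
    # .scheme and .netloc empty, so return the input unchanged (an assumed
    # fallback) instead of producing '://'.
    if not parts.scheme or not parts.netloc:
        return url
    return f"{parts.scheme}://{parts.netloc}"


urls = [
    'http://website.com/product76tre53932',
    'http://website.it/productiee8340',
    'http://website.de/productooi7309',
]
trimmed = [base_url(u) for u in urls]
print(trimmed)
# ['http://website.com', 'http://website.it', 'http://website.de']
```

With the DataFrame from the question, the same function can be passed to df['name'].apply(base_url) to fill the new column.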