
Web scraping problem when passing one function's result as a parameter to another function

Hello, I’ve created two functions that work well when called on their own. But when I try to use them in a for loop, I get a problem with my parameter.

The first function searches and returns a link to pass to the second one.

JavaScript

The second function scrapes a link.

JavaScript

Both functions worked when I tested them on a link.

Now I have a CSV file of company names. For each name, I use searchsport() to search the website, and the returned link is passed to single_text() to scrape.

JavaScript

Error:

JavaScript

When I run this, I get a df.

JavaScript

My expected result is two DataFrames for every keyword. Why doesn’t it work?


Answer

I have added comments throughout the code below noting some of the problems I saw in your code as posted.

Some things I noticed:

Cases where nothing is found are not handled, e.g. ‘PARIS-SAINT-GERMAIN-FOOTBALL’ fails as a search term, whereas ‘PARIS SAINT GERMAIN FOOTBALL’ does not

Missed opportunities for simplification, e.g. building a DataFrame by looping over tr and then td when you could just use read_html on the table; using find_all when only a single table or a tag is needed

Overwriting variables inside loops, as well as typos, e.g.

JavaScript

Not testing if a dataframe is empty

Risking incorrect URLs by concatenating onto 'https://www.verif.com/', since the part you append also starts with “/”

Inconsistent variable naming e.g. what is single_item? The function I see is called single_text.
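
Two of the points above (the doubled “/” and the hyphenated search term) can be illustrated with a minimal sketch using only the standard library. The path "/societe/example-123/" and the helper names build_url and normalise_keyword are hypothetical and only here for illustration; replacing hyphens with spaces is an assumption based on the single example above, not a confirmed rule of the site's search.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.verif.com/"


def build_url(path: str) -> str:
    """Join a site-relative path onto BASE_URL without doubling the slash."""
    return urljoin(BASE_URL, path)


def normalise_keyword(keyword: str) -> str:
    """Assumed clean-up: turn hyphens into spaces and trim whitespace."""
    return keyword.replace("-", " ").strip()


# Hypothetical relative link, as a search result might return it.
relative_link = "/societe/example-123/"

print(BASE_URL + relative_link)   # naive concatenation keeps both slashes
print(build_url(relative_link))   # urljoin resolves the path against the base
print(normalise_keyword("PARIS-SAINT-GERMAIN-FOOTBALL"))
```

Plain string concatenation yields "https://www.verif.com//societe/example-123/", while urljoin resolves the leading “/” against the site root.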

These are just some observations and there is certainly still room for improvement.

JavaScript

Returning df from main()

JavaScript
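
As a rough sketch of the same idea, here is one way main() could collect and return a result per keyword instead of overwriting a single df on each pass. The bodies of searchsport() and single_text() below are hypothetical stand-ins (no network access, and single_text() is assumed to return one DataFrame), so only the loop structure is the point.

```python
import pandas as pd


def searchsport(keyword):
    """Stand-in for the real search: return a relative link, or None if nothing is found."""
    fake_index = {"PARIS SAINT GERMAIN FOOTBALL": "/societe/example-1/"}
    return fake_index.get(keyword)


def single_text(link):
    """Stand-in for the real scraper: return a DataFrame for one link."""
    return pd.DataFrame({"link": [link], "field": ["placeholder"]})


def main(keywords):
    """Return a dict mapping each keyword to its scraped DataFrame."""
    results = {}
    for keyword in keywords:
        link = searchsport(keyword)
        if link is None:
            # Nothing found for this keyword: skip it rather than crash.
            continue
        df = single_text(link)
        if df.empty:
            # The page was reachable but produced no rows.
            continue
        results[keyword] = df  # stored per keyword, so nothing is overwritten
    return results


frames = main(["PARIS SAINT GERMAIN FOOTBALL", "NO SUCH COMPANY"])
print(list(frames))
```

Keying the results by keyword keeps every DataFrame available after the loop; if a single table is preferred, the values could be combined afterwards with pd.concat.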
User contributions licensed under: CC BY-SA