Tag: web-scraping

Can’t convert this text to normal format in Python?

I am web-scraping some stuff and I got something like this: “735 𝚆𝚒𝚕𝚕𝚒𝚊𝚖 𝚃 𝙼𝚘𝚛𝚛𝚒𝚜𝚜𝚎𝚢 𝙱𝚕𝚟𝚍, 𝙳𝚘𝚛𝚌𝚑𝚎𝚜𝚝𝚎𝚛, 𝙼𝙰 02122 Dorchester MA 02121”. How do I convert it to normal text in Python? Answer: You can run it through Unicode normalization.
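The excerpt does not say which normalization form the answer used. A minimal sketch, assuming the compatibility form NFKC (which maps the mathematical monospace letters to their plain ASCII equivalents) and a helper name, to_plain_text, that is not from the original post:

```python
import unicodedata


def to_plain_text(text):
    """Fold compatibility characters (e.g. styled math letters) to plain ones."""
    # NFKC applies compatibility decomposition, then canonical composition.
    return unicodedata.normalize("NFKC", text)


# Sample string taken from the question above.
scraped = "735 𝚆𝚒𝚕𝚕𝚒𝚊𝚖 𝚃 𝙼𝚘𝚛𝚛𝚒𝚜𝚜𝚎𝚢 𝙱𝚕𝚟𝚍, 𝙳𝚘𝚛𝚌𝚑𝚎𝚜𝚝𝚎𝚛, 𝙼𝙰 02122"
print(to_plain_text(scraped))
```

Note that NFKC also folds other compatibility characters (ligatures, full-width digits, superscripts), which may or may not be wanted for a given scrape.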

BeautifulSoup: Unable to extract href with several conditions

beautifulsoup python regex web-scraping

I’m trying to extract every link with BeautifulSoup from the SEC website, such as this one, by using the code from this GitHub repository. The thing is, I do not want to extract every 8-K, only the ones matching item “2.02” within the “Description” column. So I edited the “Download.py” file and identified the following: I’ve tried to

Get meta tag content by name, beautiful soup and python

beautifulsoup html metadata python web-scraping

I’m trying to get the metadata from this website (here’s the code). However, I get this error. Any ideas? Answer: From the bs4 docs: You can’t use a keyword argument to search for HTML’s name element, because Beautiful Soup uses the name argument to contain the name of the tag itself. Instead, you can give a value to ‘name’ in the
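The bs4 documentation quoted above goes on to describe passing a dictionary through the attrs argument. A minimal sketch of that lookup, assuming a made-up HTML snippet (the original page and code are not shown) and the built-in html.parser backend:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page in the question.
html = """
<html><head>
<meta name="description" content="Example page about web scraping">
<meta name="keywords" content="python, bs4">
</head><body></body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# soup.find("meta", name="description") would clash with find()'s own
# `name` parameter, so the HTML attribute goes in the attrs dict instead.
tag = soup.find("meta", attrs={"name": "description"})

# Guard against a missing tag before reading its content attribute.
description = tag["content"] if tag is not None else None
print(description)
```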

Python selenium: finding multiple elements with partially different names

python selenium web-scraping xpath

I have a webpage full of elements like the examples below (I gave 2; the webpage contains more than 10). I want to find all of the elements containing ‘suda-data’ and click on each of them. However, I am unable to write a locator that matches all of them. Notes: a. cannot search by class="S_txt2" (will include elements that

beautiful soup find_all() not returning all elements

beautifulsoup python python-3.x web-scraping

I am trying to scrape this website using bs4. Using inspect on a particular car ad tile, I figured out what I need to scrape in order to get the title and the link to the car’s page. I am making use of the find_all() function of the bs4 library, but the issue is that it’s not scraping the required info of

Python for loop not looping through all the elements? It is only taking the 1st element

for-loop nosuchelementexception python selenium web-scraping

I am trying to scrape data from naukri.com, specifically the location details for each recruiter visible on the page. The code I am writing is: First, I have extracted all recruiters’ details into the list highlight_table_tag. highlight_table_tag includes all the elements on the page; however, the loop only takes the 0th element of my

BeautifulSoup extract conditioned digit coloured by css

beautifulsoup css python web-scraping

I successfully get the data from this table from THRIVEN: But as you can see, in the Net% column, whether a value is negative or positive is determined by some CSS (I believe, though I couldn’t find where it is defined). How can I extract that data and put it into my Excel file as negative/positive numbers? Below is my current code

Extract data from Json: Error JSONDecodeError: Expecting value

beautifulsoup json python python-requests web-scraping

Error: File "C:\Users\Admin\anaconda3\lib\json\decoder.py", line 355, in raw_decode raise JSONDecodeError("Expecting value", s, err.value) from None JSONDecodeError: Expecting value Answer: This is how you do it: Output:
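The code from the question and the answer are not shown here. As a general illustration only: "Expecting value" is what the json module reports when the text is empty or does not start with a JSON value, which often happens when a scraper receives an HTML page instead of JSON. A minimal sketch, assuming a hypothetical helper named parse_json_or_none and made-up sample bodies:

```python
import json


def parse_json_or_none(text):
    """Return the decoded JSON value, or None if `text` is not valid JSON.

    Note: a body containing the JSON literal null also yields None.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Show where parsing failed and what the body actually looks like.
        print(f"Not JSON ({exc.msg} at char {exc.pos}); body starts with: {text[:60]!r}")
        return None


print(parse_json_or_none('{"price": 10}'))
print(parse_json_or_none("<html><body>Blocked</body></html>"))
```

Printing the start of the body is usually enough to tell whether the server returned an error page, an empty response, or JSON wrapped in other markup.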

Is it possible to write an image to a CSV file?

docx image python web-scraping

Hi everyone, this is my first post here. I wanted to know how I can write image files that I scraped from a website to a CSV file or, if it’s not possible to write them to CSV, how I can write this header, description, time info and image to, say, a Word file. Here is the code: Everything works perfectly, just wanna

How would I increment the numerical portion of a specific parameter in a URL string?

automation html macos python web-scraping

Here’s my specific example: I need the numerical value of param7 in this example to increase by 1 before each pass, up to param7 = ‘30’. Is there any way to create a list or dict containing the values ‘1’ through ‘30’ and tell param7 to move through it at index + 1? Answer: You
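One way to step param7 from 1 to 30 is to rebuild the query string on each pass. A minimal sketch, assuming a hypothetical base URL (the original URL is not shown) and a helper named with_param that is not from the original post:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_param(url, name, value):
    """Return `url` with query parameter `name` set to `value`.

    If `name` is not present in the query string, the URL is returned
    with its query unchanged (apart from re-encoding).
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    updated = [(key, str(value) if key == name else val) for key, val in pairs]
    return urlunsplit(parts._replace(query=urlencode(updated)))


# Hypothetical URL standing in for the one in the question.
base_url = "https://example.com/search?param1=a&param7=1"

# One URL per pass, with param7 running from 1 through 30.
urls = [with_param(base_url, "param7", n) for n in range(1, 31)]
print(urls[0])
print(urls[-1])
```

A plain range is enough here; a separate list or dict of the values ‘1’ through ‘30’ is not needed because the loop counter is converted to a string when the query is rebuilt.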