I am relatively new and trying to use Python to scrape data. Here is my code:
import requests
import pandas as pd
from bs4 import BeautifulSoup

URL = 'https://www.wsj.com/market-data/stocks/asia?mod=md_usstk_view_asia'
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36"
}

page = requests.get(URL, headers=HEADERS)
soup = BeautifulSoup(page.content, 'html.parser')
table = soup.find("table", attrs={"class": "WSJTables--table--1QzSOCfq"})
print(table)
I have already added headers, but the output shows no value. Any help would be greatly appreciated, thanks!
Answer
The data you're looking for is loaded from an external source via Ajax, so it is not in the HTML that requests downloads. Here is an example of how to load it with the requests module:
import json
import requests

url = "https://www.wsj.com/market-data/stocks/asia"
params = {
    "id": '{"application":"WSJ","instruments":[{"symbol":"INDEX/HK//HSI","name":"Hong Kong: Hang Seng"},{"symbol":"INDEX/JP//NIK","name":"Japan: Nikkei 225"},{"symbol":"INDEX/CN//SHCOMP","name":"China: Shanghai Composite"},{"symbol":"INDEX/IN//1","name":"India: S&P BSE Sensex"},{"symbol":"INDEX/AU//XJO","name":"Australia: S&P/ASX"},{"symbol":"INDEX/KR//SEU","name":"S. Korea: KOSPI"},{"symbol":"INDEX/US//GDOW","name":"Global Dow"},{"symbol":"FUTURE/US//DJIA FUTURES","name":"DJIA Futures"}]}',
    "type": "mdc_quotes",
}
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0"
}

data = requests.get(url, params=params, headers=headers).json()

# uncomment to see all data:
# print(json.dumps(data, indent=4))

for instrument in data["data"]["instruments"]:
    print(
        "{:<30} {:<10}".format(
            instrument["formattedName"], instrument["lastPrice"]
        )
    )
Prints:
Hong Kong: Hang Seng           28458.44
Japan: Nikkei 225              28317.83
China: Shanghai Composite      3486.56
India: S&P BSE Sensex          50540.48
Australia: S&P/ASX             7030.3
S. Korea: KOSPI                3156.42
Global Dow                     4022.82
DJIA Futures                   34208
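If you want to request a different set of instruments, it may be easier to build the "id" parameter from a Python list than to edit the JSON string by hand. This is only a sketch: build_params is a hypothetical helper name (not part of requests or of the site's API), and it assumes the endpoint accepts the same JSON layout as the string above.

```python
import json


def build_params(instruments, application="WSJ", quote_type="mdc_quotes"):
    """Build the query parameters from (symbol, name) pairs.

    Hypothetical helper: it mirrors the structure of the hand-written
    "id" string above and assumes the endpoint expects exactly that layout.
    """
    payload = {
        "application": application,
        "instruments": [
            {"symbol": symbol, "name": name} for symbol, name in instruments
        ],
    }
    # Compact separators produce JSON without extra spaces,
    # matching the style of the hand-written string.
    return {"id": json.dumps(payload, separators=(",", ":")), "type": quote_type}


instruments = [
    ("INDEX/HK//HSI", "Hong Kong: Hang Seng"),
    ("INDEX/JP//NIK", "Japan: Nikkei 225"),
]
params = build_params(instruments)
print(params["id"])
```

The resulting dict can be passed as params= to requests.get() in the same way as the hard-coded one.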
To load it as a pandas DataFrame:
import pandas as pd

# flatten the list of instrument records into one row per instrument
df = pd.json_normalize(data["data"]["instruments"])
print(df)
Prints:
   country  dailyHigh  dailyLow  exchangeIsoCode  formattedName              lastPrice  mantissa  name                            priceChange  percentChange  requestSymbol            ticker  timestamp                      type    url                                                bluegrassChannel.channel            bluegrassChannel.type
0  HK       28584.34   28286.92  XHKG             Hong Kong: Hang Seng       28458.44   2         Hang Seng Index                 8.15         0.03           INDEX/HK//HSI            HSI     2021-05-21T16:08:32+08:00      Index   https://www.wsj.com/market-data/quotes/index/H...  /zigman2/quotes/210598030/delayed   DelayedChannel
1  JP       28411.56   28193.03  XTKS             Japan: Nikkei 225          28317.83   2         NIKKEI 225 Index                219.58       0.78           INDEX/JP//NIK            NIK     2021-05-21T15:15:02+09:00      Index   https://www.wsj.com/market-data/quotes/index/J...  /zigman2/quotes/210597971/delayed   DelayedChannel
2  CN       3518.38    3479.67   XSHG             China: Shanghai Composite  3486.56    2         Shanghai Composite Index        -20.39       -0.58          INDEX/CN//SHCOMP         SHCOMP  2021-05-21T15:01:13+08:00      Index   https://www.wsj.com/market-data/quotes/index/C...  /zigman2/quotes/210598127/delayed   DelayedChannel
3  IN       50591.12   49832.72  XBOM             India: S&P BSE Sensex      50540.48   2         S&P BSE Sensex Index            975.62       1.97           INDEX/IN//1              1       2021-05-21T15:30:50+05:30      Index   https://www.wsj.com/market-data/quotes/index/I...  /zigman2/quotes/210597966/delayed   DelayedChannel
4  AU       7056.40    6999.60   XASX             Australia: S&P/ASX         7030.3     1         S&P/ASX 200 Benchmark Index     10.7         0.15           INDEX/AU//XJO            XJO     2021-05-21T17:20:23+10:00      Index   https://www.wsj.com/market-data/quotes/index/A...  /zigman2/quotes/210598100/delayed   DelayedChannel
5  KR       3198.01    3149.46   Korea Exchange   S. Korea: KOSPI            3156.42    2         KOSPI Composite Index           -5.86        -0.19          INDEX/KR//SEU            180721  2021-05-21T15:33:00+09:00      Index   https://www.wsj.com/market-data/quotes/index/K...  /zigman2/quotes/210598069/delayed   DelayedChannel
6  US       4040.16    4012.70   S&P Dow Jones    Global Dow                 4022.82    2         Global Dow Realtime USD         7.66         0.19           INDEX/US//GDOW           GDOW    2021-05-21T18:43:17-04:00      Index   https://www.wsj.com/market-data/quotes/index/U...  /zigman2/quotes/210599024/realtime  DelayedChannel
7  US       34372.00   34017.00  XCBT             DJIA Futures               34208      0         E-Mini Dow Continuous Contract  55           0.16           FUTURE/US//DJIA FUTURES  YM00    2021-05-21T15:59:59.595-05:00  Future  https://www.wsj.com/market-data/quotes/futures...  /zigman2/quotes/210407078/delayed   DelayedChannel
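The printed values such as 7030.3 and 34208 suggest that some columns (lastPrice, for example) may come back as strings rather than numbers, though I have not confirmed this for every field. If so, you would probably want to convert them before doing any arithmetic. A minimal sketch, using two sample records copied from the output above instead of a live request:

```python
import pandas as pd

# Sample records copied from the output above; storing the values as
# strings is an assumption about what the endpoint returns.
sample = [
    {
        "formattedName": "Hong Kong: Hang Seng",
        "lastPrice": "28458.44",
        "priceChange": "8.15",
        "percentChange": "0.03",
    },
    {
        "formattedName": "DJIA Futures",
        "lastPrice": "34208",
        "priceChange": "55",
        "percentChange": "0.16",
    },
]
df = pd.json_normalize(sample)

# Convert the price columns to numbers; values that cannot be parsed
# become NaN instead of raising an error.
numeric_cols = ["lastPrice", "priceChange", "percentChange"]
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")

# Keep only the columns of interest.
summary = df[["formattedName"] + numeric_cols]
print(summary)
```

If the columns are already numeric, pd.to_numeric leaves them unchanged, so the conversion should be safe to apply either way.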