
Need help scraping WSJ Markets Data

I am relatively new to this and am trying to use Python to scrape data. Here is my code:

import requests
import pandas as pd
from bs4 import BeautifulSoup

URL = 'https://www.wsj.com/market-data/stocks/asia?mod=md_usstk_view_asia'

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36"
}

page = requests.get(URL, headers=HEADERS)
soup = BeautifulSoup(page.content, 'html.parser')

table = soup.find("table", attrs={"class": "WSJTables--table--1QzSOCfq"})
print(table)

I have already added a User-Agent header, but the script still prints no table data. Any help would be greatly appreciated, thanks!

Answer

The data you're looking for is loaded from an external source via Ajax, so it is not part of the HTML that requests downloads and BeautifulSoup parses. The following example shows how to load it directly with the requests module:

import json
import requests

url = "https://www.wsj.com/market-data/stocks/asia"
params = {
    "id": '{"application":"WSJ","instruments":[{"symbol":"INDEX/HK//HSI","name":"Hong Kong: Hang Seng"},{"symbol":"INDEX/JP//NIK","name":"Japan: Nikkei 225"},{"symbol":"INDEX/CN//SHCOMP","name":"China: Shanghai Composite"},{"symbol":"INDEX/IN//1","name":"India: S&P BSE Sensex"},{"symbol":"INDEX/AU//XJO","name":"Australia: S&P/ASX"},{"symbol":"INDEX/KR//SEU","name":"S. Korea: KOSPI"},{"symbol":"INDEX/US//GDOW","name":"Global Dow"},{"symbol":"FUTURE/US//DJIA FUTURES","name":"DJIA Futures"}]}',
    "type": "mdc_quotes",
}
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0"
}

data = requests.get(url, params=params, headers=headers).json()

# uncomment to see all data:
# print(json.dumps(data, indent=4))

for instrument in data["data"]["instruments"]:
    print(
        "{:<30} {:<10}".format(
            instrument["formattedName"], instrument["lastPrice"]
        )
    )

Prints:

Hong Kong: Hang Seng           28458.44  
Japan: Nikkei 225              28317.83  
China: Shanghai Composite      3486.56   
India: S&P BSE Sensex          50540.48  
Australia: S&P/ASX             7030.3    
S. Korea: KOSPI                3156.42   
Global Dow                     4022.82   
DJIA Futures                   34208     
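
The "id" parameter in the request above is a JSON string written out by hand, which is easy to break when adding or removing an index. As a minimal sketch, the same string could be built with json.dumps from a list of (symbol, name) pairs. The helper name build_params is hypothetical, and whether the endpoint accepts instruments other than the ones listed above is an assumption:

```python
import json


def build_params(instruments, application="WSJ"):
    """Build the query parameters from (symbol, name) pairs.

    Hypothetical helper: it only reproduces the structure of the
    hand-written "id" string used in the request above.
    """
    payload = {
        "application": application,
        "instruments": [
            {"symbol": symbol, "name": name} for symbol, name in instruments
        ],
    }
    # Compact separators avoid the extra spaces json.dumps adds by default,
    # so the result has the same shape as the hand-written string.
    return {
        "id": json.dumps(payload, separators=(",", ":")),
        "type": "mdc_quotes",
    }


params = build_params(
    [
        ("INDEX/HK//HSI", "Hong Kong: Hang Seng"),
        ("INDEX/JP//NIK", "Japan: Nikkei 225"),
    ]
)
print(params["id"])
```

The resulting dict can be passed as params= to requests.get in the same way as the hand-written one.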

To load it as a pandas DataFrame:

import pandas as pd

df = pd.json_normalize(data["data"]["instruments"])
print(df)

Prints:

  country  dailyHigh  dailyLow exchangeIsoCode              formattedName lastPrice  mantissa                            name priceChange percentChange            requestSymbol  ticker                      timestamp    type                                                url            bluegrassChannel.channel bluegrassChannel.type
0      HK   28584.34  28286.92            XHKG       Hong Kong: Hang Seng  28458.44         2                 Hang Seng Index        8.15          0.03            INDEX/HK//HSI     HSI      2021-05-21T16:08:32+08:00   Index  https://www.wsj.com/market-data/quotes/index/H...   /zigman2/quotes/210598030/delayed        DelayedChannel
1      JP   28411.56  28193.03            XTKS          Japan: Nikkei 225  28317.83         2                NIKKEI 225 Index      219.58          0.78            INDEX/JP//NIK     NIK      2021-05-21T15:15:02+09:00   Index  https://www.wsj.com/market-data/quotes/index/J...   /zigman2/quotes/210597971/delayed        DelayedChannel
2      CN    3518.38   3479.67            XSHG  China: Shanghai Composite   3486.56         2        Shanghai Composite Index      -20.39         -0.58         INDEX/CN//SHCOMP  SHCOMP      2021-05-21T15:01:13+08:00   Index  https://www.wsj.com/market-data/quotes/index/C...   /zigman2/quotes/210598127/delayed        DelayedChannel
3      IN   50591.12  49832.72            XBOM      India: S&P BSE Sensex  50540.48         2            S&P BSE Sensex Index      975.62          1.97              INDEX/IN//1       1      2021-05-21T15:30:50+05:30   Index  https://www.wsj.com/market-data/quotes/index/I...   /zigman2/quotes/210597966/delayed        DelayedChannel
4      AU    7056.40   6999.60            XASX         Australia: S&P/ASX    7030.3         1     S&P/ASX 200 Benchmark Index        10.7          0.15            INDEX/AU//XJO     XJO      2021-05-21T17:20:23+10:00   Index  https://www.wsj.com/market-data/quotes/index/A...   /zigman2/quotes/210598100/delayed        DelayedChannel
5      KR    3198.01   3149.46  Korea Exchange            S. Korea: KOSPI   3156.42         2           KOSPI Composite Index       -5.86         -0.19            INDEX/KR//SEU  180721      2021-05-21T15:33:00+09:00   Index  https://www.wsj.com/market-data/quotes/index/K...   /zigman2/quotes/210598069/delayed        DelayedChannel
6      US    4040.16   4012.70   S&P Dow Jones                 Global Dow   4022.82         2         Global Dow Realtime USD        7.66          0.19           INDEX/US//GDOW    GDOW      2021-05-21T18:43:17-04:00   Index  https://www.wsj.com/market-data/quotes/index/U...  /zigman2/quotes/210599024/realtime        DelayedChannel
7      US   34372.00  34017.00            XCBT               DJIA Futures     34208         0  E-Mini Dow Continuous Contract          55          0.16  FUTURE/US//DJIA FUTURES    YM00  2021-05-21T15:59:59.595-05:00  Future  https://www.wsj.com/market-data/quotes/futures...   /zigman2/quotes/210407078/delayed        DelayedChannel
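
In the output above, lastPrice, priceChange and percentChange are printed without a fixed number of decimals, which suggests (but does not prove) that they arrive as strings rather than numbers. If that is the case, a sketch like the following could convert them before doing any arithmetic. The sample rows are copied from the output above, and the list of columns to convert is an assumption:

```python
import pandas as pd

# Two sample rows copied from the output above. In a real run this would be
# data["data"]["instruments"] from the request.
sample = [
    {
        "formattedName": "Hong Kong: Hang Seng",
        "lastPrice": "28458.44",
        "priceChange": "8.15",
        "percentChange": "0.03",
    },
    {
        "formattedName": "DJIA Futures",
        "lastPrice": "34208",
        "priceChange": "55",
        "percentChange": "0.16",
    },
]
df = pd.json_normalize(sample)

# Assumed list of columns that hold numbers stored as text.
numeric_cols = ["lastPrice", "priceChange", "percentChange"]

# errors="coerce" turns values that cannot be parsed into NaN
# instead of raising an exception.
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")

print(df.dtypes)
```

After the conversion the columns can be sorted or aggregated numerically, for example with df.sort_values("percentChange").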
User contributions licensed under: CC BY-SA