I want to scrape a DataFrame from a page that depends on dropdown values, using BeautifulSoup.
- I select a value in both dropdowns
- I submit my selection
- I get a data table
I would like to capture this table as a DataFrame with BS. Any idea how to achieve this?
Example site: https://coinarbitragebot.com/arbitrage.php
Thanks
Answer
You can send a plain POST request with the form parameters (you can see them in the Firefox/Chrome network tab when you click the Submit button). Then you can use the pandas.read_html() function to get your DataFrame.
For example:
import requests
import pandas as pd
from io import StringIO
from bs4 import BeautifulSoup

url = 'https://coinarbitragebot.com/arbitrage.php'

# Form fields as seen in the browser's network tab: one key per exchange
# to include, plus the currency, margin and submit-flag fields.
data = {
    'bibox': 1, 'biki': 1, 'binance': 1, 'bit-z': 1, 'bitbns': 1, 'bitfinex': 1,
    'bitforex': 1, 'bithumb': 1, 'bitkub': 1, 'bitmart': 1, 'bitmax': 1,
    'bitrue': 1, 'bitso': 1, 'bitstamp': 1, 'bittrex': 1, 'bleutrade': 1,
    'btcturk': 1, 'bw_com': 1, 'catex': 1, 'cex_io': 1, 'coinall': 1,
    'coinbase': 1, 'coinbene': 1, 'coincheck': 1, 'coindeal': 1, 'coineal': 1,
    'coinsbit': 1, 'cointiger': 1, 'crex24': 1, 'dcoin': 1, 'digifinex': 1,
    'exmo': 1, 'exx_com': 1, 'fatbtc': 1, 'finexbox': 1, 'gate_io': 1,
    'graviex': 1, 'hitbtc': 1, 'hotbit': 1, 'huobi': 1, 'indodax': 1,
    'koineks': 1, 'kraken': 1, 'kucoin': 1, 'latoken': 1, 'lbank': 1,
    'liquid': 1, 'livecoin': 1, 'mercatox': 1, 'mxc': 1, 'okcoin': 1,
    'okex': 1, 'p2pb2b': 1, 'poloniex': 1, 'simex': 1, 'sistemkoin': 1,
    'stex': 1, 'tokok': 1, 'tradeogre': 1, 'tradesatoshi': 1, 'upbit': 1,
    'yobit': 1, 'zb_com': 1, 'zbg': 1,
    'bcurr': 'usd', 'arb_margin': 25, 'sbmfrm': 1,
}

data['bcurr'] = 'usd'     # <-- set to 'usd', 'btc' or 'all'
data['arb_margin'] = 5    # <-- set to your value

soup = BeautifulSoup(requests.post(url, data=data).text, 'html.parser')

# Wrap the HTML in StringIO: recent pandas versions deprecate passing
# a raw HTML string to read_html().
df = pd.read_html(StringIO(str(soup.select_one('#tbl1'))))[0]
df.columns = df.loc[0]                          # first row holds the column names
df = df.iloc[1:].set_index('Coin', drop=True)
print(df)
Prints:
0 bibox biki binance bit-z bitbns bitfinex bitforex bithumb bitkub bitmart ... sistemkoin stex tokok tradeogre tradesatoshi upbit yobit zb.com zbg % Coin ... DOGE/USD 0.002102 0.002102 0.0021 0.002097 0 0 0.00209838 0 0.00205862 0 ... 0.002178 0 0 0 0 0 0 0.0021027 0.0021013 29.34 TRX/USD 0 0.014055 0.01408 0.01409 0 0.013905 0.0141 0.0137128 0 0.01408 ... 0.014512 0 0.014 0 0 0 0 0.01406 0.0145 7.63 XLM/USD 0 0 0.04733 0.047 0.0472 0.04724 0.0472 0.0460763 0.0471012 0.04733 ... 0.047811 0 0 0 0 0 0 0.0473 0.0475 5.08 BSV/USD 0 113.299 0 0 0 113.27 113.457 110.545 108.698 113.638 ... 113.69 0 0 0 0 0 0 112.172 113.48 5.89 NEO/USD 9.484 9.45 9.483 9.4823 9.386 9.4783 9.49 0 0 9.483 ... 9.91 0 9.483 0 0 0 0 9.4925 0 5.51 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... PCX/USD 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6.86 QCX/USD 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 24.54 XDCE/USD 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6.84 YAS/USD 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 14.72 ZEL/USD 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9.93 [73 rows x 65 columns]
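As an alternative to reading the parameter names from the network tab, they can often be listed from the form markup itself. The following is only a sketch: form_field_names is a hypothetical helper, and sample_html is made-up markup for illustration, not copied from the site. It also assumes the form is present in the static HTML rather than built by JavaScript.

```python
from bs4 import BeautifulSoup


def form_field_names(html: str) -> list[str]:
    """Return the unique `name` attributes of form controls, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    selector = "form input[name], form select[name], form textarea[name]"
    for tag in soup.select(selector):
        name = tag["name"]
        if name not in names:
            names.append(name)
    return names


# Hypothetical form markup, for illustration only (not the real page).
sample_html = """
<form method="post">
  <input type="checkbox" name="binance" value="1">
  <input type="checkbox" name="bitfinex" value="1">
  <select name="bcurr"><option value="usd">USD</option></select>
  <select name="arb_margin"><option value="5">5</option></select>
  <input type="hidden" name="sbmfrm" value="1">
</form>
"""

print(form_field_names(sample_html))
```

For the real page you would pass requests.get(url).text instead of sample_html. The network tab remains the more reliable source, because it shows the values that are actually submitted.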
EDIT:
To select only binance, bitfinex and bittrex, you can set data like this:
data = {'binance':1, 'bitfinex': 1, 'bittrex':1, 'bcurr': 'all', 'arb_margin': 5, 'sbmfrm': 1}
This will print:
0          binance  bitfinex     bittrex     %
Coin
SC/BTC  0.00000018         0  0.00000017  5.56
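If you switch between exchange selections often, the dictionary can be built from a list of names. This is a minimal sketch: build_payload is a hypothetical helper, and it assumes the form keeps the same field names (bcurr, arb_margin, sbmfrm) and expects 1 for each selected exchange, as in the examples above.

```python
def build_payload(exchanges, bcurr="all", arb_margin=5):
    """Build the POST data for the arbitrage form from a list of exchange keys."""
    payload = {name: 1 for name in exchanges}  # 1 marks an exchange as selected
    payload.update({
        "bcurr": bcurr,            # 'usd', 'btc' or 'all'
        "arb_margin": arb_margin,  # margin value passed to the form
        "sbmfrm": 1,               # submit flag taken from the original request
    })
    return payload


data = build_payload(["binance", "bitfinex", "bittrex"])
print(data)
```

The result is the same dictionary as the hand-written one, so it can be passed to requests.post(url, data=data) unchanged.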
If no arbitrage opportunity is found, the response contains no table, so soup.select_one('#tbl1') returns None. You will probably need to handle this case too.
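One way to handle the missing table is to check the result of select_one() before building the DataFrame. The sketch below is an assumption-laden illustration: parse_arbitrage_table is a hypothetical helper, both HTML snippets are made up, and it reads the cells directly with BeautifulSoup instead of pandas.read_html() so that it needs no extra HTML parser.

```python
from typing import Optional

import pandas as pd
from bs4 import BeautifulSoup


def parse_arbitrage_table(html: str) -> Optional[pd.DataFrame]:
    """Return the #tbl1 table as a DataFrame, or None if the page has no table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.select_one("#tbl1")
    if table is None:  # no arbitrage opportunity: the table is absent
        return None

    # Collect the text of every cell, row by row; the first row is the header.
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find_all("tr")
    ]
    if len(rows) < 2:  # header only, no data rows
        return None

    df = pd.DataFrame(rows[1:], columns=rows[0]).set_index("Coin")
    # Cell texts are strings; convert them to numbers (non-numeric text becomes NaN).
    return df.apply(pd.to_numeric, errors="coerce")


# Hypothetical responses, for illustration only.
with_table = """
<table id="tbl1">
  <tr><td>Coin</td><td>binance</td><td>bittrex</td><td>%</td></tr>
  <tr><td>SC/BTC</td><td>0.00000018</td><td>0.00000017</td><td>5.56</td></tr>
</table>
"""
without_table = "<p>Nothing to show.</p>"

for html in (with_table, without_table):
    df = parse_arbitrage_table(html)
    print("no table found" if df is None else df)
```

With the real site you would call parse_arbitrage_table(requests.post(url, data=data).text) and branch on the None result.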