Level2StockQuotes.com offers free real-time top-of-book quotes that I would like to capture in Python using BeautifulSoup. The problem is that although I can see the actual data values in the browser inspector, I can't scrape them into Python.

BeautifulSoup returns all the data rows, but every cell in them is blank. pandas returns a DataFrame with NaN in every cell.

import bs4 as bs
import urllib.request
import pandas as pd

symbol = 'AAPL'
url = 'https://markets.cboe.com/us/equities/market_statistics/book/'+ symbol + '/'
page = urllib.request.urlopen(url).read()
soup = bs.BeautifulSoup(page,'lxml')

rows = soup.find_all('tr')
print(rows)

for tr in rows:
    td = tr.find_all('td')
    # build a list, not a generator, so print shows the cell text
    row = [i.text for i in td]
    print(row)

#using pandas to get dataframe
dfs = pd.read_html(url)
for df in dfs:
    print(df)

Can someone more experienced than I tell me how to pull this data? Thanks!

  • The page uses AJAX requests to markets.cboe.com/json/bzx/book/AAPL. From this page it acquires the content in JSON format. – Casper Aug 13 at 14:47
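To illustrate the comment above, here is a minimal sketch of what a parser sees when the cells are filled in later by JavaScript. The HTML string and its class names are made up for the demonstration and are not the real CBOE markup, and the sketch uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the HTML the server sends before any JavaScript
# runs: the table rows exist, but the cells contain no text yet.
STATIC_HTML = """
<table>
  <tr><td class="book-shares"></td><td class="book-price"></td></tr>
  <tr><td class="book-shares"></td><td class="book-price"></td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, one list per <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Only text that appears inside a <td> is recorded.
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


collector = CellCollector()
collector.feed(STATIC_HTML)
rows = collector.rows
all_blank = all(cell == "" for row in rows for cell in row)
print(rows)
print("all cells blank:", all_blank)
```

If the same check on the real page source also reports only blank cells, the values are most likely loaded by a separate request, which is what the browser's network tab would show.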

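The page is dynamic: the table is filled in by JavaScript after the initial HTML loads, so the static HTML you download has empty cells. You'll either need to use Selenium to simulate a browser and let the page render before grabbing the HTML, or you can get the data straight from the JSON XHR.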

import requests
import pandas as pd

url = 'https://markets.cboe.com/json/bzx/book/AAPL' 

headers = {
'Referer': 'https://markets.cboe.com/us/equities/market_statistics/book/AAPL/',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36',
'X-Requested-With': 'XMLHttpRequest'}

jsonData = requests.get(url, headers=headers).json()

df_asks = pd.DataFrame(jsonData['data']['asks'], columns=['Shares','Price'] )
df_bids = pd.DataFrame(jsonData['data']['bids'], columns=['Shares','Price'] )
# Trade rows hold time, shares, price, and a higher-precision time, in that order
df_trades = pd.DataFrame(jsonData['data']['trades'], columns=['Time','Shares','Price','Time_ms'])

df_list = [df_asks, df_bids, df_trades]
for df in df_list:
    print(df)

Output:

   Shares   Price
0      40  209.12
1     100  209.13
2     200  209.14
3     100  209.15
4      24  209.16
   Shares   Price
0     200  209.05
1     200  209.02
2     100  209.01
3     200  209.00
4     100  208.99
       Time  Shares     Price          Time_ms
0  10:45:57     300  209.0700  10:45:57.936000
1  10:45:57     300  209.0700  10:45:57.936000
2  10:45:55      29  209.1100  10:45:55.558000
3  10:45:52      45  209.0900  10:45:52.265000
4  10:45:52      50  209.0900  10:45:52.265000
5  10:45:52       5  209.0900  10:45:52.265000
6  10:45:51     100  209.1100  10:45:51.902000
7  10:45:48     100  209.1400  10:45:48.528000
8  10:45:48     100  209.1300  10:45:48.528000
9  10:45:48     200  209.1300  10:45:48.528000
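Since the question asks for top-of-book quotes, here is a rough sketch of how the same JSON could be reduced to the best bid and ask for any symbol. The function names fetch_book and top_of_book are hypothetical, and the sketch assumes the endpoint still accepts the headers used above and that each level is a [shares, price] pair, as the column names above suggest. It uses only the standard library, and the sample payload is copied from the output shown above so that nothing is fetched when the block runs.

```python
import json
import urllib.request

BOOK_URL = "https://markets.cboe.com/json/bzx/book/{symbol}"
PAGE_URL = "https://markets.cboe.com/us/equities/market_statistics/book/{symbol}/"


def fetch_book(symbol):
    """Request the JSON book for one symbol (assumes the endpoint is unchanged)."""
    request = urllib.request.Request(
        BOOK_URL.format(symbol=symbol),
        headers={
            "Referer": PAGE_URL.format(symbol=symbol),
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                          "AppleWebKit/537.36 (KHTML, like Gecko) "
                          "Chrome/75.0.3770.100 Safari/537.36",
            "X-Requested-With": "XMLHttpRequest",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def top_of_book(book):
    """Return best bid, best ask and spread from a parsed book payload."""
    # Assumed layout: each level is a [shares, price] pair.
    bids = [(float(price), int(shares)) for shares, price in book["data"]["bids"]]
    asks = [(float(price), int(shares)) for shares, price in book["data"]["asks"]]
    bid_price, bid_shares = max(bids, key=lambda level: level[0])  # highest bid
    ask_price, ask_shares = min(asks, key=lambda level: level[0])  # lowest ask
    return {
        "bid_price": bid_price,
        "bid_shares": bid_shares,
        "ask_price": ask_price,
        "ask_shares": ask_shares,
        "spread": round(ask_price - bid_price, 4),
    }


# Sample payload built from the bids and asks printed above (no network call).
sample_book = {
    "data": {
        "asks": [[40, 209.12], [100, 209.13], [200, 209.14], [100, 209.15], [24, 209.16]],
        "bids": [[200, 209.05], [200, 209.02], [100, 209.01], [200, 209.00], [100, 208.99]],
    }
}
quote = top_of_book(sample_book)
print(quote)
```

For live data, top_of_book(fetch_book('AAPL')) would be the equivalent call, provided the site still serves the JSON in this shape.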
