How to get data from a webpage using Selenium and show it using Flask?



Hello, I’m a theologian, and one of the things I regularly have to do is translate from Latin into English or Spanish. To do that, I use an online dictionary to check whether a specific word is in the nominative case, the dative case, and so on (Latinist stuff)…

Now I’ve coded a simple script in Python using Selenium that gets the dictionary’s page and extracts the case of the word. It all works fine, just as I want, but…

There is always a ‘but’, haha. I want to take the data that I extract with Selenium and ‘print’ it on a webpage using Flask. I coded that, but it doesn’t work…

My code:

from flask import Flask
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from tabulate import tabulate
import sys
import os

app = Flask(__name__)

chrome_opt = Options()
chrome_opt.binary_location = g_chrome_bin = os.environ.get("GOOGLE_CHROME_BIN")
chrome_opt.add_argument('--headless')
chrome_opt.add_argument('--no-sandbox')
chrome_opt.add_argument('--disable-dev-shm-usage')

selenium_driver_path = os.environ.get("CHROMEDRIVER_PATH")
driver = webdriver.Chrome(executable_path= selenium_driver_path if selenium_driver_path else "./chromedriver", options=chrome_opt)

def analyze (words):
    ws = words.split()
    sentence = []
    for w in ws:
        driver.get('http://archives.nd.edu/cgi-bin/wordz.pl?keyword=' + w)
        pre = driver.find_element_by_xpath('//pre')
        sentence = sentence + [[w] + [ pre.text.replace('.', '') ]]
    return tabulate(sentence, headers=["Word", "Dictionary"])

#analyze("pater noster qui est in celis")

@app.route("/api/<string:ws>")
def api (ws):
    return analyze(ws)

driver.close()

if __name__ == "__main__":
    app.run(debug=True)

When I go to http://localhost:5000/api/pater (for example), I get an Internal Server Error, and the console shows selenium.common.exceptions.InvalidSessionIdException: Message: invalid session id

Answer

You close your driver session (driver.close()) at module level, before the main block runs and app.run() starts the server. So by the time you make an API request and analyze() calls driver.get(), that driver is already closed. Either initialise a new driver on every call to analyze() and close it at the end of the function, or don't close the driver session at all.
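
The first option (a new driver per call) could be sketched roughly as below. This is only an illustration, not the asker's final code: the names make_chrome_driver, driver_session and create_app, and the factory parameter, are hypothetical helpers introduced here. It uses find_element("xpath", ...) in place of find_element_by_xpath, which newer Selenium releases no longer provide, returns JSON instead of a tabulate table, and assumes chromedriver can be found by Selenium without an explicit path.

```python
from contextlib import contextmanager


def make_chrome_driver():
    """Start a fresh headless Chrome session (needs selenium and chromedriver)."""
    # Imported lazily so the module can be loaded without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    opts = Options()
    opts.add_argument("--headless")
    opts.add_argument("--no-sandbox")
    opts.add_argument("--disable-dev-shm-usage")
    return webdriver.Chrome(options=opts)


@contextmanager
def driver_session(factory=make_chrome_driver):
    """Yield a driver and always quit it afterwards, even if scraping fails."""
    driver = factory()
    try:
        yield driver
    finally:
        # quit() ends the whole browser session, not just the current window.
        driver.quit()


def analyze(words, factory=make_chrome_driver):
    """Look up each word and return a list of (word, dictionary text) tuples."""
    rows = []
    with driver_session(factory) as driver:
        for w in words.split():
            driver.get("http://archives.nd.edu/cgi-bin/wordz.pl?keyword=" + w)
            pre = driver.find_element("xpath", "//pre")
            rows.append((w, pre.text.replace(".", "")))
    return rows


def create_app():
    """Build the Flask app; the driver lives only inside each request."""
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/api/<string:ws>")
    def api(ws):
        rows = analyze(ws)
        return jsonify([{"word": w, "dictionary": d} for w, d in rows])

    return app


if __name__ == "__main__":
    create_app().run(debug=True)
```

Starting Chrome on every request is slow, so this trades speed for simplicity. For the second option you would keep one driver for the lifetime of the process and simply remove the module-level driver.close() call.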



Source: Stack Overflow