scrapy/regex get json_object from html

I’m crawling reviews from a website with Scrapy (Python) and want to get all the reviews from the following part of the raw HTML as a dictionary. Getting window.cj.listings is no problem, but I can’t seem to extract window.cj.app_data with a regex. The following code works for getting the listin…
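One possible approach, sketched under assumptions: rather than matching the whole object with a regex (which tends to break on nested braces), use the regex only to locate the assignment and let `json.JSONDecoder.raw_decode` find where the object ends. The helper name `extract_js_object` and the `sample_html` string are hypothetical, and the sketch assumes the value assigned to `window.cj.app_data` is valid JSON rather than a looser JavaScript literal.

```python
import json
import re


def extract_js_object(html, name):
    """Return the JSON value assigned to `name` in the page source, or None.

    Hypothetical helper: assumes the assigned value is valid JSON.
    """
    # Locate "name = " and stop right at the first character of the value.
    match = re.search(re.escape(name) + r"\s*=\s*", html)
    if match is None:
        return None
    try:
        # raw_decode parses one JSON value starting at the given index and
        # ignores whatever follows it (for example the trailing ";").
        value, _end = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        return None
    return value


# Made-up page fragment for illustration only.
sample_html = """
<script>
window.cj.listings = [{"id": 1}];
window.cj.app_data = {"reviews": [{"rating": 5, "text": "Great {value}"}]};
</script>
"""

app_data = extract_js_object(sample_html, "window.cj.app_data")
```

In a Scrapy callback the same helper could be called on `response.text`. If the real page uses single quotes or unquoted keys, the JSON decoder will reject it and a different parser would be needed.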

Custom data generator

I have a standard directory structure of train, validation, and test, each containing class subdirectories. I want to use the flow_from_directory API, but all I can find is an ImageDataGenerator, and the files I have are raw numpy arrays (generated with arr.tofile(…)). Is there an easy way to use ImageData…
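A minimal sketch of one way this might be handled without ImageDataGenerator: a plain Python generator that walks the class subdirectories and loads each file with `np.fromfile`. The function name `raw_array_batches` and its parameters are hypothetical. Because `arr.tofile` stores neither shape nor dtype, both are assumed to be known and identical for every file.

```python
import os

import numpy as np


def raw_array_batches(root, shape, dtype=np.float32, batch_size=32):
    """Yield (x, y) batches from files laid out as root/<class_name>/<file>.

    Hypothetical helper: `shape` and `dtype` must match what was written
    with arr.tofile, since the raw files do not record them.
    """
    # Class labels are the indices of the sorted subdirectory names.
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    samples = [
        (os.path.join(root, cls, fname), label)
        for label, cls in enumerate(classes)
        for fname in sorted(os.listdir(os.path.join(root, cls)))
    ]
    # Single pass over the data in fixed order, one batch at a time.
    for start in range(0, len(samples), batch_size):
        chunk = samples[start:start + batch_size]
        x = np.stack(
            [np.fromfile(path, dtype=dtype).reshape(shape) for path, _ in chunk]
        )
        y = np.array([label for _, label in chunk])
        yield x, y
```

This makes one pass in a fixed order. For training, shuffling `samples` and wrapping the loop so it repeats would likely be needed, depending on what the training API expects from a generator.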