
Beautifulsoup Scraping: Loading Div Instead Of The Content

Noob here. I'm trying to scrape search results from this website: http://www.mastersportal.eu/search/?q=di-4|lv-master&order=relevance. I'm using Python's BeautifulSoup import

Solution 1:

If you want only the text, call get_text() on each h3 element instead of appending the tag itself:

lista.append(h3.get_text())
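
As a minimal sketch of what that line does, here is a self-contained run. The HTML snippet is made up for illustration and is not the site's real markup; only `lista` and the `h3` loop variable come from the answer above.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the real search results page.
html = """
<div><h3><a href="#">Data Science</a></h3></div>
<div><h3><a href="#">Software Engineering</a></h3></div>
"""

soup = BeautifulSoup(html, "html.parser")

lista = []
for h3 in soup.find_all("h3"):
    # get_text() returns only the text nodes, without the surrounding tags;
    # strip=True trims leading and trailing whitespace.
    lista.append(h3.get_text(strip=True))

print(lista)  # ['Data Science', 'Software Engineering']
```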

Regarding your second question, jsfan's answer is right. Try Selenium and use its explicit wait feature to wait for the search results, which appear in divs with the class names Result master premium:

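from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
element = WebDriverWait(driver, 10).until(  # wait up to 10 s for a result div
    EC.presence_of_element_located((By.XPATH, "//div[contains(@class, 'Result master premium')]"))
)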

Solution 2:

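You have basically answered your own question. Beautiful Soup only parses the HTML it is given, and that HTML is whatever the server returns for a specific URL. It does not run the page's JavaScript, so results that the browser loads after the initial response never show up in the parsed document.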
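
A small sketch of why the div comes back empty, using made-up HTML rather than the real page. The `results` id and the script body are assumptions for illustration only.

```python
from bs4 import BeautifulSoup

# Made-up response: the server sends an empty container plus a script that
# would fill it in a real browser.
raw_html = """
<html><body>
  <div id="results"></div>
  <script>
    document.getElementById("results").textContent = "Some programme";
  </script>
</body></html>
"""

soup = BeautifulSoup(raw_html, "html.parser")
container = soup.find("div", id="results")

# The script is never executed, so the container is still empty.
print(repr(container.get_text()))  # ''
```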

If you want to render the page as it is shown in a browser, you will need something like Selenium WebDriver, which starts an actual browser and controls it remotely.
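
One possible way to combine the two tools is sketched below: Selenium renders the page, and Beautiful Soup parses the rendered HTML. The function names `extract_titles` and `fetch_rendered_titles` are hypothetical, and the sketch assumes Chrome with a matching driver is available, that result divs carry the class `Result`, and that titles sit in h3 tags.

```python
from bs4 import BeautifulSoup


def extract_titles(page_source):
    """Return the stripped text of every h3 in the given HTML."""
    soup = BeautifulSoup(page_source, "html.parser")
    return [h3.get_text(strip=True) for h3 in soup.find_all("h3")]


def fetch_rendered_titles(url, timeout=10):
    """Render url in a real browser, wait for results, then parse them."""
    # Imported here so extract_titles() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome and its driver are installed
    try:
        driver.get(url)
        # Assumption: each search result is a div whose class list includes "Result".
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "div.Result"))
        )
        # page_source now contains the HTML after JavaScript has run.
        return extract_titles(driver.page_source)
    finally:
        driver.quit()  # always close the browser, even if the wait times out
```

Calling `fetch_rendered_titles()` with the search URL from the question should return the result titles once they have loaded, provided the assumptions about the markup hold.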

WebDriver is very powerful, but it has a much steeper learning curve than plain web scraping.

If you want to get into using WebDriver with Python, the official documentation is a good place to start.
