Web Scraping Articles

  • How to web scrape iframes with scrapy - Pages with iframes can be scraped with Scrapy if you use a separate URL to access the data inside the iframe. You need to identify the name of the page the iframe loads and then append it to your base URL to give the Scrapy spider a second URL to visit. […]
  • How to scrape iframes - If you are scraping a website with pop-up messages asking you to accept cookies, these can prevent your scraper from continuing to the pages you want to scrape. How do you get past them? Using Selenium, you need to switch to the iframe (which you can identify using browser tools / inspect […]
  • Comparing values in SQL against previously scraped data - If you have scraped data on more than one occasion and want to check whether a value in a column has changed since the previous scrape, you could use this: We now know that since the last time we scraped the site, only one of our claims has been updated by “them”. This has been […]
  • Parsing Scraped Data with Pandas - Once you have some data, you’ll need to find the relevant parts and use them for further analysis. Rather than use the proverbial Excel sheet, you can write code to automate the task. Consider the following: The following code will match where the row contains “Dijbouti” and return the value that is in the […]
  • Debugging Python Code - & using Snippets. This article will cover how to achieve the following using Python: reduce the need for print statements; reduce the need to type repetitive code. Why not automate some of your coding to allow you to automate the ‘boring stuff’? Anything that you do regularly could be saved as a snippet […]
  • Scrapy response.meta - Capture your start URLs in your output with Scrapy response.meta. Every web scraping project has aspects that are different or interesting and worth remembering for future use. This is a look at a recent real-world project and at saving more than one start URL in the output. This assumes basic knowledge of web scraping, […]
  • Add data to a database from a CSV using Python, Pandas, and SQL - Do you have a CSV file and a database to transfer the data into? Does the data need to go into your database, and do you want to automate this? For instance, have you been web scraping, extracted publicly available data, and now need to store it in a database table? This guide will show […]
  • regex examples - Sooner or later you will need to resort to regular expressions if you are scraping text. This article will show you some useful Python (version 3) examples which you can use over and over again. We’ll be using “re.search” and “group”. Example 1 – parse a paragraph of text for the number of floors, […]
  • Back Up MySQL - Back up your MySQL database (on a Raspberry Pi). Once in production, you will need to ensure you have a copy of your data, for reasons we can all identify with: hardware failure, data corruption, and human error. So, before getting too far into a web scraping project using Scrapy with MySQL, let’s spare […]
  • Get started with Pandas and MySQL - How to make a dataframe from your SQL database and query the data. This assumes you already have a sample database set up on your MySQL server and that you have the username and password. In the example shown, we are logging on to a Raspberry Pi running MariaDB and executing a query to […]