Comparing values in SQL against previously scraped data

If you have scraped data on more than one occasion and stored the previous and current scrape in the same table, you can check whether a value in a column has changed since the previous scrape with this query:

select col1, col2
from TABLENAME
group by col1, col2
having count(col2) < 2

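This returns every (col1, col2) pair that appears fewer than twice in the table. With two scrapes stored together, an unchanged row appears twice and is filtered out, so what remains are the rows where col2 differs between the scrapes (plus any row that exists in only one of them).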
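To see the query run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows, the in-memory database and the helper name unmatched_rows are assumptions made for illustration; only the query itself comes from the text above.

```python
import sqlite3


def unmatched_rows(rows):
    """Return the (col1, col2) pairs that occur fewer than twice in rows."""
    # Hypothetical in-memory table standing in for TABLENAME.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table TABLENAME (col1 text, col2 text)")
    conn.executemany("insert into TABLENAME values (?, ?)", rows)
    result = conn.execute(
        "select col1, col2 from TABLENAME "
        "group by col1, col2 "
        "having count(col2) < 2 "
        "order by col1, col2"  # added only to make the output order stable
    ).fetchall()
    conn.close()
    return result


# Made-up sample data: row B changes from "open" to "paid" between scrapes.
previous_scrape = [("A", "open"), ("B", "open"), ("C", "closed")]
current_scrape = [("A", "open"), ("B", "paid"), ("C", "closed")]

changed = unmatched_rows(previous_scrape + current_scrape)
print(changed)  # [('B', 'open'), ('B', 'paid')]
```

Note that the changed row is listed twice, once with its old value and once with its new one.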

Let’s put some real names into this:

select CLAIMNO, STATUSID
from PREV
group by CLAIMNO, STATUSID
having count(STATUSID) < 2

Our results show that, across the entire comparison, only one claim has had a status change.

We now know that since the last time we scraped the site, only one of our claims has been updated by “them”.

This has been achieved by using “group by”: the table holds many records, and we have picked out the CLAIMNO and STATUSID pairs that do not occur twice, which means the claims whose STATUSID is not the same in both scrapes.
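A changed claim comes back twice from this query (old status and new status), while a claim that exists in only one scrape comes back once, so it can help to count the returned rows per CLAIMNO. The sketch below is one possible way to do that, assuming each scrape holds exactly one row per claim. The sample data, the in-memory database and the helper name changed_and_unpaired are hypothetical.

```python
import sqlite3

# The query from the article, reused as a subquery below.
UNMATCHED = (
    "select CLAIMNO, STATUSID from PREV "
    "group by CLAIMNO, STATUSID having count(STATUSID) < 2"
)


def changed_and_unpaired(previous, current):
    """Split unmatched claims into changed ones and ones seen in one scrape only."""
    # Hypothetical in-memory table standing in for PREV.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table PREV (CLAIMNO integer, STATUSID integer)")
    conn.executemany("insert into PREV values (?, ?)", previous + current)
    counts = conn.execute(
        f"select CLAIMNO, count(*) from ({UNMATCHED}) as unmatched "
        "group by CLAIMNO order by CLAIMNO"
    ).fetchall()
    conn.close()
    # Two unmatched rows: old and new status. One row: claim in a single scrape.
    changed = [claim for claim, n in counts if n == 2]
    unpaired = [claim for claim, n in counts if n == 1]
    return changed, unpaired


# Made-up data: claim 1002 changes status, claim 1003 is new in the current scrape.
previous = [(1001, 1), (1002, 1)]
current = [(1001, 1), (1002, 2), (1003, 1)]

changed, unpaired = changed_and_unpaired(previous, current)
print(changed, unpaired)  # [1002] [1003]
```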

Summary

We have used SQL to identify where a column value has changed since the previous time we checked, by grouping on two columns and keeping the groups whose count is less than 2.

If you would like more information, please contact me via the contact page.

See also: https://stackoverflow.com/questions/1786533/find-rows-that-have-the-same-value-on-a-column-in-mysql#1786568
