A common task is to track competitors' prices and use that information as a guide to the prices you can charge; or, if you are buying, to spot when a product reaches a new lowest price. The purpose of this article is to describe how to web scrape Amazon.
Using Python, Scrapy, MySQL, and Matplotlib, you can extract large amounts of data, query it, and produce meaningful visualizations.
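Before the scraped prices go into MySQL, the raw price text needs cleaning. As a minimal sketch (the `parse_price` helper is hypothetical, not part of the project's code), something like this handles currency symbols and thousands separators:

```python
import re

def parse_price(text):
    """Convert a scraped price string such as '$1,299.99' to a float.

    Hypothetical helper: Amazon price text usually includes a currency
    symbol and thousands separators that must be stripped before the
    value can be stored as a number in MySQL.
    """
    match = re.search(r"[\d,]+(?:\.\d+)?", text)
    if not match:
        return None  # e.g. "Currently unavailable"
    return float(match.group(0).replace(",", ""))

print(parse_price("$39.99"))     # 39.99
print(parse_price("£1,299.00"))  # 1299.0
```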
In the example featured, we wanted to identify which Amazon books related to “web scraping” had been reduced in price over the period we had been running the spider.
If you want to run your spider daily, see the video for instructions on how to schedule a spider with CRON on a Linux server.
Procedure used for price tracking
query = '''select amzbooks2.* from
(select *,
lag(price) over (partition by title order by posted) as prev_price
from amzbooks2) amzbooks2
where prev_price <> price'''
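The query uses the `lag()` window function to pair each price with the previous price posted for the same title, then keeps only the rows where the two differ. The article runs it against MySQL, but the same logic can be exercised locally with Python's built-in sqlite3 module (which also supports window functions); the sample rows below are illustrative, with the `title`/`price`/`posted` columns taken from the query:

```python
import sqlite3

# In-memory stand-in for the MySQL table used in the article.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table amzbooks2 (title text, price real, posted text)")
cur.executemany(
    "insert into amzbooks2 values (?, ?, ?)",
    [
        ("Web Scraping with Python", 39.99, "2021-06-01"),
        ("Web Scraping with Python", 34.99, "2021-06-08"),  # price drop
        ("Learning Scrapy", 29.99, "2021-06-01"),
        ("Learning Scrapy", 29.99, "2021-06-08"),           # unchanged
    ],
)

query = '''select amzbooks2.* from
(select *,
lag(price) over (partition by title order by posted) as prev_price
from amzbooks2) amzbooks2
where prev_price <> price'''

# Only rows whose price changed since the previous posting survive;
# the subquery alias means prev_price is included in the output columns.
rows = cur.execute(query).fetchall()
print(rows)
```

Note that the first posting of each title has a NULL `prev_price`, so `prev_price <> price` is not true and the row is excluded, which is exactly the behavior wanted here.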
Visualize the stored data using Python and Matplotlib
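As a rough sketch of this step (the sample rows are illustrative; in practice they would come from the MySQL query above), the price history of each title can be plotted as a line per book:

```python
from collections import defaultdict

import matplotlib
matplotlib.use("Agg")  # render without a display, e.g. on a server
import matplotlib.pyplot as plt

# Illustrative (title, price, posted) rows standing in for query results.
rows = [
    ("Web Scraping with Python", 39.99, "2021-06-01"),
    ("Web Scraping with Python", 34.99, "2021-06-08"),
    ("Learning Scrapy", 29.99, "2021-06-01"),
    ("Learning Scrapy", 29.99, "2021-06-08"),
]

# Group the rows into one price series per book title.
series = defaultdict(list)
for title, price, posted in rows:
    series[title].append((posted, price))

fig, ax = plt.subplots()
for title, points in series.items():
    dates, prices = zip(*sorted(points))
    ax.plot(dates, prices, marker="o", label=title)

ax.set_xlabel("Date posted")
ax.set_ylabel("Price ($)")
ax.set_title("Amazon book prices over time")
ax.legend()
fig.savefig("prices.png")
```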
All of the code is on GitHub