Python Code

June 3, 2020 / Last updated : June 3, 2020 admin Python Code

Scrapy : Yield

Here we see the code for “parse”, with “yield” being used three times:

1. Go and fetch the geo data
2. Go and fill the container fields in items.py
3. Now go and find the next page of (120) listings

The 2nd ‘yield’ has no URL to go to. Once every one of the “ads in all_ads” has had its values gathered and sent to items, the for loop ends, the pagination code checks for a next page, and the code goes on to get the next bunch of listings to process.
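To make the order of the three yields easier to follow, here is a minimal plain-Python sketch of the same shape. It is not the spider's actual code and does not import Scrapy: `FakeRequest`, the `page` dictionary and the example.com URLs are all stand-ins we made up for illustration. In a real spider the first and third yields would be Scrapy requests and the second would be the item.

```python
from dataclasses import dataclass
from typing import Iterator, Union


@dataclass
class FakeRequest:
    """Hypothetical stand-in for a Scrapy request: a URL plus a callback name."""
    url: str
    callback: str


def parse(page: dict) -> Iterator[Union[FakeRequest, dict]]:
    for ad in page["ads"]:
        # 1st yield: go and fetch the geo data from the detail page
        yield FakeRequest(ad["detail_url"], "parse_detail")
        # 2nd yield: hand over the filled-in item (no URL to go to)
        yield {"title": ad["title"], "price": ad["price"]}
    # 3rd yield: only reached once the for loop has ended
    next_page = page.get("next_page")
    if next_page:
        yield FakeRequest(next_page, "parse")


# Made-up listing page with two ads and a link to the next page
page = {
    "ads": [
        {"title": "Flat A", "price": 1000, "detail_url": "https://example.com/ad/1"},
        {"title": "Flat B", "price": 1200, "detail_url": "https://example.com/ad/2"},
    ],
    "next_page": "https://example.com/listings?page=2",
}

results = list(parse(page))
for result in results:
    print(type(result).__name__, result)
```

Each ad produces a request followed by an item, and the pagination request comes last, after every ad has been handled.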

parse_detail

We’ve already covered what this does: it gets the “longitude” and “latitude” from the ‘detail’ page for the property. As we can see below, there is no ‘yield’ required, and self.lon and self.lat get their values on each and every iteration of ‘for ads in all_ads:’
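As a rough sketch of a callback that stores values on the instance instead of yielding, something like the following would do. The class name `GeoHolder`, the regular expressions and the shape of the detail page text are all assumptions for illustration; only self.lat and self.lon come from the post, and the real detail page will need its own selectors.

```python
import re
from types import SimpleNamespace


class GeoHolder:
    """Hypothetical stand-in for the spider; only the geo-related parts are shown."""

    def __init__(self):
        self.lat = None
        self.lon = None

    def parse_detail(self, response):
        # No yield here: the method only stores the values on the instance.
        # The '"latitude": <number>' pattern is an assumed page format.
        lat = re.search(r'"latitude":\s*(-?\d+(?:\.\d+)?)', response.text)
        lon = re.search(r'"longitude":\s*(-?\d+(?:\.\d+)?)', response.text)
        self.lat = float(lat.group(1)) if lat else None
        self.lon = float(lon.group(1)) if lon else None


# Made-up response object exposing a .text attribute
response = SimpleNamespace(text='{"latitude": 46.2044, "longitude": 6.1432}')
spider = GeoHolder()
result = spider.parse_detail(response)
print(result, spider.lat, spider.lon)
```

Because the method has no yield and no return statement, calling it returns None and the only effect is the updated attributes.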

main

Above we can see the class ‘RealestateSpider’ being instantiated and then FEEDS being assigned a path and format.

crawl specifies the class to use, and start does what it says!
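For reference, a minimal sketch of this kind of “main” could look like the block below. The output path "output.csv" and the "csv" format are assumptions, as are the names `SETTINGS` and `run`; the spider class is passed in as a parameter so the sketch does not depend on the spider from this post. Scrapy is imported inside the function, so Scrapy must be installed before `run` is called.

```python
# Assumed output path and format; FEEDS maps each output path to its options
SETTINGS = {
    "FEEDS": {
        "output.csv": {"format": "csv"},
    },
}


def run(spider_cls):
    """Run one spider class in-process and block until the crawl has finished."""
    from scrapy.crawler import CrawlerProcess

    process = CrawlerProcess(settings=SETTINGS)
    process.crawl(spider_cls)  # crawl specifies the class to use
    process.start()            # start runs the crawl and blocks until it is done
```

Calling `run` with your spider class should start the crawl and write the scraped items to the path given in FEEDS.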

Conclusion

We hope this has been a useful explanation and example of using ‘yield’ more than once in a method/function.

You may see “return” used in some examples but from experience “yield” is more robust (where you have a choice).

‘FEEDS’ was a new way of saving the output for us; previously we’ve used “FEED_FORMAT” and “FEED_URI”. Both ways work, but FEEDS is the newer setting (added in Scrapy 2.1), and FEED_FORMAT and FEED_URI are deprecated in its favour.
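To show how the two styles relate, here is a small sketch that builds the FEEDS form from an old-style path and format. The helper name `to_feeds` and the "output.csv" / "csv" values are made up for illustration; Scrapy itself does not provide this helper.

```python
def to_feeds(feed_uri: str, feed_format: str) -> dict:
    """Hypothetical helper: express an old-style URI and format as a FEEDS setting."""
    return {"FEEDS": {feed_uri: {"format": feed_format}}}


# Old style: one setting for the path and one for the format
old_style = {"FEED_URI": "output.csv", "FEED_FORMAT": "csv"}

# New style: a single FEEDS dictionary keyed by output path
new_style = to_feeds(old_style["FEED_URI"], old_style["FEED_FORMAT"])
print(new_style)
```

Since FEEDS is keyed by output path, more than one output can be listed in the same dictionary, which the two older settings could not express.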

The YouTube video will show this in action and will appear here soon!

Thanks for reading! ✅

Categories: Python Code and Scrapy
