1. Does Scrapy Crawl In Breadth-first Or Depth-first Order?

Answer:

By default, Scrapy uses a LIFO queue for storing pending requests, which basically means that it crawls in depth-first order (DFO). This order is more convenient in most cases. If you do want to crawl in true breadth-first order (BFO), you can do so with the following settings:

DEPTH_PRIORITY = 1

SCHEDULER_DISK_QUEUE = 'scrapy.squeues.PickleFifoDiskQueue'

SCHEDULER_MEMORY_QUEUE = 'scrapy.squeues.FifoMemoryQueue'
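To see why the queue type decides the crawl order, here is a minimal sketch that does not use Scrapy at all. The link graph `LINKS` and the function `crawl_order` are hypothetical names made up for this illustration, and the sketch ignores request priorities, which Scrapy's real scheduler also takes into account.

```python
from collections import deque

# Hypothetical link graph for illustration: page -> links found on that page.
LINKS = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": [],
    "E": [],
    "F": [],
}


def crawl_order(start, fifo):
    """Return the order in which pages are visited.

    fifo=True takes the oldest pending request first (breadth-first).
    fifo=False takes the newest pending request first (depth-first).
    """
    pending = deque([start])
    seen = {start}
    order = []
    while pending:
        # The only difference between the two orders is which end we pop from.
        page = pending.popleft() if fifo else pending.pop()
        order.append(page)
        for link in LINKS[page]:
            if link not in seen:
                seen.add(link)
                pending.append(link)
    return order


if __name__ == "__main__":
    print("LIFO (depth-first):  ", crawl_order("A", fifo=False))
    print("FIFO (breadth-first):", crawl_order("A", fifo=True))
```

With the LIFO queue the toy crawler follows one branch to the end before going back to the other, while with the FIFO queue it visits every page at one depth before moving to the next depth.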
