Interview Solutions

This section collects frequently asked Scrapy interview questions, each with a concise answer, to sharpen your knowledge and support exam preparation. Pick a question below to get started.

1. Should I Use Spider Arguments Or Settings To Configure My Spider?

Answer» Both spider arguments and settings can be used to configure your spider. There is no strict rule that mandates using one or the other, but settings are better suited for parameters that, once set, don’t change much, while spider arguments are meant to change more often, even on each spider run, and sometimes are required for the spider to run at all (for example, to set the start URL of a spider). To illustrate, suppose you have a spider that needs to log into a site to scrape data, and you only want to scrape data from a certain section of the site (which varies each time). In that case, the credentials to log in would be settings, while the URL of the section to scrape would be a spider argument.

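A minimal sketch of that split, assuming hypothetical setting names (SITE_USER, SITE_PASSWORD), a hypothetical login form, and an illustrative spider name:

```python
import scrapy
from scrapy.http import FormRequest


class SectionSpider(scrapy.Spider):
    name = "section"

    def __init__(self, section_url=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # The section changes on every run, so it arrives as a spider argument.
        self.section_url = section_url

    def start_requests(self):
        # Credentials are stable, so they live in settings (illustrative names).
        user = self.settings.get("SITE_USER")
        password = self.settings.get("SITE_PASSWORD")
        yield FormRequest(
            "https://example.com/login",
            formdata={"user": user, "pass": password},
            callback=self.after_login,
        )

    def after_login(self, response):
        yield scrapy.Request(self.section_url, callback=self.parse_section)

    def parse_section(self, response):
        yield {"title": response.css("h1::text").get()}
```

It could then be run with something like: scrapy crawl section -a section_url=https://example.com/some-section -s SITE_USER=me -s SITE_PASSWORD=secret
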
2. How Can I Instruct A Spider To Stop Itself?

Answer» Raise the CloseSpider exception from a callback.

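For example, a minimal sketch of a spider that closes itself from a callback (the URL and the stop condition are illustrative):

```python
import scrapy
from scrapy.exceptions import CloseSpider


class MySpider(scrapy.Spider):
    name = "myspider"
    start_urls = ["https://example.com"]

    def parse(self, response):
        links = response.css("a::attr(href)").getall()
        if not links:
            # The reason string shows up in the crawl stats and the log.
            raise CloseSpider("no more links to follow")
        for href in links:
            yield response.follow(href, callback=self.parse)
```
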
3. How Can I See The Cookies Being Sent And Received From Scrapy?

Answer» Enable the COOKIES_DEBUG setting.

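For example, in your project's settings.py (assuming a standard project layout):

```python
# settings.py -- log every Cookie header sent and Set-Cookie header received
COOKIES_DEBUG = True
```
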
4. Does Scrapy Manage Cookies Automatically?

Answer» Yes, Scrapy receives and keeps track of cookies sent by servers, and sends them back on subsequent requests, like any regular web browser does.

5. What’s The Best Way To Parse Big Xml/csv Data Feeds?

Answer» Parsing big feeds with XPath selectors can be problematic since they need to build the DOM of the entire feed in memory, and this can be quite slow and consume a lot of memory. To avoid parsing the entire feed at once in memory, you can use the functions xmliter and csviter from the scrapy.utils.iterators module. In fact, this is what the feed spiders use under the hood.

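A minimal sketch that streams nodes out of a large XML feed with xmliter (the feed URL, node name, and extracted fields are illustrative):

```python
import scrapy
from scrapy.utils.iterators import xmliter


class FeedSpider(scrapy.Spider):
    name = "feed"
    start_urls = ["https://example.com/big-feed.xml"]

    def parse(self, response):
        # xmliter yields one Selector per <item> node, without building
        # the DOM of the whole feed in memory.
        for node in xmliter(response, "item"):
            yield {
                "title": node.xpath("./title/text()").get(),
                "link": node.xpath("./link/text()").get(),
            }
```
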
6. What’s This Huge Cryptic __viewstate Parameter Used In Some Forms?

Answer» The __VIEWSTATE parameter is used in sites built with ASP.NET/VB.NET.

7. Simplest Way To Dump All My Scraped Items Into A Json/csv/xml File?

Answer» To dump into a JSON file: scrapy crawl myspider -o items.json
To dump into a CSV file: scrapy crawl myspider -o items.csv
To dump into an XML file: scrapy crawl myspider -o items.xml

8. Can I Call Pdb.set_trace() From My Spiders To Debug Them?

Answer» Yes, but you can also use the Scrapy shell, which allows you to quickly analyze (and even modify) the response being processed by your spider, which is, quite often, more useful than plain old pdb.set_trace().

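For example, you can open the Scrapy shell on the exact response a callback is handling with inspect_response; a minimal sketch (the URL and the trigger condition are illustrative):

```python
import scrapy
from scrapy.shell import inspect_response


class DebugSpider(scrapy.Spider):
    name = "debug"
    start_urls = ["https://example.com"]

    def parse(self, response):
        if not response.css("h1"):
            # Opens an interactive shell with this response already loaded.
            inspect_response(response, self)
        yield {"title": response.css("h1::text").get()}
```
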
9. What Does The Response Status Code 999 Mean?

Answer» 999 is a custom response status code used by Yahoo sites to throttle requests. Try slowing down the crawling speed by using a download delay of 2 (or higher) in your spider, or by setting a global download delay in your project with the DOWNLOAD_DELAY setting.

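A per-spider delay can be set with the download_delay attribute; a minimal sketch (the spider name is illustrative):

```python
from scrapy.spiders import CrawlSpider


class MySpider(CrawlSpider):
    name = "myspider"
    download_delay = 2  # seconds to wait between requests to the same site

# Alternatively, set DOWNLOAD_DELAY = 2 in settings.py for the whole project.
```
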
10. Can I Return (twisted) Deferreds From Signal Handlers?

Answer» Some signals support returning deferreds from their handlers, others don’t.

11. Can I Use Json For Large Exports?

Answer» It’ll depend on how large your output is.

12. I Get “filtered Offsite Request” Messages. How Can I Fix Them?

Answer» Those messages (logged with DEBUG level) don’t necessarily mean there is a problem, so you may not need to fix them. They are logged by the Offsite Spider Middleware, a spider middleware (enabled by default) whose purpose is to filter out requests to domains outside the ones covered by the spider.

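For context, a minimal sketch of the spider attribute the offsite filter works from (the domain and URLs are illustrative):

```python
import scrapy


class ShopSpider(scrapy.Spider):
    name = "shop"
    allowed_domains = ["example.com"]  # requests outside this domain are filtered
    start_urls = ["https://example.com/catalog"]

    def parse(self, response):
        # Links pointing to other domains trigger the DEBUG message unless the
        # domain is added to allowed_domains or the request sets dont_filter=True.
        for href in response.css("a::attr(href)").getall():
            yield response.follow(href, callback=self.parse)
```
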
13. Can I Run A Spider Without Creating A Project?

Answer» Yes. You can use the runspider command. For example, if you have a spider written in a my_spider.py file you can run it with: scrapy runspider my_spider.py

14. Why Does Scrapy Download Pages In English Instead Of My Native Language?

Answer» Try changing the default Accept-Language request header by overriding the DEFAULT_REQUEST_HEADERS setting.

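For example, in settings.py (the language code is illustrative):

```python
# settings.py -- minimal sketch: ask for French pages by default
DEFAULT_REQUEST_HEADERS = {
    "Accept-Language": "fr",
}
```
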
15. Can I Use Basic Http Authentication In My Spiders?

Answer» Yes.

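As a minimal sketch, the built-in HttpAuthMiddleware picks up credentials from spider attributes (the credentials, spider name, and URL are illustrative):

```python
import scrapy


class IntranetSpider(scrapy.Spider):
    name = "intranet"
    http_user = "someuser"  # read by HttpAuthMiddleware for HTTP Basic auth
    http_pass = "somepass"
    start_urls = ["https://intranet.example.com/"]

    def parse(self, response):
        yield {"title": response.css("title::text").get()}
```
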
16. Does Scrapy Crawl In Breadth-first Or Depth-first Order?

Answer» By default, Scrapy uses a LIFO queue for storing pending requests, which basically means that it crawls in DFO order. This order is more convenient in most cases. If you do want to crawl in true BFO order, you can do it by setting the following settings:

DEPTH_PRIORITY = 1
SCHEDULER_DISK_QUEUE = 'scrapy.squeues.PickleFifoDiskQueue'
SCHEDULER_MEMORY_QUEUE = 'scrapy.squeues.FifoMemoryQueue'

17. Does Scrapy Work With Http Proxies?

Answer» Yes. Support for HTTP proxies is provided (since Scrapy 0.8) through the HTTP Proxy downloader middleware. See HttpProxyMiddleware.

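A minimal sketch of a per-request proxy set through the request meta, which HttpProxyMiddleware honours (the proxy address and URL are illustrative; the middleware also picks up the standard http_proxy/https_proxy environment variables):

```python
import scrapy


class ProxySpider(scrapy.Spider):
    name = "proxied"

    def start_requests(self):
        yield scrapy.Request(
            "https://example.com",
            meta={"proxy": "http://127.0.0.1:8080"},  # route this request via a proxy
        )

    def parse(self, response):
        yield {"status": response.status}
```
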
18. Did Scrapy “steal” X From Django?

Answer» Probably, but we don’t like that word. We think Django is a great open source project and an example to follow, so we’ve used it as an inspiration for Scrapy. We believe that, if something is already done well, there’s no need to reinvent it. This concept, besides being one of the foundations for open source and free software, not only applies to software but also to documentation, procedures, policies, etc. So, instead of going through each problem ourselves, we choose to copy ideas from those projects that have already solved them properly, and focus on the real problems we need to solve.

19. What Python Versions Does Scrapy Support?

Answer» Scrapy is supported under Python 2.7 and Python 3.3+. Python 2.6 support was dropped starting at Scrapy 0.20. Python 3 support was added in Scrapy 1.1.

20. Can I Use Scrapy With Beautifulsoup?

Answer» Yes, you can. As mentioned above, BeautifulSoup can be used for parsing HTML responses in Scrapy callbacks. You just have to feed the response’s body into a BeautifulSoup object and extract whatever data you need from it. Here’s an example spider using the BeautifulSoup API, with lxml as the HTML parser:

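A completed version of that example, as a minimal sketch (the spider name, start URL, and extracted fields are illustrative):

```python
from bs4 import BeautifulSoup
import scrapy


class ExampleSpider(scrapy.Spider):
    name = "example"
    start_urls = ["https://example.com/"]

    def parse(self, response):
        # Hand the response body to BeautifulSoup, using lxml for fast parsing.
        soup = BeautifulSoup(response.text, "lxml")
        yield {
            "url": response.url,
            "title": soup.h1.get_text() if soup.h1 else None,
        }
```
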
21. How Does Scrapy Compare To Beautifulsoup Or Lxml?

Answer» BeautifulSoup and lxml are libraries for parsing HTML and XML. Scrapy is an application framework for writing web spiders that crawl web sites and extract data from them. Scrapy provides a built-in mechanism for extracting data (called selectors) but you can easily use BeautifulSoup (or lxml) instead, if you feel more comfortable working with them. After all, they’re just parsing libraries which can be imported and used from any Python code. In other words, comparing BeautifulSoup (or lxml) to Scrapy is like comparing Jinja2 to Django.
