InterviewSolution
1. What's the best way to parse big XML/CSV data feeds?
Answer» Parsing big feeds with XPath selectors can be problematic, since they need to build the DOM of the entire feed in memory, which can be quite slow and consume a lot of memory. To avoid parsing the entire feed in memory at once, you can use the xmliter and csviter functions from the scrapy.utils.iterators module. In fact, this is what the feed spiders use under the hood.
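The same streaming idea behind xmliter and csviter can be sketched with Python's standard library alone. The helper names below are illustrative, not Scrapy's API; the point is that nodes and rows are yielded one at a time instead of loading the whole feed into a DOM:

```python
import csv
import io
import xml.etree.ElementTree as ET

def iter_xml_nodes(fileobj, nodename):
    """Yield elements named `nodename` one at a time, clearing each
    after use so the full DOM is never held in memory."""
    for event, elem in ET.iterparse(fileobj, events=("end",)):
        if elem.tag == nodename:
            yield elem
            elem.clear()  # release the subtree we just processed

def iter_csv_rows(fileobj):
    """Yield each CSV row as a dict keyed by the header row."""
    yield from csv.DictReader(fileobj)

# Small in-memory feed for demonstration; in practice fileobj
# would be an open file or a streamed response body.
feed = io.BytesIO(b"<items><item>a</item><item>b</item></items>")
names = [el.text for el in iter_xml_nodes(feed, "item")]
```

In a Scrapy spider you would instead pass the response and node name to xmliter (or the response to csviter) and get selectors or row dicts back incrementally, which keeps memory usage flat regardless of feed size.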