Hey guys,
Can someone help me scrape this page please: http://director.flyerservices.com/SOB/default.aspx?banner=Sobeys&pubtype=1&language=en&view=Text&storeNumber=743
The page seems to redirect to another page and my PHP result from cURL says "Object moved here".
Can someone please PROVIDE me with a solution on how to scrape this page?
Thank you in advance,
- ultimatum

I always feel slightly uneasy about helping out with screen scraping. There are plenty of legitimate use cases, of course, and I am not for one second suggesting that your intentions are dark, but the intentions of many reading this thread may be less scrupulous.
So let's keep this to helpful pointers.
Firstly, I'd strongly recommend reading up on curl_setopt(): http://uk3.php.net/manual/en/function.curl-setopt.php
Pay particular attention to the myriad options and the effects they have. HINT: one of my web apps uses the following, but these of course may not be appropriate for your needs:
Code:
    curl_setopt($this->_ch, CURLOPT_POST, 1);
    curl_setopt($this->_ch, CURLOPT_FOLLOWLOCATION, false);
    curl_setopt($this->_ch, CURLOPT_COOKIEJAR, '/tmp/cookies/cookie_'.$cookie_serial.'.txt');
    curl_setopt($this->_ch, CURLOPT_HEADER, 1);
    curl_setopt($this->_ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($this->_ch, CURLOPT_TIMEOUT, 25);
    curl_setopt($this->_ch, CURLOPT_CONNECTTIMEOUT, 25);
    curl_setopt($this->_ch, CURLOPT_SSL_VERIFYPEER, false); // cURL having problems with CA certificates
Secondly, get yourself a copy of Fiddler. Understand what it does. It will help you immensely with tricky web transactions: http://www.fiddler2.com/Fiddler2/
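For what it's worth, "Object moved" is usually the body of an HTTP redirect (3xx) response, so the page you want most likely sits behind a Location header. Below is a rough sketch of two ways to deal with that, not a tested solution for this particular site. The function names, the cookie file path and the redirect limit are my own assumptions; only the curl_setopt() options are standard. The first function lets cURL follow the redirect itself and keeps cookies between hops; the second pulls the Location header out of a raw response, for the case where CURLOPT_HEADER is on and CURLOPT_FOLLOWLOCATION is off, as in the settings quoted above.

```php
<?php
// Hypothetical helper: fetch a URL, letting cURL follow redirects itself.
// $cookieFile is an assumed writable path, e.g. '/tmp/cookies/cookie_1.txt'.
function fetch_following_redirects(string $url, string $cookieFile): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);     // follow 3xx Location headers
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);             // assumed limit, guards against redirect loops
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);   // save cookies when the handle is closed
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);  // enable the cookie engine so cookies are resent
    curl_setopt($ch, CURLOPT_TIMEOUT, 25);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 25);

    $body = curl_exec($ch);
    if ($body === false) {
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException('cURL error: ' . $error);
    }
    curl_close($ch);
    return $body;
}

// Hypothetical helper: extract the Location header from a raw response
// (headers plus body), for handling the redirect manually.
function extract_location(string $rawResponse): ?string
{
    // The header section ends at the first blank line.
    $parts = preg_split("/\r?\n\r?\n/", $rawResponse, 2);
    if (preg_match('/^Location:\s*(.+?)\s*$/mi', $parts[0], $m)) {
        return $m[1];
    }
    return null; // no redirect target in the headers
}
```

If you go the manual route, remember the Location value may be a relative path, in which case you need to resolve it against the original URL before requesting it. Watching the exchange in Fiddler will show you exactly which headers and cookies the browser sends on the second request.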