PHP - Help Scrape a page?

Hey guys,

Can someone help me scrape this page please: http://director.flyerservices.com/SOB/default.aspx?banner=Sobeys&pubtype=1&language=en&view=Text&storeNumber=743

The page seems to redirect to another page, and all I get back from cURL in PHP is "Object moved here".

Can someone please provide me with a solution on how to scrape this page?

Thank you in advance,


- ultimatum

I always feel slightly uneasy about helping out with screen scraping. There are plenty of legitimate use cases, of course, and I am not for one second suggesting that your intentions are dark, but the intentions of many reading this thread may be less scrupulous.

So let's keep this to helpful pointers.

Firstly, I'd strongly recommend reading up on curl_setopt() (http://uk3.php.net/manual/en/function.curl-setopt.php). Pay particular attention to the myriad options and the effects they have. Hint: one of my web apps uses the following, though these may of course not be appropriate for your needs:

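Code:
curl_setopt($this->_ch, CURLOPT_POST, 1);                 // send the request as a POST
curl_setopt($this->_ch, CURLOPT_FOLLOWLOCATION, false);   // do not follow redirects automatically
curl_setopt($this->_ch, CURLOPT_COOKIEJAR, '/tmp/cookies/cookie_'.$cookie_serial.'.txt'); // save received cookies to this file
curl_setopt($this->_ch, CURLOPT_HEADER, 1);               // include response headers in the output
curl_setopt($this->_ch, CURLOPT_RETURNTRANSFER, 1);       // return the response as a string instead of printing it
curl_setopt($this->_ch, CURLOPT_TIMEOUT, 25);             // maximum seconds for the whole request
curl_setopt($this->_ch, CURLOPT_CONNECTTIMEOUT, 25);      // maximum seconds to wait for the connection
curl_setopt($this->_ch, CURLOPT_SSL_VERIFYPEER, false);   // skips certificate verification (cURL having problems with CA certificates)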
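
For what it's worth, an "Object moved" body is what a redirect response (a 302, for example) typically looks like when cURL is not told to follow it. Below is a minimal sketch of two ways to deal with that, assuming the target page needs nothing more than a plain GET with cookies: either let cURL follow the redirect, or read the Location header yourself. The function names, the redirect limit of 5, and the cookie file argument are illustrative assumptions, not part of the setup quoted above, and this has not been run against the page in question.

```php
<?php
// Hypothetical helper: pull the redirect target out of raw response headers.
// Only useful when CURLOPT_HEADER is on and CURLOPT_FOLLOWLOCATION is off.
function extractLocation(string $rawHeaders): ?string
{
    // Match a "Location:" header line, case-insensitively, anywhere in the headers.
    if (preg_match('/^Location:[ \t]*(\S+)/mi', $rawHeaders, $m)) {
        return $m[1];
    }
    return null; // no redirect header present
}

// Hypothetical helper: fetch a URL and let cURL follow any redirects itself.
function fetchFollowingRedirects(string $url, string $cookieFile): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow Location headers
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);          // assumed limit, guards against redirect loops
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // write cookies received along the way
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // send them back on later requests
    curl_setopt($ch, CURLOPT_TIMEOUT, 25);

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}
```

Calling fetchFollowingRedirects() with the page URL and a writable cookie file path should return the body of whatever page the redirect chain ends on. If that still is not the content you expect, extractLocation() on the raw headers of the first response shows where the server is trying to send you, which is usually the quickest clue as to what is missing (a cookie, a query parameter, and so on).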

Secondly, get yourself a copy of Fiddler. Understand what it does. It will help you immensely with tricky web transactions: http://www.fiddler2.com/Fiddler2/


