I love to experiment with web stuff. My new goal (sort of) is to make a web crawler that will go through a bunch of pages, find URLs, and link them.
I would like to make a web crawler that can collect a large number of URLs and save them in a simple text file.
How would I do this? (Not looking for anything fancy - I've got limited bandwidth, you know.)

I can't remember where you're up to - have you got PHP under your belt yet?
First thing to bear in mind: you are considering using an automated process to retrieve information from websites. Some webmasters would consider that an abuse of their bandwidth, never mind yours. You should really ensure that whatever code you write respects robots.txt files - but you're on your own there; I am not an expert when it comes to their syntax.
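If you want a bare-bones starting point for that, the sketch below fetches robots.txt and does a crude prefix check against the Disallow lines only - it ignores User-agent groups, Allow rules and wildcards, so treat it as an illustration, not a proper parser. The function name and URLs are placeholders of mine, and it assumes allow_url_fopen is enabled:

<?php
// Very rough robots.txt check - a sketch only. It downloads robots.txt,
// looks at every Disallow line regardless of which User-agent section it
// sits in, and reports a path as blocked if it starts with a disallowed
// prefix. Real rules (Allow, wildcards, per-agent groups) need more work.

function crudeRobotsAllows($baseUrl, $path)
{
    $robots = @file_get_contents(rtrim($baseUrl, '/') . '/robots.txt');
    if ($robots === false) {
        return true; // no robots.txt reachable; commonly treated as "allowed"
    }

    foreach (explode("\n", $robots) as $line) {
        if (preg_match('/^\s*Disallow:\s*(\S+)/i', $line, $m)
            && strpos($path, $m[1]) === 0) {
            return false;
        }
    }
    return true;
}

// Example call with placeholder values:
var_dump(crudeRobotsAllows('http://example.com', '/private/page.html'));
?>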
Have a look at the PHP filesystem reference - in particular fopen(), which can open a URL, not just a file. You can do more complex stuff with the streams library.

Looks a bit complicated...

You're not joking!
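Here's roughly what that approach can look like in practice - a minimal sketch that fetches one page with file_get_contents() (which goes through the same URL-aware stream wrappers as fopen()), pulls out the href values with a crude regex, and appends them to a text file. The start URL, output filename and regex are my own placeholders, and allow_url_fopen needs to be enabled in php.ini:

<?php
// Minimal sketch: fetch a single page, extract href links with a regex,
// and append them to a plain text file. $startUrl and $outFile are
// placeholders.

$startUrl = 'http://example.com/';
$outFile  = 'urls.txt';

// file_get_contents() uses the same URL-aware stream wrappers as fopen().
$html = @file_get_contents($startUrl);
if ($html === false) {
    die("Could not fetch $startUrl\n");
}

// Crude extraction: grab anything inside href="..." or href='...'.
// DOMDocument/DOMXPath would be more robust for real-world HTML.
preg_match_all('/href=["\']([^"\']+)["\']/i', $html, $matches);
$urls = array_unique($matches[1]);

// One URL per line, appended so repeated runs accumulate results.
$fp = fopen($outFile, 'a');
foreach ($urls as $url) {
    fwrite($fp, $url . "\n");
}
fclose($fp);

echo count($urls) . " URLs written to $outFile\n";
?>

From there a real crawler would loop over the saved URLs, resolve relative links and keep track of pages it has already visited - that's where the streams library and stream contexts start to earn their keep.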