Downloading and parsing web-stuff

David Rasmussen

2005-04-22, 3:55 am

Very basic:

What is the easiest way in PHP to download the source code (HTML etc.)
of a given URL (say, http://www.google.com) and parse this code for
certain patterns?

I guess my question can be split in two:

1) How do I download a webpage (into a string or whatever)?

2) How can I do string manipulation, regexp matching, information
extraction etc. on the downloaded information?

/David
Sander Van de Moortel

2005-04-22, 3:55 am

David Rasmussen wrote:
> Very basic:
>
> What is the easiest way in PHP to download the source code (HTML etc.)
> of a given URL (say, http://www.google.com) and parse this code for
> certain patterns?
>
> I guess my question can be split in two:
>
> 1) How do I download a webpage (into a string or whatever)?
>
> 2) How can I do string manipulation, regexp matching, information
> extraction etc. on the downloaded information?
>
> /David

Download:

$file = fopen("http://www.google.com","r");
while (!eof($file)) {
$buffer = fgets($file, 512);
}

and then you just do whatever you want with $buffer.
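
As one illustration of "whatever you want", covering the regexp part of the original question: a minimal sketch, assuming the page source is already in a string and the pattern of interest is link targets. extract_links() is a hypothetical helper name and the sample HTML is made up; the regular expression is a rough one and will miss unquoted or unusually formatted attributes.

```php
<?php
// Collect the href value of every <a> tag found in an HTML string.
// extract_links() is a hypothetical helper, not a PHP built-in.
function extract_links($html)
{
    $links = array();
    // Case-insensitive match on <a ... href="..."> or <a ... href='...'>;
    // the first capture group holds the attribute value.
    $pattern = '/<a\s[^>]*href=["\']([^"\']+)["\']/i';
    if (preg_match_all($pattern, $html, $matches)) {
        $links = $matches[1];
    }
    return $links;
}

// Made-up sample input standing in for a downloaded page.
$sample = '<p><a href="http://example.com/a">A</a> <A class="x" HREF=\'/b\'>B</a></p>';
print_r(extract_links($sample));
```

For anything beyond simple patterns, a regular expression is a fragile way to read HTML, so treat this as a starting point only.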

Afaik.

S
Bender Rodriguez

2005-04-24, 3:55 pm

On Fri, 22 Apr 2005 03:27:55 +0300, Sander Van de Moortel <sander@jnm.be>
wrote:

> David Rasmussen wrote:
> Download:
>
> $file = fopen("http://www.google.com","r");
> while (!eof($file)) {
> $buffer = fgets($file, 512);
> }
>
> and then you just do whatever you want with $buffer.
>
> Afaik.
>
> S


That !eof check is invalid; PHP has no eof() function. It should be:

while (!feof($file))
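
Putting the correction together with the original loop: a minimal sketch, assuming the goal is to hold the whole page in one string. Note that the loop as first posted overwrites $buffer on every pass, so only the last chunk read would survive; the version below appends instead. read_stream() is a hypothetical helper name, and opening a URL with fopen() assumes allow_url_fopen is enabled in php.ini.

```php
<?php
// Read everything from an open stream handle into one string.
// read_stream() is a hypothetical helper, not a PHP built-in.
function read_stream($handle)
{
    $contents = '';
    while (!feof($handle)) {
        $line = fgets($handle, 512);
        if ($line === false) {
            break; // read error, or nothing left to read
        }
        $contents .= $line; // append rather than overwrite
    }
    return $contents;
}

// Usage (assumes allow_url_fopen is enabled):
// $file = fopen("http://www.google.com", "r");
// if ($file !== false) {
//     $html = read_stream($file);
//     fclose($file);
// }
```

If the line-by-line loop is not needed, file_get_contents("http://www.google.com") returns the whole page as a string in one call, under the same allow_url_fopen assumption.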


--
"Two things are infinite: the universe and human stupidity; and I'm not
sure about the first one." - Albert Einstein
Copyright 2008 codecomments.com