Help needed in extracting HTML over HTTPS protocol.
| Anas Perwez 2004-08-03, 8:55 pm |
| Hi All,
My requirement is to extract the HTML from any HTTPS site and then parse
it for selected content.
I am able to connect to HTTP sites, but when it comes to HTTPS it
throws errors.
Code snippet for HTTP:
-------------------
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->timeout(10);                        # give up after 10 seconds
$ua->env_proxy;                          # load proxy settings from the *_proxy environment variables
$ua->proxy('http', 'http://proxy:80/');  # explicit proxy, applies to http:// URLs only

my $response = $ua->get('http://email.indiatimes.com');

if ($response->is_success) {
    print $response->content;  # or whatever
}
else {
    die $response->status_line;
}
-------------------------------------------------
This is working fine. Can anybody help me with HTTPS?
Thanks
Anas Perwez
| |
| T. Raymond 2004-08-03, 8:55 pm |
| Anas,
I think it is doing that because HTTPS is the secure protocol. Presumably
that means others can't tap into it unless they have a username and
password or some other authorized access. If you're trying to extract
from your own site, then I think you need to download the HTML into a
folder on your PC and then run your CGI through Perl on your machine.
--Teresa
| |
| Bob Showalter 2004-08-04, 8:55 am |
| Anas Perwez wrote:
> Hi All,
> My requirement is to extract the HTML from any HTTPS site and then
> parse it for selected content.
>
> I am able to connect to HTTP sites, but when it comes to HTTPS it
> throws errors.
[ snip LWP code ]
Have you read http://search.cpan.org/src/GAAS/lib....800/README.SSL ?
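For anyone following along, the usual cause of this symptom is that LWP has no SSL backend to hand the https scheme to. On libwww-perl releases of this era that generally meant installing Crypt::SSLeay or IO::Socket::SSL; current releases package https support separately as LWP::Protocol::https. Below is a minimal sketch of checking for that support before fetching. The name fetch_https and the example.com URL are placeholders for illustration, not anything from the original post, and proxy handling for https is left to env_proxy here as an assumption.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: fetch a page over https, failing early with a
# clear message if this LWP installation cannot speak https at all.
sub fetch_https {
    my ($url) = @_;

    my $ua = LWP::UserAgent->new;
    $ua->timeout(10);
    $ua->env_proxy;    # pick up proxy settings from the environment, if any

    # is_protocol_supported() reports whether LWP can load a handler
    # for the given scheme; without an SSL module it is false for https.
    die "No https support in this LWP install; an SSL module is missing\n"
        unless $ua->is_protocol_supported('https');

    my $response = $ua->get($url);
    die $response->status_line, "\n" unless $response->is_success;

    return $response->content;
}

# Example call (placeholder URL, left commented out so the file loads
# without touching the network):
# print fetch_https('https://www.example.com/');
```

If the die message about missing https support appears, the fix is on the installation side rather than in the script.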
| |
| Sanjay Arora 2004-08-04, 8:55 am |
| Anas
I have a similar requirement. Can you tell me which Perl modules you are
using for connecting over HTTP and getting the HTML, and which modules for
extracting info? I haven't got to your stage, so I guess I'll keep an eye
on your thread ;-).
And Bob, thanks for the tip, though it's a bit advanced for me yet.
Best regards.
Sanjay.
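On the extraction half of that question, one common pairing is LWP::UserAgent for fetching and a parser from the HTML-Parser distribution for pulling out pieces of the page. The following is only a sketch, assuming HTML::TokeParser is installed; extract_links and the sample markup are made up for illustration and are not what the original poster used.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: return one hash per <a href="..."> in the given
# HTML string, holding the link target and its visible text.
sub extract_links {
    my ($html) = @_;

    # Passing a scalar reference tells the parser to read from the string.
    my $parser = HTML::TokeParser->new(\$html)
        or die "Could not create parser\n";

    my @links;
    while (my $tag = $parser->get_tag('a')) {
        my $href = $tag->[1]{href};    # $tag->[1] is the attribute hash
        next unless defined $href;     # skip anchors without an href

        # Collect text up to the closing </a>, with whitespace tidied.
        my $text = $parser->get_trimmed_text('/a');
        push @links, { href => $href, text => $text };
    }
    return @links;
}

# Small demonstration on an inline sample rather than a fetched page.
my $sample = '<p><a href="/inbox">Inbox</a> and <a href="/help">Help</a></p>';
for my $link (extract_links($sample)) {
    print "$link->{text} => $link->{href}\n";
}
```

The same function can be fed the content string returned by an LWP response once fetching works.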