Code Comments

How do I parse this page?
I am trying to parse
http://www.ebay.com without success.

I view the source, and I see a lot of ?/td>. This page is unsavable.

It displays perfectly in IE, but once the source is saved/viewed, it no
longer displays correctly in IE.

When I use Lynx to view it, it is formatted perfectly.

My question is: how does eBay allow any browser to display the content
correctly without allowing View Source or Save As?



nntp
10-26-04 08:57 PM


Re: How do I parse this page?
nntp wrote:

> I am trying to parse
> http://www.ebay.com without success.

In Perl, try
http://search.cpan.org/~gaas/HTML-Parser-3.35/Parser.pm

> I view the source, and I see a lot of ?/td>. This page is unsavable.

> It displays perfectly in IE, but once the source is saved/viewed, it no
> longer displays correctly in IE.

Maybe it uses css, or needs images to provide formatting hints.

> When I use Lynx to view it, it is formatted perfectly.
>
> My question is: how does eBay allow any browser to display the content
> correctly without allowing View Source or Save As?

Please don't clutter Perl newsgroups with web server questions.

gtoomey

Gregory Toomey
10-26-04 08:57 PM
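
The HTML::Parser module suggested above is event-driven: it calls your handlers for each start tag, end tag, and run of text. A minimal sketch of that style follows, written with Python's standard-library html.parser rather than Perl's HTML::Parser. The class name CellTextCollector, the helper extract_cells, and the sample markup are hypothetical and are not taken from eBay's page.

```python
from html.parser import HTMLParser


class CellTextCollector(HTMLParser):
    """Collect the text of each outermost <td> cell (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.cells = []    # finished cell texts, in document order
        self._depth = 0    # greater than 0 while inside at least one <td>
        self._parts = []   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "td" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join the fragments and collapse runs of whitespace.
                self.cells.append(" ".join("".join(self._parts).split()))
                self._parts = []

    def handle_data(self, data):
        # Text outside any <td> is ignored; text in nested cells is
        # folded into the enclosing outermost cell.
        if self._depth > 0:
            self._parts.append(data)


def extract_cells(html_text):
    """Parse an HTML string and return the text of its table cells."""
    parser = CellTextCollector()
    parser.feed(html_text)
    parser.close()
    return parser.cells


# Made-up sample markup, used only to exercise the sketch.
SAMPLE = "<table><tr><td>Item <b>A</b></td><td> 9.99 </td></tr></table>"
print(extract_cells(SAMPLE))
```

A real page would need more care (missing end tags, scripts, nested tables), which is the main reason to rely on a dedicated parser instead of pattern matching on the raw source.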


Re: How do I parse this page?
> > I am trying to parse
>
> In Perl, try
> http://search.cpan.org/~gaas/HTML-Parser-3.35/Parser.pm
>
> Maybe it uses css, or needs images to provide formatting hints.

Have you looked at the source code of www.ebay.com?
I don't know what you mean by "needs images to provide formatting hints".



nntp
10-26-04 08:57 PM


Re: How do I parse this page?
[F'ups set to a.w.w.]

nntp wrote:

> http://www.ebay.com
> I view the source, and I see a lot of ?/td>. This page is unsavable.
> It displays perfectly in IE, but once the source is saved/viewed, it no
> longer displays correctly in IE. My question is: how does eBay allow any
> browser to display the content correctly without allowing View Source or
> Save As?

IE doesn't simply show you the source when you hit the "view source"
button. Oh no. That would be too easy. It does all kinds of weird crap
first and then shows you some modified source code. I'm guessing that some
of that weird crap screws up some of the characters.

Look at the source code in a different browser and it displays fine.

Not that you should try to emulate any of that code. It's pants.

--
Toby A Inkster BSc (Hons) ARCS
Contact Me  ~ http://tobyinkster.co.uk/contact


Toby Inkster
10-27-04 01:56 AM


Re: How do I parse this page?
"nntp" <nntp@rogers.com> wrote in message
news:_dydnarTGPNdFePcRVn-sQ@rogers.com...
>I am trying to parse
> http://www.ebay.com without success.
>
> I view the source, and I see a lot of ?/td>. This page is unsavable.
>
> It displays perfectly in IE, but once the source is saved/viewed, it no
> longer displays correctly in IE.
>
> When I use Lynx to view it, it is formatted perfectly.
>
> My question is: how does eBay allow any browser to display the content
> correctly without allowing View Source or Save As?
>

I don't have a copy of Lynx, so I can't duplicate your problem, but...
Opera saves the file with images and IE displays it just fine from the saved
files.

Ebay.com (index.html) uses an external CSS stylesheet.  It also uses a
sizeable number of external javascript files and 68 images to make up the
page I looked at.

George




George King
10-27-04 01:56 AM


Re: How do I parse this page?
Quoth "nntp" <nntp@rogers.com>:
> I am trying to parse
> http://www.ebay.com without success.
>
> I view the source, and I see a lot of ?/td>. This page is unsavable.
>
> It displays perfectly in IE, but once the source is saved/viewed, it no
> longer displays correctly in IE.
>
> When I use Lynx to view it, it is formatted perfectly.
>
> My question is: how does eBay allow any browser to display the content
> correctly without allowing View Source or Save As?

They can't. You've probably got character-set issues. Use LWP to retrieve the
page.

Ben

--
I must not fear. Fear is the mind-killer. I will face my fear and
I will let it pass through me. When the fear is gone there will be
nothing. Only I will remain.
ben@morrow.me.uk                                      Frank Herbert, 'Dune'

Ben Morrow
10-27-04 01:56 AM
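
The character-set suggestion above can be illustrated with a small sketch: the same bytes read correctly or as garbage depending on which charset is used to decode them. Whether this is what actually happened with the eBay page is not established in the thread. The sketch is in Python rather than Perl/LWP, performs no network access, and its function names, its ISO-8859-1 fallback, and its sample bytes are all assumptions made for illustration.

```python
import re


def charset_from_content_type(header, default="iso-8859-1"):
    """Return the charset named in a Content-Type value, else the default.

    The ISO-8859-1 fallback is an assumption of this sketch.
    """
    match = re.search(r'charset\s*=\s*"?([\w.:-]+)', header or "", re.I)
    return match.group(1).lower() if match else default


def decode_body(raw_bytes, content_type):
    """Decode a response body using the charset its header declares."""
    charset = charset_from_content_type(content_type)
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of
    # raising; an unknown charset name would still raise LookupError.
    return raw_bytes.decode(charset, errors="replace")


# Made-up body: a table cell containing one non-ASCII character (U+00E9).
raw = "<td>caf\u00e9</td>".encode("utf-8")

good = decode_body(raw, "text/html; charset=UTF-8")  # declared charset used
bad = decode_body(raw, "text/html")                  # falls back to default

# ascii() keeps the printed form safe on any terminal encoding.
print(ascii(good))
print(ascii(bad))
```

Decoding with the declared charset recovers the original text, while the fallback turns the two UTF-8 bytes of the accented letter into two unrelated characters, the kind of damage that shows up as stray symbols in saved or viewed source.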


Re: How do I parse this page?
"nntp" <nntp@rogers.com> wrote in
news:_dydnarTGPNdFePcRVn-sQ@rogers.com:

> I am trying to parse
> http://www.ebay.com without success.
>
> I view the source, and I see a lot of ?/td>. This page is unsavable.

That ain't true. If you have any questions on parsing HTML using
HTML::Parser, please post them here. Otherwise, this is waaay off-topic.

Sinan

A. Sinan Unur
10-27-04 01:56 AM


Re: How do I parse this page?
JRS:  In article <Xns958EB135B42A7asu1cornelledu@132.236.56.8>, dated
Tue, 26 Oct 2004 21:25:13, seen in news:comp.lang.javascript, A. Sinan
Unur <1usa@llenroc.ude.invalid> posted :
>"nntp" <nntp@rogers.com> wrote in
>news:_dydnarTGPNdFePcRVn-sQ@rogers.com:
> 
>
>That ain't true. If you have any questions on parsing HTML using
>HTML::Parser, please post them here. Otherwise, this is waaay off-topic.

Please take greater, or at least better, thought before using a word
such as "here".

--
© John Stockton, Surrey, UK.  ?@merlyn.demon.co.uk   Turnpike v4.00   IE 4 ©
<URL:http://www.jibbering.com/faq/>  JL/RC: FAQ of news:comp.lang.javascript
<URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.

Dr John Stockton
10-27-04 08:57 PM


Re: How do I parse this page?
Dr John Stockton <spam@merlyn.demon.co.uk> wrote:
> JRS:  In article <Xns958EB135B42A7asu1cornelledu@132.236.56.8>, dated
> Tue, 26 Oct 2004 21:25:13, seen in news:comp.lang.javascript, A. Sinan
> Unur <1usa@llenroc.ude.invalid> posted : 
>
> Please take greater, or at least better, thought before using a word
> such as "here".


Please take greater, or at least better, notice of the Newsgroups
header before determining which "where" is "here".

:-)


--
Tad McClellan                          SGML consulting
tadmc@augustmail.com                   Perl programming
Fort Worth, Texas

Tad McClellan
10-27-04 08:57 PM


PERL Miscellaneous archive

Show a Printable Version Send to friend Email This Page to Someone! subscribe to this thread Receive updates to this thread
Computer Consultants
Programming Jobs
Visual Basic Controls
SQL Server Programming
Webservices
Java Security
Visual Studio
C# Programming
Visual J++
Software engineering
Open source Software
Perl Programming
PHP Programming
ASP Programming
ASP .NET Programming
Visual Basic Programming
Windows Scripting Host
Java Programming
Java Help
Java Beans
VBScript
Cobol
MAC Applications
Unix Programming
Forum Jump:
All times are GMT. The time now is 04:55 AM.

 
Free MCSE Braindumps | Real Estate Topics

Programming forum archive

Copyrights CodeComments.com 2004 - 2006

Powered by vBulletin Copyright 2000-2006 Jelsoft Enterprises Limited.