Code Comments
Am I correct in assuming screen scraping is just the response text sent to
the browser? If so, would that mean that this could not be screen scraped?
function moi() {
  var tag = '<a href=';
  var tagType1 = '"mail'+'to:', tagType2 = '">', tagType3 = '<\/a>';
  var user1 = 'web', user2 = 'master', user3 = '@';
  var dom1 = 'danger', dom2 = 'ous', dom3 = 'ly';
  var tld = '.us';
  document.write(tag+tagType1+user1+user2+user3+dom1+dom2+dom3+tld+tagType2+user1+user2+user3+dom1+dom2+dom3+tld+tagType3);
}
--
Roland Hall
/* This information is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of merchantability
or fitness for a particular purpose. */
Technet Script Center - http://www.microsoft.com/technet/scriptcenter/
WSH 5.6 Documentation - http://msdn.microsoft.com/downloads/list/webdev.asp
MSDN Library - http://msdn.microsoft.com/library/default.asp
Screen scraping is a technique, not a format. The technique is to intercept the raw data (in this case HTML) that would normally be displayed on the client system's screen and extract data from it.

In an ASP context, screen scraping would typically be done by having a server-side component (such as XMLHttpRequest) perform a GET or POST to a URL and return the raw HTML as text. Then a parser of some kind is used to extract the desired information.

The example you present would be difficult (though not impossible) to screen-scrape server-side. The parser would have to be able to evaluate the output of the JavaScript function to get the data. I have seen references to using the HTML browser component (MSHTML object) to do things like this, but I don't think it works well server-side.

--
Mark Schupp
Head of Development
Integrity eLearning
www.ielearning.com
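To make the point above concrete, here is a minimal sketch of the "fetch the raw HTML, then parse it" step, assuming the scraper simply runs a pattern over the response text. The function name extractEmails, the simplified address pattern, and the two sample pages are illustrative assumptions, not anything from a real harvester. It only shows why a text-only parser finds the plain link but not the scripted one.

```javascript
// Hypothetical scraper step: pull anything that looks like an email address
// out of raw response text. The pattern is deliberately simple.
function extractEmails(responseText) {
  const pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  return responseText.match(pattern) || [];
}

// Page 1: the address appears literally in the markup.
const plainPage =
  '<a href="mailto:webmaster@dangerously.us">webmaster@dangerously.us</a>';

// Page 2: the same link is only assembled when a browser runs the script.
const scriptedPage = [
  '<script type="text/javascript">',
  "var user1 = 'web', user2 = 'master', user3 = '@';",
  "var dom1 = 'danger', dom2 = 'ous', dom3 = 'ly';",
  "var tld = '.us';",
  "document.write('<a href=\"mail'+'to:'+user1+user2+user3+dom1+dom2+dom3+tld+'\">'" +
    "+user1+user2+user3+dom1+dom2+dom3+tld+'<\\/a>');",
  '</script>'
].join('\n');

// The literal page yields the address (once per occurrence);
// the scripted page yields nothing, because the text never contains it.
console.log(extractEmails(plainPage));
console.log(extractEmails(scriptedPage));
```

A parser like this never executes the script, so the pieces 'web', 'master', '@' and so on stay separate strings as far as it is concerned.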
Roland Hall wrote:
> Am I correct in assuming screen scraping is just the response text sent to
> the browser? If so, would that mean that this could not be screen scraped?
>
> function moi() {
> var tag = '<a href=';
> var tagType1 = '"mail'+'to:', tagType2 = '">', tagType3 = '<\/a>';
> var user1 = 'web', user2 = 'master', user3 = '@';
> var dom1 = 'danger', dom2 = 'ous', dom3 = 'ly';
> var tld = '.us';
> document.write(tag+tagType1+user1+user2+user3+dom1+dom2+dom3+tld+tagType2+user1+user2+user3+dom1+dom2+dom3+tld+tagType3);
> }
Anything can be scraped. If you want to hide an email address, put a
form up and send the email server side so that the email address can
never be retrieved over HTML.
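As a rough sketch of the form approach suggested above: the address lives only in server-side code, and the page served to the visitor contains nothing but the form. Everything here (CONTACT_ADDRESS, renderContactForm, buildMessage, the /contact action, the field names) is a hypothetical name chosen for illustration, and actually sending the message is left to whatever mail component the server has.

```javascript
// Hypothetical server-side setting; it is never written into any page.
const CONTACT_ADDRESS = 'webmaster@dangerously.us';

// The only markup a visitor (or a scraper) ever receives.
function renderContactForm() {
  return [
    '<form method="post" action="/contact">',
    '  <input type="text" name="from">',
    '  <textarea name="message"></textarea>',
    '  <input type="submit" value="Send">',
    '</form>'
  ].join('\n');
}

// Called on the server when the form is posted. It returns a plain object
// describing the message; handing it to a mail component is not shown.
function buildMessage(fields) {
  // Collapse CR/LF so a visitor cannot smuggle extra mail headers in.
  const from = String(fields.from || '').replace(/[\r\n]+/g, ' ').trim();
  return {
    to: CONTACT_ADDRESS,
    replyTo: from,
    subject: 'Contact form',
    body: String(fields.message || '')
  };
}

console.log(renderContactForm().includes(CONTACT_ADDRESS)); // false
```

Because the recipient is filled in after the POST arrives, no amount of parsing or script evaluation on the client side can recover it from the page.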
<larrybud2002@yahoo.com> wrote in message news:1112109209.296120.61670@f14g2000cwb.googlegroups.com...
: Anything can be scraped. If you want to hide an email address, put a
: form up and send the email server side so that the email address can
: never be retrieved over HTML.

Hi Larry...

Thanks for responding...

I understand a form is best, but I was looking for a way to defeat the javascript. Surely a spammer is not going to capture all scripts and process them in hopes of finding a single email address. The goal of a spammer is to be lazy and get as much as possible with as little effort as possible. There is no benefit to processing every script they spider with no guarantee of finding an email address encoded in it somewhere. I see the benefit of finding one in plain sight, since 99.99% of them will be that way.

I also shouldn't have said "screen" scraped, as it's not really the screen memory that's being queried but rather the response text. Javascript doesn't show the results, except to the browser. I have not seen a way to grab those results, although I can think of some possibilities which appear to be a lot of effort. I just don't see the ROI but would welcome any info on how it is accomplished.

--
Roland Hall
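One way the results could be grabbed without a browser, sketched under assumptions: run the script against a stand-in `document` object whose write() only records what it is given, then search the recorded text. The helper names captureDocumentWrite and extractEmails and the simplified address pattern are made up for this illustration; whether any real harvester bothers to do this is exactly the ROI question raised above.

```javascript
// Simple, illustrative address pattern (not a full address grammar).
function extractEmails(text) {
  const pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  return text.match(pattern) || [];
}

// Run a script's source with a fake `document` whose write() just records
// its arguments. Nothing is rendered; we only collect what would be written.
function captureDocumentWrite(scriptSource) {
  const written = [];
  const fakeDocument = {
    write: (...parts) => { written.push(parts.join('')); }
  };
  // new Function executes arbitrary code; anything doing this against
  // untrusted pages would need to sandbox it.
  new Function('document', scriptSource)(fakeDocument);
  return written.join('');
}

// The function from the original post, followed by a call to it.
const scriptSource = `
function moi() {
  var tag = '<a href=';
  var tagType1 = '"mail'+'to:', tagType2 = '">', tagType3 = '<\\/a>';
  var user1 = 'web', user2 = 'master', user3 = '@';
  var dom1 = 'danger', dom2 = 'ous', dom3 = 'ly';
  var tld = '.us';
  document.write(tag+tagType1+user1+user2+user3+dom1+dom2+dom3+tld+tagType2+user1+user2+user3+dom1+dom2+dom3+tld+tagType3);
}
moi();
`;

const html = captureDocumentWrite(scriptSource);
console.log(html);
console.log(extractEmails(html));
```

The stub is only a few lines, but it does mean executing every script that gets spidered, which is the extra cost the obfuscation relies on.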
"Mark Schupp" wrote in message news:OCKz1U7MFHA.3540@tk2msftngp13.phx.gbl...
: Screen scraping is a technique, not a format.

Hi Mark...

Thanks for responding. I didn't realize I said it was a format, and I should have said HTML scraping since it's not really screen scraping like it would be on a terminal.

: The technique is to intercept
: the raw data (in this case HTML) that would normally be displayed on the
: client system screen and extract data from it. In ASP context screen
: scraping would typically be done by having a server-side component (such as
: xmlhttprequest) perform a get or post to a url and return the raw HTML as
: text. Then a parser of some kind is used to extract the desired information.

Yes, I'm familiar with that process.

: The example you present would be difficult (though not impossible) to
: screen-scrape server-side. The parser would have to be able to evaluate the
: output of the JavaScript function to get the data. I have seen references to
: using the HTML browser component (MSHTML object) to do things like this but
: I don't think it works well server-side.

I have not been able to do it either. I think it may require HTML scraping the site and then "screen" scraping my page, implying printing it to a text file and then reloading and parsing that or capturing it from my screen memory, the former being the easier of the two. This would require the result look like user@domain.com instead of user at domain dot com. I think I'll test the first since so many suggest using encoded javascript to hide from spammers.

--
Roland Hall