Author asp innerText?
Giles

2006-02-16, 7:55 am

In DHTML, body.innerText nicely strips out the raw textual contents of a
formatted page. Is there a straightforward way to do this with a server-side
ASP function (e.g. on a string containing the HTML)? It is to fill a
database field used for a simple search routine.
I don't have permission on this server to use 3rd party components; it's
plain IIS6.
Thanks.
Giles


Bob Barrows [MVP]

2006-02-16, 7:55 am

Giles wrote:
> In DHTML, body.innerText nicely strips out the raw textual contents
> of a formatted page. Is there a straightforward way to do this with a
> server-side ASP function (e.g. on a string containing the HTML)? It
> is to fill a database field used for a simple search routine.
> I don't have permission on this server to use 3rd party components;
> it's plain IIS6.


Use a Regular Expression.
Bob Barrows
--
Microsoft MVP - ASP/ASP.NET
Please reply to the newsgroup. This email account is my spam trap so I
don't check it very often. If you must reply off-line, then remove the
"NO SPAM"


Giles

2006-02-16, 7:55 am

from Bob Barrows [MVP]
> Giles wrote:
>
> Use a Regular Expression.
> Bob Barrows


RegExp is a black art to me! Off the top of my head:
delete from "<head" to "/head>"
delete from "<style" to "/style>" (in case not in head)
delete from "<script" to "/script>" (in case not in head)
replace anything in chevrons with nothing.
replace line-breaks with spaces
replace multiple spaces with single spaces
replace HTML entities with literals
Does that sound about right?
thanks, Giles
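
The steps listed above can be sketched with VBScript's built-in RegExp object, which ships with the scripting engine and needs no third-party component. This is only a minimal sketch, not an HTML parser: the helper names RegexReplace and HtmlToText are made up for illustration, only a handful of common entities are decoded, and tags are replaced with a space rather than with nothing (an assumption, so that words in adjacent block elements do not run together). Entities are decoded after the tags are stripped so that text such as "&lt;b&gt;" is not mistaken for a tag.

```vbscript
Option Explicit

' Hypothetical helper: replace every match of a pattern in a string
' (global, case-insensitive) using the built-in RegExp object.
Function RegexReplace(text, pattern, replacement)
    Dim re: Set re = New RegExp
    re.Pattern = pattern
    re.Global = True
    re.IgnoreCase = True
    RegexReplace = re.Replace(text, replacement)
End Function

' Hypothetical helper: approximate the plain text of an HTML string.
Function HtmlToText(html)
    Dim s: s = html

    ' Drop whole elements whose contents are not visible text.
    ' [\s\S] is used because "." does not match line breaks.
    s = RegexReplace(s, "<head\b[\s\S]*?</head\s*>", " ")
    s = RegexReplace(s, "<style\b[\s\S]*?</style\s*>", " ")
    s = RegexReplace(s, "<script\b[\s\S]*?</script\s*>", " ")
    s = RegexReplace(s, "<!--[\s\S]*?-->", " ")

    ' Remove any remaining tags (assumption: a space, not nothing).
    s = RegexReplace(s, "<[^>]*>", " ")

    ' Decode a few common entities; &amp; goes last so that
    ' "&amp;lt;" ends up as "&lt;" rather than "<".
    s = Replace(s, "&nbsp;", " ")
    s = Replace(s, "&lt;", "<")
    s = Replace(s, "&gt;", ">")
    s = Replace(s, "&quot;", """")
    s = Replace(s, "&amp;", "&")

    ' Collapse line breaks and runs of whitespace to single spaces.
    s = RegexReplace(s, "\s+", " ")
    HtmlToText = Trim(s)
End Function
```

Inside an ASP page the two functions would sit between <% and %>, with Option Explicit as the first statement. For the input "<p>Indicates <em>emphasis</em> &amp; more</p>", HtmlToText should return "Indicates emphasis & more". Replacing tags with a space will split a word that has inline markup in the middle of it, so for a search field it may be worth deciding which trade-off matters more.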


Bob Barrows [MVP]

2006-02-16, 7:55 am

Giles wrote:
> from Bob Barrows [MVP]
>
> RegExp is a black art to me!

Somewhat to me as well ...
A couple of people in this group (Chris Hohmann comes to mind) have it down
pretty well. There are some websites out there that provide libraries of
regular expression patterns.

> Off the top of my head:
> delete from "<head" to "/head>"
> delete from "<style" to "/style>" (in case not in head)
> delete from "<script" to "/script>" (in case not in head)
> replace anything in chevrons with nothing.
> replace line-breaks with spaces
> replace multiple spaces with single spaces
> replace HTML entities with literals
> Does that sound about right?


I guess so, but why are you leaving the closing and opening brackets?


--
Microsoft MVP -- ASP/ASP.NET
Please reply to the newsgroup. The email account listed in my From
header is my spam trap, so I don't check it very often. You will get a
quicker response by posting to the newsgroup.


Justin Piper

2006-02-16, 6:55 pm

Giles wrote:
> In DHTML, body.innerText nicely strips out the raw textual contents of a
> formatted page. Is there a straightforward way to do this with a server-side
> ASP function (e.g. on a string containing the HTML)? It is to fill a
> database field used for a simple search routine.


If you can, you might consider using Indexing Service instead of
rolling your own search routine.

http://www.codeproject.com/asp/indexserver.asp

If that's not an option, you should be able to automate Internet Explorer
from an ASP page.

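<% Option Explicit

' Start an IE instance and load a blank page to get a document to write into.
Dim ie: Set ie = CreateObject("InternetExplorer.Application")
ie.Navigate "about:blank"

' Wait for the blank page to finish loading (ReadyState 4 = complete).
Do While ie.Busy Or ie.ReadyState <> 4
Loop

Dim doc: Set doc = ie.Document
doc.open
doc.writeln "<dl>"
doc.writeln "<dt>em</dt>"
doc.writeln "<dd>Indicates <em>emphasis</em></dd>"
doc.writeln "<dt>strong</dt>"
doc.writeln "<dd>Indicates <strong>stronger emphasis</strong></dd>"
doc.writeln "</dl>"
doc.close

' Let IE compute the plain text, then send it as text/plain.
Response.ContentType = "text/plain"
Response.Write doc.documentElement.InnerText

' Close the IE process so instances do not pile up on the server.
ie.Quit
Set ie = Nothing
%>


Tom Kaminski [MVP]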

2006-02-17, 6:55 pm

"Giles" <giles@nospam.com> wrote in message
news:euO7v3uMGHA.3556@TK2MSFTNGP10.phx.gbl...
> In DHTML, body.innerText nicely strips out the raw textual contents of a
> formatted page. Is there a straightforward way to do this with a
> server-side ASP function (e.g. on a string containing the HTML)? It is to
> fill a database field used for a simple search routine.
> I don't have permission on this server to use 3rd party components; it's
> plain IIS6.


With ASP you have complete control over the content of the page before it
gets written so it's not clear to me why you would need to do this ...

--
Tom Kaminski IIS MVP
http://www.microsoft.com/windowsser...ty/centers/iis/
http://mvp.support.microsoft.com/
http://www.iistoolshed.com/ - tools, scripts, and utilities for running IIS


Copyright 2008 codecomments.com