Code Comments

Some part of the text should not be converted
Dear All,

I am capitalizing titles. Text inside <u></u> tags should stay exactly
as it is in the input; that is, my conversion program should not touch
this text.

What I have done is remove this part from the text at the beginning and
put it back at the end. This works fine if there is only one case in
the line. If there are multiple cases, this logic does not work.

My code is:

if($line=~m!<u>(.+)</u>!i)
{
$un=$1;
}
$line=~s!<u>(.+)</u>!<u></u>!ig;

code for conversion ....

$line=~s!<u></u>!$un!ig;

The above code works if the input is like this:   "A Practical
Guide to <u>CD-Rom</u>"
The output is "A Practical Guide to CD-Rom"

I have tried the non-greedy form by putting a question mark after the +,
but then DVD also gets replaced with CD-Rom.

The above code does not work if the input is like this:   "A Practical
Guide to <u>CD-Rom</u> and <u>DVD</u>"
The output is "A Practical Guide to CD-Rom and CD-Rom"

Please help me solve this problem.

Regards,
Ganesh


N. Ganesh Babu
04-26-05 01:57 PM


Re: Some part of the text should not be converted
On Tue, 26 Apr 2005 12:52:32 +0530, N. Ganesh Babu wrote:

> the above code is working if the input is like this.   "A Practical
> Guide to <u>CD-Rom</u>"
> the output "A Practical Guide to CD-Rom"
>
> I have tried with non-greedy by putting the question mark after + but
> the DVD is also getting replaced with CD-Rom.

This is not a greedy problem. Your logic is flawed.

> the above code is not working if the input is like this.   "A Practical
> Guide to <u>CD-Rom</u> and <u>DVD</u>" the output "A Practical Guide to
> CD-Rom and CD-Rom"

This is because you capture only the first instance of <u>(.+)</u> on
a line and store it in $un. So when there are several on a line, they
are all replaced with the same $un.
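
To make the point above concrete, here is a small sketch that reproduces the original approach with the non-greedy pattern. The sub name restore_with_scalar is made up for this illustration, and the conversion step is left out:

```perl
use strict;
use warnings;

# restore_with_scalar() is a hypothetical name used only for this sketch.
# It mirrors the original logic: a single scalar holds the saved text.
sub restore_with_scalar {
    my ($line) = @_;
    my $un = '';
    if ($line =~ m!<u>(.+?)</u>!i) {
        $un = $1;                          # only the FIRST match is captured
    }
    $line =~ s!<u>(.+?)</u>!<u></u>!ig;    # every tagged span becomes a placeholder
    # ... the conversion would run here ...
    $line =~ s!<u></u>!$un!ig;             # every placeholder gets the same text
    return $line;
}

print restore_with_scalar("A Practical Guide to <u>CD-Rom</u> and <u>DVD</u>"), "\n";
# prints: A Practical Guide to CD-Rom and CD-Rom
```

One scalar cannot hold two saved texts, so the second placeholder has nothing of its own to be restored from.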

> Please help to solve this problem.
>
> Regards,
> Ganesh

Why not do the capturing and substituting at the same time? e.g:

$line =~ s|<u>(.+?)</u>|$1|gi;
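
As a rough illustration of what that substitution does on its own (strip_u_tags is a name invented for this sketch): each match is replaced by its own capture, so the tags disappear and each text stays in place. It does not by itself shield the text from a conversion that runs earlier on the same line.

```perl
use strict;
use warnings;

# strip_u_tags() is a hypothetical wrapper around the one-line substitution.
sub strip_u_tags {
    my ($line) = @_;
    $line =~ s|<u>(.+?)</u>|$1|gi;   # each match is replaced by its own $1
    return $line;
}

print strip_u_tags("A Practical Guide to <u>CD-Rom</u> and <u>DVD</u>"), "\n";
# prints: A Practical Guide to CD-Rom and DVD
```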

Chris.

Chris Cole
04-26-05 01:57 PM


Re: Some part of the text should not be converted
On 4/26/05, N. Ganesh Babu wrote:
>
> the above code is not working if the input is like this.   "A Practical
> Guide to <u>CD-Rom</u> and <u>DVD</u>"
> the output "A Practical Guide to CD-Rom and CD-Rom"
>

One way is to get the list of texts between the <u> and </u> tags. I
chose to do it together with the substitution, using the "e"
modifier:
my @un;
$line=~s!<u>(.+?)</u>!push @un,$1;"<u></u>"!ige;

After processing, put the texts back using a for loop over the list of
saved texts, or an s///e construct:
$line=~s!<u></u>!shift @un!ige;

Note the question mark in ".+?". In reference to your question, yes,
you must use it, or the match will be greedy and will run from the
first <u> to the last </u> on the line.
Personally I think you're working too hard - you should be able to do
any processing on the line and not touch the <u> delimited text,
without having to resort to removing it. But of course TIMTOWTDI :-)
Here is a complete working example:
###################### begin code
use strict;
use warnings;
while (defined(my $line = <DATA>)) {
    print '-' x 80, "\n";
    print "Original line: $line";
    my @un;
    # save each <u>...</u> text and leave an empty <u></u> placeholder
    $line =~ s!<u>(.+?)</u>!push @un,$1;"<u></u>"!ige;
    # do something with line...
    print "line after data removal: $line";
    # put back the data, one saved text per placeholder, in order
    $line =~ s!<u></u>!shift @un!ige;
    print "line after data replace: $line";
}

__DATA__
A Practical Guide to <u>CD-Rom</u>
A Practical Guide to <u>CD-Rom</u> and <u>DVD</u>
###################### end code
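
The remark above about processing the line without removing the <u> text could be sketched with split and a capturing group, which keeps the tagged spans in the returned list. This is only one possible approach; convert_outside_u is a hypothetical helper, and uc() stands in for the real capitalization code:

```perl
use strict;
use warnings;

# convert_outside_u() is a hypothetical helper for this sketch.
# $convert is a code ref applied only to text outside <u>...</u>.
sub convert_outside_u {
    my ($line, $convert) = @_;
    # With a capturing group in the pattern, split also returns the
    # separators, so the list alternates: text, <u>...</u>, text, ...
    my @parts = split m!(<u>.+?</u>)!i, $line;
    for my $i (0 .. $#parts) {
        next if $i % 2;                       # odd indices hold the tagged spans
        $parts[$i] = $convert->($parts[$i]);  # convert only the untagged text
    }
    return join '', @parts;
}

# uc() is a stand-in conversion, not the poster's title-casing logic.
print convert_outside_u("a practical guide to <u>CD-Rom</u> and <u>DVD</u>",
                        sub { uc $_[0] }), "\n";
# prints: A PRACTICAL GUIDE TO <u>CD-Rom</u> AND <u>DVD</u>
```

Because the tags are never removed, running the same code on its own output leaves the tagged text alone again.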

BTW, I would be happy if any of the gurus on the list could shorten:
my @un;
$line=~s!<u>(.+?)</u>!push @un,$1;"<u></u>"!ige;
to a single line. Unlike "m//", "s///" never seems to return the
results of "()" in the RE, even in list context. Annoying :-(
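
Not a true one-liner, but one possible alternative is to let m//g in list context collect every capture first and then run a plain s///g, which avoids the /e code block. The sub name mask_u_spans is invented for this sketch:

```perl
use strict;
use warnings;

# mask_u_spans() is a hypothetical name for this sketch.
# Returns the masked line followed by the saved texts.
sub mask_u_spans {
    my ($line) = @_;
    my @un = $line =~ m!<u>(.+?)</u>!ig;   # m//g in list context: every capture, in order
    $line =~ s!<u>.+?</u>!<u></u>!ig;      # plain substitution, no /e needed
    return ($line, @un);
}

my ($masked, @saved) = mask_u_spans("A Practical Guide to <u>CD-Rom</u> and <u>DVD</u>");
print "$masked\n";                  # prints: A Practical Guide to <u></u> and <u></u>
print join(", ", @saved), "\n";     # prints: CD-Rom, DVD
```

The cost is that the line is scanned twice, once to collect and once to replace.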

HTH,
--
Offer Kaye

Offer Kaye
04-26-05 01:57 PM


Re: Some part of the text should not be converted
On 4/26/05, N. Ganesh Babu wrote:
> Dear Offer Kaye,
>
> I want to preserve the <u> tag also in the context. Can you help me how to
> do it. If you run 2nd time also the same action will happen. If we remove,
> in the 2nd execution again the conversion will take place on these words.
>

Hi Ganesh,
I'm not following you - what do you mean by "context"? What "2nd execution"?
Wild guess - you want the final line output from the code to include
the <u> tags? If so, simply use:
$line=~s!(<u>.+?</u>)!push @un,$1;"<u></u>"!ige;
So now the tags as well as the text are saved into @un and will appear
in the final output line.
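
A minimal sketch of the tag-preserving round trip follows. The sub name protect_and_convert is made up, and uc() is only a stand-in for the real conversion. Since the tags survive in the output, a second run over the result protects the same text again:

```perl
use strict;
use warnings;

# protect_and_convert() is a hypothetical name for this sketch.
sub protect_and_convert {
    my ($line) = @_;
    my @un;
    # save each span INCLUDING its tags, leave an empty placeholder behind
    $line =~ s!(<u>.+?</u>)!push @un, $1; "<u></u>"!ige;
    $line = uc $line;                    # stand-in conversion (assumption)
    # /i matters here: the stand-in conversion upper-cases the placeholder too
    $line =~ s!<u></u>!shift @un!ige;
    return $line;
}

my $once  = protect_and_convert("a practical guide to <u>CD-Rom</u> and <u>DVD</u>");
my $twice = protect_and_convert($once);
print "$once\n";    # prints: A PRACTICAL GUIDE TO <u>CD-Rom</u> AND <u>DVD</u>
print "$twice\n";   # prints the same line again
```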

Please read "perldoc perlrequick", it will help you learn regular
expressions in Perl. You can read it online at:
http://perldoc.perl.org/perlrequick.html

HTH,
--
Offer Kaye

Offer Kaye
04-26-05 08:57 PM


PERL Beginners archive
