Code Comments
I need to separate all merged words in my text, except for www and e-mail
addresses. How can I do it?
preg_replace("/(\.)([[:alpha:]])/", "\\1 \\2", "www.google.com
any,merged.words mymail@yahoo.com")
result:
www. google. pl any, merged. words mymail@yahoo. com :(((
The "any, merged. words" part of the text is good, but "www. google. pl" and
"mymail@yahoo. com" are not. How can I avoid splitting www and e-mail
addresses?

On Sun, 26 Sep 2004 12:41:25 +0200, Piotr <piou@gaztea.pl> wrote:
> I need to separate all merged words in my text, except for www and e-mail
> addresses. How can I do it?
>
> preg_replace("/(\.)([[:alpha:]])/", "\\1 \\2", "www.google.com
> any,merged.words mymail@yahoo.com")
>
> result:
> www. google. pl any, merged. words mymail@yahoo. com :(((
>
> The "any, merged. words" part of the text is good, but "www. google. pl" and
> "mymail@yahoo. com" are not. How can I avoid splitting www and e-mail
> addresses?
Why not ask somewhere that they discuss whatever it is that has this
"preg_replace" function that perl doesn't?
--
Sam Holden

On 26 Sep 2004 10:45:51 GMT, Sam Holden wrote:
> On Sun, 26 Sep 2004 12:41:25 +0200, Piotr <piou@gaztea.pl> wrote:
>
> Why not ask somewhere that they discuss whatever it is that has this
> "preg_replace" function that perl doesn't?

Because it's not really a preg_replace problem. It is a regular expression
problem, and preg_replace has its roots in the pearl language :)

Piotr wrote:
> On 26 Sep 2004 10:45:51 GMT, Sam Holden wrote:
>
> Because it's not really a preg_replace problem. It is a regular expression
> problem, and preg_replace has its roots in the pearl language :)

PHP's regular expressions, while based on Perl, are not identical to
Perl's regular expressions.

If you want help with an issue in this group, post a minimal but
complete *Perl* program that demonstrates your problem.

Paul Lalli

P.S. This language has nothing to do with oysters.

On Sun, 26 Sep 2004 07:30:38 -0400, Paul Lalli wrote:
> If you want help with an issue in this group, post a minimal but
> complete *Perl* program that demonstrates your problem.

OK, let's forget about PHP and preg_replace :)

I don't have any code, only a question:

Is there a regular expression in Perl that can put a space between all
merged words in a text file, except for www and e-mail addresses?

example text:
"www.google.com any,merged.words mymail@yahoo.com"

incorrect result:
www. google. pl any, merged. words mymail@yahoo. com

correct result:
www.google.pl any, merged. words mymail@yahoo.com

Piotr wrote:
> On Sun, 26 Sep 2004 07:30:38 -0400, Paul Lalli wrote:
>
> OK, let's forget about PHP and preg_replace :)
>
> I don't have any code, only a question:
>
> Is there a regular expression in Perl that can put a space between all
> merged words in a text file, except for www and e-mail addresses?
>

perldoc -f split

HTH
Abhinav
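
A minimal sketch of how the split suggestion above might be applied, not a definitive answer. It assumes (these rules are not stated in the post) that any token starting with "www." or containing "@" is an address, and that only commas and full stops followed directly by a letter need a space. Note that splitting and re-joining collapses runs of whitespace into single spaces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = 'www.google.com any,merged.words mymail@yahoo.com';

# split ' ' breaks the string on runs of whitespace.
my @tokens = split ' ', $text;

for my $token (@tokens) {
    # Assumption: these two tests are enough to recognise an address.
    next if $token =~ /^www\./ || $token =~ /\@/;

    # Add a space after a comma or full stop that is followed by a letter.
    # $token aliases the array element, so @tokens is updated in place.
    $token =~ s/([.,])(?=[[:alpha:]])/$1 /g;
}

my $result = join ' ', @tokens;
print "$result\n";
```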

Piotr wrote:
> Is there a regular expression in Perl that can put a space between all
> merged words in a text file, except for www and e-mail addresses?
>
> example text:
> "www.google.com any,merged.words mymail@yahoo.com"
>
> incorrect result:
> www. google. pl any, merged. words mymail@yahoo. com
>
> correct result:
> www.google.pl any, merged. words mymail@yahoo.com

What are your criteria (on a technical level, not your meta-knowledge
about the Internet) to differentiate between what you call words and www
addresses?
Is foo.bar.eu a list of words or a www address?

jue

Piotr <piou@gaztea.pl> wrote:

> ok, let's forget about PHP

That is to be expected in a Perl newsgroup.

> Is there any regular expression in perl,

I can give a Perl "solution", it is up to you to translate it
into some other language if you want it in some other language.

> which can separate by space all
> merged words in text file instead of www and e-mail addresses?

The hard part is determining how accurately you can describe
"URL" (not a "www address") and "email addresses".

I'll take rather inept definitions in the code below, you may
need to replace them with more accurate ones.

> example text:
> "www.google.com any,merged.words mymail@yahoo.com"
             ^^^^
             ^^^^
> correct result:
> www.google.pl any, merged. words mymail@yahoo.com
            ^^^
            ^^^

Huh?

-------------------------------------------
#!/usr/bin/perl
use warnings;
use strict;

$_ = 'www.google.com any,merged.words mymail@yahoo.com';
s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;
print "$_\n";

sub is_addr {
    my($adr) = @_;
    return 1 if $adr =~ /^www\./;      # legal URLs may not match this...
    return 1 if $adr =~ tr/@// == 1;   # legal email adrs may not match this either
    return 0;
}

sub separate {
    my($text) = @_;
    $text =~ s/(\W)/$1 /g;
    return $text;
}
-------------------------------------------

--
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas

On Sun, 26 Sep 2004 09:46:33 -0500, Tad McClellan wrote:
>
>              ^^^^
>              ^^^^
>             ^^^
>             ^^^

www.google.com of course - sorry for my mistake :)

> $_ = 'www.google.com any,merged.words mymail@yahoo.com';
> s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;
> print "$_\n";
>
> sub is_addr {
>     my($adr) = @_;
>     return 1 if $adr =~ /^www\./;      # legal URLs may not match this...
>     return 1 if $adr =~ tr/@// == 1;   # legal email adrs may not match this either
>     return 0;
> }
>
> sub separate {
>     my($text) = @_;
>     $text =~ s/(\W)/$1 /g;
>     return $text;

Thanks for your suggestion. In my case, every string containing www, http
or @ is an address. I don't have addresses without http:// or www.
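
Given that rule, the address test from the quoted code could be widened to cover http:// as well. The following is only a sketch under that assumption. The sample URL http://example.org/a.html is a made-up illustration, and restricting the split to punctuation followed by a letter is a choice made here, not part of the posted code (which adds a space after every non-word character).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A token counts as an address if it starts with "http://", "https://"
# or "www.", or if it contains an "@" (assumed rule, see the note above).
sub is_addr {
    my ($token) = @_;
    return 1 if $token =~ m{^(?:https?://|www\.)}i;
    return 1 if $token =~ /\@/;
    return 0;
}

# Add a space after punctuation only where a letter follows directly,
# so a trailing full stop or a number like 3.14 is left alone.
sub separate {
    my ($text) = @_;
    $text =~ s/([.,;:!?])(?=[[:alpha:]])/$1 /g;
    return $text;
}

my $line = 'www.google.com any,merged.words mymail@yahoo.com http://example.org/a.html';

# Process each whitespace-delimited token; addresses pass through unchanged.
$line =~ s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;
print "$line\n";
```

Because the substitution works token by token, the original spacing between tokens is preserved.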