Home > Archive > PERL Miscellaneous > September 2004 > how to separate all but www addresses?
how to separate all but www addresses?
| I need to separate all merged words in my text, except for www and e-mail
addresses. How can I do it?
preg_replace("/(\.)([[:alpha:]])/", "\\1 \\2", "www.google.com
any,merged.words mymail@yahoo.com")
result:
www. google. pl any, merged. words mymail@yahoo. com :(((
The "any, merged. words" part of the text is fine, but "www. google. pl" and
"mymail@yahoo. com" are not. How can I avoid splitting www and e-mail
addresses?
| |
| Sam Holden 2004-09-26, 9:07 am |
| On Sun, 26 Sep 2004 12:41:25 +0200, Piotr <piou@gaztea.pl> wrote:
> I need to separate all merged words in my text, except for www and e-mail
> addresses. How can I do it?
>
> preg_replace("/(\.)([[:alpha:]])/", "\\1 \\2", "www.google.com
> any,merged.words mymail@yahoo.com")
>
> result:
> www. google. pl any, merged. words mymail@yahoo. com :(((
>
> The "any, merged. words" part of the text is fine, but "www. google. pl" and
> "mymail@yahoo. com" are not. How can I avoid splitting www and e-mail
> addresses?
Why not ask somewhere that they discuss whatever it is that has this
"preg_replace" function that perl doesn't?
--
Sam Holden
| |
|
| On 26 Sep 2004 10:45:51 GMT, Sam Holden wrote:
> On Sun, 26 Sep 2004 12:41:25 +0200, Piotr <piou@gaztea.pl> wrote:
>
> Why not ask somewhere that they discuss whatever it is that has this
> "preg_replace" function that perl doesn't?
Because it's not really a preg_replace problem. It is a regular expression
problem, and preg_replace has its roots in the pearl language :)
| |
| Paul Lalli 2004-09-26, 9:07 am |
| Piotr wrote:
> On 26 Sep 2004 10:45:51 GMT, Sam Holden wrote:
>
> Because it's not really a preg_replace problem. It is a regular expression
> problem, and preg_replace has its roots in the pearl language :)
PHP's regular expressions, while based on Perl, are not identical to
Perl's regular expressions.
If you want help with an issue in this group, post a minimal but
complete *Perl* program that demonstrates your problem.
Paul Lalli
P.S. This language has nothing to do with oysters.
| |
|
| On Sun, 26 Sep 2004 07:30:38 -0400, Paul Lalli wrote:
> If you want help with an issue in this group, post a minimal but
> complete *Perl* program that demonstrates your problem.
ok, let's forget about PHP and preg_replace :)
I don't have any code, only a question:
Is there any regular expression in perl which can separate by space all
merged words in a text file, except for www and e-mail addresses?
example text:
"www.google.com any,merged.words mymail@yahoo.com"
incorrect result:
www. google. pl any, merged. words mymail@yahoo. com
correct result:
www.google.pl any, merged. words mymail@yahoo.com
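
For reference, the naive approach can be reproduced in Perl roughly as follows. This is only a sketch: the helper name naive_separate is made up for illustration, and the pattern assumes that both '.' and ',' count as separators (the preg_replace call quoted earlier handles only '.'). It shows why the addresses get split along with the ordinary words.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: insert a space after every '.' or ',' that is
# directly followed by a letter, with no special handling of addresses.
sub naive_separate {
    my ($text) = @_;
    $text =~ s/([.,])(?=[[:alpha:]])/$1 /g;
    return $text;
}

# Prints: www. google. com any, merged. words mymail@yahoo. com
print naive_separate('www.google.com any,merged.words mymail@yahoo.com'), "\n";
```

The substitution has no notion of what an address is, so every dot followed by a letter is treated the same way.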
| |
| Abhinav 2004-09-26, 9:07 am |
| Piotr wrote:
> On Sun, 26 Sep 2004 07:30:38 -0400, Paul Lalli wrote:
>
> ok, let's forget about PHP and preg_replace :)
>
> I don't have any code, only a question:
>
> Is there any regular expression in perl which can separate by space all
> merged words in a text file, except for www and e-mail addresses?
>
perldoc -f split
HTH
Abhinav
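
One way the split suggestion might be applied is sketched below. The function name separate_words and the address test (a token starting with "www." or containing "@") are assumptions made for illustration, not something stated in the reply above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: break the line into whitespace-separated tokens,
# leave address-like tokens alone, and split the rest after '.' or ','.
sub separate_words {
    my ($line) = @_;
    my @out;
    for my $token (split ' ', $line) {
        # Assumption: "www." prefix or an "@" marks an address.
        if ($token =~ /^www\./ || $token =~ /@/) {
            push @out, $token;
        }
        else {
            # The zero-width lookbehind splits after each '.' or ','
            # while keeping the punctuation attached to the left piece.
            push @out, join ' ', split /(?<=[.,])/, $token;
        }
    }
    return join ' ', @out;
}

# Prints: www.google.com any, merged. words mymail@yahoo.com
print separate_words('www.google.com any,merged.words mymail@yahoo.com'), "\n";
```

Note that split ' ' collapses runs of whitespace, so the original spacing between tokens is not preserved.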
| |
| Jürgen Exner 2004-09-26, 3:56 pm |
| Piotr wrote:
> Is there any regular expression in perl which can separate by space
> all merged words in a text file, except for www and e-mail addresses?
>
> example text:
> "www.google.com any,merged.words mymail@yahoo.com"
>
> incorrect result:
> www. google. pl any, merged. words mymail@yahoo. com
>
> correct result:
> www.google.pl any, merged. words mymail@yahoo.com
What are your criteria (on a technical level, not your meta-knowledge about
the Internet) to differentiate between what you call words and www
addresses?
Is foo.bar.eu a list of words or a www address?
jue
| |
| Tad McClellan 2004-09-26, 3:56 pm |
| Piotr <piou@gaztea.pl> wrote:
> ok, let's forget about PHP
That is to be expected in a Perl newsgroup.
> Is there any regular expression in perl,
I can give a Perl "solution", it is up to you to translate it into
some other language if you want it in some other language.
> which can separate by space all
> merged words in a text file, except for www and e-mail addresses?
The hard part is determining how accurately you can describe
"URL" (not a "www address") and "email addresses".
I'll take rather inept definitions in the code below, you may
need to replace them with more accurate ones.
> example text:
> "www.google.com any,merged.words mymail@yahoo.com"
             ^^^^
> correct result:
> www.google.pl any, merged. words mymail@yahoo.com
            ^^^
Huh?
-------------------------------------------
#!/usr/bin/perl
use warnings;
use strict;
$_ = 'www.google.com any,merged.words mymail@yahoo.com';
s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;
print "$_\n";
sub is_addr {
    my($adr) = @_;
    return 1 if $adr =~ /^www\./; # legal URLs may not match this...
    return 1 if $adr =~ tr/@// == 1; # legal email adrs may not match this either
    return 0;
}
sub separate {
    my($text) = @_;
    $text =~ s/(\W)/$1 /g;
    return $text;
}
-------------------------------------------
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
| |
|
| On Sun, 26 Sep 2004 09:46:33 -0500, Tad McClellan wrote:
>
> ^^^^
> ^^^^
> ^^^
> ^^^
www.google.com of course - sorry for my mistake :)
> $_ = 'www.google.com any,merged.words mymail@yahoo.com';
> s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;
> print "$_\n";
>
> sub is_addr {
> my($adr) = @_;
> return 1 if $adr =~ /^www\./; # legal URLs may not match this...
> return 1 if $adr =~ tr/@// == 1; # legal email adrs may not match this either
> return 0;
> }
>
> sub separate {
> my($text) = @_;
> $text =~ s/(\W)/$1 /g;
> return $text;
Thanks for your suggestion. In my case, all strings containing www, http,
or @ are addresses. I don't have any addresses without http:// or www.
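
Given those criteria, the is_addr/separate approach posted above could be adapted roughly as follows. This is a sketch under the stated assumptions only: a token counts as an address if it starts with http://, https:// or www., or contains an @. The example URL is hypothetical, and the lookahead in separate is a small variation that avoids adding a trailing space after final punctuation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub is_addr {
    my ($token) = @_;
    return 1 if $token =~ m{^(?:https?://|www\.)}i; # assumed URL prefixes
    return 1 if $token =~ /@/;                      # assumed: any @ means e-mail
    return 0;
}

sub separate {
    my ($text) = @_;
    # Add a space after a non-word character only when a word character follows.
    $text =~ s/(\W)(?=\w)/$1 /g;
    return $text;
}

# http://example.com/a.b is a made-up address used only for illustration.
my $line = 'http://example.com/a.b www.google.com any,merged.words mymail@yahoo.com';
$line =~ s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;

# Prints: http://example.com/a.b www.google.com any, merged. words mymail@yahoo.com
print "$line\n";
```

Because the substitution works on whole whitespace-separated tokens, an address glued to ordinary text without a space would still be treated as a single token.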