

Home > Archive > PERL Miscellaneous > September 2004 > how to separate all but www addresses?










 

Author how to separate all but www addresses?
Piotr

2004-09-26, 9:07 am

I need to separate all merged words in my text, except for www and e-mail
addresses. How can I do it?

preg_replace("/(\.)([[:alpha:]])/", "\\1 \\2", "www.google.com
any,merged.words mymail@yahoo.com")

result:
www. google. pl any, merged. words mymail@yahoo. com :(((

The "any, merged. words" part is good, but "www. google. pl" and
"mymail@yahoo. com" are not. How can I avoid splitting www and e-mail
addresses?
Sam Holden

2004-09-26, 9:07 am

On Sun, 26 Sep 2004 12:41:25 +0200, Piotr <piou@gaztea.pl> wrote:
> I need to separate all merged words in my text, except for www and e-mail
> addresses. How can I do it?
>
> preg_replace("/(\.)([[:alpha:]])/", "\\1 \\2", "www.google.com
> any,merged.words mymail@yahoo.com")
>
> result:
> www. google. pl any, merged. words mymail@yahoo. com :(((
>
> The "any, merged. words" part is good, but "www. google. pl" and
> "mymail@yahoo. com" are not. How can I avoid splitting www and e-mail
> addresses?


Why not ask somewhere that they discuss whatever it is that has this
"preg_replace" function that perl doesn't?

--
Sam Holden
Piotr

2004-09-26, 9:07 am

On 26 Sep 2004 10:45:51 GMT, Sam Holden wrote:

> On Sun, 26 Sep 2004 12:41:25 +0200, Piotr <piou@gaztea.pl> wrote:
>
> Why not ask somewhere that they discuss whatever it is that has this
> "preg_replace" function that perl doesn't?


Because it's not really a preg_replace problem. It is a regular expression
problem, and preg_replace has its roots in the pearl language :)
Paul Lalli

2004-09-26, 9:07 am

Piotr wrote:
> On 26 Sep 2004 10:45:51 GMT, Sam Holden wrote:
>
> Because it's not really a preg_replace problem. It is a regular expression
> problem, and preg_replace has its roots in the pearl language :)


PHP's regular expressions, while based on Perl, are not identical to
Perl's regular expressions.

If you want help with an issue in this group, post a minimal but
complete *Perl* program that demonstrates your problem.

Paul Lalli

P.S. This language has nothing to do with oysters.
Piotr

2004-09-26, 9:07 am

On Sun, 26 Sep 2004 07:30:38 -0400, Paul Lalli wrote:

> If you want help with an issue in this group, post a minimal but
> complete *Perl* program that demonstrates your problem.


ok, let's forget about PHP and preg_replace :)

I don't have any code, only a question:

Is there a regular expression in Perl that can put a space between all
merged words in a text file, except for www and e-mail addresses?

example text:
"www.google.com any,merged.words mymail@yahoo.com"

incorrect result:
www. google. pl any, merged. words mymail@yahoo. com

correct result:
www.google.pl any, merged. words mymail@yahoo.com
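
The incorrect result is what a naive substitution produces, because it has no notion of which tokens are addresses. A minimal Perl sketch of that behaviour follows; the sub name `naive_separate` is hypothetical, and the character class `[.,]` is an assumption (the pattern in the first post lists only the dot).

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Naive approach (hypothetical helper, not from the thread):
# insert a space after every '.' or ',' that is directly followed
# by a letter, with no attempt to recognise addresses.
sub naive_separate {
    my ($text) = @_;
    $text =~ s/([.,])([[:alpha:]])/$1 $2/g;
    return $text;
}

print naive_separate('www.google.com any,merged.words mymail@yahoo.com'), "\n";
# prints: www. google. com any, merged. words mymail@yahoo. com
```

Every dot is treated the same way, so the host name and the e-mail domain get split along with the ordinary words.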
Abhinav

2004-09-26, 9:07 am

Piotr wrote:
> On Sun, 26 Sep 2004 07:30:38 -0400, Paul Lalli wrote:
>
> ok, let's forget about PHP and preg_replace :)
>
> I don't have any code, only a question:
>
> Is there a regular expression in Perl that can put a space between all
> merged words in a text file, except for www and e-mail addresses?
>


perldoc -f split

HTH

Abhinav
Jürgen Exner

2004-09-26, 3:56 pm

Piotr wrote:
> Is there a regular expression in Perl that can put a space between
> all merged words in a text file, except for www and e-mail addresses?
>
> example text:
> "www.google.com any,merged.words mymail@yahoo.com"
>
> incorrect result:
> www. google. pl any, merged. words mymail@yahoo. com
>
> correct result:
> www.google.pl any, merged. words mymail@yahoo.com


What are your criteria (on a technical level, not your meta-knowledge about
the Internet) for differentiating between what you call words and www
addresses?
Is foo.bar.eu a list of words or a www address?

jue


Tad McClellan

2004-09-26, 3:56 pm

Piotr <piou@gaztea.pl> wrote:


> ok, let's forget about PHP



That is to be expected in a Perl newsgroup.


> Is there any regular expression in perl,



I can give a Perl "solution", it is up to you to translate it into
some other language if you want it in some other language.


> which can put a space between all
> merged words in a text file, except for www and e-mail addresses?



The hard part is determining how accurately you can describe
"URL" (not a "www address") and "email addresses".

I'll take rather inept definitions in the code below, you may
need to replace them with more accurate ones.



> example text:
> "www.google.com any,merged.words mymail@yahoo.com"
             ^^^^
> correct result:
> www.google.pl any, merged. words mymail@yahoo.com
            ^^^

Huh?



-------------------------------------------
#!/usr/bin/perl
use warnings;
use strict;

$_ = 'www.google.com any,merged.words mymail@yahoo.com';
s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;
print "$_\n";

sub is_addr {
    my($adr) = @_;
    return 1 if $adr =~ /^www\./;     # legal URLs may not match this...
    return 1 if $adr =~ tr/@// == 1;  # legal email adrs may not match this either
    return 0;
}

sub separate {
    my($text) = @_;
    $text =~ s/(\W)/$1 /g;
    return $text;
}
-------------------------------------------


--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
Piotr

2004-09-26, 3:56 pm

On Sun, 26 Sep 2004 09:46:33 -0500, Tad McClellan wrote:



www.google.com of course - sorry for my mistake :)

> $_ = 'www.google.com any,merged.words mymail@yahoo.com';
> s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;
> print "$_\n";
>
> sub is_addr {
> my($adr) = @_;
> return 1 if $adr =~ /^www\./; # legal URLs may not match this...
> return 1 if $adr =~ tr/@// == 1; # legal email adrs may not match this either
> return 0;
> }
>
> sub separate {
> my($text) = @_;
> $text =~ s/(\W)/$1 /g;
> return $text;


Thanks for your suggestion. In my case, all strings containing www, http
or @ are addresses. I don't have addresses without http:// or www.
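
Given that description (addresses always start with http:// or www., or contain an @), the quoted script could be adapted along these lines. This is only a sketch under those assumptions: the `https` alternative, the case-insensitive match, the lookahead in `separate`, and the wrapper name `separate_words` are additions for illustration and not part of the code posted in the thread.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Assumption: a token is an address if it starts with http://,
# https:// or www., or if it contains exactly one '@'.
sub is_addr {
    my ($token) = @_;
    return 1 if $token =~ m{^(?:https?://|www\.)}i;
    return 1 if ($token =~ tr/@//) == 1;   # tr with empty replacement only counts
    return 0;
}

# Put a space after a non-word character only when a word character
# follows, so trailing punctuation does not gain an extra space.
sub separate {
    my ($text) = @_;
    $text =~ s/(\W)(?=\w)/$1 /g;
    return $text;
}

# Hypothetical wrapper: process a line token by token, leaving
# addresses untouched and separating everything else.
sub separate_words {
    my ($line) = @_;
    $line =~ s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;
    return $line;
}

print separate_words('http://www.google.com any,merged.words mymail@yahoo.com'), "\n";
# prints: http://www.google.com any, merged. words mymail@yahoo.com
```

Addresses wrapped in punctuation, such as "(www.google.com)", would still be split by this sketch, because the check only looks at the start of each whitespace-delimited token.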

Copyright 2008 codecomments.com