Code Comments

how to separate all but www addresses?
I need to separate all merged words in my text, except for www and e-mail
addresses. How can I do it?

preg_replace("/(\.)([[:alpha:]])/", "\\1 \\2", "www.google.com any,merged.words mymail@yahoo.com")

result:
www. google. pl any, merged. words  mymail@yahoo. com :(((

The "any, merged. words" part is correct, but "www. google. pl" and
"mymail@yahoo. com" are not. How can I keep www and e-mail addresses
from being split?

Piotr
09-26-04 02:07 PM


Re: how to separate all but www addresses?
On Sun, 26 Sep 2004 12:41:25 +0200, Piotr <piou@gaztea.pl> wrote:
> I need to separate all merged words in my text, except for www and e-mail
> addresses. How can I do it?
>
> preg_replace("/(\.)([[:alpha:]])/", "\\1 \\2", "www.google.com any,merged.words mymail@yahoo.com")
>
> result:
> www. google. pl any, merged. words  mymail@yahoo. com :(((
>
> The "any, merged. words" part is correct, but "www. google. pl" and
> "mymail@yahoo. com" are not. How can I keep www and e-mail addresses
> from being split?

Why not ask somewhere where they discuss whatever it is that has this
"preg_replace" function, which Perl doesn't have?

--
Sam Holden

Sam Holden
09-26-04 02:07 PM


Re: how to separate all but www addresses?
On 26 Sep 2004 10:45:51 GMT, Sam Holden wrote:

> On Sun, 26 Sep 2004 12:41:25 +0200, Piotr <piou@gaztea.pl> wrote:
>
> Why not ask somewhere where they discuss whatever it is that has this
> "preg_replace" function, which Perl doesn't have?

Because it's not really a preg_replace problem. It is a regular expression
problem, and preg_replace has its roots in the pearl language :)

Piotr
09-26-04 02:07 PM


Re: how to separate all but www addresses?
Piotr wrote:
> On 26 Sep 2004 10:45:51 GMT, Sam Holden wrote:
>
> Because it's not really a preg_replace problem. It is a regular expression
> problem, and preg_replace has its roots in the pearl language :)

PHP's regular expressions, while based on Perl, are not identical to
Perl's regular expressions.

If you want help with an issue in this group, post a minimal but
complete *Perl* program that demonstrates your problem.

Paul Lalli

P.S.  This language has nothing to do with oysters.

Paul Lalli
09-26-04 02:07 PM


Re: how to separate all but www addresses?
On Sun, 26 Sep 2004 07:30:38 -0400, Paul Lalli wrote:

> If you want help with an issue in this group, post a minimal but
> complete *Perl* program that demonstrates your problem.

ok, let's forget about PHP and preg_replace :)

I don't have any code, only a question:

Is there a regular expression in Perl that can separate, with a space,
all merged words in a text file, except for www and e-mail addresses?

example text:
"www.google.com any,merged.words mymail@yahoo.com"

incorrect result:
www. google. pl any, merged. words  mymail@yahoo. com

correct result:
www.google.pl any, merged. words  mymail@yahoo.com

Piotr
09-26-04 02:07 PM


Re: how to separate all but www addresses?
Piotr wrote:
> On Sun, 26 Sep 2004 07:30:38 -0400, Paul Lalli wrote:
>
> ok, let's forget about PHP and preg_replace :)
>
> I don't have any code, only a question:
>
> Is there a regular expression in Perl that can separate, with a space,
> all merged words in a text file, except for www and e-mail addresses?
>

perldoc -f split

HTH

Abhinav

Abhinav
09-26-04 02:07 PM


Re: how to separate all but www addresses?
Piotr wrote:
> Is there a regular expression in Perl that can separate, with a space,
> all merged words in a text file, except for www and e-mail addresses?
>
> example text:
> "www.google.com any,merged.words mymail@yahoo.com"
>
> incorrect result:
> www. google. pl any, merged. words  mymail@yahoo. com
>
> correct result:
> www.google.pl any, merged. words  mymail@yahoo.com

What are your criteria (on a technical level, not your meta-knowledge about
the Internet) for differentiating between what you call words and www
addresses?
Is foo.bar.eu a list of words or a www address?

jue



Jürgen Exner
09-26-04 08:56 PM


Re: how to separate all but www addresses?
Piotr <piou@gaztea.pl> wrote:


> ok, let's forget about PHP


That is to be expected in a Perl newsgroup.


> Is there a regular expression in Perl


I can give a Perl "solution"; it is up to you to translate it into
some other language if you want it in some other language.


> that can separate, with a space, all merged words in a text file,
> except for www and e-mail addresses?


The hard part is determining how accurately you can describe
"URL" (not a "www address") and "email addresses".

I'll use rather crude definitions in the code below; you may
need to replace them with more accurate ones.



> example text:
> "www.google.com any,merged.words mymail@yahoo.com"
             ^^^^
> correct result:
> www.google.pl any, merged. words  mymail@yahoo.com
            ^^^

Huh?



-------------------------------------------
#!/usr/bin/perl
use warnings;
use strict;

$_ = 'www.google.com any,merged.words mymail@yahoo.com';
s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;
print "$_\n";

sub is_addr {
   my($adr) = @_;
   return 1 if $adr =~ /^www\./;    # legal URLs may not match this...
   return 1 if $adr =~ tr/@// == 1; # legal email adrs may not match this either
   return 0;
}

sub separate {
   my($text) = @_;
   $text =~ s/(\W)/$1 /g;
   return $text;
}
-------------------------------------------


--
Tad McClellan                          SGML consulting
tadmc@augustmail.com                   Perl programming
Fort Worth, Texas

Tad McClellan
09-26-04 08:56 PM


Re: how to separate all but www addresses?
On Sun, 26 Sep 2004 09:46:33 -0500, Tad McClellan wrote:

It should be www.google.com, of course. Sorry for my mistake :)

> $_ = 'www.google.com any,merged.words mymail@yahoo.com';
> s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;
> print "$_\n";
>
> sub is_addr {
>    my($adr) = @_;
>    return 1 if $adr =~ /^www\./;    # legal URLs may not match this...
>    return 1 if $adr =~ tr/@// == 1; # legal email adrs may not match this either
>    return 0;
> }
>
> sub separate {
>    my($text) = @_;
>    $text =~ s/(\W)/$1 /g;
>    return $text;

Thanks for your suggestion. In my case, all strings containing www, http
or @ are addresses. I don't have any addresses without http:// or www.

Piotr
09-26-04 08:56 PM
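
For readers following along, here is a minimal sketch of how the approach posted in this thread might be adapted to the rule described in the last message (anything starting with http:// or www., or containing a single @, is treated as an address). It is not code from any poster. The fix_text() helper, the case-insensitive prefix match, and the lookahead in separate() (which avoids adding a space after trailing punctuation) are assumptions made for illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Assumed rule, taken from the last post: a token is an address if it
# starts with "http://" or "www.", or contains exactly one "@".
sub is_addr {
    my ($token) = @_;
    return 1 if $token =~ m{^(?:http://|www\.)}i;
    return 1 if ($token =~ tr/@//) == 1;   # tr with an empty replacement only counts
    return 0;
}

# Insert a space after each non-word character that is directly followed
# by a word character (assumption: trailing punctuation gets no space).
sub separate {
    my ($text) = @_;
    $text =~ s/(\W)(?=\w)/$1 /g;
    return $text;
}

# Hypothetical wrapper: visit each whitespace-delimited token, keep
# address-like tokens as they are, and separate everything else.
sub fix_text {
    my ($line) = @_;
    $line =~ s/(\S+)/ is_addr($1) ? $1 : separate($1) /ge;
    return $line;
}

print fix_text('http://www.google.com any,merged.words mymail@yahoo.com'), "\n";
```

With the sample string from the thread, the two address tokens should pass through unchanged while "any,merged.words" is split. Real URLs and e-mail addresses can take forms these checks will miss, so the tests in is_addr() would need tightening for anything beyond this kind of input.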


PERL Miscellaneous archive
