Home > Archive > PERL Beginners > July 2004 > A RegEx question










Author A RegEx question
Ian Marlier

2004-07-28, 8:56 pm

Hi, all --

I'm in the process of writing a script to migrate from one wiki package to
another.

The old wiki holds articles in a series of flat text files. The new one
holds everything in MySQL, so I need to parse the text files into a single
SQL import script. I've got most of it, but there's one place where I'm
getting stuck and could use some help.

The old wiki holds links in one of these formats:
[WikiLink]
Or
[Text That Links|WikiLink]
Or
[Text That Links|http://someurl.com/]

The new one holds them in the form
((WikiLink))
Or
((WikiLink|Text That Links))
Or
((http://someurl.com/|Text That Links))


Replacing the individual characters is easy.

But how can I grab the "link" and "text" parts out (IF there is a "|" in the
middle), and reverse them?

Any help appreciated.

Thanks,

Ian

Jupiterhost.Net

2004-07-28, 8:56 pm



Ian Marlier wrote:
> Hi, all --


Howdy,

> I'm in the process of writing a script to migrate from one wiki package to
> another.
>
> The old wiki holds articles in a series of flat text files. The new one
> holds everything in MySQL, so I need to parse the text files into a single
> SQL import script. I've got most of it, but there's one place where I'm
> getting stuck and could use some help.
>
> The old wiki holds links in one of these formats:
> [WikiLink]
> Or
> [Text That Links|WikiLink]
> Or
> [Text That Links|http://someurl.com/]
>
> The new one holds them in the form
> ((WikiLink))
> Or
> ((WikiLink|Text That Links))
> Or
> ((http://someurl.com/|Text That Links))
>
>
> Replacing the individual characters is easy.
>
> But how can I grab the "link" and "text" parts out (IF there is a "|" in the
> middle), and reverse them?



How about something like this:
$line = '[Text That Links|http://someurl.com/]';
# swap the text and link parts; a line with no "|" is left unchanged
$line =~ s/\[([^\]|]*)\|([^\]]*)\]/(($2|$1))/g;
print "$line\n";
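
The capture-and-swap idea above only touches links that contain a "|", so a plain [WikiLink] still needs its brackets converted. Here is a minimal sketch that runs all three formats from the question through both steps. The sub name convert_links and the second substitution are assumptions added for illustration, not part of the posted answer.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is not from the thread):
# convert every old-style link found in one line of text.
sub convert_links {
    my ($line) = @_;

    # [Text|Target] -> ((Target|Text)); neither part may contain ']' or '|'
    $line =~ s/\[([^\]|]*)\|([^\]|]*)\]/(($2|$1))/g;

    # [WikiLink] -> ((WikiLink)); only links without a '|' are left by now
    $line =~ s/\[([^\]|]+)\]/(($1))/g;

    return $line;
}

# The three formats listed in the question
for my $old ('[WikiLink]',
             '[Text That Links|WikiLink]',
             '[Text That Links|http://someurl.com/]') {
    print convert_links($old), "\n";
}
```

Excluding "]" from the first capture keeps a match from starting at an earlier plain link on the same line and running on to a later "|".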

HTH :)

Lee.M JupiterHost.Net
Ian Marlier

2004-07-28, 8:56 pm

> Ian Marlier wrote:
>
> Howdy,
>
>
>
> How about something like this:
> $line = '[Text That Links|http://someurl.com/]';
> # swap the text and link parts; a line with no "|" is left unchanged
> $line =~ s/\[([^\]|]*)\|([^\]]*)\]/(($2|$1))/g;
> print "$line\n";
>
> HTH :)


That helps quite a bit...

Now any thoughts on how I can feed 45,000 lines of text into that? Will a
pipe work? (I'm actually using cat to spool the text and sed to do most of
the replacements, but I'm pretty sure that sed isn't smart enough to do the
conditional).

Gunnar Hjalmarsson

2004-07-28, 8:56 pm

Ian Marlier wrote:
> I'm in the process of writing a script to migrate from one wiki
> package to another.


<snip>

> The old wiki holds articles in a series of flat text files. The new
> one holds everything in MySQL, so I need to parse the text files
> into a single SQL import script. I've got most of it, but there's
> one place where I'm getting stuck and could use some help.
>
> The old wiki holds links in one of these formats:
> [WikiLink]
> Or
> [Text That Links|WikiLink]
> Or
> [Text That Links|http://someurl.com/]
>
> The new one holds them in the form
> ((WikiLink))
> Or
> ((WikiLink|Text That Links))
> Or
> ((http://someurl.com/|Text That Links))
>
> Replacing the individual characters is easy.
>
> But how can I grab the "link" and "text" parts out (IF there is a
> "|" in the middle), and reverse them?


You can use the s/// operator and have the right side evaluated as an
expression:

s{\[([^\]|]+)(?:(\|)([^\]|]+))?\]}
{ '((' . ($2 ? "$3$2$1" : $1) . '))' }eg;
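
On the earlier question of feeding 45,000 lines through this: a pipe should work, since the substitution can be applied line by line to whatever filehandle the text arrives on. A rough sketch, assuming each link sits entirely on one line. The sub name convert_stream and the sample text are made up for illustration, and an in-memory filehandle stands in for the real input so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: read old-format lines from $in and
# write the converted lines to $out.
sub convert_stream {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        # Same substitution as above: $2 is set only when a '|' was
        # present, and in that case the two parts are swapped.
        $line =~ s{\[([^\]|]+)(?:(\|)([^\]|]+))?\]}
                  { '((' . ($2 ? "$3$2$1" : $1) . '))' }eg;
        print {$out} $line;
    }
    return;
}

# Demo on in-memory filehandles; the sample text is invented.
my $sample = "See [HomePage] or [the docs|http://someurl.com/].\n"
           . "No links here.\n";
my $result = '';
open my $in,  '<', \$sample or die "cannot open input: $!";
open my $out, '>', \$result or die "cannot open output: $!";
convert_stream($in, $out);
close $out;
close $in;
print $result;
```

To use it as a filter at the end of a cat/sed pipeline, call convert_stream(\*STDIN, \*STDOUT) instead of the demo lines. Note that this converts every bracketed run of text, so anything in the old pages that uses square brackets for another purpose would need to be excluded first.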

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl