A RegEx question (PERL Beginners, July 2004)
| Ian Marlier 2004-07-28, 8:56 pm |
| Hi, all --
I'm in the process of writing a script to migrate from one wiki package to
another.
The old wiki holds articles in a series of flat text files. The new one
holds everything in MySQL, so I need to parse the text files into a single
SQL import script. I've got most of it, but there's one place where I'm
getting stuck and could use some help.
The old wiki holds links in one of these formats:
[WikiLink]
Or
[Text That Links|WikiLink]
Or
[Text That Links|http://someurl.com/]
The new one holds them in the form
((WikiLink))
Or
((WikiLink|Text That Links))
Or
((http://someurl.com/|Text That Links))
Replacing the individual characters is easy.
But how can I grab the "link" and "text" parts out (IF there is a "|" in the
middle), and reverse them?
Any help appreciated.
Thanks,
Ian
| |
| Jupiterhost.Net 2004-07-28, 8:56 pm |
|
Ian Marlier wrote:
> Hi, all --
Howdy,
> I'm in the process of writing a script to migrate from one wiki package to
> another.
>
> The old wiki holds articles in a series of flat text files. The new one
> holds everything in MySQL, so I need to parse the text files into a single
> SQL import script. I've got most of it, but there's one place where I'm
> getting stuck and could use some help.
>
> The old wiki holds links in one of these formats:
> [WikiLink]
> Or
> [Text That Links|WikiLink]
> Or
> [Text That Links|http://someurl.com/]
>
> The new one holds them in the form
> ((WikiLink))
> Or
> ((WikiLink|Text That Links))
> Or
> ((http://someurl.com/|Text That Links))
>
>
> Replacing the individual characters is easy.
>
> But how can I grab the "link" and "text" parts out (IF there is a "|" in the
> middle), and reverse them?
How about something like this:
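my $line = '[Text That Links|http://someurl.com/]';
# "]" is excluded from the first group so a match cannot span two links;
# /g converts every piped link on the line, and lines without "|" are untouched.
$line =~ s/\[([^\]|]*)\|([^\]]*)\]/(($2|$1))/g;
print "$line\n";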
HTH :)
Lee.M JupiterHost.Net
| |
| Ian Marlier 2004-07-28, 8:56 pm |
| > Ian Marlier wrote:
>
> Howdy,
>
>
>
> How about something like this:
> $line = '[Text That Links|http://someurl.com/]';
> $line =~ s/\[([^\]|]*)\|([^\]]*)\]/(($2|$1))/g;
> print "$line\n";
>
> HTH :)
That helps quite a bit...
Now any thoughts on how I can feed 45,000 lines of text into that? Will a
pipe work? (I'm actually using cat to spool the text and sed to do most of
the replacements, but I'm pretty sure that sed isn't smart enough to do the
conditional).
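For the 45,000-line question, a pipe should work: Perl can read the text line by line from standard input and apply the substitution to each line. Below is a minimal sketch, not a definitive script. It assumes a substitution along the lines of the one suggested above, adjusted here to exclude "]" from the first group and to use /g so several links on one line are handled. The sample text and the in-memory filehandle are stand-ins for the real piped input.

```perl
use strict;
use warnings;

# Stand-in for piped input: an in-memory filehandle over sample wiki text.
# In a real run, read from STDIN instead of $in.
my $sample = "See [WikiLink] first.\nThen [Text That Links|http://someurl.com/].\n";
open my $in, '<', \$sample or die "cannot open in-memory handle: $!";

my @converted;
while (my $line = <$in>) {
    # Swap "text|link" to "link|text" only where a "|" is present;
    # links without a "|" are left as they are.
    $line =~ s/\[([^\]|]*)\|([^\]]*)\]/(($2|$1))/g;
    push @converted, $line;
}
close $in;

print @converted;
```

In a real run the loop would read from STDIN instead of $in, for example as the last stage of the existing cat pipeline.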
| |
| Gunnar Hjalmarsson 2004-07-28, 8:56 pm |
| Ian Marlier wrote:
> I'm in the process of writing a script to migrate from one wiki
> package to another.
<snip>
> The old wiki holds articles in a series of flat text files. The new
> one holds everything in MySQL, so I need to parse the text files
> into a single SQL import script. I've got most of it, but there's
> one place where I'm getting stuck and could use some help.
>
> The old wiki holds links in one of these formats:
> [WikiLink]
> Or
> [Text That Links|WikiLink]
> Or
> [Text That Links|http://someurl.com/]
>
> The new one holds them in the form
> ((WikiLink))
> Or
> ((WikiLink|Text That Links))
> Or
> ((http://someurl.com/|Text That Links))
>
> Replacing the individual characters is easy.
>
> But how can I grab the "link" and "text" parts out (IF there is a
> "|" in the middle), and reverse them?
You can use the s/// operator with the /e modifier, which evaluates the
right side as a Perl expression instead of a string:
s{\[([^\]|]+)(?:(\|)([^\]|]+))?\]}
{ '((' . ($2 ? "$3$2$1" : $1) . '))' }eg;
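To see how the optional group and the /e modifier interact, here is a minimal, self-contained sketch that wraps the substitution above in a sub and runs it on the three link forms from the original question. The sub name convert_links and the @samples list are made up for illustration; only the substitution itself comes from the post.

```perl
use strict;
use warnings;

# Hypothetical helper name; the substitution is the one shown above.
sub convert_links {
    my ($text) = @_;
    # $1 = first part, $2 = "|" (only set if present), $3 = second part.
    $text =~ s{\[([^\]|]+)(?:(\|)([^\]|]+))?\]}
              { '((' . ($2 ? "$3$2$1" : $1) . '))' }eg;
    return $text;
}

# The three old-style link forms from the question.
my @samples = (
    '[WikiLink]',
    '[Text That Links|WikiLink]',
    '[Text That Links|http://someurl.com/]',
);
print convert_links($_), "\n" for @samples;
```

Because $2 is only set when a "|" was matched, the ternary leaves a plain [WikiLink] in its original order and swaps the two parts only for piped links.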
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl