Author spaced text
Norbert_L.

2006-10-30, 7:03 pm

Would someone please guide me with this problem, for which I have not
found the slightest hint on the web:

I have texts where some words are s p a c e d by adding blanks between
characters. I am looking for a regular expression to get rid of those
spaces.

If this is not a good group for this question, could someone please
guide me to a better one?

Thank you very much in advance.

Paul Lalli

2006-10-30, 7:03 pm

Norbert_L. wrote:
> Would someone please guide me with this problem for which I found not
> the slightest hint on the web:
>
> I have texts where some words are s p a c e d by adding blanks between
> characters. I am looking for a regular expression to get rid of those
> spaces.
>
> If this is no good group to ask this question, could someone please
> guide me to a better one?


How do you determine which sequences of characters are "spaced words" and
which are just one-letter words? How do you determine where one
"spaced word" ends and another begins?

H o w m a n y w o r d s d o I h a v e i n a s e n t e n c e l i k e t h
i s ?

Until you better define your problem set, there's no way to answer your
question.

Paul Lalli

Norbert_L.

2006-10-30, 7:03 pm

Oh, I apologize - I use a language that has no one-letter words
(German, that is). So I thought about looking for two letters or a
punctuation mark, then at least one whitespace, .... and at the end a
whitespace and an end-of-line | -page | printable character. My problem
is how to match the unknown number of [alpha blank] sequences.

Paul Lalli wrote:

> Norbert_L. wrote:
>
> How do you determine which sequences of characters are "spaced words" and
> which are just one-letter words? How do you determine where one
> "spaced word" ends and another begins?
>
> H o w m a n y w o r d s d o I h a v e i n a s e n t e n c e l i k e t h
> i s ?
>
> Until you better define your problem set, there's no way to answer your
> question.
>
> Paul Lalli
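
One way to get at the "unknown number of [alpha blank] sequences" is to match the whole run at once and delete the blanks inside it in the replacement. The following is only a sketch under the stated assumption that the language has no one-letter words; the sub name `join_spaced_runs` and the sample sentence are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse every run of single letters separated by blanks.
sub join_spaced_runs {
    my ($text) = @_;
    $text =~ s{
        (?<![[:alpha:]])                      # no letter directly before the run
        ( [[:alpha:]] (?: [ ]+ [[:alpha:]] )+ ) # letter, then (blanks + letter) repeated
        (?![[:alpha:]])                       # no letter directly after the run
    }{ (my $run = $1) =~ tr/ //d; $run }xge;  # delete the blanks inside the run
    return $text;
}

print join_spaced_runs("Ein W o r t und noch e i n s."), "\n";
# prints: Ein Wort und noch eins.
```

The `(?: ... )+` group is what absorbs the unknown number of repetitions. Two spaced words that follow each other with only a single blank between them would still be joined into one, which is the ambiguity raised earlier in the thread.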


nobull67@gmail.com

2006-10-30, 7:03 pm



On Oct 25, 3:14 pm, "Norbert_L." <n...@web.de> top-post:

[ please don't top post, it's rude ]

> Oh, I apologize - I use a language that has no one-letter words
> (German, that is). So I thought about looking for two letters or a
> punctuation mark, then at least one whitespace, .... and at the end a
> whitespace and an end-of-line | -page | printable character. My problem
> is to access the unknown number of sequences [alpha blank].


Isn't your question just how to remove all the spaces that are between
single letters?

s{                                  # look for...
  (?<![[:alpha:]])                  # not preceded by a letter
  ([[:alpha:]])                     # one letter
  \                                 # one space
  (?=[[:alpha:]](?![[:alpha:]]))    # one and _only_ one letter
}{$1}xg;
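
To try the substitution on real text, it can be wrapped in a small sub, roughly as sketched here. The sub name `despace` and the sample sentences are invented for illustration. The look-behind is written as a plain negative one, so the parentheses balance and `$1` holds the captured letter.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution discussed above.
sub despace {
    my ($text) = @_;
    $text =~ s{
        (?<![[:alpha:]])                  # not preceded by a letter
        ([[:alpha:]])                     # one letter
        [ ]                               # one space
        (?=[[:alpha:]](?![[:alpha:]]))    # followed by exactly one letter
    }{$1}xg;
    return $text;
}

print despace("Das ist g e s p e r r t gesetzt."), "\n";
# prints: Das ist gesperrt gesetzt.
```

Because each match removes only one blank, the `/g` flag does the repetition, and both look-arounds inspect the original text rather than the partly rewritten one. Note that `despace("s o g u t")` returns `sogut`: spaced words separated by a single blank cannot be told apart. Letters outside ASCII, such as umlauts, match `[[:alpha:]]` only if Perl is treating the input as decoded characters, which depends on how the text is read in.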



Copyright 2008 codecomments.com