Flex regular expression problem
| pmatos 2005-08-21, 2:56 am |
| Hi all,
I'm having trouble writing the regular expression for "one or more
printable characters, none of them a ", enclosed between two "
characters". For example:
"hello, how are you"
"212 dasd 2"
"bugasd "
"" - SHOULD NOT WORK
" " - Can work!
I'm trying along the following lines:
identifier \"[[:print:]^\"]+\"
but it never works as I expect. Any ideas?
Cheers,
Paulo Matos
[You can use [^"] for anything except a quote, but there's no lex-ism
to take the difference of two regexps. -John]
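
A note on why the attempt above misbehaves, with a minimal sketch of one
possible fix. In [[:print:]^\"] the ^ is not a negation, because it is not
the first character of the bracket expression. The class therefore still
contains the quote, and the rule can run from the first quote to the last
quote on a line. The sketch below assumes ASCII input and strings that stay
on one line, and spells out the printable characters without the quote as
[ !#-~] (space, !, then # through ~). The messages, the catch-all rule and
main() are illustrative assumptions, not something from the thread.

```lex
%option noyywrap
%{
#include <stdio.h>
%}

%%
\"[ !#-~]+\"    { printf("string: %s\n", yytext); /* at least one printable, non-quote char */ }
\"\"            { printf("rejected: empty string\n"); }
.|\n            { /* skip anything that is not part of a string */ }
%%

int main(void)
{
    yylex();
    return 0;
}
```

Assuming the file is saved under a hypothetical name such as strings.l, it
can be built with "flex strings.l" followed by compiling the generated
lex.yy.c. If control characters inside strings are acceptable, [^\"\n]+
is a shorter alternative to the explicit range. Newer flex manuals also
document a {-} set-difference operator for character classes, for example
[[:print:]]{-}[\"]; whether the installed version supports it is worth
checking before relying on it.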
| |
| Detlef Meyer-Eltz 2005-08-24, 7:00 pm |
| As John already wrote, you can define a string as: "[^"]*"
A string limited to a single line would be: "[^"\r\n]*"
A string limited to a single line, which may contain other double
quotes preceded by a backslash, is:
"([^"\\\r\n]*(\\.[^"\\\r\n]*)*)"
Here the first sub-expression captures the text inside the double
quotes. This expression is optimized according to Friedl's scheme (see
the help file of the TextTransformer).
It is a funny coincidence that you ask this question just now. Yesterday
I finished a dialog in which character classes can be defined by
adding, subtracting or negating predefined character classes,
individual characters, ranges and lists of them. With this dialog,
subtracting '"' from [:print:] yields:
-[:word:] !#\$%&'\(\)\*\+,\./:;<=>\?@\[\\\]\^`\{\|\}~
or
- !#\$%&'\(\)\*\+,\./0123456789:;<=>\?@ABCDEFGHIJKLMNOPQRSTUVWXYZ\[\\\]\^_`abcdefghijklmnopqrstuvwxyz\{\|\}~
(The hyphen is at the beginning, so it doesn't define a range.)
The dialog will be included in the next update of the TextTransformer.
If you like, I can send you a little (Windows) test application right
now.
--
Detlef Meyer-Eltz
--
mailto:Meyer-Eltz@t-online.de
url: http://www.texttransformer.de
url: http://www.texttransformer.com
| |
| Laurence Finston 2005-08-24, 7:00 pm |
| > but it never works as I expect. Any ideas?
You could use a start condition for strings. The scanner should enter it
when it scans the first double quote character and collect all characters
up to the closing double quote, at which point it should leave the start
condition. I would also implement a way of quoting
double quote characters, so that "\"" would be interpreted as a string
containing a double quote character, and not a string containing a
backslash followed by an opening double quote. You could also implement
special handling for other characters or sequences of characters.
Laurence Finston
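
The start-condition approach described above could be sketched roughly as
follows. This is only one possible layout, loosely following the usual flex
idiom of an exclusive start condition. The condition name STRING, the
fixed-size buffer, the append() helper and the choice of which escapes to
translate are all assumptions made for illustration.

```lex
%option noyywrap
%x STRING
%{
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size buffer that collects the string contents. */
static char buf[1024];
static size_t len;

static void append(const char *s, size_t n)
{
    if (len + n < sizeof buf) {      /* text that would overflow is dropped */
        memcpy(buf + len, s, n);
        len += n;
    }
}
%}

%%
\"                    { len = 0; BEGIN(STRING); /* opening quote */ }
<STRING>\"            { buf[len] = '\0'; printf("string: %s\n", buf); BEGIN(INITIAL); }
<STRING>\\n           { append("\n", 1); /* backslash-n becomes a newline */ }
<STRING>\\.           { append(yytext + 1, 1); /* \" and \\ keep the escaped char */ }
<STRING>[^\\\"\n]+    { append(yytext, (size_t)yyleng); }
<STRING>\n            { fprintf(stderr, "unterminated string\n"); BEGIN(INITIAL); }
<STRING><<EOF>>       { fprintf(stderr, "unterminated string at end of input\n"); yyterminate(); }
.|\n                  { /* ignore text outside strings */ }
%%

int main(void)
{
    yylex();
    return 0;
}
```

Because STRING is declared with %x, the rules without a start condition are
inactive while a string is being scanned, so the closing quote is the only
way back to INITIAL apart from the error rules. With these rules the input
"\"" is reported as a string containing a single double quote.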
| |
| Paolo Bonzini 2005-08-31, 3:58 am |
| > "([^"\\\r\n]*(\\.[^"\\\r\n]*)*)".
>
> Here the first sub-expression delivers the text inside of the double
> quotes. This expression is optimized according to Friedl's scheme (see
> the help file of the TextTransformer).
Flex uses a DFA, so you do not really need that optimization; you can
simply write
"(\\.|[^"\\\r\n])*"
Paolo
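
The pattern above is written in generic regex notation. As a hedged sketch
of how it might look as an actual flex rule: the delimiting quotes are
escaped, because an unescaped " in a flex pattern starts a literal string.
The action, the catch-all rule and main() are illustrative assumptions.

```lex
%option noyywrap
%{
#include <stdio.h>
%}

%%
\"(\\.|[^\"\\\r\n])*\"    { printf("string: %s\n", yytext); /* escapes or ordinary chars */ }
.|\n                      { /* skip everything else */ }
%%

int main(void)
{
    yylex();
    return 0;
}
```

Since flex compiles all rules into a DFA, the alternation does not cause
backtracking at match time; the unrolled form mainly benefits backtracking
regex engines. Note that this rule accepts the empty string ""; changing the
* to + would reject it, as the original post asks.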