Code Comments
Programming Forum and web based access to our favorite programming groups.Hi. I want to generate text based on a given regex. just any text that satisifies the regex. Are there any existing tools/libraries to do that. [The harder question of coming up with strings that satisfy a context free grammar has been discussed at length. See for example http://compilers.iecc.com/comparch/article/91-04-055 -John]
Post Follow-up to this messageMidhat wrote: > Hi. I want to generate text based on a given regex. just any text that > satisifies the regex. Are there any existing tools/libraries to do > that. I don't know any tools/libraries, but it seems like this problem isn't so difficult. * Create a syntax tree of the regex. * Generate text (T(r)) for the leafs of the tree (r). The atomic regexes: - If r = a character, T(r) = that character - If r = a character class, T(r) = any character satisfying that class - If r = . , T(r) = any character (except perhaps \n) * Recursively generate text (T(R)) for the non-atomic sub-regexes (R) of the tree, assuming text has already been generated for all the immediate sub-regexes of R: - If R = x*, T(R) = a concatenation of any number of T(x). - If R = x+, T(R) = a concatenation of 1 or more of T(x). - If R = x?, T(R) = the empty string or T(x). - If R = xy, T(R) = the concatenation of T(x) and T(y). - If R = x|y, T(R) = T(x) or T(y) - If R = (x), T(R) = T(x) Any choices you make here will lead to T(regex), the text you want. Of course, if any text will do, you can immediately return empty strings for x* and x?, without processing x. And I am assuming here that we are talking about regular expressions in the classical sense. So without recalling previously captured sub-expressions and such. However, if you want to have those anyway, there are still relatively easy adaptations you can make to this algorithm to make it work. Good luck! -- Michiel Helvensteijn
Post Follow-up to this messageOn Mar 24, 2:48 am, Midhat <midhat...@gmail.com> wrote: > Hi. I want to generate text based on a given regex. just any text that > satisifies the regex. Are there any existing tools/libraries to do > that. > [The harder question of coming up with strings that satisfy a > context free grammar has been discussed at length. See for examplehttp://compilers.iecc.com/comparch/article/91-04-055-John] Wow. I really admire John's memory, having posted on the CFG topic 17 years ago. It would be a simple matter to convert a regex to a regular grammar. The Common Lisp code in the last article would then print all the strings given by the regex in increasing order of length. [Hey, I have this search engine on the archives, you know. -John]
Post Follow-up to this message
Show a Printable Version
Email This Page to Someone!
Receive updates to this thread
Powered by vBulletin
Copyright 2000-2006 Jelsoft Enterprises Limited.