Code Comments

Get all strings matching given RegExp
Can I get a sequence of all strings that can match a given regular
expression? For example, for the expression '(a|b)(x|y)' it would be
['ax', 'ay', 'bx', 'by'].

It would be useful, for example, to pass these strings to a search engine
that does not support RegExp (thereby adding such support to it). A program
could also let the user specify a sequence of strings using a RegExp
(filenames to process, etc.). If there are other types of expressions for
these purposes, please let me know.

I know that for some expressions there would be infinitely many
matching strings, but these aren't the cases I'm considering. It'd still
be possible if the string length is limited (there might be a large but
finite number of matching strings).

Thanks

Alex9968
04-03-08 11:29 AM


Re: Get all strings matching given RegExp
I don't think there is any built-in way. Regular expressions are
compiled into an expanded pattern internally, but I don't think
it is anything that would be useful for you to access directly.

If you are interested in a lot of work, you could do something with
PLY and write an re parser that would expand it into a series of
possible textual matches :)
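
A minimal sketch of the kind of expander suggested above, hand-written as a small recursive-descent parser rather than built with PLY. It is only an illustration under stated assumptions: the function name `expand` is hypothetical, and the supported syntax is limited to literal characters, grouping with ( ) and alternation with |. Quantifiers, character classes and escapes are not handled, so every accepted pattern has a finite set of matches.

```python
from itertools import product


def expand(pattern):
    """Return every string matched by `pattern` (restricted syntax only).

    Supported: literal characters, grouping with ( ), alternation with |.
    Quantifiers, character classes and escapes are NOT handled.
    """
    pos = 0  # current read position in `pattern`

    def parse_alternation():
        nonlocal pos
        results = parse_concat()
        while pos < len(pattern) and pattern[pos] == '|':
            pos += 1  # skip '|'
            results += parse_concat()
        return results

    def parse_concat():
        nonlocal pos
        parts = []  # one list of alternatives per atom
        while pos < len(pattern) and pattern[pos] not in '|)':
            if pattern[pos] == '(':
                pos += 1  # skip '('
                parts.append(parse_alternation())
                if pos >= len(pattern) or pattern[pos] != ')':
                    raise ValueError('unbalanced parenthesis')
                pos += 1  # skip ')'
            else:
                parts.append([pattern[pos]])  # a literal character
                pos += 1
        # Concatenation is the Cartesian product of each atom's alternatives.
        return [''.join(combo) for combo in product(*parts)]

    strings = parse_alternation()
    if pos != len(pattern):
        raise ValueError('unexpected %r at position %d' % (pattern[pos], pos))
    return strings


print(expand('(a|b)(x|y)'))  # prints ['ax', 'ay', 'bx', 'by']
```

The result may contain duplicates (for example with 'a|a'); wrapping it in set() would remove them if that matters.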

Jeff
04-03-08 01:44 PM


Re: Get all strings matching given RegExp
Alex9968 <noname9968@gmail.com> wrote:
> Can I get a sequence of all strings that can match a given regular
> expression? For example, for the expression '(a|b)(x|y)' it would be
> ['ax', 'ay', 'bx', 'by'].
>
> It would be useful, for example, to pass these strings to a search engine
> that does not support RegExp (thereby adding such support to it). A program
> could also let the user specify a sequence of strings using a RegExp
> (filenames to process, etc.). If there are other types of expressions for
> these purposes, please let me know.
>
> I know that for some expressions there would be infinitely many
> matching strings, but these aren't the cases I'm considering. It'd still
> be possible if the string length is limited (there might be a large but
> finite number of matching strings).

This will give you all (byte-)strings up to a given length which match a
given regular expression. But beware, it can be slow ;)

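import re

# One-character strings for code points 0..255 (the byte range).
all_chars = [chr(i) for i in range(256)]

def gen_strings(length, alphabet=all_chars):
    """Yield every non-empty string of up to `length` characters.

    `length` must be at least 1.
    """
    if length == 1:
        for c in alphabet:
            yield c
    else:
        for c in alphabet:
            # The single character itself, then every longer string starting with it.
            yield c
            for s in gen_strings(length - 1, alphabet):
                yield c + s

def regex_matches(regex, max_length, alphabet=all_chars):
    # \Z anchors at the end, so the whole candidate must match, not just a prefix.
    r = re.compile('(' + regex + r')\Z')
    return (s for s in gen_strings(max_length, alphabet) if r.match(s))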
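
A variant of the same brute-force idea, sketched here with itertools.product instead of a recursive generator. This is an illustration, not a replacement for the code above: it also tries the empty string, yields candidates shortest first, and uses re.fullmatch for the whole-string check. The four-character alphabet 'abxy' is an assumption chosen to keep the search space tiny. With all 256 byte values and a maximum length of 3 there would already be more than 16 million candidates.

```python
import re
from itertools import product


def regex_matches(regex, max_length, alphabet):
    """Yield each string over `alphabet`, of length 0..max_length,
    that matches `regex` in its entirety."""
    compiled = re.compile(regex)
    for length in range(max_length + 1):
        # Every combination of `length` characters, in alphabet order.
        for chars in product(alphabet, repeat=length):
            candidate = ''.join(chars)
            # fullmatch requires the entire candidate to match, not just a prefix.
            if compiled.fullmatch(candidate):
                yield candidate


print(list(regex_matches('(a|b)(x|y)', 2, 'abxy')))
# prints ['ax', 'ay', 'bx', 'by']
```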

Marc

Marc Christiansen
04-03-08 01:44 PM

