Code Comments


regex question
hi, i'm trying to do some parsing in matlab and would like to extract
'a = 3 b = 4 c = 5 d = 6' such that i get an array of pairs of values


ie: myArray(a) = 3
myArray(b) = 4
myArray(c) = 5 etc

also, i don't know how many pairs of values will be in the original
string

can anyone help a matlab newbie with some kind of regex here?

thx

stroller
04-15-05 09:02 PM


Re: regex question
> hi, i'm trying to do some parsing in matlab and would like to
> extract
> 'a = 3 b = 4 c = 5 d = 6' such that i get an array of pairs of
> values
>
>
> ie: myArray(a) = 3
> myArray(b) = 4
> myArray(c) = 5 etc
>
> also, i don't know how many pairs of values will be in the original
> string

I don't have MATLAB in front of me, so excuse any errors. Just let
me know if it doesn't work and I'll fix it for you.

You could do this:

regexprep(yourstring, ...
['MyArray\(([^\)]+)' ...
'\)\s*=\s*(\d+)\s*'], ...
'$1 = $2');

But if that's all you need, you shouldn't use a regular expression;
use SSCANF.

You could use a much more flexible expression, one that sets what's
in the parentheses to what's after the equals sign. Let me know if
the above expression doesn't suit you.

Michael Robbins
04-16-05 09:02 AM


Re: regex question
hi, thx for getting back to me :)

>
> You could do this:
>
> regexprep(yourstring, ...
> ['MyArray\(([^\)]+)' ...
> '\)\s*=\s*(\d+)\s*'], ...
> '$1 = $2');
>
> But if that's all you need, you shouldn't use regular expression,
> use
> SSCANF.
>

but how do i put this into a loop to collect up all the x=y
assignments? i have no idea how many to expect when i parse this

> You could use a much more flexible expression, one that sets what's
> in the parenthesis to what's after the equals sign. Let me know if
> the above expression doesn't suit you.

i would be interested in seeing a more flexible expression, pray tell

in general there are going to be a lot of scenarios where i don't know
how many of a particular pattern will occur in my string, so a good way
to collect all these up would be very useful

i'm trying to write in matlab what i would usually write in
flex/bison... is this just a bad idea??

thx

stroller
04-16-05 09:02 AM


Re: regex question
also, if i were doing this in perl i could do

my $str = 'a = 3 b = 4 c = 5 d = 6';
$str =~ s/[^a-z0-9]+/ /g;
my %hash = split / /, $str;
print "$_ = $hash{$_}\n" for keys %hash;

can i do anything like this in matlab?

stroller
04-16-05 09:02 AM


Re: regex question
>> regexprep(yourstring, ... 

> but how do i put this in to a loop to collect up all the x=y
> assignments? i have no idea how many to expect when i parse this

If you want to make the actual assignments in the matlab workspace
then use

regexprep(yourstring, ...
['MyArray\(([^\)]+)' ...
'\)\s*=\s*(\d+)\s*'])

You will then get two sets of tokens, the variable letters and the
numbers.

You can use ASSIGNIN to assign the numbers to the letters.

I'm not sure of the exact syntax, but something like

 assignin('caller',token{i}{1},str2num(token{i}{2}))

Try not to use EVAL if you can avoid it.
 

> i would be interested in seeing a more flexible expression, pray
> tell
> in general there are going to be alot of scenarios where i don't
> know
> how of a particular pattern will occur in my string, so a good way
> to
> collect all these up would be very useful

Give me an idea of what other forms they may take.

> i'm trying to write in matlab what i would usually write in
> flex/bison... is this just a bad idea??

I'm not familiar with that language, but MATLAB has pretty decent
text manipulation capabilities now. It's a little different from
many text-oriented languages since it is primarily C-like and
designed for traditional calculations. That often becomes an
advantage when using it for text manipulation. Many tricks are
available to you if you remember that, to MATLAB, a text string is
just an array of ASCII codes.
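
To illustrate the point about strings being numeric arrays, here is a minimal sketch. The variable names are made up for the example and are not from the thread:

```matlab
s = 'abc';
codes = double(s);        % numeric character codes of each letter
shifted = char(s + 1);    % arithmetic acts on the codes; CHAR turns them back into text
isDigit = (s >= '0') & (s <= '9');   % logical mask marking digit characters
```

Comparisons and arithmetic like these work element by element, which is why ordinary array functions are often enough for simple text handling.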

You should become *very* familiar with the following functions if you
want to do parsing in MATLAB. There are many useful ones, but these
come to mind:

REPMAT
SETDIFF
DIFF
INTERSECT
+
CHAR
CELLSTR
SSCANF

Michael Robbins
04-16-05 09:02 AM


Re: regex question
In the previous post, I meant you should use REGEXP not REGEXPREP to
get the tokens.

> also, if i were doing this in perl i could do
>
> my $str = 'a = 3 b = 4 c = 5 d = 6';
> $str =~ s/[^a-z0-9]+/ /g;
> my %hash = split / /, $str;
> print "$_ = $hash{$_}\n" for keys %hash;
>
> can i do anything like this in matlab?

Yes, what I showed you should do it. You can split pretty easily but
why bother when REGEXP will give you the tokens directly?

The cell format of the output of REGEXP is a little unwieldy. I
stumble on it now and again, but I suppose it's difficult to have
such a flexible output and make it easy to anticipate and manipulate.

Functions that will help you with the cell-cellstr format include
CELL, CELLSTR, ISCELL, ISCELLSTR.

You must drill down into the structure until you find a cellstring
and then you can use {:} to extract the string.

while ~iscellstr(yourtokens) && iscell(yourtokens)
yourtokens = yourtokens{:};
...

I don't mean that literally, because the structure of your output may
be very complex.

Michael Robbins
04-16-05 09:02 AM


Re: regex question
> If you want to make the actual assignments in the matlab workspace
> then use
>
> regexprep(yourstring, ...
> ['MyArray\(([^\)]+)' ...
> '\)\s*=\s*(\d+)\s*'])
>
> You will then get two sets of tokens, the variable letters and the
> numbers.
>

sorry, i'm a matlab newbie and i didn't quite understand this..

when i type in this command i get nothing returned... MyArray is not
set either...

i guess i'd like to see a code fragment that populates MyArray so i
can understand this a little better

the language i'm parsing could have

a = 1 b = 2 c = 3 .... and so on, there's no way of knowing how many
of these will show up in the files i get, so i need something that
will build the array for me in this scenario

again, sorry for the newbie questions

stroller
04-16-05 09:02 AM


Re: regex question
> when i type in this command i get nothing returned... MyArray is
> not set either...

MATLAB is not like Perl; it is a traditional language and you must
assign values. Also, I made the error of typing "MyArray" in the
regex instead of "myArray". Using REGEXPI instead of REGEXP will
ignore case if you want to avoid a similar error.

I don't have MATLAB here to test the code so I'm prone to typos.

yourstring=sprintf(['myArray(a) = 3 \n' ...
'myArray(b) = 4 \nmyArray(c) = 5\n');
[startn endn extents match tokens names] = ...
regexp(yourstring, ...
'MyArray\(([^\)]+)\)\s*=\s*(\d+)\s*');

> i guess i'd like to see a code fragment that populates MyArray so i
> can understand this a little better

In this example, the variable TOKENS will contain both your variable
names and your values (in string format). The docs are here
<http://www.mathworks.com/ac...ref/regexp.html>

Without matlab, I'm not sure exactly what form TOKENS will take, so
you will have to play with the syntax to extract the data. You may
have to type TOKENS{i}{1} to get the variable name and
STR2NUM(TOKENS{i}{2}) to get the value associated with that variable.
It may be some other similar syntax.

If that is the case,

for i=1:length(tk)
 assignin('caller',tokens{i}{1},str2num(tokens{i}{2});
end;

should assign the values. Again, my syntax for using TOKENS may be
off. Help for ASSIGNIN is available here <http://www.mathworks.com/access/hel...f/assignin.html>

> the language i'm parsing could have
> a = 1 b = 2 c = 3 .... and so on, there's no way of knowing how
> many
> of these will show up in the files i get, so i need something that
> will build the array for in this scenario

This regex should handle any number of variable-value pairs. It will
match any variable in the parentheses with the integer immediately
following the equals sign.

Michael Robbins
04-16-05 01:58 PM


Re: regex question
I finally got to test the code and I missed a closing parenthesis.
Here's working code:
 
yourstring=sprintf(['myArray(a) = 3 \n' ...
'myArray(b) = 4 \nmyArray(c) = 5\n']);
[startn endn extents match tokens names] = ...
regexp(yourstring, ...
'myArray\(([^\)]+)\)\s*=\s*(\d+)\s*');
for i=1:length(tokens)
 assignin('caller',tokens{i}{1},str2num(tokens{i}{2}));
end;
fprintf('\nINPUT\n%s\n',yourstring);
fprintf('OUTPUT\na=%d\nb=%d\nc=%d\n',[a b c].');

INPUT
myArray(a) = 3
myArray(b) = 4
myArray(c) = 5

OUTPUT
a=3
b=4
c=5 

Michael Robbins
04-17-05 02:00 AM


Re: regex question
Here are a couple of regexp tips to help with this problem.

The following assumes you have a variable in your workspace:
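
The snippet that belongs here is missing from this copy of the post. Presumably it defined the input string. A stand-in, assuming the string from the original question and the hypothetical variable name str:

```matlab
% Assumed input: the string from the original question.
% The variable name str is a guess, not taken from the post.
str = 'a = 3 b = 4 c = 5 d = 6';
```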
 

If you want to use the tokens output, you can ask for it directly:
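
The code for this step is also missing here. A minimal sketch of asking REGEXP for the tokens directly, assuming the input string from the original question; the pattern and variable names are my own guesses, not the poster's:

```matlab
% Assumed input string (from the original question)
str = 'a = 3 b = 4 c = 5 d = 6';

% The 'tokens' option makes regexp return only the captured groups.
% (\w+) captures the name, (\d+) captures the integer after the equals sign.
tokens = regexp(str, '(\w+)\s*=\s*(\d+)', 'tokens');

% tokens is a cell array with one cell per match;
% each cell is itself a 1-by-2 cell: {name, value-as-text}.
for k = 1:numel(tokens)
    name  = tokens{k}{1};
    value = str2double(tokens{k}{2});
    fprintf('%s -> %d\n', name, value);
end
```

The number of matches is not fixed anywhere in the pattern, so this should cope with however many pairs the string contains.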
 

However, anytime you find yourself using the tokens output, you should
consider using named tokens:
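
The named-token example did not survive in this copy either. A sketch of what it might look like, assuming the same input string; only the field names lhs and rhs come from the post, the rest is illustrative:

```matlab
% Assumed input string (from the original question)
str = 'a = 3 b = 4 c = 5 d = 6';

% (?<lhs>...) and (?<rhs>...) give the captured groups names;
% the 'names' option returns them as struct fields.
names = regexp(str, '(?<lhs>\w+)\s*=\s*(?<rhs>\d+)', 'names');

% In recent MATLAB releases I'd expect a struct array with
% one element per match, e.g. names(1).lhs and names(1).rhs.
firstName  = names(1).lhs;
firstValue = names(1).rhs;
```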
 

This returns a structure with fields named 'lhs' and 'rhs' which
correspond to the names given in the expression.  Structures in MATLAB
will serve the purpose that you are used to getting from Perl in the
form of associative arrays.

Using the names structure you can convert it to a more useful structure
like this:
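
The conversion code is missing from this copy of the post. One way to get the structure shown is a loop with dynamic field names. This is a sketch under the same assumptions about the input string and pattern, and not necessarily what the poster wrote:

```matlab
% Assumed input string and named-token pattern
str = 'a = 3 b = 4 c = 5 d = 6';
names = regexp(str, '(?<lhs>\w+)\s*=\s*(?<rhs>\d+)', 'names');

% Build one struct whose field names are the left-hand sides.
% Values stay as text, matching the output shown in the post.
myStruct = struct();
for k = 1:numel(names)
    myStruct.(names(k).lhs) = names(k).rhs;   % dynamic field name
end
```

If numbers are wanted instead of text, wrapping the right-hand side in str2double inside the loop should do it. This only works when every left-hand side is a valid MATLAB field name.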
 

myStruct =

a: '3'
b: '4'
c: '5'
d: '6'

Good luck!

-=>J

Jason Breslau
04-18-05 09:01 PM

