Code Comments
hi, i'm trying to do some parsing in matlab and would like to extract
'a = 3 b = 4 c = 5 d = 6' such that i get an array of pairs of values

ie: myArray(a) = 3
myArray(b) = 4
myArray(c) = 5 etc

also, i don't know how many pairs of values will be in the original string

can anyone help a matlab newbie with some kind of regex here?

thx

> hi, i'm trying to do some parsing in matlab and would like to extract
> 'a = 3 b = 4 c = 5 d = 6' such that i get an array of pairs of values
>
> ie: myArray(a) = 3
> myArray(b) = 4
> myArray(c) = 5 etc
>
> also, i don't know how many pairs of values will be in the original
> string

I don't have MATLAB in front of me, so excuse any errors. Just let me know if it doesn't work and I'll fix it for you.

You could do this:

regexprep(yourstring, ...
['MyArray\(([^\)]+)' ...
'\)\s*=\s*(\d+)\s*'], ...
'$1 = $2');

But if that's all you need, you shouldn't use a regular expression; use SSCANF.

You could use a much more flexible expression, one that sets what's in the parentheses to what's after the equals sign. Let me know if the above expression doesn't suit you.
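
For reference, a minimal sketch of the SSCANF route mentioned above. It assumes the plain 'a = 3 b = 4' form from the original question, single-letter names, and integer values; the variable names (str, raw, names, vals) are made up for illustration, and the behaviour should be checked on your own MATLAB release.

```matlab
% Sketch only: assumes single-letter names and integer values.
str = 'a = 3 b = 4 c = 5 d = 6';

% '%c' reads one character and '%d' one integer. The trailing space in
% the format skips the whitespace before the next pair. SSCANF reapplies
% the format until the string runs out, so the number of pairs does not
% need to be known in advance.
raw = sscanf(str, '%c = %d ');

% Mixing %c and %d gives a numeric column: code, value, code, value, ...
raw   = reshape(raw, 2, []);
names = char(raw(1, :));   % character codes converted back to letters
vals  = raw(2, :);         % the integer values
```

This breaks down as soon as names are longer than one character, which is where REGEXP becomes the better fit.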

hi, thx for getting back to me :)

> You could do this:
>
> regexprep(yourstring, ...
> ['MyArray\(([^\)]+)' ...
> '\)\s*=\s*(\d+)\s*'], ...
> '$1 = $2');
>
> But if that's all you need, you shouldn't use regular expression,
> use SSCANF.

but how do i put this into a loop to collect up all the x=y assignments? i have no idea how many to expect when i parse this

> You could use a much more flexible expression, one that sets what's
> in the parenthesis to what's after the equals sign. Let me know if
> the above expression doesn't suit you.

i would be interested in seeing a more flexible expression, pray tell

in general there are going to be a lot of scenarios where i don't know how many of a particular pattern will occur in my string, so a good way to collect all these up would be very useful

i'm trying to write in matlab what i would usually write in flex/bison... is this just a bad idea??

thx

also, if i were doing this in perl i could do
my $str = 'a = 3 b = 4 c = 5 d = 6';
$str =~ s/[^a-z0-9]+/ /g;
my %hash = split / /, $str;
print "$_ = $hash{$_}\n" for keys %hash;
can i do anything like this in matlab?

>> regexprep(yourstring, ...
> but how do i put this in to a loop to collect up all the x=y
> assignments? i have no idea how many to expect when i parse this
If you want to make the actual assignments in the matlab workspace
then use
regexprep(yourstring, ...
['MyArray\(([^\)]+)' ...
'\)\s*=\s*(\d+)\s*'])
You will then get two sets of tokens, the variable letters and the
numbers.
You can use ASSIGNIN to assign the numbers to the letters.
I'm not sure of the exact syntax, but something like
assignin('caller',token{i}{1},str2num(token{i}{2}))
Try not to use EVAL if you can avoid it.
> i would be interested in seeing a more flexible expression, pray tell
> in general there are going to be a lot of scenarios where i don't know
> how many of a particular pattern will occur in my string, so a good way
> to collect all these up would be very useful
Give me an idea of what other forms they may take.
> i'm trying to write in matlab what i would usually write in
> flex/bison... is this just a bad idea??
I'm not familiar with that language, but MATLAB has pretty decent
text manipulation capabilities now. It's a little different from
many text-oriented languages since it is primarily C-like and
designed for traditional calculations. That often becomes an
advantage when using it for text manipulation. Many tricks are
available to you if you remember that, to MATLAB, a text string is
just an array of ASCII codes.
You should become *very* familiar with the following functions if you
want to do parsing in MATLAB. There are many useful ones, but these
come to mind:
REPMAT
SETDIFF
DIFF
INTERSECT
+
CHAR
CELLSTR
SSCANF
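
To illustrate the "a string is just an array" point and why + is on the list, here is a small sketch. The variable names are made up for this example; only built-in operators and DOUBLE/CHAR are used.

```matlab
str = 'a = 3 b = 4';

% A char array converts directly to its numeric character codes.
codes = double(str);

% Relational operators work element by element, so a logical mask can
% pick out the digits without any regex.
isDigit    = str >= '0' & str <= '9';
digitsOnly = str(isDigit);

% Arithmetic on characters works too: adding 1 to 'a' gives the code
% for 'b', and CHAR turns it back into text.
nextLetter = char(str(1) + 1);
```

The same masking idea combines with DIFF to find where runs of digits or letters start and end.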

In the previous post, I meant you should use REGEXP not REGEXPREP to
get the tokens.
> also, if i were doing this in perl i could do
>
> my $str = 'a = 3 b = 4 c = 5 d = 6';
> $str =~ s/[^a-z0-9]+/ /g;
> my %hash = split / /, $str;
> print "$_ = $hash{$_}\n" for keys %hash;
>
> can i do anything like this in matlab?
Yes, what I showed you should do it. You can split pretty easily but
why bother when REGEXP will give you the tokens directly?
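
As a rough MATLAB counterpart to the Perl snippet quoted above, here is a sketch that asks REGEXP for the tokens and prints each pair. The pattern is an assumption based on the plain 'a = 3 b = 4' input from the original question, not the myArray(...) form, and the variable names are illustrative.

```matlab
str = 'a = 3 b = 4 c = 5 d = 6';

% Ask only for the tokens. Each match contributes one cell holding
% {name, value}, however many pairs the string contains.
tokens = regexp(str, '(\w+)\s*=\s*(\d+)', 'tokens');

% Print every pair, much like the Perl loop over the hash keys.
for k = 1:numel(tokens)
    fprintf('%s = %s\n', tokens{k}{1}, tokens{k}{2});
end
```

Unlike the Perl hash, the pairs come back in the order they appear in the string.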
The cell format of the output of REGEXP is a little unwieldy. I
stumble on it now and again, but I suppose it's difficult to have
such a flexible output and make it easy to anticipate and manipulate.
Functions that will help you with the cell-cellstr format include
CELL,CELLSTR,ISCELL,ISCELLSTR.
You must drill down into the structure until you find a cellstring
and then you can use {:} to extract the string.
while ~iscellstr(yourtokens) && iscell(yourtokens)
yourtokens = yourtokens{:};
...
end
I don't mean that literally, because the structure of your output may
be very complex.
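
For the simple case of one pattern with two tokens, the drilling down can be sketched as follows. This assumes the plain 'a = 3 b = 4' input and a pattern of my own choosing; other patterns will produce differently shaped output.

```matlab
str    = 'a = 3 b = 4 c = 5 d = 6';
tokens = regexp(str, '(\w+)\s*=\s*(\d+)', 'tokens');

% TOKENS is a 1-by-N cell array, one cell per match. It is not itself a
% cellstr, but each element is a 1-by-2 cellstr {name, value}.
outerIsCellstr = iscellstr(tokens);
innerIsCellstr = iscellstr(tokens{1});

% Stacking the inner cells gives an N-by-2 cellstr that is much easier
% to index: column 1 holds the names, column 2 the values.
pairs = vertcat(tokens{:});
names = pairs(:, 1);
vals  = str2double(pairs(:, 2));
```

STR2DOUBLE accepts a cellstr directly, so no loop is needed to convert the values.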

> If you want to make the actual assignments in the matlab workspace
> then use
>
> regexprep(yourstring, ...
> ['MyArray\(([^\)]+)' ...
> '\)\s*=\s*(\d+)\s*'])
>
> You will then get two sets of tokens, the variable letters and the
> numbers.

sorry, i'm a matlab newbie and i didn't quite understand this..

when i type in this command i get nothing returned... MyArray is not set either...

i guess i'd like to see a code fragment that populates MyArray so i can understand this a little better

the language i'm parsing could have
a = 1 b = 2 c = 3 .... and so on, there's no way of knowing how many of these will show up in the files i get, so i need something that will build the array for me in this scenario

again, sorry for the newbie questions

> when i type in this command i get nothing returned... MyArray is
> not set either...

MATLAB is not like perl, it is a traditional language and you must assign values. Also, I made the error of typing "MyArray" in the regex instead of "myArray". Using REGEXPI instead of REGEXP will ignore case if you want to avoid a similar error.

I don't have MATLAB here to test the code so I'm prone to typos.

yourstring=sprintf(['myArray(a) = 3 \n' ...
'myArray(b) = 4 \nmyArray(c) = 5\n']);
[startn endn extents match tokens names] = ...
regexp(yourstring, ...
'myArray\(([^\)]+)\)\s*=\s*(\d+)\s*');

> i guess i'd like to see a code fragment that populates MyArray so i
> can understand this a little better

In this example, the variable TOKENS will contain both your variable names and your values (in string format). The docs are here
<http://www.mathworks.com/ac...ref/regexp.html>

Without MATLAB, I'm not sure exactly what form TOKENS will take, so you will have to play with the syntax to extract the data. You may have to type TOKENS{i}{1} to get the variable name and STR2NUM(TOKENS{i}{2}) to get the value associated with that variable. It may be some other similar syntax. If that is the case,

for i=1:length(tokens)
assignin('caller',tokens{i}{1},str2num(tokens{i}{2});
end;

should assign the values. Again, my syntax for using TOKENS may be off. Help for ASSIGNIN is available here
<http://www.mathworks.com/access/hel...f/assignin.html>

> the language i'm parsing could have
> a = 1 b = 2 c = 3 .... and so on, there's no way of knowing how many
> of these will show up in the files i get, so i need something that
> will build the array for me in this scenario

This regex should handle any number of variable-value pairs. It will match any variable in the parentheses with the integer immediately following the equals sign.

I finally got to test the code and I missed a closing parenthesis.
Here's working code:
yourstring=sprintf(['myArray(a) = 3 \n' ...
'myArray(b) = 4 \nmyArray(c) = 5\n']);
[startn endn extents match tokens names] = ...
regexp(yourstring, ...
'myArray\(([^\)]+)\)\s*=\s*(\d+)\s*');
for i=1:length(tokens)
assignin('caller',tokens{i}{1},str2num(tokens{i}{2}));
end;
fprintf('\nINPUT\n%s\n',yourstring);
fprintf('OUTPUT\na=%d\nb=%d\nc=%d\n',[a b c].');
INPUT
myArray(a) = 3
myArray(b) = 4
myArray(c) = 5
OUTPUT
a=3
b=4
c=5
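
If you would rather not create workspace variables at all, one alternative (not what the code above does) is to collect the pairs in a single container. The sketch below assumes containers.Map is available, which is only true of MATLAB releases newer than this thread, and uses the plain 'a = 3 b = 4' input; the names m and tok are illustrative.

```matlab
str = 'a = 3 b = 4 c = 5 d = 6';
tok = regexp(str, '(\w+)\s*=\s*(\d+)', 'tokens');

% A map from name to numeric value, similar in spirit to a Perl hash.
% Assumption: containers.Map exists in your MATLAB release.
m = containers.Map('KeyType', 'char', 'ValueType', 'double');
for k = 1:numel(tok)
    m(tok{k}{1}) = str2double(tok{k}{2});
end

% Look up a value by name, and count how many pairs were found.
bValue = m('b');
nPairs = m.Count;
```

This keeps the parsed names out of the workspace, so a variable called i or length in the input cannot shadow anything.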

Here are a couple of regexp tips to help with this problem.

The following assumes you have a variable in your workspace:

If you want to use the tokens output, you can ask for it directly:

However, anytime you find yourself using the tokens output, you should consider using named tokens:

This returns a structure with fields named 'lhs' and 'rhs' which correspond to the names given in the expression. Structures in MATLAB will serve the purpose that you are used to getting from Perl in the form of associative arrays. Using the names structure you can convert it to a more useful structure like this:

myStruct =
    a: '3'
    b: '4'
    c: '5'
    d: '6'

Good luck!
-=>J
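
The code from this post did not survive in the thread, so the following is only a guess at the kind of thing described: named tokens 'lhs' and 'rhs', then a conversion to a structure keyed by name. The pattern, the input string, and the loop are assumptions, not the poster's original code.

```matlab
str = 'a = 3 b = 4 c = 5 d = 6';

% Named tokens: (?<lhs>...) and (?<rhs>...) label the two captures, and
% the 'names' option returns them as a struct array, one element per
% match, with fields lhs and rhs.
names = regexp(str, '(?<lhs>\w+)\s*=\s*(?<rhs>\d+)', 'names');

% Convert to a single structure whose field names are the parsed names,
% using dynamic field references. Values stay as strings, matching the
% myStruct display above.
myStruct = struct();
for k = 1:numel(names)
    myStruct.(names(k).lhs) = names(k).rhs;
end
```

Wrap the right-hand side in STR2DOUBLE if you want numbers instead of strings. Dynamic field names only work when the parsed names are valid MATLAB identifiers.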