Splitting and keeping key/value
| Sandman 2006-09-26, 8:03 am |
| Indata:
-------------------------------------
Date: 2006-04-03
Message: Wonderful! Let's
meet there! I'll call you
later
Sent by: John
-------------------------------------
I want this parsed into:
Array (
[Date] => "2006-04-03",
[Message] => "Wonderful! Let's\nmeet there! ....."
[Sent by] => "John"
)
By defining keywords that data should be split in, in this case
"Date", "Message", "Sent by" and that those should be the first word
on the line and they should be followed by a ":". The Message part in
my actual indata is at no risk of containing any of these keywords.
Any cute ideas on how to solve that? Thanks in advance. :)
--
Sandman[.net]
| |
| Paul Lalli 2006-09-26, 8:03 am |
| Sandman wrote:
> Indata:
> -------------------------------------
> Date: 2006-04-03
> Message: Wonderful! Let's
> meet there! I'll call you
> later
> Sent by: John
> -------------------------------------
Please speak Perl, not some bizarre pseudo-code. Do you mean:
my $Indata = "Date: 2006-04-03
Message: Wonderful! Let's
meet there! I'll call you
later
Sent by: John";
or do you mean:
my @Indata = (
"Date: 2006-04-03\n",
"Message: Wonderful! Let's\n",
"meet there! I'll call you\n",
"later\n",
"Sent by: John\n"
);
?
The difference is important.
> I want this parsed into:
>
> Array (
> [Date] => "2006-04-03",
> [Message] => "Wonderful! Let's\nmeet there! ....."
> [Sent by] => "John"
> )
Is this some sort of pseudo-PHP? Are you aware you posted to a Perl
newsgroup? Do you mean you want:
my %hash = (
'Date' => '2006-04-03',
'Message' => "Wonderful! Let's\nmeet there! ...",
'Sent by' => 'John',
);
?
> By defining keywords that data should be split in, in this case
> "Date", "Message", "Sent by" and that those should be the first word
> on the line and they should be followed by a ":". The Message part in
> my actual indata is at no risk of containing any of these keywords.
>
> Any cute ideas on how to solve that?
I don't know how cute it is, but yes, I could solve that using regular
expressions. Have you made any attempts to solve it yourself yet? If
you post your best attempt, and describe how that attempt is not
working for you, we can probably help you fix it.
Paul Lalli
| |
| Sandman 2006-09-26, 8:03 am |
| In article <1159267729.600837.37420@m7g2000cwm.googlegroups.com>,
"Paul Lalli" <mritty@gmail.com> wrote:
> Sandman wrote:
>
> Please speak Perl, not some bizarre pseudo-code. Do you mean:
> my $Indata = "Date: 2006-04-03
> Message: Wonderful! Let's
> meet there! I'll call you
> later
> Sent by: John";
>
> or do you mean:
> my @Indata = (
> "Date: 2006-04-03\n",
> "Message: Wonderful! Let's\n",
> "meet there! I'll call you\n",
> "later\n",
> "Sent by: John\n"
> );
>
> ?
>
> The difference is important.
My indata is a textfile. Sorry.
>
> Is this some sort of pseudo-PHP?
No.
> Are you aware you posted to a Perl newsgroup?
Yes. Did you or did you not understand the array composition I was
looking for? If you didn't, I would be glad to explain it further as
to avoid confusion.
>
> I don't know how cute it is, but yes, I could solve that using regular
> expressions. Have you made any attempts to solve it yourself yet? If
> you post your best attempt, and describe how that attempt is not
> working for you, we can probably help you fix it.
No, I am currently parsing it by:
if ($body=~m/Message: (.*?)\n/){
my $message = $1;
}
But I want a more modular approach.
--
Sandman[.net]
| |
| Paul Lalli 2006-09-26, 8:03 am |
| Sandman wrote:
> In article <1159267729.600837.37420@m7g2000cwm.googlegroups.com>,
> "Paul Lalli" <mritty@gmail.com> wrote:
>
>
> My indata is a textfile. Sorry.
That completely fails to answer the question. How are you storing this
data *within your program*?
>
> No.
>
>
> Yes. Did you or did you not understand the array composition I was
> looking for?
No, I can only *guess* as to what you meant. My guess may or may not
be correct.
> If you didn't, I would be glad to explain it further as to avoid confusion.
To avoid confusion, just "speak Perl". That way there is no guessing.
Show us an actual Perl data structure that is the result you are
desiring.
>
> No, I am currently parsing it by:
>
> if ($body=~m/Message: (.*?)\n/){
> my $message = $1;
> }
>
> But I want a more modular approach.
Presumably, you want an approach that works, too, since the above
doesn't. Even assuming you have more in your if() statement, which
adds the message to your structure, that would stop $1 at the first
line of the Message, rather than where the message actually ends.
Consider matching all non-colons up to an internal end-of-line (take a
look at the /m modifier for RegExps)
Code the attempt, and let us know if it doesn't work.
Paul Lalli
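
A minimal sketch of the two points above, assuming the record is held in a single scalar named $body (the thread never settles how the data is stored, so that is an assumption). It shows where the posted pattern stops, and how the /m modifier lets '^' pick out the lines that start with a key:

```perl
use strict;
use warnings;

# Assumed input: the whole record in one scalar.
my $body = "Date: 2006-04-03\n"
         . "Message: Wonderful! Let's\n"
         . "meet there! I'll call you\n"
         . "later\n"
         . "Sent by: John\n";

# The posted pattern: '.' does not match a newline without /s, so the
# capture ends at the first line break of the message.
my ($first_line) = $body =~ /Message: (.*?)\n/;
print "Captured: $first_line\n";    # Captured: Wonderful! Let's

# With /m, '^' also matches just after each internal newline, so every
# line that begins with non-colon text followed by ':' yields a key.
my @keys = $body =~ /^([^:\n]+):/mg;
print join(', ', @keys), "\n";      # Date, Message, Sent by
```

The continuation lines ("meet there! ..." and "later") contain no colon, so they are not mistaken for keys. Real data with colons inside the message would need the explicit keyword list instead.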
| |
| Mumia W. (reading news) 2006-09-26, 8:03 am |
| On 09/26/2006 05:29 AM, Sandman wrote:
> Indata:
> -------------------------------------
> Date: 2006-04-03
> Message: Wonderful! Let's
> meet there! I'll call you
> later
> Sent by: John
> -------------------------------------
>
> I want this parsed into:
>
> Array (
> [Date] => "2006-04-03",
> [Message] => "Wonderful! Let's\nmeet there! ....."
> [Sent by] => "John"
> )
>
> By defining keywords that data should be split in, in this case
> "Date", "Message", "Sent by" and that those should be the first word
> on the line and they should be followed by a ":". The Message part in
> my actual indata is at no risk of containing any of these keywords.
>
> Any cute ideas on how to solve that? Thanks in advance. :)
>
>
I would use the substitution operator s/// to repeatedly suck off
keyword and value segments and place them in a hash. The /e option to
s/// allows you to execute complicated expressions, and that's what I would
use here.
Try it yourself.
--
paduille.4058.mumia.w@earthlink.net
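
One possible reading of the s///e suggestion, sketched under assumptions: the record is in a scalar ($body, a made-up name), and a working copy ($rest, also made up) is consumed one key/value segment at a time. This is not necessarily the code the poster had in mind.

```perl
use strict;
use warnings;

# Assumed input: the whole record in one scalar.
my $body = "Date: 2006-04-03\n"
         . "Message: Wonderful! Let's\n"
         . "meet there! I'll call you\n"
         . "later\n"
         . "Sent by: John\n";

my %record;
my $rest = $body;    # working copy; each pass removes one segment

# \A anchors at the very start of what is left. The value runs (via /s)
# up to the next line that looks like "key:" or to the end of the string.
# The /e replacement stores the pair and substitutes an empty string.
1 while $rest =~ s{\A([^:\n]+):[ \t]*(.*?)\s*(?=^[^:\n]+:|\z)}
                  { $record{$1} = $2; '' }mse;

print "$_ => [$record{$_}]\n" for sort keys %record;
```

The loop ends when nothing key-like remains at the start of $rest, which for this input means $rest is empty.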
| |
| Peter J. Holzer 2006-09-26, 8:03 am |
| On 2006-09-26 12:21, Paul Lalli <mritty@gmail.com> wrote:
> Sandman wrote:
>
> That completely fails to answer the question. How are you storing this
> data *within your program*.
There is no reason why that data should be stored within the program at
all. The file can be read line by line and the array/hash/whatever
datastructure can be constructed on the fly. Slurping the whole file
into memory may make constructing the desired data structure easier
(hard to tell from the vague descriptions Sandman gave us), but it is
certainly not required.
hp
--
_ | Peter J. Holzer | > Wieso sollte man etwas erfinden was nicht
|_|_) | Sy min WSR | > ist?
| | | hjp@hjp.at | Was sonst wäre der Sinn des Erfindens?
__/ | http://www.hjp.at/ | -- P. Einstein u. V. Gringmuth in desd
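
A rough sketch of the line-by-line approach described above. An in-memory filehandle stands in for the real file so that the example is self-contained. The names @keywords, $current and %record are illustrative assumptions, not anything from the original poster's script.

```perl
use strict;
use warnings;

# Stand-in for the text file; a real script would open the file itself.
my $text = "Date: 2006-04-03\n"
         . "Message: Wonderful! Let's\n"
         . "meet there! I'll call you\n"
         . "later\n"
         . "Sent by: John\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @keywords = ('Date', 'Message', 'Sent by');
my $key_re   = join '|', map { quotemeta } @keywords;

my (%record, $current);
while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ /^($key_re):\s*(.*)/) {
        # A keyword at the start of the line opens a new field.
        $current = $1;
        $record{$current} = $2;
    }
    elsif (defined $current) {
        # Any other line continues the field that is currently open.
        $record{$current} .= "\n$line";
    }
}
close $fh;

print "$_ => [$record{$_}]\n" for sort keys %record;
```

Only one line is held at a time, at the cost of tracking which field is currently open.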
| |
| Sandman 2006-09-26, 8:03 am |
| In article <1159273292.373250.51450@b28g2000cwb.googlegroups.com>,
"Paul Lalli" <mritty@gmail.com> wrote:
>
> That completely fails to answer the question. How are you storing this
> data *within your program*.
If you don't want to help, that's fine. No need to be aggressive. The
way it's stored within the program isn't important. If you assume it's
stored as the content of a variable, work with that. If you don't want
to make any assumptions, don't hit the reply button.
I've been in this group for way too long to be bothered with people
who would rather nitpick syntax than actually try to help. For
instance, a good response from you would have been something along the
lines of:
Well, if you have the above in, for example, $data, then I would
probably do something like <code>
And my reply then might have been
Thanks, it's not a variable, but read from STDIN, but I can adapt
your solution to my indata, thanks for helping me out!
Thanks for listening.
--
Sandman[.net]
| |
| Sandman 2006-09-26, 8:03 am |
| In article <3h9Sg.6492$UG4.1495@newsread2.news.pas.earthlink.net>,
"Mumia W. (reading news)" <paduille.4058.mumia.w@earthlink.net>
wrote:
> I would use the substitution operator s/// to repeatedly suck off
> keyword and value segments and place them in a hash. The /e option to
> s/// allows you to execute complicated expressions, and that's what I would
> use here.
>
> Try it yourself.
Yeah, that's pretty much how I've been doing it. I just thought that
there was a more modular approach. I'll try some more. Thanks :)
--
Sandman[.net]
| |
| anno4000@radom.zrz.tu-berlin.de 2006-09-26, 8:03 am |
| Sandman <mr@sandman.net> wrote in comp.lang.perl.misc:
> In article <3h9Sg.6492$UG4.1495@newsread2.news.pas.earthlink.net>,
> "Mumia W. (reading news)" <paduille.4058.mumia.w@earthlink.net>
> wrote:
>
>
> Yeah, that's pretty much how I've been doing it. I just thought that
> there was a more modular approach. I'll try some more. Thanks :)
You've said that twice now. "Modular" means consisting of independent
components. How does that apply here?
Anno
| |
| Paul Lalli 2006-09-26, 6:59 pm |
| Sandman wrote:
> In article <1159273292.373250.51450@b28g2000cwb.googlegroups.com>,
> "Paul Lalli" <mritty@gmail.com> wrote:
>
>
> If you don't want to help, that's fine. No need to be aggressive.
While I was not being aggressive, I rather disagree that there was no
need to be. For some reason, you seem completely unwilling to help
anyone to help you without multiple prodding.
> The
> way it's stored within the program isn't important
Of course it is. If it's stored in a scalar variable, there are
certain operations you can do on it. If it's stored as a list of
lines, there are other operations you can do on it. How is that not
relevant?
>. If you assume it's
> stored as the content of a variable, work with that. If you don't want
> to make any assumptions, don't hit the reply button.
My point is that there is NO REASON to make any assumptions, neither on
my part nor on yours. You clearly are reading the file at some point
in your current script, so why not just tell us *how* you're doing so?!
> I've been in this group for way too long to be bothered with people
> who would rather nitpick syntax than actually try to help.
I was trying to help. I was trying to help you see how to ask a
question that would be likely to produce a response that would solve
your problem. How is that not helpful?
> For
> instance, a good response from you would have been something along the
> lines of:
>
> Well, if you have the above in, for example, $data, then I would
> probably do something like <code>
No, that would be a REALLY REALLY bad response, because it would
encourage you to continue to post badly formed questions with no
attempt to solve the problem on your own, and would only increase the
number of people who refuse to help you. That would NOT help you in
the long run at all.
Paul Lalli
| |
| Paul Lalli 2006-09-26, 6:59 pm |
| Peter J. Holzer wrote:
> On 2006-09-26 12:21, Paul Lalli <mritty@gmail.com> wrote:
>
> There is no reason why that data should be stored within the program at
> all. The file can be read line by line and the array/hash/whatever
> datastructure can be constructed on the fly. Slurping the whole file
> into memory may make constructing the desired data structure easier
> (hard to tell from the vague descriptions Sandman gave us), but it is
> certainly not required.
I was working on the assumption that the text file really is just 5
lines as the OP showed. In that case, the "penalty" for slurping the
entire file is less than negligible, and the benefits of not having to
parse each line looking for the end of the record, storing the previous
line, joining multiple lines to complete the record, etc, are far more
than worth it.
Paul Lalli
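
For reference, a small sketch of the two storage forms being contrasted in this thread: a list of lines versus one slurped scalar. The in-memory filehandle and the variable names are assumptions for illustration; a real script would open the actual file.

```perl
use strict;
use warnings;

# Stand-in for the five-line text file.
my $text = "Date: 2006-04-03\n"
         . "Message: Wonderful! Let's\n"
         . "meet there! I'll call you\n"
         . "later\n"
         . "Sent by: John\n";

# Form 1: a list of lines, one element per line (newlines kept).
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @lines = <$fh>;
close $fh;

# Form 2: the whole file in one scalar. Localizing $/ to undef makes
# the readline operator return everything up to end-of-file at once.
open $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $body = do { local $/; <$fh> };
close $fh;

printf "%d lines, %d characters\n", scalar @lines, length $body;
```

A multi-line regular expression can be applied directly to $body, while @lines suits a loop that handles one line per iteration.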
| |
| Mumia W. (reading news) 2006-09-26, 6:59 pm |
| On 09/26/2006 07:44 AM, Mumia W. (reading news) wrote:
> On 09/26/2006 05:29 AM, Sandman wrote:
>
> I would use the substitution operator s/// to repeatedly suck off
> keyword and value segments and place them in a hash. The /e option to
> s/// allows you to execute complicated expressions, and that's what I would
> use here.
>
> Try it yourself.
>
>
s/// is not needed for this. The match operator m// will do just fine
(with lookahead). Read "perldoc perlre" and the section on "Extended
Patterns."
--
paduille.4058.mumia.w@earthlink.net
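
A sketch of what a match with lookahead could look like, assuming the record is in a scalar called $body (an assumed name) and that no continuation line contains a colon, which holds for the sample data. In list context with /g, the match returns every captured key and value as a flat list, which can be assigned straight to a hash.

```perl
use strict;
use warnings;

# Assumed input: the whole record in one scalar.
my $body = "Date: 2006-04-03\n"
         . "Message: Wonderful! Let's\n"
         . "meet there! I'll call you\n"
         . "later\n"
         . "Sent by: John\n";

# /m : '^' matches at the start of every line
# /s : '.' may cross newlines, so a value can span several lines
# The lookahead (?=...) ends a value just before the next "key:" line,
# or at the end of the string, without consuming that next key.
my %record = $body =~ /^([^:\n]+):[ \t]*(.*?)\s*(?=^[^:\n]+:|\z)/msg;

print "$_ => [$record{$_}]\n" for sort keys %record;
```

Because the lookahead consumes nothing, the next key is still available for the following /g iteration.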
| |
| Sandman 2006-09-26, 6:59 pm |
| In article <4nsqafFbtkmtU1@news.dfncis.de>,
anno4000@radom.zrz.tu-berlin.de wrote:
>
> You've said that twice now. "Modular" means consisting of independent
> components. How does that apply here?
In programming, "modular" really hasn't got a strict definition. I
used it to mean that I could add and subtract dependencies in the
script at will, without having to change the code.
Plus, I'm from Sweden.
--
Sandman[.net]
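
In that sense of "modular", one hedged sketch keeps the keywords in a list and builds the pattern from it, so that adding or removing a field only means editing the list. The names @keywords and $body are assumptions. split is used because a capturing group in its pattern makes it return the separators (here, the keys) along with the fields.

```perl
use strict;
use warnings;

# Assumed input: the whole record in one scalar.
my $body = "Date: 2006-04-03\n"
         . "Message: Wonderful! Let's\n"
         . "meet there! I'll call you\n"
         . "later\n"
         . "Sent by: John\n";

# The only part that changes when fields are added or removed.
my @keywords = ('Date', 'Message', 'Sent by');
my $key_re   = join '|', map { quotemeta } @keywords;

# The captured keyword is returned between the fields. The first element
# is the empty text before the first key, so it is discarded with undef.
my (undef, %record) = split /^($key_re):[ \t]*/m, $body;

# Strip the trailing newline left on each value.
s/\s+\z// for values %record;

print "$_ => [$record{$_}]\n" for sort keys %record;
```

This relies on the statement in the first post that the message text never starts a line with one of the keywords followed by a colon.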
| |
| Sandman 2006-09-26, 6:59 pm |
| In article <1159280569.894960.270420@m73g2000cwd.googlegroups.com>,
"Paul Lalli" <mritty@gmail.com> wrote:
>
> While I was not being aggressive, I rather disagree that there was no
> need to be. For some reason, you seem completely unwilling to help
> anyone to help you without multiple prodding.
Ok, then leave it at that. No problem for me. Thanks anyway.
--
Sandman[.net]
| |
| anno4000@radom.zrz.tu-berlin.de 2006-09-26, 6:59 pm |
| Sandman <mr@sandman.net> wrote in comp.lang.perl.misc:
> In article <4nsqafFbtkmtU1@news.dfncis.de>,
> anno4000@radom.zrz.tu-berlin.de wrote:
>
>
> In programming, "modular" really hasn't got a strict definition.
If that is what you think then don't use the term. It can only
add to the confusion.
> I
> used it to mean that I could add and subtract dependencies in the
> script at will, without having to change the code.
So you expect us to divine what meaning you have assigned to the
term for the moment? Great attempt at communication!
> Plus, I'm from Sweden.
Then don't teach us about English. The term modular has a quite
well-defined meaning, especially in programming.
Anno
| |
| Tad McClellan 2006-09-26, 6:59 pm |
| Sandman <mr@sandman.net> wrote:
> In article <1159273292.373250.51450@b28g2000cwb.googlegroups.com>,
> "Paul Lalli" <mritty@gmail.com> wrote:
>
>
> If you don't want to help, that's fine.
If you don't want to be helped, that's fine too.
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
| |
| Sandman 2006-09-26, 6:59 pm |
| In article <slrnehiv7r.91g.tadmc@magna.augustmail.com>,
Tad McClellan <tadmc@augustmail.com> wrote:
> Sandman <mr@sandman.net> wrote:
>
>
> If you don't want to be helped, that's fine too.
Indeed.
--
Sandman[.net]
| |
| Sandman 2006-09-26, 6:59 pm |
| In article <4nt2l4Fc3du5U1@news.dfncis.de>,
anno4000@radom.zrz.tu-berlin.de wrote:
>
> If that is what you think then don't use the term. It can only
> add to the confusion.
>
>
> So you expect us to divine what meaning you have assigned to the
> term for the moment? Great attempt at communication!
I keep getting reminded of what a bunch of idiots this group harbors
when I spend too much time in groups where people help each other out.
*plonk*
--
Sandman[.net]
| |
| Tad McClellan 2006-09-26, 6:59 pm |
| Sandman <mr@sandman.net> wrote:
> I keep getting reminded
An easy way to avoid that would be to stop coming back.
> of what a bunch of idiots this group harbors
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
| |
| Sandman 2006-09-27, 7:59 am |
| In article <slrnehjb5i.9b3.tadmc@magna.augustmail.com>,
Tad McClellan <tadmc@augustmail.com> wrote:
>
> An easy way to avoid that would be to stop coming back.
Indeed. But I could also get lucky and come upon someone that's
helpful. Not that big of a chance in this group, I know. But it has
happened before.
--
Sandman[.net]
| |
| Mumia W. (reading news) 2006-09-27, 7:59 am |
| On 09/26/2006 08:57 AM, Sandman wrote:
> In article <3h9Sg.6492$UG4.1495@newsread2.news.pas.earthlink.net>,
> "Mumia W. (reading news)" <paduille.4058.mumia.w@earthlink.net>
> wrote:
>
>
> Yeah, that's pretty much how I've been doing it. I just thought that
> there was a more modular approach. I'll try some more. Thanks :)
>
>
>
Can you show me how you did it?
--
paduille.4058.mumia.w@earthlink.net