Code Comments
Programming Forum and web based access to our favorite programming groups.

I am having a regex nightmare and can't see the wood for the trees.
I want to extract data from an HTML file. I have been using the file() command, which gets the HTML alright; I am just falling down with the regular expression.

eg: <span class="something">Some Text</span><span class="something">Some More text</span>

I want to extract the information and write it to an array. The above should produce:
$data[0]="Some Text" , $data[1]="Some More Text"

I am such a noddy when it comes to regex. Can anyone help with a code snippet?

Thanks

Regards
Richard Grove
http://www.shopmaker.co.uk - Ecommerce Shop Systems
Richard Grove - Red Eye Media wrote:
> I am having a regex nightmare and can't see the wood for the trees.
> I want to extract data from an HTML file. I have been using the file()
> command which gets the html alright, I am just falling down with the regular
> expression.
>
> eg: <span class="something">Some Text</span><span class="something">Some
> More text</span>
> I want to extract the information and write it to an array. The above should
> produce:
> $data[0]="Some Text" , $data[1]="Some More Text"
>
> I am such a noddy when it comes to regex, can anyone help with a code
> snippet
> Thanks
>
> Regards
> Richard Grove
> http://www.shopmaker.co.uk - Ecommerce Shop Systems

I did something like this before. Here is some of it, modified...

<?php
$data = array();
$quote = '<span class="something">Some Text</span><span class="something">Some More text</span>';

// get <span ...>...</span> within $quote
preg_match_all('(<span.*?>*</span>)', $quote, $all_span, PREG_PATTERN_ORDER);
foreach ($all_span[0] as $span_match)
{
    echo $span_match."\n";
    // $span_match = <span ...>...</span>

    // get data between the span tags
    preg_match_all('(>.*<)', $span_match, $all_data, PREG_PATTERN_ORDER);
    foreach ($all_data[0] as $data_match)
    {
        // strip the leading ">" and the trailing "<"
        echo ' '.substr($data_match, 1, strlen($data_match) - 2)."\n";
        array_push($data, substr($data_match, 1, strlen($data_match) - 2));
    }
}

print_r($data);
?>

Now $data should have all the stuff you need.
-JI
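The two-pass approach in the post above can also be collapsed into a single preg_match_all() call by putting a capture group around the text between the tags. The following is only a minimal sketch under a few assumptions: the variable names ($html, $matches) are made up for illustration, the `~` delimiter and the `s` modifier are my choice rather than anything from the thread, and the spans are assumed not to be nested.

```php
<?php
// Sample input taken from the thread.
$html = '<span class="something">Some Text</span><span class="something">Some More text</span>';

// [^>]* keeps the attribute match inside the opening tag.
// (.*?) is capture group 1; it is lazy, so it stops at the first </span>.
// The s modifier lets . match newlines in case a span wraps across lines.
preg_match_all('~<span[^>]*>(.*?)</span>~s', $html, $matches);

// With the default PREG_PATTERN_ORDER, $matches[1] lists group 1 of every match.
$data = $matches[1];
print_r($data);
```

Because group 1 already holds just the inner text, the substr()/strlen() trimming step is not needed in this variant.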
"Jamie Isaacs" <jamie@shsu.edu> wrote in message
news:cjeeai$dgm@library1.airnews.net...
> Richard Grove - Red Eye Media wrote:
>
> I did something like this b4. Here is some of it modified...
>
> <?php
> $data = array();
> $quote = '<span class="something">Some Text</span><span class="something">Some More text</span>';
>
> // get <span ...>...</span> within $quote
> preg_match_all('(<span.*?>*</span>)', $quote, $all_span, PREG_PATTERN_ORDER);
> foreach ($all_span[0] as $span_match)
> {
>     echo $span_match."\n";
>     // $span_match = <span ...>...</span>
>
>     // get data between the span tags
>     preg_match_all('(>.*<)', $span_match, $all_data, PREG_PATTERN_ORDER);
>     foreach ($all_data[0] as $data_match)
>     {
>         echo ' '.substr($data_match, 1, strlen($data_match) - 2)."\n";
>         array_push($data, substr($data_match, 1, strlen($data_match) - 2));
>     }
> }
>
> print_r($data);
> ?>
>
> now $data should have all the stuff you need.
> -JI
Many thanks, we are on the right road now.
I changed it to this but it doesn't work.
preg_match_all('(<span class="bodybold">*</span>)', $lines[$a], $all_span, PREG_PATTERN_ORDER);
I would like to get data from between <span class="bodybold">data</span>
Any ideas?
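A likely reason the modified pattern finds nothing: in a regex, `>*` means "zero or more `>` characters", not "any text", so the pattern only matches when `</span>` comes straight after the opening tag. The sketch below compares the two forms on a made-up sample line; the variable names and the `~` delimiter are assumptions for illustration, not part of the original post.

```php
<?php
// Sample line; in the post this would be one element of the array returned by file().
$line = '<span class="bodybold">data</span>';

// ">*" quantifies the ">" itself, so nothing is allowed between the
// opening tag and </span>. This finds no match on the sample line.
$brokenCount = preg_match_all('~<span class="bodybold">*</span>~', $line, $unused);

// "(.*?)" lazily captures everything up to the first </span>.
$fixedCount = preg_match_all('~<span class="bodybold">(.*?)</span>~', $line, $matches);

echo "with >*: $brokenCount match(es)\n";
echo "with >(.*?): $fixedCount match(es)\n";
print_r($matches[1]);
```

Since file() returns one array element per line, a span that starts on one line and ends on the next would still be missed when matching line by line. Reading the whole file with file_get_contents() and adding the `s` modifier is one way around that.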
To match just the ones with bodybold in the span tag, try this:

<?php
$data = array();
$quote = '<span class="bodybold">Some Text</span><span class="something">Some More text</span>';

// get <span ...bodybold...>...</span> within $quote
preg_match_all('(<span.*?(bodybold).*?>.*?</span>)', $quote, $all_span, PREG_PATTERN_ORDER);
foreach ($all_span[0] as $span_match)
{
    echo $span_match."\n";
    // $span_match = <span ...>...</span>

    // get data between the span tags
    preg_match_all('(>.*<)', $span_match, $all_data, PREG_PATTERN_ORDER);
    foreach ($all_data[0] as $data_match)
    {
        // strip the leading ">" and the trailing "<"
        echo ' '.substr($data_match, 1, strlen($data_match) - 2)."\n";
        array_push($data, substr($data_match, 1, strlen($data_match) - 2));
    }
}

print_r($data);
?>
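One thing to watch with the pattern above: `.*?` can cross a `>`, so if a non-bodybold span comes before the bodybold one, the match may start at the earlier span and swallow both. A stricter variant, sketched here under assumptions (the sample input is reordered on purpose, and $class, $pattern and the `~` delimiter are illustrative choices), restricts the class test to the opening tag and captures the inner text in one pass:

```php
<?php
// Sample input: the bodybold span comes second here, to show that the
// match does not start at the earlier, non-matching span.
$html = '<span class="something">Some Text</span><span class="bodybold">Some More text</span>';

// [^>]* cannot cross a ">", so the class attribute must sit inside the
// same opening tag. preg_quote() escapes any regex metacharacters in the
// class name, including the "~" delimiter.
$class = 'bodybold';
$pattern = '~<span[^>]*\bclass="[^"]*\b' . preg_quote($class, '~') . '\b[^"]*"[^>]*>(.*?)</span>~s';

preg_match_all($pattern, $html, $matches);

// Group 1 holds the text between the tags for each matching span.
$data = $matches[1];
print_r($data);
```

This still assumes double-quoted attributes and spans that are not nested; for less regular HTML a DOM parser is usually the more robust choice.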
"Jamie Isaacs" <jamie@shsu.edu> wrote in message
news:cjer6c$nbc@library1.airnews.net...
> to match just the ones with bodybold in the span tag try this:
>
> <?php
> $data = array();
> $quote = '<span class="bodybold">Some Text</span><span class="something">Some More text</span>';
>
> // get <span ...>...</span> within $quote
> preg_match_all('(<span.*?(bodybold).*?>.*?</span>)', $quote, $all_span, PREG_PATTERN_ORDER);
> foreach ($all_span[0] as $span_match)
> {
>     echo $span_match."\n";
>     // $span_match = <span ...>...</span>
>     // get data between the span tags
>     preg_match_all('(>.*<)', $span_match, $all_data, PREG_PATTERN_ORDER);
>     foreach ($all_data[0] as $data_match)
>     {
>         echo ' '.substr($data_match, 1, strlen($data_match) - 2)."\n";
>         array_push($data, substr($data_match, 1, strlen($data_match) - 2));
>     }
> }
>
> print_r($data);
> ?>
Many thanks Jamie,
I'll give it a spin
Regards
Richard Grove
http://www.shopmaker.co.uk - Ecommerce Shop Systems