Author Regex headache
Richard Grove - Red Eye Media

2004-09-29, 10:45 am

I am having a regex nightmare and can't see the wood for the trees.
I want to extract data from an HTML file. I have been using the file()
function, which gets the HTML all right; I am just falling down with the
regular expression.

e.g.: <span class="something">Some Text</span><span class="something">Some More text</span>
I want to extract the information and write it to an array. The above should
produce:
$data[0]="Some Text", $data[1]="Some More text"

I am such a noddy when it comes to regex. Can anyone help with a code
snippet?
Thanks

Regards
Richard Grove
http://www.shopmaker.co.uk - Ecommerce Shop Systems



Jamie Isaacs

2004-09-29, 10:45 am


I did something like this before. Here is some of it, modified:

<?php
$data = array();
$quote = '<span class="something">Some Text</span><span class="something">Some More text</span>';

// get <span ...>...</span> within $quote
preg_match_all('(<span.*?>.*?</span>)', $quote, $all_span, PREG_PATTERN_ORDER);
foreach ($all_span[0] as $span_match)
{
    echo $span_match."\n";
    // $span_match = <span ...>...</span>

    // get data between the span tags
    preg_match_all('(>.*<)', $span_match, $all_data, PREG_PATTERN_ORDER);
    foreach ($all_data[0] as $data_match)
    {
        // drop the leading ">" and the trailing "<"
        echo ' '.substr($data_match, 1, strlen($data_match) - 2)."\n";
        array_push($data, substr($data_match, 1, strlen($data_match) - 2));
    }
}

print_r($data);
?>

Now $data should have all the stuff you need.
-JI
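
The two passes above can also be collapsed into a single call by adding a capture group, so that the text lands in $matches[1] directly and no substr() trimming is needed. This is only a minimal sketch, assuming the spans are not nested; the helper name span_texts is made up for illustration and is not from the thread.

```php
<?php
// Hypothetical helper (not from the thread): returns the text of every
// <span ...>...</span> in $html, assuming spans are not nested.
function span_texts($html)
{
    // [^>]* keeps the attribute match inside one opening tag;
    // (.*?) lazily captures the text up to the nearest </span>;
    // the s modifier lets . match newlines inside the text.
    preg_match_all('#<span[^>]*>(.*?)</span>#s', $html, $matches);
    return $matches[1]; // group 1 holds only the captured text
}

$quote = '<span class="something">Some Text</span><span class="something">Some More text</span>';
$data = span_texts($quote);
print_r($data);
```

Using # as the delimiter avoids having to escape the / in </span>.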
Richard Grove - Red Eye Media

2004-09-29, 10:45 am

Many thanks, we are on the right road now.
I changed it to this but it doesn't work.
preg_match_all('(<span class="bodybold">*</span> )',$lines[$a],$all_span,
PREG_PATTERN_ORDER);

I would like to get the data from between the tags in <span class="bodybold">data</span>.

Any ideas?



Jamie Isaacs

2004-09-29, 8:03 pm

To match just the ones with bodybold in the span tag, try this:

<?php
$data = array();
$quote = '<span class="bodybold">Some Text</span><span class="something">Some More text</span>';

// get <span ...bodybold...>...</span> within $quote;
// [^>]* keeps the bodybold test inside a single opening tag
preg_match_all('(<span[^>]*(bodybold)[^>]*>.*?</span>)', $quote, $all_span, PREG_PATTERN_ORDER);
foreach ($all_span[0] as $span_match)
{
    echo $span_match."\n";
    // $span_match = <span ...>...</span>
    // get data between the span tags
    preg_match_all('(>.*<)', $span_match, $all_data, PREG_PATTERN_ORDER);
    foreach ($all_data[0] as $data_match)
    {
        // drop the leading ">" and the trailing "<"
        echo ' '.substr($data_match, 1, strlen($data_match) - 2)."\n";
        array_push($data, substr($data_match, 1, strlen($data_match) - 2));
    }
}

print_r($data);
?>
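
On why a pattern like <span class="bodybold">*</span> finds nothing: in a regular expression, * is a quantifier that repeats the item before it (here the >), not a wildcard, so the text between the tags needs its own .*? to match. Here is a minimal sketch of a line-by-line version with a capture group. The $lines array stands in for what file() returns, and both the helper name bodybold_texts and the file name in the comment are assumptions for illustration.

```php
<?php
// Hypothetical helper: collects the text of every <span class="bodybold">
// from an array of lines, such as the array returned by file().
function bodybold_texts(array $lines)
{
    $data = array();
    foreach ($lines as $line) {
        // Literal opening tag, then (.*?) lazily captures the text.
        if (preg_match_all('#<span class="bodybold">(.*?)</span>#', $line, $m)) {
            $data = array_merge($data, $m[1]);
        }
    }
    return $data;
}

// Stand-in for $lines = file('page.html'); the file name is made up.
$lines = array(
    '<span class="bodybold">Some Text</span><span class="something">Skip me</span>',
    '<p><span class="bodybold">Some More text</span></p>',
);
$data = bodybold_texts($lines);
print_r($data);
```

Because file() splits the input on line breaks, this only finds spans that open and close on the same line; a span that wraps across lines would need the whole file in one string, for example from file_get_contents().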
Richard Grove - Red Eye Media

2004-09-30, 11:24 am

Many thanks, Jamie.
I'll give it a spin.

Regards
Richard Grove
http://www.shopmaker.co.uk - Ecommerce Shop Systems

