Regex question

Status
Not open for further replies.

Enigmabomb

New member
Feb 26, 2007
Than Franthithco
Hey Guys,

I'm getting better with regex, but this pattern is giving me a mother of a time. I want to capture the text between two arbitrary HTML tags, with no back references (so it'll parse less shit).

In this case, the tags are <Features> and </Features>.

preg_match("<Features>(.*?)<//Features>",$description,$urlarray);


Warning: preg_match() [function.preg-match]: Unknown modifier '(' in


I know I'm close, and it's just missing some stupid piece of something. Anyone see where I'm completely fucking this up?

thanks.

josh
 


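You are missing the /'s that delimit the pattern. Once / is the delimiter, the slash inside the closing tag has to be escaped as \/ (not doubled).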
Code:
<?php

$html = '<feature someatt="foo" otheratt="bar">blah <b>Blah</b> blah</feature>';

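$regex = '/<feature([^>]*)>(.*?)<\/feature>/';

preg_match($regex, $html, $matches);

// $matches[0] is the whole match (tags included), $matches[1] the attributes,
// and $matches[2] the text between the tags, which is what you want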
echo "<pre>"; print_r($matches); echo "</pre>";

?>
 
preg_match("/<Features>(.*?)<\/Features>/", $description, $urlarray);

To make it case-insensitive, add the i modifier: preg_match("/<Features>(.*?)<\/Features>/i", $description, $urlarray);
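
For what it's worth, the "Unknown modifier '('" warning makes sense once you know that PHP treats the first character of the pattern string as the delimiter. In the original call, < opens the pattern and the first > closes it, so everything from ( onward is read as modifiers. A small sketch of two ways to delimit the pattern properly follows; the sample text in $description is made up for illustration, and using # as the delimiter is just one option that avoids escaping the slash.

```php
<?php
// Sample input; the content is a placeholder, only the variable name
// comes from the question.
$description = 'Intro <Features>Fast, small</Features> outro';

// '/' as delimiter: the '/' inside the closing tag must be escaped.
preg_match('/<Features>(.*?)<\/Features>/i', $description, $slashMatch);

// '#' as delimiter: the closing tag needs no escaping.
preg_match('#<Features>(.*?)</Features>#i', $description, $hashMatch);

// Both patterns capture the same text in group 1.
echo $slashMatch[1], "\n";
echo $hashMatch[1], "\n";
```

Either form should behave the same; which delimiter to pick is mostly a matter of how many slashes the pattern contains.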
 
I'm guessing you might be trying to match several instances of '<FEATURES>Ass</FEATURES>' within the same document. If that is the case, you want to use preg_match_all, and you might also want to add the ungreedy flag ( 'U' ) to the regex. So for instance:

Code:
<?php

$html = '<FEATURES>Ass</FEATURES><FEATURES>Ass2</FEATURES>';

$regex = '/<FEATURES>(.*)<\/FEATURES>/Ui';

preg_match_all($regex, $html, $matches);

echo "<pre>"; print_r($matches); echo "</pre>";

?>
Without the ungreedy flag, the pattern is greedy and matches the broadest possible span:

greedy match, no U flag ( 1 match )
<FEATURES>Ass</FEATURES><FEATURES>Ass2</FEATURES> (the whole string)

ungreedy match, U flag ( 1 match )
<FEATURES>Ass</FEATURES>

ungreedy match with preg_match_all ( 2 matches )
<FEATURES>Ass</FEATURES> and <FEATURES>Ass2</FEATURES>
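
The cases listed above can be checked directly with a short script. This is only a sketch: the tag contents "One" and "Two" are placeholders, and the variable names are made up for the example.

```php
<?php
$html = '<FEATURES>One</FEATURES><FEATURES>Two</FEATURES>';

// Greedy (no U flag): .* runs on to the last closing tag.
preg_match('/<FEATURES>(.*)<\/FEATURES>/i', $html, $greedy);

// Ungreedy (U flag): .* stops at the first closing tag.
preg_match('/<FEATURES>(.*)<\/FEATURES>/Ui', $html, $ungreedy);

// preg_match_all with the U flag collects every tag pair.
$count = preg_match_all('/<FEATURES>(.*)<\/FEATURES>/Ui', $html, $all);

echo $greedy[1], "\n";    // One</FEATURES><FEATURES>Two
echo $ungreedy[1], "\n";  // One
echo $count, "\n";        // 2
print_r($all[1]);         // the two captured strings, One and Two
```

With the default flags, preg_match_all puts the full matches in $all[0] and the first capture group in $all[1].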
 
Didn't know about the U-thing. I just add a ? after the quantifier (my "how many of these do you want to find"). Like this:
Code:
$regex = '/<FEATURES>(.*?)<\/FEATURES>/i';
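
As far as I know the two approaches are interchangeable, with one catch: the U flag inverts the meaning of ?, so using both at once makes the quantifier greedy again. A quick sketch to illustrate, with placeholder tag contents and made-up variable names:

```php
<?php
$html = '<FEATURES>One</FEATURES><FEATURES>Two</FEATURES>';

// Lazy quantifier, no U flag.
preg_match('/<FEATURES>(.*?)<\/FEATURES>/i', $html, $lazy);

// Plain quantifier with the U flag.
preg_match('/<FEATURES>(.*)<\/FEATURES>/Ui', $html, $flagged);

// Both at once: U flips the meaning of ?, so .*? is greedy here.
preg_match('/<FEATURES>(.*?)<\/FEATURES>/Ui', $html, $both);

echo $lazy[1], "\n";     // One
echo $flagged[1], "\n";  // One
echo $both[1], "\n";     // One</FEATURES><FEATURES>Two
```

So it is probably best to pick one style per pattern and stick with it.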
 
The ungreedy flag is interesting, I also didn't know about that. The pattern matches the whole thing, tags INCLUDED. Now that I think about it, I can use str_replace to filter out the tags rather than making a regex to pluck between the tags.

thanks for the interesting feedback

josh
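
On the point about the tags being included: that is true of the full match in index 0 of the result array, but the parentheses in the pattern already capture just the text between the tags in index 1, so str_replace may not be needed. A minimal sketch, assuming the same pattern as earlier in the thread and a made-up sample string:

```php
<?php
// Placeholder input; only the variable names come from the thread.
$description = 'Intro <Features>Fast, small</Features> outro';

if (preg_match('/<Features>(.*?)<\/Features>/i', $description, $urlarray)) {
    echo $urlarray[0], "\n"; // full match: <Features>Fast, small</Features>
    echo $urlarray[1], "\n"; // capture group 1: Fast, small
}
```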
 