Hi, I've been trying to create only 1 regular expression to extract only the de

ID: 3590989 • Letter: H

Question

Hi, I've been trying to create a single regular expression that extracts only the description text from the following 3 description tags, but I have found the tags to be slightly different from one another. I'm hoping someone can provide one expression that works for all 3 below. (Please note that 1 of the description tags seems to have no description listed, so I would want my regular expression to still recognise this but just return no words.) For example, for the first one I would only want "The Los Angeles City Council agreed Wednesday to pay $1.9 million to the family of a man who was shot to death by police officers after he had stabbed himself in the abdomen during an apparent suicide attempt" extracted. Thanks

<description><![CDATA[ <img src="http://media.nbclosangeles.com/images/213*120/luis+molina+martinez.JPG" align="left" hspace="5" /> The Los Angeles City Council agreed Wednesday to pay $1.9 million to the family of a man who was shot to death by police officers after he had stabbed himself in the abdomen during an apparent suicide attempt. Photo Credit: Martinez Family]]></description>

<description><![CDATA[ <img src="http://media.nbclosangeles.com/images/262*120/171011-CandyFactoryEvictionVillageValley.JPG" align="left" hspace="5" /> The famed Candy Factory on Magnolia Boulevard in Valley Village has been forced to close up shop and move to a cheaper location due to rising retail prices.]]></description> </item>

<photo:thumbnail>http://media.nbclosangeles.com/images/231*120/10-10-2017-sky-fire-anaheim-hills-3.jpg</photo:thumbnail> </media:content> <description><![CDATA[ <img src="http://media.nbclosangeles.com/images/231*120/10-10-2017-sky-fire-anaheim-hills-3.jpg" align="left" hspace="5" /> Photo Credit: KNBC-TV]]></description> </item> <item> <dc:creator><![CDATA[]]></dc:creator> <title><![CDATA[Photos: Memorable Dodger Moments From 2017]]></title> <link><![CDATA[http://www.nbclosangeles

Explanation / Answer

Since you need to find multiple matches, you can use /(expression)/g. The g flag makes the pattern match multiple times, with each search resuming from the index where the last match ended. Now, to match a description: the sentences are placed between two tags, so we need to match a string between a closing > and an opening <.
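As a rough illustration of global matching, here is a minimal Python sketch (Python is my choice for the examples; the answer's /.../g notation is Perl-style). re.findall plays the role of the g flag by returning every non-overlapping match. The feed string and the pattern are made-up stand-ins, not the real feed.

```python
import re

# Hypothetical miniature feed with two description elements on one line.
feed = "<description>First item.</description> <description>Second item.</description>"

# [^<]* matches any run of characters that are not "<",
# so each match stays inside one element.
pattern = re.compile(r"<description>([^<]*)</description>")

# With one capture group, findall returns the captured text of every match,
# which is the behaviour the g flag gives when looping over matches.
matches = pattern.findall(feed)
print(matches)
```

With a single capture group, findall returns only the captured text, which corresponds to reading $1 on each iteration of a global match.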

So far we have this expression: />.*</g

Note that here .* matches any run of characters. Now, to match a sentence, we need the starting character to be a capital letter, i.e. one in [A-Z], and the sentence to end with ".", with any characters in between. Since . is a special character in a regular expression, we need to escape it with a backslash, writing \. instead.
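The effect of escaping the dot can be checked with a small Python sketch. The test string is an invented example; only the difference between . and \. is the point.

```python
import re

# Invented test string containing a literal "1.9" and a look-alike "1x9".
text = "1.9 and 1x9"

# Unescaped: "." matches any character, so both candidates match.
unescaped = re.findall(r"1.9", text)

# Escaped: "\." matches only a literal period.
escaped = re.findall(r"1\.9", text)

print(unescaped)  # ['1.9', '1x9']
print(escaped)    # ['1.9']
```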

So we get ([A-Z].*\.) for matching a sentence. By default .* is greedy, so it runs to the last "." it can find. We want to stop at the first "." instead, so we add a "?" after .* to make the search non-greedy (lazy), i.e. it picks up the shortest match. So we get:

([A-Z].*?\.)
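The difference between the greedy and the non-greedy form can be seen in a short Python sketch. The sample text is invented for illustration.

```python
import re

# Invented text with two sentences followed by trailing words.
text = "First sentence. Second sentence. Trailing words"

# Greedy: .* grabs as much as possible, then backs off to the LAST period.
greedy = re.search(r"[A-Z].*\.", text).group(0)

# Lazy: .*? grabs as little as possible, stopping at the FIRST period.
lazy = re.search(r"[A-Z].*?\.", text).group(0)

print(greedy)  # First sentence. Second sentence.
print(lazy)    # First sentence.
```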

Putting this into the earlier expression, and allowing for the whitespace that follows the > in the samples with \s*, we get:

/>\s*([A-Z].*?\.).*</g

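Here, printing the value of $1 (the first capture group) will give you your output, since we have grouped the sentence part in parentheses and left the > and < outside the group. Be aware that a period inside a sentence, such as the one in the amount "$1.9 million", ends the non-greedy match early; requiring whitespace or ]]> after the period, e.g. \.(?=\s|\]\]>), avoids this.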
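Because a pattern that stops at the first period is cut short by the period in "$1.9", and the third tag has no sentence at all, a different anchoring may be easier to get right. The following is only a sketch in Python, not the answer's method: it assumes, based on the three samples alone, that each description optionally starts with an <img ... /> tag and optionally ends with a "Photo Credit:" trailer. The names DESCRIPTION and extract_descriptions are hypothetical, introduced here for illustration.

```python
import re

# The three <description> elements from the question, copied verbatim
# (the third is cut at its closing tag; the rest of that line is truncated in the post).
samples = [
    '<description><![CDATA[ <img src="http://media.nbclosangeles.com/images/213*120/luis+molina+martinez.JPG" align="left" hspace="5" /> '
    'The Los Angeles City Council agreed Wednesday to pay $1.9 million to the family of a man who was shot to death by police officers '
    'after he had stabbed himself in the abdomen during an apparent suicide attempt. Photo Credit: Martinez Family]]></description>',
    '<description><![CDATA[ <img src="http://media.nbclosangeles.com/images/262*120/171011-CandyFactoryEvictionVillageValley.JPG" align="left" hspace="5" /> '
    'The famed Candy Factory on Magnolia Boulevard in Valley Village has been forced to close up shop and move to a cheaper location '
    'due to rising retail prices.]]></description>',
    '<description><![CDATA[ <img src="http://media.nbclosangeles.com/images/231*120/10-10-2017-sky-fire-anaheim-hills-3.jpg" align="left" hspace="5" /> '
    'Photo Credit: KNBC-TV]]></description>',
]

# Assumed layout (inferred from the samples only):
#   <description><![CDATA[ [<img .../>] [text] [Photo Credit: ...]]]></description>
DESCRIPTION = re.compile(
    r"<description><!\[CDATA\[\s*"  # opening tag and CDATA marker
    r"(?:<img[^>]*>)?\s*"           # optional leading <img ... /> tag
    r"(.*?)\s*"                     # group 1: description text, possibly empty
    r"(?:Photo Credit:.*?)?"        # optional trailing photo credit
    r"\]\]></description>",         # CDATA end and closing tag
    re.DOTALL,                      # let "." cross line breaks inside a description
)


def extract_descriptions(feed):
    """Return the text of every description in feed ('' when there is none)."""
    return DESCRIPTION.findall(feed)


descriptions = extract_descriptions(" ".join(samples))
for text in descriptions:
    print(repr(text))
```

On these samples the third result is an empty string, which matches the request to recognise the tag but return no words. The captured text keeps its final period; strip it afterwards if it is not wanted.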
