Using Regex in Yahoo Pipes to "clean" RSS feeds
I need some help creating a Yahoo Pipe that strips some elements from RSS feeds. To clarify: I will use the regex code in Yahoo Pipes. I assume that regex syntax is universal?
I have broken the question into some sub-questions:
- What would be the regex to strip a specific HTML tag (with its own class) while keeping its content?
- How can I remove the links from linked images but keep the image markup?
- How can I add sequential classes to all the links found in a feed item? If there are 5 links in a feed item, they would be given the classes link001, link002, link003, link004, link005 ...
Examples of the code can be found here:
My regex is not perfect ... so any help would be greatly appreciated! Thanks a lot!
Regular expression syntax is certainly not universal. Unfortunately, the Yahoo Pipes docs do not say which flavor they use. The examples look like Perl-style regexes, so that is what I'll use.
To delete a specific HTML tag (such as span) with a specific class attribute (such as someclass), search for the following:
(?si)<span[^<>]*class=["']?someclass["']?[^<>]*>(.*?)</span>
and replace with:
$1
The above regex will fail if the span tag you are attempting to remove has a nested span tag.
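As a rough illustration outside Yahoo Pipes, the same search-and-replace can be tried in Python. Python's re module accepts this pattern as written but spells the replacement \1 rather than $1. The sample markup and the strip_span name are made up for this sketch.

```python
import re

# Pattern from the answer: a span carrying class "someclass",
# with its content captured in group 1.
SPAN_RE = re.compile(
    r"""(?si)<span[^<>]*class=["']?someclass["']?[^<>]*>(.*?)</span>"""
)


def strip_span(html: str) -> str:
    """Replace each matching span with the content it wraps."""
    return SPAN_RE.sub(r"\1", html)


# Simple case: the tag goes away and its content stays.
print(strip_span('<p><span class="someclass">Content</span></p>'))

# Nested case: the lazy match stops at the first </span>, so the inner
# closing tag is removed instead of the outer one.
print(strip_span('<span class="someclass">a <span>b</span> c</span>'))
```

The second call shows the nested-span failure: the inner span ends up wrapping text that originally sat outside it.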
To delete an a tag that has an img tag as the first thing in its content, search for:
(?si)<a[^<>]*>(<img.*?)</a>
and replace with:
$1
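The link-stripping pattern can be checked the same way. This is a minimal Python sketch, again using \1 where the Perl-style replacement uses $1. The unlink_images name and the sample markup are hypothetical.

```python
import re

# Pattern from the answer: an <a ...> tag whose content starts with <img,
# capturing everything up to the closing </a>.
LINKED_IMG_RE = re.compile(r"(?si)<a[^<>]*>(<img.*?)</a>")


def unlink_images(html: str) -> str:
    """Drop the surrounding link from linked images, keeping the img markup."""
    return LINKED_IMG_RE.sub(r"\1", html)


sample = '<a href="y.html">text</a> <a href="x.html"><img src="x.png" alt=""></a>'
print(unlink_images(sample))
```

Plain text links are left alone because the pattern requires <img immediately after the opening tag.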
The third item in your question cannot be done with regular expressions alone. You would need a function to increment the number in the replacement, and I do not know whether Yahoo Pipes supports anything like that. Without the numbering you do not really need a regex at all: just search for <a and replace with <a class="link001"
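Where a programming language is available, the incrementing replacement can be done with a callback. The following is a minimal Python sketch, not something Yahoo Pipes is known to offer. The number_links name and the sample markup are assumptions, and it presumes the links do not already carry a class attribute.

```python
import itertools
import re

# Match the start of an anchor tag: "<a" followed by whitespace.
LINK_OPEN_RE = re.compile(r"(?i)<a(?=\s)")


def number_links(item_html: str) -> str:
    """Add class="link001", "link002", ... to each link in one feed item."""
    counter = itertools.count(1)  # restarts at 1 for every feed item

    def add_class(match: re.Match) -> str:
        # Keep the matched "<a" and append a zero-padded sequential class.
        return f'{match.group(0)} class="link{next(counter):03d}"'

    return LINK_OPEN_RE.sub(add_class, item_html)


print(number_links('<a href="1.html">one</a> <a href="2.html">two</a>'))
```

The counter lives inside number_links, so calling it once per feed item restarts the numbering for each item.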
Of course, the usual caveats about processing XML/HTML with regular expressions apply. These regexes work on the examples you gave, but they probably will not work for every possible piece of HTML.