FilterPlugin
Description
This plugin substitutes and extracts information from content using
regular expressions. It adds three groups of functions:
- FORMATLIST: manipulate a list of items; it is highly configurable as to what constitutes a list and how items are extracted from it
- SUBST, STARTSUBST/STOPSUBST: substitute a pattern in a chunk of text
- EXTRACT, STARTEXTRACT/STOPEXTRACT: extract a pattern from a text
While the START-STOP versions of SUBST and EXTRACT work on inline text,
the normal versions process a source topic before including it into the current one.
Syntax Rules
SUBST
Syntax:
%SUBST{topic="..." ...}%
Inserts a topic after processing its content.
- topic="...": name of the topic whose text is to be processed
- pattern="...": pattern to be extracted or substituted
- format="...": format expression or pattern substitute
- header="...": header string prepended to output
- footer="...": footer string appended to output
- limit="<n>": maximum number of occurrences to extract or substitute, counted from the start of the text (defaults to 100000, i.e. all hits)
- skip="<n>": skip the first n occurrences
- exclude="...": skip occurrences that match this regular expression
- sort="on,off,alpha,num": order of the formatted items (default "off")
- expand="on,off": toggle expansion of TWiki markup before filtering (defaults to on)
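The interplay of exclude, skip and limit can be modelled with a short Python sketch. This is not the plugin's Perl code: the function name select_hits is hypothetical, Python's re module stands in for Perl regular expressions, and the order in which the three parameters are applied is an assumption, since the list above does not specify it.

```python
import re

def select_hits(text, pattern, skip=0, limit=100000, exclude=None):
    """Return the regex matches that survive exclude, skip and limit."""
    hits = []
    kept = 0
    for match in re.finditer(pattern, text):
        if exclude and re.search(exclude, match.group(0)):
            continue  # assumption: excluded hits do not count towards skip or limit
        kept += 1
        if kept <= skip:
            continue  # drop the first `skip` occurrences
        if len(hits) >= limit:
            break     # stop once `limit` occurrences have been collected
        hits.append(match)
    return hits

text = "id=1 id=2 id=3 id=4 id=5"
hits = select_hits(text, r"id=(\d)", skip=1, limit=2, exclude=r"3")
numbers = [m.group(1) for m in hits]
print(numbers)
```

In this model the first hit is skipped, the hit containing "3" is excluded, and collection stops after two hits.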
STARTSUBST, STOPSUBST
Syntax:
%STARTSUBST{...}%
...
%STOPSUBST%
Substitutes text given inline. See SUBST.
EXTRACT
Syntax:
%EXTRACT{topic="..." ...}%
Extracts text from a topic. See SUBST.
STARTEXTRACT, STOPEXTRACT
Syntax:
%STARTEXTRACT{...}%
...
%STOPEXTRACT%
Extracts content given inline. See SUBST.
FORMATLIST
Syntax:
%FORMATLIST{"<list>" ...}%
Formats a list of items. The <list> argument is split into items using
a split expression; each item is matched against a pattern and then formatted
using a format string, and the formatted items are joined by a separator string. If the list
is not empty, the result is prepended with a header and appended with a footer.
- <list>: the list
- split="...": the split expression (default ",")
- pattern="...": pattern applied to each item (default "\s(.*)\s")
- format="...": the format string for each item (default "$1")
- header="...": header string
- footer="...": footer string
- separator="...": string to be inserted between list items
- limit="...": max number of items to be taken out of the list (default "-1")
- skip="...": number of list items to skip, not adding them to the result
- sort="on,off,alpha,num": order of the formatted items (default "off")
- reverse="on,off": reverse the order of the list
- unique="on,off": remove duplicates from the list
- exclude="...": remove list items that match this regular expression
The pattern string should group matching substrings of the list item; you can refer to these groups by
using $1, $2, ... in the format string. Any format string (format, header, footer) may
contain the variables $percnt, $nop, $dollar and $n. The variable $index refers to the position number
within the list being formatted; $count in the footer or header argument refers to the total number of list elements.
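The FORMATLIST steps described above can be approximated in Python as follows. This is a rough model, not the plugin's implementation: the names format_list and expand are hypothetical, the defaults for pattern and separator are chosen for the sketch only, and the order of the exclude, unique, sort, skip and limit steps is an assumption.

```python
import re

# "nop" is listed before "n" so that $nop is not read as $n followed by "op".
TOKEN = re.compile(r"\$(\d+|index|count|percnt|dollar|nop|n)")
ESCAPES = {"percnt": "%", "dollar": "$", "nop": "", "n": "\n"}

def expand(fmt, groups=(), index="", count=""):
    """Expand $1.., $index, $count and the escape variables in one pass."""
    def repl(m):
        name = m.group(1)
        if name.isdigit():
            i = int(name) - 1
            return (groups[i] or "") if 0 <= i < len(groups) else ""
        if name == "index":
            return str(index)
        if name == "count":
            return str(count)
        return ESCAPES[name]
    return TOKEN.sub(repl, fmt)

def format_list(text, split=",", pattern="(.*)", fmt="$1", header="", footer="",
                separator="", limit=-1, skip=0, sort="off", reverse=False,
                unique=False, exclude=None):
    items = [item for item in re.split(split, text) if item]
    if exclude:
        items = [item for item in items if not re.search(exclude, item)]
    if unique:
        items = list(dict.fromkeys(items))  # keeps the first occurrence of each item
    if sort in ("on", "alpha"):
        items.sort()
    elif sort == "num":
        items.sort(key=float)
    if reverse:
        items.reverse()
    items = items[skip:]
    if limit >= 0:
        items = items[:limit]
    if not items:
        return ""  # no header or footer for an empty list
    count = len(items)
    lines = []
    for index, item in enumerate(items, start=1):
        match = re.search(pattern, item)
        if match:
            lines.append(expand(fmt, match.groups(), index, count))
    return expand(header, count=count) + separator.join(lines) + expand(footer, count=count)

result = format_list("carol, alice , bob,alice", split=r"\s*,\s*", pattern=r"(\w+)",
                     fmt="$index: $1", header="$count names$n", separator="; ",
                     sort="alpha", unique=True)
print(result)
```

Expanding all variables in a single regular-expression pass keeps substituted item text from being expanded a second time, which a chain of plain string replacements would not guarantee.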
Examples
Secure Html
%STARTSUBST{pattern="<(a href=\"javascript:.*?)>(.*?)" format="<$1>$2</a>"}%
Pop me up
%STOPSUBST%
Format Comments
%EXTRACT{topic="FilterPlugin" expand="off" pattern=".div class=\"text\">.*?[\r\n]+(.*?)[\r\n]+(?:.*?[\r\n]+)+?-- (.*?) on (.*?)[\r\n]+" format="| $3 | $2 | $1 ... |$n"}%
Extract table data
| Pos | Description | Hours |
| 1 | onsite troubleshooting | 3 |
| 2 | normalizing data to new format | 10 |
| 3 | testing server performance | 5 |
%EXTRACT{topic="FilterPlugin"
expand="off"
pattern="\|\s*(.*?)\s*\|\s*(.*?)\s*\|\s*(.*?)\s*\|"
format=" * it took $3 hours $2$n"
skip="1"
}%
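To see what this pattern captures, the same regular expression can be run over the table rows in Python. This is only an illustration under stated assumptions: Python's re module stands in for the plugin's Perl regular expressions, and the table is pasted in as a literal string rather than read from the topic.

```python
import re

table = """\
| Pos | Description | Hours |
| 1 | onsite troubleshooting | 3 |
| 2 | normalizing data to new format | 10 |
| 3 | testing server performance | 5 |
"""

# Same pattern as in the EXTRACT call: three cells delimited by vertical bars.
pattern = r"\|\s*(.*?)\s*\|\s*(.*?)\s*\|\s*(.*?)\s*\|"

# skip="1" drops the first hit, which is the header row.
matches = list(re.finditer(pattern, table))[1:]

# format=" * it took $3 hours $2$n": $3 is the Hours cell, $2 the Description cell.
lines = [" * it took {} hours {}".format(m.group(3), m.group(2)) for m in matches]
print("\n".join(lines))
```

Each table row yields one hit with three groups, so the format string turns every data row into one bullet item.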
Plugin Installation Instructions
- Download the ZIP file
- Unzip it in your TWiki installation directory. Content:
%$MANIFEST%
- Visit configure in your TWiki installation, and enable the plugin in the {Plugins} section.
Plugin Info
| Plugin Author: | TWiki:Main.MichaelDaum |
| Copyright ©: | 2005-2007, Michael Daum http://wikiring.de |
| License: | GPL (GNU General Public License) |
| Plugin Version: | v1.30 |
| Change History: | |
| 14 Sep 2007: | added sorting for EXTRACT and SUBST |
| 02 May 2007: | using registerTagHandler() as far as possible; enhanced parameters to EXTRACT and SUBST |
| 05 Feb 2007: | fixed escapes in format strings; added better default value for max number of hits to prevent deep recursions on bad regular expressions |
| 22 Jan 2007: | fixed SUBST, added skip parameter to FORMATLIST |
| 18 Dec 2006: | using registerTagHandler for FORMATLIST |
| 13 Oct 2006: | fixed limit parameter in FORMATLIST |
| 31 Aug 2006: | added NO_PREFS_IN_TOPIC |
| 15 Aug 2006: | added use strict; and fixed revealed errors |
| 14 Feb 2006: | moved in FORMATLIST from the TWiki:Plugins/NatSkinPlugin; added escape variables to format strings |
| 06 Dec 2005: | fixed SUBST not to cut off the rest of the text |
| 09 Nov 2005: | fixed deep recursion using expand="on" |
| 22 Aug 2005: | Initial version; added expand toggle |
| TWiki Dependency: | $TWiki::Plugins::VERSION 1.024 |
| CPAN Dependencies: | none |
| Other Dependencies: | none |
| Perl Version: | 5.005 |
| TWiki:Plugins/Benchmark: | GoodStyle nn%, FormattedSearch nn%, FilterPlugin nn% |
| Plugin Home: | TWiki:Plugins/FilterPlugin |
| Feedback: | TWiki:Plugins/FilterPluginDev |
| Appraisal: | TWiki:Plugins/FilterPluginAppraisal |
-- TWiki:Main.MichaelDaum - 14 Sep 2007