You are viewing an historical archive of past issues. Please report new issues to the appropriate project issue tracker on GitHub.

Feature request #1352: Hook for preprocessing

Kind	feature request
Product	wikitext
When	Created 2009-07-17T14:42:20Z, updated 2009-08-13T13:44:34Z
Status	closed
Reporter	August Lilleaas
Tags	no tags

Description

It would be useful to have a preprocessing hook between the HTML tag sanitization and the rest of the wikitext processing.

Say that I want to add definition list syntax to wikitext, based on the Wikipedia syntax. The syntax relies on newlines to separate the elements. If I parse before handing the text to wikitext, wikitext will sanitize all the HTML I generate. If I parse after handing it to wikitext, wikitext will already have removed the newlines and other markers from the definition list syntax I was trying to parse.

# Example usage
parser = Wikitext::Parser.new
parser.preprocessor = proc {|txt| txt.gsub(/definition list regex/) { ... } }
parser.parse("Some wikitext here")

Comments

Greg Hurrell 2009-07-20T06:01:36Z

Just so I can get an idea of what you're trying to achieve here, can you show me what you want your markup to look like (before and after)?
Greg Hurrell 2009-08-06T11:53:04Z

In the absence of further input I'm going to mark this one as closed for a couple of reasons.

There really is no "between" moment after sanitization and before the rest of the processing. This is because there is only a single pass through the input text, and the HTML is emitted along the way. For example, as soon as the translator sees '' it immediately emits <em>. As soon as it sees [[ it knows it's going to emit a link and so gobbles up the needed tokens and emits it immediately. It's the same for other tokens, and an essential part of the design given that it's all about speed. Basically, with this architecture there can never really be hooks into the middle of the translation process; we can do pre-processing and post-processing, but nothing in the middle because it's an indivisible pass.
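
To illustrate the point, here is a toy single-pass translator in Ruby. It is only a sketch of the general technique, not the actual wikitext implementation (which is written in C); the method name `toy_translate` is made up for this example, and HTML escaping stands in for sanitization.

```ruby
require 'strscan'

# Toy illustration only: NOT the wikitext gem's real implementation.
# Escaping and markup translation happen in the same loop, so there is
# no intermediate "sanitized but untranslated" text to hook into.
def toy_translate(input)
  scanner = StringScanner.new(input)
  output = +''
  in_em = false

  until scanner.eos?
    if scanner.scan(/''/)
      # Emphasis marker: emit the tag immediately and flip the state.
      output << (in_em ? '</em>' : '<em>')
      in_em = !in_em
    elsif scanner.scan(/</)
      output << '&lt;'
    elsif scanner.scan(/>/)
      output << '&gt;'
    elsif scanner.scan(/&/)
      output << '&amp;'
    else
      # Plain text run, or a lone apostrophe that is not part of ''.
      output << scanner.scan(/[^'<>&]+|'/)
    end
  end

  # Close any emphasis span left open at the end of the input.
  output << '</em>' if in_em
  output
end

puts toy_translate("a ''b'' <i>c</i>")
```

By the time the loop has escaped one token it has already emitted output for everything before it, which is why only whole-input pre-processing or post-processing is possible with this kind of design.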

If you want to add support for definition lists, the right way to do it would be to support them as first-class citizens (i.e. with built-in support for syntax). As * and # are already used for unordered (unnumbered) and ordered (numbered) lists, a new symbol would be needed.
Greg Hurrell 2009-08-06T11:53:10Z
Status changed:
- From: new
- To: closed
August Lilleaas Created 2009-08-09T19:48:52Z, edited 2009-08-13T13:43:59Z
The markup is something like this:
```
# before
foo:: bar
baz:: maz

# after
<dl>
  <dt>foo</dt>
  <dd>bar</dd>

  <dt>baz</dt>
  <dd>maz</dd>
</dl>
```
Does "built-in support for syntax" mean having to write C code? I'm not fluent in C at all.
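For reference, the before/after transformation above could be sketched in plain Ruby roughly as follows. This is a standalone illustration and not part of wikitext; `definition_list_to_html` is a hypothetical helper, and it assumes every line of the input has the `term:: definition` form (other lines are simply dropped).

```ruby
require 'cgi'

# Hypothetical helper, not part of the wikitext gem.
# Converts lines of the form "term:: definition" into an HTML <dl>.
# Lines that do not match the pattern are ignored in this sketch.
def definition_list_to_html(text)
  items = text.each_line.map do |line|
    match = line.chomp.match(/\A(.+?)::\s*(.*)\z/)
    next unless match

    term, definition = match.captures
    "  <dt>#{CGI.escapeHTML(term.strip)}</dt>\n" \
      "  <dd>#{CGI.escapeHTML(definition.strip)}</dd>\n"
  end.compact

  return '' if items.empty?

  # A blank line separates each term/definition pair, as in the example.
  "<dl>\n#{items.join("\n")}</dl>\n"
end

puts definition_list_to_html("foo:: bar\nbaz:: maz\n")
```

As described in the original request, output like this would still be sanitized if it were handed to wikitext afterwards, so the sketch only shows the intended mapping, not a working integration.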
Greg Hurrell 2009-08-13T13:43:41Z
Yeah, that's what it means.

Can you show what the syntax would look like when the term contains a space? (i.e. "foo bar" instead of "foo")

I wonder if something like this might be better:
```
::term 1: a definition
::term 2: another definition
```
Greg Hurrell 2009-08-13T13:44:34Z
Or perhaps for symmetry:
```
::term 1:: foo
::term 2:: bar
```

Comments are now closed for this issue.
