PR: #3461
--------------------
Synopsis
s9e\TextFormatter is a text formatting library that supports BBCodes and most other features found in forums, implemented as plugins. I'm its author. This RFC is about replacing the current BBCode/smilies/autolinking/censor routines (hereinafter referred to as legacy) with s9e\TextFormatter. I originally wanted to focus on providing support from the s9e\TextFormatter side of things while leaving the phpBB-centric code to somebody more proficient in it, but I've recently come to the conclusion that I should work on the integration directly. This is why I'm currently implementing this RFC.
The goal is to close as many BBCodes-related bugs as possible and get to an equivalent level of functionality, backed with decent testing and within a reasonable time frame. As far as schedule goes, my goal is to be done before September 2013. I intentionally leave out other improvements (other than the more robust BBCode parser) for future RFCs because I don't want this RFC to die of inactivity or lack of consensus.
This RFC does not replace the legacy routines; it complements them. Old messages are still handled by the old code, but new messages are parsed and displayed by the new text formatter. Editing an old message will transparently convert it to the new format.
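The split between old and new messages can be pictured as a dispatch on the stored format. The sketch below is illustrative only. It assumes new-format messages are stored as XML rooted in <r> or <t>, and the names uses_new_format() and render_message(), as well as the two stand-in renderers, are hypothetical and are not phpBB or s9e\TextFormatter functions.

```php
<?php
// Hypothetical helper: guess whether a stored message uses the new format.
// Assumption: new messages are stored as XML rooted in <r> or <t>.
function uses_new_format(string $stored): bool
{
    return (bool) preg_match('/^<[rt][ >]/', $stored);
}

// Hypothetical dispatcher: route a stored message to the matching renderer.
function render_message(string $stored, callable $legacy, callable $modern): string
{
    return uses_new_format($stored) ? $modern($stored) : $legacy($stored);
}

// Stand-in renderers that only label which path was taken.
$legacy = function (string $s): string {
    return 'legacy:' . $s;
};
$modern = function (string $s): string {
    return 'new:' . $s;
};

// An old-style message goes to the legacy path, an XML one to the new path.
echo render_message('[b:1a2b3c]old[/b:1a2b3c]', $legacy, $modern), "\n";
echo render_message('<r><B><s>[b]</s>new<e>[/b]</e></B></r>', $legacy, $modern), "\n";
```

Under this assumption, editing an old message would simply mean re-parsing its source text so that the stored value takes the new form, after which the dispatcher picks the new path.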
Altered functions/methods
- acp_bbcodes::main()
- acp_icons::main()
- acp_words::main()
- decode_message()
- generate_text_for_display()
- generate_text_for_storage()
- strip_bbcode()
- parse_message::parse()
- The markup inside a quote's author is not parsed, e.g. [quote="[b]author[/b]"]
-- One exception: up to 1 url tag is supported
- Up to 1 blank line around block-level elements (quotes, lists, etc.) is ignored. In 3.1, a blank line after a quote is rendered as two <br>.
- Censored words are marked at posting time. Changing the word list does not affect old posts. The censor can still be turned off at rendering time. Censoring does not apply to BBCode attributes, which means you can't prevent links to a given site by censoring its domain name.
- No server-side syntax highlighting in code blocks.
- Default limits are set on the number of occurrences of individual BBCodes in text (1000 of each), the global number of BBCodes in text (10000) and the nesting level (10). The same goes for smilies: the default is 1000 smilies per post. Past these limits, the markup is ignored.
- The text between [list] and the first [*] is ignored.
- In the following text, the [quote] BBCode is ignored:
    [b]...[quote]...[/quote]...[/b]
  This can be changed to be automatically interpreted as either of the following:
    [b]...[/b][quote]...[/quote]...[/b]
    [b]...[/b][quote][b]...[/b][/quote][b]...[/b]
- There can't be BBCodes named E or CENSOR because those names are used for smilies and censored words respectively. This can be changed to something else, including namespaced names such as e:e or foo:bar. See the discussion.
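The default limits mentioned in the list above (1000 of each BBCode, 10000 BBCodes overall, nesting level 10) can be pictured as a simple counter. This is a rough sketch under stated assumptions, not how s9e\TextFormatter implements its limits: the class TagLimiter and its accept() method are hypothetical, and the nesting depth is supplied by the caller instead of being computed by a real parser.

```php
<?php
// Hypothetical sketch of the default limits described in the list above.
final class TagLimiter
{
    private $perTag = [];
    private $total = 0;
    private $tagLimit;
    private $globalLimit;
    private $nestingLimit;

    public function __construct(int $tagLimit = 1000, int $globalLimit = 10000, int $nestingLimit = 10)
    {
        $this->tagLimit = $tagLimit;
        $this->globalLimit = $globalLimit;
        $this->nestingLimit = $nestingLimit;
    }

    // Returns true if the tag is kept as markup, false if it is ignored
    // (that is, left in the text as-is).
    public function accept(string $name, int $depth): bool
    {
        $count = $this->perTag[$name] ?? 0;
        if ($count >= $this->tagLimit
            || $this->total >= $this->globalLimit
            || $depth > $this->nestingLimit) {
            return false;
        }
        $this->perTag[$name] = $count + 1;
        $this->total++;

        return true;
    }
}

// Usage with deliberately small limits so the effect is visible.
$limiter = new TagLimiter(2, 3, 1);
var_dump($limiter->accept('B', 1)); // first B is kept
var_dump($limiter->accept('B', 1)); // second B is kept
var_dump($limiter->accept('B', 1)); // third B exceeds the per-tag limit
var_dump($limiter->accept('I', 2)); // nested deeper than the nesting limit
```

Rejected tags do not count towards any limit in this sketch, so markup past a limit stays in the text untouched while later, unrelated tags can still be accepted.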
✓ PHPBB3-3981 - URL BBCode does not support IDN domains (only if idn_to_ascii() is available)
✓ PHPBB3-7187 - Quote smilies error
✓ PHPBB3-7275 - Custom bbodes trim('${1}')
✓ PHPBB3-8419 - custom tag eats up space character
✓ PHPBB3-8420 - emoticon removes space before itself when using preview (see decode_message_test.php)
✓ PHPBB3-9377 - Custom BB Code Nesting
✓ PHPBB3-10002 - (incomplete) BBCode usage of [quоte] and [lіst] forces closing [/lіst] and [/quоte]s, ultimately breaking HTML/Design
✓ PHPBB3-10122 - [list=] should support "none", along with CSS2 types
✓ PHPBB3-10425 - URLs contains non ANSII characters are rejected and not recognize as URLs
✓ PHPBB3-10587 - URL Bug in phpBB 3
✗ PHPBB3-10922 - Allow parameters for [email] BBCode content instead of addresses only (Edit: not currently fixed, must have misread the description)
✓ PHPBB3-10989 - Bug in BBCode
✓ PHPBB3-11153 - Custom BBCode token {EMAIL} subpattern are captured - token can never be used as parameter
✓ PHPBB3-12195 - Double-slash URLs not supported
✓ PHPBB3-13555 - Poll options preview rendered incorrectly on <br /> collision
TODO
✓ Legacy messages are preserved and displayed with the legacy routines
✓ Legacy messages can be edited (but then they're parsed with s9e\TextFormatter)
✓ New messages are parsed with s9e\TextFormatter
✓ New messages are properly displayed in viewtopic
✓ Editing BBCodes, smilies and censored words in the ACP regenerates the new parser
✓ Default BBCodes are supported: B, CODE, COLOR, EMAIL, FLASH, I, IMG, LIST, *, QUOTE, SIZE, U, URL
✓ Per-style BBCode templates
✓ Custom BBCodes are supported, provided they are safe to use (e.g. no XSS)
✓ [attachment] BBCode
✓ Posting limits
- ✓ max_*_font_size
- ✓ max_*_img_height / max_*_img_width applied to [img]
- ✓ max_*_img_height / max_*_img_width applied to [flash]
- ✓ max_*_smilies
- ✓ max_*_urls
✓ Tabs in code blocks are preserved, whereas the legacy routines replace them. This might be the occasion to change this behaviour: preserve tabs and use a JavaScript syntax highlighter
✓ Handles viewcensors, viewflash, viewimg and viewsmilies
✓ Takes care of toggling BBCodes, magic URLs, smilies, IMG BBCode, FLASH BBCode, QUOTE BBCode, and URLs in parse_message::parse()
✓ Error message when using an unauthorized BBCode in poll options
✓ strip_bbcode()
✓ Polls
✓ Look into how text is cleaned up for the search engine
✓ Copy the tests from tests/bbcode/parser_test.php and, instead of testing the parsed text, test the rendered text. Mostly done
✓ That path_in_domain() thing
Many other things I forgot to list and that you should remind me of