Hi all,
I'm the developer behind the PECL bbcode extension, a C implementation of a BBCode parser and converter. It can handle simple "search and replace" rules, and more complex cases by letting users provide callbacks.
The important point about this extension is that it doesn't use regular expressions, and so avoids that performance bottleneck. When phpBB3 was released, I provided a PHP implementation of the algorithm.
So my question is: could it be a useful base for phpBB4? Is it flexible enough? Do you have suggestions?
For info:
http://pecl.php.net/bbcode/
http://php.net/manual/en/book.bbcode.php
Thanks for your feedback
PECL BBCode Extension
Re: BBCode Support
It seems like it does everything we want. I'll be looking into it. Does it support multiple arguments like:
Code:
[tag="first arg value" argument="value" otherargument="othervalue"]Hello[/tag]
If not, would it be easy to add? Or would you have an alternative to this?
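To make the multi-argument question concrete, here is a minimal sketch of how the inside of such an opening tag could be split into a default argument and named arguments without regular expressions. It is written in Python only for illustration; `parse_tag_head` and its return shape are hypothetical and are not part of the PECL extension or of phpBB.

```python
def parse_tag_head(head):
    """Parse the text between '[' and ']' of an opening tag.

    Returns (tag_name, default_arg, named_args).
    Hypothetical helper for illustration only.
    """
    n = len(head)

    def read_name(i):
        # A name is a run of letters, digits, '_' or '-'.
        start = i
        while i < n and (head[i].isalnum() or head[i] in "_-"):
            i += 1
        return head[start:i], i

    def read_value(i):
        # Quoted value: runs up to the next matching quote character.
        if i < n and head[i] in "\"'":
            quote = head[i]
            end = head.find(quote, i + 1)
            if end == -1:
                raise ValueError("unterminated quoted value")
            return head[i + 1:end], end + 1
        # Bare value: runs up to the next whitespace.
        start = i
        while i < n and not head[i].isspace():
            i += 1
        return head[start:i], i

    tag, pos = read_name(0)
    if not tag:
        raise ValueError("missing tag name")

    # Optional default argument directly after the tag name: [tag="..."]
    default = None
    if pos < n and head[pos] == "=":
        default, pos = read_value(pos + 1)

    # Remaining name=value pairs, separated by whitespace.
    named = {}
    while True:
        while pos < n and head[pos].isspace():
            pos += 1
        if pos >= n:
            break
        key, pos = read_name(pos)
        if not key or pos >= n or head[pos] != "=":
            raise ValueError("expected name=value near offset %d" % pos)
        value, pos = read_value(pos + 1)
        named[key] = value
    return tag, default, named
```

Calling `parse_tag_head('tag="first arg value" argument="value" otherargument="othervalue"')` yields the tag name, the default argument and a dictionary of the two named arguments. The sketch assumes that quoted values contain no escaped quotes.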
Re: BBCode Support
You can find the current state of the BBCode parser we intend to use for the next version of phpBB in svn:
http://code.phpbb.com/projects/phpbb/re ... r_base.php
http://code.phpbb.com/projects/phpbb/re ... parser.php
Obviously it would be useful for us to provide a common interface for both the pure PHP implementation and the implementation using the PECL extension, so that the PECL extension can be used when available without any noticeable difference to the user. We support custom BBCodes that can be chosen by the administrator of a board. A list of popular BBCodes for the current parser, which makes extensive use of regular expressions, can be found at http://www.phpbb.com/community/viewtopi ... 6&t=579376. Some of them are not recommended because of the XSS risk they pose, which stems from the use of regular expressions. We do not yet have an interface for creating such custom BBCodes for the next version, but it will essentially be an interface for selecting the values you can find in the bbcode_parser.php array.
Maybe you can take a look at that and try to identify problems, too. I think it would be great if we could provide an interface compatible with the PECL extension in phpBB3.1.
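As a rough illustration of the "use the native extension when available, otherwise fall back" idea, here is a small Python sketch. Everything in it is an assumption made for the example: the module name `fast_bbcode`, the `PureParser` class and the `make_parser` factory are hypothetical. In PHP the availability check would more likely be done with `extension_loaded()`.

```python
import importlib


class PureParser:
    """Fallback implementation written in the host language (toy rule set)."""

    backend = "pure"

    def parse(self, text):
        # Only [b] is handled here; a real parser would share its rule
        # definitions with the native backend.
        return text.replace("[b]", "<strong>").replace("[/b]", "</strong>")


def make_parser(native_module="fast_bbcode"):
    """Return the native parser if its module can be imported, else the fallback.

    'fast_bbcode' is a made-up module name standing in for a compiled extension
    that would expose a Parser class with the same parse() method.
    """
    try:
        module = importlib.import_module(native_module)
    except ImportError:
        return PureParser()
    return module.Parser()
```

Callers only ever use `make_parser().parse(text)`, so which backend is in use makes no difference to them.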
I'm also moving this to a different forum, since it's more phpBB3 than phpBB4 related.
Re: PECL BBCode Extension
Hi, sorry, I did not see your responses; I was watching the previous topic.
The parser does not implement multiple arguments yet, but they can be added. The complex part is how to declare the rules. It is already a feature waiting to be implemented, but I don't have an easy answer as of now. In addition, you can provide another set of rules to parse arguments (for quote, for example).
Here is my PHP implementation. It uses the same algorithm and avoids regexp, so its speed scales better, with a single pass even with a lot of rules. To be more specific, there is one pass for tags and a search and replace for smileys; the smiley part could probably be optimized, and this split lets some tags exclude smileys from their content (code sections, for example).
http://svn.php.net/viewvc/pecl/bbcode/t ... iew=markup
Furthermore, because a full tree is built, we can ensure that syntax errors are corrected (fixing the closing order of tags, or even reopening tags).
The PHP code is quite old, so it might need some small corrections to be 100% functional.
The goal of the PHP implementation was to serve as a proof of concept before the real C part.
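To illustrate the kind of correction described here (fixing the closing order and reopening tags), the following Python sketch walks the text once and keeps a stack of open tags. It is a simplification and not the extension's algorithm: it does not build a full tree, the tag set `KNOWN_TAGS` is assumed, arguments are ignored, and stray closing tags are simply dropped.

```python
KNOWN_TAGS = {"b", "i", "u"}  # assumed tag set for this example


def repair(text):
    """Return text with well-nested tags, using one pass and a stack."""
    out = []
    stack = []  # currently open tags, innermost last
    i = 0
    n = len(text)
    while i < n:
        if text[i] == "[":
            end = text.find("]", i + 1)
            if end != -1:
                body = text[i + 1:end]
                closing = body.startswith("/")
                name = (body[1:] if closing else body).lower()
                if name in KNOWN_TAGS:
                    if not closing:
                        stack.append(name)
                        out.append("[%s]" % name)
                    elif name in stack:
                        # Close the inner tags first, then reopen them
                        # after the requested tag has been closed.
                        reopened = []
                        while stack[-1] != name:
                            inner = stack.pop()
                            out.append("[/%s]" % inner)
                            reopened.append(inner)
                        stack.pop()
                        out.append("[/%s]" % name)
                        for inner in reversed(reopened):
                            stack.append(inner)
                            out.append("[%s]" % inner)
                    # A closing tag with no matching opener is dropped.
                    i = end + 1
                    continue
        # Plain text, or a bracket that is not a known tag.
        out.append(text[i])
        i += 1
    # Close whatever is still open at the end of the input.
    while stack:
        out.append("[/%s]" % stack.pop())
    return "".join(out)
```

For example, `repair("[b]bold [i]both[/b] italic[/i]")` closes `[i]` before `[/b]` and reopens it afterwards, so the italic text keeps its formatting.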
Re: PECL BBCode Extension
One thing that even our new BBCode parser cannot deal with, but that we currently support, is limited BBCode in the quote tag parameter.
Code:
[quote="[url]http://www.example.com[/url] [i][u]Example[/u][/i]"]This text is the quote content.[/quote]
While the current solution for supporting this format is quite an ugly hack, I am not sure what the ideal replacement would be. Do you have any suggestions on how to design a better BBCode for this?
Re: PECL BBCode Extension
No, you can provide another parser ruleset for the argument parsing.
I support a sub-parser (global currently, but it could be implemented per tag), see: http://php.net/manual/en/function.bbcod ... parser.php
The API can be enhanced to allow a ruleset per tag, with a global sub-parser as the default and the current behaviour as the fallback (the parser being its own sub-parser).
Let me know what you think of this.
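The lookup order proposed above (per-tag ruleset, then a global sub-parser, then the parser itself) could be sketched like this in Python. The class and method names are made up for illustration and do not mirror the extension's actual API.

```python
class Parser:
    """Toy parser object that only models argument sub-parser selection."""

    def __init__(self, name):
        self.name = name
        self.default_arg_parser = None  # global sub-parser, if any
        self.arg_parsers = {}           # per-tag sub-parsers

    def set_arg_parser(self, parser, tag=None):
        # Without a tag, the parser becomes the global sub-parser;
        # with a tag, it is used for that tag's arguments only.
        if tag is None:
            self.default_arg_parser = parser
        else:
            self.arg_parsers[tag] = parser

    def arg_parser_for(self, tag):
        # 1. per-tag ruleset, 2. global sub-parser, 3. the parser itself.
        if tag in self.arg_parsers:
            return self.arg_parsers[tag]
        if self.default_arg_parser is not None:
            return self.default_arg_parser
        return self
```

With this arrangement a board could, for instance, give the quote tag a restricted ruleset for its argument while every other tag keeps the global default.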
Re: PECL BBCode Extension
Oh great, I missed that. Looks pretty good. We should probably add a similar feature to our plain PHP implementation.
Re: PECL BBCode Extension
I don't think it will be easy to implement with regexp, however.
Re: PECL BBCode Extension
If you look at the BBCode parser I linked to, it does not use regular expressions for parsing. It does make use of some regular expressions, but the parser itself is a proper stack-based one.
Re: PECL BBCode Extension
Yes, I've seen it,
but the complex part is how to escape quotes and detect the right ].
Although you could use a regexp only to find the opening of the tag, and only then parse the argument.
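One possible way to detect the right closing bracket is a small quote-aware scan, sketched here in Python. The escape rule (a backslash inside a quoted value protects the next character) is an assumption for the example, and `find_tag_end` is a hypothetical helper, not something taken from either parser.

```python
def find_tag_end(text, start):
    """Return the index of the ']' closing the tag that opens at text[start].

    Brackets inside single- or double-quoted values are skipped. Inside
    quotes, a backslash escapes the next character (assumed escape rule).
    Returns -1 if no closing bracket is found.
    """
    if text[start] != "[":
        raise ValueError("start must point at '['")
    quote = None  # the quote character we are currently inside, if any
    i = start + 1
    n = len(text)
    while i < n:
        ch = text[i]
        if quote is not None:
            if ch == "\\":
                i += 2  # skip the backslash and the escaped character
                continue
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "]":
            return i
        i += 1
    return -1
```

Applied to the quote example from earlier in the thread, the scan skips every `]` inside the quoted argument and stops at the bracket just before "This text is the quote content.". A limitation of this sketch is that an apostrophe in an unquoted value would be treated as the start of a quoted section.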