Try harder to fix block elements BBCodes used inside of inline elements BBCodes

These requests for comments/change have led to an implemented feature that has been successfully merged into the 3.2/Rhea branch. Everything listed in this forum will be available in phpBB 3.2.
JoshyPHP
Registered User
Posts: 381
Joined: Fri Jul 08, 2011 9:43 pm

Post by JoshyPHP »

https://tracker.phpbb.com/browse/PHPBB3-13921

Recently I ran some tests while reparsing about 500K posts from an existing forum. I looked at the error log from the s9e\TextFormatter parser using this low-tech extension that I don't recommend to anyone.

I found that a few posts used something like this: [size=50][quote]...[/quote][/size]. The size BBCode uses a <span> so it doesn't produce valid HTML. The default settings in s9e\TextFormatter prevent invalid HTML from being generated, and in this case they prevented the quote from being applied.

I don't think that's the best way to handle this invalid markup. I have updated the library's default settings to bring it closer to how HTML5 handles that kind of invalid markup. From now on, the following code:

Code:

[size=50][quote]...[/quote][/size]

will have the same result as this code:

Code:

[size=50][/size][quote][size=50]...[/size][/quote][size=50][/size]

For reference, some docs about the library's default rules: Same links on ReadTheDocs:
Last edited by JoshyPHP on Fri Jul 10, 2015 7:49 am, edited 1 time in total.

AmigoJack
Registered User
Posts: 110
Joined: Wed May 04, 2011 7:47 pm
Location: グリーン ヒル ゾーン
Contact:

Post by AmigoJack »

While I welcome finally respecting valid HTML (I tried to draw attention to this years ago), I wonder how to deal with this in general: consider all the custom BBCodes that you can't anticipate. How would those be recognized and dealt with?

JoshyPHP
Registered User
Posts: 381
Joined: Fri Jul 08, 2011 9:43 pm

Post by JoshyPHP »

If a BBCode is rendered as a formatting element (or a span element with a class or style attribute and nothing else) and it hosts a BBCode rendered as a block element, it gets closed before the second BBCode and reopened inside of it.

IOW, this:

Code:

[color=#AAA][center]hey there guys[/center][/color]

...gets interpreted as:

Code:

[color=#AAA][/color][center][color=#AAA]hey there guys[/color][/center]

If you're curious about the code, grep through the codebase for isFormattingElement, isFormattingSpan, BlockElementsFosterFormattingElements and fosterParent.
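
For readers who want to see the close-and-reopen rule in action, here is a minimal Python sketch. It is not the library's code (s9e\TextFormatter is written in PHP and decides based on how each BBCode is rendered); the FORMATTING and BLOCK name sets, the TAG regex and the foster() function are all hypothetical names made up for this illustration, and the sketch assumes well-nested input. It reopens the formatting BBCode eagerly after the block closes, so it reproduces the longer form from the first post, including the trailing empty pair.

```python
import re

# Hypothetical classification, for illustration only. The rule in the post is
# based on how each BBCode is rendered, not on a fixed list of names.
FORMATTING = {"b", "i", "u", "color", "size"}
BLOCK = {"quote", "center", "list"}

# Matches [name], [name=value] and [/name].
TAG = re.compile(r"\[(/?)(\w+)(?:=[^\]]*)?\]")


def foster(text):
    """Close open formatting BBCodes at every block boundary and reopen them
    on the other side. Assumes the input is well nested."""
    out = []
    open_fmt = []  # (name, full opening tag) of each formatting BBCode still open
    pos = 0
    for m in TAG.finditer(text):
        out.append(text[pos:m.start()])  # plain text before this tag
        pos = m.end()
        tag, closing, name = m.group(0), bool(m.group(1)), m.group(2).lower()
        if name in BLOCK:
            # Close formatting (innermost first), emit the block tag,
            # then reopen the same formatting in its original order.
            out.extend(f"[/{n}]" for n, _ in reversed(open_fmt))
            out.append(tag)
            out.extend(opening for _, opening in open_fmt)
            continue
        if name in FORMATTING:
            if not closing:
                open_fmt.append((name, tag))
            elif open_fmt and open_fmt[-1][0] == name:
                open_fmt.pop()
        out.append(tag)  # formatting and unknown tags pass through unchanged
    out.append(text[pos:])  # trailing plain text
    return "".join(out)


print(foster("[size=50][quote]...[/quote][/size]"))
print(foster("[color=#AAA][center]hey there guys[/center][/color]"))
```

With the color example, the sketch leaves an empty [color=#AAA][/color] pair after [/center]; dropping that empty pair gives the shorter form shown above.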

printf
Registered User
Posts: 1
Joined: Sun Sep 06, 2015 7:09 am

Post by printf »

I think changing BBCode to do what "you alone" want is wrong, especially when you could easily combine both elements into a single inline element style with an inline-block added to that style or, better yet, add the inline color style to the center block element, which would maintain what the user originally wanted. The phpBB BBCode parser is sooooo... way overcomplicated, and its logic is outdated. Look, I am not trying to be a jerk; all I am saying is that it's time to reinvent the wheel and build a new parser that parses BBCode in a "single pass", without regex(es), and with good restructuring logic so that elements can be combined when they follow a certain pattern. That would save a lot of database space and tons and tons of processing resources.


Me!

DavidIQ
Customisations Team Leader
Posts: 1903
Joined: Thu Mar 02, 2006 4:29 pm
Location: Earth
Contact:

Post by DavidIQ »

printf wrote: Sun Sep 06, 2015 7:53 am
I think changing BBCode to do what "you alone" want is wrong, especially when you could easily combine both elements into a single inline element style with an inline-block added to that style or, better yet, add the inline color style to the center block element, which would maintain what the user originally wanted. The phpBB BBCode parser is sooooo... way overcomplicated, and its logic is outdated. Look, I am not trying to be a jerk; all I am saying is that it's time to reinvent the wheel and build a new parser that parses BBCode in a "single pass", without regex(es), and with good restructuring logic so that elements can be combined when they follow a certain pattern. That would save a lot of database space and tons and tons of processing resources.

Me!

I'd agree somewhat, except the BBCode parser was completely rewritten, so your entire argument is based on outdated assumptions.