PHPBB3-13803 / PHPBB3-13891 - Reparsing text

Discuss requests for comments/changes posted in the Issue Tracker for the development of phpBB. Upcoming releases are 3.2/Rhea and 3.3.
Post Reply
User avatar
Elsensee
Development Team
Development Team
Posts: 36
Joined: Sun Mar 16, 2014 1:08 pm
Location: Hamburg, Germany
Contact:

Re: PHPBB3-13803 - Implement a generic and scalable way to reparse rich text

Post by Elsensee » Fri May 01, 2015 9:31 pm

I don't like the current state. The administrator should be able to change permissions for every BBCode including the ones they can create in the ACP.

User avatar
JoshyPHP
Registered User
Posts: 358
Joined: Fri Jul 08, 2011 9:43 pm

Re: PHPBB3-13803 - Implement a generic and scalable way to reparse rich text

Post by JoshyPHP » Mon May 04, 2015 1:03 am

JoshyPHP wrote:For future reference: the signatures of generate_text_for_storage() and message_parser::parse() don't support the same feature set. Extra parameters could be added to generate_text_for_storage() to support toggling $allow_img_bbcode, $allow_flash_bbcode, $allow_quote_bbcode and $allow_url_bbcode but those values aren't stored in the database. That means it will require tracking each call to message_parser::parse() to determine how those values were computed at the call site. It will probably involve checking the text's author's permissions.
Following up: instead of tracking down each call to message_parser::parse() to figure out the logic behind toggling certain BBCode, it's possible to test whether the BBCode is present. It it is, then it was enabled.

posting.php prevents those BBCodes from being submitted in the first place, so it may be unnecessary to even test for their presence. They could remain unconditionally enabled. However, that relies on the assumption that everything in the database was submitted through posting.php and that this feature worked consistently in past versions. Most of the code related to toggling Flash and other BBCodes dates back to 2006 and earlier.

User avatar
JoshyPHP
Registered User
Posts: 358
Joined: Fri Jul 08, 2011 9:43 pm

Re: PHPBB3-13803 - Implement a generic and scalable way to reparse rich text

Post by JoshyPHP » Fri May 29, 2015 2:38 pm

Posted a working example that reparses all posts immediately (not a cron) at https://gist.github.com/s9e/623359cc041a82281f54

An important note about `bbcode_uid` and other metadata: the text_reparser services do not remove this data from the database. One of the reason is that the values from one table are sometimes used for things that are stored in another table. For instance, some metadata is shared between topics, first posts and poll titles/options.

User avatar
JoshyPHP
Registered User
Posts: 358
Joined: Fri Jul 08, 2011 9:43 pm

Re: PHPBB3-13803 / PHPBB3-13891 - Reparsing text

Post by JoshyPHP » Thu Jul 02, 2015 10:09 am

The text_reparser services are in and I'm putting the finishing touches to the CLI. That's how it looks at the moment on my local install:
phpbb-cli-reparser.png
While working on CLI I found/fixed a couple of memory leaks in the library on PHP 7, one of which (circular references) may also apply to earlier PHP versions. I'll update the library version in another PR.

PHP 7 is about 4 times faster than PHP 5.6 as far as PHP execution goes, with a slightly lower memory usage.

Cin
Registered User
Posts: 20
Joined: Mon Mar 23, 2015 11:33 am

Re: PHPBB3-13803 / PHPBB3-13891 - Reparsing text

Post by Cin » Thu Jul 02, 2015 9:12 pm

Do we need to run the reparser manually or does it automatically reparse a post when viewing it? Because I'm running the current github build on my server on php7 and I am not getting any errors whatsoever, everything is very smooth!

User avatar
JoshyPHP
Registered User
Posts: 358
Joined: Fri Jul 08, 2011 9:43 pm

Re: PHPBB3-13803 / PHPBB3-13891 - Reparsing text

Post by JoshyPHP » Thu Jul 02, 2015 9:23 pm

Manually, but you don't have to for the time being and probably not for a long time. Old posts are displayed with the old routines until they're reparsed.

Cin
Registered User
Posts: 20
Joined: Mon Mar 23, 2015 11:33 am

Re: PHPBB3-13803 / PHPBB3-13891 - Reparsing text

Post by Cin » Sun Jul 05, 2015 7:53 am

Is there any benefit in doing it anyway?

Nicofuma
3.2 Release Manager
3.2 Release Manager
Posts: 298
Joined: Sun Apr 13, 2014 1:40 am
Location: Paris

Re: PHPBB3-13803 / PHPBB3-13891 - Reparsing text

Post by Nicofuma » Sun Jul 05, 2015 9:47 am

Yes because the old routine will be removed one day.
Member of the phpBB Development-Team
No Support via PM

Cin
Registered User
Posts: 20
Joined: Mon Mar 23, 2015 11:33 am

Re: PHPBB3-13803 / PHPBB3-13891 - Reparsing text

Post by Cin » Sun Jul 05, 2015 10:43 am

Obviously :-) I meant if there is any benefit in reparsing the posts before the old routine is removed (e.g. performance wise)

I'm testing it out now, but I get the following error when using /usr/local/php7/bin/php -dmemory_limit=32M phpbbcli.php reparser:reparse post_text --ansi

PHP Warning: str_repeat(): Second argument has to be greater than or equal to 0 in .../phpbb/path_helper.php on line 232

It's still reparsing but it goes very slow, it'll take more than 5 hours to fully convert the posts (about 450.000).

User avatar
JoshyPHP
Registered User
Posts: 358
Joined: Fri Jul 08, 2011 9:43 pm

Re: PHPBB3-13803 / PHPBB3-13891 - Reparsing text

Post by JoshyPHP » Sun Jul 05, 2015 11:55 am

There's a small benefit in that it normalizes whatever difference there is between 3.1 and 3.2. And I don't think that the old custom BBCodes currently work on PHP7. Apart from that, not really. The performance is about the same.

5 hours is kind of a lot, but at the same time that's about 25 posts per second. What kind of server is that? Is that a live server or something you use for testing? It would be interesting to run it a second time once it's done because then it will still reparse every post but it won't have to update the database. You'd see how much time was spent on the database during the first run. On my test board running on localhost, that's less than 10%. It takes 10-11 minutes for the first run and 9-10 minutes for the second.

Post Reply