PhpBB server load management

v12mike
Registered User
Posts: 14
Joined: Wed Dec 07, 2016 10:05 pm

PhpBB server load management

Post by v12mike »

The existing phpBB server load management system is rubbish, at least for the purpose of maintaining the best service for users while the server load is high.

My main forum recently had an episode of server overloading which, judging from log analysis, seems to have been caused by an overly aggressive search engine bot that simultaneously opened several sessions from different IP addresses and bombarded our site with page requests. Our site typically gets about 10 hits per second during a busy period, but I think this bot was sending somewhere over 100 per second, for several minutes at a time, repeated several times per day.

This resulted in our server hitting its CPU resource limit; eventually MySQL hit its queue limit and everything went a bit pear-shaped. The site and server also became more or less unmanageable during the episode.

I tried enabling the phpBB load management feature, but it really made things worse from our users' perspective. It is slow to kick in, so the initial response is not improved, and even worse, it is slow to recover, so users lose out at both ends, and they don't like the forum locked message.

I think that it is worth considering the addition of a feature to limit server load caused by individual phpBB users. The scheme I am thinking of is:
  • Limit per-user hits
    • For each logged-in user, limit the rate of pages served per (~10s) interval, across all sessions for that user.
    • For each bot, limit the rate of pages served per (~10s) interval, across all sessions for that bot (maybe a different limit).
    • For guest sessions similar but per IP address.
    • Hits exceeding the governed rate are responded to with code 429 Too Many Requests (https://tools.ietf.org/html/rfc6585). This response will take much less server resource than a normal page response.
  • New session creations are rate limited in a similar manner (to lower limits)
  • The rate limits could be made dynamic, reducing when the measured server load nears a critical level
    • We would probably get sufficient benefit from 2 steps (normal and high load)
    • Bots could be completely locked out under high load
  • Misbehaving users and periods of high load would be logged.
I realise that this would need to be implemented carefully so that the load management itself does not add significantly to the server load, but I think that it is feasible.
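
To make the scheme above more concrete, here is a minimal sketch of a fixed-window hit limiter. It is written in Python for brevity rather than phpBB's PHP, and everything in it is an assumption for illustration: the class name `HitLimiter`, the limit values, and the `high_load` flag are not existing phpBB settings.

```python
import time

# Hypothetical limits: pages allowed per 10 s window, per kind of client.
# The numbers are placeholders, not tuned recommendations.
LIMITS = {
    "normal": {"user": 30, "bot": 10, "guest": 15},
    "high":   {"user": 10, "bot": 0,  "guest": 5},   # bots locked out
}
WINDOW_SECONDS = 10


class HitLimiter:
    """Fixed-window hit counter keyed by user, bot or guest IP."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.counters = {}   # (kind, key) -> (window_index, hits)
        self.log = []        # refused hits, for later inspection

    def check(self, kind, key, high_load=False):
        """Return the HTTP status to send: 200 to serve, 429 to refuse."""
        limit = LIMITS["high" if high_load else "normal"][kind]
        window = int(self.clock() // WINDOW_SECONDS)
        last_window, hits = self.counters.get((kind, key), (window, 0))
        if last_window != window:
            hits = 0         # a new window has started
        hits += 1
        self.counters[(kind, key)] = (window, hits)
        if hits > limit:
            self.log.append((kind, key, hits))
            return 429
        return 200
```

A real implementation would also need to expire stale counters, which a cache with per-entry lifetimes would give for free.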

My thought is that although this could (probably) be achieved as an extension, it should be added to the phpBB core.

Elsensee
Former Team Member
Posts: 42
Joined: Sun Mar 16, 2014 1:08 pm
Location: Hamburg, Germany

Re: PhpBB server load management

Post by Elsensee »

I have to agree that the current load management has... room for improvement. ;)
Since it only provides configuration based on the load value, it's only useful on Unix machines which don't have the function that reports it disabled. Also, it's pretty inaccurate, so improvements should be made.
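
As an illustration of that platform dependency, here is a small Python sketch (not phpBB code). `os.getloadavg()` is only available on Unix-like systems and raises `OSError` when the value cannot be obtained. The helper names and the rule that a limit of 0 means "disabled" are assumptions.

```python
import os


def current_load():
    """Return the 1-minute load average, or None if it cannot be read."""
    getloadavg = getattr(os, "getloadavg", None)  # absent on Windows
    if getloadavg is None:
        return None
    try:
        return getloadavg()[0]
    except OSError:  # raised when the value is unobtainable
        return None


def is_overloaded(limit, load):
    """Treat an unknown load as 'not overloaded' so the board stays open."""
    return load is not None and limit > 0 and load > limit
```

When the load cannot be read, a check like this can never trigger, which is why the feature is of no use on such hosts.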

I think your ideas are great and they should indeed be implemented in the core, except maybe for the logging feature but I'm not so sure.
I'm also wondering how the 429 should be sent, as error pages by phpBB will usually contain header and footer. So we would have to decide whether outputting a standard phpBB page, just without content, is okay or whether we should use a very simple error page instead.
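
For comparison, the "very simple error page" option could look roughly like the sketch below. It is a minimal Python WSGI handler, not phpBB code; the function name, the body text and the 10 second retry hint are assumptions. RFC 6585 allows, but does not require, a Retry-After header on a 429 response.

```python
def too_many_requests(environ, start_response, retry_after=10):
    """Minimal WSGI response: status 429 with a tiny plain-text body."""
    body = b"Too many requests. Please retry shortly.\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
        ("Retry-After", str(retry_after)),  # seconds; optional per RFC 6585
    ]
    start_response("429 Too Many Requests", headers)
    return [body]
```

A response like this needs no template, language file or style lookup, which is the point of keeping it bare.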

Regarding the "high load"-mode, it might also be useful to select extensions that may be disabled if under high load to further reduce pressure.

JoshyPHP
Registered User
Posts: 381
Joined: Fri Jul 08, 2011 9:43 pm

Re: PhpBB server load management

Post by JoshyPHP »

If you push load management to the application, by the time it starts checking whether the request should be served, a hundred different PHP files have already been loaded, a connection to the database has been established, a dozen different tables have already been queried and ~40% of the work required to serve the page has already been done.

How can you track the rate at which a user sends requests? (especially across different IPs) You're going to end up writing to the database and spend enough resources on accounting to defeat the purpose.

v12mike
Registered User
Posts: 14
Joined: Wed Dec 07, 2016 10:05 pm

Re: PhpBB server load management

Post by v12mike »

JoshyPHP wrote: Sat Oct 13, 2018 12:16 pm If you push load management to the application, by the time it starts checking whether the request should be served, a hundred different PHP files have already been loaded, a connection to the database has been established, a dozen different tables have already been queried and ~40% of the work required to serve the page has already been done.

How can you track the rate at which a user sends requests? (especially across different IPs) You're going to end up writing to the database and spend enough resources on accounting to defeat the purpose.
My assumption was that the check on user activity would be done in the $user->session_begin() function, at the point where a session definitely exists. At that point, in the normal case, there has been only one database lookup to find the session and user from the session_id. I would be surprised if getting to that point takes a significant proportion of the resources it takes to render a normal forum page.

I find it hard to believe that 40% of the work for a typical phpBB page is done in the first function call of (e.g.) viewtopic.php.

It is possible to consolidate the stats for multiple sessions of a user or bot using the existing session management system. For guests, it could only be done per-IP address.

I would design the session-hit monitoring system so that it does not use the database, instead storing the short-term session/hit data in the cache, which on any sensibly configured forum will be held in RAM.
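
A rough Python sketch of that idea follows (not phpBB code). All sessions of one user or bot share a single counter, guests are counted per IP address, and the counts live in an in-memory store with per-entry expiry. `counter_key`, `MemoryCache` and the use of `None` for a guest's user id are assumptions for illustration. The dictionary stands in for whatever RAM-backed cache the board is configured with.

```python
import time


def counter_key(user_id, is_bot, ip):
    """Pick one counter per user or bot; fall back to IP for guests."""
    if user_id is None:
        return ("guest", ip)
    return ("bot" if is_bot else "user", user_id)


class MemoryCache:
    """Stand-in for a RAM-backed cache with per-entry expiry."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.store = {}  # key -> (expires_at, value)

    def increment(self, key, ttl):
        """Add one hit; the count restarts once the entry has expired."""
        now = self.clock()
        expires_at, value = self.store.get(key, (now + ttl, 0))
        if expires_at <= now:
            expires_at, value = now + ttl, 0
        self.store[key] = (expires_at, value + 1)
        return value + 1
```

Two sessions of the same user from different IP addresses map to the same key, so their hits add up in one counter without any database write.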

JoshyPHP
Registered User
Posts: 381
Joined: Fri Jul 08, 2011 9:43 pm

Re: PhpBB server load management

Post by JoshyPHP »

A quick test on my local 3.2.x install shows that it takes between 200 and 300 files and 1 to 6 queries to get past session_begin(), the smaller number being for a hot cache. I don't think the exact number of queries really matters, it's more about having to create a connection. It takes about 60ms to get past session_begin() and about 70ms to display the index page. That's on PHP 7.2 with Opcache.
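
To put those two timings in proportion, here is a small worked calculation in Python. The post does not say whether the 70ms covers the whole request or only the part after session_begin(), so both readings are shown. The 100 hits per second figure is the estimate from the first post, and treating the 60ms as pure processing time is an assumption.

```python
# Figures quoted above (milliseconds).
session_begin_ms = 60
index_page_ms = 70

# Reading 1: 70 ms is the whole request, of which 60 ms precedes the check.
share_if_total = session_begin_ms / index_page_ms

# Reading 2: 70 ms is spent after session_begin(), so a request takes 130 ms.
share_if_additional = session_begin_ms / (session_begin_ms + index_page_ms)

# Cost of refusing a flood only after session_begin(), at 100 hits/s.
busy_seconds_per_second = 100 * session_begin_ms / 1000

print(f"{share_if_total:.0%} or {share_if_additional:.0%} of the work is already done")
print(f"{busy_seconds_per_second:.1f} s of processing per second of flood")
```

Under either reading (about 86% or about 46%), a large share of the cost has been paid before the check could refuse the request. At 100 hits per second, that is around 6 seconds of processing for every second of the flood.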

Senky
Extension Customisations
Posts: 315
Joined: Thu Jul 16, 2009 4:41 pm

Re: PhpBB server load management

Post by Senky »

JoshyPHP wrote: Sat Oct 13, 2018 8:20 pm A quick test on my local 3.2.x install shows that it takes between 200-300 files and 1-6 queries to get past session_begin(), the smaller number being for a hot cache. I don't think the exact number of queries really matters, it's more about having to create a connection. It takes about ~60ms to get past session_begin() and ~70ms to display the index page. That's on PHP 7.2 with Opcache.
+1. Thanks for the research. Simply put, load management should not be done at the PHP level, but rather at the HTTP server. If your board is big enough that it needs special load management, then you should probably have enough resources to invest in a proper one, not the pseudo-manager which is now part of the core.

Elsensee
Former Team Member
Posts: 42
Joined: Sun Mar 16, 2014 1:08 pm
Location: Hamburg, Germany

Re: PhpBB server load management

Post by Elsensee »

I also had the thought that it's kinda too late if this happens at the PHP level, because we're almost halfway through, but I was assuming that, for example, the display_forums() function which is called in index.php takes quite some time on big boards with a lot of forums.

I had another look at the logic and realised that while the load limit check happens in session_begin(), nothing is done about it until $user->setup(). Actually, I couldn't spot a single SQL query until that first checkpoint.
(The reason why this happens later is, I guess, to wait for the ACL to be initialised, so that admins and moderators will never be locked out.)
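
The two-phase flow described here can be sketched as follows. This is a Python illustration with made-up names, not phpBB's actual classes. The 503 status for a refused request and the rule that a limit of 0 means "disabled" are assumptions.

```python
class Request:
    """Hypothetical request state; names do not match phpBB's classes."""

    def __init__(self, load, limit, is_staff):
        self.load = load
        self.limit = limit
        self.is_staff = is_staff      # known only once permissions load
        self.load_exceeded = False

    def session_begin(self):
        # Early phase: only record that the limit was exceeded.
        self.load_exceeded = self.limit > 0 and self.load > self.limit

    def setup(self):
        # Later phase: permissions are available, so staff can be exempted.
        if self.load_exceeded and not self.is_staff:
            return 503
        return 200
```

Moving the refusal into the early phase would save work per refused request, but at that point the code cannot yet tell staff from ordinary users.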

I don't know if it would make more sense for the "locking out" to happen earlier, or if doing it at the PHP level is entirely wrong, though.
