Extending the scope of the phpBB cache

Discussion of general topics related to the new version and its place in the world. Don't discuss new features, report bugs, ask for support, et cetera. Don't use this to spam for other boards or attack those boards!
TerryE
Registered User
Posts: 95
Joined: Sat May 23, 2009 12:24 am

Extending the scope of the phpBB cache

Post by TerryE »

This thread is just to float an idea about performance that might be a later 3.x or 4.0 option. The phpBB architecture already uses caching at the application level to move much of the style-template processing from a per-page runtime cost to a one-off preprocessing step that generates the tpl_* modules in the cache directory.

We seem to be in the process of refactoring areas of the current implementation into normalised modules to enhance the maintainability of the source base. Whilst this has many benefits, for a "compile and go" language like PHP it also means that the source becomes more fragmented, so the number of modules that must be loaded to generate a page continues to rise. This isn't a significant problem for high-volume boards that have implemented a PHP code accelerator cache (e.g. Xcache). Lower-volume boards, however, won't have such caching, and their PHP source files will slowly be cycled out of the file system cache; when this happens, each extra module to be loaded can require an extra two seeks, leading to an extra delay of perhaps 0.1-0.3 secs in page rendering. These loads can build up to quite material response delays.

At the moment the template preprocessing removes the runtime overhead of mapping the various <!-- block --> tags onto the conditional code to do this work. This is independent of language so a lot of error checking and binding has to be retained at runtime, but this could largely be eliminated by the following changes:
  • Instead of generating a tpl module per style/HTML_template, we generate a compiled_template class module per style/language/display_page. That is, tpl_prosilver_forumlist_body.html.php is replaced by tpl_prosilver_en_forumlist.html.php.
  • The common and page-specific language files are preprocessed at template generation, and the results rolled up into a get_lang method, which is probably just an unserialize call on an embedded (and rather large) literal. This template needs to be bound before language references are needed, and the call to get_lang() can then replace the includes of the language files. (OK, there are some rare early error conditions which require language strings before this call point, but these can be handled by a lazy include of common.php.)
  • This denormalisation may seem to be a combinatorial explosion, but in reality few forums implement more than one or two languages and only one language / style combination is ever needed per request.
  • This also means that we can hoist a lot more error checking into the compile phase (such as comparing language X to EN and flagging missing keys) and log this for review in the ACP.
  • Most language strings can be bound at preprocessing time, so whilst a "{L_PRINT_TOPIC}" reference currently generates "<?php echo ((isset($this->_rootref['L_PRINT_TOPIC'])) ? $this->_rootref['L_PRINT_TOPIC'] : ((isset($user->lang['PRINT_TOPIC'])) ? $user->lang['PRINT_TOPIC'] : '{ PRINT_TOPIC }')); ?>", this would now be replaced by the literal value of the PRINT_TOPIC string, etc.
  • Because the tpl is a proper class, we can split out the default settings for variables into a separate method so that the main display doesn't need to embed the convoluted isset() conditionals. It would just include a $this->set_defaults() call at its head.
  • Include files could also be rolled up at preprocessing time. Hence where a page currently loads 5-6 tpl and language files, these would be replaced by a single cache module which will be smaller and leaner than their combined size.
  • This approach would only need small changes at a source level since the guts would be included in the template classes.
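To make the binding and compile-phase checking bullets concrete, here is a minimal sketch in PHP (assuming PHP 7.1 or later). It is not phpBB's template compiler: bind_lang_refs() and missing_lang_keys() are hypothetical names, and the language arrays are invented sample data. Keys that cannot be bound are left in place so they can still be resolved at runtime.

```php
<?php
// Hypothetical compile-time helpers; none of these names exist in phpBB.

/**
 * Replace {L_KEY} references with literal strings where the key is known.
 * Strings are inserted verbatim, as the runtime echo would output them.
 * Unknown keys are left untouched and reported, so a runtime fallback
 * (for example for dynamically assigned strings) is still possible.
 */
function bind_lang_refs(string $source, array $lang): array
{
    $unbound = [];
    $compiled = preg_replace_callback(
        '/\{L_([A-Z0-9_]+)\}/',
        function (array $m) use ($lang, &$unbound) {
            if (isset($lang[$m[1]]) && is_string($lang[$m[1]])) {
                return $lang[$m[1]];
            }
            $unbound[] = $m[1];
            return $m[0];
        },
        $source
    );
    return [$compiled, array_values(array_unique($unbound))];
}

/** Keys present in the reference language (EN) but missing from another. */
function missing_lang_keys(array $reference, array $other): array
{
    return array_keys(array_diff_key($reference, $other));
}

// Invented sample data for illustration.
$en = ['PRINT_TOPIC' => 'Print view', 'EMAIL_TOPIC' => 'E-mail friend'];
$de = ['PRINT_TOPIC' => 'Druckansicht'];

[$html, $unbound] = bind_lang_refs('<a title="{L_PRINT_TOPIC}">{L_EMAIL_TOPIC}</a>', $de);
$missing = missing_lang_keys($en, $de);
```

Here $html has the PRINT_TOPIC reference replaced by its literal, while EMAIL_TOPIC appears in both $unbound and $missing; a list like $missing is the sort of thing that could be logged for review in the ACP.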
Reactions and thoughts?

naderman
Consultant
Posts: 1727
Joined: Sun Jan 11, 2004 2:11 am
Location: Berlin, Germany

Re: Extending the scope of the phpBB cache

Post by naderman »

Generally I think compiling language variables into the templates is a good idea, and we've been discussing this a bit (not in detail) for 3.2. The questions I have are: how effective is this, especially since there are language variables assigned dynamically which we cannot compile into the template right away? And how do we deal with language files that are dynamically loaded at a later point? Or should that just not be allowed?

Something else on extending the scope of phpBB caching: I've been thinking about "compiling" the actual php scripts to reduce the number of files involved in one request. Especially if we improve modularity as much as we would like to in 3.2 that might become a problem.

TerryE
Registered User
Posts: 95
Joined: Sat May 23, 2009 12:24 am

Re: Extending the scope of the phpBB cache

Post by TerryE »

I've been sucking my teeth on this one. The problem with an "idea" is that it is very hard to "kick the tyres". What I was thinking of doing was a concept demonstrator based on 3.0.5 which wouldn't be a production quality refactoring but might have hacky fringes to minimize the amount of code changes, but would at least allow the dev team to evaluate the pros and cons a little more and do a proper cost benefit analysis of moving in this direction.

In terms of implementation architecture we come down to the platform use-cases that I discussed in my sandbox page. For virtual and dedicated server implementations, I would expect the sysadmin to implement a PHP cache accelerator such as APC or Xcache, and in this case there is minimal runtime overhead incurred by better modularisation of the code. However, phpBB hosted on a shared service (which is probably 95%+ of installs) is an entirely different issue. (The one that I use doesn't even use the php5 module; it uses php5-cgi and zilch caching.) It's a real pity that PHP doesn't adopt the Python approach of compiling to intermediate bytecodes and storing these to file, because in this shared service scenario strong modularisation can result in a performance penalty.

What we would need to do is look at real-world performance. Conventional benchmarking would be of little value, because we don't want to look at responsiveness when the site is being hit frequently enough for everything to be nicely cached in the file system cache, but at the case where you request the board listing when no one has visited the site for five minutes.

== [edit] Footnote ==

Out of interest, viewtopic.php, which accounts for perhaps 85% of page requests, loads 25 separate modules to render the webpage. 4 of these are cached. In the round, these could be batched into the following:
  • Data Access Group. config.php plus the includes: cache.php, constants.php, functions.php, acm files (in my case acm_memory.php, acm_xcache.php), db files (in my case dbal.php, mysqli.php)
  • Common Group. common.php, plus the includes auth.php, auth/auth_db.php, functions_content.php, hooks/index.php, session.php, template.php, utf/utf_tools.php
  • Viewtopic Includes. functions_display.php, functions_profile_fields.php, bbcode.php
  • ViewTopic/Language/Style. Language files: common.php, viewtopic.php; and Style templates: overall_header.html, viewtopic_body.html, jumpbox.html, overall_footer.html
These could be mapped onto four cached files. The reason for splitting out the Data Access Group from the Common Group is that style.php only loads the former, whereas all of the page display functions load the latter. The Viewtopic includes would be specific to viewtopic.php, and each display page has its corresponding include. The first three groups would basically be a logical concatenation of the base files, possibly with trivial whitespace and comment removal simply to reduce the file size for input by 30% or so. The ViewTopic/Language/Style include would be fairly heavily transformed by the template compile process. A copy of this would be generated for each active style/language combination (typically Prosilver for one language). So 25 files go down to 4 files. Remember that the main saving is not in the lines of code being compiled or executed, but in the latency saved by reducing the number and total size of files to be loaded.
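The "logical concatenation with whitespace and comment removal" step could look something like the sketch below. It is only an illustration under stated assumptions: build_loadset() is a hypothetical name, every input is assumed to be pure PHP (one opening tag, at most one trailing closing tag, no reliance on __FILE__ or __DIR__), and the demo files are invented. php_strip_whitespace() is a standard PHP function that returns a file's source with comments and surplus whitespace removed.

```php
<?php
// Hypothetical roll-up step; build_loadset() is not a phpBB function.

/**
 * Concatenate several PHP source files into one cache file, stripping
 * comments and surplus whitespace from each on the way.
 */
function build_loadset(array $files, string $target): bool
{
    $out = "<?php\n";
    foreach ($files as $file) {
        $code = trim(php_strip_whitespace($file));
        $code = preg_replace('/^<\?php\s*/', '', $code);  // drop the opening tag
        $code = preg_replace('/\?>$/', ';', $code);       // a closing tag implies ';'
        $out .= '// from ' . basename($file) . "\n" . trim($code) . "\n";
    }
    return file_put_contents($target, $out) !== false;
}

// Demo with two throwaway modules in a temporary directory.
$dir = sys_get_temp_dir() . '/loadset_' . uniqid();
mkdir($dir);
file_put_contents("$dir/a.php", "<?php\n// helper A\nfunction demo_a() { return 'a'; }\n?>");
file_put_contents("$dir/b.php", "<?php\n/* helper B */\nfunction demo_b() { return demo_a() . 'b'; }\n");

build_loadset(["$dir/a.php", "$dir/b.php"], "$dir/loadset.php");
require "$dir/loadset.php";   // one file load instead of two
$result = demo_b();
```

The roll-up has to respect load order (here b.php calls a function from a.php only at call time, so order does not matter, but top-level code would).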

naderman
Consultant
Posts: 1727
Joined: Sun Jan 11, 2004 2:11 am
Location: Berlin, Germany

Re: Extending the scope of the phpBB cache

Post by naderman »

Since we'd probably be looking at a major release for this I guess we can base our ideas on trunk. One difference is that we have auto loading there. It should be relatively easy to efficiently recognise inclusion patterns like the ones you found for viewtopic through that. Maybe even automatically.

DavidMJ
Registered User
Posts: 932
Joined: Thu Jun 16, 2005 1:14 am
Location: Great Neck, NY

Re: Extending the scope of the phpBB cache

Post by DavidMJ »

The problem with caching language stuff into templates is an interesting one; I know this because I tried to do it a while back for _everything_ (things like imagesets and our dynamically generated stylesheets). The problem is not with the caching of the data itself but with the cache invalidation, which quickly becomes expensive. An option is to just not do it and throw it into the ACP as an option that you just have to be aware of. With that in mind, all sorts of stuff can be inlined.

While we include only a handful of things at any given point in time, I think that we are not fine-grained enough in places like functions.php ... It turned into a file where any random function that did not belong anywhere else ended up. I think that we should attempt minor refactoring and get a lot of that stuff out of there. Benchmarking has shown that merely including an already cached file could be made even cheaper: even though all of the opcodes are already generated, it still needs to populate all sorts of hash tables and other stuff...

Another nice thing about reducing the number of files that we open at runtime is for the opcode caches... I see many hosts that have their opcode cache configured to recache whenever an on disk state change occurs...
Freedom from fear

TerryE
Registered User
Posts: 95
Joined: Sat May 23, 2009 12:24 am

Re: Extending the scope of the phpBB cache

Post by TerryE »

David,
DavidMJ wrote: The problem is not with the caching of the data itself but with the cache invalidation...
Agreed that the coherency issue is always there to bite you. However, we have three broad categories of information in the phpBB system and they have different sweet spots and therefore caching approaches.
  A. The phpBB code configuration. In reality this is pretty much piecewise constant and usually only changes at version upgrade for any individual installation.
  B. The board configuration metadata. This is also piecewise constant but more volatile than the code configuration. On my boards we might change this perhaps once per week to once per month.
  C. The board data content. This changes on a per-transaction basis.
I discussed three infrastructure use cases in my sandbox paper but on reflection we could for this purpose collapse them down to two:
  1. Shared. Here the board is installed on a typical ISP-provided shared web service. The sysadmin usually doesn't have interactive access to the service, let alone configuration rights over the LAMP (or equivalent) stack. No opcode or data caches are available.
  2. Dedicated. This may not be literally dedicated, but the sysadmin installing the board has change control on the hosting server and can optimise it for phpBB use by tuning the webserver and PHP, and by adopting APC or an equivalent cache to offer effective RAM-based opcode and variable caching. Whether this is a virtual or dedicated host really only affects the scale of board volumetrics that can be supported.
We probably have something like a 95:5 split for cases (1) and (2) with the high volume boards and the more able sysadmins in the latter.

In terms of (A) and (B), the volatility is so low that a very simple coherency policy can be applied: just flush the entire cache after any change, and don't bother trying to maintain any coherency metadata. This approach is also very easy to implement. The application-based code caching that I discuss in this topic falls into category (A) and is really targeted at lifting the end-user responsiveness for the 95% of phpBB boards in case (1). It will have minimal benefits for the dedicated boards, as it would be rendered largely unnecessary by a decent opcode cache.
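The "flush everything" policy needs very little code, which is much of its appeal. The sketch below is an assumption-laden illustration rather than phpBB's own purge routine: flush_cache_dir() is a hypothetical name, the cache is assumed to be a flat directory of generated .php files, and the file names in the demo are only examples.

```php
<?php
// Hypothetical helper; phpBB's own cache classes are not used here.

/**
 * Whole-cache flush: after any code or board-configuration change, delete
 * every generated .php file and let later requests rebuild what they need.
 * No per-entry dependency or coherency metadata is kept.
 */
function flush_cache_dir(string $cacheDir): int
{
    $removed = 0;
    foreach (glob($cacheDir . '/*.php') ?: [] as $file) {
        if (is_file($file) && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}

// Demo in a temporary directory with illustrative file names.
$dir = sys_get_temp_dir() . '/cache_' . uniqid();
mkdir($dir);
file_put_contents("$dir/tpl_prosilver_viewtopic_body.html.php", '<?php // compiled template');
file_put_contents("$dir/data_global.php", '<?php // cached config');
file_put_contents("$dir/.htaccess", 'Deny from all');

$removed = flush_cache_dir($dir);
```

Only the generated .php files are removed; anything else in the directory, such as an access-control file, is left alone.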

The ACM cache really targets (B), with the ACM files module optimised for (1) and the ACM memory modules optimised for (2). (BTW, I have implemented a mysqli "memory" cache and I find that it performs a lot better than the ACM files cache.)

I agree that (C) is very problematic because of the coherency issues. It has virtually no benefits for (1). My inclination would be to leave this sort of caching to the database engine. Certainly, based on my drilldown on viewtopic, there are very few opportunities here. Perhaps the main one that leaps out would be to cache the WHO IS ONLINE data fully (say with a 120 sec TTL), as this has an order N squared cost and there is no integrity penalty if the data is slightly stale. This would give a modest but worthwhile performance boost to (2).
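A TTL-based cache of the kind suggested for the WHO IS ONLINE data can be sketched as follows (assuming PHP 7.4 or later). TtlCache and its remember() method are hypothetical names, and the store is a plain in-process array purely for illustration; a real version would have to sit on a shared memory cache such as APC to survive between requests. The clock is injectable so that expiry can be demonstrated without waiting.

```php
<?php
// Hypothetical TTL cache; class and method names are illustrative only.

final class TtlCache
{
    private array $entries = [];
    /** @var callable returns the current Unix time; injectable for testing */
    private $clock;

    public function __construct(?callable $clock = null)
    {
        $this->clock = $clock ?? 'time';
    }

    /** Return the cached value, or call $producer and keep its result for $ttl seconds. */
    public function remember(string $key, int $ttl, callable $producer)
    {
        $now = ($this->clock)();
        if (isset($this->entries[$key]) && $this->entries[$key]['expires'] > $now) {
            return $this->entries[$key]['value'];
        }
        $value = $producer();
        $this->entries[$key] = ['value' => $value, 'expires' => $now + $ttl];
        return $value;
    }
}

// Demo with a fake clock and a counter standing in for the expensive query.
$now = 1000;
$queries = 0;
$cache = new TtlCache(function () use (&$now) { return $now; });
$load = function () use (&$queries) { $queries++; return ['alice', 'bob']; };

$cache->remember('who_is_online', 120, $load);  // miss: runs the query
$now += 60;
$cache->remember('who_is_online', 120, $load);  // hit: still within the TTL
$now += 61;
$users = $cache->remember('who_is_online', 120, $load);  // miss: entry has expired
```

Three lookups cost two queries here. Note that the sketch uses one fixed key; if the key varied per user or per forum, the number of cache entries would grow accordingly.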

I feel that the main benefit of the sort of caching strategy that I outlined would be a subtle one. At the moment the developers need to be wary of the performance impact of modularisation, and this is not good for code quality and maintainability. If we adopt a caching strategy where we JIT-assemble our modules into loadsets to minimise load overheads, then this frees the developers to focus on functionality and maintainability.

One last comment: current benchmarks answer the wrong question for type (1) systems. They tend to hammer the system to take throughput measures as demand increases. Many type (1) systems would be lucky to get tens of posts per day and a thousand views. The main measure for type (1) boards should be latency per phpBB request. As I have said before, the killer here is that the (filesystem) cache miss rate can be very high, and therefore the number of physical reads needed to generate the page dominates latency.

Kellanved
Former Team Member
Posts: 407
Joined: Sun Jul 30, 2006 4:59 pm
Location: Berlin

Re: Extending the scope of the phpBB cache

Post by Kellanved »

TerryE wrote: Perhaps the main one that leaps out would be to cache the WHO IS ONLINE data fully (say with a 120 sec TTL), as this has an order N squared cost and there is no integrity penalty if the data is slightly stale. This would give a modest but worthwhile performance boost to (2).
I added caching to that a while back - we had to revert it, as the userbase complained about the list being not up-to-date. It's hard to please everybody :)
No support via PM.
Trust me, I'm a doctor.

Acyd Burn
Posts: 1838
Joined: Tue Oct 08, 2002 5:18 pm
Location: Behind You

Re: Extending the scope of the phpBB cache

Post by Acyd Burn »

Kellanved wrote:
TerryE wrote: Perhaps the main one that leaps out would be to cache the WHO IS ONLINE data fully (say with a 120 sec TTL), as this has an order N squared cost and there is no integrity penalty if the data is slightly stale. This would give a modest but worthwhile performance boost to (2).
I added caching to that a while back - we had to revert it, as the userbase complained about the list being not up-to-date. It's hard to please everybody :)
If I remember correctly, we had an additional, more serious problem with this too: because the who is online query differs between forums and users (the query changes slightly from user to user and forum to forum), hundreds of SQL cache files were created, which resulted in an actual slowdown rather than a performance boost. ;)


TerryE
Registered User
Posts: 95
Joined: Sat May 23, 2009 12:24 am

Re: Extending the scope of the phpBB cache

Post by TerryE »

@Kevin. What TTL did you use? This would only have any advantage on the higher-volume boards such as http://www.phpBB.com/community, where a 2 min TTL would save this, so this should only be cached if a memory cache is being implemented. (I've posted separately on the simplification of the coherence model for these caches, which will improve performance.) But given that MySQL should be set up properly on such systems, this is a reasonably cheap query, so maybe you're right and it's not worth doing. However, this is a detail which shouldn't distract us from the central topic.

On the wider issue of caching and modularisation, I've had a look at the PHP specific performance discussions on the Internet. Most are motherhood for beginners and decent discussions are thin on the ground, but here are some useful hits: Most of the performance discussions really focus on the high throughput dedicated systems, and don't really address improving the responsiveness of shared systems, yet the characteristics and therefore optimisation strategy for types (1) and (2) are different, so we may want to make the caching approaches configurable.

An example of such a difference is that there are some benefits in a normalised modularisation on cached dedicated systems as this causes less fragmentation of the opcode cache; whereas on shared systems globbing up the modules into composites has distinct performance benefits. Another issue is that some engines will scan and preload any fixed path includes as part of the module load. Hence if we are adopting a dynamic cache we can also convert any cascade includes from programmatic to fixed path. More discussion is needed here.

One other issue that makes life difficult is the really inconsistent approach to how modules are included. Of the 99 PHP modules in my phpBB 3.0.5 config, I have include (178 uses), include_once (101), require (67) and require_once (0), with the include mechanism varying from module to module, e.g. includes/functions_user (include: 22, include_once: 7, require: 1). The Drupal performance discussions give a clear reasoning for preferring include_once over include, and I am not sure why we should prefer include to require, as all but a few includes have no error handling yet will cause fatal function call errors if they fail. To me what cries out here is that we adopt a coherent and unified approach to module inclusion. It is important that this works well for the requests which comprise over 99% of transactions (see here): viewtopic, files, styles, viewforum, ucp (logon), index, search (new/active/ego posts), post. It is a waste of time trying to optimise the rest, so there isn't much point in any optimisation of the ACP, MCP, and UCP (less logon) functions, but the main path for these high-usage functions should be reviewed. I've attached a rough-cut analysis of the includes in phpBB to show you what I mean. I feel that we should encapsulate the main-path includes in any module in a wrapper function at the top of the module, which provides a unified method of handling all such includes (possibly including optional debug instrumentation), at least for the modules in this main tree hierarchy. This unified method could also hook into a cache rollup which batches up the modules and resolves any residual includes to a fixed path.
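A wrapper of the kind described might look roughly like this. It is a sketch only: load_modules() and its parameters are hypothetical, and it assumes the included files just define functions and classes (variables set at the top level of an included file would land in the wrapper's local scope, not the global one).

```php
<?php
// Hypothetical wrapper; load_modules() and its parameters are illustrative only.

/**
 * Load each named module at most once from a fixed root, using require so a
 * missing file fails immediately rather than at the first undefined function.
 * Returns the names newly loaded by this call, which could feed debug
 * instrumentation or a cache roll-up.
 */
function load_modules(string $root, array $modules, string $ext = 'php'): array
{
    static $loaded = [];
    $newlyLoaded = [];
    foreach ($modules as $name) {
        $path = $root . '/' . $name . '.' . $ext;
        if (isset($loaded[$path])) {
            continue;               // already included during this request
        }
        require $path;
        $loaded[$path] = true;
        $newlyLoaded[] = $name;
    }
    return $newlyLoaded;
}

// Demo with a throwaway module in a temporary directory.
$root = sys_get_temp_dir() . '/load_modules_' . uniqid();
mkdir($root . '/includes', 0777, true);
file_put_contents("$root/includes/functions_demo.php", "<?php\nfunction demo_loaded() { return true; }\n");

$first = load_modules($root, ['includes/functions_demo']);
$second = load_modules($root, ['includes/functions_demo']);  // no second include
```

Tracking loaded paths in an array gives the "once" behaviour while leaving a single place to change if the includes are later redirected to rolled-up cache files.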

Anyway this is just a brainstorm. I'll try to do an outline implementation in the next month or so if you are interested.
Attachments
IncludeAnalysis.zip
Excel analysis of includes in phpBB codebase
(18.57 KiB) Downloaded 548 times

naderman
Consultant
Posts: 1727
Joined: Sun Jan 11, 2004 2:11 am
Location: Berlin, Germany

Re: Extending the scope of the phpBB cache

Post by naderman »

I'd actually say that based on that info it's quite apparent that we should go with autoload and include/require rather than either of the _once() methods. Whether one uses include or require doesn't really make a performance difference. Sometimes require is used when something is included unconditionally, while include() is used for conditional stuff. But I guess we should just decide on one and stick to it. The decision for autoload+require in future versions was already made based on this kind of analysis. What I wonder however is how to get autoloading right wrt caching as well as automatically including dependencies so the autoloader won't have to be called too often.

I'd love to allow MODs to alter the dependencies without code changes but I really don't see how that'd work so I guess anything other than the default stuff would still have to be loaded via the autoloader regularly. Unless of course we come up with some dynamic cached way of detecting these automatically.
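One way to approach the "dynamic cached way of detecting these automatically" is to let the autoloader record what it loads, so that a later build step can pre-include the same set. The sketch below is only an illustration: register_recording_autoloader() is a hypothetical name, the Class_Name to class_name.php mapping is an assumption, and persisting the log to the cache is left out.

```php
<?php
// Hypothetical recording autoloader; names and file layout are assumptions.

/**
 * Register an autoloader that maps Class_Name to $dir/class_name.php and
 * appends every class it loads to $log. The log could later be written to
 * the cache and used to pre-include the same files on the next request.
 */
function register_recording_autoloader(string $dir, array &$log): void
{
    spl_autoload_register(function (string $class) use ($dir, &$log) {
        $file = $dir . '/' . strtolower($class) . '.php';
        if (is_file($file)) {
            require $file;
            $log[] = $class;
        }
    });
}

// Demo with two throwaway classes, one depending on the other.
$dir = sys_get_temp_dir() . '/autoload_' . uniqid();
mkdir($dir);
file_put_contents("$dir/demo_auth.php", "<?php\nclass Demo_Auth {}\n");
file_put_contents("$dir/demo_session.php", "<?php\nclass Demo_Session extends Demo_Auth {}\n");

$log = [];
register_recording_autoloader($dir, $log);
$session = new Demo_Session();   // autoloads Demo_Session, which in turn needs Demo_Auth
```

Because a class is logged only after its file has finished loading, a parent class appears in $log before the child that extends it, so the log is already in a usable pre-include order. MOD-supplied classes would be picked up the same way without code changes, as long as they go through the autoloader.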
