Perhaps one of the key reasons for my slightly different perspective on performance is that my IT background wasn't in classic application development, but rather in modelling, real-time and systems development, moving into more general systems engineering. From a systems engineering perspective you try to understand the architecture and performance of an application in macroscopic terms and to identify the key parameters at that level. I now have a sound grasp of these macro characteristics of phpBB, and what I am grappling with is how to make them accessible to the development team as a set of simple guidelines.
- phpBB is a relatively simple application from an algorithmic perspective. There is no complex processing behind the common web requests; they mostly present formatted database content. A typical request executes perhaps 40K lines of PHP and, given that the opcode interpreter can execute ~20M opcodes per second and each line compiles to a handful of opcodes, this amounts to only 5-10 ms of runtime.
- On the other hand, viewing this topic on area51 takes perhaps 2 secs, and topics on a forum running on a shared service maybe 2-3 times that, so the overwhelming majority of the time (something like 99%) is being spent on non-application activities. Understanding these and addressing the low-hanging fruit can have a profound impact on the responsiveness of the application.
- The one that is apparent at the application level is the overhead of executing D/B queries, so quite a lot of effort has gone into optimising query design and into adding a query-cache mechanism so that low-volatility queries avoid hitting the D/B at all (see the query-cache sketch after this list).
- PHP image activation and application code compilation are unavoidable on a shared service, and together they add about 150 ms to request times. This overhead can be, and normally is, avoided on a dedicated service (e.g. by using mod_php or FastCGI to create persistent PHP processes, and an opcode cache to avoid recompilation).
- Correct tuning of Apache / IIS to maximise the use of local browser caching and data-stream compression can significantly reduce the number and size of the secondary requests needed to render a phpBB page (an example set of Apache directives follows this list). Yet our documentation is entirely silent on this issue.
- There are two key strategies which can significantly reduce runtime on a shared service:
- Reading (and writing) lots of files kills application performance. The reason is that there is huge contention for the Virtual Filesystem Cache (VFC) on a shared server, so files rapidly get flushed out and the server must reload their content from the filesystem. In the case of a NAS-served NFS backend, this means NFS I/O tunnelled through RPC calls to the NAS server, each of comparable runtime to a typical SQL query against the D/B server. A developer will simply not see this when testing phpBB on a local Apache instance, because everything gets cached and all network traffic goes over the localhost loopback. A typical phpBB request reads (and occasionally writes) 30-40 files.
Aggregating content from multiple files into one file can therefore significantly accelerate performance.
- Whenever possible, the application should correctly negotiate the revalidation protocols for quasi-static content. The two relevant modules here are download/file.php and style.php, and at present these don't do this (except for avatars).
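To make the query-cache point concrete, here is roughly how a low-volatility query can be cached through the phpBB3 DBAL: the second argument to sql_query() is a cache TTL in seconds (the one-hour figure here is purely illustrative).

[code]
// The forum list changes rarely, so serve it from the query cache.
// While a cached copy is valid, the D/B server is never contacted.
$sql = 'SELECT forum_id, forum_name
	FROM ' . FORUMS_TABLE . '
	ORDER BY left_id ASC';
$result = $db->sql_query($sql, 3600);	// cache TTL: one hour

while ($row = $db->sql_fetchrow($result))
{
	// ... present the forum row ...
}
$db->sql_freeresult($result);
[/code]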
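And as an illustration of the web-server tuning point, a minimal Apache fragment along these lines (mod_deflate and mod_expires assumed to be loaded; the lifetimes are only indicative) compresses the text responses and lets browsers hold the static theme assets locally:

[code]
# Compress the text-based responses (HTML, CSS, JS)
<IfModule mod_deflate.c>
	AddOutputFilterByType DEFLATE text/html text/css application/javascript
</IfModule>

# Let browsers cache static assets, cutting out the secondary requests
<IfModule mod_expires.c>
	ExpiresActive On
	ExpiresByType text/css "access plus 1 week"
	ExpiresByType image/gif "access plus 1 month"
	ExpiresByType image/png "access plus 1 month"
</IfModule>
[/code]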
In terms of file aggregation, there are three major opportunities: (i) the single-file ACM cache option for small sites that we've previously discussed in this topic; (ii) the ability to create safe composite include files, so that if you know that request X always needs 14 specific include files and you are running on a shared service, you simply replace X with a copy that also inlines the content of those 14 includes; (iii) a simple modification to the templating engine to inline included templates. Together these three reduce the number of files that need to be read per request by over 20.
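A minimal sketch of what I mean by (ii), run once at install or upgrade time rather than per request (the file list and output path are illustrative, and a production version would need to be more careful about stripping the PHP tags):

[code]
// Concatenate the includes that a request always loads into one file,
// so a shared-service request opens one file instead of a dozen or more.
$includes = array(
	'includes/functions.php',
	'includes/functions_display.php',
	'includes/bbcode.php',
	// ... the remaining always-needed includes ...
);

$composite = "<?php\n";
foreach ($includes as $file)
{
	$code = file_get_contents($file);
	// Strip the open/close tags so the pieces concatenate into valid PHP
	$code = preg_replace('/^<\?php/', '', $code);
	$code = preg_replace('/\?>\s*$/', '', $code);
	$composite .= $code . "\n";
}
file_put_contents('includes/composite_viewtopic.php', $composite);
[/code]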
The easiest way for me to explain what I mean by a correctly negotiating version of file.php will be to rewrite it and submit that version.
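In the meantime, here is the core of the negotiation logic in miniature (a sketch only, not the actual rewrite; $filemtime and $stored_file stand in for values that would come from the attachment record):

[code]
// Answer a conditional request with 304 Not Modified rather than
// re-sending the attachment body.
$last_modified = gmdate('D, d M Y H:i:s', $filemtime) . ' GMT';

header('Last-Modified: ' . $last_modified);
header('Cache-Control: public, max-age=3600');

if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
	strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $filemtime)
{
	// The browser's copy is still current -- no need to resend the content
	header('HTTP/1.1 304 Not Modified');
	exit;
}

// Otherwise fall through and send the file as normal
readfile($stored_file);
[/code]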