I'm working on making phpbb cron facilities invokable by the system cron daemon, because having users request 'cron.php' on a dedicated server is just silly. That's not actually straightforward because phpbb cron code is tightly coupled to the idea that it's invoked by users like normal pages. I came up with a new design that allows cron tasks to be invoked either the 'phpbb way' or the 'system way', and has a very nice side effect that modifications can trivially add cron tasks without needing to change any existing code. Is there interest in incorporating this change into phpbb itself?
Here's how the new cron design works. A 'cron task' is a piece of php code that should be executed periodically; it is so named to avoid confusion with a 'cron job', which is something one specifies in the system crontab. There is a new class called cron, which is the interface between cron tasks and the code that uses them. The cron class provides methods to retrieve runnable tasks, 'schedule' them (if using the phpbb system) by generating the appropriate html code, and invoke them. The cron class does not contain any actual task definitions, however; those live in separate task classes, which the cron class loads automatically. A task class contains definitions of one or more tasks along with the code that actually runs them. Task definitions describe task properties: which configuration setting controls whether the task is enabled, whether the task needs to check a condition to determine its runnability, etc.
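To make the split between the cron class and the task classes more concrete, here is a minimal sketch. It is not the code from the actual patch: the class names (example_cron, example_cron_task), the method names and the config keys are all made up for illustration.

```php
<?php
// Hypothetical names throughout; none of this is existing phpBB code.

// A single task: knows which config setting enables it and how to do its work.
class example_cron_task
{
    public $name;
    public $enable_key; // config setting that switches the task on
    private $callback;  // code that does the actual work

    public function __construct($name, $enable_key, callable $callback)
    {
        $this->name = $name;
        $this->enable_key = $enable_key;
        $this->callback = $callback;
    }

    public function is_runnable(array $config)
    {
        return !empty($config[$this->enable_key]);
    }

    public function run()
    {
        return call_user_func($this->callback);
    }
}

// The cron class: the only thing that calling code needs to talk to.
class example_cron
{
    private $tasks = array();
    private $config;

    public function __construct(array $config)
    {
        $this->config = $config;
    }

    public function add_task(example_cron_task $task)
    {
        $this->tasks[$task->name] = $task;
    }

    // Returns the tasks whose controlling config setting is switched on.
    public function find_runnable_tasks()
    {
        $runnable = array();
        foreach ($this->tasks as $task) {
            if ($task->is_runnable($this->config)) {
                $runnable[] = $task;
            }
        }
        return $runnable;
    }

    // Runs every runnable task and returns the names of the tasks that ran.
    public function run_all()
    {
        $ran = array();
        foreach ($this->find_runnable_tasks() as $task) {
            $task->run();
            $ran[] = $task->name;
        }
        return $ran;
    }
}
```

With this shape, a caller (cron.php or a command line script) only needs an example_cron instance; whether the tasks are triggered by a page request or by the system cron daemon does not matter to the tasks themselves.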
In order for a modification to add cron tasks it needs to do two things. First, create a new cron task class with appropriate definitions and place it in the right directory. Second, add the class name to the list of enabled classes (or modules) in config.
Configuration gains two settings: how to run cron tasks (phpbb or system) and which cron modules are enabled (comma-separated list).
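As an illustration of the two settings, a sketch along these lines would be enough to read them. The setting names cron_mode and cron_modules, the module names and the helper function names are assumptions for this example; the post does not name them.

```php
<?php
// Hypothetical setting names and values, for illustration only.
$config = array(
    'cron_mode'    => 'system',            // 'phpbb' or 'system'
    'cron_modules' => 'core, feed_reader', // comma-separated list of enabled modules
);

// Splits the comma-separated module list into clean names.
function example_enabled_cron_modules(array $config)
{
    $raw = isset($config['cron_modules']) ? $config['cron_modules'] : '';
    $names = array_map('trim', explode(',', $raw));
    // Drop empty entries left behind by an empty setting or stray commas.
    return array_values(array_filter($names, 'strlen'));
}

// True when tasks should be left to the system cron daemon.
function example_use_system_cron(array $config)
{
    return isset($config['cron_mode']) && $config['cron_mode'] === 'system';
}
```

A modification would then only need to append its own module name to the list in order to have its tasks picked up.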
I think mod authors will benefit from this cron design. From what I read there's demand for adding cron tasks to phpbb but not many people do it through phpbb's current cron system.
[RFC|Merged] Modular cron
Re: System cron and cron tasks in mods
I've been thinking about converting the cron system on my site to a "real" cronjob as well, because I'm currently split between some newer code which uses real cronjobs, and some older code which uses the system from phpBB3 which I ported into my site.
I'd also like to rewrite the functions in C to see if I can get any better performance.
Re: System cron and cron tasks in mods
Hey, I've had this tab open for a few days now, meaning to reply, but, well, I've been pretty busy. Anyway, I think this is something phpBB itself should support. You could probably release your changes as a MOD for 3.0 and/or make them available for inclusion in phpBB 3.1. You might want to look at the Periodic project (I haven't really done that myself yet): http://arbitracker.org/periodic.html
Re: System cron and cron tasks in mods
nn-: your cron system sounds similar to the vBulletin implementation of pseudo-cron jobs, which have their own file and so on.
[RFC] Modular cron
I'd like to propose some changes/improvements to the cron system.
What is cron
Cron is a utility that periodically runs pre-defined tasks at pre-defined intervals.
Status quo
phpBB 3.0 has a cron-like system to run maintenance tasks such as garbage collection for sessions, db cache, search index, permissions. There is also message queue (email) processing, pruning of warnings and topics.
The way it is implemented is by outputting a transparent 1x1 pixel gif in the board footer, while doing the work in the background. If possible it uses register_shutdown_function to handle the tasks after the image was rendered. This allows any board user to trigger the execution of the script.
The way it is implemented is not very extensible. The possible tasks are hard-coded into the cron.php file and the page_footer() function from includes/functions.php. This makes it hard to add new tasks or to run those tasks by other means.
Proposal
There are several use-cases for a modular cron system. First of all, the ability to add new tasks. If a MOD author (or other person) wants to extend phpBB, there are cases which require running tasks periodically. For example a feed aggregator that posts an RSS/Atom feed to a specified forum.
Also from a task caller perspective this has benefits. A large board may want to switch to a real server-side cron instead of the current client-side pseudo cron implementation. Currently this is not easily possible, because phpBB's files are not particularly CLI-friendly.
Speaking of which, many frameworks provide cli tools to aid development. ameeck has been working on such a tool for phpBB. It might be useful for a developer (mod author) to be able to run cron tasks explicitly from the command line. CLI and cron tasks could use a common or easily combinable interface, to make this a breeze.
Implementation suggestions
Most of cron.php and the cron parts of page_footer need to be factored out into a separate class. There needs to be an interface for tasks and implementations of the current existing tasks. There needs to be a naming convention for tasks to allow automatically loading them.
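One possible shape for the task interface and the naming convention is sketched below. Everything here is an assumption made for illustration: the interface name, the cron_task_ class prefix, the task_<name>.php file pattern and the helper function are not taken from phpBB.

```php
<?php
// Hypothetical interface and naming convention, for illustration only.
interface example_cron_task_interface
{
    public function should_run();
    public function run();
}

// Convention assumed here: task "foo" lives in class cron_task_foo,
// in the file <root>task_foo.php.
function example_load_task($task_name, $root)
{
    // Only allow simple names, so a task name can never escape the directory.
    if (!preg_match('/^[a-z0-9_]+$/', $task_name)) {
        return null;
    }

    $class = 'cron_task_' . $task_name;
    $file = $root . 'task_' . $task_name . '.php';

    if (!class_exists($class, false) && is_file($file)) {
        require_once $file;
    }

    // Refuse anything that does not implement the task interface.
    if (!class_exists($class, false) || !is_subclass_of($class, 'example_cron_task_interface')) {
        return null;
    }

    return new $class();
}
```

Because the class and file names are derived from the task name alone, adding a task means dropping one file into the directory; neither cron.php nor page_footer() would need to change.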
Because checking whether a cron task needs to be run may result in some overhead (loading all the task classes, running the checks) it may be worthwhile to only do it every n pageloads or so.
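A minimal sketch of the "every n pageloads" idea, assuming some pageload counter is available to the footer code (the function name and the counter are hypothetical):

```php
<?php
// Hypothetical helper: decides whether this pageload should pay the cost
// of loading the task classes and running their checks.
function example_should_check_cron($pageload_number, $every_n)
{
    if ($every_n <= 1) {
        // Throttling disabled: check on every pageload.
        return true;
    }
    return $pageload_number % $every_n === 0;
}
```

If keeping a counter is inconvenient, a stateless alternative would be to check only when mt_rand(1, $every_n) returns 1, which gives roughly the same rate on average.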
Ideally a task would get its dependencies injected, decoupling it completely from any global state. Possible dependencies include cache, db, paths, config. Doing so would allow a reduced bootstrapping, which is good for performance and non-web environments such as CLI. Not sure what's the best way to implement this, though. The task could have a method returning a list of its dependencies, the caller would have a mapping for those strings and inject the instance. Or perhaps some kind of annotations (reading docblocks with reflection, getDocComment) would be plausible.
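The first option (a method returning a list of dependency names, with the caller holding the mapping) could look roughly like this. The class, method and function names are invented for the example, and plain stdClass objects stand in for the real db/config/cache instances.

```php
<?php
// Hypothetical task that declares what it needs by name.
class example_prune_task
{
    private $deps = array();

    public function get_dependencies()
    {
        return array('db', 'config');
    }

    public function inject(array $deps)
    {
        $this->deps = $deps;
    }

    public function run()
    {
        return 'prune using ' . implode(', ', array_keys($this->deps));
    }
}

// The caller owns the mapping from names to instances and hands the task
// only what it asked for.
function example_inject_dependencies($task, array $available)
{
    $wanted = array();
    foreach ($task->get_dependencies() as $name) {
        if (!array_key_exists($name, $available)) {
            throw new InvalidArgumentException("Unknown dependency: $name");
        }
        $wanted[$name] = $available[$name];
    }
    $task->inject($wanted);
    return $task;
}
```

A CLI or test environment can then pass a much smaller $available array (or fake objects) without touching the task code.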
Here's an (attempted) UML diagram of how this could look, using the command pattern.
Of course it would need to be adjusted to phpBB3 naming conventions. Note that the shouldRun() method is static.
Last edited by igorw on Wed Apr 14, 2010 7:43 pm, edited 2 times in total.
Re: [RFC] Modular cron
Sounds like a good idea to me as a MOD Developer and Web Host.
Re: System cron and cron tasks in mods
Well, it is questionable whether the board install I've been working on will ever launch, so I don't know whether I will have the motivation to finish this.
I put the code I have written so far up here: http://github.com/p/phpbb3/compare/git- ... ystem-cron
It is definitely incomplete. Parameter passing needs to be improved in some of the more tricky cases, and I think some task definitions are missing.
One of my goals was to avoid running more database queries than the current implementation does. I believe only a single query has been added so far, in one task. The flip side is that parameter passing has to be more convoluted, to allow passing data between the function that checks whether a task needs to be run and the function that actually does the work.
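The trade-off described above can be sketched as follows. This is not the code from the linked branch; the class, the method names and the forum data are hypothetical. The idea is that the check returns whatever it has already looked up, so the work function does not have to query for it again.

```php
<?php
// Hypothetical task: check() hands its findings on to run().
class example_prune_forum_task
{
    // Returns null when there is nothing to do, otherwise the data
    // gathered while checking.
    public function check(array $forums, $now)
    {
        $due = array();
        foreach ($forums as $forum) {
            if ($forum['prune_next'] <= $now) {
                $due[] = $forum['forum_id'];
            }
        }
        return $due ? array('forum_ids' => $due) : null;
    }

    // Receives the parameters produced by check().
    public function run(array $params)
    {
        return 'pruned forums: ' . implode(',', $params['forum_ids']);
    }
}

function example_run_if_needed($task, array $forums, $now)
{
    $params = $task->check($forums, $now);
    return $params === null ? null : $task->run($params);
}
```

In the web-triggered mode the parameters would additionally have to survive the trip through the generated html, which is presumably where most of the convolution comes from.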
Re: [RFC] Modular cron
I did some work in this direction: viewtopic.php?f=4&t=32454
A modular cron system is needed by mod authors. A cron system providing non-web interface is needed by system administrators.
eviL3 wrote: Because checking whether a cron task needs to be run may result in some overhead (loading all the task classes, running the checks) it may be worthwhile to only do it every n pageloads or so.
Admins that care about performance would not (want to) run cron in "web mode" anyway.
The diagram splits cron and cli modes; what is the difference between them?
Re: [RFC] Modular cron
Now that you mention it, I remember the topic. I must have forgotten about it.
nn- wrote: The diagram splits cron and cli modes; what is the difference between them?
Good question. They can be merged.
Re: System cron and cron tasks in mods
It's a good start, thanks for sharing. I like your straight-forward approach.
It would be nice if the tasks were more extensible/generic, with one file per task, either implementing or extending an interface or base class. The use of globals and constants should be reduced/avoided, since it makes it harder to use in a different (cli/test) environment. Perhaps CRON_ID can be defined outside of the cron_lock class or replaced with a static property (which isn't great either, but at least more namespaced).
After reading my cron RFC, you probably knew I'd suggest these things.