[feature/mass-email-speed-control]

Note: We are moving the topics of this forum and it will be deleted at some point

Publish your own request for comments/change or patches for the next version of phpBB. Discuss the contributions and proposals of others. Upcoming releases are 3.2/Rhea and 3.3.
User avatar
naderman
Consultant
Posts: 1727
Joined: Sun Jan 11, 2004 2:11 am
Location: Karlsruhe, Germany
Contact:

Re: [feature/mass-email-speed-control]

Post by naderman » Fri Apr 23, 2010 10:55 pm

nn- wrote:Noone besides me cares?
I certainly care, but you seem to be handling this pretty well ;-)

peterkclee
Registered User
Posts: 7
Joined: Tue Apr 20, 2010 3:42 am

Re: [feature/mass-email-speed-control]

Post by peterkclee » Sat Apr 24, 2010 12:58 pm

nn- wrote: Further, you removed @ before flock calls. This means that if flock fails, which it could because it was disabled, or runs on file system that does not support locking (nfs), etc., users will get a nice dump of php complainage before the headers which could wreck whichever page they are on.
I had read the manual of flock() and fopen():
http://www.php.net/manual/en/function.flock.php
http://www.php.net/manual/en/function.fopen.php
By comparing them, I don’t think @ has any effect on flock().
flock() should return false if the operation fails.
If this is the case, code with @ sign is quite misleading.
nn- wrote: Locking is hard and doing it right requires knowing the theory and testing on real systems. You don't just remove parts of locking code that you did not write because you have no idea why they are there, and they could be very important in some environments. You also don't change locking code together with adding unrelated functionality in a single commit. First, such commits are nearly impossible to review. Second, if the changes turn out to break systems the locking changes cannot be reverted for debugging since they are intertwined with unrelated changes that would break functionality if touched.
When I found the original code does not properly handle the case that fopen()/flock() returns false, I think I should only trust the online php manual but not the original code.
If fact, I don’t think checking whether the file exists or fopen() succeeds is helpful, because another process could do the same thing and even complete and unlink the file!
I would believe that two process could fopen() the same file in “wb” mode, otherwise, there is no need to use flock() to get an exclusive lock! In fact, the online php manual contains the followings:

Note: Because flock() requires a file pointer, you may have to use a special lock file to protect access to a file that you intend to truncate by opening it in write mode (with a "w" or "w+" argument to fopen()).

If only one process could fopen() in “wb” mode at a time, I don’t think the above words are meaningful.

nn- wrote: 2. When analyzing the original code, I found that emails are not immediately sent as described in the original explanation string. In fact, emails are queued, and the queue would only be processed when “cron.php” is run.
If this is true it is a bug that should be fixed. Did you see branching in msg_email to use queue or send immediately?nn-
Sorry, I think that I made a wrong statement because of my wrong tests. When developing the feature, I had added another config variable for testing and later deleted that. It is very likely that I had not tried all value combinations for testing.
Sorry, that’s my mistake.

However, I think the new explanation string is ok. It explains how the setting affects the queue processing behavior. On the other hand, it sounds like a bug in the original code, if the user chooses not to send mass email immediately, but finds out the emails are sent immediately when package size is set to zero!
nn- wrote: Probably because config values are strings in the database.

Do you have examples of such problems?
For example, one may expect a variable storing an integer value, but the variable storing floating point number instead. Calculation using such value would give different result.

Code: Select all

$now = (int) time();
The above code could be rewritten using no cast. But when reading such code, all programmers know what would be the type of $now instantly without knowing / memorizing the output type of time().

For the $n issue, I don’t think comment like “$n is a temporary variable which I could not make a good name…” is useful. It is only a temporary variable, and the code showed how it is used.

Oleg
Posts: 1150
Joined: Tue Feb 23, 2010 2:38 am
Contact:

Re: [feature/mass-email-speed-control]

Post by Oleg » Sat Apr 24, 2010 1:29 pm

You are wrong.

Code: Select all

pie@snap test1 % cat >php.ini
[PHP]
disable_functions = flock
pie@snap test1 % php -c php.ini -a
Interactive mode enabled

<?php
$f=fopen('lock','w');
flock($f);
^D
Warning: flock() has been disabled for security reasons in - on line 3
pie@snap test1 % 
Until you quit being as stubborn as you have been so far I don't see any point in continuing here.

User avatar
A_Jelly_Doughnut
Registered User
Posts: 1780
Joined: Wed Jun 04, 2003 4:23 pm

Re: [feature/mass-email-speed-control]

Post by A_Jelly_Doughnut » Sat Apr 24, 2010 4:32 pm

First:

What problem does this solve? Yes, cron.php may run several times in an hour, but only if the admin sets it up that way. I guess there could be an issue with the current implementation (in phpBB3) if the web host limits the number of emails and the number of recipients per unit time separately (a configuration I don't know if I've ever heard of), but I'm also not seeing that this patch would fix that.

Second:
I find that nn-'s comments are mostly correct. Generally, any PHP function that tries to open a file will throw a warning or notice if the file does not exist. flock() is no different - its just that the manual does not mention it.

Locking files isn't pretty, I've FUBARed it myself a couple of times. Would a database-based solution be better? I'm not convinced that it would be.
A_Jelly_Doughnut

User avatar
Kellanved
Former Team Member
Posts: 407
Joined: Sun Jul 30, 2006 4:59 pm
Location: Berlin

Re: [feature/mass-email-speed-control]

Post by Kellanved » Sat Apr 24, 2010 5:33 pm

I would prefer to get rid of file-based locking alltogether; the cache or database layers are far more promising. The problem - actually a downright showstopper - with file-based locking is that it is not feasible for multi-server and cloud installs.
No support via PM.
Trust me, I'm a doctor.

User avatar
ToonArmy
Registered User
Posts: 335
Joined: Fri Mar 26, 2004 7:31 pm
Location: Bristol, UK
Contact:

Re: [feature/mass-email-speed-control]

Post by ToonArmy » Sun Apr 25, 2010 12:52 am

Kellanved wrote:I would prefer to get rid of file-based locking alltogether; the cache or database layers are far more promising. The problem - actually a downright showstopper - with file-based locking is that it is not feasible for multi-server and cloud installs.
I hope to move to database stored email queue in the future, not sure if i can manage it for 3.1.
Chris SmithBlogXMOOhlohArea51WikiNo support via PM/IM
Image

peterkclee
Registered User
Posts: 7
Joined: Tue Apr 20, 2010 3:42 am

Re: [feature/mass-email-speed-control]

Post by peterkclee » Sun Apr 25, 2010 1:50 am

nn- wrote:You are wrong.

Code: Select all

pie@snap test1 % cat >php.ini
[PHP]
disable_functions = flock
pie@snap test1 % php -c php.ini -a
Interactive mode enabled

<?php
$f=fopen('lock','w');
flock($f);
^D
Warning: flock() has been disabled for security reasons in - on line 3
pie@snap test1 % 
Until you quit being as stubborn as you have been so far I don't see any point in continuing here.
I did not know how to do the above testing.
Thanks for sharing.
However, I guess the result would be similar for any function (e.g. sizeof()).

peterkclee
Registered User
Posts: 7
Joined: Tue Apr 20, 2010 3:42 am

Re: [feature/mass-email-speed-control]

Post by peterkclee » Sun Apr 25, 2010 2:27 am

A_Jelly_Doughnut wrote:First:

What problem does this solve? Yes, cron.php may run several times in an hour, but only if the admin sets it up that way. I guess there could be an issue with the current implementation (in phpBB3) if the web host limits the number of emails and the number of recipients per unit time separately (a configuration I don't know if I've ever heard of), but I'm also not seeing that this patch would fix that.
I am using shared web hosting service.
The patch has been tested and applied on my website already.
My web hosting service has a limit for the number of email recipients per hour (150 per hour).
A_Jelly_Doughnut wrote: Second:
I find that nn-'s comments are mostly correct. Generally, any PHP function that tries to open a file will throw a warning or notice if the file does not exist. flock() is no different - its just that the manual does not mention it.
Really?
The manual stated that fopen() may generate E_WARNING error.
What error could be generated by flock()?
A_Jelly_Doughnut wrote: Locking files isn't pretty, I've FUBARed it myself a couple of times. Would a database-based solution be better? I'm not convinced that it would be.
I agree.
And I think it is better if all data are stored using database instead of using cache files.
But there are many frameworks (PHP and non-PHP) use cache files.
So, I would guess that using cache files has much better performance.

User avatar
A_Jelly_Doughnut
Registered User
Posts: 1780
Joined: Wed Jun 04, 2003 4:23 pm

Re: [feature/mass-email-speed-control]

Post by A_Jelly_Doughnut » Sun Apr 25, 2010 3:30 am

peterkclee wrote:
A_Jelly_Doughnut wrote:...
I am using shared web hosting service.
The patch has been tested and applied on my website already.
My web hosting service has a limit for the number of email recipients per hour (150 per hour).
Good to hear, but what is the new feature that is added, other than apparently forcing the cron interval for the email queue to one hour?
A_Jelly_Doughnut

peterkclee
Registered User
Posts: 7
Joined: Tue Apr 20, 2010 3:42 am

Re: [feature/mass-email-speed-control]

Post by peterkclee » Mon Apr 26, 2010 8:01 am

A_Jelly_Doughnut wrote:
peterkclee wrote:
A_Jelly_Doughnut wrote:...
I am using shared web hosting service.
The patch has been tested and applied on my website already.
My web hosting service has a limit for the number of email recipients per hour (150 per hour).
Good to hear, but what is the new feature that is added, other than apparently forcing the cron interval for the email queue to one hour?
Actually, I had tried more than one approach to limit the number of emails not exceeding the limit, including calculating and regulating the rate of sending emails, using and checking another config variable to effectively changing the queue processing interval to one hour, etc.
However, some of them are not easy to implement correctly, and most of them are not optimal solution.

For example, suppose the limit is 250 (5 x 50) emails/recipients:
1. If there are 300 (6 x 50) emails/recipients for a mass email operation, and the package size is set to 4, the original implementation could sent 200 (4 x 50) emails/recipients at time close to 00:10 and another 100 (2 x 50) emails/recipients at time close to 00:20, which is not allowed (300 > 250). The cron interval has to be changed to one hour or more to avoid the error!
2. If the cron/queue processing interval is (effectively) set to one hour, such implementation could not do two (or more) mass email operations in less than one hour even if sending all the emails at once is lower than the email speed limit (e.g. 150 (3 operations x 1 x 50) emails/recipients)!

The approach in my patch is simple: store the queue processing history with the queue data, and when processing the queue, calculate how many emails could be sent without exceeding the limit using the history data. This sounds stupid, but it is simple, and it gives the optimal performance because it would always try to send as many emails as possible!

Post Reply