[RFC & Patch][Implemented] Coding Guidelines

Note: We are moving the topics of this forum and it will be deleted at some point

Publish your own request for comments/change or patches for the next version of phpBB. Discuss the contributions and proposals of others. Upcoming releases are 3.2/Rhea and 3.3.
User avatar
EXreaction
Registered User
Posts: 1555
Joined: Sat Sep 10, 2005 2:15 am

Re: [RFC & Patch] Coding Guidelines

Post by EXreaction »

Putting code in forums is just a problem because our code bbcode is not nice to the formatting. There are plenty of other sites that handle plain text code perfectly (like pastebin, even letting you tab while you are in the post message box). If we were to set it up that way there wouldn't be any problem with putting code in posts or copying them from posts.

User avatar
Highway of Life
Registered User
Posts: 1399
Joined: Tue Feb 08, 2005 10:18 pm
Location: I'd love to change the World, but they won't give me the Source Code
Contact:

Re: [RFC & Patch] Coding Guidelines

Post by Highway of Life »

naderman wrote:Easily happens that you end up doing something like this without the braces. The statement is always executed, even though the formatting makes it look like it will be conditional. That's part of what I meant about braces making it absolutely unambiguous.
Okay, that makes sense. I’m all for making it a bit more fool-proof. :)
naderman wrote:It has become much more common in PHP projects and it's part of the autoloading standard for PHP 5.3.
One issue with using CamelCase is that dynamic code generation becomes a problem.
i.e.

Code: Select all

$lang = array(
    'SOMETHING_FREAKY'    => 'foo',
    'PHPBB_ROCKS'        => 'bar',
);

foreach ($ary as $key => $value)
{
    $call = strtolower($key);

    if (method_exists(self::$plugin, $call))
    {
        self::$plugin->$call;
    }
} 
With CamelCase, you would have to use two functions:

Code: Select all

foreach ($ary as $key => $value)
{
    $call = str_replace('_', '', mb_convert_case($key, MB_CASE_TITLE, 'UTF-8'));

    if (method_exists(self::$plugin, $call))
    {
        self::$plugin->$call;
    }
} 
But then you’d be better off doing...

Code: Select all

function ConvertToCamelCase($string)
{
    return str_replace('_', '', mb_convert_case($key, MB_CASE_TITLE, 'UTF-8'));
}
// [...]
foreach ($ary as $key => $value)
{
    $call = ConvertToCamelCase($string);

    if (method_exists(self::$plugin, $call))
    {
        self::$plugin->$call;
    }
} 
But in the case of 'SOME_ABR_METHOD' it simply could not be done if you use CamelCase for Identifiers. (because it would have to be SomeABRMethod instead of the lowercase some_abr_method)
The benefit of using lower_case (with underscores) for dynamic code generation is that it's easy to convert any uppercase, CamelCase or lowercase string into a method/function call by simply using strtolower().

One of my pet peeves is code you can’t read or understand by quickly glancing over it. This is the reason I started using phpBB in the first place, because it had an easy-to-understand structure, you could glance at the code and understand what it was doing (there are exceptions of course, especially in the posting structure and MCP)

I have spoken with a few developers about CamelCase versus lower_case, and the only real consensus that I could come up with for why CamelCase was better is from Java(Script), C++, etc developers who said it is because that is how the core methods are written and because languages like Java don't have a visual differentiation between variables and method calls.
i.e. 'someidentifier' could be either a variable or a method call, so to simplify this, SomeIdentifier is a method, and 'someidentifier' (or some_identifier) is a variable.
But in the case of PHP, that is not necessary, because there is a difference between functions, methods and variables.
some_function(), $foo->some_method(), and $some_variable. You don’t need CamelCase to tell which is which.
jQuery solved this in JavaScript by prepending variables with $.

In the PHP Core, most function names use underscore naming conventions, and most classes use CamelCase naming conventions, but if we go to using strictly CamelCase for everything, that could become more than a little confusing for a lot of developers.

But for readability, there is also good reason to use lowercase instead of CamelCase;
Example 1: Variable name

Code: Select all

$SomeInstanceVariable = 'foo'; 

Code: Select all

$some_instance_variable = 'foo'; 
So whats the big deal? Well when you use underscore naming conventions you will find it easier to write more descriptive variable names without having to truncate or abbreviate parts of the name because of how complex long camel case names look.

Example 2: Method Name

Code: Select all

public function GenerateTextForDisplay(object $object, array $data)
{
    return;
} 

Code: Select all

public function generate_text_for_display(object $object, array $data)
{
    return;
} 
It may just be me but I feel with the underscores you can easily read exactly what the method does without getting hung up in the alternating Case.

Example 3: Forced uppercase words

Code: Select all

$MovieClip = ContainerMCGroup::MovieClip(); 

Code: Select all

$movie_clip = container_mc_group::movie_clip(); 
That looks really hard to read. With the underscore version you don’t have to force any part of the variable name to be upper or lower case just to make it fit the CamelCase naming convention.
Last edited by Highway of Life on Wed Jan 20, 2010 11:04 pm, edited 1 time in total.
Image

User avatar
EXreaction
Registered User
Posts: 1555
Joined: Sat Sep 10, 2005 2:15 am

Re: [RFC & Patch] Coding Guidelines

Post by EXreaction »

Indeed, IfindCamelCaseHarderToRead_than_what_we_use_now.

User avatar
Highway of Life
Registered User
Posts: 1399
Joined: Tue Feb 08, 2005 10:18 pm
Location: I'd love to change the World, but they won't give me the Source Code
Contact:

Re: [RFC & Patch] Coding Guidelines

Post by Highway of Life »

EXreaction wrote:Indeed, IfindCamelCaseHarderToRead_than_what_we_use_now.
This is because in English, you type words using a space, and when reading a sentence that uses underscores, the eye identifies it as a separation of words, it identifies it as a space. With CamelCase, the eye has a harder time identifying where words begin and end, it has to rely on the case identification in words, which because difficult to do the longer the string is.
If you are Welsh, we can’t help you. :P

Examples:
InCamelCaseYouMightHaveAHarderTimeIdentifyingWhereMyWordsBeginAndEnd.
but_using_lower_case_makes_it_easier_on_the_eyes_to_see_where_words_begin_and_end_even_if_the_string_is_longer.

Longer CamelCase strings _might_ be more prone to typos, or for example when the Modifications Team has to validate MODs, it could be easier to see a problem in a string that is lowercase rather than CamelCase.

Just my 2¢ :)
Image

User avatar
naderman
Consultant
Posts: 1727
Joined: Sun Jan 11, 2004 2:11 am
Location: Berlin, Germany
Contact:

Re: [RFC & Patch] Coding Guidelines

Post by naderman »

The code generation thing seems rather fabricated to me. To me camel case and underscores are equally readable, I guess it's just a question of how used to it you are. PHP's traditional functions are based on C naming conventions, new parts like the SPL are entirely camel case. And as I said I think following a PHP project interoperability standard outweighs all of this.

User avatar
naderman
Consultant
Posts: 1727
Joined: Sun Jan 11, 2004 2:11 am
Location: Berlin, Germany
Contact:

Re: [RFC & Patch] Coding Guidelines

Post by naderman »

After hearing some arguments on the tabs vs. spaces thing, and the fact that more developers seem to prefer tabs than spaces, I'm going to change that. Afterwards I'll merge the guidelines into master.

User avatar
naderman
Consultant
Posts: 1727
Joined: Sun Jan 11, 2004 2:11 am
Location: Berlin, Germany
Contact:

Re: [RFC & Patch] Coding Guidelines

Post by naderman »

Alright it has been updated to use tabs. All code has been updated to follow the new guidelines. I will now move this RFC to Implemented. Of course making changes later will still be possible.

User avatar
Highway of Life
Registered User
Posts: 1399
Joined: Tue Feb 08, 2005 10:18 pm
Location: I'd love to change the World, but they won't give me the Source Code
Contact:

Re: [RFC & Patch] Coding Guidelines

Post by Highway of Life »

naderman wrote:The code generation thing seems rather fabricated to me.
Well, a basic example. ;) And you’re correct, the term I used: "code generation" is fabricated, I don’t know the correct name for it when performing this kind of code manipulation.
e.g.:

Code: Select all

$string = 'foo';
$$string = 'bar';
echo $foo;

$obj_string = 'SomeClass';
$object = new $obj_string;

$method = 'getData';
$object->$method(); 
phpBB3 uses this (The DBAL, ACM, modules, for starters), and in most of the large projects that I’ve worked on I’ve had to do this, as I’m sure most of you have as well. :) ... I just don’t know the correct term for it.
naderman wrote:PHP's traditional functions are based on C naming conventions, new parts like the SPL are entirely camel case. And as I said I think following a PHP project interoperability standard outweighs all of this.
But not all of them are! Take a look at the SPL Functions as an example. Not "All Iterators" are CamelCase, only Class names and methods are lower camel case. Making everything CamelCase is actually not consistent with PHP's coding standards.

PHP's rules state:
PHP Userland Naming Rules wrote:Function names use underscores between words, while class names use the camel case rule (there are some exceptions for older classes and functions).
And in the PHP Coding Standards:
PHP Coding Standards Rules wrote:5. Variable names should be in lowercase. Use underscores to separate between words.

6. Method names follow the 'studlyCaps' (also referred to as 'bumpy case' or 'camel caps') naming convention, with care taken to minimize the letter count. The initial letter of the name is lowercase, and each letter that starts a new 'word' is capitalized::

Good:
'connect()'
'getData()'
'buildSomeWidget()'

Bad:
'get_Data()'
'buildsomewidget'
'getI()'

7. Classes should be given descriptive names. Avoid using abbreviations where possible. Each word in the class name should start with a capital letter, without underscore delimiters (CampelCaps starting with a capital letter). The class name should be prefixed with the name of the 'parent set' (e.g. the name of the extension)::

Good:
'Curl'
'FooBar'

Bad:
'foobar'
'foo_bar'
Yes, I’m not a huge fan of being all CamelCasey (even though I’ve had to do it a lot), but if we do have to switch to CamelCase, might we be consistent with the PHP coding standards? pweeze? :D
Image

User avatar
naderman
Consultant
Posts: 1727
Joined: Sun Jan 11, 2004 2:11 am
Location: Berlin, Germany
Contact:

Re: [RFC & Patch][Implemented] Coding Guidelines

Post by naderman »

I would say those coding guidelines are not too consistent themselves. And since the goal is to write proper object oriented code I would say it doesn't really matter for us. The reason they use those rules is that they want to stay consistent with what they already have. The rule about variables is actually one for the C code. Either way I think rather than discussing this any further we should focus on something more productive.

code reader
Registered User
Posts: 653
Joined: Wed Sep 21, 2005 3:01 pm

Re: [RFC & Patch][Implemented] Coding Guidelines

Post by code reader »

two things:
1) tabs vs. spaces:
if you want to use tabs, it requires discipline, especially when you need to break a long line for readability:
e.g., let's say you have a line that looks like so:

Code: Select all

'TAB_TAB'if (($some variable_with a long_name > 3 && $some variable_with a long_name < 17 && $today_is _tueseday) || $next_week_i_will_get_my_annual_bonus || $your_name == "AL")
 
such a line needs to be broken.
when breaking it, you should *not* use:

Code: Select all

'TAB_TAB'if (($some variable_with a long_name > 3 && $some variable_with a long_name < 17 && $today_is _tueseday) || 
'TAB_TAB_TAB'$next_week_i_will_get_my_annual_bonus || $your_name == "AL")
 
but rather:

Code: Select all

'TAB_TAB'if (($some variable_with a long_name > 3 && $some variable_with a long_name < 17 && $today_is _tueseday) || 
'TAB_TAB'"___"$next_week_i_will_get_my_annual_bonus || $your_name == "AL")
 
(i used _ to denote space)
in other words: <TAB> for indetation, and <sp> for tabulation when breaking long lines, otherwise things are completely messed when breaking long lines and different people use different TAB value.

since it isn't reasonable to expect coders to use this much discipline, it makes sense to avoid tabs altogether (of course, if long lines are never broken my argument is void, but this present even worse issues of readability)




2) braces:
naderman wrote:

Code: Select all

if ($x > 3 && $x < 7) echo $x; $x = substr(foo($x), 5);  
Easily happens that you end up doing something like this without the braces. The statement is always executed, even though the formatting makes it look like it will be conditional. That's part of what I meant about braces making it absolutely unambiguous.
i don't think this is a good example or argument: the only place it should be even permissible to place more than one statement in a line is chain-assignment, e.g.
$a = $b = $c = array();
or such. for these cases the question of braces is moot.

as to the case of adding a second statement to a block and forgetting to add braces:
this can only happen if one is sloppy with indentation, which is a mortal sin and is much more important than the question of braces/no braces.

i do not think that the argument of "less braces leads to programming errors" holds water.
there is no data to support such claims. this is just a red herring.

the only real question is readability.
personally, my opinion is that the fact that having all those noise-braces in the code cause you to view less real code on the screen is detrimental to readability. i prefer to have less lines. look at this example:

Code: Select all

if (condition)
    statement1;
else
    statement2;

if (condition_2) {
    statement3;
    statement4;
} else {
    statement5;
    statement6;
}
 
and compare:

Code: Select all

if (condition)
{
    statement1;
}
else
{
    statement2;
}

if (condition_2) 
{
    statement3;
    statement4;
} 
else 
{
    statement5;
    statement6;
}
 
in my humble opinion, the first version is more readable, and by consuming less screen real estate it allows yo to see more code on the screen at one time: less scrolling, better understanding.

i can accept (though it's hard for me to understand) that some people view the 2nd way as more readable.

there is also the hard-to-argue issue of keeping existing guidelines because everybody is used to it. this is the only argument i can actually agree with (and note that the same argument holds for keeping the 'c' style of variable names: all lowercase with underscores, rather switch to CameCase).



peace.

Post Reply