Extending DBAL query builder for boolean generation

General discussion of development ideas and the approaches taken in the 3.x branch of phpBB. The current feature release of phpBB 3 is 3.3/Proteus.
Forum rules
Please do not post support questions regarding installing, updating, or upgrading phpBB 3.3.x. If you need support for phpBB 3.3.x please visit the 3.3.x Support Forum on phpbb.com.

If you have questions regarding writing extensions please post in Extension Writers Discussion to receive proper guidance from our staff and community.
User avatar
brunoais
Registered User
Posts: 964
Joined: Fri Dec 18, 2009 3:55 pm

Extending DBAL query builder for boolean generation

Post by brunoais »

The whole idea behind this is to improve DBAL's query builder by adding a boolean query builder using a structured stack of arrays.

This can be used everywhere in an SQL query where boolean operations are used.
Currently, this can only be used in the WHERE and the JOINs' ON clause.
Initially, I'm intending on trying it just on the WHERE clause.

The main and general use-cases are:
  1. To flexibly the build of the WHERE clause in SQL queries
  2. To ease, simplify and prevent errors when doing SQL query editing by phpBB's extensions
Why not...
  1. Doctrine dbal -> The issue with Doctrine dbal is that it's query builder is not ready for the 2nd major use case listed above. There is no way of altering an SQL query. If you want to alter something, you have to rebuild the whole SQL query. I also didn't find a way to know the currently build query except for getting the query built so far as a string.
  2. Linq -> I didn't know the assistance of Linq until today. From what I searched, not only it has the same issue as Doctrine, while also its interface is unnecessarily complex for the common folk who just wants to change a small amount of information.

Quick guide on how to use

This builder uses a tree-like information organization for the boolean comparisons.
Now, there are 3 types of arrays:
  1. A
  2. B
  3. C
(names are not thought yet)

The A type contains 3 elements:
Left hand, operator, right hand.
E.g.

Code: Select all

array('f.forum_id', '=', 1)
array('f.forum_id', '<>', 1)
array('f.forum_id', 'IN', array())
array('f.forum_id', 'IN', array(1,2,5,6,7))
array('f.forum_id', 'NOT_IN', array(1,2,5,6,7))
array('f.forum_id', 'IS', NULL)
 
The B type contains 5 elements:
Left hand, operator, sub query operator, sub query SELECT type, the sub query.

This is essentially a recursive call to sql_build_query() to parse that sub-tree which is the sub query.

Code: Select all

array('f.forum_id', '=', 'ANY', 'SELECT', array(
                    'SELECT' => array(/*...*/),
                    'FROM' => array(/*...*/),
                )
)

array('f.forum_id', '', 'IN', 'SELECT', array(
                    'SELECT' => array(/*...*/),
                    'FROM' => array(/*...*/),
                )
)
 

The C type which contains a boolean operator to apply to all other elements which are either A or B or C type:

Code: Select all

array('OR',
    array('t.forum_id', '=', 3),
    array('t.topic_type', '=', 0),
)

array('AND',
        array('t.forum_id', '=', 3),
        array('t.topic_type', '=', 0),
        array('t.topic_id', '>', 5),
        array('t.topic_poster', '<>', 5),
    ),


array('AND',
        array('t.forum_id', '=', 3),
        array('NOT',
            array('t.topic_type', '=', 0),
        ),
        array('t.topic_id', '>', 5),
        array('t.topic_poster', '<>', 5),
    ),
 
That last one produces:

Code: Select all

t.forum_id = 3
AND NOT ( t.topic_type = 0 )
AND t.topic_id > 5
AND t.topic_poster <> 5


Comparison examples
Here are some code examples of the "before" and "after" in order to fully use this feature.

Code: Select all

// Original
sql_build_query('SELECT',
    array(
        'SELECT'    => "COUNT(topic_id) AS num_topics",
        'FROM'        => array("phpbb_topics" => ''),

        'WHERE'        => 'topic_id IS NOT NULL',
    )

)

// Now
sql_build_query('SELECT',
    array(
        'SELECT'    => "COUNT(topic_id) AS num_topics",
        'FROM'        => array("phpbb_topics" => ''),

        'WHERE'        => 
                array('topic_id', 'IS NOT', 'NULL'),
    )

)
 
For the other examples, I'll shorten to show just the WHERE clause.

Code: Select all


'WHERE'        => 'forum_id = 1',

'WHERE'        => array('forum_id', '=', 1),

// Or, if you want to negate it:

'WHERE'        => 'NOT ( forum_id = 1)',

'WHERE'        => 
            array('NOT',
                array('forum_id', '=', 1),
            ),

 
Here's something more complex
source: https://github.com/phpbb/phpbb/blob/rel ... m.php#L277

Code: Select all

// original
" WHERE forum_id = 51
    AND (topic_last_post_time >= $min_post_time
        OR topic_type = " . POST_ANNOUNCE . '
        OR topic_type = ' . POST_GLOBAL . ')
    AND 1=1'


// Becomes
'WHERE'        => 
            array('AND',
                array('forum_id', '=', 51),
                array('OR',
                    array('topic_last_post_time', '>=', 1409240111),
                    array('topic_type', '=', POST_ANNOUNCE),
                    array('topic_type', '=', POST_GLOBAL),
                ),
                array('NOT',
                    array('AND',
                        array('topic_poster', '=', 48),
                        array(1, '=', 1),
                    ),
                ),
            )
 
Last example
Source: https://github.com/phpbb/phpbb/blob/rel ... m.php#L445

Code: Select all


// Original
'WHERE'        => '(t.forum_id = 3
                AND t.topic_type = 3) OR
            (' . sql_in_set('t.forum_id', array(1,2,3,4,5,6)) . '
                AND t.topic_type = 4)',
                
// Becomes
        'WHERE'        => 
            array('OR',
                array('AND',
                    array('t.forum_id', '=', 3),
                    array('t.topic_type', '=', 0),
                ),
                array('AND',
                    array('t.forum_id', 'IN', array(1,2,3,4,5,6)),
                    array('t.topic_type', '=', 0),
                ),
            ),
 

Feel free to discuss to your hearts content. Any constructive opinion is welcome.

ticket: https://tracker.phpbb.com/browse/PHPBB3-13652
PR: https://github.com/phpbb/phpbb/pull/3441
Last edited by brunoais on Sat Feb 28, 2015 12:39 pm, edited 3 times in total.

User avatar
DavidIQ
Customisations Team Leader
Customisations Team Leader
Posts: 1904
Joined: Thu Mar 02, 2006 4:29 pm
Location: Earth
Contact:

Re: Extending DBAL query builder for boolean generation

Post by DavidIQ »

Code: Select all

AND NOT ( t.topic_type = 0 )
I know these are just examples but would one ever do this? Is it even accepted as correct by SQL experts (DBAs) or do they consider it bad practice?
Image

User avatar
brunoais
Registered User
Posts: 964
Joined: Fri Dec 18, 2009 3:55 pm

Re: Extending DBAL query builder for boolean generation

Post by brunoais »

I would never do it. I don't think anyone should do it.
As a DB admin, I consider it a bad move. "<>" instead of "=" should be used instead of the "NOT" operator, IMO.
I just used that because I didn't have any good idea at the time :).

User avatar
Pony99CA
Registered User
Posts: 986
Joined: Sun Feb 08, 2009 2:35 am
Location: Hollister, CA
Contact:

Re: Extending DBAL query builder for boolean generation

Post by Pony99CA »

brunoais wrote:I would never do it. I don't think anyone should do it.
As a DB admin, I consider it a bad move. "<>" instead of "=" should be used instead of the "NOT" operator, IMO.
I just used that because I didn't have any good idea at the time :).
Perhaps, but suppose the condition inside the NOT was more complicated, like:

Code: Select all

AND NOT ( (t.topic_type = 0  OR blah <> whatever) AND (something else) )
Yes, you can use Boolean logic (like De Morgan's laws) to figure out the equivalent, but sometimes it's easier to NOT the entire thing.

Steve
Silicon Valley Pocket PC (http://www.svpocketpc.com)
Creator of manage_bots and spoof_user (ask me)
Need hosting for a small forum with full cPanel & MySQL access? Contact me or PM me.

User avatar
brunoais
Registered User
Posts: 964
Joined: Fri Dec 18, 2009 3:55 pm

Re: Extending DBAL query builder for boolean generation

Post by brunoais »

Yeah, and phpBB has some of those.
Regardless of that, this is supposedly ready for it. I found no limitations nor errors on that so far.

User avatar
brunoais
Registered User
Posts: 964
Joined: Fri Dec 18, 2009 3:55 pm

Re: Extending DBAL query builder for boolean generation

Post by brunoais »

I still have one doubt in this feature:
Should the use of the ternary (A type):

Code: Select all

array({left hand}, {operator}, {right hand})
Be enforced for each value operation element?
... Or should it be free to choose (i.e. it is OK to just use a string (1 array element) instead of those 3 array elements)?


One of the things I'm trying to expand in this is to also have it add the "?" automatically and then use prepared statements on the driver itself.
I've checked phpBB's code, nearly everywhere where the query builder is called, its result is directly used with the .query() method.
The only exception is in phpbb\search\fulltext_native->keyword_search(), which can be easily re-written.

If these changes get a positive response, I'm also planning on expanding to other features belonging to the SELECT, FROM, LEFT_JOIN, GROUP_BY and ORDER_BY.
Last edited by brunoais on Sat Feb 28, 2015 12:47 pm, edited 1 time in total.

User avatar
nickvergessen
Former Team Member
Posts: 733
Joined: Sun Oct 07, 2007 11:54 am
Location: Stuttgart, Germany
Contact:

Re: Extending DBAL query builder for boolean generation

Post by nickvergessen »

I think doing thinks like this is a step in the right direction.
Maybe you can extract the sql_build_query method into a new class because otherwise that code will just get even bigger.
Member of the Development-TeamNo Support via PM

User avatar
brunoais
Registered User
Posts: 964
Joined: Fri Dec 18, 2009 3:55 pm

Re: Extending DBAL query builder for boolean generation

Post by brunoais »

So make all the query builder thing into a class of its own inside the DB namespace?

User avatar
brunoais
Registered User
Posts: 964
Joined: Fri Dec 18, 2009 3:55 pm

Re: Extending DBAL query builder for boolean generation

Post by brunoais »

Extra idea:
Does it make sense to allow separating the CROSS JOIN WHERE clause conditions (conditionals that would be in the ON clause if it was an INNER JOIN) from the rest of the WHERE conditions?
I have 2 possible implementations I believe to be the best choices given the current situation.
  1. Place a new possible KEY in the php array that represents the query. All its content is merged with the WHERE clause with an "AND".
  2. Place a new possible KEY in the php array for the WHERE clause. All its content is merged with the WHERE clause at its root with an "AND".
This is perfectly backwards compatible. It, however, adds a new tool on how queries are built making it easier for extensions to search on it and modify it at will to add and remove tables from the FROM clause.

For example:

Code: Select all

$sql_ary = array(
    'SELECT'    => 'u.user_id',
    'FROM'        => array(
        'phpbb_users'        => 'u',
        'phpbb_user_group'    => 'ug',
        'phpbb_banlist'    => 'b',
    ),
    'JOIN_ON'     => array(
            'ug'    =>    array('u.user_id', '=', 'ug.user_id'),
            'b'        =>    array('u.user_id', '=', 'b.ban_userid'),
        ),
    'WHERE'        => array('OR',
        array('AND',
            array('ug.user_id', 'IN', array(1, 2, 3, 4)),
            array('ug.group_id', '=', 2),
        ),
        array('AND',
            array('ug.group_id', '=', 1),
            array('b.ban_id', 'IS_NOT', NULL),
        ),
    ),
);
 
Instead of how it is done now:

Code: Select all

$sql_ary = array(
    'SELECT'    => 'u.user_id',
    'FROM'        => array(
        'phpbb_users'        => 'u',
        'phpbb_user_group'    => 'ug',
        'phpbb_banlist'    => 'b',
    ),
    'WHERE'        => array('AND',
        array('u.user_id', '=', 'ug.user_id'),
        array('u.user_id', '=', 'b.ban_userid'),
        array('OR',
            array('AND',
                array('ug.user_id', 'IN', array(1, 2, 3, 4)),
                array('ug.group_id', '=', 2),
            ),
            array('AND',
                array('ug.group_id', '=', 1),
                array('b.ban_id', 'IS_NOT', NULL),
            ),
        ),
    ),
); 
As for which table to place in the KEY of that array, it should be the table that doesn't do the "glue" job in joining the tables.
While processing, all of them are joined using an 'AND'. In cases where an OR is required (I don't remember any, tbh), just use the WHERE clause.
Although here there's only 1 condition, it is part of the objective to allow it to be as complete as the WHERE clause specified above.



Alternatively (I prefer this less but it can be done too):

Code: Select all

    'WHERE'        => array('OR',
        array('AND',
            array('ug.user_id', 'IN', array(1, 2, 3, 4)),
            array('ug.group_id', '=', 2),
        ),
        array('AND',
            array('ug.group_id', '=', 1),
            array('b.ban_id', 'IS_NOT', NULL),
        ),
        'JOIN_ON'     => array('AND',
            array('u.user_id', '=', 'ug.user_id'),
            array('u.user_id', '=', 'b.ban_userid'),
        ),
    ), 
The JOIN_ON name can be anything else. That's just what popped to my mind.

Comments please.

User avatar
DavidIQ
Customisations Team Leader
Customisations Team Leader
Posts: 1904
Joined: Thu Mar 02, 2006 4:29 pm
Location: Earth
Contact:

Re: Extending DBAL query builder for boolean generation

Post by DavidIQ »

Not sure I like that because of the use of the keyword "JOIN". At first glance it looked like there was a join in addition to the tables under the SELECT statement. I think it might be clearer to determine what the query is doing how it is now.
Image

Post Reply