[DOC] Syntax Reference for the SQL Parser

Discussion on a SQL Parser for EasyMOD
Locked
markus_petrux
Registered User
Posts: 376
Joined: Fri Jun 18, 2004 10:58 pm
Location: Girona, Catalunya (Spain)
Contact:

[DOC] Syntax Reference for the SQL Parser

Post by markus_petrux »

Code: Select all

@package SQL Parser
 @version $Id&#58 syntax_reference_bbcode.txt,v 1.2 2005/09/27 19ᚸᛄ markus_petrux Exp $
 @copyright (c) 2005 phpBB Group
 @license http&#58//opensource.org/licenses/gpl-license.php GNU General Public License
  • Introduction

    This document describes the syntax that the SQL Parser for EasyMOD accepts. While it is largely MySQL based, not all options available in the MySQL syntax are supported. phpBB supports several SQL servers with its Database Abstraction Layer (DBAL) implementation, hence a minimal set of options available to perform DB operations needs to be defined. This is the main goal of the SQL Parser, that is provide support for as much options as possible while making sure all those options are portable.

    NOTE: These standards are under development and most importantly open to discussion. However, as being said, whatever is implemented should be portable to, at least, all SQL servers supported by phpBB.
  • Syntax Conventions

    This section describes the general rules applied to SQL statements and their components:
    • No explicit rule is applied to case. However, keywords and identifiers are case insensitive for almost all SQL servers. While the parser accepts any lettercase, we recommend use of keywords written in uppercase and identifiers in lowercase, for readability, as well as to prevent possible portability related issues. All SQL statements generated by the SQL Parser will use keywords in uppercase and identifiers written as found in the input. Read the Identifiers section for other rules applied to identifiers, though.
    • Syntax Diagrams and Tables used in this document adhere the following conventions:

      Code: Select all

      CONVENTION          USED FOR
      ----------------    ----------------------------------------
      UPPERCASE           SQL keywords.
      lowercase           User supplied elements such as SQL identifiers or constants.
      <label>             Name for a block of syntax. This convention is used to group
                          portions of lengthy blocks of syntax that can be used in more
                          than one place.
      | (vertical bar)    Separating optional syntax items within brackets or braces
                          Only one of the specified items can be choosen.
      [ ] (brackets)      Optional syntax items (brackets aren't part of the input).
      { } (braces)        Required syntax items (braces aren't part of the input).
      [,...]              The preceding item can be repeated n number of times
                          (separated by commas).
      [ ...]              The preceding item can be repeated n number of times
                          (separated by blanks).
    • SQL statements must be terminated with a semicolon (;). Note the semicolon is required, even if only one statement is specified or for the last one.
    • SQL statements can be splitted into several lines.
    • Several forms of comments are allowed (described below).
  • Indentifiers

    Table, index and column names are user supplied identifiers. To minimize portability related issues, all user supplied identifiers are subject to the following rules:
    Examples:

    Code: Select all

    forum_id                 -- Valid identifier!
    `default`                -- Invalid, it's a Reserver Keyword!
    default_id               -- Valid identifier!
    default-id               -- Invalid, the hyphen (-) is NOT a valid character!
    _default                 -- Valid identifier!
    1st_flag                 -- Invalid, starts with a number!
    first_flag               -- Valid identifier!
    this_is_a_very_very_long_identifier   -- Invalid, 35 characters in length!
  • Constants
    • String Constants:
      • Standard SQL string constants are arbitrary sequences of characters delimited by single quotes.

        ie. 'This is a standard SQL string constant'
      • Double quotes are used in SQL to quote identifiers, except MySQL, which uses backticks for this purpose. MySQL (therefore this SQL parser too) also accepts double quotes to delimit string constants.

        ie. "This is NOT standard, but still considered a string constant by MySQL, and so it is here"
      • Quotes used as string delimiters may be escaped by themselves or by backslashes.

        ie. 'Here''s an example' 'Here\'s another example'
      • Character set specifiers and collations are not supported.
    • Numeric Constants:

      Numeric constants include Intergers and Floating-Point Numbers. Numeric constants may or may not be enclosed in quotation marks. However, the SQL parser will remove the quotes when possible to prevent the SQL server from performing redundant and unnecessary type conversions. It is not always possible for the parser to know if a numeric constant should be un-quoted (for instance, when setting a default value in an ALTER statement), therefore it is recommended to not quote numbers.
      • Integer constants are represented by a sequence of numbers (0-9), optionally preceded by a hyphen (-) to indicate a negative value. Integers are accepted in the following format:

        Code: Select all

        [+-]digits
        Examples:

        Code: Select all

        123
        0
        -123
      • Floating-Point numbers are inexact, variable precision numeric types that are represented using scientific notation, accepted in the following format:

        Code: Select all

        [+-]digits[[.decimals][e[+-]exponent]]
        Examples:

        Code: Select all

        123
        123.45
        -123.45
        123.45e+10
        123.45e10
        123.45e-10
        123e10
    • Boolean Constants:
      • Boolean values can be specified as 1 (true), 0 (false), TRUE or FALSE (in any lettercase).
    • NULL Constants:
      • The NULL value means "no data". It can be written in any lettercase.
    • Hexadecimal Constants:
      • Not supported, yet.
    • Binary Constants:
      • Not supported, yet.
    * References about Constants:
    MySQL,
    PostgreSQL,
    Microsoft SQL Server.
  • Comments

    Simple SQL comments and C-style comments are allowed. Comments are removed from the input before parsing the SQL statements. Note comment markers (described below) inside of string constants should be (in fact, are) considered part of the constant.
    • Simple SQL comments:

      Simple SQL comments are introduced by two consecutive hyphens (--) or by a hash character (#). All input from (and including) any of these comment markers until the end of the line is ignored.

      The hash (#), when found on the first character of a line, is used by the MOD Template to delimit MOD Actions (note hashes preceded only by whitespaces may still be considered MOD action delimiters). However, you can still use the hash in the middle of a line or at end of a SQL statement to add comments.

      Examples:

      Code: Select all

      ALTER TABLE phpbb_users ADD user_flag1 SMALLINT;   # This is a SQL comment.
      ALTER TABLE phpbb_users ADD user_flag2 SMALLINT;   -- This is a SQL comment.
      #
      # This is NOT a SQL comment!
      # Note it will mark the end of the SQL action in the MOD Template!
    • C-style comments:

      C-style comments start with /* and end with */ and may affect multiple lines.

      Example:

      Code: Select all

      /*
         This is a multi line comment
      */
      ALTER TABLE phpbb_users ADD user_flag1 SMALLINT;
  • Data Types
    • The following data types are supported:

      Code: Select all

      DATA TYPE                   DESCRIPTION
      ----------------            ------------------------------
      BOOL                        See BOOLEAN
      BOOLEAN                     Logical boolean (true/false)
      TINYINT[(D)]                Very small integer
      SMALLINT[(D)]               Small integer
      MEDIUMINT[(D)]              Medium-size integer
      INT[(D)]                    See INTEGER
      INTEGER[(D)]                Normal-size integer
      BIGINT[(D)]                 Large integer
      FLOAT[(L[,D])]              Single precision floating-point number
      REAL[(L[,D])]               See FLOAT
      DOUBLE[(L[,D])]             See DOUBLE PRECISION
      DOUBLE PRECISION[(L[,D])]   Double precision floating-point number
      DEC[(L[,D])]                See DECIMAL
      DECIMAL[(L[,D])]            Exact fixed-point number
      FIXED[(L[,D])]              See DECIMAL
      NUMERIC[(L[,D])]            See DECIMAL
      CHAR[(S)]                   Fixed-size character string
      VARCHAR[(S)]                Variable-size character string
      BINARY[(S)]                 Fixed-size string of binary data
      VARBINARY[(S)]              Variable-size string of binary data
      TINYTEXT                    Unlimited-size character string
      TEXT                        Unlimited-size character string
      MEDIUMTEXT                  Unlimited-size character string
      LONGTEXT                    Unlimited-size character string
      TINYBLOB                    Binary large Object
      BLOB                        Binary large Object
      MEDIUMBLOB                  Binary large Object
      LONGBLOB                    Binary large Object
      (D) is an optional argument used to specify display width for integer data types.
      (S) is an optional argument used to specify field size, in chars or bytes depending on the data type.
      L in (L[,D]) is the total number of digits (including decimals).
      D in (L[,D]) is the total number of digits after the decimal point.
      Note (L[,D]) May also be specified for floating-point data types. However, it is not recommended since the meaning of these arguments may vary between SQL servers.
    • The following tables detail assumed storage sizes used for integer types:

      The column "DW" means "Display Width", which is the maximum recommended value to use as the (D) optional argument for integer types. This value is used by the parser to decide if an unsigned integer should be promoted to a higher size type when converting SQL statements for SQL servers that have no support for the unsigned attribute, such as PostgreSQL, MSSQL, etc.

    Code: Select all

    TYPE        BYTES        MINIMUM UNSIGNED   DW        MAXIMUM UNSIGNED   DW
    ---------   -----   ---------------------   --   ---------------------   --
    TINYINT       1                      -128    4                    +127    4
    SMALLINT      2                    -32768    6                  +32767    6
    MEDIUMINT     3                  -8388608    8                +8388607    8
    INTEGER       4               -2147483648   11             +2147483647   11
    BIGINT        8      -9223372036854775808   20    +9223372036854775807   20

    Code: Select all

    TYPE        BYTES          MAXIMUM SIGNED   DW
    ---------   -----   ---------------------   --
    TINYINT       1                      +255    4
    SMALLINT      2                    +65535    6
    MEDIUMINT     3                 +16777215    9
    INTEGER       4               +4294967295   11
    BIGINT        8     +18446744073709551615   21
    * Data Type References:
    MySQL,
    PostgreSQL,
    Microsoft SQL Server.
  • SQL Reference
    • CREATE TABLE
      • Syntax:

        Code: Select all

        CREATE TABLE table_name ({<column_definition>|<index_definition>}[,...]);
      • There must be, at least, one table definition element enclosed between parentheses. If more than one element is specified, they must be separated with commas.
      • Table attributes outside the parentheses are SQL server dependent, therefore are NOT accepted here.
      • <column_definition> and <index_definition> are described below.
      • For the table_name syntax, please refer to the Identifiers chapter.
      * CREATE TABLE References:
      MySQL,
      PostgreSQL,
      Microsoft SQL Server.
    • ALTER TABLE
      • Syntax:

        Code: Select all

        ALTER TABLE table_name ADD PRIMARY KEY (<index_key_definition>[,...]);
        
        ALTER TABLE table_name ADD {INDEX|UNIQUE} [index_name] (<index_key_definition>[,...]);
        
        ALTER TABLE table_name ADD [COLUMN] <column_definition>;
        
        ALTER TABLE table_name ALTER [COLUMN] column_name {SET DEFAULT constant|DROP DEFAULT};
        
        ALTER TABLE table_name CHANGE [COLUMN] old_col_name <column_definition>;
        
        ALTER TABLE table_name MODIFY [COLUMN] <column_definition>;
        
        ALTER TABLE table_name DROP [COLUMN] column_name;
        
        ALTER TABLE table_name DROP PRIMARY KEY;
        
        ALTER TABLE table_name DROP INDEX index_name;
      • For the table_name, index_name or column_name syntax, please refer to the Identifiers chapter.
      * ALTER TABLE References:
      MySQL,
      PostgreSQL,
      Microsoft SQL Server.
    • Syntax Elements
      • <column_definition>
        • Syntax:

          Code: Select all

          column_name data_type [UNSIGNED] [ZEROFILL] [BINARY] [NULL|NOT NULL] [DEFAULT constant|NULL] [AUTO_INCREMENT]
        • The following column attributes are accepted in any order:
          • UNSIGNED: Optional. For numeric types only.
          • ZEROFILL: Optional. For numeric types only. Ignored on non-MySQL servers.
          • BINARY: Optional. For CHAR or VARCHAR data types only.
          • NULL or NOT NULL: Optional. NOT NULL is required for primary keys.
          • DEFAULT: Optional.
          • AUTO_INCREMENT: Optional, If used, column must be indexed.
        • For the column_name syntax, please refer to the Identifiers chapter.
      • <index_definition>
        • Syntax:

          Code: Select all

          PRIMARY KEY (<index_key_definition>[,...])
          {KEY|INDEX} [index_name] (<index_key_definition>[,...])
          UNIQUE [INDEX] [index_name] (<index_key_definition>[,...])
        • For the index_name syntax, please refer to the Identifiers chapter.
      • <index_key_definition>
        • Syntax:

          Code: Select all

          column_name[(length)] [ASC|DESC]
        • The length attribute (next to the column name and enclosed between parentheses) is optional for CHAR, VARCHAR, BINARY and VARBINARY data types and required for TINYTEXT, TEXT, MEDIUMTEXT and LONGTEXT data types.
        • The order attribute is optional and defaults to ASC.
        • For the column_name syntax, please refer to the Identifiers chapter.
Last edited by markus_petrux on Sun Jan 01, 2006 5:13 am, edited 4 times in total.

markus_petrux
Registered User
Posts: 376
Joined: Fri Jun 18, 2004 10:58 pm
Location: Girona, Catalunya (Spain)
Contact:

Re: Syntax Reference for the SQL Parser

Post by markus_petrux »

The Syntax Conventions section has been updated.

Further updates on this document can be found in CVS:
http://cvs.sourceforge.net/viewcvs.py/e ... rser/docs/" target="_blank

Locked