SQL

From BC$ MobileTV Wiki

Structured Query Language (commonly abbreviated as SQL) is a database computer language, first developed in the 1970s, that is used to issue commands and run queries against the data stored in a database. [1]




Specification

[2]


Schema

See: SQL Schema


Queries

The most common operation in SQL databases is the query, which is performed with the declarative SELECT keyword. SELECT retrieves data from a specified table, or multiple related tables, in a database. While often grouped with Data Manipulation Language (DML) [3] statements, the standard SELECT query is considered separate from SQL DML, as it has no persistent effects on the data stored in a database. Note that there are some platform-specific variations of SELECT that can persist their effects in a database, such as the SELECT INTO syntax that exists in some databases.[4]

SQL queries let the user describe the desired result set, leaving it to the Database Management System (DBMS) to plan, optimize, and perform the physical operations needed to produce that result set as efficiently as possible. The list of columns to include in the final result immediately follows the SELECT keyword. An asterisk ("*") can be used as a "wildcard" to specify that all available columns of a table (or of multiple tables) are to be returned. SELECT is the most complex statement in SQL, with several optional keywords and clauses, including:

  • The FROM clause indicates the source table or tables from which the data is to be retrieved. The FROM clause can include optional JOIN clauses to join related tables to one another based on user-specified criteria.
  • The WHERE clause includes a comparison predicate, which is used to restrict the number of rows returned by the query. The WHERE clause is applied before the GROUP BY clause. The WHERE clause eliminates all rows from the result set where the comparison predicate does not evaluate to True.
  • The GROUP BY clause is used to combine, or group, rows with related values into elements of a smaller set of rows. GROUP BY is often used in conjunction with SQL aggregate functions or to eliminate duplicate rows from a result set.
  • The HAVING clause includes a comparison predicate used to eliminate rows after the GROUP BY clause is applied to the result set. Because it acts on the results of the GROUP BY clause, aggregate functions can be used in the HAVING clause predicate.
  • The ORDER BY clause is used to identify which columns are used to sort the resulting data, and in which order they should be sorted (options are ascending or descending). The order of rows returned by an SQL query is never guaranteed unless an ORDER BY clause is specified.
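
The way these clauses work together can be sketched with Python's built-in sqlite3 module. This is only an illustration: the publisher column and the sample rows are assumptions made for the example, not part of the books table described elsewhere on this page.

```python
import sqlite3

# Hypothetical sample data; the publisher column is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, publisher TEXT, price REAL)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [
        ("The Joy of SQL", "Acme", 120.0),
        ("Pitfalls of SQL", "Acme", 150.0),
        ("How SQL Saved my Dog", "Acme", 20.0),
        ("SQL Examples and Guide", "Binary Press", 110.0),
        ("How to use Wikipedia", "Binary Press", 15.0),
    ],
)

query = """
SELECT publisher, COUNT(*) AS expensive_books
FROM books
WHERE price > 100.00      -- filters individual rows before grouping
GROUP BY publisher        -- collapses the remaining rows to one per publisher
HAVING COUNT(*) >= 2      -- filters groups, so an aggregate is allowed here
ORDER BY publisher        -- makes the order of the result rows deterministic
"""
rows = conn.execute(query).fetchall()
print(rows)
```

With this data only one publisher has at least two books priced above 100.00, so the HAVING clause removes the other group even though one of its rows passed the WHERE filter.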


Functions


EXAMPLES

SELECT

The following is an example of a SELECT query that returns a list of expensive books. The query retrieves all rows from the books table in which the price column contains a value greater than 100.00. The result is sorted in ascending order by title. The asterisk (*) in the select list indicates that all columns of the books table should be included in the result set.

SELECT * 
FROM books
WHERE price > 100.00
ORDER BY title;


JOIN

The example below demonstrates the use of multiple tables in a join, grouping, and aggregation in an SQL query, by returning a list of books and the number of authors associated with each book.

SELECT books.title, count(*) AS Authors
FROM books
JOIN book_authors 
ON books.isbn = book_authors.isbn
GROUP BY books.title;

Example output might resemble the following:

Title                   Authors
----------------------  -------
SQL Examples and Guide     3
The Joy of SQL             1
How to use Wikipedia       2
Pitfalls of SQL            1
How SQL Saved my Dog       1

(The underscore character "_" is often used as part of table and column names to separate descriptive words because other punctuation tends to conflict with SQL syntax. For example, a dash "-" would be interpreted as a minus sign.)

Under the precondition that isbn is the only common column name of the two tables and that a column named title only exists in the books table, the above query could be rewritten in the following form:

SELECT title, count(*) AS Authors
FROM books 
NATURAL JOIN book_authors 
GROUP BY title;

However, many vendors do not support NATURAL JOIN, and where it is supported it depends on the tables following suitable column naming conventions. It is therefore less common in practice.
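
SQLite is one engine that does accept NATURAL JOIN, so the equivalence of the two queries can be checked with a short sketch using Python's sqlite3 module. The author column and the sample rows are assumptions for the example, and an ORDER BY is added to both queries so the results can be compared directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (isbn TEXT PRIMARY KEY, title TEXT);
CREATE TABLE book_authors (isbn TEXT, author TEXT);  -- author is a hypothetical column
INSERT INTO books VALUES ('111', 'The Joy of SQL'), ('222', 'How to use Wikipedia');
INSERT INTO book_authors VALUES
  ('111', 'A. Author'), ('222', 'B. Writer'), ('222', 'C. Scribe');
""")

# Explicit join condition on the shared isbn column.
explicit = conn.execute("""
SELECT books.title, count(*) AS Authors
FROM books
JOIN book_authors
ON books.isbn = book_authors.isbn
GROUP BY books.title
ORDER BY books.title
""").fetchall()

# NATURAL JOIN matches on every column name the two tables share (here only isbn).
natural = conn.execute("""
SELECT title, count(*) AS Authors
FROM books
NATURAL JOIN book_authors
GROUP BY title
ORDER BY title
""").fetchall()

print(explicit == natural, explicit)
```

If a second column name were shared by both tables, the NATURAL JOIN would silently match on it as well, which is one reason the explicit form is usually preferred.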

Data retrieval is very often combined with data projection when the user is looking for calculated values and not just the verbatim data stored in primitive data types, or when the data needs to be expressed in a form that is different from how it's stored. SQL allows the use of expressions in the select list to project data, as in the following example which returns a list of books that cost more than 100.00 with an additional sales_tax column containing a sales tax figure calculated at 6% of the price.

SELECT isbn, title, price, price * 0.06 AS sales_tax
FROM books
WHERE price > 100.00
ORDER BY title;

[5]
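
The sales tax projection above can be tried out with Python's sqlite3 module. This is a minimal sketch: the ISBN values and prices are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (isbn TEXT, title TEXT, price REAL)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [
        ("111", "The Joy of SQL", 150.0),       # hypothetical sample rows
        ("222", "Pitfalls of SQL", 200.0),
        ("333", "How SQL Saved my Dog", 20.0),
    ],
)

cursor = conn.execute("""
SELECT isbn, title, price, price * 0.06 AS sales_tax
FROM books
WHERE price > 100.00
ORDER BY title
""")
# The AS alias becomes the name of the computed column in the result set.
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

for isbn, title, price, sales_tax in rows:
    print(f"{title}: price {price:.2f}, sales tax {sales_tax:.2f}")
```

The sales_tax column exists only in the result set; nothing is written back to the books table.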


Data manipulation

First, there are the standard Data Manipulation Language (DML) elements. DML is the subset of the language used to add, update and delete data:

  • INSERT is used to add rows (formally tuples) to an existing table, for example:
INSERT INTO my_table (field1, field2, field3) VALUES ('test', 'N', NULL);
  • UPDATE is used to modify the values of a set of existing table rows, for example:
UPDATE my_table SET field1 = 'updated value' WHERE field2 = 'N';
  • DELETE removes zero or more existing rows from a table, for example:
DELETE FROM my_table WHERE field2 = 'N';
  • MERGE is used to combine the data of multiple tables. It conditionally inserts or updates rows in a target table based on the rows of a source, combining the behavior of INSERT and UPDATE. It is defined in the SQL:2003 standard; prior to that, some databases provided similar functionality via different syntax, sometimes called an "upsert".
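
The INSERT, UPDATE and DELETE statements listed above can be run in sequence with Python's sqlite3 module to see how many rows each one affects. This is a sketch only: the column types and the second sample row are assumptions, and MERGE is left out because support for it varies between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (field1 TEXT, field2 TEXT, field3 TEXT)")

# INSERT adds rows; rowcount reports how many rows the statement affected.
inserted = conn.execute(
    "INSERT INTO my_table (field1, field2, field3) VALUES ('test', 'N', NULL)"
).rowcount
conn.execute(
    "INSERT INTO my_table (field1, field2, field3) VALUES ('keep', 'Y', NULL)"
)

# UPDATE changes only the rows matched by the WHERE clause.
updated = conn.execute(
    "UPDATE my_table SET field1 = 'updated value' WHERE field2 = 'N'"
).rowcount
after_update = conn.execute(
    "SELECT field1, field2 FROM my_table ORDER BY field2"
).fetchall()

# DELETE removes the matching rows and leaves the rest untouched.
deleted = conn.execute("DELETE FROM my_table WHERE field2 = 'N'").rowcount
remaining = conn.execute("SELECT field1, field2 FROM my_table").fetchall()

print(inserted, updated, deleted, remaining)
```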


Transaction controls

Transactions, if available, can be used to wrap around the DML operations:

  • START TRANSACTION (or BEGIN WORK, or BEGIN TRANSACTION, depending on SQL dialect) can be used to mark the start of a database transaction, which either completes entirely or not at all.
  • COMMIT causes all data changes in a transaction to be made permanent.
  • ROLLBACK causes all data changes since the last COMMIT or ROLLBACK to be discarded, so that the state of the data is "rolled back" to the way it was prior to those changes being requested.

Once the COMMIT statement has been executed, the changes cannot be rolled back. In other words, it is meaningless to execute ROLLBACK immediately after a COMMIT statement, and vice versa.

COMMIT and ROLLBACK interact with areas such as transaction control and locking. Strictly, both terminate any open transaction and release any locks held on data. In the absence of a START TRANSACTION or similar statement, the semantics of SQL are implementation-dependent. Example: A classic bank transfer of funds transaction.

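START TRANSACTION;
  UPDATE ACCOUNTS SET AMOUNT = AMOUNT - 200 WHERE ACCOUNT_NUMBER = 1234;
  UPDATE ACCOUNTS SET AMOUNT = AMOUNT + 200 WHERE ACCOUNT_NUMBER = 2345;
-- If both updates succeeded, make the changes permanent:
COMMIT;
-- If either update failed, issue ROLLBACK instead of COMMIT to discard both.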
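
In practice the choice between COMMIT and ROLLBACK is usually made by the host program rather than in the SQL script itself. The following is a minimal sketch of the bank transfer using Python's sqlite3 module. The transfer helper, the lowercase table and column names, and the starting balances are assumptions for the example, and SQLite spells the opening statement BEGIN rather than START TRANSACTION.

```python
import sqlite3

# isolation_level=None disables the module's implicit transactions,
# so BEGIN, COMMIT and ROLLBACK are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (account_number INTEGER PRIMARY KEY, amount INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1234, 500), (2345, 100)")

def transfer(conn, source, target, amount):
    """Move amount from source to target; return True if the transfer was committed."""
    conn.execute("BEGIN")
    try:
        for delta, account in ((-amount, source), (amount, target)):
            cursor = conn.execute(
                "UPDATE accounts SET amount = amount + ? WHERE account_number = ?",
                (delta, account),
            )
            if cursor.rowcount != 1:  # treat a missing account as an error
                raise LookupError(f"no such account: {account}")
    except (sqlite3.Error, LookupError):
        conn.execute("ROLLBACK")  # discard the partial transfer
        return False
    conn.execute("COMMIT")  # make both updates permanent together
    return True

ok = transfer(conn, 1234, 2345, 200)      # both accounts exist
failed = transfer(conn, 1234, 9999, 50)   # target account is missing
balances = conn.execute(
    "SELECT account_number, amount FROM accounts ORDER BY account_number"
).fetchall()
print(ok, failed, balances)
```

In the failed transfer the debit from account 1234 has already been applied when the error is detected, and the ROLLBACK undoes it, so the balances reflect only the first transfer.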


Data definition

The second group of keywords is the Data Definition Language (DDL). DDL allows the user to define new tables and associated elements. Most commercial SQL databases have proprietary extensions in their DDL, which allow control over nonstandard features of the database system. The most basic items of DDL are the CREATE, ALTER, RENAME, TRUNCATE and DROP statements:

  • CREATE causes an object (a table, for example) to be created within the database.
  • DROP causes an existing object within the database to be deleted, usually irretrievably.
  • TRUNCATE deletes all data from a table (non-standard, but common SQL statement).
  • ALTER permits the user to modify an existing object in various ways, for example by adding a column to an existing table.

Example:

CREATE TABLE my_table (
  my_field1   INT,
  my_field2   VARCHAR (50),
  my_field3   DATE         NOT NULL,
  PRIMARY KEY (my_field1, my_field2) 
);
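
The effect of CREATE, ALTER and DROP can be observed with a short sketch using Python's sqlite3 module. The column_names helper and the my_field4 column are hypothetical names for the example, and PRAGMA table_info is a SQLite-specific way of listing a table's columns; other databases expose this information differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE my_table (
  my_field1   INT,
  my_field2   VARCHAR (50),
  my_field3   DATE         NOT NULL,
  PRIMARY KEY (my_field1, my_field2)
)
""")

def column_names(conn, table):
    # Each row returned by PRAGMA table_info describes one column; index 1 is its name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

before = column_names(conn, "my_table")

# ALTER modifies the existing table in place, here by adding a column.
conn.execute("ALTER TABLE my_table ADD COLUMN my_field4 INT")
after = column_names(conn, "my_table")

# DROP removes the table and its data from the database.
conn.execute("DROP TABLE my_table")
tables = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

print(before, after, tables)
```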


Data control

The third group of SQL keywords is the Data Control Language (DCL). DCL handles the authorization aspects of data and permits the user to control who has access to see or manipulate data within the database. Its two main keywords are:

  • GRANT authorizes one or more users to perform an operation or a set of operations on an object.
  • REVOKE removes or restricts the capability of a user to perform an operation or a set of operations.

Example:

GRANT SELECT, UPDATE ON my_table TO some_user, another_user;


BLOB

PIVOT


Other

  • ANSI-standard SQL supports a double dash, --, to start a single-line comment (some extensions also support curly brackets or C-style /* comments */ for multi-line comments).

Example:

SELECT * FROM inventory; -- Retrieve everything from the inventory table
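
Both comment styles can be tried in SQLite, which accepts the double dash as well as C-style comments. The following sketch uses Python's sqlite3 module; the columns and the sample row of the inventory table are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (item TEXT, quantity INTEGER)")
conn.execute("INSERT INTO inventory VALUES ('widget', 3)")

rows = conn.execute("""
SELECT *            -- single-line comment: runs to the end of the line
FROM inventory      /* C-style comment: can span
                       several lines */
""").fetchall()
print(rows)
```

The comments are ignored by the database, so the query behaves exactly like SELECT * FROM inventory.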





Tools

Validator


Resources


Tutorials


External Links

References

  1. wikipedia:SQL
  2. SQL Standards process/info: http://www.jcc.com/sql.htm
  3. Data Manipulation Language (DML)
  4. INTO Clause (Transact-SQL), SQL Server 2005 Books Online, Microsoft, 2007: http://msdn2.microsoft.com/en-us/library/ms188029(SQL.90).aspx (accessed 2007-06-17)
  5. Say No to Venn Diagrams When Explaining JOINs (in SQL): http://dzone.com/articles/say-no-to-venn-diagrams-when-explaining-joins
  6. Pivot tables in SQL Server. A simple sample: http://blogs.msdn.com/b/spike/archive/2009/03/03/pivot-tables-in-sql-server-a-simple-sample.aspx
  7. Pivoting Without Aggregation: http://sqlmag.com/t-sql/pivoting-without-aggregation
  8. Convert Rows to columns using 'Pivot' in SQL Server: http://stackoverflow.com/questions/15931607/convert-rows-to-columns-using-pivot-in-sql-server
  9. Which is fastest? SELECT SQL_CALC_FOUND_ROWS FROM `table`, or SELECT COUNT(*): https://stackoverflow.com/questions/186588/which-is-fastest-select-sql-calc-found-rows-from-table-or-select-count#188682

See Also

DB | DBMS | RDBMS | SPARQL