Tuesday, September 14, 2010

Combining queries: the union operator

The union operator combines the results of two or more queries into a single result set. The Transact-SQL extension to union allows you to:
  • Use union in the select clause of an insert statement.
  • Specify new column headings in the order by clause of a select statement when union is present in the select statement.
The syntax of the union operator is as follows:
query1
[union [all] queryN ] ...
[order by clause]
[compute
clause]
where query1 is:
select select_list[into clause]
[from clause]
[where clause]
[group by clause]
[having clause]
and queryN is:
select select_list[from clause]
[where clause]
[group by clause]
[having clause]
Figure 3-1: Union combining queriesrasterFor example, suppose you have the following two tables containing the data shown:
The following query creates a union between the two tables:
create table T1 (a char(4), b int)
insert T1 values ("abc", 1)
insert T1 values ("def", 2)
insert T1 values ("ghi", 3)
create table T2 (a char(4), b int)
insert T2 values ("ghi", 3)
insert T2 values ("jkl", 4)
insert T2 values ("mno", 5)
select * from T1 
union
select * from T2
a     b
---- ---------
abc 1
def 2
ghi 3
jkl 4
mno 5

(5 rows affected)
By default, the union operator removes duplicate rows from the result set. Use the all option to include duplicate rows. Notice also that the columns in the result set have the same names as the columns in T1. You can use any number of union operators in a Transact-SQL statement. For example:
x union y union z 
By default, Adaptive Server evaluates a statement containing union operators from left to right. You can use parentheses to specify a different evaluation order.
For example, the following two expressions are not equivalent:
x union all (y union z)

(x union all y) union z
In the first expression, duplicates are eliminated in the union between y and z. Then, in the union between that set and x, duplicates are not eliminated. In the second expression, duplicates are included in the union between x and y, but are then eliminated in the subsequent union with z; all does not affect the final result of this statement.

Guidelines for union queries

When you use union statements:
  • All select lists in the union statement must have the same number of expressions (such as column names, arithmetic expressions, and aggregate functions). The following statement is invalid because the first select list is longer than the second:
    create table stores_east
    (stor_id char(4) not null,
    stor_name varchar(40) null,
    stor_address varchar(40) null,
    city varchar(20) null,
    state char(2) null,
    country varchar(12) null,
    postalcode char(10) null,
    payterms varchar(12) null)
    select stor_id, city, state from stores 
    union
    select stor_id, city from stores_east
    drop table stores_east
  • Corresponding columns in all tables, or any subset of columns used in the individual queries, must be of the same datatype, or an implicit data conversion must be possible between the two datatypes, or an explicit conversion should be supplied. For example, a union is not possible between a column of the char datatype and one of the int datatype, unless an explicit conversion is supplied. However, a union is possible between a column of the money datatype and one of the int datatype. See union and "Datatype Conversion Functions" in the Reference Manual for more information about comparing datatypes in a union statement.
  • You must place corresponding columns in the individual queries of a union statement in the same order, because union compares the columns one to one in the order given in the query. For example, suppose you have the following tables:
    Figure 3-2: Union comparing columnsrasterThis query:
    select a, b from T3 
    union
    select b, a from T4
    produces:
    a          b
    --------- ---
    1 abc
    2 def
    3 ghi

    (3 rows affected)
    The following query results in an error message, because the datatypes of corresponding columns are not compatible:
    select a, b from T3 
    union
    select a, b from T4
    drop table T3
    drop table T4
    When you combine different (but compatible) datatypes such as float and int in a union statement, Adaptive Server converts them to the datatype with the most precision.
  • Adaptive Server takes the column names in the table resulting from a union from the first individual query in the union statement. Therefore, to define a new column heading for the result set, do so in the first query. In addition, to refer to a column in the result set by a new name, for example, in an order by statement, refer to it in that way in the first select statement.
    The following query is correct:
    select Cities = city from stores 
    union
    select city from authors
    order by Cities

Using union with other Transact-SQL commands

When you use union statements with other Transact-SQL commands:
  • The first query in the union statement may contain an into clause that creates a table to hold the final result set. For example, the following statement creates a table called results that contains the union of tables publishers, stores, and salesdetail:
    use master
    sp_dboption pubs2, "select into", true
    use pubs2
    checkpoint
    select pub_id, pub_name, city into results 
    from publishers
    union
    select stor_id, stor_name, city from stores
    union
    select stor_id, title_id, ord_num from salesdetail
    You can use the into clause only in the first query; if it appears anywhere else, you get an error message.
  • You can use order by and compute clauses only at the end of the union statement to define the order of the final results or to compute summary values. You cannot use them within the individual queries that make up the union statement.
  • You can use group by and having clauses within individual queries only; you cannot use them to affect the final result set.
  • You can also use the union operator within an insert statement. For example:
    create table tour (city varchar(20), state char(2))
    insert into tour 
    select city, state from stores
    union
    select city, state from authors
    drop table tour
  • You cannot use the union operator within a create view statement.
  • You cannot use the union operator on text and image columns.
  • You cannot use the for browse clause in statements involving the union operator.

The compute clause : Summarizing groups of data

The compute clause is a Transact-SQL extension. Use it with row aggregates to produce reports that show subtotals of grouped summaries. Such reports, usually produced by a report generator, are called control-break reports, since summary values appear in the report under the control of the groupings ("breaks") you specify in the compute clause.
These summary values appear as additional rows in the query results, unlike the aggregate results of a group by clause, which appear as new columns.
A compute clause allows you to see detail and summary rows with one select statement. You can calculate summary values for subgroups and you can calculate more than one row aggregate (see "Row aggregates and compute") for the same group.
The general syntax for compute is:
compute row_aggregate(column_name) 
[, row_aggregate(column_name)]...
[by column_name [, column_name]...]
The row aggregates you can use with compute are sum, avg, min, max, and count. You can use sum and avg only with numeric columns. Unlike the order by clause, you cannot use the positional number of a column from the select list instead of the column name.
You cannot use text or image columns in a compute clause.
A system test may fail because there are too many aggregates in the compute clause of a query. The number of aggregates that each compute clause can accommodate is limited to 127, and if a compute clause contains more than 127 aggregates, the system generates an error message when you try to execute the query.
Each avg() aggregate counts as two aggregates when you are counting toward the limit of 127, because an avg() aggregate is actually a combination of a sum() aggregate and a count() aggregate.
Following are two queries and their results. The first one uses group by and aggregates. The second uses compute and row aggregates. Note the difference in the results.
select type, sum(price), sum(advance) 
from titles
group by type
type 
------------ ------- ----------
UNDECIDED NULL NULL
business 54.92 25,125.00
mod_cook 22.98 15,000.00
popular_comp 42.95 15,000.00
psychology 67.52 21,275.00
trad_cook 47.89 19,000.00

(6 rows affected)
select type, price, advance 
from titles
order by type
compute sum(price), sum(advance) by type
type         price                    advance
------------ ------------------------ --------
UNDECIDED NULL NULL

Compute Result:
------------------------ ------------------------
NULL NULL

type price advance
------------ -------------------- ----------
business 2.99 10,125.00
business 11.95 5,000.00
business 19.99 5,000.00
business 19.99 5,000.00
Compute Result:
------------------------ ------------------------
54.92 25,125.00
type price advance
------------ ----------------------- ---------
mod_cook 2.99 15,000.00
mod_cook 19.99 0.00
Compute Result:
------------------------ ------------------------
22.98 15,000.00

type price advance
------------- ------------------- ------------
popular_comp NULL NULL
popular_comp 20.00 8,000.00
popular_comp 22.95 7,000.00
Compute Result:
------------------------ ------------------------
42.95 15,000.00

type price advance
------------ ------------------------ --------
psychology 7.00 6,000.00
psychology 7.99 4,000.00
psychology 10.95 2,275.00
psychology 19.99 2,000.00
psychology 21.59 7,000.00
Compute Result:
------------------------ ------------------------
67.52 21,275.00

type price advance
------------ ----------------------- --------
trad_cook 11.95 4,000.00
trad_cook 14.99 8,000.00
trad_cook 20.95 7,000.00
Compute Result:
------------------------ ------------------------
47.89 19,000.00

(24 rows affected)
Each summary value is treated as a row.

Row aggregates and compute

The row aggregates used with compute are listed in Table 3-2:
How aggregates are used with a compute statement
Row aggregatesResult
sumTotal of the values in the expression
avgAverage of the values in the expression
maxHighest value in the expression
minLowest value in the expression
countNumber of selected rows
These row aggregates are the same aggregates that can be used with group by, except there is no row aggregate function that is the equivalent of count(*). To find the summary information produced by group by and count(*), use a compute clause without the by keyword.
Rules for compute clauses
  • Adaptive Server does not allow the distinct keyword with the row aggregates.
  • The columns in a compute clause must appear in the select list.
  • You cannot use select into (see Chapter 7, "Creating Databases and Tables") in the same statement as a compute clause because statements that include compute do not generate normal rows.
  • If you use compute with the by keyword, you must also use an order by clause. The columns listed after by must be identical to, or a subset of, those listed after order by, and must be in the same left-to-right order, start with the same expression, and not skip any expressions.
    For example, suppose the order by clause is:

    order by a, b, c 
    The compute clause can be any or all of these:

    compute row_aggregate (column_name) by a, b, c 
    compute row_aggregate (column_name) by a, b 
    compute row_aggregate (column_name) by a
    The compute clause cannot be any of these:

    compute row_aggregate (column_name) by b, c 
    compute row_aggregate (column_name) by a, c
    compute row_aggregate (column_name) by c
    You must use a column name or an expression in the order by clause; you cannot sort by a column heading.
  • The compute keyword can be used without by to generate grand totals, grand counts, and so on. order by is optional if you use the compute keyword without by. The compute keyword without by is discussed under "Grand values: compute without by".

Specifying more than one column after compute

Listing more than one column after the by keyword affects the query by breaking a group into subgroups and applying the specified row aggregate to each level of grouping. For example, this query finds the sum of the prices of psychology books from each publisher:
select type, pub_id, price 
from titles
where type = "psychology"
order by type, pub_id, price
compute sum(price) by type, pub_id
type        pub_id  price 
----------- ------- -------------
psychology 0736 7.00
psychology 0736 7.99
psychology 0736 10.95
psychology 0736 19.99

Compute Result:
---------------
45.93
type        pub_id  price 
----------- ------- -------------
psychology 0877 21.59

Compute Result:
---------------
21.59

(7 rows affected)

Using more than one compute clause

You can use different aggregates in the same statement by including more than one compute clause. The following query is similar to the preceding one but adds the sum of the prices of psychology books by publisher:
select type, pub_id, price 
from titles
where type = "psychology"
order by type, pub_id, price
compute sum(price) by type, pub_id
compute sum(price) by type
type        pub_id  price 
----------- ------- --------------
psychology 0736 7.00
psychology 0736 7.99
psychology 0736 10.95
psychology 0736 19.99

Compute Result:
---------------
45.93

type pub_id price
---------- ------- --------------
psychology 0877 21.59

Compute Result:
---------------
21.59

Compute Result:
---------------
67.52

(8 rows affected)

Applying an aggregate to more than one column

One compute clause can apply the same aggregate to several columns. This query finds the sum of the prices and advances for each type of cookbook:
select type, price, advance 
from titles
where type like "%cook"
order by type
compute sum(price), sum(advance) by type
type      price            advance 
--------- ---------------- ---------------
mod_cook 2.99 15,000.00
mod_cook 19.99 0.00

Compute Result:
--------------- ---------------
22.98 15,000.00

type price advance
--------- ---------------- ---------------
trad_cook 11.95 4,000.00
trad_cook 14.99 8,000.00
trad_cook 20.95 7,000.00

Compute Result:
--------------- ---------------
47.89 19,000.00

(7 rows affected)
Remember, the columns to which the aggregates apply must also be in the select list.

Using different aggregates in the same compute clause

You can use different aggregates in the same compute clause:
select type, pub_id, price 
from titles
where type like "%cook"
order by type, pub_id
compute sum(price), max(pub_id) by type
type        pub_id  price 
----------- ------- --------------
mod_cook 0877 2.99
mod_cook 0877 19.99

Compute Result:
--------------- ----
22.98 0877

type pub_id price
----------- ------- --------------
trad_cook 0877 11.95
trad_cook 0877 14.99
trad_cook 0877 20.95

Compute Result:
--------------- ----
47.89 0877

(7 rows affected)

Grand values: compute without by

You can use the compute keyword without by to generate grand totals, grand counts, and so on.
This statement finds the grand total of the prices and advances of all types of books that cost more than $20:
select type, price, advance 
from titles
where price > $20
compute sum(price), sum(advance)
type         price            advance 
------------ ---------------- -------------
popular_comp 22.95 7,000.00
psychology 21.59 7,000.00
trad_cook 20.95 7,000.00

Compute Result:
--------------- ---------
65.49 21,000.00

(4 rows affected)
You can use a compute with by and a compute without by in the same query. The following query finds the sum of prices and advances by type and then computes the grand total of prices and advances for all types of books.
select type, price, advance 
from titles
where type like "%cook"
order by type
compute sum(price), sum(advance) by type
compute sum(price), sum(advance)
type         price                advance
----------- ----------------- ------------
mod_ cook 2.99 15,000.00
mod_cook 19.99 0.00

Compute Result:
--------------- ---------
22.98 15,000.00

type price advance
----------- ----------------- ------------
trad_cook 11.95 4,000.00
trad_cook 14.99 8,000.00
trad_cook 20.95 7,000.00

Compute Result:
--------------- ---------
47.89 19,000.00

Compute Result:
--------------- ---------
70.87 34,000.00

(8 rows affected)

Order by clause : Sorting query results

The order by clause allows you to sort query results by one or more columns, up to 31. Each sort is either ascending (asc) or descending (desc). If neither is specified, asc is the default. The following query orders results by pub_id:
select pub_id, type, title_id 
from titles
order by pub_id
pub_id  type             title_id 
------ ------------ --------
0736 business BU2075
0736 psychology PS2091
0736 psychology PS2106
0736 psychology PS3333
0736 psychology PS7777
0877 UNDECIDED MC3026
0877 mod_cook MC2222
0877 mod_cook MC3021
0877 psychology PS1372
0877 trad_cook TC3218
0877 trad_cook TC4203
0877 trad_cook TC7777
1389 business BU1032
1389 business BU1111
1389 business BU7832
1389 popular_comp PC1035
1389 popular_comp PC8888
1389 popular_comp PC9999

(18 rows affected)
If you name more than one column in the order by clause, Adaptive Server nests the sorts. The following statement sorts the rows in the stores table first by stor_id in descending order, then by payterms (in ascending order, since desc is not specified), and finally by country (also ascending). Adaptive Server sorts null values first within any group.
select stor_id, payterms, country
from stores
order by stor_id desc, payterms
stor_id payterms     country 
------- ------------ ------------
8042 Net 30 USA
7896 Net 60 USA
7131 Net 60 USA
7067 Net 30 USA
7066 Net 30 USA
6380 Net 60 USA
5023 Net 60 USA

(7 rows affected)
You can use the position number of a column in a select list instead of the column name. Column names and select list numbers can be mixed. Both of the following statements produce the same results as the preceding one.
select pub_id, type, title_id 
from titles
order by 1 desc, 2, 3
select pub_id, type, title_id 
from titles
order by 1 desc, type, 3
Most versions of SQL require that order by items appear in the select list, but Transact-SQL has no such restriction. You could order the results of the preceding query by title, although that column does not appear in the select list.
You cannot use order by on text or image columns.
Adaptive Server does not allow subqueries, aggregates, variables and constant expressions in the order by list.
With order by, null values come before all others.
The effects of an order by clause on mixed-case data depend on the sort order installed on your Adaptive Server. The basic choices are binary, dictionary order, and case-insensitive. The system procedure sp_helpsort displays the sort order for your server. See the Reference Manual for more information on sort orders.

order by and group by

You can use an order by clause to order the results of a group by in a particular way.
Put the order by clause after the group by clause. For example, to find the average price of each type of book and order the results by average price, the statement is:
select type, avg(price) 
from titles
group by type
order by avg(price)
type 
---------- ------------
UNDECIDED NULL
mod_cook 11.49
psychology 13.50
business 13.73
trad_cook 15.96
popular_comp 21.48

(6 rows affected)

order by and group by used with select distinct

A select distinct query with order by or group by can return duplicate values if the order by or group by column is not in the select list. For example:
select distinct pub_id
from titles
order by type
pub_id
------
0877
0736
1389
0877
1389
0736
0877
0877

(8 rows affected)
If a query has an order by or group by clause that includes columns not in the select list, Adaptive Server adds those columns as hidden columns in the columns being processed. The columns listed in the order by or group by clause are included in the test for distinct rows. To comply with ANSI standards, include the order by or group by column in the select list. For example:
select distinct pub_id, type
from titles
order by type
pub_id type 
------ ------------
0877 UNDECIDED
0736 business
1389 business
0877 mod_cook
1389 popular_comp
0736 psychology
0877 psychology
0877 trad_cook

(8 rows affected)

The having clause : Selecting groups of data

Use the having clause to display or reject rows defined by the group by clause. The having clause sets conditions for the group by clause in the same way where sets conditions for the select clause, except where cannot include aggregates, while having often does. This example is legal:
select title_id 
from titles
where title_id like "PS%"
having avg(price) > $2.0
But this example is not:
select title_id
from titles
where avg(price) > $20
having clauses can reference any of the items that appear in the select list.
This statement is an example of a having clause with an aggregate function. It groups the rows in the titles table by type, but eliminates the groups that include only one book:
select type 
from titles
group by type
having count(*) > 1
type 
----------------
business
mod_cook
popular_comp
psychology
trad_cook

(5 rows affected)
Here is an example of a having clause without aggregates. It groups the titles table by type and displays only those types that start with the letter "p":
select type 
from titles
group by type
having type like "p%"
type 
------------
popular_comp
psychology

(2 rows affected)
When you include more than one condition in the having clause, combine the conditions with and, or, or not. For example, to group the titles table by publisher, and to include only those publishers who have paid more than $15,000 in total advances, whose books average less than $18 in price, and whose identification numbers (pub_id) are greater than 0800, the statement is:
select pub_id, sum(advance), avg(price)
from titles
group by pub_id
having sum(advance) > 15000
and avg(price) < 18
and pub_id > "0800"
pub_id 
------ ---------------- ----------------
0877 41,000.00 15.41

(1 row affected)

How the having, group by, and where clauses interact

When you include the having, group by, and where clauses in a query, the sequence in which each clause affects the rows determines the final results:
  • The where clause excludes rows that do not meet its search conditions.
  • The group by clause collects the remaining rows into one group for each unique value in the group by expression.
  • Aggregate functions specified in the select list calculate summary values for each group.
  • The having clause excludes rows from the final results that do not meet its search conditions.
The following query illustrates the use of where, group by, and having clauses in one select statement containing aggregates:
select stor_id, title_id, sum(qty)
from salesdetail
where title_id like "PS%"
group by stor_id, title_id
having sum(qty) > 200
stor_id  title_id
------- -------- -----------
5023 PS1372 375
5023 PS2091 1,845
5023 PS3333 3,437
5023 PS7777 2,206
6380 PS7777 500
7067 PS3333 345
7067 PS7777 250

(7 rows affected)
The query functioned in this order:
  1. The where clause identified only rows with title_id beginning with "PS" (psychology books),
  2. group by collected the rows by common stor_id and title_id,
  3. The sum aggregate calculated the total number of books sold for each group, and
  4. The having clause excluded the groups whose totals do not exceed 200 books from the final results.
All of the previous having examples adhere to the SQL standards, which specify that columns in a having expression must have a single value, and must be in the select list or group by clause. However, the Transact-SQL extensions to having allow columns or expressions not in the select list and not in the group by clause.
The following example determines the average price for each title type, but it excludes those types that do not have more than $10,000 in total sales, even though the sum aggregate does not appear in the results.
select type, avg(price)
from titles
group by type
having sum(total_sales) > 10000
type
------------ ----------
business 13.73
mod_cook 11.49
popular_comp 21.48
trad_cook 15.96

(4 rows affected)
The extension behaves as if the column or expression were part of the select list but not part of the displayed results. If you include an non-aggregated column with having, but it is not part of the select list or the group by clause, the query produces results similar to the "extended" column extension described earlier in this chapter. For example:
select type, avg(price)
from titles
group by type
having total_sales > 4000
type
------------ ----------
business 13.73
business 13.73
business 13.73
mod_cook 11.49
popular_comp 21.48
popular_comp 21.48
psychology 13.50
trad_cook 15.96
trad_cook 15.96

(9 rows affected)
Unlike an extended column, the total_sales column does not appear in the final results, yet the number of displayed rows for each type depends on the total_sales for each title. The query indicates that three business, one mod_cook, two popular_comp, one psychology, and two trad_cook, titles exceed $4000 in total sales.
As mentioned earlier, the way Adaptive Server handles extended columns may seem as if the query is ignoring the where clause in the final results. To make the where conditions affect the results for the extended column, you should repeat the conditions in the having clause. For example:
select type, advance, avg(price) 
from titles
where advance > 5000
group by type
having advance > 5000
type           advance
------------- --------- --------
business 10,125.00 2.99
mod_cook 15,000.00 2.99
popular_comp 7,000.00 21.48
popular_comp 8,000.00 21.48
psychology 7,000.00 14.30
psychology 6,000.00 14.30
trad_cook 7,000.00 17.97
trad_cook 8,000.00 17.97

(8 rows affected)

Using having without group by

A query with a having clause should also have a group by clause. If you omit group by, all the rows not excluded by the where clause return as a single group.
Because no grouping is done between the where and having clauses, they cannot act independently of each other. having acts like where because it affects the rows in a single group rather than groups, except the having clause can still use aggregates.
This example uses the having clause in the following way: it averages the price, excludes from the results titles with advances greater than $4,000, and produces results where price is less than the average price:
select title_id, advance, price
from titles
where advance < 4000
having price > avg(price)
title_id        advance    price
------------- --------- --------
BU1032 5,000.00 19.99
BU7832 5,000.00 19.99
MC2222 0.00 19.99
PC1035 7,000.00 22.95
PC8888 8,000.00 20.00
PS1372 7,000.00 21.59
PS3333 2,000.00 19.99
TC3218 7,000.00 20.95

(8 rows affected)
You can also use the having clause with the Transact-SQL extension that allows you to omit the group by clause from a query that includes an aggregate in its select list. These scalar aggregate functions calculate values for the table as a single group, not for groups within the table.
In this example, the group by clause is omitted, which makes the aggregate function calculate a value for the whole table. The having clause excludes non-matching rows from the result group.
select pub_id, count(pub_id)
from publishers
having pub_id < "1000"
pub_id
------ ----------------
0736 3
0877 3

(2 rows affected)

Group by (SQL)

The group by clause divides the output of a query into groups. You can group by one or more column names, or by the results of computed columns using numeric datatypes in an expression. When used with aggregates, group by retrieves the calculations in each subgroup, and may return multiple rows. The maximum number of columns or expressions you can use in a group by clause is 16.
You cannot group by columns of text or image datatypes.
While you can use group by without aggregates, such a construction has limited functionality and may produce confusing results. The following example groups the results by title type:
select type, advance
from titles
group by type
type             advance
------------ ---------
business 5,000.00
business 5,000.00
business 10,125.00
business 5,000.00
mod_cook 0.00
mod_cook 15,000.00
UNDECIDED NULL
popular_comp 7,000.00
popular_comp 8,000.00
popular_comp NULL
psychology 7,000.00
psychology 2,275.00
psychology 6,000.00
psychology 2,000.00
psychology 4,000.00
trad_cook 7,000.00
trad_cook 4,000.00
trad_cook 8,000.00

(18 rows affected)
With an aggregate for the advance column, the query returns the sum for each group:
select type, sum(advance)
from titles
group by type
type                                  
------------ ------------------------
UNDECIDED NULL
business 25,125.00
mod_cook 15,000.00
popular_comp 15,000.00
psychology 21,275.00
trad_cook 19,000.00

(6 rows affected)
The summary values in a group by clause using aggregates are called vector aggregates, as opposed to scalar aggregates, which result when only one row is returned (see "Using aggregate functions").

group by syntax

The complete syntax of the select statement is repeated here so that you can see the group by clause in context:
select [all | distinct] select_list 
[into [[database.]owner.]table_name]
[from [[database.]owner.]{view_name | table_name [(index {index_name | table_name}
[parallel [degree_of_parallelism]]
[prefetch size ] [lru | mru])]}
[holdlock | noholdlock] [shared]
[,[[database.]owner.]{view_name | table_name [(index {index_name | table_name}
[parallel [degree_of_parallelism]]
[prefetch size] [lru | mru])]}
[holdlock | noholdlock] [shared]]...]

[where search_conditions]

[group by [all] aggregate_free_expression [, aggregate_free_expression]...]
[having search_conditions]

[order by
{[[[database.]owner.] {table_name. | view_name.}]
column_name | select_list_number | expression} [asc | desc]
[, {[[[database.]owner.]{table_name|view_name.}] column_name | select_list_number | expression}
[asc | desc]]...]

[compute row_aggregate (column_name)
[, row_aggregate(column_name)]...
[by column_name [, column_name]...]]

[for {read only | update [of column_name_list]}]

[at isolation {read uncommitted | read committed | serializable}]

[for browse]
Remember that the order of the clauses in the select statement is significant. You can omit any of the optional clauses, but when you use them, they must appear in the order shown above.
group by and SQL standardsThe SQL standards for group by are more restrictive than Sybase's standard. The SQL standard requires that:
  • The columns in a select list must be in the group by expression or they must be arguments of aggregate functions.
  • A group by expression can only contain column names in the select list, but not those used only as arguments for vector aggregates.
Several Transact-SQL extensions (described in the following sections) relax these restrictions, however the more complex result sets may be more difficult to understand. If you set the fipsflagger option as follows, you will receive a warning message stating that Transact-SQL extensions are used:
set fipsflagger on
For more information about the fipsflagger option, see the set command in the Reference Manual.
Nesting groups with group byYou can list more than one column in the group by clause to nest groups. Once the sets are established with group by, the aggregates are applied. This statement finds the average price and the sum of book sales, grouped first by publisher identification number and then by type:
select pub_id, type, avg(price), sum(total_sales)
from titles
group by pub_id, type
pub_id  type
------ ------------ ------ -------
0736 business 2.99 18,722
0736 psychology 11.48 9,564
0877 UNDECIDED NULL NULL
0877 mod_cook 11.49 24,278
0877 psychology 21.59 375
0877 trad_cook 15.96 19,566
1389 business 17.31 12,066
1389 popular_comp 21.48 12,875

(8 rows affected)
You can nest many groups within groups, up to the maximum of 16 columns or expressions specified with group by.

Referencing other columns in queries using group by

SQL standards state that the group by must contain items from the select list. However, Transact-SQL allows you to specify any valid column name in either the group by or select list, whether they employ aggregates or not.
Through the following extensions, Sybase lifts restrictions on what you can include or omit in the select list of a query that includes group by.
  • The columns in the select list are not limited to the grouping columns and columns used with the vector aggregates.
  • The columns specified by group by are not limited to those non-aggregate columns in the select list.
A vector aggregate must be accompanied by a group by clause. The SQL standards require that the non-aggregate columns in the select list match the group by columns. However, the first bulleted item described above allows you to specify additional, extended columns in the select list of the query.
For example, many versions of SQL do not allow the inclusion of the extended title_id column in the select list, but it is legal in Transact-SQL:
select type, title_id, avg(price), avg(advance) 
from titles
group by type
type           title_id 
------------ -------- ----- -------
business BU1032 13.73 6,281.25
business BU1111 13.73 6,281.25
business BU2075 13.73 6,281.25
business BU7832 13.73 6,281.25
mod_cook MC2222 11.49 7,500.00
mod_cook MC3021 11.49 7,500.00
UNDECIDED MC3026 NULL NULL
popular_comp PC1035 21.48 7,500.00
popular_comp PC8888 21.48 7,500.00
popular_comp PC9999 21.48 7,500.00
psychology PS1372 13.50 4,255.00
psychology PS2091 13.50 4,255.00
psychology PS2106 13.50 4,255.00
psychology PS3333 13.50 4,255.00
psychology PS7777 13.50 4,255.00
trad_cook TC3218 15.96 6,333.33
trad_cook TC4203 15.96 6,333.33
trad_cook TC7777 15.96 6,333.33

(18 rows affected)
The above example still aggregates the price and advance columns based on the type column, but its results also display the title_id for the books included in each group.
The second extension described above allows you to group columns that are not specified as columns in the select list of the query. These columns do not appear in the results, but the vector aggregates still compute their summary values. For example:
select state, count(au_id)
from authors
group by state, city
state
----- --------
CA 2
CA 1
CA 5
CA 5
CA 2
CA 1
CA 1
CA 1
CA 1
IN 1
KS 1
MD 1
MI 1
OR 1
TN 1
UT 2

(16 rows affected)
This example groups the vector aggregate results by both state and city, even though it does not display which city belongs to each group. Therefore, results are potentially misleading.
You may think the following query should produce similar results to the previous query, since only the vector aggregate seems to tally the number of each city for each row:
select state, count(au_id)
from authors
group by city
However, its results are much different. By not using group by with both the state and city columns, the query tallies the number of each city, but it displays the tally for each row of that city in authors rather than grouping them into one result row per city.
state
----- -----------
CA              1 
CA 5
CA 2
CA 1
CA 5
KS 1
CA 2
CA 2
CA 1
CA 1
TN 1
OR 1
CA 1
MI 1
IN 1
CA 5
CA 5
CA 5
MD 1
CA 2
CA 1
UT 2
UT 2

(23 rows affected)
When you use the Transact-SQL extensions in complex queries that include the where clause or joins, the results may become even more difficult to understand. To avoid confusing or misleading results with group by, Sybase suggests that you use the fipsflagger option to identify queries that use Transact-SQL extensions. See "group by and SQL standards" for details.
For more information about Transact-SQL extensions to group by and how they work, see the Reference Manual.

Expressions and group by

Another Transact-SQL extension allows you to group by an expression that does not include aggregate functions. For example:
select avg(total_sales), total_sales * price 
from titles
group by total_sales * price
---------      -------------
NULL NULL
111 777.00
375 7,856.25
375 8,096.25
2045 22,392.75
3336 26,654.64
2032 40,619.68
3876 46,318.20
18722 55,978.78
4095 61,384.05
22246 66,515.54
4072 81,399.28
4095 81,859.05
4095 81,900.00
15096 180,397.20
8780 201,501.00

(16 rows affected)
The expression "total_sales * price" is allowed.
You cannot group by a column heading, also known as an alias, although you can still use one in your select list. This statement produces an error message:
select Category = type, title_id, avg(price), avg(advance) 
from titles
group by Category
The group by clause should be "group by type", not "group by Category".
select Category = type, title_id, avg(price), avg(advance) 
from titles
group by type

Nesting aggregates with group by

Another Transact-SQL extension allows you to nest a vector aggregate inside a scalar aggregate. For example, to find the average price of all types of books using a non-nested aggregate, enter:
select avg(price) 
from titles
group by type
--------------- 
NULL
13.73
11.49
21.48
13.50
15.96

(6 rows affected)
Nesting the average price inside the max function produces the highest average price of a group of books, grouped by type:
select max(avg(price)) 
from titles
group by type
------------- 
21.48

(1 row affected)
By definition, the group by clause applies to the innermost aggregate--in this case, avg.

Null values and group by

If the grouping column contains a null value, that row becomes its own group in the results. If the grouping column contains more than one null value, the null values form a single group. Here is an example that uses group by and the advance column, which contains some null values:
select advance, avg(price * 2) 
from titles
group by advance
advance                                            
------------------ -----------------
NULL NULL
0.00 39.98
2000.00 39.98
2275.00 21.90
4000.00 19.94
5000.00 34.62
6000.00 14.00
7000.00 43.66
8000.00 34.99
10125.00 5.98
15000.00 5.98

(11 rows affected)
If you are using the count(column_name) aggregate function, grouping by a column that contains null values will return a count of zero for the grouping row, since count(column_name) does not count null values. In most cases, you should use count(*) instead. This example groups and counts on the price column from the titles table, which contains null values, and shows count(*) for comparison:
select price, count(price), count(*) 
from titles
group by price
price   
------------- ----- -----
NULL 0 2
2.99 2 2
7.00 1 1
7.99 1 1
10.95 1 1
11.95 2 2
14.99 1 1
19.99 4 4
20.00 1 1
20.95 1 1
21.59 1 1
22.95 1 1

(12 rows affected)

where clause and group by

You can use a where clause in a statement with group by. Rows that do not satisfy the conditions in the where clause are eliminated before any grouping is done. Here is an example:
select type, avg(price)
from titles
where advance > 5000
group by type
type 
------------- --------
business 2.99
mod_cook 2.99
popular_comp 21.48
psychology 14.30
trad_cook 17.97

(5 rows affected)
Only the rows with advances of more than $5000 are included in the groups that are used to produce the query results.
However, the way that Adaptive Server handles extra columns in the select list and the where clause may seem contradictory. For example:
select type, advance, avg(price) 
from titles
where advance > 5000
group by type
type           advance
------------- --------- --------
business 5,000.00 2.99
business 5,000.00 2.99
business 10,125.00 2.99
business 5,000.00 2.99
mod_cook 0.00 2.99
mod_cook 15,000.00 2.99
popular_comp 7,000.00 21.48
popular_comp 8,000.00 21.48
popular_comp NULL 21.48
psychology 7,000.00 14.30
psychology 2,275.00 14.30
psychology 6,000.00 14.30
psychology 2,000.00 14.30
psychology 4,000.00 14.30
trad_cook 7,000.00 17.97
trad_cook 4,000.00 17.97
trad_cook 8,000.00 17.97

(17 rows affected)
It only seems as if the query is ignoring the where clause when you look at the results for the advance (extended) column. Adaptive Server still computes the vector aggregate using only those rows that satisfy the where clause, but it also displays all rows for any extended columns that you include in the select list. To further restrict these rows from the results, you must use a having clause (described later in this chapter).

group by and all

The keyword all in the group by clause is a Transact-SQL enhancement. It is meaningful only if the select statement in which it is used also includes a where clause.
If you use all, the query results include all the groups produced by the group by clause, even if some groups do not have any rows that meet the search conditions. Without all, a select statement that includes group by does not show groups for which no rows qualify.
Here is an example:
select type, avg(advance)  
from titles
where advance > 1000 and advance < 10000
group by type
type                                    
------------ ------------------------
business 5,000.00
popular_comp 7,500.00
psychology 4,255.00
trad_cook 6,333.33

(4 rows affected)
select type, avg(advance)  
from titles
where advance > 1000 and advance < 10000
group by all type
type                                    
------------ ------------------------
UNDECIDED NULL
business 5,000.00
mod_cook NULL
popular_comp 7,500.00
psychology 4,255.00
trad_cook 6,333.33

(6 rows affected)
The first statement produces groups only for those books that commanded advances of more than $1000 but less than $10,000. Since no modern cooking books have an advance within that range, there is no group in the results for the mod_cook type.
The second statement produces groups for all types, including modern cooking and "UNDECIDED," even though the modern cooking group does not include any rows that meet the qualification specified in the where clause. Adaptive Server returns a NULL result for all groups that lack qualifying rows.

Using aggregates without group by

By definition, scalar aggregates apply to all rows in a table, producing a single value for the whole table for each function. The Transact-SQL extension that allows you to include extended columns with vector aggregates also allows you to include extended columns with scalar aggregates. For example, look at the publishers table:
pub_id pub_name             city           state 
------ ------------------   -------------- ----- 
0736 New Age Books Boston MA
0877 Binnet & Hardley Washington DC
1389 Algodata Infosystems Berkeley CA
It contains three rows. The following query produces a three-row scalar aggregate based on each row of the table:
select pub_id, count(pub_id)
from publishers
pub_id 
---------- ---------
0736 3
0877 3
1389 3

(3 rows affected)
Because Adaptive Server treats publishers as a single group, the scalar aggregate applies to the (single-group) table. The results display every row of the table for each column you include in the select list, in addition to the scalar aggregate.
The where clause behaves the same way for scalar aggregates as with vector aggregates. The where clause restricts the columns included in the aggregate summary values, but it does not affect the rows that appear in the results for each extended column you specify in the select list. For example:
select pub_id, count(pub_id) 
from publishers
where pub_id < "1000"
pub_id 
-------------- -----------
0736 2
0877 2
1389 2

(3 rows affected)
Like the other Transact-SQL extensions to group by, this extension provides results that may be difficult to understand, especially for queries on large tables or queries with multitable joins.

Operators in t-sql

Operators have the following precedence levels, where 1 is the highest level and 6 is the lowest:
  1. unary (single argument) - + ~
  2. * /%
  3. binary (two argument) + - & | ^
  4. not
  5. and
  6. or
     
Bitwise operators
So we have bitwise operators like we have in c .. these are & , | , ^ and ~ operators.

The string operator + can concatenate two or more character or binary expressions. For example:
  1. select Name = (au_lname + ",
    " + au_fname)
    from authors
    Displays author names under the column heading "Name" in last-name, first-name order, with a comma after the last name; for example, "Bennett, Abraham."
  2. select "abc" + "" + "def" 
    Returns the string "abc def". The empty string is interpreted as a single space in all char, varchar, nchar, nvarchar, and text concatenation, and in varchar insert and assignment statements.
When concatenating non-character, non-binary expressions, use convert:
select "The date is " + 
convert(varchar(12), getdate())

The comparison operators
Adaptive Server uses these comparison operators:
Comparison operators
OperatorMeaning
=Equal to
>Greater than
<Less than
>=Greater than or equal to
<=Less than or equal to
<>Not equal to
!=Not equal to (Transact-SQL extension)
!>Not greater than (Transact-SQL extension)
!<Not less than (Transact-SQL extension)
In comparing character data, < means closer to the beginning of the server's sort order and > means closer to the end of the sort order. Uppercase and lowercase letters are equal in a sort order that is not case sensitive. Use sp_helpsort to see the sort order for your Adaptive Server. Trailing blanks are ignored for comparison purposes.
In comparing dates, < means earlier than and > means later than.
Put single or double quotes around all character and datetime data used with a comparison operator:
= "Bennet" 
> "May 22 1947"
Nonstandard operatorsThe following operators are Transact-SQL extensions:
  • Modulo operator: %
  • Negative comparison operators: !>, !<, !=
  • Bitwise operators: ~, ^, |, &
  • Join operators: *= and =*
Comparing character expressionsAdaptive Server treats character constant expressions as varchar. If they are compared with non-varchar variables or column data, the datatype precedence rules are used in the comparison (that is, the datatype with lower precedence is converted to the datatype with higher precedence). If implicit datatype conversion is not supported, you must use the convert function (see "Supported conversions").
Comparison of a char expression to a varchar expression follows the datatype precedence rule; the "lower" datatype is converted to the "higher" datatype. All varchar expressions are converted to char (that is, trailing blanks are appended) for the comparison.
Using the empty stringThe empty string ("") or ('') is interpreted as a single blank in insert or assignment statements on varchar data. When varchar, char, nchar, or nvarchar data is concatenated, the empty string is interpreted as a single space. For example, this is stored as "abc def":
"abc" + "" + "def"
The empty string is never evaluated as NULL.
Including quotation marks in character expressionsThere are two ways to specify literal quotes within a char or varchar entry. The first method is to use an additional quote with a quote of the same type. This is called "escaping" the quote. For example, if you begin a character entry with a single quote, but you want to include a single quote as part of the entry, use two single quotes:
'I don't understand.' 
Here is an example containing internal double and single quotes. The single quote does not have to be escaped, but the double quote does:
"He said, ""It's not really confusing.""" 
The second method is to enclose a quote in the opposite kind of quotation mark. In other words, surround an entry containing a double quote with single quotes (or vice versa). Here are some examples:
'George said, "There must be a better way."' 
"Isn't there a better way?"
'George asked, "Isn"t there a better way?"'
To continue a character string that would go off the end of one line on your screen, enter a backslash (\) before going to the following line.
If the quoted_identifier option is set to on, do not use double quotes around character or date data. You must use single quotes, or Adaptive Server treats the data as an identifier. For more information about quoted identifiers, see "Delimited identifiers".

Relational and logical expressions

A logical expression or relational expression returns TRUE, FALSE, or UNKNOWN. The general patterns are:
expression comparison_operator [any | all] expression
expression [not] in expression
[not] exists expression
expression [not] between expression and expression
expression [not] like "match_string" [escape "escape_character"]
not expression like "match_string" [escape "escape_character"]
expression is [not] null
not logical_expression
logical_expression {and | or} logical_expression
Using any, all, and inany is used with <, >, or = and a subquery. It returns results when any value retrieved in the subquery matches the value in the where or having clause of the outer statement. all is used with < or > and a subquery. It returns results when all values retrieved in the subquery are less than (<) or greater than (>) the value in the where or having clause of the outer statement. See Chapter 5, "Subqueries: Using Queries Within Other Queries," for more information.
in returns results when any value returned by the second expression matches the value in the first expression. The second expression must be a subquery or a list of values enclosed in parentheses. in is equivalent to = any.
Connecting expressions with and and orand connects two expressions and returns results when both are true. or connects two or more conditions and returns results when either of the conditions is true.
When more than one logical operator is used in a statement, and is evaluated before or. Use parentheses to change the order of execution.
Table 1-6 shows the results of logical operations, including those that involve null values:
Truth tables for logical expressions
andTRUEFALSENULL
TRUETRUEFALSEUNKNOWN
FALSEFALSEFALSEFALSE
NULLUNKNOWNFALSEUNKNOWN




orTRUEFALSENULL
TRUETRUETRUETRUE
FALSETRUEFALSEUNKNOWN
NULLTRUEUNKNOWNUNKNOWN




not


TRUEFALSE

FALSETRUE

NULLUNKNOWN

The result UNKNOWN indicates that one or more of the expressions evaluates to NULL, and that the result of the operation cannot be determined to be either TRUE or FALSE.

Getting information about datatypes and tables

Use sp_help to display information about the properties of a system datatype or a user-defined datatype. The report indicates the base type from which the datatype was created, whether it allows nulls, the names of any rules and defaults bound to the datatype, and whether it has the IDENTITY property.
The following examples display the information about the money system datatype and the tid user-defined datatype:
sp_help money
Type_name  Storage_type Length Prec  Scale  
---------- ------------ ------ ----- -----
money money 8 NULL NULL
Nulls   Default_name  Rule_name   Identity
----- ------------ --------- --------
1 NULL NULL NULL

(return status = 0)
sp_help tid
Type_name  Storage_type Length Prec  Scale  
---------- ------------ ------ ----- -----
tid varchar 6 NULL NULL
Nulls   Default_name  Rule_name   Identity
----- ------------ --------- --------
0 NULL NULL 0