Tuesday, September 14, 2010

The having clause : Selecting groups of data

Use the having clause to display or reject rows defined by the group by clause. The having clause sets conditions for the group by clause in the same way where sets conditions for the select clause, except where cannot include aggregates, while having often does. This example is legal:
select title_id 
from titles
where title_id like "PS%"
having avg(price) > $2.0
But this example is not:
select title_id
from titles
where avg(price) > $20
having clauses can reference any of the items that appear in the select list.
This statement is an example of a having clause with an aggregate function. It groups the rows in the titles table by type, but eliminates the groups that include only one book:
select type 
from titles
group by type
having count(*) > 1
type 
----------------
business
mod_cook
popular_comp
psychology
trad_cook

(5 rows affected)
Here is an example of a having clause without aggregates. It groups the titles table by type and displays only those types that start with the letter "p":
select type 
from titles
group by type
having type like "p%"
type 
------------
popular_comp
psychology

(2 rows affected)
When you include more than one condition in the having clause, combine the conditions with and, or, or not. For example, to group the titles table by publisher, and to include only those publishers who have paid more than $15,000 in total advances, whose books average less than $18 in price, and whose identification numbers (pub_id) are greater than 0800, the statement is:
select pub_id, sum(advance), avg(price)
from titles
group by pub_id
having sum(advance) > 15000
and avg(price) < 18
and pub_id > "0800"
pub_id 
------ ---------------- ----------------
0877 41,000.00 15.41

(1 row affected)

How the having, group by, and where clauses interact

When you include the having, group by, and where clauses in a query, the sequence in which each clause affects the rows determines the final results:
  • The where clause excludes rows that do not meet its search conditions.
  • The group by clause collects the remaining rows into one group for each unique value in the group by expression.
  • Aggregate functions specified in the select list calculate summary values for each group.
  • The having clause excludes rows from the final results that do not meet its search conditions.
The following query illustrates the use of where, group by, and having clauses in one select statement containing aggregates:
select stor_id, title_id, sum(qty)
from salesdetail
where title_id like "PS%"
group by stor_id, title_id
having sum(qty) > 200
stor_id  title_id
------- -------- -----------
5023 PS1372 375
5023 PS2091 1,845
5023 PS3333 3,437
5023 PS7777 2,206
6380 PS7777 500
7067 PS3333 345
7067 PS7777 250

(7 rows affected)
The query functioned in this order:
  1. The where clause identified only rows with title_id beginning with "PS" (psychology books),
  2. group by collected the rows by common stor_id and title_id,
  3. The sum aggregate calculated the total number of books sold for each group, and
  4. The having clause excluded the groups whose totals do not exceed 200 books from the final results.
All of the previous having examples adhere to the SQL standards, which specify that columns in a having expression must have a single value, and must be in the select list or group by clause. However, the Transact-SQL extensions to having allow columns or expressions not in the select list and not in the group by clause.
The following example determines the average price for each title type, but it excludes those types that do not have more than $10,000 in total sales, even though the sum aggregate does not appear in the results.
select type, avg(price)
from titles
group by type
having sum(total_sales) > 10000
type
------------ ----------
business 13.73
mod_cook 11.49
popular_comp 21.48
trad_cook 15.96

(4 rows affected)
The extension behaves as if the column or expression were part of the select list but not part of the displayed results. If you include an non-aggregated column with having, but it is not part of the select list or the group by clause, the query produces results similar to the "extended" column extension described earlier in this chapter. For example:
select type, avg(price)
from titles
group by type
having total_sales > 4000
type
------------ ----------
business 13.73
business 13.73
business 13.73
mod_cook 11.49
popular_comp 21.48
popular_comp 21.48
psychology 13.50
trad_cook 15.96
trad_cook 15.96

(9 rows affected)
Unlike an extended column, the total_sales column does not appear in the final results, yet the number of displayed rows for each type depends on the total_sales for each title. The query indicates that three business, one mod_cook, two popular_comp, one psychology, and two trad_cook, titles exceed $4000 in total sales.
As mentioned earlier, the way Adaptive Server handles extended columns may seem as if the query is ignoring the where clause in the final results. To make the where conditions affect the results for the extended column, you should repeat the conditions in the having clause. For example:
select type, advance, avg(price) 
from titles
where advance > 5000
group by type
having advance > 5000
type           advance
------------- --------- --------
business 10,125.00 2.99
mod_cook 15,000.00 2.99
popular_comp 7,000.00 21.48
popular_comp 8,000.00 21.48
psychology 7,000.00 14.30
psychology 6,000.00 14.30
trad_cook 7,000.00 17.97
trad_cook 8,000.00 17.97

(8 rows affected)

Using having without group by

A query with a having clause should also have a group by clause. If you omit group by, all the rows not excluded by the where clause return as a single group.
Because no grouping is done between the where and having clauses, they cannot act independently of each other. having acts like where because it affects the rows in a single group rather than groups, except the having clause can still use aggregates.
This example uses the having clause in the following way: it averages the price, excludes from the results titles with advances greater than $4,000, and produces results where price is less than the average price:
select title_id, advance, price
from titles
where advance < 4000
having price > avg(price)
title_id        advance    price
------------- --------- --------
BU1032 5,000.00 19.99
BU7832 5,000.00 19.99
MC2222 0.00 19.99
PC1035 7,000.00 22.95
PC8888 8,000.00 20.00
PS1372 7,000.00 21.59
PS3333 2,000.00 19.99
TC3218 7,000.00 20.95

(8 rows affected)
You can also use the having clause with the Transact-SQL extension that allows you to omit the group by clause from a query that includes an aggregate in its select list. These scalar aggregate functions calculate values for the table as a single group, not for groups within the table.
In this example, the group by clause is omitted, which makes the aggregate function calculate a value for the whole table. The having clause excludes non-matching rows from the result group.
select pub_id, count(pub_id)
from publishers
having pub_id < "1000"
pub_id
------ ----------------
0736 3
0877 3

(2 rows affected)

No comments:

Post a Comment