count vs sum for Queries

Nice post in “Beyond relational” about using SUM vs COUNT to get …. counts.

First, COUNT(*) and SUM (series of ‘1’s) will be the same, right?  If we have a column where the value is “1”, then the sum is the same as the count.  OK, there is some possible confusion if the column can have ‘NULL” in it, but never mind.

Second, if we use a CASE statement, then we can evaluate a “non-unity” column to become “1” or “0”, thereby getting around the “NULL” condition problem, AND then we can SUM.  For example

SELECT COUNT(*) from gender where gender = ‘M’ as M_count

and

SELECT SUM (CASE WHEN GENDER = ‘M’ then 1 else 0 end) as M_count

What about dirty data where GENDER can be ‘M’ or ‘m’ (not an issue since SQL is not case sensitive, but what about ‘B’ for boy?  We can clearly add to the where clause, but we can also just add another ELSE to the case.  And if we had three genders, we can easy move ‘O’ for other from M_count to someplace else.  

And it become easy to have multiple SUMS, for M_count, F_count, O_count, or even SUM(CASE WHEN Year < ‘2000’ then 1 else ..).  OK, so that last won’t work, but the idea is that the use of SUM with CASE is more flexible that COUNT. (of course, we also can use COUNT and CASE together.

Advertisements
This entry was posted in Uncategorized and tagged . Bookmark the permalink.