For example, take Pony. For one, aggregrate database functions have been standard in SQL probably well before any of us were born, so that's nothing special. Then for example, on their home page they give an example foreach loop, and the corresponding SQL statement that gets generated:
SELECT "c"."id"
FROM "Customer" "c"
LEFT JOIN "Order" "order-1"
ON "c"."id" = "order-1"."customer"
GROUP BY "c"."id"
HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
What kind of convulated mess is that? I don't know, but I don't want my software running statements like that against the database. Here, in basic SQL:
SELECT c.id FROM Customer AS c, Order as o WHERE c.id = o.customer AND o.total_price > 1000;
Isn't that cleaner and simpler?
1. You missed a GROUP BY. Let us add that:
SELECT c.id FROM Customer AS c, Order as o WHERE c.id = o.customer AND o.total_price > 1000 GROUP BY c.id;
2. Here is the original Python generator from Pony's home page:
select(c for c in Customer
if sum(c.orders.price) > 1000)
Oh! We missed a sum(). We have an aggregate function and so, to filter that, the simple where clause is not enough. We have to use HAVING. Replacing:
SELECT c.id FROM Customer AS c, Order as o WHERE c.id = o.customer GROUP BY c.id HAVING SUM(o.price) > 1000
3. The comma join is supposed to represent a cross join, but it is ambiguous and works different depending on your database (e.g.: mySQL treats it as a cross join, and in mySQL, inner join is equivalent to a cross join!). Also, for our purpose, we need an outer join, since we want a list of all customers, whether there are any columns on the right side or not.
So, next step would be to replace the comma join with an explicit join. This will also involve using a key. So, the comparison of ids in the WHERE clause will become the key comparison for ON.
Our query now looks like this:
SELECT c.id FROM Customer AS c LEFT JOIN Order as o ON c.id = o.customer GROUP BY c.id HAVING SUM(o.price) > 1000
4. To avoid special characters and spaces messing with our syntax, we need to add quotes around the names. Adding the quotes, the query now looks like this:
SELECT "c"."id"
FROM "Customer" AS "c"
LEFT JOIN "Order" AS "o"
ON "c"."id" = "o"."customer"
GROUP BY "c"."id"
HAVING SUM("o"."price") > 1000
5. o,price may have null values. So, the comparison may not work in all cases (Instead of 1000, what if it is 0?) because of NULL comparison rules. To take care of this, let us add COALESCE function, which returns the first non-null value from the list of values given to it.
SELECT "c"."id"
FROM "Customer" AS "c"
LEFT JOIN "Order" AS "o"
ON "c"."id" = "o"."customer"
GROUP BY "c"."id"
HAVING coalesce(SUM("o"."price"), 0) > 1000
6. ORMs make decisions to create aliases so as to account for multiple uses. If you want to do two comparisons, having aliases like "order-1", order-2", etc will help. This is a minor change, let us do that.
SELECT "c"."id"
FROM "Customer" AS "c"
LEFT JOIN "Order" AS "order-1"
ON "c"."id" = "order-1"."customer"
GROUP BY "c"."id"
HAVING coalesce(SUM("order-1"."price"), 0) > 1000
7. "AS" is redundant, so we can take that out.
SELECT "c"."id"
FROM "Customer" "c"
LEFT JOIN "Order" "order-1"
ON "c"."id" = "order-1"."customer"
GROUP BY "c"."id"
HAVING coalesce(SUM("order-1"."price"), 0) > 1000
8. Behold the result, which is the same as the one on Pony's home page!
9. Advanced: Look at the query again, very carefully. And then, look at the query on their home page. Wow - they have a typo! Instead of
HAVING coalesce(SUM("order-1"."price"), 0) > 1000
they have
HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
Should we trust a company which has a typo like this on their home page, that too as part of their primary example? Discuss.