Why you shouldn't rely on ORMs.

Kiopa_Matt

Banned
May 13, 2011
5,216
52
0
Several on here believe my disdain for ORMs stems from the fact I'm not knowledgeable and/or efficient. So if you're still learning development, here's some reasons why you should spend the time to learn proper SQL, and the various database functions available. I could be wrong, but if you're learning strictly ORMs, I think most of this stuff is going to be left out.

We'll start with the most simplistic:


1. Date Functions

One thing I don't see many times is the proper use of date functions. This is especially important if you're releasing a distributable software, as on many servers the system timezone is different from the database timezone. For example, say the next payment date needs to be pushed up a month:

Code:
UPDATE subscriptions SET next_date = date(date_add(next_date, interval 1 month)) WHERE id = $id
Say you want to group a bunch of payments by year:

Code:
SELECT YEAR(date_added) AS year, count(*) AS total, SUM(amount) AS amount FROM transaction GROUP BY YEAR(date_added)
Instead of the above simple SQL statement, I see people literally running a loop and executing a separate SQL query for each year.


2. Cascading

Excellent if you have child tables that contain data on a parent table. For example, maybe you have a products table, and a list of attributes stores within the product_attributes table. When a product is deleted, I see many people executing two SQL statements to delete the product data from both tables. Instead, you can create your attributes table such as:

Code:
CREATE TABL product_attributes (
    id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, 
    product_id INT NOT NULL, 
    attr_name VARCHAR(50) NOT NULL, 
    attr_value VARCHAR(50) NOT NULL, 
    FOREIGN KEY (product_id) REFERENCES products (id) ON DELETE CASCADE 
) engine=InnoDB;
Or if the attributes table already exists, add the foreign key:

Code:
ALTE TABLE product_attributes ADD FOREIGN KEY (product_id) REFERENCES products (id) ON DELETE CASCADE
Now any time a product is deleted, all of its attributes data will be deleted as well. No reason to delete from both tables within your code.


3. ON DUPLICAT KEY UPDATE

Excellent for things such as aggregate data. For example, maybe you have a table named payments, and instead of heavily querying it everytime the admin wants to view a report, you store a monthly summary in the payments_summary table, which maybe looks like:

Code:
CREAT TABLE payments_summary (
    id VARCHAR(20) NOT NULL PRIMARY KEY, 
    year SMALLINT NOT NULL, 
    month SMALLINT NOT NULL, 
    method VARCHAR(30) NOT NULL, 
    payments INT NOT NULL, 
    amount DECIMAL(8,2) NOT NULL
) engine=InnoDB;
Every time a payment comes in, this table needs to be updated, so the admin can view reports in real-time. When updating it, you can use a simple SQL query such as:

Code:
$guid = $year . '-' . $month . '-' . $payment_method;
INSER  INTO payments_summary('$guid', $year, $month, '$payment_method', 43,  1594) ON DUPLICATE KEY UPDATE payments = payments + 43, amount = amount +  1594
That's it. Instead of checking to see if the relevant row already exists, then inserting or updating as needed, you just need that one SQL statement, and the database will take care of the rest.


4. Triggers

Taking the above example, we can move that entirely into the database using a trigger, such as:

Code:
DELIMITER @@
CREAT TRIGGER payments_after_insert AFTER INSERT ON payments FOR EACH ROW 
BEGIN
    DECLARE guid varchar(30);
    SELECT CONCAT_WS('-', YEAR(NEW.date_added), MONTH(NEW.date_added), NEW.method) INTO guid;

     INSER INTO payments_summary VALUES (guid, YEAR(NEW.date_added),  MONTH(NEW.date_added), NEW.method, 1, NEW.amount) ON DUPLICATE KEY  UPDATE payments = payments + 1, amount = amount + NEW.amount;
END;
@@
DELIMITER ;
Execute the above SQL code against the database once, and that's it. You don't have to type a single line of code into your software, and your monthly aggregate data of all payments will be automatically maintained within the payments_summary table.

I still haven't even touched on different things such as views, stored procedures, etc. Nonetheless, if you're not currently utilizing the above, you should be as it makes your code leaner, and is simply the proper way to do things. Lean on the database engine as much as you can, instead of relying on a ORM or your own code. That's what the database is there for.

EDIT: Typos in the above SQL code should be pretty obvious. Incapsula wouldn't let the post through without them.
 


There are good reasons to have disdain for ORMs and any other abstractions. Any absolutist point of view is the naive one, for or against. Speaking of absolutism, nobody believes an ORM supplants/precludes SQL knowledge. :)

Re: 1. Already, you've coupled your application to the MySQL innodb engine.

An abstraction over dates is one of the first things I'd require from an ORM that I'd use in my project. Or from the database adapter that converts dates in my language to dates in the DB.

Also, let's agree right now that our fictional ORM-user is proficient enough to keep a development-server log open and spot a sequence of queries that can be sponged up into a single query. I imagine most of us ORM-assholes don't even care about N+1 queries until our app is launched and has enough traction to make us think "alright, now let's pick the low-hanging optimization fruit". *15 minutes later* ...Done.

Re: 2. The ORMs I use in Ruby (ActiveRecord and Sequel) and Clojure support application-level semantics for destroying dependent records.

Also, most of the deletion cascades in apps are rather monumental, like deleting the Posts of a Topic and similar queries that can still be expressed in <number of tables-affected> quantity of queries. I've never even been bothered enough to see if any ORM I use even has DB-level semantics for that.

Re: 3 & 4. These are more examples of things often expressed in application-level semantics.

Execute the above SQL code against the database once, and that's it. You don't have to type a single line of code into your software, and your monthly aggregate data of all payments will be automatically maintained within the payments_summary table.

I still haven't even touched on different things such as views, stored procedures, etc. Nonetheless, if you're not currently utilizing the above, you should be as it makes your code leaner, and is simply the proper way to do things. Lean on the database engine as much as you can, instead of relying on a ORM or your own code. That's what the database is there for.
Here's the thing, man: You're right, but you're only enumerating one side of a pros/cons table.

It's a known trade-off between expressing the semantics of your app at the application-level VS. the database-level.

It's a classic point of contention and debate like Emacs vs Vim. However, like Emacs vs Vim, you gain the most power by learning both so that you can leverage the power of their strengths as you need it.

In most high-level applications, I tend to code at the app-level. Without changing any code, I can deploy my forum app to an Amazon EC2 instance, an EC2 cluster, my dedicated box, or a $10/mo cheapo VPS with whatever DB offering they're coupled to so that I don't need to be.

And I've written some lower level services that act like wrappers around very specific database features (like Postgres' GIS support) where I write at the DB-level. -- And my higher-level apps will call out to these.

But you're seeing a faction war where there really isn't one.

Sometimes you want to treat the database as a dumb storage layer.
Sometimes you want to leverage its features.


Often you do both, an an ORM doesn't prevent that.
 
1) Coordinated Universal Time - Wikipedia, the free encyclopedia is nice - but timezones usage is optional both with and without ORMs

Adding an amount of time to a datetime field is easy in every orm I have used eg
Code:
obj.some_datetime_field + datetime.timedelta(days=3)

As for the 2nd query, ORMs allow grouping queries, it's up to the user to use them, just like it is with raw SQL.

2. https://github.com/ponyorm/pony/blob/orm/pony/orm/core.py#L2777

3. ok, it is pretty easy to have a method on a model called get_or_create that gets an object if it exists, makes it if it does not exist.

4. I think you have to be careful to record where triggers are being fired, and make sure they are in source control somehow, otherwise fair enough.

I think it'd help your knowledge greatly, and give you a more informed opinion, if you played around with an ORM or 2 for a simple application.