Best Database for High Volume Inserts

illeat

I'm working on a tracking system for a project that requires heavy insert volume on a fairly large table - around 200 inserts/sec, with the table adding around 5 million rows a day, and a single index (primary key on an auto-incremented id column). The structure of the row is: int (primary/auto-increment), vc(32), vc(128), vc(64), vc(64), vc(16), vc(16), int, vc(254), vc(96), vc(254), timestamp, int, timestamp, int.
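
In DDL terms it's roughly this (the column names below are made up; the types and the single auto-increment primary key match what's described above):

    CREATE TABLE tracking (
      id     INT NOT NULL AUTO_INCREMENT,   -- value read back via mysql_insert_id
      col_a  VARCHAR(32),
      col_b  VARCHAR(128),
      col_c  VARCHAR(64),
      col_d  VARCHAR(64),
      col_e  VARCHAR(16),
      col_f  VARCHAR(16),
      col_g  INT,
      col_h  VARCHAR(254),
      col_i  VARCHAR(96),
      col_j  VARCHAR(254),
      ts_a   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
      col_k  INT,
      ts_b   TIMESTAMP NULL DEFAULT NULL,
      col_l  INT,
      PRIMARY KEY (id)
    ) ENGINE=InnoDB;
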
I've always used MySQL for this kind of stuff, but my server is starting to buckle (load jumps above 1 and queries start to lag), even after tossing in various 'hacks' to speed things up: cleansing the main tracking table to an archive table every hour, maxing out the MySQL conf, using partitions and insert delays, etc. I've also confirmed that MySQL is the bottleneck and not httpd/nginx, as the server does fine when I remove the tracking calls. I also think the server should be able to handle this: 12-core Intel, 64 GB of RAM, SSD. I'd ideally like to keep all of this on a single server if possible, as well as remove the archiving aspect, since it's making readback of stats kind of a pain and not 'realtime.' Also, the id of the inserted row (via mysql_insert_id) is used later on, so I need that intact.

Anyways, is MySQL capable of handling this volume given my constraints and server, or are there better solutions out there to explore? I have no problem upgrading the server if that will help, or working on more optimization, but I've kind of maxed out what I know about DB optimizing/scaling. Wanted to see if anyone on here has any thoughts/suggestions before I go down the Stack Overflow rabbit hole even further. Thanks in advance.
 


Couple things...

Are you using MyISAM or InnoDB?

Are you sure your my.cnf is properly tuned to take advantage of the 64GB RAM?

Do you know all the inserts you need to do ahead of the actual insert loop? If so, LOAD DATA INFILE is the quickest way to do mass inserts (though 200/s isn't enormous).
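Something like this (file path, table, and column names are just placeholders):

    -- bulk-load a file in one statement instead of row-by-row INSERTs
    LOAD DATA INFILE '/tmp/tracking.csv'
    INTO TABLE tracking
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    (col_a, col_b, col_c);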

Try bringing your table down to just one column (the primary key) and run your tests. Then add each successive column back one by one and test again, to see if you can pinpoint where your server starts choking on the inserts.
 
"Maxing my.cnf" not always the best thing to just pick higher #s

MySQL can easily handle that on a smaller server.

ø (average) per second: 567

That's on my server with 0.20 to 0.28 load, 4-core Xeon, 32 GB RAM. RAM usage is mostly under 8 GB.
(No SSD either, just 15k RPM enterprise drives.)
I haven't even optimized my.cnf, which I know would drop my load to 0.10 - 0.20, but I've been busy :) and this is acceptable.

12 days since MySQL restart: 688.8 GiB total, about 2.2 GiB per hour.


How fast is your CPU?

What drives specifically? Is MySQL on the same drive as the system? What about log files, which drive are they on? RAID, and if so, what kind?

What table type?
-- Are you locking during any queries?

Have you looked at Percona, if you want to stay with MySQL?


DISCLAIMER: I'm not a "DBA" but I do manage, configure and tweak mysql for a variety of projects/servers, and what you want to do is very very minor on today's hardware.
 
First of all, I presume you're using InnoDB as the storage engine. MyISAM can't do row-level locking, so the whole table is locked on every insert! Crazy stuff for high-volume inserts.
 
Insert multiple records with a single query.
Insert multiple records into MySQL with a single query
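For example (table and column names are placeholders), instead of one round trip per row:

    -- one statement, many rows: fewer round trips and fewer commits
    INSERT INTO tracking (col_a, col_b, col_c) VALUES
      ('visitor-1', '/page-1', 'ref-1'),
      ('visitor-2', '/page-2', 'ref-2'),
      ('visitor-3', '/page-3', 'ref-3');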

To speed up inserts, change the my.cnf option that controls how InnoDB commits data to disk. Normally MySQL flushes data to the drive the instant it is written, but there is an option to flush to disk only once per second, which greatly increases the number of inserts (or updates) you can handle.
See this post:
The only way you can get a decent figure is through benchmarking your specific use case. There are just too many variables and there is no way around that.

It shouldn't take too long either if you just knock together a bash script or a small demo app and hammer it with JMeter; that can give you a good idea.

I used JMeter when trying to benchmark a similar use case. The difference was I was looking for write throughput in terms of number of INSERTs. The most useful thing that came out while I was playing was the 'innodb_flush_log_at_trx_commit' param. If you are using InnoDB and don't need ACID compliance for your use case, try changing it to 2. This makes a huge difference to INSERT throughput and will likely do the same in your UPDATE use case. Note, though, that with this setting changes only get flushed to disk once per second, so if your server gets a power cut or something, you could lose a second's worth of data.

On my Quad Core 8GB Machine for my use case:

innodb_flush_log_at_trx_commit=1 resulted in 80 INSERTS per second
innodb_flush_log_at_trx_commit=2 resulted in 2000 INSERTS per second
These figures will probably bear no relevance to your use case - which is why you need to benchmark it yourself.
http://stackoverflow.com/a/12838938
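In my.cnf that boils down to one line in the [mysqld] section (just a sketch, not a full config):

    [mysqld]
    # flush the InnoDB log to disk once per second instead of at every commit;
    # you can lose up to ~1 second of writes on a crash or power loss
    innodb_flush_log_at_trx_commit = 2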

Also read this for some additional info:
MySQL :: MySQL 5.1 Reference Manual :: 8.3.2.1 Speed of INSERT Statements

MySQL should have no problem inserting 200 records per table per second as single-row queries.
Should you decide to use multiple inserts per query (inserting more than one row per query), you can bump that up even more.

That said, if you still want to look at alternative databases (though I don't see a reason why), this link will provide you with additional information:
mysql - what DB for high amount of inserts/sec - Stack Overflow
 
Chuckle: It's InnoDB. Good call on testing by adding the columns back one at a time to see which ones are causing the chokes. The inserts are generated in realtime, one per request to my pages, so I can't wrap them into LOAD DATA INFILE. I think my.cnf is relatively optimized - I actually used the Percona wizard at https://tools.percona.com/wizard to generate it.

Todd: The processor is an Intel E5645 at 2.4 GHz. Crazy that you're getting 567 inserts per second unoptimized - I'm guessing you're inserting less data than I am, though? I have disabled logging on both of the web daemons and MySQL to minimize hits to the disk. I do turn on slow query logging when testing to see where the holdups are, but not in production. No RAID. Not sure of the exact SSD... it's a Storm bare-metal server, but I'll check dmesg later and report the exact model. I haven't looked at Percona beyond using their tool to generate my my.cnf - I'll look into it more though if you think that's a good plug-and-play fix.

Klemen - Yeah, I can't bundle the inserts into one query - I use the individual ID returned from mysql_insert_id to track individual visitors. And yeah, I'm not insistent on switching DBs, especially since all I know is MySQL - I was more just throwing that out there in case MySQL couldn't handle what I'm wanting without scaling to multiple servers (I felt learning a new DB API would be easier than managing multiple servers and load balancing everything).
 
I went from that generation of CPU to a more recent one with fewer cores but a higher clock speed, and updated MySQL to a more recent, better-optimized version, and it made a huge difference in CPU load.

No, sorry. That figure is the total for the whole server; inserts are around 100/s on average.

I'd look at benchmarking / monitoring your disk drive. Does Liquid Web offer some type of "IOMeter"?
 
Try Klemen's suggestion to use innodb_flush_log_at_trx_commit=2.

Turn off binary logging if you don't need it.
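Both of those are easy to check from the mysql prompt, for example:

    -- 'ON' means binary logging is enabled (an extra write for every transaction)
    SHOW VARIABLES LIKE 'log_bin';
    -- confirm the flush setting actually took effect
    SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';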

Are you sure you are using the InnoDB table type? I've seen situations where you ask for an InnoDB table on a server that doesn't have the InnoDB engine enabled, and it just quietly creates a MyISAM table without you knowing it.
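You can check what engine the table actually ended up with, for example (database and table names here are placeholders):

    SHOW CREATE TABLE tracking\G
    -- or query the data dictionary:
    SELECT table_name, engine
    FROM information_schema.tables
    WHERE table_schema = 'your_db' AND table_name = 'tracking';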

Can you verify that the database is indeed stored on the SSD? Is the SSD directly attached to the machine, or do you have a SAN or something that runs over the network?
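To check, something like this (the path is just an example):

    -- find where MySQL keeps its data files...
    SHOW VARIABLES LIKE 'datadir';
    -- ...then, at the shell, see which device backs that path, e.g.:
    --   df -h /var/lib/mysql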

The server you described should be able to do thousands of inserts per second if configured correctly, so it's a bit surprising that you are only able to do a few hundred.

Are there any other processes running that may do an update to the table, or is it strictly inserts and selects running against it?

What version of MySQL are you running?

You should run SHOW PROCESSLIST on the server whenever it is running slow to get a better idea of what the bottleneck might be.
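For example:

    -- FULL shows the complete query text instead of just the first 100 characters
    SHOW FULL PROCESSLIST;
    -- or filter out idle connections (MySQL 5.1+):
    SELECT id, user, db, command, time, state, info
    FROM information_schema.processlist
    WHERE command <> 'Sleep'
    ORDER BY time DESC;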

You could also change the software to store the records in memory and batch flush them to the table every few seconds.
 
If you wanna stick with MySQL and you've run out of options... I guess your last resort is to consider sharding... lol. It sucks to implement and I don't recommend it unless absolutely needed.

If you can forgo MySQL, check out Cassandra (Apache), Mongo, or even Couch. Go for an SSD for the best write speeds.

- Ash