While I talk about using Gearman with PHP, it's language neutral, meaning that no matter what language you use it will work. There are already APIs for C, PHP, Perl, Python and C#, to name a few.
One of the main problems with PHP (and other script-based languages) is that threading support is non-existent. There are ways to overcome this, but they end up being, in my opinion, cheap hacks that make scaling and extending the code difficult.
This is where Gearman's power comes from: it allows for easy and rapid scaling through multiple worker processes across one or more servers; it removes the limitations of a single-threaded architecture and frees the code design to be dynamically scalable and fault tolerant.
While this makes it harder, at least initially, to write code, the rewards offset the initial challenge.
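To make that concrete, here is a minimal sketch of the two halves of a Gearman setup, assuming the pecl/gearman extension and a gearmand server on localhost:4730; the "reverse" function name and the file names are just placeholders. Workers register functions they can perform, and you scale by simply starting more copies of the worker script, on the same box or on other servers.

```php
<?php
// worker.php -- registers a function with the job server and waits for work.
// Scaling out is just a matter of running more copies of this script.
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);

$worker->addFunction('reverse', function (GearmanJob $job) {
    // Whatever the client submitted arrives as the job's workload.
    return strrev($job->workload());
});

while ($worker->work());
```

```php
<?php
// client.php -- submits a job to the server and waits for the result.
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);

echo $client->doNormal('reverse', 'Hello Gearman'), PHP_EOL;
```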
I'm not going to go into Gearman's features in depth, since you can read about them on the project's site, but I will briefly outline the pros and cons that I think are most relevant.
Pros
- Scaling is as simple as adding more servers and/or more modules (I use the term module to describe code that performs a certain action, for example scraping Google SERPs).
- It's language neutral: you can write a module in PHP and have the results sent to a script in Perl (or to a C# Windows app, etc.).
Cons
- There is no encryption of the data sent to and from Gearman. You can work around this by sending the data through an encrypted tunnel, but Gearman won't do it for you. TLS and SASL support are on the roadmap, but when they get added is anyone's guess.
- No native compression. This can be worked around in the same way as encryption (see the sketch after this list), but since the packets are basically raw text with a bit of binary, the data sent between modules can add up quickly without it. Note, though, that Gearman was never designed to run outside a trusted internal LAN, so data use and packet size were never an issue.
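Here is a rough sketch of that compression workaround, assuming the pecl/gearman extension and PHP's zlib functions; the "fetch_page" function name and the JSON payload layout are made up for the example.

```php
<?php
// Client side: compress the payload before handing it to Gearman,
// and decompress whatever the worker sends back.
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);

$payload = gzcompress(json_encode(['url' => 'http://example.com/']), 9);
$html    = gzuncompress($client->doNormal('fetch_page', $payload));

// Worker side: decompress the workload, do the work, compress the result.
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);
$worker->addFunction('fetch_page', function (GearmanJob $job) {
    $args = json_decode(gzuncompress($job->workload()), true);
    $page = file_get_contents($args['url']); // placeholder for the real fetch
    return gzcompress($page, 9);
});
while ($worker->work());
```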
The problem with a traditional single-process design, even though it works, is that it can't scale effectively: you're limited to the speed at which that single operation can be processed. Even if you use cURL with multiple simultaneous requests, you are still limited by the time it takes for them to complete. It's also a fantastic way to saturate your bandwidth.
On the other hand, if we look at the code design for the same request with Gearman, it not only allows the code to scale but also makes effective use of the resources of the server, or of multiple servers.
You could, if you wanted to, scale the whole process as well.
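As a rough illustration of what that design looks like in code, the sketch below fans a batch of jobs out through a single client and lets whichever workers are free pick them up; the "scrape" function name and the URL list are placeholders, and it assumes workers for that function are already running.

```php
<?php
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);

// Collect results as each task finishes, whichever worker handled it.
$results = [];
$client->setCompleteCallback(function (GearmanTask $task) use (&$results) {
    $results[] = $task->data();
});

// Queue one task per URL; they run concurrently across the available workers.
$urls = ['http://example.com/a', 'http://example.com/b', 'http://example.com/c'];
foreach ($urls as $url) {
    $client->addTask('scrape', $url);
}
$client->runTasks(); // blocks until every queued task has completed

echo count($results), " pages scraped\n";
```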