1.5 days of downtime ...

Status
Not open for further replies.

erect

Never had a complaint before about zone.net but I'm getting a little pissed at the moment. 1 of my 5 hosting accounts has been down since 3AM or so on Friday morning.

So like, I'm a forgiving soul, but shit, at some point you'd think that running a service once or twice would suffice, but running fsck 4 times!!!

The worst part about it is that their forums were apparently affected by the downtime, so no community conversation. Only promises of an RSS feed for further downtime and an email in 24 hrs explaining the issues.

Could someone familiar with the server side of things take a look at their Downtime Status page and translate that into fucking English for me? I'm a pretty big nerd but don't know (and don't care) much about *nix as it applies to hosting.

Some terms like FSCK, MAGGIE & Lost+Found Inode I can google but I'd rather hear from someone with experience as to why a server would go into these stages and not what the programs do.

BTW: zone.net has had a solid rep up to this point (I checked before signing up) and I'm willing to accept a black eye, but it's getting a bit frustrating.

No PPC happening on this server, just organic sites. That's almost worse, as downtime can really affect Google's opinion of my site.
 


^^^^ what he said. fsck is trying to fix a corrupted filesystem, meaning if it cannot, chances are strong you've just lost all your data on that disk. maybe the power dropped out on that box and it fooked the filesystem, fsck is usually run on reboot...

good luck
 
Thanks guys, this was actually a planned downtime and they are talking about "no suspected loss of data" ... If they knew this would be possible, what good reason could they have for not backing up all data before they went through the process?

Our SLA guarantee applies to this downtime, and all users affected by this downtime (about 35) will receive a 100% credit for next month's hosting on the new node.

This has been, by far, the longest downtime in ZONE.NET history, and the first real downtime this node has ever faced (uptime was somewhere above 350 days according to our records). We're doing all we can now to prepare for when the node comes back online, and will keep you updated with any information we have.


Yea, $65 free ... It's just a shame this doesn't cover my lost profits.

I'm still happy with zone.net other than this. In my opinion, this is not a fuckup on their part, rather a hardware failure of sorts. Have you seen anything to imply otherwise? If not, I have no problems sticking with them regardless of this catastrophic downtime.
 
1:47 AM EST (11/10/2008)
We are still waiting for the FSCK to complete. Expect another update in the morning.



Still down ... getting a bit frustrated, obviously. I'm going to get back online and find that Google has deindexed all of my sites on this server (like 20% of my total sites). I understand that they are working on the problem, but this alone is enough to make me go elsewhere even though I've been happy with them up till now ... I guess, however, this could easily happen on any server. It's just disastrous and no good can come of it at this point.
 
I know that whenever it gets this far it is usually followed by a great deal of lost data. The storyline goes backups were corrupted, disk was bad, fine print says we are not liable, company stops responding.
 
Thanks for the link subigo ... hadn't read that.

Server just got back online but from what I understand the service is spotty now as EVERYONE is taking backups to move to a different host ... overloading the CPU

Just checked 4 of my smaller sites and not a single indexed page on any of them. To me, that's really worse news than having to relocate to a different server.

Since I've weathered this storm (and prepaid until Feb 09) I'll probably hang tight and take my time looking around for a different host. I paid what I consider a premium for this VPS ($65) as there are many cheaper on the market ... customer support sucks either way you look at it.

Fuck!!!!
 
We will be extending our SLA coverage to 3 months for customers on MAGGIE for this downtime. We will also be giving hot-spare servers for these 3 months at no charge. Please open a ticket with sales to get this credit added to your account.

I'm going to stop posting on this thread now that things are back to normal, thanks for all the input. I just wanted to comment real quick that this can happen to any server at any time.

BACKUPS ARE CRITICAL!!!

I had them, but figured I'd wait out the storm before shifting my bets around the table. I'm second-guessing this logic now, but hindsight is always 20/20 ... probably should have moved this stuff on Friday afternoon. But it really would have been jumping the gun to retreat after 12 hours down.

Regardless, they are comping 3 months and giving me a free hot spare. At least they are stepping up to the plate and taking responsibility for this issue. I'm sure this hurts financially, oh well, not my buck :]

For the record, zone.net is not shit .. they just have bad luck
 
BACKUPS ARE CRITICAL!!!
You were doing backups but didn't move them?

If you're not already doing them, here's a quick automated way to get that going. I use a cheap arse Dreamhost account, cron, scp (ssh), and zip (or tar) for a Unix box.

- Set up a trust with ssh from your box over to the DH account (loads of tutorials out there, it's just the simple matter of copying your box's public key over to DH).

- Set up a cron job to tar (or zip) up your files.

- Then have the cron job scp your files over to your DH account.

Really cheap, quick, and easy - DH has gobs of disk space to dump crap on...
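
For anyone who wants something to start from, here's a rough sketch of those steps in Python. Everything specific in it is made up for illustration: the source directory, the temp directory, and the remote account are all placeholders, and it assumes key-based ssh to the remote box already works so scp never prompts for a password. Treat it as a starting point, not a tested backup system.

```python
import subprocess
import tarfile
import time
from pathlib import Path

# Hypothetical values, swap in your own paths and account.
SOURCE_DIR = Path("/var/www")
ARCHIVE_DIR = Path("/tmp")
REMOTE = "backupuser@example.com:backups/"


def make_archive(source_dir, archive_dir):
    """Create a timestamped .tar.gz of source_dir inside archive_dir."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = Path(archive_dir) / f"{Path(source_dir).name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the folder name
        tar.add(source_dir, arcname=Path(source_dir).name)
    return archive


def scp_command(archive, remote):
    """Build the scp call. BatchMode makes it fail rather than ask for a password."""
    return ["scp", "-o", "BatchMode=yes", str(archive), remote]


def run_backup():
    archive = make_archive(SOURCE_DIR, ARCHIVE_DIR)
    # check=True raises if the copy fails, so the local archive is kept
    subprocess.run(scp_command(archive, REMOTE), check=True)
    archive.unlink()  # drop the local copy only after a successful upload


if __name__ == "__main__":
    run_backup()
```

To run it nightly, a crontab line along the lines of `0 3 * * * /usr/bin/python3 /home/you/backup.py` would do it (that script path is also just an example). Worth restoring one of the archives now and then to make sure they're actually usable.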
 
Shit. That's pretty bad man.
We got DDoSed over the weekend. Little fucker waited until Friday evening, Euro time, so there was well and truly no one in the office to take care of shit for a good few hours.
The boss was kicking himself about not having a 2nd server.

If this was scheduled downtime, I'd be asking two things:
1) How come it's taking so much longer than expected?
2) Why wouldn't they host your shit on another box temporarily and redirect as appropriate?

They don't have fail-over or cascade setups at all?
 