I just uploaded pics and information on the new server, take a peek!
I was thinking for some time about a fitting subject line for this post. Turns out: there is no fitting subject line. “Argh!” is the most fitting I could come up with. “But why, Chris?” you might ask. “Server goodness”, I would respond.
As you all know, the new server came with faulty Hitachi hard disks that like to drop out of the RAID under I/O stress. Dell sent replacement disks to address the issue. They sent Hitachi disks again, but I gave them a try anyway. Turns out the problem persists with those as well.
Yesterday I sent a new batch of logfiles documenting the persistent error to Dell. After analyzing them, they came up with a solution: “We need to replace all disks with another brand…” – I’ve heard that line before – “… and also replace the RAID controller” – that one I have not.
The replacement disks and RAID controller are on their merry way to me. My task: back up the entire server again, swap in the new hard drives, and replace the RAID controller. Followed by an even happier complete system rebuild. What does this mean for …
– you: Downtime tomorrow. Approx noon (WEST) to evening’ish.
– me: Lots and lots of work.
The irony: the new server, with its dual power supplies and RAID 10, was supposed to eliminate downtime. Instead, we’ve never had this many downtimes.
Update 25.06.2013, 21:41: Server is operational again, and will stay up this time! 😉
We’re not getting any breaks, are we?
The RAID that shipped with our server is faulty, according to Dell. I saw several errors on the console, accompanied by RAID rebuilds and all. Not good. Not funny.
So I sent the error logs to Dell (remember the small downtime on Wednesday?) and they analyzed them. I just got the verdict: the hard disks need to be swapped. All of them. For a different model. That means: back up all VMs to external media, then rebuild the RAID and the hypervisor, then reimport every single VM. This is as work-intensive as it is time-consuming.
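The back-up-everything-then-reimport round-trip can be sketched as a small shell script. The directory names are pure assumptions (the real hypervisor almost certainly has its own export/import tooling); the point is the checksum step, so nothing gets copied back from the external media unverified.

```shell
#!/bin/sh
# Sketch of the VM backup/restore round-trip. VMDIR and MEDIA paths
# are hypothetical placeholders, not the server's actual layout.

backup_vms() {
    vmdir="$1"    # directory holding the VM images
    media="$2"    # mount point of the external media
    mkdir -p "$media"
    cp "$vmdir"/*.img "$media"/
    # checksum the copies, not the originals, so we verify what we restore
    ( cd "$media" && sha256sum *.img > checksums.txt )
}

restore_vms() {
    vmdir="$1"
    media="$2"
    # refuse to reimport anything that doesn't verify
    ( cd "$media" && sha256sum -c --quiet checksums.txt ) || return 1
    mkdir -p "$vmdir"
    cp "$media"/*.img "$vmdir"/
}

# Example (hypothetical paths):
# backup_vms  /var/lib/vms /mnt/external
# ... rebuild RAID + hypervisor ...
# restore_vms /var/lib/vms /mnt/external
```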
Anyway. There will be a MASSIVE downtime from tomorrow, 12:00 GMT to Wednesday early morning hours.
I am truly sorry – and you folks know alpha-labs.net is normally rock-solid. Let’s call it growing pains.
Update: The Server is back online as of 20:00 WEST Friday, 28 hours ahead of schedule.
Our new server hardware is due to arrive soon. To facilitate the move, I have to take the server down for a level 0 off-line backup. The expected downtime is
from Saturday, 01:00 WEST (Friday, 23:00 GMT)
to Saturday, 10:00 WEST (Saturday, 08:00 GMT).
The server will be completely unavailable during this time.
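For the curious: a level 0 backup simply means a full dump of everything, as the baseline that later incremental levels would diff against. A minimal sketch, assuming the images live under one directory tree and the external media is already mounted (both paths are illustrative, not the actual setup):

```shell
#!/bin/sh
# Sketch of a level 0 (full) off-line backup into a single archive.
# Paths and naming are assumptions; adjust for the real server layout.

level0_backup() {
    src="$1"     # directory tree to back up, e.g. the VM storage
    dest="$2"    # mount point of the external media
    stamp=$(date +%Y%m%d)
    mkdir -p "$dest"
    # -c create, -z gzip, -p preserve permissions; -C so paths are relative
    tar -czpf "$dest/level0-$stamp.tar.gz" -C "$src" .
    # read the archive back once before trusting it
    tar -tzf "$dest/level0-$stamp.tar.gz" >/dev/null && echo "backup verified"
}

# Example (hypothetical paths):
# level0_backup /var/lib/vms /mnt/usb-backup
```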
Update: The downtime concluded within the given timeframe.
I just switched from the official Google DNS (8.8.8.8, 8.8.4.4) to my own DNS server. With full caching, root zone transfers, and statistics logging, lookups are faster. This should benefit us all, along with removing one dependency on a major corporation.
Even though you can use alpha-labs as your DNS server, you should refrain from doing so for now. The DNS server’s IP will change again in a few weeks, and it is still in testing.
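For reference, the caching-plus-root-zone setup boils down to a few lines of resolver config. This is a BIND 9-style sketch with placeholder addresses, not my actual configuration:

```
// Minimal caching resolver sketch (BIND 9 syntax).
// ACL ranges and the transfer source are placeholders.
options {
    recursion yes;                                // full caching resolver
    allow-recursion { 127.0.0.1; 10.0.0.0/8; };   // hypothetical client ranges
    dnssec-validation auto;
};

// keeping a local copy of the root zone is the "root zone transfer" part
zone "." {
    type slave;
    masters { 192.0.2.1; };                       // placeholder transfer source
    file "root.zone";
};
```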
and having a simple splash screen with the server’s logo does not cut it anymore. Incredibly, alpha-labs.net has made it from 2001 to 2013. That’s twelve years of service, with only two major downtimes in all that time. I honestly can’t tell how many petabytes of data we have pushed around, but it’s safe to say a few.
Also, with the new projects coming up, this server just begs for a new, central place of information. This is it.