Try to guess my answer to that question, however temptingly you pose it.
Mario Monti

They Allmanaged to Ignore Me

My Advice? Choose Your ISP with Immense Care...

 

Gateway to Oblivion

Source: flickr.com

We had an account with Sago Networks/AllManaged.com for over a year.  Our former hosts were not without their problems, but overall the experience was mostly positive.  True, it took them a week to provision the server.  And true, they accidentally unplugged it once while performing maintenance on a neighbouring server (luckily I noticed before more than an hour had gone by).  And it's true that a botched network infrastructure upgrade caused 2-3 days of downtime.  But, after all, nobody is perfect.  On the other hand, there are limits.  Read this ticket history - how would you respond?  I know how I did...

Ticket Details

URGENT - total system failure

ID:  -  NTY-42940-283
Status:  -  resolved
Priority:  -  high
Opened:  -  Sun Mar 19 2006 03:48AM
Last Msg:  -  Thu Mar 23 2006 11:38PM

Sun Mar 19 2006 03:47AM by cody.hatch@gmail.com

I'm having serious problems with my server.  It seems to have experienced some form of serious hardware or disk failure.  Apache won't start, Plesk is giving error messages, and the /usr/home partition on the /dev/ad0s1g device doesn't seem to be mounting.

I have no idea what the problem is, or how to proceed.

Help?


Sun Mar 19 2006 03:58AM by support@allmanaged.com

Hello,

Can you provide details to the error messages you were receiving?

Let us know if you need anything else.

Sago Tech Support


Sun Mar 19 2006 04:25AM by cody.hatch@gmail.com

I didn't make a note of the errors in Plesk.  They appeared to be caused by the failure of some partitions to mount.  I'm not sure where to look exactly to find the error messages given by Apache when it didn't start.

I note that /var/log/messages is getting a lot of errors such as the following:

Mar 19 04:10:41 chaos kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=133103871

I'm unsure what that means, however.  The system seems to be experiencing some problems.

Running:

du -hs

In /home/vhosts/ took an extremely long time, and spat out dozens of errors such as:

du: /home/vhosts/flatrock.org.nz/httpdocs/topics/animals/assets/whales1_sei.jpg: Input/output error

Does this mean the hard disk or file system is failing?
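
[Ed: In hindsight, a kernel message like the one quoted above can be pulled apart mechanically, which makes it easier to count failures or see which part of the disk is affected. What follows is a minimal sketch in Python, assuming the line format shown in my ticket; the function name and the regular expression are my own illustration, not part of any FreeBSD tool, and the 512-byte sector size is an assumption about this drive.]

```python
import re

# The log line quoted in the ticket, copied verbatim.
LINE = ("Mar 19 04:10:41 chaos kernel: ad0: FAILURE - READ_DMA "
        "status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=133103871")

# Hypothetical pattern written for this one message format only.
PATTERN = re.compile(
    r"kernel: (?P<device>\w+): FAILURE - (?P<command>\w+) "
    r"status=(?P<status>[0-9a-f]+)<(?P<status_flags>[^>]*)> "
    r"error=(?P<error>[0-9a-f]+)<(?P<error_flags>[^>]*)> "
    r"LBA=(?P<lba>\d+)"
)

SECTOR_BYTES = 512  # assumption: classic 512-byte sectors


def parse_ata_failure(line):
    """Return the fields of an ATA failure line, or None if it does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    lba = int(m.group("lba"))
    return {
        "device": m.group("device"),
        "command": m.group("command"),
        "status_flags": m.group("status_flags").split(","),
        "error_flags": m.group("error_flags").split(","),
        "lba": lba,
        # Approximate byte offset of the bad sector on the disk.
        "byte_offset": lba * SECTOR_BYTES,
    }


info = parse_ata_failure(LINE)
print(info)
```

[Ed: An UNCORRECTABLE flag on a read command means the drive could not return the data for that sector even after its own error correction, which fits the Input/output errors that du reported.]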


Sun Mar 19 2006 04:40AM by support@allmanaged.com

Hello,

The errors you have noted are symptoms of hard drive issues.  When could we bring your server offline to perform hard drive diagnostics and possibly an fsck in single mode?  The downtime you would want to expect could be up towards 2 hours for this.

Please note, if the drive is found to be failing, we will need you to be on standby to begin the reinstall process.  Most of the time the errors can be corrected using the spare drive space allocations placed on the drive by the drive manufacturer.

Let us know if you need anything else.

Sago Tech Support


Sun Mar 19 2006 04:50AM by cody.hatch@gmail.com

Quote:

"Please note, if the drive is found to be failing, we will need you to be on standby to begin the reinstall process."

I'm not sure I understand.  Can you explain that in more detail?  What will happen if the drive is, in fact, failing?

Thanks...


Sun Mar 19 2006 05:05AM by support@allmanaged.com

Hello,

Your server's OS will be reinstalled using a new disk.

There is information we will need to obtain from you to begin the reinstall process.  Also, if you have not made recent backups of your data off of your server, I would recommend this as a precaution.

Let us know if you need anything else.

Sago Tech Support


Sun Mar 19 2006 06:28AM by cody.hatch@gmail.com

All right.  Ideally I'd like to have the diagnostics done fairly late at night to minimize the impact to users - preferably some time after midnight, EST.  Is that possible?  If so, it would be good if this could happen as soon as possible - tonight, perhaps?  I will be available.

By the way, what information will you need to begin the reinstall process?  In the meantime, I'll be ensuring that I have backups of all the data...


Sun Mar 19 2006 06:53AM by support@allmanaged.com

Hello,

We will address the reinstall process in further details if we reach that point.

As sectors on a hard drive become unusable or "wear out", the drive manufacturers provide a buffer - for example, if your server has an 80Gb hard drive, the drive mfg actual total capacity may be 100Gb but only reports 80Gb on the hard drive's map to provide a buffer so their software can manipulate the way sectors are mapped to your drive.

We are placing this ticket on hold for the technician scheduled at Midnight EST / 0:00 AM 3/20/06 and will begin diagnostics at that time.

Let us know if you need anything else.

Sago Tech Support

Wire You not Helping Me?

Source: www.ritilan.com

Mon Mar 20 2006 03:19AM by support@allmanaged.com

Hello,

The hard drive diagnostics did detect an unrepairable read element failure.

In the event of a failing hard drive, Sago follows the following procedures for reinstalling your system:

  1. Give the customer an opportunity to perform OS or Control Panel backups (if possible).
  2. Open the server and label the existing drive(s).
  3. Check the cables, fans and connections in the server.
  4. Place a new drive into the server and install the OS and partitions.
  5. Slave the old drive to the server.  (This depends on if BIOS recognizes the drive).
  6. Leave the old drive on the server for a period of no more than 72 hours.

It is the customer's responsibility to copy any salvageable data from the failing hard drives.  Sago will not be held responsible for lost data or lost time due to hardware failure of any kind.  If you need to recover data from any old drives, it must be instructed in the ticket so that a record is kept of your requests and not done by phone conversations.  Also, please advise if you have any specific OS versions, partitioning schemes, or control panels you wish to have deployed to your newly installed OS.

Thank you

Sago Tech Support


Mon Mar 20 2006 04:58AM by cody.hatch@gmail.com

The /usr/home partition on /dev/ad0s1g will not mount.  As such, the web server is down.  Please replace the drive as soon as possible, as it's extremely important to me for the web server to be functioning.

Please install FreeBSD and Plesk.  I have no particular partitioning requirements.

Can you give me an estimate on how soon this can be fixed?


Mon Mar 20 2006 05:57AM by support@allmanaged.com

Hello,

Your reinstall has started at this time.  The OS install should be completed within a few moments and then we can begin to run scripts to begin to secure the server followed by installing Plesk. An ETA for completion should be expected approximately 9am - 10am EST.  As you are an AllManaged.com customer, we can then begin to assist you with data restoration if you would like assistance.

Let us know if you need anything else.

Sago Tech Support


Mon Mar 20 2006 09:52AM by cody.hatch@gmail.com

Can you give me an update on the current status and estimate on when the install will be done?

Thanks


Mon Mar 20 2006 11:33AM by cody.hatch@gmail.com

Can you give me some sort of status update?  I was told completion would be 9 - 10am, but it's now approaching 12pm, and I still haven't heard anything...

Thank you for your time.


Mon Mar 20 2006 11:48AM by support@allmanaged.com

The re-install technician installed BSD 5.2.1 again.  Unfortunately, Plesk only supports 4.9 and 5.3 now.  I am re-installing with FreeBSD 5.3 now.

Please let me know if I can be of further assistance.

K.J.

Sago Tech Support


Mon Mar 20 2006 04:38PM by cody.hatch@gmail.com

All right - so what's the status of the server now?


Mon Mar 20 2006 06:05PM by cody.hatch@gmail.com

Please contact me and let me know what the status is and when it will be back up.

The server has been down for over 12 hours now, it has been over 7 hours since I was told the server would be back up, and it's been over 5 since I've even received a reply.

What is going on?!


Mon Mar 20 2006 07:56PM by support@allmanaged.com

Hello,

I do apologize for the delay in getting this resolved.  It appears that there were a number of problems during the reinstallation process.  The technician had left today at 5pm with Plesk currently at 33% installed and it was taking awhile.  Plesk apparently has finished sometime between then and now.  So as of right now your server reinstallation is complete.

The next step would be for me to go ahead and place your old primary hard drive to your server and then move this ticket over to Allmanaged so they can take care of the restore.  I feel obligated to tell you that restores are not an instantaneous thing and sometimes it can take some time to perform the reinstall.  I only tell you this to set your expectation of what you will experience from this point.

I will go ahead and place your old drive on your server momentarily.  I will update your ticket next once I am moving it to the Allmanaged queue.  Thank you.

--------------

Max R.
General Support
Sago Networks


Mon Mar 20 2006 08:09PM by support@allmanaged.com

Hello,

Your old primary hard drive has been attached to your server.  I am moving this ticket to the Allmanaged queue so they can process your restoration.

Thank you.

--------------

Max R.
General Support
Sago Networks


Mon Mar 20 2006 08:14PM by cody.hatch@gmail.com

Thank you.  I appreciate the information.


Mon Mar 20 2006 08:27PM by cody.hatch@gmail.com

The single most critical item for me is to get email service for the domain chaos.net.nz functioning as quickly as possible.  In particular, I need the mail accounts that existed to be restored or recreated.

Is the old drive sufficiently readable that old email settings/account details/passwords/etc can be recovered from it?  I have no backup of the Plesk settings, as far as I'm aware.  Or will the individual email accounts need to be recreated from scratch?

Thank you for your time.


Mon Mar 20 2006 09:54PM by support@allmanaged.com

Hello,

I have notified an Allmanaged tech of your request for a response on an ETA for the restoration.  They are currently investigating the situation to ensure they give you a proper response.  You should receive a response shortly.

Thank you.

--------------

Max R.
General Support
Sago Networks


Mon Mar 20 2006 09:59PM by cody.hatch@gmail.com

Can I please get an update on current status or an estimate of when things will be done?

I'm a bit unsure as to what's going on - the last response to this ticket said that it was being transferred to the Allmanaged queue, but that was roughly two hours ago.  Web, email, plesk, etc. are all non-functioning, and I'm in the dark about what's happening, what's being done to fix it, or when it will be done.  I'm sure you can understand this is frustrating!

It's my understanding that an Allmanaged tech is working (will work?) on the machine to try and restore the old configuration.  Is this correct?  If so, can you tell me when I should hear whether this has succeeded?

As always, thank you for your time.


Mon Mar 20 2006 10:00PM by cody.hatch@gmail.com

My apologies - I didn't see Max's reply of 09:54PM until after I had sent that last message.


Mon Mar 20 2006 11:30PM by support@allmanaged.com

Hello,

We are working on the issue with top priority and hopefully we will finish it within 8 hours time.

Regards,
General Support Team.
Sago Networks/Allmanaged.com.


Mon Mar 20 2006 11:36PM by cody.hatch@gmail.com

Still wondering what's going on, and when it will be done...

I'm aware that these things take time...however...how *MUCH* time?  What's being done?  When will it finish?

Communication has been very poor so far, and I'm getting somewhat frustrated at the lack of updates!  Do you have any estimate of when the machine will be done being restored and what state it will be in at that time?


Mon Mar 20 2006 11:50PM by cody.hatch@gmail.com

Would you mind telling me what the problem is that is taking so long?  I don't mean to be critical, but the delay seems surprising.

It has now been 24 hours since the server was scheduled to come down for diagnostics, and over 20 hours since it has been in any way functional.

As I hope I made clear in earlier replies, my first priority is that email service be restored as quickly as possible.  If the old account information could be restored from the old hard disk that would be nice; if not I can recreate them manually.

It has been almost 4 hours since then, and I'm now told it will be an additional 8?  Is this really the fastest that this can occur?  12 hours doesn't seem amazingly fast - is there some issue causing the delay?  If so, what? Are you having to wait for a daytime tech?  Is my ticket simply in the queue?  Or what?

Please reply with any information you have...this will apparently end up being 30+ hours of downtime, and I'm getting a little stressed here...


Tue Mar 21 2006 01:29PM by cody.hatch@gmail.com

No word?  What's going on?  What's the problem?  It's been 8 hours - plus another 6!


Tue Mar 21 2006 03:04PM by support@allmanaged.com

Hello,

Sorry for the delay caused.  We faced some issues while trying to complete the task and we have contacted the Plesk team regarding this issue.  We will get back to you as soon as we receive an update from them.

Regards,
General Support Team.
Sago Networks/Allmanaged.com.


Wed Mar 22 2006 06:49AM by cody.hatch@gmail.com

So, looks like it's been almost 16 hours since I heard from you last.  What's the status?  What issues did you encounter?  Have you heard from the Plesk team?  Can you give me an estimate on when this will be resolved?

Let's see...you took the server down for diagnostics at midnight, Monday morning.  It's now 06:40 Wed morning.  That makes, by my count, almost 55 hours since I had a functioning server. Out of curiosity, do hardware replacements normally take this long?  I note on your FAQ you say:

"Any hardware failure will be taken care of as quickly as possible with service restoration attempts if necessary as soon as service is restored.  We guarantee hardware replacement within 2 hours of failure."

Just wondering...


Wed Mar 22 2006 10:18PM by cody.hatch@gmail.com

Okay, so it's now been 31 hours since your last response to this ticket.

I don't have a server, and I don't seem to be getting any replies.  I can't imagine a business that treats customers in this fashion, so I can only assume that I am no longer a customer.  You aren't actually planning on fixing this problem, are you?

DisOrganization...

Source: knuttz.yi.org

Update: Well over 12 hours later, the skilled techs at Sago Networks/Allmanaged.com informed me that the server was finally set up, configured, and ready to go.  They gave me a password to log in.  Needless to say, the password was incorrect, and the server was incorrectly configured and non-functioning.  This led to the following exchange:

Wed Mar 22 2006 06:49AM by cody.hatch@gmail.com

I went to the address but the password does not work.

Above you said "Your server's current admin/root password is: ********" so I tried that.  That worked, but I went through the setup screen and it did not ask me for a key.  I can now go to license management but it only lets me upload a license key file.  I do not have one.

I am dead in the water for now.  This is urgent.

Please respond.


Tue Mar 21 2006 03:04PM by support@allmanaged.com

Hello.

Try logging in using admin and the password : ******** [Ed: This was the same password as above.]

If this does not work, please reopen this ticket and we shall go from there.

Several tickets, a couple of hours, and two phone calls later, they managed to understand my problem and fix it.  Total time elapsed: Almost exactly 96 hours, or four full days.
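
For anyone who wants to check my arithmetic, the figures can be reproduced from the ticket timestamps. This is only a rough sketch in Python: it assumes every timestamp is in the same time zone, that the outage began when the server came down for diagnostics at midnight on Monday, and that the ticket's "Last Msg" time marks the fix. The helper name is my own.

```python
from datetime import datetime

# Timestamp layout used throughout the ticket, e.g. "Mon Mar 20 2006 11:48AM".
FMT = "%a %b %d %Y %I:%M%p"


def hours_between(start, end):
    """Elapsed hours between two ticket timestamps (same time zone assumed)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 3600


# Assumption: downtime starts at the scheduled diagnostics, midnight Monday.
server_down = "Mon Mar 20 2006 12:00AM"
# Assumption: the "Last Msg" field of the ticket marks the resolution.
last_message = "Thu Mar 23 2006 11:38PM"

total_hours = hours_between(server_down, last_message)

# The longest silence I complained about: their Tuesday reply to my Wednesday message.
silence_hours = hours_between("Tue Mar 21 2006 03:04PM", "Wed Mar 22 2006 10:18PM")

print(f"Total downtime: {total_hours:.1f} hours")
print(f"Longest silence: {silence_hours:.1f} hours")
```

Under those assumptions the total comes out a little under 96 hours, and the silence a little over 31, which matches what I wrote in the ticket.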