Dispatches From The Geeks

News and Announcements from the MCS Systems Group

Mail problems resolved

A critical server crashed in the middle of the night, taking down CI authentication. Normally this would not cause much of a problem, since we have backup authentication servers. However, a misconfiguration on the Zimbra servers appears to have caused authentication to fail against the backup servers as well. This set off a cascading problem that made the Zimbra servers unresponsive. Mail was still coming in, but nobody could log in to check it.

The initial failure has been fixed (the authentication server is now back up), and we’re digging through the mess to make sure we fully understand why the other servers didn’t work as expected. We’ll have this bolted down so that the next failure results in a proper fallback to the redundant servers.

Sorry for the inconvenience.


Written by Craig Stacey

April 25, 2012 at 2:33 pm

