Dispatches From The Geeks

News and Announcements from the MCS Systems Group

Zimbra Upgrade Postponed – Mailing List and RT outage – Blog move

Greetings, folks!

First up, the announced upgrade to Zimbra that was supposed to happen on Sunday has been postponed. Hangups in other areas kept us from doing adequate testing beforehand, and we like to be sure before we upgrade. I’ll announce the new upgrade date when it’s picked.

However, we are doing some maintenance on the mailing list server this weekend that will temporarily take out all mailing lists and RT trouble tickets. It’ll take about an hour while we sync data between the hosts, so we’re calling the downtime 11AM to noon on Jan 30. During this period, mail sent to mailing lists will queue up and be delivered after the work is complete. Likewise trouble tickets. In both cases, the web interfaces to the services will also be unavailable.

Lastly, the Systems Blog has moved to a new home at http://mcssys.posterous.com (though you can still get there via http://mcs.anl.gov/systems/blog). It’s not that we don’t like the service we provide at http://press.mcs.anl.gov; the move lets me (and others) post to the blog via e-mail (which is nice), with the added bonus of a site that stays up even if the entire lab does not (due to either network or power catastrophes). In those situations, we can inform you that things are down and why. You know, if you’re not here, sitting in your dark office wondering why your computer screen is so uninteresting.

Let me know if any of this poses a problem.

Thanks!

==
Craig Stacey, IT Manager, MCS & CI
stace@mcs.anl.gov
stace@ci.uchicago.edu

Written by Craig Stacey

January 27, 2011 at 11:19 pm

Posted in Uncategorized

Mailing list and RT outage – UPDATE: Fixed

We’ve had a failure on the server that handles mailing lists and trouble tickets. We’re bringing RT back online via a backup machine ASAP, and are working on getting the mailing list data in sync so we can bring that back.

Mail is not being lost, it’s just queueing up.

Will post updates here as they happen.

System is up, mail is flowing again.

Written by Craig Stacey

December 22, 2010 at 12:07 am

Posted in Uncategorized


Out of Office Messages

As many of you are probably setting out of office messages before heading out for the break, I want to make sure they work the way you (and I) expect them to work.

Here’s the short version:  If you can, take a moment to visit https://accounts.mcs.anl.gov/account.php and make sure your “Preferred Email” is the mail address you use as your primary mail address.  For instance, if you generally use an “@anl.gov” address, make sure it says that.  It should only say “<your username>@mcs.anl.gov” if that’s the one you expect people to generally send mail to.

(While you’re there, it wouldn’t hurt to make sure the rest of the info is up to date as well.)

You’ll be good to go.  Now, here’s an explanation as to why this is necessary, for those who care.

Zimbra, like most mail servers, will only send an out of office response on mail that’s addressed to you specifically.  This prevents it from responding to mail sent to mailing lists (like this one I’m sending now) which is *terribly* annoying and spammy.  

The problem is that Zimbra doesn’t have everyone’s address right. Until we migrate into actual separate domains early next year, everyone’s got an address in every domain we have. I’m stace@mcs.anl.gov, stace@alcf.anl.gov, and stace@cels.anl.gov. On top of that, through http://www.anl.gov/alias, I also have stace@anl.gov. And, for extra fun, I’m also stace@ci.uchicago.edu. However, as far as Zimbra’s concerned, my address is stace@mcs.anl.gov. Mail to *any* of those addresses will reach me, but only mail to stace@mcs.anl.gov will trigger my vacation responder when it’s active, unless I do something about it.

Unfortunately, most of our users have their address set, right or wrong, to consider foo@mcs.anl.gov canonical.  Folks in ALCF who use @alcf.anl.gov aren’t going to have working autoresponders.  Ditto for anyone who uses @anl.gov by default.
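
To make the behavior described above concrete, here is a minimal Python sketch of the "only reply to mail addressed to you specifically" rule. It is an illustration only, not Zimbra's actual implementation: the function name `should_autoreply` and the list address `example-list@mcs.anl.gov` are hypothetical, and a real responder would check more than the To and Cc headers.

```python
from email import message_from_string
from email.utils import getaddresses


def should_autoreply(raw_message: str, canonical_address: str) -> bool:
    """Return True only if the message is addressed directly to canonical_address.

    Hypothetical helper for illustration; not Zimbra's real logic.
    """
    msg = message_from_string(raw_message)
    # Collect every address that appears in the To and Cc headers.
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = {addr.lower() for _name, addr in getaddresses(headers)}
    # Reply only when the one address the server considers "yours" is listed.
    return canonical_address.lower() in recipients


# Mail sent straight to the canonical address triggers a reply.
direct = "From: someone@example.org\nTo: stace@mcs.anl.gov\nSubject: hi\n\nHello"
# Mail to an alias is delivered, but the alias is not the canonical address.
alias = "From: someone@example.org\nTo: stace@anl.gov\nSubject: hi\n\nHello"
# Mail to a (hypothetical) mailing list never names the user at all.
to_list = "From: someone@example.org\nTo: example-list@mcs.anl.gov\nSubject: hi\n\nHello"

print(should_autoreply(direct, "stace@mcs.anl.gov"))
print(should_autoreply(alias, "stace@mcs.anl.gov"))
print(should_autoreply(to_list, "stace@mcs.anl.gov"))
```

In this sketch the alias case is the one that bites: the mail arrives, but no reply goes out, which is why the canonical address has to match the address people actually write to.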

So pick one you want to be canonical, and make it so on the accounts page.  (If you’re an oddball like me and want to trigger an autoresponder on multiple addresses, drop me an e-mail and let me know and I’ll handle you separately.)

Next week I’m going to run a script on the Zimbra server to make sure everyone’s autoreply address is set correctly, so out of office messages trigger as expected.

Thanks!

==
Craig Stacey, IT Manager, MCS & CI
stace@mcs.anl.gov
stace@ci.uchicago.edu

 

Written by Craig Stacey

December 17, 2010 at 9:44 pm

Posted in Uncategorized


Power Outage, Dec 3-5.

Just a reminder that most systems are down due to a planned power outage in building 240. Things will return to normal on Sunday.

Written by Craig Stacey

December 5, 2010 at 2:41 am

Posted in Uncategorized


Mail/Zimbra Upgrade and Outage

On Saturday, October 30th, there will be a Zimbra outage in the morning while the service is upgraded to the newest version. This upgrade will fix a number of bugs and introduce a number of new features, and we’re very excited about it.

Originally, this upgrade was slated to happen during the power outage work that weekend. The power outage has been pushed to December, but we’re still moving forward with the upgrade.

Expect the service to go down in the morning and stay down until early evening. No mail will be lost, only delayed. We’ll post updates as we have them.

Written by Craig Stacey

October 26, 2010 at 3:25 am

Posted in Uncategorized


Zimbra Update

Mike Rios has been working tirelessly on this issue from the start, losing sleep and, I suspect, some wits (kidding). Thanks, Mike, for all your work!

He has just sent out a note on the current state of affairs vis-a-vis Zimbra, along with a link to a page that they will be using to post updates:

All,

We are still experiencing an issue with our Zimbra mail system. At present, Zimbra runs normally for a short while, then response time to the users will begin to degrade after about 15-30 minutes until the system is virtually unusable. Mail delivery is still happening; the front-end is the only thing being impacted by this issue.

We have taken to restarting the module that is responsible for the user response and mail interface every 15 minutes, at :00, :15, :30, and :45 on the clock. The restart process takes about 90 seconds, during which time reading and sending mail, along with the Zimbra web interface, will be affected. This strategy has allowed us to “limp” along while we work with Zimbra in finding a solution for the problem we have.

Zimbra engineers are working with us, examining our log files and going through their code. There is at present no estimated time to repair for this issue. Zimbra understands that this is a critical issue for us, and has a number of people working this issue and keeping us informed of their progress. What information we receive will be communicated on this list. In addition, we will be keeping a wiki page up-to-date with information as we have it:

https://wiki.inside.anl.gov/inside/Zimbra/Current_Issues

If there are any questions regarding this outage or any other issues related to this outage, please don’t hesitate to direct them to this list or any of the Argonne people involved.

Thank you for your support and patience!

Mike Rios

Written by Craig Stacey

October 15, 2010 at 4:14 am

Posted in Uncategorized


Zimbra Status Update

From CIS:

As many of you know, the CIS Zimbra service has been “mis-behaving” since roughly 10pm on Tuesday, October 12th.

The symptoms of this behavior: poor responsiveness leading to an unresponsive system – typically within 20-30 minutes of the last restart.

We have a temporary “work-around” in place until Zimbra has a fix for us – we restart the affected process once every 15 minutes. The restart takes around 90 seconds, during which time IMAP and web-based access is unavailable.

After doing our own investigation into the issue, we have been working around the clock with Zimbra support on this issue since around 2AM this morning. Zimbra has several developers working on this, and they are following several leads. It is at their highest level of priority and we have stressed to them how important it is that we get this solved quickly.

As new information becomes available, we will pass it along. Please feel free to contact me directly to address any questions and concerns.

Thank you for your continued patience.

At the moment, we’re still waiting for some fix from Zimbra. We’re giving them until Sunday evening before we start trying more drastic measures. We felt this time frame was acceptable given that mail is generally working, except for a 90 second IMAP outage every 15 minutes on the quarter hour.
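
For a rough sense of what that workaround costs, here is a back-of-the-envelope Python sketch. It assumes each restart begins exactly on the quarter hour and lasts exactly 90 seconds (the announcements only say "about" 90 seconds), and the function names are made up for illustration.

```python
from datetime import datetime

RESTART_INTERVAL_S = 15 * 60  # restarts at :00, :15, :30, and :45
RESTART_DURATION_S = 90       # assumed length of each restart, in seconds


def in_restart_window(moment: datetime) -> bool:
    """Return True if `moment` falls inside an assumed 90-second restart."""
    seconds_into_hour = moment.minute * 60 + moment.second
    return seconds_into_hour % RESTART_INTERVAL_S < RESTART_DURATION_S


def unavailable_fraction() -> float:
    """Fraction of each 15-minute cycle spent restarting."""
    return RESTART_DURATION_S / RESTART_INTERVAL_S


print(in_restart_window(datetime(2010, 10, 15, 2, 15, 30)))  # 30 s after :15
print(in_restart_window(datetime(2010, 10, 15, 2, 20, 0)))   # 5 min after :15
print(f"{unavailable_fraction():.0%} of each cycle")
```

Under those assumptions the front-end is down for 90 of every 900 seconds, or about six minutes per hour.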

Written by Craig Stacey

October 15, 2010 at 2:05 am

Posted in Uncategorized


Zimbra problems

The Zimbra mail and calendar service is having serious service issues. Work has been going on all night, and engineers at Zimbra are helping. At this point we have no ETA on when things will be stable. More details as they emerge.

Written by Craig Stacey

October 13, 2010 at 7:28 pm

Posted in Uncategorized


Zimbra Outage Saturday, July 17

CIS will be performing an update to the Zimbra service on Saturday, July 17, between 9AM and 5PM. During this window, you will not be able to receive new mail, send mail through Zimbra, or check for mail on the server. Calendars will also be unavailable during this window.

Any mail sent during this window will queue up and be delivered once the server is back online. We do not expect any loss of mail or bounced mail.

This upgrade is migrating the server from a 32-bit version to the 64-bit version, which will allow us to enhance performance. Also, the 32-bit versions will be end-of-life soon, and will no longer receive support.

In August, we are planning to upgrade the server from Zimbra 5.0.23 to Zimbra 6.x. We expect this to fix a number of bugs, so that’s something to look forward to.

Sorry for any inconvenience.

Written by Craig Stacey

July 15, 2010 at 2:34 am

Posted in Uncategorized


State of Systems Talk 2010

Here’s the talk I gave in early June on the state of systems. Alas, the video didn’t quite make it: the audio was too quiet and it cut out before the end. But if you want to talk about anything you see in there, please come see me!

Systems Talk 2010

Written by Craig Stacey

June 26, 2010 at 2:57 am

Posted in Uncategorized
