Greetings, folks! First up, the announced upgrade to Zimbra that was supposed to happen on Sunday has been postponed. There were some hangups in other areas that prevented adequate testing beforehand, and we like to be sure before we upgrade. I’ll announce the new upgrade date when it’s picked. However, we are doing some maintenance on the mailing list server this weekend that will temporarily take out all mailing lists and RT trouble tickets. It’ll take about an hour while we sync data between the hosts, so we’re calling the downtime 11AM to noon on Jan 30. During this period, mail sent to mailing lists will queue up and be delivered after the work is complete. The same goes for trouble tickets. In both cases, the web interfaces to the services will also be unavailable. Lastly, the Systems Blog has moved to a new home at http://mcssys.posterous.com (though you can still get there via http://mcs.anl.gov/systems/blog). It’s not that we don’t like the service we provide at http://press.mcs.anl.gov; the move is so that I (and others) can post things to the blog via e-mail (which is nice), with the added bonus of having a site that’s up even if the entire lab is not (due to either network or power catastrophes). In those situations, we can inform you that things are down and why. You know, if you’re not here, sitting in your dark office wondering why your computer screen is so uninteresting. Let me know if any of this poses a problem. Thanks!
Craig Stacey, IT Manager, MCS & CI
We’ve had a failure on the server that handles mailing lists and trouble tickets. We’re bringing RT back online via a backup machine ASAP, and are working on getting the mailing list data in sync so we can bring that back. Mail is not being lost; it’s just queueing up. Will post updates here as they happen. Update: the system is up and mail is flowing again.
Craig Stacey, IT Manager, MCS & CI
Just a reminder that most systems are down due to a planned power outage in building 240. Things will return to normal on Sunday.
On Saturday, October 30th, there will be a Zimbra outage in the morning while the service is upgraded to the newest version. This upgrade will fix a number of bugs and introduce a number of new features, and we’re very excited about it. Originally, this upgrade was slated to happen during the power outage work on that weekend. Even though the power outage has been pushed to December, we’re still moving forward with the upgrade. Expect the service to go down in the morning and stay down until early evening. No mail will be lost, only delayed. We’ll post updates as we have them.
Mike Rios has been working tirelessly on this issue from the start, losing sleep and, I suspect, some wits (kidding). Thanks, Mike, for all your work! He has just sent out a note on the current state of affairs vis-à-vis Zimbra, along with a link to a page that they will be using to post updates:

all.

We are still experiencing an issue with our Zimbra mail system. At present, Zimbra runs normally for a short while, then response time to the users will begin to degrade after about 15-30 minutes until the system is virtually unusable. Mail delivery is still happening; the front-end is the only thing being impacted by this issue.

We have taken to restarting the module that is responsible for the user response and mail interface every 15 minutes, at :00, :15, :30, and :45 on the clock. The restart process takes about 90 seconds, during which time reading and sending mail, along with the Zimbra web interface, will be affected. This strategy has allowed us to “limp” along while we work with Zimbra in finding a solution for the problem we have.

Zimbra engineers are working with us, examining our log files and going through their code. There is at present no estimated time to repair for this issue. Zimbra understands that this is a critical issue for us, and has a number of people working on this issue and keeping us informed of their progress. What information we receive will be communicated on this list. In addition, we will be keeping a wiki page up-to-date with information as we have it:

https://wiki.inside.anl.gov/inside/Zimbra/Current_Issues

If there are any questions regarding this outage or any other issues related to this outage, please don’t hesitate to direct them to this list or any of the Argonne people involved.

Thank you for your support and patience!

mike rios.
As many of you know, the CIS Zimbra service has been “mis-behaving” since roughly 10pm on Tuesday, October 12th. The symptoms of this behavior: poor responsiveness leading to an unresponsive system, typically within 20-30 minutes of the last restart. We have a temporary “work-around” in place until Zimbra has a fix for us: we restart the affected process once every 15 minutes. The restart takes around 90 seconds, during which time IMAP and web-based access is unavailable. After doing our own investigation into the issue, we have been working around the clock with Zimbra support since around 2AM this morning. Zimbra has several developers working on this, and they are following several leads. It is at their highest level of priority and we have stressed to them how important it is that we get this solved quickly. As new information becomes available, we will pass it along. Please feel free to contact me directly to address any questions and concerns. Thank you for your continued patience.
At the moment, we’re still waiting for a fix from Zimbra. We’re giving them until Sunday evening before we start trying more drastic measures. We felt this time frame was acceptable given that mail is generally working, except for a 90-second IMAP outage every 15 minutes on the quarter hour.
The Zimbra mail and calendar service is having serious service issues. Work has been going on all night, and engineers at Zimbra are helping. At this point we have no ETA on when things will be stable. More details as they emerge.
CIS will be performing an update to the Zimbra service on Saturday, July 17, between 9AM and 5PM. During this window, you will not be able to receive new mail, send mail through Zimbra, or check for mail on the server. Calendars will also be unavailable during this window. Any mail sent during this window will queue up and be delivered once the server is back online. We do not expect any loss of mail or bounced mail. This upgrade is migrating the server from a 32-bit version to the 64-bit version, which will allow us to enhance performance. Also, the 32-bit versions will be end-of-life soon, and will no longer receive support. In August, we are planning to upgrade the server from Zimbra 5.0.23 to Zimbra 6.x. We expect this to fix a number of bugs, so that’s something to look forward to. Sorry for any inconvenience.