Dispatches From The Geeks

News and Announcements from the MCS Systems Group

Archive for the ‘Uncategorized’ Category

Gitlab Maintenance Complete: Proposed CIS Maintenance weekends for FY17.

First up, the Gitlab maintenance announced earlier today is complete. The UI is a bit different: it seems to be a more unified interface across desktop and mobile, so if you don’t see what you expect, open the “hamburger menu” in the upper left corner.

Secondly, CIS has proposed the following potential maintenance weekends:

November 4-6, 2016

May 5-7, 2017 (APS Maintenance)

August 25-27, 2017 (APS Maintenance)

As noted, the May and August weekends are designed to coincide with APS Maintenance. Nothing is settled yet about what will or won’t be available on these weekends, but here’s what you can likely expect:

* We (CELS) will likely time any upgrades we have to coincide with these.

* ANL business systems will likely be affected.

* ANL network access could be affected. That doesn’t necessarily affect MCS/LCF, but it might.

Bearing that in mind, please send me (stace@anl.gov) any objections to this maintenance schedule so I can forward them up the chain.

Thanks!


Craig

Written by Craig Stacey

September 29, 2016 at 5:16 pm

Posted in Uncategorized

Postponed: File Server Maintenance

While prepping for the previously announced file server upgrade (we need to do it right, and smoothly), we discovered a couple of gotchas. When we scheduled the work, I set this afternoon as the go/no-go decision point: if we weren’t ready today, we couldn’t reliably say we’d be ready on Saturday. As such, we’re pushing this work back a week. The tentative date is next Saturday, October 8, with the same details as the previous announcement. I’ll send an update next week to confirm. If that date causes issues for you, let me know and we’ll find a date that works for everyone.

Thanks for your patience!

Written by Craig Stacey

September 29, 2016 at 2:35 pm

Posted in Uncategorized

Brief outage for gitlab and xgitlab today at 4PM

We need to apply some updates and reboot the servers xgitlab.cels.anl.gov and gitlab.cels.anl.gov. This will happen at 4PM today. The outage shouldn’t take too long, but we’re scheduling an hour just in case. During this time, you won’t be able to visit or use repositories hosted there.

Let me know if this poses an undue inconvenience. Thanks!


Craig

Written by Craig Stacey

September 29, 2016 at 10:08 am

Posted in Uncategorized

File Server Maintenance, Saturday 10/1.

We’ve scheduled this Saturday, 10/1, from 9AM onward (all day) to finally migrate the rest of our Unix home directories off the aging file server and onto our new one. To do this, we need to take the home filesystem offline. This will not affect e-mail, web, or other services beyond the MCS Linux computing infrastructure (https://wiki.mcs.anl.gov/IT/index.php/Linux). During this period, you will not be able to log in to any MCS workstations or compute servers, including login.mcs.anl.gov. After the migration is complete, all Linux workstations and compute servers that use this home directory server will be rebooted.

If this timing is a problem for you and would require us to reschedule, please let us know now. I will make another announcement on Thursday unless we reschedule due to your requests or issues we encounter while prepping for this.

Thanks!

Written by Craig Stacey

September 26, 2016 at 11:36 am

Posted in Uncategorized

Emergency file server reboot

We’re still seeing issues with our primary file server, and service has degraded to the point where logins aren’t being handled. We’re going to reboot it shortly. Some machines may require a reboot afterward. Stand by for updates.

Written by Craig Stacey

September 22, 2016 at 5:00 pm

Posted in Uncategorized

File server back in operation

The previously announced outage should be resolved.

Written by Craig Stacey

September 20, 2016 at 10:19 am

Posted in Uncategorized

Unexpected file server outage

Planned power work in the data center took out our file server (which should have stayed up due to redundancy, but we think we may have a bad power supply). The team’s working to bring it back as we speak. Thanks for your patience.

Written by Craig Stacey

September 20, 2016 at 10:03 am

Posted in Uncategorized

Slight change in upgrade schedule

vanquish has a long-running job on it, so we will postpone its rebuild until we can work with the owner of the job.

crank has suffered a disk error, so its rebuild is being accelerated to this week (today).

Sorry for any inconvenience, and thanks for your patience.

Written by Craig Stacey

September 8, 2016 at 9:07 am

Posted in Uncategorized

thwomp is upgraded to 14.04. More machines tomorrow (vanquish is delayed).

Written by Craig Stacey

September 7, 2016 at 2:36 pm

Posted in Uncategorized

Compute server upgrades continue

We’re pushing through on updating the remaining 64-bit compute nodes to Ubuntu 14.04 Trusty. Here’s the schedule:

This week (through Sep 9)

thwomp.mcs.anl.gov

vanquish.mcs.anl.gov

Next week (Sep 12-16)

trounce.mcs.anl.gov

churn.mcs.anl.gov

Week 3 (Sep 19-23)

crush.mcs.anl.gov

crank.mcs.anl.gov

grind.mcs.anl.gov

Week 4 (Sep 26-30)

compute001.mcs.anl.gov

steamroller.mcs.anl.gov

Week 5 (Oct 3-7)

stomp.mcs.anl.gov

During each rebuild, the machine will be unavailable for some portion of that day. We’ll announce the shutdown on the machine itself to all logged-in users 30 minutes prior to shutdown. After the machine is rebuilt, you’ll need to recreate any crontabs you had in place. Also note /sandbox is not backed up and data will be lost – never keep data in /sandbox that can’t be easily reproduced.
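Since crontabs need to be recreated after a rebuild, one way to make that painless is to save a copy beforehand. The following is a minimal sketch, not an official CELS tool: the function name `save_crontab` and the file name `crontab.backup` are made up for illustration, and it assumes the copy is written somewhere that survives the rebuild (for example your home directory, not /sandbox).

```python
import subprocess
from pathlib import Path


def save_crontab(dest, list_cmd=("crontab", "-l")):
    """Save the output of `crontab -l` to dest.

    Returns True if a crontab was saved, and False if the command is
    missing or reports that this user has no crontab.
    `list_cmd` is a hypothetical hook so the listing command can be
    swapped out (for example, when testing).
    """
    try:
        result = subprocess.run(list(list_cmd), capture_output=True, text=True)
    except FileNotFoundError:
        # The listing command is not installed on this machine.
        return False
    if result.returncode != 0:
        # `crontab -l` typically exits non-zero when there is no crontab.
        return False
    Path(dest).write_text(result.stdout)
    return True
```

Before the rebuild, calling `save_crontab(Path.home() / "crontab.backup")` on the machine would write the copy. Afterward, running `crontab ~/crontab.backup` on the rebuilt machine should reinstall it, though it is worth checking the result with `crontab -l`.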

If you notice software packages missing or other oddities, please report them to help@cels.anl.gov.

We’ll start this week’s batch of machines tomorrow (Wednesday, September 7).

Let us know if this presents any problems.

Written by Craig Stacey

September 6, 2016 at 9:58 am

Posted in Uncategorized