While preparing for the previously announced file server upgrade, which we need to do right and smoothly, we discovered a couple of gotchas. When we scheduled it, I set this afternoon as a go/no-go decision point. Since we weren’t ready today, there’s no way to reliably say we’d be ready to go on Saturday, so we’re pushing this work back a week. The tentative schedule is next Saturday, October 8, with the same details as the previous announcement. I’ll send an update next week to confirm. If that date causes issues for you, let me know and we’ll find a date that works for everyone.
Thanks for your patience!
We need to apply some updates and reboot the servers xgitlab.cels.anl.gov and gitlab.cels.anl.gov. This will happen at 4 PM today. The outage shouldn’t take long, but we’re scheduling an hour just in case. During this time, you won’t be able to visit or use repositories hosted there.
Let me know if this poses an undue inconvenience. Thanks!
We’ve scheduled this Saturday, 10/1, from 9 AM onward (all day) to finally migrate the rest of our Unix home directories off the aging file server onto our new one. To do this, we need to take the home filesystem offline. This will not affect e-mail, web, or other services beyond the MCS Linux computing infrastructure (https://wiki.mcs.anl.gov/IT/index.php/Linux). During this period, you will not be able to log in to any MCS workstations or compute servers, including login.mcs.anl.gov. After the migration is complete, all Linux workstations and compute servers that use this home directory server will be rebooted.
If this would inconvenience you enough that we should reschedule, please let us know now. I will make another announcement on Thursday unless we reschedule due to your requests or issues we encounter in the prep for this.
We’re still seeing issues with our primary file server, and service is now severely degraded, to the point where logins aren’t being handled. We’re going to reboot it shortly. Some machines may require a reboot after this is done. Stand by for updates.
The previously announced outage should be resolved.
Planned power work in the data center took out our file server. It should have stayed up due to redundancy, but we think we may have a bad power supply. The team’s working to bring it back as we speak. Thanks for your patience.
vanquish has a long-running job on it, so we will postpone its rebuild until we can work with the owner of the job.
crank has suffered a disk error, so its rebuild is being accelerated to this week (today).
Sorry for any inconvenience, and thanks for your patience.
We’re pushing through on updating the remaining 64-bit compute nodes to Ubuntu 14.04 Trusty. Here’s the schedule:
This week (through Sep 9)
Next week (Sep 12-16)
Week 3 (Sep 19-23)
Week 4 (Sep 26-30)
Week 5 (Oct 3-7)
During each rebuild, the machine will be unavailable for some portion of that day. We’ll announce the shutdown on the machine itself to all logged-in users 30 minutes prior to shutdown. After the machine is rebuilt, you’ll need to recreate any crontabs you had in place. Also note that /sandbox is not backed up, so any data there will be lost. Never keep data in /sandbox that can’t be easily reproduced.
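If you'd rather not retype your crontab after a rebuild, one option is to save a copy beforehand and reinstall it afterward. The following is only a minimal sketch, not an official procedure: the function names and the backup file name are hypothetical, and it assumes the standard `crontab -l` and `crontab FILE` commands are available and that the backup is stored somewhere that survives the rebuild (for example your home directory, not /sandbox).

```python
import subprocess
from pathlib import Path


def save_crontab(dest: Path, list_cmd=("crontab", "-l")) -> bool:
    """Write the current user's crontab to dest.

    Returns False (and writes nothing) if the listing command fails,
    which is what `crontab -l` does when the user has no crontab.
    """
    result = subprocess.run(list_cmd, capture_output=True, text=True)
    if result.returncode != 0:
        return False
    dest.write_text(result.stdout)
    return True


def restore_crontab(src: Path, install_cmd=("crontab",)) -> None:
    """Install the crontab stored in src for the current user.

    Note that `crontab FILE` replaces any crontab already in place.
    """
    subprocess.run([*install_cmd, str(src)], check=True)


# Hypothetical usage, run on the machine in question:
#   before the rebuild:  save_crontab(Path.home() / "crontab.backup")
#   after the rebuild:   restore_crontab(Path.home() / "crontab.backup")
```

Crontabs are per machine, so if you have jobs on several nodes, a separate backup file for each host avoids overwriting one with another.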
If you notice software packages missing or other oddities, please report them to firstname.lastname@example.org.
We’ll start this week’s batch of machines tomorrow (Wednesday, September 7).
Let us know if this presents any problems.