Dispatches From The Geeks

News and Announcements from the MCS Systems Group

An opportunity for Linux users to talk to Box about Box on Linux.

Box is coming on site (date TBD) and is looking to have an open and frank discussion about document sharing and collaboration with Linux users. They want to know how they can improve the product for Linux users. They’re looking for 7-10 users from across the lab. If you have an opinion on this and are willing to give a couple of hours of your time, please let me know. We need to have the list finalized by next week. I was going to send this to linux-users@mcs.anl.gov, but I wanted to cast a wide net, so I’m using the general announcement list.

If you have specific things you want to make sure are addressed, let me know that as well. Ideally, I’d want you in the room to be your own advocate, but if you can’t be (or aren’t on the final list), I want to be sure the big opportunities for improvement are expressed.

Thanks!

Written by Craig Stacey

September 21, 2015 at 4:20 pm

Posted in Uncategorized

Previously down compute nodes back online as of 11:00 AM.

Written by Craig Stacey

September 15, 2015 at 11:05 am

Posted in Uncategorized

Some compute nodes offline, will return this afternoon.

Power work in the data center has taken a handful of compute nodes offline for a few hours. They should be back online early this afternoon. The affected machines are:

  • octagon.mcs.anl.gov
  • cookie.mcs.anl.gov
  • petsc.mcs.anl.gov
  • cg.mcs.anl.gov
  • gnep.mcs.anl.gov
  • octopus.mcs.anl.gov

Sorry for the inconvenience. We didn’t believe these machines would be affected by the work, but we were incorrect.

For a list of alternative machines, see https://wiki.mcs.anl.gov/IT/index.php/General_MCS_Questions#computeservers.

Written by Craig Stacey

September 15, 2015 at 10:06 am

Posted in Uncategorized

Disk migration complete, rdp.mcs.anl.gov available to users

The disk migration is finally finished. User home directories are now on their own partition, and the full disk problem has been rectified. There’s currently over 50GB available in user home directories on RDP for any files and programs that need local storage.

Thanks for your patience!

Written by Craig Stacey

September 13, 2015 at 4:05 pm

Posted in Uncategorized

RDP work has commenced, offline until 5PM

rdp.mcs.anl.gov is offline until further notice. The outage window is through 5PM, but I don’t expect it to take that long. I’ll post here when the work is done.

Written by Craig Stacey

September 13, 2015 at 12:59 pm

Posted in Uncategorized

rdp.mcs.anl.gov available again, second outage on Sunday 9/13/15 at Noon.

Unfortunately, the home directory migration was not yet successful, so we’re in the same boat we were in before the outage with space being very tight.

I’m going to take another crack at it on Sunday, which means from around noon to 5PM you can expect the machine to be unavailable. If anything changes, I’ll send a note to the blog and Twitter feeds linked below. Thanks.

Written by Craig Stacey

September 11, 2015 at 3:02 pm

Posted in Uncategorized

rdp.mcs.anl.gov downtime on 9/11/2015

Those of you who use rdp.mcs.anl.gov (Remote Desktop server for Windows) may have noticed the disk is quite full. I need to migrate users to a new partition to free up space. This, however, requires the machine to be offline during the migration. At the moment, the plan is to take the machine offline tomorrow at noon. I’m estimating a three-hour outage, though it may be less than that. In any case, I’ll post an announcement at the start and end of work on the blog and Twitter feed linked below.

Thanks, and sorry for any inconvenience this causes. I’d like to do this on a weekend, but schedules don’t align to have it happen this coming weekend and I don’t want it to wait another week as the disk is quite full.

Written by Craig Stacey

September 10, 2015 at 2:52 pm

Posted in Uncategorized

Outage issues resolved

Quick summary: I just got back from the 221 data center (gee, it’s hot outside) having replaced what we suspect are bad power supplies in a virtual machine host server. We isolated the issue to this specific server, which was rebooting without logging any useful information as to why. A bad set of configs then prevented the virtual machines hosted on it from restarting without human intervention.

We’ve addressed the config issue, and replaced the power supplies as there were indications that one or possibly both were bad.

We’re ready to migrate these affected virtual machines to a new host if this last fix doesn’t stabilize things, but we’re feeling pretty good about this at the moment.

Thanks so much for your patience, and sorry for any troubles.

Written by Craig Stacey

September 2, 2015 at 3:41 pm

Posted in Uncategorized

Another outage occurring

An outage similar to this morning’s is occurring (though limited in scope at the moment, since we know what *won’t* work to bring things back). Stand by…

Written by Craig Stacey

September 2, 2015 at 2:37 pm

Posted in Uncategorized

Systems Announce RESOLVED: Outage affecting MCS/CELS systems.

Addendum: One of the web servers (personal pages and project pages under http://www.mcs.anl.gov) is still booting up. It has been a while since it was last rebooted, and it’s doing a filesystem check. It should be back within the next hour.

Written by Craig Stacey

September 2, 2015 at 9:11 am

Posted in Uncategorized
