At 3:33 PM today the fileserver that serves unix home directories unexpectedly went offline for about one hour.
The system appears stable and all services should be available now. If your workstation is misbehaving, you should try rebooting it. If it’s still having issues, or if you notice anything else out of order, please alert the CELS/MCS help desk.
No data should have been lost.
I regret and apologize for the inconvenience.
I’m not entirely sure of all the gritty details of what went wrong at this time, but here are a few more details for the curious:
We are in the process of moving the file services to a newer system. This afternoon, as part of preparing for this process, I attempted to create a new RAID array on 4 new hard drives attached to the system. Unfortunately one of the drives was faulty, and something about the creation process caused the system to become unresponsive. This server is pretty long in the tooth and hadn’t been rebooted in a long time. The boot process was not stable, and we needed to reconfigure some settings in the BIOS in order for the system to recognize the correct boot environment.
Again, I’m sorry for the interruption in service.
Last night, shortly before 6:00 pm CT one of the network routers in building 240 lost connectivity to another in 221. This caused a breakdown in large portions of the lab’s network fabric, both internally and externally. CIS tracked down and fixed the problem shortly after 8:00 pm CT. The result was that many, if not all, of our services lost connectivity during this time. Most services resumed normally after the issue was fixed, but a handful were left in unknown states until this morning. We believe we have now restored all services, but if you find something that isn’t responding or is responding erratically, please let us know and we’ll get it fixed.
CIS is investigating how such a disruption occurred since the network is designed to mitigate these types of disruptions.
Many of you are learning of an exploit to the bash shell that was revealed last week, so I thought it would be worthwhile to post a summary of what’s been happening and what you need to do.
First up, the exploit in question allows an attacker to take advantage of some poor coding in the Bourne Again Shell (bash) to launch processes on any servers or services that are exposed to the internet, such as web servers or poorly configured workstations.
We’ve been patching servers we manage since the announcement, and are confident we’re safe from attackers on the servers that we’ve got externally exposed.
Generally, if you’ve got a machine you’re managing you shouldn’t have a big worry unless you’re running a web server on it and allow that web server to run scripts that call a bash shell.
In any case, patching your machine is important. Linux distributions have had patches in the pipeline almost immediately, so if you’re running a current build of linux you should be able to update via your regular package manager (yum, apt, etc.). If you are running an unsupported distribution, you’ll need to download and compile a new bash to be safe. Contact email@example.com if you require assistance with that.
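If you want a quick sanity check on a machine you manage, here is a minimal sketch in Python that wraps the widely circulated one-line test for the first reported bug (CVE-2014-6271). It sets an environment variable containing a function definition with a trailing command, starts bash, and looks for the marker that only an unpatched bash would print. The names `check_bash` and `looks_vulnerable` are just illustrative, and a clean result here only covers that first bug, not the follow-on issues that were reported afterwards, so treat it as a rough check rather than proof that you are safe.

```python
import os
import subprocess

# Marker that is printed only if bash runs the command smuggled in after
# the function definition stored in an environment variable.
MARKER = "vulnerable"
PROBE = "() { :;}; echo " + MARKER


def looks_vulnerable(stdout):
    """Return True if the marker appears as its own line in bash's output."""
    return MARKER in stdout.splitlines()


def check_bash(bash_path="bash"):
    """Probe one bash binary.

    Returns True (vulnerable), False (marker not printed), or None if the
    binary could not be run at all.
    """
    env = dict(os.environ, x=PROBE)  # copy of our environment plus the probe
    try:
        result = subprocess.run(
            [bash_path, "-c", "echo test"],
            env=env,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,  # patched versions may print warnings
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return looks_vulnerable(result.stdout)


if __name__ == "__main__":
    status = check_bash()
    if status is None:
        print("Could not run bash on this machine.")
    elif status:
        print("This bash looks vulnerable. Please patch it.")
    else:
        print("This bash did not respond to the original test.")
```

If you have more than one bash installed (for example one you compiled yourself), pass the full path to `check_bash` to test each copy.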
Apple released some patches for supported OS versions to address some of the vulnerabilities, but there are still some that need addressing, so we expect to see more updates. The updates are not yet in the OS X automatic update package stream, but for those of you who manage your own machines, you can find the updates below. Also, check http://support.apple.com/downloads/#macos for future updates in the next few days.
If we manage your MacOS machine, we’ll take care of these security updates for you.
Specific updates can be found here:
* Mavericks 10.9: http://support.apple.com/downloads/DL1769/en_US/BashUpdateMavericks.dmg
* Mountain Lion 10.8: http://support.apple.com/downloads/DL1768/en_US/BashUpdateMountainLion.dmg
* Lion 10.7: http://support.apple.com/downloads/DL1767/en_US/BashUpdateLion.dmg
Anything older and you’re running an unsupported and unpatched OS. It should be upgraded.
Microsoft Windows users are only affected if they are running Microsoft Unix services or Cygwin. In either case, follow the update procedures for your installation.
CIS is proposing the following dates for maintenance weekends. During these weekends, various lab services can be unavailable, including networking and business systems. Occasionally, the CELS Systems Team will schedule our maintenance activities to coincide with these dates, but that is flexible and should not necessarily affect your decision on whether these dates pose an insurmountable problem.
Please look over the proposed dates and let me know if there are reasons the lab should not schedule maintenance activities during those times. If so, please suggest an alternate timeframe for any to which you object.
- November 7-9, 2014
- January 16-18, 2015 (APS Maintenance period)
- May 15-17, 2015 Network Maintenance (APS Maintenance period)
- August 14-16, 2015
For those who missed the announcement in Argonne Today last week, the lab has rolled out a new platform for accessing internal applications called “Dash”. The goal for this is to provide a one-stop solution for internal lab applications (including Dayforce) without requiring the VPN, and better protect the lab from possibly infected mobile and user-owned devices. Included below are links to the announcement and information page. Those who appreciate irony will note that you must be on an internal network or VPN to read these links.
- http://dash.anl.gov (no VPN required)
I’ll also note that I have not yet been able to use Dash to access Dayforce from my Android devices, despite the claims of support. I’ll send an update if I ever figure it out. I have not yet tested iOS. MacOS worked just fine.
If you currently access Dayforce directly via the unsupported-provided-as-a-convenience link for those with Silverlight installed, you can continue to do so with no changes. Likewise, if you connect to rdp.mcs.anl.gov (Windows Terminal Service) to access Dayforce, that will continue to work as well.
Hey, folks. I just wanted to drop a quick note to everyone introducing the newest member to our team, Chris Bills. Chris comes to us most recently from a stint at IBM, and has lots of experience as a quality linux systems administrator. We’re really excited to have him join us! Please feel free to introduce yourself next time you’re in the area – he’s sitting in 2139, sharing an office with Max.
CIS has a maintenance weekend scheduled for August 15-17. The details follow. Please let us know ASAP if you have concerns about this maintenance window. Please note the outage to the Argonne website (http://www.anl.gov) will also affect the MCS website at http://www.mcs.anl.gov, as it’s hosted on the same service. Otherwise, services provided by the CELS systems team will be unaffected.
WHAT ARE WE DOING?
Argonne’s quarterly IT maintenance weekend is scheduled for Friday, Aug 15, thru Sunday, Aug 17. Expect that any laboratory network and core IT services may be affected during the weekend.
· Wireless networking will experience outages throughout the weekend.
· Voice mail will be unavailable from 6 to 8 p.m., Friday, August 15th. During this time, voice mail messages will not be received nor will they be retrievable.
· Authentication to Cloud applications such as BOX, Dayforce and Service Now will be unavailable from 1 p.m. to 2 p.m. on Saturday, August 16th.
· All business applications will be unavailable all day Saturday and into Sunday morning.
· MIR3, the lab’s mass-communication system, will be unavailable from 7:00 p.m. on Saturday until 7:00 a.m. on Sunday while the vendor performs its maintenance.
· CIS will perform verification of IT services on Sunday, August 17th to ensure all services are functioning for business hours on Monday, August 18th.
As with all IT maintenance weekends, there is no explicit guarantee that a service will be available at any given time.
WHEN WILL THIS OCCUR?
Aug 15th, 2014, 5:00 p.m. thru Aug 17th, 2014. We expect the maintenance to be complete Sunday morning, followed by a verification process.
Please note the setup for the display in 3178 has changed. There were some failures in the components that were running it previously, and replacing those components didn’t seem to help. There are new and concise directions printed and posted next to the display. They are also included below.
To use the AppleTV, the TV needs to be on the HDMI1 input. You can change the input on the side of the TV by pressing the “Input” button until the desired one is highlighted. There are on-screen directions on how to connect to the unit. (At the moment, we are waiting for CIS to put the wifi reservation into place. That should be in within a day, but the AppleTV will be off-network until it’s done.)
To use the wired HDMI connection on the table, the TV needs to be on the HDMI2 input. Plug the HDMI cable into your laptop or dongle. The TV will output 1080p nice and crisp, so if the display doesn’t look right, check your laptop’s display settings.
If you need DVI, attach an HDMI to DVI adapter to the cable. If you don’t have one, you can ask for one at the help desk. I’ve left them in the room in the past, but they must be getting lost.
To use the wired VGA connection, follow the directions above for HDMI, but instead of plugging the HDMI cable into your laptop, plug it into the HDMI/VGA converter on the table, then plug the VGA into your laptop. This picture quality is obviously lower than HDMI and isn’t recommended – it’s supplied by request for older machines that need it. If you can output HDMI, you’ll have a much better experience if you use that instead.
The automatic source switch that switched between HDMI and VGA is what seems to have been the cause of the problem. Removing this seems to make things work just fine again.
Please see Matt’s note below on a security vulnerability that affects Android users. Here’s the short version – there’s an issue that allows applications to masquerade as trusted apps. Google has identified the issue and is pushing out a fix. The fix is not there yet, and when it gets there will be dependent on your phone’s manufacturer and carrier. For instance, Nexus phones will probably get the update quickest. As soon as you get notified of the security update, please apply it. This only affects Android users, not iPhone users.
You can download and run this tool to assess if you’ve been affected. If you have, feel free to come to us for help in dealing with it. Right now it’s expected you will show up as “Vulnerable” to the “FakeID” bug. The others should show Patched.
On the heels of the new IT Access Agreement to include BYOD, a security company has discovered a flaw in the Android OS. There is a patch available that has begun to get distributed to the open source code base as well as to manufacturers. It is in the new IT access agreement to keep personal systems patched and up to date.
You can read about the vulnerability here:
You can read about the updated Argonne IT Access Agreement here: