Here’s the talk I gave in early June on the state of systems. Alas, the video didn’t quite make it: the audio was too quiet and it cut out before the end. But if you want to talk about anything you see in there, please come see me!
Systems Talk 2010
Let me tell a couple of true stories of how social networking can be used to cause you and your friends harm.
The first happened to a friend of mine. He’s sitting at his computer, browsing Facebook, and he gets a chat request from a friend. This friend claimed to be in London, and needed money desperately as he had been robbed. My friend was smart enough to recognize this might not be legit, and started asking questions. Of course, because this guy had access to all the data in Facebook, he could be fairly convincing in his answers (minus, of course, the lag time in looking up the information). As you may guess, my friend did not wire any money. Turns out this is a scam that’s getting more and more common.
The second story happened to an acquaintance of this same friend. However, in her case, it was her account that was compromised: a Yahoo mail account, which was used to send mail to her friends asking for money. We don’t quite know how successful this one was. We do know the malicious user deleted all her e-mails from her account.
Neither of these incidents happened at Argonne, just so you know.
I tell you these stories to remind you that you need to be on your toes. In this day and age of social networking and information sharing, we’re putting a lot of information out there that can be used against us in many ways. I was startled when I visited pipl.com and searched for myself: all this information is out there, scraped off of webpages, social networking sites, Usenet… you name it. Someone armed with that information might be able to pull off a convincing job of pretending to be me. Convincing enough to scam someone else out of money or information they shouldn’t have.
So be careful what you put out there. Keep your passwords strong, lengthy, diverse, and private. Don’t reuse them.
Here’s a couple of links that were passed on to me today from ANL’s Cyber Security Program Office.
The first is available on-site only, and is written by Mike Skwarek, the Cyber Security Program Manager and Deputy CIO for the lab. I recommend reading them, as there’s good advice in there.
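If you want a hand with the “strong, lengthy, diverse” part of the password advice above, here’s a minimal sketch in Python. It is only an illustration, not anything the lab provides: the function names `generate_password` and `generate_unique_passwords`, the 20-character default, and the 12-character minimum are all my own assumptions. It uses the standard library’s `secrets` module, which is intended for security-sensitive randomness.

```python
import secrets
import string

# Hypothetical minimum length for this sketch; not an official policy.
MIN_LENGTH = 12


def generate_password(length: int = 20) -> str:
    """Return a random password built from letters, digits, and punctuation."""
    if length < MIN_LENGTH:
        raise ValueError(f"length should be at least {MIN_LENGTH} characters")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    # secrets.choice draws from the operating system's secure random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))


def generate_unique_passwords(sites, length: int = 20) -> dict:
    """Return a separate random password for each site name, so none is reused."""
    return {site: generate_password(length) for site in sites}


if __name__ == "__main__":
    for site, password in generate_unique_passwords(["mail", "bank"]).items():
        print(site, password)
```

Something like this pairs best with a password manager, since nobody is going to memorize a different 20-character random string for every site.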
Hey, folks.
Just a friendly reminder that we’ll be upgrading the Zimbra mail and calendar server on Saturday. Your inbox, calendar, and webmail will not be accessible between 9AM and 2PM. No mail will be lost, only delayed during the outage window.
This upgrade will address some locked mailbox issues we’ve seen sporadically, as well as allow Android Exchange e-mail syncing to work correctly.
Thanks!
Hey, gang.
Communication is an important thing. Without it, we’d all just be a bunch of meat sticks making random noises and gestures at each other. A part of communication that’s important to any service organization is feedback. Sometimes that feedback is immediate and candid, sometimes it’s given after the fact, and sometimes it’s solicited. While the focus of this post is on soliciting feedback, I really do hope everyone knows you do not need to wait for some survey or visit to offer input and ideas. We’re your Systems Group. When my door is open, which is generally the case, I’m an available set of ears. If you’ve got ideas for improvements, if you’ve got praise for someone, if you’ve got a complaint about someone, or you just want to talk about what we’re doing, I’m always willing and happy to talk. So, please, consider the lines of communication always open. I was going to make some joke about TCP ports and stateful connections, but that’s a sad kind of geeky.
Anyway, it’s a new calendar year, we’re in our new building, and John Tesh hasn’t turned into a giant lizard and started terrorizing Tokyo, so it seems a good time to solicit some feedback. Over the coming months, I’m going to be visiting with many of you looking for your opinions and ideas, but I’d like to get a general sense of things, too. Many years back, we did a user satisfaction survey with the division. It proved to be useful information to have, so I’d like to do it again.
At your leisure, please visit the survey (hosted at surveymonkey.com): http://www.surveymonkey.com/s/6CWNRWJ. It’s just 10 questions, and the survey is open until COB on Friday, January 29. Your answers are as anonymous as you want them to be: no IP addresses are collected, and no identifying information is asked. Unfiltered feedback is what I want.
It’s been a while since we’ve done one of these things, and it’s something we should do more often. At least annually, I think.
In any case, I’ll talk about the results in February.
Thanks!
Wow, what a day. Most of us got here over 15 hours ago, and some of us are still here. I’m a little punch-drunk, so I’ll give a quick summary and perhaps a better post mortem later in the week.
- All equipment was moved without incident. Boyer-Rosene (the folks who moved all of us into our offices) were simply fantastic, very professional, and a joy to deal with!
- The bulk of the day was spent putting the machines back together, plugging in all the whosits and whatsits and things that go *ping*
- We had a bit of a network scare around 7, just when we thought we were back. Corby and Linda trudged over to 221 and got things working again after some fighting with some very old networking hardware.
- We *believe* no mail was bounced, and things seem to be trickling through now and will continue to do so overnight
- We also believe just about all critical infrastructure is back up. We’ll catch what we missed either tomorrow or Monday
- We finally retired the following machines and services:
  - Our old Windows 2000 domain
  - Our old mail servers, some of which date back to the 90’s
  - Our old tape library, which has been handling our tape backups since… well, since before I was employed here.
- Your “windows password” is no more. You now have a single MCS password, used for logging into everything we run except Zimbra. And, soon, Zimbra will also use that password
The Core really looks like a datacenter now. It’s filled with racks of machines, all chugging along. We have some work to do, still, and some cleanup in there (along with a hefty amount of cleanup in 221). Once it’s all done, we’ll have a little “open house” over lunch some day when you can stop by and we can show you all the cool things about the room and what’s running in there, plus what will be running in there in the years to come.
A big, heartfelt “thank you” to everyone who came in to help today, and throughout the weeks since we started this move: Corby Schmitz, Linda Winkler, Max Trefonides, Hunter Matthews, John Valdes, Jason Hedden, John Roberts, Rick Bradshaw, Ti Leggett, Ken Raffenetti, Dave Goodell, Darius Buntinas, Rob Latham, Jared Wilkenning, Narayan Desai, Pavan Balaji, Rinku Gupta, David Ressman. If I’ve forgotten someone, please let me know! This whole move was a huge thing, and praise is deserved.
I don’t want to minimize anyone’s effort in this, but I feel I must call out two people who have put in a really extraordinary effort in making this all happen. First is Hunter, who spent almost every day since we moved to 240 back over in the old machine room getting gear ready to move, all while learning about and rebuilding a new piece of the infrastructure that recently fell under his responsibility. I also want to call out Rick, who really took the lead on the huge organizational burden of this move and did a fantastic job, all while fighting the NFS server issues that plagued us (and coming up with a top-notch and zippy service improvement).
Lastly, some entertainment for you. I’ve posted a couple of videos to Facebook, but for those of you who aren’t on there, check out the following videos. This is what happens during a long day in the datacenter:
It’s here! It’s underway! Things are moving!
Move II: October 16-18
- Full Disclosure (already down)
- Breadboard (already down)
- kbT compute cluster (already down)
- I2U2 resources (already down)
- LCRC DDN storage system (already down)
- MCS Core Computing infrastructure (going down at around 4-4:30 PM today)
First thing Saturday morning, the movers will move the remaining equipment into the Core, at which point we’ll start hooking things back up and have things operational as soon as possible. You should generally not expect the resources to be available until the Monday following the move (though we’ll obviously strive for getting things up as quickly as possible).
MCS Computing resources:
- October 16
  - 4:00-4:30 PM: general core computing resources go down. At this point, all login nodes, compute machines, file servers, etc., will be offline as we pull cables and prepare the machines for the move.
  - Note: This will affect mail service. Reading and sending mail will be unaffected, but receiving new mail will be delayed. We’re working to keep the downtime as short as possible.
- October 17
  - Very early in the morning, movers start moving gear to the Core.
  - If all goes well with no snags, we should be operational before 5:00 PM. However, the outage window still runs until Monday in case things go other than smoothly.
There you have it! We’re shooting to keep downtimes to a minimum and make this weekend go as smoothly for you (and *us*) as possible. Sorry for any inconvenience this may cause you.
See you at the TCS reception at 3!
An over-aggressive DOE scanner spam-bombed us this afternoon, bringing our mail service to a crawl as three mail servers tried to deal with over 40,000 e-mail messages sent within a matter of minutes.
We believe we got it all stopped, and all the backlog of mail should be delivered at this point.
Sorry for the inconvenience.
Pics taken by Max.
All in all, it took a little less than 11 hours. Everything went generally smoothly. It bodes very well for the move in two weeks!
The first phase of the move of equipment from our old datacenter (the BMR) to the new datacenter (the Core) is complete, from a physical perspective. All the racks made it over with no issues.
On Friday, our smallest cluster was up and running. Today, as the day wore on, we got more equipment up. As a sort of “dry run” for the big move in two weeks, this bodes very well. We’ll keep plugging away at getting the random pieces up and running, but there are no major stumbling blocks.
Hooray!
We’ll post a wrap-up on or before Monday.
Today, the first computer came up in our new datacenter, The Core. The cosmea cluster came up, largely without incident, and backed itself up to the tape library (which is, incidentally, still sitting in 221).
Tomorrow’s gonna be a fun day.