
Explosion Takes Out AtariAge [Updated]


Albert


It's been a hard few days! :)

 

Glad to see it's all back. Goes to show how important this site is to most when something takes it away!

 

I can only imagine the sheer torture the techs are going through with people demanding instant fixes etc. In times like this everyone thinks their server is the most important... To heck with them, AtariAge is the most important! Get in line! LOL!

 

Guess it's an exercise in assuming the worst, and I bet you are probably shopping for a mirror or mass storage hehehe. I'd imagine you got your ducks in a row already but hey. When the main electrical goes down it's pretty darn hard to come up with a good alternative, especially when the alternative would only be another fire/explosion threat.

 

A day without atariage is like night, a really boring night -like when there's no atariage. :)


Would it make sense to have some designated alternate site for information about AtariAge and its current status? Even something as simple as a geocities (or whatever) page that would usually say simply "AtariAge is online at www.AtariAge.com" but which could easily be updated to let people know what's going on when there's an outage?

 

For that matter, what would be involved from a technical standpoint to set things up so that if neither of the primary AtariAge name servers works, a visitor would get redirected to a name server on another system (e.g. something in your house) which would then redirect to a sorry/status screen? The bandwidth required to handle that would be low enough that even a personal DSL connection would probably suffice if everything could be set up properly.

Not very difficult and I could have done it, but initial estimates didn't have the servers being offline for so long so I didn't think it was necessary then. I am probably going to make some changes to the way DNS is set up so I can quickly redirect the domain to another server in the event that something catastrophic takes place down the road.

 

..Al
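For anyone curious what the "redirect to a sorry/status screen" idea could look like in practice, here is a minimal sketch of just the decision step. It is not how AtariAge's DNS is actually configured: the addresses are placeholders from the documentation ranges, the names `address_to_publish` and `tcp_port_open` are made up for illustration, and a TCP connect to port 53 is only a rough stand-in for a real health check.

```python
import socket
from typing import Callable, Sequence

# Everything below is hypothetical: placeholder addresses, not real servers.
PRIMARY_SERVERS = ("198.51.100.10", "198.51.100.11")  # the two primary servers
MAIN_SITE_IP = "198.51.100.20"   # address published when things are healthy
STATUS_PAGE_IP = "203.0.113.5"   # small "sorry/status" box, e.g. on a DSL line


def tcp_port_open(host: str, port: int = 53, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def address_to_publish(
    servers: Sequence[str] = PRIMARY_SERVERS,
    is_up: Callable[[str], bool] = tcp_port_open,
) -> str:
    """Publish the main site while any primary answers, else the status page."""
    if any(is_up(server) for server in servers):
        return MAIN_SITE_IP
    return STATUS_PAGE_IP
```

The health check is passed in as a function so the decision can be tried out without touching the network, for example `address_to_publish(is_up=lambda server: False)` returns the status-page address. Actually pushing the chosen address into DNS depends on the provider and is left out here.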


I first blamed it on "Al Commodore" with a successful motherload,8,1 attack when someone really did press play on tape.

 

Then I thought the thrift finds thread had opened up some sort of stargate/rift, taking Atari out of history and, subsequently, AA, leaving that particular part of the multiverse mine and mine alone to talk about in the mental institution!

 

Finally I thought the Large Hadron Collider ejected some strange matter as well as produced a black hole and that it was only a matter of time to kiss my loved ones goodbye!

 

Glad I'm wrong cause I sure as heck don't wanna kiss anyone goodbye when it is not a real goodbye! :)


Atari will never go away! Even a little explosion won't get rid of it. Just remember, Atari had the Japanese game makers trembling in their boots!!!! Diamonds are forever, and so is ATARI


I thought I might have been banned, too. I did consider logging in from a different IP, but I figured the best way to find out if AA was down would be to go over to digital press. So I did. Thanks for letting us know, Al. I'm sure lots of folks were worried.

 

I have a few trades still to complete here, and I had wanted to throw Plaque Attack into the modded Supercharger for the high score competition. I didn't know if I'd make it to the HSC. Instead of losing sleep over it, though, I just played more games! :D I got through a lot more of Coded Arms, and I played my Mythicon games.

 

I am an electrician by trade. Some of the stuff I've seen will blow your mind. One place I worked had 11 electrical switchboards and a number of load centers. There were eight service switchboards, and three emergency ones. Each was hooked to two or three others, and each had its own dedicated generator. I have a few stories from that place--sit back and relax...

 

One day, I flipped the field breakers for the #4 generator. When I turned to look at the switchboard, a cloud of foul smelling smoke was growing from the floor. One of our shop personnel was also a firefighter. He was sleeping in the shop, so I woke him and I ran the other dude out. There were two oxygen tanks in the locker, so one of us had to go. We found out later that a control transformer had burned up and it had nothing to do with my project.

 

Another time, I was running the same switchboard, and I heard these words: "Major steam leak in the machinery room!" OK, I thought, this is bad. Next I hear, "Medical emergency in the machinery room!" Uh, oh, someone's been burned. Turns out that a Halon bottle blew its top, releasing the contents. The personnel thought it was steam. Meanwhile, the flying debris struck a fellow in the knee, sending him to the floor unable to move.

 

On another occasion, a friend of mine spilled oil based paint on a steam turbine, setting the turbine ablaze. Fun stuff.

 

Then there are my favorite stories...

 

Half the plant was down while folks were on vacation. Well, during this time, a dehydrator fails and water gets into the control air system. This happened sometime overnight. Also overnight, many of the operators turned off the benchboard lights on their switchboards so they could sleep. In some boards, this killed power to the PA amp. I was working a different board that day, and I wasn't used to the PA amp going off with the lights. Although I was well rested and fully awake, I kept the benchboard lights off. I had no idea what was going on until lights started coming on on my minibus. I got a call on the telephone from a coworker asking to balance the load between two generators. Sometime later, about 90% of the plant went dark when that water hit an air operated valve. The valve slammed shut, cutting steam to two of our service generators. Of the three emergency generators, only #3 was able to run. The mechanics had torn #1 down, and #2 failed to start. Wouldn't you know it, 3E was not set up to power any of the loads we'd lost generators for. Vacation was cut way short, and those mechanics worked around the clock to fix 1E and 2E.

 

Then there's this line: "Central control, this is 3E. I have a fire in my switchboard, what do you want me to do next?" That actually happened, and the operator there scored himself a few "Duh!" awards. (The correct response is to cut power to the switchboard immediately and then notify Central Control what happened.)

 

Here's my favorite, though:

I'm sitting on my normal board (#4) and it's parallel to 1 and 5. 7 and 8 are running parallel, as well as 2, 3, and 6. The emergency boards are online. All eight service generators were up, but the emergency generators were supposedly on standby. We're running the plant at about 70% of full load.

I'm half asleep with my feet on the benchboard when I hear bam, bam, bam, bam BRAUM! and I look up to see my kW meter at 160% of max. Oh $#!^, I thought, I just motorized the thing and my reverse power relay failed. I. Am. Screwed. I looked up, and saw both 7 and 8 were offline. 5 was offline as well. I hit the control switch, tripping my generator and field breakers instantly. That left the poor fellow on 1 board with four times his max load. He had in excess of six thousand kilowatts running on this dinky Westinghouse generator. A friend of mine rushed in and started tripping breakers to cooling units all over, and dropping other unnecessary loads. 1 board's load dropped to 3500 kW, and then he dropped offline like a rock. To make matters worse, none of the emergency generators started, either.

What happened? Grab a brewski--this is good. :)

The fellow in 7 board, Mark, was asked to balance the load on his bus. He calls Joe over on 8, and when he tells Joe to increase the load on 8, Joe grabs the wrong handle and trips 8 generator offline. Joe decides he'll bring his board back up by closing a bus tie out of phase. That tripped the main breakers on 5 and 7 instantly, transferring the load from 5, 7, and 8 to 1 and 4. The noises I heard were breakers tripping throughout the plant, and the AC behind me as it single phased.

My response was a mistake as well: my generator was providing vital electricity to the now-crippled plant. When I tripped it offline, I was unable to parallel it back to the bus. I was in the same situation as the folks on the other boards. All of our generators were running, but they physically sped up when the load was removed. That put them out of our control for a few vital seconds, and the whole incident (including the struggle to keep 1 board up) happened in less than one minute. This casualty led up to the one above with the failed dehydrator. The emergency generators were under repair because they'd failed to start this time, and the repairs either weren't complete (#1E) or weren't done right (#2E).

In the end, it took less than 60 seconds to go from 70% of full load to about 15%. The plant's boss showed up two days later for a big training/gripe session. Then two weeks later, we dump the plant again--yay! :D

 

The same dude that was on 1 board above paralleled the thing with his synchroscope 90 degrees out of phase on another occasion. BAM! We heard breakers trip from one end of the plant to the other.

 

One day, my friend Eric paralleled 1E switchboard to its generator which was out of phase. It blew the contacts in the main breaker to pure slag.

 

My friend Duane was working with his supervisor cleaning out a large electrical box. Duane's supervisor was inside the box with three large bus bars surrounding him. He had less than an inch of clearance in any direction when suddenly there was a familiar buzzzzzz sound. They'd hung danger tags on the breakers to the box, but when Duane checked the bus bars, the bars were live with 440VAC. Duane's supervisor met him halfway to where the breakers were. They found their red tags still hanging, but the breakers were now energized. Duane said he has no idea how his boss lived through that, let alone escaped the box with the power on. Duane hadn't had enough time to kill the power and meet his boss--the boss had to have somehow crawled out of the box while it was live.

Edited by shadow460

Well, the first day I knew something was up because it's unusual for AtariAge to be MIA without some sort of pre-announced news about it. So I went straight to Google to search and found the DP forum topic about the situation. Pretty much all weekend I felt the same withdrawals that some of the other members have posted about. AtariAge is a part of my daily life for picking up on gaming news, including weekends.

 

Glad no one was hurt during the explosion though. That's the main thing.

 

And as someone in the DP thread made mention of, it just goes to show you that no matter how much you plan or do to prevent such an occurrence, Murphy will always find a way to move into the upstairs bedroom. :P

 

Oh, and shadow460, I'm no electrician (I'm a computer programmer), but that made for some fun anecdotal reading there. I worked part-time in IT at a local company here, and we had a Murphy moment with one of our servers one night late, about 2AM. Luckily I worked part-time and didn't get called to come in, but the other three guys on the IT team did go in.

 

It went down something like this: Apparently, some supervisor on graveyard shift was having trouble with one of the computers on the network in the plant. Instead of calling us for advice (and why on earth would anyone call an IT person in the middle of the night to fix a server problem he/she does not really know how to fix? :roll: ), he took matters into his own hands. Mr. Wonderful here decided the best course of action was to unplug the network cable connecting from the server to the hub. Not just his computer, mind you. The entire hub that controlled that part of the plant. He then tried to plug said cable back in to reset the connections. Now given modern-day hardware, that sounds probably harmless enough on the surface, and likely would've reset all connections and then would've been okay, if it weren't for the fact that our server is an odd amalgamation of new and old hardware (a modern-day Windows server connected also to a fussy old VAX system that was shipped over with Noah on the ark).

 

This, of course, set off a series of unfortunate events that eventually brought down that whole division of the plant. With production halted there, production halted everywhere else pretty much, and network problems surged on to other areas until they finally decided to give the IT manager a call and fill the rest of the full-time people in on what happened. I came in the next morning about 6AM (we had already scheduled to come in early that day while the rest of the office was out to do some work on the office-side of the plant). And that's when they filled me in on what happened. Production continued after the team fixed a temporary solution up (which would've been the proper procedure had they called in the first place and asked instead of halting production for the night). But it took quite some time for the server restore to go through to get things back to normal again. Needless to say, we never got to do what we planned that morning, and had to reschedule it for the next day at 6AM (a Saturday...)

 

Isn't technology wonderful? :D :P :ponder:


Shadow460, thanks for the great read! Like Rockman, I'm just a computer programmer. But I *do* know enough to be glad that no one was seriously injured in those events you described. I don't think that people tend to realize that a megawatt of electrical power is effectively the same thing as a megawatt of kinetic energy. If something goes seriously wrong, there's a good chance that highly energized equipment will blow up and people will end up dead. (Especially if you lose a turbine!) That story about how you guys brought the plant down is particularly scary! To have 6 MW coming through your board off a turbine that's running 4x its maximum rated output... well... I'm just glad the engineers designed those things well!

 

Though the story about the power being 90 degrees out of phase is just plain funny. How to shut down a plant in one easy step! I'd hate to see the braking power THAT applied to the generators! :P
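To put rough numbers on that overload, here is a quick back-of-the-envelope sketch. It takes the figures in the story at face value (a bit over 6,000 kW at about four times the generator's max load, then 3,500 kW after shedding); the implied rating is only inferred from those two numbers, and the function names are made up for illustration.

```python
def implied_rating_kw(load_kw: float, overload_factor: float) -> float:
    """Rating implied by a load that is `overload_factor` times the maximum."""
    return load_kw / overload_factor


def percent_of_rating(load_kw: float, rating_kw: float) -> float:
    """Express a load as a percentage of the generator's rating."""
    return 100.0 * load_kw / rating_kw


# Figures quoted in the story: ~6000 kW at ~4x max, then 3500 kW after shedding.
rating = implied_rating_kw(6000, 4)
after_shedding = percent_of_rating(3500, rating)
print(f"implied rating: {rating:.0f} kW")
print(f"load after shedding: {after_shedding:.0f}% of rating")
```

On those assumptions the machine was still carrying well over twice its implied rating even after the cooling units were dropped, which fits with 1 board falling offline anyway.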


Another interesting thing... we got new email servers and I got no emails for 1 day and at the same time AA was down... and I realised... what if my mobile phone is not working anymore... how the hell could mankind survive nowadays...

 

As gas gets more and more expensive (and I do not drive to get a cup of coffee etc. anymore but take my feet or my bike (yes... the one without any electricity except for the dynamo...)) I realise when we were kids... the world was still around us and we did business, played games, went out etc... ;) how was that possible...


Joe decides he'll bring his board back up by closing a bus tie out of phase. That tripped the main breakers on 5 and 7 instantly, transferring the load from 5, 7, and 8 to 1 and 4. The noises I heard were breakers tripping throughout the plant, and the AC behind me as it single phased.

 

How do big generators work? I would expect that they use coils on both the rotor and the stator (like a car's alternator). I'm curious, though: how many of what sort of windings does a typical generator have? And do they not have some sort of electronic controls to ensure that things are within 30 degrees of phase when they switch in?
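On the "within 30 degrees of phase" part of the question, the check itself is simple to sketch. This is only an illustration of the idea, not a description of the controls at the plant in the story: the 30 degree window is taken from the question above rather than from any real relay setting, the function names are invented, and a real sync check would also compare voltage magnitude and frequency. The second function shows why closing out of phase is so violent: for two equal-magnitude voltages that are δ apart, the voltage across the open breaker is 2·sin(δ/2) times the nominal voltage.

```python
import math


def phase_difference_deg(a_deg: float, b_deg: float) -> float:
    """Smallest signed angle from b to a, in the range (-180, 180]."""
    diff = (a_deg - b_deg) % 360.0
    return diff - 360.0 if diff > 180.0 else diff


def voltage_across_breaker_pu(delta_deg: float) -> float:
    """|V1 - V2| for two equal 1.0 per-unit phasors that are delta_deg apart."""
    return 2.0 * abs(math.sin(math.radians(delta_deg) / 2.0))


def ok_to_close(gen_deg: float, bus_deg: float, window_deg: float = 30.0) -> bool:
    """Permit closing only if the phase difference is inside the window.

    window_deg is a hypothetical setting taken from the question above.
    """
    return abs(phase_difference_deg(gen_deg, bus_deg)) <= window_deg


for delta in (10.0, 30.0, 90.0, 180.0):
    print(delta, ok_to_close(delta, 0.0), round(voltage_across_breaker_pu(delta), 3))
```

At 90 degrees out of phase the breaker is closing onto about 1.41 times nominal voltage across its contacts, and at 180 degrees it is twice nominal.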


Update: Yep, the site was unavailable again for an extended period of time, between about 2:45am early Tuesday morning until 5:30pm. Turns out the temporary backup generator that The Planet was using to power "Phase 1" of their Houston data center developed a fault and the generator could not be repaired. It then took some time for them to get another generator on site and operating. Hopefully we'll be online now without further interruptions of this nature. Other hosting options are now being explored.


Other hosting options are now being explored.

 

Sounds like Murphy's Law more than incompetence has been responsible for the outages. While I haven't heard anything about the Planet's service since quitting the company that used to be RackShack, I wouldn't hold these outages against them, as frustrating as they may be.

 

Now if you're looking into other hosting as a back-up plan, then I say excellent, carry on!


Too bad AtariAge can't be hosted in a couple of locations, so if one server is unavailable for any reason, the backup (exact copy) in another state can kick in. Is that even possible?

Sure, it's possible, but it would at least double the current expenses of hosting AtariAge so it's not really feasible. Unless you want to donate $400 a month to the cause. :)

 

..Al


Other hosting options are now being explored.

 

Sounds like Murphy's Law more than incompetence has been responsible for the outages. While I haven't heard anything about the Planet's service since quitting the company that used to be RackShack, I wouldn't hold these outages against them, as frustrating as they may be.

I'm not very impressed with the way they handled this situation. Knowing that this single generator was their sole means of getting power to the entire first floor of this building, they should have already had a backup generator on hand as a standby. Totally inexcusable that they did not. Plus some of the communication about the problems and time to resolve everything (in both instances) have been piss poor.

 

..Al


Knowing that this single generator was their sole means of getting power to the entire first floor of this building, they should have already had a backup generator on hand as a standby...

 

Plus some of the communication about the problems and time to resolve everything (in both instances) have been piss poor.

 

Ah, I can see where you're coming from now. Best of luck in your quest!


Other hosting options are now being explored.

 

Sounds like Murphy's Law more than incompetence has been responsible for the outages. While I haven't heard anything about the Planet's service since quitting the company that used to be RackShack, I wouldn't hold these outages against them, as frustrating as they may be.

I'm not very impressed with the way they handled this situation. Knowing that this single generator was their sole means of getting power to the entire first floor of this building, they should have already had a backup generator on hand as a standby. Totally inexcusable that they did not. Plus some of the communication about the problems and time to resolve everything (in both instances) have been piss poor.

 

..Al

I agree with you here. I think some (esp. IT professionals) are being sympathetic, but it sounds like they were not prepared for this, and they have never said what caused the short that resulted in an explosion. Lots of questions are unanswered, such as why there wasn't a circuit breaker to prevent the explosion from a short, and if human error caused the short. If human error caused this, they should own up to that.

