I just got home from the office after a 7:00am start, so I need to rant. Got to the office early to finalise reports for an 8:30am shareholders meeting. Couldn't log into email. Called the engineer on support, doesn't answer phone. Decided to have a quick look myself, the system had been down since 9:30pm the evening before. Alerts had not been SMSed because this was one of the affected systems. Engineer finally calls me, he's not feeling well and was wondering if he could work from home today. Clients start calling, our secondary DNS is down (which many clients use as their primary) and Radius is dead, so all dialups and roaming VPNs are down. Ah, the joys of providing managed solutions to customers. When something breaks, it is broken for every client. Another client calls, one of their main servers that runs the front end for their financial system has literally died (dead monitor, motherboard and RAID set). UK based Citrix farm not processing logins for a dozen offices we support regionally, with another half dozen sites coming on line in a few hours as Singapore, Dubai etc come on line. Two other engineers call in "sick", two O/S on holidays, so I only have two heads. Meeting about to commence and I still haven't had time to finalise anything. Humble apologies, excuses and meeting rescheduled. Cancel next three meetings to try and get on top of things. Idiots can't fix our systems in a timely fashion, impacting 1000s of users. Don't seem to understand that if it isn't working after 2 reboots, it's time to look for a root cause. I look myself. Problem is AD. Next problem, remote management isn't working on dead server. Drive to datacentre. SAN failure. Thanks Moretti, heap of **** Compaq SAN (HSG80 controllers). Servers boot off SAN, disk unit on raidset is corrupt, making boot partition unavailable on PDC, which is running the global catalogue. Exchange won't start as it can't talk to GC. Two other DCs fail to take up the GC role as they are supposed to.
Radius is integrated with AD and hasn't failed over to the other IAS either. F!!!!! Compaq redundant SAN that cost a couple of million $ isn't doing its job properly, MS software being a b1tch. I am now waiting for the Solaris systems to fall over as well. Thank god this doesn't happen. I establish the simple fix is to reboot the SAN controllers, but this will take down two racks of blade servers, and I only need to fix one. Force AD role change, reg hack Exchange GC change, add IPs for Radius and DNS to another server to force failover. Everything now working. That was all before lunch. Later discovered an HP contractor misconfigured the redundant HSG80 controller cards doing a firmware upgrade last year. Find manuals, read to work out how to do it properly. Also providing 2nd level support to two helpdesk staff, because the engineers are all on client sites dealing with other people's emergencies. Back at datacentre at 6:00pm, back up configs, decide to do a bit of a cleanup at the same time. Delete failed sets, change disks, re-add sparesets, reconfigure systems. Test, test some more, bring all systems back on line. Have an epiphany and reconfigure everything once more using a much simpler config, fewer things to go wrong. Test, test and test some more. Bring everything back on line, drive home to cold dinner. Just a typical day in a small IT business. I have a day like this once every three months on average. I love computers!!
this is the best place to vent, stephen...as no one cares, so you're not making anyone sad. good thinking
Sounds like fun!!! I would rather that happen to me every day than go back to my old helldesk job!! hahaha.
that's why i don't do hardware, i just write software. I.T. guys are my b1tch! my day.... woke up near 10am, went to werk, did a bit of werk, lunch at 12pm, had a lamb roast and 3 pints of beer, back in the office at 1:30, read and wrote some joke emails, did some more bits of werk between 3 and 5pm, got a call from mhh saying he was early and to be picked up downstairs in 10 minutes in his brand new lamborghini. went driving around for an hour and a half, got dropped off at a nice pub to a minor spectacle in north adelaide, drank about 8 pints of beer, went and got a pie and a cab home, and here i am. sadly the gf's not here to top the day off.
HA! Computer nerds! LOL Man, my girl went to the Gold Coast the other day for 2 weeks, I pulled myself to pieces every day while she was away. You may have to do the same tonite ash
Yeah, it's idiots like you that write failover and clustering code that sh1ts itself if the wind is blowing in the wrong direction, then blame the problems on the hardware guys!
it's just that you whiney bastards can't keep it up for very long so i have to write exceptions to exceptions in case (ie. for when) something stuffs up. our system goes down on me so often when i complain to the IT guys i refer to it as their mum!
It sounds like your whole business is dependent on them, you should love them. I would be thinking of systems to safeguard this from happening again. Think redundancies............ g/l
Thanks for the kind thoughts. We do have redundancy, both automatic and manual. Problem is when they break, they're generally broken properly.
Stephens, I can get a hold of Dexter, the robot/computer, to help you with your troubles at work if you would like? Hehehe
You guys are *****ing? You have forgotten more about computer technology than I will ever know! My computer has only a 2 gig hard drive and 127 M of RAM, and simply downloading spy saver kept locking up everything, and every time I rebooted, somehow my home page got changed from hotmail to msn! These damn things are useful, but even simple things can be a royal pain in the ass if you are not all that knowledgeable.
Just so you all know.....SS didn't write that. He plagiarised. He has no clue about computers really.