Why can't railroads get their signalling right?

One thought is to remove the computers and do it the way they used to do it, back when it was more reliable. A computer glitch shuts down the whole system, ruining commerce and the passengers’ day (mine included) on VRE. It happens all the time.

Usually it is CSX with the problems, but this time it was NS.

Here’s yesterday’s message…

This message is for our Manassas Line riders:

Delays on the Manassas Line This Morning

Before we get started with the explanation of the delays this morning, we want to take a moment to apologize not only for the delays, but also for the confusion that occurred as a result of the delays. Unfortunately, when delays of this magnitude happen, communicating the minute by minute changes can be difficult.

What Happened?

At about 5:40a this morning, minutes before train #324 was scheduled to leave Broad Run, Norfolk Southern experienced a system failure in their signal network. Normally, when this occurs, they reboot the system and a delay of about 20 minutes can be expected, affecting only one or two trains. However, today, Norfolk Southern was not able to get the system to reboot. This caused significant delays that created a domino effect throughout morning service on the Manassas line.

The Extent of The Delays

→ Train #322 was already on the CSX system outside of Alexandria when the NS system went down. As a result, they did not experience any delays.
→ Train #324 was given permission to proceed 63 minutes behind its original schedule at 6:48a. Because it had already boarded a large number of passengers at Broad Run, in order to space out the trains and to prevent crowding, the decision was made to skip Manassas, Manassas Park, and Rolling Road.
→ Train #326 was given permission to proceed 11 minutes later at 6:59a. Originally, it was told to skip stops as well. However, because it was operating so close to train #324, skipping stops would have only brought it

So do you equate “getting it right” with “never having a problem”?

yes

I can imagine an entire organization of signal maintainers laughing their heads off.

  1. The signaling didn’t fail; the train-control system failed. The news article is in error. This isn’t semantics. You could be using timetable and train-order to control the railway and have an identical problem.

  2. Why do you believe it was more reliable in the old days?

  3. Why do you think computers are to blame? A Saxby & Farmer interlocking machine of 1870 isn’t any different, except it used mechanical logic instead of electronic logic. Interlocking machines broke too. Should we get rid of them, too?

  4. What’s an acceptable failure rate to you, and how much do you think it is worth spending to achieve it?

  5. Who do you want to send the bill to for this?

S. Hadid

Even a system operating with the vaunted “4 9’s” (99.99% uptime) will experience almost an hour of downtime a year.
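For anyone who wants to check the arithmetic, here is a quick sketch. It assumes a 365-day year and round-the-clock operation; the function name is made up for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(nines: int) -> float:
    """Expected yearly downtime for an uptime of `nines` nines (4 means 99.99%)."""
    unavailability = 10.0 ** -nines  # e.g. 4 nines -> 0.0001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines: {downtime_minutes_per_year(n):.1f} minutes/year")
```

Four nines works out to roughly 52.6 minutes a year, and five nines to a little over 5 minutes.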

All red signals - I’d say that the system performed exactly as advertised - the controlling system (the computers) failed and the signals all dropped to red. Can’t ask any more than that!

Typical, ridiculous answer. I suppose the Army’s home page never experiences technical glitches. And the Marines always get it right. One hundred percent. All the time.

You’re right…they should go back to paper train orders. That was much more efficient.

If you aim for 99% reliability, that is what you will get.

One possible solution would be to have built-in redundant systems. Aircraft use these where reliability must be 100% or else!
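As a rough illustration of what redundancy buys on paper, the sketch below uses the textbook parallel-system formula. It assumes the copies fail independently of one another, which is a big assumption: anything that takes out all the copies at once makes the real figure worse. The function name and the 99% figure are just for illustration.

```python
def combined_availability(unit_availability: float, copies: int) -> float:
    """Availability of `copies` identical units in parallel.

    Assumes failures are independent, so the whole system is down only
    when every copy is down at the same time.
    """
    if copies < 1:
        raise ValueError("need at least one unit")
    return 1.0 - (1.0 - unit_availability) ** copies


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} unit(s) at 99% each: {combined_availability(0.99, n):.6f}")
```

Under that independence assumption, two 99% units together reach about 99.99%, and three reach about 99.9999%.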

Sioux City, Iowa. DC-10 Triple redundancy. Operational flight control systems at touchdown - zero.

Some people strive for five 9’s. You don’t see it often.

Like I said - there is a fail-safe built into the system, and it worked. Despite the failure of the computers, the signals did not fail. Traffic got messed up for a while, but there were no accidents or injuries as a result of the problem.
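To make the fail-safe idea concrete, here is a toy sketch. It is not how NS's system or any real signal equipment is built; the `Signal` class, the timeout, and the two aspects are all hypothetical. The only point it shows is the principle: the signal displays red unless it has recently been told otherwise.

```python
from enum import Enum
from typing import Optional


class Aspect(Enum):
    RED = "red"
    GREEN = "green"


class Signal:
    """Hypothetical signal that falls back to red when commands stop arriving."""

    def __init__(self, timeout_s: float = 5.0) -> None:
        self.timeout_s = timeout_s
        self._requested = Aspect.RED
        self._last_command_at: Optional[float] = None  # no command received yet

    def command(self, aspect: Aspect, now: float) -> None:
        """Record a command from the control system at time `now` (seconds)."""
        self._requested = aspect
        self._last_command_at = now

    def displayed(self, now: float) -> Aspect:
        """Return the aspect shown at time `now`; stale or missing commands mean red."""
        if self._last_command_at is None:
            return Aspect.RED
        if now - self._last_command_at > self.timeout_s:
            return Aspect.RED  # control system has gone quiet: most restrictive aspect
        return self._requested


if __name__ == "__main__":
    sig = Signal(timeout_s=5.0)
    sig.command(Aspect.GREEN, now=0.0)
    print(sig.displayed(now=3.0))   # still within the timeout
    print(sig.displayed(now=10.0))  # control system silent for too long
```

The failure of the controlling computer costs time, because everything stops, but it does not by itself produce an unsafe indication.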

Of course they don’t aim for only 99%, but neither should we believe they can’t “get it right” when there’s a system failure. If you’re in the computer and web page business, then you have to have experienced one of those “harmless” military security updates that suddenly cripples the system because even after all their beta testing, R&D, and forethought, there was one hardware-software combination out there they hadn’t counted on. POOF. Oops.

Built-in redundancy? Are you serious? On whose dime? Who is going to step up and pay for a complete back-up system that kicks in when the primary goes down? No, the fail-safe worked in this case. The signals dropped to red, and everybody sat tight till they got it worked out. No damage, no injuries. No big deal.

EDIT: Sorry, tree68, I jumped your post a bit. You’re a faster typer than I am.

Let’s see… Spend millions on a redundant signal/communication/computer traffic control system… or… tolerate that 20 minutes of downtime that happens once in a blue moon.

When was the last time this happened?

There is no such thing as 100% reliability. Period. Perfection simply isn’t possible. What the aircraft folks do, as well as the railroad folks, is aim for systems with as high a reliability as possible – four nines isn’t unreasonable, and one program I was associated with years ago aimed for five nines and did better (and we still had failures)(it was called Apollo, for the old-timers in the crowd). Then you try to set things up so that when the inevitable occurs, you fail operational if possible, and safe if not. Which is exactly what happened. So relax and enjoy…

Yeah - but great minds think alike…[:D]

Welcome to West World…Where nothing can go wrogn.

Nice thing about being able to drive your own car is you never have to worry about a signal failure.

Heh, yeah but when the batteries in the garage door opener’s remote control go dead over night, life can be hectic…[:D]

It’s not that anything can’t go wrong, it’s just not anybody’s fault when something does go wrong. When a computer goes down at ABC company a couple of hundred people might be inconvenienced for a while. The computer system at NS shuts down and a whole lot more people are inconvenienced - no pressure on the NS computer department to get it back up and running. The one good thing about technology is that you can always blame the equipment… not like the good ol’ days.

Tongue somewhere in cheek on this one,
CC

Do it the old way when it was more reliable? [?] What study are you using that states those figures? Do you really believe a modern train control system is less efficient than the old block control system and/or track warrants? Without today’s modern computer-based systems you could not move as much traffic on less track as quickly. [2c] As always ENJOY

I think a manual backup system plan in place would be nice.


Have you noticed (those of you old enough) that before computers, banks never had system crashes? But when I go to Navy Federal (technically a credit union) and other banks, their systems are very frequently down. AND they have the same number of employees they did before computers.

Could that be an analogy to the RR signal debacle?

It’s NS’s dispatching system that had the failure. It’s ironic that it was the new system that had the problem. It is designed to prevent exactly this type of outage. The old dispatching systems were stand-alone and not backed up. If you lost the system, you were done - until you could fix it - and your data was toast. The new system has hot-backup servers and any desk on the RR can control any territory, so, in theory, it should be more reliable.

You will never see NS shut down completely for a day because a hurricane is headed for a headquarters building location!

It is a brand-new system. NS is the first to purchase it. There have been some glitches along the way. The good news is that they are becoming less and less frequent and of shorter duration as the bugs get worked out.

There aren’t enough people around on the RR these days to ever have a prayer of reverting to doing anything “the old way”. RRs are wholly dependent on technology, so they better get it right.