[CLUE-Tech] Detecting possible hardware failures

Jed S. Baer thag at frii.com
Thu Feb 7 09:37:28 MST 2002


Greetings folks.

This morning I was greeted by an increasingly unresponsive machine. It's
an ABit KA7, Athlon 750, running RH7.1.

Yesterday, the X server crashed without producing a core file, but I was
able to restart it fine. This morning, my chat script wouldn't run, so I
su'd root, and did a tail -f /var/log/messages, to see what was happening,
and retried the script. The tail command showed no output, and I couldn't
ctrl-c it.
This was all in X, using rxvt terminals, and Fvwm command buttons to run
the chat script via sudo. Somewhere in here, while the tail command was
running, I su'd root in another rxvt and tried to run the chat script from
the command line. No error messages from that anywhere.

So, I switched to a different virtual console, logged in as root, and did
a ps, to get the pid to try to kill the tail command. This hung after
about 10 lines of output. Again, ctrl-c didn't work. I switched to
another virtual console, and (silly me) tried ps again. This one produced
no output, and hung. I switched to another virtual console, and this one
hung immediately upon logging in.

So, I switched back to the console running X, and did the
ctrl-alt-backspace. Then I su'd root, and got an immediate hang, no prompt.

At this point, being unable to find my SysRq keys printout, I did a hard
reboot, and the box came up fine.

I grep'd through my tripwire logs, looked at all the usual logs, etc., but
haven't found any evidence of tampering or break-in, or any HW messages.

The only background which now seems relevant is that I've been having some
mouse problems: left button stuck in a button-down state (but only when
I'm running galeon [or any GTK app?]), and what appeared to be an actual
HW problem with the switch on the left button on my mouse. I switched mice
about a week ago, and the only mouse-related problems I'm now seeing are in
GTK apps, e.g. the right-button X paste sometimes doesn't work. Exiting
all GTK apps has seemed to "reset" this problem, and it reappears only
intermittently.

This all has me both concerned and stumped. The keyboard and mouse are
obviously responsive, from a HW point of view. The complete lack of error
messages is really puzzling. I do log the X server output, and haven't
seen anything suggestive there either.

Any thoughts on how to get some meaningful diagnostics?
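
Since ps itself hung, one possibility is to read /proc directly and look
for processes stuck in uninterruptible sleep (state "D"), which would
explain why ctrl-c had no effect. What follows is only a minimal sketch in
Python, assuming the usual Linux layout where each /proc/<pid>/stat line
reads "pid (comm) state ...". The function names parse_stat_line and
find_blocked are made up for illustration, and there is no guarantee that
reading these files won't also block on a box in this condition.

```python
import os


def parse_stat_line(line):
    """Return (pid, command name, state) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split on the LAST closing parenthesis.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def find_blocked(proc_root="/proc"):
    """List (pid, comm) for processes whose state is 'D'."""
    blocked = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the entry was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)


if __name__ == "__main__":
    for pid, comm in find_blocked():
        print(pid, comm)
```

If several unrelated processes all show up in state "D", that might point
toward something they share (a disk, a filesystem, a driver) rather than
any one program, though that is a guess and not a diagnosis.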

TIA
jed

-- 
"Those who expect to reap the blessings of freedom must, like men,
 undergo the fatigue of supporting it."
 - Thomas Paine


