R: [Voyage-linux] Broken watchdog?

Dario Finardi (spam-protected)
Thu Sep 18 14:38:18 HKT 2008


Are those Alix3 boards with an RTC battery?

We've experienced a problem with one of them, as the CS35... (the I/O interface device alongside the GEODE) was broken.
That device wasn't able to retain the current date in hardware. To be clearer: it stored the date without errors, but when I powered the Alix off and then on again the date was reset.
This unusual behaviour was accompanied by a watchdog that did not run.

________________________________

From: voyage-linux-bounces+d.finardi=gear.it at list.voyage.hk [mailto:voyage-linux-bounces+d.finardi=gear.it at list.voyage.hk] On Behalf Of Robert Rawlins
Sent: Wednesday, 17 September 2008 20:22
To: voyage-linux at voyage.hk
Subject: RE: [Voyage-linux] Broken watchdog?



Ok Guys,

 

I'm starting to think that this might not be a configuration thing. I have several boxes running identical configurations; by that I mean the following files:

/etc/watchdog.conf
/etc/default/watchdog
/etc/init.d/watchdog

are all the same on each system when I look at them. However, for some reason one of them has this issue of not rebooting when the tests fail.

 

I'm starting to wonder if this is down to something else, perhaps the drivers, or the watchdog hardware itself? Is there any way to manually ping the watchdog and ask it to reboot the system? Presumably that would help narrow down where the problem lies :)

 

Any suggestions? This unit is out in the field so doing a rebuild isn't an option and I really need this watchdog alive and barking.
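
One way to test the hardware independently of the daemon's checks is to open the watchdog device by hand, ping it a few times, and then deliberately stop pinging. What follows is only a minimal sketch, not something from the watchdog package: the function name starve_watchdog and its parameters are made up for illustration, and it assumes the standard Linux behaviour where opening /dev/watchdog arms the timer and any write counts as a keepalive. The watchdog daemon has to be stopped first, since the device can normally be opened by only one process at a time.

```python
import os
import time


def starve_watchdog(device="/dev/watchdog", pings=3, interval=1.0, wait=120.0):
    """Ping the watchdog a few times, then stop and wait for a reset.

    Hypothetical helper for illustration. Returns False only if the
    machine was NOT reset within `wait` seconds after the last ping.
    """
    # Opening the device arms the hardware timer (assumed driver behaviour).
    fd = os.open(device, os.O_WRONLY)
    try:
        for _ in range(pings):
            os.write(fd, b"\0")  # any write is treated as a keepalive ping
            time.sleep(interval)
        # Keep the descriptor open but send no further pings. A working
        # watchdog should reset the board once its timeout expires.
        time.sleep(wait)
    finally:
        os.close(fd)
    return False
```

Run as root, for example with `starve_watchdog()` from a Python prompt. If the function returns False, the timer never fired, which would point at the driver or the hardware rather than at watchdog.conf. Note that closing the device without the "magic close" character may leave the timer running on some drivers, so a reset shortly after the function returns is also possible.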

 

Cheers guys,

 

Rob

 

From: voyage-linux-bounces+robert.rawlins=thinkbluemedia.co.uk at list.voyage.hk [mailto:voyage-linux-bounces+robert.rawlins=thinkbluemedia.co.uk at list.voyage.hk] On Behalf Of Robert Rawlins
Sent: 15 September 2008 14:18
To: voyage-linux at voyage.hk
Subject: RE: [Voyage-linux] Broken watchdog?

 

Hi Dario,

 

Thanks for the reply on this. I'm glad you can confirm that it's not my watchdog.conf which is causing the problem. The watchdog script (/etc/default/watchdog) on my box looks like this:

 

# Start watchdog at boot time? 0 or 1
run_watchdog=1
#
# Specify additional watchdog options here (see manpage).

 

Does that seem normal to you? Also, as a little additional information, the syslog entry when the box first starts up for watchdog looks like this:

 

Sep 15 12:03:56 voyage watchdog[3290]: starting daemon (5.2):
Sep 15 12:03:56 voyage watchdog[3290]: int=15s realtime=yes sync=no soft=no mla=0 mem=0
Sep 15 12:03:56 voyage watchdog[3290]: ping: no machine to check
Sep 15 12:03:56 voyage watchdog[3290]: file: no file to check
Sep 15 12:03:56 voyage watchdog[3290]: pidfile: /var/run/myapp.pid
Sep 15 12:03:56 voyage watchdog[3290]: interface: no interface to check
Sep 15 12:03:56 voyage watchdog[3290]: test=none(0) repair=none alive=/dev/watchdog heartbeat=none temp=none to=root no_act=no
Sep 15 12:03:56 voyage watchdog[3290]: was able to ping process 3249 (/var/run/myapp.pid).

 

Again, is this what you would expect to see on a normal system configuration?

 

Many thanks,

 

Robert

 

From: Dario Finardi [mailto:d.finardi at gear.it] 
Sent: 15 September 2008 09:04
To: Robert Rawlins
Subject: R: [Voyage-linux] Broken watchdog?

 

Using such a configuration my boards are working correctly.

 

Have you enabled the watchdog script by setting the variable run_watchdog=1?

 

 

________________________________

From: voyage-linux-bounces+d.finardi=gear.it at list.voyage.hk [mailto:voyage-linux-bounces+d.finardi=gear.it at list.voyage.hk] On Behalf Of Robert Rawlins
Sent: Thursday, 11 September 2008 16:03
To: voyage-linux at voyage.hk
Subject: RE: [Voyage-linux] Broken watchdog?

Dario,

 

Thanks for your reply and for taking the time to help out. Sorry for my late reply, I've had my head buried in code the past couple of days.

 

My watchdog configuration file looks like this:

 

realtime             = yes
priority             = 1
pidfile              = /var/run/myapp.pid
watchdog-device      = /dev/watchdog
interval             = 15

 

That's all there is to it. In syslog it logs all the checks as I detailed in my original post, but after the process crashes and watchdog cannot find it, I get no more log entries from watchdog and the system is not rebooted.

 

Let me know if you need anything else.

 

Thanks,

 

Robert

 

From: Dario Finardi [mailto:d.finardi at gear.it] 
Sent: 10 September 2008 14:32
To: Robert Rawlins; voyage-linux at voyage.hk; voyage-linux at voyage.hk
Subject: R: [Voyage-linux] Broken watchdog?

 

Could you post your watchdog configuration?

 

________________________________

From: voyage-linux-bounces+d.finardi=gear.it at list.voyage.hk [mailto:voyage-linux-bounces+d.finardi=gear.it at list.voyage.hk] On Behalf Of Robert Rawlins
Sent: Wednesday, 10 September 2008 12:12
To: voyage-linux at voyage.hk; voyage-linux at voyage.hk
Subject: [Voyage-linux] Broken watchdog?

Guys,

 

My watchdog doesn't appear to be working quite correctly and I'm hoping you can help me out. I have it watching a process for me to ensure it's still alive, and I can see it logging this check in syslog like so:

 

Sep 10 08:50:11 voyage watchdog[3351]: still alive after 5661 interval(s)
Sep 10 08:50:11 voyage watchdog[3351]: was able to ping process 3250 (/var/run/myapp.pid).
Sep 10 08:50:26 voyage watchdog[3351]: still alive after 5662 interval(s)
Sep 10 08:50:26 voyage watchdog[3351]: pinging process 3250 (/var/run/myapp.pid) gave errno = 3 = 'No such process'

 

As you can see, it knows it cannot ping my process because it has crashed, yet the system doesn't appear to reboot itself. It just sits there like a dead duck :)
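
For reference, the "errno = 3" in that last log line is what a signal-0 probe reports when the PID no longer exists. The sketch below shows that kind of check in Python. It is an assumption about how the pidfile test behaves, not the daemon's actual code, and the function name ping_pidfile is made up for illustration.

```python
import errno
import os


def ping_pidfile(path):
    """Return (pid, err): err is 0 if the process exists, else an errno value.

    Hypothetical helper mirroring a pidfile liveness check.
    """
    with open(path) as fh:
        pid = int(fh.read().strip())
    try:
        # Signal 0 delivers nothing; it only checks existence and permissions.
        os.kill(pid, 0)
    except OSError as exc:
        return pid, exc.errno
    return pid, 0
```

Calling `ping_pidfile("/var/run/myapp.pid")` after the application has died should return the stale PID together with `errno.ESRCH`, whose message is "No such process". Running this by hand confirms whether the pidfile check itself is failing as logged, separately from whatever the daemon does next.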

 

I'm sure this was working in the past but cannot be sure. This is using Voyage 0.5 on an ALIX board.

 

I'd really appreciate some advice on this and how to debug if this is an issue.

 

Robert
