A watchdog timer (or computer operating properly (COP) timer) is a hardware or software timer that triggers a system reset or other corrective action if the main program, because of a fault condition such as a hang, fails to service it regularly. Servicing means writing a "service pulse" to the timer, also referred to as "kicking the dog", "petting the dog", "feeding the watchdog" or "waking the watchdog". The intention is to bring the system back from the unresponsive state into normal operation.
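The servicing cycle described above can be sketched as a small tick-driven simulation. This is only an illustrative model, not the behavior of any particular device: the `Watchdog` class, the `simulate` function, and the three-tick timeout are all hypothetical names and values chosen for the example.

```python
class Watchdog:
    """Tick-driven model of a countdown watchdog (hypothetical, for illustration)."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks

    def kick(self):
        """Service the watchdog: restart the countdown."""
        self.remaining = self.timeout_ticks

    def tick(self):
        """Advance the watchdog clock by one tick. Return True if it expired."""
        self.remaining -= 1
        if self.remaining <= 0:
            self.remaining = self.timeout_ticks  # countdown restarts after the reset
            return True
        return False


def simulate(total_ticks, hang_at=None, timeout_ticks=3):
    """Run a main loop that kicks every tick until it hangs at tick `hang_at`.

    Returns the list of ticks at which the watchdog forced a reset.
    """
    dog = Watchdog(timeout_ticks)
    reset_ticks = []
    hung = False
    for t in range(total_ticks):
        if t == hang_at:
            hung = True           # assumed fault: the main program stops servicing
        if not hung:
            dog.kick()
        if dog.tick():
            reset_ticks.append(t)
            hung = False          # the reset returns the program to normal operation
    return reset_ticks
```

In this model a healthy program never lets the countdown reach zero, while a hang is noticed a fixed number of ticks after the last service pulse.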
A more complex watchdog timer may first attempt to save debug information, that is, information useful for diagnosing the problem that caused the fault, onto a persistent medium. In this case a second, simpler watchdog timer ensures that if the first watchdog timer does not report completion of its information-saving task within a certain amount of time, the system resets whether or not the information has been saved. Watchdog timers are most commonly used in embedded systems, where this specialized timer is often a built-in unit of a microcontroller.
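The two-stage arrangement can be illustrated with a short sketch. It assumes the first-stage watchdog has already fired and started the debug save; the function name `two_stage_reset` and its tick-based parameters are hypothetical and serve only to show the control flow.

```python
def two_stage_reset(save_ticks_needed, second_stage_timeout):
    """Model what happens after the first-stage watchdog has fired.

    save_ticks_needed:    ticks the debug save takes, or None if the save hangs.
    second_stage_timeout: ticks the second, simpler watchdog allows for the save.

    Returns (tick_of_reset, info_saved).
    """
    for tick in range(1, second_stage_timeout + 1):
        # The first stage reports completion as soon as the save has finished.
        if save_ticks_needed is not None and tick >= save_ticks_needed:
            return tick, True
    # The second watchdog expired first: reset without the debug information.
    return second_stage_timeout, False
```

For example, `two_stage_reset(2, 5)` models a save that completes in time, whereas `two_stage_reset(None, 5)` models a save routine that itself hangs and is cut off by the second watchdog.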
Still more complex watchdog timers may be used when running untrusted code in a sandbox: by placing an upper bound on the CPU time available to the untrusted code, they prevent some types of denial-of-service attack.
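One way to bound CPU time on a Unix-like system is the operating system's per-process CPU limit, sketched below with Python's standard `resource` and `subprocess` modules. This is an assumption about how such a bound might be applied, not a description of any specific sandbox; the helper `run_with_cpu_limit` and the one-second default are hypothetical, and the `resource` module is not available on Windows.

```python
import resource
import subprocess
import sys


def run_with_cpu_limit(source, cpu_seconds=1):
    """Run Python `source` in a child process capped at `cpu_seconds` of CPU time.

    Returns the child's return code; a negative value means a signal killed it.
    """
    def limit_cpu():
        # Runs in the child just before the new interpreter starts.
        # Exceeding the soft limit raises SIGXCPU; the hard limit, one second
        # later, is enforced with SIGKILL.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds + 1))

    result = subprocess.run([sys.executable, "-c", source], preexec_fn=limit_cpu)
    return result.returncode
```

Note that this limit counts CPU time rather than elapsed time, so it stops a busy loop but not a process that merely sleeps.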
Watchdog timers may also trigger fail-safe control systems to move into a safe state, for example by turning off motors, high-voltage electrical outputs, and other potentially dangerous subsystems until the fault is cleared.
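A minimal sketch of this fail-safe behavior follows. The `FailSafeController` class and its output names are hypothetical; the example assumes the watchdog calls `on_watchdog_timeout` when it expires and that an operator or recovery routine later calls `clear_fault`.

```python
class FailSafeController:
    """Hypothetical controller that latches its outputs off on a watchdog timeout."""

    def __init__(self):
        self.outputs = {"motor": True, "high_voltage": True}
        self.fault = False

    def on_watchdog_timeout(self):
        """Enter the safe state: latch the fault and switch every output off."""
        self.fault = True
        for name in self.outputs:
            self.outputs[name] = False

    def enable(self, name):
        """Switch an output on, unless a fault is still latched."""
        if self.fault:
            return False
        self.outputs[name] = True
        return True

    def clear_fault(self):
        """Acknowledge the fault so that outputs may be enabled again."""
        self.fault = False
```

Latching the fault means a dangerous output cannot be re-enabled by accident while the underlying problem is still present.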
Watchdog timers are useful where humans cannot constantly monitor a system. Most embedded systems need to be self-reliant, because it is usually not possible to wait for someone to reboot them if the software hangs. Some embedded designs, such as space probes, are inaccessible to human operators; if their software ever hung, such systems would remain permanently disabled. In cases like these, a watchdog timer can restore operation without human intervention.
Watchdog timers come in many configurations, and most allow their configurations to be altered. Configuration elements include:
- Physical location
  - Within a chip external to the processor
  - In circuitry included within the CPU chip, as is done in many microcontrollers
  - On an expansion card in the computer's chassis
  - In software, such as in the mobile operating system iOS
- Clock source for the watchdog
  - The CPU clock
  - An independent clock, so that a CPU clock failure will cause a watchdog timeout
- Timeout interval before the watchdog triggers
  - Typical timeouts range from 10 milliseconds to 10 seconds
- Action the watchdog takes on a timeout
  - Processor reset
  - Non-maskable interrupt
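Some of the configuration elements listed above can be captured in a small data structure, as in the sketch below. The class names, the `prescaler` field, and the example clock frequency are assumptions made for illustration; real devices define their own registers and divider options.

```python
from dataclasses import dataclass
from enum import Enum


class ClockSource(Enum):
    CPU = "CPU clock"
    INDEPENDENT = "independent clock"


class TimeoutAction(Enum):
    RESET = "processor reset"
    NMI = "non-maskable interrupt"


@dataclass(frozen=True)
class WatchdogConfig:
    """Hypothetical watchdog configuration record."""
    clock_source: ClockSource
    clock_hz: int        # frequency of the selected clock, in hertz
    prescaler: int       # assumed divider between the clock and the counter
    timeout_ms: int      # desired timeout, in milliseconds
    action: TimeoutAction

    def reload_value(self):
        """Counter ticks that correspond to timeout_ms at the prescaled clock."""
        return (self.clock_hz * self.timeout_ms) // (self.prescaler * 1000)

    def detects_cpu_clock_failure(self):
        """Only a watchdog with its own clock keeps counting if the CPU clock stops."""
        return self.clock_source is ClockSource.INDEPENDENT
```

With an assumed 32,768 Hz independent clock divided by 32, a one-second timeout corresponds to a reload value of 1,024 counter ticks.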