Many cf-agent processes running

Sometimes during normal operation of CFEngine, you might see several cf-agent processes running simultaneously. This is expected if some promises take a long time to check or repair while the agents are working on different promises. The condition is temporary: the process count returns to normal once work on the promises is done.

 

However, if there are tens of cf-agent processes or more running, this most likely indicates that something is wrong. The agents could be waiting for some condition, such as a lock, or the LMDB database files may be corrupted (due to an unclean shutdown of CFEngine).
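As a rough aid for telling the two situations apart, the process count can be compared against a threshold. The sketch below is an illustration only: the helper names are hypothetical, and the default threshold of 10 is an assumed reading of "tens or more", not a value defined by CFEngine.

```shell
#!/bin/sh
# Count running cf-agent processes (exact process-name match via pgrep).
count_cf_agents() {
    pgrep -x cf-agent | wc -l | tr -d ' '
}

# Hypothetical helper: classify a process count against a threshold.
# $1 = number of cf-agent processes
# $2 = threshold (optional; defaults to 10, an assumed value)
classify_agent_count() {
    count=$1
    threshold=${2:-10}
    if [ "$count" -ge "$threshold" ]; then
        echo "suspicious"
    else
        echo "ok"
    fi
}
```

For example, `classify_agent_count "$(count_cf_agents)"` prints `ok` or `suspicious` for the current host.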

 

To diagnose the issue, find the PIDs of the cf-agent processes and create a backtrace for each. On most Unix-like systems this can be done with the ps and gdb tools as follows:

 

1. To get the process IDs of cf-agent:

# ps -ef | grep cf-agent

 

2. To get the backtrace of a process:

# gdb -p <PID>

(gdb) bt full

(gdb) quit
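When many agents are running, attaching to each one by hand is tedious. The two steps above can be sketched as a single non-interactive loop using gdb's batch mode (`-batch` with `-ex`). The function name and the output file naming are hypothetical, and the sketch assumes pgrep and gdb are installed and that it is run with sufficient privileges to attach to the processes.

```shell
#!/bin/sh
# Hypothetical helper: write one backtrace file per running cf-agent
# process into the directory given as $1 (defaults to the current directory).
collect_cf_agent_backtraces() {
    outdir=${1:-.}
    mkdir -p "$outdir" || return 1
    for pid in $(pgrep -x cf-agent); do
        # -p PID   attach to the running process
        # -batch   exit after executing the given commands
        # -ex CMD  execute one gdb command ("bt full" = full backtrace)
        gdb -p "$pid" -batch -ex "bt full" > "$outdir/cf-agent-$pid.bt" 2>&1
    done
}
```

Running `collect_cf_agent_backtraces /tmp/cf-agent-backtraces` as root would then leave one `.bt` file per process, ready to inspect or attach to a support case.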

 

You can either examine the backtraces yourself or attach them to a support case to determine the root cause.

 

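Also note the agent_expireafter setting in body executor control. It sets how many minutes an agent started by cf-execd may run before it is killed. Setting it to, e.g., 60 limits the total number of cf-agent processes to 12 in a default configuration, where a new agent is started every 5 minutes.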
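The figure of 12 follows from dividing the expiry time by the interval between agent runs. A minimal sketch of that arithmetic, assuming the default schedule starts a new agent every 5 minutes (the function name is hypothetical, and the result is only an approximate upper bound):

```shell
#!/bin/sh
# Hypothetical helper: approximate upper bound on concurrent cf-agent processes.
# $1 = agent_expireafter in minutes
# $2 = minutes between agent runs (optional; 5 is assumed as the default schedule)
max_concurrent_agents() {
    expire=$1
    interval=${2:-5}
    # Round up: an agent started partway through an interval still counts.
    echo $(( (expire + interval - 1) / interval ))
}
```

With these assumptions, `max_concurrent_agents 60` prints 12, matching the example above.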

 

There is also a CFEngine policy attached to this article that you can use to detect and remediate this situation.
