Written May 7, 2009 in linux, sysadmin

I ran into something odd while feeling my way through this node configuration with crm (Pacemaker) 1.0.3.

It turns out that as I configured my resources and created locations and constraints, the cluster created a bunch of lrm_resource (local resource manager) objects in the status section of the XML CIB. You can’t see these from the crm shell, but you can see them if you dump the XML out using cibadmin --query > cluster.xml.
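As a rough sketch of how to pull those records out of the dump without reading raw XML, the Python below walks the status section and lists every lrm_resource with its recorded operations. The SAMPLE fragment is hand-written for illustration and is not a real dump from my cluster; the function name list_lrm_records is my own.

```python
import xml.etree.ElementTree as ET

# Hand-written, trimmed-down CIB fragment for illustration only.
# A real dump from `cibadmin --query` carries many more attributes.
SAMPLE = """
<cib>
  <configuration/>
  <status>
    <node_state id="app-03" uname="app-03">
      <lrm id="app-03">
        <lrm_resources>
          <lrm_resource id="app-03-stonith">
            <lrm_rsc_op id="app-03-stonith_start_0" operation="start" rc-code="1"/>
          </lrm_resource>
        </lrm_resources>
      </lrm>
    </node_state>
  </status>
</cib>
"""


def list_lrm_records(cib_xml):
    """Return (node, resource id, operation id, rc-code) tuples from the status section."""
    root = ET.fromstring(cib_xml.strip())
    records = []
    for node_state in root.iter("node_state"):
        node = node_state.get("uname")
        for rsc in node_state.iter("lrm_resource"):
            for op in rsc.iter("lrm_rsc_op"):
                records.append((node, rsc.get("id"), op.get("id"), op.get("rc-code")))
    return records


if __name__ == "__main__":
    for record in list_lrm_records(SAMPLE):
        print(record)
```

To run it against a real dump, pass open("cluster.xml").read() instead of SAMPLE.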

I was getting some strange errors. For example, I had a location constraint on a resource named app-03-stonith such that its score for running on app-03 was -inf. Nothing actually failed because of this, but the following message showed up in both crm_mon and the info log:

May  7 14:02:09 app-03 pengine: [6838]: info: unpack_rsc_op:
  app-03-stonith_start_0 on app-03 returned 1 (unknown error) instead of the
  expected value: 0 (ok)

Since I misconfigured a bunch of stuff at first and didn’t have that location rule in place, the results of those misconfigurations apparently made it into the lrm records and weren’t cleaned up automatically. Deleting the resource objects from the cluster configuration didn’t clear the lrm records, so recreating the objects later left the faulty records in place. Worse, deleting an object entirely still left its records behind.

I headed into the resource section of the crm shell to play around, and under resource I found the cleanup command. I ran it against all the nodes with the names of all of the objects that I’d removed. Cleanup removed the lrm records for resources that still existed in the configuration, but it wouldn’t touch the records for resources I’d already deleted.
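Running cleanup for every resource on every node gets tedious by hand. Here is a small sketch that only builds the command lines, assuming the crm shell form crm resource cleanup <rsc> <node>; the helper name cleanup_commands is hypothetical, and nothing is executed unless you pass the results to subprocess yourself.

```python
from itertools import product


def cleanup_commands(resources, nodes):
    """Build one `crm resource cleanup <rsc> <node>` argv list per resource/node pair."""
    return [
        ["crm", "resource", "cleanup", rsc, node]
        for rsc, node in product(resources, nodes)
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in cleanup_commands(["app-03-stonith", "app-04-stonith"],
                                 ["app-03", "app-04", "app-05"]):
        print(" ".join(argv))
```

Printing first and running second seemed the safer order, given how I got into this mess.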

On any node, running the crm_verify command would output warnings like this:

app-04:~ # crm_verify -LV
crm_verify[13501]: 2009/05/07_16:52:16 WARN: process_orphan_resource: Nothing
  known about resource app-04-stonith running on app-05
crm_verify[13501]: 2009/05/07_16:52:16 WARN: process_orphan_resource: Nothing
  known about resource app-03-stonith running on app-05
Warnings found during check: config may not be valid
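Those warnings are about lrm records for resources that no longer exist in the configuration. A minimal sketch for spotting them in a dumped CIB: collect the ids of the configured primitives, then flag any lrm_resource whose id isn’t among them. The SAMPLE fragment and the app-ip resource are made up for illustration, and the simple id comparison is an assumption that ignores clone instances, which carry a suffix on their ids.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment for illustration; "app-ip" is a made-up resource.
SAMPLE = """
<cib>
  <configuration>
    <resources>
      <primitive id="app-ip"/>
    </resources>
  </configuration>
  <status>
    <node_state id="app-05" uname="app-05">
      <lrm id="app-05">
        <lrm_resources>
          <lrm_resource id="app-ip"/>
          <lrm_resource id="app-04-stonith"/>
          <lrm_resource id="app-03-stonith"/>
        </lrm_resources>
      </lrm>
    </node_state>
  </status>
</cib>
"""


def find_orphans(cib_xml):
    """Return sorted (resource id, node) pairs that have lrm records but no primitive."""
    root = ET.fromstring(cib_xml.strip())
    # Ids of every primitive defined in the configuration section.
    configured = {p.get("id") for p in root.iter("primitive")}
    orphans = set()
    for node_state in root.iter("node_state"):
        for rsc in node_state.iter("lrm_resource"):
            if rsc.get("id") not in configured:
                orphans.add((rsc.get("id"), node_state.get("uname")))
    return sorted(orphans)


if __name__ == "__main__":
    for rsc_id, node in find_orphans(SAMPLE):
        print(f"orphan lrm record: {rsc_id} on {node}")
```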

Re-creating the resources app-04-stonith and app-03-stonith, putting proper location constraints in place (no node can host its own remote stonith service) and then running cleanup against all of them resulted in a “clean” CIB and no errors output by crm_verify.

Looks like the delete command is a little sloppy. I wish I could pin this down a little better for a ‘real’ bug report, but I can’t reproduce whatever newbie errors got the bad records created in the first place.

