sm-notify: sm-notify leaves monitor records in sm.bak

sm-notify fails to remove monitor records from sm.bak when it has
finally notified a host. This is because of a recent change to send
two SM_NOTIFY requests for each monitored peer: one with the local
host's FQDN, and one with an unqualified version of the same. That change
was commit baa41b2c: "sm-notify: Send fully-qualified and unqualified
mon_names" (March 19, 2010).

Because of the March 2010 commit, sm-notify modifies the "my_name"
string during notification, but then uses this modified string to try
to find the monitor record to remove. The search fails because the
record was stored under the original string, so a stale monitor record
is left behind in sm.bak.
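
The failure mode can be illustrated with a small standalone sketch. It
is not the sm-notify code: struct record, find_record() and
unqualify_in_place() are hypothetical names, and the assumption is only
that records are matched on the exact my_name string they were stored
under.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one monitor record; not the real nsm_host. */
struct record {
	char my_name[64];
};

/* Hypothetical lookup: a record matches only on its exact my_name. */
static struct record *find_record(struct record *recs, size_t n,
				  const char *my_name)
{
	assert(recs != NULL && my_name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(recs[i].my_name, my_name) == 0)
			return &recs[i];
	return NULL;
}

/* Truncate a hostname at its first dot, modifying the buffer itself. */
static void unqualify_in_place(char *name)
{
	char *dot = strchr(name, '.');

	if (dot != NULL)
		*dot = '\0';
}

/*
 * Store a record under the FQDN, truncate the lookup key in place,
 * then search with the truncated key. Returns 1 if the record is
 * still found, 0 if the search misses.
 */
static int lookup_after_truncation(void)
{
	struct record recs[1];
	char key[64];

	strcpy(recs[0].my_name, "client.example.com");
	strcpy(key, recs[0].my_name);
	unqualify_in_place(key);	/* key is now "client" */
	return find_record(recs, 1, key) != NULL;
}
```

In this sketch the search misses, which mirrors why the record is never
removed.
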
Aside from leaving stale records behind, this causes the same hosts to be
notified after every reboot, even if they successfully responded to
the previous SM_NOTIFY and had no contact with us during the last
boot.

I also noticed that the trick of truncating the argument of SM_NOTIFY
doesn't work at all if a substitute "my_name" was specified via the "-v"
command line option. This patch attempts to address that as well.

sm-notify should preserve the original my_name string so that
nsm_delete_host() can find the correct monitor record to delete. Also
add some degree of protection to the mon_name and my_name strings in
each nsm_host record to prevent a future change from breaking this
dependency.
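
One way to picture the approach described above, as a rough sketch
rather than the patch itself: keep the stored names read-only and
derive the unqualified name into a separate buffer. struct host_sketch
and unqualified_copy() are hypothetical names, and const-qualifying the
fields is an assumption about what "protection" could look like.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical record. The const qualifiers make an accidental
 * in-place edit of either name a compile-time error.
 */
struct host_sketch {
	const char *mon_name;
	const char *my_name;
};

/*
 * Return a newly allocated copy of name, cut at the first dot.
 * The original string is left untouched. The caller frees the
 * result. Returns NULL if allocation fails.
 */
static char *unqualified_copy(const char *name)
{
	size_t len;
	char *copy;

	assert(name != NULL);
	len = strcspn(name, ".");	/* length up to the first dot */
	copy = malloc(len + 1);
	if (copy == NULL)
		return NULL;
	memcpy(copy, name, len);
	copy[len] = '\0';
	return copy;
}
```

With this shape, the unqualified name is sent from the copy while the
original my_name remains available for the later record lookup.
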
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>