10964 ExecStop=/usr/sbin/corosync-cfgtool -H --force (code=exited, status=1/FAILURE)

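The error you encountered (`ExecStop=/usr/sbin/corosync-cfgtool -H --force (code=exited, status=1/FAILURE)`) means that the stop command of the corosync unit, `corosync-cfgtool -H --force`, exited with a non-zero status. This often happens when the command cannot reach a running Corosync daemon, for example because Corosync had already crashed or never started, so the message is frequently a symptom of an earlier failure (such as a configuration problem) rather than the root cause. Here are steps to troubleshoot and potentially resolve this problem: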

1. **Check Corosync Configuration Files**: Verify the contents of the Corosync configuration file `/etc/corosync/corosync.conf` and, on Proxmox VE, the cluster-wide copy at `/etc/pve/corosync.conf`, plus any included files in `/etc/corosync/service.d/` if your setup uses them. Ensure that the node definitions (names, node IDs, ring addresses), network settings, and any additional parameters specific to your cluster setup are correct and identical on every node.

2. **Review Recent Changes**: If the cluster was previously functioning correctly and this issue occurred after changes were made, review those changes carefully. Revert any recent configuration changes or updates that might have caused the problem.

3. **Check for Corrupted Configuration Files**: Ensure that the Corosync configuration files are not corrupted or truncated. Look for syntax errors such as unbalanced braces, misspelled keys, or invalid values; these can prevent Corosync from starting, in which case the later stop command may fail as well.

4. **Verify Cluster Status**: Check the status of the cluster using tools like `pvecm status` or `corosync-quorumtool` on each node. Ensure that all nodes are up and reachable, and there are no issues with cluster communication.

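5. **Check System Logs**: Inspect the system logs for error messages or warnings related to Corosync, using `journalctl -u corosync` on systemd-based systems or the log files `/var/log/syslog` or `/var/log/messages`, depending on the distribution. Pay particular attention to entries logged just before the stop command failed, since they may point to the underlying cause.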

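6. **Restart Corosync Service**: Try restarting the Corosync service with `systemctl restart corosync`, one node at a time rather than on all nodes at once, so that the remaining nodes can keep quorum. Be especially careful if HA is enabled, because a node that loses quorum may be fenced. Monitor the output and the logs for any error messages or warnings that might provide additional information about the problem.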

7. **Check Resource Availability**: Ensure that the system has enough resources (CPU, memory, disk space) available to run Corosync and other cluster-related services. Resource constraints could lead to failures during startup or shutdown processes.

8. **Update Corosync**: If you are not already using the latest version of Corosync, consider updating to the latest stable release. Newer versions often include bug fixes and improvements that could address the issue you are experiencing.

9. **Reinstall Corosync**: If the problem persists and you suspect that the Corosync installation may be corrupted, consider reinstalling Corosync on the affected nodes. Make sure to back up any important configuration files before doing so.

10. **Seek Community Support**: If you are still unable to resolve the issue, consider seeking help from the Proxmox community forums or mailing lists. Other users may have encountered similar problems and can provide assistance or guidance.
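
The status and log checks in steps 4 and 5 can be collected in one pass. The following is a minimal sketch, not an official tool: `run_if_present` is a hypothetical helper name introduced here, and the script assumes a systemd-based node where the unit is called `corosync`. It only reads state and does not change the cluster.

```shell
#!/bin/sh
# Read-only diagnostics for a node where the corosync unit failed to stop.
# run_if_present is a hypothetical helper, not part of Corosync or Proxmox.

# Run a command only if it is installed; report instead of failing otherwise.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $* =="
        "$@" 2>&1 || echo "(exit status $?)"
    else
        echo "== $1: not installed, skipped =="
    fi
}

# Unit state and the most recent log lines from the current boot.
run_if_present systemctl status corosync --no-pager
run_if_present journalctl -u corosync -b --no-pager -n 50

# Cluster membership and quorum as seen from this node.
run_if_present pvecm status
run_if_present corosync-quorumtool -s

# Link/ring status; this also fails if the daemon is not running.
run_if_present corosync-cfgtool -s
```

If the last two commands report that they cannot connect, Corosync is probably not running on that node, and the `journalctl` output is likely the best place to look for the reason.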

By following these steps, you should be able to diagnose and resolve the issue with the Corosync service failing during the stop process in your Proxmox cluster.
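
For the configuration checks in steps 1 and 3, one quick test on a Proxmox VE node is whether the cluster-wide file and the node-local copy are identical. The sketch below assumes the default Proxmox VE paths `/etc/pve/corosync.conf` and `/etc/corosync/corosync.conf`; `compare_conf` is a hypothetical helper name, and the script only reads the files.

```shell
#!/bin/sh
# Compare the cluster-wide corosync.conf with the node-local copy (read-only).
# compare_conf is a hypothetical helper; the paths below are assumed defaults.

compare_conf() {
    cluster_conf=$1
    local_conf=$2
    # Both files must be readable before they can be compared.
    for f in "$cluster_conf" "$local_conf"; do
        if [ ! -r "$f" ]; then
            echo "cannot read $f"
            return 2
        fi
    done
    if cmp -s "$cluster_conf" "$local_conf"; then
        echo "configs match"
    else
        echo "configs differ"
        # Show what differs so the mismatch can be reviewed by hand.
        diff -u "$cluster_conf" "$local_conf"
        return 1
    fi
}

compare_conf /etc/pve/corosync.conf /etc/corosync/corosync.conf \
    || echo "(exit status $?)"
```

A difference does not by itself identify the fault, but it suggests that a configuration change was not propagated to this node, which is worth resolving before restarting the service.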
