Check the Control Plane (CP)
To verify and debug the control plane resource of a cluster managed by the Cluster API Provider for Tinkerbell (CAPT), the resource responsible for managing the Kubernetes control plane nodes, follow these steps:
1. Verify the Control Plane Resource Creation
The first step is to ensure that the KubeadmControlPlane resource, which manages the control plane, has been created correctly and is in the desired state.
Check KubeadmControlPlane Resource Status:
Use kubectl to check the status of the KubeadmControlPlane resource:
kubectl get kubeadmcontrolplanes -A
- Expected Output: You should see the KubeadmControlPlane resources listed with their associated namespaces. The READY column should show the number of ready replicas compared to the desired number, indicating that the control plane is in a healthy state.
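The comparison between ready and desired replicas can also be scripted. The following is a minimal sketch, not part of CAPT itself: the function names kcp_unready and check_kcp are hypothetical, and it assumes that kubectl points at the management cluster and that the resource exposes spec.replicas and status.readyReplicas.

```shell
# Print KubeadmControlPlane objects whose ready replica count differs from the desired count.
# Input (stdin): tab-separated lines "namespace<TAB>name<TAB>desired<TAB>ready".
kcp_unready() {
  awk -F'\t' 'NF >= 3 {
    ready = ($4 == "" ? 0 : $4)   # a missing readyReplicas value is treated as 0
    if (ready + 0 != $3 + 0) printf "%s/%s: %d/%d ready\n", $1, $2, ready, $3
  }'
}

# Wrapper (hypothetical helper): query every namespace and report mismatches.
check_kcp() {
  kubectl get kubeadmcontrolplanes -A \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.spec.replicas}{"\t"}{.status.readyReplicas}{"\n"}{end}' \
    | kcp_unready
}
```

Running check_kcp prints nothing when every control plane has all of its replicas ready, which makes it convenient in a periodic check.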
Describe the KubeadmControlPlane Resource:
For more details, you can describe the KubeadmControlPlane resource:
kubectl describe kubeadmcontrolplane <control-plane-name> -n <namespace>
- Expected Output: This command provides detailed information about the KubeadmControlPlane resource, including its current status, conditions, events, and any errors or warnings.
2. Check Associated Machines
The KubeadmControlPlane resource manages a set of Machine resources that represent the control plane nodes. Ensure that these Machines are being created and managed correctly.
Check Machines Managed by KubeadmControlPlane:
kubectl get machines -l cluster.x-k8s.io/cluster-name=<cluster-name>,cluster.x-k8s.io/control-plane -n <namespace>
- Expected Output: This command lists all Machines associated with the KubeadmControlPlane. These Machines should be in a Running or Provisioned state.
Describe the Machines:
For more detailed debugging, describe one of the Machines:
kubectl describe machine <machine-name> -n <namespace>
- Expected Output: Look for conditions, events, or error messages that might indicate issues with the Machines. The status should reflect that the Machines are in a healthy state and have been provisioned correctly.
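To spot unhealthy Machines quickly, the phase of each Machine can be filtered. This is a sketch under assumptions: the helper names are hypothetical, kubectl is assumed to target the management cluster, and control plane Machines are assumed to carry the cluster.x-k8s.io/control-plane label.

```shell
# List Machines whose phase is neither Running nor Provisioned.
# Input (stdin): tab-separated lines "machine-name<TAB>phase".
machines_not_ready() {
  awk -F'\t' '$1 != "" && $2 != "Running" && $2 != "Provisioned" {
    printf "%s\t%s\n", $1, ($2 == "" ? "<no phase yet>" : $2)
  }'
}

# Wrapper (hypothetical helper): $1 is the namespace to inspect.
check_machines() {
  kubectl get machines -n "$1" -l cluster.x-k8s.io/control-plane \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' \
    | machines_not_ready
}
```

Any Machine printed by check_machines is a good candidate for the describe command shown above.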
3. Verify Control Plane Node Registration
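Ensure that the control plane nodes are successfully registered as nodes in the Kubernetes cluster. Node objects live in the workload cluster, so if your management cluster is a separate cluster, run the commands in this step with the workload cluster's kubeconfig.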
Check Node Status:
kubectl get nodes
- Expected Output: The nodes corresponding to the Machines in your KubeadmControlPlane should be listed in the Kubernetes cluster, with a status of Ready.
Describe Nodes:
If a node is not Ready or is missing, describe the node for more information:
kubectl describe node <node-name>
- Expected Output: This output will provide details on why the node might not be Ready, such as issues with the kubelet, network configuration, or connectivity to the control plane.
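The readiness check can be narrowed to the nodes that need attention. The sketch below is illustrative only: the function names are hypothetical, and the kubeconfig path passed to check_nodes is assumed to belong to the workload cluster.

```shell
# Print nodes whose Ready condition is not "True".
# Input (stdin): tab-separated lines "node-name<TAB>ready-status".
nodes_not_ready() {
  awk -F'\t' '$1 != "" && $2 != "True" { print $1 }'
}

# Wrapper (hypothetical helper): $1 is the path to the workload cluster kubeconfig.
check_nodes() {
  kubectl --kubeconfig "$1" get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' \
    | nodes_not_ready
}
```

If clusterctl is installed, the kubeconfig can usually be obtained with clusterctl get kubeconfig <cluster-name> -n <namespace> > workload.kubeconfig, after which check_nodes workload.kubeconfig lists only the nodes that are not Ready.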
4. Check Associated Infrastructure Resources
The KubeadmControlPlane interacts with infrastructure-specific resources like TinkerbellMachineTemplate to define the configuration of the control plane nodes.
Check TinkerbellMachineTemplate:
kubectl get tinkerbellmachinetemplates -A
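- Expected Output: The TinkerbellMachineTemplate associated with your KubeadmControlPlane should be listed. Template resources are static definitions and generally do not report a Ready state, so confirm that the expected template exists rather than looking for a readiness column.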
Describe the TinkerbellMachineTemplate:
kubectl describe tinkerbellmachinetemplate <template-name> -n <namespace>
- Expected Output: This command provides detailed information about the template, ensuring that the correct settings (e.g., hardware profile, OS image) are applied to the Machines created by the KubeadmControlPlane.
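When several templates exist, it helps to look up the one that the control plane actually references. The following sketch assumes the v1beta1 field path spec.machineTemplate.infrastructureRef.name, which may differ in other Cluster API versions; the function names are hypothetical.

```shell
# Print the name of the infrastructure template referenced by a KubeadmControlPlane.
# $1 = control plane name, $2 = namespace.
kcp_template_name() {
  kubectl get kubeadmcontrolplane "$1" -n "$2" \
    -o jsonpath='{.spec.machineTemplate.infrastructureRef.name}'
}

# Describe that template; stop early if no reference was found.
describe_kcp_template() {
  tmpl=$(kcp_template_name "$1" "$2") || return 1
  [ -n "$tmpl" ] || { echo "no infrastructureRef found on $1" >&2; return 1; }
  kubectl describe tinkerbellmachinetemplate "$tmpl" -n "$2"
}
```

Calling describe_kcp_template <control-plane-name> <namespace> avoids guessing the template name by hand.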
5. Check for Control Plane Initialization
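Ensure that the control plane nodes have been correctly initialized. This is often indicated by the presence of conditions like Initialized, ControlPlaneInitialized, and Ready in the KubeadmControlPlane resource. Exact condition names depend on the Cluster API version; in some versions ControlPlaneInitialized is reported on the Cluster resource rather than on the KubeadmControlPlane.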
Check Initialization Conditions:
kubectl describe kubeadmcontrolplane <control-plane-name> -n <namespace>
- Expected Output: The KubeadmControlPlane resource should show conditions such as Initialized and Ready, indicating that the control plane nodes have been successfully initialized.
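Instead of re-running describe until the conditions change, kubectl wait can block until a condition becomes true. This is a small sketch; the function name and the 10 minute default timeout are assumptions, and it relies on the resource exposing a Ready condition as described above.

```shell
# Block until the KubeadmControlPlane reports condition Ready=True, or time out.
# $1 = control plane name, $2 = namespace, $3 = timeout (optional, defaults to 10m).
wait_for_kcp_ready() {
  kubectl wait --for=condition=Ready \
    "kubeadmcontrolplane/$1" -n "$2" --timeout="${3:-10m}"
}
```

The command exits with a non-zero status if the timeout is reached first, so it can gate later steps in a script.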
6. Check Events in the Namespace
Kubernetes events can provide insights into what might be going wrong with the control plane or its associated components.
List Events in the Namespace:
kubectl get events -n <namespace> --sort-by='.metadata.creationTimestamp'
- Expected Output: Review any warning or error events that might indicate problems with the control plane, such as failed node initialization, issues with etcd, or errors from the Tinkerbell infrastructure.
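In a busy namespace the Normal events can drown out the useful ones. A field selector narrows the list to warnings; the sketch below wraps that in a hypothetical helper named warning_events.

```shell
# Show only Warning events in a namespace, oldest first.
# $1 = namespace.
warning_events() {
  kubectl get events -n "$1" \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}
```

For example, warning_events <namespace> prints only the events whose type is Warning.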
7. Verify Control Plane Scaling
The KubeadmControlPlane resource can manage scaling of the control plane. Ensure that the number of control plane replicas is being maintained as expected.
Check Replica Status:
kubectl get kubeadmcontrolplanes -A
- Expected Output: The REPLICAS, READY, and UPDATED columns should show that the desired number of control plane nodes is being maintained.
Scale the Control Plane:
You can test scaling by adjusting the number of replicas:
kubectl scale kubeadmcontrolplane <control-plane-name> --replicas=<desired-number> -n <namespace>
- Expected Output: The KubeadmControlPlane should create or delete control plane nodes to match the desired replica count. Verify that the number of control plane nodes corresponds to the new replica count.
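When etcd runs stacked on the control plane nodes, an odd replica count is needed to keep quorum, and even counts are generally rejected for such setups. The sketch below adds that guard in front of the scale command; the function name scale_kcp is hypothetical and the guard is an illustration, not CAPT behavior.

```shell
# Scale a KubeadmControlPlane, refusing non-numeric or even replica counts.
# $1 = control plane name, $2 = namespace, $3 = desired replicas.
scale_kcp() {
  case "$3" in
    ''|*[!0-9]*)
      echo "replicas must be a non-negative integer" >&2; return 2 ;;
    *[13579]) ;;   # last digit is odd: allowed
    *)
      echo "refusing replicas=$3: use an odd count (1, 3, 5, ...)" >&2; return 2 ;;
  esac
  kubectl scale kubeadmcontrolplane "$1" --replicas="$3" -n "$2"
}
```

For example, scale_kcp <control-plane-name> <namespace> 3 runs the scale command, while a count of 2 is refused before kubectl is called.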
8. Check Logs for CAPT Controller
If there are issues with the control plane resource, checking the logs of the CAPT controller can provide additional insights.
Check Logs for CAPT Controller:
kubectl logs -n capt-system <pod-name>
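Replace <pod-name> with the actual pod name of the CAPT controller. The CAPT controller reconciles the Tinkerbell infrastructure resources; the KubeadmControlPlane resource itself is reconciled by the kubeadm control plane controller, which in a default clusterctl installation runs in the capi-kubeadm-control-plane-system namespace, so check its logs as well.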
- Expected Output: The logs should detail any errors or issues encountered by the controller when managing the control plane resource, including interactions with Tinkerbell or Kubernetes API.
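Controller logs are verbose, so filtering for error-like lines is a useful first pass. The sketch below is a rough heuristic: the helper names are hypothetical, the match on "error" or "failed" is an assumption about how problems appear in the log text, and the Deployment name capt-controller-manager is taken from the verbosity step of this guide.

```shell
# Keep only log lines that look like errors (case-insensitive match on "error" or "failed").
log_errors() {
  grep -iE 'error|failed' || true   # grep exits 1 when nothing matches; treat that as success
}

# Wrapper (hypothetical helper): read the last 500 log lines of the CAPT controller Deployment.
capt_errors() {
  kubectl logs -n capt-system deployment/capt-controller-manager --tail=500 | log_errors
}
```

Addressing the Deployment instead of a pod lets kubectl pick a pod for you, which avoids looking up the generated pod name first.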
9. Advanced Debugging with Increased Verbosity
If you are still having trouble identifying the issue, you can increase the verbosity of the CAPT controller to gather more detailed logs:
- Edit the Deployment:
kubectl edit deployment capt-controller-manager -n capt-system
- Add Verbosity Flag:
Add --v=5 or --v=10 to the container's args (or command) section to enable more detailed logging. Saving the change restarts the controller, so the pod name will change.
- Check Logs Again:
kubectl logs -n capt-system <pod-name>
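The same change can be applied without an interactive editor by using a JSON patch. This is a sketch under assumptions: it presumes the controller is the first container in the pod template and that the container already has an args list; if either assumption does not hold, the patch fails and the manual edit above is the safer route. The function name is hypothetical.

```shell
# Append a verbosity flag to the args of the first container in the CAPT Deployment.
# $1 = verbosity level (for example 5).
set_capt_verbosity() {
  kubectl patch deployment capt-controller-manager -n capt-system --type=json \
    -p "[{\"op\":\"add\",\"path\":\"/spec/template/spec/containers/0/args/-\",\"value\":\"--v=$1\"}]"
}
```

After set_capt_verbosity 5, kubectl rollout status deployment/capt-controller-manager -n capt-system can be used to wait for the new pod before reading the logs. Note that the patch appends a flag each time it runs, so remove the extra argument once debugging is finished.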
10. Interact with Tinkerbell via Tink CLI (Optional)
If you have direct access to the Tink CLI, you can interact with Tinkerbell resources directly to verify that the control plane nodes are being provisioned as expected:
tink hardware list
tink workflow list
- Expected Output: The hardware and workflow related to the control plane nodes in the KubeadmControlPlane should be listed and should indicate whether the provisioning tasks have succeeded.
Conclusion
By following these steps, you can systematically verify and debug the KubeadmControlPlane resource and its interactions with other components in the Cluster API Provider for Tinkerbell. This process helps confirm that your Kubernetes control plane is correctly provisioned, managed, and scaled on bare-metal infrastructure.