Managing Node Lifecycle

This section describes how to manage the lifecycle of nodes in HYPER-AI, including registration, unregistration, and self-advertisement.

Node Registration

Overview

The node registration process prepares a worker node and joins it to the Kubernetes cluster. This automated process handles all necessary configuration steps, including:

  • Setting up the hostname

  • Disabling swap

  • Installing and configuring containerd

  • Installing Kubernetes components (kubelet, kubeadm, kubectl)

  • Configuring kernel modules and networking

  • Joining the node to the cluster
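
Some of these steps can be checked by hand once the preparation script has finished. The following is a minimal, read-only sketch and is not part of hypertool. The helper names are illustrative, and the module names in the usage note (overlay, br_netfilter) are an assumption based on the usual Kubernetes container runtime prerequisites, not something this tool documents.

```shell
# Read-only checks; illustrative helpers, not hypertool commands.

# Succeeds if the swaps table ($1, default /proc/swaps) lists an active swap
# area. The table has one header line; every further line is an active area.
swap_active() {
  awk 'NR > 1 { found = 1 } END { exit !found }' "${1:-/proc/swaps}"
}

# Succeeds if kernel module $1 appears in the module list ($2, default
# /proc/modules). Modules compiled into the kernel are not listed there.
module_loaded() {
  awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

For example, `swap_active && echo "swap is still enabled"` flags a node where swap was not disabled, and `module_loaded br_netfilter || echo "br_netfilter not loaded"` reports a missing module.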

Note

Before registering a node, you must obtain the following credentials from the control plane:

  • TOKEN: A bootstrap token for authentication

  • DISCOVERY_CA_HASH: The CA certificate hash for secure cluster discovery

Both values are required: the token ensures that only authorized nodes can join the cluster, and the CA hash lets the joining node confirm that it is talking to the intended control plane.
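
The token and hash formats match the kubeadm bootstrap-token flow. Assuming the control plane is managed with kubeadm (an assumption; this page does not say how the control plane was set up), the sketch below shows one way to produce both values and to sanity-check them before typing them into the prompt. The function names are illustrative, and /etc/kubernetes/pki/ca.crt is the kubeadm default CA location.

```shell
# Run on the control plane. Assumes a kubeadm-managed cluster; HYPER-AI may
# provide its own way to obtain these values.

# Print a new bootstrap token (kubeadm tokens expire after 24 hours by default).
new_token() {
  kubeadm token create
}

# Print the SHA-256 hash of the cluster CA public key as 64 hex characters.
# $1 = path to the CA certificate (default: the kubeadm location).
ca_hash() {
  openssl x509 -pubkey -noout -in "${1:-/etc/kubernetes/pki/ca.crt}" \
    | openssl pkey -pubin -outform der \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //'
}

# Succeeds if $1 has the bootstrap token shape: 6 characters, a dot,
# 16 characters, all lowercase letters or digits.
valid_token() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Succeeds if $1 is exactly 64 lowercase hexadecimal characters.
valid_ca_hash() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[0-9a-f]{64}$'
}
```

The two validators only check the shape of a value, not whether the cluster will accept it. They are useful for catching copy-and-paste truncation before running the registration.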

Command

To register a node with the cluster, run the following command on the worker node:

$ hypertool run register

Example Output

$ hypertool run register
[*] Running node preparation script...
[*] Setting hostname to: worker-node
[*] Disabling swap if any...
[*] Updating apt and installing base deps...
[*] Installing containerd...
[*] Configuring containerd...
...
[✓] Node preparation completed successfully!

[*] Now we need to join the cluster...
Please enter the TOKEN: abcdef.0123456789abcdef
Please enter the DISCOVERY_CA_HASH: 1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef
[*] Joining cluster with provided credentials...
[✓] Successfully joined the cluster!

Verification

After successful registration, verify that the node has joined the cluster by running the following command from a machine with kubectl access to the cluster (typically the control plane):

$ kubectl get nodes

The newly registered worker node should appear in the list of nodes.
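
A node can appear in the list before it is ready to run workloads. The sketch below, which is not part of hypertool, reads the STATUS column from the kubectl output so that a script can tell the two cases apart. The helper names are illustrative.

```shell
# Print the STATUS column for node $1, reading the output of
# `kubectl get nodes --no-headers` from stdin.
node_status() {
  awk -v n="$1" '$1 == n { print $2 }'
}

# Succeeds if node $1 is listed as Ready (including "Ready,SchedulingDisabled").
node_ready() {
  case "$(node_status "$1")" in
    Ready|Ready,*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Usage: `kubectl get nodes --no-headers | node_ready worker-node && echo "worker-node is Ready"`. To block until the node becomes ready instead of polling, `kubectl wait --for=condition=Ready node/worker-node --timeout=120s` is an alternative.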

Node Unregistration

Unregistration Overview

The node unregistration process safely removes a worker node from the Kubernetes cluster. This automated process ensures graceful removal by:

  • Cordoning the node to prevent new pods from being scheduled

  • Draining the node to evict all running pods

  • Removing the node from the cluster
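
For reference, the three steps correspond to standard kubectl operations. The sketch below prints the equivalent commands for a given node so they can be reviewed first. It is an approximation: the exact commands and flags that hypertool runs are not documented here, and the drain flags shown are an assumption.

```shell
# Print the kubectl commands that correspond to the three steps for node $1.
# Nothing is executed; the output is a plan for review.
unregister_plan() {
  printf 'kubectl cordon %s\n' "$1"
  printf 'kubectl drain %s --ignore-daemonsets --delete-emptydir-data\n' "$1"
  printf 'kubectl delete node %s\n' "$1"
}
```

Running `unregister_plan worker-node` prints the three commands in order. Note that `kubectl drain` marks the node unschedulable on its own, so the separate cordon step mainly makes the sequence explicit.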

Unregister Command

To unregister a node from the cluster, run the following command on the worker node:

$ hypertool run unregister

Unregister Example

$ hypertool run unregister
[*] Cordoning node worker-node...
[✓] Node cordoned successfully
[*] Draining node worker-node...
[*] Evicting pods...
[✓] Node drained successfully
[*] Removing node from cluster...
[✓] Node unregistered successfully!

Warning

Unregistering a node evicts all pods running on it. Before you unregister, ensure that your workloads have enough replicas on other nodes and have PodDisruptionBudgets configured, so that they remain available during the process.
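
As an illustration of the second point, the sketch below prints a minimal PodDisruptionBudget manifest. The helper name, the example values, and the use of an "app" label to select pods are all assumptions; adapt the selector to the labels your workloads actually carry.

```shell
# Print a PodDisruptionBudget manifest.
#   $1 = budget name
#   $2 = value of the pods' "app" label (assumed labeling scheme)
#   $3 = minimum number of pods that must stay available during evictions
pdb_manifest() {
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: $1
spec:
  minAvailable: $3
  selector:
    matchLabels:
      app: $2
EOF
}
```

For example, `pdb_manifest web-pdb web 2 | kubectl apply -f -` would ask the cluster to keep at least two pods labeled app=web running while the node is drained. Here web-pdb and web are hypothetical names.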

Node Self-Advertisement

Self-Advertisement Overview

The self-advertisement mechanism lets a node report its capabilities and attributes to HYPER-AI.

Self-Advertisement Command

To advertise node attributes to a specific host, run the following command:

$ hypertool run self-advertisement HOST

Where HOST is the IP address or hostname of the target endpoint that will receive the node’s attributes.

Self-Advertisement Example

$ hypertool run self-advertisement 192.168.1.100
[*] Collecting node attributes...
[*] Sending advertisement to 192.168.1.100...
[✓] Node attributes advertised successfully!

The advertised data is sent in JSON format and describes the node’s capabilities.
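
The actual payload schema is not documented in this section. Purely to illustrate what a JSON description of a node might look like, the sketch below builds an object from three basic attributes. The field names (hostname, arch, cpus) and the helper name are hypothetical and should not be read as the format hypertool sends.

```shell
# Print a JSON object with three illustrative fields (hypothetical schema).
#   $1 = hostname, $2 = CPU architecture, $3 = CPU count
# Values are assumed to contain no characters that need JSON escaping.
attributes_json() {
  printf '{"hostname":"%s","arch":"%s","cpus":%s}\n' "$1" "$2" "$3"
}
```

On a typical Linux host this could be called as `attributes_json "$(hostname)" "$(uname -m)" "$(getconf _NPROCESSORS_ONLN)"`.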