Functionality

In this section, we introduce the utility functions used to calculate HyperTool-specific attributes for Native Nodes. For each function, we specify how the resulting attribute is exposed in Kubernetes (as a label or an annotation), along with its dependencies, parameters, example output, and error handling.

interface_name_and_type

The interface_name_and_type annotations describe the name and type of the network interface used to reach a specific IP destination, such as the Kubernetes API server. The lookup is performed with the pyroute2 library.

The output provides the interface name and network type (e.g., ethernet, wireless) that the system would use to route traffic to the specified IP address. This information is extracted by querying the system’s routing table.

Kubernetes type: Annotation

Dependencies:

  • pyroute2 : IPRoute class for routing table queries

Parameters:

  • ip (str): The IP address of the Kubernetes API server.

Example Output:

  • hyperai.eu/node-interface: eth0

  • hyperai.eu/node-network-type: ethernet

Error Handling:

If no route is found or an error occurs, the function returns default values:

  • hyperai.eu/node-interface: "unknown"

  • hyperai.eu/node-network-type: "unknown"
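
The routing-table lookup described above can be sketched as follows. This is a minimal illustration, not the HyperTool implementation: the function body, the lazy import, and the rule used to tell wireless from ethernet (checking for /sys/class/net/<name>/wireless on Linux) are assumptions. Only the annotation keys and the "unknown" defaults come from the text above.

```python
from pathlib import Path
from typing import Dict

DEFAULTS = {
    "hyperai.eu/node-interface": "unknown",
    "hyperai.eu/node-network-type": "unknown",
}


def interface_name_and_type(ip: str) -> Dict[str, str]:
    """Return the interface the kernel would use to reach `ip`, plus its type."""
    try:
        # Imported lazily so that a missing library also yields the defaults.
        from pyroute2 import IPRoute

        with IPRoute() as ipr:
            # Ask the kernel which route it would pick for this destination.
            routes = list(ipr.route("get", dst=ip))
            if not routes:
                return dict(DEFAULTS)
            oif = routes[0].get_attr("RTA_OIF")  # index of the outgoing interface
            if oif is None:
                return dict(DEFAULTS)
            links = list(ipr.get_links(oif))
            name = links[0].get_attr("IFLA_IFNAME") if links else None
    except Exception:
        # No route, no netlink access, or pyroute2 not installed.
        return dict(DEFAULTS)

    if not name:
        return dict(DEFAULTS)

    # Assumption: on Linux, wireless interfaces expose a "wireless" directory
    # in sysfs; everything else is reported as ethernet in this sketch.
    is_wireless = Path(f"/sys/class/net/{name}/wireless").exists()
    return {
        "hyperai.eu/node-interface": name,
        "hyperai.eu/node-network-type": "wireless" if is_wireless else "ethernet",
    }
```

Catching every exception in one place keeps the fallback behaviour uniform: any failure produces the two "unknown" values instead of a partial result.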

node_available_interfaces

The node_available_interfaces annotation provides a list of all network interfaces available on the node. This information is useful for understanding the network configuration of the node.

Kubernetes type: Annotation

Dependencies:

  • netifaces : Provides a list of the available network interfaces on the system.

Parameters:

  • None

Example Output:

  • hyperai.eu/node-available-interfaces: eth0, eth1, wlan0

Error Handling:

There is no specific error handling for this function.

latency

The latency annotation describes the grade of the network latency to the node’s control plane.

We measure the round-trip time (RTT) of a response from the control plane’s API server and classify the latency using this scale:

  • A (Excellent) : < 500ms

  • B (Good) : < 1000ms

  • C (Fair) : < 2000ms

  • D (Poor) : >= 2000ms
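
The scale above can be sketched as a small pure function. The name grade_latency is hypothetical, and the sketch assumes the RTT has already been converted to milliseconds; a missing or invalid measurement maps to the worst grade, matching the error handling described for this annotation.

```python
from typing import Optional


def grade_latency(rtt_ms: Optional[float]) -> str:
    """Map a round-trip time in milliseconds to a latency grade."""
    # A failed measurement (None) or a nonsensical value gets the worst grade.
    if rtt_ms is None or rtt_ms < 0:
        return "D"
    if rtt_ms < 500:
        return "A"  # Excellent
    if rtt_ms < 1000:
        return "B"  # Good
    if rtt_ms < 2000:
        return "C"  # Fair
    return "D"      # Poor
```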

Kubernetes type: Annotation

Dependencies:

  • ping3 : A pure Python 3 implementation of ICMP ping that uses raw sockets.

Parameters:

  • ip (str): The IP address of the Kubernetes API server.

Example Output:

  • hyperai.eu/node-latency: A

Error Handling:

If no route is found or an error occurs, the function returns the worst grade:

  • hyperai.eu/node-latency: D

bandwidth

The bandwidth annotations provide the grade of download speed and upload speed of the node.

Kubernetes type: Annotation

Dependencies:

Parameters:

  • None

Example Output:

  • hyperai.eu/node-download-speed: A

  • hyperai.eu/node-upload-speed: B

Error Handling:

If the bandwidth cannot be determined, the function returns default values:

  • hyperai.eu/node-download-speed: "unknown"

  • hyperai.eu/node-upload-speed: "unknown"

packetLoss

The packetLoss annotation measures the percentage of packets lost during transmission to a target host. The packet loss is categorized into grades based on the percentage of lost packets, providing a quick assessment of network reliability.

The function uses the system’s ping command to send ICMP packets and analyzes the output to extract packet loss percentage. The loss percentage is then classified into one of six grades:

  • Grade A: 0% packet loss (perfect)

  • Grade B: ≤ 20% packet loss

  • Grade C: ≤ 40% packet loss

  • Grade D: ≤ 60% packet loss

  • Grade E: ≤ 80% packet loss

  • Grade F: > 80% packet loss
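
The parse-then-grade flow can be sketched as below. The function names and the regular expression are assumptions chosen for illustration; the pattern targets the "N% packet loss" summary line that common ping implementations print. When the output cannot be parsed, the sketch defaults to 100% loss, as the error handling for this annotation describes.

```python
import re

# Matches e.g. "0% packet loss" or "12.5% packet loss" in the ping summary.
_LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)%\s*packet loss")


def parse_packet_loss(ping_output: str) -> float:
    """Extract the loss percentage from ping output; 100.0 if not found."""
    match = _LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else 100.0


def grade_packet_loss(loss_pct: float) -> str:
    """Map a loss percentage to one of the six grades A to F."""
    if loss_pct <= 0:
        return "A"
    for limit, grade in ((20, "B"), (40, "C"), (60, "D"), (80, "E")):
        if loss_pct <= limit:
            return grade
    return "F"


sample = "4 packets transmitted, 4 received, 0% packet loss, time 3004ms"
grade = grade_packet_loss(parse_packet_loss(sample))
```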

Kubernetes type: Annotation

Dependencies:

  • subprocess : Used to execute the system ping command

  • socket : Used for network socket operations

  • re : Used to parse ping output for packet loss percentage

Parameters:

  • ip (str): The IP address of the Kubernetes API server.

Example Output:

  • hyperai.eu/node-packet-loss: A (0% loss)

  • hyperai.eu/node-packet-loss: B (1-20% loss)

  • hyperai.eu/node-packet-loss: F (>80% loss)

Error Handling:

If packet loss cannot be determined, the function defaults to 100% packet loss and assigns the worst grade:

  • hyperai.eu/node-packet-loss: F

nodeCategory

The nodeCategory annotation provides a composite classification of a node’s computational capacity based on its CPU cores and RAM. This categorization enables quick identification of node capabilities for workload placement and resource allocation decisions.

CPU Grading Scale:

  • Grade F: 1 core

  • Grade E: 2-3 cores

  • Grade D: 4-7 cores

  • Grade C: 8-15 cores

  • Grade B: 16-31 cores

  • Grade A: 32+ cores

RAM Grading Scale:

  • Grade E: 0-8 GB

  • Grade D: 8-16 GB

  • Grade C: 16-32 GB

  • Grade B: 32-64 GB

  • Grade A: 64+ GB
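
A minimal sketch of how the two scales could be combined into the composite category string. The function names are hypothetical. Because the RAM ranges above share their boundary values, the sketch assumes a boundary belongs to the higher grade (for example, exactly 8 GB is graded D); the actual implementation may draw the line differently.

```python
def cpu_grade(cores: int) -> str:
    """Grade the CPU by core count (A is the largest class)."""
    for minimum, grade in ((32, "A"), (16, "B"), (8, "C"), (4, "D"), (2, "E")):
        if cores >= minimum:
            return grade
    return "F"


def ram_grade(ram_gb: float) -> str:
    """Grade the RAM by size in GB; boundary values go to the higher grade (assumption)."""
    for minimum, grade in ((64, "A"), (32, "B"), (16, "C"), (8, "D")):
        if ram_gb >= minimum:
            return grade
    return "E"


def node_category(cores: int, ram_gb: float) -> str:
    """Build the composite value, e.g. CPU_C_RAM_B."""
    return f"CPU_{cpu_grade(cores)}_RAM_{ram_grade(ram_gb)}"
```

In practice the inputs would come from os.cpu_count() (falling back to 1 when it returns None) and psutil.virtual_memory().total converted to GB.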

Kubernetes type: Annotation

Dependencies:

  • psutil : Used to retrieve system memory information

  • os : Used to get CPU count

Parameters:

  • None

Example Output:

  • hyperai.eu/node-category: CPU_C_RAM_B (8-15 cores, 32-64 GB RAM)

  • hyperai.eu/node-category: CPU_A_RAM_A (32+ cores, 64+ GB RAM)

  • hyperai.eu/node-category: CPU_D_RAM_C (4-7 cores, 16-32 GB RAM)

Error Handling:

The function uses fallback values if detection fails:

  • CPU cores default to 1 if os.cpu_count() returns None.

  • RAM is retrieved via psutil.virtual_memory().total.

uptime

The uptime label provides a grade indicating how long a Kubernetes node has been running since its last boot. This metric is useful for identifying stable, long-running nodes versus recently restarted ones, which can inform scheduling decisions and maintenance planning.

The uptime is calculated by measuring the time elapsed since system boot and categorizing it into grades:

  • Grade A: ≥ 7 days (10,080 minutes) - Oldest/most stable nodes

  • Grade B: 2-7 days (2,880-10,079 minutes)

  • Grade C: 1-2 days (1,440-2,879 minutes)

  • Grade D: < 1 day (< 1,440 minutes) - Newest nodes
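
The thresholds above translate directly into a grading function. The name grade_uptime is hypothetical, and the sketch assumes the uptime has already been computed in minutes (for example from the current time minus the boot time). An unknown uptime maps to grade D, as described in the error handling for this label.

```python
from typing import Optional


def grade_uptime(uptime_minutes: Optional[float]) -> str:
    """Map uptime in minutes to a grade from A (oldest) to D (newest)."""
    # Boot time could not be determined: fall back to the lowest grade.
    if uptime_minutes is None or uptime_minutes < 0:
        return "D"
    if uptime_minutes >= 10_080:  # 7 days
        return "A"
    if uptime_minutes >= 2_880:   # 2 days
        return "B"
    if uptime_minutes >= 1_440:   # 1 day
        return "C"
    return "D"
```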

Kubernetes type: Label

Dependencies:

  • psutil : Used to retrieve system boot time

  • time : Used to calculate elapsed time

Parameters:

  • None

Example Output:

  • hyperai.eu/node-uptime: A (node running for 7+ days)

  • hyperai.eu/node-uptime: B (node running for 2-7 days)

  • hyperai.eu/node-uptime: D (node running for less than 1 day)

Error Handling:

If the boot time cannot be determined, the function will return grade D.

TPU

The TPU label provides the total number of TPU (Tensor Processing Unit) devices available on the node.

Kubernetes type: Label

Dependencies:

  • pathlib : Used to check for TPU device files

  • subprocess : Used to execute lspci command (optional fallback)

Parameters:

  • None

Example Output:

  • hyperai.eu/node-tpu-capacity: 4 (4 TPU devices detected)

Error Handling:

If no TPU devices are found or an error occurs during detection, the function returns None and no label is added to the node.

Accelerators

The Accelerators label provides the total count of hardware accelerators (GPUs and other compute accelerators) available for workload scheduling on the node.

Kubernetes type: Label

Dependencies:

  • pathlib : Used to scan for accelerator device files

Parameters:

  • None

Example Output:

  • hyperai.eu/node-allocatable-accelerators: 2 (2 accelerator devices detected)

Error Handling:

If no accelerator devices are found or an error occurs during detection, the function returns None and no label is added to the node.

NodePool

The NodePool label assigns a logical pool or role to a Kubernetes node, or removes it, enabling grouping of nodes with similar characteristics (e.g., ML-optimized nodes, CPU-only nodes, GPU nodes) for workload scheduling and placement policies.

Kubernetes type: Label

Dependencies:

  • None (the label is applied/removed via HyperTool CLI)

Parameters:

  • None at runtime; the label value is provided manually by the operator.

Example Output:

  • hyperai.eu/node-pool: ml-node

  • hyperai.eu/node-pool: cpu-node

  • hyperai.eu/node-pool: gpu-node

geolocation

The geolocation labels provide the geographical location of the node based on its public IP address.

Kubernetes type: Label

Dependencies:

  • geocoder : Used to retrieve the geolocation information based on the public IP address.

Parameters:

  • None

Example Output:

  • hyperai.eu/node-geolocation-city: Athens

  • hyperai.eu/node-geolocation-region: Attica

  • hyperai.eu/node-geolocation-country: GR

Error Handling:

If the geolocation cannot be determined, the function returns default values:

  • hyperai.eu/node-geolocation-city: "unknown"

  • hyperai.eu/node-geolocation-region: "unknown"

  • hyperai.eu/node-geolocation-country: "unknown"

get_monetary_cost_annotation

The get_monetary_cost_annotation function adds monetary cost annotations to a HyperAI node. The current implementation follows the simulator cost logic: HyperTool first predicts a numeric monetary cost from the node CPU and memory, then assigns the qualitative category from that predicted cost value only.

Annotation Process

  1. Read node resources:

    • CPU count is converted to vCPU.

    • RAM is converted from MiB/GiB into the format expected by the cost model.

  2. Predict monetary cost:

    The packaged cost regression model estimates the node cost from CPU and memory:

    predicted_cost = cost_model(vCPU, ram_gb)
    
  3. Assign cost category:

    The predicted cost is scaled using the stored price-only scaler and compared against the price-based K-Means centroids:

    cost_scaled = (predicted_cost - mean_price) / scale_price
    cluster_id = argmin_i |cost_scaled - centroid_i|
    
  4. Map the selected cluster to a readable label:

    very low, low, medium, high or very high.
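
Steps 3 and 4 can be sketched in plain Python as follows. The scaler statistics and centroid values are invented for illustration and do not come from the packaged assets. The sketch also assumes that clusters map to labels by ascending centroid value; the real mapping may be stored explicitly alongside the model.

```python
from typing import Sequence

LABELS = ["very low", "low", "medium", "high", "very high"]


def cost_category(predicted_cost: float, mean_price: float,
                  scale_price: float, centroids: Sequence[float]) -> str:
    """Scale the predicted cost and return the label of the nearest centroid."""
    cost_scaled = (predicted_cost - mean_price) / scale_price
    # argmin_i |cost_scaled - centroid_i|
    cluster_id = min(range(len(centroids)),
                     key=lambda i: abs(cost_scaled - centroids[i]))
    # Assumption: rank clusters by centroid so the cheapest one is "very low".
    order = sorted(range(len(centroids)), key=lambda i: centroids[i])
    return LABELS[order.index(cluster_id)]


# Hypothetical scaler statistics and (unordered) centroids in scaled space.
MEAN_PRICE, SCALE_PRICE = 0.10, 0.05
CENTROIDS = [0.8, -1.5, 2.0, 0.0, -0.5]

category = cost_category(0.102345, MEAN_PRICE, SCALE_PRICE, CENTROIDS)
```

Ranking the centroids before mapping keeps the labels stable even if K-Means assigns cluster ids in an arbitrary order.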

Important Implementation Note

The category is no longer inferred from a mixed CPU, memory and placeholder-price vector. The category is now inferred from the predicted monetary cost value only. This keeps HyperTool aligned with the updated simulator implementation.

Kubernetes type: Annotation

Returned annotations:

  • hyperai.eu/node-monetary-cost

  • hyperai.eu/node-monetary-cost-category

Dependencies:

  • numpy : Used for centroid distance calculation.

  • pandas : Used to prepare model input features.

  • sklearn / joblib : Used to load the packaged regression and scaler assets.

Parameters:

  • None

Example Output:

  • hyperai.eu/node-monetary-cost: 0.102345

  • hyperai.eu/node-monetary-cost-category: medium

Error Handling:

If the cost model, scaler or node resources cannot be read, the function returns:

  • hyperai.eu/node-monetary-cost: "unknown"

  • hyperai.eu/node-monetary-cost-category: "unknown"

energy_efficiency_annotation

The energy_efficiency_annotation function adds energy-efficiency annotations to a HyperAI node. The current implementation is hardware-first. HyperTool first tries to compute the energy-efficiency value from the labelled hardware dataset using CPU core count and clock speed. The FLOPs and TDP regressors are used only when no usable hardware match is available.

Annotation Process

  1. Read hardware features:

    • CPU count is read from the node.

    • Clock speed is extracted from lscpu and stored in GHz.

  2. Try exact hardware lookup:

    HyperTool checks the labelled hardware dataset for the same CPU core count and rounded clock speed. If a match exists, the stored energy-efficiency value and label are used directly.

  3. Try nearest hardware lookup:

    If no exact row exists, HyperTool looks for a same-core hardware row with a close clock speed. This is still treated as hardware-based because the value comes from the labelled dataset.

  4. Use model fallback only if hardware lookup fails:

    If neither exact nor nearest hardware lookup is available, HyperTool predicts FLOPs and TDP using the packaged fallback regressors and calculates:

    energy_efficiency = flops_per_sec / tdp_watts
    
  5. Assign energy-efficiency category:

    The final energy-efficiency value, whether hardware-based or model-based, is passed to the energy-efficiency K-Means model and mapped to:

    very low, low, medium, high or very high.
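
The hardware-first lookup order (steps 2 to 4) can be sketched as below. The in-memory dataset, the rounding to one decimal place, the max_clock_gap tolerance, and the fallback callable are all assumptions standing in for the labelled dataset and the packaged regressors.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical labelled rows: (cores, clock_ghz) -> energy-efficiency value.
HARDWARE: Dict[Tuple[int, float], float] = {
    (8, 2.5): 1.8e7,
    (8, 3.0): 2.1e7,
    (16, 2.0): 3.0e7,
}


def energy_efficiency(
    cores: int,
    clock_ghz: float,
    dataset: Dict[Tuple[int, float], float],
    fallback: Callable[[int, float], Tuple[float, float]],
    max_clock_gap: float = 0.3,  # assumed tolerance for a "close" clock speed
) -> Tuple[Optional[float], str]:
    """Return (value, source) following the exact -> nearest -> model order."""
    # Step 2: exact match on core count and rounded clock speed.
    key = (cores, round(clock_ghz, 1))
    if key in dataset:
        return dataset[key], "hardware_exact"

    # Step 3: same core count, closest clock speed within the tolerance.
    same_core = [(abs(clock - clock_ghz), value)
                 for (n, clock), value in dataset.items() if n == cores]
    if same_core:
        gap, value = min(same_core)
        if gap <= max_clock_gap:
            return value, "hardware_nearest"

    # Step 4: model fallback, energy_efficiency = flops_per_sec / tdp_watts.
    try:
        flops_per_sec, tdp_watts = fallback(cores, clock_ghz)
        return flops_per_sec / tdp_watts, "model_fallback"
    except Exception:
        return None, "unknown"


def fake_regressors(cores: int, clock_ghz: float) -> Tuple[float, float]:
    """Stand-in for the FLOPs and TDP regressors (made-up numbers)."""
    return 1.0e11, 50.0
```

Returning the source together with the value mirrors the hyperai.eu/node-energy-efficiency-source annotation, so a consumer can tell a dataset value from a model estimate.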

Important Implementation Note

The fallback models are not the primary energy-efficiency (EE) source. They are used only when the hardware dataset cannot provide a usable value. This keeps EE based on hardware information first, while still allowing HyperTool to annotate unknown hardware.

Kubernetes type: Annotation

Returned annotations:

  • hyperai.eu/node-clock-speed-ghz

  • hyperai.eu/node-energy-efficiency

  • hyperai.eu/node-energy-efficiency-category

  • hyperai.eu/node-energy-efficiency-source

  • hyperai.eu/node-flops-per-sec

  • hyperai.eu/node-tdp-watts

Possible source values:

  • hardware_exact: exact core-count and clock-speed match in the labelled hardware dataset.

  • hardware_nearest: same-core hardware row with close clock speed.

  • model_fallback: FLOPs/TDP regressors were used because hardware lookup failed.

  • unknown: annotation failed.

Dependencies:

  • subprocess : Used to call lscpu for clock-speed extraction.

  • pandas : Used to load and aggregate the labelled hardware dataset.

  • numpy : Used for numeric conversion and distance checks.

  • sklearn / joblib : Used to load the fallback models and K-Means assets.

Parameters:

  • None

Example Output:

  • hyperai.eu/node-clock-speed-ghz: 2.5

  • hyperai.eu/node-energy-efficiency: 18452345.345

  • hyperai.eu/node-energy-efficiency-category: high

  • hyperai.eu/node-energy-efficiency-source: hardware_exact

  • hyperai.eu/node-flops-per-sec: unknown

  • hyperai.eu/node-tdp-watts: unknown

Error Handling:

If the hardware lookup and fallback model path both fail, the function returns:

  • hyperai.eu/node-energy-efficiency: "unknown"

  • hyperai.eu/node-energy-efficiency-category: "unknown"

  • hyperai.eu/node-energy-efficiency-source: "unknown"

  • hyperai.eu/node-flops-per-sec: "unknown"

  • hyperai.eu/node-tdp-watts: "unknown"

flops_per_sec

The flops_per_sec annotation is derived from the node’s estimated GFLOPs per Joule, which is calculated from synthetic floating-point performance (FLOPs/sec) and the CPU’s Thermal Design Power (TDP). TDP is either retrieved from static datasets or, when not found there, predicted using a regression model.

Kubernetes type: Annotation

Dependencies:

  • platform : to extract CPU model

  • pandas : to load CPU datasets

  • numpy : for synthetic FLOPs calculation

  • time : to measure execution duration

  • sklearn : to load regression model

  • subprocess : to call lscpu for fallback TDP prediction

Parameters:

  • None

Example Output:

  • hyperai.eu/node-flops-per-sec: 11545015139.0

Error Handling:

If the TDP cannot be inferred from either the static dataset or the regression model, the fallback result is:

  • hyperai.eu/node-flops-per-sec: "unknown"

This typically occurs when the CPU model cannot be parsed or lscpu fails to return the required features. Such failures are logged with logging.error.

trust_level

The trust_level annotation reports how trustworthy a node is, based on runtime-health signals the node observes about itself locally (uptime, CPU throttling, memory/disk pressure, OOM kills, abnormal restarts, eviction risk and network stability). Each signal yields a score in [0, 1] and the weighted total (weights sum to 1.0) is mapped to a qualitative label. The score is classified as high (≥ 0.7), medium (0.4–0.7) or low (< 0.4).
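
The weighted scoring can be sketched as follows. The individual weights are invented for illustration (the text only states that they sum to 1.0), and the signal names are shorthand for the signals listed above. A signal without a score uses the neutral default of 0.7 described in the error handling for this annotation.

```python
from typing import Dict, Optional

NEUTRAL_SCORE = 0.7  # used when a single signal cannot be evaluated

# Hypothetical weights; only the constraint "sum to 1.0" comes from the text.
WEIGHTS: Dict[str, float] = {
    "uptime": 0.20,
    "cpu_throttling": 0.15,
    "memory_pressure": 0.15,
    "disk_pressure": 0.10,
    "oom_kills": 0.15,
    "abnormal_restarts": 0.10,
    "eviction_risk": 0.05,
    "network_stability": 0.10,
}


def trust_level(scores: Dict[str, Optional[float]],
                weights: Dict[str, float] = WEIGHTS) -> str:
    """Combine per-signal scores in [0, 1] into high, medium or low."""
    try:
        total = 0.0
        for name, weight in weights.items():
            score = scores.get(name)
            if score is None:
                score = NEUTRAL_SCORE
            total += weight * min(max(score, 0.0), 1.0)  # clamp to [0, 1]
    except Exception:
        return "unknown"  # the whole computation failed

    total = round(total, 6)  # avoid float noise right at a threshold
    if total >= 0.7:
        return "high"
    if total >= 0.4:
        return "medium"
    return "low"
```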

Kubernetes type: Annotation

Returned annotations:

  • hyperai.eu/node-trust-level

Dependencies:

  • psutil : boot time, memory/disk usage and eviction headroom.

  • subprocess : reads the kernel ring buffer via dmesg.

  • netifaces : resolves the default gateway.

  • ping3 : checks gateway reachability.

  • pathlib : reads /proc/pressure/*, cgroup cpu.stat and /sys/class/net.

Parameters:

  • None

Example Output:

  • hyperai.eu/node-trust-level: high

Error Handling:

A metric that cannot be evaluated falls back to a neutral default score (0.7) so a single missing signal does not dominate the result. If the whole computation fails, the function returns:

  • hyperai.eu/node-trust-level: "unknown"

security_level

The security_level annotation reports a node’s security posture from signals that can be checked locally inside the pod (pending security patches, disk encryption, users with shell access, unsigned kernel modules, open ports, kernel taint, public reachability and outbound internet access). Each signal yields a score in [0, 1]. The weighted total is normalised over the weights of the signals that were actually included before being mapped to high (≥ 0.7), medium (0.4–0.7) or low (< 0.4).

Kubernetes type: Annotation

Returned annotations:

  • hyperai.eu/node-security-level

Dependencies:

  • subprocess : runs the package manager (apt-get / dnf) and lsblk.

  • psutil : enumerates listening ports.

  • netifaces : enumerates interface IP addresses.

  • ping3 : outbound internet-reachability check.

  • pathlib : reads /etc/passwd, /sys/module/*/taint and /proc/sys/kernel/tainted.

  • shutil : locates the package-manager binary.

  • ipaddress : classifies public vs private addresses.

Parameters:

  • None

Example Output:

  • hyperai.eu/node-security-level: low

Error Handling:

A check that raises an exception contributes 0 to the score (a penalty); a check that cannot be determined locally returns a neutral 0.5. The weighted total is normalised over the included parameters before classification, so skipped API/scanner signals do not distort the thresholds.