OCI CCM v1.34: the annotations that don't exist and the Token Collision that freezes the LB
Published on April 21, 2026

Context
During the migration of a Next.js application to OKE (Oracle Kubernetes Engine), we used a LoadBalancer-type Service so that the Cloud Controller Manager (CCM) would automatically provision and manage the OCI Load Balancer. The CCM version was v1.34, paired with the OKE control plane.
What should have been simple — declaring annotations on the Service and having CCM obey them — turned into a sequence of silent failures, 409 errors, and orphaned LBs accumulating in the account. This post documents what actually works, what is silently ignored, and the most dangerous trap: Token Collision.
Annotations that don't exist in OCI CCM v1.34
The official OCI CCM documentation lists a set of supported annotations. But when running real workloads, some annotations found in forums, community examples, or older versions simply don't work — and CCM gives no warning. It silently ignores them and continues.
1. Reusing an existing LB
Attempt: point CCM to a Load Balancer that already existed in the account, created manually before the cluster.
# FAILED — annotation does not exist in OCI CCM v1.34
service.beta.kubernetes.io/oci-load-balancer-id: "<ocid-of-manual-lb>"
Result: CCM completely ignored the annotation and created a new LB with an auto-generated UUID name. The manual LB remained active and billing, with no traffic.
2. Reserved IP (fixed IP for the LB)
Attempt: pin the Load Balancer's public IP using an OCI Reserved IP to avoid IP change if the Service were recreated.
# FAILED — annotation ignored, CCM creates its own IP
service.beta.kubernetes.io/oci-load-balancer-reserved-ip-id: "<ocid-of-publicip>"
# Also FAILED
oci.oraclecloud.com/load-balancer-ip: "192.0.2.4"
Result: CCM ignored both and provisioned the LB with an ephemeral IP, which nevertheless stays stable as long as the Service exists. The Reserved IP remained in AVAILABLE state, costing about $1.80/month while unused.
In practice, the IP of the CCM-managed LB does not change as long as the Kubernetes Service is not deleted. For production DNS, this IP is stable enough — Reserved IP is unnecessary.
3. Security Rule Management via NSG
Attempt: use NSG mode so that CCM manages traffic rules via Network Security Groups instead of Security Lists.
# FAILED — requires IAM policies that basic OKE does not provision
service.beta.kubernetes.io/oci-load-balancer-security-rule-management-mode: "NSG"
# Also FAILED — 404 without VirtualNetwork Manage permission
service.beta.kubernetes.io/oci-network-security-groups: "<ocid-of-nsg>"
Result: 404 errors on the VirtualNetwork API. The CCM service account in basic OKE does not have the IAM policies required to create or modify NSGs. The working solution is to use `security-list-management-mode: None` and manage network rules manually.
Annotations confirmed working in OCI CCM v1.34
After discarding the problematic annotations, this is the set tested in production that works consistently:
annotations:
# Subnet where the LB will be created
service.beta.kubernetes.io/oci-load-balancer-subnet1: "<ocid-of-public-subnet>"
# Flexible shape (recommended over fixed shape)
service.beta.kubernetes.io/oci-load-balancer-shape: "flexible"
service.beta.kubernetes.io/oci-load-balancer-shape-flex-min: "10"
service.beta.kubernetes.io/oci-load-balancer-shape-flex-max: "100"
# CCM does not touch Security Lists (manual management)
service.beta.kubernetes.io/oci-load-balancer-security-list-management-mode: "None"
# Health check via kube-proxy healthz (port 10256)
service.beta.kubernetes.io/oci-load-balancer-health-check-protocol: "HTTP"
service.beta.kubernetes.io/oci-load-balancer-health-check-path: "/healthz"
service.beta.kubernetes.io/oci-load-balancer-health-check-retries: "3"
service.beta.kubernetes.io/oci-load-balancer-health-check-interval: "10000"
service.beta.kubernetes.io/oci-load-balancer-health-check-timeout: "5000"
# Backend protocol
service.beta.kubernetes.io/oci-load-balancer-backend-protocol: "HTTP"
# HTTPS (when TLS secret is created in the cluster)
service.beta.kubernetes.io/oci-load-balancer-ssl-ports: "443"
service.beta.kubernetes.io/oci-load-balancer-tls-secret: "myapp/myapp-tls"
Token Collision: the most dangerous problem
Token Collision is the most silent trap in OCI CCM. It manifests as an HTTP 409 on the OCI API — and when it occurs, CCM stops reconciling the LB without emitting any clear alert in the cluster.
How the idempotency token works
OCI CCM generates an idempotency token for each Load Balancer creation operation. The token is derived from:
token = hash("cluster~createLoadBalancer~{serviceUID}")
The Service UID is fixed as long as the Service exists. The problem: any change to the Service spec (adding or removing an annotation, changing a port, altering the shape) does not change the UID, but it does change the content of the request sent to the OCI API. OCI returns 409 because the token was already used with a different payload.
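The mismatch can be illustrated with a small sketch. The source does not say which hash function CCM uses, so `sha256sum` is only a stand-in here, and `retry_token`, `payload_digest`, the UID, and the two request bodies are all made up for illustration.

```shell
# Sketch only: sha256sum stands in for whatever hash CCM really uses.
# $1 = Service UID
retry_token() {
  printf 'cluster~createLoadBalancer~%s' "$1" | sha256sum | cut -d' ' -f1
}

# Fingerprint of a request body (hypothetical helper, for comparison only).
payload_digest() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

uid="3f2c9a4e-0000-4000-8000-000000000001"   # made-up UID
before='{"shape":"flexible","max":100}'       # made-up request body
after='{"shape":"flexible","max":200}'        # same Service, edited spec

# Same UID, so the token is identical on both attempts...
[ "$(retry_token "$uid")" = "$(retry_token "$uid")" ] && echo "token: unchanged"
# ...while the request body it accompanies is not.
[ "$(payload_digest "$before")" != "$(payload_digest "$after")" ] && echo "payload: changed"
```

Both messages are printed: the token depends only on the UID, so an edited spec reuses the old token with a new payload, which is the combination described above as producing the 409.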
Symptom
CCM enters a reconciliation loop unable to update the LB. Service events show repeated errors, but no high-level cluster log indicates what happened. The LB in OCI stays in the previous state, without reflecting the Service changes.
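Because the errors surface only as Service events, it helps to pull those events directly. The following is a minimal sketch; the function name `service_events` is hypothetical, while the `kubectl` flags used are standard ones.

```shell
# List events attached to one Service, oldest first.
# $1 = namespace, $2 = Service name
service_events() {
  kubectl get events -n "$1" \
    --field-selector "involvedObject.kind=Service,involvedObject.name=$2" \
    --sort-by=.lastTimestamp
}

# Example:
#   service_events myapp myapp-svc
```

Repeated warnings from `service-controller` after a spec change, with no matching change on the OCI side, match the symptom described here.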
The only way out: delete and recreate the Service
There is no way to resolve Token Collision without recreating the Service. A new Service generates a new UID → new token → clean OCI operation. The correct procedure is:
# 1. Remove the finalizer to unblock deletion
kubectl patch service myapp-svc -n myapp -p '{"metadata":{"finalizers":[]}}' --type=merge
# 2. Delete the Service (CCM deletes the LB automatically)
kubectl delete service myapp-svc -n myapp
# 3. Recreate the Service with the correct annotations
kubectl apply -f service.yaml
# 4. Reactivate WAF on the new LB (if a WAF policy is associated)
# The LB OCID changes, so the WAF instance must be recreated
CRITICAL: never delete the Service in production without planning the WAF re-registration. CCM deletes the old LB when the Service is deleted, and the associated WAF instance loses its reference.
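One way to confirm that the recreate cycle really produced a fresh UID (and therefore a fresh token) is to compare the UID before and after. This is a sketch under that assumption; `service_uid` and `uid_changed` are hypothetical helper names.

```shell
# Print the UID of a Service. $1 = namespace, $2 = Service name
service_uid() {
  kubectl get service "$2" -n "$1" -o jsonpath='{.metadata.uid}'
}

# Succeed only if the new UID ($2) is non-empty and differs from the old one ($1).
uid_changed() {
  [ -n "$2" ] && [ "$1" != "$2" ]
}

# Intended use around the delete + recreate cycle:
#   old=$(service_uid myapp myapp-svc)    # before step 2
#   new=$(service_uid myapp myapp-svc)    # after step 3
#   uid_changed "$old" "$new" && echo "new UID, new token"
```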
Golden rule to avoid Token Collision
Define the final Service spec before the first apply. Once created with a UID, any structural change (ports, shape annotations, subnet) requires the full delete + recreate cycle. Do not try to edit the Service in production with `kubectl edit` or `kubectl patch` to change functional annotations — this triggers the 409.
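The rule can be enforced mechanically with a small guard in a deploy script. The sketch below is one possible shape, not part of the original setup; `safe_apply` is a hypothetical name, and it simply refuses to apply over a Service that already exists.

```shell
# Apply a Service manifest only if the Service does not exist yet.
# $1 = namespace, $2 = Service name, $3 = manifest path
safe_apply() {
  if kubectl get service "$2" -n "$1" >/dev/null 2>&1; then
    echo "refusing: Service $1/$2 already exists; review 'kubectl diff -f $3' and plan a delete + recreate" >&2
    return 1
  fi
  kubectl apply -f "$3"
}

# Example:
#   safe_apply myapp myapp-svc service.yaml
```

`kubectl diff -f service.yaml` shows what an apply would change without sending anything to the cluster, which makes it a safe first step before deciding on a recreate.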
Mandatory ports the documentation doesn't highlight
Two network requirements are critical and appear only as secondary notes in the official documentation, but cause silent failures when absent.
Port 12250 — node registration
Worker nodes communicate with the OKE management endpoint over TCP port 12250. Without this ingress rule in the endpoint's NSG, nodes never complete registration.
# OKE endpoint NSG
Direction: INGRESS
Protocol: TCP
Source: Worker node NSG (or node subnet CIDR)
Destination port: 12250
Description: Workers -> OKE management endpoint
Symptom without port 12250: all nodes remain in `UnknownNodeError` state with the message "has not been seen for more than 20 minutes", even when freshly provisioned by the node pool.
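A quick reachability probe run from a worker node can confirm whether the rule is in place. This sketch assumes `bash` and coreutils `timeout` are available on the node; `port_open` is a hypothetical helper and the endpoint address in the example is a placeholder.

```shell
# Succeed if a TCP connection to host:port opens within 3 seconds.
# Relies on bash's /dev/tcp redirection and coreutils `timeout`.
# $1 = host, $2 = port
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example (endpoint address is a placeholder):
#   port_open 10.0.0.10 12250 && echo reachable || echo blocked
```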
Port 10256 — Load Balancer health check
CCM configures the LB health check pointing to port 10256 on the nodes, which is the kube-proxy `/healthz` endpoint. Without the rule allowing this port from the LB subnet to the nodes, all backends are marked unhealthy and no traffic is routed.
# Worker node NSG (or Security List)
Direction: INGRESS
Protocol: TCP
Source: LB subnet CIDR (e.g. 10.1.0.0/24)
Destination port: 10256
Description: OCI LB health check -> kube-proxy healthz
Note: CCM uses 10256 as the default health check port regardless of any health-check-path annotations configured. The port is hardcoded in the controller behavior.
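The same check the LB performs can be reproduced by hand. The sketch below assumes `curl` is installed and is most representative when run from a host inside the LB subnet; `healthz_ok` is a hypothetical helper and the node IP in the example is a placeholder.

```shell
# Succeed if http://<node-ip>:10256/healthz answers 200 within 5 seconds.
# $1 = node IP
healthz_ok() {
  code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "http://$1:10256/healthz")
  [ "$code" = "200" ]
}

# Example (node IP is a placeholder):
#   healthz_ok 10.0.1.5 && echo healthy || echo "unhealthy or unreachable"
```

A failure from the LB subnet combined with a success from the node itself points at the missing ingress rule rather than at kube-proxy.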
HTTPS managed by CCM: declare in the Service, not in the LB
A common trap is adding HTTPS listeners directly to the OCI Load Balancer via CLI or console. CCM reconciles the LB periodically (on every node or pod change) and deletes anything not declared in the Kubernetes Service.
The correct way to have HTTPS managed by CCM:
# 1. Create the TLS secret in the cluster
kubectl create secret tls myapp-tls -n myapp \
--cert=myapp.crt \
--key=myapp.key
# 2. Declare port 443 in the Service spec
spec:
ports:
- name: http
port: 80
targetPort: 3000
protocol: TCP
- name: https
port: 443
targetPort: 3000
protocol: TCP
# 3. Add the TLS annotations
annotations:
service.beta.kubernetes.io/oci-load-balancer-ssl-ports: "443"
service.beta.kubernetes.io/oci-load-balancer-tls-secret: "myapp/myapp-tls"
With this configuration, CCM creates two listeners and two backend sets. When nodes change, CCM reconciles and recreates the listeners automatically, without manual intervention.
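For reference, the fragments from steps 2 and 3 can be assembled into a single manifest. The sketch below writes one to disk; the selector label `app: myapp` is a placeholder that does not come from the original setup, and the health-check annotations listed earlier are left out for brevity.

```shell
# Write a combined Service manifest to the path given as $1.
# The selector label `app: myapp` is a placeholder, not from the original setup.
write_service_manifest() {
  cat > "$1" <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: myapp-svc
  namespace: myapp
  annotations:
    service.beta.kubernetes.io/oci-load-balancer-subnet1: "<ocid-of-public-subnet>"
    service.beta.kubernetes.io/oci-load-balancer-shape: "flexible"
    service.beta.kubernetes.io/oci-load-balancer-shape-flex-min: "10"
    service.beta.kubernetes.io/oci-load-balancer-shape-flex-max: "100"
    service.beta.kubernetes.io/oci-load-balancer-security-list-management-mode: "None"
    service.beta.kubernetes.io/oci-load-balancer-backend-protocol: "HTTP"
    service.beta.kubernetes.io/oci-load-balancer-ssl-ports: "443"
    service.beta.kubernetes.io/oci-load-balancer-tls-secret: "myapp/myapp-tls"
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
    - name: http
      port: 80
      targetPort: 3000
      protocol: TCP
    - name: https
      port: 443
      targetPort: 3000
      protocol: TCP
EOF
}

write_service_manifest service.yaml
```

Running `kubectl apply --dry-run=client -f service.yaml` afterwards validates the file locally without creating the Service, which fits the advice to settle the spec before the first real apply.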
CCM backend auto-update: confirmed in ~2 minutes
After resolving all annotation and port problems, we validated CCM's auto-update behavior: we manually removed a node via the OCI console. The node pool provisioned a replacement, and CCM detected the new node and updated the LB backends in approximately 2 minutes, without intervention.
# Service events during node replacement
Warning UnAvailableLoadBalancer 11m service-controller There are no available provisioned nodes for LoadBalancer
Normal UpdatedLoadBalancer 9m21s (x4 over 13m) service-controller Updated load balancer with new hosts
The real downtime was not caused by CCM. It was dominated by container image pull time (2.1 GB taking 1m34s per new node). CCM does its part; the bottleneck is image size.
To reduce downtime during node replacements: use Next.js standalone output, multi-stage builds, and proper .dockerignore. A 2.1 GB image can drop to ~500 MB — and pull time follows.
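As a rough back-of-the-envelope check, the measured pull can be scaled to the smaller image. This sketch assumes pull time is proportional to image size, which ignores layer caching, registry latency, and decompression; `estimate_pull_seconds` is a hypothetical helper.

```shell
# Estimate pull time in whole seconds, assuming it scales linearly with size.
# $1 = new image size in MB, $2 = reference size in MB, $3 = reference pull time in s
estimate_pull_seconds() {
  echo $(( $1 * $3 / $2 ))
}

# Reference measurement: 2.1 GB (taken as 2100 MB) pulled in 1m34s (94 s).
estimate_pull_seconds 500 2100 94   # prints 22
```

Under that assumption, a ~500 MB image would pull in roughly 22 seconds per new node instead of 94.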
Summary of traps
For anyone setting up OKE + CCM v1.34 for the first time, the main traps in order of impact:
1. Token Collision 409: never edit the Service spec after creation. Any structural change requires delete + recreate with finalizer removal.
2. Missing port 12250: nodes stay in UnknownNodeError indefinitely. Without this port in the endpoint NSG, the cluster does not work.
3. Missing port 10256: all LB backends are unhealthy. Without this port, traffic never reaches applications.
4. Annotations that don't work: oci-load-balancer-id and reserved-ip-id are silently ignored, and security-rule-management-mode: NSG fails with 404 errors because basic OKE lacks the required IAM policies. Don't use them.
5. HTTPS outside the Service: listeners added directly to the OCI LB are deleted by CCM on the next reconciliation.


