
CVE-2026-46333 in Kubernetes: Unset Seccomp Exposed pidfd_getfd, RuntimeDefault Blocked It

CVE-2026-46333 is the Linux __ptrace_may_access() bug Qualys disclosed on May 15, 2026. The public shorthand is ssh-keysign-pwn, but the kernel primitive is broader: under the right timing, an unprivileged process can use pidfd_getfd to duplicate a file descriptor from a privileged process as it exits.

For Kubernetes teams, the useful question is not only "is the node kernel affected?" It is:

Can an ordinary pod steal a privileged file descriptor, and which Kubernetes controls actually block the path?

We tested that with a controlled target in local Docker, local kind, EKS Auto Mode on Bottlerocket, and a private mixed-node lab cluster. This time we reproduced the important effect in Kubernetes: a non-root pod stole an fd for a root-owned disposable file inside its own test image when seccomp was unset.

The short version:

  • EKS Auto Mode on Bottlerocket reproduced controlled fd theft on all four tested nodes when seccomp was unset.
  • EKS explicit Unconfined reproduced the same result.
  • EKS RuntimeDefault blocked pidfd_getfd.
  • EKS PSS Restricted blocked pidfd_getfd and also prevented the setuid helper from opening the file.
  • EKS PSS Baseline blocked explicit Unconfined and hostPID, but it did not fix unset seccomp; the Baseline unset case still reproduced controlled fd theft.
  • Local kind reproduced the same unset/Unconfined success and RuntimeDefault/Restricted blockers.
  • A Debian arm64 worker in the private lab cluster reproduced the same result.
  • On our Talos worker, pods with unset seccomp still ran with Seccomp: 2, which blocked pidfd_getfd. Even in a deliberately unconfined lab namespace, the fd theft did not reproduce in 500 attempts, which is consistent with ptrace_scope=2 adding another gate.

This post is deliberately scoped. We did not target host files, /etc/shadow, SSH host keys, Kubernetes Secrets, host ssh-keysign, or production application files. We used a disposable root-owned file inside the lab image and a purpose-built setuid helper that exists only to validate the control path.

We are not publishing exploit code, lab source, or reproduction commands.

What CVE-2026-46333 is

The upstream Linux fix is titled ptrace: slightly saner 'get_dumpable()' logic. The bug is in a ptrace access-check path used by interfaces such as pidfd_getfd. During process exit, a task can pass through a window where its memory image is gone but its file table still exists. The vulnerable logic could skip a dumpability check in that state.

The public PoCs use that gap to duplicate file descriptors from privileged-but-dropped processes. The known examples focus on helpers such as ssh-keysign and chage, where a process opens a root-owned file, drops privilege, and later exits with the fd still open.

That shape matters in Kubernetes because an ordinary pod can contain setuid files in its image. If privilege escalation is allowed and seccomp does not block pidfd_getfd, the pod can potentially reach the same kernel primitive without host namespaces or hostPath.

The controlled target

Our lab target had four pieces:

  • a root-owned file in the image at /opt/probe-secret;
  • a non-root attacker process running as UID 1001;
  • a purpose-built setuid-root helper in the same image;
  • a marker string: JULIET_PTRACE_FD_PROBE_ONLY.

The attacker first tried to open /opt/probe-secret directly and got EACCES. The helper then opened the file while running with euid 0, dropped back to UID 1001, and paused. While the helper was alive, pidfd_getfd against the privileged fd returned EPERM. The attacker then raced the helper's exit and tried to duplicate the fd.

That gives a clean success signal:

  • direct open by UID 1001 fails;
  • helper proves it opened the file as euid 0 and dropped back to UID 1001;
  • alive pidfd_getfd is denied;
  • post-exit pidfd_getfd succeeds;
  • the duplicated fd reads only the disposable marker.

That is a Kubernetes-relevant security effect without touching host data or real secrets.
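
The five-part signal can be checked mechanically against probe output. The sketch below is a hypothetical Python helper, not the lab's tooling: it assumes the key=value field names that appear in the log excerpts in this post, and it treats a non-negative secret_fd as evidence that the helper opened the file.

```python
import re

MARKER = "JULIET_PTRACE_FD_PROBE_ONLY"

def parse_fields(log: str) -> dict:
    """Collect key=value tokens from probe output; later tokens overwrite earlier ones."""
    return dict(re.findall(r"(\w+)=(\S+)", log))

def fd_theft_confirmed(log: str) -> bool:
    """Return True only when every part of the success signal is present."""
    f = parse_fields(log)
    checks = [
        f.get("attacker_direct_open") == "0",       # direct open by the attacker failed
        int(f.get("secret_fd", "-1")) >= 0,         # helper did open the file (assumed meaning)
        f.get("alive_pidfd_getfd_secret") == "0",   # pidfd_getfd denied while helper was alive
        f.get("race_pidfd_getfd_success") == "1",   # pidfd_getfd succeeded during exit
        f.get("sample_prefix") == MARKER,           # duplicated fd read only the marker
    ]
    return all(checks)
```

A log that omits any of these fields, including the abbreviated excerpts shown for some runs, is reported as unconfirmed rather than guessed at.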

Local Docker and kind

We first validated the target locally.

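Docker's default seccomp profile blocked the syscall. In the process status output, Seccomp: 2 means a seccomp filter is active and Seccomp: 0 means seccomp is disabled: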

Seccomp: 2
self_pidfd_getfd_stdout=0 errno=1 (Operation not permitted)
race_pidfd_getfd_success=0

With seccomp unconfined, the controlled fd theft reproduced:

NoNewPrivs: 0
Seccomp: 0
self_pidfd_getfd_stdout=1
attacker_direct_open=0 errno=13 (Permission denied)
victim before=1001/0/0 after=1001/1001/1001 secret_fd=3 dumpable=0
alive_pidfd_getfd_secret=0 errno=1 (Operation not permitted)
race_pidfd_getfd_success=1 attempt=41 victim_fd=3 read_len=28 sample_prefix=JULIET_PTRACE_FD_PROBE_ONLY

With no_new_privileges, the helper could not become euid 0 and never opened the file:

NoNewPrivs: 1
Seccomp: 0
self_pidfd_getfd_stdout=1
victim before=1001/1001/1001 after=1001/1001/1001 secret_fd=-1 open_errno=13
race_pidfd_getfd_success=0

Then we ran the Kubernetes matrix in a fresh kind cluster on the same local kernel.

Unset seccomp reproduced controlled fd theft:

Seccomp: 0
self_pidfd_getfd_stdout=1
victim before=1001/0/0 after=1001/1001/1001 secret_fd=3 dumpable=0
race_pidfd_getfd_success=1 attempt=54 victim_fd=3 read_len=28 sample_prefix=JULIET_PTRACE_FD_PROBE_ONLY

Explicit Unconfined also reproduced it. RuntimeDefault blocked pidfd_getfd:

Seccomp: 2
self_pidfd_getfd_stdout=0 errno=1 (Operation not permitted)
race_pidfd_getfd_success=0

Restricted blocked both sides of the chain:

NoNewPrivs: 1
Seccomp: 2
CapBnd: 0000000000000000
self_pidfd_getfd_stdout=0 errno=1 (Operation not permitted)
victim before=1001/1001/1001 after=1001/1001/1001 secret_fd=-1 open_errno=13
race_pidfd_getfd_success=0

PSS Baseline gave the important warning: it rejected explicit Unconfined and hostPID, but an unset-seccomp pod was admitted and reproduced the controlled fd theft.

EKS Auto Mode on Bottlerocket

The EKS cluster used Auto Mode Bottlerocket nodes:

Kubernetes:        v1.35.2-eks-f69f56f
OS image:          Bottlerocket (EKS Auto, Standard) 2026.5.12
Kernel:            6.12.83
Container runtime: containerd 2.1.6+bottlerocket
ptrace_scope:      1

We ran the full matrix on one node and an unset/RuntimeDefault check across the other three nodes.

Unset seccomp reproduced controlled fd theft on the first node:

NoNewPrivs: 0
Seccomp: 0
Seccomp_filters: 0
self_pidfd_getfd_stdout=1
attacker_direct_open=0 errno=13 (Permission denied)
victim before=1001/0/0 after=1001/1001/1001 secret_fd=3 dumpable=0
alive_pidfd_getfd_secret=0 errno=1 (Operation not permitted)
race_pidfd_getfd_success=1 attempt=1 victim_fd=3 read_len=28 sample_prefix=JULIET_PTRACE_FD_PROBE_ONLY

The other three Bottlerocket nodes reproduced the same result under unset seccomp, with success in one to three spawns.

Explicit Unconfined behaved the same way:

NoNewPrivs: 0
Seccomp: 0
self_pidfd_getfd_stdout=1
race_pidfd_getfd_success=1 attempt=1 victim_fd=3 read_len=28 sample_prefix=JULIET_PTRACE_FD_PROBE_ONLY

RuntimeDefault blocked the path on every Bottlerocket node we checked:

Seccomp: 2
Seccomp_filters: 1
self_pidfd_getfd_stdout=0 errno=1 (Operation not permitted)
race_pidfd_getfd_success=0

Restricted blocked the syscall and prevented the helper from opening the file:

CapBnd: 0000000000000000
NoNewPrivs: 1
Seccomp: 2
self_pidfd_getfd_stdout=0 errno=1 (Operation not permitted)
victim before=1001/1001/1001 after=1001/1001/1001 secret_fd=-1 open_errno=13
race_pidfd_getfd_success=0

PSS Baseline was not enough

The EKS application namespaces we checked use PSS Baseline enforcement with Restricted audit and warn labels. We created temporary lab namespaces with the same Baseline enforcement posture to answer one question: does Baseline fix unset seccomp?

It did not.

In a Baseline-enforced namespace, an unset-seccomp pod was admitted, ran with Seccomp: 0, and reproduced controlled fd theft:

NoNewPrivs: 0
Seccomp: 0
self_pidfd_getfd_stdout=1
race_pidfd_getfd_success=1 attempt=1 victim_fd=3 read_len=28 sample_prefix=JULIET_PTRACE_FD_PROBE_ONLY

Baseline did block explicit Unconfined:

violates PodSecurity "baseline:latest": seccompProfile ... must not set ... Unconfined

Baseline also blocked hostPID:

violates PodSecurity "baseline:latest": host namespaces (hostPID=true)

That distinction is the main Kubernetes takeaway. Baseline catches some risky declarations, but it does not require a seccomp profile. On the EKS Bottlerocket nodes we tested, unset seccomp meant Seccomp: 0, and that was enough to reproduce the controlled fd theft.

Private lab: Debian reproduced, Talos blocked

Our private lab cluster has both Talos and Debian workers, which made it useful for node-family comparison.

The Debian arm64 worker reproduced controlled fd theft under unset seccomp:

Kernel: 6.12.20+rpt-rpi-2712
Seccomp: 0
self_pidfd_getfd_stdout=1
victim before=1001/0/0 after=1001/1001/1001 secret_fd=3 dumpable=0
race_pidfd_getfd_success=1 attempt=0 victim_fd=3 read_len=28 sample_prefix=JULIET_PTRACE_FD_PROBE_ONLY

RuntimeDefault blocked pidfd_getfd, and Restricted prevented both the syscall and the setuid open.

The Talos worker behaved differently:

Kernel:       6.18.5-talos
ptrace_scope: 2
Seccomp:      2
self_pidfd_getfd_stdout=0 errno=1 (Operation not permitted)
race_pidfd_getfd_success=0

Even after we deliberately used a privileged-labeled lab namespace to allow Unconfined seccomp, the controlled fd theft did not reproduce in 500 attempts:

ptrace_scope: 2
Seccomp: 0
self_pidfd_getfd_stdout=1
victim before=1001/0/0 after=1001/1001/1001 secret_fd=3 dumpable=0
race_pidfd_getfd_success=0

We are not claiming Talos is universally immune. This result says our Talos node had two meaningful gates: unset seccomp still became Seccomp: 2, and ptrace_scope=2 appears to remain relevant even if seccomp is removed.

Result summary

EKS Auto / Bottlerocket 6.12.83

  • Unset seccomp: controlled fd theft reproduced on all four nodes.
  • Explicit Unconfined: reproduced on the full-matrix node.
  • RuntimeDefault: blocked pidfd_getfd.
  • Restricted: blocked pidfd_getfd and prevented the setuid helper from opening the file.
  • Baseline: blocked explicit Unconfined and hostPID, but the unset-seccomp case still reproduced.

Local kind / 6.17.8 OrbStack

  • Unset seccomp and explicit Unconfined: reproduced.
  • RuntimeDefault: blocked pidfd_getfd.
  • Restricted: blocked pidfd_getfd and the setuid open.
  • Baseline: blocked explicit Unconfined and hostPID, but unset seccomp still reproduced.

Private lab Debian arm64 / 6.12.20

  • Unset seccomp: reproduced.
  • Explicit Unconfined: reproduced in a privileged-labeled lab namespace.
  • RuntimeDefault: blocked pidfd_getfd.
  • Restricted: blocked pidfd_getfd and the setuid open.
  • Baseline: cluster admission blocked explicit Unconfined in the normal namespace.

Private lab Talos / 6.18.5

  • Unset seccomp: blocked by effective Seccomp: 2.
  • Explicit Unconfined: did not reproduce in the privileged-labeled lab namespace.
  • RuntimeDefault: blocked pidfd_getfd.
  • Restricted: blocked pidfd_getfd and the setuid open.
  • Baseline: cluster admission blocked explicit Unconfined and hostPID in the normal namespace.

How Juliet helps

The useful product question is not "does a scanner know this CVE exists?" It is "which workloads can still reach the kernel primitive, and can we stop new ones from being admitted?"

Juliet helps with that in four specific ways:

  • Find exposure candidates: Juliet inventories workload seccomp profile type, allowPrivilegeEscalation, capabilities, hostPID, namespace Pod Security labels, node OS image, kernel, and container runtime. That lets teams find pods where unset or Unconfined seccomp intersects with affected or unknown node streams.
  • Block unsafe admissions: Juliet includes a built-in admission policy named cve-2026-46333-pidfd-hardening. It can run in audit or enforce mode and flags unset or Unconfined seccomp, plus containers that do not set allowPrivilegeEscalation: false.
  • Detect runtime attempts: Juliet runtime includes an audit-mode policy named cve_2026_46333_pidfd_getfd that alerts when the sensor observes pidfd_getfd from a container.
  • Apply the boring hardening automatically: Juliet's mutation policies can set pod-level RuntimeDefault, set allowPrivilegeEscalation: false, and add capabilities.drop: ["ALL"] where teams choose mutation instead of rejection.

Runtime note: pidfd_getfd detection is a tripwire for the known public exploitation shape, not the primary mitigation. The blocking control we validated is admission plus seccomp and NoNewPrivs, followed by patched node kernels.

Want this checked in your cluster? Start Juliet free, connect a non-production cluster first, and inspect workload seccomp posture, namespace policy, and node facts. If you want us to run the review with you, request a CVE-2026-46333 Kubernetes exposure review.

What defenders should do

Start with effective seccomp, not manifest intent.

Do not only search for:

seccompProfile:
  type: Unconfined

That misses the EKS/Bottlerocket result that mattered most: pods where seccomp was unset and the effective runtime state was Seccomp: 0.
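
A manifest check that treats unset the same as Unconfined might look like the following. This is a minimal Python sketch over a pod spec already loaded as a dict; the function names are hypothetical, and it relies only on the Kubernetes rule that a container-level seccompProfile overrides the pod-level one. It reports declared intent only, so an unset result still needs to be confirmed against the effective runtime state on the node.

```python
def _profile_type(security_context):
    """Return seccompProfile.type from a securityContext dict, or None when absent."""
    profile = (security_context or {}).get("seccompProfile") or {}
    return profile.get("type")

def declared_seccomp(pod_spec: dict) -> dict:
    """Map each container name to its declared seccomp type (None when unset)."""
    pod_level = _profile_type(pod_spec.get("securityContext"))
    result = {}
    for group in ("initContainers", "containers", "ephemeralContainers"):
        for container in pod_spec.get(group) or []:
            own = _profile_type(container.get("securityContext"))
            # Container-level setting wins; otherwise inherit the pod-level one.
            result[container["name"]] = own or pod_level
    return result

def containers_needing_review(pod_spec: dict) -> list:
    """Names of containers whose seccomp is unset or explicitly Unconfined."""
    return sorted(
        name
        for name, profile_type in declared_seccomp(pod_spec).items()
        if profile_type in (None, "Unconfined")
    )
```

For a spec with a pod-level RuntimeDefault and one container that overrides it with Unconfined, only the overriding container is returned; for a spec with no seccompProfile anywhere, every container is returned.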

For untrusted workloads:

  • enforce RuntimeDefault or a tested Localhost seccomp profile;
  • prefer PSS Restricted where workloads can tolerate it;
  • set allowPrivilegeEscalation: false;
  • drop all capabilities;
  • avoid hostPID except for tightly controlled node agents;
  • separate build, plugin, CI, and customer-controlled workloads from sensitive workloads;
  • validate effective runtime state from inside representative pods, not just YAML.
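
One way to validate effective runtime state from inside a pod is to read the same status fields quoted in the results above. The sketch below is a hypothetical Python helper that assumes a Linux /proc/self/status; the field names Seccomp, NoNewPrivs, and CapBnd are the kernel's, while the function names and the returned dict layout are illustrative only.

```python
SECCOMP_MODES = {"0": "disabled", "1": "strict", "2": "filter"}

def parse_status(text: str) -> dict:
    """Parse 'Key:<whitespace>value' lines from /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def runtime_posture(status_text: str) -> dict:
    """Summarize the fields that mattered in the lab results."""
    f = parse_status(status_text)
    caps = f.get("CapBnd")
    return {
        "seccomp": SECCOMP_MODES.get(f.get("Seccomp"), "unknown"),
        "no_new_privs": f.get("NoNewPrivs") == "1",
        # CapBnd is a hex bitmask; all zeros means the bounding set is empty.
        "bounding_caps_empty": (int(caps, 16) == 0) if caps is not None else None,
    }

def posture_of_current_process(path: str = "/proc/self/status") -> dict:
    """Read the calling process's own status file (Linux only)."""
    with open(path) as fh:
        return runtime_posture(fh.read())
```

Calling posture_of_current_process() from a representative pod and comparing the result with what the manifest declares is the kind of check that would surface the unset-seccomp case, where the YAML says nothing and the runtime state is Seccomp: 0.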

For nodes:

  • track vendor kernel status for CVE-2026-46333;
  • verify the running node image and kernel package, not only the advisory page;
  • check whether your runtime's default seccomp profile denies pidfd_getfd;
  • compare node families separately, especially EKS Auto Mode, Bottlerocket, AL2023, Ubuntu, Debian, Talos, and custom AMIs;
  • understand whether ptrace hardening such as Yama ptrace_scope is present and effective for your node OS.
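
Checking the Yama setting on a node can be as simple as reading one sysctl file. The Python sketch below is illustrative: the path and the four level meanings follow the kernel's Yama documentation, and the function names are hypothetical. It only describes the configured level; in our labs theft reproduced at ptrace_scope 1 and did not at 2, which is an observation about those nodes rather than a guarantee.

```python
from pathlib import Path

YAMA_LEVELS = {
    0: "classic ptrace permissions",
    1: "restricted ptrace",
    2: "admin-only attach",
    3: "no attach",
}

def describe_ptrace_scope(raw: str) -> str:
    """Translate the raw sysctl value into the Yama level name."""
    try:
        level = int(raw.strip())
    except ValueError:
        return "unrecognized value"
    return YAMA_LEVELS.get(level, "unrecognized value")

def read_ptrace_scope(path: str = "/proc/sys/kernel/yama/ptrace_scope"):
    """Return the level description, or None when the file is absent (Yama not available)."""
    sysctl = Path(path)
    if not sysctl.exists():
        return None
    return describe_ptrace_scope(sysctl.read_text())
```

A None result is worth recording separately from level 0, because it means the node exposes no Yama control at all.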

Patch state is still the durable fix. Kubernetes policy can reduce reachability and exploitability, but it should not become a substitute for patched node kernels.

What this does not prove

This post does not prove:

  • host root;
  • container escape;
  • Kubernetes Secret access;
  • node persistence;
  • theft of real host files;
  • that every EKS or Bottlerocket node is exploitable;
  • that every Talos cluster blocks the path;
  • that ptrace_scope alone is a complete mitigation;
  • that PSS Baseline is useless.

It also does not publish exploit code or reproduction commands. The point is the Kubernetes control result: ordinary unset-seccomp pods could reproduce controlled fd theft in our EKS and kind labs, while RuntimeDefault and Restricted changed the outcome.

Bottom line

CVE-2026-46333 is not just a Linux workstation story. In our EKS Auto Mode Bottlerocket lab, an ordinary non-root pod with unset seccomp stole a controlled root-opened fd from a purpose-built helper. The same pod with RuntimeDefault could not call pidfd_getfd. The same workload under Restricted could not call pidfd_getfd and could not get the helper to open the file in the first place.

The practical Kubernetes lesson is blunt: PSS Baseline is not enough when unset seccomp maps to Seccomp: 0. For untrusted workloads, enforce RuntimeDefault or stronger seccomp and use Restricted-style settings where possible.

