Copy Fail: AF_ALG + splice page-cache overwrite (CVE-2026-31431)

This page documents Copy Fail, a Linux kernel local privilege escalation. AF_ALG combined with splice() lets the page-cache pages of a readable file become part of a writable AEAD destination scatterlist, and authencesn then performs a deterministic 4-byte write past the end of its contractual output.

Affected component: crypto/algif_aead.c in-place decrypt path + crypto/authencesn.c
Primitive: controlled 4-byte page-cache write into any file readable by the attacker
Reachability: unprivileged local user, AF_ALG available, algif_aead loaded
Impact: immediate system-wide corruption of the page-cache copy used by read(), mmap(), and execve()

In effect this resembles Dirty Pipe and Dirty COW (a write into page cache that the attacker may only read), but unlike Dirty COW or a classic memory-corruption race it is fully deterministic:

no race window
no repeated retries
no on-disk file modification
same exploit flow across many distros because the primitive is structural, not offset-dependent

Core idea

splice() moves data between two file descriptors, one of which must be a pipe, by passing page references instead of copying. If a readable file is spliced into a pipe and then from the pipe into an AF_ALG AEAD socket, the crypto input scatterlist can reference the same page-cache pages that back the file.

For AEAD decrypt, algif_aead historically optimized the request into an in-place layout:

AAD and ciphertext were copied into the user RX buffer
the final authentication tag was not copied
instead, tag scatterlist entries were appended to the destination with sg_chain()
req->src = req->dst, so those appended tag pages became part of a writable destination chain

If the tag pages come from spliced file data, the writable destination chain now includes page-cache pages of a read-only file.

The bug in `authencesn`

authencesn is an AEAD wrapper used for IPsec Extended Sequence Numbers (ESN). During decrypt it uses the destination scatterlist as scratch space and writes 4 bytes past the legitimate decrypt output:

scatterwalk_map_and_copy(tmp, dst, 0, 8, 0);
scatterwalk_map_and_copy(tmp, dst, 4, 4, 1);
scatterwalk_map_and_copy(tmp + 1, dst, assoclen + cryptlen, 4, 1);

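The last write stores the second 32-bit AAD word (attacker-controlled AAD bytes 4..7, which the ESN layout uses for the high-order sequence bits) at dst[assoclen + cryptlen]. In the decrypt path cryptlen has already had the tag length subtracted at this point, so the offset is the first byte after the plaintext: it falls in the tag region and therefore outside the contract for AEAD decrypt output.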

When algif_aead has chained page-cache-backed tag pages into dst, that write crosses out of the RX buffer and lands in the victim file's page cache.
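The offset arithmetic can be illustrated with a toy model in plain Python. This is not kernel code: the scatterlist is modelled as a list of bytearray segments, sg_write is a hypothetical stand-in for scatterwalk_map_and_copy in write mode, and the sizes are arbitrary. It also assumes the third write's offset equals the end of the RX buffer, that is, AAD length plus plaintext length.

```python
def sg_write(chain, offset, data):
    """Copy data into a chain of bytearray segments at a logical offset."""
    data = bytes(data)
    for seg in chain:
        if offset >= len(seg):
            offset -= len(seg)          # target lies in a later segment
            continue
        n = min(len(seg) - offset, len(data))
        seg[offset:offset + n] = data[:n]
        data, offset = data[n:], 0      # continue at the start of the next segment
        if not data:
            return
    raise IndexError("write runs past the end of the chain")


ASSOCLEN, PLAINLEN, TAGLEN = 8, 16, 16  # arbitrary sizes for the model

aad = b"\x00\x00\x00\x01" + b"WXYZ"     # bytes 4..7 are the value that ends up written
rx_buf = bytearray(aad + b"C" * PLAINLEN)   # user RX buffer: AAD + ciphertext
file_page = bytearray(b"T" * TAGLEN)        # stands in for a file-backed tag page
dst = [rx_buf, file_page]                   # tag segment chained onto the destination

tmp = bytes(rx_buf[0:8])                     # models: copy 8 bytes out of dst at offset 0
sg_write(dst, 4, tmp[0:4])                   # models: write tmp[0] at offset 4
sg_write(dst, ASSOCLEN + PLAINLEN, tmp[4:8]) # models: write tmp[1] past the RX buffer

print(bytes(file_page))
```

In this model the first two operations stay inside rx_buf, while the third lands in the chained segment, which is the part that represents memory the caller never owned.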

Why this becomes a useful primitive

The attacker controls:

Which file: any file readable by the attacker
Which offset: via splice offset/length and AEAD assoclen
Which value: the 4 bytes written come from attacker-controlled AAD bytes 4..7

Even if authentication fails and recvmsg() returns an error, the page-cache overwrite persists because the scratch write already happened.

The corrupted page is not marked dirty for writeback, so:

the on-disk file remains unchanged
integrity checks that read the on-disk blocks directly (for example offline image scans) miss the attack
all later read(), mmap(), and execve() users consume the modified in-memory page
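Since the modified page is never written back, one defensive check is to hash a file twice: once through the page cache and once with O_DIRECT, which asks the kernel to read from storage. The following is a minimal sketch, assuming Linux and a filesystem that honours O_DIRECT for reads (tmpfs and some stacked filesystems may refuse it or fall back to buffered I/O). The function names are illustrative, not from any existing tool.

```python
import hashlib
import mmap
import os

BLOCK = 1 << 20  # 1 MiB: a multiple of common logical block sizes


def sha256_cached(path):
    """Hash the file as ordinary readers see it (through the page cache)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(BLOCK), b""):
            h.update(chunk)
    return h.hexdigest()


def sha256_direct(path):
    """Hash the file using O_DIRECT reads into a page-aligned buffer."""
    h = hashlib.sha256()
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    with os.fdopen(fd, "rb", buffering=0) as f, mmap.mmap(-1, BLOCK) as buf:
        size = os.fstat(f.fileno()).st_size
        done = 0
        while done < size:              # stop at EOF without issuing an unaligned read
            n = f.readinto(buf)
            if not n:
                break
            h.update(buf[:n])
            done += n
    return h.hexdigest()


def cache_matches_disk(path):
    """Return True when the cached view and the direct view hash identically."""
    return sha256_cached(path) == sha256_direct(path)
```

A mismatch on a file that nobody is writing to is worth investigating, but treat the result as a heuristic: a match does not prove the cache was never altered, because pages can be evicted and re-read from disk.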

Typical LPE path

The public write-up targets a setuid-root binary such as /usr/bin/su:

Open AF_ALG and bind to authencesn(hmac(sha256),cbc(aes))
Send AAD where bytes 4..7 contain the 4-byte chunk to write
splice() target file data into the AEAD input so the final tag region references the target file's page-cache pages
Trigger recv() / recvmsg() to force decrypt
Repeat until the page-cache copy of the setuid binary is patched
Execute the binary so the kernel loads the modified cached image and runs attacker code as root

Conceptual PoC skeleton:

a = socket.socket(38, 5, 0)  # AF_ALG, SOCK_SEQPACKET
a.bind(("aead", "authencesn(hmac(sha256),cbc(aes))"))
# set key, accept request socket
u.sendmsg([b"A"*4 + payload_chunk], [cmsg_headers], MSG_MORE)
os.splice(target_fd, pipe_wr, offset)
os.splice(pipe_rd, alg_fd, offset)
u.recv(...)  # triggers decrypt -> page-cache write

How the bug became exploitable

2011: authencesn introduced for IPsec ESN handling (a5079d084f8b)
2015: authencesn converted to the new AEAD interface and kept the out-of-contract scratch write (104880a6b470)
2017: algif_aead switched decrypt to an in-place design and chained tag pages into the destination (72548b093ee3)

That 2017 change is what turned an internal scratch write into a page-cache write primitive reachable from unprivileged userspace.

Fix and mitigations

Mainline fixed this by returning algif_aead to out-of-place operation (a664bf3d603d): page-cache pages can still appear in the source scatterlist, but they no longer become part of the writable destination chain.

Useful mitigations:

patch to a kernel carrying a664bf3d603d or a distro backport
block AF_ALG socket creation with seccomp for untrusted workloads
disable algif_aead if you need an immediate stopgap

Example emergency mitigation:

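# Run as root. Make every future load of algif_aead fail, including autoload.
echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif-aead.conf
# Unload the module if it is loaded and unused; ignore the error otherwise.
rmmod algif_aead 2>/dev/null || true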
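To confirm the stopgap took effect, the loaded-module list can be checked. This is a small sketch that parses /proc/modules. The helper names are made up for illustration, and it cannot see a driver compiled into the kernel, which never appears in that file.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def algif_aead_loaded(path="/proc/modules"):
    """Return True if algif_aead is currently loaded as a module."""
    with open(path) as f:
        return "algif_aead" in loaded_modules(f.read())
```

On a host where the module is built in rather than loadable, this returns False even though the interface is present, so pair it with a kernel config check before relying on the result.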

For containerized environments, AF_ALG should be treated as a kernel attack surface. Even without this specific CVE, it is a good candidate for seccomp denial in CI runners, sandboxes, and multi-tenant containers.
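As a sketch of what such a denial could look like, the snippet below builds one seccomp rule in the JSON shape used by OCI-style runtime profiles: it matches socket() calls whose first argument is AF_ALG (38 on Linux) and returns EPERM. The helper name is hypothetical, and how the rule combines with rules already present in a profile depends on the runtime, so test it in place before relying on it.

```python
import errno
import json

AF_ALG = 38  # Linux address-family number for AF_ALG


def deny_af_alg_rule(errno_ret=errno.EPERM):
    """Build a seccomp rule entry that rejects socket(AF_ALG, ...)."""
    return {
        "names": ["socket"],
        "action": "SCMP_ACT_ERRNO",
        "errnoRet": errno_ret,
        "args": [
            # index 0 is the 'domain' argument of socket(2)
            {"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_af_alg_rule(), indent=2))
```

The printed object is meant to be added to the "syscalls" array of an existing profile rather than used as a complete profile by itself.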

Detection / review notes

A page-cache-only patch means the suspicious effect may be visible only in memory, not on disk.
Look for unusual AF_ALG use on systems that do not intentionally expose kernel crypto sockets to workloads.
When auditing zero-copy kernel interfaces, treat any path that combines splice()-backed page references with scatterlists reused as destinations as high risk.
A useful reviewer rule is: if an algorithm writes beyond its documented output length, any caller that chains foreign pages into dst may turn it into a write primitive.
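When reviewing a sandbox or container policy, a quick reachability probe can tell whether a workload is able to open kernel crypto sockets at all. A minimal sketch, to be run inside the context being audited; the function name is illustrative, and it only creates and closes a socket without binding to any algorithm.

```python
import socket


def af_alg_socket_allowed():
    """Return True if this process can create an AF_ALG socket."""
    family = getattr(socket, "AF_ALG", 38)  # 38 is AF_ALG on Linux
    try:
        s = socket.socket(family, socket.SOCK_SEQPACKET, 0)
    except OSError:
        # Denied by seccomp/LSM policy, or AF_ALG is not available in this kernel.
        return False
    s.close()
    return True


if __name__ == "__main__":
    print("AF_ALG reachable:", af_alg_socket_allowed())
```

A True result on a workload that has no reason to use kernel crypto sockets suggests the interface should be filtered for that workload.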