Linux security in 2011, or my LKML's yearly digest
Linux security in 2011, or LKML’s yearly digest
Disclaimer: I have nothing to do with the following, all credits go
to their respective authors. I’m just publishing my 2011’s bookmarks
about Linux kernel security with a one line summary based on my
(possibly wrong) understanding
Do not hesitate to correct me (gently if possible :)) in comments or
mail.
- False boundaries of (certain) capabilities: Brad Spengler describes 19 capabilities (of 35) which can be used to regain full privileges. Coincidentally, Vasily Kulikov discovered a “funny” behavior of CAP_NET_ADMIN which permit to load any modules available in /lib/modules/ instead of limiting to network related modules only, AFAIK, this vulnerability was closed but the fix got reverted some weeks later because of some userspace breakages.
- PaX team introduced a new range of
stuff using the new GCC plugin
infrastructure. At compile-time,
pro-active code is automatically added to potentially dangerous
paths:
constify_plugin.c
enforces read-onlintroduces new constraints (__do_const
and__no_const
) enforcing read-only permissions at compilation-time and run-time. PaX then makes usage of these new constraints by patching most of the “ops structures”. The plugin also automatically protects structures where all members are function pointers, this patching on-the-fly is required because patching directly the source kernel would never be integrated upstream.stackleak_plugin.c
adds instrumentation code beforealloca()
calls. This code checks that stack-frame size does not overlap with kernel task size. It circumvents techniques described in “Large memory management vulnerabilities” by Gaël Delalleau (2005) and “The stack is back” by Jon Oberheide (2012).- GCC 4.6 introduced named address
spaces.
It was initially specified for embedded processors but PaX team
uses this feature to represent user and kernel space.
checker_plugin.c
thus introduces__user
,__kernel
and__iomem
namespaces to spot non-legit flows between address spaces. kallocstat_plugin.c
produces statistics about the size given in parameter to various memory allocation functionskernexec_plugin.c
enforces non-executable pages like theKERNEXEC
PaX feature, but without huge performance impact on AMD64.
- pagexec also managed to compile Linux Kernel with clang by patching both Linux and clang. Now that gcc integrated plugins, it is less interesting but llvm was the solely compiler with easy access to its internal structure, allowing external applications to perform static analysis…
- A user space interface to kernel Crypto-API was submitted to kernel developers, an interesting use-case was to offer a way to deport key material between processes. Imagine process A in possession of private keys and another one, B, actually performing encryption / decryption stuff part. The idea was to initialize a “crypto socket” in A and pass this file descriptor to B (via a classic ancillary message).
- Pseudo-files in
/proc/<pid>/
have a different security model than “normal” files because of its ephemeral nature: checks need to happen during each system call and not atopen()
time because permissions can change at anytime. Halfdog discovered (and Kees Cook reported it to LKML) that not all files were protected accordingly. If a program opens/proc/self/auxv
and keeps this file descriptor opened. Then, even after aexecve()
of a setuid binary, the file descriptor would still be available, leaking information! Fixing this vulnerability has been a long road and a pretty solution came up with the introduction ofrevoke()
, a new syscall invalidating file descriptors. Unfortunately, the thread didn’t survive and ideas were lost… (by the way, it is funny that this kind of problem resuscitated in CVE-2012-0056 lately…) - As one goes along,
execve()
became almost magical, it had to support Set-User-Id, capabilities, and file capabilities. Each feature added complexity and different legacy behaviors to maintain. Instead of dropping these POSIX features, OpenWall 3.0 took a different approach by removing Suid binaries from its base install, thus preventing execve’s voodoo. This change is just a line in Owl’s changelog but is in fact a major achievement: it required them to re-architecture important software like crontab or user management tools./bin/ping
is setuid-root because it opens a raw socket and injects its packet on the wire directly. A new socket type,PROT_ICMP
, was developed by Openwall team, it makes possible to send ICMP Echo messages without special privileges (caller’s GID has to be included in a range stored in a sysctl key). It is interesting to note that only replies (based on ICMP identifier field) are sent to userspace, not the whole ICMP traffic like in Mac OS X. - TCP Initial Sequence number is now a 32-bits random number using MD5. ISN was previously the concatenation of 24 random bits (MD4 of TCP end points with a secret rekeyed every 5 minutes) and an 8 bits counter (number of times secret key was regenerated)
- Vasilily tried to push upstream additional checks for
copy_{to,from}_user()
(by checking if requested size fits boundaries fixed at compile time), this patch was a cut down version ofPAX_USERCOPY
but was NACKed by Linus asking him for more “balance and sanity”. However, he didn’t reject the idea itself, saying that a cleaner version might be accepted…