Mathias Krause
2011-10-14 13:00:05 UTC
Hi Roland,
I stumbled over some ancient changes you made back in 1996 that turned into
a "problem" in 2006 but went undetected until now. But let me tell the whole
story.
The required alignment of each of the two PT_LOAD entries of libc.so has been
2MiB on x86-64 for quite some time (since binutils version 2.18). Since the
first segment, the code segment, is only around 1.3MiB, this creates a hole in
the address space between the two segments when the library gets loaded.
_dl_map_object_from_fd() tries to handle this case, but it does so in an
unfortunate way that can waste a huge amount of address space.
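To put a number on such a hole, here is a minimal C sketch that computes the
gap between two loadable segments. The function names (page_floor, page_ceil,
hole_between) are hypothetical and not taken from glibc; the sketch only
assumes that the first segment ends at the next page boundary and that the
following segment starts at the page containing its address.

```c
#include <stdint.h>

/* Round x down to a multiple of page_size (page_size must be a power of two). */
uint64_t page_floor(uint64_t x, uint64_t page_size)
{
    return x & ~(page_size - 1);
}

/* Round x up to a multiple of page_size (page_size must be a power of two). */
uint64_t page_ceil(uint64_t x, uint64_t page_size)
{
    return page_floor(x + page_size - 1, page_size);
}

/* Size in bytes of the unused gap between the page-rounded end of the first
   segment and the page-rounded start of the next one.  Returns 0 when the
   segments are adjacent or overlap.  Hypothetical helper, for illustration. */
uint64_t hole_between(uint64_t first_vaddr, uint64_t first_memsz,
                      uint64_t next_vaddr, uint64_t page_size)
{
    uint64_t first_end = page_ceil(first_vaddr + first_memsz, page_size);
    uint64_t next_start = page_floor(next_vaddr, page_size);

    return next_start > first_end ? next_start - first_end : 0;
}
```

With made-up values (not read from a real libc.so), a first segment of
0x14c8a4 bytes at address 0, a second segment at 0x34d7a0 and 4KiB pages,
hole_between(0, 0x14c8a4, 0x34d7a0, 4096) yields 0x200000, i.e. a 2MiB hole.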
There are two problems with the current behaviour:
1/ _dl_map_object_from_fd() calls mmap() for the first segment with a size
argument that covers the whole address range the shared object will live in.
This might (and in the case of libc.so it actually does) create a mapping
that is larger than the file backing it, because the virtual address range
needed to cover all segments might be much larger than the actual file size.
2/ To cope with the possible hole created in 1/, _dl_map_object_from_fd()
tries to at least make the underlying address space behave as intended by
setting the protection of the memory hole to PROT_NONE. This, however, leaves
the mapping intact, so the hole still occupies virtual address space.
This behaviour allows mapping the data part of libc.so as executable code
with just a single call to mprotect() -- this will succeed even on systems
that enforce W^X on address space mappings. It also wastes address space,
which might not be much of a concern for legitimate ELF objects but will be
for hand-crafted ELF object files with small segments but huge gaps between
those segments.
Taking into account that the first mmap() call uses the size of the whole
range needed by the shared object to "reserve" address space for all
following segments, we might leave it this way to avoid races with other
threads calling mmap() while _dl_map_object_from_fd() sets up the mappings
for the other segments. But the mprotect() call is wrong and should be
replaced by a call to munmap(). Actually, it was munmap() until you changed
it to mprotect() back in 1996. So the question is: why mprotect(), and why
not munmap(), which seems to fit perfectly and would resolve the above
issues?
Regards,
Mathias