[Gammaray-interest] Impact of "-Bsymbolic" linker flag (summary)

Sat Jan 14 16:21:09 CET 2012

Thanks, very interesting indeed.

While we are at it, someone mentioned you found a way to check whether the 
qt_* hooks are affected by -Bsymbolic-functions or not. Could you maybe add 
that to the wiki or post it here? Ideally we can implement it in the preload 
injector self-test then.

regards,
Volker

On Saturday 14 January 2012 13:17:03 Kevin Funk wrote:
> Hey GammaRayers,
> 
> In case anyone is interested: Thiago just posted a nice explanation of the
> intentions of ld's -Bsymbolic flag to the Qt development mailing list.
> (Actually, I wanted to write a similar mail this weekend, but then came
> Thiago's post - so let's just re-use that one).
> 
> ***
> In case you're unaware (and to explain what this has to do with GammaRay):
> GammaRay uses (amongst other methods) preloading to overwrite specific
> function calls in QtCore to hook into its target application. If the QtCore
> library is linked with "-Bsymbolic", preloading will break. A *very*
> detailed and interesting explanation of this can be found here [1].
> ***
> 
> Back to Thiago's article: The first part roughly explains the behavior of
> "-Bsymbolic" and what it does, the second part is about Position Independent
> Code (PIC) and why it negatively affects library code in Qt (less
> interesting to us, admittedly, still interesting).
> 
> Especially the background part gives a nice overview on how GOT/PLT play
> together in detail.
> 
> (Forwarded mail is attached inline.)
> 
> Greets
> 
> [1] http://www.technovelty.org/code/c/bsymbolic.html
> 
> ----------  Forwarded message  ----------
> 
> Subject: [Development] Serious ABI issue discovered in -reduce-relocations
> Date: Friday, 13 January 2012, 12:33:46
> From: Thiago Macieira <thiago.macieira at intel.com>
> To: development at qt-project.org
> 
> Hello
> 
> We've got a problem with -reduce-relocations. tl;dr: it's a broken concept
> and we either add a permanent workaround or we stop using it. The permanent
> workaround is to compile all executables in PIC/PIE mode.
> 
> Long story:
> The -reduce-relocations option in configure checks that the compiler
> supports the linker flag -Bsymbolic-functions. That flag was added to
> binutils in 2006 at our urging, to make it possible for us to use it when
> the -Bsymbolic option presented problems. Turns out that
> -Bsymbolic-functions has the same problems that -Bsymbolic had and is no
> fix.
> 
> Those two options cause the linker to "symbolic link" some symbols into the
> binary it's producing. That is, if a symbol X is used and is also defined
> inside this ELF module, then this option tells the linker that it may
> rightly assume that the symbol will always be inside this module. The
> linker will then use cheaper types of relocation, or none at all. This is a
> huge performance improvement both at load- and at run-time.
> 
> -Bsymbolic does it for everything, whereas -Bsymbolic-functions does it for
> functions only.
> 
> The reason why we needed -Bsymbolic-functions in the first place is that ELF
> has a weird feature that causes data variables to move between modules.
> Functions weren't affected because they aren't moved.
> 
> Turns out that there is one situation in which a function is treated as
> data: when you take its address. In order to compare equally, the dynamic
> linker must resolve the function address to only one place, and
> unfortunately for us, the choice isn't to our liking. The "canonical"
> address may be moved from the library.
> 
> We haven't hit this problem before because we hadn't been doing function
> pointer comparisons. Now, with Olivier's "new connection syntax" patch, we
> are.
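
To illustrate the kind of function pointer comparison described above, here
is a minimal sketch in C. All names (slot_fn, on_clicked, on_closed,
find_connection) are hypothetical and not taken from Qt. Everything lives in
one module, so it cannot reproduce the cross-module failure itself; it only
shows the lookup that silently depends on a function having one canonical
address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot type and slots; these names are not from Qt. */
typedef void (*slot_fn)(void);

static int clicked_count = 0;
static int closed_count = 0;
static void on_clicked(void) { clicked_count++; }
static void on_closed(void)  { closed_count++; }

struct connection { slot_fn slot; };

/* Return the index of the entry whose slot pointer equals `wanted`, or -1.
 * The == below is only reliable if every module sees the same address
 * for the same function. */
int find_connection(const struct connection *table, size_t n, slot_fn wanted)
{
    for (size_t i = 0; i < n; ++i)
        if (table[i].slot == wanted)
            return (int)i;
    return -1;
}

/* Build a two-entry table and look up on_closed by its address. */
int demo_lookup(void)
{
    struct connection table[] = { { on_clicked }, { on_closed } };
    int idx = find_connection(table, 2, on_closed);
    assert(idx == 1);
    return idx;
}
```

If the table were filled by the executable and searched by the library (or
the other way round), and the two disagreed on the address of on_closed, the
lookup would return -1 even though the connection exists.
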
> 
> The possible workaround is to tell the compiler and linker that even
> executables are position-independent. This causes the linker to stop using
> copy/move relocations because it doesn't need them. However, the use of
> PIC may have a non-trivial performance impact on applications, due to
> indirect variable accesses and loss of one register.
> 
> Regardless of whether I manage to convince the linker people to improve the
> situation, we need to figure out a solution for existing systems. What shall
> we do?
> 
> 
> Even longer story (background):
> 
> In code that isn't position-independent (i.e., the executable), a data
> access is done as:
>         movl    variable, %eax
> 
> And a function call as:
>         call    function
> 
> And the loading of a function address as:
>         movl    $function, %edi
> 
> 
> When linking this program, the linker needs to write the address of the
> variable "variable" and of the function "function" into the instructions
> (one is absolute and the other relative, but that's irrelevant). If both
> symbols are found in a shared library, then the linker will "patch up"
> differently.
> 
> For the function, it will make the "call" instruction call to a stub called
> the Procedure Linkage Table (PLT), which then loads the proper address from
> somewhere and then jumps to the proper address. That somewhere is another
> structure called the Global Offset Table, which the dynamic linker will fill
> with the actual function address once the library has been loaded.
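
The PLT/GOT indirection described above can be modelled in plain C. This is
only a toy model under stated assumptions: got_slot, plt_stub and
fake_dynamic_linker_bind are made-up names, and a real PLT stub is a few
machine instructions emitted by the linker, not a C function.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_ptr)(int);

/* Stand-in for a function that lives in a shared library. */
static int library_function(int x) { return x * 2; }

/* One-slot "GOT": empty until the pretend dynamic linker fills it. */
static func_ptr got_slot = NULL;

/* "PLT stub": the fixed call target inside the executable. It loads the
 * real address from the GOT slot and transfers control there. */
int plt_stub(int x)
{
    assert(got_slot != NULL);
    return got_slot(x);
}

/* Pretend dynamic linker: writes the real address into the GOT slot
 * once the library has been loaded. */
void fake_dynamic_linker_bind(void)
{
    got_slot = library_function;
}

/* Bind the slot, then call through the stub. */
int demo_plt_call(void)
{
    fake_dynamic_linker_bind();
    return plt_stub(21);
}
```

The call site never changes: only the table entry does. That is also why
preloading works at all, since whoever fills the slot decides which
definition gets called.
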
> 
> For the variable, things get complicated. There's no way to do the PLT
> trick. So what the linker does instead is add a "copy relocation". It
> writes the name of the variable and its expected size and reserves that
> much in the executable. The dynamic linker will then, at load time, find
> the variable in the shared library, copy the contents and then tell the
> library it should instead find the variable in the executable's memory.
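
The copy relocation steps above can be sketched the same way. Again a toy
model, not the dynamic linker's actual code: all identifiers are
hypothetical, and the library's "lookup" is modelled as a plain pointer.

```c
#include <assert.h>
#include <string.h>

/* "Library" side: the original definition with its initial value, plus
 * the address the library uses to find the variable. */
static int lib_variable_storage = 7;
static int *lib_variable_address = &lib_variable_storage;

/* "Executable" side: space reserved at link time for the copy. */
static int exe_variable_copy;

/* Pretend dynamic linker: copy the contents, then redirect the library
 * to the executable's memory. */
void fake_copy_relocation(void)
{
    memcpy(&exe_variable_copy, &lib_variable_storage,
           sizeof exe_variable_copy);
    lib_variable_address = &exe_variable_copy;
}

/* Library code writes through its redirected lookup. */
void library_set(int v) { *lib_variable_address = v; }

/* Executable code reads its own copy directly. */
int executable_get(void) { return exe_variable_copy; }

/* Run the relocation, then check both sides share one variable. */
int demo_copy_relocation(void)
{
    fake_copy_relocation();
    assert(executable_get() == 7);   /* initial value was copied */
    library_set(99);
    return executable_get();         /* executable sees the library's write */
}
```

If the library were instead bound to its own definition, as this thread
describes for -Bsymbolic, library_set would keep writing to
lib_variable_storage and the two sides would no longer agree.
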
> 
> When using position-independent code options (-fPIC and -fPIE), things
> change. The compiler will write for the function call:
>         call    function@PLT
> 
> The loading of a function address is:
>         movq    function@GOTPCREL(%rip), %rdi
> 
> As for the variable, it produces:
>         movq    variable@GOTPCREL(%rip), %rax
>         movl    (%rax), %eax
> 
> All accesses are position-independent and indirect. The call is placed via
> the PLT, addresses are loaded from the GOT and the loading of values is
> done after the actual address is loaded from the GOT.
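
For anyone who wants to look at these three access patterns on their own
machine, a small C file along the following lines should be enough. The
wrapper names (read_variable, call_function, take_address, demo_accesses)
are made up for illustration, and the exact instructions emitted depend on
compiler, version, flags and architecture, so the output may not match the
listings above.

```c
#include <assert.h>

/* Exported symbols with default visibility, as in a library. */
int variable = 42;
int function(void) { return variable + 1; }

typedef int (*function_ptr)(void);

/* Data access. */
int read_variable(void) { return variable; }

/* Function call. */
int call_function(void) { return function(); }

/* Loading of a function address. */
function_ptr take_address(void) { return function; }

/* Exercise all three access kinds once. */
int demo_accesses(void)
{
    assert(read_variable() == 42);
    assert(take_address() == function);
    return call_function();
}
```

Compiling this once with "gcc -S" and once with "gcc -S -fPIC" and comparing
the generated assembly for the three wrappers shows the difference. Note
that some toolchains build position-independent executables by default, so
the first variant may already use indirect accesses.
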
> 
> This is suitable for accessing symbols defined in other ELF modules. It's
> also necessary for library code.
> 
> Unfortunately, the side-effect is that access to symbols defined in the
> current ELF module is also done indirectly. Two options help change this:
> -fvisibility=hidden and the -Bsymbolic options.
> 
> The -fvisibility=hidden option is enabled by default in Qt since 4.0 and
> corresponds to the configure option -reduce-exports. It does not change the
> code above, so it means that all variable accesses to variables not defined
> in the same compilation unit are indirect. Fortunately for the function
> call, the linker realises that the target is inside the library and cannot
> be anywhere else, so the call now goes directly to the function. The
> loading of the address is via the GOT, which means a run-time relocation is
> still necessary, when the most efficient solution would be to use the "load
> effective address" instruction with no relocation.
> 
> The -Bsymbolic and -Bsymbolic-functions options produce the same effect,
> with the difference that the symbol is left in the ELF export table (i.e.,
> "default" visibility).
> 
> The consequences of all of this are:
>  1) there's absolutely no way to get the most efficient code in libraries,
> period. ELF is optimised for executable code, not library.
>  2) -Bsymbolic is a broken concept so long as copy relocations remain in
> use.
>  3) -Bsymbolic-functions is either the same broken concept or a broken
> implementation. It might be possible to salvage the option by making the
> linker optimise the PLT calls like it does today, but keep the GOT
> references as public.
>  4) calling a function via a function pointer is inefficient because of an
> indirect jump. If that function's address was taken in the executable, it's
> doubly inefficient: the indirect jump you make resolves to another indirect
> jump.
> 
> The only architecture not affected by this is IA-64. One reason is that the
> IA-64 ABI mandates that executables also be PIC, so the original problem is
> gone: there are no copy relocations. What's more, Intel engineers realised
> the problem of the indirect loading of data and invented a special
> relocation that the linker is allowed to relax into simpler code. If the
> symbol is found, at link-time, to be in the same ELF module, the linker
> relaxes the "load" generated by the compiler into a "move" between
> registers.
> 
> It's possible to apply the same lessons learned to other platforms, but it
> hasn't been done.
-- 
Volker Krause | volker.krause at kdab.com | Senior Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel. Germany +49-30-521325470, Sweden (HQ) +46-563-540090
KDAB - Qt Experts - Platform-independent software solutions