Saturday, 5 January 2013

MinGW Linker Patch

MinGW Linker Deficiency

In MinGW, the standard linker, ld.exe, does not include a working option to strip unused code from the final executable.

Demonstration

To demonstrate the deficiency, consider the following two files:

File listing 1: mymult.c

 #include "standard_headers.h"  
 int mymult(int a, int b);  
 int mymult(int a, int b) { return a*b; }

Compile into an object file with:
gcc -fdata-sections -ffunction-sections -c mymult.c

File listing 2: trymult.c

 #include "standard_headers.h"  
 int mymult(int a, int b);  
 int main() {
     /* mymult is declared above but deliberately never called */
     printf("The result of mymult is %d\n", 72);
     return EXIT_SUCCESS;
 }

Compile it and link it with the above object file into the final executable with:
gcc -o trymult trymult.c mymult.o -Wl,--gc-sections

Examining the resulting executable with objdump -g reveals that it does contain the code for the unused function "mymult". The -fdata-sections and -ffunction-sections compiler flags, combined with the --gc-sections linker option, are supposed to detect this case and remove the unused function. To make that work, the MinGW linker needs to be patched.

Note

Repeating the above procedure with mymult.o placed in its own archive file (libmymult.a, for example) and linking with -lmymult does leave "mymult" out of the executable, even without the patch described below. The linker only extracts an archive member when it resolves an undefined symbol, and nothing references mymult.o, so the whole object file is skipped. In other cases, however, unused code is only removed with the patched linker. This probably happens when the code to be removed sits in the same object file as other code that is still needed. In other words, the unpatched toolchain probably excludes code only at the object-file level, not at the function level.
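
The same-object-file case can be sketched as follows. The file name mixed.c and both function names are hypothetical, chosen only for illustration. If a program calls used_mult, the whole of mixed.o is linked in, archive or not; only a working --gc-sections, given an object built with -ffunction-sections, would be able to drop unused_add on its own.

```c
/* mixed.c: hypothetical example, not part of the original test files.
 * Both functions end up in the same object file, mixed.o. */

int used_mult(int a, int b);
int unused_add(int a, int b);

/* Called by the main program, so mixed.o must be linked in. */
int used_mult(int a, int b)
{
    return a * b;
}

/* Never called. Linking at object-file granularity still keeps it,
 * because it lives in the same object file as used_mult. */
int unused_add(int a, int b)
{
    return a + b;
}
```

With -ffunction-sections each of the two functions is placed in its own section, which is what gives the linker something smaller than a whole object file to discard.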

Patching the MinGW linker HOWTO

1. Download patch files from website

Web: http://sourceware.org/bugzilla/show_bug.cgi?id=11539

2. Name the patch files as follows:

patch-001-bfg-coff-gc.diff
patch-002-binutils-2.21.1.coff-gc.patch

3. Download the binutils-2.20 source from the MinGW website

Use version 2.21.1 if available, since the second patch file was made against it.

4. Patch binutils with the patches from step 2

Fix any failed hunks by hand.

5. Adjust Makefile configuration of binutils

Find the places where the option "-Werror" is generated in the Makefiles and remove it.
Removing this option is needed to compile successfully with newer versions of GCC.

6. Compile

Issue the command `make' from the root of the binutils build directory.

7. Copy resulting file ld-new.exe to somewhere convenient

mkdir "%USERPROFILE%/bin/ldneo"
cp -v BUILD_ROOT/ld/.libs/ld-new.exe "%USERPROFILE%/bin/ldneo/ld.exe"

8. Create the file env-ldneo.bat in a directory on your PATH

@echo off
set CFLAGS=-B"%USERPROFILE%/bin/ldneo"

9. Now, run env-ldneo.bat when you want to compile with the modified linker

env-ldneo
gcc -fdata-sections -ffunction-sections -c module1.c
gcc -fdata-sections -ffunction-sections -c module2.c
ar r libmymodules.a module1.o module2.o
gcc -L. %CFLAGS% -o main main.c -lmymodules -Wl,--gc-sections

How does this make a difference?

The table below shows the size of a test program, testaddch.c, linked against pdcurses built in various ways.

Version  Size (bytes)  Comment
-------  ------------  -------
v001     70670         standard pdcurses, standard linker
v002     62464         standard pdcurses, patched linker called with -Wl,--gc-sections
v003     53760         pdcurses compiled with -fdata-sections -ffunction-sections, patched linker called as above
v004     52224         pdcurses compiled as above plus -Os, patched linker called as above

From v003 it can be seen that most of the space savings come from combining the two -f options while building the library with -Wl,--gc-sections when linking against the static library (16910 bytes saved, about 24% smaller than v001). Using the -Os option on the program and/or library saves a little more (another 1536 bytes in v004), but the gain is not significant.
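
The figures can be rechecked with a few lines of C. This is only a sketch of the arithmetic; the helper names bytes_saved and percent_saved are made up for illustration, and the sizes to pass in are the ones from the table above.

```c
/* Size comparison helpers (hypothetical names, for illustration only). */

long bytes_saved(long baseline, long size);
double percent_saved(long baseline, long size);

/* Bytes saved relative to the baseline build. */
long bytes_saved(long baseline, long size)
{
    return baseline - size;
}

/* Saving expressed as a percentage of the baseline size. */
double percent_saved(long baseline, long size)
{
    return 100.0 * (double)(baseline - size) / (double)baseline;
}
```

For example, bytes_saved(70670, 53760) gives the 16910 bytes quoted for v003, and percent_saved(70670, 53760) comes out just under 24.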

UPDATE

The advantage shown above is probably most significant when the library being linked is very large. Repeating the above procedure with a small library such as tiny-rex (a small regular-expression parsing library) actually results in a LARGER overall executable.

References

http://sourceware.org/bugzilla/show_bug.cgi?id=11539
http://stackoverflow.com/questions/6687630/c-c-gcc-ld-remove-unused-symbols
