Cross compiler


A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is running. For example, a compiler that runs on a PC but generates code that runs on Android devices is a cross compiler.
A cross compiler is useful to compile code for multiple platforms from one development host. Direct compilation on the target platform might be infeasible, for example on embedded systems with limited computing resources.
Cross compilers are distinct from source-to-source compilers. A cross compiler generates machine code for a different platform, while a source-to-source compiler translates source code in one programming language into source code in another. Both are programming tools.

Use

The fundamental use of a cross compiler is to separate the build environment from the target environment. This is useful in several situations:
  • Embedded computers where a device has highly limited resources. For example, a microwave oven will have an extremely small computer to read its keypad and door sensor, provide output to a digital display and speaker, and control the microwave for cooking food. This computer is generally not powerful enough to run a compiler, a file system, or a development environment.
  • Compiling for multiple machines. For example, a company may wish to support several different versions of an operating system or to support several different operating systems. By using a cross compiler, a single build environment can be set up to compile for each of these targets.
  • Compiling on a server farm. Similar to compiling for multiple machines, a complicated build that involves many compile operations can be executed across any machine that is free, regardless of its underlying hardware or the operating system version that it is running.
  • Bootstrapping to a new platform. When developing software for a new platform, or the emulator of a future platform, one uses a cross compiler to compile necessary tools such as the operating system and a native compiler.
  • Compiling native code for emulators of older, now-obsolete platforms such as the Commodore 64 or Apple II, as enthusiasts do with cross compilers that run on a current platform.
Use of virtual machines resolves some of the reasons for which cross compilers were developed. The virtual machine paradigm allows the same compiler output to be used across multiple target systems, although this is not always ideal because virtual machines are often slower and the compiled program can only be run on computers with that virtual machine.
Typically the hardware architecture differs, but cross-compilation is also usable when only the operating system environment differs, as when compiling a FreeBSD program under Linux, or even just the system library, as when compiling programs with uClibc on a glibc host.

Canadian Cross

The Canadian Cross is a technique for building cross compilers for other machines, where the original machine is much slower or less convenient than the target. Given three machines A, B, and C, one uses machine A to build a cross compiler that runs on machine B to create executables for machine C. The technique is practical when, as in this example, machine A is slow but has a proprietary compiler, machine B is fast but has no compiler at all, and machine C is too slow to be practical for compilation.
When using the Canadian Cross with GCC, as in this example, four compilers may be involved across three build steps:
  • The proprietary native compiler for machine A is used to build the gcc native compiler for machine A.
  • The gcc native compiler for machine A is used to build the gcc cross compiler from machine A to machine B.
  • The gcc cross compiler from machine A to machine B is used to build the gcc cross compiler from machine B to machine C.
The end-result cross compiler will not be able to run on build machine A; instead it would run on machine B to compile an application into executable code that would then be copied to machine C and executed on machine C.
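The three build steps above can be sketched in terms of the --build, --host, and --target options that GNU-style configure scripts accept. This is a minimal sketch that only prints the command lines rather than running them. The function name canadian_cross_plan and the machine names passed to it are hypothetical placeholders, and a real GCC build would need further options and prerequisite tools.

```shell
#!/bin/sh
# Print (do not run) one configure line per Canadian Cross build step.
# The three arguments are placeholder system names for machines A, B, and C.
canadian_cross_plan() {
    a="$1"; b="$2"; c="$3"
    # Step 1: native compiler for A, built with A's existing compiler.
    echo "./configure --build=$a --host=$a --target=$a"
    # Step 2: cross compiler that runs on A and targets B.
    echo "./configure --build=$a --host=$a --target=$b"
    # Step 3: built on A with the A-to-B compiler; runs on B, targets C.
    echo "./configure --build=$a --host=$b --target=$c"
}

# Hypothetical machine names, for illustration only.
canadian_cross_plan machine-a machine-b machine-c
```

Only the last step has three different values, which is what distinguishes a Canadian Cross from an ordinary cross build.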
For instance, NetBSD provides a POSIX Unix shell script named build.sh which will first build its own toolchain with the host's compiler; this, in turn, will be used to build the cross compiler which will be used to build the whole system.
The term Canadian Cross came about because at the time that these issues were under discussion, Canada had three national political parties.

Timeline of early cross compilers

  • 1969 – The first version of UNIX was developed by Ken Thompson on a PDP-7, but due to the lack of tools and cost, it was cross-compiled on a GECOS system and transferred via paper tape. This demonstrated practical cross-compilation for OS development.
  • 1979 – ALGOL 68C generated ZCODE, which aided porting the compiler and other ALGOL 68 applications to alternate platforms. Compiling the ALGOL 68C compiler required about 120 KB of memory. The Z80's 64 KB of memory is too small to compile the compiler, so for the Z80 the compiler itself had to be cross compiled from the larger CAP capability computer or an IBM System/370 mainframe.
  • 1980s – Aztec C offered native and cross-compilation for home computers such as the Apple II and Commodore 64.

GCC and cross compilation

GCC, a free software collection of compilers, can be set up to cross compile. It supports many platforms and languages.
GCC requires that a compiled copy of binutils is available for each targeted platform. Especially important is the GNU Assembler. Therefore, binutils first has to be compiled correctly with the switch --target=some-target passed to the configure script. GCC also has to be configured with the same --target option. GCC can then be run normally provided that the tools that binutils creates are available in the path, which can be done using the following:
PATH=/path/to/binutils/bin:$PATH make
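The ordering described above (binutils first, then GCC with the same --target, with the binutils tools on the path) can be sketched as follows. This is an illustrative outline, not a complete build recipe. The function name cross_build_plan, the source directory names binutils-src and gcc-src, and the install prefix are assumptions, and the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Print (do not run) the main commands for a binutils-then-GCC cross build.
# $1 is the target name, $2 is a hypothetical install prefix.
cross_build_plan() {
    target="$1"
    prefix="$2"
    # binutils first: GCC needs the target's assembler and linker.
    echo "binutils-src/configure --target=$target --prefix=$prefix"
    echo "make && make install"
    # GCC with the same --target, with the new binutils tools on PATH.
    echo "PATH=$prefix/bin:\$PATH gcc-src/configure --target=$target --prefix=$prefix"
    echo "PATH=$prefix/bin:\$PATH make"
}

# Placeholder values, matching the placeholders used in the text.
cross_build_plan some-target /path/to/cross
```

Using the same prefix for both packages is one common way to make sure GCC finds the matching assembler and linker; other layouts are possible.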
Cross-compiling GCC requires that a portion of the target platform's C standard library be available on the host platform. The programmer may choose to compile the full C library, but this choice could be unreliable. The alternative is to use newlib, which is a small C library containing only the most essential components required to compile C source code.
The GNU Autotools packages use the notion of a build platform, a host platform, and a target platform. The build platform is where the compiler is actually compiled. In most cases, build should be left undefined. The host platform is always where the output artifacts from the compiler will be executed, whether the output is another compiler or not. The target platform is used when cross-compiling cross compilers; it represents what type of object code the package will produce. Otherwise the target platform setting is irrelevant. For example, consider cross-compiling a video game that will run on a Dreamcast. The machine where the game is compiled is the build platform, while the Dreamcast is the host platform. The names host and target are relative to the compiler being used and shift like son and grandson: the target of one compiler becomes the host of whatever that compiler builds.
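How the three platform names combine can be illustrated with a small classifier. This is a sketch under stated assumptions: the function name classify_toolchain is hypothetical, the labels are informal names of the kind used in GCC's installation notes rather than official terms, and platforms are compared as plain strings.

```shell
#!/bin/sh
# Classify a compiler configuration from its build, host, and target names.
classify_toolchain() {
    build="$1"; host="$2"; target="$3"
    if [ "$build" = "$host" ] && [ "$host" = "$target" ]; then
        echo "native"            # built on, runs on, and targets one platform
    elif [ "$build" = "$host" ]; then
        echo "cross"             # runs where it was built, targets another
    elif [ "$host" = "$target" ]; then
        echo "crossed native"    # native compiler built on a different machine
    elif [ "$build" = "$target" ]; then
        echo "crossback"         # runs elsewhere, targets the build machine
    else
        echo "canadian"          # build, host, and target all differ
    fi
}

# The compiler used for the Dreamcast game example: built and run on the
# development PC, producing code for the Dreamcast.
classify_toolchain pc pc dreamcast   # prints "cross"
```

For a package that is not itself a compiler, such as the game, only build and host matter, so the third argument has no counterpart.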
Another method popular among embedded Linux developers combines GCC compilers with specialized sandboxes such as Scratchbox and Scratchbox 2. These tools create a "chrooted" sandbox where the programmer can build up necessary tools, libc, and libraries without having to set extra paths. Facilities are also provided to "deceive" the runtime so that it "believes" it is actually running on the intended target CPU; this allows configuration scripts and the like to run without error. Scratchbox runs more slowly than "non-chrooted" methods, and most tools that are on the host must be moved into Scratchbox to function.

Manx Aztec C cross compilers

Manx Software Systems, of Shrewsbury, New Jersey, produced C compilers beginning in the 1980s targeted at professional developers for a variety of platforms up to and including IBM PC compatibles and Macs.
Manx's Aztec C was available for a variety of platforms including MS-DOS, Apple II (DOS 3.3 and ProDOS), Commodore 64, Mac 68k and Amiga.
From the 1980s and continuing throughout the 1990s until Manx Software Systems disappeared, the MS-DOS version of Aztec C was offered both as a native mode compiler and as a cross compiler for other platforms with different processors, including the Commodore 64 and Apple II. Internet distributions still exist for Aztec C, including their MS-DOS based cross compilers. They are still in use today.
Manx's Aztec C86, their native mode 8086 MS-DOS compiler, was also a cross compiler. Although it did not compile code for a different processor like their Aztec C65 6502 cross compilers for the Commodore 64 and Apple II, it created binary executables for then-legacy operating systems for the 16-bit 8086 family of processors.
When the IBM PC was first introduced it was available with a choice of operating systems, CP/M-86 and PC DOS being two of them. Aztec C86 was provided with link libraries for generating code for both IBM PC operating systems. Throughout the 1980s, later versions of Aztec C86 added support for the MS-DOS "transitory" versions 1 and 2, which were less robust than the "baseline" MS-DOS version 3 and later that Aztec C86 targeted until its demise.
Finally, Aztec C86 provided C language developers with the ability to produce ROM-able "HEX" code which could then be transferred using a ROM burner directly to an 8086-based processor. Paravirtualization may be more common today, but creating low-level ROM code was more common per capita during those years, when device driver development was often done by application programmers for individual applications and new devices amounted to a cottage industry. It was not uncommon for application programmers to interface directly with hardware without support from the manufacturer. This practice was similar to embedded systems development today.
Thomas Fenwick and James Goodnow II were the two principal developers of Aztec C. Fenwick later became notable as the author of the Microsoft Windows CE kernel, or NK, as it was then called.