Using external compilers with R

These web pages describe compiler-specific details about writing external code to use with the current version of R, from a Windows programming perspective.

For instructions for other platforms, and non-compiler-specific details, see the Writing R Extensions manual in the docs subdirectory of the R installation.


General advice and frequent problems

Calling conventions
Register preservation
Floating point control
Differences between .C() and .Fortran()
is.loaded() returning FALSE

Calling conventions

Calling conventions are the protocols used by the compiler when passing arguments to functions. R always uses the "cdecl" calling convention, which passes all arguments on the stack, pushing the rightmost one first; the caller is responsible for restoring the stack afterwards.

For example, the call

.C("foo", as.integer(1), as.double(2), PACKAGE="bar")

will do the following:

  1. push a pointer to a vector containing the floating point value 2
  2. push a pointer to a vector containing the integer 1
  3. call the function foo in library bar.dll
  4. when it returns, restore the stack pointer to its original value by adding 8 (the size of two pointers) to it.

The standard calling convention in Windows is "stdcall", which is similar except that the routine that is called will remove the parameters from the stack. There are many other calling conventions used by different compilers: some or all parameters passed in CPU registers, parameters in the reverse order, etc.
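
As a minimal sketch of the C side of the call above, the routine foo could be declared as follows. The macro name FOO_CDECL and the body of foo are assumptions made for illustration; the point is only that the convention is stated explicitly as cdecl on Windows, so that a compiler whose default is different still produces a routine R can call.

```c
/* FOO_CDECL is a hypothetical helper macro, not part of R.
   On Windows compilers it expands to __cdecl; elsewhere it is empty. */
#if defined(_WIN32)
#define FOO_CDECL __cdecl
#else
#define FOO_CDECL
#endif

/* .C() passes each R vector as a pointer to its first element:
   as.integer(1) arrives as int *, as.double(2) as double *. */
void FOO_CDECL foo(int *n, double *x)
{
    /* Illustrative body: add the integer to the double in place.
       .C() returns the (possibly modified) arguments to R. */
    x[0] = x[0] + (double) n[0];
}
```

Because the caller cleans up the stack under cdecl, foo itself does nothing special on return; it only needs to be compiled with the matching convention.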

Symptoms of mismatched calling conventions are:

Register preservation

When R calls a function, it assumes that the EBP, EBX, EDI and ESI registers will be returned unchanged. In addition, the direction flag must be preserved. Programs may assume R follows these conventions across calls as well.

Floating point control

R sets the floating point control word using the R.dll function Rwin_fpset. This sets the FPU to mask all exceptions, use 64-bit mantissa (extended) precision, and round to nearest, with ties going to even.

R expects all calls to DLLs (including the initializing call) to leave the FPU control word unchanged. Many run-time libraries reset the FPU control word during initialization; this will cause problems in R, and will result in a warning message like "DLL attempted to change FPU control word from 8001f to 9001f". The values in the message are hexadecimal. The value 8001f that gets reported is in the format expected by the C library routine _controlfp; the raw value that is used in the FPU register is 0x037F.
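
To make the raw value concrete, the following sketch splits an x87 control word into the three fields mentioned above, using the standard Intel bit layout (exception masks in bits 0-5, precision control in bits 8-9, rounding control in bits 10-11). The type and function names are hypothetical and are not part of R or of any run-time library.

```c
/* Decoded fields of a raw x87 FPU control word. */
typedef struct {
    unsigned exception_masks; /* bits 0-5: a set bit masks that exception */
    unsigned precision;       /* bits 8-9: 0 = 24-bit, 2 = 53-bit, 3 = 64-bit */
    unsigned rounding;        /* bits 10-11: 0 = round to nearest */
} x87_control_word;

/* Split a raw control word into its fields. */
x87_control_word decode_x87_control_word(unsigned raw)
{
    x87_control_word cw;
    cw.exception_masks = raw & 0x3Fu;
    cw.precision = (raw >> 8) & 0x3u;
    cw.rounding = (raw >> 10) & 0x3u;
    return cw;
}
```

Decoding 0x037F gives all six exceptions masked, the 64-bit precision setting, and round to nearest, which matches the state described above. In the _controlfp format, the two values in the example warning differ only by 0x10000, which in the Microsoft headers is believed to be the 53-bit precision setting; that reading is an interpretation, not something the warning itself states.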

If you can't tell your compiler to generate a DLL that leaves the FPU alone, then you need to be sure to restore it to R's value before returning to R. See the Delphi section for an example of how to do this in that language.
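
In C, one possible approach is to save the control word on entry to each exported routine and put it back before returning. The sketch below assumes a 32-bit x86 Windows compiler whose float.h provides _controlfp and the _MCW_EM, _MCW_RC and _MCW_PC masks (Microsoft's and MinGW's do); on other platforms the macros expand to nothing. The macro and function names are hypothetical, and this does not cover changes made while the DLL is being initialized.

```c
#include <float.h>

#if defined(_WIN32)
/* _controlfp(0, 0) reads the current control word without changing it. */
#define SAVE_FPU()    unsigned int saved_cw = _controlfp(0, 0)
/* Restore the exception masks, rounding mode and precision seen on entry. */
#define RESTORE_FPU() _controlfp(saved_cw, _MCW_EM | _MCW_RC | _MCW_PC)
#else
#define SAVE_FPU()    ((void) 0)
#define RESTORE_FPU() ((void) 0)
#endif

/* Stand-in for code (for example a third-party library) that might
   alter the FPU control word as a side effect. */
static void work_that_may_touch_fpu(double *x, int *n)
{
    int i;
    for (i = 0; i < *n; i++)
        x[i] *= 2.0;
}

/* Entry point intended to be called from R via .C(). */
void foo_guarded(double *x, int *n)
{
    SAVE_FPU();
    work_that_may_touch_fpu(x, n);
    RESTORE_FPU();
}
```

Since R's settings are in force when it calls the routine, restoring the entry value hands back the control word R expects.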

Differences between .C() and .Fortran()

The two R functions .C("foo", ...) and .Fortran("foo", ...) differ in the following respects:

  1. .C("foo", ...) looks for the symbol "foo" in the external library, whereas .Fortran("foo", ...) looks for the symbol "foo_" (which is how g77 would export the subroutine "foo").
  2. .C("foo", arg=c("red","blue","green")) passes the character mode argument as a pointer to an array of pointers to the strings, whereas .Fortran("foo", arg=c("red","blue","green")) passes only a pointer to a 255 character buffer containing the first string, "red". In both cases the strings are null-terminated.
  3. Both .C and .Fortran allow arbitrary objects to be passed, but only C code which includes the R.h header file is likely to be able to read anything but simple vectors.
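
The first two differences can be sketched from the C side as follows. The routine names and their bodies are made up for illustration. The first routine is meant to be called with .C() and receives a character vector as an array of string pointers; the second carries a trailing underscore in its C name so that .Fortran("addone", ...) would find it under the g77-style name "addone_".

```c
#include <string.h>

/* For .C("total_chars", c("red","blue","green"), as.integer(3), integer(1)):
   the character vector arrives as char **, one pointer per string. */
void total_chars(char **strings, int *n, int *total)
{
    int i;
    *total = 0;
    for (i = 0; i < *n; i++)
        *total += (int) strlen(strings[i]); /* strings are null-terminated */
}

/* For .Fortran("addone", as.double(x), as.integer(n)):
   R looks up the symbol with an underscore appended. */
void addone_(double *x, int *n)
{
    int i;
    for (i = 0; i < *n; i++)
        x[i] += 1.0;
}
```

The length of each vector is passed as a separate argument because a bare pointer carries no length information.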

is.loaded() returning FALSE

When R uses dyn.load() to load a DLL, it relies on the DLL's export table to find functions. Many compilers use fairly obscure methods to get a function name into the export table. If you don't follow them exactly, your function won't be available to R.
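
As one illustration, Microsoft's compiler and the MinGW port of gcc both accept the __declspec(dllexport) attribute as a way to place a function in the export table. The sketch below wraps it in a macro, DLL_EXPORT, which is a hypothetical name, and uses extern "C" so that a C++ compiler does not mangle the exported name. Other compilers may need a different mechanism, such as a module definition file.

```c
/* DLL_EXPORT is a hypothetical helper macro, not part of R. */
#if defined(_WIN32)
#define DLL_EXPORT __declspec(dllexport)
#else
#define DLL_EXPORT
#endif

#ifdef __cplusplus
extern "C" {
#endif

/* Illustrative routine: multiply each element of x by *factor. */
DLL_EXPORT void scale_vector(double *x, int *n, double *factor)
{
    int i;
    for (i = 0; i < *n; i++)
        x[i] *= *factor;
}

#ifdef __cplusplus
}
#endif
```

After dyn.load() of the resulting DLL, is.loaded("scale_vector") would be expected to return TRUE if the name reached the export table unchanged.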

Some compilers (e.g. g77, as mentioned above) make changes to the function names before putting them in the export table. If you specify the original name, R may not be able to find the entry point. Wu Yongwei has written detailed information on this (not in the context of R, but useful here anyway).

Specific instructions on both of these issues are compiler dependent. However, you can diagnose the causes of errors by examining the export table of your DLL. There are a number of ways to do this:


C/C++

MinGW tools

If your code is written in reasonably portable C, then the easiest way to compile it is to use the R toolset together with the distributed Makefile (Makefile.packages in the source distribution). Instructions are in the readme.packages file.

Microsoft Visual C++

The readme.packages file contains instructions for compiling and linking VC++ code.

Other compilers

With other compilers, the "is.loaded() returning FALSE" problem may arise; see that section above.

Cross-compiling from Linux

If you normally do your development on Linux, then it may be easiest to compile your Windows DLLs there. Instructions are available on CRAN in the document Building Microsoft Windows Versions of R and R packages under Intel Linux by Jun Yan and A.J. Rossini.


Last modified: September 24, 2004, by Duncan Murdoch