CGO

Posted on Jun 26, 2018

cgo is an amazing technology which allows Go programs to interoperate with C libraries. Given a Go source file written with some special features, cgo outputs Go and C files that can be combined into a single Go package.

C code and Go code live in two different universes; cgo traverses the boundary between them. The transition is not free, and depending on where it occurs in your code, the cost can be inconsequential or substantial.

There are some cases where cgo is unavoidable, most notably where you have to interoperate with a graphics driver or windowing system that is only available as a binary blob.

package rand

// #include <stdlib.h>
import "C"

func Random() int {
    var r C.long = C.random()
    return int(r)
}

func Seed(i int) {
    C.srandom(C.uint(i))  
}

cgo directives

cgo is neither a language nor a compiler. It’s a Foreign Function Interface (FFI), a mechanism we can use in Go to invoke functions and services written in a different language, specifically C.

The comment block right above the import “C” statement is called a “preamble” and can contain actual C code, in this case a header inclusion. Once imported, the “C” pseudo-package lets us “jump” to the foreign code. You can build the example by invoking go build, the same as if it were plain Go.

// #include <stdio.h>
// #include <errno.h>
import "C"

CFLAGS, CPPFLAGS, CXXFLAGS, FFLAGS, LDFLAGS may be defined with pseudo #cgo directives within these comments to tweak the behavior of the C, C++ or Fortran compiler. Values defined in multiple directives are concatenated together. The directive can include a list of build constraints limiting its effect to systems satisfying one of the constraints.

cgo recognizes this comment. Any lines starting with #cgo followed by a space character are removed; these become directives for cgo. The remaining lines are used as a header when compiling the C parts of the package. In this case those lines are just a single #include statement, but they can be almost any C code. The cgo directives are used to provide flags for the compiler and linker when building the C parts of the package.

// #cgo CFLAGS: -DPNG_DEBUG=1
// #cgo amd64 386 CFLAGS: -D86=1
// #cgo LDFLAGS: -lpng
// #include <png.h>
import "C"

Alternatively, CPPFLAGS and LDFLAGS may be obtained via the pkg-config tool using a ‘#cgo pkg-config:’ directive followed by the package names. For example:

// #cgo pkg-config: png cairo
// #include <png.h>
import "C"

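When a directive needs a path relative to the package, use ${SRCDIR}, which cgo replaces with the absolute path to the directory containing the source file. For a package in /go/src/foo, the directive

// #cgo LDFLAGS: -L${SRCDIR}/libs -lfoo

is expanded to

// #cgo LDFLAGS: -L/go/src/foo/libs -lfoo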

-L adds a directory to the linker's library search path.
-l names the library to link against, without the lib prefix or the file extension.

For instance, if you want to link to the library ~/libs/libabc.a you’d add:

-L$(HOME)/libs -labc

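When the go tool sees that a package uses import "C", it also compiles the non-Go source files in the package directory, choosing the compiler by file extension:

| file                            | compiler          |
| ------------------------------- |:-----------------:|
| xxx.c, xxx.s, xxx.S             | C compiler        |
| xxx.cc, xxx.cpp, xxx.cxx        | C++ compiler      |
| xxx.f, xxx.F, xxx.for, xxx.f90  | Fortran compiler  |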

Any .h, .hh, .hpp or .hxx files will not be compiled separately, but if these header files are changed, the C and C++ files will be recompiled.

tips about pkg-config

pkg-config: return metainformation about installed libraries.

The pkg-config program is used to retrieve information about installed libraries in the system. It is typically used to compile and link against one or more libraries. Here is a typical usage scenario in a Makefile:

program: program.c
	cc program.c `pkg-config --cflags --libs gnomeui`

pkg-config retrieves information about packages from special metadata files. These files are named after the package and have a .pc extension. On most systems, pkg-config looks in /usr/lib/pkgconfig, /usr/share/pkgconfig, /usr/local/lib/pkgconfig and /usr/local/share/pkgconfig for these files. It will additionally look in the colon-separated (on Windows, semicolon-separated) list of directories specified by the PKG_CONFIG_PATH environment variable.

The package name specified on the pkg-config command line is defined to be the name of the metadata file, minus the .pc extension. If a library can install multiple versions simultaneously, it must give each version its own name (for example, GTK 1.2 might have the package name “gtk+” while GTK 2.0 has “gtk-2.0”).

In addition to specifying a package name on the command line, the full path to a given .pc file may be given instead. This allows a user to directly query a particular .pc file.

/usr/local/lib/pkgconfig/libuv.pc

prefix=/usr/local/Cellar/libuv/1.20.0
exec_prefix=${prefix}
libdir=${exec_prefix}/lib
includedir=${prefix}/include

Name: libuv
Version: 1.20.0
Description: multi-platform support library with a focus on asynchronous I/O.
URL: http://libuv.org/

Libs: -L${libdir} -luv -lpthread -ldl
Cflags: -I${includedir}

basic data type conversion

| C                   | Go              |
| ------------------- |:---------------:|
| char                | C.char          |
| signed char         | C.schar         |
| unsigned char       | C.uchar         |
| short               | C.short         |
| unsigned short      | C.ushort        |
| int                 | C.int           |
| unsigned int        | C.uint          |
| long                | C.long          |
| unsigned long       | C.ulong         |
| long long           | C.longlong      |
| unsigned long long  | C.ulonglong     |
| float               | C.float         |
| double              | C.double        |
| complex float       | C.complexfloat  |
| complex double      | C.complexdouble |
| void *              | unsafe.Pointer  |
| __int128_t          | [16]byte        |
| __uint128_t         | [16]byte        |
| few special C types | uintptr         |

| C                | Go                |
| ---------------- |:-----------------:|
| sizeof(T)        | C.sizeof_T        |
| sizeof(struct T) | C.sizeof_struct_T |
| struct X         | C.struct_X        |
| union X          | C.union_X         |
| enum X           | C.enum_X          |

string conversion between C and Go

Unlike Go, C doesn’t have an explicit string type. Strings in C are represented by a zero-terminated array of chars. Conversion between Go and C strings is done with the C.CString, C.GoString, and C.GoStringN functions. These conversions make a copy of the string data.

// Go string to C string
// The C string is allocated in the C heap using malloc.
// It is the caller's responsibility to arrange for it to be
// freed, such as by calling C.free (be sure to include stdlib.h if C.free is needed).
func C.CString(string) *C.char

// Go []byte slice to C array
// The C array is allocated in the C heap using malloc.
// It is the caller's responsibility to arrange for it to be
// freed, such as by calling C.free (be sure to include stdlib.h if C.free is needed).
func C.CBytes([]byte) unsafe.Pointer

// C string to Go string
func C.GoString(*C.char) string

// C data with explicit length to Go string
func C.GoStringN(*C.char, C.int) string

// C data with explicit length to Go []byte
func C.GoBytes(unsafe.Pointer, C.int) []byte

memory management

Memory allocations made by C code are not known to Go’s memory manager. When you create a C string with C.CString (or any C memory allocation) you must remember to free the memory when you’re done with it by calling C.free.

C.CString returns a pointer to the start of the char array. Before the function exits, convert that pointer to an unsafe.Pointer and release the allocation with C.free. A common idiom in cgo programs is to defer the free immediately after allocating (especially when the code that follows is more complex than a single function call), as in this Print function:

func Print(s string) {
    cs := C.CString(s)
    defer C.free(unsafe.Pointer(cs))
    C.fputs(cs, (*C.FILE)(C.stdout))
}

A standard way to do this follows.

// #include <stdlib.h>
import "C"
import "unsafe"
...
    var cmsg *C.char = C.CString("hi")
    defer C.free(unsafe.Pointer(cmsg))
    // do something with the C string

handle C return error in Go

Any C function (even a void function) may be called in a multiple-assignment context to retrieve both the return value (if any) and the C errno variable as an error (use _ to skip the result value if the function returns void). For example:

/*
The sqrt() function computes the non-negative square root of x
sqrt(-0) returns -0
sqrt(x) returns a NaN and generates a domain error for x < 0
*/
#include <math.h>
double
sqrt(double x);

n, err := C.sqrt(-1)
_, err = C.voidFunc()

call Go code in C code

Go functions can be exported for use by C code with a //export comment. Using //export in a file places a restriction on the preamble: since it is copied into two different C output files, it must not contain any definitions, only declarations.

//export MyFunction
func MyFunction(arg1, arg2 int, arg3 string) int64 {...}

//export MyFunction2
func MyFunction2(arg1, arg2 int, arg3 string) (int64, *C.char) {...}

extern int64 MyFunction(int arg1, int arg2, GoString arg3);
extern struct MyFunction2_return MyFunction2(int arg1, int arg2, GoString arg3);