Updates for CBinding 1.0

Keith Rutkowski

April 28, 2021

Further improving the user's experience when using C libraries from Julia with the CBinding.jl v1.0 release.


Background

In previous articles we introduced the CBinding.jl Julia package and demonstrated it with a binary I/O example. The new version of CBinding.jl brings support for C (ANSI C, C89, C99, C11, and C18) constructs to Julia and offers the following features:

  • fully supports C’s struct, union, and enum types

    • field alignment strategies

    • bit fields

    • nested types

    • anonymous types

    • type qualifiers

    • unknown-length arrays

  • variadic functions

  • inline functions

  • function calling conventions

  • typed function pointers

  • automatic callback function pointers

  • documentation generation

  • preprocessor macros (partially supported)
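
For readers less familiar with some of the constructs listed above, the following C sketch shows a few of them in one place: bit fields, a nested type without a tag, an unknown-length (flexible) array member, and a variadic function. The names packet and sum_ints are hypothetical and are not taken from CBinding.jl or any library.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

/* Hypothetical type combining several of the listed constructs. */
struct packet {
    unsigned int version : 4;     /* bit field: 4 bits wide */
    unsigned int flags   : 4;     /* bit field: 4 bits wide */
    struct { int x, y; } origin;  /* nested struct type without a tag */
    size_t length;
    unsigned char payload[];      /* unknown-length (flexible) array member, C99 */
};

/* Hypothetical variadic function: sums `count` int arguments. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int k = 0; k < count; k++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}
```

Storage for the flexible array is chosen at allocation time, for example malloc(sizeof(struct packet) + n) for n payload bytes.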

Even perfectly valid, yet somewhat inhumane C definitions are supported, such as:

extern struct { int i; } g[2], func();
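
Read piece by piece, that single line declares two things sharing one anonymous struct type: g, an array of two such structs, and func, a function returning one by value. A more readable equivalent might look like the sketch below. It assumes a hypothetical tag name, item, which the original leaves out, and it adds definitions (with made-up values) so that it links on its own.

```c
#include <assert.h>

/* Hypothetical tag "item"; the original declaration leaves the struct anonymous. */
struct item { int i; };

/* The one-line declaration splits into these two declarations. */
extern struct item g[2];   /* g: an array of two structs */
struct item func(void);    /* func: a function returning the struct by value */

/* Definitions added only so the sketch is self-contained. */
struct item g[2] = { { 1 }, { 2 } };

struct item func(void)
{
    return g[1];
}
```

Before C23, the empty parentheses in func() leave the parameters unspecified; the sketch spells them as (void) for clarity.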

Several significant changes to the package were made with the 1.0 release, including the integration of capabilities from CBindingGen.jl for automatic bindings generation. This article aims to help you transition your code to incorporate these changes so you can save time and effort.

Specification of C user-defined types

While much of the behind-the-scenes representation of C types remains the same, the method of defining them has changed. Rather than using the half-C, half-Julia @cstruct, @cunion, or @cenum macro syntax (which was fairly unnatural for both C and Julia developers), the C code itself is now used as-is.

julia> using CBinding

julia> c``  (1)

julia> c"""
         struct Number {  (2)
           enum {
             INTEGER,
             UNSIGNED,
             FLOAT
           } kind;
           
           union {
             int i;
             unsigned int u;
             double d;
           };
         };
       """;
1 Create a compiler context (a compilation unit) to accumulate C code into.
2 Specify a user-defined C type just as it is done in C.
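
For reference, this is roughly how the same type is used from the C side: the enum acts as a tag that says which member of the anonymous union is currently valid. The helper number_as_double is a hypothetical name introduced for this sketch, and anonymous unions require C11 or later.

```c
#include <assert.h>

struct Number {
    enum {
        INTEGER,
        UNSIGNED,
        FLOAT
    } kind;

    union {              /* anonymous union: members are accessed directly */
        int i;
        unsigned int u;
        double d;
    };
};

/* Hypothetical helper: read whichever member `kind` says is active. */
double number_as_double(const struct Number *n)
{
    switch (n->kind) {
    case INTEGER:  return (double)n->i;
    case UNSIGNED: return (double)n->u;
    case FLOAT:    return n->d;
    }
    return 0.0;  /* unreachable for valid `kind` values */
}
```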

Below is a more realistic example where the C code might rely on types defined in other header files or Julia packages.

julia> c"""
         #include <stdint.h>  (1)
       """s
       const c"uint8_t"  = UInt8
       const c"uint16_t" = UInt16
       const c"uint32_t" = UInt32

julia> c"""
         struct WAV_header {  (2)
           uint8_t  riff[4];
           uint32_t fileSize;
           uint8_t  fileHeader[4];
           uint8_t  fmtMarker[4];
           uint32_t fmtLength;
           uint16_t fmtType;
           uint16_t dataChannels;
           uint32_t dataSampleRate;
           uint32_t dataBytesPerSecond;
           uint16_t dataBytesPerSample;
           uint16_t dataBitsPerSample;
           uint8_t  dataHeader[4];
           uint32_t dataSize;
         };
       """j;
1 Include a standard C header to define sized-types in the C code and specify the same types in the Julia code.
2 Specify a user-defined C type that references types declared earlier.
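
Because this struct is later laid directly over file contents, its in-memory layout matters. The C sketch below checks at compile time that the fields pack into 44 bytes with no padding. This holds on common ABIs where uint16_t and uint32_t have 2-byte and 4-byte alignment, which is an assumption about the platform rather than a guarantee of the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct WAV_header {
    uint8_t  riff[4];
    uint32_t fileSize;
    uint8_t  fileHeader[4];
    uint8_t  fmtMarker[4];
    uint32_t fmtLength;
    uint16_t fmtType;
    uint16_t dataChannels;
    uint32_t dataSampleRate;
    uint32_t dataBytesPerSecond;
    uint16_t dataBytesPerSample;
    uint16_t dataBitsPerSample;
    uint8_t  dataHeader[4];
    uint32_t dataSize;
};

/* Every field starts on its natural alignment, so no padding is expected. */
_Static_assert(sizeof(struct WAV_header) == 44, "unexpected padding in WAV_header");
_Static_assert(offsetof(struct WAV_header, fileSize) == 4, "fileSize offset");
_Static_assert(offsetof(struct WAV_header, fmtType) == 20, "fmtType offset");
_Static_assert(offsetof(struct WAV_header, dataSize) == 40, "dataSize offset");
```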

Using the C syntax itself makes for a more natural and convenient way of expressing types since the C code probably already exists. In fact, the update detailed below takes advantage of that fact to make defining bindings even easier!

Just use the header files

In general, C libraries come with header files defining how to interface with them. Since CBinding.jl utilizes a C parser to convert C code into Julia, it is now possible to directly include the definitions from the header files themselves. Using libsndfile as an example, we can simply define the module as below to automatically create bindings.

julia> module libsndfile
         using CBinding
         
         using libsndfile_jll  (1)
         c`-fparse-all-comments -I$(libsndfile_jll.artifact_dir)/include -L$(libsndfile_jll.artifact_dir)/lib -lsndfile`
         
         const c"int8_t"  = Int8  (2)
         const c"int16_t" = Int16
         const c"int32_t" = Int32
         const c"int64_t" = Int64
         const c"uint8_t"  = UInt8
         const c"uint16_t" = UInt16
         const c"uint32_t" = UInt32
         const c"uint64_t" = UInt64
         const c"size_t" = Csize_t
         
         c"#include <sndfile.h>"j  (3)
       end
1 Create a compiler context using the library installed by the libsndfile_jll package.
2 Define the sized-types that are referenced by the library.
3 Include the header file and automatically generate Julia bindings to the library.

Users transitioning to Julia from a C or C++ background should find that this method of creating bindings is familiar and easy to work with. Creating a compiler context closely resembles writing a command line for compiling C code that uses the library. And just as in C, including the header file defines the types, functions, global variables, and (some of) the macros provided by the library, which can then be used in Julia code.

Accessing extern variables and functions

Calling a C library’s functions (even variadic functions) is done in the standard way, func(arg1, arg2, ...), but the bindings created with the new release can also be used to obtain function pointers. Callback functions and object-oriented C APIs often require users to obtain and store function pointers. We have made this easy and elegant by using the empty subscript (func_ptr = func[]) on function bindings.

A change was also made to global variable bindings. Access to a C library’s global variable is achieved with the empty subscript operator as well (global_var[] to load, global_var[] = val to store).
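
For comparison, the C operations that these two uses of the empty subscript correspond to are sketched below. The names counter, add, binary_op, and apply are hypothetical stand-ins for a library's global variable, function, and callback-taking API.

```c
#include <assert.h>

/* Hypothetical library global; Julia's counter[] and counter[] = val map to
   an ordinary load and store of this variable. */
int counter = 0;

/* Hypothetical library function. */
int add(int a, int b)
{
    return a + b;
}

/* Function pointer type for callbacks taking two ints. */
typedef int (*binary_op)(int, int);

/* Hypothetical API that accepts a callback and invokes it. */
int apply(binary_op op, int a, int b)
{
    return op(a, b);
}
```

In C, the function pointer is obtained with binary_op p = add; (or &add), which is the value the article's func[] syntax hands to Julia code.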

Improved performance

Package load times should be noticeably better in the new version. The original macro approach used by CBinding.jl and CBindingGen.jl resulted in very slow package compilation times. With the new approach, the added compilation time is practically imperceptible compared to equivalent Julia definitions, which even allows for an interactive C experience in the REPL.

The performance of using the C constructs is vastly improved as well. Everything from accessing nested fields in user-defined types to calling function bindings has been significantly optimized. We will use the following types to demonstrate comparable performance with native Julia code.

julia> struct JuliaStruct
         i::Cint
         d::Cdouble
       end
       js = JuliaStruct(123, 45.678)  (1)
JuliaStruct(123, 45.678)

julia> c``
       c"""
       struct CStruct {
         int i;
         double d;
       };
       """j
       cs = CStruct(i = 123, d = 45.678)  (2)
var"(c\"struct CStruct\")" (16-byte struct)
  i = 123
  d = 45.678

julia> mutable struct JuliaMutableStruct
         i::Cint
         d::Cdouble
       end
       ms = JuliaMutableStruct(123, 45.678)  (3)
JuliaMutableStruct(123, 45.678)

julia> ps = Libc.malloc(cs)  (4)
Cptr{var"(c\"struct CStruct\")"}(0x000000000126e190)
1 A native Julia immutable object.
2 An equivalent immutable C object.
3 An equivalent mutable Julia object.
4 Allocated block of memory that is interpreted as a mutable C object.
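
The "16-byte struct" reported above follows from C layout rules: the 4-byte int is followed by 4 bytes of padding so that the double starts on an 8-byte boundary. The sketch below confirms this and shows a heap-allocated copy analogous to the pointer example. It assumes an ABI that aligns double to 8 bytes (such as x86-64), and cstruct_new is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct CStruct {
    int i;      /* offset 0, followed by 4 bytes of padding */
    double d;   /* offset 8 on ABIs that align double to 8 bytes */
};

/* Hypothetical helper: allocate a struct on the heap and fill it in.
   Returns NULL if allocation fails; the caller must free() the result. */
struct CStruct *cstruct_new(int i, double d)
{
    struct CStruct *p = malloc(sizeof *p);
    if (p != NULL) {
        p->i = i;
        p->d = d;
    }
    return p;
}
```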

Now that we have defined equivalent types, we can use a simple function to benchmark performance.

julia> using BenchmarkTools

julia> get_d(x) = x.d  (1)
       get_d(x::Cptr) = x.d[]
get_d (generic function with 2 methods)

julia> @btime get_d($(js))
  0.019 ns (0 allocations: 0 bytes)
45.678

julia> @btime get_d($(cs))  (2)
  0.019 ns (0 allocations: 0 bytes)
45.678

julia> @btime get_d($(ms))
  1.179 ns (0 allocations: 0 bytes)
45.678

julia> @btime get_d($(ps))  (3)
  1.120 ns (0 allocations: 0 bytes)
45.678
1 A simple example function retrieving the value of the d field within the object.
2 There is no observable performance difference between Julia and C immutable objects.
3 Likewise, similar performance is observed with Julia and C mutable objects.

As you can see, the performance of using the two immutable objects is identical, and accessing the two mutable objects is practically identical as well.

More elegant pointer manipulation

One of the big features introduced with the 1.0 release is a more elegant way of dereferencing pointers. When an object in memory is manipulated through a pointer to a user-defined type, accessing a field yields a pointer to that field rather than its value. Loading or storing values in memory (for either primitive or aggregate fields) is therefore accomplished by dereferencing the pointer with the empty subscript syntax (ptr[] to load, ptr[] = val to store). We illustrate this with the zero-copy, memory-mapped binary I/O example below.

julia> using Mmap

julia> data = open("sample.wav", "r+") do io
         Mmap.mmap(io, Vector{UInt8}, sizeof(WAV_header))  (1)
       end;
       header = reinterpret(Cptr{WAV_header}, pointer(data))
Cptr{var"c\"struct WAV_header\""}(0x00007f94975ab000)

julia> get_size(ptr::Cptr) = ptr.fileSize[] |> ltoh |> signed  (2)
       @btime get_size($(header))
  1.149 ns (0 allocations: 0 bytes)
440634

julia> get_size(x) = x.fileSize |> ltoh |> signed  (3)
       @btime get_size($(header[]))
  0.010 ns (0 allocations: 0 bytes)
440634

julia> header.fileSize[] = 1000  (4)
1000
1 Memory-map a file as a byte array and reinterpret that memory as a WAV_header.
2 Accessing ptr.fileSize obtains a type-safe pointer to the location in memory storing the file size with the actual value being retrieved by ptr.fileSize[].
3 Loading the entire header by dereferencing the pointer header[] before accessing fields may provide better performance optimization opportunities.
4 Write a value to the location in memory storing the file size and eventually to the mapped file as well.

The performance of accessing memory through a pointer is similar to that of accessing fields in a mutable Julia type. The benchmarks above show field access on an already-loaded immutable object running roughly 100 times faster, although sub-nanosecond timings like these usually mean the compiler optimized the access away rather than performed a real load. Even so, it is sometimes beneficial to load an entire object from memory before repeatedly accessing its fields. Experimentation and benchmarking are required to reveal how to optimally structure access patterns in your code.
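
The two access patterns compared above have direct C counterparts, sketched here on a cut-down header containing only the first two fields. The names wav_prefix, le32_to_host, file_size_via_field_pointer, and file_size_via_copy are hypothetical; le32_to_host plays the role that ltoh plays in the Julia code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Only the first two header fields are needed for this sketch. */
struct wav_prefix {
    uint8_t  riff[4];
    uint32_t fileSize;
};

/* Decode a little-endian 32-bit value, regardless of host byte order. */
uint32_t le32_to_host(const uint8_t *bytes)
{
    return (uint32_t)bytes[0]
         | (uint32_t)bytes[1] << 8
         | (uint32_t)bytes[2] << 16
         | (uint32_t)bytes[3] << 24;
}

/* Pattern 1: take a pointer to the field, then load just that field. */
uint32_t file_size_via_field_pointer(const struct wav_prefix *hdr)
{
    const uint32_t *field = &hdr->fileSize;  /* like header.fileSize */
    uint8_t raw[4];
    memcpy(raw, field, sizeof raw);          /* like header.fileSize[] */
    return le32_to_host(raw);
}

/* Pattern 2: load the whole object, then read the field from the copy. */
uint32_t file_size_via_copy(const struct wav_prefix *hdr)
{
    struct wav_prefix copy = *hdr;           /* like header[] */
    uint8_t raw[4];
    memcpy(raw, &copy.fileSize, sizeof raw);
    return le32_to_host(raw);
}
```

Both functions return the same value; which one performs better depends on how many fields are read afterwards and on what the compiler can optimize.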

Better documentation conversion

Another feature introduced in the 1.0 release is the automated importing of code documentation from C into Julia. It is converted automatically just by including a C header file containing properly formatted documentation. Using the libsndfile bindings example from above, the REPL help mode can be used to display the definition and documentation, as shown below:

help?> libsndfile.SNDFILE
  typedef struct SNDFILE_tag SNDFILE

  Defined at sndfile.h:327 (file:///.julia/artifacts/dde4d151/include/sndfile.h)

  A SNDFILE* pointer can be passed around much like stdio.h's FILE* pointer.

Need further assistance?

If you are considering the transition to Julia, but have several C libraries you depend on, let us help! Analytech Solutions offers many years of experience working with both Julia and C, and we can streamline your transition process. Please contact us for more information!



Keith Rutkowski is a seasoned visionary, inventor, and computer scientist with a passion to provide companies with innovative research and development, physics-based modeling and simulation, data analysis, and scientific or technical software/computing services. He has over a decade of industry experience in scientific and technical computing, high-performance parallelized computing, and hard real-time computing.