C Modules and Inline Functions
Modules in C
C has modules. They are implemented as a convention of translation units and headers. Most C programmers are aware of this technique, but many do not appreciate it as a module system.
With this technique, modules are created by defining interface types and functions in header files.
Modules are “used” by #include-ing the appropriate header file.
Each module is implemented as one or more translation units (henceforth known as TUs) that contain implementations of the interface types and functions.
TUs also encapsulate implementation details with TU-local types and static functions.
As a testament to its encapsulation, only the C “module” itself is recompiled when its implementation changes; none of the users of the module need to recompile. The only additional action needed is to relink the program with the recompiled TU’s object file.
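As a rough single-file sketch of the convention described above (the counter module and every name in it are hypothetical, invented for illustration), the header part exposes an opaque type and a few functions, while the TU part hides the struct layout and a static helper. In a real project the two labeled parts would be separate files.

```c
#include <stdlib.h>

/* ---- counter.h (hypothetical interface header) ---- */
typedef struct counter counter;      /* opaque: layout hidden from users */
counter *counter_new(int step);
int      counter_next(counter *c);
void     counter_free(counter *c);

/* ---- counter.c (hypothetical implementation TU) ---- */
struct counter {                     /* TU-local layout, free to change */
    int value;
    int step;
};

/* static: not visible outside this TU */
static int clamp_step(int step) { return step < 1 ? 1 : step; }

counter *counter_new(int step)
{
    counter *c = malloc(sizeof *c);
    if (c != NULL) {
        c->value = 0;
        c->step = clamp_step(step);
    }
    return c;
}

int counter_next(counter *c)
{
    c->value += c->step;
    return c->value;
}

void counter_free(counter *c) { free(c); }
```

Changing struct counter or clamp_step here only requires recompiling counter.c and relinking, because users see nothing but the declarations in the header.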
Function Inlining in C
Function inlining is a performance optimization. In some cases, such as in hot, tight loops, function inlining can considerably increase runtime performance. The benefit comes both from avoiding function call overhead (saving registers to the stack and jumping) and from being able to optimize the function body in the context of the particular call. When the compiler decides whether to inline a function call, it weighs the potential performance benefit against increased binary size.
To be able to inline function calls in C, the compiler needs the function’s definition. Because of separate compilation in C, functions defined in one TU are not available when compiling another TU. So we must have the function definition duplicated in each TU when compiling. This is typically accomplished by including the function definition in the header where the function is declared.
In ANSI C (AKA ISO C90 or C89) inline functions are defined in headers using either function-like macros or static functions.
Both of these approaches have several downsides.
Function-like macros:
- Cannot take the address of a function-like macro.
- All function calls are always inlined.
- Not type-safe.
- May result in less informative error messages (though they are pretty good in the big 3).
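A small sketch of how these macro downsides show up in practice. The names are hypothetical, and the double-evaluation pitfall shown first is an additional, well-known consequence of textual expansion rather than one of the points listed above.

```c
/* Hypothetical names, for illustration only. */
#define SQUARE(x) ((x) * (x))   /* function-like macro: pasted into each use */

static int calls = 0;

/* Returns 1, 2, 3, ... and counts how often it was called. */
static int next_value(void) { return ++calls; }

/* The macro expands to ((next_value()) * (next_value())): two calls. */
static int square_of_next(void) { return SQUARE(next_value()); }

/* No type checking: the same macro silently accepts a double. */
static double square_of_half(void) { return SQUARE(0.5); }

/* int (*fp)(int) = SQUARE;   would not compile: a macro has no address */
```

Calling square_of_next() once leaves calls at 2, because the argument expression is evaluated once per appearance in the macro body.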
static functions:
- Can take the address of a static function.
  - The address of the static function is different in each TU.
- Compiler can decide on each function call whether to inline or not.
- Type-safe.
- Good error messages.
- Each TU has a duplicate of the function, increasing the final binary size.
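For comparison, a minimal sketch of the static-function approach, with hypothetical names. The header part would be copied into every TU that includes it, which is where both the per-TU address and the duplicated code come from.

```c
/* ---- mathx.h (hypothetical header) ---- */
static int square(int x) { return x * x; }

/* ---- any TU that includes mathx.h ---- */
typedef int (*unary_fn)(int);

/* Taking the address works; each including TU would get its own copy,
   so this pointer would differ from one TU to another. */
static unary_fn get_square(void) { return square; }

/* Type-safe: passing anything other than an int-to-int function is
   diagnosed by the compiler. */
static int apply_twice(unary_fn f, int x) { return f(f(x)); }
```

Within a single TU the pointer is stable, so apply_twice(get_square(), 3) squares 3 twice.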
Because of these problems, the ISO C99 version of the language added the inline specifier for functions.
What is The Inline Specifier?
The inline specifier in C99 started out life as a compiler hint.
Functions declared inline were intended to be functions that were good candidates for inlining.
Unfortunately, a function specifier is a very coarse-grained hint.
Thoughtfully, the standard does not force compilers to always inline function calls involving an inline function.
Modern compilers even tend to ignore the inline specifier as a hint.
What is still important about inline functions is the semantics of extern inline functions.
If a program has multiple definitions of a function that are all declared inline, and the definitions of these functions are identical, the multiple definitions are resolved to a single definition in the final executable (in C99, the external definition emitted by the one TU that also declares the function extern inline).
This solves the duplication problem seen with static functions defined in headers.
It also ensures that all references to that function are the same across the whole program.
This de-duplication is a form of Link-Time Optimization (henceforth known as LTO).
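A minimal sketch of the C99 arrangement, with hypothetical names: the header carries the inline definition, and exactly one TU adds an extern inline declaration so that it emits the single external definition. Both parts are shown in one file only so that the sketch compiles on its own.

```c
/* ---- mathx.h (hypothetical header) ---- */
/* Inline definition: visible to every TU that includes the header,
   but on its own it does not emit an external symbol. */
inline int square(int x) { return x * x; }

/* ---- mathx.c (hypothetical; exactly one TU does this) ---- */
/* #include "mathx.h" */
/* Declaring the function extern here makes this TU provide the one
   external definition that non-inlined calls and pointers refer to. */
extern inline int square(int x);

/* ---- user code ---- */
typedef int (*unary_fn)(int);

int sum_of_squares(int a, int b) { return square(a) + square(b); }

/* Every TU that takes the address gets the same external function. */
unary_fn square_ptr(void) { return square; }
```

If no TU provides the extern inline declaration, a call that the compiler chooses not to inline can fail at link time with an undefined reference to square.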
Inlining Breaks C’s Modularity
While inlining functions may lead to performance improvements, inline function definitions break C’s modularity and spirit of simplicity.
Inline function definitions, no matter how the function is implemented, leak implementation details into the header, breaking the encapsulation of TUs. Whenever an implementation changes, users of a module may need to be recompiled, not just the module itself.
This also tends to lead to additional #include directives in a header to service the inline function definition.
The inline function definition will also be analyzed multiple times, once for each TU, decreasing whole-program compilation performance.
Combined with the previous point, the effect compounds: each additional #include directive in a header brings in more inline function definitions to analyze, and those may need further #include directives of their own, recursively. This multiplicity can become exponential, and compilation performance drops drastically.
This is very apparent in C++.
It is present, but less observable, in C, mostly because C is a simpler language to analyze.
Finally, the inline specifier complicates the linker implementation, since it is a form of LTO.
This breaks C’s philosophy of simplicity, and the ease of implementation of a C toolchain on a new platform. Easily supporting new platforms was one of the original design goals of C, and part of what made it popular long ago.
What Can We Do?
We need a solution to function inlining that does not have a negative effect on the semantics of the language.
Ideally, we would also have a finer-grained approach to inlining than the inline specifier attempts to provide.
Right now, we have LTO.
LTO-capable linkers can inline functions that are defined in different TUs. This is because all code fragments that are a part of the final binary are available at once while linking. LTO-capable linkers are also capable of a number of other whole-program optimizations, like removing unused functions in executables.
That being said, linkers are not nearly as capable of inlining functions as compilers are. Part of this is because the technology is relatively young, about 10 years old, compared to the 40+ years optimizing C compilers have been in development. There also doesn’t seem to be much in the way of attributes to help guide LTO yet. But most major issues with LTO in the Big 3 compilers have been solved as of this writing (2021).
So my suggestion to C programmers is:
- Keep your function definitions in your source files.
- Turn on LTO.
- If your functions aren’t inlining like they should, move them into the header and use static.
Hopefully in the near future, point 3 can be replaced with call-site-specific attributes to help LTO decide to inline function calls.
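The first two points can be sketched as follows. The file names and functions are hypothetical, and the build lines in the trailing comment assume GCC or Clang, where -flto enables link-time optimization; other toolchains use different switches.

```c
/* ---- vec.h (hypothetical header): declaration only ---- */
int vec_dot(const int *a, const int *b, int n);

/* ---- vec.c (hypothetical TU): the definition stays out of the header ---- */
int vec_dot(const int *a, const int *b, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* ---- main.c (hypothetical caller in a different TU) ---- */
int dot_with_self(const int *v, int n) { return vec_dot(v, v, n); }

/* Build sketch, with the files split as labeled above:
 *   cc -O2 -flto -c vec.c
 *   cc -O2 -flto -c main.c
 *   cc -O2 -flto vec.o main.o -o demo
 * With -flto on both the compile and link steps, the optimizer may inline
 * vec_dot into dot_with_self even though they live in different TUs.
 */
```

Whether the call is actually inlined remains the toolchain's decision, so it is worth checking the generated code for the hot paths that matter.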
And my suggestion to the C standards committee is:
- Make the inline specifier optional in C23.