C++ Dynamic loading of a shared library. Also, create and load one, on your own. - Panos Zafiropoulos


230517

Intro

Shared (dynamic) libraries are well known for their ability to share their exposed functions among multiple programs. This works because the code of a shared library is loaded into memory once, at runtime, and can then be shared by every application that uses it. This lower memory footprint is one reason why shared libraries are often considered preferable over static ones.

However, I am not going to describe static and shared libraries in detail. The reader can search the web and find plenty of material about their characteristics, differences, usage, etc. Our aim here is to focus on how to access (actually, to load) shared libraries dynamically.

So, as you probably understand, we are not going to deal with dynamic linking, but with dynamic loading.

Dynamic linking vs Dynamic loading

You should know that, generally, linking an app requires that any shared library it needs is accessible somewhere in the OS filesystem. Then, when we run our app (our executable), the dynamic linker finds out which shared libraries the app needs (they must again be accessible in the OS filesystem). The needed shared libraries are loaded into memory, and our app binds its dependencies to those libraries during startup.

For dynamic loading I like Wikipedia’s definition: “Dynamic loading is a mechanism by which a computer program can, at run time, load a library (or other binary) into memory, retrieve the addresses of functions and variables contained in the library, execute those functions or access those variables, and unload the library from memory. It is one of the 3 mechanisms by which a computer program can use some other software; the other two are static linking and dynamic linking. Unlike static linking and dynamic linking, dynamic loading allows a computer program to start up in the absence of these libraries, to discover available libraries, and to potentially gain additional functionality.”

One more thing that makes the difference between dynamic loading and dynamic linking a bit clearer in practice: for dynamic loading we will not include any header file of the targeted shared library, whereas with dynamic linking its header files are normally included.

What we will do

So, what we will do is work on a few simple examples in C++ on how we can load an existing shared library, and then how we can create a shared library and load it dynamically.

Specifically, the first example will be about accessing dynamically some of the exposed functionalities of an existing system library. Next, we will see how we can build one shared library on our own, and finally how we can access it, in a similar manner we did with the system library.

However, before proceeding to practical examples, it is worth spending a while on the naming conventions of library files, and also mentioning some of the well-known system tools for examining a shared library.

Note that we will focus mainly on Linux, and/or macOS, and either not at all, or just a little bit, on Windows.

Library naming conventions and placement

Libraries naming

Static Libraries
o In Linux systems, they have the ending (extension) .a, which indicates an archive file of objects.
o In Windows, the extension is .lib.

Shared (Dynamic) Libraries
o .so is the default extension for shared libraries in Linux systems.
o Respectively, the extension used in Windows is .dll (Dynamic-Link Library).
o The ‘.so’ ending indicates a shared object. A shared library's “soname” is almost always prefixed by “lib” and ends with .so followed by a period and a major version number (e.g. libm.so.6). The actual file (the “real name”) usually appends further version numbers to the soname, while a symlink whose name stops at the .so ending (the “linker name”, e.g. libm.so) is what the linker uses at build time.
o For macOS (Darwin) systems, the shared libraries' ending is ‘.dylib’.

Libraries placement

Linux systems
o /lib – libraries required for system startup
o /usr/lib – other system libraries
o /usr/local/lib – 3rd party libraries – libraries that are not part of the system
o Note also that additional library locations are listed in the .conf files in the /etc/ld.so.conf.d folder.
o The ldconfig command reads those .conf files and updates the cache of shared libraries (/etc/ld.so.cache), so it should be run after a (new) library location has been added to a .conf file. It can also show us the location of a particular library (if it finds it), via ldconfig -p.

macOS (Darwin)
The same is also valid for macOS systems, however, the /lib folder does not exist. Instead, there is the /System/Library folder where only the system has access. Also, the /Library folder is where application, framework, and library bundles are placed upon their installation. Mostly, standard and other C/C++ libraries are also kept in system bundles.
Note that in macOS, a bundle usually refers to the directory with a standard structure, where executables, and/or libraries, and other resources are kept together.

Examining the libm.so shared library and targeting a function/symbol

libm.so is the standard C math library, providing a large number of common elementary mathematical functions. [See more here]. The exposed shared functions are also known as interfaces. Among them is sqrt, for computing the square root of a number, which is our target in this example. Note that the name of the library in macOS is libm.dylib.

In Linux, you can use the ldconfig command to locate it, e.g.:

$ ldconfig -p | grep libm.so
	libmysofa.so.1 (libc6,AArch64) => /lib/aarch64-linux-gnu/libmysofa.so.1
	libm.so.6 (libc6,AArch64) => /lib/aarch64-linux-gnu/libm.so.6
	libm.so (libc6,AArch64) => /lib/aarch64-linux-gnu/libm.so
$

As you can see for the Linux installation above, it is under the /lib/aarch64-linux-gnu directory. (More precisely, the file libm.so is a symlink to the library with the soname ‘libm.so.6’).

When we want to load a shared library and access something from it, e.g. a function, we have to know about that function. In a binary file like a shared library, we can look for its exposed symbols. Symbols can represent anything, e.g.: a function, a variable, a class, etc. The same is true for the libm.so.

As we’ve said, libm.so exposes a number of functions that reside as symbols inside it. So, let’s look for the sqrt symbol(s) in it.

Since in the Linux world the file format of shared libraries is generally the standard ELF (Executable and Linkable Format), we can use, for example, the readelf utility, which comes with GNU Binutils and displays information from any ELF format object file.

e.g.:

$ readelf -s /usr/lib/aarch64-linux-gnu/libm.so | grep sqrt

As you can see, there are a number of symbols containing sqrt. That is OK: C has no function overloading, so the library provides separately named variants for different number types, e.g. sqrtf for float, sqrt for double and sqrtl for long double.

Similarly, we can also use the GNU Binutils objdump utility that displays information from object files e.g.:

$ objdump -T /lib/aarch64-linux-gnu/libm.so | grep sqrt

One more tool from the GNU Binutils family is nm, which lists symbols from object files, e.g.:

$ nm -D /usr/lib/aarch64-linux-gnu/libm.so | grep sqrt

The outputs are similar to the readelf output.

Now that we have confirmed our target symbol, we can proceed with the code.

Code Example 1

A simple program that dynamically loads the libm.so library and calls/uses its sqrt function

In order to load a binary file and access something of it we have to:
• Open the file
• Look for the symbol(s) we are interested in and probably retrieve it/them
• Deal with eventual errors
• Close the file and free the memory

For this purpose, we generally use a set of low-level C functions declared in the <dlfcn.h> header: dlopen(), dlsym(), dlerror() and dlclose(). Below is a short description of each of them, focused on what we are going to do. You can read more in other related posts, e.g.: here.

dlopen()
dlopen() loads a binary file (such as a shared library or an executable) and returns an opaque pointer (a handle) to it, or NULL on failure. It has 2 parameters: The first one is a string (actually a const char*) with the name of the binary (e.g.: the shared library name). The second is an int, the mode flag, which generally defines when referenced symbols are resolved. Just to mention 2 of the flags: RTLD_LAZY and RTLD_NOW. RTLD_LAZY instructs dlopen() to resolve a symbol only when the code that references it is executed. If a symbol is never referenced, then it is never resolved. Note that lazy binding applies only to function references; references to variables are always resolved immediately, upon library loading. RTLD_NOW resolves all undefined symbols before dlopen() returns.

dlsym()
It returns the address of a symbol inside the object (binary/library) loaded via dlopen(). It has 2 parameters: The first is the handle of the loaded shared object. The second is a string (actually a const char*) with the name of the symbol.
The pointer returned from dlsym() is a void pointer. When the symbol is a function and we want to call it, we have to convert the void pointer to the appropriate function pointer type.

dlerror()
It returns a human readable string describing the most recent error that occurred from dlopen(), dlsym() or dlclose() since the last call to dlerror().
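
As a rough illustration of how dlerror() is usually consulted right after a failed dlopen(), here is a minimal sketch. The helper name load_error is made up for this illustration and is not part of any library:

```cpp
#include <dlfcn.h>   // dlopen, dlerror, dlclose
#include <string>

// Hypothetical helper: try to open a library and return the dlerror()
// message on failure, or an empty string on success.
std::string load_error(const char* library_name) {
    dlerror();  // clear any stale error left by an earlier dl* call

    void* handle = dlopen(library_name, RTLD_LAZY);
    if (handle == nullptr) {
        // dlerror() describes the most recent failure; copy the text right
        // away, because a later dl* call may overwrite it.
        const char* message = dlerror();
        return message != nullptr ? message : "unknown dlopen error";
    }

    dlclose(handle);  // we only wanted to probe, so release the library again
    return std::string();
}
```

Calling load_error() with a library name that does not exist should return a non-empty, system-specific message, while passing a null pointer (which asks dlopen() for a handle to the main program itself) should return an empty string.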

dlclose()
Decrements the reference count of the binary object/library (loaded via dlopen); when the count drops to zero, the library is unloaded and its memory is freed.

After this short intro, below is a first code example of dynamically loading the libm library and using its sqrt function.
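
The four steps listed earlier (open, look up, handle errors, close) can be sketched roughly as follows. This is not the listing from the repo, only a minimal sketch under a few assumptions: the helper name sqrt_via_dlopen and the constant kMathLibrary are hypothetical, and the library is requested as ‘libm.so.6’ on Linux and ‘libm.dylib’ on macOS:

```cpp
#include <dlfcn.h>    // dlopen, dlsym, dlerror, dlclose
#include <memory>
#include <stdexcept>

#if defined(__APPLE__)
constexpr const char* kMathLibrary = "libm.dylib";
#else
constexpr const char* kMathLibrary = "libm.so.6";  // soname from the ldconfig output
#endif

// Signature of the double-precision sqrt exported by libm.
using SqrtFn = double (*)(double);

// Hypothetical helper: load libm at run time, look up "sqrt" and call it.
double sqrt_via_dlopen(double value) {
    // unique_ptr with dlclose as deleter releases the library on every
    // exit path; the deleter is not called when dlopen returned nullptr.
    std::unique_ptr<void, int (*)(void*)> handle(
        dlopen(kMathLibrary, RTLD_LAZY), &dlclose);
    if (!handle) {
        const char* message = dlerror();
        throw std::runtime_error(message ? message : "dlopen failed");
    }

    dlerror();  // clear stale errors so the check after dlsym is reliable
    void* symbol = dlsym(handle.get(), "sqrt");
    if (const char* message = dlerror()) {
        throw std::runtime_error(message);
    }

    // dlsym returns void*; convert it to the function pointer type we expect.
    auto sqrt_fn = reinterpret_cast<SqrtFn>(symbol);
    return sqrt_fn(value);
}
```

Note that no math header is included: the only thing the program knows about sqrt is the signature we wrote in SqrtFn, so a wrong signature would not be caught by the compiler. On some Linux systems you may also need to add -ldl when linking.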

The makefile

For your convenience this is the makefile that you can use.

It is actually based on the final version of the makefile of one other post of mine:

https://www.devxperiences.com/pzwp1/2023/05/11/make-your-makefiles/

It uses a folder structure similar to:

As you can see above (in both the source file ‘main.cpp’ and the ‘makefile’), our program targets a Linux system as well as a macOS one. It compiles and works OK on both of them. If you have used make to compile the source code, you will have noticed that I use 2 similar yet different names for the executables: ‘x_dlopen_libm’ for a Linux system and ‘m_dlopen_libm’ for macOS. Of course, it is presumed that you have already installed on your macOS both the GNU Compiler Collection (gcc, g++, etc.) and GNU Binutils via brew. However, if you are working on macOS, you are probably wondering where the heck libm.dylib is.

You can compile and link the app, just by using the ‘make’ command.

About the macOS libm.dylib

As we have previously said, a number of standard and other C/C++ libraries are kept in system bundles. The macOS dynamic linker is responsible for searching those bundles and finding a shared library requested via dlopen(). A first approach to finding out what the dependencies of an executable are, and where they are located, is to use the macOS otool and inspect the executable. In our case, this is the command to be used:

$ otool -L /bin/m_dlopen_libm

Note that the macOS otool is similar to the GNU Binutils objdump tool. According to the man pages: “otool command displays specified parts of object files or libraries. It is the preferred tool for inspecting Mach-O binaries, especially for binaries that are bad, corrupted, or fuzzed. It is also useful in situations when inspecting files with new or “bleeding-edge” Mach-O file format changes”.

Furthermore, a way to see the search paths in macOS is to request a non-existing library and look at the error output from dlerror(). For instance, you can change the name of the library from ‘libm’ to ‘libm-x’, rebuild the program and run it. dlerror() then outputs a long list of searched folders:

Finally, we have to be aware that from “macOS Big Sur 11.0.1, the system ships with a built-in dynamic linker cache of all system-provided libraries. As part of this change, copies of dynamic libraries are no longer present on the filesystem”. So, looking for a system library in macOS is like searching in a maze, and thus, searching further is out of the scope of this post.

You can find the repo of the first example, here.

Now it’s time to proceed and create our own shared/dynamic library.

Code Example 2

A simple custom shared library

For our demo purposes we are going to make a very simple shared library, having just 3 simple functions. The source file is named “dynlibexample.cpp” and the header file “dynlibexample.h”. Below, you can find both:

You can compile and link it by using the commands:

$ g++ -c -fPIC *.cpp

Above, we use the -fPIC flag, which stands for “Position Independent Code” and is a requirement for shared libraries.

You can then link the object file(s) into one library:

$ g++ -shared -o liball.so *.o

Above, we use the -shared flag. This flag tells g++ that you want to turn the object code (.o files) into a shared object file (.so), aka a dynamic library on Linux/Unix-based systems.

Again, for your convenience, this is a handy example of a makefile, named ‘makedynlib’:

You can use it like this:

$ make -f makedynlib

It creates a library named ‘libdynlibexample.so’ for Linux-es and ‘libdynlibexample.dylib’ for macOS-es

It has a number of useful targets. For instance, you can use its ‘setup’ target

$ make -f makedynlib setup

for initializing/scaffolding your library project. The folder structure is similar to:

Avoid “mangling” symbols in the compiled/linked binary file of the library

However, going back to the code, what is interesting is the fact that in the header, the declarations of the functions are enclosed inside the block:

extern "C" {
. . .
}

This is a linkage specification instructing the C++ compiler to treat the symbols of the block-enclosed functions and/or variables as C symbols, which means leaving the symbols’ names in the compiled binary un-mangled. This is necessary because we use C++ and not just C. C++ allows overloading of function names, but C does not. So, the C++ compiler cannot simply use a function name as a unique symbol for linking, and thus it mangles the name by adding prefixes and/or suffixes that encode, among other things, the parameter types.
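
To make the effect more concrete, here is a minimal sketch of a library interface. The function names are hypothetical and are not the ones from the repo:

```cpp
// Declarations with C linkage: the symbols keep their plain names.
extern "C" {
int add_numbers(int a, int b);
int multiply_numbers(int a, int b);
}

// The definitions inherit the C linkage from the declarations above, so the
// binary exports them simply as "add_numbers" and "multiply_numbers".
int add_numbers(int a, int b) { return a + b; }
int multiply_numbers(int a, int b) { return a * b; }

// Declared outside the block, so it has C++ linkage and its name is mangled.
int add_cpp(int a, int b) { return a + b; }
```

With GCC or Clang on Linux, the last function would typically be exported as ‘_Z7add_cppii’ (the name length followed by the name, and ‘ii’ for the two int parameters), so dlsym(handle, "add_cpp") would not find it, while dlsym(handle, "add_numbers") would work.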

You can see the differences in the created binary/library, using the ‘nm’ tool for taking a look at exposed symbols, with and without using the extern “C” block:

In Linux (Ubuntu)

$ nm -D lib/libdynlibexample.so

Or in macOS after having installed GNU Binutils via brew

$ /usr/local/opt/binutils/bin/nm -D lib/libdynlibexample.dylib

This is the output with extern “C” block:

This is the output without extern “C” block:

For more info about symbols name mangling, you can see here.

So, here we use the extern “C” linkage, and therefore, the C++ compiler does not add any prefixes/suffixes to the functions’ names/symbols in the compiled binary. This is essential for locating the functions’ symbols, when we dynamically load the library.

Find the repo of the second example, here.

Let’s now proceed to the next code example, for dynamically loading the above library and using its functions.

Code Example 3

Load the shared library from the previous (second) example

This example is similar to the first one, but this time we are going to load our custom dynamic library, created in the previous example. So, as the basis, we can use the first code example repo.

What we have to do is to make the necessary changes to the main.cpp file for dynamically loading our custom library ‘libdynlibexample’ (.so or .dylib), and using its exposed functions. Apart from the name of the app, which now is changed to ‘x_dynloadlib’ for Linux and ‘m_dynloadlib’ for macOS, the makefile remains the same. Finally, we also added the ‘utils.cpp’ with its header file ‘utils.h’, which contains just 1 function. The function is named ‘getfirstinteger()’ and it is for returning an integer from the first part only of an input string containing decimal digits. The whole repo can be found here, but below you can also find the updated main.cpp.
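
To give an idea of what such a helper could look like, here is a minimal sketch of a getfirstinteger()-style function. The signature and the edge-case behaviour are assumptions made for this illustration and may differ from the version in the repo:

```cpp
#include <cctype>   // std::isdigit
#include <string>

// Sketch of a getfirstinteger()-style helper. Assumptions (not taken from the
// repo): only the leading run of decimal digits is read, 0 is returned when
// the string does not start with a digit, and overflow is not handled.
int getfirstinteger(const std::string& input) {
    int value = 0;
    for (char ch : input) {
        if (!std::isdigit(static_cast<unsigned char>(ch))) {
            break;  // stop at the first non-digit character
        }
        value = value * 10 + (ch - '0');
    }
    return value;
}
```

Under these assumptions, an input such as "42abc" yields 42, and an input that does not start with a digit yields 0.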

Again, you can use the ‘make’ command to compile and link the app.

About finding and opening the target dynamic/shared library

However, the important thing here is to place our custom library somewhere and make it available to our app. Generally, as we have previously mentioned, in Linux-es, standard places are /usr/lib and /usr/local/lib for custom and 3rd-party libraries. However, those folders are mainly used for production deployment. Furthermore, it is not a good practice to use a hard-coded full-path name.

In Linux-es, we should be aware that the current path, from which our executable runs and calls that library, is not included in the default search locations of dlopen() (the dynamic linker). See the man pages of dlopen here or here about the default search places. However, among the searched places are also the paths defined via the LD_LIBRARY_PATH env variable and the cache file /etc/ld.so.cache (maintained by ldconfig). To set LD_LIBRARY_PATH to point to the same directory where our executable runs, we can use the following command (from within the /bin directory where our executable resides):

$ export LD_LIBRARY_PATH=.

On the other hand, in macOS-es, the path from which our executable runs (the bin folder for our project) is actually one of the first places searched by the dlopen(). So, there is no need for defining the LD_LIBRARY_PATH env variable.
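
As a possible alternative for quick experiments, the library can also be requested with an explicit directory. The following is only a sketch; the helper names library_file_name and open_library_in are hypothetical:

```cpp
#include <dlfcn.h>   // dlopen
#include <string>

// Hypothetical helper: build the platform-specific file name of a library,
// e.g. "dynlibexample" becomes "libdynlibexample.so" on Linux and
// "libdynlibexample.dylib" on macOS.
std::string library_file_name(const std::string& base_name) {
#if defined(__APPLE__)
    return "lib" + base_name + ".dylib";
#else
    return "lib" + base_name + ".so";
#endif
}

// Hypothetical helper: open a library from an explicit directory. Because
// the resulting string contains a '/', dlopen() on Linux interprets it as a
// (relative or absolute) path name instead of searching the default places.
// Returns nullptr on failure; dlerror() then holds the reason.
void* open_library_in(const std::string& directory,
                      const std::string& base_name) {
    const std::string path = directory + "/" + library_file_name(base_name);
    return dlopen(path.c_str(), RTLD_NOW);
}
```

For example, open_library_in(".", "dynlibexample") would look for the library in the current working directory. Keep in mind that a relative path is resolved against the working directory of the process, not against the location of the executable, so this is a convenience for development rather than a replacement for a proper installation.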

This is the structure of the project:

Conclusion(s)

Generally, dynamic loading is not the common approach to using shared libraries. Usually, you will use the “normal” dynamic linking instead. However, there are some cases, e.g. libraries/binaries used as ‘plugins’, where the dynamic loading mechanism is the natural choice. Finally, keep in mind that dynamic loading does extra work at run time (opening the library and looking up each symbol by name), and thus the whole process is a bit slower.

That said, we have reached the end of this post.

That’s it for now! I hope you enjoyed it!
Thanks for reading and stay tuned!

2 Comments

Ankit Jain says:

29/05/2024 at 17:21

Found this article very helpful and easy to understand.

John says:

23/01/2025 at 03:36

Great article. It’s not only a good introduction to dynamic loading but what is shown also helps understand dynamic linking a little bit better.