@danielhenrymantilla
Created April 3, 2025 15:57
Understanding the role of `SONAME` in dynamic linkage and dynamic loading

Note that this is a Linux-specific feature.

The primary role of SONAME

Say we are a library libfoo.so.3.4, and some dependent binary artifact is being linked against us.

  • Note that this dependent binary artifact may well be a final executable, à la ./main;

  • but it may just as well be another library acting as "middleware", such as libbar.so -> libfoo.so…. Such a library may be a shared/dynamic library, or a static library. I haven't looked into the latter case, since it is not relevant here (and in general, things can get weird when implicitly mixing up these linkage modes).

Then, by having a SONAME attribute in our dynamic-section metadata (visible through objdump -p or readelf -d), with a value of, say, libfoo.so.3, we are telling the linker that the path the dependent binary should use to refer back to us (later on!) ought to be libfoo.so.3, rather than whatever filesystem path had been used to refer to us at link time. Usually, that path would have been a libfoo.so symlinking back to us, discovered either because we were in one of the system auto-discovered directories, such as /usr/lib, or because a specific -L directory lookup flag had been provided to the linker.

The result of this can be observed in the NEEDED property of the dependent binary.

  • should there not have been a SONAME, the value here would have resulted from the filesystem path or whatnot:

    • if the library was referred to through the typical -l foo (+ optional -L and/or /usr/lib location having been used), then the resulting NEEDED "path"/name being used is libfoo.so, unqualified (no / in that path).
    • but if the library was referred to via an explicit some/relative/or/abs/path/to/libfoo.so…, then the resulting NEEDED path being used becomes some/relative/or/abs/path/to/libfoo.so….
  • otherwise, if there was a SONAME=<stuff…> set in libfoo.so.3.4 (via a -soname=<stuff…> linker flag, i.e., a -Wl,-soname=<stuff…> cc linkage flag (LDFLAGS)), then the resulting NEEDED path being used is overridden to become <stuff…>, independently of whichever filepath and whatever reference method (-lfoo vs. libfoo.so…) had been used to refer to our libfoo.so.3.4 file.
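
To make these three cases concrete, here is a toy C model of the decision. It is only a sketch, not actual linker code: needed_entry and its parameters are hypothetical names introduced for illustration, and the -l case assumes the linker found a shared libfoo.so rather than a static archive.

```c
#include <stdio.h>
#include <string.h>

/* Toy model of how the NEEDED string for a dependency gets chosen.
 * Hypothetical helper, not part of any real linker API.
 *
 *   soname   : the library's SONAME, or NULL if it has none
 *   dash_l   : non-zero if the library was named with `-l <name>`
 *   arg      : the <name> of `-l <name>`, or the explicit path otherwise
 *   out/size : buffer receiving the resulting NEEDED string
 */
const char *needed_entry(const char *soname, int dash_l,
                         const char *arg, char *out, size_t size)
{
    if (soname != NULL) {
        /* A SONAME always wins, however the library was referred to. */
        snprintf(out, size, "%s", soname);
    } else if (dash_l) {
        /* `-l foo` resolved to libfoo.so: unqualified name, no '/'. */
        snprintf(out, size, "lib%s.so", arg);
    } else {
        /* Explicit path: recorded as given, slashes included. */
        snprintf(out, size, "%s", arg);
    }
    return out;
}
```

For instance, needed_entry("libfoo.so.3", 0, "some/path/libfoo.so", buf, sizeof buf) yields libfoo.so.3, whereas passing NULL as the SONAME yields the path as given.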

The role of NEEDED

The NEEDED entry of a binary is then used in two / two-and-a-half cases.

  • when running/executing a ./main standalone-executable binary, the "automatic dynamic loading" stemming from the default dynamic linkage runtime of Linux gets automagically and implicitly called, tasked to look up the link-time-specified dynamic libraries on which ./main depends, such as libfoo; it shall try to locate them using, as their "path", the value of the corresponding NEEDED entry (so as to then load the library for it to be available to the current binary);

  • similarly, when dynamically loading —either through the mechanism triggered by the previous bullet ("automatic" dynamic loading), or through explicit usage of the fully-dynamic loading machinery (dlopen, LoadLibrary, …)— some libbar.so dynamic library, then "automatic" dynamic loading is triggered here as well, to load the next layer of transitive shared library dependencies.

    For instance, assuming libbar.so had been link-time-specified to depend on libfoo.so (a.k.a. "libbar.so had been 'linked against' foo"), it means the dynamic loader will be automatically invoked to look up and load some libfoo.so, at whichever path is specified by the NEEDED entry for foo within the dynamic section of the libbar.so file.

Rationale

The main rationale and objective of this design is to enable some form of "smart SemVer" with shared libraries within a system, based on the following Linux sysadmin assumptions:

  • every specific libfoo.so.3.4 kind of artifact, when being produced, is given a SONAME of libfoo.so.3, that is, with every non-major version number having been stripped. This makes it so this artifact "back-references" an eponymous libfoo.so.3 file assumed to be in scope.

  • Whenever such an artifact is "installed" in a proper directory, the following symlinks are to be updated:

    • ln -sf libfoo.so.3.<latest_minor> libfoo.so.3, with <latest_minor> representing, at the time of running that command, the highest possible n so that libfoo.so.3.n exists.
    • and likewise, ln -sf libfoo.so.<latest_major> libfoo.so.

That way, we ought to end up with the following setup:

image

which, in the future, could become:

image
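
The naming convention behind these symlinks (keep the major version, drop the rest) can be sketched in C as follows. soname_of is a hypothetical helper written for this note; it only illustrates the convention and assumes the libname.so.MAJOR.MINOR file naming used throughout.

```c
#include <string.h>

/* Derive the conventional SONAME from a fully versioned file name by
 * keeping only the major version: "libfoo.so.3.4" -> "libfoo.so.3".
 * Hypothetical helper illustrating the naming convention only.
 * Returns 0 on success, -1 if `file` has no ".so.<major>" part or if
 * the result does not fit in `out`. */
int soname_of(const char *file, char *out, size_t size)
{
    const char *so = strstr(file, ".so.");
    if (so == NULL)
        return -1;

    const char *major = so + 4;                 /* first char after ".so." */
    size_t digits = strspn(major, "0123456789");
    if (digits == 0)
        return -1;

    size_t len = (size_t)(major - file) + digits;
    if (len + 1 > size)
        return -1;

    memcpy(out, file, len);
    out[len] = '\0';
    return 0;
}
```

With this, both libfoo.so.3.4 and libfoo.so.3.5 map to libfoo.so.3, which is exactly the name the libfoo.so.3 symlink is expected to carry.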

Note

There appears to be a special interaction between changing/updating these things, and the "ld cache", which is a memoized resolution of which libraries to load for a given name, and where, across all the mess of different directory priorities and SONAMEs and whatnot. I haven't found a single source which clearly explains this, and have not found it worth it to delve into this further. Just beware, you may need to "refresh the cache" (typically by running ldconfig) when installing new shared libraries, that's all.

Application

(Dynamic) linkage of ./main at some time t0, with libfoo.<max> = libfoo.so.3.4

Now, say we are to link against -lfoo, with libfoo.so.3.4 being the highest version of libfoo installed on the system at this time:

image

Graphviz source
digraph {
    graph [fontname = "courier" style=dotted];
    node [fontname = "courier" shape=box];
    edge [fontname = "courier"];
    rankdir = BT
    
    subgraph cluster_filesystem {
        labelloc = b
        label = "\
in the FS

(say 3 is the *latest* major version
right now (time of linkage))"
        libfoo [label = "libfoo.so"]
        libfoo -> "libfoo.so.3.4" [
            label = "ln -s\n(2)"
            style = dashed
        ]
        
        "libfoo.so.3" -> "libfoo.so.3.4" [
            color = red
            label = " SONAME\n(3)"
            dir = back
        ]
        
        "libfoo.so.3" -> "libfoo.so.3.4" [
            label = " ln -s\n(unused atm)   "
            style = dashed
        ]
    }
    
    subgraph cluster_compilation {
        label = "linkage of\ndependent binary"
        main -> libfoo [
            label = "-l foo\n(1)"
            style = dotted
        ]
    }
    
    main -> "libfoo.so.3" [
        color = red
        label = "NEEDED\n(result!)"
    ]
    
    {
        node [style = invis]
        {rank = same;
            a
            b
        }
        {rank = same;
            c
            d
        }
        
        a -> b [
            style = dashed
            label = temporary
        ]
        
        c -> d [
            label = permanent
        ]
        
        c -> a [style = invis]
    }
}

All this machinery and definitions and assumptions make it so our dependent binary, ./main in the example, ends up NEEDED-referring to libfoo.so.3.

And now, imagine having installed newer versions of libfoo.so…, such as 3.5 (minor bump), and 4.… (major bump).

Runtime! Executing ./main at some later point t1, with libfoo.<max> = libfoo.so.4.1

We end up with:

image

Graphviz source
digraph {
    graph [fontname = "courier" style=dotted];
    node [fontname = "courier" shape=box];
    edge [fontname = "courier"];
    rankdir = BT
    
    subgraph cluster_runtime {label = "\
at runtime: `./main`

Time for the
`ld.so` *loader*
to kick in."
        
        main
    }
    
    main -> "libfoo.so.3" [
        label = " NEEDED\n(1)"
        color = red
    ]
    
    subgraph cluster_filesystem {
        labelloc = b
        "libfoo.so.3.5"
        "libfoo.so.4.0"
        label = "\
 in the FS

 (say 4 is the *latest* major version now)"
        libfoo [label = "libfoo.so"]
        libfoo -> "libfoo.so.4.1" [
            label = "ln -s"
            style = dashed
        ]
        
        "libfoo.so.3.4"
        "libfoo.so.3.5" [
            color = red
        ]
        "libfoo.so.3" -> "libfoo.so.3.5" [
            label = "ln -s  \n(2)"
            style = dashed
            color = red
        ]
        "libfoo.so.3" -> "libfoo.so.3.5" [
            label = " SONAME"
            dir = back
        ]
        "libfoo.so.3" -> "libfoo.so.3.4" [
            label = "SONAME"
            dir = back
        ]
        "libfoo.so.4" -> "libfoo.so.4.0" [
            label = "SONAME"
            dir = back
        ]
        "libfoo.so.4" -> "libfoo.so.4.1" [
            label = "ln -s"
            style = dashed
        ]
        "libfoo.so.4" -> "libfoo.so.4.1" [
            label = "SONAME"
            dir = back
        ]
    }
}

Notice how the binary properly loads a minor-compatible (i.e., API and ABI-compatible) bumped version of libfoo.so, without falling into the trap of loading a major-incompatible (e.g., with some API or ABI incompatibility) version thereof!

The secondary role of SONAME, a happy(?) byproduct: "fixing linkage-path sensitivity"

Go back and read very carefully the rules of the "The primary role of SONAME" section, but focusing on the difference between specifying the libfoo.so dependency via -lfoo (+ optionally, -L . or -L /abs/path) vs. via libfoo.so | ./libfoo.so | /abs/path/to/libfoo.so.

And now, consider the rules of dlopen(<libname|path>) resolution:

  • if the string given to dlopen is some library name (identified by the lack of / in it!), then all the shenanigans about dynamic library location ensue (RPATH, else LD_LIBRARY_PATH, else RUNPATH, else system directories, …).

  • otherwise, the library is assumed to be located at the given path, resolved (when relative) against the current working directory. That is, the working directory of whoever ran the original ./main command, or even elsewhere if it changed before some explicit call to dlopen(). This working directory very much does not have to be the directory where the main binary is located! (e.g., if the caller runs a subdir/main command, the working directory is the parent of main's directory.)
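
The two branches above hinge on a single test, documented in dlopen(3): does the string contain a slash? A minimal C sketch of that classification follows; classify_dlopen_arg and the enum are hypothetical names, and the argument is assumed to be non-NULL.

```c
#include <string.h>

/* The two ways dlopen() can interpret its (non-NULL) argument. */
enum lookup_kind {
    LOOKUP_SEARCH,  /* bare name: RPATH / LD_LIBRARY_PATH / RUNPATH / system dirs */
    LOOKUP_PATH     /* contains '/': used as a path, relative ones against the cwd */
};

/* Hypothetical helper mirroring the slash test described in dlopen(3). */
enum lookup_kind classify_dlopen_arg(const char *arg)
{
    return strchr(arg, '/') != NULL ? LOOKUP_PATH : LOOKUP_SEARCH;
}
```

Note that ./libfoo.so counts as a path (it contains a slash), while libfoo.so.3 triggers the full library search.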

Tip

There is a special magical token on Linux, called $ORIGIN, which acts like ./, except that it is resolved relative to the directory containing the binary whose entry mentions it (the program or the shared object, according to ld.so(8)) rather than relative to the working directory. This is what allows a deployment to follow a Windows-like pattern of packaging an app within a same-dir bundle à la dir/{main,libfoo.so}.
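
$ORIGIN is most commonly seen inside an RPATH/RUNPATH entry (for example, -Wl,-rpath,'$ORIGIN' at link time). The substitution itself can be pictured with the toy C sketch below. expand_origin is a hypothetical helper, and unlike the real loader it only handles a leading $ORIGIN.

```c
#include <stdio.h>
#include <string.h>

/* Toy expansion of a leading "$ORIGIN" in a search-path entry.
 * `object_dir` stands for the directory holding the binary that
 * carries the entry. Hypothetical helper, not the loader's own code.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
int expand_origin(const char *entry, const char *object_dir,
                  char *out, size_t size)
{
    static const char token[] = "$ORIGIN";
    size_t tlen = sizeof token - 1;
    int n;

    if (strncmp(entry, token, tlen) == 0)
        n = snprintf(out, size, "%s%s", object_dir, entry + tlen);
    else
        n = snprintf(out, size, "%s", entry);   /* nothing to expand */

    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

So an entry of $ORIGIN/lib carried by a binary living in /opt/app would be looked up as /opt/app/lib, wherever the user happened to run the command from.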

And this is where SONAME can then come and either ruin the day, or save it.

Application: how SONAME can fix an improper linkage specification

Let's consider the case of libbar.so -> libfoo.so (we won't care about version numbers here).

Now, imagine some app dlopen()ing libbar.so, using whichever path/reference or w/e which makes that direct layer of loading work and succeed.

Since libbar.so "had been linked against libfoo.so" (i.e., since there had been a link-time specification of libbar depending on libfoo resulting in some NEEDED entry inside libbar.so referring back to some path or identifier of libfoo.so), the dynamic loader is not finished: it now needs to load libfoo.so.

And to do so, it "simply" proceeds to do, in pseudo-code, a dlopen(libbar.dynamic_deps[foo].NEEDED), i.e., it acts as if dlopen()ing the path or identifier specified over its NEEDED entry for foo.

So, in the case libbar.so had been "linked against libfoo" using some command line along the lines of relative/path/to/libfoo.so (rather than -lfoo, -L relative/path/to), it means that the default NEEDED value for libfoo in libbar.so is relative/path/to/libfoo.so rather than libfoo.so.

Now, we have two possibilities:

  • If libfoo.so has no SONAME to override the effective/actual NEEDED being used

    then NEEDED ends up being relative/path/to/libfoo.so

    So, back to our dlopen()ing of libbar.so, and its transitive dlopen(<libbar's NEEDED for libfoo>), we end up with:

    dlopen("relative/path/to/libfoo.so");

    which is Bad™, because, again, this is a relative path, resolved relative to the completely arbitrary working directory of the user.

    • Best case scenario, libfoo.so is not found, dlopening fails, and the program probably exits there and then, having failed.

    • Worse scenario, the program did not check for success of dlopen(), and it starts using a NULL pointer:

      • it probably segfaults;

      • it could lead to a remote code execution vulnerability, in some very contrived and sophisticated attack scenario.

    • Worst case scenario, there is a malicious libfoo.so located there, and control of the program is hijacked in an incredibly simple and trivial way.

  • If libfoo.so does have a SONAME to override the effective/actual NEEDED being used

    and assuming that SONAME to be a sane library identifier, i.e., with no /s involved in the path, then we end up with:

    dlopen("libfoo.so"); // or `dlopen("libfoo.so.3")` or w/e

    and the usual dynamic-library lookup rules ensue, and all is Fine™.
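
Whichever case applies, the "worse scenario" above is avoidable on the caller's side by checking what dlopen() returns. A minimal C sketch, where load_or_report is a hypothetical wrapper name:

```c
#include <dlfcn.h>
#include <stdio.h>

/* Load a library and report failure instead of silently handing back
 * a NULL handle. Returns the handle, or NULL after printing the
 * message provided by dlerror(). */
void *load_or_report(const char *name_or_path)
{
    void *handle = dlopen(name_or_path, RTLD_NOW);
    if (handle == NULL) {
        /* dlerror() describes the most recent dl* failure. */
        fprintf(stderr, "dlopen(%s) failed: %s\n", name_or_path, dlerror());
    }
    return handle;
}
```

The caller still has to test the returned handle before passing it to dlsym(). On glibc versions older than 2.34, this needs to be linked with -ldl.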
