Note that this is a Linux-specific feature.
Say we are a library `libfoo.so.3.4`, and some dependent binary artifact is being linked against us.
- Note that this dependent binary artifact may well be a final executable, à la `./main`;
- but it may just as well be another library acting as "middleware", such as `libbar.so -> libfoo.so…`. Such a library may be a shared/dynamic library, or a static library. I haven't looked into the latter case, since it is not relevant here (and in general, things can get weird when implicitly mixing up these linkage modes).
Then, by having a `SONAME` attribute among our dynamic-section metadata (as shown by `objdump -p` or `readelf -d`), with a value of, say, `libfoo.so.3`, we are telling the linker that the real path for that dependent binary to refer back to us (later on!) ought to be `libfoo.so.3` rather than whatever filesystem path had been used to refer to us. Usually, that path would have been a `libfoo.so` symlinking back to us, discovered either because we were in one of the system auto-discovered directories, such as `/usr/lib`, or through a specific `-L` directory lookup flag having been provided to the linker.
The result of this can be observed in the `NEEDED` property of the dependent binary.
- should there not have been a `SONAME`, the value here would have resulted from the filesystem path or whatnot:
  - if the library was referred to through the typical `-l foo` (+ optional `-L` and/or `/usr/lib` location having been used), then the resulting `NEEDED` "path"/name being used is `libfoo.so`, unqualified (no `/` in that path).
  - but if the library was referred to via an explicit `some/relative/or/abs/path/to/libfoo.so…`, then the resulting `NEEDED` path being used becomes that same `some/relative/or/abs/path/to/libfoo.so…`.
- otherwise, if there was a `SONAME=<stuff…>` set in `libfoo.so.3.4` (via a `-soname=<stuff…>` linker flag, i.e., a `-Wl,-soname=<stuff…>` `cc` linkage flag (`LDFLAGS`)), then the resulting `NEEDED` path being used is overridden to become `<stuff…>`, independently of whichever filepath and whatever reference method (`-lfoo` vs. `libfoo.so…`) had been used to refer to our `libfoo.so.3.4` file.
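
The rules above can be condensed into a tiny model. This is only a sketch of the decision described in this section, not what the linker literally runs; the function name `needed_entry` and its parameters are made up for illustration, and it assumes that `-l foo` was resolved to a file named `libfoo.so`.

```python
from typing import Optional


def needed_entry(link_arg: str, soname: Optional[str] = None) -> str:
    """Model of the NEEDED value recorded in the dependent binary.

    link_arg: how the library was named on the link line, either
              "-lfoo" / "-l foo" or an explicit path to the .so file.
    soname:   the SONAME of the library being linked, if it has one.
    """
    # A SONAME always wins, whatever reference method was used.
    if soname is not None:
        return soname
    # No SONAME, "-l foo" style: the unqualified file name is recorded
    # (assumption: the linker found a file called lib<name>.so).
    if link_arg.startswith("-l"):
        name = link_arg[2:].strip()
        return f"lib{name}.so"
    # No SONAME, explicit path: the path is recorded as it was given.
    return link_arg


if __name__ == "__main__":
    print(needed_entry("-lfoo"))
    print(needed_entry("some/relative/path/to/libfoo.so"))
    print(needed_entry("some/relative/path/to/libfoo.so", soname="libfoo.so.3"))
```

The point of the third call is that, once a `SONAME` exists, the way the library was referenced at link time stops mattering.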
The `NEEDED` entry of a binary is then used in two / two-and-a-half cases.
- when running/executing a `./main` standalone-executable binary, the "automatic dynamic loading" stemming from the default dynamic linkage runtime of Linux gets automagically and implicitly called, tasked to look up the link-time-specified dynamic libraries on which `./main` depends, such as `libfoo`; it shall try to locate them using, as their "path", the value of the corresponding `NEEDED` entry (so as to then load the library for it to be available to the current binary);
- similarly, when dynamically loading some `libbar.so` dynamic library, either through the mechanism triggered by the previous bullet ("automatic" dynamic loading), or through explicit usage of the fully-dynamic loading machinery (`dlopen`, `LoadLibrary`, …), then "automatic" dynamic loading is triggered here as well, to load the next layer of transitive shared library dependencies.
  For instance, assuming `libbar.so` had been link-time-specified to depend on `libfoo.so` (a.k.a. "`libbar.so` had been 'linked against' `foo`"), it means the dynamic loader will be automatically invoked to look up and load some `libfoo.so`, at whichever path is specified by the `NEEDED` entry for `foo` within the dynamic section of the `libbar.so` file.
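
To see which `NEEDED` entries the loader will act on, one can read them out of `readelf -d` output. Below is a minimal sketch that parses such output; the helper name `parse_dynamic_section` and the `SAMPLE` text are hypothetical (the sample is hand-written in the usual `readelf -d` layout, not captured from a real binary).

```python
import re
from typing import Dict, List

# Matches the two kinds of `readelf -d` lines we care about, e.g.:
#  0x0000000000000001 (NEEDED)   Shared library: [libfoo.so.3]
#  0x000000000000000e (SONAME)   Library soname: [libbar.so]
_ENTRY = re.compile(
    r"\((NEEDED|SONAME)\)\s+(?:Shared library|Library soname): \[(.*?)\]"
)


def parse_dynamic_section(readelf_output: str) -> Dict[str, List[str]]:
    """Collect the NEEDED and SONAME values found in `readelf -d` text."""
    found: Dict[str, List[str]] = {"NEEDED": [], "SONAME": []}
    for line in readelf_output.splitlines():
        match = _ENTRY.search(line)
        if match:
            found[match.group(1)].append(match.group(2))
    return found


# Hypothetical output for a `libbar.so` linked against our `libfoo`.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libfoo.so.3]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libbar.so]
"""

if __name__ == "__main__":
    print(parse_dynamic_section(SAMPLE))
```

On a real system, the text could come from something like `subprocess.run(["readelf", "-d", path], capture_output=True, text=True).stdout`, assuming `readelf` is installed.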
The main rationale and objective of this design is to enable some form of "smart SemVer" with shared libraries within a system, based on the following Linux sysadmin assumptions:
- every specific `libfoo.so.3.4` kind of artifact, when being produced, is given a `SONAME` of `libfoo.so.3`, that is, with every non-major version number having been stripped. This makes it so this artifact "back-references" an eponymous `libfoo.so.3` file assumed to be in scope.
- Whenever such an artifact is "installed" in a proper directory, the following symlinks are to be updated:
  - `ln -sf libfoo.so.3.<latest_minor> libfoo.so.3`, with `<latest_minor>` representing, at the time of running that command, the highest possible `n` so that `libfoo.so.3.n` exists.
  - and likewise, `ln -sf libfoo.so.<latest_major> libfoo.so`.
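
The symlink bookkeeping described above can be sketched as a small script. This is an illustration under assumptions, not how any package manager actually does it: the helper `refresh_symlinks` is hypothetical, it only understands files named `<stem>.<major>.<minor>`, and the demo works on empty placeholder files in a temporary directory.

```python
import os
import re
import tempfile
from typing import Dict


def refresh_symlinks(directory: str, stem: str = "libfoo.so") -> Dict[str, str]:
    """Recreate `<stem>.<major>` and `<stem>` symlinks, like `ln -sf` would."""
    pattern = re.compile(rf"^{re.escape(stem)}\.(\d+)\.(\d+)$")
    latest_minor: Dict[int, int] = {}
    for entry in os.listdir(directory):
        match = pattern.match(entry)
        if match:
            major, minor = int(match.group(1)), int(match.group(2))
            latest_minor[major] = max(latest_minor.get(major, -1), minor)
    if not latest_minor:
        return {}
    # libfoo.so.<major> -> libfoo.so.<major>.<latest_minor>
    links = {
        f"{stem}.{major}": f"{stem}.{major}.{minor}"
        for major, minor in latest_minor.items()
    }
    # libfoo.so -> libfoo.so.<latest_major>
    links[stem] = f"{stem}.{max(latest_minor)}"
    for link_name, target in links.items():
        link_path = os.path.join(directory, link_name)
        if os.path.lexists(link_path):  # the "-f" part of `ln -sf`
            os.remove(link_path)
        os.symlink(target, link_path)
    return links


def demo() -> Dict[str, str]:
    """Install four fake versions and report where the symlinks point."""
    with tempfile.TemporaryDirectory() as lib_dir:
        for name in ("libfoo.so.3.4", "libfoo.so.3.5",
                     "libfoo.so.4.0", "libfoo.so.4.1"):
            open(os.path.join(lib_dir, name), "w").close()
        refresh_symlinks(lib_dir)
        return {
            name: os.readlink(os.path.join(lib_dir, name))
            for name in ("libfoo.so", "libfoo.so.3", "libfoo.so.4")
        }


if __name__ == "__main__":
    print(demo())
```

Relative symlink targets are used on purpose, mirroring the `ln -sf` commands above, so the directory stays relocatable.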
That way, we ought to end up with the following setup:
which, in the future, could become:
Note
There appears to be a special interaction between changing/updating these things, and the "`ld` cache", which is a memoized resolution of which libraries to load for a given name, and where, across all the mess of different directory priorities and `SONAME`s and whatnot.
I haven't found a single source which clearly explains this, and have not found it worth it to delve into this further.
Just beware, you may need to "refresh the cache" when installing new shared libraries (typically by running `ldconfig`), that's all.
Now, say we are to link against `-lfoo`, with `libfoo.so.3.4` being the highest version of `libfoo` installed on the system at this time:
Graphviz source
digraph {
graph [fontname = "courier" style=dotted];
node [fontname = "courier" shape=box];
edge [fontname = "courier"];
rankdir = BT
subgraph cluster_filesystem {
labelloc = b
label = "\
in the FS
(say 3 is the *latest* major version
right now (time of linkage))"
libfoo [label = "libfoo.so"]
libfoo -> "libfoo.so.3.4" [
label = "ln -s\n(2)"
style = dashed
]
"libfoo.so.3" -> "libfoo.so.3.4" [
color = red
label = " SONAME\n(3)"
dir = back
]
"libfoo.so.3" -> "libfoo.so.3.4" [
label = " ln -s\n(unused atm) "
style = dashed
]
}
subgraph cluster_compilation {
label = "linkage of\ndependent binary"
main -> libfoo [
label = "-l foo\n(1)"
style = dotted
]
}
main -> "libfoo.so.3" [
color = red
label = "NEEDED\n(result!)"
]
{
node [style = invis]
{rank = same;
a
b
}
{rank = same;
c
d
}
a -> b [
style = dashed
label = temporary
]
c -> d [
label = permanent
]
c -> a [style = invis]
}
}
All this machinery and definitions and assumptions make it so our dependent binary, `./main` in the example, ends up `NEEDED`-referring to `libfoo.so.3`.
And now, imagine having installed newer versions of `libfoo.so…`, such as `3.5` (minor bump), and `4.…` (major bump).
We end up with:
Graphviz source
digraph {
graph [fontname = "courier" style=dotted];
node [fontname = "courier" shape=box];
edge [fontname = "courier"];
rankdir = BT
subgraph cluster_runtime {label = "\
at runtime: `./main`
Time for the
`ld` *loader*
to kick in."
main
}
main -> "libfoo.so.3" [
label = " NEEDED\n(1)"
color = red
]
subgraph cluster_filesystem {
labelloc = b
"libfoo.so.3.5"
"libfoo.so.4.0"
label = "\
in the FS
(say 4 is the *latest* major version now)"
libfoo [label = "libfoo.so"]
libfoo -> "libfoo.so.4.1" [
label = "ln -s"
style = dashed
]
"libfoo.so.3.4"
"libfoo.so.3.5" [
color = red
]
"libfoo.so.3" -> "libfoo.so.3.5" [
label = "ln -s \n(2)"
style = dashed
color = red
]
"libfoo.so.3" -> "libfoo.so.3.5" [
label = " SONAME"
dir = back
]
"libfoo.so.3" -> "libfoo.so.3.4" [
label = "SONAME"
dir = back
]
"libfoo.so.4" -> "libfoo.so.4.0" [
label = "SONAME"
dir = back
]
"libfoo.so.4" -> "libfoo.so.4.1" [
label = "ln -s"
style = dashed
]
"libfoo.so.4" -> "libfoo.so.4.1" [
label = "SONAME"
dir = back
]
}
}
Notice how the binary properly loads a minor-compatible (i.e., API- and ABI-compatible) bumped version of `libfoo.so`, without falling into the trap of loading a major-incompatible (e.g., with some API or ABI incompatibility) version thereof!
Go back and read very carefully the rules of the "The primary role of `SONAME`" section, but focusing on how the `libfoo.so` dependency is specified: `-lfoo` (+ optionally, `-L .` or `-L /abs/path`) vs. `libfoo.so | ./libfoo.so | /abs/path/to/libfoo.so`.
And now, consider the rules of `dlopen(<libname|path>)` resolution:
- if the string given to `dlopen` is some library name, identified by the lack of `/` in it!, then all the shenanigans about dynamic library location ensue (`RPATH`, else `LD_LIBRARY_PATH`, else `RUNPATH`, else system directories, …).
- otherwise, the library is assumed to be located at the given path, resolved relative to the current working directory. That is, the working directory of whoever ran the original `./main` command, or even elsewhere if it changed before some explicit call to `dlopen()`. This working directory very much does not have to be that of the location of the `main` binary! (e.g., consider the caller running a `subdir/main` command, then the working directory would be the parent of `main`'s.)
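
The two branches above can be modelled in a few lines. This is a simplified sketch, not the loader's real algorithm: `dlopen_candidates` is a hypothetical helper, and the single `search_dirs` list stands in for the whole `RPATH` / `LD_LIBRARY_PATH` / `RUNPATH` / system-directories sequence.

```python
import posixpath
from typing import List


def dlopen_candidates(name: str, cwd: str, search_dirs: List[str]) -> List[str]:
    """Return the file paths a dlopen(name) call would consider, in order."""
    if "/" in name:
        # Path-like argument: no search at all. A relative path is
        # resolved against the *current working directory*, which is
        # not necessarily where the binary lives.
        return [posixpath.normpath(posixpath.join(cwd, name))]
    # Bare library name: try each search directory in turn.
    return [posixpath.join(directory, name) for directory in search_dirs]


if __name__ == "__main__":
    dirs = ["/usr/local/lib", "/usr/lib"]
    print(dlopen_candidates("libfoo.so", "/home/user", dirs))
    print(dlopen_candidates("./libfoo.so", "/home/user", dirs))
    print(dlopen_candidates("/abs/path/to/libfoo.so", "/home/user", dirs))
```

Note how only the first call is independent of `cwd`; that asymmetry is what the rest of this section is about.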
Tip
There is a special magical var on Linux, called `$ORIGIN/`, which acts like `./`, but for being resolved relative to the dependent binary (or to the final standalone executable, I don't know, actually…), which is what could allow a deployment to involve a Windows-like pattern of packaging an app within a same-dir bundle à la `dir/{main,libfoo.so}`.
And this is where `SONAME` can then come and either ruin the day, or save it.
Let's consider the case of `libbar.so -> libfoo.so` (we won't care about version numbers here).
Now, imagine some app `dlopen()`ing `libbar.so`, using whichever path/reference or w/e which makes that direct layer of loading work and succeed.
Since `libbar.so` "had been linked against `libfoo.so`" (i.e., since there had been a link-time specification of `libbar` depending on `libfoo`, resulting in some `NEEDED` entry inside `libbar.so` referring back to some path or identifier of `libfoo.so`), the dynamic loader is not finished: it now needs to load `libfoo.so`.
And to do so, it "simply" proceeds to do, in pseudo-code, a `dlopen(libbar.dynamic_deps[foo].NEEDED)`, i.e., it acts as if `dlopen()`ing the path or identifier specified over its `NEEDED` entry for `foo`.
So, in the case `libbar.so` had been "linked against `libfoo`" using some command line along the lines of `relative/path/to/libfoo.so` (rather than `-lfoo`, `-L relative/path/to`), it means that the default `NEEDED` value for `libfoo` in `libbar.so` is `relative/path/to/libfoo.so` rather than `libfoo.so`.
Now, we have two possibilities:
- either `libfoo.so` had no `SONAME`, and then `NEEDED` ends up being `relative/path/to/libfoo.so`.
  So, back to our `dlopen()`ing of `libbar.so`, and its transitive `dlopen(<libbar's NEEDED for libfoo>)`, we end up with `dlopen("relative/path/to/libfoo.so");`, which is Bad™, because, again, this is a relative path, resolved relative to the completely arbitrary working directory of the user.
  - Best case scenario, `libfoo.so` is not found, `dlopen`ing fails, and the program probably exits there and then, having failed.
  - Worse scenario, the program did not check for success of `dlopen()`, and it starts using a `NULL` pointer:
    - it probably segfaults;
    - it could lead to a remote code execution vulnerability, in some very contrived and sophisticated attack scenario.
  - Worst case scenario, there is a malicious `libfoo.so` located there, and control of the program is hijacked in an incredibly simple and trivial way.
- or `libfoo.so` had a `SONAME`, and assuming that `SONAME` to be a sane library identifier, i.e., with no `/`s involved in the path, then we end up with `dlopen("libfoo.so");` (or `dlopen("libfoo.so.3")` or w/e), and the usual dynamic-library lookup rules ensue, and all is Fine™.
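
To make the difference between the two possibilities tangible, here is a small simulation on a throwaway directory tree. Everything in it is an assumption made for the demo: the `resolve` helper is a crude stand-in for the loader, the `usr/lib` and `tmp/evil` directories are fake, and the "libraries" are empty files.

```python
import os
import tempfile
from typing import List, Optional, Tuple


def resolve(needed: str, cwd: str, search_dirs: List[str]) -> Optional[str]:
    """Crude stand-in for the loader: find the file a NEEDED value selects."""
    if "/" in needed:
        candidates = [os.path.join(cwd, needed)]  # cwd-dependent!
    else:
        candidates = [os.path.join(d, needed) for d in search_dirs]
    for candidate in candidates:
        if os.path.isfile(candidate):
            return os.path.normpath(candidate)
    return None


def demo() -> Tuple[str, str]:
    with tempfile.TemporaryDirectory() as root:
        system_dir = os.path.join(root, "usr", "lib")
        evil_cwd = os.path.join(root, "tmp", "evil")
        planted_dir = os.path.join(evil_cwd, "relative", "path", "to")
        os.makedirs(system_dir)
        os.makedirs(planted_dir)
        # The legitimate library, and a planted look-alike.
        open(os.path.join(system_dir, "libfoo.so"), "w").close()
        open(os.path.join(planted_dir, "libfoo.so"), "w").close()

        # No SONAME: NEEDED is the relative path used at link time.
        by_path = resolve("relative/path/to/libfoo.so", evil_cwd, [system_dir])
        # With a SONAME: NEEDED is a bare name, so the search rules apply.
        by_name = resolve("libfoo.so", evil_cwd, [system_dir])
        assert by_path is not None and by_name is not None
        return (os.path.relpath(by_path, root), os.path.relpath(by_name, root))


if __name__ == "__main__":
    print(demo())
```

With the same working directory in both calls, the path-style `NEEDED` picks up the planted file while the name-style one lands in the trusted directory.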