Project 2 - Shared Library Monitor

Due: Friday, November 15th, 11:59 p.m.

The software on most modern systems is constructed from shared libraries and dynamically loadable modules (DLLs). When DLLs are used, the traditional process of "linking" is deferred to runtime. That is, the binding of libraries does not occur until a program actually runs. Even then, the actual linking of symbols may be deferred until a symbol is actually used by the running program (lazy binding).

Very few programmers actually understand the process by which DLLs are bound and how symbols are resolved. Moreover, the situation is complicated by issues of system configuration, library versions, and so forth (a problem all too familiar to anyone who has experienced DLL problems on Windows or linking problems on Unix).

In this project you will create a tool that allows a programmer to inspect the shared library configuration of a process. Using this tool, you will be able to see how symbols were bound, which libraries were in use, library dependencies, unused library symbols, and more.

Project Overview

Your project will consist of two programs, ld.mon and ld.stat. The ld.mon program simply collects runtime library and linking information from the loader (ld.so.1). ld.stat is an interactive program that sorts through all of the collected linking information and presents data to the user in some kind of half-way intelligible manner. Information that might be of interest includes:

To use these programs, a programmer will do something like this:
% ld.mon netscape        (runs netscape, collects information)
% ld.stat                (runs the analyzer)
...
(rest of output up to you)

Obtaining linking information

Obtaining linking information is the easy part of this project. Set the LD_DEBUG and LD_DEBUG_OUTPUT environment variables, which are read by the run-time loader (ld.so.1). Setting LD_DEBUG forces the loader to dump massive amounts of debugging data. For example:
% env LD_DEBUG=bindings ls
24305: 
24305: configuration file=/var/ld/ld.config: [ ELF-DEFAULT-LIBPATH ]
24305: 
24305: binding file=ls to 0x0 (undefined weak): symbol `_ex_deregister'
24305: binding file=ls to 0x0 (undefined weak): symbol `_ex_register'
24305: binding file=ls to 0x0 (undefined weak): symbol `__1cH__CimplKcplus_fini6F_v_'
24305: binding file=ls to 0x0 (undefined weak): symbol `__1cH__CimplKcplus_init6F_v_'
24305: binding file=ls to file=/usr/lib/libc.so.1: symbol `__iob'
24305: binding file=ls to file=/usr/lib/libc.so.1: symbol `optind'
24305: binding file=ls to file=/usr/lib/libc.so.1: symbol `__ctype'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `__nl_langinfo_std'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `__strptime_std'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `__yday_to_month'
24305: binding file=/usr/lib/libc.so.1 to file=ls: symbol `_lib_version'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `_siguhandler'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `__huge_val'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `_daylight'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `_numeric'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `_timezone'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `_getdate_err'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `optarg'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `__wcrtomb_sb'
24305: binding file=/usr/lib/libc.so.1 to 0x0 (undefined weak): symbol `_ex_deregister'
24305: binding file=/usr/lib/libc.so.1 to file=ls: symbol `free'
24305: binding file=/usr/lib/libc.so.1 to file=ls: symbol `free'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `__mblen_sb'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `__wcsftime_std'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `__btowc_sb'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `__ctype_mask'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `__iswctype_std'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `__lc_numeric'
24305: binding file=/usr/lib/libc.so.1 to file=/usr/lib/libc.so.1: symbol `_sp'
...
A full list of LD_DEBUG commands can be obtained by typing this:
% env LD_DEBUG=help somecommand
To collect information, simply implement ld.mon as a shell script. For example:
#!/bin/sh
# ld.mon:
# Collect run-time link information and save it in a file named .ld.mon.<pid>
rm -f .ld.mon.*
# "$@" passes each argument through unchanged, even if it contains spaces
exec env LD_DEBUG=bindings,symbols,detail LD_DEBUG_OUTPUT=.ld.mon "$@"
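
Because the loader appends the process ID to the LD_DEBUG_OUTPUT name, a monitored program that starts other processes may leave more than one file behind. Here is a minimal Python sketch of how ld.stat might locate those files. The function name find_logs and its parameters are made up for illustration and are not part of the assignment; the sketch assumes the files are named .ld.mon.<pid>, as in the script above.

```python
import os


def find_logs(directory=".", prefix=".ld.mon"):
    """Return a dict mapping process ID -> path of its loader debug file.

    Assumes files are named <prefix>.<pid>, as written by the ld.mon script.
    Files whose suffix is not a number are ignored.
    """
    logs = {}
    for name in os.listdir(directory):
        if not name.startswith(prefix + "."):
            continue
        suffix = name[len(prefix) + 1:]
        if suffix.isdigit():
            logs[int(suffix)] = os.path.join(directory, name)
    return logs


if __name__ == "__main__":
    # List whatever debug files exist in the current directory.
    for pid, path in sorted(find_logs().items()):
        print(pid, path)
```

Keying the result on the process ID keeps the data for each process separate, which matters if you later want to report bindings per process.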

Note: raw symbol table information can be obtained from a library using the nm, dump, or elfdump commands. I'm not sure if you will need to use these (maybe).

Obtaining statistics

To implement ld.stat, you need to parse the debugging file generated by ld.mon and assemble that data into some kind of coherent form. At the very least, ld.stat should collect information about the following:

You may choose to collect other information as you see fit.

As for the functionality of ld.stat itself, your challenge is to create a program that presents the runtime "linking" configuration of the program as clearly as possible. You may want to write an interactive program that allows a user to query different aspects of the program (libraries, bound symbols, PLT entries, etc.). This could be done using a scripting interface. You could also write a GUI if you were so inclined. The key point is that you are taking rather complicated data and presenting it in a way that would make sense to a less experienced programmer.
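
To make the idea of "assembling" the data concrete, here is one possible sketch in Python. It assumes each binding has already been reduced to a (source file, target file, symbol) tuple, with None as the target for symbols bound to 0x0 (undefined weak). The sample tuples are taken from the ls output shown earlier. The function name summarize and the layout of its result are purely illustrative, not a required design.

```python
from collections import Counter, defaultdict

# (source file, target file or None, symbol), copied from the sample ls output.
SAMPLE = [
    ("ls", None, "_ex_deregister"),
    ("ls", "/usr/lib/libc.so.1", "__iob"),
    ("ls", "/usr/lib/libc.so.1", "optind"),
    ("/usr/lib/libc.so.1", "/usr/lib/libc.so.1", "_daylight"),
    ("/usr/lib/libc.so.1", "ls", "free"),
    ("/usr/lib/libc.so.1", "ls", "free"),
]


def summarize(bindings):
    """Group binding records into a few simple views a user could query."""
    files = set()                  # every object that appears in a binding
    edges = Counter()              # (source, target) -> number of bindings
    unresolved = set()             # symbols bound to 0x0 (undefined weak)
    providers = defaultdict(set)   # symbol -> objects that supplied it
    for source, target, symbol in bindings:
        files.add(source)
        if target is None:
            unresolved.add(symbol)
            continue
        files.add(target)
        edges[(source, target)] += 1
        providers[symbol].add(target)
    return {
        "files": sorted(files),
        "edges": edges,
        "unresolved": sorted(unresolved),
        "providers": {sym: sorted(objs) for sym, objs in providers.items()},
    }


if __name__ == "__main__":
    summary = summarize(SAMPLE)
    for (source, target), count in summary["edges"].most_common():
        print(f"{source} -> {target}: {count} binding(s)")
```

An interactive ld.stat could answer queries such as "who provides free?" directly from the providers table. Note that in the sample output, libc.so.1 binds free to ls itself, which is exactly the kind of surprise a good tool should make visible.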

Implementation hints

Because the project involves assembling information from varied sources (and file parsing), you are strongly encouraged to implement the project in Python, Perl, or some other scripting language. The easiest way to parse the ld.so.1 debugging file will probably be to split it into lines and use regular-expression parsing.
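
As a starting point, here is a hedged sketch of the line-splitting and regular-expression approach in Python. It assumes binding lines look exactly like the LD_DEBUG=bindings sample shown earlier; other LD_DEBUG options (such as detail) may add fields that this pattern does not handle. The names Binding, parse_binding, and parse_log are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# Matches lines such as:
#   24305: binding file=ls to file=/usr/lib/libc.so.1: symbol `__iob'
#   24305: binding file=ls to 0x0 (undefined weak): symbol `_ex_register'
BINDING_RE = re.compile(
    r"^(?P<pid>\d+):\s+binding file=(?P<src>\S+) to "
    r"(?:file=(?P<dst>\S+?)|0x0 \(undefined weak\)): "
    r"symbol `(?P<symbol>[^']+)'"
)


class Binding(NamedTuple):
    pid: int
    source: str
    target: Optional[str]  # None when the symbol was bound to 0x0
    symbol: str


def parse_binding(line: str) -> Optional[Binding]:
    """Parse one line of loader output; return None if it is not a binding."""
    m = BINDING_RE.match(line)
    if m is None:
        return None
    return Binding(int(m.group("pid")), m.group("src"),
                   m.group("dst"), m.group("symbol"))


def parse_log(text: str) -> List[Binding]:
    """Split the debugging output into lines and keep only the bindings."""
    parsed = (parse_binding(line) for line in text.splitlines())
    return [b for b in parsed if b is not None]
```

Lines that do not match (blank lines, the configuration file line, and so on) are simply skipped, so the parser can be extended one message type at a time.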

Grading

There is no right solution to this project. In fact, there are no tests, and no specific output requirements! Instead, this project will be graded on creativity and the overall utility of the tool you create. Obviously, a solution that does nothing whatsoever won't get many points.