documentation update for build-system changes

+script snippet on how to automatically fill the bochslibs/ directory

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1417 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
hsc
2012-07-03 16:11:15 +00:00
parent 7e9914d576
commit e94773937b
2 changed files with 78 additions and 63 deletions

@ -2,13 +2,11 @@
Additional libraries/packages/tools needed for Fail*:
=========================================================================================
Required anyway:
Required for Fail*:
**********************************************************************
- libprotobuf-dev
- libpthread
- libpcl1-dev
- libboost-dev
- libboost-all-dev (or at least libboost-thread-dev)
- libboost-thread-dev
- protobuf-compiler
- cmake
- cmake-curses-gui
@ -16,6 +14,11 @@ Required anyway:
obtained from http://www.aspectc.org; nightlies can be downloaded from
http://akut.aspectc.org
Required for the Bochs simulator backend:
**********************************************************************
- libpthread
- Probably more, depending on, e.g., the GUI you configure (X11 ->
libxrandr-dev)
For distribution/parallelization:
**********************************************************************
@ -25,9 +28,14 @@ For distribution/parallelization:
32-bit FailBochs on x86_64 Linux machines:
**********************************************************************
- libc6-i386 + all libraries listed by
$ ldd bochs|awk '{print $3}'
in ~/bochslibs (client.sh will add these to LD_LIBRARY_PATH)
- Create a "bochslibs" directory and fill it with all necessary libraries from
your build machine:
$ mkdir bochslibs
$ cp -v $(ldd fail-client|awk '{print $3}'|egrep -v '\(|lib(pthread|selinux|c.so.)|^$') bochslibs/
- Copy this directory to ~/bochslibs on all machines lacking these libraries
(this may also be the case for i386 machines you cannot install library
packages on yourself). client.sh will add ~/bochslibs to LD_LIBRARY_PATH if
it exists.
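The library collection above can be wrapped in a few shell functions. This is
only a sketch: the function names are hypothetical, and prepending (rather
than appending) to LD_LIBRARY_PATH is an assumption about what client.sh does.
The filter itself is the same awk/egrep pipeline as in the command above.

```shell
#!/bin/sh
# Read `ldd` output on stdin and print the library paths worth copying:
# drops unresolved entries, address-only columns, and the
# pthread/selinux/libc libraries.
filter_ldd_paths() {
    awk '{print $3}' | grep -E -v '\(|lib(pthread|selinux|c.so.)|^$'
}

# Hypothetical helper: copy the filtered libraries of a binary (e.g.
# fail-client) into a directory, "bochslibs" by default.
fill_libdir() {
    binary=$1
    libdir=${2:-bochslibs}
    mkdir -p "$libdir" || return 1
    ldd "$binary" | filter_ldd_paths | while IFS= read -r lib; do
        cp -v "$lib" "$libdir/" || return 1
    done
}

# Assumed behaviour of client.sh: put the directory on LD_LIBRARY_PATH
# only if it exists.
use_libdir() {
    libdir=${1:-$HOME/bochslibs}
    if [ -d "$libdir" ]; then
        LD_LIBRARY_PATH="$libdir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
        export LD_LIBRARY_PATH
    fi
}
```

Usage on the build machine would be "fill_libdir fail-client", followed by
copying bochslibs/ to the target hosts.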
=========================================================================================
Compiling, building and modifying: Simulators and Fail*
@ -44,18 +52,23 @@ For the first time:
$ find -name CMakeCache.txt | xargs rm
3. Create an out-of-source build directory (${BUILD_DIR}, see also "fail-structure.txt"):
$ mkdir build
Note that currently this build directory must be located somewhere below
the fail/ directory, as generated .ah files in there will not be included
in the compile process otherwise.
4. Enter out-of-source build directory. All generated files end up there.
$ cd build
5. Generate CMake environment.
$ cmake ..
6. Setup build configuration by opening the CMake configuration tool
$ ccmake .
Select "BUILD_BOCHS" or "BUILD_OVP". Select an experiment to enable by naming its
"experiments/" subdirectory under "EXPERIMENTS_ACTIVATED". Configure Fail* features
you need for this experiment by enabling "CONFIG_*" options. Press 'c', 'g' to
regenerate the build system. (Alternatively use
Select "BUILD_BOCHS" or "BUILD_OVP". Select an experiment to enable by
naming its "experiments/" subdirectory under "EXPERIMENTS_ACTIVATED".
Configure Fail* features you need for this experiment by enabling
"CONFIG_*" options. Press 'c', 'g' to regenerate the build system.
(Alternatively use
$ cmake-gui .
for a Qt GUI.) To enable a Debug build, choose "Debug" as the build type.
for a Qt GUI.) To enable a Debug build, choose "Debug" as the build type,
otherwise choose "Release".
7. Additionally make sure Bochs is at least configured (see below).
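For scripted setups, the ccmake selections from step 6 could probably be
passed to cmake directly with -D. The following is a sketch under
assumptions: the helper name is hypothetical, and it assumes that BUILD_BOCHS
is a boolean cache option accepting ON and that EXPERIMENTS_ACTIVATED can be
preset from the command line. Neither is confirmed by this document.

```shell
#!/bin/sh
# Print the -D arguments for a Bochs-based build of one experiment.
# $1: name of the experiment's "experiments/" subdirectory
# $2: build type, "Release" (default) or "Debug"
fail_cmake_args() {
    experiment=$1
    build_type=${2:-Release}
    printf '%s\n' \
        "-DBUILD_BOCHS=ON" \
        "-DEXPERIMENTS_ACTIVATED=$experiment" \
        "-DCMAKE_BUILD_TYPE=$build_type"
}

# Usage, from the fail/ checkout (steps 3 to 6 in one go):
#   mkdir -p build && cd build
#   cmake $(fail_cmake_args hsc-simple Debug) ..
```

Any CONFIG_* options the experiment needs would have to be added in the same
way, or enabled afterwards in ccmake.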
@ -64,13 +77,11 @@ After changes to Fail* code:
Prerequisite, if you're building with Bochs: configure Bochs (see below).
Compile (in ${BUILD_DIR}, optionally add "-jN" for parallel building):
$ make
CMake will build all Fail* libraries, merge them into a libfail.a and put it into
"${FAIL_DIR}/src". (As the current Bochs Makefile expects it there.) The static
library contains all core components and activated experiments/plugings.
You may use the shell script
CMake will build all Fail* libraries and link them with the simulator backend
library to a binary called "fail-client". You may use the shell script
$ ${FAIL_DIR}/scripts/rebuild-bochs.sh [-]
to speed up repetitive tasks regarding Fail/Bochs builds. This script contains a
concise documentation on itself.
to speed up repetitive tasks regarding Fail/Bochs builds. This script contains
a concise documentation on itself.
Add new Fail* sources to build chain:
@ -99,7 +110,7 @@ to be compiled previously:
$ cd src/core/doc/latex; make
Building Bochs:
Building FailBochs:
**********************************************************************
For the first time:
@ -122,20 +133,16 @@ For the first time:
FIXME: Remove more redundant flags/libraries
After changes to Bochs code or Bochs-affecting aspects:
After changes to Bochs code:
------------------------------------------------------------
- Compiling: The make call from the make-ag++.sh is now invokable by calling
(still in ${BUILD_DIR}, optionally adding -jN for parallel building):
$ cd ../build %% FIXME: does make bochs (in the build dir) really involve make-ag++.sh?
$ make bochs
(Of course, this requires a configured Bochs/Fail*.)
- Cleaning up: The former make all-clean is now invokable by
- Just re-run "make" in ${BUILD_DIR}, or call "scripts/rebuild-bochs.sh -".
The latter automatically runs "make install" after rebuilding fail-client
(and probably the experiment's campaign server).
- Cleaning up (forcing a complete rebuild of libfailbochs.a next time):
$ make bochsallclean
- Installing: For installing the bochs executable (former "make install")
$ make bochsinstall
(See "make help" for a target listing.)
- Note: You may use scripts/rebuild-bochs.sh to speed up repetitive tasks regarding
Fail/Bochs builds. This script contains a concise documentation on itself.
This is especially necessary if you changed a Bochs-affecting aspect header
(.ah), as the build system does not know about Bochs sources depending on
certain aspects.
Debug build:
@ -143,6 +150,8 @@ Debug build:
Configure Bochs to use debugging-related compiler flags (expects to be in ${BUILD_DIR}):
$ cd ../simulator/bochs
$ CFLAGS="-g -O0" CXXFLAGS="-g -O0" ./configure --prefix=... ... (see above)
You might additionally want to configure the rest of Fail* into debug mode by
setting CMAKE_BUILD_TYPE to "Debug" (ccmake, see above).
Profiling-based optimization build:

@ -38,13 +38,16 @@ based on the "${PREFIX}/share/doc/bochs/bochsrc-sample.txt" template (or
0xe9 to the console:
port_e9_hack: enabled=1
- Determinism: (Fail)Bochs is deterministic regarding timer interrupts,
i.e., two experiment runs after calling simulator.restore() will count the
same number of instructions between two interrupts. Though, you need to be
careful when running (Fail)Bochs with a GUI enabled: Typing "bochs -q<return>"
i.e., two experiment runs after calling simulator.restore() will count
the same number of instructions between two interrupts. Though, you
need to be careful when running (Fail)Bochs with a GUI enabled: Typing
fail-client -q<return>
on the command line may lead to the GUI window receiving a "return key
released" event, resulting in a keyboard interrupt for the guest system.
This can be avoided by starting Bochs with "sleep 1; bochs -q", or
disabling the GUI (see "headless experiments" above).
This can be avoided by starting Bochs with "sleep 1; fail-client -q", by
suppressing keyboard input (CONFIG_DISABLE_KEYB_INTERRUPTS setting in
the CMake configuration), or disabling the GUI (see "headless
experiments" above).
=========================================================================================
Example experiments and code snippets
@ -56,10 +59,10 @@ A simple standalone experiment (without a separate campaign). To compile this
experiment, the following steps are required:
1. Add "hsc-simple" to ccmake's EXPERIMENTS_ACTIVATED.
2. Enable CONFIG_EVENT_BREAKPOINTS, CONFIG_SR_RESTORE and CONFIG_SR_SAVE.
3. Build Fail* and Bochs, see "how-to-build.txt" for details-
3. Build Fail* and Bochs, see "how-to-build.txt" for details.
4. Enter experiment_targets/hscsimple/, bunzip2 -k *.bz2
5. Start the Bochs simulator by typing
$ bochs -q
$ fail-client -q
After successfully booting the eCos/hello world example, the console shows
"[HSC] breakpoint reached, saving", and a hello.state/ subdirectory appears.
You probably need to adjust the bochsrc's paths to romimage/vgaromimage.
@ -71,8 +74,8 @@ experiment, the following steps are required:
into "#if 0". Make an incremental build, e.g., by running
"${FAIL_DIR}/scripts/rebuild-bochs.sh -" from your ${BUILD_DIR}.
7. Back to ../experiment_targets/hscsimple/ (assuming you are in ${FAIL_DIR}),
run
$ bochs -q
again run
$ fail-client -q
After restoring the state, the hello world program's calculation should
yield a different result.
@ -88,13 +91,14 @@ experiment, the following steps are required:
../experiment_targets/coolchecksum/.
(If you want to enable COOL_FAULTSPACE_PRUNING, step #2 is mandatory because
it generates the instruction/memory access trace needed for pruning.)
2. Build the campaign server: make coolchecksum-server
2. Build the campaign server (if it wasn't already built automatically):
$ make coolchecksum-server
3. Run the campaign server: bin/coolchecksum-server
4. In another terminal, run step #3 of the experiment ("bochs -q").
4. In another terminal, run step #3 of the experiment ("fail-client -q").
Step #3 of the experiment currently runs 2000 experiment iterations and then
terminates, because Bochs has some memory leak issues. You need to re-run
Bochs for the next 2k experiments.
fail-client for the next 2k experiments.
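Re-running the client by hand can be avoided with a small loop. This is a
minimal sketch, not the actual client.sh: the function name is hypothetical,
and it assumes that fail-client exits with status 0 after a regular batch and
with a non-zero status on errors, which this document does not state.

```shell
#!/bin/sh
# Run a command up to $1 times, stopping early on the first non-zero
# exit status. Prints the number of successful runs.
run_repeatedly() {
    max_runs=$1
    shift
    runs=0
    while [ "$runs" -lt "$max_runs" ]; do
        "$@" || break
        runs=$((runs + 1))
    done
    echo "$runs"
}

# Usage: up to 10 batches of 2000 experiment iterations each
#   run_repeatedly 10 fail-client -q
```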
The experiments can be significantly sped up by
a) parallelization (run more FailBochs clients and
@ -104,9 +108,9 @@ The experiments can be significantly sped up by
Experiment "MHTestCampaign":
**********************************************************************
An example for separate campaign/experiment implementations.
1. Execute Campaign (job server): ${BUILD_DIR}/bin/MHTestCampaign-server
1. Execute campaign (job server): ${BUILD_DIR}/bin/MHTestCampaign-server
2. Run the FailBochs instance, in properly defined environment:
$ bochs -q
$ fail-client -q
=========================================================================================
Parallelization
@ -120,17 +124,16 @@ flows), inquired by the clients. As a consequence, the campaign is running on th
side and the experiment flows are running on the (distributed) clients.
First of all, the Fail* instances (and other required files, e.g. saved state) are
distributed to the clients. In the second step the campaign(-server) is started, preparing
it's parameter-sets in order to be able to answer the requests from the clients. (Once
there are available parameter-sets, the clients can request them.) In the final step,
its parameter sets in order to be able to answer the requests from the clients. (Once
there are available parameter sets, the clients can request them.) In the final step,
the distributed Fail* clients have to be started. As soon as this setup is finished,
the clients request new parameter-sets, execute their experiment code and return their
results to the server (aka campaign) in an iterative way, until all paremeter-sets have
been processed successfully. If all (new) parameter-sets have been distributed, the
campaign starts to resend unfinished parameter-sets to requesting clients in order to
the clients request new parameter sets, execute their experiment code and return their
results to the server (aka campaign) in an iterative way, until all parameter sets have
been processed successfully. If all (new) parameter sets have been distributed, the
campaign starts to re-send unfinished parameter sets to requesting clients in order to
speed up the overall campaign execution. Additionally, this ensures that all parameter
sets will produce a corresponding result set. (If, for example, a client terminates
abnormally, no result is send back. This scenario is managed by this "resend-mechanism"
of the campain, too.)
abnormally, no result is sent back. This scenario is dealt with by this mechanism, too.)
Shell scripts supporting experiment distribution:
@ -145,27 +148,30 @@ themselves, they contain some documentation):
clients on the experiment hosts.
- multiple-clients.sh: Is run on an experiment host by runcampaign.sh,
starts several instances of client.sh in a tmux session.
- client.sh: (Repeatedly) Runs a single FailBochs instance.
- client.sh: (Repeatedly) Runs a single fail-client instance.
Some useful things to note:
**********************************************************************
- Using the distribute-experiment.sh script causes the local bochs binary to
- Using the distribute-experiment.sh script causes the local fail-client binary to
be copied to the hosts. If the binary is not present in the current directory
the default bochs binary (-> $ which bochs) will be used. If you have modified
some of your experiment code (i.e., your bochs binary will change), don't
forget to delete the local bochs binary in order to distribute the *new* binary.
the default fail-client binary (-> $ which fail-client) will be used. If you
have modified some of your experiment code (i.e., your fail-client binary will
change), don't forget to delete the local fail-client binary in order to
distribute the *new* binary.
- The runcampaign.sh script prints some status information about the clients
recently started. In addition, there will be a few error messages concerning
ssh, tmux and so on. They can be ignored for now.
- The runcampaign.sh script starts the coolchecksum-server. Note that the server
instance will terminate immediatly (without notice), if there is still an
instance will terminate immediately (without notice), if there is still an
existing coolcampaign.csv file.
- In order to make the performance gains (mentioned above) take effect, a "workload
balancing" between the server and the clients is mandatory. This means that
the communication overhead (client <-> server) and the time, needed to execute
the communication overhead (client <-> server) and the time needed to execute
the experiment code on the client-side should be in due proportion. More
specifically, for each experiment there will be exactly 2 TCP connections
(send parameter-set to client, send result to server) established. Therefore
you should ensure that the execution time of the experiment is "long enough"
(heuristic). (See existing experiments for examples.)
(send parameter set to client, send result to server) established. Therefore
you should ensure that the jobs you distribute take enough time not to
overflow the server with requests. You may need to bundle parameters for
more than one experiment if a single experiment only takes a few hundred
milliseconds. (See existing experiments for examples.)
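To get a feeling for how many experiments to bundle into one job, a rough
calculation helps. The sketch below is illustrative only: the helper name and
the target job duration are assumptions, not values taken from Fail*.

```shell
#!/bin/sh
# Number of experiments to bundle so that one job takes at least
# target_job_ms, given the runtime of a single experiment (both in
# milliseconds, positive integers). Rounds up.
bundle_size() {
    per_experiment_ms=$1
    target_job_ms=$2
    echo $(( (target_job_ms + per_experiment_ms - 1) / per_experiment_ms ))
}

# Example: experiments of 300 ms each, jobs should last about 10 s
#   bundle_size 300 10000
```

With these example numbers, a client would request 34 parameter sets per job
instead of one, reducing the number of TCP connections accordingly.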