documentation update for build-system changes

+script snippet on how to automatically fill the bochslibs/ directory

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1417 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
hsc
2012-07-03 16:11:15 +00:00
parent 7e9914d576
commit e94773937b
2 changed files with 78 additions and 63 deletions

@ -2,13 +2,11 @@
Additional libraries/packages/tools needed for Fail*: Additional libraries/packages/tools needed for Fail*:
========================================================================================= =========================================================================================
Required anyway: Required for Fail*:
********************************************************************** **********************************************************************
- libprotobuf-dev - libprotobuf-dev
- libpthread
- libpcl1-dev - libpcl1-dev
- libboost-dev - libboost-thread-dev
- libboost-all-dev (or at least libboost-thread-dev)
- protobuf-compiler - protobuf-compiler
- cmake - cmake
- cmake-curses-gui - cmake-curses-gui
@ -16,6 +14,11 @@ Required anyway:
obtained from http://www.aspectc.org; nightlies can be downloaded from obtained from http://www.aspectc.org; nightlies can be downloaded from
http://akut.aspectc.org http://akut.aspectc.org
Required for the Bochs simulator backend:
**********************************************************************
- libpthread
- Probably more, depending on, e.g., the GUI you configure (X11 ->
libxrandr-dev)
For distribution/parallelization: For distribution/parallelization:
********************************************************************** **********************************************************************
@ -25,9 +28,14 @@ For distribution/parallelization:
32-bit FailBochs on x86_64 Linux machines: 32-bit FailBochs on x86_64 Linux machines:
********************************************************************** **********************************************************************
- libc6-i386 + all libraries listed by - Create a "bochslibs" directory and fill it with all necessary libraries from
$ ldd bochs|awk '{print $3}' your build machine:
in ~/bochslibs (client.sh will add these to LD_LIBRARY_PATH) $ mkdir bochslibs
$ cp -v $(ldd fail-client|awk '{print $3}'|egrep -v '\(|lib(pthread|selinux|c.so.)|^$') bochslibs/
- Copy this directory to ~/bochslibs on all machines lacking these libraries
(this may also be the case for i386 machines you cannot install library
packages on yourself). client.sh will add ~/bochslibs to LD_LIBRARY_PATH if
it exists.
========================================================================================= =========================================================================================
Compiling, building and modifying: Simulators and Fail* Compiling, building and modifying: Simulators and Fail*
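Note on the bochslibs hunk above: the filter pipeline can be wrapped in a small shell function, so it can be tried on sample ldd output before anything is copied. This is only a minimal sketch. The function name select_bochslibs is hypothetical and not part of Fail*, and grep -E stands in for egrep with the same pattern.

```shell
#!/bin/sh
# Hypothetical helper (not part of Fail*): reads `ldd` output on stdin and
# prints the library paths that would be copied into bochslibs/.
select_bochslibs() {
    # Field 3 of an ldd line is the resolved path ("name => path (addr)").
    # The pattern from the commit drops bare load addresses, empty fields,
    # and the libpthread, libselinux and libc entries.
    awk '{print $3}' | grep -E -v '\(|lib(pthread|selinux|c.so.)|^$'
}

# Intended use on the build machine (assumes fail-client is in the
# current directory):
#   mkdir -p bochslibs
#   cp -v $(ldd fail-client | select_bochslibs) bochslibs/
```

Keeping the filter in one function makes it easier to adjust the exclusion pattern if a target machine turns out to lack further libraries.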
@ -44,18 +52,23 @@ For the first time:
$ find -name CMakeCache.txt | xargs rm $ find -name CMakeCache.txt | xargs rm
3. Create out of source build directory (${BUILD_DIR}, see also "fail-structure.txt"): 3. Create out of source build directory (${BUILD_DIR}, see also "fail-structure.txt"):
$ mkdir build $ mkdir build
Note that currently this build directory must be located somewhere below
the fail/ directory, as generated .ah files in there will not be included
in the compile process otherwise.
4. Enter out-of-source build directory. All generated files end up there. 4. Enter out-of-source build directory. All generated files end up there.
$ cd build $ cd build
5. Generate CMake environment. 5. Generate CMake environment.
$ cmake .. $ cmake ..
6. Setup build configuration by opening the CMake configuration tool 6. Setup build configuration by opening the CMake configuration tool
$ ccmake . $ ccmake .
Select "BUILD_BOCHS" or "BUILD_OVP". Select an experiment to enable by naming its Select "BUILD_BOCHS" or "BUILD_OVP". Select an experiment to enable by
"experiments/" subdirectory under "EXPERIMENTS_ACTIVATED". Configure Fail* features naming its "experiments/" subdirectory under "EXPERIMENTS_ACTIVATED".
you need for this experiment by enabling "CONFIG_*" options. Press 'c', 'g' to Configure Fail* features you need for this experiment by enabling
regenerate the build system. (Alternatively use "CONFIG_*" options. Press 'c', 'g' to regenerate the build system.
(Alternatively use
$ cmake-gui . $ cmake-gui .
for a Qt GUI.) To enable a Debug build, choose "Debug" as the build type. for a Qt GUI.) To enable a Debug build, choose "Debug" as the build type,
otherwise choose "Release".
7. Additionally make sure Bochs is at least configured (see below). 7. Additionally make sure Bochs is at least configured (see below).
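Note on the first-time build steps above: the ccmake choices might also be passed on the cmake command line, which helps with scripted builds. The sketch below assumes that BUILD_BOCHS, EXPERIMENTS_ACTIVATED and CMAKE_BUILD_TYPE are ordinary CMake cache variables that accept -D settings. The helper name fail_cmake_args is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints cmake cache settings that correspond to the
# ccmake choices described in steps 5 and 6.
fail_cmake_args() {
    experiment=$1             # name of a subdirectory of experiments/
    build_type=${2:-Release}  # "Debug" or "Release"
    printf '%s\n' \
        "-DBUILD_BOCHS=ON" \
        "-DEXPERIMENTS_ACTIVATED=$experiment" \
        "-DCMAKE_BUILD_TYPE=$build_type"
}

# Intended use from the fail/ directory (assumption: the options behave
# like normal cache variables):
#   mkdir build && cd build
#   cmake $(fail_cmake_args hsc-simple Debug) ..
#   make
```

The CONFIG_* options an experiment needs would still have to be enabled, either in ccmake or with further -D settings.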
@ -64,13 +77,11 @@ After changes to Fail* code:
Prerequisite, if you're building with Bochs: configure Bochs (see below). Prerequisite, if you're building with Bochs: configure Bochs (see below).
Compile (in ${BUILD_DIR}, optionally "add -jN" for parallel building): Compile (in ${BUILD_DIR}, optionally "add -jN" for parallel building):
$ make $ make
CMake will build all Fail* libraries, merge them into a libfail.a and put it into CMake will build all Fail* libraries and link them with the simulator backend
"${FAIL_DIR}/src". (As the current Bochs Makefile expects it there.) The static library to a binary called "fail-client". You may use the shell script
library contains all core components and activated experiments/plugings.
You may use the shell script
$ ${FAIL_DIR}/scripts/rebuild-bochs.sh [-] $ ${FAIL_DIR}/scripts/rebuild-bochs.sh [-]
to speed up repetitive tasks regarding Fail/Bochs builds. This script contains a to speed up repetitive tasks regarding Fail/Bochs builds. This script contains
concise documentation on itself. a concise documentation on itself.
Add new Fail* sources to build chain: Add new Fail* sources to build chain:
@ -99,7 +110,7 @@ to be compiled previously:
$ cd src/core/doc/latex; make $ cd src/core/doc/latex; make
Building Bochs: Building FailBochs:
********************************************************************** **********************************************************************
For the first time: For the first time:
@ -122,20 +133,16 @@ For the first time:
FIXME: Remove more redundant flags/libraries FIXME: Remove more redundant flags/libraries
After changes to Bochs code or Bochs-affecting aspects: After changes to Bochs code:
------------------------------------------------------------ ------------------------------------------------------------
- Compiling: The make call from the make-ag++.sh is now invokable by calling - Just re-run "make" in ${BUILD_DIR}, or call "scripts/rebuild-bochs.sh -".
(still in ${BUILD_DIR}, optionally adding -jN for parallel building): The latter automatically runs "make install" after rebuilding fail-client
$ cd ../build %% FIXME: does make bochs (in the build dir) really involve make-ag++.sh? (and probably the experiment's campaign server).
$ make bochs - Cleaning up (forcing a complete rebuild of libfailbochs.a next time):
(Of course, this requires a configured Bochs/Fail*.)
- Cleaning up: The former make all-clean is now invokable by
$ make bochsallclean $ make bochsallclean
- Installing: For installing the bochs executable (former "make install") This is especially necessary if you changed a Bochs-affecting aspect header
$ make bochsinstall (.ah), as the build system does not know about Bochs sources depending on
(See "make help" for a target listing.) certain aspects.
- Note: You may use scripts/rebuild-bochs.sh to speed up repetitive tasks regarding
Fail/Bochs builds. This script contains a concise documentation on itself.
Debug build: Debug build:
@ -143,6 +150,8 @@ Debug build:
Configure Bochs to use debugging-related compiler flags (expects to be in ${BUILD_DIR}): Configure Bochs to use debugging-related compiler flags (expects to be in ${BUILD_DIR}):
$ cd ../simulator/bochs $ cd ../simulator/bochs
$ CFLAGS="-g -O0" CXXFLAGS="-g -O0" ./configure --prefix=... ... (see above) $ CFLAGS="-g -O0" CXXFLAGS="-g -O0" ./configure --prefix=... ... (see above)
You might additionally want to configure the rest of Fail* into debug mode by
setting CMAKE_BUILD_TYPE to "Debug" (ccmake, see above).
Profiling-based optimization build: Profiling-based optimization build:

@ -38,13 +38,16 @@ based on the "${PREFIX}/share/doc/bochs/bochsrc-sample.txt" template (or
0xe9 to the console: 0xe9 to the console:
port_e9_hack: enabled=1 port_e9_hack: enabled=1
- Determinism: (Fail)Bochs is deterministic regarding timer interrupts, - Determinism: (Fail)Bochs is deterministic regarding timer interrupts,
i.e., two experiment runs after calling simulator.restore() will count the i.e., two experiment runs after calling simulator.restore() will count
same number of instructions between two interrupts. Though, you need to be the same number of instructions between two interrupts. Though, you
careful when running (Fail)Bochs with a GUI enabled: Typing "bochs -q<return>" need to be careful when running (Fail)Bochs with a GUI enabled: Typing
fail-client -q<return>
on the command line may lead to the GUI window receiving a "return key on the command line may lead to the GUI window receiving a "return key
released" event, resulting in a keyboard interrupt for the guest system. released" event, resulting in a keyboard interrupt for the guest system.
This can be avoided by starting Bochs with "sleep 1; bochs -q", or This can be avoided by starting Bochs with "sleep 1; fail-client -q", by
disabling the GUI (see "headless experiments" above). suppressing keyboard input (CONFIG_DISABLE_KEYB_INTERRUPTS setting in
the CMake configuration), or disabling the GUI (see "headless
experiments" above).
========================================================================================= =========================================================================================
Example experiments and code snippets Example experiments and code snippets
@ -56,10 +59,10 @@ A simple standalone experiment (without a separate campaign). To compile this
experiment, the following steps are required: experiment, the following steps are required:
1. Add "hsc-simple" to ccmake's EXPERIMENTS_ACTIVATED. 1. Add "hsc-simple" to ccmake's EXPERIMENTS_ACTIVATED.
2. Enable CONFIG_EVENT_BREAKPOINTS, CONFIG_SR_RESTORE and CONFIG_SR_SAVE. 2. Enable CONFIG_EVENT_BREAKPOINTS, CONFIG_SR_RESTORE and CONFIG_SR_SAVE.
3. Build Fail* and Bochs, see "how-to-build.txt" for details- 3. Build Fail* and Bochs, see "how-to-build.txt" for details.
4. Enter experiment_targets/hscsimple/, bunzip2 -k *.bz2 4. Enter experiment_targets/hscsimple/, bunzip2 -k *.bz2
5. Start the Bochs simulator by typing 5. Start the Bochs simulator by typing
$ bochs -q $ fail-client -q
After successfully booting the eCos/hello world example, the console shows After successfully booting the eCos/hello world example, the console shows
"[HSC] breakpoint reached, saving", and a hello.state/ subdirectory appears. "[HSC] breakpoint reached, saving", and a hello.state/ subdirectory appears.
You probably need to adjust the bochsrc's paths to romimage/vgaromimage. You probably need to adjust the bochsrc's paths to romimage/vgaromimage.
@ -71,8 +74,8 @@ experiment, the following steps are required:
into "#if 0". Make an incremental build, e.g., by running into "#if 0". Make an incremental build, e.g., by running
"${FAIL_DIR}/scripts/rebuild-bochs.sh -" from your ${BUILD_DIR}. "${FAIL_DIR}/scripts/rebuild-bochs.sh -" from your ${BUILD_DIR}.
7. Back to ../experiment_targets/hscsimple/ (assuming you are in ${FAIL_DIR}), 7. Back to ../experiment_targets/hscsimple/ (assuming you are in ${FAIL_DIR}),
run again run
$ bochs -q $ fail-client -q
After restoring the state, the hello world program's calculation should After restoring the state, the hello world program's calculation should
yield a different result. yield a different result.
@ -88,13 +91,14 @@ experiment, the following steps are required:
../experiment_targets/coolchecksum/. ../experiment_targets/coolchecksum/.
(If you want to enable COOL_FAULTSPACE_PRUNING, step #2 is mandatory because (If you want to enable COOL_FAULTSPACE_PRUNING, step #2 is mandatory because
it generates the instruction/memory access trace needed for pruning.) it generates the instruction/memory access trace needed for pruning.)
2. Build the campaign server: make coolchecksum-server 2. Build the campaign server (if it wasn't already built automatically):
$ make coolchecksum-server
3. Run the campaign server: bin/coolchecksum-server 3. Run the campaign server: bin/coolchecksum-server
4. In another terminal, run step #3 of the experiment ("bochs -q"). 4. In another terminal, run step #3 of the experiment ("fail-client -q").
Step #3 of the experiment currently runs 2000 experiment iterations and then Step #3 of the experiment currently runs 2000 experiment iterations and then
terminates, because Bochs has some memory leak issues. You need to re-run terminates, because Bochs has some memory leak issues. You need to re-run
Bochs for the next 2k experiments. fail-client for the next 2k experiments.
The experiments can be significantly sped up by The experiments can be significantly sped up by
a) parallelization (run more FailBochs clients and a) parallelization (run more FailBochs clients and
@ -104,9 +108,9 @@ The experiments can be significantly sped up by
Experiment "MHTestCampaign": Experiment "MHTestCampaign":
********************************************************************** **********************************************************************
An example for separate campaign/experiment implementations. An example for separate campaign/experiment implementations.
1. Execute Campaign (job server): ${BUILD_DIR}/bin/MHTestCampaign-server 1. Execute campaign (job server): ${BUILD_DIR}/bin/MHTestCampaign-server
2. Run the FailBochs instance, in properly defined environment: 2. Run the FailBochs instance, in properly defined environment:
$ bochs -q $ fail-client -q
========================================================================================= =========================================================================================
Parallelization Parallelization
@ -120,17 +124,16 @@ flows), inquired by the clients. As a consequence, the campaign is running on th
side and the experiment flow are running on the (distributed) clients. side and the experiment flow are running on the (distributed) clients.
First of all, the Fail* instances (and other required files, e.g. saved state) are First of all, the Fail* instances (and other required files, e.g. saved state) are
distributed to the clients. In the second step the campaign(-server) is started, preparing distributed to the clients. In the second step the campaign(-server) is started, preparing
it's parameter-sets in order to be able to answer the requests from the clients. (Once its parameter sets in order to be able to answer the requests from the clients. (Once
there are available parameter-sets, the clients can request them.) In the final step, there are available parameter sets, the clients can request them.) In the final step,
the distributed Fail* clients have to be started. As soon as this setup is finished, the distributed Fail* clients have to be started. As soon as this setup is finished,
the clients request new parameter-sets, execute their experiment code and return their the clients request new parameter sets, execute their experiment code and return their
results to the server (aka campaign) in an iterative way, until all parameter-sets have results to the server (aka campaign) in an iterative way, until all parameter sets have
been processed successfully. If all (new) parameter-sets have been distributed, the been processed successfully. If all (new) parameter sets have been distributed, the
campaign starts to resend unfinished parameter-sets to requesting clients in order to campaign starts to re-send unfinished parameter sets to requesting clients in order to
speed up the overall campaign execution. Additionally, this ensures that all parameter speed up the overall campaign execution. Additionally, this ensures that all parameter
sets will produce a corresponding result set. (If, for example, a client terminates sets will produce a corresponding result set. (If, for example, a client terminates
abnormally, no result is send back. This scenario is managed by this "resend-mechanism" abnormally, no result is sent back. This scenario is dealt with by this mechanism, too.)
of the campain, too.)
Shell scripts supporting experiment distribution: Shell scripts supporting experiment distribution:
@ -145,27 +148,30 @@ themselves, they contain some documentation):
clients on the experiment hosts. clients on the experiment hosts.
- multiple-clients.sh: Is run on an experiment host by runcampaign.sh, - multiple-clients.sh: Is run on an experiment host by runcampaign.sh,
starts several instances of client.sh in a tmux session. starts several instances of client.sh in a tmux session.
- client.sh: (Repeatedly) Runs a single FailBochs instance. - client.sh: (Repeatedly) Runs a single fail-client instance.
Some useful things to note: Some useful things to note:
********************************************************************** **********************************************************************
- Using the distribute-experiment.sh script causes the local bochs binary to - Using the distribute-experiment.sh script causes the local fail-client binary to
be copied to the hosts. If the binary is not present in the current directory be copied to the hosts. If the binary is not present in the current directory
the default bochs binary (-> $ which bochs) will be used. If you have modified the default fail-client binary (-> $ which fail-client) will be used. If you
some of your experiment code (i.e., your bochs binary will change), don't have modified some of your experiment code (i.e., your fail-client binary will
forget to delete the local bochs binary in order to distribute the *new* binary. change), don't forget to delete the local fail-client binary in order to
distribute the *new* binary.
- The runcampaign.sh script prints some status information about the clients - The runcampaign.sh script prints some status information about the clients
recently started. In addition, there will be a few error messages concerning recently started. In addition, there will be a few error messages concerning
ssh, tmux and so on. They can be ignored for now. ssh, tmux and so on. They can be ignored for now.
- The runcampaign.sh script starts the coolchecksum-server. Note that the server - The runcampaign.sh script starts the coolchecksum-server. Note that the server
instance will terminate immediatly (without notice), if there is still an instance will terminate immediately (without notice), if there is still an
existing coolcampaign.csv file. existing coolcampaign.csv file.
- In order to make the performance gains (mentioned above) take effect, a "workload - In order to make the performance gains (mentioned above) take effect, a "workload
balancing" between the server and the clients is mandatory. This means that balancing" between the server and the clients is mandatory. This means that
the communication overhead (client <-> server) and the time, needed to execute the communication overhead (client <-> server) and the time needed to execute
the experiment code on the client-side should be in due proportion. More the experiment code on the client-side should be in due proportion. More
specifically, for each experiment there will be exactly 2 TCP connections specifically, for each experiment there will be exactly 2 TCP connections
(send parameter-set to client, send result to server) established. Therefore (send parameter set to client, send result to server) established. Therefore
you should ensure that the execution time of the experiment is "long enough" you should ensure that the jobs you distribute take enough time not to
(heuristic). (See existing experiments for examples.) overflow the server with requests. You may need to bundle parameters for
more than one experiment if a single experiment only takes a few hundred
milliseconds. (See existing experiments for examples.)
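
Note on the workload-balancing advice above: the bundle size can be estimated with a ceiling division. The sketch below is illustrative only. The helper name bundle_size and the numbers (a 5 s target job duration, 300 ms per experiment) are hypothetical and not taken from Fail*.

```shell
#!/bin/sh
# Hypothetical helper: number of experiments to bundle into one job so that
# the job runs for at least target_ms, given per_experiment_ms each.
bundle_size() {
    target_ms=$1
    per_experiment_ms=$2
    # Integer ceiling division: round up so the job is never shorter
    # than the target.
    echo $(( (target_ms + per_experiment_ms - 1) / per_experiment_ms ))
}

bundle_size 5000 300   # prints 17
```

With 17 experiments per job, the two TCP connections are paid once per job instead of once per 300 ms experiment.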