documentation update for build-system changes
+script snippet on how to automatically fill the bochslibs/ directory

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1417 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
@@ -2,13 +2,11 @@
Additional libraries/packages/tools needed for Fail*:
=========================================================================================

Required anyway:
Required for Fail*:
**********************************************************************
- libprotobuf-dev
- libpthread
- libpcl1-dev
- libboost-dev
- libboost-all-dev (or at least libboost-thread-dev)
- libboost-thread-dev
- protobuf-compiler
- cmake
- cmake-curses-gui
@@ -16,6 +14,11 @@ Required anyway:
obtained from http://www.aspectc.org; nightlies can be downloaded from
http://akut.aspectc.org

Required for the Bochs simulator backend:
**********************************************************************
- libpthread
- Probably more, depending on, e.g., the GUI you configure (X11 ->
libxrandr-dev)

For distribution/parallelization:
**********************************************************************
@@ -25,9 +28,14 @@ For distribution/parallelization:

32-bit FailBochs on x86_64 Linux machines:
**********************************************************************
- libc6-i386 + all libraries listed by
$ ldd bochs|awk '{print $3}'
in ~/bochslibs (client.sh will add these to LD_LIBRARY_PATH)
- Create a "bochslibs" directory and fill it with all necessary libraries from
your build machine:
$ mkdir bochslibs
$ cp -v $(ldd fail-client|awk '{print $3}'|egrep -v '\(|lib(pthread|selinux|c.so.)|^$') bochslibs/
- Copy this directory to ~/bochslibs on all machines lacking these libraries
(this may also be the case for i386 machines you cannot install library
packages on yourself). client.sh will add ~/bochslibs to LD_LIBRARY_PATH if
it exists.
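
The library filter used in the cp command above can be tried out in isolation.
The following is a minimal sketch, assuming a POSIX shell; the function name
filter_ldd_paths is hypothetical and not part of Fail*. It applies the same
pattern, but through "grep -E", which newer GNU grep versions prefer over egrep.

```shell
# filter_ldd_paths (hypothetical helper): read `ldd` output on stdin and print
# only the library paths that should be copied into bochslibs/.
filter_ldd_paths() {
    # Field 3 of an ldd line is the resolved path ("name => /path (addr)").
    # Drop address-only fields, libpthread/libselinux/libc, and empty fields.
    awk '{print $3}' | grep -E -v '\(|lib(pthread|selinux|c.so.)|^$'
}

# Usage, assuming fail-client is in the current directory:
#   mkdir -p bochslibs
#   cp -v $(ldd fail-client | filter_ldd_paths) bochslibs/
```

Feeding saved ldd output into the function lets you check which libraries
would be copied before touching any target machine.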

=========================================================================================
Compiling, building and modifying: Simulators and Fail*
@@ -44,18 +52,23 @@ For the first time:
$ find -name CMakeCache.txt | xargs rm
3. Create out-of-source build directory (${BUILD_DIR}, see also "fail-structure.txt"):
$ mkdir build
Note that currently this build directory must be located somewhere below
the fail/ directory, as generated .ah files in there will not be included
in the compile process otherwise.
4. Enter out-of-source build directory. All generated files end up there.
$ cd build
5. Generate CMake environment.
$ cmake ..
6. Set up build configuration by opening the CMake configuration tool
$ ccmake .
Select "BUILD_BOCHS" or "BUILD_OVP". Select an experiment to enable by naming its
"experiments/" subdirectory under "EXPERIMENTS_ACTIVATED". Configure Fail* features
you need for this experiment by enabling "CONFIG_*" options. Press 'c', 'g' to
regenerate the build system. (Alternatively use
Select "BUILD_BOCHS" or "BUILD_OVP". Select an experiment to enable by
naming its "experiments/" subdirectory under "EXPERIMENTS_ACTIVATED".
Configure Fail* features you need for this experiment by enabling
"CONFIG_*" options. Press 'c', 'g' to regenerate the build system.
(Alternatively use
$ cmake-gui .
for a Qt GUI.) To enable a Debug build, choose "Debug" as the build type.
for a Qt GUI.) To enable a Debug build, choose "Debug" as the build type,
otherwise choose "Release".
7. Additionally make sure Bochs is at least configured (see below).
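
Steps 5 and 6 can probably also be done without the interactive tool, because
cmake accepts cache entries on the command line via -D. The sketch below
assumes that BUILD_BOCHS, EXPERIMENTS_ACTIVATED and the CONFIG_* options are
ordinary cache variables; the helper name fail_cmake_args is hypothetical.

```shell
# fail_cmake_args (hypothetical helper): print one -D option per line for a
# Bochs-based build. $1 = experiment name, $2 = build type (default: Release).
# Assumption: the options shown in ccmake can be preset with -D.
fail_cmake_args() {
    printf '%s\n' \
        "-DBUILD_BOCHS=ON" \
        "-DEXPERIMENTS_ACTIVATED=$1" \
        "-DCMAKE_BUILD_TYPE=${2:-Release}"
}

# Usage from inside ${BUILD_DIR} (word splitting of the output is intended):
#   cmake $(fail_cmake_args hsc-simple Debug) ..
```

Any CONFIG_* options an experiment needs would have to be appended in the same
way; checking the result once with "ccmake ." is still advisable.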


@@ -64,13 +77,11 @@ After changes to Fail* code:
Prerequisite, if you're building with Bochs: configure Bochs (see below).
Compile (in ${BUILD_DIR}, optionally add "-jN" for parallel building):
$ make
CMake will build all Fail* libraries, merge them into a libfail.a and put it into
"${FAIL_DIR}/src". (As the current Bochs Makefile expects it there.) The static
library contains all core components and activated experiments/plugings.
You may use the shell script
CMake will build all Fail* libraries and link them with the simulator backend
library to a binary called "fail-client". You may use the shell script
$ ${FAIL_DIR}/scripts/rebuild-bochs.sh [-]
to speed up repetitive tasks regarding Fail/Bochs builds. This script contains a
concise documentation on itself.
to speed up repetitive tasks regarding Fail/Bochs builds. This script contains
a concise documentation on itself.
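
For the "-jN" option mentioned above, N is usually chosen close to the number
of available CPU cores. A small sketch, assuming nproc (GNU coreutils) or
getconf is available; the helper name parallel_jobs is hypothetical:

```shell
# parallel_jobs (hypothetical helper): print a job count for "make -jN".
# Tries nproc first, then getconf, and falls back to a single job.
parallel_jobs() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Usage from inside ${BUILD_DIR}:
#   make -j"$(parallel_jobs)"
```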


Add new Fail* sources to build chain:
@@ -99,7 +110,7 @@ to be compiled previously:
$ cd src/core/doc/latex; make


Building Bochs:
Building FailBochs:
**********************************************************************

For the first time:
@@ -122,20 +133,16 @@ For the first time:
FIXME: Remove more redundant flags/libraries


After changes to Bochs code or Bochs-affecting aspects:
After changes to Bochs code:
------------------------------------------------------------
- Compiling: The make call from the make-ag++.sh is now invokable by calling
(still in ${BUILD_DIR}, optionally adding -jN for parallel building):
$ cd ../build %% FIXME: does "make bochs" (in the build directory) really involve make-ag++.sh?
$ make bochs
(Of course, this requires a configured Bochs/Fail*.)
- Cleaning up: The former make all-clean is now invokable by
- Just re-run "make" in ${BUILD_DIR}, or call "scripts/rebuild-bochs.sh -".
The latter automatically runs "make install" after rebuilding fail-client
(and probably the experiment's campaign server).
- Cleaning up (forcing a complete rebuild of libfailbochs.a next time):
$ make bochsallclean
- Installing: For installing the bochs executable (former "make install")
$ make bochsinstall
(See "make help" for a target listing.)
- Note: You may use scripts/rebuild-bochs.sh to speed up repetitive tasks regarding
Fail/Bochs builds. This script contains a concise documentation on itself.
This is especially necessary if you changed a Bochs-affecting aspect header
(.ah), as the build system does not know about Bochs sources depending on
certain aspects.

Debug build:
@@ -143,6 +150,8 @@ Debug build:
Configure Bochs to use debugging-related compiler flags (expects to be in ${BUILD_DIR}):
$ cd ../simulator/bochs
$ CFLAGS="-g -O0" CXXFLAGS="-g -O0" ./configure --prefix=... ... (see above)
You might additionally want to configure the rest of Fail* into debug mode by
setting CMAKE_BUILD_TYPE to "Debug" (ccmake, see above).


Profiling-based optimization build:

@@ -38,13 +38,16 @@ based on the "${PREFIX}/share/doc/bochs/bochsrc-sample.txt" template (or
0xe9 to the console:
port_e9_hack: enabled=1
- Determinism: (Fail)Bochs is deterministic regarding timer interrupts,
i.e., two experiment runs after calling simulator.restore() will count the
same number of instructions between two interrupts. Though, you need to be
careful when running (Fail)Bochs with a GUI enabled: Typing "bochs -q<return>"
i.e., two experiment runs after calling simulator.restore() will count
the same number of instructions between two interrupts. Though, you
need to be careful when running (Fail)Bochs with a GUI enabled: Typing
fail-client -q<return>
on the command line may lead to the GUI window receiving a "return key
released" event, resulting in a keyboard interrupt for the guest system.
This can be avoided by starting Bochs with "sleep 1; bochs -q", or
disabling the GUI (see "headless experiments" above).
This can be avoided by starting Bochs with "sleep 1; fail-client -q", by
suppressing keyboard input (CONFIG_DISABLE_KEYB_INTERRUPTS setting in
the CMake configuration), or disabling the GUI (see "headless
experiments" above).
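
The "sleep 1" workaround can be wrapped in a tiny helper so that it is not
forgotten when starting the client by hand. This is only a sketch; the name
run_delayed is hypothetical and the one-second delay is the value from the
text above, not a tuned constant.

```shell
# run_delayed (hypothetical helper): wait DELAY seconds, then run the command.
# The pause gives the "return key released" event time to arrive before the
# GUI window exists.
run_delayed() {
    delay=$1
    shift
    sleep "$delay"
    "$@"
}

# Usage:
#   run_delayed 1 fail-client -q
```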

=========================================================================================
Example experiments and code snippets
@@ -56,10 +59,10 @@ A simple standalone experiment (without a separate campaign). To compile this
experiment, the following steps are required:
1. Add "hsc-simple" to ccmake's EXPERIMENTS_ACTIVATED.
2. Enable CONFIG_EVENT_BREAKPOINTS, CONFIG_SR_RESTORE and CONFIG_SR_SAVE.
3. Build Fail* and Bochs, see "how-to-build.txt" for details-
3. Build Fail* and Bochs, see "how-to-build.txt" for details.
4. Enter experiment_targets/hscsimple/, bunzip2 -k *.bz2
5. Start the Bochs simulator by typing
$ bochs -q
$ fail-client -q
After successfully booting the eCos/hello world example, the console shows
"[HSC] breakpoint reached, saving", and a hello.state/ subdirectory appears.
You probably need to adjust the bochsrc's paths to romimage/vgaromimage.
@@ -71,8 +74,8 @@ experiment, the following steps are required:
into "#if 0". Make an incremental build, e.g., by running
"${FAIL_DIR}/scripts/rebuild-bochs.sh -" from your ${BUILD_DIR}.
7. Back to ../experiment_targets/hscsimple/ (assuming you are in ${FAIL_DIR}),
run
$ bochs -q
again run
$ fail-client -q
After restoring the state, the hello world program's calculation should
yield a different result.

@@ -88,13 +91,14 @@ experiment, the following steps are required:
../experiment_targets/coolchecksum/.
(If you want to enable COOL_FAULTSPACE_PRUNING, step #2 is mandatory because
it generates the instruction/memory access trace needed for pruning.)
2. Build the campaign server: make coolchecksum-server
2. Build the campaign server (if it wasn't already built automatically):
$ make coolchecksum-server
3. Run the campaign server: bin/coolchecksum-server
4. In another terminal, run step #3 of the experiment ("bochs -q").
4. In another terminal, run step #3 of the experiment ("fail-client -q").

Step #3 of the experiment currently runs 2000 experiment iterations and then
terminates, because Bochs has some memory leak issues. You need to re-run
Bochs for the next 2k experiments.
fail-client for the next 2k experiments.
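
Re-running the client for each batch of 2000 experiments can be automated with
a short loop. The sketch below is not taken from client.sh; the helper name
rerun is hypothetical, and it assumes that fail-client exits with status 0
after a batch that ended normally.

```shell
# rerun (hypothetical helper): run a command up to COUNT times in a row and
# stop at the first non-zero exit status.
# Assumption: a batch that ended normally makes the command exit with 0.
rerun() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" || return $?
        i=$((i + 1))
    done
}

# Usage: five batches of 2000 experiments each
#   rerun 5 fail-client -q
```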

The experiments can be significantly sped up by
a) parallelization (run more FailBochs clients and
@@ -104,9 +108,9 @@ The experiments can be significantly sped up by
Experiment "MHTestCampaign":
**********************************************************************
An example of separate campaign/experiment implementations.
1. Execute Campaign (job server): ${BUILD_DIR}/bin/MHTestCampaign-server
1. Execute campaign (job server): ${BUILD_DIR}/bin/MHTestCampaign-server
2. Run the FailBochs instance, in a properly defined environment:
$ bochs -q
$ fail-client -q

=========================================================================================
Parallelization
@@ -120,17 +124,16 @@ flows), inquired by the clients. As a consequence, the campaign is running on th
side and the experiment flows are running on the (distributed) clients.
First of all, the Fail* instances (and other required files, e.g. saved state) are
distributed to the clients. In the second step the campaign(-server) is started, preparing
it's parameter-sets in order to be able to answer the requests from the clients. (Once
there are available parameter-sets, the clients can request them.) In the final step,
its parameter sets in order to be able to answer the requests from the clients. (Once
there are available parameter sets, the clients can request them.) In the final step,
the distributed Fail* clients have to be started. As soon as this setup is finished,
the clients request new parameter-sets, execute their experiment code and return their
results to the server (aka campaign) in an iterative way, until all paremeter-sets have
been processed successfully. If all (new) parameter-sets have been distributed, the
campaign starts to resend unfinished parameter-sets to requesting clients in order to
the clients request new parameter sets, execute their experiment code and return their
results to the server (aka campaign) in an iterative way, until all parameter sets have
been processed successfully. If all (new) parameter sets have been distributed, the
campaign starts to re-send unfinished parameter sets to requesting clients in order to
speed up the overall campaign execution. Additionally, this ensures that all parameter
sets will produce a corresponding result set. (If, for example, a client terminates
abnormally, no result is send back. This scenario is managed by this "resend-mechanism"
of the campain, too.)
abnormally, no result is sent back. This scenario is dealt with by this mechanism, too.)
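
The hand-out policy described above (fresh parameter sets first, then
re-sending unfinished ones) can be illustrated with a toy model. This is a
sketch only and not Fail*'s campaign implementation: the function names
next_job and finish_job and the two state files are hypothetical.

```shell
# Toy model of the hand-out policy, NOT Fail*'s implementation.
# State lives in two plain-text files with one parameter set per line:
#   pending  = sets not handed out yet
#   inflight = sets handed out for which no result has arrived

# next_job PENDING INFLIGHT: print the set to give to a requesting client.
next_job() {
    pending=$1
    inflight=$2
    if [ -s "$pending" ]; then
        # Hand out a fresh set and remember it as unfinished.
        job=$(head -n 1 "$pending")
        tail -n +2 "$pending" > "$pending.tmp"
        mv "$pending.tmp" "$pending"
        printf '%s\n' "$job" >> "$inflight"
    elif [ -s "$inflight" ]; then
        # Nothing new left: re-send the oldest unfinished set and rotate it
        # to the end so that the next request gets a different one.
        job=$(head -n 1 "$inflight")
        { tail -n +2 "$inflight"; printf '%s\n' "$job"; } > "$inflight.tmp"
        mv "$inflight.tmp" "$inflight"
    else
        return 1    # every set has a result: the campaign is complete
    fi
    printf '%s\n' "$job"
}

# finish_job INFLIGHT SET: record that a result for SET arrived.
finish_job() {
    # grep exits non-zero when no line remains, which is fine here.
    grep -v -x -F -e "$2" "$1" > "$1.tmp" || true
    mv "$1.tmp" "$1"
}
```

A set that was handed out twice simply disappears from the in-flight file with
the first result; a late second result changes nothing, which mirrors the idea
that every parameter set ends up with a result even if a client dies.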


Shell scripts supporting experiment distribution:
@@ -145,27 +148,30 @@ themselves, they contain some documentation):
clients on the experiment hosts.
- multiple-clients.sh: Is run on an experiment host by runcampaign.sh,
starts several instances of client.sh in a tmux session.
- client.sh: (Repeatedly) Runs a single FailBochs instance.
- client.sh: (Repeatedly) Runs a single fail-client instance.


Some useful things to note:
**********************************************************************
- Using the distribute-experiment.sh script causes the local bochs binary to
- Using the distribute-experiment.sh script causes the local fail-client binary to
be copied to the hosts. If the binary is not present in the current directory
the default bochs binary (-> $ which bochs) will be used. If you have modified
some of your experiment code (i.e., your bochs binary will change), don't
forget to delete the local bochs binary in order to distribute the *new* binary.
the default fail-client binary (-> $ which fail-client) will be used. If you
have modified some of your experiment code (i.e., your fail-client binary will
change), don't forget to delete the local fail-client binary in order to
distribute the *new* binary.
- The runcampaign.sh script prints some status information about the clients
recently started. In addition, there will be a few error messages concerning
ssh, tmux and so on. They can be ignored for now.
- The runcampaign.sh script starts the coolchecksum-server. Note that the server
instance will terminate immediatly (without notice), if there is still an
instance will terminate immediately (without notice), if there is still an
existing coolcampaign.csv file.
- In order to make the performance gains (mentioned above) take effect, a "workload
balancing" between the server and the clients is mandatory. This means that
the communication overhead (client <-> server) and the time, needed to execute
the communication overhead (client <-> server) and the time needed to execute
the experiment code on the client-side should be in due proportion. More
specifically, for each experiment there will be exactly 2 TCP connections
(send parameter-set to client, send result to server) established. Therefore
you should ensure that the execution time of the experiment is "long enough"
(heuristic). (See existing experiments for examples.)
(send parameter set to client, send result to server) established. Therefore
you should ensure that the jobs you distribute take enough time not to
overwhelm the server with requests. You may need to bundle parameters for
more than one experiment if a single experiment only takes a few hundred
milliseconds. (See existing experiments for examples.)
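
Because the server exits without notice when an old coolcampaign.csv is still
around, it can help to move such a file aside before starting a campaign. The
following is a sketch under that assumption; the helper name
archive_stale_results and the backup naming scheme are hypothetical and not
part of runcampaign.sh.

```shell
# archive_stale_results (hypothetical helper): move an existing result file
# out of the way so that the campaign server does not exit immediately.
# $1 = result file (default: coolcampaign.csv in the current directory).
archive_stale_results() {
    file=${1:-coolcampaign.csv}
    if [ -e "$file" ]; then
        # Keep the old results under a timestamped name instead of deleting.
        mv "$file" "$file.$(date +%Y%m%d-%H%M%S).bak"
    fi
}

# Usage, in the directory the server is started from:
#   archive_stale_results && bin/coolchecksum-server
```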