Commit Graph

78 Commits

Author SHA1 Message Date
0a5e54e9aa ecos_kernel_test experiment bugix: don't resume if 'experiment reached finish() before FI'
Change-Id: Id0bb9400b8aa28307ed385a8c32b91b17254ba1c
2014-01-15 12:52:44 +01:00
f33789b1ac Merge branch 'find-mysql' 2013-09-04 13:09:48 +02:00
6d4dfeb913 shutdown cleanups revisited
This change became necessary as we observed weird fail-client SIGSEGV
crashes with both Bochs and Gem5 backends and different experiments.

Some Fail* components are instantiated statically: the
SimulatorController instance "simulator", containing the
ListenerManager and the CoroutineManager, and the active
ExperimentFlow subclass(es)
(experiments/instantiate-experiment*.ah.in).  The experiment(s) is
registered as an active flow in the CoroutineManager at startup.

As plugins (which are ExperimentFlows themselves) are often created on
an experiment's stack, ExperimentFlows deregister themselves on
destruction (e.g., when leaving the plugin variable's scope).  The
core problem is, that the creation and destruction order of statically
instantiated objects depends on the link order; if the experiment is
destroyed after the CoroutineManager, its automatic self-deregistering
feature talks to the smoking ruins of the latter.

This change removes all static instantiations of ExperimentFlow and
replaces them with constructions on the heap.  Additionally it makes
sure that the CoroutineManager recognizes that a shutdown is in
progress, and refrains from touching potentially already destroyed
data structures when a (mistakenly globally instantiated)
ExperimentFlow deregisters in this case.

Change-Id: I8a7d42fb141222cd2cce6040ab1a01f9de61be24
2013-09-04 10:13:48 +02:00
1ad99c9630 cmake: find MySQL client lib
This change introduces a CMake-style FindMySQL.cmake properly looking for
libmysqlclient_r with mysql_config.  This also fixes linking on some
machines.

Change-Id: Ifdbfdc3c7440dead37a8b63aaa86732d636aa0e2
2013-08-29 19:21:31 +02:00
aab117e5a8 Merge branch 'master' of ssh://vamos.informatik.uni-erlangen.de:29418/fail 2013-07-22 14:05:19 +02:00
eccbc61b1d ecos kernel test: memory access listeners' ranges (high/low) may be equal
Change-Id: I02f53d9d698a56c606ef354a37d7a3c467ec8127
2013-07-22 14:03:20 +02:00
8622c1de12 db: explicitly use MyISAM engine
InnoDB is the default on some setups.

Change-Id: I5cc59854cb88cbec0e7bb7f6aab946252d0bd8e5
2013-07-11 10:38:53 +02:00
955f89b3eb ecos: make LOCAL builds compile again
Change-Id: Icd992aa20443426bbcaa507c39453d6ecb9174c0
2013-07-03 13:46:55 +02:00
01e7f8c8a1 ecos: bugfix: cyg_test_output may be called before FI
This is a pretty old bug that unfortunately affects both DSN 2013 and
SOBRES 2013 results.

Change-Id: I64a2790a4d55515a23a34d108be99646d5dd345d
2013-07-03 13:46:55 +02:00
66ecedd864 ecos: removed configuration checks
Change-Id: I5df0b5f3435f3f695ba1ab9ca624dfa238509e46
2013-07-03 13:46:55 +02:00
27f1f77524 ecos: DB-related campaign changes
- Don't enforce the join order, MariaDB usually gets this right.
- Update DB statistics before terminating.

Change-Id: If7bbbe146321430d199811062d05b3c179c5732f
2013-07-03 13:46:55 +02:00
f3c36e70ef ecos: stack-protection eval
Change-Id: I576c2ef3834f61bb9017af37541afc7639672782
2013-05-29 16:43:46 +02:00
091e8dcae0 ecos: baseline assessment integrated into main experiment
Change-Id: Iaf2a31c917b6ddd50568e5fb784ab8457193ee7d
2013-04-29 14:15:53 +02:00
403886e541 ecos: minor changes, cleanup
- Count experiments, not jobs
 - Debug output

Change-Id: Ide5e1219cdcc8112d1a0d4e7367beca2dd5821ef
2013-04-29 14:15:51 +02:00
c0b36f6236 ecos: use new timer iface to record benchmark runtime
... instead of the previous TimerListener workaround.

Change-Id: I3e712540e93b668301f50ecf4f5a5760e0a8fdb3
2013-04-29 14:15:49 +02:00
0887a53d1d ecos: campaign rewrite for MySQL data store
We're not yet using the common DatabaseCampaign as it doesn't allow
for additional experiment parameters (such as "variant" and
"benchmark") yet.  TODO: Integrate changes in DatabaseCampaign in a
generic way and use it.

Change-Id: I45480003be433654aea8d3a417fbfa66be31155b
2013-04-29 14:15:44 +02:00
0f16f18d75 cosmetics
Change-Id: Ifae805ae1e2dac95324e054af09a7b70f5d5b60c
2013-04-22 14:24:02 +02:00
2d45a2c52c ecos: use MemoryMap materialization instead of own code
Change-Id: I8615a066c53e1d6a02c78bce3199fa1f73edfda9
2013-04-10 13:01:04 +02:00
d7a3a28431 ecos: split valid mem access range
When eCos is built as a multiboot binary, some of its data structures
are still at very low (<<1M) addresses, but the rest moves to
addresses >1M.  This change makes sure our invalid mem access
detection is not overly generous.

Change-Id: If8265a407b3706a4ff71562b316e05aa22255f62
2013-04-09 13:16:13 +02:00
001d036613 ecos: ignore VGA mem accesses
Ignore VGA mem accesses for valid address range detection.

Change-Id: I4a85b6d4a2de52ecbd977d2dba474df818710600
2013-04-06 18:59:12 +02:00
a0293b9d18 ecos: use commandline parameters for local test runs, too
Change-Id: Iac43732408420a9e3687d5756c942c2f2256ad86
2013-04-03 15:18:08 +02:00
373c19dd68 ecos: store traceinfo.txt in correct location
This commit completes the change introduced in commit
59e5fd3169.

Change-Id: I0a6ab7b35fbb69cbb8ef91e187b0d0bc32d01071
2013-04-03 15:18:08 +02:00
59e5fd3169 ecos: fail-client accepts commandline parameters for prerequisites phase
This allows us to generate prerequisites (traces, state snapshots, memory
maps) in parallel, and without the previous shell script hacks.

Change-Id: I05a0321a794b4033d05eed20f5bffbd1e910cf1b
2013-03-28 18:09:39 +01:00
bf6affeca3 decouple ecos experiment class definition from instantiation aspect
See 65eb44a746 for an explanation.
2013-03-19 22:30:39 +01:00
b37a475dfd campaign-controlled experiments depend on fail-comm library
An experiment talking to a campaign server via the JobClient/JobServer
interface needs the FailControlMessage.proto compiler to run before the
experiment is compiled.  A dependency on fail-comm ensures this.
2013-03-19 22:21:42 +01:00
4686c27d3d ElfReader: Support for Section and Symbol size.
- getSection/getSymbol now returns an ElfSymbol reference.

Searching by address now searches if address is within
symbol address and symbol address + size.
So we can test, if we are *within* a function, object or
section and not only at the start address.
2013-03-04 15:18:52 +01:00
b996617a97 ecos_kernel_test: improved dependency check
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2074 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-02-07 10:43:03 +00:00
5f2364e1a2 ecos_kernel_test: updates due to architecture changes (ElfReader still NOT working properly)
At the moment, the experiment only works with a hard-coded address for the _stext symbol because the ElfReader cannot extract the symbol's address from the elf-binary. The experiment has been tested locally with PREREQUISITES = 1 and PREREQUISITES = 0. In the latter case, the test only considered the client/server communication, i.e., no FI has been performed at all (NOT tested yet).

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2066 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-02-06 15:39:57 +00:00
b8df5d6a11 ecos_kernel_test experiment updated (NOT TESTED)
The loop is running now until undone already fetched jobs are available.

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2044 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-01-31 16:33:50 +00:00
125914a305 BochsRegister.hpp and BochsRegisterIDs.hpp not needed anymore
The includes of these headers have already been removed from the experiments. In the current code, the content of the header BochsRegister.hpp is rather simply copied to x86/Architecture.hpp. It is therefore necessary to revisit the code soon (especially the FIXME related to register IDs).

Another problem is that there is no generalization of register IDs. Thus, all experiments are currently specific to a concrete architecture (which is not desired).

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2010 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-01-17 13:41:23 +00:00
hsc
87ee9df37b ecos: additional burst fault model
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1961 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-27 17:06:32 +00:00
hsc
5fe61e0f3f ecos: cosmetics
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1960 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-27 17:06:29 +00:00
hsc
5135c79c05 TimerListener: microsecond granularity (ms is too coarse)
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1952 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-23 15:35:08 +00:00
hsc
c65c4936ab ecos: compress traces
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1950 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-23 15:33:27 +00:00
hsc
dfe4a5d4c0 ecos: split campaign into job producer and consumer
When we want to use a bounded job queue, job producer and job consumer must
run in different threads in order not to deadlock the campaign.  Ideally
this functionality moves to the CampaignManager later (including the
retrieval of existing experiment results and storing new results).

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1946 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-20 15:02:01 +00:00
hsc
35026de2d8 ecos: use multiple intermediate states to speed up experiments (disabled)
This modification creates and uses multiple intermediate snapshot states
(one every 1,000,000 instructions) to fast-forward to the FI site.
Unfortunately this doesn't work yet; the trace seems to change in many (not
all!) cases we do this.  One possible cause could be an incorrect
(off-by-one or alike?) restoration of the serial device timers, and
therefore an earlier/later transition to "output buffer empty", resulting
in eCos' serial putc function needing a different amount of polling loop
iterations.  Needs more investigation.

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1940 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-20 15:01:43 +00:00
hsc
041746741d ecos: include variant and benchmark in job info
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1939 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-20 15:01:40 +00:00
hsc
20fc14e6b3 ecos: result merging repaired
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1938 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-20 15:01:37 +00:00
hsc
4f48eb3232 ecos: send mandatory ecos_test_result when sanity check fails
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1929 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-13 16:26:58 +00:00
hsc
4cce602bca ecos: actually enqueue jobs
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1928 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-13 00:17:33 +00:00
hsc
7b3e5986d1 ecos: save experiment runtime
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1927 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-13 00:17:31 +00:00
hsc
077f7ea507 ecos: load existing results on campaign startup
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1926 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-13 00:17:28 +00:00
hsc
9decf95a09 ecos: specific traps are OK for two benchmarks
The "except1" and "clockcnv" benchmarks explicitly cause a trap and handle
it in their own trap handler.  This, of course, isn't a bad experiment
outcome.

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1923 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-13 00:17:18 +00:00
hsc
a0e6ddd519 ecos: fault-space cutoff specific for this campaign
For the eCos kernel test campaign we define a relatively "special" metric
to compare FI results from different applications:  Instead of aligning
the fault-space dimensions (e.g., by artificially adding "don't care" space
to the right for shorter application variants), we only keep the data
address axis constant.  The rationale behind this is that -- despite the
benchmark applications' run-to-completion behavior -- an operating system's
scheduler usually runs ad infinitum, and that we therefore can extrapolate
from a short application run to any longer period.

In essence, this means we use a failure/experiment-length quotient (ignoring
the Y axis for simplicity, as it is constant in size) to compare
application variants, instead of an absolute failure count.  One important
side effect is that we do not punish any application slowdown with this
metric.  (And simply prolongating non-susceptible program sections with,
e.g., NOPs, seemingly "improves" fault tolerance.)

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1922 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-12 13:55:45 +00:00
hsc
464bc9390b ecos: campaign rewritten
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1920 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-12 10:59:35 +00:00
hsc
283d946295 ecos: no need for experimentInfo.hpp.sh anymore
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1918 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-12 10:59:28 +00:00
hsc
4957cfb71b ecos: remove hard-coded trace filename
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1913 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-10 16:13:59 +00:00
hsc
0ab4f12b7c ecos: no need for clearListeners anymore
Since r1700, listeners remove themselves when they leave their scope.
A restore() call also implicitly clears all listeners.

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1912 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-09 15:37:33 +00:00
hsc
661563125e ecos: cosmetics / compiler warning
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1910 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-09 15:37:27 +00:00
hsc
1292b66043 ecos: image/prerequisite/result path naming changes
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1909 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-09 15:37:24 +00:00