Commit Graph

26 Commits

Author SHA1 Message Date
6d4dfeb913 shutdown cleanups revisited
This change became necessary as we observed weird fail-client SIGSEGV
crashes with both Bochs and Gem5 backends and different experiments.

Some Fail* components are instantiated statically: the
SimulatorController instance "simulator", containing the
ListenerManager and the CoroutineManager, and the active
ExperimentFlow subclass(es)
(experiments/instantiate-experiment*.ah.in).  The experiment(s) is
registered as an active flow in the CoroutineManager at startup.

As plugins (which are ExperimentFlows themselves) are often created on
an experiment's stack, ExperimentFlows deregister themselves on
destruction (e.g., when leaving the plugin variable's scope).  The
core problem is, that the creation and destruction order of statically
instantiated objects depends on the link order; if the experiment is
destroyed after the CoroutineManager, its automatic self-deregistering
feature talks to the smoking ruins of the latter.

This change removes all static instantiations of ExperimentFlow and
replaces them with constructions on the heap.  Additionally it makes
sure that the CoroutineManager recognizes that a shutdown is in
progress, and refrains from touching potentially already destroyed
data structures when a (mistakenly globally instantiated)
ExperimentFlow deregisters in this case.

Change-Id: I8a7d42fb141222cd2cce6040ab1a01f9de61be24
2013-09-04 10:13:48 +02:00
12f9915d1c core/efw: send back results earlier
The client sends results back earlier (i.e., before all jobs are
done) if the client response time (CLIENT_JOB_REQUEST_SEC) is
exceeded. This makes sure that extraordinarily long-running
experiments get reported back before, e.g., the LIDO job timeout
kills the Fail* instance.

Change-Id: I3ada0360ec54b63f80a7008570ca514449720220
2013-06-17 17:43:42 +02:00
de754c5f27 comm: handle connect() failures properly
Quoting connect(3posix): "If connect() fails, the state of the socket is
unspecified.  Conforming applications should close the file descriptor and
create a new socket before attempting to reconnect."

Change-Id: Ibcdcc0f546560a41009832894659a37947243f2f
2013-05-29 16:29:09 +02:00
20b70df651 efw/JobClient: knock less often when there are no jobs yet
Change-Id: If769b402a7b00ed3aebedd5f4d0954831a0ee905
2013-04-29 15:34:08 +02:00
880e7a81ff comm: ignore SIGPIPE
This prevents client and server from being sent a SIGPIPE (and
terminating) when the other side unexpectedly closes the connection.
It's way easier to handle this condition when checking the write()
return value, than to do anything smart in a SIGPIPE handler.  More
details:
<http://stackoverflow.com/questions/108183/how-to-prevent-sigpipes-or-handle-them-properly>

Change-Id: I1da5bf5ef79c8b7b00ede976e96ed4f1c560049d
2013-04-29 15:32:12 +02:00
0f16f18d75 cosmetics
Change-Id: Ifae805ae1e2dac95324e054af09a7b70f5d5b60c
2013-04-22 14:24:02 +02:00
fdb39c9613 correction of commit 2079
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2080 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-02-13 13:16:18 +00:00
a2830fa140 JobClient: weighting for throughput calculation
The new troughput is now calculated as:
0.5*old throughput + 0.5* the current throughput of the last job-set.

This prevents excessive variations in the calculation of the new
throughput.

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2079 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-02-13 13:11:29 +00:00
94214063ac Fixed whitespaces.
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2067 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-02-07 00:51:14 +00:00
hsc
8ce25257c3 bugfix: compile with libstdc++-4.7
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2049 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-02-01 13:57:42 +00:00
179272abea Once an experiment terminates, all results will be sent to the server.
One an experiment terminates, sending the results back to the server will
be initiated by the jobclients destructor.

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2042 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-01-31 16:33:43 +00:00
542ee51c4b Method to query the number of undone jobs in JocClient.hpp added.
Since several jobs can be fetched from the server, it is interesting to
know how much undone jobs are still available. This will accomplished by
the new method getNumberOfUndoneJobs().

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2041 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-01-31 16:33:39 +00:00
ff2dc189ce Correction of commit 2014
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2019 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-01-24 11:26:26 +00:00
00f809231f Code cleanup for commit 1963-1965
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2014 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2013-01-23 14:22:05 +00:00
fc1d21fe53 Bugfix for server-client communication
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1965 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-30 18:13:13 +00:00
d7842c2ad7 The Jobclient can get several jobs with one request
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1963 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-30 16:50:02 +00:00
hsc
49d1608969 correct sanity checks for client/server communication
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1933 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-11-14 13:31:53 +00:00
0c3568dc2f CoroutineManager: comment fix (deprecated).
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1722 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-10-05 11:56:39 +00:00
31aa3aa925 bugfixes in overall coroutine handling to allow the overwriting of onTrigger.
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1721 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-10-05 11:48:39 +00:00
hsc
7513dacad1 properly deal with clients that talked to another campaign server before
A campaign server now tells all clients a unique run ID (the UNIX timestamp
when it was started).  This allows us to ignore results from "old" clients
that talked to another server before, and to tell them to die.

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1677 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-09-23 17:28:07 +00:00
hsc
f9c96ddf2d prefix internal libraries to avoid naming conflicts with system libraries
This is a precaution to avoid current and future naming conflicts with
common system libraries.  libutil (part of libc) is the first, but probably
not the last example that already caused trouble twice.

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1614 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-09-12 07:52:30 +00:00
hsc
e56918e40e centralized and cmake-based campaign server+port config
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1590 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-09-04 13:57:01 +00:00
2076d21e61 Experiment updates due to last commit.
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1449 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-07-12 10:45:39 +00:00
hsc
4a4b3ea7e2 FailBochs build process reversed
The FailBochs client is not linked by the Bochs build system anymore, but
by our cmake scripts (make fail-client):
 -  All Bochs libraries are merged into libfailbochs.a (a new target
    within the Bochs Autotools scripts).
 -  The previous libfail.a is *not* a merge of all Fail* libraries anymore,
    but pulls these in via library dependencies.

Additionally I did a lot of build system cleanup, e.g. additional external
libraries may now be pulled in where they're needed.

git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1390 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-06-29 22:22:41 +00:00
ad0cfb9b11 Pre-/postprocessing is done within the event objects (Bochs-specific event added), ++coding-style.
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1366 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-06-21 10:47:22 +00:00
2575604b41 Fail* directories reorganized, Code-cleanup (-> coding-style), Typos+comments fixed.
git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1321 8c4709b5-6ec9-48aa-a5cd-a96041d1645a
2012-06-08 20:09:43 +00:00