christoph/fail - fail

Author	SHA1	Message	Date
Horst Schirmeier	17e76c140b	cpn: needs comm and MySQL at link time The dependency on fail-comm exists not only at compile time (the latter is due to protobuf header generation). Change-Id: I2bae51e763d9a385bda94e77df3e88619fa28a30	2014-01-23 14:31:24 +01:00
Horst Schirmeier	5ac108ea4b	Merge branch 'mysql-concurrency-fixes'	2014-01-20 18:35:35 +01:00
Horst Schirmeier	8f9ee3fddd	DatabaseCampaign: run statistics update when finished Change-Id: Ib68e54ba82e988db0d2d74ffafa6dc9bd54cd272	2014-01-20 18:34:51 +01:00
Horst Schirmeier	33b63651ae	DatabaseCampaign: MySQL / concurrency fixes According to <http://dev.mysql.com/doc/refman/5.5/en/c-api-threaded-clients.html>, a MySQL connection handle must not be used concurrently with an open result set and mysql_use_result() in one thread (DatabaseCampaign::run()), and mysql_query() in another (DatabaseCampaign::collect_result_thread()). This indeed leads to crashes when bounding the outgoing job queue (SERVER_OUT_QUEUE_SIZE), and maybe even more insidious effects in other cases. The solution is to create separate connections for both threads. Additionally, call mysql_library_init() before spawning any threads. Change-Id: I2981f2fdc67c9a2cbe8781f1a21654418f621aeb	2014-01-20 18:34:51 +01:00
Michael Lenz	9c984b9704	fail/cpn: (Database)Campaign no longer loses jobs Up until now the JobServer was silently losing jobs and only claiming to be finished - a workaround for this was to restart the campaign until all jobs were finished according to the database and the campaign's output. This change fixes the underlying problem, so a single campaign run suffices and no longer loses any jobs. Debugging this was awful and took us quite some time... Change-Id: Ie6c982cc3b2ce11128941f1f13be563bae22565c	2014-01-15 12:59:13 +01:00
Horst Schirmeier	ab9c0edf10	DatabaseCampaign: run jobs for known-outcome exps, too Although we know that a known_outcome=1 pilot does not exhibit behavior different from the golden run, the database schema does not yet know what this behavior looks like (in terms of result-table column values). In order to be able to JOIN valid results for all memory writes in the trace table (fspgroup maps them all onto one pilot per variant), we need to run these experiments, too. Additionally, don't join the fspgroup table; we only need this one for result calculations afterwards. Change-Id: Idcd2991274fede84526b1eee68a231774625d11a	2013-12-05 19:27:44 +01:00
Christian Dietrich	d26fc28fa4	cpn/database: include data_width in the fsppilot during prune step During the prune step the data_width of the injected location was not propagated before. It is now stored in fsppilot (database layout change!) and sent in the fsppilot protobuf message. Change-Id: I0562f6fc8957adea0f8a9fb63469ca5e3f4b7b2d	2013-09-11 10:27:04 +02:00
Christian Dietrich	9843b520c1	dbcampaign: select multiple variants/benchmark pairs The variant/benchmark selection can now use SQL LIKE syntax; all unfinished pilots from all selected variants are sent to the clients. E.g.: ./cored-voter-server -v x86-cored-voter -b simple-% -p basic will select the fsppilots in the variants: - x86-cored-voter/simple-ip/basic - x86-cored-voter/simple-instr/basic The variant and benchmark information is now sent within the fsppilot. Change-Id: I287bfcddc478d0b79d89e156d6f5bf8188674532	2013-07-05 10:19:58 +02:00
Christian Dietrich	d9c9b43102	dciao-kernelstructs: several experiment fixes. The previous fault injection experiment was kind of bullshit. This one is better in several ways: - sanity check at injection time (correct IP) - correct counting of kernel_transistions - copy whole activation scheme Change-Id: I014eea4d6fe103bc02ffd7bbca95dc56a1a4d9ea	2013-05-29 16:18:22 +02:00
Christian Dietrich	6789a313a9	DCiAOKernelImporter: different injection semantic. Is now very similar to normal importer, and may be deleted in the future, but at the moment, this should be merged, since it is the importer used in the sobres-2013 paper. This changes the MySQL Schema. instr1_absolute was introduced. Change-Id: I1bc2919bd14c335beca6d586b7cc0f80767ad7d5	2013-05-29 16:17:03 +02:00
Adrian Böckenkamp	6d8b3331d8	doxygen: doc generation fixed Doxygen skips undesired directories and files now. In addition, the documentation of the "fail" namespace has been fixed. Note that there are still several warnings (due to incomplete documentation) in the Doxygen output. Change-Id: Idad4f1ecff453765b307fa40a5c1cebc0c2ce2bb	2013-05-29 13:34:12 +02:00
Horst Schirmeier	880e7a81ff	comm: ignore SIGPIPE This prevents client and server from being sent a SIGPIPE (and terminating) when the other side unexpectedly closes the connection. It's way easier to handle this condition when checking the write() return value, than to do anything smart in a SIGPIPE handler. More details: <http://stackoverflow.com/questions/108183/how-to-prevent-sigpipes-or-handle-them-properly> Change-Id: I1da5bf5ef79c8b7b00ede976e96ed4f1c560049d	2013-04-29 15:32:12 +02:00
Horst Schirmeier	0f16f18d75	cosmetics Change-Id: Ifae805ae1e2dac95324e054af09a7b70f5d5b60c	2013-04-22 14:24:02 +02:00
Christian Dietrich	c24ed774b0	experiments/dciao-kernelstructs: new database driven experiment for DCiAO The dciao-kernelstructs experiment uses a trace imported by the DCiAOKernelImporter: bin/import-trace -t trace.pb -i DCiAOKernelImporter --elf-file app.elf The trace is pruned by the basic method: bin/prune-trace The experiment then performs CiAO fault injection experiments, and the results are stored in the database. Change-Id: I485dc2e5097b3ebaf354241f474ee3d317213707	2013-04-03 10:39:51 +02:00
Christian Dietrich	f18cddc63c	DatabaseCampaign: abstract campaign for interaction with MySQL Database The DatabaseCampaign interacts with the MySQL tables that are created by the import-trace and prune-trace tools. It offers all unfinished experiment pilots from the database to the fail-clients. Those clients send back a protobuf message, defined by the experiment, as a result. The custom protobuf message must have the form: import "DatabaseCampaignMessage.proto"; message ExperimentMsg { required DatabaseCampaignMessage fsppilot = 1; repeated group Result = 2 { // custom fields required int32 bitoffset = 1; optional int32 result = 2; } } The DatabaseCampaignMessage is the pilot identifier from the database. For each of the repeated result entries a row in a table is allocated. The structure of this table is constructed (by protobuf reflection) from the description of the message. Each field in the Result group becomes a column in the result table. For the given example it would be: CREATE TABLE result_ExperimentMessage( pilot_id INT, bitoffset INT NOT NULL, result INT, PRIMARY KEY(pilot_id) ) Change-Id: I28fb5488e739d4098b823b42426c5760331027f8	2013-04-02 09:52:42 +02:00
hoffmann	94214063ac	Fixed whitespaces. git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2067 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2013-02-07 00:51:14 +00:00
hellwig	00f809231f	Code cleanup for commit 1963-1965 git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@2014 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2013-01-23 14:22:05 +00:00
hellwig	fc1d21fe53	Bugfix for server-client communication git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1965 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-11-30 18:13:13 +00:00
hellwig	d7842c2ad7	The Jobclient can get several jobs with one request git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1963 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-11-30 16:50:02 +00:00
hsc	127161ef5a	bounded job queue (configurable, unbounded by default) git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1945 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-11-20 15:01:58 +00:00
hsc	e409ae2f76	JobServer: synchronization issues Synchronize re-sending jobs in sendPendingExperimentData() and modifying (or indirectly, via getDone() and the campaign, deleting) jobs in the m_runningJobs queue. a) sendPendingExperimentData needs an intact job to serialize and send it. b) After moving the job to m_doneJobs, it may be retrieved and deleted by the campaign at any time. Additionally, receiving a result overwrites the job's contents. This already may cause breakage in sendPendingExperimentData (a). git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1943 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-11-20 15:01:52 +00:00
hsc	1d498a516b	JobServer: do not try to talk to a dying minion git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1942 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-11-20 15:01:49 +00:00
hsc	49d1608969	correct sanity checks for client/server communication git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1933 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-11-14 13:31:53 +00:00
hellwig	6f98d64613	bugfix: race condition removed git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1921 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-11-12 11:46:26 +00:00
hsc	35b1d0203e	CampaignManager: destructor / cleanup git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1916 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-11-10 16:18:40 +00:00
hsc	86ba9cb377	CampaignManager: only instantiate JobServer when needed As we have a global CampaignManager instance in the fail-cpn library, a JobServer member variable is not such a good idea. Essentially, we started all JobServer threads (which is done in its constructor) within a fail-client before this commit. git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1915 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-11-10 16:14:06 +00:00
hsc	55dd79cc03	cosmetics git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1849 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-10-26 16:13:36 +00:00
adrian	15def480d9	warning-fix in release mode (var not initialized). git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1731 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-10-09 11:10:29 +00:00
hsc	d45965753d	bugfix: handle old clients properly Fix 1: A result message with a nonexistent or invalid run ID must be ignored in any case. 0 is only OK for NEED_WORK messages, clients communicating a result must know the ID. Fix 2: Tell the client the run ID in the first place ... git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1692 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-09-25 16:14:22 +00:00
hsc	7513dacad1	properly deal with clients that talked to another campaign server before A campaign server now tells all clients a unique run ID (the UNIX timestamp when it was started). This allows us to ignore results from "old" clients that talked to another server before, and to tell them to die. git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1677 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-09-23 17:28:07 +00:00
hsc	f9c96ddf2d	prefix internal libraries to avoid naming conflicts with system libraries This is a precaution to avoid current and future naming conflicts with common system libraries. libutil (part of libc) is the first, but probably not the last example that already caused trouble twice. git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1614 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-09-12 07:52:30 +00:00
hsc	e56918e40e	centralized and cmake-based campaign server+port config git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1590 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-09-04 13:57:01 +00:00
hsc	f992f53d5d	spacing git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1585 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-09-02 10:17:00 +00:00
unzner	d9b24a7c60	Changes I made in the l4-sys experiment recently, plus one minor style fix git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1584 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-09-01 16:05:22 +00:00
friemel	c06565aa4e	Basic SAL files and makefile modifications for adding gem5. git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1457 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-07-17 15:35:29 +00:00
hsc	4a4b3ea7e2	FailBochs build process reversed The FailBochs client is not linked by the Bochs build system anymore, but by our cmake scripts (make fail-client): - All Bochs libraries are merged into libfailbochs.a (a new target within the Bochs Autotools scripts). - The previous libfail.a is not a merge of all Fail* libraries anymore, but pulls these in via library dependencies. Additionally I did a lot of build system cleanup, e.g. additional external libraries may now be pulled in where they're needed. git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1390 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-06-29 22:22:41 +00:00
adrian	2575604b41	Fail* directories reorganized, Code-cleanup (-> coding-style), Typos+comments fixed. git-svn-id: https://www4.informatik.uni-erlangen.de/i4svn/danceos/trunk/devel/fail@1321 8c4709b5-6ec9-48aa-a5cd-a96041d1645a	2012-06-08 20:09:43 +00:00

37 Commits