1399 Commits

Author SHA1 Message Date
f5b34a962c data-aggregator: fix alphabetic resulttype sorting
Depending on SQL-statement nesting, some scripts already correctly sorted
resulttypes alphabetically, but some sorted along the numeric ENUM value
behind the resulttype name.  This change explicitly converts the resulttype
to a string before sorting.

Change-Id: Ia18aa4e75b94a6a9f7bb125953bc85b86b3cbd6e
2018-07-24 09:16:33 +02:00
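
The difference between sorting by numeric ENUM value and sorting by name can be sketched in C++ as follows. This is only an illustration: the enumerator set, its declaration order, and the helper names (`name_of`, `sorted_by_value`, `sorted_by_name`) are assumptions made for this example, and the real data-aggregator scripts operate on SQL results rather than C++ containers.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical result types. The declaration order (and thus the numeric
// value) deliberately differs from the alphabetical order of the names.
enum class ResultType { OK_MARKER, FAIL_MARKER, TIMEOUT, DETECTED_MARKER };

// Map an enumerator to its name.
inline std::string name_of(ResultType r) {
    switch (r) {
    case ResultType::OK_MARKER:       return "OK_MARKER";
    case ResultType::FAIL_MARKER:     return "FAIL_MARKER";
    case ResultType::TIMEOUT:         return "TIMEOUT";
    case ResultType::DETECTED_MARKER: return "DETECTED_MARKER";
    }
    return "";
}

// Convert a list of enumerators to their names, preserving order.
inline std::vector<std::string> names(const std::vector<ResultType> &v) {
    std::vector<std::string> out;
    for (ResultType r : v) out.push_back(name_of(r));
    return out;
}

// Sorts along the numeric value behind each name.
inline std::vector<std::string> sorted_by_value(std::vector<ResultType> v) {
    std::sort(v.begin(), v.end());
    return names(v);
}

// Converts to a string first, so the order is alphabetical.
inline std::vector<std::string> sorted_by_name(std::vector<ResultType> v) {
    std::sort(v.begin(), v.end(), [](ResultType a, ResultType b) {
        return name_of(a) < name_of(b);
    });
    return names(v);
}
```

With these assumed enumerators, `sorted_by_value` yields the declaration order, while `sorted_by_name` yields the alphabetical order regardless of how the values are numbered.
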
27b697200b data-aggregator: specifically limit to fspmethod 'basic'
In their current implementation, the data-aggregator scripts do not work
correctly on sampling results.

Change-Id: I1035970b352f513d725bd1a40ac9262368ffbcc0
2018-07-24 09:16:33 +02:00
e63f7376f8 JobClient: connect to IPv4 endpoints only
As long as the JobServer only listens on IPv4 endpoints, it makes no
sense to attempt a connect to an IPv6 endpoint on the client side.

(However, it's 2018 and we should also be capable of using IPv6 on
both the client and server side ...)

Change-Id: I9c3916466c350ce74a31cef3b6ae0e7ac56367c7
2018-07-24 09:16:33 +02:00
c5e0825c6f Database: reduce varchar cols to fit MyISAM indexes
MyISAM indexes are limited to 1000 bytes per index.  Recently, Linux
distros (e.g. Debian 9) started to default MariaDB installations to
utf8mb4, which can use up to 4 bytes per character.  Hence, two
varchar columns indexed in a single key have a total maximum length of
250.  Instead, we use some lower, round numbers.

Change-Id: I4b53bc217912bc7070102a0af4938763e61b041d
2018-07-24 09:16:33 +02:00
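
The arithmetic behind the 250-character bound can be written down as a small sketch. The constant and function names are made up for this example, and the calculation ignores any per-column storage overhead; it only restates the numbers given in the commit message (1000 bytes per index, up to 4 bytes per utf8mb4 character).

```cpp
#include <cstddef>

// Figures taken from the commit message above.
constexpr std::size_t kMyisamMaxKeyBytes = 1000;
constexpr std::size_t kUtf8mb4MaxBytesPerChar = 4;

// Maximum total number of characters that fit into one index.
constexpr std::size_t max_indexed_chars(std::size_t key_bytes,
                                        std::size_t bytes_per_char) {
    return key_bytes / bytes_per_char;
}

// True if two varchar columns of the given lengths (in characters) fit
// into a single utf8mb4 key under the simplified model above.
constexpr bool fits_in_key(std::size_t col_a_chars, std::size_t col_b_chars) {
    return (col_a_chars + col_b_chars) * kUtf8mb4MaxBytesPerChar
           <= kMyisamMaxKeyBytes;
}

static_assert(max_indexed_chars(kMyisamMaxKeyBytes,
                                kUtf8mb4MaxBytesPerChar) == 250,
              "1000 bytes / 4 bytes per character = 250 characters");
```

Under this model, two columns of 100 and 150 characters just fit, whereas two columns of 255 characters each do not.
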
c88c034ca7 cmake: default build type 'Release'
+Make available build types explicit (pull-down in CMake GUI)

Change-Id: Ib2cdd31ad038cef1bb27fcd14f089a35a9751e76
2018-07-24 09:16:33 +02:00
be0b7b630c doc update
Change-Id: Ie8f9011b7718c971de74ab40689c9de7fbeb3b18
2018-07-24 09:16:33 +02:00
ff3a5fb498 move to LLVM 3.9
This change removes support for earlier LLVM versions; making them
work as well is simply too tedious.

Change-Id: I372a151279ceb2bfd6de101c9e0c15f0a4b18c03
2018-07-24 09:15:33 +02:00
baaa6c3ce8 JobClient/Server fixes
- Retain original CLIENT_RETRY_COUNT semantics after Boost::Asio
  switch
- JobClient is C++11 now, too
- Message reception copy/paste error fixes

Change-Id: I19c474b2a79cd2ac8657e8d58d6170202d096fb0
2018-05-09 17:43:28 +02:00
9272c5cbed Move JobClient to Boost::asio as well
I did this mainly so server and client use a common networking API.
IMO, using Boost::asio results in nicer name-lookup code.
Since no longer needed, I removed the SocketComm stuff.
The client is still synchronous; I see no benefit in having it
asynchronous.

I'm not super happy with the random backoff by the clients if they
can't connect to the server. It makes the code really messy, and 3 retries
is totally arbitrary, as are the backoff windows. I believe launching
the server and clients in the correct order should be handled by a
launch script.
Change-Id: Ifea64919fc228aa530c90449686f51bf63eb70e7
2018-05-09 17:41:52 +02:00
191219ad06 data-aggregator: variant-durations.sh w/o filter
Change-Id: I7a3164635fc2fbd65d99fc8bba66e956d505a515
2018-05-09 15:25:45 +02:00
42d6ff4a97 data-aggregator: "on write" fault model metrics
Change-Id: I784618fd4b3a0074153ce074957b57e363c54657
2018-05-09 15:25:45 +02:00
bbe60745e1 data-aggregator: script overhaul + modularization
Change-Id: I4353db1475f00956d19d91c8c558c34506ec836b
2018-05-09 15:25:45 +02:00
9ae8123433 JobServer: fix C++14 dependency
The recent Boost.Asio overhaul requires C++14 features, not only C++11.

Change-Id: I6decf0e6532956f7061d8a9021ec2c8406679266
2018-05-03 16:28:26 +02:00
5a5a99145c bochs: fix ac++-caused preprocessor namespace clash
When building with an experiment activated, the generated
instantiate-<experimentname>.ah gets included in each and every FAIL*
translation unit including Bochs's ones.  In the case of the
generic-experiment (and probably many others), this indirectly included
Google protobuf headers, which failed to compile for Bochs's gui/wx.cc and
gui/x.cc: The included X headers pollute the preprocessor namespace with a
"Status" macro that clashes with an internal protobuf "Status" class.

Change-Id: I613f5c792a9519cf2573eddc7fef6266c7168494
2018-05-03 16:26:13 +02:00
6f41ad73d3 util: MemoryMap test failure more verbose
Change-Id: Ie42e1983d8cc5658b7e88d59cdbe689e6aefe9f2
2018-05-03 15:24:52 +02:00
4a068792e8 fixes for Ubuntu 17.10
- Bochs: wx_gtk3 needs g(d|t)k2

Change-Id: I0a014e3ce7f1d40d215d5309e842db618a2971ed
2018-03-01 15:57:24 +01:00
6c120004eb Use boost-asio to improve FAIL* server performance
This patch overhauls the FAIL* server code to leverage Boost asio to be able to
handle a large number of clients (>4000). In this implementation the server is
now single threaded. I've not encountered any problems with this for up to
about 10k clients. Boost ASIO can also be used multithreaded, but I assume the
FAIL* internal data structures (Synchronized*) will become a bottleneck first.

The code now additionally depends on Boost Coro and Boost Context, as well as
a C++14 compiler, although the only C++14 feature required is a lambda capture
with initializer, such as [ x = std::move(x) ]. gcc-4.9.2 does this.

The code could (and probably should) be cleaned up more. Comments are wordy,
code is unnecessary now (multiple server threads), code is not self-contained
(headers spread dependencies), many ifdef's (server performance measuring
should be runtime rather than a compile time option), and much more. But for
this patch I was going for a minimal changeset to get the functionality in,
to have an easier review. Alas, FAIL* has no Unit-test suite to run the changes
against.

To handle such a large number of clients, more changes were necessary; for
example, server status output is now performed every 1s instead of for every
request.

The class Minion was removed completely; the only thing it was doing was
encapsulate an int.

The server now has a runtime-configurable port, or it can select a free port on
its own if none is specified. This requires the CampaignManager to add a port
argument and instantiate the JobServer dynamically.

Change-Id: Iad9238972161f95f5802bd2251116f8aeee14884
2017-09-15 06:26:14 +02:00
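
The C++14 feature mentioned above, a lambda capture with initializer, can be illustrated with a minimal, standalone sketch. The function `make_handler` and its payload are hypothetical and not taken from the FAIL* server code; the example uses only the standard library, not Boost.Asio.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical helper: builds a callable that owns a move-only buffer.
// (The deduced 'auto' return type is itself a C++14 feature.)
inline auto make_handler(std::unique_ptr<std::string> msg) {
    // Init-capture: 'msg' is moved into the closure. A C++11 lambda could
    // only copy or reference it, and std::unique_ptr cannot be copied.
    return [msg = std::move(msg)]() -> std::size_t { return msg->size(); };
}
```

A caller would write something like `auto h = make_handler(std::make_unique<std::string>("job"));` and later invoke `h()`; the closure keeps the string alive for as long as the handler exists, which is the typical reason to move state into a completion handler.
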
48ceeb6a14 Clang 4.0.0 fix for Bochs
Clang 4.0.0, which ac++ links against since today, throws an error in
the Bochs code.

config.cc:3480:55: error: ordered comparison between pointer and zero ('char *' and 'int')
    if (SIM->get_param_string("model", base)->getptr()>0) {
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~

Change-Id: I8404a54acd468bf71cbf29867657f9458f3a4c3f
2017-08-01 17:48:39 +02:00
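
The diagnostic above is triggered by an ordered comparison (`>`) between a pointer and the integer literal 0. The fix actually applied to config.cc is not shown in this log; the sketch below only illustrates one common way to express such a check, assuming the intent was "pointer is not null". `StringParam` and `has_value` are hypothetical stand-ins for this example.

```cpp
// Hypothetical stand-in for a parameter object whose getptr() returns char*.
struct StringParam {
    char *buf;
    char *getptr() const { return buf; }
};

// Rejected by newer compilers:  p.getptr() > 0
// An equality comparison against nullptr is well-formed instead.
inline bool has_value(const StringParam &p) {
    return p.getptr() != nullptr;
}
```
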
3ad42e270c fixes for Debian 9
- search for libdwarf.h in new locations (e.g., /usr/include/libdwarf/)
- build Bochs with -std=gnu++98 (gnu++14 is default since GCC 6.1)
- specify "proto2" syntax for protobuf messages
- minor build-system and C++ namespace fixes

Change-Id: I16dbc622c797ef8e936fe3c0fb9b03029d27529d
2017-08-01 14:12:03 +02:00
d0d62de3f4 sal: remove perf dependency to watchpoints/breakpoints
This change removes the hard compile-time dependency from the
performance-improving dedicated listener-list implementation
(core/sal/perf/) to basic watchpoints / breakpoints being enabled in
the cmake config.  This makes it possible to keep the CONFIG_FAST_*
switches enabled in practically every experiment.

The primary reason for this change was the recent insight that enabled
breakpoints with disabled CONFIG_FAST_BREAKPOINTS can massively slow
down an experiment even if the latter does not use a single breakpoint
itself.

Change-Id: I5e3f5c1632ed1ee98a3ec887f18b174fa0e15773
2016-12-03 17:49:07 +01:00
da3a78ec4d faultspaceplot: fix ymin calculation
The initialization value for ymin, which tracks the lower bound of
plotted rectangles (and is finally used for the preselected zoom
area), was chosen too small for Linux-kernel data structure addresses.

Change-Id: I7cd8dc690843394107e8aae7fffa90f27ca18153
2016-12-03 17:49:07 +01:00
87f7cae1da Merge "ecos_kernel_test: check if addr_errors_corrected is mapped before access"
2016-10-06 13:19:06 +02:00
e25c42f2b2 bochs: fix segmentation fault (after BX_PANIC) in HDD controller
Change-Id: I584e883b89ae36f4cee83684f9461e7baafa1495
2016-09-09 11:32:14 +02:00
3c6502a111 ecos_kernel_test: check if addr_errors_corrected is mapped before access
Change-Id: I08e751feeffc41a51312b8a9ad4b28a57a45a487
2016-09-09 11:26:07 +02:00
9886a0345e bochs: fix segmentation fault in DMA controller
Change-Id: I10c3e7e89d41abdcaea374ea01a2d1613d013e4c
2016-09-08 10:35:33 +02:00
85844b86cc ecos_kernel_test: compare serial output for coptermock benchmark
Change-Id: Ic4f13035d55c811bda7fa020114141b816a11ed8
2016-08-29 10:50:35 +01:00
89866de85f bochs: backport PCI IDE controller DMA start fix
Upstream SVN r12754: "Fixed PCI IDE controller DMA start (found with a
recent Linux version: "mode sense" command executed in DMA mode).
Updated output of "mode sense" page 0x2a (still reporting CD-ROM
drive)."

(data_ready part not backported due to missing dependency)

Change-Id: I392ba2b20a4138682fc34d6d2a78da0c6706e280
2016-08-06 19:44:49 +02:00
fbd788f05e bochs: backport overlapping memcpy fix
Upstream SVN r12563: "Bugfix: use memmove() if source and destination
range can overlap (found with valgrind)."

(Manually backported, the code structure has significantly changed
before this fix.)

Change-Id: Id176fb5b0aca806908cfb06f06bb5a7221ccc9c4
2016-08-06 17:50:27 +02:00
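
Why overlapping ranges call for memmove() can be shown with a small sketch. The helper `shift_left` is hypothetical and unrelated to the Bochs code touched by this backport; it merely moves data within a single buffer, which is the situation in which memcpy() has undefined behavior.

```cpp
#include <cstddef>
#include <cstring>

// Shift the contents of 'buf' (length 'len') left by 'offset' bytes.
// Source and destination overlap whenever offset < len - offset, so
// std::memmove is required; std::memcpy would be undefined behavior here.
inline void shift_left(char *buf, std::size_t len, std::size_t offset) {
    if (offset >= len) return;  // nothing left to move
    std::memmove(buf, buf + offset, len - offset);
}
```

The trailing `offset` bytes of the buffer are left unchanged by this helper.
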
a2798cc2bf bochs: backport PCI IDE buffer-overflow fix
Upstream SVN r10244: "Fixed possible buffer overflow causing segfault
or memory corruption. The buffers are not large enough for the maximum
sector count in LBA48 mode. Now resetting buffer pointers after
processing a PRD (and move remaining data if necessary). This should
fix the SF bug items #3190970 and #3077616."

This happened to us when booting Debian 8 with a Linux 3.16 kernel
from "flat" or "volatile" disk images, in the end corrupting the VGA
card's ("theVga") internal state and segfaulting.

Change-Id: I6a80432093a547dc2eb5270845369d0918e1e49b
2016-08-06 17:49:39 +02:00
436930de71 rampage: fix integer overflow
Change-Id: I18ee65335efd0207c27da9524d74be5d5a575329
2016-08-01 16:39:42 +02:00
2aeded20be rampage: link correctly with Bochs
Change-Id: I7a0231c6b6e8983f86b94c2bfde78d2524dbfc8d
2016-08-01 16:36:17 +02:00
8a63533137 Merge branch 'hannesweisbach-fixes'
2016-07-26 18:13:53 +02:00
3fc3c6a689 bochs: backport fix for out-of-bounds memory access
Upstream SVN r11912: "Fixed some gcc 4.8.1 warnings"

Change-Id: I599eb4d6bb8d5a7a2585bcca7d9a738ac2930aac
2016-07-26 18:11:55 +02:00
feb61ced7b doxygen: cmake syntax fix
Remove superfluous closing brace in cmake file.

Change-Id: I95224f7a42007f6779a6d4950161e7208355edc6
2016-07-26 18:08:06 +02:00
6509d50ec1 compute-hops: detail fixes and optional procps dep
Cleanups, warning fix, optional procps dep, --help correction.

Change-Id: Iba719493e4a8ec37acb7336a39172b3cdefbdc99
2016-07-26 17:41:32 +02:00
b5aaddcb8f new publications using FAIL*
Change-Id: Idf4bb22712475d5a6df182bb7ad19729e81c4591
2016-07-26 17:41:32 +02:00
d3d2faf680 globally rename Fail* to FAIL*
Change-Id: Ief2cb687cc69dd92c2e04f9314f0f1347e0a84ed
2016-07-26 17:41:32 +02:00
94a56c43c8 remove deprecated stuff
Change-Id: Ifc25d216bbf782416159ceb0c366a080d2c8c428
2016-03-15 23:20:05 +01:00
69da134956 Merge branch 'wsos'
2016-03-15 23:16:42 +01:00
de8598ab83 cmake: prefer LLVM 3.4 over system-wide default
FindLLVM.cmake now starts searching for specific "llvm-config-x.y"
versions instead of using the system-wide default "llvm-config" first.
This avoids breaking builds on Debian 8, where LLVM 3.5 is the (yet
unsupported) default, but 3.4 is still installable.

Change-Id: I6fd577f515a233e30c6f803f87b9a680b5515a5b
2016-03-11 20:59:01 +01:00
449ac1a692 DatabaseExperiment: local debug helper code
Change-Id: Ibf42c93df26f6123edc867147621a011665e9c43
2016-03-11 20:59:01 +01:00
39b120f7ca GenericExperiment: record output during complete runtime
Before this change, the GenericExperiment only recorded port 0xe9 output
*after* the fault was injected.  When a fault was injected during the
workload's output loop, the output data before that point in time was
missing, and the experiment outcome was wrongly classified as SDC.

This change moves the logging activation to before the fast-forwarding
step (DatabaseExperiment::cb_before_fast_forward).  It also makes sure the
DatabaseExperiment only clears its own listeners instead of also touching
the SerialOutputLogger's one.

Change-Id: I66bda4ee318d271ddda6f7ade4e817bf9d14cf46
2016-03-11 20:59:01 +01:00
5bd7c4a9c5 GenericExperiment: limit output logger buffer
Limit the serial-output logger buffer to prevent overly large memory
consumption in case the target system ends up, e.g., in an endless loop.
The buffer is limited to (golden-run output size)+1 to be able to detect
the case when the target system makes a correct output but faultily adds
extra characters afterwards.

Change-Id: I50c082f8fb09a702d87ab83732ca3e3463c46597
2016-03-11 20:59:01 +01:00
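
The "(golden-run output size)+1" limit can be sketched as follows. The class `BoundedOutputLog` and the function `matches_golden` are hypothetical names for this example and do not reflect the actual SerialOutputLogger interface.

```cpp
#include <cstddef>
#include <string>

// Hypothetical bounded logger: keeps at most golden_size + 1 characters.
class BoundedOutputLog {
public:
    explicit BoundedOutputLog(std::size_t golden_size)
        : limit_(golden_size + 1) {}

    // Characters beyond the limit are dropped, so a target stuck in an
    // endless output loop cannot grow the buffer without bound.
    void append(char c) {
        if (data_.size() < limit_) data_.push_back(c);
    }

    const std::string &data() const { return data_; }

private:
    std::size_t limit_;
    std::string data_;
};

// The one extra character is what keeps "correct output followed by
// extra characters" distinguishable from the golden-run output.
inline bool matches_golden(const BoundedOutputLog &log,
                           const std::string &golden) {
    return log.data() == golden;
}
```

If the limit were exactly the golden-run size, a run that prints the correct output and then appends garbage would be truncated to the correct output and misclassified as matching.
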
e08deef9d5 GenericExperiment: prevent integer overflow
This change prevents an integer overflow in the memory-access listener
for WRITE_OUTERSPACE.  Instead of matching all addresses above
maxima_data, l_mem_outerspace never matched in the
generic-experiment's "--catch-write-outerspace" mode.

Change-Id: I8f4ee4515af3998b7c2a8e83c7a18306c26d8d66
2016-03-11 20:45:50 +01:00
7168566ef5 faultspaceplot: -k -- keep CSVs only optionally
Change-Id: I5900c53b81d15d5262420afbf636444af31b00f1
2016-03-11 20:44:31 +01:00
ee8759f6b8 faultspaceplot: don't round down ymin
Before this change, ymin was rounded down to the nearest Y value
divisible by 1000, showing an empty, white area in the lower part of
the plot.  With this change, the initial Y-axis zoom level is
maximized to exactly show all non-white areas.

Change-Id: I1aea52a3afc331e7f11fe76ff2c5de3c71c61c71
2016-03-11 20:41:52 +01:00
e8ca3ba6ea faultspaceplot: don't plot OK_MARKER
In the current configuration, OK_MARKER would be plotted in white
color and be indistinguishable from the background.  Not plotting
these areas at all reduces output-file size.  As a side effect, the
initial Y-axis zoom level (ymin, ymax) can change.

Change-Id: Ic7b1a22a5a6f58e4df0849bca5262c646051ae2c
2016-03-11 20:36:07 +01:00
915a344223 faultspaceplot: better distinguishable colors
Change-Id: I49517fd104394e598937ab1c8970c739e41993b7
2016-03-11 20:36:07 +01:00
ea0a5f90e2 faultspaceplot: add ticks for symbols if available
+ use helper scripts from the same dir, not from $PATH

Change-Id: I7aba773c8dbff5f8643a39fa1ed8d26867f3a86d
2016-03-11 20:36:07 +01:00
3dd7c9cb48 faultspaceplot: plot burst faults correctly
+ remove "old matplotlib" warning

Change-Id: I47dec1cc6bf6dd86216cd6d373174d4c70556f63
2016-03-11 20:36:07 +01:00