std::chrono::system_clock is not monotonic; use
std::chrono::steady_clock for interval measurements instead.
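A minimal sketch of such an interval measurement. The helper name
measure_seconds is made up for illustration and is not taken from the
FAIL* sources:

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not from the FAIL* sources): runs `work` and returns
// the elapsed time in seconds, measured with the monotonic steady_clock.
template <typename Work>
double measure_seconds(Work work)
{
    typedef std::chrono::steady_clock clock;
    const clock::time_point start = clock::now();
    work();
    const std::chrono::duration<double> elapsed = clock::now() - start;
    // steady_clock never goes backwards, so an interval cannot be negative.
    assert(elapsed.count() >= 0.0);
    return elapsed.count();
}
```

With system_clock, the same subtraction could yield a negative or wildly
wrong value if the system time is adjusted while `work` runs.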
Change-Id: I231affecfe8e89481720e47b59132fc838cdf73c
If the JobServer is provided a total number of experiments by the
campaign, it now prints a completion percentage and an estimated
remaining runtime alongside the usual progress reports.
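One plausible way to derive both numbers, sketched under the assumption of
a roughly constant completion rate. The struct and function names are
illustrative only; the actual JobServer code may compute the estimate
differently (e.g. from the current result rate):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type, not taken from the JobServer sources.
struct Progress {
    double percent_done; // 0..100
    double eta_seconds;  // estimated remaining runtime, -1 if unknown
};

// Extrapolates the remaining runtime from the average time per finished job.
Progress estimate_progress(std::uint64_t done, std::uint64_t total,
                           double elapsed_seconds)
{
    assert(total > 0 && done <= total);
    Progress p;
    p.percent_done = 100.0 * static_cast<double>(done)
                   / static_cast<double>(total);
    if (done == 0) {
        // No finished job yet, hence no rate to extrapolate from.
        p.eta_seconds = -1.0;
        return p;
    }
    const double seconds_per_job = elapsed_seconds / static_cast<double>(done);
    p.eta_seconds = seconds_per_job * static_cast<double>(total - done);
    return p;
}
```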
Change-Id: Ibd781ba8bff9af3a85683bbd29728216e316da57
The JobServer progress-report output now shows the total number of
completed jobs instead of the (almost always zero) inbound queue fill
level. Additionally, the current number of incoming results per
second is shown, which also prepares for an ETA calculation in the
following commit.
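A results-per-second figure of this kind can be sketched as the difference
between two samples of the completed-jobs counter. RateMeter is a
hypothetical name, and timestamps are passed in explicitly so the logic is
easy to test:

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: turns a monotonically growing "completed jobs"
// counter into a rate (jobs per second) between consecutive samples.
class RateMeter {
public:
    typedef std::chrono::steady_clock clock;

    explicit RateMeter(clock::time_point start)
        : m_last_time(start), m_last_count(0) {}

    // Returns the number of jobs completed per second since the last call.
    double sample(std::uint64_t completed_total, clock::time_point now)
    {
        const std::chrono::duration<double> dt = now - m_last_time;
        double rate = 0.0;
        if (dt.count() > 0.0) {
            rate = static_cast<double>(completed_total - m_last_count)
                 / dt.count();
        }
        m_last_time = now;
        m_last_count = completed_total;
        return rate;
    }

private:
    clock::time_point m_last_time;
    std::uint64_t m_last_count;
};
```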
Change-Id: I6b71c45f44b9e6b9b17c059959a90068b51c165c
When prefixing a symbol name with '?', the GenericExperiment does not abort
in case the symbol is not found in the provided ELF binary:
fail-client -Wf,--detected-marker=?eddiErrorHandler
[...]
[GenericExperiment] ELF Symbol not found, ignoring: eddiErrorHandler
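The lookup rule can be sketched roughly as follows. SymbolTable and
resolve_symbol are stand-ins invented for this illustration, not the
GenericExperiment or ElfReader API:

```cpp
#include <cstdint>
#include <iostream>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an ELF symbol table: name -> address.
typedef std::map<std::string, std::uint64_t> SymbolTable;

// Looks up `spec` in `symbols`. A leading '?' marks the symbol as optional:
// if it is missing, a note is logged and false is returned. A missing
// mandatory symbol is a hard error.
bool resolve_symbol(const SymbolTable &symbols, std::string spec,
                    std::uint64_t &address)
{
    const bool optional = !spec.empty() && spec[0] == '?';
    if (optional) {
        spec.erase(0, 1); // strip the '?' prefix
    }
    const SymbolTable::const_iterator it = symbols.find(spec);
    if (it != symbols.end()) {
        address = it->second;
        return true;
    }
    if (!optional) {
        throw std::runtime_error("ELF Symbol not found: " + spec);
    }
    std::cerr << "ELF Symbol not found, ignoring: " << spec << std::endl;
    return false;
}
```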
Change-Id: Iec12416ce8e38ff0ee1704e3a725c2cadc97b756
The JobClient now resolves the server IP once (lazily, when needed) instead
of on each connect attempt, reducing the number of DNS requests sent out.
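The resolve-once pattern, sketched with an injected resolver function so
that no real DNS lookup is needed. LazyAddress is a hypothetical class, not
the JobClient implementation:

```cpp
#include <functional>
#include <string>

// Hypothetical helper: resolves a hostname on first use and caches the
// result for all later connect attempts.
class LazyAddress {
public:
    typedef std::function<std::string(const std::string &)> Resolver;

    LazyAddress(const std::string &hostname, Resolver resolver)
        : m_hostname(hostname), m_resolver(resolver), m_resolved(false) {}

    // Performs the (possibly expensive) lookup only on the first call.
    const std::string &get()
    {
        if (!m_resolved) {
            m_address = m_resolver(m_hostname);
            m_resolved = true;
        }
        return m_address;
    }

private:
    std::string m_hostname;
    Resolver m_resolver;
    bool m_resolved;
    std::string m_address;
};
```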
Change-Id: I9804048d3252da333cb3addbe94a01fdf3c707c8
This bugfix makes sure that from a set of symbols with the same
address, only the first one gets imported.
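A minimal sketch of the "first one wins" rule, assuming symbols arrive as
(address, name) pairs. The type and function names are illustrative and not
the importer's actual code:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::uint64_t, std::string> Symbol; // (address, name)

// Keeps only the first symbol seen for each address, similar to a table
// that admits a single row per address.
std::map<std::uint64_t, std::string>
first_symbol_per_address(const std::vector<Symbol> &symbols)
{
    std::map<std::uint64_t, std::string> unique;
    for (std::vector<Symbol>::const_iterator it = symbols.begin();
         it != symbols.end(); ++it) {
        // emplace() leaves an existing entry untouched, so later symbols
        // at the same address are dropped.
        unique.emplace(it->first, it->second);
    }
    return unique;
}
```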
After an assessment whether analysis scripts can deal with multiple
symbols at the same address, an import of all symbols should be made
possible in the future. This will also require relaxing the
primary-key constraint of the `symbols' table.
Change-Id: I61c4ddb1af1556d44eab54e53eaa3d0fc20de7c1
The import-trace tool now systematically collects statistics on which
LLVM -> FAIL* register ID mappings failed during import, and presents
those after the import finished.
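Such statistics could be collected with a simple per-ID counter. The class
below is a hypothetical sketch, not the import-trace implementation:

```cpp
#include <cstdint>
#include <map>
#include <ostream>

// Hypothetical collector: counts how often each LLVM register ID could not
// be mapped to a FAIL* register ID.
class MappingFailureStats {
public:
    void record_failure(unsigned llvm_reg_id) { ++m_failures[llvm_reg_id]; }

    std::uint64_t count(unsigned llvm_reg_id) const
    {
        const std::map<unsigned, std::uint64_t>::const_iterator it =
            m_failures.find(llvm_reg_id);
        return it == m_failures.end() ? 0 : it->second;
    }

    std::uint64_t total() const
    {
        std::uint64_t sum = 0;
        for (std::map<unsigned, std::uint64_t>::const_iterator it =
                 m_failures.begin(); it != m_failures.end(); ++it) {
            sum += it->second;
        }
        return sum;
    }

    // Prints one line per failed register ID, e.g. after the import finished.
    void print(std::ostream &os) const
    {
        for (std::map<unsigned, std::uint64_t>::const_iterator it =
                 m_failures.begin(); it != m_failures.end(); ++it) {
            os << "LLVM register " << it->first << ": " << it->second
               << " failed mapping(s)\n";
        }
    }

private:
    std::map<unsigned, std::uint64_t> m_failures;
};
```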
Change-Id: Ied67853d754483277868fe21bf2c6efeaeb60f09
The generic-experiment can now record and compare output on an
arbitrary serial port. Using Bochs' port 0xe9 hack (parameter
--e9-file) is kept for compatibility reasons.
Change-Id: I5b1aa02d244e8b474919e1bdf043e523ea0e4f45
Calling the DatabaseCampaign with --inject-registers or
--force-inject-registers now injects into CPU registers. This is achieved
by reinterpreting data addresses in the DB as addresses within the register
file. (The mapping between registers and data addresses is implemented in
core/util/llvmdisassembler/LLVMtoFailTranslator.hpp.) The difference
between --inject-registers and --force-inject-registers is what the
experiment does when a data address is not interpretable as a register: the
former option then injects into memory (DatabaseCampaignMessage,
RegisterInjectionMode AUTO), the latter skips the injection altogether
(FORCE).
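The decision described above can be summarized in a small function. AUTO
and FORCE are the mode names from the text; the OFF value, the
InjectionTarget enum and the function name are assumptions made for this
sketch:

```cpp
// AUTO and FORCE as named above; OFF (no register injection requested) is
// an assumption for this sketch.
enum RegisterInjectionMode { OFF, AUTO, FORCE };

// Hypothetical result type describing where the fault gets injected.
enum InjectionTarget { INJECT_MEMORY, INJECT_REGISTER, SKIP_INJECTION };

// `address_is_register` tells whether the data address from the DB can be
// interpreted as an address within the register file.
InjectionTarget choose_target(RegisterInjectionMode mode,
                              bool address_is_register)
{
    if (mode == OFF) {
        return INJECT_MEMORY;
    }
    if (address_is_register) {
        return INJECT_REGISTER;
    }
    // Not a register: AUTO falls back to memory, FORCE skips the injection.
    return mode == AUTO ? INJECT_MEMORY : SKIP_INJECTION;
}
```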
This currently only compiles together with the Bochs backend; the
DatabaseExperiment's redecodeCurrentInstruction() function must be
moved into the Bochs EEA to remedy this.
Change-Id: I23f152ac0adf4cb6fbe82377ac871e654263fe57
elfinfo was what ElfReader started from, but is not needed in itself
anymore. The code has been mostly rewritten, so an explicit mention
of the original authors is not necessary anymore.
Change-Id: Iea48c80f9174504bbb56cc02ee2de5eda4a81489
ElfReader now detects whether a 32- or 64-bit ELF is opened, and uses
the corresponding elf.h data structures. Internally maps 32-bit ELF
structures onto 64-bit structures to use common processing code.
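A rough sketch of both steps using the elf.h definitions: detecting the ELF
class from the identification bytes, and widening a 32-bit symbol entry to
the 64-bit layout. The function names are made up and not part of
ElfReader:

```cpp
#include <elf.h>
#include <cstddef>

// Returns 32 or 64 according to the EI_CLASS byte of the ELF identification,
// or 0 if the buffer is too short, lacks the ELF magic, or has another class.
int elf_class_bits(const unsigned char *ident, std::size_t len)
{
    if (len < EI_NIDENT) {
        return 0;
    }
    if (ident[EI_MAG0] != ELFMAG0 || ident[EI_MAG1] != ELFMAG1 ||
        ident[EI_MAG2] != ELFMAG2 || ident[EI_MAG3] != ELFMAG3) {
        return 0;
    }
    switch (ident[EI_CLASS]) {
    case ELFCLASS32: return 32;
    case ELFCLASS64: return 64;
    default:         return 0;
    }
}

// Widens a 32-bit symbol entry to the 64-bit structure. The fields are
// copied one by one because Elf32_Sym and Elf64_Sym order them differently.
Elf64_Sym widen_symbol(const Elf32_Sym &in)
{
    Elf64_Sym out;
    out.st_name  = in.st_name;
    out.st_info  = in.st_info;
    out.st_other = in.st_other;
    out.st_shndx = in.st_shndx;
    out.st_value = in.st_value;
    out.st_size  = in.st_size;
    return out;
}
```

After widening, the remaining code only has to deal with the 64-bit
structures.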
Change-Id: Ib42a4b21701aeadac7568e369a80c08f2807694e
Instead of using assert() (which only does something in a Debug
build), explicitly fail when a user-specified symbol is not found.
Change-Id: I33ac59ca4483ee65ba70c264b5153a7766a919d2
faultspaceplot.sh now fails gracefully if the requested
variant/benchmark combination does not exist in the database.
Change-Id: Ied3b5a0e72cc5ae8e6ce352b65486f15bb13576b
This change adds global fault-coverage and occurrence count
measurement scripts that work with sampling results.
Change-Id: I14d94a2c549cff3256fc7b0800cfd4a702e6ad35
The *-onwrite.sh analysis scripts only work if import-trace was not
run with --no-write-ecs, i.e. they only work if writing memory
accesses were imported into the "trace" table.
Change-Id: Icb2ea4e72d2200c886d4f9074f2da0f9bfd6ac85
Depending on SQL-statement nesting, some scripts already correctly sorted
resulttypes alphabetically, but some sorted along the numeric ENUM value
behind the resulttype name. This change explicitly converts the resulttype
to a string before sorting.
Change-Id: Ia18aa4e75b94a6a9f7bb125953bc85b86b3cbd6e
In their current implementation, the data-aggregator scripts do not work
correctly on sampling results.
Change-Id: I1035970b352f513d725bd1a40ac9262368ffbcc0
As long as the JobServer only listens on IPv4 endpoints, it makes no
sense to attempt a connect to an IPv6 endpoint on the client side.
(However, it's 2018 and we should also be capable of using IPv6 on
both the client and server side ...)
Change-Id: I9c3916466c350ce74a31cef3b6ae0e7ac56367c7
MyISAM indexes are limited to 1000 bytes per index. Recently, Linux
distros (e.g. Debian 9) started to default MariaDB installations to
utf8mb4, which can use up to 4 bytes per character. Hence, two
varchar columns indexed in a single key have a total maximum length of
250. Instead, we use some lower, round numbers.
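The arithmetic behind the 250-character limit, written out as a small
check. The constants follow the numbers stated above; the function names
are illustrative, and any per-column key overhead is ignored:

```cpp
// Limits as stated above: 1000 bytes per MyISAM index, and up to 4 bytes
// per character with utf8mb4.
constexpr unsigned kMyisamMaxKeyBytes = 1000;
constexpr unsigned kUtf8mb4MaxBytesPerChar = 4;

// Maximum total number of characters that fit into one index.
constexpr unsigned max_indexable_chars()
{
    return kMyisamMaxKeyBytes / kUtf8mb4MaxBytesPerChar;
}

// Checks whether two varchar columns of the given lengths (in characters)
// can share a single key.
constexpr bool key_fits(unsigned col_a_chars, unsigned col_b_chars)
{
    return col_a_chars + col_b_chars <= max_indexable_chars();
}

static_assert(max_indexable_chars() == 250, "1000 bytes / 4 bytes per char");
```

For example, two varchar(255) columns exceed the limit, while two columns
of 100 characters each (hypothetical sizes) fit comfortably.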
Change-Id: I4b53bc217912bc7070102a0af4938763e61b041d
This change removes support for earlier LLVM versions; making them
work as well is simply too tedious.
Change-Id: I372a151279ceb2bfd6de101c9e0c15f0a4b18c03
I did this mainly so server and client use a common networking API.
IMO, using Boost::asio results in nicer name-lookup code.
Since it is no longer needed, I removed the SocketComm stuff.
The client is still synchronous; I see no benefit in having it
asynchronous.
I'm not super happy with the random backoff by the clients if they
can't connect to the server. It makes the code really messy; 3 retries
is totally arbitrary, as are the backoff windows. I believe launching
the server and clients in the correct order should be handled by a
launch script.
Change-Id: Ifea64919fc228aa530c90449686f51bf63eb70e7
When building with an experiment activated, the generated
instantiate-<experimentname>.ah gets included in each and every FAIL*
translation unit including Bochs's ones. In the case of the
generic-experiment (and probably many others), this indirectly included
Google protobuf headers, which failed to compile for Bochs's gui/wx.cc and
gui/x.cc: The included X headers pollute the preprocessor namespace with a
"Status" definition that clashes with an internal protobuf "Status" class.
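The clash can be reproduced in isolation: X11's Xlib.h contains
"#define Status int", so any later declaration named Status gets rewritten
by the preprocessor. The snippet below only mimics that situation with a
demo class; the #undef shown is one generic way to avoid the clash and is
not necessarily what this change does:

```cpp
// Mimics Xlib.h, which contains exactly this macro definition.
#define Status int

// Without this #undef, "class Status" below would expand to "class int"
// and fail to compile.
#undef Status

namespace demo {
// Stand-in for a library-internal class that happens to be named Status.
class Status {
public:
    bool ok() const { return true; }
};
} // namespace demo
```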
Change-Id: I613f5c792a9519cf2573eddc7fef6266c7168494
This patch overhauls the FAIL* server code to leverage Boost.Asio to be able to
handle a large number of clients (>4000). In this implementation the server is
now single-threaded. I've not encountered any problems with this for up to
about 10k clients. Boost.Asio can also be used multithreaded, but I assume the
FAIL* internal data structures (Synchronized*) will become a bottleneck first.
The code now additionally depends on Boost.Coroutine and Boost.Context, as well as
a C++14 compiler, although the only C++14 feature required is a lambda capture
with initializer, such as [ x = std::move(x) ]. gcc-4.9.2 does this.
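For reference, a self-contained example of such a capture with
initializer. The function name and payload are made up; the point is that a
move-only object can be moved into the closure, which C++11 captures cannot
express directly:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Returns a callable that owns `message`. The init-capture
// [msg = std::move(message)] moves the unique_ptr into the closure.
inline auto make_handler(std::unique_ptr<std::string> message)
{
    return [msg = std::move(message)]() -> std::size_t {
        return msg->size();
    };
}
```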
The code could (and probably should) be cleaned up more. Comments are wordy,
some code is now unnecessary (multiple server threads), code is not self-contained
(headers spread dependencies), many ifdef's (server performance measuring
should be a runtime rather than a compile-time option), and much more. But for
this patch I was going for a minimal changeset to get the functionality in,
to have an easier review. Alas, FAIL* has no unit-test suite to run the changes
against.
To handle such a large number of clients, more changes were necessary; for
example, server status output is now performed every 1s instead of for every
request.
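Time-based throttling of the status output can be sketched as below.
StatusThrottle is a hypothetical helper using explicit timestamps; the
actual server may well use a timer of its event loop instead:

```cpp
#include <chrono>

// Hypothetical helper: allows an action at most once per interval.
class StatusThrottle {
public:
    typedef std::chrono::steady_clock clock;

    explicit StatusThrottle(clock::duration interval)
        : m_interval(interval), m_armed(false) {}

    // Returns true if the status line should be printed at time `now`.
    // The first call always returns true; later calls return true only
    // once the interval since the last print has passed.
    bool should_print(clock::time_point now)
    {
        if (m_armed && now < m_next) {
            return false;
        }
        m_next = now + m_interval;
        m_armed = true;
        return true;
    }

private:
    clock::duration m_interval;
    clock::time_point m_next;
    bool m_armed;
};
```

This keeps the output cost constant regardless of how many requests arrive
per second.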
The class Minion was removed completely; the only thing it was doing was
encapsulating an int.
The server now has a runtime-configurable port, or it can select a free port on
its own if none is specified. This requires the CampaignManager to add a port
argument and instantiate the JobServer dynamically.
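Letting the operating system pick a free port usually means binding to
port 0 and querying the socket afterwards. The sketch below shows this with
plain POSIX sockets rather than Boost.Asio, purely to stay dependency-free;
listen_on is a made-up name:

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>

// Opens a TCP listening socket on the loopback interface. If
// `requested_port` is 0, the OS selects a free port. Returns the socket
// descriptor (or -1 on error) and stores the actual port in `port_out`.
int listen_on(std::uint16_t requested_port, std::uint16_t &port_out)
{
    const int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(requested_port);
    socklen_t len = sizeof(addr);
    if (bind(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) != 0 ||
        listen(fd, SOMAXCONN) != 0 ||
        // getsockname() reveals which port was actually assigned.
        getsockname(fd, reinterpret_cast<sockaddr *>(&addr), &len) != 0) {
        close(fd);
        return -1;
    }
    port_out = ntohs(addr.sin_port);
    return fd;
}
```

The caller can then report the chosen port, e.g. so that clients know
where to connect.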
Change-Id: Iad9238972161f95f5802bd2251116f8aeee14884
Clang 4.0.0, which ac++ links against since today, throws an error in
the Bochs code.
config.cc:3480:55: error: ordered comparison between pointer and zero ('char *' and 'int')
if (SIM->get_param_string("model", base)->getptr()>0) {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
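For illustration, a well-formed way to express such a check is an equality
comparison against the null pointer instead of an ordered comparison with
zero. The helper below is hypothetical and C++98-compatible; the actual fix
applied to config.cc is not shown here:

```cpp
#include <cstddef>

// Hypothetical helper: true if the string pointer is set at all.
// "p > 0" is an ordered pointer/integer comparison and is rejected by
// Clang 4.0.0; "p != NULL" states the intent and compiles everywhere.
bool is_set(const char *p)
{
    return p != NULL;
}
```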
Change-Id: I8404a54acd468bf71cbf29867657f9458f3a4c3f
- search for libdwarf.h in new locations (e.g., /usr/include/libdwarf/)
- build Bochs with -std=gnu++98 (gnu++14 is default since GCC 6.1)
- specify "proto2" syntax for protobuf messages
- minor build-system and C++ namespace fixes
Change-Id: I16dbc622c797ef8e936fe3c0fb9b03029d27529d
This change removes the hard compile-time dependency from the
performance-improving dedicated listener-list implementation
(core/sal/perf/) to basic watchpoints / breakpoints being enabled in
the cmake config. This allows keeping the CONFIG_FAST_* switches
enabled in practically every experiment.
The primary reason for this change was the recent insight that enabled
breakpoints with disabled CONFIG_FAST_BREAKPOINTS can massively slow
down an experiment even if the latter does not use a single breakpoint
itself.
Change-Id: I5e3f5c1632ed1ee98a3ec887f18b174fa0e15773
The initialization value for ymin, which tracks the lower bound of
plotted rectangles (and is finally used for the preselected zoom
area), was chosen too small for Linux-kernel data structure addresses.
Change-Id: I7cd8dc690843394107e8aae7fffa90f27ca18153