Commit Graph

513 Commits

Author SHA1 Message Date
50704e9b59 x86: comment typos
Change-Id: I5092e8db23221ce109b75aee79ecc1c7e44c6d32
2018-12-11 17:14:59 +01:00
67f30a171e x86+bochs: add IDs and accessors for FPU and SSE registers
Change-Id: I33146929255337f679ff80152ed4d83106621ffb
2018-12-11 17:14:59 +01:00
9625587fc4 core/sal: refactoring BochsCPU::get/setRegisterContent
Removing the pData indirection that doesn't really simplify anything.

Change-Id: I98c15ffcd76faeac117bea4e1680dcb2dbdbc15f
2018-12-11 17:14:59 +01:00
171fe54330 core/sal: refactoring BochsCPU::get/setRegisterContent
Using switch/case instead of an if cascade is more readable and has a
better chance to be optimized.

Change-Id: I41dc2cbdf8c14bd35c91520d74b476d7b522a3a4
2018-12-11 17:14:59 +01:00
60329bface core/sal: correctly use CPU id in Bochs backend
Change-Id: I6b5f50d78429284b21794127af3af70df2c687a3
2018-12-11 17:14:59 +01:00
805bede338 util: LLVM code cleanups
Among others, rename instr_info to instr to avoid shadowing the class
member with the same name.

Change-Id: I53d2ee08f11a944528931bf8cb4003ec64391016
2018-09-03 14:14:27 +02:00
527763e87f JobServer: remove "come again" diagnostic
The "--[Server] No workload, come again..." appears every time a
larger job set is loaded from the database, once for every client that
knocks.  This isn't helpful and scrolls out relevant information,
hence I'm removing it for now.

Change-Id: Ic7ca5b3a0c096b384ba4803df5b482a96bf803b1
2018-08-27 20:20:53 +02:00
8426084e5a CampaignManager: avoid parameter-name clash
The -p parameter is already being used by several campaign servers for
the prune method to restrict to (which was broken in commit
6c120004e), hence allow only --port to choose a different server TCP
port at runtime.

Change-Id: Ia30e40d564e85a9702118dc28df4988ec628e491
2018-08-27 15:08:17 +02:00
3a47b20df2 JobServer: use steady_clock for interval measurement
std::chrono::system_clock is not monotonic, instead use
std::chrono::steady_clock for interval measurements.

Change-Id: I231affecfe8e89481720e47b59132fc838cdf73c
2018-08-03 22:00:23 +02:00
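To illustrate the point made in commit 3a47b20df2 above, here is a minimal sketch of interval measurement on a monotonic clock. The class name `IntervalTimer` and its members are hypothetical and not taken from the FAIL* sources; only `std::chrono::steady_clock` comes from the commit message.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not FAIL* code): measures the time elapsed since
// construction or the last reset() on a monotonic clock.
class IntervalTimer {
public:
    using clock = std::chrono::steady_clock;

    IntervalTimer() : m_start(clock::now()) {}

    // Restart the measured interval.
    void reset() { m_start = clock::now(); }

    // Seconds elapsed since construction or the last reset().
    double elapsed_seconds() const {
        const clock::time_point now = clock::now();
        // steady_clock never runs backwards, unlike system_clock, which
        // can jump when the wall-clock time is adjusted.
        assert(now >= m_start);
        return std::chrono::duration<double>(now - m_start).count();
    }

private:
    clock::time_point m_start;
};

static_assert(std::chrono::steady_clock::is_steady,
              "steady_clock is required to be monotonic");
```

With `system_clock` in place of `steady_clock`, an NTP correction or a manual clock change between two readings could yield a negative or inflated interval.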
a547b0d5b4 JobServer: print completion percentage and ETA
If the JobServer is provided a total number of experiments by the
campaign, it now prints a completion percentage and an estimated
remaining runtime alongside the usual progress reports.

Change-Id: Ibd781ba8bff9af3a85683bbd29728216e316da57
2018-08-03 19:53:45 +02:00
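The completion percentage and ETA described in commit a547b0d5b4 above can be sketched as follows. This is an assumption-laden illustration, not the JobServer code: the struct `Progress`, its fields, and the use of an average rate over the whole run (rather than the current rate the server reports) are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical progress snapshot; names are illustrative only.
struct Progress {
    std::uint64_t done;      // jobs completed so far
    std::uint64_t total;     // total experiments announced by the campaign (0 = unknown)
    double elapsed_seconds;  // time since the first result arrived
};

// Completion percentage in [0, 100]; 0 if the total is unknown.
double completion_percent(const Progress& p) {
    if (p.total == 0) return 0.0;
    assert(p.done <= p.total);
    return 100.0 * static_cast<double>(p.done) / static_cast<double>(p.total);
}

// Incoming results per second, averaged over the whole run (assumption).
double results_per_second(const Progress& p) {
    if (p.elapsed_seconds <= 0.0) return 0.0;
    return static_cast<double>(p.done) / p.elapsed_seconds;
}

// Estimated remaining runtime in seconds; a negative value means "unknown".
double eta_seconds(const Progress& p) {
    const double rate = results_per_second(p);
    if (p.total == 0 || rate <= 0.0) return -1.0;
    assert(p.done <= p.total);
    return static_cast<double>(p.total - p.done) / rate;
}
```

For example, 250 of 1000 jobs done after 50 seconds gives a rate of 5 results per second and 750 / 5 = 150 seconds remaining.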
f89794329c JobServer: progress-report overhaul
The JobServer progress-report output now shows the total number of
completed jobs instead of the (almost always zero) inbound queue fill
level.  Additionally, the current number of incoming results per
second is shown, which also prepares for an ETA calculation in the
following commit.

Change-Id: I6b71c45f44b9e6b9b17c059959a90068b51c165c
2018-08-03 19:51:07 +02:00
1c774ce50d JobClient: fix retry delay
Only wait for the retry delay if really retrying.

Change-Id: If12bd3745c799edc5933874d9a44d049646e0e87
2018-08-01 14:19:05 +02:00
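The fix in commit 1c774ce50d above ("only wait for the retry delay if really retrying") amounts to the loop shape below. This is a sketch under assumptions: the function name and the two callbacks are hypothetical, and the real JobClient is not structured this way as far as this log shows.

```cpp
#include <cassert>
#include <functional>

// Hypothetical retry loop (not FAIL* code). Calls try_connect up to
// max_attempts times and invokes wait_before_retry only between two
// attempts, i.e. never before the first attempt and never after the
// final failure.
bool connect_with_retry(const std::function<bool()>& try_connect,
                        unsigned max_attempts,
                        const std::function<void()>& wait_before_retry) {
    assert(max_attempts > 0);
    for (unsigned attempt = 1; attempt <= max_attempts; ++attempt) {
        if (try_connect()) {
            return true;          // success: no delay at all
        }
        if (attempt < max_attempts) {
            wait_before_retry();  // another attempt follows, so waiting is useful
        }
    }
    return false;                 // gave up without a trailing delay
}
```

In a real client the wait callback would typically wrap `std::this_thread::sleep_for`; passing it in as a parameter keeps the sketch testable without sleeping.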
00882f98ad JobClient: resolve endpoint only once
The JobClient now resolves the server IP once (lazily, when needed) instead
of on each connect attempt, reducing the number of DNS requests sent out.

Change-Id: I9804048d3252da333cb3addbe94a01fdf3c707c8
2018-07-31 12:33:52 +02:00
742ec092eb DatabaseExperiment: fix output formatting
Change-Id: If882a9ec68b5d2d040d8a047c2b1ea53eea4c21f
2018-07-31 12:29:20 +02:00
2c7640fe90 import-trace: record stats on failed register mappings
The import-trace tool now systematically collects statistics on which
LLVM -> FAIL* register ID mappings failed during import, and presents
those after the import finished.

Change-Id: Ied67853d754483277868fe21bf2c6efeaeb60f09
2018-07-30 14:36:33 +02:00
226545de58 util: LLVM test code output simplified
llvmDisTest now explicitly catches LLVMtoFailTranslator::notfound.

Change-Id: I45306212d45e00cfabb867159a13ce6d247e8e0f
2018-07-27 08:55:16 +02:00
eef19b80a0 FAIL* works with LLVM 3.9, 4.0, 5.0 or 6.0
Change-Id: I5480c3451daac7c8ea6160a9afe5ce557b73afb1
2018-07-27 08:55:09 +02:00
5d5927a88a DatabaseExperiment: add register FI
Calling the DatabaseCampaign with --inject-registers or
--force-inject-registers now injects into CPU registers.  This is achieved
by reinterpreting data addresses in the DB as addresses within the register
file.  (The mapping between registers and data addresses is implemented in
core/util/llvmdisassembler/LLVMtoFailTranslator.hpp.)  The difference
between --inject-registers and --force-inject-registers is what the
experiment does when a data address is not interpretable as a register: the
former option then injects into memory (DatabaseCampaignMessage,
RegisterInjectionMode AUTO), the latter skips the injection altogether
(FORCE).

Currently only compiles together with the Bochs backend; the
DatabaseExperiment's redecodeCurrentInstruction() function must be
moved into the Bochs EEA to remedy this.

Change-Id: I23f152ac0adf4cb6fbe82377ac871e654263fe57
2018-07-24 09:45:00 +02:00
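The difference between the two options in commit 5d5927a88a above can be written down as a small decision function. The mode names AUTO and FORCE come from the commit message; the OFF value, the `InjectionTarget` enum, and the function itself are assumptions made for this sketch.

```cpp
#include <cassert>

// AUTO and FORCE are named in the commit message; OFF (neither option
// given) is an assumption of this sketch.
enum class RegisterInjectionMode { OFF, AUTO, FORCE };

// Hypothetical result type: where the fault ends up.
enum class InjectionTarget { MEMORY, REGISTER, SKIP };

// maps_to_register: whether the data address from the DB can be
// interpreted as an address within the register file.
InjectionTarget choose_target(RegisterInjectionMode mode, bool maps_to_register) {
    switch (mode) {
    case RegisterInjectionMode::OFF:
        return InjectionTarget::MEMORY;
    case RegisterInjectionMode::AUTO:
        // --inject-registers: fall back to memory injection.
        return maps_to_register ? InjectionTarget::REGISTER
                                : InjectionTarget::MEMORY;
    case RegisterInjectionMode::FORCE:
        // --force-inject-registers: skip the injection altogether.
        return maps_to_register ? InjectionTarget::REGISTER
                                : InjectionTarget::SKIP;
    }
    assert(false && "unhandled RegisterInjectionMode");
    return InjectionTarget::MEMORY;
}
```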
54f3d3f9b6 x86: add amd64 registers
Floating-point related registers are still missing.

Change-Id: If0e0fa2b25cf2fda6e23aeddb3a72744e6c079a6
2018-07-24 09:24:45 +02:00
dd1b18e580 remove unused elfinfo/*
elfinfo was what ElfReader started from, but is not needed in itself
anymore.  The code has been mostly rewritten, so an explicit mention
of the original authors is not necessary anymore.

Change-Id: Iea48c80f9174504bbb56cc02ee2de5eda4a81489
2018-07-24 09:22:29 +02:00
9bd58cb294 ElfReader: read 64-bit ELF binaries
ElfReader now detects whether a 32- or 64-bit ELF is opened, and uses
the corresponding elf.h data structures.  Internally maps 32-bit ELF
structures onto 64-bit structures to use common processing code.

Change-Id: Ib42a4b21701aeadac7568e369a80c08f2807694e
2018-07-24 09:21:12 +02:00
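The approach described in commit 9bd58cb294 above (detect the ELF class, then map 32-bit structures onto their 64-bit counterparts) might look like this for symbol-table entries. The sketch assumes a Linux system that provides `<elf.h>`; the function names are hypothetical and the real ElfReader may widen other structures or do so differently.

```cpp
#include <cassert>
#include <elf.h>

// Returns true if the identification bytes at the start of the file
// announce a 64-bit ELF, false for any other class.
bool is_elf64(const unsigned char* e_ident) {
    assert(e_ident != nullptr);
    return e_ident[EI_CLASS] == ELFCLASS64;
}

// Widens a 32-bit symbol-table entry to the 64-bit layout so that
// common processing code only has to deal with Elf64_Sym. Note that
// the field order differs between the two structures, so a plain
// memcpy would not work.
Elf64_Sym widen(const Elf32_Sym& s) {
    Elf64_Sym w{};
    w.st_name  = s.st_name;
    w.st_info  = s.st_info;
    w.st_other = s.st_other;
    w.st_shndx = s.st_shndx;
    w.st_value = s.st_value;  // zero-extended from 32 to 64 bits
    w.st_size  = s.st_size;   // zero-extended from 32 to 64 bits
    return w;
}
```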
e63f7376f8 JobClient: connect to IPv4 endpoints only
As long as the JobServer only listens on IPv4 endpoints, it makes no
sense to attempt a connect to an IPv6 endpoint on the client side.

(However, it's 2018 and we should also be capable of using IPv6 on
both the client and server side ...)

Change-Id: I9c3916466c350ce74a31cef3b6ae0e7ac56367c7
2018-07-24 09:16:33 +02:00
c5e0825c6f Database: reduce varchar cols to fit MyISAM indexes
MyISAM indexes are limited to 1000 bytes per index.  Recently, Linux
distros (e.g. Debian 9) started to default MariaDB installations to
utf8mb4, which can use up to 4 bytes per character.  Hence, two
varchar columns indexed in a single key have a total maximum length of
250.  Instead, we use some lower, round numbers.

Change-Id: I4b53bc217912bc7070102a0af4938763e61b041d
2018-07-24 09:16:33 +02:00
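The arithmetic behind commit c5e0825c6f above is short enough to check at compile time. The constant and function names are hypothetical, and the sketch deliberately ignores any per-column overhead, following the figures given in the commit message.

```cpp
// Figures from the commit message above; names are illustrative only.
constexpr unsigned kMyisamMaxKeyBytes = 1000;    // MyISAM limit per index
constexpr unsigned kUtf8mb4MaxBytesPerChar = 4;  // worst case for utf8mb4

// Total number of characters that all columns of one key may hold.
constexpr unsigned max_key_chars(unsigned key_bytes, unsigned bytes_per_char) {
    return key_bytes / bytes_per_char;
}

// Two varchar columns in one key share a budget of 250 characters.
static_assert(max_key_chars(kMyisamMaxKeyBytes, kUtf8mb4MaxBytesPerChar) == 250,
              "1000 bytes / 4 bytes per character");
```

Two columns of, say, 100 characters each stay within that budget, which matches the commit's choice of "lower, round numbers".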
ff3a5fb498 move to LLVM 3.9
This change removes support for earlier LLVM versions; making them
work as well is simply too tedious.

Change-Id: I372a151279ceb2bfd6de101c9e0c15f0a4b18c03
2018-07-24 09:15:33 +02:00
baaa6c3ce8 JobClient/Server fixes
- Retain original CLIENT_RETRY_COUNT semantics after Boost::Asio
  switch
- JobClient is C++11 now, too
- Message reception copy/paste error fixes

Change-Id: I19c474b2a79cd2ac8657e8d58d6170202d096fb0
2018-05-09 17:43:28 +02:00
9272c5cbed Move JobClient to Boost::asio as well
I did this mainly so server and client use a common networking API.
IMO, using Boost::asio results in nicer name-lookup code.
Since no longer needed, I removed the SocketComm stuff.
The client is still synchronous; I see no benefit in having it
asynchronous.

I'm not super happy with the random backoff by the clients if they
can't connect to the server. It makes the code really messy; 3 retries
is totally arbitrary, as are the backoff windows. I believe launching
the server and clients in the correct order should be handled by a
launch script.
Change-Id: Ifea64919fc228aa530c90449686f51bf63eb70e7
2018-05-09 17:41:52 +02:00
9ae8123433 JobServer: fix C++14 dependency
The recent Boost.Asio overhaul requires C++14 features, not only C++11.

Change-Id: I6decf0e6532956f7061d8a9021ec2c8406679266
2018-05-03 16:28:26 +02:00
6f41ad73d3 util: MemoryMap test failure more verbose
Change-Id: Ie42e1983d8cc5658b7e88d59cdbe689e6aefe9f2
2018-05-03 15:24:52 +02:00
6c120004eb Use boost-asio to improve FAIL* server performance
This patch overhauls the FAIL* server code to leverage Boost asio to be able to
handle a large number of clients (>4000). In this implementation the server is
now single threaded. I've not encountered any problems with this for up to
about 10k clients. Boost ASIO can also be used multithreaded, but I assume the
FAIL* internal data structures (Synchronized*) will become a bottleneck first.

The code now additionally depends on Boost Coro and Boost Context, as well as
a C++14 compiler, although the only C++14 feature required is a lambda capture
with initializer, such as [ x = std::move(x) ]. gcc-4.9.2 does this.

The code could (and probably should) be cleaned up more. Comments are wordy,
code is unnecessary now (multiple server threads), code is not self-contained
(headers spread dependencies), many ifdef's (server performance measuring
should be runtime rather than a compile time option), and much more. But for
this patch I was going for a minimal changeset to get the functionality in,
to have an easier review. Alas, FAIL* has no Unit-test suite to run the changes
against.

To handle such a large number of clients more changes were necessary, for
example server status output is now performed every 1s, instead of for every
request.

The class Minion was removed completely; the only thing it was doing was
encapsulate an int.

The server now has a runtime-configurable port, or it can select a free port on
its own if none is specified. This requires the CampaignManager to add a port
argument and instantiate the JobServer dynamically.

Change-Id: Iad9238972161f95f5802bd2251116f8aeee14884
2017-09-15 06:26:14 +02:00
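The one C++14 feature that commit 6c120004eb above names, a lambda capture with initializer, is what allows a move-only object to be handed to a handler. The sketch below shows the language feature in isolation; `Session` and `make_handler` are hypothetical and no Boost.Asio API is used.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for per-connection state that cannot be copied.
struct Session {
    std::string peer;
};

// Returns a handler that owns the session. In C++11 a std::unique_ptr
// could not be captured by value; the C++14 init capture
// [s = std::move(session)] moves it into the closure instead.
auto make_handler(std::unique_ptr<Session> session) {
    assert(session != nullptr);
    return [s = std::move(session)]() -> const std::string& {
        return s->peer;
    };
}
```

The returned closure is itself move-only, so it has to be moved rather than copied when it is stored or passed on.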
3ad42e270c fixes for Debian 9
- search for libdwarf.h in new locations (e.g., /usr/include/libdwarf/)
- build Bochs with -std=gnu++98 (gnu++14 is default since GCC 6.1)
- specify "proto2" syntax for protobuf messages
- minor build-system and C++ namespace fixes

Change-Id: I16dbc622c797ef8e936fe3c0fb9b03029d27529d
2017-08-01 14:12:03 +02:00
d0d62de3f4 sal: remove perf dependency to watchpoints/breakpoints
This change removes the hard compile-time dependency from the
performance-improving dedicated listener-list implementation
(core/sal/perf/) to basic watchpoints / breakpoints being enabled in
the cmake config.  This allows keeping the CONFIG_FAST_* switches
enabled in practically every experiment.

The primary reason for this change was the recent insight that enabled
breakpoints with disabled CONFIG_FAST_BREAKPOINTS can massively slow
down an experiment even if the latter does not use a single breakpoint
itself.

Change-Id: I5e3f5c1632ed1ee98a3ec887f18b174fa0e15773
2016-12-03 17:49:07 +01:00
d3d2faf680 globally rename Fail* to FAIL*
Change-Id: Ief2cb687cc69dd92c2e04f9314f0f1347e0a84ed
2016-07-26 17:41:32 +02:00
449ac1a692 DatabaseExperiment: local debug helper code
Change-Id: Ibf42c93df26f6123edc867147621a011665e9c43
2016-03-11 20:59:01 +01:00
39b120f7ca GenericExperiment: record output during complete runtime
Before this change, the GenericExperiment only recorded port 0xe9 output
*after* the fault was injected.  When a fault was injected during the
workload's output loop, the output data before that point in time was
missing, and the experiment outcome was wrongly classified as SDC.

This change moves the logging activation to before the fast-forwarding
step (DatabaseExperiment::cb_before_fast_forward).  It also makes sure the
DatabaseExperiment only clears its own listeners instead of also touching
the SerialOutputLogger's one.

Change-Id: I66bda4ee318d271ddda6f7ade4e817bf9d14cf46
2016-03-11 20:59:01 +01:00
ad558abeb6 DatabaseCampaign/-Experiment: add burst faults
This change introduces the ability to inject burst faults to
the DatabaseCampaign/-Experiment and thus to all derived
campaigns/experiments.

Change-Id: I491d021ed3953562bd7c908e9de50d448bc8ef33
2016-03-11 19:01:17 +01:00
748b0aea09 MemWriteListener: Set accesstype correctly
One construction of MemWriteListener did not set the MemAccessEvent type
correctly.

Change-Id: I34a34a34c1c23b2081d4749ee5e5372461c21717
2015-09-18 12:51:55 +02:00
f7c9917f7e database-experiment: no abort on injection_instr_absolute==NULL
The injection_instr_absolute can be NULL if the trace was imported with
--faultspace-rightmargin R. The database-experiment then aborted the
injection, since a non-present injection instruction is encoded as 0,
which is != 0.

Change-Id: I0abcbf102e8b26678ea574d6f73741c2cfac6781
2015-09-18 12:51:03 +02:00
d71db9211c DatabaseExperiment: fix headers
 -  Add missing iomanip header: Without this one, Fail/gem5 does not
    compile.

 -  Remove unnecessary sal/bochs header: This seems to be a relic from
    when the DatabaseExperiment was Bochs-specific.

Change-Id: I91c991795c2c2e76359e9d11415f5119d225a4ab
2015-08-06 16:33:59 +02:00
1d9dae0e21 bochs: translate virtual to linear addresses
This change makes MemoryAccessListeners deliver linear addresses
instead of virtual ones deprived of their segment selector.  Even in
modern operating systems, segment selectors are still used for, e.g.,
thread-local storage.

The hooks within MemAccess.ah could maybe be implemented in a simpler
and less fragile way using the BX_INSTR_LIN_ACCESS instrumentation
hook, but this needs more investigation.

Change-Id: I0cee6271d6812d0a29b3a24f34d605a327ced7da
2015-07-31 12:46:06 +02:00
d38218f0eb DatabaseExperiment: remove Bochs dependency
Use the newly introduced SimulatorController::getCPUCount() instead of
BX_SMP_PROCESSORS to figure out the number of CPUs the back end provides.

Change-Id: I6d6521ae508154366ab5d0c23ddcb6f2de99aa04
2015-04-10 16:44:41 +02:00
ae15ac704d add missing headers
This change adds some missing headers needed for compiling the
PandaBoard variant, which seems to not have seen a compiler for a
while.

Change-Id: Ifb54abb4dc676fafc29ecbae97bafaa547fcfc80
2015-04-10 16:43:13 +02:00
96fae94b1f DatabaseExperiment: fix wrong variable scope
This fixes a wrong variable scope introduced in commit 193e5b7,
breaking compilation.

Change-Id: I74194e9ea6e726bc0a7ce2ee5ad5439b7de87fba
2015-04-10 15:07:45 +02:00
193e5b757e adapt experiments to new restore() behavior
This change adapts several experiments, including the
DatabaseExperiment framework, to the restore() behavior update from
the previous change.  Existing traces should continue to be usable.

This is not tested yet, mainly because I don't have access to most of
the experiment targets / guest systems necessary for testing.  Please
test your own experiments if possible, or at least leave me a note
that you couldn't test it!

Especially the cored-voter/experiment.cc update may be broken, but
maybe the "FISHY" +2 in there was not OK in the first place.

Change-Id: I0c5daeabc8fe6ce0c3ce3e7e13d02195f41340ad
2015-03-18 18:22:21 +01:00
91a9c6f688 core/sal: restore() more reliable for bochs
BochsController::restore() now recreates a state more expectable from
the experiment.  The state is now the same that save() leaves behind
in its most prominent use case after hitting a breakpoint.  This
change breaks backwards compatibility with some experiments, see
below!

Right after a breakpoint on a specific address fired and
BochsController::save() was called, another breakpoint on that
specific address would not fire again (unless that instruction is
executed again later on).

Up to this change, the situation after calling
BochsController::restore() was different:  A breakpoint on that
specific address would fire twice.  This difference led to the problem
that running the tracing plugin after save() would work fine
(recording the current instruction once, since 3dc752c "tracing: fix
loss of first dynamic instruction"), but running it after restore()
would record the current instruction *twice*.

This change aligns restore()'s behavior to that of save().  The
implications for existing experiments, traces and results are:

 -  Existing result data should not be affected at all, as
    trace.time1/time2 were correct before this change.  Nevertheless,
    the assumption time2-time1 >= instr2-instr1 does not hold for
    equivalence classes including the first instruction, if the latter
    was faultily recorded twice (see below).

 -  Existing traces that were recorded after a restore() (with a
    tracing plugin including the aforementioned commit 3dc752c)
    contain the first instruction twice.  An affected trace can be
    corrected with this command line:

      dump-trace old.tc | tail -n +2 | convert-trace -f dump -t new.tc

 -  For experiments that record traces after a restore() (such as
    ecos_kernel_test), nothing changes, as both the tracing and the
    fast-forwarding before the fault injection now see one instruction
    event less.

 -  Experiments that record traces after a save(), especially those
    that rely on the generic-tracing experiment for tracing, now see
    one instruction event less, before they need to inject their
    fault.  These experiments need to be adjusted, for example
    dciao-kernelstructs now should use bp.setCounter(injection_instr)
    instead of bp.setCounter(injection_instr+1).

Change-Id: I913bed9f1cad91ed3025f610024d62cfc2b9b11b
2015-03-06 08:38:40 +01:00
bd5802e5d7 core/sal: allow repeating BochsController::save
BochsController::save() now can in principle be called multiple times
in a row.  Not that this would really make sense, but the results are
consistent now.

Change-Id: Ib4c6eb571a364b0f7ea6142c8cfec004a12f98b3
2015-03-06 08:38:40 +01:00
d2899e8db7 core/sal: silence "unused function" warning
BochsHelpers.hpp is included by some aspect headers, which are implicitly
included into many (all?) translation units.  As most TUs do not use the
getCPU function defined as "static inline", an "unused function" warning
was generated for each of them.

Change-Id: Ibb903fe7a11aaf1f455a626c8bf8b86f50857645
2015-02-09 11:02:40 +01:00
8973f65a50 util: don't leak resources from SumTree
This fixes the resource-leaking "should never happen" case when no
element is found by returning a notfound member.  Found by Coverity
Scan, CID 25555.

Change-Id: I9055ae0a3b31e61f3a8e3b098ec5613c3b5535f6
2015-02-07 18:20:40 +01:00
6a0214b132 ProtoStream: member variable -> local var
The contained state is not used over function boundaries anyways.
Found by Coverity Scan, CID 25689.

Change-Id: I34e42c227710be4859f6d62de9311c4201ed29b0
2015-02-07 18:20:39 +01:00
e99e4aafa8 JobServer: initialize sockaddr_in
This most probably is not a real problem, but does not take much work
to fix.  Found by Coverity Scan, in several reports.

Change-Id: I8bd12e3f7afeb4b1c4e1b057bdbd95da9aa9211c
2015-02-07 18:20:39 +01:00
8c2b6cf028 JobServer: fix socket leaks
Found by Coverity Scan, CID 25600.

Change-Id: Ic0c549928ce8058c145d178ed06b41b543676460
2015-02-07 18:20:30 +01:00