179 Commits

Author SHA1 Message Date
57abdfdb5c Fix typo (dwarf) in the codebase (#2367)
In the codebase, the struct and functions were written without "f" for dwarf.
2023-07-19 17:58:52 +08:00
aafea39b8c Add "--enable-builtin-intrinsics=<flags>" option to wamrc (#2341)
Refer to doc/xip.md for details.
2023-07-06 18:20:35 +08:00
3bbf59ad45 wamrc: Warn on text relocations for XIP (#2340) 2023-07-05 10:49:45 +08:00
ae4069df41 Migrate ExpandMemoryOpPass to llvm new pass manager (#2334)
Fix #2328
2023-07-04 17:17:15 +08:00
1f89e446d9 Avoid switch lowering to lookup tables for XIP (#2339)
Because it involves relocations for the table. (.Lswitch.table.XXX)
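
For illustration only, a dense switch with constant results is the kind of code a compiler may lower into a read-only lookup table, which then has to be referenced through a relocation. The function below is a hypothetical example, not code from this PR:

```c
/* A dense switch that only returns constants: a typical candidate for
   lowering into a lookup table (e.g. a .Lswitch.table.* symbol). */
static int
days_in_month(int month) /* 0 = January, non-leap year */
{
    switch (month) {
        case 0:  return 31;
        case 1:  return 28;
        case 2:  return 31;
        case 3:  return 30;
        case 4:  return 31;
        case 5:  return 30;
        case 6:  return 31;
        case 7:  return 31;
        case 8:  return 30;
        case 9:  return 31;
        case 10: return 30;
        case 11: return 31;
        default: return -1; /* out of range */
    }
}
```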

Discussions: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2316
2023-07-04 16:48:32 +08:00
44f4b4f062 Add "--enable-llvm-passes=<passes>" option to wamrc (#2335)
Add "--enable-llvm-passes=<passes>" option to wamrc for customizing LLVM passes
2023-07-04 12:20:52 +08:00
03418ef5ac aot: Avoid possible relocations around "stack_sizes" for XIP mode (#2322)
Fixes https://github.com/bytecodealliance/wasm-micro-runtime/issues/2316

Lightly tested on riscv64 qemu.
2023-06-29 18:45:33 +08:00
5831531449 aot: Move stack_sizes table to a dedicated section (#2317)
To solve the "AOT module load failed: resolve symbol stack_sizes failed" issue.

This PR partly fixes #2312 and was lightly tested on qemu armhf.
2023-06-27 16:18:14 +08:00
ea78b89965 Fix wamrc build issues with LLVM 13 and LLVM 16 (#2313)
Fix some build errors when building wamrc with LLVM-13, reported in #2311
Fix some build warnings when building wamrc with LLVM-16:
```
  core/iwasm/compilation/aot_llvm_extra2.cpp:26:26: warning:
  ‘llvm::None’ is deprecated: Use std::nullopt instead. [-Wdeprecated-declarations]
     26 |             return llvm::None;
```
Fix a maybe-uninitialized compile warning:
```
  core/iwasm/compilation/aot_llvm.c:413:9: warning:
  ‘update_top_block’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    413 |         LLVMPositionBuilderAtEnd(b, update_top_block);
```
2023-06-27 08:59:49 +08:00
cd7941cc39 AOT/JIT native stack bound check improvement (#2244)
Move the native stack overflow check from the caller to the callee because the
former doesn't work for call_indirect and imported functions.

Make the stack usage estimation more accurate. Instead of making a guess from
the number of wasm locals in the function, use the LLVM's idea of the stack size
of each MachineFunction. The former is inaccurate because a) it doesn't reflect
optimization passes, and b) wasm locals are not the only reason to use stack.

To use the post-compilation stack usage information without requiring 2-pass
compilation or machine-code imm rewriting, introduce a global array to store
stack consumption of each function:
For JIT, use a custom IRCompiler with an extra pass to fill the array.
For AOT, use `clang -fstack-usage` equivalent because we support external llc.
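
A minimal sketch of the idea, assuming a downward-growing native stack. The array contents, the function name and the parameters are hypothetical and do not reflect the generated code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical table: one entry per function, filled in once the real
   stack consumption is known after code generation. */
static uint32_t stack_sizes[] = { 64, 4096, 128 };

/* Callee-side check: does the space between the current stack pointer and
   the stack boundary cover what this function is known to consume? */
static bool
callee_has_enough_stack(uintptr_t sp, uintptr_t stack_boundary,
                        uint32_t func_idx)
{
    if (sp < stack_boundary)
        return false; /* already past the boundary */
    return sp - stack_boundary >= stack_sizes[func_idx];
}
```

Because the check reads the size from the table at run time, the compiled code does not need to be patched once the size is known.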

Re-implement function call stack usage estimation to reflect the real calling
conventions better. (aot_estimate_stack_usage_for_function_call)

Re-implement stack estimation logic (--enable-memory-profiling) based on the new
machinery.

Discussions: #2105.
2023-06-22 07:27:07 +08:00
92e073b8ce AOTFuncContext: Remove a stale comment (#2283) 2023-06-09 22:31:08 +08:00
cabcb177c8 dwarf_extractor: Constify a bit (#2278) 2023-06-09 09:52:03 +08:00
6e3c3fe9ec Fix build error with LLVM 16 (#2259) 2023-06-06 13:45:18 +08:00
5d69f364db aot/jit: Set module layout (#2260)
LLVM 15 and later sometimes perform wrong optimizations without this.
2023-06-06 10:18:16 +08:00
8ef09be604 Fix compile error of wamrc with llvm-13/llvm-14 (#2261) 2023-06-06 08:33:15 +08:00
8d88471c46 Implement AOT static PGO (#2243)
LLVM PGO (Profile-Guided Optimization) allows the compiler to better optimize code
for how it actually runs. This PR implements the AOT static PGO, and is tested on
Linux x86-64 and x86-32. The basic steps are:

1. Use `wamrc --enable-llvm-pgo -o <aot_file_of_pgo> <wasm_file>`
   to generate an instrumented aot file.
2. Compile iwasm with `cmake -DWAMR_BUILD_STATIC_PGO=1` and run
   `iwasm --gen-prof-file=<raw_profile_file> <aot_file_of_pgo>`
   to generate the raw profile file.
3. Run `llvm-profdata merge -output=<profile_file> <raw_profile_file>`
   to merge the raw profile file into the profile file.
4. Run `wamrc --use-prof-file=<profile_file> -o <aot_file> <wasm_file>`
   to generate the optimized aot file.
5. Run the optimized aot_file: `iwasm <aot_file>`.

Test scripts are also added for each benchmark; run `test_pgo.sh` under
each benchmark's folder to test the AOT static PGO.
2023-06-05 09:17:39 +08:00
76be848ec3 Implement the segue optimization for LLVM AOT/JIT (#2230)
Segue is an optimization technique that uses an x86 segment register to store
the WebAssembly linear memory base address. This removes most of the cost
of the SFI (Software-based Fault Isolation) base addition and frees up a
general-purpose register. As a result, it may:
- Improve the performance of JIT/AOT
- Reduce the footprint of JIT/AOT, the JIT/AOT code generated is smaller
- Reduce the compilation time of JIT/AOT

This PR uses the x86-64 GS segment register to apply the optimization. It currently
supports the linux and linux-sgx platforms on the x86-64 target. It is disabled by
default; developers can use the options below to enable it for wamrc and iwasm (with
LLVM JIT enabled):
```bash
wamrc --enable-segue=[<flags>] -o output_file wasm_file
iwasm --enable-segue=[<flags>] wasm_file [args...]
```
`flags` can be:
    i32.load, i64.load, f32.load, f64.load, v128.load,
    i32.store, i64.store, f32.store, f64.store, v128.store
Separate multiple flags with commas, e.g. `--enable-segue=i32.load,i64.store`.
A bare `--enable-segue` enables all flags.
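
As a rough sketch of how such a flag list could be turned into a bitmask (the bit assignments and the function name are assumptions for illustration, not wamrc's actual parser):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical bit order; the real constants may differ. */
static const char *const segue_flag_names[] = {
    "i32.load",  "i64.load",  "f32.load",  "f64.load",  "v128.load",
    "i32.store", "i64.store", "f32.store", "f64.store", "v128.store",
};
#define SEGUE_FLAG_COUNT \
    (sizeof(segue_flag_names) / sizeof(segue_flag_names[0]))
#define SEGUE_ALL_FLAGS ((1u << SEGUE_FLAG_COUNT) - 1u)

/* Parse the text that follows "--enable-segue": either "" (all flags)
   or "=flag,flag,...". Returns 0 on unknown or malformed input. */
static uint32_t
parse_segue_flags(const char *arg)
{
    uint32_t mask = 0;

    if (arg[0] == '\0')
        return SEGUE_ALL_FLAGS;
    if (arg[0] != '=')
        return 0;
    arg++;

    while (*arg != '\0') {
        size_t len = strcspn(arg, ","); /* length of the next token */
        size_t i;

        for (i = 0; i < SEGUE_FLAG_COUNT; i++) {
            if (strlen(segue_flag_names[i]) == len
                && strncmp(arg, segue_flag_names[i], len) == 0) {
                mask |= 1u << i;
                break;
            }
        }
        if (i == SEGUE_FLAG_COUNT)
            return 0; /* unknown flag */

        arg += len;
        if (*arg == ',')
            arg++;
    }
    return mask;
}
```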

Acknowledgement:
Many thanks to Intel Labs, UC San Diego and UT Austin teams for introducing this
technology and the great support and guidance!

Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Co-authored-by: Vahldiek-oberwagner, Anjo Lucas <anjo.lucas.vahldiek-oberwagner@intel.com>
2023-05-26 10:13:33 +08:00
94204b90ad aot_compile_op_call: Remove a wrong optimization (#2233)
Unlike a tail-call, the caller of an ordinary recursive call doesn't
necessarily return immediately.
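
The distinction can be illustrated with two hypothetical functions (the message does not spell out the removed optimization; this only shows why the two kinds of call differ):

```c
#include <stdint.h>

/* Tail call: the caller does nothing after the callee returns, so its
   frame is no longer needed once the call is made. */
static uint64_t
fact_tail(uint32_t n, uint64_t acc)
{
    if (n <= 1)
        return acc;
    return fact_tail(n - 1, acc * n);
}

/* Ordinary recursive call: the caller still has to multiply after the
   callee returns, so it does not return immediately. */
static uint64_t
fact_plain(uint32_t n)
{
    if (n <= 1)
        return 1;
    return n * fact_plain(n - 1);
}
```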
2023-05-25 07:44:54 +08:00
670567f8b3 core/iwasm/compilation: constify a bit (#2223)
Just to make the code a bit easier to read.
2023-05-20 11:55:02 +08:00
f759a1f960 A few changes related to WAMRC_LLC_COMPILER (#2218)
Print `target triple` for wamrc and set target triple for the LLVM module.
Also update the documentation.
2023-05-17 09:56:35 +08:00
2b896c80ef wamrc: Add --stack-usage option (#2158) 2023-04-28 13:56:44 +08:00
7e9bf9cdf5 Implement Fast JIT multi-threading feature (#2134)
- Translate all the opcodes of threads spec proposal for Fast JIT
- Add the atomic flag for Fast JIT load/store IRs to support atomic load/store
- Add new atomic related Fast JIT IRs and translate them in the codegen
- Add suspend_flags check in branch opcodes and before/after call function
- Modify CI to enable Fast JIT multi-threading test

Co-authored-by: TianlongLiang <tianlong.liang@intel.com>
2023-04-20 10:09:34 +08:00
62fc486c20 Refine aot compiler check suspend_flags and fix issue of multi-tier jit (#2111)
In the LLVM AOT/JIT compiler, check suspend_flags only when the memory is a
shared memory, since shared memory must be enabled for multi-threading. This
avoids a performance impact in non-multi-threading memory mode. Also
refine the LLVM IRs that check suspend_flags.

Also fix a multi-tier JIT issue with multi-threading: the instance of the child thread
should be removed from the instance list before it is de-instantiated.
2023-04-07 06:47:24 +08:00
f279ba84ee Fix multi-threading issues (#2013)
- Implement atomic.fence to ensure a proper memory synchronization order
- Destroy exec_env_singleton first in wasm/aot deinstantiation
- Wait for other threads instead of terminating them in
  wasm_exec_env_destroy
- Fix detach thread in thread_manager_start_routine
- Fix duplicated lock cluster->lock in wasm_cluster_cancel_thread
- Add lib-pthread and lib-wasi-threads compilation to Windows CI
2023-03-08 10:57:22 +08:00
38c67b3f48 thread-mgr: Fix spread "wasi proc exit" exception and atomic.wait issues (#1988)
Raising "wasi proc exit" exception, spreading it to other threads and then
clearing it in all threads may result in unexpected behavior: the sub thread
may end first, handle the "wasi proc exit" exception and clear exceptions
of other threads, including the main thread. When the main thread's
exception is cleared, it may continue to run and throw an "unreachable"
exception. This also leads to some assertion failures.

Ignore exception spreading for "wasi proc exit" and don't clear exception
of other threads to resolve the issue.

Also add a suspend flag check after atomic wait, since the atomic wait may
be notified by another thread when an exception occurs.
2023-02-24 20:05:39 +08:00
7d3b2a8773 Make memory profiling show native stack usage (#1917) 2023-02-01 11:52:15 +08:00
f818f4c43f Simplify fcmp intrinsic logic for AOT/XIP (#1881) 2023-01-12 12:05:53 +08:00
7401718311 Report error in instantiation when meeting unlinked import globals (#1859) 2023-01-06 15:24:11 +08:00
d5aa354d41 Return result directly if float cmp is called in AOT XIP (#1851) 2022-12-30 16:45:39 +08:00
ba5cdbee3a Fix typo verify_module in aot_compiler.c (#1836) 2022-12-26 12:24:23 +08:00
14288f59b0 Implement Multi-tier JIT (#1774)
Implement 2-level Multi-tier JIT engine: tier-up from Fast JIT to LLVM JIT to
get quick cold startup by Fast JIT and better performance by gradually
switching to LLVM JIT when the LLVM JIT functions are compiled by the
backend threads.

Refer to:
https://github.com/bytecodealliance/wasm-micro-runtime/issues/1302
2022-12-19 11:24:46 +08:00
9083334f69 Fix XIP issue of handling 64-bit const in 32-bit target (#1803)
- Handle i64 const like f64 const
- Ensure i64/f64 const is stored on 8-byte aligned address
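
Rounding an offset up to the next 8-byte boundary can be sketched as follows (the helper name is hypothetical; `align` is assumed to be a power of two):

```c
#include <stdint.h>

/* Round offset up to the next multiple of align (a power of two). */
static uint32_t
align_up(uint32_t offset, uint32_t align)
{
    return (offset + align - 1) & ~(align - 1);
}
```

For example, a 64-bit constant that would otherwise land at offset 13 is placed at `align_up(13, 8)`, i.e. offset 16.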
2022-12-13 12:45:26 +08:00
f6bef1e604 Implement i32.rem_s and i32.rem_u intrinsic (#1789) 2022-12-08 09:38:20 +08:00
1652f22a77 Fix issues reported by Coverity (#1775)
Fix some issues reported by Coverity and fix an issue with the Windows
exception check with guard page
2022-12-01 19:24:13 +08:00
ce3458da99 Refine AOT exception check when function return (#1752)
Refine the AOT exception check in the caller when returning from a callee function:
remove the exception check instructions when the hw bound check is enabled to
improve performance, and instead create a guard page to trigger the signal handler
when an exception occurs.
2022-11-30 20:18:28 +08:00
96570cca22 Remove unused LLVM JIT wrapper functions (#1747)
Only create the necessary wrapper functions for LLVM JIT
2022-11-25 11:26:08 +08:00
87c3195d47 Revert "Implement call Fast JIT function from LLVM JIT jitted code" (#1737)
Reverts bytecodealliance/wasm-micro-runtime#1714, which was merged mistakenly.
2022-11-22 14:04:48 +08:00
cf7b01ad82 Implement call Fast JIT function from LLVM JIT jitted code (#1714)
Basically implement the Multi-tier JIT engine.
Also update the documentation and the wamr-test-suites script.
2022-11-21 10:42:18 +08:00
6c16ff7654 Update document and clear compile warnings (#1701)
Update the document on building wasm apps: add how to set build flags for Rust
projects to reduce the footprint.

Clear Windows warnings and a shadow warning in aot_emit_numberic.c
2022-11-15 15:02:23 +08:00
c70e1ebc3d Avoid generating some unused LLVM IRs (#1696)
Refine the generated LLVM IRs at the beginning of each LLVM AOT/JIT function
to speed up the LLVM IR optimization:
- Only create argv_buf if there are func calls in this function
- Only create native stack bound if stack bound check is enabled
- Only create aux stack info if there is opcode set_global_aux_stack
- Only create native symbol if indirect_mode is enabled
- Only create memory info if there are memory operations
- Only create func_type_indexes if there is opcode call_indirect
2022-11-14 14:32:35 +08:00
4b0660cf24 Fix missing float cmp for XIP (#1699) 2022-11-14 11:58:38 +08:00
7fd37190e8 Add control for the native stack check with hardware trap (#1682)
Add new options to control the native stack hw bound check feature:
- Besides the original option `cmake -DWAMR_DISABLE_HW_BOUND_CHECK=1/0`,
  add a new option `cmake -DWAMR_DISABLE_STACK_HW_BOUND_CHECK=1/0`
- When the linear memory hw bound check is disabled, the stack hw bound check
   will be disabled automatically, no matter what the input option is
- When the linear memory hw bound check is enabled, the stack hw bound check
  is enabled/disabled according to the value of input option
- Besides the original option `--bounds-checks=1/0`, add a new option
  `--stack-bounds-checks=1/0` for wamrc
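
The interaction between the two cmake options can be summarized in a small sketch (the function and parameter names are hypothetical):

```c
#include <stdbool.h>

/* Decide whether the stack hw bound check ends up enabled, given the
   values of the two "disable" options. */
static bool
stack_hw_bound_check_enabled(bool disable_hw_bound_check,
                             bool disable_stack_hw_bound_check)
{
    /* Without the linear memory hw bound check, the stack hw bound
       check is always off, whatever the second option says. */
    if (disable_hw_bound_check)
        return false;
    return !disable_stack_hw_bound_check;
}
```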

Refer to: https://github.com/bytecodealliance/wasm-micro-runtime/issues/1677
2022-11-07 18:26:33 +08:00
c8cacbd883 Add LLVM_BUILD_OP_OR_INTRINSIC to avoid code dup (#1672) 2022-11-03 11:48:48 +08:00
5b144c491d Avoid initialize LLVM repeatedly (#1671)
Currently we initialize and destroy LLVM environment in aot_create_comp_context
and aot_destroy_comp_context, which are called in wasm_module_load/unload,
and the latter may be invoked multiple times, which leads to duplicated LLVM
initialization/destroy and may result in unexpected behaviors.

Move the LLVM init/destroy into runtime init/destroy to resolve the issue.
2022-11-02 16:13:58 +08:00
f1f6f4a125 Remove unused codes in AOT compiler (#1668)
Remove the setup of JIT LLVMOrcIRTransformLayerSetTransform and
LLVMOrcObjectTransformLayerSetTransform, which is commented out.
2022-11-02 08:32:16 +08:00
94cecbe4cb Fix XIP issues of fp to int cast and int rem/div (#1654) 2022-11-01 20:29:07 +08:00
e517dbc7b2 XIP adaptation for xtensa platform (#1636)
Add macro WASM_ENABLE_WORD_ALIGN_READ to enable reading
1/2/4 and n bytes of data from the vram buffer, which requires reads at
4-byte-aligned addresses.
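
A minimal sketch of a word-aligned byte read, assuming a little-endian target and a buffer that may safely be accessed as 32-bit words. The function name is hypothetical and this is not the code added by the PR:

```c
#include <stdint.h>

/* Read one byte using only a 4-byte-aligned 32-bit load. */
static uint8_t
read_u8_word_aligned(const uint8_t *p)
{
    uintptr_t addr = (uintptr_t)p;
    /* Round the address down to the containing aligned word. */
    const volatile uint32_t *word =
        (const volatile uint32_t *)(addr & ~(uintptr_t)3);
    /* On little-endian, byte k of the word sits at bits [8k, 8k+7]. */
    uint32_t shift = (uint32_t)(addr & 3) * 8;

    return (uint8_t)(*word >> shift);
}
```

Reads of 2, 4 or n bytes can be built on the same idea by combining bytes from one or more aligned words.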

Eliminate the XIP AOT relocations related to the following:
   i32_div_u, f32_min, f32_max, f32_ceil, f32_floor, f32_trunc, f32_rint
2022-10-31 17:25:24 +08:00
ef21f0c951 Implement Fast JIT dump call stack and perf profiling (#1633)
Implement dump call stack and perf profiling features for Fast JIT,
and refine some code.
2022-10-27 09:28:32 +08:00
4a1e522c53 Move indirect mode optimization to the last of LLVM pipelines (#1627)
The general optimizations may create some intrinsic function calls
such as llvm.memset, so move the indirect mode optimization after them
so that these function calls are removed last.

Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
2022-10-24 10:20:05 +08:00
1d4cbfceac Refine Fast JIT call indirect and call native process (#1620)
Translate call_indirect opcode by calling wasm functions with Fast JIT IRs instead of
calling jit_call_indirect runtime API, so as to improve the performance.

Translate call native function process with Fast JIT IRs to validate each pointer argument
and convert it into native address, and then call the native function directly instead
of calling jit_invoke_native runtime API, so as to improve the performance.
2022-10-19 17:11:38 +08:00