Commit Graph

149 Commits

Author SHA1 Message Date
4f6d70bc52 Use indirect call in pre-checker function to avoid relocation in XIP mode (#3142)
The stack profiler `aot_func#xxx` calls the wrapped function of `aot_func_internal#xxx`
by using symbol reference,  but in some platform like xtensa, it’s translated into a native
long call, which needs to resolve the indirect address by relocation and breaks the XIP
feature which requires the eliminating of relocation.

The solution is to change the symbol reference into an indirect call through the lookup
table, the code will be like this:
```llvm
call_wrapped_func:                                ; preds = %stack_bound_check_block
  %func_addr1 = getelementptr inbounds ptr, ptr %func_ptrs_ptr, i32 75
  %func_tmp2 = load ptr, ptr %func_addr1, align 4
  tail call void %func_tmp2(ptr %exec_env)
  ret void
```
2024-02-27 11:17:57 +08:00
16a4d71b34 Implement GC (Garbage Collection) feature for interpreter, AOT and LLVM-JIT (#3125)
Implement the GC (Garbage Collection) feature for interpreter mode,
AOT mode and LLVM-JIT mode, and support most features of the latest
spec proposal, and also enable the stringref feature.

Use `cmake -DWAMR_BUILD_GC=1/0` to enable/disable the feature,
and `wamrc --enable-gc` to generate the AOT file with GC supported.

And update the AOT file version from 2 to 3 since there are many AOT
ABI breaks, including the changes of AOT file format, the changes of
AOT module/memory instance layouts, the AOT runtime APIs for the
AOT code to invoke and so on.
2024-02-06 20:47:11 +08:00
a27ddece7f Always allocate linear memory using mmap (#3052)
With this approach we can omit using memset() for the newly allocated memory
therefore the physical pages are not being used unless touched by the program.

This also simplifies the implementation.
2024-02-02 22:17:44 +08:00
99bbad8cdb perf profiling: Adjust the calculation of execution time (#3089) 2024-01-26 18:06:21 +08:00
9fb5fcc709 Add comments to suppress warning from clang-tidy (#3088)
Suppress style warnings for macro definition, name of these macros is
inconsistent with others (upper case).
2024-01-26 17:02:24 +08:00
313ce8cb61 Fix memory/table segment checks in memory.init/table.init (#3081)
According to the wasm core spec, the checks for the table segments in
`table.init` opcode are similar to the checks for `memory.init` opcode:
- The size of a passive segment is shrunk to zero after `data.drop`
  (or `elem.drop`) opcode is executed, and the segment can be used to do
  `memory.init` (or `table.init`) again
- The `memory.init` only traps when `s+n > len(data.data)` or `d+n > len(mem.data)`
  and `table.init` only traps when `s+n > len(elem.elem)` or `d+n > len(tab.elem)`
- The active segment can also be used to do `memory.init` (or `table.init`),
  while it behaves like a dropped passive segment

https://github.com/WebAssembly/bulk-memory-operations/blob/master/proposals/bulk-memory-operations/Overview.md
```
Segments can also be shrunk to size zero by using the following new instructions:
- data.drop: discard the data in an data segment
- elem.drop: discard the data in an element segment

An active segment is equivalent to a passive segment, but with an implicit
memory.init followed by a data.drop (or table.init followed by a elem.drop)
that is prepended to the module's start function.
```
ps.
https://webassembly.github.io/spec/core/bikeshed/#-hrefsyntax-instr-memorymathsfmemoryinitx%E2%91%A0
https://webassembly.github.io/spec/core/bikeshed/#-hrefsyntax-instr-tablemathsftableinitxy%E2%91%A0
https://github.com/bytecodealliance/wasm-micro-runtime/issues/3020
2024-01-26 09:45:59 +08:00
9bcf6b4dd3 Enable quick aot entry when hw bound check is disabled (#3044)
- Enable quick aot entry when hw bound check is disabled
- Remove unnecessary ret_type argument in the quick aot entries
- Declare detailed prototype of aot function to call in each quick aot entry
2024-01-19 08:55:35 +08:00
5c8b8a17a6 Enhancements on wasm function execution time statistic (#2985)
Enhance the statistic of wasm function execution time, or the performance
profiling feature:
- Add os_time_thread_cputime_us() to get the cputime of a thread,
  and use it to calculate the execution time of a wasm function
- Support the statistic of the children execution time of a function,
  and dump it in wasm_runtime_dump_perf_profiling
- Expose two APIs:
  wasm_runtime_sum_wasm_exec_time
  wasm_runtime_get_wasm_func_exec_time

And rename os_time_get_boot_microsecond to os_time_get_boot_us.
2024-01-17 09:51:54 +08:00
ffa131b5ac Allow using mmap for shared memory if hw bound check is disabled (#3029)
For shared memory, the max memory size must be defined in advanced. Re-allocation
for growing memory can't be used as it might change the base address, therefore when
OS_ENABLE_HW_BOUND_CHECK is enabled the memory is mmaped, and if the flag is
disabled, the memory is allocated. This change introduces a flag that allows users to use
mmap for reserving memory address space even if the OS_ENABLE_HW_BOUND_CHECK
is disabled.
2024-01-16 22:15:55 +08:00
b21f17dd6d Refine AOT/JIT code call wasm-c-api import process (#2982)
Allow to invoke the quick call entry wasm_runtime_quick_invoke_c_api_import to
call the wasm-c-api import functions to speedup the calling process, which reduces
the data copying.

Use `wamrc --invoke-c-api-import` to generate the optimized AOT code, and set
`jit_options->quick_invoke_c_api_import` true in wasm_engine_new when LLVM JIT
is enabled.
2024-01-10 18:37:02 +08:00
7c7684819d Register quick call entries to speedup the aot/jit func call process (#2978)
In some scenarios there may be lots of callings to AOT/JIT functions from the
host embedder, which expects good performance for the calling process, while
in the current implementation, runtime calls the wasm_runtime_invoke_native
to prepare the array of registers and stacks for the invokeNative assemble code,
and the latter then puts the elements in the array to physical registers and
native stacks and calls the AOT/JIT function, there may be many data copying
and handlings which impact the performance.

This PR registers some quick AOT/JIT entries for some simple wasm signatures,
and let runtime call the entry to directly invoke the AOT/JIT function instead of
calling wasm_runtime_invoke_native, which speedups the calling process.

We may extend the mechanism next to allow the developer to register his quick
AOT/JIT entries to speedup the calling process of invoking the AOT/JIT functions
for some specific signatures.
2024-01-10 16:44:09 +08:00
aa4d68c2af Refine AOT function call process (#2940)
Don't set exec_env's thread handle and stack boundary in the recursive
calling from host, since they have been initialized in the first time calling.
2024-01-02 19:10:31 +08:00
4d5eb346fc Change is_shared_memory type from bool to uint8 (#2800)
Change WASMMemoryInstance's field is_shared_memory's type from bool
to uint8 whose size is fixed, so as to make WASMMemoryInstance's size
and layout fixed and not break AOT ABI.

See discussion in https://github.com/bytecodealliance/wasm-micro-runtime/pull/2682.
2023-11-22 10:38:08 +08:00
9ad42290d8 Fix formatting in wasm_dump_perf_profiling (#2799)
Changes %d to %PRIu32.
2023-11-20 18:06:35 +08:00
aebe30426f Fix formatting in aot_dump_perf_profiling (#2796)
Changes %d to %PRIu32.
2023-11-20 12:11:02 +08:00
562a5dd1b6 Fix data/elem drop (#2747)
Currently, `data.drop` instruction is implemented by directly modifying the
underlying module. It breaks use cases where you have multiple instances
sharing a single loaded module. `elem.drop` has the same problem too.

This PR  fixes the issue by keeping track of which data/elem segments have
been dropped by using bitmaps for each module instances separately, and
add a sample to demonstrate the issue and make the CI run it.

Also add a missing check of dropped elements to the fast-jit `table.init`.

Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2735
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2772
2023-11-18 08:50:16 +08:00
24aa1cb408 Extend os_mmap to support map file from fd (#2763)
Add an extra argument `os_file_handle file` for `os_mmap` to support
mapping file from a file fd, and remove `os_get_invalid_handle` from
`posix_file.c` and `win_file.c`, instead, add it in the `platform_internal.h`
files to remove the dependency on libc-wasi.

Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
2023-11-16 08:28:54 +08:00
24c4d256b3 Grab cluster->lock when modifying exec_env->module_inst (#2685)
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2680

And when switching back to the original module_inst, propagate exception if any.

cf. https://github.com/bytecodealliance/wasm-micro-runtime/issues/2512
2023-11-09 18:56:02 +08:00
7f8292ffd1 Add more buffer boundary checks in wasm loader (#2734)
And fix exception not printed in `iwasm --repl` mode and resize the memory
data size to UINT32_MAX if the initial page number is 65536.
2023-11-09 08:42:05 +08:00
68a627ea2c Fix several AOT compiler issues (#2697)
- Fix potential invalid push param phis and add incoming phis to a un-existed basic block
- Fix potential invalid shift count int rotl/rotr opcodes
- Resize memory_data_size to UINT32_MAX if it is 4G when hw bound check is enabled
- Fix negative linear memory offset is used for 64-bit target it is const and larger than INT32_MAX
2023-11-02 20:36:21 +08:00
52db362b89 Refine lock/unlock shared memory (#2682)
Split memory instance's field `uint32 ref_count` into `bool is_shared_memory`
and `uint16 ref_count`, and lock the memory only when `is_shared_memory`
flag is true, no need to acquire a lock for non-shared memory when shared
memory feature is enabled.
2023-10-31 11:46:03 +08:00
4b1a6e5017 Fix repeatedly initialize shared memory data and protect the memory's fields (#2673)
Avoid repeatedly initializing the shared memory data when creating the child
thread in lib-pthread or lib-wasi-threads.

Add shared memory lock when accessing some fields of the memory instance
if the memory instance is shared.

Init shared memory's memory_data_size/memory_data_end fields according to
the current page count but not max page count.

Add wasm_runtime_set_mem_bound_check_bytes, and refine the error message
when shared memory flag is found but the feature isn't enabled.
2023-10-30 11:07:01 +08:00
00539620e9 Improve stack trace dump and fix coding guideline CI (#2599)
Avoid the stack traces getting mixed up together when multi-threading is enabled
by using exception_lock/unlock in dumping the call stacks.

And remove duplicated call stack dump in wasm_application.c.

Also update coding guideline CI to fix the clang-format-12 not found issue.
2023-09-29 10:52:54 +08:00
79b27c1934 Support muti-module for AOT mode (#2482)
Support muti-module for AOT mode, currently only implement the
multi-module's function import feature for AOT, the memory/table/
global import are not implemented yet.

And update wamr-test-suites scripts, multi-module sample and some
CIs accordingly.
2023-09-28 08:56:11 +08:00
6c846acc59 Implement module instance context APIs (#2436)
Introduce module instance context APIs which can set one or more contexts created
by the embedder for a wasm module instance:
```C
    wasm_runtime_create_context_key
    wasm_runtime_destroy_context_key
    wasm_runtime_set_context
    wasm_runtime_set_context_spread
    wasm_runtime_get_context
```

And make libc-wasi use it and set wasi context as the first context bound to the wasm
module instance.

Also add samples.

Refer to https://github.com/bytecodealliance/wasm-micro-runtime/issues/2460.
2023-09-07 14:54:11 +08:00
709127d631 Add callback to handle memory.grow failures (#2522)
When embedding WAMR, this PR allows to register a callback that is
invoked when memory.grow fails.

In case of memory allocation failures, some languages allow to handle
the error (e.g. by checking the return code of malloc/calloc in C), some
others (e.g. Rust) just panic.
2023-09-05 16:41:52 +08:00
5c6613b2b1 Correct --heap-size option in messages (#2458) 2023-08-14 15:12:59 +08:00
51714c41c0 Introduce WASMModuleInstanceExtraCommon (#2429)
Move the common parts of WASMModuleInstanceExtra and
AOTModuleInstanceExtra into the new structure.
2023-08-08 09:35:29 +08:00
91592429f4 Fix memory sharing (#2415)
- Inherit shared memory from the parent instance, instead of
  trying to look it up by the underlying module. The old method
  works correctly only when every cluster uses different module.
- Use reference count in WASMMemoryInstance/AOTMemoryInstance
  to mark whether the memory is shared or not
- Retire WASMSharedMemNode
- For atomic opcode implementations in the interpreters, use
  a global lock for now
- Update the internal API users
  (wasi-threads, lib-pthread, wasm_runtime_spawn_thread)

Fixes https://github.com/bytecodealliance/wasm-micro-runtime/issues/1962
2023-08-04 10:18:13 +08:00
28125ec538 Move wasm_runtime_destroy_wasi and wasi_nn_destroy calls together (#2418)
And remove obsolete comment.
2023-08-03 08:46:56 +08:00
59b2099b68 Fix some check issues on table operations (#2392)
Fix some check issues on table.init, table.fill and table.copy, and unify the check method
for all running modes.
Fix issue #2390 and #2096.
2023-07-27 21:53:48 +08:00
c39eb46b6f Add a few more assertions on structures to which aot abi is sensitive (#2326) 2023-06-30 10:22:46 +08:00
03418ef5ac aot: Avoid possible relocations around "stack_sizes" for XIP mode (#2322)
Fixes https://github.com/bytecodealliance/wasm-micro-runtime/issues/2316

Lightly tested on riscv64 qemu.
2023-06-29 18:45:33 +08:00
ab96e01f5e wasi-nn: Add support of wasi-nn as shared lib (#2310)
## Context

Currently, WAMR supports compiling iwasm with flag `WAMR_BUILD_WASI_NN`.
However, there are scenarios where the user might prefer having it as a shared library.

## Proposed Changes

Decouple wasi-nn context management by internally managing the context given
a module instance reference.
2023-06-27 18:18:26 +08:00
7ec77598dd Fix format warning by PRIu32 in [wasm|aot] dump call stack (#2251) 2023-06-11 11:30:25 +08:00
8d88471c46 Implement AOT static PGO (#2243)
LLVM PGO (Profile-Guided Optimization) allows the compiler to better optimize code
for how it actually runs. This PR implements the AOT static PGO, and is tested on
Linux x86-64 and x86-32. The basic steps are:

1. Use `wamrc --enable-llvm-pgo -o <aot_file_of_pgo> <wasm_file>`
   to generate an instrumented aot file.
2. Compile iwasm with `cmake -DWAMR_BUILD_STATIC_PGO=1` and run
      `iwasm --gen-prof-file=<raw_profile_file> <aot_file_of_pgo>`
    to generate the raw profile file.
3. Run `llvm-profdata merge -output=<profile_file> <raw_profile_file>`
    to merge the raw profile file into the profile file.
4. Run `wamrc --use-prof-file=<profile_file> -o <aot_file> <wasm_file>`
    to generate the optimized aot file.
5. Run the optimized aot_file: `iwasm <aot_file>`.

The test scripts are also added for each benchmark, run `test_pgo.sh` under
each benchmark's folder to test the AOT static pgo.
2023-06-05 09:17:39 +08:00
76be848ec3 Implement the segue optimization for LLVM AOT/JIT (#2230)
Segue is an optimization technology which uses x86 segment register to store
the WebAssembly linear memory base address, so as to remove most of the cost
of SFI (Software-based Fault Isolation) base addition and free up a general
purpose register, by this way it may:
- Improve the performance of JIT/AOT
- Reduce the footprint of JIT/AOT, the JIT/AOT code generated is smaller
- Reduce the compilation time of JIT/AOT

This PR uses the x86-64 GS segment register to apply the optimization, currently
it supports linux and linux-sgx platforms on x86-64 target. By default it is disabled,
developer can use the option below to enable it for wamrc and iwasm(with LLVM
JIT enabled):
```bash
wamrc --enable-segue=[<flags>] -o output_file wasm_file
iwasm --enable-segue=[<flags>] wasm_file [args...]
```
`flags` can be:
    i32.load, i64.load, f32.load, f64.load, v128.load,
    i32.store, i64.store, f32.store, f64.store, v128.store
Use comma to separate them, e.g. `--enable-segue=i32.load,i64.store`,
and `--enable-segue` means all flags are added.

Acknowledgement:
Many thanks to Intel Labs, UC San Diego and UT Austin teams for introducing this
technology and the great support and guidance!

Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Co-authored-by: Vahldiek-oberwagner, Anjo Lucas <anjo.lucas.vahldiek-oberwagner@intel.com>
2023-05-26 10:13:33 +08:00
b0736e2e88 Fix issues reported by Coverity (#2083)
Get exec_env_tls at the beginning of execute_post_instantiate_functions
to avoid it is uninitialized when is_sub_inst is false.
2023-03-29 19:40:52 +08:00
10f1bf3af7 Fix module_malloc/module_free issues (#2072)
Try using existing exec_env to execute wasm app's malloc/free func and
execute post instantiation functions. Create a new exec_env only when
no existing exec_env was found.
2023-03-28 18:31:09 +08:00
3977f0b22a Use pre-created exec_env for instantiation and module_malloc/free (#2047)
Use pre-created exec_env for instantiation and module_malloc/free,
use the same exec_env of the current thread to avoid potential
unexpected behavior.

And remove unnecessary shared_mem_lock in wasm_module_free,
which may cause dead lock.
2023-03-23 19:19:47 +08:00
bab2402b6e Fix atomic.wait, get wasi_ctx exit code and thread mgr issues (#2024)
- Remove notify_stale_threads_on_exception and change atomic.wait
  to be interruptible by keep waiting and checking every one second,
  like the implementation of poll_oneoff in libc-wasi
- Wait all other threads exit and then get wasi exit_code to avoid
  getting invalid value
- Inherit suspend_flags of parent thread while creating new thread to
  avoid terminated flag isn't set for new thread
- Fix wasi-threads test case update_shared_data_and_alloc_heap
- Add "Lib wasi-threads enabled" prompt for cmake
- Fix aot get exception, use aot_copy_exception instead
2023-03-15 07:47:36 +08:00
f279ba84ee Fix multi-threading issues (#2013)
- Implement atomic.fence to ensure a proper memory synchronization order
- Destroy exec_env_singleton first in wasm/aot deinstantiation
- Change terminate other threads to wait for other threads in
  wasm_exec_env_destroy
- Fix detach thread in thread_manager_start_routine
- Fix duplicated lock cluster->lock in wasm_cluster_cancel_thread
- Add lib-pthread and lib-wasi-threads compilation to Windows CI
2023-03-08 10:57:22 +08:00
d2772c4153 Re-org calling post instantiation functions (#1972)
- Use execute_post_instantiate_functions to call start, _initialize,
  __post_instantiate, __wasm_call_ctors functions after instantiation
- Always call start function for both main instance and sub instance
- Only call _initialize and __post_instantiate for main instance
- Only call ___wasm_call_ctors for main instance and when bulk memory
  is enabled and wasi import functions are not found
- When hw bound check is enabled, use the existing exec_env_tls
  to call func for sub instance, and switch exec_env_tls's module inst
  to current module inst to avoid checking failure and using the wrong
  module inst
2023-02-22 12:24:11 +08:00
ef3a683392 Don't call start/initialize in child thread's instantiation (#1967)
The start/initialize functions of wasi module are to do some initialization work
during instantiation, which should be only called one time in the instantiation
of main instance. For example, they may initialize the data in linear memory,
if the data is changed later by the main instance, and re-initialized again by
the child instance, unexpected behaviors may occur.

And clear a shadow warning in classic interpreter.
2023-02-17 15:11:05 +08:00
7d3b2a8773 Make memory profiling show native stack usage (#1917) 2023-02-01 11:52:15 +08:00
9eed6686df Refactor WASI-NN to simplify the support for multiple frameworks (#1834)
- Reorganize the library structure
- Use the latest version of `wasi-nn` wit (Oct 25, 2022):
    0f77c48ec1/wasi-nn.wit.md
- Split logic that converts WASM structs to native structs in a separate file
- Simplify addition of new frameworks
2023-01-25 18:32:40 +08:00
cadf9d0ad3 Main thread spread exception when thread-mgr is enabled (#1889)
And refactor clear_wasi_proc_exit_exception, refer to
https://github.com/bytecodealliance/wasm-micro-runtime/pull/1869
2023-01-20 08:54:27 +08:00
622cdbefd6 Prevent undefined behavior from c_api_func_imports == NULL (#1883)
The module instance's c_api_func_imports may be NULL under some circumstances,
add checks before accessing it.
2023-01-14 07:52:39 +08:00
1f4580fbd8 Enable CI wasi test suite for x86-32 classic/fast interpreter (#1866)
The original CI didn't actually run wasi test suite for x86-32 since the `TEST_ON_X86_32=true`
isn't written into $GITHUB_ENV.

And refine the error output when failed to link import global.
2023-01-06 17:31:39 +08:00
7401718311 Report error in instantiation when meeting unlinked import globals (#1859) 2023-01-06 15:24:11 +08:00