Commit Graph

56 Commits

Author SHA1 Message Date
8a55a1e7a1 Shared heap enhancements for Interpreter and AOT (#4400)
Propose two enhancements:

- Shared heap created from preallocated memory buffer: The user can create a shared heap from a pre-allocated buffer and see that memory region as one large chunk; there's no need to dynamically manage it(malloc/free). The user needs to make sure the native address and size of that memory region are valid.
- Introduce shared heap chain: The user can create a shared heap chain, from the wasm app point of view, it's still a continuous memory region in wasm app's point of view while in the native it can consist of multiple shared heaps (each of which is a continuous memory region). For example, one 500MB shared heap 1 and one 500 MB shared heap 2 form a chain, in Wasm's point of view, it's one 1GB shared heap.

After these enhancements, the data sharing between wasm apps, and between hosts can be more efficient and flexible. Admittedly shared heap management can be more complex for users, but it's similar to the zero-overhead principle. No overhead will be imposed for the users who don't use the shared heap enhancement or don't use the shared heap at all.
2025-07-04 10:44:51 +08:00
TL
f1ffbb5b37 cr suggestions 2025-02-24 21:20:07 +00:00
TL
dfcadc6202 prevent data overflow on 32 bit platform for memory.grow 2025-02-24 21:20:07 +00:00
e6a47d5cee In wasm32, fix potential conversion overflow when enlarging 65536 pages (#4064)
fix enlarge 65536 pages conversion overflow in wasm32
2025-02-06 14:48:53 +08:00
376385c608 Update memory allocation functions to use allocator user data (#4043) 2025-02-06 13:15:00 +08:00
c7b2683f17 Fix out of bounds issue in is_native_addr_in_shared_heap function (#3886)
When checking for integer overflow, you may often write tests like p + i < p.
This works fine if p and i are unsigned integers, since any overflow in the
addition will cause the value to simply "wrap around." However, using this
pattern when p is a pointer is problematic because pointer overflow has
undefined behavior according to the C and C++ standards. If the addition
overflows and has an undefined result, the comparison will likewise be
undefined; it may produce an unintended result, or may be deleted entirely
by an optimizing compiler.
2024-10-31 12:44:55 +08:00
a3960c834d Refine looking up aot function with index (#3882)
* Refine looking up aot function with index

* refine the code
2024-10-29 10:58:11 +08:00
19160f0e1d Fix issues of destroy_shared_heaps (#3847)
Set shared_heap_list to NULL with lock, and destroy shared_heap_list_lock.

Signed-off-by: wenlingyun1 <wenlingyun1@xiaomi.com>
2024-10-10 10:56:36 +08:00
9ba36e284c Implement shared heap for AOT (#3815) 2024-09-29 12:50:59 +08:00
1fd422ae6e Merge branch main into dev/shared_heap 2024-09-20 14:34:00 +08:00
4dacef2d60 shared heap: Fix some issues and add basic unit test case (#3801)
Fix some issues and add basic unit test case for shared heap feature.

Signed-off-by: wenlingyun1 <wenlingyun1@xiaomi.com>
2024-09-20 14:24:38 +08:00
5e20cf383e Refactor shared heap feature for interpreter mode (#3794)
To add test cases and samples.
2024-09-18 14:53:41 +08:00
92852f3719 Implement a first version of shared heap feature (#3789)
ps. https://github.com/bytecodealliance/wasm-micro-runtime/issues/3546
2024-09-14 10:51:42 +08:00
926f662231 Add memory instance support apis (#3786)
Now that WAMR supports multiple memory instances, this PR adds some APIs
to access them in a standard way.

This involves moving some existing utility functions out from the
`WASM_ENABLE_MULTI_MODULE` blocks they were nested in, but multi-memory
and multi-module seem independent as far as I can tell so I assume that's okay.

APIs added:
```C
wasm_runtime_lookup_memory
wasm_runtime_get_default_memory
wasm_runtime_get_memory
wasm_memory_get_cur_page_count
wasm_memory_get_max_page_count
wasm_memory_get_bytes_per_page
wasm_memory_get_shared
wasm_memory_get_base_address
wasm_memory_enlarge
```
2024-09-14 10:31:13 +08:00
f453d9d5ce Appease GCC strict prototypes warning (#3775) 2024-09-10 09:42:23 +08:00
1329e1d3e1 Add support for multi-memory proposal in classic interpreter (#3742)
Implement multi-memory for classic-interpreter. Support core spec (and bulk memory) opcodes now,
and will support atomic opcodes, and add multi-memory export APIs in the future. 

PS: Multi-memory spec test patched a lot for linking test to adapt for multi-module implementation.
2024-08-21 12:22:23 +08:00
73caf19e69 Fix compile warnings/error reported in Windows (#3616)
Clear some compile warnings and fix undefined reference error for symbol ffs
in Windows platform.
2024-07-12 16:43:22 +08:00
837ff63662 Use 64-bit wasm_runtime_enlarge_memory() increment (#3573)
ps.
https://github.com/bytecodealliance/wasm-micro-runtime/pull/3569#discussion_r1654398315
2024-06-27 14:34:43 +08:00
74dbafc699 Export API wasm_runtime_enlarge_memory (#3569)
Export API wasm_runtime_enlarge_memory to support memory growth.
2024-06-26 11:07:16 +08:00
4c2af25aff aot compiler: Use larger alignment for load/store when possible (#3552)
Consider the following wasm module:
```wast
(module
  (func (export "foo")
    i32.const 0x104
    i32.const 0x12345678
    i32.store
  )
  (memory 1 1)
)
```

While the address (0x104) is perfectly aligned for i32.store,
as our aot compiler uses 1-byte alignment for load/store LLVM
IR instructions, it often produces inefficient machine code,
especially for alignment-sensitive targets.

For example, the above "foo" function is compiled into the
following xtensa machine code.
```
0000002c <aot_func_internal#0>:
  2c:   004136          entry   a1, 32
  2f:   07a182          movi    a8, 0x107
  32:   828a            add.n   a8, a2, a8
  34:   291c            movi.n  a9, 18
  36:   004892          s8i     a9, a8, 0
  39:   06a182          movi    a8, 0x106
  3c:   828a            add.n   a8, a2, a8
  3e:   ffff91          l32r    a9, 3c <aot_func_internal#0+0x10> (ff91828a <aot_func_internal#0+0xff91825e>)
                        3e: R_XTENSA_SLOT0_OP   .literal+0x8
  41:   004892          s8i     a9, a8, 0
  44:   05a182          movi    a8, 0x105
  47:   828a            add.n   a8, a2, a8
  49:   ffff91          l32r    a9, 48 <aot_func_internal#0+0x1c> (ffff9182 <aot_func_internal#0+0xffff9156>)
                        49: R_XTENSA_SLOT0_OP   .literal+0xc
  4c:   41a890          srli    a10, a9, 8
  4f:   0048a2          s8i     a10, a8, 0
  52:   04a182          movi    a8, 0x104
  55:   828a            add.n   a8, a2, a8
  57:   004892          s8i     a9, a8, 0
  5a:   f01d            retw.n
```

Note that the each four bytes are stored separately using
one-byte-store instruction, s8i.

This commit tries to use larger alignments for load/store LLVM IR
instructions when possible.  with this commit, the above example is
compiled into the following machine code, which seems more reasonable.
```
0000002c <aot_func_internal#0>:
  2c:   004136          entry   a1, 32
  2f:   ffff81          l32r    a8, 2c <aot_func_internal#0> (81004136 <aot_func_internal#0+0x8100410a>)
                        2f: R_XTENSA_SLOT0_OP   .literal+0x8
  32:   416282          s32i    a8, a2, 0x104
  35:   f01d            retw.n
```

Note: this doesn't work well for --xip because aot_load_const_from_table()
hides the constness of the value. Maybe we need our own mechanism to
propagate the constness and the value.
2024-06-22 10:32:52 +08:00
0418041b0d wasm_memory.c: Fix typo: hasn't been initialize -> hasn't been initialized (#3547) 2024-06-19 17:02:09 +08:00
40c41d5110 Fix several issues reported by oss-fuzz (#3526)
- possible integer overflow in adjust_table_max_size:
  unsigned integer overflow: 2684354559 * 2 cannot be represented in type 'uint32'
- limit max memory size in wasm_runtime_malloc
- add more checks in aot loader
- adjust compilation options
2024-06-13 16:06:36 +08:00
fe5e7a9981 Implement Memory64 support for AOT (#3362)
Refer to:
https://github.com/bytecodealliance/wasm-micro-runtime/pull/3266
https://github.com/bytecodealliance/wasm-micro-runtime/issues/3091
2024-05-13 11:03:38 +08:00
ba59e56e19 User defined memory allocator for different purposes (#3316)
Some issues are related with memory fragmentation, which may cause
the linear memory cannot be allocated. In WAMR, the memory managed
by the system is often trivial, but linear memory usually directly allocates
a large block and often remains unchanged for a long time. Their sensitivity
and contribution to fragmentation are different, which is suitable for
different allocation strategies. If we can control the linear memory's allocation,
do not make it from system heap, the overhead of heap management might
be avoided.

Add `mem_alloc_usage_t usage` as the first argument for user defined
malloc/realloc/free functions when `WAMR_BUILD_ALLOC_WITH_USAGE` cmake
variable is set as 1, and make passing `Alloc_For_LinearMemory` to the
argument when allocating the linear memory.
2024-04-18 19:40:57 +08:00
2013f1f7d7 Fix warnings/issues reported in Windows and by CodeQL/Coverity (#3275)
Fix the warnings and issues reported:
- in Windows platform
- by CodeQL static code analyzing
- by Coverity static code analyzing

And update CodeQL script to build exception handling and memory features.
2024-04-07 11:57:31 +08:00
a23fa9f86c Implement memory64 for classic interpreter (#3266)
Adding a new cmake flag (cache variable) `WAMR_BUILD_MEMORY64` to enable
the memory64 feature, it can only be enabled on the 64-bit platform/target and
can only use software boundary check. And when it is enabled, it can support both
i32 and i64 linear memory types. The main modifications are:

- wasm loader & mini-loader: loading and bytecode validating process 
- wasm runtime: memory instantiating process
- classic-interpreter: wasm code executing process
- Support memory64 memory in related runtime APIs
- Modify main function type check when it's memory64 wasm file
- Modify `wasm_runtime_invoke_native` and `wasm_runtime_invoke_native_raw` to
  handle registered native function pointer argument when memory64 is enabled
- memory64 classic-interpreter spec test in `test_wamr.sh` and in CI

Currently, it supports memory64 memory wasm file that uses core spec
(including bulk memory proposal) opcodes and threads opcodes.

ps.
https://github.com/bytecodealliance/wasm-micro-runtime/issues/3091
https://github.com/bytecodealliance/wasm-micro-runtime/pull/3240
https://github.com/bytecodealliance/wasm-micro-runtime/pull/3260
2024-04-02 15:22:07 +08:00
c3e33a96ea Remove unused argument in wasm_runtime_lookup_function and refactor WASMModuleInstance (#3218)
Remove the unused parameter `signature` from `wasm_runtime_lookup_function`.

Refactor the layout of WASMModuleInstance structure:
- move common data members `c_api_func_imports` and `cur_exec_env` from
  `WASMModuleInstanceExtraCommon` to `WASMModuleInstance`
- In `WASMModuleInstance`, enlarge `reserved[3]` to `reserved[5]` in case that
  we need to add more fields in the future

ps.
https://github.com/bytecodealliance/wasm-micro-runtime/issues/2530
https://github.com/bytecodealliance/wasm-micro-runtime/issues/3202
2024-03-13 12:28:45 +08:00
ce44e0ec0c Allow converting the zero wasm address to native (#3215)
This allows to know the beginning of the wasm address space. At the moment
to achieve that, we need to apply a `hack wasm_runtime_addr_app_to_native(X)-X`
to get the beginning of WASM memory in the nativ code, but I don't see a good
reason why not to allow zero address as a parameter value for this function.
2024-03-12 17:46:11 +08:00
0ee5ffce85 Refactor APIs and data structures as preliminary work for Memory64 (#3209)
# Change the data type representing linear memory address from u32 to u64

## APIs signature changes
- (Export)wasm_runtime_module_malloc
  - wasm_module_malloc
    - wasm_module_malloc_internal
  - aot_module_malloc
    - aot_module_malloc_internal
- wasm_runtime_module_realloc
  - wasm_module_realloc
    - wasm_module_realloc_internal
  - aot_module_realloc
    - aot_module_realloc_internal
- (Export)wasm_runtime_module_free
  - wasm_module_free
    - wasm_module_free_internal
  - aot_module_malloc
    - aot_module_free_internal
- (Export)wasm_runtime_module_dup_data
  - wasm_module_dup_data
  - aot_module_dup_data
- (Export)wasm_runtime_validate_app_addr
- (Export)wasm_runtime_validate_app_str_addr
- (Export)wasm_runtime_validate_native_addr
- (Export)wasm_runtime_addr_app_to_native
- (Export)wasm_runtime_addr_native_to_app
- (Export)wasm_runtime_get_app_addr_range
- aot_set_aux_stack
- aot_get_aux_stack
- wasm_set_aux_stack
- wasm_get_aux_stack
- aot_check_app_addr_and_convert, wasm_check_app_addr_and_convert
  and jit_check_app_addr_and_convert
- wasm_exec_env_set_aux_stack
- wasm_exec_env_get_aux_stack
- wasm_cluster_create_thread
- wasm_cluster_allocate_aux_stack
- wasm_cluster_free_aux_stack

## Data structure changes
- WASMModule and AOTModule
  - field aux_data_end, aux_heap_base and aux_stack_bottom
- WASMExecEnv
  - field aux_stack_boundary and aux_stack_bottom
- AOTCompData
  - field aux_data_end, aux_heap_base and aux_stack_bottom
- WASMMemoryInstance(AOTMemoryInstance)
  - field memory_data_size and change __padding to is_memory64
- WASMModuleInstMemConsumption
  - field total_size and memories_size
- WASMDebugExecutionMemory
  - field start_offset and current_pos
- WASMCluster
  - field stack_tops

## Components that are affected by the APIs and data structure changes
- libc-builtin
- libc-emcc
- libc-uvwasi
- libc-wasi
- Python and Go Language Embedding
- Interpreter Debug engine
- Multi-thread: lib-pthread, wasi-threads and thread manager
2024-03-12 11:38:50 +08:00
1429d8cc03 Fix inconsistent coding convention (#3171) 2024-02-22 10:40:50 +08:00
e792c35822 Fix null pointer access in fast-interp when configurable soft bound check is enabled (#3150)
The wasm_interp_call_func_bytecode is called for the first time with the empty
module/exec_env to generate a global_handle_table. Before that happens though,
the function checks if the module instance has bounds check enabled. Because
the module instance is null, the program crashes. This PR added an extra check to
prevent the crashes.
2024-02-14 17:18:37 +08:00
a27ddece7f Always allocate linear memory using mmap (#3052)
With this approach we can omit using memset() for the newly allocated memory
therefore the physical pages are not being used unless touched by the program.

This also simplifies the implementation.
2024-02-02 22:17:44 +08:00
0455071fc1 Access linear memory size atomically (#2834)
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2804
2023-11-29 20:27:17 +08:00
0b29904f26 Fix configurable bounds checks typo (#2809) 2023-11-21 17:32:45 +08:00
52db362b89 Refine lock/unlock shared memory (#2682)
Split memory instance's field `uint32 ref_count` into `bool is_shared_memory`
and `uint16 ref_count`, and lock the memory only when `is_shared_memory`
flag is true, no need to acquire a lock for non-shared memory when shared
memory feature is enabled.
2023-10-31 11:46:03 +08:00
4b1a6e5017 Fix repeatedly initialize shared memory data and protect the memory's fields (#2673)
Avoid repeatedly initializing the shared memory data when creating the child
thread in lib-pthread or lib-wasi-threads.

Add shared memory lock when accessing some fields of the memory instance
if the memory instance is shared.

Init shared memory's memory_data_size/memory_data_end fields according to
the current page count but not max page count.

Add wasm_runtime_set_mem_bound_check_bytes, and refine the error message
when shared memory flag is found but the feature isn't enabled.
2023-10-30 11:07:01 +08:00
83db970953 Add user to enlarge memory error callback (#2546) 2023-09-13 18:03:49 +08:00
709127d631 Add callback to handle memory.grow failures (#2522)
When embedding WAMR, this PR allows to register a callback that is
invoked when memory.grow fails.

In case of memory allocation failures, some languages allow to handle
the error (e.g. by checking the return code of malloc/calloc in C), some
others (e.g. Rust) just panic.
2023-09-05 16:41:52 +08:00
91592429f4 Fix memory sharing (#2415)
- Inherit shared memory from the parent instance, instead of
  trying to look it up by the underlying module. The old method
  works correctly only when every cluster uses different module.
- Use reference count in WASMMemoryInstance/AOTMemoryInstance
  to mark whether the memory is shared or not
- Retire WASMSharedMemNode
- For atomic opcode implementations in the interpreters, use
  a global lock for now
- Update the internal API users
  (wasi-threads, lib-pthread, wasm_runtime_spawn_thread)

Fixes https://github.com/bytecodealliance/wasm-micro-runtime/issues/1962
2023-08-04 10:18:13 +08:00
8518197053 Remove a few unused functions (#2409)
They have been unused since commit 5fc48e3584
2023-07-31 19:14:44 +08:00
18092f86cc Make memory access boundary check behavior configurable (#2289)
Allow to use `cmake -DWAMR_CONFIGURABLE_BOUNDS_CHECKS=1` to
build iwasm, and then run `iwasm --disable-bounds-checks` to disable the
memory access boundary checks.

And add two APIs:
`wasm_runtime_set_bounds_checks` and `wasm_runtime_is_bounds_checks_enabled`
2023-07-04 16:21:30 +08:00
76be848ec3 Implement the segue optimization for LLVM AOT/JIT (#2230)
Segue is an optimization technology which uses x86 segment register to store
the WebAssembly linear memory base address, so as to remove most of the cost
of SFI (Software-based Fault Isolation) base addition and free up a general
purpose register, by this way it may:
- Improve the performance of JIT/AOT
- Reduce the footprint of JIT/AOT, the JIT/AOT code generated is smaller
- Reduce the compilation time of JIT/AOT

This PR uses the x86-64 GS segment register to apply the optimization, currently
it supports linux and linux-sgx platforms on x86-64 target. By default it is disabled,
developer can use the option below to enable it for wamrc and iwasm(with LLVM
JIT enabled):
```bash
wamrc --enable-segue=[<flags>] -o output_file wasm_file
iwasm --enable-segue=[<flags>] wasm_file [args...]
```
`flags` can be:
    i32.load, i64.load, f32.load, f64.load, v128.load,
    i32.store, i64.store, f32.store, f64.store, v128.store
Use comma to separate them, e.g. `--enable-segue=i32.load,i64.store`,
and `--enable-segue` means all flags are added.

Acknowledgement:
Many thanks to Intel Labs, UC San Diego and UT Austin teams for introducing this
technology and the great support and guidance!

Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Co-authored-by: Vahldiek-oberwagner, Anjo Lucas <anjo.lucas.vahldiek-oberwagner@intel.com>
2023-05-26 10:13:33 +08:00
e8d718096d Add/reorganize locks for thread synchronization (#1995)
Attempt to fix data races when using threads.
- Protect access (from multiple threads) to exception and memory
- Fix shared memory lock usage
2023-03-04 08:15:26 +08:00
3e8927a31b Adding option to pass user data to allocator functions (#1765)
Add an option to pass user data to the allocator functions. It is common to
do this so that the host embedder can pass a struct as user data and access
that struct from the allocator, which gives the host embedder the ability to
do things such as track allocation statistics within the allocator.

Compile with `cmake -DWASM_MEM_ALLOC_WITH_USER_DATA=1` to enable
the option, and the allocator functions provided by the host embedder should
be like below (an extra argument `data` is added):
void *malloc(void *data, uint32 size) { .. }
void *realloc(void *data, uint32 size) { .. }
void free(void *data, void *ptr) { .. }

Signed-off-by: Andrew Chambers <ncham@amazon.com>
2022-11-30 16:19:18 +08:00
a182926a73 Refactor interpreter/AOT module instance layout (#1559)
Refactor the layout of interpreter and AOT module instance:
- Unify the interp/AOT module instance, use the same WASMModuleInstance/
  WASMMemoryInstance/WASMTableInstance data structures for both interpreter
  and AOT
- Make the offset of most fields the same in module instance for both interpreter
  and AOT, append memory instance structure, global data and table instances to
  the end of module instance for interpreter mode (like AOT mode)
- For extra fields in WASM module instance, use WASMModuleInstanceExtra to
  create a field `e` for interpreter
- Change the LLVM JIT module instance creating process, LLVM JIT uses the WASM
  module and module instance same as interpreter/Fast-JIT mode. So that Fast JIT
  and LLVM JIT can access the same data structures, and make it possible to
  implement the Multi-tier JIT (tier-up from Fast JIT to LLVM JIT) in the future
- Unify some APIs: merge some APIs for module instance and memory instance's
  related operations (only implement one copy)

Note that the AOT ABI is same, the AOT file format, AOT relocation types, how AOT
code accesses the AOT module instance and so on are kept unchanged.

Refer to:
https://github.com/bytecodealliance/wasm-micro-runtime/issues/1384
2022-10-18 10:59:28 +08:00
d095876ae6 Enable memory leak check (#1429)
Report the memory leak info when building iwasm with
`cmake .. -DWAMR_BUILD_GC_VERIFY=1`
2022-09-01 16:15:00 +08:00
77c516ac80 Add a new API to get free memory in memory pool (#1430) 2022-08-31 16:39:25 +08:00
d62543c99c Enlarge max pool size and fix bh_memcpy_s dest max size check (#1151)
Enlarge max pool size and fix bh_memcpy_s dest max size check to support
large linear memory, e.g. with initial page count 65535.
2022-05-07 16:09:16 +08:00
52b6c73d9c Apply clang-format for more src files and update spec test script (#775)
Apply clang-format for core/iwasm/include, core/iwasm/common and
core/iwasm/aot files.

Update spec cases test script:
- Checkout latest commit of https://github.com/WebAssembly/spec
- Checkout main branch but not master of https://github.com/WebAssembly/threads
- Update wabt to latest version

And update source debugging document.

Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
2021-10-08 17:47:11 +08:00
64b5459066 Implement Windows thread/mutex/cond APIs to support multi-thread (#627)
Implement Windows thread/mutex/cond related APIs to support Windows multi-thread feature
Change Windows HW boundary check implementation for multi-thread: change SEH to VEH
Fix wasm-c-api issue of getting AOTFunctionInstance by index, fix wasm-c-api compile warnings
Enable to build invokeNative_general.c with cmake variable
Fix several issues in lib-pthread
Disable two LLVM passes in multi-thread mode to reserve volatile semantic
Update docker script and document to build iwasm with Docker image

Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
2021-05-11 16:48:49 +08:00