Propose two enhancements:
- Shared heap created from preallocated memory buffer: The user can create a shared heap from a pre-allocated buffer and see that memory region as one large chunk; there's no need to dynamically manage it(malloc/free). The user needs to make sure the native address and size of that memory region are valid.
- Introduce shared heap chain: The user can create a shared heap chain, from the wasm app point of view, it's still a continuous memory region in wasm app's point of view while in the native it can consist of multiple shared heaps (each of which is a continuous memory region). For example, one 500MB shared heap 1 and one 500 MB shared heap 2 form a chain, in Wasm's point of view, it's one 1GB shared heap.
After these enhancements, the data sharing between wasm apps, and between hosts can be more efficient and flexible. Admittedly shared heap management can be more complex for users, but it's similar to the zero-overhead principle. No overhead will be imposed for the users who don't use the shared heap enhancement or don't use the shared heap at all.
while ideally a user should not need to care this kind of
optimization details, in reality i guess it's sometimes useful.
both of clang and GCC expose a similar option. (-fno-jump-tables)
- Implement TINY / STANDARD frame modes - tiny mode is only able to keep track on the IP
and func idx, STANDARD mode provides more capabilities (parameters, stack pointer etc.).
- Implement FRAME_PER_FUNCTION / FRAME_PER_CALL modes - frame per function adds
code at the beginning and at the end of each function for allocating / deallocating stack frame,
whereas in per-call mode the frame is allocated before each call. The exception is call to
the imported function, where frame-per-function mode also allocates the stack before the
`call` instruction (as it can't instrument the imported function).
At the moment TINY + FRAME_PER_FUNCTION is automatically enabled in case GC and perf
profiling are disabled and `values` call stack feature is not requested. In all the other cases
STANDARD + FRAME_PER_CALL is used.
STANDARD + FRAME_PER_FUNCTION and TINY + FRAME_PER_CALL are currently not
implemented but possible, and might be enabled in the future.
ps. https://github.com/bytecodealliance/wasm-micro-runtime/issues/3758
Implement the GC (Garbage Collection) feature for interpreter mode,
AOT mode and LLVM-JIT mode, and support most features of the latest
spec proposal, and also enable the stringref feature.
Use `cmake -DWAMR_BUILD_GC=1/0` to enable/disable the feature,
and `wamrc --enable-gc` to generate the AOT file with GC supported.
And update the AOT file version from 2 to 3 since there are many AOT
ABI breaks, including the changes of AOT file format, the changes of
AOT module/memory instance layouts, the AOT runtime APIs for the
AOT code to invoke and so on.