- Implement TINY / STANDARD frame modes - tiny mode is only able to keep track on the IP
and func idx, STANDARD mode provides more capabilities (parameters, stack pointer etc.).
- Implement FRAME_PER_FUNCTION / FRAME_PER_CALL modes - frame per function adds
code at the beginning and at the end of each function for allocating / deallocating stack frame,
whereas in per-call mode the frame is allocated before each call. The exception is call to
the imported function, where frame-per-function mode also allocates the stack before the
`call` instruction (as it can't instrument the imported function).
At the moment TINY + FRAME_PER_FUNCTION is automatically enabled in case GC and perf
profiling are disabled and `values` call stack feature is not requested. In all the other cases
STANDARD + FRAME_PER_CALL is used.
STANDARD + FRAME_PER_FUNCTION and TINY + FRAME_PER_CALL are currently not
implemented but possible, and might be enabled in the future.
ps. https://github.com/bytecodealliance/wasm-micro-runtime/issues/3758
In the AOT compiler, allow the user to control stack boundary check when the boundary
check is enabled (e.g. `wamrc --bounds-checks=1`). Now the code logic is:
1. When `--stack-bounds-checks` is not set, it will be the same value as `--bounds-checks`.
2. When `--stack-bounds-checks` is set, it will be the option value no matter what the
status of `--bounds-checks` is.
Make wamrc normalize "arm64" to "aarch64v8". Previously the only way to
make the "arm64" target was to not specify a target on 64 bit arm-based
mac builds. Now arm64 and aarch64v8 are treated as the same.
Make aot_loader accept "aarch64v8" on arm-based apple (as well as
accepting legacy "arm64" based aot targets).
This also removes __APPLE__ and __MACH__ from the block that defaults
size_level to 1 since it doesn't seem to be supported for aarch64:
`LLVM ERROR: Only small, tiny and large code models are allowed on AArch64`
Implement multi-memory for classic-interpreter. Support core spec (and bulk memory) opcodes now,
and will support atomic opcodes, and add multi-memory export APIs in the future.
PS: Multi-memory spec test patched a lot for linking test to adapt for multi-module implementation.
For JIT, we naturally use mach-o on macOS, where the section name
we currently use is not valid and ends up with the errors like:
```
LLVM ERROR: Global variable '__orc_lcl.aot_stack_sizes.0' has an invalid section specifier '.aot_stack_sizes': mach-o section specifier requires a segment and section separated by a comma.
```
Because the dedicated section is not necessary for JIT,
this commit simply stops using it.
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/3730
The table index in the call_indirect/return_call_indirect opcode should be
one byte 0x00 when ref-types/GC isn't enabled, and should be treated as
leb u32 when ref-types/GC is enabled.
And make aot compiler bail out if ref-types/GC is disabled by command line
argument while ref-types instructions are used.
If the value of a float constant is an NaN, the aot compiler creates an alloca,
stores the converted i32 const into it and then loads f32 from it again, which
may introduce a relocation in the AOT file and is not allowed for XIP mode.
aot_load_const_from_table() hides the const-ness of the
value and prevents optimizations like
https://github.com/bytecodealliance/wasm-micro-runtime/pull/3552.
This commit makes the aot compiler tracks the const-ness
of the value directly in the AOTValue and enables the above
mentioned optimization for XIP.
* I believe that LLVM MemoryBuffer interface is supposed to be read-only
and it's allowed to use eg. read-only mmap of the underlying file.
It isn't appropriate to modify the view at all.
* in case of WASM_ENABLE_DEBUG_AOT, the whole buffer is written as the text
section of the aot file. the modified e_type would confuse dwarf consumers.
note that, even when we are using XIP, the debug info usually contains
relocations. for example, llvm-dwarfdump doesn't seem to perform relocations
on .debug_info section for ET_CORE (== 4 == our E_TYPE_XIP) objects.
Consider the following wasm module:
```wast
(module
(func (export "foo")
i32.const 0x104
i32.const 0x12345678
i32.store
)
(memory 1 1)
)
```
While the address (0x104) is perfectly aligned for i32.store,
as our aot compiler uses 1-byte alignment for load/store LLVM
IR instructions, it often produces inefficient machine code,
especially for alignment-sensitive targets.
For example, the above "foo" function is compiled into the
following xtensa machine code.
```
0000002c <aot_func_internal#0>:
2c: 004136 entry a1, 32
2f: 07a182 movi a8, 0x107
32: 828a add.n a8, a2, a8
34: 291c movi.n a9, 18
36: 004892 s8i a9, a8, 0
39: 06a182 movi a8, 0x106
3c: 828a add.n a8, a2, a8
3e: ffff91 l32r a9, 3c <aot_func_internal#0+0x10> (ff91828a <aot_func_internal#0+0xff91825e>)
3e: R_XTENSA_SLOT0_OP .literal+0x8
41: 004892 s8i a9, a8, 0
44: 05a182 movi a8, 0x105
47: 828a add.n a8, a2, a8
49: ffff91 l32r a9, 48 <aot_func_internal#0+0x1c> (ffff9182 <aot_func_internal#0+0xffff9156>)
49: R_XTENSA_SLOT0_OP .literal+0xc
4c: 41a890 srli a10, a9, 8
4f: 0048a2 s8i a10, a8, 0
52: 04a182 movi a8, 0x104
55: 828a add.n a8, a2, a8
57: 004892 s8i a9, a8, 0
5a: f01d retw.n
```
Note that the each four bytes are stored separately using
one-byte-store instruction, s8i.
This commit tries to use larger alignments for load/store LLVM IR
instructions when possible. with this commit, the above example is
compiled into the following machine code, which seems more reasonable.
```
0000002c <aot_func_internal#0>:
2c: 004136 entry a1, 32
2f: ffff81 l32r a8, 2c <aot_func_internal#0> (81004136 <aot_func_internal#0+0x8100410a>)
2f: R_XTENSA_SLOT0_OP .literal+0x8
32: 416282 s32i a8, a2, 0x104
35: f01d retw.n
```
Note: this doesn't work well for --xip because aot_load_const_from_table()
hides the constness of the value. Maybe we need our own mechanism to
propagate the constness and the value.
The compilation errors were introduced by #3515 and occur in debug building
when wasm mini loader is compiled or GC is enabled.
And remove two wasm files in standalone test-running-modes case,
which will be generated by run.sh.
Probably it's better to skip the optimized out parameters.
(that is, parameters w/o locations)
However, I'm not sure how/if it can be done with the lldb api.
For now, just disable parameter processing to avoid crashes.
While band-aid fixes like this is not plausible IMO,
some people prefer to have some debug info even if it's
partial/limited/broken. This commit partially (re)enables
C++ processing. On the other hand, do not bother to process
variables because it's known incompatible with C++.
- Add declarations related to emitting AOT files into a separate header file
named `aot_emit_aot_file.h`
- Add API `aot_emit_aot_file_buf_ex` and refactor `aot_emit_aot_file_buf` to call it
- Expose some APIs in aot_export.h
Support to get `wasm_memory_type_t memory_type` from API
`wasm_runtime_get_import_type` and `wasm_runtime_get_export_type`,
and then get shared flag, initial page cout, maximum page count
from the memory_type:
```C
bool
wasm_memory_type_get_shared(const wasm_memory_type_t memory_type);
uint32_t
wasm_memory_type_get_init_page_count(const wasm_memory_type_t memory_type);
uint32_t
wasm_memory_type_get_max_page_count(const wasm_memory_type_t memory_type);
```
Fix several issues in wasm loader:
- Parse a block's type index with leb int32 instead leb uint32
- Correct dst dynamic offset of loop block arguments for opcode br
when copying the stack operands to the arguments of loop block
- Free each frame_csp's param_frame_offsets when destroy loader ctx
- Fix compilation error in wasm_mini_loader.c
- Add test cases of failed issues
This PR fixes issue #3467 and #3468.
Support getting global type from `wasm_runtime_get_import_type` and
`wasm_runtime_get_export_type`, and add two APIs:
```C
wasm_valkind_t
wasm_global_type_get_valkind(const wasm_global_type_t global_type);
bool
wasm_global_type_get_mutable(const wasm_global_type_t global_type);
```
Enhance the GC subtyping checks:
- Fix issues in the type equivalence check
- Enable the recursive type subtyping check
- Add a equivalence type flag in defined types of aot file, if there is an
equivalence type before, just set it true and re-use the previous type
- Normalize the defined types for interpreter and AOT
- Enable spec test case type-equivalence.wast and type-subtyping.wast,
and enable some commented cases
- Enable set WAMR_BUILD_SANITIZER from cmake variable
- Add new API wasm_runtime_load_ex() in wasm_export.h
and wasm_module_new_ex in wasm_c_api.h
- Put aot_create_perf_map() into a separated file aot_perf_map.c
- In perf.map, function names include user specified module name
- Enhance the script to help flamegraph generations
Fix the warnings and issues reported:
- in Windows platform
- by CodeQL static code analyzing
- by Coverity static code analyzing
And update CodeQL script to build exception handling and memory features.
This reverts commit 0e8d949440.
Because it doesn't make much sense anymore after we disabled debug info
processing on C++ functions in:
"aot debug: process lldb_function_to_function_dbi only for C".