- Clear some compile warnings
- Fix some typos
- Fix llvm LICENSE link error
- Remove unused aot file and binarydump bin
- Add checks when loading AOT exports
Add table64 extension(in Memory64 proposal) support in classic-interp
and AOT running modes, currently still use uint32 to represent table's
initial and maximum size to keep AOT ABI unchanged.
- Implement TINY / STANDARD frame modes - tiny mode is only able to keep track on the IP
and func idx, STANDARD mode provides more capabilities (parameters, stack pointer etc.).
- Implement FRAME_PER_FUNCTION / FRAME_PER_CALL modes - frame per function adds
code at the beginning and at the end of each function for allocating / deallocating stack frame,
whereas in per-call mode the frame is allocated before each call. The exception is call to
the imported function, where frame-per-function mode also allocates the stack before the
`call` instruction (as it can't instrument the imported function).
At the moment TINY + FRAME_PER_FUNCTION is automatically enabled in case GC and perf
profiling are disabled and `values` call stack feature is not requested. In all the other cases
STANDARD + FRAME_PER_CALL is used.
STANDARD + FRAME_PER_FUNCTION and TINY + FRAME_PER_CALL are currently not
implemented but possible, and might be enabled in the future.
ps. https://github.com/bytecodealliance/wasm-micro-runtime/issues/3758
In the AOT compiler, allow the user to control stack boundary check when the boundary
check is enabled (e.g. `wamrc --bounds-checks=1`). Now the code logic is:
1. When `--stack-bounds-checks` is not set, it will be the same value as `--bounds-checks`.
2. When `--stack-bounds-checks` is set, it will be the option value no matter what the
status of `--bounds-checks` is.
Make wamrc normalize "arm64" to "aarch64v8". Previously the only way to
make the "arm64" target was to not specify a target on 64 bit arm-based
mac builds. Now arm64 and aarch64v8 are treated as the same.
Make aot_loader accept "aarch64v8" on arm-based apple (as well as
accepting legacy "arm64" based aot targets).
This also removes __APPLE__ and __MACH__ from the block that defaults
size_level to 1 since it doesn't seem to be supported for aarch64:
`LLVM ERROR: Only small, tiny and large code models are allowed on AArch64`
Implement multi-memory for classic-interpreter. Support core spec (and bulk memory) opcodes now,
and will support atomic opcodes, and add multi-memory export APIs in the future.
PS: Multi-memory spec test patched a lot for linking test to adapt for multi-module implementation.
For JIT, we naturally use mach-o on macOS, where the section name
we currently use is not valid and ends up with the errors like:
```
LLVM ERROR: Global variable '__orc_lcl.aot_stack_sizes.0' has an invalid section specifier '.aot_stack_sizes': mach-o section specifier requires a segment and section separated by a comma.
```
Because the dedicated section is not necessary for JIT,
this commit simply stops using it.
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/3730
The table index in the call_indirect/return_call_indirect opcode should be
one byte 0x00 when ref-types/GC isn't enabled, and should be treated as
leb u32 when ref-types/GC is enabled.
And make aot compiler bail out if ref-types/GC is disabled by command line
argument while ref-types instructions are used.
If the value of a float constant is an NaN, the aot compiler creates an alloca,
stores the converted i32 const into it and then loads f32 from it again, which
may introduce a relocation in the AOT file and is not allowed for XIP mode.
aot_load_const_from_table() hides the const-ness of the
value and prevents optimizations like
https://github.com/bytecodealliance/wasm-micro-runtime/pull/3552.
This commit makes the aot compiler tracks the const-ness
of the value directly in the AOTValue and enables the above
mentioned optimization for XIP.
* I believe that LLVM MemoryBuffer interface is supposed to be read-only
and it's allowed to use eg. read-only mmap of the underlying file.
It isn't appropriate to modify the view at all.
* in case of WASM_ENABLE_DEBUG_AOT, the whole buffer is written as the text
section of the aot file. the modified e_type would confuse dwarf consumers.
note that, even when we are using XIP, the debug info usually contains
relocations. for example, llvm-dwarfdump doesn't seem to perform relocations
on .debug_info section for ET_CORE (== 4 == our E_TYPE_XIP) objects.
Consider the following wasm module:
```wast
(module
(func (export "foo")
i32.const 0x104
i32.const 0x12345678
i32.store
)
(memory 1 1)
)
```
While the address (0x104) is perfectly aligned for i32.store,
as our aot compiler uses 1-byte alignment for load/store LLVM
IR instructions, it often produces inefficient machine code,
especially for alignment-sensitive targets.
For example, the above "foo" function is compiled into the
following xtensa machine code.
```
0000002c <aot_func_internal#0>:
2c: 004136 entry a1, 32
2f: 07a182 movi a8, 0x107
32: 828a add.n a8, a2, a8
34: 291c movi.n a9, 18
36: 004892 s8i a9, a8, 0
39: 06a182 movi a8, 0x106
3c: 828a add.n a8, a2, a8
3e: ffff91 l32r a9, 3c <aot_func_internal#0+0x10> (ff91828a <aot_func_internal#0+0xff91825e>)
3e: R_XTENSA_SLOT0_OP .literal+0x8
41: 004892 s8i a9, a8, 0
44: 05a182 movi a8, 0x105
47: 828a add.n a8, a2, a8
49: ffff91 l32r a9, 48 <aot_func_internal#0+0x1c> (ffff9182 <aot_func_internal#0+0xffff9156>)
49: R_XTENSA_SLOT0_OP .literal+0xc
4c: 41a890 srli a10, a9, 8
4f: 0048a2 s8i a10, a8, 0
52: 04a182 movi a8, 0x104
55: 828a add.n a8, a2, a8
57: 004892 s8i a9, a8, 0
5a: f01d retw.n
```
Note that the each four bytes are stored separately using
one-byte-store instruction, s8i.
This commit tries to use larger alignments for load/store LLVM IR
instructions when possible. with this commit, the above example is
compiled into the following machine code, which seems more reasonable.
```
0000002c <aot_func_internal#0>:
2c: 004136 entry a1, 32
2f: ffff81 l32r a8, 2c <aot_func_internal#0> (81004136 <aot_func_internal#0+0x8100410a>)
2f: R_XTENSA_SLOT0_OP .literal+0x8
32: 416282 s32i a8, a2, 0x104
35: f01d retw.n
```
Note: this doesn't work well for --xip because aot_load_const_from_table()
hides the constness of the value. Maybe we need our own mechanism to
propagate the constness and the value.
The compilation errors were introduced by #3515 and occur in debug building
when wasm mini loader is compiled or GC is enabled.
And remove two wasm files in standalone test-running-modes case,
which will be generated by run.sh.
Probably it's better to skip the optimized out parameters.
(that is, parameters w/o locations)
However, I'm not sure how/if it can be done with the lldb api.
For now, just disable parameter processing to avoid crashes.
While band-aid fixes like this is not plausible IMO,
some people prefer to have some debug info even if it's
partial/limited/broken. This commit partially (re)enables
C++ processing. On the other hand, do not bother to process
variables because it's known incompatible with C++.
- Add declarations related to emitting AOT files into a separate header file
named `aot_emit_aot_file.h`
- Add API `aot_emit_aot_file_buf_ex` and refactor `aot_emit_aot_file_buf` to call it
- Expose some APIs in aot_export.h
Support to get `wasm_memory_type_t memory_type` from API
`wasm_runtime_get_import_type` and `wasm_runtime_get_export_type`,
and then get shared flag, initial page cout, maximum page count
from the memory_type:
```C
bool
wasm_memory_type_get_shared(const wasm_memory_type_t memory_type);
uint32_t
wasm_memory_type_get_init_page_count(const wasm_memory_type_t memory_type);
uint32_t
wasm_memory_type_get_max_page_count(const wasm_memory_type_t memory_type);
```
Fix several issues in wasm loader:
- Parse a block's type index with leb int32 instead leb uint32
- Correct dst dynamic offset of loop block arguments for opcode br
when copying the stack operands to the arguments of loop block
- Free each frame_csp's param_frame_offsets when destroy loader ctx
- Fix compilation error in wasm_mini_loader.c
- Add test cases of failed issues
This PR fixes issue #3467 and #3468.
Support getting global type from `wasm_runtime_get_import_type` and
`wasm_runtime_get_export_type`, and add two APIs:
```C
wasm_valkind_t
wasm_global_type_get_valkind(const wasm_global_type_t global_type);
bool
wasm_global_type_get_mutable(const wasm_global_type_t global_type);
```
Enhance the GC subtyping checks:
- Fix issues in the type equivalence check
- Enable the recursive type subtyping check
- Add a equivalence type flag in defined types of aot file, if there is an
equivalence type before, just set it true and re-use the previous type
- Normalize the defined types for interpreter and AOT
- Enable spec test case type-equivalence.wast and type-subtyping.wast,
and enable some commented cases
- Enable set WAMR_BUILD_SANITIZER from cmake variable
- Add new API wasm_runtime_load_ex() in wasm_export.h
and wasm_module_new_ex in wasm_c_api.h
- Put aot_create_perf_map() into a separated file aot_perf_map.c
- In perf.map, function names include user specified module name
- Enhance the script to help flamegraph generations