aot compiler: Use larger alignment for load/store when possible (#3552)
Consider the following wasm module:
```wast
(module
(func (export "foo")
i32.const 0x104
i32.const 0x12345678
i32.store
)
(memory 1 1)
)
```
While the address (0x104) is perfectly aligned for i32.store,
as our aot compiler uses 1-byte alignment for load/store LLVM
IR instructions, it often produces inefficient machine code,
especially for alignment-sensitive targets.
For example, the above "foo" function is compiled into the
following xtensa machine code.
```
0000002c <aot_func_internal#0>:
2c: 004136 entry a1, 32
2f: 07a182 movi a8, 0x107
32: 828a add.n a8, a2, a8
34: 291c movi.n a9, 18
36: 004892 s8i a9, a8, 0
39: 06a182 movi a8, 0x106
3c: 828a add.n a8, a2, a8
3e: ffff91 l32r a9, 3c <aot_func_internal#0+0x10> (ff91828a <aot_func_internal#0+0xff91825e>)
3e: R_XTENSA_SLOT0_OP .literal+0x8
41: 004892 s8i a9, a8, 0
44: 05a182 movi a8, 0x105
47: 828a add.n a8, a2, a8
49: ffff91 l32r a9, 48 <aot_func_internal#0+0x1c> (ffff9182 <aot_func_internal#0+0xffff9156>)
49: R_XTENSA_SLOT0_OP .literal+0xc
4c: 41a890 srli a10, a9, 8
4f: 0048a2 s8i a10, a8, 0
52: 04a182 movi a8, 0x104
55: 828a add.n a8, a2, a8
57: 004892 s8i a9, a8, 0
5a: f01d retw.n
```
Note that the each four bytes are stored separately using
one-byte-store instruction, s8i.
This commit tries to use larger alignments for load/store LLVM IR
instructions when possible. with this commit, the above example is
compiled into the following machine code, which seems more reasonable.
```
0000002c <aot_func_internal#0>:
2c: 004136 entry a1, 32
2f: ffff81 l32r a8, 2c <aot_func_internal#0> (81004136 <aot_func_internal#0+0x8100410a>)
2f: R_XTENSA_SLOT0_OP .literal+0x8
32: 416282 s32i a8, a2, 0x104
35: f01d retw.n
```
Note: this doesn't work well for --xip because aot_load_const_from_table()
hides the constness of the value. Maybe we need our own mechanism to
propagate the constness and the value.
This commit is contained in:
@ -883,6 +883,12 @@ wasm_enlarge_memory_internal(WASMModuleInstance *module, uint32 inc_page_count)
|
||||
}
|
||||
#endif /* end of WASM_MEM_ALLOC_WITH_USAGE */
|
||||
|
||||
/*
|
||||
* AOT compiler assumes at least 8 byte alignment.
|
||||
* see aot_check_memory_overflow.
|
||||
*/
|
||||
bh_assert(((uintptr_t)memory->memory_data & 0x7) == 0);
|
||||
|
||||
memory->num_bytes_per_page = num_bytes_per_page;
|
||||
memory->cur_page_count = total_page_count;
|
||||
memory->max_page_count = max_page_count;
|
||||
@ -1032,5 +1038,11 @@ wasm_allocate_linear_memory(uint8 **data, bool is_shared_memory,
|
||||
#endif
|
||||
}
|
||||
|
||||
/*
|
||||
* AOT compiler assumes at least 8 byte alignment.
|
||||
* see aot_check_memory_overflow.
|
||||
*/
|
||||
bh_assert(((uintptr_t)*data & 0x7) == 0);
|
||||
|
||||
return BHT_OK;
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user