Implement performance profiler and call stack dump, and update toolchain document (#501)

And remove redundant FAST_INTERP macros in wasm_interp_fast.c, and fix wamrc --help wrong line order issue.

Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
This commit is contained in:
Wenyong Huang
2021-01-17 23:23:10 -06:00
committed by GitHub
parent 794028a968
commit 240ca2ed46
26 changed files with 752 additions and 109 deletions

View File

@ -48,8 +48,14 @@ cmake -DWAMR_BUILD_PLATFORM=linux -DWAMR_BUILD_TARGET=ARM
- **WAMR_BUILD_CUSTOM_NAME_SECTION**=1/0, load the function name from custom name section, default to disable if not set
#### **Enable dump call stack feature**
- **WAMR_BUILD_DUMP_CALL_STACK**=1/0, default to disable if not set
> Note: if it is enabled, the call stack will be dumped when exception occurs.
> - For interpreter mode, the function names are firstly extracted from *custom name section*, if this section doesn't exist or the feature is not enabled, then the name will be extracted from the import/export sections
> - For AoT/JIT mode, the function names are extracted from import/export section, please export as many functions as possible (for `wasi-sdk` you can use `-Wl,--export-all`) when compiling wasm module, and add `--enable-dump-call-stack` option to wamrc during compiling AoT module.
#### **Enable Multi-Module feature**
- **WAMR_BUILD_MULTI_MODULE**=1/0, default to disable if not set
@ -79,6 +85,12 @@ cmake -DWAMR_BUILD_PLATFORM=linux -DWAMR_BUILD_TARGET=ARM
> Note: if it is enabled, developer can use API `void wasm_runtime_dump_mem_consumption(wasm_exec_env_t exec_env)` to dump the memory consumption info.
Currently we only profile the memory consumption of module, module_instance and exec_env, the memory consumed by other components such as `wasi-ctx`, `multi-module` and `thread-manager` are not included.
#### **Enable performance profiling (Experiment)**
- **WAMR_BUILD_PERF_PROFILING**=1/0, default to disable if not set
> Note: if it is enabled, developer can use API `void wasm_runtime_dump_perf_profiling(wasm_module_inst_t module_inst)` to dump the performance consumption info. Currently we only profile the performance consumption of each WASM function.
> The function name searching sequence is the same with dump call stack feature.
#### **Set maximum app thread stack size**
- **WAMR_APP_THREAD_STACK_SIZE_MAX**=n, default to 8 MB (8388608) if not set
> Note: the AOT boundary check with hardware trap mechanism might consume large stack since the OS may lazily grow the stack mapping as a guard page is hit, we may use this configuration to reduce the total stack usage, e.g. -DWAMR_APP_THREAD_STACK_SIZE_MAX=131072 (128 KB).

View File

@ -2,12 +2,16 @@
# Prepare WASM building environments
WASI-SDK version 8.0+ is the major tool supported by WAMR to build WASM applications. There are some other WASM compilers such as the standard clang compiler and Emscripten might also work [here](./other_wasm_compilers.md).
For C and C++, WASI-SDK version 12.0+ is the major tool supported by WAMR to build WASM applications. Also we can use [Emscripten SDK (EMSDK)](https://github.com/emscripten-core/emsdk), but it is not recommended. And there are some other compilers such as the standard clang compiler, which might also work [here](./other_wasm_compilers.md).
Install WASI SDK: Download the [wasi-sdk](https://github.com/CraneStation/wasi-sdk/releases) and extract the archive to default path `/opt/wasi-sdk`
To install WASI SDK, please download the [wasi-sdk release](https://github.com/CraneStation/wasi-sdk/releases) and extract the archive to default path `/opt/wasi-sdk`.
For [AssemblyScript](https://github.com/AssemblyScript/assemblyscript), please refer to [AssemblyScript quick start](https://www.assemblyscript.org/quick-start.html) and [AssemblyScript compiler](https://www.assemblyscript.org/compiler.html#command-line-options) for how to install `asc` compiler and build WASM applications.
For Rust, please firstly ref to [Install Rust and Cargo](https://doc.rust-lang.org/cargo/getting-started/installation.html) to install cargo, rustc and rustup, by default they are installed under ~/.cargo/bin, and then run `rustup target add wasm32-wasi` to install wasm32-wasi target for Rust toolchain. To build WASM applications, we can run `cargo build --target wasm32-wasi`, the output files are under `target/wasm32-wasi`.
Build WASM applications
Build WASM applications with wasi-sdk
=========================
You can write a simple ```test.c``` as the first sample.
@ -44,21 +48,23 @@ To build the source file to WASM bytecode, we can input the following command:
/opt/wasi-sdk/bin/clang -O3 -o test.wasm test.c
```
There are some useful options which can be specified to build the source code:
## 1. wasi-sdk options
- **-nostdlib** Do not use the standard system startup files or libraries when linking. In this mode, the libc-builtin library of WAMR must be built to run the wasm app, otherwise, the libc-wasi library must be built. You can specify **-DWAMR_BUILD_LIBC_BUILTIN** or **-DWAMR_BUILD_LIBC_WASI** for cmake to build WAMR with libc-builtin support or libc-wasi support.
There are some useful options which can be specified to build the source code (for more link options, please run `/opt/wasi-sdk/bin/wasm-ld --help`):
- **-nostdlib** Do not use the standard system startup files or libraries when linking. In this mode, the **libc-builtin** library of WAMR must be built to run the wasm app, otherwise, the **libc-wasi** library must be built. You can specify **-DWAMR_BUILD_LIBC_BUILTIN=1** or **-DWAMR_BUILD_LIBC_WASI=1** for cmake to build WAMR with libc-builtin support or libc-wasi support.
- **-Wl,--no-entry** Do not output any entry point
- **-Wl,--export=<value>** Force a symbol to be exported, e.g. **-Wl,--export=main** to export main function
- **-Wl,--export=\<value\>** Force a symbol to be exported, e.g. **-Wl,--export=foo** to export foo function
- **-Wl,--export-all** Export all symbols (normally combined with --no-gc-sections)
- **-Wl,--initial-memory=<value>** Initial size of the linear memory, which must be a multiple of 65536
- **-Wl,--initial-memory=\<value\>** Initial size of the linear memory, which must be a multiple of 65536
- **-Wl,--max-memory=<value>** Maximum size of the linear memory, which must be a multiple of 65536
- **-Wl,--max-memory=\<value\>** Maximum size of the linear memory, which must be a multiple of 65536
- **-z stack-size=<vlaue>** The auxiliary stack size, which is an area of linear memory, and must be smaller than initial memory size.
- **-z stack-size=\<vlaue\>** The auxiliary stack size, which is an area of linear memory, and must be smaller than initial memory size.
- **-Wl,--strip-all** Strip all symbols
@ -66,11 +72,12 @@ There are some useful options which can be specified to build the source code:
- **-Wl,--allow-undefined** Allow undefined symbols in linked binary
- **-Wl,--allow-undefined-file=<value>** Allow symbols listed in <file> to be undefined in linked binary
- **-Wl,--allow-undefined-file=\<value\>** Allow symbols listed in \<file\> to be undefined in linked binary
- **-pthread** Support POSIX threads in generated code
For example, we can build the wasm app with command:
``` Bash
/opt/wasi-sdk/bin/clang -O3 -nostdlib \
-z stack-size=8192 -Wl,--initial-memory=65536 \
@ -79,7 +86,112 @@ For example, we can build the wasm app with command:
-Wl,--export=__heap_base -Wl,--export=__data_end \
-Wl,--no-entry -Wl,--strip-all -Wl,--allow-undefined
```
to generate a wasm binary with small footprint.
to generate a wasm binary with nostdlib mode, auxiliary stack size is 8192 bytes, initial memory size is 64 KB, main function, heap base global and data end global are exported, no entry function is generated (no _start function is exported), and all symbols are stripped. Note that it is nostdlib mode, so libc-builtin should be enabled by runtime embedder or iwasm (with cmake -DWAMR_BUILD_LIBC_BUILT=1, enabled by iwasm in Linux by default).
If we want to build the wasm app with wasi mode, we may build the wasm app with command:
```bash
/opt/wasi-sdk/bin/clang -O3 \
-z stack-size=8192 -Wl,--initial-memory=65536 \
-o test.wasm test.c \
-Wl,--export=__heap_base -Wl,--export=__data_end \
-Wl,--strip-all
```
to generate a wasm binary with wasi mode, auxiliary stack size is 8192 bytes, initial memory size is 64 KB, heap base global and data end global are exported, wasi entry function exported (_start function), and all symbols are stripped. Note that it is wasi mode, so libc-wasi should be enabled by runtime embedder or iwasm (with cmake -DWAMR_BUILD_LIBC_WASI=1, enabled by iwasm in Linux by default), and normally no need to export main function, by default _start function is executed by iwasm.
## 2. How to reduce the footprint?
Firstly if libc-builtin (-nostdlib) mode meets the requirements, e.g. there are no file io operations in wasm app, we should build the wasm app with -nostdlib option as possible as we can, since the compiler doesn't build the libc source code into wasm bytecodes, which greatly reduces the binary size.
### (1) Methods to reduce the libc-builtin (-nostdlib) mode footprint
- export \_\_heap_base global and \_\_data_end global
```bash
-Wl,--export=__heap_base -Wl,--export=__data_end
```
If the two globals are exported, and there are no memory.grow and memory.size opcodes (normally nostdlib mode doesn't introduce these opcodes since the libc malloc function isn't built into wasm bytecode), WAMR runtime will truncate the linear memory at the place of \__heap_base and append app heap to the end, so we don't need to allocate the memory specified by `-Wl,--initial-memory=n` which must be at least 64 KB. This is helpful for some embedded devices whose memory resource might be limited.
- reduce auxiliary stack size
The auxiliary stack is an area of linear memory, normally the size is 64 KB by default which might be a little large for embedded devices and actually partly used, we can use `-z stack-size=n` to set its size.
- use -O3 and -Wl,--strip-all
- reduce app heap size when running iwasm
We can pass `--heap-size=n` option to set the maximum app heap size for iwasm, by default it is 16 KB. For the runtime embedder, we can set the `uint32_t heap_size` argument when calling API ` wasm_runtime_instantiate`.
- reduce wasm operand stack size when running iwasm
WebAssembly is a binary instruction format for a stack-based virtual machine, which requires a stack to execute the bytecodes. We can pass `--stack-size=n` option to set the maximum stack size for iwasm, by default it is 16 KB. For the runtime embedder, we can set the `uint32_t stack_size` argument when calling API ` wasm_runtime_instantiate` and `wasm_runtime_create_exec_env`.
- decrease block_addr_cache size for classic interpreter
The block_addr_cache is an hash cache to store the else/end addresses for WebAssembly blocks (BLOCK/IF/LOOP) to speed up address lookup. This is only available in classic interpreter. We can set it by define macro `-DBLOCK_ADDR_CACHE_SIZE=n`, e.g. add `add_defintion (-DBLOCK_ADDR_CACHE_SIZE=n)` in CMakeLists.txt, by default it is 64, and total block_addr_cache size is 3072 bytes in 64-bit platform and 1536 bytes in 32-bit platform.
### (2) Methods to reduce the libc-wasi (without -nostdlib) mode footprint
Most of the above methods are also available for libc-wasi mode, besides them, we can export malloc and free functions with `-Wl,--export=malloc -Wl,--export=free` option, so WAMR runtime will disable its app heap and call the malloc/free function exported to allocate/free the memory from/to the heap space managed by libc.
## 3. Build wasm app with pthread support
Please ref to [pthread library](./pthread_library.md) for more details.
## 4. Build wasm app with SIMD support
Normally we should install emsdk and use its SSE header files, please ref to workload samples, e.g. [bwa CMakeLists.txt](../samples/workload/bwa/CMakeLists.txt) and [wasm-av1 CMakeLists.txt](../samples/workload/wasm-av1/CMakeLists.txt) for more details.
# Build WASM applications with emsdk
## 1. Install emsdk
Assuming you are using Linux, you may install emcc and em++ from Emscripten EMSDK following the steps below:
```
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
./emsdk install latest
./emsdk activate latest
# And then source the emsdk_env.sh script before build wasm app
source emsdk_env.sh (or add it to ~/.bashrc if you don't want to run it each time)
```
The Emscripten website provides other installation methods beyond Linux.
## 2. emsdk options
To build the wasm C source code into wasm binary, we can use the following command:
```bash
EMCC_ONLY_FORCED_STDLIBS=1 emcc -O3 -s STANDALONE_WASM=1 \
-o test.wasm test.c \
-s TOTAL_STACK=4096 -s TOTAL_MEMORY=65536 \
-s "EXPORTED_FUNCTIONS=['_main']" \
-s ERROR_ON_UNDEFINED_SYMBOLS=0
```
There are some useful options:
- **EMCC_ONLY_FORCED_STDLIBS=1** whether to link libc library into the output binary or not, similar to `-nostdlib` option of wasi-sdk clang. If specified, then no libc library is linked and the **libc-builtin** library of WAMR must be built to run the wasm app, otherwise, the **libc-wasi** library must be built. You can specify **-DWAMR_BUILD_LIBC_BUILTIN=1** or **-DWAMR_BUILD_LIBC_WASI=1** for cmake to build WAMR with libc-builtin support or libc-wasi support.
The emsdk's wasi implementation is incomplete, e.g. open a file might just return fail, so it is strongly not recommended to use this mode, especially when there are file io operations in wasm app, please use wasi-sdk instead.
- **-s STANDALONE_WASM=1** build wasm app in standalone mode (non-web mode), if the output file has suffix ".wasm", then only wasm file is generated (without html file and JavaScript file).
- **-s TOTAL_STACK=\<value\>** the auxiliary stack size, same as `-z stack-size=\<value\>` of wasi-sdk
- **-s TOTAL_MEMORY=\<value\>** or **-s INITIAL_MEORY=\<value\>** the initial linear memory size
- **-s MAXIMUM_MEMORY=\<value\>** the maximum linear memory size, only take effect if **-s ALLOW_MEMORY_GROWTH=1** is set
- **-s ALLOW_MEMORY_GROWTH=1/0** whether the linear memory is allowed to grow or not
- **-s "EXPORTED_FUNCTIONS=['func name1', 'func name2']"** to export functions
- **-s ERROR_ON_UNDEFINED_SYMBOLS=0** disable the errors when there are undefined symbols
For more options, please ref to <EMSDK_DIR>/upstream/emscripten/src/settings.js, or [Emscripten document](https://emscripten.org/docs/compiling/Building-Projects.html).
# Build a project with cmake
@ -136,19 +248,29 @@ Usage: wamrc [options] -o output_file wasm_file
Use +feature to enable a feature, or -feature to disable it
For example, --cpu-features=+feature1,-feature2
Use --cpu-features=+help to list all the features supported
--opt-level=n Set the optimization level (0 to 3, default: 3, which is fastest)
--size-level=n Set the code size level (0 to 3, default: 3, which is smallest)
--opt-level=n Set the optimization level (0 to 3, default is 3)
--size-level=n Set the code size level (0 to 3, default is 3)
-sgx Generate code for SGX platform (Intel Software Guard Extention)
--bounds-checks=1/0 Enable or disable the bounds checks for memory access:
by default it is disabled in all 64-bit platforms except SGX and
in these platforms runtime does bounds checks with hardware trap,
and by default it is enabled in all 32-bit platforms
--format=<format> Specifies the format of the output file
The format supported:
aot (default) AoT file
object Native object file
llvmir-unopt Unoptimized LLVM IR
llvmir-opt Optimized LLVM IR
--enable-bulk-memory Enable the post-MVP bulk memory feature
--enable-multi-thread Enable multi-thread feature, the dependent features bulk-memory and
--enable-tail-call Enable the post-MVP tail call feature
thread-mgr will be enabled automatically
--enable-simd Enable the post-MVP 128-bit SIMD feature
--enable-dump-call-stack Enable stack trace feature
-v=n Set log verbose level (0 to 5, default is 2), larger with more log
Examples: wamrc -o test.aot test.wasm
wamrc --target=i386 -o test.aot test.wasm
wamrc --target=i386 --format=object -o test.o test.wasm
```

View File

@ -1,5 +1,4 @@
## Use clang compiler
The recommended method to build a WASM binary is to use clang compiler ```clang-8```. You can refer to [apt.llvm.org](https://apt.llvm.org) for the detailed instructions. Here are referenced steps to install clang-8 in Ubuntu 16.04 and Ubuntu 18.04.
@ -61,42 +60,6 @@ clang-8 --target=wasm32 -O3 \
You will get ```test.wasm``` which is the WASM app binary.
## Use Emscripten tool
The last method to build a WASM binary is to use Emscripten tool ```emcc```.
Assuming you are using Linux, you may install emcc from Emscripten EMSDK following the steps below:
```
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
./emsdk install latest-fastcomp
./emsdk activate latest-fastcomp
```
The Emscripten website provides other installation methods beyond Linux.
Use the emcc command below to build the WASM C source code into the WASM binary.
``` Bash
cd emsdk
source emsdk_env.sh (or add it to ~/.bashrc if you don't want to run it each time)
cd <dir of test.c>
EMCC_ONLY_FORCED_STDLIBS=1 emcc -g -O3 -s WASM=1 -s ERROR_ON_UNDEFINED_SYMBOLS=0 \
-s TOTAL_MEMORY=65536 -s TOTAL_STACK=4096 \
-s ASSERTIONS=1 -s STACK_OVERFLOW_CHECK=2 \
-s "EXPORTED_FUNCTIONS=['_main']" -o test.wasm test.c
```
You will get ```test.wasm``` which is the WASM app binary.
## Using Docker
Another method availble is using [Docker](https://www.docker.com/). We assume you've already configured Docker (see Platform section above) and have a running interactive shell. Currently the Dockerfile only supports compiling apps with clang, with Emscripten planned for the future.

View File

@ -57,7 +57,7 @@ To build this C program into WebAssembly app with libc-builtin, you can use this
You can also build this program with WASI, but we need to make some changes to wasi-sysroot:
1. disable malloc / free of wasi as they don't support shared memory
1. disable malloc/free of wasi if the wasi-sdk version is smaller than wasi-sdk-12.0 (not include 12.0), as they don't support shared memory:
``` bash
/opt/wasi-sdk/bin/llvm-ar -d /opt/wasi-sdk/share/wasi-sysroot/lib/wasm32-wasi/libc.a dlmalloc.o
```
@ -169,4 +169,4 @@ int pthread_key_delete(pthread_key_t key);
## Known limits
- `pthread_attr_t`, `pthread_mutexattr_t` and `pthread_condattr_t` are not supported yet, so please pass `NULL` as the second argument of `pthread_create`, `pthread_mutex_init` and `pthread_cond_init`.
- The `errno.o` in wasi-sysroot is not compatible with this feature, so using errno in multi-thread may cause unexpected behavior.
- Currently `struct timespec` is not supported, so the prototype of `pthread_cond_timedwait` is different from the native one, it takes an unsigned int argument `useconds` to indicate the waiting time.
- Currently `struct timespec` is not supported, so the prototype of `pthread_cond_timedwait` is different from the native one, it takes an unsigned int argument `useconds` to indicate the waiting time.