Commit Graph

114 Commits

Author SHA1 Message Date
efa8019bdb Merge dev/simd for fast-interp (#4131)
* Implement the first few SIMD opcodes for fast interpreter (v128.const, v128.any_true) (#3818)

Tested on the following code:
```
(module
  (import "wasi_snapshot_preview1" "proc_exit" (func $proc_exit (param i32)))
  (memory (export "memory") 1)

  ;; WASI entry point
  (func $main (export "_start")
    v128.const i8x16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    v128.any_true
    if
      unreachable
    end
    
    v128.const i8x16 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15     
    v128.any_true
    i32.const 0
    i32.eq
    if
      unreachable
    end

    i32.const 0
    call $proc_exit
  )
)
```

* implement POP_V128()

This is to simplify the simd implementation for fast interpreter

* Add all SIMD operations into wasm_interp_fast switch

* Add V128 comparison operations

Tested using
```
(module
  (import "wasi_snapshot_preview1" "proc_exit" (func $proc_exit (param i32)))

  (memory (export "memory") 1)

  (func $assert_true (param v128)
    local.get 0
    v128.any_true
    i32.eqz
    if
      unreachable
    end
  )

  (func $main (export "_start")
    ;; Test v128.not
    v128.const i8x16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    v128.not
    v128.const i8x16 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255
    i8x16.eq
    call $assert_true

    ;; Test v128.and
    v128.const i8x16 255 255 255 255 0 0 0 0 255 255 255 255 0 0 0 0
    v128.const i8x16 255 255 0 0 255 255 0 0 255 255 0 0 255 255 0 0
    v128.and
    v128.const i8x16 255 255 0 0 0 0 0 0 255 255 0 0 0 0 0 0
    i8x16.eq
    call $assert_true

    ;; Test v128.andnot
    v128.const i8x16 255 255 255 255 0 0 0 0 255 255 255 255 0 0 0 0
    v128.const i8x16 255 255 0 0 255 255 0 0 255 255 0 0 255 255 0 0
    v128.andnot
    v128.const i8x16 0 0 255 255 0 0 0 0 0 0 255 255 0 0 0 0
    i8x16.eq
    call $assert_true

    ;; Test v128.or
    v128.const i8x16 255 255 0 0 0 0 255 255 255 255 0 0 0 0 255 0
    v128.const i8x16 0 0 255 255 255 255 0 0 0 0 255 255 255 255 0 0
    v128.or
    v128.const i8x16 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 0
    i8x16.eq
    call $assert_true

    ;; Test v128.xor
    v128.const i8x16 255 255 0 0 255 255 0 0 255 255 0 0 255 255 0 0
    v128.const i8x16 255 255 255 255 0 0 0 0 255 255 255 255 0 0 0 0
    v128.xor
    v128.const i8x16 0 0 255 255 255 255 0 0 0 0 255 255 255 255 0 0
    i8x16.eq
    call $assert_true

    i32.const 0
    call $proc_exit
  )
)
```

* Add first NEON SIMD opcode implementations to fast interpreter (#3859)

Add some implementations of SIMD opcodes using NEON instructions.
Tested using:
```wast
(module
  (import "wasi_snapshot_preview1" "proc_exit" (func $proc_exit (param i32)))
  (memory (export "memory") 1)

  (func $assert_true (param v128)
    local.get 0
    v128.any_true 
    i32.eqz
    if
      unreachable
    end
  )
  (func $main (export "_start")
    i32.const 0
    i32.const 32
    memory.grow
    drop

    i32.const 0
    v128.const i8x16 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
    v128.store

    i32.const 0
    v128.load

    v128.const i8x16 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
    i8x16.eq
    call $assert_true

    i32.const 16
    v128.const i8x16 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
    v128.store

    i32.const 16
    v128.load
    v128.const i8x16 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
    i8x16.eq
    call $assert_true

    i32.const 0
    v128.load
    v128.const i8x16 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
    i8x16.eq
    call $assert_true
    drop

    i32.const 0
    i32.const 1
    memory.grow
    drop

    i32.const 0
    i64.const 0x7F80FF017E02FE80
    i64.store

    i32.const 0
    v128.load8x8_s

    v128.const i16x8 127 -128 -1 1 126 2 -2 -128

    i16x8.eq
    call $assert_true

    i32.const 0
    i64.const 0x80FE027E01FF807F
    i64.store

    i32.const 0
    v128.load8x8_u

    v128.const i16x8 128 254 2 126 1 255 128 127

    i16x8.eq
    call $assert_true

    i32.const 0
    i64.const 0x8000FFFE7FFF0001
    i64.store

    i32.const 0
    v128.load16x4_s

    v128.const i32x4 -32768 -2 32767 1

    i32x4.eq
    call $assert_true

    i32.const 0
    i64.const 0x8000FFFE7FFF0001 
    i64.store

    i32.const 0
    v128.load16x4_u

    v128.const i32x4 32768 65534 32767 1   

    i32x4.eq
    call $assert_true

    i32.const 0
    i64.const 0x8000000000000001
    i64.store

    i32.const 0
    v128.load32x2_s

    v128.const i64x2 -2147483648 1 

    i64x2.eq
    call $assert_true

    i32.const 0
    i64.const 0x8000000000000001
    i64.store

    i32.const 0
    v128.load32x2_u

    v128.const i64x2 2147483648 1

    i64x2.eq
    call $assert_true

    call $proc_exit
  )
)
```

* Emit imm for lane extract and replace (#3906)

* Fix replacement value not being correct (#3919)

* Implement load lanes opcodes for wasm (#3942)

* Implement final SIMD opcodes: store lane (#4001)

* Fix load/store (#4054)

* Correctly use unsigned functions  (#4055)

* implement local and function calls for v128 in the fast interpreter

* Fix splat opcodes, add V128 handling in preserve_referenced_local and reserve_block_ret

* Fix incorrect memory overflow values + SIMD ifdefs

* Fix load/load_splat macros

* correct endif wasm loader

* Update core/iwasm/interpreter/wasm_opcode.h

* Fix spec tests when WASM_CPU_SUPPORTS_UNALIGNED_ADDR_ACCESS is 0

* Resolve merge conflicts arising from main -> dev/simd_for_interp and implement fast interpreter const offset loader support for V128

* Enable SIMDe tests on CI

* Document WAMR_BUILD_LIB_SIMDE

---------

Co-authored-by: James Marsh <mrshnja@amazon.co.uk>
Co-authored-by: jammar1 <108334558+jammar1@users.noreply.github.com>
Co-authored-by: Maks Litskevich <makslit@amazon.com>
Co-authored-by: Marcin Kolny <marcin.kolny@gmail.com>
Co-authored-by: Wenyong Huang <wenyong.huang@intel.com>
2025-03-20 14:23:20 +08:00
766f378590 Merge pull request #4033 from g0djan/godjan/iterate_callstack
Copy callstack API
2025-03-11 10:31:56 +08:00
cc3f0a096b Cleaning up 2025-02-24 17:33:14 +00:00
32338bb7d6 Copy read only API behind a flag instead of using user defined callback 2025-02-24 17:22:05 +00:00
f2ef9ee62e Cmake improvements (#4076)
- Utilizes the standard CMake variable BUILD_SHARED_LIBS to simplify the CMake configuration.
- Allows the use of a single library definition for both static and
  shared library cases, improving maintainability and readability of the CMake configuration.
- Install vmlib public header files
- Installs the public header files for the vmlib target to the include/iwasm directory.
- Install cmake package
- Adds the necessary CMake configuration files (iwasmConfig.cmake and iwasmConfigVersion.cmake).
- Configures the installation of these files to the appropriate directory (lib/cmake/iwasm).
- Ensures compatibility with the same major version.
- Improve windows product-mini CMakeLists.txt
- Fix missing symbols when linking windows product-mini with shared vmlib
- Improve Darwin product-mini CMakeLists.txt

---------

Signed-off-by: Peter Tatrai <peter.tatrai.ext@siemens.com>
2025-02-21 15:29:49 +08:00
e64685f43c Add versioning support and update CMake configuration 2025-02-05 10:31:20 +00:00
41b2c6d0d5 Show wasm proposals status during compilation and execution (#3989)
- add default build configuration options and enhance message output for WAMR features
- Add Wasm proposal status printing functionality
2025-02-05 15:28:26 +08:00
c6712b4033 add a validator for aot module (#3995)
- Add AOT module validation to ensure memory constraints are met
- Enable AOT validator in build configuration and update related source files
2025-02-05 15:21:49 +08:00
53da420c41 Enable shrunk memory by default and add related configurations (#4008)
- Enable shrunk memory by default and add related configurations
- Improve error messages for memory access alignment checks
- Add documentation for WAMR shrunk memory build option
- Update NuttX workflow to disable shrunk memory build option
2025-01-13 07:09:04 +08:00
1af474099b Add Windows wamrc and iwasm build in release CI (#3857)
- For Windows, llvm libs need to cache more directories, so use a multi-line
  environment variable for paths
- Remove conditionally build directories `win32build`, just use `build` for all platform
- Add Windows wamrc and iwasm(disable lib pthread semaphore and fast jit for now)
  build in release CI
2024-10-17 10:01:56 +08:00
c4aa1deda5 Add shared heap sample (#3806) 2024-09-20 16:13:20 +08:00
1fd422ae6e Merge branch main into dev/shared_heap 2024-09-20 14:34:00 +08:00
51a71092bf Support dynamic aot debug (#3788)
Enable dynamic aot debug feature which debugs the aot file
and is able to set the break point and do single step. Refer to
the README for the detailed steps.

Signed-off-by: zhangliangyu3 <zhangliangyu3@xiaomi.com>
2024-09-18 11:02:10 +08:00
92852f3719 Implement a first version of shared heap feature (#3789)
ps. https://github.com/bytecodealliance/wasm-micro-runtime/issues/3546
2024-09-14 10:51:42 +08:00
0599351262 wasi-nn: Add a new target for llama.cpp as a wasi-nn backend (#3709)
Minimum support:
- [x] accept (WasmEdge) customized model parameters. metadata.
- [x] Target [wasmedge-ggml examples](https://github.com/second-state/WasmEdge-WASINN-examples/tree/master/wasmedge-ggml)
  - [x] basic
  - [x] chatml
  - [x] gemma
  - [x] llama
  - [x] qwen

---

In the future, to support if required:
- [ ] Target [wasmedge-ggml examples](https://github.com/second-state/WasmEdge-WASINN-examples/tree/master/wasmedge-ggml)
  - [ ] command-r. (>70G memory requirement)
  - [ ] embedding. (embedding mode)
  - [ ] grammar. (use the grammar option to constrain the model to generate the JSON output)
  - [ ] llama-stream. (new APIS `compute_single`, `get_output_single`, `fini_single`)
  - [ ] llava. (image representation)
  - [ ] llava-base64-stream. (image representation)
  - [ ] multimodel. (image representation)
- [ ] Target [llamaedge](https://github.com/LlamaEdge/LlamaEdge)
2024-09-10 08:45:18 +08:00
1329e1d3e1 Add support for multi-memory proposal in classic interpreter (#3742)
Implement multi-memory for classic-interpreter. Support core spec (and bulk memory) opcodes now,
and will support atomic opcodes, and add multi-memory export APIs in the future. 

PS: Multi-memory spec test patched a lot for linking test to adapt for multi-module implementation.
2024-08-21 12:22:23 +08:00
58ca02bc5f Add support for RISCV32 ILP32F (#3708) 2024-08-15 15:17:42 +08:00
140ff25d46 wasi-nn: Apply new architecture (#3692)
ps.
https://github.com/bytecodealliance/wasm-micro-runtime/issues/3677
2024-08-13 09:14:52 +08:00
058bc47102 [wasi-nn] Add a new wasi-nn backend openvino (#3603) 2024-07-22 17:16:41 +08:00
f7d2826772 Allow missing imports in wasm loader and report error in wasm instantiation instead (#3539)
The wasm loader is failing when multi-module support is on and the dependent
modules are not found; this enforces the AOT compiler integrations to prepare
dependent modules while it isn't necessary.

This PR allows allows missing imports in wasm loader and report error in wasm
instantiation instead, which enables the integrated AOT compiler to work as if
the multi-module support isn't turned on.
2024-06-25 10:04:39 +08:00
49c9fa31da Fix typo of WAMR_CONFIGUABLE_BOUNDS_CHECKS (#3424)
Change to WAMR_CONFIGURABLE_BOUNDS_CHECKS, and fix CodeQL compilation errors
which were introduced by PR #3406.

ps.
https://github.com/bytecodealliance/wasm-micro-runtime/pull/3393#discussion_r1591810998
https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/9055318553/job/24876266629
2024-05-14 14:33:09 +08:00
ba59e56e19 User defined memory allocator for different purposes (#3316)
Some issues are related with memory fragmentation, which may cause
the linear memory cannot be allocated. In WAMR, the memory managed
by the system is often trivial, but linear memory usually directly allocates
a large block and often remains unchanged for a long time. Their sensitivity
and contribution to fragmentation are different, which is suitable for
different allocation strategies. If we can control the linear memory's allocation,
do not make it from system heap, the overhead of heap management might
be avoided.

Add `mem_alloc_usage_t usage` as the first argument for user defined
malloc/realloc/free functions when `WAMR_BUILD_ALLOC_WITH_USAGE` cmake
variable is set as 1, and make passing `Alloc_For_LinearMemory` to the
argument when allocating the linear memory.
2024-04-18 19:40:57 +08:00
68bd30c6f9 Enhance GC subtyping checks (#3317)
Enhance the GC subtyping checks:
- Fix issues in the type equivalence check
- Enable the recursive type subtyping check
- Add a equivalence type flag in defined types of aot file, if there is an
  equivalence type before, just set it true and re-use the previous type
- Normalize the defined types for interpreter and AOT
- Enable spec test case type-equivalence.wast and type-subtyping.wast,
  and enable some commented cases
- Enable set WAMR_BUILD_SANITIZER from cmake variable
2024-04-18 12:32:01 +08:00
a23fa9f86c Implement memory64 for classic interpreter (#3266)
Adding a new cmake flag (cache variable) `WAMR_BUILD_MEMORY64` to enable
the memory64 feature, it can only be enabled on the 64-bit platform/target and
can only use software boundary check. And when it is enabled, it can support both
i32 and i64 linear memory types. The main modifications are:

- wasm loader & mini-loader: loading and bytecode validating process 
- wasm runtime: memory instantiating process
- classic-interpreter: wasm code executing process
- Support memory64 memory in related runtime APIs
- Modify main function type check when it's memory64 wasm file
- Modify `wasm_runtime_invoke_native` and `wasm_runtime_invoke_native_raw` to
  handle registered native function pointer argument when memory64 is enabled
- memory64 classic-interpreter spec test in `test_wamr.sh` and in CI

Currently, it supports memory64 memory wasm file that uses core spec
(including bulk memory proposal) opcodes and threads opcodes.

ps.
https://github.com/bytecodealliance/wasm-micro-runtime/issues/3091
https://github.com/bytecodealliance/wasm-micro-runtime/pull/3240
https://github.com/bytecodealliance/wasm-micro-runtime/pull/3260
2024-04-02 15:22:07 +08:00
9c8551cf75 Add cmake flag to control aot intrinsics (#3261)
Add cmake variable `-DWAMR_BUILD_AOT_INTRINSICS=1/0` to enable/disable
the aot intrinsic functions, which are normally used by AOT XIP feature, and
can be disabled to reduce the aot runtime binary size.

And refactor the code in aot_intrinsics.h/.c.
2024-04-01 11:26:05 +08:00
cef88deedb Add wasi_ephemeral_nn module support (#3241)
Add `wasi_ephemeral_nn` module support with optional cmake variable,
which was mentioned in #3229.
2024-03-21 21:05:34 +08:00
16a4d71b34 Implement GC (Garbage Collection) feature for interpreter, AOT and LLVM-JIT (#3125)
Implement the GC (Garbage Collection) feature for interpreter mode,
AOT mode and LLVM-JIT mode, and support most features of the latest
spec proposal, and also enable the stringref feature.

Use `cmake -DWAMR_BUILD_GC=1/0` to enable/disable the feature,
and `wamrc --enable-gc` to generate the AOT file with GC supported.

And update the AOT file version from 2 to 3 since there are many AOT
ABI breaks, including the changes of AOT file format, the changes of
AOT module/memory instance layouts, the AOT runtime APIs for the
AOT code to invoke and so on.
2024-02-06 20:47:11 +08:00
a27ddece7f Always allocate linear memory using mmap (#3052)
With this approach we can omit using memset() for the newly allocated memory
therefore the physical pages are not being used unless touched by the program.

This also simplifies the implementation.
2024-02-02 22:17:44 +08:00
af318bac81 Implement Exception Handling for classic interpreter (#3096)
This PR adds the initial support for WASM exception handling:
* Inside the classic interpreter only:
  * Initial handling of Tags
  * Initial handling of Exceptions based on W3C Exception Proposal
  * Import and Export of Exceptions and Tags
* Add `cmake -DWAMR_BUILD_EXCE_HANDLING=1/0` option to enable/disable
  the feature, and by default it is disabled
* Update the wamr-test-suites scripts to test the feature
* Additional CI/CD changes to validate the exception spec proposal cases

Refer to:
https://github.com/bytecodealliance/wasm-micro-runtime/issues/1884
587513f3c6
8bebfe9ad7
59bccdfed8

Signed-off-by: Ricardo Aguilar <ricardoaguilar@siemens.com>
Co-authored-by: Chris Woods <chris.woods@siemens.com>
Co-authored-by: Rene Ermler <rene.ermler@siemens.com>
Co-authored-by: Trenner Thomas <trenner.thomas@siemens.com>
2024-01-31 08:27:17 +08:00
3fcd79867d Forward log and log level to custom bh_log callback (#3070)
Follow-up on #2907. The log level is needed in the host embedder to
better integrate with the embedder's logger.

Allow the developer to customize his bh_log callback with
`cmake -DWAMR_BH_LOG=<log_callback>`,
and update sample/basic to show the usage.
2024-01-24 13:05:07 +08:00
c8b59588a7 Enhance setting write gs base with cmake variable (#3066)
In linux x86-64, developer can use cmake variable to configure whether
to enable writing linear memory base address to x86 GS register or not:
- `cmake -DWAMR_DISABLE_WRITE_GS_BASE=1`: disabled it
- `cmake -DWAMR_DISABLE_WRITE_GS_BASE=0`: enabled it
- `cmake` without `-DWAMR_DISABLE_WRITE_GS_BASE=1/0`:
        auto-detected by the compiler
2024-01-23 12:21:20 +08:00
d13a54f860 Revert "Enable MAP_32BIT for macOS (#2992)" (#3032)
Revert "Do not use pagezero size option if osx version >= 13 (#3025)"
and  "Enable MAP_32BIT for macOS (#2992)".

Discussion: https://github.com/bytecodealliance/wasm-micro-runtime/issues/3009
2024-01-18 09:22:09 +08:00
6fb376cbe3 Disable quick aot entry for interp and fast-jit (#3039)
Quick aot/jit entry is only for aot and llvm jit now, no need to enable it
for interpreter and fast-jit when aot and llvm jit are disabled.
2024-01-17 16:17:08 +08:00
ffa131b5ac Allow using mmap for shared memory if hw bound check is disabled (#3029)
For shared memory, the max memory size must be defined in advanced. Re-allocation
for growing memory can't be used as it might change the base address, therefore when
OS_ENABLE_HW_BOUND_CHECK is enabled the memory is mmaped, and if the flag is
disabled, the memory is allocated. This change introduces a flag that allows users to use
mmap for reserving memory address space even if the OS_ENABLE_HW_BOUND_CHECK
is disabled.
2024-01-16 22:15:55 +08:00
bb053e3a2d Do not use pagezero size option if osx version >= 13 (#3025)
Reported in https://github.com/bytecodealliance/wasm-micro-runtime/issues/3009.
2024-01-16 12:14:43 +08:00
7c7684819d Register quick call entries to speedup the aot/jit func call process (#2978)
In some scenarios there may be lots of callings to AOT/JIT functions from the
host embedder, which expects good performance for the calling process, while
in the current implementation, runtime calls the wasm_runtime_invoke_native
to prepare the array of registers and stacks for the invokeNative assemble code,
and the latter then puts the elements in the array to physical registers and
native stacks and calls the AOT/JIT function, there may be many data copying
and handlings which impact the performance.

This PR registers some quick AOT/JIT entries for some simple wasm signatures,
and let runtime call the entry to directly invoke the AOT/JIT function instead of
calling wasm_runtime_invoke_native, which speedups the calling process.

We may extend the mechanism next to allow the developer to register his quick
AOT/JIT entries to speedup the calling process of invoking the AOT/JIT functions
for some specific signatures.
2024-01-10 16:44:09 +08:00
6fa6d6d9a5 Enable MAP_32BIT for macOS (#2992)
On macOS, by default, the first 4GB is occupied by the pagezero.
While it can be controlled with link time options, as we are
an library, we usually don't have a control on how to link an
executable.
2024-01-10 16:19:06 +08:00
5c3ad0279a Enable AOT linux perf support (#2930)
And refactor the original perf support
- use WAMR_BUILD_LINUX_PERF as the cmake compilation control
- use WASM_ENABLE_LINUX_PERF as the compiler macro
- use `wamrc --enable-linux-perf` to generate aot file which contains fp operations
- use `iwasm --enable-linux-perf` to create perf map for `perf record`
2024-01-02 15:58:17 +08:00
b0d5b8df1d Fix issues of build/run with llvm-17 (#2853)
- Fix compilation error of using PGOOptions
- Fix LLVM JIT run error due to `llvm_orc_registerEHFrameSectionWrapper`
  symbol not found
2023-12-04 16:40:54 +08:00
444b159963 Implement async termination of blocking thread (#2516)
Send a signal whose handler is no-op to a blocking thread to wake up
the blocking syscall with either EINTR equivalent or partial success.

Unlike the approach taken in the `dev/interrupt_block_insn` branch (that is,
signal + longjmp similarly to `OS_ENABLE_HW_BOUND_CHECK`), this PR
does not use longjmp because:
* longjmp from signal handler doesn't work on nuttx
  refer to https://github.com/apache/nuttx/issues/10326
* the singal+longjmp approach may be too difficult for average programmers
  who might implement host functions to deal with

See also https://github.com/bytecodealliance/wasm-micro-runtime/issues/1910
2023-09-20 18:11:52 +08:00
6c846acc59 Implement module instance context APIs (#2436)
Introduce module instance context APIs which can set one or more contexts created
by the embedder for a wasm module instance:
```C
    wasm_runtime_create_context_key
    wasm_runtime_destroy_context_key
    wasm_runtime_set_context
    wasm_runtime_set_context_spread
    wasm_runtime_get_context
```

And make libc-wasi use it and set wasi context as the first context bound to the wasm
module instance.

Also add samples.

Refer to https://github.com/bytecodealliance/wasm-micro-runtime/issues/2460.
2023-09-07 14:54:11 +08:00
b45d014112 wasi-nn: Improve TPU support (#2447)
1. Allow TPU and GPU support at the same time.
2. Add Dockerfile to run example with [Coral USB](https://coral.ai/products/accelerator/).
2023-08-14 20:03:56 +08:00
490fa2ddac Auto-check wrgsbase in cmake script (#2437)
Auto-check whether `WRGSBASE` instruction is supported in linux x86-64 in the
cmake script. And if not, disable writing x86 GS register.
2023-08-09 19:43:08 +08:00
18092f86cc Make memory access boundary check behavior configurable (#2289)
Allow to use `cmake -DWAMR_CONFIGURABLE_BOUNDS_CHECKS=1` to
build iwasm, and then run `iwasm --disable-bounds-checks` to disable the
memory access boundary checks.

And add two APIs:
`wasm_runtime_set_bounds_checks` and `wasm_runtime_is_bounds_checks_enabled`
2023-07-04 16:21:30 +08:00
fe830d805d Add cmake variable to disable writing gs register (#2284)
Support to disable writing x86-64 GS segment register by
  `cmake -DWAMR_DISABLE_WRITE_GS_BASE=1`
and update document. Issue was reported in #2273.
2023-06-13 10:26:25 +08:00
8d88471c46 Implement AOT static PGO (#2243)
LLVM PGO (Profile-Guided Optimization) allows the compiler to better optimize code
for how it actually runs. This PR implements the AOT static PGO, and is tested on
Linux x86-64 and x86-32. The basic steps are:

1. Use `wamrc --enable-llvm-pgo -o <aot_file_of_pgo> <wasm_file>`
   to generate an instrumented aot file.
2. Compile iwasm with `cmake -DWAMR_BUILD_STATIC_PGO=1` and run
      `iwasm --gen-prof-file=<raw_profile_file> <aot_file_of_pgo>`
    to generate the raw profile file.
3. Run `llvm-profdata merge -output=<profile_file> <raw_profile_file>`
    to merge the raw profile file into the profile file.
4. Run `wamrc --use-prof-file=<profile_file> -o <aot_file> <wasm_file>`
    to generate the optimized aot file.
5. Run the optimized aot_file: `iwasm <aot_file>`.

The test scripts are also added for each benchmark, run `test_pgo.sh` under
each benchmark's folder to test the AOT static pgo.
2023-06-05 09:17:39 +08:00
27239723a9 Add asan and ubsan to WAMR CI (#2161)
Add nightly (UTC time) checks with asan and ubsan, and also put gcc-4.8 build
to nightly run since we don't need to run it with every PR.

Co-authored-by: Maksim Litskevich <makslit@amazon.co.uk>
2023-05-26 09:45:37 +08:00
89be5622a5 wasi-nn: Add external delegation to support several NPU/GPU (#2162)
Add VX delegation as an external delegation of TFLite, so that several NPU/GPU
(from VeriSilicon, NXP, Amlogic) can be controlled via WASI-NN.

Test Code can work with the X86 simulator.
2023-05-05 16:29:36 +08:00
bab2402b6e Fix atomic.wait, get wasi_ctx exit code and thread mgr issues (#2024)
- Remove notify_stale_threads_on_exception and change atomic.wait
  to be interruptible by keep waiting and checking every one second,
  like the implementation of poll_oneoff in libc-wasi
- Wait all other threads exit and then get wasi exit_code to avoid
  getting invalid value
- Inherit suspend_flags of parent thread while creating new thread to
  avoid terminated flag isn't set for new thread
- Fix wasi-threads test case update_shared_data_and_alloc_heap
- Add "Lib wasi-threads enabled" prompt for cmake
- Fix aot get exception, use aot_copy_exception instead
2023-03-15 07:47:36 +08:00
a15a731e12 wasi-nn: Support multiple TFLite models (#2002)
Remove restrictions:
- Only 1 WASM app at a time
- Only 1 model at a time
   - `graph` and `graph-execution-context` are ignored

Refer to previous document:
e8d718096d/core/iwasm/libraries/wasi-nn/README.md
2023-03-08 15:54:06 +08:00