- r_1 - engine_NEON1M11-0001_o2_m26_ifx - 10 analyzed loop(s)
  - Loop 18566 - engine_linux64_intel_ifx_impi
  - Loop 39475 - engine_linux64_intel_ifx_impi
  - Loop 255174 - engine_linux64_intel_ifx_impi
  - Loop 37916 - engine_linux64_intel_ifx_impi
  - Loop 256165 - engine_linux64_intel_ifx_impi
  - Loop 19710 - engine_linux64_intel_ifx_impi
  - Loop 38054 - engine_linux64_intel_ifx_impi
  - Loop 121792 - engine_linux64_intel_ifx_impi
  - Loop 19918 - engine_linux64_intel_ifx_impi
  - Loop 167135 - engine_linux64_intel_ifx_impi
 
- r_2 - engine_NEON1M11-0001_o2_m26_ifort - 10 analyzed loop(s)
  - Loop 15282 - engine_linux64_intel_impi
  - Loop 193162 - engine_linux64_intel_impi
  - Loop 30046 - engine_linux64_intel_impi
  - Loop 28971 - engine_linux64_intel_impi
  - Loop 97970 - engine_linux64_intel_impi
  - Loop 15758 - engine_linux64_intel_impi
  - Loop 98506 - engine_linux64_intel_impi
  - Loop 29120 - engine_linux64_intel_impi
  - Loop 97971 - engine_linux64_intel_impi
  - Loop 92421 - engine_linux64_intel_impi
 
 
| Analysis | Count | Percentage | Weighted Count |
|---|---|---|---|
| **Loop Computation Issues** | 19 |  |  |
| Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 10 | 50.00 | 0.35 |
| Presence of a large number of scalar integer instructions | 5 | 25.00 | 0.14 |
| Presence of expensive FP instructions | 2 | 10.00 | 0.03 |
| Large loop body over microp cache size | 1 | 5.00 | 0.01 |
| Bottleneck in the front-end | 1 | 5.00 | 0.01 |
| **Control Flow Issues** | 9 |  |  |
| Presence of 2 to 4 paths | 4 | 20.00 | 0.07 |
| Presence of calls | 3 | 15.00 | 0.17 |
| Presence of more than 4 paths | 1 | 5.00 | 0.02 |
| Non-innermost loop | 1 | 5.00 | 0.01 |
| **Data Access Issues** | 23 |  |  |
| More than 20% of the loads are accessing the stack | 8 | 40.00 | 0.35 |
| Presence of indirect access | 4 | 20.00 | 0.12 |
| Presence of constant non-unit stride data access | 4 | 20.00 | 0.13 |
| Presence of special instructions executing on a single port | 3 | 15.00 | 0.06 |
| More than 10% of the vector loads instructions are unaligned | 3 | 15.00 | 0.06 |
| Presence of expensive instructions: scatter/gather | 1 | 5.00 | 0.03 |
| **Vectorization Roadblocks** | 20 |  |  |
| Presence of more than 4 paths | 4 | 20.00 | 0.19 |
| Presence of 2 to 4 paths | 4 | 20.00 | 0.07 |
| Presence of constant non-unit stride data access | 4 | 20.00 | 0.13 |
| Presence of indirect access | 4 | 20.00 | 0.12 |
| Presence of calls | 3 | 15.00 | 0.17 |
| Non-innermost loop | 1 | 5.00 | 0.01 |
| **Inefficient Vectorization** | 5 |  |  |
| Presence of special instructions executing on a single port | 3 | 15.00 | 0.06 |
| Use of masked instructions | 1 | 5.00 | 0.01 |
| Presence of expensive instructions: scatter/gather | 1 | 5.00 | 0.03 |
 
 
| Category | Analysis | r_1 | r_2 |
|---|---|---|---|
| Loop Computation Issues | Presence of expensive FP instructions | 1 | 1 |
|  | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 5 | 5 |
|  | Large loop body over microp cache size | 1 | 0 |
|  | Presence of a large number of scalar integer instructions | 2 | 3 |
|  | Bottleneck in the front-end | 1 | 0 |
| Control Flow Issues | Presence of calls | 2 | 1 |
|  | Presence of 2 to 4 paths | 2 | 2 |
|  | Presence of more than 4 paths | 0 | 1 |
|  | Non-innermost loop | 1 | 0 |
| Data Access Issues | Presence of constant non-unit stride data access | 2 | 2 |
|  | Presence of indirect access | 2 | 2 |
|  | More than 10% of the vector loads instructions are unaligned | 3 | 0 |
|  | Presence of expensive instructions: scatter/gather | 0 | 1 |
|  | Presence of special instructions executing on a single port | 3 | 0 |
|  | More than 20% of the loads are accessing the stack | 3 | 5 |
| Vectorization Roadblocks | Presence of calls | 2 | 1 |
|  | Presence of 2 to 4 paths | 2 | 2 |
|  | Presence of more than 4 paths | 2 | 2 |
|  | Non-innermost loop | 1 | 0 |
|  | Presence of constant non-unit stride data access | 2 | 2 |
|  | Presence of indirect access | 2 | 2 |
| Inefficient Vectorization | Presence of expensive instructions: scatter/gather | 0 | 1 |
|  | Presence of special instructions executing on a single port | 3 | 0 |
|  | Use of masked instructions | 1 | 0 |