Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66. 
If this is incorrect, rerun with number-processes-per-node=X
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 1, "n_threads_batch": 1, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 146.756195, "speed_pp": 13.955118, "t_tg": 0.000000, "speed_tg": nan, "t": 146.756195, "speed": 13.955118}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_0

To display your profiling results:
###########################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                         COMMAND                                                                                          #
###########################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_0  #
###########################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 18369 tid 18369 thread 0 bound to OS proc set {0}
OMP: pid 18369 tid 18436 thread 1 bound to OS proc set {32}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 2, "n_threads_batch": 2, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 73.375374, "speed_pp": 27.911272, "t_tg": 0.000000, "speed_tg": nan, "t": 73.375374, "speed": 27.911272}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_1

To display your profiling results:
###########################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                         COMMAND                                                                                          #
###########################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_1      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_1  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_1  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_1  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_1      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_1  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_1  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_1  #
###########################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 18464 tid 18464 thread 0 bound to OS proc set {0}
OMP: pid 18464 tid 18532 thread 2 bound to OS proc set {32}
OMP: pid 18464 tid 18531 thread 1 bound to OS proc set {16}
OMP: pid 18464 tid 18533 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 36.727261, "speed_pp": 55.762394, "t_tg": 0.000000, "speed_tg": nan, "t": 36.727261, "speed": 55.762394}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_2

To display your profiling results:
###########################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                         COMMAND                                                                                          #
###########################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_2      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_2  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_2  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_2  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_2      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_2  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_2  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_2  #
###########################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 18560 tid 18560 thread 0 bound to OS proc set {0}
OMP: pid 18560 tid 18627 thread 1 bound to OS proc set {8}
OMP: pid 18560 tid 18629 thread 3 bound to OS proc set {24}
OMP: pid 18560 tid 18628 thread 2 bound to OS proc set {16}
OMP: pid 18560 tid 18630 thread 4 bound to OS proc set {32}
OMP: pid 18560 tid 18632 thread 6 bound to OS proc set {48}
OMP: pid 18560 tid 18631 thread 5 bound to OS proc set {40}
OMP: pid 18560 tid 18633 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 18.489580, "speed_pp": 110.765091, "t_tg": 0.000000, "speed_tg": nan, "t": 18.489580, "speed": 110.765091}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_3

To display your profiling results:
###########################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                         COMMAND                                                                                          #
###########################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_3      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_3  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_3  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_3  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_3      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_3  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_3  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_3  #
###########################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 18660 tid 18660 thread 0 bound to OS proc set {0}
OMP: pid 18660 tid 18727 thread 1 bound to OS proc set {4}
OMP: pid 18660 tid 18735 thread 9 bound to OS proc set {36}
OMP: pid 18660 tid 18729 thread 3 bound to OS proc set {12}
OMP: pid 18660 tid 18728 thread 2 bound to OS proc set {8}
OMP: pid 18660 tid 18734 thread 8 bound to OS proc set {32}
OMP: pid 18660 tid 18730 thread 4 bound to OS proc set {16}
OMP: pid 18660 tid 18731 thread 5 bound to OS proc set {20}
OMP: pid 18660 tid 18736 thread 10 bound to OS proc set {40}
OMP: pid 18660 tid 18733 thread 7 bound to OS proc set {28}
OMP: pid 18660 tid 18732 thread 6 bound to OS proc set {24}
OMP: pid 18660 tid 18739 thread 13 bound to OS proc set {52}
OMP: pid 18660 tid 18737 thread 11 bound to OS proc set {44}
OMP: pid 18660 tid 18740 thread 14 bound to OS proc set {56}
OMP: pid 18660 tid 18741 thread 15 bound to OS proc set {60}
OMP: pid 18660 tid 18738 thread 12 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 9.335571, "speed_pp": 219.375977, "t_tg": 0.000001, "speed_tg": 0.000000, "t": 9.335572, "speed": 219.375946}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_4

To display your profiling results:
###########################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                         COMMAND                                                                                          #
###########################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_4      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_4  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_4  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_4  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_4      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_4  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_4  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_4  #
###########################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 18769 tid 18769 thread 0 bound to OS proc set {0}
OMP: pid 18769 tid 18836 thread 1 bound to OS proc set {2}
OMP: pid 18769 tid 18837 thread 2 bound to OS proc set {5}
OMP: pid 18769 tid 18840 thread 5 bound to OS proc set {13}
OMP: pid 18769 tid 18839 thread 4 bound to OS proc set {10}
OMP: pid 18769 tid 18844 thread 9 bound to OS proc set {24}
OMP: pid 18769 tid 18841 thread 6 bound to OS proc set {16}
OMP: pid 18769 tid 18847 thread 12 bound to OS proc set {32}
OMP: pid 18769 tid 18848 thread 13 bound to OS proc set {35}
OMP: pid 18769 tid 18842 thread 7 bound to OS proc set {18}
OMP: pid 18769 tid 18838 thread 3 bound to OS proc set {8}
OMP: pid 18769 tid 18852 thread 17 bound to OS proc set {46}
OMP: pid 18769 tid 18846 thread 11 bound to OS proc set {29}
OMP: pid 18769 tid 18849 thread 14 bound to OS proc set {37}
OMP: pid 18769 tid 18845 thread 10 bound to OS proc set {27}
OMP: pid 18769 tid 18851 thread 16 bound to OS proc set {43}
OMP: pid 18769 tid 18854 thread 19 bound to OS proc set {51}
OMP: pid 18769 tid 18853 thread 18 bound to OS proc set {48}
OMP: pid 18769 tid 18850 thread 15 bound to OS proc set {40}
OMP: pid 18769 tid 18843 thread 8 bound to OS proc set {21}
OMP: pid 18769 tid 18856 thread 21 bound to OS proc set {56}
OMP: pid 18769 tid 18855 thread 20 bound to OS proc set {54}
OMP: pid 18769 tid 18857 thread 22 bound to OS proc set {59}
OMP: pid 18769 tid 18858 thread 23 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 7.042750, "speed_pp": 290.795502, "t_tg": 0.000000, "speed_tg": nan, "t": 7.042750, "speed": 290.795502}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_5

To display your profiling results:
###########################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                         COMMAND                                                                                          #
###########################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_5      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_5  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_5  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_5  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_5      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_5  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_5  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_5  #
###########################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 18885 tid 18885 thread 0 bound to OS proc set {0}
OMP: pid 18885 tid 18954 thread 3 bound to OS proc set {6}
OMP: pid 18885 tid 18952 thread 1 bound to OS proc set {2}
OMP: pid 18885 tid 18955 thread 4 bound to OS proc set {8}
OMP: pid 18885 tid 18953 thread 2 bound to OS proc set {4}
OMP: pid 18885 tid 18964 thread 13 bound to OS proc set {26}
OMP: pid 18885 tid 18963 thread 12 bound to OS proc set {24}
OMP: pid 18885 tid 18956 thread 5 bound to OS proc set {10}
OMP: pid 18885 tid 18966 thread 15 bound to OS proc set {30}
OMP: pid 18885 tid 18962 thread 11 bound to OS proc set {22}
OMP: pid 18885 tid 18965 thread 14 bound to OS proc set {28}
OMP: pid 18885 tid 18957 thread 6 bound to OS proc set {12}
OMP: pid 18885 tid 18960 thread 9 bound to OS proc set {18}
OMP: pid 18885 tid 18968 thread 17 bound to OS proc set {34}
OMP: pid 18885 tid 18958 thread 7 bound to OS proc set {14}
OMP: pid 18885 tid 18970 thread 19 bound to OS proc set {38}
OMP: pid 18885 tid 18969 thread 18 bound to OS proc set {36}
OMP: pid 18885 tid 18961 thread 10 bound to OS proc set {20}
OMP: pid 18885 tid 18967 thread 16 bound to OS proc set {32}
OMP: pid 18885 tid 18959 thread 8 bound to OS proc set {16}
OMP: pid 18885 tid 18971 thread 20 bound to OS proc set {40}
OMP: pid 18885 tid 18972 thread 21 bound to OS proc set {42}
OMP: pid 18885 tid 18973 thread 22 bound to OS proc set {44}
OMP: pid 18885 tid 18974 thread 23 bound to OS proc set {46}
OMP: pid 18885 tid 18976 thread 25 bound to OS proc set {50}
OMP: pid 18885 tid 18975 thread 24 bound to OS proc set {48}
OMP: pid 18885 tid 18979 thread 28 bound to OS proc set {56}
OMP: pid 18885 tid 18977 thread 26 bound to OS proc set {52}
OMP: pid 18885 tid 18978 thread 27 bound to OS proc set {54}
OMP: pid 18885 tid 18982 thread 31 bound to OS proc set {62}
OMP: pid 18885 tid 18980 thread 29 bound to OS proc set {58}
OMP: pid 18885 tid 18981 thread 30 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 5.612815, "speed_pp": 364.879303, "t_tg": 0.000000, "speed_tg": nan, "t": 5.612815, "speed": 364.879303}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_6

To display your profiling results:
###########################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                         COMMAND                                                                                          #
###########################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_6      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_6  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_6  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_6  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_6      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_6  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_6  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_6  #
###########################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 19009 tid 19076 thread 1 bound to OS proc set {1}
OMP: pid 19009 tid 19009 thread 0 bound to OS proc set {0}
OMP: pid 19009 tid 19077 thread 2 bound to OS proc set {3}
OMP: pid 19009 tid 19078 thread 3 bound to OS proc set {4}
OMP: pid 19009 tid 19080 thread 5 bound to OS proc set {8}
OMP: pid 19009 tid 19083 thread 8 bound to OS proc set {13}
OMP: pid 19009 tid 19079 thread 4 bound to OS proc set {6}
OMP: pid 19009 tid 19082 thread 7 bound to OS proc set {11}
OMP: pid 19009 tid 19081 thread 6 bound to OS proc set {9}
OMP: pid 19009 tid 19089 thread 14 bound to OS proc set {22}
OMP: pid 19009 tid 19087 thread 12 bound to OS proc set {19}
OMP: pid 19009 tid 19085 thread 10 bound to OS proc set {16}
OMP: pid 19009 tid 19086 thread 11 bound to OS proc set {17}
OMP: pid 19009 tid 19090 thread 15 bound to OS proc set {24}
OMP: pid 19009 tid 19092 thread 17 bound to OS proc set {27}
OMP: pid 19009 tid 19107 thread 32 bound to OS proc set {52}
OMP: pid 19009 tid 19108 thread 33 bound to OS proc set {53}
OMP: pid 19009 tid 19084 thread 9 bound to OS proc set {14}
OMP: pid 19009 tid 19109 thread 34 bound to OS proc set {55}
OMP: pid 19009 tid 19110 thread 35 bound to OS proc set {56}
OMP: pid 19009 tid 19088 thread 13 bound to OS proc set {21}
OMP: pid 19009 tid 19094 thread 19 bound to OS proc set {30}
OMP: pid 19009 tid 19097 thread 22 bound to OS proc set {35}
OMP: pid 19009 tid 19093 thread 18 bound to OS proc set {29}
OMP: pid 19009 tid 19100 thread 25 bound to OS proc set {40}
OMP: pid 19009 tid 19095 thread 20 bound to OS proc set {32}
OMP: pid 19009 tid 19103 thread 28 bound to OS proc set {45}
OMP: pid 19009 tid 19099 thread 24 bound to OS proc set {39}
OMP: pid 19009 tid 19101 thread 26 bound to OS proc set {42}
OMP: pid 19009 tid 19091 thread 16 bound to OS proc set {26}
OMP: pid 19009 tid 19112 thread 37 bound to OS proc set {60}
OMP: pid 19009 tid 19096 thread 21 bound to OS proc set {34}
OMP: pid 19009 tid 19106 thread 31 bound to OS proc set {50}
OMP: pid 19009 tid 19104 thread 29 bound to OS proc set {47}
OMP: pid 19009 tid 19098 thread 23 bound to OS proc set {37}
OMP: pid 19009 tid 19105 thread 30 bound to OS proc set {48}
OMP: pid 19009 tid 19111 thread 36 bound to OS proc set {58}
OMP: pid 19009 tid 19113 thread 38 bound to OS proc set {61}
OMP: pid 19009 tid 19102 thread 27 bound to OS proc set {43}
OMP: pid 19009 tid 19114 thread 39 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 4.743154, "speed_pp": 431.780212, "t_tg": 0.000000, "speed_tg": nan, "t": 4.743154, "speed": 431.780212}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_7

To display your profiling results:
###########################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                         COMMAND                                                                                          #
###########################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_7      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_7  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_7  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_7  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_7      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_7  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_7  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_7  #
###########################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 19141 tid 19208 thread 1 bound to OS proc set {1}
OMP: pid 19141 tid 19209 thread 2 bound to OS proc set {2}
OMP: pid 19141 tid 19141 thread 0 bound to OS proc set {0}
OMP: pid 19141 tid 19210 thread 3 bound to OS proc set {4}
OMP: pid 19141 tid 19211 thread 4 bound to OS proc set {5}
OMP: pid 19141 tid 19213 thread 6 bound to OS proc set {8}
OMP: pid 19141 tid 19212 thread 5 bound to OS proc set {6}
OMP: pid 19141 tid 19214 thread 7 bound to OS proc set {9}
OMP: pid 19141 tid 19218 thread 11 bound to OS proc set {14}
OMP: pid 19141 tid 19217 thread 10 bound to OS proc set {13}
OMP: pid 19141 tid 19216 thread 9 bound to OS proc set {12}
OMP: pid 19141 tid 19215 thread 8 bound to OS proc set {10}
OMP: pid 19141 tid 19220 thread 13 bound to OS proc set {17}
OMP: pid 19141 tid 19224 thread 17 bound to OS proc set {23}
OMP: pid 19141 tid 19221 thread 14 bound to OS proc set {18}
OMP: pid 19141 tid 19241 thread 34 bound to OS proc set {46}
OMP: pid 19141 tid 19226 thread 19 bound to OS proc set {25}
OMP: pid 19141 tid 19219 thread 12 bound to OS proc set {16}
OMP: pid 19141 tid 19240 thread 33 bound to OS proc set {44}
OMP: pid 19141 tid 19244 thread 37 bound to OS proc set {50}
OMP: pid 19141 tid 19223 thread 16 bound to OS proc set {21}
OMP: pid 19141 tid 19222 thread 15 bound to OS proc set {20}
OMP: pid 19141 tid 19225 thread 18 bound to OS proc set {24}
OMP: pid 19141 tid 19229 thread 22 bound to OS proc set {29}
OMP: pid 19141 tid 19235 thread 28 bound to OS proc set {37}
OMP: pid 19141 tid 19239 thread 32 bound to OS proc set {43}
OMP: pid 19141 tid 19245 thread 38 bound to OS proc set {51}
OMP: pid 19141 tid 19236 thread 29 bound to OS proc set {39}
OMP: pid 19141 tid 19227 thread 20 bound to OS proc set {27}
OMP: pid 19141 tid 19246 thread 39 bound to OS proc set {52}
OMP: pid 19141 tid 19243 thread 36 bound to OS proc set {48}
OMP: pid 19141 tid 19228 thread 21 bound to OS proc set {28}
OMP: pid 19141 tid 19231 thread 24 bound to OS proc set {32}
OMP: pid 19141 tid 19242 thread 35 bound to OS proc set {47}
OMP: pid 19141 tid 19247 thread 40 bound to OS proc set {54}
OMP: pid 19141 tid 19248 thread 41 bound to OS proc set {55}
OMP: pid 19141 tid 19233 thread 26 bound to OS proc set {35}
OMP: pid 19141 tid 19252 thread 45 bound to OS proc set {60}
OMP: pid 19141 tid 19251 thread 44 bound to OS proc set {59}
OMP: pid 19141 tid 19249 thread 42 bound to OS proc set {56}
OMP: pid 19141 tid 19232 thread 25 bound to OS proc set {33}
OMP: pid 19141 tid 19234 thread 27 bound to OS proc set {36}
OMP: pid 19141 tid 19238 thread 31 bound to OS proc set {41}
OMP: pid 19141 tid 19230 thread 23 bound to OS proc set {31}
OMP: pid 19141 tid 19237 thread 30 bound to OS proc set {40}
OMP: pid 19141 tid 19250 thread 43 bound to OS proc set {58}
OMP: pid 19141 tid 19253 thread 46 bound to OS proc set {62}
OMP: pid 19141 tid 19254 thread 47 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 4.127825, "speed_pp": 496.145081, "t_tg": 0.000000, "speed_tg": nan, "t": 4.127825, "speed": 496.145081}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_8

To display your profiling results:
###########################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                         COMMAND                                                                                          #
###########################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_8      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_8  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_8  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_8  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_8      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_8  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_8  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_8  #
###########################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 19381 tid 19448 thread 1 bound to OS proc set {1}
OMP: pid 19381 tid 19450 thread 3 bound to OS proc set {3}
OMP: pid 19381 tid 19449 thread 2 bound to OS proc set {2}
OMP: pid 19381 tid 19381 thread 0 bound to OS proc set {0}
OMP: pid 19381 tid 19451 thread 4 bound to OS proc set {4}
OMP: pid 19381 tid 19453 thread 6 bound to OS proc set {6}
OMP: pid 19381 tid 19452 thread 5 bound to OS proc set {5}
OMP: pid 19381 tid 19459 thread 12 bound to OS proc set {13}
OMP: pid 19381 tid 19455 thread 8 bound to OS proc set {9}
OMP: pid 19381 tid 19456 thread 9 bound to OS proc set {10}
OMP: pid 19381 tid 19461 thread 14 bound to OS proc set {16}
OMP: pid 19381 tid 19458 thread 11 bound to OS proc set {12}
OMP: pid 19381 tid 19457 thread 10 bound to OS proc set {11}
OMP: pid 19381 tid 19454 thread 7 bound to OS proc set {8}
OMP: pid 19381 tid 19464 thread 17 bound to OS proc set {19}
OMP: pid 19381 tid 19465 thread 18 bound to OS proc set {20}
OMP: pid 19381 tid 19481 thread 34 bound to OS proc set {39}
OMP: pid 19381 tid 19471 thread 24 bound to OS proc set {27}
OMP: pid 19381 tid 19460 thread 13 bound to OS proc set {15}
OMP: pid 19381 tid 19462 thread 15 bound to OS proc set {17}
OMP: pid 19381 tid 19470 thread 23 bound to OS proc set {26}
OMP: pid 19381 tid 19497 thread 50 bound to OS proc set {58}
OMP: pid 19381 tid 19480 thread 33 bound to OS proc set {38}
OMP: pid 19381 tid 19463 thread 16 bound to OS proc set {18}
OMP: pid 19381 tid 19479 thread 32 bound to OS proc set {37}
OMP: pid 19381 tid 19495 thread 48 bound to OS proc set {55}
OMP: pid 19381 tid 19496 thread 49 bound to OS proc set {56}
OMP: pid 19381 tid 19466 thread 19 bound to OS proc set {22}
OMP: pid 19381 tid 19467 thread 20 bound to OS proc set {23}
OMP: pid 19381 tid 19498 thread 51 bound to OS proc set {59}
OMP: pid 19381 tid 19468 thread 21 bound to OS proc set {24}
OMP: pid 19381 tid 19475 thread 28 bound to OS proc set {32}
OMP: pid 19381 tid 19476 thread 29 bound to OS proc set {33}
OMP: pid 19381 tid 19472 thread 25 bound to OS proc set {29}
OMP: pid 19381 tid 19482 thread 35 bound to OS proc set {40}
OMP: pid 19381 tid 19474 thread 27 bound to OS proc set {31}
OMP: pid 19381 tid 19473 thread 26 bound to OS proc set {30}
OMP: pid 19381 tid 19483 thread 36 bound to OS proc set {41}
OMP: pid 19381 tid 19478 thread 31 bound to OS proc set {35}
OMP: pid 19381 tid 19492 thread 45 bound to OS proc set {52}
OMP: pid 19381 tid 19484 thread 37 bound to OS proc set {42}
OMP: pid 19381 tid 19469 thread 22 bound to OS proc set {25}
OMP: pid 19381 tid 19491 thread 44 bound to OS proc set {51}
OMP: pid 19381 tid 19477 thread 30 bound to OS proc set {34}
OMP: pid 19381 tid 19490 thread 43 bound to OS proc set {49}
OMP: pid 19381 tid 19488 thread 41 bound to OS proc set {47}
OMP: pid 19381 tid 19486 thread 39 bound to OS proc set {45}
OMP: pid 19381 tid 19489 thread 42 bound to OS proc set {48}
OMP: pid 19381 tid 19493 thread 46 bound to OS proc set {53}
OMP: pid 19381 tid 19494 thread 47 bound to OS proc set {54}
OMP: pid 19381 tid 19485 thread 38 bound to OS proc set {44}
OMP: pid 19381 tid 19487 thread 40 bound to OS proc set {46}
OMP: pid 19381 tid 19499 thread 52 bound to OS proc set {60}
OMP: pid 19381 tid 19500 thread 53 bound to OS proc set {61}
OMP: pid 19381 tid 19502 thread 55 bound to OS proc set {63}
OMP: pid 19381 tid 19501 thread 54 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 3.633005, "speed_pp": 563.720703, "t_tg": 0.000000, "speed_tg": nan, "t": 3.633005, "speed": 563.720703}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_9

To display your profiling results:
###########################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                         COMMAND                                                                                          #
###########################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_9      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_9  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_9  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_9  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_9      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_9  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_9  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_9  #
###########################################################################################################################################################################################################################


* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66. 
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 19534 tid 19602 thread 2 bound to OS proc set {2}
OMP: pid 19534 tid 19601 thread 1 bound to OS proc set {1}
OMP: pid 19534 tid 19534 thread 0 bound to OS proc set {0}
OMP: pid 19534 tid 19604 thread 4 bound to OS proc set {4}
OMP: pid 19534 tid 19605 thread 5 bound to OS proc set {5}
OMP: pid 19534 tid 19603 thread 3 bound to OS proc set {3}
OMP: pid 19534 tid 19608 thread 8 bound to OS proc set {8}
OMP: pid 19534 tid 19607 thread 7 bound to OS proc set {7}
OMP: pid 19534 tid 19606 thread 6 bound to OS proc set {6}
OMP: pid 19534 tid 19609 thread 9 bound to OS proc set {9}
OMP: pid 19534 tid 19610 thread 10 bound to OS proc set {10}
OMP: pid 19534 tid 19612 thread 12 bound to OS proc set {12}
OMP: pid 19534 tid 19614 thread 14 bound to OS proc set {14}
OMP: pid 19534 tid 19613 thread 13 bound to OS proc set {13}
OMP: pid 19534 tid 19611 thread 11 bound to OS proc set {11}
OMP: pid 19534 tid 19615 thread 15 bound to OS proc set {15}
OMP: pid 19534 tid 19617 thread 17 bound to OS proc set {17}
OMP: pid 19534 tid 19619 thread 19 bound to OS proc set {19}
OMP: pid 19534 tid 19618 thread 18 bound to OS proc set {18}
OMP: pid 19534 tid 19633 thread 33 bound to OS proc set {33}
OMP: pid 19534 tid 19649 thread 49 bound to OS proc set {49}
OMP: pid 19534 tid 19634 thread 34 bound to OS proc set {34}
OMP: pid 19534 tid 19621 thread 21 bound to OS proc set {21}
OMP: pid 19534 tid 19650 thread 50 bound to OS proc set {50}
OMP: pid 19534 tid 19635 thread 35 bound to OS proc set {35}
OMP: pid 19534 tid 19620 thread 20 bound to OS proc set {20}
OMP: pid 19534 tid 19651 thread 51 bound to OS proc set {51}
OMP: pid 19534 tid 19622 thread 22 bound to OS proc set {22}
OMP: pid 19534 tid 19624 thread 24 bound to OS proc set {24}
OMP: pid 19534 tid 19616 thread 16 bound to OS proc set {16}
OMP: pid 19534 tid 19625 thread 25 bound to OS proc set {25}
OMP: pid 19534 tid 19623 thread 23 bound to OS proc set {23}
OMP: pid 19534 tid 19640 thread 40 bound to OS proc set {40}
OMP: pid 19534 tid 19641 thread 41 bound to OS proc set {41}
OMP: pid 19534 tid 19626 thread 26 bound to OS proc set {26}
OMP: pid 19534 tid 19637 thread 37 bound to OS proc set {37}
OMP: pid 19534 tid 19642 thread 42 bound to OS proc set {42}
OMP: pid 19534 tid 19632 thread 32 bound to OS proc set {32}
OMP: pid 19534 tid 19648 thread 48 bound to OS proc set {48}
OMP: pid 19534 tid 19628 thread 28 bound to OS proc set {28}
OMP: pid 19534 tid 19639 thread 39 bound to OS proc set {39}
OMP: pid 19534 tid 19643 thread 43 bound to OS proc set {43}
OMP: pid 19534 tid 19638 thread 38 bound to OS proc set {38}
OMP: pid 19534 tid 19630 thread 30 bound to OS proc set {30}
OMP: pid 19534 tid 19631 thread 31 bound to OS proc set {31}
OMP: pid 19534 tid 19636 thread 36 bound to OS proc set {36}
OMP: pid 19534 tid 19645 thread 45 bound to OS proc set {45}
OMP: pid 19534 tid 19653 thread 53 bound to OS proc set {53}
OMP: pid 19534 tid 19652 thread 52 bound to OS proc set {52}
OMP: pid 19534 tid 19656 thread 56 bound to OS proc set {56}
OMP: pid 19534 tid 19657 thread 57 bound to OS proc set {57}
OMP: pid 19534 tid 19654 thread 54 bound to OS proc set {54}
OMP: pid 19534 tid 19629 thread 29 bound to OS proc set {29}
OMP: pid 19534 tid 19646 thread 46 bound to OS proc set {46}
OMP: pid 19534 tid 19655 thread 55 bound to OS proc set {55}
OMP: pid 19534 tid 19644 thread 44 bound to OS proc set {44}
OMP: pid 19534 tid 19627 thread 27 bound to OS proc set {27}
OMP: pid 19534 tid 19647 thread 47 bound to OS proc set {47}
OMP: pid 19534 tid 19662 thread 62 bound to OS proc set {62}
OMP: pid 19534 tid 19658 thread 58 bound to OS proc set {58}
OMP: pid 19534 tid 19660 thread 60 bound to OS proc set {60}
OMP: pid 19534 tid 19663 thread 63 bound to OS proc set {63}
OMP: pid 19534 tid 19659 thread 59 bound to OS proc set {59}
OMP: pid 19534 tid 19661 thread 61 bound to OS proc set {61}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 3.242427, "speed_pp": 631.625610, "t_tg": 0.000000, "speed_tg": nan, "t": 3.242427, "speed": 631.625610}





Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_10

To display your profiling results:
############################################################################################################################################################################################################################
#    LEVEL    |     REPORT     |                                                                                          COMMAND                                                                                          #
############################################################################################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_10      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_10  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_10  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_10  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_10      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_10  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_10  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-131-5415/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761317305/tools/lprof_npsu_run_10  #
############################################################################################################################################################################################################################
