* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 38915 tid 38915 thread 0 bound to OS proc set {0}
OMP: pid 38915 tid 39014 thread 1 bound to OS proc set {24}
OMP: pid 38915 tid 39015 thread 2 bound to OS proc set {48}
OMP: pid 38915 tid 39016 thread 3 bound to OS proc set {72}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 31.160095, "speed_pp": 65.725090, "t_tg": 0.000000, "speed_tg": nan, "t": 31.160095, "speed": 65.725090}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_2
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_2 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 39038 tid 39038 thread 0 bound to OS proc set {0}
OMP: pid 39038 tid 39139 thread 3 bound to OS proc set {36}
OMP: pid 39038 tid 39137 thread 1 bound to OS proc set {12}
OMP: pid 39038 tid 39138 thread 2 bound to OS proc set {24}
OMP: pid 39038 tid 39140 thread 4 bound to OS proc set {48}
OMP: pid 39038 tid 39141 thread 5 bound to OS proc set {60}
OMP: pid 39038 tid 39142 thread 6 bound to OS proc set {72}
OMP: pid 39038 tid 39143 thread 7 bound to OS proc set {84}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 15.661391, "speed_pp": 130.767441, "t_tg": 0.000000, "speed_tg": nan, "t": 15.661391, "speed": 130.767441}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_3
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_3 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 39213 tid 39213 thread 0 bound to OS proc set {0}
OMP: pid 39213 tid 39312 thread 1 bound to OS proc set {6}
OMP: pid 39213 tid 39314 thread 3 bound to OS proc set {18}
OMP: pid 39213 tid 39316 thread 5 bound to OS proc set {30}
OMP: pid 39213 tid 39315 thread 4 bound to OS proc set {24}
OMP: pid 39213 tid 39313 thread 2 bound to OS proc set {12}
OMP: pid 39213 tid 39317 thread 6 bound to OS proc set {36}
OMP: pid 39213 tid 39319 thread 8 bound to OS proc set {48}
OMP: pid 39213 tid 39320 thread 9 bound to OS proc set {54}
OMP: pid 39213 tid 39321 thread 10 bound to OS proc set {60}
OMP: pid 39213 tid 39323 thread 12 bound to OS proc set {72}
OMP: pid 39213 tid 39324 thread 13 bound to OS proc set {78}
OMP: pid 39213 tid 39322 thread 11 bound to OS proc set {66}
OMP: pid 39213 tid 39325 thread 14 bound to OS proc set {84}
OMP: pid 39213 tid 39326 thread 15 bound to OS proc set {90}
OMP: pid 39213 tid 39318 thread 7 bound to OS proc set {42}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 7.923562, "speed_pp": 258.469604, "t_tg": 0.000000, "speed_tg": nan, "t": 7.923562, "speed": 258.469604}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_4
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_4 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 39347 tid 39347 thread 0 bound to OS proc set {0}
OMP: pid 39347 tid 39446 thread 1 bound to OS proc set {4}
OMP: pid 39347 tid 39450 thread 5 bound to OS proc set {20}
OMP: pid 39347 tid 39447 thread 2 bound to OS proc set {8}
OMP: pid 39347 tid 39454 thread 9 bound to OS proc set {36}
OMP: pid 39347 tid 39451 thread 6 bound to OS proc set {24}
OMP: pid 39347 tid 39448 thread 3 bound to OS proc set {12}
OMP: pid 39347 tid 39453 thread 8 bound to OS proc set {32}
OMP: pid 39347 tid 39452 thread 7 bound to OS proc set {28}
OMP: pid 39347 tid 39455 thread 10 bound to OS proc set {40}
OMP: pid 39347 tid 39456 thread 11 bound to OS proc set {44}
OMP: pid 39347 tid 39457 thread 12 bound to OS proc set {48}
OMP: pid 39347 tid 39458 thread 13 bound to OS proc set {52}
OMP: pid 39347 tid 39462 thread 17 bound to OS proc set {68}
OMP: pid 39347 tid 39459 thread 14 bound to OS proc set {56}
OMP: pid 39347 tid 39449 thread 4 bound to OS proc set {16}
OMP: pid 39347 tid 39460 thread 15 bound to OS proc set {60}
OMP: pid 39347 tid 39463 thread 18 bound to OS proc set {72}
OMP: pid 39347 tid 39461 thread 16 bound to OS proc set {64}
OMP: pid 39347 tid 39464 thread 19 bound to OS proc set {76}
OMP: pid 39347 tid 39465 thread 20 bound to OS proc set {80}
OMP: pid 39347 tid 39466 thread 21 bound to OS proc set {84}
OMP: pid 39347 tid 39468 thread 23 bound to OS proc set {92}
OMP: pid 39347 tid 39467 thread 22 bound to OS proc set {88}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 5.811584, "speed_pp": 352.399628, "t_tg": 0.000000, "speed_tg": nan, "t": 5.811584, "speed": 352.399628}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_5
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_5 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 39489 tid 39489 thread 0 bound to OS proc set {0}
OMP: pid 39489 tid 39589 thread 2 bound to OS proc set {6}
OMP: pid 39489 tid 39588 thread 1 bound to OS proc set {3}
OMP: pid 39489 tid 39590 thread 3 bound to OS proc set {9}
OMP: pid 39489 tid 39592 thread 5 bound to OS proc set {15}
OMP: pid 39489 tid 39595 thread 8 bound to OS proc set {24}
OMP: pid 39489 tid 39591 thread 4 bound to OS proc set {12}
OMP: pid 39489 tid 39599 thread 12 bound to OS proc set {36}
OMP: pid 39489 tid 39596 thread 9 bound to OS proc set {27}
OMP: pid 39489 tid 39593 thread 6 bound to OS proc set {18}
OMP: pid 39489 tid 39601 thread 14 bound to OS proc set {42}
OMP: pid 39489 tid 39598 thread 11 bound to OS proc set {33}
OMP: pid 39489 tid 39600 thread 13 bound to OS proc set {39}
OMP: pid 39489 tid 39597 thread 10 bound to OS proc set {30}
OMP: pid 39489 tid 39604 thread 17 bound to OS proc set {51}
OMP: pid 39489 tid 39594 thread 7 bound to OS proc set {21}
OMP: pid 39489 tid 39605 thread 18 bound to OS proc set {54}
OMP: pid 39489 tid 39602 thread 15 bound to OS proc set {45}
OMP: pid 39489 tid 39603 thread 16 bound to OS proc set {48}
OMP: pid 39489 tid 39606 thread 19 bound to OS proc set {57}
OMP: pid 39489 tid 39609 thread 22 bound to OS proc set {66}
OMP: pid 39489 tid 39611 thread 24 bound to OS proc set {72}
OMP: pid 39489 tid 39607 thread 20 bound to OS proc set {60}
OMP: pid 39489 tid 39613 thread 26 bound to OS proc set {78}
OMP: pid 39489 tid 39610 thread 23 bound to OS proc set {69}
OMP: pid 39489 tid 39615 thread 28 bound to OS proc set {84}
OMP: pid 39489 tid 39614 thread 27 bound to OS proc set {81}
OMP: pid 39489 tid 39612 thread 25 bound to OS proc set {75}
OMP: pid 39489 tid 39616 thread 29 bound to OS proc set {87}
OMP: pid 39489 tid 39608 thread 21 bound to OS proc set {63}
OMP: pid 39489 tid 39618 thread 31 bound to OS proc set {93}
OMP: pid 39489 tid 39617 thread 30 bound to OS proc set {90}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 4.601703, "speed_pp": 445.052612, "t_tg": 0.000000, "speed_tg": nan, "t": 4.601703, "speed": 445.052612}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_6
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_6 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 39687 tid 39687 thread 0 bound to OS proc set {0}
OMP: pid 39687 tid 39786 thread 1 bound to OS proc set {2}
OMP: pid 39687 tid 39788 thread 3 bound to OS proc set {7}
OMP: pid 39687 tid 39787 thread 2 bound to OS proc set {4}
OMP: pid 39687 tid 39789 thread 4 bound to OS proc set {9}
OMP: pid 39687 tid 39790 thread 5 bound to OS proc set {12}
OMP: pid 39687 tid 39794 thread 9 bound to OS proc set {21}
OMP: pid 39687 tid 39798 thread 13 bound to OS proc set {31}
OMP: pid 39687 tid 39795 thread 10 bound to OS proc set {24}
OMP: pid 39687 tid 39797 thread 12 bound to OS proc set {29}
OMP: pid 39687 tid 39791 thread 6 bound to OS proc set {14}
OMP: pid 39687 tid 39796 thread 11 bound to OS proc set {26}
OMP: pid 39687 tid 39802 thread 17 bound to OS proc set {41}
OMP: pid 39687 tid 39792 thread 7 bound to OS proc set {16}
OMP: pid 39687 tid 39799 thread 14 bound to OS proc set {33}
OMP: pid 39687 tid 39793 thread 8 bound to OS proc set {19}
OMP: pid 39687 tid 39803 thread 18 bound to OS proc set {43}
OMP: pid 39687 tid 39800 thread 15 bound to OS proc set {36}
OMP: pid 39687 tid 39818 thread 33 bound to OS proc set {80}
OMP: pid 39687 tid 39819 thread 34 bound to OS proc set {82}
OMP: pid 39687 tid 39817 thread 32 bound to OS proc set {77}
OMP: pid 39687 tid 39804 thread 19 bound to OS proc set {46}
OMP: pid 39687 tid 39820 thread 35 bound to OS proc set {84}
OMP: pid 39687 tid 39801 thread 16 bound to OS proc set {38}
OMP: pid 39687 tid 39805 thread 20 bound to OS proc set {48}
OMP: pid 39687 tid 39808 thread 23 bound to OS proc set {55}
OMP: pid 39687 tid 39822 thread 37 bound to OS proc set {89}
OMP: pid 39687 tid 39806 thread 21 bound to OS proc set {50}
OMP: pid 39687 tid 39809 thread 24 bound to OS proc set {58}
OMP: pid 39687 tid 39807 thread 22 bound to OS proc set {53}
OMP: pid 39687 tid 39810 thread 25 bound to OS proc set {60}
OMP: pid 39687 tid 39813 thread 28 bound to OS proc set {67}
OMP: pid 39687 tid 39823 thread 38 bound to OS proc set {92}
OMP: pid 39687 tid 39821 thread 36 bound to OS proc set {87}
OMP: pid 39687 tid 39814 thread 29 bound to OS proc set {70}
OMP: pid 39687 tid 39811 thread 26 bound to OS proc set {63}
OMP: pid 39687 tid 39816 thread 31 bound to OS proc set {75}
OMP: pid 39687 tid 39815 thread 30 bound to OS proc set {72}
OMP: pid 39687 tid 39824 thread 39 bound to OS proc set {94}
OMP: pid 39687 tid 39812 thread 27 bound to OS proc set {65}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 3.867878, "speed_pp": 529.489319, "t_tg": 0.000000, "speed_tg": nan, "t": 3.867878, "speed": 529.489319}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_7
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_7 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 39845 tid 39845 thread 0 bound to OS proc set {0}
OMP: pid 39845 tid 39944 thread 1 bound to OS proc set {2}
OMP: pid 39845 tid 39946 thread 3 bound to OS proc set {6}
OMP: pid 39845 tid 39945 thread 2 bound to OS proc set {4}
OMP: pid 39845 tid 39947 thread 4 bound to OS proc set {8}
OMP: pid 39845 tid 39956 thread 13 bound to OS proc set {26}
OMP: pid 39845 tid 39948 thread 5 bound to OS proc set {10}
OMP: pid 39845 tid 39955 thread 12 bound to OS proc set {24}
OMP: pid 39845 tid 39954 thread 11 bound to OS proc set {22}
OMP: pid 39845 tid 39957 thread 14 bound to OS proc set {28}
OMP: pid 39845 tid 39952 thread 9 bound to OS proc set {18}
OMP: pid 39845 tid 39949 thread 6 bound to OS proc set {12}
OMP: pid 39845 tid 39958 thread 15 bound to OS proc set {30}
OMP: pid 39845 tid 39950 thread 7 bound to OS proc set {14}
OMP: pid 39845 tid 39960 thread 17 bound to OS proc set {34}
OMP: pid 39845 tid 39951 thread 8 bound to OS proc set {16}
OMP: pid 39845 tid 39976 thread 33 bound to OS proc set {66}
OMP: pid 39845 tid 39977 thread 34 bound to OS proc set {68}
OMP: pid 39845 tid 39959 thread 16 bound to OS proc set {32}
OMP: pid 39845 tid 39963 thread 20 bound to OS proc set {40}
OMP: pid 39845 tid 39953 thread 10 bound to OS proc set {20}
OMP: pid 39845 tid 39978 thread 35 bound to OS proc set {70}
OMP: pid 39845 tid 39975 thread 32 bound to OS proc set {64}
OMP: pid 39845 tid 39962 thread 19 bound to OS proc set {38}
OMP: pid 39845 tid 39964 thread 21 bound to OS proc set {42}
OMP: pid 39845 tid 39967 thread 24 bound to OS proc set {48}
OMP: pid 39845 tid 39961 thread 18 bound to OS proc set {36}
OMP: pid 39845 tid 39968 thread 25 bound to OS proc set {50}
OMP: pid 39845 tid 39966 thread 23 bound to OS proc set {46}
OMP: pid 39845 tid 39983 thread 40 bound to OS proc set {80}
OMP: pid 39845 tid 39984 thread 41 bound to OS proc set {82}
OMP: pid 39845 tid 39985 thread 42 bound to OS proc set {84}
OMP: pid 39845 tid 39980 thread 37 bound to OS proc set {74}
OMP: pid 39845 tid 39979 thread 36 bound to OS proc set {72}
OMP: pid 39845 tid 39981 thread 38 bound to OS proc set {76}
OMP: pid 39845 tid 39988 thread 45 bound to OS proc set {90}
OMP: pid 39845 tid 39970 thread 27 bound to OS proc set {54}
OMP: pid 39845 tid 39987 thread 44 bound to OS proc set {88}
OMP: pid 39845 tid 39969 thread 26 bound to OS proc set {52}
OMP: pid 39845 tid 39965 thread 22 bound to OS proc set {44}
OMP: pid 39845 tid 39986 thread 43 bound to OS proc set {86}
OMP: pid 39845 tid 39971 thread 28 bound to OS proc set {56}
OMP: pid 39845 tid 39973 thread 30 bound to OS proc set {60}
OMP: pid 39845 tid 39972 thread 29 bound to OS proc set {58}
OMP: pid 39845 tid 39982 thread 39 bound to OS proc set {78}
OMP: pid 39845 tid 39990 thread 47 bound to OS proc set {94}
OMP: pid 39845 tid 39974 thread 31 bound to OS proc set {62}
OMP: pid 39845 tid 39989 thread 46 bound to OS proc set {92}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 3.354892, "speed_pp": 610.451843, "t_tg": 0.000000, "speed_tg": nan, "t": 3.354892, "speed": 610.451843}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_8
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_8 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_8 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_8 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_8 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_8 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_8 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_8 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_8 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 40011 tid 40110 thread 1 bound to OS proc set {1}
OMP: pid 40011 tid 40011 thread 0 bound to OS proc set {0}
OMP: pid 40011 tid 40111 thread 2 bound to OS proc set {3}
OMP: pid 40011 tid 40112 thread 3 bound to OS proc set {5}
OMP: pid 40011 tid 40113 thread 4 bound to OS proc set {6}
OMP: pid 40011 tid 40114 thread 5 bound to OS proc set {8}
OMP: pid 40011 tid 40117 thread 8 bound to OS proc set {13}
OMP: pid 40011 tid 40116 thread 7 bound to OS proc set {12}
OMP: pid 40011 tid 40115 thread 6 bound to OS proc set {10}
OMP: pid 40011 tid 40122 thread 13 bound to OS proc set {22}
OMP: pid 40011 tid 40123 thread 14 bound to OS proc set {24}
OMP: pid 40011 tid 40119 thread 10 bound to OS proc set {17}
OMP: pid 40011 tid 40118 thread 9 bound to OS proc set {15}
OMP: pid 40011 tid 40126 thread 17 bound to OS proc set {29}
OMP: pid 40011 tid 40124 thread 15 bound to OS proc set {25}
OMP: pid 40011 tid 40142 thread 33 bound to OS proc set {57}
OMP: pid 40011 tid 40120 thread 11 bound to OS proc set {19}
OMP: pid 40011 tid 40127 thread 18 bound to OS proc set {31}
OMP: pid 40011 tid 40125 thread 16 bound to OS proc set {27}
OMP: pid 40011 tid 40128 thread 19 bound to OS proc set {32}
OMP: pid 40011 tid 40159 thread 50 bound to OS proc set {86}
OMP: pid 40011 tid 40143 thread 34 bound to OS proc set {58}
OMP: pid 40011 tid 40157 thread 48 bound to OS proc set {83}
OMP: pid 40011 tid 40158 thread 49 bound to OS proc set {84}
OMP: pid 40011 tid 40121 thread 12 bound to OS proc set {20}
OMP: pid 40011 tid 40160 thread 51 bound to OS proc set {88}
OMP: pid 40011 tid 40129 thread 20 bound to OS proc set {34}
OMP: pid 40011 tid 40144 thread 35 bound to OS proc set {60}
OMP: pid 40011 tid 40141 thread 32 bound to OS proc set {55}
OMP: pid 40011 tid 40134 thread 25 bound to OS proc set {43}
OMP: pid 40011 tid 40133 thread 24 bound to OS proc set {41}
OMP: pid 40011 tid 40130 thread 21 bound to OS proc set {36}
OMP: pid 40011 tid 40149 thread 40 bound to OS proc set {69}
OMP: pid 40011 tid 40138 thread 29 bound to OS proc set {50}
OMP: pid 40011 tid 40137 thread 28 bound to OS proc set {48}
OMP: pid 40011 tid 40135 thread 26 bound to OS proc set {45}
OMP: pid 40011 tid 40147 thread 38 bound to OS proc set {65}
OMP: pid 40011 tid 40150 thread 41 bound to OS proc set {71}
OMP: pid 40011 tid 40145 thread 36 bound to OS proc set {62}
OMP: pid 40011 tid 40136 thread 27 bound to OS proc set {46}
OMP: pid 40011 tid 40161 thread 52 bound to OS proc set {90}
OMP: pid 40011 tid 40151 thread 42 bound to OS proc set {72}
OMP: pid 40011 tid 40139 thread 30 bound to OS proc set {51}
OMP: pid 40011 tid 40148 thread 39 bound to OS proc set {67}
OMP: pid 40011 tid 40146 thread 37 bound to OS proc set {64}
OMP: pid 40011 tid 40154 thread 45 bound to OS proc set {77}
OMP: pid 40011 tid 40153 thread 44 bound to OS proc set {76}
OMP: pid 40011 tid 40152 thread 43 bound to OS proc set {74}
OMP: pid 40011 tid 40132 thread 23 bound to OS proc set {39}
OMP: pid 40011 tid 40163 thread 54 bound to OS proc set {93}
OMP: pid 40011 tid 40155 thread 46 bound to OS proc set {79}
OMP: pid 40011 tid 40140 thread 31 bound to OS proc set {53}
OMP: pid 40011 tid 40162 thread 53 bound to OS proc set {91}
OMP: pid 40011 tid 40164 thread 55 bound to OS proc set {95}
OMP: pid 40011 tid 40156 thread 47 bound to OS proc set {81}
OMP: pid 40011 tid 40131 thread 22 bound to OS proc set {38}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.966670, "speed_pp": 690.336304, "t_tg": 0.000000, "speed_tg": nan, "t": 2.966670, "speed": 690.336304}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_9
To display your profiling results:
#########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_9 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_9 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_9 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_9 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_9 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_9 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_9 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_9 #
#########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 40185 tid 40285 thread 1 bound to OS proc set {1}
OMP: pid 40185 tid 40286 thread 2 bound to OS proc set {3}
OMP: pid 40185 tid 40287 thread 3 bound to OS proc set {4}
OMP: pid 40185 tid 40185 thread 0 bound to OS proc set {0}
OMP: pid 40185 tid 40288 thread 4 bound to OS proc set {6}
OMP: pid 40185 tid 40289 thread 5 bound to OS proc set {7}
OMP: pid 40185 tid 40291 thread 7 bound to OS proc set {10}
OMP: pid 40185 tid 40293 thread 9 bound to OS proc set {13}
OMP: pid 40185 tid 40295 thread 11 bound to OS proc set {16}
OMP: pid 40185 tid 40297 thread 13 bound to OS proc set {19}
OMP: pid 40185 tid 40292 thread 8 bound to OS proc set {12}
OMP: pid 40185 tid 40294 thread 10 bound to OS proc set {15}
OMP: pid 40185 tid 40290 thread 6 bound to OS proc set {9}
OMP: pid 40185 tid 40301 thread 17 bound to OS proc set {25}
OMP: pid 40185 tid 40317 thread 33 bound to OS proc set {50}
OMP: pid 40185 tid 40298 thread 14 bound to OS proc set {21}
OMP: pid 40185 tid 40303 thread 19 bound to OS proc set {28}
OMP: pid 40185 tid 40318 thread 34 bound to OS proc set {51}
OMP: pid 40185 tid 40296 thread 12 bound to OS proc set {18}
OMP: pid 40185 tid 40333 thread 49 bound to OS proc set {74}
OMP: pid 40185 tid 40334 thread 50 bound to OS proc set {75}
OMP: pid 40185 tid 40319 thread 35 bound to OS proc set {53}
OMP: pid 40185 tid 40299 thread 15 bound to OS proc set {22}
OMP: pid 40185 tid 40300 thread 16 bound to OS proc set {24}
OMP: pid 40185 tid 40335 thread 51 bound to OS proc set {77}
OMP: pid 40185 tid 40321 thread 37 bound to OS proc set {56}
OMP: pid 40185 tid 40305 thread 21 bound to OS proc set {31}
OMP: pid 40185 tid 40306 thread 22 bound to OS proc set {33}
OMP: pid 40185 tid 40316 thread 32 bound to OS proc set {48}
OMP: pid 40185 tid 40309 thread 25 bound to OS proc set {37}
OMP: pid 40185 tid 40302 thread 18 bound to OS proc set {27}
OMP: pid 40185 tid 40332 thread 48 bound to OS proc set {72}
OMP: pid 40185 tid 40308 thread 24 bound to OS proc set {36}
OMP: pid 40185 tid 40304 thread 20 bound to OS proc set {30}
OMP: pid 40185 tid 40311 thread 27 bound to OS proc set {40}
OMP: pid 40185 tid 40313 thread 29 bound to OS proc set {43}
OMP: pid 40185 tid 40337 thread 53 bound to OS proc set {80}
OMP: pid 40185 tid 40324 thread 40 bound to OS proc set {60}
OMP: pid 40185 tid 40312 thread 28 bound to OS proc set {42}
OMP: pid 40185 tid 40320 thread 36 bound to OS proc set {54}
OMP: pid 40185 tid 40310 thread 26 bound to OS proc set {39}
OMP: pid 40185 tid 40307 thread 23 bound to OS proc set {34}
OMP: pid 40185 tid 40336 thread 52 bound to OS proc set {78}
OMP: pid 40185 tid 40338 thread 54 bound to OS proc set {81}
OMP: pid 40185 tid 40327 thread 43 bound to OS proc set {65}
OMP: pid 40185 tid 40315 thread 31 bound to OS proc set {46}
OMP: pid 40185 tid 40340 thread 56 bound to OS proc set {84}
OMP: pid 40185 tid 40342 thread 58 bound to OS proc set {87}
OMP: pid 40185 tid 40323 thread 39 bound to OS proc set {59}
OMP: pid 40185 tid 40314 thread 30 bound to OS proc set {45}
OMP: pid 40185 tid 40339 thread 55 bound to OS proc set {83}
OMP: pid 40185 tid 40325 thread 41 bound to OS proc set {62}
OMP: pid 40185 tid 40341 thread 57 bound to OS proc set {86}
OMP: pid 40185 tid 40322 thread 38 bound to OS proc set {57}
OMP: pid 40185 tid 40345 thread 61 bound to OS proc set {92}
OMP: pid 40185 tid 40343 thread 59 bound to OS proc set {89}
OMP: pid 40185 tid 40344 thread 60 bound to OS proc set {90}
OMP: pid 40185 tid 40326 thread 42 bound to OS proc set {63}
OMP: pid 40185 tid 40346 thread 62 bound to OS proc set {93}
OMP: pid 40185 tid 40347 thread 63 bound to OS proc set {95}
OMP: pid 40185 tid 40329 thread 45 bound to OS proc set {68}
OMP: pid 40185 tid 40328 thread 44 bound to OS proc set {66}
OMP: pid 40185 tid 40330 thread 46 bound to OS proc set {69}
OMP: pid 40185 tid 40331 thread 47 bound to OS proc set {71}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.631229, "speed_pp": 778.343506, "t_tg": 0.000000, "speed_tg": nan, "t": 2.631229, "speed": 778.343506}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_10
To display your profiling results:
##########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_10 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_10 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_10 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_10 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_10 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_10 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_10 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_10 #
##########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 40417 tid 40518 thread 3 bound to OS proc set {4}
OMP: pid 40417 tid 40517 thread 2 bound to OS proc set {2}
OMP: pid 40417 tid 40520 thread 5 bound to OS proc set {6}
OMP: pid 40417 tid 40516 thread 1 bound to OS proc set {1}
OMP: pid 40417 tid 40519 thread 4 bound to OS proc set {5}
OMP: pid 40417 tid 40417 thread 0 bound to OS proc set {0}
OMP: pid 40417 tid 40523 thread 8 bound to OS proc set {10}
OMP: pid 40417 tid 40521 thread 6 bound to OS proc set {8}
OMP: pid 40417 tid 40522 thread 7 bound to OS proc set {9}
OMP: pid 40417 tid 40525 thread 10 bound to OS proc set {13}
OMP: pid 40417 tid 40524 thread 9 bound to OS proc set {12}
OMP: pid 40417 tid 40526 thread 11 bound to OS proc set {14}
OMP: pid 40417 tid 40529 thread 14 bound to OS proc set {18}
OMP: pid 40417 tid 40527 thread 12 bound to OS proc set {16}
OMP: pid 40417 tid 40532 thread 17 bound to OS proc set {22}
OMP: pid 40417 tid 40564 thread 49 bound to OS proc set {66}
OMP: pid 40417 tid 40548 thread 33 bound to OS proc set {44}
OMP: pid 40417 tid 40530 thread 15 bound to OS proc set {20}
OMP: pid 40417 tid 40549 thread 34 bound to OS proc set {45}
OMP: pid 40417 tid 40533 thread 18 bound to OS proc set {24}
OMP: pid 40417 tid 40528 thread 13 bound to OS proc set {17}
OMP: pid 40417 tid 40565 thread 50 bound to OS proc set {67}
OMP: pid 40417 tid 40580 thread 65 bound to OS proc set {87}
OMP: pid 40417 tid 40547 thread 32 bound to OS proc set {43}
OMP: pid 40417 tid 40566 thread 51 bound to OS proc set {68}
OMP: pid 40417 tid 40555 thread 40 bound to OS proc set {53}
OMP: pid 40417 tid 40552 thread 37 bound to OS proc set {49}
OMP: pid 40417 tid 40531 thread 16 bound to OS proc set {21}
OMP: pid 40417 tid 40535 thread 20 bound to OS proc set {26}
OMP: pid 40417 tid 40541 thread 26 bound to OS proc set {35}
OMP: pid 40417 tid 40534 thread 19 bound to OS proc set {25}
OMP: pid 40417 tid 40538 thread 23 bound to OS proc set {30}
OMP: pid 40417 tid 40544 thread 29 bound to OS proc set {39}
OMP: pid 40417 tid 40579 thread 64 bound to OS proc set {86}
OMP: pid 40417 tid 40537 thread 22 bound to OS proc set {29}
OMP: pid 40417 tid 40581 thread 66 bound to OS proc set {88}
OMP: pid 40417 tid 40558 thread 43 bound to OS proc set {57}
OMP: pid 40417 tid 40569 thread 54 bound to OS proc set {72}
OMP: pid 40417 tid 40545 thread 30 bound to OS proc set {40}
OMP: pid 40417 tid 40556 thread 41 bound to OS proc set {55}
OMP: pid 40417 tid 40582 thread 67 bound to OS proc set {90}
OMP: pid 40417 tid 40550 thread 35 bound to OS proc set {47}
OMP: pid 40417 tid 40539 thread 24 bound to OS proc set {32}
OMP: pid 40417 tid 40542 thread 27 bound to OS proc set {36}
OMP: pid 40417 tid 40543 thread 28 bound to OS proc set {37}
OMP: pid 40417 tid 40567 thread 52 bound to OS proc set {70}
OMP: pid 40417 tid 40563 thread 48 bound to OS proc set {64}
OMP: pid 40417 tid 40559 thread 44 bound to OS proc set {59}
OMP: pid 40417 tid 40557 thread 42 bound to OS proc set {56}
OMP: pid 40417 tid 40536 thread 21 bound to OS proc set {28}
OMP: pid 40417 tid 40572 thread 57 bound to OS proc set {76}
OMP: pid 40417 tid 40551 thread 36 bound to OS proc set {48}
OMP: pid 40417 tid 40568 thread 53 bound to OS proc set {71}
OMP: pid 40417 tid 40584 thread 69 bound to OS proc set {92}
OMP: pid 40417 tid 40561 thread 46 bound to OS proc set {61}
OMP: pid 40417 tid 40573 thread 58 bound to OS proc set {78}
OMP: pid 40417 tid 40540 thread 25 bound to OS proc set {33}
OMP: pid 40417 tid 40554 thread 39 bound to OS proc set {52}
OMP: pid 40417 tid 40571 thread 56 bound to OS proc set {75}
OMP: pid 40417 tid 40562 thread 47 bound to OS proc set {63}
OMP: pid 40417 tid 40560 thread 45 bound to OS proc set {60}
OMP: pid 40417 tid 40546 thread 31 bound to OS proc set {41}
OMP: pid 40417 tid 40553 thread 38 bound to OS proc set {51}
OMP: pid 40417 tid 40576 thread 61 bound to OS proc set {82}
OMP: pid 40417 tid 40575 thread 60 bound to OS proc set {80}
OMP: pid 40417 tid 40585 thread 70 bound to OS proc set {94}
OMP: pid 40417 tid 40577 thread 62 bound to OS proc set {83}
OMP: pid 40417 tid 40574 thread 59 bound to OS proc set {79}
OMP: pid 40417 tid 40583 thread 68 bound to OS proc set {91}
OMP: pid 40417 tid 40578 thread 63 bound to OS proc set {84}
OMP: pid 40417 tid 40570 thread 55 bound to OS proc set {74}
OMP: pid 40417 tid 40586 thread 71 bound to OS proc set {95}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 72, "n_threads_batch": 72, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.463732, "speed_pp": 831.259216, "t_tg": 0.000000, "speed_tg": nan, "t": 2.463732, "speed": 831.259216}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_11
To display your profiling results:
##########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_11 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_11 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_11 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_11 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_11 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_11 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_11 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_11 #
##########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 40607 tid 40707 thread 1 bound to OS proc set {1}
OMP: pid 40607 tid 40708 thread 2 bound to OS proc set {2}
OMP: pid 40607 tid 40709 thread 3 bound to OS proc set {3}
OMP: pid 40607 tid 40710 thread 4 bound to OS proc set {4}
OMP: pid 40607 tid 40607 thread 0 bound to OS proc set {0}
OMP: pid 40607 tid 40711 thread 5 bound to OS proc set {6}
OMP: pid 40607 tid 40714 thread 8 bound to OS proc set {9}
OMP: pid 40607 tid 40715 thread 9 bound to OS proc set {10}
OMP: pid 40607 tid 40717 thread 11 bound to OS proc set {13}
OMP: pid 40607 tid 40716 thread 10 bound to OS proc set {12}
OMP: pid 40607 tid 40713 thread 7 bound to OS proc set {8}
OMP: pid 40607 tid 40712 thread 6 bound to OS proc set {7}
OMP: pid 40607 tid 40719 thread 13 bound to OS proc set {15}
OMP: pid 40607 tid 40721 thread 15 bound to OS proc set {18}
OMP: pid 40607 tid 40724 thread 18 bound to OS proc set {21}
OMP: pid 40607 tid 40739 thread 33 bound to OS proc set {40}
OMP: pid 40607 tid 40755 thread 49 bound to OS proc set {59}
OMP: pid 40607 tid 40718 thread 12 bound to OS proc set {14}
OMP: pid 40607 tid 40740 thread 34 bound to OS proc set {41}
OMP: pid 40607 tid 40723 thread 17 bound to OS proc set {20}
OMP: pid 40607 tid 40756 thread 50 bound to OS proc set {60}
OMP: pid 40607 tid 40720 thread 14 bound to OS proc set {16}
OMP: pid 40607 tid 40738 thread 32 bound to OS proc set {38}
OMP: pid 40607 tid 40722 thread 16 bound to OS proc set {19}
OMP: pid 40607 tid 40741 thread 35 bound to OS proc set {42}
OMP: pid 40607 tid 40772 thread 66 bound to OS proc set {80}
OMP: pid 40607 tid 40771 thread 65 bound to OS proc set {78}
OMP: pid 40607 tid 40726 thread 20 bound to OS proc set {24}
OMP: pid 40607 tid 40744 thread 38 bound to OS proc set {46}
OMP: pid 40607 tid 40727 thread 21 bound to OS proc set {25}
OMP: pid 40607 tid 40728 thread 22 bound to OS proc set {26}
OMP: pid 40607 tid 40735 thread 29 bound to OS proc set {35}
OMP: pid 40607 tid 40729 thread 23 bound to OS proc set {27}
OMP: pid 40607 tid 40730 thread 24 bound to OS proc set {29}
OMP: pid 40607 tid 40725 thread 19 bound to OS proc set {23}
OMP: pid 40607 tid 40754 thread 48 bound to OS proc set {58}
OMP: pid 40607 tid 40732 thread 26 bound to OS proc set {31}
OMP: pid 40607 tid 40734 thread 28 bound to OS proc set {33}
OMP: pid 40607 tid 40736 thread 30 bound to OS proc set {36}
OMP: pid 40607 tid 40743 thread 37 bound to OS proc set {44}
OMP: pid 40607 tid 40731 thread 25 bound to OS proc set {30}
OMP: pid 40607 tid 40737 thread 31 bound to OS proc set {37}
OMP: pid 40607 tid 40747 thread 41 bound to OS proc set {49}
OMP: pid 40607 tid 40742 thread 36 bound to OS proc set {43}
OMP: pid 40607 tid 40752 thread 46 bound to OS proc set {55}
OMP: pid 40607 tid 40746 thread 40 bound to OS proc set {48}
OMP: pid 40607 tid 40758 thread 52 bound to OS proc set {63}
OMP: pid 40607 tid 40753 thread 47 bound to OS proc set {56}
OMP: pid 40607 tid 40760 thread 54 bound to OS proc set {65}
OMP: pid 40607 tid 40763 thread 57 bound to OS proc set {69}
OMP: pid 40607 tid 40751 thread 45 bound to OS proc set {54}
OMP: pid 40607 tid 40733 thread 27 bound to OS proc set {32}
OMP: pid 40607 tid 40759 thread 53 bound to OS proc set {64}
OMP: pid 40607 tid 40761 thread 55 bound to OS proc set {66}
OMP: pid 40607 tid 40757 thread 51 bound to OS proc set {61}
OMP: pid 40607 tid 40762 thread 56 bound to OS proc set {67}
OMP: pid 40607 tid 40766 thread 60 bound to OS proc set {72}
OMP: pid 40607 tid 40775 thread 69 bound to OS proc set {83}
OMP: pid 40607 tid 40745 thread 39 bound to OS proc set {47}
OMP: pid 40607 tid 40764 thread 58 bound to OS proc set {70}
OMP: pid 40607 tid 40767 thread 61 bound to OS proc set {73}
OMP: pid 40607 tid 40749 thread 43 bound to OS proc set {52}
OMP: pid 40607 tid 40769 thread 63 bound to OS proc set {76}
OMP: pid 40607 tid 40750 thread 44 bound to OS proc set {53}
OMP: pid 40607 tid 40768 thread 62 bound to OS proc set {75}
OMP: pid 40607 tid 40765 thread 59 bound to OS proc set {71}
OMP: pid 40607 tid 40773 thread 67 bound to OS proc set {81}
OMP: pid 40607 tid 40774 thread 68 bound to OS proc set {82}
OMP: pid 40607 tid 40770 thread 64 bound to OS proc set {77}
OMP: pid 40607 tid 40748 thread 42 bound to OS proc set {50}
OMP: pid 40607 tid 40783 thread 77 bound to OS proc set {93}
OMP: pid 40607 tid 40782 thread 76 bound to OS proc set {92}
OMP: pid 40607 tid 40779 thread 73 bound to OS proc set {88}
OMP: pid 40607 tid 40776 thread 70 bound to OS proc set {84}
OMP: pid 40607 tid 40778 thread 72 bound to OS proc set {87}
OMP: pid 40607 tid 40780 thread 74 bound to OS proc set {89}
OMP: pid 40607 tid 40785 thread 79 bound to OS proc set {95}
OMP: pid 40607 tid 40777 thread 71 bound to OS proc set {86}
OMP: pid 40607 tid 40781 thread 75 bound to OS proc set {90}
OMP: pid 40607 tid 40784 thread 78 bound to OS proc set {94}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 80, "n_threads_batch": 80, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.249581, "speed_pp": 910.391724, "t_tg": 0.000000, "speed_tg": nan, "t": 2.249581, "speed": 910.391724}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_12
To display your profiling results:
##########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_12 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_12 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_12 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_12 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_12 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_12 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_12 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_12 #
##########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 40809 tid 40908 thread 1 bound to OS proc set {1}
OMP: pid 40809 tid 40909 thread 2 bound to OS proc set {2}
OMP: pid 40809 tid 40910 thread 3 bound to OS proc set {3}
OMP: pid 40809 tid 40911 thread 4 bound to OS proc set {4}
OMP: pid 40809 tid 40912 thread 5 bound to OS proc set {5}
OMP: pid 40809 tid 40913 thread 6 bound to OS proc set {6}
OMP: pid 40809 tid 40916 thread 9 bound to OS proc set {9}
OMP: pid 40809 tid 40915 thread 8 bound to OS proc set {8}
OMP: pid 40809 tid 40809 thread 0 bound to OS proc set {0}
OMP: pid 40809 tid 40914 thread 7 bound to OS proc set {7}
OMP: pid 40809 tid 40924 thread 17 bound to OS proc set {18}
OMP: pid 40809 tid 40925 thread 18 bound to OS proc set {19}
OMP: pid 40809 tid 40919 thread 12 bound to OS proc set {13}
OMP: pid 40809 tid 40926 thread 19 bound to OS proc set {20}
OMP: pid 40809 tid 40920 thread 13 bound to OS proc set {14}
OMP: pid 40809 tid 40921 thread 14 bound to OS proc set {15}
OMP: pid 40809 tid 40918 thread 11 bound to OS proc set {12}
OMP: pid 40809 tid 40917 thread 10 bound to OS proc set {11}
OMP: pid 40809 tid 40922 thread 15 bound to OS proc set {16}
OMP: pid 40809 tid 40939 thread 32 bound to OS proc set {35}
OMP: pid 40809 tid 40940 thread 33 bound to OS proc set {36}
OMP: pid 40809 tid 40927 thread 20 bound to OS proc set {22}
OMP: pid 40809 tid 40973 thread 66 bound to OS proc set {72}
OMP: pid 40809 tid 40956 thread 49 bound to OS proc set {54}
OMP: pid 40809 tid 40974 thread 67 bound to OS proc set {73}
OMP: pid 40809 tid 40931 thread 24 bound to OS proc set {26}
OMP: pid 40809 tid 40928 thread 21 bound to OS proc set {23}
OMP: pid 40809 tid 40935 thread 28 bound to OS proc set {30}
OMP: pid 40809 tid 40930 thread 23 bound to OS proc set {25}
OMP: pid 40809 tid 40941 thread 34 bound to OS proc set {37}
OMP: pid 40809 tid 40929 thread 22 bound to OS proc set {24}
OMP: pid 40809 tid 40942 thread 35 bound to OS proc set {38}
OMP: pid 40809 tid 40945 thread 38 bound to OS proc set {41}
OMP: pid 40809 tid 40932 thread 25 bound to OS proc set {27}
OMP: pid 40809 tid 40943 thread 36 bound to OS proc set {39}
OMP: pid 40809 tid 40957 thread 50 bound to OS proc set {55}
OMP: pid 40809 tid 40938 thread 31 bound to OS proc set {34}
OMP: pid 40809 tid 40933 thread 26 bound to OS proc set {28}
OMP: pid 40809 tid 40936 thread 29 bound to OS proc set {31}
OMP: pid 40809 tid 40955 thread 48 bound to OS proc set {52}
OMP: pid 40809 tid 40944 thread 37 bound to OS proc set {40}
OMP: pid 40809 tid 40937 thread 30 bound to OS proc set {33}
OMP: pid 40809 tid 40946 thread 39 bound to OS proc set {42}
OMP: pid 40809 tid 40949 thread 42 bound to OS proc set {46}
OMP: pid 40809 tid 40953 thread 46 bound to OS proc set {50}
OMP: pid 40809 tid 40950 thread 43 bound to OS proc set {47}
OMP: pid 40809 tid 40971 thread 64 bound to OS proc set {70}
OMP: pid 40809 tid 40934 thread 27 bound to OS proc set {29}
OMP: pid 40809 tid 40947 thread 40 bound to OS proc set {44}
OMP: pid 40809 tid 40952 thread 45 bound to OS proc set {49}
OMP: pid 40809 tid 40923 thread 16 bound to OS proc set {17}
OMP: pid 40809 tid 40972 thread 65 bound to OS proc set {71}
OMP: pid 40809 tid 40948 thread 41 bound to OS proc set {45}
OMP: pid 40809 tid 40963 thread 56 bound to OS proc set {61}
OMP: pid 40809 tid 40951 thread 44 bound to OS proc set {48}
OMP: pid 40809 tid 40959 thread 52 bound to OS proc set {57}
OMP: pid 40809 tid 40969 thread 62 bound to OS proc set {68}
OMP: pid 40809 tid 40980 thread 73 bound to OS proc set {80}
OMP: pid 40809 tid 40965 thread 58 bound to OS proc set {63}
OMP: pid 40809 tid 40962 thread 55 bound to OS proc set {60}
OMP: pid 40809 tid 40976 thread 69 bound to OS proc set {76}
OMP: pid 40809 tid 40967 thread 60 bound to OS proc set {66}
OMP: pid 40809 tid 40981 thread 74 bound to OS proc set {81}
OMP: pid 40809 tid 40988 thread 81 bound to OS proc set {89}
OMP: pid 40809 tid 40964 thread 57 bound to OS proc set {62}
OMP: pid 40809 tid 40966 thread 59 bound to OS proc set {65}
OMP: pid 40809 tid 40977 thread 70 bound to OS proc set {77}
OMP: pid 40809 tid 40984 thread 77 bound to OS proc set {84}
OMP: pid 40809 tid 40979 thread 72 bound to OS proc set {79}
OMP: pid 40809 tid 40970 thread 63 bound to OS proc set {69}
OMP: pid 40809 tid 40961 thread 54 bound to OS proc set {59}
OMP: pid 40809 tid 40983 thread 76 bound to OS proc set {83}
OMP: pid 40809 tid 40985 thread 78 bound to OS proc set {85}
OMP: pid 40809 tid 40989 thread 82 bound to OS proc set {90}
OMP: pid 40809 tid 40968 thread 61 bound to OS proc set {67}
OMP: pid 40809 tid 40975 thread 68 bound to OS proc set {74}
OMP: pid 40809 tid 40982 thread 75 bound to OS proc set {82}
OMP: pid 40809 tid 40958 thread 51 bound to OS proc set {56}
OMP: pid 40809 tid 40990 thread 83 bound to OS proc set {91}
OMP: pid 40809 tid 40954 thread 47 bound to OS proc set {51}
OMP: pid 40809 tid 40960 thread 53 bound to OS proc set {58}
OMP: pid 40809 tid 40978 thread 71 bound to OS proc set {78}
OMP: pid 40809 tid 40987 thread 80 bound to OS proc set {88}
OMP: pid 40809 tid 40986 thread 79 bound to OS proc set {87}
OMP: pid 40809 tid 40992 thread 85 bound to OS proc set {93}
OMP: pid 40809 tid 40991 thread 84 bound to OS proc set {92}
OMP: pid 40809 tid 40993 thread 86 bound to OS proc set {94}
OMP: pid 40809 tid 40994 thread 87 bound to OS proc set {95}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 88, "n_threads_batch": 88, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.107499, "speed_pp": 971.768005, "t_tg": 0.000000, "speed_tg": nan, "t": 2.107499, "speed": 971.768005}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_13
To display your profiling results:
##########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_13 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_13 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_13 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_13 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_13 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_13 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_13 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_13 #
##########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 41015 tid 41114 thread 1 bound to OS proc set {1}
OMP: pid 41015 tid 41115 thread 2 bound to OS proc set {2}
OMP: pid 41015 tid 41116 thread 3 bound to OS proc set {3}
OMP: pid 41015 tid 41118 thread 5 bound to OS proc set {5}
OMP: pid 41015 tid 41117 thread 4 bound to OS proc set {4}
OMP: pid 41015 tid 41122 thread 9 bound to OS proc set {9}
OMP: pid 41015 tid 41015 thread 0 bound to OS proc set {0}
OMP: pid 41015 tid 41119 thread 6 bound to OS proc set {6}
OMP: pid 41015 tid 41121 thread 8 bound to OS proc set {8}
OMP: pid 41015 tid 41123 thread 10 bound to OS proc set {10}
OMP: pid 41015 tid 41125 thread 12 bound to OS proc set {12}
OMP: pid 41015 tid 41126 thread 13 bound to OS proc set {13}
OMP: pid 41015 tid 41120 thread 7 bound to OS proc set {7}
OMP: pid 41015 tid 41124 thread 11 bound to OS proc set {11}
OMP: pid 41015 tid 41127 thread 14 bound to OS proc set {14}
OMP: pid 41015 tid 41128 thread 15 bound to OS proc set {15}
OMP: pid 41015 tid 41146 thread 33 bound to OS proc set {33}
OMP: pid 41015 tid 41147 thread 34 bound to OS proc set {34}
OMP: pid 41015 tid 41162 thread 49 bound to OS proc set {49}
OMP: pid 41015 tid 41132 thread 19 bound to OS proc set {19}
OMP: pid 41015 tid 41148 thread 35 bound to OS proc set {35}
OMP: pid 41015 tid 41163 thread 50 bound to OS proc set {50}
OMP: pid 41015 tid 41129 thread 16 bound to OS proc set {16}
OMP: pid 41015 tid 41178 thread 65 bound to OS proc set {65}
OMP: pid 41015 tid 41164 thread 51 bound to OS proc set {51}
OMP: pid 41015 tid 41179 thread 66 bound to OS proc set {66}
OMP: pid 41015 tid 41145 thread 32 bound to OS proc set {32}
OMP: pid 41015 tid 41180 thread 67 bound to OS proc set {67}
OMP: pid 41015 tid 41161 thread 48 bound to OS proc set {48}
OMP: pid 41015 tid 41138 thread 25 bound to OS proc set {25}
OMP: pid 41015 tid 41150 thread 37 bound to OS proc set {37}
OMP: pid 41015 tid 41133 thread 20 bound to OS proc set {20}
OMP: pid 41015 tid 41130 thread 17 bound to OS proc set {17}
OMP: pid 41015 tid 41142 thread 29 bound to OS proc set {29}
OMP: pid 41015 tid 41137 thread 24 bound to OS proc set {24}
OMP: pid 41015 tid 41149 thread 36 bound to OS proc set {36}
OMP: pid 41015 tid 41154 thread 41 bound to OS proc set {41}
OMP: pid 41015 tid 41141 thread 28 bound to OS proc set {28}
OMP: pid 41015 tid 41136 thread 23 bound to OS proc set {23}
OMP: pid 41015 tid 41151 thread 38 bound to OS proc set {38}
OMP: pid 41015 tid 41143 thread 30 bound to OS proc set {30}
OMP: pid 41015 tid 41139 thread 26 bound to OS proc set {26}
OMP: pid 41015 tid 41166 thread 53 bound to OS proc set {53}
OMP: pid 41015 tid 41140 thread 27 bound to OS proc set {27}
OMP: pid 41015 tid 41153 thread 40 bound to OS proc set {40}
OMP: pid 41015 tid 41158 thread 45 bound to OS proc set {45}
OMP: pid 41015 tid 41155 thread 42 bound to OS proc set {42}
OMP: pid 41015 tid 41167 thread 54 bound to OS proc set {54}
OMP: pid 41015 tid 41165 thread 52 bound to OS proc set {52}
OMP: pid 41015 tid 41144 thread 31 bound to OS proc set {31}
OMP: pid 41015 tid 41177 thread 64 bound to OS proc set {64}
OMP: pid 41015 tid 41156 thread 43 bound to OS proc set {43}
OMP: pid 41015 tid 41159 thread 46 bound to OS proc set {46}
OMP: pid 41015 tid 41170 thread 57 bound to OS proc set {57}
OMP: pid 41015 tid 41157 thread 44 bound to OS proc set {44}
OMP: pid 41015 tid 41152 thread 39 bound to OS proc set {39}
OMP: pid 41015 tid 41168 thread 55 bound to OS proc set {55}
OMP: pid 41015 tid 41160 thread 47 bound to OS proc set {47}
OMP: pid 41015 tid 41182 thread 69 bound to OS proc set {69}
OMP: pid 41015 tid 41169 thread 56 bound to OS proc set {56}
OMP: pid 41015 tid 41174 thread 61 bound to OS proc set {61}
OMP: pid 41015 tid 41186 thread 73 bound to OS proc set {73}
OMP: pid 41015 tid 41171 thread 58 bound to OS proc set {58}
OMP: pid 41015 tid 41172 thread 59 bound to OS proc set {59}
OMP: pid 41015 tid 41183 thread 70 bound to OS proc set {70}
OMP: pid 41015 tid 41175 thread 62 bound to OS proc set {62}
OMP: pid 41015 tid 41187 thread 74 bound to OS proc set {74}
OMP: pid 41015 tid 41134 thread 21 bound to OS proc set {21}
OMP: pid 41015 tid 41173 thread 60 bound to OS proc set {60}
OMP: pid 41015 tid 41190 thread 77 bound to OS proc set {77}
OMP: pid 41015 tid 41181 thread 68 bound to OS proc set {68}
OMP: pid 41015 tid 41185 thread 72 bound to OS proc set {72}
OMP: pid 41015 tid 41188 thread 75 bound to OS proc set {75}
OMP: pid 41015 tid 41131 thread 18 bound to OS proc set {18}
OMP: pid 41015 tid 41135 thread 22 bound to OS proc set {22}
OMP: pid 41015 tid 41189 thread 76 bound to OS proc set {76}
OMP: pid 41015 tid 41184 thread 71 bound to OS proc set {71}
OMP: pid 41015 tid 41191 thread 78 bound to OS proc set {78}
OMP: pid 41015 tid 41194 thread 81 bound to OS proc set {81}
OMP: pid 41015 tid 41192 thread 79 bound to OS proc set {79}
OMP: pid 41015 tid 41195 thread 82 bound to OS proc set {82}
OMP: pid 41015 tid 41196 thread 83 bound to OS proc set {83}
OMP: pid 41015 tid 41176 thread 63 bound to OS proc set {63}
OMP: pid 41015 tid 41193 thread 80 bound to OS proc set {80}
OMP: pid 41015 tid 41198 thread 85 bound to OS proc set {85}
OMP: pid 41015 tid 41197 thread 84 bound to OS proc set {84}
OMP: pid 41015 tid 41201 thread 88 bound to OS proc set {88}
OMP: pid 41015 tid 41199 thread 86 bound to OS proc set {86}
OMP: pid 41015 tid 41207 thread 94 bound to OS proc set {94}
OMP: pid 41015 tid 41206 thread 93 bound to OS proc set {93}
OMP: pid 41015 tid 41205 thread 92 bound to OS proc set {92}
OMP: pid 41015 tid 41200 thread 87 bound to OS proc set {87}
OMP: pid 41015 tid 41202 thread 89 bound to OS proc set {89}
OMP: pid 41015 tid 41203 thread 90 bound to OS proc set {90}
OMP: pid 41015 tid 41204 thread 91 bound to OS proc set {91}
OMP: pid 41015 tid 41208 thread 95 bound to OS proc set {95}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 96, "n_threads_batch": 96, "pp": 128, "tg": 0, "pl": 16, "n_kv": 2048, "t_pp": 2.004510, "speed_pp": 1021.696106, "t_tg": 0.000000, "speed_tg": nan, "t": 2.004510, "speed": 1021.696106}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_14
To display your profiling results:
##########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_14 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_14 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_14 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_14 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_14 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_14 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_14 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-131-3962/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761315400/tools/lprof_npsu_run_14 #
##########################################################################################################################################################################################################################################