https://gist.github.com/tanakamura/efbfab5cfdf6707d098714d616dd6ef3
Tried this with different PL2 values
PL2 = 80W
Performance counter stats for 'make -j 40':
============= LATENCY ==============================================================================
instruction | IPC ( rel[%]), CPI ( rel[%])
------------------------------------------+---------------------------------------------------------
m128 addps | 0.50-0.25 ( 100.0[%]), 2.00-4.00 ( -50.0[%])
m128 aesdec | 0.33-0.14 ( 133.4[%]), 3.00-7.00 ( -57.1[%])
m128 aesdeclast | 0.33-0.14 ( 133.4[%]), 3.00-7.00 ( -57.1[%])
m128 aesenc | 0.33-0.14 ( 133.3[%]), 3.00-7.00 ( -57.1[%])
m128 aesenclast | 0.33-0.14 ( 133.4[%]), 3.00-7.00 ( -57.1[%])
m128 blendps | 1.00-1.00 ( 0.1[%]), 1.00-1.00 ( -0.1[%])
Run the make.py described in https://zenn.dev/tanakmura/articles/litex_linux_ae3feff0b48ede with Vivado
zen2
alderlake P core
linux build
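Linux-5.14.15: ran make defconfig, then ran make twice; the numbers are from the second run
[J] is the energy in joules obtained by reading /sys/class/powercap/intel-rapl:0/energy_uj (this is the CPU's built-in sensor value, so AMD and Intel may use different baselines)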
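The energy reading described above can be sketched as follows. This is a minimal illustration, not the author's rapl-run.py: the function names `read_uj`, `energy_delta_joules`, and `run_with_energy` are hypothetical, and it assumes the `max_energy_range_uj` file that the powercap sysfs interface places next to `energy_uj` is available to handle counter wraparound.

```python
import subprocess
import time
from pathlib import Path

# Package-0 RAPL domain, the path quoted in the text.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_uj(rapl_dir: Path, name: str = "energy_uj") -> int:
    """Read one integer microjoule value from a powercap sysfs file."""
    return int((rapl_dir / name).read_text().strip())


def energy_delta_joules(before_uj: int, after_uj: int, max_range_uj: int) -> float:
    """Convert two counter readings to joules, allowing for one wraparound."""
    delta = after_uj - before_uj
    if delta < 0:
        # The counter wrapped past max_energy_range_uj between the two reads.
        delta += max_range_uj
    return delta / 1_000_000


def run_with_energy(cmd, rapl_dir: Path = RAPL_DIR):
    """Run cmd and return (exit_code, joules, seconds)."""
    max_range = read_uj(rapl_dir, "max_energy_range_uj")
    e0 = read_uj(rapl_dir)
    t0 = time.monotonic()
    rc = subprocess.call(cmd)
    seconds = time.monotonic() - t0
    e1 = read_uj(rapl_dir)
    return rc, energy_delta_joules(e0, e1, max_range), seconds
```

For example, `run_with_energy(["make", "-j", "40"])` would return the exit code, the package energy in joules, and the wall-clock time of the build. Reading `energy_uj` may require root on recent kernels, and the sketch only corrects for a single counter wrap per run.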
With something like the following saved as rapl-run.py,
ooo ratio : 1.398357
ostimer: clock_gettime
userland_timer: cntvct
perf_counter: no
Qualcomm Snapdragon 710
==== idiv32-realtime ====
-> : divider_bit
| | 1| 2| 3| 4| 5| 6| 7| 8| 9| 10| 11| 12| 13| 14| 15| 16| 17| 18| 19| 20| 21| 22| 23| 24| 25| 26| 27| 28| 29| 30| 31| 32
---------------------------------------------------------------------------------------------------------------------------------------------
| 0 | 2.9|2.9| 7.2| 2.9|2.9|2.9|2.9| 3.2|3.0|2.9| 2.9|2.9|2.9|2.9|2.9| 2.9|2.9|2.9|2.9|2.9| 2.9|2.9|2.9| 2.9|2.9|2.9|2.9|2.9|2.9|14.2|4.0|2.9
| |result
--------------------------
| ROB | 389
| INT PRF | 384
| FP PRF | 372
| INT(multi chain) | 32
| FP(multi chain) | 25
|INT(single chain) | 32
| FP(single chain) | 25
v : test_name
ostimer: clock_gettime
userland_timer: rdtscp
perf_counter: yes
Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
==== fpu ====
| | nsec/call
-------------------------
|denormal_add | 1.28803
| normal_add | 1.04813
|denormal_mul | 1.05132
ostimer: clock_gettime
userland_timer: rdtscp
perf_counter: yes
AMD Ryzen 7 3700X 8-Core Processor
==== libc ====
| | nsec/call
-----------------------------------
| atoi_99999 | 14.19927
| fflush_stdout | 5.74030
| sscanf_double_99999 | 122.22827
ostimer: clock_gettime
userland_timer: rdtscp
perf_counter: yes
AMD Ryzen 7 3700X 8-Core Processor
==== libc ====
| | nsec/call
----------------------------------
| atoi_99999 | 17.43726
| fflush_stdout | 11.08052
| sscanf_double_99999 | 121.75406
ostimer: clock_gettime
userland_timer: rdtscp
perf_counter: yes
Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
==== cache-bandwidth-1t ====
<copy>
| |GiB/s
--------------------
| 3072 |187.16192
| 4096 |159.48263