S922X Ugoos AM6B Device Tree - Performance/Efficiency - Testing Needed

@YadaYada

I found a 12-year-old GitHub repo that still compiles: qris/dhrystone-deb (Debian package for the Dhrystone benchmark in C).

Dhrystone Results

taskset -c 0,4 ./dhry

Dhrystone Benchmark, Version 2.1 (Language: C)

Program compiled without 'register' attribute

Please give the number of runs through the benchmark: 100000000

Execution starts, 100000000 runs through Dhrystone
Execution ends

Final values of the variables used in the benchmark:

Int_Glob: 5
should be: 5
Bool_Glob: 1
should be: 1
Ch_1_Glob: A
should be: A
Ch_2_Glob: B
should be: B
Arr_1_Glob[8]: 7
should be: 7
Arr_2_Glob[8][7]: 100000010
should be: Number_Of_Runs + 10
Ptr_Glob->
Ptr_Comp: 19165584
should be: (implementation-dependent)
Discr: 0
should be: 0
Enum_Comp: 2
should be: 2
Int_Comp: 17
should be: 17
Str_Comp: DHRYSTONE PROGRAM, SOME STRING
should be: DHRYSTONE PROGRAM, SOME STRING
Next_Ptr_Glob->
Ptr_Comp: 19165584
should be: (implementation-dependent), same as above
Discr: 0
should be: 0
Enum_Comp: 1
should be: 1
Int_Comp: 18
should be: 18
Str_Comp: DHRYSTONE PROGRAM, SOME STRING
should be: DHRYSTONE PROGRAM, SOME STRING
Int_1_Loc: 5
should be: 5
Int_2_Loc: 13
should be: 13
Int_3_Loc: 7
should be: 7
Enum_Loc: 1
should be: 1
Str_1_Loc: DHRYSTONE PROGRAM, 1'ST STRING
should be: DHRYSTONE PROGRAM, 1'ST STRING
Str_2_Loc: DHRYSTONE PROGRAM, 2'ND STRING
should be: DHRYSTONE PROGRAM, 2'ND STRING

Microseconds for one run through Dhrystone: 0.1
Dhrystones per Second: 16611296.0

taskset -c 0,1 ./dhry

Dhrystone Benchmark, Version 2.1 (Language: C)

Program compiled without 'register' attribute

Please give the number of runs through the benchmark: 100000000

Execution starts, 100000000 runs through Dhrystone
Execution ends

Final values of the variables used in the benchmark:

Int_Glob: 5
should be: 5
Bool_Glob: 1
should be: 1
Ch_1_Glob: A
should be: A
Ch_2_Glob: B
should be: B
Arr_1_Glob[8]: 7
should be: 7
Arr_2_Glob[8][7]: 100000010
should be: Number_Of_Runs + 10
Ptr_Glob->
Ptr_Comp: 27959696
should be: (implementation-dependent)
Discr: 0
should be: 0
Enum_Comp: 2
should be: 2
Int_Comp: 17
should be: 17
Str_Comp: DHRYSTONE PROGRAM, SOME STRING
should be: DHRYSTONE PROGRAM, SOME STRING
Next_Ptr_Glob->
Ptr_Comp: 27959696
should be: (implementation-dependent), same as above
Discr: 0
should be: 0
Enum_Comp: 1
should be: 1
Int_Comp: 18
should be: 18
Str_Comp: DHRYSTONE PROGRAM, SOME STRING
should be: DHRYSTONE PROGRAM, SOME STRING
Int_1_Loc: 5
should be: 5
Int_2_Loc: 13
should be: 13
Int_3_Loc: 7
should be: 7
Enum_Loc: 1
should be: 1
Str_1_Loc: DHRYSTONE PROGRAM, 1'ST STRING
should be: DHRYSTONE PROGRAM, 1'ST STRING
Str_2_Loc: DHRYSTONE PROGRAM, 2'ND STRING
should be: DHRYSTONE PROGRAM, 2'ND STRING

Microseconds for one run through Dhrystone: 0.1
Dhrystones per Second: 8210180.5

| Source | A53 raw number | A73 raw number | A53 ratio | A73 ratio |
|---|---|---|---|---|
| 12-year-old bench (Dhrystones per second) | 8210180.5 | 16611296 | 33% | 67% |
| Energy back calc | 772 | 1192 | 39% | 61% |
| Other DTB | 592 | 1024 | 37% | 63% |
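For anyone who wants to check the arithmetic, here is a minimal sketch of how the ratios in the table above can be reproduced. The second helper rescales each pair so the A73 figure maps to 1024. That assumes the numbers are headed for a `capacity-dmips-mhz` style property where the big core is 1024, which is my reading of the thread rather than something stated in it. The function names and the optional MHz arguments are hypothetical; no clock speeds were measured here.

```python
def shares(a53, a73):
    """Return each cluster's share of the combined score, in whole percent."""
    total = a53 + a73
    return round(100 * a53 / total), round(100 * a73 / total)


def scaled_little(a53, a73, a53_mhz=None, a73_mhz=None):
    """Scale the A53 figure so that the A73 figure maps to 1024.

    If both clock speeds are given, each score is first divided by its MHz,
    because capacity-dmips-mhz is a per-MHz figure. The MHz values are
    hypothetical inputs, not measurements from this thread.
    """
    if a53_mhz and a73_mhz:
        a53, a73 = a53 / a53_mhz, a73 / a73_mhz
    return round(1024 * a53 / a73)


# Raw (A53, A73) pairs copied from the table above.
pairs = {
    "12-year-old bench": (8210180.5, 16611296.0),
    "Energy back calc": (772, 1192),
    "Other DTB": (592, 1024),
}

for name, (a53, a73) in pairs.items():
    print(name, shares(a53, a73), scaled_little(a53, a73))
```

Note that the raw Dhrystone scores are not per-MHz figures, so if the two clusters were running at different clocks during the test, the MHz arguments would need to be filled in before comparing against the DTB values.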

That solves one mystery. Now we can decide which numbers to actually use. If you want the binary to run on your own device, I can share it with you.

Now, how correct do we want to be? Should we change it by creating a PR, or should we let it be because we were both within 2% of each other?

I have no idea how to solve the other mystery: benchmarking the kernel scheduler. My main concern is that if we don't use the processor's perf registers, the measurement itself puts load on the kernel scheduler, which alters our observations. Basically, how do we measure in a non-invasive way?
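One low-overhead idea, offered as a rough sketch and not as an answer to the question: take a single snapshot of `/proc/stat` before the run and another after it, then diff the per-CPU counters. Nothing runs while the benchmark is executing, so the measurement adds almost no load, but it only shows which cores were busy, not individual scheduler decisions. The helper names below are hypothetical.

```python
def parse_proc_stat(text):
    """Return {cpu_index: (busy_jiffies, total_jiffies)} from /proc/stat text."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep per-CPU lines such as "cpu0 ..."; skip the aggregate "cpu" line.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        # Fields: user nice system idle iowait irq softirq steal.
        # Guest time is already included in user, so it is left out.
        values = [int(v) for v in parts[1:9]]
        total = sum(values)
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        result[int(parts[0][3:])] = (total - idle, total)
    return result


def busy_fraction(before, after):
    """Return {cpu_index: busy share between two parsed snapshots}."""
    out = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        elapsed = total1 - total0
        out[cpu] = (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return out


def snapshot(path="/proc/stat"):
    """Read and parse the current counters (Linux only)."""
    with open(path) as f:
        return parse_proc_stat(f.read())
```

Usage would be something like `before = snapshot()`, then the `taskset -c 0,4 ./dhry` run, then `busy_fraction(before, snapshot())` to see whether the work landed on core 0 or core 4.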
