S922X Ugoos AM6B Device Tree - Performance/Efficiency - Testing Needed

I know I’m replying to myself…

@doppingkoala what I was saying was that you can trick the scheduler into thinking you have 4 separate CPU dies on a BGA package, connected only by copper traces. All you have to do is create clusters with one CPU each. I want to mimic this kind of affinity for single-threaded workloads at the system level, not the process level.

What will this achieve? The scheduler will become hesitant to bump processes between cores and will only spread work across them for truly multi-threaded applications.
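For contrast, this is roughly what process-level pinning looks like on Linux, i.e. the per-process approach the idea above is trying to avoid. It is only a minimal sketch: `pin_to_cpu` is a hypothetical helper, and it does not touch the device tree or the scheduler topology in any way.

```python
import os

def pin_to_cpu(cpu, pid=0):
    """Restrict one process (pid 0 = the caller) to a single CPU.

    Returns the previous affinity set so the caller can restore it.
    """
    previous = os.sched_getaffinity(pid)
    if cpu not in previous:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(previous)}")
    os.sched_setaffinity(pid, {cpu})
    return previous

# Pin the current process to the first CPU it is allowed to run on,
# then restore the original affinity.
allowed = sorted(os.sched_getaffinity(0))
old = pin_to_cpu(allowed[0])
print("pinned to", os.sched_getaffinity(0))
os.sched_setaffinity(0, old)
```

This has to be repeated for every process of interest, which is why a topology-level change is attractive if it works.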

I see the same shift in CPU load balance with resolution, but I see it regardless of whether the L2 cache nodes are present in the dtb. I made sure to start and stop the averaging script with the video to make the results more consistent.

Big Buck Bunny, average loads over 10 seconds:

360P      No L2 cache nodes    L2 cache nodes included
Core 0    56.9%                43.4%
Core 1    31.2%                56.5%
Core 2    26.9%                25.5%
Core 3    7.2%                 7.2%
Core 4    9.2%                 5.6%
Core 5    7.2%                 3.1%

720P      No L2 cache nodes    L2 cache nodes included
Core 0    66.2%                57.8%
Core 1    52.8%                68.0%
Core 2    37.2%                37.7%
Core 3    30.9%                31.3%
Core 4    27.7%                26.6%
Core 5    26.5%                25.3%

1080P     No L2 cache nodes    L2 cache nodes included
Core 0    61.2%                66.2%
Core 1    61.3%                60.5%
Core 2    67.3%                66.9%
Core 3    66.5%                65.0%
Core 4    65.0%                64.2%
Core 5    64.0%                63.6%
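The averaging script itself is not shown in this thread. As a minimal sketch of how such per-core averages can be derived, the code below compares two `/proc/stat` snapshots, assuming the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal). The function names are hypothetical and this is not necessarily how the script used here works.

```python
import time

def parse_proc_stat(text):
    """Return {core: (busy_ticks, total_ticks)} from /proc/stat content."""
    cores = {}
    for line in text.splitlines():
        parts = line.split()
        # Per-core lines look like "cpu0 ..."; skip the aggregate "cpu" line.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        ticks = [int(v) for v in parts[1:]]
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)  # idle + iowait
        total = sum(ticks[:8])  # guest fields are already counted in user/nice
        cores[int(parts[0][3:])] = (total - idle, total)
    return cores

def average_load(before, after):
    """Percent busy per core between two snapshots."""
    loads = {}
    for core, (busy1, total1) in after.items():
        busy0, total0 = before[core]
        delta = total1 - total0
        loads[core] = 100.0 * (busy1 - busy0) / delta if delta else 0.0
    return loads

def sample(seconds=10, path="/proc/stat"):
    """Take two snapshots `seconds` apart and return the per-core load."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return average_load(before, after)
```

Calling `sample(10)` at the moment playback starts would return a dict such as `{0: ..., 1: ...}` that can be printed in the "Core N Average Load" format used above.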

Hmm okay…

Maybe device to device variation.

Also, why is your potato-quality (360p) playback using an A73? If you have the same CPU as me, it shouldn’t use an A73. Non-normalized numbers below.

Your usage is 50+40+25 = 115

My usage is 40+24 = 64

Why such a large difference? Are you waiting a lot for SDIO?

Your 720p is using all 6 cores. That’s high CPU load.

Recall how I said no difference at the high end of CPU?

The video is only 10 seconds long, but your results cover 13 seconds. They average in 3 seconds of close to no CPU load on the A73s, which brings your values down. You may need to redo the test, making sure you only average the 10 seconds while the video is playing.

This is why it’s better to use a 2-4 minute clip, so that these small inconsistencies don’t cause significant changes in the results.
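To put a number on how much an overlong averaging window can skew a result, here is a small worked example. It assumes the load is roughly constant while the video plays and a fixed idle load for the rest of the window. The 65% figure is purely illustrative, not a measurement from this thread.

```python
def diluted_average(playing_load, playing_s, window_s, idle_load=0.0):
    """Average load reported when the window is longer than the playback."""
    if window_s < playing_s:
        raise ValueError("window must cover the whole playback")
    idle_s = window_s - playing_s
    return (playing_load * playing_s + idle_load * idle_s) / window_s

def corrected_load(reported, playing_s, window_s, idle_load=0.0):
    """Invert diluted_average to estimate the load during playback only."""
    idle_s = window_s - playing_s
    return (reported * window_s - idle_load * idle_s) / playing_s

# A 10 s clip averaged over a 13 s window: reported = 65 * 10 / 13
reported = diluted_average(65.0, playing_s=10, window_s=13)
print(f"reported:  {reported:.1f}%")
print(f"corrected: {corrected_load(reported, 10, 13):.1f}%")
```

The same 3 seconds of slack on a 240-second clip would change the reported value by only about 1%, which is the argument for longer clips.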

But what do your results look like when you remove the L2 cache nodes doing the same test?

Fair point about 10 seconds vs 240 seconds. I will not dispute it.

Anyway, let’s go to the case with the biggest performance impact then: 360p. Why is 360p using an A73?

Here is my shortened test.
360p

Average Loads (averaged over 10 seconds):
Core 0 Average Load: 52.1%
Core 1 Average Load: 29.7%
Core 2 Average Load: 3.5%
Core 3 Average Load: 0.8%
Core 4 Average Load: 0.0%
Core 5 Average Load: 0.0%

720p

Average Loads (averaged over 10 seconds):
Core 0 Average Load: 62.2%
Core 1 Average Load: 56.4%
Core 2 Average Load: 17.9%
Core 3 Average Load: 8.1%
Core 4 Average Load: 3.7%
Core 5 Average Load: 3.0%

You cannot tell me that your numbers don’t look suspect. Just look at my cores 3, 4 and 5 vs yours at 720p. Mine = 8, 4, 3 and yours = 31, 27, 26. We are talking about the faster cores being utilized at a greater capacity. You are using roughly 4x, 7x and 9x more CPU on those expensive fast cores. If your system is taxing the CPU cores at such high ratios, then the “step function” of performance for your device lies at a different level. You may not feel the performance difference we are talking about.
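The ratios can be checked directly from the rounded 720p figures quoted in this post. A minimal sketch, where `load_ratios` is a hypothetical helper:

```python
# 720p average loads (%) for cores 3, 4 and 5, as rounded in the post above.
yours = {3: 31, 4: 27, 5: 26}
mine = {3: 8, 4: 4, 5: 3}

def load_ratios(a, b):
    """Per-core ratio a/b for cores present in both dicts with non-zero b."""
    return {core: a[core] / b[core] for core in sorted(a) if b.get(core)}

for core, ratio in load_ratios(yours, mine).items():
    print(f"Core {core}: {ratio:.1f}x")
```

At loads this low, a change of one or two percentage points in the denominator moves the ratio a lot, so the absolute differences are probably the more robust comparison.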

You also should look at core 1 and core 2 for 360p. Mine = 30, 4 and yours = ~40s, ~26

As for the cache difference, maybe someone else with a Ugoos can test it for you. I am about to go to sleep.

How did you choose it? With the ondemand governor I only see 667 MHz as the minimum, and other users’ CPU usage tables also show that 500 MHz is not used in CE. I have S922X(J) devices (X88King, Minix U22XJ and AM6 Plus), and from a clean installation the minimum used has always been 667 MHz (compared to 500 MHz in Android). It could be a badly optimized Kodi daemon or similar, so maybe this is not a CE issue but a Kodi one. Why is this important? Because we waste 14-20% CPU usage, judging by the CPU usage at idle.
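One way to check which minimum frequency the governor is allowed to use is to read the standard Linux cpufreq sysfs files. This is a minimal sketch, assuming the usual `/sys/devices/system/cpu/cpuN/cpufreq/` layout with values in kHz. Whether `scaling_available_frequencies` is exposed depends on the cpufreq driver, and `read_cpufreq` is a hypothetical helper.

```python
from pathlib import Path

def read_cpufreq(cpu=0, root="/sys/devices/system/cpu"):
    """Return governor and frequency limits (MHz) for one core, if exposed."""
    base = Path(root) / f"cpu{cpu}" / "cpufreq"
    info = {}
    for name in ("scaling_governor", "scaling_min_freq", "scaling_max_freq",
                 "scaling_available_frequencies"):
        path = base / name
        if not path.is_file():
            continue  # not every driver exposes every file
        text = path.read_text().strip()
        if name == "scaling_governor":
            info[name] = text
        else:
            # sysfs reports kHz; convert to MHz for readability
            info[name] = [int(v) / 1000 for v in text.split()]
    return info
```

Running `read_cpufreq(0)` for an A53 core and `read_cpufreq(2)` for an A73 core on both CE and Android would show whether 500 MHz is missing from the table or just never selected.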

We have different setups, different background processes, different devices. It would be more meaningful if you could show that the load balance results look different with and without the cache nodes on your own setup. That would rule out device and CE differences.

not sure if this will be useful
https://kodi.wiki/view/Samples

lots of different samples to test

Without cache.

720p

Average Loads (averaged over 10 seconds):
Core 0 Average Load: 64.0%
Core 1 Average Load: 61.6%
Core 2 Average Load: 19.8%
Core 3 Average Load: 7.6%
Core 4 Average Load: 4.8%
Core 5 Average Load: 3.0%

Second

Average Loads (averaged over 10 seconds):
Core 0 Average Load: 65.8%
Core 1 Average Load: 59.7%
Core 2 Average Load: 19.5%
Core 3 Average Load: 9.0%
Core 4 Average Load: 5.2%
Core 5 Average Load: 2.9%

360p

Average Loads (averaged over 10 seconds):
Core 0 Average Load: 49.5%
Core 1 Average Load: 31.0%
Core 2 Average Load: 6.1%
Core 3 Average Load: 1.0%
Core 4 Average Load: 0.3%
Core 5 Average Load: 0.1%

Second

Average Loads (averaged over 10 seconds):
Core 0 Average Load: 50.8%
Core 1 Average Load: 33.2%
Core 2 Average Load: 4.7%
Core 3 Average Load: 1.0%
Core 4 Average Load: 0.2%
Core 5 Average Load: 0.3%

Without cache, my A73 usage on Core 2 goes from approximately 3 to 6.1, or even 9 (instantaneous), in the short test. That is still almost 2x the usage, but at such low numbers it could just be variability (standard deviation).
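A quick way to judge whether a difference like this is within run-to-run noise is to compare it against the spread of repeated runs. This is a minimal sketch using the Core 2 values at 360p posted above, assuming the shortened test was run with the cache nodes present. With this few samples it is only indicative.

```python
from statistics import mean, stdev

with_cache = [3.5]          # Core 2, 360p, shortened test (assumed with cache nodes)
without_cache = [6.1, 4.7]  # Core 2, 360p, the two "Without cache" runs

def summarize(samples):
    """Mean and sample standard deviation (None if fewer than 2 samples)."""
    spread = stdev(samples) if len(samples) > 1 else None
    return mean(samples), spread

avg, spread = summarize(without_cache)
print(f"without cache: mean {avg:.2f}%, stdev {spread:.2f}")
print(f"difference vs with cache: {avg - with_cache[0]:.2f} points")
```

Several more runs per configuration would be needed before the spread says anything reliable.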

I was mistaken in my test method it seems. I take back my words. Sorry to mislead you.

UI navigation is a short workload, you move your cursor and then you move it again. It is not consistently moving. Short bursts of usage.

You may be right, cache may not have an effect on it

I should have asked before: which AV1 video file size are you using for the 3 resolutions?

LOL We are both stupid!! ahahah!!!

Yes, the video file size would make the CPU usage different. I’m using the 5 MB one.

OK, I’m using the 30 MB file for 1080p and 720p and the 20 MB file for 360p, so that is probably the cause of the CPU load difference.
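File size matters because, for clips of the same length, it sets the average bitrate the decoder has to handle. A rough sketch, assuming every clip is the 10-second sample mentioned earlier and that 1 MB = 10^6 bytes (both are assumptions, not confirmed in this thread):

```python
def average_bitrate_mbps(size_mb, duration_s):
    """Average bitrate in Mbit/s for a file of size_mb megabytes."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return size_mb * 8 / duration_s

DURATION_S = 10  # assumed clip length in seconds
for label, size_mb in [("5 MB", 5), ("20 MB", 20), ("30 MB", 30)]:
    print(f"{label}: {average_bitrate_mbps(size_mb, DURATION_S):.0f} Mbit/s")
```

Under those assumptions the 30 MB file carries six times the bitrate of the 5 MB one, so comparing loads across the two is not like for like.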

I removed all attached DTBs, as they can lead to a non-booting system when used incorrectly. All g12b DTBs will include the changes with the next nightly of Amlogic-ng and Amlogic-no, available here: https://relkai.coreelec.org

Anyone know if it is possible to just pull the ‘correct’ values from the original android installs somehow?

https://wiki.coreelec.org/coreelec:devicetree

It’s all there…

Concerning the AM6B+ -ng version: are any additional changes needed to install the whole .tar update, or is it enough to replace the old dtb.img with the new one pulled from this new .tar update?

If you install the full tar, then the dtb will be updated automatically.

If you’re running the latest CPM build, then hopefully there’ll be another build soon, because there have been a number of important fixes since A1.

“Hi, Thanks for doing these. Will you be doing one for X88King?”
Will there be an updated DTB file for this box?

No further testing.

I am running the CPM build, so taking the dtb from tomorrow’s nightly should work, right? No need to do the full tar update, which might undo the CPM changes.