Khadas vim2 thermal thermal_zone0: Failed to get temperature: -22

Hi,

Since I bought the Khadas fan for my VIM2 Basic, I constantly get

[ 197.490816@4] thermal thermal_zone0: failed to read out thermal zone (-22)
[ 197.490842@4] thermal thermal_zone0: Failed to get temperature: -22

and the fan seems to run for a second, stop, and then run again. I have seen these messages without the fan as well, but only from time to time.

dmesg | grep fan
[ 17.284867@7] khadas_fan_probe
[ 17.304911@7] fanctl fan.51: trigger temperature is level0:50, level1:60, level2:70.

I am running 8.95.1

CoreELEC (official): 8.95.1 (KVIM2.arm)
Linux CoreElec3 3.14.29 #1 SMP Tue Aug 28 21:52:46 BST 2018 aarch64 GNU/Linux

I can read the temperature manually:

cat /sys/class/thermal/thermal_zone0/temp

51000

but I somehow doubt it, as the heat sink is hot. Still, the board seems to run better with heat sink and fan, as the video lags (only on 4K content, and not very often) seem to be gone. When a lag happened, it looked like one CPU was shut down and the work switched to another CPU.

Unloading khadas_fan crashes with

[93417.323917@1] Call trace:
[93417.326510@1] [] strlen+0x10/0x90
[93417.331339@1] [] iterate_dir+0x68/0xe0
[93417.336602@1] [] compat_sys_getdents64+0x7c/0x110
[93417.342809@1] Code: b200c3eb 927cec01 f2400c07 54000261 (a8c10c22)
[93417.349076@1] ---[ end trace 304641395a547e9e ]---

Any clue how to fix that?

I’m getting the same error messages.

The kernel message about the thermal zone is only informative. Probably a mutex is missing around the temperature readout. For temperatures from 50 to 56 °C I use pulse control, at 100 Hz, so the fan runs quietly. The working CPU temperature is typically kept at 53 to 54 °C. The Android driver has no pulse control, only 3 run levels, and its lowest level is too loud for me. The driver doesn't support unloading. If you want to change the driver's behavior, you can change the temperature trigger levels.

Thanks @afl1, how do I set the triggers permanently on CoreELEC for this?

Hm, the trip points seem to be much higher than the actual temp:

CoreElec3:/sys/devices/virtual/thermal/thermal_zone0 # cat trip_point_*
1000
70000
passive
1000
80000
passive
5000
85000
hot
1000
260000
critical

CoreElec3:/sys/devices/virtual/thermal/thermal_zone0 # cat temp
52000
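
The `cat trip_point_*` listing above is hard to read because the shell expands the glob alphabetically, so each trip point appears as hysteresis, temperature, type. A minimal sketch that prints one labelled line per trip point, assuming the standard `trip_point_N_temp`, `trip_point_N_type` and `trip_point_N_hyst` attributes of the kernel thermal sysfs interface (the function name `print_trips` is made up for this example):

```shell
#!/bin/sh
# print_trips ZONE_DIR
# Prints one line per trip point found in ZONE_DIR: index, type,
# temperature (millidegrees Celsius) and hysteresis.
print_trips() {
    zone=$1
    n=0
    while [ -r "$zone/trip_point_${n}_temp" ]; do
        t=$(cat "$zone/trip_point_${n}_temp")
        kind=$(cat "$zone/trip_point_${n}_type")
        # the hysteresis attribute may be absent, so treat it as optional
        if [ -r "$zone/trip_point_${n}_hyst" ]; then
            h=$(cat "$zone/trip_point_${n}_hyst")
        else
            h="-"
        fi
        echo "trip $n: $kind at $t (hyst $h)"
        n=$((n + 1))
    done
}

# Default path taken from the prompt shown in this thread.
print_trips "${1:-/sys/devices/virtual/thermal/thermal_zone0}"
```

Read this way, the first trip point in the listing is a passive trip at 70000 (70 °C), well above the 52000 reported by `temp`.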

Still, the fan spins, stops, and spins again.
Did you connect it via the fan controller or via GPIO? My fan came with a plug that fits the fan controller.

https://github.com/CoreELEC/CoreELEC/blob/master/packages/linux-drivers/amlogic/kvimfan-aml/sources/khadas-fan.c

#define KHADAS_FAN_TRIG_TEMP_LEVEL0 50 // 50 degree if not set
#define KHADAS_FAN_TRIG_TEMP_LEVEL1 60 // 60 degree if not set
#define KHADAS_FAN_TRIG_TEMP_LEVEL2 70 // 70 degree if not set

It seems that is hardcoded, and since the temp is always above 50, the fan always runs? (I am not good at reading code, though.)

What kind of fan did you buy (Manufacturer/model)?
AFAIK there are fans with analog rpm control, and fans with PWM rpm control available…

I bought https://www.gearbest.com/cpu-cooler/pp_009529678755.html?wid=1433363

Thanks. Although their description is strange (picture: 2 pin; description: "Fan Pin: 3 + 4 pin,3 pin"), I go by the picture, so the fan is controlled by current (2-bit driver circuit). As it is from Khadas, I assume it will fit their SW driver.

I agree with afl1 that the main problem may be access to the temp sensor value.
But the SW driver also seems to have some design flaws (level mismatch between fan_work_func and khadas_fan_level_set) and some strange coding (e.g. line 97).

Ha, found your bug in the Khadas code:
Default fan_data->last_level = 0 ==> fan off
Now you have 51°, which results in level 6 at line 101 ==> fan off
In line 79, fan_data->last_level is now set to 6 ==> fan off
In the next loops, line 77 kicks in and decrements the level one by one down to levels 3 and 2 ==> fan low/high mode
Then line 79 sets fan_data->last_level to 6 again, repeating the game…

This code needs rework!
Maybe it is better to revert to the original Khadas code and run it in a 5 sec container, or to implement proper hysteresis.
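
One way to read "proper hysteresis" is: step the fan up as soon as a threshold is reached, but step it down only once the temperature has fallen a few degrees below that threshold. The following is a minimal sketch of that idea, not the driver's code. The thresholds are the 50/60/70 defaults quoted in this thread; the 2-degree band, the step numbering (0 = off up to 3 = fastest) and the function names are assumptions made for illustration.

```shell
#!/bin/sh
# Illustrative values only: thresholds follow the driver defaults quoted in
# this thread; the 2-degree band and the 0..3 step numbering are assumptions.
T1=50 T2=60 T3=70 HYST=2

# step_for TEMP OFFSET
# Speed step for TEMP when every threshold is lowered by OFFSET degrees.
step_for() {
    if   [ "$1" -ge $((T3 - $2)) ]; then echo 3
    elif [ "$1" -ge $((T2 - $2)) ]; then echo 2
    elif [ "$1" -ge $((T1 - $2)) ]; then echo 1
    else echo 0
    fi
}

# next_step CURRENT TEMP
# Rise immediately when a threshold is reached; fall only when TEMP is
# below the threshold minus HYST; otherwise keep the current step.
next_step() {
    up=$(step_for "$2" 0)
    down=$(step_for "$2" "$HYST")
    if   [ "$up" -gt "$1" ]; then echo "$up"
    elif [ "$down" -lt "$1" ]; then echo "$down"
    else echo "$1"
    fi
}

# Walk through a temperature that hovers around the first threshold.
step=0
for t in 49 50 51 49 48 47 50; do
    step=$(next_step "$step" "$t")
    echo "temp $t -> step $step"
done
```

With these values the fan switches on at 50 and stays on at 49 and 48, switching off only at 47, so a reading that flickers between 49 and 51 no longer toggles it.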

Cool, thanks a lot @rho-bot

How do we proceed from here? Am I supposed to open a bug report somewhere?

There is no bug, you just don't understand the code. The level value has the following meaning:
0 - off
1 - high
2 - normal
3 - low
4 - 1/2 low
5 - 1/3 low
6 - 1/4 low
Hysteresis is implemented to filter out slight changes, as the temperature resolution is 1 degree.
Trigger levels 50, 60, 70 are defined in the device tree. We can move them to higher values, e.g. 55, 65, 70.
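
To make the fractional levels concrete, here is a small model of how counting a level down once per tick could produce the 1/2, 1/3 and 1/4 patterns. It is a guess based on the description in this thread (the level is decremented each loop and the fan runs once it reaches 3), not a copy of the driver. At the 100 Hz mentioned earlier, one tick would be 10 ms. The function name `duty_pattern` is hypothetical.

```shell
#!/bin/sh
# Hypothetical model, not the driver code: a counter starts at LEVEL and is
# decremented once per tick; the fan is on ("low") only in a tick where the
# counter is 3 or less, after which the countdown restarts.
# duty_pattern LEVEL TICKS -> one character per tick: 1 = fan on, 0 = fan off
duty_pattern() {
    level=$1
    ticks=$2
    counter=$level
    out=""
    i=0
    while [ "$i" -lt "$ticks" ]; do
        if [ "$counter" -le 3 ]; then
            out="${out}1"
            counter=$level              # restart the countdown
        else
            out="${out}0"
            counter=$((counter - 1))
        fi
        i=$((i + 1))
    done
    echo "$out"
}

for lvl in 4 5 6; do
    echo "level $lvl: $(duty_pattern "$lvl" 12)"
done
```

In this model level 4 is on every second tick, level 5 every third and level 6 every fourth, which matches the "1/2 low", "1/3 low" and "1/4 low" labels in the list above.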

With /sys/class/fan you can switch driver to manual mode and set fan level.
echo 0 > /sys/class/fan/mode – manual mode
echo 0 > /sys/class/fan/level – fan off
echo 1 > /sys/class/fan/level – fan low
echo 2 > /sys/class/fan/level – fan normal
echo 3 > /sys/class/fan/level – fan high
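
The manual-mode commands above can be wrapped in a small helper so that a typo cannot write an out-of-range value. This is only a sketch: the function name `set_fan_level` and the `FAN_SYSFS` variable are made up for the example, and the accepted range 0 to 3 is taken from the commands listed in this post.

```shell
#!/bin/sh
# Location of the fan attributes; overridable for testing.
FAN_SYSFS=${FAN_SYSFS:-/sys/class/fan}

# set_fan_level LEVEL
# Switches the driver to manual mode and writes LEVEL (0..3).
# Returns non-zero without touching sysfs if LEVEL is out of range.
set_fan_level() {
    case $1 in
        0|1|2|3) ;;
        *) echo "level must be 0, 1, 2 or 3" >&2; return 1 ;;
    esac
    echo 0 > "$FAN_SYSFS/mode" && echo "$1" > "$FAN_SYSFS/level"
}

# Example (commented out so that sourcing this file changes nothing):
# set_fan_level 1
```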

Sorry, you are right. But it is hard to understand without any code comments…
So with lines 76-79 you are generating the duty factors.
And line 97 gives 2° of hysteresis (although I still don't get the need for the ? condition).

But then we are back to the original question of the 1-second bouncing…
Main question: why can the temperature not be read?
If the temp permanently cannot be read, duty generation is stalled, as lines 92-94 just return without calling khadas_fan_level_set().

Proposal:

struct khadas_fan_data {
	struct platform_device *pdev;
	struct class *class;
	struct delayed_work work;
	struct delayed_work fan_test_work;
	enum khadas_fan_enable enable;
	enum khadas_fan_mode mode;
	enum khadas_fan_level level;
	int ctrl_gpio0;
	int ctrl_gpio1;
	int trig_temp_level0;
	int trig_temp_level1;
	int trig_temp_level2;
	int last_level;
	int temp;
	int work_level;
};

static void fan_work_func(struct work_struct *_work)
{
	struct khadas_fan_data *fan_data = container_of(_work,
		   struct khadas_fan_data, work.work);
	int temp, level;

	temp = get_cpu_temp();

	if (temp >= 0) {
		/* valid reading: smooth it by averaging with the previous value */
		fan_data->temp = fan_data->temp ? (fan_data->temp + temp) / 2 : temp;

		/* levels: 0 off, 1 high, 2 normal, 3 low, 4..6 pulsed low */
		if (fan_data->temp < fan_data->trig_temp_level0)
			level = 0;
		else if (fan_data->temp < fan_data->trig_temp_level0 + 2)
			level = 6;
		else if (fan_data->temp < fan_data->trig_temp_level0 + 4)
			level = 5;
		else if (fan_data->temp < fan_data->trig_temp_level0 + 6)
			level = 4;
		else if (fan_data->temp < fan_data->trig_temp_level1)
			level = 3;
		else if (fan_data->temp < fan_data->trig_temp_level2)
			level = 2;
		else
			level = 1;

		/* remember the level so it can be reused when a read fails */
		fan_data->work_level = level;

		/* pulsed levels need the short loop interval */
		if (level > 3)
			schedule_delayed_work(&fan_data->work, KHADAS_FAN_LOOP_PULSE);
		else
			schedule_delayed_work(&fan_data->work, KHADAS_FAN_LOOP_SECS);
	} else {
		/* read failed: keep pulsing with the last known level */
		schedule_delayed_work(&fan_data->work, KHADAS_FAN_LOOP_PULSE);
		level = fan_data->work_level;
	}

	khadas_fan_level_set(fan_data, level);
}
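
The averaging line in the proposal halves the influence of each new reading, so a single outlier moves the stored temperature only part of the way. A quick sketch of the same integer arithmetic, with made-up readings (the function name `smooth` is hypothetical):

```shell
#!/bin/sh
# smooth READING...
# Applies the proposal's rule to each reading in turn: the first value is
# taken as-is, every later one is averaged with the stored value using
# integer division. Prints the stored value after each step.
smooth() {
    avg=0
    out=""
    for t in "$@"; do
        if [ "$avg" -eq 0 ]; then
            avg=$t
        else
            avg=$(( (avg + t) / 2 ))
        fi
        out="${out:+$out }$avg"
    done
    echo "$out"
}

# A steady 50 with one spike to 58.
smooth 50 50 58 50 50
```

For these readings the stored value goes 50, 50, 54, 52, 51: the spike to 58 is damped to 54 and decays over the following readings.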

Thanks @afl1 for the fan info, thanks @rho-bot for the code proposal.

Based on afl1's info I created a very simple script that I now let control the fan:

#!/bin/sh

# enable manual mode
echo 0 > /sys/class/fan/mode

# read temp (unnecessary at this stage) and mode
temp=$(cat /sys/class/thermal/thermal_zone0/temp)
mode=$(awk '{ print $3 }' /sys/class/fan/mode)

# set temperature levels (millidegrees Celsius)
level1=50000
level2=60000
level3=70000

# variables are quoted and compared with "=" so that an empty value
# does not break the test
while [ "$mode" = 0 ]
do
	# check mode and temp constantly
	temp=$(cat /sys/class/thermal/thermal_zone0/temp)
	mode=$(awk '{ print $3 }' /sys/class/fan/mode)
	if [ "$temp" -gt "$level3" ]
	then
		# enable highest fan level
		echo 3 > /sys/class/fan/level
		# sleep for 60 seconds to avoid bouncing the fan too often
		sleep 60
	elif [ "$temp" -gt "$level2" ]
	then
		# enable normal fan level
		echo 2 > /sys/class/fan/level
		sleep 60
	elif [ "$temp" -gt "$level1" ]
	then
		# enable low fan level
		echo 1 > /sys/class/fan/level
		sleep 60
	else
		# below the lowest level: fan off
		echo 0 > /sys/class/fan/level
	fi
	# sleep for 5 more seconds to avoid bouncing the fan too often
	sleep 5
done

if [ "$mode" = 1 ]
then
	# send a message to dmesg and Kodi to inform the user
	echo "Fan script stopped as fan mode has been switched to automatic" >> /dev/kmsg
	kodi-send -a "Notification(Fan,Fan script stopped as fan mode has been switched to automatic)"
fi