Woke up to a very cold house

Woke up this morning, and although the temperature on our HestiaPi was set to 78 degrees, the house was 72 degrees and the AC was still running. I copied the logs before rebooting the HestiaPi, which solved the issue. What can be done to prevent this from happening again?

==> /var/log/openhab2/openhab.log <==
2020-07-03 03:00:57.031 [INFO ] [.dashboard.internal.DashboardService] - Started Dashboard at http://10.0.31.150:8080

2020-07-03 03:00:57.082 [INFO ] [.dashboard.internal.DashboardService] - Started Dashboard at https://10.0.31.150:8443

2020-07-03 03:01:16.871 [INFO ] [arthome.ui.paper.internal.PaperUIApp] - Started Paper UI at /paperui

2020-07-03 03:06:39.269 [INFO ] [b.core.service.AbstractActiveService] - HTTP Refresh Service has been started

2020-07-03 03:06:44.161 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'NULL' cannot be resolved to an item or type; line 180, column 35, length 4

2020-07-03 03:06:44.035 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'NULL' cannot be resolved to an item or type; line 180, column 35, length 4

2020-07-03 03:06:44.379 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'NULL' cannot be resolved to an item or type; line 180, column 35, length 4

2020-07-03 03:06:44.377 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'NULL' cannot be resolved to an item or type; line 180, column 35, length 4

2020-07-03 03:06:44.306 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'NULL' cannot be resolved to an item or type; line 188, column 31, length 4

2020-07-03 03:06:44.994 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'NULL' cannot be resolved to an item or type; line 180, column 35, length 4

2020-07-03 03:06:45.065 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'NULL' cannot be resolved to an item or type; line 180, column 35, length 4

2020-07-03 03:06:45.075 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'PreviousTempReading' cannot be resolved to an item or type; line 180, column 6, length 19

2020-07-03 03:06:45.257 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'PreviousTempReading' cannot be resolved to an item or type; line 180, column 6, length 19

2020-07-03 03:06:45.829 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'PreviousTempReading' cannot be resolved to an item or type; line 180, column 6, length 19

2020-07-03 03:06:45.900 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'PreviousTempReading' cannot be resolved to an item or type; line 180, column 6, length 19

2020-07-03 03:06:45.979 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'PreviousTempReading' cannot be resolved to an item or type; line 180, column 6, length 19

2020-07-03 03:06:46.015 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'PreviousTempReading' cannot be resolved to an item or type; line 180, column 6, length 19

2020-07-03 03:06:46.029 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'PreviousTempReading' cannot be resolved to an item or type; line 180, column 6, length 19

2020-07-03 03:06:46.589 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'TempUnitChanged': The name 'MainSwitch' cannot be resolved to an item or type; line 632, column 5, length 10

2020-07-03 03:06:46.661 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'TempUnit updated': The name 'TempUnit_Topic' cannot be resolved to an item or type; line 883, column 5, length 14

2020-07-03 03:06:46.670 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'TempUnit updated': The name 'TempUnit_Topic' cannot be resolved to an item or type; line 883, column 5, length 14

2020-07-03 03:06:46.758 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'TempUnit updated': The name 'TempUnit_Topic' cannot be resolved to an item or type; line 883, column 5, length 14

2020-07-03 03:06:46.740 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'TempUnit updated': The name 'TempUnit_Topic' cannot be resolved to an item or type; line 883, column 5, length 14

2020-07-03 03:06:47.367 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'TempUnit updated': The name 'TempUnit_Topic' cannot be resolved to an item or type; line 883, column 5, length 14

2020-07-03 03:06:47.456 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'HeatingBoostTime changed': The name 'HeatingBoostTime' cannot be resolved to an item or type; line 430, column 32, length 16

2020-07-03 03:06:47.470 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'PreviousTempReading' cannot be resolved to an item or type; line 180, column 6, length 19

2020-07-03 03:06:47.500 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'checkcurrtemp': The name 'MyTempProxy' cannot be resolved to an item or type; line 220, column 8, length 11

2020-07-03 03:06:47.793 [ERROR] [ab.binding.gpio.internal.GPIOBinding] - Error occurred while creating backend object for item Pin16, exception: Device or resource busy

2020-07-03 03:06:47.863 [ERROR] [ab.binding.gpio.internal.GPIOBinding] - Error occurred while creating backend object for item Pin18, exception: Device or resource busy

2020-07-03 03:06:47.925 [ERROR] [ab.binding.gpio.internal.GPIOBinding] - Error occurred while creating backend object for item Pin12, exception: Device or resource busy

2020-07-03 03:06:48.020 [ERROR] [ab.binding.gpio.internal.GPIOBinding] - Error occurred while creating backend object for item Pin23, exception: Device or resource busy

2020-07-03 03:06:50.614 [WARN ] [core.thing.internal.ThingManagerImpl] - Initializing handler for thing 'mqtt:broker:mosquitto' takes more than 5000ms.

2020-07-03 03:07:04.849 [INFO ] [.transport.mqtt.MqttBrokerConnection] - Starting MQTT broker connection to 'localhost' with clientid mosquitto and file store '/var/lib/openhab2/mqtt/localhost'

2020-07-03 03:07:04.873 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'PreviousTempReading' cannot be resolved to an item or type; line 180, column 6, length 19

2020-07-03 03:07:16.800 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'FanControl': The name 'FanMode' cannot be resolved to an item or type; line 614, column 9, length 7

2020-07-03 03:07:16.815 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'HeatingPin changed': The name 'HeatingPin' cannot be resolved to an item or type; line 103, column 12, length 10

2020-07-03 03:07:20.036 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'Fan Mode': The name 'FanMode' cannot be resolved to an item or type; line 360, column 12, length 7

2020-07-03 03:07:20.344 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'FanControl': The name 'FanMode' cannot be resolved to an item or type; line 614, column 9, length 7

2020-07-03 03:07:21.022 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'HeatingBoostTime changed': The name 'HeatingBoostTime' cannot be resolved to an item or type; line 430, column 32, length 16

2020-07-03 03:07:21.463 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'Cooling Mode': The name 'CoolingMode' cannot be resolved to an item or type; line 334, column 12, length 11

2020-07-03 03:07:21.858 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'Heating2Pin changed': The name 'Heating2' cannot be resolved to an item or type; line 121, column 9, length 8

2020-07-03 03:07:22.340 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'PreviousTempReading' cannot be resolved to an item or type; line 180, column 6, length 19

2020-07-03 03:07:27.779 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'convertproxy': The name 'PreviousTempReading' cannot be resolved to an item or type; line 180, column 6, length 19

2020-07-03 03:13:09.118 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'CoolingPin changed': An error occurred during the script execution: null

2020-07-03 03:13:09.055 [INFO ] [lipse.smarthome.model.script.Default] - HeatingPin set to OFF.

2020-07-03 03:13:10.402 [INFO ] [lipse.smarthome.model.script.Default] - FanPin set to OFF.

[REBOOTED AT THIS POINT]

Fenced the logs to make it easier to view.
@rlkoshak any input?

I have only ever seen errors like these on the first boot after clearing the cache. That is something you should never see on a HestiaPi, because the OH version should remain fixed (during an upgrade the script clears the cache), unless you’ve cleared the cache yourself.

Have you upgraded openHAB?

Have you cleared the cache yourself?

Was there any reboot or restart of the HestiaPi prior to these errors?

Was there anything interesting in the logs prior to the logs posted above?

How long was the HestiaPi up before this occurred?

Thank you for the response. Here’s more backstory and further updates. A few weeks ago, we had a power outage at our home for a few hours, and when power was restored, the thermostat no longer worked. It booted, but everything on the screen showed 0 and I could not connect via the web interface or the app. After playing with it for a while, I flashed a new SD card and booted from that, going through the first-time setup again. After several reboots and hours of waiting, it still did not function properly, although it was better than before the power outage. The display still showed zeros, even though the logs showed actual temp and humidity readings. The app would connect, but I could not get it to function. After some searching through the forums, I found a post that suggested running the following commands, which got me up and running:
sudo openhab-cli clean-cache
sudo openhab-cli reset-ownership
sudo shutdown -r now

Gave it about 20-30 minutes to reboot and fully come back up, and it worked fine for a week or two. Then I had the incident I started this post about, where the AC was running all night.

Now, an update. This morning my wife noticed that the thermostat showed the temperature set to 70, although we leave it at 79. The house temp was 77 and the AC was running. She changed the temp with the app to 79, but told me the AC kept running. I copied the log files (linked below) and rebooted just now to see if that fixes it.

https://cloud.allegiance-it.com/s/sZz8L4SdEbAJEEP

Just happened again in the last hour or so, thermostat changed itself to 70 degrees set point from 79.

Something is off with the timezone too. Looking through the events.log, as far as I can tell the timezone or clock was off, then at some point the log entries changed to the right time, and then it looks like it’s going through a startup. The Pi has not rebooted for 3 days.

2020-07-08 15:23:54.580 [nt.ItemStatePredictedEvent] - MyHumiProxy predicted to become 51.2
2020-07-08 15:23:54.715 [vent.ItemStateChangedEvent] - MyHumiProxy changed from 51.5 to 51.2
2020-07-08 15:24:04.527 [vent.ItemStateChangedEvent] - MyHumi changed from 51.2 to 50.9
2020-07-08 15:24:04.716 [ome.event.ItemCommandEvent] - Item ‘MyTempProxy’ received command 79.0
2020-07-08 15:24:04.790 [nt.ItemStatePredictedEvent] - MyTempProxy predicted to become 79.0
2020-07-08 15:24:04.816 [ome.event.ItemCommandEvent] - Item ‘MyHumiProxy’ received command 50.9
2020-07-08 15:24:05.077 [nt.ItemStatePredictedEvent] - MyHumiProxy predicted to become 50.9
2020-07-08 15:24:05.295 [vent.ItemStateChangedEvent] - MyHumiProxy changed from 51.2 to 50.9
2020-07-08 11:37:55.601 [ome.event.ItemCommandEvent] - Item ‘MainSwitch’ received command OFF
2020-07-08 11:37:55.992 [ome.event.ItemCommandEvent] - Item ‘HeatingMode’ received command OFF
2020-07-08 11:37:56.186 [ome.event.ItemCommandEvent] - Item ‘CoolingMode’ received command OFF
2020-07-08 11:37:56.235 [ome.event.ItemCommandEvent] - Item ‘FanMode’ received command OFF
2020-07-08 11:37:56.283 [ome.event.ItemCommandEvent] - Item ‘Heating2’ received command OFF
2020-07-08 11:37:56.558 [vent.ItemStateChangedEvent] - TempSetpoint changed from NULL to 70
2020-07-08 11:37:56.667 [vent.ItemStateChangedEvent] - TempSetpointF changed from NULL to 70
2020-07-08 11:37:56.875 [ome.event.ItemCommandEvent] - Item ‘HeatingBoostTime’ received command 10
2020-07-08 11:37:56.979 [ome.event.ItemCommandEvent] - Item ‘CoolingBoostTime’ received command 10
2020-07-08 11:37:57.191 [vent.ItemStateChangedEvent] - PreviousTempReading changed from NULL to 0
2020-07-08 11:37:57.361 [ome.event.ItemCommandEvent] - Item ‘TempSetpoint’ received command 70
2020-07-08 11:37:57.496 [ome.event.ItemCommandEvent] - Item ‘Heating2Time’ received command 0
2020-07-08 11:37:57.592 [vent.ItemStateChangedEvent] - PreviousHumiReading changed from NULL to 0
2020-07-08 11:37:57.706 [ome.event.ItemCommandEvent] - Item ‘CoolingPin’ received command OFF
2020-07-08 11:37:57.816 [ome.event.ItemCommandEvent] - Item ‘Heating2Delta’ received command 0
2020-07-08 11:37:58.006 [ome.event.ItemCommandEvent] - Item ‘HeatingPin’ received command OFF
2020-07-08 11:37:58.059 [ome.event.ItemCommandEvent] - Item ‘CoolingPin’ received command OFF
2020-07-08 11:37:58.211 [ome.event.ItemCommandEvent] - Item ‘HeatingPin’ received command OFF
2020-07-08 11:37:59.488 [ome.event.ItemCommandEvent] - Item ‘Network_WLAN_IP’ received command 10.0.31.150
2020-07-08 11:37:59.969 [ome.event.ItemCommandEvent] - Item ‘Network_SSID’ received command Brooks
2020-07-08 11:38:00.427 [ome.event.ItemCommandEvent] - Item ‘Network_WLAN_INFO’ received command 100
2020-07-08 11:38:00.890 [ome.event.ItemCommandEvent] - Item ‘Network_WLAN_MAC’ received command b8:27:eb:1b:bf:66
2020-07-08 11:38:01.560 [ome.event.ItemCommandEvent] - Item ‘System_CPU_TEMP’ received command 118 °F
2020-07-08 11:38:02.973 [ome.event.ItemCommandEvent] - Item ‘System_CPU_LOAD’ received command 100
2020-07-08 11:38:03.585 [ome.event.ItemCommandEvent] - Item ‘System_Used_Space’ received command 11 %
2020-07-08 11:38:03.866 [ome.event.ItemCommandEvent] - Item ‘TempUnit’ received command F
2020-07-08 11:38:04.260 [ome.event.ItemCommandEvent] - Item ‘SystemType’ received command US
20

@bsandor I’ve been following your issue and I can’t offer any direct advice on openHAB. Here’s what I wrote to manage the HestiaPi. It’s still under active development but it honors cooling setpoints. Only 3 setpoints are available for now (68, 70 and 72), which I know sounds nuts, but see resources.py for info.

Thermo.py is the main execution point for the app, then visit localhost:1949 to see the front end. I2C default address is 0x76 but that can be changed in thermo.py.

Here’s a link. I don’t have any installation documentation at the moment but you can use this for guidance.

@HestiaPi Here’s what I wrote. Super unpolished but it meets my needs. Next step is to schedule setpoint changes with an excel import.

Does it work with openHAB or does it completely replace it? Any screenshots of the LCD and the web/app UI?

@HestiaPi I’m using it as a complete replacement. I just loaded Raspbian desktop onto an SD card and set up an Openbox-style kiosk with a tmux UI until I got fed up with the screen brightness. Now I just run it headless until that OLED bonnet comes in. The long startup issues that people have had don’t appear here.

@rlkoshak’s latest improvements have drastically reduced the startup time. We have also started to work on porting to Buster distro but this has been delayed for some time now.

I might be willing to try something new at some point, but I am still hoping to get this thing working as-is. I’ve been using various versions of HestiaPi thermostats for over a year at this point, and this is the first time I’ve seen this issue. It lasts less than a day at this point before going haywire and running the AC non-stop. I can’t leave it like this, since we both work and aren’t around all the time to keep rebooting it, let alone waking up freezing. If there are no other suggestions, I’m going to re-image the SD card today and see if it was a fluke with the image I used.

I was under the impression you had @rlkoshak version… You mean you have the image from the website without modifications and it does this? Then yes please re-flash the SD and maybe check it first for hardware issues.

Yes, this was all with the version downloaded from HestiaPi site.

Thought I’d document the process of starting over with a clean image on the SD card (not the original card), in case anything shows as hardware related vs software.

8:00 - booted with new image, went through initial configuration
8:18 - after joining HestiaPi to wifi and monitoring logs via ssh, HestiaPi stopped responding. Cannot even ping it on the network anymore. Screen shows “Off” and 0 for temp and humidity (touch screen has not worked since new, so cannot attest to that interface at this time - have been dealing with the OS issues so have not had time to try to resolder connections)
8:30 - Screen now shows current (I think) temp and humidity, but I still can’t ping or ssh into it
8:43 - Temp readout on screen went up, so it’s showing actual temp, still can’t ping or SSH into it
8:49 - After performing wifi scan and checking firewall activity, it is simply not showing up on the network. Since the touch interface does not work (not sure if I can reboot from that anyway), I’ve pressed the reset button on the board to see if it will come back up functional.
8:50 - Can ping the IP now
8:51 - Able to ssh in. Running “openhab-cli showlogs” to monitor boot
8:56 - Losing ping packets as it continues to boot. Is that normal with high CPU load on a Raspberry Pi?
8:57 - Screen shows “Loading…”. Attempting to load web gui, results in many errors in log output:
“Caused by: java.lang.UnsupportedOperationException: Asynchronous processing not supported on Servlet 2.x container.”, more packet losses (< 10% loss)
9:10 - Screen up, showing “Off” and 0 degrees/humi. Getting about 30 percent packet loss on pings
9:15 - Screen now showing current temp and humidity. Packet loss dropped < 10%. Can connect via app, able to set temp set point and turn AC on auto
9:16 - Took a full minute for screen to reflect new temp set point
10:06 - Noticed the AC was not running, temp was still set at 79, I bumped it down to 78, still did not turn on. I then noticed the AC mode changed to OFF from where I set it to Auto. Changed back to Auto and now the AC is running

I will post more later if there is anything of interest

Something is not right. SD card or power supply unit (or power supply input?)

Sorry, I’ve somehow missed the continuation of this thread.

Some comments based on new information:

  • When an RPi loses power, if it happens to be writing out to the SD card at that time, it runs the real risk of corrupting not only that file but other files on the card. In the new version I worked on mentioned by hestiapi, I’ve greatly decreased the amount of logging that openHAB itself does to limit this possibility. I’m kind of hoping at some point RPi 0ws or something with equivalent pinouts has more RAM where we can put the logging into a ZRAM disk. If you see errors immediately after a power loss, file system corruption is likely the problem. Reflashing the card was a good choice.

  • The new openHAB configs take almost exactly 10 minutes to boot, a 66% reduction in boot time. I’m doing some experiments with Java 11 (OH 2.5.6 supports Java 11, OH 3 will require it) that further whittle away at the boot time, at least anecdotally. That should address the “waiting hours” problems.

  • When the display shows zeros but events.log shows Items changing state, the problem is the connection to the MQTT broker. But it’s important to understand which Items are changing. The relevant Items are:

    • MyTempProxy
    • MyHumiProxy
    • TempSetpoint (in the new version there is a MaxTempSetpoint and MinTempSetpoint)

And there are others but if you see these Items changing the display should be changing.

  • The original error in the OP I’ve only ever seen reported to occur after a clearing of the cache. But it would occur immediately after the clearing of the cache. It’s not something that would occur after a couple of days of proper operation.

Are there indications that the HestiaPi rebooted or something changed default.items? The v1.1 version, which is on the SD card downloaded from the website, does not remember settings in those cases and resets everything back to the defaults. The default setpoint is 70 degrees F. I see TempSetpoint changing from NULL to 70, which points to something causing that Item at least to get reset to NULL, the initial state an Item has when it’s first loaded. Also, the fact that TempUnit and SystemType received commands points to the initialization rule running again for some reason. That rule only runs at system started events, which can occur when OH starts, when default.rules is reloaded, and sometimes when default.items reloads, I think. Have you done any other modifications to the base SD card image? Something like setting up backups or the like?

  • I don’t know if it’s normal that the RPi doesn’t respond to pings when the CPU is pegged. That seems reasonable though.

  • The “Unsupported Operation Exception” is a non-issue and can be ignored. I’ve actually suppressed that warning. openHAB assumes at least an RPi 2, which has multiple CPU cores. This exception is complaining that the RPi 0w only has the one core. But this error is informative. It only appears when you have PaperUI or BasicUI open in a browser somewhere. I have seen cases with v1.1 of the code where having these pages open can delay the boot process significantly. This is not a problem with v1.2 as far as I’m aware. The boot scripts wait until the overall system load drops below a certain threshold before they copy the default.rules file over. Having the UIs open adds just enough load that it takes forever to drop below that threshold. While the Rules are not running, the sensor readings are not published to the LCD screen and it shows 0s.

  • As I mentioned above, the v1.1 does not remember your settings on a reboot. All the settings get reset to the defaults on every boot and the heater and AC default to OFF. So I’m not surprised the AC came back as OFF. The v1.2 remembers your settings across the reboot.

  • If you didn’t, the timezone needs to be reselected after a reburn of the SD card.

I’m inclined to agree with HestiaPi. I wonder if the power is not clean or something like that. It really looks like OH is periodically fully or at least partially restarting for some reason. Overall, the behaviors described are kind of bizarre and hard to explain as symptoms of any single root problem.
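One way to check for the kind of re-initialization described above is to scan events.log for the two signs seen in the excerpt posted earlier: Items changing from NULL, and timestamps that jump backwards. The following is only a rough sketch. The function names and the 60-second tolerance are assumptions made for illustration (the tolerance is there because openHAB log lines can be slightly out of order), and the line pattern is based solely on the log lines quoted in this thread.

```python
import re
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
# Pattern inferred from the events.log lines quoted in this thread.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[[^\]]*\] - (.*)$"
)
# Assumed tolerance: ignore small reorderings between neighbouring lines.
MIN_JUMP = timedelta(seconds=60)


def find_restart_hints(lines):
    """Return (line_number, reason) tuples that hint at a re-initialization."""
    hints = []
    previous = None
    for number, line in enumerate(lines, start=1):
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # blank, truncated, or non-event line
        stamp = datetime.strptime(match.group(1), TS_FORMAT)
        message = match.group(2)
        if previous is not None and previous - stamp > MIN_JUMP:
            hints.append((number, "timestamp moved backwards"))
        if " changed from NULL to " in message:
            hints.append((number, "item left NULL state"))
        previous = stamp
    return hints


def scan_file(path):
    """Print every hint found in the log file at the given path."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, reason in find_restart_hints(handle):
            print("line {}: {}".format(number, reason))
```

Calling `scan_file` with the path to a copy of events.log lists the line numbers worth looking at. A hit does not prove openHAB restarted; it only marks the places where the log looks like a startup.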

Whenever it was having this behavior, the system uptime showed it had not rebooted. We will see how it fares with the new SD card and new image. I’m beginning to suspect there was an issue with the last downloaded image (burned to a new SD card, the same one I’m using now), since it would not function normally from the get-go and seemed to degrade over time. I vaguely remember that when I burned the image to a new SD card after the power outage (before this problem started), BalenaEtcher complained about the image the first time I tried to burn it, but I was able to get it to go after another try. I did not then, nor have I now, made any modifications to the downloaded image - it’s “stock”.

Look forward to v1.2, even if just for the storing of the settings between reboots. We don’t lose power here a lot, but at least every few months between spring thunderstorms and winter snowstorms.

For another thread, I’m sure, but if anyone has any solution for a battery to keep the thermostat up during a power outage, that’d be a big plus.

Thanks for all the feedback and help. I’ll keep you posted if issues return or if it seems to be good in a few days time.

The whole machine need not reboot for openHAB to restart, or for openHAB to reload the .items or .rules files, which for all practical purposes here is the same as openHAB restarting.

Well, clean load did not work. I feel like I have some bad hardware. Froze up two times over 4 days, one time I couldn’t even ping or SSH into it.

This is the third one I’ve purchased. The first one was a first-gen 110v touch. The relay stopped working after about 9 months. The second one was my fault - new US HVAC version like my current one, I pre-ordered it during the fundraising to replace my dead original one. I fried it when I didn’t realize the wire inlet hole in the back of the case changed location from the first thermostat to this one, so when I mounted it the solder points on the back of the board pierced the insulation on the thermostat wires and fried the board. So, I had to order another one, this one. This has lasted only a few months, and the touch screen has not worked since new (I never tried the resoldering suggestion, won’t bother now because of all the other issues). Wondering what to do at this point. I am a big open-source proponent and am still happy to be supporting this project by purchasing hardware, but this is getting very time consuming and expensive. I now have my original mechanical thermostat back in place as it works fine.

That must be very frustrating. If we know the SD card is good, I would then focus on the power supply. As you have removed it from the wall, I would advise to plug in a microUSB power cable (at least 2 Ampere) and have it running for a few days even without the rest of the wires to see if it gets stuck again or not…

Here is what I have found out. I bypassed the on-board power supply with a separate 110VAC to 5VDC power supply connected directly to the two 5VDC leads on the board. The HestiaPi ran for a good week or so, rebooted once from a power outage, then ran again for another week solid. It was not controlling the furnace, but I was able to connect to it via the web interface the whole time it was up and operating. This morning, I ran a new known-good 24VAC power source (not connected to the furnace) to the on-board 24VAC to 5VDC power supply, and after about an hour I could no longer connect to the HestiaPi. I rebooted, and kept a terminal window open on my Mac while I worked, with a ping running to it. After about 2 hours, it started losing packets. After another hour or so, I could no longer connect to the web portal on it and could not ping it at all. I’ve rebooted it again just now, but expect the same results.
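For tests like this, a timestamped record of when the web interface stops answering can be easier to compare against power changes than watching a ping window. Below is a minimal sketch that tries a TCP connection at a fixed interval and appends the result to a file. The function names, log file name and default interval are made up for illustration; it checks TCP reachability only, not ICMP ping, so it will not show partial packet loss.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def monitor(host, port, log_path, interval=10.0, checks=None):
    """Append one timestamped line per check; checks=None runs until interrupted."""
    done = 0
    while checks is None or done < checks:
        status = "up" if is_reachable(host, port) else "DOWN"
        stamp = datetime.now().isoformat(timespec="seconds")
        with open(log_path, "a", encoding="utf-8") as log:
            log.write("{} {}:{} {}\n".format(stamp, host, port, status))
        done += 1
        if checks is None or done < checks:
            time.sleep(interval)
```

Run from another machine on the network, something like `monitor("10.0.31.150", 8080, "hestiapi-reach.log")` would use the dashboard address and port shown in the openhab.log at the top of this thread. The first `DOWN` line gives an approximate time of failure.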

Do you know the power rating of the 24V AC transformer? From your tests I understand that everything is ruled out apart from the 24V circuit (transformer and 24V to 5V power supply). I believe it is a matter of power output rather than the actual voltage. If there is a simple command (something like cat) that gets a voltage reading from the CPU, you could log its output to a file every 5 seconds or so, provided the command is not too CPU-demanding, and check the file after a crash.
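A possible way to do this, assuming the image ships the Raspberry Pi `vcgencmd` tool, is to log the output of `vcgencmd get_throttled`, which reports under-voltage and throttling flags as a hex bitmask. This is a sketch only: the function names and log path are hypothetical, and I am not certain the under-voltage bits are meaningful on the Pi Zero W, so treat an all-clear result with caution. The code avoids newer Python features in case the image has an older Python 3.

```python
import os
import subprocess
import time
from datetime import datetime

# Bit meanings of the get_throttled bitmask per the Raspberry Pi documentation.
FLAGS = [
    (0, "under-voltage now"),
    (1, "ARM frequency capped now"),
    (2, "throttled now"),
    (3, "soft temperature limit now"),
    (16, "under-voltage has occurred"),
    (17, "ARM frequency capping has occurred"),
    (18, "throttling has occurred"),
    (19, "soft temperature limit has occurred"),
]


def decode_throttled(output):
    """Turn a string like 'throttled=0x50005' into a list of flag names."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [name for bit, name in FLAGS if value & (1 << bit)]


def read_throttled():
    """Run vcgencmd on the Pi and return its raw one-line output."""
    return subprocess.check_output(
        ["vcgencmd", "get_throttled"], universal_newlines=True
    ).strip()


def log_forever(log_path, interval=5.0):
    """Append a timestamped reading every `interval` seconds until killed."""
    while True:
        raw = read_throttled()
        flags = ", ".join(decode_throttled(raw)) or "none"
        with open(log_path, "a") as log:
            log.write("{} {} ({})\n".format(datetime.now().isoformat(), raw, flags))
            log.flush()
            os.fsync(log.fileno())  # make the last line more likely to survive a crash
        time.sleep(interval)
```

Started on the Pi with, for example, `log_forever("/home/pi/throttled.log")`, the last lines before a freeze show whether an under-voltage flag was set. Note that this adds a small SD card write every few seconds, which is a trade-off given the corruption risk on power loss mentioned earlier in the thread.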