Stuck in a Loading Loop

Here are my updates:

  1. Long loop wait time: I’m using a UHS3, V1 32GB SD card.
    At a 50MHz clock I get:
Running DD WRITE test...
536870912 bytes (537 MB, 512 MiB) copied, 41.4046 s, 13.0 MB/s
temp=41.2'C

Running DD READ test...
536870912 bytes (537 MB, 512 MiB) copied, 24.1178 s, 22.3 MB/s
temp=41.2'C

Then I overclocked the SD clock to 100MHz with “sd_clock=100.000 MHz” in /boot/config.txt. This gave the following read/write results:

Running DD WRITE test...
536870912 bytes (537 MB, 512 MiB) copied, 39.5661 s, 13.6 MB/s
temp=41.2'C

Running DD READ test...
536870912 bytes (537 MB, 512 MiB) copied, 12.7305 s, 42.2 MB/s
temp=42.2'C

So for now I’ll continue with the sd overclock.
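As a sanity check on the numbers above, the throughput dd reports is just bytes copied divided by elapsed time, in decimal megabytes. A minimal sketch using the values copied from the output above; the helper name and the run labels are only for illustration:

```python
# Values copied from the dd output above: 512 MiB copied in each run.
BYTES_COPIED = 536870912


def throughput_mb_s(seconds, nbytes=BYTES_COPIED):
    """Return throughput in decimal MB/s, the unit dd reports."""
    return nbytes / seconds / 1_000_000


# Elapsed times in seconds, as reported by dd.
runs = {
    "write @ 50MHz": 41.4046,
    "read @ 50MHz": 24.1178,
    "write @ 100MHz": 39.5661,
    "read @ 100MHz": 12.7305,
}

for name, seconds in runs.items():
    print(f"{name}: {throughput_mb_s(seconds):.1f} MB/s")

# The speedup from the overclock is the ratio of elapsed times.
read_speedup = runs["read @ 50MHz"] / runs["read @ 100MHz"]
write_speedup = runs["write @ 50MHz"] / runs["write @ 100MHz"]
print(f"read speedup: {read_speedup:.2f}x, write speedup: {write_speedup:.2f}x")
```

Reads come out close to twice as fast while writes barely move, which would be consistent with writes being limited by the card itself rather than by the bus clock.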

  2. Touchscreen not working.
    I tried many ideas posted by others who also had touchscreen problems with TFT 3.5 screens or similar, even going so far as to recompile custom dts files. My problem was that the EV_KEY event would only trigger the touchdown, “value 1”, on the first touch and never the release, “value 0”:
    type 1 (EV_KEY), code 330 (BTN_TOUCH), value 1
    After the first touch no EV_KEY events ever fired again. At this point, I figured the touch overlay of the screen was no good and bought a Waveshare branded screen. The Waveshare screen exhibited the same behavior. I ended up finding these posts:
    Touchscreen not working · Issue #261 · goodtft/LCD-show · GitHub
    LCD35-Show touch screen not working · Issue #53 · waveshare/LCD-show · GitHub

These and many other posts implied that these screens’ touch panels have problems working with newer kernels (buster, bullseye, etc). I tested this theory by using the older hestiapi F/W 1.2, and the touch issue was the same.
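The symptom described above (a BTN_TOUCH press with no matching release) can be spotted mechanically in a captured event log. This is only a sketch: it assumes the log lines contain the same fields as the line quoted in the post, and the function names and the ABS_X sample line are made up for illustration.

```python
import re

# Matches the fields of lines such as:
#   type 1 (EV_KEY), code 330 (BTN_TOUCH), value 1
EVENT_RE = re.compile(
    r"type (\d+) \((\w+)\), code (\d+) \((\w+)\), value (-?\d+)"
)


def touch_values(lines):
    """Return the BTN_TOUCH values (1 = press, 0 = release) in order."""
    values = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match and match.group(2) == "EV_KEY" and match.group(4) == "BTN_TOUCH":
            values.append(int(match.group(5)))
    return values


def has_unreleased_press(lines):
    """True if the last BTN_TOUCH event seen was a press with no release after it."""
    values = touch_values(lines)
    return bool(values) and values[-1] == 1


log = [
    "type 1 (EV_KEY), code 330 (BTN_TOUCH), value 1",
    "type 3 (EV_ABS), code 0 (ABS_X), value 1874",  # illustrative line, not from the post
]
print(has_unreleased_press(log))
```

On a healthy panel every press should eventually be followed by a “value 0” line, so the check would return False once the finger is lifted.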

I started to research the GPIO that is used for the touch interrupt, and on these screens it is GPIO 17.
I used dtoverlay parameters to experiment, and found that if I added the ads7846 overlay and commented out the display overlay the touch screen worked perfectly; this told me the issue was not the hardware.

dtoverlay=ads7846,penirq=17
#dtoverlay=tft35a:rotate=270

Going back and forth between the overlays, I found a horrendous hack:

  • Disable tft35a, enable ads7846, then reboot: touch works but no display
  • After the reboot, edit the config to enable tft35a and leave ads7846 enabled, then “reboot”: the touch works and the display works!

I then powered down so that I could do a “cold” hard boot, and this just yielded the original problem: no touch, but the display works. For about 15 minutes I worked on a soft-boot hack that would toggle an available GPIO to tell whether the hardware had hard booted or soft booted. I gave up when I realized the pain of it all.

So, I went back to GPIO 17 and found that if I add a short polling loop against GPIO 17 after bootup via a Python script, it forces the touch to work:

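import time

import RPi.GPIO as GPIO

GPIO.setwarnings(False)
GPIO.setmode(GPIO.BCM)   # use Broadcom (BCM) pin numbering
GPIO.setup(17, GPIO.IN)  # GPIO 17 is the touch interrupt (penirq) line

# Poll the pin 10 times, 0.1 s apart (about 1 s in total).
for _ in range(10):
    state = GPIO.input(17)
    # print(state)
    time.sleep(0.1)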

I tried this on both screens and this worked.

So this thread is done. I have another issue, for which I will start a new thread.

I have the same issue - the “Starting in XXX seconds …” screen never goes away. It will count down to “Loading …”, but then always goes back to “Starting in XXX seconds …”.

This is after I flashed hestia-pi-ONE-v1.4-dev-bullseye-6185.img to a Kingston Canvas React microSD card. The card is C10, U3, A1, V30. I got it because of how unbearably slow openHAB is to boot; this is the fastest thing I could find.

Output of w command after sshing in:

02:25:51 up 41 min, 1 user, load average: 5.12, 6.44, 6.66

This is a slight improvement from ~10 minutes ago when it was a load average of 9.

The only reason I upgraded was because openHAB couldn’t download addons anymore (that was version 1.2).

[2024-07-05 Fri 19:09] - Finally came up and is cooling, although it’s very laggy in response in the app, and I keep getting “The sitemap could not be loaded. The following error occurred: Connection to host failed”

[2024-07-06 Sat 05:31] - Is finally and truly up. I’m considering this problem solved. Apparently it just takes hours of waiting on the first boot. Is there any way this time could be reduced by doing some of the work in the images that are available for download?

I’ve spent a lot of time trying to figure this one out, where the load average never drops, and thus the loading screen never goes away.

It’s been very frustrating because it’s been an intermittent issue. I’ll experience it, try to figure it out, end up reflashing the SD card and then I can’t reproduce it. It’s all the same hardware, including the SD card, and the same image that I’m flashing as well. I don’t know if something is sometimes getting corrupted on first boot or what.

As crazy as it seems, just reflashing the v1.4 image is pretty likely to work. I wish I could explain why because it really doesn’t make sense to me.

In my experience and from others on the forum here, using a faster SD card does seem to help with this problem, but it doesn’t solve it (obviously).

If you want to go back to v1.2, you might be able to get the OpenHAB addons working by making the configuration change mentioned here.

It’s stuck again, due to a power outage. This time, I waited over two hours (according to uptime on the HestiaPi itself), with no progress beyond the “Loading …” screen (happily, it got the correct IP address, so it’s finally remembering the WiFi I set up, unlike previous versions). I finally rebooted it over ssh with “shutdown -r now”. Now waiting again, but it’s cycling through the “Starting in XXX seconds …” and “Loading …” screens.

[2024-08-22 Thu 11:08] - I’m done. This is unacceptable. I’m going back to my old “dumb” thermostat. At least it works.

ETA: Other bonuses to the “dumb” thermostat:

  1. battery backup
  2. screen isn’t insanely bright all the time
  3. “boots” in seconds

ETA2: I have multiple pieces for variants of the HestiaPi - I will ship them to people for best offer + cost of shipping. Make me an offer.

Sorry to hear about this problem rearing its head again, and I completely understand your frustration. I wish we had better solutions for you today. We are working on smashing this bug, speeding up boot times, and developing another model of hardware that will boot within seconds, but they’re not done yet.

For anyone else who might see this thread, know that we are planning on completely rewriting the image from scratch to remove OpenHAB (discussion thread on that). That is likely the component that is keeping the load average high, and thus keeping the loading screen from going away.

The new image will boot faster, and still won’t require batteries. The screen brightness is a function of the hardware, but a community member contributed a screen dimmer. We’re also working on a new model (Hestia32) which will have a screen where the backlight can be turned off.

In the meantime, for people who are comfortable SSHing into their Pi, or mounting the SD card and editing the file that way, this can be fixed by editing /home/pi/scripts/kiosk-xinit.sh so that it does not wait until the load average is under 2.00. This is a matter of changing the “2.00” to something larger, like “5.00” or “10.00”.
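To illustrate the kind of check being described, here is a rough Python sketch of “wait until the load average drops below a threshold”. It is not the contents of kiosk-xinit.sh (a shell script that isn’t shown in this thread); the function name, its parameters, and the timeout in particular are assumptions added for illustration, the timeout being one way to keep the wait from going on forever.

```python
import os
import time


def wait_for_low_load(threshold=2.00, timeout=600.0, poll_interval=5.0,
                      get_load=lambda: os.getloadavg()[0],
                      sleep=time.sleep, clock=time.monotonic):
    """Wait until the 1-minute load average drops below `threshold`.

    Returns True if the load dropped in time, or False if `timeout`
    seconds passed first, so the caller can carry on regardless.
    """
    deadline = clock() + timeout
    while True:
        if get_load() < threshold:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)


if __name__ == "__main__":
    # timeout=0 means: check the load once and return immediately.
    print(wait_for_low_load(threshold=5.00, timeout=0))
```

Raising the threshold from 2.00 to 5.00 corresponds to the workaround above; the timeout is a separate idea that would let the UI start even if the load never settles.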

To be clear, that change is just a workaround. Reflashing the SD card will fix the problem. Neither of these is acceptable, and as I said, we’re working on an entirely new image which avoids this problem altogether.

This is a good start, but I’d warn against making the same mistakes OpenHAB made. For one, the choice of Java and Python on hardware this low-end seems feckless. I know a large part of the problems are due to poor architecture (which can afflict a project in any language), but Java projects seem to have a bad habit of pulling in a bajillion grotty little things that are not needed.

And while I love Python (it’s what I currently implement at least my main project in) and understand the mass appeal and availability of programmers, the GIL and no native compiled programs feel like they are also holding back boot speed.

When I check top on a “locked” HestiaPi, it’s always a Java process and Python process. Just move to something like Go or Rust, or heaven forfend, Common Lisp where you have SBCL to compile to native. Hell, Erlang seems like it was made for this niche (but that might be viewed as even more esoteric than Lisp).

That’s my two cents as a long time embedded and real-time software engineer. I realize this is all armchair quarterbacking, that I’m not putting up (albeit the thought to write a replacement in CL has tempted me; so many projects, so little time), so feel free to tell me to STFU.

I may try your fix, but have other pulls on my time (the aforementioned projects, etc), so it’ll be a few months. Thanks for getting back to me on this.

I agree. I also don’t want to reinvent the wheel when there are already so many existing home automation projects out there.

I spent today looking into existing open source projects that we might be able to join forces with. The top contender (MyController) is written in GoLang, and they specifically mention running on the Pi Zero. I wouldn’t say I entirely know GoLang, but it’s close enough to C and Python that I can usually get code that does what I want with a little trial and error.

And if that doesn’t work out, I have a list of other projects to investigate.


I’m experimenting with my own fork of the bash script that skips the CPU load check and loads a Python UI (which reads the temperature directly and monitors/publishes MQTT messages). The idea is that the user should have a UI immediately. This UI will show that openHAB is still loading, informing the user that triggers/events and web access are on hold until OH is fully ready.
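One way the “openHAB is still loading” indicator might work is to probe openHAB’s web port and change the status text accordingly. This is a sketch only, not code from the fork: it assumes openHAB runs on the same Pi and listens on its default HTTP port 8080, and the function names and the 21.5 reading are placeholders. An open port also doesn’t prove openHAB has finished loading its rules, so treat it as a rough signal.

```python
import socket

OPENHAB_HOST = "127.0.0.1"  # assumption: openHAB runs on the same Pi
OPENHAB_PORT = 8080         # assumption: openHAB's default HTTP port


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def status_line(temperature_c, openhab_ready):
    """Build the text shown to the user; the temperature is always shown."""
    if openhab_ready:
        return f"{temperature_c:.1f} C"
    return (f"{temperature_c:.1f} C "
            "(openHAB still loading: triggers/events and web access on hold)")


ready = is_port_open(OPENHAB_HOST, OPENHAB_PORT)
print(status_line(21.5, ready))  # 21.5 is a placeholder reading
```

The point of the design is that the temperature display never waits on openHAB; only the extra notice depends on the probe.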