Lots of issues with v1.4-dev

My touch one damaged its file system, and I had to reinstall. The only image I could find that would boot was 1.4-dev, but it has numerous problems:

  1. It keeps either forgetting the default route, or just not getting it from DHCP (yes, the DHCP server correctly indicates the default route).
  2. It acts like none of the sensors or IO pins are hooked up. For example, the temp sensor in PaperUI shows this:

Traceback (most recent call last):
  File "/home/pi/scripts/bme280.py", line 182, in <module>
    main()
  File "/home/pi/scripts/bme280.py", line 170, in main
    (chip_id, chip_version) = readBME280ID()
  File "/home/pi/scripts/bme280.py", line 65, in readBME280ID
    (chip_id, chip_version) = bus.read_i2c_block_data(addr, REG_ID, 2)
OSError: [Errno 121] Remote I/O error
Traceback (most recent call last):
  File "/home/pi/scripts/bme280.py", line 182, in <module>
    main()
  File "/home/pi/scripts/bme280.py", line 170, in main
    (chip_id, chip_version) = readBME280ID()
  File "/home/pi/scripts/bme280.py", line 65, in readBME280ID
    (chip_id, chip_version) = bus.read_i2c_block_data(addr, REG_ID, 2)
OSError: [Errno 121] Remote I/O error

  3. None of the relays actuate - turning on the fan does nothing, no relay click, no voltage on the line.
  4. It doesn’t seem to remember things like the location I’ve set, or the language, or the timezone.

At first glance these seem to be unrelated problems, but the last two may both be OpenHAB. And just to confirm, v1.4-dev is the latest release as of right now.

First we have the DHCP issue. This is one I haven’t seen before, so hopefully the logs will tell us something. Raspberry Pi OS uses dhcpcd (at least up to and including the Debian 11 based release, which is what v1.4-dev is built on). The configuration is in /etc/dhcpcd.conf. You can see the status with systemctl status dhcpcd and the logs with journalctl -u dhcpcd. The log should include a message something like this:

Sep 01 20:34:24 raspberrypi dhcpcd[297]: eth0: adding default route via 10.0.2.2
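
If it helps to check a longer log for that message, here is a minimal Python sketch. It assumes the log text has already been saved or piped in from the journal; the function name find_default_routes is made up for illustration, and the pattern is based only on the single example line above, so other dhcpcd versions may word it differently.

```python
import re

# Pattern modelled on the one example line above, e.g.
# "eth0: adding default route via 10.0.2.2" (assumed wording).
DEFAULT_ROUTE = re.compile(r"(\S+): adding default route via (\S+)")


def find_default_routes(log_text):
    """Return (interface, gateway) pairs for each default route found in the log."""
    return DEFAULT_ROUTE.findall(log_text)


sample = (
    "Sep 01 20:34:24 raspberrypi dhcpcd[297]: "
    "eth0: adding default route via 10.0.2.2"
)
print(find_default_routes(sample))
```

An empty list would mean dhcpcd never logged a default route for any interface in the text it was given.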

Second we have the temperature sensor issue. If you are able to SSH into the Pi, you should be able to do some debugging with i2cdetect. See section 4.3 of the owner’s manual for details and let us know what you find.
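
For reading the i2cdetect grid, a small Python sketch like the one below may help. It assumes the usual grid layout printed by i2cdetect (for example from i2cdetect -y 1, where bus 1 is an assumption and the manual may say otherwise) and relies on the BME280 answering at 0x76 or 0x77. The function name detected_addresses and the sample row are illustrative only.

```python
import re


def detected_addresses(i2cdetect_output):
    """Collect the I2C addresses that appear as hex cells in an i2cdetect grid."""
    found = []
    for line in i2cdetect_output.splitlines():
        row, sep, cells = line.partition(":")
        # Data rows start with "00:", "10:", ... "70:"; the header row has no colon.
        if not sep or not re.fullmatch(r"[0-9a-f]0", row.strip()):
            continue
        for cell in cells.split():
            # "--" means no response and "UU" means claimed by a driver; skip both.
            if re.fullmatch(r"[0-9a-f]{2}", cell):
                found.append(int(cell, 16))
    return found


BME280_ADDRESSES = {0x76, 0x77}

sample_row = "70: -- -- -- -- -- -- 76 --"
present = BME280_ADDRESSES.intersection(detected_addresses(sample_row))
print("BME280 candidate found" if present else "no BME280 address responded")
```

If nothing at 0x76 or 0x77 shows up, the sensor is not answering on that bus, which would fit the Remote I/O error in the traceback.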

Third, there’s the relays. There are instructions on how to determine if the relay hardware is working in section 4.2 of the owner’s manual. This will tell us if the problem is that the relays don’t work, or if it’s that OpenHAB isn’t trying to turn on the relays. My bet is on the latter, but we should verify this before getting too far into things.

For the fourth item, let’s focus on the timezone and see if solving that solves everything else. When you use the basic web UI (Basic UI → Settings) to set your timezone and reboot, it should certainly persist. When troubleshooting, it’s worth noting that Java has its own timezone settings that are separate from the OS. This means that if you go to the Paper UI and check Configuration → System to see your timezone, I wouldn’t expect it to be set there. It’s the system timezone that we care about.

After setting the timezone in the Basic UI, /etc/timezone should be gone (it would have been set to “Europe/London” originally). The symbolic link at /etc/localtime should point to your timezone file (e.g. be a symlink to /usr/share/zoneinfo/America/New_York). You can also see the system timezone on the command line with the date command.
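
The symlink check can also be scripted. This is a minimal Python sketch, assuming the Debian-style layout described above where /etc/localtime links into /usr/share/zoneinfo; the helper names zone_from_link_target and system_timezone are made up for illustration.

```python
import os


def zone_from_link_target(target):
    """Turn a link target like /usr/share/zoneinfo/America/New_York into 'America/New_York'."""
    marker = "zoneinfo/"
    idx = target.find(marker)
    if idx == -1:
        return None  # the link does not point into a zoneinfo directory
    return target[idx + len(marker):]


def system_timezone(localtime_path="/etc/localtime"):
    """Return the zone name behind /etc/localtime, or None if it is not a symlink."""
    if not os.path.islink(localtime_path):
        return None
    return zone_from_link_target(os.readlink(localtime_path))


print(system_timezone())
```

Running it before and after a reboot is one way to see whether the system timezone itself is persisting, separately from whatever the Paper UI shows.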

If you just want a quick fix for the timezone issue, you can use timedatectl to set it. This should be picked up by OpenHAB, displayed in the OpenHAB UI, and persist across reboots.

Finally, I’d just like to say thank you. I appreciate you taking the time to come report these issues here. It’s hard to fix intermittent problems, and since I can’t reproduce any of these issues, I rely on reports like yours to try to figure out what the root cause might be and fix it not only for you but hopefully for everyone in the next release.

Speaking of which, I’d also like to mention that I am working on a v1.5-dev right now and I’m making solid progress, including a fix for some long-standing intermittent errors with installing OpenHAB add-ons. So there is reason to be optimistic. It takes a while to rebuild everything from scratch on a Raspberry Pi, but I’m dedicated to grinding it out. :flexed_biceps:

First, for dhcpcd:

Sep 01 10:34:47 raspberrypi dhcpcd[588]: wlan0: IAID eb:dc:dc:c1
Sep 01 10:34:47 raspberrypi dhcpcd[588]: wlan0: adding address fe80::32d0:1186:3d98:ef43
Sep 01 10:34:47 raspberrypi dhcpcd[588]: wlan0: soliciting an IPv6 router
Sep 01 10:34:48 raspberrypi dhcpcd[588]: wlan0: rebinding lease of 10.16.0.91
Sep 01 10:34:48 raspberrypi dhcpcd[588]: wlan0: probing address 10.16.0.91/8
Sep 01 10:34:52 raspberrypi dhcpcd[588]: wlan0: leased 10.16.0.91 for 43200 seconds
Sep 01 10:34:52 raspberrypi dhcpcd[588]: wlan0: adding route to 10.0.0.0/8
Sep 01 10:34:52 raspberrypi dhcpcd[588]: wlan0: adding host route to 224.0.0.0
Sep 01 10:34:52 raspberrypi dhcpcd[588]: wlan0: adding host route to 239.0.0.0

For reference, the DHCP server is configured thus:

subnet 10.0.0.0 netmask 255.0.0.0
   {
      option broadcast-address 10.255.255.255;
      range 10.5.0.0 10.5.255.255;
      option routers 10.0.0.1;
      option subnet-mask 255.0.0.0;
      option netbios-name-servers 10.0.0.1;
      option domain-name-servers 10.0.0.1,10.0.0.1;

   }
}

group 
{ 
   option classless-routes 24,224,0,0,0,0,0,0,8,239,0,0,0,0;
      option ntp-servers 10.0.0.1;
      option lpr-servers 10.0.0.1;
      option nis-servers 10.0.0.1;
      option pop-server  10.0.0.1;
      option log-servers 10.0.0.1;
   use-host-decl-names on;
   host Thermostat
   {
      hardware ethernet <redacted>;
      fixed-address 10.16.0.91;
   }

As you can see, there is the router address in there, so I don’t know why it’s not picking that up.

Second, the temperature. Thanks for the pointer to the owner’s manual, that will help.
i2cdetect is not showing anything, so I’ll start troubleshooting that.

Third, the relays. I found a bug in the documentation:

echo 1 | /sys/class/gpio/gpio12/value # turn on
echo 0 | /sys/class/gpio/gpio12/value # turn of

Should be >, not |
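
For anyone scripting the same test, the corrected redirect amounts to writing "1" or "0" into the value file. Here is a minimal Python sketch of that, exercised against a temporary stand-in file rather than real hardware. The function name set_gpio is made up for illustration, and the gpio12 path is taken from the commands quoted above; whether that pin is already exported and set as an output on a given image is an assumption.

```python
import tempfile
from pathlib import Path


def set_gpio(value_path, on):
    """Write '1' or '0' to a sysfs GPIO value file, like `echo 1 > .../value`."""
    Path(value_path).write_text("1" if on else "0")


# On the device the call would look like this (needs root, pin already exported):
#   set_gpio("/sys/class/gpio/gpio12/value", True)   # turn on
#   set_gpio("/sys/class/gpio/gpio12/value", False)  # turn off

# Demonstration against a stand-in file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    fake_value = Path(tmp) / "value"
    fake_value.write_text("0")
    set_gpio(fake_value, True)
    state_after_on = fake_value.read_text()
    set_gpio(fake_value, False)
    state_after_off = fake_value.read_text()

print(state_after_on, state_after_off)
```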

This may be an own-goal - I am testing this on my bench, using a variable DC supply rather than an AC supply, and if those relays are SSRs using opto-triacs rather than real honest to Pete electro-mechanical relays they may not like that. I’ll investigate further, now that I know where to manually turn them on.

Fourth: I was only working in the PaperUI timezone, which was what was not persisting. The system time zone is set correctly.

Fifth: Can you not use a cross-compiler running on something with some horsepower? I’ve done scadloads of cross-development for ARM using AMD64 hosted compilers. Under Ubuntu it’s pretty easy to install the ARM cross compilers.

Well, I can tell that the I2C is showing a resting state of 3.3V, and is NOT transitioning low at any time (dual channel 1GHz scope on data and clock lines). This is measured at the temp sensor.

I won’t be able to do a lot more debug tonight - I am getting weather and power glitches, and while all my computers are on back-up, the HestiaPi is not. No sense courting glitches in the file system.

  1. I’ll put this in my queue of things to try to reproduce. Everything there looks reasonable. The only thing I can think of is that maybe the host block is overriding more than is desired, but I have enough Raspberry Pi boards that I can test that out.
  2. This is one of those situations where it’d be great to be able to verify the sensor works, if you have any other gear that can do so.
  3. :person_facepalming: How embarrassing. I now have an MR in to fix the documentation. And these relays are weird when you dig down into the datasheet. They act totally normal with 24V AC, and reportedly 240V AC (and thus presumably 120V AC), but I remember there seemed to be some gotchas outside of our use case.
  4. I’ll keep an eye out for this one as I’m testing v1.5-dev to see if I can reproduce the issue.
  5. It’s not the compiling, it’s booting, resizing the partition, adding SSH keys, rebooting, installing the LCD drivers, adding apt repos, installing software, setting up OpenHAB, and so on. It’s not so bad if I’m just setting up a single machine manually. It’s the fact that I want the entire setup process to be repeatable and automated, and so I’m re-doing all of that work each time that I think I’ve fixed the current issue.

Oh, I can respect making things repeatable and automated - that’s one of my peeves with other engineers I’ve worked with who FAIL to do that. I’ll spend a day automating an hour-long process, if that process will be re-executed many times, so that it WILL be repeatable. (I once spent quite some time setting up a RedHat kickstart to build a server for the company I worked for, so that at any point thereafter they could completely rebuild it with 2 CDs).

As for testing the sensor: No, I don’t have any I2C controllers lying around. I have a couple of other Pis around, and I may be able to test. However, the fact that the clock line isn’t toggling makes me think the problem is on the Pi end of things.

As for the routes - could the Pi’s DHCP be getting confused about the routes for the multicast networks - maybe trying to re-use a variable for them as well as the gateway route? This is how my whole network works, and all the other hosts are fine with this and get the gateway.
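
To see exactly which routes that classless-routes line hands out, the bytes can be decoded by hand or with a short Python sketch like the one below. It assumes the option is defined on the server as DHCP option 121 and encoded per RFC 3442 (prefix length, then only the significant destination octets, then a 4-octet router); the function name decode_classless_routes is made up for illustration.

```python
def decode_classless_routes(values):
    """Decode RFC 3442 classless static route bytes into (destination/prefix, router) pairs."""
    routes = []
    i = 0
    while i < len(values):
        prefix_len = values[i]
        significant = (prefix_len + 7) // 8  # destination octets actually sent
        dest = values[i + 1:i + 1 + significant] + [0] * (4 - significant)
        router = values[i + 1 + significant:i + 5 + significant]
        if len(router) != 4:
            raise ValueError("truncated classless route entry")
        routes.append((
            ".".join(map(str, dest)) + "/" + str(prefix_len),
            ".".join(map(str, router)),
        ))
        i += 1 + significant + 4
    return routes


# The values from the group block in the server config above.
option_bytes = [24, 224, 0, 0, 0, 0, 0, 0, 8, 239, 0, 0, 0, 0]
for destination, router in decode_classless_routes(option_bytes):
    print(destination, "via", router)
```

Neither decoded entry is a default route (0.0.0.0/0). RFC 3442 also says that a client which receives the Classless Static Routes option must ignore the Router option, so a client that follows the RFC strictly would end up with the two multicast routes and no gateway. That may be worth ruling out here, for example by temporarily removing the classless-routes line or adding a 0.0.0.0/0 entry to it.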

I have a bit more information:

  1. Every time I restart it, it goes back to the “set up the WiFi” screen and I have to connect to its WiFi and tell it to use mine. Checking wpa_supplicant.conf shows that the SSID and passphrase have been written to the file, and I’ve done a sync to force it to storage. It’s acting like it keeps going back through the “reset everything for the first time” code.
  2. I’ve verified that at least 2 of the relays trigger now that I am using AC to drive it, so that seems to be the problem there. I cannot say with 100% certainty that all the relays work, as 2 didn’t light the LEDs, but that may be a bad connection - investigating.
  3. I’ve noticed some font corruption on the panel during start up. I don’t know if that means anything, but in the spirit of “If something funny seen, record amount of funny here”, I am reporting it.
  4. I have verified that I2C clock is running and that data is NOT - indeed, what I see is a droop of the voltage on the data line, so that does tend to make me think the sensor is dead.

One of the relays is not working. Again, maybe my fault; I will check the control signal to it, and replace if needed. I can also get a sensor.

OK, I have a sensor and relay on order. Mouser usually is one day for me, so tomorrow I should be able to test that part of things.

While I wait for parts, I was poking around - this time the system came up directly onto the network (no default gateway, still). I noticed that you have ModemManager loaded - is that strictly needed? It might be a bit of fat to trim…

After seeing you say that wpa_supplicant.conf was saved, it jogged my memory that people have reported this problem when they don’t have 2.4GHz wifi. The Raspberry Pi doesn’t support 5GHz.

I’ll keep an eye out for that text overwriting thing. I admit that I don’t usually watch them boot anymore when I’m testing them out. After I power them on, it’s time to go get a snack or some tea and come back in a bit.

As for ModemManager, that should be stopped and disabled…

I guess something went wrong with my automation and we didn’t catch it in final testing. However, I can say that the Debian 12 based image doesn’t have it running. I suppose I could just remove the modemmanager package entirely. :thinking:

In any case, I’ll save that for the end. Right now X tells me that the screen isn’t detected, and the HDMI cuts over to a black screen, so I can’t even log in with a keyboard/monitor attached. This is always the worst part of upgrades.

OK, I have the hardware working, but:
Now, it always comes up in the “set up wifi” mode with its own network, even though I’ve configured /etc/wpa_supplicant/wpa_supplicant.conf with the correct network information. Moreover, when it comes up in that mode, it never seems to find any of the local 2.4GHz networks - the drop down is always blank, and of course it won’t let me just type in the information.
What drives the decision that it must reconfigure the SSID, and what can I do to override that?

That decision is made based on whether it can connect to the WiFi access point that is configured. If it can connect to wifi, it does so; if not, it falls back to spinning up its own AP.

I have “good” news for you on this front: I’m running into this same thing on the thermostat that I have on the bench right now (the bookworm-based one). I’m not happy to have this problem, but I am glad I’m able to reproduce it with an AP that I’ve used before with this exact Raspberry Pi!

As a workaround, I was able to use nmcli device wifi connect <AP name> password <password> to set the wifi configuration manually. I found that command in a Stack Overflow answer. After running that, I found that the Pi would automatically connect to wifi after reboots and skip the initial wifi setup. I’m not sure if this will work on bullseye; I’m currently only working with bookworm.

A reproducible bug you can work on is infinitely preferable to a random bug you cannot. I’ll try that and see if it works for me as well.