Waterbug helps you monitor leaks that are being collected in any sort of container. Two float sensors mounted in the container detect when the fluid level reaches their trip points and report back to the IBM Bluemix dashboard via MQTT. A high-current relay controls a water pump that can empty the container at preset levels or on demand from the dashboard. A bootloader and HTTP downloader allow OTA firmware updates, as well as firmware validation and failsafe rollback to known working versions.
The normal use case is managing a container that catches a water leak. The float sensors are set up in the container, with one at a low level near the bottom and one at the top. The low-level sensor provides an early warning that a leak is ongoing. When the water reaches the high-level sensor, the relay is actuated and the water pump begins removing the water. The pump keeps running until the water drops below the low-level sensor; at that point the low-level float sensor turns off and the relay is deactivated.
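The pump behaviour described above is a two-threshold (hysteresis) control loop. A minimal sketch in C, assuming each float sensor reads as a boolean that is true when submerged; the function and parameter names are hypothetical and not taken from the Waterbug firmware:

```c
#include <stdbool.h>

/* Hypothetical helper: decide the next relay state from the two float
 * sensors. Names are illustrative, not from the Waterbug source. */
bool pump_next_state(bool pump_on, bool low_wet, bool high_wet)
{
    if (high_wet) {
        return true;   /* water reached the top sensor: start pumping */
    }
    if (!low_wet) {
        return false;  /* water fell below the bottom sensor: stop */
    }
    return pump_on;    /* between the sensors: keep the current state */
}
```

Holding the previous state between the two sensors keeps the relay from chattering while the level sits between the trip points.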
The bootloader sits at the base address of the SAMD21's non-volatile memory. At the end of the bootloader partition, a small boot status struct holds fields that track the currently running image, the downloaded image, the number of watchdog resets, and an upgrade enable flag. An external SPI flash chip holds firmware in three partitions. The base partition holds a "golden" image for app recovery. The other two partitions store downloaded applications: one that is currently running, and another that has been downloaded to perform an upgrade.
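One possible layout for the boot status struct is sketched below. The field names, field widths, and partition numbering are assumptions made for illustration; the source only states which pieces of information the struct tracks.

```c
#include <stdint.h>

/* Hypothetical layout; names and widths are assumptions. */
typedef struct {
    uint8_t running_image;    /* partition index of the image now in NVM */
    uint8_t downloaded_image; /* partition index holding the new download */
    uint8_t watchdog_resets;  /* watchdog resets counted so far */
    uint8_t upgrade_enable;   /* nonzero: bootloader should install download */
} boot_status_t;

/* Assumed numbering of the three external flash partitions. */
enum { PART_GOLDEN = 0, PART_APP_A = 1, PART_APP_B = 2 };

/* The application partition that is not running is the download target. */
static inline uint8_t download_target(const boot_status_t *s)
{
    return (s->running_image == PART_APP_A) ? PART_APP_B : PART_APP_A;
}
```

Alternating between the two application partitions means the image that is currently running is never overwritten by a download in progress.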
When the bootloader starts, it checks whether the upgrade enable flag is set. If it is, the bootloader uses the downloaded image flag to select the base address in flash from which to read the downloaded application. It then writes this application to the NVM's application space, resets the boot status flags, and resets the microcontroller.
If the bootloader starts and there is no upgrade waiting, it checks that the currently running image is valid by counting watchdog resets and inspecting the reset vector of the application. If either of these checks fails, the bootloader copies the golden image to the NVM and resets to recover to a known good state.
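The boot-time decision can be summarized as a small pure function. This is a sketch under stated assumptions: the watchdog reset limit, the application address range, and the exact reset vector test are all hypothetical, since the source does not specify them. The only hardware fact relied on is that Cortex-M vector table entries have bit 0 set (Thumb state) and that erased flash reads as 0xFFFFFFFF.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    BOOT_INSTALL_DOWNLOAD, /* copy the downloaded image into NVM */
    BOOT_RESTORE_GOLDEN,   /* copy the golden image into NVM */
    BOOT_RUN_APP           /* jump to the application already in NVM */
} boot_action_t;

#define MAX_WDT_RESETS 3u          /* assumed limit */
#define APP_START      0x00004000u /* assumed start of application space */
#define APP_END        0x00040000u /* assumed end (256 KB flash variant) */

/* Plausibility check on the application's reset vector. */
static bool reset_vector_ok(uint32_t vec)
{
    return vec != 0xFFFFFFFFu      /* not erased flash */
        && (vec & 1u) != 0u        /* Thumb bit set */
        && vec >= APP_START && vec < APP_END;
}

boot_action_t boot_decide(bool upgrade_enable, uint8_t wdt_resets,
                          uint32_t reset_vector)
{
    if (upgrade_enable) {
        return BOOT_INSTALL_DOWNLOAD;
    }
    if (wdt_resets >= MAX_WDT_RESETS || !reset_vector_ok(reset_vector)) {
        return BOOT_RESTORE_GOLDEN;
    }
    return BOOT_RUN_APP;
}
```

Keeping the decision separate from the flash copy routines makes it easy to exercise on a host machine before it runs on the target.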
A firmware update is triggered by clicking a button in the Node-RED dashboard and is carried out by the application. The update routine starts by de-initializing the MQTT stack and configuring an HTTP client. The HTTP client downloads the latest firmware image and a CRC file. As the image downloads, a running CRC32 is calculated over the received data. The final CRC is compared against the value in the downloaded CRC file, and the upgrade proceeds only if they match. If not, the application terminates the downloader and resets.
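Computing the CRC while the download is in progress means the CRC routine has to accept data chunk by chunk. The sketch below uses the common CRC-32 variant (reflected polynomial 0xEDB88320, as used by zlib and Ethernet); that the project uses this particular variant is an assumption, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold one chunk into a running CRC-32. Start with crc = 0 and pass the
 * previous return value back in for each subsequent chunk. */
uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* shift right; XOR in the polynomial if the dropped bit was 1 */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

Because the inversion is undone and redone at each call boundary, feeding the image in arbitrary chunk sizes gives the same result as a single pass over the whole file. A table-driven version would be faster at the cost of 1 KB of lookup data.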
If the downloaded image is valid, the application writes the downloaded image location and the upgrade enable flag to the boot status struct, then resets the microcontroller so the bootloader can write the application to NVM.
Waterbug connects to the cloud through an MQTT broker (Mosquitto) that also communicates with IBM Bluemix. The Bluemix dashboard serves as the web interface for the device.
Bluemix sends a heartbeat request at regular intervals, which asks the device to respond with its firmware version. If the version is received, it is displayed on the dashboard along with an "OK" for the heartbeat status. If a response is not received, the status changes to "Unreachable". Bluemix can also send manual requests to turn the relay on and off.
The device sends updates of its float sensor states to Bluemix whenever they change.
The initial design, fabrication, and bring-up went smoothly, but some of the board modules later started to fail for no obvious reason. This was eventually tracked down to a connection between the status output of the battery management IC and a microcontroller pin. The status output was tri-state, with 5V logic that presented a high state when the charge cycle was complete. Since this high state was well over the rated input voltage of the 3.3V microcontroller, it eventually caused failures in the modules. The problem was solved by cutting the trace for this signal, which had the negative consequence of removing the controller's ability to read the charger status. In a future revision, this signal should be level-shifted to 3.3V.
The power path management circuit did not perform as designed. With the battery connected and no USB input present, the gate of the PMOS pass device was charged to roughly 1V instead of being pulled down to 0V. The root cause was reverse leakage across the Schottky diode, and the issue was solved by reducing the value of the pulldown resistor.
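The reason a smaller pulldown helps can be seen from a first-order estimate. Assuming the gate node sits between the Schottky diode and the pulldown resistor, and that essentially all of the diode's reverse leakage current flows through that resistor, the gate voltage is approximately the product of the two. The symbols and the numeric values below are illustrative assumptions, not measurements from the board:

```latex
V_G \approx I_R \, R_{PD}
% Hypothetical example values:
% I_R = 10\,\mu\mathrm{A},\ R_{PD} = 100\,\mathrm{k\Omega} \Rightarrow V_G \approx 1\,\mathrm{V}
% I_R = 10\,\mu\mathrm{A},\ R_{PD} = 10\,\mathrm{k\Omega}  \Rightarrow V_G \approx 0.1\,\mathrm{V}
```

Under these assumptions the gate voltage scales linearly with the pulldown value, at the cost of a proportionally higher current through the resistor whenever USB power is present.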
Several parts of the software development process were more difficult than anticipated.
The first major hurdle was the HTTP downloader. Since the download arrived in chunks of unpredictable sizes that were not aligned to the flash memory's pages, we had to create a circular buffer that stored the chunks and wrote them to the flash in 256-byte increments. It was also challenging to discover all of the reconfiguration and flag resets necessary to make the downloader execute a second request to download the CRC file.
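The chunk-to-page problem can be illustrated with a short sketch. For brevity it uses a single-page staging buffer rather than the circular buffer described above, and the flash driver is replaced by a RAM-backed stand-in so the logic can run on a host machine. All names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLASH_PAGE_BYTES 256u

/* Hypothetical callback standing in for the SPI flash page-program call. */
typedef void (*page_write_fn)(uint32_t addr, const uint8_t *page, size_t len);

typedef struct {
    uint8_t       page[FLASH_PAGE_BYTES]; /* staging area for one page */
    size_t        fill;                   /* bytes currently staged */
    uint32_t      next_addr;              /* flash address of the next page */
    page_write_fn write;
} page_writer_t;

void pw_init(page_writer_t *pw, uint32_t base_addr, page_write_fn write)
{
    pw->fill = 0;
    pw->next_addr = base_addr;
    pw->write = write;
}

/* Accept a chunk of any size; emit a write each time a page fills up. */
void pw_feed(page_writer_t *pw, const uint8_t *chunk, size_t len)
{
    while (len > 0) {
        size_t room = FLASH_PAGE_BYTES - pw->fill;
        size_t n = (len < room) ? len : room;
        memcpy(&pw->page[pw->fill], chunk, n);
        pw->fill += n;
        chunk += n;
        len -= n;
        if (pw->fill == FLASH_PAGE_BYTES) {
            pw->write(pw->next_addr, pw->page, FLASH_PAGE_BYTES);
            pw->next_addr += FLASH_PAGE_BYTES;
            pw->fill = 0;
        }
    }
}

/* Flush a final partial page, padded with 0xFF (the erased-flash value). */
void pw_finish(page_writer_t *pw)
{
    if (pw->fill > 0) {
        memset(&pw->page[pw->fill], 0xFF, FLASH_PAGE_BYTES - pw->fill);
        pw->write(pw->next_addr, pw->page, FLASH_PAGE_BYTES);
        pw->next_addr += FLASH_PAGE_BYTES;
        pw->fill = 0;
    }
}

/* RAM-backed stand-in for the flash driver, for host-side testing only. */
static uint8_t  fake_flash[1024];
static unsigned fake_page_writes;

static void fake_flash_write(uint32_t addr, const uint8_t *page, size_t len)
{
    memcpy(&fake_flash[addr], page, len);
    fake_page_writes++;
}
```

In use, `pw_feed` would be called from the HTTP client's data callback for each received chunk, and `pw_finish` once the transfer completes. A circular buffer becomes worthwhile when flash writes must be deferred, for example if chunks arrive in a context where a blocking page program is not acceptable.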
The MQTT library was especially difficult to work with. We had an issue where the library would become stuck when subscribing to topics, and had no way of debugging it since the library was pre-compiled and poorly documented. We eventually discovered that it would only work properly at optimization level -O1.
MQTT brokers were also problematic. We started with CloudMQTT, but found the latency to be unacceptable. We then set up Mosquitto on a spare Raspberry Pi Zero W, and it worked very well for a week. However, when this service failed a few minutes before the demo, it was very difficult to get it restarted remotely. For future projects, we would either set up a more robust self-hosted service or accept the higher latency of CloudMQTT in exchange for its reliability.