Controller falling offline

Have there been any server issues as of late?

Two nights in a row now, I have had my controller fall offline overnight and stay offline until I reset power to it.

Mine has been going through the same problems for at least a month. I finally created a ticket, and someone is supposedly looking at the logs and has performed some actions, but it went down again this morning at 2 am. I haven’t seen a pattern in the date or time it goes offline. The lights are still on, but there is no connectivity through the website and the mobile app just says it is loading data. One time recently it actually came back online by itself, and I think that has something to do with the switching of servers. I depend upon Vera staying up all night watching the house so I don’t have to. After a recent home invasion, I have a heightened sense of security and I need to find a solution that is stable enough to get through a night.

grrr. It just did it AGAIN!

I have had a similar problem that lasted about 6 weeks before Vera Support found squashfs errors in the logs indicating memory errors that were attributed to the USB stick used for storing logs. Or at least that is the working theory. The suspect USB stick was removed last week and the Vera Edge unit has been solid since then. So that seems to be the most likely explanation.

I am still trying to track down the cause of corruption in the Z Wave device database, but it may have been caused by the random reboots and power cycling during the troubleshooting process.

To diagnose the problem I built a log file of emails by creating a scene that triggered an alert email to me every 20 minutes. I could easily see periods offline by looking for gaps in the list of emails, or random restarts when the interval between emails was not 20 minutes. This gave me an easy way of seeing when the controller was going offline even though all the lights were on and it seemed to be normal. I passed this list to Vera Support so they could look for log entries around the time of the failures.
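If anyone wants to automate the "look for gaps" step, here is a rough Python sketch. It assumes you have already pulled the arrival times of the alert emails into a list of datetime values; the function name find_gaps, the 2 minute tolerance, and the sample times are my own inventions for illustration and nothing Vera provides.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected=timedelta(minutes=20),
              tolerance=timedelta(minutes=2)):
    """Return (previous, current, gap) for every pair of consecutive
    heartbeats whose spacing differs from `expected` by more than
    `tolerance`. A long gap suggests time offline; a short one suggests
    a restart that reset the 20 minute schedule."""
    ordered = sorted(timestamps)
    anomalies = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur - prev
        if abs(gap - expected) > tolerance:
            anomalies.append((prev, cur, gap))
    return anomalies


if __name__ == "__main__":
    # Made-up arrival times, not real data from my controller.
    received = [
        datetime(2024, 1, 1, 1, 0),
        datetime(2024, 1, 1, 1, 20),
        datetime(2024, 1, 1, 1, 40),
        datetime(2024, 1, 1, 3, 0),   # 80 minutes since the last one
        datetime(2024, 1, 1, 3, 7),   # only 7 minutes since the last one
        datetime(2024, 1, 1, 3, 27),
    ]
    for prev, cur, gap in find_gaps(received):
        print(f"{prev:%H:%M} -> {cur:%H:%M}: {gap}")
```

The tolerance is there because email delivery times jitter a little; adjust it to whatever your mail provider actually does.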

Shame that Vera does not have the ability to format a USB stick without errors, to detect and avoid bad sectors in storage media, or even to send logs to an external server.

I have been tearing my hair out trying to resolve this, and have considered abandoning Vera in favour of a platform with a more fault tolerant storage system (e.g. PC with RAID drives) but the elegance and low power/heat of the Vera hardware is hard to give up.

I asked questions about how to fix errors in a corrupted Z Wave device database, and whether any Vera product supports RAID storage or ECC memory chips to avoid corruption from memory errors, or perpetuating errors from corrupted backups.

You may find the response from Vera Support interesting…

I would like to let you know that there are no squashfs errors after removing the USB flash drive; it seems that those memory errors were from the stick.

When the system has memory issues (no physical issue), we rewrite the Linux distribution; it has a tool that should manage them. After that, we restore a backup file created in advance.

The memory errors cannot be transmitted by restoring a backup file from another controller. They are caused by bad sectors, which are hardware failures of the internal memory. But that is not the case here; those errors were from the USB stick.

Unfortunately, I don't know of any graphical tool that allows you to access the logs. However, you can use the HTTP request below; please note that it can be used locally by replacing Vera_IP with the real IP address of the controller.

[http://Vera_IP/cgi-bin/cmh/log.sh?Device=LuaUPnP](http://vera_ip/cgi-bin/cmh/log.sh?Device=LuaUPnP)

Or you can store them on a USB and extract them from the /tmp/log/cmh directory.

All devices are kept in the user_data.json.lzo file, but to be honest I don't know what type of changes you can make from here. You will need pluto_lzo to uncompress it. It can also be accessed with an HTTP request.

[http://Vera_IP/port_3480/data_request?id=status&output_format=xml](http://vera_ip/port_3480/data_request?id=status&output_format=xml)

or

[http://Vera_IP/port_3480/data_request?id=status&output_format=json](http://vera_ip/port_3480/data_request?id=status&output_format=json)
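For anyone who would rather script this than paste it into a browser, here is a minimal Python sketch that builds the status URL quoted above and fetches the JSON version. The function names, the 10 second timeout, and the example IP address are my own assumptions; I also make no claims about what fields come back, so the sketch just returns whatever the controller sends.

```python
import json
import urllib.request


def build_status_url(vera_ip, output_format="json"):
    """Build the status URL from the support reply for a given controller IP."""
    if output_format not in ("json", "xml"):
        raise ValueError("output_format must be 'json' or 'xml'")
    return (f"http://{vera_ip}/port_3480/data_request"
            f"?id=status&output_format={output_format}")


def fetch_status(vera_ip, timeout=10):
    """Fetch and parse the JSON status. Raises an exception (URLError or a
    timeout) if the controller cannot be reached on the local network."""
    url = build_status_url(vera_ip, "json")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_status("192.168.1.50") (replace with your controller's real address) from a machine on the same network every few minutes would also give a local way of spotting when the unit stops answering, without relying on email.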

Regarding ECC memory and RAID storage, we have implemented software RAID at the kernel level on the VeraPlus and VeraSecure units, but none of our controllers use ECC memory.

Please let me know if you can see any improvements after the USB stick was removed.

Thank you!