Database Errors Reported
Some observers in the chat room are reporting that their observations submitted since November 11, 2010 are not showing up in our online tools. Doc and Matt will investigate this first thing in the morning (Eastern) and update this page when they know more. We apologize for this inconvenience.
Update 2011Feb10 0620 ET (by Doc)
Well, it's been a far longer day and night than I thought it would be, and I'm sorry for that.
That being said, the AID database and WebObs are back up and running.
We ended up having to load in a database from 1 February and then load in the data from that date to the present from the transaction logs that we keep for just that purpose.
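For readers curious about the general idea, here is a minimal sketch of that kind of recovery: start from a snapshot, then replay only the logged changes made after the snapshot was taken. It is an illustration under stated assumptions, not the actual procedure or tooling used on the AID. The names `LogEntry` and `restore`, and the dictionary standing in for the database, are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LogEntry:
    """One logged change (hypothetical structure, for illustration only)."""
    timestamp: datetime
    obs_id: int
    value: Optional[str]  # None marks a deletion


def restore(snapshot: Dict[int, str],
            snapshot_time: datetime,
            log: List[LogEntry]) -> Dict[int, str]:
    """Rebuild the current state from a snapshot plus a change log."""
    db = dict(snapshot)  # work on a copy; leave the backup untouched
    # Replay in chronological order so later changes win.
    for entry in sorted(log, key=lambda e: e.timestamp):
        if entry.timestamp <= snapshot_time:
            continue  # already reflected in the snapshot
        if entry.value is None:
            db.pop(entry.obs_id, None)
        else:
            db[entry.obs_id] = entry.value
    return db


# Usage: a 1 February snapshot plus later log entries.
snapshot = {1: "a", 2: "b"}
log = [
    LogEntry(datetime(2011, 1, 31), 1, "old"),  # predates snapshot, skipped
    LogEntry(datetime(2011, 2, 3), 3, "c"),     # new observation
    LogEntry(datetime(2011, 2, 5), 2, None),    # deletion
]
restored = restore(snapshot, datetime(2011, 2, 1), log)
```

The key design point is that the snapshot and the log are kept separately, so losing the live database does not lose the changes made since the last full backup.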
The actual data load was done by around 10:20. The rest of the time - about 5.5 hours - was spent re-indexing all the data that had been loaded in. With almost 20,000,000 records and the number of indexes we have on the observational database (around 8), this took more time than I had hoped.
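As a rough back-of-the-envelope check, and treating the rounded figures above (about 20,000,000 records, about 8 indexes, about 5.5 hours) as if they were exact, the implied re-indexing rate can be estimated as follows. The variable names are illustrative only, and the result is an estimate rather than a measured rate.

```python
RECORDS = 20_000_000   # "almost 20,000,000 records" (rounded)
INDEXES = 8            # "around 8 indexes" (rounded)
REINDEX_HOURS = 5.5    # "about 5.5 hours" (rounded)

seconds = REINDEX_HOURS * 3600

# Rows processed per second, and index entries written per second
# if every row contributes one entry to each index.
rows_per_second = RECORDS / seconds
index_entries_per_second = RECORDS * INDEXES / seconds

print(f"{rows_per_second:.0f} rows/s, "
      f"{index_entries_per_second:.0f} index entries/s")
```

That works out to roughly a thousand rows per second, which gives a sense of why the re-index dominated the recovery time.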
Then around 19:00 something happened whereby 90% of the DB was erased. There was nothing for it but to start over again, which is what I did. Toward the end, for multiple reasons, I was very, very careful! :-)
The good news is that, in the end, our backup plans worked. We had enough layers of backup redundancy that even with a couple of points of failure we were still able to recover. Further good news is that this incident showed us where the weaknesses are, and we have some solid ideas on how to shore those weaknesses up, and a presentation to give to Arne about the subject next Wednesday.
So there we are. I very much apologize for the long day away from the database, folks, but we should be back up to where we were.
Thanks for your patience. And a special thanks to those people who emailed me to keep the faith after I sent out the notification that I'd had to start things all over again. The AAVSO membership comprises the best people on the planet, and you folks just proved that to me again last night. Thank you!
"Well, look at that. The Sun's coming up!"
-President John Sheridan's last words.
"Sleeping in Light"
Update 2011Feb10 0505 ET (by Doc)
The last three transaction logs are being loaded. We're almost there!
Update 2011Feb09 2327 ET (by Doc)
As you can see by the front page of the web site, the counter is up from a very worrying 1 million or so observations in the DB to a much better 19 million or so. The initial load and indexing is done. We are now loading in the last week or so of observations - about 50,000 - via the binary transaction logs.
Update 2011Feb09 20:00 ET (by Doc)
Well, I managed to do something very similar to what was originally done to kill the database in the first place, right at the very end of having it back online.
There is nothing to do but to start over again, which I am now doing. I hope and expect that things will be back online very early in the morning (~0600).
I apologize, folks. It really hasn't been one of our smoothest days.
Update 2011Feb09 14:25 ET (by Doc)
The vast majority of the database (all except this month's data) has been put back online. I am now working at restoring the February data. Once that is done, we'll be able to open things back up again.
The big time sink was re-indexing the data, and that took a good bit longer than I anticipated.
Update 2011Feb09 8:15 ET (by Matthew)
Doc and I discovered the problem independently last night as well. We're looking at the timestamps on the backup used to restore the AID yesterday, but it's likely it was an older backup than intended. We have my independent backup of AID dating from 2011 February 1 and we also have a copy of the MySQL transaction logs from February 1 to the present day, and so we are confident that all data are safe -- you should not need to resubmit any observations.
WebObs and other AID-dependent services will likely go down for a few hours today, and then we should be back to normal. Once we verify that the system has been restored, we will announce it as such. Until that time, please direct any comments and questions about the situation to me (email@example.com).
Apologies for the inconvenience to everyone!
Last Updated: February 10, 2011 - 7:22am