Report on the 14.11.2019 outage
Published on Thu, 14 Nov 2019 11:00:00 +0000
Appdb experienced an outage on 14.11.2019. It was a very unusual problem: both of our nodes (the production one and the reserve one) failed.
Around 2 AM GMT on 14.11.2019, a hard drive in the RAID array on the main appdb server failed, which degraded performance across all of appdb. A failover followed, transferring live data from the faulty array to another server. While this large amount of data was being transferred, the disks on the destination server failed too, so everything stalled.
A live migration is hard to recover because there are too many points of failure. We tried, but the recovery was unsuccessful, so we decided to roll back to a cold backup made about a week earlier.
Now appdb is running on the backup from 05.11.2019.
The impact is as follows:
- New apps uploaded after 05.11.2019 are lost
- MyAppStore files uploaded after 05.11.2019 are lost
- Devices linked after 05.11.2019 are lost: you need to remove the appdb profile and link the device again. Don't worry, your PRO will remain. If apps crash, just trigger installation of any app from appdb and they will recover.
- Some translations may be missing
- PROs activated by udidregistrations may be lost; we are in touch with them to recover these ASAP
PROs activated by appdb are safe, since they are handled by another part of appdb's systems.
If you see any bugs or untranslated text, please report them in this topic and we will fix them quickly.
This is the second major failure in the past months. We are changing our recovery plan and will schedule a major maintenance window to prevent such failures in the future.
Best regards, appdb team.