Infrastructure maintenance on 19.11.
Hi everybody,
sorry for the short notice, but we will do some heavy maintenance on the maemo.org infrastructure tomorrow, starting at 10:00 CET (09:00 UTC). All systems will be affected. We expect to be down for at least 6 hours as we do upgrades on the underlying hypervisors. What we will do:
Sorry for any inconvenience this might cause. Best, Falk |
Re: Infrastructure maintenance on 19.11.
Thanks for the notification.
@tmo admin: could this possibly be made sticky at the overall level? |
Re: Infrastructure maintenance on 19.11.
Hi everyone,
tl;dr: half of the infrastructure is broken, fix expected early next week, film at eleven.

This maintenance didn't go to plan; here's a short post-mortem.

Timeline:
10:00 - start updates and backups on blade-a
14:30 - backups and updates complete on blade-a, reboot confirmed successful
14:31 - uptime-induced filesystem check after 1347 days
15:00 - start of backups on blade-b
17:12 - filesystem check complete, blade-a up and running
17:30 - first systems on blade-a confirmed up and working
18:30 - software upgrade on stage and mail complete
20:15 - backups of blade-b finished and copied onto blade-a backup space
20:16 - start of updates on blade-b
21:00 - updates on blade-b complete, reboot
21:01 - blade-b stuck in boot with corrupt BIOS image in flash
23:30 - all available remote recovery options tried, none working
23:40 - decision to go for Plan B: boot talk.maemo.org on blade-a, redirect everything else to talk.m.o
23:45 - blade-b turned off through IPMI
23:53 - talk.m.o available again

Fallbacks in place:
www.maemo.org, wiki.maemo.org, garage.maemo.org are redirected to talk.maemo.org

Next Action Items:
I'll visit the datacenter Monday after work (around 18:00 CET) to try to recover the BIOS of the broken machine with a physical USB stick. If this is successful, we'll migrate talk.m.o back to its original host and re-enable www.m.o, wiki.m.o, garage.m.o through DNS after the VMs and the blade are confirmed working.

Best, xes & falk |
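As a rough cross-check of the timeline in the post-mortem above, the elapsed times can be computed from the posted clock times. This is only a sketch: the event labels and the helper names (`minutes_between`, `fmt_duration`) are made up for illustration, and it assumes all entries fall on the same day and in the same time zone.

```python
from datetime import datetime

# Clock times copied from the posted timeline; the labels are illustrative.
EVENTS = {
    "maintenance start": "10:00",
    "fsck start on blade-a": "14:31",
    "fsck complete on blade-a": "17:12",
    "blade-b stuck in boot": "21:01",
    "talk.m.o available again": "23:53",
}


def minutes_between(start: str, end: str) -> int:
    """Whole minutes from start to end, both 'HH:MM' on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def fmt_duration(minutes: int) -> str:
    """Render a minute count as e.g. '2h41m'."""
    return f"{minutes // 60}h{minutes % 60:02d}m"


if __name__ == "__main__":
    fsck = minutes_between(EVENTS["fsck start on blade-a"],
                           EVENTS["fsck complete on blade-a"])
    total = minutes_between(EVENTS["maintenance start"],
                            EVENTS["talk.m.o available again"])
    print("filesystem check on blade-a:", fmt_duration(fsck))
    print("start until talk.m.o back:  ", fmt_duration(total))
```

By this arithmetic the filesystem check alone took 2h41m, and talk.m.o came back 13h53m after the maintenance started.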
Re: Infrastructure maintenance on 19.11.
My browser complains about a wrong certificate; is this a side effect of the update? Is it temporary?
(Details: the name on the cert does not match the URL.) |
Re: Infrastructure maintenance on 19.11.
1 Attachment(s)
Quote:
A hint for all remaining N9 users: we again have no automatic network (WLAN auto/manual) detection. A nice screenshot is attached (maybe later, my N9 does not let me select it :)) --edit Quote:
|
Re: Infrastructure maintenance on 19.11.
2 Attachment(s)
Let me share the screen that our Supermicro server showed to reward us for a day of work...
http://www.supermicro.nl/products/sy...cfm?parts=SHOW
Then we also discovered that Supermicro wants money for a license to flash the BIOS remotely using IPMI (and anyway, we are not sure this would work to recover the BIOS). Supermicro: really, thanks. |
Re: Infrastructure maintenance on 19.11.
Possible to replace the chip?
|
Re: Infrastructure maintenance on 19.11.
@win7mac
At the moment I can't say how much "weight" the problem we are facing has; tomorrow Falk will run some tests while trying to restore the blade. With your personal PC, board, or laptop you can try whatever you want, any hack or trick, because you have nothing to lose. With servers you have to take a different perspective and consider risks, best options, time to fix, quality of the result, and the possibility of doing more damage. So my reply is: I think no one tries to remove a chip from a server mainboard without a spare board or without a guaranteed result. |
Re: Infrastructure maintenance on 19.11.
I wasn't suggesting any tricks or hacks. Some BIOS chips are replaceable, but since it's not listed on that parts list, that's probably not an option. :(
|
Re: Infrastructure maintenance on 19.11.
Quote:
Best, Falk |