  1. #1
    Join Date
    Jun 2004
    Location
    Reading, Berkshire
    Posts
    2,260
    Tokens
    12,202
    Habbo
    :Jin:

    28/11/16 Outage Report

    On 28th November, between 22:45 and 23:03, we suffered a brief outage. The cause was discovered within 10 minutes.

    What happened?
    The new servers we migrated to were originally set up 3 weeks ago. At that time there were two database servers replicating from each other, so that we could offer high availability and load balancing between the two. However, a week before we began the migration, we were performing some tuning steps which broke the replication between host 1 and host 2.

    We thought nothing of it and shut down the second host, intending to go back and repair it over the Christmas holidays.

    Whilst the sites continued to operate on the single database host, this host kept generating binary log files, which were not being processed by the second host because it was shut down. As a result, the log files slowly filled the C:\ drive to 100%, which is what caused the outage.
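
    For anyone curious how this kind of slow fill could be spotted earlier, here is a minimal sketch, not what we actually run. It adds up the size of the binary log files in a directory and gives a rough estimate of how long the remaining space would last. The file name pattern, the directory argument and the function names are all assumptions made for illustration; the real log names depend on the database configuration.

```python
from pathlib import Path
from typing import Optional


def log_files_size(directory: str, pattern: str = "*-bin.[0-9]*") -> int:
    """Total size in bytes of files in `directory` matching `pattern`.

    The default pattern is an assumption (names like "mysql-bin.000001");
    adjust it to whatever the database host really writes.
    """
    return sum(
        p.stat().st_size for p in Path(directory).glob(pattern) if p.is_file()
    )


def days_until_full(free_bytes: int, growth_bytes_per_day: int) -> Optional[float]:
    """Rough number of days before the disk fills at a constant growth rate.

    Returns None when the logs are not growing, since no estimate applies.
    """
    if growth_bytes_per_day <= 0:
        return None
    return free_bytes / growth_bytes_per_day
```

    Comparing the result of log_files_size from one day to the next gives the growth figure to feed into days_until_full.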

    Why were the sites suspended?

    Whilst attempting to fix the issue we were faced with max connection errors, as the slowed-down server clung onto each connection for several minutes whilst it attempted to process the request. In the end, user traffic had to be stopped from reaching the database server so that the fix could be applied.

    How are we going to prevent this from happening?

    There are a number of things we still have left to do with our new infrastructure, first and foremost our new monitoring platform, which will monitor things such as disk space utilisation, memory usage, processor usage and service uptime.
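
    To give an idea of what a disk space check looks like, here is a small sketch using only the Python standard library. It is not the monitoring platform itself; the 90% threshold and the names DiskStatus, percent_used and check_disk are assumptions chosen for the example.

```python
import shutil
from typing import NamedTuple


class DiskStatus(NamedTuple):
    path: str
    percent_used: float
    alert: bool


def percent_used(total: int, used: int) -> float:
    """Return used space as a percentage of total (0.0 if total is 0)."""
    return 0.0 if total == 0 else used / total * 100.0


def check_disk(path: str, threshold: float = 90.0) -> DiskStatus:
    """Check one volume; alert when usage is at or above the threshold.

    The 90% default is an assumed value, not a figure from the report.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    pct = percent_used(usage.total, usage.used)
    return DiskStatus(path, pct, pct >= threshold)


if __name__ == "__main__":
    status = check_disk(".")
    state = "ALERT" if status.alert else "ok"
    print(f"{status.path}: {status.percent_used:.1f}% used ({state})")
```

    Run on a schedule against the database volume, a check along these lines would raise an alert well before the drive reached 100%.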

    As we migrated over in a hurry for commercial purposes, we had to prioritise the web and database servers over monitoring. We are slowly catching up on these tasks.
    Last edited by Jin; 28-11-2016 at 11:14 PM.
