2014.08.28 Planned Total Service Outage - IT Datacenter Maintenance

On Thursday, August 28th, the CalArts IT datacenter is undergoing major power maintenance.  This will require a complete shutdown of all IT servers and services.  We will begin this maintenance at 5:00AM PDT and plan to be fully operational by 10:30AM PDT.

The following services will be unavailable Thursday 8/28, 5:00AM PDT through 10:30AM PDT:
  • CalArts.edu, REDCAT.org, and associated websites
  • On-campus network access, wired and wireless
  • On-campus access to the internet, including email
  • The Hub, WebAdvisor, Colleague (including UI access)
  • Access to departmental and school fileshares
  • VPN and any other service that connects to the campus network
** THIS MAINTENANCE WILL AFFECT ALL ON-CAMPUS IT SERVICES **
While we do not foresee any complications, we will notify the community of any that arise through this CAIT article and through Twitter.

 Aug 28th 2014 - Event Log

  • 08:49AM: Shutdown procedures encountered an unexpected issue.  We are working with vendor support to resolve the problem.  Minimal on-campus services are still functioning.
  • 10:45AM: We have determined the best course of action at this time is to abort the shutdown.  The issue we encountered will require further investigation before we're able to continue.  We are returning the critical systems to operation: Colleague (Hub/WebAdvisor), CalArts.edu and associated websites, and on-campus wireless access.
  • 12:00PM: Systems are slowly coming back online.  Neither the websites nor the Colleague (Hub/WebAdvisor) system have been restored yet.  On-campus wireless *is* online, as are the fileshares and the authentication servers.
  • 12:47PM: Colleague (Hub/WebAdvisor) is now operational, along with most departmental services, and CalArts.edu and associated websites.  VPN is expected to be online shortly.
  • 01:45PM: All services have been restored.

Resolution

A sudden hardware failure in our backup infrastructure left us without total confidence in our backups.  During an operation as severe as this, completely reliable backups are a requirement, so we decided to abort the process.  Network Operations will replace and test the failed components while a new maintenance schedule is considered.  As soon as that schedule is established, we will notify the community as loudly and as broadly as we're able to.

 
