All Systems Operational

Jetstream2 Research and Education Cloud

JS2 Docs are here: https://docs.jetstream-cloud.org/

Primary Cloud

Operational

Exosphere User Interface

Operational

ASU Regional Cloud

Operational

Cornell Regional Cloud

Operational

Hawaii Regional Cloud

Operational

TACC Regional Cloud

Operational

Docs Site

Operational

Website

Operational

Jetstream2 Support

Operational

Upcoming Maintenances: 0

Incidents Last 30 Days: 3

Maintenances Last 30 Days: 1

History (Last 7 days)

Incident Status

Operational


Components

Primary Cloud


Locations

Primary Data Center




April 18, 2024 8:17PM EDT
April 19, 2024 12:17AM UTC
[Resolved] Jetstream2 engineers have returned a majority of GPU hosts to service. At this time, Jetstream2 GPU instances can be created and unshelved as expected. If you are still experiencing any lingering issues, please contact help@jetstream-cloud.org for support.

April 18, 2024 11:44AM EDT
April 18, 2024 3:44PM UTC
[Identified] Engineers have identified a fix and are incrementally rolling it out to Jetstream2 GPU hosts. Approximately 23% of Jetstream2's total GPU resources are available at this time, providing limited capacity for unshelving GPU-enabled instances and creating new ones.

April 17, 2024 6:30PM EDT
April 17, 2024 10:30PM UTC
[Identified] Partial Service Disruption

April 17, 2024 6:29PM EDT
April 17, 2024 10:29PM UTC
[Identified] GPUs are currently unable to launch or unshelve. We have identified the problem and are working towards a solution as quickly as possible. Please check the status page for further updates.

Incident Status

Partial Service Disruption


Components

Primary Cloud


Locations

Primary Data Center




April 17, 2024 6:28PM EDT
April 17, 2024 10:28PM UTC
[Resolved] We're pleased to announce that at this time all Jetstream2 resources except for GPUs appear to be online and fully operational. Actions may be slightly slower this afternoon as usage returns to normal; however, we believe all functions should now be available. A note for GPU instances: GPUs are currently unable to launch or unshelve. We have identified the problem and are working towards a solution as quickly as possible. Please check the status page for further updates. On behalf of the entire Jetstream2 team, thank you so much for your patience and understanding during this time. The challenges of this maintenance event also presented learning opportunities, and we are grateful to the Jetstream2 community for weathering the unexpected with us. Please contact Jetstream2 Support at help@jetstream-cloud.org if you notice further issues or have any questions.

April 16, 2024 9:51PM EDT
April 17, 2024 1:51AM UTC
[Identified] The status of Jetstream2 remains unchanged at this time. As performance degradation has been prolonged far beyond our initial expectations, we feel it is best to offer a more detailed explanation of the causes behind these issues.
The April 14th planned power outage from our utility provider, Duke Energy, began and ended later than scheduled. The silver lining of this particular maintenance is that it offered the chance to move our cooling units and CPU units to generator-backed circuits. In the future, they should be able to ride out power events and keep the majority of Jetstream2's user base online.
Additionally, during the power outage, the networking team outside of Jetstream2 encountered an issue that required troubleshooting with their vendor. Jetstream2 engineers put a workaround in place and the networking issue was resolved in the early hours of April 15th.
The remaining issue currently being investigated involves our Ceph cluster, which was cleanly shut down at the start of the power event for the transition to the UPS/generator-backed circuits and cleanly brought back online when power was restored. However, on the morning of April 15th, we noticed a bad object or objects had caused issues affecting the cluster on a wider scale. Since then we’ve been working with our Ceph vendor to resolve these issues, which appear to be the result of a previously undocumented Ceph bug. These communications are ongoing and our vendor is still working with us to restore full operations to Jetstream2.
At present, Jetstream2 is capable of limited operations, such as running non-volume-backed instances and starting/stopping existing instances. Functions that remain affected by the lingering Ceph issue include launching new instances, shelving/unshelving, running volume-backed instances, interacting with images, and interacting with volume storage.
We are limited in our ability to affect progress at this time; however, we will continue to work with the vendor to resolve these issues as soon as possible. Protecting data integrity is our goal and is the primary reason for restricting the above operations. We remain grateful for your continued patience and understanding during this unexpectedly challenging time. If you have any questions, please contact us at help@jetstream-cloud.org.

April 16, 2024 4:15PM EDT
April 16, 2024 8:15PM UTC
[Identified] Jetstream2 will be stopping all client activity shortly in order to try to initiate a repair of CephFS. Engineers have been on call with vendor support to address this issue, and this action is part of the ongoing troubleshooting. New instance creation, Manila shares, and object storage will be unavailable at this time; further, all existing instances will be shut off. The exact duration of this outage is unknown at this time, but we will continue to work to address these issues as quickly as possible and provide updates as they come up. We greatly appreciate your continued patience and understanding during this unexpected extension of maintenance. If you have any questions, please contact us at help@jetstream-cloud.org.

April 16, 2024 8:05AM EDT
April 16, 2024 12:05PM UTC
[Identified] Jetstream2 engineers have been working through the night with the vendor's support team; however, the status of Jetstream2 remains unchanged at this time. While some Jetstream2 resources are functional, many operational issues persist. These include shelving/unshelving instances, creating new instances, interacting with images, and accessing storage. We will continue to work to address these issues as quickly as possible and will provide updates as they come up. We greatly appreciate your continued patience and understanding during this unexpected extension of maintenance. If you have any questions, please contact us at help@jetstream-cloud.org.

April 15, 2024 6:36PM EDT
April 15, 2024 10:36PM UTC
[Identified] While some Jetstream2 resources are now functional, many operational issues persist. These include shelving/unshelving instances, creating new instances, interacting with images, and accessing storage. Our engineers are working with the vendor's support team to address these issues as quickly as possible. We hope to provide an update on the situation by approximately 8AM EDT Tuesday, April 16. We greatly appreciate your continued patience and understanding during this unexpected extension of maintenance. If you have any questions, please contact us at help@jetstream-cloud.org.

April 15, 2024 11:42AM EDT
April 15, 2024 3:42PM UTC
[Investigating] While Jetstream2 outages were believed to be resolved, several issues persist and performance is degraded system-wide. These include, but are not limited to, issues related to unshelving/starting instances, connecting to instances, and interacting with images. Our engineers are working with the vendor's support team to address these issues as quickly as possible. Please refer to the Jetstream2 Status page for updates. We greatly appreciate your understanding during this maintenance. If you have any questions, please contact us at help@jetstream-cloud.org.

Incident Status

Degraded Performance


Components

Primary Cloud


Locations

Primary Data Center




April 17, 2024 6:21PM EDT
April 17, 2024 10:21PM UTC
[Resolved] Resolved

April 15, 2024 10:32AM EDT
April 15, 2024 2:32PM UTC
[Investigating] Some Jetstream2 users may be experiencing degraded performance of storage resources following the planned maintenance on April 14th-15th. Our staff are working with the vendor's support team to address this as quickly as possible. The Jetstream2 status page will be updated when this issue is resolved. We appreciate your understanding. If you have any questions, please contact us at help@jetstream-cloud.org.

Description

On Sunday, April 14, from approximately 7AM-8PM EDT (6AM-7PM CDT, 5AM-6PM MDT, 4AM-5PM PDT), power utility provider Duke Energy will perform maintenance in the IU Bloomington Data Center. This will cause power loss resulting in an outage for Jetstream2 resources. This outage will impact all Jetstream2 users. We strongly advise you to preserve your work prior to April 14 by safely shutting down any active processes or jobs and backing up essential data outside of Jetstream2. The Jetstream2 status page will be updated as resources come back online. Please refer to this source for the most up-to-date information. We appreciate your understanding and hope to mitigate any inconvenience this might cause. If you have any questions, please contact us at help@jetstream-cloud.org.


Components

Primary Cloud


Locations

Primary Data Center


Schedule

April 14, 2024 7:00AM - April 14, 2024 8:00PM EDT
April 14, 2024 11:00AM - April 15, 2024 12:00AM UTC



April 15, 2024 10:28AM EDT
April 15, 2024 2:28PM UTC
[Update] The data center power outage and maintenance for April 14th-15th is complete. Jetstream2 CPU, GPU, and Large Memory resources are now available. Please note: Some users may notice degraded performance of Jetstream2 storage. Our staff are working with the vendor's support team to address this as quickly as possible. The Jetstream2 status page will be updated when this issue is resolved. We greatly appreciate your understanding during this maintenance. If you have any questions, please contact us at help@jetstream-cloud.org.

April 15, 2024 12:25AM EDT
April 15, 2024 4:25AM UTC
[Update] While maintenance is still ongoing, Jetstream2 CPU resources are now available after the 4/14 power outage. The utility provider is now anticipating completion in the early morning hours of Monday, April 15, at approximately 4AM EDT. Once the utility maintenance is complete, we will work to bring GPU and Large Memory resources back online. We are also working to restore network connectivity following an equipment upgrade. Instances with floating IP addresses have full connectivity, but instances without a floating IP are experiencing broken or degraded network access. We expect to restore this on Monday. The Jetstream2 status page will be updated with details as they become available. Please refer to this source for the most up-to-date information. We appreciate your understanding and apologize for any inconvenience this might cause. If you have any questions, please contact us at help@jetstream-cloud.org.

April 14, 2024 7:09PM EDT
April 14, 2024 11:09PM UTC
[Update] The resolution time for today's planned power maintenance has been extended. We currently expect the outage to be resolved by approximately 11:30PM EDT (10:30PM CDT, 9:30PM MDT, 8:30PM PDT). As this maintenance is being conducted by a third-party utility provider, this time is subject to change. The Jetstream2 status page will be updated as resources come back online. Please refer to this source for the most up-to-date information. We appreciate your understanding and apologize for any inconvenience this might cause. If you have any questions, please contact us at help@jetstream-cloud.org.

April 14, 2024 7:00AM EDT
April 14, 2024 11:00AM UTC
[Update] Scheduled maintenance is starting.