All Systems Operational

Jetstream2 Research and Education Cloud

JS2 Docs are here: https://docs.jetstream-cloud.org/

Primary Cloud

Operational

Exosphere User Interface

Operational

ASU Regional Cloud

Operational

Cornell Regional Cloud

Operational

Hawaii Regional Cloud

Operational

TACC Regional Cloud

Operational

Docs Site

Operational

Website

Operational

Jetstream2 Support

Operational

Upcoming Maintenances: 0

Incidents Last 30 Days: 3

Maintenances Last 30 Days: 1

History (Last 7 days)

Incident Status: Operational

Components: Primary Cloud

Locations: Primary Data Center




April 18, 2024 8:17PM EDT
April 19, 2024 12:17AM UTC
[Resolved] Jetstream2 engineers have returned a majority of GPU hosts to service. At this time, Jetstream2 GPU instances can be created and unshelved as expected. If you are still experiencing any lingering issues, please contact help@jetstream-cloud.org for support.

April 18, 2024 11:44AM EDT
April 18, 2024 3:44PM UTC
[Identified] Engineers have identified a fix and are incrementally rolling it out to Jetstream2 GPU hosts. Approximately 23% of Jetstream2's total GPU resources are available at this time, providing limited capacity for unshelving GPU-enabled instances and creating new ones.

April 17, 2024 6:30PM EDT
April 17, 2024 10:30PM UTC
[Identified] Partial Service Disruption

April 17, 2024 6:29PM EDT
April 17, 2024 10:29PM UTC
[Identified] GPUs are currently unable to launch or unshelve. We have identified the problem and are working towards a solution as quickly as possible. Please check the status page for further updates.

Incident Status: Partial Service Disruption

Components: Primary Cloud

Locations: Primary Data Center




April 17, 2024 6:28PM EDT
April 17, 2024 10:28PM UTC
[Resolved] We're pleased to announce that at this time all Jetstream2 resources except for GPUs appear to be online and fully operational. Actions may be slightly slower this afternoon as usage returns to normal; however, we believe all functions should now be available.
A note for GPU instances: GPUs are currently unable to launch or unshelve. We have identified the problem and are working towards a solution as quickly as possible. Please check the status page for further updates.
On behalf of the entire Jetstream2 team, thank you so much for your patience and understanding during this time. The challenges of this maintenance event also presented learning opportunities, and we are grateful to the Jetstream2 community for weathering the unexpected with us. Please contact Jetstream2 Support at help@jetstream-cloud.org if you notice further issues or have any questions.

April 16, 2024 9:51PM EDT
April 17, 2024 1:51AM UTC
[Identified] The status of Jetstream2 remains unchanged at this time. As performance degradation has been prolonged far beyond our initial expectations, we feel it is best to offer a more detailed explanation of the causes behind these issues.
The April 14th planned power outage from our utility provider, Duke Energy, began and ended later than scheduled. The silver lining of this particular maintenance is that it offered the chance to move our cooling units and CPU units to generator-backed circuits. In the future, they should be able to ride out power events and keep the majority of Jetstream2's user base online.
Additionally, during the power outage, the networking team outside of Jetstream2 encountered an issue that required troubleshooting with their vendor. Jetstream2 engineers put a workaround in place and the networking issue was resolved in the early hours of April 15th.
The remaining issue currently being investigated involves our Ceph cluster, which was cleanly shut down at the start of the power event for the transition to the UPS/generator-backed circuits and cleanly brought back online when power was restored. However, on the morning of April 15th, we noticed a bad object or objects had caused issues affecting the cluster on a wider scale. Since then we’ve been working with our Ceph vendor to resolve these issues, which appear to be the result of a previously undocumented Ceph bug. These communications are ongoing and our vendor is still working with us to restore full operations to Jetstream2.
At present, Jetstream2 is capable of limited operations, such as running non-volume-backed instances and starting/stopping existing instances. Functions that remain affected by the lingering Ceph issue include launching new instances, shelving/unshelving, running volume-backed instances, interacting with images, and interacting with volume storage.
We are limited in our ability to affect progress at this time; however, we will continue to work with the vendor to resolve these issues as soon as possible. Protecting data integrity is our goal and is the primary reason for restricting the above operations. We remain grateful for your continued patience and understanding during this unexpectedly challenging time. If you have any questions, please contact us at help@jetstream-cloud.org.

April 16, 2024 4:15PM EDT
April 16, 2024 8:15PM UTC
[Identified] Jetstream2 will be stopping all client activity shortly in order to attempt a repair of CephFS. Engineers have been on call with vendor support to address this issue, and this action is part of the ongoing troubleshooting. New instance creation, Manila shares, and object storage will be unavailable during this time; further, all existing instances will be shut off. The exact duration of this outage is unknown at this time, but we will continue to work to address these issues as quickly as possible and provide updates as they come up. We greatly appreciate your continued patience and understanding during this unexpected extension of maintenance. If you have any questions, please contact us at help@jetstream-cloud.org.

April 16, 2024 8:05AM EDT
April 16, 2024 12:05PM UTC
[Identified] Jetstream2 engineers have been working through the night with the vendor's support team; however, the status of Jetstream2 remains unchanged at this time. While some Jetstream2 resources are functional, many operational issues persist. These include shelving/unshelving instances, creating new instances, interacting with images, and accessing storage. We will continue to work to address these issues as quickly as possible and will provide updates as they come up. We greatly appreciate your continued patience and understanding during this unexpected extension of maintenance. If you have any questions, please contact us at help@jetstream-cloud.org.

April 15, 2024 6:36PM EDT
April 15, 2024 10:36PM UTC
[Identified] While some Jetstream2 resources are now functional, many operational issues persist. These include shelving/unshelving instances, creating new instances, interacting with images, and accessing storage. Our engineers are working with the vendor's support team to address these issues as quickly as possible. We hope to provide an update on the situation by approximately 8AM EDT Tuesday, April 16. We greatly appreciate your continued patience and understanding during this unexpected extension of maintenance. If you have any questions, please contact us at help@jetstream-cloud.org.

April 15, 2024 11:42AM EDT
April 15, 2024 3:42PM UTC
[Investigating] Although the Jetstream2 outages were believed to be resolved, several issues persist and performance is degraded system-wide. These include, but are not limited to, issues related to unshelving/starting instances, connecting to instances, and interacting with images. Our engineers are working with the vendor's support team to address these issues as quickly as possible. Please refer to the Jetstream2 Status page for updates. We greatly appreciate your understanding during this maintenance. If you have any questions, please contact us at help@jetstream-cloud.org.

Incident Status: Degraded Performance

Components: Primary Cloud

Locations: Primary Data Center




April 17, 2024 6:21PM EDT
April 17, 2024 10:21PM UTC
[Resolved] Resolved

April 15, 2024 10:32AM EDT
April 15, 2024 2:32PM UTC
[Investigating] Some Jetstream2 users may be experiencing degraded performance of storage resources following the planned maintenance on April 14th-15th. Our staff are working with the vendor's support team to address this as quickly as possible. The Jetstream2 status page will be updated when this issue is resolved. We appreciate your understanding. If you have any questions, please contact us at help@jetstream-cloud.org.