Sui Reveals What Caused Three Mainnet Halts After Major Network Upgrade

Sui’s mainnet suffered three separate outages across May 28 and May 29 after the network’s 1.72 release uncovered edge cases in gas charging and validator restart logic, according to a postmortem from the Sui Foundation. The foundation said the issues have since been resolved, network activity has resumed, and “no user funds were at risk.”

The incidents began on Thursday, May 28, when Sui’s mainnet halted at around 7 a.m. PT and remained down until roughly 1:30 p.m. PT. A second outage followed on Friday morning, beginning at about 5 a.m. PT and ending around 8:30 a.m. PT. The third halt began Friday afternoon at roughly 1:30 p.m. PT and was resolved around 7:20 p.m. PT.

According to the foundation, the first two outages stemmed from crash bugs involving the interaction between gas charging logic and Sui’s 1.72 upgrade, which introduced address balances. The third outage was separate, triggered during a scheduled epoch change after validator restarts exposed a latent bug in how randomness state was preserved.

“During the outages, no user funds were at risk, and the network did not revert any committed transactions when it resumed,” the Sui Foundation said. “As of now, validators have fully addressed the known issues caused by both the original gas-charging bug and the randomness-state bug, and network activity has resumed.”

Sui Gas Charging Bug Triggered Initial Halts

The first problem centered on Sui’s new address balance feature, which allows users to store funds and pay for gas without relying solely on coin objects. Transactions on Sui can pay gas through address balances, coin objects, or a hybrid structure combining both.

The edge case emerged in that hybrid gas path. When a transaction tried to spend from an address balance that could not cover competing transactions, the scheduler correctly cancelled it with an InsufficientFundsForWithdraw error. But later, during gas smashing (the process of combining input coins into a single gas-paying coin), the same reservation could still attempt to debit funds again.

In the foundation’s explanation, the crash did not occur immediately during gas smashing but during settlement, when balance deltas were reconciled by a system transaction. A negative delta applied to a zero balance caused an underflow.
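The underflow is easier to see in a small example. The Python sketch below is an illustration only, not Sui's code: `apply_delta`, `settle`, and `BalanceOutOfRange` are hypothetical names, and the unsigned 64-bit balance range is an assumption made for the example. It shows why a debit queued against an address that holds nothing cannot be reconciled at settlement time.

```python
from __future__ import annotations

U64_MAX = 2**64 - 1  # assumed balance range for this illustration


class BalanceOutOfRange(Exception):
    """Raised when a delta would push an unsigned balance outside 0..U64_MAX."""


def apply_delta(balance: int, delta: int) -> int:
    """Apply a signed delta to an unsigned balance, refusing to wrap around."""
    result = balance + delta
    if result < 0 or result > U64_MAX:
        raise BalanceOutOfRange(f"balance {balance} cannot absorb delta {delta}")
    return result


def settle(balances: dict[str, int], deltas: list[tuple[str, int]]) -> dict[str, int]:
    """Reconcile accumulated (address, delta) pairs against address balances."""
    settled = dict(balances)  # leave the caller's mapping untouched
    for address, delta in deltas:
        settled[address] = apply_delta(settled.get(address, 0), delta)
    return settled
```

Calling `settle({"0xA": 0}, [("0xA", -5)])` raises `BalanceOutOfRange`, mirroring the negative delta against a zero balance described in the postmortem. In this sketch the error is surfaced as an exception; per the foundation's account, the real condition crashed validators.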

The immediate fix was conceptually simple: avoid gas smashing when a transaction is cancelled with InsufficientFundsForWithdraw. Validators adopted that fix on Thursday, bringing the network back online. But the foundation acknowledged that the patch was an interim measure, chosen to restore the network while engineers developed a more complete solution.

“Changing gas logic is a delicate operation,” the foundation wrote. “As explained above, there are complicated interactions between address balances and coins. Other than fixing bugs, gas logic changes must preserve all previous behavior or use appropriate version gating.”

That interim patch contained a known weakness. If a transaction had multiple cancellation reasons, another error could mask the InsufficientFundsForWithdraw condition. When that happened Friday morning, the original underflow path could still be reached, causing a second halt.
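The masking problem can be sketched in a few lines of Python. Everything here is hypothetical: the `CancelledTx` type, the idea that only the first recorded reason is reported, and the placeholder `"OtherCancellation"` error are assumptions made for illustration, and the second check is only one possible way to close the gap, not the foundation's actual fix.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

INSUFFICIENT_FUNDS = "InsufficientFundsForWithdraw"


@dataclass
class CancelledTx:
    """Hypothetical record of every reason a transaction was cancelled."""

    reasons: list[str] = field(default_factory=list)

    @property
    def reported_error(self) -> Optional[str]:
        # Assumption: only the first reason is surfaced to later stages.
        return self.reasons[0] if self.reasons else None


def skip_smashing_interim(tx: CancelledTx) -> bool:
    """Interim-style check: looks only at the single reported error."""
    return tx.reported_error == INSUFFICIENT_FUNDS


def skip_smashing_any_reason(tx: CancelledTx) -> bool:
    """Stricter check: looks at every recorded cancellation reason."""
    return INSUFFICIENT_FUNDS in tx.reasons


# A transaction cancelled for two reasons, with the other error reported first.
masked = CancelledTx(reasons=["OtherCancellation", INSUFFICIENT_FUNDS])
```

For `masked`, the interim-style check returns `False`, so gas smashing would still run and the underflow path stays reachable, while the stricter check returns `True`.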

Epoch Change Exposed Randomness-State Bug

The third outage came after the network had resumed normal operation Friday morning. At the next scheduled epoch change, validators failed to complete the transition because of a bug tied to Sui’s distributed key generation protocol, or DKG, which bootstraps randomness for transactions that depend on on-chain randomness.

During the earlier restart cycle, participation was not high enough for the next epoch’s DKG process, so randomness was disabled as designed. The problem was that the failure verdict was not written to disk. As validators restarted again, they came back up without remembering that DKG had failed.

“With validators not remembering DKG had failed, neither could happen, the paused queue grew, and end-of-epoch logic, which must drain that queue before closing, was left waiting on DKG that would never come,” the foundation said.

The fix had two parts: persisting DKG status across restarts and adding a mechanism that allowed validators to close the stuck epoch at a coordinated point. That mechanism was used once to close the affected epoch, after which the network moved into the next epoch and randomness was restored.
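The first part of that fix, remembering the DKG outcome across a restart, follows a common pattern: write the verdict to durable storage before acting on it, then read it back at startup. The Python sketch below illustrates the pattern only. `DkgStatusStore`, its JSON file layout, and the status strings are hypothetical and are not taken from Sui's codebase.

```python
from __future__ import annotations

import json
import os
from pathlib import Path
from typing import Optional


class DkgStatusStore:
    """Hypothetical on-disk record of the DKG outcome for a single epoch."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)

    def record(self, epoch: int, status: str) -> None:
        """Persist the verdict, replacing the file atomically."""
        tmp = self.path.with_name(self.path.name + ".tmp")
        tmp.write_text(json.dumps({"epoch": epoch, "status": status}))
        os.replace(tmp, self.path)  # a crash leaves the old or the new file

    def load(self, epoch: int) -> Optional[str]:
        """Return the stored verdict for `epoch`, or None if there is none."""
        if not self.path.exists():
            return None
        data = json.loads(self.path.read_text())
        return data["status"] if data["epoch"] == epoch else None
```

A process that calls `record(epoch, "failed")` before restarting can call `load(epoch)` on the way back up and see that DKG already failed, rather than waiting on a result that will never arrive.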

The postmortem framed the outages as a broader engineering lesson for Sui. The foundation said end-of-epoch resilience needs further investment, particularly around graceful degradation and operational force-close mechanisms. It also said gas charging deserves the same level of rigor as the Move VM or Mysticeti consensus, given its interaction with settlement, conservation checks, and scheduling.

At press time, SUI traded at $0.8798.
