Optimizing Payment Channels to Achieve CDN Latencies (Skyrocket Part 2)

5y ago•


A bit over a month ago we released a blog post about a protocol optimization we call Skyrocket. If you haven’t read it already, you should check that out first.

The second part of optimizing Sia was introducing so-called Ephemeral Accounts (EAs). As the name might suggest, EAs are about keeping a balance on an account. These accounts reside on the individual hosts on the network and can be used to pay them. So for example, if you had an EA on a host with 1 Siacoin (SC) in it, you could pay that host for downloading data worth up to 1 SC.

Why is this useful? To understand this we first need to dive a bit into how renters pay hosts on the Sia network.

Smart Contracts

Every renter who wants to upload data to a host on the Sia network first needs to form a smart contract with the host for a certain period of time. In the Sia ecosystem, those smart contracts are called file contracts (FCs). These FCs contain tokens from both the renter, to pay for storing data, and the host, to put up collateral. All of that is locked away until the FC expires. While the funds are not accessible before that, the FC can be revised by both parties together to shift the balances within it around. For example, if both parties locked away 2 SC each, the renter can pay the host by shifting 1 SC from its balance to the host’s balance. So after the FC expires, the renter would receive 1 SC and the host would receive 3 SC.
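To make the balance shifting more concrete, here is a minimal sketch in Go. The Revision type, its fields and the Pay method are hypothetical names made up for this post, not the actual Sia types, and the signatures both parties attach to a real revision are left out entirely.

```go
package main

import (
	"errors"
	"fmt"
)

// Revision is a hypothetical, heavily simplified file contract revision.
// Amounts are in whole SC to keep the example readable.
type Revision struct {
	Number       uint64 // goes up by one with every revision
	RenterPayout uint64 // what the renter receives when the FC expires
	HostPayout   uint64 // what the host receives when the FC expires
}

// Pay returns the next revision with amount shifted from the renter to the host.
// The total locked in the contract stays the same.
func (r Revision) Pay(amount uint64) (Revision, error) {
	if amount > r.RenterPayout {
		return Revision{}, errors.New("insufficient renter balance")
	}
	return Revision{
		Number:       r.Number + 1,
		RenterPayout: r.RenterPayout - amount,
		HostPayout:   r.HostPayout + amount,
	}, nil
}

func main() {
	// Both parties lock away 2 SC each, then the renter pays 1 SC.
	initial := Revision{Number: 0, RenterPayout: 2, HostPayout: 2}
	next, err := initial.Pay(1)
	if err != nil {
		panic(err)
	}
	fmt.Printf("revision %d: renter=%d SC, host=%d SC\n",
		next.Number, next.RenterPayout, next.HostPayout)
}
```

Running this reproduces the example from the text: the renter ends up with 1 SC and the host with 3 SC.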

The advantage of this approach is that only the initial revision and the last revision of the FC need to be submitted to the network and mined into blocks. From the moment the contract is formed until it expires, it can be updated any number of times without having to go into the blockchain. In fact, if you download 400 MiB of data from a host, the contract would be updated 100 times, once for every 4 MiB sector. At a download speed of 1 Gbps, a contract is revised ~30 times per second.
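The numbers in this paragraph can be checked with a few lines of Go. This is just back-of-the-envelope arithmetic; the function names are made up for illustration and 1 Gbps is assumed to mean 10^9 bits per second, which works out to roughly 30 revisions per second.

```go
package main

import "fmt"

// sectorSize is the 4 MiB sector size mentioned in the text.
const sectorSize = 4 << 20

// revisionsForDownload returns how many contract revisions a download of the
// given size needs if every sector is paid for with its own revision.
func revisionsForDownload(bytes uint64) uint64 {
	return (bytes + sectorSize - 1) / sectorSize // round up to full sectors
}

// revisionsPerSecond returns how often the contract is revised at the given
// link speed (in bits per second) if the link is fully used for downloads.
func revisionsPerSecond(bitsPerSecond float64) float64 {
	return bitsPerSecond / 8 / sectorSize
}

func main() {
	fmt.Println(revisionsForDownload(400 << 20))  // 100
	fmt.Printf("%.1f\n", revisionsPerSecond(1e9)) // 29.8
}
```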

The downside of this is that both renter and host need to hold on to the latest revision. Otherwise, they could cheat each other by submitting older revisions with a more beneficial state. So both parties need to make sure that every new revision is stored reliably.

Atomicity, Consistency, Isolation, and Durability

Contract revisions are very important and need to remain available in spite of crashes, power outages, or other potential issues. By default, HDDs and SSDs don’t provide any strong guarantees for that.

Disks are split up into many physical sectors. For most modern disks those sectors have a size of 4 KiB and a single one can be assumed to be written to the disk atomically. That means even during a power outage or crash, a single physical sector will either be written to disk correctly or won’t be written to disk at all. Any data larger than a physical sector that is written to disk, like our contract revisions, could potentially end up in an inconsistent state. For example, when writing 2 physical sectors worth of data to disk, the first one might be synced to disk before the crash while the second one isn’t.

To guarantee that our revisions are always written to disk correctly and always available after doing so, we had to take some extra steps. We had to make that transaction to disk ACID. ACID is a set of properties a transaction to disk needs to have to guarantee data availability even through power failures and other issues:

Atomicity: A transaction is either written to disk fully or isn’t. By default that is only true for single physical sectors on the disk.
Consistency: A transaction can only bring the data on disk from one valid state to another. In our case, this means it’s either one contract revision or the next. No corrupted state in between.
Isolation: Concurrent executions of transactions result in the same state as the sequential execution of the same transactions.
Durability: Once a transaction has been committed, it will be available even in case of a power outage or crash.

We use multiple different techniques throughout the Sia codebase to ensure ACID persistence depending on the use case. For cases where performance and availability are paramount, we developed our own high-performance writeaheadlog which you can check out on our GitLab if you are interested. It enables us to rapidly apply ACID, random updates to on-disk data structures, and combine multiple updates over multiple files into single ACID transactions.

I’m not going to explain the writeaheadlog in detail as part of this post since it’s quite a complex project all by itself, but it essentially boils down to writing every transaction to disk twice. The first write stores a backup of the transaction in a separate file and the second write applies the actual transaction. Should this process be interrupted before the backup is complete, we know after the reboot that no data was changed yet. If it is interrupted afterward, we simply reapply the transaction from the backup after the reboot.

So what if Sia’s revision updates weren’t ACID? In the worst case, a host’s contract revision could get corrupted during a crash and the host would no longer be able to revise the contract or provide a storage proof, leading to a complete loss of already earned funds and collateral.

ACID Persistence Is Slow

Regular writes to disk (e.g. when copying large files) can appear to be quite fast because the operating system optimizes writes by buffering the data in memory. Unfortunately, this means that even after closing an application that writes to disk, we can’t be sure that the data is actually stored on the physical disk. A solution to this is using the so-called “fsync” system call. It blocks until the data written to a file is actually stored on disk. Of course, this causes a huge performance impact, especially when rapidly syncing small amounts of data on an HDD. To guarantee durability during a crash, we need to write an FC to disk twice and sync it after each write.

For reference, recent consumer HDDs can do around 50 fsyncs per second with approximately 20 ms of latency each. Since we need 2 fsyncs per update, due to the backup mechanism explained in the previous section, that means we are down to about 25 updates per second and up to 40 ms of latency per update. This wouldn’t even let us saturate a 1 Gbps connection and also adds 40 ms of latency to every download in addition to the network latency. If Sia wants to compete with centralized enterprise solutions on low-cost consumer hardware, this is obviously not enough. Being able to act as a high-quality host on the Sia network using spare disk space on non-enterprise hardware has always been one of the beauties of Sia. We would like to keep it that way. That’s why we created Ephemeral Accounts.

Ephemeral Accounts

While operations that change the data stored in a FC still require a FC update, read-only operations like downloads don’t. In fact, for read-only operations, all we need the FC for is to pay the host. So what if we could get rid of the disk i/o for downloads? What if we could get rid of the limitation of downloading 25 sectors per second and also the 40 ms wait for the contract to be written to disk? Well, thanks to EAs we can.

Since we don’t want to update the FC for every download, we pre-pay the host by filling the EA with a tiny amount of tokens. By default this is 1 SC (~USD 0.0032 at the time of writing) and requires a single FC update. Once the balance hits 0.5 SC, the renter refills the EA again, making sure that we can keep interacting with that host without running out of funds.
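The refill logic could look roughly like the following sketch. The names are hypothetical and so are two assumptions baked into it: that a refill tops the account back up to the full 1 SC, and that a sector costs 4 milli-SC, which is about what you get at a price of 1000 SC per TB. Balances are tracked in milli-SC here purely to avoid floating point.

```go
package main

import "fmt"

// Amounts are in milli-SC (1 SC = 1000 milli-SC), a unit picked for this sketch.
const (
	targetBalance   = 1000 // 1 SC, the default EA balance
	refillThreshold = 500  // refill once the balance hits 0.5 SC
)

// refillAmount returns how much the renter should deposit into the EA with a
// single FC update, or 0 if the balance is still above the threshold.
func refillAmount(balance uint64) uint64 {
	if balance > refillThreshold {
		return 0
	}
	return targetBalance - balance
}

func main() {
	const sectorPrice = 4 // assumed price of one 4 MiB sector in milli-SC
	balance := uint64(targetBalance)
	refills := 0
	for sector := 0; sector < 1000; sector++ {
		balance -= sectorPrice // pay for the download from the EA, no FC update
		if amount := refillAmount(balance); amount > 0 {
			balance += amount // the only step that needs an FC update
			refills++
		}
	}
	fmt.Printf("%d FC updates instead of 1000\n", refills)
}
```

In this simulation the renter downloads 1000 sectors but only touches the FC 8 times.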

This is also where the SiaMux from the last blog post of this series comes into play. We use it to keep multiple connections open to the host where one of them is responsible for refilling the EA. That means filling the EA doesn’t prevent us from also performing other operations in the meantime.

So why is it useful to have an account with only 1 SC on the host? Let’s assume the cost of downloading 1 TB was 1000 SC. That means for 1 SC we can download a full GB of data. So even if the renter was capable of downloading 1 GB of data per second, we would only need to update the FC once per second instead of 25 times like before. We also get the data without having to wait 40 ms for the update. This completely removes disk i/o as a bottleneck for retrieving data from a host and allows us to saturate even 1 Gbps connections.

Another important aspect of EAs is that in combination with the SiaMux we can now have multiple downloads run in parallel. A single FC can’t be updated in parallel since it requires both parties to sign it and to increment a revision number which needs to go up to keep both parties safe. With EAs we can have as many download requests run in parallel as we want. The host will process them in parallel as long as the EA balance is sufficient.

Not only that, but the host can also grant partial or full refunds with EAs, which was previously not possible for the same reasons. A signed FC revision always transfers a specific amount of money. For example, a renter paying for 2 sectors of data and only receiving one would still pay for both. With EAs, a host would withdraw the full amount of money from the caller’s EA at the beginning of the operation and then refund some of it by depositing it again at the end. We will discuss why that’s useful in the next post about the Merklized-Data-Machine (MDM).
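Here is a rough sketch of how parallel downloads with refunds could work against a single account. The Account type and the serveSectors function are made up for illustration and use a plain mutex. The actual host implementation is more involved.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Account is a hypothetical in-memory ephemeral account on the host.
type Account struct {
	mu      sync.Mutex
	balance uint64
}

// Withdraw removes amount from the account if the balance is sufficient.
func (a *Account) Withdraw(amount uint64) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	if amount > a.balance {
		return errors.New("insufficient balance")
	}
	a.balance -= amount
	return nil
}

// Deposit adds amount to the account. It is used for refills and refunds.
func (a *Account) Deposit(amount uint64) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.balance += amount
}

// Balance returns the current balance.
func (a *Account) Balance() uint64 {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.balance
}

// serveSectors charges for all requested sectors up front and refunds the
// ones that were not delivered once the operation is done.
func serveSectors(a *Account, requested, delivered, pricePerSector uint64) error {
	if err := a.Withdraw(requested * pricePerSector); err != nil {
		return err
	}
	if delivered < requested {
		a.Deposit((requested - delivered) * pricePerSector)
	}
	return nil
}

func main() {
	acc := &Account{balance: 1000}
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each request asks for 2 sectors but only receives 1.
			if err := serveSectors(acc, 2, 1, 4); err != nil {
				fmt.Println("download failed:", err)
			}
		}()
	}
	wg.Wait()
	fmt.Println(acc.Balance()) // 960
}
```

All 10 requests run in parallel and each one ends up paying only for the sector it actually received.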

Minimal Trust

So, as mentioned before, we pre-pay the host before downloading to avoid updating the FC when we actually download from it. This introduces a tiny amount of risk for both parties.

The renter risks the host accepting the money and going offline. Since the host is pre-paid, consensus won’t protect us from that. To mitigate that risk, we only pre-pay the host tiny amounts. By never trusting the host with more than sub-cent amounts and refilling the EA every couple of seconds, the host doesn’t have any incentive to cheat. It’s a lot more profitable to keep conducting business than to “rob” renters of less than a cent. Especially since renters will notice immediately and stop conducting business with that host.

For the host, the risk lies in crashing and as a result not charging the renter. This might happen if a host withdraws from an in-memory EA but then crashes before the next time it persists the latest balance on disk. To prevent that the host will track the delta between the in-memory balance and the balance on disk. If that delta grows too large, the host will sync the balance to disk and prevent withdrawals in the meantime, therefore resetting the delta to 0. This should only happen on rare occasions though since the renter will keep refilling the EA and therefore reduce the delta.
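The host-side bookkeeping can be sketched like this. All names are hypothetical and the persist method only pretends to write to disk. In a real host it would write and fsync the balance while holding a lock so that no withdrawals can happen in the meantime. The sketch also assumes, following the description above, that refills are applied in memory first and thereby shrink the delta.

```go
package main

import (
	"errors"
	"fmt"
)

// HostAccount is a hypothetical EA as seen by the host.
type HostAccount struct {
	memBalance  int64 // balance the host works with
	diskBalance int64 // balance that would survive a crash
	maxDelta    int64 // how much unpersisted spending the host tolerates
	syncs       int   // number of (simulated) syncs to disk
}

// delta returns how much the renter has spent that isn't on disk yet.
func (a *HostAccount) delta() int64 {
	return a.diskBalance - a.memBalance
}

// persist simulates writing the in-memory balance to disk.
func (a *HostAccount) persist() {
	a.diskBalance = a.memBalance
	a.syncs++
}

// Withdraw charges the renter and syncs to disk if the delta grew too large.
func (a *HostAccount) Withdraw(amount int64) error {
	if amount > a.memBalance {
		return errors.New("insufficient balance")
	}
	a.memBalance -= amount
	if a.delta() > a.maxDelta {
		a.persist() // resets the delta to 0
	}
	return nil
}

// Deposit applies a refill in memory, which reduces the delta.
func (a *HostAccount) Deposit(amount int64) {
	a.memBalance += amount
}

func main() {
	acc := &HostAccount{memBalance: 1000, diskBalance: 1000, maxDelta: 100}
	_ = acc.Withdraw(60)
	fmt.Println(acc.delta(), acc.syncs) // 60 0
	acc.Deposit(60)
	fmt.Println(acc.delta(), acc.syncs) // 0 0
	_ = acc.Withdraw(60)
	_ = acc.Withdraw(60)
	fmt.Println(acc.delta(), acc.syncs) // 0 1
}
```

As long as refills keep coming in, the delta stays small and the host rarely has to sync. Only the last two withdrawals push it over the limit and trigger a single sync.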

Conclusion

Hopefully, we were able to give you some more insights into the challenges we face working on trustless and distributed platforms like Sia and Skynet. Next time we will talk about the MDM, what it is, why it makes the RPC protocol a lot more extensible, and how it enables a lot more complex use cases.

As always, if you want to know more, discuss the design with our team, or just like to have technical discussions, check out the code on GitLab and join us on Discord! Also, don’t forget to stay tuned for the last part of our Skyrocket blog post series!

Optimizing Payment Channels to Achieve CDN Latencies (Skyrocket Part 2) was originally published in Sia Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.
