Currently, our plan is to have RAID cover data devices only. The cache is not included, and this leads to a problem. With RAID, you would theoretically be able to remove failed devices, but that alone does not seem like enough justification to add RAID to the cache, which would otherwise benefit very little from it. We need to find another way to handle failed cache devices so that users don't end up in a position where they cannot start their pool.
There are two separate cases we need to handle: a started pool and a stopped pool. In a started pool, it's relatively straightforward to add remove-cache functionality. In the stopped pool case, there are a few considerations. If the cache fails to be set up, we could theoretically just remove the cache for the user so that the pool can still be set up, and they can re-add the cache afterwards. The other option, if we're willing to target only V2 pools for this functionality, would be to provide a way to remove the cache from stopped pools. Since the Stratis metadata is available in all cases in V2, we would theoretically be able to target a specific metadata operation to remove the cache on a stopped pool. This, however, would not be possible in V1.
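To make the cases concrete, here is a rough sketch in Rust of how the choice could be modelled. All of the type and function names are hypothetical and not taken from stratisd. It also assumes we go with the V2-only metadata option for stopped pools, which would leave automatic removal on setup failure as the only route for V1.

```rust
// Hypothetical model of the cases discussed above; none of these names
// exist in stratisd.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PoolState {
    Started,
    Stopped,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MetadataVersion {
    V1,
    V2,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CacheRemoval {
    /// Remove the cache from the running pool.
    OnStartedPool,
    /// Remove the cache through a targeted metadata operation while the
    /// pool is stopped (assumed to be V2 only).
    MetadataOperation,
    /// No explicit removal is possible; the cache could only be dropped
    /// automatically if it fails to set up.
    AutomaticOnSetupFailure,
}

/// Pick the removal path for a pool in the given state.
/// Assumption: the V2-only option is chosen for stopped pools.
fn removal_path(state: PoolState, version: MetadataVersion) -> CacheRemoval {
    match (state, version) {
        // A started pool can handle removal regardless of metadata version.
        (PoolState::Started, _) => CacheRemoval::OnStartedPool,
        // In V2 the Stratis metadata is available even when the pool is stopped.
        (PoolState::Stopped, MetadataVersion::V2) => CacheRemoval::MetadataOperation,
        // In V1 that metadata operation is not possible on a stopped pool.
        (PoolState::Stopped, MetadataVersion::V1) => CacheRemoval::AutomaticOnSetupFailure,
    }
}
```

For example, `removal_path(PoolState::Stopped, MetadataVersion::V1)` yields `CacheRemoval::AutomaticOnSetupFailure`, which is the gap that the V2-only approach would leave open.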
@drckeefe @mulkieran I'm open to your feedback here.