Commit 0f25b76
draid: add failure domains support
Currently, the only way to tolerate the failure of a whole
enclosure is to configure several draid vdevs in the pool, each
vdev having disks from different enclosures. But this essentially
degrades draid to raidz and defeats the purpose of having fast
sequential resilvering on wide pools with draid.
This patch allows configuring several groups of children in the
same row of one draid vdev. In each such group (call it a failure
group), the user can configure disks belonging to different
enclosures, i.e. failure domains. For example, with 10 enclosures
of 10 disks each, the user can put the 1st disk from each
enclosure into the 1st group, the 2nd disk from each enclosure
into the 2nd group, and so on. If one enclosure fails, only one
disk in each group fails, which does not affect draid operation,
and each group still has enough redundancy to recover the stored
data. Likewise, with draid2 two enclosures can fail at a time, and
with draid3 three enclosures (provided there are no other disk
failures in any group).
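The 10 x 10 example above can be checked with a small model. This is only an illustrative sketch, not code from the patch: the function names and the (enclosure, disk) tuple representation are assumptions made for the example.

```python
def build_failure_groups(n_enclosures, disks_per_enclosure):
    """Group d holds disk d of every enclosure, as (enclosure, disk) pairs."""
    return [
        [(enc, d) for enc in range(n_enclosures)]
        for d in range(disks_per_enclosure)
    ]


def failed_per_group(groups, failed_enclosures):
    """Count the disks in each group that sit in a failed enclosure."""
    return [
        sum(1 for enc, _ in group if enc in failed_enclosures)
        for group in groups
    ]


groups = build_failure_groups(n_enclosures=10, disks_per_enclosure=10)
one_down = failed_per_group(groups, {3})      # one enclosure lost
two_down = failed_per_group(groups, {3, 7})   # two enclosures lost
print(max(one_down), max(two_down))           # prints "1 2"
```

Every group loses exactly as many disks as there are failed enclosures, so the parity level of the vdev bounds the number of enclosures that can be lost.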
In order to preserve fast sequential resilvering in case of a disk
failure, the groups must share all disks between themselves, and
this is achieved by shuffling the disks between the groups. Only
the i-th disks of each group, i.e. the disks from the same
enclosure, are shuffled between the groups. After that, the disks
are shuffled within each group, as is done today in an ordinary
draid. Thus, no more than one disk from any enclosure can appear
in any failure group as a result of this shuffling.
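The two-stage shuffle described above can be modelled as follows. This is a sketch of the idea only: it uses Python's `random` module as a stand-in for draid's own permutation generation, and `shuffle_row` and the (enclosure, disk) tuples are hypothetical names chosen for the example.

```python
import random


def shuffle_row(groups, rng):
    """Shuffle one row of failure groups in two stages.

    groups[g][i] is the disk that group g takes from enclosure i,
    represented as an (enclosure, disk) pair.
    """
    n_groups = len(groups)
    n_children = len(groups[0])
    mixed = [list(group) for group in groups]

    # Stage 1: for each enclosure i, permute its disks across the groups.
    for i in range(n_children):
        column = [groups[g][i] for g in range(n_groups)]
        rng.shuffle(column)
        for g in range(n_groups):
            mixed[g][i] = column[g]

    # Stage 2: shuffle the disks inside each group.
    for group in mixed:
        rng.shuffle(group)
    return mixed


# Two groups of four children: enc0..enc3, disks d0 and d1.
groups = [[(enc, d) for enc in range(4)] for d in range(2)]
shuffled = shuffle_row(groups, random.Random(0))
```

Because stage 1 only exchanges disks of the same enclosure, each group still holds exactly one disk per enclosure after both stages, whatever permutation is drawn.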
For example, here is what the pool status output looks like in the
case of two `draid1:2d:4c:1s` groups:
  NAME                    STATE     READ WRITE CKSUM
  pool1                   ONLINE       0     0     0
    draid1:2d:4c:1s:8w-0  ONLINE       0     0     0
      enc0d0              ONLINE       0     0     0
      enc1d0              ONLINE       0     0     0
      enc2d0              ONLINE       0     0     0
      enc3d0              ONLINE       0     0     0
      enc0d1              ONLINE       0     0     0
      enc1d1              ONLINE       0     0     0
      enc2d1              ONLINE       0     0     0
      enc3d1              ONLINE       0     0     0
  spares
    draid1-0-0            AVAIL
    draid1-0-1            AVAIL
The number of failure groups is specified indirectly via the new
width parameter in the draid vdev configuration descriptor. The
width is the total number of disks and is a multiple of the number
of children in each group; this multiple is the number of groups
(width / children). Specifying it this way lets the user see at a
glance how many disks the draid vdev has.
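The width arithmetic can be illustrated with a toy parser for a descriptor such as `draid1:2d:4c:1s:8w`. This is not the parser used by `zpool`; the function names, the dictionary keys, and the treatment of an omitted width as a single group are assumptions made for the sketch.

```python
import re

# Suffix letter -> field name (names are illustrative only).
FIELDS = {"d": "data", "c": "children", "s": "spares", "w": "width"}


def parse_draid_spec(spec):
    """Parse e.g. 'draid1:2d:4c:1s:8w' into a dict of integers."""
    head, *fields = spec.split(":")
    match = re.fullmatch(r"draid(\d?)", head)
    if match is None:
        raise ValueError(f"not a draid descriptor: {spec!r}")
    cfg = {"parity": int(match.group(1) or 1)}  # plain 'draid' taken as parity 1
    for field in fields:
        fm = re.fullmatch(r"(\d+)([dcsw])", field)
        if fm is None:
            raise ValueError(f"bad field {field!r} in {spec!r}")
        cfg[FIELDS[fm.group(2)]] = int(fm.group(1))
    return cfg


def failure_group_count(cfg):
    """Number of failure groups = width / children."""
    children = cfg["children"]
    width = cfg.get("width", children)  # assumption: no width means one group
    if width % children != 0:
        raise ValueError("width must be a multiple of children")
    return width // children


cfg = parse_draid_spec("draid1:2d:4c:1s:8w")
print(failure_group_count(cfg))  # prints 2
```

For the pool shown earlier, a width of 8 with 4 children per group gives the two failure groups.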
Spare disks are evenly distributed among failure groups, so the
number of spares should be a multiple of the number of groups, and
they are shared by all groups. However, to support domain failure,
we cannot have more than nparity - 1 failed disks in any group,
regardless of whether they have been rebuilt to draid spares. The
blocks of those spares can be mapped to disks from the failed
domain (enclosure), and we cannot tolerate more than nparity
failures in any failure group.
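The failure budget above amounts to a simple per-group check. This is a hedged sketch of the rule as stated in this message, not a function from the patch: `survives_domain_failure` is a hypothetical name, and the counts are assumed to include disks already rebuilt to draid spares.

```python
def survives_domain_failure(failed_in_group, nparity):
    """True if every group can absorb one more failure (a lost enclosure).

    failed_in_group[g] is the number of failed disks in group g,
    counting disks that were already rebuilt to distributed spares.
    """
    return all(failed + 1 <= nparity for failed in failed_in_group)


print(survives_domain_failure([1, 0], nparity=2))  # prints True
print(survives_domain_failure([2, 0], nparity=2))  # prints False
```

A lost enclosure adds one failed disk to every group, so a group that already has nparity failures has no room left for it.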
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Andriy Tkachuk <andriy.tkachuk@seagate.com>
Closes #11969.
1 parent 7e33476 commit 0f25b76
File tree
29 files changed, +1395 -113 lines changed
- cmd/zpool
- include
  - sys
    - fs
- lib/libzfs
- man
  - man7
  - man8
- module
  - zcommon
  - zfs
- tests
  - runfiles
  - zfs-tests/tests
    - functional
      - cli_root
        - zpool_create
        - zpool_get
      - fault
      - redundancy