Compression can have a similar unexpected interaction with backups. Quotas are often used to limit how much data can be stored, to ensure there is sufficient backup space available. However, since quotas do not consider compression, more data may be written than would fit in uncompressed backups.
Zstandard Compression
In <acronym>OpenZFS</acronym> 2.0, a new compression algorithm was added. Zstandard (<acronym>Zstd</acronym>) offers higher compression ratios than the default <acronym>LZ4</acronym> while offering much greater speeds than the alternative, <acronym>gzip</acronym>. <acronym>OpenZFS</acronym> 2.0 is available starting with FreeBSD 12.1-RELEASE via <package>sysutils/openzfs</package>, has been the default in FreeBSD 13-CURRENT since September 2020, and will be in FreeBSD 13.0-RELEASE.
<acronym>Zstd</acronym> provides a large selection of compression levels, giving fine-grained control over performance versus compression ratio. One of the main advantages of <acronym>Zstd</acronym> is that the decompression speed is independent of the compression level. For data that is written once but read many times, <acronym>Zstd</acronym> allows the use of the highest compression levels without a read performance penalty.
Even when data is updated frequently, enabling compression often improves performance. One of the biggest advantages comes from the compressed <acronym>ARC</acronym> feature. <acronym>ZFS</acronym>'s Adaptive Replacement Cache (<acronym>ARC</acronym>) caches the compressed version of the data in <acronym>RAM</acronym>, decompressing it each time it is needed. This allows the same amount of <acronym>RAM</acronym> to store more data and metadata, increasing the cache hit ratio.
<acronym>ZFS</acronym> offers 19 levels of <acronym>Zstd</acronym> compression, each offering incrementally more space savings in exchange for slower compression. The default level is <literal>zstd-3</literal> and offers greater compression than <acronym>LZ4</acronym> without being significantly slower. Levels above 10 require significant amounts of memory to compress each block, so they are discouraged on systems with less than 16 GB of <acronym>RAM</acronym>. <acronym>ZFS</acronym> also implements a selection of the <acronym>Zstd</acronym> <emphasis>fast</emphasis> levels, which get correspondingly faster but offer lower compression ratios. <acronym>ZFS</acronym> supports <literal>zstd-fast-1</literal> through <literal>zstd-fast-10</literal>, <literal>zstd-fast-20</literal> through <literal>zstd-fast-100</literal> in increments of 10, and finally <literal>zstd-fast-500</literal> and <literal>zstd-fast-1000</literal>, which provide minimal compression but offer very high performance.
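The level names listed above follow a regular pattern, so a script can check a value before passing it to <command>zfs set compression=...</command>. The following is a minimal sketch, not part of the Handbook: the function name <literal>is_zstd_level</literal> is hypothetical, and accepting the bare value <literal>zstd</literal> as the default level is an assumption beyond what the paragraph states.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 is one of the Zstd compression
# values described in the text (zstd-1..19 and the listed fast levels).
is_zstd_level() {
    case "$1" in
        zstd)        return 0 ;;   # assumption: bare name selects the default level
        zstd-fast-*) kind=fast;   n=${1#zstd-fast-} ;;
        zstd-*)      kind=normal; n=${1#zstd-} ;;
        *)           return 1 ;;
    esac
    # Reject empty, non-numeric, or zero-prefixed suffixes.
    case "$n" in
        ''|*[!0-9]*|0*) return 1 ;;
    esac
    if [ "$kind" = normal ]; then
        [ "$n" -le 19 ]                        # zstd-1 through zstd-19
    elif [ "$n" -le 10 ]; then
        return 0                               # zstd-fast-1 through zstd-fast-10
    elif [ "$n" -le 100 ]; then
        [ $((n % 10)) -eq 0 ]                  # zstd-fast-20 .. zstd-fast-100, step 10
    else
        [ "$n" -eq 500 ] || [ "$n" -eq 1000 ]  # the two largest fast levels
    fi
}
```

A caller might write <literal>is_zstd_level zstd-fast-30 &amp;&amp; zfs set compression=zstd-fast-30 pool/dataset</literal>, where <literal>pool/dataset</literal> is a placeholder name.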
If <acronym>ZFS</acronym> is not able to allocate the required memory to compress a block with <acronym>Zstd</acronym>, it will fall back to storing the block uncompressed. This is unlikely to happen outside of the highest levels of <acronym>Zstd</acronym> on systems that are memory constrained. The sysctl <literal>kstat.zfs.misc.zstd.compress_alloc_fail</literal> counts how many times this has occurred since the <acronym>ZFS</acronym> module was loaded.
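The counter named above can be polled from a monitoring script. This is a minimal sketch that assumes a FreeBSD system with the <acronym>OpenZFS</acronym> 2.0 module loaded; the function name and the message wording are illustrative only.

```shell
#!/bin/sh
# Hypothetical check: report whether any Zstd compression allocations
# have failed since the ZFS module was loaded.
# Exit status: 0 = none, 1 = at least one failure, 2 = sysctl unavailable.
check_zstd_alloc_fail() {
    fails=$(sysctl -n kstat.zfs.misc.zstd.compress_alloc_fail) || return 2
    if [ "$fails" -gt 0 ]; then
        echo "warning: $fails Zstd allocation failures since module load"
        return 1
    fi
    echo "ok: no Zstd allocation failures"
}
```

A non-zero count suggests that the chosen compression level may be too high for the memory available on the system.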
Deduplication
When enabled, <link linkend="zfs-term-deduplication">deduplication</link> uses the checksum of each block to detect duplicate blocks. When a new block is a duplicate of an existing block, <acronym>ZFS</acronym> writes an additional reference to the existing data instead of the whole duplicate block. Tremendous space savings are possible if the data contains many duplicated files or repeated information. Be warned: deduplication requires an extremely large amount of memory, and most of the space savings can be had without the extra cost by enabling compression instead.
To activate deduplication, set the <literal>dedup</literal> property on the target pool:
<prompt>#</prompt> <userinput>zfs set dedup=on <replaceable>pool</replaceable></userinput>
Only new data being written to the pool will be deduplicated. Data that has already been written to the pool will not be deduplicated merely by activating this option. A pool with a freshly activated deduplication property will look like this example:
<prompt>#</prompt> <userinput>zpool list</userinput>
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
pool 2.84G 2.19M 2.83G - - 0% 0% 1.00x ONLINE -
The <literal>DEDUP</literal> column shows the actual rate of deduplication for the pool. A value of <literal>1.00x</literal> shows that data has not been deduplicated yet. In the next example, the ports tree is copied three times into different directories on the deduplicated pool created above.
<prompt>#</prompt> <userinput>for d in dir1 dir2 dir3; do</userinput>
&gt; <userinput>mkdir $d &amp;&amp; cp -R /usr/ports $d &amp;</userinput>
&gt; <userinput>done</userinput>
Redundant data is detected and deduplicated:
<prompt>#</prompt> <userinput>zpool list</userinput>
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
pool 2.84G 20.9M 2.82G - - 0% 0% 3.00x ONLINE -
The <literal>DEDUP</literal> column shows a factor of <literal>3.00x</literal>. Multiple copies of the ports tree data were detected and deduplicated, using only a third of the space. The potential for space savings can be enormous, but comes at the cost of having enough memory to keep track of the deduplicated blocks.
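For scripting, the <literal>DEDUP</literal> value can be pulled out of the <command>zpool list</command> output by locating the column in the header row. This is a small sketch under the assumption that the output has the whitespace-separated layout shown in the examples; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `zpool list` output on stdin and print the
# DEDUP column for the pool named in $1.
dedup_ratio_of() {
    awk -v pool="$1" '
        # Header row: remember which field is labelled DEDUP.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "DEDUP") col = i; next }
        # Data rows: print that field for the requested pool.
        $1 == pool && col { print $col }
    '
}
```

Typical use would be <literal>zpool list | dedup_ratio_of pool</literal>. Where the <literal>dedupratio</literal> pool property is available, <literal>zpool list -H -o dedupratio pool</literal> should report the same value without any parsing.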
Deduplication is not always beneficial, especially when the data on a pool is not redundant. <acronym>ZFS</acronym> can show potential space savings by simulating deduplication on an existing pool:
<prompt>#</prompt> <userinput>zdb -S <replaceable>pool</replaceable></userinput>
Simulated DDT histogram:

bucket allocated referenced
______ ______________________________ ______________________________
refcnt blocks LSIZE PSIZE DSIZE blocks LSIZE PSIZE DSIZE
------ ------ ----- ----- ----- ------ ----- ----- -----
1 2.58M 289G 264G 264G 2.58M 289G 264G 264G
2 206K 12.6G 10.4G 10.4G 430K 26.4G 21.6G 21.6G
4 37.6K 692M 276M 276M 170K 3.04G 1.26G 1.26G
8 2.18K 45.2M 19.4M 19.4M 20.0K 425M 176M 176M
16 174 2.83M 1.20M 1.20M 3.33K 48.4M 20.4M 20.4M
32 40 2.17M 222K 222K 1.70K 97.2M 9.91M 9.91M
64 9 56K 10.5K 10.5K 865 4.96M 948K 948K
128 2 9.50K 2K 2K 419 2.11M 438K 438K
256 5 61.5K 12K 12K 1.90K 23.0M 4.47M 4.47M
1K 2 1K 1K 1K 2.98K 1.49M 1.49M 1.49M
Total 2.82M 303G 275G 275G 3.20M 319G 287G 287G

dedup = 1.05, compress = 1.11, copies = 1.00, dedup * compress / copies = 1.16
After <command>zdb -S</command> finishes analyzing the pool, it shows the space reduction ratio that would be achieved by activating deduplication. In this case, <literal>1.16</literal> is a very poor space-saving ratio that is mostly provided by compression. Activating deduplication on this pool would not save any significant amount of space, and is not worth the amount of memory required to enable deduplication. Using the formula <emphasis>ratio = dedup * compress / copies</emphasis>, system administrators can plan the storage allocation, deciding whether the workload will contain enough duplicate blocks to justify the memory requirements. If the data is reasonably compressible, the space savings may be very good. Enabling compression first is recommended, and compression can also provide greatly increased performance. Only enable deduplication in cases where the additional savings will be considerable and there is sufficient memory for the <link linkend="zfs-term-deduplication"><acronym>DDT</acronym></link>.
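The formula can be checked against the <literal>Total</literal> row of the histogram. The sketch below assumes that the three factors are ratios of the totals (dedup as referenced <literal>DSIZE</literal> over allocated <literal>DSIZE</literal>, compress as referenced <literal>LSIZE</literal> over referenced <literal>PSIZE</literal>, copies as referenced <literal>DSIZE</literal> over referenced <literal>PSIZE</literal>); that reading is an assumption, not something the text above states, and the function name is hypothetical. Because the table prints rounded sizes, individual factors can differ from the <command>zdb</command> summary in the last digit.

```shell
#!/bin/sh
# Hypothetical helper: recompute the zdb -S summary from the Total row.
# Arguments (all in the same unit):
#   $1 referenced LSIZE, $2 referenced PSIZE,
#   $3 referenced DSIZE, $4 allocated DSIZE
zdb_ratios() {
    awk -v rl="$1" -v rp="$2" -v rd="$3" -v ad="$4" 'BEGIN {
        dedup    = rd / ad   # space saved by sharing duplicate blocks
        compress = rl / rp   # logical size over physical size
        copies   = rd / rp   # extra space used by redundant copies
        printf "dedup = %.2f, compress = %.2f, copies = %.2f, ratio = %.2f\n",
            dedup, compress, copies, dedup * compress / copies
    }'
}

# Totals from the example above, in GB.
zdb_ratios 319 287 287 275
# prints: dedup = 1.04, compress = 1.11, copies = 1.00, ratio = 1.16
```

The combined ratio reduces to referenced <literal>LSIZE</literal> divided by allocated <literal>DSIZE</literal> (319 / 275), which matches the <literal>1.16</literal> reported by <command>zdb -S</command>.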
<acronym>ZFS</acronym> and Jails
<command>zfs jail</command> and the corresponding <literal>jailed</literal> property are used to delegate a <acronym>ZFS</acronym> dataset to a <link linkend="jails">Jail</link>. <command>zfs jail <replaceable>jailid</replaceable></command> attaches a dataset to the specified jail, and <command>zfs unjail</command> detaches it. For the dataset to be controlled from within a jail, the <literal>jailed</literal> property must be set. Once a dataset is jailed, it can no longer be mounted on the host because it may have mount points that would compromise the security of the host.
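The two steps described above, setting <literal>jailed</literal> and attaching the dataset, can be wrapped in a small function. This is a sketch only: the function name, the <literal>ZFS</literal> override variable, and the dataset and jail ID in the usage note are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: delegate dataset $1 to the running jail with ID $2.
# Setting ZFS=echo turns the function into a dry run that prints the
# zfs arguments instead of executing them.
delegate_to_jail() {
    dataset=$1
    jailid=$2
    zfs_cmd=${ZFS:-zfs}
    # Mark the dataset as controllable from inside a jail, then attach it.
    "$zfs_cmd" set jailed=on "$dataset" &&
    "$zfs_cmd" jail "$jailid" "$dataset"
}
```

For example, <literal>ZFS=echo delegate_to_jail pool/jails/www 5</literal> previews the commands, and <command>zfs unjail</command> with the same jail ID and dataset reverses the attachment.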
Delegated Administration
A comprehensive permission delegation system allows unprivileged users to perform <acronym>ZFS</acronym> administration functions. For example, if each user's home directory is a dataset, users can be given permission to create and destroy snapshots of their home directories. A backup user can be given permission to use replication features. A usage statistics script can be allowed to run with access only to the space utilization data for all users. It is even possible to delegate the ability to delegate permissions. Permission delegation is possible for each subcommand and most properties.
Delegating Dataset Creation
<command>zfs allow <replaceable>someuser</replaceable> create <replaceable>mydataset</replaceable></command> gives the specified user permission to create child datasets under the selected parent dataset. There is a caveat: creating a new dataset involves mounting it. That requires setting the FreeBSD <literal>vfs.usermount</literal> <citerefentry><refentrytitle>sysctl</refentrytitle><manvolnum>8</manvolnum></citerefentry> to <literal>1</literal> to allow non-root users to mount a file system. There is another restriction aimed at preventing abuse: non-<systemitem class="username">root</systemitem> users must own the mountpoint where the file system is to be mounted.
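The steps in this paragraph can be collected into one function. The sketch below is illustrative: the function name and the <literal>ZFS</literal> and <literal>SYSCTL</literal> override variables are hypothetical, and granting <literal>mount</literal> together with <literal>create</literal> follows the note in zfs-allow(8) that <literal>create</literal> also needs the <literal>mount</literal> permission, which the paragraph above does not mention. Ownership of the mountpoint is not handled here.

```shell
#!/bin/sh
# Hypothetical helper: let user $1 create child datasets under dataset $2.
# Setting ZFS=echo and SYSCTL=echo prints the arguments instead of
# running the commands.
delegate_create() {
    user=$1
    dataset=$2
    zfs_cmd=${ZFS:-zfs}
    sysctl_cmd=${SYSCTL:-sysctl}
    # Allow non-root users to mount file systems (lasts until reboot).
    "$sysctl_cmd" vfs.usermount=1 &&
    # Grant create, plus mount, which creating a dataset also needs.
    "$zfs_cmd" allow "$user" create,mount "$dataset"
}
```

After running it, <literal>zfs allow mydataset</literal> lists the permissions currently delegated on the dataset, which is a quick way to confirm the result.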
Delegating Permission Delegation
<command>zfs allow <replaceable>someuser</replaceable> allow <replaceable>mydataset</replaceable></command> gives the specified user the ability to assign any permission they have on the target dataset, or its children, to other users. If a user has the <literal>snapshot</literal> permission and the <literal>allow</literal> permission, that user can then grant the <literal>snapshot</literal> permission to other users.
Tuning
There are a number of tunables that can be adjusted to make <acronym>ZFS</acronym> perform best for different workloads.


Source information

Source string comment
(itstool) path: sect2/para
Source string location
book.translate.xml:42367
String age
a year ago
Source string age
a year ago
Translation file
books/zh_CN/handbook.po, string 6957