
PSA encryption performance - 1.9.4 MUCH faster than 2.1.0

Posted: Mon Nov 15, 2021 11:33 am
by Tsur
I recently upgraded a server from High Sierra to Mojave and decided to upgrade from ZFS 1.9.4 to 2.1.0.
I could not figure out why writes and reads to encrypted datasets were dramatically slower. All the hardware was exactly the same. I started by troubleshooting my network, then decided to test local copies from an SSD. However, both local and network writes and reads never went above 45 MB/s. I figured it must be Mojave or 2.1.0. Downgrading ZFS was the easier first step.

I uninstalled 2.1.0 and installed 1.9.4, and that instantly fixed the read/write performance. Over the network, the encrypted datasets are now saturating gigabit Ethernet, and local writes from the SSD are 250+ MB/s. I don't know if this is a known issue or perhaps just particular to my setup. But if you too are experiencing really slow performance from your encrypted datasets, it might be 2.1.0.

Though, 1.9.4's encrypted performance is so much faster that now I'm slightly worried my files aren't properly encrypted ; ) Maybe there's a reason why 2.1.0 is so much slower - it's actually encrypting the files.

Re: PSA encryption performance - 1.9.4 MUCH faster than 2.1.

Posted: Tue Nov 16, 2021 3:25 am
by jawbroken
It might help to specify what your encryption settings are, if someone gets the chance to look at the regression.

Re: PSA encryption performance - 1.9.4 MUCH faster than 2.1.

Posted: Tue Nov 16, 2021 7:03 am
by Tsur
^ I agree.
# zfs create -o encryption=on -o keylocation=prompt -o keyformat=passphrase [dataset]

Sorry, the settings were in my previous, misguided, post where I was attempting to force root level file sharing behavior. Just the default, recommended encryption found in the Wiki. I believe the defaults are AES 256 CCM. Earlier in this server's life, I was using Core Storage encryption, which worked great, but was just more of a hassle. I switched to native ZFS a couple years back. Initially, I was worried that my lowly i3-4150 wouldn't be able to handle on-the-fly encryption, but with both Core and native, it worked great with no noticeable performance hit.

Dunno why 2.1.0 is slower.

Re: PSA encryption performance - 1.9.4 MUCH faster than 2.1.

Posted: Mon Jan 03, 2022 5:46 pm
by lundman
It is possible/likely that in the 2.1.0 codebase the assembler versions are not kicking in. Although, if that were the case, they wouldn't be listed in the sysctl output.

Re: PSA encryption performance - 1.9.4 MUCH faster than 2.1.

Posted: Tue Jan 04, 2022 8:48 am
by abc123
For 2.1.0 on Monterey x64, I see the following:

Code:
$ sysctl -a | grep "kstat\.zfs\.darwin\.tunable\..+_impl:"
kstat.zfs.darwin.tunable.zfs_vdev_raidz_impl: cycle [fastest] original scalar sse2
kstat.zfs.darwin.tunable.icp_gcm_impl: cycle [fastest] generic
kstat.zfs.darwin.tunable.icp_aes_impl: cycle [fastest] generic x86_64
kstat.zfs.darwin.tunable.zfs_fletcher_4_impl: [fastest] scalar superscalar superscalar4 sse2


Are any other options expected for gcm and aes? What does 1.9.4 list? How do we find which is the fastest?

Re: PSA encryption performance - 1.9.4 MUCH faster than 2.1.

Posted: Tue Jan 04, 2022 9:32 am
by abc123
Ah, had a Catalina VM lying around. Using 1.9.4 I get:

Code:
sysctl -a | grep "kstat\.zfs\.darwin\.tunable" | grep "impl:"
kstat.zfs.darwin.tunable.zfs_vdev_raidz_impl: [fastest] original scalar sse2 ssse3 avx2
kstat.zfs.darwin.tunable.icp_gcm_impl: cycle [fastest] generic pclmulqdq
kstat.zfs.darwin.tunable.icp_aes_impl: cycle [fastest] generic x86_64 aesni
kstat.zfs.darwin.tunable.zfs_fletcher_4_impl: [fastest] scalar superscalar superscalar4 sse2 ssse3 avx2


So it does look like some of the implementations are missing from 2.1.0.

Re: PSA encryption performance - 1.9.4 MUCH faster than 2.1.

Posted: Tue Jan 04, 2022 8:21 pm
by cgiard
I’m guessing, but the lack of “aesni” on the Monterey output would likely be a real performance problem. Doing AES without the special CPU instructions is a LOT slower.

Re: PSA encryption performance - 1.9.4 MUCH faster than 2.1.

Posted: Wed Jan 05, 2022 2:06 am
by lundman
Yeah, definitely missing aesni and pclmulqdq - those will hurt. I'll check why the build isn't including them.

Re: PSA encryption performance - 1.9.4 MUCH faster than 2.1.

Posted: Wed Jan 05, 2022 11:19 pm
by lundman
It does reach:

Code:
    return (!!(spl_cpuid_features() & CPUID_FEATURE_AES));

#define CPUID_FEATURE_AES       _HBit(25) /* AES instructions */

#define cpuid(func, a, b, c, d)                 \
    __asm__ __volatile__( \
    "        pushq %%rbx        \n" \
    "        xorq %%rcx,%%rcx   \n" \
    "        cpuid              \n" \
    "        movq %%rbx, %%rsi  \n" \
    "        popq %%rbx         \n" : \
    "=a" (a), "=S" (b), "=c" (c), "=d" (d) : "a" (func))

uint64_t
spl_cpuid_features(void)
{

        cpuid(0, a, b, c, d);
        if (a >= 1) {
            cpuid(1, a, b, c, d);
            cpuid_features = d;

}

# sysctl machdep.cpu.features
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH MMX FXSR SSE SSE2 SS HTT SSE3 PCLMULQDQ MON SSSE3 FMA CX16 SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES VMM PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C


Odd, it has:
Code:
 spl_cpuid_features: FEATURES 1f8bfbff
 spl_cpuid_features: LEAF7    9c27ab
 spl_cpuid_features: AES is NO: testing bit 200000000000000


Hmmmm

Re: PSA encryption performance - 1.9.4 MUCH faster than 2.1.

Posted: Wed Jan 05, 2022 11:33 pm
by lundman
Ah ok, so I was only keeping 32 bits:

Code:
spl_cpuid_features: FEATURES fffa320b1f8bfbff
spl_cpuid_features AES is YES: testing bit 200000000000000