11-25-2020, 02:27 PM
(11-25-2020, 01:53 PM)DrYak Wrote:(11-19-2020, 04:24 PM)xyzzy Wrote: Non 4kB aligned partitions are usually really bad. {...} But most likely it means each write to a 4kB ext4 (or other fs) block involves a read-modify-write cycle to the two flash blocks (of unknown size, 2k? 4k?) that the ext4 block overlaps.
This 4kiB doesn't matter any more nowadays.
That's not what I see when I test. I did read-testing with flashbench:
align 16777216 pre 282µs on 395µs post 280µs diff 114µs
align 8388608 pre 282µs on 403µs post 285µs diff 120µs
align 4194304 pre 231µs on 350µs post 281µs diff 94µs
align 2097152 pre 230µs on 263µs post 231µs diff 32.5µs
align 1048576 pre 231µs on 255µs post 229µs diff 24.5µs
align 524288 pre 231µs on 255µs post 233µs diff 22.8µs
align 262144 pre 228µs on 245µs post 228µs diff 16.9µs
align 131072 pre 230µs on 254µs post 233µs diff 22.9µs
align 65536 pre 231µs on 261µs post 228µs diff 31.1µs
align 32768 pre 230µs on 247µs post 232µs diff 16.4µs
align 16384 pre 233µs on 249µs post 233µs diff 15.6µs
align 8192 pre 232µs on 250µs post 233µs diff 16.7µs
align 4096 pre 232µs on 253µs post 234µs diff 20.2µs
align 2048 pre 233µs on 234µs post 233µs diff 416ns
The important number is the last column (diff), which is how much longer a read takes when it straddles the boundary compared with a read just before or just after it. A read that crosses a 4kB boundary costs about 20µs extra, versus about 0.4µs for a 2kB boundary, so roughly 50x more. There's also a significant jump at 4MB, which is probably the erase block size.
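To make that comparison easy to reproduce, here is a rough Python sketch that pulls the diff column out of flashbench output like the lines above and prints how much the penalty changes from one alignment to the next larger one. The function names (`parse_diffs`, `step_ratios`) and the regex are my own and assume the output format shown in this post. Only three of the lines are repeated in the sample to keep it short.

```python
import re

# Three lines copied from the flashbench run above.
SAMPLE = """\
align 8192 pre 232µs on 250µs post 233µs diff 16.7µs
align 4096 pre 232µs on 253µs post 234µs diff 20.2µs
align 2048 pre 233µs on 234µs post 233µs diff 416ns
"""

# Unit suffix -> nanoseconds. Both the micro sign and the Greek mu are accepted.
UNITS_NS = {"ns": 1.0, "us": 1e3, "\u00b5s": 1e3, "\u03bcs": 1e3, "ms": 1e6}

# Assumed line format: "align <bytes> ... diff <number><unit>"
LINE_RE = re.compile(r"align\s+(\d+)\s.*?diff\s+(-?[\d.]+)(ns|us|[\u00b5\u03bc]s|ms)")


def parse_diffs(text):
    """Return {alignment_in_bytes: diff_in_ns} for every matching line."""
    diffs = {}
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            diffs[int(m.group(1))] = float(m.group(2)) * UNITS_NS[m.group(3)]
    return diffs


def step_ratios(diffs):
    """For each alignment, the ratio of its diff to the next smaller alignment's diff."""
    sizes = sorted(diffs)
    return [(large, diffs[large] / diffs[small])
            for small, large in zip(sizes, sizes[1:])]


if __name__ == "__main__":
    for size, ratio in step_ratios(parse_diffs(SAMPLE)):
        print(f"{size:>9} B boundary: {ratio:.1f}x the penalty of the next smaller one")
```

A large ratio at a given size suggests that size is a real boundary inside the flash (page, erase block, and so on); ratios near 1 mean nothing much changes there.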
I also did a test using fio with random 4kB IO on the ext4 partition, first in its original unaligned location and again after I moved the partition to align it. Unaligned it got 1701 IOPS read and 568 IOPS write. After alignment it was 3549 IOPS read and 1168 IOPS write, so roughly double in both cases.
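If you want to check whether a partition is aligned before moving it, something like the following Python sketch should do. It assumes Linux, where `/sys/class/block/<partition>/start` gives the start offset in 512-byte sectors. The helper names and the device name `mmcblk0p1` are just placeholders, and the start sectors in the usage note are illustrative values, not the ones from my device.

```python
SYSFS_SECTOR_BYTES = 512  # sysfs reports partition start in 512-byte units


def is_aligned(start_sector, alignment_bytes=4096):
    """True if the partition's byte offset is a multiple of alignment_bytes."""
    return (start_sector * SYSFS_SECTOR_BYTES) % alignment_bytes == 0


def next_aligned_sector(start_sector, alignment_bytes=4096):
    """Smallest sector >= start_sector whose byte offset is aligned."""
    sectors_per_unit = alignment_bytes // SYSFS_SECTOR_BYTES
    # Round up to the next multiple of sectors_per_unit.
    return -(-start_sector // sectors_per_unit) * sectors_per_unit


def read_start_sector(partition):
    """Read the start sector of e.g. 'mmcblk0p1' (placeholder name) from sysfs."""
    with open(f"/sys/class/block/{partition}/start") as f:
        return int(f.read().strip())
```

For example, a partition starting at sector 34 is not 4kB aligned (`is_aligned(34)` is False), and `next_aligned_sector(34, 4 * 1024 * 1024)` gives 8192, the first 4MB-aligned sector. Given the jump at 4MB in the flashbench numbers, aligning to 4MB rather than just 4kB seems like the safer choice.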