How does TRIM affect SSD speed?

Just recently, I built a desktop PC for myself, mostly from used and refurbished parts. One of those parts was a used Samsung 840 PRO 256GB from a close colleague, who no longer needed it after upgrading to a larger drive.

My plan was to copy the whole installation from my laptop’s SSD onto the colleague’s SSD with just minimal changes. I partitioned the drive, configured the crypto, LVM and filesystems and then fired up rsync -aHxEXA for all my partitions.

But it failed to copy my root partition (~30GB) within half an hour. I got suspicious. A look at the transfer speed made it clear: the drive was writing at about 15-20MB/s. WTF!? Even a bad HDD would be faster!

There was no reason for a bottleneck at ~20MB/s. The SSD was connected via a SATA III interface, and the source was also an SSD connected via SATA III on the same computer.

Unlike with a used product from eBay, I knew my colleague had treated the SSD well. He had a setup similar to mine: a simple ext4 filesystem on top of a dm-crypt encrypted partition.

Why did he grind his SSD that hard?

Status quo

This was the best decision in this whole story, and probably the reason why I’m writing a blog post about it: collect a metric first!

Any benchmarking software that can test write speed should suffice. I used GNOME Disks, a quite useful GUI tool on Linux distros.
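If you prefer the command line, a rough sequential write test can be sketched with dd. This is not what GNOME Disks does internally, just a quick sanity check under a few assumptions: GNU (or busybox) dd is available, the helper name write_test is made up for this post, and the target is a scratch file on a filesystem living on the SSD, not the raw device.

```shell
#!/bin/sh
# Rough sequential write test against a scratch file (NOT the raw device).
# write_test is a hypothetical helper name, not part of any tool.

# write_test FILE SIZE_MIB
# Writes SIZE_MIB mebibytes of zeros to FILE, prints dd's summary line
# (which includes the throughput), then removes the scratch file.
write_test() {
    file=$1
    size_mib=$2
    # conv=fsync makes dd flush the data to the disk before it reports,
    # so the page cache does not inflate the number.
    dd if=/dev/zero of="$file" bs=1M count="$size_mib" conv=fsync 2>&1 | tail -n 1
    rm -f -- "$file"
}
```

Something like `write_test /mnt/ssd/scratch.bin 1024` (the mount point is a placeholder) writes 1 GiB and prints the speed dd measured. It only covers sequential writes, so it is a sanity check rather than a replacement for a proper benchmark.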

Benchmark Initial

Obviously, the blue line is the plotted read speed and the green dots are the access times. Not so obvious: the red line represents the write speed. Sorry for the mouse pointer sticking around.

Do you spot the write performance? No? Well, it’s hiding behind the green dots that visualize the access times.

It’s obvious: this drive is broken and unacceptable to use. Even an HDD would perform better.

What could the problem be?

I came up with some possible causes.

  • The SSD is actually broken in hardware
  • The SSD never got TRIMmed
  • A firmware update is necessary

I chose to update the firmware first. Samsung’s SSDs are known to have firmware problems that affect the SSD’s speed. The SSD shouldn’t be broken: it’s a PRO SSD, which comes with up to 5 years of warranty, and S.M.A.R.T. did not report anything other than 0 on the usual failure indicators. I also ruled out the missing TRIM, because we had both switched to dm-crypt together and, in contrast to me, he had actively decided to use TRIM on his SSD.

Firmware Update

So, I started with the firmware update. I could start a completely new rant about the process of creating a bootable USB medium with the firmware update. I hope their intern produced those ISO images, and not the dev who wrote the SSD firmware.

After a few hours of failed attempts I finally succeeded, and the effort seemed to pay off. The write speed quadrupled:

Benchmark with new FW

Wow, that’s awesome. Their firmware update really does help!

Conclusion

Finally we improv… No, we’re not over the finish line yet! I always preach to prove one’s axioms[0]. The firmware update helped. But was it really the proper solution to our problem, or did it improve performance through some side effect?

So let’s check our outlined paths again:

  • The SSD is actually broken in hardware

I guess the S.M.A.R.T. values show that the hardware shouldn’t be the issue. Of course, the drive could be broken in a way that S.M.A.R.T. doesn’t detect, but the FW update did help and brought the SSD back to acceptable speeds.

  • A firmware update is necessary

Yeah, this helped quite well. But to verify that this was actually the problem, we need to test it over time. If the drive speed stays the same, that’s the correct solution. If not, there has to be something else.

  • The SSD never got TRIMmed

At first, I ruled this out in my axioms. I thought this couldn’t be possible. But I never (dis)proved it. We installed his machine together, but in the meantime, my colleague could have deactivated TRIM for $some_reason.

So, let’s check the TRIM assumption again. While researching this whole topic, I found out that you don’t have to use hdparm anymore to discard block devices. With blkdiscard, there is a newer tool that does it for any block device.
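Before discarding anything, it is worth checking whether a device advertises discard support at all. The kernel exposes this in sysfs as queue/discard_max_bytes; as far as I understand it, a value of 0 means discards are not passed down at that layer (which is what you would expect for a dm-crypt mapping opened without discard enabled). A minimal sketch, with made-up helper names and an optional second argument that exists only so the functions can be pointed at a fake sysfs tree:

```shell
#!/bin/sh
# discard_max_bytes DEV [SYSFS_ROOT]
# Prints the kernel's discard_max_bytes value for DEV (e.g. "sdb" or
# "dm-0"). SYSFS_ROOT defaults to /sys. Both function names here are
# hypothetical helpers for this post.
discard_max_bytes() {
    dev=$1
    sysfs=${2:-/sys}
    cat "$sysfs/block/$dev/queue/discard_max_bytes" 2>/dev/null
}

# supports_discard DEV [SYSFS_ROOT]
# Succeeds only if the value exists and is greater than zero.
supports_discard() {
    max=$(discard_max_bytes "$@")
    [ -n "$max" ] && [ "$max" -gt 0 ]
}
```

For example, `supports_discard sdb && echo "sdb accepts discards"`. Running `lsblk --discard` shows the same information for the whole device stack at once: zeros in the DISC-GRAN and DISC-MAX columns are the warning sign.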

So, let’s issue a simple (Warning: Use with caution, this discards all data on the device!)

blkdiscard /dev/sdb

and then run the benchmark again.
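Since blkdiscard throws away everything on the target, I would wrap it in a small guard when scripting it. The following is only a sketch: the function name and the REALLY_DISCARD environment variable are inventions for this post, not something blkdiscard knows about.

```shell
#!/bin/sh
# discard_device DEV
# By default only prints the command it would run. It calls blkdiscard
# only when the (hypothetical) variable REALLY_DISCARD is set to "yes"
# and DEV is actually a block device.
discard_device() {
    dev=$1
    if [ "${REALLY_DISCARD:-no}" != "yes" ]; then
        echo "dry run: blkdiscard $dev"
        return 0
    fi
    if [ ! -b "$dev" ]; then
        echo "refusing: $dev is not a block device" >&2
        return 1
    fi
    blkdiscard "$dev"
}
```

Calling `discard_device /dev/sdb` is harmless and just prints the command. Only after exporting REALLY_DISCARD=yes (and as root) does it actually discard the device. The guard does not check whether anything on the device is mounted, so that part is still up to you.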

Oh, wow!

I couldn’t believe my eyes. Yep. The read speed maxed out at SATA connection speed! The write speed quadrupled again!

Bonus: Compare the access times with the previous results!

Benchmark with new FW

The real conclusion

  • Prove your axioms[0]!
  • A clear problem always has a clear solution. Go on and find the one that actually applies. “A solution” isn’t always the real one!

Obviously, TRIMming didn’t work on my colleague’s SSD. And after reading the manual for the discard-related options in LUKS, I actually guess that it never worked, right from the start. So when I helped him set up his SSD crypto and didn’t test that discard worked, I shot myself in the foot in the long run.
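The test I skipped back then could have been as simple as looking at the device-mapper table of the opened LUKS volume. dm-crypt only passes discards down when the mapping carries the allow_discards flag, which is set by opening the volume with cryptsetup’s --allow-discards option or the discard option in /etc/crypttab. A small sketch, where the helper name and the mapping name cryptroot are placeholders:

```shell
#!/bin/sh
# has_allow_discards
# Reads a device-mapper table (the output of "dmsetup table NAME") on
# stdin and succeeds if the allow_discards flag is present.
# The function name is a hypothetical helper for this post.
has_allow_discards() {
    grep -qw 'allow_discards'
}
```

As root, `dmsetup table cryptroot | has_allow_discards && echo "discards pass through dm-crypt"` would have answered the question in a second. If the flag is missing, fstrim and the discard mount option on the filesystem above never reach the SSD.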

[0] OK, technically, these aren’t axioms. These are plain assumptions. But people often treat them like axioms. They spend no effort on either proving or falsifying their assumptions.

Coming up

To take this to a higher level: was the firmware update or the blkdiscard the actual solution? I don’t know yet, but I’ve got a similar SSD (Samsung 750 EVO 500GB) with similar write speeds. Running blkdiscard first and then applying a FW update might shed some light on this question.

Keep your eyes open or follow me on Twitter. I’ve also got an RSS feed.