Saturday, December 1, 2012

Linux & SSD's

I've been meaning to write this up for a while. There is a lot of information available on the web about running Linux on SSD's. I found all of these tips and tweaks in various places, and I'm consolidating them into one post, if for no other reason than the selfish one of having my own personal reference.

Step one, Purchase a good SSD.
This may seem obvious, but it's really not as easy as it sounds. I pored over benchmarks and reviews before purchasing my SSD's, and it really came down to two options: buy either a Samsung drive or an Intel drive. They really seem(ed) to me to have the best balance of reliability, speed, support and staying power.
I settled on Samsung 830's, right before the 840's came out, and I'm really happy with my choice considering that the 830's have MLC memory and the 840's have TLC. If you want to understand more about that, Uncle Google is your friend.

Step two, install an SSD friendly Linux.
Ok, this is pretty easy now: any up-to-date distro will do. But, for example, Debian stable doesn't support TRIM, so if you are a Debian user, you'll either want to install a backported kernel or run Testing. If my understanding is correct, Ubuntu 10.04 and CentOS/Scientific Linux/RHEL have had TRIM support backported into their older kernels.

Step three, tweak your SSD friendly Linux into submission.
By default, even SSD friendly Linux distros aren't all that friendly to SSD's. You have to make them play nice. Below are the steps I followed to do this.

Change your default Scheduler
By default, most Linux distributions use the kernel's default CFQ scheduler. CFQ (which stands for Completely Fair Queuing) is optimized for a variety of workloads: it places synchronous requests submitted by processes into per-process queues and then allocates I/O time slices to each queue. Asynchronous requests get placed together in fewer queues. Although CFQ is great for spinning rusty metal, there are other (better) options for SSD's.

One such choice is the Deadline scheduler. Deadline is the default in Ubuntu starting with 12.10, so it may be worth just leaving it as set if you desire. It's a really good choice for SSD's, and probably the best choice for slightly-lesser-quality SSD's. Deadline works by imposing a deadline on all I/O requests to prevent starvation of requests. It maintains two deadline queues in addition to the sorted read and write queues. The deadline queues are sorted by expiration time; the sorted queues are sorted by sector number. Before serving the next request, the scheduler decides which queue to use. Read queues are given a higher priority than write queues. This is also the preferred scheduler for database systems.

The next choice is the NOOP scheduler. It is the simplest scheduler: a plain FIFO queue. The NOOP scheduler assumes that I/O optimization will occur at another I/O layer, such as a storage controller.

Noop is best used with SSD devices. These devices do not depend on mechanical movement, and don't require re-ordering of multiple I/O requests.

Although Noop is best used with SSD/Flash devices, I've noticed better performance on my spindle drives as well, which primarily host Virtual Hard Drive images for VMware. After doing a little more research on this, I found that Noop is preferred for virtualization, even on mechanical drives.

Now, how does one go about changing the scheduler? Well, the best way I found was modifying the file /etc/default/grub and adding an option to the GRUB_CMDLINE_LINUX_DEFAULT line. This is how mine looks:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash elevator=noop"

After doing this you'll want to run sudo update-grub to update Grub with those changes.
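
To check which scheduler is actually in effect after the reboot, the kernel lists the available schedulers in /sys/block/<device>/queue/scheduler and marks the active one with square brackets. Here is a minimal sketch, assuming the SSD is sda (substitute your own device); the function name active_scheduler is just something I made up for illustration:

```shell
#!/bin/sh
# Print the active I/O scheduler from a line such as "noop deadline [cfq]".
# The kernel marks the scheduler currently in use with square brackets.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Assumption: the SSD is /dev/sda. Adjust the device name for your system.
sched_file=/sys/block/sda/queue/scheduler
if [ -r "$sched_file" ]; then
    active_scheduler "$(cat "$sched_file")"
fi
```

On kernels of this era you can also try a scheduler without rebooting by writing its name to the same file as root, for example echo noop | sudo tee /sys/block/sda/queue/scheduler. That change only lasts until the next boot.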

I've scratched the default scheduler part out as I honestly don't know what to recommend at this point. Currently, on my new box I'm simply using Ubuntu's default of Deadline. I may experiment with NOOP again at some point when I have time to benchmark.

Add 'discard' option to your /etc/fstab to enable TRIM
This allows the operating system to tell an SSD that a block no longer holds valid data and can be wiped internally. Over time, SSD's can slow down when a write operation requires previously written pages to be overwritten. Luckily, it's easy enough to implement. Simply edit your /etc/fstab file and add the word discard to your list of mount options. I usually go ahead and add noatime as well, which disables writing of last-access-time information and reduces writes further. Here is an example of a line in my fstab:

UUID=7d552b7d-3f05-461e-a85f-153100136552 / ext4 discard,noatime,errors=remount-ro 0 1
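
After a remount or reboot it can be worth confirming that the option actually took effect. The following is only a sketch: the helper name has_mount_option is hypothetical, and all it does is look at the fourth field (the mount options) of an fstab-style line, which is also the layout used by /proc/mounts.

```shell
#!/bin/sh
# Succeed if the mount options (4th field) of an fstab-style line
# contain the given option as a whole word, e.g. "discard".
has_mount_option() {
    entry=$1
    opt=$2
    opts=$(printf '%s\n' "$entry" | awk '{print $4}')
    case ",$opts," in
        *",$opt,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# List every currently mounted filesystem that has discard enabled.
if [ -r /proc/mounts ]; then
    while read -r line; do
        if has_mount_option "$line" discard; then
            printf '%s\n' "$line"
        fi
    done < /proc/mounts
fi
```

If nothing is printed, no filesystem is mounted with discard. To see whether the drive itself advertises discard support, lsblk --discard should show non-zero DISC-GRAN and DISC-MAX values for it.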

While you are in your fstab you may want to mount tmp and any other temp directories which do not contain data needed after reboot to RAM.

tmpfs /tmp tmpfs defaults,noatime,mode=1777 0 0
tmpfs /var/spool tmpfs defaults,noatime,mode=1777 0 0
tmpfs /var/tmp tmpfs defaults,noatime,mode=1777 0 0

I've since discovered from Linux kernel developer Theodore Ts'o that it is far better to execute fstrim manually than to use the 'discard' mount flag.

I created the following script:
And created a cron job to execute this script each morning. I also abandoned having tmpfs in RAM as well.
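
The script itself isn't reproduced in this post, so what follows is NOT my original, just a minimal sketch of what a daily trim script could look like. It assumes fstrim from util-linux is installed; the FSTRIM override variable and the function name trim_all are hypothetical additions so the script can be previewed without touching the disk.

```shell
#!/bin/sh
# Hypothetical daily TRIM script (a sketch, not the author's original).
# FSTRIM can be overridden, e.g. FSTRIM=echo, to preview without trimming.
FSTRIM=${FSTRIM:-fstrim}

# Run "fstrim -v" on each mount point given; report failures but keep going.
trim_all() {
    status=0
    for mountpoint in "$@"; do
        if ! "$FSTRIM" -v "$mountpoint"; then
            echo "fstrim failed on $mountpoint" >&2
            status=1
        fi
    done
    return "$status"
}

# Mount points come from the command line; with no arguments nothing is trimmed.
if [ "$#" -gt 0 ]; then
    trim_all "$@"
fi
```

Saved somewhere like /usr/local/sbin/ssd-trim.sh (a made-up path), a line in root's crontab such as 0 6 * * * /usr/local/sbin/ssd-trim.sh / /home would trim the root and home filesystems each morning. Only list mount points that actually live on the SSD.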

Lower vm.swappiness to tune down how aggressively memory pages are swapped to disk
If you have a lot of physical memory, you may rarely swap anyway, but I like to set my vm.swappiness to 10 to further reduce how likely it is that I will swap. Linux moves memory pages that have not been accessed for some time to the swap space even if there is enough free memory available. By changing the value in /proc/sys/vm/swappiness you can control the swapping behavior. A high swappiness value means that the kernel will be more likely to unmap mapped pages. A low swappiness value means the opposite: the kernel will be less likely to unmap mapped pages. In short, the higher the vm.swappiness value, the more the system will swap. vm.swappiness takes a value between 0 and 100 to change the balance between swapping applications and freeing cache. At 100, the kernel will always prefer to find inactive pages and swap them out; in other cases, whether a swapout occurs depends on how much application memory is in use and how poorly the cache is doing at finding and releasing inactive items.

To lower this value, edit the /etc/sysctl.conf file, add the line vm.swappiness = 10 to the bottom, and reboot.
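
If you'd rather script the edit so that running it twice doesn't leave duplicate entries, something along these lines should work. It is only a sketch: the function name set_swappiness is hypothetical, and it takes the config file path as an argument so you can try it on a copy first.

```shell
#!/bin/sh
# Set vm.swappiness in a sysctl-style config file, replacing any existing
# entry so the setting never appears twice.
set_swappiness() {
    conf=$1
    value=$2
    tmp=$(mktemp) || return 1
    # Keep every line except existing vm.swappiness assignments.
    # (grep's error is silenced in case the file does not exist yet.)
    grep -v '^[[:space:]]*vm\.swappiness[[:space:]]*=' "$conf" > "$tmp" 2>/dev/null
    printf 'vm.swappiness = %s\n' "$value" >> "$tmp"
    # Write back through cat so the original file keeps its permissions.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

Run as root, set_swappiness /etc/sysctl.conf 10 followed by sysctl -p loads the new value without a reboot. For a quick experiment that doesn't persist, sysctl vm.swappiness=10 changes it on the running system only.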

Although this is a conglomeration of information already freely available on the web, I hope it helps you by requiring less reading and digesting. These settings are all working well for me, but your mileage may vary depending on your particular hardware or distro. If you feel I've done something dumb anywhere in here, or have better suggestions, please comment. I'd love to hear your opinions and best practices.

1 comment:

  1. Tmpfs

    Files and directories stored in tmpfs are temporary: tmpfs keeps everything in virtual memory (kernel internal caches), and nothing is saved on your hard drive or SSD. Once your system is restarted, everything in tmpfs is gone. Linux systems normally store temporary files in the /tmp directory. To reduce writes on an SSD, you can mount /tmp as tmpfs.

    Edit fstab file
    # nano /etc/fstab

    Add the line to the end of fstab file
    tmpfs /tmp tmpfs defaults,noatime,mode=1777 0 0

    If logs aren’t important for you (laptop or desktop), you can also mount /var/log to tmpfs. Add the line to the end of fstab file
    tmpfs /var/log tmpfs defaults,noatime,mode=0755 0 0

    Preload is a Linux program developed by Behdad Esfahbod. Preload learns which programs you use often, records statistics using Markov chains, and predicts which programs are most likely to be used. It then loads those programs, their binaries, and their dependencies into RAM. With a program already in memory, it takes less time to start.

    To install preload on Ubuntu, Linux Mint or debian based distributions
    # apt-get update && apt-get install preload

    To install preload on Fedora, Centos or Redhat based distributions
    # yum install preload

    Swap and Swappiness

    Swappiness is a Linux kernel setting that lets you control how much swap (virtual memory) is used. Swappiness values range from 0 to 100. The higher the value, the more the kernel will try to use swap space; the lower the value, the less it will use swap. The default swappiness value is 60. If your system has plenty of RAM, you should avoid using swap space, whose reads and writes land on your SSD or hard drive. For systems with 4 GB or more of RAM, I would suggest reducing swap usage by setting swappiness to somewhere between 10 and 0.

    To check the swappiness setting on your system (you should see the default value of 60):
    $ cat /proc/sys/vm/swappiness

    The value is up to you to decide. Here is my suggestion

    2 GB = 30
    4 GB = 10
    6 GB or more = 0

    To change swappiness setting:
    $ su -
    # nano /etc/sysctl.conf

    And add this line into sysctl.conf file.
    vm.swappiness = 10