Each application can use some of your memory. Linux uses all otherwise unoccupied memory (except for the last few MB) as “cache”. This includes the page cache, inode caches, etc. This is a good thing: it helps speed things up heaps. Both writing to disk and reading from disk can be sped up immensely by cache.
Ideally, you have enough memory for all your applications, and you still have several hundred MB left for cache. In this situation, as long as your applications don’t increase their memory use and the system isn’t struggling to get enough space for cache, there is no need for any swap.
Once applications claim more RAM, it simply goes into some of the space that was used by cache, shrinking the cache. De-allocating cache is cheap and easy enough that it is simply done in real time — everything that sits in the cache is either just a second copy of something that’s already on disk, so can just be deallocated instantly, or it’s something that we would have had to flush to disk within the next few seconds anyway.
This is not a situation that is specific to Linux — all modern operating systems work this way. The different operating systems might just report free RAM differently: some include the cache as part of what they consider “free” and some may not.
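When you talk about free RAM, it’s a lot more meaningful to include cache, because it practically is free: it’s available should any application request it. On Linux, older versions of the free command report it both ways: the first line includes cache in the used RAM column, and the second line (labelled “-/+ buffers/cache”) includes cache (and buffers) in the free column. Newer versions of free drop that second line and instead show a separate “buff/cache” column alongside an “available” column.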
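If you want to see the two ways of counting for yourself, here is a minimal sketch that works them out from /proc/meminfo. The function names are my own, and adding Buffers and Cached together is only a rough approximation of what the kernel could actually reclaim (recent kernels also publish a MemAvailable figure, which is a better estimate).

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integers (most are kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def free_both_ways(info):
    """Return (free with cache counted as used, free with cache counted as free)."""
    strictly_free = info["MemFree"]
    # Rough approximation: treat buffers and page cache as reclaimable.
    cache = info.get("Buffers", 0) + info.get("Cached", 0)
    return strictly_free, strictly_free + cache


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    strict, with_cache = free_both_ways(info)
    print(f"free, cache counted as used: {strict // 1024} MB")
    print(f"free, cache counted as free: {with_cache // 1024} MB")
```

On a healthy system the second number is usually much larger than the first, which is exactly why the first one looks so alarming when taken on its own.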
How Linux uses swap (simplified)
Once you have used up enough memory that there is not enough left for a smooth-running cache, Linux may decide to re-allocate some unused application memory from RAM to swap.
It doesn’t do this according to a definite cut-off. It’s not like you reach a certain percentage of allocation then Linux starts swapping. It has a rather “fuzzy” algorithm. It takes a lot of things into account, which can best be described by “how much pressure is there for memory allocation”. If there is a lot of “pressure” to allocate new memory, then it will increase the chances some will be swapped to make more room. If there is less “pressure” then it will decrease these chances.
Your system has a “swappiness” setting which helps you tweak how this “pressure” is calculated. Altering it is normally not recommended, and I would not recommend it either. Swapping is overall a very good thing: although there are a few edge cases where it harms performance, if you look at overall system performance it’s a net benefit for a wide range of tasks. If you reduce the swappiness, you let the amount of cache memory shrink a little bit more than it would otherwise, even when it may really be useful. Whether this is a good enough trade-off for whatever problem you’re having with swapping is up to you. You should just know what you’re doing, that’s all.
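If you only want to look at the current value without changing anything, a small sketch like this should do. It reads /proc/sys/vm/swappiness, which is where Linux exposes the setting; the function name and the choice to return None on failure are my own.

```python
SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current swappiness as an int, or None if it can't be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # File missing (not Linux), unreadable, or not a plain integer.
        return None


if __name__ == "__main__":
    value = read_swappiness()
    if value is None:
        print("could not read swappiness on this system")
    else:
        print(f"vm.swappiness is currently {value}")
```

This only reads the setting. Writing a new value needs root, and as said above, you should know what you’re doing before you go there.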
There is a well-known situation in which swap really harms perceived performance on a desktop system: applications can be slow to respond to user input after they have been left idle for a long time while IO-heavy background processes (such as an overnight backup) were running. This sluggishness is very visible, but it is not enough to justify turning off swap altogether, and it is very hard to prevent in any operating system. Turn off swap and this initial sluggishness after the backup/virus scan may not happen, but the system may run a little bit slower all day long. This is not a situation that’s limited to Linux, either.
When choosing what is to be swapped to disk, the system tries to pick memory that is not actually being used, that is, not being read from or written to. It has a pretty simple algorithm for calculating this that chooses well most of the time.
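To give a feel for the idea of picking memory that isn’t being used, here is a toy sketch that chooses the pages that have gone longest without being touched. This is only an illustration of the general “least recently used” idea; it is not the kernel’s actual code, which is more involved, and the page names and timestamps are made up.

```python
def pick_swap_candidates(last_access, now, count):
    """Pick the `count` pages that have been idle the longest.

    last_access maps a (hypothetical) page name to the time it was last
    read from or written to; `now` is the current time in the same units.
    """
    by_idle_time = sorted(
        last_access,
        key=lambda page: now - last_access[page],
        reverse=True,  # longest-idle pages first
    )
    return by_idle_time[:count]


if __name__ == "__main__":
    pages = {"editor": 100, "browser": 990, "backup_daemon": 400}
    print(pick_swap_candidates(pages, now=1000, count=2))
```

In this made-up example the editor and the backup daemon have been idle the longest, so they would be the first candidates, while the browser you are actively using stays in RAM.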
If you have a system where you have a huge amount of RAM (at time of writing, 8 GB is a huge amount for a typical Linux distro), then you will very rarely ever hit a situation where swap is needed at all. You may even try turning swap off. I never recommend doing that, but only because you never know when the extra memory may save you from some application crashing. But if you know you’re not going to need it, you can do it.
But how can swap speed up my system? Doesn’t swapping slow things down?
The act of transferring data from RAM to swap is a slow operation, but it’s only taken when the kernel is pretty sure the overall benefit will outweigh this. For example, if your application memory has risen to the point that you have almost no cache left and your I/O is very inefficient because of this, you can actually get a lot more speed out of your system by freeing up some memory, even after the initial expense of swapping data in order to free it up.
It’s also a last resort should your applications request more memory than you actually have. In this case, swapping is necessary to prevent an out-of-memory situation which will often result in an application crashing or having to be forcibly killed.
Swapping is associated with times when your system is performing poorly only because it happens when you are running out of usable RAM, which would slow your system down (or make it unstable) even if you didn’t have swap. So to simplify things, swapping happens because your system is becoming bogged down, rather than the other way around.
Once data is in swap, when does it come out again?
Transferring data out of swap is (for traditional hard disks, at least) just as time-consuming as putting it in there. So understandably, your kernel will be just as reluctant to remove data from swap, especially if it’s not actually being used (i.e. read from or written to). If you have data in swap and it’s not being used, then it’s actually a good thing that it remains in swap, since it leaves more memory for other things that are being used, potentially speeding up your system.
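If you are curious which processes currently have data sitting in swap, a rough sketch like the following can list them. It relies on the VmSwap line in /proc/<pid>/status; the function names are my own, and on some systems you may only be able to see your own processes.

```python
import os


def vmswap_kb(status_text):
    """Return the VmSwap value (in kB) from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            fields = line.split()
            if len(fields) >= 2 and fields[1].isdigit():
                return int(fields[1])
    return None  # kernel threads, for example, have no VmSwap line


def swapped_processes(proc_root="/proc"):
    """Return {pid: kB in swap} for every process with a non-zero VmSwap."""
    result = {}
    if not os.path.isdir(proc_root):
        return result
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                kb = vmswap_kb(f.read())
        except OSError:
            continue  # the process exited, or we lack permission
        if kb:
            result[int(entry)] = kb
    return result


if __name__ == "__main__":
    ordered = sorted(swapped_processes().items(), key=lambda item: -item[1])
    for pid, kb in ordered:
        print(f"pid {pid}: {kb} kB in swap")
```

Seeing a long-idle process near the top of that list is not a problem in itself. As described above, it usually just means that memory was put to better use elsewhere.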