Prevent zram LRU inversion with zswap and max_pool_percent = 100
The major disadvantage of using zram is LRU inversion:

older pages get into the higher-priority zram and quickly fill it, while newer pages are swapped in and out of the slower [...] swap

The zswap documentation says that zswap does not suffer from this:

Zswap receives pages for compression through the Frontswap API and is able to evict pages from its own compressed pool on an LRU basis and write them back to the backing swap device in the case that the compressed pool is full.

Could I have all the benefits of zram and a completely compressed RAM by setting max_pool_percent to 100?
Zswap seeks to be simple in its policies. Sysfs attributes allow for one user controlled policy:

* max_pool_percent - The maximum percentage of memory that the compressed pool can occupy.

No default max_pool_percent is specified here, but the Arch Wiki page says that it is 20.
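For reference, the current values can be inspected on a running system with zswap built in; a quick sketch (sysfs paths as documented for the zswap module, root needed only for the write):

```shell
cat /sys/module/zswap/parameters/enabled            # Y or N
cat /sys/module/zswap/parameters/max_pool_percent   # 20 on the kernels cited here

# the setting this question asks about (as root):
echo 100 > /sys/module/zswap/parameters/max_pool_percent
```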
Apart from the performance implications of decompressing, is there any danger or downside in setting max_pool_percent to 100? Would it operate like using an improved swap-backed zram?
Tags: linux, swap, zram, zswap
asked Nov 25 '17 at 6:37 by Tom Hale (edited Nov 25 '17 at 8:07)
1 Answer
To answer your question, I first ran a series of experiments; the final answers are at the end.
Experiments performed:
1) swap file, zswap disabled
2) swap file, zswap enabled, max_pool_percent = 20
3) swap file, zswap enabled, max_pool_percent = 70
4) swap file, zswap enabled, max_pool_percent = 100
5) zram swap, zswap disabled
6) zram swap, zswap enabled, max_pool_percent = 20
7) no swap
8) swap file, zswap enabled, max_pool_percent = 1
9) swap file (300 M), zswap enabled, max_pool_percent = 100
Setup before the experiment:

- VirtualBox 5.1.30
- Fedora 27, Xfce spin
- 512 MB RAM, 16 MB video RAM, 2 CPUs
- Linux kernel 4.13.13-300.fc27.x86_64
- default swappiness value (60)
- created an empty 512 MB swap file (300 MB in experiment 9) for possible use during some of the experiments (using dd), but didn't swapon it yet
- disabled all dnf* systemd services and ran watch "killall -9 dnf" to be more sure that dnf wouldn't try to auto-update during the experiment and throw the results off
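The swap file preparation could have looked like this (the answer only mentions dd; the path and exact flags below are assumptions):

```shell
# Create a 512 MB file of zeros (count=300 for experiment 9), lock down
# permissions, and format it as swap -- but deliberately do not `swapon`
# it yet, matching the setup described above.
dd if=/dev/zero of=/swapfile bs=1M count=512
chmod 600 /swapfile
mkswap /swapfile
```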
State before the experiment:
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 280 72 8 132 153
Swap: 511 0 511
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 74624 8648 127180 0 0 1377 526 275 428 3 2 94 1 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 102430 688 3593850 67603 3351 8000 1373336 17275 0 26
sr0 0 0 0 0 0 0 0 0 0 0
The subsequent swapon operations, etc., leading to the different settings during the experiments, resulted in variances within about 2% of these values.
Experiment operation consisted of:
- Run Firefox for the first time
- Wait about 40 seconds or until network and disk activity ceases (whichever is longer)
- Record the following state after the experiment (firefox left running, except for experiments 7 and 9 where firefox crashed)
State after the experiment:
1) swap file, zswap disabled
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 287 5 63 192 97
Swap: 511 249 262
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 255488 5904 1892 195428 63 237 1729 743 335 492 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 134680 10706 4848594 95687 5127 91447 2084176 26205 0 38
sr0 0 0 0 0 0 0 0 0 0 0
2) swap file, zswap enabled, max_pool_percent = 20
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 330 6 33 148 73
Swap: 511 317 194
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 325376 7436 756 151144 3 110 1793 609 344 477 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 136046 1320 5150874 117469 10024 41988 1749440 53395 0 40
sr0 0 0 0 0 0 0 0 0 0 0
3) swap file, zswap enabled, max_pool_percent = 70
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 342 8 32 134 58
Swap: 511 393 118
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 403208 8116 1088 137180 4 8 3538 474 467 538 3 3 91 3 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 224321 1414 10910442 220138 7535 9571 1461088 42931 0 60
sr0 0 0 0 0 0 0 0 0 0 0
4) swap file, zswap enabled, max_pool_percent = 100
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 345 10 32 129 56
Swap: 511 410 101
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 420712 10916 2316 130520 1 11 3660 492 478 549 3 4 91 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 221920 1214 10922082 169369 8445 9570 1468552 28488 0 56
sr0 0 0 0 0 0 0 0 0 0 0
5) zram swap, zswap disabled
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 333 4 34 147 72
Swap: 499 314 185
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
5 0 324128 7256 1192 149444 153 365 1658 471 326 457 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 130703 884 5047298 112889 4197 9517 1433832 21037 0 37
sr0 0 0 0 0 0 0 0 0 0 0
zram0 58673 0 469384 271 138745 0 1109960 927 0 1
6) zram swap, zswap enabled, max_pool_percent = 20
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 338 5 32 141 65
Swap: 499 355 144
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 364984 7584 904 143572 33 166 2052 437 354 457 3 3 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 166168 998 6751610 120911 4383 9543 1436080 18916 0 42
sr0 0 0 0 0 0 0 0 0 0 0
zram0 13819 0 110552 78 68164 0 545312 398 0 0
7) no swap
Note that firefox was not running in this experiment at the time these stats were recorded.
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 289 68 8 127 143
Swap: 0 0 0
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 70108 10660 119976 0 0 13503 286 607 618 2 5 88 5 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 748978 3511 66775042 595064 4263 9334 1413728 23421 0 164
sr0 0 0 0 0 0 0 0 0 0 0
8) swap file, zswap enabled, max_pool_percent = 1
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 292 7 63 186 90
Swap: 511 249 262
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 255488 7088 2156 188688 43 182 1417 606 298 432 3 2 94 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 132222 9573 4796802 114450 10171 77607 2050032 137961 0 41
sr0 0 0 0 0 0 0 0 0 0 0
9) swap file (300 M), zswap enabled, max_pool_percent = 100
Firefox was stuck and the system still read from disk furiously.
The baseline for this experiment is different, since a new swap file had been written:
total used free shared buff/cache available
Mem: 485 280 8 8 196 153
Swap: 299 0 299
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 8948 3400 198064 0 0 1186 653 249 388 2 2 95 1 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 103099 688 3610794 68253 3837 8084 1988936 20306 0 27
sr0 0 0 0 0 0 0 0 0 0 0
Specifically, an extra 649384 sectors were written as a result of this change.
State after the experiment:
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 335 32 47 118 53
Swap: 299 277 22
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
7 1 283540 22912 2712 129132 0 0 83166 414 2387 1951 2 23 62 13 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 3416602 26605 406297938 4710584 4670 9025 2022272 33805 0 521
sr0 0 0 0 0 0 0 0 0 0 0
Subtracting the extra 649384 written sectors from 2022272 results in 1372888. This is less than 1433000 (see below), probably because firefox did not load fully.
I also ran a few experiments with low swappiness values (10 and 1); they all got stuck in a frozen state with excessive disk reads, preventing me from recording the final memory stats.
Observations:

- Subjectively, high max_pool_percent values resulted in sluggishness.
- Subjectively, the system in experiment 9 was so slow as to be unusable.
- High max_pool_percent values result in the fewest writes, whereas very low values of max_pool_percent result in the most writes.
- Experiments 5 and 6 (zram swap) suggest that firefox wrote data that resulted in about 62000 sectors written to disk. Anything above about 1433000 total written sectors is due to swapping. See the following table.
- If we assume the lowest number of read sectors among the experiments to be the baseline, we can compare the experiments by how many extra read sectors due to swapping they caused.
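The per-experiment figures below come from plain subtraction against the baseline vmstat -d output. A sketch, using the actual before/after sda lines from experiment 1 (field 8 of a vmstat -d disk line is total sectors written):

```shell
# before = baseline state, after = state following experiment 1
before="sda 102430 688 3593850 67603 3351 8000 1373336 17275 0 26"
after="sda 134680 10706 4848594 95687 5127 91447 2084176 26205 0 38"

# difference in the "sectors written" column (field 8)
delta=$(( $(echo "$after" | awk '{print $8}') - $(echo "$before" | awk '{print $8}') ))
echo "sectors written during experiment: $delta"   # 710840
```

Subtracting firefox's own ~62000 sectors from 710840 gives roughly the 650000 listed for experiment 1 below.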
Written sectors as a direct consequence of swapping (approx.):
650000 1) swap file, zswap disabled
320000 2) swap file, zswap enabled, max_pool_percent = 20
30000 3) swap file, zswap enabled, max_pool_percent = 70
40000 4) swap file, zswap enabled, max_pool_percent = 100
0 5) zram swap, zswap disabled
0 6) zram swap, zswap enabled, max_pool_percent = 20
-20000 7) no swap (firefox crashed)
620000 8) swap file, zswap enabled, max_pool_percent = 1
-60000 9) swap file (300 M), zswap enabled, max_pool_percent = 100 (firefox crashed)
Extra read sectors as a direct consequence of swapping (approx.):
51792 1) swap file, zswap disabled
354072 2) swap file, zswap enabled, max_pool_percent = 20
6113640 3) swap file, zswap enabled, max_pool_percent = 70
6125280 4) swap file, zswap enabled, max_pool_percent = 100
250496 5) zram swap, zswap disabled
1954808 6) zram swap, zswap enabled, max_pool_percent = 20
61978240 7) no swap
0 (baseline) 8) swap file, zswap enabled, max_pool_percent = 1
401501136 9) swap file (300 M), zswap enabled, max_pool_percent = 100
Interpretation of results:

- This is subjective and also specific to the use case at hand; behavior will vary in other use cases.
- Zswap's page pool takes away space in RAM that could otherwise be used by the system's page cache (for file-backed pages), which means that the system repeatedly throws away file-backed pages and reads them again when needed, resulting in excessive reads.
- The high number of reads in experiment 7 is caused by the same problem: the system's anonymous pages took most of the RAM, and file-backed pages had to be repeatedly read from disk.
- It might be possible under certain circumstances to reduce the amount of data written to the swap disk to near zero using zswap, but zswap is evidently not suited for this task.
- It is not possible to have "completely compressed RAM", as the system needs a certain amount of non-swapped pages to reside in RAM for operation.
Personal opinions and anecdotes:

- The main improvement of zswap in terms of disk writes is not the fact that it compresses the pages, but the fact that it has its own buffering & caching system that reduces the page cache and effectively keeps more anonymous pages (in compressed form) in RAM. (However, based on my subjective experience of using Linux daily, a system with swap and zswap with the default values of swappiness and max_pool_percent always behaves better than any swappiness value with no zswap, or than zswap with high values of max_pool_percent.)
- Low swappiness values seem to make the system behave better until the amount of page cache left is so small as to render the system unusable due to excessive disk reads. The same goes for too high a max_pool_percent.
- Either use solely zram swap and limit the amount of anonymous pages you need to hold in memory, or use disk-backed swap with zswap with approximately default values for swappiness and max_pool_percent.
EDIT:

Possible future work to answer the finer points of your question would be to find out, for your particular use case, how the zsmalloc allocator used in zram compares compression-wise with the zbud allocator used in zswap. I'm not going to do that, though; I'm just pointing out things to search for in docs/on the internet.
EDIT 2:

echo "zsmalloc" > /sys/module/zswap/parameters/zpool switches zswap's allocator from zbud to zsmalloc. Continuing with my test fixture for the above experiments and comparing zram with zswap + zsmalloc, it seems that as long as the swap memory needed is the same as either the zram swap size or zswap's max_pool_percent, the amount of reads and writes to disk is very similar between the two. In my personal opinion, based on these facts: as long as the amount of zram swap I need is smaller than the amount of zram swap I can afford to actually keep in RAM, it is best to use solely zram; and once I need more swap than I can actually keep in memory, it is best either to change my workload to avoid it, or to disable zram swap and use zswap with zsmalloc, setting max_pool_percent to the equivalent of what zram previously took in memory (size of zram times its compression ratio, i.e. zram's actual RAM footprint). I currently don't have the time to do a proper writeup of these additional tests, though.
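That closing sizing rule could be sketched as below. The numbers are illustrative assumptions, and the compression ratio is expressed here as original/compressed (greater than 1), so the footprint works out to size divided by ratio:

```shell
total_ram_mb=485        # e.g. from `free -m`
zram_disksize_mb=500    # size of the zram swap device
compression_ratio=3     # e.g. orig_data_size / compr_data_size from /sys/block/zram0/mm_stat

# RAM the full zram device would actually occupy, and the matching percentage
zram_ram_mb=$(( zram_disksize_mb / compression_ratio ))
pool_percent=$(( zram_ram_mb * 100 / total_ram_mb ))
echo "equivalent max_pool_percent: $pool_percent"   # 34
```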
Welcome and thanks! P.S. you may want to look at zpool=z3fold as it allows 3 pages per compressed page, rather than 2. – Tom Hale Dec 2 '17 at 8:28
I did try z3fold and it slowed the computer tremendously while keeping the CPU load high, compared to zsmalloc. Maybe because I didn't try it on the latest kernel, which includes some crucial performance improvements to z3fold - elinux.org/images/d/d3/Z3fold.pdf, slide 28. – Jake F Dec 2 '17 at 13:31
Ooh, perhaps it's time to change to 4.14 LTS... :) Your article says that it's not the best-performing zpool. – Tom Hale Dec 2 '17 at 13:55
It's also possible to mix zram with disk-based swap without suffering LRU inversion. To achieve this, create 30 zram volumes at priority 5 and disk swap at priority 0. Write a cron job which runs swapoff / swapon for one of the volumes every 2 minutes. The zram will keep getting paged back into memory and the disk won't so LRU will eventually end up on the disk when they overflow the zram.
– Wil Feb 5 at 17:40
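Wil's rotation trick might be sketched like this (hypothetical: the device count, priorities, and state-file path are assumptions, and the zram devices must already exist and be formatted with mkswap):

```shell
#!/bin/sh
# Run from cron every 2 minutes: empty one zram volume per run so hot
# pages migrate back into the high-priority zram, while cold pages
# eventually overflow to the priority-0 disk swap.
STATE=/var/run/zram-rotate.idx
i=$(cat "$STATE" 2>/dev/null || echo 0)

swapoff "/dev/zram$i"        # push its contents back into RAM (or disk swap)
swapon -p 5 "/dev/zram$i"    # re-enable it, now empty, at high priority

echo $(( (i + 1) % 30 )) > "$STATE"   # advance to the next of the 30 volumes
```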
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "106"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f406925%2fprevent-zram-lru-inversion-with-zswap-and-max-pool-percent-100%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
To answer your question, I first ran a series of experiments. The final answers are in bold at the end.
Experiments performed:
1) swap file, zswap disabled
2) swap file, zswap enabled, max_pool_percent = 20
3) swap file, zswap enabled, max_pool_percent = 70
4) swap file, zswap enabled, max_pool_percent = 100
5) zram swap, zswap disabled
6) zram swap, zswap enabled, max_pool_percent = 20
7) no swap
8) swap file, zswap enabled, max_pool_percent = 1
9) swap file (300 M), zswap enabled, max_pool_percent = 100
Setup before the experiment:
- VirtualBox 5.1.30
- Fedora 27, xfce spin
- 512 MB RAM, 16 MB video RAM, 2 CPUs
- linux kernel 4.13.13-300.fc27.x86_64
- default
swappiness
value (60) - created an empty 512 MB swap file (300 MB in experiment 9) for possible use during some of the experiments (using
dd
) but didn'tswapon
yet - disabled all dnf* systemd services, ran
watch "killall -9 dnf"
to be more sure that dnf won't try to auto-update during the experiment or something and throw the results off too far
State before the experiment:
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 280 72 8 132 153
Swap: 511 0 511
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 74624 8648 127180 0 0 1377 526 275 428 3 2 94 1 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 102430 688 3593850 67603 3351 8000 1373336 17275 0 26
sr0 0 0 0 0 0 0 0 0 0 0
The subsequent swapon operations, etc., leading to the different settings during the experiments, resulted in variances of within about 2% of these values.
Experiment operation consisted of:
- Run Firefox for the first time
- Wait about 40 seconds or until network and disk activity ceases (whichever is longer)
- Record the following state after the experiment (firefox left running, except for experiments 7 and 9 where firefox crashed)
State after the experiment:
1) swap file, zswap disabled
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 287 5 63 192 97
Swap: 511 249 262
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 255488 5904 1892 195428 63 237 1729 743 335 492 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 134680 10706 4848594 95687 5127 91447 2084176 26205 0 38
sr0 0 0 0 0 0 0 0 0 0 0
2) swap file, zswap enabled, max_pool_percent = 20
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 330 6 33 148 73
Swap: 511 317 194
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 325376 7436 756 151144 3 110 1793 609 344 477 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 136046 1320 5150874 117469 10024 41988 1749440 53395 0 40
sr0 0 0 0 0 0 0 0 0 0 0
3) swap file, zswap enabled, max_pool_percent = 70
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 342 8 32 134 58
Swap: 511 393 118
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 403208 8116 1088 137180 4 8 3538 474 467 538 3 3 91 3 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 224321 1414 10910442 220138 7535 9571 1461088 42931 0 60
sr0 0 0 0 0 0 0 0 0 0 0
4) swap file, zswap enabled, max_pool_percent = 100
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 345 10 32 129 56
Swap: 511 410 101
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 420712 10916 2316 130520 1 11 3660 492 478 549 3 4 91 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 221920 1214 10922082 169369 8445 9570 1468552 28488 0 56
sr0 0 0 0 0 0 0 0 0 0 0
5) zram swap, zswap disabled
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 333 4 34 147 72
Swap: 499 314 185
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
5 0 324128 7256 1192 149444 153 365 1658 471 326 457 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 130703 884 5047298 112889 4197 9517 1433832 21037 0 37
sr0 0 0 0 0 0 0 0 0 0 0
zram0 58673 0 469384 271 138745 0 1109960 927 0 1
6) zram swap, zswap enabled, max_pool_percent = 20
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 338 5 32 141 65
Swap: 499 355 144
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 364984 7584 904 143572 33 166 2052 437 354 457 3 3 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 166168 998 6751610 120911 4383 9543 1436080 18916 0 42
sr0 0 0 0 0 0 0 0 0 0 0
zram0 13819 0 110552 78 68164 0 545312 398 0 0
7) no swap
Note that firefox is not running in this experiment at the time of recording these stats.
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 289 68 8 127 143
Swap: 0 0 0
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 70108 10660 119976 0 0 13503 286 607 618 2 5 88 5 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 748978 3511 66775042 595064 4263 9334 1413728 23421 0 164
sr0 0 0 0 0 0 0 0 0 0 0
8) swap file, zswap enabled, max_pool_percent = 1
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 292 7 63 186 90
Swap: 511 249 262
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 255488 7088 2156 188688 43 182 1417 606 298 432 3 2 94 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 132222 9573 4796802 114450 10171 77607 2050032 137961 0 41
sr0 0 0 0 0 0 0 0 0 0 0
9) swap file (300 M), zswap enabled, max_pool_percent = 100
Firefox was stuck and the system still read from disk furiously.
The baseline for this experiment is a different since a new swap file has been written:
total used free shared buff/cache available
Mem: 485 280 8 8 196 153
Swap: 299 0 299
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 8948 3400 198064 0 0 1186 653 249 388 2 2 95 1 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 103099 688 3610794 68253 3837 8084 1988936 20306 0 27
sr0 0 0 0 0 0 0 0 0 0 0
Specifically, extra 649384 sectors have been written as a result of this change.
State after the experiment:
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 335 32 47 118 53
Swap: 299 277 22
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
7 1 283540 22912 2712 129132 0 0 83166 414 2387 1951 2 23 62 13 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 3416602 26605 406297938 4710584 4670 9025 2022272 33805 0 521
sr0 0 0 0 0 0 0 0 0 0 0
Subtracting the extra 649384 written sectors from 2022272 results in 1372888. This is less than 1433000 (see later) which is probably because of firefox not loading fully.
I also ran a few experiments with low swappiness
values (10 and 1) and they all got stuck in a frozen state with excessive disk reads, preventing me from recording the final memory stats.
Observations:
- Subjectively, high
max_pool_percent
values resulted in sluggishness. - Subjectively, the system in experiment 9 was so slow as to be unusable.
- High
max_pool_percent
values result in the least amount of writes whereas very low value ofmax_pool_percent
results in the most number of writes. - Experiments 5 and 6 (zram swap) suggest that firefox wrote data that resulted in about 62000 sectors written to disk. Anything above about 1433000 are sectors written due to swapping. See the following table.
- If we assume the lowest number of read sectors among the experiments to be the baseline, we can compare the experiments based on how much extra read sectors due to swapping they caused.
Written sectors as a direct consequence of swapping (approx.):
650000 1) swap file, zswap disabled
320000 2) swap file, zswap enabled, max_pool_percent = 20
30000 3) swap file, zswap enabled, max_pool_percent = 70
40000 4) swap file, zswap enabled, max_pool_percent = 100
0 5) zram swap, zswap disabled
0 6) zram swap, zswap enabled, max_pool_percent = 20
-20000 7) no swap (firefox crashed)
620000 8) swap file, zswap enabled, max_pool_percent = 1
-60000 9) swap file (300 M), zswap enabled, max_pool_percent = 100 (firefox crashed)
Extra read sectors as a direct consequence of swapping (approx.):
51792 1) swap file, zswap disabled
354072 2) swap file, zswap enabled, max_pool_percent = 20
6113640 3) swap file, zswap enabled, max_pool_percent = 70
6125280 4) swap file, zswap enabled, max_pool_percent = 100
250496 5) zram swap, zswap disabled
1954808 6) zram swap, zswap enabled, max_pool_percent = 20
61978240 7) no swap
0 (baseline) 8) swap file, zswap enabled, max_pool_percent = 1
401501136 9) swap file (300 M), zswap enabled, max_pool_percent = 100
Interpretation of results:
- This is subjective and also specific to the use case at hand; behavior will vary in other use cases.
- Zswap's page pool takes away RAM that could otherwise be used by the system's page cache (for file-backed pages), which means the system repeatedly throws away file-backed pages and reads them back in when needed, resulting in excessive reads.
- The high number of reads in experiment 7 is caused by the same problem: the system's anonymous pages took most of the RAM, and file-backed pages had to be repeatedly read from disk.
- Under certain circumstances it might be possible to reduce the amount of data written to the swap disk to near zero using zswap, but it is evidently not suited for this task.
- It is not possible to have "completely compressed RAM", as the system needs a certain number of non-swapped pages resident in RAM to operate.
Personal opinions and anecdotes:
- The main improvement from zswap in terms of disk writes is not that it compresses pages, but that it has its own buffering & caching system that shrinks the page cache and effectively keeps more anonymous pages (in compressed form) in RAM. (However, based on my subjective experience from daily Linux use, a system with swap and zswap at the default swappiness and max_pool_percent values always behaves better than any swappiness value with no zswap, or than zswap with high max_pool_percent values.)
- Low swappiness values seem to make the system behave better until the remaining page cache is so small that the system becomes unusable due to excessive disk reads. The same happens with too high a max_pool_percent.
- Either use solely zram swap and limit the amount of anonymous pages you need to hold in memory, or use disk-backed swap with zswap at approximately the default values for swappiness and max_pool_percent.
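For reference, a minimal sketch of those two setups (the 512M size and the /swapfile path are illustrative; these commands need root):

```shell
# Option A: zram-only swap, all of it held compressed in RAM
modprobe zram
echo 512M > /sys/block/zram0/disksize
mkswap /dev/zram0
swapon -p 5 /dev/zram0

# Option B: disk-backed swap with zswap at (near-)default settings
# (swappiness=60 and max_pool_percent=20 are the defaults)
echo Y > /sys/module/zswap/parameters/enabled
swapon /swapfile
```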
EDIT:
Possible future work to answer the finer points of your question would be to find out, for your particular use case, how the zsmalloc allocator used in zram compares compression-wise with the zbud allocator used in zswap. I'm not going to do that, though; I'm just pointing out things to search for in the docs and on the internet.
EDIT 2:
echo "zsmalloc" > /sys/module/zswap/parameters/zpool switches zswap's allocator from zbud to zsmalloc. Continuing with my test fixture from the above experiments and comparing zram with zswap+zsmalloc, it seems that as long as the swap memory needed is the same as either the zram swap size or zswap's max_pool_percent, the amount of reads and writes to disk is very similar between the two. In my personal opinion based on these facts: as long as the amount of zram swap I need is smaller than the amount of zram swap I can afford to actually keep in RAM, it is best to use solely zram; once I need more swap than I can actually keep in memory, it is best either to change my workload to avoid that, or to disable zram swap and use zswap with zsmalloc, setting max_pool_percent to the equivalent of what zram previously occupied in memory (size of zram * compression ratio). I currently don't have the time to do a proper writeup of these additional tests, though.
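As a rough sketch of that sizing rule (the 400 MB zram size and 3:1 compression ratio below are hypothetical, not measured):

```shell
# Hypothetical example: suppose zram provided 400 MB of swap at a ~3:1
# compression ratio on this 485 MB machine (both figures made up).
zram_swap_mb=400
compression_ratio=3
ram_mb=485
pool_mb=$(( zram_swap_mb / compression_ratio ))   # RAM the zram data occupied
echo $(( pool_mb * 100 / ram_mb ))                # 27 -> max_pool_percent=27
```

The resulting percentage would then be written to /sys/module/zswap/parameters/max_pool_percent.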
Welcome and thanks! P.S. you may want to look at zpool=z3fold as it allows 3 pages per compressed page, rather than 2.
– Tom Hale
Dec 2 '17 at 8:28
I did try z3fold and it slowed the computer tremendously while keeping the CPU load high, compared to zsmalloc. Maybe that's because I didn't try it on the latest kernel, which includes some crucial performance improvements to z3fold - elinux.org/images/d/d3/Z3fold.pdf slide 28.
– Jake F
Dec 2 '17 at 13:31
Ooh, perhaps it's time to change to 4.14 LTS... :) Your article says that it's not the best-performing zpool.
– Tom Hale
Dec 2 '17 at 13:55
It's also possible to mix zram with disk-based swap without suffering LRU inversion. To achieve this, create 30 zram volumes at priority 5 and disk swap at priority 0. Write a cron job that runs swapoff/swapon for one of the volumes every 2 minutes. The zram contents keep getting paged back into memory while the disk swap doesn't, so the least-recently-used pages eventually end up on disk when they overflow the zram.
– Wil
Feb 5 at 17:40
To answer your question, I first ran a series of experiments. The final answers are in bold at the end.
Experiments performed:
1) swap file, zswap disabled
2) swap file, zswap enabled, max_pool_percent = 20
3) swap file, zswap enabled, max_pool_percent = 70
4) swap file, zswap enabled, max_pool_percent = 100
5) zram swap, zswap disabled
6) zram swap, zswap enabled, max_pool_percent = 20
7) no swap
8) swap file, zswap enabled, max_pool_percent = 1
9) swap file (300 M), zswap enabled, max_pool_percent = 100
Setup before the experiment:
- VirtualBox 5.1.30
- Fedora 27, xfce spin
- 512 MB RAM, 16 MB video RAM, 2 CPUs
- linux kernel 4.13.13-300.fc27.x86_64
- default
swappiness
value (60) - created an empty 512 MB swap file (300 MB in experiment 9) for possible use during some of the experiments (using
dd
) but didn'tswapon
yet - disabled all dnf* systemd services, ran
watch "killall -9 dnf"
to be more sure that dnf won't try to auto-update during the experiment or something and throw the results off too far
State before the experiment:
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 280 72 8 132 153
Swap: 511 0 511
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 74624 8648 127180 0 0 1377 526 275 428 3 2 94 1 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 102430 688 3593850 67603 3351 8000 1373336 17275 0 26
sr0 0 0 0 0 0 0 0 0 0 0
The subsequent swapon operations, etc., leading to the different settings during the experiments, resulted in variances of within about 2% of these values.
Experiment operation consisted of:
- Run Firefox for the first time
- Wait about 40 seconds or until network and disk activity ceases (whichever is longer)
- Record the following state after the experiment (firefox left running, except for experiments 7 and 9 where firefox crashed)
State after the experiment:
1) swap file, zswap disabled
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 287 5 63 192 97
Swap: 511 249 262
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 255488 5904 1892 195428 63 237 1729 743 335 492 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 134680 10706 4848594 95687 5127 91447 2084176 26205 0 38
sr0 0 0 0 0 0 0 0 0 0 0
2) swap file, zswap enabled, max_pool_percent = 20
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 330 6 33 148 73
Swap: 511 317 194
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 325376 7436 756 151144 3 110 1793 609 344 477 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 136046 1320 5150874 117469 10024 41988 1749440 53395 0 40
sr0 0 0 0 0 0 0 0 0 0 0
3) swap file, zswap enabled, max_pool_percent = 70
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 342 8 32 134 58
Swap: 511 393 118
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 403208 8116 1088 137180 4 8 3538 474 467 538 3 3 91 3 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 224321 1414 10910442 220138 7535 9571 1461088 42931 0 60
sr0 0 0 0 0 0 0 0 0 0 0
4) swap file, zswap enabled, max_pool_percent = 100
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 345 10 32 129 56
Swap: 511 410 101
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 420712 10916 2316 130520 1 11 3660 492 478 549 3 4 91 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 221920 1214 10922082 169369 8445 9570 1468552 28488 0 56
sr0 0 0 0 0 0 0 0 0 0 0
5) zram swap, zswap disabled
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 333 4 34 147 72
Swap: 499 314 185
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
5 0 324128 7256 1192 149444 153 365 1658 471 326 457 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 130703 884 5047298 112889 4197 9517 1433832 21037 0 37
sr0 0 0 0 0 0 0 0 0 0 0
zram0 58673 0 469384 271 138745 0 1109960 927 0 1
6) zram swap, zswap enabled, max_pool_percent = 20
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 338 5 32 141 65
Swap: 499 355 144
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 364984 7584 904 143572 33 166 2052 437 354 457 3 3 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 166168 998 6751610 120911 4383 9543 1436080 18916 0 42
sr0 0 0 0 0 0 0 0 0 0 0
zram0 13819 0 110552 78 68164 0 545312 398 0 0
7) no swap
Note that firefox is not running in this experiment at the time of recording these stats.
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 289 68 8 127 143
Swap: 0 0 0
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 70108 10660 119976 0 0 13503 286 607 618 2 5 88 5 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 748978 3511 66775042 595064 4263 9334 1413728 23421 0 164
sr0 0 0 0 0 0 0 0 0 0 0
8) swap file, zswap enabled, max_pool_percent = 1
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 292 7 63 186 90
Swap: 511 249 262
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 255488 7088 2156 188688 43 182 1417 606 298 432 3 2 94 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 132222 9573 4796802 114450 10171 77607 2050032 137961 0 41
sr0 0 0 0 0 0 0 0 0 0 0
9) swap file (300 M), zswap enabled, max_pool_percent = 100
Firefox was stuck and the system still read from disk furiously.
The baseline for this experiment is a different since a new swap file has been written:
total used free shared buff/cache available
Mem: 485 280 8 8 196 153
Swap: 299 0 299
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 8948 3400 198064 0 0 1186 653 249 388 2 2 95 1 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 103099 688 3610794 68253 3837 8084 1988936 20306 0 27
sr0 0 0 0 0 0 0 0 0 0 0
Specifically, extra 649384 sectors have been written as a result of this change.
State after the experiment:
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 335 32 47 118 53
Swap: 299 277 22
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
7 1 283540 22912 2712 129132 0 0 83166 414 2387 1951 2 23 62 13 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 3416602 26605 406297938 4710584 4670 9025 2022272 33805 0 521
sr0 0 0 0 0 0 0 0 0 0 0
Subtracting the extra 649384 written sectors from 2022272 results in 1372888. This is less than 1433000 (see later) which is probably because of firefox not loading fully.
I also ran a few experiments with low swappiness
values (10 and 1) and they all got stuck in a frozen state with excessive disk reads, preventing me from recording the final memory stats.
Observations:
- Subjectively, high
max_pool_percent
values resulted in sluggishness. - Subjectively, the system in experiment 9 was so slow as to be unusable.
- High
max_pool_percent
values result in the least amount of writes whereas very low value ofmax_pool_percent
results in the most number of writes. - Experiments 5 and 6 (zram swap) suggest that firefox wrote data that resulted in about 62000 sectors written to disk. Anything above about 1433000 are sectors written due to swapping. See the following table.
- If we assume the lowest number of read sectors among the experiments to be the baseline, we can compare the experiments based on how much extra read sectors due to swapping they caused.
Written sectors as a direct consequence of swapping (approx.):
650000 1) swap file, zswap disabled
320000 2) swap file, zswap enabled, max_pool_percent = 20
30000 3) swap file, zswap enabled, max_pool_percent = 70
40000 4) swap file, zswap enabled, max_pool_percent = 100
0 5) zram swap, zswap disabled
0 6) zram swap, zswap enabled, max_pool_percent = 20
-20000 7) no swap (firefox crashed)
620000 8) swap file, zswap enabled, max_pool_percent = 1
-60000 9) swap file (300 M), zswap enabled, max_pool_percent = 100 (firefox crashed)
Extra read sectors as a direct consequence of swapping (approx.):
51792 1) swap file, zswap disabled
354072 2) swap file, zswap enabled, max_pool_percent = 20
6113640 3) swap file, zswap enabled, max_pool_percent = 70
6125280 4) swap file, zswap enabled, max_pool_percent = 100
250496 5) zram swap, zswap disabled
1954808 6) zram swap, zswap enabled, max_pool_percent = 20
61978240 7) no swap
0 (baseline) 8) swap file, zswap enabled, max_pool_percent = 1
401501136 9) swap file (300 M), zswap enabled, max_pool_percent = 100
Interpretation of results:
- This is subjective and also specific to the usecase at hand; behavior will vary in other usecases.
- Zswap's page pool takes away space in RAM that can otherwise be used by system's page cache (for file-backed pages), which means that the system repeatedly throws away file-backed pages and reads them again when needed, resulting in excessive reads.
- The high number of reads in experiment 7 is caused by the same problem - the system's anonymous pages took most of the RAM and file-backed pages had to be repeatedly read from disk.
- It might be possible under certain circumstances to minimize the amount of data written to swap disk near zero using
zswap
but it is evidently not suited for this task. - It is not possible to have "completely compressed RAM" as the system needs a certain amount of non-swap pages to reside in RAM for operation.
Personal opinions and anecdotes:
- The main improvement of zswap in terms of disk writes is not the fact that it compresses the pages but the fact it has its own buffering & caching system that reduces the page cache and effectively keeps more anonymous pages (in compressed form) in RAM. (However, based on my subjective experience as I use Linux daily, a system with swap and
zswap
with the default values ofswappiness
andmax_pool_percent
always behaves better than anyswappiness
value and nozswap
orzswap
with high values ofmax_pool_percent
.) - Low
swappiness
values seem to make the system behave better until the amount of page cache left is so small as to render the system unusable due to excessive disk reads. Similar with too highmax_pool_percent
. - Either use solely
zram
swap and limit the amount of anonymous pages you need to hold in memory, or use disk-backed swap withzswap
with approximately default values forswappiness
andmax_pool_percent
.
EDIT:
Possible future work to answer the finer points of your question would be to find out for your particular usecase how the the zsmalloc
allocator used in zram
compares compression-wise with the zbud
allocator used in zswap
. I'm not going to do that, though, just pointing out things to search for in docs/on the internet.
EDIT 2:
echo "zsmalloc" > /sys/module/zswap/parameters/zpool
switches zswap's allocator from zbud
to zsmalloc
. Continuing with my test fixture for the above experiments and comparing zram
with zswap
+zsmalloc
, it seems that as long as the swap memory needed is the same as either a zram
swap or as zswap
's max_pool_percent
, the amount of reads and writes to disk is very similar between the two. In my personal opinion based on the facts, as long as the amount of zram
swap I need is smaller than the amount of zram
swap I can afford to actually keep in RAM, then it is best to use solely zram
; and once I need more swap than I can actually keep in memory, it is best to either change my workload to avoid it or to disable zram
swap and use zswap
with zsmalloc
and set max_pool_percent
to the equivalent of what zram previously took in memory (size of zram
* compression ratio). I currently don't have the time to do a proper writeup of these additional tests, though.
Welcome and thanks! P.S. you may want to look atzpool=z3fold
as it allows 3 pages per compressed page, rather than 2.
– Tom Hale
Dec 2 '17 at 8:28
I did tryz3fold
and it slowed the computer tremendously while keeping the CPU load high, as compared tozsmalloc
. Maybe because I didn't try it on the latest kernel that includes some crucial performance improvements toz3fold
- elinux.org/images/d/d3/Z3fold.pdf slide 28.
– Jake F
Dec 2 '17 at 13:31
Ooh, perhaps its time to change to 4.14LTS... :) Your article says that it's not the best performing zpool.
– Tom Hale
Dec 2 '17 at 13:55
It's also possible to mix zram with disk-based swap without suffering LRU inversion. To achieve this, create 30 zram volumes at priority 5 and disk swap at priority 0. Write a cron job which runs swapoff / swapon for one of the volumes every 2 minutes. The zram will keep getting paged back into memory and the disk won't so LRU will eventually end up on the disk when they overflow the zram.
– Wil
Feb 5 at 17:40
add a comment |
To answer your question, I first ran a series of experiments. The final answers are in bold at the end.
Experiments performed:
1) swap file, zswap disabled
2) swap file, zswap enabled, max_pool_percent = 20
3) swap file, zswap enabled, max_pool_percent = 70
4) swap file, zswap enabled, max_pool_percent = 100
5) zram swap, zswap disabled
6) zram swap, zswap enabled, max_pool_percent = 20
7) no swap
8) swap file, zswap enabled, max_pool_percent = 1
9) swap file (300 M), zswap enabled, max_pool_percent = 100
Setup before the experiment:
- VirtualBox 5.1.30
- Fedora 27, xfce spin
- 512 MB RAM, 16 MB video RAM, 2 CPUs
- linux kernel 4.13.13-300.fc27.x86_64
- default
swappiness
value (60) - created an empty 512 MB swap file (300 MB in experiment 9) for possible use during some of the experiments (using
dd
) but didn'tswapon
yet - disabled all dnf* systemd services, ran
watch "killall -9 dnf"
to be more sure that dnf won't try to auto-update during the experiment or something and throw the results off too far
State before the experiment:
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 280 72 8 132 153
Swap: 511 0 511
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 74624 8648 127180 0 0 1377 526 275 428 3 2 94 1 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 102430 688 3593850 67603 3351 8000 1373336 17275 0 26
sr0 0 0 0 0 0 0 0 0 0 0
The subsequent swapon operations, etc., leading to the different settings during the experiments, resulted in variances of within about 2% of these values.
Experiment operation consisted of:
- Run Firefox for the first time
- Wait about 40 seconds or until network and disk activity ceases (whichever is longer)
- Record the following state after the experiment (firefox left running, except for experiments 7 and 9 where firefox crashed)
State after the experiment:
1) swap file, zswap disabled
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 287 5 63 192 97
Swap: 511 249 262
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 255488 5904 1892 195428 63 237 1729 743 335 492 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 134680 10706 4848594 95687 5127 91447 2084176 26205 0 38
sr0 0 0 0 0 0 0 0 0 0 0
2) swap file, zswap enabled, max_pool_percent = 20
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 330 6 33 148 73
Swap: 511 317 194
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 325376 7436 756 151144 3 110 1793 609 344 477 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 136046 1320 5150874 117469 10024 41988 1749440 53395 0 40
sr0 0 0 0 0 0 0 0 0 0 0
3) swap file, zswap enabled, max_pool_percent = 70
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 342 8 32 134 58
Swap: 511 393 118
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 403208 8116 1088 137180 4 8 3538 474 467 538 3 3 91 3 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 224321 1414 10910442 220138 7535 9571 1461088 42931 0 60
sr0 0 0 0 0 0 0 0 0 0 0
4) swap file, zswap enabled, max_pool_percent = 100
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 345 10 32 129 56
Swap: 511 410 101
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 420712 10916 2316 130520 1 11 3660 492 478 549 3 4 91 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 221920 1214 10922082 169369 8445 9570 1468552 28488 0 56
sr0 0 0 0 0 0 0 0 0 0 0
5) zram swap, zswap disabled
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 333 4 34 147 72
Swap: 499 314 185
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
5 0 324128 7256 1192 149444 153 365 1658 471 326 457 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 130703 884 5047298 112889 4197 9517 1433832 21037 0 37
sr0 0 0 0 0 0 0 0 0 0 0
zram0 58673 0 469384 271 138745 0 1109960 927 0 1
6) zram swap, zswap enabled, max_pool_percent = 20
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 338 5 32 141 65
Swap: 499 355 144
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 364984 7584 904 143572 33 166 2052 437 354 457 3 3 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 166168 998 6751610 120911 4383 9543 1436080 18916 0 42
sr0 0 0 0 0 0 0 0 0 0 0
zram0 13819 0 110552 78 68164 0 545312 398 0 0
7) no swap
Note that firefox is not running in this experiment at the time of recording these stats.
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 289 68 8 127 143
Swap: 0 0 0
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 70108 10660 119976 0 0 13503 286 607 618 2 5 88 5 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 748978 3511 66775042 595064 4263 9334 1413728 23421 0 164
sr0 0 0 0 0 0 0 0 0 0 0
8) swap file, zswap enabled, max_pool_percent = 1
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 292 7 63 186 90
Swap: 511 249 262
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 255488 7088 2156 188688 43 182 1417 606 298 432 3 2 94 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 132222 9573 4796802 114450 10171 77607 2050032 137961 0 41
sr0 0 0 0 0 0 0 0 0 0 0
9) swap file (300 M), zswap enabled, max_pool_percent = 100
Firefox was stuck and the system still read from disk furiously.
The baseline for this experiment is a different since a new swap file has been written:
total used free shared buff/cache available
Mem: 485 280 8 8 196 153
Swap: 299 0 299
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 8948 3400 198064 0 0 1186 653 249 388 2 2 95 1 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 103099 688 3610794 68253 3837 8084 1988936 20306 0 27
sr0 0 0 0 0 0 0 0 0 0 0
Specifically, extra 649384 sectors have been written as a result of this change.
State after the experiment:
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 335 32 47 118 53
Swap: 299 277 22
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
7 1 283540 22912 2712 129132 0 0 83166 414 2387 1951 2 23 62 13 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 3416602 26605 406297938 4710584 4670 9025 2022272 33805 0 521
sr0 0 0 0 0 0 0 0 0 0 0
Subtracting the extra 649384 written sectors from 2022272 results in 1372888. This is less than 1433000 (see later) which is probably because of firefox not loading fully.
I also ran a few experiments with low swappiness
values (10 and 1) and they all got stuck in a frozen state with excessive disk reads, preventing me from recording the final memory stats.
Observations:
- Subjectively, high
max_pool_percent
values resulted in sluggishness. - Subjectively, the system in experiment 9 was so slow as to be unusable.
- High
max_pool_percent
values result in the least amount of writes whereas very low value ofmax_pool_percent
results in the most number of writes. - Experiments 5 and 6 (zram swap) suggest that firefox wrote data that resulted in about 62000 sectors written to disk. Anything above about 1433000 are sectors written due to swapping. See the following table.
- If we assume the lowest number of read sectors among the experiments to be the baseline, we can compare the experiments based on how much extra read sectors due to swapping they caused.
Written sectors as a direct consequence of swapping (approx.):
650000 1) swap file, zswap disabled
320000 2) swap file, zswap enabled, max_pool_percent = 20
30000 3) swap file, zswap enabled, max_pool_percent = 70
40000 4) swap file, zswap enabled, max_pool_percent = 100
0 5) zram swap, zswap disabled
0 6) zram swap, zswap enabled, max_pool_percent = 20
-20000 7) no swap (firefox crashed)
620000 8) swap file, zswap enabled, max_pool_percent = 1
-60000 9) swap file (300 M), zswap enabled, max_pool_percent = 100 (firefox crashed)
Extra read sectors as a direct consequence of swapping (approx.):
51792 1) swap file, zswap disabled
354072 2) swap file, zswap enabled, max_pool_percent = 20
6113640 3) swap file, zswap enabled, max_pool_percent = 70
6125280 4) swap file, zswap enabled, max_pool_percent = 100
250496 5) zram swap, zswap disabled
1954808 6) zram swap, zswap enabled, max_pool_percent = 20
61978240 7) no swap
0 (baseline) 8) swap file, zswap enabled, max_pool_percent = 1
401501136 9) swap file (300 M), zswap enabled, max_pool_percent = 100
Interpretation of results:
- This is subjective and also specific to the usecase at hand; behavior will vary in other usecases.
- Zswap's page pool takes away space in RAM that can otherwise be used by system's page cache (for file-backed pages), which means that the system repeatedly throws away file-backed pages and reads them again when needed, resulting in excessive reads.
- The high number of reads in experiment 7 is caused by the same problem - the system's anonymous pages took most of the RAM and file-backed pages had to be repeatedly read from disk.
- It might be possible under certain circumstances to minimize the amount of data written to swap disk near zero using
zswap
but it is evidently not suited for this task. - It is not possible to have "completely compressed RAM" as the system needs a certain amount of non-swap pages to reside in RAM for operation.
Personal opinions and anecdotes:
To answer your question, I first ran a series of experiments. The final answers are at the end, after the experiments.
Experiments performed:
1) swap file, zswap disabled
2) swap file, zswap enabled, max_pool_percent = 20
3) swap file, zswap enabled, max_pool_percent = 70
4) swap file, zswap enabled, max_pool_percent = 100
5) zram swap, zswap disabled
6) zram swap, zswap enabled, max_pool_percent = 20
7) no swap
8) swap file, zswap enabled, max_pool_percent = 1
9) swap file (300 M), zswap enabled, max_pool_percent = 100
Setup before the experiment:
- VirtualBox 5.1.30
- Fedora 27, xfce spin
- 512 MB RAM, 16 MB video RAM, 2 CPUs
- linux kernel 4.13.13-300.fc27.x86_64
- default `swappiness` value (60)
- created an empty 512 MB swap file (300 MB in experiment 9) for possible use during some of the experiments (using `dd`) but didn't `swapon` it yet
- disabled all dnf* systemd services and ran `watch "killall -9 dnf"` to make reasonably sure that dnf wouldn't try to auto-update during an experiment and throw the results off
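The swap-file preparation described above can be sketched as shell commands. The file path and the specific dnf unit name are my assumptions; all of this needs root:

```shell
# Create an empty 512 MB swap file (300 MB in experiment 9) without
# activating it yet; /swapfile is an assumed path.
dd if=/dev/zero of=/swapfile bs=1M count=512
chmod 600 /swapfile
mkswap /swapfile
# swapon /swapfile   # deferred until each experiment begins

# Keep dnf from interfering with the measurements
# (unit name illustrative; list yours with `systemctl list-units 'dnf*'`):
systemctl disable --now dnf-makecache.timer
watch "killall -9 dnf"
```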
State before the experiment:
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 280 72 8 132 153
Swap: 511 0 511
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 74624 8648 127180 0 0 1377 526 275 428 3 2 94 1 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 102430 688 3593850 67603 3351 8000 1373336 17275 0 26
sr0 0 0 0 0 0 0 0 0 0 0
The subsequent swapon operations and similar steps that established each experiment's settings changed these values by no more than about 2%.
Experiment operation consisted of:
- Run Firefox for the first time
- Wait about 40 seconds or until network and disk activity ceases (whichever is longer)
- Record the following state after the experiment (Firefox left running, except for experiments 7 and 9 where Firefox crashed)
State after the experiment:
1) swap file, zswap disabled
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 287 5 63 192 97
Swap: 511 249 262
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 255488 5904 1892 195428 63 237 1729 743 335 492 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 134680 10706 4848594 95687 5127 91447 2084176 26205 0 38
sr0 0 0 0 0 0 0 0 0 0 0
2) swap file, zswap enabled, max_pool_percent = 20
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 330 6 33 148 73
Swap: 511 317 194
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 325376 7436 756 151144 3 110 1793 609 344 477 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 136046 1320 5150874 117469 10024 41988 1749440 53395 0 40
sr0 0 0 0 0 0 0 0 0 0 0
3) swap file, zswap enabled, max_pool_percent = 70
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 342 8 32 134 58
Swap: 511 393 118
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 403208 8116 1088 137180 4 8 3538 474 467 538 3 3 91 3 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 224321 1414 10910442 220138 7535 9571 1461088 42931 0 60
sr0 0 0 0 0 0 0 0 0 0 0
4) swap file, zswap enabled, max_pool_percent = 100
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 345 10 32 129 56
Swap: 511 410 101
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 420712 10916 2316 130520 1 11 3660 492 478 549 3 4 91 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 221920 1214 10922082 169369 8445 9570 1468552 28488 0 56
sr0 0 0 0 0 0 0 0 0 0 0
5) zram swap, zswap disabled
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 333 4 34 147 72
Swap: 499 314 185
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
5 0 324128 7256 1192 149444 153 365 1658 471 326 457 3 2 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 130703 884 5047298 112889 4197 9517 1433832 21037 0 37
sr0 0 0 0 0 0 0 0 0 0 0
zram0 58673 0 469384 271 138745 0 1109960 927 0 1
6) zram swap, zswap enabled, max_pool_percent = 20
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 338 5 32 141 65
Swap: 499 355 144
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 364984 7584 904 143572 33 166 2052 437 354 457 3 3 93 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 166168 998 6751610 120911 4383 9543 1436080 18916 0 42
sr0 0 0 0 0 0 0 0 0 0 0
zram0 13819 0 110552 78 68164 0 545312 398 0 0
7) no swap
Note that firefox is not running in this experiment at the time of recording these stats.
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 289 68 8 127 143
Swap: 0 0 0
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 70108 10660 119976 0 0 13503 286 607 618 2 5 88 5 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 748978 3511 66775042 595064 4263 9334 1413728 23421 0 164
sr0 0 0 0 0 0 0 0 0 0 0
8) swap file, zswap enabled, max_pool_percent = 1
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 292 7 63 186 90
Swap: 511 249 262
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 255488 7088 2156 188688 43 182 1417 606 298 432 3 2 94 2 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 132222 9573 4796802 114450 10171 77607 2050032 137961 0 41
sr0 0 0 0 0 0 0 0 0 0 0
9) swap file (300 M), zswap enabled, max_pool_percent = 100
Firefox was stuck and the system still read from disk furiously.
The baseline for this experiment is different, since a new swap file had been written:
total used free shared buff/cache available
Mem: 485 280 8 8 196 153
Swap: 299 0 299
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 8948 3400 198064 0 0 1186 653 249 388 2 2 95 1 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 103099 688 3610794 68253 3837 8084 1988936 20306 0 27
sr0 0 0 0 0 0 0 0 0 0 0
Specifically, extra 649384 sectors have been written as a result of this change.
State after the experiment:
[root@user-vm user]# free -m ; vmstat ; vmstat -d
total used free shared buff/cache available
Mem: 485 335 32 47 118 53
Swap: 299 277 22
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
7 1 283540 22912 2712 129132 0 0 83166 414 2387 1951 2 23 62 13 0
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
sda 3416602 26605 406297938 4710584 4670 9025 2022272 33805 0 521
sr0 0 0 0 0 0 0 0 0 0 0
Subtracting the extra 649384 written sectors from 2022272 gives 1372888. This is less than 1433000 (see the observations below), probably because Firefox did not load fully.
I also ran a few experiments with low swappiness
values (10 and 1) and they all got stuck in a frozen state with excessive disk reads, preventing me from recording the final memory stats.
Observations:
- Subjectively, high `max_pool_percent` values resulted in sluggishness.
- Subjectively, the system in experiment 9 was so slow as to be unusable.
- High `max_pool_percent` values result in the fewest writes, whereas very low `max_pool_percent` values result in the most writes.
- Experiments 5 and 6 (zram swap) suggest that Firefox wrote data amounting to about 62000 sectors written to disk. Anything above about 1433000 sectors was written due to swapping. See the following table.
- If we take the lowest number of read sectors among the experiments as the baseline, we can compare the experiments by how many extra read sectors their swapping caused.
Written sectors as a direct consequence of swapping (approx.):
650000 1) swap file, zswap disabled
320000 2) swap file, zswap enabled, max_pool_percent = 20
30000 3) swap file, zswap enabled, max_pool_percent = 70
40000 4) swap file, zswap enabled, max_pool_percent = 100
0 5) zram swap, zswap disabled
0 6) zram swap, zswap enabled, max_pool_percent = 20
-20000 7) no swap (firefox crashed)
620000 8) swap file, zswap enabled, max_pool_percent = 1
-60000 9) swap file (300 M), zswap enabled, max_pool_percent = 100 (firefox crashed)
Extra read sectors as a direct consequence of swapping (approx.):
51792 1) swap file, zswap disabled
354072 2) swap file, zswap enabled, max_pool_percent = 20
6113640 3) swap file, zswap enabled, max_pool_percent = 70
6125280 4) swap file, zswap enabled, max_pool_percent = 100
250496 5) zram swap, zswap disabled
1954808 6) zram swap, zswap enabled, max_pool_percent = 20
61978240 7) no swap
0 (baseline) 8) swap file, zswap enabled, max_pool_percent = 1
401501136 9) swap file (300 M), zswap enabled, max_pool_percent = 100
Interpretation of results:
- This is subjective and also specific to the usecase at hand; behavior will vary in other usecases.
- Zswap's page pool takes away RAM that could otherwise be used by the system's page cache (for file-backed pages), which means the system repeatedly throws away file-backed pages and reads them back from disk when needed, resulting in excessive reads.
- The high number of reads in experiment 7 is caused by the same problem: the system's anonymous pages took up most of the RAM, and file-backed pages had to be repeatedly re-read from disk.
- It might be possible under certain circumstances to reduce the amount of data written to the swap disk to near zero using `zswap`, but it is evidently not suited for this task.
- It is not possible to have "completely compressed RAM", as the system needs a certain amount of non-swapped pages resident in RAM to operate.
Personal opinions and anecdotes:
- The main improvement of zswap in terms of disk writes is not that it compresses pages, but that it has its own buffering & caching system which reduces the page cache and effectively keeps more anonymous pages (in compressed form) in RAM. (However, based on my subjective experience of daily Linux use, a system with swap and `zswap` at the default `swappiness` and `max_pool_percent` values always behaves better than any `swappiness` value with no `zswap`, or than `zswap` with high `max_pool_percent` values.)
- Low `swappiness` values seem to make the system behave better until the remaining page cache is so small that excessive disk reads render the system unusable. The same goes for too high a `max_pool_percent`.
- Either use solely `zram` swap and limit the amount of anonymous pages you need to hold in memory, or use disk-backed swap with `zswap` and approximately default values for `swappiness` and `max_pool_percent`.
EDIT:
Possible future work to answer the finer points of your question would be to find out, for your particular use case, how the `zsmalloc` allocator used in `zram` compares compression-wise with the `zbud` allocator used in `zswap`. I'm not going to do that, though; I'm just pointing out things to search for in the docs and on the internet.
EDIT 2:
Running `echo zsmalloc > /sys/module/zswap/parameters/zpool` switches zswap's allocator from `zbud` to `zsmalloc`. Continuing with my test fixture from the above experiments and comparing `zram` with `zswap`+`zsmalloc`, it seems that as long as the swap memory needed is the same as either the `zram` swap size or `zswap`'s `max_pool_percent`, the amount of reads and writes to disk is very similar between the two. My personal opinion, based on these facts: as long as the amount of `zram` swap I need is smaller than the amount of `zram` swap I can afford to actually keep in RAM, it is best to use solely `zram`; once I need more swap than I can actually keep in memory, it is best either to change my workload to avoid that, or to disable `zram` swap and use `zswap` with `zsmalloc`, setting `max_pool_percent` to the equivalent of what zram previously took in memory (size of `zram` × compression ratio). I currently don't have the time to do a proper write-up of these additional tests, though.
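For reference, a runtime configuration along the lines of EDIT 2 might look like this. The numbers are illustrative, not recommendations (a 256 MB zram at roughly 2:1 compression occupies about 128 MB of RAM, i.e. about 25% of a 512 MB machine); it needs root, and `zsmalloc` as a zswap zpool needs a reasonably recent kernel:

```shell
# Enable zswap with the zsmalloc allocator and a pool cap sized like the
# zram device it replaces (25% is an illustrative value, see lead-in).
echo 1        > /sys/module/zswap/parameters/enabled
echo zsmalloc > /sys/module/zswap/parameters/zpool
echo 25       > /sys/module/zswap/parameters/max_pool_percent

# Or persistently, via the kernel command line:
#   zswap.enabled=1 zswap.zpool=zsmalloc zswap.max_pool_percent=25
```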
edited Nov 27 '17 at 6:05
answered Nov 26 '17 at 17:04 – Jake F
Welcome and thanks! P.S. you may want to look at `zpool=z3fold` as it allows 3 pages per compressed page, rather than 2.
– Tom Hale
Dec 2 '17 at 8:28
I did try `z3fold` and it slowed the computer tremendously while keeping the CPU load high, compared to `zsmalloc`. Maybe because I didn't try it on the latest kernel, which includes some crucial performance improvements to `z3fold` – elinux.org/images/d/d3/Z3fold.pdf slide 28.
– Jake F
Dec 2 '17 at 13:31
Ooh, perhaps it's time to change to 4.14 LTS... :) Your article says that it's not the best-performing zpool.
– Tom Hale
Dec 2 '17 at 13:55
It's also possible to mix zram with disk-based swap without suffering LRU inversion. To achieve this, create 30 zram volumes at priority 5 and disk swap at priority 0. Write a cron job which runs swapoff / swapon for one of the volumes every 2 minutes. The zram keeps getting paged back into memory while the disk doesn't, so the least recently used pages eventually end up on disk when they overflow the zram.
– Wil
Feb 5 at 17:40
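A rough sketch of the rotation Wil describes. The device naming, state-file path, and device count are my assumptions, it needs root, and the zram module must already be loaded with enough swap-initialized devices:

```shell
#!/bin/sh
# Rotate one zram swap device per run (e.g. from cron every 2 minutes:
#   */2 * * * * /usr/local/sbin/zram-rotate
# -- path is an assumption). swapoff forces that device's pages back into
# RAM; under memory pressure they get re-swapped, eventually spilling to
# the priority-0 disk swap, while the device rejoins empty at priority 5.
STATE=/run/zram-rotate.idx
N=30                                   # number of zram swap devices
i=$(cat "$STATE" 2>/dev/null || echo 0)
swapoff "/dev/zram$i"
swapon -p 5 "/dev/zram$i"
echo $(( (i + 1) % N )) > "$STATE"     # remember which device is next
```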