Abstract
You can also download a PDF version of this document.
Notice
Since I made this benchmark, many things have changed. FreeBSD 5.x has gone -stable, the DragonFly project has published version 1.0, NetBSD has completed version 2.0, and there has been much development on Linux as well. This benchmark has mostly historical value now. I'd like to repeat this sort of benchmark and apply the knowledge gained in doing it (mostly, doing more and longer runs and calculating standard deviation...), but this probably won't happen anytime soon, mostly due to the lack of appropriate hardware (SMP this time).
What is benchmarked?
This benchmark evaluates FreeBSD 5.2, FreeBSD 4.9, DragonFly -current, NetBSD 1.6.2, Linux 2.4.25 and Linux 2.6.5 systems. Since "Linux" does not represent a complete system, Gentoo 2004.0 (with the specified kernels added) is used. None of the kernels had custom optimizations turned on unless otherwise stated.
Benchmarking software was: bytebench 3.1, bonnie++ 1.93.03, ubench 0.3, postgresql 7.4.1, apache 1.3.29 and php 4.3.4.
Why another benchmark?
This benchmark was conceived as an opportunity for me to get some "real numbers" behind my choice of OS for development and deployment of a web CMS (content management system). I've been a FreeBSD advocate in my community for several years, but my justifications were based mostly on "3rd party" evidence (various other benchmarks and articles, and tracking the development mailing lists) and my subjective feelings that the system is "more right" for the job (actually, most of the jobs :) ). Now the opportunity has presented itself for me to do some relatively extensive benchmarking and usage of various systems, and so add numbers to my claims.
Even if I'm biased towards the BSD systems, I tried to make the benchmarking protocol as fair as possible for all systems (as will be apparent, I hope, from the rest of the article). The fact that I am inclined to use FreeBSD doesn't imply that I have negative feelings towards Linux-based systems, or that I'm not competent or comfortable using them. My daily job is mostly systems administration, and these days you have to be competent in many different systems.
Setup
Each test was run on the same machine: a Compaq Proliant ML330e with a 933MHz Pentium III processor, 1GB of RAM, and an 18GB SCSI disk on an Adaptec (Compaq OEM) 29160 Ultra160 SCSI controller.
Each time, the disk was repartitioned into 3 parts: 6000MB root (and the system), 1000MB swap, and the rest (10359MB) exclusively for the benchmark. Care was taken to ensure all systems had equal test conditions, and for all cases, the tests were run in the same order.
The disk partitions/slices were created as follows:
| Size, filesystem | Mount point |
| 6000MB (ufs or ufs2 or ext3) | / |
| 1000MB (swap) | swap |
| 6000MB (ufs+s, ufs2+s, ext3 or xfs) | /bench |
All filesystem-dependent benchmarks (bonnie++, the database for pgbench & cms test) were performed on the /bench partition/slice.
FreeBSD 5.2, FreeBSD 4.9, DragonFlyBSD
The disk layout when using FreeBSD 5.2 was allocated to one single partition, divided into slices (default newfs options, block size 16KB) as specified above.
The Kern-Developer distribution was installed, with ports. The only options enabled in the Network Services Menu were "sshd" and "RFC1323 & RFC1644". Nothing else of the sort was started later. To take advantage of the added performance, /etc/make.conf was modified to include CPUTYPE=p3 and CFLAGS=-O2 -pipe.
Of the application packages, the following were installed: portupgrade, vim and cvsup (+dependencies) from binary packages, apache13, bonnie++, bytebench, postgresql7, php4 from ports.
The benchmarks were run for two kernels (same world): the GENERIC kernel, as was installed from the official ISO image, and a CUSTOM kernel, which differs from the GENERIC in these points:
- Removed support for i486 and i586 CPUs
- Scheduler changed to SCHED_ULE
- Removed INVARIANTS SUPPORT
- Removed "options SMP" and "device apic"
- (CFLAGS=-march=pentium3 -O2)
The change was mainly to explore the low performance achieved by FreeBSD 5.2 in some other similar tests.
FreeBSD 4.9 was set up in a similar way, except that the only tested kernel was the GENERIC one, as distributed with the release.
DragonFlyBSD is a relatively new system (it's not released yet), and as such doesn't include a setup/installer program. Luckily, the instructions that come with the boot/live CD are good enough so that installing from scratch was pretty easy. Everything else was like on the FreeBSDs (since this is a fork of FreeBSD 4, it uses the same ports system).
NetBSD 1.6.2
Because of my clumsiness with the installer, or the presence of some bugs regarding modifying a FreeBSD partition setup, I wasn't able to partition the drive as intended, but instead the default partition scheme was used. All benchmarks were performed on the /usr partition (which had softupdates/softdeps enabled).
The default compiler configuration included -O2, so I left it at that. The pkgsrc at the time didn't include the required version of PostgreSQL so I had to compile it manually. To make it work, I also recompiled the kernel (to increase available SYSV shared memory). This was the most ascetic system I tested :)
Linux 2.4.25, Linux 2.6.5
I chose the Gentoo distribution as the most BSD-like in philosophy and practice, and was very pleasantly surprised. Installing from scratch was not that hard (although my subjective opinion is that DragonFly's documentation was a little better), and the portage system is absolutely great! The setup was the usual partition scheme (this time with three primary partitions instead of slices), and everything else went smoothly. The tests were run on the vanilla-sources and development-sources kernels. Both were tested on the same installation; the appropriate kernel was selected in the boot loader. There were weird problems with some of the benchmarks on the 2.6 kernel, which will be described in the appropriate sections.
Linux tests were done on two filesystem types for both kernels: ext3 & xfs (the /bench partition was formatted with one, then the other), and the bonnie++ and database tests were done on that partition.
Benchmark results
All values are of "bigger-is-better" type. Special care was taken to ensure the results are consistent and the conditions were as uniform as possible across the various systems.
Bytebench
This is a known and loved benchmark that aims to stress various parts of a system. Version 3.1 was used, although there are newer versions of the package, since that is the only version available in the FreeBSD ports system. Of the results, only the summary report and the "syscall overhead" score were recorded.
The numbers in the summary report are the "index values" of the benchmark (except for the syscall overhead test, which is loops-per-second).
| System | Arithmetic Test (type=double) | Dhrystone 2 without register variables | Execl Throughput Test | File Copy (30sec) | Pipe-based Context Switching Test | Shell scripts (8 concurrent) | *SUM of 6 items | *AVERAGE | System call overhead (lps/1000) |
| FreeBSD 5.2, GENERIC | 114.2 | 102.3 | 30.6 | 225.4 | 45.5 | 18.0 | 536.0 | 89.3 | 214.7 |
| FreeBSD 5.2, CUSTOM | 114.6 | 102.7 | 31.4 | 230.5 | 56.7 | 49.8 | 585.7 | 97.6 | 234.2 |
| FreeBSD 4.9, GENERIC | 112.9 | 82.6 | 58.9 | 281.7 | 51.2 | 31.8 | 619.1 | 103.2 | 317.6 |
| DragonFly, current | 114.6 | 84.1 | 37.0 | 279.9 | 63.6 | 31.5 | 610.7 | 101.8 | 285.4 |
| Linux 2.4.25, xfs |  | 95.7 | 112.4 | 296.6 | 178.2 | 85.0 | 876.7 | 146.1 | 480.1 |
| Linux 2.6.5, xfs |  | 94.6 | 93.2 | 308.8 | 143.9 | 75.3 | 823.3 | 137.2 | 491.9 |
| NetBSD 1.6.2 | 114.7 | 86.1 | 44.7 | 269.5 | 31.5 | 22.8 | 569.3 | 94.9 | 400.6 |
Concerning the FreeBSDs, the tests show that the regression in performance is still present in the 5.x branch, but is slowly being mended. Even so, the results from the BSDs are far below those of Linux.
There were a few anomalies when running this benchmark under Linux. The arithmetic test (type=double) failed in an unknown way (not a segfault or any similar signal, just a notification that the test results were unavailable). Since I didn't like the empty square, I "extrapolated" the result by assuming the same ratio between the "arithmetic test" and "dhrystone 2" in a system that uses gcc 3.3 & FreeBSD 5.2. Additionally, in the case of the 2.6 kernel the result reporting failed somehow, so I had to recalculate the scores from the intermediate results (but in the same way as the original program).
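The arithmetic behind the summary columns can be checked with a short script. This is only a sketch: it assumes that "*SUM of 6 items" is the plain sum of the six index values and "*AVERAGE" is that sum divided by six, which matches the rows where all six values are present. The names `ROWS`, `implied_missing` and `average` are made up for this illustration. For the Linux rows, where the arithmetic cell is blank, the script back-computes the value implied by the reported sum; that is a derived figure, not one reported by bytebench.

```python
# Index values copied from the bytebench summary table above.
# None marks the arithmetic cell that is blank for the Linux rows.
ROWS = {
    "FreeBSD 5.2, GENERIC": ([114.2, 102.3, 30.6, 225.4, 45.5, 18.0], 536.0),
    "Linux 2.4.25, xfs": ([None, 95.7, 112.4, 296.6, 178.2, 85.0], 876.7),
    "Linux 2.6.5, xfs": ([None, 94.6, 93.2, 308.8, 143.9, 75.3], 823.3),
}


def implied_missing(values, reported_sum):
    """Return the single missing index value implied by the reported sum."""
    known = [v for v in values if v is not None]
    return round(reported_sum - sum(known), 1)


def average(reported_sum, items=6):
    """Average as it appears in the table: sum of the items divided by their count."""
    return round(reported_sum / items, 1)


for name, (values, total) in ROWS.items():
    if None in values:
        print(name, "- implied arithmetic index:", implied_missing(values, total))
    else:
        print(name, "- sum matches table:", round(sum(values), 1) == total)
    print(name, "- average:", average(total))
```

The same check can be repeated for the remaining rows by adding them to `ROWS`.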
Bonnie++
This is a disk and filesystem benchmark. It measures the raw disk throughput in several contexts (byte by byte I/O, block IO, input, output and rewrite performance; creating and deleting files). Bonnie++ was run with parameters "-s 2000 -d /bench".
| System | Sequential Output, Per Chr (K/sec) | %CPU | Sequential Output, Block (K/sec) | %CPU | Sequential Output, Rewrite (K/sec) | %CPU | Sequential Input, Per Chr (K/sec) | %CPU | Sequential Input, Block (K/sec) | %CPU | Random Seeks (/sec) | %CPU |
| FreeBSD 5.2, GENERIC | 70 | 99 | 44,915 | 39 | 19,159 | 19 | 166 | 99 | 46,684 | 26 | 219.4 | 8 |
| FreeBSD 5.2, CUSTOM | 88 | 99 | 44,863 | 36 | 19,153 | 19 | 229 | 99 | 46,531 | 25 | 231.9 | 9 |
| FreeBSD 4.9, GENERIC | 136 | 99 | 46,944 | 33 | 23,844 | 23 | 248 | 99 | 49,677 | 24 | 243.3 | 6 |
| DragonFly, current | 110 | 99 | 46,877 | 41 | 22,282 | 26 | 195 | 99 | 48,814 | 26 | 247.4 | 6 |
| Linux 2.4.25, xfs | 11202 | 98 | 59,816 | 28 | 24,818 | 17 | 11261 | 89 | 52,342 | 15 | 344.2 | 0 |
| Linux 2.4.25, ext3 | 10938 | 98 | 55,434 | 36 | 22,166 | 15 | 11393 | 90 | 51,209 | 14 | 321.2 | 1 |
| Linux 2.6.5, xfs | 11036 | 99 | 56,600 | 30 | 23,749 | 15 | 12260 | 98 | 57,445 | 15 | 335.3 | 1 |
| Linux 2.6.5, ext3 | 10797 | 98 | 60,000 | 41 | 23,955 | 17 | 11705 | 93 | 52,162 | 13 | 304.5 | 1 |
| NetBSD 1.6.2 | 43390 | 84 | 46,101 | 29 | 8,532 | 6 | 29324 | 93 | 47,178 | 17 | 256.5 | 1 |
Filesystem throughput tests:
| System | Sequential Create, Create (/sec) | %CPU | Sequential Create, Read (/sec) | %CPU | Sequential Create, Delete (/sec) | %CPU | Random Create, Create (/sec) | %CPU | Random Create, Read (/sec) | %CPU | Random Create, Delete (/sec) | %CPU |
| FreeBSD 5.2, GENERIC | 4405 | 34 | 23910 | 54 | 26,261 | 99 | 3,006 | 25 |  |  | 29,776 | 99 |
| FreeBSD 5.2, CUSTOM | 5,159 | 33 |  |  | 23,749 | 57 | 4,301 | 31 |  |  | 28,241 | 79 |
| FreeBSD 4.9, GENERIC | 16,872 | 93 |  |  | 20,820 | 94 | 17,191 | 94 |  |  | 10,591 | 66 |
| DragonFly, current | 13,131 | 88 | 27577 | 96 | 17,886 | 92 | 11,256 | 76 |  |  | 6,790 | 45 |
| Linux 2.4.25, xfs | 1,786 | 25 |  |  | 1,720 | 34 | 1,875 | 28 |  |  | 587 | 7 |
| Linux 2.4.25, ext3 | 724 | 99 |  |  |  |  | 758 | 99 |  |  | 2,890 | 97 |
| Linux 2.6.5, xfs | 1,917 | 27 |  |  | 1,853 | 34 | 1,891 | 30 |  |  | 558 | 7 |
| Linux 2.6.5, ext3 | 694 | 99 |  |  |  |  | 687 | 97 |  |  | 2,654 | 98 |
| NetBSD 1.6.2 | 723 | 73 |  |  | 18,500 | 99 | 819 | 82 | 1016 | 99 | 3,915 | 98 |
The filesystem handling in FreeBSD and its derivatives is unsurpassed by far. An interesting data point is the NetBSD score, because it uses the same type of filesystem, but without the dirhash and other optimizations present in FreeBSD. Missing data in the above table signifies operations that were too fast to be measured correctly by the bonnie++ program.
Linux clearly wins the I/O throughput test, scoring up to about 130% of the nearest BSD's result (roughly 30% higher), either because it has a better SCSI driver or because the system itself is just faster.
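The size of the Linux lead in the throughput table can be recomputed directly from the block and seek columns. The snippet below is a rough sketch, not part of the original benchmark tooling: it takes the best Linux result and the best BSD result for each metric and prints their ratio. `RESULTS`, `METRICS` and `best_ratio` are names invented for this example, and the per-character columns are left out.

```python
# (block output K/sec, rewrite K/sec, block input K/sec, random seeks /sec),
# copied from the bonnie++ throughput table above.
RESULTS = {
    "FreeBSD 5.2, GENERIC": (44915, 19159, 46684, 219.4),
    "FreeBSD 5.2, CUSTOM": (44863, 19153, 46531, 231.9),
    "FreeBSD 4.9, GENERIC": (46944, 23844, 49677, 243.3),
    "DragonFly, current": (46877, 22282, 48814, 247.4),
    "NetBSD 1.6.2": (46101, 8532, 47178, 256.5),
    "Linux 2.4.25, xfs": (59816, 24818, 52342, 344.2),
    "Linux 2.4.25, ext3": (55434, 22166, 51209, 321.2),
    "Linux 2.6.5, xfs": (56600, 23749, 57445, 335.3),
    "Linux 2.6.5, ext3": (60000, 23955, 52162, 304.5),
}
METRICS = ("block output", "rewrite", "block input", "random seeks")


def best_ratio(metric_index):
    """Ratio of the best Linux score to the best BSD score for one metric."""
    linux = max(v[metric_index] for k, v in RESULTS.items() if k.startswith("Linux"))
    bsd = max(v[metric_index] for k, v in RESULTS.items() if not k.startswith("Linux"))
    return linux / bsd


for index, name in enumerate(METRICS):
    print(f"{name}: best Linux is {best_ratio(index):.2f}x the best BSD")
```

On these figures the ratio ranges from about 1.04 (rewrite) to about 1.34 (random seeks), with block output at about 1.28.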
There were no anomalies in running the benchmark.
Ubench
Ubench is another popular unix benchmark. It concentrates on CPU and memory throughput. Ubench will spawn about 2 concurrent processes for each CPU available on the system (1 in this case). This ensures all available raw CPU horsepower is used. For the CPU score, ubench executes rather senseless mathematical integer and floating-point calculations for 3 minutes. The ratio of floating-point calculations to integer is about 1:3. The memory score is the result of executing rather senseless memory allocation and memory-to-memory copying operations for another 3 minutes, concurrently using several processes. (This description of ubench was taken from its homepage.)
| System | CPU | Memory | Average |
| FreeBSD 5.2, GENERIC | 52,218 | 42,596 | 47,407 |
| FreeBSD 5.2, CUSTOM | 52,305 | 43,837 | 48,071 |
| FreeBSD 4.9, GENERIC | 52,332 | 46,349 | 49,341 |
| DragonFly, current | 52,223 | 40,494 | 46,359 |
| Linux 2.4.25, xfs | 55,588 | 61,936 | 58,762 |
| Linux 2.6.5, xfs | 55,026 |  | 58,481 |
| NetBSD 1.6.2 | 49,879 | 46,703 | 48,291 |
The CPU score is usually the result of how good a compiler is, but even here some trends can be noticed. All the FreeBSDs have essentially the same score, which is again lower than the one from the Linux systems. The memory benchmark is rather interesting, as it shows significant differences between the systems. Surprisingly, a particularly low score was measured in DragonFly.
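In the ubench table, the Average column appears to be the arithmetic mean of the CPU and Memory scores. That is an inference from the rows where all three values are present, not something stated in the description above. A small sketch under that assumption, with the made-up names `SCORES`, `mean_score` and `implied_memory`, checks the complete rows and shows the Memory value that the one incomplete row would imply:

```python
# (CPU, Memory, Average) copied from the ubench table above.
# None marks the Memory cell that is blank for Linux 2.6.5.
SCORES = {
    "FreeBSD 5.2, GENERIC": (52218, 42596, 47407),
    "FreeBSD 4.9, GENERIC": (52332, 46349, 49341),
    "Linux 2.4.25, xfs": (55588, 61936, 58762),
    "Linux 2.6.5, xfs": (55026, None, 58481),
}


def mean_score(cpu, memory):
    """Assumed definition of the Average column."""
    return (cpu + memory) / 2


def implied_memory(cpu, average):
    """Memory score implied by CPU and Average (rounding makes it accurate to about +/-1)."""
    return 2 * average - cpu


for name, (cpu, memory, avg) in SCORES.items():
    if memory is None:
        print(name, "- implied memory score:", implied_memory(cpu, avg))
    else:
        # Allow 0.5 of slack because the table shows whole numbers.
        print(name, "- average consistent:", abs(mean_score(cpu, memory) - avg) <= 0.5)
```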
PgBench
This is one of the "real-world" benchmarks. It is included in the "contrib" tree of recent PostgreSQL distributions. The test database is made with a scale of 40, which results in 4000000 tuples for the benchmark. There were two runs of the benchmark, one with 50 transactions per client ("B", the result is averaged over 5 runs) and one with 1000 transactions per client ("A", the result is averaged over 3 runs). The benchmarks were run with the "-C" switch, so that every transaction was made in its own connection.
| System | A: TPS (+conn) | B: TPS (+conn) |
| FreeBSD 5.2, GENERIC | 24.42 | 35.18 |
| FreeBSD 5.2, CUSTOM | 32.70 | 35.60 |
| FreeBSD 4.9, GENERIC | 63.90 | 85.60 |
| DragonFly, current | 32.26 | 40.40 |
| Linux 2.4.25, xfs | 46.30 | 57.65 |
| Linux 2.4.25, ext3 | 37.85 | 40.44 |
| Linux 2.6.5, xfs | 56.72 | 43.79 |
| Linux 2.6.5, ext3 | 31.70 | 27.09 |
| NetBSD 1.6.2 | 35.73 | 38.65 |
The results vary greatly across the systems, and often between consecutive runs of the benchmark program, so they can only be used as a crude indication of a trend. A clear winner here is FreeBSD 4.9, with Linux 2.6 not far behind.
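Since the pgbench figures are averages over a handful of runs, reporting the spread next to the mean makes the run-to-run variation visible. The following is a minimal sketch of how that could be done with Python's standard `statistics` module. The function name `summarize` is hypothetical, and the sample numbers are placeholders for illustration only; they are not measurements from this benchmark.

```python
import statistics


def summarize(tps_runs):
    """Return mean, sample standard deviation and coefficient of variation (%)."""
    mean = statistics.mean(tps_runs)
    stdev = statistics.stdev(tps_runs)  # sample standard deviation (n - 1)
    return {"mean": mean, "stdev": stdev, "cv_percent": 100 * stdev / mean}


# Placeholder values for illustration only; NOT results from this article.
example_runs = [40.0, 42.0, 38.0, 41.0, 39.0]

summary = summarize(example_runs)
print(f"mean = {summary['mean']:.2f} TPS, "
      f"stdev = {summary['stdev']:.2f}, "
      f"cv = {summary['cv_percent']:.1f}%")
```

A large coefficient of variation relative to the gap between two systems would suggest that the difference between them is within the noise.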
Web CMS system
This benchmark is actually a run of a real heavyweight web CMS (Content Management System) in the development of which I participate as a lead programmer. In a way, all this work is to determine the optimal platform for deployment of this system.
The system itself is closed-source, so I cannot give away much, but here are some technical details: the system is heavily database-based (PostgreSQL), it uses turck-mmcache for PHP acceleration, SQL and intermediate data caching, and also uses Smarty templates and filesystem-based caching. The system was somewhat deliberately pessimized for this particular purpose.
The benchmark is the result of running the siege program from a computer on the same network. For the "20 clients" test, the parameters to siege were "-f list -t5m -d1 -c20"; for the "1 client" test, the parameters were "-f list -t5m -b -c1".
| System | 20 clients: Trans./sec | 20 clients: Concurrency | 1 client: Trans./sec | 1 client: Concurrency |
| FreeBSD 5.2, GENERIC | 3.13 | 17.00 | 2.24 | 0.99 |
| FreeBSD 5.2, CUSTOM | 3.76 | 13.96 | 2.23 | 0.99 |
| FreeBSD 4.9, GENERIC | 2.10 | 4.58 | 2.22 | 0.96 |
| DragonFly, current | 1.38 | 17.96 | 2.27 | 1.00 |
| Linux 2.4.25, xfs | 0.99 | 18.31 | 2.39 | 1.00 |
| Linux 2.4.25, ext3 | 1.11 | 18.68 | 2.39 | 0.99 |
| Linux 2.6.5, xfs | 3.56 | 17.07 | 2.38 | 1.00 |
| Linux 2.6.5, ext3 | 3.65 | 17.50 | 2.39 | 0.99 |
| NetBSD 1.6.2 | 1.24 | 17.62 | 2.85 | 4.86 |
The results are rather surprising when looked at with all the previous benchmark results in mind. This shows FreeBSD at its finest - concurrent network access. Judging by the bonnie++ and bytebench results alone, all Linux systems should have had a score far superior to that of the BSDs, but that is not the case. Linux 2.4 seems to be bound by some scalability issues, but they seem to be mostly resolved in the 2.6 series. NetBSD seems to be doing something out of the ordinary with network connections - it's the only way I can explain the high number in the "1 client/concurrency" cell.
The ext3/xfs mark refers to the filesystem on which the database was hosted.
DragonFly was the only system where there were problems running the benchmark, but that is probably because of the experimental state of the system (the benchmark was repeated until I got clear runs).
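The two columns of the "20 clients" results can be combined into a rough mean response time per request. The sketch below assumes siege's usual definitions, which are not spelled out in this article: concurrency is the total time spent in transactions divided by elapsed time, and the transaction rate is transactions divided by elapsed time, so their quotient is the average time per transaction. `TWENTY_CLIENTS` and `mean_response_time` are names made up for this example.

```python
# (transactions/sec, concurrency) for the "20 clients" run, from the table above.
TWENTY_CLIENTS = {
    "FreeBSD 5.2, GENERIC": (3.13, 17.00),
    "FreeBSD 5.2, CUSTOM": (3.76, 13.96),
    "FreeBSD 4.9, GENERIC": (2.10, 4.58),
    "DragonFly, current": (1.38, 17.96),
    "Linux 2.4.25, xfs": (0.99, 18.31),
    "Linux 2.4.25, ext3": (1.11, 18.68),
    "Linux 2.6.5, xfs": (3.56, 17.07),
    "Linux 2.6.5, ext3": (3.65, 17.50),
    "NetBSD 1.6.2": (1.24, 17.62),
}


def mean_response_time(trans_per_sec, concurrency):
    """Average seconds per transaction, assuming concurrency = rate * response time."""
    return concurrency / trans_per_sec


# List the systems from fastest to slowest estimated response.
ranked = sorted(TWENTY_CLIENTS.items(), key=lambda item: mean_response_time(*item[1]))
for name, (rate, concurrency) in ranked:
    print(f"{name}: about {mean_response_time(rate, concurrency):.1f} s per request")
```

Read this way, a high concurrency figure paired with a low transaction rate points to a server that keeps many requests waiting rather than one that handles many at once.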
Summary

The summary graph is composed of the following results:
- bytebench: "Sum of 6 values"
- bonnie++: "Sequential rewrite kb/s"
- ubench: average score
- pgbench: "A" tps/sec
- web CMS: "20 clients trans/sec"
Linux is getting better by the day. The 2.6 kernel is probably ready for use in previously BSD-dominated applications. Its score on the synthetic benchmarks really stands out in the cumulative graph, and FreeBSD is really starting to look under-performing. What bails it out is its excellent scalability, which manages to be even a little better than that of Linux 2.6.
Conclusion
My intentions in performing these benchmarks were:
- To see how a specific real-world application runs on the benchmarked systems
- To explore the differences between the "stable" FreeBSD 4.9 and the "new technology" version 5.2. Some benchmarks published on the Internet have shown a disturbing lack of performance in the FreeBSD 5.x series.
- To see if the newcomer in the field, DragonFlyBSD, can continue the good record of the FreeBSD 4.x technology and make it better than FreeBSD 5.x (which is one of the prominent project goals)
- To see how FreeBSD measures up to the Linux 2.4 and 2.6 series.
- To learn and have fun :)
As for the first goal, the project will stay at FreeBSD, only it will change to FreeBSD 5.x as it seems to be meeting its goals and offering more performance.
There were some surprises during the testing. At one time it seemed that Linux would "sweep the floor" with FreeBSD, at least judging by the bonnie++ results: it scored almost 130% of the best BSD's result in the I/O test. But that advantage turned out to be in vain: the performance of the 2.4 kernel in the CMS test was abysmal. The tests also show that the 2.6 kernel is rapidly approaching the scalability of FreeBSD, and may soon surpass it.
I was very pleasantly surprised by the Gentoo distribution; I will probably use it in all my future dealings with Linux. The portage system has many features that the FreeBSD ports system can learn from, most notable of which are the ability to keep build information for several versions of a package simultaneously, and the integrated feeling of the whole system (as opposed to the bunch-of-small-utilities approach in FreeBSD).
Disclaimer: if anything, this article shows that you need to target the benchmarking to your own conditions. Running synthetic micro-benchmarks does not give an indication of how a system performs in a real-world situation. You probably shouldn't trust these benchmarks to apply to your own particular system(s), and should carefully benchmark before deployment. The purpose of this article is not to start (or participate in) holy wars about which system is better. It all depends on your requirements.
Regrets, I've had a few...
But then again, too few to mention. I wish I had collected more data about the state of the systems being benchmarked (how much time is spent in user/system/interrupt state, how many pagefaults [even if there were really none during these particular benchmarks], IO ops/second etc.). I've learned things about the systems benchmarked and about the benchmark programs themselves, and if I had to do it all over again, I'd change a few things.
But my biggest regret is that I haven't been able to run this kind of test on dual- or quad-processor machines (and I'm not talking HTT here). These would show how scalable the systems really are, and would probably make DragonFlyBSD shine (at least when it's released).
Trivia
These statements do not mean anything; read with caution :)
- Linux kernel 2.4.24 is compressed to a 28.4MB .tbz archive, FreeBSD 4.9 kernel is compressed to a 10.3MB .tbz archive. About the same ratio is present with Linux 2.6.1 vs FreeBSD 5.2.
- Looking at the information presented by the "top" utility, it seems that the Linux systems spend much less time in the "system" state, but overall spend less time in "idle" state during the benchmarks
- The ability to set compiler optimization level does have influence on the results, but mostly only for the synthetic tests.
- In most of the tests, Linux systems gave more uniform results (FreeBSD results were more scattered)
- OpenOffice.org sucks donkey balls with HTML export... :(
Acknowledgments
I thank all that have supported me, or merely left me alone at the corner to click away at the computer (and curse all software that sucks), especially: the inhabitants of the "cube office", Neven (for proof-reading this text), and Sonja (well, just for being there :) ).
Ivan Voras, April 7th, 2004.