
Pun intended. Today I am doing a totally unscientific (but quite useful) comparison of a few well-known HTTP performance tools.

System: FreeBSD 8.2, x86_64 (Xeon L3426, no HT, no TB). Both the client and the server were on the same host.

Load generators tested: httperf, ab (Apache Benchmark), siege, pronk.
The server side is Node.js (0.6.6) with the following code:
// A minimal HTTP server that replies "Success" to every request.
var app = require('http').createServer(handler);

// Listen on the loopback interface only.
app.listen(8081, "127.0.0.1");

function handler (req, res) {
  res.writeHead(200);
  return res.end('Success');
}
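
A minimal way to bring the setup up and smoke-test it, assuming the snippet above is saved as server.js (the file name is my own, not from the original run):

node server.js &
curl -v http://127.0.0.1:8081/   # should return a 200 response with the body "Success"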

All load generators were first set up for single-connection operation (no concurrency), then I threw some concurrency in to up the ante.
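
For reference, a hedged sketch of what the ab and siege invocations look like for the single-connection and 100-connection runs (request counts mirror the table below; the exact httperf command lines are shown in the comments, and siege's keep-alive vs. close behaviour is controlled by the connection setting in its config file rather than a flag):

# ab, HTTP/1.0, no keep-alive
ab -n 500000 -c 1 http://127.0.0.1:8081/
ab -n 500000 -c 100 http://127.0.0.1:8081/

# siege in benchmark mode (-b: no delay between requests)
siege -b -c 1 -r 600000 http://127.0.0.1:8081/
siege -b -c 100 -r 3000 http://127.0.0.1:8081/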

Test results:
Tool    | Keep-Alive? | Concurrency | Requests made | Test duration | RPS shown             | Notes
httperf | YES         | 1           | 500 K         | 41 s          | 12.1 K                | httperf's CPU usage was about 10% less than Node's: ~60% vs 70%
httperf | NO          | 1           | 500 K         | 93 s          | 5.3 K                 |
siege   | YES         | 1           | 600 K         | 71 s          | 8.2 K                 | siege's CPU usage was about 30% less than Node's: ~50% vs 75%
siege   | NO          | 1           | 300 K         | 88 s          | 3.9 K                 |
ab      | NO [1]      | 1           | 500 K         | 104 s         | 4.8 K                 | ab's CPU usage was about 3x less than Node's: ~25% vs 80%
pronk   | YES [2]     | 1           | 100 K         | 112 s         | 0.9 K [3] (std=1.5 K) | pronk's CPU usage was about 20x greater than Node's: ~101% vs 4%
httperf | YES         | 100         | 500 K         | 39 s          | 12.7 K                | httperf's CPU usage was on par with Node.JS's
httperf | NO          | 258 [4]     | 500 K         | 69 s          | 7.2 K                 |
siege   | YES         | 100         | 300 K         | 21 s          | 11.7 K                | siege's CPU usage was about 5x less than Node's: ~20% vs 100%
siege   | NO          | 100         | 300 K         | 43 s          | 7.4 K                 | siege's CPU usage was about 2x less than Node's: ~50% vs 100%
ab      | NO [1]      | 100         | 500 K         | 63 s          | 7.8 K                 | ab's CPU usage was about 2x less than Node's: ~40% vs 99%
pronk   | YES [2]     | 100         | 500 K         | 81 s          | 7.0 K (std=5.1 K)     | pronk's CPU usage was about 3x greater than Node's: ~160% vs 60%
pronk   | YES [2]     | 10          | 500 K         | 110 s         | 4.7 K (std=5.3 K)     |


[1] ab utilizes HTTP/1.0 when calling the server, and in this mode the Keep-Alive header is ignored by Node.JS, quite surprisingly. Effectively, ab's "-k" option (to enable HTTP keep-alives) is a no-op if used against Node.JS based servers.
[2] pronk can't be configured to disable keep-alive mode.
[3] pronk is very inefficient when -c is low (less than the number of cores).
[4] httperf cannot be configured to maintain a fixed number of concurrent connections, so the concurrency shown is what was actually measured when connecting at --rate <rate>.


I learned something new today.

httperf is not always right for Node.JS

httperf has once again shown itself to be the best testing tool for high-performance web servers... when it is not used against non-keep-alive servers built on a high-performance GC-based runtime. In that unlucky scenario GC pauses trigger the creation of new FDs, which makes httperf choke a bit on managing its select() masks[4]. So httperf might not be able to squeeze the maximum performance out of non-keep-alive, low-latency Node.JS applications. ab is somewhat better, but not without its own quirks.

ab cannot fully test Node.JS

Apache Benchmark utilizes HTTP/1.0 when talking to the server. When the "-k" option is given, it just adds a "Connection: Keep-Alive" header. Unfortunately, the Node.JS server ignores keep-alive if it is not requested over an HTTP/1.1 connection[1]. This removes the opportunity to test how Node.JS based applications work in keep-alive mode.
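
A quick way to reproduce this behaviour (a hedged sketch; curl's -0 switch forces HTTP/1.0):

curl -0 -v -H 'Connection: keep-alive' http://127.0.0.1:8081/
# Node 0.6 replies with "Connection: close" and drops the socket,
# exactly as the telnet transcript in the comments below shows.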

siege is a decent tool

siege is a tad slower in the single-connection case, but becomes more competitive in higher-concurrency settings. No quirks discovered.

pronk is fun, but quirky

The relatively new pronk is not well suited for testing efficient (low-latency) high-performance servers. The reason is two-fold. First, pronk does not have a mode where it creates a new connection for each HTTP request (in this weirdness pronk is the complete opposite of ab, which can't keep connections alive). Yet this non-keep-alive operation is the most important mode to test in a high-load setting, since clients come and go in waves and there is comparatively little opportunity to reuse connections. Second, pronk is woefully inefficient (0.9 kRPS?!) with a single connection (-c 1), although the situation improves rather quickly with higher -c values.

So my takeaway is that I can't use pronk for testing Coser, our internal web server. To load it properly I had to resort to three physical boxes running httperf in parallel in HTTP/1.0 non-keep-alive mode.

Node.JS is flaky [on FreeBSD]

Running pronk -c 10000 against it causes Node.JS to quit:
events.js:48
        throw arguments[1]; // Unhandled 'error' event
        ^
Error: accept Unknown system errno 23
    at errnoException (net.js:632:11)
    at TCP.onconnection (net.js:817:24)
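
Errno 23 on FreeBSD is ENFILE ("too many open files in system"), i.e. the accept storm exhausts file descriptors and Node throws the resulting 'error' event unhandled. A hedged sketch of the limits worth checking and raising before re-running (the values are illustrative, not what I actually used):

sysctl kern.maxfiles kern.maxfilesperproc   # system-wide and per-process FD ceilings
ulimit -n                                   # the shell's per-process limit
sysctl kern.maxfiles=200000 kern.maxfilesperproc=100000   # raise them (as root)

Adding an 'error' listener on the server object would at least keep the process from dying on such events instead of re-throwing them unhandled.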

Comments

levgem
Jan. 4th, 2012 02:59 pm (UTC)
It is very strange that Node.js ignores the Keep-Alive header.

It uses http_parser.c, which properly tells whether or not to use keep-alive.
lionet
Jan. 4th, 2012 03:03 pm (UTC)
See for yourself:
[vlm@nala:~]>telnet localhost 8081
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.0
Connection: keep-alive
Host: localhost:8081

HTTP/1.1 200 OK
Connection: close

SuccessConnection closed by foreign host.
[vlm@nala:~]>telnet localhost 8081
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
Host: localhost:8081

HTTP/1.1 200 OK
Connection: keep-alive
Transfer-Encoding: chunked

7
Success
0

^]
telnet> Connection closed.
[vlm@nala:~]>


levgem
Jan. 4th, 2012 03:05 pm (UTC)
Please add the command lines used to launch httperf.
lionet
Jan. 4th, 2012 03:07 pm (UTC)
Many connections, non-keepalive:
httperf --uri / --server localhost --port 8081 --num-conns=500000 --rate 7200
Single connection, keep-alive:
httperf --uri / --server localhost --port 8081 --num-calls=500000
levgem
Jan. 4th, 2012 03:28 pm (UTC)
# ./httperf --port=9000 --uri=/index.html --recv-buffer=165536 --num-conns=1000 --num-calls=200 --rate=17000
httperf --client=0/1 --server=localhost --port=9000 --uri=/index.html --rate=17000 --send-buffer=4096 --recv-buffer=165536 --num-conns=1000 --num-calls=200
Maximum connect burst length: 38

Total: connections 1000 requests 173334 replies 173200 test-duration 17.657 s

Connection rate: 56.6 conn/s (17.7 ms/conn, <=1000 concurrent connections)
Connection time [ms]: min 8382.9 avg 14149.1 max 17598.0 median 16215.5 stddev 3358.7
Connection time [ms]: connect 2552.1
Connection length [replies/conn]: 200.000

Request rate: 9816.8 req/s (0.1 ms/req)
Request size [B]: 72.0

Reply rate [replies/s]: min 9214.0 avg 10173.6 max 10903.6 stddev 867.9 (3 samples)
Reply time [ms]: response 49.8 transfer 8.5
Reply size [B]: header 67.0 content 130000.0 footer 0.0 (total 130067.0)
Reply status: 1xx=0 2xx=173200 3xx=0 4xx=0 5xx=0

CPU time [s]: user 1.03 system 16.02 (user 5.8% system 90.7% total 96.6%)
Net I/O: 1246639.1 KB/s (10212.5*10^6 bps)

Errors: total 134 client-timo 0 socket-timo 0 connrefused 0 connreset 134
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0



Net I/O is, of course, the most interesting part.

It seems to me I need to dig around this: Reply time [ms]: response 49.8
What is Erlang doing for 50 ms?
lionet
Jan. 4th, 2012 04:24 pm (UTC)
First, let me suggest not setting --rate much higher than the actual "Connection rate" (56.6 conn/s). It is rather useless and leads to errors (you have 134). Make sure --rate is only slightly higher than the connection rate, while keeping errors at 0. That'll be your baseline.
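
A hedged sketch of such a baseline sweep, reusing the flags from your command (the rate values are illustrative):

for r in 60 80 100 150; do
  echo "--rate=$r"
  httperf --client=0/1 --server=localhost --port=9000 --uri=/index.html \
      --rate=$r --num-conns=1000 --num-calls=200 | egrep 'Connection rate|Errors: total'
done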

I just tested Coser on the same hardware I used in the main post. Here are my results with your --num-conns/num-calls settings:
httperf --client=0/1 --server=localhost --port=1234 --uri=/v1/bus/foo/channel/a --rate=650 --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=200
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE
Maximum connect burst length: 2

Total: connections 1000 requests 200000 replies 200000 test-duration 2.124 s

Connection rate: 470.9 conn/s (2.1 ms/conn, <=486 concurrent connections)
Connection time [ms]: min 7.7 avg 571.9 max 1044.9 median 632.5 stddev 256.0
Connection time [ms]: connect 1.3
Connection length [replies/conn]: 200.000

Request rate: 94170.8 req/s (0.0 ms/req)
Request size [B]: 82.0

Reply rate [replies/s]: min 0.0 avg 0.0 max 0.0 stddev 0.0 (0 samples)
Reply time [ms]: response 2.9 transfer 0.0
Reply size [B]: header 234.0 content 6.0 footer 0.0 (total 240.0)
Reply status: 1xx=0 2xx=200000 3xx=0 4xx=0 5xx=0

CPU time [s]: user 0.62 system 1.50 (user 29.3% system 70.6% total 99.9%)
Net I/O: 29612.3 KB/s (242.6*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
levgem
Jan. 4th, 2012 04:59 pm (UTC)
Your response is 6 bytes. Mine is 100 KB. Maybe this is the reason for the big difference?
lionet
Jan. 4th, 2012 05:02 pm (UTC)
I did not imply anything, just a sample. Of course your response is heavier. Of course my server does 10x more replies than yours. There's nothing much to learn from that though.

My main point was about the use of --rate, which is excessively large in your case.

levgem
Jan. 4th, 2012 05:44 pm (UTC)

httperf --client=0/1 --server=localhost --port=9000 --uri=/index.html --rate=650 --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=200
Maximum connect burst length: 4

Total: connections 1000 requests 200000 replies 200000 test-duration 12.214 s

Connection rate: 81.9 conn/s (12.2 ms/conn, <=982 concurrent connections)
Connection time [ms]: min 445.0 avg 7421.0 max 11187.2 median 7176.5 stddev 2539.9
Connection time [ms]: connect 2238.9
Connection length [replies/conn]: 200.000

Request rate: 16374.4 req/s (0.1 ms/req)
Request size [B]: 72.0

Reply rate [replies/s]: min 15055.5 avg 16673.8 max 18292.0 stddev 2288.5 (2 samples)
Reply time [ms]: response 25.9 transfer 0.0
Reply size [B]: header 62.0 content 6.0 footer 0.0 (total 68.0)
Reply status: 1xx=0 2xx=200000 3xx=0 4xx=0 5xx=0

CPU time [s]: user 0.38 system 11.26 (user 3.1% system 92.2% total 95.2%)
Net I/O: 2238.7 KB/s (18.3*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0



I want to understand why coser replies so much faster. What is the reason? Memory allocation or something else?
levgem
Jan. 4th, 2012 05:47 pm (UTC)
Your coser shows 1.3 ms to connect.

My benchmark is: Connection time [ms]: connect 2049.5


Why!
lionet
Jan. 4th, 2012 06:14 pm (UTC)
I suspect this is because Erlang is eager to accept connections and then is kind of lazy about processing them through your machinery. So in the end you quickly accept lots of connections, but then dispatch them slowly. Make sure you run with +K true, and try disabling SMP.
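
A hedged sketch of what that looks like on the erl command line (+K true enables kernel poll, -smp disable turns SMP off; keep whatever startup arguments you already use):

erl +K true -smp disable   # ... plus your usual -pa/-s startup arguments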

levgem
Jan. 4th, 2012 07:20 pm (UTC)
I'll try to measure it. Hope to fix it.
nponeccop
Jan. 11th, 2013 08:33 pm (UTC)
JFYI it is no longer the case so it's time for node to be retested :)
_slw
Jan. 4th, 2012 03:04 pm (UTC)
Unfortunately, all of the request generators I have tried are quite inefficient (CPU-wise) and offer little functionality.
Or they are so overcomplicated that I don't even want to touch them.
lionet
Jan. 4th, 2012 03:10 pm (UTC)
For starters, tell me what you need from them.

_slw
Jan. 4th, 2012 03:16 pm (UTC)
Many requests and a lot of traffic on little CPU. That is, I want BOTH a lot of requests (for a small file) AND many gigabits of throughput.

In other words, I need to turn a PC with a few gigabit NICs into a traffic generator (to load an external piece of network hardware).

I know how to make FreeBSD forward packets from one NIC to another (Linux, it seems, can't be made to do this properly), and nginx is efficient enough, but httperf appears to want considerably more resources, and the cores are not unlimited. I'd like to end up with 6-8 gigabits.
lionet
Jan. 25th, 2015 09:04 am (UTC)
Well, here it is at last: https://github.com/machinezone/tcpkali
thesz
Jan. 4th, 2012 03:39 pm (UTC)
Where's Haskell in here?
lionet
Jan. 4th, 2012 03:49 pm (UTC)
Pronk
synergy_ek
Jan. 4th, 2012 06:19 pm (UTC)
What about Tsung?
And why were the server and client on the same machine?
lionet
Jan. 4th, 2012 06:31 pm (UTC)
> What about Tsung?

I did not test Tsung in this experiment, but I know that speed is not its core strength. Being distributed and having more elaborate testing scenarios is. This is a different kind of tool, useful in its own right.

> And why were the server and client on the same machine?

Because the tests were conducted on a single machine. Despite this not being best practice, I think that in the first approximation the results of this experiment are going to be useful. Please don't pay attention to the absolute numbers, they are irrelevant. The more relevant pieces of data revealed by this test are:
  • httperf's outdated reliance on select() backfires in a non-keep-alive setting when operating against a GC-based runtime;
  • an ab and Node.js interoperability problem was noticed;
  • a pronk performance problem was discovered when the concurrency setting is small;
  • Node.JS is flaky [on FreeBSD].
All the above points are valid despite the tests being conducted on a single machine.

Also please note that the box is relatively uncontended and is equipped with a 4-core server-class Xeon processor (albeit with a shared cache). This alleviates the problem of bad scheduling to some degree.
alxzhr
Jan. 4th, 2012 06:40 pm (UTC)
You may want to check Siege (http://www.joedog.org/index/siege-manual).
lionet
Jan. 4th, 2012 07:24 pm (UTC)
Checked, added to the table, thanks!
jakobz
Jan. 5th, 2012 11:53 pm (UTC)
This is insane, like a magic trick. It seems obvious there is some trickery going on, but I want to understand where. I didn't make it into your company, but I don't regret it.
lionet
Jan. 6th, 2012 04:53 am (UTC)
I can't make any sense of what you wrote.
xana4ok
Jan. 20th, 2012 09:36 pm (UTC)
The httperf man page says:
    The calls in a burst are issued as follows: at first, a single call is issued. Once the reply to this first call has been fully received, all remaining calls in the burst are issued concurrently. The concurrent calls are issued either as pipelined calls on an existing persistent connection or as individual calls on separate connections. Whether a persistent connection is used depends on whether the server responds to the first call with a reply that includes a ``Connection: close'' header line. If such a line is present, separate connections are used.

So, this should do the trick:
httperf --server=localhost --port=80 --uri=/ --hog --wsess=100,100,0 --burst-length=100
Harri Paavola
Nov. 15th, 2012 06:13 am (UTC)
I ran similar tests yesterday and found that ab and siege perform more or less the same, while httperf performance varies a lot depending on the page file size: http://spage.fi/benchmark