While working on finding performance bottlenecks in a project at work, I had to create a load testing framework. Now, if we were talking Apache, there's no dearth of tools, the most notable being http_load and a new tool found by one of my colleagues, pylot.
No such luck, however, for Thrift. So I set out to create such a tool by myself. I took one of the -remote clients that Thrift generates and started hacking on it until I got what I wanted, including:
- Ability to load arbitrary service clients (Python)
- Ability to load API calls via a file (like how http_load takes a list of URLs)
- Ability to spawn multiple client threads
- Ability to throttle the rate at which calls are made
- Ability to specify how long the load test should run
- Ability to specify how many total requests to make
Here's how to use it:
Usage: ./thriftbench -h host:port -s Service [-t num_threads] [-d duration (seconds)] [-c num_calls] [-r max_fetch_rate] command_file
A sample command_file - essentially one tuple per line, with the first field being the API call name and the remaining fields being its arguments - could contain the following:
("sayHello", "arg1", {"arg2_key1": "val1", "arg2_key2": "val2"}, ThriftServiceObject({"init1": "a", "init2": 2}))
Here's a sample call to run the benchmark for 10 seconds with 10 parallel clients. This run calls only one API endpoint, but you could include more of them and the stats will be reported separately for each. To create a distribution ratio, e.g. 5:1 calls to sayHello vs sayHola, you could put 5 entries in the command_file for sayHello and one for sayHola.
$ ./thriftbench -h localhost:9090 -s HelloService -t 10 -d 10 ~/calllist
sayHello: 002454 fetches (244.957372/sec). 00000 errors. 40.81 ms (avg). 39.00 ms (50th %ile). 62.16 ms (90th %ile). 4.51 ms (min). 249.13 ms (max)
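For the curious, the core of the tool is conceptually just a bunch of client threads replaying the command list and recording latencies. Here's a rough, simplified sketch of what one worker loop might look like - the real script does more bookkeeping, and names like run_worker and the crude throttle are illustrative only:

import time

def run_worker(client, commands, duration, max_rate, latencies, errors):
    # Illustrative worker loop: replays commands for `duration` seconds,
    # throttled (crudely) to at most `max_rate` calls/sec per thread,
    # recording per-call latency keyed by method name.
    deadline = time.time() + duration
    min_interval = 1.0 / max_rate if max_rate else 0.0
    i = 0
    while time.time() < deadline:
        method_name, args = commands[i % len(commands)]
        i += 1
        start = time.time()
        try:
            getattr(client, method_name)(*args)
            latencies.append((method_name, time.time() - start))
        except Exception:
            errors.append(method_name)
        # Sleep off whatever is left of this call's time budget.
        elapsed = time.time() - start
        if elapsed < min_interval:
            time.sleep(min_interval - elapsed)

The fetches/sec, average and percentile figures in the output above are then just aggregates computed over the collected latencies, grouped per API call.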
One limitation of the client is that it currently needs the "ttypes.py" file generated by Thrift in its import path, along with the service client skeletons. That's not such a big problem in itself, but the import happens without any namespace prepended to it. The other big problem is that it uses Python's threads for the parallel clients. Don't go crazy with the numbers in the '-t' parameter!
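Concretely, the loading is along these lines (again a sketch, not the exact code): the service module is imported by name at runtime, which is why the generated files have to sit directly on the import path rather than under a package:

def load_service_client(service_name):
    # Sketch of namespace-less dynamic loading: expects e.g. HelloService.py
    # and ttypes.py to live directly on sys.path, with no package prefix.
    module = __import__(service_name)   # e.g. __import__("HelloService")
    return module.Client                # the Thrift-generated client class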
BTW, does anyone talk about the multi-core performance of Thrift? At least with Python, the multi-threaded Thrift server only creates one process, which (thanks to the GIL) uses only one core no matter how many cores your machine has. Sucketh.
Here's the script. Enjoy!
The script has now been released under the Apache License 2.0. Head to http://code.google.com/p/thriftbench :-)