What is Pbench?
Pbench is a harness that allows data collection from a variety of tools while running a benchmark. Pbench has some built-in scripts that run some common benchmarks, but the data collection can be run separately as well with a benchmark that is not built-in to Pbench, or a Pbench script can be written for the benchmark. Such contributions are more than welcome!
TL;DR - How to set up Pbench and run a benchmark
Prerequisite: Somebody has already done the server setup.
The following steps assume that only a single node participates in the benchmark run. If you want a multi-node setup, you have to read up on the --remote options of various commands (in particular, pbench-register-tool-set):
. /etc/profile.d/pbench-agent.sh
# or log out and log
back in
pbench-register-tool-set
pbench-user-benchmark -C test1 -- ./your_cmd.sh
pbench-move-results
For explanations and details, see subsequent sections.
How to install
See the install section for details.
Defaults
The benchmark scripts source the base script (/opt/pbench-agent/base) which sets a bunch of defaults:
pbench_run=/var/lib/pbench-agent
pbench_log=/var/lib/pbench-agent/pbench.log
date=`date "+%F_%H:%M:%S"`
hostname=`hostname -s`
results_repo=pbench@pbench.example.com
results_repo_dir=/pbench/public_html/incoming
ssh_opts='-o StrictHostKeyChecking=no'
These are now specified in the config file /opt/pbench-agent/config/pbench-agent.cfg.
Available tools
The configured default set of tools (what you would get by running pbench-register-tool-set) is:
- sar, iostat, mpstat, pidstat, proc-vmstat, proc-interrupts, perf
In addition, there are tools that can be added to the default set with pbench-register-tool:
- blktrace, cpuacct, dm-cache, docker, kvmstat, kvmtrace, lockstat, numastat, perf, porc-sched_debug, proc-vmstat, qemu-migrate, rabbit, strace, sysfs, systemtap, tcpdump, turbostat, virsh-migrate, vmstat
There is a default group of tools (that's what pbench-register-tool-set uses), but tools can be registered in other groups using the --group option of pbench-register-tool. The group can then be started and stopped using pbench-start-tools and pbench-stop-tools using their --group option.
Additional tools can be registered:
pbench-register-tool --name blktrace
or unregistered (e.g. some people prefer to run without perf):
pbench-unregister-tool --name perf
Note that perf is run in a "low overhead" mode with options "record -a –freq=100", but if you want to run it differently, you can always unregister it and register it again with different options:
pbench-unregister-tool --name=perf
pbench-register-tool --name=perf -- --record-opts="record -a
--freq=200"
Tools can be also be registered, started, and stopped on remote hosts (see the --remote option described in "What does --remote do?" in FAQ section).
Available benchmark scripts
Pbench provides a set of pre-packaged scripts to run some common benchmarks using the collection tools and other facilities that pbench provides. These are found in the bench-scripts directory of the Pbench installation (/opt/pbench-agent/bench-scripts by default). The current set includes:
- pbench fio
- pbench-linpack
- pbench-specjbb2005
- pbench-uperf
-
pbench-user-benchmark
- See Running Pbench collection tools with an arbitrary benchmark below for more information
You can run any of these with the --help option to get basic information about how to run the script. Most of these scripts accept a standard set of generic options, some semi-generic ones that are common to a bunch of benchmarks, as well as some benchmark specific options that vary from benchmark to benchmark.
The generic options are:
--help
show the set of options that the benchmark accepts.
--config
the name of the testing configuration (user specified).
--tool-group
the name of the tool group specifying the tools to run during
execution of the benchmark.
--install
just install the benchmark (and any other needed packages) - do not
run the benchmark.
The semi-generic ones are:
--test-types
the test types for the given benchmark - the values are
benchmark-specific and can be obtained using --help.
--runtime
maximum runtime in seconds.
--clients
list of hostnames (or IPs) of systems that run the client (drive the
test).
--samples
the number of samples per iteration.
--max-stddev
the percent maximum standard deviation allowed in order to consider
the iteration to pass.
--max-failures
the maximum number of failures to achieve the allowed standard
deviation.
--postprocess-only
--run-dir
--start-iteration-num
--tool-label-pattern
Benchmark-specific options are called out in the following sections for each benchmark.
Note that in some of these scripts the default tool group is hard-wired: if you want them to run a different tool group, you need to edit the script.
pbench-fio
Iterations are the cartesian product targets X test-types X block-sizes. More information on many of the following can be obtained from the fio man page.
--direct
O_DIRECT enabled or not (1/0) - default is 1.
--sync
O_SYNC enabled or not (1/0) - default is 0.
--rate-iops
IOP rate not to be exceeded (per job, per client)
--ramptime
seconds - time to warm up test before measurement.
--block-sizes
list of block sizes - default is 4, 64, 1024.
--file-size
fio will create files of this size during the job run.
--targets
file locations (list of directory/block device).
--job-mode
serial/concurrent - default is concurrent.
--ioengine
any IO engine that fio supports (see the fio man page) - default is
psync.
--iodepth
number of I/O units to keep in flight against the file.
--client-file
file containing list of clients, one per line.
--numjobs
number of clones (processes/threads performing the same workload) of
this job - default is 1.
--job-file
if you need to go beyond the recognized options, you can use a fio
job file.
pbench-linpack
TBD
pbench-specjbb2005
TBD
pbench-uperf
--kvm-host
--message-sizes
--protocols
--instances
--servers
--server-nodes
--client-nodes
--log-response-times
pbench-user-benchmark
TBD
Utility Scripts
This section is needed as preparation for the Second steps section below.
Pbench uses a bunch of utility scripts to do common operations. There is a common set of options for some of these: --name to specify a tool, --group to specify a tool group, --with-options to list or pass options to a tool, --remote to operate on a remote host (see entries in the FAQ section for more details on these options).
The first set is for registering and unregistering tools and getting some information about them:
pbench-list-tools
list the tools in the default group or in the specified group; with
the –name option, list the groups that the named tool is in. TBD: how do
you list all available tools whether in a group or not?
pbench-register-tool-set
call pbench-register-tool on each tool in the default list.
pbench-register-tool
add a tool to a tool group (possibly remotely).
OBSOLETE (see below) pbench-unregister-tool
remove a tool from a tool group (possibly remotely).
pbench-clear-tools
remove a tool or all tools from a specified tool group (including
remotely). Used with a --name option, it replaces
pbench-unregister-tool.
The second set is for controlling the running of tools – pbench-start-tools and pbench-stop-tools, as well as pbench-postprocess-tools below, take --group, --dir and --iteration options: which group of tools to start/stop/postprocess, which directory to use to stash results and a label to apply to this set of results. pbench-kill-tools is used to make sure that all running tools are stopped: having a bunch of tools from earlier runs still running has been known to happen and is the cause of many problems (slowdowns in particular):
pbench-start-tools
start a group of tools, stashing the results in the directory
specified by --dir.
pbench-stop-tools
stop a group of tools
pbench-kill-tools
make sure that no tools are running to pollute the environment.
The third set is for handling the results and doing cleanup:
pbench-postprocess-tools
run all the relevant postprocessing scripts on the tool output -
this step also gathers up tool output from remote hosts to the local
host in preparation for copying it to the results repository.
pbench-clear-results
start with a clean slate.
pbench-copy-results
copy results to the results repo.
pbench-move-results
move the results to the results repo and delete them from the local
host.
pbench-edit-prefix
change the directory structure of the results (see the
Accessing results on the web section below for
details).
pbench-cleanup
clean up the pbench run directory - after this step, you will need
to register any tools again.
pbench-register-tool-set, pbench-register-tool and pbench-unregister-tool can also take a --remote option (see What does --remote do?) in FAQ section in order to allow the starting/stopping of tools and the postprocessing of results on multiple remote hosts.
There is a set of miscellaneous tools for doing various and sundry things - although the name of the script indicates its purpose, if you want more information on these, you will have to read the code:
These are used by various pieces of Pbench. There is also a contrib directory that contains completely unsupported tools that various people have found useful.
Second Steps
WARNING: It is highly recommended that you use one of the pbench-<benchmark> scripts for running your benchmark. If one does not exist already, you might be able to use the pbench-user-benchmark script to run your own script. The advantage is that these scripts already embody some conventions that Pbench and associated tools depend on, e.g. using a timestamp in the name of the results directory to make the name unique. If you cannot use pbench-user-benchmark and a pbench-<benchmark> script does not exist already, consider writing one or helping us write one. The more we can encapsulate all these details into generally useful tools, the easier it will be for everybody: people running it will not need to worry about all these details and people maintaining the system will not have to fix stuff because the script broke some assumptions. The easiest way to do so is to crib an existing pbench-<benchmark> script, e.g pbench-fio.
Once collection tools have been registered, the work flow of a benchmark script is as follows:
- create a benchmark_results directory
- start the collection tools
- run the benchmark
- stop the collection tools
- postprocess the collection tools data
The tools are started with an invocation of pbench-start-tools like this:
pbench-start-tools --group=$group --iteration=$iteration --dir=$benchmark_tools_dir
where the group is usually "default" but can be changed to taste as described above, iteration is a benchmark-specific tag that disambiguates the separate iterations in a run (e.g. for pbench-fio it is a combination of a count, the test type, the block size and a device name), and the benchmark_tools_dir specifies where the collection results are going to end up (see the section for much more detail on this).
The stop invocation is parallel, as is the postprocessing invocation:
pbench-stop-tools --group=$group --iteration=$iteration --dir=$benchmark_tools_dir
pbench-postprocess-tools --group=$group --iteration=$iteration --dir=$benchmark_tools_dir
Benchmark scripts options
Generally speaking, benchmark scripts do not take any pbench-specific options except --config (see What does --config do? in FAQ section). Other options tend to be benchmark-specific.
Collection tools options
--help can be used to trigger the usage message on all of the tools (even though it's an invalid option for many of them). Here is a list of gotcha's:
pbench-register-tool --name=blktrace [--remote=foo] -- --devices=/dev/sda,/dev/sdb
There is no default and leaving it empty causes errors in postprocessing (this should be flagged).
Utility script options
Note that pbench-move-results, pbench-copy-results and pbench-clear-results always assume that the run directory is the default /var/lib/pbench-agent.
pbench-move-results and pbench-copy-results now (starting with Pbench version 0.31-108gf016ed6) take a --prefix option. This is explained in the Accessing results on the web section below.
Note also that pbench-start/stop/postprocess-tools must be called with exactly the same arguments. The built-in benchmark scripts do that already, but if you go your own way, make sure to follow this dictum.
--dir
specify the run directory for all the collections tools. This
argument must be used by pbench-start/stop/postprocess-tools, so
that all the results files are in known places:
pbench-start-tools --dir=/var/lib/pbench-agent/foo
pbench-stop-tools --dir=/var/lib/pbench-agent/foo
pbench-postprocess-tools --dir=/var/lib/pbench-agent/foo
--remote
specify a remote host on which a collection tool (or set of
collection tools) is to be registered:
pbench-register-tool --name=< tool> --remote=< host>
Running Pbench collection tools with an arbitrary benchmark
If you want to take advantage of Pbench's data collection and other goodies, but your benchmark is not part of the set above (see Available benchmark scripts), or you want to run it differently so that the pre-packaged script does not work for you, that's no problem (but, if possible, heed the WARNING above). The various Pbench phases can be run separately and you can fit your benchmark into the appropriate slot:
group=default
benchmark_tools_dir=TBD
pbench-register-tool-set --group=$group
pbench-start-tools --group=$group --iteration=$iteration --dir=$benchmark_tools_dir
<run your benchmark>
pbench-stop-tools --group=$group --iteration=$iteration --dir=$benchmark_tools_dir
pbench-postprocess-tools --group=$group --iteration=$iteration --dir=$benchmark_tools_dir
pbench-copy-results
Often, multiple experiments (or "iterations") are run as part of a single run. The modified flow then looks like this:
group=default
experiments="exp1 exp2 exp3"
benchmark_tools_dir=TBD
pbench-register-tool-set --group=$group
for exp in $experiments ;do
pbench-start-tools --group=$group --iteration=$exp
<run the experiment>
pbench-stop-tools --group=$group --iteration=$exp
pbench-postprocess-tools --group=$group --iteration=$exp
done
pbench-copy-results
Alternatively, you may be able to use the pbench-user-benchmark script as follows:
pbench-user-benchmark --config="specjbb2005-4-JVMs" -- my_benchmark.sh
which is going to run my_benchmark.sh in the <run your benchmark> slot
above. Iterations and such are your responsibility.
pbench-user-benchmark can also be used for a somewhat more
specialized scenario: sometimes you just want to run the collection
tools for a short time while your benchmark is running to get an
idea of how the system looks. The idea here is to use
pbench-user-benchmark
to run a sleep of the appropriate duration in parallel with your
benchmark:
pbench-user-benchmark --config="specjbb2005-4-JVMs" -- sleep 10
will start data collection, sleep for 10 seconds, then stop data
collection and gather up the results. The config argument is a tag to
distinguish this data collection from any other: you will probably want
to make sure it's unique.
This works well for one-off scenarios, but for repeated usage on well
defined phase changes you might want to investigate Triggers.
Remote hosts
Multihost benchmarks
Usually, a multihost benchmark is run using a host that acts as the "controller" of the run. There is a set of hosts on which data collection is to be performed while the benchmark is running. The controller may or may not be itself part of that set. In what follows, we assume that the controller has password-less ssh access to the relevant hosts.
The recommended way to run your workload is to use the generic pbench-user-benchmark script. The workflow in that case is:
for host in $hosts ;do
pbench-register-tool-set --remote=$host
done
If you cannot use the pbench-user-benchmark script, then the process becomes more manual. The workflow is:
Customizing
Some characteristics of Pbench are specified in config files and can be customized by adding your own config file to override the default settings. TBD
Results handling
Accessing results on the web
This section describes how to get to your results using a web browser. It describes how pbench-move-results moves the results from your local controller to a centralized location and what happens there. It also describes the --prefix option to pbench-move-results (and pbench-copy-results) and a utility script, pbench-edit-prefix, that allows you to change how the results are viewed.
Where to go to see results
Where pbench-move/copy-results copies the results is site-dependent. Check with the admin who set up the Pbench server and provided you with the configuration file for the pbench-agent installation.
Advanced topics
Triggers
Triggers are groups of tools that are started and stopped on specific events. They are registered with pbench-register-tool-trigger using the --start-trigger and --stop-trigger options. The output of the benchmark is piped into the pbench-tool-trigger tool which detects the conditions for starting and stopping the specified group of tools.
There are some commands specifically for triggers:
pbench-register-tool-trigger
register start and stop triggers for a
tool group.
pbench-list-triggers
list triggers and their start/stop
criteria.
pbench-tool-trigger
this is a Perl script that looks for
the start-trigger and end-trigger markers in the benchmark's output,
starting and stopping the appropriate group of tools when it finds the
corresponding marker.