@yuvalif
Last active March 31, 2026 15:31
radosgw-admin UX and documentation improvements

Background

Currently, documenting radosgw-admin commands is a manual and error-prone process. After implementing a new command, the "usage" string must be updated accordingly in the code, and the usage text can easily end up out of sync with the actual command and its arguments. The man page then needs to be updated manually, as does the admin guide. Any references to this command elsewhere in our documentation also need to be updated manually.

We would like to solve this problem with a more programmatic approach:

  • Declare command & argument semantics explicitly in code using a cli/args framework that supports auto-generation of context-aware "usage" docs
  • Investigate how this can then be used to auto-generate the man page, admin guides and any other related documentation (maybe using some python code)
  • See if we can easily reference these command descriptions in other places in our documentation
  • All while maintaining backward compatibility with the existing behavior
  • Some more information is available on this issue.
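
As a rough illustration of the "single source of truth" idea in the list above, the sketch below declares commands in one table and renders both a usage text and man-page-style RST entries from it. The `ArgSpec` and `CommandSpec` types, the two render functions and the output layouts are all hypothetical; they are not taken from the radosgw-admin code or from any particular CLI framework.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical declarative description of one argument (illustration only).
struct ArgSpec {
  std::string name;  // e.g. "--uid"
  std::string help;
  bool required;
};

// Hypothetical declarative description of one command (illustration only).
struct CommandSpec {
  std::string path;  // full verb path, e.g. "user create"
  std::string help;
  std::vector<ArgSpec> args;
};

// Render a plain-text usage block from the command table.
std::string render_usage(const std::vector<CommandSpec>& cmds) {
  std::ostringstream out;
  out << "usage: radosgw-admin <cmd> [options...]\n";
  for (const auto& c : cmds) {
    out << "  " << c.path << "    " << c.help << "\n";
    for (const auto& a : c.args) {
      out << "    " << a.name << (a.required ? " (required)" : "")
          << "    " << a.help << "\n";
    }
  }
  return out.str();
}

// Render RST entries from the same table (assumed entry layout).
std::string render_rst(const std::vector<CommandSpec>& cmds) {
  std::ostringstream out;
  for (const auto& c : cmds) {
    out << ":command:`" << c.path << "`\n"
        << "  " << c.help << "\n\n";
  }
  return out.str();
}
```

Because both renderers read the same table, adding a `CommandSpec` entry would update the usage text and the generated documentation together. A real framework would also drive the argument parsing itself from these declarations.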

Evaluation Stage

Step 1 - Build Ceph and Run Basic Test

First, you need a Linux-based development environment. At a minimum, you need a 4-core CPU, 8 GB of RAM and 50 GB of available disk space. Unless you already have a Linux distro you like, I would recommend choosing from:

  • Fedora (42/43) - my favorite!
  • Ubuntu (24.04 LTS)
  • WSL (Windows Subsystem for Linux), though it would probably take much longer...
  • RHEL9/Centos9
  • Other Linux distros - try at your own risk :-)

Once you have that up and running, you should clone the Ceph repo from GitHub. If you don't know what GitHub and git are, this is the right time to close these gaps :-) You should also have a GitHub account, so you can later share your work on the project.

Before building, you can install any missing system dependencies with:

./install-deps.sh

Note that the first build may take a long time, so the following cmake parameters could be used to minimize the build time. With a fresh ceph clone use the following:

./do_cmake.sh -DBOOST_J=$(nproc) -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DWITH_MGR_DASHBOARD_FRONTEND=OFF \
  -DWITH_DPDK=OFF -DWITH_SPDK=OFF -DWITH_SEASTAR=OFF -DWITH_CEPHFS=OFF -DWITH_RBD=OFF -DWITH_KRBD=OFF -DWITH_CCACHE=OFF -GNinja

Then invoke the build process (using ninja) from within the build directory (created by do_cmake.sh). Assuming the build was completed successfully, you can run the unit tests.

Now you are ready to run the ceph processes, as explained in the Ceph project README. You probably would also like to check the developer guide and learn more on how to build Ceph and run it locally.

Now it is time to play around a little with the radosgw-admin commandline tool:

  • Create a bucket using standard tools like the aws CLI or s3cmd which would be owned by the default user created as part of vstart
  • Upload objects to that bucket
  • Create a new user using radosgw-admin
  • Modify bucket ownership to the new user using radosgw-admin
  • Use radosgw-admin to get bucket stats and verify that ownership changed
  • Try to upload an object using the default credentials and verify it fails
  • Verify that using the credentials of the new user allows you to upload objects to the bucket

If you run into issues with step 1, please use the rgw-devel slack channel for help (you can find details of the Ceph slack workspace on the Ceph website).

Step 2

In this step, please try to find mismatches (e.g. missing commands, missing parameters) between these:

  • the radosgw-admin code
  • the "usage" message (i.e. what is printed to the screen with radosgw-admin -h)
  • the radosgw-admin man page

If you wish, you may also try to fix some of the issues that you find in a PR.

In your proposal, please provide a list of the issues you find, with links to any fix PRs you may have created.

In the proposal, please perform an initial survey of possible commandline argument frameworks that we might want to evaluate. Please use at least some of the following criteria for evaluation:

  • Is it an established and well maintained framework (e.g. boost)?
  • Is it a header only framework?
  • Does it allow for nested commands/verbs as we have (e.g. radosgw-admin bucket list)?
  • Can it generate the "usage" docs automatically?
  • Can it give command/verb specific help (e.g. radosgw-admin bucket --help)?
  • Does it support per-command argument declaration, and generate the errors automatically when required arguments are missing?
  • Does it support the existing behavior of listing the possible next verbs in the hierarchy? For example, if you use the verb "bucket" followed by "logging", the output is:
radosgw-admin bucket logging
ERROR: Unknown command
Expected one of the following:
  flush
  info
  list
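
The last criterion above can be pictured as a walk over a tree of verbs. The following is a minimal, framework-independent sketch of that behavior; `CommandNode`, `add_command` and `check_command` are hypothetical names, and the sketch assumes that only leaf nodes are complete commands, which is a simplification of the real tool.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical command tree node. std::map keeps child verbs sorted,
// which yields the alphabetical listing without extra work.
struct CommandNode {
  std::map<std::string, std::unique_ptr<CommandNode>> children;
};

// Register a full verb path such as {"bucket", "logging", "flush"}.
void add_command(CommandNode& root, const std::vector<std::string>& path) {
  CommandNode* node = &root;
  for (const auto& verb : path) {
    std::unique_ptr<CommandNode>& child = node->children[verb];
    if (!child) {
      child.reset(new CommandNode());
    }
    node = child.get();
  }
}

// Follow `args` through the tree. Returns an empty string if they reach a
// leaf (a complete command), otherwise an error listing the valid next verbs.
std::string check_command(const CommandNode& root,
                          const std::vector<std::string>& args) {
  const CommandNode* node = &root;
  for (const auto& verb : args) {
    auto it = node->children.find(verb);
    if (it == node->children.end()) {
      break;  // unknown verb: report the options available at this level
    }
    node = it->second.get();
  }
  if (node->children.empty()) {
    return "";
  }
  std::string msg = "ERROR: Unknown command\nExpected one of the following:\n";
  for (const auto& child : node->children) {
    msg += "  " + child.first + "\n";
  }
  return msg;
}
```

After registering the three `bucket logging` verbs shown above, calling `check_command` with `{"bucket", "logging"}` would produce the same message as the example output. A candidate framework would need to either provide this behavior or expose enough of its subcommand tree to implement it.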

If you have questions or need clarifications for Step 2, please contact a mentor by email.

@cheese-cakee

Hi Yuval,

Here is my evaluation progress for Project #1.

Documentation Drift Audit

I compared 261 commands in radosgw-admin.cc (the all_cmds table) against 153 commands documented in doc/man/8/radosgw-admin.rst.

120 commands exist in the code but are not documented in the man page:

| Category | Count | Examples |
|---|---|---|
| Account | 6 | account create/get/list/modify/rm/stats |
| Bucket advanced | 14 | bucket layout, bucket check olh, bucket check unlinked, bucket set-min-shards, bucket resync encrypted multipart |
| Bucket sync | 6 | bucket sync checkpoint/info/init/markers/run/status |
| Dedup | 8 | dedup stats/estimate/exec/pause/restart/resume/throttle |
| Datalog | 5 | datalog autotrim/prune/semaphore list/reset/type |
| MFA | 6 | mfa check/create/get/list/remove/resync |
| Notification | 3 | notification get/list/rm |
| Ratelimit | 8 | ratelimit get/set/enable/disable, global ratelimit get/set/enable/disable |
| Role/role-policy | 11 | role delete/update, role-policy attach/delete/detach/get/list/put, role-trust-policy modify |
| Script | 8 | script get/put/rm, script-package add/list/reload/rm |
| Sync groups | 12 | sync group create/flow/pipe/remove, sync info/policy get/status |
| User policy | 3 | user policy attach/detach/list attached |
| Other | 23+ | bilog autotrim/status, mdlog autotrim/fetch, reshard bucket/stale-*, reshardlog *, objects expire-stale, olh get/readlog, period delete, realm default rm, zone default/delete, zonegroup placement get, zones list, usage clear |

I also found 12 man page entries that do not map to any command in the code: metadata get/list/put/rm, period rm, pool add/rm, pools list, role modify/rm, role-policy rm, zone rm. These are likely aliases or outdated entries.
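
Once both command lists are extracted, the comparison itself reduces to two set differences. The sketch below shows only that step; `only_in` is a hypothetical helper, and extracting the lists from radosgw-admin.cc and the RST file is assumed to have been done separately.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Return the entries of `a` that are missing from `b`, in sorted order.
// std::set iterates in sorted order, as std::set_difference requires.
std::vector<std::string> only_in(const std::set<std::string>& a,
                                 const std::set<std::string>& b) {
  std::vector<std::string> out;
  std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                      std::back_inserter(out));
  return out;
}
```

For a small illustrative sample, with `{"account create", "bucket list", "zone delete"}` as the code set and `{"bucket list", "zone rm"}` as the man page set, `only_in(code, man)` yields the undocumented commands and `only_in(man, code)` yields the man page entries with no matching command.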

Root cause: The usage() function in radosgw-admin.cc is 400+ lines of manual cout statements. The man page is a separate RST file maintained by hand. When new commands are added, both must be updated independently. Drift is inevitable.

CLI Framework Survey

I evaluated four frameworks against the seven criteria from the project description:

| Criteria | CLI11 | cxxopts | Boost.PO | argparse |
|---|---|---|---|---|
| Established/maintained | Yes (4.2k stars, University of Cincinnati, active releases) | Yes (4.7k stars) | Yes (part of Boost) | Moderate (3.4k stars) |
| Header-only | Yes | Yes | No | Yes |
| Nested subcommands | Native support | No (manual parsing) | No (manual dispatch) | Native support |
| Auto-generates usage docs | Yes, extensible formatter | Basic | Basic | Good |
| Per-subcommand help | Yes (radosgw-admin bucket --help) | No | No | Yes |
| Per-command arg declaration with auto-error | Yes | Yes | Yes | Yes |
| C++ standard required | C++11 | C++11 | C++11 | C++17 |

CLI11 is the strongest candidate for radosgw-admin. It supports nested subcommands natively (e.g. bucket sync status), generates per-command help automatically, and has an extensible formatter class that can be used to produce man pages programmatically. It is C++11 compatible and header-only, so it integrates without build system changes.

Example of how a bucket sync status command would look with CLI11:

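#include <string>
#include <CLI/CLI.hpp>

int main(int argc, char** argv) {
  CLI::App app{"radosgw-admin"};
  app.require_subcommand(1);

  // Nested verbs: radosgw-admin bucket sync status
  CLI::App* bucket = app.add_subcommand("bucket", "Bucket operations");
  CLI::App* sync = bucket->add_subcommand("sync", "Bucket sync operations");
  sync->require_subcommand(1);
  CLI::App* status = sync->add_subcommand("status", "Show sync status");

  // Options are declared on the subcommand they belong to
  std::string bucket_name;
  bool detailed = false;
  status->add_option("-b,--bucket", bucket_name, "Bucket name")->required();
  status->add_flag("-d,--detailed", detailed, "Show detailed status");

  // Prints help or an error message and returns from main if parsing stops
  CLI11_PARSE(app, argc, argv);
  return 0;
}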

The key benefit is that the command tree, help text, argument validation, and man page generation all come from the same source of truth. Adding a new command updates everything at once.

PRs

  • PR #67984: Fixed s->req_id to s->trans_id in rgw_bucket_logging.cc (reviewed, awaiting test results)
  • PR #68097: Changed ldpp_dout to std::cout in sync_checkpoint.cc (pending review)
  • s3-tests PR #729: Added test_bucket_logging_request_id (pending review)

Happy to discuss any of this further.
