Fabric configuration is currently implemented around a configuration library called viper. Viper reads configuration from files, environment variables, and flags, and exposes an API that uses dot qualified keys to reference the configuration values (think System Properties on steroids).
When configuration is read from files, the segments of the configuration key are used to walk config file stanzas to the data. Values read from the configuration file can be overridden by setting an environment variable that maps to the configuration key. Config values can also be sourced from flags. Flags take precedence over environment variables, which in turn take precedence over values sourced from files.
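The lookup order described above can be sketched with the standard library alone. This is not viper's implementation; the names envKey, walk, and lookup are hypothetical, and the "core" prefix with upper-cased, underscore-joined keys is assumed from the CORE_PEER_* variables that appear later in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// envKey maps a dot qualified key such as "peer.address" to an environment
// variable name such as "CORE_PEER_ADDRESS" (assumed mapping).
func envKey(prefix, key string) string {
	return strings.ToUpper(prefix + "_" + strings.ReplaceAll(key, ".", "_"))
}

// walk follows the segments of a dot qualified key through nested maps,
// the way key segments walk config file stanzas.
func walk(doc map[string]interface{}, key string) (string, bool) {
	var cur interface{} = doc
	for _, seg := range strings.Split(key, ".") {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return "", false
		}
		if cur, ok = m[seg]; !ok {
			return "", false
		}
	}
	s, ok := cur.(string)
	return s, ok
}

// lookup applies the precedence: flag, then environment, then file.
func lookup(key, prefix string, flags, env map[string]string, file map[string]interface{}) (string, bool) {
	if v, ok := flags[key]; ok {
		return v, true
	}
	if v, ok := env[envKey(prefix, key)]; ok {
		return v, true
	}
	return walk(file, key)
}

func main() {
	file := map[string]interface{}{
		"peer": map[string]interface{}{"address": "0.0.0.0:7051"},
	}
	env := map[string]string{"CORE_PEER_ADDRESS": "peer0.org1.example.com:7051"}

	// The environment value wins over the file value because no flag is set.
	v, _ := lookup("peer.address", "core", nil, env, file)
	fmt.Println(v)
}
```

Passing the flag, environment, and file sources as arguments keeps the sketch free of global state, which is the property the rest of this document argues for.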
Most of the issues we have with our configuration aren't problems with viper as much as how we use it.
Viper provides a function based API that makes it easy for Fabric components to retrieve configuration values. Unfortunately, the API is so easy to use that it has proliferated throughout the code base like a virus. What was initially a light touch, easy to use pattern has become a core component with several issues:
- Config is a global singleton that impacts concurrent testing of multiple config values
- Easy access to global configuration resulted in code that could not be explicitly configured without using the viper API
- Creation of "utility" layers on top of viper resulted in multiple entry points to the configuration data
- Config information is spread throughout the system without good documentation or a single source for the config schema
While viper enables programs to set default configuration values by calling SetDefault, Fabric has chosen to use "sample configuration" documents as configuration defaults. This results in poor configuration defaults and less than ideal sample config documents.
In many cases, default values were chosen based on assumptions in test cases rather than utility in production scenarios.
Even though the orderer and the peer use viper for configuration, are documented together, and live in the same source tree, they use similar, but different patterns to obtain configuration information.
For example, core.yaml uses camelCase for configuration keys while the orderer's orderer.yaml uses PascalCase. While the difference isn't particularly significant[1], the arbitrary case difference makes it harder to template shared configuration values between the two.
In TLS configuration, the orderer allows users to specify a PEM encoded certificate block or a path to a file. While this is very flexible, using "magic" instead of separate keys requires non-standard config processing and error handling.
Fabric-CA also uses viper but, instead of using camelCase or PascalCase, it uses lowercase for configuration keys. Its default configuration is also handled differently: if a config file does not exist, it will write one to the working directory for future customization.
There are also places where the names of required configuration values differ for no good reason. An example of this is the MSP configuration path. In the peer it's called mspConfigPath while in the orderer it's called LocalMSPDir.
The orderer configuration attempted to avoid some of the viper proliferation problems that the peer suffers from. As part of that, a new viperutil package with an EnhancedExactUnmarshal function was created. This package uses a combination of reflection, viper, and mapstructure to discover viper configuration keys.
Unfortunately, the road to hell is paved with good intentions. In addition to violating the "clear is better than clever" proverb, the config parsing implementation of the orderer ended up relying on case-preserving behavior in viper that was deemed to be a bug. This bug-as-feature behavior pinned us to an ancient version of viper.
Historically, Fabric has primarily been distributed as docker images. Since these images only contain the sample, default configuration values, environment variables were used to override the defaults. This is a perfectly reasonable mechanism; however, these overrides quickly got out of hand. Take, for example, a command we document in our "build your first network" sample:
CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/users/Admin@org1.example.com/msp
CORE_PEER_ADDRESS=peer0.org1.example.com:7051
CORE_PEER_LOCALMSPID="Org1MSP"
CORE_PEER_TLS_ROOTCERT_FILE=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/tls/ca.crt
In addition to the odd gopath references, the reliance on explicit, repeated, configuration with differing values while working across peers is error prone and confusing - especially when using command line tools.
Our configuration model and viper further complicate things when mapping configuration values to environment variables. For example, the list of client root CA certificates for the operations service looks like this in config documents:
operations:
  tls:
    # paths to PEM encoded ca certificates to trust for client authentication
    clientRootCAs:
      files: []
The variable associated with this config is prefix_OPERATIONS_TLS_CLIENTROOTCAS_FILES. Since the key is associated with a list, how should the list be encoded? Well, in the peer, it looks like this:
CORE_OPERATIONS_TLS_CLIENTROOTCAS_FILES: '/certs/tls/cacerts/cacert.pem /certs/msp/operationscerts/operationscert-1.pem'
and in the orderer it looks like this:
ORDERER_OPERATIONS_TLS_CLIENTROOTCAS: '[/certs/tls/cacerts/cacert.pem,/certs/msp/operationscerts/operationscert-1.pem]'
Ignoring the slight variation in environment key name, the encoding of the values is quite different. The peer requires a space separated list of files while the orderer requires a comma separated list of values enclosed in square brackets.
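To make the difference concrete, here is a minimal sketch of what parsing each encoding involves. The function names are hypothetical and this is not the decoding code used by the peer or the orderer; it only shows that one environment value format cannot be read by the other's parser.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePeerList splits the space separated form used by the peer.
func parsePeerList(v string) []string {
	return strings.Fields(v)
}

// parseOrdererList splits the bracketed, comma separated form used by the
// orderer. Surrounding brackets and whitespace around items are dropped.
func parseOrdererList(v string) []string {
	v = strings.TrimSpace(v)
	v = strings.TrimSuffix(strings.TrimPrefix(v, "["), "]")
	var out []string
	for _, item := range strings.Split(v, ",") {
		if item = strings.TrimSpace(item); item != "" {
			out = append(out, item)
		}
	}
	return out
}

func main() {
	peer := "/certs/tls/cacerts/cacert.pem /certs/msp/operationscerts/operationscert-1.pem"
	orderer := "[/certs/tls/cacerts/cacert.pem,/certs/msp/operationscerts/operationscert-1.pem]"

	fmt.Println(len(parsePeerList(peer)), len(parseOrdererList(orderer)))

	// Feeding the orderer form to the peer parser yields a single bogus entry.
	fmt.Println(len(parsePeerList(orderer)))
}
```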
Another problem area is where we use variable values as keys in a map. The peer system chaincode element is a good example:
# system chaincodes whitelist. To add system chaincode "myscc" to the
# whitelist, add "myscc: enable" to the list below, and register in
# chaincode/importsysccs.go
system:
  _lifecycle: enable
  cscc: enable
  lscc: enable
  escc: enable
  vscc: enable
  qscc: enable
When the keys are variable, it may not be possible to map them to valid environment variable names. This was highlighted when we called the new lifecycle chaincode +lifecycle because + is not part of the POSIX portable character set definition for environment variables.
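A minimal sketch of the problem: derive an environment variable name from a config key and reject names that fall outside the portable set of uppercase letters, digits, and underscores (not starting with a digit). The function name envNameFor, the "core" prefix, and the chaincode.system key path are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// portableName matches names made only of uppercase letters, digits, and
// underscores that do not begin with a digit.
var portableName = regexp.MustCompile(`^[A-Z_][A-Z0-9_]*$`)

// envNameFor derives an environment variable name from a dot qualified config
// key and reports an error when the result is not a portable name.
func envNameFor(prefix, key string) (string, error) {
	name := strings.ToUpper(prefix + "_" + strings.ReplaceAll(key, ".", "_"))
	if !portableName.MatchString(name) {
		return "", fmt.Errorf("key %q maps to non-portable environment variable name %q", key, name)
	}
	return name, nil
}

func main() {
	for _, key := range []string{"chaincode.system._lifecycle", "chaincode.system.+lifecycle"} {
		name, err := envNameFor("core", key)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(name)
	}
}
```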
These are the goals of the configuration work:
- Provide a consistent, YAML serialization of configuration for the peer, orderer, and CA.
  - Consistent use of camelCase (or snake_case[1]) for configuration keys.
  - Identical configuration schema for TLS and MSP for all Fabric programs.
- Remove automatic mapping of environment variable overrides for config elements and move to explicitly named overrides where appropriate.
- Provide a simple mechanism to reference environment variables as values in the configuration files.
- Enable a portable configuration and runtime tree
  - Use the current working directory as the default configuration directory.
  - Ensure all file references within configuration are relative to the configuration document.
- Enable automated generation of documentation and default configuration documents directly from code.
- Rename core.yaml to peer.yaml
- Extract peer subcommands to a new CLI (called fabric or fabcli) and associate it with a correspondingly named file for configuration. Config and data for this tool should be sourced from XDG compliant locations by default (e.g. $HOME/.config/fabcli.yaml).
- Implement an alternative local MSP that is sourced from a single YAML document instead of a configuration tree.
- Stop using mapstructure when decoding config
- Remove viper from Fabric
These are explicitly out of scope:
- Dynamic reloading of configuration
- Updates to the channel config transaction tooling
- General UX improvements to the command line
We continue to use YAML for our configuration documents. In addition to allowing JSON elements, it supports comments and multi-line block values.
Long running processes will no longer use /etc/hyperledger/fabric as the default config location; instead, the current working directory will be used. An explicit configuration directory can still be specified by setting FABRIC_CFG_PATH in the environment.
A command line flag should be provided to name a specific configuration file. The flag will override any default location or environment variable value.
When relative paths are used in a configuration file, the paths are to be interpreted relative to the directory containing the configuration file.
When relative paths are used on command line flags, the paths are to be interpreted relative to the working directory.
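The location and path rules above can be sketched as follows. The function names are hypothetical, peer.yaml is used as the file name only because of the proposed rename, and the example paths assume a Unix-like system. Values are passed in rather than read from the process so the rules are easy to test.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// absFrom resolves p against base unless p is already absolute.
func absFrom(base, p string) string {
	if filepath.IsAbs(p) {
		return filepath.Clean(p)
	}
	return filepath.Join(base, p)
}

// locateConfig picks the configuration file. An explicit flag wins, then the
// directory named by FABRIC_CFG_PATH, then the current working directory.
// A relative flag value is interpreted relative to the working directory.
func locateConfig(flagPath, cfgPathEnv, workDir, name string) string {
	switch {
	case flagPath != "":
		return absFrom(workDir, flagPath)
	case cfgPathEnv != "":
		return filepath.Join(absFrom(workDir, cfgPathEnv), name)
	default:
		return filepath.Join(workDir, name)
	}
}

// resolveConfigPath resolves a path found inside the configuration file
// relative to the directory that contains the file.
func resolveConfigPath(configFile, p string) string {
	return absFrom(filepath.Dir(configFile), p)
}

func main() {
	cfg := locateConfig("", "/etc/fabric", "/work", "peer.yaml")
	fmt.Println(cfg)
	fmt.Println(resolveConfigPath(cfg, "tls/private.key"))
}
```

In a real program the arguments would come from the parsed flag, os.Getenv("FABRIC_CFG_PATH"), and os.Getwd().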
Configuration keys and command line flags will use a consistent naming convention. The choice of convention matters much less than ensuring it's used consistently.
The leading candidates for config file keys are camelCase, snake_case, and PascalCase.
The leading candidates for long command line flags are nodelimiter, hyphen-delimited, and camelCase.
When mapping a set of configuration values to some entity, avoid patterns where an input value is treated as a key:
mapping:
  1.2.3.4: 5.6.7.8
  9.8.7.6: 4.3.2.1
Instead, use lists of objects with explicit field names:
mappings:
- from: 1.2.3.4
  to: 5.6.7.8
- from: 9.8.7.6
  to: 4.3.2.1
The latter form promotes type safety and extensibility.
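As a rough illustration of the type safety point, the list form maps directly onto a slice of structs with named fields. The type names, the Validate method, and the assumption that both values are IP addresses are made up for this sketch; the yaml tags are shown only to indicate how a decoder would map the keys.

```go
package main

import (
	"fmt"
	"net"
)

// Mapping is one entry in the mappings list. New fields can be added later
// without changing the shape of the document.
type Mapping struct {
	From string `yaml:"from"`
	To   string `yaml:"to"`
}

// Config holds the list of mappings.
type Config struct {
	Mappings []Mapping `yaml:"mappings"`
}

// Validate checks that both fields hold IP addresses (an assumption about
// what the example values represent).
func (m Mapping) Validate() error {
	if net.ParseIP(m.From) == nil {
		return fmt.Errorf("from: %q is not an IP address", m.From)
	}
	if net.ParseIP(m.To) == nil {
		return fmt.Errorf("to: %q is not an IP address", m.To)
	}
	return nil
}

func main() {
	cfg := Config{Mappings: []Mapping{
		{From: "1.2.3.4", To: "5.6.7.8"},
		{From: "9.8.7.6", To: "4.3.2.1"},
	}}
	for i, m := range cfg.Mappings {
		if err := m.Validate(); err != nil {
			fmt.Printf("mappings[%d]: %v\n", i, err)
			continue
		}
		fmt.Printf("mappings[%d]: %s -> %s\n", i, m.From, m.To)
	}
}
```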
The major exception to this rule is when the config model is carrying generic or opaque configuration elements. In these cases, an opaque map[string]interface{} is appropriate.
Certificate and key values in configuration will be PEM encoded blocks. If certificate chains are used, a multi-block value should be used. The blocks must be concatenated such that each certificate certifies the one preceding it; the root CA shall be the last certificate in the list.
Separate keys must be used for certificates and key values and for files that contain certificate and key values. For example:
tls:
  cert: |
    -----BEGIN CERTIFICATE-----
    Base64-encoded certificate
    -----END CERTIFICATE-----
  certfile: ~
  key: ~
  keyfile: tls/private.key
Notice cert and certfile are elements of the tls configuration. When an inline certificate is used, it should be the value associated with cert; when a file reference is used, the path should be associated with certfile. It is an error to provide values for inline and file references for the same configuration element. In the example above, certfile is explicitly nil.
Certificate pools should always support multiple PEM encoded blocks.
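A minimal sketch of how these rules might be enforced, using only the standard library. The function names pemSource and pemBlocks are hypothetical, and resolving the file path relative to the configuration document is left out.

```go
package main

import (
	"encoding/pem"
	"errors"
	"fmt"
	"os"
)

// pemSource returns the PEM bytes for a config element that may be given
// inline or as a file reference, but not both.
func pemSource(inline, file string) ([]byte, error) {
	switch {
	case inline != "" && file != "":
		return nil, errors.New("inline value and file reference are mutually exclusive")
	case inline != "":
		return []byte(inline), nil
	case file != "":
		return os.ReadFile(file)
	default:
		return nil, nil
	}
}

// pemBlocks splits a multi-block PEM value into its blocks in document order.
func pemBlocks(data []byte) ([]*pem.Block, error) {
	var blocks []*pem.Block
	for len(data) > 0 {
		block, rest := pem.Decode(data)
		if block == nil {
			break
		}
		blocks = append(blocks, block)
		data = rest
	}
	if len(blocks) == 0 {
		return nil, errors.New("no PEM blocks found")
	}
	return blocks, nil
}

func main() {
	// Placeholder content, not a real certificate.
	one := "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"

	data, err := pemSource(one+one, "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	blocks, err := pemBlocks(data)
	fmt.Println(len(blocks), err)

	_, err = pemSource(one, "tls/cert.pem")
	fmt.Println(err)
}
```

Validating the chain order (each certificate certifying the one before it) would need crypto/x509 and is not attempted here.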
Instead of exposing all configuration as environment variables, we will allow configuration values to be sourced from environment variables by using ${env.ENVIRONMENT_NAME} as the value in the configuration file.
id: ${env.MY_ENV_VAR}
The ${env.NAME} format will not be interpreted if it is used within a larger value. For example, when MESSAGE_ENV_VAR is set to "msg":
msg: ${env.MESSAGE_ENV_VAR}
message: my message is ${env.MESSAGE_ENV_VAR}.
will result in message="my message is ${env.MESSAGE_ENV_VAR}." and msg="msg".
Referencing environment variables is only supported for the basic types used in leaf nodes of the configuration.
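The whole-value rule can be sketched with an anchored regular expression. The function name expand is hypothetical, and returning an error for an unset variable is an assumption; the text above does not say how that case should be handled.

```go
package main

import (
	"fmt"
	"regexp"
)

// envRef matches a value that consists of exactly one ${env.NAME} reference.
var envRef = regexp.MustCompile(`^\$\{env\.([A-Za-z_][A-Za-z0-9_]*)\}$`)

// expand replaces value with the named environment variable only when the
// whole value is a single reference. Anything else is returned unchanged.
// lookup has the same signature as os.LookupEnv.
func expand(value string, lookup func(string) (string, bool)) (string, error) {
	m := envRef.FindStringSubmatch(value)
	if m == nil {
		return value, nil
	}
	v, ok := lookup(m[1])
	if !ok {
		// Assumption: an unset variable is treated as an error.
		return "", fmt.Errorf("environment variable %s is not set", m[1])
	}
	return v, nil
}

func main() {
	env := map[string]string{"MESSAGE_ENV_VAR": "msg"}
	lookup := func(name string) (string, bool) {
		v, ok := env[name]
		return v, ok
	}

	for _, value := range []string{
		"${env.MESSAGE_ENV_VAR}",
		"my message is ${env.MESSAGE_ENV_VAR}.",
	} {
		out, err := expand(value, lookup)
		fmt.Printf("%q %v\n", out, err)
	}
}
```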
Durations must always include units compatible with Go's time.ParseDuration function. A value without a suffix is expressed in nanoseconds and is rarely appropriate.
This means, by extension, that configuration keys should not contain units for the duration (e.g. timeoutInSeconds should not be used).
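A small sketch of duration handling with time.ParseDuration. Note that time.ParseDuration itself rejects a bare number other than 0 with a "missing unit" error, so the nanosecond interpretation only arises when a decoder assigns an integer directly to a time.Duration field. The wrapper name parseDuration is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// parseDuration parses a configuration value that must carry a unit suffix
// such as "30s" or "1m30s".
func parseDuration(s string) (time.Duration, error) {
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, fmt.Errorf("invalid duration %q: %w", s, err)
	}
	return d, nil
}

func main() {
	for _, s := range []string{"30s", "1m30s", "30"} {
		d, err := parseDuration(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(d)
	}
}
```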
To support automatic generation of documentation, configuration will be expressed as a set of structures in code.
A config package will be provided to decode configuration documents. The package will expose types and functions to resolve relative paths and apply configuration defaults to values omitted from the config.
The config package will also expose functions to write example configuration documents with comments. Field level godoc will be used to document the config keys and tags will be used to provide the default value and, when required, the name of the environment variable that can be used to override what is read from configuration.
// MyConfig is my configuration.
type MyConfig struct {
	// ID provides a unique identifier.
	ID string `yaml:"id" example:"example-id" env:"MY_CONFIG_ID"`
	// Timeout is the maximum time the service will wait for a response.
	Timeout time.Duration `yaml:"timeout" default:"30s"`
	// Subsystem configures the subsystem.
	Subsystem *SubsystemConfig `yaml:"subsystem"`
}

// SubsystemConfig is the configuration structure for subsystem.
type SubsystemConfig struct {
	// Timeout is the maximum amount of time the subsystem will wait for a
	// response.
	Timeout time.Duration `yaml:"timeout" default:"10s"`
	// WorkingStoragePath points to the directory to store temporary data.
	WorkingStoragePath config.Path `yaml:"workingStoragePath" default:"subsys"`
}
The generated documentation will replace occurrences of the struct field name with the appropriate YAML config element name.
The default configuration for MyConfig would look like this:
---
# id provides a unique identifier.
id: example-id
# timeout is the maximum time the service will wait for a response.
timeout: 30s
# subsystem configures the subsystem.
subsystem:
  # timeout is the maximum amount of time the subsystem will wait for a
  # response.
  timeout: 10s
  # workingStoragePath points to the directory to store temporary data.
  workingStoragePath: subsys
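One way the default tags might be applied is with reflection, sketched below. This is not the proposed config package: applyDefaults is a hypothetical name, only string and time.Duration fields are handled, and a plain string stands in for config.Path. A zero value is treated as "omitted", so this sketch cannot tell an explicit zero from a missing key.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// Sub and Cfg mirror the shape of the structures above in simplified form.
type Sub struct {
	Timeout            time.Duration `yaml:"timeout" default:"10s"`
	WorkingStoragePath string        `yaml:"workingStoragePath" default:"subsys"`
}

type Cfg struct {
	ID        string        `yaml:"id" example:"example-id"`
	Timeout   time.Duration `yaml:"timeout" default:"30s"`
	Subsystem *Sub          `yaml:"subsystem"`
}

var durationType = reflect.TypeOf(time.Duration(0))

// applyDefaults fills zero-valued fields from their `default` tags.
// v must be a non-nil pointer to a struct.
func applyDefaults(v interface{}) error {
	rv := reflect.ValueOf(v)
	if rv.Kind() != reflect.Ptr || rv.IsNil() || rv.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("applyDefaults: want non-nil pointer to struct, got %T", v)
	}
	rv = rv.Elem()
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		field, sf := rv.Field(i), rt.Field(i)
		// Recurse into nested struct pointers, allocating them when absent.
		if field.Kind() == reflect.Ptr && field.Type().Elem().Kind() == reflect.Struct {
			if field.IsNil() {
				field.Set(reflect.New(field.Type().Elem()))
			}
			if err := applyDefaults(field.Interface()); err != nil {
				return err
			}
			continue
		}
		def, ok := sf.Tag.Lookup("default")
		if !ok || !field.IsZero() {
			continue
		}
		switch {
		case field.Type() == durationType:
			d, err := time.ParseDuration(def)
			if err != nil {
				return fmt.Errorf("%s: %w", sf.Name, err)
			}
			field.SetInt(int64(d))
		case field.Kind() == reflect.String:
			field.SetString(def)
		default:
			return fmt.Errorf("%s: unsupported type %s", sf.Name, field.Type())
		}
	}
	return nil
}

func main() {
	var c Cfg
	if err := applyDefaults(&c); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.Timeout, c.Subsystem.Timeout, c.Subsystem.WorkingStoragePath)
}
```

The same field walk could be reused to emit an example document, reading the yaml tag for the key name and the example or default tag for the value.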
Footnotes
1. In the case of the orderer, yaml tags were not specified so the case doesn't matter when decoding but does matter when using the viper API.