I have 2 member servers on an L2 network.
I have some post-deploy sanity checks to do on proxy pairs using keepalived, where after an initial deployment or a change to the VIP configuration the 'network' subset of ansible_facts is too slow to regather. If a service is broken, I'm not going to sit there for 3 minutes or longer waiting for Ansible to crawl my servers. The solution is to pull just the information you need, and test it.
Where vip can be one or more vips, and the states can be:
vip on server1 - proceed
vip on server2 - proceed
vip on no servers - error
vip on both servers - error
expected behaviour:
vip on a single server
vip can transition to server2
vip is reachable in either case
how to test:
don't use ansible facts; they can be too slow to regather after making changes to the vip config in keepalived.conf
test if the interface(s) exist with the vip
test if the rich rule is working bi-directionally
test the endpoint acs -> vip (wait_for_vip)
where the known values are the list of vip(s) in keepalived_virtual_ips
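The endpoint test in the last step can be as simple as an Ansible wait_for against each VIP. A minimal sketch, assuming the proxies listen on port 443 (the port and task name here are illustrative, not taken from the role below):

```yaml
# test endpoint -> vip: fail fast if the vip doesn't answer on the service port
# port 443 is an assumption; adjust to your haproxy frontend
- name: Wait for vip(s) to answer on the service port
  wait_for:
    host: "{{ item }}"
    port: 443
    timeout: 10
  loop: "{{ keepalived_virtual_ips }}"
  run_once: yes
```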
# can't regather facts in case of error, facts take too long for rollback
- block:
    # get all family inet v4 addresses for ens160 for each host
    # register as fact member
    - name: Get provisioned addresses
      shell: nmcli -g ip4.address connection show ens160
      register: member
    # create a new list object from the string .stdout for each host
    # register as fact address
    # where stdout is the string 'address1 | address2 | address3'
    - name: Create list from addresses per host
      set_fact:
        address: "{{ member.stdout.split('|') | map('trim') | list }}"
    # combine both 'address' lists when running against an haproxy pair
    # register as fact addresses
    - name: Merge lists of addresses into single list
      set_fact:
        addresses: "{{ hostvars[groups['haproxy'][0]]['address'] + hostvars[groups['haproxy'][1]]['address'] }}"
      when: groups['haproxy'] | length > 1
      run_once: yes
    # if vips are not provisioned at all, fail
    - name: Assert vip(s) {{ keepalived_virtual_ips }} are provisioned
      assert:
        that:
          - item + '/32' in addresses
        fail_msg: "{{ item }} is not present on ens160 on either server, preparing to rollback config"
        success_msg: "{{ item }} is provisioned"
      loop: "{{ keepalived_virtual_ips }}"
      run_once: yes
    # if vips are provisioned on each member of the pair, fail
    - name: Assert vip(s) {{ keepalived_virtual_ips }} are not splitbrained
      assert:
        that:
          - addresses | regex_findall(item + '/32') | length == 1
        fail_msg: "{{ item }} is provisioned on more than one server and is splitbrained, preparing to rollback config"
        success_msg: "{{ item }} is provisioned successfully"
      loop: "{{ keepalived_virtual_ips }}"
      run_once: yes
Since this has to accommodate both an haproxy pair and a single haproxy, as in some test setups, the condition groups['haproxy'] | length > 1 can be handled in a few ways.
Though I like the following approach, I don't like the failure message that Ansible spits out, since pipeline systems will pick up on the stderr.
# combine both 'address' lists when running against an haproxy pair
# register as fact addresses
- block:
    - name: Merge lists of addresses into single list
      set_fact:
        addresses: "{{ hostvars[groups['haproxy'][0]]['address'] + hostvars[groups['haproxy'][1]]['address'] }}"
      failed_when: not groups['haproxy'] | length > 1
      run_once: yes
  rescue:
    - name: Merge lists of addresses into single list
      set_fact:
        addresses: "{{ hostvars[groups['haproxy'][0]]['address'] }}"
      run_once: yes
This can be written more simply:
# combine both 'address' lists when running against an haproxy pair
# register as fact addresses
- name: Merge lists of addresses into single list
  set_fact:
    addresses: "{{ hostvars[groups['haproxy'][0]]['address'] + hostvars[groups['haproxy'][1]]['address'] }}"
  when: groups['haproxy'] | length == 2
  run_once: yes
- name: Merge lists of addresses into single list
  set_fact:
    addresses: "{{ hostvars[groups['haproxy'][0]]['address'] }}"
  when: groups['haproxy'] | length == 1
  run_once: yes
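Another option that avoids duplicating the task is to extract each host's list from hostvars and flatten the result, which works the same whether the group holds one member or two. A sketch of that merge, assuming the per-host fact is still named address:

```yaml
# combine the per-host 'address' lists for however many haproxy members exist
- name: Merge lists of addresses into single list
  set_fact:
    addresses: "{{ groups['haproxy'] | map('extract', hostvars, 'address') | flatten }}"
  run_once: yes
```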
TASK [keepalived_el7 : Get provisioned addresses] *********************************************************************************************************************
changed: [server1]
changed: [server2]
Thursday 08 July 2021 23:09:30 +0100 (0:00:10.559) 0:00:10.662 *********
TASK [keepalived_el7 : Create list from addresses per host] ***********************************************************************************************************
ok: [server1]
ok: [server2]
Thursday 08 July 2021 23:09:30 +0100 (0:00:00.078) 0:00:10.740 *********
TASK [keepalived_el7 : Merge lists of addresses into single list] *****************************************************************************************************
ok: [server1]
Thursday 08 July 2021 23:09:30 +0100 (0:00:00.091) 0:00:10.832 *********
TASK [keepalived_el7 : Assert vip(s) ['10.130.254.69', '10.130.254.70'] are provisioned] ******************************************************************************
ok: [server1] => (item=10.130.254.69) => changed=false
  ansible_loop_var: item
  item: 10.130.254.69
  msg: 10.130.254.69 is provisioned
ok: [server1] => (item=10.130.254.70) => changed=false
  ansible_loop_var: item
  item: 10.130.254.70
  msg: 10.130.254.70 is provisioned
Thursday 08 July 2021 23:09:30 +0100 (0:00:00.075) 0:00:10.908 *********
TASK [keepalived_el7 : Assert vip(s) ['10.130.254.69', '10.130.254.70'] are not splitbrained] *************************************************************************
ok: [server1] => (item=10.130.254.69) => changed=false
  ansible_loop_var: item
  item: 10.130.254.69
  msg: 10.130.254.69 is provisioned successfully
ok: [server1] => (item=10.130.254.70) => changed=false
  ansible_loop_var: item
  item: 10.130.254.70
  msg: 10.130.254.70 is provisioned successfully
Thursday 08 July 2021 23:09:30 +0100 (0:00:00.079) 0:00:10.987 *********
For anyone interested in how to configure, test, and roll back, here's a sample:
# post-config-sanity-check tasks only read, they don't write, so they're safe to always run
# configure-keepalived will have errors in check mode if running for the first time
# any error in post-config-sanity-check should fail the pipeline
- import_tasks: configure-keepalived.yml
  ignore_errors: "{{ ansible_check_mode }}"
- block:
    - import_tasks: post-config-sanity-check.yml
      any_errors_fatal: true
      check_mode: no
  rescue:
    - import_tasks: rollback-configuration.yml
    - import_tasks: post-config-sanity-check.yml
      any_errors_fatal: true
      check_mode: no
It's easy to find online how to combine lists in Ansible, e.g. [1,2,3] + [4,5,6], but most of those examples use a single host or 'localhost'. It's less clear how to combine identically named lists from registered facts across multiple hosts into a single usable master list, i.e.
# host_vars/host1 some_list = [1,2,3]
# host_vars/host2 some_list = [3,4,5]
- name: Combine lists (doesn't work)
  set_fact:
    nums: "{{ some_list + some_list }}"
When both are named 'some_list', you can't access them as some_list + some_list. Sure, you could name them some_list_host1 and some_list_host2, but what if it's a registered value? I went through a few solutions that gave similar results, but this one uses the fewest lines and is the clearest. Any registered value is a fact, and any fact can be accessed via hostvars[hostname][some_fact].
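Applied to the hypothetical two-host example above (the host names and fact name are illustrative), the merge becomes:

```yaml
# each host's 'some_list' is reachable through hostvars, even though the names collide
- name: Combine lists from both hosts
  set_fact:
    nums: "{{ hostvars['host1']['some_list'] + hostvars['host2']['some_list'] }}"
  run_once: yes
```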