Check for duplicate items in a list with Ansible using a custom filter
Ansible provides some useful ways to manipulate data and filter items from sets or lists. None of the built-in filters did what I thought would be a simple task: show the duplicate items in a list.
Let's assume that I mistyped and entered a duplicate IP address into a list - like so:
- {mac: 3E:79:66:00:87:F7, ip: 192.168.1.1, hostname: dns}
- {mac: F2:8F:35:40:8A:9C, ip: 192.168.1.1, hostname: irc}
- {mac: 92:A9:C6:04:AE:CA, ip: 192.168.1.3, hostname: dev}
I need a way not only to check the list for duplicates but also, for convenience, to print the offending value for the user. Imagine this list is 100+ items long!
You'd think one of the set-theory filters like unique or difference would work. The unique filter is designed to take a list with duplicate items and simply discard the duplicates, so it never tells you which values were repeated. The difference filter compares two lists, and because both sides of the comparison contain the same values, it reports no difference.
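To make the problem concrete, here is a rough Python analogue of what those two filters do with the IP column. It is only a sketch: unique_like and difference_like are hypothetical stand-ins written for illustration, not Ansible's actual implementations.

```python
# Hypothetical stand-ins for illustration; not Ansible's actual filter code.

def unique_like(items):
    """Keep the first occurrence of each item, preserving order."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


def difference_like(a, b):
    """Return the unique items of a that do not appear in b."""
    return [item for item in unique_like(a) if item not in b]


ips = ["192.168.1.1", "192.168.1.1", "192.168.1.3"]

deduped = unique_like(ips)
# The duplicate is silently dropped, so nothing says which value repeated.
print(deduped)

# Comparing the list with its de-duplicated self finds no difference,
# because both sides contain the same set of values.
print(difference_like(ips, deduped))
```

Neither result contains the one thing we want, which is the repeated value itself.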
Custom Filter
The answer turned out to be straightforward: create a custom filter (source). Create a folder named filter_plugins in the root of your Ansible project and put the following code into a file named something like duplicate_filter.py.
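#!/usr/bin/python
class FilterModule(object):
    def filters(self):
        # Map the filter name used in templates to its implementation.
        return {'duplicates': self.duplicates}

    def duplicates(self, items):
        sums = {}    # how many times each item has been seen
        result = []  # items seen more than once, each listed once
        for item in items:
            if item not in sums:
                sums[item] = 1
            else:
                # Record the item only on its second appearance.
                if sums[item] == 1:
                    result.append(item)
                sums[item] += 1
        return result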
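Because a filter plugin is ordinary Python, its logic can be checked outside Ansible before wiring it into a playbook. The sketch below repeats the class so that it runs on its own and then calls the filter directly; the sample list is made up for illustration.

```python
class FilterModule(object):
    def filters(self):
        return {'duplicates': self.duplicates}

    def duplicates(self, items):
        sums = {}
        result = []
        for item in items:
            if item not in sums:
                sums[item] = 1
            else:
                if sums[item] == 1:
                    result.append(item)
                sums[item] += 1
        return result


# Look the filter up by name, using the same mapping that filters() returns.
duplicates = FilterModule().filters()['duplicates']

# Made-up sample: 'a' appears three times, 'b' twice, 'c' once.
sample = ['a', 'b', 'a', 'c', 'a', 'b']
found = duplicates(sample)
print(found)  # ['a', 'b']
```

Each repeated value is reported once, no matter how many times it recurs. Note that the items are used as dictionary keys, so they must be hashable: strings such as MAC addresses, IPs and hostnames are fine, but passing a list of dicts would raise a TypeError.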
Then in your Ansible code you can call this new custom filter, named duplicates, like this:
- name: dupe check
  debug:
    msg: "Duplicate entry: {{ item | duplicates }}"
  loop:
    - "{{ dhcp_reservations | selectattr('mac', 'defined') | map(attribute='mac') }}"
    - "{{ dhcp_reservations | selectattr('ip', 'defined') | map(attribute='ip') }}"
    - "{{ dhcp_reservations | selectattr('hostname', 'defined') | map(attribute='hostname') }}"
This produces the following output:
TASK [ktz-dhcp-dns : dupe check] ****************************************************************************************************************************************************************************************
ok: [10.42.0.201] => (item=['3E:79:66:00:87:F7', 'F2:8F:35:40:8A:9C', '92:A9:C6:04:AE:CA']) => {
    "msg": "Duplicate entry: []"
}
ok: [10.42.0.201] => (item=['192.168.1.1', '192.168.1.1', '192.168.1.3']) => {
    "msg": "Duplicate entry: ['192.168.1.1']"
}
ok: [10.42.0.201] => (item=['dns', 'irc', 'dev']) => {
    "msg": "Duplicate entry: []"
}
We can now see at a glance that the duplicated field was the IP address 192.168.1.1, while the MAC addresses and hostnames are all unique.
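For readers who think in Python, the three loop items amount to roughly the following. This is a sketch of the equivalent logic, not how Ansible evaluates the task. The column helper is hypothetical and stands in for the selectattr(..., 'defined') | map(attribute=...) chain, and the counting uses collections.Counter instead of the hand-rolled dictionary in the filter.

```python
from collections import Counter

# The three reservations from the example list.
dhcp_reservations = [
    {"mac": "3E:79:66:00:87:F7", "ip": "192.168.1.1", "hostname": "dns"},
    {"mac": "F2:8F:35:40:8A:9C", "ip": "192.168.1.1", "hostname": "irc"},
    {"mac": "92:A9:C6:04:AE:CA", "ip": "192.168.1.3", "hostname": "dev"},
]


def column(rows, key):
    """Hypothetical helper: values of `key` from the rows that define it."""
    return [row[key] for row in rows if key in row]


def duplicates(items):
    """Values that occur more than once, each reported once."""
    counts = Counter(items)
    return [value for value, count in counts.items() if count > 1]


# One duplicate check per field, mirroring the three loop items.
report = {
    key: duplicates(column(dhcp_reservations, key))
    for key in ("mac", "ip", "hostname")
}

for key, dupes in report.items():
    print(f"Duplicate {key} entries: {dupes}")
```

One small difference from the filter: this version lists duplicates in order of first appearance, whereas the filter lists them in the order their second occurrence is encountered. For a report like this one the distinction rarely matters.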
I expected writing a custom filter to be difficult and cumbersome, but it was very simple and, in the end, much faster than trying to turn YAML into a programming language!