As we all know, network automation is becoming more and more real in today’s networks and communities. All vendors now offer some programmable interface to their products, enabling network engineers to better configure and manage their infrastructures. Automation-inspired communities like the Network-to-Code Slack channel now count 1300+ professionals who exchange knowledge and help each other on a day-to-day basis. Remarkable Python automation libraries (netmiko, NAPALM) are expanding and improving at a very good pace, existing tools and frameworks like Ansible have lately added a bunch of new networking-related features, and networking hackathons are being organized.
All of these are very good signals that should push you to embrace the change, if you haven’t done so yet 🙂 This change is so big that it involves not only device configuration (contrary to what many people think when they first meet automation) but the whole network life cycle, from design to configuration, maintenance and monitoring.
I single out monitoring because it is a big topic, and some vendors, like Cisco, have started to support several ways to dynamically gather data and statistics from devices.
Streaming Telemetry
The last sentence may sound familiar to you (SNMP anyone?) but Telemetry is a totally different concept. The whole idea of Telemetry is based on a push model, where the device sends data to a configured receiver. This can be done on a timer basis or on a policy/event-driven basis. That’s pretty different from the pull model used by SNMP, where the collector has to poll the device for every value.
Data gathered from Telemetry can be used to automatically re-configure the device, to build statistics on given metrics, to raise preventive alarms, or for whatever other use case you want to develop.
Cisco offers these capabilities on IOS-XR (from Release 6.1.1 onwards) in two different models:
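To make the alarm use case a bit more concrete, here is a minimal sketch of what a receiver could do with each pushed sample. The counter names are borrowed from the interface statistics shown later in this post; the thresholds and the check_sample helper are my own assumptions for illustration, not part of any Cisco tool.

```python
# Hypothetical thresholds: counter name -> maximum tolerated value.
THRESHOLDS = {"InputErrors": 100, "CRCErrors": 10, "OutputDrops": 1000}


def check_sample(interface, counters, thresholds=THRESHOLDS):
    """Return one alarm string per counter that exceeds its threshold."""
    alarms = []
    # Sort so the alarm order is stable from one sample to the next.
    for name, limit in sorted(thresholds.items()):
        value = counters.get(name, 0)  # a missing counter is treated as 0
        if value > limit:
            alarms.append(
                "{0}: {1}={2} exceeds {3}".format(interface, name, value, limit)
            )
    return alarms


if __name__ == "__main__":
    sample = {"InputErrors": 250, "CRCErrors": 0, "OutputDrops": 12}
    for alarm in check_sample("GigabitEthernet0/0/0/0", sample):
        print(alarm)
```

Because the device pushes the data, a check like this runs as soon as a sample arrives instead of waiting for the next polling cycle.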
- Model Driven Telemetry (MDT): here the streamed data is defined by a YANG model, either an Open-Config or Cisco IOSXR native model. In order to use MDT the device must support
- Policy Driven Telemetry (PDT): here the streamed data is defined by a policy file, which also defines the frequency of the streaming. An example of a policy file to stream interface counters looks like this:
{
    "Name": "TelemetryTest",
    "Metadata": {
        "Version": 25,
        "Description": "This is a sample policy to demonstrate the syntax",
        "Comment": "This is the first draft",
        "Identifier": "<data that may be sent by the encoder to the mgmt stn"
    },
    "CollectionGroups": {
        "FirstGroup": {
            "Period": 10,
            "Paths": [
                "RootOper.Interfaces.Interface(*)"
            ]
        }
    }
}
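If you need more than one of these files, generating them from code avoids JSON typos. The sketch below rebuilds the same structure with Python's json module; build_policy and its parameters are hypothetical names of mine, and only the keys shown in the sample policy are used.

```python
import json


def build_policy(name, groups, version=25, description=""):
    """Return a policy dict.

    groups maps a collection group name to a (period, [paths]) tuple.
    """
    collection_groups = {}
    for group_name, (period, paths) in groups.items():
        collection_groups[group_name] = {"Period": period, "Paths": list(paths)}
    return {
        "Name": name,
        "Metadata": {"Version": version, "Description": description},
        "CollectionGroups": collection_groups,
    }


if __name__ == "__main__":
    policy = build_policy(
        "TelemetryTest",
        {"FirstGroup": (10, ["RootOper.Interfaces.Interface(*)"])},
        description="This is a sample policy to demonstrate the syntax",
    )
    # Print the JSON text that would go into the policy file.
    print(json.dumps(policy, indent=4))
```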
Testing MDT on XR
I’m really interested in this topic so I wanted to do a really quick test. Luckily, Cisco offers some nice tutorials like this.
This is what I configured on my XR device:
RP/0/RP0/CPU0:xrv1(config)# telemetry model-driven
RP/0/RP0/CPU0:xrv1(config-model-driven)# destination-group jumphost
RP/0/RP0/CPU0:xrv1(config-model-driven-dest)# address family ipv4 10.0.0.5 port 5555
RP/0/RP0/CPU0:xrv1(config-model-driven-dest-addr)# encoding self-describing-gpb
RP/0/RP0/CPU0:xrv1(config-model-driven-dest-addr)# protocol tcp
RP/0/RP0/CPU0:xrv1(config-model-driven-dest-addr)# exit
RP/0/RP0/CPU0:xrv1(config-model-driven)#
RP/0/RP0/CPU0:xrv1(config-model-driven)#sensor-group SGroup1
RP/0/RP0/CPU0:xrv1(config-model-driven-snsr-grp)# sensor-path Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters
RP/0/RP0/CPU0:xrv1(config-model-driven-snsr-grp)# exit
RP/0/RP0/CPU0:xrv1(config-model-driven)#
RP/0/RP0/CPU0:xrv1(config-model-driven)#subscription Sub1
RP/0/RP0/CPU0:xrv1(config-model-driven-subs)#sensor-group-id SGroup1 sample-interval 10000
RP/0/RP0/CPU0:xrv1(config-model-driven-subs)#destination-id jumphost
RP/0/RP0/CPU0:xrv1(config-model-driven-subs)# commit
This instructs the device to collect interface counters as described by a YANG model and to stream them every 10 seconds. The next step is to enable the data collector on the receiving device. You can download the Collector repo from GitHub and pick the one compatible with your configured encoding method. It’s just a piece of Python code, so all you have to do is run it from your console, specifying the receiving address and port.
$ python telemetry_receiver.py --ip-address 10.0.0.5 --port 5555
Compiled cisco.proto
Waiting for TCP connection
Waiting for UDP message
Got TCP connection
Getting TCP message
Message Type: JSON (2))
Flags: None
Length: 10695
Decoding message
CollectionStartTime: Mon Nov 14 06:07:39 2016 (393ms)
CollectionID: 1151
CollectionEndTime: Mon Nov 14 06:07:39 2016 (742ms)
Version: 25
Policy: TelemetryTest
Path: RootOper.Interfaces.Interface
Identifier: <data that may be sent by the encoder to the mgmt stn
Data: {
RootOper {
Interfaces {
Interface (5 items - displaying first entry only) [
[0]
InterfaceName: GigabitEthernet0/0/0/0
InterfaceHandle: GigabitEthernet0/0/0/0
IsL2Looped: False
EncapsulationTypeString: ARPA
HardwareTypeString: GigabitEthernet
State: IM_STATE_ADMINDOWN
LastStateTransitionTime: 6644
MaxBandwidth: 1000000
InFlowControl: IM_ATTR_FLOW_CONTROL_OFF
LineState: IM_STATE_ADMINDOWN
LinkType: IM_ATTR_LINK_TYPE_FORCE
CollectionTime: Mon Nov 14 06:07:39 2016 (544ms)
Description:
DataRates {
PeakInputDataRate: 0
LoadInterval: 9
InputPacketRate: 0
PeakOutputDataRate: 0
InputLoad: 0
PeakOutputPacketRate: 0
OutputPacketRate: 0
OutputLoad: 0
Bandwidth: 1000000
Reliability: 255
PeakInputPacketRate: 0
InputDataRate: 0
OutputDataRate: 0
}
InterfaceType: IFT_GETHERNET
CarrierDelay {
CarrierDelayUp: 10
CarrierDelayDown: 0
}
StateTransitionCount: 0
IsL2TransportEnabled: False
Encapsulation: ether
ParentInterfaceName: <No interface>
IsDampeningEnabled: False
Duplexity: IM_ATTR_DUPLEX_FULL
MediaType: IM_ATTR_MEDIA_OTHER
MTU: 1514
MACAddress {
Address: 2cc2.605c.9017
}
BurnedInAddress {
Address: 2cc2.605c.9017
}
Bandwidth: 1000000
OutFlowControl: IM_ATTR_FLOW_CONTROL_OFF
IfIndex: 0
InterfaceStatistics {
StatsType: Full
FullInterfaceStats {
InputOverruns: 0
ParityPacketsReceived: 0
MulticastPacketsSent: 0
MulticastPacketsReceived: 0
InputIgnoredPackets: 0
SecondsSinceLastClearCounters: 0
Applique: 0
SecondsSincePacketSent: 4294967295
PacketsSent: 0
OutputBuffersSwappedOut: 0
GiantPacketsReceived: 0
SecondsSincePacketReceived: 4294967295
InputErrors: 0
BytesReceived: 0
LastDiscontinuityTime: 1479115826
OutputErrors: 0
CarrierTransitions: 0
LastDataTime: 1479121659
CRCErrors: 0
OutputDrops: 0
PacketsReceived: 0
InputQueueDrops: 0
OutputQueueDrops: 0
InputAborts: 0
InputDrops: 0
ThrottledPacketsReceived: 0
Resets: 0
FramingErrorsReceived: 0
BroadcastPacketsSent: 0
OutputUnderruns: 0
OutputBufferFailures: 0
BytesSent: 0
RuntPacketsReceived: 0
UnknownProtocolPacketsReceived: 0
BroadcastPacketsReceived: 0
AvailabilityFlag: 0
}
}
Speed: 1000000
]
}
}
Getting TCP message
Message Type: JSON (2))
Flags: None
Length: 8323
This was pretty easy! Here you can find the list of all the YANG models you can use to get data from the device.
For example, to get BGP data you’d use a sensor-path like Cisco-IOS-XR-ipv4-bgp-oper:bgp/config-instances/config-instance/config-instance-default-vrf, resulting in a stream of data like this (and much more):
Getting TCP message
Message Type: GPB_KEY_VALUE (4))
Flags: None
Length: 7013
Decoding message
Collection ID: 12
Base Path: Cisco-IOS-XR-ipv4-bgp-oper:bgp/config-instances/config-instance/config-instance-default-vrf/entity-configurations/entity-configuration
Subscription ID:
Model Version:
Start Time: Mon Nov 14 06:40:33 2016 (606ms)
Msg Timestamp: Mon Nov 14 06:40:33 2016 (606ms)
End Time: Mon Nov 14 06:40:33 2016 (632ms)
Fields: 1
Displaying first entry only
<no name>: fields (items 31) Mon Nov 14 06:40:33 2016 (624ms) {
instance-name: default (string)
entity-type: 3 (sint32)
neighbor-address: 1.1.1.1 (string)
neighbor-address: fields (items 2) {
afi: ipv4 (string)
ipv4-address: 1.1.1.1 (string)
}
group-name: (string)
configuration-type: neighbor (string)
address-family-identifier: 23 (uint32)
af-independent-config: fields (items 120) {
remote-as-number-xx: 0 (uint32)
remote-as-number-yy: 65534 (uint32)
configured-speaker-id: 0 (uint32)
tcp-mss: 0 (uint32)
min-advertisement-interval: 0 (uint32)
min-advertisement-interval-msecs: 0 (uint32)
description: (string)
ebgp-hop-count: 1 (uint32)
bmp-servers: 1 (uint32)
is-ebgp-multihop-bgpmpls-forwarding-disabled: false (string)
keychain: (string)
local-as-number-xx: 0 (uint32)
local-as-number-yy: 0 (uint32)
local-as-no-prepend: false (string)
password: (string)
socket-buffer-receive-size: 32768 (uint32)
bgp-buffer-receive-size: 4096 (uint32)
socket-buffer-send-size: 24576 (uint32)
bgp-buffer-send-size: 4096 (uint32)
adminstrative-shutdown: false (string)
keepalive-interval: 60 (uint32)
hold-time-value: 180 (uint32)
min-acc-hold-time-value: 3 (uint32)
local-ip-address: fields (items 2) {
afi: ipv4 (string)
ipv4-address: 0.0.0.0 (string)
}
msg-log-in-buf-count: 0 (uint32)
msg-log-out-buf-count: 0 (uint32)
route-updates-source: (string)
dmz-link-bandwidth: 0 (uint32)
ebgp-recv-dmz: 0 (uint32)
ebgp-send-dmz-mode: bgp-ebgp-send-dmz-disable (string)
ttl-security: 0 (uint32)
suppress4-byte-as: 0 (uint32)
capability-negotiation-suppressed: 0 (uint32)
session-open-mode: bgp-tcp-mode-type-either (string)
bfd: 0 (uint32)
bfd-mininterval: 0 (uint32)
bfd-multiplier: 0 (uint32)
tos-type-info: 0 (uint32)
tos-value-info: 6 (uint32)
nsr-disabled: 0 (uint32)
graceful-restart-disabled: 0 (uint32)
nbr-restart-time: 120 (uint32)
nbr-stale-path-time: 360 (uint32)
nbr-enforce-first-as-status: true (string)
nbr-cluster-id-type-info: 0 (uint32)
nbr-cluster-id-info: 0 (uint32)
ignore-connected-check: false (string)
internal-vpn-client: false (string)
addpath-send-capability: 0 (uint32)
update-error-handling-no-reset: 0 (uint32)
addpath-receive-capability: 0 (uint32)
egress-peer-engineering: 0 (uint32)
prefix-validation-disable: 0 (uint32)
bestpath-use-origin-as-validity: 0 (uint32)
prefix-validation-allow-invalid: 0 (uint32)
prefix-validation-signal-ibgp: 0 (uint32)
neighbor-update-filter-exists: false (string)
neighbor-update-filter-message-buffer-count: 0 (uint32)
neighbor-update-filter-message-buffer-is-non-circular: false (string)
neighbor-update-filter-logging-disable: false (string)
neighbor-update-filter-attribute-filter-group-name: (string)
graceful-shutdown-exists: 0 (uint32)
graceful-shutdown-loc-pref: 0 (uint32)
graceful-shutdown-as-prepends: 0 (uint32)
graceful-shutdown-activate: 0 (uint32)
remote-as-info: fields (items 2) {
is-item-configured: true (string)
}
speaker-id-info: fields (items 2) {
is-item-configured: false (string)
}
min-advertisement-info: fields (items 2) {
is-item-configured: false (string)
}
description-info: fields (items 2) {
is-item-configured: false (string)
}
ebgp-hop-count-info: fields (items 2) {
is-item-configured: false (string)
}
tcpmss-info: fields (items 2) {
is-item-configured: false (string)
}
bmp-servers-info: fields (items 2) {
is-item-configured: false (string)
}
keychain-info: fields (items 2) {
is-item-configured: false (string)
}
local-as-info: fields (items 2) {
is-item-configured: false (string)
}
password-info: fields (items 2) {
is-item-configured: false (string)
}
receive-buffer-info: fields (items 2) {
is-item-configured: false (string)
}
send-buffer-info: fields (items 2) {
is-item-configured: false (string)
}
shutdown-info: fields (items 2) {
is-item-configured: false (string)
}
timers-info: fields (items 2) {
is-item-configured: false (string)
}
local-address-info: fields (items 2) {
is-item-configured: false (string)
}
msg-log-in-info: fields (items 2) {
is-item-configured: false (string)
}
msg-log-out-info: fields (items 2) {
is-item-configured: false (string)
}
update-source-info: fields (items 2) {
is-item-configured: false (string)
}
dmz-link-bandwidth-info: fields (items 2) {
is-item-configured: false (string)
}
ebgp-recv-dmz-info: fields (items 2) {
is-item-configured: false (string)
}
ebgp-send-dmz-info: fields (items 2) {
is-item-configured: false (string)
}
ttl-security-info: fields (items 2) {
is-item-configured: false (string)
}
suppress4-bbyte-as-info: fields (items 2) {
is-item-configured: false (string)
}
session-open-mode-info: fields (items 2) {
is-item-configured: false (string)
}
bfd-info: fields (items 2) {
is-item-configured: false (string)
}
bfd-mininterval-info: fields (items 2) {
is-item-configured: false (string)
}
bfd-multiplier-info: fields (items 2) {
is-item-configured: false (string)
}
tos-info: fields (items 2) {
is-item-configured: false (string)
}
nsr-disabled-info: fields (items 2) {
is-item-configured: false (string)
}
graceful-restart-disabled-info: fields (items 2) {
is-item-configured: false (string)
}
nbr-restart-time-info: fields (items 2) {
is-item-configured: false (string)
}
nbr-stale-path-time-info: fields (items 2) {
is-item-configured: false (string)
}
nbr-enforce-first-as-info: fields (items 2) {
is-item-configured: false (string)
}
cluster-id-info: fields (items 2) {
is-item-configured: false (string)
}
ignore-connected-info: fields (items 2) {
is-item-configured: false (string)
}
internal-vpn-client-info: fields (items 2) {
is-item-configured: false (string)
}
addpath-send-capability-info: fields (items 2) {
is-item-configured: false (string)
}
addpath-receive-capability-info: fields (items 2) {
is-item-configured: false (string)
}
egress-peer-engineering-info: fields (items 2) {
is-item-configured: false (string)
}
update-error-handling-no-reset-info: fields (items 2) {
is-item-configured: false (string)
}
prefix-validation-disable-info: fields (items 2) {
is-item-configured: false (string)
}
prefix-validation-use-validit-info: fields (items 2) {
is-item-configured: false (string)
}
prefix-validation-allow-invalid-info: fields (items 2) {
is-item-configured: false (string)
}
prefix-validation-signal-ibgp-info: fields (items 2) {
is-item-configured: false (string)
}
neighbor-update-filter-exists-info: fields (items 2) {
is-item-configured: false (string)
}
neighbor-update-filter-message-buffer-count-info: fields (items 2) {
is-item-configured: false (string)
}
neighbor-update-filter-syslog-disable-info: fields (items 2) {
is-item-configured: false (string)
}
neighbor-update-filter-attribute-info: fields (items 2) {
is-item-configured: false (string)
}
graceful-shutdown-info: fields (items 2) {
is-item-configured: false (string)
}
graceful-shutdown-loc-pref-info: fields (items 2) {
is-item-configured: false (string)
}
graceful-shutdown-as-prepends-info: fields (items 2) {
is-item-configured: false (string)
}
graceful-shutdown-activate-info: fields (items 2) {
is-item-configured: false (string)
}
capability-negotiation-suppressed-info: fields (items 2) {
is-item-configured: false (string)
}
local-as-replace-as: false (string)
local-as-dual-as: false (string)
}
}
You can now start to collect and consume this data as you wish. One option is to use the ready-to-use stack built by Cisco.
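As an example of consuming the data yourself, the counters in the stream are cumulative, so the usual first step is to turn two successive samples into a rate. This is a minimal sketch: counter_rate is a hypothetical helper of mine, and it assumes you have already decoded each sample into a plain dict of counter names and values. With the sample-interval of 10000 milliseconds configured above, the interval would be 10 seconds.

```python
def counter_rate(previous, current, interval_seconds):
    """Return per-second rates for counters present in both samples."""
    rates = {}
    for name, value in current.items():
        if name not in previous:
            continue  # first time we see this counter, nothing to compare
        if value < previous[name]:
            continue  # counter went backwards (cleared or wrapped), skip it
        rates[name] = (value - previous[name]) / float(interval_seconds)
    return rates


if __name__ == "__main__":
    first = {"BytesReceived": 1000, "PacketsReceived": 10}
    second = {"BytesReceived": 6000, "PacketsReceived": 60}
    print(counter_rate(first, second, 10))
```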
Pretty cool. It’s a shame we can use such technology only with a limited set of Cisco IOS-XR devices, right? 🙂
Streaming Telemetry on NXOS
It’s totally true that this is still a poorly supported feature, but if you’re geeky enough you can easily obtain some kind of push-based telemetry on NXOS too. This can be done thanks to the on-box Python support of NXOS on the Cisco Nexus 9000.
Cisco Python Package
You can easily access the on-box Python Interpreter by simply typing python. Then you can import the library to interact with the underlying OS.
n9k2# python
Python 2.7.5 (default, Oct 8 2013, 23:59:43)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from cli import *
>>>
Now try to send a command and display its output:
>>> vlans = cli('show vlan')
>>> print vlans
VLAN Name Status Ports
---- -------------------------------- --------- -------------------------------
1 default active Eth1/1, Eth1/2, Eth1/3, Eth1/4
Eth1/5, Eth1/6, Eth1/7, Eth1/8
Eth1/9, Eth1/10, Eth1/11
Eth1/13, Eth1/14, Eth1/15
Eth1/16, Eth1/17, Eth1/18
Eth1/19, Eth1/20, Eth1/21
Eth1/22, Eth1/23, Eth1/24
Eth1/25, Eth1/26, Eth1/27
Eth1/28, Eth1/29, Eth1/30
Eth1/34, Eth1/35, Eth1/36
Eth1/37, Eth1/38, Eth1/39
Eth1/40, Eth1/41, Eth1/42
Eth1/43, Eth1/44, Eth1/45
Eth1/46, Eth1/47, Eth1/48
Eth2/5, Eth2/6, Eth2/7, Eth2/8
Eth2/9, Eth2/10, Eth2/11
Eth2/12
20 VLAN0020 active
122 VLAN0122 active
123 VLAN0123 active
200 WEB act/lshut
201 VLAN0201 act/lshut
VLAN Type Vlan-mode
---- ----- ----------
1 enet CE
20 enet CE
122 enet CE
123 enet CE
200 enet CE
201 enet CE
Remote SPAN VLANs
-------------------------------------------------------------------------------
Primary Secondary Type Ports
------- --------- --------------- -------------------------------------------
We can also deal with structured output using the clid function.
>>> vlans = clid('show vlan')
>>> print vlans
{"TABLE_vlanbrief": {"ROW_vlanbrief": [{"vlanshowbr-vlanid": "1", "vlanshowbr-vlanid-utf": "1", "vlanshowbr-vlanname": "default", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "noshutdown", "vlanshowplist-ifidx": "Ethernet1/1-11,Ethernet1/13-30,Ethernet1/34-48,Ethernet2/5-12"}, {"vlanshowbr-vlanid": "20", "vlanshowbr-vlanid-utf": "20", "vlanshowbr-vlanname": "VLAN0020", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "noshutdown"}, {"vlanshowbr-vlanid": "122", "vlanshowbr-vlanid-utf": "122", "vlanshowbr-vlanname": "VLAN0122", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "noshutdown"}, {"vlanshowbr-vlanid": "123", "vlanshowbr-vlanid-utf": "123", "vlanshowbr-vlanname": "VLAN0123", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "noshutdown"}, {"vlanshowbr-vlanid": "200", "vlanshowbr-vlanid-utf": "200", "vlanshowbr-vlanname": "WEB", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "shutdown"}, {"vlanshowbr-vlanid": "201", "vlanshowbr-vlanid-utf": "201", "vlanshowbr-vlanname": "VLAN0201", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "shutdown"}]}, "TABLE_mtuinfo": {"ROW_mtuinfo": [{"vlanshowinfo-vlanid": "1", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}, {"vlanshowinfo-vlanid": "20", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}, {"vlanshowinfo-vlanid": "122", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}, {"vlanshowinfo-vlanid": "123", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}, {"vlanshowinfo-vlanid": "200", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}, {"vlanshowinfo-vlanid": "201", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}]}}
The really nice thing is that we can also run Python scripts!
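Before moving to scripts, note that the clid output printed above is a JSON string, so it can be turned into Python objects with json.loads. The sketch below maps VLAN IDs to names using the keys visible in that output; vlan_names is a hypothetical helper of mine, and the single-row check is a defensive assumption in case a lone row comes back as a dict rather than a list.

```python
import json


def vlan_names(clid_output):
    """Map VLAN id -> VLAN name from the JSON string of clid('show vlan')."""
    data = json.loads(clid_output)
    rows = data["TABLE_vlanbrief"]["ROW_vlanbrief"]
    if isinstance(rows, dict):
        rows = [rows]  # assumption: a single row may not be wrapped in a list
    return dict(
        (row["vlanshowbr-vlanid"], row["vlanshowbr-vlanname"]) for row in rows
    )


if __name__ == "__main__":
    # Trimmed-down copy of the output shown above, for off-box testing.
    sample = json.dumps({
        "TABLE_vlanbrief": {
            "ROW_vlanbrief": [
                {"vlanshowbr-vlanid": "1", "vlanshowbr-vlanname": "default"},
                {"vlanshowbr-vlanid": "200", "vlanshowbr-vlanname": "WEB"},
            ]
        }
    })
    print(vlan_names(sample))
```

On the switch you would call it as vlan_names(clid('show vlan')).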
So, the goal is to write a Python script that collects some data and pushes it off-box to an external collector 🙂 This can be done by running the script, storing the data in a file and copying the file to the external collector. Doing so, we’d obtain almost the same results as with XR Telemetry.
But what if we want to make it even cooler? Let’s say we write a script that collects BGP neighbors every 30 seconds and pushes an alert every time the state of any neighbor changes. What if this alert is pushed to Slack instead of a server? Okay, let’s do it 🙂
Enable Slack WebHook
In order to push automatic notifications to Slack, we need to enable a WebHook, which is a way to post messages as HTTP requests with a JSON payload. To configure a WebHook, go to [your company].slack.com/services/new/incoming-webhook and click on the Add Incoming WebHooks Integration button. Once you have your WebHook link, you can use it from Python (with the requests library or, as in the function below, with curl) to POST messages to channels.
A simple function to push such messages would look like this:
import json
import os

def message_deliver(text):
    # Replace the placeholder with the WebHook URL generated by Slack
    webhook_url = "<webhook_url>"
    username = "gabriele"
    icon_emoji = ":panda_face:"
    channel = "general"
    body = {
        'username': username,
        'icon_emoji': icon_emoji,
        'text': text,
        'channel': channel
    }
    # json.dumps builds a valid JSON payload; note that a single quote
    # inside text would still break the shell quoting used below
    command = '''
    curl -i -k -H "Content-Type: application/json" -X POST {0} --data '{1}'
    '''.format(webhook_url, json.dumps(body))
    os.system(command)
Calling message_deliver('Hello World!') will push a Hello World! message to my WebHook, resulting in a message posted by the username gabriele on the channel called general. We’ll use this exact function to push our automatic BGP notifications.
Get BGP neighbors
Writing such a script is pretty easy and you can achieve it in many ways. You can find here the one I used for this test. It gathers and stores all BGP neighbors and their states every 30 seconds, compares the current state with the previous one and finally pushes a message to Slack if any change is found.
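The comparison step can be sketched as follows. This is not the script linked above, just a minimal illustration under my own assumptions: neighbor states are kept in a dict keyed by neighbor address, and the previous run is stored in a JSON file (the script starts from scratch at every run, so the state has to live somewhere). All function names here are hypothetical.

```python
import json
import os


def load_state(path):
    """Return the neighbor states saved by the previous run, or {} if none."""
    if not os.path.exists(path):
        return {}
    with open(path) as handle:
        return json.load(handle)


def save_state(path, state):
    """Persist the current neighbor states for the next run."""
    with open(path, "w") as handle:
        json.dump(state, handle)


def diff_neighbor_states(previous, current):
    """Return one message per neighbor that appeared, vanished or changed."""
    messages = []
    for neighbor in sorted(set(previous) | set(current)):
        old = previous.get(neighbor)
        new = current.get(neighbor)
        if old == new:
            continue
        if old is None:
            messages.append("BGP neighbor {0} appeared in state {1}".format(neighbor, new))
        elif new is None:
            messages.append("BGP neighbor {0} disappeared (was {1})".format(neighbor, old))
        else:
            messages.append("BGP neighbor {0} changed from {1} to {2}".format(neighbor, old, new))
    return messages


if __name__ == "__main__":
    before = {"10.0.0.2": "Established"}
    after = {"10.0.0.2": "Idle"}
    for message in diff_neighbor_states(before, after):
        print(message)
```

Each returned message can then be handed to message_deliver.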
NXOS Scheduler
Now, as the last step, we have to configure a scheduler to run the Python script. This is what my scheduler looks like.
n9k2# show scheduler config
config terminal
feature scheduler
scheduler logfile size 16
end
config terminal
scheduler job name get_bgp_neighbors
python bootflash:/get_bgp_neighbors.py
end
config terminal
scheduler schedule name get_bgp_neighbors
time start 2016:11:01:13:41 repeat 5
job name get_bgp_neighbors
end
Testing it
I’ve established a BGP session between two N9K switches. Once everything is set, we can simply shut one neighbor down, resulting in this.
Nice, right?
Even without XR capabilities, you can build something nice by exploiting Python and Cisco libraries!
Conclusions
I’ve touched on many topics in this post: from Telemetry to Event-Driven automation, to WebHooks and on-box Python. I think all of these can be useful when it comes to managing or testing a network or service, and I hope this will help you get started with such tools and technologies 🙂
Network Automation Survey
In order to better understand how people and companies are using these technologies on their networks, a group of professionals has put together a survey to get a bigger picture of how we automate (or, at least, attempt to) our networks. Here it is if you want to participate.