Cisco Wants You to Use APIs and It Shows

As anticipated in a previous blog post, I attended Cisco Live Europe in Berlin from 20th to 24th February. During that time I also had the pleasure of being invited as a delegate to the Tech Field Day at CLEUR event, where I heard some interesting news from Cisco on several topics and environments.

During the first TFD day of sessions, we learned how Cisco is developing and leveraging its DNA (Digital Network Architecture) to simplify how campus networks are managed. This post will be focused on new features around programmability and automation on enterprise switches.

Programmable Interfaces in Enterprise Switches

SDN, NFV, Automation, Softwarization, Orchestration. When software meets networks, the confusion starts with the naming. There is such a world of new tools, names and acronyms around this field that people can get lost even before starting to dig deeper into it. That’s why our presenter, Fabrizio Maccioni, decided to give us a very focused and practical view of what Cisco is doing with programmable interfaces in campus switches.

(In)Consistency

One of the biggest difficulties in automating multi-vendor networks is that most vendors (when they offer anything at all!) expose totally different APIs and protocols to “automate” their devices. This means that in order to execute a simple operation, like getting the current running configuration, we may have to use a REST API for vendor X, plain CLI for vendor Y and some proprietary protocol for vendor Z. The operation may also return structured or unstructured data depending on the vendor. This is clearly sub-optimal and it doesn’t help engineers to start automating their networks.

This is even more frustrating when it happens with different devices from the same vendor! Today, depending on which Cisco equipment you have, you may end up in one of the following situations:

  • Catalyst 4K: no APIs offered. Simple SSH/screen scraping of unstructured data
  • Catalyst 3K: NETCONF protocol with YANG models
  • Nexus 7K: NX-API leveraging CLI
  • Nexus 9K: NX-API leveraging REST

The good news is that this is going to change soon. Cisco is working on driving consistency across several platforms so that we will be able to manage our Cisco networks automatically in the same way, regardless of the device platform and OS. In particular, Cisco will bring this kind of consistency across IOS-XR, IOS-XE and NX-OS.

This is an important commitment from Cisco as it represents the desire to lower the barrier to start using APIs instead of CLI on their enterprise equipment.

Data Models

How is Cisco going to do it?

The plan is to leverage YANG data models over device features, using NETCONF, REST or gRPC to configure/get those features.

YANG (Yet Another Next Generation) is a data modeling language used to describe how data is represented and accessed. YANG data models are represented by definition hierarchies called schema trees, whose instances are encoded in XML (or, with RESTCONF, in JSON as well).

As an example, the following block represents the data model for ACL statistical data.

module cisco-acl-oper {
  yang-version 1;
  namespace "urn:cisco:params:xml:ns:yang:cisco-acl-oper";
  prefix cisco-access-control-list-oper;

  import ned {
    prefix ned;
  }
  import ietf-yang-types {
    prefix "yang";
  }

  organization
    "Cisco Systems, Inc.";

  contact
    "Cisco Systems, Inc. Customer Service Postal: 170 W Tasman Drive
     San Jose, CA 95134 Tel: +1 1800 553-NETS E-mail: cs-yang@cisco.com";

  description
    "This module contains a collection of YANG definitions for ACL statistical data."+
    "Copyright (c) 2016 by Cisco Systems, Inc."+
    "All rights reserved.";

  reference "TODO";

  revision 2016-03-30 {
    description
      "Update description with copyright notice.";
  }

  revision 2015-08-10 {
    description "Model for Network Access Control List (ACL) operational data.";
    reference
      "RFC XXXX: Network Access Control List (ACL)
      YANG Data  Model";
  }

  augment /ned:native {
    container access-lists {
      config false;
      description
        "This is top level container for Access Control Lists. It can have one
        or more Access Control List.";

      list access-list {
        key access-control-list-name;
        description "An access list (acl) is an ordered list of
        access list entries (ACE). Each access control entries has a
        list of match criteria, and a list of actions.
        Since there are several kinds of access control lists
        implemented with different attributes for
        each and different for each vendor, this
        model accommodates customizing access control lists for
        each kind and for each vendor.";

        leaf access-control-list-name {
          type string;
          description "The name of access-list. A device MAY restrict the length
        and value of this name, possibly space and special characters are not
        allowed.";
        }

        container access-list-entries {
          description "The access-list-entries container contains
          a list of access-list-entry(ACE).";

          list access-list-entry {
            key rule-name;
            ordered-by user;
            description "List of access list entries(ACE)";
            leaf rule-name {
              type uint32;
              description "Entry number.";
            }

            container access-list-entries-oper-data {
              description "Per access list entries operational data";
              leaf match-counter {
                type yang:counter64;
                description "Number of matches for an access list entry";
              }
            }
          }
        }
      }
    }
  }
}
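
To make the schema tree a bit more concrete, here is a minimal Python sketch of what an instance of the access-lists subtree could look like in XML, and how a script might read the per-entry counters. The namespace is taken from the module above; the ACL name, rule numbers and counter values are invented for illustration, and the enclosing native container that the model augments is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the module's "namespace" statement above.
NS = {"acl": "urn:cisco:params:xml:ns:yang:cisco-acl-oper"}

# Hypothetical instance data: ACL name, rule numbers and counters are invented.
SAMPLE = """
<access-lists xmlns="urn:cisco:params:xml:ns:yang:cisco-acl-oper">
  <access-list>
    <access-control-list-name>MGMT-IN</access-control-list-name>
    <access-list-entries>
      <access-list-entry>
        <rule-name>10</rule-name>
        <access-list-entries-oper-data>
          <match-counter>42</match-counter>
        </access-list-entries-oper-data>
      </access-list-entry>
      <access-list-entry>
        <rule-name>20</rule-name>
        <access-list-entries-oper-data>
          <match-counter>7</match-counter>
        </access-list-entries-oper-data>
      </access-list-entry>
    </access-list-entries>
  </access-list>
</access-lists>
"""


def match_counters(xml_text):
    """Return {acl_name: {rule_number: match_count}} from an instance document."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    # "list access-list" in the model becomes repeated <access-list> elements.
    for acl in root.findall("acl:access-list", NS):
        name = acl.findtext("acl:access-control-list-name", namespaces=NS)
        entries = {}
        for ace in acl.findall("acl:access-list-entries/acl:access-list-entry", NS):
            rule = int(ace.findtext("acl:rule-name", namespaces=NS))
            counter = ace.findtext(
                "acl:access-list-entries-oper-data/acl:match-counter", namespaces=NS
            )
            entries[rule] = int(counter)
        result[name] = entries
    return result


if __name__ == "__main__":
    print(match_counters(SAMPLE))
```

Each container, list and leaf in the YANG module maps to one level of nesting in the instance, which is why the lookup paths in the script read almost exactly like the model.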

YANG models are open and available on GitHub, where we can find a sub-directory reserved to Cisco models for IOS-XE, IOS-XR and NX-OS.

These models can either be open or native:

  • open models are vendor-independent, designed by standardization organizations like IETF but also by other entities like OpenConfig
  • native models are designed by Cisco itself for its own equipment

Cisco devices will support both types of model, with native models being a superset of open ones. The reason is clear: standardization organizations are generally slow, because they have to find a trade-off between several parties to reach a solution that fits the whole industry. As a result, platform-specific features are left out of the equation. To avoid partial feature coverage (which would be the worst outcome!), Cisco has developed native models that offer complete support for all features. Both families will always be supported, and the user chooses which one to use. So if you want to use only IETF models across a multi-vendor environment that includes Cisco devices, go for it!

This also has another implication: native models may differ across platforms, meaning that the same feature may be represented by different models on NXOS and XE, since NXOS can have some specific attributes not present on XE. This totally makes sense, but I still hope deviations will be minimal. We’ll know soon!

An important part of Fabrizio’s session focused on demos, showing us a few use cases where these new features can be useful. Obviously, I couldn’t sit still without testing something myself 😉

Demo time

An example is worth a thousand words, so we’ll use two simple scripts to compare sending CLI commands over SSH with using an open programmable interface like REST, highlighting the key benefits of the second approach.

Here I’ll use a Cisco CSR1000V device running Cisco IOS XE Software, Version 16.03.01. Jason Edelman has highlighted how this platform already supports RESTCONF, even if it still appears as a hidden feature.

Operational commands

First, I want to compare sending operational commands. I’ll use Netmiko as the SSH library. The following simple script gets the show ip interface brief output and prints it.

from netmiko import ConnectHandler


def main():
    # Open an SSH session to the device
    device = ConnectHandler(host='csr1', username='test', password='test', device_type='cisco_ios')
    # Run the show command and print the raw text output
    output = device.send_command('show ip interface brief')
    print(output)
    device.disconnect()


if __name__ == "__main__":
    main()

The output will look as follows:

csr1#show ip interface brief
Interface              IP-Address      OK? Method Status                Protocol
GigabitEthernet1       10.0.0.51       YES NVRAM  up                    up
GigabitEthernet2       unassigned      YES NVRAM  up                    up
GigabitEthernet3       unassigned      YES NVRAM  up                    up
GigabitEthernet4       10.10.10.1      YES NVRAM  up                    up
Loopback10             unassigned      YES unset  up                    up

Let’s now do the same using the REST API.

import requests
from requests.auth import HTTPBasicAuth


def main():
    auth = HTTPBasicAuth('test', 'test')
    # Ask for (and declare) YANG-modeled data encoded as JSON
    headers = {
        'Accept': 'application/vnd.yang.data+json',
        'Content-Type': 'application/vnd.yang.data+json'
    }

    # '?deep' asks for the whole subtree below the interface container
    url = 'http://csr1/restconf/api/config/native/interface?deep'
    response = requests.get(url, headers=headers, auth=auth)
    print(response.text)


if __name__ == "__main__":
    main()

The produced output will look like this (too verbose).

Let’s compare them:

  • The first approach returned unstructured data, which is easy for engineers to read but hard for software to handle. Also, it took 8.8s to establish an SSH session, execute the command and retrieve the output. SSH is stateful, meaning a session needs to be created before any command can be sent
  • The second approach returned A LOT of structured data, which is really easy to manage and parse. Also, it took only 3.2s to get a huge amount of data, because REST is a stateless service, meaning no session has to be established first
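
To give an idea of what “hard to manage by software” means in practice, here is a small sketch that turns the show ip interface brief text from above into a list of dictionaries. The regular expression and the function name are my own and are tied to this exact column layout, so treat it as an illustration rather than a robust parser. With the JSON reply from the REST call, the equivalent step would simply be a json.loads() on the response body.

```python
import re

# Output copied from the SSH example above.
CLI_OUTPUT = """\
Interface              IP-Address      OK? Method Status                Protocol
GigabitEthernet1       10.0.0.51       YES NVRAM  up                    up
GigabitEthernet2       unassigned      YES NVRAM  up                    up
GigabitEthernet3       unassigned      YES NVRAM  up                    up
GigabitEthernet4       10.10.10.1      YES NVRAM  up                    up
Loopback10             unassigned      YES unset  up                    up
"""

# One data row: name, address, OK flag, method, status, protocol.
# The header line does not match because its third column is "OK?".
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<ip>\S+)\s+(?P<ok>YES|NO)\s+(?P<method>\S+)\s+"
    r"(?P<status>up|down|administratively down)\s+(?P<protocol>\S+)\s*$"
)


def parse_ip_interface_brief(text):
    """Screen-scrape the table into a list of dicts, one per interface."""
    interfaces = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            interfaces.append(match.groupdict())
    return interfaces


if __name__ == "__main__":
    for interface in parse_ip_interface_brief(CLI_OUTPUT):
        print(interface["name"], interface["ip"], interface["status"])
```

Any change in the column layout between software releases or platforms can silently break a parser like this, which is exactly the problem structured data avoids.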

Configuration commands

Let’s now see how SSH and REST behave differently with configuration commands. I want to send a set of configuration commands to configure an interface with the following properties:

  1. name: Loopback10
  2. ip: 100.10.10.10
  3. mask: 255.255.255.255
  4. description: Configured with RESTCONF

To make things trickier, let’s insert a typo into the mask: 255.255.255.355.

I’ve written two other simple scripts executing configuration commands via SSH and REST. In the first one, since it uses simple SSH, each command is sent individually. Let’s run it:

$ python test_ssh_command.py
$

Let’s now check what happened on the device:

csr1#show run interface loopback10
Building configuration…

Current configuration : 76 bytes
!
interface Loopback10
description Configured via REST
no ip address
end

As we can see, no IP/mask is configured because of the typo in the mask value, but the interface has been created and the description has been configured correctly. This means this kind of operation is not transactional/atomic: in a transactional operation either all changes are applied successfully, or none of them is applied at all.

Now, let’s run the REST script, in which we can send multiple configuration commands at once (I’ve removed the loopback interface before doing it and included a print to show the request’s response).

$ python test_rest_command.py
{
  "errors": {
    "error": [
      {
        "error-message": "invalid value for: mask in /ios:native/ios:interface/ios:Loopback[ios:name='10']/ios:ip/ios:address/ios:primary/ios:mask: \"255.255.255.355\" is not a valid value.",
        "error-urlpath": "/api/config/native/interface/Loopback",
        "error-tag": "malformed-message"
      }
    ]
  }
}

Here we have a clear description of what went wrong: invalid value for: mask "255.255.255.355" is not a valid value.

Let’s get back to the device and see what happened.

csr1#show run interface loopback10
                                ^
% Invalid input detected at '^' marker.

The loopback interface has not been created! How come?
The RESTCONF interface pushes the configuration to the NETCONF datastore first. Then, if the configuration is valid, meaning all the values comply with the underlying data models, it’s committed to the running configuration. Otherwise, ALL the configuration operations contained in the RESTCONF/NETCONF call are rolled back and never applied!
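
The difference between the two behaviours can be mimicked with a toy model in plain Python. This is only a sketch of the validate-then-commit idea, not how the device implements it internally: the dictionaries, function names and the validation rule are all made up for the example.

```python
import copy
import ipaddress

# The same change set as in the test above, including the typo in the mask.
CHANGES = [
    ("description", "Configured with RESTCONF"),
    ("ip", ("100.10.10.10", "255.255.255.355")),
]


def is_dotted_quad(value):
    """True if value is a syntactically valid IPv4 dotted quad."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def is_valid(key, value):
    """Toy stand-in for the data model: only the 'ip' entry is checked."""
    if key == "ip":
        return all(is_dotted_quad(part) for part in value)
    return True


def apply_line_by_line(running, changes):
    """SSH-style: every line stands alone, rejected lines are simply skipped."""
    for key, value in changes:
        if is_valid(key, value):
            running[key] = value
    return running


def apply_transaction(running, changes):
    """Validate-then-commit: build a candidate, commit only if all is valid."""
    candidate = copy.deepcopy(running)
    for key, value in changes:
        if not is_valid(key, value):
            # Nothing has touched 'running' yet, so there is nothing to undo.
            raise ValueError("invalid value for %s: %r" % (key, value))
        candidate[key] = value
    running.clear()
    running.update(candidate)
    return running


if __name__ == "__main__":
    print(apply_line_by_line({}, CHANGES))   # description only, no address
    try:
        apply_transaction({}, CHANGES)
    except ValueError as err:
        print("rejected:", err)              # nothing was applied
```

The key design point is that the transactional version never writes to the running state until every value has passed validation.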

Again, let’s compare:

  • Single operation in SSH, multiple operations at once in RESTCONF/NETCONF
  • Open interfaces are transactional, SSH is not
  • Better error handling in RESTCONF/NETCONF. We can use Python’s try/except construct to check and remediate errors
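
As a sketch of that last point, the error body returned above can be turned into a Python exception and handled with try/except. The ConfigError class and the raise_on_errors helper are names I made up; the JSON is the response shown earlier.

```python
import json

# Error body copied from the REST example above (raw string keeps the \" escapes).
ERROR_BODY = r"""
{
  "errors": {
    "error": [
      {
        "error-message": "invalid value for: mask in /ios:native/ios:interface/ios:Loopback[ios:name='10']/ios:ip/ios:address/ios:primary/ios:mask: \"255.255.255.355\" is not a valid value.",
        "error-urlpath": "/api/config/native/interface/Loopback",
        "error-tag": "malformed-message"
      }
    ]
  }
}
"""


class ConfigError(Exception):
    """Raised when the device reports one or more errors (hypothetical helper)."""


def raise_on_errors(body_text):
    """Raise ConfigError if the response body carries an 'errors' structure."""
    try:
        body = json.loads(body_text)
    except ValueError:
        return  # empty or non-JSON body: nothing to inspect
    if not isinstance(body, dict):
        return
    errors = body.get("errors", {}).get("error", [])
    if errors:
        messages = [e.get("error-message", "unknown error") for e in errors]
        raise ConfigError("; ".join(messages))


if __name__ == "__main__":
    try:
        raise_on_errors(ERROR_BODY)
    except ConfigError as err:
        # This is where a script could fix the value and retry, or alert someone.
        print("Configuration rejected:", err)
```

In a real script the body would come from response.text of the requests call, and the except branch is the natural place for remediation logic.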

On-box Python

I’ve already written about on-box Python support on the Nexus 9000 platform. This can be particularly useful in some use cases, and Cisco is now adding this support to its enterprise switches as well!

Fabrizio showed us an interesting demo combining the Embedded Event Manager (EEM) and Python. He configured EEM (1) to watch log messages for an IF-DOWN message and run a Python script that reactivates the interface via a “no shut” command, and (2) to create a backup of the configuration every time a change occurs.
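
The script from the demo was not shared, so the following is only my guess at what the Python side of the first use case might look like. It assumes the script receives the syslog line as text and that the message follows the usual IOS "%LINK-...: Interface X, changed state to ..." pattern; the function name is hypothetical, and the step that actually pushes the commands through the on-box CLI helper is deliberately left out.

```python
import re

# Assumed syslog format for an interface going down (admin or link).
LINK_DOWN = re.compile(
    r"%LINK-\d-(?:UPDOWN|CHANGED): Interface (?P<ifname>[^,\s]+), "
    r"changed state to (?:administratively )?down"
)


def remediation_commands(syslog_line):
    """Return the config lines that would re-enable the interface, or []."""
    match = LINK_DOWN.search(syslog_line)
    if not match:
        return []
    return ["interface " + match.group("ifname"), "no shutdown"]


if __name__ == "__main__":
    line = ("*Feb 22 10:00:00.000: %LINK-5-CHANGED: Interface GigabitEthernet2, "
            "changed state to administratively down")
    # On the box, these lines would be handed to the device's Python CLI helper.
    print(remediation_commands(line))
```

Keeping the parsing separate from the part that touches the device makes the logic easy to test off-box before wiring it to EEM.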

Someone may ask: why should we consume CPU on the device for something we can do with an external system? Here are a few points:

  • Python runs on a secure Linux instance on the device, so the actual OS is separated from Linux environment. This enabled Cisco to limit the Linux container CPU usage up to 1% of the total. This means Python scripts will never overload your device.
  • Why shouldn’t you use something that is already there without the need to add an external system which may mean additional complexity, overhead etc. ? 🙂
  • What if you accidentally lose access to the device due to some misconfiguration on your management interface? This can be used to automatically restore the right configuration!

Conclusions

Unlike other buzzwords, Automation is a reality today, and it’s good to see Cisco working hard to offer a better experience to those who want to start automating their networks, not only in data centers but in campus environments as well.

In particular, I’m really happy about Cisco’s effort to bring some level of consistency across (almost) the full range of OSs. This is something the industry has to lean toward if we want automation to be the norm when it comes to operating networks: consistency across APIs, enabling us to access devices the same way, and consistency of data using open models, enabling us to get the same kind of data across different devices.

That’s the future and I’m excited about it 🙂

 

Cisco Live Europe 2017

Last October I was in New York City to enjoy a few days of onsite work at Network to Code’s HQ. I love working remotely from my home in Sicily, but it’s always interesting and exciting to physically join the team every once in a while. At that time, we had the opportunity to attend AnsibleFest in Brooklyn, a full day of presentations and workshops around the Ansible world. It was the first important live event I’ve ever attended and I really enjoyed it, as it offered the opportunity to meet new folks sharing our “automation vision” 🙂

Next month it will be time for another important event: Cisco Live Europe in Berlin. I discovered my passion for networks by attending a Cisco Academy class at high school, and since that moment I’ve always dreamed of attending a Cisco Live event. Now I’ll finally have the opportunity to do it!

Clearly, my interest in networks has evolved over time and my focus is now on Network Automation: Ansible, NAPALM, Python, APIs, CI/CD, StackStorm, you got it, this kind of magical stuff 😀 Because of this, I plan to spend most of my time at the DevNet Zone, a place where crazy people can talk about crazy stuff. Some of the sessions I’m interested in attending are these:

  • Getting Started with Containers: DEVNET-2042
  • Building a DevOps CICD Pipeline from Scratch: DEVNET-2203
  • Demystifying Container Networking: DEVNET-1195
  • DevNet Workshop – Managing Cisco UCS with the Python SDK: DEVNET-2060
  • DevNet Workshop – ACI API: DEVNET-2054
  • Cisco UCS Python SDKs: DEVNET-2063
  • NetDevOps for the Network Dude – How to get started with API’s, Ansible and Python: DEVNET-1002
  • Mastering ACI Programmability and Automating common DC Tasks: DEVNET-2001
  • DevNet Workshop – NXOS in the Real World Using NX-API REST: DEVNET-2101
  • DevNet Workshop – NETCONF/RESTCONF/YANG API: DEVNET-2044
  • DevNet Workshop – Device programmability for zero touch provisioning: DEVNET-2053

Yeah, that’s a lot of content, and I’ll need to plan ahead diligently for it. That said, I’m sure it’ll be an amazing experience!

I’ll arrive in Berlin on February 20, so if you’d like to meet and talk about everything around networking and automation I’d be very happy to share a beer with you, so just reach out! 🙂

 

 

Telemetry Streaming on Cisco IOS-XR, NXOS and Slack

As we all know, Network Automation is getting more and more real in today’s networks and communities. All vendors now offer some programmable interface to their products, enabling network engineers to better configure and manage their infrastructures. Automation-inspired communities like the Network to Code Slack channel now count 1300+ professionals who exchange knowledge and help each other on a day-to-day basis. Remarkable Python automation libraries (netmiko, NAPALM) are expanding and improving at a very good pace, existing tools and frameworks like Ansible have lately added a bunch of new networking-related features, and networking hackathons are being organized.

All of these are very good signals that should push you to embrace the change, if you haven’t done it yet 🙂 This change is so big that it involves not only device configuration (as many people think when they first meet automation) but the whole network life cycle, from design to configuration, to maintenance and monitoring.

I single out monitoring because it is a big topic, and some vendors like Cisco have started to support several ways to dynamically gather data and statistics from devices.

Streaming Telemetry

The last sentence may sound familiar to you (SNMP anyone?) but Telemetry is a totally different concept. The whole idea of Telemetry is based on a push model, where the device sends data to a configured receiver. This can be done on a timer basis or on a policy/event-driven basis. That’s pretty different from the pull model used by SNMP, where the management station has to poll the device.

Data gained from Telemetry can be used to automatically re-configure the device, to build statistics for some given metrics, to raise preventive alarms, or for whatever other use case you want to develop.

Cisco offers these capabilities to its IOS-XR (from Release 6.1.1 onwards) in two different models:

  • Model Driven Telemetry (MDT):  here the streamed data is defined by a YANG model, either an Open-Config or Cisco IOSXR native model. In order to use MDT the device must support
  • Policy Driven Telemetry (PDT): here the streamed data is defined by a policy file, which also defines the frequency of the streaming. An example of a policy file to stream interface counters looks like this:
{
  "Name": "TelemetryTest",
  "Metadata": {
    "Version": 25,
    "Description": "This is a sample policy to demonstrate the syntax",
    "Comment": "This is the first draft",
    "Identifier": "<data that may be sent by the encoder to the mgmt stn"
  },
  "CollectionGroups": {
    "FirstGroup": {
      "Period": 10,
      "Paths": [
        "RootOper.Interfaces.Interface(*)"
      ]
    }
  }
}
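
Since the policy file is plain JSON, it is easy to inspect or generate from a script. The following sketch loads the sample policy above and lists which paths are collected at which period; the summarize_policy function is my own helper, not part of any Cisco tooling.

```python
import json

# The sample policy shown above.
POLICY = """
{
  "Name": "TelemetryTest",
  "Metadata": {
    "Version": 25,
    "Description": "This is a sample policy to demonstrate the syntax",
    "Comment": "This is the first draft",
    "Identifier": "<data that may be sent by the encoder to the mgmt stn"
  },
  "CollectionGroups": {
    "FirstGroup": {
      "Period": 10,
      "Paths": [
        "RootOper.Interfaces.Interface(*)"
      ]
    }
  }
}
"""


def summarize_policy(text):
    """Return a list of (group, period, path) tuples found in a policy file."""
    policy = json.loads(text)
    summary = []
    for group, body in policy["CollectionGroups"].items():
        for path in body["Paths"]:
            summary.append((group, body["Period"], path))
    return summary


if __name__ == "__main__":
    for group, period, path in summarize_policy(POLICY):
        print("%s: every %s -> %s" % (group, period, path))
```

A check like this is a cheap way to review a policy before loading it on a device.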

Testing MDT on XR

I’m really interested in this topic so I wanted to do a really quick test. Luckily, Cisco offers some nice tutorials like this.

This is what I configured on my XR device:

RP/0/RP0/CPU0:xrv1(config)# telemetry model-driven
RP/0/RP0/CPU0:xrv1(config-model-driven)# destination-group jumphost
RP/0/RP0/CPU0:xrv1(config-model-driven-dest)# address family ipv4 10.0.0.5 port 5555
RP/0/RP0/CPU0:xrv1(config-model-driven-dest-addr)# encoding self-describing-gpb
RP/0/RP0/CPU0:xrv1(config-model-driven-dest-addr)# protocol tcp
RP/0/RP0/CPU0:xrv1(config-model-driven-dest-addr)# exit
RP/0/RP0/CPU0:xrv1(config-model-driven)#
RP/0/RP0/CPU0:xrv1(config-model-driven)#sensor-group SGroup1
RP/0/RP0/CPU0:xrv1(config-model-driven-snsr-grp)# sensor-path Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters
RP/0/RP0/CPU0:xrv1(config-model-driven-snsr-grp)# exit
RP/0/RP0/CPU0:xrv1(config-model-driven)#
RP/0/RP0/CPU0:xrv1(config-model-driven)#subscription Sub1
RP/0/RP0/CPU0:xrv1(config-model-driven-subs)#sensor-group-id SGroup1 sample-interval 10000
RP/0/RP0/CPU0:xrv1(config-model-driven-subs)#destination-id jumphost
RP/0/RP0/CPU0:xrv1(config-model-driven-snsr-grp)# commit

This instructs the device to collect data based on a YANG model that exposes interface counters. The next step is to enable the data collector on the receiving host. You can download the Collector repo from GitHub and pick the collector compatible with your configured encoding method. It’s just a piece of Python code, so all you have to do is run it on your console, specifying the receiving address and port.

$ python telemetry_receiver.py --ip-address 10.0.0.5 --port 5555

Compiled cisco.proto
Waiting for TCP connection
Waiting for UDP message
Got TCP connection
Getting TCP message
  Message Type: JSON (2))
  Flags: None
  Length: 10695
Decoding message

CollectionStartTime: Mon Nov 14 06:07:39 2016 (393ms)
CollectionID: 1151
CollectionEndTime: Mon Nov 14 06:07:39 2016 (742ms)
Version: 25
Policy: TelemetryTest
Path: RootOper.Interfaces.Interface
Identifier: <data that may be sent by the encoder to the mgmt stn
Data: {
  RootOper {
    Interfaces {
      Interface (5 items - displaying first entry only) [
        [0]
          InterfaceName: GigabitEthernet0/0/0/0
          InterfaceHandle: GigabitEthernet0/0/0/0
          IsL2Looped: False
          EncapsulationTypeString: ARPA
          HardwareTypeString: GigabitEthernet
          State: IM_STATE_ADMINDOWN
          LastStateTransitionTime: 6644
          MaxBandwidth: 1000000
          InFlowControl: IM_ATTR_FLOW_CONTROL_OFF
          LineState: IM_STATE_ADMINDOWN
          LinkType: IM_ATTR_LINK_TYPE_FORCE
          CollectionTime: Mon Nov 14 06:07:39 2016 (544ms)
          Description:
          DataRates {
            PeakInputDataRate: 0
            LoadInterval: 9
            InputPacketRate: 0
            PeakOutputDataRate: 0
            InputLoad: 0
            PeakOutputPacketRate: 0
            OutputPacketRate: 0
            OutputLoad: 0
            Bandwidth: 1000000
            Reliability: 255
            PeakInputPacketRate: 0
            InputDataRate: 0
            OutputDataRate: 0
          }
          InterfaceType: IFT_GETHERNET
          CarrierDelay {
            CarrierDelayUp: 10
            CarrierDelayDown: 0
          }
          StateTransitionCount: 0
          IsL2TransportEnabled: False
          Encapsulation: ether
          ParentInterfaceName: <No interface>
          IsDampeningEnabled: False
          Duplexity: IM_ATTR_DUPLEX_FULL
          MediaType: IM_ATTR_MEDIA_OTHER
          MTU: 1514
          MACAddress {
            Address: 2cc2.605c.9017
          }
          BurnedInAddress {
            Address: 2cc2.605c.9017
          }
          Bandwidth: 1000000
          OutFlowControl: IM_ATTR_FLOW_CONTROL_OFF
          IfIndex: 0
          InterfaceStatistics {
            StatsType: Full
            FullInterfaceStats {
              InputOverruns: 0
              ParityPacketsReceived: 0
              MulticastPacketsSent: 0
              MulticastPacketsReceived: 0
              InputIgnoredPackets: 0
              SecondsSinceLastClearCounters: 0
              Applique: 0
              SecondsSincePacketSent: 4294967295
              PacketsSent: 0
              OutputBuffersSwappedOut: 0
              GiantPacketsReceived: 0
              SecondsSincePacketReceived: 4294967295
              InputErrors: 0
              BytesReceived: 0
              LastDiscontinuityTime: 1479115826
              OutputErrors: 0
              CarrierTransitions: 0
              LastDataTime: 1479121659
              CRCErrors: 0
              OutputDrops: 0
              PacketsReceived: 0
              InputQueueDrops: 0
              OutputQueueDrops: 0
              InputAborts: 0
              InputDrops: 0
              ThrottledPacketsReceived: 0
              Resets: 0
              FramingErrorsReceived: 0
              BroadcastPacketsSent: 0
              OutputUnderruns: 0
              OutputBufferFailures: 0
              BytesSent: 0
              RuntPacketsReceived: 0
              UnknownProtocolPacketsReceived: 0
              BroadcastPacketsReceived: 0
              AvailabilityFlag: 0
            }
          }
          Speed: 1000000
      ]
    }
  }
Getting TCP message
  Message Type: JSON (2))
  Flags: None
  Length: 8323

This was pretty easy! Here you can find the list of all YANG models you can use to get data from the device.
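
A sensor path, like the one configured earlier, is made of a YANG module name, a colon, and a path into that model. When working with many of them, a tiny helper to split the two parts can be handy. This is just a sketch with a made-up function name; it does no validation against the actual models.

```python
def split_sensor_path(sensor_path):
    """Split '<module>:<a/b/c>' into the module name and a list of path nodes."""
    module, sep, path = sensor_path.partition(":")
    if not sep or not module or not path:
        raise ValueError("expected '<module>:<path>', got %r" % sensor_path)
    return module, [node for node in path.split("/") if node]


if __name__ == "__main__":
    # Sensor path taken from the configuration shown earlier.
    module, nodes = split_sensor_path(
        "Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/"
        "interface/latest/generic-counters"
    )
    print(module)
    print(nodes)
```

The module part tells you which YANG file to look up, and the node list is the walk from the top-level container down to the data you want streamed.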

For example, to get BGP data you’d use a sensor-path like Cisco-IOS-XR-ipv4-bgp-oper:bgp/config-instances/config-instance/config-instance-default-vrf, resulting in a stream of data like this (and much more):

Getting TCP message
  Message Type: GPB_KEY_VALUE (4))
  Flags: None
  Length: 7013
Decoding message

Collection ID:   12
Base Path:       Cisco-IOS-XR-ipv4-bgp-oper:bgp/config-instances/config-instance/config-instance-default-vrf/entity-configurations/entity-configuration
Subscription ID:
Model Version:
Start Time:      Mon Nov 14 06:40:33 2016 (606ms)
Msg Timestamp:   Mon Nov 14 06:40:33 2016 (606ms)
End Time:      Mon Nov 14 06:40:33 2016 (632ms)
Fields: 1
  Displaying first entry only
  <no name>: fields (items 31) Mon Nov 14 06:40:33 2016 (624ms) {
    instance-name: default (string)
    entity-type: 3 (sint32)
    neighbor-address: 1.1.1.1 (string)
    neighbor-address: fields (items 2)  {
      afi: ipv4 (string)
      ipv4-address: 1.1.1.1 (string)
    }
    group-name:  (string)
    configuration-type: neighbor (string)
    address-family-identifier: 23 (uint32)
    af-independent-config: fields (items 120)  {
      remote-as-number-xx: 0 (uint32)
      remote-as-number-yy: 65534 (uint32)
      configured-speaker-id: 0 (uint32)
      tcp-mss: 0 (uint32)
      min-advertisement-interval: 0 (uint32)
      min-advertisement-interval-msecs: 0 (uint32)
      description:  (string)
      ebgp-hop-count: 1 (uint32)
      bmp-servers: 1 (uint32)
      is-ebgp-multihop-bgpmpls-forwarding-disabled: false (string)
      keychain:  (string)
      local-as-number-xx: 0 (uint32)
      local-as-number-yy: 0 (uint32)
      local-as-no-prepend: false (string)
      password:  (string)
      socket-buffer-receive-size: 32768 (uint32)
      bgp-buffer-receive-size: 4096 (uint32)
      socket-buffer-send-size: 24576 (uint32)
      bgp-buffer-send-size: 4096 (uint32)
      adminstrative-shutdown: false (string)
      keepalive-interval: 60 (uint32)
      hold-time-value: 180 (uint32)
      min-acc-hold-time-value: 3 (uint32)
      local-ip-address: fields (items 2)  {
        afi: ipv4 (string)
        ipv4-address: 0.0.0.0 (string)
      }
      msg-log-in-buf-count: 0 (uint32)
      msg-log-out-buf-count: 0 (uint32)
      route-updates-source:  (string)
      dmz-link-bandwidth: 0 (uint32)
      ebgp-recv-dmz: 0 (uint32)
      ebgp-send-dmz-mode: bgp-ebgp-send-dmz-disable (string)
      ttl-security: 0 (uint32)
      suppress4-byte-as: 0 (uint32)
      capability-negotiation-suppressed: 0 (uint32)
      session-open-mode: bgp-tcp-mode-type-either (string)
      bfd: 0 (uint32)
      bfd-mininterval: 0 (uint32)
      bfd-multiplier: 0 (uint32)
      tos-type-info: 0 (uint32)
      tos-value-info: 6 (uint32)
      nsr-disabled: 0 (uint32)
      graceful-restart-disabled: 0 (uint32)
      nbr-restart-time: 120 (uint32)
      nbr-stale-path-time: 360 (uint32)
      nbr-enforce-first-as-status: true (string)
      nbr-cluster-id-type-info: 0 (uint32)
      nbr-cluster-id-info: 0 (uint32)
      ignore-connected-check: false (string)
      internal-vpn-client: false (string)
      addpath-send-capability: 0 (uint32)
      update-error-handling-no-reset: 0 (uint32)
      addpath-receive-capability: 0 (uint32)
      egress-peer-engineering: 0 (uint32)
      prefix-validation-disable: 0 (uint32)
      bestpath-use-origin-as-validity: 0 (uint32)
      prefix-validation-allow-invalid: 0 (uint32)
      prefix-validation-signal-ibgp: 0 (uint32)
      neighbor-update-filter-exists: false (string)
      neighbor-update-filter-message-buffer-count: 0 (uint32)
      neighbor-update-filter-message-buffer-is-non-circular: false (string)
      neighbor-update-filter-logging-disable: false (string)
      neighbor-update-filter-attribute-filter-group-name:  (string)
      graceful-shutdown-exists: 0 (uint32)
      graceful-shutdown-loc-pref: 0 (uint32)
      graceful-shutdown-as-prepends: 0 (uint32)
      graceful-shutdown-activate: 0 (uint32)
      remote-as-info: fields (items 2)  {
        is-item-configured: true (string)
      }
      speaker-id-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      min-advertisement-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      description-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      ebgp-hop-count-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      tcpmss-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      bmp-servers-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      keychain-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      local-as-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      password-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      receive-buffer-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      send-buffer-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      shutdown-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      timers-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      local-address-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      msg-log-in-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      msg-log-out-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      update-source-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      dmz-link-bandwidth-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      ebgp-recv-dmz-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      ebgp-send-dmz-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      ttl-security-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      suppress4-bbyte-as-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      session-open-mode-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      bfd-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      bfd-mininterval-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      bfd-multiplier-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      tos-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      nsr-disabled-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      graceful-restart-disabled-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      nbr-restart-time-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      nbr-stale-path-time-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      nbr-enforce-first-as-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      cluster-id-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      ignore-connected-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      internal-vpn-client-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      addpath-send-capability-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      addpath-receive-capability-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      egress-peer-engineering-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      update-error-handling-no-reset-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      prefix-validation-disable-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      prefix-validation-use-validit-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      prefix-validation-allow-invalid-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      prefix-validation-signal-ibgp-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      neighbor-update-filter-exists-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      neighbor-update-filter-message-buffer-count-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      neighbor-update-filter-syslog-disable-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      neighbor-update-filter-attribute-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      graceful-shutdown-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      graceful-shutdown-loc-pref-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      graceful-shutdown-as-prepends-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      graceful-shutdown-activate-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      capability-negotiation-suppressed-info: fields (items 2)  {
        is-item-configured: false (string)
      }
      local-as-replace-as: false (string)
      local-as-dual-as: false (string)
    }
  }

You can now start to collect and consume this data as you wish. One option is to use the ready-to-use stack built by Cisco.

Pretty cool. It’s a shame we can use this technology only with a limited set of Cisco IOS-XR devices, right? 🙂

Streaming Telemetry on NXOS

It’s true that streaming telemetry is still poorly supported outside IOS-XR, but if you’re geeky enough you can easily obtain some kind of push-based telemetry on NXOS too. This can be done thanks to the on-box Python support on Cisco Nexus 9000 switches.

Cisco Python Package

You can easily access the on-box Python Interpreter by simply typing python. Then you can import the library to interact with the underlying OS.

n9k2# python
Python 2.7.5 (default, Oct  8 2013, 23:59:43)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from cli import *
>>>

Now try to send a command and display its output:

>>> vlans = cli('show vlan')
>>> print vlans 

VLAN Name                             Status    Ports
---- -------------------------------- --------- -------------------------------
1    default                          active    Eth1/1, Eth1/2, Eth1/3, Eth1/4
                                                Eth1/5, Eth1/6, Eth1/7, Eth1/8
                                                Eth1/9, Eth1/10, Eth1/11
                                                Eth1/13, Eth1/14, Eth1/15
                                                Eth1/16, Eth1/17, Eth1/18
                                                Eth1/19, Eth1/20, Eth1/21
                                                Eth1/22, Eth1/23, Eth1/24
                                                Eth1/25, Eth1/26, Eth1/27
                                                Eth1/28, Eth1/29, Eth1/30
                                                Eth1/34, Eth1/35, Eth1/36
                                                Eth1/37, Eth1/38, Eth1/39
                                                Eth1/40, Eth1/41, Eth1/42
                                                Eth1/43, Eth1/44, Eth1/45
                                                Eth1/46, Eth1/47, Eth1/48
                                                Eth2/5, Eth2/6, Eth2/7, Eth2/8
                                                Eth2/9, Eth2/10, Eth2/11
                                                Eth2/12
20   VLAN0020                         active
122  VLAN0122                         active
123  VLAN0123                         active
200  WEB                              act/lshut
201  VLAN0201                         act/lshut

VLAN Type         Vlan-mode
---- -----        ----------
1    enet         CE
20   enet         CE
122  enet         CE
123  enet         CE
200  enet         CE
201  enet         CE

Remote SPAN VLANs
-------------------------------------------------------------------------------

Primary  Secondary  Type             Ports
-------  ---------  ---------------  -------------------------------------------

We can also deal with structured output using the clid function.

>>> vlans = clid('show vlan')
>>> print vlans
{"TABLE_vlanbrief": {"ROW_vlanbrief": [{"vlanshowbr-vlanid": "1", "vlanshowbr-vlanid-utf": "1", "vlanshowbr-vlanname": "default", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "noshutdown", "vlanshowplist-ifidx": "Ethernet1/1-11,Ethernet1/13-30,Ethernet1/34-48,Ethernet2/5-12"}, {"vlanshowbr-vlanid": "20", "vlanshowbr-vlanid-utf": "20", "vlanshowbr-vlanname": "VLAN0020", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "noshutdown"}, {"vlanshowbr-vlanid": "122", "vlanshowbr-vlanid-utf": "122", "vlanshowbr-vlanname": "VLAN0122", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "noshutdown"}, {"vlanshowbr-vlanid": "123", "vlanshowbr-vlanid-utf": "123", "vlanshowbr-vlanname": "VLAN0123", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "noshutdown"}, {"vlanshowbr-vlanid": "200", "vlanshowbr-vlanid-utf": "200", "vlanshowbr-vlanname": "WEB", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "shutdown"}, {"vlanshowbr-vlanid": "201", "vlanshowbr-vlanid-utf": "201", "vlanshowbr-vlanname": "VLAN0201", "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "shutdown"}]}, "TABLE_mtuinfo": {"ROW_mtuinfo": [{"vlanshowinfo-vlanid": "1", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}, {"vlanshowinfo-vlanid": "20", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}, {"vlanshowinfo-vlanid": "122", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}, {"vlanshowinfo-vlanid": "123", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}, {"vlanshowinfo-vlanid": "200", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}, {"vlanshowinfo-vlanid": "201", "vlanshowinfo-media-type": "enet", "vlanshowinfo-vlanmode": "ce-vlan"}]}}

The really nice thing is that we can also run Python scripts!
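Since clid returns a JSON-formatted string, a script will usually parse it before doing anything else. Here is a minimal sketch of that step, assuming the output has the same structure as the one shown above. The sample string is a trimmed copy of that output so the snippet also runs off-box, and vlan_states is a hypothetical helper name, not part of the Cisco library.

```python
import json

# Trimmed copy of the clid('show vlan') output shown above, so the
# snippet also runs off-box. On the switch you would call clid() instead.
sample = '''{"TABLE_vlanbrief": {"ROW_vlanbrief": [
  {"vlanshowbr-vlanid": "1", "vlanshowbr-vlanname": "default",
   "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "noshutdown"},
  {"vlanshowbr-vlanid": "200", "vlanshowbr-vlanname": "WEB",
   "vlanshowbr-vlanstate": "active", "vlanshowbr-shutstate": "shutdown"}
]}}'''


def vlan_states(clid_output):
    """Map each VLAN name to its shut state (hypothetical helper)."""
    rows = json.loads(clid_output)["TABLE_vlanbrief"]["ROW_vlanbrief"]
    # Assumption: a table with a single row may come back as a dict
    # instead of a list, so normalize it first.
    if isinstance(rows, dict):
        rows = [rows]
    return dict(
        (row["vlanshowbr-vlanname"], row["vlanshowbr-shutstate"])
        for row in rows
    )


states = vlan_states(sample)
print(states["WEB"])
```

Working on the parsed dictionary instead of the raw text is what makes the structured output worth having: no regular expressions are needed to find the VLAN state.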

So, the goal is to write a Python script that collects some data and pushes it off-box to an external collector 🙂 This can be done by running the script, storing the data in a file and copying it to the external collector. Doing so, we’d obtain almost the same results as XR Telemetry.

But what if we want to make it even cooler? Let’s say, a script that collects BGP neighbors every 30 seconds and pushes an alert every time the state of any neighbor changes? What if this alert is pushed to Slack instead of a server? Okay, let’s do it 🙂

Enable Slack WebHook

In order to push automatic notifications to Slack, we need to enable a WebHook, which is a way to post messages as HTTP requests with a JSON payload. To configure a WebHook you should go to [your company].slack.com/services/new/incoming-webhook and click on the Add Incoming WebHooks Integration button. Once you have your WebHook link, you can use it with the Python requests library to POST messages on channels.

A simple function to push such messages would look like this:

import json
import os

def message_deliver(text):
    webhook_url = "WEBHOOK_URL"  # replace with your own WebHook URL
    username = "gabriele"
    icon_emoji = ":panda_face:"
    channel = "general"

    # Payload expected by the Slack incoming WebHook
    body = {
        'username': username,
        'icon_emoji': icon_emoji,
        'text': text,
        'channel': channel
    }

    # Serialize the payload as JSON and POST it with curl
    command = '''
    curl -i -k -H "Content-Type: application/x-www-form-urlencoded" -X POST {0} --data '{1}'
    '''.format(webhook_url, json.dumps(body))
    os.system(command)

Calling message_deliver('Hello World!') will push a Hello World! message to my WebHook, resulting in a message posted by the username gabriele on the channel called general. We’ll use this exact function to push our automatic BGP notifications.

Get BGP neighbors

Writing such a script is pretty easy and you can achieve it in many ways. You can find here the one I used for this test. It gathers and stores all BGP neighbors and their states every 30 seconds, compares the current state with the previous one and finally pushes a message to Slack if any change is found.
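The comparison step is the heart of such a script. The following is only a sketch of how it could look, not the script linked above: it assumes the neighbor states have already been collected into plain dictionaries keyed by neighbor address, and diff_neighbor_states as well as the addresses are made up for illustration.

```python
def diff_neighbor_states(previous, current):
    """Return one alert string per neighbor whose state changed."""
    alerts = []
    for neighbor in sorted(set(previous) | set(current)):
        # Neighbors missing from one snapshot get a placeholder state.
        old = previous.get(neighbor, "unknown")
        new = current.get(neighbor, "removed")
        if old != new:
            alerts.append(
                "BGP neighbor {0}: {1} -> {2}".format(neighbor, old, new)
            )
    return alerts


# Hypothetical snapshots taken 30 seconds apart.
previous = {"192.0.2.1": "Established", "192.0.2.2": "Established"}
current = {"192.0.2.1": "Idle", "192.0.2.2": "Established"}

for alert in diff_neighbor_states(previous, current):
    print(alert)
```

Each returned string could then be handed to a delivery function such as message_deliver, so Slack only receives a message when something actually changed.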

NXOS Scheduler

Now, as the last step, we have to configure a scheduler to run the Python script. This is how my scheduler configuration looks.

n9k2# show scheduler config
config terminal
  feature scheduler
  scheduler logfile size 16
end

config terminal
 scheduler job name get_bgp_neighbors
python bootflash:/get_bgp_neighbors.py

end

config terminal
  scheduler schedule name get_bgp_neighbors
    time start 2016:11:01:13:41 repeat 5
    job name get_bgp_neighbors
end

Testing it

I’ve established a BGP session between two N9K switches. Once everything is set, we can simply shut one neighbor down resulting in this.

bot

Nice, right?

Even without XR capabilities, you can build something nice by exploiting Python and Cisco libraries!

Conclusions

I’ve touched on many topics in this post: from Telemetry to Event-Driven automation, to WebHooks and on-box Python. I think all of these can be useful when it comes to managing or testing a network or service, and I hope this will help you get started with such tools and technologies 🙂

 


Network Automation Survey

In order to better understand how people and companies are using these technologies on their networks, a group of professionals has put together a survey to get a bigger picture of how we automate (or, at least, attempt to) our networks. Here it is if you want to participate.

So you want to start with Network Automation…

As we all know, things in networking are changing rapidly and so is the skillset needed by those who manage networks.

I’m definitely not an expert (I’m far from it) but lately many people have asked me how to start with Network Automation. Now I’ve just received a message from a LinkedIn friend asking for something like this and I suddenly realized this would be a nice topic to write about 🙂

In this post I’ll briefly summarize what you need to start your journey (or, at least, what I used to start mine).

Python

Even if we may still be far from deploying Software Defined Networks everywhere, software-managed networks are a real thing and Python is at the core of them.

Python is a pretty well-known programming language which is loved for its ease of learning. I studied C and Java at university and hated both of them, while I simply love Python 🙂

There are plenty of available resources for those who wish to study for free and the following is a little list of stuff I’ve personally used:

  • Codecademy: really nice course to start your journey. It lets you approach the language in a very practical way. However, it does not dig very deep into the language.
  • Coursera: the website is full of Python courses, from the basics to more advanced topics. I’ve attended a couple of them and I really appreciated them.
  • How to think like a computer scientist: this was the very first Python resource I ever used. It is a very well written book covering all the foundations in a pretty deep and clear way.
  • Dive Into Python: this is a more advanced book for those of you who are hungry for knowledge.

I’m sure someone else’s list would look completely different since there are so many resources out there. So just pick one of them and start 🙂

APIs

Networking vendors have developed specialized APIs to help engineers interact with their devices. I’ll introduce some of them within this section.

Juniper PyEZ

Juniper is working hard on automation and has developed the PyEZ library, supported by almost every JunOS device. Once you’ve installed all the requirements, it’s really easy to start talking to your remote device:


>>> from jnpr.junos import Device
>>> from jnpr.junos.utils.config import Config
>>> from pprint import pprint
>>> my_device = Device(host='172.16.1.1', user='gabriele', password='gabriele')
>>> my_device.open()
Device(172.16.1.1)
>>> pprint(my_device.facts)
{'2RE': False,
'HOME': '/var/home/gabriele',
'domain': None,
'fqdn': 'Router1',
'hostname': 'Router1',
'ifd_style': 'CLASSIC',
'model': 'olive',
'personality': 'UNKNOWN',
'serialnumber': '',
'switch_style': 'NONE',
'vc_capable': False,
'version': '12.1R1.9',
'version_info': junos.version_info(major=(12, 1), type=R, minor=1, build=9)}

Here are some other practical references about it:

Cisco

Cisco is working toward enabling automation in today’s networks as well (of course).

Cisco NX-API

If you want to talk to Cisco NX-OS devices, you can use their NX-API.

Jason Edelman did awesome work on both introducing NX-API here and developing another API called pycsco that simplifies working with Cisco NX-OS switches that support NX-API.

Here you can also find the latest reference from Cisco itself: NX-API book.

Cisco IOS-XR

Elisa Jasinska developed an API to help interact with Cisco devices running IOS-XR. It’s called pyIOSXR.

Arista EOS

If you want to use Arista EOS, you can pick eAPI. You can also find some references here and on Packet Pushers.

Netmiko

Another super useful tool is Netmiko. It’s not a specialized API but instead it’s used to send commands to network devices and retrieve their output. That’s a great resource for those who want to start with network automation and I’ve extensively used it in pretty much every project I’ve done.

In addition, the list of supported devices is huge:

Cisco IOS
Cisco IOS-XE
Cisco ASA
Cisco NX-OS
Cisco IOS-XR
Cisco WLC (limited testing)
Arista vEOS
HP ProCurve
HP Comware (limited testing)
Juniper Junos
Brocade VDX (limited testing)
F5 LTM (experimental)
Huawei (limited testing)
A10 (limited testing)
Avaya ERS (limited testing)
Avaya VSP (limited testing)
OVS (experimental)
Enterasys (experimental)
Extreme (experimental)
Fortinet (experimental)
Alcatel-Lucent SR-OS (experimental)

Netmiko’s been developed by Kirk Byers and he also wrote an amazing post on how to use it. Thank you Kirk 🙂
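To give an idea of what "send commands and retrieve their output" looks like, here is a minimal sketch built on Netmiko’s ConnectHandler. The helper names build_device and run_command are mine, the address and credentials are placeholders, and the actual connection is left commented out because it needs a reachable device.

```python
def build_device(host, username, password, device_type="cisco_ios"):
    """Build the keyword arguments passed to Netmiko's ConnectHandler."""
    return {
        "device_type": device_type,
        "host": host,
        "username": username,
        "password": password,
    }


def run_command(device, command):
    """Open an SSH session, run one command and return its raw output."""
    # Imported here so build_device() works without Netmiko installed.
    from netmiko import ConnectHandler

    connection = ConnectHandler(**device)
    try:
        return connection.send_command(command)
    finally:
        # Always close the SSH session, even if the command fails.
        connection.disconnect()


device = build_device("172.16.1.1", "gabriele", "gabriele")
# output = run_command(device, "show ip interface brief")  # needs a live device
```

The output comes back as plain text, so any parsing is still up to you. That is the trade-off compared with the vendor APIs listed above.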

NAPALM

This name shouldn’t sound new to you! 😉

In fact, I’ve extensively talked about NAPALM (Network Automation and Programmability Abstraction Layer with Multivendor support) in my previous post.

If you didn’t read it, repent, go read it and come back here 🙂

Automation tools: Ansible

Like NAPALM, this shouldn’t sound new! I’ve talked about Ansible in two of my previous posts (here and here).

Anyway, those posts could be difficult to understand if you’re completely new. In this case, don’t worry, you definitely can be guided by Kirk and Jason (these two guys are awesome!):

  • Kirk wrote a very nice guide introducing Ansible playbooks and templates, splitting it into 3 parts (Part1, Part2, Part3). This is what I used to write my first blog post on Ansible (see just above).
  • Jason extensively wrote about Ansible basics (this post is precious for those who just started to use the tool) and other more advanced applications as well (here, here and here).

Have I already said that these two are awesome? 🙂

Fast-paced Courses

Last but not least, if you really want to boost your automation skills and have enough resources (or you’re lucky enough to receive support from your company) you may want to attend live classes on Network Automation.

Jason Edelman is delivering an awesome Network Programming and Automation course all around the world. It covers everything you need to move from novice, writing your first “Hello World!” in Python, to NetOps Ninja developing a working network automation Flask app.

During the course you’ll not just sit there listening to Jason, but you’ll go through 10+ hours of labs too. I’ve reviewed the whole lab section and it took me almost the full 10 hours to complete it (and I was not new to most of the topics!). So I think you can expect to spend at least 2 extra hours on this.

Summarizing: 4 days digging deep into Python and Network Automation including some cool tools like Ansible + 12 hours of practical labs + guidance from Jason Edelman, one of the top experts in the field = How cool is this?  😀


Here is the course schedule for the first part of next year. Don’t miss it! 😉

Conclusion

These are just some of the available resources I used to start my journey with NetOps. There are tons more out there if you want to start practicing Network Automation or simply improve your coding skills, so you have no excuses! Choose what you want and just start! 🙂


 

Adding Cisco IOS support to NAPALM (Network Automation and Programmability Abstraction Layer with Multivendor support)

If you’re passionate about networking I’m pretty sure you’ve already heard about NAPALM (no, I’m not talking about the flammable liquid used in warfare 🙂 ). Anyway, if you haven’t yet, you’re going to discover a very nice project for network automation.


 

NAPALM

What is it? Let’s quote its documentation page:

NAPALM (Network Automation and Programmability Abstraction Layer with Multivendor support) is a Python library that implements a set of functions to interact with different network device Operating Systems using a unified API.

It’s a project developed by David Barroso and Elisa Jasinska (thank you guys 😀 ), owned by Spotify and, as the quote says itself, it is used to interact with different hardware networking vendors. Basically, it works like an API on top of other APIs, adding another level of abstraction.

Lately, many vendors have developed APIs to make it easier to interact with their equipment. For example, most JunOS devices support Juniper PyEZ, and Cisco’s Nexus switches offer NX-API.

This way, if I want to interact with a Juniper device I can use PyEZ, whereas I’d use NX-API if I wish to talk with a Nexus switch, and the same goes for other specialized APIs.

What NAPALM does is hide this layer, unifying the way we access a networking device regardless of who built it.

napalm

Until a few days ago, NAPALM supported the following network operating systems:

  • Arista EOS
  • Juniper JunOS
  • Cisco IOS-XR
  • Cisco NX-OS
  • Fortinet FortiOS
  • IBM OS

This is possible thanks to the introduction of the NetworkDriver concept. Every time we want to interact with a device, we only need to specify which OS we are going to talk to and NAPALM will select the correct NetworkDriver (basically, a library with all the functions related to that OS).

>>> from napalm import get_network_driver
>>> get_network_driver('eos')
>>>
>>> get_network_driver('iosxr')
>>>
>>> get_network_driver('junos')
>>>
>>> get_network_driver('fortios')

NAPALM will still use third-party APIs but this will be transparent to the user.
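The idea of picking a driver by OS name can be illustrated with a toy registry. This is a sketch of the pattern only, not NAPALM’s actual implementation: BaseDriver, the two driver classes, DRIVERS and get_driver are hypothetical names, and the get_facts bodies just return canned data where a real driver would call the vendor API.

```python
class BaseDriver(object):
    """Common interface every driver in this sketch implements."""

    def __init__(self, hostname, username, password):
        self.hostname = hostname
        self.username = username
        self.password = password

    def get_facts(self):
        raise NotImplementedError


class EOSDriver(BaseDriver):
    def get_facts(self):
        # A real driver would query the vendor API here.
        return {"vendor": "Arista", "hostname": self.hostname}


class JunOSDriver(BaseDriver):
    def get_facts(self):
        return {"vendor": "Juniper", "hostname": self.hostname}


# Illustrative registry mapping an OS name to its driver class.
DRIVERS = {"eos": EOSDriver, "junos": JunOSDriver}


def get_driver(name):
    """Return the driver class registered for an OS name."""
    try:
        return DRIVERS[name.lower()]
    except KeyError:
        raise ValueError("unsupported OS: {0}".format(name))


driver = get_driver("junos")
device = driver("172.16.1.1", "gabriele", "gabriele")
print(device.get_facts()["vendor"])
```

The calling code never changes when the OS does: only the string passed to the lookup is different, which is the whole point of the abstraction.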

Cisco IOS support

Unlike NX-OS, Cisco IOS has no API support. Therefore, it’s not that straightforward to obtain structured data from it and at first NAPALM didn’t support it. So I thought this could be a nice spot to play 🙂

I forked the main repository and started to code.

IOSDriver

Since no native API exists, I had to use something more general: netmiko. This is a pretty sweet Python library making it super easy to connect and interact with networking devices. Once a command is sent, netmiko can give me back the output and then I can start to filter and parse it.

The module is composed of 12 methods:

  • open(): opens the connection with the remote device. It is the first method to be used.
  • close(): closes the connection with the remote device.
  • load_merge_candidate(filename, config): loads a candidate configuration from a textfile or a configuration string. If both are passed, filename is picked. At this point no configuration is pushed to the device yet.
  • compare_config(): simply shows the list of commands proposed by the load_merge_candidate method and ready to be executed if committed.
  • discard_config(): what if we notice some errors after the compare_config? We can discard the proposed changes using this method.
  • commit_config(): pushes the configuration from load_merge_candidate and saves the configuration.
  • rollback(): we can roll back the committed changes using this method. This simply adds the no keyword to commands (anyway it’s smart enough to recognize parent/child commands)
  • get_lldp_neighbors(): extracts LLDP neighbors information from the device.
  • get_facts(): extracts information like uptime, vendor, os_version, serial_number, model, hostname, fqdn, and interface_list.
  • get_interfaces(): extracts information about interfaces including status and speed.
  • get_bgp_neighbors(): extracts information about BGP neighbors.
  • get_interfaces_counters(): extracts information about counters.
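The rollback idea described in the list above (prefix commands with no, while respecting parent/child commands) can be sketched as follows. This is an illustration under my own assumptions, not the driver’s actual code: child commands are detected by leading whitespace, a parent line is kept unchanged so its children are negated in the right context, and negate and build_rollback are hypothetical names.

```python
def negate(command):
    """Toggle the 'no' keyword on a single command."""
    if command.startswith("no "):
        return command[3:]
    return "no " + command


def build_rollback(config_text):
    """Build rollback commands for an indented config snippet."""
    # Skip blank lines and the '!' separators found in IOS configs.
    lines = [
        line for line in config_text.splitlines()
        if line.strip() and line.strip() != "!"
    ]
    rollback = []
    for index, line in enumerate(lines):
        is_child = line.startswith(" ")
        next_is_child = (
            index + 1 < len(lines) and lines[index + 1].startswith(" ")
        )
        if is_child:
            # Child command: negate it, keeping it under its parent.
            rollback.append(" " + negate(line.strip()))
        elif next_is_child:
            # Parent with children: keep it to re-enter the context.
            rollback.append(line.strip())
        else:
            # Standalone top-level command: negate it directly.
            rollback.append(negate(line.strip()))
    return rollback


candidate = "interface Loopback0\n description test\nip domain-name example.com"
print(build_rollback(candidate))
```

Other policies are possible, for example negating the parent itself to remove a whole block, so treat the parent handling here as one choice among several.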

Demo

Now let’s see an example of how to use NAPALM.

The first thing we’re gonna do is to connect to our remote device specifying the OS type, username, password and IP address.


from napalm import get_network_driver

# Select the IOS driver, then connect using IP address, username and password
driver = get_network_driver('ios')
device = driver('172.16.1.1', 'gabriele', 'gabriele')
device.open()

Once we’re done, if everything went fine we’ll see the Python interactive shell confirming the SSH session has been established.

python1

Now we can start to interact with our device. Let’s ask for some facts, for example:

python2

As we can see, it’s a Cisco 3640 device whose IOS version is 12.4(16). It has 3 interfaces and its uptime value is set to 9 minutes.

Cool, right? 😀

Let’s try some other methods:

python3

Earlier we discovered 3 interfaces exist in our device. Now we’ve just obtained some specific information about them using the get_interfaces() as well as BGP neighbors information thanks to get_bgp_neighbors().

Let’s see how NAPALM can help us with configuration management. Imagine we want to implement OSPF on our network. Just to keep it simple, we want to push the following configuration:


router ospf 1
!
network 0.0.0.0 0.0.0.0 area 1
!

Using load_merge_candidate(filename='new_good.conf'), we’ll load our configuration (assuming new_good.conf is the text file containing the above config). Then, we can see what changes would be implemented using compare_config(). At that point, we can decide to either commit or discard these changes.

Here we’ve suddenly realized our OSPF area should be 0 and not 1. So we decide to discard the candidate configuration with discard_config(). We can use compare_config() to confirm that every proposed change has been discarded.

python4

Since our last compare_config() doesn’t show anything, it means everything went fine.

Anyway, we still want to implement OSPF, so we fix the configuration and give it another try. This time we want to use a configuration string instead of a .conf file. We do this with load_merge_candidate(config='router ospf 1\nnetwork 0.0.0.0 0.0.0.0 area 0'). Then, if we are happy with it, we can commit.

python5

...and this is what the router’s config looks like:

python6

At this point, if we want to rollback the change we can simply use rollback().
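The load, compare, then commit-or-discard sequence walked through in this demo can be wrapped in a small helper. This is a sketch rather than part of NAPALM: apply_candidate and its approve callback are hypothetical, while load_merge_candidate(), compare_config(), commit_config() (NAPALM’s name for the commit step) and discard_config() are the driver methods themselves.

```python
def apply_candidate(device, config, approve):
    """Load a candidate config, inspect the diff, then commit or discard.

    device  -- an already opened NAPALM driver instance
    config  -- configuration string to merge
    approve -- callable that receives the diff and returns True to commit
    """
    device.load_merge_candidate(config=config)
    diff = device.compare_config()
    # Commit only when there is something to change and it was approved.
    if diff and approve(diff):
        device.commit_config()
        return True
    device.discard_config()
    return False


# Example usage against a real device (requires an open session):
# apply_candidate(device,
#                 'router ospf 1\nnetwork 0.0.0.0 0.0.0.0 area 0',
#                 lambda diff: 'area 0' in diff)
```

Passing the approval logic as a callback keeps the helper usable both interactively (ask the operator) and in scripts (check the diff programmatically).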

Conclusion

It’s been a lot of fun to work on this patch and I’m happy to announce that NAPALM now supports IOS too, since my pull request has been merged into the main repo 🙂

NAPALM is a really cool project, it’s popular among the NetOps community and it’s also been presented at a NANOG conference. Here you can find the video from the awesome guys who actually designed and implemented it. Enjoy 🙂


 

P.S.

If you’re interested in NAPALM or Network Automation in general you should definitely join the Slack channel at network.toCode(). There you’ll find lots of cool guys discussing fancy stuff on networking 🙂

 

Project:Them04 – Bruce DeWald

Here we are again with another interview! 😀

Today, our guest is Bruce, a young and brilliant Network Engineer from the US. Despite his youth he already has a lot of experience in the field and I’m sure you’ll enjoy his contribution and his tips. Let’s go 🙂


Gabriele: Hey Bruce, welcome! Let’s start simple: Who are you? Where are you from? How old are you?

Bruce: Hello! My name is Bruce DeWald and I’m from a tiny town in Pennsylvania, US. I’m 21 and currently a senior at RIT.

G: What did you study and where?

B: I’m currently a senior studying Applied Networking and Systems Administration at Rochester Institute of Technology. I’ll be graduating in May 2016 with my undergraduate degree.

G: Based on your LinkedIn page, you have a lot of experience in IT. Can you tell us about your past experiences?

B: I’ve had three jobs related to network engineering. My first and current position is at my department’s on-campus computer labs. I built and manage our current server and network infrastructure. This job has given me a lot of experience, as I’ve been able to learn a lot of new things and do them myself, since this isn’t a production infrastructure that a company relies on.

My second experience (Summer 2014) was a Network Engineering Internship at Harris Corporation in Melbourne, Florida. I worked more with network management and monitoring tools here ensuring the infrastructure remained up. I also got to recreate some topologies in the lab to troubleshoot issues we were having which was a lot of fun.

My most recent experience was this past summer interning at Cisco Meraki which I’ll elaborate on further on.


G: As you said, your most recent experience has been at Cisco Meraki. Tell me about it. How did you apply?

B: I actually applied at RIT’s career fair in October 2014. We have a career fair twice a year where over 250 companies come to recruit our students.

G: How was the recruiting process organized? 

B: At the career fair itself I was asked some very brief questions (about 5 or so) to supplement my resume and get a baseline of my knowledge. A few weeks later I had two one-hour Skype interviews. About a month after that I had a final one-hour Skype interview with the manager. Finally, an additional month went by until I heard back that I had received an offer (December 2014).

G: Is there anything about interviews you can share (without breaking any NDA)?

B: All three of my interviews were extremely technical with maybe about 10% of the time being spent on behavioral questions. They involved going through network troubleshooting scenarios that my interviewer would draw out on the board. These interviews were quite challenging but were also a lot of fun!

G: How was your internship organized? Have you done any kind of training?

B: My internship lasted eleven weeks over the summer. The first two weeks had training throughout and we eased ourselves into the job. The rest of the internship we were essentially on our own. By that I mean no one was holding our hand and we were doing real work but everyone around the entire office was extremely friendly and helpful as we had questions.

G: What was your role?

B: My official title was “Network Support Engineer Intern.” This involved handling customer cases revolving around the Meraki product. Some days I would close lots of cases because the customers would be asking easy questions. Other days I would spend an entire day on 2 or 3 cases. These cases might have required reading through lots of documentation, asking more senior employees for their expertise, or recreating the issue in our lab. Any time a case required further investigation we would recreate the customer topology in our lab. This both helped me learn the product/networking better, but also allowed us to see exactly what the customer was seeing.


G: What do you like the most about Cisco Meraki and your job as an intern?

B: The culture and atmosphere around the office made it a really enjoyable place to work in. There’s so many places to relax and so many great people around the office. Several times at lunch we would be joined by people we had never met before (often from another department) and have a great conversation over lunch.

G: How does Cisco Meraki “treat” its interns? What kind of “perks” did you receive?

B: We were treated just like full time employees. About the only difference between us and the full time employees was that we didn’t get health insurance. Meraki even provides awesome housing for us during the internship. Which is great because the cost of rent in San Francisco is ridiculous! Besides the awesome office, other perks we got were free breakfast & lunch, with occasional dinners and micro kitchens all around the office with tons of healthy (and unhealthy) snacks. We also got other random perks like free massages one day. There was also a gym in the building that we could use to work out before/after work.

G: In your opinion, what are the skills that a Cisco Meraki intern candidate should have?

B: In my opinion, the primary skill one needs at this job and any other in this field, is the ability to adapt and learn quickly. This field is always changing and companies are always shifting which technologies they use. The ability to learn something new quickly is a great skill to have. As for this particular internship, network troubleshooting skills were essential as that is pretty much what the internship entailed.

G: At the end of the internship, do they give you the opportunity to convert it into a full-time position?

B: Yes, if interested, interns can go through a few additional interviews to review their internship and skills to see if they get a full time offer.

G: How did you prepare yourself for the interviews?

B: I didn’t do any particular training for the interviews as I felt that my past experience and education had adequately prepared me for the internship. I am very happy with the education I have received and lucky to have gotten the experience that I have.

G: Can you share with us any advice for someone who wants to start a career in network engineering?

B: I think someone who’s interested in network engineering should play around with the technology as much as possible. I think it’s great to teach yourself new things that you are interested in. Occasionally I go into my department labs and just play with something new to teach myself. I also Co-Founded a networking club at RIT called NextHop where we try to teach students things that aren’t covered by our classes to better prepare students for a career in networking or systems administration.

I also personally believe that certs are a great way to both prove that you know something and teach yourself something new. I currently have my CCNA R&S and am pursuing others.

G: As always, one last question. What are your plans for the future?

B: My plans for the future are to obtain a full time Network Engineering position in the San Francisco Bay Area starting May 2016.

G: With such experience, knowledge and personality I’m sure you will have no problems with landing a wonderful position in any field you’d want 🙂

Many many thanks for your time, Bruce. We wish you all the best for everything 🙂

B: Thanks. It’s been a pleasure!


Honestly, I’m pretty amazed by Bruce’s story. He is so young and yet so skilled!

If you want to follow in Bruce’s footsteps, this is Cisco Meraki’s career page, where you can find all the available positions. Don’t be shy, apply 🙂


Network Automation Project – Part 1

Hi everybody, today I want to share with you a project I’ve worked on over the past few days. I was inspired by the Facebook NetEng team, who created the #netengcode Facebook group after a presentation at NANOG where they gave a tutorial on network automation and, specifically, on how they developed an auto-remediation tool. Besides all the really interesting content, the tool was developed using DB interaction in Python. Since I’d never done anything like this, I decided to get some practice by coding a little project.

As I said, I’ve been inspired by their work and code, and my project contains some lines of theirs (limited to a few lines in db.py). Anyway, let’s start 🙂

[The whole project code can be found on GitHub, here]

SCENARIO

[Image: scenario]

As usual, the network is built using GNS3 with VirtualBox.

The devices are minimally configured with the essentials to provide basic connectivity and SSH access to the user gabriele.

THE TOOL

The project directory is composed of several files:


gabriele@gabriele-VirtualBox:~/Desktop/project$ ls
bgp.py network_automation_project.py db.py push.py devices.txt interfaces.py


The network_automation_project.py file is the main file that we will execute. A high-level description of the project is:

  • The user defines a list of devices to be added to the network using the devices.txt text file.
  • The “python network_automation_project.py [-u username -p password -f textfile]” command is executed.
  • The tool asks for any missing parameters and then parses the devices.txt data into a sqlite database.
  • The tool asks the user for some facts about the network (interface names and addresses, and BGP information).
  • The tool stores this additional information in the db.
  • The tool generates the proper configuration based on the data provided.
  • The tool pushes the configuration to the devices.
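
To make the "ask for any missing parameters" step concrete, here is a minimal sketch of how the -u, -p and -f options could be handled with argparse. It is not the project's actual code: the function names parse_args and fill_missing are hypothetical, and the prompts are only illustrative.

```python
import argparse
import getpass


def parse_args(argv=None):
    """Parse the -u/-p/-f options; all three may be left out."""
    parser = argparse.ArgumentParser(description="Network automation project")
    parser.add_argument("-u", dest="username", help="SSH username")
    parser.add_argument("-p", dest="password", help="SSH password")
    parser.add_argument("-f", dest="textfile", help="path to the devices file")
    return parser.parse_args(argv)


def fill_missing(args, ask=input, ask_secret=getpass.getpass):
    """Prompt only for the values that were not given as options."""
    if not args.username:
        args.username = ask("Username: ")
    if not args.password:
        # getpass hides the password while it is typed
        args.password = ask_secret("Password: ")
    if not args.textfile:
        args.textfile = ask("Devices file: ")
    return args


# Example: username and file are given, so only the password is requested.
# options = fill_missing(parse_args(["-u", "gabriele", "-f", "devices.txt"]))
```

Passing the prompt functions as parameters is just a convenience here: it lets the logic be exercised without a real terminal.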

At this stage the project is executed as a dry run, but I’d like to add some multithreading functionality.

THE DB

As I said, the db is a sqlite one. I’ve defined 3 tables: devices, interfaces and neighbors.


# One row per device, keyed by its router ID.
DEVICES_SCHEMA = ('''
    CREATE TABLE devices (
        router_id TEXT PRIMARY KEY,
        hostname TEXT,
        vendor TEXT,
        ports INT,
        as_number INT,
        ip_address TEXT,
        configured DEFAULT 0)
''')
# BGP neighbors of each router, stored in a single text field.
NEIGHBORS_SCHEMA = ('''
    CREATE TABLE neighbors (
        router_id TEXT PRIMARY KEY,
        neighbors_list TEXT)
''')
# Interfaces of each router, stored in a single text field.
INTERFACES_SCHEMA = ('''
    CREATE TABLE interfaces (
        router_id TEXT PRIMARY KEY,
        interface TEXT)
''')
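
As a rough illustration of how a schema like the devices one can be used from Python, the sketch below creates the table with the standard sqlite3 module and inserts one device. The helper names build_db and add_device are hypothetical, and the use of IF NOT EXISTS and INSERT OR REPLACE is an assumption of this sketch (one possible way to avoid errors when the data is already stored), not necessarily what db.py does.

```python
import sqlite3

# Same columns as the devices table; IF NOT EXISTS is an addition of this sketch.
DEVICES_SCHEMA = """
CREATE TABLE IF NOT EXISTS devices (
    router_id TEXT PRIMARY KEY,
    hostname TEXT,
    vendor TEXT,
    ports INT,
    as_number INT,
    ip_address TEXT,
    configured DEFAULT 0)
"""


def build_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(DEVICES_SCHEMA)
    return conn


def add_device(conn, router_id, hostname, vendor, ports, as_number, ip_address):
    """Insert one device; values are bound with ? placeholders."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO devices "
            "(router_id, hostname, vendor, ports, as_number, ip_address) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (router_id, hostname, vendor, ports, as_number, ip_address),
        )


conn = build_db()
add_device(conn, "1.1.1.1", "Router1", "Cisco", 12, 10, "172.16.1.1")
row = conn.execute(
    "SELECT hostname, ports, configured FROM devices WHERE router_id = ?",
    ("1.1.1.1",),
).fetchone()
print(row)  # ('Router1', 12, 0)
```

With INSERT OR REPLACE, running the insert a second time for the same router_id overwrites the row instead of raising a primary-key error.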


RUN IT!

Let’s see what happens if we run it.


..$ python network_automation_project.py -u gabriele -p projectme10 -f devices.txt
BUILDING DB
{
 "1.1.1.1": {
 "as_number": "10", 
 "hostname": "Router1", 
 "ip_address": "172.16.1.1", 
 "ports_number": "12", 
 "vendor": "Cisco"
 }, 
 "2.2.2.2": {
 "as_number": "20", 
 "hostname": "Router2", 
 "ip_address": "172.16.2.2", 
 "ports_number": "36", 
 "vendor": "Cisco"
 }
}


As you can see, I’ve executed the tool with the options -u -p -f so that it runs directly without first asking for any missing information.

The devices.txt file appears as follows:


1.1.1.1;Router1;Cisco;12;10;172.16.1.1;
2.2.2.2;Router2;Cisco;36;20;172.16.2.2;

The format is router_id;hostname;vendor;number_of_ports;as_number;ip_address_to_ssh;

After all the parameters are parsed, the tool starts building and populating the database.
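
The parsing of the semicolon-separated lines can be sketched as follows. This is only an illustration under the format described above: parse_devices and FIELDS are hypothetical names, and the values are kept as strings, as in the output printed by the tool.

```python
import json

# Keys used for the fields that follow the router ID on each line.
FIELDS = ("hostname", "vendor", "ports_number", "as_number", "ip_address")


def parse_devices(lines):
    """Turn 'router_id;hostname;vendor;ports;as;ip;' lines into a dict."""
    devices = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Strip the trailing ';' so split() does not return an empty last item.
        parts = line.rstrip(";").split(";")
        if len(parts) != len(FIELDS) + 1:
            raise ValueError("malformed line: %r" % line)
        router_id, values = parts[0], parts[1:]
        devices[router_id] = dict(zip(FIELDS, values))
    return devices


sample = [
    "1.1.1.1;Router1;Cisco;12;10;172.16.1.1;",
    "2.2.2.2;Router2;Cisco;36;20;172.16.2.2;",
]
print(json.dumps(parse_devices(sample), indent=1, sort_keys=True))
```

In a real run the lines would come from the file itself, for example by passing the open file object to parse_devices.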

If everything goes well, then we can start to configure our ports.

[Image: ports]

The tool informs me I can configure up to 12 ports and asks me how many ports I wish to configure right now. I choose 2 ports and then I have to enter the information required. To make sure the user inserts a valid IP address, I’ve used the netaddr Python module. All the data is then stored inside the db.
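
The validation loop can be sketched like this. Note that the example swaps in the standard-library ipaddress module instead of netaddr, so that it runs without extra packages; the function name ask_ip is hypothetical and the idea is the same: keep asking until the input parses as an address.

```python
import ipaddress


def ask_ip(prompt, ask=input):
    """Keep asking until the answer parses as an IPv4 or IPv6 address."""
    while True:
        answer = ask(prompt).strip()
        try:
            # ip_address() raises ValueError on anything that is not a valid IP
            return str(ipaddress.ip_address(answer))
        except ValueError:
            print("'%s' is not a valid IP address, try again." % answer)


# Example (interactive):
# address = ask_ip("Interface IP address: ")
```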

Then we enter the BGP configuration phase.

[Image: bgp]

After this, the tool starts to generate the proper configuration commands based on the data stored in the db. Then the configuration is pushed to the devices using the netmiko Python library and, lastly, the whole running configuration is saved in a text file named hostname_configuration.txt.
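
A simplified sketch of these two steps is shown below. It is not the project's code: build_commands and push are hypothetical names, the interface name and addresses in the example are made up, and the generated lines are plain IOS-style commands. The push function relies on netmiko's ConnectHandler, send_config_set and send_command, and imports netmiko only when it is called, so the command generation can be tried without the library or a device.

```python
def build_commands(hostname, as_number, interfaces, neighbors):
    """Return IOS-style configuration lines from the stored facts.

    interfaces: list of (name, ip, netmask) tuples
    neighbors:  list of (neighbor_ip, remote_as) tuples
    """
    commands = ["hostname %s" % hostname]
    for name, ip, mask in interfaces:
        commands += [
            "interface %s" % name,
            "ip address %s %s" % (ip, mask),
            "no shutdown",
        ]
    commands.append("router bgp %s" % as_number)
    for neighbor_ip, remote_as in neighbors:
        commands.append("neighbor %s remote-as %s" % (neighbor_ip, remote_as))
    return commands


def push(ip_address, username, password, commands, hostname):
    """Send the commands over SSH and save the resulting running config."""
    from netmiko import ConnectHandler  # third-party, imported lazily

    conn = ConnectHandler(
        device_type="cisco_ios", host=ip_address,
        username=username, password=password,
    )
    try:
        conn.send_config_set(commands)
        running = conn.send_command("show running-config")
    finally:
        conn.disconnect()
    with open("%s_configuration.txt" % hostname, "w") as handle:
        handle.write(running)


# Illustrative values only.
commands = build_commands(
    "Router1", 10,
    interfaces=[("FastEthernet0/0", "10.0.0.1", "255.255.255.0")],
    neighbors=[("10.0.0.2", 20)],
)
print("\n".join(commands))
```

Keeping generation and push separate means the generated commands can be printed and reviewed before anything is sent to a device.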

[Image: conf]

Now it’s time to configure the other device as well.

After that, let’s verify everything has gone fine by issuing a “show bgp” command on Router2:

[Image: show_bgp]

It looks good. R2 is now named Router2 and all the BGP information is received correctly. In addition to this, we can also examine the two newly created text files: Router1_configuration.txt and Router2_configuration.txt.

GOING FURTHER

The project is not finished yet and I’d like to add some more functionality and fix some problems as well. Specifically, I’d like to:

  • Improve DB interaction, since errors sometimes occur when data is already stored.
  • Add some multivendor support. As you can see from the actual code, I’ve included some functions to generate Juniper configurations, but I left them blank.
  • Add some multithreading behaviour.
  • Add some “services” like traffic draining.

Finding someone else who wishes to contribute would be great. So if you’re interested, or if you have some hints, please reach out 😀

I hope you will find this post interesting. Any comments are more than welcome 🙂

Again, the whole project code can be found here


P.S.

I’ve started contributing to other network automation projects on GitHub. I think this is a great way to learn and understand things better. So far I’ve only pushed a few lines of code to the netcli_to_dict project, which is “a collection of scripts that provide samples of how to iterate through the output of an executed command and return a dictionary, or a list of dictionaries, based on information could be useful in various automation projects.”

P.P.S

A few days ago I went to Taormina with my girlfriend. We had a great time together, dining and walking through the beautiful streets of that little city. Then I saw the “I have a dream…” wall, where everyone can write something. So I decided to take the chalk and write “NETENG INTERN 2016”. Who knows... 🙂


P.P.P.S

For those wondering, I’m not an a**hole and so I’ve written something romantic for my girlfriend before writing the above thing 😀