Here are some quick notes from the two days of NETCONF interoperability testing we did this Saturday and Sunday, leading up to the (ongoing as this is written) IETF 85 in Atlanta. The NETCONF Protocol RFC is being advanced towards a place on the venerable Full Standard RFCs list. The process of moving a standards track RFC through the two maturity levels (“Proposed Standard” to “Internet Standard”) is described in RFC 6410.

The first level, Proposed Standard, is described as:

A Proposed Standard specification is generally stable, has resolved known design choices, is believed to be well-understood, has received significant community review, and appears to enjoy enough community interest to be considered valuable.

…and the Internet Standards level is described as:

A specification for which significant implementation and successful operational experience has been obtained may be elevated to the Internet Standard level. An Internet Standard (which may simply be referred to as a Standard) is characterized by a high degree of technical maturity and by a generally held belief that the specified protocol or service provides significant benefit to the Internet community.

NETCONF is currently a Proposed Standard, and the interoperability testing is part of trying to meet the criteria to move it to Internet Standard. It is also a generally useful experience for the participating implementations to make sure that things work nicely with each other. And it is quite an enjoyable event, with interesting discussions and the occasional finding of deviations from specification (a.k.a. bugs).

We had five server implementations, including submissions from us at Tail-f, Juniper, YUMA, and the libnetconf project. We had ten (!) client implementations including submissions from Tail-f, MG-Soft, and Seguesoft. The more detailed report from this event will be published in the IETF Journal, but in summary I can at least say that we had good interoperability in that we had all clients talking to all servers in a useful manner. We saw surprisingly few catastrophic failures, combined with a healthy amount of discussion around the interpretation of some of the more exotic parts of the standard. Exactly as we had hoped.

Now, off to the Interface to the Routing System (IRS) WG-forming BoF. Will be interesting to see if the requirements on the eventual protocol will match NETCONF enough that it can be reused. More to come on this surely.


I’m in the process of writing up a more comprehensive post on the structure and content of the OF-CONFIG YANG module, but I’ve found a particularly interesting faux pas that I thought worth describing in this separate post.

YANG includes a couple of features that use various types of references from one part of the data model to another. The most common example of this is the leafref built-in type and its associated path statement, which allows the module developer to reference a particular leaf instance in the data tree. Another type of reference is used in the must statement and its associated XPath argument string, which is used to formally declare a constraint on valid data.

A good example of this comes from the draft interface configuration YANG module that is currently a working group item under consideration for standards track. The core interface model is very straightforward:

+--rw interfaces
   +--rw interface [name]
      +--rw name                        string
      +--rw description?                string
      +--rw type                        ianaift:iana-if-type
      +--rw location?                   string
      +--rw enabled?                    boolean
      +--ro if-index                    int32
      +--rw mtu?                        uint32
      +--rw link-up-down-trap-enable?   enumeration

The location leaf is an optional string. According to the specification text:

It is optional in the data model, but if the type represents a physical interface, it is mandatory.

So, in the example ethernet module in the Appendix we see the following construct:

container ethernet {
  must "../if:location" {
    description
      "An ethernet interface must specify the physical location
       of the ethernet hardware.";
  }
...

This means that an ethernet container is only valid if the interface configuration also includes a location leaf. Note well that the must expression refers to a configurable parameter (the location string is read-write). Now, turning to the OF-CONFIG 1.1 specification, we see the following must statement in the global resources container:

container resources {
  description
    "A lists containing all resources of the OpenFlow Capable Switch.";
  list port {
    must "features/current/rate != 'other' or " +
         "(count(current-rate) = 1 and count(max-rate) = 1 and " +
         " current-rate > 0 and max-rate > 0)" {
...

If we follow the reference to the features/current/rate leaf in the must expression above, we see that it’s a config false node since it’s located in the current container:

container features {
  container current {
    uses openflow-port-current-features-grouping;
    config false;
    description
      "The features (rates, duplex, etc.) of the port that are currently in use.";
  }
...

This means that the state of the rate leaf is outside the control of the administrator and relies on operational state. This is worth thinking twice about. The configuration simply becomes invalid if the operational state changes. What is useful system behavior in that case? Remove that part of the configuration? And what if that part of the configuration is referenced from other parts of the system? Consider the can of worms wide open.
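To make the failure mode concrete, here is a minimal Python sketch of how that must expression behaves when only the operational rate changes. The function name, the dictionary layout and the rate value "10Gb" are assumptions made up for illustration, and the XPath semantics are simplified (a missing leaf is simply treated as absent). It is not how any OF-CONFIG implementation actually evaluates the constraint.

```python
def port_must_holds(port_data, operational_rate):
    """Simplified reading of the must expression on 'list port'.

    port_data        -- hypothetical dict of leaves stored on the port entry
    operational_rate -- value of features/current/rate (config false)
    """
    # First branch of the 'or': any rate other than 'other' satisfies it.
    if operational_rate != "other":
        return True
    # Second branch: both rate leaves must exist and be positive.
    current_rate = port_data.get("current-rate")
    max_rate = port_data.get("max-rate")
    return (
        current_rate is not None
        and max_rate is not None
        and current_rate > 0
        and max_rate > 0
    )


# The stored port entry never changes between the two checks.
port_entry = {"name": "port-1"}

valid_before = port_must_holds(port_entry, "10Gb")
valid_after = port_must_holds(port_entry, "other")

print(valid_before, valid_after)
```

Nothing the administrator controls differs between the two calls, yet the verdict on the same stored data flips once the device reports a different rate.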

In summary, referencing operational state from configuration constraints is not what you want, and this should probably be rethought in OF-CONFIG 1.1++.

The ONF YANG in MG-Soft's YANG Designer

It’s with some excitement I see that the ONF has publicly released the OF-CONFIG specification to precious little fanfare. The head-spinner (in a positive sense) for me is that it includes a nice little RFC 2119-style MUST statement making NETCONF mandatory:

[…] OF-CONFIG1.0 requires that devices supporting OF-CONFIG 1.0 MUST implement NETCONF protocol as the transport. This in turn implies as specified by NETCONF specification that OpenFlow Capable Switches supporting OF-CONFIG1.0 must implement SSH as a transport protocol

See, this is the exact type of use of NETCONF that I think will make all the difference. This means that pure OpenFlow switches actually don’t even really need a traditional CLI or Web UI. They will probably need some sort of very constrained CLI for seed configuration. After that it could be NETCONF only, including a NETCONF-based CLI or a Web UI with a NETCONF backend. It’s now programmatic, see.

Going back to the excellent (you should really, really read it) Problem Statement for the Automated Configuration of Large IP Networks draft makes me think that what is left is a reasonably well designed (secure!) “call home” protocol. This would mean that we could get rid of the ping sweeps that are still, to this date, the most sophisticated tool that the network management world has in terms of discovering new network elements.

The Reverse Secure Shell (Reverse SSH) draft seems to be a good conversation starter. Given an open source implementation of that, with some support from a vendor or two, I’m sure we’ll be able to reanimate the sleeping Secure Shell (secsh) working group and off we go.

Here’s a quick post on some of the things that make adapting RESTful principles to the area of configuring network elements somewhat challenging. I’ve had a number of conversations of the tire-kicking kind (“What if we used REST instead”) in this direction and these are the kinds of conversation-holes that I find myself invariably unable to dig myself out of.

What I’m talking about below is the idea of adding RESTful interfaces to the network elements themselves. This is in contrast to providing a RESTful interface to the management system and then using whatever protocol or scripting means to make the configuration happen in the actual network elements.

REST relies heavily on the concept of hypermedia objects (remember the HATEOAS principle!). Hypermedia objects are kind of self-contained and self-sufficient in that they are not expected to be mapped into any structured context (e.g. a tree or a chronological order). Think about what that would mean for designing hypermedia objects to represent the parameters required for any common router or switch configuration task (e.g. BGP peering configuration or MPLS VPN setup). How could we design useful hypermedia objects that:

  • Expose the useful set of configuration parameters available in the actual protocol implementations (vs an over-simplified model for very specific use cases)
  • Capture the relations among the common features in a network element (e.g. it doesn’t make sense to enable OSPF on interfaces that do not have an IP address)

A RESTful approach makes most sense when the designer has a very large degree of freedom to design the objects as they see fit and is not burdened with much implementation detail. The number of configuration parameters and the number of feature interactions in even a simple router or switch is such that this task will be very challenging.

This makes me think that RESTful interfaces may make a lot of sense on the element management layer where designers have a larger degree of freedom to make up application specific models and not be forced to reflect the underlying implementation. I believe it’s called abstraction.

On the other hand, I would love to be proven wrong through some ambitious attempt at breaking down some corner of a router or switch configuration into reasonable hypermedia objects that could be directly accessed by a management system. Or a REST CLI?

IETF is back in Europe for its 83rd meeting starting Monday and I’m going. These are pretty interesting times for standards-defining organizations in general, and the venerable Task Force that created everything you use to read this post (except the browser markup language) is no different in that regard.

With the networking industry in an unprecedented state of hype-driven flux it is no surprise that many SDOs are thinking about how to stay relevant. How this thinking manifests itself depends on the cultural roots of the organization. Some try to “pivot” (or whatever it’s called these days) away from being marginalized and spend marketing (!) resources on things like membership brainstorming sessions, general cloudification and virtualization activities, or simply raising the cost of exhibitor booths during events. Very few of these activities lead to better, more well-timed and more useful standards.

I’ve always felt that IETF is the FreeBSD of the standards industry. It may not be first to publish specifications for emerging challenges, but when working group drafts eventually pass through the eye of the needle (which is IESG approval) they usually provide specifications with enough detail and quality to actually make it into implementations and achieve interoperability.

Much of the criticism of the IETF comes from a lack of understanding of the principles and processes it is built on. Ironically enough this is in humorous symmetry with the classic “I haven’t read the document we’re discussing, BUT…” comment that is a very well known faux pas inside the IETF. The processes are informally introduced in the introductory Tao of the IETF document, and the RFC process is described in detail in RFC 2026. Personally I think it’s best summarized in the following quote from David Clark:

We reject kings, presidents and voting. We believe in rough consensus and running code

So, for Paris I’m particularly looking forward to:

  • The dinners and hallway discussions, which are always where the most exciting conversations and napkin sessions are held
  • The NETCONF and YANG contributors meeting where REST access to resources described in YANG is on the agenda
  • I’m guessing the “Overlay Networking (NVO3)” and the “Infrastructure-to-application information exposure (i2aex)” birds of a feather (BoF) sessions will be crowd pleasers and will provide ample opportunity for some good grey-beard action

Reach out (@cmoberg or calle@tail-f.com) if you want to (mail|meet|tweet)up at Le Palais des Congres de Paris starting in a couple of hours.

This post is more of a reference note around some YANG modeling specifics. It has come up a couple of times so I thought I’d follow the DRY principle and document once and reference forever. I think the casual reader will enjoy the read (especially if interested in modeling stuff) but there are XPath predicates and NETCONF error tags ahead so be warned!

The YANG language provides a useful concept in the leafref built-in type. It is a way to reference a particular leaf instance in a data store. This is one of the semantic validation constructs that really adds tangible benefits above and beyond what’s available in e.g. SMI.

A common use case is to use a leafref to reference the network interface or IP address used for a particular purpose. Since leafrefs refer to instances, not nodes in the model, it also implies that there is valid configuration in place for that particular interface or IP address. Here’s an example from the CCAP YANG module:

leaf slot {
    type leafref {
        path "/ccap/chassis/slot/slot-number";
    }
}

The slot leaf references an instance of a slot in a CCAP chassis identified by its slot number, which is also the key in the slot list. This means that the referenced slot configuration (e.g. slot “1”) must exist in the configuration for the referencing leaf to be valid. Dangling pointers are not allowed. This is in contrast with e.g. how ifIndex is used in SNMP, where there is no validation of whether the pointed-to object really exists.

Validation is performed by the server side and there is a specific error message defined for situations where a leafref would refer to a non-existing instance (from RFC 6020, Section 13.6):

error-tag: data-missing
error-app-tag: instance-required
error-path: Path to the leafref leaf.
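As a rough illustration of what that server-side check amounts to, here is a small Python sketch. The datastore is modeled as nested dictionaries, and both the function name and the error-path string "/example/slot" are hypothetical. Only the error-tag and error-app-tag values come from the RFC text quoted above.

```python
def check_slot_leafref(datastore, slot_value, leaf_path):
    """Return None if the referenced slot instance exists, else an error dict.

    datastore  -- nested dicts standing in for the configuration (assumption)
    slot_value -- value of the referencing 'slot' leaf
    leaf_path  -- path to the leafref leaf, reported back in error-path
    """
    slots = datastore.get("ccap", {}).get("chassis", {}).get("slot", [])
    existing = {entry["slot-number"] for entry in slots}
    if slot_value in existing:
        return None  # the reference points at a configured instance
    # Dangling reference: report it with the tags defined for leafrefs.
    return {
        "error-tag": "data-missing",
        "error-app-tag": "instance-required",
        "error-path": leaf_path,
    }


datastore = {"ccap": {"chassis": {"slot": [{"slot-number": 1},
                                           {"slot-number": 2}]}}}

ok = check_slot_leafref(datastore, 1, "/example/slot")
dangling = check_slot_leafref(datastore, 7, "/example/slot")

print(ok)
print(dangling)
```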

This post wouldn’t be any fun if it didn’t introduce a challenge though; so here goes. The CCAP model introduced above obviously goes deeper than the ‘slot’ concept. Slots are expected to contain line cards of various types and there are ports sitting on the line card. A system can contain several slots. Each slot contains a single line card (that may or may not be present) and a line card hosts several ports.

So, in order to reference an instance of, say, a port, we need to traverse the following structure with our reference:

  • A list of slot/line-card pairs, each of which contains a
  • list of ports

An example of such a reference could be:

<slot>2</slot>
<upstream-rf-port>4</upstream-rf-port>

Note that we need to specify two values (slot 2 and port 4) to uniquely identify a port instance. Since there is a 1:1 mapping between slot and line-card (a slot contains either 0 or 1 line-cards), there is no need for a key reference to the line-card.

A more mundane example of this is how certain router vendors identify interfaces in the CLI:

  • Juniper: ge-0/0/1 for gigabit ethernet port 1 in PIC 0 in slot 0
  • Cisco: FastEthernet0/0 for fast ethernet port 0 in slot 0

The numeric components of these interface names are examples of instance identifiers, meaning that they reference keys (one per list traversed) to uniquely identify a specific leaf. Now, remember the syntax of the path statement above:

path "/ccap/chassis/slot/slot-number";

The XPath syntax in the path statements does not include a way to provide more than a single key (in this case slot-number), so as we go further down the path we would end up with something like:

path "/ccap/chassis/slot/line-card/rf-line-card/upstream-rf-port/port-number";

The path statement above looks deceptively simple, and it is both. It identifies port-number by value only, meaning that if several RF line cards have a port with the same number (say “0”), we’ll get a match for all of them, which is not what we are looking for in a leafref. The qualifying parent keys are missing. Fixing this requires a pretty neat trick using XPath predicates in leafrefs.

Leafref path statements (described in RFC 6020, Section 9.9.2) use a subset of the XPath abbreviated syntax. This opens up for the use of XPath predicates, which can be used to find the specific node(s) that contain a specific value. The trick that we are going to look at uses the current() function inside a predicate to pin keys while traversing trees with multiple keys. The current() function is an addition to the XPath core function library in YANG’s application of XPath (borrowed from XSLT) and is described like this in RFC 6020, Section 6.4.1:

The function library is the core function library defined in [XPATH], and a function “current()” that returns a node set with the initial context node.

Now, looking at the following extended snippet from the CCAP model:

grouping upstream-physical-channel-reference {
    leaf slot {
        type leafref {
            path "/ccap/chassis/slot/slot-number";
        }
    }
    leaf upstream-rf-port {
        type leafref {
            path "/ccap/chassis/slot[slot-number=current()/../slot]/line-card/rf-line-card/upstream-rf-port/port-number";
        }
    }
}

The upstream-physical-channel-reference grouping contains two leaves. The slot leaf is a leafref with a path referring to a slot. The upstream-rf-port leaf is where the fun starts. By using the current() function in a predicate we reference the list member that has the same slot-number value as the sibling slot leafref.
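The effect of the predicate is easier to see with some data. The Python sketch below models the chassis as nested dictionaries (the sample slots and ports are made up, and the helper names are hypothetical) and compares the candidate port numbers with and without pinning the slot key. It only mimics the selection logic and is not an XPath evaluator.

```python
def rf_ports(slot_entry):
    """Port numbers found under line-card/rf-line-card/upstream-rf-port."""
    ports = slot_entry["line-card"]["rf-line-card"]["upstream-rf-port"]
    return [p["port-number"] for p in ports]


def ports_without_predicate(chassis):
    # Mirrors the path with no predicate: every port in every slot matches.
    return [n for slot in chassis["slot"] for n in rf_ports(slot)]


def ports_with_predicate(chassis, sibling_slot):
    # Mirrors slot[slot-number=current()/../slot]: only the slot whose key
    # equals the sibling 'slot' leaf is traversed.
    return [n for slot in chassis["slot"]
            if slot["slot-number"] == sibling_slot
            for n in rf_ports(slot)]


# Made-up sample data: two slots, each with an RF line card.
chassis = {"slot": [
    {"slot-number": 1, "line-card": {"rf-line-card": {
        "upstream-rf-port": [{"port-number": 0}, {"port-number": 1}]}}},
    {"slot-number": 2, "line-card": {"rf-line-card": {
        "upstream-rf-port": [{"port-number": 0}, {"port-number": 4}]}}},
]}

all_ports = ports_without_predicate(chassis)
slot1_ports = ports_with_predicate(chassis, 1)
slot2_ports = ports_with_predicate(chassis, 2)

# Port 4 only exists in slot 2, so (slot 1, port 4) must be rejected.
loose_check = 4 in all_ports      # accepted without the predicate
strict_check = 4 in slot1_ports   # rejected once the slot key is pinned

print(all_ports, slot1_ports, slot2_ports, loose_check, strict_check)
```

Without the predicate, a reference to port 4 in slot 1 would slip through because some other slot happens to have a port 4.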

So, the combination of the slot and upstream-rf-port leafs uniquely identify an upstream port. And by collecting them into a grouping with a label upstream-physical-channel-reference, we’ve made it reusable. So anytime we need to refer to a physical channel interface we can instantiate it using uses. Except in notifications, but that’s for a separate blog post.

This, by the way, can’t be done in XML Schema. Which is unexpected. All evidence to the contrary would be much appreciated.

Having worked with Cisco CLIs through many maintenance windows in the early years of my career, I’ve come to not-like them as much as any other reasonably ambitious network engineer. Back in the late 90s I experienced a real eye-opener of a situation where a team of network engineers threatened to resign if the suggestion to introduce another CLI (read: JUNOS) in the network was made real. I can’t remember exactly, but I have this feeling that they wore their leather jackets during the meeting where things became agitated.

As I’ve been slowly immersing myself in network management over the recent years, I’ve had countless discussions with various makes of networking pundits on this particular topic. It’s just interesting to see how clever engineers go to extremes to maintain a form of status quo that they in any other context would understand to be a problem.

The most interesting take on this issue was conveyed to me by a wise man with a lot of experience directly from the source of the problem in this example. His point was that in order to understand the proliferation of the CLI, and the lengths to which some engineers go to defend it, one would benefit from understanding the concept of rent seeking. To quote Wikipedia:

In economics, rent-seeking is an attempt to obtain economic rent by manipulating the social or political environment in which economic activities occur, rather than by creating new wealth, for example, spending money on political lobbying in order to be given a share of wealth that has already been created. A famous example of rent-seeking is the limiting of access to lucrative occupations, as by medieval guilds or modern state certifications and licensures.

I could literally hear the coin drop in my own head as I read the last sentence of the above quote. Of course, the leather-jacketed engineers simply worked to maintain monopoly-like privileges and limit free competition on innovative improvements around working with router configuration. In hindsight it makes perfect sense (as always) and had I understood this at the time then I’m sure the ensuing screaming match would have turned more constructive faster.

When we eventually introduced M40s in the network one of the leather-jacketed guys told me in confidence that he wanted at least some of his IOS CLI years back now that he had been exposed to the JUNOS equivalent. At that point he understood the value of the CLI equivalent of a free enterprise approach. The good part of applying known problem definitions to your observations is that it usually comes with a set of solutions and rent-seeking is not different in that sense.

I’ll leave it as an exercise to the reader to find the most easily translated approach to breaking out of situations like this and would love to hear if people have experience with this pattern from other parts of our beloved networking industry.