Tuesday, 15 May 2018
CHAIR: While we're starting, please remember this is the last possible time to submit your name for the Programme Committee. If you are interested in joining the Programme Committee, please e‑mail pc [at] ripe [dot] net with a short biography and then you'll have the opportunity to present at the beginning of the next session, a little bit about who you are and why you want to join. I think it's super great, you get to bug all of your friends and colleagues to submit great presentations and you really help make the Plenary happen.
Also, very important: please remember to rate the talks, because that is how we get feedback on how to make this conference as good as it can be for everyone in the community.
CHAIR: With that, let's get started on the Plenary. We have Richard Sheehan from Facebook, who is going to be talking to us about building a commodity‑based network.
RICHARD SHEEHAN: Thank you very much, I work at Facebook, based in Dublin. I used to work in the same team as Louis where we used to monitor the network and find out all the stuff that didn't work, now I work on the FBOSS team.
One of the things I have noticed is that when we have given talks like this in the past, we are really good at telling you all the amazing stuff you can do when you have built this ‑‑
I shall endeavour to speak more slowly and clearly. So, we give these talks, and unfortunately we assume that you have a massive network and basically infinite money and you can do whatever you like. That's great for a very small number of companies in the world that have these massive networks, but most people aren't in that situation. The idea of this talk is to give you an idea of how you get started moving to a commodity network without going all the way in and building your own data centres and OS and all of that stuff.
Firstly, does anyone know what this device is? And I apologise for the grainy photo but it's really really old. Anyone ever seen one of these before? I was kind of hoping somebody would. This is called an nCUBE 3000. This was when I started my career nearly 20 years ago. I worked for an ISP in Ireland, actually basically the France Telecom of Ireland, although much much worse than France Telecom; I notice the French people laugh a little bit more than anyone else when I say that. We bought one of these. We wanted to do video on demand. The cable companies had started offering Internet and the phone company that I worked for was really concerned that we would have no business, so they said okay, what we will do is, we'll offer TV over DSL, and we want to do video on demand, that's the cool thing that the cable companies can't do. They bought one of these. It ran Oracle Video Server. It ran a proprietary OS called Transit which was based on Plan 9. I got to work on this. This was a pretty awesome thing to get to play with and we paid something like 2 million dollars for it. It never saw an active day of production because we abandoned the project. This started in 1998 when there was lots of money in the ISP industry; by the time we were looking at launching it, it was kind of '99/2000, when things went really badly in the Internet industry. We ended up trading this back in to the company that we bought it from for, like, 20,000 dollars after never having used it, and they went bankrupt shortly after that. There is a lesson to be learned here, and we'll move on to what that is. What is totally cool, we did this.
Many years later, Mr. Zuckerberg decided, I need to build a system to allow people to share cat videos with all of their friends all of the time. By this time the idea of buying this massive super expensive proprietary video system, well, probably it would be crazy to do this. What you do, what everybody else does, you buy a tonne of Linux servers because they are way cheaper, adaptable and you can use them for anything, which is exactly what Facebook did. We said we need to connect them together, to a network. What we did is what everybody else does, you go to Cisco and Juniper, you say give us the biggest switches you sell, you take a forklift to give them all the money to pay for these switches and you connect everything together. And then after a year or two, you realise, as your network gets bigger and bigger and bigger, you are giving these guys huge amounts of money and the accountants start to go, well, guys, like this is a lot of money; is this stuff super reliable? Are we not having any outages? Because if we are paying this kind of money, we want to make sure this stuff is really good. When you look at it, you say, okay, super expensive, you are at the kind of top end of the market where you are basically a beta tester, so stuff is breaking all of the time and everyone is unhappy about it, so you decide, well, we'll build our own commodity network. How hard can that be? We looked throughout the world to see what was the best technology we could use to design a network and it turns out that 1950s telephony knowledge is exactly what we need. What we wanted was basically a big non‑blocking switch and what telephone companies had been really good at doing for a long time is just that. We decided to use Clos topologies in order to build this network and this diagram you will see countless times in Facebook presentations of what we call our fabric network.
We have racks, cluster switches and spine switches and on the edge we have edge switches to connect outside of the data centre. This is one of our data centre networks. We came up with a design, again it's a bunch of blocks and a bunch of coloured cables and lines, it's not that complicated. We decided we'd build our own hardware because why not? It seemed like fun. We came up with the Wedge platform, 6‑pack, and in 2015 we came up with Wedge 100; in 2016, we had Backpack, then Wedge 100S, and actually we're working on the next platform right now, it is ‑‑ we have the latest chip, finally, and we're actually waiting ‑‑ what we are working on at the moment is trying to bring up a couple of lab devices of the next generation of our hardware. Then, we've got this lovely hardware, we actually need something to run on top of this, something that will switch packets on it, so we wrote FBOSS, the Facebook operating system, effectively, for a network device.
And to do that, we decided to hire a software team. This is just the software team, it's not the hardware team and there are other supporting teams like the production engineering team. Then we wonder why nobody else does this. And the main reason, I think, is, these people all live in California, pretty sure they all drive Ferraris and Teslas, and having a team of this size plus a bunch of other people turns out to be super expensive on a year‑on‑year basis and it doesn't make any sense for anyone who doesn't have a huge network.
So, with that in mind, we thought, well, okay, that's great for us, it's great for the Amazons and Googles of this world who are going to have massive networks and lots of money to invest in the technology and the teams to build this stuff. For everybody else out there who is in a much smaller operation, yes, you are spending a lot on network hardware but you have to justify the cost of having a team to support this stuff, which is expensive, and the economics of it have to make sense.
So, how do you do this? Once again, you have a data centre. It's probably not going to be this big or have this many cables in it, but you can do this too. You can build a cheap commodity data centre network. Data centre networks are simple when you compare them to Internet facing networks. There are some rules of the game which I'm going to say right off the bat because it's important. Firstly, your network has to be a layer 3 only network. It's much easier and it's going to scale much much better.
Secondly, what you want to use is ideally Open Source, industry standard protocols for routing: BGP, OSPF. The last rule of the game is that capacity is going to be so much cheaper that you can kind of throw bandwidth at the problem and make a lot of other things go away.
So, I'm going to reiterate this Layer2 point because I hope I don't need to convince anyone here that large Layer2 domains are the devil and they will cause you immense pain for the rest of your life. I don't think any network engineer says, do you know what I'm going to do? I'm going to design a massive Layer2 network because I think this is fun to maintain. What usually happens is the software guys go, hey, we need Layer2 for some reason that actually isn't a good reason and eventually coerce the network people into building this and then it becomes a disaster for the rest of its existence. So, I would encourage all of you to push back as hard as you possibly can against anyone who says we need a large Layer2 domain, particularly that spans over a long distance. It's not going to work, it's going to cause you an immense amount of misery, there is a lot of software protocols that will abstract this from you and not turn it into a network problem. People make these short cuts in software and they push this complexity into the network and then the network gets blamed for all the problems that this ultimately results in.
So, the idea of building a commodity network is first you start with a cheap switch. Now for the purposes of this talk, obviously, the legal people said, you can't put up a Juniper or Arista or Cisco switch because then we'll get sued. So all the pictures of my switches are going to be FBOSS switches. Please, when you see these, assume that I'm actually saying you are just going to use the cheap Ciscos, the cheap Junipers, cheap anything. We are looking at a Broadcom‑based switch, 32 by 40 gig ports. The per‑port cost for 40 gig is about $300 when you average it out. To give you an idea: before I worked at Facebook, I worked at another large company that decided to build a commodity network. When we looked at the math at the time, to buy what we were buying from an external vendor we were paying $3,000 per 10 gig port. When we did the math on commodity, it was $100. So even with the cost of everything else, that saving, over time, was huge.
And that's why this makes a lot of sense.
So, the idea I want to give you here as well: 32 ports isn't very much. You can't connect a lot to a single 32 port switch. The idea is we're going to build a virtual chassis because, realistically, what you have inside a chassis is a backplane and a bunch of line cards. We are stuffing the switches in a rack. At the top layer we are going to take two of the switches and that's going to form our backplane, and then the lower four switches will become our line cards; we'll take 16 ports from each lower switch and connect them to the upper switches. This gives us full non‑blocking capacity from lower to upper layer and suddenly we have got a virtual chassis, we have got 64 ports that also have redundancy built in. If I lose one of the lower switches I lose 25% of my uplink capacity, and likewise upwards. One of the problems with the chassis is that you buy it and you put it in a single rack and it's immediately a single point of failure. You might have dual power to it, but if somebody walks into it or knocks the rack over or water spills on it, it's gone.
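The port arithmetic for this virtual chassis can be checked with a few lines of Python. This is only a sketch under stated assumptions: the class name and its fields are made up for illustration, every switch is assumed to have the same port count and speed, and each lower switch is assumed to split its ports evenly between rack‑facing and upward‑facing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualChassis:
    """Illustrative two-layer arrangement of identical fixed switches."""

    ports_per_switch: int  # e.g. 32
    upper_switches: int    # the "backplane" layer
    lower_switches: int    # the "line card" layer

    @property
    def uplinks_per_lower(self) -> int:
        # Assumption: half of each lower switch's ports face upwards.
        return self.ports_per_switch // 2

    @property
    def rack_facing_ports(self) -> int:
        # The remaining ports on every lower switch face the racks.
        down = self.ports_per_switch - self.uplinks_per_lower
        return self.lower_switches * down

    @property
    def uplinks_fit_upper_layer(self) -> bool:
        # All uplinks must terminate on a port of the upper layer.
        total_uplinks = self.lower_switches * self.uplinks_per_lower
        return total_uplinks <= self.upper_switches * self.ports_per_switch


small = VirtualChassis(ports_per_switch=32, upper_switches=2, lower_switches=4)
print(small.rack_facing_ports, small.uplinks_fit_upper_layer)
```

With the numbers from the talk (32‑port switches, two on top, four below) the rack‑facing port count comes out at 64, and the 64 uplinks exactly fill the 64 ports of the upper layer.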
When you use cheap switches, you can just pick some of them up and put them somewhere else and run the cabling. While the cabling is more complicated you can get physical diversity really, really easy, which is a very powerful thing. You can have them in different buildings, different rooms.
Now we want to attach racks, so we have got 16 useful down‑facing ports on the line card switches. This gives us 16 racks with 4 by 40 gig uplinks. 160 gig of uplink per rack, that's a lot of capacity. With the failure of a switch we only go down to 120 gig, so we still have lots of uplink capacity in the failure of a switch.
Eventually we are going to try and connect a 17th rack. That's fine. We just double up and really to do this we just need three more switches on each rack, we now go to four on the backplane, 8 on the line cards, now we have doubled the amount of capacity we have by just adding six cheap extra switches. There is also a huge advantage here: as soon as the spine goes from two to four devices, the failure of any device is only 25% of my capacity. Which is really nice. Now my failure domain becomes much much smaller for an individual device. Those of you who know, when you lose a big device your customer impact is magnified by the number of ports you have.
So, how many of these switches can you fit into a rack? We have these things called hackathons at Facebook, where we decide to do something crazy, and one of these was: can we fit 42 switches into a 42U rack? We only needed to fit in 40. We did this: 40 devices by 32 ports gave us 1,280 ‑‑ these are 100‑gig switches ‑‑ 640 non over‑subscribed rack facing ports is what we can get out of a single rack. Admittedly, there are a lot of cables in there, and that's where the hard part is. We got this up and running and now we have started to deploy these as a big network chassis in our networks.
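The failure arithmetic in the last two paragraphs is simple enough to write down. The following is a minimal sketch, not anything from FBOSS; the function names are hypothetical, and it assumes traffic is spread evenly over identical links and devices.

```python
def remaining_uplink_gbps(uplinks: int, gbps_per_link: int, failed: int = 0) -> int:
    """Uplink capacity a rack keeps after `failed` of its uplinks go away."""
    if not 0 <= failed <= uplinks:
        raise ValueError("failed must be between 0 and the number of uplinks")
    return (uplinks - failed) * gbps_per_link


def share_lost_per_device(devices_in_layer: int) -> float:
    """Fraction of a layer's capacity lost when one of its devices fails,
    assuming load is spread evenly across the layer."""
    if devices_in_layer < 1:
        raise ValueError("a layer needs at least one device")
    return 1 / devices_in_layer


print(remaining_uplink_gbps(4, 40))            # healthy rack: 4 x 40 gig
print(remaining_uplink_gbps(4, 40, failed=1))  # one line card switch gone
print(share_lost_per_device(2), share_lost_per_device(4))
```

A rack with 4 by 40 gig uplinks has 160 gig and keeps 120 gig after losing one line card switch; going from two to four devices in the upper layer halves the share lost per failure, from 50% to 25%.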
So, I have made this sound like it's pretty good and it's pretty easy. But there are some big differences between your fixed switch and your big expensive chassis. And I need to cover what these are and why actually they are not super important.
The first thing is you have got smaller routing table sizes, smaller TCAM, much smaller buffers, and you don't get any of the fancy chassis features. But that's actually okay because when you are building a data centre network, you don't need these things. Firstly, ideally you are going to use route aggregation, so you don't need massive routing tables. You are going to aggregate your routes inside the data centre and announce summaries out. ACLs: generally speaking, if you have a big data centre network and there are ACLs on every single device, your life is going to be a misery because you are going to block customer traffic all the time.
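To make the route aggregation point concrete, here is a small sketch using Python's standard `ipaddress` module. The addressing plan (one /24 per rack out of 10.0.0.0/20) is purely an assumption for illustration and is not from the talk.

```python
import ipaddress
from typing import Iterable, List


def aggregate(prefixes: Iterable[str]) -> List[str]:
    """Collapse adjacent or overlapping prefixes into the fewest summaries."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


# Hypothetical plan: 16 racks, one /24 each.
rack_prefixes = [f"10.0.{rack}.0/24" for rack in range(16)]
print(aggregate(rack_prefixes))
```

Sixteen contiguous rack prefixes collapse into a single /20, so the devices further up only need to carry one route for the whole block rather than sixteen.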
And so what I have seen done in the two large companies I have worked for is that inside the data centre network, inside the network, there is no ACLs. They are either pushed onto the servers or they are pushed onto the edge of the data centre network.
The last thing: we have talked about buffers quite a lot today. Our general approach to this is just throw more switches at the problem and not use buffers. As a person who has worked a lot in software, I am very anti‑buffer because it adds latency to customer packets in a way that's hard to surface to the customer. So, I hate hidden ‑‑ and customers will always come to you, particularly when they manage services well, and say, hey, how come it's taking so long? You go: buffering, there is buffering happening somewhere in the network, I don't know where it is, I have no easy way to measure it, but I know stuff is slower now than it was before and I can't tell you why and I can't even easily pinpoint where in my network it's occurring. What we generally do, because the bandwidth is so much cheaper than buying big expensive devices, is simply operate at a level where you are not buffering at all.
In‑service software upgrade. Who has ever done an in‑service software upgrade and not felt like this? Have not thought that, hey, this code path that gets exercised once maybe every 12 to 24 months in my organisation is going to go flawlessly the first time on a live production network. So much as I would love to trust this stuff and much as I would love to trust external vendors, this stuff really frightens me because all the code paths that are not used are never going to be super reliable.
The nice part about a commodity network is you never do this. You just take stuff out of service; you have got so many switches that when you want to do upgrades you take one switch at a time, do the upgrade, put the new configuration on it if required, and bring it back into service. So it's a much easier model to actually do upgrades because you have complete redundancy in your network; you are never passing traffic through a live device when you want to change it, so you can do any kind of upgrade, you can upgrade your kernel, your policies, you can do it in a very controlled manner, and the way you deal with failure is the same way you deal with upgrades. Failure, you take something out of service, or even a suspected failure, you take out one device, which is a very small unit of capacity, you figure out what's wrong and you put it back in service. You want to do an upgrade, it's exactly the same process and it's simple.
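The one‑switch‑at‑a‑time process described above can be sketched as a short loop. This is a hedged illustration only: `drain`, `upgrade`, `is_healthy` and `undrain` are hypothetical callables that stand in for whatever your own automation provides, and nothing here is FBOSS tooling.

```python
from typing import Callable, Iterable, List


def rolling_upgrade(
    switches: Iterable[str],
    drain: Callable[[str], None],
    upgrade: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    undrain: Callable[[str], None],
) -> List[str]:
    """Upgrade switches one at a time; never touch a device carrying traffic."""
    upgraded = []
    for name in switches:
        drain(name)    # take the switch out of service first
        upgrade(name)  # new image, kernel or config, applied while drained
        if not is_healthy(name):
            # Leave the suspect switch drained and stop, so at most one
            # small unit of capacity is out while someone investigates.
            raise RuntimeError(f"{name} failed its health check; left drained")
        undrain(name)  # bring it back into service
        upgraded.append(name)
    return upgraded
```

The same drain and undrain steps serve for a suspected failure, which is the point made above: upgrades and failures share one simple process.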
So, some of the rules of the game. First of all, when you operate a large number of devices, just like in the server world, your network devices have to have a standard configuration. I know in the hipster world of today, everyone wants this lovely small batch, vegan ‑‑ I look at my colleague there ‑‑ yeah, handcrafted, small batch configs. No, you don't want this; what you want is a config that is entirely predictable and as simple as possible. You want to store those configs in source control and, most importantly, you will need to spend some time on automation. The most important thing to do is have a safe and easy way, both in your configuration and automation, to take switches in and out of service. If you design it right this becomes easy and it's an immensely powerful tool, because being able to take stuff out of capacity when you even suspect something might be wrong, and not have to worry about that having an impact that's going to make things worse, is incredibly freeing in a lot of ways.
Monitoring, obviously we're a big believer in that. We want to collect all of the metrics that your network device gives you. Then, ideally, you want to not trust all of those metrics all of the time and actually ping all of your network devices and make sure they are actually there.
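As a rough sketch of the "don't fully trust the metrics" idea: compare what each device reports about itself with an independent reachability check. The function name and the `probe` callable are assumptions for illustration; in practice `probe` might wrap an ICMP ping or a TCP connect, which is left out here so the example stays self‑contained.

```python
from typing import Callable, Dict, List


def devices_to_investigate(
    metrics_say_up: Dict[str, bool],
    probe: Callable[[str], bool],
) -> List[str]:
    """Return devices whose own metrics claim health but which do not
    answer an independent probe."""
    suspects = []
    for device in sorted(metrics_say_up):
        if metrics_say_up[device] and not probe(device):
            suspects.append(device)
    return suspects


# Example with a fake probe in which one switch does not answer.
reported = {"rsw1": True, "rsw2": True, "rsw3": False}
print(devices_to_investigate(reported, probe=lambda name: name != "rsw2"))
```

A device that shows up in this list is a natural candidate for being taken out of service while someone looks at it.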
And last of all, once a switch starts to behave badly or you think it is, just take it out of service. The customer doesn't want it to be broken. They just want the network to work. So taking things, even if you suspect them, even if you are not sure, taking it out of service is the best thing for the customer even if it makes it harder for you to troubleshoot the problem.
So, we have talked about putting a whole bunch of switches in a rack and having it behave as one big chassis device. But when I think of our switches and how we operate our data centre network, I think of actually the entire data centre as one big virtual chassis because basically it's the same concept. But the thing is, you can actually start small. So we always do the massive big diagram where there are lots of boxes in it. This is the simplest version of this that you can build. Once again, if you have less than 16 racks, you have your racks, you have your fabric switches, you have four spine switches and you have four switches to connect out to the rest of your network. And this is fine. This will give you, at a very low cost, the ability to have a data centre that has 16 racks in it. The nice part is it scales very linearly. You don't go back to your vendor and say I need to spend X million dollars to buy another switch. You just throw more switches at it: you need more racks, you add capacity at the bottom layer. If you have added more racks, obviously you are going to cause congestion higher up. You can add more edge switches. You are going to create problems now at the spine. You can add more racks and ultimately, you can scale up your spine. So you get to this but you don't start at this and we don't start at this. We start small and build and build. The nice idea of this is you can scale it on demand, and because these devices are cheap and because your vendor will have large supplies of them, you can just order them as you need them; you don't have to have long lead times on the hardware, they will have stacks of these things sitting around because they are cheap.
So, some of you are getting upset because you are going, hey, I happen to like my massive complicated chassis device that no one else in the company likes or knows how to operate. You still get to have these, they get pushed to the top of your network; you still need ACLs somewhere, you are probably going to need to connect to the Internet, you need your big routing tables. The idea of what we do is, we push all of this complexity right to the top of the network. As we leave our data centre network, that's where all the complexity lives; you are spending all of this money in one small place in the network rather than distributing this complexity and this cost throughout your entire data centre network.
Funnily enough the hardest part about building a big commodity network is not messing up the cables because it turns out even in this rack, look at the numbers, 32 data cables, 32 ports in every box, there are 1440 cables in this rack that we built. So, there are lots of things you can do to make this easier, you can colour‑code the cables. When someone is plugging them in, it's easier to get it right. What we're actually looking at now in the FBOSS devices is changing the LED colours, changing it from green to red depending on whether they plug them in correctly or not, just because I think it's something like 10 to 20% of all the devices we put into service have incorrect cabling the first time around and it actually causes massive delays in turning up equipment.
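A cabling check of the kind hinted at above can be sketched as a comparison between the cabling plan and what each port actually sees on the other end. Everything here is an assumption for illustration: the data shapes, the names, and the idea that the observed neighbours come from something like LLDP; it is not how FBOSS does it.

```python
from typing import Dict, List, Tuple

Port = Tuple[str, str]  # (device name, port name)


def find_miscabled_ports(
    expected: Dict[Port, Port],
    observed: Dict[Port, Port],
) -> List[Port]:
    """Return local ports whose observed neighbour differs from the plan.

    A port with no observed neighbour at all also counts as wrong.
    """
    wrong = []
    for local in sorted(expected):
        if observed.get(local) != expected[local]:
            wrong.append(local)
    return wrong


plan = {
    ("leaf1", "eth1"): ("spine1", "eth1"),
    ("leaf1", "eth2"): ("spine2", "eth1"),
}
seen = {
    ("leaf1", "eth1"): ("spine1", "eth1"),
    ("leaf1", "eth2"): ("spine1", "eth2"),  # plugged into the wrong spine
}
print(find_miscabled_ports(plan, seen))
```

The ports returned are the ones that would get the red LED in the scheme described in the talk.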
So, what's in it for you as the network engineer to do this? You know, it's the old IBM phrase, nobody ever got fired for buying IBM; nobody ever got fired for buying Cisco in the networking world. But at the same time, there was a time when people said, well, you know, I'll just buy the Digital Equipment box, I'll buy the Sun Microsystems box; those companies don't exist any more because something changed in the industry 20 years ago: Linux came along and destroyed them. I think what's going to happen in the next five to ten years is, we have all of these cheap ASICs coming out of companies like Broadcom, and it allows you to build a network for much cheaper. What we don't have yet and what we're trying to do is give people a usable OS on top of this. The thing that killed off Sun and DEC and all of those companies is the fact that you had Linux come out and actually be deployable and easy to manage and offer support. You can use this crazy Open Source thing but we'll support you and we'll adopt you. I think the same thing is going to happen in the networking world. Obviously on the Internet things are different but inside a data centre I do see in ten years' time that most companies will be doing this, and I'm sure people will be sceptical, but people were super sceptical, I remember I was; I remember I ran Linux on my desktop, I thought we'll never do that. And then two years later we installed it all because the economics of buying DEC and Sun at that stage didn't make sense. I think once we get to a point where we have got a good robust operating system that you can install on commodity switches everyone is going to do this. When you can go to your company, your manager and say hey, we built this commodity network, it was 50 million dollars cheaper than doing this with Cisco, I personally have three children to feed, they like expensive toys, I saved you 50 million dollars, just take a small percentage of that and maybe deposit that into my bank account.
One thing I'd like to say, so last time I was here was in Madrid at a RIPE talk, and I made a French Martini, which I'm not going to do today because most of my French colleagues pointed out that it was neither French nor a Martini. There was also some discussion about Facebook being used to influence RIPE elections and I made some comments at the end of my talk about taking over RIPE and having French Martinis for everyone, which at the time seemed quite funny. With the balance of time and what's happened since, super, super awkward, so, I'm going to do something less controversial this time around. As we have discovered lately, democracy is a fragile thing. RIPE obviously, being a French acronym, it's lovely to have RIPE here in France. France has also proven lately to be one of the few countries around that still take democracy seriously. Maybe we choose ‑‑ well a little more power in the RIPE inside of France, because those of us in English speaking countries seem to be making a mess of democracy lately.
Thank you very much.
CHAIR: Thank you. Even though Richard reduced his talking speed, we now have some time for questions, so everyone who has some questions, please go up to the microphone.
AUDIENCE SPEAKER: Benedikt Stockebrand. When it comes to running an OS on your boxes, have you considered those Linux derivatives like Cumulus or Pica8? Because my understanding is that's pretty much the combination you want to get this thing to work in a smaller environment than Facebook.
RICHARD SHEEHAN: We have played with it. Unfortunately, we have spent a lot of time in building our FBOSS and therefore it is currently on all our switches. It has pros and cons compared to Cumulus Linux. It's one of those cases where we started a little bit earlier than Cumulus, therefore when we got to the point ‑‑ by the time we got to the point where we could have used it we were already pretty far down the track of using FBOSS and then migrating all of those devices to something else would be painful. I think, like, in ‑‑ if we could do it all again, as it's easy to say, using something that's easy for everybody to use would be better for the entire community.
BENEDIKT STOCKEBRAND: And second point, the orchestration: do you have some custom proprietary in‑house sort of orchestration framework for all the components, or how do you do it? For the network components. I mean you need to configure them somehow. How do you do that? Because that's normally the reason I hear why people want those big Cisco boxes with FEXes or whatever, because they realise that if we have to manage like 5,000 components individually, we have a bit of a problem.
RICHARD SHEEHAN: We have three things. We need to build them. That's what my team does, they arrive as bare metal and we install them. They behave like Linux servers so we treat them like that. We image them with a pretty standard Linux image and convert that into a Facebook image. We start a couple of daemons on it, we install the kernel modules. We install Wedge Agent which is the software that controls that. We install this thing called Coop, which generates the configs, then they are all generated ‑‑ all the configuration for an entire data centre network is pre‑generated. We generate them all at once and basically what Coop does is pull in the appropriate configuration, massage it into the version that we need for that device and that's how it works. We do it all at once because we know what it's going to look like.
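The idea of pre‑generating every device's configuration in one pass can be illustrated with a small sketch. This is not Coop and makes no claim about how Coop works; the inventory fields (`name`, `role`, `loopback`) and the JSON output format are assumptions chosen only to keep the example self‑contained.

```python
import json
from typing import Dict, List


def generate_configs(devices: List[Dict[str, str]]) -> Dict[str, str]:
    """Render one config per device from a single inventory, in one pass.

    Output is deterministic (sorted keys), so the same inventory always
    yields byte-identical configs that diff cleanly in source control.
    """
    configs = {}
    for device in devices:
        settings = {
            "hostname": device["name"],
            "role": device["role"],
            "loopback": device["loopback"],
        }
        configs[device["name"]] = json.dumps(settings, indent=2, sort_keys=True) + "\n"
    return configs


# Hypothetical inventory for a very small network.
inventory = [
    {"name": "rsw001", "role": "rack", "loopback": "10.255.0.1"},
    {"name": "ssw001", "role": "spine", "loopback": "10.255.1.1"},
]
all_configs = generate_configs(inventory)
print(all_configs["rsw001"])
```

Because every config comes from the same inventory and the same template, there are no handcrafted per‑device differences to track.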
AUDIENCE SPEAKER: Leslie Carr, Clover Health. Just as a quick comment. We have a lot ‑‑ there is a lot of different programmes you can use for orchestration and configuration management. I know we have a lot of SaltStack users here, I'm sure you can ask around, Ansible and Puppet also have a lot of network support and I'm sure there is even more systems that I don't know very well.
BENEDIKT STOCKEBRAND: The biggest problem that ‑‑ I'm at least familiar with a few of these. The biggest problem is trying to convince network people who believe in the serial console to use these sorts of things. That's usually the biggest problem. And the one that sometimes just fails the entire thing.
RICHARD SHEEHAN: We have actually no CLI on the device or anything else. Everything is done via Thrift calls. There is no way to log into one of these devices and run commands manually, everything is done via API calls. It's one of the things I wish the vendors would do. It's easier, if they had good APIs to update their configurations.
BENEDIKT STOCKEBRAND: If there ‑‑
CHAIR: Do you guys want to grab a coffee later on? Let's have a last question, because the next speaker is up.
AUDIENCE SPEAKER: Sebastian Wiesinger, Noris Network. You said at the beginning of the talk you do layer 3 only which is probably the only way this could work. You probably don't have many people that leave the room screaming when you tell them they can only do Layer3 and have no Layer2 between their boxes. So, what I'm trying to ask is, how do you think to convince these people that Layer3 is better and how to convince them that it's not more complicated, you know. If I tell people that you have to install a BGP daemon on your server ‑‑
RICHARD SHEEHAN: The Layer2 domain is inside the rack. So the rack ‑‑ any host inside the rack are talking Layer2 to each other. The rack switch upwards is obviously Layer3. They are not running routing protocols on the actual host, the Layer2 domain is the rack effectively.
AUDIENCE SPEAKER: So they don't have to ‑‑
RICHARD SHEEHAN: Unless they want to do Layer2 from one rack to another that's when it gets complicated. Provided they can talk inside the rack Layer2 is fine.
CHAIR: Thank you, Richard.
Our next speaker is Charles Eckel from Cisco and he will introduce us into how to bring Open Source development and open standards together and how we can all benefit from that.
CHARLES ECKEL: Thanks everyone for being here and to the Programme Committee for giving me the option to ‑‑ the opportunity to talk today. So, I work at Cisco. My name is Charles Eckel. I am very happy to have the opportunity to talk to you about two things that I'm very passionate about and that I find quite exciting. One is Open Source software, and the other is Open Standards. Now, you may be thinking something, two words you don't often hear in the same sentence, standards and exciting, but part of my goal here today is to show you a way in which I think standards can be exciting. With the idea being that if we're creating Open Source implementations of evolving standards in parallel with going through the standardisation process, then I think that opens up a, you know, a world of possibilities and really good things.
Really, it enables us to reach, I'd say, better standards, more quickly. And perhaps more importantly, increases adoption and deployment of those standards, because now there is, there is running code that developers can use rather than just having an RFC number or something like that.
So, now I'm going to spend the rest of the time telling you a bit about why, giving a bit more examples and going into more detail about this.
So, first of all, why standards? Why should we care about standards? Why do we continue to need to care about standards?
So for many years now standards have played a key role in the networking industry, really that industry has demanded standards from vendors. If you want your equipment to be, you know, viable and used and purchased by people, you really need to support standards.
The reasons for that are a couple. But the main one is interoperability. You want to be able to choose different equipment from different vendors, plug it in together, grow your network over time; if one vendor goes away you want to be able to replace it with another one. So that interoperability is key and really going hand in hand with that is really avoiding any vendor lock‑in.
Now, vendors have cooperated really on this type of standardisation effort very willingly and there is good reason for them to do so: it's a way of establishing credibility for their products, right, they say, yeah, we support this standard, this standard and this standard, it makes their product more credible. And that interoperability is very important to the vendors too, because now that makes it a lot easier for them to work with their own partners and perhaps even in those cases where they need to work with a competitor, it enables that to go much more smoothly.
Okay, so then why Open Source? Well, I think no one would argue that really we have seen Open Source transforming many industries, including the network industry, to the point where now the network industry really demands open source from its vendors. So if you are a vendor of equipment, I'd say you really need to have an Open Source story to even have a seat at a table when it comes to working and getting yourself in a, you know, to be considered by an operator.
Open Source can be used defensively and what I mean by this is if you consider the case where you have been spending a lot of time and effort on standardisation, on adding support for standards into your products, now if you can support and add support for those same standards into Open Source implementations, that continues to make those standards viable and more deployable and at the same time it helps protect your investment in those standards.
Now, on the other hand, I'd say Open Source can be used offensively, and what I mean by this is, you could use Open Source to ‑‑ perhaps in a market where you don't have a strong footing, and change the playing field by really commoditising some aspect of the way things operate and then switching the playing field such that you know your products kind of fit in a bit better.
So I mentioned, you know, we have been doing standards for a long time and I think we're pretty familiar with the traditional standards process, but I just wanted to review that.
So generally what ends up happening over a year or longer, a standard emerges, it could be shorter but often times it's longer than even a year or two. Once you have that standard, you then need to go off and implement it, right, in your products and solutions and then the problem is, you take multiple implementations of those standards in, you know, different products and they are probably not going to interoperate real well from day one, now you spend time getting these different implementations to interoperate. And the reason for that is there is ambiguities in the specs, errors in the specs, different feature sets that you may or may not implement. But eventually we get through all that. But the problem is it just takes a really, really long time. So that's ‑‑ the luxury of that time is something that we don't have any more.
Now, if you combine with this the power of Open Source, which I mentioned is transforming industries very, very quickly, and there is reasons for that, good reasons for that. Really, you are able to leverage a vast community of very passionate, bright people who are working together quite well in a certain specific problem space. They are able to innovate at a very, very rapid pace and in some cases if that software gets deployed widely, that results in a de facto standard.
Now, everything is not perfect by any means with Open Source. For those of you who have tried to use any Open Source project, you have probably realised there is some assembly required, you are going to need documentation and that documentation that does exist, that's not the favourite thing of developers to do. So, probably it's going to be, if it does exist, it's going to have gaps in it, there is going to be parts of it that are out of date, and so you are going to have to spend some significant effort, usually, in getting this stuff to work.
Now, even in the great case where the documentation is perfect and everything is working really, really well, the problem that you'll probably encounter is there is no single Open Source project that's going to provide the end‑to‑end solution that you really need. What you are going to have to do is take multiple Open Source projects and maybe even some vendor proprietary products and then write some of your own glue and put that all together in order to provide a solution. That can be a lot of work and take a considerable amount of time, as all this stuff continues to evolve.
So, what I really advocate then is combining these two. Let's bring the speed and collaborative spirit that we see working so well in the Open Source community and let's you know bring that into standards organisations and the standards process. Let's look at adding support for the key standards to important Open Source projects. And maybe even go one step further with that, create reference implementations of standards and solutions that are being proposed out of Open Source projects.
Now, a couple of ways where I have seen this working, which I want to talk about in a bit more detail in the next few slides, include things like Interop events and hackathons where we can work together on this.
Okay. So this is a slide of Open Daylight. How many of you are familiar with Open Daylight? A decent number. Open Source SDN controller. So, don't worry if you can't read the fine print on that. The idea being that, in the middle, you have all your kind of core networking services and then on top you have a set of APIs that you can use to write your own applications that are going to control the controller and then at the bottom you see a lot of networking protocols that you are probably familiar with. And you can selectively enable those on Open Daylight to be able to control pretty much any type of virtual or physical network element that's ever been created.
So that's one of the really powerful things about it.
So now what I just did here was I lit up in green those things which are really direct implementations of IETF standards and I did that just because I am familiar with the IETF standards in that space, and I probably even missed some here but the things in green are ones that I am absolutely sure are based on IETF standards. And so this is an example of what I think is really, really helpful. Because, especially that bottom layer where we are interfacing in with the various network elements, we have support for existing standards. So what does that mean? That means that Open Daylight is, it's much easier to pick up Open Daylight and deploy it to control an existing network than it would be if it didn't have support for all these standards.
So, I use this as an example of something that I think we need to see more of, of existing important networking standards being supported in Open Source networking projects.
Okay, I mentioned IETF a couple of times and now I'm going to go into a bit more on IETF. Now I'm sure almost all of you know what IETF is, right? Yeah, of course, so you know, it's many more than Open Daylight. So, IETF, the Internet Engineering Task Force, it's been around for over 30 years, that's really where all the important Internet standards have been created and certainly the networking standards that we use all the time. And just a subset of them that you are probably familiar with are, you know, definitely familiar with at the bottom there. But there's many, many more.
So the challenge I would say, although IETF has been pretty successful in its overall goal, is that it is slow, it takes a long time to come up with these standards, and I have been involved in IETF for ‑‑ how long ‑‑ I have been using IETF stuff for almost probably 20 years but really going to IETF meetings for probably a little over ten years. And it's a lot of the same people. So while we're seeing a tremendous growth in Open Source and a lot of the young talented people going through, we don't see nearly as many of them coming into IETF, into standards in general. What we see, actually, is that the standardisation process can actually be overrun by the innovation that's happening in Open Source. And in some cases, as I mentioned earlier, that results in the sort of de facto standards where code becomes a de facto standard because we couldn't complete our standardisation effort quickly enough.
So, one thing that the IETF has done, and that I have been involved with, has been doing these IETF hackathons, where the goal here is, we want to advance the pace and relevance of the standardisation work that we're doing. So in parallel with defining the standards, we're trying to flesh out the core ideas there. We're trying to implement them in code. Make sure that the specs that we have are implementable, that they are not ambiguous, that they don't have things missing. And we're not waiting until the standards complete, we're doing this in parallel with the early versions, the early drafts.
We're also using this as a tool to attract new talent, because developers, university students, it's much much easier, much more interesting, I would say, for them to come in, bring their software development skills and apply them to specific problems that we have at the IETF hackathon rather than to start with some mailing list and reading drafts and commenting in a meeting.
And these hackathons I should mention, they are very collaborative, they are open, they are really non‑competitive events. It's not like some other hackathons you may have seen. The idea here is we're all trying to move IETF standardisation forward.
And then what I show on the, in the graph there is just the growth of this. So I ran the first hackathon a little over three years ago, at IETF, we had about 40 people there and it was mainly because I twisted their arm and begged them to come and give this a shot, and now, just a few years later, the last one in London, we had over 250 people participating, and really devoting their weekend before the IETF meeting to this activity, which, to me, is quite remarkable, and we're really seeing a lot of benefits in the IETF because of this.
I also wanted to mention something else that we have done, and we have just created a GitHub organisation for the hackathon, and we use this as a place to keep code if you don't already have another home for your project. You're not forced to put it here, but the reason I mention this is just changing the way we work to use tools that are already very familiar to software developers like GitHub, that makes it a much more inviting place for them to come and work with us on code than forcing them to use a different IETF tooling system, that they are not familiar with. Now we do still use the IETF tools and they're fantastic, but we just don't want that to be a barrier to people coming in and participating in the hackathon and to participating with us in general. So using GitHub has been a great way to attract more developers and make them productive right away.
And then just last week I was in Senegal, in Dakar, there was the African Internet summit and we had a hackathon there, and kind of similar type of idea being very collaborative and non‑competitive. A couple of goals there. We really wanted to build more competency around IETF technology in Africa, you know, in the regions that were represented at this meeting. And we wanted to really advance, I would say, deployment of evolving and existing IETF standards by getting people more familiar with them, right, giving them an opportunity to work together on these standards and to learn new things about them.
And we had three different projects. One I led on network programmability around NETCONF and YANG and RESTCONF. We had one on NTP data minimisation and then another one on intelligent transport systems and both of those were based on implementing Internet drafts. Not standards but again drafts. Because we wanted to test these things out by implementing them in parallel with working on them.
So now I'm going to shift gears a little bit to give an example of another standards organisation. MEF. So MEF was founded a bit more recently than IETF, it came around in 2001, and really the goal of MEF was to standardise what they coined as carrier ethernet, to make it easy for service providers to provide this service to customers to be able to define what vendor equipment must support in order to be considered carrier ethernet certified. And the work that MEF did was very successful. They created a huge carrier ethernet market. They say it's an 80‑billion‑dollar market and growing.
But, you know, that started in 2010 and I would say ‑‑ sorry, yeah, 2001. The problem was they actually solved that problem. Carrier ethernet is pretty ubiquitous, so they need to look at a new challenge to take on. And so, they moved more into the software side as well by looking more at higher level services at Layer3, Layer4, something they termed life cycle service orchestration: how could we orchestrate the networks and do this across a multi‑vendor, multi‑service provider network?
So, what this picture is showing is on the left‑hand side, the traditional MEF operating environment. You had services on the top, and then certification that was provided. And that's basically what MEF did for carrier ethernet.
Now, what was added was, first of all, the LSO APIs, and again you can see APIs coming here because of that realisation that software developers, and Open Source software too, are very, very critical, and we want to make sure that there is a component of what's being done that speaks to developers. And perhaps even more importantly, this community component that got added. And that's things like developing the developer community that understands and is able to implement MEF standards and implementing those standards in parallel with defining them, because lifecycle service orchestration is a pretty big goal that they are going after. It's not something that MEF is doing in isolation. There is work going on in other standards organisations, including the IETF that MEF wants to be able to leverage. So having a community where people work together is very, very important. And the Open Source community is a really good place for different standards organisations to come together and work collaboratively on things of common interest.
So, similarly, MEF has its hackathon. They call them LSO hackathons. I helped introduce the first one to MEF in November 2015, they have been running them ever since with the goal of taking this whole set of APIs and this architecture that's termed lifecycle service orchestration, we want to be able to implement that entire architecture as it's being defined because no one is going to wait a number of years for it to be finalised and standardised. Rather, we are implementing it in parallel with defining it. That's the goal of the hackathon. And really bringing people together to understand and work on various aspects of that architecture together.
And then another thing that's interesting is, we're not just working on code, but this introduction of something called MEF net, and I brought this forward because once you have the running code, another thing that's valuable is to actually have a place where you can run it, and have it live as a server that you could actually access, so, for example, if you think of APIs, you are going to have a client and a server. What we can do here is, as we are implementing these things, we can keep a server up and running in MEF net and then that makes it so much easier for someone who is interested in working on this to implement their client and test it against a known working server that's running in MEF net. So I think that's a really good way of embracing this kind of incremental development as well.
Okay. So, my call to action to all of you is that I hope I have explained this well enough that you find it interesting. I hope that you will join me and others in this kind of combination of Open Source and standards, with the idea being that we want to make the standards that we're working on more consumable by developers. And the way of doing that is by having running code that's available in parallel with the standard, having utilities, having libraries, things like that, that developers can use rather than just producing a spec and then waiting for it to get implemented. And doing that in parallel with defining it so that we end up with a much easier to implement, much better standard in the first place.
And then we want to be able to make the Open Source software that we're working on more consumable by the industry and the way that we're doing that is by adding support for the key standards that networking industries built on into the Open Source projects too. So now it's much easier for a service provider, an operator, whoever, to adopt and use these Open Source projects, because the key standards they have been relying on for years are supported. And we have the IETF hackathon coming up in mid‑July, that's shown there, in Montreal. The next MEF hackathon is in ‑‑ it's actually end of October in Los Angeles, California. And then I'm not sure when the next AIS hackathon is, it will probably be around this time and Uganda is where the next AIS summit will be. With that, I thank you, and really would love any questions or comments, recommendations that you have. So, please, I'm happy to entertain those.
CHAIR: Thank you very much. And ‑‑
AUDIENCE SPEAKER: My name is Vesna, I work for RIPE NCC, I am a community builder there. And I am happy that you brought up this topic and I am happy that the hackathons are working for you. I don't have a question as a tradition is apparently on this RIPE meeting, I have a comment.
I would actually like to jump on the bandwagon and promote our own hackathons which were inspired more or less by the IETF ones and the ones from the free software community. We have organised six already and this year we have two coming up, one is in Dublin in June for the network operators tools and the other one is just before the RIPE meeting in October in Amsterdam, working together with the university on developing the quantum Internet together with the RIPE community. So I hope that you would join, that you would promote our events. We will promote your events because more hackathons are more awesomeness. Thank you.
CHARLES ECKEL: It's a deal. Thank you.
AUDIENCE SPEAKER: Franziska Lichtblau, Max Planck Institute. One also suggestion or a question if something like that exists, as a person who is teaching students at university and I am sending people to hackathons because I get very nice e‑mails from Vesna, is there or should we create a platform like a stupid calendar where we just have all the hackathons corresponding to different topics, whatever, we did that for Open Source events in Europe at some point. I think people who actually are trying to teach students to integrate students into communities would actually benefit from that, if you just click somewhere look it up and maybe we have someone who would actually profit from that.
CHARLES ECKEL: That sounds like a good idea. When you said we did that for events, who do you mean by "we" and did it work well?
AUDIENCE SPEAKER: Basically it was a bunch of friends and family from the ‑‑ it started out in the Debian community and we included all kinds of free software events and just somebody was hosting it, but as here we have a bunch of organisations involved, maybe someone already has a back end where we could host something like that.
CHARLES ECKEL: I think that's a great idea and something I'll certainly look into. If there is already a place people are going to see various events, then hosting it in a similar place seems logical. But if not, we can look for a better place. Thank you.
AUDIENCE SPEAKER: Hi. I have an online question for you. It's from Sandy Murphy, Parsons. In IETF we have found a lot of standards development to be very beneficial. You mention the IETF ageing community. How to get the younger folk involved? Thanks.
CHARLES ECKEL: Certainly, how to get the younger folks involved, certainly the hackathon is one thing that we found it was aimed at that and I think it has been working quite well. We have a number of projects that have been brought in by universities, that's one way that it's happened, where a professor or a Ph.D. student or someone championed a project within IETF hackathon, we call whoever is leading a project, we call them a champion for a project, and so someone from the university would champion it, and then what I have seen is different sets of students kind of come and go from one hackathon to the next, and they continue to work on it as something they do as part of their course work, and that's worked really, really well in some cases it's been implementing existing IETF standards, in other cases it's been with new work that's ongoing. And just the idea of people coming in who, if they have a software development background, they provide a very ‑‑ they are very valuable to the IETF community, because we have plenty of people who are good at writing standards and even more people who are good at arguing about standards. But we're kind of, not that we don't have any good developers, but we don't have enough, and so when we see people from universities or from elsewhere coming in who have good software development skills, you know, they tend to be welcomed with open arms, and so I have seen that as being a very good way to get plugged in and get started, contributing right away to the IETF. And through the hackathon and just writing code in general, it's a great way to get involved.
LESLIE CARR: Thank you very much.
Thank you, Sandy, for reminding us that we are all ageing. All right. And our next speaker is Aaron Glenn and he is going to talk to us about the greatest alliteration of all the titles: Promoting the promise of programmable packet processing with P4.
AARON GLENN: You can thank Peter Hessler for the excessive alliteration.
A quick survey. How many have previously heard of P4? A few hands. Any of those write any P4, run any P4 programmes? I see one hand. All right. This is going to be ‑‑ more fun than I expected.
So, let me start with the beginning of packet routing, 1968. ARPANET and the interface message processors were essentially a Honeywell minicomputer. One of the lead researchers in developing the ARPANET reflected on his experience of the interface message processor's abilities during that time. I find it kind of telling that the features found in those micro controllers are kind of hard to find in modern networks today, built 20 years later. Debugging, monitoring, measurement, and above all, flexibility.
Now, why is that? For the past 20 years, network protocol design, development and deployment has been driven by ASICs. These ASICs are fixed function, meaning there is an API for the functionality that silicon provides which is exposed by the chip manufacturer that then your favourite vendor uses in the control plane. Now, getting new protocols or features ends up being a long arduous process, upgrades would require a forklift, memory profiles like TCAM and SRAM are inflexible at best. A good example is VXLAN, it was standardised but it took four years until it was generally available in silicon and in devices that you can install in your data centre.
This leads to networks being ossified by this development cycle.
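To make the VXLAN example concrete, here is a minimal Python sketch of what "supporting VXLAN" asks of a data plane at the very least: recognising one extra 8-byte header and pulling out a 24-bit network identifier. The header layout follows RFC 7348; the function names `parse_vxlan` and `build_vxlan` are illustrative only and do not come from the talk.

```python
import struct

# RFC 7348: 8 flag bits (the "I" bit, 0x08, marks a valid VNI), 24 reserved
# bits, a 24-bit VXLAN Network Identifier (VNI), then 8 more reserved bits.
VXLAN_FLAG_VNI_VALID = 0x08


def build_vxlan(vni: int) -> bytes:
    """Build an 8-byte VXLAN header for the given VNI (illustrative helper)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vxlan(header: bytes) -> int:
    """Return the VNI carried in an 8-byte VXLAN header."""
    if len(header) < 8:
        raise ValueError("a VXLAN header is 8 bytes long")
    word0, word1 = struct.unpack("!II", header[:8])
    flags = word0 >> 24
    if not flags & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word1 >> 8
```

In software this is a few lines; on a fixed-function ASIC the equivalent parsing step has to exist in the silicon, which is the development cycle being described here.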
P4 stands for Programming Protocol-independent Packet Processors. It was initially designed for programmable switches, but its scope has broadened to be more general for any kind of forwarding plane. As a language, it's only for the data plane. It doesn't specify anything in the control plane. So, what we're talking about is just forwarding packets. You wouldn't express BGP or OSPF in P4.
We'll cover that part of the control plane a little bit later.
This isn't another SDN talk. I get it, everyone is tired of hearing about SDN but I promise I don't represent anyone and the only thing I am selling are some ideas, not any products.
I myself, I am an applied network operator and designer. I have got a decade or so of professional experience. And I'm really just excited about P4. I'm not an academic, sometimes I wish I was. I'm just excited and I think this has a lot of applicability to a lot of you here in the room.
Okay. So, is P4 SDN? Maybe. What is P4? Again, it's a language for expressing packet processing pipelines. I know that sounds like a mouthful, but that's what it is. Just a data plane. When I make a reference to P4, for some of you that are familiar with it, I'm speaking of P4_16, the newest specification. There were a few before that. P4_16 is the real deal.
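For readers who have not seen a packet processing pipeline written down, the following is a rough Python analogy, not P4 itself: a table that matches on a header field (longest-prefix match on the destination address) and runs an action on a hit. The names `Packet`, `LpmTable`, `set_egress`, `drop` and `pipeline` are assumptions made for this sketch and are not part of the P4 language or of the talk.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    dst_ip: str
    ttl: int
    egress_port: Optional[int] = None
    dropped: bool = False


# Actions: small functions that rewrite headers or per-packet metadata.
def set_egress(pkt: Packet, port: int) -> None:
    pkt.egress_port = port
    pkt.ttl -= 1


def drop(pkt: Packet) -> None:
    pkt.dropped = True


class LpmTable:
    """A match-action table keyed on longest-prefix match of dst_ip."""

    def __init__(self, default_action):
        self.default_action = default_action
        self.entries = []  # list of (network, action, params)

    def add_entry(self, prefix: str, action, *params) -> None:
        self.entries.append((ipaddress.ip_network(prefix), action, params))

    def apply(self, pkt: Packet) -> None:
        addr = ipaddress.ip_address(pkt.dst_ip)
        hits = [e for e in self.entries if addr in e[0]]
        if not hits:
            self.default_action(pkt)
            return
        _, action, params = max(hits, key=lambda e: e[0].prefixlen)
        action(pkt, *params)


def pipeline(pkt: Packet, table: LpmTable) -> Packet:
    """The 'programme': drop expiring packets, otherwise consult the table."""
    if pkt.ttl <= 1:
        drop(pkt)
    else:
        table.apply(pkt)
    return pkt


# Table contents are the control plane's job; the pipeline shape is the programme.
ipv4_lpm = LpmTable(default_action=drop)
ipv4_lpm.add_entry("10.0.0.0/8", set_egress, 1)
ipv4_lpm.add_entry("10.1.0.0/16", set_egress, 2)
```

The split mirrors the one the talk draws: the code fixes which fields are matched and which actions exist, while the entries in the table are filled in separately at run time.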
There are three guiding principles that the language designers used in order to realise this goal of an abstracted programmable forwarding plane. The first one is protocol independence. P4 is not tied to any protocol. Not Internet, not MPLS, not anything. The next part is target independence, so whether I'm talking about a Switch ASIC or a network programmable unit, or even a GPU, a graphics card, it doesn't matter, the target isn't important to the language.
The third and final one is, it's reconfigurable. So once I have written a programme and I have deployed it on whatever that device may be, I want to be able to change that once I have already racked and stacked it, once I have powered it on, once it's already forwarding traffic. So what does P4 let us do?
It's not necessarily just programming a device. It's defining the network's behaviour. Now, my talk is about P4, but what I really would like to talk about is how programmable data planes unlock a whole lot of features and things for us as network operators and network designers. I'll touch on the language, but again, that's not really what this talk is about.
Really, it's what can happen when the data plane becomes programmable? A few things. And they are pretty self evident, right. We can realise new features and new protocols faster.
Randy Bush's presentation yesterday about BGP SPF, he includes a new ether type to change some TLVs and stuff. You could go to Broadcom or Cavium or any other of the ASIC manufacturers and ask them could you implement this new ether type? They might. They might not. Maybe if you are Facebook. But probably not. And even if they did it would take years. So, kind of hard.
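As a rough illustration of why a new EtherType is cheap once the parser is programmable, the sketch below models the parser's first transition as a lookup table in Python. The value 0x88B5 is the IEEE 802 local experimental EtherType, used here purely as a stand-in; it is not the value proposed in the presentation mentioned above, and the state names are hypothetical.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD
ETHERTYPE_EXPERIMENTAL = 0x88B5  # stand-in for "some new protocol"

# What a fixed-function parser effectively has baked in.
BASE_TRANSITIONS = {
    ETHERTYPE_IPV4: "parse_ipv4",
    ETHERTYPE_IPV6: "parse_ipv6",
}


def next_state(frame: bytes, transitions: dict) -> str:
    """Choose the next parser state from the EtherType of an untagged frame."""
    if len(frame) < 14:
        raise ValueError("an Ethernet header is 14 bytes long")
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return transitions.get(ethertype, "reject")


# With a programmable parser, the new protocol is one more table entry.
EXTENDED_TRANSITIONS = {
    **BASE_TRANSITIONS,
    ETHERTYPE_EXPERIMENTAL: "parse_new_protocol",
}
```

The same frame is rejected by the base table and accepted by the extended one; nothing else in the parser has to change.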
You can use your hardware resources more efficiently and I am sure there is more than a handful of you in the room that have had to reconfigure a TCAM profile, strip away some ACL rules, all because the memory is partitioned in a way that's not specific to your network needs. With a programmable network you can free up resources, basically push more packets with fewer watts, the way you need to. We can increase visibility into the network. I know Facebook, not to beat on them, loves pinging everything all the time in a full mesh. That's synthetic traffic. If we can start manipulating headers and computing things on those headers we might be able to do OAM inband using the actual customer traffic, flip a bit here, add a header there, next thing you know I can do weird things like proof of transit, I promise you that's not a blockchain thing, it's not. Not a blockchain thing.
But then I can say, hey, I know for a fact that this packet went over this switch and this router and this other switch and another router, and I know because along the way I modified the header, and when it gets to its destination, I know the history of it.
Lastly, it just gives you greater control over the network. You can imagine nearly everyone in this room has at one time or another excitedly read a data sheet, thought about how it might fit into their network, how they might architect this new thing with this new feature only to be disappointed or have it not work and not being able to do anything about it. With a programmable data plane, well, you can. After you have deployed it, after you have purchased it, after you have configured it.
Okay. Well, that's cool. I mean, we get it. You can do some cool stuff. And it might feel slightly revolutionary because we are used to such fixed functionality in our networks. Really, this is just table stakes. This is making up for lost time, 20 years of basically being behind in programming. Other computing realms like signal processing, graphics, storage, they all have the flexibility and programmability and yet us in networking we don't. We haven't.
Now, we have been able to programme data planes since network time immemorial. But, the majority of the excitement that we have now in programmability seems to be centred around ASICs and that's cool. But this goes beyond wire‑rate packet processing on just ASICs. I am interested in P4 because of how generally applicable it is to all kinds of devices. And up here is an example of available, mostly public, Open Source compiler targets for P4.
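The idea of a packet carrying its own history can be sketched in a few lines of Python. This is a deliberately simplified stand-in: each hop appends its identifier and folds it into a chained HMAC, and the destination replays the chain. Real proof-of-transit proposals use different cryptography, and the node names, keys and the `TransitHeader` type below are all hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


@dataclass
class TransitHeader:
    hops: list = field(default_factory=list)  # node IDs in the order visited
    tag: bytes = b""                          # running HMAC over the path


def stamp(header: TransitHeader, node_id: str, node_key: bytes) -> None:
    """What each switch or router does to the header as the packet passes."""
    header.hops.append(node_id)
    header.tag = hmac.new(
        node_key, header.tag + node_id.encode(), hashlib.sha256
    ).digest()


def verify(header: TransitHeader, keys: dict) -> bool:
    """What the destination does: replay the chain with the known node keys."""
    tag = b""
    for node_id in header.hops:
        key = keys.get(node_id)
        if key is None:
            return False
        tag = hmac.new(key, tag + node_id.encode(), hashlib.sha256).digest()
    return hmac.compare_digest(tag, header.tag)
```

A header stamped by `sw1`, `r1` and `sw2` in that order verifies; reordering the hop list or naming a node whose key the verifier does not hold makes verification fail.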
So you can write a P4 programme and compile it to extended Berkeley Packet Filter, you can do it for eXpress Data Path, for Vector Packet Processing, a few software APIs that you might be familiar with. I can then compile it to an FPGA, with these. The Barefoot Tofino, which is probably where you might have heard about P4, I can compile it to a graphics card, a GPU. I can also do it to DPDK and the one that I find most interesting for an example is to Open vSwitch, a project called PISCES.
The PISCES one is the most interesting thing to me for one fact, you can express programmable pipelines in any language, and Open vSwitch is written, I believe, mostly in C, but a researcher, Muhammad Shahbaz, basically did a project where he retrofitted OvS to support P4 to express existing forwarding behaviours. And the interesting thing is, as you can see here, it uses a lot less lines of code. I mean like an order of magnitude less. But it's expressing the same thing, it's the same functionality. Some people might argue about the speed or the cache usage, but forget all that. Let's just talk about the feature. Forwarding ethernet frames. IPv4, ACLs, a lot less code. It makes it a lot easier to reason about these things. And now when I want to add something like TCP flags or inband OAM, I can do that with a few lines of code instead of more libraries and lines and lines and lines of C, it quickly becomes intractable.
So let's use P4 to model all the forwarding planes. Everything in the network from the VMs, containers to the top of rack switches to the routers, let's use it to model everything. But do I only need a compiler to use P4? No. In fact, P4 becomes useful for a lot of things that don't natively support the language.
Again, P4 lets devices be defined by their behaviour, you express that behaviour with the language.
That's due to the expressiveness of the language. So now I can model the behaviour of any packet forwarding plane, whether or not the underlying hardware or software natively supports it. Google has a really interesting slide deck that they gave not too long ago, labelled Next Generation SDN Switch ‑ Future Plans for Google's SDN networks. This slide particularly caught my eye because it explains what I'm trying to get at a little bit better visually. You have your controller, control plane, distributed or otherwise, and that speaks to a logical representation of your switch or your router or whatever forwarding plane. But that then maps to whatever is actually running, your Broadcom Tomahawk, your Cavium XPliant, your Intel FlexPipe, whatever switch or ASIC or FPGA you happen to be running. Now, if the physical one doesn't have a compiler and doesn't natively support P4, I can still interact with a logical representation of what that switch supports. Yeah, you have to write that logical representation yourself. But again, with P4 it's not too difficult. It's all the language really does and you can do it in about 400, 500 lines.
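The layering on that slide, a controller talking to one logical switch model that is then mapped onto whatever chip is underneath, can be sketched in Python as follows. Everything here is hypothetical: the two target classes and the command strings they return are invented for illustration and do not correspond to any real vendor SDK.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class LogicalEntry:
    """One entry as the controller sees it, in terms of the logical model."""
    table: str
    match: str
    action: str
    port: int


class Target(Protocol):
    def install(self, entry: LogicalEntry) -> str: ...


class HypotheticalAsicA:
    # Invented rendering; a real mapping would call the chip vendor's SDK.
    def install(self, entry: LogicalEntry) -> str:
        return f"asic_a.l3_route_add(prefix={entry.match!r}, port={entry.port})"


class HypotheticalAsicB:
    def install(self, entry: LogicalEntry) -> str:
        return f"asic_b.fib.insert({entry.match!r} -> if{entry.port})"


def program(entry: LogicalEntry, targets: List[Target]) -> List[str]:
    """The controller issues one logical write; each target translates it."""
    return [target.install(entry) for target in targets]
```

The controller code never mentions a chip by name; only the per-target mapping does, which is the part the talk says you may have to write yourself when the hardware has no P4 compiler.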
So, now that we can do that, they have come to standardise it, the P4 association, the one that's driving this language, has an abstraction that they call the portable switch app distraction and that is basically kind of like a standard library for like the C language, it's the standard library for P4, it express as switch in a very, very general sense. You can use that to logically model a physical switch underneath.
Okay, so I have represented my programming pipeline in P4. I have all this needs stuff, but now how do I control it? Do I have to write my own API? Do I have to spend time writing a new control plane to handle all of this programmability and all this new data? Thankfully they have also begun standardising something called P4Runtime. So P4Runtime still allows for the same independence of target independence, protocol independence and even programme independence. It is a generic API that knows how to speak and talk to all the P4 primatives, the header parsers, the tables, the batch action, it's similar to OpenFlow and switch abstraction interface. But it remains protocol independent, unlike those two. If you come up with your own protocol, your own artisanal MPLS, P4Runtime will still let you interact with it without any changes.
I got ahead of myself. That's what the next slide was going to tell you. It's not necessary to have a P4 programme. You don't necessarily have to have the underlying source code to a P4 programme in order to interact with it on the control plane. The minimum requirement, as I highlighted, is a simple information file that basically tells you what is going on underneath: what the tables are, what the parsers look like, what counters you might have. All of that is standardised.
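A minimal sketch of that idea, assuming a made-up description format: the control plane holds only a description of the tables, match fields and actions, and uses it to build and validate table entries without ever seeing the P4 source. The dictionary layout, the table name `ipv4_lpm`, the action names and the `build_table_entry` helper are all hypothetical; this is not the P4Runtime API or its information file format.

```python
# Hypothetical pipeline description, standing in for the information file
# mentioned above. None of these names come from a real P4 programme.
PIPELINE_INFO = {
    "tables": {
        "ipv4_lpm": {
            "match_fields": {"dst_addr": "lpm"},
            "actions": {"set_next_hop": ["port"], "drop": []},
        }
    },
    "counters": ["ingress_packets"],
}


def build_table_entry(info, table, match, action, params):
    """Validate a table entry against the description and return it as a dict."""
    if table not in info["tables"]:
        raise ValueError(f"unknown table: {table}")
    spec = info["tables"][table]

    # Every match field must be one the description declares for this table.
    unknown = set(match) - set(spec["match_fields"])
    if unknown:
        raise ValueError(f"unknown match fields: {sorted(unknown)}")

    # The action must exist and its parameter names must match exactly.
    if action not in spec["actions"]:
        raise ValueError(f"unknown action: {action}")
    expected = spec["actions"][action]
    if sorted(params) != sorted(expected):
        raise ValueError(f"action {action} expects parameters {expected}")

    return {
        "table": table,
        "match": dict(match),
        "action": action,
        "params": dict(params),
    }


if __name__ == "__main__":
    entry = build_table_entry(
        PIPELINE_INFO,
        "ipv4_lpm",
        {"dst_addr": "10.0.0.0/8"},
        "set_next_hop",
        {"port": 3},
    )
    print(entry)
```

The point of the sketch is only that the description, not the programme source, is what the controller needs in order to know which writes are legal.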
So some of you might have heard that Juniper recently said it is going to support P4. Now the details are pretty slim, they have one small forum post, I believe, it's not even a blog post, but they posted up this nice graphic of how it's going to integrate with their new API called the advanced forwarding table, I don't remember what AFT stands for, it's their own private API and the Junos control plane runs on top of it. But it is also going to support P4Runtime with the P4 agent. I don't expect Juniper to write a whole bunch of P4 programmes and put them up on GitHub and see how they work, that's not going to happen, maybe, I hope so, that would be super cool. But it's not.
So, as we said, they are going to use P4 to express the behaviour of their underlying devices across the entire Junos platform, so that means your routers, your MXs, your fancy PTXs, down to your switches and conceivably even the SRX. They are going to use P4 to describe how all these devices forward traffic and then present all of that with P4Runtime, so you can interact with it. This goes beyond NETCONF or all those other APIs. This lets you get down to the nitty gritty: streaming telemetry, how headers are mangled, and whatnot, in the pipeline. And it's all abstracted away, so whether it's the Juniper Trio chipset or a Broadcom Tomahawk, you don't need to know any of those things any more. Not to be left out, Cisco also plans to support P4Runtime using a very similar API called Open Forwarding Abstraction, and they are basically doing the same thing and in basically the same way. So nothing too exciting there.
This went faster than I expected. So, let me wrap it up by saying why I am ultimately excited about P4, and it's not so much just having more visibility or just being able to express all the devices we use day in, day out in a generic language. What I'm most excited about is that how we design and operate networks is going to change significantly going forward with something like P4.
You don't have to programme P4 directly. In fact, it may turn out it's better as an intermediate representation, kind of like an assembly language or a low level C.
Network programming is a lot easier and a lot different than general programming on a Linux box or any kind of application. It's much more bounded. We don't do much. Again, we parse headers, we match that to a table and then we do some kind of action. There is not much going on there. So it makes it a lot easier to use long known formal verification techniques in order to start verifying things. And I have a lot of favourite papers; if any of you know me, I read a lot of papers and then I talk about those papers ad nauseam to my friends. I have one favourite one. It's called Correct by Construction Networks Using Stepwise Refinement. I know that sounds like a bunch of academic silliness. The paper is very readable and it's very readable for a network operator. You don't need a Ph.D. or master's degree, you can be a college dropout like me and still appreciate what's going on in that paper. I think it's about 14 pages. I highly invite anyone in this audience, no matter your level, to read it. It's very, very interesting. And what they talk about in this paper is they develop a higher level language called Cocoon and that's even more succinct; it allows you to encode policies and behaviours without any of the low level details, and then they manage to write a compiler in Haskell, if you are familiar with it, that takes this Cocoon and spits out P4. Now, that's kind of interesting. I can model my network in a higher level language, compile it to P4 and then conceivably use that P4 against my programmable network or against my Juniper and Cisco devices.
We can now do those kinds of things with a programmable data plane, with programmable languages like P4, but then in that step of compiling, it uses formal verification techniques developed by Dijkstra ‑‑ the OSPF guy. He and a few other folks have come up with all kinds of formal verification techniques that we have kind of pooh‑poohed in the industry for the past 30 years. It turns out they are applicable, you can use them and end up with some really useful guarantees.
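To make the "bounded" point concrete, here is a rough sketch of a parse, match, action pipeline on a toy two-byte header, small enough that a property can be checked for every possible input. The header layout, the table contents and the function names are invented for illustration, and the brute-force enumeration merely stands in for real formal verification; it is not the method used in the Cocoon paper.

```python
from itertools import product

# Toy match table: exact match on destination -> (action name, egress port).
# The destinations and ports are made up for this example.
TABLE = {1: ("forward", 10), 2: ("forward", 20)}
KNOWN_PORTS = {port for _, port in TABLE.values()}


def parse(packet: bytes) -> dict:
    """Parse a 2-byte toy header: 1 byte destination, 1 byte TTL."""
    return {"dst": packet[0], "ttl": packet[1]}


def apply_pipeline(packet: bytes):
    """Return (egress_port, new_ttl), or None if the packet is dropped."""
    hdr = parse(packet)
    if hdr["ttl"] <= 1:
        return None  # TTL would expire: drop
    action, port = TABLE.get(hdr["dst"], ("drop", None))
    if action == "drop":
        return None  # table miss: default action is drop
    return port, hdr["ttl"] - 1


def check_all_headers() -> bool:
    """Check every possible header: forwarded packets leave on a known
    port with a TTL of at least 1."""
    for dst, ttl in product(range(256), repeat=2):
        result = apply_pipeline(bytes([dst, ttl]))
        if result is None:
            continue
        port, new_ttl = result
        if port not in KNOWN_PORTS or new_ttl < 1:
            return False
    return True


if __name__ == "__main__":
    print(apply_pipeline(bytes([1, 64])))
    print(check_all_headers())
```

Real headers are far too wide to enumerate like this, which is where symbolic techniques come in, but the pipeline keeps the same small shape.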
I will end it there. If you are interested in P4, and I hope you are, I hope I have whetted your appetite. If you'd like to start understanding it, the P4 Developer Days have some great presentations to get a better understanding of what operators are doing in this space. If you are actually interested in writing some, the SIGCOMM tutorials are a bit of an investment in time but they are worth it. After eight hours, you'll walk out knowing more about P4 than you ever thought you could.
Andy Fingerhut, a Cisco guy, has a really, really interesting collection of P4 stuff. It's a wide variety. P4 is a changing, moving target. It's still being codified. And it would be interesting to have more than the regular hyperscale folks involved in it, so if you are a network operator and programming your network seems interesting and you have moved beyond RESTCONF and NETCONF and the control plane and you want to get deeper, get involved in the P4 association. Right now, it's just Google and Facebook and that's kind of boring.
Like any good nerd, I have a vanity domain, and like any good slacker, there is nothing on it, but I would implore you to bookmark it before the DIY social, I should have some interesting stuff on there.
And lastly, I regularly tweet about P4 if you are into that sort of thing, I am network service on Twitter. Thank you very much.
CHAIR: Thank you, Aaron, for that nice introduction on how we might all programme our networks in the future. So questions...
AUDIENCE SPEAKER: Peter Hessler, OpenBSD. Do you know of anyone that's using P4 in a production or a larger, interesting test environment?
AARON GLENN: I do, and I don't know how I managed to cull that slide. I basically redid all this at 3:00 in the morning. AT&T is doing some interesting stuff mostly centred around segment routing, if you are familiar with it. That's something I can throw up on the website. But they have a really nice end‑to‑end diagram of how they are using P4 literally from the CPE to all the neat Cloud network function virtualisation OpenStack stuff and everything in between. Bell Canada supposedly wrote their own MPLS from scratch in P4, from zero to Interop in six months. That was a bullet point on a slide I can't find any more and I haven't heard anyone talk about it since. But I can believe it. AT&T, big one. Bell Canada, talking about it. But there is apparently a lot more P4 that's going on under the hood that people aren't readily talking about just yet.
AUDIENCE SPEAKER: Jeff Tantsura, Nuage Networks. I'd argue that P4 is the wrong abstraction for most people here and that's actually ongoing work on how to generate a ‑‑ out of YANG models, which is much more applicable to networking as a service. P4 is very focused on ‑‑ it doesn't define how network ‑‑ looking into high level abstraction rather than trying to move bits is probably a much more applicable layer or level for ‑‑ and that's why we implore everyone to at least look at the Cocoon paper or the correct by construction paper because it does precisely that.
AUDIENCE SPEAKER: Which is still focussing on programming.
AARON GLENN: I agree wholeheartedly. You can make it your DOCSIS modem at home. The idea is generalising how we process packets, Ethernet packets, any kind of packets. You can do OTN frames, you can express them in P4. No one is quite sure why, but you could do that.
CHAIR: It seems we are out of questions. Thank you again.
And with that, I remind you to please, please, please do rate the talks, to make our work much easier and to give you an amazing Plenary. I will see you again at 4, where we will have all the amazing candidates for the next PC elections on stage. Enjoy your coffee.
LIVE CAPTIONING BY
MARY McKEON, RMR, CRR, CBC