Thursday 17 May 2018
MARTIN WINTER: Good morning everyone, so this is the Open Source Working Group. I am one of your chairs.
ONDREJ FILIP: I am the second co‑chair. Nice to meet you here.
MARTIN WINTER: I'll welcome you to the Working Group.
ONDREJ FILIP: So, we have an agenda; you can see it either in this presentation or on the website. We have a few presentations, and we even have time for two lightning talks, so I think the agenda will be interesting for you. We have two people in the room who will help us make this meeting happen. The first is Anand, who is taking minutes, and the second is Mass, who is monitoring Jabber, so for remote people, if you have questions, don't forget to use Jabber to raise them.
Before we start we have two important questions for you. First of all, do you have any additions to the agenda? I really don't expect much, but... thank you for that. And also, I need to ask you if you approve the minutes from the previous Working Group. The minutes were published a few weeks after the last RIPE meeting, so in time. We didn't get any comments, so here is your last chance to comment on the minutes. I don't see anybody running to the microphone. So, the minutes are approved.
And the last thing about ‑‑
MARTIN WINTER: I wanted to add something. I see a trend more and more, not just in this Working Group but everywhere else, that presentations submitted to the Working Group are getting later and later. There are a lot of openings there. We would love to take as many presentations as we can, and sometimes there is even extra time, but please submit earlier, because the programme for this Working Group should be published about one month before the meeting at the latest, so when you send it in the week before, that doesn't really help. This time, one month before, I think we had about one submission, and then at the end we could easily have filled two slots, no problem at all. It's happening more and more. So please, if you are considering it, we don't need final slides, but send us a note earlier and we can help you try to get them in.
ONDREJ FILIP: Absolutely. A month ago we were discussing whether we shouldn't cancel this Working Group, and a week ago we were discussing how to fit all the stuff in. And I think we managed well; we run five minutes over time, so hopefully we will make it.
MARTIN WINTER: We also had to deny a few really good presentations because I couldn't change everything again last week.
Let's just run through the agenda quickly. First we have a talk about RPKI, quite an interesting one; we have heard quite a few RPKI talks already at this RIPE. That's Tim from RIPE. I won't even attempt to say his last name, sorry, you can read it.
MARTIN WINTER: After that I'll give a quick talk about one year of FRRouting, what happened and where it's going. Then we have Alex on open only networking: what the state is, and how you can build a network with not just open software but open hardware combined.
Then we have Ondrej talking about his latest project, I think that's the third version of Turris.
And then we have two quick lightning talks: one about Kea, the Open Source DHCP, and what's going on there, and a discussion about blockchains, about using them to prove ownership of IP addresses and whether that is feasible.
Okay. Let's not delay it much further. So, let's start with the first talk, Tim.
TIM BRUIJNZEELS: Good morning everybody. Let me go quickly, because many topics today.
So I want to talk today about the RPKI validator that we built at the RIPE NCC. We already had one, but we built a new one. But first, a little recap, this might be known to everybody already, but very quickly.
What is this RPKI validator thing and what is this RPKI thing? First of all, in essence, the RPKI is a resource public key infrastructure where IP addresses and ASNs are tied to a public key, and it follows the hierarchy of the registry.
Then, the holder of the private key for that public key can make statements. The one that we always see, and talk about most, is ROAs. So, you can say things like: ASN X is authorised to announce my prefix Y, signed, holder of Y. It's very similar to the route object, but with a different authorisation model.
How you might use it in practice could look a little bit like this. You have a validator that gets the RPKI objects and validates them, and then there are a couple of options for how you might use the results. There is an API on our validator, and on the other ones as well, that you can use from your own scripts, in similar ways as route objects. There is also the RPKI-to-Router protocol, which allows you to do things like this. Rather than having long static lists configured in your router, you can do ‑‑ well, this is sort of a config, but the gist is that you can make statements like: if something is valid, do something; if something is not found, maybe give it a lower preference; or if it's invalid, then drop it, which is actually recommended best practice, because if you just give it a lower local preference, then more specific invalid announcements would still make it through.
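The valid / not found / invalid decision described here can be sketched roughly as follows. This is a minimal illustration of route origin validation and not the validator's or any router's code: the `Roa` tuple, the `origin_state` and `policy` functions, and the documentation prefixes and AS numbers are all made up for the example.

```python
from ipaddress import ip_network
from typing import List, NamedTuple


class Roa(NamedTuple):
    """Hypothetical, simplified view of one validated ROA entry."""
    prefix: str       # e.g. "192.0.2.0/24"
    max_length: int   # longest prefix length the ROA authorises
    asn: int          # origin AS authorised by the prefix holder


def origin_state(prefix: str, origin_asn: int, roas: List[Roa]) -> str:
    """Return "valid", "invalid" or "not-found" for one announcement."""
    announced = ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # Only a ROA of the same address family that covers the
        # announced prefix says anything about this announcement.
        if announced.version != roa_net.version:
            continue
        if not announced.subnet_of(roa_net):
            continue
        covered = True
        if roa.asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered by at least one ROA but matched by none of them: invalid.
    return "invalid" if covered else "not-found"


def policy(state: str) -> str:
    """Illustrative router policy following the advice in the talk."""
    return {
        "valid": "accept",
        "not-found": "accept with lower preference",
        "invalid": "drop",
    }[state]


roas = [Roa("192.0.2.0/24", 24, 64500)]
print(origin_state("192.0.2.0/24", 64500, roas))     # valid
print(origin_state("192.0.2.0/25", 64500, roas))     # invalid (too specific)
print(origin_state("198.51.100.0/24", 64500, roas))  # not-found
```

The second call shows why dropping invalids matters: a more specific announcement of a covered prefix is invalid, and if it were only given a lower preference it would still win on longest-prefix match.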
This may also be an enabler to do more origin validation, not just for your own customers, let's say, but in general.
Now, why do we do a validator number 3? Version 2.24 is the most recent stable one. It's used quite widely by people who are doing RPKI validation, but we have some issues with it. It uses quite a lot of memory, because we keep everything in memory, including all the objects, certificates, etc. Also, the code was getting a bit old and hard to maintain. We made a choice back in the day to use Scala, which I believe is still a nice programming language, but to be honest it's kind of hard to find people who can code in it these days. So, that's a bit of an issue.
Also, there were some missing features. For example, for the RPKI-RTR protocol that I mentioned, it wasn't able to do the updates that are actually part of the protocol. We would always say: no, router, just do a complete resync, which was of course inefficient.
So, we proposed then to create a new version. My former colleague, Alex Band, presented on that at RIPE 72 in Copenhagen and we got mandate from the community to go ahead and make a version 3. So that's where we are.
Design goals are stability, maintainability and redundancy; we want to reduce the memory footprint somewhat, and we want to consider deployments and updates.
So, the implementation. Yes, it's in Java, and I think that is the language that a lot of people love to hate. Why use it? It's what we have expertise in. Also, we're building this on top of the RPKI commons library that we have in Java. So we have that knowledge and know-how, and ‑‑ well, I think the quality of the software is helped by knowing the language, let's say.
Also, there is an argument to be made for genetic diversity: not all implementations are using the same code base underneath. There are other options out there that you can use, of course. There is rcynic; if you Google for the links I think you can find them. This is based on Python and OpenSSL. And there is a tool called RPSTIR, now taken over by ZDNS. That's written in C, and it's a completely separate implementation for all the crypto stuff. So that's the ecosystem, let's say.
The previous version was also Open Source with a BSD licence and we have a release candidate available now.
For deployments we have built RPMs, because we use those internally and it's something we can test. I am playing with a Docker container, but I would really welcome feedback on how to do that better from anybody who knows such things. We have a generic build available as well.
So on the GitHub page there is a wiki with lots of documentation and pointers to where you can download this.
Feedback is very much welcome. So, yeah, I mentioned the wiki. Please try it. Let us know, create issues; some of you already have. I am looking at Job here. Bear in mind that we are aiming for stability over lots of features. But of course we are committed to fixing issues if they come up, and I think what would be helpful is help with documentation, examples, ways to deploy this, such things.
As for the actual production release: if we find issues during this release candidate phase, then of course we'll have to evaluate whether we need to deal with them before making a production release, but probably yes. But if there are no big things pending, we would do a production release on the 4th of June.
A bit about the architecture. This is similar to the previous page. Where the old validator had an embedded RTR server, we have now extracted that into a separate component. The reason for doing that is that not everybody wants to run one, but if you do, you may want to run more than one, because these instances stay available, for example, when a validator gets updated or is unavailable for whatever reason.
Then, quickly, the features. It's very similar to before. The design is also very similar. You can have a list of trust anchors. By default the ARIN trust anchor is not included, but you can add it. I have just created a script that lets you do that a little bit more easily, because, as you will notice, there is no upload trust anchor here, but there is a script that allows you to add any arbitrary trust anchor.
Now, another feature that may not be very visible functionally, but that we think is quite important, is that we made a strict separation between object validation and object retrieval. So, we can deal with the repositories asynchronously, which means that if a repository is unavailable, it doesn't affect the rest of the RPKI tree. That's important if you have delegated CAs with their own publication points that may not always be available. It also supports the new delta protocol, which is HTTPS-based rather than rsync; if it finds that that is there, it will prefer it. Validated ROAs is pretty much the same as before. The export functions here are exactly the same as the old validator had, so if you were using that and doing exports and processing them, you should be able to just use this one instead.
And if not, then raise an issue.
Ignore filters. This is something that the previous version also had. The one difference here is that you can add comments as well for readability. So you can remind yourself why you put this in. The idea here is that you know, if for whatever reason you think there is an issue with configuration in the RPKI, you can make an exception. You can say for this specific address space I want to ignore the RPKI. And in this case I put in an example for the RIPE NCC prefix actually. And I'll come back to that in the following slides.
So the next thing you can do is add your own white lists. You can add additional things, and we have actually talked to people who are using the validator in the field. I'm not sure that I can name them here, but we did find that they use the white list feature quite a lot. They are a content provider, but they do origin filtering to keep out suspicious traffic, and sometimes they find that they need to add white list entries to temporarily allow things.
Again, you can have a comment there as well. This is also consistent with ‑‑ well, the thing at the bottom, called SLURM. It's a bit of a funny acronym; what it really means is local exceptions. So you can have your local filters and white list entries for ROAs, as well as BGPsec router keys if you want to play with that. We don't have a UI for that; you can use a SLURM file. The Internet draft on that is, I forget the exact state in IETF land, but it's very near to completion. So I want to put the resources into that, because that is what is supported here, and it's also supported by the RPSTIR validator. So you can use that to export and import your settings.
Now, BGP preview is a feature that we also had in the previous validators. This takes the aggregated RIS route collector dumps, and anything seen by more than 5 peers is shown in a table. You can search to narrow your scope, which is what I have done here. In this case it shows a prefix for RIPE NCC and AS3333. That's an announcement that we do, and it now says invalid ASN. If you look at why, you get an analysis. At the bottom there you see there was a ROA for it that would have made it valid, but we chose to filter it out, and we put in a white list entry, except that's for a completely different ASN. So in our local setting here, this announcement shows up as invalid because of that.
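The interaction between an ignore filter and a white list entry in this example can be sketched as below. This is a rough illustration under stated assumptions, not the validator's implementation: the `Vrp` tuple and both functions are hypothetical, the prefixes and AS numbers are documentation values rather than the ones on the slide, and the filter is assumed to drop every validated ROA whose prefix overlaps the filter prefix.

```python
from ipaddress import ip_network
from typing import Iterable, List, NamedTuple


class Vrp(NamedTuple):
    """Hypothetical validated ROA payload: prefix, max length, origin AS."""
    prefix: str
    max_length: int
    asn: int


def apply_local_exceptions(vrps: Iterable[Vrp],
                           ignore_filters: Iterable[str],
                           whitelist: Iterable[Vrp]) -> List[Vrp]:
    """Drop VRPs that overlap an ignore filter, then add white list entries."""
    filters = [ip_network(p) for p in ignore_filters]
    kept = []
    for vrp in vrps:
        net = ip_network(vrp.prefix)
        # Assumption: a filter removes any VRP whose prefix overlaps it.
        if any(net.version == f.version and net.overlaps(f) for f in filters):
            continue
        kept.append(vrp)
    return kept + list(whitelist)


def origin_state(prefix: str, origin_asn: int, vrps: Iterable[Vrp]) -> str:
    """Classify an announcement against a set of VRPs."""
    announced = ip_network(prefix)
    covering = [v for v in vrps
                if ip_network(v.prefix).version == announced.version
                and announced.subnet_of(ip_network(v.prefix))]
    if not covering:
        return "not-found"
    matched = any(v.asn == origin_asn and announced.prefixlen <= v.max_length
                  for v in covering)
    return "valid" if matched else "invalid"


# What the RPKI says: the holder authorised AS64500 for this prefix.
from_rpki = [Vrp("192.0.2.0/24", 24, 64500)]

# Local settings: ignore the RPKI for the prefix, white list another AS.
effective = apply_local_exceptions(
    from_rpki,
    ignore_filters=["192.0.2.0/24"],
    whitelist=[Vrp("192.0.2.0/24", 24, 64501)],
)

print(origin_state("192.0.2.0/24", 64500, from_rpki))  # valid
print(origin_state("192.0.2.0/24", 64500, effective))  # invalid
```

With only the ignore filter and no white list entry, the same announcement would come out as not-found; it is the white list entry for the other AS that turns it into invalid.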
That's pretty much the big picture. So, with that, I want to end and open the floor to questions, comments and...
ONDREJ FILIP: Are there any questions for Tim? Okay. I don't see anybody, so thank you very much Tim for your presentation.
And the next presenter will be Martin. He would like to talk about one year of development of FRRouting, which is kind of a fork of Quagga, but I think he will say it much better than me. So, Martin.
MARTIN WINTER: So, one year ago I was at RIPE too, and I talked a bit about a new fork. For the ones who missed it, at RIPE 74 I gave a quick talk and explained what FRR is. For the ones who still don't know ‑‑ is there anyone in here who doesn't know FRR? I don't see any hands. That's great. So anyway, it's a routing stack. We basically wanted to fix the community in a way that lets us do fast development, more open, the whole thing. And how we did it: with more automated testing, and we moved to a GitHub model with pull requests instead of sending patches.
The key thing I wanted to show is what happened since then.
First of all, the contribution level went up. If you look at the graph at the top, you see at the bottom, in grey, the total number of Git commits that were in Quagga at the time when we started the fork. We spent quite a bit of time, more than half a year, before we went public, adding a lot of things and fixing stuff, so that's the blue level, where we made it public; we nearly doubled it before we went public. So that was from about 3,800 commits in Quagga, where we took the base, to about 6,800 when we went public. And now, over the year, it grew to about 12,000 commits in Git.
So we are very happy with that. You also see that over time we processed a lot of pull requests. The only way to do them that fast is obviously with a CI system, automating as much as we can. We are a very active community. I would say at any time, across all the different companies, there are probably at least 20 people, probably more like 30 people, working full‑time on FRRouting.
Then we came out with multiple releases. If you were here a year ago, you may remember we had just come out with FRR 2.0, where we added some interesting features. We released FRR 4.0 in, I think, the beginning of April ‑‑ no, March, it was in March when we released that one. And we are actually working on finalising the 5.0 version. So a lot of things went in.
A quick overview. I talked about that before. That's just when we came out; that was the 2.0 version. Some of the interesting highlights: we added LDP, the whole MPLS part. I won't talk too much about that.
Then we came out with the 3.0 version in October last year. It had a lot of things. We added a label manager, which was interesting. The Next Hop Resolution Protocol. A lot more LDP enhancements to make it useful. There was more on the PIM part too, and a lot more BGP features.
In the 4.0 version, which we just came out with in March, we added RPKI, so for the ones who want to play with it, it's all in there. If you are downloading the packages which we have on GitHub, I have to say sorry, RPKI is not in there, mainly because I couldn't really figure out how to verify and test it well enough, and I'm still working on figuring out the best way, so I can actually tell people it works. But if you rebuild the packages you can add it; if you build manually you can turn it on. And the latest I heard, just a few days back, is that some people say it should theoretically work.
We added Babel too, and I know everyone here was waiting for EIGRP, so we added that too. At least one person is really excited. On OSPFv2 we added some experimental segment routing, which might be of interest to some of you. We started doing a lot with VRFs. On the BGP side we did a lot of EVPN over that time, so if you want to do EVPN, type 3, type 4 and type 5, I believe, are more or less in there, but you can go and look at the details.
We are working a lot on the CLI to improve that. I know some people love the Cisco-like CLI, but not everyone, especially in automation, so we are working heavily on having a real API. There is more and more JSON output; most outputs have JSON output, and we are also looking at doing inputs in a better way. And then 5.0, which is currently in the testing phase. We have some more PIM stuff added; in PIM we have SM and SSM implemented. For IS-IS, we are working on the OpenFabric standard, which is not yet in 5.0. For BGP flowspec we have some initial work; I believe it's implemented to a usable degree at this time, but it's still missing a lot of the CLI part, so it's partially implemented.
We also added network namespaces for VRF. If you want to use VRF on Linux you need quite a new kernel, I believe about 4.15 or very close to that version, and if you don't have that, the VRF system on the Linux kernel side doesn't work. So we support the alternative, that you can use network namespaces instead.
We also did the policy routing daemon, so policy-based routing, which is a new daemon that does the policy routing stuff.
Packages, if you want to try it out. We have a bunch of packages which we support ourselves, which we test and put up on GitHub. So you find a bunch of different Ubuntu packages; this is for 4.0. We are hoping to add a bit more for 5.0. Mainly Ubuntu. Basically everything we build is 64‑bit, some of them are 32‑bit Intel, and a few other things.
In addition to these packages there is FreeBSD; an official port exists out there. If you are running Alpine Linux, Gentoo Linux or OpenWrt, there are at least a bunch of instructions on how to build it, and they are in various stages of getting integrated too. There is also the Snap package, if you are going through the store, mostly on Ubuntu or other ones that support those packages.
And that's mostly it. I tried to keep it short as we have a very full agenda. If you are not yet using FRRouting and you are looking for something, join us. We have a very good success rate. I hear about a lot of commercial vendors moving over, so there is a lot of work there. Unfortunately we don't always know all of them, but yes, a lot of white box vendors and various other ones are going to it. Find us on GitHub, send a pull request if you have any changes; we are very open, and we try to process them as fast as possible. There are mailing lists. The Slack channel unfortunately requires an invite; ping one of us and we can help you. Okay.
ONDREJ FILIP: Are there any questions?
AUDIENCE SPEAKER: Charles Eckel, I work at Cisco. And thanks for this presentation. Just a question about the API and having something other than the CLI. Have you thought about, or like looked at using YANG models and going with that approach.
MARTIN WINTER: Yes, basically. The work we're doing is on that, and if you are interested, we actually have tomorrow ‑‑ no, Friday ‑‑ well, that is tomorrow ‑‑ a community meeting where we talk about this. We had some discussion about different approaches; we have one which is basically a YANG ConfD example. We haven't made a clear decision on which model to go with. I think that model is probably a really good choice. But, yeah... if you want, ping me afterwards; it will be at 5 p.m. local time here.
AUDIENCE SPEAKER: Job Snijders. Can you go back one slide. This is not really a BGP daemon specific question. But I see that you are supporting Ubuntu 12 ‑‑
MARTIN WINTER: Yes. Ubuntu 12 reached end of life quite some generations back. I'm not sure how many years back, and I have wanted to drop it so many times, but it's not that much pain. I think the most painful one is CentOS 6, which doesn't support most of the packages we need. If you look at the build instructions, it's an absolute pain to build on CentOS 6; you have to rebuild most of it. It starts with how to make, how to recompile things, because none of the repos have anything. That's the worst one; I would love to drop it. Unfortunately that seems to be in demand. For Ubuntu 12.04, I had a few people asking me to keep it around because they have some extended support. But I plan to drop it: in the next version I am adding 18.04, and the next time I have a problem building it I'll probably start dropping it. I'll keep it as long as I don't have extra pain. If you are seeing something missing or want to help out...
AUDIENCE SPEAKER: Filippe, NETASSIST. The question is about routing tables again. I see you have VRF support nearly implemented, and the first question is how much progress you have made there. The other thing is about supporting multiple, even virtual, routing tables like BIRD does; it's a really useful feature for traffic engineering, management and so on. And the last one is an API for statistics. Maybe a YANG model would be a good choice for that. Just another good acronym and good technology. Anyway: statistics and the routing tables.
MARTIN WINTER: If I start from the back: statistics. Ping me afterwards with exactly what statistics you are looking for. For me that's just a bit more of an output part, unless you are looking for something realtime. I assume you are not talking about SNMP; I don't think anyone still uses that. So that would be some of the JSON output; depending on what you need, I would love to hear it. I haven't heard that many requests in that area, though.
The routing tables: you have something with the VRF, and if you just look at BGP, there are old features in FRR which came from Quagga, the multi-instance BGP stuff too. There are also ways to run multiple BGP daemons, so depending on what you are doing, it may be supported, and there are a lot of different ways to do that.
ONDREJ FILIP: Okay. I think that was the last question. So thank you very much Martin.
And next presentation will be from Alex and it's called open only networking.
ALEX SAROYAN: Hi. Thanks for coming. I work for XCloud Networks; we do a lifecycle management platform for open networking. I am here today to share some experience of working with our customers and from their networks.
So, first of all, I want to speak a little bit about the requirements. I believe that's necessary to better understand what kind of problem we solve.
This is not a real network, because we can't disclose details of our customers, but it is an example network which is pretty similar to the typical networks we deal with daily.
So, the requirements: you want to start with five sites in different places on the planet, then you want to go to more sites, so you want this thing to be scaleable and the sites to be easy to deploy. Keeping it low cost, you want to encrypt site-to-site traffic, preferably in a full mesh manner, so the encrypted traffic goes the shortest path. They want user traffic to reach the closest site, which is very reasonable these days, keeping the content as close to the customer as possible.
So, they want to start small, like from two racks. The idea is: why buy a big chassis when the requirement is to start with just a couple of racks? And when they add more racks, they don't want to throw away the equipment which they acquired in the beginning.
So, they want to do some load balancing of the traffic. They want to allow only wanted traffic, filtering unwanted traffic at the point where it enters the network. They want to be able to connect to the network securely, and they want the whole thing to be scaleable, but scaleable gradually, without buying big equipment from the beginning.
So basically, they want a cost-effective, agile and highly scaleable network.
So, what we do is the Clos fabric. I think I am at least the third presenter during this RIPE meeting who speaks about the architecture proposed by Charles Clos in 1952, which is very relevant these days; we are coming back to it.
So, the good thing with this fabric is that, on the one hand, it can scale to the scale of this network, but on the other hand it's also good for small networks. You start with just two racks, which is very easy without overkill equipment: you just start with small 48‑port switches. Then you scale to more racks by just adding racks with two switches on top of the rack. If you need more capacity between any two racks, you basically add more spine switches, and with this technique you can always add more switches to finally get a full line‑rate fabric. Then you can scale even more, and again adding more switches helps.
So again basic idea here is that there is no need to buy a big and expensive chassis. You just basically pay for the network based on your growth, so based on your network size.
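The "add more spine switches when you need more capacity" step is simple arithmetic, sketched below. Only the 48-port leaf switches and the two switches per rack come from the talk; the port speeds, the 32-port spine and the assumption of one uplink from every leaf to every spine are illustrative choices, and the function names are made up.

```python
import math


def oversubscription(server_ports: int, server_gbps: float,
                     spines: int, uplink_gbps: float) -> float:
    """Server-facing capacity divided by spine-facing capacity on one leaf.

    Assumes each leaf has exactly one uplink to every spine.
    A value of 1.0 or less means the leaf can run at full line rate.
    """
    return (server_ports * server_gbps) / (spines * uplink_gbps)


def spines_for_line_rate(server_ports: int, server_gbps: float,
                         uplink_gbps: float) -> int:
    """Smallest number of spines that removes oversubscription."""
    return math.ceil(server_ports * server_gbps / uplink_gbps)


def max_racks(spine_ports: int, leaves_per_rack: int = 2) -> int:
    """Racks a single spine tier can serve: each leaf uses one spine port."""
    return spine_ports // leaves_per_rack


# Assumed example: 48 x 10G server ports per leaf, 100G uplinks.
for spines in (2, 4, 5):
    ratio = oversubscription(48, 10, spines, 100)
    print(f"{spines} spines -> {ratio:.2f}:1")

print(spines_for_line_rate(48, 10, 100))  # 5
print(max_racks(32))                      # 16
```

Under these assumed numbers a two-spine fabric is 2.4:1 oversubscribed, and each additional spine adds capacity between every pair of racks without replacing any existing switch, which is the pay-as-you-grow point being made.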
There are a lot of hardware vendors out there, and we believe in diversity. We believe that diversity of vendors reduces the cost of the equipment, on the one hand, and it's always good to have a big choice. Most of these vendors produce hardware based on Open Compute Project standards, designs contributed by Facebook and other participants to the Open Compute Project, which means that there are a lot of switch models made by different vendors which are almost the same switch. So, just an example.
You have a switch from, I don't know, Dell; then you want to upgrade or replace it and you have a switch from Edgecore in your stock but not Dell; you can replace it because it's basically the same switch inside, the same architecture. And even the big switches with 128 or 256 ports of 100G, which Facebook uses, are available out there.
So a lot of different useful options.
Network operating systems. As you probably know, the switches come without software. It's just the box; you need some operating system to run on it. There are more operating systems than on this slide; the slide shows just the operating systems which we have some experience working with.
We worked a lot with Cumulus, and it's very production ready. We have a lot of experience running it in production.
We are doing some stuff with IP Infusion, which again works very well. And we are just starting with ONL and SwitchDev. The first two, Cumulus and IP Infusion, are commercial operating systems, so basically you pay to use them. ONL is totally free; it's supported by an open community. And SwitchDev is supported a lot by Mellanox. SwitchDev is not really a full operating system; it's essentially a driver which comes with the Linux kernel, so you can basically build your own operating system or port your operating system, and you can fine-tune the switch for your particular needs.
So coming back to the fabric, we do BGP unnumbered between switches. We run FRR on the switches as the routing daemon. We don't do spanning tree, of course, because the whole fabric is talking BGP. We are big fans of Layer3 fabrics, so we do Layer3 up to the host, and in this case it's very easy to manage. We don't use bad stuff like spanning tree, and with BGP unnumbered the configuration is very, very easy: you just say that I want to run BGP on this port, this is the remote ASN, and it just runs. You don't need to take care of any link IP addresses, because it uses IPv6 link-local addresses to bring the adjacency up, and IPv6 link-locals are just there.
With this fabric there is no link between the two top-of-rack switches, because there is no need to synchronise any kind of state between the switches: there is no multi-chassis LAG, no EtherChannel or anything like that. There are just two links from the server to the switches, and these links are running BGP and doing ECMP load balancing.
And again, because there is no link between the switches and no replication of any kind, the two switches can be from different vendors, be different models, and they can even run different versions of the operating system or even different operating systems. So it's another added layer of diversity.
We use collectd, Graphite and Carbon for collecting statistics from the switches. collectd allows us to develop custom plug‑ins, which we do to collect the statistics we are interested in, and for different network operating systems we actually use different ways to collect statistics.
We are kind of against Layer2 shared segments, but for the rare cases when they are still required, there is VXLAN with EVPN signalling. You basically say that you want this port, this port and this port to form a segment, and EVPN over BGP will signal the information about your MAC addresses, and again the signalling is very scaleable.
So, regarding ACL enforcement. These are the main vendors of switching silicon, and they can do at least 3,000 ACLs per switch safely. They can do more, actually, but 3,000 ACLs is what we consider safe, because with more ACLs there can theoretically be some traffic disruption at the moment of applying the ACLs. 3,000 is safe. Mellanox can do more, but we are still experimenting and working with them to be able to do more ACLs.
So, filtering in the switch TCAM: 3,000 is not a very big number, but this happens in hardware, so filtering this way doesn't consume any additional resources and is free of charge; this is a zero-dollar thing.
So in some setups this is sufficient, and people don't really need to buy any extra equipment.
With the Layer3 fabric, because ECMP is there, we do load balancing by leveraging ECMP. Every server advertises one Unicast IP address, which is of course unique to that server, and another IP address, called Anycast, which is the same for every server. So the traffic entering the fabric will be spread by ECMP towards the hosts. The missing thing here is health checks, because there can be situations when BGP is up, the ports are up, everything looks fine, but the application is misbehaving on one of the servers. We should not forget that there are CPU and memory on the switches, and these switches are just running Linux; you can code something, put it on the switch, and it will just work. So what we do is run health checks, checking TCP or HTTP or anything you need towards every server. In case the application is misbehaving, we talk to FRR and tell it to remove this particular Anycast IP address from this particular neighbour; that generates BGP updates, the network converges, and there is no more traffic travelling towards this particular host.
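The health-check loop described here could look roughly like the sketch below. It is only an outline under stated assumptions: the talk does not say how the checker tells FRR to stop accepting the Anycast address from a neighbour, so that step is represented by two hypothetical callbacks, `withdraw` and `announce`, and the server names and addresses are invented.

```python
import socket
from typing import Callable, Dict, Set, Tuple


def tcp_alive(host: str, port: int, timeout: float = 1.0) -> bool:
    """Plain TCP health check: can a connection be opened in time?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def reconcile(servers: Dict[str, Tuple[str, int]],
              withdrawn: Set[str],
              check: Callable[[str, int], bool],
              withdraw: Callable[[str], None],
              announce: Callable[[str], None]) -> Set[str]:
    """Run one round of checks and update which servers are withdrawn.

    `withdraw` and `announce` are placeholders for whatever mechanism
    removes or restores the Anycast route for one neighbour in FRR.
    """
    for name, (host, port) in servers.items():
        healthy = check(host, port)
        if not healthy and name not in withdrawn:
            withdraw(name)          # application is down: stop sending traffic
            withdrawn.add(name)
        elif healthy and name in withdrawn:
            announce(name)          # application recovered: send traffic again
            withdrawn.discard(name)
    return withdrawn


if __name__ == "__main__":
    # Hypothetical servers behind one top-of-rack switch.
    servers = {"web1": ("192.0.2.11", 80), "web2": ("192.0.2.12", 80)}
    state: Set[str] = set()
    reconcile(servers, state, tcp_alive,
              withdraw=lambda n: print("withdraw anycast for", n),
              announce=lambda n: print("restore anycast for", n))
```

Keeping the set of withdrawn servers means FRR is only touched on a state change, so a server that stays unhealthy does not trigger a new BGP update on every check.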
And this thing can scale to this scale, or even bigger, and again this is just zero cost, it's utilising resources of the switch and the switch is something you will have in any case on top of your rack.
Another question is, like, okay, ECMP is good, but what if I want to do application layer load balancing and I still don't want to ‑‑ it would be good for me not to buy a very expensive load balancer. So, we do something with HAProxy: we install two HAProxy machines, we connect them to the fabric, we run the same routing to the host, and we do the same ECMP to organise redundancy or load balancing from the Internet towards the HAProxy servers. Then TCP, HTTP or HTTPS sessions are terminated on HAProxy, and HAProxy makes the connection through the fabric towards the servers. Again, just zero cost, and useful for many, many networks.
So, connection towards the Internet for these networks. This is not the most scalable solution, but it's efficient enough ‑‑ there are a lot of networks which are serving millions of customers but don't actually generate a lot of traffic. They generate 10 gigs or 20 gigs of traffic, and it's possible to serve that without spending a big amount of money. So, what we do:
It's a switch, an open networking switch running FRRouting. They connect their peers to the switch, and in this case we separate the peers into two types: peers which advertise the full table, and peers which advertise just a small table, like hundreds of prefixes or a few thousand prefixes. The thing is that these switches are based on kind of cheap TCAM; they are limited in TCAM because it's a switch, and it's fine for a switch to be limited. So this TCAM can do something like 150 thousand prefixes, but not 600, not 700. So we connect a Linux machine to the switch; it's essentially a router, almost like a router on a stick. We create a Layer 2 bridge for the neighbours which advertise the full view, the blue ones; the black ones advertise just a small number of prefixes. So we create bridges towards the Linux server, which is running FRR. It's using this Chelsio NIC. I'm not saying it is not possible with other NICs; we just have experience with this NIC and we are sure about it. We are trying other NICs, but the experiments are not finished; I will update when we have more experience. So, the full table terminates here, and this thing generates a default route towards the switch. But if you have some peerings with a low number of prefixes, we receive those prefixes directly on the switch, so traffic from the internal network which should be forwarded towards peers with a low number of prefixes goes out here directly, at the full speed of the port. The rest of the traffic goes to the Linux machine and is routed there. So that part of the traffic is limited to about 40 gigs, and we were able to get these 40 gig speeds ‑‑ well, it works very well. And again, it's just a super low cost solution for networks which need to keep the full table but whose traffic is not very big, like 20 gig, 30 gig or something.
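The split between peers terminated on the switch and peers bridged to the Linux router can be illustrated with a small planning function. This is only a sketch under stated assumptions: the function name `place_peers`, the peer names and the prefix counts are all hypothetical, and the talk does not say the placement was computed this way rather than chosen by hand.

```python
from typing import Dict, List, Tuple


def place_peers(
    peer_prefixes: Dict[str, int], tcam_budget: int
) -> Tuple[List[str], List[str]]:
    """Split BGP peers between the switch and the Linux router.

    peer_prefixes -- peer name -> number of prefixes it advertises
    tcam_budget   -- how many routes the switch hardware table can hold
    Peers are considered smallest table first; a peer stays on the switch
    only while the running total still fits. Everything else is bridged
    to the router, which sends the switch a single default route.
    """
    on_switch: List[str] = []
    on_router: List[str] = []
    used = 1  # one entry reserved for the default route from the router
    for peer, count in sorted(peer_prefixes.items(), key=lambda kv: (kv[1], kv[0])):
        if used + count <= tcam_budget:
            on_switch.append(peer)
            used += count
        else:
            on_router.append(peer)
    return on_switch, on_router
```

With a budget of 150,000 routes, a full-table transit peer always lands on the router, while small peerings stay on the switch and are forwarded in hardware at port speed.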
There is a big problem with router on a stick: if the port goes down here, the thing here will not be aware. But we solved this, because these two systems are just Linux systems and we can do almost everything. So, we listen to the kernel message bus here, and when a port goes down we generate a message towards the Linux machine, which talks to FRR and resets the BGP session. So we actually propagate the link state, solving this classical problem of router on a stick.
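The decision logic behind this link-state propagation can be sketched in a few lines. The speaker's implementation is a small C program listening on the kernel message bus; the Python below only illustrates the idea, with a hypothetical function name, port names and neighbour addresses, and it leaves out both the kernel listener and the call into FRR.

```python
from typing import Dict, Iterable, List, Tuple


def sessions_to_reset(
    events: Iterable[Tuple[str, bool]],
    port_to_neighbor: Dict[str, str],
    last_state: Dict[str, bool],
) -> List[str]:
    """Return the BGP neighbours whose switch port has just gone down.

    events           -- (port name, is_up) pairs, in the order they arrived
    port_to_neighbor -- switch port -> BGP neighbour reached through it
    last_state       -- remembered link state per port, updated in place
    Only an up-to-down transition triggers a reset, so repeated "down"
    notifications for the same port are ignored.
    """
    resets: List[str] = []
    for port, is_up in events:
        was_up = last_state.get(port, True)  # unknown ports are assumed up
        last_state[port] = is_up
        if was_up and not is_up and port in port_to_neighbor:
            resets.append(port_to_neighbor[port])
    return resets
```

The agent on the router side would then reset each returned session, so routes learned from that peer are removed immediately instead of waiting for the BGP hold timer to expire.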
Similarly, they use OpenVPN for encrypting traffic between hosts. Again, if this is configured automatically, it's super scalable: when you create scripts which configure OpenVPN on your hundreds of sites automatically, you can generate the site‑to‑site links easily. And again, OpenVPN can be free of charge or you can pay for support; different options are available.
So... coming back to the initial requirements. They wanted cost effective, agile and highly scalable. It's cost effective, up to 30 times, because it is based on commodity hardware, which is much cheaper compared with traditional hardware because of diversity, and it's also cheaper because of the combination of commodity hardware and Open Source software, which is free in many situations.
It's agile because it's Linux. You can easily automate everything. And it's highly scalable because this approach is based on similar principles to those of hyperscalers like Facebook and Google, so it's a good example that this thing can grow to that scale.
MARTIN WINTER: Okay. Any questions?
AUDIENCE SPEAKER: Hi. Baptiste Jonglez, from University Grenoble Alpes. Something I didn't quite understand. Do you run FRRouting on the switches or on the servers?
ALEX SAROYAN: Everywhere, actually. We love FRR. We contribute to FRR. It is great. There is a serious reason why FRR: as I said, we do BGP unnumbered, which is super easy to configure because we don't need to take care of link addresses and so on. And it's Open Source: if we are not happy with something, or something is missing, we just go, we add it in the source code and we submit it on GitHub.
AUDIENCE SPEAKER: And another question. So you run Layer3 to the servers, so each server is on its own Layer2 network basically?
ALEX SAROYAN: Yeah, exactly. We configure the IP address on the loopback of the server and we form a BGP adjacency on each layer of the fabric.
AUDIENCE SPEAKER: Filippe, NETASSIST. Great to see such a good configuration and usage of Open Source tools. I have two questions. The first one is about link state propagation, which is worth discussing here: could you give a little more detail about it? And the second one is how it manages offloading of the forwarding table from the Linux kernel to the silicon.
ALEX SAROYAN: Well, so, regarding the first question. I think the problem which we solve is obvious: we just want to make sure the router on the stick is aware that the port is down, and we don't want to wait for the BGP timers to expire. There are many ways to do this. How we did it is as follows: there is a small piece of code, written in C because we wanted it to be really fast, which is listening on the kernel message bus. If it detects a failure, it generates a message through a client‑server application whose other party is listening on the router on the stick side. The part which is running on the router on the stick basically talks to FRR to reset the BGP session. The BGP session is reset, so routes are removed from the routing table, and, yeah, it tries to establish the BGP adjacency once again, but it fails because there are no answers to the hellos, until the port is back and it comes back again. So that's for the first question.
And the second question, I'm not sure if I got it correctly. So the question is how the switch is offloading the routing to the ASIC, was that the question?
AUDIENCE SPEAKER: The CPU offloads to the ASIC.
ALEX SAROYAN: So, the ASIC on the switch. Actually, that's part of the network operating system. For example, with Cumulus there is a daemon called switchd, which has an implementation of the Broadcom or Mellanox ASIC API, so it essentially looks at the kernel's table and generates the appropriate microcode, which gets pushed to the ASIC through the API.
AUDIENCE SPEAKER: So not mainline.
ALEX SAROYAN: Well, look ‑‑ Broadcom's API is not very open. You have to have a contract with Broadcom, some NDA with Broadcom, and after that you get ‑‑ so it's quite doable. Once you have that, you receive access to the source code, you get access to the API and then you can develop your own. But, yeah, I wish it were a little bit more open. That is not the case with Mellanox; Mellanox is much more open, and I showed this example with switchdev, so it's fully Open Source and you can go and fix it if you are not happy with something.
MARTIN WINTER: We're kind of out of time. So...
ALEX SAROYAN: Thank you very much. Welcome to RIPE dinner. I hope to see you there and I am around, and would be happy to talk to you if there are any questions.
MARTIN WINTER: The newest fun toy, Ondrej Filip is going to talk about it.
ONDREJ FILIP: It's quite funny to continue after Alex because he was talking about very high speeds and I have really tiny ones just for the home purpose.
I think we are running a little bit out of time, so I will try to convert my presentations more into lightning talks, just to save some time for the others.
So what is Turris, and what is project Turris? I think it was presented here like three times already, so I assume that most of you know the project, but just in case: it's a project that started with the idea that we would create some probes that would detect anomalies in networks, we would give them to people, and we would collect some data and do some security research. That was the original idea. And because we didn't know how to get those devices into people's homes, we decided to make a router, which was given to them for free or for some small price. Those people installed the routers in their homes, we were collecting data, and we had quite a lot of information.
In the first batch we created two generations of routers, called Turris 1 and Turris 1.1. They were mainly in the Czech Republic, which was a limit of this project. And it was pretty fun. Then many people came to us and wanted to have this router, because on the one hand it was a security probe, but it was also a very powerful router with a lot of Open Source software. So we created another router called Turris Omnia. It was an extremely successful crowdfunding campaign, and the project continued. Everything that I will present is either Open Source software or Open Source hardware. We publish everything from the software and also from the hardware.
So, here you have pictures of those previous generations. That was the first one; the second one is almost the same, but it has a USB 3 port. Then this is the version which is available for everybody, you can buy it in the shops if you wish; it's called Turris Omnia, and here is the detail of the board.
So, a little bit about Turris Omnia. It's Open Source hardware and software. We have our own operating system, but it's just an improved OpenWrt. We are now trying to submit as many patches as we can back to OpenWrt, although we have some unique features that probably can be accepted by this project. The most important one is the automated updates: this device is instantly updated if there is any security issue, it's pretty quick.
As I said, this hardware started from a security research project, so the security in it is pretty strong. It doesn't allow you to configure it insecurely. It has a crypto chip that controls the integrity of the updates and of the communication with the centre. It has many security features: it can easily run a honeypot in the router, it does flow analysis, you can easily set up a VPN, and so on. And of course one of the goals of the project was to show that there can be a tiny device that can run all the new features like IPv6 and DNSSEC validation, and also to show that it can do a little bit more than just routing, so you can even run LXC containers in this router; it's a kind of small server.
So that's the Turris Omnia. Again, it was presented many times. If you want to know some details, try to find the older presentations.
So Omnia was a great device. We started it as a small project, but at the end of the day we added every feature we thought would be interesting for people. So it has an SFP port, which is not very common in this kind of hardware, it has a lot of memory, it has quite a strong CPU, even a lot of diodes for signalling anything you wish. So it is a pretty complex device and that's why it has quite a high price. We are a not‑for‑profit company and we don't try to make a profit from this, but we need to cover our costs and they are quite high anyway.
And many people approached us later and said: Omnia is great, I would like to use this device, but could you just cut some of those features? I don't need the SFP port for my purpose, and I don't need, like, three mini PCIe slots. We said that's fine, how many pieces do you need? 100. I said sorry, it's not feasible for us to start hardware development for such an amount.
So this hardware was inflexible for someone playing with hardware for some special projects, and that's why we decided to create new, again Open Source, hardware that will fit these purposes. So the model will be small, hopefully cheaper, and you can play with it. And that's how the project called Turris MOX started. We didn't take the lessons learned from the previous campaign and we started another one, so we are now having a very good time, because it's always relaxing: we are running a crowdfunding campaign. If you want to look at it, you can go to ‑‑ and you will see a video that shows that, after I stop making routing daemons and routers, I can also work as an actor.
So this is Turris MOX. As I said, it is a modular design. So basically you create your router from the modules you want. The model of the complete router is like a sandwich, and you can see the plastic, which is also kind of like Lego; you can assemble your router as you wish. And therefore it has multiple uses: it can be a router, it can be a Wi‑Fi AP, it can be a NAS box, a simple media converter and so on. It's quite a flexible design. I will show you the hardware part.
These are the first modules that were introduced when we started the campaign. I have a few pieces with me so you can see the real size; by the way, it's 1 square centimetre smaller than a Raspberry Pi. This is the CPU module, it's the first on your left‑hand side. It's just a very small server with an ethernet port and a USB 3 port, and you can plug some other modules into it, so you can really tailor your router as you wish. This is how you create a router with five ethernet ports, and, for example, if you need some Wi‑Fi, you just plug in another module and that's it. So it's a really easy way to create your own configuration of not just a router; you can use this for everything. It's Open Source, so you can use it for any purpose you wish.
A little bit about the CPU module. It's based on Marvell Armada, dual ARM at 1.2 gigahertz. It can have half a gig or 1 gig of memory. As I said, it's equipped with USB 3 and a single physical ethernet port. The second one is in the bus: the first one is gigabit ethernet, and in the bus there is 2.5 gigabit ethernet. You can put Wi‑Fi into this small module, and it always has one PCI Express that goes to the bus, so that's the extension you connect other modules through. It also has a micro SD slot, for firmware of course.
It is PoE ready, so you can also power it over ethernet. And it can route; I have written in the presentation that it's 1 gigabit. It's a little bit less, to be honest, but still quite a lot for the purpose, which is that you put this router into your home.
So, that's the CPU module. The next module is PCI Express; as I said, it also has a SIM card slot. The other module is the one with four ethernet ports, and the last one is the module with SFP.
Later in the campaign we introduced some other modules, because people requested them. I don't have them here with me because they are still under construction. One of them is the 8 times ethernet module. It's double size, basically like two of those, and it can be added multiple times, but of course there are some limits to that.
The next module is one with 4 USB 3 ports. That's something we developed in cooperation with Nextcloud, because they were trying to find some small box that Nextcloud would run on, like a platform they could present to their users. So we have a module with 4 USB 3 ports, and the software will be ready to easily operate with USB 3 hard drives, and of course there is a very simple setup of Nextcloud.
And the last module is PCIe pass‑through, which can be combined with another PCI Express module or, for example, with the 4 times USB module, because they are on the PCI line of the bus. It sounds very complicated. Therefore we created a very simple web page configurator that allows you to set up your configuration as you wish. Here is an example. Here are the modules that are available, and here is, for example, the result which I set up. It's a router that consists of five modules. This is the CPU with ethernet and USB 3, then we have one PCI, so you can put cards there, something like Wi‑Fi cards or something else. Then we have the module with 8 ethernet, another ethernet, and then we have SFP. So that's, for example, how you can create your own router.
We have some software challenges with that. First of all, we would like this hardware to be supported by all common Linux distributions. Of course we are working on our own distribution based on OpenWrt, but we would like to make this hardware more general, so we are now working with some other companies to be able to support this.
Also, the fact that this router is modular creates a few issues because basically no distribution is really ready for very flexible hardware that can be changed and so on. And also the number of combinations is quite vast when you compute it. So, that's something we are now working heavily on.
And also, as I said, our distribution is designed to be secure, and with this hardware we have a little bit of a problem. When you boot it and try to configure it, it has to be done through a wire, over the ethernet port, because we cannot just open Wi‑Fi for everybody and then believe that the device stays secure. But in this configuration we don't know what is a LAN port and what is a WAN port; that's kind of messy. Also, just this module, if you equip it with a Wi‑Fi card, can be a router which has just one port, and then Wi‑Fi creates the LAN, and again, configuring a device through WAN and so on is a little bit complicated.
And the last challenge is that we would like this device to boot from the network. For example, if you plug it into a Turris Omnia, it will just get a network address and a boot image, boot from the network and then be configured from the master, from the Omnia. So it should be very automated: you just plug it into your network and that's it.
So those are the main challenges we are currently working on. Hopefully we'll be able to upstream some of them to OpenWrt, if they are accepted.
I hope you like our new toy and that's all from me.
MARTIN WINTER: Any quick questions?
AUDIENCE SPEAKER: Hi. Blake Willis, Zayo. Quick question regarding the boot loader on this thing. Sorry, I might have missed the slide, is this U-Boot or ‑‑
ONDREJ FILIP: It's U-Boot.
AUDIENCE SPEAKER: Have you looked into using EFI firmware? They have massively less pain supporting various ARM devices, because the EFI boot loader is way easier to deal with than a custom U-Boot for every platform.
ONDREJ FILIP: Not really. We have people that are not really experienced with U-Boot, not really, but maybe we could consider that.
AUDIENCE SPEAKER: Talk to Peter.
AUDIENCE SPEAKER: Alexander, NETASSIST. Do you have PON support in the SFP module?
ONDREJ FILIP: Good question. And I don't know. I will take this question offline. I'm sorry.
MARTIN WINTER: Okay. Thank you.
Next person up is Vicky from ISC; she wants to give a quick update on the Kea project.
VICKY RISK: I can't wait to get my MOX router. I got an Omnia and one of our engineers took it off my desk, so I'm having the MOX shipped to my home.
I'm going to give a quick update on our Open Source DHCP server from ISC.
First of all, we keep referring to Kea as a modern day DHCP server. This is what we mean by modern. When users first started using Kea, the feedback that we got from the network engineers was that they all wanted to deploy it in pieces. People wanted to have a lot of flexibility, so we have been trying to break the DHCP server apart into lots of components. The Kea daemon itself is of course handling all the communications with your DHCP clients. But from the beginning we supported separate back‑end storage, both for reading your host reservations from a database and for writing the leases to this external storage. And we have a number of different database options for that.
We have recently added RADIUS for accounting and access control; I'm going to talk about that briefly. We are very proud of the REST API, which is just light years improved over the OMAPI interface on our other DHCP server, ISC DHCP. And we're also trying to make as much of the configuration information as possible, ideally all of it, centralised, so outside of the DHCP server itself, to make it easier to manage.
So that's what we mean by a modern DHCP server. Kea 1.4 is actually scheduled to be posted for beta testing tomorrow, so we would love to get as many beta testers as possible. The biggest feature in 1.4 is high availability. This is the last major feature whose absence was preventing a lot of people who were interested from migrating from ISC DHCP to Kea. We have it now, and we would love to get people to test it. We also have a new back end, an Apache Cassandra back end, which was contributed by Deutsche Telekom; we'd love some testing support on that. And we are coming out with a first version of RADIUS access control and accounting. There are also a number of other new features. We have improved the performance of the shared network feature. With every release we are adding more flexibility in the client classification. And there are many small bug fixes in here as well.
The high availability is not the IETF draft DHCPv4 failover. Instead it is a much simpler implementation that works both for DHCPv4 and for DHCPv6. It works in two different modes: either 50/50 load balancing (so you can't have 80/20 or 60/40) or active standby. Those are your two choices. We have both a heartbeat over the control channel, and also each Kea server monitors the other Kea server's response time, via the secs field, that is, its time in responding to clients, in order to determine whether it is perhaps unavailable or down.
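To make the two modes concrete, here is a rough Python sketch of how a pair of servers might decide who answers a given client. It is an illustration only: the hash used here (the first byte of SHA-256) is an assumption chosen for simplicity and is not Kea's actual algorithm, and the server names, mode strings and function name are hypothetical.

```python
import hashlib
from typing import Collection

SERVERS = ("server1", "server2")  # hypothetical names for the HA pair


def responder(client_id: bytes, mode: str, alive: Collection[str] = SERVERS) -> str:
    """Pick which server of the pair should answer this client.

    "active-standby": the first live server answers everything.
    "load-balancing": clients are split into two fixed halves by a hash of
    their identifier; if the preferred server is down, the survivor takes over.
    """
    live = [s for s in SERVERS if s in alive]
    if not live:
        raise RuntimeError("no DHCP server available")
    if mode == "active-standby":
        return live[0]
    if mode == "load-balancing":
        bucket = hashlib.sha256(client_id).digest()[0] % 2  # always a 50/50 split
        preferred = SERVERS[bucket]
        return preferred if preferred in live else live[0]
    raise ValueError(f"unknown mode: {mode}")
```

Because the bucket depends only on the client identifier, both servers reach the same answer independently, which is also why the split is fixed at two equal halves rather than a tunable ratio.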
The other slide just pointed out that we have separate back ends for lease storage and for host reservation storage. So this added Cassandra back end, if you are familiar with Cassandra, is designed for massive scalability without a single point of failure, so you can add as many nodes as you want in your fabric, and obviously then you have many more nodes that could fail in the database back end before you actually have a service impact.
RADIUS integration is something quite a few people have requested. I'm sure that with our first release we have not got all the features that people want, but we're starting down that path.
So, we're leveraging the hooks interface on the Kea server. When we get different DHCP events, like the discover etc., we can add a hook out to the RADIUS server. One of the problems is that RADIUS was originally designed for dialup access, so this can be kind of slow. So we have also added a caching feature, so that we don't have to keep going back to the RADIUS server after we already have the information on a particular client.
We're still finalising the content for the next version of Kea, 1.5, which will be sometime in the fall. But we're planning on extending the ability to centralise the server configuration and adding support for NETCONF using a YANG model. The YANG models for DHCP have not yet, to my knowledge, been implemented in production, and they are not really fully agreed yet, but this is what we'd like to do and we'd love some collaboration from this community.
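The caching idea can be sketched as a small time-limited cache in front of a slow lookup. This is a generic Python illustration, not the Kea hook library: the class name `CachedLookup`, the `backend` callable standing in for the RADIUS query, and the expiry policy are all assumptions.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CachedLookup:
    """Remember answers from a slow backend for a limited time."""

    def __init__(
        self,
        backend: Callable[[str], Any],
        ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._backend = backend  # e.g. a function that queries the RADIUS server
        self._ttl = ttl          # seconds an answer stays valid
        self._clock = clock      # injectable so the cache can be tested
        self._cache: Dict[str, Tuple[float, Any]] = {}

    def get(self, client_id: str) -> Any:
        """Return the cached answer if still fresh, else ask the backend."""
        now = self._clock()
        hit = self._cache.get(client_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = self._backend(client_id)
        self._cache[client_id] = (now, value)
        return value
```

A client that sends a discover followed shortly by a request then costs one backend round trip instead of two.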
My last point, it's sort of your first warning. We are serious about replacing ISC DHCP with Kea. ISC DHCP is something like 21, 22 years old. It's very hard to maintain at this point. I know people are not eager to replace their DHCP infrastructure, but at some point you are going to have to. We did our last major feature branch of ISC DHCP earlier this year; that's 4.4. We are working on some migration tools, and as soon as we have a good set of tools for migrating both the configuration and your active leases, we're going to start talking about a timeframe for end of life for ISC DHCP.
That's all I have. I think in the deck ‑‑ yes, there are some links if anybody needs any references.
MARTIN WINTER: Okay. We're nearly out of time. But there is one quick question.
AUDIENCE SPEAKER: Peter Hessler with the OpenBSD project. Can you go back a few slides to the announcement page for 1.4? I saw a note there that looked curious. In the new features, you have some things in parentheses: one of them is Open Source, the other one is premium hook library. Can you explain that?
VICKY RISK: Sure. So, part of the Open Source in Kea is this hooks interface. Anyone can program against these different callouts to add features. Some of the features that we have added as hooks libraries we are actually selling at a premium on our website. So, like, for 499 you can get a package with three of these different hooks libraries. The hooks library is one implementation of the feature; in most cases you can also implement the feature yourself by writing another hooks library. It's a good point. I didn't get into which of these are premium just because of the time constraints, but we're trying to be open about it. Does that answer your question or ‑‑
PETER HESSLER: That does, yes.
MARTIN WINTER: Okay. Thank you. We are very short on time so let's go on to the next talk. We have one last lightning talk left, talking a little bit about blockchains.
JORDI PAILLISSE VILANOVA: Hi everyone. I will try to explain how we could put IP addresses and AS numbers into the blockchain. I would like to thank RIPE for inviting me to present here.
I'll try to be very quick because we are all waiting for lunch.
First of all, I would like to say that this is cheap. In our proposal of blockchain you will not need to run a data centre like this one, so you don't have to waste any power and there are no high electricity bills, because it uses a different thing, called proof of stake, and it can run on a normal PC. So I will go very quickly through how blockchain works.
So, very briefly: a blockchain is a secure database. It is blocks, one after the other, and it is protected by two mechanisms: a chain of signatures and the consensus algorithm that I mentioned before.
So, very briefly, how it works. The transactions are protected by a signature. You send transactions to a peer‑to‑peer network. At fixed intervals in time a particular node collects all the transactions, adds them into a block, computes the consensus algorithm and sends the block back to the network. Finally the rest of the nodes, when they receive this block, check the transactions, and if all of them are correct they add the block to the chain.
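The "blocks one after the other" part can be illustrated with a minimal hash chain in Python. This sketch covers only the chaining: transaction signatures and the proof-of-stake consensus from the talk are left out, and the block layout and function names are assumptions, not the prototype's format.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(block: Dict[str, Any]) -> str:
    """Hash a block's full contents in a stable, key-sorted encoding."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: List[Dict[str, Any]], transactions: List[str]) -> Dict[str, Any]:
    """Create a block that commits to the previous block's hash and append it."""
    prev = block_hash(chain[-1]) if chain else GENESIS_PREV
    block = {"index": len(chain), "prev_hash": prev, "transactions": list(transactions)}
    chain.append(block)
    return block


def verify_chain(chain: List[Dict[str, Any]]) -> bool:
    """Check that every block points at the hash of the block before it."""
    for i, block in enumerate(chain):
        expected = block_hash(chain[i - 1]) if i else GENESIS_PREV
        if block["index"] != i or block["prev_hash"] != expected:
            return False
    return True
```

Changing a transaction in an old block changes that block's hash, so the link stored in the following block no longer matches and `verify_chain` fails; this is what gives the permanent, auditable log mentioned in the talk.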
Just some features of blockchains compared to traditional PKIs. They are decentralised. They don't need certification authorities, which makes management simpler. They are also auditable and censorship resistant, because you have a permanent log of transactions. There are drawbacks too. You don't have the crypto guarantees you would expect from a PKI; it depends on the good behaviour of the participants. And, as all of you know, they require storage.
We want to store three elements: IP prefixes, the chain of validations and delegations, and the bindings to AS numbers.
Why blockchain? We can see IP addresses like coins; they have similar properties. They are unique, so two parties cannot have the same IP address. They are transferable, so I can send IP addresses to someone. And they are divisible.
So we could think of a blockchain where you send prefixes to people, like in Bitcoin you send money to people.
A quick example. First of all, IANA would make a transaction assigning all the address space to itself, and then it would start allocating blocks of addresses to the registries; those in turn would allocate them to the customers, and finally the customers can bind these prefixes to their AS numbers. So, if I want to know which AS number is registered with a prefix, I just have to retrieve the last transaction from the blockchain and I get the answer.
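The allocate, delegate and bind flow can be sketched with Python's standard `ipaddress` module. This is a deliberately simplified model, not the prototype: it is IPv4 only, there are no signatures or blocks, a holder is just a string, and the class and method names are hypothetical.

```python
import ipaddress
from typing import Dict, Optional


class PrefixLedger:
    """Toy ledger: who holds which prefix, and which AS it is bound to."""

    def __init__(self, root_holder: str = "IANA") -> None:
        # The root assigns the whole (IPv4) address space to itself.
        self.holders: Dict[ipaddress.IPv4Network, str] = {
            ipaddress.ip_network("0.0.0.0/0"): root_holder
        }
        self.bindings: Dict[ipaddress.IPv4Network, int] = {}

    def _owner_of(self, net: ipaddress.IPv4Network) -> Optional[str]:
        """Holder of the most specific recorded prefix that covers net."""
        covering = [p for p in self.holders if net.subnet_of(p)]
        if not covering:
            return None
        return self.holders[max(covering, key=lambda p: p.prefixlen)]

    def allocate(self, sender: str, prefix: str, recipient: str) -> None:
        """Transfer prefix from sender to recipient; sender must hold it."""
        net = ipaddress.ip_network(prefix)
        if self._owner_of(net) != sender:
            raise PermissionError(f"{sender} does not hold {net}")
        self.holders[net] = recipient

    def bind(self, holder: str, prefix: str, asn: int) -> None:
        """Record that holder originates prefix from the given AS number."""
        net = ipaddress.ip_network(prefix)
        if self._owner_of(net) != holder:
            raise PermissionError(f"{holder} does not hold {net}")
        self.bindings[net] = asn

    def origin_as(self, prefix: str) -> Optional[int]:
        """AS bound to the most specific registered prefix covering prefix."""
        net = ipaddress.ip_network(prefix)
        covering = [p for p in self.bindings if net.subnet_of(p)]
        if not covering:
            return None
        return self.bindings[max(covering, key=lambda p: p.prefixlen)]
```

Splitting a prefix into more specific subnets before passing part of it on is what makes the address space divisible like a coin, and the ownership check in `allocate` is what prevents two parties from holding the same prefix.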
So finally, a few words about the prototype we have built. It's written in Python. We have leveraged different libraries. It uses a simple algorithm, with a block time of 60 seconds. It supports both IPv4 and IPv6, and you can find it here, open‑sourced on GitHub.
We performed an experiment to see how it fits our use case. Basically we tried to recreate the allocation hierarchy that we find in the Internet now, so we have three levels of transactions: the first one allocating /10s, then again among 8 nodes in terms of a /16, and finally 10 thousand prefixes from the RIR exchange files. In this graph you can see the chain size: for around the 1,000 and 50 prefixes it takes up to 1 gigabyte. And it takes around 7 hours to download and verify this chain.
I think that's all. You can find more information in this arXiv paper, and there is also an IETF draft, on the first slide. So, if you have any questions.
AUDIENCE SPEAKER: Hi, this is Andrei from II C. I didn't hear one thing: what problem does this solve?
JORDI PAILLISSE VILANOVA: If you have to verify the origin AS for an IP prefix. It's similar to RPKI.
MARTIN WINTER: Okay. Any other questions? Good. Thank you very much.
So that brings us to the end. Just about one minute over. Thank you everyone for attending. Remember next time please submit your presentations earlier, much better chance for getting in. Thank you.
LIVE CAPTIONING BY
MARY McKEON, RMR, CRR, CBC