15 May 2018
BRIAN NISBET: Hello, good afternoon. I am not chairing this session but I just have a deep need to stand on stage, so you have to listen to me for a couple of minutes. As you are aware, the PC, the greater portion of it, the equals who are better than the others, are the elected members of the PC rather than the randomly appointed ones like me. We have the PC elections and three candidates; I know one of them is in the room. Ondrej or Dimitry, are you here? No. Peter has kindly said he is the only one in the room, which he is, so we can just talk about him. So we have Peter Hessler, Ondrej Sury, it's been an interesting day, and Dimitry Kisselev. Dimitry is a new self-nominated candidate, so their bios are up on the front page of ripe76.ripe.net. You can go and look at those and cast your votes. There are two seats and you need to cast your votes over the next 36-ish hours or so; voting will end at some point on Thursday. So please go vote as early and often as your conscience allows, so that we continue to have an excellent PC to provide the plenary to you. Nothing more. I will hand over to the Chairs.
CHAIR: Thank you, Brian. I am Maria and I am chairing this session, so we will be your Chairs for the next hour and a half. We have two 30 minute talks and three lightning talks. The first speaker is Jacky Hammer from ERNW and she is going to speak about RFC 6980. So, please, thank you.
JACKY HAMMER: Hello, I work mainly as a systems network operator, and the topic I have brought for you today is RFC 6980 and how it is or is not implemented on various operating systems.
So I am first going to introduce what this is about, then I will show how I did the tests, what the results of my testing were and what we can learn from that.
We know a router in IPv6 has a somewhat different role from a router in IPv4. A router is not only, yeah, routing messages, but in v6 we also have the router provisioning the clients on the network with necessary information. So this is a very important role, and as neighbour discovery is an integral part of IPv6's DNA and router advertisements are a part of that, it's an important type of packet which should be secured, but IPv6 in general has no widely used mechanism for securing neighbour discovery. Of course there is SEND, but nobody is really using it. And in IPv6 on a local link we are all friends and we all trust each other, and that makes it possible for an attacker to say, okay, I am a router and here is the information about your network, and we go ahead and trust him without any ‑‑ so there are a couple of RFCs that try to protect us a little from maliciously formed data. One of them is RFC 6980. What it says is that we must not fragment router advertisements or neighbour discovery packets in general, because fragmentation for these messages is simply not necessary, and what it also tells us is that if we as a client get these kinds of messages in fragmented form, they are not valid and we must not accept them and should discard them.
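The receiver-side rule just described can be sketched in a few lines of Python. This is only an illustration of the rule as the speaker states it, not code from any operating system: the `Packet` class and the `accept_nd_message` function are hypothetical names, and the packet is assumed to be already parsed into its chain of Next Header values.

```python
from dataclasses import dataclass
from typing import List, Optional

# IPv6 Next Header values (IANA protocol numbers).
FRAGMENT = 44
DEST_OPTS = 60
ICMPV6 = 58

# ICMPv6 types of the basic Neighbor Discovery messages:
# Router Solicitation, Router Advertisement, Neighbor Solicitation,
# Neighbor Advertisement, Redirect.
ND_TYPES = {133, 134, 135, 136, 137}


@dataclass
class Packet:
    """Hypothetical, already-parsed view of a received IPv6 packet."""
    header_chain: List[int]            # Next Header values in order, upper layer last
    icmpv6_type: Optional[int] = None  # set only when the upper layer is ICMPv6


def accept_nd_message(pkt: Packet) -> bool:
    """Return False for a Neighbor Discovery message that arrived fragmented."""
    is_nd = (
        bool(pkt.header_chain)
        and pkt.header_chain[-1] == ICMPV6
        and pkt.icmpv6_type in ND_TYPES
    )
    if not is_nd:
        return True  # the rule sketched here only concerns ND messages
    # A Fragment header anywhere in the chain means the message was fragmented.
    return FRAGMENT not in pkt.header_chain
```

For example, `accept_nd_message(Packet([FRAGMENT, DEST_OPTS, ICMPV6], 134))` rejects a fragmented router advertisement, while the same advertisement without the Fragment header is accepted.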
Now, we wanted to test a little whether that is actually applied on the systems. What I did was to provide different target systems and have an attacker machine connected to the same network. I also used a laptop to automate the testing, but for the set‑up itself that is not important. What I used as a toolkit was Chiron. This is a framework to form certain kinds of neighbour discovery packets the way we want them for testing: we just tell it to form the packet like this, set the following IP addresses, and then it sends the packet, which is a very convenient way to build custom, arbitrarily formed packets. I also used a switch with a certain version, and tcpdump and Wireshark to look into the actual packets.
The tests that I then did were sending some forms of router advertisements. In the first stage, simply some baseline router advertisements to see if the network itself worked and behaved as expected for valid packets. I played a bit with extension headers, destination options and hop by hop headers, in unfragmented router advertisements to see if they got evaluated correctly, and then I started fragmenting the router advertisements using two, three or four fragments or even more, and injecting certain extension headers in either the fragmentable or unfragmentable parts. The headers I used in this case were hop by hop, destination options and routing headers, which all carry some specification telling clients or routers how to behave on the network.
So the initial round of testing was done on Windows Server 2016. In the beginning it was more of a by‑product of some other IPv6 testing colleagues of mine did. They found something to be off, and then they asked whether someone had time or wanted to look deeper into that, and I said this looks interesting, I want to look at that. These first tests were a year ago, and since then much has happened. So I have these tables where you can see what I did. I am going to explain briefly how they are formatted because there are going to be more of them. Here I state the number of fragments I split the packet into; if it says one it's just an unfragmented packet. Here are the extension headers, destination options, hop by hop and also routing headers, listed in the number and the order they occurred. And in the field below, message part, I specify whether it's the unfragmentable or the fragmentable part into which the extension headers are injected.
As you can see now, for the first baseline packets it behaves as would have been expected. A simple unfragmented packet is accepted and a fragmented router advertisement is not accepted. A router advertisement with valid extension headers that is not fragmented is accepted, and this one is an invalid order of headers as of RFC 2460 and 8200, because the hop by hop header has to go first, so I tested that as a by‑product, and this invalid packet is also discarded correctly.
So as long as we don't fragment the packet, or have the destination options and other stuff in the unfragmentable part, it behaves as expected, but as soon as we are getting creative and starting to put extension headers in the fragmentable part of the router advertisement, they suddenly get accepted again. Why? This is a fragmented router advertisement, it should not be accepted as a whole. So why is this accepted? Well, okay, this looks ugly. So we got even a little bit more creative, trying around with more headers and more packets. If you get really crazy then maybe stuff is discarded again, but all in all we can say, okay, this RFC is nearly not implemented at all, save for the very basic cases.
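The ordering rule mentioned here (a Hop-by-Hop Options header, when present, must come directly after the IPv6 header) can be checked with a very small function. This is a sketch under the assumption that the extension headers have already been parsed into a list of Next Header values; `hop_by_hop_position_valid` is a hypothetical helper name, not part of any tool used in the talk.

```python
from typing import Sequence

# IPv6 Next Header values (IANA protocol numbers).
HOP_BY_HOP = 0
ROUTING = 43
DEST_OPTS = 60


def hop_by_hop_position_valid(extension_headers: Sequence[int]) -> bool:
    """True if Hop-by-Hop appears only as the very first extension header.

    `extension_headers` lists the extension headers in the order they
    follow the fixed IPv6 header.
    """
    return all(header != HOP_BY_HOP for header in extension_headers[1:])
```

With this check, `[HOP_BY_HOP, DEST_OPTS]` is a valid order, while `[DEST_OPTS, HOP_BY_HOP]` is the kind of invalid order the test packet used.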
Now, how can we save ourselves from that? RFC 6105 proposes router advertisement guard, a mechanism to simply prevent router advertisements on certain ports on the switching hardware because, well, we know where our router is and which port it's connected to, so we can just prevent router advertisements on all the other ports. Let's try this. As we can see, for the unfragmented or the simple ones the packet just doesn't reach the target; this is good, obviously. For the slightly more complex packets, the packets also get discarded by the router advertisement guard, but as soon as we start with the really creative ones, like 4 fragments and 2 options or 3 fragments and 2 options, they can even evade the router advertisement guard mechanism and still infect our systems with malicious IPs and routes. So the problem here is that I can just take this packet or this packet and use it to infect any Windows server on a local link with me with the route I want it to have. I probably don't have to tell you what a problem route injection is.
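One way to picture how such an evasion can happen is a toy model of a filter that looks at every frame in isolation and never reassembles. This is an assumption made for illustration only, not the logic of the switch used in the tests; `Frame` and `stateless_ra_guard` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional

ICMPV6_RA = 134  # ICMPv6 type of a Router Advertisement


@dataclass
class Frame:
    """Hypothetical view of one frame as seen by a per-frame filter."""
    fragment_offset: int         # 0 for an unfragmented packet or a first fragment
    icmpv6_type: Optional[int]   # None when no ICMPv6 header is visible in this frame


def stateless_ra_guard(frames: List[Frame]) -> List[Frame]:
    """Drop frames in which an RA header is visible; forward everything else.

    The filter keeps no state and does not reassemble, so later fragments,
    which carry no visible upper-layer header, pass through.
    """
    return [frame for frame in frames if frame.icmpv6_type != ICMPV6_RA]
```

In this model an unfragmented advertisement `[Frame(0, 134)]` is blocked completely, while for `[Frame(0, 134), Frame(1, None), Frame(2, None)]` only the first fragment is dropped and the remaining two still reach the host.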
So, as we are using a lot of FreeBSD in our company and I am basically a Linux person, I wanted to see how it behaved on Linux and FreeBSD servers. This testing was done in mid‑2017 and I tested the current Arch Linux version and a CentOS and Debian, I tested the two current FreeBSD versions, then I used SUSE and Ubuntu servers. What I did here, to shorten the view: where it differs, the left half is without the RA guard and the right half is with the RA guard. So where the packet gets accepted here in the baseline test, it just gets discarded with the RA guard. So for the baseline everything looks as it should, rather boring. Now we can see here that what went wrong in the Windows testing, at least wrong according to RFC 6980, also went wrong on some Linux versions and FreeBSD. Then with the more creative packets we could see that FreeBSD 10 was rather okay and 11, the newer version, even had the problems Windows had with possible router advertisement guard evasion.
When we looked at the packets in Wireshark without the RA guard, all packets arrived as we would expect them, boring. With the RA guard, we saw the following: the later fragments of the message but not the original first part, in the cases where the packet was fragmented. So on our target we could only observe the fragments but not the first part of the packet, and it got evaluated anyway and set the malicious routes. So why is that possible? Why do we evaluate fragments of packets which are missing their first fragment, when they should only be evaluated as a whole? That is kind of nuts, right? So I presented those findings at the German network operators' group in November, and people started approaching me and talking to me about what impact this had. One person, thanks, submitted a bug really fast after my talk, including a patch for FreeBSD, because he is in the FreeBSD developer community and this could be easily patched. He submitted the bug and a patch and it got closed and said to be fixed in 11.1. Well, I tested 11.1, I will show you later. And when I talked to people they said, okay, this is really crazy stuff for our networks, this is a problem, this is a huge problem, but okay, data centres and servers are interesting targets because they are the high risk, high impact targets, but our data centres are more of a controlled environment. We want to know about clients: how do regular clients in our networks behave? Because as long as RA guard evasion is possible, not even our office nets are secure. And if we have, sorry for the stereotype, the secretary's Windows PC and this stuff is possible, that is not cool. So could you do some more targeted research into client operating systems? And I said sure, okay, I am going to do that. I also want to do some more research into IoT stuff and mobile phones; hopefully I will have time for that in the future.
But for this meeting I focused on testing client operating systems. What I tested was, again, a current Arch Linux because that is what a lot of people in my environment use, I tested Debian because I am a Debian person, I tested the current FreeBSD and Mac OS X and Ubuntu desktop and current Windows. What is missing here is that I also tested Windows 10 IoT Core, because I thought there are a lot of devices out there, all those Raspberry Pis with Windows doing home automation, that is fun to exploit. So I just skip over the boring part and then, as we can see here, FreeBSD 11 behaves just as badly as before. I am sorry, Lutz, thank you for your work but it didn't work out. And Ubuntu desktop 18.04, which I tested two days before the release, had problems the previous Ubuntu versions didn't have, and Windows 10 didn't behave any differently from Windows Server, just very, very broken. The IoT version too, spanning thousands of devices, and I can inject them with whatever routes and IPs I want; you can imagine the impact, maybe.
So what can we learn from this? That the behaviour and the implementation of these RFCs, I mean RFC 6980 is merely an example, we have to extrapolate that to similar RFCs, depends not only on the operating systems but on the versions and on the kernels. There are flags in certain Linux versions that enable this RFC 6980‑compliant behaviour: in some it is on and it complies, in some it is off and it still complies, and some don't even know that flag and don't comply, so nobody really knows what is going on. The security mechanisms we are provided with can be evaded, and that is because the Layer 2 device can't reassemble the packet and see if it's malicious. It sees the first fragment, says okay, router advertisement, discard, and with the other fragments it doesn't know what to do, so it lets them through, and the client goes on and evaluates this packet with missing fragments, and I am like, seriously? But the question is whether we can actually have some long‑term working solution for that, because it kind of conflicts with the robustness principle: when you say be lenient in what you accept from others, then it's a logical step to accept these slightly broken packets. But, yeah, how do we really handle the conflict between security and lenience? I am not sure. What we should take from this is that we need to test, we need to test how our systems behave. We cannot take standards for granted. The vendors or, if the vendors don't do it, we as a community need to do more testing on how the RFCs the community creates are implemented. You are part of the community that creates the RFCs, so we need to at least look into what the vendors do with them. And even if it looks like RFC 6980 is some minor RFC saying something about formality stuff, no, it's not, it has reasonable security implications, and I don't want to know how many other RFCs are out there that are so highly under‑rated and cause such security implications for, well, who in this room is using Windows or FreeBSD or something?
I guess it affects most of you.
Well, thank you for hearing me out. I am happy to be able to be here and thanks for ‑‑
CHAIR: Thank you, Jacky.
JEN LINKOVA: I totally agree that implementation must not have bugs, great. I think we are all on the same page, right?
JACKY HAMMER: Yes
JEN LINKOVA: One question and one comment. Have you opened ‑‑ have you reported those bugs you discovered to the systems?
JACKY HAMMER: Some I did. For FreeBSD I took help from the community, for some Linux versions I talked with people from the community because I didn't really know how to file those bugs, and they gave me some help. For Windows, honestly, I have no real idea how to open bugs that actually lead to some work.
JEN LINKOVA: My comment is, yeah, it's a problem, but I would not say that it's a huge problem and we are all in danger, because the only risk you really have here is DDoS, because your host might lose connectivity. Intercepting traffic should not be a problem because you should encrypt everything really, and if you do not trust your hosts, if you are not controlling them completely, it might be worth looking into isolating hosts on Layer 2, and in this case it's not a problem. If you monitor your infrastructure, even if you don't monitor your hosts, you will detect it quite easily, right? So I agree it's a problem that needs to be fixed but I do not want this to sound too dramatic.
JACKY HAMMER: I absolutely agree with the standpoint that traffic interception is not a problem, if every traffic was encrypted like it should be, but we have some reality check, I mean there is too much ‑‑
SPEAKER: David Farmer, University of Minnesota. The RA guard is not all that successful in, as you have shown here, but this would be only malicious traffic. You don't accidentally get those forms of RAs.
JACKY HAMMER: Yes.
SPEAKER: So the one thing I will say is, RA guard is very effective at accidental RAs, and everybody should implement RA guard even though it doesn't solve all the problems.
JACKY HAMMER: Yes, definitely.
SPEAKER: Filippe, NETASSIST. Thanks for your research, it was a good presentation, I was happy to see and listen to it. What I would like to add here is that we had the same problem with IPv4. You remember it, as far as I remember, it was accept redirects ‑‑ you remember, everyone, this interesting way to make malicious stuff in a network. In IPv6 we have the same, the only difference is that regular processing of the packets here becomes complicated. I have said it many times before, and everyone knows about it, but as far as I realise, packet processing here is still at the type 3 and we don't have so much complication, even in IPv6 we don't need so much ‑‑ we don't need memory here to process all the advertisements, even though we can just drop, but the problem is the actual implementations, that is why RA guard does not help. And bugs in operating systems are a truly problematic thing. Anyway, thanks for your research. I was really happy to hear that.
JACKY HAMMER: Thank you.
CHAIR: We have no more time for questions. Thank you.
I am presenting the next speaker, which is Kostas Zorbadelos, on large scale deployment of RFC 7596.
KOSTAS ZORBADELOS: Hello everyone. I am Kostas Zorbadelos, working for OTE Greece and this is another IPv6 story, hopefully it's interesting enough to keep everyone awake after a long session and a long day of interesting presentations.
Hopefully I can understand how this works. This is more or less what I will cover during the presentation. I will define the problem and then I will describe the characteristics of the IPv6‑only service, I will give a short introduction to lightweight 4 over 6, which was our approach to the problem, and I will give a brief overview of the functionality of a lwAFTR implemented as a virtualised network function. Then I will try to focus more on the deployment in our production network, because the main purpose of this presentation is to give a network operator's perspective.
I will mention our challenges, our experiences and the future work, more or less.
Wow. This is more or less a time‑line of events inside our AS; I will be talking about the fixed network of OTE. We started researching the situation back in 2013 because we could see that there was a shortage of IP addresses, and we went forward with lightweight 4 over 6 more or less at the time of its publication as an RFC, so in 2015 we went out with our RFQ for a lwAFTR deployment. So we have it deployed in the network, I will share numbers a bit later, and our target is to reach 100,000 users in the production network by the end of the year. So this is the problem. The problem is not unique, as you can understand. We did face an exhaustion of public IPv4 addresses, and when I am talking about exhaustion, I am not saying exhaustion in general, I am talking about exhaustion inside RIRs. We are really happy that our customer base is growing, and at the time when RIPE began allocation from its last /8 we had around 500,000 addresses available in our pools. Okay. So we evaluated the possible solutions to mitigate this exhaustion problem. There was a lot of work in the IETF. More or less our candidates were DS‑Lite, MAP‑E, and lightweight 4 over 6. Our public pool was running low, we could see that clearly in 2014, and the main proposal was to move forward with an IPv6‑only residential service and not invest in buying extra IPv4 addresses. IPv4 was to be treated as a service over the IPv6 network; our network has been dual stack for a lot of years, and Greece is doing quite well in IPv6 penetration. Of course, we had to have a backup solution, a CGN solution. We thought we could avoid it; we didn't make it eventually. So these are the characteristics of an IPv6‑only service. We wanted the service to be stateless or to have as little state as possible, distributed, flexible, with a possibility to completely remove IPv4, to minimise logging needs, and one innovative feature is to have the necessary network functions implemented in a virtualised fashion.
MAP and lightweight 4 over 6 were the main contenders, but we are part of the Deutsche Telekom group. You may have heard of something called TeraStream and its architecture, and lightweight 4 over 6 is the choice inside Deutsche Telekom, so we went with it.
A brief overview of lightweight 4 over 6: it is basically an improvement to DS‑Lite that moves the NAT functionality back to where it belongs, to the CPE, and utilises encapsulation and tunneling to provide IPv4 service over an IPv6‑only network. So this is more or less what is going on. A customer behind a LAN has a private IPv4 address as usual. The CPE performs a NAT translation, but the extra thing here is that many CPEs share one IPv4 address and each CPE has a port range. This is the idea to conserve IPv4 address space, so the CPE should perform its NAT functionality using a restricted port range. It will then encapsulate the traffic in IPv6 and send it to a central location where a lwAFTR is located. The lwAFTR consults a binding table, which is the minimal state that is held in lightweight 4 over 6. This binding table contains the CPE tunnel end point, the shared IPv4 address and the port range. The lwAFTR performs a check and if the packet matches the binding table, it decapsulates the packet and sends it to the public IPv4 Internet. On the return path, the reverse happens. The return traffic hits the lwAFTR, the lwAFTR performs a check in the binding table, encapsulates the packet and then forwards it back to its original destination, the CPE. The main idea is that the lwAFTR is as stateless as possible. It only performs encapsulation and decapsulation by consulting a binding table. The binding table, as shown previously, has just three pieces of information: the IPv6 address of the tunnel end point on the CPE, the public shared IPv4 address and the port range. What we do is exclude the low system ports and give users the ports higher than 1024. We are using 1024 ports per subscriber right now, and there is no algorithmic relationship between the /56 that is delegated to the customer and the IPv4 address and port set that the customer gets.
It's much simpler in terms of planning and implementation but, you lose some, you win some, provisioning is harder.
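The binding table check described above can be sketched as follows. This is a minimal illustration of the idea, assuming packets are already parsed; it is not how the production lwAFTR is implemented, and `Binding`, `BindingTable`, `permits_upstream` and `downstream_endpoint` are hypothetical names. The addresses in the usage note come from documentation ranges.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv6Address
from typing import Iterable, Optional


@dataclass(frozen=True)
class Binding:
    """One binding table entry: the three pieces of information from the talk."""
    cpe_tunnel_endpoint: IPv6Address
    shared_ipv4: IPv4Address
    port_min: int
    port_max: int

    def covers(self, ipv4: IPv4Address, port: int) -> bool:
        return ipv4 == self.shared_ipv4 and self.port_min <= port <= self.port_max


class BindingTable:
    def __init__(self, bindings: Iterable[Binding]) -> None:
        self._bindings = list(bindings)
        self._by_endpoint = {b.cpe_tunnel_endpoint: b for b in self._bindings}

    def permits_upstream(self, outer_src: IPv6Address,
                         inner_src: IPv4Address, inner_src_port: int) -> bool:
        """CPE to Internet: decapsulate only if the inner source matches the binding."""
        binding = self._by_endpoint.get(outer_src)
        return binding is not None and binding.covers(inner_src, inner_src_port)

    def downstream_endpoint(self, dst: IPv4Address,
                            dst_port: int) -> Optional[IPv6Address]:
        """Internet to CPE: find the tunnel end point to encapsulate towards."""
        # A linear scan keeps the sketch short; it is not meant to be fast.
        for binding in self._bindings:
            if binding.covers(dst, dst_port):
                return binding.cpe_tunnel_endpoint
        return None
```

With two subscribers sharing 192.0.2.1, one on ports 1024 to 2047 and one on 2048 to 3071, return traffic to port 2500 is tunnelled to the second subscriber's end point, and traffic to port 80 matches no binding at all.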
So, in terms of the CPE: we performed an RFQ process to obtain a lwAFTR. The CPE was outside the scope of this RFQ and we targeted a single CPE that had a mass deployment inside our company. The RFQ process had as a deliverable a lwAFTR VNF that we wanted to be able to run on commodity hardware and that could accommodate multi 10 gig traffic with predictable performance. We had four vendors participate, and the RFC was quite new, so development was done mostly in parallel with the RFQ. We stress‑tested those solutions extensively inside our labs and it took us more than a year for a single vendor to be selected. I'm happy to report, however, that at the heart of the solution we utilise open source software, a toolkit called Snabb, which is a fast packet networking toolkit, very interesting work. It helps implement fast data plane engines in pure software on commodity hardware. It allows you to implement networking applications using a high level scripting language, Lua, and the creator is Luke Gorrie; he had an affiliation with DT, a very smart and talented guy and seems quite pleasant. The approach is to perform with no kernel overhead, that is user mode networking, and it can create data plane applications with line rate performance at 10G and beyond. The selected vendor was Juniper eventually. They packaged Snabb as the fast data plane toolkit and they provided their Juniper vMX on top as a control plane. So this is more or less how it looks. Each 10G interface is handled by a Snabb process and on top of that we have the vMX control plane. It is all packaged as a Docker container. It is very interesting work. It is a presentation on its own, and I highly recommend the presentation from Diego Pino in the IPv6 Working Group; I look forward to it, and he intends to show the work that was done in the Snabb layer.
Now, what we needed in terms of the deployment in the production network: we needed of course the CPE support, we needed BNG configuration and various options there, RADIUS, a central DHCP deployment and of course the installation of the lwAFTR in our data centre locations. We wanted monitoring and measurements and we wanted of course to automate the provisioning process.
Our BNGs are Cisco equipment. We targeted two IOS XR versions, what we had and what the future deployment is going to be. And we communicated and requested various features from the vendor, mostly DHCPv6 related: we wanted the box to be able to be a server or a relay at the same time, while also introducing end user identification in DHCPv6 messages in our PPP based set‑up.
In the RADIUS, things were fine because we utilise a FreeRADIUS set‑up, fully open source, the infrastructure is controlled by us, and we created the necessary config; we actually implemented a lw4o6 profile for a user. What this profile does is just disable IPv4 during PPP, and then after a disconnection of the user the next session is created in lw4o6 mode. DHCPv6 is a huge chapter on its own. We introduced a central server because all provisioning on user CPEs happens over DHCPv6: the CPE gets the tunnel end point, the port range and the public IPv4 address that it will use, and at the time of deployment the only thing we could make work was plain old ‑‑ with a binary config. So we introduced a central DHCPv6 deployment using two servers in our data centre locations.
So this is more or less how things look in our data centre. We have four machines to accommodate everything, both the lwAFTR and the DHCPv6 functionality. The lwAFTR runs as Docker containers inside the servers. In terms of service provisioning, our provisioning happens in a few selected BNGs initially. We developed custom scripts and we get a batch of targeted users, because we need to select users that have the proper CPE, and after we select the batch we just provision them, perform a disconnect, and after that the user transparently moves to a lw4o6 set‑up. In terms of automation, we needed to be able to generate a binding table in the lwAFTRs with a matching DHCPv6 configuration, and since that is binary, we needed to automate that to make it less error‑prone, so we developed all the necessary scripts. We developed scripts to upload and commit everything in the lwAFTRs, and we provided a tool to the help desks to help them revert the user back to a dual stack set‑up in case of problems.
In terms of monitoring, we utilised open source software again. There is a software called OpenNTI, it is again provided by Juniper. It just packages a lot of open source components, InfluxDB, Grafana, and we added some stuff on our own: we added the measurements for lw4o6 and we contributed everything upstream.
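A small sketch of what generating such a table could look like, given the numbers mentioned in the talk (1024 ports per subscriber, low system ports excluded). It assumes contiguous port sets whose first set starts at port 1024, which yields 63 subscribers per shared IPv4 address; `port_sets` and `build_bindings` are hypothetical helpers, not the operator's actual scripts.

```python
from ipaddress import IPv4Address, IPv6Address
from itertools import product
from typing import Iterable, Iterator, List, Tuple

PortSet = Tuple[int, int]  # (first port, last port), both inclusive


def port_sets(ports_per_subscriber: int = 1024,
              first_port: int = 1024) -> Iterator[PortSet]:
    """Yield contiguous port sets, skipping everything below `first_port`."""
    last_start = 65536 - ports_per_subscriber
    for start in range(first_port, last_start + 1, ports_per_subscriber):
        yield (start, start + ports_per_subscriber - 1)


def build_bindings(
    ipv4_pool: Iterable[IPv4Address],
    cpe_endpoints: Iterable[IPv6Address],
) -> List[Tuple[IPv6Address, IPv4Address, int, int]]:
    """Pair each CPE tunnel end point with the next free (IPv4, port set) slot.

    The pairing is an explicit table, with no algorithmic relationship
    between the IPv6 side and the IPv4 address and port set.
    """
    slots = product(ipv4_pool, port_sets())
    return [(cpe, ipv4, low, high)
            for cpe, (ipv4, (low, high)) in zip(cpe_endpoints, slots)]
```

The same list could then be rendered twice, once as lwAFTR binding table entries and once as the matching DHCPv6 configuration, so that both always agree.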
So this is how the situation looks in the production network right now. It is actually a measurement from Friday, when I left the office. We do have 12 BNGs in production, we have 18,500 users, and our return traffic currently peaks at approximately 6 gigs. So the damn thing seems to work.
So, are there any problems in this set‑up? We did face challenges, of course. There are users with services over IPv4, mostly IP cameras; it was expected that these would not work okay. We revert them back to a dual stack set‑up. We faced a lot of CPE dragons; having to do with CPEs is really hard. Also, since our deployment is still not global, it is in selected BNGs, we face issues in case of DSLAM port migration for users, but this is not an issue so far. The scalability of the current DHCP deployment is unknown; the folks told us they have no experience with a configuration like our own to support this kind of set‑up. And we do not have the possibility for a static IPv6 offering in this current set‑up. We also faced lwAFTR issues, performance issues mostly, in the live traffic, but with the help of our vendor we mitigated all these problems. And originally we faced a higher risk: when we had the old IOS versions we needed to point all users to the central set‑up, not only the lightweight ones, so we affected service globally, but we don't face that any more.
I didn't mention we have CPE issues, I think I am repeating myself, okay. Of course, since tunneling is involved, fragmentation and MTU are always a consideration. We have no noticeable issues on that. Of course we support jumbo frames in our core network, but in the last mile we do have a 1,500 MTU in the PPP sessions. There are some interesting properties that help an operator in this lw4o6 set‑up. There is no dependency between IPv4 and IPv6, but it's a good idea to plan ahead. There is flexibility in routing: we have implemented the lwAFTR functionality using Anycast, so we distribute load evenly and beautifully across our data centre locations. We can also perform traffic engineering if need be for the IPv4 ranges. And we can relatively easily add extra capacity, because the whole implementation of the lwAFTR is actually a VNF on a stick.
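On the MTU point, the arithmetic for IPv4‑in‑IPv6 tunneling is simple enough to write down. The sketch below assumes a plain 40‑byte IPv6 outer header with no extension headers and IPv4 and TCP headers without options; it illustrates the general overhead and says nothing about how this particular network is tuned.

```python
IPV6_HEADER = 40  # fixed IPv6 header in bytes
IPV4_HEADER = 20  # IPv4 header without options
TCP_HEADER = 20   # TCP header without options


def max_inner_ipv4_packet(link_mtu: int) -> int:
    """Largest IPv4 packet that fits in one IPv6 tunnel packet on this link."""
    return link_mtu - IPV6_HEADER


def max_tcp_mss(link_mtu: int) -> int:
    """Largest TCP payload per segment that avoids fragmenting the tunnel packet."""
    return max_inner_ipv4_packet(link_mtu) - IPV4_HEADER - TCP_HEADER
```

For a 1,500 byte last‑mile MTU this gives an inner IPv4 packet of at most 1,460 bytes and a TCP MSS of 1,420 bytes.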
So, what do we plan to do in terms of future work:
We want to lighten and offload our current CGN set‑up. The idea was to have the lw4o6 deployment before having to go to a CGN set‑up. We want to move all our BNGs to the new target version; by the way, the version that contains all features is in the X dot 2 branch of IOS XR. And we plan of course to evaluate and implement a centralised Kea DHCP set‑up that will allow us to give a static IPv6 offering, which is something we are really interested in doing. Of course we need to expand the deployment, put more users in this set‑up, and, not actually future work, we are currently doing it, we want to promote this deployment and we are really interested in hearing the opinions of other network operators. We want input on whether anyone else is thinking of going forward with such a deployment. We hate being alone with this; even the management does not feel comfortable with this. It seems to work okay, but on the other hand, we wouldn't like to be alone in this.
We can also provide different classes of service. Currently our offering is a shared IPv4 address with 1024 ports. We could have a variation of this service using more ports, and have subscribers move to different services, for example. We are thinking of exposing the IPv4 port range to the end users to facilitate port forwarding and stuff like that.
Finally, I think we will go with configuration and operation via YANG, although our operation currently is automated and it seems to work pretty fine as it is. And the idea is also to support RFC 8026, the unified IPv4‑in‑IPv6 CPE, on all our future CPEs, of course.
Before accepting any questions, I would like to say that this was a multi‑year effort project. There was some very interesting work, very innovative work and we collaborated with elite engineers from vendors and I would like to thank each and every one of these colleagues, not only the selected vendor, but also other vendors, the interaction was excellent and I would like to thank those people, they know who they are. So here I am, shoot in terms of questions. And I will be here the entire week for anyone that wants to share opinions and discuss such a set‑up.
CHAIR: Thank you, Kostas.
SPEAKER: Hello, Jan Zorz. I have two questions. Are you using your own CPE or which vendor did decide to implement this stuff?
KOSTAS ZORBADELOS: I will tell you the vendor off‑line, I think. It's not a good idea to share names here. All I can say is the communication is difficult.
JAN ZORZ: The second question: as far as I remember, we defined in RFC 6346, which I am co‑author of, A+P stateful, which led into the lw4o6 that you are using, and I am glad that this work is being used out there in the field. But I'm wondering how you are assigning the port ranges. Is this a contiguous port range or are you scattering ports across the whole ‑‑
KOSTAS ZORBADELOS: In lw4o6 the ports should be contiguous. We have pre‑provisioned everything. Whenever we enable a service on a BNG box, we pre‑provision all possible prefixes inside the BNG, both on the DHCP and the lwAFTR binding tables. So this is what actually happens.
JAN ZORZ: This was our concern, because we defined the contiguous port range and we also defined the scattered port range.
KOSTAS ZORBADELOS: Yes. This is also an algorithm described in the MAP‑E RFC.
JAN ZORZ: Yes. But do you see any problems with ‑‑ with DNS and source port randomisation?
KOSTAS ZORBADELOS: No, we haven't noticed any problems, and the implementation in the CPE also seems to work fine with the restricted port range in NAT, but you never know what we will face next.
JAN ZORZ: You will probably allocate more ports?
KOSTAS ZORBADELOS: No, a subscriber gets 1024 ports and that is it. Yes, the selection, why 1024 ports and not less or not more? We didn't perform any measurements, to be honest, with that. We just went with what we thought would be a good idea, even with, for peer to peer sessions. For the time being we don't seem to have any problems and by that I imply we do not have cases from customers in our help desk for these ‑‑ for this issue. We face problems regarding IPv4 or CPE stuff.
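To illustrate the arithmetic behind this answer, here is a minimal Python sketch. It assumes the 65,536 ports of one shared IPv4 address are cut into contiguous blocks of 1024, which yields the 64 blocks per address that come up later in the questions. The function name and the block numbering are hypothetical, and whether the block starting at port 0 is actually handed to a subscriber is not stated in the talk.

```python
PORTS_PER_ADDRESS = 65536      # size of the TCP/UDP port space
PORTS_PER_SUBSCRIBER = 1024    # figure given in the talk


def port_block(index, block_size=PORTS_PER_SUBSCRIBER):
    """Return (first_port, last_port) of the contiguous block with this index.

    Hypothetical numbering: block 0 starts at port 0, block 1 at 1024, and so on.
    """
    blocks = PORTS_PER_ADDRESS // block_size
    if not 0 <= index < blocks:
        raise ValueError(f"block index must be in 0..{blocks - 1}")
    first = index * block_size
    return first, first + block_size - 1


# Number of equal-sized blocks that fit into one shared IPv4 address.
blocks_per_address = PORTS_PER_ADDRESS // PORTS_PER_SUBSCRIBER
```

Calling `port_block(1)` gives the range 1024 to 2047; a service variant with more ports per subscriber would simply use a larger `block_size` and so fit fewer subscribers per address.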
JAN ZORZ: Thank you for this presentation, keep us informed of development and experience.
KOSTAS ZORBADELOS: Thanks a lot.
SPEAKER: Jordi Palet. Thank you very much for this work, it's excellent. I will talk about this, if you allow me, in different fora.
KOSTAS ZORBADELOS: By all means.
SPEAKER: You know I have been working for at least three years with the CPE vendors and in IETF to promote the support of not just this transition mechanism, but hopefully before the end of the year we get it for every CPE, so that will be helpful also for you. My question basically is, I know you mentioned you didn't want to have state, but you also have a cellular network, right?
KOSTAS ZORBADELOS: What I am talking about here is the fixed network
SPEAKER: Are you also planning the IPv6 set‑up in your cellular network and, if that is the case, have you considered, in case you offer a service like, for example, fixed line with backup by LTE or something like that, how you will manage the different transition mechanisms? Because I guess in the cellular network you will go for 464XLAT.
KOSTAS ZORBADELOS: I cannot make an authoritative comment on that. I really have no idea how we will go forward in the mobile network. It remains to be seen. But we do face a real issue of IPv4 address exhaustion, so our main focus is to lighten or just abolish the CGN set‑up and put as many users as possible on this to conserve IPv4 addresses in the fixed network. And of course, the conversation will take place eventually about what will happen in the cellular network, but they have performed NAT there for a long time and it's been quite some time since the addresses were exhausted in the mobile part, so they also use a CGN set‑up there.
SPEAKER: I have customers in the same situation and they have decided ‑‑ let's talk off‑line about this.
SPEAKER: Blake Willis with Eyebrows. Thanks very much for this talk. Have you had any law enforcement traceability requests on your IPv4 addresses because I know at least in France, Belgium and Europol the recommended number is not more than eight users per public IPv4 and recommended really please dear God, no more than 16, and you are at 64, I don't know what the Greek regulator says about that?
KOSTAS ZORBADELOS: It's the first time I hear that regulators impose a number, or a ratio, on IPv4 usage.
SPEAKER: So this is actually ‑‑ this is a recommendation coming from the French police, for example, it's not anything to do with the telecom regulator.
KOSTAS ZORBADELOS: In any case I don't think eight against 16 makes a real difference. The main issue is when ‑‑ when an event occurs and you want to provide input about the customer we would need to have port information as well, from the law enforcement agency. So in order to ‑‑ a single IP does not identify a customer any more, you would need to have port information as well, and if we are given this information, we are able from the binding table to provide all necessary information. So I saw there was also a conversation with ‑‑ from Jan and Europol about that, about whether the law enforcement agencies should provide also port identification for subscribers. I really hope it will become mandatory or so or a rule.
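The lookup described in this answer can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the operator's actual tooling: the `Binding` record, its field names and the `find_subscriber` helper are hypothetical, and the addresses are documentation prefixes. It only shows why the port is needed, since several subscribers share the same public IPv4 address.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv6Address
from typing import Iterable, Optional


@dataclass(frozen=True)
class Binding:
    """One hypothetical binding-table entry: shared IPv4 + port range -> subscriber."""
    ipv4: IPv4Address
    first_port: int
    last_port: int
    subscriber_ipv6: IPv6Address  # IPv6 address of the subscriber's CPE


def find_subscriber(table: Iterable[Binding], ipv4: str, port: int) -> Optional[Binding]:
    """Return the entry whose address and port range cover (ipv4, port), if any."""
    addr = IPv4Address(ipv4)
    for entry in table:
        if entry.ipv4 == addr and entry.first_port <= port <= entry.last_port:
            return entry
    return None


# Example table: two subscribers behind the same public address.
TABLE = [
    Binding(IPv4Address("192.0.2.1"), 1024, 2047, IPv6Address("2001:db8::1")),
    Binding(IPv4Address("192.0.2.1"), 2048, 3071, IPv6Address("2001:db8::2")),
]
```

With only the address `192.0.2.1`, both entries match; adding the source port, for example 2500, narrows the answer to a single subscriber.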
SPEAKER: It's kind of been up to the operators, I have seen, to educate the LEOs about NAT and say here is what we can provide but you need to ask, kind of be a bit pre‑emptive about it.
KOSTAS ZORBADELOS: In our case, when this situation arises we just mention to the law enforcement agency, okay, this is behind the CGN, no hope, and in this case we can share what is available. That is the 63 other people sharing the IP.
SPEAKER: For German government. One question is, you describe this binding table, does this have a lease time or...? Is it fixed, or what is the time the binding is valid?
KOSTAS ZORBADELOS: The binding table is generally fixed, but of course you would need to keep a history of the binding table in time. It's a good idea, I think, to have it in a version control system. In our case, the binding table, as our deployment stands, just gets extra entries each time. So we have additions and we do not change what is already there, but in any case I think the solution to this is to version control your binding table.
SPEAKER: So this means that the IP address is somehow fixed for the customer?
KOSTAS ZORBADELOS: Yes.
SPEAKER: This is interesting. And you said it is somehow based that you took this technology due to you are a daughter of German telecom?
KOSTAS ZORBADELOS: Yes, Deutsche Telekom is behind lw4o6 RFC, a good colleague is author of that and we cooperated with the entirety.
SPEAKER: In Germany we have two laws, one is as a customer you have the right to use your own CPE, I am not quite sure that this ‑‑
KOSTAS ZORBADELOS: Exactly right. We need to provision this service only to people that have the proper CPE.
SPEAKER: And another issue is that a customer and end customer, a private one has to have a possibility to change his IP.
KOSTAS ZORBADELOS: So what do you do in static IP offerings?
SPEAKER: Then he intended to have one. But normally, he has to have the right to change it.
KOSTAS ZORBADELOS: Yes.
SPEAKER: So in this he can't.
KOSTAS ZORBADELOS: As it stands now, in the current deployment, no, because a central DHCP set‑up will offer the same parameters, let's say, to the same CPE. But I think we can accommodate that in the next version of the DHCPv6 deployment, when we utilise Kea, but our idea and focus is to go with what is also a good recommendation: provide static prefixes, even IPv6 prefixes, to customers. We are not in Germany, yes.
CHAIR: Thank you. Well ‑‑
Now we have the lightning talks and I must introduce you this nice toy that RIPE NCC has provided us, this is the first time we will use it for the lightning talks as they are lightning, we will have this funny light here, the first five minutes will be green, the next four minutes will be yellow and the last minute will be red and once you run out of time the lightning talk is finished. So if you have ‑‑ if you need more space for the presentation you have less space for the questions, okay. And our next speaker is Louis Poinsignon.
LOUIS POINSIGNON: Hello again everyone. So I am going to talk about all the junk traffic on the Internet. So again, I am Louis, I am a network engineer at Cloudflare and I decided to dig into the terabytes of flows we sampled. Recently we launched 1.1.1.1, which is a new DNS resolver, and it's a special IP range, which APNIC Labs allowed us to announce, so 1.1.1.0/24 and 1.0.0.0/24, and these prefixes are known to receive unwanted traffic, mostly because of misconfiguration, misuse, proxies, just internal use. So the routing history: back in 2010, RIPE and Merit ran some tests to see if the prefix could be distributed, if it could be used, and it actually congested the ports so they didn't distribute it. And then in the next years Google and YouTube just announced it on the Internet. So we announced it a bit before the official launch of our DNS resolver on 1st April, we announced 15 days before, so mid‑March.
And the traffic levels were 10 gigabits per second of free traffic, just coming in, no service, just 10 gigabits per second. Back in the years, like eight years ago during the last test, it was mostly around 120 megabits per second on the same prefix, and 101 on the whole /8 back in 2014. And one thing to notice is 1 gigabit per second on 1.1.1.1. So these are traffic graphs.
And traffic type. We see mostly all the variants of http, port 80, 8,000, whatever. Some UDP traffic too. DNS, also http ports that we see. Also 514, so we assume some people are sending us syslogs. So there are some devices, possibly a firewall or IDS, just sending us their logs. And also we see a lot of traffic from CPEs on 188.8.131.52, one of the most queried.
Traffic source, it's mostly China, like 96%, US 2% and the rest of the world is 2%. So you may think having an Anycast network could help absorb the 10 gigabits per second of traffic, but it only reaches one or two data centres because of the landing points.
Bursts and patterns. This is what we noticed, we did an analysis per IP, and it's actually very much a pattern, every time it's the same: every day there is the same amount of traffic from the same IPs in China, it's on .7, .8, .9, .10, probably like a load balancer on port 8 which varies between 5 and ‑‑ we also see some short bursts like NTP, which is probably misconfiguration. And memcached also, a lot of DDoS attacks, and DHCP spikes for some reason. Before launching, around a thousand packets per second.
What changed: 10 years ago, at NANOG 69, in the analysis it was 10 to 20 times less traffic than now, and we still see some iperf traffic too. Now we have reached around 10% of legitimate traffic, which is actually legitimate DNS queries, because a lot of them were legitimate DNS queries but they were broken, so we just could not respond.
Just a small word: I talked about unwanted traffic, but there was also traffic we wanted but could not get. Thanks to RIPE Atlas we could test many networks before launching and we realised some people could not actually reach us, mostly CPE null routes or responded 1.1.1. 30 major service providers around the world were having major issues, using 1.0 for internal purposes. People were very nice, they cleaned their configuration very fast and we managed to get most of it working for the 1st April. So this is actually the reason ‑‑ so this is for instance posted on the Internet, saying you should use 1.0.0 and nobody is using it. Even in books, use fictional IP addressing. Yeah. So a Cisco book, just 1.1 and 2.2.2, which is a range.
Conclusions: trying to clean up, a lot of providers were not accepting it and at the same time there was a lot of misconfiguration just creating bad traffic. And we realise we may be receiving some PII, like syslogs; if somebody leaked the prefix they could actually listen to some information like that. Anyway. Any questions?
CHAIR: We only have time for two quick questions. No.
SPEAKER: So I was wondering, do you just listen to how much traffic you get on the port, or did you try to put a web server on it and just see what requests you are getting?
LOUIS POINSIGNON: We also serve ‑‑ so yeah, for the http requests we also serve 1.1.1, it's an http website as well, on how to configure the DNS, so we saw in the logs it's actually proxies, people just trying to proxy to us, and yeah, we saw a lot of connections like that. Regarding syslogs it was a bit ‑‑ we didn't know what we should do, should we see the payload or not. We asked a lawyer, what about syslogs? And we didn't do it, because maybe the most part could contain PII. We don't know. But, yeah...
CHAIR: Our next speaker is Andreas Reuter.
Andreas Reuter on measurement of RPKI adoption.
ANDREAS REUTER: This is an update on the talk I gave a year ago at RIPE 74, and the core question we asked ourselves is: are there any ASes out there actually dropping invalid routes already? As a quick recap, at the RIPE meeting in Budapest I presented an analysis of existing work, showed why it's not sufficient, and a new methodology based on active experiments. If you would like to know about it, the slides have a link to the paper and the presentation from last year. I am back to give you a rundown of what happened in the last year, what we found, what observations we made, and we have a new monitoring platform we would like you to try out. So on to the results. One year ago we found three ASes that were actually dropping invalid routes. Now we have over 40 ASes dropping invalid routes, and it's important to know this is a lower bound, because we can't measure every AS, we can't even measure most, we can measure a few hundred at most. There are probably more ASes out there. But the question is what happened from three to 40? The answer is route servers happened. A lot of IXPs have route servers that offer a filtering feature where you can choose not to receive invalid routes. For a lot of them this used to be an opt‑in feature and some have switched this from opt‑in to opt‑out, like France‑IX and AMS‑IX, so we see uptake in ASes because of the switch. We acknowledge this is technically not the AS actually doing route origin validation. The route server is doing it, but from our end it's transparent, so we are still counting this as ROV deployment, and we have got to be aware it's a route server doing it.
Some of the observations we have made while fiddling with our experiments is that some vendor implementations of ROV are still pretty faulty. One thing we found is an implementation that could not properly revalidate a route, so if you have an existing route and the RPKI state of it changes, it might not pick that up. And leading into that, even if you withdraw a route and you reannounce it, and think the revalidation bug shouldn't occur because you reannounced it, it sometimes will still give you the wrong validation result, so a route that is invalid might be treated as valid, which is really bad. So this is hopefully being worked on and hopefully it will get better.
Another thing that we found is that the BGP data from public route collectors is very useful but it's kind of strange. There is a lot of weird things going on with vantage points that export routes so they sometimes they disappear for a few hours or days or weeks, a lot of them give you partial feeds so you don't really know what you are getting. This is not new, but I just think it's worth repeating this because if you use that data you need to know it's incomplete.
And another thing we found is that ROV filtering is in flux: ASes start it, stop it again and do it inconsistently, so it's good we have longitudinal monitoring in place instead of just a snapshot.
So now on to the monitoring website, that is what I really want to show you, a website that shows the results of our measurements. To explain the measurements in full you would have to read the paper, but in short we announce two prefixes. One is announced in a way that it's always valid; that announcement is the control variable, so to speak. The other one switches from valid to invalid and we observe how ASes react to this on the Internet. And if you are interested in RPKI and in monitoring the RPKI, you might find it useful that we switch these ROAs periodically each day. You can use this as almost a kind of ROA beacon; we know people who use this to analyse RPKI cache consistency, and we are also using it to test open source implementations of ROV. If you are interested in RPKI this might be something to look at.
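For readers unfamiliar with why changing a ROA flips a route between valid and invalid, here is a minimal Python sketch of route origin validation in the spirit of RFC 6811. It is a simplified illustration, not the speakers' measurement code: the function name, the tuple layout for ROAs and the example AS numbers and prefixes are assumptions, and special cases such as AS 0 and AS_SET origins are left out.

```python
from ipaddress import ip_network


def validate(prefix, origin_as, roas):
    """Classify a route as 'valid', 'invalid' or 'not-found'.

    roas is an iterable of (roa_prefix, max_length, asn) tuples.
    A ROA covers the route if the route's prefix lies inside the ROA prefix.
    The route is valid if some covering ROA names the same origin AS and
    allows the route's prefix length; invalid if it is covered but nothing
    matches; not-found if no ROA covers it at all.
    """
    route = ip_network(prefix)
    covered = False
    for roa_prefix, max_length, asn in roas:
        roa_net = ip_network(roa_prefix)
        # subnet_of() raises TypeError on mixed IPv4/IPv6, so check first.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if asn == origin_as and route.prefixlen <= max_length:
            return "valid"
    return "invalid" if covered else "not-found"


# The same announcement judged against two different ROA sets.
roa_matching = [("192.0.2.0/24", 24, 64500)]
roa_other_as = [("192.0.2.0/24", 24, 64501)]
```

Here the announcement of `192.0.2.0/24` from AS 64500 is valid under `roa_matching` and invalid under `roa_other_as`, which is the kind of daily flip the experiment prefix goes through while the anchor prefix stays valid.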
So why did we build this website? Because it allows us ‑‑ well, we want to give the community the means to assess the state of deployment. And even though we are showing a lower bound and can't capture everything, this is still something, and hopefully seeing the number of ASes grow will maybe encourage some people to deploy it themselves.
So just to say, the URL of the website is in the title, you can try it out right now. What comes next is a few screenshots to show you what you are looking at, because it has to make sense. This is the main content of the website, it's a table and shows for each row an AS, name and number, and there is a confidence value associated with each AS. The confidence is simply a measure of how consistently we see this AS filter. You can look into the details for each AS: expand the row and then you will see which vantage point filters, how many times we have observed it to be filtering (again every vantage point has a confidence value as well), when it was last measured and last marked as filtering. And if you would like to know why it was marked as filtering, this will give you the details in yet another table: the actual routes exported by the vantage point to the route collector. On 30th April, for these two prefixes, we see these routes with the time stamps, and you can see that the AS path for the anchor prefix, the one that doesn't change, that is always valid, is unchanged, whereas for the experiment prefix, which is always changing, you have a few time stamps of undefined, which means there is no route because it was filtered. And now I hope you can maybe try it out yourself and give us some feedback. If your AS is on the list and it shouldn't be on the list, or the opposite, please give us some feedback, or come and talk to us off‑line. Time for any questions?
CHAIR: Thank you.
Are there any questions from the audience? No questions. I guess we have to play with this tool before. So if there are no more questions, we are going to the third presentation today, it's Jen Linkova from Google.
JEN LINKOVA: Hello, can you hear me? So my name is Jen and I spent the best years of my life sitting in a cold dark room configuring routers, and as a result I came up with some ideas, or some perceptions, of how to do that. And because I used to be a researcher before I got into this IT show business, I knew that it is nice to have a co‑author who is famous, it gives you more credibility, so I got lucky enough to find out that some of my ideas have been documented already, actually a couple of centuries ago, so I am going to have some supporting statements to my observations from some classic work. So, Shakespeare's guide to network maintenance. Let's have some fun. Only half of the slides, every second slide, is supposed to be taken seriously, and I marked them with a Shakespeare image, so don't take my slides seriously, please. And if you want to take something seriously, go back to your emails.
The goal is, I have a kind of checklist of what to do or not to do and I want to share it with you so you can do your network maintenance very quickly, very efficiently, and spend this free time on writing postmortems.
So rule number 1 is: Keep communication to a minimum. Don't ask your colleagues, they are very busy, I know, because I am. Don't ask them to review what you are going to do. Some people send notifications, I hate getting them. I am going to change the network tomorrow 12 p.m. Sydney time. Who cares, right? 12 p.m. is okay, good enough.
Second rule: You need to prove that you are very important to the company, right, it's called job security. So make sure only you can do that maintenance, so you are irreplaceable. Make your documentation very, very brief, you all know how to turn up a router, if you are turning up that BGP session you will know how to do that. Don't try to specify all those commands, they can change from vendor to vendor so your documentation might be outdated.
Third rule: The Internet keeps changing, right, your network keeps changing too, but if you prepare something in advance, you might get in trouble, because by the time you are going to push it into the network your network might have something different configured, so you will spend more time on troubleshooting than if you just came and did the maintenance by improvising. And don't ever automate, because as soon as you start automating you start assuming your network is in a particular state.
It could not be two minutes.
Third: Be brave. Never test anything, right? Your vendors write the documentation.
Four: Everything should work as they say and it's just a BGP config change. Yeah, I have seen outages caused by an interface description change but it's ‑‑ it never happens, right?
Again, one of the most important points of my presentation, again be ‑‑ make your plan very, very clear. It's better actually not to verify your changes at all, but if you do, again make sure only you can do that.
Check connectivity is very good. We all know how to do that. Verify everything works, it's our verification plan. All these pings, you start telling people just ping something, they need to remember they have to ping over v4 or v6, it's too complicated. Some people run some scripts to verify if the network is running. Just don't do this. Also, again, my advice is to spend less time on the maintenance, more time on your real life, or postmortems, it depends how lucky you are. So how to optimise your time? Don't spend too much time verifying. Some people, I have seen it, do strange stuff and see ‑‑ and then verify after. Too much time, another half an hour at least. Also it's a very good idea to never verify the actual traffic flows. Look in the configuration, if all the lines you are supposed to put there are there, good enough. Asking customers if they see the result, no, no. Don't distract them, right, we talked about the emails. Another thing you should never, ever do is to check that things which are supposed to fail actually fail. If you change firewalls, never verify you cannot send spoofed traffic into your network after you change it. You might find some unpleasant things. This is my favourite one. Never have a rollback plan. Why would you even start your maintenance if you expect you need to roll back, right? If you have some very nice colleagues who insisted on having a rollback plan, never test it. We all know you can roll back your software version, right? And only plan for rollback if you have to do it right now: you did the maintenance and you found something broke. Nobody expects you to roll back changes 24 hours after. And again, job security is very important. Only you can do that, so if you are asked for a rollback plan, it says: roll back all changes made. Clear. Right?
Rollback 1 is also very good, especially if you need to roll back two weeks after. And all this "delete this particular line of the configuration", it makes your rollback plan very complicated and scares people, and there are many opportunities for copy and paste mistakes. And actually the last slide is: after you do your maintenance, turn off your phone, do forget your phone, go home or go to the pub, it's Friday. No, it's not Friday, but I am done anyway. Thank you.
CHAIR: Thanks. I guess you are not taking questions because you already left. Thanks for that final presentation. And well, this ends the presentations today. Let me just give you some announcements.
Remember, there is a BoF here from 6:00 to 7:00, it's about operators and the IETF. There is also an informal meeting for network operation group organisers who are people, who are going to do a future organisation, in the room first floor in the old part of the venue. This is a small informal meeting. If you want to speak about NOGs, you are invited, you all are invited to come to the session on Friday at 9:00, from 9:00 to 10:00, we are going to speak about NOGs, a community within the RIPE community so please all come. And then we have the social this evening at the museum, it's at 9:00 p.m., there will be buses leaving from the street in front of the palace at 8:45 and from 8:45 on, every 15 minutes.
So thanks everybody. I hope you enjoy the sessions and see you around.
LIVE CAPTION BY AOIFE DOWNES, RPR