Help troubleshooting L2TP over IPSEC VPN connections.



  • We have the VPN set up, and it currently works for 3 out of 4 users. I have been working on the problematic connection but can't figure out how to solve the issue. I'd really appreciate any help you can provide.


    L2TP over IPSEC VPN

    VPN Server: EdgeRouter PoE 5 v1.10.5
    Client: Windows 10 v1709 build 16299.579

    Windows Side
    The client is reaching the VPN server, even though the Windows error says the server is unreachable (logs below). I don't think the problem lies on the Windows side, but I have still checked the Windows setup: everything is set according to the documentation and matches the other, working clients. The machine has been rebooted several times, and I have even uninstalled and reinstalled the WAN Miniport interfaces.

    Edge Router Side
    Full log - sudo swanctl --log while trying to connect.

    06[NET] received packet: from USER_PUBLIC_IP[500] to EDGE_ROUTER_IP[500] (408 bytes)
    06[ENC] parsed ID_PROT request 0 [ SA V V V V V V V V ]
    06[ENC] received unknown vendor ID: 01:52:8b:bb:c0:06:96:12:18:49:ab:9a:1c:5b:2a:51:00:00:00:01
    06[IKE] received MS NT5 ISAKMPOAKLEY vendor ID
    06[IKE] received NAT-T (RFC 3947) vendor ID
    06[IKE] received draft-ietf-ipsec-nat-t-ike-02\n vendor ID
    06[IKE] received FRAGMENTATION vendor ID
    06[ENC] received unknown vendor ID: fb:1d:e3:cd:f3:41:b7:ea:16:b7:e5:be:08:55:f1:20
    06[ENC] received unknown vendor ID: 26:24:4d:38:ed:db:61:b3:17:2a:36:e3:d0:cf:b8:19
    06[ENC] received unknown vendor ID: e3:a5:96:6a:76:37:9f:e7:07:22:82:31:e5:ce:86:52
    06[IKE] USER_PUBLIC_IP is initiating a Main Mode IKE_SA
    06[ENC] generating ID_PROT response 0 [ SA V V V ]
    06[NET] sending packet: from EDGE_ROUTER_IP[500] to USER_PUBLIC_IP[500] (136 bytes)
    01[NET] received packet: from USER_PUBLIC_IP[500] to EDGE_ROUTER_IP[500] (228 bytes)
    01[ENC] parsed ID_PROT request 0 [ KE No NAT-D NAT-D ]
    01[IKE] remote host is behind NAT
    01[ENC] generating ID_PROT response 0 [ KE No NAT-D NAT-D ]
    01[NET] sending packet: from EDGE_ROUTER_IP[500] to USER_PUBLIC_IP[500] (212 bytes)
    05[NET] received packet: from USER_PUBLIC_IP[4500] to EDGE_ROUTER_IP[4500] (76 bytes)
    05[ENC] parsed ID_PROT request 0 [ ID HASH ]
    05[CFG] looking for pre-shared key peer configs matching EDGE_ROUTER_IP...USER_PUBLIC_IP[192.168.0.16]
    05[CFG] selected peer config "remote-access"
    05[IKE] IKE_SA remote-access[63] established between EDGE_ROUTER_IP[EDGE_ROUTER_IP]...USER_PUBLIC_IP[192.168.0.16]
    05[IKE] DPD not supported by peer, disabled
    05[ENC] generating ID_PROT response 0 [ ID HASH ]
    05[NET] sending packet: from EDGE_ROUTER_IP[4500] to USER_PUBLIC_IP[4500] (76 bytes)
    09[NET] received packet: from USER_PUBLIC_IP[4500] to EDGE_ROUTER_IP[4500] (444 bytes)
    09[ENC] parsed QUICK_MODE request 1 [ HASH SA No ID ID NAT-OA NAT-OA ]
    09[IKE] received 3600s lifetime, configured 0s
    09[IKE] received 250000000 lifebytes, configured 0
    09[ENC] generating QUICK_MODE response 1 [ HASH SA No ID ID NAT-OA NAT-OA ]
    09[NET] sending packet: from EDGE_ROUTER_IP[4500] to USER_PUBLIC_IP[4500] (204 bytes)
    13[NET] received packet: from USER_PUBLIC_IP[4500] to EDGE_ROUTER_IP[4500] (60 bytes)
    13[ENC] parsed QUICK_MODE request 1 [ HASH ]
    13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
    13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
    13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
    13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
    13[IKE] unable to install IPsec policies (SPD) in kernel
    13[KNL] deleting policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out failed, not found
    13[KNL] deleting policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in failed, not found
    13[KNL] deleting policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out failed, not found
    13[KNL] deleting policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in failed, not found
    13[IKE] sending DELETE for ESP CHILD_SA with SPI 740d890e
    13[ENC] generating INFORMATIONAL_V1 request 3087336472 [ HASH D ]
    13[NET] sending packet: from EDGE_ROUTER_IP[4500] to USER_PUBLIC_IP[4500] (76 bytes)
    14[NET] received packet: from USER_PUBLIC_IP[4500] to EDGE_ROUTER_IP[4500] (76 bytes)
    14[ENC] parsed INFORMATIONAL_V1 request 2912129370 [ HASH D ]
    14[IKE] received DELETE for ESP CHILD_SA with SPI 740d890e
    14[IKE] CHILD_SA not found, ignored
    04[NET] received packet: from USER_PUBLIC_IP[4500] to EDGE_ROUTER_IP[4500] (92 bytes)
    04[ENC] parsed INFORMATIONAL_V1 request 1035896583 [ HASH D ]
    04[IKE] received DELETE for IKE_SA remote-access[63]
    04[IKE] deleting IKE_SA remote-access[63] between EDGE_ROUTER_IP[EDGE_ROUTER_IP]...USER_PUBLIC_IP[192.168.0.16]
    

    Checking the logs, I can see everything works properly until these messages start to appear.

    13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
    13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
    

    It can't install the policy for reqid 35 because there is an existing reqid (14) which has the same policy.
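When the log is long, the conflicting pairs can be pulled out mechanically. The following is a minimal sketch, assuming the message format is exactly the one quoted above; the function name `find_reqid_conflicts` and the regular expression are illustrative, not part of strongSwan.

```python
import re

# Matches the charon "[CFG] unable to install policy" message quoted above.
# The pattern is an assumption based only on those sample lines.
CONFLICT_RE = re.compile(
    r"unable to install policy (?P<src>\S+) === (?P<dst>\S+) (?P<dir>in|out|fwd) "
    r".*?for reqid (?P<new>\d+), the same policy for reqid (?P<old>\d+) exists"
)


def find_reqid_conflicts(log_text):
    """Return sorted, de-duplicated (new_reqid, existing_reqid, direction, src, dst) tuples."""
    found = set()
    for line in log_text.splitlines():
        m = CONFLICT_RE.search(line)
        if m:
            found.add((int(m.group("new")), int(m.group("old")),
                       m.group("dir"), m.group("src"), m.group("dst")))
    return sorted(found)


sample = """\
13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
"""

conflicts = find_reqid_conflicts(sample)
for new, old, direction, src, dst in conflicts:
    print(f"reqid {new} blocked by reqid {old}: {src} -> {dst} ({direction})")
```

Repeated log lines collapse into one entry per direction, so the output shows which existing reqid to look for in the SA listing.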

    Indeed there is: remote-access #14 is a CHILD_SA of remote-access #28.

    remote-access: #28, ESTABLISHED, IKEv1, 2dba0e93f1dc2f3c:4a212e556a07f9b7
      local  'EDGE_ROUTER_IP' @ EDGE_ROUTER_IP
      remote '192.168.0.8' @ USER_PUBLIC_IP
      AES_CBC-256/HMAC_SHA1_96/PRF_HMAC_SHA1/ECP_384
      established 75540s ago
      remote-access: #14, INSTALLED, TRANSPORT-in-UDP, ESP:AES_CBC-128/HMAC_SHA1_96
        installed 75207 ago
        in  c9a20ab8, 2965565 bytes, 32775 packets,  8314s ago
        out 8fadd716, 44934358 bytes, 50838 packets,  8268s ago
        local  EDGE_ROUTER_IP/32[udp/l2f]
        remote USER_PUBLIC_IP/32[udp/l2f]
    

    This leads me to believe the user may already be connected from another machine, but the user doesn't show as online when using show vpn remote-access.
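One way to check whether that existing CHILD_SA is still in use is to look at how long ago it last carried traffic. This is a rough sketch that parses SA listing text laid out like the output above. It assumes the trailing "Ns ago" on the in/out lines is the time since the last packet, and the helper name `idle_child_sas` is made up for illustration.

```python
import re

# "name: #id, STATE," header lines for both IKE_SAs and CHILD_SAs.
SA_RE = re.compile(r"^\s*\S+: #(\d+), (\w+),")
# "in <spi>, ... bytes, ... packets, Ns ago" traffic counter lines.
TRAFFIC_RE = re.compile(r"^\s*(in|out)\s+[0-9a-f]+,.*,\s+(\d+)s ago\s*$")


def idle_child_sas(list_sas_text, min_idle_seconds):
    """Return {child_sa_id: seconds_idle} for INSTALLED CHILD_SAs idle for at least min_idle_seconds."""
    idle = {}
    current = None
    for line in list_sas_text.splitlines():
        sa = SA_RE.match(line)
        if sa:
            # Only INSTALLED entries are treated as CHILD_SAs; anything else resets the context.
            current = int(sa.group(1)) if sa.group(2) == "INSTALLED" else None
            continue
        hit = TRAFFIC_RE.match(line)
        if hit and current is not None:
            seconds = int(hit.group(2))
            # Keep the most recent activity across the in and out directions.
            idle[current] = min(seconds, idle.get(current, seconds))
    return {sa_id: s for sa_id, s in idle.items() if s >= min_idle_seconds}


sample = """\
remote-access: #28, ESTABLISHED, IKEv1, 2dba0e93f1dc2f3c:4a212e556a07f9b7
  local  'EDGE_ROUTER_IP' @ EDGE_ROUTER_IP
  remote '192.168.0.8' @ USER_PUBLIC_IP
  established 75540s ago
  remote-access: #14, INSTALLED, TRANSPORT-in-UDP, ESP:AES_CBC-128/HMAC_SHA1_96
    in  c9a20ab8, 2965565 bytes, 32775 packets,  8314s ago
    out 8fadd716, 44934358 bytes, 50838 packets,  8268s ago
"""

print(idle_child_sas(sample, 3600))
```

With a one-hour threshold, #14 is reported as idle, which would be consistent with a leftover SA from an earlier session rather than a second active device.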

    Any idea how to fix the conflict with the duplicate policies and why it is happening?

    The only thing I haven't done is reboot the EdgeRouter, since the other users are working fine and I don't want to disrupt them.



  • Can you sign in from your system using the user's VPN credentials?

    This will give you another point to test from to see if it is a router issue.



  • @gjacobse Will try that next 🙂



  • @gjacobse I can connect without a problem from a different public IP





  • If this is what I think it is, it's something we have gone round and round with for nearly a year. We can't seem to nail it down as either an ERL or an OS issue.



  • @romo said in Help troubleshooting L2TP over IPSEC VPN connections.:

    @JaredBusch @scottalanmiller Any idea?

    Is this user trying to connect from the same IP as another user?



  • @jaredbusch said in Help troubleshooting L2TP over IPSEC VPN connections.:

    @romo said in Help troubleshooting L2TP over IPSEC VPN connections.:

    @JaredBusch @scottalanmiller Any idea?

    Is this user trying to connect from the same IP as another user?

    Generally not. They are remote / home users.



  • @jaredbusch said in Help troubleshooting L2TP over IPSEC VPN connections.:

    @romo said in Help troubleshooting L2TP over IPSEC VPN connections.:

    @JaredBusch @scottalanmiller Any idea?

    Is this user trying to connect from the same IP as another user?

    No, it is a single user connecting from home. She connected on Wednesday without a problem, but when she tried again on Thursday it was not possible.

    Logs show

    13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
    13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
    

    A new connection can't be made because a policy with the same details is already present. If we connect from anywhere with a public IP different from her home's, the VPN connection is established without a problem.



  • @romo said in Help troubleshooting L2TP over IPSEC VPN connections.:

    @jaredbusch said in Help troubleshooting L2TP over IPSEC VPN connections.:

    @romo said in Help troubleshooting L2TP over IPSEC VPN connections.:

    @JaredBusch @scottalanmiller Any idea?

    Is this user trying to connect from the same IP as another user?

    No, it is a single user connecting from home. She connected on Wednesday without a problem, but when she tried again on Thursday it was not possible.

    Logs show

    13[CFG] unable to install policy EDGE_ROUTER_IP/32[udp/l2f] === USER_PUBLIC_IP/32[udp/l2f] out (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
    13[CFG] unable to install policy USER_PUBLIC_IP/32[udp/l2f] === EDGE_ROUTER_IP/32[udp/l2f] in (mark 0/0x00000000) for reqid 35, the same policy for reqid 14 exists
    

    A new connection can't be made because a policy with the same details is already present. If we connect from anywhere with a public IP different from her home's, the VPN connection is established without a problem.

    What VPN client are you using, the default Windows one?



  • @dbeato Yes



  • @romo said in Help troubleshooting L2TP over IPSEC VPN connections.:

    @dbeato Yes

    Okay, was looking at that error on other OpenVPN clients that had issues on older versions.



  • @dbeato said in Help troubleshooting L2TP over IPSEC VPN connections.:

    @romo said in Help troubleshooting L2TP over IPSEC VPN connections.:

    @dbeato Yes

    Okay, was looking at that error on other OpenVPN clients that had issues on older versions.

    It looks like a strongSwan issue. As a temporary fix, manually restarting the IPsec VPN (restart vpn) should resolve it. Unfortunately, that seems too disruptive to do during working hours for users who are properly connected, at least until we have tested how the restart affects them.

    The strange thing is that the connection behaves as if two computers behind the same NAT were trying to reach the VPN server, when according to the user there is only a single device.



  • Here is our issue: https://wiki.strongswan.org/issues/431. It was fixed 3 years ago when version 5.3 of strongSwan came out.

    I had not checked which strongSwan version we were using; I just assumed it was something newer. Then I found that our EdgeRouter is using strongSwan 5.2.2.

    Here is our version.

    Status of IKE charon daemon (strongSwan 5.2.2, Linux 3.10.107-UBNT, mips64):
      uptime: 3 days, since Aug 06 22:12:40 2018
      malloc: sbrk 376832, mmap 0, used 295456, free 81376
      worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled:
    

    From https://community.ubnt.com/t5/EdgeMAX-Feature-Requests/Upgrade-to-strongswan-5-6-x/idi-p/1507341 we see that an upgrade to strongSwan 5.5.x has been accepted, but I don't know when it will be available.

    strongSwan 5.3+ can handle identical policies by reusing the same reqid. This allows identical CHILD_SAs to the same host.

    So that probably means multiple machines behind NAT could also work when the fix is implemented.
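To check other routers for the same exposure, the version can be read from the status header shown above and compared against 5.3. A small sketch: the header format is taken from the output in this post, the 5.3 threshold is the one cited here, and the function names are only illustrative.

```python
import re


def strongswan_version(status_header):
    """Extract the strongSwan version from a charon status header as a tuple of ints, or None."""
    m = re.search(r"strongSwan (\d+(?:\.\d+)*)", status_header)
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


def has_reqid_reuse(version):
    """True if the version is at least 5.3, the release cited in this thread for the fix."""
    return version is not None and version >= (5, 3)


header = "Status of IKE charon daemon (strongSwan 5.2.2, Linux 3.10.107-UBNT, mips64):"
version = strongswan_version(header)
print(version, has_reqid_reuse(version))
```

Comparing tuples of integers avoids the string-comparison trap where "5.10" would sort before "5.3".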



  • Jeez, it is sad to think that we have been fighting this for that long.

    @JaredBusch @scottalanmiller
    Can a cron job be set to restart IPsec every 24 hours?



  • @romo said in Help troubleshooting L2TP over IPSEC VPN connections.:

    Here is our issue: https://wiki.strongswan.org/issues/431. It was fixed 3 years ago when version 5.3 of strongSwan came out.

    I had not checked which strongSwan version we were using; I just assumed it was something newer. Then I found that our EdgeRouter is using strongSwan 5.2.2.

    Here is our version.

    Status of IKE charon daemon (strongSwan 5.2.2, Linux 3.10.107-UBNT, mips64):
      uptime: 3 days, since Aug 06 22:12:40 2018
      malloc: sbrk 376832, mmap 0, used 295456, free 81376
      worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled:
    

    From https://community.ubnt.com/t5/EdgeMAX-Feature-Requests/Upgrade-to-strongswan-5-6-x/idi-p/1507341 we see that an upgrade to strongSwan 5.5.x has been accepted, but I don't know when it will be available.

    strongSwan 5.3+ can handle identical policies by reusing the same reqid. This allows identical CHILD_SAs to the same host.

    So that probably means multiple machines behind NAT could also work when the fix is implemented.

    Yeah, that is what I found and was referring to. I just did not post it here.



  • @gjacobse said in Help troubleshooting L2TP over IPSEC VPN connections.:

    Jeez, it is sad to think that we have been fighting this for that long.

    @JaredBusch @scottalanmiller
    Can a cron job be set to restart IPsec every 24 hours?

    Yes.
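Since a blanket restart drops working sessions too, a scheduled job could instead restart only when the reqid conflict has actually been logged. The sketch below shows just that decision. It assumes Python 3 is available wherever the job runs, and both the restart command and the way the log text is collected are left to the caller, because neither is confirmed in this thread. The name `restart_if_conflicted` is hypothetical.

```python
import subprocess

# Substring of the "[CFG] unable to install policy" message quoted earlier in the thread.
CONFLICT_MARKER = "the same policy for reqid"


def restart_if_conflicted(log_text, restart_cmd, run=subprocess.run):
    """Run restart_cmd (an argv list) only if log_text shows a reqid conflict.

    Returns True if the command was run. The `run` parameter exists so the
    function can be exercised without touching the real service.
    """
    if CONFLICT_MARKER not in log_text:
        return False
    run(restart_cmd, check=True)
    return True


# Dry run: print the command instead of executing anything.
sample = "13[CFG] unable to install policy ... for reqid 35, the same policy for reqid 14 exists"
ran = restart_if_conflicted(
    sample,
    ["echo", "restart command goes here"],  # placeholder, not a real restart command
    run=lambda cmd, check: print("would run:", cmd),
)
```

Called from cron with the recent log text and the site's real restart command, this would leave healthy nights untouched.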