Nginx Active-Passive HA



  • I have a client who is about to migrate to Let's Encrypt for SSL, replacing their standard SSL issuer and the fully manual process they had before. They host many hundreds of sites, and manually updating certs was ridiculously time-consuming for them.

    I'm looking to set up Nginx in Active-Passive HA mode so that when the cert update job takes Nginx offline for up to 15-20 minutes, the sites aren't taken offline.

    I've found a couple of tutorials that explain the setup process and will be testing this setup to death before it goes online (virtual IP, defining the master/passive node, etc.), but I'm wondering if there is a best practice for the SSL certs location. Should each Nginx instance host its own set of certs for the same domains? In that case, running the renew script on one node would renew certs on only that instance (since Nginx has to reload to use the new certs), and I would then renew on the other node? I can't imagine I should save the certs on some network location, because the remaining Nginx node would not be able to use the new certs until it reloads, in effect negating the HA setup. Should I simply have a script copy the new certs to the other node after the master comes back online and then reload the other node's Nginx service?

    The majority of these sites are low traffic (fewer than 100 visits a day), so sites being offline for a few minutes a day or once a week during early morning hours isn't going to kill anyone. It's still a good plan to set up the HA proxies in case one goes down, and a bonus if we can keep sites online while certs are getting renewed.

    Thoughts? Recommendations? Gotchas?



  • I'm guessing someone might suggest some automation method like SaltStack for this (not even sure if that's doable) but if you are going to suggest this, please provide a link to documentation where I can read up on it.



  • Nginx does not have to go offline if you have the .well-known routed correctly.

    It would still need to restart for the cert to be applied of course.



  • My Nginx doesn't go offline during a cert renewal, and I do them all the time.



  • @jaredbusch said in Nginx Active-Passive HA:

    It would still need to restart for the cert to be applied of course.

    Just a reload, no downtime.



  • Maybe I'm doing renewals wrong or I'm misunderstanding the process, but the renew script has the certbot renew --pre-hook "systemctl stop nginx" --post-hook "systemctl start nginx" line. Wouldn't that take Nginx offline, then renew certs, then restart Nginx? Maybe there's a better renewal method I'm not aware of.

    Tbh, I've only assumed Nginx was going offline because of this line, but renewing a dozen or so certs takes only seconds, so it isn't something I've actually had a chance to test.



  • @scottalanmiller said in Nginx Active-Passive HA:

    My Nginx doesn't go offline during a cert renewal, and I do them all the time.

    Mine does because I have not set up the .well-known path, as I do everything certonly when adding a cert. This means the certbot renew needs to shut down nginx and run its own webserver temporarily. It is all scripted with a pre-hook and post-hook to stop and start nginx though, so it is still fully automated.

    I need to revisit this as certbot is smarter now than it used to be.



  • @jaredbusch said in Nginx Active-Passive HA:

    @scottalanmiller said in Nginx Active-Passive HA:

    My Nginx doesn't go offline during a cert renewal, and I do them all the time.

    Mine does because I have not set up the .well-known path, as I do everything certonly when adding a cert. This means the certbot renew needs to shut down nginx and run its own webserver temporarily. It is all scripted with a pre-hook and post-hook to stop and start nginx though, so it is still fully automated.

    I need to revisit this as certbot is smarter now than it used to be.

    Yeah, this is the method I use as well.



  • @nashbrydges said in Nginx Active-Passive HA:

    Maybe I'm doing renewals wrong or I'm misunderstanding the process, but the renew script has the certbot renew --pre-hook "systemctl stop nginx" --post-hook "systemctl start nginx" line.

    I don't use this part: --pre-hook "systemctl stop nginx"



  • @nashbrydges said in Nginx Active-Passive HA:

    Maybe I'm doing renewals wrong or I'm misunderstanding the process, but the renew script has the certbot renew --pre-hook "systemctl stop nginx" --post-hook "systemctl start nginx" line. Wouldn't that take Nginx offline, then renew certs, then restart Nginx? Maybe there's a better renewal method I'm not aware of.

    Tbh, I've only assumed Nginx was going offline because of this line, but renewing a dozen or so certs takes only seconds, so it isn't something I've actually had a chance to test.

    Yes, that takes Nginx offline.



  • @scottalanmiller said in Nginx Active-Passive HA:

    @nashbrydges said in Nginx Active-Passive HA:

    Maybe I'm doing renewals wrong or I'm misunderstanding the process, but the renew script has the certbot renew --pre-hook "systemctl stop nginx" --post-hook "systemctl start nginx" line.

    I don't use this part: --pre-hook "systemctl stop nginx"

    You have to, depending on how you got the cert to begin with.
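
    For anyone unsure which case applies: certbot keeps a per-certificate file under /etc/letsencrypt/renewal/ that records the authenticator used at issuance. A minimal sketch, assuming the default directory layout (the function name and the RENEWAL_DIR variable are only for illustration), that lists the certs still tied to the standalone authenticator and therefore still needing Nginx stopped:

    ```shell
    # Directory where certbot stores one renewal config per certificate.
    # Assumption: default location; override RENEWAL_DIR if yours differs.
    RENEWAL_DIR="${RENEWAL_DIR:-/etc/letsencrypt/renewal}"

    # Print the name of every certificate whose renewal config still uses
    # the standalone authenticator (the one that needs port 80 to itself).
    list_standalone_certs() {
        for conf in "$RENEWAL_DIR"/*.conf; do
            # Skip the unexpanded glob when the directory has no .conf files.
            [ -e "$conf" ] || continue
            if grep -Eq '^authenticator[[:space:]]*=[[:space:]]*standalone' "$conf"; then
                basename "$conf" .conf
            fi
        done
    }
    ```

    Anything it prints would have to be re-issued with a webroot-style authenticator before the stop/start hooks can be dropped.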



  • @scottalanmiller said in Nginx Active-Passive HA:

    @jaredbusch said in Nginx Active-Passive HA:

    It would still need to restart for the cert to be applied of course.

    Just a reload, no downtime.

    Is this what you mean?

    certbot certonly --webroot -w /path/to/your/webroot -d example.com --post-hook="service nginx reload"
    


  • @black3dynamite said in Nginx Active-Passive HA:

    @scottalanmiller said in Nginx Active-Passive HA:

    @jaredbusch said in Nginx Active-Passive HA:

    It would still need to restart for the cert to be applied of course.

    Just a reload, no downtime.

    Is this what you mean?

    certbot certonly --webroot -w /path/to/your/webroot -d example.com --post-hook="service nginx reload"
    

    This will work if you define the webroot path, which I don't. My Nginx server is separate from the web servers.



  • My initial cert request process looks like this:

    certbot certonly -d mydomain.com --pre-hook "systemctl stop nginx" --post-hook "systemctl start nginx" --preferred-challenges http

    When prompted, I select 1 to spin up a temporary web server for the issuance and challenge. As I understand it, this means I don't have to name webroot folders anywhere. I've already defined the path of the certs, which is easy to figure out because the command saves the certs in a location named after the first domain listed, so when Nginx restarts, certs and domain are all good to go. I have a separate Nginx server that handles nothing but proxy and SSL services. All sites are hosted on their own Fedora, CentOS or Ubuntu servers. I don't use webroot authentication.

    If I set up the .well-known path, can this be set up globally for all cert issuances and renewals? I guess I would set this up in my config file for each domain.



  • Yeah, that's nothing like what my initial looks like.





  • @black3dynamite Correct. This is what I need to set up on my system.



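  • server {
        listen         80;
        server_name    my.domain.com;
    
        # Serve ACME challenge files over plain HTTP.
        location /.well-known/acme-challenge {
            root /var/www/letsencrypt;
        }
    
        # Redirect everything else to HTTPS. A server-level "return" would
        # run before location matching and bypass the block above.
        location / {
            return 301 https://$server_name$request_uri;
        }
    }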
    

    That's an example from one of mine.
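
    With a location block like that in place, issuance and renewal can go through the webroot while Nginx keeps serving. A rough sketch, assuming the webroot is /var/www/letsencrypt as in that config and that nginx runs under systemd; the run helper and the DRY_RUN variable are hypothetical and only there so the commands can be previewed first:

    ```shell
    # Assumption: same directory as the "root" in the acme-challenge location.
    WEBROOT="${WEBROOT:-/var/www/letsencrypt}"

    # Print the command when DRY_RUN=1, otherwise execute it.
    run() {
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf '%s\n' "$*"
        else
            "$@"
        fi
    }

    # First-time issuance for one domain. certbot records the webroot
    # authenticator in its renewal config, so later renewals reuse it.
    issue_cert() {
        run certbot certonly --webroot -w "$WEBROOT" -d "$1"
    }

    # Renew whatever is due. The deploy hook only fires when a cert was
    # actually renewed, and a reload does not take Nginx offline.
    renew_all() {
        run certbot renew --deploy-hook "systemctl reload nginx"
    }
    ```

    Running DRY_RUN=1 renew_all shows what would be executed without touching anything. Certs originally issued with the temporary web server keep that method until they are re-issued this way.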



  • Honest question... Why not just rsync /etc/letsencrypt from ServerA to ServerB after the certs are renewed?



  • @dafyre said in Nginx Active-Passive HA:

    Honest question... Why not just rsync /etc/letsencrypt from ServerA to ServerB after the certs are renewed?

    There is no discussion about the second server at this point. It is all about the initial renewal.



  • @dafyre said in Nginx Active-Passive HA:

        location /.well-known/acme-challenge {
            root /var/www/letsencrypt;
         }
    

    So I understand it well: these lines ONLY tell Let's Encrypt which folder to look in for the challenge/response and have nothing to do with any actual site webroot folders. Am I correct? This is just used so Nginx can act as the web server for those challenges/responses.





  • @nashbrydges said in Nginx Active-Passive HA:

    @dafyre said in Nginx Active-Passive HA:

        location /.well-known/acme-challenge {
            root /var/www/letsencrypt;
         }
    

    So I understand it well: these lines ONLY tell Let's Encrypt which folder to look in for the challenge/response and have nothing to do with any actual site webroot folders. Am I correct? This is just used so Nginx can act as the web server for those challenges/responses.

    Right. But any website you want to protect with SSL, you add this into the server {} section for each site... so if you have my.domain.conf, and nextcloud.domain.conf, you'd have to put the code in each of those files in the server {} sections.

    Edit: here's the full config for that site:

    server {
        listen         80;
        server_name    my.domain.com;
    
        # Serve ACME challenge files over plain HTTP.
        location /.well-known/acme-challenge {
            root /var/www/letsencrypt;
        }
    
        # Redirect everything else to HTTPS. A server-level "return" would
        # run before location matching and bypass the block above.
        location / {
            return 301 https://$server_name$request_uri;
        }
    }
    
    server {
        listen 443 ssl;    # "ssl" here makes the deprecated "ssl on;" unnecessary
        server_name my.domain.com;
    
        client_max_body_size 10G;
        fastcgi_buffers 64 4K;
        proxy_send_timeout 7200;
        send_timeout 7200;
    
        add_header Strict-Transport-Security "max-age=15552000; includeSubdomains;" always;
        ssl_certificate /etc/nginx/certs/my.domain.com/fullchain.pem;
        ssl_certificate_key /etc/nginx/certs/my.domain.com/privkey.pem;
        ssl_protocols TLSv1.1 TLSv1.2;
        ssl_ciphers 'EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH';
    
        # Proxy all regular traffic to the backend web server.
        location / {
            proxy_pass http://my.ip.addr.ess;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    
        location /.well-known/acme-challenge {
            root /var/www/letsencrypt;
        }
    }
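
    To avoid pasting the same location block into every site file, one option is to keep it in a single snippet and include it from each server {} section. A sketch, assuming /etc/nginx/snippets is an acceptable place for it (the directory, file name, and function name are my own choices, not anything Nginx requires):

    ```shell
    # Assumption: a snippets directory under the Nginx config tree.
    SNIPPET_DIR="${SNIPPET_DIR:-/etc/nginx/snippets}"

    # Write the challenge location once so every site can include it.
    write_acme_snippet() {
        mkdir -p "$SNIPPET_DIR" || return 1
        printf '%s\n' \
            'location /.well-known/acme-challenge {' \
            '    root /var/www/letsencrypt;' \
            '}' > "$SNIPPET_DIR/acme-challenge.conf"
    }
    ```

    Each server {} section would then only need the line: include /etc/nginx/snippets/acme-challenge.conf;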
    


  • @dafyre Awesome! Thanks for clarifying that. I don't have any expiring certs for the next 40 days, so I'll keep a lookout to see how this works.



  • Assuming this is going to work as planned, back to the original question...setting up Nginx HA and certs management. Which approach is best/recommended?

    1. Let each Nginx server manage its own certs and renewals?
    2. Only have one manage certs and renewals and copy certs to second node?
    3. Use Let's Encrypt --duplicate option?
    4. None of the above?


  • @nashbrydges said in Nginx Active-Passive HA:

    Assuming this is going to work as planned, back to the original question...setting up Nginx HA and certs management. Which approach is best/recommended?

    1. Let each Nginx server manage its own certs and renewals?
    2. Only have one manage certs and renewals and copy certs to second node?
    3. Use Let's Encrypt --duplicate option?
    4. None of the above?

    I see no reason approach #2 won't work. The private keys are under /etc/letsencrypt with the actual certs themselves too.

    Just use rsync with the appropriate switches to preserve permissions and such.
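
    Something along these lines, as a rough sketch. The standby hostname is a placeholder, root SSH key access between the nodes is assumed, and the run helper with its DRY_RUN variable is hypothetical, only there to preview the commands:

    ```shell
    # Assumptions: placeholder hostname, and identical paths on both nodes.
    STANDBY="${STANDBY:-standby.example.com}"
    LE_DIR="${LE_DIR:-/etc/letsencrypt}"

    # Print the command when DRY_RUN=1, otherwise execute it.
    run() {
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf '%s\n' "$*"
        else
            "$@"
        fi
    }

    # Copy the whole Let's Encrypt tree to the standby, then reload it.
    sync_certs() {
        # -a keeps permissions, ownership, and the live/ -> archive/ symlinks;
        # --delete drops certs that no longer exist on the active node.
        run rsync -a --delete "$LE_DIR/" "root@$STANDBY:$LE_DIR/" &&
        # Test the config first so a bad sync cannot leave Nginx broken.
        run ssh "root@$STANDBY" "nginx -t && systemctl reload nginx"
    }
    ```

    Calling sync_certs from a script in /etc/letsencrypt/renewal-hooks/deploy/ (or via --deploy-hook) would run it only after a cert has actually been renewed.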



  • I have this for my well-known on my Nginx Proxy
    [Screenshot of the .well-known configuration; image not preserved]



  • @dafyre said in Nginx Active-Passive HA:

    @nashbrydges said in Nginx Active-Passive HA:

    Assuming this is going to work as planned, back to the original question...setting up Nginx HA and certs management. Which approach is best/recommended?

    1. Let each Nginx server manage its own certs and renewals?
    2. Only have one manage certs and renewals and copy certs to second node?
    3. Use Let's Encrypt --duplicate option?
    4. None of the above?

    I see no reason approach #2 won't work. The private keys are under /etc/letsencrypt with the actual certs themselves too.

    Just use rsync with the appropriate switches to preserve permissions and such.

    I would definitely do #2.



  • @NashBrydges Side question: if you set up the .well-known to work correctly, why do you then need the HA? Nginx will never be down except for the momentary reload after the certs are updated.



  • @jaredbusch said in Nginx Active-Passive HA:

    @NashBrydges Side question: if you set up the .well-known to work correctly, why do you then need the HA? Nginx will never be down except for the momentary reload after the certs are updated.

    That certainly addresses the biggest concern about a long downtime during the renewal process for a high number of certs and probably addresses most concerns with this client. He's already running Veeam replication to a second box, so his RTO and RPO are relatively short and within his business tolerance.

    Having said that, it's a great learning opportunity for me to set this up in my lab, if for no other reason than to try it and see how it works.