Food for thought: Fixing an over-engineered environment



  • Now that the VoIP project is pretty much done, next is to tackle our data center servers. The backup software we use, Yosemite Server Backup, is at end-of-life, and methinks while I find something to replace it, I may as well solve what's in this post.

    Caveat: I know that much of the below is just wrong and over-engineered.

    Nothing is virtualized, with the exception of Server 3.

    Current Hardware
    Server 1

    • Runs Windows Server 2012 R2
    • Is our IIS server for our web application
    • Processor: Dual Intel E5-2630 v2. According to New Relic, average CPU usage hasn't gone above 6% in the last three months.
    • RAM: 64 GB. According to New Relic, no more than 8 GB of RAM has been in use in the last three months.
    • Storage: Two Intel S3500 300 GB SSDs in RAID 1 (52% of storage used); Four Seagate Enterprise 4 TB HDDs in RAID 10 (replaced an iSCSI NAS that was in degraded RAID 5 with consumer drives; 7% of storage used)
    • Networking: see below

    Server 2

    • Runs Windows Server 2012 R2
    • Is our MS SQL Server
    • Processor: Dual Intel E5-2643 v2. According to New Relic, average CPU usage hasn't gone above 4% in the last three months.
    • RAM: 256 GB. According to New Relic, no more than 92 GB of RAM has been in use in the last three months.
    • Storage: Two Intel S3500 300 GB SSDs in RAID 1 (24% of storage used); Four Intel S3700 200 GB SSDs in RAID 10 (50% of storage used)
    • Networking: see below

    Server 3

    • Runs Windows Server 2012 R2
    • Is our REDIS server and the main server for Yosemite backup; also runs Hyper-V as a role and hosts an Ubuntu VM running postfix
    • Processor: Dual Intel E5-2630 v2. According to New Relic, average CPU usage hasn't gone above 2% in the last three months.
    • RAM: 64 GB. According to New Relic, no more than 4 GB of RAM has been in use in the last three months.
    • Storage: Two Intel S3500 300 GB SSDs in RAID 1 (24% of storage used)
    • Networking: see below

    Networking

    • Each server has two Intel i350-t4 NICs as well as an IPMI NIC, as all of these are SuperMicro servers
    • Connection to WAN is via a Cisco ASA 5505 (we also have a site-to-site VPN back to the office)
    • Switch is a Dell PowerConnect 6224
      • Configured with 4 VLANs (VLAN1 interfaces with the ASA; VLAN2 interfaces between the three servers; VLAN3 and VLAN4 are no longer used and were for MPIO with our iSCSI NAS)
      • Each server has two NICs connected to VLAN 1 (the IPMI NIC and a connection to the ASA)
      • Each server has two NICs teamed into one that connects to VLAN2

    The above will not be fixed immediately. My goal is for this thread to be a sounding board / sanity check for ideas that I have to fix this. On a related note, On High eventually wants something like this to be available: have production VMs (of IIS, SQL, and REDIS) and staging VMs. When it's time to put up a new build of the application, we do it on the staging VMs, then we [do X] and those become the production VMs while the production VMs become the staging VMs. I'm unsure how this would all work, but methinks it's more important to first tackle the above problems: almost nothing is virtualized, and what is virtualized is done in the wrong way.

    Overall Architecture Brainstorm

    I can see the above being consolidated to one physical machine, likely running Hyper-V. The question becomes storage, since we currently have a mixture of SSD and HDD. The cases have eight 3.5" bays.

    I can see the VLAN1 networking turning into physical NICs on one subnet interfacing with the ASA (to be replaced by a Ubiquiti device), likely through a switch with two links per server (one for the IPMI, the other for general traffic). The VLAN2 networking would be done through connections to a Hyper-V private switch.

    Yes, I have not mentioned backups yet. I am learning more so I can begin the brainstorm.



  • Well, it seems you already have the physical resources needed. Just get Hyper-V installed/added, and possibly throw in @StarWind_Software and more storage in one of the nodes if you need failover (big if).

    At least they should already realize they were oversold an environment they really don't need, as evidenced by the removed SAN device.



  • @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    The VLAN2 networking would be done through connections to a Hyper-V private switch.

    Why keep VLAN 2?
    FYI, multi-homing Windows machines can lead to pain. Not saying it's not possible, just that it can be painful. Not that you said you were multi-homing anything.

    How will traffic get from the default VLAN onto VLAN 2, if the VMs are only on VLAN2 via a private Hyper-V switch?



  • @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    Yes, I have not mentioned backups yet. I am learning more so I can begin the brainstorm.

    Where are your backups being stored?
    You said

    Server 3

    • Runs Windows Server 2012 R2
    • Yosemite backup;
    • Storage: Two Intel S3500 300 GB SSDs in RAID 1 (24% of storage used)

    Is your backup for everything fitting inside 300 GB? SSD in this case seems like overkill, but perhaps you need this?



  • Rough math looks like you're using 900 GB of storage today. Add four more Intel S3500 300 GB drives, take all 8 into Server 2, and you'll have 1.2 TB of usable SSD. Sounds like Server 2 also already has enough RAM to cover the entire load of your entire setup.
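
    A quick sanity check of that capacity figure (a sketch; uses the nominal 300 GB drive sizes from the thread and ignores GB/GiB formatting overhead):

```python
# RAID 10 stripes across mirrored pairs, so usable capacity is half the raw total.
def raid10_usable_gb(drive_count: int, drive_gb: float) -> float:
    if drive_count % 2 != 0:
        raise ValueError("RAID 10 needs an even number of drives")
    return drive_count * drive_gb / 2

# Eight 300 GB Intel S3500s in RAID 10:
print(raid10_usable_gb(8, 300))  # 1200.0 GB, i.e. 1.2 TB usable
```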

    You could set up a desktop PC with Hyper-V (with enough resources), P2V Server 2 to said PC, migrate Server 3 to that desktop PC as well, then reconfigure Server 2 as mentioned above. Then install Hyper-V on it and migrate the VMs off the PC to Server 2. Then P2V Server 1 to Server 2, and decommission Server 1.

    Once it's all running well on Server 2, set up either Server 1 or 3 for your test environment.



  • I'd probably back everything up. Twice. Then test that restores work...
    Then I'd disk2vhd everything and verify all are working.

    I'd likely then move all the 300 GB SSDs into Server 2 and provision it as a Hyper-V host with a RAID 5 array, giving about 1.5 TB of space. I'd put all the VMs onto that host and run it as production. (May need to revisit licensing.)

    Then make Server 1 a Hyper-V host with 4 x 4 TB drives in RAID 10 (about 8 TB usable or so), and set up that host as a replica target for Server 2's VMs, with a few replica copies of each VM since it has the space. I'd probably move most of the RAM from Server 3 into Server 1 if it's compatible.

    Server 3
    Get some more large drives and provision it as a backup target of some form.

    Or something along those lines anyway.
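
    The RAID 5 and RAID 10 figures above can be sanity-checked the same way (a sketch; assumes the six 300 GB S3500s pulled from the three servers, and nominal drive sizes):

```python
def raid5_usable_gb(drive_count: int, drive_gb: float) -> float:
    # RAID 5 loses one drive's worth of capacity to parity.
    return (drive_count - 1) * drive_gb

def raid10_usable_gb(drive_count: int, drive_gb: float) -> float:
    # RAID 10 mirrors pairs, so half the raw capacity is usable.
    return drive_count * drive_gb / 2

# Six 300 GB SSDs (two each from Servers 1, 2, and 3) in RAID 5:
print(raid5_usable_gb(6, 300))    # 1500 GB, i.e. ~1.5 TB
# Four 4 TB HDDs in RAID 10 for the replica target:
print(raid10_usable_gb(4, 4000))  # 8000.0 GB, i.e. ~8 TB
```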



  • @jimmy9008 said in Food for thought: Fixing an over-engineered environment:

    RAID 5 array, giving about 1.5 TB of space.

    duh - whoops missed that part.



  • @dashrender said in Food for thought: Fixing an over-engineered environment:

    @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    The VLAN2 networking would be done through connections to a Hyper-V private switch.

    Why keep VLAN 2?
    FYI, multi-homing Windows machines can lead to pain. Not saying it's not possible, just that it can be painful. Not that you said you were multi-homing anything.

    How will traffic get from the default VLAN onto VLAN 2, if the VMs are only on VLAN2 via a private Hyper-V switch?

    I may have been unclear about this. There is no communication between the default VLAN and VLAN2. In the new setup, VLAN2 would go away. I'd configure two virtual switches on the Hyper-V host: one external and one private. Each VM would have two vNICs, one connected to each virtual switch. The private switch would be for traffic between the VMs, and the external switch for Internet access.

    I haven't had a problem having multiple VMs share one NIC with our office server. Since the data center Hyper-V host has eight NICs, I could give each VM its own physical NIC if I wanted.



  • Ok, so you do plan to multi-home the VMs. What are you hoping to gain by having this internal virtual switch?



  • @dashrender said in Food for thought: Fixing an over-engineered environment:

    @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    Yes, I have not mentioned backups yet. I am learning more so I can begin the brainstorm.

    Where are your backups being stored?
    You said

    Server 3

    • Runs Windows Server 2012 R2
    • Yosemite backup;
    • Storage: Two Intel S3500 300 GB SSDs in RAID 1 (24% of storage used)

    Is your backup for everything fitting inside 300 GB? SSD in this case seems like overkill, but perhaps you need this?

    No. Backup is stored on an external WD MyBook hard drive.



  • @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    I'd configure two virtual switches on the Hyper-V host: one external and one private. Each VM would have two vNICs, one connected to each virtual switch. The private switch would be for traffic between the VMs, and the external switch for Internet access.

    This seems unnecessarily complex for your environment. Any reason for doing this and not just a single virtual switch?



  • @dashrender said in Food for thought: Fixing an over-engineered environment:

    Ok, so you do plan to multi-home the VMs. What are you hoping to gain by having this internal virtual switch?

    The original idea (which is likely flawed) of the network was to separate server-to-server traffic from server-to-Internet traffic. I believe the purpose was to keep the server-to-server pipe free of Internet traffic to prevent bottlenecks. However, I don't think this is really an issue (see images below). Creating the private virtual switch preserves this architecture, and I would assume data transfer would be faster over the private virtual switch than through the physical switch.

    We also teamed two 1 Gb NICs on each server to connect to that internal VLAN for more bandwidth. Each server has its hosts file configured so that the IPs of the other servers resolve to the internal subnet, making sure the traffic uses the correct NIC.

    SQL Server Network Traffic
    Internal = teamed 1 Gb NICs on VLAN2. External = single 1 Gb NIC on VLAN1.
    (two New Relic network throughput graphs)

    IIS server traffic is about the same. The three-month average for the internal team is 7.4 Mb/s TX and 1.69 Mb/s RX; the external NIC is 2.34 Mb/s TX and 863 Kb/s RX.

    The REDIS server (which hosts the postfix VM) has almost no traffic on the external NIC, and a three-month average of 726 Kb/s TX and 10.5 Mb/s RX on the internal team.

    Clearly, none of this is approaching saturation even for a 100 Mbps NIC, so perhaps the only thing gained from separating the traffic is extra complexity.
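
    To put those averages in perspective, here is the utilization math (a sketch; assumes the internal team is two 1 Gb NICs, i.e. roughly 2000 Mb/s aggregate):

```python
def utilization_pct(rate_mbps: float, link_mbps: float) -> float:
    """Percentage of a link's capacity consumed by a given average rate."""
    return 100.0 * rate_mbps / link_mbps

# IIS internal team average TX (7.4 Mb/s) against a 2 x 1 Gb team:
print(round(utilization_pct(7.4, 2000), 2))  # 0.37 (%)
# Even against a single 100 Mb/s port, it's nowhere near saturation:
print(round(utilization_pct(7.4, 100), 2))   # 7.4 (%)
```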



  • @coliver said in Food for thought: Fixing an over-engineered environment:

    @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    I'd configure two virtual switches on the Hyper-V host: one external and one private. Each VM would have two vNICs, one connected to each virtual switch. The private switch would be for traffic between the VMs, and the external switch for Internet access.

    This seems unnecessarily complex for your environment. Any reason for doing this and not just a single virtual switch?

    Ha! You beat me to the conclusion. :P



  • @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    Creating the private virtual switch preserves this architecture, and I would assume data transfer would be faster over the private virtual switch rather than through the physical switch.

    This assumes that VM-to-VM traffic leaves the virtual switch to begin with. IIRC, none of the VM-to-VM traffic would be going to your physical switch; the virtual switch would be handling all of that traffic.



  • @coliver said in Food for thought: Fixing an over-engineered environment:

    @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    Creating the private virtual switch preserves this architecture, and I would assume data transfer would be faster over the private virtual switch rather than through the physical switch.

    This assumes that VM-to-VM traffic leaves the virtual switch to begin with. IIRC, none of the VM-to-VM traffic would be going to your physical switch; the virtual switch would be handling all of that traffic.

    That's correct. Right now, nothing is virtualized, so each physical server has two teamed NICs sending traffic to our physical switch on VLAN2 as the Internal network, with another NIC sending traffic on the default VLAN as the External network.

    Since my thought is to turn everything into a VM, it would perform better to create a private virtual switch just for that VM-to-VM traffic rather than configure something that still uses the physical switch for such traffic. However, from what I'm seeing, it doesn't look like separating that traffic onto its own private switch is necessary.


  • @coliver said in Food for thought: Fixing an over-engineered environment:

    @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    I'd configure two virtual switches on the Hyper-V host: one external and one private. Each VM would have two vNICs, one connected to each virtual switch. The private switch would be for traffic between the VMs, and the external switch for Internet access.

    This seems unnecessarily complex for your environment. Any reason for doing this and not just a single virtual switch?

    I agree. What’s the benefit here?



  • @scottalanmiller said in Food for thought: Fixing an over-engineered environment:

    @coliver said in Food for thought: Fixing an over-engineered environment:

    @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    I'd configure two virtual switches on the Hyper-V host: one external and one private. Each VM would have two vNICs, one connected to each virtual switch. The private switch would be for traffic between the VMs, and the external switch for Internet access.

    This seems unnecessarily complex for your environment. Any reason for doing this and not just a single virtual switch?

    I agree. What’s the benefit here?

    From the data that New Relic shows me, it looks like there isn't any. I guess I could make an argument that the SQL Server VM and the REDIS VM shouldn't have Internet access. The problem with that is twofold.

    1. I'd add the complexity of having WSUS or something to feed those VMs Windows updates.
    2. Having a way to RDP into those machines would seem overly complex.


  • @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    Since my thought is to turn everything into a VM, it would perform better to create a private virtual switch just for that VM-to-VM traffic rather than configure something that still uses the physical switch for such traffic. However, from what I'm seeing, it doesn't look like separating that traffic onto its own private switch is necessary.

    I'm confused as to how you would see better performance. You're going to have more than one host, correct? Unless you're planning on setting up an independent physical switch for host-to-host/VM-to-VM communication, everything would be going over the physical switch regardless. VLANs aren't for performance purposes; they're for security purposes.



  • @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    Since my thought is to turn everything into a VM, it would perform better to create a private virtual switch just for that VM-to-VM traffic rather than configure something that still uses the physical switch for such traffic. However, from what I'm seeing, it doesn't look like separating that traffic onto its own private switch is necessary.

    The idea might have some credibility in the real world, but on a single host, where the traffic is all on vswitches, this won't really make any difference.
    With each VM having only a single vswitch connection, all inter-VM traffic will stay inside the hypervisor, never touching the physical switches. Team several 1 Gb NICs or upgrade to a 10 Gb NIC in the server (and a 10 Gb port on the switch) and you shouldn't see that be a bottleneck at all.



  • @coliver
    Right now, I'm planning on one host with multiple VMs. So if I had this separate internal network, methinks performance would be better on a virtual private switch rather than using virtual external switches bound to a physical NIC that is part of a separate VLAN on the physical switch.

    On performance, you're right about VLANs; they're designed for security. I guess you could argue you'd be reducing potential broadcast traffic, but in this situation that wouldn't matter, as the number of devices is the same. It looks more and more like the separate network for server-to-server communication is unnecessary.

    @Dashrender
    You're right. The only time VM traffic would go over a 1 Gb link is when that traffic has to travel over the physical NIC to the physical switch. Even if the virtual switch were an external switch, the VM-to-VM traffic would stay on the 10 Gb virtual switch link.



  • @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    Right now, I'm planning on one host with multiple VMs. So if I had this separate, internal network, methinks performance would be better on a virtual private switch, rather than using virtual external switches bound to a physical NIC that is a part of a separate VLAN on the physical switch.

    Probably not. But you're talking yourself out of it now so I don't need to say anything else.



  • @coliver said in Food for thought: Fixing an over-engineered environment:

    @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    Right now, I'm planning on one host with multiple VMs. So if I had this separate, internal network, methinks performance would be better on a virtual private switch, rather than using virtual external switches bound to a physical NIC that is a part of a separate VLAN on the physical switch.

    Probably not. But you're talking yourself out of it now so I don't need to say anything else.

    :D Yeah, during this thought process, I'll likely be talking myself out of most things that would be just a virtualized version of current architecture.



  • @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    @coliver said in Food for thought: Fixing an over-engineered environment:

    @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    Right now, I'm planning on one host with multiple VMs. So if I had this separate, internal network, methinks performance would be better on a virtual private switch, rather than using virtual external switches bound to a physical NIC that is a part of a separate VLAN on the physical switch.

    Probably not. But you're talking yourself out of it now so I don't need to say anything else.

    :D Yeah, during this thought process, I'll likely be talking myself out of most things that would be just a virtualized version of current architecture.

    Definitely a hard thing to get over at times.



  • @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    @coliver said in Food for thought: Fixing an over-engineered environment:

    @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    Right now, I'm planning on one host with multiple VMs. So if I had this separate, internal network, methinks performance would be better on a virtual private switch, rather than using virtual external switches bound to a physical NIC that is a part of a separate VLAN on the physical switch.

    Probably not. But you're talking yourself out of it now so I don't need to say anything else.

    :D Yeah, during this thought process, I'll likely be talking myself out of most things that would be just a virtualized version of current architecture.

    So greenfield it. Ignore current infrastructure for a bit. How would you make this work in an ideal environment? Then look at where what you have now differs from that ideal. Are those differences necessary? Would moving them toward ideal adversely affect users?



  • @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    @coliver
    Right now, I'm planning on one host with multiple VMs. So if I had this separate, internal network, methinks performance would be better on a virtual private switch, rather than using virtual external switches bound to a physical NIC that is a part of a separate VLAN on the physical switch.

    If the VMs are on the same host, there's no need to give them internal and external virtual NICs. They will communicate over the external virtual switch, but the traffic won't go to the physical NIC or out to the LAN.

    You only want an internal switch between VMs when they are only supposed to talk with each other and not be on a LAN.



  • @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    @coliver

    On performance, you're right about VLANs, they're designed for security. I guess you could argue you'd reducing potential broadcast traffic, but in this situation that wouldn't matter, as the number of devices is the same. It looks more and more like the separate-network-for-server-to-server communication is unnecessary.

    I didn't think they were for security...

    I thought VLANs were purely for segregation of traffic to make quality of service/planning better. Yeah, sure, something on VLAN1 won't interact with VLAN2... but it's the same switch/hardware/cables. So I presume if I could get access to that kit with Wireshark or something, I'd be able to capture the traffic regardless of VLANs, and the fact that they are VLANs wouldn't matter... Could be wrong here though (probably am)...


  • @jimmy9008 said in Food for thought: Fixing an over-engineered environment:

    @eddiejennings said in Food for thought: Fixing an over-engineered environment:

    @coliver

    On performance, you're right about VLANs, they're designed for security. I guess you could argue you'd reducing potential broadcast traffic, but in this situation that wouldn't matter, as the number of devices is the same. It looks more and more like the separate-network-for-server-to-server communication is unnecessary.

    I didn't think they were for security...

    I thought VLANs were purely for segregation of traffic to make quality of service/planning better.

    No, that's the myth. They actually make those things worse. They make planning harder and confuse people about QoS. They add overhead and bottlenecks, so you have to plan more and do more QoS just to overcome the VLAN problems. VLANs are for security in some limited cases, and for management at a massive scale.


  • @jimmy9008 said in Food for thought: Fixing an over-engineered environment:

    but it's the same switch/hardware/cables. So I presume if I could get access to that kit with Wireshark or something, I'd be able to capture the traffic regardless of VLANs, and the fact that they are VLANs wouldn't matter... Could be wrong here though (probably am)...

    That's subnets you're thinking of. If you can do that with a VLAN, it's not a VLAN ;) The definition of a VLAN means that can't be done.


  • Okay, I've not read everything, but starting from the top...

    Networking - VLANs are gone. You describe very clearly in the OP that they serve no purpose, so don't talk about them again. Gone. Done. Over. One Big Flat Network, OBFN.

    Servers - Definitely no need for more than one. Going down to just one will significantly improve your performance and your reliability. Right now your apps depend on the separate database server, which depends on your SAN. That's an inverted pyramid with another tier: instead of the normal three tiers of risk, you have five! Collapsing that down to one will make you so much more reliable. Hyper-V is fine. So is KVM.

    Storage - This is easy, local disks. Either all SSD or one SSD pool and one spinner pool. That's all.


  • REDIS should be on Linux; REDIS on Windows is crazy. It's expensive and slow.


