I’m looking to my fellow users for help here, as I’ve always been impressed with the breadth and depth of the knowledge that you’ve shared in this forum. My background’s in general educational/corporate/small-business IT support, desktop deployment with MDT, and basic Windows Server setups with Hyper-V. I’ve also helped manage a vSphere 5.5 installation with around 30 hypervisors before, so I’m comfortable with that as well, but I have more experience with Hyper-V. My new job was initially focused on the same types of tasks, but now management has chosen to discontinue our IT support services and concentrate all of our resources on our core SaaS applications.
The problem is that while our core web applications have tremendous functionality, they have terrible performance and even worse security. The servers were basically set up by the developers years ago as a test environment when initially developing the core business, but then they quickly became the production servers once everything was working and able to generate a profit.
I’m no application architect, but I know that SQL Server shouldn’t be installed on the system drive, that the application server probably needs more RAM than the database servers, that field client applications shouldn’t bypass the application server and communicate directly with the database servers, and that everything should be behind a firewall and only accessible via VPN connection or federated identity, not RDP. There’s no Active Directory infrastructure for these machines, so there are individual local accounts on each machine with random insecure credentials, and the list just goes on.
All of this was set up before I worked here, and I’ve been lobbying to get things fixed ever since. We’d hired an employee who was experienced in infrastructure architecture, but he had to move back overseas for family reasons, and the task of setting up an entirely new, performant application infrastructure has been passed to me. I like doing the research, but we’re a small company, and I’m only afforded so much time to come up with a solution. At previous jobs I had network and system engineers who dealt with the big infrastructure issues. It’d be one thing if I were just trying to get things running better at our current load, but within the next 2-3 months we’ve got a large client coming on that will increase usage of our system by a factor of at least 10.
As it stands, I need help with everything from hardware sizing to SQL licensing. I don’t know the best way to license SQL 2014 for our purposes. I know that our lead dev would prefer the Enterprise edition for some of the extra features, but I think the cost will be prohibitive.
Right now we’re running IIS 8 on Server 2012, with a SQL 2012 backend. The application’s in .NET 4.6.1, and we’ll be moving it to IIS 8.5 soon. Unfortunately, our entire setup is dependent upon IIS, .NET and SQL, and I don’t think that will be changing anytime soon, if ever.
Our servers interact with several types of clients:
- Mobile Devices – Mobile view of web application, plus native Android & iOS apps that tie into a subset of the web application’s functionality. There are generally 30-40 connections of this type at any given time.
- Site Servers – These machines ingest information from SCADA systems & issue automation commands. Each machine runs a local .NET app that has a SQL 2008 R2 backend. Almost all of the site servers utilize cellular connections, and traffic volume has been an issue. There are around 80 sites like this, each exchanging data with our servers every 30 seconds to a minute.
- Web Users – The web application is used by another 30-40 users at any given time.
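To put rough numbers on the site-server traffic, here is a back-of-the-envelope sketch using only the figures above (80 sites, a sync every 30 to 60 seconds, and the expected 10x growth). It assumes each sync is a single request and that the growth multiplies site traffic uniformly; both are simplifications, not measurements.

```python
# Back-of-the-envelope request rates from the figures in this post.
SITES = 80                 # site servers reporting in
INTERVALS_S = (30, 60)     # each site syncs every 30 s to 60 s
GROWTH = 10                # expected usage multiplier from the new client


def site_requests_per_second(sites: int, interval_s: float) -> float:
    """Average request rate if every site syncs once per interval (assumed)."""
    return sites / interval_s


def syncs_per_site_per_day(interval_s: float) -> int:
    """How many times one site syncs in 24 hours at a fixed interval."""
    return int(86_400 // interval_s)


for interval in INTERVALS_S:
    now = site_requests_per_second(SITES, interval)
    print(f"{interval:>2}s interval: {now:.2f} req/s now, "
          f"{now * GROWTH:.2f} req/s at {GROWTH}x, "
          f"{syncs_per_site_per_day(interval)} syncs per site per day")
```

Under those assumptions the site servers alone produce roughly 1.3 to 2.7 requests per second today and about 13 to 27 at 10x. The syncs-per-day figure is the one to multiply by a measured payload size when estimating cellular data usage per site.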
Response times on the web application are often painfully slow. Some queries can take over 100 seconds. I’ve run a lot of SQL health scripts from Brent Ozar’s site and those have helped a bit, but I don’t think the speed issue lies with SQL. IIS seems to be the culprit.
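One way to test the IIS-versus-SQL hunch is to rank requests by the `time-taken` field in the IIS W3C logs and compare the worst offenders against query durations seen on the SQL side. The sketch below assumes W3C logging with the `cs-uri-stem` and `time-taken` fields enabled; the sample log lines and URLs are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def slowest_urls(log_lines, top=5):
    """Return (uri, hits, avg_ms, max_ms) tuples sorted by max time-taken."""
    fields = []
    timings = defaultdict(list)
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]      # column names for the rows below
            continue
        if not line or line.startswith("#") or not fields:
            continue                       # skip other directives and blanks
        row = dict(zip(fields, line.split()))
        try:
            timings[row["cs-uri-stem"]].append(int(row["time-taken"]))
        except (KeyError, ValueError):
            continue                       # field not logged or malformed row
    report = [(uri, len(ms), mean(ms), max(ms)) for uri, ms in timings.items()]
    return sorted(report, key=lambda r: r[3], reverse=True)[:top]


# Hypothetical sample; real logs carry more fields per row.
SAMPLE = """\
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2016-01-01 00:00:01 GET /reports/daily 200 104500
2016-01-01 00:00:02 GET /api/status 200 35
2016-01-01 00:00:09 GET /reports/daily 200 98000
""".splitlines()

for uri, hits, avg_ms, max_ms in slowest_urls(SAMPLE):
    print(f"{uri}: {hits} hits, avg {avg_ms:.0f} ms, max {max_ms} ms")
```

`time-taken` is reported in milliseconds. If a slow URL shows a long `time-taken` while the queries it issues finish quickly in SQL Server, the time is probably being spent in IIS or the application rather than in the database.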
CURRENT SERVER SPECIFICATIONS:
We utilize 3 physical servers at this time, along with a Redis instance in Azure.
Application Server:
Dual Xeon E5645 (6-core) @ 2.4GHz
64GB RAM
System Drive – Intel DC S3500 – 600GB
SSD Storage – Intel DC S3500 SSD – 800GB
SATA Storage – Seagate 7200 rpm – 3TB
Network – 1Gbps up/down (usually at 1-3% utilization)
Redis on Windows – Caching App server requests
Database Server – Master:
Dual Xeon E5-2630 (6-core) @ 2.3 GHz
128GB RAM
System Drive – Intel DC S3500 – 600GB
SATA Storage – Seagate 7200 rpm – 3TB
Network – 1Gbps up/down (usually at 1% utilization)
Redis on Windows – Pub/Sub relationship with Azure Redis – High-volume ASP requests
Transactional Replication Publisher
Database Server – Slave:
Dual Xeon E5-2630 (6-core) @ 2.3 GHz
128GB RAM
System Drive – Micron M510DC SSD – 960GB
Network – 1Gbps up/down (usually at 1% utilization)
Transactional Replication Subscriber
Azure Redis – Standard Tier – 13GB (I have no idea why we have a Redis instance in Azure, but I imagine that it’s a speed bottleneck as well. I don’t know how to measure the response time between our servers and the Azure Redis instance, though.)
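Measuring the round trip to the Azure Redis instance can be done by timing a cheap command such as PING many times from the application server itself. The helper below is a minimal sketch: it times any callable and summarizes the results, and the `fake_round_trip` stand-in is only there so the example runs without a network connection.

```python
import time
from statistics import median


def measure_latency_ms(operation, samples=50):
    """Call `operation` repeatedly and summarize round-trip times in ms."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        operation()
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the list.
    p95_index = min(len(timings) - 1, int(len(timings) * 0.95))
    return {
        "min": timings[0],
        "median": median(timings),
        "p95": timings[p95_index],
        "max": timings[-1],
    }


if __name__ == "__main__":
    # Stand-in for a real round trip; swap in a Redis PING as described below.
    fake_round_trip = lambda: time.sleep(0.002)
    stats = measure_latency_ms(fake_round_trip, samples=20)
    print({name: round(value, 2) for name, value in stats.items()})
```

With the third-party redis-py client installed, the `ping` method of a connected client can be passed as `operation`, for example `redis.Redis(host=..., port=6380, password=..., ssl=True).ping`, where the host name and access key come from the Azure portal and port 6380 is assumed to be the cache's TLS port. Running it from the application server measures the same network path the application uses.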
My proposed hardware is along these lines:
3 Hypervisors with the following specs:
2 x 2.4GHz Octa-Core E5-2630 v3 Haswell Xeon
256GB RAM
Boot Drive - 160 GB SSDs in RAID1
VM drives – Intel DC S3500 or NVMe drives of around 900GB in RAID1
Storage drives – 4 x 6TB SATA in RAID 10 w/ BBU
Server 2012R2 / SQL 2014 Std or Ent
1U Quad-core servers for pfSense & HAProxy.
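For the SQL 2014 licensing question, a rough count of core licenses for the proposed hypervisors can be sketched as follows. This assumes the per-core model as commonly described for SQL Server 2014 (core licenses sold in packs of two, a four-core minimum per physical processor, and a four-core minimum per individually licensed VM); those rules and any pricing should be confirmed against Microsoft's licensing guide or a reseller. The 8-vCPU VM size is a hypothetical example.

```python
import math

MIN_CORES_PER_PROCESSOR = 4   # assumed per-core minimum per physical processor
MIN_CORES_PER_VM = 4          # assumed minimum when licensing a single VM
CORES_PER_PACK = 2            # core licenses are assumed sold in packs of two


def packs_for_physical_host(sockets: int, cores_per_socket: int) -> int:
    """Two-core packs needed to license every physical core in one host."""
    cores = sockets * max(cores_per_socket, MIN_CORES_PER_PROCESSOR)
    return math.ceil(cores / CORES_PER_PACK)


def packs_for_vm(vcpus: int) -> int:
    """Two-core packs needed to license a single VM by its virtual cores."""
    return math.ceil(max(vcpus, MIN_CORES_PER_VM) / CORES_PER_PACK)


# Proposed hypervisor: 2 x 8-core E5-2630 v3
print("packs per fully licensed host:", packs_for_physical_host(2, 8))
# Hypothetical 8-vCPU SQL VM
print("packs for one 8-vCPU VM:", packs_for_vm(8))
```

Multiplying the pack counts by quoted Standard and Enterprise prices gives a quick comparison between licensing whole hosts and licensing only the SQL VMs.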
Here are some basic diagrams of our current setup along with my proposed setup, which is all based heavily on Stack Exchange's setup:
These are all little more than guesses, as I really don’t know the best way to set up a fast and secure IIS / .NET / SQL infrastructure. Is virtualization a bad idea for this type of setup? My thinking was that having 3 or so high-performance hypervisors would let us migrate things to better hardware more easily as the need arises, and that it should run nearly as fast as a bare-metal server as long as we’re not putting both databases, both app servers, or both Redis instances on the same box and causing resource contention.
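The contention concern can be expressed as a simple placement rule: no hypervisor should carry two VMs of the same role. The sketch below checks a planned layout against that rule. The VM names, host names, and layout are hypothetical, and this is only one reading of the constraint; it says nothing about CPU, RAM, or disk headroom.

```python
from collections import defaultdict


def contention_risks(placement):
    """Return {host: {role: [vm, ...]}} for roles doubled up on one host."""
    by_host = defaultdict(lambda: defaultdict(list))
    for vm, (host, role) in placement.items():
        by_host[host][role].append(vm)
    return {
        host: {role: sorted(vms) for role, vms in roles.items() if len(vms) > 1}
        for host, roles in by_host.items()
        if any(len(vms) > 1 for vms in roles.values())
    }


# Hypothetical VM names and layout across the three proposed hypervisors.
PLACEMENT = {
    "sql-primary":   ("hv1", "database"),
    "sql-secondary": ("hv1", "database"),   # both databases share hv1
    "app-1":         ("hv2", "app"),
    "app-2":         ("hv3", "app"),
    "redis-1":       ("hv2", "redis"),
}

print(contention_risks(PLACEMENT))
```

Here the check flags `hv1` because both database VMs land on it; moving one of them to another host clears the result.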
Any help you can give would be greatly appreciated.