Help with Application Infrastructure / Architecture
-
Welcome to the MangoLassi community!
-
Why is Redis hosted on Azure? Redis on Azure itself is fine, I'm wondering why your pub/sub is being handled remotely from the physical servers. Seems like having Redis local to the rest of the system would mean faster responses with less overhead. Redis is pretty trivial to manage and runs great on Linux, so it can all be done for free.
-
I wondered the exact same thing. It was set up that way by our lead dev, (who also co-founded our company), thinking that we were going to migrate EVERYTHING to Azure, but then realized upon testing that it's much slower there. I guess he's got a lot of API calls pointing there and either doesn't want / hasn't had time to move things local. IT wasn't consulted about this beforehand, so we became stuck with it after the fact. This is the only part of our setup that wasn't in place prior to my arrival, and I didn't know about it until a few weeks ago. I just thought we had part of one of our websites in Azure.
-
@jn19 said:
I wondered the exact same thing. It was set up that way by our lead dev, (who also co-founded our company), thinking that we were going to migrate EVERYTHING to Azure, but then realized upon testing that it's much slower there.
And more expensive... and unreliable. LOL
If cost was a concern, why is SQL Server in there?
-
@jn19 said:
As it stands, I need help with everything from hardware sizing to SQL licensing. I don’t know the best way to license SQL 2014 for our purposes, and I know that our lead dev would prefer the Enterprise version for some of the extra features, but I think the costs will be too prohibitive.
Right now we’re running IIS 8 on Server 2012, with a SQL 2012 backend. The application’s in .NET 4.6.1, and we’ll be moving it to IIS 8.5 soon. Unfortunately, our entire setup is dependent upon IIS, .NET and SQL, and I don’t think that will be changing anytime soon, if ever.
That's really bad. All of those are good tech, but as a SaaS vendor you are really, really stuck. You need all kinds of crazy licensing for these in a public setting and will need to monitor those licenses for forever. This is both a licensing and a human cost that will be enormous.
.NET can be run without Windows Servers, but Microsoft makes that free knowing that people will use it as an excuse to get mired into costly MS licenses and it works.
Getting away from SQL Server is your top concern in this list. Getting to PostgreSQL would save you a developer or two's salary in licensing improvements.
-
The hardware design seems very odd to say the least.
3 servers, each with dual-socket boards and 6-core CPUs.
Are you looking to redesign everything from the ground up, replace equipment etc?
-
@jn19 said:
- Web Users – The web application is used by another 30-40 users at any given time.
Response times on the web application are often painfully slow. Some queries can take over 100 seconds. I’ve run a lot of SQL health scripts from Brent Ozar’s site and those have helped a bit, but I don’t think the speed issue lies with SQL. IIS seems to be the culprit.
That is not many users at all. If you are getting performance issues from the IIS / .NET layer with that few users and you don't feel that the database is a bottleneck, then there is a really, really good chance that you have a code problem in .NET that needs to be addressed. It might be that some really critical components are blocking and waiting on things that they should not be waiting on. How many threads do you have working? MVC depends on heavy external concurrency for performance so this is very important.
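To put rough numbers on why blocking matters more than user count, here is a minimal sketch of the arithmetic. The thread count and the per-request blocking time are made-up figures for illustration, not measurements from this system, and the function names are hypothetical.

```python
import math


def max_throughput(worker_threads: int, blocked_seconds: float) -> float:
    """Upper bound on requests/second when each request holds a thread for blocked_seconds."""
    return worker_threads / blocked_seconds


def time_to_drain(queued_requests: int, worker_threads: int, blocked_seconds: float) -> float:
    """Seconds until the last request finishes if all of them arrive at the same moment."""
    batches = math.ceil(queued_requests / worker_threads)
    return batches * blocked_seconds


if __name__ == "__main__":
    # Hypothetical: 4 usable threads, every request blocks for 10 seconds.
    print(max_throughput(4, 10.0))     # 0.4 requests per second
    print(time_to_drain(40, 4, 10.0))  # 100.0 seconds for 40 simultaneous requests
```

With those assumed numbers, 40 simultaneous requests take 100 seconds to clear even though the CPUs sit nearly idle the whole time, which is why faster hardware would not change the result.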
-
I think you're trying to work these issues backwards. You have performance problems, you suspect where the problems are, so you're trying to solve them by throwing more hardware at it. You need to step back and really figure out what's going on there. Monitor the entire setup for a few days and see if there are obvious bottlenecks, like CPU, RAM or disk IO.
And like Scott said above, it really sounds like you have some bad code there.
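As a rough sketch of what "look for obvious bottlenecks" could mean once you have a few days of samples: flag any metric that sits above a limit for most of the sampling window. The metric names, thresholds and sample values below are all hypothetical; the real numbers would come from whatever monitoring tool you use.

```python
def find_bottlenecks(samples, thresholds, sustained_fraction=0.5):
    """Return metrics that were at or above their limit in at least
    sustained_fraction of the samples (a brief spike is not a bottleneck)."""
    flagged = []
    for metric, limit in thresholds.items():
        values = [sample[metric] for sample in samples]
        over = sum(1 for value in values if value >= limit)
        if values and over / len(values) >= sustained_fraction:
            flagged.append(metric)
    return flagged


if __name__ == "__main__":
    # Hypothetical samples: CPU mostly idle, RAM fine, disk queue spikes once.
    samples = [
        {"cpu_pct": 12, "ram_pct": 60, "disk_queue": 0.4},
        {"cpu_pct": 15, "ram_pct": 62, "disk_queue": 6.0},
        {"cpu_pct": 10, "ram_pct": 61, "disk_queue": 0.3},
        {"cpu_pct": 14, "ram_pct": 63, "disk_queue": 0.5},
    ]
    thresholds = {"cpu_pct": 85, "ram_pct": 90, "disk_queue": 2.0}
    print(find_bottlenecks(samples, thresholds))  # [] means no sustained hardware bottleneck
```

If a check like this comes back empty while requests still take a long time, that points away from hardware and back toward the application.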
-
I agree with @marcinozga , there is a really good chance that hardware won't solve the issue here. It might mitigate it some, it might hold it off but as you scale it will just get worse and worse most likely, if the hardware even does anything.
-
You've got me! They've been leasing servers this whole time on a monthly basis, so over the last 3-4 years the company has probably paid $30k each for machines that might have been $7k-8k new. SQL Standard's been bundled into that monthly price at around ~$900/mo/SQL server (dual-proc hexacore machines), so full licenses could have easily been bought for that by now. Plus, the SQL servers are generally at maybe 10-15% CPU usage for the "master" server, and maybe 5% at most for the "slave" server, the latter of which is where the app server and clients pull most of their data. It's just been a lack of good long-term planning, really. I'm trying to help now that I'm here, but it would have been nice to have been here before everything was coded and put into production.
I do wonder what, from a technical standpoint, keeps us from using something like Postgresql, as we do industrial automation, and all of the data acquisition devices we utilize have Linux drivers available. Not that we'd ever have time to rewrite things to switch to it, but I wonder nonetheless.
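For what it's worth, the arithmetic behind those numbers can be sketched like this. The $900/mo, $30k, $7k-8k and 3-4 year figures are the rough ones from my post; the helper names are just for illustration, and the implied monthly lease rate is derived, not something from an invoice.

```python
import math


def cumulative_cost(monthly_fee: float, months: int) -> float:
    """Total paid after the given number of months at a flat monthly fee."""
    return monthly_fee * months


def breakeven_months(purchase_price: float, monthly_fee: float) -> int:
    """First month in which cumulative lease payments reach the purchase price."""
    return math.ceil(purchase_price / monthly_fee)


if __name__ == "__main__":
    for years in (3, 4):
        months = years * 12
        # Bundled SQL Standard fee per SQL server over the lease so far.
        print(years, "years of SQL fees:", cumulative_cost(900, months))
        # Implied per-machine lease rate if about $30k was paid over that time.
        implied_monthly = 30_000 / months
        print("  months to pay off an $8k machine:", breakeven_months(8_000, implied_monthly))
```

Under those assumptions, the lease payments would have covered the purchase price of each machine in roughly the first year.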
-
Oh, I agree that it should run quite well on the current hardware, given the right setup. Here's some info from one of the hang reports in LeanSentry, which has been a pretty handy tool for IIS analytics:
[img]http://i.imgur.com/djV1cRb.png[/img]
[img]http://i.imgur.com/c0NCrlb.png[/img]
The blocked request location in this instance was a "Session in AcquireRequestState."
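One possible reading of a hang in AcquireRequestState, and it is only an assumption that it applies here, is ASP.NET's session-state locking: by default a request with read-write session state takes an exclusive lock on its session, so concurrent requests from the same session run one after another. The sketch below is a simplified model of that effect, not ASP.NET code; it assumes all requests arrive at once and that worker threads are unlimited.

```python
def completion_times(requests, serialize_per_session):
    """requests: list of (session_id, duration_seconds), all arriving at t=0.
    Returns the finish time of each request, in the order given."""
    session_free_at = {}
    finished = []
    for session_id, duration in requests:
        # With the lock, a request cannot start until the previous request
        # in the same session has released it.
        start = session_free_at.get(session_id, 0.0) if serialize_per_session else 0.0
        end = start + duration
        session_free_at[session_id] = end
        finished.append(end)
    return finished


if __name__ == "__main__":
    # Hypothetical: one page fires five 2-second calls under the same session.
    page_calls = [("user-a", 2.0)] * 5
    print(completion_times(page_calls, serialize_per_session=True))   # [2.0, 4.0, 6.0, 8.0, 10.0]
    print(completion_times(page_calls, serialize_per_session=False))  # [2.0, 2.0, 2.0, 2.0, 2.0]
```

If that is what is happening, the delay grows with the number of calls per session rather than with server load, which would fit slow pages on mostly idle hardware. Whether session state can be made read-only or turned off for some pages is a question for the developer.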
-
I'm basically looking for the best ways to improve performance that I can control, i.e. any IIS/SQL/Server 2012 configuration or architecture changes that can be made that will require little to no work on the part of the developer(s). I've got full access to these machines but I have no software development experience, so I just want to do what I can to get things running more smoothly.
-
From a platform perspective this seems strange to me that your devs are not the ones working to fix these issues.
Unless you are responsible for application performance as well as hardware performance?
-
@Dashrender said:
From a platform perspective this seems strange to me that your devs are not the ones working to fix these issues.
Unless you are responsible for application performance as well as hardware performance?
I concur, but since the only dev on the main application (co-founder/co-owner/boss's boss) is convinced that it's hardware or some simple configuration setting somewhere that's causing the issue, I figure I should go ahead and investigate every avenue of improvement that I can touch!
-
@jn19 said:
You've got me! They've been leasing servers this whole time on a monthly basis, so over the last 3-4 years the company has probably paid $30k each for machines that might have been $7k-8k new. SQL Standard's been bundled into that monthly price at around ~$900/mo/SQL server (dual-proc hexacore machines), so full licenses could have easily been bought for that by now. Plus, the SQL servers are generally at maybe 10-15% CPU usage for the "master" server, and maybe 5% at most for the "slave" server, the latter of which is where the app server and clients pull most of their data. It's just been a lack of good long-term planning, really. I'm trying to help now that I'm here, but it would have been nice to have been here before everything was coded and put into production.
I do wonder what, from a technical standpoint, keeps us from using something like Postgresql, as we do industrial automation, and all of the data acquisition devices we utilize have Linux drivers available. Not that we'd ever have time to rewrite things to switch to it, but I wonder nonetheless.
Doesn't matter if they have Linux drivers... PostgreSQL is the database only, the application layer can happily run on Windows. Not that that would be my first choice, just saying that the application talks to PostgreSQL over the network, so the platform that PostgreSQL itself runs on isn't a factor for other things.
-
@jn19 said:
I'm basically looking for the best ways to improve performance that I can control, i.e. any IIS/SQL/Server 2012 configuration or architecture changes that can be made that will require little to no work on the part of the developer(s). I've got full access to these machines but I have no software development experience, so I just want to do what I can to get things running more smoothly.
The problem is... those aren't the places to fix things and you could drop a million dollars and do effectively nothing. It looks like you have a code problem, throwing money and hardware at it might do nothing.
-
@jn19 said:
@Dashrender said:
From a platform perspective this seems strange to me that your devs are not the ones working to fix these issues.
Unless you are responsible for application performance as well as hardware performance?
I concur, but since the only dev on the main application (co-founder/co-owner/boss's boss) is convinced that it's hardware or some simple configuration setting somewhere that's causing the issue, I figure I should go ahead and investigate every avenue of improvement that I can touch!
Oh man... the same guy that caused all of the performance and cost problems already? That doesn't sound like a healthy situation.
-
@Dashrender said:
From a platform perspective this seems strange to me that your devs are not the ones working to fix these issues.
Well, one of the problems with devs causing issues is that often they caused them because they don't really know what they are doing and so can't fix them because they aren't sure why or how it all works.
This is not just suggested by several of the scenarios that the OP mentioned about how they got to where they are, but using SQL Server and IIS for SaaS apps, Redis on Azure, misunderstanding the goals of cloud and such all have the same "not necessarily but realistically... bad developers" problem. It's all "tech I heard about from my first year college professor" who, in turn, was a failed developer that's never worked in the real world and when put together is just a chain of disaster.
Someone in a position of decision making who understands the tech, even a little, looking at the cost associated with all of the Windows Server and SQL Server licenses and that scaling cost as they take the product public would nearly always put a stop to using those technologies before the first line of code was written. Sure, there are exceptions, but few and far between. Those technologies cost a fortune and create licensing problems that are staggering.
It's hard to tell but it sounds like just one bad decision layered on another and people not willing to take ownership of their mistakes leading to an attempt to throw money (VC money, perhaps) at a problem to hide the fact that the person responsible doesn't want to take ownership of the issue.
-
@jn19 said:
I've got full access to these machines but I have no software development experience, so I just want to do what I can to get things running more smoothly.
I'm not saying quit, but this is when you prep your resume and start looking. I'm not being funny in any way. It's impossible to read the situation from here, but everything that we are hearing is that you have completely incompetent developers and management and they are driving the product into the ground and throwing money away like crazy and are trying to blame IT for their failings. These aren't the kinds of things that are likely to fix themselves down the road. This is the making of a bad situation - most likely just a company failing. But if this is supposed to be software to sell to customers, how will these problems play out at that point? How will paying customers react to being told to "buy faster desktops" or other insane things when the application isn't fast enough for them?
-
So your IIS worker process is blocked, that IS a code issue. Likely the only fix for this is adding IIS workers and throwing threads at the issue. That's about it.