Is this server strategy reckless and/or insane?
-
@creayt said in Is this server strategy reckless and/or insane?:
@scottalanmiller said in Is this server strategy reckless and/or insane?:
@creayt said in Is this server strategy reckless and/or insane?:
@scottalanmiller said in Is this server strategy reckless and/or insane?:
I would never, ever use SQL Server for something like this. I was only pointing out that in the closed source world that it is faster. PostgreSQL is a WAY better choice. But none of them seem like good choices for your project as they are all relational and relational will be a major problem.
But all of my data is relational, the nature of the project is relational, I don't know how I could even do it w/ NoSQL unless I just duplicated IDs/data everywhere.
What about it makes it relational? Is it financial data?
It's people interacting w/ public web content as intermingle-able groups, having cross-pollinating conversations about it, relating each conversation, participant, tag, and content piece to each other, classifying it in personal and group contexts for future relation, and using various analytical algorithms, eventually AI, to analyze the relationships between the different data at each tier in the hierarchy and use it as a suggestion engine to expose users to new groups, conversations, content, and other users, in a nutshell.
That's like textbook NoSQL target content there. Conversations, groups, tagging, analytics.... it's like the "who's who" of NoSQL target topics.
-
You are describing tasks often handled by engines like Hadoop, ElasticSearch, Cassandra, MongoDB, etc.
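To make the document-style approach a bit more concrete, here is a rough sketch in plain Python, with no real database involved. Every field name (`_id`, `participants`, `tags`, and so on) and both helper functions are made up for illustration; they are not from any particular engine. The idea is that each conversation carries its own tags and participant IDs, and a suggestion is just "conversations that share a tag with ones you are already in."

```python
from collections import defaultdict

# Hypothetical documents: each conversation embeds its own tags and
# participant IDs instead of linking to them through join tables.
conversations = [
    {"_id": "c1", "content_url": "https://example.com/a", "group": "g1",
     "participants": ["u1", "u2"], "tags": ["raid", "ssd"]},
    {"_id": "c2", "content_url": "https://example.com/b", "group": "g2",
     "participants": ["u2", "u3"], "tags": ["ssd", "nosql"]},
    {"_id": "c3", "content_url": "https://example.com/c", "group": "g2",
     "participants": ["u4"], "tags": ["nosql"]},
]


def build_tag_index(docs):
    """Map each tag to the set of conversation IDs that carry it."""
    index = defaultdict(set)
    for doc in docs:
        for tag in doc["tags"]:
            index[tag].add(doc["_id"])
    return index


def suggest_conversations(user_id, docs):
    """Suggest conversations the user is not in that share a tag
    with at least one conversation the user is already in."""
    index = build_tag_index(docs)
    joined = {d["_id"] for d in docs if user_id in d["participants"]}
    user_tags = {t for d in docs if d["_id"] in joined for t in d["tags"]}
    candidates = set()
    for tag in user_tags:
        candidates |= index[tag]
    return sorted(candidates - joined)


print(suggest_conversations("u1", conversations))
```

IDs still get repeated across documents, which is the duplication concern raised earlier; the trade is that one read returns a conversation with everything attached to it.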
-
@scottalanmiller said in Is this server strategy reckless and/or insane?:
@creayt said in Is this server strategy reckless and/or insane?:
@scottalanmiller said in Is this server strategy reckless and/or insane?:
@creayt said in Is this server strategy reckless and/or insane?:
@scottalanmiller said in Is this server strategy reckless and/or insane?:
I would never, ever use SQL Server for something like this. I was only pointing out that in the closed source world that it is faster. PostgreSQL is a WAY better choice. But none of them seem like good choices for your project as they are all relational and relational will be a major problem.
But all of my data is relational, the nature of the project is relational, I don't know how I could even do it w/ NoSQL unless I just duplicated IDs/data everywhere.
What about it makes it relational? Is it financial data?
It's people interacting w/ public web content as intermingle-able groups, having cross-pollinating conversations about it, relating each conversation, participant, tag, and content piece to each other, classifying it in personal and group contexts for future relation, and using various analytical algorithms, eventually AI, to analyze the relationships between the different data at each tier in the hierarchy and use it as a suggestion engine to expose users to new groups, conversations, content, and other users, in a nutshell.
That's like textbook NoSQL target content there. Conversations, groups, tagging, analytics.... it's like the "who's who" of NoSQL target topics.
Interesting, I'd never heard that before and RDBMS has been so great for any use case I've hit so far that I'd kind of written off NoSQL as being extraneous in any project I've needed a db for. Will look into it, thank you.
As far as hardware, how would what I've described so far work for going w/ NoSQL instead of MySQL? Anything you'd change specifically?
-
@creayt said in Is this server strategy reckless and/or insane?:
@scottalanmiller said in Is this server strategy reckless and/or insane?:
@creayt said in Is this server strategy reckless and/or insane?:
@scottalanmiller said in Is this server strategy reckless and/or insane?:
@creayt said in Is this server strategy reckless and/or insane?:
@scottalanmiller said in Is this server strategy reckless and/or insane?:
I would never, ever use SQL Server for something like this. I was only pointing out that in the closed source world that it is faster. PostgreSQL is a WAY better choice. But none of them seem like good choices for your project as they are all relational and relational will be a major problem.
But all of my data is relational, the nature of the project is relational, I don't know how I could even do it w/ NoSQL unless I just duplicated IDs/data everywhere.
What about it makes it relational? Is it financial data?
It's people interacting w/ public web content as intermingle-able groups, having cross-pollinating conversations about it, relating each conversation, participant, tag, and content piece to each other, classifying it in personal and group contexts for future relation, and using various analytical algorithms, eventually AI, to analyze the relationships between the different data at each tier in the hierarchy and use it as a suggestion engine to expose users to new groups, conversations, content, and other users, in a nutshell.
That's like textbook NoSQL target content there. Conversations, groups, tagging, analytics.... it's like the "who's who" of NoSQL target topics.
Interesting, I'd never heard that before and RDBMS has been so great for any use case I've hit so far that I'd kind of written off NoSQL as being extraneous in any project I've needed a db for. Will look into it, thank you.
As far as hardware, how would what I've described so far work for going w/ NoSQL instead of MySQL? Anything you'd change specifically?
Not really (change) as speed is speed. Databases don't change that much from one to another. They all like RAM, IOPS and other things the same. What IS different about a lot of NoSQL, and keep in mind this has nothing to do with being NoSQL vs. relational but just product commonalities, is that NoSQL clusters tend to be 3+ nodes and relational clusters tend to be pairs.
-
BTW, we are posting on a system that handles everything on the NoSQL MongoDB platform.
-
@creayt said in Is this server strategy reckless and/or insane?:
Interesting, I'd never heard that before and RDBMS has been so great for any use case I've hit so far that I'd kind of written off NoSQL as being extraneous in any project I've needed a db for. Will look into it, thank you.
Until ~10 years ago, RDBMS were so dominant that it was just "how everything was done." But as SaaS started to explode, the needs for growth and performance changed and NoSQL systems started to take off. They are really where the bulk of new stuff goes today, at least of big commercial stuff. SaaS vendors outside of financial use them for nearly everything. They are what power things like Google, Facebook, Change and other large websites that have to handle insane levels of data all over the world.
-
@scottalanmiller said in Is this server strategy reckless and/or insane?:
@creayt said in Is this server strategy reckless and/or insane?:
Interesting, I'd never heard that before and RDBMS has been so great for any use case I've hit so far that I'd kind of written off NoSQL as being extraneous in any project I've needed a db for. Will look into it, thank you.
Until ~10 years ago, RDBMS were so dominant that it was just "how everything was done." But as SaaS started to explode, the needs for growth and performance changed and NoSQL systems started to take off. They are really where the bulk of new stuff goes today, at least of big commercial stuff. SaaS vendors outside of financial use them for nearly everything. They are what power things like Google, Facebook, Change and other large websites that have to handle insane levels of data all over the world.
Have you found any interesting sources talking about what Facebook uses NoSQL for? Here's a recent article from one of their lead DB engineers talking about how they primarily use MySQL for what sounds like most of the persistent stuff that needs to scale to large numbers of users ( mentions shares, comments, and likes explicitly ). Apparently they've written their own storage engine for MySQL which dominates InnoDB and actively maintain their own branch of MySQL itself, which was last committed to 2 hours ago.
https://code.facebook.com/posts/190251048047090/myrocks-a-space-and-write-optimized-mysql-database/
-
In the article I linked to, dude says this: "There are many reasons why we use MySQL at Facebook. MySQL is amenable to automation, making it easy for a small team to manage thousands of MySQL servers..."
Gulp. Thousands. Of. Nodes. Those guys.
-
@creayt said in Is this server strategy reckless and/or insane?:
@scottalanmiller said in Is this server strategy reckless and/or insane?:
@creayt said in Is this server strategy reckless and/or insane?:
Interesting, I'd never heard that before and RDBMS has been so great for any use case I've hit so far that I'd kind of written off NoSQL as being extraneous in any project I've needed a db for. Will look into it, thank you.
Until ~10 years ago, RDBMS were so dominant that it was just "how everything was done." But as SaaS started to explode, the needs for growth and performance changed and NoSQL systems started to take off. They are really where the bulk of new stuff goes today, at least of big commercial stuff. SaaS vendors outside of financial use them for nearly everything. They are what power things like Google, Facebook, Change and other large websites that have to handle insane levels of data all over the world.
Have you found any interesting sources talking about what Facebook uses NoSQL for? Here's a recent article from one of their lead DB engineers talking about how they primarily use MySQL for what sounds like most of the persistent stuff that needs to scale to large numbers of users ( mentions shares, comments, and likes explicitly ). Apparently they've written their own storage engine for MySQL which dominates InnoDB and actively maintain their own branch of MySQL itself, which was last committed to 2 hours ago.
https://code.facebook.com/posts/190251048047090/myrocks-a-space-and-write-optimized-mysql-database/
That's a weird article. I'm not sure how much I'd trust that, even though it is hosted on Facebook, it doesn't feel logical. And doesn't match anything we see anywhere else. It sounds like, from how they describe it, it's one small piece used for isolated processes. But even in what they describe, it's not how you are picturing it. They are using a NoSQL database that is just managed by MySQL. MySQL itself is a management platform, not a database. Rocks is their database and that is non-relational. So nothing they are talking about there applies to you. That they manage it via MySQL is interesting, but not useful in your case.
Generally, though, Hadoop and Cassandra are what is behind Facebook's main services.
-
@creayt said in Is this server strategy reckless and/or insane?:
In the article I linked to, dude says this: "There are many reasons why we use MySQL at Facebook. MySQL is amenable to automation, making it easy for a small team to manage thousands of MySQL servers..."
Gulp. Thousands. Of. Nodes. Those guys.
This is the NoSQL behind the scenes of what they are using.
-
This topic definitely exploded for today! It did not seem like it was that busy when it was going on. But nearly 200 posts on a single topic!
-
@dustinb3403 I'm running it on microSD .... Brrrr
-
@matteo-nunziati said in Is this server strategy reckless and/or insane?:
@dustinb3403 I'm running it on microSD .... Brrrr
So many posts... running what?
-
@scottalanmiller said in Is this server strategy reckless and/or insane?:
@matteo-nunziati said in Is this server strategy reckless and/or insane?:
@dustinb3403 I'm running it on microSD .... Brrrr
So many posts... running what?
Running hyperv on microSD. HPE microSD.
-
About benchmarking: I ran some tests on my new server before deployment. Disabling the controller and disk caches helped a lot in understanding the real performance of the disks.
I've seen sata ssd x4 raid5 outperform 15k sas x4 raid 10.
Enabling cache at the controller level blends things together, even with big files, making benchmarks a bit more blurry.
-
@matteo-nunziati said in Is this server strategy reckless and/or insane?:
I've seen sata ssd x4 raid5 outperform 15k sas x4 raid 10.
x4 SSD any RAID level will outperform x4 15k HDD in any configuration
You're looking at a max of like 250ish realistic IOPS with 15k HDDs. Sure, you can get more at like 100% sequential reads, but not in typical use.
An SSD will give at least tens of thousands of IOPS per drive, up to hundreds of thousands per drive. There really is no comparison.
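Rough back-of-the-envelope math for that comparison, as a sketch only. The 250 IOPS per 15k HDD is the figure above; the 20,000 IOPS per SSD (low end of "tens of thousands"), the 70/30 read/write mix, and the function name are assumptions picked for illustration. The write penalties (2 for RAID 10, 4 for RAID 5) are the usual rule-of-thumb values, and caching is ignored entirely.

```python
def effective_iops(drives, iops_per_drive, write_penalty, read_fraction=0.7):
    """Rule-of-thumb array IOPS for a mixed workload.

    Each logical read costs one back-end I/O; each logical write costs
    `write_penalty` back-end I/Os (2 for RAID 10, 4 for RAID 5).
    """
    raw = drives * iops_per_drive
    write_fraction = 1.0 - read_fraction
    return raw / (read_fraction + write_fraction * write_penalty)


# 4 x 15k SAS in RAID 10, ~250 IOPS per drive (figure from the post above)
hdd_raid10 = effective_iops(drives=4, iops_per_drive=250, write_penalty=2)

# 4 x SATA SSD in RAID 5, 20,000 IOPS per drive (assumed, conservative)
ssd_raid5 = effective_iops(drives=4, iops_per_drive=20_000, write_penalty=4)

print(f"15k HDD RAID 10: ~{hdd_raid10:,.0f} IOPS")
print(f"SATA SSD RAID 5: ~{ssd_raid5:,.0f} IOPS")
print(f"SSD advantage:   ~{ssd_raid5 / hdd_raid10:.0f}x")
```

Even with RAID 5's heavier write penalty and a conservative per-drive SSD number, the SSD array comes out far ahead under these assumptions.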
-
@matteo-nunziati said in Is this server strategy reckless and/or insane?:
@scottalanmiller said in Is this server strategy reckless and/or insane?:
@matteo-nunziati said in Is this server strategy reckless and/or insane?:
@dustinb3403 I'm running it on microSD .... Brrrr
So many posts... running what?
Running hyperv on microSD. HPE microSD.
Gotcha. Thanks.
-
@tim_g said in Is this server strategy reckless and/or insane?:
An SSD will give at least tens of thousands of IOPS per drive, up to hundreds of thousands per drive. There really is no comparison.
And that's on SATA. Go to PCIe and you can breach a million per drive!
-
@matteo-nunziati said in Is this server strategy reckless and/or insane?:
@scottalanmiller said in Is this server strategy reckless and/or insane?:
@matteo-nunziati said in Is this server strategy reckless and/or insane?:
@dustinb3403 I'm running it on microSD .... Brrrr
So many posts... running what?
Running hyperv on microSD. HPE microSD.
You can do it. It is not recommended, and Windows will not install itself there. You have to work around the installer to install it to an SD card.
-
@tim_g said in Is this server strategy reckless and/or insane?:
@matteo-nunziati said in Is this server strategy reckless and/or insane?:
I've seen sata ssd x4 raid5 outperform 15k sas x4 raid 10.
x4 SSD any RAID level will outperform x4 15k HDD in any configuration
You're looking at a max of like 250ish realistic IOPS with 15k HDDs. Sure, you can get more at like 100% sequential reads, but not in typical use.
An SSD will give at least tens of thousands of IOPS per drive, up to hundreds of thousands per drive. There really is no comparison.
I know. My point was that cache tends to blur things. Disabling it is the best way to compare I/O across different configurations.