SCDPM 2016 using MBS
-
This is the backup server HOST:
This is the DPM VM. Note that I over-provisioned storage on purpose for dedup on the back end:
-
Side note: The smaller physical volume (1.9 TB (RAID10)) is the internal storage on the R420 (serv-backup). The larger physical volume (5.91 TB (RAID10)) is a bunch of random spinning rust in an MD1000 attached to the R420.
-
The initial backup (data sync) of the 3 TB and 300 GB servers was very fast, much better than the current production backup method. That alone would be a nice improvement, leaving more time for maintenance after backups are done.
-
Some updates:
A couple days ago, I added a third production server to the backup test that is about 1.6 TB.
All protection groups are scheduled to back up daily (excluding weekends). They all complete pretty fast, even without taking advantage of the new Resilient Change Tracking (RCT) technology, which is used when VMs are running at configuration version 8.0 (Hyper-V 2016).
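Side note for anyone checking which of their VMs are RCT-eligible: the configuration version is visible from the standard Hyper-V PowerShell module on the host. A quick sketch (the VM name below is just an example):

```powershell
# VMs must be at configuration version 8.0 to use RCT-based backup on Hyper-V 2016.
Get-VM | Select-Object Name, Version

# Upgrade an older VM's configuration version (the VM must be shut down first).
# "VM1" is an example name, substitute your own.
Update-VMVersion -Name "VM1"
```

Keep in mind the version upgrade is one-way; the VM can't move back to an older Hyper-V host afterward.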
I now have three test protection groups (split weirdly for testing and tracking):
- VM1 - fileserver (3.2 TB of data) (Test group 1)
- Hyper-V Host running 6 VMs (255 GB of data) (Test group 2)
- VM2 - application server (1.61 TB of data) (Test group 3)
Total data to back up: 5065 GB (5.1 TB)
I have between 3 and 6 recovery points for each VM or server, depending on which one it is.
DPM Admin Console shows the following amounts of backup storage capacity being used for each protection group:
Test Group 1: 3250 GB
Test Group 2: 258 GB
Test Group 3: 1658 GB
Total backup storage capacity being used: 5166 GB
That total includes the 3 to 6 recovery points per VM.
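If you'd rather pull those per-group numbers from the DPM Management Shell than the Administrator Console, something along these lines should work. This is a sketch, not a script I've run as-is: the server name is a placeholder, and the exact datasource size property can vary between DPM versions, so check what your Datasource objects actually expose:

```powershell
# List each protection group and a rough disk usage total for its datasources.
# "serv-DPM" is a placeholder for your DPM server name.
$groups = Get-DPMProtectionGroup -DPMServerName "serv-DPM"

foreach ($pg in $groups) {
    $ds = Get-DPMDatasource -ProtectionGroup $pg
    # ReplicaSize is an assumption; inspect ($ds | Get-Member) on your version.
    $usedGB = ($ds | Measure-Object -Property ReplicaSize -Sum).Sum / 1GB
    "{0}: {1:N0} GB" -f $pg.FriendlyName, $usedGB
}
```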
Now on the DPM Host:
You can see I'm averaging over 50% space savings.
Backup space savings of over 50%, upwards of 75% depending on the group, plus the speed it runs at (with further optimization still available), show this is going rather nicely, and it sure beats the current process.
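For anyone who wants the same savings numbers from PowerShell on the dedup host instead of the GUI, the built-in dedup cmdlets report them per volume. A sketch, run on the backup host; the drive letters are placeholders for the internal RAID10 volume and the MD1000 volume:

```powershell
# Per-volume dedup savings. E: and F: are placeholder drive letters.
Get-DedupVolume -Volume "E:", "F:" |
    Select-Object Volume, SavedSpace, SavingsRate

# More detail: optimized/in-policy file counts, last job results, etc.
Get-DedupStatus -Volume "E:", "F:" | Format-List *
```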
-
Next (after some more data testing), I want to test backup replication, tape, and cloud using different retention ranges, backup frequencies, recovery points, backup modes, etc.
-
I will also be testing virtual tape (via iSCSI) using Starwind. I plan on using that to replicate the existing backups to another location.
-
@Tim_G said in SCDPM 2016 using MBS:
I will also be testing virtual tape (via iSCSI) using Starwind. I plan on using that to replicate the existing backups to another location.
-
Well, I finally finally finally found HP drivers for my LTO2 tape drive test on Windows Server 2016.
Some HPE driver repository page...
Hopefully I can save someone else the trouble, here's the link: https://downloads.linux.hpe.com/SDR/repo/spp/2016.04.0/hp/swpackages/
cp023805.exe worked for me.
Now I just need to find an LTO2 tape. In the meantime, I'm working on getting an MD3000 running at another site for vTape testing with DPM 2016... using StarWind virtual tape redirector.
-
Tiny update:
To make things easier:
- Backup host = HOST1 (MD1000 is plugged into HOST1 (serv-backup))
- DPM server = VM1 (running on HOST1 (serv-DPM))
This whole thing is pretty resilient, it seems. Due to a clerical error, the MD1000 was accidentally power-cycled instead of a different system. Somehow, the physical host (HOST1) got stuck in a state where dedup optimization was in progress, while the DPM VM (VM1) running on it was also syncing backups. Normally that shouldn't matter; it should just push through, slower than usual.
But it was stuck for over a week. I didn't notice for that long because it's just a test system, my attention was elsewhere, and no alerting was set up.
RAM was at 100% use too.
I "turned off" (not shut down) VM1. I rebooted HOST1, then did Windows updates and rebooted again. I manually ran a dedup optimization on both storage volumes (internal storage and the MD1000), along with garbage collection and scrubbing jobs on both volumes.
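The manual dedup jobs above can be kicked off from PowerShell like this. Again a sketch: E: and F: are placeholders for the internal volume and the MD1000 volume:

```powershell
# Queue optimization, garbage collection, and scrubbing on both dedup volumes.
# E: and F: are placeholder drive letters for the two RAID10 volumes.
foreach ($vol in "E:", "F:") {
    Start-DedupJob -Volume $vol -Type Optimization
    Start-DedupJob -Volume $vol -Type GarbageCollection
    Start-DedupJob -Volume $vol -Type Scrubbing
}

# Watch progress until the job queue drains.
Get-DedupJob
```

The jobs queue per volume and run in order, so it's safe to fire all three at once like this.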
Now HOST1 was in good shape. So I booted up VM1, did the update/reboot dance.
The existing production test backup jobs synchronized (backed up) automatically on their own: mainly a 4 TB file server VM and a 1.5 TB application server VM. It only took half an hour. In fact, I didn't even know it had done it until hours later.
No backup data loss.
I'm now in the process of adding more config version 8.0 VMs to a new protection job to back up using DPM 2016. It's pretty cool to see it working, how it is randomly distributing the data to the virtual disks that get deduped:
-
I also made some adjustments to RAM usage.
HOST1 only has 24 GB of RAM.
VM1 uses 12 GB, non-dynamic, as it houses the MS SQL database and also runs SCDPM. I'm sure I could cut this in half to 6 GB, as it's barely being used, but I'm not going to at this time. I'm also trying to push limits and have things break in this test environment before moving to production, so I'm trying to cut things close.
Anyway, dedup daily optimization is set to use 50% of RAM. Because VM1 is using the other half, the server gets really slow when dedup optimization occurs. Nothing else is happening on the server during this time, so it shouldn't matter, but I am doing stuff on it for testing purposes, and I just can't do anything during the hour or two it takes dedup optimization to run.
So I edited the amount of RAM dedup uses just for the daily optimization schedule:
Set-DedupSchedule -Name DailyOptimization -Memory 35
For my test environment, I found 35% to be good. Now I have some memory to work with during the time it's running dedup optimization (6am)... which is also the best time for me to work on things.
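To confirm the change took (and to compare it against the other schedules), Get-DedupSchedule shows the memory cap per schedule. Roughly what I check; the schedule name matches the custom one above, yours may differ:

```powershell
# Check the memory cap (percentage of RAM) on each dedup schedule.
Get-DedupSchedule | Select-Object Name, Type, Memory

# Or just the one schedule edited above; should now report 35.
(Get-DedupSchedule -Name "DailyOptimization").Memory
```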