Why do my dom0s have only 600MB RAM?
-
@momurda said in Why do my dom0s have only 600MB RAM?:
From my limited Xen knowledge dom0 is essential for stable Xenserver performance, esp for file io. Why the hell would anyone ever lower that amount? To make me angry 2 years later? They never even met me... To save a hundred dollars instead of buying ram?
How much RAM are your VMs taking up?
-
They all have fixed RAM amounts, mostly 2-8 GB. Total memory usage on each host is way less than the 128 GB available to each host. Yet the dom0 memory allocation was reduced by 20% at some point in the past.
Ridiculous, though it could have been a step that someone who worked here in the past took when these hosts only had 32 GB of RAM each and the same number of VMs (25-30).
edit: clarity
-
Oh, I see, someone who works there might have reduced them at some point. Now I see. Yes, expand those a bit
@DustinB3403 how much memory does your Dom0 have?
-
@scottalanmiller said
@DustinB3403 how much memory does your Dom0 have?
Oh, what? You can't ask me?
Probably for the best since I don't know how to check.
See me in 2017.
-
@BRRABill said in Why do my dom0s have only 600MB RAM?:
@scottalanmiller said
@DustinB3403 how much memory does your Dom0 have?
Oh, what? You can't ask me?
Probably for the best since I don't know how to check.
See me in 2017.
free -m
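If you only want the one number, something like this should pull the total out of the `free -m` output. It is a rough sketch: `dom0_total_mb` is just a name made up for illustration, and it assumes the usual layout where the `Mem:` row has the total in the second column.

```shell
# Hypothetical helper: reads `free -m` output on stdin and prints
# the "total" column (in MB) from the Mem: row.
dom0_total_mb() {
    awk '/^Mem:/ { print $2 }'
}

# On the XenServer console (which is dom0) you would run:
#   free -m | dom0_total_mb
```

Since the console shell runs inside dom0, the total reported there is the memory dom0 sees, not the memory of the whole host.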
-
-
Discrete systems.
-
Here is my least powerful system.
This host has 8GB of RAM (really 7167 MB) and dom0 is using 1015 MB
My other two hosts have 65513 MB total with 3134 MB to dom0, and 98210 MB total with 5577 MB to dom0, respectively.
-
@DustinB3403 said in Why do my dom0s have only 600MB RAM?:
Here is my least powerful system.
This host has 8GB of RAM (really 7167 MB) and dom0 is using 1015 MB
My other two hosts have 65513 MB total with 3134 MB to dom0, and 98210 MB total with 5577 MB to dom0, respectively.
Looks like it only has 596MB.
-
Correct, it does for the least powerful of my servers. This unit only has 8GB of ram in it.
But it was a simple, straight install of XS 6.5.
So the installation possibly scales the memory for Dom0 based on the hardware it finds present.
@momurda how much memory does this host have?
-
Yes, it should be scaling based on the memory in the system.
-
Lack of RAM is not the problem. Each host has 128 GB. Someone forced the hosts to use 608 MB. I changed them last night per the Citrix document, but set them to use 3 GB. We will see if that solves the problem. dom0 was hitting less than 20 MB free during normal operations, which is bad. Now it is not.
http://support.citrix.com/article/CTX134951 -- dom0 should be using between 1024 MB and 4096 MB, per the instructions:
Instructions
To configure the dom0 memory, complete the following procedure:
On the XenServer host, open a local shell and log on as root.
Enter the following command:
/opt/xensource/libexec/xen-cmdline --set-xen dom0_mem=<nn>M,max:<nn>M
Note: <nn> represents the memory in megabytes, to be allocated to dom0. This value should be between 1024 and 4096, depending on the number of VMs that are expected to run and the total memory of the host. A higher value results in less memory being available to the VMs.
Example: /opt/xensource/libexec/xen-cmdline --set-xen dom0_mem=4096M,max:4096M
Note: The above command will set the dom0 memory to 4GB or 4096 MB.
Reboot the XenServer host using XenCenter or reboot command on the console.
After the host is rebooted, run the command free on the console to verify the new memory settings.
-
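For the xen-cmdline step in the Citrix procedure above, here is a small sketch of a wrapper that checks the value against the 1024 to 4096 range from the article before you paste anything into a root shell. `dom0_mem_cmd` is a hypothetical helper, not something that ships with XenServer, and it only prints the command instead of running it.

```shell
# Hypothetical helper (not part of XenServer): prints the xen-cmdline
# invocation from CTX134951 for a given dom0 size in MB. It does not run it.
dom0_mem_cmd() {
    mb="$1"
    # Reject anything that is not a plain non-negative integer.
    case "$mb" in
        ''|*[!0-9]*)
            echo "usage: dom0_mem_cmd <megabytes>" >&2
            return 1
            ;;
    esac
    # CTX134951 says the value should be between 1024 and 4096.
    if [ "$mb" -lt 1024 ] || [ "$mb" -gt 4096 ]; then
        echo "dom0_mem should be between 1024 and 4096 MB, got $mb" >&2
        return 1
    fi
    echo "/opt/xensource/libexec/xen-cmdline --set-xen dom0_mem=${mb}M,max:${mb}M"
}
```

For example, `dom0_mem_cmd 3072` prints the command for a 3 GB dom0, while `dom0_mem_cmd 608` is refused. The host still needs a reboot afterwards, as the article says.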
@momurda said in Why do my dom0s have only 600MB RAM?:
Lack of RAM is not the problem. Each host has 128 GB. Someone forced the hosts to use 608 MB. I changed them last night per the Citrix document, but set them to use 3 GB. We will see if that solves the problem. dom0 was hitting less than 20 MB free during normal operations, which is bad. Now it is not.
So someone customized the installation on this host to use a custom amount of RAM?
Time to get out Mr. Slappy (left and right hands and slap that person)
-
@DustinB3403
I have never met the person before me; they were fired. I hope I don't, ever.
-
This time I really thought I had it.
But hopefully I have finally fixed it this afternoon.
Today, right as people were leaving, this error happened again.
This was after I had increased the dom0 memory allocation. Only this time it didn't last 7 minutes, it lasted over 20, probably due to the increased memory available to dom0, as that was the only change I made recently.
SMlog was full of SR_BACKEND_FAILUREs, timeouts, all that bad stuff previously mentioned.
So when it all started working again, I looked at SMlog as it normally is in my environment.
Every 30 seconds there is this message:
'''
XS001 SM: [16965] sr_scan {'sr_uuid': 'cc37b853-066e-fbcb-f5c2-dcca47fd168b', 'subtask_of': 'DummyRef:|737fa116-27fc-0ad6-c923-335d7d645e68|SR.scan', 'args': [], 'host_ref': 'OpaqueRef:ff71ac2a-851d-36dc-43e4-6ea0708498e9', 'session_ref': 'OpaqueRef:7433c31d-3a94-75fa-316b-c0549ce51389', 'device_config': {'username': 'admin', 'type': 'cifs', 'SRmaster': 'true', 'cifspassword_secret':pw hash removed', 'location': '//10.1.0.10/iso'}, 'command': 'sr_scan', 'sr_ref': 'OpaqueRef:77fe45fb-7f66-b2d0-1aac-72e990bfa378'}
'''
I went back and looked through all the archived SMlog.x.gz logs.
This message has been happening for at least 7 months, every 30 seconds without fail. In fact, I thought it was a normal SMlog message because it has been happening at least since my first day on the job. It is also always the same SR uuid that shows up, which is the CIFS ISO share SR for Xen that was made right after installation back in 2013 (not by me). I would guess these types of errors have been happening since 2013, almost like clockwork.
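For anyone who wants to check their own logs for the same pattern, this is roughly how the sr_scan lines can be tallied per SR uuid. It is only a sketch: `count_sr_scans` is a made-up helper name, and the /var/log/SMlog path and the SMlog.*.gz archive naming in the comments are assumptions about where the logs live on the host.

```shell
# Hypothetical helper: reads SMlog-style text on stdin and prints how many
# sr_scan lines were logged for each SR uuid, busiest SR first.
count_sr_scans() {
    grep "sr_scan {" \
        | grep -o "'sr_uuid': '[^']*'" \
        | sort | uniq -c | sort -rn
}

# Assumed usage on the host (paths are assumptions, adjust to your setup):
#   cat /var/log/SMlog | count_sr_scans
#   zcat /var/log/SMlog.*.gz | count_sr_scans
```

If one uuid dominates the counts across every archived log, that SR is the one being rescanned constantly.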
About ready to throw my fist through something at this point, and users are unhappy.
I decided to unplug and forget this SR, and recreate it as an NFS ISO share in XS rather than CIFS/SMB.
I have done that now; all I can do is wait a few days to see if the storage errors occur again. I can say, however, that those SMlog messages don't show up every 30 seconds anymore.
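For reference, the unplug / forget / recreate sequence looks roughly like the sketch below when done from the CLI. `plan_iso_sr_swap` is a hypothetical dry-run helper that only prints the xe commands; the uuids, the NFS location, and the name-label are placeholders, and `shared=true` is an assumption for a pooled setup. In a pool there is one PBD per host, so the unplug would be repeated for each of them.

```shell
# Hypothetical dry-run helper: prints the xe commands, it does not run them.
# Arguments: <old SR uuid> <PBD uuid of that SR> <nfs-server:/export/path>
plan_iso_sr_swap() {
    sr_uuid="$1"
    pbd_uuid="$2"
    nfs_location="$3"
    # Detach the old CIFS ISO SR from the host (repeat per PBD in a pool).
    echo "xe pbd-unplug uuid=${pbd_uuid}"
    # Remove the SR record from XenServer without touching the share itself.
    echo "xe sr-forget uuid=${sr_uuid}"
    # Recreate the ISO library as an NFS ISO SR (name-label is a placeholder).
    echo "xe sr-create type=iso content-type=iso shared=true name-label=\"NFS ISO library\" device-config:location=${nfs_location}"
}
```

Printing the plan first makes it easy to double-check the uuids (from `xe sr-list` and `xe pbd-list`) before running anything for real.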
In fact, the only messages showing up right now are Unitrends snapshotting and attaching itself to VDIs for backups, so at least it is 'back to normal' for now. Though normal is $#@!ed, apparently.
It also sheds light on why the previous guy would have reduced the memory allocation to dom0, as this seems to reduce the duration of these timeouts, while adding more increases it (allegedly; I will know by Monday). If this is the actual fix, it means I will have solved a multiyear problem that 3 other people in my position were unable to solve. I really hope this is it.
And does anybody actually know what that 'error' message in SMlog means? It doesn't have the word error, doesn't say anything bad, just lists some uuids.
Hopefully this horse is way past dead, and I can go clean my shillelagh.