SQL understanding - power outage

scottalanmiller

Jared has it, it's the log and NOT memory. It only takes a moment to write to the log, but takes a bit to get a database lock and update the actual DB, so SQL Server writes to the log first so that it is safe and recorded, then updates the database itself.

bbigford

@JaredBusch said in SQL understanding - power outage:

Are you talking about the transaction log? If you are running MS SQL Server in full recovery mode, everything is wrote to a transaction log file in order to be able to reply everything if needed.

It wasn't necessarily the logs. It's something in the buffer that would cause the logs to have a gap in time (however small). So the admin would do a full restore, recovering as many logs as had been committed to the database. But then there was still some data missing, in the memory buffer.

I'll do some more digging. I read though that article @NetworkNerd, but that wasn't it.

scottalanmiller

If you have atomic commits then that can never happen. SQL Server can definitely protect against lost data.

Dashrender

I thought the logs are what kept you from losing any data? Data is written to the log first, then attempted to write to the DB, if that right fails, the log is updated, and the write is perhaps attempted again.

Could there be data in RAM that hasn't made it to the log yet? Sure, but that data is lost, and shouldn't affect the stability of the DB. Or am I missing something?

scottalanmiller

@Dashrender said in SQL understanding - power outage:

I thought the logs are what kept you from losing any data? Data is written to the log first, then attempted to write to the DB, if that right fails, the log is updated, and the write is perhaps attempted again.

Could there be data in RAM that hasn't made it to the log yet? Sure, but that data is lost, and shouldn't affect the stability of the DB. Or am I missing something?

Correct, what is in RAM that never got to the disks, not even the logs, isn't part of the DB's concern yet.

bbigford

Here's what the person I heard it from said:

"likely easy to talk on the phone (i'm on a conference call, but have time this afternoon).

The term is quiescence, that you are looking for. This means that writes have been halted, and all data in buffer has been flushed. Part of the challenge here is this needs to be done not only at the app layer, the OS layer, the storage layer underneath that (and even at the drive level, as cheap consumer SSDs and SATA drives will have write buffers not protected).

Modern SQL commits writes first to the transaction log then to the database, and in the event of a power loss can "replay" the transaction log. The problem is if the log is large the can take a REALLY long time, and if you have systems underneath that ACK'd writes at the SCSI layer out of order (Consumer level SSDs) this might not properly recover."

Not really sure what data is sitting in the buffer during a power outage though. Meta data waiting to be written to tran logs? So then when a power outage happens and that buffer gets flushed, there's an inconsistency in those tran logs that were mid-write...

Dashrender

@BBigford said in SQL understanding - power outage:

Not really sure what data is sitting in the buffer during a power outage though. Meta data waiting to be written to tran logs? So then when a power outage happens and that buffer gets flushed, there's an inconsistency in those tran logs that were mid-write...

By metadata, does he mean data written back to the log to indicate that the disk did in fact complete it's write?

scottalanmiller

@BBigford said in SQL understanding - power outage:

Here's what the person I heard it from said:

"likely easy to talk on the phone (i'm on a conference call, but have time this afternoon).

The term is quiescence, that you are looking for. This means that writes have been halted, and all data in buffer has been flushed. Part of the challenge here is this needs to be done not only at the app layer, the OS layer, the storage layer underneath that (and even at the drive level, as cheap consumer SSDs and SATA drives will have write buffers not protected).

Modern SQL commits writes first to the transaction log then to the database, and in the event of a power loss can "replay" the transaction log. The problem is if the log is large the can take a REALLY long time, and if you have systems underneath that ACK'd writes at the SCSI layer out of order (Consumer level SSDs) this might not properly recover."

Not really sure what data is sitting in the buffer during a power outage though. Meta data waiting to be written to tran logs? So then when a power outage happens and that buffer gets flushed, there's an inconsistency in those tran logs that were mid-write...

That's where atomic commits come in. They make sure that everything is quiesced before anything goes to the database. ZFS does the same thing for different tasks. This really isn't an issue in a properly designed database - unless you've added a caching layer yourself that doesn't honour the flush commands from higher up the stack. Then it is on your own head. You could always put your database into RAM and cause it to fail like that if you wanted.

bbigford

@scottalanmiller said in SQL understanding - power outage:

@BBigford said in SQL understanding - power outage:

Here's what the person I heard it from said:

"likely easy to talk on the phone (i'm on a conference call, but have time this afternoon).

The term is quiescence, that you are looking for. This means that writes have been halted, and all data in buffer has been flushed. Part of the challenge here is this needs to be done not only at the app layer, the OS layer, the storage layer underneath that (and even at the drive level, as cheap consumer SSDs and SATA drives will have write buffers not protected).

Modern SQL commits writes first to the transaction log then to the database, and in the event of a power loss can "replay" the transaction log. The problem is if the log is large the can take a REALLY long time, and if you have systems underneath that ACK'd writes at the SCSI layer out of order (Consumer level SSDs) this might not properly recover."

Not really sure what data is sitting in the buffer during a power outage though. Meta data waiting to be written to tran logs? So then when a power outage happens and that buffer gets flushed, there's an inconsistency in those tran logs that were mid-write...

That's where atomic commits come in. They make sure that everything is quiesced before anything goes to the database. ZFS does the same thing for different tasks. This really isn't an issue in a properly designed database - unless you've added a caching layer yourself that doesn't honour the flush commands from higher up the stack. Then it is on your own head. You could always put your database into RAM and cause it to fail like that if you wanted.

I found out what he was talking about. Commit to tran log > Application acknowledge transaction > written to DB. The data that gets flushed shouldn't matter because most applications wouldn't acknowledge something that isn't commit. So if there was a power outage then the tran logs could just replay, app acknowledges, and so on.

I was missing something, and only thinking of tran logs and the database. Didn't think he had been talking about the front end.

scottalanmiller

@BBigford said in SQL understanding - power outage:

@scottalanmiller said in SQL understanding - power outage:

@BBigford said in SQL understanding - power outage:

Here's what the person I heard it from said:

"likely easy to talk on the phone (i'm on a conference call, but have time this afternoon).

The term is quiescence, that you are looking for. This means that writes have been halted, and all data in buffer has been flushed. Part of the challenge here is this needs to be done not only at the app layer, the OS layer, the storage layer underneath that (and even at the drive level, as cheap consumer SSDs and SATA drives will have write buffers not protected).

Modern SQL commits writes first to the transaction log then to the database, and in the event of a power loss can "replay" the transaction log. The problem is if the log is large the can take a REALLY long time, and if you have systems underneath that ACK'd writes at the SCSI layer out of order (Consumer level SSDs) this might not properly recover."

Not really sure what data is sitting in the buffer during a power outage though. Meta data waiting to be written to tran logs? So then when a power outage happens and that buffer gets flushed, there's an inconsistency in those tran logs that were mid-write...

That's where atomic commits come in. They make sure that everything is quiesced before anything goes to the database. ZFS does the same thing for different tasks. This really isn't an issue in a properly designed database - unless you've added a caching layer yourself that doesn't honour the flush commands from higher up the stack. Then it is on your own head. You could always put your database into RAM and cause it to fail like that if you wanted.

I found out what he was talking about. Commit to tran log > Application acknowledge transaction > written to DB. The data that gets flushed shouldn't matter because most applications wouldn't acknowledge something that isn't commit. So if there was a power outage then the tran logs could just replay, app acknowledges, and so on.

I was missing something, and only thinking of tran logs and the database. Didn't think he had been talking about the front end.

Ah, okay. Yeah, if the application doesn't wait for the database, it's just being reckless.