ECC (error-correcting code) is a method of detecting and then correcting single-bit memory errors. A single-bit memory error is a data error in which one stored bit changes value; left uncorrected, such errors can corrupt a server's output and seriously affect its operation. Error-correcting code memory (ECC memory) is a type of computer data storage that can detect and correct the most common kinds of internal data corruption. Most motherboards support only one type or the other: workstation and server boards usually require ECC memory, while almost all consumer-grade boards use only non-ECC memory.
ECC memory is used in most computers where data corruption cannot be tolerated under any circumstances, such as for scientific or financial computing.
Most non-ECC memory cannot detect errors, although some non-ECC memory with parity support can detect errors but not correct them.
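The difference between detecting and correcting an error can be illustrated with a small parity sketch. It assumes one even-parity bit per 8-bit byte; the function names `parity_bit` and `parity_ok` are hypothetical and chosen only for this illustration.

```python
def parity_bit(byte: int) -> int:
    """Even-parity bit for an 8-bit value: 1 if the byte holds an odd number of 1s."""
    return bin(byte & 0xFF).count("1") % 2


def parity_ok(byte: int, stored_parity: int) -> bool:
    """True when the parity recomputed on read matches the stored parity bit."""
    return parity_bit(byte) == stored_parity


data = 0b10110010                    # four 1 bits, so the parity bit is 0
stored = parity_bit(data)

one_flip = data ^ 0b00001000         # a single bit changes value
two_flips = data ^ 0b00011000        # two bits change value

single_error_detected = not parity_ok(one_flip, stored)    # True
double_error_detected = not parity_ok(two_flips, stored)   # False
```

Parity reports that something in the byte changed, but not which bit, so the error cannot be repaired. An even number of flipped bits cancels out and goes unnoticed.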
Single-Bit Errors of ECC Memory:
A single-bit error occurs when one bit (a binary 1 or 0) in a byte of data (8 bits) changes to the opposite value (1 to 0, or vice versa). Multiple-bit errors can be detected by single-bit ECC, but cannot always be corrected by it. When correction is not possible, the system discards the corrupted data and reloads it.
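ECC is a logical step beyond parity. It uses multiple parity bits assigned to larger chunks of data to detect and correct single-bit errors. Instead of a single parity bit for each 8 bits of data, ECC generates a 7-bit code for each 64 bits of data, typically using a binary Hamming-style code. Many ECC modules store an eighth check bit per 64 bits of data, which is why they are 72 bits wide, so that double-bit errors can be detected as well.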
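When the 64 bits of data are read by the system, a second 7-bit code is generated and compared with the original 7-bit code. If the two codes match, the data is free of errors. If they differ, the pattern of the mismatch identifies which bit is in error, and that bit is flipped back to its correct value.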
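The generate-and-compare step can be sketched in a few lines. This is a minimal illustration that assumes a plain binary Hamming layout, which is one way to build a 7-bit code over 64 data bits. Real memory controllers implement their own codes in hardware. The names `DATA_POSITIONS`, `check_code` and `read_word` are hypothetical.

```python
# Codeword positions are numbered 1..71. The seven powers of two are reserved
# for check bits; the remaining 64 positions carry the data bits.
DATA_POSITIONS = [p for p in range(1, 72) if p & (p - 1) != 0]


def check_code(data: int) -> int:
    """7-bit code: XOR of the codeword positions of every data bit set to 1."""
    code = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (data >> i) & 1:
            code ^= pos
    return code


def read_word(data: int, stored_code: int):
    """Regenerate the code on read, compare it, and repair a single flipped bit."""
    syndrome = check_code(data) ^ stored_code
    if syndrome == 0:
        return data, "ok"                        # the two codes match
    if syndrome in DATA_POSITIONS:
        bit = DATA_POSITIONS.index(syndrome)     # syndrome names the bad position
        return data ^ (1 << bit), "corrected data bit"
    if syndrome & (syndrome - 1) == 0:
        return data, "corrected check bit"       # a stored check bit flipped
    return data, "uncorrectable"                 # no valid position: multi-bit error


word = 0x0123456789ABCDEF
code = check_code(word)                          # written alongside the data

flipped = word ^ (1 << 37)                       # simulate a single-bit error
repaired, status = read_word(flipped, code)      # repaired == word
```

With only 7 check bits, some double-bit errors produce a syndrome that looks like a valid single-bit position and would be "corrected" wrongly. An additional overall parity bit lets the hardware tell the two cases apart.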
Advantages & Disadvantages:
ECC memory usually costs more than non-ECC memory, both because of the additional hardware required to produce ECC memory modules and because of the lower production volumes of ECC memory and associated system hardware. Motherboards, chipsets and processors that support ECC may also be more expensive.
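Error checking can also add a small delay to memory accesses on some systems. However, modern systems integrate ECC checking into the CPU, so it adds no delay to memory accesses as long as no errors are detected.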
Ultimately, there is a trade-off between protection against rare data loss and higher cost.
Importance in Servers:
Memory errors can also lead to data transcription errors, in which a number is changed or a decimal point is misplaced. In this scenario, you may not even know that an error has occurred; it could be days or weeks before the affected transaction is next reviewed. Security vulnerabilities, transcription errors, corrupted information, lost data, and downtime caused by system crashes are all complications that ECC memory can minimize or even eliminate.