Build servers offline due to failed SSD

Daniel J. Luke dluke at geeklair.net
Mon Mar 8 04:12:58 UTC 2021


On Mar 7, 2021, at 8:30 PM, Todd Doucet <ttd at lambentresearch.com> wrote:
> I think one can only get so far with purely qualitative analysis of the characteristics of SSDs and HDs and then the end of that analysis will be one-size-fits-all advice, for example "recommended" or "not recommended" for servers.

this +1000

> Surely the answer might vary depending on the particular server usage pattern, the need for performance, the cost of routine maintenance (swapping out aging drives or SSDs), the cost of the devices themselves, etc.

exactly

There's a reason you don't really see 15k enterprise drives anymore.

> It seems to me that a given server operator can tell how long a particular SSD is likely to last.  They do not fail randomly, at least not very much.  They fail when they are "used up" and you can figure out well in advance, usually, when you will need to swap the old ones out of service.

Back in 2015, there was this article, https://techreport.com/review/27909/the-ssd-endurance-experiment-theyre-all-dead/, where someone actually bothered to test and report some results.

> HDs fail also, obviously, but tend not to be so predictable about it.  Whether it makes sense for a given server to use an SSD really does depend on the numbers.  All drives will fail.  All drives will need to be rotated out of service.  It is a matter of cost, convenience, and performance.
> 
> The only caveat I can think of is that there might be an issue of malicious use--a server with SSDs might be vulnerable to a wear attack, depending on the server services offered, I suppose.

I'm sure there are worst-case scenarios for spinning disks that (in theory) could be exploited to wear their mechanisms out as well.

I've personally used both enterprise and consumer SSDs in high-write environments where the cost of replacing the SSDs was worthwhile for the performance benefits (or otherwise didn't change the overall cost of the solution) - and I've been pleasantly surprised with how much more use I've gotten from them than I originally calculated (based on the drive specs + the planned utilization + overprovisioning).
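
As a rough illustration of the kind of back-of-the-envelope estimate mentioned above, here is a minimal Python sketch. It is not the calculation actually used for those deployments. It assumes the drive's spec sheet gives an endurance rating in terabytes written (TBW) and that the planned utilization can be expressed as average host writes per day. The function name, its parameters, and the optional `safety_factor` are all hypothetical, and the sketch ignores overprovisioning and write amplification, which can shift the real figure in either direction.

```python
def estimated_lifetime_years(rated_tbw, host_writes_gb_per_day, safety_factor=1.0):
    """Estimate how many years until a drive's rated write endurance is used up.

    rated_tbw:              endurance rating from the spec sheet, in TB written
    host_writes_gb_per_day: planned average write volume, in GB per day
    safety_factor:          hypothetical margin; values > 1 shorten the estimate
    """
    if rated_tbw <= 0 or host_writes_gb_per_day <= 0 or safety_factor <= 0:
        raise ValueError("all inputs must be positive")

    # Convert the daily write volume to TB (decimal units, as on spec sheets).
    writes_tb_per_day = host_writes_gb_per_day / 1000.0

    # Days until the rated endurance is consumed, then converted to years.
    days = rated_tbw / (writes_tb_per_day * safety_factor)
    return days / 365.0


if __name__ == "__main__":
    # Made-up example numbers: a 600 TBW drive written at 200 GB per day.
    print(f"{estimated_lifetime_years(600, 200):.1f} years")
```

With those made-up inputs the estimate comes out at a bit over eight years. An estimate like this is only a planning number to compare against what the drives actually deliver in service.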

YMMV of course - but the blanket "you shouldn't use SSDs for servers" or "no one uses SSDs for servers" is wrong. For those who are interested in more details, there are a bunch of good USENIX and ACM papers where people have actually gone and collected data on real-world failure rates.

-- 
Daniel J. Luke
