cross-posted from: https://slrpnk.net/post/21031468
SSDs can only tolerate a certain number of writes to each block. And the number is low. I have a 64 GB SSD that went into a permanent read-only mode. 64 GB is still a very useful capacity today. Thus the usefulness is cut short by hardware design deficiencies.
Contrast that with magnetic hard drives, which often live beyond the usefulness of their capacity. That is, people toss out working 80 MB mechanical drives now because they’re too small to justify the physical space they occupy, not because of premature failure ending the device’s useful life.
Nannying
When an SSD crosses a line whereby the manufacturer considers it unreliable, it goes into a read-only mode which (I believe) is passworded with a key that is not disclosed to consumers. The read-only mode is reasonable as it protects users from data loss. But the problem is the nannying that denies “owners” ultimate control over their own property.
When I try to run dd if=/dev/zero of=/dev/mydrive, dd is lied to: it will write zeros all day and report success, but dd’s instructions are merely ignored and have no effect.
The best fix in that scenario would generally be to tell the drive to erase itself using a special ATA command, like this:
$ hdparm --security-erase $'\0' /dev/sdb
security_password: ""
/dev/sdb:
 Issuing SECURITY_ERASE command, password="", user=user
SG_IO: bad/missing sense data, sb[]: 70 00 01 00 00 00 00 0a 00 00 00 00 00 1d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
SG_IO: bad/missing sense data, sb[]: 70 00 0b 00 00 00 00 0a 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Not sure why my null char got converted to a yen symbol, but as you can see the ATA instruction is blocked.
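One way to confirm the earlier observation (dd reports success but nothing changes) is to read the data back and compare it against zeros. The following is only a minimal sketch: the function name is_zeroed is made up for illustration, and it assumes a cmp that supports the -n byte limit (GNU cmp does).

```shell
#!/bin/sh
# is_zeroed TARGET BYTES   (hypothetical helper, not part of dd or hdparm)
# Returns 0 if the first BYTES bytes of TARGET read back as zeros,
# non-zero otherwise. TARGET may be a plain file or a block device
# such as /dev/sdb (reading a device usually needs root).
is_zeroed() {
    target=$1
    bytes=$2
    # -n limits the comparison to BYTES bytes; -s suppresses output so
    # only the exit status carries the result.
    cmp -s -n "$bytes" "$target" /dev/zero
}
```

Something like `is_zeroed /dev/sdb 1048576 && echo zeroed || echo "writes were ignored"` would then show whether the first MiB really changed. On a real device the kernel page cache can hand back the data that was just written, so the check is more trustworthy after replugging the drive or rebooting.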
Here is a take from someone who endorses the nannying. The problem is that there is a presumption on how the drive will be used. Give me a special switch like:
$ hdparm --security-erase $'\0' --I-know-what-I-am-doing-please-let-me-shoot-myself-in-the-foot /dev/sdb
and this is what I would do:
$ dd if=KNOPPIX_V8.2-2018-05-10-EN.iso of=/dev/foo
$ hdparm --make-read-only /dev/foo
When the drive crosses whatever arbitrary line of reliability, it’s of course perfectly reasonable to do one last write operation to control what content is used in read-only mode.
5 years later when a different live distro is needed, it would of course be reasonable to repeat the process. One write every ~5 years would at least keep the hardware somewhat useful in the long term.
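If such a last write were possible, it would make sense to verify it before relying on the drive as a read-only live medium. A rough sketch of that check follows, assuming a cmp with the -n option (GNU cmp); image_matches is a hypothetical helper name, and /dev/foo is the placeholder device from the commands above.

```shell
#!/bin/sh
# image_matches IMAGE TARGET   (hypothetical helper)
# Returns 0 if TARGET begins with an exact copy of IMAGE.
image_matches() {
    image=$1
    target=$2
    # Size of the image in bytes; tr strips the padding some wc
    # implementations put in front of the number.
    size=$(wc -c < "$image" | tr -d ' ')
    # Compare only the first $size bytes, because the drive is
    # normally larger than the image written to it.
    cmp -s -n "$size" "$image" "$target"
}
```

Usage would look like `image_matches KNOPPIX_V8.2-2018-05-10-EN.iso /dev/foo || echo "write did not stick"`.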
Luckily I quoted you, which shows that you have defined “repair” so narrowly as to exclude taking actions to restore a product to put back into service.
I never said it would. But more importantly, this is a red herring. I don’t accept your claim that it wouldn’t, but it’s a moot point because this is not the sort of repair I would do and it’s not likely worthwhile. The anti-repair tactic that I condemn is the one that blocks owners from hacks that make the device more useful than the read-only state.
(emphasis mine) This is the nannying I am calling out. If someone can make a degraded product useful again, it’s neither your place nor the manufacturer’s place to tell advanced users/repairers not to – to dictate what is appropriate.
It’s over-compliant. Also, we don’t give a shit about JEDEC standards after the drive is garbage. The standards are only useful during the useful life of the product. From your own source:
I need a couple weeks tops to transfer my data. It’s good that we get a year. Then what? The drive is as useful as a brick. And needlessly so.
That’s because you’re not making the distinction between reading and writing, and understanding that it’s writing that fails. The fitness to write to a NAND declines gradually with each cycle. Every transistor is different. A transistor might last 11,943 cycles and it sits next to a transistor that lasts 10,392 cycles. They drew a line and said “10k writes is safe for this tech, so draw a line there and go into read-only mode when an arbitrary number of transistors have likely undergone 10k writes”.
The telemetry on the device is not sophisticated enough to track exactly when a transistor’s state becomes ambiguous. So the best they could do is keep an average cycle count and build in a large safety margin for error. So of course it would be an insignificant risk to do 1 (or 5) more write cycles. Even if the straw that breaks the camel’s back is the 1 additional write operation on a particular sector, we have software that is sophisticated enough to correct it. Have a look at par2.
It’s not “against” the spec because the spec does not specify how we may use the drive. Rightfully so. The spec says the drive must remain readable for 1 year after crossing a threshold (which BTW is determined by write cycle counts, not actual ability to store electrons).
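For anyone unfamiliar with the par2 tool mentioned above, here is a rough sketch of how it adds user-chosen redundancy to a file and later detects and repairs damage. It assumes par2cmdline is installed; the helper names make_recovery and verify_or_repair are invented for illustration, and the 10% default is an arbitrary choice.

```shell
#!/bin/sh
# make_recovery FILE [PERCENT]   (hypothetical helper)
# Creates PAR2 recovery files next to FILE with PERCENT redundancy.
make_recovery() {
    file=$1
    percent=${2:-10}   # assumed default; pick whatever margin you want
    # Work inside the file's directory so par2 sees plain file names.
    ( cd "$(dirname "$file")" &&
      par2 create -q -r"$percent" "$(basename "$file").par2" "$(basename "$file")" )
}

# verify_or_repair FILE   (hypothetical helper)
# Verifies FILE against its recovery data and attempts a repair
# only if verification reports damage.
verify_or_repair() {
    file=$1
    ( cd "$(dirname "$file")" &&
      { par2 verify -q "$(basename "$file").par2" ||
        par2 repair -q "$(basename "$file").par2"; } )
}
```

The redundancy percentage is the knob: more recovery data tolerates more damaged blocks, at the cost of space on the drive.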
Bricking by design is a bad idea because preventable e-waste and consumerism is harmful to the environment. I write this post from a 2008 laptop that novice consumers would have declared useless 10 years ago.
Of course it’s a right to repair issue because it’s a nannying anti-repair tactic that has prematurely forced a functional product into uselessness. I am being artificially blocked from returning the product into useful service.
Me providing an example of a repair is not me claiming it is the only method of repair.
Except, again, you aren’t making it useful again, you’re attempting to bypass a fail safe put in place by engineers. You aren’t repairing anything to make useful again, you aren’t fixing any part of the SSD. You’re merely attempting to bypass a “lockout”. You aren’t arguing to repair the drive; you’re arguing to keep using after this point (which is fine, even if I disagree with it).
The first paragraph quoted (and the article as a whole) covers reads, the differences between drives (including different specs for enterprise vs consumer) and how the values are drawn. 10k is for Intel 50nm MLC NAND specifically. Other values are presented in the article. It isn’t arbitrary, as you’ve attempted to hand-wave it as. I suggest you read it in its entirety. It doesn’t matter how sophisticated the software standard is, the oxide on the drive will eventually wear down and is a physical problem.
Except it isn’t useful service. I would have a hard time buying that a pre-fail drive, even second hand, is useful for service. I get what you’re going for/saying but again it doesn’t pass for right to repair imo. It’s risking data loss to wring an extra 12 months (or likely, less) from a dying drive. For every 1 person like you that it’s an annoyance for, it saves multitudes more that are less savvy from pointlessly risking data loss.
Luckily I quoted you, which shows that you have defined “repair” so narrowly as to exclude taking actions to restore a product to put back into service.
Of course it’s useful again. To claim writing to a drive is not useful is to misunderstand how storage devices are useful.
No I’m not. The fail safe should remain. That much was well done by engineers and I would be outraged if it were not in place. I WANT my drive to go into read-only mode when it crosses a reliability threshold. The contention is what happens after the fail safe – after recovering the data. No one here believes the drive should not fail safe.
Yes I read that. And? It’s immaterial to the discussion whether it’s an enterprise or consumer grade. Enterprise hardware still lands in the hands of consumers at 2nd-hand markets.
And? Why do you think this is relevant to the nannying anti-repair discussion? It doesn’t obviate anything I have said. It’s just a red herring.
Yes it is. Read your own source. They are counting write cycles to get an approximation of wear, not counting electrons that stick.
This supports what I have said. Extreme precision is not needed when we have software that gives redundancy to a user-specified extent and precisely detects errors.
Denying owners control over their own property such that they cannot put it back into service is an assault on repair. Opposing the nannying is to advocate for a right to repair.
You’re not grasping how the tech works. The 12 months is powered off state maintenance for reading. Again, you’re missing the reading and writing roles here. I’m not going to explain it again. Read your own source again.
This is a false dichotomy. It’s possible to protect the low tech novices without compromising experts from retaining control over their own product. This false dichotomy manifests from your erroneous belief that the fail safe contradicts an ability to reverse the safety switch after it triggers.
Yes, that would be a compelling point did I not, twice, tell you your interpretation of my quote is incorrect and go on to clarify it as an example. I think this makes your intentions clear enough that it isn’t worth continuing wasting time on. All I’ll say is I’m glad you have nothing to do with making the specifications for this sort of hardware and that it’s left to competent and educated engineers. Assault on repair, good lord lol.
Your words, quoted here again as proof that you have defined “repair” so narrowly as to exclude taking actions to restore a product to put back into service:
What is your mother tongue that is so far from English?
You are really lost here. We actually agreed on the engineering decision (which was the decision to have a fail safe trigger). Again, the point of contention is the management decision to block property owners from control over their own property after they recover their data – the management decision that forces useful hardware to be needlessly committed to e-waste after the data has been migrated. That you think the profit-driven management decision of a private enterprise is “engineering” makes you profoundly incompetent for involvement in engineering specs. But you might be able to do marketing or management at a company like Microsoft. Shareholders would at least love your corporate boot-licking posture and your propaganda rhetoric in framing management decisions as “engineering”.
But plz, stay away from specs. Proper specs favor the consumers/users and community. They are not optimized to exploit consumers to enrich corporate suppliers and generate landfill.