   Björn Lundin writes:   
   > On 2024-08-03 03:27, Lawrence D'Oliveiro wrote:   
   >> On Fri, 2 Aug 2024 15:12:57 +0100, The Natural Philosopher wrote:   
   >>   
   >>> SSDS/NVM have their own internal caching.   
   >> True for all drives, unfortunately.   
   >> It’s bloody stupid, because the drive caching is on the wrong side of the
   >> drive interface. Better to leave it to the OS, which can use main RAM for
   >> its filesystem caching, on the fast side of that drive interface.
   >> When a drive says to the OS driver “write is done”, it should mean “write
   >> has gone to actual persistent storage”, not “write is in my cache”.
   >   
   > I think they caused grief in the postgres mail lists some 15-20 years ago.
   > They were called 'lying IDE-disks' and were not popular in that crowd.   
      
   ‘Lying disks’ are those that either disregard flush operations, or lie   
   about whether they have a write-back cache at all. That is certainly a   
   stupid outcome, though as a response to operating systems or   
   applications that are over-eager to flush, you can see why there’d be   
   pressure from marketing to do it, and acceptance from market segments   
   that either don’t value data integrity or alternatively assume that the   
   storage is unreliable and address the issue some other way.   
      
   I think what Lawrence is complaining about is the default behavior,
   even for a non-lying disk: when a SATA device returns a response to the
   host, this may indicate only that the data has been transferred to an
   internal write-back cache, not that it has reached the underlying
   medium.
      
   But that’s just the normal engineering response to high physical IO   
   latency.   
      
   Recall that neither traditional hard disks nor SSDs have a 1:1
   mapping between the logical block reads/writes requested by the host
   and the physical operations on the medium. In a hard disk it takes time
   to reach the correct track, and the order of writes from the host may
   not match the track order. In an SSD multiple logical blocks are grouped
   into pages, and a page must be written in a single operation.
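
To put a rough number on the hard disk case, here is a toy model in Python. Everything in it is an assumption made for illustration: the track numbers, the starting head position, and the simple elevator-style sweep. It is not a description of what any real drive firmware does; it only shows why a drive that can hold writes in a cache and reorder them moves the head less.

```python
def seek_distance(start_track, tracks):
    """Total head travel (in tracks) when serving writes in the given order."""
    position = start_track
    total = 0
    for track in tracks:
        total += abs(track - position)
        position = track
    return total


def reorder_for_one_sweep(start_track, tracks):
    """Hypothetical policy: sweep upward from the head, then come back down."""
    upward = sorted(t for t in tracks if t >= start_track)
    downward = sorted((t for t in tracks if t < start_track), reverse=True)
    return upward + downward


# Made-up example: the host issues writes that bounce across the platter.
head = 50
host_order = [90, 10, 80, 20, 70]
cached_order = reorder_for_one_sweep(head, host_order)

print(seek_distance(head, host_order))    # 300 tracks of travel
print(cached_order)                       # [70, 80, 90, 20, 10]
print(seek_distance(head, cached_order))  # 120 tracks of travel
```

Without a cache the drive would have to complete each write before acknowledging it, so it would be stuck with the host's order.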
      
   The same logic turns up elsewhere. The write() syscall completing   
   normally only indicates that data has been transferred to the operating   
   system’s RAM cache, not to your SSD or hard disk (and certainly not to a   
   remote disk, when using a network filesystem). A memory write   
   instruction on a modern CPU only transfers a value to an internal write   
   buffer; the data may only reach the external DRAM hundreds of cycles   
   later, or not at all if the same location is written again soon.   
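
As a minimal sketch of the syscall-level case, the Python below writes a file and then calls os.fsync(), which wraps fsync(2) on POSIX systems. The helper name write_durably is made up for this example. Note the hedge: fsync only asks the kernel to flush the file to the device, and whether the device then commits it to the medium is exactly the "lying disk" question above.

```python
import os


def write_durably(path: str, data: bytes) -> int:
    """Write data to path, then ask the kernel to flush it to the device.

    Hypothetical helper for illustration; returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < len(data):
            # A successful os.write() only means the kernel has accepted the
            # bytes, normally into its RAM cache. Nothing is promised about
            # the disk at this point.
            written += os.write(fd, data[written:])
        # os.fsync() blocks until the kernel reports the file's data as
        # flushed to the storage device.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

For a newly created file, a careful application would typically also fsync the containing directory so that the directory entry itself is persisted; that step is omitted here to keep the sketch short.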
      
   The alternative is absurdly slow write IO. Usually there is some
   combination of flush operations, synchronous IO modes, barrier
   operations, etc, to allow data integrity requirements to be met without   
   sacrificing performance globally (and this is true both at the Linux   
   syscall layer and in the SATA protocol ... provided of course your disk   
   does not lie).   
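
One way to get a synchronous IO mode at the syscall layer is to open a single file with O_DSYNC, so that only writes to that file wait for the device while everything else stays cached. The sketch below assumes a POSIX-like system; the helper name append_sync is invented for the example, and the fallback logic is there because not every platform defines O_DSYNC in Python's os module.

```python
import os

# O_DSYNC is not defined on every platform. Fall back to O_SYNC, and if
# neither exists use no flag and call fsync() explicitly instead.
SYNC_FLAG = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))


def append_sync(path: str, record: bytes) -> None:
    """Append record to path using synchronous writes (hypothetical helper)."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | SYNC_FLAG
    fd = os.open(path, flags, 0o644)
    try:
        view = memoryview(record)
        while len(view) > 0:
            # With a sync flag set, each write returns only after the kernel
            # reports the data as written to the device.
            n = os.write(fd, view)
            view = view[n:]
        if SYNC_FLAG == 0:
            os.fsync(fd)
    finally:
        os.close(fd)
```

This is the kind of trade the paragraph above describes: a database log can pay the latency on its own file descriptor, while the rest of the system keeps the benefit of write-back caching.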
      
   --   
   https://www.greenend.org.uk/rjk/   
      