
Just a sample of the Echomail archive

Cooperative anarchy at its finest, still active today. Darkrealms is the Zone 1 Hub.

   GOLDED      GoldED Public Release discussion.      2,690 messages   


   Message 2,231 of 2,690   
   Michiel van der Vlist to Nicholas Boel   
   Need volunteers to test another patch   
   03 Mar 24 16:45:34   
   
   TID: FMail-W32 2.2.0.0   
   RFC-X-No-Archive: Yes   
   TZUTC: 0100   
   CHRS: UTF-8 4   
   MSGID: 2:280/5555 65e49e05   
   REPLY: 1:154/10 65e48d3c   
   Hello Nicholas,   
      
   On Sunday March 03 2024 08:46, you wrote to Vitaliy Aksyonov:   
      
    NB> As for the pseudo-graphics wrapped to the next line, I have a   
    NB> (probably dumb) question about this: If the pseudo graphics were   
    NB> originally cp437 (single byte) and translated to utf-8, once they are   
    NB> translated are they now multiple bytes per character?   
      
   I prefer dumb questions, they are easier to answer... ;-)   
      
   Yes, they are translated to multi-byte characters: two bytes for most of the   
   accented Latin and Cyrillic letters used in Fidonet, three bytes for the CP437   
   pseudo-graphics. Only the ASCII characters (0-127) are not translated and so   
   remain one byte.   
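      
   A minimal sketch of the byte counts, assuming Python's built-in "cp437"   
   codec matches the code page in question. The sample characters and the   
   helper name utf8_length are chosen only for illustration:   

```python
# Illustrative sketch: size of a character before and after CP437 -> UTF-8.
# The sample bytes and the helper name are assumptions made for this example.
samples = {
    "letter A": b"\x41",        # ASCII, unchanged by the translation
    "e acute": b"\x82",         # U+00E9 in CP437
    "horizontal line": b"\xc4", # U+2500, a box-drawing pseudo-graphic
    "full block": b"\xdb",      # U+2588, a block pseudo-graphic
}


def utf8_length(cp437_bytes: bytes) -> int:
    """Decode CP437 bytes and return how many bytes the UTF-8 form needs."""
    return len(cp437_bytes.decode("cp437").encode("utf-8"))


for name, raw in samples.items():
    print(f"{name}: {len(raw)} byte(s) in CP437, {utf8_length(raw)} in UTF-8")
```

   A line of 79 pseudo-graphics therefore still counts 79 characters but   
   takes up 237 bytes, which matters to any code that wraps lines by   
   counting bytes rather than characters.   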
      
    NB> If "UTF-8 uses 1 to 4 bytes to encode a single character", I guess   
    NB> what I'm wondering is if the character was 1 byte to begin with, why   
    NB> wouldn't it stay 1 byte when translated to utf-8? Or is it because   
    NB> those _specific_ characters when in utf-8 are already multiple bytes?   
      
   A non-ASCII character cannot be translated to one byte for the simple reason   
   that the remaining 128 byte values with the highest bit set are not enough to   
   encode ALL the characters in ALL the single-byte character sets. The whole idea   
   of Unicode is to encode ALL the characters of ALL those character sets (CP437,   
   CP850, CP866, CP1250, etc.) into ONE encoding scheme. One byte is just not   
   enough for all.   
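      
   The counting argument can be checked with a short sketch, assuming   
   Python's built-in codecs for these four code pages are a fair stand-in   
   for the character sets named above. The variable names are illustrative:   

```python
# Illustrative sketch: collect every character found in the upper half
# (byte values 0x80-0xFF) of four single-byte code pages and count them.
pages = ["cp437", "cp850", "cp866", "cp1250"]

distinct = set()
for page in pages:
    for value in range(0x80, 0x100):
        # errors="ignore" yields "" for byte values a code page leaves undefined
        ch = bytes([value]).decode(page, errors="ignore")
        if ch:
            distinct.add(ch)

print(f"{len(distinct)} distinct characters, but only 128 byte values available")
```

   Four code pages alone already need more than 128 distinct characters,   
   and Unicode covers far more than four.   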
      
   To put it simply: if you want to encode both CP437 and CP866, you could put a   
   CP437 OR a CP866 character in the first byte, but you then need at least one   
   more bit of information to say which one it is: CP437 or CP866. That is not   
   exactly how UTF-8 works, but it should give you an idea of why just one byte   
   cannot be enough.   
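      
   As a small illustration of that ambiguity, here is the same byte read   
   through both code pages, again assuming Python's built-in "cp437" and   
   "cp866" codecs. The helper name describe is made up for this sketch:   

```python
# Illustrative sketch: one byte value, two meanings, two Unicode code points.
raw = b"\x80"  # a single byte with the highest bit set

as_cp437 = raw.decode("cp437")  # LATIN CAPITAL LETTER C WITH CEDILLA, U+00C7
as_cp866 = raw.decode("cp866")  # CYRILLIC CAPITAL LETTER A, U+0410


def describe(ch: str) -> str:
    """Return the code point and the UTF-8 bytes of a single character."""
    return f"U+{ord(ch):04X} -> UTF-8 bytes {ch.encode('utf-8').hex(' ')}"


print("as CP437:", describe(as_cp437))
print("as CP866:", describe(as_cp866))
```

   In UTF-8 each of the two characters gets its own two-byte sequence, so   
   the "which code page is it" question no longer arises.   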
      
      
   Cheers, Michiel   
      
   --- GoldED+/W32-MSVC 1.1.5-b20170303   
    * Origin: Nieuw Schnøørd (2:280/5555)   
   SEEN-BY: 15/0 18/200 90/1 103/705 105/81 106/201 124/5016 128/260   
   SEEN-BY: 129/305 135/225 153/757 7715 154/10 30 203/0 218/700 221/0   
   SEEN-BY: 221/6 226/30 227/114 229/110 112 113 206 307 317 400 426   
   SEEN-BY: 229/428 470 664 700 240/1120 5832 266/512 280/464 5003 5555   
   SEEN-BY: 282/1038 291/111 292/854 8125 301/1 310/31 320/219 322/757   
   SEEN-BY: 341/66 234 342/200 396/45 423/120 460/16 58 256 1124 5858   
   SEEN-BY: 467/888 633/280 712/848 770/1 5019/40 5020/400 1042 5053/58   
   SEEN-BY: 5054/30 5075/35   
   PATH: 280/5555 464 460/58 229/426   
      



(c) 1994,  bbs@darkrealms.ca