
Just a sample of the Echomail archive

Cooperative anarchy at its finest, still active today. Darkrealms is the Zone 1 Hub.

   UTF-8      UTF-8 encoded messages      382 messages   


   Message 275 of 382   
   Michiel van der Vlist to Sergey Dorofeev   
   UTF-8 nodelist report   
   09 Mar 25 11:42:16   
   
   TID: FMail-W32 2.3.0.1-B20240319   
   TZUTC: 0100   
   CHRS: UTF-8 4   
   MSGID: 2:280/5555 67cd7088   
   REPLY: 2:5020/12000 4f5391fe   
   Hello Sergey,   
      
   On Friday March 07 2025 15:01, you wrote to me:   
      
    MV>> He insists on entering the 'a' and 'o' with umlaut in Säve and   
    MV>> Björn in 202/208 in Latin-1 in the normal ASCII nodelist. So in   
    MV>> the ASCII list they are replaced by question marks by MakeNl. In   
    MV>> the UTF list which in his case is just a copy of the ASCII   
    MV>> segment submitted, they appear "as submitted" and the line is   
    MV>> flagged as in error by my program.   
      
    SD> I think it is not very contradictory. If he will succeed in entering
    SD> non-ASCII chars in nodelist (making it full 8-bit), encoding must be   
    SD> defined.   
      
   The encoding for the regular nodelist IS defined: ASCII and ASCII only. For
   backward compatibility it must stay that way. There may still be nodelist
   processing software around that breaks when the highest bit is not zero. That
   is why MakeNl (without the ALLOW8BIT setting) substitutes a question mark for
   characters with the highest bit set.
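
   As a rough illustration of that substitution (a sketch only, not MakeNl's
   actual code; the function name is hypothetical), every byte with the
   highest bit set can be replaced by a question mark like this:

```python
def to_ascii_nodelist_line(raw: bytes) -> bytes:
    """Return raw with every byte >= 0x80 replaced by b'?' (0x3F).

    Assumption: this mimics the effect described in the message,
    not the real MakeNl implementation.
    """
    return bytes(b if b < 0x80 else 0x3F for b in raw)


# "Säve" submitted in Latin-1 carries one byte with the high bit set (0xE4).
line = "Säve".encode("latin-1")
print(to_ascii_nodelist_line(line))  # b'S?ve'
```

   Pure ASCII input passes through unchanged, so 7-bit-only software never
   sees a byte it cannot handle.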
      
   The encoding for the UTF nodelist is also defined: UTF-8.   
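
   A Latin-1 line copied unchanged into the UTF list is usually not valid
   UTF-8, which is one way such a line can be detected. The following is a
   minimal sketch of that kind of check; it is not the program mentioned
   above, and the function name is made up for the example:

```python
def find_invalid_utf8_lines(data: bytes) -> list[int]:
    """Return the 1-based numbers of lines that do not decode as UTF-8."""
    bad = []
    for number, line in enumerate(data.splitlines(), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Line 1 is proper UTF-8; line 2 is the same kind of name in Latin-1.
sample = "Björn\n".encode("utf-8") + "Säve\n".encode("latin-1")
print(find_invalid_utf8_lines(sample))  # [2]
```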
      
    SD>  Ok, if it will be latin-1, but let it be only for European   
    SD> segments. That is, lets define encoding on per-region or even   
    SD> per-network basis.   
      
   Very bad idea. Having more than one encoding within the same file is a bad   
   idea anyway, not just for the nodelist but for ANY text file.   
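
   To see why, consider what a reader without encoding information gets from
   a single 8-bit byte. The sketch below decodes the same bytes under a few
   codepages; the list of codepages is only an illustrative assumption about
   what might turn up in different regions:

```python
RAW = b"S\xe4ve"  # "Säve" as stored in Latin-1

# Hypothetical selection of 8-bit encodings, chosen for illustration only.
CANDIDATES = ["latin-1", "cp866", "cp1251", "koi8_r"]


def decode_all(raw: bytes, encodings: list[str]) -> dict[str, str]:
    """Decode the same bytes under each encoding and collect the results."""
    return {name: raw.decode(name) for name in encodings}


for name, text in decode_all(RAW, CANDIDATES).items():
    print(f"{name:8} {text}")
```

   Each codepage yields a different string from identical bytes, whereas the
   UTF-8 form of the name is unambiguous.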
      
    SD> So when importing nodelist, it must be split back on segments and   
    SD> correctly transcoded. E.g. default encoding is ASCII, so Zone records
    SD> must be ASCII. But zone may specify own encoding, so regions in it may   
    SD> use it in own record, and define encoding for underlying regions.   
    SD> Further, region record use zone encoding, and may define encoding for   
    SD> networks. Network record use region encoding and may define encoding   
    SD> for node records.   
      
   Are you serious? You really still want every back alley in Fidonet to have its
   own 8-bit encoding? With all the forward and backward re-encoding and other
   limitations? C'mon.. That's chaos! Unicode was invented for the very purpose
   of getting rid of all this codepage shit.
      
   Why do you think Microsoft went full Unicode internally? Three decades ago.   
   Why do you think 99% of what is on the web is UTF-8? To get rid of the mess of   
   all the hundreds of 8-bit encodings that floated around!
      
   Nah, as far as the nodelist goes, it is either just ASCII or UTF-8. No more   
   codepage shit.   
      
      
   Cheers, Michiel   
      
   --- GoldED+/W32-MSVC 1.1.5-b20170303   
    * Origin: Nieuw Schnøørd (2:280/5555)   
   SEEN-BY: 4/0 90/0 105/81 106/201 128/187 153/7715 154/10 110 203/0   
   SEEN-BY: 218/700 221/6 226/30 227/114 229/110 114 317 426 428 470   
   SEEN-BY: 229/700 705 240/5832 280/464 5555 291/111 292/789 301/1 310/31   
   SEEN-BY: 320/219 341/66 234 460/58 900/0 902/0 26 905/0 5019/40 5020/1042   
   SEEN-BY: 5075/35   
   PATH: 280/5555 341/66 902/26 229/426   
      



(c) 1994,  bbs@darkrealms.ca