
Just a sample of the Echomail archive

Cooperative anarchy at its finest, still active today. Darkrealms is the Zone 1 Hub.

   LINUX      Torvalds farts & fans know what he ate      8,232 messages   


   Message 5,921 of 8,232   
   Holger Granholm to Maurice Kinal   
   Re: Character codes   
   10 Mar 19 16:15:00   
   
   MSGID: 2:20/228 02ba6b4b   
   In a message on 03-05-19 Maurice Kinal said to Holger Granholm:   
      
   Hello Maurice,   
      
   Excuse the delay. I was in Stockholm, Sweden for the Boat Show.   
      
    HG> OK, the code 218 128 162 that I interpreted as hyphen actually   
    HG> is the longer 'dash'.   
      
   MK> I am not sure what you mean but using 218 (DA) as the leading byte   
   MK> means you are restricted to a 2 byte or 16 bit character and not a   
   MK> 24 bit character that is required for euro sign in utf8.  The way   
   MK> the leading byte works is like this;   
      
   I understand, but this is how the UTF codes are represented in PC8
   (8-bit ASCII), and I have come to the conclusion that I will at least
   try to use only the two following bytes in the translation table.   
      
   That may be all that is needed but if not, I can always include the   
   leading byte. Kind of cut and try.   
      
   MK> The first zero shows that there are two leading ones which means   
   MK> there is only one trailing byte following.   
      
   MK> So that means either 218 128 and 162 is ignored.   
      
   MK> For the utf8 euro character the prefix is;   
      
   MK> dec 226 = bin 11100010   
   MK>                  ^   
      
   According to my interpretation of how the character is presented in PC8,
   it appears as 218 130 172. All normal umlaut characters are presented with   
   only two bytes, like 195 165 for the small angstrom character that is   
   included in your "Moose" tagline.   
      
   MK> and as you can see the first zero yields three leading ones which is   
   MK> three bytes or 24 bits.   
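
The leading-byte rule quoted above (count the one bits before the first zero) can be sketched in a few lines of Python. This is only an illustration: the function name `utf8_sequence_length` is made up for this example and does not come from either poster or from any editor or tosser mentioned in the thread.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the total number of bytes announced by a UTF-8 leading byte.

    Hypothetical helper for illustration only.
    """
    if not 0 <= lead <= 0xFF:
        raise ValueError("not a byte value")
    # Count the one bits that precede the first zero bit.
    ones = 0
    mask = 0x80
    while mask and lead & mask:
        ones += 1
        mask >>= 1
    if ones == 0:
        return 1  # 0xxxxxxx: plain 7-bit ASCII, no trailing bytes
    if ones == 1 or ones > 4:
        # 10xxxxxx is a trailing (continuation) byte, not a leading byte;
        # more than four leading ones is not valid UTF-8.
        raise ValueError(f"{lead} is not a valid UTF-8 leading byte")
    return ones


for value in (218, 226, 195):
    print(value, format(value, "08b"), utf8_sequence_length(value))
```

With 218 (11011010) the count is two, so one trailing byte follows; with 226 (11100010) the count is three, so two trailing bytes follow.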
      
   I don't need more than 16-bit characters for that editor.   
   UTF characters ARE presented with two bytes in it.   
      
   MK> For the record 218 128 is U+0680 which we already know to be a 16   
   MK> bit Arabic character.   
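
The byte values discussed in this exchange can be checked by hand. The sketch below is a minimal decoder, assuming the caller passes exactly one complete sequence; `decode_code_point` is a hypothetical name, and the function does not check that the leading byte agrees with the sequence length. It masks off the length prefix of the leading byte and appends six payload bits from each trailing byte, which is why a two-byte sequence carries only 11 payload bits.

```python
def decode_code_point(seq):
    """Decode one complete UTF-8 sequence (2 to 4 bytes, or 1 ASCII byte).

    Hypothetical helper for illustration; it trusts the caller to pass
    a sequence whose length matches its leading byte.
    """
    lead, *tail = seq
    n = len(seq)
    # Strip the length prefix from the leading byte:
    # 110xxxxx keeps 5 bits, 1110xxxx keeps 4 bits, 11110xxx keeps 3 bits.
    code_point = lead & (0xFF >> (n + 1)) if n > 1 else lead
    for byte in tail:
        if byte & 0xC0 != 0x80:
            raise ValueError(f"{byte} is not a trailing byte (10xxxxxx)")
        code_point = (code_point << 6) | (byte & 0x3F)
    return code_point


for seq in ([218, 128], [226, 130, 172], [195, 165]):
    cp = decode_code_point(seq)
    print(seq, f"U+{cp:04X}", bytes(seq).decode("utf-8"))

# Python's built-in strict decoder rejects 218 128 162 as one character,
# because 162 is a trailing byte with no leading byte of its own.
try:
    bytes([218, 128, 162]).decode("utf-8")
except UnicodeDecodeError as exc:
    print("218 128 162 is not valid UTF-8:", exc.reason)
```

Here 218 128 gives U+0680, 226 130 172 gives U+20AC (the euro sign), and 195 165 gives U+00E5 (the small a with ring).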
      
   That third byte (first 218 or 226) comes only as a prefix for other   
   characters.   
      
   MK> Thank you.  Buenas noches mi amigo.  :-)   
      
   Gracias mi amigo.   
      
      
   Have a good night,   
      
   Holger   
      
      
   .. Computers always win because they have inside information ;o)   
   -- MR/2 2.30   
      
   --- PCBoard (R) v15.22 (OS/2) 2   
    * Origin: Coming to you from the Sunny Aland Islands. (2:20/228)   
   SEEN-BY: 15/2 20/228 123/1970 154/10 201/0 111 120 121 420 203/0 124   
   SEEN-BY: 203/412 211/37 221/1 6 360 226/17 229/107 275 426 452 616   
   SEEN-BY: 229/1014 230/0 240/5832 249/206 317 400 280/464 5003 292/789   
   SEEN-BY: 292/854 8125 301/520 317/3 320/219 322/757 335/364 342/200   
   SEEN-BY: 393/68 410/9 423/81 633/280 3828/7   
   PATH: 20/228 201/111 0 203/0 221/1 292/854 229/426   
      



(c) 1994,  bbs@darkrealms.ca