
Just a sample of the Echomail archive

Cooperative anarchy at its finest, still active today. Darkrealms is the Zone 1 Hub.

   GOLDED      GoldED Public Release discussion.      2,690 messages   


   Message 1,913 of 2,690   
   Scott Street to All   
   CharacterSet Translation   
   11 Feb 22 13:18:22   
   
   MSGID: 1:266/420.1 6206a870   
   PID: GED+OSX 1.1.5-b20180707   
   CHRS: UTF-8 2   
   TZUTC: -0500   
   TID: hpt/mac 1.9 2022-02-08   
   Hello Everyone!   
      
   After much tinkering, I've been unable to get character set translation
   working 100%.  The biggest issue is CP866 -> UTF-8.  It seems that I can't
   get GoldED+ to actually do the translation.
      
   The bits from my golded.cfg  [which I've tried on Linux and macOS]   
   -paste-   
   XLATPATH        /fido/etc/golded/   
   XLATLOCALSET UTF-8   
   XLATCHARSETALIAS UTF-8 UTF8   
   XLATCHARSET CP1125  UTF-8        1125_u8.chs   
   XLATCHARSET CP437   UTF-8        437_u8.chs   
   XLATCHARSET CP850   UTF-8        850_u8.chs   
   XLATCHARSET CP865   UTF-8        865_u8.chs   
   XLATCHARSET CP866   UTF-8        866_u8.chs   
   XLATCHARSET LATIN-1 UTF-8        iso1_u8.chs   
   XLATCHARSET KOI8-R  UTF-8        koi8_u8.chs   
   -end-   
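
   One quick sanity check on a config like this is whether every table file
   named on an XLATCHARSET line actually exists under XLATPATH. Below is a
   minimal Python sketch of that check. The function names are made up for
   illustration, and it assumes whitespace-separated fields and that the file
   name is resolved relative to XLATPATH; neither assumption is confirmed
   against GoldED+ itself.

```python
import os


def xlat_files(cfg_text):
    """Return (source, destination, path) for each XLATCHARSET line.

    Assumption: fields are whitespace-separated and the table file name
    is relative to the directory given by XLATPATH.
    """
    xlatpath = ""
    entries = []
    for raw in cfg_text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        keyword = fields[0].upper()
        if keyword == "XLATPATH" and len(fields) >= 2:
            xlatpath = fields[1]
        elif keyword == "XLATCHARSET" and len(fields) >= 4:
            entries.append((fields[1], fields[2], fields[3]))
    # Join at the end so the position of XLATPATH in the file does not matter.
    return [(src, dst, os.path.join(xlatpath, name)) for src, dst, name in entries]


def missing_files(cfg_text):
    """List the table paths that do not exist on disk."""
    return [path for _src, _dst, path in xlat_files(cfg_text)
            if not os.path.isfile(path)]
```

   Calling missing_files() on the text of golded.cfg gives the paths that
   could not be found; an empty list only means the files exist, not that
   their contents are right.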
      
   I thought it was just the messages, so I wrote a PHP library to read JAM files,
   translate the message body text to UTF-8, and output that to the
   terminal  [the same terminal I use for GoldED+, etc etc].   My terminal
   (Apple's macOS Terminal.app) does indeed display the characters correctly;
   it just seems I can't get GoldED+ to do it as well.
      
   PHP code bits for reference:   
   -paste-   
   $xlated = mb_convert_encoding($line, "UTF-8", $msg_encoding);   
   -end-   
      
   $xlated is the body line string after mb_convert_encoding() takes the raw
   bytes ( $line ) and converts them from $msg_encoding, which is the
   message's CHRS (or CHARSET) value, mapped earlier to a PHP-native
   character set name.  See
   https://www.php.net/manual/en/function.mb-convert-encoding.php
   for more info on the PHP function.
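
   The lookup-and-convert step described above can be sketched standalone in
   Python (chosen only because it is easy to run by itself; this is not the
   PHP library from the post). The CHRS_TO_CODEC table and both function names
   are hypothetical, and the charset identifiers are just the ones that appear
   in the golded.cfg excerpt.

```python
# Hypothetical mapping from CHRS identifiers to Python codec names.
CHRS_TO_CODEC = {
    "CP437": "cp437",
    "CP850": "cp850",
    "CP865": "cp865",
    "CP866": "cp866",
    "CP1125": "cp1125",
    "LATIN-1": "latin-1",
    "KOI8-R": "koi8_r",
    "UTF-8": "utf-8",
}


def parse_chrs(kludge_value):
    """Split a CHRS value such as 'CP866 2' into (identifier, level)."""
    parts = kludge_value.split()
    name = parts[0].upper()
    level = int(parts[1]) if len(parts) > 1 else None
    return name, level


def to_utf8(raw_line, kludge_value):
    """Decode raw body bytes with the charset named in CHRS; return UTF-8 bytes."""
    name, _level = parse_chrs(kludge_value)
    codec = CHRS_TO_CODEC[name]  # raises KeyError for an unknown charset
    text = raw_line.decode(codec, errors="replace")
    return text.encode("utf-8")


if __name__ == "__main__":
    sample = bytes([0x8F, 0xE0, 0xA8, 0xA2, 0xA5, 0xE2])  # CP866 bytes
    print(to_utf8(sample, "CP866 2").decode("utf-8"))
```

   Looking the codec up from the kludge, rather than hard-coding CP866, keeps
   the same function usable for the other charsets listed in the config.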
      
      
   In addition: I'm using the included translation files.  The most troubling
   display is of messages from users with CP866 character sets.
   -file 866_u8.chs-   
      
    This file is a charset conversion module in text form.   
      
    Source file:   
    http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/CP866.TXT   
      
   100000          ; ID number (when >65535, all 255 chars will be translated)   
   0               ; version number   
      
   2               ; level number   
      
   CP866   
   UTF-8   
      
   \0 \0                ; NULL   
   \0 \d1               ; START OF HEADING   
   \0 \d2               ; START OF TEXT   
   \0 \d3               ; END OF TEXT   
   \0 \d4               ; END OF TRANSMISSION   
   \0 \d5               ; ENQUIRY   
   \0 \d6               ; ACKNOWLEDGE   
   \0 \d7               ; BELL   
   \0 \d8               ; BACKSPACE   
   \0 \d9               ; HORIZONTAL TABULATION   
   \0 \d10              ; LINE FEED   
   \0 \d11              ; VERTICAL TABULATION   
   \0 \d12              ; FORM FEED   
   \0 \d13              ; CARRIAGE RETURN   
   \0 \d14              ; SHIFT OUT   
   \0 \d15              ; SHIFT IN   
   \0 \d16              ; DATA LINK ESCAPE   
   \0 \d17              ; DEVICE CONTROL ONE   
   \0 \d18              ; DEVICE CONTROL TWO   
   \0 \d19              ; DEVICE CONTROL THREE   
   \0 \d20              ; DEVICE CONTROL FOUR   
   \0 \d21              ; NEGATIVE ACKNOWLEDGE   
   \0 \d22              ; SYNCHRONOUS IDLE   
   \0 \d23              ; END OF TRANSMISSION BLOCK   
   \0 \d24              ; CANCEL   
   \0 \d25              ; END OF MEDIUM   
   \0 \d26              ; SUBSTITUTE   
   \0 \d27              ; ESCAPE   
   \0 \d28              ; FILE SEPARATOR   
   \0 \d29              ; GROUP SEPARATOR   
   \0 \d30              ; RECORD SEPARATOR   
   \0 \d31              ; UNIT SEPARATOR   
   \0 \d32              ; SPACE   
   \0 \d33              ; EXCLAMATION MARK   
   \0 \d34              ; QUOTATION MARK   
   \0 \d35              ; NUMBER SIGN   
   \0 \d36              ; DOLLAR SIGN   
   \0 \d37              ; PERCENT SIGN   
   \0 \d38              ; AMPERSAND   
   \0 \d39              ; APOSTROPHE   
   \0 \d40              ; LEFT PARENTHESIS   
   \0 \d41              ; RIGHT PARENTHESIS   
   \0 \d42              ; ASTERISK   
   \0 \d43              ; PLUS SIGN   
   \0 \d44              ; COMMA   
   \0 \d45              ; HYPHEN-MINUS   
   \0 \d46              ; FULL STOP   
   \0 \d47              ; SOLIDUS   
   \0 \d48              ; DIGIT ZERO   
   \0 \d49              ; DIGIT ONE   
   \0 \d50              ; DIGIT TWO   
   \0 \d51              ; DIGIT THREE   
   \0 \d52              ; DIGIT FOUR   
   \0 \d53              ; DIGIT FIVE   
   \0 \d54              ; DIGIT SIX   
   \0 \d55              ; DIGIT SEVEN   
   \0 \d56              ; DIGIT EIGHT   
   \0 \d57              ; DIGIT NINE   
   \0 \d58              ; COLON   
   \0 \d59              ; SEMICOLON   
   \0 \d60              ; LESS-THAN SIGN   
   \0 \d61              ; EQUALS SIGN   
   \0 \d62              ; GREATER-THAN SIGN   
   \0 \d63              ; QUESTION MARK   
   \0 \d64              ; COMMERCIAL AT   
   \0 \d65              ; LATIN CAPITAL LETTER A   
   \0 \d66              ; LATIN CAPITAL LETTER B   
   \0 \d67              ; LATIN CAPITAL LETTER C   
   \0 \d68              ; LATIN CAPITAL LETTER D   
   \0 \d69              ; LATIN CAPITAL LETTER E   
   \0 \d70              ; LATIN CAPITAL LETTER F   
   \0 \d71              ; LATIN CAPITAL LETTER G   
   \0 \d72              ; LATIN CAPITAL LETTER H   
   \0 \d73              ; LATIN CAPITAL LETTER I   
   \0 \d74              ; LATIN CAPITAL LETTER J   
   \0 \d75              ; LATIN CAPITAL LETTER K   
   \0 \d76              ; LATIN CAPITAL LETTER L   
   \0 \d77              ; LATIN CAPITAL LETTER M   
   \0 \d78              ; LATIN CAPITAL LETTER N   
   \0 \d79              ; LATIN CAPITAL LETTER O   
   \0 \d80              ; LATIN CAPITAL LETTER P   
   \0 \d81              ; LATIN CAPITAL LETTER Q   
   \0 \d82              ; LATIN CAPITAL LETTER R   
   \0 \d83              ; LATIN CAPITAL LETTER S   
   \0 \d84              ; LATIN CAPITAL LETTER T   
   \0 \d85              ; LATIN CAPITAL LETTER U   
   \0 \d86              ; LATIN CAPITAL LETTER V   
   \0 \d87              ; LATIN CAPITAL LETTER W   
   \0 \d88              ; LATIN CAPITAL LETTER X   
   \0 \d89              ; LATIN CAPITAL LETTER Y   
   \0 \d90              ; LATIN CAPITAL LETTER Z   
   \0 \d91              ; LEFT SQUARE BRACKET   
   \0 \d92              ; REVERSE SOLIDUS   
   \0 \d93              ; RIGHT SQUARE BRACKET   
   \0 \d94              ; CIRCUMFLEX ACCENT   
   \0 \d95              ; LOW LINE   
   \0 \d96              ; GRAVE ACCENT   
   \0 \d97              ; LATIN SMALL LETTER A   
   \0 \d98              ; LATIN SMALL LETTER B   
   \0 \d99              ; LATIN SMALL LETTER C   
   \0 \d100             ; LATIN SMALL LETTER D   
   \0 \d101             ; LATIN SMALL LETTER E   
   \0 \d102             ; LATIN SMALL LETTER F   
   \0 \d103             ; LATIN SMALL LETTER G   
   \0 \d104             ; LATIN SMALL LETTER H   
   \0 \d105             ; LATIN SMALL LETTER I   
   \0 \d106             ; LATIN SMALL LETTER J   
   \0 \d107             ; LATIN SMALL LETTER K   
   \0 \d108             ; LATIN SMALL LETTER L   
   \0 \d109             ; LATIN SMALL LETTER M   
   \0 \d110             ; LATIN SMALL LETTER N   
   \0 \d111             ; LATIN SMALL LETTER O   
   \0 \d112             ; LATIN SMALL LETTER P   
   \0 \d113             ; LATIN SMALL LETTER Q   
   \0 \d114             ; LATIN SMALL LETTER R   
   \0 \d115             ; LATIN SMALL LETTER S   
   \0 \d116             ; LATIN SMALL LETTER T   
   \0 \d117             ; LATIN SMALL LETTER U   
   \0 \d118             ; LATIN SMALL LETTER V   
   \0 \d119             ; LATIN SMALL LETTER W   
   \0 \d120             ; LATIN SMALL LETTER X   
   \0 \d121             ; LATIN SMALL LETTER Y   
   \0 \d122             ; LATIN SMALL LETTER Z   
   \0 \d123             ; LEFT CURLY BRACKET   
   \0 \d124             ; VERTICAL LINE   
   \0 \d125             ; RIGHT CURLY BRACKET   
   \0 \d126             ; TILDE   
   \0 \d127             ; DELETE   
   \d208 \d144          ; CYRILLIC CAPITAL LETTER A   
   \d208 \d145          ; CYRILLIC CAPITAL LETTER BE   
   \d208 \d146          ; CYRILLIC CAPITAL LETTER VE   
   \d208 \d147          ; CYRILLIC CAPITAL LETTER GHE   
   \d208 \d148          ; CYRILLIC CAPITAL LETTER DE   
   \d208 \d149          ; CYRILLIC CAPITAL LETTER IE   
   \d208 \d150          ; CYRILLIC CAPITAL LETTER ZHE   
   \d208 \d151          ; CYRILLIC CAPITAL LETTER ZE   
   \d208 \d152          ; CYRILLIC CAPITAL LETTER I   
   \d208 \d153          ; CYRILLIC CAPITAL LETTER SHORT I   
   \d208 \d154          ; CYRILLIC CAPITAL LETTER KA   
   \d208 \d155          ; CYRILLIC CAPITAL LETTER EL   
   \d208 \d156          ; CYRILLIC CAPITAL LETTER EM   
   \d208 \d157          ; CYRILLIC CAPITAL LETTER EN   
   \d208 \d158          ; CYRILLIC CAPITAL LETTER O   
   \d208 \d159          ; CYRILLIC CAPITAL LETTER PE   
   \d208 \d160          ; CYRILLIC CAPITAL LETTER ER   
   \d208 \d161          ; CYRILLIC CAPITAL LETTER ES   
   \d208 \d162          ; CYRILLIC CAPITAL LETTER TE   
   \d208 \d163          ; CYRILLIC CAPITAL LETTER U   
   \d208 \d164          ; CYRILLIC CAPITAL LETTER EF   
   \d208 \d165          ; CYRILLIC CAPITAL LETTER HA   
   \d208 \d166          ; CYRILLIC CAPITAL LETTER TSE   
   \d208 \d167          ; CYRILLIC CAPITAL LETTER CHE   
   \d208 \d168          ; CYRILLIC CAPITAL LETTER SHA   
   \d208 \d169          ; CYRILLIC CAPITAL LETTER SHCHA   
   \d208 \d170          ; CYRILLIC CAPITAL LETTER HARD SIGN   
   \d208 \d171          ; CYRILLIC CAPITAL LETTER YERU   
   \d208 \d172          ; CYRILLIC CAPITAL LETTER SOFT SIGN   
   \d208 \d173          ; CYRILLIC CAPITAL LETTER E   
   \d208 \d174          ; CYRILLIC CAPITAL LETTER YU   
   \d208 \d175          ; CYRILLIC CAPITAL LETTER YA   
   \d208 \d176          ; CYRILLIC SMALL LETTER A   
   \d208 \d177          ; CYRILLIC SMALL LETTER BE   
   \d208 \d178          ; CYRILLIC SMALL LETTER VE   
   \d208 \d179          ; CYRILLIC SMALL LETTER GHE   
   \d208 \d180          ; CYRILLIC SMALL LETTER DE   
   \d208 \d181          ; CYRILLIC SMALL LETTER IE   
   \d208 \d182          ; CYRILLIC SMALL LETTER ZHE   
   \d208 \d183          ; CYRILLIC SMALL LETTER ZE   
   \d208 \d184          ; CYRILLIC SMALL LETTER I   
   \d208 \d185          ; CYRILLIC SMALL LETTER SHORT I   
   \d208 \d186          ; CYRILLIC SMALL LETTER KA   
   \d208 \d187          ; CYRILLIC SMALL LETTER EL   
   \d208 \d188          ; CYRILLIC SMALL LETTER EM   
   \d208 \d189          ; CYRILLIC SMALL LETTER EN   
   \d208 \d190          ; CYRILLIC SMALL LETTER O   
   \d208 \d191          ; CYRILLIC SMALL LETTER PE   
   \d226 \d150 \d145    ; LIGHT SHADE   
   \d226 \d150 \d146    ; MEDIUM SHADE   
   \d226 \d150 \d147    ; DARK SHADE   
   \d226 \d148 \d130    ; BOX DRAWINGS LIGHT VERTICAL   
   \d226 \d148 \d164    ; BOX DRAWINGS LIGHT VERTICAL AND LEFT   
   \d226 \d149 \d161    ; BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE   
   \d226 \d149 \d162    ; BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE   
   \d226 \d149 \d150    ; BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE   
   \d226 \d149 \d149    ; BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE   
   \d226 \d149 \d163    ; BOX DRAWINGS DOUBLE VERTICAL AND LEFT   
   \d226 \d149 \d145    ; BOX DRAWINGS DOUBLE VERTICAL   
   \d226 \d149 \d151    ; BOX DRAWINGS DOUBLE DOWN AND LEFT   
   \d226 \d149 \d157    ; BOX DRAWINGS DOUBLE UP AND LEFT   
   \d226 \d149 \d144    ; BOX DRAWINGS DOUBLE HORIZONTAL   
   \d226 \d148 \d148    ; BOX DRAWINGS LIGHT UP AND RIGHT   
   \d226 \d148 \d180    ; BOX DRAWINGS LIGHT UP AND HORIZONTAL   
   \d226 \d148 \d172    ; BOX DRAWINGS LIGHT DOWN AND HORIZONTAL   
   \d226 \d148 \d156    ; BOX DRAWINGS LIGHT VERTICAL AND RIGHT   
   \d226 \d148 \d128    ; BOX DRAWINGS LIGHT HORIZONTAL   
   \d226 \d148 \d188    ; BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL   
   \d226 \d149 \d158    ; BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE   
   \d226 \d149 \d159    ; BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE   
   \d226 \d149 \d154    ; BOX DRAWINGS DOUBLE UP AND RIGHT   
   \d226 \d149 \d148    ; BOX DRAWINGS DOUBLE DOWN AND RIGHT   
   \d226 \d149 \d169    ; BOX DRAWINGS DOUBLE UP AND HORIZONTAL   
   \d226 \d149 \d166    ; BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL   
   \d226 \d149 \d160    ; BOX DRAWINGS DOUBLE VERTICAL AND RIGHT   
   \d226 \d149 \d144    ; BOX DRAWINGS DOUBLE HORIZONTAL   
   \d226 \d149 \d172    ; BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL   
   \d226 \d149 \d167    ; BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE   
   \d226 \d149 \d168    ; BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE   
   \d226 \d149 \d164    ; BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE   
   \d226 \d149 \d165    ; BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE   
   \d226 \d149 \d153    ; BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE   
   \d226 \d149 \d152    ; BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE   
   \d226 \d149 \d146    ; BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE   
   \d226 \d149 \d147    ; BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE   
   \d226 \d149 \d171    ; BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE   
   \d226 \d149 \d170    ; BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE   
   \d226 \d148 \d152    ; BOX DRAWINGS LIGHT UP AND LEFT   
   \d226 \d148 \d140    ; BOX DRAWINGS LIGHT DOWN AND RIGHT   
   \d226 \d150 \d136    ; FULL BLOCK   
   \d226 \d150 \d132    ; LOWER HALF BLOCK   
   \d226 \d150 \d140    ; LEFT HALF BLOCK   
   \d226 \d150 \d144    ; RIGHT HALF BLOCK   
   \d226 \d150 \d128    ; UPPER HALF BLOCK   
   \d209 \d128          ; CYRILLIC SMALL LETTER ER   
   \d209 \d129          ; CYRILLIC SMALL LETTER ES   
   \d209 \d130          ; CYRILLIC SMALL LETTER TE   
   \d209 \d131          ; CYRILLIC SMALL LETTER U   
   \d209 \d132          ; CYRILLIC SMALL LETTER EF   
   \d209 \d133          ; CYRILLIC SMALL LETTER HA   
   \d209 \d134          ; CYRILLIC SMALL LETTER TSE   
   \d209 \d135          ; CYRILLIC SMALL LETTER CHE   
   \d209 \d136          ; CYRILLIC SMALL LETTER SHA   
   \d209 \d137          ; CYRILLIC SMALL LETTER SHCHA   
   \d209 \d138          ; CYRILLIC SMALL LETTER HARD SIGN   
   \d209 \d139          ; CYRILLIC SMALL LETTER YERU   
   \d209 \d140          ; CYRILLIC SMALL LETTER SOFT SIGN   
   \d209 \d141          ; CYRILLIC SMALL LETTER E   
   \d209 \d142          ; CYRILLIC SMALL LETTER YU   
   \d209 \d143          ; CYRILLIC SMALL LETTER YA   
   \d208 \d129          ; CYRILLIC CAPITAL LETTER IO   
   \d209 \d145          ; CYRILLIC SMALL LETTER IO   
   \d208 \d132          ; CYRILLIC CAPITAL LETTER UKRAINIAN IE   
   \d209 \d148          ; CYRILLIC SMALL LETTER UKRAINIAN IE   
   \d208 \d135          ; CYRILLIC CAPITAL LETTER YI   
   \d209 \d151          ; CYRILLIC SMALL LETTER YI   
   \d208 \d142          ; CYRILLIC CAPITAL LETTER SHORT U   
   \d209 \d158          ; CYRILLIC SMALL LETTER SHORT U   
   \d194 \d176          ; DEGREE SIGN   
   \d226 \d136 \d153    ; BULLET OPERATOR   
   \d194 \d183          ; MIDDLE DOT   
   \d226 \d136 \d154    ; SQUARE ROOT   
   \d226 \d132 \d150    ; NUMERO SIGN   
   \d194 \d164          ; CURRENCY SIGN   
   \d226 \d150 \d160    ; BLACK SQUARE   
   \d194 \d160          ; NO-BREAK SPACE   
   END   
   -file end-   
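
   A table like this can be cross-checked against an independent CP866
   implementation. The Python sketch below parses entry lines and compares
   each one with the UTF-8 bytes Python's own codec produces. It rests on
   assumptions about the .chs text format that are read off the listing, not
   taken from GoldED+ documentation: that \dN is a decimal byte value, that
   \0 is a zero or padding byte, and that entries are listed in source-byte
   order. The function names are made up for illustration.

```python
def parse_entry(line):
    r"""Turn one table line like '\d208 \d144 ; comment' into bytes.

    Assumption: '\dN' is a decimal byte value and '\0' is padding.
    """
    data = line.split(";", 1)[0]  # drop the trailing comment
    out = bytearray()
    for token in data.split():
        if token == r"\0":
            continue  # treated as padding in this sketch
        if token.startswith(r"\d"):
            out.append(int(token[2:]))
        else:
            raise ValueError(f"unexpected token: {token!r}")
    return bytes(out)


def check_table(entry_lines, codec="cp866", first_byte=0):
    """Return (source_byte, got, expected) for every entry that differs.

    Assumption: entries are in source-byte order starting at first_byte.
    """
    mismatches = []
    for offset, line in enumerate(entry_lines):
        source = first_byte + offset
        expected = bytes([source]).decode(codec).encode("utf-8")
        got = parse_entry(line) or b"\x00"  # all-padding entry stands for NUL
        if got != expected:
            mismatches.append((source, got, expected))
    return mismatches
```

   Feeding it the lines between the header and END reports each position
   where the table and the codec disagree. Because the comparison is
   positional, a single missing or extra line shows up as a run of mismatches
   from that point on.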
      
   My primary example message is from FIDONEWS, MSGID "2:5030/1081.117 61f6e5cd".

   My PHP script correctly converts the CP866 characters into UTF-8, but GoldED+
   just makes a mess of it.
   The tagline of the message translates to "- And you would do art. Poetry,
   right?"
   and the origin (loosely) to "I advise you to rub with ant alcohol".
   The message appears to have been posted by a version of GoldED running on
   32-bit Windows, so I have to believe proper character translation can be done!
      
   Sorry for the fairly large post; I just tried to give as much information as
   possible in one shot.
      
   Any help is greatly appreciated!   
      
      
   Scott   
      
   ---   
    * Origin: -={ The Digital Post }=- (1:266/420.1)   
   SEEN-BY: 1/123 15/0 18/200 90/1 105/81 106/201 114/201 120/340 123/131   
   SEEN-BY: 124/5009 129/12 14 102 125 160 165 305 328 330 331 153/7715   
   SEEN-BY: 226/30 227/114 229/110 206 317 400 424 426 452 664 700 240/5832   
   SEEN-BY: 261/1 220 266/75 420 512 618 267/152 154 282/1038 292/854   
   SEEN-BY: 301/1 317/3 320/219 322/757 342/200 396/45 460/58 633/280   
   SEEN-BY: 712/848   
   PATH: 266/420 512 229/426   
      



(c) 1994,  bbs@darkrealms.ca