
Test t1360: Encoding string in code pages


Description
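The character "Latin Small Letter U with Diaeresis" (umlauted ü) has the Unicode code point 0x00FC, which UTF-8 encodes as the two bytes 0xC3,0xBC. €ASM treats those two bytes as one character when CODEPAGE=UTF-8 is in effect, and as two separate letters when a single-byte code page is declared. The source text of the test is in UTF-8 encoding.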
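The two-byte form mentioned above can be checked by hand. The following is a minimal Python sketch, not part of the test itself; the helper name `utf8_two_byte` is hypothetical and only covers the two-byte UTF-8 range.

```python
def utf8_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes (illustrative helper)."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("two-byte UTF-8 form covers U+0080..U+07FF only")
    lead = 0xC0 | (code_point >> 6)      # 110xxxxx carries the upper five bits
    trail = 0x80 | (code_point & 0x3F)   # 10xxxxxx carries the lower six bits
    return bytes([lead, trail])

encoded = utf8_two_byte(0x00FC)          # code point of the umlauted u

# Cross-check against Python's own UTF-8 encoder.
assert encoded == "\u00FC".encode("utf-8")
print(encoded.hex().upper())
```

For 0x00FC the upper bits give 0xC3 and the lower six bits give 0xBC, which are the two bytes seen in the DB dumps of the listing.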
Tested procedures
ExpStoreString   ExpStoreUString  
Source & expected listing t1360.htm.lst
|                                  |EUROASM LIST=ON,DUMP=ON,DUMPWIDTH=36,AUTOALIGN=ON
|                                  |t1360 PROGRAM FORMAT=BIN,WIDTH=16,MODEL=TINY,LISTMAP=OFF,LISTGLOBALS=OFF
|[BIN]
|[BIN]
|                                  | EUROASM CODEPAGE=UTF-8 ; Treat umlauted ü as one character encoded in UTF-8 as two bytes.
|0000:4DC3BC6C6C6572               | DB "Müller" ; 7 bytes of UTF-8 text were emitted to data segment.
|0007:90 ....AutoAlignment stuff.
|0008:4D00FC006C006C0065007200     | DU "Müller" ; 6 wide (16bit) characters were emitted to data segment.
|                                  |;;
|                                  | EUROASM CODEPAGE=Windows-1252 ; Incorrectly declare that the source is written in Western European code page.
|0014:                             | ; Umlauted ü (0xC3BC in UTF-8) will be treated in Windows-1252 as two characters:
|0014:                             | ; 0xC3 as "Latin Capital letter A with tilde" Ã (Unicode 0x00C3), and
|0014:                             | ; 0xBC as "Vulgar fraction one quarter" ¼ (Unicode 0x00BC).
|0014:                             | ; So the string would be displayed in Windows as MÃ¼ller, if used as output text.
|0014:4DC3BC6C6C6572               | DB "Müller" ; 7 bytes of Windows-1252 text were emitted to data segment.
|001B:90 ....AutoAlignment stuff.
|001C:4D00C300BC006C006C0065007200 | DU "Müller" ; 7 wide (16bit) characters were emitted to data segment.
|                                  |;;
|                                  | EUROASM CODEPAGE=Windows-1253 ; Incorrectly declare that the source is written in Greek code page.
|002A:                             | ; Umlauted ü (0xC3BC in UTF-8) will be treated in Windows-1253 as two characters:
|002A:                             | ; 0xC3 as "Greek Capital Letter Gamma" Γ (Unicode 0x0393), and
|002A:                             | ; 0xBC as "Greek Capital Letter Omicron with acute accent" Ό (Unicode 0x038C).
|002A:                             | ; So the string would be displayed in Windows as MΓΌller, if used as output text.
|002A:4DC3BC6C6C6572               | DB "Müller" ; 7 bytes of UTF-8 text were emitted to data segment.
|0031:90 ....AutoAlignment stuff.
|0032:4D0093038C036C006C0065007200 | DU "Müller" ; 7 wide (16bit) characters were emitted to data segment.
|                                  |ENDPROGRAM t1360
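The dump column of the listing can be reproduced outside €ASM. This is a rough Python sketch under two assumptions taken from the listing rather than from the €ASM sources: DB copies the source bytes unchanged, and DU decodes them with the declared code page and emits 16-bit little-endian characters. The function names `emulate_db` and `emulate_du` are hypothetical, and Python's codec names `cp1252` and `cp1253` stand in for Windows-1252 and Windows-1253.

```python
# UTF-8 bytes of the string literal as it is stored in the test source.
source_bytes = "M\u00FCller".encode("utf-8")

def emulate_db(raw: bytes) -> str:
    """Assumed DB behaviour: emit the source bytes as they are."""
    return raw.hex().upper()

def emulate_du(raw: bytes, codepage: str) -> str:
    """Assumed DU behaviour: decode with the declared code page, emit UTF-16LE."""
    return raw.decode(codepage).encode("utf-16-le").hex().upper()

db_dump = emulate_db(source_bytes)
du_dumps = {cp: emulate_du(source_bytes, cp) for cp in ("utf-8", "cp1252", "cp1253")}

print("DB", db_dump)
for codepage, dump in du_dumps.items():
    # Each wide character occupies four hex digits in the dump.
    print("DU", codepage, dump, len(dump) // 4, "wide characters")
```

Under these assumptions the DB dump is identical in all three sections, while the DU dump grows from 6 to 7 wide characters as soon as the declared code page no longer matches the real encoding of the source.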
Expected messages t1360.out
I0180 Assembling source file "t1360.htm".
I0270 Assembling source "t1360".
I0310 Assembling source pass 1.
I0330 Assembling source pass 2 - final.
I0470 Assembling program "t1360". "t1360.htm"{56}
I0510 Assembling program pass 1. "t1360.htm"{56}
I0530 Assembling program pass 2 - final. "t1360.htm"{56}
I0660 16bit TINY BIN file "t1360.bin" created, size=64. "t1360.htm"{80}
I0650 Program "t1360" assembled in 2 passes with errorlevel 0. "t1360.htm"{80}
I0750 Source "t1360" (98 lines) assembled in 2 passes with errorlevel 0.
I0860 Listing file "t1360.htm.lst" created, size=2767.
I0990 EuroAssembler terminated with errorlevel 0.
