euroconv.htm
EuroConv
EuroConv for Linux
EuroConv for Windows

EuroConv is a program from the EuroTool collection which converts an input text file from one character encoding to another.

It recognizes text in ASCII, UTF-8, UTF-16, UTF-32 or one of 75 OEM/ANSI 8-bit encodings.
A complete list of supported encodings can be obtained with the command euroconv /InputEncoding=? (alias euroconv /IE=?).
Output in a UTF encoding can optionally be prefixed with a Byte Order Mark (BOM).
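
To illustrate what the optional BOM amounts to, here is a minimal Python sketch that prepends the BOM bytes defined by the Unicode standard. It is not EuroConv's implementation; the names BOMS and with_bom are hypothetical and only the standard codecs module is used.

```python
import codecs

# BOM byte sequences defined by the Unicode standard, keyed by encoding name.
BOMS = {
    "utf-8": codecs.BOM_UTF8,          # EF BB BF
    "utf-16-le": codecs.BOM_UTF16_LE,  # FF FE
    "utf-16-be": codecs.BOM_UTF16_BE,  # FE FF
    "utf-32-le": codecs.BOM_UTF32_LE,  # FF FE 00 00
    "utf-32-be": codecs.BOM_UTF32_BE,  # 00 00 FE FF
}

def with_bom(text, encoding, bom=True):
    """Encode text (hypothetical helper) and optionally prepend the BOM."""
    data = text.encode(encoding)
    return BOMS[encoding] + data if bom else data
```

With bom=False the encoded bytes are returned unchanged, which corresponds to switching the BOM off in the output.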

EuroConv also detects HTML entities in the input text, which may optionally be converted to the corresponding characters.

EuroConv converts either the entire input file, or it can omit the beginning and the end of the input file from the conversion when a header and footer size (in bytes) or length (in lines) is specified.
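
The header and footer options can be pictured with the following Python sketch. It is only an approximation under stated assumptions: sizes are applied before lengths (the order given in the program steps), lines are assumed to end with LF, and the function name split_header_footer is hypothetical, not part of EuroConv.

```python
def split_header_footer(data, header_size=0, header_length=0,
                        footer_size=0, footer_length=0):
    """Return (header, body, footer); only body would be converted."""
    start, end = 0, len(data)
    # Sizes are counted in bytes and applied first.
    start = min(start + header_size, end)
    end = max(end - footer_size, start)
    # Lengths are counted in lines (assumption: lines end with LF).
    for _ in range(header_length):
        nl = data.find(b"\n", start, end)
        start = end if nl < 0 else nl + 1
    for _ in range(footer_length):
        # Skip the terminator that closes the last remaining line, if any.
        stop = end - 1 if end > start and data[end - 1:end] == b"\n" else end
        nl = data.rfind(b"\n", start, stop)
        end = start if nl < 0 else nl + 1
    return data[:start], data[start:end], data[end:]
```

For example, with header_length=2 and footer_length=1 the first two lines and the last line are kept out of the converted body.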


Graphic window of EuroConv launched in MS Windows:

; EuroConv default configuration (UTF-8):
/InputFile=           ; Input file name to be converted.
/InputEncoding=       ; Encoding of InputFile; autodetect when not specified.
/HeaderSize=0         ; Number of bytes to omit from conversion at the beginning.
/HeaderLength=0       ; Number of lines to omit from conversion at the beginning.
/FooterSize=0         ; Number of bytes to omit from conversion at the end.
/FooterLength=0       ; Number of lines to omit from conversion at the end.
/HtmlEntities=I       ; Ignore | Convert to a character | NonASCII convert (do not convert &, <, >, ").
/OutputFile=          ; Output file name where the converted input will be saved.
/OutputEncoding=      ; Encoding of OutputFile; UTF-8 when not specified.
/ByteOrderMark=yes    ; Use BOM in output UTF encoding.
/InvalidCharacter=T   ; Transliterate | Convert to HTML entity | Question-mark replace | Omit

The enumerated options /HtmlEntities= and /InvalidCharacters= (aliases /HE= and /IC=) recognize only the first letter of their value: I|C|N and T|C|Q|O, respectively.
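
The first-letter rule can be sketched in Python as follows. This is an illustration, not the program's argument parser; the table and function names are hypothetical, and case-insensitive matching is an assumption.

```python
# Meanings taken from the default configuration; dictionary names are hypothetical.
HTML_ENTITIES = {"I": "Ignore", "C": "Convert", "N": "NonASCII"}
INVALID_CHARACTER = {"T": "Transliterate", "C": "Convert to HTML entity",
                     "Q": "Question-mark replace", "O": "Omit"}

def parse_enum(value, table):
    """Resolve an enumerated option by its first letter only."""
    key = value[:1].upper()  # assumption: letter case does not matter
    if key not in table:
        raise ValueError("unsupported option value: %r" % value)
    return table[key]
```

Under this rule /HE=c, /HE=Convert and /HE=conv would all select the same setting.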

For instance, a typical invocation can look like euroconv input.txt output.txt /he=c. It converts the input file from its autodetected encoding (whichever it was) to UTF-8 (the default /OE) and converts every HTML entity found in the input text to its character.

Option /HtmlEntities=NonASCII will ignore the ASCII HTML entities &amp;amp; &amp;lt; &amp;gt; &amp;quot; but it will convert other entities, such as &amp;euro; (€), etc.
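
The three /HtmlEntities= modes can be approximated with this Python sketch. It relies on the entity table from Python's standard html.entities module rather than on EuroConv's own table, and the function name convert_entities is hypothetical. Deciding by the resulting character (so that a numeric form of & is kept too in N mode) is an assumption.

```python
import re
from html.entities import html5

# Matches decimal, hexadecimal and named entities terminated by a semicolon.
ENTITY = re.compile(r"&(?:#([0-9]+)|#[xX]([0-9a-fA-F]+)|([A-Za-z][A-Za-z0-9]*));")
KEPT_IN_N_MODE = {"&", "<", ">", '"'}  # characters left as entities in N mode

def convert_entities(text, mode="I"):
    """mode: I = ignore, C = convert all, N = convert all except & < > \"."""
    if mode == "I":
        return text

    def replace(match):
        dec, hexa, name = match.groups()
        if name:
            char = html5.get(name + ";")  # None for unknown names
        else:
            code = int(dec) if dec else int(hexa, 16)
            char = chr(code) if code < 0x110000 else None
        if char is None or (mode == "N" and char in KEPT_IN_N_MODE):
            return match.group(0)  # leave the entity untouched
        return char

    return ENTITY.sub(replace, text)
```

Unknown entity names are left in place in every mode, so the sketch never discards input text.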

How does the program work:
  1. EuroConv searches for the global configuration /etc/eurotool/euroconv.ini (Linux) or %AppData%\eurotool\euroconv.ini (Windows). If it doesn't exist, it will try to create it with default arguments.
    You should run EuroConv with root privileges the first time: sudo ./euroconv.x.
  2. EuroConv then reads arguments from the command line. If either InputFile or OutputFile is missing, it lets you specify the arguments in the graphic window.
  3. Input file is mapped to memory and if /InputEncoding= is not explicitly specified, it will be autodetected.
  4. If /HeaderSize= and /FooterSize= are not 0, they are applied. Then /HeaderLength= and /FooterLength= are applied, if not 0.
    Header and Footer are omitted from the conversion.
  5. Specified arguments are recapitulated and the conversion begins.
  6. Each input character is converted to a 32-bit Unicode code point. For an 8-bit InputEncoding it uses CPtable from CodePages for the conversion to UTF-32LE.
    If the conversion is not possible, the counter of input errors is incremented.
  7. The Unicode character is then converted to the OutputEncoding. Again, it may use CPtable from CodePages for the conversion from UTF-32LE to an OEM or ANSI output encoding.
    When the conversion is not possible, the counter of output errors is incremented.
  8. EuroConv recapitulates the number of converted characters, input and output errors and terminates.
    Conversion is reversible only if both input and output errors are 0 and HtmlEntities=I.
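
Steps 6 to 8 above can be sketched in Python as a two-stage conversion with separate error counters. This is a minimal illustration under assumptions, not the actual €ASM code: Python's built-in codecs stand in for CPtable, undecodable input bytes are dropped, unencodable characters are replaced by a question mark (one of the /InvalidCharacter= choices), and the names convert, sketch_in and sketch_out are hypothetical.

```python
import codecs

def convert(data, input_encoding, output_encoding="utf-8"):
    """Return (output_bytes, converted_chars, input_errors, output_errors)."""
    counts = {"input": 0, "output": 0}

    def on_input_error(exc):
        # Undecodable input: count it and skip the offending bytes.
        counts["input"] += 1
        return ("", exc.end)

    def on_output_error(exc):
        # Unencodable characters: count each and substitute a question mark.
        n = exc.end - exc.start
        counts["output"] += n
        return ("?" * n, exc.end)

    codecs.register_error("sketch_in", on_input_error)
    codecs.register_error("sketch_out", on_output_error)

    # Input bytes -> Unicode code points (a Python str).
    text = data.decode(input_encoding, errors="sketch_in")
    # Unicode code points -> output encoding.
    out = text.encode(output_encoding, errors="sketch_out")
    return out, len(text), counts["input"], counts["output"]
```

A result with both counters at 0 corresponds to the case the last step calls reversible, provided HTML entities were left untouched.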
↑ EuroConv
This linker script holds the configuration and argument processing common to both the Linux and the Windows version.

Both programs in this source file have the name euroconv. The linked executable file for Linux has the name euroconv.x and the version for Windows has the name euroconv.exe.

Both executables will be built with the command euroasm euroconv.htm.
         EUROASM CPU=X64, Unicode=No, MaxInclusions=100, NoWarn=0563
         INCLUDE argument.htm   ; Assemble the module argument.htm.
         INCLUDE "convmain.htm" ; Assemble the module convmain.htm (common for both OS).
↑ EuroConv for Linux

The Linux GUI version works in an ANSI terminal in character pseudographic mode.

         INCLUDE "convlinc.htm" ; Assemble the module convlinc.htm (Linux console subsystem).
         INCLUDE "convling.htm" ; Assemble the module convling.htm (Linux pseudographic subsystem).
euroconv PROGRAM Format=ELFX, Width=64, Entry=MainCon                 ; Executable Linux file euroconv.x.
          LINK argument.obj, convmain.obj, convlinc.obj, convling.obj ; Program EuroConv for Linux has four modules.
         ENDPROGRAM euroconv
↑ EuroConv for MS Windows
         INCLUDE "convwinc.htm" ; Assemble the module convwinc.htm (Windows console subsystem).
         INCLUDE "convwing.htm" ; Assemble the module convwing.htm (Windows graphic subsystem).
euroconv PROGRAM Format=PE, Width=64, Entry=MainCon, IconFile=euroconv.ico ; Executable Windows file euroconv.exe.
          LINK argument.obj, convmain.obj, convwinc.obj, convwing.obj      ; Program EuroConv for Windows has four modules.
         ENDPROGRAM euroconv
