EuroConv is a program from the EuroTool collection
which converts an input text file from one character encoding to another.
It recognizes text in ASCII, UTF-8, UTF-16, UTF-32 or one of 75 OEM/ANSI 8-bit encodings.
A complete list of supported encodings can be obtained with the command
euroconv /InputEncoding=? alias euroconv /IE=?.
Output in a UTF encoding can optionally be prefixed with a Byte Order Mark (BOM).
EuroConv also detects HTML entities in the input text, which may optionally be converted to the corresponding characters.
EuroConv converts either the entire input file or, when a header and footer size or length is specified, it omits the beginning and the end of the input file.
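To illustrate the optional Byte Order Mark mentioned above, here is a minimal Python sketch. It is not part of EuroConv: the helper name with_bom is hypothetical, and the byte sequences come from the constants in Python's standard codecs module.

```python
import codecs

# Byte Order Marks defined by the Unicode standard, keyed by Python codec name.
BOMS = {
    "utf-8": codecs.BOM_UTF8,          # EF BB BF
    "utf-16-le": codecs.BOM_UTF16_LE,  # FF FE
    "utf-16-be": codecs.BOM_UTF16_BE,  # FE FF
    "utf-32-le": codecs.BOM_UTF32_LE,  # FF FE 00 00
    "utf-32-be": codecs.BOM_UTF32_BE,  # 00 00 FE FF
}

def with_bom(text: str, encoding: str, bom: bool = True) -> bytes:
    """Encode text and optionally prepend the matching BOM (hypothetical helper)."""
    data = text.encode(encoding)
    return BOMS[encoding] + data if bom else data
```

The explicit -le and -be codec names are used because they never emit a BOM on their own, so the sketch controls it entirely.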
Graphic window of EuroConv launched in MS Windows:
Enumerated options /HtmlEntities= and /InvalidCharacters= (aliases
/HE= and /IC=) recognize only the first letter of their value: I|C|N and T|C|Q|O, respectively.
For instance, a typical invocation can look like euroconv input.txt output.txt /he=c
and it will convert the input file from its autodetected encoding (whichever it was) to UTF-8 (default /OE)
and convert any HTML entities found in the input text to their characters.
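The first-letter rule for enumerated options can be sketched in Python as follows. This is only an illustration, not EuroConv's actual parser: the helper name parse_enum_option is hypothetical, and case-insensitive matching is an assumption based on the /he=c example above.

```python
# Letters accepted by each enumerated option, as listed in the text above.
ENUM_LETTERS = {
    "HTMLENTITIES": "ICN",
    "INVALIDCHARACTERS": "TCQO",
}
# Short aliases of the option names.
ALIASES = {"HE": "HTMLENTITIES", "IC": "INVALIDCHARACTERS"}

def parse_enum_option(arg: str) -> tuple:
    """Parse '/Name=Value' and reduce Value to its first letter (hypothetical helper)."""
    name, _, value = arg.lstrip("/").partition("=")
    name = name.upper()                  # assumption: names are case-insensitive
    name = ALIASES.get(name, name)
    letter = value[:1].upper()           # only the first letter of the value matters
    if name not in ENUM_LETTERS or not letter or letter not in ENUM_LETTERS[name]:
        raise ValueError(f"unsupported option: {arg}")
    return name, letter
```

With this sketch, /he=c, /HE=Convert and /HtmlEntities=C would all reduce to the same pair of option name and letter.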
Option /HtmlEntities=NonASCII will ignore the ASCII HTML entities
&amp; &lt; &gt; &quot; but it will convert other entities, such as
the named or numeric entities for € (for example &euro;), etc.
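The difference between converting all entities and converting only the non-ASCII ones can be sketched in Python. This is an assumption-laden illustration, not EuroConv's algorithm: the helper name convert_entities and its parameter non_ascii_only are hypothetical, and entity lookup is delegated to Python's standard html.unescape.

```python
import html
import re

# Matches named (&euro;), decimal (&#8364;) and hexadecimal (&#x20AC;) entities.
ENTITY = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);")

def convert_entities(text: str, non_ascii_only: bool = False) -> str:
    """Replace HTML entities by their characters (hypothetical helper)."""
    def replace(match):
        entity = match.group(0)
        char = html.unescape(entity)
        if char == entity:
            return entity   # unknown entity: leave it untouched
        if non_ascii_only and all(ord(c) < 128 for c in char):
            return entity   # ASCII entities such as &amp; &lt; &gt; &quot; are kept
        return char
    return ENTITY.sub(replace, text)
```

Keeping the ASCII entities is useful when the output is itself HTML, where a literal < or & would change the markup.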
How does the program work:
- EuroConv searches for the global configuration file
/etc/eurotool/euroconv.ini (Linux) or %AppData%\eurotool\euroconv.ini (Windows). If it doesn't exist, EuroConv will try to create it with default arguments.
You should therefore run EuroConv with root privileges the first time: sudo ./euroconv.x
- Then EuroConv reads arguments from the command line. If either InputFile or OutputFile is missing, it lets you specify the arguments in the graphic window.
- The input file is mapped to memory and, if /InputEncoding= is not explicitly specified, its encoding is autodetected.
- If /HeaderSize= and /FooterSize= are not 0, they are applied. Then /HeaderLength= and /FooterLength= are applied, if not 0.
Header and Footer are omitted from the conversion.
- Specified arguments are recapitulated and the conversion begins.
- Each input character is converted to a 32-bit Unicode code point. For an 8-bit InputEncoding it uses CPtable from CodePages for conversion to UTF-32LE.
If the conversion is not possible, the counter of input errors is incremented.
- The Unicode character is then converted to the OutputEncoding; again it may use CPtable from CodePages for conversion from UTF-32LE to an OEM or ANSI output encoding.
When the conversion is not possible, the counter of output errors is incremented.
- EuroConv reports the number of converted characters, input errors and output errors, and terminates.
Conversion is reversible only if both input and output errors are 0 and HtmlEntities=I.
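The two conversion steps and their error counters can be sketched in Python for the case of an 8-bit input encoding. This is a simplified illustration rather than EuroConv's implementation: the function name convert is hypothetical, Python's built-in codecs stand in for CPtable, and unconvertible characters are simply skipped here, whereas EuroConv's handling depends on /InvalidCharacters=.

```python
def convert(data: bytes, input_encoding: str, output_encoding: str):
    """Return (output bytes, input errors, output errors); hypothetical helper."""
    out = bytearray()
    input_errors = output_errors = 0
    for byte in data:
        try:
            # Step 1: 8-bit input character -> Unicode code point.
            char = bytes([byte]).decode(input_encoding)
        except UnicodeDecodeError:
            input_errors += 1
            continue
        try:
            # Step 2: Unicode code point -> output encoding.
            out += char.encode(output_encoding)
        except UnicodeEncodeError:
            output_errors += 1
    return bytes(out), input_errors, output_errors
```

For example, byte 0x80 in cp1252 is the euro sign, which has no representation in latin-1, so converting it to latin-1 counts one output error, and such a conversion cannot be reversed.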
Both programs in this source file have the name euroconv. The linked executable file for Linux has the name euroconv.x and the version for Windows has the name euroconv.exe.
Both are built with the command euroasm euroconv.htm.
EUROASM CPU=X64, Unicode=No, MaxInclusions=100, NoWarn=0563
INCLUDE argument.htm ; Assemble the module argument.htm.
INCLUDE "convmain.htm" ; Assemble the module convmain.htm (common for both OS).
The Linux GUI version works on an ANSI terminal in character pseudographic mode.
INCLUDE "convlinc.htm" ; Assemble the module convlinc.htm (Linux console subsystem).
INCLUDE "convling.htm" ; Assemble the module convling.htm (Linux pseudographic subsystem).
euroconv PROGRAM Format=ELFX, Width=64, Entry=MainCon ; Executable Linux file euroconv.x.
LINK argument.obj, convmain.obj, convlinc.obj, convling.obj ; Program EuroConv for Linux has four modules.
ENDPROGRAM euroconv
INCLUDE "convwinc.htm" ; Assemble the module convwinc.htm (Windows console subsystem).
INCLUDE "convwing.htm" ; Assemble the module convwing.htm (Windows graphic subsystem).
euroconv PROGRAM Format=PE, Width=64, Entry=MainCon, IconFile=euroconv.ico ; Executable Windows file euroconv.exe.
LINK argument.obj, convmain.obj, convwinc.obj, convwing.obj ; Program EuroConv for Windows has four modules.
ENDPROGRAM euroconv