Uniconv Help
------------------------------------------------------------------------

Uniconv is a command line utility that uses the Basis Technology C++ Library for Unicode to convert text between encodings and, optionally, to apply transforms to it.

Help Contents

  * Usage
      o Examples
  * Encodings
      o Encodings Quick Reference
  * Character Properties
      o Properties Quick Reference
  * Transforms
      o Transforms Quick Reference
  * Error Messages
  * Copyright Information

------------------------------------------------------------------------

Usage

Uniconv converts a text file written in one of its accepted encodings (see Encodings Quick Reference) to another of its accepted encodings. It uses a command line interface, the usage being as follows:

    uniconv [-options] input-encoding input-file output-encoding output-file [property | transform]*

uniconv
    Name of the program to run.

input-encoding (required)
    The encoding of the input file. The encoding name must be written in the way listed below.

input-file (required)
    The name of the file to be converted (if it is in the current directory), or its path and file name (if it is not).

output-encoding (required)
    The desired encoding of the output file. The encoding name must be written in the way listed below.

output-file (required)
    The name of the file to be created in the new encoding (if it is to go in the current directory), or its path and file name (if it is not).

property (optional)
    Returns a true or false value for each character. A property is associated with the transform that follows it. Properties not followed by a transform are ignored. Multiple property-transform pairs are OK. Multiple properties per transform are also OK. See Character Properties for more information about how to use properties, and see below for a quick reference of the properties available.

transform (optional)
    Changes a property value for designated characters in a file. Multiple transforms are OK.
    See Transforms for more information about how to use transforms, and see below for a quick reference of the transforms available.

options:
    Use these flags at the beginning of the command line, before you specify the input and output encodings and filenames.

-debug (optional)
    Prints the messages generated by auto-detection. For example, if you are converting a Japanese file and the input encoding is japaneseautodetect, uniconv lists the encodings it attempts (sjis, euc-j, etc.) and the results.

-help (optional)
    Displays the copyright information.

-subst (optional)
    Changes the default substitution character. The substitution character is the character used when there is no direct mapping for a character in a conversion. The default substitution character is CTRL-Z.

Notes

  - All command line arguments are case insensitive.
  - Separate properties and transforms with a space.
  - If there are multiple properties or transforms, they are performed in the order listed.
  - The options -debug, -help, and -subst, if used, must directly follow "uniconv".
  - In the usage line, "*" means more than one property or transform is OK.
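
The pairing rules for properties and transforms described above can be sketched in a few lines of Python. This is only an illustration of the documented rules, not Uniconv's actual parser: the function name group_arguments is hypothetical, any name that is not a known transform is assumed to be a property, and how several properties attached to one transform are combined is not stated in this help page, so the sketch simply collects them.

```python
# Hypothetical sketch: models the documented pairing rules only; it is not
# Uniconv's actual parser.

# Transform names copied from the Transforms Quick Reference.
TRANSFORMS = {
    name.lower() for name in (
        "ToLowercase ToUppercase ToFullwidth ToHalfwidth ToHiragana ToKatakana "
        "Decompose Compose ToCombiningMark ToSpacingMark Select Filter ToCRLF "
        "ToCR ToLF ToParagraphSeparator ToLineSeparator ToCanonical "
        "ToTraditionalChinese ToSimplifiedChinese RomajiToHiragana "
        "RomajiToKatakana KanaToRomaji ToLatinNumber SGMLEntity"
    ).split()
}


def group_arguments(tokens):
    """Return a list of (transform, [properties]) pairs in command-line order."""
    steps = []
    pending = []  # properties waiting for the next transform
    for token in tokens:
        name = token.lower()  # command line arguments are case insensitive
        if name in TRANSFORMS:
            steps.append((name, pending))
            pending = []
        else:
            pending.append(name)  # assumption: anything else is a property
    # Properties still in `pending` have no following transform and are ignored.
    return steps


print(group_arguments(["katakana", "tofullwidth"]))
# [('tofullwidth', ['katakana'])]
print(group_arguments(["tofullwidth", "tolowercase"]))
# [('tofullwidth', []), ('tolowercase', [])]
print(group_arguments(["Latin", "Greek", "ToUppercase", "Hiragana"]))
# [('touppercase', ['latin', 'greek'])]
```

In the last call, Hiragana is dropped because no transform follows it, which matches the rule that properties not followed by a transform are ignored.
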

Examples

To change a file from Shift-JIS encoding to UCS2 encoding:

    uniconv sjis input.txt ucs2 output.txt

To change a file from ASCII encoding to UCS2 encoding and convert all uppercase letters to lowercase:

    uniconv ascii input.txt ucs2 output.txt tolowercase

To keep a file in Shift-JIS and convert all half-width characters to full-width:

    uniconv sjis input.txt sjis output.txt ToFullwidth

To keep a file in Shift-JIS and convert all half-width characters to full-width and all uppercase romaji to lowercase:

    uniconv sjis input.txt sjis output.txt tofullwidth tolowercase

To keep a file in Shift-JIS and convert only katakana half-width characters to full-width, leaving romaji half-width characters as-is:

    uniconv sjis input.txt sjis output.txt katakana tofullwidth

Encodings Quick Reference: Accepted Encodings

    Arabic, ASCII, Big5, BMP, ChineseAutoDetect, cp1251, cp1252, cp437, cp850, EUC-J, EUC-KR, GB2312, Greek, Hebrew, HZ, ISO-2022-JP, ISO-2022-KR, ISOLatinCyrillic, JapaneseAutoDetect, JIS_X0201, JIS_X_0208, KoreanAutoDetect, Latin1, Latin2, Latin3, Latin4, Latin5, Latin6, Shift-JIS, Thai, UCS2, Unicode11UCS2, Unicode11UTF7, Unicode11UTF8, UTF7, UTF8

Properties Quick Reference: Accepted Properties

    UppercaseLetter, LowercaseLetter, TitlecaseLetter, ModifierLetter, OtherLetter, AnyLetter, NonSpacingMark, CombiningMark, DecimalNumber, OtherNumber, DashPunctuation, OpenPunctuation, ClosePunctuation, OtherPunctuation, MathSymbol, CurrencySymbol, OtherSymbol, SpaceSeparator, LineSeparator, ParagraphSeparator, ControlCharacter, OtherCharacter, UndefinedScript, GeneralScript, Latin, Greek, Cyrillic, Armenian, Hebrew, Arabic, Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Thai, Lao, Tibetan, Georgian, HangulJamo, Hiragana, Katakana, Kana, Bopomofo, CJKUnifiedIdeographs, Hangul, UndefinedWidth, Fullwidth, Halfwidth

Transforms Quick Reference: Accepted Transforms

    ToLowercase, ToUppercase, ToFullwidth, ToHalfwidth, ToHiragana, ToKatakana, Decompose, Compose, ToCombiningMark, ToSpacingMark, Select, Filter, ToCRLF, ToCR, ToLF, ToParagraphSeparator, ToLineSeparator, ToCanonical, ToTraditionalChinese, ToSimplifiedChinese, RomajiToHiragana, RomajiToKatakana, KanaToRomaji, ToLatinNumber, SGMLEntity

Copyright Information

(Type uniconv -help at the command line for the below information)

Uniconv may be freely distributed and used without charge by individuals, academic institutions, and non-profit institutions. Use by businesses or governments requires purchase of a licensed copy.

Uniconv is built using the Basis Technology C++ Library for Unicode, which is available in source code form. For more information regarding Uniconv and the C++ Library for Unicode, contact:

    Basis Technology Corp.
    One Kendall Square, Bldg. 200
    Cambridge, MA 02139 U.S.A.
    unicode@basistech.com
    http://unicode.basistech.com
    Tel: 617-252-5636
    Fax: 617-252-9150

------------------------------------------------------------------------
Rosette, the C++ Library for Unicode
Basis Technology Corp.
One Kendall Square, Bldg. 200
Cambridge, MA 02139
tel. 617-252-5636
unicode@basistech.com