ASCII
(which is the native encoding for TeX). The following packages are commonly employed for language support:
inputenc
translates input into TeX's internal language, generally composing it from ASCII
and control sequences: for instance, â would be represented internally as \^a under the ISO Latin-1 encoding. With UTF-8 the representation gets more convoluted, since complex characters such as Japanese kanji need to be accommodated. The syntax is \usepackage[encoding]{inputenc}
: in pretty much all cases UTF-8 should be used (\usepackage[utf8]{inputenc}
). Current LaTeX distributions actually use UTF-8 by default, but it is still good practice to specify it in case your .tex
file is ever compiled on a machine with an older distribution.
ASCII | The American Standard Code for Information Interchange, published in 1963, is one of the earliest character encoding standards. It encodes each character in 7 bits (often stored with 1 parity bit) for a total of 128 characters. The option for ASCII is (unsurprisingly) [ascii] |
Mac OS Roman | Encoding used by Apple's classic Mac OS from 1984 to 2001. Having no parity bit, it can encode a total of 256 characters, the first 128 being the ASCII ones. The option for Mac OS Roman is [applemac] |
UTF-8 (UCS) | UTF-8 is a variable-length encoding (1 to 4 bytes per character) of the Universal Character Set and can represent all 1,112,064 valid code points (obviously not all of them are in use as of now); the first 128 are still the ASCII ones. Its flexibility has made it today's most used character encoding by far. The option for UTF-8 is [utf8] |
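As an illustration of the inputenc syntax described above, here is a minimal sketch of a document that declares its input encoding. It assumes the .tex file is saved as UTF-8; the sample letters are arbitrary and not taken from the original text.

```latex
\documentclass{article}
% Declare the encoding in which this .tex file is saved.
% UTF-8 is assumed here; [latin1], [ascii] or [applemac] are selected the same way.
\usepackage[utf8]{inputenc}
\begin{document}
% With the declaration above, the two spellings below should give the same output.
Typed directly: â, é, ü.

Typed with control sequences: \^a, \'e, \"u.
\end{document}
```

If the option does not match the encoding the file is actually saved in, accented characters will typically come out wrong or raise errors, so the editor's save encoding and the package option should be kept in agreement.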
The fontenc
package takes the interpretations produced by inputenc
and converts them into actual characters by applying command sequences to an established table of 128 or 256 glyphs, depending on the chosen encoding. For instance â (if not already present in the current fontenc
encoding) would be converted by inputenc into \^a, then fontenc would read it and place the accent on the "a" glyph. OT1
is the original TeX encoding and uses a table of 128 glyphs (as do all encodings whose name starts with O); it is still the default when fontenc is not loaded. T1, with 256 glyphs, covers most Latin-script languages and is the usual choice today. Cyrillic scripts are covered by T2A
, T2B
, T2C
; all three are contained in the X2
encoding. The syntax is \usepackage[encoding]{fontenc}
.
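The interplay between the two packages can be sketched with a short preamble. This is only an illustrative example, assuming a UTF-8 source file and a font that is available in the T1 encoding; the sample words are arbitrary.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc} % input side: how the bytes of the .tex file are read
\usepackage[T1]{fontenc}    % output side: 256-glyph table with precomposed accented letters
\begin{document}
% In T1 a letter such as â is a single glyph of the font table,
% rather than an "a" with an accent placed on top of it as in OT1.
château, naïveté, Straße
\end{document}
```

One commonly cited reason for preferring T1 is that words containing accented letters can then be hyphenated normally, which is generally not the case when the accent is built from two glyphs in OT1.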
The babel
package supports language-specific typesetting for one or more languages. The syntax is \usepackage[language]{babel}
. For more information about this package, please consult the relevant section.
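A minimal sketch of babel with two languages is shown here. The choice of English and Italian is an assumption made purely for illustration; any pair of languages supported by babel could be substituted.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
% The last language listed becomes the main language of the document.
\usepackage[english,italian]{babel}
\begin{document}
Questo paragrafo segue le regole tipografiche italiane.

% Switch hyphenation patterns and language-specific settings to English.
\selectlanguage{english}
This paragraph is typeset according to English rules.

% Switch back, and mark a short passage in another language inline.
\selectlanguage{italian}
Una breve citazione: \foreignlanguage{english}{a short English phrase}.
\end{document}
```

\selectlanguage changes the language for everything that follows, while \foreignlanguage is meant for short inline passages and leaves the surrounding language untouched.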
fontenc
with its tables can't really deal with this without external support: language-specific solutions have to be found. CJK
can output Chinese, Japanese and Korean:
% Alternative (ctex bundle): \documentclass[UTF8]{ctexart} sets up Chinese support for the whole document
\documentclass{article}
\usepackage{CJKutf8}
\begin{document}
This package supports texts in Chinese, Japanese and Korean:\par
\begin{CJK}{UTF8}{min}{
法84和リオカケ港側ト聞州つざドか購34打オナウカ聞載リ独止70援ざう実本短致召ゃ。前にきー芸多ドさッぐ月勧ぐひ問氏こま科大はよぞお触浜の向究らげて荒理ソコモ迎京改期ーちぞ詳面ヨホ録学輔チロ恵位セマレ訃傍剰えだン。26宿チソ間面アチウエ案抗ぎ力描ぽっば問件そんて屋85真チル文器ミスタネ初勝イテ件有こ稿勢りあ化志にッぱ販演危距告え。
投ホ代判へ都7入ぴま都出ク円出あ別社ユワ断載ノ意中軍ド詳損付へす選棄メカ親策ホス社予ば回供権か報合でぼふ登国うゃも討当都殺幌いゅ。終トカ立35無著リぐ堀無ざクす駒一ユエメヲ情察変ぶづけべ中能やたま十製ハホノネ今9団す禁食加牟ネヘレヤ操26殺上得腕や。外スクテ空内トラス提捕読イやだぜ員村れ優政もびし読65苦コ光季かいクわ送者取ツ者員タスヲモ担知せふリ。}
\end{CJK}
\end{document}