Introduction:
=============

This file documents an approximate correlation between the data files
provided in the !Unicode distribution and the encoding headers in GNU
libiconv 1.9.1.

Those with '?' in the iconv column either are not represented in iconv
or I've missed the relevant header file ;)

A number of encodings are present in the iconv distribution but not
in !Unicode. These are documented at the end of this file.

Changelog:
==========

v 0.01 (09-Sep-2004)
~~~~~~~~~~~~~~~~~~~~
Initial Incarnation

v 0.02 (11-Sep-2004)
~~~~~~~~~~~~~~~~~~~~
Documented additional encodings supported by the Iconv module.
Corrected list of !Unicode deficiencies.


!Unicode->iconv:
================

Unicode:			iconv:			notes:

Acorn.Latin1			riscos1.h

Apple.CentEuro			mac_centraleurope.h
Apple.Cyrillic			mac_cyrillic.h
Apple.Roman			mac_roman.h
Apple.Ukrainian			mac_ukraine.h

BigFive				big5.h

ISO2022.C0.40[ISO646]		?

ISO2022.C1.43[IS6429]		?

ISO2022.G94.40[646old]		iso646_cn.h
ISO2022.G94.41[646-GB]		?
ISO2022.G94.42[646IRV]		?
ISO2022.G94.43[FinSwe]		?
ISO2022.G94.47[646-SE]		?
ISO2022.G94.48[646-SE]		?
ISO2022.G94.49[JS201K]		jisx0201.h		top of JIS range 
ISO2022.G94.4A[JS201R]		jisx0201.h iso646_jp.h	bottom of JIS range
ISO2022.G94.4B[646-DE]		?
ISO2022.G94.4C[646-PT]		?
ISO2022.G94.54[GB1988]		?
ISO2022.G94.56[Teltxt]		?
ISO2022.G94.59[646-IT]		?
ISO2022.G94.5A[646-ES]		?
ISO2022.G94.60[646-NO]		?
ISO2022.G94.66[646-FR]		?
ISO2022.G94.69[646-HU]		?
ISO2022.G94.6B[Arabic]		?
ISO2022.G94.6C[IS6397]		?
ISO2022.G94.7A[SerbCr]		?

ISO2022.G94x94.40[JS6226]	?
ISO2022.G94x94.41[GB2312]	gb2312.h
ISO2022.G94x94.42[JIS208]	jis0x208.h
ISO2022.G94x94.43[KS1001]	ksc5601.h
ISO2022.G94x94.44[JIS212]	jis0x212.h
ISO2022.G94x94.47[CNS1]		cns11643_1.h		the tables differ
ISO2022.G94x94.48[CNS2]		cns11643_2.h
ISO2022.G94x94.49[CNS3]		cns11643_3.h
ISO2022.G94x94.4A[CNS4]		cns11643_4.h
ISO2022.G94x94.4B[CNS5]		cns11643_5.h
ISO2022.G94x94.4C[CNS6]		cns11643_6.h
ISO2022.G94x94.4D[CNS7]		cns11643_7.h

ISO2022.G96.41[Lat1]		iso8859_1.h
ISO2022.G96.42[Lat2]		iso8859_2.h
ISO2022.G96.43[Lat3]		iso8859_3.h
ISO2022.G96.44[Lat4]		iso8859_4.h
ISO2022.G96.46[Greek]		?
ISO2022.G96.47[Arabic]		iso8859_6.h		ISO-8859-6 ignored
ISO2022.G96.48[Hebrew]		?
ISO2022.G96.4C[Cyrill]		?
ISO2022.G96.4D[Lat5]		iso8859_5.h
ISO2022.G96.50[LatSup]		?
ISO2022.G96.52[IS6397]		?
ISO2022.G96.54[Thai]		tis620.h
ISO2022.G96.56[Lat6]		iso8859_6.h
ISO2022.G96.58[L6Sami]		?
ISO2022.G96.59[Lat7]		iso8859_7.h
ISO2022.G96.5C[Welsh]		?
ISO2022.G96.5D[Sami]		?
ISO2022.G96.5E[Hebrew]		?
ISO2022.G96.5F[Lat8]		iso8859_8.h
ISO2022.G96.62[Lat9]		iso8859_9.h

KOI8-R				koi8_r.h

Microsoft.CP1250		cp1250.h
Microsoft.CP1251		cp1251.h
Microsoft.CP1252		cp1252.h
Microsoft.CP1254		cp1254.h
Microsoft.CP866			cp866.h
Microsoft.CP932			cp932.h cp932ext.h

iconv->!Unicode:
================

Iconv has the following encodings, which are not present in !Unicode. 
Providing a suitable data file for !Unicode is trivial. Whether UnicodeLib
will then act upon the addition of these is unknown.
This list is ordered as per libiconv's NOTES file.

European & Semitic languages:

	ISO-8859-16 (iso8859_16.h)
	KOI8-{U,RU,T} (koi8_xx.h)
	CP125{3,5,6,7} (cp125n.h)
	CP850 (cp850.h)
	CP862 (cp862.h)
	Mac{Croatian,Romania,Greek,Turkish,Hebrew,Arabic} (mac_foo.h)

Japanese:

	None afaikt.

Simplified Chinese:

	GB18030 (gb18030.h, gb18030ext.h)
	HZ-GB-2312 (hz.h)

Traditional Chinese:

	CP950 (cp950.h)
	BIG5-HKSCS (big5hkscs.h)

Korean:

	CP949 (cp949.h)

Armenian:

	ARMSCII-8 (armscii_8.h)

Georgian:

	Georgian-Academy, Georgian-PS (georgian_academy.h, georgian_ps.h)

Thai:

	CP874 (cp874.h)
	MacThai (mac_thai.h)

Laotian:

	MuleLao-1, CP1133 (mulelao.h, cp1133.h)

Vietnamese:

	VISCII, TCVN (viscii.h, tcvn.h)
	CP1258 (cp1258.h)

Unicode:

	BE/LE variants of normal encodings. I assume UnicodeLib handles
	these, but can't be sure.
	C99 / JAVA - well, yes.


Iconv Module:
=============

The iconv module is effectively a thin veneer around UnicodeLib. However,
8bit encodings are implemented within the module rather than using the
support in UnicodeLib. The rationale for this is simply that, although
UnicodeLib will understand (and act upon - reportedly...) additions to
the ISO2022 Unicode resource, other encodings are ignored. As the vast
majority of outstanding encodings fall into this category, and the code
is fairly simple, it made sense to implement it within the module.

With use of the iconv module, the list of outstanding encodings is
reduced to:

	CP1255 (requires state-based transcoding)

	GB18030 (not 8bit - reportedly a requirement of PRC)
	HZ-GB-2312 (not 8bit - supported by IE4)

	CP950 (not 8bit - a (MS) variant of Big5)
	BIG5-HKSCS (not 8bit - again, a Big5 variant)

	CP949 (not 8bit)

	ARMSCII-8 (easily implemented, if required)

	VISCII (easily implemented, if required)
	CP1258, TCVN (requires state-based transcoding)

Additionally, the rest of the CodePage encodings implemented in iconv
but not listed above (due to omissions from the iconv documentation)
are implemented by the iconv module.
