This tutorial, created by Tex Texin and Richard Gillam, two leaders in software and Web internationalization, can be presented at your company or in your city. To arrange a presentation for your staff, please email us at info@unicode-conference.org.
UniClinic can be customized specifically for your organization and its development environment, to better meet your training needs.
Introduction: What are the business drivers for internationalization?
- Business Without Borders
- Opportunities Internationally
- Opportunities on the Web
- Business and Economic Forces at Work
- ROI
Technological drivers for Unicode and Internationalization
- In Software applications
- On the World Wide Web
- Multilingual applications
World Tour: Regional Customs Affecting Software Design and Implementation, and Efficient Solutions
- Graphics
- Data Formats (Calendars, Dates, Times, Numbers, Currency, Addresses, etc.)
- Linguistic Software Requirements (Externalization, Argument Substitution, Text Expansion, Word Order, Collation, etc.)
- Rendering, Fonts, Writing Directions (Bidirectional, Vertical)
- Input methods
Writing Systems Around the World
- A survey of languages and writing systems, including ideographic, bidirectional, and complex scripts (e.g., Chinese, Japanese, Korean, Thai, Indic, Hebrew, Arabic, and others)
Models of Character Encoding
- Character Sets and Character Encodings: What are they? What problems do they create?
- Unicode and its Repertoire
- Character-Glyph Model
- Combining Characters
- Unicode Encoding Model and its encodings: Scalar Values, CEF, CES, UTF-8, UTF-16, Surrogates, UTF-32, BOM, etc.
- Character properties (alphabetic, numeric, direction, case, etc.)
Design Decisions
- Choosing the right UTF-n
- Migration to Unicode: programming changes for Unicode-enabling
- Transcoding: converting legacy encodings to Unicode
- Typical problems with encoding conversions
- Characters that look alike: how to choose the right character
Unicode Algorithms - Part I
- Bidirectional Algorithm
- Line-Breaking
- Regular Expressions and Unicode
Unicode Algorithms - Part II
- UCA: Unicode Collation Algorithm
- Tailoring collations
- Canonical Forms and Normalization
- When is normalization required or important?
- Choosing a normalization form
- Private Use Area, Gaiji Characters
- Unicode compression
- Comparing compression approaches
- Working in small spaces: Efficient storage for Unicode tables
Migration Techniques
- Migration tools
- Estimating Unicode migration projects
- Unicode footprint requirements (disk, memory, etc.)
- Unicode and Databases (data types, field widths, indexes, queries, collation, database drivers, etc.)
- Multilingual text processing and issues
Unicode on the Wire
- Protocols and Standards on the Internet and the Web (e-mail, URLs, etc.): HTTP, IRI, IDN, Mail (MIME)
- HTML, XML, XHTML
- Encoding declarations and encoding negotiation
- Unicode versus Markup
- Reference Processing model
Unicode in Programming Languages
- Identifiers
- Parsers
- SQL
- Java
- C/C++
- C#
- Perl
- Debugging tips and tools
Localization with Unicode
- Tools, Globalization Management Systems (GMS), translation memory supporting Unicode
Unicode and Real-World Issues
- Surrogates on Windows
- GB18030
- Oracle, SQL Server
- Security