Vincent Wei · ed462573
--- a/Showing-Text-in-Complex-or-Mixed-Scripts.md
+++ b/Showing-Text-in-Complex-or-Mixed-Scripts.md
-_How to layout, shape, and render text in complex or mixed scripts._
+_How to lay out, shape, and render text in complex or mixed scripts._
+
+Table of Contents
+
+- [Overview](#overview)
+- [General Process](#general-process)
+- [Internals](#internals)
+- [Example](#example)

 ## Overview

-## Key Points
+As described in [Using Enhanced Font Interfaces](Using-Enhanced-Font-Interfaces),
+in order to support complex or mixed scripts, we tuned and enhanced
+MiniGUI's font interfaces in version 4.0.0.
+
+But this is only a small part of the whole story. To show a text paragraph in
+complex/mixed scripts, we must implement a complete set of APIs that conform
+to the Unicode standards and specifications, including the Unicode
+Bidirectional Algorithm (UAX #9), Unicode Line Breaking Algorithm
+(UAX #14), Unicode Normalization Forms (UAX #15), Unicode Script Property
+(UAX #24), Unicode Text Segmentation (UAX #29), Unicode Vertical
+Text Layout (UAX #50), and so on.
+
+On the basis of full support for Unicode, we provide APIs for laying out,
+shaping, and rendering the text in complex or mixed scripts.
+
+This document describes how to use these APIs to lay out, shape, and render
+text in complex or mixed scripts.
+
+Before we continue, let's clarify a few terms and concepts first.
+
+A `script` means a set of letters of one human language, for example, Latin,
+Han, Arabic, Hangul, and so on.
+
+The `writing system` means the specific writing convention of a
+language/script. For example, we write letters in Latin from left to right
+horizontally then from top to bottom vertically, while we write traditional
+Chinese words from top to bottom vertically then from right to left
+horizontally.
+
+The scripts used by common languages, such as English and modern Chinese,
+are `standard scripts`. In these scripts, there is a one-to-one relationship
+between an encoded character (e.g. U+4E2D in Unicode) and the glyph (中)
+that represents it. We always write the characters in standard scripts
+from left to right horizontally then from top to bottom vertically.
+
+A `glyph` is a set of data which represents a specific character in a
+visual or printable form. On a computer, a glyph may be a bitmap or
+vector path data.
+
+Generally, one `font` contains a lot of glyphs in bitmap or vector path data
+for the characters in a specific language or script or a few similar
+languages or scripts. For example, today, most fonts for East Asian markets
+contain almost all glyphs for Chinese (both traditional and simplified),
+Japanese, and Korean characters.
+
+Systems and applications that handle the standard scripts do not need to
+make a distinction between character processing and glyph processing.
+In working with text in a standard script, it is most often convenient to
+think only in terms of character processing, or simply text processing:
+that is, the sequential rendering of glyphs representing character codes
+as input in logical order.
+
+The term `complex script` refers to any writing system that requires
+some degree of character reordering and/or glyph processing to display,
+print or edit. In other words, these are scripts for which Unicode logical
+order and nominal glyph rendering of codepoints do not result in acceptable text.
+Such scripts, examples of which are Arabic and the numerous Indic scripts
+descended from the Brahmi writing system, are generally identifiable by
+their morphographic characteristics: the changing of the shape or position
+of glyphs as determined by their relationship to each other. It should be
+noted that such processing is not optional, but is essential to correctly
+rendering text in these scripts. Additional glyph processing to render
+appropriately sophisticated typography may be desirable beyond the minimum
+required to make the text readable.
+
+The situation becomes more complicated when we process a text in
+`mixed scripts`. That is, the text contains characters from different
+scripts, for example, Latin, Arabic, and Chinese.
+
+## General Process
+
+To lay out, shape, and render a text in mixed scripts, you should first call
+the `GetUCharsUntilParagraphBoundary` function to convert
+a multi-byte string to a Unicode string under the specified white space
+rule, breaking rule, and transformation rule. For example, you can convert a
+general C string in UTF-8 or GB18030 to a Uchar32 string by calling this
+function. You can call the `CreateLogFontForMChar2UChar` function to create
+a dummy logfont object for this purpose, so as to use as little memory
+as possible.
+
+If the text is in simple scripts, like Latin or Chinese, you can call the
+`GetGlyphsExtentPointEx` function to lay out the paragraph. This function
+returns a glyph string which can fit in a line with the specified
+maximal extent and rendering flags. After this, you call the
+`DrawGlyphStringEx` function to draw the glyph string at the
+specified position of a DC.
+
+If the text is in complex and/or mixed scripts, like Arabic, Thai,
+and Indic scripts, you should create a TEXTRUNS object first by calling
+the `CreateTextRuns` function, then initialize the shaping engine for
+laying out the text.
+
+MiniGUI provides two types of shaping engines. One is the basic
+shaping engine. The corresponding function is `InitBasicShapingEngine`.
+The other is the complex shaping engine, which is based on HarfBuzz.
+The corresponding function is `InitComplexShapingEngine`. The latter
+can give you a better shaping result.
+
+After this, you should call `CreateLayout` to create a layout object
+for laying out the text, then call `LayoutNextLine` to lay out the lines
+one by one.
+
+You can render the laid out lines by calling `DrawLayoutLine` function.
+
+Finally, you call `DestroyLayout` and `DestroyTextRuns` to destroy
+the layout object and text runs object.
+
+Before rendering the glyphs laid out, you can also call `GetLayoutLineRect`
+to get the line rectangle, or call `CalcLayoutBoundingRect` to get
+the bounding rectangle of one paragraph.
+
+## Internals
+
+These new APIs provide a very flexible implementation for your apps
+to process the complex scripts. The implementation is derived from
+LGPL'd Pango, but we optimize and simplify the original implementation
+in the following respects:
+
+* We split the layout process into two stages. We get the text runs
+  (Pango items) in the first stage, and the text runs are kept
+  constant across subsequent layouts. In the second stage,
+  we create a layout object for a set of specific layout parameters,
+  and generate the lines one by one for the caller. This is useful
+  for an app like a browser: it can reuse the text runs if the output
+  width or height changes, and there is no need to regenerate the text
+  runs when the size of the output rectangle changes.
+
+* We use MiniGUI's fontname for the font attributes of text, and leave
+  the font selection and the glyph generating to MiniGUI's logfont
+  module. In this way, we simplify the layout process greatly.
+
+* We always use Uchar32 strings for the whole layout process. So the
+  code and the structures are clearer than in the original implementation.
+
+* We provide two shaping engines for rendering the text. One is a
+  basic shaping engine and the other is the complex shaping engine based
+  on HarfBuzz. The former can be used for some simple applications.
+
+## Example

-TBC...
\ No newline at end of file