This chapter provides information about the Text Manager API declared in TextMgr.h
by discussing these topics:
Text Manager Structures and Types
Text Manager Constants
Text Manager Functions and Macros
For more information on the Text Manager, see the chapter "Text".
Text Manager Structures and Types
CharEncodingType Typedef
Purpose
Specifies possible character encodings.
Declared In
TextMgr.h
Prototype
typedef uint16_t CharEncodingType
Comments
A given device supports a single character encoding. Palm OS® Cobalt devices support either the Palm OS version of Windows code page 1252 (an extension of ISO Latin 1) or the Palm OS version of Windows code page 932 (an extension of Shift JIS). In addition, Palm OS licensees and some third-party developers provide support for additional character encodings including Big-5, Hebrew, Arabic, Thai, Korean, and Cyrillic.
The character encoding constants generally follow the format:
charEncodingName
where Name is the name of the character encoding.
The following table shows examples of the character encoding constants. For a complete list, see the TextMgr.h
file.
TxtConvertStateType Struct
Purpose
Maintains state across calls to TxtConvertEncoding(). It is essentially opaque; simply declare a structure of this type and pass a pointer to your structure when making multiple calls to TxtConvertEncoding() for a single source text buffer.
Declared In
TextMgr.h
Prototype
typedef struct {
uint8_t ioSrcState[kTxtConvertStateSize];
uint8_t ioDstState[kTxtConvertStateSize];
} TxtConvertStateType
Comments
kTxtConvertStateSize is simply a constant that determines the size of the source and destination state buffers.
Text Manager Constants
Byte Attribute Flags
Purpose
Flags that identify the possible locations of a given byte within a multi-byte character.
Declared In
TextMgr.h
Constants
-
#define byteAttrFirst 0x80
- First byte of multi-byte character.
-
#define byteAttrHighLow (byteAttrFirst | byteAttrLast)
- Either the first byte of a multi-byte character or the last byte of a multi-byte character.
-
#define byteAttrLast 0x40
- Last byte of multi-byte character.
-
#define byteAttrMiddle 0x20
- Middle byte of multi-byte character.
-
#define byteAttrSingle 0x01
- Single-byte character.
-
#define byteAttrSingleLow (byteAttrSingle | byteAttrLast)
- Either a single-byte character or the low-order byte of a multi-byte character.
Comments
If a byte is valid in more than one location of a character, multiple return bits are set. For example, 0x40 in the Shift JIS character encoding is valid as a single-byte character and as the low-order byte of a double-byte character. Thus, the return value for TxtByteAttr(0x40) on a Shift JIS system has both the byteAttrSingle and byteAttrLast bits set.
Every byte in a stream of double-byte data must be either a single byte, a high byte, a single/low byte (byteAttrSingleLow), or a high/low byte (byteAttrHighLow).
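The bit tests described in these comments can be sketched as follows. The byteAttr constants are copied from the list above; demo_sjis_byte_attr() is a hypothetical, simplified stand-in for TxtByteAttr() built from the commonly documented Shift JIS byte ranges, not the Text Manager's actual table.

```c
#include <stdint.h>

#define byteAttrFirst     0x80
#define byteAttrLast      0x40
#define byteAttrMiddle    0x20
#define byteAttrSingle    0x01
#define byteAttrHighLow   (byteAttrFirst | byteAttrLast)
#define byteAttrSingleLow (byteAttrSingle | byteAttrLast)

/* Hypothetical stand-in for TxtByteAttr() on a Shift JIS system. */
uint8_t demo_sjis_byte_attr(uint8_t b)
{
    uint8_t attr = 0;

    /* Single-byte ranges: ASCII and half-width katakana. */
    if (b <= 0x7F || (b >= 0xA1 && b <= 0xDF))
        attr |= byteAttrSingle;
    /* High (first) bytes of double-byte characters. */
    if ((b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC))
        attr |= byteAttrFirst;
    /* Low (last) bytes of double-byte characters. */
    if ((b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC))
        attr |= byteAttrLast;
    return attr;
}

/* Nonzero if the byte can only be a complete single-byte character. */
int demo_is_unambiguous_single(uint8_t b)
{
    return demo_sjis_byte_attr(b) == byteAttrSingle;
}
```

With these assumed ranges, 0x40 yields byteAttrSingleLow and 0x81 yields byteAttrHighLow, which is why a lone byte is often not enough to locate a character boundary.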
See Also
Character Attributes
Purpose
Flags that identify various character attributes.
Declared In
TextMgr.h
Constants
-
#define charAttrAlNum (charAttr_DI | charAttr_LO | charAttr_UP | charAttr_XA)
- Alphanumeric characters
-
#define charAttrAlpha (charAttr_LO | charAttr_UP | charAttr_XA)
- Alphabetic characters
-
#define charAttrCntrl (charAttr_BB | charAttr_CN)
- Control characters
-
#define charAttrDelim (charAttr_SP | charAttr_PU)
- Delimiters
-
#define charAttrGraph (charAttr_DI | charAttr_LO | charAttr_PU | charAttr_UP | charAttr_XA)
- Printable, non-space characters
-
#define charAttrPrint (charAttr_DI | charAttr_LO | charAttr_PU | charAttr_SP | charAttr_UP | charAttr_XA)
- Printable characters
-
#define charAttrSpace (charAttr_CN | charAttr_SP | charAttr_XS)
- Whitespace characters
-
#define charAttr_BB 0x00000080
- BEL, BS, etc.
-
#define charAttr_CN 0x00000040
- CR, FF, HT, NL, VT
-
#define charAttr_DI 0x00000020
- '0'-'9'
-
#define charAttr_DO 0x00000400
- Characters that appear on the display but never in user data, such as the ellipsis character
-
#define charAttr_LO 0x00000010
- 'a'-'z' and lowercase extended characters
-
#define charAttr_PU 0x00000008
- Punctuation
-
#define charAttr_SP 0x00000004
- Space
-
#define charAttr_UP 0x00000002
- 'A'-'Z' and uppercase extended characters
-
#define charAttr_XA 0x00000200
- Extra alphabetic
-
#define charAttr_XD 0x00000001
- '0'-'9', 'A'-'F', 'a'-'f'
-
#define charAttr_XS 0x00000100
- Extra space
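As a rough illustration of how the composite flags are meant to be used, the sketch below tests a character's attribute bits against a mask. The constants are copied from the list above; demo_ascii_char_attr() is a hypothetical stand-in for TxtCharAttr() that covers 7-bit ASCII only, and the demo_is_*() helpers are assumptions about how a classification test could be written, not the actual macro definitions.

```c
#include <stdint.h>

#define charAttr_XD 0x00000001
#define charAttr_UP 0x00000002
#define charAttr_SP 0x00000004
#define charAttr_PU 0x00000008
#define charAttr_LO 0x00000010
#define charAttr_DI 0x00000020
#define charAttr_CN 0x00000040
#define charAttr_BB 0x00000080
#define charAttr_XS 0x00000100
#define charAttr_XA 0x00000200

#define charAttrAlNum (charAttr_DI | charAttr_LO | charAttr_UP | charAttr_XA)
#define charAttrGraph (charAttr_DI | charAttr_LO | charAttr_PU | charAttr_UP | charAttr_XA)
#define charAttrPrint (charAttr_DI | charAttr_LO | charAttr_PU | charAttr_SP | charAttr_UP | charAttr_XA)
#define charAttrSpace (charAttr_CN | charAttr_SP | charAttr_XS)

/* Hypothetical stand-in for TxtCharAttr(), covering 7-bit ASCII only. */
uint32_t demo_ascii_char_attr(uint32_t ch)
{
    if (ch >= '0' && ch <= '9')   return charAttr_DI | charAttr_XD;
    if (ch >= 'A' && ch <= 'F')   return charAttr_UP | charAttr_XD;
    if (ch >= 'G' && ch <= 'Z')   return charAttr_UP;
    if (ch >= 'a' && ch <= 'f')   return charAttr_LO | charAttr_XD;
    if (ch >= 'g' && ch <= 'z')   return charAttr_LO;
    if (ch == ' ')                return charAttr_SP;
    if (ch >= '\t' && ch <= '\r') return charAttr_CN;  /* HT, NL, VT, FF, CR */
    if (ch < 0x20 || ch == 0x7F)  return charAttr_BB;  /* other control codes */
    if (ch < 0x7F)                return charAttr_PU;  /* remaining visible ASCII */
    return 0;
}

/* Each classification is a mask test against one composite flag. */
int demo_is_alnum(uint32_t ch) { return (demo_ascii_char_attr(ch) & charAttrAlNum) != 0; }
int demo_is_graph(uint32_t ch) { return (demo_ascii_char_attr(ch) & charAttrGraph) != 0; }
int demo_is_print(uint32_t ch) { return (demo_ascii_char_attr(ch) & charAttrPrint) != 0; }
int demo_is_space(uint32_t ch) { return (demo_ascii_char_attr(ch) & charAttrSpace) != 0; }
```

Note that charAttrPrint differs from charAttrGraph only by charAttr_SP, so a blank space is printable but not graphic.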
Character Encoding Attributes
Purpose
Constants used to interpret the return value of TxtGetEncodingFlags().
Declared In
TextMgr.h
Constants
-
#define charEncodingOnlySingleByte 0x00000001
- The character encoding consists only of single-byte characters.
-
#define charEncodingHasDoubleByte 0x00000002
- The character encoding contains one or more double-byte characters.
-
#define charEncodingHasLigatures 0x00000004
- The character encoding has ligatures.
-
#define charEncodingRightToLeft 0x00000008
- The character encoding supports a writing system that primarily renders text right-to-left.
Encoding Conversion Constant Modifiers
Purpose
Constants to OR with the destination character encoding (CharEncodingType) passed to TxtConvertEncoding().
Declared In
TextMgr.h
Constants
-
#define charEncodingDstBestFitFlag 0x8000
- Causes TxtConvertEncoding() to make an extra effort to convert characters in the source encoding to similar (if not equal) characters in the destination encoding.
Comments
As an example, when converting from charEncodingUCS2 to charEncodingPalmSJIS, no mapping exists for U+00A1 (INVERTED EXCLAMATION MARK) because this character doesn't exist in charEncodingPalmSJIS. In this case, TxtConvertEncoding() returns txtErrNoCharMapping. If you OR the charEncodingDstBestFitFlag with the destination character encoding, however, TxtConvertEncoding() converts the character to chrExclamationMark (which is close). Generally, the operating system tries to support as many code page 1252 characters as possible in the "best fit" table.
If charEncodingDstBestFitFlag is set and either the source or destination encoding is unknown, TxtConvertEncoding() copies anything that is 7-bit ASCII from the source to the destination. It then returns txtErrUnknownEncodingFallbackCopy. The rules for unknown characters apply during this 7-bit copy; if an inconvertible character is encountered, the substitution string (if one has been specified) is used in its place, and txtErrNoCharMapping is returned instead.
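A minimal sketch of combining the modifier with a destination encoding follows. The flag value is taken from the constant list above; demoEncodingPalmSJIS is a placeholder value invented for illustration (the real charEncodingPalmSJIS value is defined in TextMgr.h), and both helper functions are hypothetical.

```c
#include <stdint.h>

typedef uint16_t CharEncodingType;

#define charEncodingDstBestFitFlag 0x8000

/* Placeholder for illustration only; not the real charEncodingPalmSJIS value. */
#define demoEncodingPalmSJIS ((CharEncodingType)5)

/* Request best-fit mappings for a destination encoding. */
CharEncodingType demo_with_best_fit(CharEncodingType dst)
{
    return (CharEncodingType)(dst | charEncodingDstBestFitFlag);
}

/* Nonzero if the best-fit modifier is set on a destination encoding. */
int demo_has_best_fit(CharEncodingType dst)
{
    return (dst & charEncodingDstBestFitFlag) != 0;
}
```

The value returned by demo_with_best_fit() is what you would pass as the destination encoding; the source encoding is passed unmodified.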
Encoding Conversion Substitution Constants
Purpose
Values used to substitute in TxtConvertEncoding().
Declared In
TextMgr.h
Constants
-
#define textSubstitutionDefaultLen 1
- The length in bytes of textSubstitutionDefaultStr.
-
#define textSubstitutionDefaultStr "?"
- Can be passed to TxtConvertEncoding() as the substitution string parameter. The substitution string contains a character that is used in the destination string if a character from the source string is not recognized in the destination encoding.
-
#define textSubstitutionEncoding charEncodingUTF8
- The encoding used for the substitution string parameter of TxtConvertEncoding(). The string you pass for the substitution string parameter is always assumed to be in this encoding.
Size Constants
Purpose
Constants that specify sizes of items used in the Text Manager.
Declared In
TextMgr.h
Constants
-
#define kTxtConvertStateSize 32
- Used in the TxtConvertStateType structure to specify the size of the source and destination state buffers.
-
#define maxCharBytes 4
- Maximum size a single wchar32_t character will occupy in a text string.
-
#define maxEncodingNameLength 40
- Maximum length in bytes of any character encoding name.
Text Manager Error Constants
Purpose
Declared In
TextMgr.h
Constants
-
#define txtErrConvertOverflow (txtErrorClass | 4)
- The destination buffer is not large enough to contain the converted text.
-
#define txtErrConvertUnderflow (txtErrorClass | 5)
- The end of the source buffer contains a partial character.
-
#define txtErrMalformedText (txtErrorClass | 9)
- An error in the source text encoding has been discovered.
-
#define txtErrNoCharMapping (txtErrorClass | 7)
- The device does not contain a mapping between the source and destination encodings for at least one of the characters in the source string.
-
#define txtErrTranslitOverflow (txtErrorClass | 3)
- The destination buffer is not large enough to contain the converted string.
-
#define txtErrTranslitOverrun (txtErrorClass | 2)
- The source and destination buffers point to the same memory location and performing the requested operation would cause the function to overwrite unprocessed data in the input buffer.
-
#define txtErrTranslitUnderflow (txtErrorClass | 8)
- The end of the source buffer contains a partial character.
-
#define txtErrUknownTranslitOp (txtErrorClass | 1)
- The transliteration operation constant value is not recognized.
-
#define txtErrUnknownEncoding (txtErrorClass | 6)
- One of the specified encodings is unknown or can't be handled.
-
#define txtErrUnknownEncodingFallbackCopy (txtErrorClass | 10)
- Either the source or destination encoding is unknown, and the best fit flag was set in the destination encoding.
Text Manager Feature Settings
Purpose
Text Manager settings that can be obtained or set in the sysFtrNumTextMgrFlags feature.
Declared In
TextMgr.h
Constants
-
#define textMgrBestFitFlag 0x00000004
- The TxtConvertEncoding() function can use the charEncodingDstBestFitFlag. See "Encoding Conversion Constant Modifiers" for more information. This flag is always set in Palm OS Cobalt.
-
#define textMgrExistsFlag 0x00000001
- The Text Manager is installed on the device. This flag is always set in Palm OS Cobalt.
-
#define textMgrStrictFlag 0x00000002
- No longer used.
TranslitOpType Typedef
Purpose
Specifies the transliteration operation to be performed by a given call to TxtTransliterate(). Each character encoding contains its own set of special transliteration operations, the values for which begin at translitOpCustomBase.
Declared In
TextMgr.h
Prototype
typedef uint16_t TranslitOpType
Constants
-
#define translitOpStandardBase 0
- Base value at which character-encoding-independent transliterations are defined.
-
#define translitOpUpperCase 0
- Convert all characters to uppercase.
-
#define translitOpLowerCase 1
- Convert all characters to lowercase.
-
#define translitOpReserved2 2
- Reserved for future use.
-
#define translitOpReserved3 3
- Reserved for future use.
-
#define translitOpPreprocess 0x8000
- OR this value with another transliteration flag to have the TxtTransliterate() function return the space requirements for the result.
-
#define translitOpCustomBase 1000
- Base value at which character-encoding-specific transliteration constants begin.
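The two-pass pattern that translitOpPreprocess enables (ask for the size first, then convert) can be sketched as below. The constants are copied from the list above. demo_transliterate() is a hypothetical ASCII-only stand-in; its signature and return codes are assumptions made for this example and are not the TxtTransliterate() prototype.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t TranslitOpType;

#define translitOpUpperCase  0
#define translitOpLowerCase  1
#define translitOpPreprocess 0x8000

/* Hypothetical stand-in, ASCII only. Returns 0 on success, -1 for an
   unknown operation, -2 if the destination is too small. On success,
   *ioDstLen holds the number of bytes the result needs. */
int demo_transliterate(const char *src, size_t srcLen,
                       char *dst, size_t *ioDstLen, TranslitOpType op)
{
    int sizeOnly = (op & translitOpPreprocess) != 0;
    TranslitOpType baseOp = (TranslitOpType)(op & ~translitOpPreprocess);
    size_t i;

    assert(src != NULL && ioDstLen != NULL);
    if (baseOp != translitOpUpperCase && baseOp != translitOpLowerCase)
        return -1;
    if (sizeOnly) {              /* report space requirements, write nothing */
        *ioDstLen = srcLen;
        return 0;
    }
    if (dst == NULL || *ioDstLen < srcLen)
        return -2;
    for (i = 0; i < srcLen; i++) {
        unsigned char c = (unsigned char)src[i];
        dst[i] = (char)(baseOp == translitOpUpperCase ? toupper(c) : tolower(c));
    }
    *ioDstLen = srcLen;
    return 0;
}
```

A caller would first pass translitOpUpperCase | translitOpPreprocess to learn the required size, allocate a buffer of that size, and then repeat the call with translitOpUpperCase alone.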
Text Manager Functions and Macros
CHAR_ENCODING_VALUE Macro
Purpose
Macro used to set the values of the character encoding constants.
Declared In
TextMgr.h
Prototype
#define CHAR_ENCODING_VALUE (
value
)
Parameters
Returns
A CharEncodingType value.
Comments
Applications do not need to use this macro.
sizeOf7BitChar Macro
Purpose
Returns the true size of a low-ASCII character.
Declared In
TextMgr.h
Prototype
#define sizeOf7BitChar (
c
)
Parameters
Returns
Comments
In C, a character constant has type int, so checking its size returns the size of an integer rather than 1. For example, sizeof('a') returns sizeof(int) (2 when int is 16 bits wide). Because of this, it's safest to use the sizeOf7BitChar() macro to document buffer size and string length calculations. Note that this macro can only be used with low-ASCII characters, as anything else might be the high byte of a double-byte character.
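The sketch below shows the kind of buffer-size calculation this macro is meant to document. The macro definition shown is an assumption made so the example is self-contained (a 7-bit ASCII character always occupies one byte in a string); see TextMgr.h for the actual definition. demo_label_size() is a hypothetical helper.

```c
#include <stddef.h>

/* Assumed definition for illustration; see TextMgr.h for the real macro. */
#define sizeOf7BitChar(c) 1

/* Bytes needed for "<name>: <value>" plus the terminating null. */
size_t demo_label_size(size_t nameLen, size_t valueLen)
{
    return nameLen
         + sizeOf7BitChar(':') + sizeOf7BitChar(' ')
         + valueLen
         + sizeOf7BitChar('\0');
}
```

Writing sizeof(':') in the same expression would add sizeof(int) bytes per separator, which over-allocates and obscures the intent.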
TxtByteAttr Function
Purpose
Returns the possible locations of a given byte within a multi-byte character.
Declared In
TextMgr.h
Prototype
uint8_t TxtByteAttr (
uint8_t iByte
)
Parameters
Returns
A byte with one or more of the Byte Attribute Flags set.
Comments
Text Manager functions that need to determine the byte positioning of a character use TxtByteAttr() to do so. You rarely need to use this function yourself.
TxtCaselessCompare Function
Purpose
Performs a case-insensitive comparison of two text buffers.
Declared In
TextMgr.h
Prototype
int16_t TxtCaselessCompare (
const char *s1,
size_t s1Len,
size_t *s1MatchLen,
const char *s2,
size_t s2Len,
size_t *s2MatchLen
)
Parameters
-
→ s1
- The first text buffer to compare.
-
→ s1Len
- The length in bytes of the text pointed to by s1.
-
← s1MatchLen
- Points to the offset of the first character in s1 that determines the sort order. Pass NULL for this parameter if you don't need to know this number.
-
→ s2
- The second text buffer to compare.
-
→ s2Len
- The length in bytes of the text pointed to by s2.
-
← s2MatchLen
- Points to the offset of the first character in s2 that determines the sort order. Pass NULL for this parameter if you don't need to know this number.
Returns
Comments
In certain character encodings (such as Shift JIS), one character may be accurately represented as either a single-byte character or a multi-byte character. TxtCaselessCompare() accurately matches a single-byte character with its multi-byte equivalent. For this reason, the values returned in s1MatchLen and s2MatchLen are not always equal.
You must make sure that the parameters s1 and s2 point to the start of a valid character. That is, they must point to the first byte of a multi-byte character or they must point to a single-byte character; if they don't, results are unpredictable.
See Also
StrCaselessCompare(), TxtCompare(), StrCompare()
TxtCharAttr Function
Purpose
Returns a character's attributes.
Declared In
TextMgr.h
Prototype
uint32_t TxtCharAttr (
wchar32_t iChar
)
Parameters
Returns
An integer with any of the Character Attributes bits set.
Comments
The character passed to this function must be a valid character given the system encoding.
This function is used in the Text Manager's character attribute macros (TxtCharIsAlNum(), TxtCharIsCntrl(), and so on). The macros perform operations analogous to the standard C functions ispunct(), isprint(), and so on. Usually, you'd use one of these macros instead of calling TxtCharAttr() directly.
To obtain attributes specific to a given character encoding, use TxtCharXAttr().
See Also
TxtCharBounds Function
Purpose
Returns the boundaries of a character containing the byte at a specified offset in a string.
Declared In
TextMgr.h
Prototype
wchar32_t TxtCharBounds (
const char *iTextP,
size_t iOffset,
size_t *oCharStart,
size_t *oCharEnd
)
Parameters
-
→ iTextP
- The text buffer to search.
-
→ iOffset
- A valid offset into the buffer iTextP. This location may contain a byte in any position (start, middle, or end) of a multi-byte character.
-
← oCharStart
- Points to the starting offset of the character containing the byte at iOffset.
-
← oCharEnd
- Points to the ending offset of the character containing the byte at iOffset.
Returns
The character located between the offsets oCharStart and oCharEnd.
Comments
Use this function to determine the boundaries of a character in a string or text buffer.
TxtCharBounds() is often slow and should be used only where needed. If the byte at iOffset is valid in more than one location of a character, the function must search back toward the beginning of the text buffer until it finds an unambiguous byte to determine the appropriate boundaries.
You must make sure that the parameter iTextP points to the beginning of the string. That is, if the string begins with a multi-byte character, iTextP must point to the first byte of that character; if it doesn't, results are unpredictable.
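The backward search described above can be sketched for double-byte Shift JIS text as follows. This is one possible approach, not the Text Manager's implementation: demo_is_lead() uses assumed lead-byte ranges, demo_char_bounds() is a hypothetical stand-in for TxtCharBounds() that does not return the character, and the ending offset is assumed to be the offset just past the character.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified Shift JIS lead-byte test (assumed ranges 0x81-0x9F, 0xE0-0xFC). */
int demo_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* text must begin on a character boundary and contain valid text;
   offset must lie inside the text. */
void demo_char_bounds(const char *text, size_t offset,
                      size_t *charStart, size_t *charEnd)
{
    const unsigned char *p = (const unsigned char *)text;
    size_t i = offset;

    assert(text != NULL && charStart != NULL && charEnd != NULL);

    /* Back up past every byte that could be a lead byte. The byte before
       the stopping point cannot start a character, so it ends one, and
       i is therefore a known character boundary. */
    while (i > 0 && demo_is_lead(p[i - 1]))
        i--;

    /* Walk forward one character at a time until one covers offset. */
    for (;;) {
        size_t len = demo_is_lead(p[i]) ? 2 : 1;
        if (i + len > offset) {
            *charStart = i;
            *charEnd = i + len;
            return;
        }
        i += len;
    }
}
```

The cost of the backward scan grows with the length of the run of ambiguous bytes before the offset, which is one way to see why the function can be slow.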
TxtCharEncoding Function
Purpose
Returns the minimum encoding required to represent a character.
Declared In
TextMgr.h
Prototype
CharEncodingType TxtCharEncoding (
wchar32_t iChar
)
Parameters
Returns
A CharEncodingType value that indicates the minimum encoding required to represent iChar. If the character isn't recognizable, charEncodingUnknown is returned.
Comments
The minimum encoding is the encoding that represents the fewest characters while still containing the character specified in iChar. For example, if the character is a blank or a tab character, the minimum encoding is charEncodingAscii because these characters can be represented in single-byte ASCII. If the character is a ü, the minimum encoding is charEncodingISO8859_1.
This function is used by TxtStrEncoding(), which is the function that most applications should use to determine the character encoding for tagging text (for instance, for email).
Use TxtMaxEncoding() to determine the order of encodings.
Palm OS only supports a single character encoding at a time. Because of this, the result of TxtCharEncoding() is always logically equal to or less than the encoding used on the current system. That is, you'll only receive a return value of charEncodingISO8859_1 if you're running on a US or European system and you pass a non-ASCII character.
See Also
TxtStrEncoding(), TxtMaxEncoding()
TxtCharIsAlNum Macro
Purpose
Indicates if the character is alphanumeric.
Declared In
TextMgr.h
Prototype
#define TxtCharIsAlNum (
ch
)
Parameters
Returns
true if the character is a letter in an alphabet or a numeric digit, false otherwise.
See Also
TxtCharIsDigit(), TxtCharIsAlpha()
TxtCharIsAlpha Macro
Purpose
Indicates if a character is a letter in an alphabet.
Declared In
TextMgr.h
Prototype
#define TxtCharIsAlpha (
ch
)
Parameters
Returns
true if the character is a letter in an alphabet, false otherwise.
See Also
TxtCharIsAlNum(), TxtCharIsLower(), TxtCharIsUpper()
TxtCharIsCntrl Macro
Purpose
Indicates if a character is a control character.
Declared In
TextMgr.h
Prototype
#define TxtCharIsCntrl (
ch
)
Parameters
Returns
true if the character is a non-printable character, such as the bell character or a carriage return; false otherwise.
TxtCharIsDelim Macro
Purpose
Indicates if a character is a delimiter.
Declared In
TextMgr.h
Prototype
#define TxtCharIsDelim (
ch
)
Parameters
Returns
true if the character is a word delimiter (whitespace or punctuation), false otherwise.
TxtCharIsDigit Macro
Purpose
Indicates if the character is a decimal digit.
Declared In
TextMgr.h
Prototype
#define TxtCharIsDigit (
ch
)
Parameters
Returns
true if the character is 0 through 9, false otherwise.
See Also
TxtCharIsAlNum(), TxtCharIsHex()
TxtCharIsGraph Macro
Purpose
Indicates if a character is a graphic character.
Declared In
TextMgr.h
Prototype
#define TxtCharIsGraph (
ch
)
Parameters
Returns
true if the character is a graphic character, false otherwise.
Comments
A graphic character is any character visible on the screen, in other words, letters, digits, and punctuation marks. A blank space is not a graphic character because it is not visible.
This macro differs from TxtCharIsPrint() in that it returns false if the character is whitespace. TxtCharIsPrint() returns true if the character is whitespace.
TxtCharIsHardKey Macro
Purpose
Returns true if the character is one of the hard keys on the device.
Declared In
TextMgr.h
Prototype
#define TxtCharIsHardKey (
m,
c
)
Parameters
-
→ m
- The value passed in the modifiers field of the keyDownEvent.
-
→ c
- The character from the keyDownEvent.
Returns
true if the character is one of the built-in hard keys on the device, false otherwise.
TxtCharIsHex Macro
Purpose
Indicates if a character is a hexadecimal digit.
Declared In
TextMgr.h
Prototype
#define TxtCharIsHex (
ch
)
Parameters
Returns
true if the character is a hexadecimal digit from 0 to F, false otherwise.
See Also
TxtCharIsLower Macro
Purpose
Indicates if a character is a lowercase letter.
Declared In
TextMgr.h
Prototype
#define TxtCharIsLower (
ch
)
Parameters
Returns
true if the character is a lowercase letter, false otherwise.
See Also
TxtCharIsAlpha(), TxtCharIsUpper()
TxtCharIsPrint Macro
Purpose
Indicates if a character is printable.
Declared In
TextMgr.h
Prototype
#define TxtCharIsPrint (
ch
)
Parameters
Returns
true if the character is not a control character, false otherwise.
Comments
This macro differs from TxtCharIsGraph() in that it returns true if the character is whitespace. TxtCharIsGraph() returns false if the character is whitespace.
If you are using a debug ROM and you pass a virtual character to this macro, a fatal alert is generated.
See Also
TxtCharIsPunct Macro
Purpose
Indicates if a character is a punctuation mark.
Declared In
TextMgr.h
Prototype
#define TxtCharIsPunct (
ch
)
Parameters
Returns
true if the character is a punctuation mark, false otherwise.
TxtCharIsSpace Macro
Purpose
Indicates if a character is a whitespace character.
Declared In
TextMgr.h
Prototype
#define TxtCharIsSpace (
ch
)
Parameters
Returns
true if the character is whitespace such as a blank space, tab, or newline; false otherwise.
TxtCharIsUpper Macro
Purpose
Indicates if a character is an uppercase letter.
Declared In
TextMgr.h
Prototype
#define TxtCharIsUpper (
ch
)
Parameters
Returns
true if the character is an uppercase letter, false otherwise.
See Also
TxtCharIsAlpha(), TxtCharIsLower()
TxtCharIsValid Function
Purpose
Determines whether a character is valid given the Palm OS character encoding.
Declared In
TextMgr.h
Prototype
Boolean TxtCharIsValid (
wchar32_t iChar
)
Parameters
Returns
true if iChar is a valid character; false if iChar is not a valid character.
See Also
TxtCharAttr(), TxtCharIsPrint()
TxtCharIsVirtual Macro
Purpose
Returns whether a character is a virtual character or not.
Declared In
TextMgr.h
Prototype
#define TxtCharIsVirtual (
m,
c
)
Parameters
-
→ m
- The value passed in the modifiers field of the keyDownEvent.
-
→ c
- The character from the keyDownEvent.
Returns
true if the character c is a virtual character, false otherwise.
Comments
Virtual characters are nondisplayable characters that trigger special events in the operating system, such as displaying low battery warnings or displaying the keyboard dialog. Virtual characters should never occur in any data and should never appear on the screen.
TxtCharSize Function
Purpose
Returns the number of bytes required to store the character in a string.
Declared In
TextMgr.h
Prototype
size_t TxtCharSize (
wchar32_t iChar
)
Parameters
Returns
The number of bytes required to store the character in a string.
Comments
Although character variables are always multi-byte wchar32_t values, in some character encodings such as Shift JIS, characters in strings are represented by a mix of one or more bytes per character. If the character can be represented by a single byte (its high-order bytes are 0), it is stored in a string as a single-byte character.
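The single-byte rule above can be illustrated with a small sketch for a Shift JIS system, assuming every character there is stored in either one or two bytes. demo_sjis_char_size() is a hypothetical stand-in for TxtCharSize(), and uint32_t is used in place of wchar32_t so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One byte if the high-order bytes are 0, otherwise two bytes. */
size_t demo_sjis_char_size(uint32_t ch)
{
    return (ch <= 0xFF) ? 1 : 2;
}

/* Bytes needed to store count characters plus a terminating null. */
size_t demo_string_size(const uint32_t *chars, size_t count)
{
    size_t total = 1;   /* terminating null */
    size_t i;

    assert(chars != NULL || count == 0);
    for (i = 0; i < count; i++)
        total += demo_sjis_char_size(chars[i]);
    return total;
}
```

Summing per-character sizes in this way avoids assuming that the character count equals the byte count when sizing a string buffer.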
See Also
TxtCharXAttr Function
Purpose
Returns the extended attribute bits for a character.
Declared In
TextMgr.h
Prototype
uint32_t TxtCharXAttr (
wchar32_t iChar
)
Parameters
Returns
An unsigned 32-bit value with one or more extended attribute bits set. For specific return values, look in the header files that are specific to certain character encodings (CharLatin.h or CharShiftJIS.h).
Comments
To interpret the results, you must know the character encoding being used. The function LmGetSystemLocale() returns the character encoding used on the device as one of the CharEncodingType values. You can pass NULL as the parameter to LmGetSystemLocale() if you don't want to retrieve any other locale information.
See Also
TxtCharAttr(), "Retrieving the Character Encoding"
TxtCompare Function
Purpose
Performs a case-sensitive comparison of all or part of two text buffers.
Declared In
TextMgr.h
Prototype
int16_t TxtCompare (
const char *s1,
size_t s1Len,
size_t *s1MatchLen,
const char *s2,
size_t s2Len,
size_t *s2MatchLen
)
Parameters
-
→ s1
- The first text buffer to compare.
-
→ s1Len
- The length in bytes of the text pointed to by s1.
-
← s1MatchLen
- Points to the offset of the first character in s1 that determines the sort order. Pass NULL for this parameter if you don't need to know this number.
-
→ s2
- The second text buffer to compare.
-
→ s2Len
- The length in bytes of the text pointed to by s2.
-
← s2MatchLen
- Points to the offset of the first character in s2 that determines the sort order. Pass NULL for this parameter if you don't need to know this number.
Returns
Comments
This function performs a case-sensitive comparison. If you want to perform a case-insensitive comparison, use TxtCaselessCompare().
The s1MatchLen and s2MatchLen parameters are not as useful for the TxtCompare() function as they are for the TxtCaselessCompare() function because TxtCompare() implements a multi-pass sort algorithm. For example, if you use TxtCaselessCompare() to compare the string "celery" with the string "Cauliflower," it returns a positive value to indicate that "celery" sorts after "Cauliflower," and it returns a match length of 1 to indicate that the second letter determines the sort order ("e" comes after "a"). However, because TxtCompare() ultimately does a case-sensitive comparison, comparing the string "c" to the string "C" produces a negative result and a match length of 0.
In certain character encodings (such as Shift JIS), one character may be accurately represented as either a single-byte character or a multi-byte character. TxtCompare() accurately matches a single-byte character with its multi-byte equivalent. For this reason, the values returned in s1MatchLen and s2MatchLen are not always equal.
You must make sure that the parameters s1 and s2 point to the start of a valid character. That is, they must point to the first byte of a multi-byte character or they must point to a single-byte character; if they don't, results are unpredictable.
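The difference between a single caseless pass and a multi-pass comparison can be modeled with the toy ASCII sketch below. Both functions are hypothetical and only mimic the two examples given above; they are not the Text Manager's sorting rules. A single matchLen is used because single-byte ASCII offsets are the same in both strings, and lowercase is ordered before uppercase purely so that "c" compared with "C" is negative, as in the text.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Toy caseless comparison. *matchLen receives the offset of the first
   character that determines the order. */
int demo_caseless_compare(const char *s1, size_t len1,
                          const char *s2, size_t len2, size_t *matchLen)
{
    size_t i, n = (len1 < len2) ? len1 : len2;

    assert(matchLen != NULL);
    for (i = 0; i < n; i++) {
        int a = tolower((unsigned char)s1[i]);
        int b = tolower((unsigned char)s2[i]);
        if (a != b) {
            *matchLen = i;
            return a - b;
        }
    }
    *matchLen = n;
    return (len1 > len2) - (len1 < len2);
}

/* Toy multi-pass comparison: the caseless pass decides first, and case
   is consulted only to break ties. */
int demo_compare(const char *s1, size_t len1,
                 const char *s2, size_t len2, size_t *matchLen)
{
    size_t i;
    int r = demo_caseless_compare(s1, len1, s2, len2, matchLen);

    if (r != 0)
        return r;
    for (i = 0; i < len1; i++) {       /* lengths are equal when r == 0 */
        if (s1[i] != s2[i]) {
            *matchLen = i;
            return islower((unsigned char)s1[i]) ? -1 : 1;
        }
    }
    return 0;
}
```

In this model the match length reported by demo_compare() for a tie-break refers to a second pass over the text, which is why it says less about where the strings diverge than the caseless match length does.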
See Also
TxtConvertEncoding Function
Purpose
Converts a text buffer from one character encoding to another.
Declared In
TextMgr.h
Prototype
status_t TxtConvertEncoding (
Boolean newConversion,
TxtConvertStateType *ioStateP,
const char *srcTextP,
size_t *ioSrcBytes,
CharEncodingType srcEncoding,
char *dstTextP,
size_t *ioDstBytes,
CharEncodingType dstEncoding,
const char *substitutionStr,
size_t substitutionLen
)
Parameters
-
→ newConversion
- Set to true if this function call is starting a new conversion, or false if this function call is a continuation of a previous conversion.
-
↔ ioStateP
- If newConversion is false, this parameter must point to a TxtConvertStateType structure containing the same data used for the previous invocation. If newConversion is true and no subsequent calls are planned, this parameter can be NULL.
-
→ srcTextP
- The source text buffer. If newConversion is true, this must point to the start of a text buffer. If newConversion is false, it may point to a location in the middle of a text buffer. In either case, it must point to an inter-character boundary.
-
↔ ioSrcBytes
- A pointer to the size, in bytes, of the text starting at srcTextP that needs to be converted. Upon return, *ioSrcBytes contains the number of bytes successfully processed.
- If srcTextP is null-terminated and you want dstTextP to be null-terminated, include a byte for the null terminator in this size.
-
→ srcEncoding
- The character encoding that the source text uses. See CharEncodingType.
-
↔ dstTextP
- The destination text buffer, which must be large enough to hold the result of converting srcTextP to the specified encoding. You can pass NULL for the dstTextP parameter to determine the required length of the buffer before actually doing the conversion; the required length is returned in ioDstBytes.
- TxtConvertEncoding() does not write the terminating null character to dstTextP unless one is present in srcTextP and ioSrcBytes includes space for it.
-
↔ ioDstBytes
- A pointer to the length, in bytes, of dstTextP. Upon return, *ioDstBytes contains the number of bytes required to represent the source text in the new encoding.
-
→ dstEncoding
- The character encoding to which to convert srcTextP. See CharEncodingType for a description of the possible values. Note that the encoding can be modified, giving you greater control over the conversion process; see "Encoding Conversion Constant Modifiers."
-
→ substitutionStr
- A string to be substituted for any invalid or inconvertible characters that occur in the source text. This string must be valid in the encoding specified by the constant textSubstitutionEncoding. If this parameter is NULL, TxtConvertEncoding() immediately returns if it encounters an invalid character.
- You can pass the constant textSubstitutionDefaultStr for this parameter to have a question mark used as the substitution string.
-
→ substitutionLen
- The number of bytes in substitutionStr, not including the terminating null byte.
- If you use textSubstitutionDefaultStr for substitutionStr, use textSubstitutionDefaultLen for this parameter.
Returns
errNone upon success or one of the following if an error occurs:
-
txtErrConvertOverflow
- The destination buffer is not large enough to contain the converted text.
-
txtErrConvertUnderflow
- The end of the source buffer contains a partial character.
-
txtErrMalformedText
- An error in the source text encoding has been discovered.
-
txtErrNoCharMapping
- The device does not contain a mapping between the source and destination encodings for at least one of the characters in srcTextP.
-
txtErrUnknownEncoding
- One of the specified encodings is unknown or can't be handled.
-
txtErrUnknownEncodingFallbackCopy
- Either the source or destination encoding is unknown, and the best fit flag was set in the destination encoding. Before returning this error code, TxtConvertEncoding() copies anything that is 7-bit ASCII from the source text buffer to the destination text buffer.
Comments
This function converts ioSrcBytes of text in srcTextP from the srcEncoding to the dstEncoding character encoding and returns the result in dstTextP.
The supported encodings for srcEncoding and dstEncoding are locale-dependent. See "Encodings Supported by Various Locales." However, this function is most commonly used to convert between an encoding used on the Internet and the device's encoding; therefore, all locales support conversions between most Unicode character sets and the device's encoding. If you use any of the following character encodings, the conversion should work:
- The device's character encoding as returned by the function LmGetSystemLocale()
- Any of the following, which can be retrieved using LmGetLocaleSetting():
  - lmChoiceInboundDefaultVObjectEncoding (as srcEncoding only)
  - lmChoicePrimarySMSEncoding (as dstEncoding only)
  - lmChoiceSecondarySMSEncoding (as dstEncoding only)
  - lmChoicePrimaryEmailEncoding (as dstEncoding only)
  - lmChoiceSecondaryEmailEncoding (as dstEncoding only)
  - lmChoiceOutboundVObjectEncoding (as dstEncoding only)
TIP: If you're converting text that was received from the Internet, the encoding name is passed along with the text data. Use the TxtNameToEncoding() function to convert the name to a CharEncodingType value.
If the function encounters an inconvertible character in the source text, it puts substitutionStr in the destination buffer in that character's place and continues the conversion. When the conversion is complete, it returns txtErrNoCharMapping to indicate that an error occurred (assuming that no other higher-priority error occurred during the conversion). If substitutionStr is NULL, the function stops the conversion and immediately returns txtErrNoCharMapping. ioSrcBytes is set to the offset of the inconvertible character, dstTextP contains the converted string up to that point, and ioDstBytes contains the size of the converted text. You can examine the character at ioSrcBytes and choose to move past it and continue the conversion. Follow the rules for making repeated calls to TxtConvertEncoding() as described below.
Calling TxtConvertEncoding() in a Loop
You can make repeated calls to TxtConvertEncoding() in a loop if you only want to convert part of the input buffer at a time. When you make repeated calls to this function, the first call should use true for newConversion, and srcTextP should point to the start of the text buffer. All subsequent calls should use the following values:
-
newConversion
- false.
-
ioStateP
- The same data that was returned by the previous invocation.
-
srcTextP
- The location where this call should begin converting. Typically, this is the previous srcTextP plus the number of bytes returned in ioSrcBytes. If you are skipping over an inconvertible character, srcTextP must point to the character after that location.
-
ioSrcBytes
- The number of bytes that this function call should convert.
-
dstTextP
- A pointer to a location where this function can begin writing the converted string. You might choose to have each function call write to a different destination buffer. To have successive calls write to the same buffer, pass the previous dstTextP plus the number of bytes returned in ioDstBytes each time.
-
ioDstBytes
- The number of bytes available for output in the dstTextP buffer. In other words, the number of bytes remaining.
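The pointer and byte-count bookkeeping described above can be sketched in portable C. The converter here, convert_chunk(), is a hypothetical stand-in that turns ISO 8859-1 into UTF-8; it is not the Text Manager API and keeps no conversion state. Only its in/out byte-count convention is modeled on the description above, and convert_in_chunks() is likewise an illustrative name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in converter (ISO 8859-1 to UTF-8), not a Palm OS call.
 * On entry *ioSrcBytes is the number of bytes to convert and *ioDstBytes is
 * the space available. On return they hold the bytes consumed and the bytes
 * written. Returns 1 if the destination filled up, 0 otherwise. */
static int convert_chunk(const unsigned char *src, size_t *ioSrcBytes,
                         unsigned char *dst, size_t *ioDstBytes)
{
    size_t in = 0, out = 0;
    int full = 0;

    while (in < *ioSrcBytes) {
        unsigned char c = src[in];
        size_t need = (c < 0x80) ? 1 : 2;
        if (out + need > *ioDstBytes) { full = 1; break; }
        if (need == 1) {
            dst[out++] = c;
        } else {
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
        in++;
    }
    *ioSrcBytes = in;
    *ioDstBytes = out;
    return full;
}

/* Converts src in pieces of at most chunk bytes, writing every piece into
 * the same destination buffer. Returns the total number of bytes written. */
static size_t convert_in_chunks(const unsigned char *src, size_t srcLen,
                                unsigned char *dst, size_t dstSize,
                                size_t chunk)
{
    size_t srcPos = 0, dstPos = 0;

    while (srcPos < srcLen) {
        size_t srcBytes = srcLen - srcPos;
        size_t dstBytes = dstSize - dstPos;   /* bytes remaining in dst */
        int full;

        if (srcBytes > chunk) srcBytes = chunk;
        full = convert_chunk(src + srcPos, &srcBytes, dst + dstPos, &dstBytes);
        srcPos += srcBytes;   /* previous source position plus bytes consumed */
        dstPos += dstBytes;   /* previous destination position plus bytes written */
        if (full || srcBytes == 0) break;
    }
    return dstPos;
}
```

Each pass advances the source pointer by the bytes consumed and the destination pointer by the bytes written, and shrinks the remaining destination space to match.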
Encodings Supported by Various Locales
Each device's ROM contains a system-use only locale module that contains tables TxtConvertEncoding()
uses to convert one encoding to another. Therefore, the encodings that TxtConvertEncoding()
supports are dependent upon the ROM's locale. The locale module provides support for Unicode, the device encoding, and a set of related or locale-important encodings. The following tables summarize the set of encodings supported in TxtConvertEncoding()
by various locales.
Table 8.1 Source encodings for Latin ROMs
Table 8.2 Destination encodings for Latin ROMs
Table 8.3 Source encodings for Shift JIS ROMs
Table 8.4 Destination encodings for Shift JIS ROMs
Table 8.5 Source encodings for GB ROMs
Table 8.6 Destination encodings for GB ROMs
TxtEncodingName Function
Purpose
Obtains a character encoding's name.
Declared In
TextMgr.h
Prototype
const char *TxtEncodingName (
CharEncodingType iEncoding
)
Parameters
-
→ iEncoding
- One of the
CharEncodingType
values, indicating a character encoding.
Returns
A constant string containing the name of the encoding.
Comments
Use this function to obtain the official name of the character encoding, suitable to pass to an Internet application or any other application that requires the character encoding's name to be passed along with the data.
See Also
TxtFindString Function
Purpose
Performs a case-insensitive search for a string in another string.
Declared In
TextMgr.h
Prototype
Boolean TxtFindString (
const char *iSrcStringP,
const char *iTargetStringP,
size_t *oFoundPos,
size_t *oFoundLen
)
Parameters
-
→ iSrcStringP
- The string to be searched.
-
→ iTargetStringP
- Prepared version of the string to be found. This string should either be passed directly from the strToFind field in the sysAppLaunchCmdFind launch code's parameter block or it should be prepared using the function TxtPrepFindString().
-
← oFoundPos
- Pointer to the offset of the match in iSrcStringP.
-
← oFoundLen
- Pointer to the length in bytes of the matching text.
Returns
true
if the function finds iTargetStringP within iSrcStringP; false
otherwise.
If found, the values pointed to by the oFoundPos and oFoundLen parameters are set to the starting offset and the length of the matching text. If not found, the values pointed to by oFoundPos and oFoundLen are set to 0.
The search that TxtFindString()
performs is locale-dependent. On most ROMs with Latin-based encodings, TxtFindString()
returns true
only if the string is at the beginning of a word. On Shift JIS encoded ROMs, TxtFindString()
returns true
if the string is located anywhere in the word.
You must make sure that the parameters iSrcStringP and iTargetStringP point to the start of a valid character. That is, they must point to the first byte of a multi-byte character, or they must point to a single-byte character; if they don't, results are unpredictable.
See Also
TxtGetChar Function
Purpose
Retrieves the character starting at the specified offset within a text buffer.
Declared In
TextMgr.h
Prototype
wchar32_t TxtGetChar (
const char *iTextP,
size_t iOffset
)
Parameters
-
→ iTextP
- Pointer to the text buffer to be searched.
-
→ iOffset
- A valid offset into the buffer iTextP. This offset must point to an inter-character boundary.
Returns
The character at iOffset in iTextP.
Comments
You must make sure that the parameter iTextP points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character; if it doesn't, results are unpredictable.
See Also
TxtGetNextChar()
, TxtSetNextChar()
TxtGetEncodingFlags Function
Purpose
Returns the attributes of a particular character encoding.
Declared In
TextMgr.h
Prototype
uint32_t TxtGetEncodingFlags (
CharEncodingType iEncoding
)
Parameters
-
→ iEncoding
- A
CharEncodingType
value specifying a character encoding.
Returns
An unsigned integer with one or more of the Character Encoding Attributes flags set.
TxtGetNextChar Function
Purpose
Retrieves the character starting at the specified offset within a text buffer.
Declared In
TextMgr.h
Prototype
size_t TxtGetNextChar (
const char *iTextP,
size_t iOffset,
wchar32_t *oChar
)
Parameters
-
→ iTextP
- Pointer to the text buffer to be searched.
-
→ iOffset
- A valid offset into the buffer iTextP. This offset must point to an inter-character boundary.
-
← oChar
- The character at iOffset in iTextP. Pass
NULL
for this parameter if you don't need the character returned.
Returns
The size in bytes of the character at iOffset. If oChar is not NULL
upon entry, it points to the character at iOffset upon return.
Comments
You must make sure that the parameter iTextP points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character; if it doesn't, results are unpredictable.
Example
You can use this function to iterate through a text buffer character-by-character in this way:
size_t i = 0;
wchar32_t ch;
while (i < bufferLength) {
    i += TxtGetNextChar(buffer, i, &ch);
    // do something with ch.
}
See Also
TxtGetChar()
, TxtGetPreviousChar()
, TxtSetNextChar()
TxtGetPreviousChar Function
Purpose
Retrieves the character before the specified offset within a text buffer.
Declared In
TextMgr.h
Prototype
size_t TxtGetPreviousChar (
const char *iTextP,
size_t iOffset,
wchar32_t *oChar
)
Parameters
-
→ iTextP
- Pointer to the text buffer to be searched.
-
→ iOffset
- A valid offset into the buffer iTextP. This offset must point to an inter-character boundary.
-
← oChar
- The character immediately preceding iOffset in iTextP. Pass
NULL
for this parameter if you don't need the character returned.
Returns
The size in bytes of the character preceding iOffset in iTextP. If oChar is not NULL
upon entry, then it points to the character preceding iOffset upon return. Returns 0 if iOffset is at the start of the buffer (that is, iOffset is 0).
Comments
This function is often slower to use than TxtGetNextChar()
because it must determine the appropriate character boundaries if the byte immediately before the offset is valid in more than one location (start, middle, or end) of a multi-byte character. To do this, it must work backwards toward the beginning of the string until it finds an unambiguous byte.
You must make sure that the parameter iTextP points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character; if it doesn't, results are unpredictable.
Example
You can use this function to iterate through a text buffer character-by-character in this way:
size_t i, start, end;
wchar32_t ch;
// Find the start of the character containing the last byte.
TxtCharBounds(buffer, bufferLength - 1, &start, &end);
i = start;
// Walk backward through the characters that precede offset start.
while (i > 0) {
    i -= TxtGetPreviousChar(buffer, i, &ch);
    // do something with ch.
}
TxtGetTruncationOffset Function
Purpose
Returns the appropriate byte position for truncating a text buffer such that it is at most a specified number of bytes long.
Declared In
TextMgr.h
Prototype
size_t TxtGetTruncationOffset (
const char *iTextP,
size_t iOffset
)
Parameters
Returns
The appropriate byte offset for truncating iTextP at a valid inter-character boundary. The return value may be less than or equal to iOffset.
Comments
You must make sure that the parameter iTextP points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character; if it doesn't, results are unpredictable.
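The idea of truncating only at an inter-character boundary can be illustrated with a short sketch. It is not the Text Manager implementation. It assumes a null-terminated buffer and a simplified Shift JIS rule in which bytes 0x81-0x9F and 0xE0-0xFC begin a two-byte character. The names sjis_char_size() and truncation_offset() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the character whose first byte is b, under the simplified
 * Shift JIS assumption stated above. */
static size_t sjis_char_size(unsigned char b)
{
    return ((b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC)) ? 2 : 1;
}

/* Returns the largest inter-character boundary that is <= maxBytes.
 * Scans forward from the start, so it never lands inside a character. */
static size_t truncation_offset(const char *text, size_t maxBytes)
{
    size_t pos = 0;

    while (text[pos] != '\0') {
        size_t n = sjis_char_size((unsigned char)text[pos]);
        if (n == 2 && text[pos + 1] == '\0') break;  /* partial character */
        if (pos + n > maxBytes) break;               /* would overshoot */
        pos += n;
    }
    return pos;
}
```

For a buffer holding a one-byte character followed by a two-byte character, a limit of 2 yields offset 1 in this sketch, because cutting at 2 would split the two-byte character.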
TxtGetWordWrapOffset Function
Purpose
Locates an appropriate place for a line break in a text buffer.
Declared In
TextMgr.h
Prototype
size_t TxtGetWordWrapOffset (
const char *iTextP,
size_t iOffset
)
Parameters
-
→ iTextP
- Pointer to a text buffer.
-
→ iOffset
- A valid offset where the search should begin. The search is performed backward starting from this offset.
Returns
The offset of a character that can begin on a new line (typically, the beginning of the word that contains iOffset or of the last word before iOffset). If an appropriate break could not be found, returns iOffset.
Comments
The FntWordWrap()
function calls TxtGetWordWrapOffset()
to locate an appropriate place to break the text. The returned offset points to the character that should begin the next line.
This function starts at iOffset and works backward until it finds a character that typically occurs between words (for example, white space or punctuation). Then it moves forward until it locates the character that begins a word (typically, a letter or number). Note that this function may return an offset value that is greater than the one passed in if the offset passed in occurs immediately before white space or in the middle of white space.
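The backward-then-forward search described above can be sketched for plain ASCII text. This is a simplification under stated assumptions: a word character is anything isalnum() accepts, everything else separates words, and locale-specific rules are ignored. The name word_wrap_offset() is hypothetical and this is not how the Text Manager is implemented.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* ASCII-only sketch of the search: back up to a separator, then move
 * forward to the first character that can begin a word. Returns offset
 * unchanged when no separator precedes it. */
static size_t word_wrap_offset(const char *text, size_t offset)
{
    size_t len = strlen(text);
    size_t pos = (offset < len) ? offset : len;

    /* Step 1: work backward until a between-words character is found. */
    while (pos > 0 && isalnum((unsigned char)text[pos]))
        pos--;
    if (isalnum((unsigned char)text[pos]))
        return offset;   /* reached the start without finding a break */

    /* Step 2: move forward to the character that begins the next word. */
    while (text[pos] != '\0' && !isalnum((unsigned char)text[pos]))
        pos++;
    return pos;
}
```

With the text "hello world", an offset inside "world" yields 6 (the "w"), and an offset of 5 (the space) also yields 6, which shows how the result can be greater than the offset passed in.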
TxtMaxEncoding Function
Purpose
Returns the higher of two encodings.
Declared In
TextMgr.h
Prototype
CharEncodingType TxtMaxEncoding (
CharEncodingType a,
CharEncodingType b
)
Parameters
-
→ a
- A
CharEncodingType
to compare. -
→ b
- Another
CharEncodingType
to compare.
Returns
The higher of a or b. One character encoding is higher than another if it is more specific. For example, code page 1252 is "higher" than ISO 8859-1 because it represents more characters than ISO 8859-1.
Comments
This function is used by TxtStrEncoding()
to determine the encoding required for a string.
See Also
TxtCharEncoding()
, CharEncodingType
TxtNameToEncoding Function
Purpose
Returns an encoding's constant given its name.
Declared In
TextMgr.h
Prototype
CharEncodingType TxtNameToEncoding (
const char *iEncodingName
)
Parameters
-
→ iEncodingName
- One of the string constants containing the official name of an encoding. You can find a list of official names at this URL:
http://www.iana.org/assignments/character-sets
.
Returns
One of the CharEncodingType
constants. Returns charEncodingUnknown
if the specified encoding could not be found.
Comments
Use this function to convert a character encoding's name as received from an Internet application into the character encoding constant that some Text Manager functions require.
This function properly converts aliases for a character encoding. For example, passing any of the strings "us-ascii", "ASCII", "cp367", or "IBM367" returns charEncodingAscii.
All locales can access the Text Manager's character set list, which contains the standard set of aliases for the locales that Palm OS supports. Each locale may add its own aliases to the list as well. For example, a device with the Shift JIS encoding might add its own set of aliases, which would be unknown in other locales.
See Also
TxtNextCharSize Macro
Purpose
Returns the size of the character starting at the specified offset within a text buffer.
Declared In
TextMgr.h
Prototype
#define TxtNextCharSize(iTextP, iOffset)
Parameters
-
→ iTextP
- Pointer to the text buffer to be searched.
-
→ iOffset
- A valid offset into the buffer iTextP. This offset must point to an inter-character boundary.
Returns
The size in bytes of the character at iOffset.
Comments
You must make sure that the parameter iTextP points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character; if it doesn't, results are unpredictable.
See Also
TxtParamString Function
Purpose
Replaces substrings within a string with the specified values.
Declared In
TextMgr.h
Prototype
char *TxtParamString (
const char *inTemplate,
const char *param0,
const char *param1,
const char *param2,
const char *param3
)
Parameters
-
→ inTemplate
- The string containing the substrings to replace.
-
→ param0
- String to replace ^0 with or
NULL
. -
→ param1
- String to replace ^1 with or
NULL
. -
→ param2
- String to replace ^2 with or
NULL
. -
→ param3
- String to replace ^3 with or
NULL
.
Returns
A pointer to a locked relocatable chunk in the dynamic heap that contains the appropriate substitutions.
Comments
This function searches inTemplate
for occurrences of the sequences ^0, ^1, ^2, and ^3. When it finds these, it replaces them with the corresponding string passed to this function. Multiple instances of each sequence will be replaced.
The replacement strings can also contain the substitution strings, provided they refer to a later parameter. That is, the param0
string can have references to ^1, ^2, and ^3, the param1
string can have references to ^2 and ^3, and the param2
string can have references to ^3. Any other occurrences of the substitution strings in the replacement strings are ignored. For example, if param3
is the string "^0", any occurrences of ^3 in inTemplate
are replaced with the string "^0".
You must make sure that the parameter inTemplate
points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character; if it doesn't, results are unpredictable.
TxtParamString()
allocates space for the returned string in the dynamic heap through a call to MemHandleNew()
, and then returns the result of calling MemHandleLock()
with this handle. Your code is responsible for freeing this memory when it is no longer needed.
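The substitution order described above can be modeled in standard C by replacing ^0 first, then ^1, and so on, so that a replacement string may refer only to later parameters. This is a sketch, not the Text Manager implementation: it uses malloc() in place of the memory manager, it assumes that a NULL parameter leaves its marker untouched, and the names replace_marker() and param_string() are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of s in which every "^<digit>" is
 * replaced by value. The caller frees the result. NULL on failure. */
static char *replace_marker(const char *s, char digit, const char *value)
{
    size_t vlen = strlen(value), count = 0;
    const char *p;
    char *out, *w;

    for (p = s; *p != '\0'; p++)
        if (p[0] == '^' && p[1] == digit) { count++; p++; }

    /* Upper bound: original length plus one value per marker. */
    out = malloc(strlen(s) + count * vlen + 1);
    if (out == NULL) return NULL;

    w = out;
    for (p = s; *p != '\0'; p++) {
        if (p[0] == '^' && p[1] == digit) {
            memcpy(w, value, vlen);
            w += vlen;
            p++;                 /* skip the digit as well */
        } else {
            *w++ = *p;
        }
    }
    *w = '\0';
    return out;
}

/* Applies params[0] through params[3] in order. Because ^0 is handled
 * before ^1, a "^1" inside params[0] is expanded, while a "^0" inside
 * params[3] is left as literal text. */
static char *param_string(const char *tmpl, const char *params[4])
{
    int i;
    char *cur = malloc(strlen(tmpl) + 1);

    if (cur == NULL) return NULL;
    strcpy(cur, tmpl);
    for (i = 0; i < 4 && cur != NULL; i++) {
        char *next;
        if (params[i] == NULL) continue;   /* assumption: marker is kept */
        next = replace_marker(cur, (char)('0' + i), params[i]);
        free(cur);
        cur = next;
    }
    return cur;
}
```

As with the string returned by TxtParamString(), the caller owns the result of param_string() and must release it, here with free().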
See Also
TxtReplaceStr()
, FrmCustomAlert()
TxtPrepFindString Function
Purpose
Prepares a string for use in TxtFindString()
.
Declared In
TextMgr.h
Prototype
size_t TxtPrepFindString (
const char *iSrcTextP,
size_t iSrcLen,
char *oDstTextP,
size_t iDstSize
)
Parameters
-
→ iSrcTextP
- The text to be searched for. Must not be
NULL
. -
→ iSrcLen
- The number of bytes of
iSrcTextP
to convert. -
← oDstTextP
- The same text as in iSrcTextP but converted to a suitable format for searching. oDstTextP must not be the same address as iSrcTextP.
-
→ iDstSize
- The length in bytes of the area pointed to by oDstTextP.
Returns
The number of bytes from iSrcTextP
that were converted.
Comments
Use this function to normalize the string to search for before using TxtFindString()
to perform a search that is internal to your application. If you are using TxtFindString()
in response to the sysAppLaunchCmdFind
launch code, the string that the launch code passes in is already properly normalized for the search.
This function normalizes the string to be searched for. The method by which a search string is normalized varies depending on the version of Palm OS and the character encoding supported by the device.
If necessary to prevent overflow of the destination buffer, not all of iSrcTextP is converted.
You must make sure that the parameter iSrcTextP points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character. If it doesn't, results are unpredictable.
TxtPreviousCharSize Macro
Purpose
Returns the size of the character before the specified offset within a text buffer.
Declared In
TextMgr.h
Prototype
#define TxtPreviousCharSize(iTextP, iOffset)
Parameters
-
→ iTextP
- Pointer to the text buffer.
-
→ iOffset
- A valid offset into the buffer iTextP. This offset must point to an inter-character boundary.
Returns
The size in bytes of the character preceding iOffset in iTextP. Returns 0 if iOffset is at the start of the buffer (that is, iOffset is 0).
Comments
You must make sure that the parameter iTextP points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character; if it doesn't, results are unpredictable.
This macro is often slower to use than TxtNextCharSize()
because it must determine the appropriate character boundaries if the byte immediately before the offset is valid in more than one location (start, middle, or end) of a multi-byte character. To do this, it must work backwards toward the beginning of the string until it finds an unambiguous byte.
See Also
TxtReplaceStr Function
Purpose
Replaces a substring of a given format with another string.
Declared In
TextMgr.h
Prototype
uint16_t TxtReplaceStr (
char *iStringP,
size_t iMaxLen,
const char *iParamStringP,
uint16_t iParamNum
)
Parameters
-
↔ iStringP
- The string in which to perform the replacing.
-
→ iMaxLen
- The maximum length in bytes that iStringP can become.
-
→ iParamStringP
- The string that ^iParamNum should be replaced with. If NULL, no changes are made.
-
→ iParamNum
- A single-digit number (0 to 9).
Returns
The number of occurrences found and replaced.
Raises a fatal error message if iParamNum is greater than 9.
Comments
This function searches iStringP for occurrences of the string ^iParamNum, where iParamNum is any digit from 0 to 9. When it finds the string, it replaces it with iParamStringP. Multiple instances are replaced as long as the resulting string doesn't contain more than iMaxLen bytes, not counting the terminating null.
You can set the iParamStringP parameter to NULL to determine the required length of iStringP before actually doing the replacing. In that case TxtReplaceStr() makes no changes and returns the number of occurrences it finds of ^iParamNum. Multiply this value by the length of the iParamStringP you intend to use to determine how many bytes the replacement text occupies, and size iStringP accordingly.
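The sizing arithmetic can be sketched in standard C. The helper names count_markers() and required_length() are hypothetical and are not part of the Text Manager. The sketch assumes that each marker occupies exactly two bytes ("^" plus one digit), so every replacement removes two bytes and adds the length of the parameter string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts occurrences of "^<paramNum>" in s, the value that a call with a
 * NULL parameter string is described as returning. */
static size_t count_markers(const char *s, unsigned paramNum)
{
    char digit = (char)('0' + paramNum);
    size_t count = 0;
    const char *p;

    assert(paramNum <= 9);
    for (p = s; *p != '\0'; p++) {
        if (p[0] == '^' && p[1] == digit) { count++; p++; }
    }
    return count;
}

/* Length in bytes of s after every marker is replaced by param, not
 * counting the terminating null. */
static size_t required_length(const char *s, unsigned paramNum,
                              const char *param)
{
    size_t count = count_markers(s, paramNum);
    return strlen(s) - 2 * count + count * strlen(param);
}
```

A buffer passed with iMaxLen equal to this value, plus one byte for the terminating null, would leave room for every replacement under these assumptions.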
You must make sure that the parameter iStringP points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character; if it doesn't, results are unpredictable.
TxtSetNextChar Function
Purpose
Sets a character within a text buffer.
Declared In
TextMgr.h
Prototype
size_t TxtSetNextChar (
char *iTextP,
size_t iOffset,
wchar32_t iChar
)
Parameters
-
↔ iTextP
- Pointer to a text buffer.
-
→ iOffset
- A valid offset into the buffer iTextP. This offset must point to an inter-character boundary.
-
→ iChar
- The character to replace the character at iOffset with. Must not be a virtual character.
Returns
Comments
This function replaces the character in iTextP at the location iOffset with the character iChar. Note that there must be enough space at iOffset to write the character.
You can use TxtCharSize()
to determine the size of iChar.
You must make sure that the parameter iTextP points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character; if it doesn't, results are unpredictable.
See Also
TxtStrEncoding Function
Purpose
Returns the encoding required to represent a string.
Declared In
TextMgr.h
Prototype
CharEncodingType TxtStrEncoding (
const char *iStringP
)
Parameters
Returns
A CharEncodingType
value that indicates the encoding required to represent iStringP. If any character in the string isn't recognizable, then charEncodingUnknown
is returned.
Comments
The encoding for the string is the maximum encoding of any character in that string. For example, if a two-character string contains a blank space and a ü, the appropriate encoding is charEncodingISO8859_1
. The blank space's minimum encoding is ASCII. The minimum encoding for the ü is ISO 8859-1. The maximum of these two encodings is ISO 8859-1.
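The "maximum of the per-character minimum encodings" rule can be sketched as follows. The enum values and their ordering are assumptions made for illustration and do not correspond to the real CharEncodingType constants. The byte ranges assume single-byte Latin text in which 0x80-0x9F is treated as code page 1252 only. All names are hypothetical.

```c
#include <assert.h>

/* Hypothetical ranking: a larger value means a more specific encoding. */
enum Enc { ENC_ASCII = 0, ENC_ISO8859_1 = 1, ENC_CP1252 = 2 };

/* Minimum encoding able to represent one byte (assumed ranges). */
static enum Enc char_min_encoding(unsigned char c)
{
    if (c < 0x80) return ENC_ASCII;
    if (c >= 0xA0) return ENC_ISO8859_1;
    return ENC_CP1252;
}

/* Higher of two encodings under the ranking above. */
static enum Enc max_encoding(enum Enc a, enum Enc b)
{
    return (a > b) ? a : b;
}

/* Encoding needed for a whole string: the maximum over its characters. */
static enum Enc str_encoding(const char *s)
{
    enum Enc result = ENC_ASCII;

    for (; *s != '\0'; s++)
        result = max_encoding(result, char_min_encoding((unsigned char)*s));
    return result;
}
```

In this model a space followed by ü (byte 0xFC) evaluates to ENC_ISO8859_1, mirroring the example above.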
Use this function for informational purposes only. Your code should not assume that the character encoding returned by this function is the Palm OS system's character encoding. (Instead use LmGetSystemLocale()
.)
See Also
TxtCharEncoding()
, TxtMaxEncoding()
TxtTransliterate Function
Purpose
Converts the specified number of bytes in a text buffer using the specified operation.
Declared In
TextMgr.h
Prototype
status_t TxtTransliterate (
const char *iSrcTextP,
size_t iSrcLength,
char *oDstTextP,
size_t *ioDstLength,
TranslitOpType iTranslitOp
)
Parameters
-
→ iSrcTextP
- Pointer to a text buffer.
-
→ iSrcLength
- The length in bytes of iSrcTextP.
-
← oDstTextP
- The output buffer containing the converted characters.
-
↔ ioDstLength
- Upon entry, the maximum length of oDstTextP. Upon return, the actual length of oDstTextP.
-
→ iTranslitOp
- A 16-bit unsigned value that specifies which transliteration operation is to be performed. See TranslitOpType for the possible values for this field. You can ensure that you have enough space for the output by OR-ing your chosen operation with translitOpPreprocess.
Returns
-
errNone
- Success
-
txtErrUknownTranslitOp
- iTranslitOp's value is not recognized
-
txtErrTranslitOverrun
- iSrcTextP and oDstTextP point to the same memory location and the operation would cause the function to overwrite unprocessed data in the input buffer.
-
txtErrTranslitOverflow
- oDstTextP is not large enough to contain the converted string.
-
txtErrTranslitUnderflow
- The end of the source buffer contains a partial character.
Comments
iSrcTextP and oDstTextP may point to the same location if you want to perform the operation in place. However, you should be careful that the space required for oDstTextP is not larger than iSrcTextP so that you don't generate a txtErrTranslitOverrun
error.
For example, suppose on a Shift JIS encoded system, you want to convert a series of single-byte Japanese Katakana symbols to double-byte Katakana symbols. You cannot perform this operation in place because it replaces a single-byte character with a multi-byte character. When the first converted character is written to the buffer, it overwrites the second input character. Thus, a text overrun has occurred.
You must make sure that the parameter iSrcTextP points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character; if it doesn't, results are unpredictable.
Example
The following code shows how to convert a string to uppercase.
outSize = buf2Len;
// Preprocess pass: outSize receives the space the conversion needs.
error = TxtTransliterate(buf1, buf1Len, buf2, &outSize, translitOpUpperCase | translitOpPreprocess);
if (outSize > buf2Len) {
    /* allocate more memory for buf2 */
}
error = TxtTransliterate(buf1, buf1Len, buf2, &outSize, translitOpUpperCase);
TxtTruncateString Function
Purpose
Determines if a string fits within a given number of bytes. If not, truncates the string.
Declared In
TextMgr.h
Prototype
Boolean TxtTruncateString (
char *iDstString,
const char *iSrcString,
size_t iMaxLength,
Boolean iAddEllipsis
)
Parameters
-
← iDstString
- The null-terminated string truncated if necessary so that it is no more than iMaxLength bytes long.
-
→ iSrcString
- A null-terminated string.
-
→ iMaxLength
- The maximum length of
iDstString
including the null terminator. -
→ iAddEllipsis
- If true, an ellipsis character is the last character of iDstString if iSrcString had to be truncated. If false, iSrcString is truncated at the last character that fits in iDstString.
Returns
true
if the string was truncated, or false
if the string can fit without truncation.
Comments
This function determines whether iSrcString can be copied into a string with the specified length without being truncated. If it can, TxtTruncateString()
returns false
and copies iSrcString
into iDstString
. If the string must be truncated, this function copies one less than the number of characters that can fit in iMaxLength into iDstString
and then appends an ellipsis (...) character.
See Also
FntWidthToOffset()
, WinDrawTruncChars()
, TxtGetTruncationOffset()
TxtWordBounds Function
Purpose
Finds the boundaries of a word of text that contains the character starting at the specified offset.
Declared In
TextMgr.h
Prototype
Boolean TxtWordBounds (
const char *iTextP,
size_t iLength,
size_t iOffset,
size_t *oWordStart,
size_t *oWordEnd
)
Parameters
-
→ iTextP
- Pointer to a text buffer.
-
→ iLength
- The length in bytes of the text pointed to by iTextP.
-
→ iOffset
- A valid offset into the text buffer iTextP. This offset must point to the beginning of a character.
-
← oWordStart
- The starting offset of the text word.
-
← oWordEnd
- The ending offset of the text word.
Returns
true
if a word is found. Returns false
if the word doesn't exist or is punctuation or whitespace.
Comments
Assuming the ASCII encoding, if the text buffer contains the string "Hi! How are you?" and you pass 5 as the offset, TxtWordBounds()
returns the start and end of the word containing the character at offset 5, which is the character "o". Thus, oWordStart and oWordEnd would point to the start and end of the word "How".
You must make sure that the parameter iTextP points to the start of a valid character. That is, it must point to the first byte of a multi-byte character or it must point to a single-byte character; if it doesn't, results are unpredictable.
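The "Hi! How are you?" example can be reproduced with an ASCII-only sketch. It assumes that a word is a run of characters accepted by isalnum() and that the end offset is exclusive (one past the last byte of the word). Both are assumptions of the sketch, not statements about the Text Manager. The name word_bounds() is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Finds the word containing text[offset]. Returns 1 and fills *start and
 * *end (exclusive) when the character is part of a word, otherwise
 * returns 0 and leaves the outputs untouched. */
static int word_bounds(const char *text, size_t length, size_t offset,
                       size_t *start, size_t *end)
{
    size_t s = offset, e = offset;

    if (offset >= length || !isalnum((unsigned char)text[offset]))
        return 0;   /* punctuation, white space, or out of range */

    while (s > 0 && isalnum((unsigned char)text[s - 1]))
        s--;        /* extend backward to the first letter of the word */
    while (e < length && isalnum((unsigned char)text[e]))
        e++;        /* extend forward past the last letter of the word */

    *start = s;
    *end = e;
    return 1;
}
```

For the 16-byte string "Hi! How are you?" and offset 5, the sketch reports start 4 and end 7, which brackets the word "How".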
See Also
TxtCharBounds()
, TxtCharIsDelim()
, TxtGetWordWrapOffset()
1. This encoding is identical to its Windows counterpart with some additional characters added in the control range.