ICU 65.1
icu::RuleBasedNumberFormat Class Reference

The RuleBasedNumberFormat class formats numbers according to a set of rules.

#include <rbnf.h>

Inheritance diagram for icu::RuleBasedNumberFormat:
icu::NumberFormat icu::Format icu::UObject icu::UMemory

Public Member Functions

 RuleBasedNumberFormat (const UnicodeString &rules, UParseError &perror, UErrorCode &status)
 Creates a RuleBasedNumberFormat that behaves according to the description passed in.
 
 RuleBasedNumberFormat (const UnicodeString &rules, const UnicodeString &localizations, UParseError &perror, UErrorCode &status)
 Creates a RuleBasedNumberFormat that behaves according to the description passed in.
 
 RuleBasedNumberFormat (const UnicodeString &rules, const Locale &locale, UParseError &perror, UErrorCode &status)
 Creates a RuleBasedNumberFormat that behaves according to the rules passed in.
 
 RuleBasedNumberFormat (const UnicodeString &rules, const UnicodeString &localizations, const Locale &locale, UParseError &perror, UErrorCode &status)
 Creates a RuleBasedNumberFormat that behaves according to the description passed in.
 
 RuleBasedNumberFormat (URBNFRuleSetTag tag, const Locale &locale, UErrorCode &status)
 Creates a RuleBasedNumberFormat from a predefined ruleset.
 
 RuleBasedNumberFormat (const RuleBasedNumberFormat &rhs)
 Copy constructor.
 
RuleBasedNumberFormat & operator= (const RuleBasedNumberFormat &rhs)
 Assignment operator.
 
virtual ~RuleBasedNumberFormat ()
 Release memory allocated for a RuleBasedNumberFormat when you are finished with it.
 
virtual RuleBasedNumberFormat * clone () const
 Clone this object polymorphically.
 
virtual UBool operator== (const Format &other) const
 Return true if the given Format objects are semantically equal.
 
virtual UnicodeString getRules () const
 Return the rules that were provided to the RuleBasedNumberFormat.
 
virtual int32_t getNumberOfRuleSetNames () const
 Return the number of public rule set names.
 
virtual UnicodeString getRuleSetName (int32_t index) const
 Return the name of the index'th public ruleSet.
 
virtual int32_t getNumberOfRuleSetDisplayNameLocales (void) const
 Return the number of locales for which we have localized rule set display names.
 
virtual Locale getRuleSetDisplayNameLocale (int32_t index, UErrorCode &status) const
 Return the index'th display name locale.
 
virtual UnicodeString getRuleSetDisplayName (int32_t index, const Locale &locale=Locale::getDefault())
 Return the rule set display names for the provided locale.
 
virtual UnicodeString getRuleSetDisplayName (const UnicodeString &ruleSetName, const Locale &locale=Locale::getDefault())
 Return the rule set display name for the provided rule set and locale.
 
virtual UnicodeString & format (int32_t number, UnicodeString &toAppendTo, FieldPosition &pos) const
 Formats the specified 32-bit number using the default ruleset.
 
virtual UnicodeString & format (int64_t number, UnicodeString &toAppendTo, FieldPosition &pos) const
 Formats the specified 64-bit number using the default ruleset.
 
virtual UnicodeString & format (double number, UnicodeString &toAppendTo, FieldPosition &pos) const
 Formats the specified number using the default ruleset.
 
virtual UnicodeString & format (int32_t number, const UnicodeString &ruleSetName, UnicodeString &toAppendTo, FieldPosition &pos, UErrorCode &status) const
 Formats the specified number using the named ruleset.
 
virtual UnicodeString & format (int64_t number, const UnicodeString &ruleSetName, UnicodeString &toAppendTo, FieldPosition &pos, UErrorCode &status) const
 Formats the specified 64-bit number using the named ruleset.
 
virtual UnicodeString & format (double number, const UnicodeString &ruleSetName, UnicodeString &toAppendTo, FieldPosition &pos, UErrorCode &status) const
 Formats the specified number using the named ruleset.
 
virtual void parse (const UnicodeString &text, Formattable &result, ParsePosition &parsePosition) const
 Parses the specified string, beginning at the specified position, according to this formatter's rules.
 
virtual void setLenient (UBool enabled)
 Turns lenient parse mode on and off.
 
virtual UBool isLenient (void) const
 Returns true if lenient-parse mode is turned on.
 
virtual void setDefaultRuleSet (const UnicodeString &ruleSetName, UErrorCode &status)
 Override the default rule set to use.
 
virtual UnicodeString getDefaultRuleSetName () const
 Return the name of the current default rule set.
 
virtual void setContext (UDisplayContext value, UErrorCode &status)
 Set a particular UDisplayContext value in the formatter, such as UDISPCTX_CAPITALIZATION_FOR_STANDALONE.
 
virtual ERoundingMode getRoundingMode (void) const
 Get the rounding mode.
 
virtual void setRoundingMode (ERoundingMode roundingMode)
 Set the rounding mode.
 
virtual UClassID getDynamicClassID (void) const
 ICU "poor man's RTTI", returns a UClassID for the actual class.
 
virtual void adoptDecimalFormatSymbols (DecimalFormatSymbols *symbolsToAdopt)
 Sets the decimal format symbols, which is generally not changed by the programmer or user.
 
virtual void setDecimalFormatSymbols (const DecimalFormatSymbols &symbols)
 Sets the decimal format symbols, which is generally not changed by the programmer or user.
 
- Public Member Functions inherited from icu::NumberFormat
virtual ~NumberFormat ()
 Destructor.
 
virtual UnicodeString & format (const Formattable &obj, UnicodeString &appendTo, FieldPosition &pos, UErrorCode &status) const
 Format an object to produce a string.
 
virtual UnicodeString & format (const Formattable &obj, UnicodeString &appendTo, FieldPositionIterator *posIter, UErrorCode &status) const
 Format an object to produce a string.
 
virtual void parseObject (const UnicodeString &source, Formattable &result, ParsePosition &parse_pos) const
 Parse a string to produce an object.
 
UnicodeString & format (double number, UnicodeString &appendTo) const
 Format a double number.
 
UnicodeString & format (int32_t number, UnicodeString &appendTo) const
 Format a long number.
 
UnicodeString & format (int64_t number, UnicodeString &appendTo) const
 Format an int64 number.
 
virtual UnicodeString & format (double number, UnicodeString &appendTo, FieldPosition &pos, UErrorCode &status) const
 Format a double number.
 
virtual UnicodeString & format (double number, UnicodeString &appendTo, FieldPositionIterator *posIter, UErrorCode &status) const
 Format a double number.
 
virtual UnicodeString & format (int32_t number, UnicodeString &appendTo, FieldPosition &pos, UErrorCode &status) const
 Format a long number.
 
virtual UnicodeString & format (int32_t number, UnicodeString &appendTo, FieldPositionIterator *posIter, UErrorCode &status) const
 Format an int32 number.
 
virtual UnicodeString & format (int64_t number, UnicodeString &appendTo, FieldPosition &pos, UErrorCode &status) const
 Format an int64 number.
 
virtual UnicodeString & format (int64_t number, UnicodeString &appendTo, FieldPositionIterator *posIter, UErrorCode &status) const
 Format an int64 number.
 
virtual UnicodeString & format (StringPiece number, UnicodeString &appendTo, FieldPositionIterator *posIter, UErrorCode &status) const
 Format a decimal number.
 
virtual UnicodeString & format (const number::impl::DecimalQuantity &number, UnicodeString &appendTo, FieldPositionIterator *posIter, UErrorCode &status) const
 Format a decimal number.
 
virtual void parse (const UnicodeString &text, Formattable &result, UErrorCode &status) const
 Parse a string as a numeric value, and return a Formattable numeric object.
 
virtual CurrencyAmount * parseCurrency (const UnicodeString &text, ParsePosition &pos) const
 Parses text from the given string as a currency amount.
 
UBool isParseIntegerOnly (void) const
 Return true if this format will parse numbers as integers only.
 
virtual void setParseIntegerOnly (UBool value)
 Sets whether or not numbers should be parsed as integers only.
 
UBool isGroupingUsed (void) const
 Returns true if grouping is used in this format.
 
virtual void setGroupingUsed (UBool newValue)
 Set whether or not grouping will be used in this format.
 
int32_t getMaximumIntegerDigits (void) const
 Returns the maximum number of digits allowed in the integer portion of a number.
 
virtual void setMaximumIntegerDigits (int32_t newValue)
 Sets the maximum number of digits allowed in the integer portion of a number.
 
int32_t getMinimumIntegerDigits (void) const
 Returns the minimum number of digits allowed in the integer portion of a number.
 
virtual void setMinimumIntegerDigits (int32_t newValue)
 Sets the minimum number of digits allowed in the integer portion of a number.
 
int32_t getMaximumFractionDigits (void) const
 Returns the maximum number of digits allowed in the fraction portion of a number.
 
virtual void setMaximumFractionDigits (int32_t newValue)
 Sets the maximum number of digits allowed in the fraction portion of a number.
 
int32_t getMinimumFractionDigits (void) const
 Returns the minimum number of digits allowed in the fraction portion of a number.
 
virtual void setMinimumFractionDigits (int32_t newValue)
 Sets the minimum number of digits allowed in the fraction portion of a number.
 
virtual void setCurrency (const char16_t *theCurrency, UErrorCode &ec)
 Sets the currency used to display currency amounts.
 
const char16_t * getCurrency () const
 Gets the currency used to display currency amounts.
 
virtual UDisplayContext getContext (UDisplayContextType type, UErrorCode &status) const
 Get the formatter's UDisplayContext value for the specified UDisplayContextType, such as UDISPCTX_TYPE_CAPITALIZATION.
 
virtual void setRoundingMode (ERoundingMode roundingMode)
 Set the rounding mode.
 
- Public Member Functions inherited from icu::Format
virtual ~Format ()
 Destructor.
 
UBool operator!= (const Format &other) const
 Return true if the given Format objects are not semantically equal.
 
UnicodeString & format (const Formattable &obj, UnicodeString &appendTo, UErrorCode &status) const
 Formats an object to produce a string.
 
void parseObject (const UnicodeString &source, Formattable &result, UErrorCode &status) const
 Parses a string to produce an object.
 
Locale getLocale (ULocDataLocaleType type, UErrorCode &status) const
 Get the locale for this format object.
 
const char * getLocaleID (ULocDataLocaleType type, UErrorCode &status) const
 Get the locale for this format object.
 
- Public Member Functions inherited from icu::UObject
virtual ~UObject ()
 Destructor.
 

Static Public Member Functions

static UClassID getStaticClassID (void)
 ICU "poor man's RTTI", returns a UClassID for this class.
 
- Static Public Member Functions inherited from icu::NumberFormat
static NumberFormat * createInstance (UErrorCode &)
 Create a default style NumberFormat for the current default locale.
 
static NumberFormat * createInstance (const Locale &inLocale, UErrorCode &)
 Create a default style NumberFormat for the specified locale.
 
static NumberFormat * createInstance (const Locale &desiredLocale, UNumberFormatStyle style, UErrorCode &errorCode)
 Create a specific style NumberFormat for the specified locale.
 
static NumberFormat * internalCreateInstance (const Locale &desiredLocale, UNumberFormatStyle style, UErrorCode &errorCode)
 ICU use only.
 
static const SharedNumberFormat * createSharedInstance (const Locale &inLocale, UNumberFormatStyle style, UErrorCode &status)
 ICU use only.
 
static NumberFormat * createCurrencyInstance (UErrorCode &)
 Returns a currency format for the current default locale.
 
static NumberFormat * createCurrencyInstance (const Locale &inLocale, UErrorCode &)
 Returns a currency format for the specified locale.
 
static NumberFormat * createPercentInstance (UErrorCode &)
 Returns a percentage format for the current default locale.
 
static NumberFormat * createPercentInstance (const Locale &inLocale, UErrorCode &)
 Returns a percentage format for the specified locale.
 
static NumberFormat * createScientificInstance (UErrorCode &)
 Returns a scientific format for the current default locale.
 
static NumberFormat * createScientificInstance (const Locale &inLocale, UErrorCode &)
 Returns a scientific format for the specified locale.
 
static const Locale * getAvailableLocales (int32_t &count)
 Get the set of Locales for which NumberFormats are installed.
 
static URegistryKey registerFactory (NumberFormatFactory *toAdopt, UErrorCode &status)
 Register a new NumberFormatFactory.
 
static UBool unregister (URegistryKey key, UErrorCode &status)
 Unregister a previously-registered NumberFormatFactory using the key returned from the register call.
 
static StringEnumeration * getAvailableLocales (void)
 Return a StringEnumeration over the locales available at the time of the call, including registered locales.
 
static UClassID getStaticClassID (void)
 Return the class ID for this class.
 

Protected Member Functions

virtual UnicodeString & format (const number::impl::DecimalQuantity &number, UnicodeString &appendTo, FieldPosition &pos, UErrorCode &status) const
 Format a decimal number.
 
- Protected Member Functions inherited from icu::NumberFormat
 NumberFormat ()
 Default constructor for subclass use only.
 
 NumberFormat (const NumberFormat &)
 Copy constructor.
 
NumberFormat & operator= (const NumberFormat &)
 Assignment operator.
 
virtual void getEffectiveCurrency (char16_t *result, UErrorCode &ec) const
 Returns the currency in effect for this formatter.
 
- Protected Member Functions inherited from icu::Format
void setLocaleIDs (const char *valid, const char *actual)
 
 Format ()
 Default constructor for subclass use only.
 
 Format (const Format &)
 
Format & operator= (const Format &)
 

Friends

class NFSubstitution
 
class NFRule
 
class NFRuleSet
 
class FractionalPartSubstitution
 

Additional Inherited Members

- Public Types inherited from icu::NumberFormat
enum  ERoundingMode {
  kRoundCeiling, kRoundFloor, kRoundDown, kRoundUp,
  kRoundHalfEven, kRoundHalfDown, kRoundHalfUp, kRoundUnnecessary
}
 Rounding mode.
 
enum  EAlignmentFields {
  kIntegerField = UNUM_INTEGER_FIELD, kFractionField = UNUM_FRACTION_FIELD, kDecimalSeparatorField = UNUM_DECIMAL_SEPARATOR_FIELD, kExponentSymbolField = UNUM_EXPONENT_SYMBOL_FIELD,
  kExponentSignField = UNUM_EXPONENT_SIGN_FIELD, kExponentField = UNUM_EXPONENT_FIELD, kGroupingSeparatorField = UNUM_GROUPING_SEPARATOR_FIELD, kCurrencyField = UNUM_CURRENCY_FIELD,
  kPercentField = UNUM_PERCENT_FIELD, kPermillField = UNUM_PERMILL_FIELD, kSignField = UNUM_SIGN_FIELD, kMeasureUnitField = UNUM_MEASURE_UNIT_FIELD,
  kCompactField = UNUM_COMPACT_FIELD, INTEGER_FIELD = UNUM_INTEGER_FIELD, FRACTION_FIELD = UNUM_FRACTION_FIELD
}
 Alignment Field constants used to construct a FieldPosition object.
 
- Static Protected Member Functions inherited from icu::NumberFormat
static NumberFormat * makeInstance (const Locale &desiredLocale, UNumberFormatStyle style, UBool mustBeDecimalFormat, UErrorCode &errorCode)
 Creates the specified number format style of the desired locale.
 
- Static Protected Member Functions inherited from icu::Format
static void syntaxError (const UnicodeString &pattern, int32_t pos, UParseError &parseError)
 Simple function for initializing a UParseError from a UnicodeString.
 
- Static Protected Attributes inherited from icu::NumberFormat
static const int32_t gDefaultMaxIntegerDigits
 
static const int32_t gDefaultMinIntegerDigits
 

Detailed Description

The RuleBasedNumberFormat class formats numbers according to a set of rules.

This number formatter is typically used for spelling out numeric values in words (e.g., 25,376 as "twenty-five thousand three hundred seventy-six" or "vingt-cinq mille trois cents soixante-seize" or "fünfundzwanzigtausenddreihundertsechsundsiebzig"), but can also be used for other complicated formatting tasks, such as formatting a number of seconds as hours, minutes and seconds (e.g., 3,730 as "1:02:10").

The resources contain three predefined formatters for each locale: spellout, which spells out a value in words (123 is "one hundred twenty-three"); ordinal, which appends an ordinal suffix to the end of a numeral (123 is "123rd"); and duration, which shows a duration in seconds as hours, minutes, and seconds (123 is "2:03").  The client can also define more specialized RuleBasedNumberFormats by supplying programmer-defined rule sets.

The behavior of a RuleBasedNumberFormat is specified by a textual description that is either passed to the constructor as a String or loaded from a resource bundle. In its simplest form, the description consists of a semicolon-delimited list of rules. Each rule has a string of output text and a value or range of values it is applicable to. In a typical spellout rule set, the first twenty rules are the words for the numbers from 0 to 19:

zero; one; two; three; four; five; six; seven; eight; nine;
ten; eleven; twelve; thirteen; fourteen; fifteen; sixteen; seventeen; eighteen; nineteen;

For larger numbers, we can use the preceding set of rules to format the ones place, and we only have to supply the words for the multiples of 10:

20: twenty[->>];
30: thirty[->>];
40: forty[->>];
50: fifty[->>];
60: sixty[->>];
70: seventy[->>];
80: eighty[->>];
90: ninety[->>];

In these rules, the base value is spelled out explicitly and set off from the rule's output text with a colon. The rules are in a sorted list, and a rule is applicable to all numbers from its own base value to one less than the next rule's base value. The ">>" token is called a substitution and tells the formatter to isolate the number's ones digit, format it using this same set of rules, and place the result at the position of the ">>" token. Text in brackets is omitted if the number being formatted is an even multiple of 10 (the hyphen is a literal hyphen; 24 is "twenty-four," not "twenty four").

For even larger numbers, we can actually look up several parts of the number in the list:

100: << hundred[ >>];

The "<<" represents a new kind of substitution. The << isolates the hundreds digit (and any digits to its left), formats it using this same rule set, and places the result where the "<<" was. Notice also that the meaning of >> has changed: it now refers to both the tens and the ones digits. The meaning of both substitutions depends on the rule's base value. The base value determines the rule's divisor, which is the highest power of 10 that is less than or equal to the base value (the user can change this). To fill in the substitutions, the formatter divides the number being formatted by the divisor. The integral quotient is used to fill in the << substitution, and the remainder is used to fill in the >> substitution. The meaning of the brackets changes similarly: text in brackets is omitted if the value being formatted is an even multiple of the rule's divisor. The rules are applied recursively, so if a substitution is filled in with text that includes another substitution, that substitution is also filled in.
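The arithmetic behind these two substitutions can be sketched in a few lines of standalone C++. This is only an illustration of the description above, not ICU's implementation; the names defaultDivisor, SubstitutionValues, splitByDivisor and omitsBracketedText are hypothetical, and the sketch assumes a base value of at least 1 and the default radix of 10.

```cpp
#include <cassert>
#include <cstdint>

// Highest power of 10 that is less than or equal to baseValue, i.e. the
// default divisor described above. Assumes baseValue >= 1.
int64_t defaultDivisor(int64_t baseValue) {
    assert(baseValue >= 1);
    int64_t divisor = 1;
    while (divisor <= baseValue / 10) {
        divisor *= 10;
    }
    return divisor;
}

// Values that fill the two substitutions of a normal rule.
struct SubstitutionValues {
    int64_t quotient;   // fills "<<"
    int64_t remainder;  // fills ">>"
};

// Divides the number by the divisor of the rule with the given base value.
SubstitutionValues splitByDivisor(int64_t number, int64_t baseValue) {
    const int64_t divisor = defaultDivisor(baseValue);
    return SubstitutionValues{number / divisor, number % divisor};
}

// Bracketed text is omitted when the number is an exact multiple of the divisor.
bool omitsBracketedText(int64_t number, int64_t baseValue) {
    return number % defaultDivisor(baseValue) == 0;
}
```

For the rule "100: << hundred[ >>];", splitByDivisor(340, 100) yields a quotient of 3 and a remainder of 40, so the bracketed text is kept.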

This rule covers values up to 999, at which point we add another rule:

1000: << thousand[ >>];

Again, the meanings of the brackets and substitution tokens shift because the rule's base value is a higher power of 10, changing the rule's divisor. This rule can actually be used all the way up to 999,999. This allows us to finish out the rules as follows:

1,000,000: << million[ >>];
1,000,000,000: << billion[ >>];
1,000,000,000,000: << trillion[ >>];
1,000,000,000,000,000: OUT OF RANGE!;

Commas, periods, and spaces can be used in the base values to improve legibility and are ignored by the rule parser. The last rule in the list is customarily treated as an "overflow rule," applying to everything from its base value on up, and often (as in this example) being used to print out an error message or default representation. Notice also that the size of the major groupings in large numbers is controlled by the spacing of the rules: because in English we group numbers by thousand, the higher rules are separated from each other by a factor of 1,000.
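As a rough illustration of how such a base value could be read, the sketch below skips the three separator characters and accumulates the digits. The function name parseBaseValue and its -1 error convention are assumptions made for this example; the real rule parser is internal to ICU and may behave differently, and overflow is not checked here.

```cpp
#include <cstdint>
#include <string>

// Parses a base value such as "1,000,000" or "1 000 000", skipping the
// separators that the rule parser is described as ignoring.
// Returns -1 for empty input or any other character (hypothetical convention).
int64_t parseBaseValue(const std::string& text) {
    int64_t value = 0;
    bool sawDigit = false;
    for (char c : text) {
        if (c == ',' || c == '.' || c == ' ') {
            continue;  // legibility separators carry no meaning
        }
        if (c < '0' || c > '9') {
            return -1;  // not an ASCII digit
        }
        value = value * 10 + (c - '0');
        sawDigit = true;
    }
    return sawDigit ? value : -1;
}
```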

To see how these rules work in practice, consider formatting 25,340 with this rule set:

<< thousand >> [the rule whose base value is 1,000 is applicable to 25,340]
twenty->> thousand >> [25,340 over 1,000 is 25. The rule for 20 applies.]
twenty-five thousand >> [25 mod 10 is 5. The rule for 5 is "five."]
twenty-five thousand << hundred >> [25,340 mod 1,000 is 340. The rule for 100 applies.]
twenty-five thousand three hundred >> [340 over 100 is 3. The rule for 3 is "three."]
twenty-five thousand three hundred forty [340 mod 100 is 40. The rule for 40 applies. Since 40 divides evenly by 10, the hyphen and substitution in the brackets are omitted.]
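The walk-through above can be reproduced with a small recursive function. The following is a minimal standalone sketch of the simplified English rule set from this section, assuming non-negative integers only; the Rule struct, divisorFor, buildEnglishRules and spellOut are hypothetical names, rules are stored in a pre-parsed form instead of being parsed from text, and the lookup is a plain linear scan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One rule of the simplified English rule set, in pre-parsed form.
struct Rule {
    int64_t baseValue;
    std::string text;    // literal word, e.g. "twenty" or "thousand"
    bool quotientFirst;  // true for "<< text[ >>]", false for "text[->>]"
    std::string joiner;  // literal inside the brackets; empty means no substitutions
};

// Highest power of 10 that is <= baseValue (the rule's default divisor).
int64_t divisorFor(int64_t baseValue) {
    int64_t divisor = 1;
    while (divisor <= baseValue / 10) divisor *= 10;
    return divisor;
}

// Builds the sorted rule list: 0 to 19, the multiples of 10, then the groupings.
std::vector<Rule> buildEnglishRules() {
    static const char* const kSmall[] = {
        "zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
    static const char* const kTens[] = {
        "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"};
    std::vector<Rule> rules;
    for (int i = 0; i < 20; ++i) rules.push_back(Rule{i, kSmall[i], false, ""});
    for (int i = 0; i < 8; ++i) rules.push_back(Rule{20 + 10 * i, kTens[i], false, "-"});
    rules.push_back(Rule{100, "hundred", true, " "});
    rules.push_back(Rule{1000, "thousand", true, " "});
    rules.push_back(Rule{1000000, "million", true, " "});
    rules.push_back(Rule{1000000000, "billion", true, " "});
    rules.push_back(Rule{1000000000000, "trillion", true, " "});
    rules.push_back(Rule{1000000000000000, "OUT OF RANGE!", false, ""});  // overflow rule
    return rules;
}

// Spells out a non-negative integer by applying the rules recursively.
std::string spellOut(int64_t number) {
    assert(number >= 0);
    static const std::vector<Rule> rules = buildEnglishRules();
    // Applicable rule: the last one whose base value does not exceed the number.
    std::size_t index = 0;
    for (std::size_t i = 0; i < rules.size(); ++i) {
        if (rules[i].baseValue <= number) index = i;
    }
    const Rule& rule = rules[index];
    if (rule.joiner.empty()) return rule.text;  // plain word, nothing to substitute

    const int64_t divisor = divisorFor(rule.baseValue);
    const int64_t quotient = number / divisor;
    const int64_t remainder = number % divisor;
    std::string out = rule.quotientFirst ? spellOut(quotient) + " " + rule.text : rule.text;
    // Bracketed text is omitted for exact multiples of the divisor.
    if (remainder != 0) out += rule.joiner + spellOut(remainder);
    return out;
}
```

Calling spellOut(25340) follows the same steps as the trace: the 1,000 rule is selected, 25 is formatted through the rule for 20, and 340 through the rule for 100.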

The above syntax suffices only to format positive integers. To format negative numbers, we add a special rule:

-x: minus >>;

This is called a negative-number rule, and is identified by "-x" where the base value would be. This rule is used to format all negative numbers. The >> token here means "find the number's absolute value, format it with these rules, and put the result here."

We also add a special rule called a fraction rule for numbers with fractional parts:

x.x: << point >>;

This rule is used for all positive non-integers (negative non-integers pass through the negative-number rule first and then through this rule). Here, the << token refers to the number's integral part, and the >> to the number's fractional part. The fractional part is formatted as a series of single-digit numbers (e.g., 123.456 would be formatted as "one hundred twenty-three point four five six").
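The order in which the two special rules apply, and the digit-by-digit treatment of the fractional part, can be sketched as follows. This is an illustrative model only: spellFractionDigits and applySpecialRules are hypothetical names, the integral part is assumed to be spelled out already, and the fraction is passed as a string of decimal digits to sidestep floating-point digit extraction.

```cpp
#include <cassert>
#include <string>

// Spells each fraction digit on its own, as the ">>" of the "x.x" rule does.
std::string spellFractionDigits(const std::string& digits) {
    static const char* const kDigitWords[] = {
        "zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"};
    std::string out;
    for (char c : digits) {
        assert(c >= '0' && c <= '9');
        if (!out.empty()) out += ' ';
        out += kDigitWords[c - '0'];
    }
    return out;
}

// Models "-x: minus >>;" wrapping the result of "x.x: << point >>;".
// integerWords is the spelled-out integral part of the absolute value;
// fractionDigits holds the digits after the decimal point (may be empty).
std::string applySpecialRules(bool negative,
                              const std::string& integerWords,
                              const std::string& fractionDigits) {
    std::string out = integerWords;
    if (!fractionDigits.empty()) {
        out += " point " + spellFractionDigits(fractionDigits);
    }
    return negative ? "minus " + out : out;
}
```

With these assumptions, applySpecialRules(false, "one hundred twenty-three", "456") produces the "one hundred twenty-three point four five six" result quoted above.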

To see how this rule syntax is applied to various languages, examine the resource data.

There is actually much more flexibility built into the rule language than the description above shows. A formatter may own multiple rule sets, which can be selected by the caller, and which can use each other to fill in their substitutions. Substitutions can also be filled in with digits, using a DecimalFormat object. There is syntax that can be used to alter a rule's divisor in various ways. And there is provision for much more flexible fraction handling. A complete description of the rule syntax follows:


The description of a RuleBasedNumberFormat's behavior consists of one or more rule sets. Each rule set consists of a name, a colon, and a list of rules. A rule set name must begin with a % sign. Rule sets with names that begin with a single % sign are public: the caller can specify that they be used to format and parse numbers. Rule sets with names that begin with %% are private: they exist only for the use of other rule sets. If a formatter only has one rule set, the name may be omitted.
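The naming convention can be expressed as a small classifier. The enum and function below are hypothetical helpers written for this illustration, not part of the ICU API, and they do not treat any special rule set names differently.

```cpp
#include <string>

// Visibility implied by a rule set name's prefix.
enum class RuleSetVisibility { Public, Private, Invalid };

// "%%name" is private, "%name" is public, anything else is not a rule set name.
RuleSetVisibility classifyRuleSetName(const std::string& name) {
    if (name.size() >= 2 && name[0] == '%' && name[1] == '%') {
        return RuleSetVisibility::Private;
    }
    if (!name.empty() && name[0] == '%') {
        return RuleSetVisibility::Public;
    }
    return RuleSetVisibility::Invalid;
}
```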

The user can also specify a special "rule set" named %lenient-parse. The body of %lenient-parse isn't a set of number-formatting rules, but a RuleBasedCollator description which is used to define equivalences for lenient parsing. For more information on the syntax, see RuleBasedCollator. For more information on lenient parsing, see setLenient(). Note: symbols that have syntactic meaning in collation rules, such as '&', have no particular meaning when appearing outside of the lenient-parse rule set.

The body of a rule set consists of an ordered, semicolon-delimited list of rules. Internally, every rule has a base value, a divisor, rule text, and zero, one, or two substitutions. These parameters are controlled by the description syntax, which consists of a rule descriptor, a colon, and a rule body.

A rule descriptor can take one of the following forms (text in italics is the name of a token):

bv: bv specifies the rule's base value. bv is a decimal number expressed using ASCII digits. bv may contain spaces, periods, and commas, which are ignored. The rule's divisor is the highest power of 10 less than or equal to the base value.
bv/rad: bv specifies the rule's base value. The rule's divisor is the highest power of rad less than or equal to the base value.
bv>: bv specifies the rule's base value. To calculate the divisor, let the radix be 10, and the exponent be the highest exponent of the radix that yields a result less than or equal to the base value. Every > character after the base value decreases the exponent by 1. If the exponent is positive or 0, the divisor is the radix raised to the power of the exponent; otherwise, the divisor is 1.
bv/rad>: bv specifies the rule's base value. To calculate the divisor, let the radix be rad, and the exponent be the highest exponent of the radix that yields a result less than or equal to the base value. Every > character after the radix decreases the exponent by 1. If the exponent is positive or 0, the divisor is the radix raised to the power of the exponent; otherwise, the divisor is 1.
-x: The rule is a negative-number rule.
x.x: The rule is an improper fraction rule. If the full stop in the middle of the rule name is replaced with the decimal point that is used in the language or DecimalFormatSymbols, then that rule will have precedence when formatting and parsing this rule. For example, some languages use the comma, and can thus be written as x,x instead. For example, you can use "x.x: << point >>;x,x: << comma >>;" to handle the decimal point that matches the language's natural spelling of the punctuation of either the full stop or comma.
0.x: The rule is a proper fraction rule. If the full stop in the middle of the rule name is replaced with the decimal point that is used in the language or DecimalFormatSymbols, then that rule will have precedence when formatting and parsing this rule. For example, some languages use the comma, and can thus be written as 0,x instead. For example, you can use "0.x: point >>;0,x: comma >>;" to handle the decimal point that matches the language's natural spelling of the punctuation of either the full stop or comma.
x.0: The rule is a master rule. If the full stop in the middle of the rule name is replaced with the decimal point that is used in the language or DecimalFormatSymbols, then that rule will have precedence when formatting and parsing this rule. For example, some languages use the comma, and can thus be written as x,0 instead. For example, you can use "x.0: << point;x,0: << comma;" to handle the decimal point that matches the language's natural spelling of the punctuation of either the full stop or comma.
Inf: The rule for infinity.
NaN: The rule for an IEEE 754 NaN (not a number).
nothing: If the rule's rule descriptor is left out, the base value is one plus the preceding rule's base value (or zero if this is the first rule in the list) in a normal rule set. In a fraction rule set, the base value is the same as the preceding rule's base value.
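The divisor calculation for the bv, bv/rad, bv> and bv/rad> descriptors can be sketched in one function. This is a model of the description above, not ICU's code; ruleDivisor and its parameter names are hypothetical, and it assumes a base value of at least 1 and a radix of at least 2.

```cpp
#include <cassert>
#include <cstdint>

// Computes a rule's divisor from its base value, its radix (10 unless a
// "/rad" part is given) and the number of '>' characters in the descriptor.
int64_t ruleDivisor(int64_t baseValue, int64_t radix, int exponentReductions) {
    assert(baseValue >= 1 && radix >= 2 && exponentReductions >= 0);

    // Highest exponent such that radix^exponent <= baseValue.
    int exponent = 0;
    int64_t power = 1;
    while (power <= baseValue / radix) {
        power *= radix;
        ++exponent;
    }

    // Every '>' lowers the exponent by 1; a negative exponent gives divisor 1.
    exponent -= exponentReductions;
    if (exponent < 0) {
        return 1;
    }
    int64_t divisor = 1;
    for (int i = 0; i < exponent; ++i) {
        divisor *= radix;
    }
    return divisor;
}
```

Under these assumptions, a descriptor of "100:" gives a divisor of 100, "100>:" gives 10, and "3600/60:" gives 3600.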

A rule set may be either a regular rule set or a fraction rule set, depending on whether it is used to format a number's integral part (or the whole number) or a number's fractional part. Using a rule set to format a rule's fractional part makes it a fraction rule set.

Which rule is used to format a number is defined according to one of the following algorithms: If the rule set is a regular rule set, do the following:

If the rule set is a fraction rule set, do the following:

A rule's body consists of a string of characters terminated by a semicolon. The rule may include zero, one, or two substitution tokens, and a range of text in brackets. The brackets denote optional text (and may also include one or both substitutions). The exact meanings of the substitution tokens, and under what conditions optional text is omitted, depend on the syntax of the substitution token and the context. The rest of the text in a rule body is literal text that is output when the rule matches the number being formatted.

A substitution token begins and ends with a token character. The token character and the context together specify a mathematical operation to be performed on the number being formatted. An optional substitution descriptor specifies how the value resulting from that operation is used to fill in the substitution. The position of the substitution token in the rule body specifies the location of the resultant text in the original rule text.

The meanings of the substitution token characters are as follows:

>>
  in normal rule: Divide the number by the rule's divisor and format the remainder.
  in negative-number rule: Find the absolute value of the number and format the result.
  in fraction or master rule: Isolate the number's fractional part and format it.
  in rule in fraction rule set: Not allowed.
>>>
  in normal rule: Divide the number by the rule's divisor and format the remainder, but bypass the normal rule-selection process and just use the rule that precedes this one in this rule list.
  in all other rules: Not allowed.
<<
  in normal rule: Divide the number by the rule's divisor and format the quotient.
  in negative-number rule: Not allowed.
  in fraction or master rule: Isolate the number's integral part and format it.
  in rule in fraction rule set: Multiply the number by the rule's base value and format the result.
==
  in all rule sets: Format the number unchanged.
[]
  in normal rule: Omit the optional text if the number is an even multiple of the rule's divisor.
  in negative-number rule: Not allowed.
  in improper-fraction rule: Omit the optional text if the number is between 0 and 1 (same as specifying both an x.x rule and a 0.x rule).
  in master rule: Omit the optional text if the number is an integer (same as specifying both an x.x rule and an x.0 rule).
  in proper-fraction rule: Not allowed.
  in rule in fraction rule set: Omit the optional text if multiplying the number by the rule's base value yields 1.
$(cardinal,plural syntax)$
  in all rule sets: This provides the ability to choose a word based on the number divided by the radix to the power of the exponent of the base value for the specified locale, which is normally equivalent to the << value. This uses the cardinal plural rules from PluralFormat. All strings used in the plural format are treated as the same base value for parsing.
$(ordinal,plural syntax)$
  in all rule sets: This provides the ability to choose a word based on the number divided by the radix to the power of the exponent of the base value for the specified locale, which is normally equivalent to the << value. This uses the ordinal plural rules from PluralFormat. All strings used in the plural format are treated as the same base value for parsing.

The substitution descriptor (i.e., the text between the token characters) may take one of three forms:

a rule set name: Perform the mathematical operation on the number, and format the result using the named rule set.
a DecimalFormat pattern: Perform the mathematical operation on the number, and format the result using a DecimalFormat with the specified pattern. The pattern must begin with 0 or #.
nothing: Perform the mathematical operation on the number, and format the result using the rule set containing the current rule, except:
  • You can't have an empty substitution descriptor with a == substitution.
  • If you omit the substitution descriptor in a >> substitution in a fraction rule, format the result one digit at a time using the rule set containing the current rule.
  • If you omit the substitution descriptor in a << substitution in a rule in a fraction rule set, format the result using the default rule set for this formatter.

Whitespace is ignored between a rule set name and a rule set body, between a rule descriptor and a rule body, or between rules. If a rule body begins with an apostrophe, the apostrophe is ignored, but all text after it becomes significant (this is how you can have a rule's rule text begin with whitespace). There is no escape function: the semicolon is not allowed in rule set names or in rule text, and the colon is not allowed in rule set names. The characters beginning a substitution token are always treated as the beginning of a substitution token.

See the resource data and the demo program for annotated examples of real rule sets using these features.

User subclasses are not supported. While clients may write subclasses, such code will not necessarily work and will not be guaranteed to work stably from release to release.

Localizations

Constructors are available that allow the specification of localizations for the public rule sets (and also allow more control over what public rule sets are available). Localization data is represented as a textual description. The description represents an array of arrays of strings. The first element is an array of the public rule set names; each of these must be one of the public rule set names that appear in the rules. Only names in this array will be treated as public rule set names by the API. Each subsequent element is an array of localizations of these names. The first element of one of these subarrays is the locale name, and the remaining elements are localizations of the public rule set names, in the same order as they were listed in the first array.

In the syntax, angle brackets '<', '>' are used to delimit the arrays, and comma ',' is used to separate elements of an array. Whitespace is ignored, unless quoted.

For example:

< < %foo, %bar, %baz >,
  < en, Foo, Bar, Baz >,
  < fr, 'le Foo', 'le Bar', 'le Baz' >,
  < zh, \u7532, \u4e59, \u4e19 > >
Author
Richard Gillam
See also
NumberFormat
DecimalFormat
PluralFormat
PluralRules
Stable:
ICU 2.0

Definition at line 562 of file rbnf.h.

Constructor & Destructor Documentation

◆ RuleBasedNumberFormat() [1/6]

icu::RuleBasedNumberFormat::RuleBasedNumberFormat ( const UnicodeString & rules,
UParseError & perror,
UErrorCode & status
)

Creates a RuleBasedNumberFormat that behaves according to the description passed in.

The formatter uses the default locale.

Parameters
rules A description of the formatter's desired behavior. See the class documentation for a complete explanation of the description syntax.
perror The parse error if an error was encountered.
status The status indicating whether the constructor succeeded.
Stable:
ICU 3.2

◆ RuleBasedNumberFormat() [2/6]

icu::RuleBasedNumberFormat::RuleBasedNumberFormat ( const UnicodeString & rules,
const UnicodeString & localizations,
UParseError & perror,
UErrorCode & status
)

Creates a RuleBasedNumberFormat that behaves according to the description passed in.

The formatter uses the default locale.

The localizations data provides information about the public rule sets and their localized display names for different locales. The first element in the list is an array of the names of the public rule sets. The first element in this array is the initial default ruleset. The remaining elements in the list are arrays of localizations of the names of the public rule sets. Each of these is one longer than the initial array, with the first String being the ULocale ID, and the remaining Strings being the localizations of the rule set names, in the same order as the initial array. Arrays are NULL-terminated.

Parameters
rules A description of the formatter's desired behavior. See the class documentation for a complete explanation of the description syntax.
localizations A list of localizations for the rule set names in the description. These will be copied by the constructor.
perror The parse error if an error was encountered.
status The status indicating whether the constructor succeeded.
Stable:
ICU 3.2

◆ RuleBasedNumberFormat() [3/6]

icu::RuleBasedNumberFormat::RuleBasedNumberFormat ( const UnicodeString & rules,
const Locale & locale,
UParseError & perror,
UErrorCode & status
)

Creates a RuleBasedNumberFormat that behaves according to the rules passed in.

The formatter uses the specified locale to determine the characters to use when formatting numerals, and to define equivalences for lenient parsing.

Parameters
rules The formatter rules. See the class documentation for a complete explanation of the rule syntax.
locale A locale that governs which characters are used for formatting values in numerals and which characters are equivalent in lenient parsing.
perror The parse error if an error was encountered.
status The status indicating whether the constructor succeeded.
Stable:
ICU 2.0

◆ RuleBasedNumberFormat() [4/6]

icu::RuleBasedNumberFormat::RuleBasedNumberFormat ( const UnicodeString & rules,
const UnicodeString & localizations,
const Locale & locale,
UParseError & perror,
UErrorCode & status
)

Creates a RuleBasedNumberFormat that behaves according to the description passed in.

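The formatter uses the specified locale to determine the characters to use when formatting numerals, and to define equivalences for lenient parsing.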

The localizations data provides information about the public rule sets and their localized display names for different locales. The first element in the list is an array of the names of the public rule sets. The first element in this array is the initial default ruleset. The remaining elements in the list are arrays of localizations of the names of the public rule sets. Each of these is one longer than the initial array, with the first String being the ULocale ID, and the remaining Strings being the localizations of the rule set names, in the same order as the initial array. Arrays are NULL-terminated.

Parameters
rules A description of the formatter's desired behavior. See the class documentation for a complete explanation of the description syntax.
localizations A list of localizations for the rule set names in the description. These will be copied by the constructor.
locale A locale that governs which characters are used for formatting values in numerals and which characters are equivalent in lenient parsing.
perror The parse error if an error was encountered.
status The status indicating whether the constructor succeeded.
Stable:
ICU 3.2

◆ RuleBasedNumberFormat() [5/6]

icu::RuleBasedNumberFormat::RuleBasedNumberFormat ( URBNFRuleSetTag  tag,
const Locale & locale,
UErrorCode & status
)

Creates a RuleBasedNumberFormat from a predefined ruleset.

The selector code chooses among four possible predefined formats: spellout, ordinal, duration, and numbering system.

Parameters
tag A selector code specifying which kind of formatter to create for that locale. There are four legal values: URBNF_SPELLOUT, which creates a formatter that spells out a value in words in the desired language; URBNF_ORDINAL, which attaches an ordinal suffix from the desired language to the end of a number (e.g. "123rd"); URBNF_DURATION, which formats a duration in seconds as hours, minutes, and seconds, always rounding down; and URBNF_NUMBERING_SYSTEM, which is used to invoke rules for alternate numbering systems such as the Hebrew numbering system or Roman numerals.
locale The locale for the formatter.
status The status indicating whether the constructor succeeded.
Stable:
ICU 2.0

◆ RuleBasedNumberFormat() [6/6]

icu::RuleBasedNumberFormat::RuleBasedNumberFormat ( const RuleBasedNumberFormat & rhs)

Copy constructor.

Parameters
rhs the object to be copied from.
Stable:
ICU 2.6

◆ ~RuleBasedNumberFormat()

virtual icu::RuleBasedNumberFormat::~RuleBasedNumberFormat ( )
virtual

Release memory allocated for a RuleBasedNumberFormat when you are finished with it.

Stable:
ICU 2.6

Member Function Documentation

◆ adoptDecimalFormatSymbols()

virtual void icu::RuleBasedNumberFormat::adoptDecimalFormatSymbols ( DecimalFormatSymbols * symbolsToAdopt)
virtual

Sets the decimal format symbols, which are generally not changed by the programmer or user.

The formatter takes ownership of symbolsToAdopt; the client must not delete it.

Parameters
symbolsToAdopt DecimalFormatSymbols to be adopted.
Stable:
ICU 49

◆ clone()

virtual RuleBasedNumberFormat* icu::RuleBasedNumberFormat::clone ( ) const
virtual

Clone this object polymorphically.

The caller is responsible for deleting the result when done.

Returns
A copy of the object.
Stable:
ICU 2.6

Implements icu::NumberFormat.

◆ format() [1/7]

virtual UnicodeString& icu::RuleBasedNumberFormat::format ( int32_t  number,
UnicodeString & toAppendTo,
FieldPosition & pos
) const
virtual

Formats the specified 32-bit number using the default ruleset.

Parameters
number The number to format.
toAppendTo the string that will hold the (appended) result
pos the field position
Returns
A textual representation of the number.
Stable:
ICU 2.0

Implements icu::NumberFormat.

◆ format() [2/7]

virtual UnicodeString& icu::RuleBasedNumberFormat::format ( int64_t  number,
UnicodeString & toAppendTo,
FieldPosition & pos
) const
virtual

Formats the specified 64-bit number using the default ruleset.

Parameters
number The number to format.
toAppendTo the string that will hold the (appended) result
pos the field position
Returns
A textual representation of the number.
Stable:
ICU 2.1

Reimplemented from icu::NumberFormat.

◆ format() [3/7]

virtual UnicodeString& icu::RuleBasedNumberFormat::format ( double  number,
UnicodeString & toAppendTo,
FieldPosition & pos
) const
virtual

Formats the specified number using the default ruleset.

Parameters
number The number to format.
toAppendTo the string that will hold the (appended) result
pos the field position
Returns
A textual representation of the number.
Stable:
ICU 2.0

Implements icu::NumberFormat.

◆ format() [4/7]

virtual UnicodeString& icu::RuleBasedNumberFormat::format ( int32_t  number,
const UnicodeString & ruleSetName,
UnicodeString & toAppendTo,
FieldPosition & pos,
UErrorCode & status
) const
virtual

Formats the specified number using the named ruleset.

Parameters
number The number to format.
ruleSetName The name of the rule set to format the number with. This must be the name of a valid public rule set for this formatter.
toAppendTo the string that will hold the (appended) result
pos the field position
status the status
Returns
A textual representation of the number.
Stable:
ICU 2.0

◆ format() [5/7]

virtual UnicodeString& icu::RuleBasedNumberFormat::format ( int64_t  number,
const UnicodeString & ruleSetName,
UnicodeString & toAppendTo,
FieldPosition & pos,
UErrorCode & status
) const
virtual

Formats the specified 64-bit number using the named ruleset.

Parameters
number The number to format.
ruleSetName The name of the rule set to format the number with. This must be the name of a valid public rule set for this formatter.
toAppendTo the string that will hold the (appended) result
pos the field position
status the status
Returns
A textual representation of the number.
Stable:
ICU 2.1

◆ format() [6/7]

virtual UnicodeString& icu::RuleBasedNumberFormat::format ( double  number,
const UnicodeString & ruleSetName,
UnicodeString & toAppendTo,
FieldPosition & pos,
UErrorCode & status
) const
virtual

Formats the specified number using the named ruleset.

Parameters
number The number to format.
ruleSetName The name of the rule set to format the number with. This must be the name of a valid public rule set for this formatter.
toAppendTo the string that will hold the (appended) result
pos the field position
status the status
Returns
A textual representation of the number.
Stable:
ICU 2.0

◆ format() [7/7]

virtual UnicodeString& icu::RuleBasedNumberFormat::format ( const number::impl::DecimalQuantity &  number,
UnicodeString & appendTo,
FieldPosition & pos,
UErrorCode & status
) const
protected virtual

Format a decimal number.

The number is a DecimalQuantity wrapper onto a floating point decimal number. The default implementation in NumberFormat converts the decimal number to a double and formats that. Subclasses of NumberFormat that want to specifically handle big decimal numbers must override this method. Class DecimalFormat does so.

Parameters
number The number, a DecimalQuantity format Decimal Floating Point.
appendTo Output parameter to receive result. Result is appended to existing contents.
pos On input: an alignment field, if desired. On output: the offsets of the alignment field.
status Output param filled with success/failure status.
Returns
Reference to 'appendTo' parameter.
Internal:
Do not use. This API is for internal use only.

Reimplemented from icu::NumberFormat.

◆ getDefaultRuleSetName()

virtual UnicodeString icu::RuleBasedNumberFormat::getDefaultRuleSetName ( ) const
virtual

Return the name of the current default rule set.

If the current rule set is not public, returns a bogus (and empty) UnicodeString.

Returns
the name of the current default rule set
Stable:
ICU 3.0

◆ getDynamicClassID()

virtual UClassID icu::RuleBasedNumberFormat::getDynamicClassID ( void  ) const
virtual

ICU "poor man's RTTI", returns a UClassID for the actual class.

Stable:
ICU 2.8

Implements icu::NumberFormat.

◆ getNumberOfRuleSetDisplayNameLocales()

virtual int32_t icu::RuleBasedNumberFormat::getNumberOfRuleSetDisplayNameLocales ( void  ) const
virtual

Return the number of locales for which we have localized rule set display names.

Returns
the number of locales for which we have localized rule set display names.
Stable:
ICU 3.2

◆ getNumberOfRuleSetNames()

virtual int32_t icu::RuleBasedNumberFormat::getNumberOfRuleSetNames ( ) const
virtual

Return the number of public rule set names.

Returns
the number of public rule set names.
Stable:
ICU 2.0

◆ getRoundingMode()

virtual ERoundingMode icu::RuleBasedNumberFormat::getRoundingMode ( void  ) const
virtual

Get the rounding mode.

Returns
A rounding mode
Stable:
ICU 60

Reimplemented from icu::NumberFormat.

◆ getRules()

virtual UnicodeString icu::RuleBasedNumberFormat::getRules ( ) const
virtual

Returns the rules that were provided to the RuleBasedNumberFormat.

Returns
the rules string that was passed in
Stable:
ICU 2.0

◆ getRuleSetDisplayName() [1/2]

virtual UnicodeString icu::RuleBasedNumberFormat::getRuleSetDisplayName ( int32_t  index,
const Locale & locale = Locale::getDefault()
)
virtual

Return the display name of the rule set at the given index for the provided locale.

Indices follow the same order as those used by getRuleSetName. The locale is matched against the locales for which there is display name data, using normal fallback rules. If no locale matches, the default display name is returned. (The default display names are the internal rule set names minus the leading '%'.)

Parameters
index the index of the rule set
locale the locale (returned by getRuleSetDisplayNameLocale) for which the localized display name is desired
Returns
the display name for the given index, which might be bogus if there is an error
See also
getRuleSetName
Stable:
ICU 3.2

◆ getRuleSetDisplayName() [2/2]

virtual UnicodeString icu::RuleBasedNumberFormat::getRuleSetDisplayName ( const UnicodeString ruleSetName,
const Locale & locale = Locale::getDefault()
)
virtual

Return the rule set display name for the provided rule set and locale.

The locale is matched against the locales for which there is display name data, using normal fallback rules. If no locale matches, the default display name is returned.

Returns
the display name for the rule set
Stable:
ICU 3.2
See also
getRuleSetDisplayName

◆ getRuleSetDisplayNameLocale()

virtual Locale icu::RuleBasedNumberFormat::getRuleSetDisplayNameLocale ( int32_t  index,
UErrorCode & status
) const
virtual

Return the index'th display name locale.

Parameters
index the index of the locale
status set to a failure code when this function fails
Returns
the locale
See also
getNumberOfRuleSetDisplayNameLocales
Stable:
ICU 3.2

◆ getRuleSetName()

virtual UnicodeString icu::RuleBasedNumberFormat::getRuleSetName ( int32_t  index) const
virtual

Return the name of the index'th public ruleSet.

If index is not valid, the function returns null.

Parameters
index the index of the ruleset
Returns
the name of the index'th public ruleSet.
Stable:
ICU 2.0

◆ getStaticClassID()

static UClassID icu::RuleBasedNumberFormat::getStaticClassID ( void  )
static

ICU "poor man's RTTI", returns a UClassID for this class.

Stable:
ICU 2.8

◆ isLenient()

UBool icu::RuleBasedNumberFormat::isLenient ( void  ) const
inline virtual

Returns true if lenient-parse mode is turned on.

Lenient parsing is off by default.

Returns
true if lenient-parse mode is turned on.
See also
setLenient
Stable:
ICU 2.0

Reimplemented from icu::NumberFormat.

Definition at line 1102 of file rbnf.h.

◆ operator=()

RuleBasedNumberFormat& icu::RuleBasedNumberFormat::operator= ( const RuleBasedNumberFormat & rhs)

Assignment operator.

Parameters
rhs the object to be copied from.
Stable:
ICU 2.6

◆ operator==()

virtual UBool icu::RuleBasedNumberFormat::operator== ( const Format & other) const
virtual

Return true if the given Format objects are semantically equal.

Objects of different subclasses are considered unequal.

Parameters
other the object to be compared with.
Returns
true if the given Format objects are semantically equal.
Stable:
ICU 2.6

Reimplemented from icu::NumberFormat.

◆ parse()

virtual void icu::RuleBasedNumberFormat::parse ( const UnicodeString & text,
Formattable & result,
ParsePosition & parsePosition
) const
virtual

Parses the specified string, beginning at the specified position, according to this formatter's rules.

This will match the string against all of the formatter's public rule sets and return the value corresponding to the longest parseable substring. This function's behavior is affected by the lenient parse mode.

Parameters
text The string to parse
result the result of the parse, either a double or a long.
parsePosition On entry, contains the position of the first character in "text" to examine. On exit, has been updated to contain the position of the first character in "text" that wasn't consumed by the parse.
See also
setLenient
Stable:
ICU 2.0

Implements icu::NumberFormat.

◆ setContext()

virtual void icu::RuleBasedNumberFormat::setContext ( UDisplayContext  value,
UErrorCode & status
)
virtual

Set a particular UDisplayContext value in the formatter, such as UDISPCTX_CAPITALIZATION_FOR_STANDALONE.

Note: For getContext, see NumberFormat.

Parameters
value The UDisplayContext value to set.
status Input/output status. If at entry this indicates a failure status, the function will do nothing; otherwise this will be updated with any new status from the function.
Stable:
ICU 53

Reimplemented from icu::NumberFormat.

◆ setDecimalFormatSymbols()

virtual void icu::RuleBasedNumberFormat::setDecimalFormatSymbols ( const DecimalFormatSymbols & symbols)
virtual

Sets the decimal format symbols, which are generally not changed by the programmer or user.

A clone of the symbols is created and the symbols object is not adopted; the client is still responsible for deleting it.

Parameters
symbols DecimalFormatSymbols.
Stable:
ICU 49

◆ setDefaultRuleSet()

virtual void icu::RuleBasedNumberFormat::setDefaultRuleSet ( const UnicodeString & ruleSetName,
UErrorCode & status
)
virtual

Override the default rule set to use.

If ruleSetName is null, reset to the initial default rule set. If the rule set is not a public rule set name, U_ILLEGAL_ARGUMENT_ERROR is returned in status.

Parameters
ruleSetName the name of the rule set, or null to reset the initial default.
status set to failure code when a problem occurs.
Stable:
ICU 2.6

◆ setLenient()

virtual void icu::RuleBasedNumberFormat::setLenient ( UBool  enabled)
virtual

Turns lenient parse mode on and off.

When in lenient parse mode, the formatter uses a Collator for parsing the text. Only primary differences are treated as significant. This means that case differences, accent differences, alternate spellings of the same letter (e.g., ae and a-umlaut in German), ignorable characters, etc. are ignored in matching the text. In many cases, numerals will be accepted in place of words or phrases as well.

For example, all of the following will correctly parse as 255 in English in lenient-parse mode:
"two hundred fifty-five"
"two hundred fifty five"
"TWO HUNDRED FIFTY-FIVE"
"twohundredfiftyfive"
"2 hundred fifty-5"

The Collator used is determined by the locale that was passed to this object on construction. The description passed to this object on construction may supply additional collation rules that are appended to the end of the default collator for the locale, enabling additional equivalences (such as adding more ignorable characters or permitting spelled-out versions of symbols; see the demo program for examples).

It's important to emphasize that even strict parsing is relatively lenient: it will accept some text that it won't produce as output. In English, for example, it will correctly parse "two hundred zero" and "fifteen hundred".

Parameters
enabled If true, turns lenient-parse mode on; if false, turns it off.
See also
RuleBasedCollator
Stable:
ICU 2.0

Reimplemented from icu::NumberFormat.

◆ setRoundingMode()

virtual void icu::RuleBasedNumberFormat::setRoundingMode ( ERoundingMode  roundingMode)
virtual

Set the rounding mode.

Parameters
roundingMode A rounding mode
Stable:
ICU 60

The documentation for this class was generated from the following file: