ICU 65.1
icu::CanonicalIterator Class Reference (final)

This class allows one to iterate through all the strings that are canonically equivalent to a given string.

#include <caniter.h>

Inheritance: icu::CanonicalIterator inherits icu::UObject, which inherits icu::UMemory.

Public Member Functions

 CanonicalIterator (const UnicodeString &source, UErrorCode &status)
 Construct a CanonicalIterator object.

virtual ~CanonicalIterator ()
 Destructor. Cleans pieces.

UnicodeString getSource ()
 Gets the NFD form of the current source we are iterating over.

void reset ()
 Resets the iterator so that one can start again from the beginning.

UnicodeString next ()
 Get the next canonically equivalent string.

void setSource (const UnicodeString &newSource, UErrorCode &status)
 Set a new source for this iterator.

virtual UClassID getDynamicClassID () const
 ICU "poor man's RTTI", returns a UClassID for the actual class.
 
Public Member Functions inherited from icu::UObject
virtual ~UObject ()
 Destructor.
 

Static Public Member Functions

static void permute (UnicodeString &source, UBool skipZeros, Hashtable *result, UErrorCode &status)
 Dumb recursive implementation of permutation.

static UClassID getStaticClassID ()
 ICU "poor man's RTTI", returns a UClassID for this class.
 

Detailed Description

This class allows one to iterate through all the strings that are canonically equivalent to a given string.

For example, here are some sample results:

Results for: {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}

1: \u0041\u030A\u0064\u0307\u0327 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}
2: \u0041\u030A\u0064\u0327\u0307 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D}{COMBINING CEDILLA}{COMBINING DOT ABOVE}
3: \u0041\u030A\u1E0B\u0327 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D WITH DOT ABOVE}{COMBINING CEDILLA}
4: \u0041\u030A\u1E11\u0307 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D WITH CEDILLA}{COMBINING DOT ABOVE}
5: \u00C5\u0064\u0307\u0327 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}
6: \u00C5\u0064\u0327\u0307 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D}{COMBINING CEDILLA}{COMBINING DOT ABOVE}
7: \u00C5\u1E0B\u0327 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D WITH DOT ABOVE}{COMBINING CEDILLA}
8: \u00C5\u1E11\u0307 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D WITH CEDILLA}{COMBINING DOT ABOVE}
9: \u212B\u0064\u0307\u0327 = {ANGSTROM SIGN}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}
10: \u212B\u0064\u0327\u0307 = {ANGSTROM SIGN}{LATIN SMALL LETTER D}{COMBINING CEDILLA}{COMBINING DOT ABOVE}
11: \u212B\u1E0B\u0327 = {ANGSTROM SIGN}{LATIN SMALL LETTER D WITH DOT ABOVE}{COMBINING CEDILLA}
12: \u212B\u1E11\u0307 = {ANGSTROM SIGN}{LATIN SMALL LETTER D WITH CEDILLA}{COMBINING DOT ABOVE}

Note: The code is intended for use with small strings and is not suitable for larger ones, since it has not been optimized for that situation.
Note: CanonicalIterator is not intended to be subclassed.
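
The twelve sample results can be read as a product: three canonically equivalent spellings of the first character combined with four spellings of the second cluster. The sketch below illustrates only that combinatorial structure in plain standard-library C++. The names `combineSegments` and `sampleEquivalents` and the hand-copied variant lists are assumptions made for illustration; this is not presented as how ICU computes its results.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of ICU): builds every concatenation that
// picks exactly one variant from each segment, keeping segment order.
std::vector<std::u16string> combineSegments(
        const std::vector<std::vector<std::u16string>> &segments) {
    std::vector<std::u16string> results{std::u16string()};
    for (const auto &variants : segments) {
        std::vector<std::u16string> next;
        next.reserve(results.size() * variants.size());
        for (const auto &prefix : results) {
            for (const auto &variant : variants) {
                next.push_back(prefix + variant);
            }
        }
        results = std::move(next);
    }
    return results;
}

// Variant lists copied by hand from the sample results in this section.
std::vector<std::u16string> sampleEquivalents() {
    const std::vector<std::u16string> first{
        u"A\u030A", u"\u00C5", u"\u212B"};
    const std::vector<std::u16string> second{
        u"d\u0307\u0327", u"d\u0327\u0307", u"\u1E0B\u0327", u"\u1E11\u0307"};
    std::vector<std::vector<std::u16string>> segments;
    segments.push_back(first);
    segments.push_back(second);
    return combineSegments(segments);
}
```

Because the count is a product of the per-segment variant counts, it grows quickly with the number of combining sequences, which is consistent with the note that the class is intended for small strings.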

Author
M. Davis
C++ port by V. Weinstein
Stable:
ICU 2.4

Definition at line 76 of file caniter.h.

Constructor & Destructor Documentation

◆ CanonicalIterator()

icu::CanonicalIterator::CanonicalIterator ( const UnicodeString & source,
UErrorCode & status
)

Construct a CanonicalIterator object.

Parameters
source  string to get results for
status  Fill-in parameter which receives the status of this operation.
Stable:
ICU 2.4

◆ ~CanonicalIterator()

virtual icu::CanonicalIterator::~CanonicalIterator ( )

Destructor. Cleans pieces.

Stable:
ICU 2.4

Member Function Documentation

◆ getDynamicClassID()

virtual UClassID icu::CanonicalIterator::getDynamicClassID ( ) const

ICU "poor man's RTTI", returns a UClassID for the actual class.

Stable:
ICU 2.2

Reimplemented from icu::UObject.

◆ getSource()

UnicodeString icu::CanonicalIterator::getSource ( )

Gets the NFD form of the current source we are iterating over.

Returns
the source string. Note that this is the NFD form of the source, not necessarily the string originally passed in.
Stable:
ICU 2.4

◆ getStaticClassID()

static UClassID icu::CanonicalIterator::getStaticClassID ( )

ICU "poor man's RTTI", returns a UClassID for this class.

Stable:
ICU 2.2

◆ next()

UnicodeString icu::CanonicalIterator::next ( )

Get the next canonically equivalent string.


Warning: The strings are not guaranteed to be in any particular order.

Returns
the next string that is canonically equivalent. A bogus string is returned when the iteration is done.
Stable:
ICU 2.4

◆ permute()

static void icu::CanonicalIterator::permute ( UnicodeString & source,
UBool skipZeros,
Hashtable * result,
UErrorCode & status
)

Dumb recursive implementation of permutation.

TODO: optimize

Parameters
source  the string to find permutations for
skipZeros  determines whether to skip zeros
result  the results, collected in a set
status  Fill-in parameter which receives the status of this operation.
Internal:
Do not use. This API is for internal use only.
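
For readers unfamiliar with the general technique, recursive permutation into a set can be sketched with the standard library alone. This is an illustration of the idea only, not ICU's code: the name `permuteSketch` is hypothetical, it works on UTF-16 code units of a `std::u16string` rather than code points, and it omits the `skipZeros` option and the error-status parameter.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for the general idea: pick each element in turn as
// the first one, permute the remainder recursively, and collect the distinct
// results in a set so that repeated elements do not produce duplicates.
void permuteSketch(const std::u16string &source, std::set<std::u16string> &result) {
    if (source.size() <= 1) {
        result.insert(source);
        return;
    }
    for (std::size_t i = 0; i < source.size(); ++i) {
        // Everything except the element at position i.
        const std::u16string rest = source.substr(0, i) + source.substr(i + 1);
        std::set<std::u16string> sub;
        permuteSketch(rest, sub);
        for (const auto &tail : sub) {
            result.insert(source[i] + tail);
        }
    }
}
```

The number of permutations grows factorially with the input length, so an approach like this is only practical for very short inputs.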

◆ reset()

void icu::CanonicalIterator::reset ( )

Resets the iterator so that one can start again from the beginning.

Stable:
ICU 2.4

◆ setSource()

void icu::CanonicalIterator::setSource ( const UnicodeString & newSource,
UErrorCode & status
)

Set a new source for this iterator.

Allows object reuse.

Parameters
newSource  the source string to iterate against. This allows the same iterator to be used while changing the source string, saving object creation.
status  Fill-in parameter which receives the status of this operation.
Stable:
ICU 2.4

The documentation for this class was generated from the following file: