sun.io
Class ByteToCharConverter

java.lang.Object
  extended by sun.io.ByteToCharConverter
Direct Known Subclasses:
ByteToCharASCII, ByteToCharCp33722, ByteToCharCp964, ByteToCharDBCS_ASCII, ByteToCharDBCS_EBCDIC, ByteToCharDoubleByte, ByteToCharEUC, ByteToCharEUC_TW, ByteToCharISO2022, ByteToCharISO8859_1, ByteToCharJISAutoDetect, ByteToCharSingleByte, ByteToCharUnicode, ByteToCharUTF8

public abstract class ByteToCharConverter
extends Object

An abstract base class for subclasses which convert character data in an external encoding into Unicode characters.


Field Summary
protected  int badInputLength
           
protected  int byteOff
           
protected  int charOff
           
protected  char[] subChars
           
protected  boolean subMode
           
 
Constructor Summary
ByteToCharConverter()
           
 
Method Summary
abstract  int convert(byte[] input, int inStart, int inEnd, char[] output, int outStart, int outEnd)
          Converts an array of bytes containing characters in an external encoding into an array of Unicode characters.
 char[] convertAll(byte[] input)
          Converts an array of bytes containing characters in an external encoding into an array of Unicode characters.
abstract  int flush(char[] output, int outStart, int outEnd)
          Writes any remaining output to the output buffer and resets the converter to its initial state.
 int getBadInputLength()
          Returns the length, in bytes, of the input which caused a MalformedInputException.
abstract  String getCharacterEncoding()
          Returns the character set id for the conversion.
static ByteToCharConverter getConverter(String encoding)
          Returns an instance of the ByteToCharConverter subclass appropriate for the given encoding.
static ByteToCharConverter getDefault()
          Creates an instance of the default ByteToCharConverter subclass.
 int getMaxCharsPerByte()
          Returns the maximum number of characters that a single input byte can produce.
 int nextByteIndex()
          Returns the index of the byte just past the last byte successfully converted by the previous call to convert.
 int nextCharIndex()
          Returns the index of the character just past the last character written by the previous call to convert.
abstract  void reset()
          Resets converter to its initial state.
 void setSubstitutionChars(char[] c)
          Sets the substitution characters to use.
 void setSubstitutionMode(boolean doSub)
          Sets converter into substitution mode.
 String toString()
          Returns a string representation of the character conversion.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

subMode

protected boolean subMode

subChars

protected char[] subChars

charOff

protected int charOff

byteOff

protected int byteOff

badInputLength

protected int badInputLength
Constructor Detail

ByteToCharConverter

public ByteToCharConverter()
Method Detail

getDefault

public static ByteToCharConverter getDefault()
Creates an instance of the default ByteToCharConverter subclass.


getConverter

public static ByteToCharConverter getConverter(String encoding)
                                        throws UnsupportedEncodingException
Returns an instance of the ByteToCharConverter subclass appropriate for the given encoding.

Parameters:
encoding - the name of the character encoding to convert from
Throws:
UnsupportedEncodingException

getCharacterEncoding

public abstract String getCharacterEncoding()
Returns the character set id for the conversion.


convert

public abstract int convert(byte[] input,
                            int inStart,
                            int inEnd,
                            char[] output,
                            int outStart,
                            int outEnd)
                     throws MalformedInputException,
                            UnknownCharacterException,
                            ConversionBufferFullException
Converts an array of bytes containing characters in an external encoding into an array of Unicode characters. This method allows buffer-by-buffer conversion of a data stream. The state of the conversion is saved between calls to convert; among other things, this means a multibyte input sequence can be split between calls. If a call to convert results in an exception, the conversion may be continued by calling convert again with suitably modified parameters. All conversions should be finished with a call to the flush method.

Parameters:
input - byte array containing text to be converted.
inStart - begin conversion at this offset in input array.
inEnd - stop conversion at this offset in input array (exclusive).
output - character array to receive conversion result.
outStart - start writing to output array at this offset.
outEnd - stop writing to output array at this offset (exclusive).
Returns:
the number of characters written to output.
Throws:
MalformedInputException - if the input buffer contains any sequence of bytes that is illegal for the input character set.
UnknownCharacterException - for any character that cannot be converted to Unicode. Thrown only when the converter is not in substitution mode.
ConversionBufferFullException - if output array is filled prior to converting all the input.
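
The buffer-by-buffer pattern described for convert can be sketched as follows. Because sun.io is an internal package that may be absent from the JDK you are running, this sketch uses java.nio.charset.CharsetDecoder as a stand-in rather than ByteToCharConverter itself. The names IncrementalDecodeSketch, decodeChunks, decodeStep and drain are hypothetical. One difference worth noting: CharsetDecoder leaves the bytes of an incomplete trailing sequence in the input buffer instead of holding them internally, so the sketch carries those bytes over to the next call.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: java.nio stand-in for the convert/flush cycle.
final class IncrementalDecodeSketch {

    // Decodes UTF-8 input that arrives in chunks; a multibyte
    // sequence may be split between two consecutive chunks.
    static String decodeChunks(byte[][] chunks) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        CharBuffer out = CharBuffer.allocate(32);
        StringBuilder result = new StringBuilder();
        ByteBuffer leftover = ByteBuffer.allocate(0);
        try {
            for (byte[] chunk : chunks) {
                // Prepend the undecoded tail of the previous chunk.
                ByteBuffer in = ByteBuffer.allocate(leftover.remaining() + chunk.length);
                in.put(leftover).put(chunk);
                in.flip();
                decodeStep(decoder, in, out, result, false);
                leftover = in; // whatever remains is an incomplete sequence
            }
            // Signal end of input, then flush, as the text above recommends.
            decodeStep(decoder, leftover, out, result, true);
            CoderResult cr;
            do {
                cr = decoder.flush(out);
                drain(out, result);
            } while (cr.isOverflow());
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("input is not valid UTF-8", e);
        }
        return result.toString();
    }

    // Runs decode until the input is consumed, emptying the output buffer whenever it fills.
    private static void decodeStep(CharsetDecoder decoder, ByteBuffer in, CharBuffer out,
                                   StringBuilder sink, boolean endOfInput)
            throws CharacterCodingException {
        while (true) {
            CoderResult cr = decoder.decode(in, out, endOfInput);
            drain(out, sink);
            if (cr.isUnderflow()) {
                return;
            }
            if (cr.isError()) {
                cr.throwException();
            }
            // Otherwise the output buffer overflowed: loop and continue.
        }
    }

    // Moves the decoded characters into the sink and empties the buffer.
    private static void drain(CharBuffer out, StringBuilder sink) {
        out.flip();
        sink.append(out);
        out.clear();
    }

    public static void main(String[] args) {
        // The two-byte sequence C3 A9 is split across the two chunks.
        byte[][] chunks = { {0x68, (byte) 0xC3}, {(byte) 0xA9, 0x21} };
        System.out.println(decodeChunks(chunks));
    }
}
```

With ByteToCharConverter the same loop would call convert once per buffer and finish with flush, using nextByteIndex and nextCharIndex to find where the previous call stopped.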

convertAll

public char[] convertAll(byte[] input)
                  throws MalformedInputException
Converts an array of bytes containing characters in an external encoding into an array of Unicode characters. Unlike convert, this method does not do incremental conversion. It assumes that the given input array contains all the characters to be converted. The state of the converter is reset at the beginning of this method and is left in the reset state on successful termination. The converter is not reset if an exception is thrown. This allows the caller to determine where the bad input was encountered by calling nextByteIndex.

This method uses substitution mode when performing the conversion. The method setSubstitutionChars may be used to specify which characters are substituted. Although substitution mode is used, the converter's own substitution mode setting is unchanged when this method returns.

Parameters:
input - byte array containing characters in the external encoding to be converted.
Returns:
an array of chars containing the converted characters.
Throws:
MalformedInputException - if the input buffer contains any sequence of bytes that is illegal in the input character encoding. After this exception is thrown, the method nextByteIndex can be called to obtain the index of the first invalid input byte and getBadInputLength can be called to determine the length of the invalid input.
See Also:
nextByteIndex(), setSubstitutionMode(boolean), setSubstitutionChars(char[]), getBadInputLength()
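
As a rough illustration of one-shot conversion with error location, the sketch below decodes a whole array and, on malformed input, reports the offset and length of the offending bytes. It is written against java.nio.charset.CharsetDecoder as a stand-in, since sun.io is internal and may not be available in the JDK you are running. Here the input buffer position and CoderResult.length() play the roles that nextByteIndex and getBadInputLength play in this class. DecodeAllSketch and decodeAllOrDescribe are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: java.nio stand-in for convertAll plus error lookup.
final class DecodeAllSketch {

    // Returns the decoded text, or a description of where malformed input was found.
    static String decodeAllOrDescribe(byte[] input) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)        // malformed input is an error
                .onUnmappableCharacter(CodingErrorAction.REPLACE)  // unknown characters are substituted
                .replaceWith("?");
        ByteBuffer in = ByteBuffer.wrap(input);
        // Size the output for the worst case so that it cannot overflow.
        int capacity = (int) Math.ceil(input.length * (double) decoder.maxCharsPerByte());
        CharBuffer out = CharBuffer.allocate(capacity);

        CoderResult cr = decoder.decode(in, out, true);
        if (cr.isUnderflow()) {
            cr = decoder.flush(out);
        }
        if (cr.isMalformed()) {
            // The input buffer is left positioned at the start of the bad sequence.
            return "malformed input at byte " + in.position() + ", length " + cr.length();
        }
        if (!cr.isUnderflow()) {
            throw new IllegalStateException("unexpected result: " + cr);
        }
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeAllOrDescribe(new byte[] {0x61, 0x62, 0x63}));
        System.out.println(decodeAllOrDescribe(new byte[] {0x61, 0x62, (byte) 0xFF, 0x63}));
    }
}
```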

flush

public abstract int flush(char[] output,
                          int outStart,
                          int outEnd)
                   throws MalformedInputException,
                          ConversionBufferFullException
Writes any remaining output to the output buffer and resets the converter to its initial state.

Parameters:
output - char array to receive flushed output.
outStart - start writing to output array at this offset.
outEnd - stop writing to output array at this offset (exclusive).
Throws:
MalformedInputException - if the output to be flushed contained a partial or invalid multibyte character sequence. flush will write what it can to the output buffer and reset the converter before throwing this exception. An additional call to flush is not required.
ConversionBufferFullException - if output array is filled before all the output can be flushed. flush will write what it can to the output buffer and remember its state. An additional call to flush with a new output buffer will conclude the operation.

reset

public abstract void reset()
Resets converter to its initial state.


getMaxCharsPerByte

public int getMaxCharsPerByte()
Returns the maximum number of characters that a single input byte can produce. Useful for calculating the maximum output buffer size needed for a particular input buffer.
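
The buffer-size calculation this method supports is a single multiplication. The helper below is a minimal sketch of it: OutputSizeSketch and maxOutputChars are hypothetical names, and the second argument stands in for the value a converter would return from getMaxCharsPerByte().

```java
// Hypothetical sketch: worst-case output size for a given input length.
final class OutputSizeSketch {

    // inputBytes: number of bytes about to be converted.
    // maxCharsPerByte: stand-in for the value returned by getMaxCharsPerByte().
    static int maxOutputChars(int inputBytes, int maxCharsPerByte) {
        if (inputBytes < 0 || maxCharsPerByte < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(inputBytes, maxCharsPerByte);
    }

    public static void main(String[] args) {
        char[] output = new char[maxOutputChars(1024, 1)];
        System.out.println(output.length);
    }
}
```

An output array sized this way should be large enough that convert does not throw ConversionBufferFullException for that input, assuming the converter's reported maximum is accurate.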


getBadInputLength

public int getBadInputLength()
Returns the length, in bytes, of the input which caused a MalformedInputException. Always refers to the last MalformedInputException thrown by the converter. If none has ever been thrown, returns 0.


nextCharIndex

public int nextCharIndex()
Returns the index of the character just past the last character written by the previous call to convert.


nextByteIndex

public int nextByteIndex()
Returns the index of the byte just past the last byte successfully converted by the previous call to convert.


setSubstitutionMode

public void setSubstitutionMode(boolean doSub)
Sets converter into substitution mode. In substitution mode, the converter will replace untranslatable characters in the source encoding with the substitution characters set by setSubstitutionChars. When not in substitution mode, the converter will throw an UnknownCharacterException when it encounters untranslatable input.

Parameters:
doSub - if true, enable substitution mode.
See Also:
setSubstitutionChars(char[])
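
The difference between substituting and rejecting untranslatable input can be sketched as below. This is an analogy rather than this class's API: it uses java.nio.charset.CharsetDecoder, where CodingErrorAction.REPLACE and replaceWith roughly correspond to setSubstitutionMode(true) and setSubstitutionChars, and CodingErrorAction.REPORT corresponds to leaving substitution mode off. Note that the US-ASCII decoder used here classifies a byte above 0x7F as malformed rather than unmappable, so the sketch applies the same action to both cases. SubstitutionSketch and its decode method are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: substitution versus rejection of bad input.
final class SubstitutionSketch {

    // Decodes US-ASCII; bad bytes are replaced with "?" or rejected, depending on the flag.
    static String decode(byte[] input, boolean substitute) {
        CodingErrorAction action = substitute
                ? CodingErrorAction.REPLACE
                : CodingErrorAction.REPORT;
        CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action)
                .replaceWith("?"); // the substitution text, used only with REPLACE
        try {
            return decoder.decode(ByteBuffer.wrap(input)).toString();
        } catch (CharacterCodingException e) {
            return "rejected: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        byte[] input = {0x63, 0x61, 0x66, (byte) 0xE9}; // "caf" followed by a non-ASCII byte
        System.out.println(decode(input, true));
        System.out.println(decode(input, false));
    }
}
```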

setSubstitutionChars

public void setSubstitutionChars(char[] c)
                          throws IllegalArgumentException
Sets the substitution characters to use.

Parameters:
c - the substitution characters
Throws:
IllegalArgumentException

toString

public String toString()
Returns a string representation of the character conversion.

Overrides:
toString in class Object
Returns:
a string representation of the object.