sun.io
Class CharToByteConverter

java.lang.Object
  extended by sun.io.CharToByteConverter
Direct Known Subclasses:
CharToByteASCII, CharToByteCp933, CharToByteCp949, CharToByteCp949C, CharToByteCp970, CharToByteDBCS_ASCII, CharToByteDBCS_EBCDIC, CharToByteDoubleByte, CharToByteEUC, CharToByteEUC_TW, CharToByteISO2022, CharToByteISO8859_1, sun.io.CharToByteSingleByte, CharToByteUnicode, CharToByteUTF8

public abstract class CharToByteConverter
extends Object

An abstract base class for subclasses which convert Unicode characters into an external encoding.


Field Summary
protected  int badInputLength
          Length of bad input that caused conversion to stop.
protected  int byteOff
          Offset of next byte to be output.
protected  int charOff
          Offset of next character to be converted.
protected  byte[] subBytes
          Bytes to substitute for unmappable input.
protected  boolean subMode
          Substitution mode flag.
 
Constructor Summary
CharToByteConverter()
           
 
Method Summary
 boolean canConvert(char c)
          Returns true if the given character can be converted to the target character encoding.
abstract  int convert(char[] input, int inStart, int inEnd, byte[] output, int outStart, int outEnd)
          Converts an array of Unicode characters into an array of bytes in the target character encoding.
 byte[] convertAll(char[] input)
          Converts an array of Unicode characters into an array of bytes in the target character encoding.
 int convertAny(char[] input, int inStart, int inEnd, byte[] output, int outStart, int outEnd)
           
abstract  int flush(byte[] output, int outStart, int outEnd)
          Writes any remaining output to the output buffer and resets the converter to its initial state.
 int flushAny(byte[] output, int outStart, int outEnd)
          Writes any remaining output to the output buffer and resets the converter to its initial state.
 int getBadInputLength()
          Returns the length, in chars, of the input which caused a MalformedInputException.
abstract  String getCharacterEncoding()
          Returns the character set id for the conversion.
static CharToByteConverter getConverter(String encoding)
          Returns appropriate CharToByteConverter subclass instance.
static CharToByteConverter getDefault()
          Creates an instance of the default CharToByteConverter subclass.
abstract  int getMaxBytesPerChar()
          Returns the maximum number of bytes needed to convert a char.
 int nextByteIndex()
          Returns the index of the byte just past the last byte written by the previous call to convert.
 int nextCharIndex()
          Returns the index of the character just past the last character successfully converted by the previous call to convert.
abstract  void reset()
          Resets converter to its initial state.
 void setSubstitutionBytes(byte[] newSubBytes)
          Sets the substitution bytes to use when the converter is in substitution mode.
 void setSubstitutionMode(boolean doSub)
          Sets converter into substitution mode.
 String toString()
          Returns a string representation of the class.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

subMode

protected boolean subMode
Substitution mode flag.


subBytes

protected byte[] subBytes
Bytes to substitute for unmappable input.


charOff

protected int charOff
Offset of next character to be converted.


byteOff

protected int byteOff
Offset of next byte to be output.


badInputLength

protected int badInputLength
Length of bad input that caused conversion to stop.

Constructor Detail

CharToByteConverter

public CharToByteConverter()
Method Detail

getDefault

public static CharToByteConverter getDefault()
Creates an instance of the default CharToByteConverter subclass.


getConverter

public static CharToByteConverter getConverter(String encoding)
                                        throws UnsupportedEncodingException
Returns appropriate CharToByteConverter subclass instance.

Parameters:
encoding - name of the target character encoding.
Throws:
UnsupportedEncodingException
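
Example (sketch): sun.io is an internal package and is not present in recent JDKs, so the following illustrates the same lookup-by-name pattern with the public java.nio.charset API rather than with getConverter itself. The class and method names (EncoderLookupSketch, encoderFor) are hypothetical, and the nio lookup signals an unknown name with an unchecked exception instead of UnsupportedEncodingException.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper, not part of sun.io.
class EncoderLookupSketch {
    // Returns an encoder for the named charset, or null when the name is
    // syntactically illegal or no such charset is available.
    static CharsetEncoder encoderFor(String encoding) {
        try {
            return Charset.forName(encoding).newEncoder();
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return null;
        }
    }
}
```

A caller would typically check for null (or let the exception propagate) before converting any data.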

getCharacterEncoding

public abstract String getCharacterEncoding()
Returns the character set id for the conversion.


convert

public abstract int convert(char[] input,
                            int inStart,
                            int inEnd,
                            byte[] output,
                            int outStart,
                            int outEnd)
                     throws MalformedInputException,
                            UnknownCharacterException,
                            ConversionBufferFullException
Converts an array of Unicode characters into an array of bytes in the target character encoding. This method allows buffer-by-buffer conversion of a data stream. The state of the conversion is saved between calls to convert. If a call to convert results in an exception, the conversion may be continued by calling convert again with suitably modified parameters. Every conversion should be finished with a call to the flush method.

Parameters:
input - array containing Unicode characters to be converted.
inStart - begin conversion at this offset in input array.
inEnd - stop conversion at this offset in input array (exclusive).
output - byte array to receive conversion result.
outStart - start writing to output array at this offset.
outEnd - stop writing to output array at this offset (exclusive).
Returns:
the number of bytes written to output.
Throws:
MalformedInputException - if the input buffer contains any sequence of chars that is illegal in Unicode (principally unpaired surrogates and U+FFFF or U+FFFE). After this exception is thrown, the method nextCharIndex can be called to obtain the index of the first invalid input character. The MalformedInputException can be queried for the length of the invalid input.
UnknownCharacterException - for any character that cannot be converted to the external character encoding. Thrown only when converter is not in substitution mode.
ConversionBufferFullException - if output array is filled prior to converting all the input.
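
Example (sketch): the convert-then-flush protocol described above, shown with the public java.nio.charset.CharsetEncoder as a stand-in because sun.io is not present in recent JDKs. In this analogue an overflow result plays the role of ConversionBufferFullException: the caller drains the output buffer and calls the encoder again. ChunkedEncodeSketch, encodeUtf8, and drain are hypothetical names, and the choice of UTF-8 with replacement on errors is an assumption made only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of sun.io.
class ChunkedEncodeSketch {
    // Encodes input to UTF-8 through a small, reused output buffer.
    static byte[] encodeUtf8(char[] input, int outCapacity) {
        if (outCapacity < 4) {
            // A surrogate pair needs 4 bytes in UTF-8; a smaller buffer could never make progress.
            throw new IllegalArgumentException("outCapacity must be at least 4");
        }
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        CharBuffer in = CharBuffer.wrap(input);
        ByteBuffer out = ByteBuffer.allocate(outCapacity);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        // Convert: repeat while the output buffer keeps filling up.
        CoderResult result;
        do {
            result = enc.encode(in, out, true);
            drain(out, sink);
        } while (result.isOverflow());

        // Flush: always finish the operation, again tolerating a full buffer.
        do {
            result = enc.flush(out);
            drain(out, sink);
        } while (result.isOverflow());

        return sink.toByteArray();
    }

    // Copies the bytes produced so far into the sink and empties the buffer.
    private static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        sink.write(out.array(), out.arrayOffset() + out.position(), out.remaining());
        out.clear();
    }
}
```

The result is the same as a one-shot conversion of the whole input; only the size of the intermediate buffer differs.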

convertAny

public int convertAny(char[] input,
                      int inStart,
                      int inEnd,
                      byte[] output,
                      int outStart,
                      int outEnd)
               throws ConversionBufferFullException
Throws:
ConversionBufferFullException

convertAll

public byte[] convertAll(char[] input)
                  throws MalformedInputException
Converts an array of Unicode characters into an array of bytes in the target character encoding. Unlike convert, this method does not do incremental conversion. It assumes that the given input array contains all the characters to be converted. The state of the converter is reset at the beginning of this method and is left in the reset state on successful termination. The converter is not reset if an exception is thrown. This allows the caller to determine where the bad input was encountered by calling nextCharIndex.

This method uses substitution mode when performing the conversion. The method setSubstitutionBytes may be used to determine what bytes are substituted. Even though substitution mode is used, the state of the converter's substitution mode is not changed at the end of this method.

Parameters:
input - array containing Unicode characters to be converted.
Returns:
an array of bytes containing the converted characters.
Throws:
MalformedInputException - if the input buffer contains any sequence of chars that is illegal in Unicode (principally unpaired surrogates and U+FFFF or U+FFFE). After this exception is thrown, the method nextCharIndex can be called to obtain the index of the first invalid input character and getBadInputLength can be called to determine the length of the invalid input.
See Also:
nextCharIndex(), setSubstitutionMode(boolean), setSubstitutionBytes(byte[]), getBadInputLength()
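
Example (sketch): a one-shot conversion that substitutes for unmappable characters but still reports malformed input, which is the combination convertAll describes. It uses the public java.nio.charset API as a stand-in for sun.io, which is not present in recent JDKs. OneShotEncodeSketch and encodeAllAscii are hypothetical names, US-ASCII is an arbitrary choice of target encoding, and wrapping the failure in IllegalArgumentException is an assumption of this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of sun.io.
class OneShotEncodeSketch {
    // Encodes the whole array in one call. Unmappable characters are replaced
    // by the encoder's default replacement ('?'); malformed input is rejected.
    static byte[] encodeAllAscii(char[] input) {
        CharsetEncoder enc = StandardCharsets.US_ASCII.newEncoder()
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .onMalformedInput(CodingErrorAction.REPORT);
        try {
            ByteBuffer encoded = enc.encode(CharBuffer.wrap(input));
            byte[] result = new byte[encoded.remaining()];
            encoded.get(result);
            return result;
        } catch (CharacterCodingException e) {
            // Reached for malformed input such as an unpaired surrogate.
            throw new IllegalArgumentException("malformed input", e);
        }
    }
}
```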

flush

public abstract int flush(byte[] output,
                          int outStart,
                          int outEnd)
                   throws MalformedInputException,
                          ConversionBufferFullException
Writes any remaining output to the output buffer and resets the converter to its initial state.

Parameters:
output - byte array to receive flushed output.
outStart - start writing to output array at this offset.
outEnd - stop writing to output array at this offset (exclusive).
Throws:
MalformedInputException - if the output to be flushed contained a partial or invalid multibyte character sequence. Will occur if the input buffer on the last call to convert ended with the first character of a surrogate pair. flush will write what it can to the output buffer and reset the converter before throwing this exception. An additional call to flush is not required.
ConversionBufferFullException - if output array is filled before all the output can be flushed. flush will write what it can to the output buffer and remember its state. An additional call to flush with a new output buffer will conclude the operation.

flushAny

public int flushAny(byte[] output,
                    int outStart,
                    int outEnd)
             throws ConversionBufferFullException
Writes any remaining output to the output buffer and resets the converter to its initial state. May only be called when substitution mode is turned on, and never complains about malformed input (always substitutes).

Parameters:
output - byte array to receive flushed output.
outStart - start writing to output array at this offset.
outEnd - stop writing to output array at this offset (exclusive).
Returns:
number of bytes written into output.
Throws:
ConversionBufferFullException - if output array is filled before all the output can be flushed. flushAny will write what it can to the output buffer and remember its state. An additional call to flushAny with a new output buffer will conclude the operation.

reset

public abstract void reset()
Resets converter to its initial state.


canConvert

public boolean canConvert(char c)
Returns true if the given character can be converted to the target character encoding.

Parameters:
c - character to test
Returns:
true if given character is translatable, false otherwise.
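
Example (sketch): the same kind of per-character check, written against the public java.nio.charset.CharsetEncoder.canEncode method because sun.io is not present in recent JDKs. CanEncodeSketch and isRepresentable are hypothetical names.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of sun.io.
class CanEncodeSketch {
    // Returns true if the single char c can be encoded in the named charset.
    static boolean isRepresentable(String charsetName, char c) {
        return Charset.forName(charsetName).newEncoder().canEncode(c);
    }
}
```

For instance, U+00E9 is representable in ISO-8859-1 but not in US-ASCII, so a caller can choose a fallback before converting.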

getMaxBytesPerChar

public abstract int getMaxBytesPerChar()
Returns the maximum number of bytes needed to convert a char. Useful for calculating the maximum output buffer size needed for a particular input buffer.
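
Example (sketch): sizing an output buffer for the worst case by multiplying the number of input chars by the per-char maximum. The sketch uses java.nio.charset.CharsetEncoder.maxBytesPerChar (which returns a float) as a stand-in for getMaxBytesPerChar, since sun.io is not present in recent JDKs. BufferSizingSketch and maxOutputBytes are hypothetical names.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of sun.io.
class BufferSizingSketch {
    // Upper bound on the number of bytes needed to encode charCount chars.
    static int maxOutputBytes(Charset charset, int charCount) {
        float perChar = charset.newEncoder().maxBytesPerChar();
        return (int) Math.ceil((double) charCount * perChar);
    }
}
```

An output array of this size cannot fill up before the input is consumed, at the cost of over-allocating for mostly single-byte text.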


getBadInputLength

public int getBadInputLength()
Returns the length, in chars, of the input which caused a MalformedInputException. Always refers to the last MalformedInputException thrown by the converter. If none have ever been thrown, returns 0.


nextCharIndex

public int nextCharIndex()
Returns the index of the character just past the last character successfully converted by the previous call to convert.
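
Example (sketch): locating bad input after a failed conversion. With sun.io the caller would combine nextCharIndex and getBadInputLength; because sun.io is not present in recent JDKs, this sketch reads the equivalent information from the public java.nio.charset API, where the input buffer's position marks the start of the bad sequence and CoderResult.length gives its length. BadInputSketch and firstMalformed are hypothetical names, and UTF-8 is an arbitrary choice of target encoding.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of sun.io.
class BadInputSketch {
    // Returns {index, length} of the first malformed char sequence,
    // or null if the input is well formed.
    static int[] firstMalformed(char[] input) {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        CharBuffer in = CharBuffer.wrap(input);
        // Worst-case output size, so the encoder never stops for lack of space.
        int capacity = (int) Math.ceil(input.length * (double) enc.maxBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(capacity);
        CoderResult result = enc.encode(in, out, true);
        if (result.isMalformed()) {
            return new int[] { in.position(), result.length() };
        }
        return null;
    }
}
```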


nextByteIndex

public int nextByteIndex()
Returns the index of the byte just past the last byte written by the previous call to convert.


setSubstitutionMode

public void setSubstitutionMode(boolean doSub)
Sets converter into substitution mode. In substitution mode, the converter replaces each input character that cannot be translated to the target encoding with the substitution bytes set by setSubstitutionBytes. When not in substitution mode, the converter throws an UnknownCharacterException when it encounters untranslatable input.

Parameters:
doSub - if true, enable substitution mode.
See Also:
setSubstitutionBytes(byte[])

setSubstitutionBytes

public void setSubstitutionBytes(byte[] newSubBytes)
                          throws IllegalArgumentException
Sets the substitution bytes to use when the converter is in substitution mode. The given bytes should represent a valid character in the target character encoding and must not be longer than the value returned by getMaxBytesPerChar for this converter.

Parameters:
newSubBytes - the substitution bytes
Throws:
IllegalArgumentException - if given byte array is longer than the value returned by the method getMaxBytesPerChar.
See Also:
setSubstitutionMode(boolean), getMaxBytesPerChar()
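
Example (sketch): choosing custom substitution bytes and observing the length limit. The sketch uses java.nio.charset.CharsetEncoder.replaceWith and onUnmappableCharacter as stand-ins for setSubstitutionBytes and setSubstitutionMode, since sun.io is not present in recent JDKs. SubstitutionSketch, encodeLatin1, and acceptsReplacement are hypothetical names, and ISO-8859-1 is an arbitrary choice of target encoding.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of sun.io.
class SubstitutionSketch {
    // Encodes text as ISO-8859-1, writing the given byte for every
    // character that the encoding cannot represent.
    static byte[] encodeLatin1(String text, byte substitute) {
        CharsetEncoder enc = StandardCharsets.ISO_8859_1.newEncoder()
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(new byte[] { substitute });
        try {
            ByteBuffer encoded = enc.encode(CharBuffer.wrap(text));
            byte[] result = new byte[encoded.remaining()];
            encoded.get(result);
            return result;
        } catch (CharacterCodingException e) {
            // Malformed input (for example an unpaired surrogate) is still reported.
            throw new IllegalArgumentException("malformed input", e);
        }
    }

    // Returns false when the encoder rejects the candidate replacement,
    // for example because it is longer than the per-char maximum.
    static boolean acceptsReplacement(byte[] candidate) {
        try {
            StandardCharsets.ISO_8859_1.newEncoder().replaceWith(candidate);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

Here a two-byte replacement is rejected because ISO-8859-1 never needs more than one byte per char, mirroring the IllegalArgumentException described above.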

toString

public String toString()
Returns a string representation of the class.

Overrides:
toString in class Object
Returns:
a string representation of the object.