While reading the Java Language Specification (JLS) for Java 7 (Java SE specs main page; direct link to PDF), I came across a section on escape sequences and Unicode characters (Lexical Structure, p. 15) and how the compiler processes them, so I wrote a Java "main" class to test printing Unicode and escape sequences to standard out (the console).
Get the code here: EscapeSequencesTest.java
I learned and noticed a couple of things while working with escape sequences:
* Some Unicode escapes found in comments cause Eclipse (and the Java compiler itself, not just Eclipse) to report an error in the Java file, e.g.:
// NOTE: ('\u000C') in the comment doesn't give compile errors like examples above, e.g., '\u000D' (but without a space between "\" and "u"--otherwise eclipse gives an error)
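The first value, \u000C (form feed), doesn't give an error, but when Eclipse reads the second value, \u000D (carriage return), it marks the line with a red X. The compiler translates Unicode escapes before it recognizes comments, so \u000D becomes a real line terminator that ends the // comment, and the rest of the line is parsed as code, which produces the following error message: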
Invalid character constant
So to fix this, I had to change the \u000D value to:
\ u000D
(that is, put a space between the backslash delimiter, \, and the letter u)
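To make the behavior above concrete, here is a minimal sketch (not the code from EscapeSequencesTest.java; the class and method names are made up for illustration). It assumes the JLS rule that Unicode escapes are translated before the source is split into tokens, and that a backslash preceded by another backslash cannot start a Unicode escape. The comments spell out "backslash-u" in words on purpose, since a real escape inside a comment would itself be translated.

```java
// Hypothetical demo class; names are illustrative only.
// Comments here never contain a real backslash-u escape, because the
// compiler would translate it even inside a comment.
final class UnicodeEscapeDemo {

    // The Unicode escape for U+0041 is translated to the letter A
    // before the compiler sees the character literal.
    static char letterA() {
        return '\u0041';
    }

    // U+000C is a form feed, which is not a line terminator, so the
    // Unicode escape is legal here and equals the \f escape sequence.
    static boolean formFeedMatches() {
        return '\u000C' == '\f';
    }

    // For CR and LF, use the escape sequences \r and \n. Their
    // Unicode-escape forms would become real line terminators inside
    // the literal, and the file would not compile.
    static String lineTerminators() {
        return "\r\n";
    }

    // A doubled backslash cannot start a Unicode escape, so this literal
    // holds six characters: backslash, u, 0, 0, 0, D.
    static String literalBackslashU() {
        return "\\u000D";
    }

    public static void main(String[] args) {
        System.out.println(letterA());
        System.out.println(formFeedMatches());
        System.out.println((int) lineTerminators().charAt(0));
        System.out.println(literalBackslashU().length());
    }
}
```

The doubled-backslash form is another way to mention an escape in a comment or string without triggering translation, as an alternative to inserting a space.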
See also the java.util.regex.Pattern API for examples of escape sequences used in regular expressions, e.g., matching characters by octal or hexadecimal value and matching line terminators (CR, LF, CRLF).
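As a rough illustration of the Pattern escapes mentioned above, the sketch below matches the letter A by octal and by hexadecimal value and splits text on CR, LF, or CRLF. The class name, field names, and sample input are hypothetical; the regex constructs used (a backslash followed by 0 and octal digits, a backslash followed by x and two hex digits, and the \r and \n escapes) are the ones documented for java.util.regex.Pattern.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class RegexEscapeDemo {

    // Octal 101 and hexadecimal 41 both denote the letter A (decimal 65).
    // The Java string needs a doubled backslash so the regex engine
    // receives a single one.
    static final Pattern OCTAL_A = Pattern.compile("\\0101");
    static final Pattern HEX_A = Pattern.compile("\\x41");

    // CRLF is listed first so that it is consumed as one terminator
    // rather than as a CR followed by a separate LF.
    static final Pattern LINE_END = Pattern.compile("\\r\\n|\\r|\\n");

    // Splits text on any of the three line terminator forms.
    static String[] splitLines(String text) {
        return LINE_END.split(text);
    }

    public static void main(String[] args) {
        System.out.println(OCTAL_A.matcher("A").matches());
        System.out.println(HEX_A.matcher("A").matches());
        System.out.println(Arrays.toString(splitLines("a\r\nb\rc\nd")));
    }
}
```

Here the backslashes are regex escapes inside ordinary string literals, so the compiler's early Unicode-escape translation does not come into play.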