Java Fundamentals
Lesson 2: It all starts with tokens
Objective: Explain the fundamental coding elements in Java programs.

Fundamental Coding Elements

When a program is processed by the Java compiler, it is first broken down into tokens. A token is the smallest code element in a program that is meaningful to the compiler. The following line of Java code contains five tokens:
  
boolean busy = true;

The tokens in this example are boolean, busy, =, true, and ;. Understanding tokens is critical because they describe the fundamental structure of the Java programming language. Java tokens can be divided into five categories: identifiers, keywords, literals, operators, and separators.

Java Identifiers

Identifiers are tokens that represent names. They are used a great deal in Java programming because many parts of a program require names. Besides making Java programs easier to understand, identifiers uniquely identify parts of a program.
Java identifiers must begin with
  1. a letter,
  2. an underscore ( _ ), or
  3. a dollar sign ($).
Letters can be uppercase or lowercase.

Java is case sensitive, which means that the identifiers Ernie, ernie, and ERNIE are all distinct. Characters after the first can also include the digits 0 to 9. The only other restriction is that an identifier cannot share a name with a Java keyword such as class, if, or return.
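As a minimal sketch of case sensitivity, the class below declares the three names from the paragraph above as separate local variables. The class and method names (CaseSensitivityDemo, distinctNames) are made up for this illustration.

```java
class CaseSensitivityDemo {
    // Declares three variables whose names differ only in case and
    // returns their values to show that each one is stored separately.
    static int[] distinctNames() {
        int Ernie = 1;
        int ernie = 2;
        int ERNIE = 3;
        return new int[] { Ernie, ernie, ERNIE };
    }

    public static void main(String[] args) {
        int[] values = distinctNames();
        // prints "1 2 3"
        System.out.println(values[0] + " " + values[1] + " " + values[2]);
    }
}
```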
  • Java Identifiers and Keywords: Classes, variables, and methods require names. In Java, these names are called identifiers, and, as you might expect, there are rules for what constitutes a legal Java identifier. Beyond what is legal, Java and Oracle programmers have created conventions for naming methods, variables, and classes. Like all programming languages, Java has a set of built-in keywords. These keywords must not be used as identifiers. Later in this chapter we will review the details of these naming rules, conventions, and the Java keywords.
    Technically, legal identifiers must be composed of only Unicode characters, numbers, currency symbols, and connecting characters (such as underscores). The exam does not dive into the details of which ranges of the Unicode character set qualify as letters and digits. So, for example, you will not need to know that Tibetan digits range from \u0f20 to \u0f29. Here are the rules you do need to know:
    1. Identifiers must start with a letter, a currency character ($), or a connecting character such as the underscore (_). Identifiers cannot start with a digit. After the first character, identifiers can contain any combination of letters, currency characters, connecting characters, or numbers.
    2. In practice, there is no limit to the number of characters an identifier can contain.
    3. You cannot use a Java keyword as an identifier. Table 1-1 lists all of the Java keywords.
    4. Identifiers in Java are case-sensitive; foo and FOO are two different identifiers.

    Examples of legal and illegal identifiers follow.
    First some legal identifiers:
    int _a;
    int $c;
    int ______2_w;
    int _$;
    int this_is_a_very_detailed_name_for_an_identifier;
    

    The following are illegal (it's your job to recognize why):
    int :b;
    int -d;
    int e#;
    int .f;
    int 7g;
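One way to check the character rules above at run time is with the standard methods Character.isJavaIdentifierStart and Character.isJavaIdentifierPart. The sketch below is only an illustration: the class and helper names are hypothetical, the helper looks at one char at a time (so it ignores supplementary Unicode characters), and it does not apply the keyword rule.

```java
class IdentifierCheck {
    // Hypothetical helper: checks only the character rules for identifiers.
    // It does NOT reject keywords such as "class" or "if".
    static boolean hasLegalIdentifierCharacters(String name) {
        if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The legal and illegal names from the lists above.
        String[] candidates = { "_a", "$c", "______2_w", "_$", ":b", "-d", "e#", ".f", "7g" };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + hasLegalIdentifierCharacters(candidate));
        }
    }
}
```

The first four names pass the check. The rest fail: :b, -d, and .f start with a character that cannot begin an identifier, e# contains a character that cannot appear in one, and 7g starts with a digit.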
    


Keywords and literals are the second and third token categories; both are summarized under Examples of Java Tokens below.

Evaluation of Java Computation Operators

Operators are tokens that specify an evaluation or computation. Take a look at the following example:
int curYear = 1999;
int tilY2K = 2000 - curYear;

Both lines rely on the assignment operator (=), which takes the value on its right and stores it in (assigns it to) the variable on its left. The addition and subtraction operators (+ and -) carry out simple addition and subtraction, much like a calculator; only subtraction is used here. The first line stores the value 1999 in curYear. The second line then performs the computation 2000 - 1999 and stores the result, 1, in tilY2K.
  • Types of Operators: Operators perform data manipulations on one or more inputs (called operands). For example, in the expression 2+3, the operands are 2 and 3, and the operator is +. In terms of the number of operands, a distinction can be made among unary operators (one operand), binary operators (two operands), and ternary operators (three operands). In terms of the operations performed, a distinction can be made among the following:
    1. Arithmetic operators
    2. Assignment operators
    3. Bitwise operators
    4. Logical operators
    5. Relational operators
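The sketch below gives one small method per kind of operator named above. The class and method names are invented for this illustration, and the operators shown are a sample rather than a complete list.

```java
class OperatorKinds {
    // Unary operator: one operand.
    static int negate(int x) { return -x; }

    // Binary arithmetic operator: two operands.
    static int add(int a, int b) { return a + b; }

    // Ternary (conditional) operator: three operands.
    static int max(int a, int b) { return a > b ? a : b; }

    // Compound assignment operator: adds 10 and stores the result back in x.
    static int addTen(int x) { x += 10; return x; }

    // Bitwise AND keeps only the lowest four bits.
    static int lowNibble(int x) { return x & 0x0F; }

    // Relational operators (>=, <=) combined with the logical AND operator (&&).
    static boolean inRange(int x, int low, int high) { return x >= low && x <= high; }

    public static void main(String[] args) {
        System.out.println(negate(5));         // prints -5
        System.out.println(add(2, 3));         // prints 5
        System.out.println(max(2, 3));         // prints 3
        System.out.println(addTen(1));         // prints 11
        System.out.println(lowNibble(0xAB));   // prints 11
        System.out.println(inRange(5, 1, 10)); // prints true
    }
}
```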


    Table 6-2: Arithmetic Operators

    Most of these operators are probably familiar to you already. Addition, subtraction, multiplication, and division are used in everyday calculations. Although they operate the way you expect, the result is not always exactly what you are looking for. One reason is that binary floating point storage cannot represent many non-whole numbers exactly, so operations on floating point numbers often produce a result that is very close to what you expect but differs in the last few digits after the decimal point. These values are approximations. For example, if you multiply 1.3 by 0.01, the answer is 0.013. However, when you ask Java to calculate 1.3f*0.01f, the result is 0.012999999. This rounds to the 0.013 you are expecting, so the calculation itself is sound.
    Sometimes the problem lies not in rounding but in the data type being used. To illustrate, imagine you have two integers, 5 and 2. If you add them, you expect 7, and that is what Java returns. If you divide 5 by 2, you know the answer is 2.5. Java, however, is working with integers, so the result of an integer operation must be an integer. Java therefore evaluates 5/2 as 2. The remainder is not included in the result.
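Both effects can be reproduced with a few lines. The sketch below uses hypothetical names (ArithmeticSurprises and its methods); it compares the float product against the literal 0.013f and contrasts integer division with division after a cast to double.

```java
class ArithmeticSurprises {
    // Multiplies two float values that have no exact binary representation.
    static float floatProduct() { return 1.3f * 0.01f; }

    // Integer division: the fractional part is discarded.
    static int integerQuotient(int a, int b) { return a / b; }

    // Casting one operand to double makes the division a floating point one.
    static double realQuotient(int a, int b) { return (double) a / b; }

    public static void main(String[] args) {
        float product = floatProduct();
        System.out.println(product);                 // close to, but not exactly, 0.013
        System.out.println(product == 0.013f);       // prints false
        System.out.println(integerQuotient(5, 2));   // prints 2
        System.out.println(realQuotient(5, 2));      // prints 2.5
    }
}
```

When two floating point results need to be compared, checking that their difference is below a small tolerance is usually safer than using ==.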
    This is where the modulo operator comes in. It calculates the remainder of a division. So, while 5/2 = 2 (and the remainder of 1 is ignored), 5%2 = 1 (here is that remainder of 1). Between the two operators, you have the complete solution. The modulo operator is often used to check whether a number is even or odd. For an even number, %2 results in 0, whereas for an odd number it results in 1 (or -1 when the odd number is negative).
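A short sketch of the even/odd check follows; the class and method names are assumptions made for this example. In Java the result of % takes the sign of the left operand, so -5 % 2 is -1. Comparing the remainder against 0, rather than against 1, keeps the check correct for negative numbers.

```java
class RemainderDemo {
    // A number is even when dividing by 2 leaves no remainder.
    static boolean isEven(int n) { return n % 2 == 0; }

    // Compare against 0, not 1: for negative odd numbers the remainder is -1.
    static boolean isOdd(int n) { return n % 2 != 0; }

    public static void main(String[] args) {
        System.out.println(5 / 2);      // prints 2
        System.out.println(5 % 2);      // prints 1
        System.out.println(-5 % 2);     // prints -1
        System.out.println(isEven(4));  // prints true
        System.out.println(isOdd(-5));  // prints true
    }
}
```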



Java Separators group Coding Elements

Separators are tokens used by the Java compiler to group other coding elements. For example, commas separate the items in a list, much like the words in a sentence. Following are the separators used in Java:
( ) { } [ ] ; , . :

Purpose of Java Separators
  1. ( ) Encloses parameters in method definitions and arguments in method calls; adjusts precedence in arithmetic expressions; surrounds cast types and delimits test expressions in flow control statements
  2. { } Defines blocks of code and encloses the values in array initializers
  3. [ ] Declares array types and indexes array elements
  4. ; Terminates statements
  5. , Separates successive identifiers in variable declarations; chains expressions in the initialization and update parts of a for loop
  6. . Selects a field or method from an object; separates package names from sub-package and class names
  7. : Follows loop labels
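The sketch below uses each separator from the list above at least once. The class name, method name, and sample data are made up for this illustration.

```java
class SeparatorTour {
    // Adds up the values row by row and stops at the first negative value.
    static int sumUntilNegative(int[][] grid) {               // ( ) enclose the parameter; [ ] declare an array type
        int total = 0;                                        // ; terminates the statement
        scan:                                                 // : follows the loop label
        for (int row = 0, col = 0; row < grid.length; row++, col = 0) { // , in the declaration and in the update part
            while (col < grid[row].length) {                  // . selects a field; [ ] index into the array
                if (grid[row][col] < 0) {
                    break scan;                               // leaves both loops
                }
                total += grid[row][col];
                col++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] grid = { { 1, 2 }, { 3, -1, 5 }, { 7 } };     // { } enclose array initializers
        long widened = (long) sumUntilNegative(grid);         // ( ) surround the cast type
        System.out.println(widened);                          // prints 6
    }
}
```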

Terminator versus a Separator in Java

There is a distinction between a terminator and a separator.
  1. The comma between identifiers in declarations is a separator because it comes between elements in the list.
  2. The semicolon is a terminator because it ends each statement.
If the semicolon were a statement separator, the last semicolon in a code block would be unnecessary and (depending on the choice of the language designer) possibly invalid.


Examples of Java Tokens

  1. Identifiers: Tokens that represent names
  2. Keywords: Special identifiers set aside as programming constructs
  3. Literals: Program data elements that are constant
  4. Operators: Programming constructs used to specify an evaluation or computation
  5. Separators: Symbols to inform the Java compiler of how code elements are grouped


Coding elements that are not considered tokens include comments and whitespace (spaces, tabs, and end-of-lines), which are ignored by the Java compiler.
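To illustrate, the sketch below writes the declaration from the start of this lesson twice: once with almost no whitespace and once spread across several lines with comments between the tokens. The class and method names are hypothetical. Both declarations consist of the same five tokens (boolean, busy, =, true, and ;), so both methods behave identically.

```java
class WhitespaceDemo {
    // Minimal whitespace: only the space that keeps "boolean" and "busy" apart is required.
    static boolean compact() { boolean busy=true;return busy; }

    // Extra whitespace, line breaks, and comments do not change the tokens.
    static boolean spacedOut() {
        boolean   /* a comment between two tokens */   busy
            =
            true   ;   // same declaration as in compact()
        return busy;
    }

    public static void main(String[] args) {
        System.out.println(compact() == spacedOut()); // prints true
    }
}
```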

Unicode Character Set

Java programs are written using Unicode. You can use Unicode characters anywhere in a Java program, including comments and identifiers such as variable names. Unlike the 7-bit ASCII character set, which is useful only for English, and the 8-bit ISO Latin-1 character set, which is useful only for major Western European languages, the Unicode character set can represent virtually every written language in common use on the planet. If you do not use a Unicode-enabled text editor, or if you do not want to force other programmers who view or edit your code to use one, you can embed Unicode characters into your Java programs using the special Unicode escape sequence \uxxxx: a backslash and a lowercase u, followed by four hexadecimal digits. For example, \u0020 is the space character, and \u03c0 is the character π.
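The sketch below shows Unicode escapes in character literals and in an identifier. The class and method names are assumptions made for this example. Because the compiler translates escapes before it splits the source into tokens, the variable in circleArea is named with the Greek small letter pi even though the source file contains only ASCII characters.

```java
class UnicodeEscapeDemo {
    // The escape for code point 0020 produces the space character.
    static char space() { return '\u0020'; }

    // The escape for code point 03c0 produces the Greek small letter pi.
    static char smallPi() { return '\u03c0'; }

    // An escape can also spell an identifier: the local variable below
    // is named with the Greek small letter pi.
    static double circleArea(double radius) {
        double \u03c0 = Math.PI;
        return \u03c0 * radius * radius;
    }

    public static void main(String[] args) {
        System.out.println(space() == ' ');                    // prints true
        System.out.println(Integer.toHexString(smallPi()));    // prints 3c0
        System.out.println(Character.isLowerCase(smallPi()));  // prints true
        System.out.println(circleArea(1.0) == Math.PI);        // prints true
    }
}
```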
