by interpreting a Pattern.
A matcher is created from a pattern by invoking the pattern's matcher
method. Once created, a matcher can be used to
perform three different kinds of match operations:
- The matches method attempts to match the entire
input sequence against the pattern.
- The lookingAt method attempts to match the
input sequence, starting at the beginning, against the pattern.
- The find method scans the input sequence looking for
the next subsequence that matches the pattern.
Each of these methods returns a boolean indicating success or failure.
More information about a successful match can be obtained by querying the
state of the matcher.
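The difference between the three match operations can be sketched as follows. This is a minimal illustration, not part of the original reference; the class name MatchKindsDemo and the helper probe are hypothetical.

```java
import java.util.regex.Pattern;

final class MatchKindsDemo {
    /** Returns the results of matches(), lookingAt(), and find() as "m/l/f". */
    static String probe(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        // A fresh matcher per operation keeps the three results independent.
        boolean whole = p.matcher(input).matches();     // entire input must match
        boolean prefix = p.matcher(input).lookingAt();  // match must start at index 0
        boolean anywhere = p.matcher(input).find();     // match may start anywhere
        return whole + "/" + prefix + "/" + anywhere;
    }

    public static void main(String[] args) {
        System.out.println(probe("\\d+", "123"));     // true/true/true
        System.out.println(probe("\\d+", "123abc"));  // false/true/true
        System.out.println(probe("\\d+", "abc123"));  // false/false/true
    }
}
```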
A matcher finds matches in a subset of its input called the
region. By default, the region contains all of the matcher's input.
The region can be modified via the region method and queried via the
regionStart and regionEnd methods. The way that the region boundaries
interact with some pattern constructs can be changed. See
useAnchoringBounds and useTransparentBounds for more details.
This class also defines methods for replacing matched subsequences with
new strings whose contents can, if desired, be computed from the match
result. The appendReplacement
and appendTail
methods can be used in tandem in order to collect
the result into an existing string buffer, or the more convenient replaceAll
method can be used to create a string in which every
matching subsequence in the input sequence is replaced.
The explicit state of a matcher includes the start and end indices of
the most recent successful match. It also includes the start and end
indices of the input subsequence captured by each capturing group in the pattern as well as a total
count of such subsequences. As a convenience, methods are also provided for
returning these captured subsequences in string form.
The explicit state of a matcher is initially undefined; attempting to
query any part of it before a successful match will cause an IllegalStateException
to be thrown. The explicit state of a matcher is
recomputed by every match operation.
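A short sketch of the explicit state, assuming a hypothetical class ExplicitStateDemo and an illustrative pattern with two capturing groups. It shows that querying the state before any match throws IllegalStateException, and what the state holds after a successful find.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExplicitStateDemo {
    private static final Pattern ADDRESS = Pattern.compile("(\\w+)@(\\w+)");

    /** Returns true if querying the state before any match operation throws. */
    static boolean throwsBeforeMatch() {
        Matcher m = ADDRESS.matcher("user@example");
        try {
            m.start();  // no match attempted yet, so the state is undefined
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    /** Formats the explicit state as "start-end group1 group2 groupCount". */
    static String describe(String input) {
        Matcher m = ADDRESS.matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return m.start() + "-" + m.end() + " " + m.group(1) + " " + m.group(2)
                + " " + m.groupCount();
    }

    public static void main(String[] args) {
        System.out.println(throwsBeforeMatch());
        System.out.println(describe("mail user@example now"));
    }
}
```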
The implicit state of a matcher includes the input character sequence as
well as the append position, which is initially zero and is updated
by the appendReplacement
method.
A matcher may be reset explicitly by invoking its reset() method or, if a
new input sequence is desired, its reset(CharSequence) method. Resetting a
matcher discards its explicit state information and sets the append position
to zero.
Instances of this class are not safe for use by multiple concurrent
threads.
Methods:
public Matcher appendReplacement (StringBuffer sb, String replacement)
Implements a non-terminal append-and-replace step.
This method performs the following actions:
- It reads characters from the input sequence, starting at the
append position, and appends them to the given string buffer. It
stops after reading the last character preceding the previous match,
that is, the character at index start() - 1.
- It appends the given replacement string to the string buffer.
- It sets the append position of this matcher to the index of
the last character matched, plus one, that is, to end().
The replacement string may contain references to subsequences
captured during the previous match: Each occurrence of
$g will be replaced by the result of evaluating group(g).
The first number after the $ is always treated as part of
the group reference. Subsequent numbers are incorporated into g if
they would form a legal group reference. Only the numerals '0'
through '9' are considered as potential components of the group
reference. If the second group matched the string "foo", for
example, then passing the replacement string "$2bar" would
cause "foobar" to be appended to the string buffer. A dollar
sign ($) may be included as a literal in the replacement
string by preceding it with a backslash (\$).
Note that backslashes (\) and dollar signs ($) in
the replacement string may cause the results to be different than if it
were being treated as a literal replacement string. Dollar signs may be
treated as references to captured subsequences as described above, and
backslashes are used to escape literal characters in the replacement
string.
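The group-reference and escaping rules can be illustrated with a small sketch. The class GroupReferenceDemo and its two helpers are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupReferenceDemo {
    /** Swaps the two sides of every key=value pair using $2 and $1. */
    static String swap(String input) {
        Matcher m = Pattern.compile("(\\w+)=(\\w+)").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, "$2=$1");
        }
        m.appendTail(sb);
        return sb.toString();
    }

    /** Rewrites "N dollars" as "$N"; the leading dollar sign is escaped. */
    static String price(String input) {
        Matcher m = Pattern.compile("(\\d+) dollars").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // The Java literal "\\$$1" is the replacement text \$$1:
            // an escaped literal '$' followed by a reference to group 1.
            m.appendReplacement(sb, "\\$$1");
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(swap("a=1, b=2"));           // 1=a, 2=b
        System.out.println(price("costs 15 dollars"));  // costs $15
    }
}
```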
This method is intended to be used in a loop together with the
appendTail and find methods. The following code, for example, writes
one dog two dogs in the yard to the standard-output stream:
Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("one cat two cats in the yard");
StringBuffer sb = new StringBuffer();
while (m.find()) {
    m.appendReplacement(sb, "dog");
}
m.appendTail(sb);
System.out.println(sb.toString());
public StringBuffer appendTail (StringBuffer sb)
Implements a terminal append-and-replace step.
This method reads characters from the input sequence, starting at
the append position, and appends them to the given string buffer. It is
intended to be invoked after one or more invocations of the appendReplacement
method in order to copy the
remainder of the input sequence.
public int end ()
Returns the offset after the last character matched.
public int end (int group)
Returns the offset after the last character of the subsequence
captured by the given group during this match.
Capturing groups are indexed from left
to right, starting at one. Group zero denotes the entire pattern, so
the expression m.end(0) is equivalent to
m.end().
public boolean equals (Object obj)
Indicates whether some other object is "equal to" this one.
The equals method implements an equivalence relation
on non-null object references:
- It is reflexive: for any non-null reference value x,
x.equals(x) should return true.
- It is symmetric: for any non-null reference values x and y,
x.equals(y) should return true if and only if y.equals(x) returns true.
- It is transitive: for any non-null reference values x, y, and z,
if x.equals(y) returns true and y.equals(z) returns true, then
x.equals(z) should return true.
- It is consistent: for any non-null reference values x and y,
multiple invocations of x.equals(y) consistently return true
or consistently return false, provided no information used in
equals comparisons on the objects is modified.
- For any non-null reference value x, x.equals(null) should return false.
The equals method for class Object implements
the most discriminating possible equivalence relation on objects;
that is, for any non-null reference values x and y, this method
returns true if and only if x and y refer to the same object
(x == y has the value true).
Note that it is generally necessary to override the hashCode
method whenever this method is overridden, so as to maintain the
general contract for the hashCode method, which states
that equal objects must have equal hash codes.
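As an illustration of overriding equals and hashCode together, here is a minimal sketch using a hypothetical value class GridPoint. It is unrelated to Matcher itself, which inherits the identity-based behavior described above.

```java
import java.util.Objects;

final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;                       // reflexive
        }
        if (!(other instanceof GridPoint)) {
            return false;                      // also covers other == null
        }
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        // Derived from the same fields as equals, so equal points
        // always produce equal hash codes.
        return Objects.hash(x, y);
    }
}
```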
public boolean find ()
Attempts to find the next subsequence of the input sequence that matches
the pattern.
This method starts at the beginning of this matcher's region, or, if
a previous invocation of the method was successful and the matcher has
not since been reset, at the first character not matched by the previous
match.
If the match succeeds then more information can be obtained via the
start, end, and group methods.
public boolean find (int start)
Resets this matcher and then attempts to find the next subsequence of
the input sequence that matches the pattern, starting at the specified
index.
If the match succeeds then more information can be obtained via the
start, end, and group methods, and subsequent
invocations of the find() method will start at the first
character not matched by this match.
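The interplay of find(int) and find() can be sketched as below. The class FindFromIndexDemo and the helper starts are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FindFromIndexDemo {
    /** Collects the start index of every match at or after the index 'from'. */
    static List<Integer> starts(String regex, String input, int from) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<Integer> result = new ArrayList<>();
        // find(int) resets the matcher and searches from the given index;
        // each later find() continues after the previous match.
        if (m.find(from)) {
            do {
                result.add(m.start());
            } while (m.find());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(starts("cat", "cat cat cat", 0));  // [0, 4, 8]
        System.out.println(starts("cat", "cat cat cat", 1));  // [4, 8]
    }
}
```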
public final native Class<?> getClass ()
Returns the runtime class of an object. That Class
object is the object that is locked by static synchronized
methods of the represented class.
public String group ()
Returns the input subsequence matched by the previous match.
For a matcher m with input sequence s,
the expressions m.group() and
s.substring(m.start(), m.end())
are equivalent.
Note that some patterns, for example a*, match the empty
string. This method will return the empty string when the pattern
successfully matches the empty string in the input.
public String group (int group)
Returns the input subsequence captured by the given group during the
previous match operation.
For a matcher m, input sequence s, and group index
g, the expressions m.group(g) and
s.substring(m.start(g), m.end(g))
are equivalent.
Capturing groups are indexed from left
to right, starting at one. Group zero denotes the entire pattern, so
the expression m.group(0) is equivalent to m.group().
If the match was successful but the group specified failed to match
any part of the input sequence, then null is returned. Note
that some groups, for example (a*), match the empty string.
This method will return the empty string when such a group successfully
matches the empty string in the input.
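The distinction between a group that matched the empty string and a group that did not participate in the match can be sketched as follows. The class OptionalGroupDemo and the pattern (a*)(b)? are illustrative choices.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OptionalGroupDemo {
    /** Returns { group(1), group(2) } after matching (a*)(b)? at the start of the input. */
    static String[] groups(String input) {
        Matcher m = Pattern.compile("(a*)(b)?").matcher(input);
        if (!m.lookingAt()) {
            // Not expected: the pattern can always match the empty string.
            throw new IllegalStateException("pattern did not match");
        }
        // group(1) is "" when (a*) matched nothing;
        // group(2) is null when the optional (b) did not take part in the match.
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = groups("c");
        System.out.println("[" + g[0] + "] " + g[1]);  // [] null
    }
}
```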
public int groupCount ()
Returns the number of capturing groups in this match result's pattern.
Group zero denotes the entire pattern by convention. It is not
included in this count.
Any non-negative integer smaller than or equal to the value
returned by this method is guaranteed to be a valid group index for
this matcher.
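A minimal sketch of iterating over every valid group index, from zero through groupCount() inclusive. The class GroupCountDemo and the date-like pattern in main are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupCountDemo {
    /** Returns group 0 followed by every capturing group, or an empty list if no match. */
    static List<String> allGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (m.matches()) {
            // Indices 0..groupCount() are all valid; group 0 is the whole match.
            for (int g = 0; g <= m.groupCount(); g++) {
                groups.add(m.group(g));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // prints [2024-05-17, 2024, 05, 17]
        System.out.println(allGroups("(\\d{4})-(\\d{2})-(\\d{2})", "2024-05-17"));
    }
}
```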
public boolean hasAnchoringBounds ()
Queries the anchoring of region bounds for this matcher.
This method returns true if this matcher uses
anchoring bounds, false otherwise.
See useAnchoringBounds
for a
description of anchoring bounds.
By default, a matcher uses anchoring region boundaries.
public native int hashCode ()
Returns a hash code value for the object. This method is
supported for the benefit of hashtables such as those provided by
java.util.Hashtable.
The general contract of hashCode is:
- Whenever it is invoked on the same object more than once during
an execution of a Java application, the hashCode method
must consistently return the same integer, provided no information
used in equals comparisons on the object is modified.
This integer need not remain consistent from one execution of an
application to another execution of the same application.
- If two objects are equal according to the equals(Object)
method, then calling the hashCode method on each of
the two objects must produce the same integer result.
- It is not required that if two objects are unequal
according to the equals(Object) method, then calling the
hashCode method on each of the
two objects must produce distinct integer results. However, the
programmer should be aware that producing distinct integer results
for unequal objects may improve the performance of hashtables.
As much as is reasonably practical, the hashCode method defined by
class Object does return distinct integers for distinct
objects. (This is typically implemented by converting the internal
address of the object into an integer, but this implementation
technique is not required by the
Java programming language.)
public boolean hasTransparentBounds ()
Queries the transparency of region bounds for this matcher.
This method returns true if this matcher uses
transparent bounds, false if it uses opaque
bounds.
See useTransparentBounds
for a
description of transparent and opaque bounds.
By default, a matcher uses opaque region boundaries.
public boolean hitEnd ()
Returns true if the end of input was hit by the search engine in
the last match operation performed by this matcher.
When this method returns true, then it is possible that more input
would have changed the result of the last search.
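A small sketch of hitEnd, assuming a hypothetical class HitEndDemo. With the greedy pattern \d+, a match that runs up to the end of the input reports true because further digits could extend it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HitEndDemo {
    /** Runs find() once and reports whether the engine hit the end of the input. */
    static boolean hitEndAfterFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.find();
        return m.hitEnd();
    }

    public static void main(String[] args) {
        // "123": the greedy \d+ consumed everything, so more input could extend the match.
        System.out.println(hitEndAfterFind("\\d+", "123"));
        // "123 ": the match stopped at the space, before the end of the input.
        System.out.println(hitEndAfterFind("\\d+", "123 "));
    }
}
```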
public boolean lookingAt ()
Attempts to match the input sequence, starting at the beginning of the
region, against the pattern.
Like the matches
method, this method always starts
at the beginning of the region; unlike that method, it does not
require that the entire region be matched.
If the match succeeds then more information can be obtained via the
start, end, and group methods.
public boolean matches ()
Attempts to match the entire region against the pattern.
If the match succeeds then more information can be obtained via the
start, end, and group methods.
public final native void notify ()
Wakes up a single thread that is waiting on this object's
monitor. If any threads are waiting on this object, one of them
is chosen to be awakened. The choice is arbitrary and occurs at
the discretion of the implementation. A thread waits on an object's
monitor by calling one of the wait
methods.
The awakened thread will not be able to proceed until the current
thread relinquishes the lock on this object. The awakened thread will
compete in the usual manner with any other threads that might be
actively competing to synchronize on this object; for example, the
awakened thread enjoys no reliable privilege or disadvantage in being
the next thread to lock this object.
This method should only be called by a thread that is the owner
of this object's monitor. A thread becomes the owner of the
object's monitor in one of three ways:
- By executing a synchronized instance method of that object.
- By executing the body of a synchronized statement
that synchronizes on the object.
- For objects of type Class, by executing a
synchronized static method of that class.
Only one thread at a time can own an object's monitor.
public final native void notifyAll ()
Wakes up all threads that are waiting on this object's monitor. A
thread waits on an object's monitor by calling one of the
wait
methods.
The awakened threads will not be able to proceed until the current
thread relinquishes the lock on this object. The awakened threads
will compete in the usual manner with any other threads that might
be actively competing to synchronize on this object; for example,
the awakened threads enjoy no reliable privilege or disadvantage in
being the next thread to lock this object.
This method should only be called by a thread that is the owner
of this object's monitor. See the notify
method for a
description of the ways in which a thread can become the owner of
a monitor.
public Pattern pattern ()
Returns the pattern that is interpreted by this matcher.
public static String quoteReplacement (String s)
Returns a literal replacement String for the specified String.
This method produces a String that will work as a literal replacement s
in the appendReplacement method of the Matcher class. The String
produced will match the sequence of characters in s treated as a literal
sequence. Backslashes ('\') and dollar signs ('$') will be given no
special meaning.
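The effect of quoting can be sketched as below. The class QuoteReplacementDemo and both helpers are hypothetical; the second helper shows that the same text, passed unquoted, is read as a reference to a group that does not exist.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuoteReplacementDemo {
    /** Replaces every match with 'literal', treating '$' and '\' as plain characters. */
    static String replaceLiteral(String regex, String input, String literal) {
        return Pattern.compile(regex).matcher(input)
                .replaceAll(Matcher.quoteReplacement(literal));
    }

    /** Returns true if using 'replacement' unquoted is rejected with an exception. */
    static boolean unquotedFails(String regex, String input, String replacement) {
        try {
            Pattern.compile(regex).matcher(input).replaceAll(replacement);
            return false;
        } catch (IndexOutOfBoundsException | IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(replaceLiteral("X", "price: X", "$5.00"));  // price: $5.00
        // Unquoted, "$5" refers to group 5, which the pattern "X" does not have.
        System.out.println(unquotedFails("X", "price: X", "$5.00"));   // true
    }
}
```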
public Matcher region (int start, int end)
Sets the limits of this matcher's region. The region is the part of the
input sequence that will be searched to find a match. Invoking this
method resets the matcher, and then sets the region to start at the
index specified by the start
parameter and end at the
index specified by the end
parameter.
Depending on the transparency and anchoring being used (see
useTransparentBounds and useAnchoringBounds), certain constructs such
as anchors may behave differently at or around the boundaries of the
region.
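A brief sketch of restricting matching to a region. The class RegionDemo and the sample input are illustrative; start is inclusive and end is exclusive, as with regionStart and regionEnd.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegionDemo {
    /** Returns true if the characters in [start, end) match the pattern as a whole. */
    static boolean matchesWithin(String regex, String input, int start, int end) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.region(start, end);  // resets the matcher, then narrows the region
        return m.matches();    // matches() applies to the region, not the full input
    }

    public static void main(String[] args) {
        String input = "12 345 6789";
        System.out.println(matchesWithin("\\d+", input, 3, 6));  // true: region is "345"
        System.out.println(matchesWithin("\\d+", input, 0, 6));  // false: region is "12 345"
    }
}
```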
public int regionEnd ()
Reports the end index (exclusive) of this matcher's region.
The searches this matcher conducts are limited to finding matches
within regionStart (inclusive) and regionEnd (exclusive).
public int regionStart ()
Reports the start index of this matcher's region. The
searches this matcher conducts are limited to finding matches
within regionStart (inclusive) and regionEnd (exclusive).
public String replaceAll (String replacement)
Replaces every subsequence of the input sequence that matches the
pattern with the given replacement string.
This method first resets this matcher. It then scans the input
sequence looking for matches of the pattern. Characters that are not
part of any match are appended directly to the result string; each match
is replaced in the result by the replacement string. The replacement
string may contain references to captured subsequences as in the appendReplacement
method.
Note that backslashes (\) and dollar signs ($) in
the replacement string may cause the results to be different than if it
were being treated as a literal replacement string. Dollar signs may be
treated as references to captured subsequences as described above, and
backslashes are used to escape literal characters in the replacement
string.
Given the regular expression a*b, the input
"aabfooaabfooabfoob", and the replacement string
"-", an invocation of this method on a matcher for that
expression would yield the string "-foo-foo-foo-".
Invoking this method changes this matcher's state. If the matcher
is to be used in further matching operations then it should first be
reset.
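The a*b example above, plus a replacement that uses group references, can be sketched as follows. The class ReplaceAllDemo and its helper are hypothetical names.

```java
import java.util.regex.Pattern;

final class ReplaceAllDemo {
    /** Replaces every match of 'regex' in 'input' with 'replacement'. */
    static String replaceAll(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // The example from the text: prints -foo-foo-foo-
        System.out.println(replaceAll("a*b", "aabfooaabfooabfoob", "-"));
        // Group references work as in appendReplacement: prints host at user
        System.out.println(replaceAll("(\\w+)@(\\w+)", "user@host", "$2 at $1"));
    }
}
```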
public String replaceFirst (String replacement)
Replaces the first subsequence of the input sequence that matches the
pattern with the given replacement string.
This method first resets this matcher. It then scans the input
sequence looking for a match of the pattern. Characters that are not
part of the match are appended directly to the result string; the match
is replaced in the result by the replacement string. The replacement
string may contain references to captured subsequences as in the appendReplacement
method.
Given the regular expression dog, the input
"zzzdogzzzdogzzz", and the replacement string
"cat", an invocation of this method on a matcher for that
expression would yield the string "zzzcatzzzdogzzz".
Invoking this method changes this matcher's state. If the matcher
is to be used in further matching operations then it should first be
reset.
public boolean requireEnd ()
Returns true if more input could change a positive match into a
negative one.
If this method returns true, and a match was found, then more
input could cause the match to be lost. If this method returns false
and a match was found, then more input might change the match but the
match won't be lost. If a match was not found, then requireEnd has no
meaning.
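A small sketch of requireEnd, assuming a hypothetical class RequireEndDemo. A pattern ending in $ matches "foo" only because the input ends there, so more input could invalidate the match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RequireEndDemo {
    /** Runs find() once and reports requireEnd(); meaningful only when a match was found. */
    static boolean requireEndAfterFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            throw new IllegalArgumentException("no match; requireEnd has no meaning");
        }
        return m.requireEnd();
    }

    public static void main(String[] args) {
        // "foo$" depends on the end of input: appending text would lose the match.
        System.out.println(requireEndAfterFind("foo$", "foo"));
        // "foo" alone does not depend on where the input ends.
        System.out.println(requireEndAfterFind("foo", "foo"));
    }
}
```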
public Matcher reset ()
Resets this matcher.
Resetting a matcher discards all of its explicit state information
and sets its append position to zero. The matcher's region is set to the
default region, which is its entire character sequence. The anchoring
and transparency of this matcher's region boundaries are unaffected.
public Matcher reset (CharSequence input)
Resets this matcher with a new input sequence.
Resetting a matcher discards all of its explicit state information
and sets its append position to zero. The matcher's region is set to
the default region, which is its entire character sequence. The
anchoring and transparency of this matcher's region boundaries are
unaffected.
public int start ()
Returns the start index of the match.
public int start (int group)
Returns the start index of the subsequence captured by the given group
during this match.
Capturing groups are indexed from left
to right, starting at one. Group zero denotes the entire pattern, so
the expression m.start(0) is equivalent to
m.start().
public MatchResult toMatchResult ()
Returns the match state of this matcher as a MatchResult.
The result is unaffected by subsequent operations performed upon this
matcher.
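The snapshot behavior can be sketched as below. The class SnapshotDemo is a hypothetical name; the point is that the MatchResult keeps the first match even after the matcher moves on.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SnapshotDemo {
    /** Returns "first,second" for the first two numbers in the input. */
    static String firstAndSecond(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        if (!m.find()) {
            return "";
        }
        MatchResult first = m.toMatchResult();  // snapshot of the first match
        if (!m.find()) {                        // advances the matcher only
            return first.group();
        }
        return first.group() + "," + m.group();
    }

    public static void main(String[] args) {
        System.out.println(firstAndSecond("10 20"));  // 10,20
    }
}
```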
public String toString ()
Returns the string representation of this matcher. The
string representation of a Matcher
contains information
that may be useful for debugging. The exact format is unspecified.
public Matcher useAnchoringBounds (boolean b)
Sets the anchoring of region bounds for this matcher.
Invoking this method with an argument of true will set this
matcher to use anchoring bounds. If the boolean
argument is false, then non-anchoring bounds will be
used.
Using anchoring bounds, the boundaries of this
matcher's region match anchors such as ^ and $.
Without anchoring bounds, the boundaries of this
matcher's region will not match anchors such as ^ and $.
By default, a matcher uses anchoring region boundaries.
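A minimal sketch of the difference, assuming a hypothetical class AnchoringBoundsDemo and a fixed sample input whose region [3, 6) is "345".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnchoringBoundsDemo {
    /** Searches for ^345$ inside the region "345" of "12 345 6789". */
    static boolean findsAnchored(boolean anchoring) {
        Matcher m = Pattern.compile("^345$").matcher("12 345 6789");
        m.useAnchoringBounds(anchoring);
        m.region(3, 6);  // resetting via region() leaves the anchoring setting intact
        return m.find();
    }

    public static void main(String[] args) {
        System.out.println(findsAnchored(true));   // true: ^ and $ match at the region edges
        System.out.println(findsAnchored(false));  // false: ^ and $ only match at the input edges
    }
}
```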
public Matcher usePattern (Pattern newPattern)
Changes the Pattern that this Matcher uses to find matches.
This method causes this matcher to lose information
about the groups of the last match that occurred. The
matcher's position in the input is maintained and its
last append position is unaffected.
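Because the position in the input is kept, usePattern can be used to switch patterns partway through a scan. The sketch below uses a hypothetical class UsePatternDemo with two illustrative patterns.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UsePatternDemo {
    /** Finds a number, then switches patterns and finds the next lowercase word after it. */
    static String numberThenWord(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        if (!m.find()) {
            return "";
        }
        String number = m.group();  // read the group before switching patterns
        m.usePattern(Pattern.compile("[a-z]+"));
        // The next find() continues from where the previous match ended.
        if (!m.find()) {
            return number + "|";
        }
        return number + "|" + m.group();
    }

    public static void main(String[] args) {
        System.out.println(numberThenWord("42abc"));  // 42|abc
    }
}
```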
public Matcher useTransparentBounds (boolean b)
Sets the transparency of region bounds for this matcher.
Invoking this method with an argument of true will set this
matcher to use transparent bounds. If the boolean
argument is false, then opaque bounds will be used.
Using transparent bounds, the boundaries of this
matcher's region are transparent to lookahead, lookbehind,
and boundary matching constructs. Those constructs can see beyond the
boundaries of the region to see if a match is appropriate.
Using opaque bounds, the boundaries of this matcher's
region are opaque to lookahead, lookbehind, and boundary matching
constructs that may try to see beyond them. Those constructs cannot
look past the boundaries so they will fail to match anything outside
of the region.
By default, a matcher uses opaque bounds.
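A short sketch of opaque versus transparent bounds, using a hypothetical class TransparentBoundsDemo. The region [7, 10) of "Madagascar" is "car", and the word-boundary construct \b behaves differently depending on whether it can see the preceding "s".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TransparentBoundsDemo {
    /** Searches for the whole word "car" inside the region "car" of "Madagascar". */
    static boolean findsWholeWord(boolean transparent) {
        Matcher m = Pattern.compile("\\bcar\\b").matcher("Madagascar");
        m.useTransparentBounds(transparent);
        m.region(7, 10);
        return m.find();
    }

    public static void main(String[] args) {
        // Opaque: \b cannot see the 's' before the region, so the region edge is a boundary.
        System.out.println(findsWholeWord(false));  // true
        // Transparent: \b sees "s" then "c", which is not a word boundary.
        System.out.println(findsWholeWord(true));   // false
    }
}
```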
public final void wait () throws InterruptedException
Causes the current thread to wait until another thread invokes the
notify() method or the notifyAll() method for this object.
In other words, this method behaves exactly as if it simply
performs the call wait(0).
The current thread must own this object's monitor. The thread
releases ownership of this monitor and waits until another thread
notifies threads waiting on this object's monitor to wake up
either through a call to the notify
method or the
notifyAll
method. The thread then waits until it can
re-obtain ownership of the monitor and resumes execution.
As in the one argument version, interrupts and spurious wakeups are
possible, and this method should always be used in a loop:
synchronized (obj) {
    while (<condition does not hold>)
        obj.wait();
    ... // Perform action appropriate to condition
}
This method should only be called by a thread that is the owner
of this object's monitor. See the notify
method for a
description of the ways in which a thread can become the owner of
a monitor.
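The guarded-wait idiom can be made concrete with a one-shot gate. This is a sketch under stated assumptions: the class Gate, its methods, and the helper runOnce are hypothetical and exist only to show wait() inside a loop paired with notifyAll().

```java
final class Gate {
    private boolean open;  // guarded by this object's monitor

    /** Blocks until open() has been called; the loop guards against spurious wakeups. */
    synchronized void await() throws InterruptedException {
        while (!open) {
            wait();  // releases the monitor while waiting
        }
    }

    /** Opens the gate and wakes every waiting thread. */
    synchronized void open() {
        open = true;
        notifyAll();
    }

    /** Starts one waiter, opens the gate, and reports whether the waiter finished. */
    static boolean runOnce() {
        Gate gate = new Gate();
        Thread waiter = new Thread(() -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        gate.open();
        try {
            waiter.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

Because the condition is rechecked under the monitor, the waiter behaves correctly whether open() runs before or after it starts waiting.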
public final native void wait (long timeout) throws InterruptedException
Causes the current thread to wait until either another thread invokes the
notify() method or the notifyAll() method for this object, or a
specified amount of time has elapsed.
The current thread must own this object's monitor.
This method causes the current thread (call it T) to
place itself in the wait set for this object and then to relinquish
any and all synchronization claims on this object. Thread T
becomes disabled for thread scheduling purposes and lies dormant
until one of four things happens:
- Some other thread invokes the notify method for this
object and thread T happens to be arbitrarily chosen as
the thread to be awakened.
- Some other thread invokes the notifyAll method for this
object.
- Some other thread interrupts
thread T.
- The specified amount of real time has elapsed, more or less. If
timeout is zero, however, then real time is not taken into
consideration and the thread simply waits until notified.
The thread T is then removed from the wait set for this
object and re-enabled for thread scheduling. It then competes in the
usual manner with other threads for the right to synchronize on the
object; once it has gained control of the object, all its
synchronization claims on the object are restored to the status quo
ante - that is, to the situation as of the time that the wait
method was invoked. Thread T then returns from the
invocation of the wait method. Thus, on return from the
wait method, the synchronization state of the object and of
thread T is exactly as it was when the wait method
was invoked.
A thread can also wake up without being notified, interrupted, or
timing out, a so-called spurious wakeup. While this will rarely
occur in practice, applications must guard against it by testing for
the condition that should have caused the thread to be awakened, and
continuing to wait if the condition is not satisfied. In other words,
waits should always occur in loops, like this one:
synchronized (obj) {
    while (<condition does not hold>)
        obj.wait(timeout);
    ... // Perform action appropriate to condition
}
(For more information on this topic, see Section 3.2.3 in Doug Lea's
"Concurrent Programming in Java (Second Edition)" (Addison-Wesley,
2000), or Item 50 in Joshua Bloch's "Effective Java Programming
Language Guide" (Addison-Wesley, 2001).)
If the current thread is interrupted by another thread
while it is waiting, then an InterruptedException is thrown.
This exception is not thrown until the lock status of this object has
been restored as described above.
Note that the wait method, as it places the current thread
into the wait set for this object, unlocks only this object; any
other objects on which the current thread may be synchronized remain
locked while the thread waits.
This method should only be called by a thread that is the owner
of this object's monitor. See the notify
method for a
description of the ways in which a thread can become the owner of
a monitor.
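When a timed wait sits inside a loop, the remaining time has to be recomputed on each pass, otherwise every wakeup restarts the full timeout. The sketch below assumes a hypothetical class TimedFlag; the deadline bookkeeping is one common approach, not something prescribed by the text above.

```java
final class TimedFlag {
    private boolean set;  // guarded by this object's monitor

    /** Waits up to timeoutMillis for set(); returns false if the deadline passes first. */
    synchronized boolean await(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!set) {
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return false;  // never call wait(0) here: zero means wait until notified
            }
            wait(remainingMillis);
        }
        return true;
    }

    /** Sets the flag and wakes every waiting thread. */
    synchronized void set() {
        set = true;
        notifyAll();
    }

    /** Convenience wrapper that treats an interrupt as "not set". */
    static boolean awaitQuietly(TimedFlag flag, long timeoutMillis) {
        try {
            return flag.await(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(awaitQuietly(new TimedFlag(), 20));  // false: nobody sets the flag
    }
}
```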
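public final void wait (long timeout, int nanos) throws InterruptedException
Causes the current thread to wait until another thread invokes the
notify() method or the notifyAll() method for this object, or
some other thread interrupts the current thread, or a certain
amount of real time has elapsed.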
This method is similar to the wait
method of one
argument, but it allows finer control over the amount of time to
wait for a notification before giving up. The amount of real time,
measured in nanoseconds, is given by:
1000000*timeout+nanos
In all other respects, this method does the same thing as the
wait(long) method of one argument. In particular,
wait(0, 0) means the same thing as wait(0).
The current thread must own this object's monitor. The thread
releases ownership of this monitor and waits until either of the
following two conditions has occurred:
- Another thread notifies threads waiting on this object's monitor
to wake up either through a call to the notify method
or the notifyAll method.
- The timeout period, specified by timeout milliseconds plus nanos
nanoseconds arguments, has elapsed.
The thread then waits until it can re-obtain ownership of the
monitor and resumes execution.
As in the one argument version, interrupts and spurious wakeups are
possible, and this method should always be used in a loop:
synchronized (obj) {
    while (<condition does not hold>)
        obj.wait(timeout, nanos);
    ... // Perform action appropriate to condition
}
This method should only be called by a thread that is the owner
of this object's monitor. See the notify
method for a
description of the ways in which a thread can become the owner of
a monitor.