net.htmlparser.jericho
Class EndTag

java.lang.Object
  extended by net.htmlparser.jericho.Segment
      extended by net.htmlparser.jericho.Tag
          extended by net.htmlparser.jericho.EndTag
All Implemented Interfaces:
CharSequence, Comparable<Segment>

public final class EndTag
extends Tag

Represents the end tag of an element in a specific source document.

An end tag always has a type that is a subclass of EndTagType, meaning it always starts with the characters '</'.

EndTag instances are obtained using one of the following methods:

The Tag superclass defines the getName() method used to get the name of this end tag.

See also the XML 1.0 specification for end tags.

See Also:
Tag, StartTag, Element

Method Summary
static String generateHTML(String tagName)
          Generates the HTML text of a normal end tag with the specified tag name.
 String getDebugInfo()
          Returns a string representation of this object useful for debugging purposes.
 Element getElement()
          Returns the element that is ended by this end tag.
 EndTagType getEndTagType()
          Returns the type of this end tag.
 TagType getTagType()
          Returns the type of this tag.
 boolean isUnregistered()
          Indicates whether this tag has a syntax that does not match any of the registered tag types.
 String tidy()
          Returns an XML representation of this end tag.
 
Methods inherited from class net.htmlparser.jericho.Tag
getName, getNameSegment, getNextTag, getPreviousTag, getUserData, isXMLName, isXMLNameChar, isXMLNameStartChar, setUserData
 
Methods inherited from class net.htmlparser.jericho.Segment
charAt, compareTo, encloses, encloses, equals, getAllCharacterReferences, getAllElements, getAllElements, getAllElements, getAllElements, getAllElements, getAllElementsByClass, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTagsByClass, getAllTags, getAllTags, getBegin, getChildElements, getEnd, getFirstElement, getFirstElement, getFirstElement, getFirstElement, getFirstElementByClass, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTagByClass, getFormControls, getFormFields, getNodeIterator, getRenderer, getSource, getTextExtractor, hashCode, ignoreWhenParsing, isWhiteSpace, isWhiteSpace, length, parseAttributes, subSequence, toString
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Method Detail

getElement

public Element getElement()
Returns the element that is ended by this end tag.

Returns null if this end tag is not properly matched to any start tag in the source document.

This method is much less efficient than the StartTag.getElement() method.

IMPLEMENTATION NOTE: This method is relatively inefficient because more than one start tag type can have the same corresponding end tag type, so the type of start tag that this end tag is matched to cannot be known for certain (see EndTagType.getCorrespondingStartTagType() for more explanation). Because of this uncertainty, the implementation must check every start tag preceding this end tag, calling its StartTag.getElement() method to see whether it is terminated by this end tag.
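
The backward search described in the implementation note can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the library's code: EndTagLookupSketch, StartTagModel and findStartTagFor are hypothetical names, and each modelled start tag already knows where its element ends, whereas the real method has to compute that by calling StartTag.getElement() for each candidate.

```java
import java.util.Arrays;
import java.util.List;

final class EndTagLookupSketch {
    /** Hypothetical stand-in for a start tag: where it begins and where its element ends. */
    static final class StartTagModel {
        final String name;
        final int begin;
        final int elementEnd; // exclusive end position of the element it starts

        StartTagModel(String name, int begin, int elementEnd) {
            this.name = name;
            this.begin = begin;
            this.elementEnd = elementEnd;
        }
    }

    /**
     * Returns the start tag whose element is terminated by the end tag spanning
     * [endTagBegin, endTagEnd), or null if no preceding start tag matches.
     */
    static StartTagModel findStartTagFor(List<StartTagModel> startTags, int endTagBegin, int endTagEnd) {
        for (StartTagModel startTag : startTags) {
            if (startTag.begin >= endTagBegin) continue; // only start tags preceding the end tag
            // An element is ended by this end tag if both finish at the same position.
            if (startTag.elementEnd == endTagEnd) return startTag;
        }
        return null; // the end tag is not matched to any start tag
    }

    public static void main(String[] args) {
        // Offsets correspond to the source text "<div><p>a</p></div>".
        List<StartTagModel> startTags = Arrays.asList(
                new StartTagModel("div", 0, 19),
                new StartTagModel("p", 5, 13));
        System.out.println(findStartTagFor(startTags, 9, 13).name);  // end tag </p> at 9..13
        System.out.println(findStartTagFor(startTags, 13, 19).name); // end tag </div> at 13..19
    }
}
```

The cost of the scan grows with the number of start tags that precede the end tag, which is why navigating from a StartTag to its element is the cheaper direction.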

Specified by:
getElement in class Tag
Returns:
the element that is ended by this end tag.

getEndTagType

public EndTagType getEndTagType()
Returns the type of this end tag.

This is equivalent to (EndTagType)getTagType().

Returns:
the type of this end tag.

getTagType

public TagType getTagType()
Description copied from class: Tag
Returns the type of this tag.

Specified by:
getTagType in class Tag
Returns:
the type of this tag.

isUnregistered

public boolean isUnregistered()
Description copied from class: Tag
Indicates whether this tag has a syntax that does not match any of the registered tag types.

The only requirement for an unregistered tag is that it starts with '<' and that a closing '>' character appears at some position after it in the source document.

Whether a '/' character follows the initial '<' determines the kind of unregistered tag: if it is absent, the tag is a StartTag with a type of StartTagType.UNREGISTERED; if it is present, the tag is an EndTag with a type of EndTagType.UNREGISTERED.
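
The syntax rule for unregistered tags can be sketched as a small standalone check. This is a simplified illustration, not the parser's implementation: UnregisteredTagSketch, Kind and classify are hypothetical names, and the sketch assumes the text at the given position has already failed to match every registered tag type.

```java
final class UnregisteredTagSketch {
    enum Kind { START_TAG, END_TAG, NOT_A_TAG }

    /**
     * Classifies the text beginning at pos using only the documented requirement:
     * a '<' at pos, and a '>' somewhere after it in the source.
     */
    static Kind classify(String source, int pos) {
        if (pos < 0 || pos >= source.length() || source.charAt(pos) != '<') {
            return Kind.NOT_A_TAG;
        }
        if (source.indexOf('>', pos + 1) == -1) {
            return Kind.NOT_A_TAG; // no closing delimiter anywhere after the '<'
        }
        // A '/' immediately after '<' means an end tag, otherwise a start tag.
        boolean slashFollows = pos + 1 < source.length() && source.charAt(pos + 1) == '/';
        return slashFollows ? Kind.END_TAG : Kind.START_TAG;
    }

    public static void main(String[] args) {
        System.out.println(classify("a </~x> b", 2));        // '</' followed later by '>'
        System.out.println(classify("if (a < b) x > y", 6)); // stray less-than operator
        System.out.println(classify("a < b", 2));            // no '>' after the '<'
    }
}
```

The second call shows how a less-than operator in ordinary text satisfies the rule, which is the kind of accidental match the default searches are designed to ignore.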

There are no restrictions on the characters that might appear between these delimiters, including other '<' characters. This may result in a '>' character that is identified as the closing delimiter of two separate tags, one an unregistered tag, and the other a tag of any type that begins in the middle of the unregistered tag. As explained below, unregistered tags are usually only found when specifically looking for them, so it is up to the user to detect and deal with any such nonsensical results.

Unregistered tags are only returned by the Source.getTagAt(int pos) method, by named search methods where the specified name matches the first characters inside the tag, and by tag type search methods where the specified tagType is either StartTagType.UNREGISTERED or EndTagType.UNREGISTERED.

Open tag searches and other searches always ignore unregistered tags, although every discovery of an unregistered tag is logged by the parser.

The logic behind this design is that unregistered tag types are usually the result of a '<' character in the text that was mistakenly left unencoded, or a less-than operator inside a script, or some other occurrence which is of no interest to the user. By returning unregistered tags in named and tag type search methods, the library allows the user to specifically search for tags with a certain syntax that does not match any existing TagType. This convenience avoids the need for the user to create a custom tag type to define the syntax before searching for these tags. By not returning unregistered tags in the less specific search methods, the library provides only the information that most users are interested in.

Specified by:
isUnregistered in class Tag
Returns:
true if this tag has a syntax that does not match any of the registered tag types, otherwise false.

tidy

public String tidy()
Returns an XML representation of this end tag.

This method is included for symmetry with the StartTag.tidy() method and simply returns the source text of the tag.

Specified by:
tidy in class Tag
Returns:
an XML representation of this end tag.

generateHTML

public static String generateHTML(String tagName)
Generates the HTML text of a normal end tag with the specified tag name.

Example:

The following method call:

EndTag.generateHTML("INPUT")
returns the following output:
</INPUT>

Parameters:
tagName - the name of the end tag.
Returns:
the HTML text of a normal end tag with the specified tag name.
See Also:
StartTag.generateHTML(String tagName, Map attributesMap, boolean emptyElementTag)
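
As a rough illustration of the documented behaviour, the following standalone stand-in reproduces the output shown in the example above. It is a minimal sketch, not the library's implementation: EndTagHtmlSketch and generateEndTagHtml are hypothetical names, and the sketch assumes the tag name is emitted exactly as supplied, with no validation or case conversion.

```java
final class EndTagHtmlSketch {
    /** Builds the text of a normal end tag: '</', the tag name as given, then '>'. */
    static String generateEndTagHtml(String tagName) {
        return "</" + tagName + ">";
    }

    public static void main(String[] args) {
        System.out.println(generateEndTagHtml("INPUT")); // prints </INPUT>
    }
}
```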

getDebugInfo

public String getDebugInfo()
Description copied from class: Segment
Returns a string representation of this object useful for debugging purposes.

Overrides:
getDebugInfo in class Segment
Returns:
a string representation of this object useful for debugging purposes.