trait MarkupParser extends MarkupParserCommon with TokenTests
An XML parser.
Parses XML 1.0, invokes callback methods of a MarkupHandler, and returns whatever the markup handler returns. Use ConstructingParser if you just want to parse XML to construct instances of scala.xml.Node.
While XML elements are returned, DTD declarations (if handled) are collected through side effects.
- Self Type
- MarkupParser with MarkupHandler
Type Members
- type AttributesType = (MetaData, NamespaceBinding)
- Definition Classes
- MarkupParser → MarkupParserCommon
- type ElementType = NodeSeq
- Definition Classes
- MarkupParser → MarkupParserCommon
- type InputType = Source
- Definition Classes
- MarkupParser → MarkupParserCommon
- type NamespaceType = NamespaceBinding
- Definition Classes
- MarkupParser → MarkupParserCommon
- type PositionType = Int
- Definition Classes
- MarkupParser → MarkupParserCommon
Abstract Value Members
Concrete Value Members
- def appendText(pos: Int, ts: NodeBuffer, txt: String): Unit
- def attrDecl(): Unit
<! attlist := ATTLIST
- def ch: Char
The library and compiler parsers had the interesting distinction of different behavior for nextch (a function for which there are a total of two plausible behaviors, so we know the design space was fully explored.) One of them returned the value of nextch before the increment and one of them the new value. So to unify code we have to at least temporarily abstract over the nextchs.
- Definition Classes
- MarkupParser → MarkupParserCommon
- def checkPubID(s: String): Boolean
- Definition Classes
- TokenTests
- def checkSysID(s: String): Boolean
- Definition Classes
- TokenTests
- def content(pscope: NamespaceBinding): NodeSeq
content1 ::= '<' content1 | '&' charref ...
- def content1(pscope: NamespaceBinding, ts: NodeBuffer): Unit
'<' content1 ::= ...
- def document(): Document
[22] prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?
[23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
[25] Eq ::= S? '=' S?
[26] VersionNum ::= '1.0'
[27] Misc ::= Comment | PI | S
- val dtd: DTD
- def element(pscope: NamespaceBinding): NodeSeq
- def element1(pscope: NamespaceBinding): NodeSeq
'<' element ::= xmlTag1 '>' { xmlExpr | '{' simpleExpr '}' } ETag | xmlTag1 '/' '>'
- def elementDecl(): Unit
<! element := ELEMENT
- def entityDecl(): Unit
<! entity := ENTITY
- def eof: Boolean
- Definition Classes
- MarkupParser → MarkupParserCommon
- def errorNoEnd(tag: String): Nothing
- Definition Classes
- MarkupParser → MarkupParserCommon
- val extIndex: Int
- def extSubset(): Unit
- def externalID(): ExternalID
externalID ::= SYSTEM S syslit | PUBLIC S pubid S syslit
- def initialize: MarkupParser.this
As the current code requires you to call nextch once manually after construction, this method formalizes that suboptimal reality.
- val inpStack: List[Source]
stack of inputs
- def intSubset(): Unit
"rec-xml/#ExtSubset" pe references may not occur within markup declarations
- def isAlpha(c: Char): Boolean
These are 99% sure to be redundant but refactoring on the safe side.
- Definition Classes
- TokenTests
- def isAlphaDigit(c: Char): Boolean
- Definition Classes
- TokenTests
- def isName(s: String): Boolean
Name ::= ( Letter | '_' ) (NameChar)*
See [5] of XML 1.0 specification.
- Definition Classes
- TokenTests
- def isNameChar(ch: Char): Boolean
NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | #xB7 | CombiningChar | Extender
See [4] and [4a] of Appendix B of XML 1.0 specification.
- Definition Classes
- TokenTests
- def isNameStart(ch: Char): Boolean
NameStart ::= ( Letter | '_' | ':' )
where Letter means in one of the Unicode general categories { Ll, Lu, Lo, Lt, Nl }. We do not allow a name to start with ':'. See [4] and Appendix B of XML 1.0 specification.
- Definition Classes
- TokenTests
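The three name predicates above can be approximated without the library. What follows is a minimal sketch under stated assumptions, not the TokenTests source: the object name NameSketch is hypothetical, the category test relies on java.lang.Character.getType, and the CombiningChar and Extender alternatives of NameChar are deliberately left out.

```scala
object NameSketch {
  // Unicode general categories Ll, Lu, Lo, Lt, Nl, as listed for Letter above.
  private val letterTypes: Set[Int] = Set[Int](
    Character.LOWERCASE_LETTER.toInt,
    Character.UPPERCASE_LETTER.toInt,
    Character.OTHER_LETTER.toInt,
    Character.TITLECASE_LETTER.toInt,
    Character.LETTER_NUMBER.toInt
  )

  // A name may start with a Letter or '_'; a leading ':' is rejected.
  def isNameStart(ch: Char): Boolean =
    ch == '_' || letterTypes.contains(Character.getType(ch))

  // Simplification (assumption): CombiningChar and Extender are not checked.
  def isNameChar(ch: Char): Boolean =
    isNameStart(ch) || Character.isDigit(ch) || ".-:\u00B7".indexOf(ch.toInt) >= 0

  // Name ::= ( Letter | '_' ) (NameChar)*
  def isName(s: String): Boolean =
    s.nonEmpty && isNameStart(s.charAt(0)) && s.drop(1).forall(c => isNameChar(c))
}
```

With this sketch, NameSketch.isName("_x:y") holds, while NameSketch.isName(":a") and NameSketch.isName("1abc") do not.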
- def isPubIDChar(ch: Char): Boolean
- Definition Classes
- TokenTests
- final def isSpace(cs: collection.Seq[Char]): Boolean
(#x20 | #x9 | #xD | #xA)+
- Definition Classes
- TokenTests
- final def isSpace(ch: Char): Boolean
(#x20 | #x9 | #xD | #xA)
- Definition Classes
- TokenTests
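The two isSpace overloads correspond to the S production with and without the trailing +. A minimal sketch of that reading, assuming a hypothetical object SpaceSketch rather than the library trait:

```scala
object SpaceSketch {
  // (#x20 | #x9 | #xD | #xA): space, tab, carriage return, line feed.
  def isSpace(ch: Char): Boolean = ch match {
    case ' ' | '\t' | '\r' | '\n' => true
    case _                        => false
  }

  // (#x20 | #x9 | #xD | #xA)+: at least one character, all of them whitespace.
  def isSpace(cs: scala.collection.Seq[Char]): Boolean =
    cs.nonEmpty && cs.forall(c => isSpace(c))
}
```

Because of the +, the sequence overload in this sketch rejects the empty sequence.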
- def isValidIANAEncoding(ianaEncoding: collection.Seq[Char]): Boolean
Returns true if the encoding name is a valid IANA encoding. This method does not verify that there is a decoder available for this encoding, only that the characters are valid for an IANA encoding name.
- ianaEncoding
The IANA encoding name.
- Definition Classes
- TokenTests
- val lastChRead: Char
- def lookahead(): BufferedIterator[Char]
Create a lookahead reader which does not influence the input
- Definition Classes
- MarkupParser → MarkupParserCommon
- def markupDecl(): Unit
- def markupDecl1(): Any
- def mkAttributes(name: String, pscope: NamespaceBinding): (MarkupParser.this)#AttributesType
- Definition Classes
- MarkupParser → MarkupParserCommon
- def mkProcInstr(position: Int, name: String, text: String): (MarkupParser.this)#ElementType
- Definition Classes
- MarkupParser → MarkupParserCommon
- val nextChNeeded: Boolean
holds the next character
- def nextch(): Unit
Tells ch to fetch the next character the next time ch is called.
- Definition Classes
- MarkupParser → MarkupParserCommon
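The interplay of ch, nextch, nextChNeeded, lastChRead and reachedEof can be pictured as a lazily advancing cursor. The sketch below is an illustration only: the class name LazyCursorSketch is hypothetical, it reads from a plain Iterator[Char] instead of a Source, and it ignores the input stack and position tracking.

```scala
final class LazyCursorSketch(input: Iterator[Char]) {
  private var nextChNeeded = true        // true when ch must read from input before answering
  private var lastChRead: Char = 0.toChar
  var reachedEof = false

  // Returns the current character, reading from the input only if an advance is pending.
  def ch: Char = {
    if (nextChNeeded) {
      if (input.hasNext) lastChRead = input.next()
      else { lastChRead = 0.toChar; reachedEof = true }
      nextChNeeded = false
    }
    lastChRead
  }

  // Settles any pending read, then marks that the next call to ch must advance.
  def nextch(): Unit = {
    ch
    nextChNeeded = true
  }
}
```

In this sketch, repeated calls to ch return the same character until nextch() is called, and the input is only consumed when ch is next asked for a value.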
- def notationDecl(): Unit
'N' notationDecl ::= "OTATION"
- def parseDTD(): Unit
Parses the document type declaration and assigns it to the instance variable dtd.
<! parseDTD ::= DOCTYPE name ... >
- def pop(): Unit
- val pos: Int
holds the position in the source file
- def prolog(): (Option[String], Option[String], Option[Boolean])
<? prolog ::= xml S? // this is a bit more lenient than necessary...
- def pubidLiteral(): String
[12] PubidLiteral ::= '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"
- def push(entityName: String): Unit
- def pushExternal(systemId: String): Unit
- val reachedEof: Boolean
- def reportSyntaxError(str: String): Unit
- Definition Classes
- MarkupParser → MarkupParserCommon
- def reportSyntaxError(pos: Int, str: String): Unit
- Definition Classes
- MarkupParser → MarkupParserCommon
- def reportValidationError(pos: Int, str: String): Unit
- def returning[T](x: T)(f: (T) => Unit): T
Applies a function to the passed value and returns that value.
- Definition Classes
- MarkupParserCommon
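A helper with the signature of returning can be written in one line. The following is a sketch of the idea in a hypothetical object ReturningSketch, not a copy of the library code:

```scala
object ReturningSketch {
  // Runs the side-effecting function f on x, then hands x back unchanged.
  def returning[T](x: T)(f: T => Unit): T = { f(x); x }

  // Usage: create and fill a buffer in a single expression.
  val filled: StringBuilder =
    returning(new StringBuilder) { sb => sb.append("abc"); () }
}
```

This shape is convenient when a value has to be configured through side effects before it is returned.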
- def saving[A, B](getter: A, setter: (A) => Unit)(body: => B): B
Executes body with a variable saved beforehand and restored after execution.
- Definition Classes
- MarkupParserCommon
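The save-and-restore pattern behind saving can be sketched as follows. This assumes the restore should also happen when body throws, which is why a try/finally is used; the object SavingSketch and the variable depth are hypothetical names for illustration.

```scala
object SavingSketch {
  // Captures the current value, runs body, then writes the captured value back.
  def saving[A, B](getter: A, setter: A => Unit)(body: => B): B = {
    val saved = getter
    try body
    finally setter(saved)
  }

  var depth = 0

  // Inside the block depth is 5; afterwards it is restored to 0.
  val inside: Int = saving(depth, (d: Int) => depth = d) {
    depth = 5
    depth * 2
  }
}
```

Note that getter is passed by value, so the variable is read once at the call site, before body runs.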
- def systemLiteral(): String
attribute value, terminated by either ' or ". value may not contain <.
AttValue ::= `'` { _ } `'` | `"` { _ } `"`
- def textDecl(): (Option[String], Option[String])
prolog, but without standalone
- val tmppos: Int
holds temporary values of pos
- Definition Classes
- MarkupParser → MarkupParserCommon
- def truncatedError(msg: String): Nothing
- Definition Classes
- MarkupParser → MarkupParserCommon
- def xAttributeValue(): String
- Definition Classes
- MarkupParserCommon
- def xAttributeValue(endCh: Char): String
attribute value, terminated by either ' or ". value may not contain <.
- endCh
either ' or "
- Definition Classes
- MarkupParserCommon
- def xAttributes(pscope: NamespaceBinding): (MetaData, NamespaceBinding)
parse attribute and create namespace scope, metadata
[41] Attributes ::= { S Name Eq AttValue }
- def xCharData: NodeSeq
'<! CharData ::= [CDATA[ ( {char} - {char}"]]>"{char} ) ']]>' see [15]
- def xCharRef: String
- Definition Classes
- MarkupParserCommon
- def xCharRef(it: Iterator[Char]): String
- Definition Classes
- MarkupParserCommon
- def xCharRef(ch: () => Char, nextch: () => Unit): String
CharRef ::= "&#" '0'..'9' {'0'..'9'} ";" | "&#x" '0'..'9'|'A'..'F'|'a'..'f' { hexdigit } ";"
see [66]
- Definition Classes
- MarkupParserCommon
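The CharRef grammar above can be decoded with a pair of ch and nextch closures, in the style of the third overload. This is a minimal sketch, not the library implementation: it assumes the leading "&#" has already been consumed, it reports errors through require, and the names CharRefSketch, charRef and decode are hypothetical.

```scala
object CharRefSketch {
  // Decodes the part of a character reference that follows "&#", up to the ';'.
  def charRef(ch: () => Char, nextch: () => Unit): String = {
    val hex = ch() == 'x'
    if (hex) nextch()                      // skip the 'x' marker of "&#x"
    val base = if (hex) 16 else 10
    var code = 0
    while (ch() != ';') {
      val digit = Character.digit(ch(), base)
      require(digit >= 0, s"illegal character '${ch()}' in character reference")
      code = code * base + digit
      nextch()
    }
    new String(Character.toChars(code))
  }

  // Hypothetical driver: feeds charRef from a string such as "x41;" or "65;".
  def decode(body: String): String = {
    val it = body.iterator
    var cur = it.next()
    charRef(() => cur, () => cur = it.next())
  }
}
```

In this sketch, both CharRefSketch.decode("65;") and CharRefSketch.decode("x41;") yield the string "A".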
- def xComment: NodeSeq
Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->' see [15]
- def xEQ(): Unit
scan [S] '=' [S]
- Definition Classes
- MarkupParserCommon
- def xEndTag(startName: String): Unit
[42] '<' xmlEndTag ::= '<' '/' Name S? '>'
- Definition Classes
- MarkupParserCommon
- def xEntityValue(): String
entity value, terminated by either ' or ". value may not contain <.
AttValue ::= `'` { _ } `'` | `"` { _ } `"`
- def xHandleError(that: Char, msg: String): Unit
- Definition Classes
- MarkupParser → MarkupParserCommon
- def xName: String
actually, Name ::= (Letter | '_' | ':') (NameChar)* but starting with ':' cannot happen, so Name ::= (Letter | '_') (NameChar)*
see [5] of XML 1.0 specification
pre-condition: ch != ':' (assured by definition of XMLSTART token)
post-condition: name neither starts nor ends with ':'
- Definition Classes
- MarkupParserCommon
- def xProcInstr: (MarkupParser.this)#ElementType
'<?' ProcInstr ::= Name [S ({Char} - ({Char}'>?' {Char})]'?>'
see [15]
- Definition Classes
- MarkupParserCommon
- def xSpace(): Unit
scan [3] S ::= (#x20 | #x9 | #xD | #xA)+
- Definition Classes
- MarkupParserCommon
- def xSpaceOpt(): Unit
skip optional space S?
- Definition Classes
- MarkupParserCommon
- def xToken(that: collection.Seq[Char]): Unit
- Definition Classes
- MarkupParserCommon
- def xToken(that: Char): Unit
- Definition Classes
- MarkupParserCommon
- def xmlProcInstr(): MetaData
<? prolog ::= xml S ... ?>