| 1 | /**************************************************************************** |
| 2 | ** |
| 3 | ** Copyright (C) 2016 The Qt Company Ltd. |
| 4 | ** Contact: https://www.qt.io/licensing/ |
| 5 | ** |
| 6 | ** This file is part of the QtXmlPatterns module of the Qt Toolkit. |
| 7 | ** |
| 8 | ** $QT_BEGIN_LICENSE:LGPL$ |
| 9 | ** Commercial License Usage |
| 10 | ** Licensees holding valid commercial Qt licenses may use this file in |
| 11 | ** accordance with the commercial license agreement provided with the |
| 12 | ** Software or, alternatively, in accordance with the terms contained in |
| 13 | ** a written agreement between you and The Qt Company. For licensing terms |
| 14 | ** and conditions see https://www.qt.io/terms-conditions. For further |
| 15 | ** information use the contact form at https://www.qt.io/contact-us. |
| 16 | ** |
| 17 | ** GNU Lesser General Public License Usage |
| 18 | ** Alternatively, this file may be used under the terms of the GNU Lesser |
| 19 | ** General Public License version 3 as published by the Free Software |
| 20 | ** Foundation and appearing in the file LICENSE.LGPL3 included in the |
| 21 | ** packaging of this file. Please review the following information to |
| 22 | ** ensure the GNU Lesser General Public License version 3 requirements |
| 23 | ** will be met: https://www.gnu.org/licenses/lgpl-3.0.html. |
| 24 | ** |
| 25 | ** GNU General Public License Usage |
| 26 | ** Alternatively, this file may be used under the terms of the GNU |
| 27 | ** General Public License version 2.0 or (at your option) the GNU General |
| 28 | ** Public license version 3 or any later version approved by the KDE Free |
| 29 | ** Qt Foundation. The licenses are as published by the Free Software |
| 30 | ** Foundation and appearing in the file LICENSE.GPL2 and LICENSE.GPL3 |
| 31 | ** included in the packaging of this file. Please review the following |
| 32 | ** information to ensure the GNU General Public License requirements will |
| 33 | ** be met: https://www.gnu.org/licenses/gpl-2.0.html and |
| 34 | ** https://www.gnu.org/licenses/gpl-3.0.html. |
| 35 | ** |
| 36 | ** $QT_END_LICENSE$ |
| 37 | ** |
| 38 | ****************************************************************************/ |
| 39 | |
| 40 | #include <QString> |
| 41 | |
| 42 | #include "qitem_p.h" |
| 43 | |
| 44 | #include "qabstractxmlreceiver_p.h" |
| 45 | #include "qabstractxmlreceiver.h" |
| 46 | |
| 47 | QT_BEGIN_NAMESPACE |
| 48 | |
| 49 | /*! |
| 50 | \class QAbstractXmlReceiver |
| 51 | \brief The QAbstractXmlReceiver class provides a callback interface |
| 52 | for transforming the output of a QXmlQuery. |
| 53 | \reentrant |
| 54 | \since 4.4 |
| 55 | \ingroup xml-tools |
| 56 | \inmodule QtXmlPatterns |
| 57 | |
| 58 | QAbstractXmlReceiver is an abstract base class that provides |
| 59 | a callback interface for receiving an \l {XQuery Sequence} |
| 60 | {XQuery sequence}, usually the output of a QXmlQuery, and |
| 61 | transforming that sequence into a structure of your choosing, |
| 62 | usually XML. Consider the example: |
| 63 | |
| 64 | \snippet code/src_xmlpatterns_api_qabstractxmlreceiver.cpp 0 |
| 65 | |
| 66 | First it constructs a \l {QXmlQuery} {query} that gets the |
| 67 | first paragraph from document \c index.html. Then it constructs |
| 68 | an \l {QXmlSerializer} {XML serializer} with the \l {QXmlQuery} |
| 69 | {query} and \l {QIODevice} {myOutputDevice} (note that the |
| 70 | \l {QXmlSerializer} {serializer} is an \e {XML receiver}, |
| 71 | i.e., a subclass of QAbstractXmlReceiver). Finally, it |
| 72 | \l {QXmlQuery::evaluateTo()} {evaluates} the |
| 73 | \l {QXmlQuery} {query}, producing an ordered sequence of calls |
| 74 | to the \l {QXmlSerializer} {serializer's} callback functions. |
| 75 | The sequence of callbacks transforms the query output to XML |
| 76 | and writes it to \l {QIODevice} {myOutputDevice}. |
| 77 | |
| 78 | Although the example uses \l {QXmlQuery} to produce the sequence |
| 79 | of callbacks to functions in QAbstractXmlReceiver, you can call |
| 80 | the callback functions directly as long as your sequence of |
| 81 | calls represents a valid \l {XQuery Sequence} {XQuery sequence}. |
| 82 | |
| 83 | \target XQuery Sequence |
| 84 | \section1 XQuery Sequences |
| 85 | |
| 86 | An XQuery \e sequence is an ordered collection of zero, one, |
| 87 | or many \e items. Each \e item is either an \e {atomic value} |
| 88 | or a \e {node}. An \e {atomic value} is a simple data value. |
| 89 | |
| 90 | There are six kinds of \e nodes. |
| 91 | |
| 92 | \list |
| 93 | |
| 94 | \li An \e {Element Node} represents an XML element. |
| 95 | |
| 96 | \li An \e {Attribute Node} represents an XML attribute. |
| 97 | |
| 98 | \li A \e {Document Node} represents an entire XML document. |
| 99 | |
| 100 | \li A \e {Text Node} represents character data (element content). |
| 101 | |
| 102 | \li A \e {Processing Instruction Node} represents an XML |
| 103 | processing instruction, which is used in an XML document |
| 104 | to tell the application reading the document to perform |
| 105 | some action. A typical example is to use a processing |
| 106 | instruction to tell the application to use a particular |
| 107 | XSLT stylesheet to display the document. |
| 108 | |
| 109 | \li A \e {Comment Node} represents an XML comment. |
| 110 | |
| 111 | \endlist |
| 112 | |
| 113 | The \e sequence of \e nodes and \e {atomic values} obeys |
| 114 | the following rules. Note that \e {Namespace Node} refers |
| 115 | to a special \e {Attribute Node} with name \e {xmlns}. |
| 116 | |
| 117 | \list |
| 118 | |
| 119 | \li Each \e node appears in the \e sequence before its children |
| 120 | and their descendants appear. |
| 121 | |
| 122 | \li A \e node's descendants appear in the \e sequence before |
| 123 | any of its siblings appear. |
| 124 | |
| 125 | \li A \e {Document Node} represents an entire document. Zero or |
| 126 | more \e {Document Nodes} can appear in a \e sequence, but they |
| 127 | can only be top level items (i.e., a \e {Document Node} can't |
| 128 | be a child of another \e node). |
| 129 | |
| 130 | \li \e {Namespace Nodes} immediately follow the \e {Element Node} |
| 131 | with which they are associated. |
| 132 | |
| 133 | \li \e {Attribute Nodes} immediately follow the \e {Namespace Nodes} |
| 134 | of the element with which they are associated, or... |
| 135 | |
| 136 | \li If there are no \e {Namespace Nodes} following an element, then |
| 137 | the \e {Attribute Nodes} immediately follow the element. |
| 138 | |
| 139 | \li An \e {atomic value} can only appear as a top level \e item, |
| 140 | i.e., it can't appear as a child of a \e node. |
| 141 | |
| 142 | \li \e {Processing Instruction Nodes} do not have children, and |
| 143 | their parent is either a \e {Document Node} or an \e {Element |
| 144 | Node}. |
| 145 | |
| 146 | \li \e {Comment Nodes} do not have children, and |
| 147 | their parent is either a \e {Document Node} or an \e {Element |
| 148 | Node}. |
| 149 | |
| 150 | \endlist |
| 151 | |
| 152 | The \e sequence of \e nodes and \e {atomic values} is sent to |
| 153 | a QAbstractXmlReceiver (QXmlSerializer in |
| 154 | the example above) as a sequence of calls to the receiver's |
| 155 | callback functions. The mapping of callback functions to |
| 156 | sequence items is as follows. |
| 157 | |
| 158 | \list |
| 159 | |
| 160 | \li startDocument() and endDocument() are called for each |
| 161 | \e {Document Node} in the \e sequence. endDocument() is not |
| 162 | called until all the \e {Document Node's} children have |
| 163 | appeared in the \e sequence. |
| 164 | |
| 165 | \li startElement() and endElement() are called for each |
| 166 | \e {Element Node}. endElement() is not called until all the |
| 167 | \e {Element Node's} children have appeared in the \e sequence. |
| 168 | |
| 169 | \li attribute() is called for each \e {Attribute Node}. |
| 170 | |
| 171 | \li comment() is called for each \e {Comment Node}. |
| 172 | |
| 173 | \li characters() is called for each \e {Text Node}. |
| 174 | |
| 175 | \li processingInstruction() is called for each \e {Processing |
| 176 | Instruction Node}. |
| 177 | |
| 178 | \li namespaceBinding() is called for each \e {Namespace Node}. |
| 179 | |
| 180 | \li atomicValue() is called for each \e {atomic value}. |
| 181 | |
| 182 | \endlist |
| 183 | |
| 184 | For a complete explanation of XQuery sequences, visit |
| 185 | \l {http://www.w3.org/TR/xpath-datamodel/}{XQuery Data Model}. |
| 186 | |
| 187 | \sa {http://www.w3.org/TR/xpath-datamodel/}{W3C XQuery 1.0 and XPath 2.0 Data Model (XDM)} |
| 188 | \sa QXmlSerializer |
| 189 | \sa QXmlResultItems |
| 190 | */ |
| 191 | |
| 192 | template<const QXmlNodeModelIndex::Axis axis> |
| 193 | void QAbstractXmlReceiver::sendFromAxis(const QXmlNodeModelIndex &node) |
| 194 | { |
| 195 | Q_ASSERT(!node.isNull()); |
| 196 | const QXmlNodeModelIndex::Iterator::Ptr it(node.iterate(axis)); |
| 197 | QXmlNodeModelIndex next(it->next()); |
| 198 | |
| 199 | while(!next.isNull()) |
| 200 | { |
| 201 | sendAsNode(next); |
| 202 | next = it->next(); |
| 203 | } |
| 204 | } |
| 205 | |
| 206 | /*! |
| 207 | \internal |
| 208 | */ |
| 209 | QAbstractXmlReceiver::QAbstractXmlReceiver(QAbstractXmlReceiverPrivate *d) |
| 210 | : d_ptr(d) |
| 211 | { |
| 212 | } |
| 213 | |
| 214 | /*! |
| 215 | Constructs an abstract XML receiver. |
| 216 | */ |
| 217 | QAbstractXmlReceiver::QAbstractXmlReceiver() : d_ptr(0) |
| 218 | { |
| 219 | } |
| 220 | |
| 221 | /*! |
| 222 | Destroys the XML receiver. |
| 223 | */ |
| 224 | QAbstractXmlReceiver::~QAbstractXmlReceiver() |
| 225 | { |
| 226 | } |
| 227 | |
| 228 | /*! |
| 229 | \fn void QAbstractXmlReceiver::startElement(const QXmlName &name) |
| 230 | |
| 231 | This callback is called when a new element node appears |
| 232 | in the \l {XQuery Sequence} {sequence}. \a name is the |
| 233 | valid \l {QXmlName} {name} of the element node. |
| 234 | */ |
| 235 | |
| 236 | /* |
| 237 | ### Qt 5: |
| 238 | |
| 239 | Consider how source locations should be communicated. Maybe every signature |
| 240 | should be extended by adding "qint64 line = -1, qint64 column = -1". |
| 241 | */ |
| 242 | |
| 243 | /*! |
| 244 | \fn void QAbstractXmlReceiver::endElement() |
| 245 | |
| 246 | This callback is called when the end of an element node |
| 247 | appears in the \l {XQuery Sequence} {sequence}. |
| 248 | */ |
| 249 | |
| 250 | /*! |
| 251 | \fn void QAbstractXmlReceiver::attribute(const QXmlName &name, |
| 252 | const QStringRef &value) |
| 253 | This callback is called when an attribute node |
| 254 | appears in the \l {XQuery Sequence} {sequence}. |
| 255 | \a name is the \l {QXmlName} {attribute name} and |
| 256 | the \a value string contains the attribute value. |
| 257 | */ |
| 258 | |
| 259 | /*! |
| 260 | \fn void QAbstractXmlReceiver::comment(const QString &value) |
| 261 | |
| 262 | This callback is called when a comment node appears |
| 263 | in the \l {XQuery Sequence} {sequence}. The \a value |
| 264 | is the comment text, which must not contain the string |
| 265 | "--". |
| 266 | */ |
| 267 | |
| 268 | /*! |
| 269 | \fn void QAbstractXmlReceiver::characters(const QStringRef &value) |
| 270 | |
| 271 | This callback is called when a text node appears in the |
| 272 | \l {XQuery Sequence} {sequence}. The \a value contains |
| 273 | the text. Adjacent text nodes must not occur in the |
| 274 | \l {XQuery Sequence} {sequence}, i.e., this callback must not |
| 275 | be called twice in a row. |
| 276 | */ |
| 277 | |
| 278 | /*! |
| 279 | \fn void QAbstractXmlReceiver::startDocument() |
| 280 | |
| 281 | This callback is called when a document node appears |
| 282 | in the \l {XQuery Sequence} {sequence}. |
| 283 | */ |
| 284 | |
| 285 | /* |
| 286 | ### Qt 5: |
| 287 | |
| 288 | Change |
| 289 | virtual void startDocument() = 0; |
| 290 | |
| 291 | To: |
| 292 | virtual void startDocument(const QUrl &uri) = 0; |
| 293 | |
| 294 | Such that it allows the document URI to be communicated. The contract would |
| 295 | allow null QUrls. |
| 296 | */ |
| 297 | |
| 298 | /*! |
| 299 | \fn void QAbstractXmlReceiver::endDocument() |
| 300 | |
| 301 | This callback is called when the end of a document node |
| 302 | appears in the \l {XQuery Sequence} {sequence}. |
| 303 | */ |
| 304 | |
| 305 | /*! |
| 306 | \fn void QAbstractXmlReceiver::processingInstruction(const QXmlName &target, |
| 307 | const QString &value) |
| 308 | |
| 309 | This callback is called when a processing instruction |
| 310 | appears in the \l {XQuery Sequence} {sequence}. |
| 311 | A processing instruction is used in an XML document |
| 312 | to tell the application reading the document to |
| 313 | perform some action. A typical example is to use a |
| 314 | processing instruction to tell the application to use a |
| 315 | particular XSLT stylesheet to process the document. |
| 316 | |
| 317 | \quotefile patternist/xmlStylesheet.xq |
| 318 | |
| 319 | \a target is the \l {QXmlName} {name} of the processing |
| 320 | instruction. Its \e prefix and \e {namespace URI} must both |
| 321 | be empty. Its \e {local name} is the target. In the above |
| 322 | example, the name is \e {xml-stylesheet}. |
| 323 | |
| 324 | The \a value specifies the action to be taken. Note that |
| 325 | the \a value must not contain the string "?>". In the above |
| 326 | example, the \a value is \e{type="test/xsl" href="formatter.xsl"}. |
| 327 | |
| 328 | Generally, use of processing instructions should be avoided, |
| 329 | because they are not namespace aware and in many contexts |
| 330 | are stripped out anyway. Processing instructions can often |
| 331 | be replaced with elements from a custom namespace. |
| 332 | */ |
| 333 | |
| 334 | /*! |
| 335 | \fn void QAbstractXmlReceiver::atomicValue(const QVariant &value) |
| 336 | |
| 337 | This callback is called when an atomic value appears in the \l |
| 338 | {XQuery Sequence} {sequence}. The \a value is a simple \l {QVariant} |
| 339 | {data value}. It is guaranteed to be \l {QVariant::isValid()} |
| 340 | {valid}. |
| 341 | */ |
| 342 | |
| 343 | /*! |
| 344 | \fn virtual void QAbstractXmlReceiver::namespaceBinding(const QXmlName &name) |
| 345 | |
| 346 | This callback is called when a namespace binding is in scope of an |
| 347 | element. A namespace is defined by a URI. In the \l {QXmlName} |
| 348 | \a name, the value of \l {QXmlName::namespaceUri()} is that URI. The |
| 349 | value of \l {QXmlName::prefix()} is the prefix that the URI is bound |
| 350 | to. The local name is insignificant and can be an arbitrary value. |
| 351 | */ |
| 352 | |
| 353 | /*! |
| 354 | \internal |
| 355 | |
| 356 | Treats \a outputItem as a node and calls the appropriate function, |
| 357 | e.g., attribute() or comment(), depending on its |
| 358 | QXmlNodeModelIndex::NodeKind. |
| 359 | |
| 360 | This is a helper function that subclasses can use to multiplex |
| 361 | Nodes received via item(). |
| 362 | */ |
| 363 | void QAbstractXmlReceiver::sendAsNode(const QPatternist::Item &outputItem) |
| 364 | { |
| 365 | Q_ASSERT(outputItem); |
| 366 | Q_ASSERT(outputItem.isNode()); |
| 367 | const QXmlNodeModelIndex asNode = outputItem.asNode(); |
| 368 | |
| 369 | switch(asNode.kind()) |
| 370 | { |
| 371 | case QXmlNodeModelIndex::Attribute: |
| 372 | { |
| 373 | const QString &v = outputItem.stringValue(); |
| 374 | attribute(asNode.name(), QStringRef(&v)); |
| 375 | return; |
| 376 | } |
| 377 | case QXmlNodeModelIndex::Element: |
| 378 | { |
| 379 | startElement(asNode.name()); |
| 380 | |
| 381 | /* First the namespaces, then attributes, then the children. */ |
| 382 | asNode.sendNamespaces(this); |
| 383 | sendFromAxis<QXmlNodeModelIndex::AxisAttribute>(asNode); |
| 384 | sendFromAxis<QXmlNodeModelIndex::AxisChild>(asNode); |
| 385 | |
| 386 | endElement(); |
| 387 | |
| 388 | return; |
| 389 | } |
| 390 | case QXmlNodeModelIndex::Text: |
| 391 | { |
| 392 | const QString &v = asNode.stringValue(); |
| 393 | characters(QStringRef(&v)); |
| 394 | return; |
| 395 | } |
| 396 | case QXmlNodeModelIndex::ProcessingInstruction: |
| 397 | { |
| 398 | processingInstruction(asNode.name(), outputItem.stringValue()); |
| 399 | return; |
| 400 | } |
| 401 | case QXmlNodeModelIndex::Comment: |
| 402 | { |
| 403 | comment(outputItem.stringValue()); |
| 404 | return; |
| 405 | } |
| 406 | case QXmlNodeModelIndex::Document: |
| 407 | { |
| 408 | startDocument(); |
| 409 | sendFromAxis<QXmlNodeModelIndex::AxisChild>(asNode); |
| 410 | endDocument(); |
| 411 | return; |
| 412 | } |
| 413 | case QXmlNodeModelIndex::Namespace: |
| 414 | Q_ASSERT_X(false, Q_FUNC_INFO, "Not implemented" ); |
| 415 | } |
| 416 | |
| 417 | Q_ASSERT_X(false, Q_FUNC_INFO, |
| 418 | QString::fromLatin1("Unknown node type: %1" ).arg(asNode.kind()).toUtf8().constData()); |
| 419 | } |
| 420 | |
| 421 | /*! |
| 422 | \internal |
| 423 | |
| 424 | This function may be called instead of characters() if, and only if, |
| 425 | \a value consists only of whitespace. |
| 426 | |
| 427 | The caller guarantees that \a value is not empty. |
| 428 | |
| 429 | \e Whitespace refers to a sequence of characters that are either |
| 430 | spaces, tabs, or newlines, in any order. In other words, not every |
| 431 | character in the Unicode whitespace category counts as whitespace here. |
| 432 | |
| 433 | However, there is no guarantee or requirement that whitespaceOnly() |
| 434 | is called for text nodes containing whitespace only. characters() |
| 435 | may be called just as well. This is why the default implementation |
| 436 | for whitespaceOnly() calls characters(). |
| 437 | |
| 438 | \sa characters() |
| 439 | */ |
| 440 | void QAbstractXmlReceiver::whitespaceOnly(const QStringRef &value) |
| 441 | { |
| 442 | Q_ASSERT_X(value.toString().trimmed().isEmpty(), Q_FUNC_INFO, |
| 443 | "The caller must guarantee only whitespace is passed. Use characters() in other cases." ); |
| 444 | const QString &v = value.toString(); |
| 445 | characters(QStringRef(&v)); |
| 446 | } |
| 447 | |
| 448 | /*! |
| 449 | \internal |
| 450 | */ |
| 451 | void QAbstractXmlReceiver::item(const QPatternist::Item &item) |
| 452 | { |
| 453 | if(item.isNode()) |
| 454 | return sendAsNode(item); |
| 455 | else |
| 456 | atomicValue(QPatternist::AtomicValue::toQt(item.asAtomicValue())); |
| 457 | } |
| 458 | |
| 459 | /*! |
| 460 | \fn void QAbstractXmlReceiver::startOfSequence() |
| 461 | |
| 462 | This callback is called once only, right before the |
| 463 | \l {XQuery Sequence} {sequence} begins. |
| 464 | */ |
| 465 | |
| 466 | /*! |
| 467 | \fn void QAbstractXmlReceiver::endOfSequence() |
| 468 | |
| 469 | This callback is called once only, right after the |
| 470 | \l {XQuery Sequence} {sequence} ends. |
| 471 | */ |
| 472 | |
| 473 | QT_END_NAMESPACE |
| 474 | |
| 475 | |
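
The callback order that the class documentation describes for an element node (start of the element, then namespace bindings, then attributes, then children, then end of the element) can be illustrated without Qt. The sketch below is not Qt code and does not use the Qt API: `Receiver`, `Node`, `sendNode()`, `TraceReceiver`, and `traceExample()` are hypothetical stand-ins that only mimic the shape of the callbacks, and plain strings are assumed in place of QXmlName and QStringRef.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the callback interface; this is NOT the Qt class.
struct Receiver {
    virtual ~Receiver() = default;
    virtual void startElement(const std::string &name) = 0;
    virtual void endElement() = 0;
    virtual void namespaceBinding(const std::string &prefix, const std::string &uri) = 0;
    virtual void attribute(const std::string &name, const std::string &value) = 0;
    virtual void characters(const std::string &value) = 0;
};

// Hypothetical node: an element when `name` is non-empty, otherwise a text node.
struct Node {
    std::string name;
    std::string text;
    std::vector<std::pair<std::string, std::string>> namespaces;
    std::vector<std::pair<std::string, std::string>> attributes;
    std::vector<Node> children;
};

// Emits callbacks in the documented order: element start, namespace bindings,
// attributes, children (recursively), element end.
void sendNode(const Node &node, Receiver &receiver)
{
    if (node.name.empty()) {
        receiver.characters(node.text);
        return;
    }
    receiver.startElement(node.name);
    for (const auto &ns : node.namespaces)
        receiver.namespaceBinding(ns.first, ns.second);
    for (const auto &attr : node.attributes)
        receiver.attribute(attr.first, attr.second);
    for (const Node &child : node.children)
        sendNode(child, receiver);
    receiver.endElement();
}

// Records every callback as one line of text so the order can be inspected.
struct TraceReceiver : Receiver {
    std::string trace;
    void startElement(const std::string &name) override { trace += "startElement " + name + "\n"; }
    void endElement() override { trace += "endElement\n"; }
    void namespaceBinding(const std::string &prefix, const std::string &uri) override
    { trace += "namespaceBinding " + prefix + "=" + uri + "\n"; }
    void attribute(const std::string &name, const std::string &value) override
    { trace += "attribute " + name + "=" + value + "\n"; }
    void characters(const std::string &value) override { trace += "characters " + value + "\n"; }
};

// Builds <p xmlns:x="urn:example" id="first">Hello</p> and returns the callback trace.
std::string traceExample()
{
    Node text;
    text.text = "Hello";

    Node p;
    p.name = "p";
    p.namespaces.emplace_back("x", "urn:example");
    p.attributes.emplace_back("id", "first");
    p.children.push_back(text);

    TraceReceiver receiver;
    sendNode(p, receiver);
    return receiver.trace;
}
```

Calling `traceExample()` produces the lines `startElement p`, `namespaceBinding x=urn:example`, `attribute id=first`, `characters Hello`, and `endElement`, in that order, which mirrors the order sendAsNode() follows for element nodes.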