1 | /**************************************************************************** |
2 | ** |
3 | ** Copyright (C) 2016 The Qt Company Ltd. |
4 | ** Contact: https://www.qt.io/licensing/ |
5 | ** |
6 | ** This file is part of the QtXmlPatterns module of the Qt Toolkit. |
7 | ** |
8 | ** $QT_BEGIN_LICENSE:LGPL$ |
9 | ** Commercial License Usage |
10 | ** Licensees holding valid commercial Qt licenses may use this file in |
11 | ** accordance with the commercial license agreement provided with the |
12 | ** Software or, alternatively, in accordance with the terms contained in |
13 | ** a written agreement between you and The Qt Company. For licensing terms |
14 | ** and conditions see https://www.qt.io/terms-conditions. For further |
15 | ** information use the contact form at https://www.qt.io/contact-us. |
16 | ** |
17 | ** GNU Lesser General Public License Usage |
18 | ** Alternatively, this file may be used under the terms of the GNU Lesser |
19 | ** General Public License version 3 as published by the Free Software |
20 | ** Foundation and appearing in the file LICENSE.LGPL3 included in the |
21 | ** packaging of this file. Please review the following information to |
22 | ** ensure the GNU Lesser General Public License version 3 requirements |
23 | ** will be met: https://www.gnu.org/licenses/lgpl-3.0.html. |
24 | ** |
25 | ** GNU General Public License Usage |
26 | ** Alternatively, this file may be used under the terms of the GNU |
27 | ** General Public License version 2.0 or (at your option) the GNU General |
28 | ** Public license version 3 or any later version approved by the KDE Free |
29 | ** Qt Foundation. The licenses are as published by the Free Software |
30 | ** Foundation and appearing in the file LICENSE.GPL2 and LICENSE.GPL3 |
31 | ** included in the packaging of this file. Please review the following |
32 | ** information to ensure the GNU General Public License requirements will |
33 | ** be met: https://www.gnu.org/licenses/gpl-2.0.html and |
34 | ** https://www.gnu.org/licenses/gpl-3.0.html. |
35 | ** |
36 | ** $QT_END_LICENSE$ |
37 | ** |
38 | ****************************************************************************/ |
39 | |
40 | #include <QString> |
41 | |
42 | #include "qitem_p.h" |
43 | |
44 | #include "qabstractxmlreceiver_p.h" |
45 | #include "qabstractxmlreceiver.h" |
46 | |
47 | QT_BEGIN_NAMESPACE |
48 | |
49 | /*! |
50 | \class QAbstractXmlReceiver |
51 | \brief The QAbstractXmlReceiver class provides a callback interface |
52 | for transforming the output of a QXmlQuery. |
53 | \reentrant |
54 | \since 4.4 |
55 | \ingroup xml-tools |
56 | \inmodule QtXmlPatterns |
57 | |
58 | QAbstractXmlReceiver is an abstract base class that provides |
59 | a callback interface for receiving an \l {XQuery Sequence} |
60 |     {XQuery sequence}, usually the output of a QXmlQuery, and |
61 | transforming that sequence into a structure of your choosing, |
62 | usually XML. Consider the example: |
63 | |
64 | \snippet code/src_xmlpatterns_api_qabstractxmlreceiver.cpp 0 |
65 | |
66 | First it constructs a \l {QXmlQuery} {query} that gets the |
67 | first paragraph from document \c index.html. Then it constructs |
68 | an \l {QXmlSerializer} {XML serializer} with the \l {QXmlQuery} |
69 |     {query} and \l {QIODevice} {myOutputDevice}. (Note that the |
70 |     \l {QXmlSerializer} {serializer} is an \e {XML receiver}, |
71 |     i.e., a subclass of QAbstractXmlReceiver.) Finally, it |
72 | \l {QXmlQuery::evaluateTo()} {evaluates} the |
73 | \l {QXmlQuery} {query}, producing an ordered sequence of calls |
74 | to the \l {QXmlSerializer} {serializer's} callback functions. |
75 | The sequence of callbacks transforms the query output to XML |
76 | and writes it to \l {QIODevice} {myOutputDevice}. |
77 | |
78 | Although the example uses \l {QXmlQuery} to produce the sequence |
79 | of callbacks to functions in QAbstractXmlReceiver, you can call |
80 | the callback functions directly as long as your sequence of |
81 | calls represents a valid \l {XQuery Sequence} {XQuery sequence}. |
82 | |
83 | \target XQuery Sequence |
84 | \section1 XQuery Sequences |
85 | |
86 |     An XQuery \e sequence is an ordered collection of zero, one, |
87 | or many \e items. Each \e item is either an \e {atomic value} |
88 | or a \e {node}. An \e {atomic value} is a simple data value. |
89 | |
90 | There are six kinds of \e nodes. |
91 | |
92 | \list |
93 | |
94 | \li An \e {Element Node} represents an XML element. |
95 | |
96 | \li An \e {Attribute Node} represents an XML attribute. |
97 | |
98 | \li A \e {Document Node} represents an entire XML document. |
99 | |
100 | \li A \e {Text Node} represents character data (element content). |
101 | |
102 | \li A \e {Processing Instruction Node} represents an XML |
103 | processing instruction, which is used in an XML document |
104 | to tell the application reading the document to perform |
105 | some action. A typical example is to use a processing |
106 | instruction to tell the application to use a particular |
107 | XSLT stylesheet to display the document. |
108 | |
109 |     \li A \e {Comment Node} represents an XML comment. |
110 | |
111 | \endlist |
112 | |
113 | The \e sequence of \e nodes and \e {atomic values} obeys |
114 | the following rules. Note that \e {Namespace Node} refers |
115 | to a special \e {Attribute Node} with name \e {xmlns}. |
116 | |
117 | \list |
118 | |
119 | \li Each \e node appears in the \e sequence before its children |
120 | and their descendants appear. |
121 | |
122 | \li A \e node's descendants appear in the \e sequence before |
123 | any of its siblings appear. |
124 | |
125 | \li A \e {Document Node} represents an entire document. Zero or |
126 | more \e {Document Nodes} can appear in a \e sequence, but they |
127 |     can only be top level items (i.e., a \e {Document Node} can't |
128 |     be a child of another \e node). |
129 | |
130 | \li \e {Namespace Nodes} immediately follow the \e {Element Node} |
131 | with which they are associated. |
132 | |
133 | \li \e {Attribute Nodes} immediately follow the \e {Namespace Nodes} |
134 |     of the element with which they are associated. |
135 | |
136 | \li If there are no \e {Namespace Nodes} following an element, then |
137 | the \e {Attribute Nodes} immediately follow the element. |
138 | |
139 | \li An \e {atomic value} can only appear as a top level \e item, |
140 | i.e., it can't appear as a child of a \e node. |
141 | |
142 | \li \e {Processing Instruction Nodes} do not have children, and |
143 | their parent is either a \e {Document Node} or an \e {Element |
144 | Node}. |
145 | |
146 | \li \e {Comment Nodes} do not have children, and |
147 | their parent is either a \e {Document Node} or an \e {Element |
148 | Node}. |
149 | |
150 | \endlist |
151 | |
152 | The \e sequence of \e nodes and \e {atomic values} is sent to |
153 |     a QAbstractXmlReceiver (QXmlSerializer in |
154 | the example above) as a sequence of calls to the receiver's |
155 | callback functions. The mapping of callback functions to |
156 | sequence items is as follows. |
157 | |
158 | \list |
159 | |
160 | \li startDocument() and endDocument() are called for each |
161 | \e {Document Node} in the \e sequence. endDocument() is not |
162 | called until all the \e {Document Node's} children have |
163 | appeared in the \e sequence. |
164 | |
165 | \li startElement() and endElement() are called for each |
166 | \e {Element Node}. endElement() is not called until all the |
167 | \e {Element Node's} children have appeared in the \e sequence. |
168 | |
169 | \li attribute() is called for each \e {Attribute Node}. |
170 | |
171 | \li comment() is called for each \e {Comment Node}. |
172 | |
173 | \li characters() is called for each \e {Text Node}. |
174 | |
175 | \li processingInstruction() is called for each \e {Processing |
176 | Instruction Node}. |
177 | |
178 | \li namespaceBinding() is called for each \e {Namespace Node}. |
179 | |
180 | \li atomicValue() is called for each \e {atomic value}. |
181 | |
182 | \endlist |
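
The mapping above can be illustrated without any Qt dependency. The following is a minimal sketch, not part of QtXmlPatterns: MiniReceiver, MiniSerializer and serializeExample() are hypothetical stand-ins that model only four of the callbacks and use std::string arguments. It assumes a receiver that turns the call sequence into markup, and it drives that receiver by hand in an order that respects the rules listed earlier (an element, then its attributes, then its children, then the matching end call).

```cpp
// Stand-in sketch: a stdlib-only model of the callback order described above.
// MiniReceiver and MiniSerializer are hypothetical names, not Qt classes.
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface mirroring a subset of the callback names.
struct MiniReceiver {
    virtual ~MiniReceiver() = default;
    virtual void startElement(const std::string &name) = 0;
    virtual void endElement() = 0;
    virtual void attribute(const std::string &name, const std::string &value) = 0;
    virtual void characters(const std::string &value) = 0;
};

// Turns the callback sequence into markup text (no escaping, for brevity).
class MiniSerializer : public MiniReceiver {
public:
    void startElement(const std::string &name) override {
        closePendingTag();
        out_ += "<" + name;
        open_.push_back(name);
        tagOpen_ = true;  // the start tag stays open until content arrives
    }
    void attribute(const std::string &name, const std::string &value) override {
        // Attributes must directly follow the element they belong to.
        assert(tagOpen_);
        out_ += " " + name + "=\"" + value + "\"";
    }
    void characters(const std::string &value) override {
        closePendingTag();
        out_ += value;
    }
    void endElement() override {
        // Every endElement() must match an earlier startElement().
        assert(!open_.empty());
        closePendingTag();
        out_ += "</" + open_.back() + ">";
        open_.pop_back();
    }
    const std::string &result() const { return out_; }

private:
    void closePendingTag() {
        if (tagOpen_) {
            out_ += ">";
            tagOpen_ = false;
        }
    }
    std::string out_;
    std::vector<std::string> open_;  // names of elements not yet ended
    bool tagOpen_ = false;
};

// Drives the receiver by hand: element, its attribute, its children, end.
std::string serializeExample() {
    MiniSerializer s;
    s.startElement("p");
    s.attribute("class", "intro");
    s.startElement("b");
    s.characters("Hello");
    s.endElement();
    s.endElement();
    return s.result();
}
```

A real subclass of QAbstractXmlReceiver would implement every pure virtual callback and take QXmlName, QStringRef and QVariant arguments instead. The sketch only illustrates why attribute() has to arrive before any child content of its element.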
183 | |
184 | For a complete explanation of XQuery sequences, visit |
185 | \l {http://www.w3.org/TR/xpath-datamodel/}{XQuery Data Model}. |
186 | |
187 | \sa {http://www.w3.org/TR/xpath-datamodel/}{W3C XQuery 1.0 and XPath 2.0 Data Model (XDM)} |
188 | \sa QXmlSerializer |
189 | \sa QXmlResultItems |
190 | */ |
191 | |
192 | template<const QXmlNodeModelIndex::Axis axis> |
193 | void QAbstractXmlReceiver::sendFromAxis(const QXmlNodeModelIndex &node) |
194 | { |
195 | Q_ASSERT(!node.isNull()); |
196 | const QXmlNodeModelIndex::Iterator::Ptr it(node.iterate(axis)); |
197 | QXmlNodeModelIndex next(it->next()); |
198 | |
199 | while(!next.isNull()) |
200 | { |
201 |         sendAsNode(next); |
202 | next = it->next(); |
203 | } |
204 | } |
205 | |
206 | /*! |
207 | \internal |
208 | */ |
209 | QAbstractXmlReceiver::QAbstractXmlReceiver(QAbstractXmlReceiverPrivate *d) |
210 | : d_ptr(d) |
211 | { |
212 | } |
213 | |
214 | /*! |
215 |     Constructs an abstract XML receiver. |
216 | */ |
217 | QAbstractXmlReceiver::QAbstractXmlReceiver() : d_ptr(0) |
218 | { |
219 | } |
220 | |
221 | /*! |
222 |     Destroys the XML receiver. |
223 | */ |
224 | QAbstractXmlReceiver::~QAbstractXmlReceiver() |
225 | { |
226 | } |
227 | |
228 | /*! |
229 | \fn void QAbstractXmlReceiver::startElement(const QXmlName &name) |
230 | |
231 | This callback is called when a new element node appears |
232 | in the \l {XQuery Sequence} {sequence}. \a name is the |
233 |     valid \l {QXmlName} {name} of the element node. |
234 | */ |
235 | |
236 | /* |
237 | ### Qt 5: |
238 | |
239 | Consider how source locations should be communicated. Maybe every signature |
240 | should be extended by adding "qint64 line = -1, qint64 column = -1". |
241 | */ |
242 | |
243 | /*! |
244 | \fn void QAbstractXmlReceiver::endElement() |
245 | |
246 | This callback is called when the end of an element node |
247 | appears in the \l {XQuery Sequence} {sequence}. |
248 | */ |
249 | |
250 | /*! |
251 | \fn void QAbstractXmlReceiver::attribute(const QXmlName &name, |
252 | const QStringRef &value) |
253 | This callback is called when an attribute node |
254 | appears in the \l {XQuery Sequence} {sequence}. |
255 | \a name is the \l {QXmlName} {attribute name} and |
256 | the \a value string contains the attribute value. |
257 | */ |
258 | |
259 | /*! |
260 | \fn void QAbstractXmlReceiver::comment(const QString &value) |
261 | |
262 | This callback is called when a comment node appears |
263 | in the \l {XQuery Sequence} {sequence}. The \a value |
264 | is the comment text, which must not contain the string |
265 | "--". |
266 | */ |
267 | |
268 | /*! |
269 | \fn void QAbstractXmlReceiver::characters(const QStringRef &value) |
270 | |
271 | This callback is called when a text node appears in the |
272 | \l {XQuery Sequence} {sequence}. The \a value contains |
273 |     the text. Adjacent text nodes cannot occur in the |
274 | \l {XQuery Sequence} {sequence}, i.e., this callback must not |
275 | be called twice in a row. |
276 | */ |
277 | |
278 | /*! |
279 | \fn void QAbstractXmlReceiver::startDocument() |
280 | |
281 | This callback is called when a document node appears |
282 | in the \l {XQuery Sequence} {sequence}. |
283 | */ |
284 | |
285 | /* |
286 | ### Qt 5: |
287 | |
288 | Change |
289 | virtual void startDocument() = 0; |
290 | |
291 | To: |
292 | virtual void startDocument(const QUrl &uri) = 0; |
293 | |
294 | Such that it allows the document URI to be communicated. The contract would |
295 | allow null QUrls. |
296 | */ |
297 | |
298 | /*! |
299 | \fn void QAbstractXmlReceiver::endDocument() |
300 | |
301 | This callback is called when the end of a document node |
302 | appears in the \l {XQuery Sequence} {sequence}. |
303 | */ |
304 | |
305 | /*! |
306 | \fn void QAbstractXmlReceiver::processingInstruction(const QXmlName &target, |
307 | const QString &value) |
308 | |
309 | This callback is called when a processing instruction |
310 | appears in the \l {XQuery Sequence} {sequence}. |
311 | A processing instruction is used in an XML document |
312 | to tell the application reading the document to |
313 | perform some action. A typical example is to use a |
314 | processing instruction to tell the application to use a |
315 | particular XSLT stylesheet to process the document. |
316 | |
317 | \quotefile patternist/xmlStylesheet.xq |
318 | |
319 | \a target is the \l {QXmlName} {name} of the processing |
320 | instruction. Its \e prefix and \e {namespace URI} must both |
321 | be empty. Its \e {local name} is the target. In the above |
322 | example, the name is \e {xml-stylesheet}. |
323 | |
324 | The \a value specifies the action to be taken. Note that |
325 | the \a value must not contain the string "?>". In the above |
326 |     example, the \a value is \e{type="test/xsl" href="formatter.xsl"}. |
327 | |
328 | Generally, use of processing instructions should be avoided, |
329 | because they are not namespace aware and in many contexts |
330 | are stripped out anyway. Processing instructions can often |
331 | be replaced with elements from a custom namespace. |
332 | */ |
333 | |
334 | /*! |
335 | \fn void QAbstractXmlReceiver::atomicValue(const QVariant &value) |
336 | |
337 | This callback is called when an atomic value appears in the \l |
338 | {XQuery Sequence} {sequence}. The \a value is a simple \l {QVariant} |
339 | {data value}. It is guaranteed to be \l {QVariant::isValid()} |
340 | {valid}. |
341 | */ |
342 | |
343 | /*! |
344 | \fn virtual void QAbstractXmlReceiver::namespaceBinding(const QXmlName &name) |
345 | |
346 | This callback is called when a namespace binding is in scope of an |
347 | element. A namespace is defined by a URI. In the \l {QXmlName} |
348 | \a name, the value of \l {QXmlName::namespaceUri()} is that URI. The |
349 |     value of \l {QXmlName::prefix()} is the prefix that the URI is bound |
350 | to. The local name is insignificant and can be an arbitrary value. |
351 | */ |
352 | |
353 | /*! |
354 | \internal |
355 | |
356 | Treats \a outputItem as a node and calls the appropriate function, |
357 | e.g., attribute() or comment(), depending on its |
358 | QXmlNodeModelIndex::NodeKind. |
359 | |
360 |     This is a helper function that subclasses can use to dispatch |
361 |     nodes received via item(). |
362 | */ |
363 | void QAbstractXmlReceiver::sendAsNode(const QPatternist::Item &outputItem) |
364 | { |
365 | Q_ASSERT(outputItem); |
366 | Q_ASSERT(outputItem.isNode()); |
367 | const QXmlNodeModelIndex asNode = outputItem.asNode(); |
368 | |
369 | switch(asNode.kind()) |
370 | { |
371 | case QXmlNodeModelIndex::Attribute: |
372 | { |
373 | const QString &v = outputItem.stringValue(); |
374 |             attribute(asNode.name(), QStringRef(&v)); |
375 | return; |
376 | } |
377 | case QXmlNodeModelIndex::Element: |
378 | { |
379 |             startElement(asNode.name()); |
380 | |
381 | /* First the namespaces, then attributes, then the children. */ |
382 |             asNode.sendNamespaces(this); |
383 |             sendFromAxis<QXmlNodeModelIndex::AxisAttribute>(asNode); |
384 |             sendFromAxis<QXmlNodeModelIndex::AxisChild>(asNode); |
385 | |
386 | endElement(); |
387 | |
388 | return; |
389 | } |
390 | case QXmlNodeModelIndex::Text: |
391 | { |
392 | const QString &v = asNode.stringValue(); |
393 |             characters(QStringRef(&v)); |
394 | return; |
395 | } |
396 | case QXmlNodeModelIndex::ProcessingInstruction: |
397 | { |
398 |             processingInstruction(asNode.name(), outputItem.stringValue()); |
399 | return; |
400 | } |
401 | case QXmlNodeModelIndex::Comment: |
402 | { |
403 |             comment(outputItem.stringValue()); |
404 | return; |
405 | } |
406 | case QXmlNodeModelIndex::Document: |
407 | { |
408 | startDocument(); |
409 |             sendFromAxis<QXmlNodeModelIndex::AxisChild>(asNode); |
410 | endDocument(); |
411 | return; |
412 | } |
413 | case QXmlNodeModelIndex::Namespace: |
414 | Q_ASSERT_X(false, Q_FUNC_INFO, "Not implemented" ); |
415 | } |
416 | |
417 | Q_ASSERT_X(false, Q_FUNC_INFO, |
418 | QString::fromLatin1("Unknown node type: %1" ).arg(asNode.kind()).toUtf8().constData()); |
419 | } |
420 | |
421 | /*! |
422 | \internal |
423 | |
424 | This function may be called instead of characters() if, and only if, |
425 | \a value consists only of whitespace. |
426 | |
427 |     The caller guarantees that \a value is not empty. |
428 | |
429 | \e Whitespace refers to a sequence of characters that are either |
430 | spaces, tabs, or newlines, in any order. In other words, not all |
431 | the Unicode whitespace category is considered whitespace here. |
432 | |
433 | However, there is no guarantee or requirement that whitespaceOnly() |
434 | is called for text nodes containing whitespace only. characters() |
435 | may be called just as well. This is why the default implementation |
436 | for whitespaceOnly() calls characters(). |
437 | |
438 | \sa characters() |
439 | */ |
440 | void QAbstractXmlReceiver::whitespaceOnly(const QStringRef &value) |
441 | { |
442 | Q_ASSERT_X(value.toString().trimmed().isEmpty(), Q_FUNC_INFO, |
443 | "The caller must guarantee only whitespace is passed. Use characters() in other cases." ); |
444 | const QString &v = value.toString(); |
445 |     characters(QStringRef(&v)); |
446 | } |
447 | |
448 | /*! |
449 | \internal |
450 | */ |
451 | void QAbstractXmlReceiver::item(const QPatternist::Item &item) |
452 | { |
453 | if(item.isNode()) |
454 |         return sendAsNode(item); |
455 |     else |
456 |         atomicValue(QPatternist::AtomicValue::toQt(item.asAtomicValue())); |
457 | } |
458 | |
459 | /*! |
460 | \fn void QAbstractXmlReceiver::startOfSequence() |
461 | |
462 | This callback is called once only, right before the |
463 | \l {XQuery Sequence} {sequence} begins. |
464 | */ |
465 | |
466 | /*! |
467 | \fn void QAbstractXmlReceiver::endOfSequence() |
468 | |
469 | This callback is called once only, right after the |
470 | \l {XQuery Sequence} {sequence} ends. |
471 | */ |
472 | |
473 | QT_END_NAMESPACE |
474 | |
475 | |