1 | // Copyright (C) 2016 The Qt Company Ltd. |
2 | // Copyright (C) 2016 Intel Corporation. |
3 | // SPDX-License-Identifier: LicenseRef-Qt-Commercial OR LGPL-3.0-only OR GPL-2.0-only OR GPL-3.0-only |
4 | |
5 | /*! |
6 | \class QUrl |
7 | \inmodule QtCore |
8 | |
9 | \brief The QUrl class provides a convenient interface for working |
10 | with URLs. |
11 | |
12 | \reentrant |
13 | \ingroup io |
14 | \ingroup network |
15 | \ingroup shared |
16 | |
17 | It can parse and construct URLs in both encoded and unencoded |
18 | form. QUrl also has support for internationalized domain names |
19 | (IDNs). |
20 | |
21 | The most common way to use QUrl is to initialize it via the constructor by |
22 | passing a QString containing a full URL. QUrl objects can also be created |
23 | from a QByteArray containing a full URL using QUrl::fromEncoded(), or |
24 | heuristically from incomplete URLs using QUrl::fromUserInput(). The URL |
25 | representation can be obtained from a QUrl using either QUrl::toString() or |
26 | QUrl::toEncoded(). |
27 | |
28 | URLs can be represented in two forms: encoded or unencoded. The |
29 | unencoded representation is suitable for showing to users, but |
30 | the encoded representation is typically what you would send to |
31 | a web server. For example, the unencoded URL |
32 | "http://bühler.example.com/List of applicants.xml" |
33 | would be sent to the server as |
34 | "http://xn--bhler-kva.example.com/List%20of%20applicants.xml". |
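
As a rough illustration of the path part of that transformation, the
following standalone sketch percent-encodes every byte of a UTF-8 path except
the RFC 3986 unreserved characters and '/'. It is not QUrl's implementation:
the helper name \c percentEncodePath is hypothetical, the set of characters
left untouched is deliberately simplified, and the IDN conversion of the host
name is not covered.

```cpp
#include <string>

// Hypothetical helper, not part of Qt: percent-encodes a UTF-8 path,
// leaving only RFC 3986 "unreserved" characters and '/' untouched.
std::string percentEncodePath(const std::string &path)
{
    static const char hexDigits[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : path) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved || c == '/') {
            out += static_cast<char>(c);
        } else {
            // Emit the byte as "%" followed by two uppercase hex digits.
            out += '%';
            out += hexDigits[c >> 4];
            out += hexDigits[c & 0x0F];
        }
    }
    return out;
}
```

Under these assumptions, \c{percentEncodePath("/List of applicants.xml")}
yields \c{"/List%20of%20applicants.xml"}.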
35 | |
36 | A URL can also be constructed piece by piece by calling |
37 | setScheme(), setUserName(), setPassword(), setHost(), setPort(), |
38 | setPath(), setQuery() and setFragment(). Some convenience |
39 | functions are also available: setAuthority() sets the user name, |
40 | password, host and port. setUserInfo() sets the user name and |
41 | password at once. |
42 | |
43 | Call isValid() to check if the URL is valid. This can be done at any point |
44 | during the construction of a URL. If isValid() returns \c false, you should |
45 | clear() the URL before proceeding, or start over by parsing a new URL with |
46 | setUrl(). |
47 | |
48 | Constructing a query is particularly convenient through the use of the \l |
49 | QUrlQuery class and its methods QUrlQuery::setQueryItems(), |
50 | QUrlQuery::addQueryItem() and QUrlQuery::removeQueryItem(). Use |
51 | QUrlQuery::setQueryDelimiters() to customize the delimiters used for |
52 | generating the query string. |
53 | |
54 | For the convenience of generating encoded URL strings or query |
55 | strings, there are two static functions called |
56 | fromPercentEncoding() and toPercentEncoding() which deal with |
57 | percent encoding and decoding of QString objects. |
58 | |
59 | fromLocalFile() constructs a QUrl by parsing a local |
60 | file path. toLocalFile() converts a URL to a local file path. |
61 | |
62 | The human readable representation of the URL is fetched with |
63 | toString(). This representation is appropriate for displaying a |
64 | URL to a user in unencoded form. The encoded form, however, as |
65 | returned by toEncoded(), is meant for internal use and for passing to web |
66 | servers, mail clients and so on. Both forms are technically correct |
67 | and represent the same URL unambiguously -- in fact, passing either |
68 | form to QUrl's constructor or to setUrl() will yield the same QUrl |
69 | object. |
70 | |
71 | QUrl conforms to the URI specification from |
72 | \l{RFC 3986} (Uniform Resource Identifier: Generic Syntax), and includes |
73 | scheme extensions from \l{RFC 1738} (Uniform Resource Locators). Case |
74 | folding rules in QUrl conform to \l{RFC 3491} (Nameprep: A Stringprep |
75 | Profile for Internationalized Domain Names (IDN)). It is also compatible with the |
76 | \l{http://freedesktop.org/wiki/Specifications/file-uri-spec/}{file URI specification} |
77 | from freedesktop.org, provided that the locale encodes file names using |
78 | UTF-8 (required by IDN). |
79 | |
80 | \section2 Relative URLs vs Relative Paths |
81 | |
82 | Calling isRelative() will return whether or not the URL is relative. |
83 | A relative URL has no \l {scheme}. For example: |
84 | |
85 | \snippet code/src_corelib_io_qurl.cpp 8 |
86 | |
87 | Notice that a URL can be absolute while containing a relative path, and |
88 | vice versa: |
89 | |
90 | \snippet code/src_corelib_io_qurl.cpp 9 |
91 | |
92 | A relative URL can be resolved by passing it as an argument to resolved(), |
93 | which returns an absolute URL. isParentOf() is used for determining whether |
94 | one URL is a parent of another. |
95 | |
96 | \section2 Error checking |
97 | |
98 | QUrl is capable of detecting many errors in URLs while parsing them or when |
99 | components of the URL are set with individual setter methods (like |
100 | setScheme(), setHost() or setPath()). If the parsing or setter function is |
101 | successful, any previously recorded error conditions will be discarded. |
102 | |
103 | By default, QUrl setter methods operate in QUrl::TolerantMode, which means |
104 | they accept some common mistakes and misrepresentations of data. An |
105 | alternate method of parsing is QUrl::StrictMode, which applies further |
106 | checks. See QUrl::ParsingMode for a description of the differences between |
107 | the parsing modes. |
108 | |
109 | QUrl only checks for conformance with the URL specification. It does not |
110 | try to verify that high-level protocol URLs are in the format they are |
111 | expected to be by handlers elsewhere. For example, the following URIs are |
112 | all considered valid by QUrl, even if they do not make sense when used: |
113 | |
114 | \list |
115 | \li "http:/filename.html" |
116 | \li "mailto://example.com" |
117 | \endlist |
118 | |
119 | When the parser encounters an error, it signals the event by making |
120 | isValid() return false and toString() / toEncoded() return an empty string. |
121 | If it is necessary to show the user the reason why the URL failed to parse, |
122 | the error condition can be obtained from QUrl by calling errorString(). |
123 | Note that this message is highly technical and may not make sense to |
124 | end-users. |
125 | |
126 | QUrl is capable of recording only one error condition. If more than one |
127 | error is found, it is undefined which error is reported. |
128 | |
129 | \section2 Character Conversions |
130 | |
131 | Follow these rules to avoid erroneous character conversion when |
132 | dealing with URLs and strings: |
133 | |
134 | \list |
135 | \li When creating a QString to contain a URL from a QByteArray or a |
136 | char*, always use QString::fromUtf8(). |
137 | \endlist |
138 | */ |
139 | |
140 | /*! |
141 | \enum QUrl::ParsingMode |
142 | |
143 | The parsing mode controls the way QUrl parses strings. |
144 | |
145 | \value TolerantMode QUrl will try to correct some common errors in URLs. |
146 | This mode is useful for parsing URLs coming from sources |
147 | not known to be strictly standards-conforming. |
148 | |
149 | \value StrictMode Only valid URLs are accepted. This mode is useful for |
150 | general URL validation. |
151 | |
152 | \value DecodedMode QUrl will interpret the URL component in the fully-decoded form, |
153 | where percent characters stand for themselves, not as the beginning |
154 | of a percent-encoded sequence. This mode is only valid for the |
155 | setters setting components of a URL; it is not permitted in |
156 | the QUrl constructor, in fromEncoded() or in setUrl(). |
157 | For more information on this mode, see the documentation for |
158 | \l {QUrl::ComponentFormattingOption}{QUrl::FullyDecoded}. |
159 | |
160 | In TolerantMode, the parser has the following behavior: |
161 | |
162 | \list |
163 | |
164 | \li Spaces and "%20": unencoded space characters will be accepted and will |
165 | be treated as equivalent to "%20". |
166 | |
167 | \li Single "%" characters: Any occurrences of a percent character "%" not |
168 | followed by exactly two hexadecimal characters (e.g., "13% coverage.html") |
169 | will be replaced by "%25". Note that one lone "%" character will trigger |
170 | the correction mode for all percent characters. |
171 | |
172 | \li Reserved and unreserved characters: An encoded URL should only |
173 | contain a few characters as literals; all other characters should |
174 | be percent-encoded. In TolerantMode, these characters will be |
175 | accepted if they are found in the URL: |
176 | space / double-quote / "<" / ">" / "\" / |
177 | "^" / "`" / "{" / "|" / "}" |
178 | Those same characters can be decoded again by passing QUrl::DecodeReserved |
179 | to toString() or toEncoded(). In the getters of individual components, |
180 | those characters are often returned in decoded form. |
181 | |
182 | \endlist |
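
The first two rules above can be sketched in standalone C++ as follows. This
only illustrates the documented behavior and is not QUrl's parser; the helper
names \c allPercentsWellFormed and \c tolerantFixup are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Qt: returns true if every '%' in the
// input is followed by two hexadecimal digits.
bool allPercentsWellFormed(const std::string &in)
{
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '%')
            continue;
        if (i + 2 >= in.size()
            || !std::isxdigit(static_cast<unsigned char>(in[i + 1]))
            || !std::isxdigit(static_cast<unsigned char>(in[i + 2])))
            return false;
    }
    return true;
}

// Sketch of the documented rules for spaces and lone '%' characters:
// spaces become "%20", and a single malformed '%' switches every '%'
// in the input to "%25".
std::string tolerantFixup(const std::string &in)
{
    const bool escapeAllPercents = !allPercentsWellFormed(in);
    std::string out;
    for (char c : in) {
        if (c == ' ')
            out += "%20";
        else if (c == '%' && escapeAllPercents)
            out += "%25";
        else
            out += c;
    }
    return out;
}
```

With this sketch, \c{tolerantFixup("13% coverage.html")} returns
\c{"13%25%20coverage.html"}, whereas \c{"a b%20c"} keeps its well-formed
\c{%20} and becomes \c{"a%20b%20c"}.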
183 | |
184 | When in StrictMode, if a parsing error is found, isValid() will return \c |
185 | false and errorString() will return a message describing the error. |
186 | If more than one error is detected, it is undefined which error gets |
187 | reported. |
188 | |
189 | Note that TolerantMode is not usually enough for parsing user input, which |
190 | often contains more errors and expectations than the parser can deal with. |
191 | When dealing with data coming directly from the user -- as opposed to data |
192 | coming from data-transfer sources, such as other programs -- it is |
193 | recommended to use fromUserInput(). |
194 | |
195 | \sa fromUserInput(), setUrl(), toString(), toEncoded(), QUrl::FormattingOptions |
196 | */ |
197 | |
198 | /*! |
199 | \enum QUrl::UrlFormattingOption |
200 | |
201 | The formatting options define how the URL is formatted when written out |
202 | as text. |
203 | |
204 | \value None The format of the URL is unchanged. |
205 | \value RemoveScheme The scheme is removed from the URL. |
206 | \value RemovePassword Any password in the URL is removed. |
207 | \value RemoveUserInfo Any user information in the URL is removed. |
208 | \value RemovePort Any specified port is removed from the URL. |
209 | \value RemoveAuthority The authority (user information, host and port) is removed from the URL. |
210 | \value RemovePath The URL's path is removed, leaving only the scheme, |
211 | host address, and port (if present). |
212 | \value RemoveQuery The query part of the URL (following a '?' character) |
213 | is removed. |
214 | \value RemoveFragment The fragment part of the URL (following a '#' character) is removed. |
215 | \value RemoveFilename The filename (i.e. everything after the last '/' in the path) is removed. |
216 | The trailing '/' is kept, unless StripTrailingSlash is set. |
217 | Only valid if RemovePath is not set. |
218 | \value PreferLocalFile If the URL is a local file according to isLocalFile() |
219 | and contains no query or fragment, a local file path is returned. |
220 | \value StripTrailingSlash The trailing slash is removed from the path, if one is present. |
221 | \value NormalizePathSegments Modifies the path to remove redundant directory separators, |
222 | and to resolve "."s and ".."s (as far as possible). For non-local paths, adjacent |
223 | slashes are preserved. |
224 | |
225 | Note that the case folding rules in \l{RFC 3491}{Nameprep}, which QUrl |
226 | conforms to, require host names to always be converted to lower case, |
227 | regardless of the QUrl::FormattingOptions used. |
228 | |
229 | The options from QUrl::ComponentFormattingOptions are also possible. |
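
To illustrate what resolving "." and ".." segments under
NormalizePathSegments means, the following standalone sketch applies a
simplified form of the dot-segment removal described in RFC 3986 to an
absolute path. It is not the code QUrl uses: \c removeDotSegments is a
hypothetical helper, it assumes the path starts with '/', and it keeps
adjacent slashes as they are.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Qt: resolves "." and ".." segments in an
// absolute path (one that starts with '/'). Empty segments produced by
// adjacent slashes are kept, and ".." at the root is silently dropped.
std::string removeDotSegments(const std::string &path)
{
    std::vector<std::string> segments;
    bool trailingSlash = false;
    std::size_t pos = 1;                       // skip the leading '/'
    while (pos <= path.size()) {
        std::size_t next = path.find('/', pos);
        const bool isLast = (next == std::string::npos);
        if (isLast)
            next = path.size();
        const std::string segment = path.substr(pos, next - pos);
        if (segment == ".") {
            trailingSlash = isLast;            // "/a/." becomes "/a/"
        } else if (segment == "..") {
            if (!segments.empty())
                segments.pop_back();
            trailingSlash = isLast;            // "/a/b/.." becomes "/a/"
        } else {
            segments.push_back(segment);
        }
        pos = next + 1;
    }
    std::string result;
    for (const std::string &segment : segments)
        result += "/" + segment;
    if (trailingSlash)
        result += "/";
    return result;
}
```

For example, \c{removeDotSegments("/a/./b/../c")} returns \c{"/a/c"}, and
\c{"/a//b"} is returned unchanged.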
230 | |
231 | \sa QUrl::ComponentFormattingOptions |
232 | */ |
233 | |
234 | /*! |
235 | \enum QUrl::ComponentFormattingOption |
236 | \since 5.0 |
237 | |
238 | The component formatting options define how the components of a URL will |
239 | be formatted when written out as text. They can be combined with the |
240 | options from QUrl::FormattingOptions when used in toString() and |
241 | toEncoded(). |
242 | |
243 | \value PrettyDecoded The component is returned in a "pretty form", with |
244 | most percent-encoded characters decoded. The exact |
245 | behavior of PrettyDecoded varies from component to |
246 | component and may also change from Qt release to Qt |
247 | release. This is the default. |
248 | |
249 | \value EncodeSpaces Leave space characters in their encoded form ("%20"). |
250 | |
251 | \value EncodeUnicode Leave non-US-ASCII characters encoded in their UTF-8 |
252 | percent-encoded form (e.g., "%C3%A9" for the U+00E9 |
253 | codepoint, LATIN SMALL LETTER E WITH ACUTE). |
254 | |
255 | \value EncodeDelimiters Leave certain delimiters in their encoded form, as |
256 | would appear in the URL when the full URL is |
257 | represented as text. The delimiters affected |
258 | by this option change from component to component. |
259 | This flag has no effect in toString() or toEncoded(). |
260 | |
261 | \value EncodeReserved Leave US-ASCII characters not permitted in the URL by |
262 | the specification in their encoded form. This is the |
263 | default on toString() and toEncoded(). |
264 | |
265 | \value DecodeReserved Decode the US-ASCII characters that the URL specification |
266 | does not allow to appear in the URL. This is the |
267 | default on the getters of individual components. |
268 | |
269 | \value FullyEncoded Leave all characters in their properly-encoded form, |
270 | as this component would appear as part of a URL. When |
271 | used with toString(), this produces a fully-compliant |
272 | URL in QString form, exactly equal to the result of |
273 | toEncoded(). |
274 | |
275 | \value FullyDecoded Attempt to decode as much as possible. For individual |
276 | components of the URL, this decodes every percent |
277 | encoding sequence, including control characters (U+0000 |
278 | to U+001F) and UTF-8 sequences found in percent-encoded form. |
279 | Use of this mode may cause data loss; see below for more information. |
280 | |
281 | The values of EncodeReserved and DecodeReserved should not be used together |
282 | in one call. The behavior is undefined if that happens. They are provided |
283 | as separate values because the behavior of the "pretty mode" with regard |
284 | to reserved characters is different on certain components and especially on |
285 | the full URL. |
286 | |
287 | \section2 Full decoding |
288 | |
289 | The FullyDecoded mode is similar to the behavior of the functions returning |
290 | QString in Qt 4.x, in that every character represents itself and never has |
291 | any special meaning. This is true even for the percent character ('%'), |
292 | which should be interpreted to mean a literal percent, not the beginning of |
293 | a percent-encoded sequence. The same actual character, in all other |
294 | decoding modes, is represented by the sequence "%25". |
295 | |
296 | Whenever re-applying data obtained with QUrl::FullyDecoded into a QUrl, |
297 | care must be taken to use the QUrl::DecodedMode parameter to the setters |
298 | (like setPath() and setUserName()). Failure to do so may cause |
299 | re-interpretation of the percent character ('%') as the beginning of a |
300 | percent-encoded sequence. |
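
The re-interpretation hazard can be shown with a small standalone decoder.
This is a sketch rather than QUrl code: \c hexValue and \c percentDecodeOnce
are hypothetical helpers, and the input string is only an illustrative value.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Qt: value of one hex digit, or -1.
int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes every well-formed "%XX" sequence exactly once; malformed
// sequences are copied through unchanged.
std::string percentDecodeOnce(const std::string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            const int hi = hexValue(in[i + 1]);
            const int lo = hexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;                        // skip the two hex digits
                continue;
            }
        }
        out += in[i];
    }
    return out;
}
```

Decoding \c{"a%2541"} once gives the literal text \c{"a%41"}. Decoding that
result a second time, which is roughly what happens when fully decoded text
is passed to a setter without QUrl::DecodedMode, turns it into \c{"aA"},
which is not the data that was stored.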
301 | |
302 | This mode is quite useful when portions of a URL are used in a non-URL |
303 | context. For example, to extract the username, password or file paths in an |
304 | FTP client application, the FullyDecoded mode should be used. |
305 | |
306 | This mode should be used with care, since there are two conditions that |
307 | cannot be reliably represented in the returned QString. They are: |
308 | |
309 | \list |
310 | \li \b{Non-UTF-8 sequences:} URLs may contain sequences of |
311 | percent-encoded characters that do not form valid UTF-8 sequences. Since |
312 | URLs need to be decoded using UTF-8, any decoder failure will result in |
313 | the QString containing one or more replacement characters where the |
314 | sequence existed. |
315 | |
316 | \li \b{Encoded delimiters:} URLs are also allowed to make a distinction |
317 | between a delimiter found in its literal form and its equivalent in |
318 | percent-encoded form. This is most commonly found in the query, but is |
319 | permitted in most parts of the URL. |
320 | \endlist |
321 | |
322 | The following example illustrates the problem: |
323 | |
324 | \snippet code/src_corelib_io_qurl.cpp 10 |
325 | |
326 | If the two URLs were used via HTTP GET, the interpretation by the web |
327 | server would probably be different. In the first case, it would interpret |
328 | the query as one parameter, with a key of "q" and value "a+=b&c". In the second |
329 | case, it would probably interpret it as two parameters, one with a key of "q" |
330 | and value "a =b", and the second with a key "c" and no value. |
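
The effect of decoding an encoded delimiter can be reproduced without Qt. The
sketch below splits a query string on literal '&' and '=' only, which is an
assumption about how a typical server-side parser behaves. \c splitQuery is a
hypothetical helper, and the two query strings used with it are illustrative
inputs chosen to match the values described above.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Qt: splits a query string into key/value
// pairs on the literal delimiters '&' and '='. Percent-encoded sequences
// are left untouched, so "%26" is data, not a delimiter.
std::vector<std::pair<std::string, std::string>> splitQuery(const std::string &query)
{
    std::vector<std::pair<std::string, std::string>> items;
    std::size_t start = 0;
    while (start <= query.size()) {
        std::size_t end = query.find('&', start);
        if (end == std::string::npos)
            end = query.size();
        const std::string item = query.substr(start, end - start);
        const std::size_t eq = item.find('=');
        if (eq == std::string::npos)
            items.emplace_back(item, std::string());   // key without a value
        else
            items.emplace_back(item.substr(0, eq), item.substr(eq + 1));
        start = end + 1;
    }
    return items;
}
```

Here \c{splitQuery("q=a%2B%3Db%26c")} produces a single item with key \c q,
while \c{splitQuery("q=a+=b&c")} produces two items, \c q and \c c. A real
server would usually also translate '+' into a space, which this sketch
leaves out.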
331 | |
332 | \sa QUrl::FormattingOptions |
333 | */ |
334 | |
335 | /*! |
336 | \enum QUrl::UserInputResolutionOption |
337 | \since 5.4 |
338 | |
339 | The user input resolution options define how fromUserInput() should |
340 | interpret strings that could either be a relative path or the short |
341 | form of an HTTP URL. For instance, \c{file.pl} can be either a local file |
342 | or the URL \c{http://file.pl}. |
343 | |
344 | \value DefaultResolution The default resolution mechanism is to check |
345 | whether a local file exists, in the working |
346 | directory given to fromUserInput, and only |
347 | return a local path in that case. Otherwise a URL |
348 | is assumed. |
349 | \value AssumeLocalFile This option makes fromUserInput() always return |
350 | a local path unless the input contains a scheme, such as |
351 | \c{http://file.pl}. This is useful for applications |
352 | such as text editors, which are able to create |
353 | the file if it doesn't exist. |
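
The difference between the two options can be summarized by the following
standalone sketch. It is a strong simplification and not the logic of
fromUserInput(), which applies further heuristics: the enums and the function
\c classifyUserInput are hypothetical, a scheme is assumed to be recognizable
by the "://" separator, and the existence check is passed in as a predicate.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-ins for the two resolution options and the two possible
// outcomes; none of these names exist in Qt.
enum class Resolution { Default, AssumeLocalFile };
enum class InputKind { Url, LocalFile };

// Simplified decision logic. fileExists is an injected predicate so that the
// sketch does not touch the real file system.
InputKind classifyUserInput(const std::string &input, Resolution mode,
                            const std::function<bool(const std::string &)> &fileExists)
{
    // Assumption: an explicit scheme is recognized by the "://" separator.
    if (input.find("://") != std::string::npos)
        return InputKind::Url;
    if (mode == Resolution::AssumeLocalFile)
        return InputKind::LocalFile;
    // Default resolution: a local path only if the file actually exists.
    return fileExists(input) ? InputKind::LocalFile : InputKind::Url;
}
```

With a predicate that reports no existing file, \c{"file.pl"} is classified
as a URL under the default resolution and as a local file under
AssumeLocalFile.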
354 | |
355 | \sa fromUserInput() |
356 | */ |
357 | |
358 | /*! |
359 | \enum QUrl::AceProcessingOption |
360 | \since 6.3 |
361 | |
362 | The ACE processing options control the way URLs are transformed to and from |
363 | ASCII-Compatible Encoding. |
364 | |
365 | \value IgnoreIDNWhitelist Ignore the IDN whitelist when converting URLs |
366 | to Unicode. |
367 | \value AceTransitionalProcessing Use transitional processing described in UTS #46. |
368 | This allows better compatibility with the IDNA 2003 |
369 | specification. |
370 | |
371 | The default is to use nontransitional processing and to allow non-ASCII |
372 | characters only inside URLs whose top-level domains are listed in the IDN whitelist. |
373 | |
374 | \sa toAce(), fromAce(), idnWhitelist() |
375 | */ |
376 | |
377 | /*! |
378 | \fn QUrl::QUrl(QUrl &&other) |
379 | |
380 | Move-constructs a QUrl instance, making it point at the same |
381 | object that \a other was pointing to. |
382 | |
383 | \since 5.2 |
384 | */ |
385 | |
386 | /*! |
387 | \fn QUrl &QUrl::operator=(QUrl &&other) |
388 | |
389 | Move-assigns \a other to this QUrl instance. |
390 | |
391 | \since 5.2 |
392 | */ |
393 | |
394 | #include "qurl.h" |
395 | #include "qurl_p.h" |
396 | #include "qplatformdefs.h" |
397 | #include "qstring.h" |
398 | #include "qstringlist.h" |
399 | #include "qdebug.h" |
400 | #include "qhash.h" |
401 | #include "qdatastream.h" |
402 | #include "private/qipaddress_p.h" |
403 | #include "qurlquery.h" |
404 | #include "private/qdir_p.h" |
405 | #include <private/qtools_p.h> |
406 | |
407 | QT_BEGIN_NAMESPACE |
408 | |
409 | using namespace Qt::StringLiterals; |
410 | using namespace QtMiscUtils; |
411 | |
412 | inline static bool isHex(char c) |
413 | { |
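414 | const char lower = char(c | 0x20); // case-fold letters only; testing digits after folding would also accept 0x10..0x19 |
415 | return isAsciiDigit(c) || (lower >= 'a' && lower <= 'f'); |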
416 | } |
417 | |
418 | static inline QString ftpScheme() |
419 | { |
420 | return QStringLiteral("ftp"); |
421 | } |
422 | |
423 | static inline QString fileScheme() |
424 | { |
425 | return QStringLiteral("file"); |
426 | } |
427 | |
428 | static inline QString webDavScheme() |
429 | { |
430 | return QStringLiteral("webdavs"); |
431 | } |
432 | |
433 | static inline QString webDavSslTag() |
434 | { |
435 | return QStringLiteral("@SSL"); |
436 | } |
437 | |
438 | class QUrlPrivate |
439 | { |
440 | public: |
441 | enum Section : uchar { |
442 | Scheme = 0x01, |
443 | UserName = 0x02, |
444 | Password = 0x04, |
445 | UserInfo = UserName | Password, |
446 | Host = 0x08, |
447 | Port = 0x10, |
448 | Authority = UserInfo | Host | Port, |
449 | Path = 0x20, |
450 | Hierarchy = Authority | Path, |
451 | Query = 0x40, |
452 | Fragment = 0x80, |
453 | FullUrl = 0xff |
454 | }; |
455 | |
456 | enum Flags : uchar { |
457 | IsLocalFile = 0x01 |
458 | }; |
459 | |
460 | enum ErrorCode { |
461 | // the high byte of the error code matches the Section |
462 | // the first item in each value must be the generic "Invalid xxx Error" |
463 | InvalidSchemeError = Scheme << 8, |
464 | |
465 | InvalidUserNameError = UserName << 8, |
466 | |
467 | InvalidPasswordError = Password << 8, |
468 | |
469 | InvalidRegNameError = Host << 8, |
470 | InvalidIPv4AddressError, |
471 | InvalidIPv6AddressError, |
472 | InvalidCharacterInIPv6Error, |
473 | InvalidIPvFutureError, |
474 | HostMissingEndBracket, |
475 | |
476 | InvalidPortError = Port << 8, |
477 | PortEmptyError, |
478 | |
479 | InvalidPathError = Path << 8, |
480 | |
481 | InvalidQueryError = Query << 8, |
482 | |
483 | InvalidFragmentError = Fragment << 8, |
484 | |
485 | // the following three cases are only possible in combination with |
486 | // presence/absence of the path, authority and scheme. See validityError(). |
487 | AuthorityPresentAndPathIsRelative = Authority << 8 | Path << 8 | 0x10000, |
488 | AuthorityAbsentAndPathIsDoubleSlash, |
489 | RelativeUrlPathContainsColonBeforeSlash = Scheme << 8 | Authority << 8 | Path << 8 | 0x10000, |
490 | |
491 | NoError = 0 |
492 | }; |
493 | |
494 | struct Error { |
495 | QString source; |
496 | qsizetype position; |
497 | ErrorCode code; |
498 | }; |
499 | |
500 | QUrlPrivate(); |
501 | QUrlPrivate(const QUrlPrivate &copy); |
502 | ~QUrlPrivate(); |
503 | |
504 | void parse(const QString &url, QUrl::ParsingMode parsingMode); |
505 | bool isEmpty() const |
506 | { return sectionIsPresent == 0 && port == -1 && path.isEmpty(); } |
507 | |
508 | std::unique_ptr<Error> cloneError() const; |
509 | void clearError(); |
510 | void setError(ErrorCode errorCode, const QString &source, qsizetype supplement = -1); |
511 | ErrorCode validityError(QString *source = nullptr, qsizetype *position = nullptr) const; |
512 | bool validateComponent(Section section, const QString &input, qsizetype begin, qsizetype end); |
513 | bool validateComponent(Section section, const QString &input) |
514 | { return validateComponent(section, input, 0, input.size()); } |
515 | |
516 | // no QString scheme() const; |
517 | void appendAuthority(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
518 | void appendUserInfo(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
519 | void appendUserName(QString &appendTo, QUrl::FormattingOptions options) const; |
520 | void appendPassword(QString &appendTo, QUrl::FormattingOptions options) const; |
521 | void appendHost(QString &appendTo, QUrl::FormattingOptions options) const; |
522 | void appendPath(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
523 | void appendQuery(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
524 | void appendFragment(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
525 | |
526 | // the "end" parameters are like STL iterators: they point to one past the last valid element |
527 | bool setScheme(const QString &value, qsizetype len, bool doSetError); |
528 | void setAuthority(const QString &auth, qsizetype from, qsizetype end, QUrl::ParsingMode mode); |
529 | void setUserInfo(const QString &userInfo, qsizetype from, qsizetype end); |
530 | void setUserName(const QString &value, qsizetype from, qsizetype end); |
531 | void setPassword(const QString &value, qsizetype from, qsizetype end); |
532 | bool setHost(const QString &value, qsizetype from, qsizetype end, QUrl::ParsingMode mode); |
533 | void setPath(const QString &value, qsizetype from, qsizetype end); |
534 | void setQuery(const QString &value, qsizetype from, qsizetype end); |
535 | void setFragment(const QString &value, qsizetype from, qsizetype end); |
536 | |
537 | inline bool hasScheme() const { return sectionIsPresent & Scheme; } |
538 | inline bool hasAuthority() const { return sectionIsPresent & Authority; } |
539 | inline bool hasUserInfo() const { return sectionIsPresent & UserInfo; } |
540 | inline bool hasUserName() const { return sectionIsPresent & UserName; } |
541 | inline bool hasPassword() const { return sectionIsPresent & Password; } |
542 | inline bool hasHost() const { return sectionIsPresent & Host; } |
543 | inline bool hasPort() const { return port != -1; } |
544 | inline bool hasPath() const { return !path.isEmpty(); } |
545 | inline bool hasQuery() const { return sectionIsPresent & Query; } |
546 | inline bool hasFragment() const { return sectionIsPresent & Fragment; } |
547 | |
548 | inline bool isLocalFile() const { return flags & IsLocalFile; } |
549 | QString toLocalFile(QUrl::FormattingOptions options) const; |
550 | |
551 | QString mergePaths(const QString &relativePath) const; |
552 | |
553 | QAtomicInt ref; |
554 | int port; |
555 | |
556 | QString scheme; |
557 | QString userName; |
558 | QString password; |
559 | QString host; |
560 | QString path; |
561 | QString query; |
562 | QString fragment; |
563 | |
564 | std::unique_ptr<Error> error; |
565 | |
566 | // not used for: |
567 | // - Port (port == -1 means absence) |
568 | // - Path (there's no path delimiter, so we optimize its use out of existence) |
569 | // Schemes are never supposed to be empty, but we keep the flag anyway |
570 | uchar sectionIsPresent; |
571 | uchar flags; |
572 | |
573 | // 32-bit: 2 bytes tail padding available |
574 | // 64-bit: 6 bytes tail padding available |
575 | }; |
576 | |
577 | inline QUrlPrivate::QUrlPrivate() |
578 | : ref(1), port(-1), |
579 | sectionIsPresent(0), |
580 | flags(0) |
581 | { |
582 | } |
583 | |
584 | inline QUrlPrivate::QUrlPrivate(const QUrlPrivate &copy) |
585 | : ref(1), port(copy.port), |
586 | scheme(copy.scheme), |
587 | userName(copy.userName), |
588 | password(copy.password), |
589 | host(copy.host), |
590 | path(copy.path), |
591 | query(copy.query), |
592 | fragment(copy.fragment), |
593 | error(copy.cloneError()), |
594 | sectionIsPresent(copy.sectionIsPresent), |
595 | flags(copy.flags) |
596 | { |
597 | } |
598 | |
599 | inline QUrlPrivate::~QUrlPrivate() |
600 | = default; |
601 | |
602 | std::unique_ptr<QUrlPrivate::Error> QUrlPrivate::cloneError() const |
603 | { |
604 | return error ? std::make_unique<Error>(*error) : nullptr; |
605 | } |
606 | |
607 | inline void QUrlPrivate::clearError() |
608 | { |
609 | error.reset(); |
610 | } |
611 | |
612 | inline void QUrlPrivate::setError(ErrorCode errorCode, const QString &source, qsizetype supplement) |
613 | { |
614 | if (error) { |
615 | // don't overwrite an error set in a previous section during parsing |
616 | return; |
617 | } |
618 | error = std::make_unique<Error>(); |
619 | error->code = errorCode; |
620 | error->source = source; |
621 | error->position = supplement; |
622 | } |
623 | |
624 | // From RFC 3986, Appendix A Collected ABNF for URI |
625 | // URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ] |
626 | //[...] |
627 | // scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) |
628 | // |
629 | // authority = [ userinfo "@" ] host [ ":" port ] |
630 | // userinfo = *( unreserved / pct-encoded / sub-delims / ":" ) |
631 | // host = IP-literal / IPv4address / reg-name |
632 | // port = *DIGIT |
633 | //[...] |
634 | // reg-name = *( unreserved / pct-encoded / sub-delims ) |
635 | //[..] |
636 | // pchar = unreserved / pct-encoded / sub-delims / ":" / "@" |
637 | // |
638 | // query = *( pchar / "/" / "?" ) |
639 | // |
640 | // fragment = *( pchar / "/" / "?" ) |
641 | // |
642 | // pct-encoded = "%" HEXDIG HEXDIG |
643 | // |
644 | // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" |
645 | // reserved = gen-delims / sub-delims |
646 | // gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@" |
647 | // sub-delims = "!" / "$" / "&" / "'" / "(" / ")" |
648 | // / "*" / "+" / "," / ";" / "=" |
649 | // the path component has a complex ABNF that basically boils down to |
650 | // slash-separated segments of "pchar" |
651 | |
652 | // The above is the strict definition of the URL components and we mostly |
653 | // adhere to it, with a few exceptions. QUrl obeys the following behavior: |
654 | // - percent-encoding sequences always use uppercase HEXDIG; |
655 | // - unreserved characters are *always* decoded, no exceptions; |
656 | // - the space character and bytes with the high bit set are controlled by |
657 | // the EncodeSpaces and EncodeUnicode bits; |
658 | // - control characters, the percent sign itself, and bytes with the high |
659 | // bit set that don't form valid UTF-8 sequences are always encoded, |
660 | // except in FullyDecoded mode; |
661 | // - sub-delims are always left alone, except in FullyDecoded mode; |
662 | // - gen-delims change behavior depending on which section of the URL (or |
663 | // the entire URL) we're looking at; see below; |
664 | // - characters not mentioned above, like "<" and ">", are usually |
665 | // decoded in individual sections of the URL, but encoded when the full |
666 | // URL is put together (this can change with the subjective definition of |
667 | // "pretty"). |
668 | // |
669 | // The behavior for the delimiters bears some explanation. The spec says in |
670 | // section 2.2: |
671 | // URIs that differ in the replacement of a reserved character with its |
672 | // corresponding percent-encoded octet are not equivalent. |
673 | // (note: QUrl API mistakenly uses the "reserved" term, so we will refer to |
674 | // them here as "delimiters"). |
675 | // |
676 | // For that reason, we cannot encode delimiters found in decoded form and we |
677 | // cannot decode the ones found in encoded form if that would change the |
678 | // interpretation. Conversely, we *can* perform the transformation if it would |
679 | // not change the interpretation. From the last component of a URL to the first, |
680 | // here are the gen-delims we can unambiguously transform when the field is |
681 | // taken in isolation: |
682 | // - fragment: none, since it's the last |
683 | // - query: "#" is unambiguous |
684 | // - path: "#" and "?" are unambiguous |
685 | // - host: completely special but never ambiguous, see setHost() below. |
686 | // - password: the "#", "?", "/", "[", "]" and "@" characters are unambiguous |
687 | // - username: the "#", "?", "/", "[", "]", "@", and ":" characters are unambiguous |
688 | // - scheme: doesn't accept any delimiter, see setScheme() below. |
689 | // |
690 | // Internally, QUrl stores each component in the format that corresponds to the |
691 | // default mode (PrettyDecoded). It deviates from the "strict" FullyEncoded |
692 | // mode in the following way: |
693 | // - spaces are decoded |
694 | // - valid UTF-8 sequences are decoded |
695 | // - gen-delims that can be unambiguously transformed are decoded |
696 | // - characters controlled by DecodeReserved are often decoded, though this behavior |
697 | // can change depending on the subjective definition of "pretty" |
698 | // |
699 | // Note that the list of gen-delims that we can transform is different for the |
700 | // user info (user name + password) and the authority (user info + host + |
701 | // port). |
702 | |
703 | |
704 | // list the recoding table modifications to be used with the recodeFromUser and |
705 | // appendToUser functions, according to the rules above. Spaces and UTF-8 |
706 | // sequences are handled outside the tables. |
707 | |
708 | // the encodedXXX tables are run with the delimiters set to "leave" by default; |
709 | // the decodedXXX tables are run with the delimiters set to "decode" by default |
710 | // (except for the query, which doesn't use these functions) |

namespace {
template <typename T> constexpr ushort decode(T x) noexcept { return ushort(x); }
template <typename T> constexpr ushort leave(T x) noexcept { return ushort(0x100 | x); }
template <typename T> constexpr ushort encode(T x) noexcept { return ushort(0x200 | x); }
}
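/*
    Illustrative sketch, not part of QUrl. The three helpers above pack an
    action and an ASCII character into one 16-bit table entry: the character
    sits in the low byte and the action in the bits above it. The stand-alone
    version below shows that packing and how a consumer could split an entry
    again. The names Action, makeEntry, entryChar and entryAction are
    hypothetical, and how qt_urlRecode really reads the entries is not shown
    in this part of the file, so the unpacking half is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for decode()/leave()/encode(): the value of each
// enumerator is the bit pattern OR-ed on top of the character.
enum class Action : std::uint16_t { Decode = 0x000, Leave = 0x100, Encode = 0x200 };

// Pack an action and an ASCII character into one table entry.
constexpr std::uint16_t makeEntry(Action action, char c)
{
    return static_cast<std::uint16_t>(static_cast<std::uint16_t>(action)
                                      | static_cast<unsigned char>(c));
}

// Recover the character stored in the low byte.
constexpr char entryChar(std::uint16_t entry)
{
    return static_cast<char>(entry & 0xFF);
}

// Recover the action stored above the low byte.
constexpr Action entryAction(std::uint16_t entry)
{
    return static_cast<Action>(entry & 0x300);
}
```

    Because "decode" adds no bits, a decode entry is numerically equal to the
    character itself, which is also why a plain 0 can terminate the tables.
*/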

static const ushort userNameInIsolation[] = {
    decode(':'), // 0
    decode('@'), // 1
    decode(']'), // 2
    decode('['), // 3
    decode('/'), // 4
    decode('?'), // 5
    decode('#'), // 6

    decode('"'), // 7
    decode('<'),
    decode('>'),
    decode('^'),
    decode('\\'),
    decode('|'),
    decode('{'),
    decode('}'),
    0
};
static const ushort * const passwordInIsolation = userNameInIsolation + 1;
static const ushort * const pathInIsolation = userNameInIsolation + 5;
static const ushort * const queryInIsolation = userNameInIsolation + 6;
static const ushort * const fragmentInIsolation = userNameInIsolation + 7;
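/*
    Illustrative sketch, not part of QUrl. The pointers above reuse one
    zero-terminated array: every component's table is a suffix of the user
    name table, so moving the start pointer forward drops the delimiters that
    stop being unambiguous. The reduced example below uses plain characters
    instead of action entries; delimiters, listOf and the four pointer names
    are hypothetical.

```cpp
#include <cassert>
#include <string>

// One shared, zero-terminated list, in the same order as the tables above.
static const char delimiters[] = { ':', '@', ']', '[', '/', '?', '#', 0 };

// Each component's list is a suffix of the shared array.
static const char *const userNameDelims = delimiters;      // : @ ] [ / ? #
static const char *const passwordDelims = delimiters + 1;  // @ ] [ / ? #
static const char *const pathDelims     = delimiters + 5;  // ? #
static const char *const queryDelims    = delimiters + 6;  // #

// Walk a zero-terminated list and collect its characters.
inline std::string listOf(const char *table)
{
    assert(table != nullptr);
    std::string result;
    for (const char *p = table; *p; ++p)
        result += *p;
    return result;
}
```

    The offsets match the bullet list earlier in this file: the path keeps
    only "?" and "#", and the query keeps only "#".
*/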

static const ushort userNameInUserInfo[] = {
    encode(':'), // 0
    decode('@'), // 1
    decode(']'), // 2
    decode('['), // 3
    decode('/'), // 4
    decode('?'), // 5
    decode('#'), // 6

    decode('"'), // 7
    decode('<'),
    decode('>'),
    decode('^'),
    decode('\\'),
    decode('|'),
    decode('{'),
    decode('}'),
    0
};
static const ushort * const passwordInUserInfo = userNameInUserInfo + 1;

static const ushort userNameInAuthority[] = {
    encode(':'), // 0
    encode('@'), // 1
    encode(']'), // 2
    encode('['), // 3
    decode('/'), // 4
    decode('?'), // 5
    decode('#'), // 6

    decode('"'), // 7
    decode('<'),
    decode('>'),
    decode('^'),
    decode('\\'),
    decode('|'),
    decode('{'),
    decode('}'),
    0
};
static const ushort * const passwordInAuthority = userNameInAuthority + 1;

static const ushort userNameInUrl[] = {
    encode(':'), // 0
    encode('@'), // 1
    encode(']'), // 2
    encode('['), // 3
    encode('/'), // 4
    encode('?'), // 5
    encode('#'), // 6

    // no need to list encode(x) for the other characters
    0
};
static const ushort * const passwordInUrl = userNameInUrl + 1;
static const ushort * const pathInUrl = userNameInUrl + 5;
static const ushort * const queryInUrl = userNameInUrl + 6;
static const ushort * const fragmentInUrl = userNameInUrl + 6;

static inline void parseDecodedComponent(QString &data)
{
    data.replace(u'%', "%25"_L1);
}
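/*
    Illustrative sketch, not part of QUrl. parseDecodedComponent() turns every
    literal '%' into "%25", so that text supplied in decoded form is not later
    mistaken for a percent-encoded sequence. The std::string version below
    shows the same substitution; escapePercent is a hypothetical name.

```cpp
#include <cassert>
#include <string>

// Escape every literal '%' so that percent-decoding yields it back unchanged.
inline std::string escapePercent(const std::string &decoded)
{
    std::string out;
    out.reserve(decoded.size());
    for (char c : decoded) {
        if (c == '%')
            out += "%25";
        else
            out += c;
    }
    return out;
}
```

    Note that "a%20b" becomes "a%2520b": in decoded input the "%20" is three
    literal characters, not an encoded space.
*/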

static inline QString
recodeFromUser(const QString &input, const ushort *actions, qsizetype from, qsizetype to)
{
    QString output;
    const QChar *begin = input.constData() + from;
    const QChar *end = input.constData() + to;
    if (qt_urlRecode(output, QStringView{begin, end}, {}, actions))
        return output;

    return input.mid(from, to - from);
}

// appendXXXX functions: copy from the internal form to the external, user form.
// the internal value is stored in its PrettyDecoded form, so that case is easy.
static inline void appendToUser(QString &appendTo, QStringView value, QUrl::FormattingOptions options,
                                const ushort *actions)
{
    // The stored value is already QUrl::PrettyDecoded, so there's nothing to
    // do if that's what the user asked for (test only
    // ComponentFormattingOptions, ignore FormattingOptions).
    if ((options & 0xFFFF0000) == QUrl::PrettyDecoded ||
            !qt_urlRecode(appendTo, value, options, actions))
        appendTo += value;

    // copy nullness, if necessary, because QString::operator+=(QStringView) doesn't
    if (appendTo.isNull() && !value.isNull())
        appendTo.detach();
}
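/*
    Illustrative sketch, not part of QUrl. appendToUser() masks the options
    with 0xFFFF0000 so that only the component-formatting bits decide whether
    the stored value can be copied as is. The example below reproduces that
    test with made-up constants: the numeric values are assumptions chosen for
    illustration and are not taken from the QUrl headers, and
    wantsPrettyDecoded is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: low 16 bits say what to remove, high 16 bits say how to
// encode. The concrete values are illustrative only.
constexpr std::uint32_t RemovePassword     = 0x2;
constexpr std::uint32_t StripTrailingSlash = 0x400;
constexpr std::uint32_t PrettyDecoded      = 0x000000;
constexpr std::uint32_t EncodeSpaces       = 0x100000;

// True when no encoding-related bit is set, whatever the low bits say.
constexpr bool wantsPrettyDecoded(std::uint32_t options)
{
    return (options & 0xFFFF0000u) == PrettyDecoded;
}
```

    With this layout, combining RemovePassword with StripTrailingSlash still
    counts as "pretty decoded", while adding EncodeSpaces does not.
*/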

inline void QUrlPrivate::appendAuthority(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    if ((options & QUrl::RemoveUserInfo) != QUrl::RemoveUserInfo) {
        appendUserInfo(appendTo, options, appendingTo);

        // add '@' only if we added anything
        if (hasUserName() || (hasPassword() && (options & QUrl::RemovePassword) == 0))
            appendTo += u'@';
    }
    appendHost(appendTo, options);
    if (!(options & QUrl::RemovePort) && port != -1)
        appendTo += u':' + QString::number(port);
}

inline void QUrlPrivate::appendUserInfo(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    if (Q_LIKELY(!hasUserInfo()))
        return;

    const ushort *userNameActions;
    const ushort *passwordActions;
    if (options & QUrl::EncodeDelimiters) {
        userNameActions = userNameInUrl;
        passwordActions = passwordInUrl;
    } else {
        switch (appendingTo) {
        case UserInfo:
            userNameActions = userNameInUserInfo;
            passwordActions = passwordInUserInfo;
            break;

        case Authority:
            userNameActions = userNameInAuthority;
            passwordActions = passwordInAuthority;
            break;

        case FullUrl:
            userNameActions = userNameInUrl;
            passwordActions = passwordInUrl;
            break;

        default:
            // can't happen
            Q_UNREACHABLE();
            break;
        }
    }

    if (!qt_urlRecode(appendTo, userName, options, userNameActions))
        appendTo += userName;
    if (options & QUrl::RemovePassword || !hasPassword()) {
        return;
    } else {
        appendTo += u':';
        if (!qt_urlRecode(appendTo, password, options, passwordActions))
            appendTo += password;
    }
}

inline void QUrlPrivate::appendUserName(QString &appendTo, QUrl::FormattingOptions options) const
{
    // only called from QUrl::userName()
    appendToUser(appendTo, userName, options,
                 options & QUrl::EncodeDelimiters ? userNameInUrl : userNameInIsolation);
}

inline void QUrlPrivate::appendPassword(QString &appendTo, QUrl::FormattingOptions options) const
{
    // only called from QUrl::password()
    appendToUser(appendTo, password, options,
                 options & QUrl::EncodeDelimiters ? passwordInUrl : passwordInIsolation);
}

inline void QUrlPrivate::appendPath(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    QString thePath = path;
    if (options & QUrl::NormalizePathSegments) {
        thePath = qt_normalizePathSegments(path, isLocalFile() ? QDirPrivate::DefaultNormalization : QDirPrivate::RemotePath);
    }

    QStringView thePathView(thePath);
    if (options & QUrl::RemoveFilename) {
        const qsizetype slash = path.lastIndexOf(u'/');
        if (slash == -1)
            return;
        thePathView = QStringView{path}.left(slash + 1);
    }
    // check if we need to remove trailing slashes
    if (options & QUrl::StripTrailingSlash) {
        while (thePathView.size() > 1 && thePathView.endsWith(u'/'))
            thePathView.chop(1);
    }

    appendToUser(appendTo, thePathView, options,
                 appendingTo == FullUrl || options & QUrl::EncodeDelimiters ? pathInUrl : pathInIsolation);
}
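/*
    Illustrative sketch, not part of QUrl. appendPath() applies two small
    trimming steps before recoding: RemoveFilename keeps everything up to and
    including the last '/', and StripTrailingSlash chops trailing slashes but
    never shrinks the path below one character. The std::string versions below
    show both steps in isolation; removeFilename and stripTrailingSlashes are
    hypothetical names.

```cpp
#include <cassert>
#include <string>

// Keep everything up to and including the last '/'. A path without any '/'
// yields an empty result, matching the early return in the function above.
inline std::string removeFilename(const std::string &path)
{
    const std::string::size_type slash = path.rfind('/');
    if (slash == std::string::npos)
        return std::string();
    return path.substr(0, slash + 1);
}

// Drop trailing slashes, but leave a lone "/" untouched.
inline std::string stripTrailingSlashes(std::string path)
{
    while (path.size() > 1 && path.back() == '/')
        path.pop_back();
    return path;
}
```

    The size() > 1 guard is what keeps the root path "/" from collapsing into
    an empty string.
*/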

inline void QUrlPrivate::appendFragment(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    appendToUser(appendTo, fragment, options,
                 options & QUrl::EncodeDelimiters ? fragmentInUrl :
                 appendingTo == FullUrl ? nullptr : fragmentInIsolation);
}

inline void QUrlPrivate::appendQuery(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    appendToUser(appendTo, query, options,
                 appendingTo == FullUrl || options & QUrl::EncodeDelimiters ? queryInUrl : queryInIsolation);
}

// setXXX functions

inline bool QUrlPrivate::setScheme(const QString &value, qsizetype len, bool doSetError)
{
    // schemes are strictly RFC-compliant:
    //    scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    // we also lowercase the scheme

    // schemes in URLs are not allowed to be empty, but they can be in
    // "Relative URIs" which QUrl also supports. QUrl::setScheme does
    // not call us with len == 0, so this can only be from parse()
    scheme.clear();
    if (len == 0)
        return false;

    sectionIsPresent |= Scheme;

    // validate it:
    qsizetype needsLowercasing = -1;
    const ushort *p = reinterpret_cast<const ushort *>(value.data());
    for (qsizetype i = 0; i < len; ++i) {
        if (isAsciiLower(p[i]))
            continue;
        if (isAsciiUpper(p[i])) {
            needsLowercasing = i;
            continue;
        }
        if (i) {
            if (isAsciiDigit(p[i]))
                continue;
            if (p[i] == '+' || p[i] == '-' || p[i] == '.')
                continue;
        }

        // found something else
        // don't call setError needlessly:
        // if we've been called from parse(), it will try to recover
        if (doSetError)
            setError(InvalidSchemeError, value, i);
        return false;
    }

    scheme = value.left(len);

    if (needsLowercasing != -1) {
        // schemes are ASCII only, so we don't need the full Unicode toLower
        QChar *schemeData = scheme.data(); // force detaching here
        for (qsizetype i = needsLowercasing; i >= 0; --i) {
            ushort c = schemeData[i].unicode();
            if (isAsciiUpper(c))
                schemeData[i] = QChar(c + 0x20);
        }
    }

    // did we set to the file protocol?
    if (scheme == fileScheme()
#ifdef Q_OS_WIN
        || scheme == webDavScheme()
#endif
       ) {
        flags |= IsLocalFile;
    } else {
        flags &= ~IsLocalFile;
    }
    return true;
}
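/*
    Illustrative sketch, not part of QUrl. setScheme() validates against the
    RFC 3986 rule quoted in its first comment and lowercases ASCII letters.
    The std::string version below follows the same rule in a single pass.
    normalizeScheme is a hypothetical name, and returning an empty string on
    failure is a convention of this sketch only (a valid scheme is never
    empty); the real function reports failure through its bool result and
    setError().

```cpp
#include <cassert>
#include <string>

// Returns the lowercased scheme, or an empty string if the input does not
// match: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
inline std::string normalizeScheme(const std::string &value)
{
    std::string result;
    for (std::string::size_type i = 0; i < value.size(); ++i) {
        const char c = value[i];
        if (c >= 'a' && c <= 'z') {
            result += c;
        } else if (c >= 'A' && c <= 'Z') {
            result += static_cast<char>(c + 0x20); // ASCII-only lowercasing
        } else if (i > 0 && ((c >= '0' && c <= '9') || c == '+' || c == '-' || c == '.')) {
            result += c; // digits and "+-." are allowed after the first character
        } else {
            return std::string(); // invalid character, or invalid first character
        }
    }
    return result;
}
```

    For example, "HTTP" normalizes to "http", while "1http" is rejected
    because a scheme has to start with a letter.
*/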

inline void QUrlPrivate::setAuthority(const QString &auth, qsizetype from, qsizetype end, QUrl::ParsingMode mode)
{
    sectionIsPresent &= ~Authority;
    sectionIsPresent |= Host;
    port = -1;

    // we never actually _loop_
    while (from != end) {
        qsizetype userInfoIndex = auth.indexOf(u'@', from);
        if (size_t(userInfoIndex) < size_t(end)) {
            setUserInfo(auth, from, userInfoIndex);
            if (mode == QUrl::StrictMode && !validateComponent(UserInfo, auth, from, userInfoIndex))
                break;
            from = userInfoIndex + 1;
        }

        qsizetype colonIndex = auth.lastIndexOf(u':', end - 1);
        if (colonIndex < from)
            colonIndex = -1;

        if (size_t(colonIndex) < size_t(end)) {
            if (auth.at(from).unicode() == '[') {
                // check if colonIndex isn't inside the "[...]" part
                qsizetype closingBracket = auth.indexOf(u']', from);
                if (size_t(closingBracket) > size_t(colonIndex))
                    colonIndex = -1;
            }
        }

        if (size_t(colonIndex) < size_t(end) - 1) {
            // found a colon with digits after it
            unsigned long x = 0;
            for (qsizetype i = colonIndex + 1; i < end; ++i) {
                ushort c = auth.at(i).unicode();
                if (isAsciiDigit(c)) {
                    x *= 10;
                    x += c - '0';
                } else {
                    x = ulong(-1); // x != ushort(x)
                    break;
                }
            }
            if (x == ushort(x)) {
                port = ushort(x);
            } else {
                setError(InvalidPortError, auth, colonIndex + 1);
                if (mode == QUrl::StrictMode)
                    break;
            }
        }

        setHost(auth, from, qMin<size_t>(end, colonIndex), mode);
        if (mode == QUrl::StrictMode && !validateComponent(Host, auth, from, qMin<size_t>(end, colonIndex))) {
            // clear host too
            sectionIsPresent &= ~Authority;
            break;
        }

        // success
        return;
    }
    // clear all sections but host
    sectionIsPresent &= ~Authority | Host;
    userName.clear();
    password.clear();
    host.clear();
    port = -1;
}
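/*
    Illustrative sketch, not part of QUrl. setAuthority() accumulates the
    digits after the last colon and accepts the result only if it fits in 16
    bits (the x == ushort(x) test). The stand-alone version below does the
    same range check on a std::string. parsePort is a hypothetical name, and
    two details differ on purpose from the code above: it stops as soon as the
    value exceeds 65535, so the accumulator can never wrap around, and it
    reports an empty string as -1, whereas the real function simply leaves the
    port unset when nothing follows the colon.

```cpp
#include <cassert>
#include <string>

// Returns the port in [0, 65535], or -1 if the text is empty, contains a
// non-digit, or names a value that does not fit in 16 bits.
inline int parsePort(const std::string &text)
{
    if (text.empty())
        return -1;

    unsigned long x = 0;
    for (char c : text) {
        if (c < '0' || c > '9')
            return -1;
        x = x * 10 + static_cast<unsigned long>(c - '0');
        if (x > 0xFFFFul)
            return -1; // out of range; stop before x can overflow
    }
    return static_cast<int>(x);
}
```

    "65535" is the largest accepted value; "65536" and any longer digit string
    are rejected.
*/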

inline void QUrlPrivate::setUserInfo(const QString &userInfo, qsizetype from, qsizetype end)
{
    qsizetype delimIndex = userInfo.indexOf(u':', from);
    setUserName(userInfo, from, qMin<size_t>(delimIndex, end));

    if (size_t(delimIndex) >= size_t(end)) {
        password.clear();
        sectionIsPresent &= ~Password;
    } else {
        setPassword(userInfo, delimIndex + 1, end);
    }
}
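/*
    Illustrative sketch, not part of QUrl. setUserInfo() and setAuthority()
    compare indices after casting them to size_t. Since a "not found" result
    of -1 becomes the largest unsigned value, one comparison covers both "not
    found" and "found, but past the end of the section". The example below
    uses the same trick to split "user:password" inside a larger string.
    indexOf, UserInfoParts and splitUserInfo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Index of the first occurrence of c at or after 'from', or -1 if none.
inline std::ptrdiff_t indexOf(const std::string &s, char c, std::ptrdiff_t from)
{
    const std::string::size_type pos = s.find(c, static_cast<std::size_t>(from));
    return pos == std::string::npos ? -1 : static_cast<std::ptrdiff_t>(pos);
}

struct UserInfoParts {
    std::string userName;
    std::string password;
    bool hasPassword;
};

// Split the user info stored in [from, end) of s at its first ':'.
inline UserInfoParts splitUserInfo(const std::string &s, std::ptrdiff_t from, std::ptrdiff_t end)
{
    assert(from >= 0 && from <= end && static_cast<std::size_t>(end) <= s.size());
    const std::ptrdiff_t delim = indexOf(s, ':', from);

    // -1 converts to the maximum size_t, so this is false both when there is
    // no ':' at all and when the first ':' lies outside the section.
    const bool found = static_cast<std::size_t>(delim) < static_cast<std::size_t>(end);
    const std::ptrdiff_t nameEnd = found ? delim : end;

    UserInfoParts parts;
    parts.userName = s.substr(static_cast<std::size_t>(from), static_cast<std::size_t>(nameEnd - from));
    parts.hasPassword = found;
    if (found)
        parts.password = s.substr(static_cast<std::size_t>(delim + 1), static_cast<std::size_t>(end - delim - 1));
    return parts;
}
```

    In "user@host:80" with the section ending at index 4, the only ':' belongs
    to the port, so no password is reported.
*/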

inline void QUrlPrivate::setUserName(const QString &value, qsizetype from, qsizetype end)
{
    sectionIsPresent |= UserName;
    userName = recodeFromUser(value, userNameInIsolation, from, end);
}

inline void QUrlPrivate::setPassword(const QString &value, qsizetype from, qsizetype end)
{
    sectionIsPresent |= Password;
    password = recodeFromUser(value, passwordInIsolation, from, end);
}

inline void QUrlPrivate::setPath(const QString &value, qsizetype from, qsizetype end)
{
    // sectionIsPresent |= Path; // not used, save some cycles
    path = recodeFromUser(value, pathInIsolation, from, end);
}

inline void QUrlPrivate::setFragment(const QString &value, qsizetype from, qsizetype end)
{
    sectionIsPresent |= Fragment;
    fragment = recodeFromUser(value, fragmentInIsolation, from, end);
}

inline void QUrlPrivate::setQuery(const QString &value, qsizetype from, qsizetype iend)
{
    sectionIsPresent |= Query;
    query = recodeFromUser(value, queryInIsolation, from, iend);
}

// Host handling
// The RFC says the host is:
//    host          = IP-literal / IPv4address / reg-name
//    IP-literal    = "[" ( IPv6address / IPvFuture  ) "]"
//    IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
//  [a strict definition of IPv6Address and IPv4Address]
//     reg-name      = *( unreserved / pct-encoded / sub-delims )
//
// We deviate from the standard in all but IPvFuture. For IPvFuture we accept
// and store only exactly what the RFC says we should. No percent-encoding is
// permitted in this field, so Unicode characters and space aren't either.
//
// For IPv4 addresses, we accept broken addresses like inet_aton does (that is,
// fewer than three dots). However, we correct the address to the proper form
// and store the corrected address. After correction, we comply with the RFC and
// it's exclusively composed of unreserved characters.
//
// For IPv6 addresses, we accept addresses including trailing (embedded) IPv4
// addresses, the so-called v4-compat and v4-mapped addresses. We also store
// those addresses like that in the hostname field, which violates the spec.
// IPv6 hosts are stored with the square brackets in the QString. They also
// require no transformation of any kind.
//
// As for registered names, it's the other way around: we accept only valid
// hostnames as specified by STD 3 and IDNA. That means everything we accept is
// valid in the RFC definition above, but there are many valid reg-names
// according to the RFC that we do not accept in the name of security. Since we
// do accept IDNA, reg-names are subject to ACE encoding and decoding, which is
// specified by the DecodeUnicode flag. The hostname is stored in its Unicode form.

inline void QUrlPrivate::appendHost(QString &appendTo, QUrl::FormattingOptions options) const
{
    if (host.isEmpty())
        return;
    if (host.at(0).unicode() == '[') {
        // IPv6 addresses might contain a zone-id which needs to be recoded
        if (options != 0)
            if (qt_urlRecode(appendTo, host, options, nullptr))
                return;
        appendTo += host;
    } else {
        // this is either an IPv4Address or a reg-name
        // if it is a reg-name, it is already stored in Unicode form
        if (options & QUrl::EncodeUnicode && !(options & 0x4000000))
            appendTo += qt_ACE_do(host, ToAceOnly, AllowLeadingDot, {});
        else
            appendTo += host;
    }
}

// the whole IPvFuture is passed and parsed here, including brackets;
// returns null if the parsing was successful, or the QChar of the first failure
static const QChar *parseIpFuture(QString &host, const QChar *begin, const QChar *end, QUrl::ParsingMode mode)
{
    //    IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
    static const char acceptable[] =
            "!$&'()*+,;=" // sub-delims
            ":" // ":"
            "-._~"; // unreserved

    // the brackets and the "v" have been checked
    const QChar *const origBegin = begin;
    if (begin[3].unicode() != '.')
        return &begin[3];
    if (isHexDigit(begin[2].unicode())) {
        // this is so unlikely that we'll just go down the slow path
        // decode the whole string, skipping the "[vH." and "]" which we already know to be there
        host += QStringView(begin, 4);

        // uppercase the version, if necessary
        if (begin[2].unicode() >= 'a')
            host[host.size() - 2] = QChar{begin[2].unicode() - 0x20};

        begin += 4;
        --end;

        QString decoded;
        if (mode == QUrl::TolerantMode && qt_urlRecode(decoded, QStringView{begin, end}, QUrl::FullyDecoded, nullptr)) {
            begin = decoded.constBegin();
            end = decoded.constEnd();
        }

        for ( ; begin != end; ++begin) {
            if (isAsciiLetterOrNumber(begin->unicode()))
                host += *begin;
            else if (begin->unicode() < 0x80 && strchr(acceptable, begin->unicode()) != nullptr)
                host += *begin;
            else
                return decoded.isEmpty() ? begin : &origBegin[2];
        }
        host += u']';
        return nullptr;
    }
    return &origBegin[2];
}
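/*
    Illustrative sketch, not part of QUrl. parseIpFuture() accepts only the
    form "[vH.payload]" with a single hex digit H, uppercases that digit, and
    restricts the payload to ASCII letters, digits, sub-delims, ":" and the
    unreserved punctuation. The std::string version below mirrors those checks
    without the tolerant-mode percent-decoding step. normalizeIpFuture is a
    hypothetical name, and returning an empty string on failure is a
    convention of this sketch only.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>
#include <string>

// Validate "[vH.payload]" and return it with H uppercased, or "" on failure.
inline std::string normalizeIpFuture(const std::string &host)
{
    static const char acceptable[] =
            "!$&'()*+,;=" // sub-delims
            ":"           // ":"
            "-._~";       // unreserved punctuation

    // the smallest valid form, "[vH.X]", has six characters
    if (host.size() < 6 || host.front() != '[' || host.back() != ']' || host[1] != 'v')
        return std::string();

    const unsigned char version = static_cast<unsigned char>(host[2]);
    if (!std::isxdigit(version) || host[3] != '.')
        return std::string();

    std::string out = "[v";
    out += static_cast<char>(std::toupper(version));
    out += '.';
    for (std::string::size_type i = 4; i + 1 < host.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(host[i]);
        // reject NUL explicitly: strchr() would otherwise match the terminator
        const bool ok = c != 0 && c < 0x80
                && (std::isalnum(c) || std::strchr(acceptable, c) != nullptr);
        if (!ok)
            return std::string();
        out += host[i];
    }
    out += ']';
    return out;
}
```

    For example, "[va.x:y]" is returned as "[vA.x:y]", while a payload that
    contains a space is rejected.
*/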

// ONLY the IPv6 address is parsed here, WITHOUT the brackets
static const QChar *parseIp6(QString &host, const QChar *begin, const QChar *end, QUrl::ParsingMode mode)
{
    QStringView decoded(begin, end);
    QString decodedBuffer;
    if (mode == QUrl::TolerantMode) {
        // this struct is kept in automatic storage because it's only 4 bytes
        const ushort decodeColon[] = { decode(':'), 0 };
        if (qt_urlRecode(decodedBuffer, decoded, QUrl::ComponentFormattingOption::PrettyDecoded, decodeColon))
            decoded = decodedBuffer;
    }

    const QStringView zoneIdIdentifier(u"%25");
    QIPAddressUtils::IPv6Address address;
    QStringView zoneId;

    qsizetype zoneIdPosition = decoded.indexOf(zoneIdIdentifier);
    if ((zoneIdPosition != -1) && (decoded.lastIndexOf(zoneIdIdentifier) == zoneIdPosition)) {
        zoneId = decoded.mid(zoneIdPosition + zoneIdIdentifier.size());
        decoded.truncate(zoneIdPosition);

        // was there anything after the zone ID separator?
        if (zoneId.isEmpty())
            return end;
    }

    // did the address become empty after removing the zone ID?
    // (it might have always been empty)
    if (decoded.isEmpty())
        return end;

    const QChar *ret = QIPAddressUtils::parseIp6(address, decoded.constBegin(), decoded.constEnd());
    if (ret)
        return begin + (ret - decoded.constBegin());

    host.reserve(host.size() + (end - begin) + 2); // +2 for the brackets
    host += u'[';
    QIPAddressUtils::toString(host, address);

    if (!zoneId.isEmpty()) {
        host += zoneIdIdentifier;
        host += zoneId;
    }
    host += u']';
    return nullptr;
}
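/*
    Illustrative sketch, not part of QUrl. Before handing the text to the IPv6
    parser, parseIp6() splits off an optional zone ID introduced by "%25". The
    separator is honored only if it occurs exactly once, the zone ID must not
    be empty, and the address part must not be empty either. The std::string
    version below shows just that splitting step. SplitAddress and splitZoneId
    are hypothetical names, and parsing of the address itself is left out.

```cpp
#include <cassert>
#include <string>

struct SplitAddress {
    bool ok;              // false if the zone ID or the address is empty
    std::string address;  // text to pass on to the IPv6 address parser
    std::string zoneId;   // empty when no zone ID was given
};

inline SplitAddress splitZoneId(const std::string &decoded)
{
    static const std::string separator = "%25";
    SplitAddress result{false, decoded, std::string()};

    const std::string::size_type pos = decoded.find(separator);
    // only a single, unique separator is treated as a zone ID marker
    if (pos != std::string::npos && decoded.rfind(separator) == pos) {
        result.zoneId = decoded.substr(pos + separator.size());
        result.address = decoded.substr(0, pos);

        // was there anything after the separator?
        if (result.zoneId.empty())
            return result;
    }

    // did the address become empty (or was it always empty)?
    if (result.address.empty())
        return result;

    result.ok = true;
    return result;
}
```

    With two separators, as in "a%25b%25c", nothing is split off and the whole
    text goes to the address parser, which is then expected to reject it.
*/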

inline bool
QUrlPrivate::setHost(const QString &value, qsizetype from, qsizetype iend, QUrl::ParsingMode mode)
{
    const QChar *begin = value.constData() + from;
    const QChar *end = value.constData() + iend;

    const qsizetype len = end - begin;
    host.clear();
    sectionIsPresent |= Host;
    if (len == 0)
        return true;

    if (begin[0].unicode() == '[') {
        // IPv6Address or IPvFuture
        // smallest IPv6 address is      "[::]"   (len = 4)
        // smallest IPvFuture address is "[v7.X]" (len = 6)
        if (end[-1].unicode() != ']') {
            setError(HostMissingEndBracket, value);
            return false;
        }

        if (len > 5 && begin[1].unicode() == 'v') {
            const QChar *c = parseIpFuture(host, begin, end, mode);
            if (c)
                setError(InvalidIPvFutureError, value, c - value.constData());
            return !c;
        } else if (begin[1].unicode() == 'v') {
            setError(InvalidIPvFutureError, value, from);
        }

        const QChar *c = parseIp6(host, begin + 1, end - 1, mode);
        if (!c)
            return true;

        if (c == end - 1)
            setError(InvalidIPv6AddressError, value, from);
        else
            setError(InvalidCharacterInIPv6Error, value, c - value.constData());
        return false;
    }

    // check if it's an IPv4 address
    QIPAddressUtils::IPv4Address ip4;
    if (QIPAddressUtils::parseIp4(ip4, begin, end)) {
        // yes, it was
        QIPAddressUtils::toString(host, ip4);
        return true;
    }

    // This is probably a reg-name.
    // But it can also be an encoded string that, when decoded becomes one
    // of the types above.
    //
    // Two types of encoding are possible:
    //  percent encoding (e.g., "%31%30%2E%30%2E%30%2E%31" -> "10.0.0.1")
    //  Unicode encoding (some non-ASCII characters case-fold to digits
    //                    when nameprepping is done)
    //
    // The qt_ACE_do function below does IDNA normalization and the STD3 check.
    // That means a Unicode string may become an IPv4 address, but it cannot
    // produce a '[' or a '%'.

    // check for percent-encoding first
    QString s;
    if (mode == QUrl::TolerantMode && qt_urlRecode(s, QStringView{begin, end}, { }, nullptr)) {
        // something was decoded
        // anything encoded left?
        qsizetype pos = s.indexOf(QChar(0x25)); // '%'
        if (pos != -1) {
            setError(InvalidRegNameError, s, pos);
            return false;
        }

        // recurse
        return setHost(s, 0, s.size(), QUrl::StrictMode);
    }

    s = qt_ACE_do(value.mid(from, iend - from), NormalizeAce, ForbidLeadingDot, {});
    if (s.isEmpty()) {
        setError(InvalidRegNameError, value);
        return false;
    }

    // check IPv4 again
    if (QIPAddressUtils::parseIp4(ip4, s.constBegin(), s.constEnd())) {
        QIPAddressUtils::toString(host, ip4);
    } else {
        host = s;
    }
    return true;
}

inline void QUrlPrivate::parse(const QString &url, QUrl::ParsingMode parsingMode)
{
    //   URI-reference = URI / relative-ref
    //   URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
    //   relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
    //   hier-part     = "//" authority path-abempty
    //                 / other path types
    //   relative-part = "//" authority path-abempty
    //                 /  other path types here

    sectionIsPresent = 0;
    flags = 0;
    clearError();

    // find the important delimiters
    qsizetype colon = -1;
    qsizetype question = -1;
    qsizetype hash = -1;
    const qsizetype len = url.size();
    const QChar *const begin = url.constData();
    const ushort *const data = reinterpret_cast<const ushort *>(begin);

    for (qsizetype i = 0; i < len; ++i) {
        size_t uc = data[i];
        if (uc == '#' && hash == -1) {
            hash = i;

            // nothing more to be found
            break;
        }

        if (question == -1) {
            if (uc == ':' && colon == -1)
                colon = i;
            else if (uc == '?')
                question = i;
        }
    }

    // check if we have a scheme
    qsizetype hierStart;
    if (colon != -1 && setScheme(url, colon, /* don't set error */ false)) {
        hierStart = colon + 1;
    } else {
        // recover from a failed scheme: it might not have been a scheme at all
        scheme.clear();
        sectionIsPresent = 0;
        hierStart = 0;
    }

    qsizetype pathStart;
    qsizetype hierEnd = qMin<size_t>(qMin<size_t>(question, hash), len);
    if (hierEnd - hierStart >= 2 && data[hierStart] == '/' && data[hierStart + 1] == '/') {
        // we have an authority, it ends at the first slash after these
        qsizetype authorityEnd = hierEnd;
        for (qsizetype i = hierStart + 2; i < authorityEnd ; ++i) {
            if (data[i] == '/') {
                authorityEnd = i;
                break;
            }
        }

        setAuthority(url, hierStart + 2, authorityEnd, parsingMode);

        // even if we failed to set the authority properly, let's try to recover
        pathStart = authorityEnd;
        setPath(url, pathStart, hierEnd);
    } else {
        userName.clear();
        password.clear();
        host.clear();
        port = -1;
        pathStart = hierStart;

        if (hierStart < hierEnd)
            setPath(url, hierStart, hierEnd);
        else
            path.clear();
    }

    if (size_t(question) < size_t(hash))
        setQuery(url, question + 1, qMin<size_t>(hash, len));

    if (hash != -1)
        setFragment(url, hash + 1, len);

    if (error || parsingMode == QUrl::TolerantMode)
        return;

    // The parsing so far was partially tolerant of errors, except for the
    // scheme parser (which is always strict) and the authority (which was
    // executed in strict mode).
    // If we haven't found any errors so far, continue the strict-mode parsing
    // from the path component onwards.

    if (!validateComponent(Path, url, pathStart, hierEnd))
        return;
    if (size_t(question) < size_t(hash) && !validateComponent(Query, url, question + 1, qMin<size_t>(hash, len)))
        return;
    if (hash != -1)
        validateComponent(Fragment, url, hash + 1, len);
}
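/*
    Illustrative sketch, not part of QUrl. parse() starts with a single pass
    that records the first ':' (only if it comes before any '?'), the first
    '?', and the first '#', and stops at the '#' because everything after it
    belongs to the fragment. The std::string version below repeats that scan.
    Delimiters and findDelimiters are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct Delimiters {
    std::ptrdiff_t colon = -1;     // candidate end of the scheme
    std::ptrdiff_t question = -1;  // start of the query
    std::ptrdiff_t hash = -1;      // start of the fragment
};

inline Delimiters findDelimiters(const std::string &url)
{
    Delimiters d;
    for (std::string::size_type i = 0; i < url.size(); ++i) {
        const char c = url[i];
        if (c == '#') {
            d.hash = static_cast<std::ptrdiff_t>(i);
            break; // nothing more to be found
        }

        // once a '?' has been seen, ':' and '?' are ordinary query characters
        if (d.question == -1) {
            if (c == ':' && d.colon == -1)
                d.colon = static_cast<std::ptrdiff_t>(i);
            else if (c == '?')
                d.question = static_cast<std::ptrdiff_t>(i);
        }
    }
    return d;
}
```

    The recorded colon is only a candidate: as in the function above, the text
    before it still has to pass scheme validation, otherwise it is treated as
    part of a relative reference.
*/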

QString QUrlPrivate::toLocalFile(QUrl::FormattingOptions options) const
{
    QString tmp;
    QString ourPath;
    appendPath(ourPath, options, QUrlPrivate::Path);

    // magic for shared drive on windows
    if (!host.isEmpty()) {
        tmp = "//"_L1 + host;
#ifdef Q_OS_WIN // QTBUG-42346, WebDAV is visible as local file on Windows only.
        if (scheme == webDavScheme())
            tmp += webDavSslTag();
#endif
        if (!ourPath.isEmpty() && !ourPath.startsWith(u'/'))
            tmp += u'/';
        tmp += ourPath;
    } else {
        tmp = ourPath;
#ifdef Q_OS_WIN
        // magic for drives on windows
        if (ourPath.length() > 2 && ourPath.at(0) == u'/' && ourPath.at(2) == u':')
            tmp.remove(0, 1);
#endif
    }
    return tmp;
}

/*
    From http://www.ietf.org/rfc/rfc3986.txt, 5.2.3: Merge paths

    Returns a merge of the current path with the relative path passed
    as argument.

    Note: \a relativePath is relative (does not start with '/').
*/
inline QString QUrlPrivate::mergePaths(const QString &relativePath) const
{
    // If the base URI has a defined authority component and an empty
    // path, then return a string consisting of "/" concatenated with
    // the reference's path; otherwise,
    if (!host.isEmpty() && path.isEmpty())
        return u'/' + relativePath;

    // Return a string consisting of the reference's path component
    // appended to all but the last segment of the base URI's path
    // (i.e., excluding any characters after the right-most "/" in the
    // base URI path, or excluding the entire base URI path if it does
    // not contain any "/" characters).
    QString newPath;
    if (!path.contains(u'/'))
        newPath = relativePath;
    else
        newPath = QStringView{path}.left(path.lastIndexOf(u'/') + 1) + relativePath;

    return newPath;
}
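/*
    Illustrative sketch, not part of QUrl. The merge rule from RFC 3986
    section 5.2.3 that mergePaths() implements can be written as a free
    function on std::string. The function name and the explicit
    baseHasAuthority parameter are hypothetical; the member function above
    uses a non-empty host as its test for "has an authority".

```cpp
#include <cassert>
#include <string>

// Merge a relative reference path onto a base path, per RFC 3986, 5.2.3.
inline std::string mergePathsSketch(bool baseHasAuthority, const std::string &basePath,
                                    const std::string &relativePath)
{
    // base has an authority and an empty path: the result is "/" + reference
    if (baseHasAuthority && basePath.empty())
        return "/" + relativePath;

    // otherwise keep everything up to and including the right-most "/" of the
    // base path (nothing at all if it has no "/") and append the reference
    const std::string::size_type slash = basePath.rfind('/');
    if (slash == std::string::npos)
        return relativePath;
    return basePath.substr(0, slash + 1) + relativePath;
}
```

    With the base path "/b/c/d;p" used in the RFC's examples, merging "g"
    gives "/b/c/g"; dot segments in the result still need to be removed in a
    separate step.
*/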
1518 | |
1519 | /* |
1520 | From http://www.ietf.org/rfc/rfc3986.txt, 5.2.4: Remove dot segments |
1521 | |
1522 | Removes unnecessary ../ and ./ from the path. Used for normalizing |
1523 | the URL. |
1524 | */ |
static void removeDotsFromPath(QString *path)
{
    // The input buffer is initialized with the now-appended path
    // components and the output buffer is initialized to the empty
    // string.
    QChar *out = path->data();
    const QChar *in = out;
    const QChar *end = out + path->size();

    // If the input buffer consists only of
    // "." or "..", then remove that from the input
    // buffer;
    if (path->size() == 1 && in[0].unicode() == '.')
        ++in;
    else if (path->size() == 2 && in[0].unicode() == '.' && in[1].unicode() == '.')
        in += 2;
    // While the input buffer is not empty, loop:
    while (in < end) {

        // otherwise, if the input buffer begins with a prefix of "../" or "./",
        // then remove that prefix from the input buffer;
        if (path->size() >= 2 && in[0].unicode() == '.' && in[1].unicode() == '/')
            in += 2;
        else if (path->size() >= 3 && in[0].unicode() == '.'
                 && in[1].unicode() == '.' && in[2].unicode() == '/')
            in += 3;

        // otherwise, if the input buffer begins with a prefix of
        // "/./" or "/.", where "." is a complete path segment,
        // then replace that prefix with "/" in the input buffer;
        if (in <= end - 3 && in[0].unicode() == '/' && in[1].unicode() == '.'
                && in[2].unicode() == '/') {
            in += 2;
            continue;
        } else if (in == end - 2 && in[0].unicode() == '/' && in[1].unicode() == '.') {
            *out++ = u'/';
            in += 2;
            break;
        }

        // otherwise, if the input buffer begins with a prefix
        // of "/../" or "/..", where ".." is a complete path
        // segment, then replace that prefix with "/" in the
        // input buffer and remove the last segment and its
        // preceding "/" (if any) from the output buffer;
        if (in <= end - 4 && in[0].unicode() == '/' && in[1].unicode() == '.'
                && in[2].unicode() == '.' && in[3].unicode() == '/') {
            while (out > path->constData() && (--out)->unicode() != '/')
                ;
            if (out == path->constData() && out->unicode() != '/')
                ++in;
            in += 3;
            continue;
        } else if (in == end - 3 && in[0].unicode() == '/' && in[1].unicode() == '.'
                && in[2].unicode() == '.') {
            while (out > path->constData() && (--out)->unicode() != '/')
                ;
            if (out->unicode() == '/')
                ++out;
            in += 3;
            break;
        }

        // otherwise move the first path segment in
        // the input buffer to the end of the output
        // buffer, including the initial "/" character
        // (if any) and any subsequent characters up
        // to, but not including, the next "/"
        // character or the end of the input buffer.
        *out++ = *in++;
        while (in < end && in->unicode() != '/')
            *out++ = *in++;
    }
    path->truncate(out - path->constData());
}
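
/*
    Illustrative sketch, not part of QUrl: a literal transcription of the
    RFC 3986, 5.2.4 rules using two std::string buffers. The function above
    works in place on one buffer, so this is a reference for comparing
    results rather than a description of how Qt does it. The name
    removeDotSegmentsSketch() is an assumption made for this example.

```cpp
#include <string>

std::string removeDotSegmentsSketch(std::string in)
{
    std::string out;
    auto startsWith = [&in](const char *prefix) {
        return in.compare(0, std::string(prefix).size(), prefix) == 0;
    };
    while (!in.empty()) {
        if (startsWith("../")) {
            in.erase(0, 3);              // drop a leading "../"
        } else if (startsWith("./")) {
            in.erase(0, 2);              // drop a leading "./"
        } else if (startsWith("/./")) {
            in.erase(0, 2);              // "/./" becomes "/"
        } else if (in == "/.") {
            in = "/";                    // trailing "/." becomes "/"
        } else if (startsWith("/../") || in == "/..") {
            in.erase(0, 3);              // "/../" becomes "/"
            if (in.empty())
                in = "/";                // trailing "/.." becomes "/"
            // Remove the last output segment and its preceding "/".
            const std::string::size_type slash = out.rfind('/');
            out.erase(slash == std::string::npos ? 0 : slash);
        } else if (in == "." || in == "..") {
            in.clear();                  // lone "." or ".."
        } else {
            // Move the first segment, including its leading "/", to out.
            const std::string::size_type next = in.find('/', 1);
            out += in.substr(0, next);
            in.erase(0, next);
        }
    }
    return out;
}
```

    For the RFC's own examples, "/a/b/c/./../../g" becomes "/a/g" and
    "mid/content=5/../6" becomes "mid/6".
*/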
1600 | |
inline QUrlPrivate::ErrorCode QUrlPrivate::validityError(QString *source, qsizetype *position) const
{
    Q_ASSERT(!source == !position);
    if (error) {
        if (source) {
            *source = error->source;
            *position = error->position;
        }
        return error->code;
    }

    // There are three more cases of invalid URLs that QUrl recognizes and they
    // are only possible with constructed URLs (setXXX methods), not with
    // parsing. Therefore, they are tested here.
    //
    // Two cases are a non-empty path that doesn't start with a slash and:
    //  - with an authority
    //  - without an authority, without scheme but the path with a colon before
    //    the first slash
    // The third case is an empty authority and a non-empty path that starts
    // with "//".
    // Those cases are considered invalid because toString() would produce a URL
    // that wouldn't be parsed back to the same QUrl.

    if (path.isEmpty())
        return NoError;
    if (path.at(0) == u'/') {
        if (hasAuthority() || path.size() == 1 || path.at(1) != u'/')
            return NoError;
        if (source) {
            *source = path;
            *position = 0;
        }
        return AuthorityAbsentAndPathIsDoubleSlash;
    }

    if (sectionIsPresent & QUrlPrivate::Host) {
        if (source) {
            *source = path;
            *position = 0;
        }
        return AuthorityPresentAndPathIsRelative;
    }
    if (sectionIsPresent & QUrlPrivate::Scheme)
        return NoError;

    // check for a path of "text:text/"
    for (qsizetype i = 0; i < path.size(); ++i) {
        ushort c = path.at(i).unicode();
        if (c == '/') {
            // found the slash before the colon
            return NoError;
        }
        if (c == ':') {
            // found the colon before the slash, it's invalid
            if (source) {
                *source = path;
                *position = i;
            }
            return RelativeUrlPathContainsColonBeforeSlash;
        }
    }
    return NoError;
}
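
/*
    Illustrative sketch, not part of QUrl: the three constructed-URL checks
    described in the comment above, reduced to a free function over
    std::string. PathError, checkConstructedPath() and the two boolean flags
    are assumptions made for this example. The real function distinguishes
    hasAuthority() from the Host bit; the sketch folds both into one flag.

```cpp
#include <string>

enum class PathError {
    None,
    AuthorityPresentAndPathIsRelative,
    AuthorityAbsentAndPathIsDoubleSlash,
    RelativeUrlPathContainsColonBeforeSlash
};

PathError checkConstructedPath(bool hasAuthority, bool hasScheme, const std::string &path)
{
    if (path.empty())
        return PathError::None;
    if (path[0] == '/') {
        // "//..." without an authority would be re-parsed as an authority.
        if (hasAuthority || path.size() == 1 || path[1] != '/')
            return PathError::None;
        return PathError::AuthorityAbsentAndPathIsDoubleSlash;
    }
    // A relative path cannot follow an authority.
    if (hasAuthority)
        return PathError::AuthorityPresentAndPathIsRelative;
    if (hasScheme)
        return PathError::None;
    // Without a scheme, "text:text/" would be re-parsed as scheme + path.
    const std::string::size_type colon = path.find(':');
    const std::string::size_type slash = path.find('/');
    if (colon != std::string::npos && (slash == std::string::npos || colon < slash))
        return PathError::RelativeUrlPathContainsColonBeforeSlash;
    return PathError::None;
}
```
*/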
1665 | |
bool QUrlPrivate::validateComponent(QUrlPrivate::Section section, const QString &input,
                                    qsizetype begin, qsizetype end)
{
    // What we need to look out for, that the regular parser tolerates:
    //  - percent signs not followed by two hex digits
    //  - forbidden characters, which should always appear encoded
    //    '"' / '<' / '>' / '\' / '^' / '`' / '{' / '|' / '}' / DEL
    //    control characters
    //  - delimiters not allowed in certain positions
    //    . scheme: parser is already strict
    //    . user info: gen-delims except ":" disallowed ("/" / "?" / "#" / "[" / "]" / "@")
    //    . host: parser is stricter than the standard
    //    . port: parser is stricter than the standard
    //    . path: all delimiters allowed
    //    . fragment: all delimiters allowed
    //    . query: all delimiters allowed
    static const char forbidden[] = "\"<>\\^`{|}\x7F";
    static const char forbiddenUserInfo[] = ":/?#[]@";

    Q_ASSERT(section != Authority && section != Hierarchy && section != FullUrl);

    const ushort *const data = reinterpret_cast<const ushort *>(input.constData());
    for (size_t i = size_t(begin); i < size_t(end); ++i) {
        uint uc = data[i];
        if (uc >= 0x80)
            continue;

        bool error = false;
        if ((uc == '%' && (size_t(end) < i + 2 || !isHex(data[i + 1]) || !isHex(data[i + 2])))
                || uc <= 0x20 || strchr(forbidden, uc)) {
            // found an error
            error = true;
        } else if (section & UserInfo) {
            // the combined user info may contain ':' (skip it in the list);
            // a lone user name or password may not
            if (section == UserInfo && strchr(forbiddenUserInfo + 1, uc))
                error = true;
            else if (section != UserInfo && strchr(forbiddenUserInfo, uc))
                error = true;
        }

        if (!error)
            continue;

        ErrorCode errorCode = ErrorCode(int(section) << 8);
        if (section == UserInfo) {
            // is it the user name or the password?
            errorCode = InvalidUserNameError;
            for (size_t j = size_t(begin); j < i; ++j)
                if (data[j] == ':') {
                    errorCode = InvalidPasswordError;
                    break;
                }
        }

        setError(errorCode, input, i);
        return false;
    }

    // no errors
    return true;
}
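
/*
    Illustrative sketch, not part of QUrl: the strict checks listed above,
    applied to a std::string of ASCII input. Part, firstStrictError() and the
    return convention (index of the first offending character, or npos) are
    assumptions made for this example. It also bounds-checks the two
    characters after '%' against the string length, which is a choice of this
    sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum class Part { UserName, Password, UserInfo, Other };

std::size_t firstStrictError(const std::string &input, Part part)
{
    static const std::string forbidden = "\"<>\\^`{|}\x7F";
    static const std::string userInfoDelims = "/?#[]@";

    for (std::size_t i = 0; i < input.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        if (c >= 0x80)
            continue;                       // non-ASCII is not checked here
        if (c == '%') {
            // '%' must be followed by two hex digits.
            const bool ok = i + 2 < input.size()
                    && std::isxdigit(static_cast<unsigned char>(input[i + 1]))
                    && std::isxdigit(static_cast<unsigned char>(input[i + 2]));
            if (!ok)
                return i;
            continue;
        }
        // Space, control characters and the always-forbidden set.
        if (c <= 0x20 || forbidden.find(static_cast<char>(c)) != std::string::npos)
            return i;
        if (part != Part::Other) {
            if (userInfoDelims.find(static_cast<char>(c)) != std::string::npos)
                return i;
            // ':' separates name and password, so it is only legal in the
            // combined user info, not inside either half.
            if (part != Part::UserInfo && c == ':')
                return i;
        }
    }
    return std::string::npos;
}
```
*/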
1726 | |
1727 | #if 0 |
1728 | inline void QUrlPrivate::validate() const |
1729 | { |
1730 | QUrlPrivate *that = (QUrlPrivate *)this; |
1731 | that->encodedOriginal = that->toEncoded(); // may detach |
1732 | parse(ParseOnly); |
1733 | |
1734 | QURL_SETFLAG(that->stateFlags, Validated); |
1735 | |
1736 | if (!isValid) |
1737 | return; |
1738 | |
1739 | QString auth = authority(); // causes the non-encoded forms to be valid |
1740 | |
1741 | // authority() calls canonicalHost() which sets this |
1742 | if (!isHostValid) |
1743 | return; |
1744 | |
1745 | if (scheme == "mailto"_L1 ) { |
1746 | if (!host.isEmpty() || port != -1 || !userName.isEmpty() || !password.isEmpty()) { |
1747 | that->isValid = false; |
1748 | that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "expected empty host, username," |
1749 | "port and password" ), |
1750 | 0, 0); |
1751 | } |
1752 | } else if (scheme == ftpScheme() || scheme == httpScheme()) { |
1753 | if (host.isEmpty() && !(path.isEmpty() && encodedPath.isEmpty())) { |
1754 | that->isValid = false; |
1755 | that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "the host is empty, but not the path" ), |
1756 | 0, 0); |
1757 | } |
1758 | } |
1759 | } |
1760 | #endif |
1761 | |
1762 | /*! |
1763 | \macro QT_NO_URL_CAST_FROM_STRING |
1764 | \relates QUrl |
1765 | |
1766 | Disables automatic conversions from QString (or char *) to QUrl. |
1767 | |
1768 | Compiling your code with this define is useful when you have a lot of |
1769 | code that uses QString for file names and you wish to convert it to |
1770 | use QUrl for network transparency. In any code that uses QUrl, it can |
1771 | help avoid missing QUrl::resolved() calls, and other misuses of |
1772 | QString to QUrl conversions. |
1773 | |
1774 | For example, if you have code like |
1775 | |
1776 | \code |
1777 | url = filename; // probably not what you want |
1778 | \endcode |
1779 | |
1780 | you can rewrite it as |
1781 | |
1782 | \code |
1783 | url = QUrl::fromLocalFile(filename); |
1784 | url = baseurl.resolved(QUrl(filename)); |
1785 | \endcode |
1786 | |
1787 | \sa QT_NO_CAST_FROM_ASCII |
1788 | */ |
1789 | |
1790 | |
/*!
    Constructs a URL by parsing \a url. Note this constructor expects a proper
    URL or URL-Reference and will not attempt to guess intent. For example, the
    following declaration:

    \snippet code/src_corelib_io_qurl.cpp constructor-url-reference

    will construct a valid URL, but it may not be what one expects, as the
    scheme() part of the input is missing. For a string like the above,
    applications may want to use fromUserInput(). For this constructor or
    setUrl(), the following is probably what was intended:

    \snippet code/src_corelib_io_qurl.cpp constructor-url

    QUrl will automatically percent encode all characters that are not allowed
    in a URL and decode the percent-encoded sequences that represent an
    unreserved character (letters, digits, hyphens, underscores, dots and
    tildes). All other characters are left in their original forms.

    Parses the \a url using the parser mode \a parsingMode. In TolerantMode
    (the default), QUrl will correct certain mistakes, notably the presence of
    a percent character ('%') not followed by two hexadecimal digits, and it
    will accept any character in any position. In StrictMode, encoding mistakes
    will not be tolerated and QUrl will also check that certain forbidden
    characters are not present in unencoded form. If an error is detected in
    StrictMode, isValid() will return \c false. The parsing mode DecodedMode is
    not permitted in this context.

    Example:

    \snippet code/src_corelib_io_qurl.cpp 0

    To construct a URL from an encoded string, you can also use fromEncoded():

    \snippet code/src_corelib_io_qurl.cpp 1

    Both functions are equivalent and, in Qt 5, both functions accept encoded
    data. Usually, the choice of the QUrl constructor or setUrl() versus
    fromEncoded() will depend on the source data: the constructor and setUrl()
    take a QString, whereas fromEncoded() takes a QByteArray.

    \sa setUrl(), fromEncoded(), TolerantMode
*/
QUrl::QUrl(const QString &url, ParsingMode parsingMode) : d(nullptr)
{
    setUrl(url, parsingMode);
}
1839 | |
1840 | /*! |
1841 | Constructs an empty QUrl object. |
1842 | */ |
1843 | QUrl::QUrl() : d(nullptr) |
1844 | { |
1845 | } |
1846 | |
1847 | /*! |
1848 | Constructs a copy of \a other. |
1849 | */ |
1850 | QUrl::QUrl(const QUrl &other) noexcept : d(other.d) |
1851 | { |
1852 | if (d) |
1853 | d->ref.ref(); |
1854 | } |
1855 | |
1856 | /*! |
1857 | Destructor; called immediately before the object is deleted. |
1858 | */ |
1859 | QUrl::~QUrl() |
1860 | { |
1861 | if (d && !d->ref.deref()) |
1862 | delete d; |
1863 | } |
1864 | |
1865 | /*! |
1866 | Returns \c true if the URL is non-empty and valid; otherwise returns \c false. |
1867 | |
1868 | The URL is run through a conformance test. Every part of the URL |
1869 | must conform to the standard encoding rules of the URI standard |
1870 | for the URL to be reported as valid. |
1871 | |
1872 | \snippet code/src_corelib_io_qurl.cpp 2 |
1873 | */ |
1874 | bool QUrl::isValid() const |
1875 | { |
1876 | if (isEmpty()) { |
1877 | // also catches d == nullptr |
1878 | return false; |
1879 | } |
1880 | return d->validityError() == QUrlPrivate::NoError; |
1881 | } |
1882 | |
1883 | /*! |
1884 | Returns \c true if the URL has no data; otherwise returns \c false. |
1885 | |
1886 | \sa clear() |
1887 | */ |
1888 | bool QUrl::isEmpty() const |
1889 | { |
1890 | if (!d) return true; |
1891 | return d->isEmpty(); |
1892 | } |
1893 | |
1894 | /*! |
1895 | Resets the content of the QUrl. After calling this function, the |
1896 | QUrl is equal to one that has been constructed with the default |
1897 | empty constructor. |
1898 | |
1899 | \sa isEmpty() |
1900 | */ |
1901 | void QUrl::clear() |
1902 | { |
1903 | if (d && !d->ref.deref()) |
1904 | delete d; |
1905 | d = nullptr; |
1906 | } |
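
/*
    Illustrative sketch, not part of QUrl: the copy constructor, destructor
    and clear() above follow a reference-counted shared-data pattern. The
    sketch below reproduces that pattern with std::atomic. SharedData, Handle
    and useCount() are assumptions made for this example, and copy-on-write
    detaching is left out.

```cpp
#include <atomic>
#include <string>
#include <utility>

struct SharedData {
    std::atomic<int> ref{1};
    std::string url;
};

class Handle {
public:
    Handle() = default;                                   // no data allocated
    explicit Handle(std::string url) : d(new SharedData) { d->url = std::move(url); }
    Handle(const Handle &other) noexcept : d(other.d)
    {
        if (d)
            d->ref.fetch_add(1);                          // share, do not copy
    }
    Handle &operator=(const Handle &) = delete;           // kept out of the sketch
    ~Handle() { release(); }

    // Drops this handle's reference and returns to the default state.
    void clear() { release(); d = nullptr; }
    bool isEmpty() const { return !d || d->url.empty(); }
    int useCount() const { return d ? d->ref.load() : 0; }

private:
    // Deletes the data when the last reference goes away.
    void release()
    {
        if (d && d->ref.fetch_sub(1) == 1)
            delete d;
    }
    SharedData *d = nullptr;
};
```
*/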
1907 | |
1908 | /*! |
1909 | Parses \a url and sets this object to that value. QUrl will automatically |
1910 | percent encode all characters that are not allowed in a URL and decode the |
1911 | percent-encoded sequences that represent an unreserved character (letters, |
1912 | digits, hyphens, underscores, dots and tildes). All other characters are |
1913 | left in their original forms. |
1914 | |
1915 | Parses the \a url using the parser mode \a parsingMode. In TolerantMode |
1916 | (the default), QUrl will correct certain mistakes, notably the presence of |
1917 | a percent character ('%') not followed by two hexadecimal digits, and it |
1918 | will accept any character in any position. In StrictMode, encoding mistakes |
1919 | will not be tolerated and QUrl will also check that certain forbidden |
1920 | characters are not present in unencoded form. If an error is detected in |
1921 | StrictMode, isValid() will return false. The parsing mode DecodedMode is |
1922 | not permitted in this context and will produce a run-time warning. |
1923 | |
1924 | \sa url(), toString() |
1925 | */ |
void QUrl::setUrl(const QString &url, ParsingMode parsingMode)
{
    if (parsingMode == DecodedMode) {
        qWarning("QUrl: QUrl::DecodedMode is not permitted when parsing a full URL");
    } else {
        detach();
        d->parse(url, parsingMode);
    }
}
1935 | |
/*!
    Sets the scheme of the URL to \a scheme. As a scheme can only
    contain ASCII characters, no conversion or decoding is done on the
    input. It must also start with an ASCII letter.

    The scheme describes the type (or protocol) of the URL. It's
    represented by one or more ASCII characters at the start of the URL.

    A scheme is strictly \l {RFC 3986}-compliant:
        \tt {scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )}

    The following example shows a URL where the scheme is "ftp":

    \image qurl-authority2.png

    To set the scheme, the following call is used:
    \snippet code/src_corelib_io_qurl.cpp 11

    The scheme can also be empty, in which case the URL is interpreted
    as relative.

    \sa scheme(), isRelative()
*/
void QUrl::setScheme(const QString &scheme)
{
    detach();
    d->clearError();
    if (scheme.isEmpty()) {
        // an empty string is not a valid scheme value, so mark the
        // scheme as absent instead of storing it
        d->sectionIsPresent &= ~QUrlPrivate::Scheme;
        d->flags &= ~QUrlPrivate::IsLocalFile;
        d->scheme.clear();
    } else {
        d->setScheme(scheme, scheme.size(), /* do set error */ true);
    }
}
1972 | |
/*!
    Returns the scheme of the URL. If an empty string is returned,
    this means the scheme is undefined and the URL is then relative.

    The scheme can only contain US-ASCII letters, digits and the characters
    '+', '-' and '.', which means it cannot contain any character that would
    otherwise require encoding. Additionally, schemes are always returned in
    lowercase form.

    \sa setScheme(), isRelative()
*/
1983 | QString QUrl::scheme() const |
1984 | { |
1985 | if (!d) return QString(); |
1986 | |
1987 | return d->scheme; |
1988 | } |
1989 | |
1990 | /*! |
1991 | Sets the authority of the URL to \a authority. |
1992 | |
1993 | The authority of a URL is the combination of user info, a host |
1994 | name and a port. All of these elements are optional; an empty |
1995 | authority is therefore valid. |
1996 | |
1997 | The user info and host are separated by a '@', and the host and |
1998 | port are separated by a ':'. If the user info is empty, the '@' |
1999 | must be omitted; although a stray ':' is permitted if the port is |
2000 | empty. |
2001 | |
2002 | The following example shows a valid authority string: |
2003 | |
2004 | \image qurl-authority.png |
2005 | |
2006 | The \a authority data is interpreted according to \a mode: in StrictMode, |
2007 | any '%' characters must be followed by exactly two hexadecimal characters |
2008 | and some characters (including space) are not allowed in undecoded form. In |
2009 | TolerantMode (the default), all characters are accepted in undecoded form |
2010 | and the tolerant parser will correct stray '%' not followed by two hex |
2011 | characters. |
2012 | |
2013 | This function does not allow \a mode to be QUrl::DecodedMode. To set fully |
2014 | decoded data, call setUserName(), setPassword(), setHost() and setPort() |
2015 | individually. |
2016 | |
2017 | \sa setUserInfo(), setHost(), setPort() |
2018 | */ |
void QUrl::setAuthority(const QString &authority, ParsingMode mode)
{
    detach();
    d->clearError();

    if (mode == DecodedMode) {
        qWarning("QUrl::setAuthority(): QUrl::DecodedMode is not permitted in this function");
        return;
    }

    d->setAuthority(authority, 0, authority.size(), mode);
    if (authority.isNull()) {
        // QUrlPrivate::setAuthority cleared almost everything
        // but it leaves the Host bit set
        d->sectionIsPresent &= ~QUrlPrivate::Authority;
    }
}
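
/*
    Illustrative sketch, not part of QUrl: the "user info @ host : port"
    layout described for setAuthority(), split with std::string. The names
    AuthorityParts and splitAuthority() are assumptions made for this example,
    as are the choices to split at the first '@' and to keep the port as
    text. No validation or decoding is attempted.

```cpp
#include <string>

struct AuthorityParts {
    std::string userInfo;
    std::string host;   // keeps the brackets of an IP literal
    std::string port;   // empty if absent or if only a stray ':' was given
};

AuthorityParts splitAuthority(const std::string &authority)
{
    AuthorityParts parts;
    std::string::size_type hostStart = 0;

    // Everything before '@' is the user info.
    const std::string::size_type at = authority.find('@');
    if (at != std::string::npos) {
        parts.userInfo = authority.substr(0, at);
        hostStart = at + 1;
    }

    std::string hostPort = authority.substr(hostStart);
    std::string::size_type colon = hostPort.rfind(':');
    if (colon != std::string::npos && hostPort[0] == '[') {
        // A ':' inside "[...]" belongs to the IP literal, not to the port.
        const std::string::size_type close = hostPort.find(']');
        if (close == std::string::npos || close > colon)
            colon = std::string::npos;
    }
    if (colon != std::string::npos) {
        parts.port = hostPort.substr(colon + 1);
        hostPort.erase(colon);
    }
    parts.host = hostPort;
    return parts;
}
```
*/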
2036 | |
/*!
    Returns the authority of the URL if it is defined; otherwise
    an empty string is returned.

    This function returns an unambiguous value, which may contain
    characters that are still percent-encoded, plus some control sequences
    not representable in decoded form in QString.

    The \a options argument controls how to format the user info component. The
    value of QUrl::FullyDecoded is not permitted in this function. If you need
    to obtain fully decoded data, call userName(), password(), host() and
    port() individually.

    \sa setAuthority(), userInfo(), userName(), password(), host(), port()
*/
QString QUrl::authority(ComponentFormattingOptions options) const
{
    QString result;
    if (!d)
        return result;

    if (options == QUrl::FullyDecoded) {
        qWarning("QUrl::authority(): QUrl::FullyDecoded is not permitted in this function");
        return result;
    }

    d->appendAuthority(result, options, QUrlPrivate::Authority);
    return result;
}
2066 | |
2067 | /*! |
2068 | Sets the user info of the URL to \a userInfo. The user info is an |
2069 | optional part of the authority of the URL, as described in |
2070 | setAuthority(). |
2071 | |
2072 | The user info consists of a user name and optionally a password, |
2073 | separated by a ':'. If the password is empty, the colon must be |
2074 | omitted. The following example shows a valid user info string: |
2075 | |
2076 | \image qurl-authority3.png |
2077 | |
2078 | The \a userInfo data is interpreted according to \a mode: in StrictMode, |
2079 | any '%' characters must be followed by exactly two hexadecimal characters |
2080 | and some characters (including space) are not allowed in undecoded form. In |
2081 | TolerantMode (the default), all characters are accepted in undecoded form |
2082 | and the tolerant parser will correct stray '%' not followed by two hex |
2083 | characters. |
2084 | |
2085 | This function does not allow \a mode to be QUrl::DecodedMode. To set fully |
2086 | decoded data, call setUserName() and setPassword() individually. |
2087 | |
2088 | \sa userInfo(), setUserName(), setPassword(), setAuthority() |
2089 | */ |
void QUrl::setUserInfo(const QString &userInfo, ParsingMode mode)
{
    detach();
    d->clearError();
    QString trimmed = userInfo.trimmed();
    if (mode == DecodedMode) {
        qWarning("QUrl::setUserInfo(): QUrl::DecodedMode is not permitted in this function");
        return;
    }

    d->setUserInfo(trimmed, 0, trimmed.size());
    if (userInfo.isNull()) {
        // QUrlPrivate::setUserInfo cleared almost everything
        // but it leaves the UserName bit set
        d->sectionIsPresent &= ~QUrlPrivate::UserInfo;
    } else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::UserInfo, userInfo)) {
        d->sectionIsPresent &= ~QUrlPrivate::UserInfo;
        d->userName.clear();
        d->password.clear();
    }
}
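
/*
    Illustrative sketch, not part of QUrl: the "user name : password" layout
    described for setUserInfo(), split with std::string. splitUserInfo() is
    an assumption made for this example. It splits at the first ':', which is
    consistent with validateComponent() treating anything after the first
    ':' as the password.

```cpp
#include <string>
#include <utility>

// Returns {userName, password}; the password is empty when no ':' is present.
std::pair<std::string, std::string> splitUserInfo(const std::string &userInfo)
{
    const std::string::size_type colon = userInfo.find(':');
    if (colon == std::string::npos)
        return {userInfo, std::string()};
    return {userInfo.substr(0, colon), userInfo.substr(colon + 1)};
}
```
*/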
2111 | |
/*!
    Returns the user info of the URL, or an empty string if the user
    info is undefined.

    This function returns an unambiguous value, which may contain
    characters that are still percent-encoded, plus some control sequences
    not representable in decoded form in QString.

    The \a options argument controls how to format the user info component. The
    value of QUrl::FullyDecoded is not permitted in this function. If you need
    to obtain fully decoded data, call userName() and password() individually.

    \sa setUserInfo(), userName(), password(), authority()
*/
QString QUrl::userInfo(ComponentFormattingOptions options) const
{
    QString result;
    if (!d)
        return result;

    if (options == QUrl::FullyDecoded) {
        qWarning("QUrl::userInfo(): QUrl::FullyDecoded is not permitted in this function");
        return result;
    }

    d->appendUserInfo(result, options, QUrlPrivate::UserInfo);
    return result;
}
2140 | |
/*!
    Sets the URL's user name to \a userName. The \a userName is part
    of the user info element in the authority of the URL, as described
    in setUserInfo().

    The \a userName data is interpreted according to \a mode: in StrictMode,
    any '%' characters must be followed by exactly two hexadecimal characters
    and some characters (including space) are not allowed in undecoded form. In
    TolerantMode (the default), all characters are accepted in undecoded form
    and the tolerant parser will correct stray '%' not followed by two hex
    characters. In DecodedMode, '%' characters stand for themselves and
    encoded characters are not possible.

    QUrl::DecodedMode should be used when setting the user name from a data
    source which is not a URL, such as a password dialog shown to the user or
    with a user name obtained by calling userName() with the QUrl::FullyDecoded
    formatting option.

    \sa userName(), setUserInfo()
*/
void QUrl::setUserName(const QString &userName, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = userName;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    d->setUserName(data, 0, data.size());
    if (userName.isNull())
        d->sectionIsPresent &= ~QUrlPrivate::UserName;
    else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::UserName, userName))
        d->userName.clear();
}
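
/*
    Illustrative sketch, not part of QUrl: one way to make '%' "stand for
    itself" before handing decoded input to a percent-aware parser is to
    escape each '%' as "%25" first. The body of parseDecodedComponent() is not
    shown in this section, so treating it as this kind of escaping step is an
    assumption; escapePercentForDecodedInput() is a hypothetical name.

```cpp
#include <string>

std::string escapePercentForDecodedInput(std::string data)
{
    std::string::size_type pos = 0;
    while ((pos = data.find('%', pos)) != std::string::npos) {
        data.replace(pos, 1, "%25");   // literal '%' becomes its encoded form
        pos += 3;                      // skip past the inserted "%25"
    }
    return data;
}
```

    After this step, input such as "a%20b" is carried as the five literal
    characters the caller supplied rather than as "a b".
*/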
2178 | |
2179 | /*! |
2180 | Returns the user name of the URL if it is defined; otherwise |
2181 | an empty string is returned. |
2182 | |
2183 | The \a options argument controls how to format the user name component. All |
2184 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2185 | percent-encoded sequences are decoded; otherwise, the returned value may |
2186 | contain some percent-encoded sequences for some control sequences not |
2187 | representable in decoded form in QString. |
2188 | |
2189 | Note that QUrl::FullyDecoded may cause data loss if those non-representable |
2190 | sequences are present. It is recommended to use that value when the result |
2191 | will be used in a non-URL context, such as setting in QAuthenticator or |
2192 | negotiating a login. |
2193 | |
2194 | \sa setUserName(), userInfo() |
2195 | */ |
QString QUrl::userName(ComponentFormattingOptions options) const
{
    QString result;
    if (d)
        d->appendUserName(result, options);
    return result;
}
2203 | |
/*!
    Sets the URL's password to \a password. The \a password is part of
    the user info element in the authority of the URL, as described in
    setUserInfo().

    The \a password data is interpreted according to \a mode: in StrictMode,
    any '%' characters must be followed by exactly two hexadecimal characters
    and some characters (including space) are not allowed in undecoded form. In
    TolerantMode, all characters are accepted in undecoded form and the
    tolerant parser will correct stray '%' not followed by two hex characters.
    In DecodedMode, '%' characters stand for themselves and encoded characters
    are not possible.

    QUrl::DecodedMode should be used when setting the password from a data
    source which is not a URL, such as a password dialog shown to the user or
    with a password obtained by calling password() with the QUrl::FullyDecoded
    formatting option.

    \sa password(), setUserInfo()
*/
void QUrl::setPassword(const QString &password, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = password;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    d->setPassword(data, 0, data.size());
    if (password.isNull())
        d->sectionIsPresent &= ~QUrlPrivate::Password;
    else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Password, password))
        d->password.clear();
}
2241 | |
/*!
    Returns the password of the URL if it is defined; otherwise
    an empty string is returned.

    The \a options argument controls how to format the password component. All
    values produce an unambiguous result. With QUrl::FullyDecoded, all
    percent-encoded sequences are decoded; otherwise, the returned value may
    contain some percent-encoded sequences for some control sequences not
    representable in decoded form in QString.

    Note that QUrl::FullyDecoded may cause data loss if those non-representable
    sequences are present. It is recommended to use that value when the result
    will be used in a non-URL context, such as setting in QAuthenticator or
    negotiating a login.

    \sa setPassword()
*/
QString QUrl::password(ComponentFormattingOptions options) const
{
    QString result;
    if (d)
        d->appendPassword(result, options);
    return result;
}
2266 | |
/*!
    Sets the host of the URL to \a host. The host is part of the
    authority.

    The \a host data is interpreted according to \a mode: in StrictMode,
    any '%' characters must be followed by exactly two hexadecimal characters
    and some characters (including space) are not allowed in undecoded form. In
    TolerantMode, all characters are accepted in undecoded form and the
    tolerant parser will correct stray '%' not followed by two hex characters.
    In DecodedMode, '%' characters stand for themselves and encoded characters
    are not possible.

    Note that, in all cases, the result of the parsing must be a valid hostname
    according to STD 3 rules, as modified by the Internationalized Resource
    Identifiers specification (RFC 3987). Invalid hostnames are not permitted
    and will cause isValid() to become false.

    \sa host(), setAuthority()
*/
void QUrl::setHost(const QString &host, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = host;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    if (d->setHost(data, 0, data.size(), mode)) {
        if (host.isNull())
            d->sectionIsPresent &= ~QUrlPrivate::Host;
    } else if (!data.startsWith(u'[')) {
        // setHost failed, it might be IPv6 or IPvFuture in need of bracketing
        Q_ASSERT(d->error);

        data.prepend(u'[');
        data.append(u']');
        if (!d->setHost(data, 0, data.size(), mode)) {
            // failed again
            if (data.contains(u':')) {
                // source data contains ':', so it's an IPv6 error
                d->error->code = QUrlPrivate::InvalidIPv6AddressError;
            }
        } else {
            // succeeded
            d->clearError();
        }
    }
}
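
/*
    Illustrative sketch, not part of QUrl: the "try as given, then retry with
    brackets" flow of setHost(), with a deliberately tiny validator in place
    of the real host parser. toyHostIsValid() and setHostSketch() are
    assumptions made for this example; the validator only checks character
    classes and does not implement STD 3, IDN or real IPv6 parsing.

```cpp
#include <cctype>
#include <string>

// Toy rule set: "[...]" may hold hex digits, ':' and '.', and must contain
// at least one ':'; unbracketed names may hold letters, digits, '-' and '.'.
bool toyHostIsValid(const std::string &host)
{
    if (host.empty())
        return true;
    const bool bracketed = host.size() >= 2 && host.front() == '[' && host.back() == ']';
    const std::string body = bracketed ? host.substr(1, host.size() - 2) : host;
    if (bracketed && body.find(':') == std::string::npos)
        return false;
    for (unsigned char c : body) {
        const bool ok = bracketed ? (std::isxdigit(c) || c == ':' || c == '.')
                                  : (std::isalnum(c) || c == '-' || c == '.');
        if (!ok)
            return false;
    }
    return true;
}

// Stores the accepted form in 'stored' and returns true, or returns false.
bool setHostSketch(std::string input, std::string &stored)
{
    if (toyHostIsValid(input)) {
        stored = input;
        return true;
    }
    if (!input.empty() && input.front() == '[')
        return false;                 // already bracketed, nothing to retry
    input = "[" + input + "]";        // might be an IP literal missing brackets
    if (!toyHostIsValid(input))
        return false;
    stored = input;
    return true;
}
```
*/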
2318 | |
2319 | /*! |
2320 | Returns the host of the URL if it is defined; otherwise |
2321 | an empty string is returned. |
2322 | |
2323 | The \a options argument controls how the hostname will be formatted. The |
2324 | QUrl::EncodeUnicode option will cause this function to return the hostname |
2325 | in the ASCII-Compatible Encoding (ACE) form, which is suitable for use in |
2326 | channels that are not 8-bit clean or that require the legacy hostname (such |
2327 | as DNS requests or in HTTP request headers). If that flag is not present, |
2328 | this function returns the International Domain Name (IDN) in Unicode form, |
2329 | according to the list of permissible top-level domains (see |
2330 | idnWhitelist()). |
2331 | |
2332 | All other flags are ignored. Host names cannot contain control or percent |
2333 | characters, so the returned value can be considered fully decoded. |
2334 | |
2335 | \sa setHost(), idnWhitelist(), setIdnWhitelist(), authority() |
2336 | */ |
2337 | QString QUrl::host(ComponentFormattingOptions options) const |
2338 | { |
2339 | QString result; |
2340 | if (d) { |
2341 | d->appendHost(result, options); |
2342 | if (result.startsWith(u'[')) |
2343 | result = result.mid(1, result.size() - 2); |
2344 | } |
2345 | return result; |
2346 | } |
2347 | |
2348 | /*! |
2349 | Sets the port of the URL to \a port. The port is part of the |
2350 | authority of the URL, as described in setAuthority(). |
2351 | |
2352 | \a port must be between 0 and 65535 inclusive. Setting the |
2353 | port to -1 indicates that the port is unspecified. |
2354 | */ |
2355 | void QUrl::setPort(int port) |
2356 | { |
2357 | detach(); |
2358 | d->clearError(); |
2359 | |
2360 | if (port < -1 || port > 65535) { |
2361 | d->setError(QUrlPrivate::InvalidPortError, QString::number(port), 0); |
2362 | port = -1; |
2363 | } |
2364 | |
2365 | d->port = port; |
2366 | if (port != -1) |
2367 | d->sectionIsPresent |= QUrlPrivate::Host; |
2368 | } |
2369 | |
2370 | /*! |
2371 | \since 4.1 |
2372 | |
2373 | Returns the port of the URL, or \a defaultPort if the port is |
2374 | unspecified. |
2375 | |
2376 | Example: |
2377 | |
2378 | \snippet code/src_corelib_io_qurl.cpp 3 |
2379 | */ |
2380 | int QUrl::port(int defaultPort) const |
2381 | { |
2382 | if (!d) return defaultPort; |
2383 | return d->port == -1 ? defaultPort : d->port; |
2384 | } |
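// A minimal standalone sketch of the port rules documented above, written with
// the C++ standard library only (no Qt). The names sketchNormalizePort() and
// sketchEffectivePort() are hypothetical and are not part of QUrl; the sketch
// assumes only what the code above shows: -1 marks an unspecified port and
// values outside [-1, 65535] are rejected.

```cpp
#include <cassert>

// Mirrors the range check in setPort(): anything outside [-1, 65535]
// collapses to -1 ("unspecified"). The real function also records an error.
int sketchNormalizePort(int port)
{
    return (port < -1 || port > 65535) ? -1 : port;
}

// Mirrors port(defaultPort): fall back to the default when unspecified.
int sketchEffectivePort(int storedPort, int defaultPort)
{
    return storedPort == -1 ? defaultPort : storedPort;
}
```

// For instance, an out-of-range request such as 70000 would be stored as -1,
// and a later lookup with a default of 80 would then yield 80.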
2385 | |
2386 | /*! |
2387 | Sets the path of the URL to \a path. The path is the part of the |
2388 | URL that comes after the authority but before the query string. |
2389 | |
2390 | \image qurl-ftppath.png |
2391 | |
2392 | For non-hierarchical schemes, the path will be everything |
2393 | following the scheme declaration, as in the following example: |
2394 | |
2395 | \image qurl-mailtopath.png |
2396 | |
2397 | The \a path data is interpreted according to \a mode: in StrictMode, |
2398 | any '%' characters must be followed by exactly two hexadecimal characters |
2399 | and some characters (including space) are not allowed in undecoded form. In |
2400 | TolerantMode, all characters are accepted in undecoded form and the |
2401 | tolerant parser will correct stray '%' not followed by two hex characters. |
2402 | In DecodedMode, each '%' stands for itself and percent-encoded sequences are not |
2403 | possible. |
2404 | |
2405 | QUrl::DecodedMode should be used when setting the path from a data source |
2406 | which is not a URL, such as a dialog shown to the user or with a path |
2407 | obtained by calling path() with the QUrl::FullyDecoded formatting option. |
2408 | |
2409 | \sa path() |
2410 | */ |
2411 | void QUrl::setPath(const QString &path, ParsingMode mode) |
2412 | { |
2413 | detach(); |
2414 | d->clearError(); |
2415 | |
2416 | QString data = path; |
2417 | if (mode == DecodedMode) { |
2418 | parseDecodedComponent(data); |
2419 | mode = TolerantMode; |
2420 | } |
2421 | |
2422 | d->setPath(data, 0, data.size()); |
2423 | |
2424 | // optimized out, since there is no path delimiter |
2425 | // if (path.isNull()) |
2426 | // d->sectionIsPresent &= ~QUrlPrivate::Path; |
2427 | // else |
2428 | if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Path, path)) |
2429 | d->path.clear(); |
2430 | } |
2431 | |
2432 | /*! |
2433 | Returns the path of the URL. |
2434 | |
2435 | \snippet code/src_corelib_io_qurl.cpp 12 |
2436 | |
2437 | The \a options argument controls how to format the path component. All |
2438 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2439 | percent-encoded sequences are decoded; otherwise, the returned value may |
2440 | contain some percent-encoded sequences for some control sequences not |
2441 | representable in decoded form in QString. |
2442 | |
2443 | Note that QUrl::FullyDecoded may cause data loss if those non-representable |
2444 | sequences are present. It is recommended to use that value when the result |
2445 | will be used in a non-URL context, such as sending to an FTP server. |
2446 | |
2447 | An example of data loss is when you have non-Unicode percent-encoded sequences |
2448 | and use FullyDecoded (the default): |
2449 | |
2450 | \snippet code/src_corelib_io_qurl.cpp 13 |
2451 | |
2452 | In this example, there will be some level of data loss because the \c %FF cannot |
2453 | be converted. |
2454 | |
2455 | Data loss can also occur when the path contains sub-delimiters (such as \c +): |
2456 | |
2457 | \snippet code/src_corelib_io_qurl.cpp 14 |
2458 | |
2459 | Other decoding examples: |
2460 | |
2461 | \snippet code/src_corelib_io_qurl.cpp 15 |
2462 | |
2463 | \sa setPath() |
2464 | */ |
2465 | QString QUrl::path(ComponentFormattingOptions options) const |
2466 | { |
2467 | QString result; |
2468 | if (d) |
2469 | d->appendPath(result, options, QUrlPrivate::Path); |
2470 | return result; |
2471 | } |
2472 | |
2473 | /*! |
2474 | \since 5.2 |
2475 | |
2476 | Returns the name of the file, excluding the directory path. |
2477 | |
2478 | Note that, if this QUrl object is given a path ending in a slash, the name of the file is considered empty. |
2479 | |
2480 | If the path doesn't contain any slash, it is fully returned as the fileName. |
2481 | |
2482 | Example: |
2483 | |
2484 | \snippet code/src_corelib_io_qurl.cpp 7 |
2485 | |
2486 | The \a options argument controls how to format the file name component. All |
2487 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2488 | percent-encoded sequences are decoded; otherwise, the returned value may |
2489 | contain some percent-encoded sequences for some control sequences not |
2490 | representable in decoded form in QString. |
2491 | |
2492 | \sa path() |
2493 | */ |
2494 | QString QUrl::fileName(ComponentFormattingOptions options) const |
2495 | { |
2496 | const QString ourPath = path(options); |
2497 | const qsizetype slash = ourPath.lastIndexOf(u'/'); |
2498 | if (slash == -1) |
2499 | return ourPath; |
2500 | return ourPath.mid(slash + 1); |
2501 | } |
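// The file-name rule above (everything after the last slash, or the whole path
// when there is no slash) can be sketched without Qt as follows. The function
// name sketchFileName() is hypothetical, and std::string stands in for QString
// purely for illustration.

```cpp
#include <cassert>
#include <string>

std::string sketchFileName(const std::string &path)
{
    const std::string::size_type slash = path.rfind('/');
    if (slash == std::string::npos)
        return path;                 // no slash: the whole path is the name
    return path.substr(slash + 1);   // empty when the path ends in '/'
}
```

// A path ending in a slash therefore yields an empty name, which matches the
// note in the documentation above.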
2502 | |
2503 | /*! |
2504 | \since 4.2 |
2505 | |
2506 | Returns \c true if this URL contains a query (that is, if it has a '?' delimiter, even when the query itself is empty). |
2507 | |
2508 | \sa setQuery(), query(), hasFragment() |
2509 | */ |
2510 | bool QUrl::hasQuery() const |
2511 | { |
2512 | if (!d) return false; |
2513 | return d->hasQuery(); |
2514 | } |
2515 | |
2516 | /*! |
2517 | Sets the query string of the URL to \a query. |
2518 | |
2519 | This function is useful if you need to pass a query string that |
2520 | does not fit into the key-value pattern, or that uses a different |
2521 | scheme for encoding special characters than what is suggested by |
2522 | QUrl. |
2523 | |
2524 | Passing a value of QString() to \a query (a null QString) unsets |
2525 | the query completely. However, passing a value of QString("") |
2526 | will set the query to an empty value, as if the original URL |
2527 | had a lone "?". |
2528 | |
2529 | The \a query data is interpreted according to \a mode: in StrictMode, |
2530 | any '%' characters must be followed by exactly two hexadecimal characters |
2531 | and some characters (including space) are not allowed in undecoded form. In |
2532 | TolerantMode, all characters are accepted in undecoded form and the |
2533 | tolerant parser will correct stray '%' not followed by two hex characters. |
2534 | In DecodedMode, each '%' stands for itself and percent-encoded sequences are not |
2535 | possible. |
2536 | |
2537 | Query strings often contain percent-encoded sequences, so use of |
2538 | DecodedMode is discouraged. One special sequence to be aware of is that of |
2539 | the plus character ('+'). QUrl does not convert spaces to plus characters, |
2540 | even though HTML forms posted by web browsers do. In order to represent an |
2541 | actual plus character in a query, the sequence "%2B" is usually used. This |
2542 | function will leave "%2B" sequences untouched in TolerantMode or |
2543 | StrictMode. |
2544 | |
2545 | \sa query(), hasQuery() |
2546 | */ |
2547 | void QUrl::setQuery(const QString &query, ParsingMode mode) |
2548 | { |
2549 | detach(); |
2550 | d->clearError(); |
2551 | |
2552 | QString data = query; |
2553 | if (mode == DecodedMode) { |
2554 | parseDecodedComponent(data); |
2555 | mode = TolerantMode; |
2556 | } |
2557 | |
2558 | d->setQuery(data, 0, data.size()); |
2559 | if (query.isNull()) |
2560 | d->sectionIsPresent &= ~QUrlPrivate::Query; |
2561 | else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Query, query)) |
2562 | d->query.clear(); |
2563 | } |
2564 | |
2565 | /*! |
2566 | \overload |
2567 | \since 5.0 |
2568 | Sets the query string of the URL to \a query. |
2569 | |
2570 | This function reconstructs the query string from the QUrlQuery object and |
2571 | sets it on this QUrl object. This function does not have parsing parameters |
2572 | because the QUrlQuery contains data that is already parsed. |
2573 | |
2574 | \sa query(), hasQuery() |
2575 | */ |
2576 | void QUrl::setQuery(const QUrlQuery &query) |
2577 | { |
2578 | detach(); |
2579 | d->clearError(); |
2580 | |
2581 | // we know the data is in the right format |
2582 | d->query = query.toString(); |
2583 | if (query.isEmpty()) |
2584 | d->sectionIsPresent &= ~QUrlPrivate::Query; |
2585 | else |
2586 | d->sectionIsPresent |= QUrlPrivate::Query; |
2587 | } |
2588 | |
2589 | /*! |
2590 | Returns the query string of the URL if there's a query string, or an empty |
2591 | result if not. To determine if the parsed URL contained a query string, use |
2592 | hasQuery(). |
2593 | |
2594 | The \a options argument controls how to format the query component. All |
2595 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2596 | percent-encoded sequences are decoded; otherwise, the returned value may |
2597 | contain some percent-encoded sequences for some control sequences not |
2598 | representable in decoded form in QString. |
2599 | |
2600 | Note that use of QUrl::FullyDecoded in queries is discouraged, as queries |
2601 | often contain data that is supposed to remain percent-encoded, including |
2602 | the use of the "%2B" sequence to represent a plus character ('+'). |
2603 | |
2604 | \sa setQuery(), hasQuery() |
2605 | */ |
2606 | QString QUrl::query(ComponentFormattingOptions options) const |
2607 | { |
2608 | QString result; |
2609 | if (d) { |
2610 | d->appendQuery(result, options, QUrlPrivate::Query); |
2611 | if (d->hasQuery() && result.isNull()) |
2612 | result.detach(); |
2613 | } |
2614 | return result; |
2615 | } |
2616 | |
2617 | /*! |
2618 | Sets the fragment of the URL to \a fragment. The fragment is the |
2619 | last part of the URL, represented by a '#' followed by a string of |
2620 | characters. It is typically used in HTTP for referring to a |
2621 | certain link or point on a page: |
2622 | |
2623 | \image qurl-fragment.png |
2624 | |
2625 | The fragment is sometimes also referred to as the URL "reference". |
2626 | |
2627 | Passing an argument of QString() (a null QString) will unset the fragment. |
2628 | Passing an argument of QString("") (an empty but not null QString) will set the |
2629 | fragment to an empty string (as if the original URL had a lone "#"). |
2630 | |
2631 | The \a fragment data is interpreted according to \a mode: in StrictMode, |
2632 | any '%' characters must be followed by exactly two hexadecimal characters |
2633 | and some characters (including space) are not allowed in undecoded form. In |
2634 | TolerantMode, all characters are accepted in undecoded form and the |
2635 | tolerant parser will correct stray '%' not followed by two hex characters. |
2636 | In DecodedMode, each '%' stands for itself and percent-encoded sequences are not |
2637 | possible. |
2638 | |
2639 | QUrl::DecodedMode should be used when setting the fragment from a data |
2640 | source which is not a URL or with a fragment obtained by calling |
2641 | fragment() with the QUrl::FullyDecoded formatting option. |
2642 | |
2643 | \sa fragment(), hasFragment() |
2644 | */ |
2645 | void QUrl::setFragment(const QString &fragment, ParsingMode mode) |
2646 | { |
2647 | detach(); |
2648 | d->clearError(); |
2649 | |
2650 | QString data = fragment; |
2651 | if (mode == DecodedMode) { |
2652 | parseDecodedComponent(data); |
2653 | mode = TolerantMode; |
2654 | } |
2655 | |
2656 | d->setFragment(data, 0, data.size()); |
2657 | if (fragment.isNull()) |
2658 | d->sectionIsPresent &= ~QUrlPrivate::Fragment; |
2659 | else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Fragment, fragment)) |
2660 | d->fragment.clear(); |
2661 | } |
2662 | |
2663 | /*! |
2664 | Returns the fragment of the URL. To determine if the parsed URL contained a |
2665 | fragment, use hasFragment(). |
2666 | |
2667 | The \a options argument controls how to format the fragment component. All |
2668 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2669 | percent-encoded sequences are decoded; otherwise, the returned value may |
2670 | contain some percent-encoded sequences for some control sequences not |
2671 | representable in decoded form in QString. |
2672 | |
2673 | Note that QUrl::FullyDecoded may cause data loss if those non-representable |
2674 | sequences are present. It is recommended to use that value when the result |
2675 | will be used in a non-URL context. |
2676 | |
2677 | \sa setFragment(), hasFragment() |
2678 | */ |
2679 | QString QUrl::fragment(ComponentFormattingOptions options) const |
2680 | { |
2681 | QString result; |
2682 | if (d) { |
2683 | d->appendFragment(result, options, QUrlPrivate::Fragment); |
2684 | if (d->hasFragment() && result.isNull()) |
2685 | result.detach(); |
2686 | } |
2687 | return result; |
2688 | } |
2689 | |
2690 | /*! |
2691 | \since 4.2 |
2692 | |
2693 | Returns \c true if this URL contains a fragment (that is, if it has a '#' delimiter, even when the fragment itself is empty). |
2694 | |
2695 | \sa fragment(), setFragment() |
2696 | */ |
2697 | bool QUrl::hasFragment() const |
2698 | { |
2699 | if (!d) return false; |
2700 | return d->hasFragment(); |
2701 | } |
2702 | |
2703 | /*! |
2704 | Returns the result of the merge of this URL with \a relative. This |
2705 | URL is used as a base to convert \a relative to an absolute URL. |
2706 | |
2707 | If \a relative is not a relative URL, this function will return \a |
2708 | relative directly. Otherwise, the paths of the two URLs are |
2709 | merged, and the new URL returned has the scheme and authority of |
2710 | the base URL, but with the merged path, as in the following |
2711 | example: |
2712 | |
2713 | \snippet code/src_corelib_io_qurl.cpp 5 |
2714 | |
2715 | Calling resolved() with ".." returns a QUrl whose directory is |
2716 | one level higher than the original. Similarly, calling resolved() |
2717 | with "../.." removes two levels from the path. If \a relative is |
2718 | "/", the path becomes "/". |
2719 | |
2720 | \sa isRelative() |
2721 | */ |
2722 | QUrl QUrl::resolved(const QUrl &relative) const |
2723 | { |
2724 | if (!d) return relative; |
2725 | if (!relative.d) return *this; |
2726 | |
2727 | QUrl t; |
2728 | if (!relative.d->scheme.isEmpty()) { |
2729 | t = relative; |
2730 | t.detach(); |
2731 | } else { |
2732 | if (relative.d->hasAuthority()) { |
2733 | t = relative; |
2734 | t.detach(); |
2735 | } else { |
2736 | t.d = new QUrlPrivate; |
2737 | |
2738 | // copy the authority |
2739 | t.d->userName = d->userName; |
2740 | t.d->password = d->password; |
2741 | t.d->host = d->host; |
2742 | t.d->port = d->port; |
2743 | t.d->sectionIsPresent = d->sectionIsPresent & QUrlPrivate::Authority; |
2744 | |
2745 | if (relative.d->path.isEmpty()) { |
2746 | t.d->path = d->path; |
2747 | if (relative.d->hasQuery()) { |
2748 | t.d->query = relative.d->query; |
2749 | t.d->sectionIsPresent |= QUrlPrivate::Query; |
2750 | } else if (d->hasQuery()) { |
2751 | t.d->query = d->query; |
2752 | t.d->sectionIsPresent |= QUrlPrivate::Query; |
2753 | } |
2754 | } else { |
2755 | t.d->path = relative.d->path.startsWith(u'/') |
2756 | ? relative.d->path |
2757 | : d->mergePaths(relative.d->path); |
2758 | if (relative.d->hasQuery()) { |
2759 | t.d->query = relative.d->query; |
2760 | t.d->sectionIsPresent |= QUrlPrivate::Query; |
2761 | } |
2762 | } |
2763 | } |
2764 | t.d->scheme = d->scheme; |
2765 | if (d->hasScheme()) |
2766 | t.d->sectionIsPresent |= QUrlPrivate::Scheme; |
2767 | else |
2768 | t.d->sectionIsPresent &= ~QUrlPrivate::Scheme; |
2769 | t.d->flags |= d->flags & QUrlPrivate::IsLocalFile; |
2770 | } |
2771 | t.d->fragment = relative.d->fragment; |
2772 | if (relative.d->hasFragment()) |
2773 | t.d->sectionIsPresent |= QUrlPrivate::Fragment; |
2774 | else |
2775 | t.d->sectionIsPresent &= ~QUrlPrivate::Fragment; |
2776 | |
2777 | removeDotsFromPath(&t.d->path); |
2778 | |
2779 | #if defined(QURL_DEBUG) |
2780 | qDebug("QUrl(\"%ls\").resolved(\"%ls\") = \"%ls\"", |
2781 | qUtf16Printable(url()), |
2782 | qUtf16Printable(relative.url()), |
2783 | qUtf16Printable(t.url())); |
2784 | #endif |
2785 | return t; |
2786 | } |
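// The path handling in resolved() (merge the relative path onto the base, then
// remove "." and ".." segments) can be illustrated with the reference
// algorithm from RFC 3986, sections 5.2.3 and 5.2.4. This is a standalone
// sketch using std::string, not Qt's implementation: sketchRemoveDotSegments()
// and sketchResolvePath() are hypothetical names, and the private helpers
// mergePaths() and removeDotsFromPath() used above may differ in edge cases
// (for example, a base with an authority and an empty path is not handled here).

```cpp
#include <cassert>
#include <string>

// Removes "." and ".." segments, following RFC 3986 section 5.2.4.
std::string sketchRemoveDotSegments(std::string in)
{
    std::string out;
    auto startsWith = [&in](const char *prefix) { return in.rfind(prefix, 0) == 0; };
    while (!in.empty()) {
        if (startsWith("../")) {
            in.erase(0, 3);
        } else if (startsWith("./")) {
            in.erase(0, 2);
        } else if (startsWith("/./")) {
            in.erase(0, 2);                                   // "/./x" -> "/x"
        } else if (in == "/.") {
            in = "/";
        } else if (startsWith("/../") || in == "/..") {
            in = (in == "/..") ? std::string("/") : in.substr(3);  // "/../x" -> "/x"
            const std::string::size_type last = out.rfind('/');
            out.erase(last == std::string::npos ? 0 : last);  // drop last output segment
        } else if (in == "." || in == "..") {
            in.clear();
        } else {
            const std::string::size_type next = in.find('/', 1);  // end of first segment
            out += in.substr(0, next);
            in.erase(0, next);
        }
    }
    return out;
}

// Resolves relPath against basePath (paths only; no scheme, query or fragment).
std::string sketchResolvePath(const std::string &basePath, const std::string &relPath)
{
    if (relPath.empty())
        return sketchRemoveDotSegments(basePath);   // keep the base path
    if (relPath[0] == '/')
        return sketchRemoveDotSegments(relPath);    // absolute path replaces the base
    // Keep the base up to and including its last '/', then append the reference.
    const std::string merged = basePath.substr(0, basePath.rfind('/') + 1) + relPath;
    return sketchRemoveDotSegments(merged);
}
```

// With a base path of "/b/c/d", the relative path ".." gives "/b/" and "../.."
// gives "/", which is the behavior described for resolved() above.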
2787 | |
2788 | /*! |
2789 | Returns \c true if the URL is relative; otherwise returns \c false. A URL is |
2790 | a relative reference if its scheme is undefined; this function is therefore |
2791 | equivalent to calling scheme().isEmpty(). |
2792 | |
2793 | Relative references are defined in RFC 3986 section 4.2. |
2794 | |
2795 | \sa {Relative URLs vs Relative Paths} |
2796 | */ |
2797 | bool QUrl::isRelative() const |
2798 | { |
2799 | if (!d) return true; |
2800 | return !d->hasScheme(); |
2801 | } |
2802 | |
2803 | /*! |
2804 | Returns a string representation of the URL. The output can be customized by |
2805 | passing flags with \a options. The option QUrl::FullyDecoded is not |
2806 | permitted in this function since it would generate ambiguous data. |
2807 | |
2808 | The resulting QString can be passed back to a QUrl later on. |
2809 | |
2810 | Synonym for toString(options). |
2811 | |
2812 | \sa FormattingOptions, toEncoded(), toString() |
2813 | */ |
2814 | QString QUrl::url(FormattingOptions options) const |
2815 | { |
2816 | return toString(options); |
2817 | } |
2818 | |
2819 | /*! |
2820 | Returns a string representation of the URL. The output can be customized by |
2821 | passing flags with \a options. The option QUrl::FullyDecoded is not |
2822 | permitted in this function since it would generate ambiguous data. |
2823 | |
2824 | The default formatting option is \l{QUrl::FormattingOptions}{PrettyDecoded}. |
2825 | |
2826 | \sa FormattingOptions, url(), setUrl() |
2827 | */ |
2828 | QString QUrl::toString(FormattingOptions options) const |
2829 | { |
2830 | QString url; |
2831 | if (!isValid()) { |
2832 | // also catches isEmpty() |
2833 | return url; |
2834 | } |
2835 | if ((options & QUrl::FullyDecoded) == QUrl::FullyDecoded) { |
2836 | qWarning("QUrl: QUrl::FullyDecoded is not permitted when reconstructing the full URL"); |
2837 | options &= ~QUrl::FullyDecoded; |
2838 | //options |= QUrl::PrettyDecoded; // no-op, value is 0 |
2839 | } |
2840 | |
2841 | // return just the path if: |
2842 | // - QUrl::PreferLocalFile is passed |
2843 | // - QUrl::RemovePath isn't passed (rather stupid if the user did...) |
2844 | // - there's no query or fragment to return |
2845 | // that is, either they aren't present, or we're removing them |
2846 | // - it's a local file |
2847 | if (options.testFlag(QUrl::PreferLocalFile) && !options.testFlag(QUrl::RemovePath) |
2848 | && (!d->hasQuery() || options.testFlag(QUrl::RemoveQuery)) |
2849 | && (!d->hasFragment() || options.testFlag(QUrl::RemoveFragment)) |
2850 | && isLocalFile()) { |
2851 | url = d->toLocalFile(options | QUrl::FullyDecoded); |
2852 | return url; |
2853 | } |
2854 | |
2855 | // for the full URL, we consider that the reserved characters are prettier if encoded |
2856 | if (options & DecodeReserved) |
2857 | options &= ~EncodeReserved; |
2858 | else |
2859 | options |= EncodeReserved; |
2860 | |
2861 | if (!(options & QUrl::RemoveScheme) && d->hasScheme()) |
2862 | url += d->scheme + u':'; |
2863 | |
2864 | bool pathIsAbsolute = d->path.startsWith(u'/'); |
2865 | if (!((options & QUrl::RemoveAuthority) == QUrl::RemoveAuthority) && d->hasAuthority()) { |
2866 | url += "//"_L1; |
2867 | d->appendAuthority(url, options, QUrlPrivate::FullUrl); |
2868 | } else if (isLocalFile() && pathIsAbsolute) { |
2869 | // Comply with the XDG file URI spec, which requires triple slashes. |
2870 | url += "//"_L1; |
2871 | } |
2872 | |
2873 | if (!(options & QUrl::RemovePath)) |
2874 | d->appendPath(url, options, QUrlPrivate::FullUrl); |
2875 | |
2876 | if (!(options & QUrl::RemoveQuery) && d->hasQuery()) { |
2877 | url += u'?'; |
2878 | d->appendQuery(url, options, QUrlPrivate::FullUrl); |
2879 | } |
2880 | if (!(options & QUrl::RemoveFragment) && d->hasFragment()) { |
2881 | url += u'#'; |
2882 | d->appendFragment(url, options, QUrlPrivate::FullUrl); |
2883 | } |
2884 | |
2885 | return url; |
2886 | } |
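// The order in which toString() assembles the components can be sketched in
// plain C++ as below. This is a simplified illustration, not the Qt code: the
// struct SketchUrlParts and the function sketchRecompose() are hypothetical,
// the "has" flags stand in for QUrlPrivate::sectionIsPresent, and formatting
// options, encoding, and the local-file special cases are deliberately left out.

```cpp
#include <cassert>
#include <string>

struct SketchUrlParts {
    std::string scheme, authority, path, query, fragment;
    bool hasScheme = false;
    bool hasAuthority = false;
    bool hasQuery = false;      // true even for an empty query (lone "?")
    bool hasFragment = false;   // true even for an empty fragment (lone "#")
};

std::string sketchRecompose(const SketchUrlParts &p)
{
    std::string url;
    if (p.hasScheme)
        url += p.scheme + ':';
    if (p.hasAuthority)
        url += "//" + p.authority;
    url += p.path;
    if (p.hasQuery)
        url += '?' + p.query;    // delimiter is emitted whenever a query is present
    if (p.hasFragment)
        url += '#' + p.fragment;
    return url;
}
```

// Keeping a presence flag separate from the string value is what lets an empty
// but present query round-trip as a trailing "?", the distinction that
// setQuery() draws between a null and an empty QString.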
2887 | |
2888 | /*! |
2889 | \since 5.0 |
2890 | |
2891 | Returns a human-displayable string representation of the URL. |
2892 | The output can be customized by passing flags with \a options. |
2893 | The option RemovePassword is always enabled, since passwords |
2894 | should never be shown back to users. |
2895 | |
2896 | With the default options, the resulting QString can be passed back |
2897 | to a QUrl later on, but any password that was present initially will |
2898 | be lost. |
2899 | |
2900 | \sa FormattingOptions, toEncoded(), toString() |
2901 | */ |
2902 | |
2903 | QString QUrl::toDisplayString(FormattingOptions options) const |
2904 | { |
2905 | return toString(options | RemovePassword); |
2906 | } |
2907 | |
2908 | /*! |
2909 | \since 5.2 |
2910 | |
2911 | Returns an adjusted version of the URL. |
2912 | The output can be customized by passing flags with \a options. |
2913 | |
2914 | The encoding options from QUrl::ComponentFormattingOption don't make |
2915 | much sense for this method, nor does QUrl::PreferLocalFile. |
2916 | |
2917 | This is always equivalent to QUrl(url.toString(options)). |
2918 | |
2919 | \sa FormattingOptions, toEncoded(), toString() |
2920 | */ |
2921 | QUrl QUrl::adjusted(QUrl::FormattingOptions options) const |
2922 | { |
2923 | if (!isValid()) { |
2924 | // also catches isEmpty() |
2925 | return QUrl(); |
2926 | } |
2927 | QUrl that = *this; |
2928 | if (options & RemoveScheme) |
2929 | that.setScheme(QString()); |
2930 | if ((options & RemoveAuthority) == RemoveAuthority) { |
2931 | that.setAuthority(QString()); |
2932 | } else { |
2933 | if ((options & RemoveUserInfo) == RemoveUserInfo) |
2934 | that.setUserInfo(QString()); |
2935 | else if (options & RemovePassword) |
2936 | that.setPassword(QString()); |
2937 | if (options & RemovePort) |
2938 | that.setPort(-1); |
2939 | } |
2940 | if (options & RemoveQuery) |
2941 | that.setQuery(QString()); |
2942 | if (options & RemoveFragment) |
2943 | that.setFragment(QString()); |
2944 | if (options & RemovePath) { |
2945 | that.setPath(QString()); |
2946 | } else if (options & (StripTrailingSlash | RemoveFilename | NormalizePathSegments)) { |
2947 | that.detach(); |
2948 | QString path; |
2949 | d->appendPath(path, options | FullyEncoded, QUrlPrivate::Path); |
2950 | that.d->setPath(path, 0, path.size()); |
2951 | } |
2952 | return that; |
2953 | } |
2954 | |
2955 | /*! |
2956 | Returns the encoded representation of the URL if it's valid; |
2957 | otherwise an empty QByteArray is returned. The output can be |
2958 | customized by passing flags with \a options. |
2959 | |
2960 | The user info, path and fragment are all converted to UTF-8, and |
2961 | all non-ASCII characters are then percent encoded. The host name |
2962 | is encoded using Punycode. |
2963 | */ |
2964 | QByteArray QUrl::toEncoded(FormattingOptions options) const |
2965 | { |
2966 | options &= ~(FullyDecoded | FullyEncoded); |
2967 | return toString(options | FullyEncoded).toLatin1(); |
2968 | } |
2969 | |
2970 | /*! |
2971 | \fn QUrl QUrl::fromEncoded(const QByteArray &input, ParsingMode parsingMode) |
2972 | |
2973 | Parses \a input and returns the corresponding QUrl. \a input is |
2974 | assumed to be in encoded form, containing only ASCII characters. |
2975 | |
2976 | Parses the URL using \a parsingMode. See setUrl() for more information on |
2977 | this parameter. QUrl::DecodedMode is not permitted in this context. |
2978 | |
2979 | \sa toEncoded(), setUrl() |
2980 | */ |
2981 | QUrl QUrl::fromEncoded(const QByteArray &input, ParsingMode mode) |
2982 | { |
2983 | return QUrl(QString::fromUtf8(input.constData(), input.size()), mode); |
2984 | } |
2985 | |
2986 | /*! |
2987 | Returns a decoded copy of \a input. \a input is first decoded from |
2988 | percent encoding, then converted from UTF-8 to Unicode. |
2989 | |
2990 | \note Given invalid input (such as a string containing the sequence "%G5", |
2991 | which is not a valid hexadecimal number) the output will be invalid as |
2992 | well. As an example: the sequence "%G5" could be decoded to 'W'. |
2993 | */ |
2994 | QString QUrl::fromPercentEncoding(const QByteArray &input) |
2995 | { |
2996 | QByteArray ba = QByteArray::fromPercentEncoding(input); |
2997 | return QString::fromUtf8(ba, ba.size()); |
2998 | } |
2999 | |
3000 | /*! |
3001 | Returns an encoded copy of \a input. \a input is first converted |
3002 | to UTF-8, and all bytes that are not in the unreserved group, including |
3003 | every non-ASCII byte, are percent encoded. To prevent characters from being |
3004 | percent encoded, pass them to \a exclude. To force characters to be percent |
3005 | encoded, pass them to \a include. |
3006 | |
3007 | Unreserved is defined as: |
3008 | \tt {ALPHA / DIGIT / "-" / "." / "_" / "~"} |
3009 | |
3010 | \snippet code/src_corelib_io_qurl.cpp 6 |
3011 | */ |
3012 | QByteArray QUrl::toPercentEncoding(const QString &input, const QByteArray &exclude, const QByteArray &include) |
3013 | { |
3014 | return input.toUtf8().toPercentEncoding(exclude, include); |
3015 | } |
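// The encoding rule described above can be sketched in standard C++ as
// follows. This is an illustration under stated assumptions, not the
// QByteArray implementation: sketchPercentEncode() is a hypothetical name, the
// input is assumed to be UTF-8 already, and hexadecimal digits are assumed to
// be emitted in upper case.

```cpp
#include <cassert>
#include <string>

std::string sketchPercentEncode(const std::string &utf8,
                                const std::string &exclude = "",
                                const std::string &include = "")
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : utf8) {
        // Unreserved set: ALPHA / DIGIT / "-" / "." / "_" / "~"
        const bool unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
        const char ch = static_cast<char>(c);
        const bool keep = (unreserved || exclude.find(ch) != std::string::npos)
                && include.find(ch) == std::string::npos;
        if (keep) {
            out += ch;
        } else {
            out += '%';
            out += hex[c >> 4];      // high nibble
            out += hex[c & 0x0F];    // low nibble
        }
    }
    return out;
}
```

// In this sketch the include list takes precedence, so a character listed in
// both exclude and include is still percent encoded.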
3016 | |
3017 | /*! |
3018 | \since 6.3 |
3019 | |
3020 | Returns the Unicode form of the given domain name |
3021 | \a domain, which is encoded in the ASCII Compatible Encoding (ACE). |
3022 | The output can be customized by passing flags with \a options. |
3023 | The result of this function is considered equivalent to \a domain. |
3024 | |
3025 | If the value in \a domain cannot be encoded, it will be converted |
3026 | to QString and returned. |
3027 | |
3028 | The ASCII-Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491 |
3029 | and RFC 3492 and updated by the Unicode Technical Standard #46. It is part |
3030 | of the Internationalizing Domain Names in Applications (IDNA) specification, |
3031 | which allows for domain names (like \c "example.com") to be written using |
3032 | non-US-ASCII characters. |
3033 | */ |
3034 | QString QUrl::fromAce(const QByteArray &domain, QUrl::AceProcessingOptions options) |
3035 | { |
3036 | return qt_ACE_do(QString::fromLatin1(domain), NormalizeAce, |
3037 | ForbidLeadingDot /*FIXME: make configurable*/, options); |
3038 | } |
3039 | |
3040 | /*! |
3041 | \since 6.3 |
3042 | |
3043 | Returns the ASCII Compatible Encoding of the given domain name \a domain. |
3044 | The output can be customized by passing flags with \a options. |
3045 | The result of this function is considered equivalent to \a domain. |
3046 | |
3047 | The ASCII-Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491 |
3048 | and RFC 3492 and updated by the Unicode Technical Standard #46. It is part |
3049 | of the Internationalizing Domain Names in Applications (IDNA) specification, |
3050 | which allows for domain names (like \c "example.com") to be written using |
3051 | non-US-ASCII characters. |
3052 | |
3053 | This function returns an empty QByteArray if \a domain is not a valid |
3054 | hostname. Note, in particular, that IPv6 literals are not valid domain |
3055 | names. |
3056 | */ |
3057 | QByteArray QUrl::toAce(const QString &domain, AceProcessingOptions options) |
3058 | { |
3059 | return qt_ACE_do(domain, ToAceOnly, ForbidLeadingDot /*FIXME: make configurable*/, options) |
3060 | .toLatin1(); |
3061 | } |
3062 | |
3063 | /*! |
3064 | \internal |
3065 | |
3066 | Returns \c true if this URL is "less than" the given \a url. This |
3067 | provides a means of ordering URLs. |
3068 | */ |
3069 | bool QUrl::operator <(const QUrl &url) const |
3070 | { |
3071 | if (!d || !url.d) { |
3072 | bool thisIsEmpty = !d || d->isEmpty(); |
3073 | bool thatIsEmpty = !url.d || url.d->isEmpty(); |
3074 | |
3075 | // sort an empty URL first |
3076 | return thisIsEmpty && !thatIsEmpty; |
3077 | } |
3078 | |
3079 | int cmp; |
3080 | cmp = d->scheme.compare(url.d->scheme); |
3081 | if (cmp != 0) |
3082 | return cmp < 0; |
3083 | |
3084 | cmp = d->userName.compare(url.d->userName); |
3085 | if (cmp != 0) |
3086 | return cmp < 0; |
3087 | |
3088 | cmp = d->password.compare(url.d->password); |
3089 | if (cmp != 0) |
3090 | return cmp < 0; |
3091 | |
3092 | cmp = d->host.compare(url.d->host); |
3093 | if (cmp != 0) |
3094 | return cmp < 0; |
3095 | |
3096 | if (d->port != url.d->port) |
3097 | return d->port < url.d->port; |
3098 | |
3099 | cmp = d->path.compare(url.d->path); |
3100 | if (cmp != 0) |
3101 | return cmp < 0; |
3102 | |
3103 | if (d->hasQuery() != url.d->hasQuery()) |
3104 | return url.d->hasQuery(); |
3105 | |
3106 | cmp = d->query.compare(url.d->query); |
3107 | if (cmp != 0) |
3108 | return cmp < 0; |
3109 | |
3110 | if (d->hasFragment() != url.d->hasFragment()) |
3111 | return url.d->hasFragment(); |
3112 | |
3113 | cmp = d->fragment.compare(url.d->fragment); |
3114 | return cmp < 0; |
3115 | } |
3116 | |
3117 | /*! |
3118 | Returns \c true if this URL and the given \a url are equal; |
3119 | otherwise returns \c false. |
3120 | |
3121 | \sa matches() |
3122 | */ |
3123 | bool QUrl::operator ==(const QUrl &url) const |
3124 | { |
3125 | if (!d && !url.d) |
3126 | return true; |
3127 | if (!d) |
3128 | return url.d->isEmpty(); |
3129 | if (!url.d) |
3130 | return d->isEmpty(); |
3131 | |
3132 | // First, compare which sections are present, since it speeds up the |
3133 | // processing considerably. We just have to ignore the host-is-present flag |
3134 | // for local files (the "file" protocol), due to the requirements of the |
3135 | // XDG file URI specification. |
3136 | int mask = QUrlPrivate::FullUrl; |
3137 | if (isLocalFile()) |
3138 | mask &= ~QUrlPrivate::Host; |
3139 | return (d->sectionIsPresent & mask) == (url.d->sectionIsPresent & mask) && |
3140 | d->scheme == url.d->scheme && |
3141 | d->userName == url.d->userName && |
3142 | d->password == url.d->password && |
3143 | d->host == url.d->host && |
3144 | d->port == url.d->port && |
3145 | d->path == url.d->path && |
3146 | d->query == url.d->query && |
3147 | d->fragment == url.d->fragment; |
3148 | } |
3149 | |
3150 | /*! |
3151 | \since 5.2 |
3152 | |
3153 | Returns \c true if this URL and the given \a url are equal after |
3154 | applying \a options to both; otherwise returns \c false. |
3155 | |
3156 | This is equivalent to calling adjusted(options) on both URLs |
3157 | and comparing the resulting URLs, but faster. |
3158 | |
3159 | */ |
3160 | bool QUrl::matches(const QUrl &url, FormattingOptions options) const |
3161 | { |
3162 | if (!d && !url.d) |
3163 | return true; |
3164 | if (!d) |
3165 | return url.d->isEmpty(); |
3166 | if (!url.d) |
3167 | return d->isEmpty(); |
3168 | |
3169 | // First, compare which sections are present, since it speeds up the |
3170 | // processing considerably. We just have to ignore the host-is-present flag |
3171 | // for local files (the "file" protocol), due to the requirements of the |
3172 | // XDG file URI specification. |
3173 | int mask = QUrlPrivate::FullUrl; |
3174 | if (isLocalFile()) |
3175 | mask &= ~QUrlPrivate::Host; |
3176 | |
3177 | if (options.testFlag(QUrl::RemoveScheme)) |
3178 | mask &= ~QUrlPrivate::Scheme; |
3179 | else if (d->scheme != url.d->scheme) |
3180 | return false; |
3181 | |
3182 | if (options.testFlag(QUrl::RemovePassword)) |
3183 | mask &= ~QUrlPrivate::Password; |
3184 | else if (d->password != url.d->password) |
3185 | return false; |
3186 | |
3187 | if (options.testFlag(QUrl::RemoveUserInfo)) |
3188 | mask &= ~QUrlPrivate::UserName; |
3189 | else if (d->userName != url.d->userName) |
3190 | return false; |
3191 | |
3192 | if (options.testFlag(QUrl::RemovePort)) |
3193 | mask &= ~QUrlPrivate::Port; |
3194 | else if (d->port != url.d->port) |
3195 | return false; |
3196 | |
3197 | if (options.testFlag(QUrl::RemoveAuthority)) |
3198 | mask &= ~QUrlPrivate::Host; |
3199 | else if (d->host != url.d->host) |
3200 | return false; |
3201 | |
3202 | if (options.testFlag(QUrl::RemoveQuery)) |
3203 | mask &= ~QUrlPrivate::Query; |
3204 | else if (d->query != url.d->query) |
3205 | return false; |
3206 | |
3207 | if (options.testFlag(QUrl::RemoveFragment)) |
3208 | mask &= ~QUrlPrivate::Fragment; |
3209 | else if (d->fragment != url.d->fragment) |
3210 | return false; |
3211 | |
3212 | if ((d->sectionIsPresent & mask) != (url.d->sectionIsPresent & mask)) |
3213 | return false; |
3214 | |
3215 | if (options.testFlag(QUrl::RemovePath)) |
3216 | return true; |
3217 | |
3218 | // Compare paths, after applying path-related options |
3219 | QString path1; |
3220 | d->appendPath(path1, options, QUrlPrivate::Path); |
3221 | QString path2; |
3222 | url.d->appendPath(path2, options, QUrlPrivate::Path); |
3223 | return path1 == path2; |
3224 | } |
3225 | |
3226 | /*! |
3227 | Returns \c true if this URL and the given \a url are not equal; |
3228 | otherwise returns \c false. |
3229 | |
3230 | \sa matches() |
3231 | */ |
3232 | bool QUrl::operator !=(const QUrl &url) const |
3233 | { |
3234 | return !(*this == url); |
3235 | } |
3236 | |
3237 | /*! |
3238 | Assigns the specified \a url to this object. |
3239 | */ |
3240 | QUrl &QUrl::operator =(const QUrl &url) noexcept |
3241 | { |
3242 | if (!d) { |
3243 | if (url.d) { |
3244 | url.d->ref.ref(); |
3245 | d = url.d; |
3246 | } |
3247 | } else { |
3248 | if (url.d) |
3249 | qAtomicAssign(d, url.d); |
3250 | else |
3251 | clear(); |
3252 | } |
3253 | return *this; |
3254 | } |
3255 | |
3256 | /*! |
3257 | Assigns the specified \a url to this object. |
3258 | */ |
3259 | QUrl &QUrl::operator =(const QString &url) |
3260 | { |
3261 | if (url.isEmpty()) { |
3262 | clear(); |
3263 | } else { |
3264 | detach(); |
3265 | d->parse(url, TolerantMode); |
3266 | } |
3267 | return *this; |
3268 | } |
3269 | |
3270 | /*! |
3271 | \fn void QUrl::swap(QUrl &other) |
3272 | \since 4.8 |
3273 | |
3274 | Swaps URL \a other with this URL. This operation is very |
3275 | fast and never fails. |
3276 | */ |
3277 | |
3278 | /*! |
3279 | \internal |
3280 | |
3281 | Forces a detach. |
3282 | */ |
3283 | void QUrl::detach() |
3284 | { |
3285 | if (!d) |
3286 | d = new QUrlPrivate; |
3287 | else |
3288 | qAtomicDetach(d); |
3289 | } |
3290 | |
3291 | /*! |
3292 | \internal |
3293 | */ |
3294 | bool QUrl::isDetached() const |
3295 | { |
3296 | return !d || d->ref.loadRelaxed() == 1; |
3297 | } |
3298 | |
3299 | static QString fromNativeSeparators(const QString &pathName) |
3300 | { |
3301 | #if defined(Q_OS_WIN) |
3302 | QString result(pathName); |
3303 | const QChar nativeSeparator = u'\\'; |
3304 | auto i = result.indexOf(nativeSeparator); |
3305 | if (i != -1) { |
3306 | QChar * const data = result.data(); |
3307 | const auto length = result.length(); |
3308 | for (; i < length; ++i) { |
3309 | if (data[i] == nativeSeparator) |
3310 | data[i] = u'/'; |
3311 | } |
3312 | } |
3313 | return result; |
3314 | #else |
3315 | return pathName; |
3316 | #endif |
3317 | } |
3318 | |
3319 | /*! |
3320 | Returns a QUrl representation of \a localFile, interpreted as a local |
3321 | file. This function accepts paths separated by slashes as well as the |
3322 | native separator for this platform. |
3323 | |
3324 | This function also accepts paths with a doubled leading slash (or |
3325 | backslash) to indicate a remote file, as in |
3326 | "//servername/path/to/file.txt". Note that only certain platforms can |
3327 | actually open this file using QFile::open(). |
3328 | |
3329 | An empty \a localFile leads to an empty URL (since Qt 5.4). |
3330 | |
3331 | \snippet code/src_corelib_io_qurl.cpp 16 |
3332 | |
3333 | In the first line of the snippet above, a file URL is constructed from a |
3334 | local, relative path. A file URL with a relative path only makes sense |
3335 | if there is a base URL to resolve it against. For example: |
3336 | |
3337 | \snippet code/src_corelib_io_qurl.cpp 17 |
3338 | |
3339 | To resolve such a URL, it's necessary to remove the scheme beforehand: |
3340 | |
3341 | \snippet code/src_corelib_io_qurl.cpp 18 |
3342 | |
3343 | For this reason, it is better to use a relative URL (that is, no scheme) |
3344 | for relative file paths: |
3345 | |
3346 | \snippet code/src_corelib_io_qurl.cpp 19 |
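
As a rough illustration of the path handling described above, the
following standalone sketch mirrors the separator conversion, the
drive-letter case and the "//servername/..." case using only the C++
standard library. The names FileUrlParts and splitLocalFile() are
hypothetical and not part of QUrl. Unlike fromLocalFile(), the sketch
converts backslashes on every platform, does not validate the host name,
and ignores the WebDAV "@SSL" form.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper type, not part of Qt: the host and path that a
// file URL built from a local path would carry.
struct FileUrlParts {
    std::string host;
    std::string path;
};

// Simplified sketch of the splitting rules; assumes a non-empty input.
inline FileUrlParts splitLocalFile(std::string localFile)
{
    // Treat backslashes as separators (the real function only does this on Windows).
    std::replace(localFile.begin(), localFile.end(), '\\', '/');

    FileUrlParts parts;
    if (localFile.size() > 1 && localFile[1] == ':' && localFile[0] != '/') {
        // Drive letter: "C:/dir" becomes the path "/C:/dir".
        parts.path = "/" + localFile;
    } else if (localFile.compare(0, 2, "//") == 0) {
        // Doubled leading slash: "//server/share" has host "server".
        const std::string::size_type slash = localFile.find('/', 2);
        if (slash == std::string::npos) {
            parts.host = localFile.substr(2);
        } else {
            parts.host = localFile.substr(2, slash - 2);
            parts.path = localFile.substr(slash);
        }
    } else {
        parts.path = localFile;
    }
    return parts;
}
```

With this sketch, splitLocalFile("//servername/path/to/file.txt") yields
the host "servername" and the path "/path/to/file.txt".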
3347 | |
3348 | \sa toLocalFile(), isLocalFile(), QDir::toNativeSeparators() |
3349 | */ |
3350 | QUrl QUrl::fromLocalFile(const QString &localFile) |
3351 | { |
3352 | QUrl url; |
3353 | if (localFile.isEmpty()) |
3354 | return url; |
3355 | QString scheme = fileScheme(); |
3356 | QString deslashified = fromNativeSeparators(localFile); |
3357 | |
3358 | // magic for drives on windows |
3359 | if (deslashified.size() > 1 && deslashified.at(1) == u':' && deslashified.at(0) != u'/') { |
3360 | deslashified.prepend(u'/'); |
3361 | } else if (deslashified.startsWith("//"_L1)) { |
3362 | // magic for shared drive on windows |
3363 | qsizetype indexOfPath = deslashified.indexOf(u'/', 2); |
3364 | QStringView hostSpec = QStringView{deslashified}.mid(2, indexOfPath - 2); |
3365 | // Check for Windows-specific WebDAV specification: "//host@SSL/path". |
3366 | if (hostSpec.endsWith(webDavSslTag(), Qt::CaseInsensitive)) { |
3367 | hostSpec.truncate(hostSpec.size() - 4); |
3368 | scheme = webDavScheme(); |
3369 | } |
3370 | |
3371 | // hosts can't be IPv6 addresses without [], so we can use QUrlPrivate::setHost |
3372 | url.detach(); |
3373 | if (!url.d->setHost(hostSpec.toString(), 0, hostSpec.size(), StrictMode)) { |
3374 | if (url.d->error->code != QUrlPrivate::InvalidRegNameError) |
3375 | return url; |
3376 | |
3377 | // Path hostname is not a valid URL host, so set it entirely in the path |
3378 | // (by leaving deslashified unchanged) |
3379 | } else if (indexOfPath > 2) { |
3380 | deslashified = deslashified.right(deslashified.size() - indexOfPath); |
3381 | } else { |
3382 | deslashified.clear(); |
3383 | } |
3384 | } |
3385 | |
3386 | url.setScheme(scheme); |
3387 | url.setPath(deslashified, DecodedMode); |
3388 | return url; |
3389 | } |
3390 | |
3391 | /*! |
3392 | Returns the path of this URL formatted as a local file path. The path |
3393 | returned will use forward slashes, even if it was originally created |
3394 | from one with backslashes. |
3395 | |
3396 | If this URL contains a non-empty hostname, it will be encoded in the |
3397 | returned value in the form found on SMB networks (for example, |
3398 | "//servername/path/to/file.txt"). |
3399 | |
3400 | \snippet code/src_corelib_io_qurl.cpp 20 |
3401 | |
3402 | Note: if the path component of this URL contains a non-UTF-8 binary |
3403 | sequence (such as %80), the behaviour of this function is undefined. |
3404 | |
3405 | \sa fromLocalFile(), isLocalFile() |
3406 | */ |
3407 | QString QUrl::toLocalFile() const |
3408 | { |
3409 | // the call to isLocalFile() also ensures that we're parsed |
3410 | if (!isLocalFile()) |
3411 | return QString(); |
3412 | |
3413 | return d->toLocalFile(QUrl::FullyDecoded); |
3414 | } |
3415 | |
3416 | /*! |
3417 | \since 4.8 |
3418 | Returns \c true if this URL is pointing to a local file path. A URL is a |
3419 | local file path if the scheme is "file". |
3420 | |
3421 | Note that this function considers URLs with hostnames to be local file |
3422 | paths, even if the eventual file path cannot be opened with |
3423 | QFile::open(). |
3424 | |
3425 | \sa fromLocalFile(), toLocalFile() |
3426 | */ |
3427 | bool QUrl::isLocalFile() const |
3428 | { |
3429 | return d && d->isLocalFile(); |
3430 | } |
3431 | |
3432 | /*! |
3433 | Returns \c true if this URL is a parent of \a childUrl. \a childUrl is a child |
3434 | of this URL if the two URLs share the same scheme and authority, |
3435 | and this URL's path is a parent of the path of \a childUrl. |
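
The path part of this test can be sketched with plain strings as
follows. isParentPath() is a hypothetical helper, not a QUrl member; it
only covers the path comparison and leaves out the scheme and authority
checks.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Qt: returns true if parentPath is a
// proper ancestor of childPath, comparing whole path segments.
inline bool isParentPath(const std::string &parentPath, const std::string &childPath)
{
    // The child must be strictly longer and start with the parent path.
    if (childPath.size() <= parentPath.size()
        || childPath.compare(0, parentPath.size(), parentPath) != 0)
        return false;

    // Either the parent already ends at a separator ("/a/"), or the next
    // character in the child is one ("/a" vs "/a/b", but not "/a" vs "/ab").
    const bool parentEndsWithSlash = !parentPath.empty() && parentPath.back() == '/';
    return parentEndsWithSlash || childPath[parentPath.size()] == '/';
}
```

Requiring a separator at the boundary is what keeps "/a" from being
reported as a parent of "/ab", even though one string is a prefix of the
other.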
3436 | */ |
3437 | bool QUrl::isParentOf(const QUrl &childUrl) const |
3438 | { |
3439 | QString childPath = childUrl.path(); |
3440 | |
3441 | if (!d) |
3442 | return ((childUrl.scheme().isEmpty()) |
3443 | && (childUrl.authority().isEmpty()) |
3444 | && childPath.size() > 0 && childPath.at(0) == u'/'); |
3445 | |
3446 | QString ourPath = path(); |
3447 | |
3448 | return ((childUrl.scheme().isEmpty() || d->scheme == childUrl.scheme()) |
3449 | && (childUrl.authority().isEmpty() || authority() == childUrl.authority()) |
3450 | && childPath.startsWith(ourPath) |
3451 | && ((ourPath.endsWith(u'/') && childPath.size() > ourPath.size()) |
3452 | || (!ourPath.endsWith(u'/') && childPath.size() > ourPath.size() |
3453 | && childPath.at(ourPath.size()) == u'/'))); |
3454 | } |
3455 | |
3456 | |
3457 | #ifndef QT_NO_DATASTREAM |
3458 | /*! \relates QUrl |
3459 | |
3460 | Writes url \a url to the stream \a out and returns a reference |
3461 | to the stream. |
3462 | |
3463 | \sa{Serializing Qt Data Types}{Format of the QDataStream operators} |
3464 | */ |
3465 | QDataStream &operator<<(QDataStream &out, const QUrl &url) |
3466 | { |
3467 | QByteArray u; |
3468 | if (url.isValid()) |
3469 | u = url.toEncoded(); |
3470 | out << u; |
3471 | return out; |
3472 | } |
3473 | |
3474 | /*! \relates QUrl |
3475 | |
3476 | Reads a url into \a url from the stream \a in and returns a |
3477 | reference to the stream. |
3478 | |
3479 | \sa{Serializing Qt Data Types}{Format of the QDataStream operators} |
3480 | */ |
3481 | QDataStream &operator>>(QDataStream &in, QUrl &url) |
3482 | { |
3483 | QByteArray u; |
3484 | in >> u; |
3485 | url.setUrl(QString::fromLatin1(u)); |
3486 | return in; |
3487 | } |
3488 | #endif // QT_NO_DATASTREAM |
3489 | |
3490 | #ifndef QT_NO_DEBUG_STREAM |
3491 | QDebug operator<<(QDebug d, const QUrl &url) |
3492 | { |
3493 | QDebugStateSaver saver(d); |
3494 | d.nospace() << "QUrl(" << url.toDisplayString() << ')'; |
3495 | return d; |
3496 | } |
3497 | #endif |
3498 | |
3499 | static QString errorMessage(QUrlPrivate::ErrorCode errorCode, const QString &errorSource, qsizetype errorPosition) |
3500 | { |
3501 | QChar c = size_t(errorPosition) < size_t(errorSource.size()) ? |
3502 | errorSource.at(errorPosition) : QChar(QChar::Null); |
3503 | |
3504 | switch (errorCode) { |
3505 | case QUrlPrivate::NoError: |
3506 | Q_UNREACHABLE_RETURN(QString()); // QUrl::errorString should have treated this condition |
3507 | |
3508 | case QUrlPrivate::InvalidSchemeError: { |
3509 | auto msg = "Invalid scheme (character '%1' not permitted)"_L1; |
3510 | return msg.arg(c); |
3511 | } |
3512 | |
3513 | case QUrlPrivate::InvalidUserNameError: |
3514 | return "Invalid user name (character '%1' not permitted)"_L1 |
3515 | .arg(c); |
3516 | |
3517 | case QUrlPrivate::InvalidPasswordError: |
3518 | return "Invalid password (character '%1' not permitted)"_L1 |
3519 | .arg(c); |
3520 | |
3521 | case QUrlPrivate::InvalidRegNameError: |
3522 | if (errorPosition >= 0) |
3523 | return "Invalid hostname (character '%1' not permitted)"_L1 |
3524 | .arg(c); |
3525 | else |
3526 | return QStringLiteral("Invalid hostname (contains invalid characters)"); |
3527 | case QUrlPrivate::InvalidIPv4AddressError: |
3528 | return QString(); // doesn't happen yet |
3529 | case QUrlPrivate::InvalidIPv6AddressError: |
3530 | return QStringLiteral("Invalid IPv6 address"); |
3531 | case QUrlPrivate::InvalidCharacterInIPv6Error: |
3532 | return "Invalid IPv6 address (character '%1' not permitted)"_L1.arg(c); |
3533 | case QUrlPrivate::InvalidIPvFutureError: |
3534 | return "Invalid IPvFuture address (character '%1' not permitted)"_L1.arg(c); |
3535 | case QUrlPrivate::HostMissingEndBracket: |
3536 | return QStringLiteral("Expected ']' to match '[' in hostname"); |
3537 | |
3538 | case QUrlPrivate::InvalidPortError: |
3539 | return QStringLiteral("Invalid port or port number out of range"); |
3540 | case QUrlPrivate::PortEmptyError: |
3541 | return QStringLiteral("Port field was empty"); |
3542 | |
3543 | case QUrlPrivate::InvalidPathError: |
3544 | return "Invalid path (character '%1' not permitted)"_L1 |
3545 | .arg(c); |
3546 | |
3547 | case QUrlPrivate::InvalidQueryError: |
3548 | return "Invalid query (character '%1' not permitted)"_L1 |
3549 | .arg(c); |
3550 | |
3551 | case QUrlPrivate::InvalidFragmentError: |
3552 | return "Invalid fragment (character '%1' not permitted)"_L1 |
3553 | .arg(c); |
3554 | |
3555 | case QUrlPrivate::AuthorityPresentAndPathIsRelative: |
3556 | return QStringLiteral("Path component is relative and authority is present"); |
3557 | case QUrlPrivate::AuthorityAbsentAndPathIsDoubleSlash: |
3558 | return QStringLiteral("Path component starts with '//' and authority is absent"); |
3559 | case QUrlPrivate::RelativeUrlPathContainsColonBeforeSlash: |
3560 | return QStringLiteral("Relative URL's path component contains ':' before any '/'"); |
3561 | } |
3562 | |
3563 | Q_UNREACHABLE_RETURN(QString()); |
3564 | } |
3565 | |
3566 | static inline void appendComponentIfPresent(QString &msg, bool present, const char *componentName, |
3567 | const QString &component) |
3568 | { |
3569 | if (present) |
3570 | msg += QLatin1StringView(componentName) % u'"' % component % "\","_L1; |
3571 | } |
3572 | |
3573 | /*! |
3574 | \since 4.2 |
3575 | |
3576 | Returns an error message if the last operation that modified this QUrl |
3577 | object ran into a parsing error. If no error was detected, this function |
3578 | returns an empty string and isValid() returns \c true. |
3579 | |
3580 | The error message returned by this function is technical in nature and may |
3581 | not be understood by end users. It is mostly useful to developers trying to |
3582 | understand why QUrl will not accept some input. |
3583 | |
3584 | \sa QUrl::ParsingMode |
3585 | */ |
3586 | QString QUrl::errorString() const |
3587 | { |
3588 | QString msg; |
3589 | if (!d) |
3590 | return msg; |
3591 | |
3592 | QString errorSource; |
3593 | qsizetype errorPosition = 0; |
3594 | QUrlPrivate::ErrorCode errorCode = d->validityError(&errorSource, &errorPosition); |
3595 | if (errorCode == QUrlPrivate::NoError) |
3596 | return msg; |
3597 | |
3598 | msg += errorMessage(errorCode, errorSource, errorPosition); |
3599 | msg += "; source was \""_L1; |
3600 | msg += errorSource; |
3601 | msg += "\";"_L1; |
3602 | appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Scheme, |
3603 | " scheme = ", d->scheme); |
3604 | appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::UserInfo, |
3605 | " userinfo = ", userInfo()); |
3606 | appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Host, |
3607 | " host = ", d->host); |
3608 | appendComponentIfPresent(msg, d->port != -1, |
3609 | " port = ", QString::number(d->port)); |
3610 | appendComponentIfPresent(msg, !d->path.isEmpty(), |
3611 | " path = ", d->path); |
3612 | appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Query, |
3613 | " query = ", d->query); |
3614 | appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Fragment, |
3615 | " fragment = ", d->fragment); |
3616 | if (msg.endsWith(u',')) |
3617 | msg.chop(1); |
3618 | return msg; |
3619 | } |
3620 | |
3621 | /*! |
3622 | \since 5.1 |
3623 | |
3624 | Converts a list of \a urls into a list of QString objects, using toString(\a options). |
3625 | */ |
3626 | QStringList QUrl::toStringList(const QList<QUrl> &urls, FormattingOptions options) |
3627 | { |
3628 | QStringList lst; |
3629 | lst.reserve(urls.size()); |
3630 | for (const QUrl &url : urls) |
3631 | lst.append(url.toString(options)); |
3632 | return lst; |
3633 | |
3634 | } |
3635 | |
3636 | /*! |
3637 | \since 5.1 |
3638 | |
3639 | Converts a list of strings representing \a urls into a list of URLs, using QUrl(str, \a mode). |
3640 | Note that this means all strings must be URLs, not, for instance, local paths. |
3641 | */ |
3642 | QList<QUrl> QUrl::fromStringList(const QStringList &urls, ParsingMode mode) |
3643 | { |
3644 | QList<QUrl> lst; |
3645 | lst.reserve(urls.size()); |
3646 | for (const QString &str : urls) |
3647 | lst.append(QUrl(str, mode)); |
3648 | return lst; |
3649 | } |
3650 | |
3651 | /*! |
3652 | \typedef QUrl::DataPtr |
3653 | \internal |
3654 | */ |
3655 | |
3656 | /*! |
3657 | \fn DataPtr &QUrl::data_ptr() |
3658 | \internal |
3659 | */ |
3660 | |
3661 | /*! |
3662 | Returns the hash value for the \a url. If specified, \a seed is used to |
3663 | initialize the hash. |
3664 | |
3665 | \relates QHash |
3666 | \since 5.0 |
3667 | */ |
3668 | size_t qHash(const QUrl &url, size_t seed) noexcept |
3669 | { |
3670 | if (!url.d) |
3671 | return qHash(-1, seed); // the hash of an unset port (-1) |
3672 | |
3673 | return qHash(url.d->scheme) ^ |
3674 | qHash(url.d->userName) ^ |
3675 | qHash(url.d->password) ^ |
3676 | qHash(url.d->host) ^ |
3677 | qHash(url.d->port, seed) ^ |
3678 | qHash(url.d->path) ^ |
3679 | qHash(url.d->query) ^ |
3680 | qHash(url.d->fragment); |
3681 | } |
3682 | |
3683 | static QUrl adjustFtpPath(QUrl url) |
3684 | { |
3685 | if (url.scheme() == ftpScheme()) { |
3686 | QString path = url.path(QUrl::PrettyDecoded); |
3687 | if (path.startsWith("//"_L1)) |
3688 | url.setPath("/%2F"_L1 + QStringView{path}.mid(2), QUrl::TolerantMode); |
3689 | } |
3690 | return url; |
3691 | } |
3692 | |
3693 | static bool isIp6(const QString &text) |
3694 | { |
3695 | QIPAddressUtils::IPv6Address address; |
3696 | return !text.isEmpty() && QIPAddressUtils::parseIp6(address, text.begin(), text.end()) == nullptr; |
3697 | } |
3698 | |
3699 | /*! |
3700 | Returns a valid URL from a user-supplied \a userInput string if one can be |
3701 | deduced. If that is not possible, an invalid QUrl() is returned. |
3702 | |
3703 | This allows the user to input a URL or a local file path in the form of a plain |
3704 | string. This string can be manually typed into a location bar, obtained from |
3705 | the clipboard, or passed in via command line arguments. |
3706 | |
3707 | When the string is not already a valid URL, a best guess is performed, |
3708 | making various assumptions. |
3709 | |
3710 | In the case the string corresponds to a valid file path on the system, |
3711 | a file:// URL is constructed, using QUrl::fromLocalFile(). |
3712 | |
3713 | If that is not the case, an attempt is made to turn the string into an |
3714 | http:// or ftp:// URL; the latter is used if the string starts with |
3715 | 'ftp'. The result is then passed through QUrl's tolerant parser. On |
3716 | success, a valid QUrl is returned; otherwise, an invalid QUrl() is returned. |
3717 | |
3718 | \section1 Examples: |
3719 | |
3720 | \list |
3721 | \li qt-project.org becomes http://qt-project.org |
3722 | \li ftp.qt-project.org becomes ftp://ftp.qt-project.org |
3723 | \li hostname becomes http://hostname |
3724 | \li /home/user/test.html becomes file:///home/user/test.html |
3725 | \endlist |
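
The scheme choice behind the first three examples can be sketched as
follows. guessScheme() is a hypothetical helper written with the standard
library only. It looks at the text before the first dot, as the
implementation does for input without a usable scheme, but skips the file
checks, the IPv6 check and the parsing that fromUserInput() performs.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper, not part of Qt: picks "ftp" when the label before
// the first '.' is "ftp" (case-insensitively), and "http" otherwise.
inline std::string guessScheme(const std::string &input)
{
    // substr() with npos keeps the whole string when there is no dot.
    std::string label = input.substr(0, input.find('.'));
    for (char &ch : label)
        ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    return label == "ftp" ? "ftp" : "http";
}
```

Only an exact "ftp" label selects the ftp scheme in this sketch, so a
host such as "ftpserver.example.com" would still be given "http".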
3726 | |
3727 | In order to be able to handle relative paths, this method takes an optional |
3728 | \a workingDirectory path. This is especially useful when handling command |
3729 | line arguments. |
3730 | If \a workingDirectory is empty, no handling of relative paths will be done. |
3731 | |
3732 | By default, an input string that looks like a relative path will only be treated |
3733 | as such if the file actually exists in the given working directory. |
3734 | If the application can handle files that don't exist yet, it should pass the |
3735 | flag AssumeLocalFile in \a options. |
3736 | |
3737 | \since 5.4 |
3738 | */ |
3739 | QUrl QUrl::fromUserInput(const QString &userInput, const QString &workingDirectory, |
3740 | UserInputResolutionOptions options) |
3741 | { |
3742 | QString trimmedString = userInput.trimmed(); |
3743 | |
3744 | if (trimmedString.isEmpty()) |
3745 | return QUrl(); |
3746 | |
3747 | // Check for IPv6 addresses, since a path starting with ":" is absolute (a resource) |
3748 | // and IPv6 addresses can start with "c:" too |
3749 | if (isIp6(trimmedString)) { |
3750 | QUrl url; |
3751 | url.setHost(trimmedString); |
3752 | url.setScheme(QStringLiteral("http")); |
3753 | return url; |
3754 | } |
3755 | |
3756 | const QUrl url = QUrl(trimmedString, QUrl::TolerantMode); |
3757 | |
3758 | // Check for a relative path |
3759 | if (!workingDirectory.isEmpty()) { |
3760 | const QFileInfo fileInfo(QDir(workingDirectory), userInput); |
3761 | if (fileInfo.exists()) |
3762 | return QUrl::fromLocalFile(fileInfo.absoluteFilePath()); |
3763 | |
3764 | // Check both QUrl::isRelative (to detect full URLs) and QDir::isAbsolutePath (since on Windows drive letters can be interpreted as schemes) |
3765 | if ((options & AssumeLocalFile) && url.isRelative() && !QDir::isAbsolutePath(userInput)) |
3766 | return QUrl::fromLocalFile(fileInfo.absoluteFilePath()); |
3767 | } |
3768 | |
3769 | // Check first for files, since on Windows drive letters can be interpreted as schemes |
3770 | if (QDir::isAbsolutePath(trimmedString)) |
3771 | return QUrl::fromLocalFile(trimmedString); |
3772 | |
3773 | QUrl urlPrepended = QUrl("http://"_L1 + trimmedString, QUrl::TolerantMode); |
3774 | |
3775 | // Check the most common case of a valid url with a scheme |
3776 | // We check if the port would be valid by adding the scheme to handle the case host:port |
3777 | // where the host would be interpreted as the scheme |
3778 | if (url.isValid() |
3779 | && !url.scheme().isEmpty() |
3780 | && urlPrepended.port() == -1) |
3781 | return adjustFtpPath(url); |
3782 | |
3783 | // Else, try the prepended one and adjust the scheme from the host name |
3784 | if (urlPrepended.isValid() && (!urlPrepended.host().isEmpty() || !urlPrepended.path().isEmpty())) { |
3785 | qsizetype dotIndex = trimmedString.indexOf(u'.'); |
3786 | const QStringView hostscheme = QStringView{trimmedString}.left(dotIndex); |
3787 | if (hostscheme.compare(ftpScheme(), Qt::CaseInsensitive) == 0) |
3788 | urlPrepended.setScheme(ftpScheme()); |
3789 | return adjustFtpPath(urlPrepended); |
3790 | } |
3791 | |
3792 | return QUrl(); |
3793 | } |
3794 | |
3795 | QT_END_NAMESPACE |
3796 | |