1 | // Copyright (C) 2016 The Qt Company Ltd. |
2 | // Copyright (C) 2016 Intel Corporation. |
3 | // SPDX-License-Identifier: LicenseRef-Qt-Commercial OR LGPL-3.0-only OR GPL-2.0-only OR GPL-3.0-only |
4 | |
5 | /*! |
6 | \class QUrl |
7 | \inmodule QtCore |
8 | |
9 | \brief The QUrl class provides a convenient interface for working |
10 | with URLs. |
11 | |
12 | \reentrant |
13 | \ingroup io |
14 | \ingroup network |
15 | \ingroup shared |
16 | |
17 | \compares weak |
18 | |
19 | It can parse and construct URLs in both encoded and unencoded |
20 | form. QUrl also has support for internationalized domain names |
21 | (IDNs). |
22 | |
23 | The most common way to use QUrl is to initialize it via the constructor by |
24 | passing a QString containing a full URL. QUrl objects can also be created |
25 | from a QByteArray containing a full URL using QUrl::fromEncoded(), or |
26 | heuristically from incomplete URLs using QUrl::fromUserInput(). The URL |
27 | representation can be obtained from a QUrl using either QUrl::toString() or |
28 | QUrl::toEncoded(). |
29 | |
30 | URLs can be represented in two forms: encoded or unencoded. The |
31 | unencoded representation is suitable for showing to users, but |
32 | the encoded representation is typically what you would send to |
33 | a web server. For example, the unencoded URL |
34 | "http://bühler.example.com/List of applicants.xml" |
35 | would be sent to the server as |
36 | "http://xn--bhler-kva.example.com/List%20of%20applicants.xml". |
37 | |
38 | A URL can also be constructed piece by piece by calling |
39 | setScheme(), setUserName(), setPassword(), setHost(), setPort(), |
40 | setPath(), setQuery() and setFragment(). Some convenience |
41 | functions are also available: setAuthority() sets the user name, |
42 | password, host and port. setUserInfo() sets the user name and |
43 | password at once. |
44 | |
45 | Call isValid() to check if the URL is valid. This can be done at any point |
46 | during the construction of a URL. If isValid() returns \c false, you should |
47 | clear() the URL before proceeding, or start over by parsing a new URL with |
48 | setUrl(). |
49 | |
50 | Constructing a query is particularly convenient through the use of the \l |
51 | QUrlQuery class and its methods QUrlQuery::setQueryItems(), |
52 | QUrlQuery::addQueryItem() and QUrlQuery::removeQueryItem(). Use |
53 | QUrlQuery::setQueryDelimiters() to customize the delimiters used for |
54 | generating the query string. |
55 | |
56 | For the convenience of generating encoded URL strings or query |
57 | strings, there are two static functions called |
58 | fromPercentEncoding() and toPercentEncoding() which deal with |
59 | percent encoding and decoding of QString objects. |
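As a rough illustration of what percent-encoding and decoding involve, the
following standalone sketch uses only the C++ standard library. The helper
names percentEncode and percentDecode are hypothetical, and this is not how
toPercentEncoding() or fromPercentEncoding() are implemented. It assumes the
input is already UTF-8 and leaves only the RFC 3986 unreserved characters
unencoded.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: percent-encodes every byte that is not an RFC 3986
// unreserved character (ALPHA / DIGIT / "-" / "." / "_" / "~").
std::string percentEncode(const std::string &utf8)
{
    static const char hexDigits[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : utf8) {
        const bool unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += char(c);
        } else {
            out += '%';
            out += hexDigits[c >> 4];
            out += hexDigits[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: decodes every well-formed "%XX" sequence and copies
// anything else (including a malformed '%') through unchanged.
std::string percentDecode(const std::string &encoded)
{
    auto hexValue = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    };
    std::string out;
    for (std::size_t i = 0; i < encoded.size(); ++i) {
        if (encoded[i] == '%' && i + 2 < encoded.size()) {
            const int high = hexValue(encoded[i + 1]);
            const int low = hexValue(encoded[i + 2]);
            if (high >= 0 && low >= 0) {
                out += char(high * 16 + low);
                i += 2;
                continue;
            }
        }
        out += encoded[i];
    }
    return out;
}
```

With these helpers, the file name "List of applicants.xml" from the example
above encodes to "List%20of%20applicants.xml", and decoding that result
returns the original text.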
60 | |
61 | fromLocalFile() constructs a QUrl by parsing a local |
62 | file path. toLocalFile() converts a URL to a local file path. |
63 | |
64 | The human readable representation of the URL is fetched with |
65 | toString(). This representation is appropriate for displaying a |
66 | URL to a user in unencoded form. The encoded form, however, as |
67 | returned by toEncoded(), is for internal use, passing to web |
68 | servers, mail clients and so on. Both forms are technically correct |
69 | and represent the same URL unambiguously -- in fact, passing either |
70 | form to QUrl's constructor or to setUrl() will yield the same QUrl |
71 | object. |
72 | |
73 | QUrl conforms to the URI specification from |
74 | \l{RFC 3986} (Uniform Resource Identifier: Generic Syntax), and includes |
75 | scheme extensions from \l{RFC 1738} (Uniform Resource Locators). Case |
76 | folding rules in QUrl conform to \l{RFC 3491} (Nameprep: A Stringprep |
77 | Profile for Internationalized Domain Names (IDN)). It is also compatible with the |
78 | \l{http://freedesktop.org/wiki/Specifications/file-uri-spec/}{file URI specification} |
79 | from freedesktop.org, provided that the locale encodes file names using |
80 | UTF-8 (required by IDN). |
81 | |
82 | \section2 Relative URLs vs Relative Paths |
83 | |
84 | Calling isRelative() will return whether or not the URL is relative. |
85 | A relative URL has no \l {scheme}. For example: |
86 | |
87 | \snippet code/src_corelib_io_qurl.cpp 8 |
88 | |
89 | Notice that a URL can be absolute while containing a relative path, and |
90 | vice versa: |
91 | |
92 | \snippet code/src_corelib_io_qurl.cpp 9 |
93 | |
94 | A relative URL can be resolved by passing it as an argument to resolved(), |
95 | which returns an absolute URL. isParentOf() is used for determining whether |
96 | one URL is a parent of another. |
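The rule that a relative URL has no scheme can be sketched outside of Qt with
the scheme grammar from RFC 3986. The function names startsWithScheme and
looksRelative below are hypothetical, and the check is deliberately narrower
than what QUrl does when it parses a string; it only inspects the leading
characters for a syntactically valid scheme followed by a colon.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if the text begins with
//   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
// immediately followed by ':'.
bool startsWithScheme(const std::string &text)
{
    auto isAlpha = [](char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); };
    auto isDigit = [](char c) { return c >= '0' && c <= '9'; };

    if (text.empty() || !isAlpha(text[0]))
        return false;
    for (std::size_t i = 1; i < text.size(); ++i) {
        const char c = text[i];
        if (c == ':')
            return true;    // reached the delimiter after a valid scheme
        if (!isAlpha(c) && !isDigit(c) && c != '+' && c != '-' && c != '.')
            return false;   // e.g. a '/' before any ':' means there is no scheme
    }
    return false;           // no ':' at all
}

// Hypothetical helper: a URL reference is treated as relative when it has no scheme.
bool looksRelative(const std::string &text)
{
    return !startsWithScheme(text);
}
```

Under this sketch "mailto:user@example.com" is absolute, while "/path/to/file"
and "dir/file:name" are relative, because in the last case a slash appears
before the first colon.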
97 | |
98 | \section2 Error checking |
99 | |
100 | QUrl is capable of detecting many errors in URLs while parsing them or when |
101 | components of the URL are set with individual setter methods (like |
102 | setScheme(), setHost() or setPath()). If the parsing or setter function is |
103 | successful, any previously recorded error conditions will be discarded. |
104 | |
105 | By default, QUrl setter methods operate in QUrl::TolerantMode, which means |
106 | they accept some common mistakes and misrepresentations of data. An |
107 | alternate method of parsing is QUrl::StrictMode, which applies further |
108 | checks. See QUrl::ParsingMode for a description of the differences between |
109 | the parsing modes. |
110 | |
111 | QUrl only checks for conformance with the URL specification. It does not |
112 | try to verify that high-level protocol URLs are in the format expected by |
113 | the handlers that consume them. For example, the following URIs are |
114 | all considered valid by QUrl, even if they do not make sense when used: |
115 | |
116 | \list |
117 | \li "http:/filename.html" |
118 | \li "mailto://example.com" |
119 | \endlist |
120 | |
121 | When the parser encounters an error, it signals the event by making |
122 | isValid() return false and toString() / toEncoded() return an empty string. |
123 | If it is necessary to show the user the reason why the URL failed to parse, |
124 | the error condition can be obtained from QUrl by calling errorString(). |
125 | Note that this message is highly technical and may not make sense to |
126 | end-users. |
127 | |
128 | QUrl is capable of recording only one error condition. If more than one |
129 | error is found, it is undefined which error is reported. |
130 | |
131 | \section2 Character Conversions |
132 | |
133 | Follow these rules to avoid erroneous character conversion when |
134 | dealing with URLs and strings: |
135 | |
136 | \list |
137 | \li When creating a QString to contain a URL from a QByteArray or a |
138 | char*, always use QString::fromUtf8(). |
139 | \endlist |
140 | */ |
141 | |
142 | /*! |
143 | \enum QUrl::ParsingMode |
144 | |
145 | The parsing mode controls the way QUrl parses strings. |
146 | |
147 | \value TolerantMode QUrl will try to correct some common errors in URLs. |
148 | This mode is useful for parsing URLs coming from sources |
149 | not known to be strictly standards-conforming. |
150 | |
151 | \value StrictMode Only valid URLs are accepted. This mode is useful for |
152 | general URL validation. |
153 | |
154 | \value DecodedMode QUrl will interpret the URL component in the fully-decoded form, |
155 | where percent characters stand for themselves, not as the beginning |
156 | of a percent-encoded sequence. This mode is only valid for the |
157 | setters setting components of a URL; it is not permitted in |
158 | the QUrl constructor, in fromEncoded() or in setUrl(). |
159 | For more information on this mode, see the documentation for |
160 | \l {QUrl::ComponentFormattingOption}{QUrl::FullyDecoded}. |
161 | |
162 | In TolerantMode, the parser has the following behavior: |
163 | |
164 | \list |
165 | |
166 | \li Spaces and "%20": unencoded space characters will be accepted and will |
167 | be treated as equivalent to "%20". |
168 | |
169 | \li Single "%" characters: Any occurrences of a percent character "%" not |
170 | followed by exactly two hexadecimal characters (e.g., "13% coverage.html") |
171 | will be replaced by "%25". Note that one lone "%" character will trigger |
172 | the correction mode for all percent characters. |
173 | |
174 | \li Reserved and unreserved characters: An encoded URL should only |
175 | contain a few characters as literals; all other characters should |
176 | be percent-encoded. In TolerantMode, these characters will be |
177 | accepted if they are found in the URL: |
178 | space / double-quote / "<" / ">" / "\" / |
179 | "^" / "`" / "{" / "|" / "}" |
180 | Those same characters can be decoded again by passing QUrl::DecodeReserved |
181 | to toString() or toEncoded(). In the getters of individual components, |
182 | those characters are often returned in decoded form. |
183 | |
184 | \endlist |
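The rule for single "%" characters can be sketched in standalone C++ as
follows. This is only an illustration of the documented behavior, not the
parser's actual code, and the names allPercentsWellFormed and
correctLonePercents are hypothetical. It assumes that a "%" counts as
well-formed only when two hexadecimal digits follow it, and that a single
malformed "%" causes every "%" in the input to be rewritten as "%25".

```cpp
#include <cstddef>
#include <string>

static bool isHexDigitChar(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

// Hypothetical helper: true if every '%' is followed by two hex digits.
bool allPercentsWellFormed(const std::string &s)
{
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] != '%')
            continue;
        if (i + 2 >= s.size() || !isHexDigitChar(s[i + 1]) || !isHexDigitChar(s[i + 2]))
            return false;
    }
    return true;
}

// Hypothetical helper: if any '%' is malformed, escape all of them;
// otherwise return the input unchanged.
std::string correctLonePercents(const std::string &s)
{
    if (allPercentsWellFormed(s))
        return s;
    std::string out;
    for (char c : s) {
        if (c == '%')
            out += "%25";
        else
            out += c;
    }
    return out;
}
```

For the example above, "13% coverage.html" becomes "13%25 coverage.html". An
input such as "50%20off%" becomes "50%2520off%25", because the trailing lone
"%" switches the correction on for the otherwise valid "%20" as well.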
185 | |
186 | When in StrictMode, if a parsing error is found, isValid() will return \c |
187 | false and errorString() will return a message describing the error. |
188 | If more than one error is detected, it is undefined which error gets |
189 | reported. |
190 | |
191 | Note that TolerantMode is not usually enough for parsing user input, which |
192 | often contains more errors and expectations than the parser can deal with. |
193 | When dealing with data coming directly from the user -- as opposed to data |
194 | coming from data-transfer sources, such as other programs -- it is |
195 | recommended to use fromUserInput(). |
196 | |
197 | \sa fromUserInput(), setUrl(), toString(), toEncoded(), QUrl::FormattingOptions |
198 | */ |
199 | |
200 | /*! |
201 | \enum QUrl::UrlFormattingOption |
202 | |
203 | The formatting options define how the URL is formatted when written out |
204 | as text. |
205 | |
206 | \value None The format of the URL is unchanged. |
207 | \value RemoveScheme The scheme is removed from the URL. |
208 | \value RemovePassword Any password in the URL is removed. |
209 | \value RemoveUserInfo Any user information in the URL is removed. |
210 | \value RemovePort Any specified port is removed from the URL. |
211 | \value RemoveAuthority The authority (user information, host and port) is removed from the URL. |
212 | \value RemovePath The URL's path is removed, leaving only the scheme, |
213 | host address, and port (if present). |
214 | \value RemoveQuery The query part of the URL (following a '?' character) |
215 | is removed. |
216 | \value RemoveFragment The fragment part of the URL (following a '#' character) is removed. |
217 | \value RemoveFilename The filename (i.e. everything after the last '/' in the path) is removed. |
218 | The trailing '/' is kept, unless StripTrailingSlash is set. |
219 | Only valid if RemovePath is not set. |
220 | \value PreferLocalFile If the URL is a local file according to isLocalFile() |
221 | and contains no query or fragment, a local file path is returned. |
222 | \value StripTrailingSlash The trailing slash is removed from the path, if one is present. |
223 | \value NormalizePathSegments Modifies the path to remove redundant directory separators, |
224 | and to resolve "."s and ".."s (as far as possible). For non-local paths, adjacent |
225 | slashes are preserved. |
226 | |
227 | Note that the case folding rules in \l{RFC 3491}{Nameprep}, which QUrl |
228 | conforms to, require host names to always be converted to lower case, |
229 | regardless of the QUrl::FormattingOptions used. |
230 | |
231 | The options from QUrl::ComponentFormattingOptions are also possible. |
232 | |
233 | \sa QUrl::ComponentFormattingOptions |
234 | */ |
235 | |
236 | /*! |
237 | \enum QUrl::ComponentFormattingOption |
238 | \since 5.0 |
239 | |
240 | The component formatting options define how the components of a URL will |
241 | be formatted when written out as text. They can be combined with the |
242 | options from QUrl::FormattingOptions when used in toString() and |
243 | toEncoded(). |
244 | |
245 | \value PrettyDecoded The component is returned in a "pretty form", with |
246 | most percent-encoded characters decoded. The exact |
247 | behavior of PrettyDecoded varies from component to |
248 | component and may also change from Qt release to Qt |
249 | release. This is the default. |
250 | |
251 | \value EncodeSpaces Leave space characters in their encoded form ("%20"). |
252 | |
253 | \value EncodeUnicode Leave non-US-ASCII characters encoded in their UTF-8 |
254 | percent-encoded form (e.g., "%C3%A9" for the U+00E9 |
255 | codepoint, LATIN SMALL LETTER E WITH ACUTE). |
256 | |
257 | \value EncodeDelimiters Leave certain delimiters in their encoded form, as |
258 | would appear in the URL when the full URL is |
259 | represented as text. The delimiters affected |
260 | by this option change from component to component. |
261 | This flag has no effect in toString() or toEncoded(). |
262 | |
263 | \value EncodeReserved Leave US-ASCII characters not permitted in the URL by |
264 | the specification in their encoded form. This is the |
265 | default on toString() and toEncoded(). |
266 | |
267 | \value DecodeReserved Decode the US-ASCII characters that the URL specification |
268 | does not allow to appear in the URL. This is the |
269 | default on the getters of individual components. |
270 | |
271 | \value FullyEncoded Leave all characters in their properly-encoded form, |
272 | as this component would appear as part of a URL. When |
273 | used with toString(), this produces a fully-compliant |
274 | URL in QString form, exactly equal to the result of |
275 | toEncoded(). |
276 | |
277 | \value FullyDecoded Attempt to decode as much as possible. For individual |
278 | components of the URL, this decodes every percent |
279 | encoding sequence, including control characters (U+0000 |
280 | to U+001F) and UTF-8 sequences found in percent-encoded form. |
281 | Use of this mode may cause data loss, see below for more information. |
282 | |
283 | The values of EncodeReserved and DecodeReserved should not be used together |
284 | in one call. The behavior is undefined if that happens. They are provided |
285 | as separate values because the behavior of the "pretty mode" with regard |
286 | to reserved characters is different on certain components and especially on |
287 | the full URL. |
288 | |
289 | \section2 Full decoding |
290 | |
291 | The FullyDecoded mode is similar to the behavior of the functions returning |
292 | QString in Qt 4.x, in that every character represents itself and never has |
293 | any special meaning. This is true even for the percent character ('%'), |
294 | which should be interpreted to mean a literal percent, not the beginning of |
295 | a percent-encoded sequence. The same actual character, in all other |
296 | decoding modes, is represented by the sequence "%25". |
297 | |
298 | Whenever re-applying data obtained with QUrl::FullyDecoded into a QUrl, |
299 | care must be taken to use the QUrl::DecodedMode parameter to the setters |
300 | (like setPath() and setUserName()). Failure to do so may cause |
301 | re-interpretation of the percent character ('%') as the beginning of a |
302 | percent-encoded sequence. |
303 | |
304 | This mode is quite useful when portions of a URL are used in a non-URL |
305 | context. For example, to extract the username, password or file paths in an |
306 | FTP client application, the FullyDecoded mode should be used. |
307 | |
308 | This mode should be used with care, since there are two conditions that |
309 | cannot be reliably represented in the returned QString. They are: |
310 | |
311 | \list |
312 | \li \b{Non-UTF-8 sequences:} URLs may contain sequences of |
313 | percent-encoded characters that do not form valid UTF-8 sequences. Since |
314 | URLs need to be decoded using UTF-8, any decoder failure will result in |
315 | the QString containing one or more replacement characters where the |
316 | sequence existed. |
317 | |
318 | \li \b{Encoded delimiters:} URLs are also allowed to make a distinction |
319 | between a delimiter found in its literal form and its equivalent in |
320 | percent-encoded form. This is most commonly found in the query, but is |
321 | permitted in most parts of the URL. |
322 | \endlist |
323 | |
324 | The following example illustrates the problem: |
325 | |
326 | \snippet code/src_corelib_io_qurl.cpp 10 |
327 | |
328 | If the two URLs were used via HTTP GET, the interpretation by the web |
329 | server would probably be different. In the first case, it would interpret |
330 | as one parameter, with a key of "q" and value "a+=b&c". In the second |
331 | case, it would probably interpret as two parameters, one with a key of "q" |
332 | and value "a =b", and the second with a key "c" and no value. |
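The difference in interpretation can be reproduced with a small standalone
parser. The sketch below is not part of Qt and does not describe any
particular web server; the names formDecode and parseQuery are hypothetical.
It assumes the common form-encoding convention in which items are separated
by literal "&" characters, the first literal "=" separates key from value,
and a literal "+" stands for a space. The two query strings named in the
usage note are assumed inputs chosen to match the interpretation described
above.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: turns '+' into a space and decodes "%XX" sequences.
std::string formDecode(const std::string &s)
{
    auto hexValue = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    };
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '+') {
            out += ' ';
        } else if (s[i] == '%' && i + 2 < s.size()
                   && hexValue(s[i + 1]) >= 0 && hexValue(s[i + 2]) >= 0) {
            out += char(hexValue(s[i + 1]) * 16 + hexValue(s[i + 2]));
            i += 2;
        } else {
            out += s[i];
        }
    }
    return out;
}

// Hypothetical helper: splits on literal delimiters first, then decodes each
// key and value, so encoded delimiters never take part in the splitting.
std::vector<std::pair<std::string, std::string>> parseQuery(const std::string &query)
{
    std::vector<std::pair<std::string, std::string>> items;
    std::size_t start = 0;
    while (start <= query.size()) {
        std::size_t amp = query.find('&', start);
        if (amp == std::string::npos)
            amp = query.size();
        const std::string item = query.substr(start, amp - start);
        if (!item.empty()) {
            const std::size_t eq = item.find('=');
            if (eq == std::string::npos)
                items.emplace_back(formDecode(item), std::string());
            else
                items.emplace_back(formDecode(item.substr(0, eq)),
                                   formDecode(item.substr(eq + 1)));
        }
        start = amp + 1;
    }
    return items;
}
```

Parsing "q=a%2B%3Db%26c" yields a single item with key "q" and value
"a+=b&c". Parsing "q=a+=b&c" yields two items: key "q" with value "a =b", and
key "c" with an empty value. This is why a delimiter in encoded form and the
same delimiter in literal form cannot be treated as equivalent.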
333 | |
334 | \sa QUrl::FormattingOptions |
335 | */ |
336 | |
337 | /*! |
338 | \enum QUrl::UserInputResolutionOption |
339 | \since 5.4 |
340 | |
341 | The user input resolution options define how fromUserInput() should |
342 | interpret strings that could either be a relative path or the short |
343 | form of an HTTP URL. For instance \c{file.pl} can be either a local file |
344 | or the URL \c{http://file.pl}. |
345 | |
346 | \value DefaultResolution The default resolution mechanism is to check |
347 | whether a local file exists, in the working |
348 | directory given to fromUserInput, and only |
349 | return a local path in that case. Otherwise a URL |
350 | is assumed. |
351 | \value AssumeLocalFile This option makes fromUserInput() always return |
352 | a local path unless the input contains a scheme, such as |
353 | \c{http://file.pl}. This is useful for applications |
354 | such as text editors, which are able to create |
355 | the file if it doesn't exist. |
356 | |
357 | \sa fromUserInput() |
358 | */ |
359 | |
360 | /*! |
361 | \enum QUrl::AceProcessingOption |
362 | \since 6.3 |
363 | |
364 | The ACE processing options control the way URLs are transformed to and from |
365 | ASCII-Compatible Encoding. |
366 | |
367 | \value IgnoreIDNWhitelist Ignore the IDN whitelist when converting URLs |
368 | to Unicode. |
369 | \value AceTransitionalProcessing Use transitional processing described in UTS #46. |
370 | This allows better compatibility with IDNA 2003 |
371 | specification. |
372 | |
373 | The default is to use nontransitional processing and to allow non-ASCII |
374 | characters only inside URLs whose top-level domains are listed in the IDN whitelist. |
375 | |
376 | \sa toAce(), fromAce(), idnWhitelist() |
377 | */ |
378 | |
379 | /*! |
380 | \fn QUrl::QUrl(QUrl &&other) |
381 | |
382 | Move-constructs a QUrl instance, making it point at the same |
383 | object that \a other was pointing to. |
384 | |
385 | \since 5.2 |
386 | */ |
387 | |
388 | /*! |
389 | \fn QUrl &QUrl::operator=(QUrl &&other) |
390 | |
391 | Move-assigns \a other to this QUrl instance. |
392 | |
393 | \since 5.2 |
394 | */ |
395 | |
396 | #include "qurl.h" |
397 | #include "qurl_p.h" |
398 | #include "qplatformdefs.h" |
399 | #include "qstring.h" |
400 | #include "qstringlist.h" |
401 | #include "qdebug.h" |
402 | #include "qhash.h" |
403 | #include "qdatastream.h" |
404 | #include "private/qipaddress_p.h" |
405 | #include "qurlquery.h" |
406 | #include "private/qdir_p.h" |
407 | #include <private/qtools_p.h> |
408 | |
409 | QT_BEGIN_NAMESPACE |
410 | |
411 | using namespace Qt::StringLiterals; |
412 | using namespace QtMiscUtils; |
413 | |
414 | inline static bool isHex(char c) |
415 | { |
416 | c |= 0x20; |
417 | return isAsciiDigit(c) || (c >= 'a' && c <= 'f'); |
418 | } |
419 | |
420 | static inline QString ftpScheme() |
421 | { |
422 | return QStringLiteral("ftp" ); |
423 | } |
424 | |
425 | static inline QString fileScheme() |
426 | { |
427 | return QStringLiteral("file" ); |
428 | } |
429 | |
430 | static inline QString webDavScheme() |
431 | { |
432 | return QStringLiteral("webdavs" ); |
433 | } |
434 | |
435 | static inline QString webDavSslTag() |
436 | { |
437 | return QStringLiteral("@SSL" ); |
438 | } |
439 | |
440 | class QUrlPrivate |
441 | { |
442 | public: |
443 | enum Section : uchar { |
444 | Scheme = 0x01, |
445 | UserName = 0x02, |
446 | Password = 0x04, |
447 | UserInfo = UserName | Password, |
448 | Host = 0x08, |
449 | Port = 0x10, |
450 | Authority = UserInfo | Host | Port, |
451 | Path = 0x20, |
452 | Hierarchy = Authority | Path, |
453 | Query = 0x40, |
454 | Fragment = 0x80, |
455 | FullUrl = 0xff |
456 | }; |
457 | |
458 | enum Flags : uchar { |
459 | IsLocalFile = 0x01 |
460 | }; |
461 | |
462 | enum ErrorCode { |
463 | // the high byte of the error code matches the Section |
464 | // the first item in each value must be the generic "Invalid xxx Error" |
465 | InvalidSchemeError = Scheme << 8, |
466 | |
467 | InvalidUserNameError = UserName << 8, |
468 | |
469 | InvalidPasswordError = Password << 8, |
470 | |
471 | InvalidRegNameError = Host << 8, |
472 | InvalidIPv4AddressError, |
473 | InvalidIPv6AddressError, |
474 | InvalidCharacterInIPv6Error, |
475 | InvalidIPvFutureError, |
476 | HostMissingEndBracket, |
477 | |
478 | InvalidPortError = Port << 8, |
479 | PortEmptyError, |
480 | |
481 | InvalidPathError = Path << 8, |
482 | |
483 | InvalidQueryError = Query << 8, |
484 | |
485 | InvalidFragmentError = Fragment << 8, |
486 | |
487 | // the following three cases are only possible in combination with |
488 | // presence/absence of the path, authority and scheme. See validityError(). |
489 | AuthorityPresentAndPathIsRelative = Authority << 8 | Path << 8 | 0x10000, |
490 | AuthorityAbsentAndPathIsDoubleSlash, |
491 | RelativeUrlPathContainsColonBeforeSlash = Scheme << 8 | Authority << 8 | Path << 8 | 0x10000, |
492 | |
493 | NoError = 0 |
494 | }; |
495 | |
496 | struct Error { |
497 | QString source; |
498 | qsizetype position; |
499 | ErrorCode code; |
500 | }; |
501 | |
502 | QUrlPrivate(); |
503 | QUrlPrivate(const QUrlPrivate &copy); |
504 | ~QUrlPrivate(); |
505 | |
506 | void parse(const QString &url, QUrl::ParsingMode parsingMode); |
507 | bool isEmpty() const |
508 | { return sectionIsPresent == 0 && port == -1 && path.isEmpty(); } |
509 | |
510 | std::unique_ptr<Error> cloneError() const; |
511 | void clearError(); |
512 | void setError(ErrorCode errorCode, const QString &source, qsizetype supplement = -1); |
513 | ErrorCode validityError(QString *source = nullptr, qsizetype *position = nullptr) const; |
514 | bool validateComponent(Section section, const QString &input, qsizetype begin, qsizetype end); |
515 | bool validateComponent(Section section, const QString &input) |
516 | { return validateComponent(section, input, 0, input.size()); } |
517 | |
518 | // no QString scheme() const; |
519 | void appendAuthority(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
520 | void appendUserInfo(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
521 | void appendUserName(QString &appendTo, QUrl::FormattingOptions options) const; |
522 | void appendPassword(QString &appendTo, QUrl::FormattingOptions options) const; |
523 | void appendHost(QString &appendTo, QUrl::FormattingOptions options) const; |
524 | void appendPath(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
525 | void appendQuery(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
526 | void appendFragment(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
527 | |
528 | // the "end" parameters are like STL iterators: they point to one past the last valid element |
529 | bool setScheme(const QString &value, qsizetype len, bool doSetError); |
530 | void setAuthority(const QString &auth, qsizetype from, qsizetype end, QUrl::ParsingMode mode); |
531 | void setUserInfo(const QString &userInfo, qsizetype from, qsizetype end); |
532 | void setUserName(const QString &value, qsizetype from, qsizetype end); |
533 | void setPassword(const QString &value, qsizetype from, qsizetype end); |
534 | bool setHost(const QString &value, qsizetype from, qsizetype end, QUrl::ParsingMode mode); |
535 | void setPath(const QString &value, qsizetype from, qsizetype end); |
536 | void setQuery(const QString &value, qsizetype from, qsizetype end); |
537 | void setFragment(const QString &value, qsizetype from, qsizetype end); |
538 | |
539 | inline bool hasScheme() const { return sectionIsPresent & Scheme; } |
540 | inline bool hasAuthority() const { return sectionIsPresent & Authority; } |
541 | inline bool hasUserInfo() const { return sectionIsPresent & UserInfo; } |
542 | inline bool hasUserName() const { return sectionIsPresent & UserName; } |
543 | inline bool hasPassword() const { return sectionIsPresent & Password; } |
544 | inline bool hasHost() const { return sectionIsPresent & Host; } |
545 | inline bool hasPort() const { return port != -1; } |
546 | inline bool hasPath() const { return !path.isEmpty(); } |
547 | inline bool hasQuery() const { return sectionIsPresent & Query; } |
548 | inline bool hasFragment() const { return sectionIsPresent & Fragment; } |
549 | |
550 | inline bool isLocalFile() const { return flags & IsLocalFile; } |
551 | QString toLocalFile(QUrl::FormattingOptions options) const; |
552 | |
553 | QString mergePaths(const QString &relativePath) const; |
554 | |
555 | QAtomicInt ref; |
556 | int port; |
557 | |
558 | QString scheme; |
559 | QString userName; |
560 | QString password; |
561 | QString host; |
562 | QString path; |
563 | QString query; |
564 | QString fragment; |
565 | |
566 | std::unique_ptr<Error> error; |
567 | |
568 | // not used for: |
569 | // - Port (port == -1 means absence) |
570 | // - Path (there's no path delimiter, so we optimize its use out of existence) |
571 | // Schemes are never supposed to be empty, but we keep the flag anyway |
572 | uchar sectionIsPresent; |
573 | uchar flags; |
574 | |
575 | // 32-bit: 2 bytes tail padding available |
576 | // 64-bit: 6 bytes tail padding available |
577 | }; |
578 | |
579 | inline QUrlPrivate::QUrlPrivate() |
580 | : ref(1), port(-1), |
581 | sectionIsPresent(0), |
582 | flags(0) |
583 | { |
584 | } |
585 | |
586 | inline QUrlPrivate::QUrlPrivate(const QUrlPrivate &copy) |
587 | : ref(1), port(copy.port), |
588 | scheme(copy.scheme), |
589 | userName(copy.userName), |
590 | password(copy.password), |
591 | host(copy.host), |
592 | path(copy.path), |
593 | query(copy.query), |
594 | fragment(copy.fragment), |
595 | error(copy.cloneError()), |
596 | sectionIsPresent(copy.sectionIsPresent), |
597 | flags(copy.flags) |
598 | { |
599 | } |
600 | |
601 | inline QUrlPrivate::~QUrlPrivate() |
602 | = default; |
603 | |
604 | std::unique_ptr<QUrlPrivate::Error> QUrlPrivate::cloneError() const |
605 | { |
606 | return error ? std::make_unique<Error>(*error) : nullptr; |
607 | } |
608 | |
609 | inline void QUrlPrivate::clearError() |
610 | { |
611 | error.reset(); |
612 | } |
613 | |
614 | inline void QUrlPrivate::setError(ErrorCode errorCode, const QString &source, qsizetype supplement) |
615 | { |
616 | if (error) { |
617 | // don't overwrite an error set in a previous section during parsing |
618 | return; |
619 | } |
620 | error = std::make_unique<Error>(); |
621 | error->code = errorCode; |
622 | error->source = source; |
623 | error->position = supplement; |
624 | } |
625 | |
626 | // From RFC 3986, Appendix A Collected ABNF for URI |
627 | // URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ] |
628 | //[...] |
629 | // scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) |
630 | // |
631 | // authority = [ userinfo "@" ] host [ ":" port ] |
632 | // userinfo = *( unreserved / pct-encoded / sub-delims / ":" ) |
633 | // host = IP-literal / IPv4address / reg-name |
634 | // port = *DIGIT |
635 | //[...] |
636 | // reg-name = *( unreserved / pct-encoded / sub-delims ) |
637 | //[..] |
638 | // pchar = unreserved / pct-encoded / sub-delims / ":" / "@" |
639 | // |
640 | // query = *( pchar / "/" / "?" ) |
641 | // |
642 | // fragment = *( pchar / "/" / "?" ) |
643 | // |
644 | // pct-encoded = "%" HEXDIG HEXDIG |
645 | // |
646 | // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" |
647 | // reserved = gen-delims / sub-delims |
648 | // gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@" |
649 | // sub-delims = "!" / "$" / "&" / "'" / "(" / ")" |
650 | // / "*" / "+" / "," / ";" / "=" |
651 | // the path component has a complex ABNF that basically boils down to |
652 | // slash-separated segments of "pchar" |
653 | |
654 | // The above is the strict definition of the URL components and we mostly |
655 | // adhere to it, with a few exceptions. QUrl obeys the following behavior: |
656 | // - percent-encoding sequences always use uppercase HEXDIG; |
657 | // - unreserved characters are *always* decoded, no exceptions; |
658 | // - the space character and bytes with the high bit set are controlled by |
659 | // the EncodeSpaces and EncodeUnicode bits; |
660 | // - control characters, the percent sign itself, and bytes with the high |
661 | // bit set that don't form valid UTF-8 sequences are always encoded, |
662 | // except in FullyDecoded mode; |
663 | // - sub-delims are always left alone, except in FullyDecoded mode; |
664 | // - gen-delims change behavior depending on which section of the URL (or |
665 | // the entire URL) we're looking at; see below; |
666 | // - characters not mentioned above, like "<" and ">", are usually |
667 | // decoded in individual sections of the URL, but encoded when the full |
668 | // URL is put together (this may change with our subjective definition of |
669 | // "pretty"). |
670 | // |
671 | // The behavior for the delimiters bears some explanation. The spec says in |
672 | // section 2.2: |
673 | // URIs that differ in the replacement of a reserved character with its |
674 | // corresponding percent-encoded octet are not equivalent. |
675 | // (note: QUrl API mistakenly uses the "reserved" term, so we will refer to |
676 | // them here as "delimiters"). |
677 | // |
678 | // For that reason, we cannot encode delimiters found in decoded form and we |
679 | // cannot decode the ones found in encoded form if that would change the |
680 | // interpretation. Conversely, we *can* perform the transformation if it would |
681 | // not change the interpretation. From the last component of a URL to the first, |
682 | // here are the gen-delims we can unambiguously transform when the field is |
683 | // taken in isolation: |
684 | // - fragment: none, since it's the last |
685 | // - query: "#" is unambiguous |
686 | // - path: "#" and "?" are unambiguous |
687 | // - host: completely special but never ambiguous, see setHost() below. |
688 | // - password: the "#", "?", "/", "[", "]" and "@" characters are unambiguous |
689 | // - username: the "#", "?", "/", "[", "]", "@", and ":" characters are unambiguous |
690 | // - scheme: doesn't accept any delimiter, see setScheme() below. |
691 | // |
692 | // Internally, QUrl stores each component in the format that corresponds to the |
693 | // default mode (PrettyDecoded). It deviates from the "strict" FullyEncoded |
694 | // mode in the following way: |
695 | // - spaces are decoded |
696 | // - valid UTF-8 sequences are decoded |
697 | // - gen-delims that can be unambiguously transformed are decoded |
698 | // - characters controlled by DecodeReserved are often decoded, though this behavior |
699 | // can change depending on the subjective definition of "pretty" |
700 | // |
701 | // Note that the list of gen-delims that we can transform is different for the |
702 | // user info (user name + password) and the authority (user info + host + |
703 | // port). |
704 | |
705 | |
706 | // list the recoding table modifications to be used with the recodeFromUser and |
707 | // appendToUser functions, according to the rules above. Spaces and UTF-8 |
708 | // sequences are handled outside the tables. |
709 | |
710 | // the encodedXXX tables are run with the delimiters set to "leave" by default; |
711 | // the decodedXXX tables are run with the delimiters set to "decode" by default |
712 | // (except for the query, which doesn't use these functions) |
713 | |
714 | namespace { |
715 | template <typename T> constexpr ushort decode(T x) noexcept { return ushort(x); } |
716 | template <typename T> constexpr ushort leave(T x) noexcept { return ushort(0x100 | x); } |
717 | template <typename T> constexpr ushort encode(T x) noexcept { return ushort(0x200 | x); } |
718 | } |
719 | |
static const ushort userNameInIsolation[] = {
    decode(':'), // 0
    decode('@'), // 1
    decode(']'), // 2
    decode('['), // 3
    decode('/'), // 4
    decode('?'), // 5
    decode('#'), // 6

    decode('"'), // 7
    decode('<'),
    decode('>'),
    decode('^'),
    decode('\\'),
    decode('|'),
    decode('{'),
    decode('}'),
    0
};
739 | static const ushort * const passwordInIsolation = userNameInIsolation + 1; |
740 | static const ushort * const pathInIsolation = userNameInIsolation + 5; |
741 | static const ushort * const queryInIsolation = userNameInIsolation + 6; |
742 | static const ushort * const fragmentInIsolation = userNameInIsolation + 7; |
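
/*
Illustration only, not part of QUrl: a minimal standalone sketch of how one
zero-terminated action table can serve several components through pointer
offsets, as the aliases above do. The names decodeAction, encodeAction,
demoUserName, demoPassword and actionFor are hypothetical. Only the bit layout
(character in the low byte, 0x100 for "leave", 0x200 for "encode") is taken
from the decode()/leave()/encode() helpers.

```cpp
#include <cassert>
#include <cstdint>

// Same layout as the helpers above: the low byte is the character, the bits
// above it select the action (0 = decode, 0x200 = encode).
constexpr std::uint16_t decodeAction(char c)
{
    return std::uint16_t(static_cast<unsigned char>(c));
}
constexpr std::uint16_t encodeAction(char c)
{
    return std::uint16_t(0x200 | static_cast<unsigned char>(c));
}

// Hypothetical miniature table, zero-terminated like the real ones.
static const std::uint16_t demoUserName[] = {
    decodeAction(':'), decodeAction('@'), encodeAction('#'), 0
};
// Starting one entry later yields a table with no rule for ':'.
static const std::uint16_t *const demoPassword = demoUserName + 1;

// Hypothetical lookup: returns the action bits listed for c, or -1 when the
// table has no entry and the caller's default handling applies.
int actionFor(const std::uint16_t *table, char c)
{
    for (; *table; ++table) {
        if ((*table & 0xff) == static_cast<unsigned char>(c))
            return *table & 0x300;
    }
    return -1;
}
```

With these definitions, actionFor(demoUserName, ':') finds the decode rule,
while actionFor(demoPassword, ':') reports that no rule is listed.
*/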
743 | |
static const ushort userNameInUserInfo[] = {
    encode(':'), // 0
    decode('@'), // 1
    decode(']'), // 2
    decode('['), // 3
    decode('/'), // 4
    decode('?'), // 5
    decode('#'), // 6

    decode('"'), // 7
    decode('<'),
    decode('>'),
    decode('^'),
    decode('\\'),
    decode('|'),
    decode('{'),
    decode('}'),
    0
};
763 | static const ushort * const passwordInUserInfo = userNameInUserInfo + 1; |
764 | |
static const ushort userNameInAuthority[] = {
    encode(':'), // 0
    encode('@'), // 1
    encode(']'), // 2
    encode('['), // 3
    decode('/'), // 4
    decode('?'), // 5
    decode('#'), // 6

    decode('"'), // 7
    decode('<'),
    decode('>'),
    decode('^'),
    decode('\\'),
    decode('|'),
    decode('{'),
    decode('}'),
    0
};
784 | static const ushort * const passwordInAuthority = userNameInAuthority + 1; |
785 | |
static const ushort userNameInUrl[] = {
    encode(':'), // 0
    encode('@'), // 1
    encode(']'), // 2
    encode('['), // 3
    encode('/'), // 4
    encode('?'), // 5
    encode('#'), // 6

    // no need to list encode(x) for the other characters
    0
};
798 | static const ushort * const passwordInUrl = userNameInUrl + 1; |
799 | static const ushort * const pathInUrl = userNameInUrl + 5; |
800 | static const ushort * const queryInUrl = userNameInUrl + 6; |
801 | static const ushort * const fragmentInUrl = userNameInUrl + 6; |
802 | |
static inline void parseDecodedComponent(QString &data)
{
    data.replace(u'%', "%25"_L1);
}

static inline QString
recodeFromUser(const QString &input, const ushort *actions, qsizetype from, qsizetype to)
{
    QString output;
    const QChar *begin = input.constData() + from;
    const QChar *end = input.constData() + to;
    if (qt_urlRecode(output, QStringView{begin, end}, {}, actions))
        return output;

    return input.mid(from, to - from);
}
819 | |
820 | // appendXXXX functions: copy from the internal form to the external, user form. |
821 | // the internal value is stored in its PrettyDecoded form, so that case is easy. |
822 | static inline void appendToUser(QString &appendTo, QStringView value, QUrl::FormattingOptions options, |
823 | const ushort *actions) |
824 | { |
825 | // The stored value is already QUrl::PrettyDecoded, so there's nothing to |
826 | // do if that's what the user asked for (test only |
827 | // ComponentFormattingOptions, ignore FormattingOptions). |
    if ((options & 0xFFFF0000) == QUrl::PrettyDecoded ||
            !qt_urlRecode(appendTo, value, options, actions))
        appendTo += value;
831 | |
832 | // copy nullness, if necessary, because QString::operator+=(QStringView) doesn't |
833 | if (appendTo.isNull() && !value.isNull()) |
834 | appendTo.detach(); |
835 | } |
836 | |
837 | inline void QUrlPrivate::appendAuthority(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const |
838 | { |
839 | if ((options & QUrl::RemoveUserInfo) != QUrl::RemoveUserInfo) { |
840 | appendUserInfo(appendTo, options, appendingTo); |
841 | |
842 | // add '@' only if we added anything |
843 | if (hasUserName() || (hasPassword() && (options & QUrl::RemovePassword) == 0)) |
844 | appendTo += u'@'; |
845 | } |
846 | appendHost(appendTo, options); |
847 | if (!(options & QUrl::RemovePort) && port != -1) |
848 | appendTo += u':' + QString::number(port); |
849 | } |
850 | |
851 | inline void QUrlPrivate::appendUserInfo(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const |
852 | { |
853 | if (Q_LIKELY(!hasUserInfo())) |
854 | return; |
855 | |
856 | const ushort *userNameActions; |
857 | const ushort *passwordActions; |
858 | if (options & QUrl::EncodeDelimiters) { |
859 | userNameActions = userNameInUrl; |
860 | passwordActions = passwordInUrl; |
861 | } else { |
862 | switch (appendingTo) { |
863 | case UserInfo: |
864 | userNameActions = userNameInUserInfo; |
865 | passwordActions = passwordInUserInfo; |
866 | break; |
867 | |
868 | case Authority: |
869 | userNameActions = userNameInAuthority; |
870 | passwordActions = passwordInAuthority; |
871 | break; |
872 | |
873 | case FullUrl: |
874 | userNameActions = userNameInUrl; |
875 | passwordActions = passwordInUrl; |
876 | break; |
877 | |
878 | default: |
879 | // can't happen |
880 | Q_UNREACHABLE(); |
881 | break; |
882 | } |
883 | } |
884 | |
    if (!qt_urlRecode(appendTo, userName, options, userNameActions))
        appendTo += userName;
    if (options & QUrl::RemovePassword || !hasPassword()) {
        return;
    } else {
        appendTo += u':';
        if (!qt_urlRecode(appendTo, password, options, passwordActions))
            appendTo += password;
    }
894 | } |
895 | |
896 | inline void QUrlPrivate::appendUserName(QString &appendTo, QUrl::FormattingOptions options) const |
897 | { |
898 | // only called from QUrl::userName() |
    appendToUser(appendTo, userName, options,
                 options & QUrl::EncodeDelimiters ? userNameInUrl : userNameInIsolation);
901 | } |
902 | |
903 | inline void QUrlPrivate::appendPassword(QString &appendTo, QUrl::FormattingOptions options) const |
904 | { |
905 | // only called from QUrl::password() |
    appendToUser(appendTo, password, options,
                 options & QUrl::EncodeDelimiters ? passwordInUrl : passwordInIsolation);
908 | } |
909 | |
910 | inline void QUrlPrivate::appendPath(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const |
911 | { |
    QString thePath = path;
    if (options & QUrl::NormalizePathSegments) {
        qt_normalizePathSegments(
                &thePath,
                isLocalFile() ? QDirPrivate::KeepLocalTrailingSlash : QDirPrivate::RemotePath);
    }

    QStringView thePathView(thePath);
    if (options & QUrl::RemoveFilename) {
        const qsizetype slash = thePathView.lastIndexOf(u'/');
        if (slash == -1)
            return;
        thePathView = thePathView.left(slash + 1);
    }
    // check if we need to remove trailing slashes
    if (options & QUrl::StripTrailingSlash) {
        while (thePathView.size() > 1 && thePathView.endsWith(u'/'))
            thePathView.chop(1);
    }

    appendToUser(appendTo, thePathView, options,
                 appendingTo == FullUrl || options & QUrl::EncodeDelimiters ? pathInUrl : pathInIsolation);
934 | } |
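
/*
Illustration only, not part of QUrl: a standalone sketch of the two path
trimming steps that appendPath() applies for QUrl::RemoveFilename and
QUrl::StripTrailingSlash. It uses std::string_view instead of QStringView,
and the function names removeFilename and stripTrailingSlashes are
hypothetical.

```cpp
#include <cassert>
#include <string_view>

// Keeps everything up to and including the last '/'. A path without any '/'
// is all filename, so nothing remains.
std::string_view removeFilename(std::string_view path)
{
    const std::string_view::size_type slash = path.rfind('/');
    if (slash == std::string_view::npos)
        return {};
    return path.substr(0, slash + 1);
}

// Drops trailing slashes but never shortens the path below one character,
// so the root path "/" survives.
std::string_view stripTrailingSlashes(std::string_view path)
{
    while (path.size() > 1 && path.back() == '/')
        path.remove_suffix(1);
    return path;
}
```

Applying removeFilename() before stripTrailingSlashes(), in the same order as
appendPath(), turns "/a/b.txt" into "/a/" and then into "/a".
*/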
935 | |
936 | inline void QUrlPrivate::appendFragment(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const |
937 | { |
    appendToUser(appendTo, fragment, options,
                 options & QUrl::EncodeDelimiters ? fragmentInUrl :
                 appendingTo == FullUrl ? nullptr : fragmentInIsolation);
941 | } |
942 | |
943 | inline void QUrlPrivate::appendQuery(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const |
944 | { |
    appendToUser(appendTo, query, options,
                 appendingTo == FullUrl || options & QUrl::EncodeDelimiters ? queryInUrl : queryInIsolation);
947 | } |
948 | |
949 | // setXXX functions |
950 | |
951 | inline bool QUrlPrivate::setScheme(const QString &value, qsizetype len, bool doSetError) |
952 | { |
953 | // schemes are strictly RFC-compliant: |
954 | // scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) |
955 | // we also lowercase the scheme |
956 | |
957 | // schemes in URLs are not allowed to be empty, but they can be in |
958 | // "Relative URIs" which QUrl also supports. QUrl::setScheme does |
959 | // not call us with len == 0, so this can only be from parse() |
960 | scheme.clear(); |
961 | if (len == 0) |
962 | return false; |
963 | |
964 | sectionIsPresent |= Scheme; |
965 | |
    // validate it:
    qsizetype needsLowercasing = -1;
    const ushort *p = reinterpret_cast<const ushort *>(value.data());
    for (qsizetype i = 0; i < len; ++i) {
        if (isAsciiLower(p[i]))
            continue;
        if (isAsciiUpper(p[i])) {
            needsLowercasing = i;
            continue;
        }
        if (i) {
            if (isAsciiDigit(p[i]))
                continue;
            if (p[i] == '+' || p[i] == '-' || p[i] == '.')
                continue;
        }

        // found something else
        // don't call setError needlessly:
        // if we've been called from parse(), it will try to recover
        if (doSetError)
            setError(InvalidSchemeError, value, i);
        return false;
    }

    scheme = value.left(len);

    if (needsLowercasing != -1) {
        // schemes are ASCII only, so we don't need the full Unicode toLower
        QChar *schemeData = scheme.data(); // force detaching here
        for (qsizetype i = needsLowercasing; i >= 0; --i) {
            ushort c = schemeData[i].unicode();
            if (isAsciiUpper(c))
                schemeData[i] = QChar(c + 0x20);
        }
    }
1002 | |
1003 | // did we set to the file protocol? |
1004 | if (scheme == fileScheme() |
1005 | #ifdef Q_OS_WIN |
1006 | || scheme == webDavScheme() |
1007 | #endif |
1008 | ) { |
1009 | flags |= IsLocalFile; |
1010 | } else { |
1011 | flags &= ~IsLocalFile; |
1012 | } |
1013 | return true; |
1014 | } |
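
/*
Illustration only, not part of QUrl: a standalone sketch of the scheme rule
that setScheme() enforces, written against std::string. The function name
normalizeScheme is hypothetical. Unlike setScheme(), it lowercases while
validating instead of remembering the last uppercase position, and it reports
failure through std::optional rather than through setError().

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Returns the lowercased scheme, or std::nullopt if the input is empty or
// violates
//   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
std::optional<std::string> normalizeScheme(std::string_view input)
{
    if (input.empty())
        return std::nullopt;

    std::string result;
    result.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        const char c = input[i];
        if (c >= 'a' && c <= 'z') {
            result += c;
        } else if (c >= 'A' && c <= 'Z') {
            result += char(c + 0x20);   // ASCII-only lowercasing
        } else if (i > 0 && ((c >= '0' && c <= '9') || c == '+' || c == '-' || c == '.')) {
            result += c;                // allowed only after the first letter
        } else {
            return std::nullopt;        // invalid character at position i
        }
    }
    return result;
}
```

Under these assumptions "HTTP" and "svn+ssh" are accepted, while "1http" is
rejected because a scheme must start with a letter.
*/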
1015 | |
1016 | inline void QUrlPrivate::setAuthority(const QString &auth, qsizetype from, qsizetype end, QUrl::ParsingMode mode) |
1017 | { |
1018 | sectionIsPresent &= ~Authority; |
1019 | port = -1; |
1020 | if (from == end && !auth.isNull()) |
1021 | sectionIsPresent |= Host; // empty but not null authority implies host |
1022 | |
    // we never actually _loop_
    while (from != end) {
        qsizetype userInfoIndex = auth.indexOf(u'@', from);
        if (size_t(userInfoIndex) < size_t(end)) {
            setUserInfo(auth, from, userInfoIndex);
            if (mode == QUrl::StrictMode && !validateComponent(UserInfo, auth, from, userInfoIndex))
                break;
            from = userInfoIndex + 1;
        }

        qsizetype colonIndex = auth.lastIndexOf(u':', end - 1);
        if (colonIndex < from)
            colonIndex = -1;

        if (size_t(colonIndex) < size_t(end)) {
            if (auth.at(from).unicode() == '[') {
                // check if colonIndex isn't inside the "[...]" part
                qsizetype closingBracket = auth.indexOf(u']', from);
                if (size_t(closingBracket) > size_t(colonIndex))
                    colonIndex = -1;
            }
        }

        if (size_t(colonIndex) < size_t(end) - 1) {
            // found a colon with digits after it
            unsigned long x = 0;
            for (qsizetype i = colonIndex + 1; i < end; ++i) {
                ushort c = auth.at(i).unicode();
                if (isAsciiDigit(c)) {
                    x *= 10;
                    x += c - '0';
                } else {
                    x = ulong(-1); // x != ushort(x)
                    break;
                }
            }
            if (x == ushort(x)) {
                port = ushort(x);
            } else {
                setError(InvalidPortError, auth, colonIndex + 1);
                if (mode == QUrl::StrictMode)
                    break;
            }
        }

        setHost(auth, from, qMin<size_t>(end, colonIndex), mode);
        if (mode == QUrl::StrictMode && !validateComponent(Host, auth, from, qMin<size_t>(end, colonIndex))) {
            // clear host too
            sectionIsPresent &= ~Authority;
            break;
        }

        // success
        return;
    }
1078 | // clear all sections but host |
1079 | sectionIsPresent &= ~Authority | Host; |
1080 | userName.clear(); |
1081 | password.clear(); |
1082 | host.clear(); |
1083 | port = -1; |
1084 | } |
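
/*
Illustration only, not part of QUrl: a standalone sketch of the port check
performed inside setAuthority(). The function name parsePort is hypothetical.
Two simplifications are assumed: an empty string is rejected here, whereas
the code above skips port parsing entirely when nothing follows the colon
and leaves port at -1, and the loop bails out as soon as the value exceeds
16 bits instead of comparing against ushort(x) afterwards.

```cpp
#include <cassert>
#include <string_view>

// Parses the text that follows the ':' of an authority.
// Returns the port (0..65535), or -1 if the text is empty, contains a
// non-digit, or does not fit in 16 bits.
int parsePort(std::string_view digits)
{
    if (digits.empty())
        return -1;

    unsigned long value = 0;
    for (char c : digits) {
        if (c < '0' || c > '9')
            return -1;                      // not a decimal digit
        value = value * 10 + static_cast<unsigned long>(c - '0');
        if (value > 0xFFFF)
            return -1;                      // larger than a 16-bit port
    }
    return static_cast<int>(value);
}
```

Stopping at the first out-of-range value also keeps the accumulator from
overflowing on very long digit strings.
*/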
1085 | |
inline void QUrlPrivate::setUserInfo(const QString &userInfo, qsizetype from, qsizetype end)
{
    qsizetype delimIndex = userInfo.indexOf(u':', from);
    setUserName(userInfo, from, qMin<size_t>(delimIndex, end));

    if (size_t(delimIndex) >= size_t(end)) {
        password.clear();
        sectionIsPresent &= ~Password;
    } else {
        setPassword(userInfo, delimIndex + 1, end);
    }
}

inline void QUrlPrivate::setUserName(const QString &value, qsizetype from, qsizetype end)
{
    sectionIsPresent |= UserName;
    userName = recodeFromUser(value, userNameInIsolation, from, end);
}

inline void QUrlPrivate::setPassword(const QString &value, qsizetype from, qsizetype end)
{
    sectionIsPresent |= Password;
    password = recodeFromUser(value, passwordInIsolation, from, end);
}

inline void QUrlPrivate::setPath(const QString &value, qsizetype from, qsizetype end)
{
    // sectionIsPresent |= Path; // not used, save some cycles
    path = recodeFromUser(value, pathInIsolation, from, end);
}

inline void QUrlPrivate::setFragment(const QString &value, qsizetype from, qsizetype end)
{
    sectionIsPresent |= Fragment;
    fragment = recodeFromUser(value, fragmentInIsolation, from, end);
}

inline void QUrlPrivate::setQuery(const QString &value, qsizetype from, qsizetype iend)
{
    sectionIsPresent |= Query;
    query = recodeFromUser(value, queryInIsolation, from, iend);
}
1128 | |
1129 | // Host handling |
1130 | // The RFC says the host is: |
1131 | // host = IP-literal / IPv4address / reg-name |
1132 | // IP-literal = "[" ( IPv6address / IPvFuture ) "]" |
1133 | // IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" ) |
1134 | // [a strict definition of IPv6Address and IPv4Address] |
1135 | // reg-name = *( unreserved / pct-encoded / sub-delims ) |
1136 | // |
1137 | // We deviate from the standard in all but IPvFuture. For IPvFuture we accept |
1138 | // and store only exactly what the RFC says we should. No percent-encoding is |
1139 | // permitted in this field, so Unicode characters and space aren't either. |
1140 | // |
// For IPv4 addresses, we accept broken addresses like inet_aton does (that is,
// fewer than three dots). However, we correct the address to the proper form
// and store the corrected address. After correction, it complies with the RFC
// and is composed exclusively of unreserved characters.
//
// For IPv6 addresses, we accept addresses including trailing (embedded) IPv4
// addresses, the so-called v4-compat and v4-mapped addresses. We also store
// those addresses like that in the hostname field, which violates the spec.
// IPv6 hosts are stored with the square brackets in the QString and require
// no transformation of any kind.
1151 | // |
1152 | // As for registered names, it's the other way around: we accept only valid |
1153 | // hostnames as specified by STD 3 and IDNA. That means everything we accept is |
1154 | // valid in the RFC definition above, but there are many valid reg-names |
1155 | // according to the RFC that we do not accept in the name of security. Since we |
1156 | // do accept IDNA, reg-names are subject to ACE encoding and decoding, which is |
1157 | // specified by the DecodeUnicode flag. The hostname is stored in its Unicode form. |
1158 | |
1159 | inline void QUrlPrivate::appendHost(QString &appendTo, QUrl::FormattingOptions options) const |
1160 | { |
1161 | if (host.isEmpty()) { |
1162 | if ((sectionIsPresent & Host) && appendTo.isNull()) |
1163 | appendTo.detach(); |
1164 | return; |
1165 | } |
    if (host.at(0).unicode() == '[') {
        // IPv6 addresses might contain a zone-id which needs to be recoded
        if (options != 0)
            if (qt_urlRecode(appendTo, host, options, nullptr))
                return;
        appendTo += host;
    } else {
        // this is either an IPv4Address or a reg-name
        // if it is a reg-name, it is already stored in Unicode form
        if (options & QUrl::EncodeUnicode && !(options & 0x4000000))
            appendTo += qt_ACE_do(host, ToAceOnly, AllowLeadingDot, {});
        else
            appendTo += host;
    }
1180 | } |
1181 | |
1182 | // the whole IPvFuture is passed and parsed here, including brackets; |
1183 | // returns null if the parsing was successful, or the QChar of the first failure |
1184 | static const QChar *parseIpFuture(QString &host, const QChar *begin, const QChar *end, QUrl::ParsingMode mode) |
1185 | { |
    // IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
    static const char acceptable[] =
            "!$&'()*+,;=" // sub-delims
            ":" // ":"
            "-._~"; // unreserved

    // the brackets and the "v" have been checked
    const QChar *const origBegin = begin;
    if (begin[3].unicode() != '.')
        return &begin[3];
    if (isHexDigit(begin[2].unicode())) {
        // this is so unlikely that we'll just go down the slow path
        // decode the whole string, skipping the "[vH." and "]" which we already know to be there
        host += QStringView(begin, 4);

        // uppercase the version, if necessary
        if (begin[2].unicode() >= 'a')
            host[host.size() - 2] = QChar{begin[2].unicode() - 0x20};

        begin += 4;
        --end;

        QString decoded;
        if (mode == QUrl::TolerantMode && qt_urlRecode(decoded, QStringView{begin, end}, QUrl::FullyDecoded, nullptr)) {
            begin = decoded.constBegin();
            end = decoded.constEnd();
        }

        for ( ; begin != end; ++begin) {
            if (isAsciiLetterOrNumber(begin->unicode()))
                host += *begin;
            else if (begin->unicode() < 0x80 && strchr(acceptable, begin->unicode()) != nullptr)
                host += *begin;
            else
                return decoded.isEmpty() ? begin : &origBegin[2];
        }
        host += u']';
        return nullptr;
    }
    return &origBegin[2];
1226 | } |
1227 | |
1228 | // ONLY the IPv6 address is parsed here, WITHOUT the brackets |
1229 | static const QChar *parseIp6(QString &host, const QChar *begin, const QChar *end, QUrl::ParsingMode mode) |
1230 | { |
    QStringView decoded(begin, end);
    QString decodedBuffer;
    if (mode == QUrl::TolerantMode) {
        // this struct is kept in automatic storage because it's only 4 bytes
        const ushort decodeColon[] = { decode(':'), 0 };
        if (qt_urlRecode(decodedBuffer, decoded, QUrl::ComponentFormattingOption::PrettyDecoded, decodeColon))
            decoded = decodedBuffer;
    }

    const QStringView zoneIdIdentifier(u"%25");
    QIPAddressUtils::IPv6Address address;
    QStringView zoneId;

    qsizetype zoneIdPosition = decoded.indexOf(zoneIdIdentifier);
    if ((zoneIdPosition != -1) && (decoded.lastIndexOf(zoneIdIdentifier) == zoneIdPosition)) {
        zoneId = decoded.mid(zoneIdPosition + zoneIdIdentifier.size());
        decoded.truncate(zoneIdPosition);

        // was there anything after the zone ID separator?
        if (zoneId.isEmpty())
            return end;
    }

    // did the address become empty after removing the zone ID?
    // (it might have always been empty)
    if (decoded.isEmpty())
        return end;

    const QChar *ret = QIPAddressUtils::parseIp6(address, decoded.constBegin(), decoded.constEnd());
    if (ret)
        return begin + (ret - decoded.constBegin());

    host.reserve(host.size() + (end - begin) + 2); // +2 for the brackets
    host += u'[';
    QIPAddressUtils::toString(host, address);

    if (!zoneId.isEmpty()) {
        host += zoneIdIdentifier;
        host += zoneId;
    }
    host += u']';
    return nullptr;
1273 | } |
1274 | |
1275 | inline bool |
1276 | QUrlPrivate::setHost(const QString &value, qsizetype from, qsizetype iend, QUrl::ParsingMode mode) |
1277 | { |
1278 | const QChar *begin = value.constData() + from; |
1279 | const QChar *end = value.constData() + iend; |
1280 | |
1281 | const qsizetype len = end - begin; |
1282 | host.clear(); |
1283 | sectionIsPresent &= ~Host; |
1284 | if (!value.isNull() || (sectionIsPresent & Authority)) |
1285 | sectionIsPresent |= Host; |
1286 | if (len == 0) |
1287 | return true; |
1288 | |
    if (begin[0].unicode() == '[') {
        // IPv6Address or IPvFuture
        // smallest IPv6 address is "[::]" (len = 4)
        // smallest IPvFuture address is "[v7.X]" (len = 6)
        if (end[-1].unicode() != ']') {
            setError(HostMissingEndBracket, value);
            return false;
        }

        if (len > 5 && begin[1].unicode() == 'v') {
            const QChar *c = parseIpFuture(host, begin, end, mode);
            if (c)
                setError(InvalidIPvFutureError, value, c - value.constData());
            return !c;
        } else if (begin[1].unicode() == 'v') {
            setError(InvalidIPvFutureError, value, from);
        }

        const QChar *c = parseIp6(host, begin + 1, end - 1, mode);
        if (!c)
            return true;

        if (c == end - 1)
            setError(InvalidIPv6AddressError, value, from);
        else
            setError(InvalidCharacterInIPv6Error, value, c - value.constData());
        return false;
    }

    // check if it's an IPv4 address
    QIPAddressUtils::IPv4Address ip4;
    if (QIPAddressUtils::parseIp4(ip4, begin, end)) {
        // yes, it was
        QIPAddressUtils::toString(host, ip4);
        return true;
    }
1325 | |
    // This is probably a reg-name.
    // But it can also be an encoded string that, when decoded, becomes one
    // of the types above.
1329 | // |
1330 | // Two types of encoding are possible: |
1331 | // percent encoding (e.g., "%31%30%2E%30%2E%30%2E%31" -> "10.0.0.1") |
1332 | // Unicode encoding (some non-ASCII characters case-fold to digits |
1333 | // when nameprepping is done) |
1334 | // |
1335 | // The qt_ACE_do function below does IDNA normalization and the STD3 check. |
1336 | // That means a Unicode string may become an IPv4 address, but it cannot |
1337 | // produce a '[' or a '%'. |
1338 | |
    // check for percent-encoding first
    QString s;
    if (mode == QUrl::TolerantMode && qt_urlRecode(s, QStringView{begin, end}, { }, nullptr)) {
        // something was decoded
        // anything encoded left?
        qsizetype pos = s.indexOf(QChar(0x25)); // '%'
        if (pos != -1) {
            setError(InvalidRegNameError, s, pos);
            return false;
        }

        // recurse
        return setHost(s, 0, s.size(), QUrl::StrictMode);
    }

    s = qt_ACE_do(value.mid(from, iend - from), NormalizeAce, ForbidLeadingDot, {});
    if (s.isEmpty()) {
        setError(InvalidRegNameError, value);
        return false;
    }

    // check IPv4 again
    if (QIPAddressUtils::parseIp4(ip4, s.constBegin(), s.constEnd())) {
        QIPAddressUtils::toString(host, ip4);
    } else {
        host = s;
    }
    return true;
1367 | } |
1368 | |
1369 | inline void QUrlPrivate::parse(const QString &url, QUrl::ParsingMode parsingMode) |
1370 | { |
1371 | // URI-reference = URI / relative-ref |
1372 | // URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ] |
1373 | // relative-ref = relative-part [ "?" query ] [ "#" fragment ] |
1374 | // hier-part = "//" authority path-abempty |
1375 | // / other path types |
1376 | // relative-part = "//" authority path-abempty |
1377 | // / other path types here |
1378 | |
1379 | sectionIsPresent = 0; |
1380 | flags = 0; |
1381 | clearError(); |
1382 | |
1383 | // find the important delimiters |
1384 | qsizetype colon = -1; |
1385 | qsizetype question = -1; |
1386 | qsizetype hash = -1; |
1387 | const qsizetype len = url.size(); |
1388 | const QChar *const begin = url.constData(); |
1389 | const ushort *const data = reinterpret_cast<const ushort *>(begin); |
1390 | |
1391 | for (qsizetype i = 0; i < len; ++i) { |
1392 | size_t uc = data[i]; |
1393 | if (uc == '#' && hash == -1) { |
1394 | hash = i; |
1395 | |
1396 | // nothing more to be found |
1397 | break; |
1398 | } |
1399 | |
1400 | if (question == -1) { |
1401 | if (uc == ':' && colon == -1) |
1402 | colon = i; |
1403 | else if (uc == '?') |
1404 | question = i; |
1405 | } |
1406 | } |
1407 | |
    // check if we have a scheme
    qsizetype hierStart;
    if (colon != -1 && setScheme(url, colon, /* don't set error */ false)) {
        hierStart = colon + 1;
    } else {
        // recover from a failed scheme: it might not have been a scheme at all
        scheme.clear();
        sectionIsPresent = 0;
        hierStart = 0;
    }

    qsizetype pathStart;
    qsizetype hierEnd = qMin<size_t>(qMin<size_t>(question, hash), len);
    if (hierEnd - hierStart >= 2 && data[hierStart] == '/' && data[hierStart + 1] == '/') {
        // we have an authority, it ends at the first slash after these
        qsizetype authorityEnd = hierEnd;
        for (qsizetype i = hierStart + 2; i < authorityEnd ; ++i) {
            if (data[i] == '/') {
                authorityEnd = i;
                break;
            }
        }

        setAuthority(url, hierStart + 2, authorityEnd, parsingMode);

        // even if we failed to set the authority properly, let's try to recover
        pathStart = authorityEnd;
        setPath(url, pathStart, hierEnd);
    } else {
        userName.clear();
        password.clear();
        host.clear();
        port = -1;
        pathStart = hierStart;

        if (hierStart < hierEnd)
            setPath(url, hierStart, hierEnd);
        else
            path.clear();
    }

    if (size_t(question) < size_t(hash))
        setQuery(url, question + 1, qMin<size_t>(hash, len));

    if (hash != -1)
        setFragment(url, hash + 1, len);
1454 | |
1455 | if (error || parsingMode == QUrl::TolerantMode) |
1456 | return; |
1457 | |
1458 | // The parsing so far was partially tolerant of errors, except for the |
1459 | // scheme parser (which is always strict) and the authority (which was |
1460 | // executed in strict mode). |
1461 | // If we haven't found any errors so far, continue the strict-mode parsing |
1462 | // from the path component onwards. |
1463 | |
    if (!validateComponent(Path, url, pathStart, hierEnd))
        return;
    if (size_t(question) < size_t(hash) && !validateComponent(Query, url, question + 1, qMin<size_t>(hash, len)))
        return;
    if (hash != -1)
        validateComponent(Fragment, url, hash + 1, len);
1470 | } |
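
/*
Illustration only, not part of QUrl: a standalone sketch of the delimiter
scan at the start of parse(). The struct Delimiters and the function
findDelimiters are hypothetical names. The loop records the first ':' and
the first '?' seen before any '#', and stops at the first '#' because
everything after it belongs to the fragment.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

struct Delimiters
{
    int colon = -1;     // first ':' before any '?' or '#'
    int question = -1;  // first '?' before any '#'
    int hash = -1;      // first '#'
};

Delimiters findDelimiters(std::string_view url)
{
    Delimiters d;
    for (std::size_t i = 0; i < url.size(); ++i) {
        const char c = url[i];
        if (c == '#') {
            d.hash = int(i);
            break;                      // nothing more to be found
        }
        if (d.question == -1) {
            if (c == ':' && d.colon == -1)
                d.colon = int(i);
            else if (c == '?')
                d.question = int(i);
        }
    }
    return d;
}
```

A ':' found this way is only a scheme candidate: parse() still passes the
text before it to setScheme() and falls back to treating the input as a
relative reference if validation fails.
*/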
1471 | |
1472 | QString QUrlPrivate::toLocalFile(QUrl::FormattingOptions options) const |
1473 | { |
1474 | QString tmp; |
1475 | QString ourPath; |
    appendPath(ourPath, options, QUrlPrivate::Path);

    // magic for shared drives on Windows
1479 | if (!host.isEmpty()) { |
1480 | tmp = "//"_L1 + host; |
1481 | #ifdef Q_OS_WIN // QTBUG-42346, WebDAV is visible as local file on Windows only. |
1482 | if (scheme == webDavScheme()) |
1483 | tmp += webDavSslTag(); |
1484 | #endif |
        if (!ourPath.isEmpty() && !ourPath.startsWith(u'/'))
            tmp += u'/';
1487 | tmp += ourPath; |
1488 | } else { |
1489 | tmp = ourPath; |
1490 | #ifdef Q_OS_WIN |
1491 | // magic for drives on windows |
1492 | if (ourPath.length() > 2 && ourPath.at(0) == u'/' && ourPath.at(2) == u':') |
1493 | tmp.remove(0, 1); |
1494 | #endif |
1495 | } |
1496 | return tmp; |
1497 | } |
1498 | |
1499 | /* |
1500 | From http://www.ietf.org/rfc/rfc3986.txt, 5.2.3: Merge paths |
1501 | |
1502 | Returns a merge of the current path with the relative path passed |
1503 | as argument. |
1504 | |
1505 | Note: \a relativePath is relative (does not start with '/'). |
1506 | */ |
1507 | inline QString QUrlPrivate::mergePaths(const QString &relativePath) const |
1508 | { |
1509 | // If the base URI has a defined authority component and an empty |
1510 | // path, then return a string consisting of "/" concatenated with |
1511 | // the reference's path; otherwise, |
1512 | if (!host.isEmpty() && path.isEmpty()) |
1513 | return u'/' + relativePath; |
1514 | |
1515 | // Return a string consisting of the reference's path component |
1516 | // appended to all but the last segment of the base URI's path |
1517 | // (i.e., excluding any characters after the right-most "/" in the |
1518 | // base URI path, or excluding the entire base URI path if it does |
1519 | // not contain any "/" characters). |
1520 | QString newPath; |
    if (!path.contains(u'/'))
        newPath = relativePath;
    else
        newPath = QStringView{path}.left(path.lastIndexOf(u'/') + 1) + relativePath;
1525 | |
1526 | return newPath; |
1527 | } |
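
/*
Illustration only, not part of QUrl: a standalone sketch of the RFC 3986
section 5.2.3 merge step that mergePaths() implements, using std::string.
The free function mergeBaseAndRelative and its hasHost parameter are
hypothetical; hasHost stands in for the "!host.isEmpty()" check of the
member function.

```cpp
#include <cassert>
#include <string>

std::string mergeBaseAndRelative(bool hasHost, const std::string &basePath,
                                 const std::string &relativePath)
{
    // Base has an authority and an empty path: anchor the reference at "/".
    if (hasHost && basePath.empty())
        return "/" + relativePath;

    // Otherwise replace the last segment of the base path. A base path
    // without any '/' is dropped entirely.
    const std::string::size_type slash = basePath.rfind('/');
    if (slash == std::string::npos)
        return relativePath;
    return basePath.substr(0, slash + 1) + relativePath;
}
```

With the base path "/b/c/d;p" from the RFC's examples, the reference "g"
merges to "/b/c/g". Dot-segment removal is a separate, later step.
*/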

// Authority-less URLs cannot have paths starting with double slashes (see
// QUrlPrivate::validityError). We refuse to turn a valid URL into invalid by
// way of QUrl::resolved().
static void fixupNonAuthorityPath(QString *path)
{
    if (path->isEmpty() || path->at(0) != u'/')
        return;

    // Find the first non-slash character, because its position is equal to the
    // number of slashes. We'll remove all but one of them.
    qsizetype i = 0;
    while (i + 1 < path->size() && path->at(i + 1) == u'/')
        ++i;
    if (i)
        path->remove(0, i);
}
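
// Illustration only: the same leading-slash collapse as
// fixupNonAuthorityPath() above, written against std::string so it can be
// tried in isolation. The name collapseLeadingSlashesSketch is hypothetical,
// and the sketch returns a copy rather than editing in place.

```cpp
#include <string>

// Hypothetical helper: reduces a run of leading slashes to a single slash.
std::string collapseLeadingSlashesSketch(std::string path)
{
    if (path.empty() || path[0] != '/')
        return path;

    // Count the slashes that follow the first one; those are the extras.
    std::string::size_type extra = 0;
    while (extra + 1 < path.size() && path[extra + 1] == '/')
        ++extra;
    path.erase(0, extra);
    return path;
}
```

// Only the leading run is touched; double slashes later in the path are
// left alone.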

inline QUrlPrivate::ErrorCode QUrlPrivate::validityError(QString *source, qsizetype *position) const
{
    Q_ASSERT(!source == !position);
    if (error) {
        if (source) {
            *source = error->source;
            *position = error->position;
        }
        return error->code;
    }

    // There are three more cases of invalid URLs that QUrl recognizes and they
    // are only possible with constructed URLs (setXXX methods), not with
    // parsing. Therefore, they are tested here.
    //
    // Two cases are a non-empty path that doesn't start with a slash and:
    //  - with an authority
    //  - without an authority, without scheme but the path with a colon before
    //    the first slash
    // The third case is an empty authority and a non-empty path that starts
    // with "//".
    // Those cases are considered invalid because toString() would produce a URL
    // that wouldn't be parsed back to the same QUrl.

    if (path.isEmpty())
        return NoError;
    if (path.at(0) == u'/') {
        if (hasAuthority() || path.size() == 1 || path.at(1) != u'/')
            return NoError;
        if (source) {
            *source = path;
            *position = 0;
        }
        return AuthorityAbsentAndPathIsDoubleSlash;
    }

    if (sectionIsPresent & QUrlPrivate::Host) {
        if (source) {
            *source = path;
            *position = 0;
        }
        return AuthorityPresentAndPathIsRelative;
    }
    if (sectionIsPresent & QUrlPrivate::Scheme)
        return NoError;

    // check for a path of "text:text/"
    for (qsizetype i = 0; i < path.size(); ++i) {
        ushort c = path.at(i).unicode();
        if (c == '/') {
            // found the slash before the colon
            return NoError;
        }
        if (c == ':') {
            // found the colon before the slash, it's invalid
            if (source) {
                *source = path;
                *position = i;
            }
            return RelativeUrlPathContainsColonBeforeSlash;
        }
    }
    return NoError;
}
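
// Illustration only: the three path-related checks from validityError()
// above, restated as a standalone function over std::string. The enum
// PathErrorSketch, the function checkPathSketch and its boolean parameters
// are hypothetical simplifications; in particular, the real code
// distinguishes hasAuthority() from the Host bit in sectionIsPresent, while
// this sketch uses a single flag for both.

```cpp
#include <string>

enum class PathErrorSketch {
    None,
    DoubleSlashWithoutAuthority,
    RelativeWithAuthority,
    ColonBeforeSlash
};

// Hypothetical helper mirroring the order of the checks in validityError().
PathErrorSketch checkPathSketch(bool hasAuthority, bool hasScheme, const std::string &path)
{
    if (path.empty())
        return PathErrorSketch::None;

    if (path[0] == '/') {
        // An absolute path is only a problem if it starts with "//"
        // and there is no authority to separate it from the scheme.
        if (hasAuthority || path.size() == 1 || path[1] != '/')
            return PathErrorSketch::None;
        return PathErrorSketch::DoubleSlashWithoutAuthority;
    }

    // From here on the path is relative (no leading slash).
    if (hasAuthority)
        return PathErrorSketch::RelativeWithAuthority;
    if (hasScheme)
        return PathErrorSketch::None;

    // No scheme: a colon before the first slash would be read as a scheme.
    for (char c : path) {
        if (c == '/')
            return PathErrorSketch::None;
        if (c == ':')
            return PathErrorSketch::ColonBeforeSlash;
    }
    return PathErrorSketch::None;
}
```

// Each rejected combination is one where the serialized URL would not parse
// back to the same components.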

bool QUrlPrivate::validateComponent(QUrlPrivate::Section section, const QString &input,
                                    qsizetype begin, qsizetype end)
{
    // What we need to look out for, that the regular parser tolerates:
    //  - percent signs not followed by two hex digits
    //  - forbidden characters, which should always appear encoded
    //    '"' / '<' / '>' / '\' / '^' / '`' / '{' / '|' / '}' / DEL (0x7F)
    //    control characters
    //  - delimiters not allowed in certain positions
    //    . scheme: parser is already strict
    //    . user info: gen-delims except ":" disallowed ("/" / "?" / "#" / "[" / "]" / "@")
    //    . host: parser is stricter than the standard
    //    . port: parser is stricter than the standard
    //    . path: all delimiters allowed
    //    . fragment: all delimiters allowed
    //    . query: all delimiters allowed
    static const char forbidden[] = "\"<>\\^`{|}\x7F";
    static const char forbiddenUserInfo[] = ":/?#[]@";

    Q_ASSERT(section != Authority && section != Hierarchy && section != FullUrl);

    const ushort *const data = reinterpret_cast<const ushort *>(input.constData());
    for (size_t i = size_t(begin); i < size_t(end); ++i) {
        uint uc = data[i];
        if (uc >= 0x80)
            continue;

        bool error = false;
        if ((uc == '%' && (size_t(end) < i + 2 || !isHex(data[i + 1]) || !isHex(data[i + 2])))
                || uc <= 0x20 || strchr(forbidden, uc)) {
            // found an error
            error = true;
        } else if (section & UserInfo) {
            if (section == UserInfo && strchr(forbiddenUserInfo + 1, uc))
                error = true;
            else if (section != UserInfo && strchr(forbiddenUserInfo, uc))
                error = true;
        }

        if (!error)
            continue;

        ErrorCode errorCode = ErrorCode(int(section) << 8);
        if (section == UserInfo) {
            // is it the user name or the password?
            errorCode = InvalidUserNameError;
            for (size_t j = size_t(begin); j < i; ++j)
                if (data[j] == ':') {
                    errorCode = InvalidPasswordError;
                    break;
                }
        }

        setError(errorCode, input, i);
        return false;
    }

    // no errors
    return true;
}
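
// Illustration only: a standalone sketch of the first strict-mode rule
// listed in validateComponent() above, namely that every '%' must be
// followed by two hexadecimal digits. The name firstBadPercentSketch is
// hypothetical, the input is a plain std::string, and the bounds check is
// written conservatively for this sketch rather than copied from the
// function above. Forbidden characters and delimiters are not checked here.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the index of the first '%' that is not
// followed by two hex digits, or -1 if every '%' is well-formed.
std::ptrdiff_t firstBadPercentSketch(const std::string &s)
{
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] != '%')
            continue;
        const bool tooShort = i + 2 >= s.size();
        if (tooShort
                || !std::isxdigit(static_cast<unsigned char>(s[i + 1]))
                || !std::isxdigit(static_cast<unsigned char>(s[i + 2])))
            return static_cast<std::ptrdiff_t>(i);
    }
    return -1;
}
```

// The returned index plays the role of the error position that
// validateComponent() passes to setError().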

#if 0
inline void QUrlPrivate::validate() const
{
    QUrlPrivate *that = (QUrlPrivate *)this;
    that->encodedOriginal = that->toEncoded(); // may detach
    parse(ParseOnly);

    QURL_SETFLAG(that->stateFlags, Validated);

    if (!isValid)
        return;

    QString auth = authority(); // causes the non-encoded forms to be valid

    // authority() calls canonicalHost() which sets this
    if (!isHostValid)
        return;

    if (scheme == "mailto"_L1) {
        if (!host.isEmpty() || port != -1 || !userName.isEmpty() || !password.isEmpty()) {
            that->isValid = false;
            that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "expected empty host, username,"
                                                           "port and password"),
                                      0, 0);
        }
    } else if (scheme == ftpScheme() || scheme == httpScheme()) {
        if (host.isEmpty() && !(path.isEmpty() && encodedPath.isEmpty())) {
            that->isValid = false;
            that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "the host is empty, but not the path"),
                                      0, 0);
        }
    }
}
#endif

/*!
    \macro QT_NO_URL_CAST_FROM_STRING
    \relates QUrl

    Disables automatic conversions from QString (or char *) to QUrl.

    Compiling your code with this define is useful when you have a lot of
    code that uses QString for file names and you wish to convert it to
    use QUrl for network transparency. In any code that uses QUrl, it can
    help avoid missing QUrl::resolved() calls, and other misuses of
    QString to QUrl conversions.

    For example, if you have code like

    \code
        url = filename; // probably not what you want
    \endcode

    you can rewrite it as one of the following, depending on whether
    \c filename is a local file path or a reference relative to a base URL:

    \code
        url = QUrl::fromLocalFile(filename);
        url = baseurl.resolved(QUrl(filename));
    \endcode

    \sa QT_NO_CAST_FROM_ASCII
*/


/*!
    Constructs a URL by parsing \a url. Note that this constructor expects a
    proper URL or URL-Reference and will not attempt to guess intent. For
    example, the following declaration:

    \snippet code/src_corelib_io_qurl.cpp constructor-url-reference

    will construct a valid URL, but it may not be what one expects, as the
    scheme() part of the input is missing. For a string like the above,
    applications may want to use fromUserInput(). For this constructor or
    setUrl(), the following is probably what was intended:

    \snippet code/src_corelib_io_qurl.cpp constructor-url

    QUrl will automatically percent-encode
    all characters that are not allowed in a URL and decode the percent-encoded
    sequences that represent an unreserved character (letters, digits, hyphens,
    underscores, dots and tildes). All other characters are left in their
    original forms.

    Parses the \a url using the parser mode \a parsingMode. In TolerantMode
    (the default), QUrl will correct certain mistakes, notably the presence of
    a percent character ('%') not followed by two hexadecimal digits, and it
    will accept any character in any position. In StrictMode, encoding mistakes
    will not be tolerated and QUrl will also check that certain forbidden
    characters are not present in unencoded form. If an error is detected in
    StrictMode, isValid() will return false. The parsing mode DecodedMode is not
    permitted in this context.

    Example:

    \snippet code/src_corelib_io_qurl.cpp 0

    To construct a URL from an encoded string, you can also use fromEncoded():

    \snippet code/src_corelib_io_qurl.cpp 1

    Both functions are equivalent and, in Qt 5, both functions accept encoded
    data. Usually, the choice of the QUrl constructor or setUrl() versus
    fromEncoded() will depend on the source data: the constructor and setUrl()
    take a QString, whereas fromEncoded() takes a QByteArray.

    \sa setUrl(), fromEncoded(), TolerantMode
*/
QUrl::QUrl(const QString &url, ParsingMode parsingMode) : d(nullptr)
{
    setUrl(url, parsingMode);
}

/*!
    Constructs an empty QUrl object.
*/
QUrl::QUrl() : d(nullptr)
{
}

/*!
    Constructs a copy of \a other.
*/
QUrl::QUrl(const QUrl &other) noexcept : d(other.d)
{
    if (d)
        d->ref.ref();
}

/*!
    Destructor; called immediately before the object is deleted.
*/
QUrl::~QUrl()
{
    if (d && !d->ref.deref())
        delete d;
}

/*!
    Returns \c true if the URL is non-empty and valid; otherwise returns \c false.

    The URL is run through a conformance test. Every part of the URL
    must conform to the standard encoding rules of the URI standard
    for the URL to be reported as valid.

    \snippet code/src_corelib_io_qurl.cpp 2
*/
bool QUrl::isValid() const
{
    if (isEmpty()) {
        // also catches d == nullptr
        return false;
    }
    return d->validityError() == QUrlPrivate::NoError;
}

/*!
    Returns \c true if the URL has no data; otherwise returns \c false.

    \sa clear()
*/
bool QUrl::isEmpty() const
{
    if (!d) return true;
    return d->isEmpty();
}

/*!
    Resets the content of the QUrl. After calling this function, the
    QUrl is equal to one that has been constructed with the default
    empty constructor.

    \sa isEmpty()
*/
void QUrl::clear()
{
    if (d && !d->ref.deref())
        delete d;
    d = nullptr;
}

/*!
    Parses \a url and sets this object to that value. QUrl will automatically
    percent-encode all characters that are not allowed in a URL and decode the
    percent-encoded sequences that represent an unreserved character (letters,
    digits, hyphens, underscores, dots and tildes). All other characters are
    left in their original forms.

    Parses the \a url using the parser mode \a parsingMode. In TolerantMode
    (the default), QUrl will correct certain mistakes, notably the presence of
    a percent character ('%') not followed by two hexadecimal digits, and it
    will accept any character in any position. In StrictMode, encoding mistakes
    will not be tolerated and QUrl will also check that certain forbidden
    characters are not present in unencoded form. If an error is detected in
    StrictMode, isValid() will return false. The parsing mode DecodedMode is
    not permitted in this context and will produce a run-time warning.

    \sa url(), toString()
*/
void QUrl::setUrl(const QString &url, ParsingMode parsingMode)
{
    if (parsingMode == DecodedMode) {
        qWarning("QUrl: QUrl::DecodedMode is not permitted when parsing a full URL");
    } else {
        detach();
        d->parse(url, parsingMode);
    }
}

/*!
    Sets the scheme of the URL to \a scheme. As a scheme can only
    contain ASCII characters, no conversion or decoding is done on the
    input. It must also start with an ASCII letter.

    The scheme describes the type (or protocol) of the URL. It's
    represented by one or more ASCII characters at the start of the URL.

    A scheme is strictly \l {RFC 3986}-compliant:
        \tt {scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )}

    The following example shows a URL where the scheme is "ftp":

    \image qurl-authority2.png

    To set the scheme, the following call is used:
    \snippet code/src_corelib_io_qurl.cpp 11

    The scheme can also be empty, in which case the URL is interpreted
    as relative.

    \sa scheme(), isRelative()
*/
void QUrl::setScheme(const QString &scheme)
{
    detach();
    d->clearError();
    if (scheme.isEmpty()) {
        // schemes are not allowed to be empty
        d->sectionIsPresent &= ~QUrlPrivate::Scheme;
        d->flags &= ~QUrlPrivate::IsLocalFile;
        d->scheme.clear();
    } else {
        d->setScheme(scheme, scheme.size(), /* do set error */ true);
    }
}

/*!
    Returns the scheme of the URL. If an empty string is returned,
    this means the scheme is undefined and the URL is then relative.

    The scheme can only contain US-ASCII letters, digits, and the characters
    '+', '-' and '.', which means it cannot contain any character that would
    otherwise require encoding. Additionally, schemes are always returned in
    lowercase form.

    \sa setScheme(), isRelative()
*/
QString QUrl::scheme() const
{
    if (!d) return QString();

    return d->scheme;
}

/*!
    Sets the authority of the URL to \a authority.

    The authority of a URL is the combination of user info, a host
    name and a port. All of these elements are optional; an empty
    authority is therefore valid.

    The user info and host are separated by a '@', and the host and
    port are separated by a ':'. If the user info is empty, the '@'
    must be omitted, although a stray ':' is permitted if the port is
    empty.

    The following example shows a valid authority string:

    \image qurl-authority.png

    The \a authority data is interpreted according to \a mode: in StrictMode,
    any '%' characters must be followed by exactly two hexadecimal characters
    and some characters (including space) are not allowed in undecoded form. In
    TolerantMode (the default), all characters are accepted in undecoded form
    and the tolerant parser will correct stray '%' not followed by two hex
    characters.

    This function does not allow \a mode to be QUrl::DecodedMode. To set fully
    decoded data, call setUserName(), setPassword(), setHost() and setPort()
    individually.

    \sa setUserInfo(), setHost(), setPort()
*/
void QUrl::setAuthority(const QString &authority, ParsingMode mode)
{
    detach();
    d->clearError();

    if (mode == DecodedMode) {
        qWarning("QUrl::setAuthority(): QUrl::DecodedMode is not permitted in this function");
        return;
    }

    d->setAuthority(authority, 0, authority.size(), mode);
}

/*!
    Returns the authority of the URL if it is defined; otherwise
    an empty string is returned.

    This function returns an unambiguous value, which may contain
    characters that are still percent-encoded, plus some control sequences
    not representable in decoded form in QString.

    The \a options argument controls how to format the authority component. The
    value of QUrl::FullyDecoded is not permitted in this function. If you need
    to obtain fully decoded data, call userName(), password(), host() and
    port() individually.

    \sa setAuthority(), userInfo(), userName(), password(), host(), port()
*/
QString QUrl::authority(ComponentFormattingOptions options) const
{
    QString result;
    if (!d)
        return result;

    if (options == QUrl::FullyDecoded) {
        qWarning("QUrl::authority(): QUrl::FullyDecoded is not permitted in this function");
        return result;
    }

    d->appendAuthority(result, options, QUrlPrivate::Authority);
    return result;
}

/*!
    Sets the user info of the URL to \a userInfo. The user info is an
    optional part of the authority of the URL, as described in
    setAuthority().

    The user info consists of a user name and optionally a password,
    separated by a ':'. If the password is empty, the colon must be
    omitted. The following example shows a valid user info string:

    \image qurl-authority3.png

    The \a userInfo data is interpreted according to \a mode: in StrictMode,
    any '%' characters must be followed by exactly two hexadecimal characters
    and some characters (including space) are not allowed in undecoded form. In
    TolerantMode (the default), all characters are accepted in undecoded form
    and the tolerant parser will correct stray '%' not followed by two hex
    characters.

    This function does not allow \a mode to be QUrl::DecodedMode. To set fully
    decoded data, call setUserName() and setPassword() individually.

    \sa userInfo(), setUserName(), setPassword(), setAuthority()
*/
void QUrl::setUserInfo(const QString &userInfo, ParsingMode mode)
{
    detach();
    d->clearError();
    QString trimmed = userInfo.trimmed();
    if (mode == DecodedMode) {
        qWarning("QUrl::setUserInfo(): QUrl::DecodedMode is not permitted in this function");
        return;
    }

    d->setUserInfo(trimmed, 0, trimmed.size());
    if (userInfo.isNull()) {
        // QUrlPrivate::setUserInfo cleared almost everything
        // but it leaves the UserName bit set
        d->sectionIsPresent &= ~QUrlPrivate::UserInfo;
    } else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::UserInfo, userInfo)) {
        d->sectionIsPresent &= ~QUrlPrivate::UserInfo;
        d->userName.clear();
        d->password.clear();
    }
}
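
// Illustration only: a standalone sketch of how a user info string of the
// form "name:password" divides into its two parts. It assumes the split
// happens at the first ':' (which matches how validateComponent() tells a
// user name error from a password error), uses std::string, and does no
// percent-decoding. The name splitUserInfoSketch is hypothetical and is not
// QUrlPrivate::setUserInfo().

```cpp
#include <string>
#include <utility>

// Hypothetical helper: returns {userName, password}; the password is empty
// when the input contains no ':'.
std::pair<std::string, std::string> splitUserInfoSketch(const std::string &userInfo)
{
    const std::string::size_type colon = userInfo.find(':');
    if (colon == std::string::npos)
        return {userInfo, std::string()};
    return {userInfo.substr(0, colon), userInfo.substr(colon + 1)};
}
```

// Any further ':' characters stay in the password part.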

/*!
    Returns the user info of the URL, or an empty string if the user
    info is undefined.

    This function returns an unambiguous value, which may contain
    characters that are still percent-encoded, plus some control sequences
    not representable in decoded form in QString.

    The \a options argument controls how to format the user info component. The
    value of QUrl::FullyDecoded is not permitted in this function. If you need
    to obtain fully decoded data, call userName() and password() individually.

    \sa setUserInfo(), userName(), password(), authority()
*/
QString QUrl::userInfo(ComponentFormattingOptions options) const
{
    QString result;
    if (!d)
        return result;

    if (options == QUrl::FullyDecoded) {
        qWarning("QUrl::userInfo(): QUrl::FullyDecoded is not permitted in this function");
        return result;
    }

    d->appendUserInfo(result, options, QUrlPrivate::UserInfo);
    return result;
}

/*!
    Sets the URL's user name to \a userName. The \a userName is part
    of the user info element in the authority of the URL, as described
    in setUserInfo().

    The \a userName data is interpreted according to \a mode: in StrictMode,
    any '%' characters must be followed by exactly two hexadecimal characters
    and some characters (including space) are not allowed in undecoded form. In
    TolerantMode (the default), all characters are accepted in undecoded form
    and the tolerant parser will correct stray '%' not followed by two hex
    characters. In DecodedMode, '%' characters stand for themselves and
    encoded characters are not possible.

    QUrl::DecodedMode should be used when setting the user name from a data
    source which is not a URL, such as a password dialog shown to the user or
    with a user name obtained by calling userName() with the QUrl::FullyDecoded
    formatting option.

    \sa userName(), setUserInfo()
*/
void QUrl::setUserName(const QString &userName, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = userName;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    d->setUserName(data, 0, data.size());
    if (userName.isNull())
        d->sectionIsPresent &= ~QUrlPrivate::UserName;
    else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::UserName, userName))
        d->userName.clear();
}

/*!
    Returns the user name of the URL if it is defined; otherwise
    an empty string is returned.

    The \a options argument controls how to format the user name component. All
    values produce an unambiguous result. With QUrl::FullyDecoded, all
    percent-encoded sequences are decoded; otherwise, the returned value may
    contain some percent-encoded sequences for some control sequences not
    representable in decoded form in QString.

    Note that QUrl::FullyDecoded may cause data loss if those non-representable
    sequences are present. It is recommended to use that value when the result
    will be used in a non-URL context, such as setting in QAuthenticator or
    negotiating a login.

    \sa setUserName(), userInfo()
*/
QString QUrl::userName(ComponentFormattingOptions options) const
{
    QString result;
    if (d)
        d->appendUserName(result, options);
    return result;
}

/*!
    Sets the URL's password to \a password. The \a password is part of
    the user info element in the authority of the URL, as described in
    setUserInfo().

    The \a password data is interpreted according to \a mode: in StrictMode,
    any '%' characters must be followed by exactly two hexadecimal characters
    and some characters (including space) are not allowed in undecoded form. In
    TolerantMode, all characters are accepted in undecoded form and the
    tolerant parser will correct stray '%' not followed by two hex characters.
    In DecodedMode, '%' characters stand for themselves and encoded characters
    are not possible.

    QUrl::DecodedMode should be used when setting the password from a data
    source which is not a URL, such as a password dialog shown to the user or
    with a password obtained by calling password() with the QUrl::FullyDecoded
    formatting option.

    \sa password(), setUserInfo()
*/
void QUrl::setPassword(const QString &password, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = password;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    d->setPassword(data, 0, data.size());
    if (password.isNull())
        d->sectionIsPresent &= ~QUrlPrivate::Password;
    else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Password, password))
        d->password.clear();
}

/*!
    Returns the password of the URL if it is defined; otherwise
    an empty string is returned.

    The \a options argument controls how to format the password component. All
    values produce an unambiguous result. With QUrl::FullyDecoded, all
    percent-encoded sequences are decoded; otherwise, the returned value may
    contain some percent-encoded sequences for some control sequences not
    representable in decoded form in QString.

    Note that QUrl::FullyDecoded may cause data loss if those non-representable
    sequences are present. It is recommended to use that value when the result
    will be used in a non-URL context, such as setting in QAuthenticator or
    negotiating a login.

    \sa setPassword()
*/
QString QUrl::password(ComponentFormattingOptions options) const
{
    QString result;
    if (d)
        d->appendPassword(result, options);
    return result;
}

/*!
    Sets the host of the URL to \a host. The host is part of the
    authority.

    The \a host data is interpreted according to \a mode: in StrictMode,
    any '%' characters must be followed by exactly two hexadecimal characters
    and some characters (including space) are not allowed in undecoded form. In
    TolerantMode, all characters are accepted in undecoded form and the
    tolerant parser will correct stray '%' not followed by two hex characters.
    In DecodedMode, '%' characters stand for themselves and encoded characters
    are not possible.

    Note that, in all cases, the result of the parsing must be a valid hostname
    according to STD 3 rules, as modified by the Internationalized Resource
    Identifiers specification (RFC 3987). Invalid hostnames are not permitted
    and will cause isValid() to become false.

    \sa host(), setAuthority()
*/
void QUrl::setHost(const QString &host, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = host;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    if (d->setHost(data, 0, data.size(), mode)) {
        return;
    } else if (!data.startsWith(u'[')) {
        // setHost failed, it might be IPv6 or IPvFuture in need of bracketing
        Q_ASSERT(d->error);

        data.prepend(u'[');
        data.append(u']');
        if (!d->setHost(data, 0, data.size(), mode)) {
            // failed again
            if (data.contains(u':')) {
                // source data contains ':', so it's an IPv6 error
                d->error->code = QUrlPrivate::InvalidIPv6AddressError;
            }
            d->sectionIsPresent &= ~QUrlPrivate::Host;
        } else {
            // succeeded
            d->clearError();
        }
    }
}

/*!
    Returns the host of the URL if it is defined; otherwise
    an empty string is returned.

    The \a options argument controls how the hostname will be formatted. The
    QUrl::EncodeUnicode option will cause this function to return the hostname
    in the ASCII-Compatible Encoding (ACE) form, which is suitable for use in
    channels that are not 8-bit clean or that require the legacy hostname (such
    as DNS requests or in HTTP request headers). If that flag is not present,
    this function returns the Internationalized Domain Name (IDN) in Unicode
    form, according to the list of permissible top-level domains (see
    idnWhitelist()).

    All other flags are ignored. Host names cannot contain control or percent
    characters, so the returned value can be considered fully decoded.

    \sa setHost(), idnWhitelist(), setIdnWhitelist(), authority()
*/
QString QUrl::host(ComponentFormattingOptions options) const
{
    QString result;
    if (d) {
        d->appendHost(result, options);
        if (result.startsWith(u'['))
            result = result.mid(1, result.size() - 2);
    }
    return result;
}

/*!
    Sets the port of the URL to \a port. The port is part of the
    authority of the URL, as described in setAuthority().

    \a port must be between 0 and 65535 inclusive. Setting the
    port to -1 indicates that the port is unspecified.
*/
void QUrl::setPort(int port)
{
    detach();
    d->clearError();

    if (port < -1 || port > 65535) {
        d->setError(QUrlPrivate::InvalidPortError, QString::number(port), 0);
        port = -1;
    }

    d->port = port;
    if (port != -1)
        d->sectionIsPresent |= QUrlPrivate::Host;
}

/*!
    \since 4.1

    Returns the port of the URL, or \a defaultPort if the port is
    unspecified.

    Example:

    \snippet code/src_corelib_io_qurl.cpp 3
*/
int QUrl::port(int defaultPort) const
{
    if (!d) return defaultPort;
    return d->port == -1 ? defaultPort : d->port;
}
2325 | |
2326 | /*! |
2327 | Sets the path of the URL to \a path. The path is the part of the |
2328 | URL that comes after the authority but before the query string. |
2329 | |
2330 | \image qurl-ftppath.png |
2331 | |
2332 | For non-hierarchical schemes, the path will be everything |
2333 | following the scheme declaration, as in the following example: |
2334 | |
2335 | \image qurl-mailtopath.png |
2336 | |
2337 | The \a path data is interpreted according to \a mode: in StrictMode, |
2338 | any '%' characters must be followed by exactly two hexadecimal characters |
2339 | and some characters (including space) are not allowed in undecoded form. In |
2340 | TolerantMode, all characters are accepted in undecoded form and the |
2341 | tolerant parser will correct stray '%' not followed by two hex characters. |
2342 | In DecodedMode, '%' stand for themselves and encoded characters are not |
2343 | possible. |
2344 | |
2345 | QUrl::DecodedMode should be used when setting the path from a data source |
2346 | which is not a URL, such as a dialog shown to the user or with a path |
2347 | obtained by calling path() with the QUrl::FullyDecoded formatting option. |
2348 | |
2349 | \sa path() |
2350 | */ |
2351 | void QUrl::setPath(const QString &path, ParsingMode mode) |
2352 | { |
2353 | detach(); |
2354 | d->clearError(); |
2355 | |
2356 | QString data = path; |
2357 | if (mode == DecodedMode) { |
2358 | parseDecodedComponent(data); |
2359 | mode = TolerantMode; |
2360 | } |
2361 | |
2362 | d->setPath(data, 0, data.size()); |
2363 | |
2364 | // optimized out, since there is no path delimiter |
2365 | // if (path.isNull()) |
2366 | // d->sectionIsPresent &= ~QUrlPrivate::Path; |
2367 | // else |
2368 | if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Path, path)) |
2369 | d->path.clear(); |
2370 | } |
2371 | |
2372 | /*! |
2373 | Returns the path of the URL. |
2374 | |
2375 | \snippet code/src_corelib_io_qurl.cpp 12 |
2376 | |
2377 | The \a options argument controls how to format the path component. All |
2378 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2379 | percent-encoded sequences are decoded; otherwise, the returned value may |
2380 | contain some percent-encoded sequences for some control sequences not |
2381 | representable in decoded form in QString. |
2382 | |
2383 | Note that QUrl::FullyDecoded may cause data loss if those non-representable |
2384 | sequences are present. It is recommended to use that value when the result |
2385 | will be used in a non-URL context, such as sending to an FTP server. |
2386 | |
2387 | An example of data loss is when you have non-Unicode percent-encoded sequences |
2388 | and use FullyDecoded (the default): |
2389 | |
2390 | \snippet code/src_corelib_io_qurl.cpp 13 |
2391 | |
2392 | In this example, there will be some level of data loss because the \c %FF cannot |
2393 | be converted. |
2394 | |
2395 | Data loss can also occur when the path contains sub-delimiters (such as \c +): |
2396 | |
2397 | \snippet code/src_corelib_io_qurl.cpp 14 |
2398 | |
2399 | Other decoding examples: |
2400 | |
2401 | \snippet code/src_corelib_io_qurl.cpp 15 |
2402 | |
2403 | \sa setPath() |
2404 | */ |
2405 | QString QUrl::path(ComponentFormattingOptions options) const |
2406 | { |
2407 | QString result; |
2408 | if (d) |
2409 | d->appendPath(result, options, QUrlPrivate::Path); |
2410 | return result; |
2411 | } |
2412 | |
2413 | /*! |
2414 | \since 5.2 |
2415 | |
2416 | Returns the name of the file, excluding the directory path. |
2417 | |
2418 | Note that, if this QUrl object is given a path ending in a slash, the name of the file is considered empty. |
2419 | |
2420 | If the path doesn't contain any slash, the entire path is returned as the file name. |
2421 | |
2422 | Example: |
2423 | |
2424 | \snippet code/src_corelib_io_qurl.cpp 7 |
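
The two rules above (a trailing slash yields an empty name, and a path without
any slash is returned whole) follow from a single "last slash" lookup. The
following standalone sketch uses std::string instead of QString; fileNameOf is
a hypothetical helper, not part of Qt.

```cpp
#include <string>

// Hypothetical helper illustrating the last-slash rule only.
std::string fileNameOf(const std::string &path)
{
    const std::string::size_type slash = path.rfind('/');
    if (slash == std::string::npos)
        return path;               // no slash: the whole path is the name
    return path.substr(slash + 1); // empty when the path ends in '/'
}
```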
2425 | |
2426 | The \a options argument controls how to format the file name component. All |
2427 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2428 | percent-encoded sequences are decoded; otherwise, the returned value may |
2429 | contain some percent-encoded sequences for some control sequences not |
2430 | representable in decoded form in QString. |
2431 | |
2432 | \sa path() |
2433 | */ |
2434 | QString QUrl::fileName(ComponentFormattingOptions options) const |
2435 | { |
2436 | const QString ourPath = path(options); |
2437 | const qsizetype slash = ourPath.lastIndexOf(u'/'); |
2438 | if (slash == -1) |
2439 | return ourPath; |
2440 | return ourPath.mid(slash + 1); |
2441 | } |
2442 | |
2443 | /*! |
2444 | \since 4.2 |
2445 | |
2446 | Returns \c true if this URL contains a query (i.e., if a '?' was seen in it). |
2447 | |
2448 | \sa setQuery(), query(), hasFragment() |
2449 | */ |
2450 | bool QUrl::hasQuery() const |
2451 | { |
2452 | if (!d) return false; |
2453 | return d->hasQuery(); |
2454 | } |
2455 | |
2456 | /*! |
2457 | Sets the query string of the URL to \a query. |
2458 | |
2459 | This function is useful if you need to pass a query string that |
2460 | does not fit into the key-value pattern, or that uses a different |
2461 | scheme for encoding special characters than what is suggested by |
2462 | QUrl. |
2463 | |
2464 | Passing a value of QString() to \a query (a null QString) unsets |
2465 | the query completely. However, passing a value of QString("") |
2466 | will set the query to an empty value, as if the original URL |
2467 | had a lone "?". |
2468 | |
2469 | The \a query data is interpreted according to \a mode: in StrictMode, |
2470 | any '%' characters must be followed by exactly two hexadecimal characters |
2471 | and some characters (including space) are not allowed in undecoded form. In |
2472 | TolerantMode, all characters are accepted in undecoded form and the |
2473 | tolerant parser will correct stray '%' not followed by two hex characters. |
2474 | In DecodedMode, '%' stand for themselves and encoded characters are not |
2475 | possible. |
2476 | |
2477 | Query strings often contain percent-encoded sequences, so use of |
2478 | DecodedMode is discouraged. One special sequence to be aware of is that of |
2479 | the plus character ('+'). QUrl does not convert spaces to plus characters, |
2480 | even though HTML forms posted by web browsers do. In order to represent an |
2481 | actual plus character in a query, the sequence "%2B" is usually used. This |
2482 | function will leave "%2B" sequences untouched in TolerantMode or |
2483 | StrictMode. |
2484 | |
2485 | \sa query(), hasQuery() |
2486 | */ |
2487 | void QUrl::setQuery(const QString &query, ParsingMode mode) |
2488 | { |
2489 | detach(); |
2490 | d->clearError(); |
2491 | |
2492 | QString data = query; |
2493 | if (mode == DecodedMode) { |
2494 | parseDecodedComponent(data); |
2495 | mode = TolerantMode; |
2496 | } |
2497 | |
2498 | d->setQuery(data, 0, data.size()); |
2499 | if (query.isNull()) |
2500 | d->sectionIsPresent &= ~QUrlPrivate::Query; |
2501 | else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Query, query)) |
2502 | d->query.clear(); |
2503 | } |
2504 | |
2505 | /*! |
2506 | \overload |
2507 | \since 5.0 |
2508 | Sets the query string of the URL to \a query. |
2509 | |
2510 | This function reconstructs the query string from the QUrlQuery object and |
2511 | sets it on this QUrl object. This function does not have parsing parameters |
2512 | because the QUrlQuery contains data that is already parsed. |
2513 | |
2514 | \sa query(), hasQuery() |
2515 | */ |
2516 | void QUrl::setQuery(const QUrlQuery &query) |
2517 | { |
2518 | detach(); |
2519 | d->clearError(); |
2520 | |
2521 | // we know the data is in the right format |
2522 | d->query = query.toString(); |
2523 | if (query.isEmpty()) |
2524 | d->sectionIsPresent &= ~QUrlPrivate::Query; |
2525 | else |
2526 | d->sectionIsPresent |= QUrlPrivate::Query; |
2527 | } |
2528 | |
2529 | /*! |
2530 | Returns the query string of the URL if there's a query string, or an empty |
2531 | result if not. To determine if the parsed URL contained a query string, use |
2532 | hasQuery(). |
2533 | |
2534 | The \a options argument controls how to format the query component. All |
2535 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2536 | percent-encoded sequences are decoded; otherwise, the returned value may |
2537 | contain some percent-encoded sequences for some control sequences not |
2538 | representable in decoded form in QString. |
2539 | |
2540 | Note that use of QUrl::FullyDecoded in queries is discouraged, as queries |
2541 | often contain data that is supposed to remain percent-encoded, including |
2542 | the use of the "%2B" sequence to represent a plus character ('+'). |
2543 | |
2544 | \sa setQuery(), hasQuery() |
2545 | */ |
2546 | QString QUrl::query(ComponentFormattingOptions options) const |
2547 | { |
2548 | QString result; |
2549 | if (d) { |
2550 | d->appendQuery(result, options, QUrlPrivate::Query); |
2551 | if (d->hasQuery() && result.isNull()) |
2552 | result.detach(); |
2553 | } |
2554 | return result; |
2555 | } |
2556 | |
2557 | /*! |
2558 | Sets the fragment of the URL to \a fragment. The fragment is the |
2559 | last part of the URL, represented by a '#' followed by a string of |
2560 | characters. It is typically used in HTTP for referring to a |
2561 | certain link or point on a page: |
2562 | |
2563 | \image qurl-fragment.png |
2564 | |
2565 | The fragment is sometimes also referred to as the URL "reference". |
2566 | |
2567 | Passing an argument of QString() (a null QString) will unset the fragment. |
2568 | Passing an argument of QString("") (an empty but not null QString) will set the |
2569 | fragment to an empty string (as if the original URL had a lone "#"). |
2570 | |
2571 | The \a fragment data is interpreted according to \a mode: in StrictMode, |
2572 | any '%' characters must be followed by exactly two hexadecimal characters |
2573 | and some characters (including space) are not allowed in undecoded form. In |
2574 | TolerantMode, all characters are accepted in undecoded form and the |
2575 | tolerant parser will correct stray '%' not followed by two hex characters. |
2576 | In DecodedMode, '%' stand for themselves and encoded characters are not |
2577 | possible. |
2578 | |
2579 | QUrl::DecodedMode should be used when setting the fragment from a data |
2580 | source which is not a URL or with a fragment obtained by calling |
2581 | fragment() with the QUrl::FullyDecoded formatting option. |
2582 | |
2583 | \sa fragment(), hasFragment() |
2584 | */ |
2585 | void QUrl::setFragment(const QString &fragment, ParsingMode mode) |
2586 | { |
2587 | detach(); |
2588 | d->clearError(); |
2589 | |
2590 | QString data = fragment; |
2591 | if (mode == DecodedMode) { |
2592 | parseDecodedComponent(data); |
2593 | mode = TolerantMode; |
2594 | } |
2595 | |
2596 | d->setFragment(data, 0, data.size()); |
2597 | if (fragment.isNull()) |
2598 | d->sectionIsPresent &= ~QUrlPrivate::Fragment; |
2599 | else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Fragment, fragment)) |
2600 | d->fragment.clear(); |
2601 | } |
2602 | |
2603 | /*! |
2604 | Returns the fragment of the URL. To determine if the parsed URL contained a |
2605 | fragment, use hasFragment(). |
2606 | |
2607 | The \a options argument controls how to format the fragment component. All |
2608 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2609 | percent-encoded sequences are decoded; otherwise, the returned value may |
2610 | contain some percent-encoded sequences for some control sequences not |
2611 | representable in decoded form in QString. |
2612 | |
2613 | Note that QUrl::FullyDecoded may cause data loss if those non-representable |
2614 | sequences are present. It is recommended to use that value when the result |
2615 | will be used in a non-URL context. |
2616 | |
2617 | \sa setFragment(), hasFragment() |
2618 | */ |
2619 | QString QUrl::fragment(ComponentFormattingOptions options) const |
2620 | { |
2621 | QString result; |
2622 | if (d) { |
2623 | d->appendFragment(result, options, QUrlPrivate::Fragment); |
2624 | if (d->hasFragment() && result.isNull()) |
2625 | result.detach(); |
2626 | } |
2627 | return result; |
2628 | } |
2629 | |
2630 | /*! |
2631 | \since 4.2 |
2632 | |
2633 | Returns \c true if this URL contains a fragment (i.e., if a '#' was seen in it). |
2634 | |
2635 | \sa fragment(), setFragment() |
2636 | */ |
2637 | bool QUrl::hasFragment() const |
2638 | { |
2639 | if (!d) return false; |
2640 | return d->hasFragment(); |
2641 | } |
2642 | |
2643 | /*! |
2644 | Returns the result of the merge of this URL with \a relative. This |
2645 | URL is used as a base to convert \a relative to an absolute URL. |
2646 | |
2647 | If \a relative is not a relative URL, this function will return \a |
2648 | relative directly. Otherwise, the paths of the two URLs are |
2649 | merged, and the new URL returned has the scheme and authority of |
2650 | the base URL, but with the merged path, as in the following |
2651 | example: |
2652 | |
2653 | \snippet code/src_corelib_io_qurl.cpp 5 |
2654 | |
2655 | Calling resolved() with ".." returns a QUrl whose directory is |
2656 | one level higher than the original. Similarly, calling resolved() |
2657 | with "../.." removes two levels from the path. If \a relative is |
2658 | "/", the path becomes "/". |
2659 | |
2660 | \sa isRelative() |
2661 | */ |
2662 | QUrl QUrl::resolved(const QUrl &relative) const |
2663 | { |
2664 | if (!d) return relative; |
2665 | if (!relative.d) return *this; |
2666 | |
2667 | QUrl t; |
2668 | if (!relative.d->scheme.isEmpty()) { |
2669 | t = relative; |
2670 | t.detach(); |
2671 | } else { |
2672 | if (relative.d->hasAuthority()) { |
2673 | t = relative; |
2674 | t.detach(); |
2675 | } else { |
2676 | t.d = new QUrlPrivate; |
2677 | |
2678 | // copy the authority |
2679 | t.d->userName = d->userName; |
2680 | t.d->password = d->password; |
2681 | t.d->host = d->host; |
2682 | t.d->port = d->port; |
2683 | t.d->sectionIsPresent = d->sectionIsPresent & QUrlPrivate::Authority; |
2684 | |
2685 | if (relative.d->path.isEmpty()) { |
2686 | t.d->path = d->path; |
2687 | if (relative.d->hasQuery()) { |
2688 | t.d->query = relative.d->query; |
2689 | t.d->sectionIsPresent |= QUrlPrivate::Query; |
2690 | } else if (d->hasQuery()) { |
2691 | t.d->query = d->query; |
2692 | t.d->sectionIsPresent |= QUrlPrivate::Query; |
2693 | } |
2694 | } else { |
2695 | t.d->path = relative.d->path.startsWith(u'/') |
2696 | ? relative.d->path |
2697 | : d->mergePaths(relative.d->path); |
2698 | if (relative.d->hasQuery()) { |
2699 | t.d->query = relative.d->query; |
2700 | t.d->sectionIsPresent |= QUrlPrivate::Query; |
2701 | } |
2702 | } |
2703 | } |
2704 | t.d->scheme = d->scheme; |
2705 | if (d->hasScheme()) |
2706 | t.d->sectionIsPresent |= QUrlPrivate::Scheme; |
2707 | else |
2708 | t.d->sectionIsPresent &= ~QUrlPrivate::Scheme; |
2709 | t.d->flags |= d->flags & QUrlPrivate::IsLocalFile; |
2710 | } |
2711 | t.d->fragment = relative.d->fragment; |
2712 | if (relative.d->hasFragment()) |
2713 | t.d->sectionIsPresent |= QUrlPrivate::Fragment; |
2714 | else |
2715 | t.d->sectionIsPresent &= ~QUrlPrivate::Fragment; |
2716 | |
2717 | qt_normalizePathSegments( |
2718 | &t.d->path, |
2719 | isLocalFile() ? QDirPrivate::KeepLocalTrailingSlash : QDirPrivate::RemotePath); |
2720 | if (!t.d->hasAuthority()) |
2721 | fixupNonAuthorityPath(&t.d->path); |
2722 | |
2723 | #if defined(QURL_DEBUG) |
2724 | qDebug("QUrl(\"%ls\").resolved(\"%ls\") = \"%ls\"", |
2725 | qUtf16Printable(url()), |
2726 | qUtf16Printable(relative.url()), |
2727 | qUtf16Printable(t.url())); |
2728 | #endif |
2729 | return t; |
2730 | } |
2731 | |
2732 | /*! |
2733 | Returns \c true if the URL is relative; otherwise returns \c false. A URL is a |
2734 | relative reference if its scheme is undefined; this function is therefore |
2735 | equivalent to calling scheme().isEmpty(). |
2736 | |
2737 | Relative references are defined in RFC 3986 section 4.2. |
2738 | |
2739 | \sa {Relative URLs vs Relative Paths} |
2740 | */ |
2741 | bool QUrl::isRelative() const |
2742 | { |
2743 | if (!d) return true; |
2744 | return !d->hasScheme(); |
2745 | } |
2746 | |
2747 | /*! |
2748 | Returns a string representation of the URL. The output can be customized by |
2749 | passing flags with \a options. The option QUrl::FullyDecoded is not |
2750 | permitted in this function since it would generate ambiguous data. |
2751 | |
2752 | The resulting QString can be passed back to a QUrl later on. |
2753 | |
2754 | Synonym for toString(options). |
2755 | |
2756 | \sa FormattingOptions, toEncoded(), toString() |
2757 | */ |
2758 | QString QUrl::url(FormattingOptions options) const |
2759 | { |
2760 | return toString(options); |
2761 | } |
2762 | |
2763 | /*! |
2764 | Returns a string representation of the URL. The output can be customized by |
2765 | passing flags with \a options. The option QUrl::FullyDecoded is not |
2766 | permitted in this function since it would generate ambiguous data. |
2767 | |
2768 | The default formatting option is \l{QUrl::FormattingOptions}{PrettyDecoded}. |
2769 | |
2770 | \sa FormattingOptions, url(), setUrl() |
2771 | */ |
2772 | QString QUrl::toString(FormattingOptions options) const |
2773 | { |
2774 | QString url; |
2775 | if (!isValid()) { |
2776 | // also catches isEmpty() |
2777 | return url; |
2778 | } |
2779 | if ((options & QUrl::FullyDecoded) == QUrl::FullyDecoded) { |
2780 | qWarning("QUrl: QUrl::FullyDecoded is not permitted when reconstructing the full URL"); |
2781 | options &= ~QUrl::FullyDecoded; |
2782 | //options |= QUrl::PrettyDecoded; // no-op, value is 0 |
2783 | } |
2784 | |
2785 | // return just the path if: |
2786 | // - QUrl::PreferLocalFile is passed |
2787 | // - QUrl::RemovePath isn't passed (rather stupid if the user did...) |
2788 | // - there's no query or fragment to return |
2789 | // that is, either they aren't present, or we're removing them |
2790 | // - it's a local file |
2791 | if (options.testFlag(QUrl::PreferLocalFile) && !options.testFlag(QUrl::RemovePath) |
2792 | && (!d->hasQuery() || options.testFlag(QUrl::RemoveQuery)) |
2793 | && (!d->hasFragment() || options.testFlag(QUrl::RemoveFragment)) |
2794 | && isLocalFile()) { |
2795 | url = d->toLocalFile(options | QUrl::FullyDecoded); |
2796 | return url; |
2797 | } |
2798 | |
2799 | // for the full URL, we consider that the reserved characters are prettier if encoded |
2800 | if (options & DecodeReserved) |
2801 | options &= ~EncodeReserved; |
2802 | else |
2803 | options |= EncodeReserved; |
2804 | |
2805 | if (!(options & QUrl::RemoveScheme) && d->hasScheme()) |
2806 | url += d->scheme + u':'; |
2807 | |
2808 | bool pathIsAbsolute = d->path.startsWith(u'/'); |
2809 | if (!((options & QUrl::RemoveAuthority) == QUrl::RemoveAuthority) && d->hasAuthority()) { |
2810 | url += "//"_L1; |
2811 | d->appendAuthority(url, options, QUrlPrivate::FullUrl); |
2812 | } else if (isLocalFile() && pathIsAbsolute) { |
2813 | // Comply with the XDG file URI spec, which requires triple slashes. |
2814 | url += "//"_L1; |
2815 | } |
2816 | |
2817 | if (!(options & QUrl::RemovePath)) |
2818 | d->appendPath(url, options, QUrlPrivate::FullUrl); |
2819 | |
2820 | if (!(options & QUrl::RemoveQuery) && d->hasQuery()) { |
2821 | url += u'?'; |
2822 | d->appendQuery(url, options, QUrlPrivate::FullUrl); |
2823 | } |
2824 | if (!(options & QUrl::RemoveFragment) && d->hasFragment()) { |
2825 | url += u'#'; |
2826 | d->appendFragment(url, options, QUrlPrivate::FullUrl); |
2827 | } |
2828 | |
2829 | return url; |
2830 | } |
2831 | |
2832 | /*! |
2833 | \since 5.0 |
2834 | |
2835 | Returns a human-displayable string representation of the URL. |
2836 | The output can be customized by passing flags with \a options. |
2837 | The option RemovePassword is always enabled, since passwords |
2838 | should never be shown back to users. |
2839 | |
2840 | With the default options, the resulting QString can be passed back |
2841 | to a QUrl later on, but any password that was present initially will |
2842 | be lost. |
2843 | |
2844 | \sa FormattingOptions, toEncoded(), toString() |
2845 | */ |
2846 | |
2847 | QString QUrl::toDisplayString(FormattingOptions options) const |
2848 | { |
2849 | return toString(options | RemovePassword); |
2850 | } |
2851 | |
2852 | /*! |
2853 | \since 5.2 |
2854 | |
2855 | Returns an adjusted version of the URL. |
2856 | The output can be customized by passing flags with \a options. |
2857 | |
2858 | The encoding options from QUrl::ComponentFormattingOption don't make |
2859 | much sense for this method, nor does QUrl::PreferLocalFile. |
2860 | |
2861 | This is always equivalent to QUrl(url.toString(options)). |
2862 | |
2863 | \sa FormattingOptions, toEncoded(), toString() |
2864 | */ |
2865 | QUrl QUrl::adjusted(QUrl::FormattingOptions options) const |
2866 | { |
2867 | if (!isValid()) { |
2868 | // also catches isEmpty() |
2869 | return QUrl(); |
2870 | } |
2871 | QUrl that = *this; |
2872 | if (options & RemoveScheme) |
2873 | that.setScheme(QString()); |
2874 | if ((options & RemoveAuthority) == RemoveAuthority) { |
2875 | that.setAuthority(QString()); |
2876 | } else { |
2877 | if ((options & RemoveUserInfo) == RemoveUserInfo) |
2878 | that.setUserInfo(QString()); |
2879 | else if (options & RemovePassword) |
2880 | that.setPassword(QString()); |
2881 | if (options & RemovePort) |
2882 | that.setPort(-1); |
2883 | } |
2884 | if (options & RemoveQuery) |
2885 | that.setQuery(QString()); |
2886 | if (options & RemoveFragment) |
2887 | that.setFragment(QString()); |
2888 | if (options & RemovePath) { |
2889 | that.setPath(QString()); |
2890 | } else if (options & (StripTrailingSlash | RemoveFilename | NormalizePathSegments)) { |
2891 | that.detach(); |
2892 | QString path; |
2893 | d->appendPath(path, options | FullyEncoded, QUrlPrivate::Path); |
2894 | that.d->setPath(path, 0, path.size()); |
2895 | } |
2896 | return that; |
2897 | } |
2898 | |
2899 | /*! |
2900 | Returns the encoded representation of the URL if it's valid; |
2901 | otherwise an empty QByteArray is returned. The output can be |
2902 | customized by passing flags with \a options. |
2903 | |
2904 | The user info, path and fragment are all converted to UTF-8, and |
2905 | all non-ASCII characters are then percent encoded. The host name |
2906 | is encoded using Punycode. |
2907 | */ |
2908 | QByteArray QUrl::toEncoded(FormattingOptions options) const |
2909 | { |
2910 | options &= ~(FullyDecoded | FullyEncoded); |
2911 | return toString(options | FullyEncoded).toLatin1(); |
2912 | } |
2913 | |
2914 | /*! |
2915 | Parses \a input and returns the corresponding QUrl. \a input is |
2916 | assumed to be in encoded form, containing only ASCII characters. |
2917 | |
2918 | Parses the URL using \a mode. See setUrl() for more information on |
2919 | this parameter. QUrl::DecodedMode is not permitted in this context. |
2920 | |
2921 | \note In Qt versions prior to 6.7, this function took a QByteArray, not |
2922 | QByteArrayView. If you experience compile errors, it's because your code |
2923 | is passing objects that are implicitly convertible to QByteArray, but not |
2924 | QByteArrayView. Wrap the corresponding argument in \c{QByteArray{~~~}} to |
2925 | make the cast explicit. This is backwards-compatible with old Qt versions. |
2926 | |
2927 | \sa toEncoded(), setUrl() |
2928 | */ |
2929 | QUrl QUrl::fromEncoded(QByteArrayView input, ParsingMode mode) |
2930 | { |
2931 | return QUrl(QString::fromUtf8(input), mode); |
2932 | } |
2933 | |
2934 | /*! |
2935 | Returns a decoded copy of \a input. \a input is first decoded from |
2936 | percent encoding, then converted from UTF-8 to Unicode. |
2937 | |
2938 | \note Given invalid input (such as a string containing the sequence "%G5", |
2939 | which is not a valid hexadecimal number) the output will be invalid as |
2940 | well. As an example: the sequence "%G5" could be decoded to 'W'. |
2941 | */ |
2942 | QString QUrl::fromPercentEncoding(const QByteArray &input) |
2943 | { |
2944 | QByteArray ba = QByteArray::fromPercentEncoding(input); |
2945 | return QString::fromUtf8(ba, ba.size()); |
2946 | } |
2947 | |
2948 | /*! |
2949 | Returns an encoded copy of \a input. \a input is first converted |
2950 | to UTF-8, and all ASCII characters that are not in the unreserved group |
2951 | are percent encoded. To prevent characters from being percent encoded |
2952 | pass them to \a exclude. To force characters to be percent encoded pass |
2953 | them to \a include. |
2954 | |
2955 | Unreserved is defined as: |
2956 | \tt {ALPHA / DIGIT / "-" / "." / "_" / "~"} |
2957 | |
2958 | \snippet code/src_corelib_io_qurl.cpp 6 |
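
The unreserved rule and the roles of the exclude and include arguments can be
illustrated with a standalone sketch that operates on UTF-8 bytes in a
std::string. percentEncode is a hypothetical helper, not Qt API, and it is not
claimed to match QByteArray::toPercentEncoding() in every corner case. In this
sketch, every byte outside the unreserved set is encoded (including the bytes
of multi-byte UTF-8 sequences), include takes precedence over exclude, and '%'
is always encoded.

```cpp
#include <string>

// Hypothetical helper: percent-encodes the bytes of a UTF-8 string, keeping
// unreserved characters and those in exclude, unless they appear in include.
std::string percentEncode(const std::string &utf8,
                          const std::string &exclude = {},
                          const std::string &include = {})
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : utf8) {
        // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
        const bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
        const bool excluded = exclude.find(static_cast<char>(c)) != std::string::npos;
        const bool included = include.find(static_cast<char>(c)) != std::string::npos;
        if (c != '%' && (unreserved || excluded) && !included) {
            out += static_cast<char>(c); // keep the byte as-is
        } else {
            out += '%';                  // emit %XX with uppercase hex digits
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```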
2959 | */ |
2960 | QByteArray QUrl::toPercentEncoding(const QString &input, const QByteArray &exclude, const QByteArray &include) |
2961 | { |
2962 | return input.toUtf8().toPercentEncoding(exclude, include); |
2963 | } |
2964 | |
2965 | /*! |
2966 | \since 6.3 |
2967 | |
2968 | Returns the Unicode form of the given domain name |
2969 | \a domain, which is encoded in the ASCII Compatible Encoding (ACE). |
2970 | The output can be customized by passing flags with \a options. |
2971 | The result of this function is considered equivalent to \a domain. |
2972 | |
2973 | If the value in \a domain cannot be encoded, it will be converted |
2974 | to QString and returned. |
2975 | |
2976 | The ASCII-Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491 |
2977 | and RFC 3492 and updated by the Unicode Technical Standard #46. It is part |
2978 | of the Internationalizing Domain Names in Applications (IDNA) specification, |
2979 | which allows for domain names (like \c "example.com") to be written using |
2980 | non-US-ASCII characters. |
2981 | */ |
2982 | QString QUrl::fromAce(const QByteArray &domain, QUrl::AceProcessingOptions options) |
2983 | { |
2984 | return qt_ACE_do(QString::fromLatin1(domain), NormalizeAce, |
2985 | ForbidLeadingDot /*FIXME: make configurable*/, options); |
2986 | } |
2987 | |
2988 | /*! |
2989 | \since 6.3 |
2990 | |
2991 | Returns the ASCII Compatible Encoding of the given domain name \a domain. |
2992 | The output can be customized by passing flags with \a options. |
2993 | The result of this function is considered equivalent to \a domain. |
2994 | |
2995 | The ASCII-Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491 |
2996 | and RFC 3492 and updated by the Unicode Technical Standard #46. It is part |
2997 | of the Internationalizing Domain Names in Applications (IDNA) specification, |
2998 | which allows for domain names (like \c "example.com") to be written using |
2999 | non-US-ASCII characters. |
3000 | |
3001 | This function returns an empty QByteArray if \a domain is not a valid |
3002 | hostname. Note, in particular, that IPv6 literals are not valid domain |
3003 | names. |
3004 | */ |
3005 | QByteArray QUrl::toAce(const QString &domain, AceProcessingOptions options) |
3006 | { |
3007 | return qt_ACE_do(domain, ToAceOnly, ForbidLeadingDot /*FIXME: make configurable*/, options) |
3008 | .toLatin1(); |
3009 | } |
3010 | |
3011 | /*! |
3012 | \internal |
3013 | |
3014 | \fn bool QUrl::operator<(const QUrl &lhs, const QUrl &rhs) |
3015 | |
3016 | Returns \c true if URL \a lhs is "less than" URL \a rhs. This |
3017 | provides a means of ordering URLs. |
3018 | */ |
3019 | |
3020 | Qt::weak_ordering compareThreeWay(const QUrl &lhs, const QUrl &rhs) |
3021 | { |
3022 | if (!lhs.d || !rhs.d) { |
3023 | bool thisIsEmpty = !lhs.d || lhs.d->isEmpty(); |
3024 | bool thatIsEmpty = !rhs.d || rhs.d->isEmpty(); |
3025 | |
3026 | // sort an empty URL first |
3027 | if (thisIsEmpty) { |
3028 | if (!thatIsEmpty) |
3029 | return Qt::weak_ordering::less; |
3030 | else |
3031 | return Qt::weak_ordering::equivalent; |
3032 | } else { |
3033 | return Qt::weak_ordering::greater; |
3034 | } |
3035 | } |
3036 | |
3037 | int cmp; |
3038 | cmp = lhs.d->scheme.compare(rhs.d->scheme); |
3039 | if (cmp != 0) |
3040 | return Qt::compareThreeWay(cmp, 0); |
3041 | |
3042 | cmp = lhs.d->userName.compare(rhs.d->userName); |
3043 | if (cmp != 0) |
3044 | return Qt::compareThreeWay(cmp, 0); |
3045 | |
3046 | cmp = lhs.d->password.compare(rhs.d->password); |
3047 | if (cmp != 0) |
3048 | return Qt::compareThreeWay(cmp, 0); |
3049 | |
3050 | cmp = lhs.d->host.compare(rhs.d->host); |
3051 | if (cmp != 0) |
3052 | return Qt::compareThreeWay(cmp, 0); |
3053 | |
3054 | if (lhs.d->port != rhs.d->port) |
3055 | return Qt::compareThreeWay(lhs.d->port, rhs.d->port); |
3056 | |
3057 | cmp = lhs.d->path.compare(rhs.d->path); |
3058 | if (cmp != 0) |
3059 | return Qt::compareThreeWay(cmp, 0); |
3060 | |
3061 | if (lhs.d->hasQuery() != rhs.d->hasQuery()) |
3062 | return rhs.d->hasQuery() ? Qt::weak_ordering::less : Qt::weak_ordering::greater; |
3063 | |
3064 | cmp = lhs.d->query.compare(rhs.d->query); |
3065 | if (cmp != 0) |
3066 | return Qt::compareThreeWay(cmp, 0); |
3067 | |
3068 | if (lhs.d->hasFragment() != rhs.d->hasFragment()) |
3069 | return rhs.d->hasFragment() ? Qt::weak_ordering::less : Qt::weak_ordering::greater; |
3070 | |
3071 | cmp = lhs.d->fragment.compare(rhs.d->fragment); |
3072 | return Qt::compareThreeWay(cmp, 0); |
3073 | } |
3074 | |
3075 | /*! |
3076 | \fn bool QUrl::operator==(const QUrl &lhs, const QUrl &rhs) |
3077 | |
3078 | Returns \c true if \a lhs and \a rhs URLs are equivalent; |
3079 | otherwise returns \c false. |
3080 | |
3081 | \sa matches() |
3082 | */ |
3083 | |
3084 | bool comparesEqual(const QUrl &lhs, const QUrl &rhs) |
3085 | { |
3086 | if (!lhs.d && !rhs.d) |
3087 | return true; |
3088 | if (!lhs.d) |
3089 | return rhs.d->isEmpty(); |
3090 | if (!rhs.d) |
3091 | return lhs.d->isEmpty(); |
3092 | |
3093 | // First, compare which sections are present, since it speeds up the |
3094 | // processing considerably. We just have to ignore the host-is-present flag |
3095 | // for local files (the "file" protocol), due to the requirements of the |
3096 | // XDG file URI specification. |
3097 | int mask = QUrlPrivate::FullUrl; |
3098 | if (lhs.isLocalFile()) |
3099 | mask &= ~QUrlPrivate::Host; |
3100 | return (lhs.d->sectionIsPresent & mask) == (rhs.d->sectionIsPresent & mask) && |
3101 | lhs.d->scheme == rhs.d->scheme && |
3102 | lhs.d->userName == rhs.d->userName && |
3103 | lhs.d->password == rhs.d->password && |
3104 | lhs.d->host == rhs.d->host && |
3105 | lhs.d->port == rhs.d->port && |
3106 | lhs.d->path == rhs.d->path && |
3107 | lhs.d->query == rhs.d->query && |
3108 | lhs.d->fragment == rhs.d->fragment; |
3109 | } |
3110 | |
3111 | /*! |
3112 | \since 5.2 |
3113 | |
3114 | Returns \c true if this URL and the given \a url are equal after |
3115 | applying \a options to both; otherwise returns \c false. |
3116 | |
3117 | This is equivalent to calling adjusted(options) on both URLs |
3118 | and comparing the resulting URLs, but faster. |
3119 | |
3120 | */ |
3121 | bool QUrl::matches(const QUrl &url, FormattingOptions options) const |
3122 | { |
3123 | if (!d && !url.d) |
3124 | return true; |
3125 | if (!d) |
3126 | return url.d->isEmpty(); |
3127 | if (!url.d) |
3128 | return d->isEmpty(); |
3129 | |
3130 | // First, compare which sections are present, since it speeds up the |
3131 | // processing considerably. We just have to ignore the host-is-present flag |
3132 | // for local files (the "file" protocol), due to the requirements of the |
3133 | // XDG file URI specification. |
3134 | int mask = QUrlPrivate::FullUrl; |
3135 | if (isLocalFile()) |
3136 | mask &= ~QUrlPrivate::Host; |
3137 | |
3138 | if (options.testFlag(QUrl::RemoveScheme)) |
3139 | mask &= ~QUrlPrivate::Scheme; |
3140 | else if (d->scheme != url.d->scheme) |
3141 | return false; |
3142 | |
3143 | if (options.testFlag(QUrl::RemovePassword)) |
3144 | mask &= ~QUrlPrivate::Password; |
3145 | else if (d->password != url.d->password) |
3146 | return false; |
3147 | |
3148 | if (options.testFlag(QUrl::RemoveUserInfo)) |
3149 | mask &= ~QUrlPrivate::UserName; |
3150 | else if (d->userName != url.d->userName) |
3151 | return false; |
3152 | |
3153 | if (options.testFlag(QUrl::RemovePort)) |
3154 | mask &= ~QUrlPrivate::Port; |
3155 | else if (d->port != url.d->port) |
3156 | return false; |
3157 | |
3158 | if (options.testFlag(QUrl::RemoveAuthority)) |
3159 | mask &= ~QUrlPrivate::Host; |
3160 | else if (d->host != url.d->host) |
3161 | return false; |
3162 | |
3163 | if (options.testFlag(QUrl::RemoveQuery)) |
3164 | mask &= ~QUrlPrivate::Query; |
3165 | else if (d->query != url.d->query) |
3166 | return false; |
3167 | |
3168 | if (options.testFlag(QUrl::RemoveFragment)) |
3169 | mask &= ~QUrlPrivate::Fragment; |
3170 | else if (d->fragment != url.d->fragment) |
3171 | return false; |
3172 | |
3173 | if ((d->sectionIsPresent & mask) != (url.d->sectionIsPresent & mask)) |
3174 | return false; |
3175 | |
3176 | if (options.testFlag(f: QUrl::RemovePath)) |
3177 | return true; |
3178 | |
3179 | // Compare paths, after applying path-related options |
3180 | QString path1; |
3181 | d->appendPath(appendTo&: path1, options, appendingTo: QUrlPrivate::Path); |
3182 | QString path2; |
3183 | url.d->appendPath(appendTo&: path2, options, appendingTo: QUrlPrivate::Path); |
3184 | return path1 == path2; |
3185 | } |

/*!
    \fn bool QUrl::operator !=(const QUrl &lhs, const QUrl &rhs)

    Returns \c true if \a lhs and \a rhs URLs are not equal;
    otherwise returns \c false.

    \sa matches()
*/

/*!
    Assigns the specified \a url to this object.
*/
QUrl &QUrl::operator =(const QUrl &url) noexcept
{
    if (!d) {
        if (url.d) {
            url.d->ref.ref();
            d = url.d;
        }
    } else {
        if (url.d)
            qAtomicAssign(d, url.d);
        else
            clear();
    }
    return *this;
}

/*!
    Assigns the specified \a url to this object.
*/
QUrl &QUrl::operator =(const QString &url)
{
    if (url.isEmpty()) {
        clear();
    } else {
        detach();
        d->parse(url, TolerantMode);
    }
    return *this;
}

/*!
    \fn void QUrl::swap(QUrl &other)
    \since 4.8

    Swaps URL \a other with this URL. This operation is very
    fast and never fails.
*/

/*!
    \internal

    Forces a detach.
*/
void QUrl::detach()
{
    if (!d)
        d = new QUrlPrivate;
    else
        qAtomicDetach(d);
}

/*!
    \internal
*/
bool QUrl::isDetached() const
{
    return !d || d->ref.loadRelaxed() == 1;
}

static QString fromNativeSeparators(const QString &pathName)
{
#if defined(Q_OS_WIN)
    QString result(pathName);
    const QChar nativeSeparator = u'\\';
    auto i = result.indexOf(nativeSeparator);
    if (i != -1) {
        QChar * const data = result.data();
        const auto length = result.length();
        for (; i < length; ++i) {
            if (data[i] == nativeSeparator)
                data[i] = u'/';
        }
    }
    return result;
#else
    return pathName;
#endif
}

/*!
    Returns a QUrl representation of \a localFile, interpreted as a local
    file. This function accepts paths separated by slashes as well as the
    native separator for this platform.

    This function also accepts paths with a doubled leading slash (or
    backslash) to indicate a remote file, as in
    "//servername/path/to/file.txt". Note that only certain platforms can
    actually open this file using QFile::open().

    An empty \a localFile leads to an empty URL (since Qt 5.4).

    \snippet code/src_corelib_io_qurl.cpp 16

    In the first line of the snippet above, a file URL is constructed from a
    local, relative path. A file URL with a relative path only makes sense
    if there is a base URL to resolve it against. For example:

    \snippet code/src_corelib_io_qurl.cpp 17

    To resolve such a URL, it's necessary to remove the scheme beforehand:

    \snippet code/src_corelib_io_qurl.cpp 18

    For this reason, it is better to use a relative URL (that is, no scheme)
    for relative file paths:

    \snippet code/src_corelib_io_qurl.cpp 19

    \sa toLocalFile(), isLocalFile(), QDir::toNativeSeparators()
*/
QUrl QUrl::fromLocalFile(const QString &localFile)
{
    QUrl url;
    if (localFile.isEmpty())
        return url;
    QString scheme = fileScheme();
    QString deslashified = fromNativeSeparators(localFile);

    // magic for drives on windows
    if (deslashified.size() > 1 && deslashified.at(1) == u':' && deslashified.at(0) != u'/') {
        deslashified.prepend(u'/');
    } else if (deslashified.startsWith("//"_L1)) {
        // magic for shared drive on windows
        qsizetype indexOfPath = deslashified.indexOf(u'/', 2);
        QStringView hostSpec = QStringView{deslashified}.mid(2, indexOfPath - 2);
        // Check for Windows-specific WebDAV specification: "//host@SSL/path".
        if (hostSpec.endsWith(webDavSslTag(), Qt::CaseInsensitive)) {
            hostSpec.truncate(hostSpec.size() - 4);
            scheme = webDavScheme();
        }

        // hosts can't be IPv6 addresses without [], so we can use QUrlPrivate::setHost
        url.detach();
        if (!url.d->setHost(hostSpec.toString(), 0, hostSpec.size(), StrictMode)) {
            if (url.d->error->code != QUrlPrivate::InvalidRegNameError)
                return url;

            // Path hostname is not a valid URL host, so set it entirely in the path
            // (by leaving deslashified unchanged)
        } else if (indexOfPath > 2) {
            deslashified = deslashified.right(deslashified.size() - indexOfPath);
        } else {
            deslashified.clear();
        }
    }

    url.setScheme(scheme);
    url.setPath(deslashified, DecodedMode);
    return url;
}

/*!
    Returns the path of this URL formatted as a local file path. The path
    returned will use forward slashes, even if it was originally created
    from one with backslashes.

    If this URL contains a non-empty hostname, it will be encoded in the
    returned value in the form found on SMB networks (for example,
    "//servername/path/to/file.txt").

    \snippet code/src_corelib_io_qurl.cpp 20

    Note: if the path component of this URL contains a non-UTF-8 binary
    sequence (such as %80), the behavior of this function is undefined.

    \sa fromLocalFile(), isLocalFile()
*/
QString QUrl::toLocalFile() const
{
    // the call to isLocalFile() also ensures that we're parsed
    if (!isLocalFile())
        return QString();

    return d->toLocalFile(QUrl::FullyDecoded);
}

/*!
    \since 4.8
    Returns \c true if this URL is pointing to a local file path. A URL is a
    local file path if the scheme is "file".

    Note that this function considers URLs with hostnames to be local file
    paths, even if the eventual file path cannot be opened with
    QFile::open().
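
    The following sketch illustrates the rule with placeholder URLs; only
    the scheme decides the result:

    \code
    QUrl::fromLocalFile(QStringLiteral("/tmp/report.txt")).isLocalFile();  // true
    QUrl(QStringLiteral("file://server/share/report.txt")).isLocalFile();  // true, despite the hostname
    QUrl(QStringLiteral("http://example.com/report.txt")).isLocalFile();   // false
    \endcode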

    \sa fromLocalFile(), toLocalFile()
*/
bool QUrl::isLocalFile() const
{
    return d && d->isLocalFile();
}

/*!
    Returns \c true if this URL is a parent of \a childUrl. \a childUrl is a child
    of this URL if the two URLs share the same scheme and authority,
    and this URL's path is a parent of the path of \a childUrl.
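
    As an illustration (the URLs are placeholders), a path that merely
    shares a textual prefix with this URL's path does not count as a child:

    \code
    const QUrl base(QStringLiteral("http://example.com/docs"));

    base.isParentOf(QUrl(QStringLiteral("http://example.com/docs/index.html")));       // true
    base.isParentOf(QUrl(QStringLiteral("http://example.com/docs-old/index.html")));   // false: prefix only
    base.isParentOf(QUrl(QStringLiteral("http://other.example.com/docs/index.html"))); // false: other authority
    \endcode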
*/
bool QUrl::isParentOf(const QUrl &childUrl) const
{
    QString childPath = childUrl.path();

    if (!d)
        return ((childUrl.scheme().isEmpty())
                && (childUrl.authority().isEmpty())
                && childPath.size() > 0 && childPath.at(0) == u'/');

    QString ourPath = path();

    return ((childUrl.scheme().isEmpty() || d->scheme == childUrl.scheme())
            && (childUrl.authority().isEmpty() || authority() == childUrl.authority())
            && childPath.startsWith(ourPath)
            && ((ourPath.endsWith(u'/') && childPath.size() > ourPath.size())
                || (!ourPath.endsWith(u'/') && childPath.size() > ourPath.size()
                    && childPath.at(ourPath.size()) == u'/')));
}


#ifndef QT_NO_DATASTREAM
/*! \relates QUrl

    Writes the URL \a url to the stream \a out and returns a reference
    to the stream.

    \sa{Serializing Qt Data Types}{Format of the QDataStream operators}
*/
QDataStream &operator<<(QDataStream &out, const QUrl &url)
{
    QByteArray u;
    if (url.isValid())
        u = url.toEncoded();
    out << u;
    return out;
}

/*! \relates QUrl

    Reads a URL into \a url from the stream \a in and returns a
    reference to the stream.
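
    A minimal round trip through a QByteArray might look as follows; the
    URL and the variable names are only examples:

    \code
    QByteArray buffer;
    {
        QDataStream out(&buffer, QIODevice::WriteOnly);
        out << QUrl(QStringLiteral("http://example.com/index.html"));
    }

    QDataStream in(buffer); // read-only stream over the serialized bytes
    QUrl restored;
    in >> restored;
    \endcode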

    \sa{Serializing Qt Data Types}{Format of the QDataStream operators}
*/
QDataStream &operator>>(QDataStream &in, QUrl &url)
{
    QByteArray u;
    in >> u;
    url.setUrl(QString::fromLatin1(u));
    return in;
}
#endif // QT_NO_DATASTREAM

#ifndef QT_NO_DEBUG_STREAM
QDebug operator<<(QDebug d, const QUrl &url)
{
    QDebugStateSaver saver(d);
    d.nospace() << "QUrl(" << url.toDisplayString() << ')';
    return d;
}
#endif

static QString errorMessage(QUrlPrivate::ErrorCode errorCode, const QString &errorSource, qsizetype errorPosition)
{
    QChar c = size_t(errorPosition) < size_t(errorSource.size()) ?
                errorSource.at(errorPosition) : QChar(QChar::Null);

    switch (errorCode) {
    case QUrlPrivate::NoError:
        Q_UNREACHABLE_RETURN(QString()); // QUrl::errorString should have treated this condition

    case QUrlPrivate::InvalidSchemeError: {
        auto msg = "Invalid scheme (character '%1' not permitted)"_L1;
        return msg.arg(c);
    }

    case QUrlPrivate::InvalidUserNameError:
        return "Invalid user name (character '%1' not permitted)"_L1
                .arg(c);

    case QUrlPrivate::InvalidPasswordError:
        return "Invalid password (character '%1' not permitted)"_L1
                .arg(c);

    case QUrlPrivate::InvalidRegNameError:
        if (errorPosition >= 0)
            return "Invalid hostname (character '%1' not permitted)"_L1
                    .arg(c);
        else
            return QStringLiteral("Invalid hostname (contains invalid characters)");
    case QUrlPrivate::InvalidIPv4AddressError:
        return QString(); // doesn't happen yet
    case QUrlPrivate::InvalidIPv6AddressError:
        return QStringLiteral("Invalid IPv6 address");
    case QUrlPrivate::InvalidCharacterInIPv6Error:
        return "Invalid IPv6 address (character '%1' not permitted)"_L1.arg(c);
    case QUrlPrivate::InvalidIPvFutureError:
        return "Invalid IPvFuture address (character '%1' not permitted)"_L1.arg(c);
    case QUrlPrivate::HostMissingEndBracket:
        return QStringLiteral("Expected ']' to match '[' in hostname");

    case QUrlPrivate::InvalidPortError:
        return QStringLiteral("Invalid port or port number out of range");
    case QUrlPrivate::PortEmptyError:
        return QStringLiteral("Port field was empty");

    case QUrlPrivate::InvalidPathError:
        return "Invalid path (character '%1' not permitted)"_L1
                .arg(c);

    case QUrlPrivate::InvalidQueryError:
        return "Invalid query (character '%1' not permitted)"_L1
                .arg(c);

    case QUrlPrivate::InvalidFragmentError:
        return "Invalid fragment (character '%1' not permitted)"_L1
                .arg(c);

    case QUrlPrivate::AuthorityPresentAndPathIsRelative:
        return QStringLiteral("Path component is relative and authority is present");
    case QUrlPrivate::AuthorityAbsentAndPathIsDoubleSlash:
        return QStringLiteral("Path component starts with '//' and authority is absent");
    case QUrlPrivate::RelativeUrlPathContainsColonBeforeSlash:
        return QStringLiteral("Relative URL's path component contains ':' before any '/'");
    }

    Q_UNREACHABLE_RETURN(QString());
}

static inline void appendComponentIfPresent(QString &msg, bool present, const char *componentName,
                                            const QString &component)
{
    if (present)
        msg += QLatin1StringView(componentName) % u'"' % component % "\","_L1;
}

/*!
    \since 4.2

    Returns an error message if the last operation that modified this QUrl
    object ran into a parsing error. If no error was detected, this function
    returns an empty string and isValid() returns \c true.

    The error message returned by this function is technical in nature and may
    not be understood by end users. It is mostly useful to developers trying to
    understand why QUrl will not accept some input.
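
    The following sketch shows one way to log a rejected URL during
    development. The input string is only an illustration, and the exact
    wording of the message should not be relied upon.

    \code
    const QUrl url(QStringLiteral("http://example.com:port/"));
    if (!url.isValid())
        qWarning() << "Rejected URL:" << url.errorString();
    \endcode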

    \sa QUrl::ParsingMode
*/
QString QUrl::errorString() const
{
    QString msg;
    if (!d)
        return msg;

    QString errorSource;
    qsizetype errorPosition = 0;
    QUrlPrivate::ErrorCode errorCode = d->validityError(&errorSource, &errorPosition);
    if (errorCode == QUrlPrivate::NoError)
        return msg;

    msg += errorMessage(errorCode, errorSource, errorPosition);
    msg += "; source was \""_L1;
    msg += errorSource;
    msg += "\";"_L1;
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Scheme,
                             " scheme = ", d->scheme);
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::UserInfo,
                             " userinfo = ", userInfo());
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Host,
                             " host = ", d->host);
    appendComponentIfPresent(msg, d->port != -1,
                             " port = ", QString::number(d->port));
    appendComponentIfPresent(msg, !d->path.isEmpty(),
                             " path = ", d->path);
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Query,
                             " query = ", d->query);
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Fragment,
                             " fragment = ", d->fragment);
    if (msg.endsWith(u','))
        msg.chop(1);
    return msg;
}

/*!
    \since 5.1

    Converts a list of \a urls into a list of QString objects, using toString(\a options).
*/
QStringList QUrl::toStringList(const QList<QUrl> &urls, FormattingOptions options)
{
    QStringList lst;
    lst.reserve(urls.size());
    for (const QUrl &url : urls)
        lst.append(url.toString(options));
    return lst;
}

/*!
    \since 5.1

    Converts a list of strings representing \a urls into a list of URLs, using QUrl(str, \a mode).
    Note that this means all strings must be URLs, not, for instance, local paths.
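
    For example, assuming the strings already hold textual URLs (the values
    here are placeholders), this function and toStringList() can be combined
    into a round trip:

    \code
    const QStringList strings = {
        QStringLiteral("http://example.com/a"),
        QStringLiteral("file:///tmp/b.txt")
    };
    const QList<QUrl> parsed = QUrl::fromStringList(strings);
    const QStringList back = QUrl::toStringList(parsed);
    \endcode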
*/
QList<QUrl> QUrl::fromStringList(const QStringList &urls, ParsingMode mode)
{
    QList<QUrl> lst;
    lst.reserve(urls.size());
    for (const QString &str : urls)
        lst.append(QUrl(str, mode));
    return lst;
}

/*!
    \typedef QUrl::DataPtr
    \internal
*/

/*!
    \fn DataPtr &QUrl::data_ptr()
    \internal
*/

/*!
    Returns the hash value for the \a url. If specified, \a seed is used to
    initialize the hash.

    \relates QHash
    \since 5.0
*/
size_t qHash(const QUrl &url, size_t seed) noexcept
{
    if (!url.d)
        return qHash(-1, seed); // the hash of an unset port (-1)

    return qHash(url.d->scheme) ^
            qHash(url.d->userName) ^
            qHash(url.d->password) ^
            qHash(url.d->host) ^
            qHash(url.d->port, seed) ^
            qHash(url.d->path) ^
            qHash(url.d->query) ^
            qHash(url.d->fragment);
}

static QUrl adjustFtpPath(QUrl url)
{
    if (url.scheme() == ftpScheme()) {
        QString path = url.path(QUrl::PrettyDecoded);
        if (path.startsWith("//"_L1))
            url.setPath("/%2F"_L1 + QStringView{path}.mid(2), QUrl::TolerantMode);
    }
    return url;
}

static bool isIp6(const QString &text)
{
    QIPAddressUtils::IPv6Address address;
    return !text.isEmpty() && QIPAddressUtils::parseIp6(address, text.begin(), text.end()) == nullptr;
}

/*!
    Returns a valid URL from a user-supplied \a userInput string if one can be
    deduced. If that is not possible, an invalid QUrl() is returned.

    This allows the user to input a URL or a local file path in the form of a plain
    string. This string can be manually typed into a location bar, obtained from
    the clipboard, or passed in via command line arguments.

    When the string is not already a valid URL, a best guess is performed,
    making various assumptions.

    If the string corresponds to a valid file path on the system,
    a file:// URL is constructed, using QUrl::fromLocalFile().

    If that is not the case, an attempt is made to turn the string into an
    http:// or ftp:// URL. The latter is chosen if the string starts with
    'ftp'. The result is then passed through QUrl's tolerant parser, and
    in case of success, a valid QUrl is returned, or else a QUrl().

    \section1 Examples:

    \list
    \li qt-project.org becomes http://qt-project.org
    \li ftp.qt-project.org becomes ftp://ftp.qt-project.org
    \li hostname becomes http://hostname
    \li /home/user/test.html becomes file:///home/user/test.html
    \endlist

    In order to be able to handle relative paths, this method takes an optional
    \a workingDirectory path. This is especially useful when handling command
    line arguments.
    If \a workingDirectory is empty, no handling of relative paths will be done.

    By default, an input string that looks like a relative path will only be treated
    as such if the file actually exists in the given working directory.
    If the application can handle files that don't exist yet, it should pass the
    flag AssumeLocalFile in \a options.
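
    As a minimal sketch of the command-line use case, the loop below resolves
    each argument against the current directory. It assumes that a
    QCoreApplication instance already exists; \c processUrl() is a hypothetical
    application function, not part of Qt.

    \code
    const QStringList args = QCoreApplication::arguments().mid(1);
    for (const QString &arg : args) {
        const QUrl url = QUrl::fromUserInput(arg, QDir::currentPath(), QUrl::AssumeLocalFile);
        if (url.isValid())
            processUrl(url); // hypothetical handler
    }
    \endcode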

    \since 5.4
*/
QUrl QUrl::fromUserInput(const QString &userInput, const QString &workingDirectory,
                         UserInputResolutionOptions options)
{
    QString trimmedString = userInput.trimmed();

    if (trimmedString.isEmpty())
        return QUrl();

    // Check for IPv6 addresses, since a path starting with ":" is absolute (a resource)
    // and IPv6 addresses can start with "c:" too
    if (isIp6(trimmedString)) {
        QUrl url;
        url.setHost(trimmedString);
        url.setScheme(QStringLiteral("http"));
        return url;
    }

    const QUrl url = QUrl(trimmedString, QUrl::TolerantMode);

    // Check for a relative path
    if (!workingDirectory.isEmpty()) {
        const QFileInfo fileInfo(QDir(workingDirectory), userInput);
        if (fileInfo.exists())
            return QUrl::fromLocalFile(fileInfo.absoluteFilePath());

        // Check both QUrl::isRelative (to detect full URLs) and QDir::isAbsolutePath
        // (since on Windows drive letters can be interpreted as schemes)
        if ((options & AssumeLocalFile) && url.isRelative() && !QDir::isAbsolutePath(userInput))
            return QUrl::fromLocalFile(fileInfo.absoluteFilePath());
    }

    // Check first for files, since on Windows drive letters can be interpreted as schemes
    if (QDir::isAbsolutePath(trimmedString))
        return QUrl::fromLocalFile(trimmedString);

    QUrl urlPrepended = QUrl("http://"_L1 + trimmedString, QUrl::TolerantMode);

    // Check the most common case of a valid url with a scheme
    // We check if the port would be valid by adding the scheme to handle the case host:port
    // where the host would be interpreted as the scheme
    if (url.isValid()
        && !url.scheme().isEmpty()
        && urlPrepended.port() == -1)
        return adjustFtpPath(url);

    // Else, try the prepended one and adjust the scheme from the host name
    if (urlPrepended.isValid() && (!urlPrepended.host().isEmpty() || !urlPrepended.path().isEmpty())) {
        qsizetype dotIndex = trimmedString.indexOf(u'.');
        const QStringView hostscheme = QStringView{trimmedString}.left(dotIndex);
        if (hostscheme.compare(ftpScheme(), Qt::CaseInsensitive) == 0)
            urlPrepended.setScheme(ftpScheme());
        return adjustFtpPath(urlPrepended);
    }

    return QUrl();
}

QT_END_NAMESPACE