/****************************************************************************
**
** Copyright (C) 2016 The Qt Company Ltd.
** Copyright (C) 2016 Intel Corporation.
** Contact: https://www.qt.io/licensing/
**
** This file is part of the QtCore module of the Qt Toolkit.
**
** $QT_BEGIN_LICENSE:LGPL$
** Commercial License Usage
** Licensees holding valid commercial Qt licenses may use this file in
** accordance with the commercial license agreement provided with the
** Software or, alternatively, in accordance with the terms contained in
** a written agreement between you and The Qt Company. For licensing terms
** and conditions see https://www.qt.io/terms-conditions. For further
** information use the contact form at https://www.qt.io/contact-us.
**
** GNU Lesser General Public License Usage
** Alternatively, this file may be used under the terms of the GNU Lesser
** General Public License version 3 as published by the Free Software
** Foundation and appearing in the file LICENSE.LGPL3 included in the
** packaging of this file. Please review the following information to
** ensure the GNU Lesser General Public License version 3 requirements
** will be met: https://www.gnu.org/licenses/lgpl-3.0.html.
**
** GNU General Public License Usage
** Alternatively, this file may be used under the terms of the GNU
** General Public License version 2.0 or (at your option) the GNU General
** Public license version 3 or any later version approved by the KDE Free
** Qt Foundation. The licenses are as published by the Free Software
** Foundation and appearing in the file LICENSE.GPL2 and LICENSE.GPL3
** included in the packaging of this file. Please review the following
** information to ensure the GNU General Public License requirements will
** be met: https://www.gnu.org/licenses/gpl-2.0.html and
** https://www.gnu.org/licenses/gpl-3.0.html.
**
** $QT_END_LICENSE$
**
****************************************************************************/

/*!
    \class QUrl
    \inmodule QtCore

    \brief The QUrl class provides a convenient interface for working
    with URLs.

    \reentrant
    \ingroup io
    \ingroup network
    \ingroup shared


    It can parse and construct URLs in both encoded and unencoded
    form. QUrl also has support for internationalized domain names
    (IDNs).

    The most common way to use QUrl is to initialize it via the
    constructor by passing a QString. Otherwise, setUrl() can also
    be used.

    URLs can be represented in two forms: encoded or unencoded. The
    unencoded representation is suitable for showing to users, but
    the encoded representation is typically what you would send to
    a web server. For example, the unencoded URL
    "http://bühler.example.com/List of applicants.xml"
    would be sent to the server as
    "http://xn--bhler-kva.example.com/List%20of%20applicants.xml".
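
    As a rough illustration of the path part of this example, the
    following standalone sketch percent-encodes the UTF-8 bytes of a
    string, leaving only the RFC 3986 unreserved characters and '/'
    untouched. It is not QUrl's implementation: the helper name
    percentEncodePath is hypothetical, QUrl applies more nuanced
    per-component rules, and the IDN conversion of the host name is
    not modeled here.

```cpp
#include <string>

// Hypothetical helper, not part of the QUrl API: percent-encodes every byte
// except RFC 3986 unreserved characters and the '/' path separator.
std::string percentEncodePath(const std::string &utf8Path)
{
    static const char hexDigits[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : utf8Path) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved || c == '/') {
            out += static_cast<char>(c);
        } else {
            // Emit "%XX" using uppercase hexadecimal digits.
            out += '%';
            out += hexDigits[c >> 4];
            out += hexDigits[c & 0x0F];
        }
    }
    return out;
}
```

    With this sketch, percentEncodePath("/List of applicants.xml")
    yields "/List%20of%20applicants.xml", which matches the path in
    the example above.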

    A URL can also be constructed piece by piece by calling
    setScheme(), setUserName(), setPassword(), setHost(), setPort(),
    setPath(), setQuery() and setFragment(). Some convenience
    functions are also available: setAuthority() sets the user name,
    password, host and port. setUserInfo() sets the user name and
    password at once.

    Call isValid() to check if the URL is valid. This can be done at any point
    during the construction of a URL. If isValid() returns \c false, you should
    clear() the URL before proceeding, or start over by parsing a new URL with
    setUrl().

    Constructing a query is particularly convenient through the use of the \l
    QUrlQuery class and its methods QUrlQuery::setQueryItems(),
    QUrlQuery::addQueryItem() and QUrlQuery::removeQueryItem(). Use
    QUrlQuery::setQueryDelimiters() to customize the delimiters used for
    generating the query string.

    For the convenience of generating encoded URL strings or query
    strings, there are two static functions called
    fromPercentEncoding() and toPercentEncoding() which deal with
    percent encoding and decoding of QString objects.

    fromLocalFile() constructs a QUrl by parsing a local
    file path. toLocalFile() converts a URL to a local file path.

    The human readable representation of the URL is fetched with
    toString(). This representation is appropriate for displaying a
    URL to a user in unencoded form. The encoded form however, as
    returned by toEncoded(), is for internal use, passing to web
    servers, mail clients and so on. Both forms are technically correct
    and represent the same URL unambiguously -- in fact, passing either
    form to QUrl's constructor or to setUrl() will yield the same QUrl
    object.

    QUrl conforms to the URI specification from
    \l{RFC 3986} (Uniform Resource Identifier: Generic Syntax), and includes
    scheme extensions from \l{RFC 1738} (Uniform Resource Locators). Case
    folding rules in QUrl conform to \l{RFC 3491} (Nameprep: A Stringprep
    Profile for Internationalized Domain Names (IDN)). It is also compatible with the
    \l{http://freedesktop.org/wiki/Specifications/file-uri-spec/}{file URI specification}
    from freedesktop.org, provided that the locale encodes file names using
    UTF-8 (required by IDN).

    \section2 Relative URLs vs Relative Paths

    Calling isRelative() will return whether or not the URL is relative.
    A relative URL has no \l {scheme}. For example:

    \snippet code/src_corelib_io_qurl.cpp 8

    Notice that a URL can be absolute while containing a relative path, and
    vice versa:

    \snippet code/src_corelib_io_qurl.cpp 9

    A relative URL can be resolved by passing it as an argument to resolved(),
    which returns an absolute URL. isParentOf() is used for determining whether
    one URL is a parent of another.

    \section2 Error checking

    QUrl is capable of detecting many errors in URLs while parsing them or when
    components of the URL are set with individual setter methods (like
    setScheme(), setHost() or setPath()). If the parsing or setter function is
    successful, any previously recorded error conditions will be discarded.

    By default, QUrl setter methods operate in QUrl::TolerantMode, which means
    they accept some common mistakes and misrepresentations of data. An
    alternate method of parsing is QUrl::StrictMode, which applies further
    checks. See QUrl::ParsingMode for a description of the differences between
    the parsing modes.

    QUrl only checks for conformance with the URL specification. It does not
    try to verify that high-level protocol URLs are in the format they are
    expected to be by handlers elsewhere. For example, the following URIs are
    all considered valid by QUrl, even if they do not make sense when used:

    \list
    \li "http:/filename.html"
    \li "mailto://example.com"
    \endlist

    When the parser encounters an error, it signals the event by making
    isValid() return false and toString() / toEncoded() return an empty string.
    If it is necessary to show the user the reason why the URL failed to parse,
    the error condition can be obtained from QUrl by calling errorString().
    Note that this message is highly technical and may not make sense to
    end-users.

    QUrl is capable of recording only one error condition. If more than one
    error is found, it is undefined which error is reported.

    \section2 Character Conversions

    Follow these rules to avoid erroneous character conversion when
    dealing with URLs and strings:

    \list
    \li When creating a QString to contain a URL from a QByteArray or a
       char*, always use QString::fromUtf8().
    \endlist
*/

/*!
    \enum QUrl::ParsingMode

    The parsing mode controls the way QUrl parses strings.

    \value TolerantMode QUrl will try to correct some common errors in URLs.
                        This mode is useful for parsing URLs coming from sources
                        not known to be strictly standards-conforming.

    \value StrictMode Only valid URLs are accepted. This mode is useful for
                      general URL validation.

    \value DecodedMode QUrl will interpret the URL component in the fully-decoded form,
                       where percent characters stand for themselves, not as the beginning
                       of a percent-encoded sequence. This mode is only valid for the
                       setters setting components of a URL; it is not permitted in
                       the QUrl constructor, in fromEncoded() or in setUrl().
                       For more information on this mode, see the documentation for
                       \l {QUrl::ComponentFormattingOption}{QUrl::FullyDecoded}.

    In TolerantMode, the parser has the following behaviour:

    \list

    \li Spaces and "%20": unencoded space characters will be accepted and will
    be treated as equivalent to "%20".

    \li Single "%" characters: Any occurrences of a percent character "%" not
    followed by exactly two hexadecimal characters (e.g., "13% coverage.html")
    will be replaced by "%25". Note that one lone "%" character will trigger
    the correction mode for all percent characters.

    \li Reserved and unreserved characters: An encoded URL should only
    contain a few characters as literals; all other characters should
    be percent-encoded. In TolerantMode, these characters will be
    accepted if they are found in the URL:
            space / double-quote / "<" / ">" / "\" /
            "^" / "`" / "{" / "|" / "}"
    Those same characters can be decoded again by passing QUrl::DecodeReserved
    to toString() or toEncoded(). In the getters of individual components,
    those characters are often returned in decoded form.

    \endlist

    When in StrictMode, if a parsing error is found, isValid() will return \c
    false and errorString() will return a message describing the error.
    If more than one error is detected, it is undefined which error gets
    reported.

    Note that TolerantMode is not usually enough for parsing user input, which
    often contains more errors and expectations than the parser can deal with.
    When dealing with data coming directly from the user -- as opposed to data
    coming from data-transfer sources, such as other programs -- it is
    recommended to use fromUserInput().

    \sa fromUserInput(), setUrl(), toString(), toEncoded(), QUrl::FormattingOptions
*/

/*!
    \enum QUrl::UrlFormattingOption

    The formatting options define how the URL is formatted when written out
    as text.

    \value None The format of the URL is unchanged.
    \value RemoveScheme The scheme is removed from the URL.
    \value RemovePassword Any password in the URL is removed.
    \value RemoveUserInfo Any user information in the URL is removed.
    \value RemovePort Any specified port is removed from the URL.
    \value RemoveAuthority
    \value RemovePath The URL's path is removed, leaving only the scheme,
                      host address, and port (if present).
    \value RemoveQuery The query part of the URL (following a '?' character)
                       is removed.
    \value RemoveFragment
    \value RemoveFilename The filename (i.e. everything after the last '/' in the path) is removed.
            The trailing '/' is kept, unless StripTrailingSlash is set.
            Only valid if RemovePath is not set.
    \value PreferLocalFile If the URL is a local file according to isLocalFile()
           and contains no query or fragment, a local file path is returned.
    \value StripTrailingSlash The trailing slash is removed from the path, if one is present.
    \value NormalizePathSegments Modifies the path to remove redundant directory separators,
           and to resolve "."s and ".."s (as far as possible). For non-local paths, adjacent
           slashes are preserved.

    Note that the case folding rules in \l{RFC 3491}{Nameprep}, which QUrl
    conforms to, require host names to always be converted to lower case,
    regardless of the QUrl::FormattingOptions used.

    The options from QUrl::ComponentFormattingOptions are also possible.

    \sa QUrl::ComponentFormattingOptions
*/

/*!
    \enum QUrl::ComponentFormattingOption
    \since 5.0

    The component formatting options define how the components of a URL will
    be formatted when written out as text. They can be combined with the
    options from QUrl::FormattingOptions when used in toString() and
    toEncoded().

    \value PrettyDecoded   The component is returned in a "pretty form", with
                           most percent-encoded characters decoded. The exact
                           behavior of PrettyDecoded varies from component to
                           component and may also change from Qt release to Qt
                           release. This is the default.

    \value EncodeSpaces    Leave space characters in their encoded form ("%20").

    \value EncodeUnicode   Leave non-US-ASCII characters encoded in their UTF-8
                           percent-encoded form (e.g., "%C3%A9" for the U+00E9
                           codepoint, LATIN SMALL LETTER E WITH ACUTE).

    \value EncodeDelimiters Leave certain delimiters in their encoded form, as
                            would appear in the URL when the full URL is
                            represented as text. The delimiters affected
                            by this option change from component to component.
                            This flag has no effect in toString() or toEncoded().

    \value EncodeReserved  Leave US-ASCII characters not permitted in the URL by
                           the specification in their encoded form. This is the
                           default on toString() and toEncoded().

    \value DecodeReserved  Decode the US-ASCII characters that the URL specification
                           does not allow to appear in the URL. This is the
                           default on the getters of individual components.

    \value FullyEncoded    Leave all characters in their properly-encoded form,
                           as this component would appear as part of a URL. When
                           used with toString(), this produces a fully-compliant
                           URL in QString form, exactly equal to the result of
                           toEncoded().

    \value FullyDecoded    Attempt to decode as much as possible. For individual
                           components of the URL, this decodes every percent
                           encoding sequence, including control characters (U+0000
                           to U+001F) and UTF-8 sequences found in percent-encoded form.
                           Use of this mode may cause data loss; see below for more information.

    The values of EncodeReserved and DecodeReserved should not be used together
    in one call. The behavior is undefined if that happens. They are provided
    as separate values because the behavior of the "pretty mode" with regards
    to reserved characters is different on certain components and especially on
    the full URL.

    \section2 Full decoding

    The FullyDecoded mode is similar to the behavior of the functions returning
    QString in Qt 4.x, in that every character represents itself and never has
    any special meaning. This is true even for the percent character ('%'),
    which should be interpreted to mean a literal percent, not the beginning of
    a percent-encoded sequence. The same actual character, in all other
    decoding modes, is represented by the sequence "%25".

    Whenever re-applying data obtained with QUrl::FullyDecoded into a QUrl,
    care must be taken to use the QUrl::DecodedMode parameter to the setters
    (like setPath() and setUserName()). Failure to do so may cause
    re-interpretation of the percent character ('%') as the beginning of a
    percent-encoded sequence.
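
    The reason can be sketched without Qt: a fully decoded string needs
    every '%' escaped as "%25" before it is read as encoded text again,
    otherwise a literal percent followed by two hexadecimal digits is
    mistaken for an escape sequence. Both helpers below, decodePercent
    and escapePercent, are hypothetical stand-ins written for this
    illustration and are not QUrl functions.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for reading a component in encoded form:
// "%XX" with two hex digits becomes the byte 0xXX, other text is copied.
std::string decodePercent(const std::string &encoded)
{
    std::string out;
    for (std::size_t i = 0; i < encoded.size(); ++i) {
        const bool isEscape = encoded[i] == '%' && i + 2 < encoded.size()
            && std::isxdigit(static_cast<unsigned char>(encoded[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(encoded[i + 2]));
        if (isEscape) {
            out += static_cast<char>(std::stoi(encoded.substr(i + 1, 2), nullptr, 16));
            i += 2; // skip the two hex digits that were just consumed
        } else {
            out += encoded[i];
        }
    }
    return out;
}

// Hypothetical stand-in for the first step of a decoded-mode setter:
// make every '%' literal by escaping it as "%25".
std::string escapePercent(const std::string &decoded)
{
    std::string out;
    for (char c : decoded) {
        if (c == '%')
            out += "%25";
        else
            out += c;
    }
    return out;
}
```

    In this sketch the fully decoded text "50%41" changes meaning when
    it is read back as encoded text directly, because "%41" is taken as
    an escape sequence. Passing it through escapePercent() first gives
    "50%2541", which decodes back to the original text.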

    This mode is quite useful when portions of a URL are used in a non-URL
    context. For example, to extract the username, password or file paths in an
    FTP client application, the FullyDecoded mode should be used.

    This mode should be used with care, since there are two conditions that
    cannot be reliably represented in the returned QString. They are:

    \list
    \li \b{Non-UTF-8 sequences:} URLs may contain sequences of
    percent-encoded characters that do not form valid UTF-8 sequences. Since
    URLs need to be decoded using UTF-8, any decoder failure will result in
    the QString containing one or more replacement characters where the
    sequence existed.

    \li \b{Encoded delimiters:} URLs are also allowed to make a distinction
    between a delimiter found in its literal form and its equivalent in
    percent-encoded form. This is most commonly found in the query, but is
    permitted in most parts of the URL.
    \endlist

    The following example illustrates the problem:

    \snippet code/src_corelib_io_qurl.cpp 10

    If the two URLs were used via HTTP GET, the interpretation by the web
    server would probably be different. In the first case, it would interpret
    the query as one parameter, with a key of "q" and value "a+=b&c". In the
    second case, it would probably interpret it as two parameters, one with a
    key of "q" and value "a =b", and the second with a key "c" and no value.
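
    The difference can be reproduced with a small standalone sketch of
    how a typical server might parse a query: split on '&', split each
    pair on the first '=', then decode "%XX" escapes and turn '+' into
    a space. The names formDecode, QueryItems and parseQuery are
    hypothetical, and the form-decoding rules are an assumption about
    the server rather than something QUrl does.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Decodes one key or value the way HTML form data is commonly decoded:
// a literal '+' means space and "%XX" means the byte 0xXX.
std::string formDecode(const std::string &text)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const bool isEscape = text[i] == '%' && i + 2 < text.size()
            && std::isxdigit(static_cast<unsigned char>(text[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(text[i + 2]));
        if (isEscape) {
            out += static_cast<char>(std::stoi(text.substr(i + 1, 2), nullptr, 16));
            i += 2; // skip the two hex digits that were just consumed
        } else {
            out += (text[i] == '+') ? ' ' : text[i];
        }
    }
    return out;
}

using QueryItems = std::vector<std::pair<std::string, std::string>>;

// Splits the raw query on literal delimiters first and decodes afterwards,
// so an encoded "%26" never acts as a separator.
QueryItems parseQuery(const std::string &query)
{
    QueryItems items;
    std::size_t start = 0;
    while (start <= query.size()) {
        std::size_t end = query.find('&', start);
        if (end == std::string::npos)
            end = query.size();
        const std::string pair = query.substr(start, end - start);
        if (!pair.empty()) {
            const std::size_t eq = pair.find('=');
            const std::string key = pair.substr(0, eq);
            const std::string value =
                (eq == std::string::npos) ? std::string() : pair.substr(eq + 1);
            items.emplace_back(formDecode(key), formDecode(value));
        }
        start = end + 1;
    }
    return items;
}
```

    Under these assumptions, parseQuery("q=a%2B%3Db%26c") returns the
    single item ("q", "a+=b&c"), while parseQuery("q=a+=b&c") returns
    the two items ("q", "a =b") and ("c", "").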

    \sa QUrl::FormattingOptions
*/

/*!
    \enum QUrl::UserInputResolutionOption
    \since 5.4

    The user input resolution options define how fromUserInput() should
    interpret strings that could either be a relative path or the short
    form of an HTTP URL. For instance, \c{file.pl} can be either a local file
    or the URL \c{http://file.pl}.

    \value DefaultResolution  The default resolution mechanism is to check
                              whether a local file exists, in the working
                              directory given to fromUserInput, and only
                              return a local path in that case. Otherwise a URL
                              is assumed.
    \value AssumeLocalFile    This option makes fromUserInput() always return
                              a local path unless the input contains a scheme, such as
                              \c{http://file.pl}. This is useful for applications
                              such as text editors, which are able to create
                              the file if it doesn't exist.

    \sa fromUserInput()
*/

/*!
    \fn QUrl::QUrl(QUrl &&other)

    Move-constructs a QUrl instance, making it point at the same
    object that \a other was pointing to.

    \since 5.2
*/

/*!
    \fn QUrl &QUrl::operator=(QUrl &&other)

    Move-assigns \a other to this QUrl instance.

    \since 5.2
*/

#include "qurl.h"
#include "qurl_p.h"
#include "qplatformdefs.h"
#include "qstring.h"
#include "qstringlist.h"
#include "qdebug.h"
#include "qhash.h"
#include "qdatastream.h"
#if QT_CONFIG(topleveldomain) // ### Qt6: Remove section
#include "qtldurl_p.h"
#endif
#include "private/qipaddress_p.h"
#include "qurlquery.h"
#include "private/qdir_p.h"
#include <private/qmemory_p.h>

QT_BEGIN_NAMESPACE

inline static bool isHex(char c)
{
    c |= 0x20;
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
}

static inline QString ftpScheme()
{
    return QStringLiteral("ftp");
}

static inline QString fileScheme()
{
    return QStringLiteral("file");
}

static inline QString webDavScheme()
{
    return QStringLiteral("webdavs");
}

static inline QString webDavSslTag()
{
    return QStringLiteral("@SSL");
}

class QUrlPrivate
{
public:
    enum Section : uchar {
        Scheme = 0x01,
        UserName = 0x02,
        Password = 0x04,
        UserInfo = UserName | Password,
        Host = 0x08,
        Port = 0x10,
        Authority = UserInfo | Host | Port,
        Path = 0x20,
        Hierarchy = Authority | Path,
        Query = 0x40,
        Fragment = 0x80,
        FullUrl = 0xff
    };

    enum Flags : uchar {
        IsLocalFile = 0x01
    };

    enum ErrorCode {
        // the high byte of the error code matches the Section
        // the first item in each value must be the generic "Invalid xxx Error"
        InvalidSchemeError = Scheme << 8,

        InvalidUserNameError = UserName << 8,

        InvalidPasswordError = Password << 8,

        InvalidRegNameError = Host << 8,
        InvalidIPv4AddressError,
        InvalidIPv6AddressError,
        InvalidCharacterInIPv6Error,
        InvalidIPvFutureError,
        HostMissingEndBracket,

        InvalidPortError = Port << 8,
        PortEmptyError,

        InvalidPathError = Path << 8,

        InvalidQueryError = Query << 8,

        InvalidFragmentError = Fragment << 8,

        // the following three cases are only possible in combination with
        // presence/absence of the path, authority and scheme. See validityError().
        AuthorityPresentAndPathIsRelative = Authority << 8 | Path << 8 | 0x10000,
        AuthorityAbsentAndPathIsDoubleSlash,
        RelativeUrlPathContainsColonBeforeSlash = Scheme << 8 | Authority << 8 | Path << 8 | 0x10000,

        NoError = 0
    };

    struct Error {
        QString source;
        ErrorCode code;
        int position;
    };

    QUrlPrivate();
    QUrlPrivate(const QUrlPrivate &copy);
    ~QUrlPrivate();

    void parse(const QString &url, QUrl::ParsingMode parsingMode);
    bool isEmpty() const
    { return sectionIsPresent == 0 && port == -1 && path.isEmpty(); }

    std::unique_ptr<Error> cloneError() const;
    void clearError();
    void setError(ErrorCode errorCode, const QString &source, int supplement = -1);
    ErrorCode validityError(QString *source = nullptr, int *position = nullptr) const;
    bool validateComponent(Section section, const QString &input, int begin, int end);
    bool validateComponent(Section section, const QString &input)
    { return validateComponent(section, input, 0, uint(input.length())); }

    // no QString scheme() const;
    void appendAuthority(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const;
    void appendUserInfo(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const;
    void appendUserName(QString &appendTo, QUrl::FormattingOptions options) const;
    void appendPassword(QString &appendTo, QUrl::FormattingOptions options) const;
    void appendHost(QString &appendTo, QUrl::FormattingOptions options) const;
    void appendPath(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const;
    void appendQuery(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const;
    void appendFragment(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const;

    // the "end" parameters are like STL iterators: they point to one past the last valid element
    bool setScheme(const QString &value, int len, bool doSetError);
    void setAuthority(const QString &auth, int from, int end, QUrl::ParsingMode mode);
    void setUserInfo(const QString &userInfo, int from, int end);
    void setUserName(const QString &value, int from, int end);
    void setPassword(const QString &value, int from, int end);
    bool setHost(const QString &value, int from, int end, QUrl::ParsingMode mode);
    void setPath(const QString &value, int from, int end);
    void setQuery(const QString &value, int from, int end);
    void setFragment(const QString &value, int from, int end);

    inline bool hasScheme() const { return sectionIsPresent & Scheme; }
    inline bool hasAuthority() const { return sectionIsPresent & Authority; }
    inline bool hasUserInfo() const { return sectionIsPresent & UserInfo; }
    inline bool hasUserName() const { return sectionIsPresent & UserName; }
    inline bool hasPassword() const { return sectionIsPresent & Password; }
    inline bool hasHost() const { return sectionIsPresent & Host; }
    inline bool hasPort() const { return port != -1; }
    inline bool hasPath() const { return !path.isEmpty(); }
    inline bool hasQuery() const { return sectionIsPresent & Query; }
    inline bool hasFragment() const { return sectionIsPresent & Fragment; }

    inline bool isLocalFile() const { return flags & IsLocalFile; }
    QString toLocalFile(QUrl::FormattingOptions options) const;

    QString mergePaths(const QString &relativePath) const;

    QAtomicInt ref;
    int port;

    QString scheme;
    QString userName;
    QString password;
    QString host;
    QString path;
    QString query;
    QString fragment;

    std::unique_ptr<Error> error;

    // not used for:
    //  - Port (port == -1 means absence)
    //  - Path (there's no path delimiter, so we optimize its use out of existence)
    // Schemes are never supposed to be empty, but we keep the flag anyway
    uchar sectionIsPresent;
    uchar flags;

    // 32-bit: 2 bytes tail padding available
    // 64-bit: 6 bytes tail padding available
};

inline QUrlPrivate::QUrlPrivate()
    : ref(1), port(-1),
      sectionIsPresent(0),
      flags(0)
{
}

inline QUrlPrivate::QUrlPrivate(const QUrlPrivate &copy)
    : ref(1), port(copy.port),
      scheme(copy.scheme),
      userName(copy.userName),
      password(copy.password),
      host(copy.host),
      path(copy.path),
      query(copy.query),
      fragment(copy.fragment),
      error(copy.cloneError()),
      sectionIsPresent(copy.sectionIsPresent),
      flags(copy.flags)
{
}

inline QUrlPrivate::~QUrlPrivate()
    = default;

std::unique_ptr<QUrlPrivate::Error> QUrlPrivate::cloneError() const
{
    return error ? qt_make_unique<Error>(*error) : nullptr;
}

inline void QUrlPrivate::clearError()
{
    error.reset();
}

inline void QUrlPrivate::setError(ErrorCode errorCode, const QString &source, int supplement)
{
    if (error) {
        // don't overwrite an error set in a previous section during parsing
        return;
    }
    error = qt_make_unique<Error>();
    error->code = errorCode;
    error->source = source;
    error->position = supplement;
}

// From RFC 3986, Appendix A Collected ABNF for URI
//    URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
//[...]
//    scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
//
//    authority     = [ userinfo "@" ] host [ ":" port ]
//    userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
//    host          = IP-literal / IPv4address / reg-name
//    port          = *DIGIT
//[...]
//    reg-name      = *( unreserved / pct-encoded / sub-delims )
//[..]
//    pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
//
//    query         = *( pchar / "/" / "?" )
//
//    fragment      = *( pchar / "/" / "?" )
//
//    pct-encoded   = "%" HEXDIG HEXDIG
//
//    unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
//    reserved      = gen-delims / sub-delims
//    gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
//    sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
//                  / "*" / "+" / "," / ";" / "="
// the path component has a complex ABNF that basically boils down to
// slash-separated segments of "pchar"

// The above is the strict definition of the URL components and we mostly
// adhere to it, with few exceptions. QUrl obeys the following behavior:
//  - percent-encoding sequences always use uppercase HEXDIG;
//  - unreserved characters are *always* decoded, no exceptions;
//  - the space character and bytes with the high bit set are controlled by
//    the EncodeSpaces and EncodeUnicode bits;
//  - control characters, the percent sign itself, and bytes with the high
//    bit set that don't form valid UTF-8 sequences are always encoded,
//    except in FullyDecoded mode;
//  - sub-delims are always left alone, except in FullyDecoded mode;
//  - gen-delims change behavior depending on which section of the URL (or
//    the entire URL) we're looking at; see below;
//  - characters not mentioned above, like "<", and ">", are usually
//    decoded in individual sections of the URL, but encoded when the full
//    URL is put together (we can change on subjective definition of
//    "pretty").
//
// The behavior for the delimiters bears some explanation. The spec says in
// section 2.2:
//     URIs that differ in the replacement of a reserved character with its
//     corresponding percent-encoded octet are not equivalent.
// (note: QUrl API mistakenly uses the "reserved" term, so we will refer to
// them here as "delimiters").
//
// For that reason, we cannot encode delimiters found in decoded form and we
// cannot decode the ones found in encoded form if that would change the
// interpretation. Conversely, we *can* perform the transformation if it would
// not change the interpretation. From the last component of a URL to the first,
// here are the gen-delims we can unambiguously transform when the field is
// taken in isolation:
//  - fragment: none, since it's the last
//  - query: "#" is unambiguous
//  - path: "#" and "?" are unambiguous
//  - host: completely special but never ambiguous, see setHost() below.
//  - password: the "#", "?", "/", "[", "]" and "@" characters are unambiguous
//  - username: the "#", "?", "/", "[", "]", "@", and ":" characters are unambiguous
//  - scheme: doesn't accept any delimiter, see setScheme() below.
//
// Internally, QUrl stores each component in the format that corresponds to the
// default mode (PrettyDecoded). It deviates from the "strict" FullyEncoded
// mode in the following way:
//  - spaces are decoded
//  - valid UTF-8 sequences are decoded
//  - gen-delims that can be unambiguously transformed are decoded
//  - characters controlled by DecodeReserved are often decoded, though this behavior
//    can change depending on the subjective definition of "pretty"
//
// Note that the list of gen-delims that we can transform is different for the
// user info (user name + password) and the authority (user info + host +
// port).


// list the recoding table modifications to be used with the recodeFromUser and
// appendToUser functions, according to the rules above. Spaces and UTF-8
// sequences are handled outside the tables.

// the encodedXXX tables are run with the delimiters set to "leave" by default;
// the decodedXXX tables are run with the delimiters set to "decode" by default
// (except for the query, which doesn't use these functions)

#define decode(x) ushort(x)
#define leave(x) ushort(0x100 | (x))
#define encode(x) ushort(0x200 | (x))

static const ushort userNameInIsolation[] = {
    decode(':'), // 0
    decode('@'), // 1
    decode(']'), // 2
    decode('['), // 3
    decode('/'), // 4
    decode('?'), // 5
    decode('#'), // 6

    decode('"'), // 7
    decode('<'),
    decode('>'),
    decode('^'),
    decode('\\'),
    decode('|'),
    decode('{'),
    decode('}'),
    0
};
static const ushort * const passwordInIsolation = userNameInIsolation + 1;
static const ushort * const pathInIsolation = userNameInIsolation + 5;
static const ushort * const queryInIsolation = userNameInIsolation + 6;
static const ushort * const fragmentInIsolation = userNameInIsolation + 7;

static const ushort userNameInUserInfo[] = {
    encode(':'), // 0
    decode('@'), // 1
    decode(']'), // 2
    decode('['), // 3
    decode('/'), // 4
    decode('?'), // 5
    decode('#'), // 6

    decode('"'), // 7
    decode('<'),
    decode('>'),
    decode('^'),
    decode('\\'),
    decode('|'),
    decode('{'),
    decode('}'),
    0
};
static const ushort * const passwordInUserInfo = userNameInUserInfo + 1;

static const ushort userNameInAuthority[] = {
    encode(':'), // 0
    encode('@'), // 1
    encode(']'), // 2
    encode('['), // 3
    decode('/'), // 4
    decode('?'), // 5
    decode('#'), // 6

    decode('"'), // 7
    decode('<'),
    decode('>'),
    decode('^'),
    decode('\\'),
    decode('|'),
    decode('{'),
    decode('}'),
    0
};
static const ushort * const passwordInAuthority = userNameInAuthority + 1;

static const ushort userNameInUrl[] = {
    encode(':'), // 0
    encode('@'), // 1
    encode(']'), // 2
    encode('['), // 3
    encode('/'), // 4
    encode('?'), // 5
    encode('#'), // 6

    // no need to list encode(x) for the other characters
    0
};
static const ushort * const passwordInUrl = userNameInUrl + 1;
static const ushort * const pathInUrl = userNameInUrl + 5;
static const ushort * const queryInUrl = userNameInUrl + 6;
static const ushort * const fragmentInUrl = userNameInUrl + 6;

static inline void parseDecodedComponent(QString &data)
815{
816 data.replace(c: QLatin1Char('%'), after: QLatin1String("%25"));
817}
818
819static inline QString
820recodeFromUser(const QString &input, const ushort *actions, int from, int to)
821{
822 QString output;
823 const QChar *begin = input.constData() + from;
824 const QChar *end = input.constData() + to;
825 if (qt_urlRecode(appendTo&: output, begin, end, encoding: {}, tableModifications: actions))
826 return output;
827
828 return input.mid(position: from, n: to - from);
829}
830
// appendXXXX functions: copy from the internal form to the external, user form.
// the internal value is stored in its PrettyDecoded form, so that case is easy.
static inline void appendToUser(QString &appendTo, const QStringRef &value, QUrl::FormattingOptions options,
                                const ushort *actions)
{
    // Test ComponentFormattingOptions, ignore FormattingOptions.
    if ((options & 0xFFFF0000) == QUrl::PrettyDecoded) {
        appendTo += value;
        return;
    }

    if (!qt_urlRecode(appendTo, value.data(), value.end(), options, actions))
        appendTo += value;
}

static inline void appendToUser(QString &appendTo, const QString &value, QUrl::FormattingOptions options,
                                const ushort *actions)
{
    appendToUser(appendTo, QStringRef(&value), options, actions);
}


inline void QUrlPrivate::appendAuthority(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    if ((options & QUrl::RemoveUserInfo) != QUrl::RemoveUserInfo) {
        appendUserInfo(appendTo, options, appendingTo);

        // add '@' only if we added anything
        if (hasUserName() || (hasPassword() && (options & QUrl::RemovePassword) == 0))
            appendTo += QLatin1Char('@');
    }
    appendHost(appendTo, options);
    if (!(options & QUrl::RemovePort) && port != -1)
        appendTo += QLatin1Char(':') + QString::number(port);
}

inline void QUrlPrivate::appendUserInfo(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    if (Q_LIKELY(!hasUserInfo()))
        return;

    const ushort *userNameActions;
    const ushort *passwordActions;
    if (options & QUrl::EncodeDelimiters) {
        userNameActions = userNameInUrl;
        passwordActions = passwordInUrl;
    } else {
        switch (appendingTo) {
        case UserInfo:
            userNameActions = userNameInUserInfo;
            passwordActions = passwordInUserInfo;
            break;

        case Authority:
            userNameActions = userNameInAuthority;
            passwordActions = passwordInAuthority;
            break;

        case FullUrl:
            userNameActions = userNameInUrl;
            passwordActions = passwordInUrl;
            break;

        default:
            // can't happen
            Q_UNREACHABLE();
            break;
        }
    }

    if (!qt_urlRecode(appendTo, userName.constData(), userName.constEnd(), options, userNameActions))
        appendTo += userName;
    if (options & QUrl::RemovePassword || !hasPassword()) {
        return;
    } else {
        appendTo += QLatin1Char(':');
        if (!qt_urlRecode(appendTo, password.constData(), password.constEnd(), options, passwordActions))
            appendTo += password;
    }
}

inline void QUrlPrivate::appendUserName(QString &appendTo, QUrl::FormattingOptions options) const
{
    // only called from QUrl::userName()
    appendToUser(appendTo, userName, options,
                 options & QUrl::EncodeDelimiters ? userNameInUrl : userNameInIsolation);
}

inline void QUrlPrivate::appendPassword(QString &appendTo, QUrl::FormattingOptions options) const
{
    // only called from QUrl::password()
    appendToUser(appendTo, password, options,
                 options & QUrl::EncodeDelimiters ? passwordInUrl : passwordInIsolation);
}

inline void QUrlPrivate::appendPath(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    QString thePath = path;
    if (options & QUrl::NormalizePathSegments) {
        thePath = qt_normalizePathSegments(path, isLocalFile() ? QDirPrivate::DefaultNormalization : QDirPrivate::RemotePath);
    }

    QStringRef thePathRef(&thePath);
    if (options & QUrl::RemoveFilename) {
        const int slash = path.lastIndexOf(QLatin1Char('/'));
        if (slash == -1)
            return;
        thePathRef = path.leftRef(slash + 1);
    }
    // check if we need to remove trailing slashes
    if (options & QUrl::StripTrailingSlash) {
        while (thePathRef.length() > 1 && thePathRef.endsWith(QLatin1Char('/')))
            thePathRef.chop(1);
    }

    appendToUser(appendTo, thePathRef, options,
                 appendingTo == FullUrl || options & QUrl::EncodeDelimiters ? pathInUrl : pathInIsolation);
}

inline void QUrlPrivate::appendFragment(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    appendToUser(appendTo, fragment, options,
                 options & QUrl::EncodeDelimiters ? fragmentInUrl :
                 appendingTo == FullUrl ? nullptr : fragmentInIsolation);
}

inline void QUrlPrivate::appendQuery(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    appendToUser(appendTo, query, options,
                 appendingTo == FullUrl || options & QUrl::EncodeDelimiters ? queryInUrl : queryInIsolation);
}

// setXXX functions

inline bool QUrlPrivate::setScheme(const QString &value, int len, bool doSetError)
{
    // schemes are strictly RFC-compliant:
    //    scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    // we also lowercase the scheme

    // schemes in URLs are not allowed to be empty, but they can be in
    // "Relative URIs" which QUrl also supports. QUrl::setScheme does
    // not call us with len == 0, so this can only be from parse()
    scheme.clear();
    if (len == 0)
        return false;

    sectionIsPresent |= Scheme;

    // validate it:
    int needsLowercasing = -1;
    const ushort *p = reinterpret_cast<const ushort *>(value.constData());
    for (int i = 0; i < len; ++i) {
        if (p[i] >= 'a' && p[i] <= 'z')
            continue;
        if (p[i] >= 'A' && p[i] <= 'Z') {
            needsLowercasing = i;
            continue;
        }
        if (i) {
            if (p[i] >= '0' && p[i] <= '9')
                continue;
            if (p[i] == '+' || p[i] == '-' || p[i] == '.')
                continue;
        }

        // found something else
        // don't call setError needlessly:
        // if we've been called from parse(), it will try to recover
        if (doSetError)
            setError(InvalidSchemeError, value, i);
        return false;
    }

    scheme = value.left(len);

    if (needsLowercasing != -1) {
        // schemes are ASCII only, so we don't need the full Unicode toLower
        QChar *schemeData = scheme.data(); // force detaching here
        for (int i = needsLowercasing; i >= 0; --i) {
            ushort c = schemeData[i].unicode();
            if (c >= 'A' && c <= 'Z')
                schemeData[i] = QChar(c + 0x20);
        }
    }

    // did we set to the file protocol?
    if (scheme == fileScheme()
#ifdef Q_OS_WIN
        || scheme == webDavScheme()
#endif
       ) {
        flags |= IsLocalFile;
    } else {
        flags &= ~IsLocalFile;
    }
    return true;
}

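/*
    The scheme rule checked by setScheme() can be tried out in isolation.
    The following is a minimal sketch using only the standard library; the
    helper name normalizeScheme is hypothetical and it does no error
    reporting, unlike the function above.

```cpp
#include <cassert>
#include <string>

// Validates input against
//    scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
// and, on success, stores the lowercased scheme in out.
bool normalizeScheme(const std::string &input, std::string &out)
{
    out.clear();
    if (input.empty())
        return false;

    for (std::string::size_type i = 0; i < input.size(); ++i) {
        const char c = input[i];
        const bool lower = c >= 'a' && c <= 'z';
        const bool upper = c >= 'A' && c <= 'Z';
        // digits and "+-." are only allowed after the first character
        const bool other = i > 0
                && ((c >= '0' && c <= '9') || c == '+' || c == '-' || c == '.');
        if (!lower && !upper && !other)
            return false;
    }

    out = input;
    for (char &c : out) {
        if (c >= 'A' && c <= 'Z')
            c = char(c + 0x20); // schemes are ASCII only
    }
    return true;
}
```

    For example, "HTTP" is accepted and stored as "http", while "1http"
    is rejected because the first character must be a letter.
*/
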
inline void QUrlPrivate::setAuthority(const QString &auth, int from, int end, QUrl::ParsingMode mode)
{
    sectionIsPresent &= ~Authority;
    sectionIsPresent |= Host;
    port = -1;

    // we never actually _loop_
    while (from != end) {
        int userInfoIndex = auth.indexOf(QLatin1Char('@'), from);
        if (uint(userInfoIndex) < uint(end)) {
            setUserInfo(auth, from, userInfoIndex);
            if (mode == QUrl::StrictMode && !validateComponent(UserInfo, auth, from, userInfoIndex))
                break;
            from = userInfoIndex + 1;
        }

        int colonIndex = auth.lastIndexOf(QLatin1Char(':'), end - 1);
        if (colonIndex < from)
            colonIndex = -1;

        if (uint(colonIndex) < uint(end)) {
            if (auth.at(from).unicode() == '[') {
                // check if colonIndex isn't inside the "[...]" part
                int closingBracket = auth.indexOf(QLatin1Char(']'), from);
                if (uint(closingBracket) > uint(colonIndex))
                    colonIndex = -1;
            }
        }

        if (uint(colonIndex) < uint(end) - 1) {
            // found a colon with digits after it
            unsigned long x = 0;
            for (int i = colonIndex + 1; i < end; ++i) {
                ushort c = auth.at(i).unicode();
                if (c >= '0' && c <= '9') {
                    x *= 10;
                    x += c - '0';
                } else {
                    x = ulong(-1); // x != ushort(x)
                    break;
                }
            }
            if (x == ushort(x)) {
                port = ushort(x);
            } else {
                setError(InvalidPortError, auth, colonIndex + 1);
                if (mode == QUrl::StrictMode)
                    break;
            }
        }

        setHost(auth, from, qMin<uint>(end, colonIndex), mode);
        if (mode == QUrl::StrictMode && !validateComponent(Host, auth, from, qMin<uint>(end, colonIndex))) {
            // clear host too
            sectionIsPresent &= ~Authority;
            break;
        }

        // success
        return;
    }
    // clear all sections but host
    sectionIsPresent &= ~Authority | Host;
    userName.clear();
    password.clear();
    host.clear();
    port = -1;
}

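/*
    The splitting order used by setAuthority() (user info up to the first
    '@', then the port after the last ':' unless that colon sits inside an
    IPv6 "[...]" literal) can be sketched with the standard library alone.
    This is an illustration under simplifying assumptions, not the Qt
    code: AuthorityParts and splitAuthority are hypothetical names, the
    host is not validated, and there is no strict/tolerant distinction.

```cpp
#include <cassert>
#include <string>

struct AuthorityParts {
    std::string userInfo;
    std::string host;
    int port = -1;  // -1 means "no port given"
    bool ok = true; // false if the port was malformed or out of range
};

AuthorityParts splitAuthority(const std::string &auth)
{
    AuthorityParts parts;
    std::string::size_type from = 0;

    // Everything before the first '@' is user info.
    const std::string::size_type at = auth.find('@');
    if (at != std::string::npos) {
        parts.userInfo = auth.substr(0, at);
        from = at + 1;
    }

    // The port separator is the last ':' after the user info...
    std::string::size_type colon = auth.rfind(':');
    if (colon != std::string::npos && colon < from)
        colon = std::string::npos;

    // ...unless that colon is inside an IPv6 "[...]" literal.
    if (colon != std::string::npos && from < auth.size() && auth[from] == '[') {
        const std::string::size_type closing = auth.find(']', from);
        if (closing == std::string::npos || closing > colon)
            colon = std::string::npos;
    }

    // Parse the digits after the colon, if there are any.
    if (colon != std::string::npos && colon + 1 < auth.size()) {
        unsigned long value = 0;
        for (std::string::size_type i = colon + 1; i < auth.size(); ++i) {
            const char c = auth[i];
            if (c < '0' || c > '9' || value > 65535) {
                parts.ok = false;
                break;
            }
            value = value * 10 + unsigned(c - '0');
        }
        if (parts.ok && value <= 65535)
            parts.port = int(value);
        else
            parts.ok = false;
    }

    const std::string::size_type hostEnd =
            colon == std::string::npos ? auth.size() : colon;
    parts.host = auth.substr(from, hostEnd - from);
    return parts;
}
```

    Note how "[::1]" keeps all of its colons in the host, while
    "[::1]:443" yields the host "[::1]" and the port 443.
*/
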
inline void QUrlPrivate::setUserInfo(const QString &userInfo, int from, int end)
{
    int delimIndex = userInfo.indexOf(QLatin1Char(':'), from);
    setUserName(userInfo, from, qMin<uint>(delimIndex, end));

    if (uint(delimIndex) >= uint(end)) {
        password.clear();
        sectionIsPresent &= ~Password;
    } else {
        setPassword(userInfo, delimIndex + 1, end);
    }
}

inline void QUrlPrivate::setUserName(const QString &value, int from, int end)
{
    sectionIsPresent |= UserName;
    userName = recodeFromUser(value, userNameInIsolation, from, end);
}

inline void QUrlPrivate::setPassword(const QString &value, int from, int end)
{
    sectionIsPresent |= Password;
    password = recodeFromUser(value, passwordInIsolation, from, end);
}

inline void QUrlPrivate::setPath(const QString &value, int from, int end)
{
    // sectionIsPresent |= Path; // not used, save some cycles
    path = recodeFromUser(value, pathInIsolation, from, end);
}

inline void QUrlPrivate::setFragment(const QString &value, int from, int end)
{
    sectionIsPresent |= Fragment;
    fragment = recodeFromUser(value, fragmentInIsolation, from, end);
}

inline void QUrlPrivate::setQuery(const QString &value, int from, int iend)
{
    sectionIsPresent |= Query;
    query = recodeFromUser(value, queryInIsolation, from, iend);
}

// Host handling
// The RFC says the host is:
//    host          = IP-literal / IPv4address / reg-name
//    IP-literal    = "[" ( IPv6address / IPvFuture  ) "]"
//    IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
//  [a strict definition of IPv6Address and IPv4Address]
//    reg-name      = *( unreserved / pct-encoded / sub-delims )
//
// We deviate from the standard in all but IPvFuture. For IPvFuture we accept
// and store only exactly what the RFC says we should. No percent-encoding is
// permitted in this field, so Unicode characters and space aren't either.
//
// For IPv4 addresses, we accept broken addresses like inet_aton does (that is,
// fewer than three dots). However, we correct the address to the proper form
// and store the corrected address. After correction, we comply with the RFC
// and it's exclusively composed of unreserved characters.
//
// For IPv6 addresses, we accept addresses including trailing (embedded) IPv4
// addresses, the so-called v4-compat and v4-mapped addresses. We also store
// those addresses like that in the hostname field, which violates the spec.
// IPv6 hosts are stored with the square brackets in the QString. It also
// requires no transformation in any way.
//
// As for registered names, it's the other way around: we accept only valid
// hostnames as specified by STD 3 and IDNA. That means everything we accept is
// valid in the RFC definition above, but there are many valid reg-names
// according to the RFC that we do not accept in the name of security. Since we
// do accept IDNA, reg-names are subject to ACE encoding and decoding, which is
// specified by the DecodeUnicode flag. The hostname is stored in its Unicode form.

inline void QUrlPrivate::appendHost(QString &appendTo, QUrl::FormattingOptions options) const
{
    if (host.isEmpty())
        return;
    if (host.at(0).unicode() == '[') {
        // IPv6 addresses might contain a zone-id which needs to be recoded
        if (options != 0)
            if (qt_urlRecode(appendTo, host.constBegin(), host.constEnd(), options, nullptr))
                return;
        appendTo += host;
    } else {
        // this is either an IPv4Address or a reg-name
        // if it is a reg-name, it is already stored in Unicode form
        if (options & QUrl::EncodeUnicode && !(options & 0x4000000))
            appendTo += qt_ACE_do(host, ToAceOnly, AllowLeadingDot);
        else
            appendTo += host;
    }
}

// the whole IPvFuture is passed and parsed here, including brackets;
// returns null if the parsing was successful, or the QChar of the first failure
static const QChar *parseIpFuture(QString &host, const QChar *begin, const QChar *end, QUrl::ParsingMode mode)
{
    //    IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
    static const char acceptable[] =
            "!$&'()*+,;=" // sub-delims
            ":"           // ":"
            "-._~";       // unreserved

    // the brackets and the "v" have been checked
    const QChar *const origBegin = begin;
    if (begin[3].unicode() != '.')
        return &begin[3];
    if ((begin[2].unicode() >= 'A' && begin[2].unicode() <= 'F') ||
            (begin[2].unicode() >= 'a' && begin[2].unicode() <= 'f') ||
            (begin[2].unicode() >= '0' && begin[2].unicode() <= '9')) {
        // this is so unlikely that we'll just go down the slow path
        // decode the whole string, skipping the "[vH." and "]" which we already know to be there
        host += QString::fromRawData(begin, 4);

        // uppercase the version, if necessary
        if (begin[2].unicode() >= 'a')
            host[host.length() - 2] = begin[2].unicode() - 0x20;

        begin += 4;
        --end;

        QString decoded;
        if (mode == QUrl::TolerantMode && qt_urlRecode(decoded, begin, end, QUrl::FullyDecoded, nullptr)) {
            begin = decoded.constBegin();
            end = decoded.constEnd();
        }

        for ( ; begin != end; ++begin) {
            if (begin->unicode() >= 'A' && begin->unicode() <= 'Z')
                host += *begin;
            else if (begin->unicode() >= 'a' && begin->unicode() <= 'z')
                host += *begin;
            else if (begin->unicode() >= '0' && begin->unicode() <= '9')
                host += *begin;
            else if (begin->unicode() < 0x80 && strchr(acceptable, begin->unicode()) != nullptr)
                host += *begin;
            else
                return decoded.isEmpty() ? begin : &origBegin[2];
        }
        host += QLatin1Char(']');
        return nullptr;
    }
    return &origBegin[2];
}

// ONLY the IPv6 address is parsed here, WITHOUT the brackets
static const QChar *parseIp6(QString &host, const QChar *begin, const QChar *end, QUrl::ParsingMode mode)
{
    // ### Update to use QStringView once QStringView::indexOf and QStringView::lastIndexOf exist
    QString decoded;
    if (mode == QUrl::TolerantMode) {
        // this struct is kept in automatic storage because it's only 4 bytes
        const ushort decodeColon[] = { decode(':'), 0 };
        if (qt_urlRecode(decoded, begin, end, QUrl::ComponentFormattingOption::PrettyDecoded, decodeColon) == 0)
            decoded = QString(begin, end-begin);
    } else {
        decoded = QString(begin, end-begin);
    }

    const QLatin1String zoneIdIdentifier("%25");
    QIPAddressUtils::IPv6Address address;
    QString zoneId;

    const QChar *endBeforeZoneId = decoded.constEnd();

    int zoneIdPosition = decoded.indexOf(zoneIdIdentifier);
    if ((zoneIdPosition != -1) && (decoded.lastIndexOf(zoneIdIdentifier) == zoneIdPosition)) {
        zoneId = decoded.mid(zoneIdPosition + zoneIdIdentifier.size());
        endBeforeZoneId = decoded.constBegin() + zoneIdPosition;

        // was there anything after the zone ID separator?
        if (zoneId.isEmpty())
            return end;
    }

    // did the address become empty after removing the zone ID?
    // (it might have always been empty)
    if (decoded.constBegin() == endBeforeZoneId)
        return end;

    const QChar *ret = QIPAddressUtils::parseIp6(address, decoded.constBegin(), endBeforeZoneId);
    if (ret)
        return begin + (ret - decoded.constBegin());

    host.reserve(host.size() + (decoded.constEnd() - decoded.constBegin()));
    host += QLatin1Char('[');
    QIPAddressUtils::toString(host, address);

    if (!zoneId.isEmpty()) {
        host += zoneIdIdentifier;
        host += zoneId;
    }
    host += QLatin1Char(']');
    return nullptr;
}

inline bool QUrlPrivate::setHost(const QString &value, int from, int iend, QUrl::ParsingMode mode)
{
    const QChar *begin = value.constData() + from;
    const QChar *end = value.constData() + iend;

    const int len = end - begin;
    host.clear();
    sectionIsPresent |= Host;
    if (len == 0)
        return true;

    if (begin[0].unicode() == '[') {
        // IPv6Address or IPvFuture
        // smallest IPv6 address is      "[::]"   (len = 4)
        // smallest IPvFuture address is "[v7.X]" (len = 6)
        if (end[-1].unicode() != ']') {
            setError(HostMissingEndBracket, value);
            return false;
        }

        if (len > 5 && begin[1].unicode() == 'v') {
            const QChar *c = parseIpFuture(host, begin, end, mode);
            if (c)
                setError(InvalidIPvFutureError, value, c - value.constData());
            return !c;
        } else if (begin[1].unicode() == 'v') {
            setError(InvalidIPvFutureError, value, from);
        }

        const QChar *c = parseIp6(host, begin + 1, end - 1, mode);
        if (!c)
            return true;

        if (c == end - 1)
            setError(InvalidIPv6AddressError, value, from);
        else
            setError(InvalidCharacterInIPv6Error, value, c - value.constData());
        return false;
    }

    // check if it's an IPv4 address
    QIPAddressUtils::IPv4Address ip4;
    if (QIPAddressUtils::parseIp4(ip4, begin, end)) {
        // yes, it was
        QIPAddressUtils::toString(host, ip4);
        return true;
    }

    // This is probably a reg-name.
    // But it can also be an encoded string that, when decoded becomes one
    // of the types above.
    //
    // Two types of encoding are possible:
    //  percent encoding (e.g., "%31%30%2E%30%2E%30%2E%31" -> "10.0.0.1")
    //  Unicode encoding (some non-ASCII characters case-fold to digits
    //                    when nameprepping is done)
    //
    // The qt_ACE_do function below applies nameprepping and the STD3 check.
    // That means a Unicode string may become an IPv4 address, but it cannot
    // produce a '[' or a '%'.

    // check for percent-encoding first
    QString s;
    if (mode == QUrl::TolerantMode && qt_urlRecode(s, begin, end, { }, nullptr)) {
        // something was decoded
        // anything encoded left?
        int pos = s.indexOf(QChar(0x25)); // '%'
        if (pos != -1) {
            setError(InvalidRegNameError, s, pos);
            return false;
        }

        // recurse
        return setHost(s, 0, s.length(), QUrl::StrictMode);
    }

    s = qt_ACE_do(QString::fromRawData(begin, len), NormalizeAce, ForbidLeadingDot);
    if (s.isEmpty()) {
        setError(InvalidRegNameError, value);
        return false;
    }

    // check IPv4 again
    if (QIPAddressUtils::parseIp4(ip4, s.constBegin(), s.constEnd())) {
        QIPAddressUtils::toString(host, ip4);
    } else {
        host = s;
    }
    return true;
}

inline void QUrlPrivate::parse(const QString &url, QUrl::ParsingMode parsingMode)
{
    //   URI-reference = URI / relative-ref
    //   URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
    //   relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
    //   hier-part     = "//" authority path-abempty
    //                 / other path types
    //   relative-part = "//" authority path-abempty
    //                 /  other path types here

    sectionIsPresent = 0;
    flags = 0;
    clearError();

    // find the important delimiters
    int colon = -1;
    int question = -1;
    int hash = -1;
    const int len = url.length();
    const QChar *const begin = url.constData();
    const ushort *const data = reinterpret_cast<const ushort *>(begin);

    for (int i = 0; i < len; ++i) {
        uint uc = data[i];
        if (uc == '#' && hash == -1) {
            hash = i;

            // nothing more to be found
            break;
        }

        if (question == -1) {
            if (uc == ':' && colon == -1)
                colon = i;
            else if (uc == '?')
                question = i;
        }
    }

    // check if we have a scheme
    int hierStart;
    if (colon != -1 && setScheme(url, colon, /* don't set error */ false)) {
        hierStart = colon + 1;
    } else {
        // recover from a failed scheme: it might not have been a scheme at all
        scheme.clear();
        sectionIsPresent = 0;
        hierStart = 0;
    }

    int pathStart;
    int hierEnd = qMin<uint>(qMin<uint>(question, hash), len);
    if (hierEnd - hierStart >= 2 && data[hierStart] == '/' && data[hierStart + 1] == '/') {
        // we have an authority, it ends at the first slash after these
        int authorityEnd = hierEnd;
        for (int i = hierStart + 2; i < authorityEnd ; ++i) {
            if (data[i] == '/') {
                authorityEnd = i;
                break;
            }
        }

        setAuthority(url, hierStart + 2, authorityEnd, parsingMode);

        // even if we failed to set the authority properly, let's try to recover
        pathStart = authorityEnd;
        setPath(url, pathStart, hierEnd);
    } else {
        userName.clear();
        password.clear();
        host.clear();
        port = -1;
        pathStart = hierStart;

        if (hierStart < hierEnd)
            setPath(url, hierStart, hierEnd);
        else
            path.clear();
    }

    if (uint(question) < uint(hash))
        setQuery(url, question + 1, qMin<uint>(hash, len));

    if (hash != -1)
        setFragment(url, hash + 1, len);

    if (error || parsingMode == QUrl::TolerantMode)
        return;

    // The parsing so far was partially tolerant of errors, except for the
    // scheme parser (which is always strict) and the authority (which was
    // executed in strict mode).
    // If we haven't found any errors so far, continue the strict-mode parsing
    // from the path component onwards.

    if (!validateComponent(Path, url, pathStart, hierEnd))
        return;
    if (uint(question) < uint(hash) && !validateComponent(Query, url, question + 1, qMin<uint>(hash, len)))
        return;
    if (hash != -1)
        validateComponent(Fragment, url, hash + 1, len);
}

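/*
    The delimiter scan at the top of parse() relies on two small tricks:
    a ':' or '?' only counts if it appears before the first '?' or '#',
    and "not found" (-1) becomes the largest value when cast to unsigned,
    so a plain minimum picks the first delimiter that is present. The
    following is a minimal standalone sketch of both; Delimiters,
    findDelimiters and hierarchyEnd are hypothetical names used only here.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

struct Delimiters {
    int colon = -1;    // first ':' that precedes any '?' or '#'
    int question = -1; // first '?' that precedes any '#'
    int hash = -1;     // first '#'
};

Delimiters findDelimiters(const std::string &url)
{
    Delimiters d;
    for (int i = 0; i < int(url.size()); ++i) {
        const char c = url[i];
        if (c == '#') {
            d.hash = i;
            break; // everything after '#' belongs to the fragment
        }
        if (d.question == -1) {
            if (c == ':' && d.colon == -1)
                d.colon = i;
            else if (c == '?')
                d.question = i;
        }
    }
    return d;
}

// End of the hierarchical part: the first of '?', '#' or end of string.
// unsigned(-1) is the maximum value, so absent delimiters never win.
int hierarchyEnd(const Delimiters &d, int len)
{
    return int(std::min({unsigned(d.question), unsigned(d.hash), unsigned(len)}));
}
```

    A ':' inside the query, as in "http://h/p?a:b", is therefore never
    mistaken for the scheme separator once an earlier colon or a '?' has
    been seen.
*/
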
QString QUrlPrivate::toLocalFile(QUrl::FormattingOptions options) const
{
    QString tmp;
    QString ourPath;
    appendPath(ourPath, options, QUrlPrivate::Path);

    // magic for shared drive on windows
    if (!host.isEmpty()) {
        tmp = QLatin1String("//") + host;
#ifdef Q_OS_WIN // QTBUG-42346, WebDAV is visible as local file on Windows only.
        if (scheme == webDavScheme())
            tmp += webDavSslTag();
#endif
        if (!ourPath.isEmpty() && !ourPath.startsWith(QLatin1Char('/')))
            tmp += QLatin1Char('/');
        tmp += ourPath;
    } else {
        tmp = ourPath;
#ifdef Q_OS_WIN
        // magic for drives on windows
        if (ourPath.length() > 2 && ourPath.at(0) == QLatin1Char('/') && ourPath.at(2) == QLatin1Char(':'))
            tmp.remove(0, 1);
#endif
    }
    return tmp;
}

/*
    From http://www.ietf.org/rfc/rfc3986.txt, 5.2.3: Merge paths

    Returns a merge of the current path with the relative path passed
    as argument.

    Note: \a relativePath is relative (does not start with '/').
*/
inline QString QUrlPrivate::mergePaths(const QString &relativePath) const
{
    // If the base URI has a defined authority component and an empty
    // path, then return a string consisting of "/" concatenated with
    // the reference's path; otherwise,
    if (!host.isEmpty() && path.isEmpty())
        return QLatin1Char('/') + relativePath;

    // Return a string consisting of the reference's path component
    // appended to all but the last segment of the base URI's path
    // (i.e., excluding any characters after the right-most "/" in the
    // base URI path, or excluding the entire base URI path if it does
    // not contain any "/" characters).
    QString newPath;
    if (!path.contains(QLatin1Char('/')))
        newPath = relativePath;
    else
        newPath = path.leftRef(path.lastIndexOf(QLatin1Char('/')) + 1) + relativePath;

    return newPath;
}

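/*
    The merge rule of RFC 3986, section 5.2.3 is small enough to try
    outside of QUrlPrivate. The following is a minimal sketch with
    std::string; mergePathsSketch is a hypothetical free function, and the
    baseHasAuthority flag stands in for the "!host.isEmpty()" test used
    above.

```cpp
#include <cassert>
#include <string>

std::string mergePathsSketch(bool baseHasAuthority, const std::string &basePath,
                             const std::string &relativePath)
{
    // Base has an authority and an empty path: prefix the reference with "/".
    if (baseHasAuthority && basePath.empty())
        return "/" + relativePath;

    // Otherwise keep everything up to and including the right-most "/" of
    // the base path, or nothing at all if the base path has no "/".
    const std::string::size_type slash = basePath.rfind('/');
    if (slash == std::string::npos)
        return relativePath;
    return basePath.substr(0, slash + 1) + relativePath;
}
```

    For the base path "/b/c/d;p" and the reference "g" this gives
    "/b/c/g", matching the example base URI used in the RFC.
*/
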
/*
    From http://www.ietf.org/rfc/rfc3986.txt, 5.2.4: Remove dot segments

    Removes unnecessary ../ and ./ from the path. Used for normalizing
    the URL.
*/
static void removeDotsFromPath(QString *path)
{
    // The input buffer is initialized with the now-appended path
    // components and the output buffer is initialized to the empty
    // string.
    QChar *out = path->data();
    const QChar *in = out;
    const QChar *end = out + path->size();

    // If the input buffer consists only of
    // "." or "..", then remove that from the input
    // buffer;
    if (path->size() == 1 && in[0].unicode() == '.')
        ++in;
    else if (path->size() == 2 && in[0].unicode() == '.' && in[1].unicode() == '.')
        in += 2;
    // While the input buffer is not empty, loop:
    while (in < end) {

        // otherwise, if the input buffer begins with a prefix of "../" or "./",
        // then remove that prefix from the input buffer;
        if (path->size() >= 2 && in[0].unicode() == '.' && in[1].unicode() == '/')
            in += 2;
        else if (path->size() >= 3 && in[0].unicode() == '.'
                 && in[1].unicode() == '.' && in[2].unicode() == '/')
            in += 3;

        // otherwise, if the input buffer begins with a prefix of
        // "/./" or "/.", where "." is a complete path segment,
        // then replace that prefix with "/" in the input buffer;
        if (in <= end - 3 && in[0].unicode() == '/' && in[1].unicode() == '.'
                && in[2].unicode() == '/') {
            in += 2;
            continue;
        } else if (in == end - 2 && in[0].unicode() == '/' && in[1].unicode() == '.') {
            *out++ = QLatin1Char('/');
            in += 2;
            break;
        }

        // otherwise, if the input buffer begins with a prefix
        // of "/../" or "/..", where ".." is a complete path
        // segment, then replace that prefix with "/" in the
        // input buffer and remove the last segment and its
        // preceding "/" (if any) from the output buffer;
        if (in <= end - 4 && in[0].unicode() == '/' && in[1].unicode() == '.'
                && in[2].unicode() == '.' && in[3].unicode() == '/') {
            while (out > path->constData() && (--out)->unicode() != '/')
                ;
            if (out == path->constData() && out->unicode() != '/')
                ++in;
            in += 3;
            continue;
        } else if (in == end - 3 && in[0].unicode() == '/' && in[1].unicode() == '.'
                   && in[2].unicode() == '.') {
            while (out > path->constData() && (--out)->unicode() != '/')
                ;
            if (out->unicode() == '/')
                ++out;
            in += 3;
            break;
        }

        // otherwise move the first path segment in
        // the input buffer to the end of the output
        // buffer, including the initial "/" character
        // (if any) and any subsequent characters up
        // to, but not including, the next "/"
        // character or the end of the input buffer.
        *out++ = *in++;
        while (in < end && in->unicode() != '/')
            *out++ = *in++;
    }
    path->truncate(out - path->constData());
}

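/*
    For comparison, the algorithm of RFC 3986, section 5.2.4 can also be
    written with separate input and output buffers, exactly as the RFC
    text describes it. The following is a minimal sketch with std::string;
    it is not the in-place, two-pointer variant used above, and the names
    removeDotSegments and popLastSegment are hypothetical.

```cpp
#include <cassert>
#include <string>

// Removes the last segment and its preceding "/" (if any) from out.
static void popLastSegment(std::string &out)
{
    const std::string::size_type slash = out.rfind('/');
    out.erase(slash == std::string::npos ? 0 : slash);
}

std::string removeDotSegments(std::string in)
{
    std::string out;
    while (!in.empty()) {
        if (in.compare(0, 3, "../") == 0) {
            in.erase(0, 3);                 // drop a leading "../"
        } else if (in.compare(0, 2, "./") == 0) {
            in.erase(0, 2);                 // drop a leading "./"
        } else if (in.compare(0, 3, "/./") == 0) {
            in.erase(0, 2);                 // "/./" becomes "/"
        } else if (in == "/.") {
            in = "/";
        } else if (in.compare(0, 4, "/../") == 0) {
            in.erase(0, 3);                 // "/../" becomes "/"
            popLastSegment(out);
        } else if (in == "/..") {
            in = "/";
            popLastSegment(out);
        } else if (in == "." || in == "..") {
            in.clear();
        } else {
            // Move the first segment, including its leading "/" (if any),
            // up to but not including the next "/".
            const std::string::size_type next = in.find('/', 1);
            out.append(in, 0, next);
            in.erase(0, next);
        }
    }
    return out;
}
```

    The two examples from the RFC, "/a/b/c/./../../g" and
    "mid/content=5/../6", come out as "/a/g" and "mid/6".
*/
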
inline QUrlPrivate::ErrorCode QUrlPrivate::validityError(QString *source, int *position) const
{
    Q_ASSERT(!source == !position);
    if (error) {
        if (source) {
            *source = error->source;
            *position = error->position;
        }
        return error->code;
    }

    // There are three more cases of invalid URLs that QUrl recognizes and they
    // are only possible with constructed URLs (setXXX methods), not with
    // parsing. Therefore, they are tested here.
    //
    // Two cases are a non-empty path that doesn't start with a slash and:
    //  - with an authority
    //  - without an authority, without scheme but the path with a colon before
    //    the first slash
    // The third case is an empty authority and a non-empty path that starts
    // with "//".
    // Those cases are considered invalid because toString() would produce a URL
    // that wouldn't be parsed back to the same QUrl.

    if (path.isEmpty())
        return NoError;
    if (path.at(0) == QLatin1Char('/')) {
        if (hasAuthority() || path.length() == 1 || path.at(1) != QLatin1Char('/'))
            return NoError;
        if (source) {
            *source = path;
            *position = 0;
        }
        return AuthorityAbsentAndPathIsDoubleSlash;
    }

    if (sectionIsPresent & QUrlPrivate::Host) {
        if (source) {
            *source = path;
            *position = 0;
        }
        return AuthorityPresentAndPathIsRelative;
    }
    if (sectionIsPresent & QUrlPrivate::Scheme)
        return NoError;

    // check for a path of "text:text/"
    for (int i = 0; i < path.length(); ++i) {
        ushort c = path.at(i).unicode();
        if (c == '/') {
            // found the slash before the colon
            return NoError;
        }
        if (c == ':') {
            // found the colon before the slash, it's invalid
            if (source) {
                *source = path;
                *position = i;
            }
            return RelativeUrlPathContainsColonBeforeSlash;
        }
    }
    return NoError;
}

bool QUrlPrivate::validateComponent(QUrlPrivate::Section section, const QString &input,
                                    int begin, int end)
{
    // What we need to look out for, that the regular parser tolerates:
    //  - percent signs not followed by two hex digits
    //  - forbidden characters, which should always appear encoded
    //    '"' / '<' / '>' / '\' / '^' / '`' / '{' / '|' / '}' / DEL (0x7F)
    //    control characters
    //  - delimiters not allowed in certain positions
    //    . scheme: parser is already strict
    //    . user info: gen-delims except ":" disallowed ("/" / "?" / "#" / "[" / "]" / "@")
    //    . host: parser is stricter than the standard
    //    . port: parser is stricter than the standard
    //    . path: all delimiters allowed
    //    . fragment: all delimiters allowed
    //    . query: all delimiters allowed
    static const char forbidden[] = "\"<>\\^`{|}\x7F";
    static const char forbiddenUserInfo[] = ":/?#[]@";

    Q_ASSERT(section != Authority && section != Hierarchy && section != FullUrl);

    const ushort *const data = reinterpret_cast<const ushort *>(input.constData());
    for (uint i = uint(begin); i < uint(end); ++i) {
        uint uc = data[i];
        if (uc >= 0x80)
            continue;

        bool error = false;
        if ((uc == '%' && (uint(end) < i + 2 || !isHex(data[i + 1]) || !isHex(data[i + 2])))
                || uc <= 0x20 || strchr(forbidden, uc)) {
            // found an error
            error = true;
        } else if (section & UserInfo) {
            if (section == UserInfo && strchr(forbiddenUserInfo + 1, uc))
                error = true;
            else if (section != UserInfo && strchr(forbiddenUserInfo, uc))
                error = true;
        }

        if (!error)
            continue;

        ErrorCode errorCode = ErrorCode(int(section) << 8);
        if (section == UserInfo) {
            // is it the user name or the password?
            errorCode = InvalidUserNameError;
            for (uint j = uint(begin); j < i; ++j)
                if (data[j] == ':') {
                    errorCode = InvalidPasswordError;
                    break;
                }
        }

        setError(errorCode, input, i);
        return false;
    }

    // no errors
    return true;
}
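
/*
    Illustration only, not part of QUrl. A minimal standalone sketch of the
    first two checks listed in validateComponent() (stray percent signs and
    forbidden characters), written with std::string so that it builds without
    Qt. The function name firstStrictModeViolation is hypothetical, and the
    per-section delimiter rules for user info are left out.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Returns the index of the first character that a strict check would reject,
// or -1 if the whole input passes.
int firstStrictModeViolation(const std::string &input)
{
    static const char forbidden[] = "\"<>\\^`{|}\x7F";
    for (std::size_t i = 0; i < input.size(); ++i) {
        const unsigned char uc = static_cast<unsigned char>(input[i]);
        if (uc >= 0x80)
            continue; // non-ASCII characters are not examined here
        if (uc <= 0x20 || std::strchr(forbidden, uc))
            return static_cast<int>(i); // control, space or forbidden character
        if (uc == '%') {
            const bool twoHex = i + 2 < input.size()
                    && std::isxdigit(static_cast<unsigned char>(input[i + 1]))
                    && std::isxdigit(static_cast<unsigned char>(input[i + 2]));
            if (!twoHex)
                return static_cast<int>(i); // stray percent sign
        }
    }
    return -1;
}
```

    The range test runs before strchr() on purpose: strchr() also matches the
    terminating NUL of the table, so a NUL character has to be caught by the
    "uc <= 0x20" test first, as in the function above.
*/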
1753
1754#if 0
1755inline void QUrlPrivate::validate() const
1756{
1757 QUrlPrivate *that = (QUrlPrivate *)this;
1758 that->encodedOriginal = that->toEncoded(); // may detach
1759 parse(ParseOnly);
1760
1761 QURL_SETFLAG(that->stateFlags, Validated);
1762
1763 if (!isValid)
1764 return;
1765
1766 QString auth = authority(); // causes the non-encoded forms to be valid
1767
1768 // authority() calls canonicalHost() which sets this
1769 if (!isHostValid)
1770 return;
1771
1772 if (scheme == QLatin1String("mailto")) {
1773 if (!host.isEmpty() || port != -1 || !userName.isEmpty() || !password.isEmpty()) {
1774 that->isValid = false;
1775 that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "expected empty host, username,"
1776 "port and password"),
1777 0, 0);
1778 }
1779 } else if (scheme == ftpScheme() || scheme == httpScheme()) {
1780 if (host.isEmpty() && !(path.isEmpty() && encodedPath.isEmpty())) {
1781 that->isValid = false;
1782 that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "the host is empty, but not the path"),
1783 0, 0);
1784 }
1785 }
1786}
1787#endif
1788
1789/*!
1790 \macro QT_NO_URL_CAST_FROM_STRING
1791 \relates QUrl
1792
1793 Disables automatic conversions from QString (or char *) to QUrl.
1794
1795 Compiling your code with this define is useful when you have a lot of
1796 code that uses QString for file names and you wish to convert it to
1797 use QUrl for network transparency. In any code that uses QUrl, it can
1798 help avoid missing QUrl::resolved() calls, and other misuses of
1799 QString to QUrl conversions.
1800
1801 \oldcode
1802 url = filename; // probably not what you want
1803 \newcode
1804 url = QUrl::fromLocalFile(filename);
1805 url = baseurl.resolved(QUrl(filename));
1806 \endcode
1807
1808 \sa QT_NO_CAST_FROM_ASCII
1809*/
1810
1811
1812/*!
    Constructs a URL by parsing \a url. QUrl will automatically percent encode
    all characters that are not allowed in a URL and decode the percent-encoded
    sequences that represent an unreserved character (letters, digits, hyphens,
    underscores, dots and tildes). All other characters are left in their
    original forms.
1818
1819 Parses the \a url using the parser mode \a parsingMode. In TolerantMode
1820 (the default), QUrl will correct certain mistakes, notably the presence of
1821 a percent character ('%') not followed by two hexadecimal digits, and it
1822 will accept any character in any position. In StrictMode, encoding mistakes
1823 will not be tolerated and QUrl will also check that certain forbidden
1824 characters are not present in unencoded form. If an error is detected in
1825 StrictMode, isValid() will return false. The parsing mode DecodedMode is not
1826 permitted in this context.
1827
1828 Example:
1829
1830 \snippet code/src_corelib_io_qurl.cpp 0
1831
1832 To construct a URL from an encoded string, you can also use fromEncoded():
1833
1834 \snippet code/src_corelib_io_qurl.cpp 1
1835
    Both functions are equivalent and, in Qt 5, both functions accept encoded
    data. Usually, the choice of the QUrl constructor or setUrl() versus
    fromEncoded() will depend on the source data: the constructor and setUrl()
    take a QString, whereas fromEncoded() takes a QByteArray.
1840
1841 \sa setUrl(), fromEncoded(), TolerantMode
1842*/
QUrl::QUrl(const QString &url, ParsingMode parsingMode) : d(nullptr)
{
    setUrl(url, parsingMode);
}
1847
1848/*!
1849 Constructs an empty QUrl object.
1850*/
1851QUrl::QUrl() : d(nullptr)
1852{
1853}
1854
1855/*!
1856 Constructs a copy of \a other.
1857*/
1858QUrl::QUrl(const QUrl &other) : d(other.d)
1859{
1860 if (d)
1861 d->ref.ref();
1862}
1863
1864/*!
1865 Destructor; called immediately before the object is deleted.
1866*/
1867QUrl::~QUrl()
1868{
1869 if (d && !d->ref.deref())
1870 delete d;
1871}
1872
1873/*!
1874 Returns \c true if the URL is non-empty and valid; otherwise returns \c false.
1875
1876 The URL is run through a conformance test. Every part of the URL
1877 must conform to the standard encoding rules of the URI standard
1878 for the URL to be reported as valid.
1879
1880 \snippet code/src_corelib_io_qurl.cpp 2
1881*/
1882bool QUrl::isValid() const
1883{
1884 if (isEmpty()) {
1885 // also catches d == nullptr
1886 return false;
1887 }
1888 return d->validityError() == QUrlPrivate::NoError;
1889}
1890
1891/*!
1892 Returns \c true if the URL has no data; otherwise returns \c false.
1893
1894 \sa clear()
1895*/
1896bool QUrl::isEmpty() const
1897{
1898 if (!d) return true;
1899 return d->isEmpty();
1900}
1901
1902/*!
1903 Resets the content of the QUrl. After calling this function, the
1904 QUrl is equal to one that has been constructed with the default
1905 empty constructor.
1906
1907 \sa isEmpty()
1908*/
1909void QUrl::clear()
1910{
1911 if (d && !d->ref.deref())
1912 delete d;
1913 d = nullptr;
1914}
1915
1916/*!
    Parses \a url and sets this object to that value. QUrl will automatically
    percent encode all characters that are not allowed in a URL and decode the
    percent-encoded sequences that represent an unreserved character (letters,
    digits, hyphens, underscores, dots and tildes). All other characters are
    left in their original forms.
1922
1923 Parses the \a url using the parser mode \a parsingMode. In TolerantMode
1924 (the default), QUrl will correct certain mistakes, notably the presence of
1925 a percent character ('%') not followed by two hexadecimal digits, and it
1926 will accept any character in any position. In StrictMode, encoding mistakes
1927 will not be tolerated and QUrl will also check that certain forbidden
1928 characters are not present in unencoded form. If an error is detected in
1929 StrictMode, isValid() will return false. The parsing mode DecodedMode is
1930 not permitted in this context and will produce a run-time warning.
1931
1932 \sa url(), toString()
1933*/
void QUrl::setUrl(const QString &url, ParsingMode parsingMode)
{
    if (parsingMode == DecodedMode) {
        qWarning("QUrl: QUrl::DecodedMode is not permitted when parsing a full URL");
    } else {
        detach();
        d->parse(url, parsingMode);
    }
}
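
/*
    Illustration only, not part of QUrl. The documentation above says that
    TolerantMode corrects a '%' that is not followed by two hexadecimal
    digits. A minimal standalone sketch of one such correction follows,
    assuming the fix is to encode the stray '%' itself as "%25". The function
    name fixStrayPercents is hypothetical and std::string is used so that the
    sketch builds without Qt.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

std::string fixStrayPercents(const std::string &input)
{
    std::string out;
    out.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        out += input[i];
        if (input[i] != '%')
            continue;
        const bool twoHex = i + 2 < input.size()
                && std::isxdigit(static_cast<unsigned char>(input[i + 1]))
                && std::isxdigit(static_cast<unsigned char>(input[i + 2]));
        if (!twoHex)
            out += "25"; // the stray "%" becomes "%25"
    }
    return out;
}
```

    Well-formed sequences such as "%20" pass through unchanged, so applying
    the function twice gives the same result as applying it once.
*/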
1943
1944/*!
1945 \fn void QUrl::setEncodedUrl(const QByteArray &encodedUrl, ParsingMode parsingMode)
1946 \deprecated
1947 Constructs a URL by parsing the contents of \a encodedUrl.
1948
1949 \a encodedUrl is assumed to be a URL string in percent encoded
1950 form, containing only ASCII characters.
1951
1952 The parsing mode \a parsingMode is used for parsing \a encodedUrl.
1953
1954 \obsolete Use setUrl(QString::fromUtf8(encodedUrl), parsingMode)
1955
1956 \sa setUrl()
1957*/
1958
1959/*!
1960 Sets the scheme of the URL to \a scheme. As a scheme can only
1961 contain ASCII characters, no conversion or decoding is done on the
1962 input. It must also start with an ASCII letter.
1963
    The scheme describes the type (or protocol) of the URL. It's
    represented by one or more ASCII characters at the start of the URL.
1966
1967 A scheme is strictly \l {http://www.ietf.org/rfc/rfc3986.txt} {RFC 3986}-compliant:
1968 \tt {scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )}
1969
1970 The following example shows a URL where the scheme is "ftp":
1971
1972 \image qurl-authority2.png
1973
1974 To set the scheme, the following call is used:
1975 \snippet code/src_corelib_io_qurl.cpp 11
1976
1977 The scheme can also be empty, in which case the URL is interpreted
1978 as relative.
1979
1980 \sa scheme(), isRelative()
1981*/
void QUrl::setScheme(const QString &scheme)
{
    detach();
    d->clearError();
    if (scheme.isEmpty()) {
        // schemes are not allowed to be empty
        d->sectionIsPresent &= ~QUrlPrivate::Scheme;
        d->flags &= ~QUrlPrivate::IsLocalFile;
        d->scheme.clear();
    } else {
        d->setScheme(scheme, scheme.length(), /* do set error */ true);
    }
}
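
/*
    Illustration only, not part of QUrl. A minimal standalone sketch of the
    RFC 3986 scheme grammar quoted in the setScheme() documentation,
    scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), together with the
    lowercasing that scheme() documents. The function names are hypothetical
    and std::string is used so that the sketch builds without Qt.

```cpp
#include <string>

bool isAsciiAlpha(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
bool isAsciiDigit(char c) { return c >= '0' && c <= '9'; }

// True if s matches the grammar. An empty string does not match; QUrl
// treats an empty scheme as "no scheme" rather than as an error.
bool isValidScheme(const std::string &s)
{
    if (s.empty() || !isAsciiAlpha(s[0]))
        return false;
    for (char c : s) {
        if (!(isAsciiAlpha(c) || isAsciiDigit(c) || c == '+' || c == '-' || c == '.'))
            return false;
    }
    return true;
}

// Lowercases ASCII letters only, leaving every other character untouched.
std::string normalizedScheme(std::string s)
{
    for (char &c : s) {
        if (c >= 'A' && c <= 'Z')
            c = static_cast<char>(c - 'A' + 'a');
    }
    return s;
}
```

    Under this grammar "svn+ssh" is a valid scheme and "1http" is not, since
    the first character has to be a letter.
*/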
1995
1996/*!
1997 Returns the scheme of the URL. If an empty string is returned,
1998 this means the scheme is undefined and the URL is then relative.
1999
    The scheme can only contain US-ASCII letters, digits and the characters
    '+', '-' and '.', which means it cannot contain any character that would
    otherwise require encoding. Additionally, schemes are always returned in
    lowercase form.
2003
2004 \sa setScheme(), isRelative()
2005*/
2006QString QUrl::scheme() const
2007{
2008 if (!d) return QString();
2009
2010 return d->scheme;
2011}
2012
2013/*!
2014 Sets the authority of the URL to \a authority.
2015
2016 The authority of a URL is the combination of user info, a host
2017 name and a port. All of these elements are optional; an empty
2018 authority is therefore valid.
2019
2020 The user info and host are separated by a '@', and the host and
2021 port are separated by a ':'. If the user info is empty, the '@'
2022 must be omitted; although a stray ':' is permitted if the port is
2023 empty.
2024
2025 The following example shows a valid authority string:
2026
2027 \image qurl-authority.png
2028
2029 The \a authority data is interpreted according to \a mode: in StrictMode,
2030 any '%' characters must be followed by exactly two hexadecimal characters
2031 and some characters (including space) are not allowed in undecoded form. In
2032 TolerantMode (the default), all characters are accepted in undecoded form
2033 and the tolerant parser will correct stray '%' not followed by two hex
2034 characters.
2035
2036 This function does not allow \a mode to be QUrl::DecodedMode. To set fully
2037 decoded data, call setUserName(), setPassword(), setHost() and setPort()
2038 individually.
2039
2040 \sa setUserInfo(), setHost(), setPort()
2041*/
void QUrl::setAuthority(const QString &authority, ParsingMode mode)
{
    detach();
    d->clearError();

    if (mode == DecodedMode) {
        qWarning("QUrl::setAuthority(): QUrl::DecodedMode is not permitted in this function");
        return;
    }

    d->setAuthority(authority, 0, authority.length(), mode);
    if (authority.isNull()) {
        // QUrlPrivate::setAuthority cleared almost everything
        // but it leaves the Host bit set
        d->sectionIsPresent &= ~QUrlPrivate::Authority;
    }
}
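
/*
    Illustration only, not part of QUrl. A minimal standalone sketch of the
    authority layout described in the setAuthority() documentation: optional
    user info before '@', then the host, then an optional ':' and port. The
    names AuthorityParts and splitAuthority are hypothetical, the split at the
    last '@' is a choice made for this sketch, and no host or percent-encoding
    validation is attempted.

```cpp
#include <cstddef>
#include <string>

struct AuthorityParts {
    std::string userInfo;
    std::string host;
    int port = -1; // -1 means "unspecified"
};

// Returns false if the port is not a decimal number in the range 0..65535.
bool splitAuthority(const std::string &authority, AuthorityParts &out)
{
    out = AuthorityParts();
    std::string hostPort = authority;
    const std::size_t at = authority.rfind('@');
    if (at != std::string::npos) {
        out.userInfo = authority.substr(0, at);
        hostPort = authority.substr(at + 1);
    }

    // Search for the port delimiter only after a bracketed IPv6 literal.
    const std::size_t bracket = hostPort.rfind(']');
    const std::size_t searchFrom = bracket == std::string::npos ? 0 : bracket;
    const std::size_t colon = hostPort.find(':', searchFrom);
    out.host = hostPort.substr(0, colon);
    if (colon == std::string::npos || colon + 1 == hostPort.size())
        return true; // no port, or a stray ':' followed by an empty port

    int port = 0;
    for (std::size_t i = colon + 1; i < hostPort.size(); ++i) {
        const char c = hostPort[i];
        if (c < '0' || c > '9')
            return false;
        port = port * 10 + (c - '0');
        if (port > 65535)
            return false;
    }
    out.port = port;
    return true;
}
```

    The stray ':' case mirrors the documentation: "example.com:" is accepted
    and leaves the port unspecified.
*/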
2059
2060/*!
2061 Returns the authority of the URL if it is defined; otherwise
2062 an empty string is returned.
2063
    This function returns an unambiguous value, which may contain some
    characters still percent-encoded, plus some control sequences not
    representable in decoded form in QString.
2067
2068 The \a options argument controls how to format the user info component. The
2069 value of QUrl::FullyDecoded is not permitted in this function. If you need
2070 to obtain fully decoded data, call userName(), password(), host() and
2071 port() individually.
2072
2073 \sa setAuthority(), userInfo(), userName(), password(), host(), port()
2074*/
QString QUrl::authority(ComponentFormattingOptions options) const
{
    QString result;
    if (!d)
        return result;

    if (options == QUrl::FullyDecoded) {
        qWarning("QUrl::authority(): QUrl::FullyDecoded is not permitted in this function");
        return result;
    }

    d->appendAuthority(result, options, QUrlPrivate::Authority);
    return result;
}
2089
2090/*!
2091 Sets the user info of the URL to \a userInfo. The user info is an
2092 optional part of the authority of the URL, as described in
2093 setAuthority().
2094
2095 The user info consists of a user name and optionally a password,
2096 separated by a ':'. If the password is empty, the colon must be
2097 omitted. The following example shows a valid user info string:
2098
2099 \image qurl-authority3.png
2100
2101 The \a userInfo data is interpreted according to \a mode: in StrictMode,
2102 any '%' characters must be followed by exactly two hexadecimal characters
2103 and some characters (including space) are not allowed in undecoded form. In
2104 TolerantMode (the default), all characters are accepted in undecoded form
2105 and the tolerant parser will correct stray '%' not followed by two hex
2106 characters.
2107
2108 This function does not allow \a mode to be QUrl::DecodedMode. To set fully
2109 decoded data, call setUserName() and setPassword() individually.
2110
2111 \sa userInfo(), setUserName(), setPassword(), setAuthority()
2112*/
void QUrl::setUserInfo(const QString &userInfo, ParsingMode mode)
{
    detach();
    d->clearError();
    QString trimmed = userInfo.trimmed();
    if (mode == DecodedMode) {
        qWarning("QUrl::setUserInfo(): QUrl::DecodedMode is not permitted in this function");
        return;
    }

    d->setUserInfo(trimmed, 0, trimmed.length());
    if (userInfo.isNull()) {
        // QUrlPrivate::setUserInfo cleared almost everything
        // but it leaves the UserName bit set
        d->sectionIsPresent &= ~QUrlPrivate::UserInfo;
    } else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::UserInfo, userInfo)) {
        d->sectionIsPresent &= ~QUrlPrivate::UserInfo;
        d->userName.clear();
        d->password.clear();
    }
}
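
/*
    Illustration only, not part of QUrl. A minimal standalone sketch of the
    user info layout described in the setUserInfo() documentation: a user
    name, optionally followed by ':' and a password. The function name
    splitUserInfo is hypothetical, and splitting at the first ':' is an
    assumption of this sketch; it is consistent with validateComponent(),
    which attributes an error to the password once a ':' has been seen.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Returns {userName, password}. The password is empty when no ':' is present.
std::pair<std::string, std::string> splitUserInfo(const std::string &userInfo)
{
    const std::size_t colon = userInfo.find(':');
    if (colon == std::string::npos)
        return {userInfo, std::string()};
    return {userInfo.substr(0, colon), userInfo.substr(colon + 1)};
}
```

    Any later ':' stays in the password, so "a:b:c" yields the user name "a"
    and the password "b:c".
*/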
2134
2135/*!
2136 Returns the user info of the URL, or an empty string if the user
2137 info is undefined.
2138
    This function returns an unambiguous value, which may contain some
    characters still percent-encoded, plus some control sequences not
    representable in decoded form in QString.
2142
2143 The \a options argument controls how to format the user info component. The
2144 value of QUrl::FullyDecoded is not permitted in this function. If you need
2145 to obtain fully decoded data, call userName() and password() individually.
2146
2147 \sa setUserInfo(), userName(), password(), authority()
2148*/
QString QUrl::userInfo(ComponentFormattingOptions options) const
{
    QString result;
    if (!d)
        return result;

    if (options == QUrl::FullyDecoded) {
        qWarning("QUrl::userInfo(): QUrl::FullyDecoded is not permitted in this function");
        return result;
    }

    d->appendUserInfo(result, options, QUrlPrivate::UserInfo);
    return result;
}
2163
2164/*!
2165 Sets the URL's user name to \a userName. The \a userName is part
2166 of the user info element in the authority of the URL, as described
2167 in setUserInfo().
2168
2169 The \a userName data is interpreted according to \a mode: in StrictMode,
2170 any '%' characters must be followed by exactly two hexadecimal characters
2171 and some characters (including space) are not allowed in undecoded form. In
2172 TolerantMode (the default), all characters are accepted in undecoded form
2173 and the tolerant parser will correct stray '%' not followed by two hex
2174 characters. In DecodedMode, '%' stand for themselves and encoded characters
2175 are not possible.
2176
2177 QUrl::DecodedMode should be used when setting the user name from a data
2178 source which is not a URL, such as a password dialog shown to the user or
2179 with a user name obtained by calling userName() with the QUrl::FullyDecoded
2180 formatting option.
2181
2182 \sa userName(), setUserInfo()
2183*/
void QUrl::setUserName(const QString &userName, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = userName;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    d->setUserName(data, 0, data.length());
    if (userName.isNull())
        d->sectionIsPresent &= ~QUrlPrivate::UserName;
    else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::UserName, userName))
        d->userName.clear();
}
2201
2202/*!
2203 Returns the user name of the URL if it is defined; otherwise
2204 an empty string is returned.
2205
2206 The \a options argument controls how to format the user name component. All
2207 values produce an unambiguous result. With QUrl::FullyDecoded, all
2208 percent-encoded sequences are decoded; otherwise, the returned value may
2209 contain some percent-encoded sequences for some control sequences not
2210 representable in decoded form in QString.
2211
2212 Note that QUrl::FullyDecoded may cause data loss if those non-representable
2213 sequences are present. It is recommended to use that value when the result
2214 will be used in a non-URL context, such as setting in QAuthenticator or
2215 negotiating a login.
2216
2217 \sa setUserName(), userInfo()
2218*/
QString QUrl::userName(ComponentFormattingOptions options) const
{
    QString result;
    if (d)
        d->appendUserName(result, options);
    return result;
}
2226
2227/*!
2228 \fn void QUrl::setEncodedUserName(const QByteArray &userName)
2229 \deprecated
2230 \since 4.4
2231
2232 Sets the URL's user name to the percent-encoded \a userName. The \a
2233 userName is part of the user info element in the authority of the
2234 URL, as described in setUserInfo().
2235
2236 \obsolete Use setUserName(QString::fromUtf8(userName))
2237
2238 \sa setUserName(), encodedUserName(), setUserInfo()
2239*/
2240
2241/*!
2242 \fn QByteArray QUrl::encodedUserName() const
2243 \deprecated
2244 \since 4.4
2245
2246 Returns the user name of the URL if it is defined; otherwise
2247 an empty string is returned. The returned value will have its
2248 non-ASCII and other control characters percent-encoded, as in
2249 toEncoded().
2250
2251 \obsolete Use userName(QUrl::FullyEncoded).toLatin1()
2252
2253 \sa setEncodedUserName()
2254*/
2255
2256/*!
2257 Sets the URL's password to \a password. The \a password is part of
2258 the user info element in the authority of the URL, as described in
2259 setUserInfo().
2260
2261 The \a password data is interpreted according to \a mode: in StrictMode,
2262 any '%' characters must be followed by exactly two hexadecimal characters
2263 and some characters (including space) are not allowed in undecoded form. In
2264 TolerantMode, all characters are accepted in undecoded form and the
2265 tolerant parser will correct stray '%' not followed by two hex characters.
2266 In DecodedMode, '%' stand for themselves and encoded characters are not
2267 possible.
2268
2269 QUrl::DecodedMode should be used when setting the password from a data
2270 source which is not a URL, such as a password dialog shown to the user or
2271 with a password obtained by calling password() with the QUrl::FullyDecoded
2272 formatting option.
2273
2274 \sa password(), setUserInfo()
2275*/
void QUrl::setPassword(const QString &password, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = password;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    d->setPassword(data, 0, data.length());
    if (password.isNull())
        d->sectionIsPresent &= ~QUrlPrivate::Password;
    else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Password, password))
        d->password.clear();
}
2293
2294/*!
2295 Returns the password of the URL if it is defined; otherwise
2296 an empty string is returned.
2297
    The \a options argument controls how to format the password component. All
    values produce an unambiguous result. With QUrl::FullyDecoded, all
    percent-encoded sequences are decoded; otherwise, the returned value may
    contain some percent-encoded sequences for some control sequences not
    representable in decoded form in QString.
2303
2304 Note that QUrl::FullyDecoded may cause data loss if those non-representable
2305 sequences are present. It is recommended to use that value when the result
2306 will be used in a non-URL context, such as setting in QAuthenticator or
2307 negotiating a login.
2308
2309 \sa setPassword()
2310*/
QString QUrl::password(ComponentFormattingOptions options) const
{
    QString result;
    if (d)
        d->appendPassword(result, options);
    return result;
}
2318
2319/*!
2320 \fn void QUrl::setEncodedPassword(const QByteArray &password)
2321 \deprecated
2322 \since 4.4
2323
2324 Sets the URL's password to the percent-encoded \a password. The \a
2325 password is part of the user info element in the authority of the
2326 URL, as described in setUserInfo().
2327
2328 \obsolete Use setPassword(QString::fromUtf8(password));
2329
2330 \sa setPassword(), encodedPassword(), setUserInfo()
2331*/
2332
2333/*!
2334 \fn QByteArray QUrl::encodedPassword() const
2335 \deprecated
2336 \since 4.4
2337
2338 Returns the password of the URL if it is defined; otherwise an
2339 empty string is returned. The returned value will have its
2340 non-ASCII and other control characters percent-encoded, as in
2341 toEncoded().
2342
2343 \obsolete Use password(QUrl::FullyEncoded).toLatin1()
2344
2345 \sa setEncodedPassword(), toEncoded()
2346*/
2347
2348/*!
2349 Sets the host of the URL to \a host. The host is part of the
2350 authority.
2351
2352 The \a host data is interpreted according to \a mode: in StrictMode,
2353 any '%' characters must be followed by exactly two hexadecimal characters
2354 and some characters (including space) are not allowed in undecoded form. In
2355 TolerantMode, all characters are accepted in undecoded form and the
2356 tolerant parser will correct stray '%' not followed by two hex characters.
2357 In DecodedMode, '%' stand for themselves and encoded characters are not
2358 possible.
2359
2360 Note that, in all cases, the result of the parsing must be a valid hostname
2361 according to STD 3 rules, as modified by the Internationalized Resource
2362 Identifiers specification (RFC 3987). Invalid hostnames are not permitted
2363 and will cause isValid() to become false.
2364
2365 \sa host(), setAuthority()
2366*/
void QUrl::setHost(const QString &host, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = host;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    if (d->setHost(data, 0, data.length(), mode)) {
        if (host.isNull())
            d->sectionIsPresent &= ~QUrlPrivate::Host;
    } else if (!data.startsWith(QLatin1Char('['))) {
        // setHost failed, it might be IPv6 or IPvFuture in need of bracketing
        Q_ASSERT(d->error);

        data.prepend(QLatin1Char('['));
        data.append(QLatin1Char(']'));
        if (!d->setHost(data, 0, data.length(), mode)) {
            // failed again
            if (data.contains(QLatin1Char(':'))) {
                // source data contains ':', so it's an IPv6 error
                d->error->code = QUrlPrivate::InvalidIPv6AddressError;
            }
        } else {
            // succeeded
            d->clearError();
        }
    }
}
2399
2400/*!
2401 Returns the host of the URL if it is defined; otherwise
2402 an empty string is returned.
2403
2404 The \a options argument controls how the hostname will be formatted. The
2405 QUrl::EncodeUnicode option will cause this function to return the hostname
2406 in the ASCII-Compatible Encoding (ACE) form, which is suitable for use in
2407 channels that are not 8-bit clean or that require the legacy hostname (such
2408 as DNS requests or in HTTP request headers). If that flag is not present,
2409 this function returns the International Domain Name (IDN) in Unicode form,
2410 according to the list of permissible top-level domains (see
2411 idnWhitelist()).
2412
2413 All other flags are ignored. Host names cannot contain control or percent
2414 characters, so the returned value can be considered fully decoded.
2415
2416 \sa setHost(), idnWhitelist(), setIdnWhitelist(), authority()
2417*/
QString QUrl::host(ComponentFormattingOptions options) const
{
    QString result;
    if (d) {
        d->appendHost(result, options);
        if (result.startsWith(QLatin1Char('[')))
            result = result.mid(1, result.length() - 2);
    }
    return result;
}
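
/*
    Illustration only, not part of QUrl. setHost() retries a rejected host
    with square brackets added, and host() removes the brackets again before
    returning. A minimal standalone sketch of that round trip follows. The
    function names are hypothetical, and using the presence of ':' as the
    trigger for bracketing is a simplification: the real function brackets
    only after the first parse attempt has failed.

```cpp
#include <string>

// Wraps a bare IPv6-looking literal in brackets; other hosts are unchanged.
std::string bracketIfIPv6(const std::string &host)
{
    const bool bracketed = !host.empty() && host.front() == '[';
    if (!bracketed && host.find(':') != std::string::npos)
        return "[" + host + "]";
    return host;
}

// Removes one pair of enclosing brackets, if present.
std::string stripBrackets(const std::string &host)
{
    if (host.size() >= 2 && host.front() == '[' && host.back() == ']')
        return host.substr(1, host.size() - 2);
    return host;
}
```

    For a bare literal such as "::1" the two functions are inverses, which is
    why callers can pass and receive IPv6 addresses without brackets.
*/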
2428
2429/*!
2430 \fn void QUrl::setEncodedHost(const QByteArray &host)
2431 \deprecated
2432 \since 4.4
2433
    Sets the URL's host to the ACE- or percent-encoded \a host. The \a
    host is part of the authority of the URL, as described in
    setAuthority().
2437
2438 \obsolete Use setHost(QString::fromUtf8(host)).
2439
2440 \sa setHost(), encodedHost(), setAuthority(), fromAce()
2441*/
2442
2443/*!
2444 \fn QByteArray QUrl::encodedHost() const
2445 \deprecated
2446 \since 4.4
2447
2448 Returns the host part of the URL if it is defined; otherwise
2449 an empty string is returned.
2450
2451 Note: encodedHost() does not return percent-encoded hostnames. Instead,
2452 the ACE-encoded (bare ASCII in Punycode encoding) form will be
2453 returned for any non-ASCII hostname.
2454
2455 This function is equivalent to calling QUrl::toAce() on the return
2456 value of host().
2457
2458 \obsolete Use host(QUrl::FullyEncoded).toLatin1() or toAce(host()).
2459
2460 \sa setEncodedHost()
2461*/
2462
2463/*!
2464 Sets the port of the URL to \a port. The port is part of the
2465 authority of the URL, as described in setAuthority().
2466
2467 \a port must be between 0 and 65535 inclusive. Setting the
2468 port to -1 indicates that the port is unspecified.
2469*/
void QUrl::setPort(int port)
{
    detach();
    d->clearError();

    if (port < -1 || port > 65535) {
        d->setError(QUrlPrivate::InvalidPortError, QString::number(port), 0);
        port = -1;
    }

    d->port = port;
    if (port != -1)
        d->sectionIsPresent |= QUrlPrivate::Host;
}
2484
2485/*!
2486 \since 4.1
2487
2488 Returns the port of the URL, or \a defaultPort if the port is
2489 unspecified.
2490
2491 Example:
2492
2493 \snippet code/src_corelib_io_qurl.cpp 3
2494*/
2495int QUrl::port(int defaultPort) const
2496{
2497 if (!d) return defaultPort;
2498 return d->port == -1 ? defaultPort : d->port;
2499}
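
/*
    Illustration only, not part of QUrl. A minimal standalone sketch of the
    port convention used by setPort() and port(): values from 0 to 65535 are
    stored, -1 means "unspecified", anything else is recorded as an error and
    treated as unspecified, and the getter substitutes a caller-supplied
    default. The names PortField, assignPort and portOr are hypothetical.

```cpp
struct PortField {
    int port = -1;         // -1 means "unspecified"
    bool hasError = false; // set when an out-of-range value was rejected
};

void assignPort(PortField &field, int port)
{
    field.hasError = port < -1 || port > 65535;
    field.port = field.hasError ? -1 : port;
}

int portOr(const PortField &field, int defaultPort)
{
    return field.port == -1 ? defaultPort : field.port;
}
```

    A typical caller passes the well-known port of the scheme as the default,
    for example portOr(field, 80) for HTTP.
*/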
2500
2501/*!
2502 Sets the path of the URL to \a path. The path is the part of the
2503 URL that comes after the authority but before the query string.
2504
2505 \image qurl-ftppath.png
2506
2507 For non-hierarchical schemes, the path will be everything
2508 following the scheme declaration, as in the following example:
2509
2510 \image qurl-mailtopath.png
2511
2512 The \a path data is interpreted according to \a mode: in StrictMode,
2513 any '%' characters must be followed by exactly two hexadecimal characters
2514 and some characters (including space) are not allowed in undecoded form. In
2515 TolerantMode, all characters are accepted in undecoded form and the
2516 tolerant parser will correct stray '%' not followed by two hex characters.
2517 In DecodedMode, '%' stand for themselves and encoded characters are not
2518 possible.
2519
2520 QUrl::DecodedMode should be used when setting the path from a data source
2521 which is not a URL, such as a dialog shown to the user or with a path
2522 obtained by calling path() with the QUrl::FullyDecoded formatting option.
2523
2524 \sa path()
2525*/
void QUrl::setPath(const QString &path, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = path;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    d->setPath(data, 0, data.length());

    // optimized out, since there is no path delimiter
//    if (path.isNull())
//        d->sectionIsPresent &= ~QUrlPrivate::Path;
//    else
    if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Path, path))
        d->path.clear();
}
2546
2547/*!
2548 Returns the path of the URL.
2549
2550 \snippet code/src_corelib_io_qurl.cpp 12
2551
2552 The \a options argument controls how to format the path component. All
2553 values produce an unambiguous result. With QUrl::FullyDecoded, all
2554 percent-encoded sequences are decoded; otherwise, the returned value may
2555 contain some percent-encoded sequences for some control sequences not
2556 representable in decoded form in QString.
2557
2558 Note that QUrl::FullyDecoded may cause data loss if those non-representable
2559 sequences are present. It is recommended to use that value when the result
2560 will be used in a non-URL context, such as sending to an FTP server.
2561
2562 An example of data loss is when you have non-Unicode percent-encoded sequences
2563 and use FullyDecoded (the default):
2564
2565 \snippet code/src_corelib_io_qurl.cpp 13
2566
2567 In this example, there will be some level of data loss because the \c %FF cannot
2568 be converted.
2569
2570 Data loss can also occur when the path contains sub-delimiters (such as \c +):
2571
2572 \snippet code/src_corelib_io_qurl.cpp 14
2573
2574 Other decoding examples:
2575
2576 \snippet code/src_corelib_io_qurl.cpp 15
2577
2578 \sa setPath()
2579*/
QString QUrl::path(ComponentFormattingOptions options) const
{
    QString result;
    if (d)
        d->appendPath(result, options, QUrlPrivate::Path);
    return result;
}
2587
2588/*!
2589 \fn void QUrl::setEncodedPath(const QByteArray &path)
2590 \deprecated
2591 \since 4.4
2592
2593 Sets the URL's path to the percent-encoded \a path. The path is
2594 the part of the URL that comes after the authority but before the
2595 query string.
2596
2597 \image qurl-ftppath.png
2598
2599 For non-hierarchical schemes, the path will be everything
2600 following the scheme declaration, as in the following example:
2601
2602 \image qurl-mailtopath.png
2603
2604 \obsolete Use setPath(QString::fromUtf8(path)).
2605
2606 \sa setPath(), encodedPath(), setUserInfo()
2607*/
2608
2609/*!
2610 \fn QByteArray QUrl::encodedPath() const
2611 \deprecated
2612 \since 4.4
2613
2614 Returns the path of the URL if it is defined; otherwise an
2615 empty string is returned. The returned value will have its
2616 non-ASCII and other control characters percent-encoded, as in
2617 toEncoded().
2618
2619 \obsolete Use path(QUrl::FullyEncoded).toLatin1().
2620
2621 \sa setEncodedPath(), toEncoded()
2622*/
2623
2624/*!
2625 \since 5.2
2626
2627 Returns the name of the file, excluding the directory path.
2628
2629 Note that, if this QUrl object is given a path ending in a slash, the name of the file is considered empty.
2630
2631 If the path doesn't contain any slash, it is fully returned as the fileName.
2632
2633 Example:
2634
2635 \snippet code/src_corelib_io_qurl.cpp 7
2636
2637 The \a options argument controls how to format the file name component. All
2638 values produce an unambiguous result. With QUrl::FullyDecoded, all
2639 percent-encoded sequences are decoded; otherwise, the returned value may
2640 contain some percent-encoded sequences for some control sequences not
2641 representable in decoded form in QString.
2642
2643 \sa path()
2644*/
QString QUrl::fileName(ComponentFormattingOptions options) const
{
    const QString ourPath = path(options);
    const int slash = ourPath.lastIndexOf(QLatin1Char('/'));
    if (slash == -1)
        return ourPath;
    return ourPath.mid(slash + 1);
}
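
/*
    Illustrative sketch only, not part of the Qt sources: the slash-splitting
    rule applied by fileName(), mirrored on std::string so that it compiles
    without Qt. The names fileNameOf() and fileNameOfDemo() are hypothetical.

```cpp
#include <cassert>
#include <string>

// Returns everything after the last '/' in path, or the whole path when it
// contains no slash (a stand-in for the logic of QUrl::fileName()).
std::string fileNameOf(const std::string &path)
{
    const std::string::size_type slash = path.rfind('/');
    if (slash == std::string::npos)
        return path;
    return path.substr(slash + 1);
}

// Small usage demonstration covering the three documented cases.
void fileNameOfDemo()
{
    assert(fileNameOf("/dir/file.txt") == "file.txt"); // ordinary path
    assert(fileNameOf("/dir/").empty());               // trailing slash: empty name
    assert(fileNameOf("file.txt") == "file.txt");      // no slash: whole path
}
```
*/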
2653
2654/*!
2655 \since 4.2
2656
    Returns \c true if this URL contains a query (i.e., if a '?' was seen in it).
2658
2659 \sa setQuery(), query(), hasFragment()
2660*/
2661bool QUrl::hasQuery() const
2662{
2663 if (!d) return false;
2664 return d->hasQuery();
2665}
2666
2667/*!
2668 Sets the query string of the URL to \a query.
2669
2670 This function is useful if you need to pass a query string that
2671 does not fit into the key-value pattern, or that uses a different
2672 scheme for encoding special characters than what is suggested by
2673 QUrl.
2674
2675 Passing a value of QString() to \a query (a null QString) unsets
2676 the query completely. However, passing a value of QString("")
2677 will set the query to an empty value, as if the original URL
2678 had a lone "?".
2679
2680 The \a query data is interpreted according to \a mode: in StrictMode,
2681 any '%' characters must be followed by exactly two hexadecimal characters
2682 and some characters (including space) are not allowed in undecoded form. In
2683 TolerantMode, all characters are accepted in undecoded form and the
2684 tolerant parser will correct stray '%' not followed by two hex characters.
2685 In DecodedMode, '%' stand for themselves and encoded characters are not
2686 possible.
2687
2688 Query strings often contain percent-encoded sequences, so use of
2689 DecodedMode is discouraged. One special sequence to be aware of is that of
2690 the plus character ('+'). QUrl does not convert spaces to plus characters,
2691 even though HTML forms posted by web browsers do. In order to represent an
2692 actual plus character in a query, the sequence "%2B" is usually used. This
2693 function will leave "%2B" sequences untouched in TolerantMode or
2694 StrictMode.
2695
2696 \sa query(), hasQuery()
2697*/
void QUrl::setQuery(const QString &query, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = query;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    d->setQuery(data, 0, data.length());
    if (query.isNull())
        d->sectionIsPresent &= ~QUrlPrivate::Query;
    else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Query, query))
        d->query.clear();
}
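
/*
    Illustrative sketch only, not part of the Qt sources: a check for the one
    StrictMode rule that every '%' must be followed by two hexadecimal digits.
    It does not cover the other StrictMode restrictions (for example, the
    rejection of undecoded spaces), and it is not how validateComponent() is
    implemented. The name hasWellFormedPercents() is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns true when every '%' in input starts a "%XX" sequence with two
// hexadecimal digits.
bool hasWellFormedPercents(const std::string &input)
{
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        if (input[i] != '%')
            continue;
        if (i + 2 >= input.size())
            return false; // fewer than two characters follow the '%'
        if (!std::isxdigit(static_cast<unsigned char>(input[i + 1]))
            || !std::isxdigit(static_cast<unsigned char>(input[i + 2])))
            return false; // not a hexadecimal pair
        i += 2; // skip the two digits just validated
    }
    return true;
}

// Small usage demonstration.
void hasWellFormedPercentsDemo()
{
    assert(hasWellFormedPercents("a=%2B&b=c")); // "%2B" is an encoded '+'
    assert(!hasWellFormedPercents("a=%2"));     // truncated sequence
    assert(!hasWellFormedPercents("a=%zz"));    // non-hexadecimal digits
}
```
*/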
2715
2716/*!
2717 \fn void QUrl::setEncodedQuery(const QByteArray &query)
2718 \deprecated
2719
2720 Sets the query string of the URL to \a query. The string is
2721 inserted as-is, and no further encoding is performed when calling
2722 toEncoded().
2723
2724 This function is useful if you need to pass a query string that
2725 does not fit into the key-value pattern, or that uses a different
2726 scheme for encoding special characters than what is suggested by
2727 QUrl.
2728
2729 Passing a value of QByteArray() to \a query (a null QByteArray) unsets
2730 the query completely. However, passing a value of QByteArray("")
2731 will set the query to an empty value, as if the original URL
2732 had a lone "?".
2733
2734 \obsolete Use setQuery, which has the same null / empty behavior.
2735
2736 \sa encodedQuery(), hasQuery()
2737*/
2738
2739/*!
2740 \overload
2741 \since 5.0
2742 Sets the query string of the URL to \a query.
2743
    This function reconstructs the query string from the QUrlQuery object and
    sets it on this QUrl object. This function does not have parsing parameters
    because the QUrlQuery contains data that is already parsed.
2747
2748 \sa query(), hasQuery()
2749*/
2750void QUrl::setQuery(const QUrlQuery &query)
2751{
2752 detach();
2753 d->clearError();
2754
2755 // we know the data is in the right format
2756 d->query = query.toString();
2757 if (query.isEmpty())
2758 d->sectionIsPresent &= ~QUrlPrivate::Query;
2759 else
2760 d->sectionIsPresent |= QUrlPrivate::Query;
2761}
2762
2763/*!
2764 \fn void QUrl::setQueryItems(const QList<QPair<QString, QString> > &query)
2765 \deprecated
2766
2767 Sets the query string of the URL to an encoded version of \a
2768 query. The contents of \a query are converted to a string
2769 internally, each pair delimited by the character returned by
    \l {QUrlQuery::queryPairDelimiter()}{queryPairDelimiter()}, and the key and value are delimited by
    \l {QUrlQuery::queryValueDelimiter()}{queryValueDelimiter()}.
2772
2773 \note This method does not encode spaces (ASCII 0x20) as plus (+) signs,
2774 like HTML forms do. If you need that kind of encoding, you must encode
2775 the value yourself and use QUrl::setEncodedQueryItems.
2776
2777 \obsolete Use QUrlQuery and setQuery().
2778
2779 \sa queryItems(), setEncodedQueryItems()
2780*/
2781
2782/*!
2783 \fn void QUrl::setEncodedQueryItems(const QList<QPair<QByteArray, QByteArray> > &query)
2784 \deprecated
2785 \since 4.4
2786
2787 Sets the query string of the URL to the encoded version of \a
2788 query. The contents of \a query are converted to a string
2789 internally, each pair delimited by the character returned by
2790 \l {QUrlQuery::queryPairDelimiter()}{queryPairDelimiter()}, and the key and value are delimited by
2791 \l {QUrlQuery::queryValueDelimiter()}{queryValueDelimiter()}.
2792
2793 \obsolete Use QUrlQuery and setQuery().
2794
2795 \sa encodedQueryItems(), setQueryItems()
2796*/
2797
2798/*!
2799 \fn void QUrl::addQueryItem(const QString &key, const QString &value)
2800 \deprecated
2801
2802 Inserts the pair \a key = \a value into the query string of the
2803 URL.
2804
    The key-value pair is encoded before it is added to the query. The
    pair is converted into separate strings internally. The \a key and
    \a value are first encoded into UTF-8 and then delimited by the
    character returned by \l {QUrlQuery::queryValueDelimiter()}{queryValueDelimiter()}.
    Each key-value pair is delimited by the character returned by
    \l {QUrlQuery::queryPairDelimiter()}{queryPairDelimiter()}.
2811
2812 \note This method does not encode spaces (ASCII 0x20) as plus (+) signs,
2813 like HTML forms do. If you need that kind of encoding, you must encode
2814 the value yourself and use QUrl::addEncodedQueryItem.
2815
2816 \obsolete Use QUrlQuery and setQuery().
2817
2818 \sa addEncodedQueryItem()
2819*/
2820
2821/*!
2822 \fn void QUrl::addEncodedQueryItem(const QByteArray &key, const QByteArray &value)
2823 \deprecated
2824 \since 4.4
2825
2826 Inserts the pair \a key = \a value into the query string of the
2827 URL.
2828
2829 \obsolete Use QUrlQuery and setQuery().
2830
2831 \sa addQueryItem()
2832*/
2833
2834/*!
2835 \fn QList<QPair<QString, QString> > QUrl::queryItems() const
2836 \deprecated
2837
2838 Returns the query string of the URL, as a map of keys and values.
2839
    \note This method does not decode plus (+) signs as spaces (ASCII
    0x20), like HTML forms do. If you need that kind of decoding, you must
    use QUrl::encodedQueryItems and decode the data yourself.
2843
2844 \obsolete Use QUrlQuery.
2845
2846 \sa setQueryItems(), setEncodedQuery()
2847*/
2848
2849/*!
2850 \fn QList<QPair<QByteArray, QByteArray> > QUrl::encodedQueryItems() const
2851 \deprecated
2852 \since 4.4
2853
2854 Returns the query string of the URL, as a map of encoded keys and values.
2855
2856 \obsolete Use QUrlQuery.
2857
2858 \sa setEncodedQueryItems(), setQueryItems(), setEncodedQuery()
2859*/
2860
2861/*!
2862 \fn bool QUrl::hasQueryItem(const QString &key) const
2863 \deprecated
2864
2865 Returns \c true if there is a query string pair whose key is equal
2866 to \a key from the URL.
2867
2868 \obsolete Use QUrlQuery.
2869
2870 \sa hasEncodedQueryItem()
2871*/
2872
2873/*!
2874 \fn bool QUrl::hasEncodedQueryItem(const QByteArray &key) const
2875 \deprecated
2876 \since 4.4
2877
2878 Returns \c true if there is a query string pair whose key is equal
2879 to \a key from the URL.
2880
2881 \obsolete Use QUrlQuery.
2882
2883 \sa hasQueryItem()
2884*/
2885
2886/*!
2887 \fn QString QUrl::queryItemValue(const QString &key) const
2888 \deprecated
2889
2890 Returns the first query string value whose key is equal to \a key
2891 from the URL.
2892
    \note This method does not decode plus (+) signs as spaces (ASCII
    0x20), like HTML forms do. If you need that kind of decoding, you must
    use QUrl::encodedQueryItemValue and decode the data yourself.
2896
2897 \obsolete Use QUrlQuery.
2898
2899 \sa allQueryItemValues()
2900*/
2901
2902/*!
2903 \fn QByteArray QUrl::encodedQueryItemValue(const QByteArray &key) const
2904 \deprecated
2905 \since 4.4
2906
2907 Returns the first query string value whose key is equal to \a key
2908 from the URL.
2909
2910 \obsolete Use QUrlQuery.
2911
2912 \sa queryItemValue(), allQueryItemValues()
2913*/
2914
2915/*!
2916 \fn QStringList QUrl::allQueryItemValues(const QString &key) const
2917 \deprecated
2918
    Returns a list of query string values whose key is equal to
    \a key from the URL.

    \note This method does not decode plus (+) signs as spaces (ASCII
    0x20), like HTML forms do. If you need that kind of decoding, you must
    use QUrl::allEncodedQueryItemValues and decode the data yourself.
2925
2926 \obsolete Use QUrlQuery.
2927
2928 \sa queryItemValue()
2929*/
2930
2931/*!
2932 \fn QList<QByteArray> QUrl::allEncodedQueryItemValues(const QByteArray &key) const
2933 \deprecated
2934 \since 4.4
2935
    Returns a list of query string values whose key is equal to
    \a key from the URL.
2938
2939 \obsolete Use QUrlQuery.
2940
2941 \sa allQueryItemValues(), queryItemValue(), encodedQueryItemValue()
2942*/
2943
2944/*!
2945 \fn void QUrl::removeQueryItem(const QString &key)
2946 \deprecated
2947
2948 Removes the first query string pair whose key is equal to \a key
2949 from the URL.
2950
2951 \obsolete Use QUrlQuery.
2952
2953 \sa removeAllQueryItems()
2954*/
2955
2956/*!
2957 \fn void QUrl::removeEncodedQueryItem(const QByteArray &key)
2958 \deprecated
2959 \since 4.4
2960
2961 Removes the first query string pair whose key is equal to \a key
2962 from the URL.
2963
2964 \obsolete Use QUrlQuery.
2965
2966 \sa removeQueryItem(), removeAllQueryItems()
2967*/
2968
2969/*!
2970 \fn void QUrl::removeAllQueryItems(const QString &key)
2971 \deprecated
2972
2973 Removes all the query string pairs whose key is equal to \a key
2974 from the URL.
2975
2976 \obsolete Use QUrlQuery.
2977
2978 \sa removeQueryItem()
2979*/
2980
2981/*!
2982 \fn void QUrl::removeAllEncodedQueryItems(const QByteArray &key)
2983 \deprecated
2984 \since 4.4
2985
2986 Removes all the query string pairs whose key is equal to \a key
2987 from the URL.
2988
2989 \obsolete Use QUrlQuery.
2990
2991 \sa removeQueryItem()
2992*/
2993
2994/*!
2995 \fn QByteArray QUrl::encodedQuery() const
2996 \deprecated
2997
2998 Returns the query string of the URL in percent encoded form.
2999
3000 \obsolete Use query(QUrl::FullyEncoded).toLatin1()
3001
3002 \sa setEncodedQuery(), query()
3003*/
3004
3005/*!
3006 Returns the query string of the URL if there's a query string, or an empty
3007 result if not. To determine if the parsed URL contained a query string, use
3008 hasQuery().
3009
3010 The \a options argument controls how to format the query component. All
3011 values produce an unambiguous result. With QUrl::FullyDecoded, all
3012 percent-encoded sequences are decoded; otherwise, the returned value may
3013 contain some percent-encoded sequences for some control sequences not
3014 representable in decoded form in QString.
3015
3016 Note that use of QUrl::FullyDecoded in queries is discouraged, as queries
3017 often contain data that is supposed to remain percent-encoded, including
3018 the use of the "%2B" sequence to represent a plus character ('+').
3019
3020 \sa setQuery(), hasQuery()
3021*/
QString QUrl::query(ComponentFormattingOptions options) const
{
    QString result;
    if (d) {
        d->appendQuery(result, options, QUrlPrivate::Query);
        if (d->hasQuery() && result.isNull())
            result.detach();
    }
    return result;
}
3032
3033/*!
3034 Sets the fragment of the URL to \a fragment. The fragment is the
3035 last part of the URL, represented by a '#' followed by a string of
3036 characters. It is typically used in HTTP for referring to a
3037 certain link or point on a page:
3038
3039 \image qurl-fragment.png
3040
3041 The fragment is sometimes also referred to as the URL "reference".
3042
3043 Passing an argument of QString() (a null QString) will unset the fragment.
3044 Passing an argument of QString("") (an empty but not null QString) will set the
3045 fragment to an empty string (as if the original URL had a lone "#").
3046
3047 The \a fragment data is interpreted according to \a mode: in StrictMode,
3048 any '%' characters must be followed by exactly two hexadecimal characters
3049 and some characters (including space) are not allowed in undecoded form. In
3050 TolerantMode, all characters are accepted in undecoded form and the
3051 tolerant parser will correct stray '%' not followed by two hex characters.
3052 In DecodedMode, '%' stand for themselves and encoded characters are not
3053 possible.
3054
3055 QUrl::DecodedMode should be used when setting the fragment from a data
3056 source which is not a URL or with a fragment obtained by calling
3057 fragment() with the QUrl::FullyDecoded formatting option.
3058
3059 \sa fragment(), hasFragment()
3060*/
void QUrl::setFragment(const QString &fragment, ParsingMode mode)
{
    detach();
    d->clearError();

    QString data = fragment;
    if (mode == DecodedMode) {
        parseDecodedComponent(data);
        mode = TolerantMode;
    }

    d->setFragment(data, 0, data.length());
    if (fragment.isNull())
        d->sectionIsPresent &= ~QUrlPrivate::Fragment;
    else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Fragment, fragment))
        d->fragment.clear();
}
3078
3079/*!
3080 Returns the fragment of the URL. To determine if the parsed URL contained a
3081 fragment, use hasFragment().
3082
3083 The \a options argument controls how to format the fragment component. All
3084 values produce an unambiguous result. With QUrl::FullyDecoded, all
3085 percent-encoded sequences are decoded; otherwise, the returned value may
3086 contain some percent-encoded sequences for some control sequences not
3087 representable in decoded form in QString.
3088
3089 Note that QUrl::FullyDecoded may cause data loss if those non-representable
3090 sequences are present. It is recommended to use that value when the result
3091 will be used in a non-URL context.
3092
3093 \sa setFragment(), hasFragment()
3094*/
QString QUrl::fragment(ComponentFormattingOptions options) const
{
    QString result;
    if (d) {
        d->appendFragment(result, options, QUrlPrivate::Fragment);
        if (d->hasFragment() && result.isNull())
            result.detach();
    }
    return result;
}
3105
3106/*!
3107 \fn void QUrl::setEncodedFragment(const QByteArray &fragment)
3108 \deprecated
3109 \since 4.4
3110
3111 Sets the URL's fragment to the percent-encoded \a fragment. The fragment is the
3112 last part of the URL, represented by a '#' followed by a string of
3113 characters. It is typically used in HTTP for referring to a
3114 certain link or point on a page:
3115
3116 \image qurl-fragment.png
3117
3118 The fragment is sometimes also referred to as the URL "reference".
3119
3120 Passing an argument of QByteArray() (a null QByteArray) will unset the fragment.
3121 Passing an argument of QByteArray("") (an empty but not null QByteArray)
3122 will set the fragment to an empty string (as if the original URL
3123 had a lone "#").
3124
3125 \obsolete Use setFragment(), which has the same behavior of null / empty.
3126
3127 \sa setFragment(), encodedFragment()
3128*/
3129
3130/*!
3131 \fn QByteArray QUrl::encodedFragment() const
3132 \deprecated
3133 \since 4.4
3134
3135 Returns the fragment of the URL if it is defined; otherwise an
3136 empty string is returned. The returned value will have its
3137 non-ASCII and other control characters percent-encoded, as in
3138 toEncoded().
3139
    \obsolete Use fragment(QUrl::FullyEncoded).toLatin1().
3141
3142 \sa setEncodedFragment(), toEncoded()
3143*/
3144
3145/*!
3146 \since 4.2
3147
    Returns \c true if this URL contains a fragment (i.e., if a '#' was seen in it).
3149
3150 \sa fragment(), setFragment()
3151*/
3152bool QUrl::hasFragment() const
3153{
3154 if (!d) return false;
3155 return d->hasFragment();
3156}
3157
3158#if QT_DEPRECATED_SINCE(5, 15)
3159#if QT_CONFIG(topleveldomain)
3160/*!
3161 \since 4.8
3162
3163 \deprecated
3164
    Returns the TLD (Top-Level Domain) of the URL (e.g. .co.uk, .net).
    Note that the return value is prefixed with a '.' unless the
    URL does not contain a valid TLD, in which case the function returns
    an empty string.

    Note that this function considers a TLD to be any domain under which users
    can register subdomains, including many home, dynamic DNS websites and
    blogging providers. This is useful for determining whether two websites
3173 belong to the same infrastructure and communication should be allowed, such
3174 as browser cookies: two domains should be considered part of the same
3175 website if they share at least one label in addition to the value
3176 returned by this function.
3177
3178 \list
3179 \li \c{foo.co.uk} and \c{foo.com} do not share a top-level domain
3180 \li \c{foo.co.uk} and \c{bar.co.uk} share the \c{.co.uk} domain, but the next label is different
3181 \li \c{www.foo.co.uk} and \c{ftp.foo.co.uk} share the same top-level domain and one more label,
3182 so they are considered part of the same site
3183 \endlist
3184
3185 If \a options includes EncodeUnicode, the returned string will be in
3186 ASCII Compatible Encoding.
3187*/
QString QUrl::topLevelDomain(ComponentFormattingOptions options) const
{
    QString tld = qTopLevelDomain(host());
    if (options & EncodeUnicode) {
        return qt_ACE_do(tld, ToAceOnly, AllowLeadingDot);
    }
    return tld;
}
3196#endif
3197#endif // QT_DEPRECATED_SINCE(5, 15)
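
/*
    Illustrative sketch only, not part of the Qt sources: the "same site" rule
    described for topLevelDomain(), namely that two hosts belong to the same
    site when they share the TLD plus one more label. The TLD is passed in by
    the caller with its leading dot (as topLevelDomain() returns it), and hosts
    are assumed to be lowercase already. The names siteOf(), sameSite() and
    sameSiteDemo() are hypothetical.

```cpp
#include <cassert>
#include <string>

// Returns the label immediately left of tld plus tld itself, for example
// ("www.foo.co.uk", ".co.uk") gives "foo.co.uk". Returns an empty string
// when host does not end in tld or has no label in front of it.
std::string siteOf(const std::string &host, const std::string &tld)
{
    if (tld.empty() || host.size() <= tld.size()
        || host.compare(host.size() - tld.size(), tld.size(), tld) != 0)
        return std::string();
    const std::string::size_type labelEnd = host.size() - tld.size();
    const std::string::size_type dot = host.rfind('.', labelEnd - 1);
    const std::string::size_type labelStart = (dot == std::string::npos) ? 0 : dot + 1;
    if (labelStart == labelEnd)
        return std::string(); // empty label in front of the TLD
    return host.substr(labelStart);
}

// True when both hosts share tld and the label directly in front of it.
bool sameSite(const std::string &a, const std::string &b, const std::string &tld)
{
    const std::string site = siteOf(a, tld);
    return !site.empty() && site == siteOf(b, tld);
}

// The three cases listed in the topLevelDomain() documentation.
void sameSiteDemo()
{
    assert(!sameSite("foo.co.uk", "foo.com", ".co.uk"));          // different TLDs
    assert(!sameSite("foo.co.uk", "bar.co.uk", ".co.uk"));        // next label differs
    assert(sameSite("www.foo.co.uk", "ftp.foo.co.uk", ".co.uk")); // same site
}
```
*/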
3198/*!
3199 Returns the result of the merge of this URL with \a relative. This
3200 URL is used as a base to convert \a relative to an absolute URL.
3201
3202 If \a relative is not a relative URL, this function will return \a
3203 relative directly. Otherwise, the paths of the two URLs are
3204 merged, and the new URL returned has the scheme and authority of
3205 the base URL, but with the merged path, as in the following
3206 example:
3207
3208 \snippet code/src_corelib_io_qurl.cpp 5
3209
3210 Calling resolved() with ".." returns a QUrl whose directory is
3211 one level higher than the original. Similarly, calling resolved()
3212 with "../.." removes two levels from the path. If \a relative is
3213 "/", the path becomes "/".
3214
3215 \sa isRelative()
3216*/
QUrl QUrl::resolved(const QUrl &relative) const
{
    if (!d) return relative;
    if (!relative.d) return *this;

    QUrl t;
    if (!relative.d->scheme.isEmpty()) {
        t = relative;
        t.detach();
    } else {
        if (relative.d->hasAuthority()) {
            t = relative;
            t.detach();
        } else {
            t.d = new QUrlPrivate;

            // copy the authority
            t.d->userName = d->userName;
            t.d->password = d->password;
            t.d->host = d->host;
            t.d->port = d->port;
            t.d->sectionIsPresent = d->sectionIsPresent & QUrlPrivate::Authority;

            if (relative.d->path.isEmpty()) {
                t.d->path = d->path;
                if (relative.d->hasQuery()) {
                    t.d->query = relative.d->query;
                    t.d->sectionIsPresent |= QUrlPrivate::Query;
                } else if (d->hasQuery()) {
                    t.d->query = d->query;
                    t.d->sectionIsPresent |= QUrlPrivate::Query;
                }
            } else {
                t.d->path = relative.d->path.startsWith(QLatin1Char('/'))
                            ? relative.d->path
                            : d->mergePaths(relative.d->path);
                if (relative.d->hasQuery()) {
                    t.d->query = relative.d->query;
                    t.d->sectionIsPresent |= QUrlPrivate::Query;
                }
            }
        }
        t.d->scheme = d->scheme;
        if (d->hasScheme())
            t.d->sectionIsPresent |= QUrlPrivate::Scheme;
        else
            t.d->sectionIsPresent &= ~QUrlPrivate::Scheme;
        t.d->flags |= d->flags & QUrlPrivate::IsLocalFile;
    }
    t.d->fragment = relative.d->fragment;
    if (relative.d->hasFragment())
        t.d->sectionIsPresent |= QUrlPrivate::Fragment;
    else
        t.d->sectionIsPresent &= ~QUrlPrivate::Fragment;

    removeDotsFromPath(&t.d->path);

#if defined(QURL_DEBUG)
    qDebug("QUrl(\"%ls\").resolved(\"%ls\") = \"%ls\"",
           qUtf16Printable(url()),
           qUtf16Printable(relative.url()),
           qUtf16Printable(t.url()));
#endif
    return t;
}
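
/*
    Illustrative sketch only, not part of the Qt sources: the path handling
    that resolved() performs, written against RFC 3986 sections 5.2.3 (merge
    paths) and 5.2.4 (remove dot segments) using std::string. It follows the
    RFC's algorithm text, not the implementation of QUrlPrivate::mergePaths()
    or removeDotsFromPath(). All names below are hypothetical.

```cpp
#include <cassert>
#include <string>

// RFC 3986, 5.2.3: keep the base path up to and including its last '/',
// then append the relative path.
std::string mergePaths(const std::string &basePath, const std::string &relativePath,
                       bool baseHasAuthority)
{
    if (baseHasAuthority && basePath.empty())
        return "/" + relativePath;
    const std::string::size_type slash = basePath.rfind('/');
    if (slash == std::string::npos)
        return relativePath;
    return basePath.substr(0, slash + 1) + relativePath;
}

// Drops the last segment of out together with its preceding '/'.
static void popLastSegment(std::string &out)
{
    const std::string::size_type slash = out.rfind('/');
    out.erase(slash == std::string::npos ? 0 : slash);
}

// RFC 3986, 5.2.4: resolve "." and ".." segments.
std::string removeDotSegments(std::string in)
{
    std::string out;
    while (!in.empty()) {
        if (in.compare(0, 3, "../") == 0) {
            in.erase(0, 3);
        } else if (in.compare(0, 2, "./") == 0) {
            in.erase(0, 2);
        } else if (in.compare(0, 3, "/./") == 0) {
            in.erase(0, 2); // "/./x" becomes "/x"
        } else if (in == "/.") {
            in = "/";
        } else if (in.compare(0, 4, "/../") == 0) {
            in.erase(0, 3); // "/../x" becomes "/x"
            popLastSegment(out);
        } else if (in == "/..") {
            in = "/";
            popLastSegment(out);
        } else if (in == "." || in == "..") {
            in.clear();
        } else {
            // Move the first segment, including any leading '/', to the output.
            const std::string::size_type next = in.find('/', 1);
            out += in.substr(0, next);
            in.erase(0, next);
        }
    }
    return out;
}

// Examples taken from RFC 3986 section 5.4, base path "/b/c/d;p".
void resolvePathDemo()
{
    assert(removeDotSegments(mergePaths("/b/c/d;p", "g", true)) == "/b/c/g");
    assert(removeDotSegments(mergePaths("/b/c/d;p", "../g", true)) == "/b/g");
    assert(removeDotSegments(mergePaths("/b/c/d;p", "../../g", true)) == "/g");
}
```
*/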
3282
3283/*!
    Returns \c true if the URL is relative; otherwise returns \c false. A URL is
    a relative reference if its scheme is undefined; this function is therefore
    equivalent to calling scheme().isEmpty().
3287
3288 Relative references are defined in RFC 3986 section 4.2.
3289
3290 \sa {Relative URLs vs Relative Paths}
3291*/
3292bool QUrl::isRelative() const
3293{
3294 if (!d) return true;
3295 return !d->hasScheme();
3296}
3297
3298/*!
3299 Returns a string representation of the URL. The output can be customized by
3300 passing flags with \a options. The option QUrl::FullyDecoded is not
3301 permitted in this function since it would generate ambiguous data.
3302
3303 The resulting QString can be passed back to a QUrl later on.
3304
3305 Synonym for toString(options).
3306
3307 \sa FormattingOptions, toEncoded(), toString()
3308*/
3309QString QUrl::url(FormattingOptions options) const
3310{
3311 return toString(options);
3312}
3313
3314/*!
3315 Returns a string representation of the URL. The output can be customized by
3316 passing flags with \a options. The option QUrl::FullyDecoded is not
3317 permitted in this function since it would generate ambiguous data.
3318
3319 The default formatting option is \l{QUrl::FormattingOptions}{PrettyDecoded}.
3320
3321 \sa FormattingOptions, url(), setUrl()
3322*/
QString QUrl::toString(FormattingOptions options) const
{
    QString url;
    if (!isValid()) {
        // also catches isEmpty()
        return url;
    }
    if ((options & QUrl::FullyDecoded) == QUrl::FullyDecoded) {
        qWarning("QUrl: QUrl::FullyDecoded is not permitted when reconstructing the full URL");
        options &= ~QUrl::FullyDecoded;
        //options |= QUrl::PrettyDecoded; // no-op, value is 0
    }

    // return just the path if:
    //  - QUrl::PreferLocalFile is passed
    //  - QUrl::RemovePath isn't passed (rather stupid if the user did...)
    //  - there's no query or fragment to return
    //    that is, either they aren't present, or we're removing them
    //  - it's a local file
    if (options.testFlag(QUrl::PreferLocalFile) && !options.testFlag(QUrl::RemovePath)
            && (!d->hasQuery() || options.testFlag(QUrl::RemoveQuery))
            && (!d->hasFragment() || options.testFlag(QUrl::RemoveFragment))
            && isLocalFile()) {
        url = d->toLocalFile(options | QUrl::FullyDecoded);
        return url;
    }

    // for the full URL, we consider that the reserved characters are prettier if encoded
    if (options & DecodeReserved)
        options &= ~EncodeReserved;
    else
        options |= EncodeReserved;

    if (!(options & QUrl::RemoveScheme) && d->hasScheme())
        url += d->scheme + QLatin1Char(':');

    bool pathIsAbsolute = d->path.startsWith(QLatin1Char('/'));
    if (!((options & QUrl::RemoveAuthority) == QUrl::RemoveAuthority) && d->hasAuthority()) {
        url += QLatin1String("//");
        d->appendAuthority(url, options, QUrlPrivate::FullUrl);
    } else if (isLocalFile() && pathIsAbsolute) {
        // Comply with the XDG file URI spec, which requires triple slashes.
        url += QLatin1String("//");
    }

    if (!(options & QUrl::RemovePath))
        d->appendPath(url, options, QUrlPrivate::FullUrl);

    if (!(options & QUrl::RemoveQuery) && d->hasQuery()) {
        url += QLatin1Char('?');
        d->appendQuery(url, options, QUrlPrivate::FullUrl);
    }
    if (!(options & QUrl::RemoveFragment) && d->hasFragment()) {
        url += QLatin1Char('#');
        d->appendFragment(url, options, QUrlPrivate::FullUrl);
    }

    return url;
}
3382
3383/*!
3384 \since 5.0
3385
3386 Returns a human-displayable string representation of the URL.
3387 The output can be customized by passing flags with \a options.
3388 The option RemovePassword is always enabled, since passwords
3389 should never be shown back to users.
3390
3391 With the default options, the resulting QString can be passed back
3392 to a QUrl later on, but any password that was present initially will
3393 be lost.
3394
3395 \sa FormattingOptions, toEncoded(), toString()
3396*/
3397
QString QUrl::toDisplayString(FormattingOptions options) const
{
    return toString(options | RemovePassword);
}
3402
3403/*!
3404 \since 5.2
3405
3406 Returns an adjusted version of the URL.
3407 The output can be customized by passing flags with \a options.
3408
3409 The encoding options from QUrl::ComponentFormattingOption don't make
3410 much sense for this method, nor does QUrl::PreferLocalFile.
3411
3412 This is always equivalent to QUrl(url.toString(options)).
3413
3414 \sa FormattingOptions, toEncoded(), toString()
3415*/
QUrl QUrl::adjusted(QUrl::FormattingOptions options) const
{
    if (!isValid()) {
        // also catches isEmpty()
        return QUrl();
    }
    QUrl that = *this;
    if (options & RemoveScheme)
        that.setScheme(QString());
    if ((options & RemoveAuthority) == RemoveAuthority) {
        that.setAuthority(QString());
    } else {
        if ((options & RemoveUserInfo) == RemoveUserInfo)
            that.setUserInfo(QString());
        else if (options & RemovePassword)
            that.setPassword(QString());
        if (options & RemovePort)
            that.setPort(-1);
    }
    if (options & RemoveQuery)
        that.setQuery(QString());
    if (options & RemoveFragment)
        that.setFragment(QString());
    if (options & RemovePath) {
        that.setPath(QString());
    } else if (options & (StripTrailingSlash | RemoveFilename | NormalizePathSegments)) {
        that.detach();
        QString path;
        d->appendPath(path, options | FullyEncoded, QUrlPrivate::Path);
        that.d->setPath(path, 0, path.length());
    }
    return that;
}
3449
3450/*!
3451 Returns the encoded representation of the URL if it's valid;
3452 otherwise an empty QByteArray is returned. The output can be
3453 customized by passing flags with \a options.
3454
3455 The user info, path and fragment are all converted to UTF-8, and
3456 all non-ASCII characters are then percent encoded. The host name
3457 is encoded using Punycode.
3458*/
QByteArray QUrl::toEncoded(FormattingOptions options) const
{
    options &= ~(FullyDecoded | FullyEncoded);
    return toString(options | FullyEncoded).toLatin1();
}
3464
3465/*!
3466 \fn QUrl QUrl::fromEncoded(const QByteArray &input, ParsingMode parsingMode)
3467
3468 Parses \a input and returns the corresponding QUrl. \a input is
3469 assumed to be in encoded form, containing only ASCII characters.
3470
3471 Parses the URL using \a parsingMode. See setUrl() for more information on
3472 this parameter. QUrl::DecodedMode is not permitted in this context.
3473
3474 \sa toEncoded(), setUrl()
3475*/
QUrl QUrl::fromEncoded(const QByteArray &input, ParsingMode mode)
{
    return QUrl(QString::fromUtf8(input.constData(), input.size()), mode);
}
3480
3481/*!
    Returns a decoded copy of \a input. \a input is first decoded from
    percent encoding, then converted from UTF-8 to Unicode.
3484
3485 \note Given invalid input (such as a string containing the sequence "%G5",
3486 which is not a valid hexadecimal number) the output will be invalid as
3487 well. As an example: the sequence "%G5" could be decoded to 'W'.
3488*/
QString QUrl::fromPercentEncoding(const QByteArray &input)
{
    QByteArray ba = QByteArray::fromPercentEncoding(input);
    return QString::fromUtf8(ba, ba.size());
}
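
/*
    Illustrative sketch only, not part of the Qt sources: percent-decoding on
    std::string, producing raw UTF-8 bytes. Unlike the Qt function, this
    sketch copies a malformed sequence such as "%G5" through unchanged
    instead of producing an unspecified character; that is a design choice
    made for the example. The names percentDecode() and percentDecodeDemo()
    are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Replaces each well-formed "%XX" sequence with the byte it encodes and
// copies every other character as-is.
std::string percentDecode(const std::string &input)
{
    std::string out;
    out.reserve(input.size());
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        const bool wellFormed = input[i] == '%' && i + 2 < input.size()
            && std::isxdigit(static_cast<unsigned char>(input[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(input[i + 2]));
        if (wellFormed) {
            out += static_cast<char>(std::stoi(input.substr(i + 1, 2), nullptr, 16));
            i += 2; // skip the two hexadecimal digits
        } else {
            out += input[i];
        }
    }
    return out;
}

// Small usage demonstration.
void percentDecodeDemo()
{
    assert(percentDecode("a%20b") == "a b");
    assert(percentDecode("%C3%A9") == "\xC3\xA9"); // UTF-8 bytes of U+00E9
    assert(percentDecode("%G5") == "%G5");         // malformed: left untouched
}
```
*/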
3494
3495/*!
    Returns an encoded copy of \a input. \a input is first converted
    to UTF-8, and all ASCII characters that are not in the unreserved group
    are percent encoded. To prevent characters from being percent encoded,
    pass them to \a exclude. To force characters to be percent encoded, pass
    them to \a include.
3501
3502 Unreserved is defined as:
3503 \tt {ALPHA / DIGIT / "-" / "." / "_" / "~"}
3504
3505 \snippet code/src_corelib_io_qurl.cpp 6
3506*/
3507QByteArray QUrl::toPercentEncoding(const QString &input, const QByteArray &exclude, const QByteArray &include)
3508{
3509 return input.toUtf8().toPercentEncoding(exclude, include);
3510}
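
/*
    Illustrative sketch only, not part of the Qt sources: percent-encoding of
    a UTF-8 byte string with the unreserved set listed above, plus exclude
    and include lists. It assumes that include takes precedence over exclude
    and that '%' is always encoded; it is not the implementation of
    QByteArray::toPercentEncoding(). The names percentEncode() and
    percentEncodeDemo() are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Encodes every byte of utf8 as "%XX" unless it is unreserved or listed in
// exclude; bytes listed in include are always encoded.
std::string percentEncode(const std::string &utf8, const std::string &exclude = std::string(),
                          const std::string &include = std::string())
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (const char ch : utf8) {
        const unsigned char c = static_cast<unsigned char>(ch);
        const bool unreserved = (c < 0x80 && std::isalnum(c))
            || c == '-' || c == '.' || c == '_' || c == '~';
        const bool keep = c != '%'
            && (unreserved || exclude.find(ch) != std::string::npos)
            && include.find(ch) == std::string::npos;
        if (keep) {
            out += ch;
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Small usage demonstration.
void percentEncodeDemo()
{
    assert(percentEncode("a b") == "a%20b");           // space is not unreserved
    assert(percentEncode("{a}", "{}") == "{a}");       // braces excluded from encoding
    assert(percentEncode("fish", "", "s") == "fi%73h"); // 's' forced to be encoded
}
```
*/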
3511
3512/*!
3513 \internal
3514 \since 5.0
3515 Used in the setEncodedXXX compatibility functions. Converts \a ba to
3516 QString form.
3517*/
3518QString QUrl::fromEncodedComponent_helper(const QByteArray &ba)
3519{
3520 return qt_urlRecodeByteArray(ba);
3521}
3522
3523/*!
3524 \fn QByteArray QUrl::toPunycode(const QString &uc)
3525 \obsolete
    Returns \a uc in Punycode encoding.
3527
3528 Punycode is a Unicode encoding used for internationalized domain
3529 names, as defined in RFC3492. If you want to convert a domain name from
3530 Unicode to its ASCII-compatible representation, use toAce().
3531*/
3532
3533/*!
3534 \fn QString QUrl::fromPunycode(const QByteArray &pc)
3535 \obsolete
3536 Returns the Punycode decoded representation of \a pc.
3537
3538 Punycode is a Unicode encoding used for internationalized domain
3539 names, as defined in RFC3492. If you want to convert a domain from
3540 its ASCII-compatible encoding to the Unicode representation, use
3541 fromAce().
3542*/
3543
3544/*!
3545 \since 4.2
3546
3547 Returns the Unicode form of the given domain name
3548 \a domain, which is encoded in the ASCII Compatible Encoding (ACE).
3549 The result of this function is considered equivalent to \a domain.
3550
3551 If the value in \a domain cannot be encoded, it will be converted
3552 to QString and returned.
3553
3554 The ASCII Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491
3555 and RFC 3492. It is part of the Internationalizing Domain Names in
3556 Applications (IDNA) specification, which allows for domain names
3557 (like \c "example.com") to be written using international
3558 characters.
3559*/
QString QUrl::fromAce(const QByteArray &domain)
{
    return qt_ACE_do(QString::fromLatin1(domain), NormalizeAce, ForbidLeadingDot /*FIXME: make configurable*/);
}
3564
/*!
    \since 4.2

    Returns the ASCII Compatible Encoding of the given domain name \a domain.
    The result of this function is considered equivalent to \a domain.

    The ASCII Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491
    and RFC 3492. It is part of the Internationalizing Domain Names in
    Applications (IDNA) specification, which allows for domain names
    (like \c "example.com") to be written using international
    characters.

    This function returns an empty QByteArray if \a domain is not a valid
    hostname. Note, in particular, that IPv6 literals are not valid domain
    names.
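
    As a minimal sketch (the domain name is purely illustrative), a
    round trip through both functions could look like this. Note that
    fromAce() may leave the name in its ACE form if the top-level domain
    is not enabled for IDN display:

    \code
        // "b\xc3\xbc" "cher.de" is the UTF-8 spelling of "bücher.de"
        const QByteArray ace = QUrl::toAce(QString::fromUtf8("b\xc3\xbc" "cher.de"));
        // ace is expected to be "xn--bcher-kva.de"
        const QString unicode = QUrl::fromAce(ace);
    \endcode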
*/
QByteArray QUrl::toAce(const QString &domain)
{
    return qt_ACE_do(domain, ToAceOnly, ForbidLeadingDot /*FIXME: make configurable*/).toLatin1();
}

/*!
    \internal

    Returns \c true if this URL is "less than" the given \a url. This
    provides a means of ordering URLs.
*/
bool QUrl::operator <(const QUrl &url) const
{
    if (!d || !url.d) {
        bool thisIsEmpty = !d || d->isEmpty();
        bool thatIsEmpty = !url.d || url.d->isEmpty();

        // sort an empty URL first
        return thisIsEmpty && !thatIsEmpty;
    }

    int cmp;
    cmp = d->scheme.compare(url.d->scheme);
    if (cmp != 0)
        return cmp < 0;

    cmp = d->userName.compare(url.d->userName);
    if (cmp != 0)
        return cmp < 0;

    cmp = d->password.compare(url.d->password);
    if (cmp != 0)
        return cmp < 0;

    cmp = d->host.compare(url.d->host);
    if (cmp != 0)
        return cmp < 0;

    if (d->port != url.d->port)
        return d->port < url.d->port;

    cmp = d->path.compare(url.d->path);
    if (cmp != 0)
        return cmp < 0;

    if (d->hasQuery() != url.d->hasQuery())
        return url.d->hasQuery();

    cmp = d->query.compare(url.d->query);
    if (cmp != 0)
        return cmp < 0;

    if (d->hasFragment() != url.d->hasFragment())
        return url.d->hasFragment();

    cmp = d->fragment.compare(url.d->fragment);
    return cmp < 0;
}

/*!
    Returns \c true if this URL and the given \a url are equal;
    otherwise returns \c false.

    \sa matches()
*/
bool QUrl::operator ==(const QUrl &url) const
{
    if (!d && !url.d)
        return true;
    if (!d)
        return url.d->isEmpty();
    if (!url.d)
        return d->isEmpty();

    // First, compare which sections are present, since it speeds up the
    // processing considerably. We just have to ignore the host-is-present flag
    // for local files (the "file" protocol), due to the requirements of the
    // XDG file URI specification.
    int mask = QUrlPrivate::FullUrl;
    if (isLocalFile())
        mask &= ~QUrlPrivate::Host;
    return (d->sectionIsPresent & mask) == (url.d->sectionIsPresent & mask) &&
            d->scheme == url.d->scheme &&
            d->userName == url.d->userName &&
            d->password == url.d->password &&
            d->host == url.d->host &&
            d->port == url.d->port &&
            d->path == url.d->path &&
            d->query == url.d->query &&
            d->fragment == url.d->fragment;
}

/*!
    \since 5.2

    Returns \c true if this URL and the given \a url are equal after
    applying \a options to both; otherwise returns \c false.

    This is equivalent to calling adjusted(options) on both URLs
    and comparing the resulting URLs, but faster.
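
    As a minimal sketch (the URLs are illustrative), two URLs that
    differ only in their fragment and in a trailing slash can be
    compared like this:

    \code
        const QUrl a(QStringLiteral("http://example.com/docs/#intro"));
        const QUrl b(QStringLiteral("http://example.com/docs"));

        // Ignore the fragment and any trailing slash when comparing;
        // with these options the two URLs are expected to match.
        const bool same = a.matches(b, QUrl::RemoveFragment | QUrl::StripTrailingSlash);
    \endcode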
*/
bool QUrl::matches(const QUrl &url, FormattingOptions options) const
{
    if (!d && !url.d)
        return true;
    if (!d)
        return url.d->isEmpty();
    if (!url.d)
        return d->isEmpty();

    // First, compare which sections are present, since it speeds up the
    // processing considerably. We just have to ignore the host-is-present flag
    // for local files (the "file" protocol), due to the requirements of the
    // XDG file URI specification.
    int mask = QUrlPrivate::FullUrl;
    if (isLocalFile())
        mask &= ~QUrlPrivate::Host;

    if (options.testFlag(QUrl::RemoveScheme))
        mask &= ~QUrlPrivate::Scheme;
    else if (d->scheme != url.d->scheme)
        return false;

    if (options.testFlag(QUrl::RemovePassword))
        mask &= ~QUrlPrivate::Password;
    else if (d->password != url.d->password)
        return false;

    if (options.testFlag(QUrl::RemoveUserInfo))
        mask &= ~QUrlPrivate::UserName;
    else if (d->userName != url.d->userName)
        return false;

    if (options.testFlag(QUrl::RemovePort))
        mask &= ~QUrlPrivate::Port;
    else if (d->port != url.d->port)
        return false;

    if (options.testFlag(QUrl::RemoveAuthority))
        mask &= ~QUrlPrivate::Host;
    else if (d->host != url.d->host)
        return false;

    if (options.testFlag(QUrl::RemoveQuery))
        mask &= ~QUrlPrivate::Query;
    else if (d->query != url.d->query)
        return false;

    if (options.testFlag(QUrl::RemoveFragment))
        mask &= ~QUrlPrivate::Fragment;
    else if (d->fragment != url.d->fragment)
        return false;

    if ((d->sectionIsPresent & mask) != (url.d->sectionIsPresent & mask))
        return false;

    if (options.testFlag(QUrl::RemovePath))
        return true;

    // Compare paths, after applying path-related options
    QString path1;
    d->appendPath(path1, options, QUrlPrivate::Path);
    QString path2;
    url.d->appendPath(path2, options, QUrlPrivate::Path);
    return path1 == path2;
}

/*!
    Returns \c true if this URL and the given \a url are not equal;
    otherwise returns \c false.

    \sa matches()
*/
bool QUrl::operator !=(const QUrl &url) const
{
    return !(*this == url);
}

/*!
    Assigns the specified \a url to this object.
*/
QUrl &QUrl::operator =(const QUrl &url)
{
    if (!d) {
        if (url.d) {
            url.d->ref.ref();
            d = url.d;
        }
    } else {
        if (url.d)
            qAtomicAssign(d, url.d);
        else
            clear();
    }
    return *this;
}

/*!
    Assigns the specified \a url to this object.
*/
QUrl &QUrl::operator =(const QString &url)
{
    if (url.isEmpty()) {
        clear();
    } else {
        detach();
        d->parse(url, TolerantMode);
    }
    return *this;
}

/*!
    \fn void QUrl::swap(QUrl &other)
    \since 4.8

    Swaps URL \a other with this URL. This operation is very
    fast and never fails.
*/

/*!
    \internal

    Forces a detach.
*/
void QUrl::detach()
{
    if (!d)
        d = new QUrlPrivate;
    else
        qAtomicDetach(d);
}

/*!
    \internal
*/
bool QUrl::isDetached() const
{
    return !d || d->ref.loadRelaxed() == 1;
}

static QString fromNativeSeparators(const QString &pathName)
{
#if defined(Q_OS_WIN)
    QString result(pathName);
    const QChar nativeSeparator = u'\\';
    auto i = result.indexOf(nativeSeparator);
    if (i != -1) {
        QChar * const data = result.data();
        const auto length = result.length();
        for (; i < length; ++i) {
            if (data[i] == nativeSeparator)
                data[i] = u'/';
        }
    }
    return result;
#else
    return pathName;
#endif
}

/*!
    Returns a QUrl representation of \a localFile, interpreted as a local
    file. This function accepts paths separated by slashes as well as the
    native separator for this platform.

    This function also accepts paths with a doubled leading slash (or
    backslash) to indicate a remote file, as in
    "//servername/path/to/file.txt". Note that only certain platforms can
    actually open this file using QFile::open().

    An empty \a localFile leads to an empty URL (since Qt 5.4).

    \snippet code/src_corelib_io_qurl.cpp 16

    In the first line of the snippet above, a file URL is constructed from a
    local, relative path. A file URL with a relative path only makes sense
    if there is a base URL to resolve it against. For example:

    \snippet code/src_corelib_io_qurl.cpp 17

    To resolve such a URL, it's necessary to remove the scheme beforehand:

    \snippet code/src_corelib_io_qurl.cpp 18

    For this reason, it is better to use a relative URL (that is, no scheme)
    for relative file paths:

    \snippet code/src_corelib_io_qurl.cpp 19

    \sa toLocalFile(), isLocalFile(), QDir::toNativeSeparators()
*/
QUrl QUrl::fromLocalFile(const QString &localFile)
{
    QUrl url;
    if (localFile.isEmpty())
        return url;
    QString scheme = fileScheme();
    QString deslashified = fromNativeSeparators(localFile);

    // magic for drives on windows
    if (deslashified.length() > 1 && deslashified.at(1) == QLatin1Char(':') && deslashified.at(0) != QLatin1Char('/')) {
        deslashified.prepend(QLatin1Char('/'));
    } else if (deslashified.startsWith(QLatin1String("//"))) {
        // magic for shared drive on windows
        int indexOfPath = deslashified.indexOf(QLatin1Char('/'), 2);
        QStringRef hostSpec = deslashified.midRef(2, indexOfPath - 2);
        // Check for Windows-specific WebDAV specification: "//host@SSL/path".
        if (hostSpec.endsWith(webDavSslTag(), Qt::CaseInsensitive)) {
            hostSpec.truncate(hostSpec.size() - 4);
            scheme = webDavScheme();
        }

        // hosts can't be IPv6 addresses without [], so we can use QUrlPrivate::setHost
        url.detach();
        if (!url.d->setHost(hostSpec.toString(), 0, hostSpec.size(), StrictMode)) {
            if (url.d->error->code != QUrlPrivate::InvalidRegNameError)
                return url;

            // Path hostname is not a valid URL host, so set it entirely in the path
            // (by leaving deslashified unchanged)
        } else if (indexOfPath > 2) {
            deslashified = deslashified.right(deslashified.length() - indexOfPath);
        } else {
            deslashified.clear();
        }
    }

    url.setScheme(scheme);
    url.setPath(deslashified, DecodedMode);
    return url;
}

/*!
    Returns the path of this URL formatted as a local file path. The path
    returned will use forward slashes, even if it was originally created
    from one with backslashes.

    If this URL contains a non-empty hostname, it will be encoded in the
    returned value in the form found on SMB networks (for example,
    "//servername/path/to/file.txt").

    \snippet code/src_corelib_io_qurl.cpp 20

    Note: if the path component of this URL contains a non-UTF-8 binary
    sequence (such as %80), the behaviour of this function is undefined.

    \sa fromLocalFile(), isLocalFile()
*/
QString QUrl::toLocalFile() const
{
    // the call to isLocalFile() also ensures that we're parsed
    if (!isLocalFile())
        return QString();

    return d->toLocalFile(QUrl::FullyDecoded);
}

/*!
    \since 4.8
    Returns \c true if this URL is pointing to a local file path. A URL is a
    local file path if the scheme is "file".

    Note that this function considers URLs with hostnames to be local file
    paths, even if the eventual file path cannot be opened with
    QFile::open().

    \sa fromLocalFile(), toLocalFile()
*/
bool QUrl::isLocalFile() const
{
    return d && d->isLocalFile();
}

/*!
    Returns \c true if this URL is a parent of \a childUrl. \a childUrl is a child
    of this URL if the two URLs share the same scheme and authority,
    and this URL's path is a parent of the path of \a childUrl.
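
    As a minimal sketch (the URLs are illustrative), the path comparison
    works on whole path segments, and a URL is not considered a parent
    of itself:

    \code
        const QUrl base(QStringLiteral("http://example.com/dir"));

        // true: "/dir/file.txt" lies below "/dir"
        base.isParentOf(QUrl(QStringLiteral("http://example.com/dir/file.txt")));
        // false: "/dirty" merely shares a string prefix with "/dir"
        base.isParentOf(QUrl(QStringLiteral("http://example.com/dirty")));
        // false: the child path must be longer than the parent path
        base.isParentOf(base);
    \endcode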
*/
bool QUrl::isParentOf(const QUrl &childUrl) const
{
    QString childPath = childUrl.path();

    if (!d)
        return ((childUrl.scheme().isEmpty())
            && (childUrl.authority().isEmpty())
            && childPath.length() > 0 && childPath.at(0) == QLatin1Char('/'));

    QString ourPath = path();

    return ((childUrl.scheme().isEmpty() || d->scheme == childUrl.scheme())
            && (childUrl.authority().isEmpty() || authority() == childUrl.authority())
            && childPath.startsWith(ourPath)
            && ((ourPath.endsWith(QLatin1Char('/')) && childPath.length() > ourPath.length())
                || (!ourPath.endsWith(QLatin1Char('/'))
                    && childPath.length() > ourPath.length() && childPath.at(ourPath.length()) == QLatin1Char('/'))));
}


#ifndef QT_NO_DATASTREAM
/*! \relates QUrl

    Writes url \a url to the stream \a out and returns a reference
    to the stream.

    \sa{Serializing Qt Data Types}{Format of the QDataStream operators}
*/
QDataStream &operator<<(QDataStream &out, const QUrl &url)
{
    QByteArray u;
    if (url.isValid())
        u = url.toEncoded();
    out << u;
    return out;
}

/*! \relates QUrl

    Reads a url into \a url from the stream \a in and returns a
    reference to the stream.

    \sa{Serializing Qt Data Types}{Format of the QDataStream operators}
*/
QDataStream &operator>>(QDataStream &in, QUrl &url)
{
    QByteArray u;
    in >> u;
    url.setUrl(QString::fromLatin1(u));
    return in;
}
#endif // QT_NO_DATASTREAM

#ifndef QT_NO_DEBUG_STREAM
QDebug operator<<(QDebug d, const QUrl &url)
{
    QDebugStateSaver saver(d);
    d.nospace() << "QUrl(" << url.toDisplayString() << ')';
    return d;
}
#endif

static QString errorMessage(QUrlPrivate::ErrorCode errorCode, const QString &errorSource, int errorPosition)
{
    QChar c = uint(errorPosition) < uint(errorSource.length()) ?
                errorSource.at(errorPosition) : QChar(QChar::Null);

    switch (errorCode) {
    case QUrlPrivate::NoError:
        Q_ASSERT_X(false, "QUrl::errorString",
                   "Impossible: QUrl::errorString should have treated this condition");
        Q_UNREACHABLE();
        return QString();

    case QUrlPrivate::InvalidSchemeError: {
        auto msg = QLatin1String("Invalid scheme (character '%1' not permitted)");
        return msg.arg(c);
    }

    case QUrlPrivate::InvalidUserNameError:
        return QLatin1String("Invalid user name (character '%1' not permitted)")
                .arg(c);

    case QUrlPrivate::InvalidPasswordError:
        return QLatin1String("Invalid password (character '%1' not permitted)")
                .arg(c);

    case QUrlPrivate::InvalidRegNameError:
        if (errorPosition != -1)
            return QLatin1String("Invalid hostname (character '%1' not permitted)")
                    .arg(c);
        else
            return QStringLiteral("Invalid hostname (contains invalid characters)");
    case QUrlPrivate::InvalidIPv4AddressError:
        return QString(); // doesn't happen yet
    case QUrlPrivate::InvalidIPv6AddressError:
        return QStringLiteral("Invalid IPv6 address");
    case QUrlPrivate::InvalidCharacterInIPv6Error:
        return QLatin1String("Invalid IPv6 address (character '%1' not permitted)").arg(c);
    case QUrlPrivate::InvalidIPvFutureError:
        return QLatin1String("Invalid IPvFuture address (character '%1' not permitted)").arg(c);
    case QUrlPrivate::HostMissingEndBracket:
        return QStringLiteral("Expected ']' to match '[' in hostname");

    case QUrlPrivate::InvalidPortError:
        return QStringLiteral("Invalid port or port number out of range");
    case QUrlPrivate::PortEmptyError:
        return QStringLiteral("Port field was empty");

    case QUrlPrivate::InvalidPathError:
        return QLatin1String("Invalid path (character '%1' not permitted)")
                .arg(c);

    case QUrlPrivate::InvalidQueryError:
        return QLatin1String("Invalid query (character '%1' not permitted)")
                .arg(c);

    case QUrlPrivate::InvalidFragmentError:
        return QLatin1String("Invalid fragment (character '%1' not permitted)")
                .arg(c);

    case QUrlPrivate::AuthorityPresentAndPathIsRelative:
        return QStringLiteral("Path component is relative and authority is present");
    case QUrlPrivate::AuthorityAbsentAndPathIsDoubleSlash:
        return QStringLiteral("Path component starts with '//' and authority is absent");
    case QUrlPrivate::RelativeUrlPathContainsColonBeforeSlash:
        return QStringLiteral("Relative URL's path component contains ':' before any '/'");
    }

    Q_ASSERT_X(false, "QUrl::errorString", "Cannot happen, unknown error");
    Q_UNREACHABLE();
    return QString();
}

static inline void appendComponentIfPresent(QString &msg, bool present, const char *componentName,
                                            const QString &component)
{
    if (present) {
        msg += QLatin1String(componentName);
        msg += QLatin1Char('"');
        msg += component;
        msg += QLatin1String("\",");
    }
}

/*!
    \since 4.2

    Returns an error message if the last operation that modified this QUrl
    object ran into a parsing error. If no error was detected, this function
    returns an empty string and isValid() returns \c true.

    The error message returned by this function is technical in nature and may
    not be understood by end users. It is mostly useful to developers trying to
    understand why QUrl will not accept some input.
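
    As a minimal sketch (the URL is illustrative and deliberately carries
    a non-numeric port), the message can be logged for diagnostics:

    \code
        const QUrl url(QStringLiteral("http://example.com:port/"));
        if (!url.isValid())
            qWarning() << "Rejected URL:" << url.errorString();
    \endcode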

    \sa QUrl::ParsingMode
*/
QString QUrl::errorString() const
{
    QString msg;
    if (!d)
        return msg;

    QString errorSource;
    int errorPosition = 0;
    QUrlPrivate::ErrorCode errorCode = d->validityError(&errorSource, &errorPosition);
    if (errorCode == QUrlPrivate::NoError)
        return msg;

    msg += errorMessage(errorCode, errorSource, errorPosition);
    msg += QLatin1String("; source was \"");
    msg += errorSource;
    msg += QLatin1String("\";");
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Scheme,
                             " scheme = ", d->scheme);
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::UserInfo,
                             " userinfo = ", userInfo());
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Host,
                             " host = ", d->host);
    appendComponentIfPresent(msg, d->port != -1,
                             " port = ", QString::number(d->port));
    appendComponentIfPresent(msg, !d->path.isEmpty(),
                             " path = ", d->path);
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Query,
                             " query = ", d->query);
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Fragment,
                             " fragment = ", d->fragment);
    if (msg.endsWith(QLatin1Char(',')))
        msg.chop(1);
    return msg;
}

/*!
    \since 5.1

    Converts a list of \a urls into a list of QString objects, using toString(\a options).
*/
QStringList QUrl::toStringList(const QList<QUrl> &urls, FormattingOptions options)
{
    QStringList lst;
    lst.reserve(urls.size());
    for (const QUrl &url : urls)
        lst.append(url.toString(options));
    return lst;
}

/*!
    \since 5.1

    Converts a list of strings representing \a urls into a list of URLs, using QUrl(str, \a mode).
    Note that this means all strings must be URLs, not, for instance, local paths.
*/
QList<QUrl> QUrl::fromStringList(const QStringList &urls, ParsingMode mode)
{
    QList<QUrl> lst;
    lst.reserve(urls.size());
    for (const QString &str : urls)
        lst.append(QUrl(str, mode));
    return lst;
}

/*!
    \typedef QUrl::DataPtr
    \internal
*/

/*!
    \fn DataPtr &QUrl::data_ptr()
    \internal
*/

/*!
    Returns the hash value for the \a url. If specified, \a seed is used to
    initialize the hash.

    \relates QHash
    \since 5.0
*/
uint qHash(const QUrl &url, uint seed) noexcept
{
    if (!url.d)
        return qHash(-1, seed); // the hash of an unset port (-1)

    return qHash(url.d->scheme) ^
            qHash(url.d->userName) ^
            qHash(url.d->password) ^
            qHash(url.d->host) ^
            qHash(url.d->port, seed) ^
            qHash(url.d->path) ^
            qHash(url.d->query) ^
            qHash(url.d->fragment);
}

static QUrl adjustFtpPath(QUrl url)
{
    if (url.scheme() == ftpScheme()) {
        QString path = url.path(QUrl::PrettyDecoded);
        if (path.startsWith(QLatin1String("//")))
            url.setPath(QLatin1String("/%2F") + path.midRef(2), QUrl::TolerantMode);
    }
    return url;
}

static bool isIp6(const QString &text)
{
    QIPAddressUtils::IPv6Address address;
    return !text.isEmpty() && QIPAddressUtils::parseIp6(address, text.begin(), text.end()) == nullptr;
}

/*!
    Returns a valid URL from a user supplied \a userInput string if one can be
    deduced. In the case that is not possible, an invalid QUrl() is returned.

    This overload takes a \a workingDirectory path, in order to be able to
    handle relative paths. This is especially useful when handling command
    line arguments.
    If \a workingDirectory is empty, no handling of relative paths will be done,
    so this method will behave like its one-argument overload.

    By default, an input string that looks like a relative path will only be treated
    as such if the file actually exists in the given working directory.

    If the application can handle files that don't exist yet, it should pass the
    flag AssumeLocalFile in \a options.
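
    As a minimal sketch, assuming a QCoreApplication instance exists and
    that every argument after the program name is meant to be a URL or a
    file name, command line arguments could be resolved like this:

    \code
        const QStringList args = QCoreApplication::arguments();
        for (int i = 1; i < args.size(); ++i) {
            // Relative names are resolved against the current directory,
            // even if the file does not exist yet.
            const QUrl url = QUrl::fromUserInput(args.at(i), QDir::currentPath(),
                                                 QUrl::AssumeLocalFile);
            if (url.isValid())
                qDebug() << url;
        }
    \endcode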

    \since 5.4
*/
QUrl QUrl::fromUserInput(const QString &userInput, const QString &workingDirectory,
                         UserInputResolutionOptions options)
{
    QString trimmedString = userInput.trimmed();

    if (trimmedString.isEmpty())
        return QUrl();

    // Check for IPv6 addresses, since a path starting with ":" is absolute (a resource)
    // and IPv6 addresses can start with "c:" too
    if (isIp6(trimmedString)) {
        QUrl url;
        url.setHost(trimmedString);
        url.setScheme(QStringLiteral("http"));
        return url;
    }

    const QFileInfo fileInfo(QDir(workingDirectory), userInput);
    if (fileInfo.exists()) {
        return QUrl::fromLocalFile(fileInfo.absoluteFilePath());
    }

    QUrl url = QUrl(userInput, QUrl::TolerantMode);
    // Check both QUrl::isRelative (to detect full URLs) and QDir::isAbsolutePath (since on Windows drive letters can be interpreted as schemes)
    if ((options & AssumeLocalFile) && url.isRelative() && !QDir::isAbsolutePath(userInput)) {
        return QUrl::fromLocalFile(fileInfo.absoluteFilePath());
    }

    return fromUserInput(trimmedString);
}

/*!
    Returns a valid URL from a user supplied \a userInput string if one can be
    deduced. In the case that is not possible, an invalid QUrl() is returned.

    \since 4.6

    Most applications that can browse the web allow the user to input a URL
    in the form of a plain string. This string can be manually typed into
    a location bar, obtained from the clipboard, or passed in via command
    line arguments.

    When the string is not already a valid URL, a best guess is performed,
    making various web related assumptions.

    In the case the string corresponds to a valid file path on the system,
    a file:// URL is constructed, using QUrl::fromLocalFile().

    If that is not the case, an attempt is made to turn the string into a
    http:// or ftp:// URL. The latter is chosen if the string starts with
    'ftp'. The result is then passed through QUrl's tolerant parser, and
    in the case of success, a valid QUrl is returned, or else a QUrl().

    \section1 Examples:

    \list
    \li qt-project.org becomes http://qt-project.org
    \li ftp.qt-project.org becomes ftp://ftp.qt-project.org
    \li hostname becomes http://hostname
    \li /home/user/test.html becomes file:///home/user/test.html
    \endlist
*/
QUrl QUrl::fromUserInput(const QString &userInput)
{
    QString trimmedString = userInput.trimmed();

    // Check for IPv6 addresses, since a path starting with ":" is absolute (a resource)
    // and IPv6 addresses can start with "c:" too
    if (isIp6(trimmedString)) {
        QUrl url;
        url.setHost(trimmedString);
        url.setScheme(QStringLiteral("http"));
        return url;
    }

    // Check first for files, since on Windows drive letters can be interpreted as schemes
    if (QDir::isAbsolutePath(trimmedString))
        return QUrl::fromLocalFile(trimmedString);

    QUrl url = QUrl(trimmedString, QUrl::TolerantMode);
    QUrl urlPrepended = QUrl(QLatin1String("http://") + trimmedString, QUrl::TolerantMode);

    // Check the most common case of a valid url with a scheme
    // We check if the port would be valid by adding the scheme to handle the case host:port
    // where the host would be interpreted as the scheme
    if (url.isValid()
        && !url.scheme().isEmpty()
        && urlPrepended.port() == -1)
        return adjustFtpPath(url);

    // Else, try the prepended one and adjust the scheme from the host name
    if (urlPrepended.isValid() && (!urlPrepended.host().isEmpty() || !urlPrepended.path().isEmpty()))
    {
        int dotIndex = trimmedString.indexOf(QLatin1Char('.'));
        const QStringRef hostscheme = trimmedString.leftRef(dotIndex);
        if (hostscheme.compare(ftpScheme(), Qt::CaseInsensitive) == 0)
            urlPrepended.setScheme(ftpScheme());
        return adjustFtpPath(urlPrepended);
    }

    return QUrl();
}

QT_END_NAMESPACE
