What is a URL?

Most things have a standardized name. Buses have line numbers, planes have flight numbers, and people have ID numbers. When you take a taxi and tell the driver you want to go to Shipai Huashi, he knows where you mean. A URL is a standardized name for an Internet resource. A URL points to a piece of electronic information, telling you where it is located and how to interact with it.

URI

The Uniform Resource Identifier (URI) is a more general class of resource identifier that includes both URLs and URNs. URLs identify resources by describing their location, whereas URNs identify resources by name, regardless of their current location.

URL syntax

Most of the URLs we encounter day to day (such as "http://www.yhys.com/caipu/") are composed of three parts: the scheme (http), the host (www.yhys.com), and the path (/caipu/). URL syntax varies by scheme, but most URL schemes build on a common nine-part format:

<scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>#<frag>

But few URLs contain all of these components.
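To see the nine parts side by side, here is a minimal sketch that uses Python's standard urllib.parse module. The URL is a made-up example (the user joe, the password secret, and the host www.example.com are all assumptions) chosen only because it fills in every component.

```python
from urllib.parse import urlparse

# Hypothetical URL that fills in every component of the general format.
url = "http://joe:secret@www.example.com:8080/tools/list;type=d?item=123&color=blue#drills"

parts = urlparse(url)

# Map the names used in the general format to what the parser found.
components = {
    "scheme": parts.scheme,
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    "params": parts.params,
    "query": parts.query,
    "frag": parts.fragment,
}

for name, value in components.items():
    print(f"{name:<8} {value}")
```

Note that urlparse only splits off the parameters attached to the last path segment, so this is a convenient illustration rather than a complete parser for the format.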

Scheme

First, let's take a look at what a scheme is.

The scheme is the primary identifier of how to access a given resource: it tells the application interpreting the URL which protocol to use. The most common schemes are http (most browsers now let you omit it when typing an address), https (the secure version of http), and ftp.

Scheme names are case-insensitive, which means "http://www.google.com" and "HTTP://www.google.com" are equivalent (you can try it in your browser).
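Besides trying it in the browser, the same check can be done in code. This is a small sketch using Python's urllib.parse, which lowercases the scheme while parsing; it shows how one library behaves, not how every URL parser must behave.

```python
from urllib.parse import urlsplit

lower = urlsplit("http://www.google.com")
upper = urlsplit("HTTP://www.google.com")

# urlsplit lowercases the scheme, so both spellings compare equal.
same_scheme = lower.scheme == upper.scheme
print(lower.scheme, upper.scheme, same_scheme)
```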

Host and Port

To access a resource on the Internet, we need to know which machine hosts it and where on that machine to find it. The host and port components of the URL provide this information. The host can be written as a hostname (www.yhys.com) or as an IP address (112.90.32.241). The port component identifies the network port the server is listening on. When the port is omitted, the scheme's default is used: 80 for http and 443 for https.
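The default-port rule can be sketched as follows. The helper host_and_port and the DEFAULT_PORTS table are names invented for this illustration, and only the two defaults mentioned above are listed.

```python
from urllib.parse import urlsplit

# Only the defaults mentioned in the text; other schemes have their own.
DEFAULT_PORTS = {"http": 80, "https": 443}

def host_and_port(url):
    """Return (host, port), falling back to the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.hostname, port

print(host_and_port("http://www.yhys.com/caipu/"))
print(host_and_port("http://112.90.32.241:8080/"))
```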

Username and Password

The user and password components are most often seen with the ftp scheme.

ftp://ftp.prep.ai.mit.edu/pub/gnu
There are no user and password components in this example. When the scheme requires a username and password and the URL supplies none, the client typically inserts anonymous as the username and sends a default password.

ftp://anonymous@ftp.prep.ai.mit.edu/pub/gnu
This example specifies the username anonymous. The character "@" separates the user and password components from the rest of the URL.

ftp://anonymous:<password>@ftp.prep.ai.mit.edu/pub/gnu
This example specifies both the username and the password, separated by ":".
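The three cases can be told apart programmatically. The sketch below assumes Python's urllib.parse; the helper credentials, the username joe, and the password secret are hypothetical, and the fallback to anonymous mirrors the client behaviour described above rather than anything the parser does on its own.

```python
from urllib.parse import urlsplit

def credentials(url, default_user="anonymous"):
    """Return (user, password); use default_user when the URL names no user."""
    parts = urlsplit(url)
    user = parts.username if parts.username is not None else default_user
    return user, parts.password

# No user or password in the URL: fall back to the default user.
print(credentials("ftp://ftp.prep.ai.mit.edu/pub/gnu"))
# User only.
print(credentials("ftp://anonymous@ftp.prep.ai.mit.edu/pub/gnu"))
# User and password, separated by ":" (both values are made up).
print(credentials("ftp://joe:secret@ftp.prep.ai.mit.edu/pub/gnu"))
```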

Path

The path component is like a file path on a PC: it says where on the server the resource lives. A path is divided into segments by "/", and each path segment can have its own parameters.

Parameters

The parameter component of a URL is a list of name-value pairs, separated from the rest of the URL by the character ";". Parameters give applications additional information they need to access the resource. For example:

ftp://ftp.prep.ai.mit.edu/pub/gnu;type=d

Here the parameter name is type and its value is d.
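The parameter name and value in that example can be pulled out with a short sketch based on Python's urllib.parse. The variable names are illustrative, and urlparse only recognises parameters on the last path segment.

```python
from urllib.parse import urlparse

parts = urlparse("ftp://ftp.prep.ai.mit.edu/pub/gnu;type=d")

# parts.params holds the text after ";"; split it into name-value pairs.
params = dict(
    pair.partition("=")[::2]  # (name, value), dropping the "=" separator
    for pair in parts.params.split(";")
    if pair
)

print(parts.path)  # /pub/gnu
print(params)      # {'type': 'd'}
```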

Query

Take http://www.280.cc/index.php/search/searchid/14872?item=123&color=blue as an example. Everything to the right of the "?" is the query component. The query narrows down the resource being requested.

Queries are generally written as name=value pairs separated by "&". In the URL above, the pairs are item=123 and color=blue.
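As a quick illustration, Python's urllib.parse can split the query component of that URL into its name-value pairs. This is a sketch of one way to do it; parse_qs returns a list per name because a name may legally repeat.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.280.cc/index.php/search/searchid/14872?item=123&color=blue"

query = urlsplit(url).query  # the text after "?"
pairs = parse_qs(query)      # name -> list of values

print(query)  # item=123&color=blue
print(pairs)  # {'item': ['123'], 'color': ['blue']}
```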

Fragment

URLs support a fragment component that identifies a piece within a resource, such as a specific image or section of an HTML document. For example:

http://www.280.cc/index.php/search/searchid/14872#robben

* HTTP servers deal with entire objects, not fragments of objects, so the client does not send the fragment to the server. Once the server has returned the entire resource, the browser uses the fragment to display the part you are interested in.
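The split between what is requested and what stays on the client can be sketched with urldefrag from Python's urllib.parse. The variable names are illustrative only.

```python
from urllib.parse import urldefrag

url = "http://www.280.cc/index.php/search/searchid/14872#robben"

# request_url is what a client would ask the server for;
# fragment stays on the client and is used after the response arrives.
request_url, fragment = urldefrag(url)

print(request_url)  # http://www.280.cc/index.php/search/searchid/14872
print(fragment)     # robben
```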

URL Shortcut

URLs come in two flavors: absolute and relative. What we usually see are absolute URLs. A relative URL is a convenient shorthand that contains only part of a full URL. Anyone with development experience will have seen one. Let's take a look at an HTML document:
<HTML>

<HEAD><TITLE>Joe's Tools</TITLE></HEAD>

<BODY>

<H1>Tools Page</H1>

<H2>Hammers</H2>

<P>Joe's Hardware Online has the largest selection of
<A HREF="./hammers.html">hammers</A> on the earth.</P>

<H2><A NAME=drills></A>Drills </H2>

<P>Joe's Hardware has a complete line of cordless and corded drills,
as well as the latest in plutonium-powered atomic drills, for those
big around the house jobs.</P> ...

</BODY>

</HTML>


Here ./hammers.html is a relative URL.
A relative URL has to be interpreted relative to something, and that something is the base URL. In this example, the base URL is http://www.xxx.com/tools.html. So where does the base URL come from?

1. Explicitly provided in the resource. For example, an HTML document may contain a <BASE> tag that defines the base URL.

2. The base URL of the encapsulating resource. If no base URL is specified explicitly, the URL of the resource that contains the relative URL can be used as the base.

3. No base URL. This usually means the URL is already absolute, although it may also be an incomplete or broken URL.
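The three sources above can be sketched as a simple lookup order. This is a rough illustration using Python's standard html.parser; the class BaseFinder, the function choose_base_url, and the /other/ URL are hypothetical names made up for this example, not part of any browser's real implementation.

```python
from html.parser import HTMLParser

class BaseFinder(HTMLParser):
    """Remember the href of the first <BASE> tag in a document."""

    def __init__(self):
        super().__init__()
        self.base_href = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag == "base" and self.base_href is None:
            self.base_href = dict(attrs).get("href")

def choose_base_url(document_url, html_text):
    """Case 1: explicit <BASE>; case 2: the document's own URL; case 3: None."""
    finder = BaseFinder()
    finder.feed(html_text)
    return finder.base_href or document_url

page = '<HTML><HEAD><BASE HREF="http://www.xxx.com/other/"></HEAD></HTML>'
print(choose_base_url("http://www.xxx.com/tools.html", page))
print(choose_base_url("http://www.xxx.com/tools.html", "<HTML></HTML>"))
```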

So how do we convert a relative URL to an absolute URL? Let's look at the picture below.

[Figure: algorithm for converting a relative URL to an absolute URL]

We apply the algorithm in the figure to ./hammers.html.

1) The path is ./hammers.html, and the base URL is http://www.xxx.com/tools.html.

2) The scheme is empty, so the relative URL inherits the scheme (http) of the base URL.

3) The host component is empty, so the host and port are inherited from the base URL as well.

4) Merge the relative path with the inherited components: http://www.xxx.com/hammers.html.
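The same result can be checked with urljoin from Python's urllib.parse, which resolves a relative URL against a base URL. This is a sketch for verifying the worked example, not a step-by-step reproduction of the algorithm.

```python
from urllib.parse import urljoin

base_url = "http://www.xxx.com/tools.html"

# The scheme and host are inherited from the base; "./" resolves against
# the directory of the base path.
absolute = urljoin(base_url, "./hammers.html")

print(absolute)  # http://www.xxx.com/hammers.html
```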

Automatic URL expansion

Hostname expansion

"HTTP: The Definitive Guide" says that when we type yahoo in the address bar, the browser automatically inserts www. and .com into the hostname, but I have not found a corresponding example.

History expansion

We use this one a lot in daily life: based on the sites we have visited before, the browser automatically offers complete URLs for us to choose from.

The future

We already know that a URL gives the location of the resource we need. Its disadvantage is that once the resource is moved, we can no longer find it through the old URL. The proposed solution is precisely the URN mentioned earlier.

A URN (uniform resource name) names a resource independently of its location. The idea is to introduce an intermediate layer into resource lookup: an intermediary resource locator registers and tracks the actual URL of each resource, so that wherever the resource is moved, as long as it has not been deleted, the locator can redirect the request to the resource's actual URL. But replacing URLs will take time, and it is not a pressing issue in web development either.
