Web Basics and HTTP Protocol

Table of contents

One: domain name

1. Domain name overview

2. Domain name space structure

3. Domain name registration

Two: Web page access (HTTP/HTTPS)

1. Basic concepts of web pages

2. Website

3. Homepage

4. Domain name

5.HTTP

6.URL

7. Basic web page tags

(1) The role of web page summary information

(2) Types of body tags

Three: HTML

1. HTML overview

2. HTML basic tags

(1) HTML grammar rules

(2) HTML file structure

(3) The "source code" of the web page [press F12 on a blank page of the web page to display the source code]

 3.HTML file structure

(1) Commonly used tags in header tags 

(2) Commonly used tags in content tags

Four: web

1. Web overview

2.Web1.0 vs Web2.0

(1)Web1.0 

(2)web 2.0

3. Static pages

(1) Static page definition

(2) Static page features

4. Dynamic pages

(1) Dynamic page definition

(2) Features of dynamic pages

(3) Currently commonly used dynamic web page programming languages

Five: HTTP protocol

1. Introduction to HTTP protocol

2. HTTP version

(1)HTTP/0.9

(2)HTTP/1.0

(3)HTTP/1.1

(4)HTTP/2.0

(5)HTTP3.0

3. HTTP method

 4. GET and POST 

(1)GET

(2)POST

(3) The difference between GET and POST

5. HTTP status code

6. Common HTTP status codes in the production environment 

7. Analysis of HTTP request process

(1) Request message

(2) Commonly used request headers

(3) Response message

(4) Common response headers

Summary: common status codes and reasons


One: domain name

1. Domain name overview

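The domain name system is a tree-structured namespace that identifies groups of hosts and provides information about them. Once the root is known, every branch below it can be located.

Example: www.baidu.com

Domain name server (distributed; each server maintains one part of the namespace):
① A program that holds and maintains the domain name space
② Responds to requests from resolvers

Resolver (client)
The device that sends requests to a DNS server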

2. Domain name space structure

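① Root domain
The root domain sits at the very top of the domain name space and is usually written as a single dot ".".
Domains are the basic unit of the namespace. Every domain except the root has exactly one parent domain and zero or more subdomains, and names under the same parent must be unique.

② Top-level domain
A top-level domain usually represents either a type of organization or a country or region, for example net (network providers), com (commercial), org (non-profit organizations), edu (education), gov (government), mil (military), cn (China), jp (Japan), hk (Hong Kong, China).

③ Second-level domain
A second-level domain identifies a specific organization within a top-level domain. Second-level domains under a country-code top-level domain are managed centrally by that country's network authority, for example the second-level domains under .cn: .com.cn, .net.cn, .edu.cn ...

④ Subdomain
Every level of domain created below a second-level domain is called a subdomain. Organizations and users register and manage their own names at this level, for example sina.com.cn.

Host
A host sits at the bottom of the domain name space and is one specific computer. Names such as www and mail are host names, and the hosts can be written as www.sina.com.cn. and mail.sina.com.cn. This form is called an FQDN (fully qualified domain name), which is the host's full name within the domain name space.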
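The hierarchy described above can be read directly off a fully qualified domain name by walking its labels from right to left. The following is a minimal sketch; the function name `domain_hierarchy` is made up for illustration and is not part of any DNS library.

```python
def domain_hierarchy(fqdn):
    """Return every level of an FQDN, from the top-level domain down to the host."""
    # Drop the trailing dot that stands for the root domain, then split into labels.
    labels = fqdn.rstrip(".").split(".")
    # Rebuild each level by joining the labels from position i to the end.
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]


if __name__ == "__main__":
    for level in domain_hierarchy("www.sina.com.cn."):
        print(level)
```

For www.sina.com.cn. this lists the top-level domain first and the host's full name last, which mirrors the order in which the tree is traversed from the root.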

3. Domain name registration

Domain name registration solves the problem of mapping names to addresses on the Internet, and it follows the first-come, first-served principle.
Registration steps: prepare the application materials -> find a domain registrar's website -> check whether the domain name is available -> submit the formal application -> application approved

Two: Web page access (HTTP/HTTPS)

1. Basic concepts of web pages

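A web page is a plain-text file written in HTML that the user's browser "translates" and displays as a page.

① Web page
a. A plain-text file
b. Written in HTML
c. "Translated" by the user's browser and displayed as a page

② Website
A website is made up of individual pages; it is a collection of multiple web pages.
Home page
The first page that appears after opening a website is called its home page (or index page).

③ Domain name
The address typed into the browser when visiting a site

④ HTTP/HTTPS
The communication protocols used to transfer web pages (HTTP is unencrypted, HTTPS is encrypted); each is a standard for how client and server communicate

⑤ URL
The addressing system of the World Wide Web

⑥ HTML
The hypertext markup language used to write web pages

⑦ Hyperlink
A hyperlink connects the different pages of a website to one another

⑧ Publishing
The process of uploading finished web pages to a server so that users can access them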

2. Website

A website is made up of individual pages; it is a collection of multiple web pages.

3. Homepage

The first page that appears after opening a website is called its home page (or index page).

4. Domain name

The address typed into the browser when visiting a site.

5.HTTP

The communication protocol used to transfer web pages.

6.URL

A URL (Uniform Resource Locator) is the addressing system of the World Wide Web.
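To make the idea of a URL as an address more concrete, here is a small sketch that splits one into its parts with Python's standard `urllib.parse` module. The URL is the sample address used elsewhere in this article; the helper name `describe_url` is an assumption made for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def describe_url(url):
    """Split a URL into the parts a browser uses to locate a resource."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,          # protocol, e.g. http or https
        "host": parts.hostname,          # domain name of the server
        "path": parts.path,              # resource path on that server
        "query": parse_qs(parts.query),  # parameters after the "?"
    }


if __name__ == "__main__":
    print(describe_url("http://www.test.com/a.php?Id=123"))
```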

7. Basic web page tags

(1) The role of web page summary information

Helps the browser parse the page

Helps search engines index and find the page

<title> tag		# page title
# Example
<head>
  <title>Sohu - China's largest web portal</title>
</head>


<meta> tag		# keywords that search engines can match
# Example
<head>
  <meta name="keywords"
   content="Shandong Lanxiang, excavator training"/>
</head>

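(2) Types of body tags

Line control tags
Range tags
Image tags
Hyperlink tags
Special symbols

<h1>Quiet Night Thought</h1>

<p>Moonlight shines before my bed</p>

	I take it for frost on the ground<br/>

<span>I raise my head to gaze at the bright moon</span>

<img src="linux.jpg"/>

<a href="linux.htm">I am Chinese</a>

&nbsp;&quot;&copy;&gt;			# non-breaking space, double quote, copyright sign, greater-than sign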

Three: HTML

1. HTML overview

HTML stands for HyperText Markup Language. It is a specification, a standard, that uses markup tags to describe each part of the web page to be displayed. A web page file is itself a text file; adding tags to the text tells the browser how to display the content.

HTML files can be written in any text editor that can save plain text. Simply save the file with the extension ".html" or ".htm".

2. HTML basic tags

(1) HTML grammar rules

Most HTML tags come in pairs: a start tag and an end tag that mark where an element begins and ends, with the content they describe in between. The start tag is written "<XXX>", and the end tag adds a "/", written "</XXX>". A few tags, such as <br> and <img>, have no content and therefore no end tag.

(2) HTML file structure

The outermost element of an HTML file is <html></html>, which indicates that the file is written in HTML. Inside it, side by side, are the head tag (<head>) and the content tag (<body>).

(3) The "source code" of the web page [press F12 on the page to open the browser's developer tools and view the source code]

 3.HTML file structure

HTML web page
  head part
    title part
  body part
    web page content, including text, images, etc.

Example:
<html>
<head>
<title>My first web page</title>
</head>

<body>
       Hello World!
</body>

</html>
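One way to see the head/body structure in action is to let a parser walk through a page shaped like the example above and report what it finds in each part. This is only a sketch built on Python's standard `html.parser` module; the class name `StructureParser` and the helper `parse_page` are invented for illustration, and a real browser does far more than this.

```python
from html.parser import HTMLParser

PAGE = """<html>
<head>
<title>My first web page</title>
</head>

<body>
       Hello World!
</body>

</html>"""


class StructureParser(HTMLParser):
    """Collects the title text and the body text of a simple page."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tags that are currently open
        self.title = ""
        self.body_text = ""

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        # Text belongs to whichever section is currently open.
        if "title" in self.open_tags:
            self.title += data
        elif "body" in self.open_tags:
            self.body_text += data


def parse_page(html_text):
    """Return (title, body text) for a small, well-formed HTML page."""
    parser = StructureParser()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), parser.body_text.strip()


if __name__ == "__main__":
    print(parse_page(PAGE))
```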

(1) Commonly used tags in header tags 

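Tag				Description
<title>				Defines the title of the document
<base>				Defines the default base address for links on the page
<link>				Defines the relationship between the document and an external resource
<meta>				Defines metadata about the HTML document
<script>			Defines a client-side script
<style>				Defines style information for the HTML document

(2) Commonly used tags in content tags

Tag				Description
<table>				Defines a table
<tr>				Defines a row in a table
<td>				Defines a cell (one column of a row) in a table
<img>				Defines an image
<a>					Defines a hyperlink
<p>					Defines a paragraph
<br>				Defines a line break
<font>				Defines the font (deprecated in HTML5; use CSS instead)
<h1>				Defines a top-level heading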

Four: web

1. Web overview

The Web (World Wide Web) is a distributed graphical information system and a network service built on top of the Internet.

2.Web1.0 vs Web2.0

(1)Web1.0 

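1. Editorial in nature: the content a site offers has been prepared by its editors, and users simply read what the site provides
2. The flow is one-way, from the site to the user

(2)web 2.0

1. Puts more emphasis on user interaction: users are both consumers (readers) and producers of site content
2. Strengthens the interaction between site and users: content is contributed by users, many site features are built with user participation, and communication between site and users runs in both directions
3. Characteristics of Web 2.0: user sharing, communities formed around shared interests, open platforms, and active users.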

3. Static pages

(1) Static page definition

① A static web page is a standard HTML file
② Its file extension is .htm or .html, and it can contain text, images, sounds, Flash animations, client-side scripts, ActiveX controls, Java applets, etc.
③ Static pages are the basis of website construction; early websites were generally built from static pages
④ A static page has no backend database, contains no server-side program, and is not interactive
⑤ Static pages are relatively troublesome to update, so they suit display-type websites that change infrequently

(2) Static page features

① Each static page has a fixed URL that ends in a common suffix such as .htm, .html, or .shtml and does not contain "?"
② Once published, each static page is stored as a file on the web server
③ The content of a static page is relatively stable, so it is easy for search engines to index
④ Static pages take considerable effort to build and maintain, so when a site holds a very large amount of content it is difficult to rely on static pages alone
⑤ Static pages offer poor interactivity and are quite limited in functionality
⑥ Pages load quickly, because serving them does not require a database connection

4. Dynamic pages

(1) Dynamic page definition

① The URL of a dynamic page is not fixed, and the page can interact with the user through the backend
② The URL of a dynamic page contains a telltale symbol: "?"
③ Commonly used languages include PHP, JSP, Python, Ruby, etc.

(2) Features of dynamic pages

① Interactivity: the page changes and responds dynamically according to the user's requests and choices, with the browser serving as the client interface; web development has generally moved in this direction
② Automatic updates: new pages are generated automatically without editing HTML documents by hand, which greatly reduces the workload
③ Varies with time and visitor: different people visiting the same address at different times get different pages
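The core idea of a dynamic page, HTML generated on request from the parameters after the "?", can be sketched in a few lines. This is an illustrative toy, not a web framework: the function name `render_page`, the `name` parameter, and the sample URLs are all assumptions made for the example.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit


def render_page(url):
    """Build an HTML page whose content depends on the query string of `url`."""
    query = parse_qs(urlsplit(url).query)
    # Fall back to a default when no "name" parameter is supplied.
    name = query.get("name", ["guest"])[0]
    # Escape user input before placing it inside the HTML.
    return (
        "<html><head><title>Greeting</title></head>"
        f"<body>Hello, {escape(name)}!</body></html>"
    )


if __name__ == "__main__":
    print(render_page("http://www.test.com/hello.php?name=Alice"))
    print(render_page("http://www.test.com/hello.php?name=Bob"))
```

The same address with different parameters yields different pages, which is exactly what a static file cannot do.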

(3) Currently commonly used dynamic web page programming languages

PHP
PHP (PHP: Hypertext Preprocessor) is one of the most widely used server-side scripting languages on the web. Its syntax borrows from C, Java, Perl and other languages, and it can be used to build fully interactive websites.

JSP
JSP (Java Server Pages) is a technology introduced by Sun Microsystems in June 1999. It is a web development technology based on Java Servlets and the wider Java platform.

Python
Python is an object-oriented, cross-platform, dynamically typed programming language. It was originally designed for writing automation scripts (shell scripts), and as the language has evolved it has increasingly been used to develop independent, large-scale projects.

Ruby
Ruby is a simple, fast, object-oriented scripting language developed in the 1990s by Yukihiro Matsumoto of Japan and released under the GPL and the Ruby License. Its inspiration and features come from Perl, Smalltalk, Eiffel, Ada, and Lisp.

Five: HTTP protocol

1. Introduction to HTTP protocol

  • HTTP (HyperText Transfer Protocol) is the most widely used network protocol on the Internet. It is an application-layer protocol built on TCP. Simply put, it is a set of rules for transferring data between a client and a server.
  • HTTP/HTTPS is an application-layer protocol built on top of TCP at the transport layer. The client first establishes a TCP connection with the server (three-way handshake), then sends HTTP requests and receives HTTP responses through the socket interface, which calls into TCP.
  • HTTP is a stateless protocol: the protocol itself does not persist (store) the requests that were sent or the state of the corresponding exchange. This keeps HTTP simple, so that a large number of transactions can be processed quickly and efficiently.
  • In many application scenarios, however, we need to keep the user logged in or remember the items in the user's shopping cart. Because HTTP is stateless, an additional mechanism such as cookies must be introduced to record and manage state.
  • Cookies and sessions both give HTTP short-term persistence (kept in memory or a cache, so lookups are fast). A cookie is stored in the client's browser until it expires. When the same browser accesses the site again, it reads the stored cookie and sends it along with the request. When the server side receives the request, it reads the cookie and can tell, for example, that this client was previously handled by server A, so to save effort and resources the request is handed straight to server A for processing.

Note: cookies are stored on the client, which saves server resources; sessions are stored on the server, which is more secure.
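The division of labour between cookie and session can be illustrated with Python's standard `http.cookies` module. This is a minimal sketch under stated assumptions: the in-memory `SESSIONS` dictionary, the cookie name `session_id`, and the helper names `login` and `current_user` are all hypothetical, and a real site would add expiry, HTTPS-only flags, and persistent storage.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # server-side session store: session id -> session data


def login(username):
    """Create a session and return the Set-Cookie header line for the response."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"user": username, "cart": []}
    cookie = SimpleCookie()
    cookie["session_id"] = session_id         # only the id travels to the client
    cookie["session_id"]["httponly"] = True   # hide the cookie from page scripts
    return cookie.output(header="Set-Cookie:")


def current_user(cookie_header):
    """Look up the logged-in user from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    session = SESSIONS.get(morsel.value) if morsel else None
    return session["user"] if session else None


if __name__ == "__main__":
    header = login("alice")
    print(header)
    session_id = header.split("session_id=")[1].split(";")[0]
    print(current_user(f"session_id={session_id}"))
```

The client holds only an opaque identifier, while the data that matters stays on the server, which is the reason sessions are considered the more secure of the two.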

2. HTTP version

(1)HTTP/0.9

Obsolete. Only GET is accepted as a request method, no version number is specified in the communication, and request headers are not supported. Because this version does not support the POST method, the client cannot pass much information to the server.

(2)HTTP/1.0

This was the first version of the HTTP protocol to specify a version number in the communication. It also introduced request and response headers and the POST and HEAD methods. By default each request uses its own TCP connection, which is closed after the response.

(3)HTTP/1.1

Persistent connections are introduced: the TCP connection is not closed by default and can be reused by multiple requests, which also works well with proxy servers. It also supports pipelining: on the same TCP connection, the client can send several requests without waiting for each response, although the server must still return the responses in order.

(4)HTTP/2.0

Full multiplexing: on a single connection, both the client and the server can send multiple requests or responses at the same time, and they do not have to be matched one by one in order. Header compression is introduced (HPACK), so repeated header fields are not sent in full each time. Server push is supported, allowing the server to send resources to the client before the client asks for them.

(5)HTTP3.0

HTTP/3 is the first version of the protocol that runs on QUIC, which is built on UDP, instead of TCP. It aims to improve web performance and security. It uses stream multiplexing, allowing multiple requests and responses to be in flight on a single connection, and it is always encrypted. HTTP/3 also supports 0-RTT (zero round-trip time) connection resumption, which speeds up connections when revisiting a website, and includes a request prioritization mechanism.

3. HTTP method

HTTP supports several different request commands, called HTTP methods. Every HTTP request message contains one method, which tells the server what action to perform: fetch a page, run a gateway program, delete a file, and so on. The most commonly used methods are GET, POST, and PUT.

HTTP method	Description
GET		Requests a resource from the server
PUT		Submits data to the server to modify existing data
DELETE		Deletes a resource on the server
POST		Sends a request containing user-submitted data
HEAD		Requests only the headers of a page, to obtain meta information about the resource

(1) A GET request asks the server for data. It is like a select operation on a database: it only queries data, does not modify or add anything, and does not affect the content of the resource. In other words, the request has no side effects, and no matter how many times it is repeated the result is the same.

(2) Unlike GET, a PUT request sends data to the server in order to change information. It is like an update operation on a database: it modifies the content of existing data but does not add new entries. Repeating the same PUT any number of times therefore leaves the resource in the same state.

(3) A POST request, like PUT, sends data to the server, but it creates new content, like an insert operation on a database. Repeating it creates a new resource each time. Almost all form submissions currently use POST.

(4) A DELETE request, as the name implies, deletes a resource. It is like a delete operation on a database.
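The database analogy above can be made concrete with a toy in-memory store whose four operations behave like the four methods. This is a sketch for illustration only; the class `ResourceStore` and its method names are invented here and do not correspond to any library.

```python
class ResourceStore:
    """A toy resource collection whose operations mirror GET, PUT, POST, DELETE."""

    def __init__(self):
        self.items = {}
        self.next_id = 1

    def get(self, item_id):
        # Like select: reads without changing anything.
        return self.items.get(item_id)

    def put(self, item_id, value):
        # Like update: repeating the same call leaves the same state.
        self.items[item_id] = value

    def post(self, value):
        # Like insert: every call creates a new resource with a new id.
        item_id = self.next_id
        self.next_id += 1
        self.items[item_id] = value
        return item_id

    def delete(self, item_id):
        # Like delete: removing an already missing item changes nothing.
        self.items.pop(item_id, None)


if __name__ == "__main__":
    store = ResourceStore()
    first = store.post("draft")
    second = store.post("draft")   # same data, but a second resource
    store.put(first, "final")
    store.put(first, "final")      # no further effect
    print(first, second, store.items)
```

Repeating `put` or `delete` leaves the store unchanged after the first call, while repeating `post` keeps adding entries, which is the practical difference between idempotent and non-idempotent methods.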

 4. GET and POST 

(1)GET

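Accept: the data types the client can accept
Accept-Language: the languages the client can accept
User-Agent: information about the browser
Accept-Encoding: the content encodings (compression formats) the client can accept
Host: the host name (or IP address) and port being requested
Connection: tells the server how to handle the connection
  Keep-Alive: asks the server not to close the connection right after sending the response, but to keep it open for a short time
  close: close the connection immediately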

(2)POST

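① Request line
The request method
The path of the requested resource
The version of the protocol

② Request headers

Accept: the data types the client can accept
Accept-Language: the languages the client can accept
Referer: the address shown in the browser's address bar when the request was initiated (the referring page)
User-Agent: information about the browser
Content-Type: the type of the data being sent
Content-Length: the length of the data being sent

③ Request body: the data sent to the server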
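To show where the submitted data ends up in each case, the sketch below assembles the raw text of a GET request and a POST request for the same form data. It is a simplified illustration: the helper name `build_request` is an assumption, the host and path are the sample values used in this article, and real browsers send many more headers.

```python
from urllib.parse import urlencode


def build_request(method, path, host, form=None):
    """Return the raw bytes of a simple HTTP/1.1 request message."""
    headers = {"Host": host, "Accept": "*/*", "Connection": "close"}
    body = b""
    if method == "GET" and form:
        # GET: parameters travel in the URL, the body stays empty.
        path = f"{path}?{urlencode(form)}"
    elif method == "POST" and form:
        # POST: parameters travel in the body, described by two extra headers.
        body = urlencode(form).encode("ascii")
        headers["Content-Type"] = "application/x-www-form-urlencoded"
        headers["Content-Length"] = str(len(body))
    lines = [f"{method} {path} HTTP/1.1"]                       # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"                      # blank line ends the headers
    return head.encode("ascii") + body


if __name__ == "__main__":
    print(build_request("GET", "/a.php", "www.test.com", {"Id": "123"}).decode())
    print(build_request("POST", "/a.php", "www.test.com", {"Id": "123"}).decode())
```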

(3) The difference between GET and POST

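●GET method: retrieves data from the specified server
GET requests can be cached
GET requests are kept in the browser's history
GET requests are limited in length (browsers and servers cap the URL length)
Mainly used to retrieve data
The query string is visible at the end of the URL, which is not safe for sensitive data, for example http://www.test.com/a.php?Id=123

●POST method: submits data to the specified server for processing
POST requests are generally not cached
POST requests are not kept in the browser's history
POST requests have no length limit
The submitted data does not appear in the URL, which is relatively safer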

5. HTTP status code

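When a browser requests a URL, the server returns a status code that reflects how the request was handled.
Normal status codes are usually 2xx or 3xx (for example 200).
When something goes wrong, 4xx or 5xx is returned (for example 404).

Status code	Originally defined range	Category
1xx		100-101				Informational
2xx		200-206				Success
3xx		300-305				Redirection
4xx		400-415				Client error
5xx		500-505				Server error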

6. Common HTTP status codes in the production environment 

Status code			Description
200 OK				The request succeeded (the response to a GET or POST request follows)
301 Moved Permanently		The requested page has moved permanently to a new location
403 Forbidden			Access to the page is forbidden
404 Not Found			The server cannot find the requested page
500 Internal Server Error	The server encountered an internal error

7. Analysis of HTTP request process

When the user enters a URL in the browser, the browser sends an HTTP request message consisting of a request line, request headers, and a request body. After receiving the request, the server returns a response message consisting of a status line, response headers, and a response body.

  1. The client accesses the site by domain name, so DNS resolution is performed first
  2. The client then establishes a TCP connection with the web server (three-way handshake)
  3. Once the connection is established, the client sends an HTTP request to the web server
  4. The server responds to the HTTP request, and the client's browser receives the HTML code
  5. The browser parses the HTML code and requests the resources it references (images, stylesheets, scripts, and other static resources are downloaded from the server as the parser encounters them)
  6. The TCP connection is closed (four-way teardown), and the browser renders the page and presents it to the user
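Steps 2 to 6 can be walked through by hand with a raw socket. The sketch below is an assumption-laden demo rather than how a browser is implemented: it starts a throwaway server on the loopback address with Python's standard `http.server` module, so step 1 (DNS resolution) is skipped in favour of a literal IP address, and the names `DemoHandler` and `fetch_once` are invented for the example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello World!</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    """Serves one fixed HTML page for every GET request."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once():
    """Run steps 2-6 against a throwaway server on the loopback interface."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Step 2: open a TCP connection (the OS performs the three-way handshake).
        with socket.create_connection(("127.0.0.1", port), timeout=5) as conn:
            # Step 3: send the HTTP request.
            conn.sendall(b"GET / HTTP/1.1\r\nHost: 127.0.0.1\r\nConnection: close\r\n\r\n")
            # Step 4: read the response until the server closes its side.
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:
                    break
                chunks.append(data)
        # Step 6: leaving the "with" block closes the TCP connection.
        return b"".join(chunks)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_once().decode("iso-8859-1"))
```

Step 5 is left out because the demo page references no further resources; a browser would repeat steps 2 to 4 (or reuse the connection) for each image, stylesheet, or script it finds.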
     

(1) Request message

Request line: consists of three parts: the request method, the URL, and the protocol version.
Request headers: add extra information to the request message as name/value pairs, one pair per line, with the name and value separated by a colon.
Blank line: a blank line follows the last request header to mark the end of the headers; the request body comes after it. This line is required.
Request body: carries the parameters submitted with the request. A GET request passes its parameters in the URL, so its body is empty; a POST request puts the submitted parameters in the request body.

(2) Commonly used request headers

Request header		Description
Host			The address of the server receiving the request, given as a domain name or as IP:port
User-Agent		The name of the application sending the request
Connection		Connection-related options, such as Connection: Keep-Alive
Accept-Charset		The character encodings the client can accept from the server
Accept-Encoding		The data compression formats the client can accept from the server
Accept-Language		The languages the client can accept from the server

(3) Response message

Status line: consists of three parts: the protocol version, the status code, and the status description.
Response headers: like request headers, they add extra information to the response message.
Blank line: a blank line follows the last response header to mark the end of the headers.
Response body: the data returned by the server, typically HTML, which the browser parses and displays as a page.
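The four parts of a response message can be separated with a few string operations. The following is a simplified sketch: the function name `parse_response` and the sample response (including the `Server: nginx` value) are made up for illustration, and it ignores details such as repeated headers and chunked bodies.

```python
def parse_response(raw):
    """Split a raw HTTP response into status line fields, headers, and body."""
    # The first blank line separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: protocol version, status code, status description.
    version, status_code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status_code), reason, headers, body


SAMPLE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: nginx\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 12\r\n"
    b"\r\n"
    b"Hello World!"
)

if __name__ == "__main__":
    print(parse_response(SAMPLE))
```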

(4) Common response headers

Response header		Description
Server			The name and version of the server software
Content-Type		The media type of the response body (for example HTML text or an image); its charset parameter gives the character encoding of the body
Content-Length		The length of the response body in bytes
Content-Encoding	The data compression format used for the response body
Content-Language	The language used in the response body

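Summary: common status codes and reasons

1xx: informational status codes; the server has received the request and is still processing it.

  • 100: The request has been received by the server; the client should continue.
  • 101: The server has understood the client's request and uses the Upgrade header to tell the client to switch to a different protocol to complete it, for example upgrading from HTTP to WebSocket.

2xx: success status codes; the server has processed the request successfully.

  • 200: The request succeeded, and the response headers or body the request asked for are returned with this response.
  • 201: The request was processed successfully and a new resource was created.
  • 204: The request was processed successfully, but no content is returned.

3xx: redirection status codes; further action is needed to complete the request.

  • 301: The requested resource has moved permanently to a new location, and the client should use the new URL from now on.
  • 302: The requested resource has moved temporarily to a new location; the client should fetch it from the new URL for now but keep using the original URL for future requests.
  • 304: The client sent a conditional request, and the server reports that the resource has not been modified, so the cached copy can be used directly.

4xx: client error status codes; the request sent by the client contains an error.

  • 400: The request is malformed and the server cannot understand it.
  • 401: The request is unauthorized; authentication is required.
  • 403: The server refuses the request; the client lacks permission.
  • 404: The requested resource does not exist.

5xx: server error status codes; the server failed while processing the request.

  • 500: Internal server error; the request cannot be completed.
  • 503: The server is temporarily unable to handle the request, usually because it is overloaded or under maintenance.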
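The five classes summarized above follow directly from the first digit of the code, so a lookup is easy to sketch. The example below uses the standard `http.HTTPStatus` enum for the reason phrases; the function name `describe_status` and the category labels are choices made for illustration.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code):
    """Return the code, its standard reason phrase, and its class."""
    category = CATEGORIES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"  # not a registered status code
    return f"{code} {phrase} ({category})"


if __name__ == "__main__":
    for code in (200, 301, 404, 503):
        print(describe_status(code))
```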

Origin blog.csdn.net/A1100886/article/details/130844522