Experiment 1 Anaconda installation and use (online Python programming experiment guide)

1. Experimental purposes and requirements

(1) Master the installation and configuration of Anaconda under Windows.

(2) Master the simple use of Anaconda under Windows, including the use of IDLE, Jupyter Notebook, and Spyder tools.

(3) Master the use of pip to manage Python extension libraries, including downloading, online installation, offline installation, upgrade, and uninstallation of extension libraries.

2. Experimental content

(1) Download Anaconda.

(2) Install and configure Anaconda.

(3) Use IDLE that comes with Anaconda.

(4) Use Spyder that comes with Anaconda.

(5) Use the Jupyter Notebook that comes with Anaconda.

(6) Use pip/conda tools to manage Python third-party extension libraries.

3. Experimental instruments and equipment

A PC with Internet access, the Windows operating system, and the Anaconda Distribution (Python 3.X) installation file.

4. Experimental principles

(1) Introduction to Python

Python is a cross-platform, open source, free, interpreted, high-level dynamic programming language. It supports imperative programming (describing how to do something) and functional programming (describing what to do), and fully supports object-oriented programming. Its syntax is concise and clear, and it has a large number of powerful built-in objects, standard libraries, and mature extension libraries that support application development in almost all fields.

Currently, Python supports the mainstream operating systems, including Windows, Linux, and macOS.

Currently, the commonly used Python IDE tools mainly include the following:

  1. IDLE (recommended for beginners), https://www.python.org/
  2. Anaconda (recommended for scientific computing), https://www.anaconda.com/
  3. PyCharm (recommended for large-scale Python program development), https://www.jetbrains.com/pycharm/
  4. VS Code (supports multiple development languages), https://code.visualstudio.com/Download
  5. Sublime Text (powerful editor), https://www.sublimetext.com/

For more information about Python, please refer to the official website, https://www.python.org/.

(2) Introduction to Anaconda

Anaconda is a software distribution built to make Python convenient for data science work. It bundles about 250 commonly used data science packages, so it can be regarded as an all-in-one Python bundle. It also ships with conda, a package management tool designed to solve software environment dependency problems. Conda provides both package management and environment management, which makes it easy to keep several Python versions side by side, switch between them, and install third-party packages.

Anaconda uses the conda command to manage packages and environments, and already includes Python and its supporting tools.

Jupyter Notebook, a tool that comes with Anaconda, uses the browser/server (B/S) model. When started, it automatically opens the browser and connects to the back-end server, by default through port 8888.
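The browser/server arrangement can be probed from Python. The following is a minimal sketch, assuming the notebook server runs on the local machine and uses the default port 8888; the helper names `is_port_open` and `notebook_url` are made up for this illustration and are not part of Jupyter.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


def notebook_url(port: int = 8888, host: str = "localhost") -> str:
    """Build the address the browser opens for a local notebook server."""
    return f"http://{host}:{port}/"


if __name__ == "__main__":
    state = "reachable" if is_port_open("127.0.0.1", 8888) else "not running"
    print(f"{notebook_url()} is {state}")
```

If port 8888 is already occupied when Jupyter Notebook starts, the server may pick another port, so the address shown in the browser is the one to trust.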

Table 1-1 shows the commonly used shortcut keys for operating cells in Jupyter Notebook and their descriptions.

Table 1-1 Commonly used shortcut keys for operating cells

| Mode | Shortcut | Description |
| --- | --- | --- |
| Command mode and edit mode | Shift+Enter | Run the code of the current cell and jump to the next cell |
| Command mode and edit mode | Ctrl+Enter | Run the code of the current cell and stay in it |
| Command mode | Y | Switch the cell to Code mode |
| Command mode | M | Switch the cell to Markdown mode |
| Command mode | A | Add a cell above the current cell |
| Command mode | B | Add a cell below the current cell |
| Command mode | D, D (press D twice) | Delete the current cell |
| Command mode | Z | Undo |
| Command mode | Ctrl+Shift+minus sign | Split the cell at the cursor |
| Command mode | L | Show line numbers in the current cell |
| Edit mode | Ctrl+mouse click (Mac: CMD+mouse click) | Multiple cursor operations |
| Edit mode | Ctrl+Z (Mac: CMD+Z) | Undo |
| Edit mode | Ctrl+Y | Redo |
| Edit mode | Tab | Code completion |
| Edit mode | Ctrl+/ (Mac: CMD+/) | Comment out multiple lines of code |

For more information about Anaconda, please refer to the official website, https://www.anaconda.com/.

(3) Use pip to manage Python extension libraries

Currently, pip is the mainstream way to manage Python extension libraries. With pip, you can view the list of Python extension libraries installed on your local machine, and install, upgrade, and uninstall extension libraries.

Commonly used pip commands and their descriptions are shown in Table 1-2.

Table 1-2 Commonly used pip commands

| pip command example | Description |
| --- | --- |
| pip download SomePackage[==version] | Download the specified version of the extension library without installing it |
| pip list | List all currently installed modules |
| pip install SomePackage[==version] | Install the specified version of the SomePackage module online |
| pip install PackageFilename.whl | Install an extension library offline from a whl file |
| pip install package1 package2 ... | Install package1, package2, and other extension modules online in sequence |
| pip install --upgrade SomePackage | Upgrade the SomePackage module |
| pip uninstall SomePackage[==version] | Uninstall the specified version of the SomePackage module |
| python -m pip | Run pip as a module |

You can execute the "pip help" command in the command prompt environment to view the pip command help. The screenshot of the execution effect is shown in Figure 1-1.

Figure 1-1 View pip command help (part)

You can also use the command "pip <command> --help" to get help information for a specific command. For example, the screenshot of the execution effect of the "pip install --help" command is shown in Figure 1-2.

Figure 1-2 View pip install command help (part)
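The help commands above can also be issued from a Python script, which is handy when several interpreters are installed. This is a minimal sketch, assuming pip is available for the interpreter that runs the script; `pip_command` and `run_pip` are illustrative helper names, not part of pip.

```python
import subprocess
import sys


def pip_command(*args: str) -> list[str]:
    """Build a command line that runs pip with the current interpreter."""
    # "python -m pip" ties pip to this exact Python installation
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args: str) -> str:
    """Run pip and return its standard output as text."""
    result = subprocess.run(pip_command(*args), capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Same request as typing "pip install --help" at the command prompt
    print(run_pip("install", "--help")[:300])
```

Building the command from `sys.executable` rather than a bare `pip` avoids accidentally calling a pip that belongs to a different Python installation.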

5. Experimental steps

(1) Obtain Anaconda.

The official Anaconda download address is https://www.anaconda.com/distribution/. Select the Anaconda Distribution version that matches your CPU word length (32-bit or 64-bit) and operating system. The author uses Windows, so the installer for Windows is downloaded, as shown in Figure 1-3.

Figure 1-3 Anaconda Windows installer

For the specific installation environment requirements of Anaconda, readers can check the "Installation" page of the official Anaconda documentation.

(2) Install and configure Anaconda.

Double-click the Anaconda Windows installation package file to complete the installation. Some suggestions on installation are as follows:

(1) Select the custom installation option. It is recommended not to install Anaconda on the C drive.

(2) Select installation for all users.

(3) Do not add Anaconda's own Python interpreter to the system environment variable PATH during installation.

(3) Initial use of IDLE that comes with Anaconda.

Find the file "idle.exe" in the Scripts directory under the Anaconda installation directory. Double-click the file to enter the IDLE development environment, as shown in Figure 1-4.

Figure 1-4 IDLE development environment

Then, output the string "Hello, Python!" in interactive mode, as shown in Figure 1-5.

Figure 1-5 Output the string "Hello, Python!" in interactive mode under IDLE
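Because the installation advice above keeps Anaconda's interpreter out of PATH, several Python installations may coexist on one machine. The following minimal sketch, which can be typed into IDLE, prints the greeting and reports which interpreter is actually running; `interpreter_summary` is an illustrative name chosen for this example.

```python
import sys


def interpreter_summary() -> dict[str, str]:
    """Describe the Python interpreter that is executing this code."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return {"executable": sys.executable, "version": version}


if __name__ == "__main__":
    info = interpreter_summary()
    print("Hello, Python!")
    print("Running Python", info["version"], "from", info["executable"])
```

If the reported path lies inside the Anaconda installation directory, IDLE is using the interpreter that came with Anaconda.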

(4) Use Spyder IDE

1. Click [Anaconda Navigator] under [Anaconda3] in the [Start] menu to enter the Anaconda Navigator main interface, as shown in Figure 1-6.

Figure 1-6 Anaconda Navigator main interface

2. Click the button [Launch] under Spyder to enter the Spyder IDE development environment, as shown in Figure 1-7.

Figure 1-7 Spyder IDE main interface

3. Enter the following code in the default "temp.py" file:

print('Hello, Python')

Then press the shortcut key "Ctrl+Shift+S" or click the menu [File] → [Save as], select the save location, and save it as "hello.py".

4. Run "hello.py". Press the shortcut key F5, click the menu [Run] → [Run], or click the [Run file (F5)] button on the toolbar. The [Run settings for hello.py] window pops up, and you can run the file with the default options. The results are displayed in the "IPython console" in the lower right corner of the window, as shown in Figure 1-8.

Figure 1-8 The string "Hello, Python!" is output in the form of a script program under Spyder IDE.

Of course, we can regard "IPython console" as a Python interactive mode environment and directly enter Python statements at the prompt "In [N]:". The effect is shown in Figure 1-9.

Figure 1-9 The string "Hello, Python!" is output under the "IPython console" in Spyder IDE.

In addition, you can also enter the Spyder development environment through [Spyder] under [Anaconda3] in the [Start] menu.

(5) Use Jupyter Notebook

1. Enter the Anaconda Navigator interface and click the [Launch] button under Jupyter Notebook to enter the Jupyter Notebook development environment, as shown in Figure 1-10. In practice, you often need to specify the working directory of Jupyter Notebook. The method is as follows:

Open [Anaconda3] in the [Start] menu, right-click [Jupyter Notebook], and select [Open file location]. In the folder that opens, right-click [Jupyter Notebook], select [Properties], and change the text at the end of the [Target] input box to the directory you want.

Figure 1-10 Jupyter Notebook main interface

2. For example, to create a new Python 3 file on the desktop, enter the "Desktop" directory and click [Python 3] under the [New] button, as shown in Figure 1-11.

Figure 1-11 Create a new Python 3 file under Jupyter Notebook

3. Enter the editing interface of the file, as shown in Figure 1-13.

Figure 1-13 Python 3 file editing page under Jupyter Notebook

4. Enter the Python program or statement. For example, enter the following statement:

print('Hello, Python')

5. Run the Python program or statement. Click the button [Run] on the toolbar or a Run option under the menu [Cell] or press the shortcut key "Ctrl+Enter". The running effect is shown in Figure 1-14.

Figure 1-14 Jupyter Notebook outputs the string "Hello, Python!"

6. If necessary, you can also save the Python file. Click the menu [File] → [Save as...], enter a file name such as "hello2", and then click the [Save] button to complete the save, as shown in Figure 1-15.

Figure 1-15 Save files under Jupyter Notebook

Please note that files saved by Jupyter Notebook have the extension .ipynb by default.

In addition, you can also start Jupyter Notebook through [Jupyter Notebook] under [Anaconda3] in the [Start] menu.

(6) Mix graphics, text and formulas in Jupyter Notebook

1. Select the cell type [Markdown].

Figure 1-16 Set cell type to markdown

Markdown is a lightweight markup language that lets people write documents in a plain text format that is easy to read and write. Text can be edited in Markdown mode. Using Markdown syntax, you can set the text format and insert links, pictures, and even mathematical formulas. As with a code cell, you can run a Markdown cell by pressing [Shift]+[Enter] to display the formatted text.

Add a "#" character and a space at the start of a line for a first-level title, "##" and a space for a second-level title, and so on. Bullets use "+", "-", or "*" followed by a space. An inline formula is enclosed in single "$" symbols, for example "$E=mc^2$"; a formula on its own line is enclosed in double "$$" symbols, for example "$$E=mc^2$$".

For the syntax of mathematical formulas, see https://www.jianshu.com/p/e74eb43960a1.

2. Complete the following input:

Figure 1-17 Enter markdown text

(7) Use pip to manage Python third-party extension libraries.

1. Open the Anaconda command prompt through [Anaconda Prompt] under [Anaconda3] in the [Start] menu, and enter the command "pip help" to view the help document of the pip command.

2. Upgrade pip. It is recommended to upgrade pip first after installing Python or Anaconda. The command used is as follows.

python -m pip install --upgrade pip

3. View all extension libraries currently installed under Anaconda. The command is as follows, and its running effect is shown in Figure 1-18.

pip list

Figure 1-18 Use pip to view all extension libraries currently installed under Anaconda (part)

4. Display detailed information about an extension package. For example, to display the detailed information of the numpy package, use the following command. The running effect is shown in Figure 1-19.

pip show numpy

Figure 1-19 Use pip to view numpy package details
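The information printed by "pip list" and "pip show" can also be read from inside Python through the standard library module importlib.metadata. The sketch below is an illustration under that assumption, not a replacement for pip; the function names `installed_version` and `installed_packages` are invented for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def installed_packages() -> dict[str, str]:
    """Map each installed distribution name to its version, like "pip list"."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata is incomplete
            packages[name] = dist.version
    return packages


if __name__ == "__main__":
    print("numpy:", installed_version("numpy"))
    print(len(installed_packages()), "distributions installed")
```

Running the script before and after an uninstall or reinstall gives a quick check that the change took effect in the interpreter you are actually using.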

5. Uninstall an extension package. For example, to uninstall the numpy package, use the following command. During the process you will be asked whether to continue; enter "y". The effect is shown in Figure 1-20.

pip uninstall numpy

Figure 1-20 Use pip to uninstall the numpy package

6. Install an extension package online. For example, to install the numpy package, use the following command. The running effect is shown in Figure 1-21. Use pip again to view the detailed information of the numpy package; the effect is shown in Figure 1-22. As Figure 1-22 shows, numpy 1.17.1 has been installed, whereas the previous numpy version was 1.16.2.

pip install numpy

Figure 1-21 Use pip to install numpy package online

Figure 1-22 Use pip again to view numpy package details

If a network connection timeout is reported, you can use either of the following two methods:

  1. Temporarily change the pip installation source with the command pip install -i https://pypi.douban.com/simple PackageName. (This example uses the Douban mirror. Other mirror sources, such as Alibaba or Tsinghua, can be used instead, depending on actual network conditions.)
  2. Permanently change the pip installation source (taking Windows 10 as an example).

  In the [C:\Users\Administrator\AppData\Roaming\] directory (replace Administrator with your own user name), create a subdirectory named [pip], and in it create a text file pip.ini with the following content:

[global]

timeout = 300

index-url = https://pypi.douban.com/simple

trusted-host = pypi.douban.com
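Since pip.ini uses the INI format, its content can be sanity-checked with Python's standard configparser module before it is saved. This is only a sketch for catching typing mistakes: pip reads the file itself, and `read_pip_settings` is a hypothetical helper name.

```python
import configparser

# The same settings as in the pip.ini file described above
PIP_INI = """\
[global]
timeout = 300
index-url = https://pypi.douban.com/simple
trusted-host = pypi.douban.com
"""


def read_pip_settings(text: str) -> dict[str, str]:
    """Parse pip.ini text and return the options of its [global] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["global"])


if __name__ == "__main__":
    for key, value in read_pip_settings(PIP_INI).items():
        print(f"{key} = {value}")
```

A missing "[global]" header or a misspelled option name shows up immediately here, as a parsing error or in the printed options.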

7. Install third-party extension libraries offline. If an extension library fails to install online, use a browser to open https://www.lfd.uci.edu/~gohlke/pythonlibs/, download the whl file that matches your Python version and CPU word length, and then install it offline. The command for offline installation is as follows.

pip install PackageFilename.whl
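Choosing the right whl file is easier once its name is understood: a wheel file name encodes the package name, version, Python tag, ABI tag, and platform tag, separated by hyphens. The following sketch splits such a name into its parts; the file name in the example and the helper `parse_wheel_filename` are hypothetical and serve only to illustrate the layout.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into the fields encoded in it."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename}")
    # The last three fields are the Python, ABI and platform tags;
    # an optional build tag may sit between the version and those tags.
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(name, version, python_tag, abi_tag, platform_tag)


if __name__ == "__main__":
    # Hypothetical file name, used only to illustrate the layout
    print(parse_wheel_filename("numpy-1.17.1-cp37-cp37m-win_amd64.whl"))
```

In this example, "cp37" indicates CPython 3.7 and "win_amd64" indicates 64-bit Windows, so the file only suits an interpreter with that version and word length.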

(8) Modify the default Jupyter Notebook workspace

Jupyter Notebook behaves differently on Windows and Linux. On Linux, the directory in which it is started becomes the default workspace, but this is not the case on Windows.

In the [Start] menu, find [Jupyter Notebook] under [Anaconda3], right-click it, and select [Open File Location] under [More]. Then right-click [Jupyter Notebook], select [Properties], and replace "%USERPROFILE%/" in the [Target] input box with your preset directory (the directory must exist). Finally, click the [OK] button. Start Jupyter Notebook and check whether it opens in the preset directory.
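One way to confirm the result is to run a short check in a notebook cell. This is a minimal sketch, assuming the cell runs in a notebook created directly in the workspace root; `is_working_dir` is an illustrative name, and the directory passed to it is whatever you entered in the [Target] box.

```python
from pathlib import Path


def is_working_dir(expected: str) -> bool:
    """Return True if the current working directory is the expected one."""
    # resolve() makes both paths absolute and normalised before comparing
    return Path.cwd().resolve() == Path(expected).resolve()


if __name__ == "__main__":
    print("Current working directory:", Path.cwd())
```

For example, calling `is_working_dir` with your preset directory should return True when the shortcut has been changed correctly.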

6. Experimental Precautions

(1) Follow the computer room rules and use electricity safely.

(2) Python is an interpreted language and requires support from the Python interpreter.

(3) Jupyter Notebook in Anaconda is based on the browser/server (B/S) model and has certain requirements for browsers. It is recommended to use Chrome or 360 browser.

(4) By default, extension libraries are downloaded from servers abroad, which is slow. It is recommended to download from domestic mirror sites, such as the Douban mirror (https://pypi.douban.com/simple) or the University of Science and Technology of China mirror.

(5) When encountering a problem, first read the error message given by the system, determine the point of error, and then use the knowledge you have learned to solve it. If you cannot solve it, you can search for information through Baidu, discuss with classmates, or ask the teacher for help.

7. Experiment report requirements

Laboratory reports must be submitted in written/electronic form. Plagiarism is strictly prohibited. Once found, plagiarism will result in zero points.

The main content of the experiment report includes the name of the experiment, the type of experiment, the location of the experiment, the hours of study, the experimental environment, the experimental principle, the experimental steps, the experimental results, the summary and reflections, etc.

8. Experimental performance assessment

Experimental results are scored based on attendance in the experimental class, classroom performance, experimental thinking and the content of the experimental report. Based on a hundred-point system, the average experimental score is calculated into the total course grade at a rate of 15%.

[Markdown unit reference code]

# Data preprocessing

## 1. Selection of distance

Distance in data mining represents the degree of similarity and dissimilarity between data. Different distances should be selected to measure in different scenarios.

### 1) Euclidean distance---L2 norm

Euclidean distance is the distance between two points in a plane or space. It is the shortest distance between two points in a multi-dimensional space.

$$d = \sqrt{\displaystyle\sum_{i=1}^{n}(x_{1i}-x_{2i})^2}$$

### 2) Manhattan distance---L1 norm

Manhattan distance represents the sum of the absolute axis distances of two points in space in the coordinate system.

$$d = \displaystyle \sum_{i=1}^{n}|x_{1i}-x_{2i}|$$

### 3) Cosine similarity

The similarity between two vectors is measured by measuring the cosine value of the angle between them, which is often used in the calculation of text similarity.

$$d = \cos(\theta)=\frac{\vec{a}\cdot\vec{b}}{\|\vec{a}\|\cdot\|\vec{b}\|}=\frac{\displaystyle \sum_{i=1}^{n}(x_{1i}\times x_{2i})}{\sqrt{\displaystyle \sum_{i=1}^{n}x_{1i}^2}\times\sqrt{\displaystyle \sum_{i=1}^{n}x_{2i}^2}}$$
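The three measures in the reference text can be checked numerically in a code cell. The sketch below uses only the standard library and assumes two points given as equal-length sequences of numbers, with non-zero vectors for the cosine case; the function names are chosen for this illustration.

```python
import math
from typing import Sequence


def euclidean(x1: Sequence[float], x2: Sequence[float]) -> float:
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))


def manhattan(x1: Sequence[float], x2: Sequence[float]) -> float:
    """Manhattan distance: sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x1, x2))


def cosine_similarity(x1: Sequence[float], x2: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x1, x2))
    norm1 = math.sqrt(sum(a * a for a in x1))
    norm2 = math.sqrt(sum(b * b for b in x2))
    return dot / (norm1 * norm2)


if __name__ == "__main__":
    print(euclidean((0, 0), (3, 4)))          # 5.0
    print(manhattan((0, 0), (3, 4)))          # 7
    print(cosine_similarity((1, 0), (0, 1)))  # 0.0
```

Note that in the cosine formula each vector's norm is computed separately and the two norms are then multiplied.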

Origin blog.csdn.net/VLOKL/article/details/134367803