Simplify complexity and code elegantly. ——Jupyter notebook

1. Introduction

Jupyter Notebook is currently the most popular editor for data science and machine learning work. Unlike other mainstream editors, Jupyter Notebook is built on web technology. It provides an interactive environment and can edit content in rich-text formats (including Markdown, LaTeX formulas, etc.), making document editing very convenient. Jupyter Notebook originally supported only the Python language, but has since expanded to more than 40 languages.

2. Some key points along the way

This is not intended to be a detailed historical retrospective. Instead, we will highlight some milestones that illustrate how a series of important ideas that continue to be relevant to the present came about.

Interactive Python and the scientific Python ecosystem. Jupyter evolved from the IPython project, which focused on interactive computing with Python for scientific computing needs and workflows. Since 2001, IPython has been committed to being a completely open-source project (so that research results can be shared without barriers), recognizing that the paid computing software common in academia poses a barrier to sharing. This meant IPython would grow with the scientific Python ecosystem, providing a "gateway" to NumPy, SciPy, Matplotlib, pandas, and other powerful toolkits. So from the beginning we found a good division of labor: IPython could focus on problems of human-computer interaction, while other projects provided data structures, algorithms, visualization, and more. The various projects freely share code through compatible open-source licenses, allowing each project to add its own contribution while working together to provide powerful systems for end users.

Open IPython Notebook protocol and file format. Around 2010, after many experiments building notebooks for IPython, we took the first steps toward the architecture we have today. We wanted to retain the "IPython experience": all the features and workflows of the IPython terminal would be preserved, but would operate over a network protocol, so that a client anywhere could connect to the server performing the computation. Using the ZeroMQ networking library, we defined a protocol that captures all the operations familiar from IPython, from executing code to auto-completing an object's name (introspection). In little more than a year, this decision led to the release of a graphical client (the Qt console) and, in the summer of 2011, of the Jupyter Notebook (then called the IPython Notebook; more details can be found in this blog post).

From IPython to Jupyter. The IPython Notebook was quickly adopted by the SciPy community, but it soon became clear that its underlying architecture could serve any interactive programming language. Within a very short period, kernels for languages other than Python (Julia, Haskell, R, etc.) were created one after another. We developed a few ourselves, but most kernels are developed independently by users of those languages. This cross-language usage forced us to carefully validate our architecture and eliminate any accidental dependencies on IPython. In 2014, it also led us to rename much of the project to Jupyter. The name is inspired by Julia, Python, and R (three open-source languages for data science), but it represents a universal idea that transcends any specific language: computation, data, and the human activities of understanding, sharing, and collaborating.

3. View trends from today’s perspective

The ideas that brought Jupyter to this point have been woven into the larger fabric of computing and data science, and we expect them to have a significant impact in the future. Here are six trends we're seeing in the Jupyter ecosystem:

1.  Interactive computing is now a real and serious thing. Data-oriented computing has exposed the idea of interactive computing to many more practitioners. People in scientific computing were already familiar with this kind of human-in-the-loop computing through languages such as Matlab, IDL, and Mathematica. However, when we started developing IPython in the early 2000s, this workflow was still foreign to developers in traditional software engineering. Languages such as Python and Ruby provided interactive shells, but their functionality was limited; they were lightweight experimental tools rather than preferred development environments. When the first version of IPython appeared in 2001, it was an attempt to make interactive computing with Python enjoyable for those who used Python full-time. Tools such as Jupyter, RStudio, Zeppelin, and Databricks have further promoted interactive, web-based computing, enabling millions of statisticians, data scientists, data engineers, and AI/ML practitioners to compute interactively every day. Traditional integrated development environments (IDEs) are being displaced by interactive computing environments: Jupyter, JupyterLab, and RStudio are outstanding examples of this trend. Along the way, the basic building blocks of interactive computing were identified, formalized, and developed: the kernel (the process that runs the code), the network protocols (the formal message specifications for sending code to the kernel and getting results back), the user interface (the human-facing front end to the kernel), MIME-based output (the representation of any type of result beyond plain text), and so on.

2.  Computational narratives are widely created. Live-running code, narrative text, and visualizations are integrated to make it easy to tell stories using code and data. This computational narrative is being used to produce and share technical content across different user and business contexts, including books, blog posts, peer-reviewed academic publications, data-driven journalism, and more. File formats such as Jupyter Notebook and R Markdown encode these computational narratives into shareable and reproducible units. However, the practice of computational narrative has expanded well beyond these open source formats to many interactive computing platforms.

3.  Program for specific insights rather than generalized tasks. The overall goal of computer science is generalization and abstraction, and software engineering focuses on designing unified libraries and applications for broad classes of problems. With the rise of interactive computing as a practice, and its inclusion in computational narratives (what we call literate computing), we now have a new group of people who use programming languages and development tools for a different purpose. They explore data, models, and algorithms, often for very specific questions, and may even spend significant effort on a single dataset; but they ask complex questions and find insights that can be shared, published, and extended. The ubiquity of data in all disciplines means that programming languages and tools now have a vastly expanded audience, whose needs and interests differ from those of "traditional" software engineers.

4.  Embrace multilingual individuals and organizations. Many individuals and organizations have realized the benefits of leveraging multiple programming languages when working with data. In a data-focused research group or company, it's not uncommon to see Python, R, Java, and Scala all in use. This pushes everyone to develop protocols (the Jupyter Message Specification), file formats (Jupyter Notebook, Feather, Parquet, Markdown, SQL, JSON), and user interfaces (Jupyter and nteract) that work uniformly across languages and maximize interoperability and collaboration.

5.  Open standards for interactive computing. A decade ago, the industry focus was on creating open standards for the web, such as HTML, HTTP, and related protocols. Today, we see the same kinds of standards being developed for interactive, data-oriented computing. The Jupyter Notebook file format is a formal specification of a JSON document format for computational narratives. Markdown is the standard for narrative text (albeit a loosely specified one). The Jupyter Message Specification is an open standard that allows any interactive computing client to communicate with any language kernel. Vega and Vega-Lite are JSON schemas for interactive visualization. These open standards enable a large number of tools and languages to work together seamlessly.

6.  Share data meaningfully. Open data initiatives by governments and organizations provide a rich source of data to ordinary people and institutions. This data can be used to explore, to reproduce previous experiments and studies, and to build services for others. But data only becomes meaningful when paired with the right tools (Jupyter, nteract, RStudio, Zeppelin, etc.): tools that allow users to explore these datasets and share their results, that humanize the data analysis process, that support collaboration, and that use narrative content and visualizations to convey the meaning of the data.
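As a concrete illustration of the file-format standard mentioned in trend 5: a `.ipynb` file is plain JSON, and its top-level structure is small enough to sketch by hand. The field names below follow the public nbformat 4 specification; the cell contents are made up for the example.

```python
# Minimal sketch of the Jupyter Notebook (nbformat 4) JSON structure.
import json

notebook = {
    "nbformat": 4,          # major version of the format
    "nbformat_minor": 5,
    "metadata": {},         # kernel/language info usually lives here
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# A computational narrative\n",
                       "Text, code, and results live together."],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # filled in after the cell runs
            "metadata": {},
            "source": ["print(1 + 1)"],
            "outputs": [],            # MIME-based results are stored here
        },
    ],
}

# Serializing this dict yields a valid .ipynb document that any
# Jupyter client can open.
print(json.dumps(notebook, indent=1)[:60])
```

Because the format is just JSON, it round-trips cleanly through any JSON library, which is exactly what makes it easy for third-party tools to read and write notebooks.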

So the question then becomes: Do all these trends signify a larger pattern? We believe they all herald the emergence and development of code, data, and user interfaces designed to optimize human-computer interaction and understanding of computations.

In the past, humans had to constrain themselves to accommodate the many limitations of computers (network, memory, CPU, disk space, etc.). Now that these old constraints have been significantly relaxed, we can enjoy high-level languages (Python, R, Julia) and rich network interfaces (web browsers and JavaScript frameworks). We can build powerful distributed systems with well-designed browser-based user interfaces that let us use computing resources and data regardless of their geographic location. We can now start optimizing our most important resource: human time.

The relaxation of these previous constraints did not magically trigger the creation of human-centered computing systems, but it opened the door to it. The real driver may be the explosion of data emerging from every conceivable organization and activity. This creates a profound need to interact with code and data in more significant and meaningful ways. Without this impetus, the Jupyter project would still exist, but it would probably be limited to a very small academic scientific computing community.

Organizations need to start focusing on the human element when developing their data strategy. The huge success Jupyter has achieved in some organizations is not the result of a purchasing decision by upper management; it's a decision made by the developers and data scientists who spend time every day wrestling with code and data. In the future, the tools and systems that put the human element front and center, and prioritize design and usability as much as performance, will be the ones that are actually used and widely adopted. We developed the ideas behind Jupyter because we wanted to use them ourselves, and we will keep building on those ideas.

4. Installing Jupyter Notebook

Okay, enough talk. Let's officially start the installation!

All roads lead to Rome, and the same applies to installing Jupyter.

1. pip installation

1. Check whether Python is installed.

Press Windows+R to open the Run dialog, then enter cmd to open a command prompt.

Enter the following in the command prompt. If Python version information is displayed, Python is installed on your system.

python --version

The Python version displayed may differ from machine to machine; mine shows Python 3.9.13.

If Python is not installed, you can search online for instructions. I will publish a detailed Python installation tutorial later.

2. Check whether pip exists.

Enter the following in the command prompt

pip -h

to check whether pip exists. If a long block of help text is printed, pip is available.

If the output says the command is not recognized, pip is not installed. Open the pip download link to download the pip installer.

Click get-pip.py and select Save As, then run it after the download completes. Afterwards, enter the command again to check whether the installation succeeded.

pip -h

After pip is installed, enter the command to install Jupyter Notebook.

Not recommended

pip install jupyter

Recommended (Tsinghua mirror download):

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple jupyter
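If you would like pip to use the Tsinghua mirror for every future install, not just this one, you can set it as the default index. This step is optional and only changes your local pip configuration; `pip config set` is a standard pip command.

```shell
# Make the Tsinghua mirror the default index for all future pip installs.
# To undo, delete the setting with: pip config unset global.index-url
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
```

After this, a plain `pip install jupyter` will already go through the mirror.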

Run the command and wait patiently for the installation to finish.

After the installation completes, enter

jupyter notebook

After pressing Enter, a web page will open automatically.

When this web page is displayed, the installation is successful.
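If the default port 8888 is already in use, or you don't want a browser window to open automatically (for example, when running on a remote server), `jupyter notebook` accepts a couple of common flags. The port number below is just an example.

```shell
# Start on a specific port without auto-opening a browser; then copy
# the URL (including the access token) printed in the terminal into
# your browser yourself.
jupyter notebook --port 8889 --no-browser
```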

Notebook 7 is now the current major version; to install it, enter the following on the command line.

pip install notebook

2. Anaconda installation

Install Anaconda

Go to the official website, https://www.anaconda.com/, to get the installation package.

Download the Individual (Personal) Edition, which is free.

Install the Anaconda build that matches your operating system.

The Anaconda installation process is similar across operating systems. Since the installer is largely automatic, you only need to pay attention to a few steps. If you need an older version of Anaconda, click here.

1. After the download completes, go to the installer's folder and double-click the installer. Do not right-click and run it in administrator mode, as this can affect environment variables and other settings.

2. Click "next".

3. Read the license terms and click "I Agree".

4. Select the "Just Me" option. (If your computer has multiple users who all need to use Anaconda, select "All Users" instead; otherwise the other users will not have permission to use it.)

5. Select the destination folder where you want to install Anaconda and click the "next" button. Note: the installation path must not contain spaces and should use only English letters and numbers.

6. Choose whether to add Anaconda to the PATH environment variable. It is recommended not to add Anaconda to the PATH environment variable as this can interfere with other software. Instead, use Anaconda software by opening Anaconda Navigator or Anaconda Prompt from the Start menu.

7. Choose whether to register Anaconda as the default Python. Unless you plan to install and run multiple versions of Anaconda or multiple versions of Python, accept the default and leave this box checked.

8. Optional: To install PyCharm for Anaconda, click the link to https://www.anaconda.com/pycharm. Or, to install Anaconda without PyCharm, click the "next" button.

9. After successful installation, you will see the "Thank you for installing Anaconda" dialog box:

10. Click "Finish" to complete the installation.

11. Double-click to run Anaconda Navigator. When the following interface appears, the installation is complete.

Then click Notebook and the same web page will open.

5. Usage of Jupyter Notebook

Open a command-line terminal and enter the jupyter notebook command to start Jupyter Notebook.

The Jupyter Notebook interface will open in the browser.

Create a new notebook file: click the New button in the upper right corner and select Python 3 from the menu that appears.

A new page will open automatically, containing a code editing area.

Enter code in a code cell and click the Run button to execute it.

Each time you run a cell, a new one is created automatically below it. You can also click the + button to add one manually.

To delete a cell, select it and press d twice (this works in command mode, i.e., after pressing Esc).
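As a first experiment, here is the kind of cell you might run to try things out. The dataset is made up for the example, and only the standard library is used, so it runs in any fresh installation.

```python
# A typical first cell: compute summary statistics for a small,
# made-up dataset using only the Python standard library.
import statistics

temperatures = [21.5, 23.1, 19.8, 22.4, 20.7]  # hypothetical sample data

mean_t = round(statistics.mean(temperatures), 2)
stdev_t = round(statistics.stdev(temperatures), 2)

# In a notebook, the last expression of a cell is displayed as output
# automatically, without an explicit print().
(mean_t, stdev_t)
```

Running the cell displays the tuple of results directly below it, which is the notebook's automatic display of the cell's last expression.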

 

You may find that code completion does not appear automatically, but the feature does exist: press Tab to trigger auto-completion.

In the File menu you will find the save/export options, which let you save the notebook as different types of files.

Choose a format and the file will be generated automatically; just download it. After downloading, open it (the HTML version, for example) to view it.
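The same export can also be done from the command line with nbconvert, which is installed together with Jupyter. The filename below is a placeholder for your own notebook.

```shell
# Convert a notebook to a standalone HTML page.
# Other targets include: --to script, --to markdown, --to pdf.
jupyter nbconvert --to html my_notebook.ipynb
```

This is handy for batch-exporting many notebooks or for automating exports in a script.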

6. Conclusion 

I will continue to update this series about editors in the future, thanks♪(・ω・)ノ


Origin blog.csdn.net/m0_73552311/article/details/131510757