This article is reprinted from the CSDN public account. Compiled by Tu Min.
In the era of data explosion, should you use Excel or Python for data analysis?
Good news! You no longer have to choose between them.
Just yesterday, Microsoft announced on its official blog that it has natively integrated Python into a public preview of Excel. Data analysts, engineers, marketers, and data science students can now use Python code and libraries directly in Excel to perform complex statistical analysis, advanced visualization, predictive analytics, machine learning, and more.
Excel and Python can be used simultaneously
“You can manipulate and explore data in Excel using Python plots and libraries, and then further refine your insights using Excel’s formulas, charts, and PivotTables,” explains Stefan Kinnestrand, General Manager of Modern Work at Microsoft. “Now, you can access Python directly from the Excel ribbon and perform advanced data analysis in the familiar Excel environment.”
Seamlessly aggregate and visualize your data in Excel using Python
Microsoft's official blog describes what sets Python in Excel apart along several dimensions. Let's take a look.
No need to download any software, use Python code directly in Excel
First, it's worth noting that Python in Excel is natively integrated into the Excel grid.
Ordinary users can therefore open a workbook, type the =PY function into a cell, and enter Python code right there, without installing any other software.
Getting started takes only a few clicks. This lowers the barrier to coding and also improves work efficiency.
In addition, users can also use Excel's built-in connectors and Power Query to directly introduce external data into Python in Excel workflows. What's more, Python in Excel is compatible with the tools users already know and love, such as formulas, PivotTables, and Excel charts.
Built on a third-party open source Python distribution
So how is this feature implemented?
Microsoft explained that the new feature uses the Anaconda distribution (https://www.anaconda.com/download), an open source Python distribution aimed at data scientists and engineers that is also friendly to beginners.
Anaconda includes many prepackaged libraries and packages such as pandas, Matplotlib, scikit-learn, NumPy, and SciPy.
Python in Excel leverages the Anaconda Distribution for Python running in Azure, which is securely built, tested, and supported by Anaconda. This Anaconda-provided Python powers the various analyses available in Python in Excel.
How is security ensured?
In addition, to ensure security, Python in Excel runs on the Microsoft cloud, and calculation results are returned to the worksheet, including charts and visualizations. This gives users enterprise-grade security in their Microsoft 365 experience.
Python code runs in its own hypervisor-isolated container using Azure Container Instances, with secure, source-built packages from Anaconda delivered through a secure software supply chain.
Python in Excel protects users' data privacy by preventing Python code from knowing the individual user's identity, and workbooks from the Internet are opened in further isolation within their own separate containers.
Data in a workbook can only be sent through the built-in xl() Python function, and output from Python code can only be returned as the result of the =PY() Excel function.
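The data flow described above can be sketched outside Excel. This is only a minimal illustration, not Microsoft's implementation: the range address, column names, and values are made up, and a hand-built pandas DataFrame stands in for what xl() would hand to Python inside a =PY() cell.

```python
import pandas as pd

# Stand-in for a worksheet range. Inside Excel, this DataFrame would instead
# come from the built-in reference function, for example:
#     df = xl("A1:B5", headers=True)
# The range address, column names, and values here are hypothetical.
df = pd.DataFrame(
    {
        "Region": ["East", "West", "East", "West"],
        "Sales": [120, 80, 100, 60],
    }
)

# Aggregate sales by region. In a =PY() cell, a result like this is what
# would be returned to the worksheet.
totals = df.groupby("Region")["Sales"].sum()
print(totals)
```

In Excel, only the code between the stand-in DataFrame and the final result would be typed into the cell; the print call is there so the sketch shows something when run as an ordinary script.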
Seamless collaboration between the same team
Better still, Microsoft says on its blog, members of the same team can seamlessly interact with and refresh Python in Excel analyses without having to worry about installing additional tools or Python runtimes, or about managing libraries and their dependencies.
Users can share workbooks using their favorite collaboration tools, such as Microsoft Teams and Microsoft Outlook, and collaborate seamlessly with comments, @mentions, and co-authoring with colleagues just like in Excel.
What exactly can Python in Excel do?
On the blog, Microsoft also gave several examples of what Python in Excel can be used for.
First, visualization.
In Excel, users can directly use Python charting libraries such as Matplotlib and seaborn to create a wide variety of charts, from traditional bar charts and line charts to heat maps, violin plots, and swarm plots.
Second, machine learning and predictive analysis.
Leverage the power of Python libraries like scikit-learn and statsmodels to apply popular machine learning, predictive analytics, and forecasting techniques such as regression analysis, time series modeling, and more.
Machine learning model for weather prediction using Python and Excel LAMBDA
Third, data cleaning.
Effectively utilize advanced data cleaning techniques such as finding missing values, standardizing formats, removing duplicates, and pattern-based transformations using techniques such as regular expressions.
Extract date using regular expression
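The cleaning steps mentioned above (removing duplicates, flagging missing values, and pattern-based extraction) can be sketched with the standard library alone. The sample notes and the ISO-style date pattern are assumptions made for illustration; real worksheet text would likely need a pattern tailored to its own date formats.

```python
import re
from datetime import date

# Hypothetical free-text cells; in Excel these might come from a worksheet range.
notes = [
    "Order shipped 2023-08-22 to Seattle",
    "Invoice dated 2023-07-04, paid late",
    "No date recorded",
    "Order shipped 2023-08-22 to Seattle",  # exact duplicate of the first row
]

# Matches dates written as YYYY-MM-DD and captures year, month, and day.
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def extract_date(text):
    """Return the first YYYY-MM-DD date found in text, or None if there is none."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    year, month, day = map(int, match.groups())
    return date(year, month, day)


# Remove exact duplicates while preserving the original order.
unique_notes = list(dict.fromkeys(notes))

# Extract a date from each remaining note; None marks a missing value.
dates = [extract_date(note) for note in unique_notes]
print(dates)
```

Returning None for rows without a date keeps missing values explicit, so a later step can decide whether to drop or fill them.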
Guido van Rossum, the father of Python: When I joined Microsoft three years ago, I never dreamed of this possibility
Microsoft's ability to integrate a mainstream programming language like Python into mainstream office software may owe a great deal to the efforts of Guido van Rossum, the father of Python.
In November 2020, finding retirement too dull, Guido van Rossum tweeted: "Retirement life is so boring, I decided to join Microsoft's developer department!"
At the time, he revealed only that he had joined Microsoft to continue developing and improving Python and to make it better to use, and not just on Windows.
Three years later, we can see Microsoft deeply embracing Python. According to Microsoft, Guido van Rossum helped define the architecture of Python in Excel for this release.
Guido van Rossum also said, "I'm delighted that this great, tight integration of Python and Excel can now finally be put to use. I hope that both communities will find interesting new uses for this collaboration that enhance the capabilities of each partner. When I joined Microsoft three years ago, I never dreamed this would be possible. The Excel team is outstanding!"
Developers are excited
Python in Excel is currently rolling out to Microsoft 365 Insiders, starting with build 16818. For now it is available only in the desktop version of Excel, and Microsoft says it will reach other platforms at a later date.
If you have already joined the Microsoft 365 Insider Program (https://insider.microsoft365.com/join/windows), you only need to install the latest Insider version of Excel, open a blank workbook, and then follow the steps below to try it out.
Select Formulas in the ribbon.
Choose Insert Python.
Select the Try Preview button in the dialog box that appears.
However, Microsoft also said that Python in Excel will be included with Microsoft 365 subscriptions during the preview period, but that after the preview ends, "some features will be limited without a paid license."
Most developers are excited about this feature release.
A former Microsoft employee commented on Hacker News (HN):
“As a former Excel developer who tried to bring Python to Excel, I was pleasantly surprised to read this article today.
More than 7 years ago, I chose to leave the Excel team. My boss's boss at the time knew I was interested in bringing Python to Excel and offered me a chance if I chose to stay. However, what was originally a 6-month project turned into a 3-year project, the Python part gradually disappeared, and we ended up enabling JavaScript custom functions in Excel.
As far as Python is concerned, we were also running in the "cloud" at that time (AzureML v1), although we later had some back-and-forth discussions about whether it should run locally. I think what made the Python part go away was that our partner AzureML team went through a reorganization, a relaunch, and rehiring, and we lost a PM. Our work also came to the attention of another partner team, who realized they could use our code to execute their JavaScript out of process. So I spent a lot of time trying to ensure a successful release of that feature, which, I thought, was doing Python a disservice.
I had help from some great engineers and learned a lot. At the heart of this work was modifying Excel's calculation engine to allow functions to calculate asynchronously, so users can continue working on other parts of the spreadsheet while a remote endpoint (whether JavaScript, Python, or otherwise) calculates. Previously, the spreadsheet would lock while a calculation was running, which wasn't great for long-running or unbounded calculations. I don't know whether any of the features we were building at the time were incorporated into this new feature.
Now I am very happy to see this and look forward to trying it out.”
Another developer @cableshaft said:
“I hope it can also run Python locally rather than only in the Microsoft cloud, but regardless, I think this is a major effort that will give Excel a big push toward modernization.
It reminds me of a project I worked on before, which was to build an analytics website that only a few people used internally. If something like this had existed at the time, it would have served their needs well.”
However, some developers said that Python's arrival in Excel could be the final straw for VBA macros.
Finally, would native integration of Python into Excel be useful to you? Feel free to share your thoughts.
References:
https://techcommunity.microsoft.com/t5/microsoft-365-blog/introducing-python-in-excel-the-best-of-both-worlds-for-data/ba-p/3905482
https://techcommunity.microsoft.com/t5/excel-blog/announcing-python-in-excel-combining-the-power-of-python-and-the/ba-p/3893439
https://news.ycombinator.com/item?id=37222191