Preface
Python is very powerful: operations that are complicated in Excel often take only one or two statements in Python. However, Python has a high barrier to entry, because it requires installing software and various libraries. Often we only want to use Python casually. Is there a simpler way? Smartbi intelligent analysis provides a web-based Python feature that lets you run Python scripts in the browser without installing any software. It also provides an Excel plug-in that makes it easy to bring the results of a Python run back into Excel.
Let's take basic data filtering as an example, so that you can quickly get started with the Python feature of Smartbi intelligent analysis and get a feel for data analysis in Python.
Data source
First we need a sample Excel file. Here I have chosen an e-commerce order list. The data has been anonymized:
Filtering in Excel
To filter in Excel, we would normally use Excel's built-in filter feature and tick the values we want in the filter pop-up. For example, suppose we want to keep rows where the "order date" is in 2010 and the "order level" is "advanced":
Filtering with Python in Smartbi intelligent analysis
Importing data
Because Smartbi intelligent analysis runs its ETL on the web, the first step is to import the local data source into the Smartbi intelligent analysis platform. This can be done quickly from the data connection interface. Besides local data files, Smartbi intelligent analysis also supports connections to relational data sources such as MySQL and Alibaba Cloud. Open the ETL interface, drag the relational data source component into the display area, and locate your data source by its storage path:
Right-click the relational data source and choose "view output" to preview the data it produces:
Python
Once the data connection is set up, you can use the Python script component built into the Smartbi intelligent analysis ETL to process the data. First drag the Python component into the middle display area and connect it to the data source above:
Click the Python input box to open it. The editor is much like the one in PyCharm and similar tools, so anyone familiar with Python can get started easily. The system has already filled in some default code, so you do not need to write it yourself. As you can see, the pre-written script mainly imports the numpy and pandas libraries and defines functions:
As described above, we need to filter on two fields: order date and order level. We define two variables, cond and cond1. cond uses pandas's pd.to_datetime() function, which converts values to datetimes, and then uses dt.year to test whether the year equals 2010. cond1 tests whether column4 equals "advanced". Because both conditions must hold (a logical AND), the two conditions are combined with & when filtering.
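The filter described above can be sketched in plain pandas as follows. This is a minimal illustration, not the exact script from the Smartbi editor: the sample rows and the column names order_id, order_date and order_level are assumptions made here (the platform script refers to the order level as column4).

```python
import pandas as pd

# Hypothetical sample rows; the column names and values are assumptions
# for illustration, not the real order list.
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "order_date": ["2010-03-15", "2011-07-01", "2010-11-30", "2010-05-20"],
    "order_level": ["advanced", "advanced", "low", "advanced"],
})

# Condition 1: the year of the order date equals 2010.
cond = pd.to_datetime(df["order_date"]).dt.year == 2010
# Condition 2: the order level equals "advanced".
cond1 = df["order_level"] == "advanced"
# Both conditions must hold, so combine the two boolean masks with &.
result = df[cond & cond1]

print(result)
```

If the comparisons are written inline rather than stored in variables, each one needs its own parentheses, for example df[(a == 1) & (b == 2)], because & binds more tightly than == in Python.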
After writing the script, click OK. Finally, let's look at the result of the Python script: the order dates in the second column are all in 2010, and the order level is "advanced" in every row. With just 3 lines of code, our filtering requirement is met:
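Instead of checking the preview by eye, the same result can be verified in code. The sketch below is one possible check, assuming the filtered output is available as a pandas DataFrame; the helper name check_filtered and the column names are hypothetical.

```python
import pandas as pd

def check_filtered(df, date_col, level_col, year, level):
    """Return True if every row has the given year and order level."""
    years = pd.to_datetime(df[date_col]).dt.year
    return bool((years == year).all() and (df[level_col] == level).all())

# Hypothetical filtered output, for illustration only.
filtered = pd.DataFrame({
    "order_date": ["2010-03-15", "2010-05-20"],
    "order_level": ["advanced", "advanced"],
})

# A frame that still contains a row from another year should fail the check.
unfiltered = pd.DataFrame({
    "order_date": ["2010-03-15", "2011-07-01"],
    "order_level": ["advanced", "advanced"],
})

ok = check_filtered(filtered, "order_date", "order_level", 2010, "advanced")
bad = check_filtered(unfiltered, "order_date", "order_level", 2010, "advanced")
print(ok, bad)
```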
Data output
Next, we use the ETL output function to save the result. To make the data easy to use from Excel, the output type chosen here is a data set. First, drag "Output to Data Set" in the ETL interface into the middle display area and connect it to the components above:
Then right-click "Output to Data Set" and select "Execute to Here". The data set is then saved to the Smartbi intelligent analysis cloud:
Bringing the data back to Excel
Finally, return to Excel, find the newly saved data set in the data set panel, click it, and drag its fields into the worksheet with the mouse:
Click Refresh on the toolbar and all the data is refreshed. We can see that the "Order Date" and "Order Level" fields now show the filtered result: