1. Background
When practicing classification, regression, and clustering on relatively simple data sets, we end up writing a lot of duplicated code. This code generation system addresses that: through a visual interface, you generate an exploratory data analysis report, choose the parameters for data cleaning, and configure a model, and the system generates the corresponding code.
Because the system is for personal use only, drag-and-drop configuration is not implemented. Instead, you pick methods from drop-down lists, and the code is generated from those selections.
2. System architecture
To make data sets easy to store, the system uses MongoDB for data storage.
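As a rough illustration of how an imported data set could be stored in MongoDB, the sketch below turns CSV text into one document per row and inserts the rows into a collection. The function names, the `aml` database, the `datasets` collection, and the connection URI are all hypothetical; the actual schema used by the project is not described here.

```python
import csv
import io


def rows_to_documents(csv_text, dataset_name):
    """Turn CSV text into a list of dicts, one document per row.

    Values stay as strings, exactly as csv.DictReader returns them.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"dataset": dataset_name, **row} for row in reader]


def save_dataset(collection, csv_text, dataset_name):
    """Insert every row into a pymongo-style collection; return the row count."""
    docs = rows_to_documents(csv_text, dataset_name)
    if docs:  # pymongo's insert_many rejects an empty list
        collection.insert_many(docs)
    return len(docs)


def connect_collection(uri="mongodb://localhost:27017"):
    """Open the (hypothetical) aml.datasets collection on an assumed local server."""
    from pymongo import MongoClient  # imported lazily; needs pymongo installed

    return MongoClient(uri)["aml"]["datasets"]
```

With a MongoDB server running, `save_dataset(connect_collection(), csv_text, "iris")` would store the rows under the data set name `iris`.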
The working process of the system is as follows.
3. System function interface
3.1 Data import/data analysis report generation
An exploratory data analysis report can be generated for any imported data set. The report is produced by pandas-profiling and presents the data from several angles: an overview of the analysis results (Overview), per-column descriptive statistics for each variable (Variables), visualized relationships between variables (Interactions), a correlation-matrix heat map across all variables (Correlations), the missing values in each column (Missing values), and a sample of the data (Sample).
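Generating such a report with pandas-profiling takes only a few lines. The sketch below is a minimal example, not the project's actual code: the function name `build_report`, the default file name, and the `report_cls` parameter (a hook for substituting another report class) are assumptions made for illustration.

```python
def build_report(df, output_path="report.html", title="EDA report", report_cls=None):
    """Write an HTML profiling report for a pandas DataFrame and return its path."""
    if report_cls is None:
        # Imported lazily so this module loads even without the package installed.
        from pandas_profiling import ProfileReport as report_cls

    report = report_cls(df, title=title)  # collects the statistics for each column
    report.to_file(output_path)           # renders the report as a standalone HTML file
    return output_path
```

For example, `build_report(pd.read_csv("iris.csv"))` would write `report.html` to the working directory. Newer releases of the package are distributed as `ydata-profiling`, in which case the import line changes accordingly.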
After clicking Generate Report, the exploratory data analysis interface looks as follows.
3.2 Data cleaning configuration
3.3 Model building interface configuration
The available models are configured through JSON files on the backend.
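The layout of these JSON files is not shown in this article, so the following is only a guess at what such a configuration might look like and how the drop-down options could be derived from it. The keys `label`, `module`, `class`, and `params` are hypothetical; the scikit-learn class names and parameters are real.

```python
import json

# Hypothetical configuration: one list of candidate models per task type.
MODEL_CONFIG = """
{
  "classification": [
    {"label": "Logistic regression",
     "module": "sklearn.linear_model",
     "class": "LogisticRegression",
     "params": {"C": 1.0, "max_iter": 200}},
    {"label": "Random forest",
     "module": "sklearn.ensemble",
     "class": "RandomForestClassifier",
     "params": {"n_estimators": 100}}
  ],
  "clustering": [
    {"label": "K-means",
     "module": "sklearn.cluster",
     "class": "KMeans",
     "params": {"n_clusters": 3}}
  ]
}
"""


def load_models(config_text):
    """Parse the JSON configuration into a dict keyed by task type."""
    return json.loads(config_text)


def dropdown_options(models, task):
    """Return the labels to show in the drop-down for one task type."""
    return [entry["label"] for entry in models.get(task, [])]
```

Keeping the model list in a file like this means a new model can be offered in the interface by adding an entry, without touching the interface code.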
3.4 Machine learning code generation
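One simple way to turn the selected options into code is to fill in a script template. The sketch below uses Python's `string.Template` for this; it is an illustrative assumption, not the project's generator, and the template text, the function name `render_script`, and its parameters are all hypothetical.

```python
from string import Template

# Skeleton of the generated script; $-placeholders are filled from the selections.
SCRIPT_TEMPLATE = Template('''\
import pandas as pd
from sklearn.model_selection import train_test_split
from $module import $cls

df = pd.read_csv($data_path)
X = df.drop(columns=[$target])
y = df[$target]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = $cls($params)
model.fit(X_train, y_train)
print("score:", model.score(X_test, y_test))
''')


def render_script(module, cls, params, data_path, target):
    """Return the source code of a training script for the chosen model."""
    # repr() keeps string values quoted so the generated code is valid Python.
    args = ", ".join(f"{name}={value!r}" for name, value in params.items())
    return SCRIPT_TEMPLATE.substitute(
        module=module,
        cls=cls,
        params=args,
        data_path=repr(data_path),
        target=repr(target),
    )
```

Calling `render_script("sklearn.ensemble", "RandomForestClassifier", {"n_estimators": 100}, "iris.csv", "species")` returns a complete script as a string, which can then be displayed in the interface or saved to a file. The template covers a supervised model only; a clustering model would need a variant without the target column and the train/test split.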
4. Source code acquisition
Search for the official account "A program tree" on WeChat and reply "aml" to get the source code address and detailed project documentation.