Recently my work has involved processing data from the test result table and the original table. I had not worked with this area before, so here is a summary of the knowledge and concepts I picked up along the way:
1. Installation Environment
Java (JDK 8), Python 3.6.5, PyCharm, IDEA, pyspark
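To confirm that the tools above are visible from Python, a quick check like the following may help. It is only a minimal sketch: the function name `describe_environment` is made up for illustration, and it only looks for a `java` executable on the PATH without verifying that it is JDK 8.

```python
import shutil
import sys


def describe_environment():
    """Return the running Python version and the location of the java executable."""
    return {
        # e.g. "3.6.5" for the interpreter listed above
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        # shutil.which returns None when no java executable is on the PATH
        "java_path": shutil.which("java"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(key, "=", value)
```

If `java_path` prints as `None`, Java is probably not installed or not on the PATH, and Spark will not be able to start.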
2. Understanding Jupyter:
Jupyter Notebook (formerly known as IPython Notebook) is an interactive notebook that can run more than 40 programming languages. It is a powerful tool for anyone who wants to write attractive, interactive documents.
At its core, Jupyter Notebook is a web application that makes it easy to create and share literate-programming documents, with support for live code, mathematical equations, visualizations, and Markdown. Typical uses include data cleaning and transformation, numerical simulation, statistical modeling, and machine learning.
If creating a new Python notebook in a local Jupyter instance fails with an error, the cause is most often a permissions problem!
3. What is the difference between Anaconda and PyCharm?
A plain Python installation does not include packages such as numpy, matplotlib, scipy, and scikit-learn. Each one has to be installed with pip before it can be imported; for example, entering pip3 install numpy in a cmd terminal installs the numpy package.
Installing every extra package one at a time is a little tedious, and this is where Anaconda helps. Anaconda is a Python distribution that bundles a large number of packages, so using it removes the need to install most of them separately.
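To see which of these packages the current interpreter can already import, something like the following could be used. It is a rough sketch: the helper name `missing_packages` and the `PACKAGES` list are made up for illustration. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

# Import names of the packages mentioned above (scikit-learn imports as "sklearn")
PACKAGES = ["numpy", "matplotlib", "scipy", "sklearn"]


def missing_packages(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(PACKAGES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed packages are available.")
```

On a plain Python installation this will usually report several missing packages, while under Anaconda the list is likely to be empty.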
pyspark: starts the Spark shell with a Python interpreter.
sparkR: starts the Spark shell with an R interpreter.