Big data learning with Python: test summary

Recently I have been doing data processing work involving a test result table and the original source table. I had not worked with this area before, so here is a summary of the knowledge and concepts I learned:

1. Installation Environment

Java (JDK 8), Python 3.6.5, PyCharm, IDEA, pyspark
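As a quick sanity check of this environment, the sketch below (assuming pyspark was installed with pip) prints the Python and pyspark versions:

    # Quick environment sanity check; assumes pyspark was installed via pip.
    import sys
    print(sys.version)            # should report 3.6.5 for this setup

    import pyspark
    print(pyspark.__version__)    # confirms pyspark is importable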

2. Jupyter: conceptual understanding

      Jupyter Notebook (formerly known as the IPython notebook) is an interactive notebook environment that supports more than 40 programming languages. This section describes the main features of the Jupyter notebook and why it is a powerful tool for anyone who wants to write beautiful interactive documents.
     At its core, Jupyter Notebook is a web application that makes it easy to create and share literate programming documents combining live code, mathematical equations, visualizations, and Markdown. Typical applications include data cleaning and transformation, numerical simulation, statistical modeling, and machine learning.
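As a small illustration of "live code plus visualization", the following minimal sketch could be run in a single notebook cell (it assumes numpy and matplotlib are installed):

    # A minimal notebook cell: live code with an inline visualization.
    # Assumes numpy and matplotlib are available in the environment.
    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(0, 2 * np.pi, 100)   # 100 points over one period
    plt.plot(x, np.sin(x))               # Jupyter renders the figure inline
    plt.title("sin(x)")
    plt.show()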

If creating a new Python notebook in a local Jupyter instance raises an error, it is most often a file permission problem!

3. What is the difference between Anaconda and PyCharm?

A plain Python installation lacks packages such as numpy, matplotlib, scipy, scikit-learn, and so on; we need to install each package with pip before it can be imported. For example, entering pip3 install numpy in a terminal (cmd) installs the numpy package.
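Once installed, a quick import in the Python interpreter confirms the package is usable (a minimal check):

    # Verify that the pip-installed package can actually be imported.
    import numpy as np

    print(np.__version__)          # the installed numpy version
    print(np.arange(5).mean())     # 2.0 -- a trivial smoke test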

Installing every additional package by hand is a little tedious, and this is where Anaconda helps. Anaconda is a Python distribution that bundles a large number of packages, so using Anaconda removes the need to install most required packages separately.

PyCharm is a Python IDE with a set of tools that boost productivity when developing in Python, such as debugging, syntax highlighting, project management, code navigation, IntelliSense, auto-completion, unit testing, and version control. In addition, the IDE provides advanced features for professional web development with the Django framework.
Note: after installing Anaconda, there is no need to install a separate Python.
4. Spark
Spark is an open-source cluster computing framework, implemented in Scala, that performs computation on large datasets in memory. It provides call interfaces for Java, Scala, Python, R, and other languages.
spark-shell: starts Spark with the Scala interpreter.
pyspark: starts Spark with the Python interpreter (see the sketch after this list).
sparkR: starts Spark with the R interpreter.
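For example, a minimal pyspark word count might look like the sketch below (assuming a local Spark installation with pyspark 2.x or later; "data.txt" is a hypothetical input file):

    # Minimal PySpark word count over a local text file (a sketch).
    # "data.txt" is a hypothetical file; adjust the path as needed.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .master("local[*]")          # run Spark locally on all cores
             .appName("word-count")
             .getOrCreate())

    lines = spark.read.text("data.txt").rdd.map(lambda row: row[0])
    counts = (lines.flatMap(lambda line: line.split())
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))

    for word, count in counts.collect():
        print(word, count)

    spark.stop()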
5. Scala
Scala is a multi-paradigm programming language, similar to Java. Its design intent is to be a scalable language that integrates the features of both object-oriented programming and functional programming. Scala runs on the Java platform (the Java Virtual Machine) and is compatible with existing Java programs. It can also run on Java ME CLDC (Java Platform, Micro Edition, Connected Limited Device Configuration).
6. Groovy
Groovy is an agile dynamic language for the Java Virtual Machine. It is a mature object-oriented programming language that can be used both for object-oriented programming and as a pure scripting language. The language avoids excessive boilerplate coding while also offering closures and other features of dynamic languages.
 
Java, C++, C, Python, Go, PHP, Shell, Scala, Groovy

 
