Small Ape Circle Shares: Data Analysis Tools

Many of us learn Python for software development, but some learn it to analyze data. Today Small Ape Circle shares a list of data analysis tools, hoping they make everyone's data analysis easier.

Handling larger, more complex Excel-like data:
• Pandas - a common toolkit for processing tabular (Excel-like) data
• SQLite - a tabular database format that can handle large data sets yet still runs in a desktop environment
• PostgreSQL - an enterprise-class database system

Processing spatial and geographic data:
• PostGIS - geospatial data type extension for Postgres
• Carto - commercial geospatial data mining tools
• Mapbox - a commercial mapping tool that is also a web mapping system
• Leaflet - a code library for building web-based maps of live and local data
• qGIS - a graphical tool suitable for almost all geospatial mapping and GIS work

Working with unconventional data:
• RethinkDB - a database that handles real-time data streams very well; it is moving from commercial to open source, so use it with care
• MongoDB - a popular database for large unstructured and semi-structured data; it needs extra care in a production environment
• CouchDB - somewhat similar to MongoDB, but not the same
• Cassandra - an open-source distributed database for large data sets

High-performance code for large data sets:
• Pandas - an open-source Python library for data analysis; its DataFrame structure greatly simplifies many complicated operations in the analysis process
• Apache Spark - a high-performance, general-purpose data processing system
• SciPy and Numpy - scriptable, C-based numerical algorithms that operate on compact data close to the underlying machine architecture
• Cython - a compiler that translates Python to C to improve performance
• PyOpenCL - numerical and statistical processing on the graphics card

Data cleaning tools
• ODO - a Python library for converting between different data formats
• OpenRefine - a data exploration and cleaning tool with a graphical user interface
• Pandas - a general-purpose Python toolset for handling tabular data in data science tasks
• Scrapy - a fast, high-level screen scraping and web crawling framework written in Python, used to crawl web sites and extract structured data from their pages
• BeautifulSoup - similar to Scrapy, but not the same
• Scrubadub - removes personally identifiable information
• Arrow - a Python library that makes dates and timestamps easy to handle
• DataCleaner - a Python library for cleaning messy data
• Dora - a Python library functionally similar to DataCleaner

Data visualization tools
• Processing - for developing interactive visual content. Recommended reading: Visualizing Data
• D3 - for developing interactive visualizations on the web
• C3 - charts built on D3
• Bokeh - similar to D3, but based on Python
• matplotlib - one of the oldest Python data visualization tools
• Leaflet - an open-source JavaScript library for building mobile-friendly interactive maps
• MapBox - see the mapping tools above
• qGIS - see the mapping tools above
• VTK - a heavy-duty visualization toolkit commonly used in medical and physics research

Data mining and machine learning tools
• Weka - a machine learning and data mining package that comes with a free, readable reference book
• SciKitLearn - a Python-based machine learning and data mining tool suite
• Orange - another Python-based data mining tool suite, which also has a graphical user interface
• TensorFlow - Google's open-source tool for mathematical modeling with multidimensional data flow graphs

Sharing, collaboration, and knowledge management tools
• Django - a Python-based web framework
• Django REST Framework - for creating REST APIs
• iRODS - enterprise data storage and management, including metadata management and rule-based data processing
• Cassandra (useful for metadata and relationship storage) - an open-source distributed data management system for storing and querying metadata
• GitLab - a frequently used open-source alternative to GitHub that can be set up on a private server
• ReciPy -
• Prov - a Python implementation of the W3C provenance model
• Kanren (very useful for deploying business logic based on metadata and data source information) - a Python logic programming system for declarative queries and rules, well suited to handling scientific metadata

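
As a small illustration of the SQLite entry in the list, here is a minimal sketch using Python's standard-library sqlite3 module. The table name `sales`, its columns, the helper function names, and the sample rows are all hypothetical and invented for this example; the point is only that grouping and summing can be pushed into the database instead of being done by hand.

```python
import sqlite3


def build_sales_db(rows):
    """Create an in-memory SQLite database and load (region, amount) rows."""
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    # Parameterized insert: one (region, amount) tuple per row.
    conn.executemany("INSERT INTO sales (region, amount) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def total_by_region(conn):
    """Return {region: total amount}, aggregated inside the database."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(cursor.fetchall())


# Hypothetical sample data, made up for this illustration.
sample_rows = [("north", 120.0), ("south", 80.5), ("north", 30.0)]
conn = build_sales_db(sample_rows)
totals = total_by_region(conn)
print(totals)
```

If Pandas is installed, the same connection can probably be handed to `pandas.read_sql_query` to get the query result as a DataFrame, which is one way the first two tools in the list fit together.
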
Well, that is all for today. I hope it helps. These tools relate to the daily work of every data analyst, and we hope they make complex work more flexible and convenient. If you found this useful, recommend it to your friends, and remember to bookmark and follow.

Reproduced from: https://juejin.im/post/5cef3402e51d4510b71da594

Origin blog.csdn.net/weixin_34246551/article/details/91449700