DataCamp Data Scientist with Python track: study notes (代码天地)


Category: Programming Languages | Posted: 2018-11-21 20:31:25

Importing Data in Python:

Customizing your pandas import:

# Import pandas as pd and matplotlib.pyplot as plt
import pandas as pd
import matplotlib.pyplot as plt

# Assign filename: file
file = 'titanic_corrupt.txt'

# Import file: data
data = pd.read_csv(file, sep='\t', comment='#', na_values='Nothing')

# Print the head of the DataFrame
print(data.head())

# Plot 'Age' variable in a histogram
data[['Age']].hist()
plt.xlabel('Age (years)')
plt.ylabel('count')
plt.show()

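The strings that pandas treats as missing values by default do not always cover what a file actually uses. Setting na_values tells read_csv to treat additional values as missing: in the statement above, every occurrence of 'Nothing' is read in as NaN.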

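'sep' is the pandas name for the field delimiter ('delim'); here it is '\t' because the file is tab-delimited. comment='#' tells pandas to ignore everything after a '#' on a line.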

data.head()  # Returns the first 5 rows by default; pass an integer, e.g. data.head(10), for a different number of rows.
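The options discussed in these notes (sep, comment, na_values, and the row count passed to head) can be tried without the course file. The sketch below is only an illustration: the in-memory text and its column names are made up for the example and stand in for titanic_corrupt.txt, whose real contents are not shown in these notes.

```python
import io

import pandas as pd

# Hypothetical stand-in for a tab-delimited file that contains a comment
# line and uses the string 'Nothing' to mark a missing value.
raw = (
    "Name\tAge\n"
    "Ann\t29\n"
    "# this whole line is a comment and should be skipped\n"
    "Bob\tNothing\n"
    "Cy\t41\n"
)

# sep='\t'             -> fields are separated by tabs
# comment='#'          -> ignore everything after a '#'
# na_values='Nothing'  -> read the string 'Nothing' as NaN
df = pd.read_csv(io.StringIO(raw), sep="\t", comment="#", na_values="Nothing")

# head(2) returns only the first two rows instead of the default five
print(df.head(2))

# Flag which 'Age' entries were read as missing
print(df["Age"].isna())
```

Because 'Nothing' becomes NaN, the 'Age' column is parsed as numeric rather than as text, which is what makes a histogram of that column possible.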

Introduction to other file types:

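pickle provides simple persistence: it can store an object on disk as a file. Almost every Python data type (lists, dictionaries, sets, classes, and so on) can be serialized with pickle, but the serialized data is not human-readable.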

If you only need to load such objects back into Python, you can serialize them. All this means is converting the object into a sequence of bytes, or a bytestream.
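A minimal sketch of that round trip, using only the standard library. The dictionary and the file name record.pkl are made up for the example; they are not from the course material.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical object to persist: a dict holding a list and a set
record = {"name": "Ann", "ages": [29, 41], "tags": {"a", "b"}}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "record.pkl"

    # Serialize: write the object to disk as a bytestream ('wb' = write binary)
    with open(path, "wb") as f:
        pickle.dump(record, f)

    # Deserialize: read the bytes back into an equivalent Python object
    with open(path, "rb") as f:
        restored = pickle.load(f)

# dumps() returns the same kind of bytestream in memory instead of a file
payload = pickle.dumps(record)

print(type(payload))
print(restored == record)
```

The file must be opened in binary mode for both writing and reading, since the pickled data is bytes rather than text. Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.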

Customizing your spreadsheet import:

# Parse the first sheet and rename the columns: df1
# (xl is assumed to be a pandas.ExcelFile created earlier, e.g. xl = pd.ExcelFile(file))
df1 = xl.parse(0, skiprows=[0], names=['Country', 'AAM due to War (2002)'])

# Print the head of the DataFrame df1
print(df1.head())

# Parse the first column of the second sheet and rename the column: df2
# (usecols replaces the parse_cols argument, which newer pandas versions no longer accept)
df2 = xl.parse(1, usecols=[0], skiprows=[0], names=['Country'])

# Print the head of the DataFrame df2
print(df2.head())


Reposted from blog.csdn.net/weixin_41803041/article/details/84316784
