Python Basics (Part 6)

Loops, Lists, and Dictionaries

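Before you start using for loops, you need somewhere to store the results of the loop. The best way to do that is with a list, which is simply a container that holds things in order. First, let's see how to create lists: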

hairs = ['brown', 'blond', 'red']
eyes = ['brown', 'blue', 'green']
weights = [1, 2, 3, 4]

Now we'll build some lists using loops and print them out:

the_count = [1, 2, 3, 4, 5]
fruits = ['apples', 'oranges', 'pears', 'apricots']
change = [1, 'pennies', 2, 'dimes', 3, 'quarters']

# the first for loop goes through a list
for number in the_count:
    print ("This is count %d" % number)

# a for loop going through a list of strings
for fruit in fruits:
    print ("A fruit of type: %s" % fruit)

# we can also go through mixed lists
# note that we have to use %r because we don't know what's in it
for i in change:
    print ("I got %r" % i)

# we can also build lists; first start with an empty one
elements = []

# then use the range function to count from 0 to 5
for i in range(0, 6):
    print ("Adding %d to the list." % i)
    # append is a function that lists understand
    elements.append(i)

# now we can print them out too
for i in elements:
    print ("Element was: %d" % i)

Output

This is count 1
This is count 2
This is count 3
This is count 4
This is count 5
A fruit of type: apples
A fruit of type: oranges
A fruit of type: pears
A fruit of type: apricots
I got 1
I got 'pennies'
I got 2
I got 'dimes'
I got 3
I got 'quarters'
Adding 0 to the list.
Adding 1 to the list.
Adding 2 to the list.
Adding 3 to the list.
Adding 4 to the list.
Adding 5 to the list.
Element was: 0
Element was: 1
Element was: 2
Element was: 3
Element was: 4
Element was: 5

Frequently Asked Questions
Q: How do you make a two-dimensional (2D) list?

It's a list inside another list, for example [[1,2,3],[4,5,6]].
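As a small illustration (the names grid, first_row, and item are made up for this sketch), a nested list can be indexed one level at a time, and two nested for loops visit every value:

```python
# A 2D list: each inner list is one row.
grid = [[1, 2, 3], [4, 5, 6]]

first_row = grid[0]   # the first inner list
item = grid[1][2]     # second row, third column

# The outer loop walks the rows, the inner loop walks each row's values.
for row in grid:
    for value in row:
        print(value)
```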

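Q: Aren't lists and arrays the same thing?

It depends on the language and the implementation. In the classic sense, lists and arrays are very different because of how they are implemented. In Ruby, programmers call them arrays. In Python, they call them lists. Since Python is what we're using, we'll call them lists.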

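Q: Why can a for loop use a variable that hasn't been defined yet?

The for loop defines the variable itself when it starts, and sets it to the current element on each pass through the loop.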

Q: Why does for i in range(1, 3): only loop twice?

range() does not include the last number, so range(1, 3) stops at 2. This is the most common way to write this kind of loop.
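A quick way to check this is to turn the range into a list. This is only a sketch; the variable names short and inclusive are chosen here for illustration:

```python
# range(start, stop) counts from start up to, but not including, stop.
short = list(range(1, 3))

# To include 3 as well, make the stop value one larger.
inclusive = list(range(1, 3 + 1))

print(short)
print(inclusive)
```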

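Q: What does elements.append() do?

It appends an item to the end of the list. Open the Python interpreter and experiment with a list of your own. Whenever you run into a question like this, try it out in the interpreter and find the answer yourself.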
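The kind of experiment suggested above might look like the following sketch (the list letters and the name result are invented for the example). It also shows a detail that often surprises beginners: append changes the list in place and returns None.

```python
letters = ['a', 'b']

# append adds one item to the end of the existing list
letters.append('c')

# append modifies the list in place, so its return value is None
result = letters.append('d')

print(letters)
print(result)
```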

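while Loops

A while loop keeps executing the code block under it until its boolean expression becomes False. The problem with while loops is that sometimes they never stop. To avoid this, follow these rules:

  1. Use while loops sparingly; most of the time a for loop is the better choice.
  2. Double-check your while statement and make sure its boolean expression will eventually become False.
  3. When in doubt, print out your test variable at the end of the while loop to see how it changes.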

i = 0
numbers = []

while i < 6:
    print ("At the top i is %d" % i)
    numbers.append(i)

    i = i + 1
    print ("Numbers now: ", numbers)
    print ("At the bottom i is %d" % i)

print ("The numbers: ")

for num in numbers:
    print (num)

Output

At the top i is 0
Numbers now:  [0]
At the bottom i is 1
At the top i is 1
Numbers now:  [0, 1]
At the bottom i is 2
At the top i is 2
Numbers now:  [0, 1, 2]
At the bottom i is 3
At the top i is 3
Numbers now:  [0, 1, 2, 3]
At the bottom i is 4
At the top i is 4
Numbers now:  [0, 1, 2, 3, 4]
At the bottom i is 5
At the top i is 5
Numbers now:  [0, 1, 2, 3, 4, 5]
At the bottom i is 6
The numbers: 
0
1
2
3
4
5

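Frequently Asked Questions
Q: What's the difference between a for loop and a while loop?

A for loop can only iterate over a collection of things, whereas a while loop can do any kind of looping. However, while loops are easier to get wrong, and in most cases a for loop does the job just as well.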
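To make the comparison concrete, here is a sketch that builds the same list both ways (numbers_for and numbers_while are names made up for this example). The while version needs a counter that you must remember to update yourself, which is exactly where mistakes creep in:

```python
# for loop: range supplies each value, nothing to update by hand
numbers_for = []
for i in range(0, 6):
    numbers_for.append(i)

# while loop: we create, test, and increment the counter ourselves
numbers_while = []
i = 0
while i < 6:
    numbers_while.append(i)
    i = i + 1   # forgetting this line would make the loop run forever

print(numbers_for == numbers_while)
```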

Accessing List Elements

List indexes start at 0, so this is how you access the first element:

animals = ['bear', 'tiger', 'penguin', 'zebra']
bear = animals[0]
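A few more lookups on the same list, as a sketch (the variable names first, third, last, and count are chosen for illustration). Negative indexes count from the end of the list:

```python
animals = ['bear', 'tiger', 'penguin', 'zebra']

first = animals[0]     # index 0 is the first element
third = animals[2]     # index 2 is the third element
last = animals[-1]     # -1 is the last element
count = len(animals)   # number of elements; valid indexes are 0 to count - 1

print(first, third, last, count)
```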

Branches and Functions

from sys import exit

def gold_room():
    print ("This room is full of gold.  How much do you take?")

    choice = input("> ")
    # accept any input made up only of digits (the old test rejected "25")
    if choice.isdigit():
        how_much = int(choice)
    else:
        dead("Man, learn to type a number.")

    if how_much < 50:
        print ("Nice, you're not greedy, you win!")
        exit(0)
    else:
        dead("You greedy bastard!")

def bear_room():
    print ("There is a bear here.")
    print ("The bear has a bunch of honey.")
    print ("The fat bear is in front of another door.")
    print ("How are you going to move the bear?")
    bear_moved = False

    while True:
        choice = input("> ")

        if choice == "take honey":
            dead("The bear looks at you then slaps your face off.")
        elif choice == "taunt bear" and not bear_moved:
            print ("The bear has moved from the door. You can go through it now.")
            bear_moved = True
        elif choice == "taunt bear" and bear_moved:
            dead("The bear gets pissed off and chews your leg off.")
        elif choice == "open door" and bear_moved:
            gold_room()
        else:
            print ("I got no idea what that means.")

def cthulhu_room():
    print ("Here you see the great evil Cthulhu.")
    print ("He, it, whatever stares at you and you go insane.")
    print ("Do you flee for your life or eat your head?")

    choice = input("> ")

    if "flee" in choice:
        start()
    elif "head" in choice:
        dead("Well that was tasty!")
    else:
        cthulhu_room()

def dead(why):
    print (why, "Good job!")
    exit(0)

def start():
    print ("You are in a dark room.")
    print ("There is a door to your right and left.")
    print ("Which one do you take?")

    choice = input("> ")

    if choice == "left":
        bear_room()
    elif choice == "right":
        cthulhu_room()
    else:
        dead("You stumble around the room until you starve.")

start()

You are in a dark room.
There is a door to your right and left.
Which one do you take?
>  left
There is a bear here.
The bear has a bunch of honey.
The fat bear is in front of another door.
How are you going to move the bear?
>  taunt bear
The bear has moved from the door. You can go through it now.
>  open door
This room is full of gold.  How much do you take?
>  1000
You greedy bastard! Good job!

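Rules for if Statements

Every if statement must have an else.
If this else should never run because it doesn't make sense, then you must call a die function inside it that prints an error message and exits, much like the dead function in the previous exercise. This will find many errors.
Never nest if statements more than two deep, and always try to keep them one deep.
Treat if statements like paragraphs, where each if-elif-else grouping is like a set of sentences. Put a blank line before and after each grouping.
Your boolean tests should be simple. If they are complex, move their calculations into variables beforehand and give each variable a good name.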
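A minimal sketch of these rules in practice. The functions die and door_to_room are hypothetical helpers written for this example, not part of the game above: the boolean tests are stored in well-named variables, the if-elif-else group is set off by blank lines, and the else branch that should never run calls die.

```python
def die(message):
    """Hypothetical helper: report a case that should never happen, then exit."""
    raise SystemExit("ERROR: " + message)

def door_to_room(choice):
    # give each boolean test a descriptive name
    went_left = choice == "left"
    went_right = choice == "right"

    if went_left:
        return "bear_room"
    elif went_right:
        return "cthulhu_room"
    else:
        die("unexpected door: %r" % choice)

print(door_to_room("left"))
```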

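Rules for Loops

Use a while loop only to loop forever, which means you will probably never need one. This applies only to Python; other languages are different.
Use a for loop for every other kind of looping, especially when there is a fixed or limited number of things to loop over.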

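Tips for Debugging

Do not use a "debugger". A debugger is like doing a full-body scan on a sick person. You don't get any specific, useful information, and most of what it outputs is either unhelpful or just confusing.
The best way to debug a program is to use print to print out the values of key variables at the points you want to check, and see where they go wrong.
Make sure parts of your program work as you go. Don't write a long script and only then try to run it. Write a little, run a little, fix a little.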
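As an illustration of print-based debugging, the sketch below prints the key variables on every pass through a loop. The function average and the DEBUG prefix are assumptions made for this example; any wording that helps you spot the lines later will do.

```python
def average(values):
    total = 0
    for v in values:
        total = total + v
        # print the key variables at the point we want to check
        print("DEBUG: v=%r total=%r" % (v, total))
    return total / len(values)

result = average([2, 4, 6])
print("The average is", result)
```

Once the function behaves correctly, delete the DEBUG line or comment it out.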

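Dictionaries

Python calls this data type a "dict"; other languages call it a "hash". I'll use both names, but that doesn't matter. What matters is how they differ from lists. With a list you can do this: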

>>> things = ['a', 'b', 'c', 'd']
>>> print(things[1])
b
>>> things[1] = 'z'
>>> print(things[1])
z
>>> things
['a', 'z', 'c', 'd']

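You can use numbers to index into a list, meaning you can find an element in the list by its number. You should already know this about lists, and you should also understand that you can only use numbers to get items out of a list.

What a dict does is let you use anything, not just numbers, to find elements. Yes, a dict associates one thing with another, no matter what their types are. Let's take a look: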

>>> stuff = {'name': 'Zed', 'age': 39, 'height': 6 * 12 + 2}
>>> print(stuff['name'])
Zed
>>> print(stuff['age'])
39
>>> print(stuff['height'])
74
>>> stuff['city'] = "San Francisco"
>>> print(stuff['city'])
San Francisco

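You can see that, instead of just numbers, we can use strings to get things out of the stuff dict. We can also use strings to put new things into it. It doesn't have to be strings, though. We can also do this: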

>>> stuff[1] = "Wow"
>>> stuff[2] = "Neato"
>>> print(stuff[1])
Wow
>>> print(stuff[2])
Neato
>>> stuff
{'name': 'Zed', 'age': 39, 'height': 74, 'city': 'San Francisco', 1: 'Wow', 2: 'Neato'}

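This code uses numbers as keys, and when stuff is printed you can see that the dict's keys include both numbers and strings. In fact you can use almost anything as a key. "Anything" is not strictly accurate, because a key must be hashable; a list, for example, cannot be a key.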
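Here is a short sketch of what "almost anything" means in practice. The dict points and the variable message are invented for the example: a tuple works as a key, while a list raises a TypeError.

```python
points = {}

# strings, numbers, and tuples are hashable, so they can be keys
points[(0, 0)] = "origin"

# a list is mutable and not hashable, so Python refuses it as a key
try:
    points[[1, 2]] = "not allowed"
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

print(points)
```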

Of course, a dict you can only put things into is not much use, so there is also a way to delete things, using the del keyword:

>>> del stuff['city']
>>> del stuff[1]
>>> del stuff[2]
>>> stuff
{'name': 'Zed', 'age': 39, 'height': 74}
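One thing to watch for: del raises a KeyError if the key is not there. The sketch below (variable names has_city and removed are chosen for illustration) shows two common ways to guard against that, testing with in and using pop with a default:

```python
stuff = {'name': 'Zed', 'age': 39, 'height': 74, 'city': 'San Francisco'}

del stuff['city']             # removes the key and its value

has_city = 'city' in stuff    # test for a key before touching it

# pop with a default removes the key if present and never raises KeyError
removed = stuff.pop('city', None)

print(stuff)
print(has_city, removed)
```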

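A Dictionary Example

Notice how this example maps states to their abbreviations, and then the abbreviations to cities in those states. Remember, "mapping" is the key concept in a dictionary.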

# create a mapping of state to abbreviation
states = {
    'Oregon': 'OR',
    'Florida': 'FL',
    'California': 'CA',
    'New York': 'NY',
    'Michigan': 'MI'
}

# create a basic set of states and some cities in them
cities = {
    'CA': 'San Francisco',
    'MI': 'Detroit',
    'FL': 'Jacksonville'
}

# add some more cities
cities['NY'] = 'New York'
cities['OR'] = 'Portland'

# print out some cities
print ('-' * 10)
print ("NY State has: ", cities['NY'])
print ("OR State has: ", cities['OR'])

# print some states
print ('-' * 10)
print ("Michigan's abbreviation is: ", states['Michigan'])
print ("Florida's abbreviation is: ", states['Florida'])

# do it by using the states dict first, then the cities dict
print ('-' * 10)
print ("Michigan has: ", cities[states['Michigan']])
print ("Florida has: ", cities[states['Florida']])

# print every state abbreviation
print ('-' * 10)
for state, abbrev in states.items():
    print ("%s is abbreviated %s" % (state, abbrev))

# print every city in each state
print ('-' * 10)
for abbrev, city in cities.items():
    print ("%s has the city %s" % (abbrev, city))

# now do both at the same time
print ('-' * 10)
for state, abbrev in states.items():
    print ("%s state is abbreviated %s and has city %s" % (
        state, abbrev, cities[abbrev]))

print ('-' * 10)
# safely get the abbreviation of a state that might not be there
state = states.get('Texas')

if not state:
    print ("Sorry, no Texas.")

# get a city with a default value
city = cities.get('TX', 'Does Not Exist')
print ("The city for the state 'TX' is: %s" % city)

Output

$ python ex39.py
----------
NY State has:  New York
OR State has:  Portland
----------
Michigan's abbreviation is:  MI
Florida's abbreviation is:  FL
----------
Michigan has:  Detroit
Florida has:  Jacksonville
----------
Oregon is abbreviated OR
Florida is abbreviated FL
California is abbreviated CA
New York is abbreviated NY
Michigan is abbreviated MI
----------
CA has the city San Francisco
MI has the city Detroit
FL has the city Jacksonville
NY has the city New York
OR has the city Portland
----------
Oregon state is abbreviated OR and has city Portland
Florida state is abbreviated FL and has city Jacksonville
California state is abbreviated CA and has city San Francisco
New York state is abbreviated NY and has city New York
Michigan state is abbreviated MI and has city Detroit
----------
Sorry, no Texas.
The city for the state 'TX' is: Does Not Exist

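What Dictionaries Can Do

A dictionary is another example of a data structure and, like a list, it is one of the most commonly used data structures in programming. A dictionary is used to map or associate things you want to store with the keys you need to get them back. Again, programmers don't use a term like "dictionary" for something that doesn't work like an actual dictionary full of words, so let's use a real-world dictionary as an example.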

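Suppose you want to find out what the word "Honorificabilitudinitatibus" means. Today you would simply copy and paste it into a search engine and get the answer. You could say a search engine is like a huge, super-complex version of the Oxford English Dictionary (OED). Before search engines, this is what you would have done:

   1. Go to the library and get a dictionary. Let's say it's the OED.
   2. You know "honorificabilitudinitatibus" starts with the letter 'H', so you look at the tabs on the side of the book and find the section for 'H'.
   3. Then you skim the pages until you get to where "hon" starts.
   4. Then you turn a few more pages until you either find "honorificabilitudinitatibus" or reach the words starting with "hp" and realize the word isn't in the dictionary.
   5. Once you find the entry, you read it carefully to figure out what it means.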

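This process is very close to how a dict works in a program: you are "mapping" the word "honorificabilitudinitatibus" to its definition. A dict in Python is much like a real-world dictionary such as the OED.

Writing Your Own Dictionary Module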

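This last piece of code shows how to build a dictionary data structure using the lists you've just learned about. The code may be hard to follow, so don't worry if it takes you a long time to work out what it means. It introduces some new things and really is somewhat complex, and there are a few things you will need to look up online.

Since Python already has a built-in dict, I'm going to call my data structure hashmap, which is another name for the dictionary data structure. Type the following code into a file named hashmap.py so that we can run it from another file, ex39_test.py.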

def new(num_buckets=256):
    """Initialize a map with the given number of buckets."""
    aMap = []
    for i in range(0, num_buckets):
        aMap.append([])
    return aMap

def hash_key(aMap, key):
    """Given a key, create a number and convert it to an index into aMap's buckets."""
    return hash(key) % len(aMap)

def get_bucket(aMap, key):
    """Given a key, find the bucket where it would go."""
    bucket_id = hash_key(aMap, key)
    return aMap[bucket_id]

def get_slot(aMap, key, default=None):
    """Return the index, key, and value of the slot found in a bucket.
    Return -1, key, and default (None if not set) when not found."""
    bucket = get_bucket(aMap, key)

    for i, kv in enumerate(bucket):
        k, v = kv
        if key == k:
            return i, k, v

    return -1, key, default

def get(aMap, key, default=None):
    """Get the value stored for the given key, or the default."""
    i, k, v = get_slot(aMap, key, default=default)
    return v

def set(aMap, key, value):
    """Set the key to the value, replacing any existing value."""
    bucket = get_bucket(aMap, key)
    i, k, v = get_slot(aMap, key)

    if i >= 0:
        # the key exists, so replace it
        bucket[i] = (key, value)
    else:
        # the key does not exist, so append to create it
        bucket.append((key, value))

def delete(aMap, key):
    """Delete the given key from the map."""
    bucket = get_bucket(aMap, key)

    for i in range(len(bucket)):
        k, v = bucket[i]
        if key == k:
            del bucket[i]
            break

def list(aMap):
    """Print out what's in the map."""
    for bucket in aMap:
        if bucket:
            for k, v in bucket:
                print (k, v)

The code above creates a module named hashmap. Import this module into ex39_test.py and run that file:

import hashmap

# create a mapping of state to abbreviation
states = hashmap.new()
hashmap.set(states, 'Oregon', 'OR')
hashmap.set(states, 'Florida', 'FL')
hashmap.set(states, 'California', 'CA')
hashmap.set(states, 'New York', 'NY')
hashmap.set(states, 'Michigan', 'MI')

# create a basic set of states and some cities in them
cities = hashmap.new()
hashmap.set(cities, 'CA', 'San Francisco')
hashmap.set(cities, 'MI', 'Detroit')
hashmap.set(cities, 'FL', 'Jacksonville')

# add some more cities
hashmap.set(cities, 'NY', 'New York')
hashmap.set(cities, 'OR', 'Portland')

# print out some cities
print ('-' * 10)
print ("NY State has: %s" % hashmap.get(cities, 'NY'))
print ("OR State has: %s" % hashmap.get(cities, 'OR'))

# print some states
print ('-' * 10)
print ("Michigan's abbreviation is: %s" % hashmap.get(states, 'Michigan'))
print ("Florida's abbreviation is: %s" % hashmap.get(states, 'Florida'))

# do it by using the states hashmap first, then the cities hashmap
print ('-' * 10)
print ("Michigan has: %s" % hashmap.get(cities, hashmap.get(states, 'Michigan')))
print ("Florida has: %s" % hashmap.get(cities, hashmap.get(states, 'Florida')))

# print every state abbreviation
print ('-' * 10)
hashmap.list(states)

# print every city in each state
print ('-' * 10)
hashmap.list(cities)

print ('-' * 10)
state = hashmap.get(states, 'Texas')

if not state:
    print ("Sorry, no Texas.")

# get a city with a default value
city = hashmap.get(cities, 'TX', 'Does Not Exist')
print ("The city for the state 'TX' is: %s" % city)

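Code Analysis

This hashmap is nothing more than "a list of slots that hold key/value pairs". Take a few minutes to break down what I mean:

"A list": In the hashmap functions I create a list variable, aMap, and fill it with other lists.
"of slots": These inner lists start out empty, and as I add key/value pairs to the data structure they fill up with slots.
"that hold key/value pairs": Each slot in these lists contains a (key, value) tuple, or pair.

If that description still doesn't make sense to you, take the time to draw it out on paper until you get it. In fact, working through it by hand on paper is a good way to understand it.

You now know how the data is organized, so next you need to know the algorithm for each operation. An algorithm is the set of steps you follow to do something. It is the code that makes the data structure work. We'll go through each operation in the code one by one, but first here is a general pattern used throughout the hashmap algorithms: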

1. Convert a key to an integer using a hash function: hash_key.
2. Convert this hash to a bucket number using the % (modulus) operator.
3. Get this bucket from the aMap list of buckets, and then traverse it to find the slot that contains the key we want.
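The three steps above can be sketched on their own, outside the module. This is only an illustration under stated assumptions: the table has 4 buckets instead of 256 so it is easy to picture, and the names buckets, bucket_id, and found are made up for the example.

```python
# a tiny table of 4 empty buckets (the module uses 256 by default)
buckets = [[] for _ in range(4)]
key = 'Oregon'

# step 1: convert the key to an integer with the built-in hash function
h = hash(key)

# step 2: convert the hash to a bucket number with the % operator
bucket_id = h % len(buckets)

# step 3: get that bucket, then traverse it looking for the key
bucket = buckets[bucket_id]
bucket.append((key, 'OR'))   # store one pair so there is something to find

found = None
for k, v in bucket:
    if k == key:
        found = v
        break

print(bucket_id, found)
```

The value of bucket_id can differ from one run to the next, because Python randomizes string hashes per process by default; what stays true is that it always lands between 0 and len(buckets) - 1.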

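The set operation does the following: if the key already exists, it replaces the existing value; otherwise it creates a new entry.

Now let's go through the hashmap code function by function: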

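new: First, I start by writing a function that creates a hashmap, also known as initializing it. I create a variable named aMap that holds a list and then put num_buckets lists inside it; these buckets will hold the contents I set into the hashmap. Later, another function uses len(aMap) to find out how many buckets there are. Make sure you understand this.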

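hash_key: This deceptively simple function is the core of how a dict works. It uses Python's built-in hash function to convert a key, such as a string, into a number. Python uses this function for its own dict data structure, and I'm just reusing it. You should fire up a Python console and see how it works. Once I have a number for the key, I use the % (modulus) operator and len(aMap) to get a bucket where the key can go. As you should know, the % (modulus) operator returns the remainder of a division. I can also use it to limit giant numbers to a fixed, smaller set of numbers. If you don't know what I mean, explore it in the Python interpreter.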
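An experiment along those lines might look like this sketch. The names num_buckets, indexes, remainder, and wrapped are chosen for the example; the printed index for a string can change between runs because Python randomizes string hashes, so no fixed output is shown for those.

```python
num_buckets = 256
indexes = []

# whatever the hash is, % num_buckets squeezes it into 0..255
for key in ['Oregon', 'Florida', 'California']:
    index = hash(key) % num_buckets
    indexes.append(index)
    print(key, index)

# % returns the remainder of a division: 1000 = 3 * 256 + 232
remainder = 1000 % 256

# in Python the result is non-negative even for a negative left operand
wrapped = -7 % 256

print(remainder, wrapped)
```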

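get_bucket: This function uses hash_key to find the bucket that a key could be in. Because hash_key applies % len(aMap), I know that whatever bucket_id I get will fit inside the aMap list. Using bucket_id, I can then get the bucket where the key could be.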

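get_slot: This function uses get_bucket to get the bucket a key could be in, and then goes through every element of that bucket until it finds a matching key. Once it does, it returns the tuple (i, k, v), where i is the index the key was found at, k is the key itself, and v is the value set for that key. If no key matches, it returns (-1, key, default). You now know enough about how a dictionary data structure works: it takes a key, hashes it, and uses the modulus to find a bucket, then searches that bucket to find the item. This effectively cuts down the amount of searching.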

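get: This is the convenience function that does what most people want from a hashmap. It uses get_slot to get the (i, k, v) tuple but returns only v. Make sure you understand how the default argument works and how the (i, k, v) returned by get_slot is assigned to the variables i, k, and v.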

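set: To set a key/value pair, I could simply append it to the dictionary so that it can be retrieved later. However, I want my hashmap to store each key only once. To do that, I first check whether the key already exists; if it does, I replace its value, and if not, I append it. This is slower than simply appending, but it is closer to what a hashmap user expects. If you allowed multiple values per key, get would have to go through every slot in the bucket and return a list of all the values. This is a good example of a design trade-off. The current version gives you a faster get at the cost of a slower set.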
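To see the other side of that trade-off, here is a sketch of the design the module does not use. The functions set_multi and get_all are hypothetical and work on a single bucket for simplicity: setting becomes a plain append, but getting must scan the whole bucket and collect every match.

```python
def set_multi(bucket, key, value):
    """Append without searching: fast, but a key may appear many times."""
    bucket.append((key, value))

def get_all(bucket, key):
    """Scan the whole bucket and collect every value stored for key."""
    return [v for k, v in bucket if k == key]

bucket = []
set_multi(bucket, 'CA', 'San Francisco')
set_multi(bucket, 'CA', 'Los Angeles')
set_multi(bucket, 'OR', 'Portland')

ca_cities = get_all(bucket, 'CA')
print(ca_cities)
```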

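delete: To delete a key, I find the key's bucket and remove the matching entry from that list. Because I chose to allow only one value per key, I can stop searching as soon as I have found and deleted the matching key. If I had chosen to allow multiple values per key, delete would also be slower, because I would have to find and remove every value for that key.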

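list: The last function is just a small debugging aid. It prints out everything in the hashmap and helps you understand the finer details of how the dictionary works.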