递归与循环的效率迷思

本文简单比较了一下相同逻辑下,递归实现和循环实现的效率差异

已经不记得最初是从哪里获取的信息了,自己总有一个印象是递归的效率比循环差,因为递归有很大的函数调用开销,再加上递归可能存在的堆栈溢出问题(本文暂不考虑该问题),所以书写代码时还是尽量使用循环为好.

简单举个加法的例子(求解前 n 个自然数的和):

// C#

// recur version
int AddRecur(int val)
{
	if (val > 0)
	{
		return val + AddRecur(val - 1);
	}
	
	return 0;
}

// iter version
int AddIter(int val)
{
	var ret = 0;
	
	for (int i = 1; i <= val; ++i)
	{
		ret += i;
	}
	
	return ret;
}

简单 Profile 一下,发现循环版本比递归版本要快 64% 左右 ~

再举个耳熟能详的例子: 斐波那契数列

// C#

// recur version
long FibonacciRecur(long index)
{
	if (index <= 1)
	{
		return index;
	}
	else 
	{
		return FibonacciRecur(index - 1) + FibonacciRecur(index - 2);
	}
}

// iter version
long FibonacciIter(long index)
{
	if (index <= 1)
	{
		return index;
	}
	else 
	{
		long pre = 0;
	    long cur = 1;
	    long next = 0;
	    
	    for (long i = 2; i <= index; ++i)
	    {
	    	// calc next
	    	next = pre + cur;
	    	// update pre and cur
	    	pre = cur;
	    	cur = next;
	    }
	    
	    return next;
	}
}

继续 Profile 一下,发现循环版本比递归版本要快了 96% 左右 !
不过稍有递归经验的朋友都会看出,上面的递归实现会做很多的重复计算,更好的方式就是缓存一下中间的计算结果:

// C#

Dictionary<long, long> s_buffer = new Dictionary<long, long>();
		
long FibonacciRecur(long index)
{
	if (index <= 1)
	{
		return index;
	}
	else 
	{
		long pre = 0;
		if (!s_buffer.TryGetValue(index - 1, out pre)) 
		{
		    pre = FibonacciRecur(index - 1);
            s_buffer[index - 1] = pre;		    
		}
		
		long cur = 0;
		if (!s_buffer.TryGetValue(index - 2, out cur))
		{
		    cur = FibonacciRecur(index - 2);
            s_buffer[index - 2] = cur;		    
		}
		
		return pre + cur;
	}
}

改动之后,循环版本比递归版本就只快 64% 左右了 ~

试验到现在,似乎都印证了我之前的印象: 递归比循环慢,写代码就要写循环~

我们最后来看个真实的(也更复杂的)示例:查找指定名字的子节点(假设我们有一颗树形结构的节点树,给出根节点,查找某个指定名字的子节点)

以下是一个简易的树形结构实现:

// C#

public class Node 
{
	string m_name;
	List<Node> m_children = new List<Node>();
	
	public Node(string name, params Node[] children)
	{
		m_name = name;
		m_children.AddRange(children);
	}
	
	public string GetName()
	{
		return m_name;
	}
	
	public int GetChildCount()
	{
		return m_children.Count;
	}
	
	public Node GetChild(int index)
	{
		if (index >= 0 && index < m_children.Count) 
		{
			return m_children[index];
		}
		
		return null;
	}
}

查找树形结构的指定节点一般可以采用 BFS 或 DFS,这里我们使用 DFS :

// C#

Node FindChildRecur(Node parent, string name)
{
	if (parent != null)
	{	
		var childCount = parent.GetChildCount();
		for (int i = 0; i < childCount; ++i)
		{
			var child = parent.GetChild(i);
			if (child.GetName() == name) 
			{
				return child;
			}
			else 
			{
				var childNode = FindChildRecur(child, name);
				if (childNode != null)
				{
					return childNode;
				}
			}
		}
	}
	
	return null;
}

如果要将上面的递归代码改为循环代码,方法就没有之前那么简明了,需要我们自行来模拟调用栈:

// C#

Stack<Node> s_stack = new Stack<Node>();
		
Node FindChildIter(Node parent, string name)
{
	if (parent != null)
	{
		s_stack.Clear();
		s_stack.Push(parent);
		
		while (s_stack.Count > 0)
		{
			var curParent = s_stack.Pop();
			
			var childCount = curParent.GetChildCount();
			for (int i = childCount - 1; i >= 0; --i)
			{
				var child = curParent.GetChild(i);
				if (child.GetName() == name)
				{
					return child;
				}
				else
				{
					if (child.GetChildCount() > 0)
					{
					    s_stack.Push(child);
					}
				}
			}
		}
	}
	
	return null;
}

考虑到递归调用的高消耗,似乎我们应该将之前的递归代码改写为这种循环形式,但是 Profile 之后发现,其实循环版本还略慢于递归版本,原因就在于(模拟)调用栈的引入抵消了(甚至超过了)函数调用的开销.

其实一般而言,栈内存的操作消耗都要小于堆内存的操作消耗,上面例子中引入的(模拟)调用栈其实就是一种堆操作,考虑到 CLR(C#) 的可能影响,我也用 C++ 进行了一样的实现对比,最终结果也是一致的,甚至在 C++ 中实现的循环版本还要显著慢于其递归版本.

还有一个问题之前没有提及,就是代码可读性问题,从我个人经验来讲,递归代码的可读性大体上还是要优于循环代码的.

结论

一般而言,将递归代码改写为循环代码可以提高效率,但是一旦改写过程中引入了堆操作,那么结果往往是相反的.

tkokof1 博客专家

发布了142 篇原创文章 · 获赞 146 · 访问量 25万+

私信关注

递归与循环的效率迷思

结论

猜你喜欢