A Guide to Functional Programming in Modern C++

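Functional programming is a programming paradigm in which programs are built by applying and composing functions. A program is treated as a tree of expressions defined by functions that map one value to another, rather than as a sequence of imperative statements that update the program's running state.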

https://en.wikipedia.org/wiki/Functional_programming
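The phrase "composing functions" can be made concrete with a short sketch. The helper compose and the functions addOne and twice are hypothetical names introduced only for this illustration; they are not part of the standard library.

```cpp
// Hypothetical helper: compose(f, g) returns a function that computes f(g(x)).
constexpr auto compose = [](auto f, auto g) {
    return [=](auto x) { return f(g(x)); };
};

// Two small pure functions to combine.
constexpr int addOne(int x) { return x + 1; }
constexpr int twice(int x) { return 2 * x; }

// twiceThenAddOne(x) == addOne(twice(x)) == 2 * x + 1
constexpr auto twiceThenAddOne = compose(addOne, twice);
```

Swapping the two arguments of compose reverses the order in which the functions run, so compose(twice, addOne) computes 2 * (x + 1) instead.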

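What Is Currying

Currying: a transformation that takes a function with multiple parameters as input and returns a function with exactly one parameter, which in turn returns another one-parameter function until every argument has been supplied.

Let's start with a simple example that shows the basic idea: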

#include <print> // C++23

// Curry function template
template<typename Func, typename... Args>
auto curry(Func func, Args... args) {
    return [=](auto... remainingArgs) {
        return func(args..., remainingArgs...);
    };
}

// Example 1: currying an adder
int add(int a, int b) {
    return a + b;
}

int main() {
    // Use curry to create a new addition function
    auto curriedAdd = curry(add, 5);  // fix the first argument to 5

    // Call the resulting function
    std::println("{:d}", curriedAdd(3));  // prints 8 (5 + 3)

    return 0;
}

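In this example, curry takes a function together with some of its arguments and returns a function that accepts the remaining arguments. curriedAdd is the result of applying it to add with the first argument fixed to 5. Strictly speaking, fixing leading arguments this way is partial application; a fully curried add would be called as add(5)(3), with one argument per call.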
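For comparison, here is a minimal sketch of currying in the strict sense, where each call supplies exactly one argument. The names autoCurry and volume are assumptions made for this illustration, and the sketch relies on C++17 features (if constexpr and std::is_invocable_v).

```cpp
#include <type_traits>  // std::is_invocable_v

// Hypothetical helper (not a standard facility): collects one argument per
// call and invokes f as soon as f is callable with the arguments collected so far.
template <typename F, typename... Bound>
constexpr auto autoCurry(F f, Bound... bound) {
    if constexpr (std::is_invocable_v<F, Bound...>) {
        return f(bound...);  // enough arguments collected: perform the call
    } else {
        // not enough arguments yet: return a one-parameter function
        return [=](auto next) { return autoCurry(f, bound..., next); };
    }
}

// Illustrative three-parameter function.
constexpr int volume(int width, int height, int depth) {
    return width * height * depth;
}

// volume(2, 3, 4) written as a chain of one-argument calls.
constexpr int curriedResult = autoCurry(volume)(2)(3)(4);
```

Because the call fires as soon as f becomes invocable, a function with default arguments or variadic parameters would be invoked early; the sketch assumes a fixed arity.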

What Is Partial Application

Partial application: the process of applying a function to some of its arguments. The partially applied function is returned for later use. In other words, it takes a function with multiple parameters and returns a function with fewer parameters. Partial application fixes one or more arguments inside the returned function, and the returned function takes the remaining parameters as arguments in order to complete the function application.

Reference: https://en.wikipedia.org/wiki/Partial_application

Note: in this article, "specialization" means partial application.

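Binary Functions (Partial Application)

As API authors, we often want to specialize a function or pre-fill some of its arguments. Partial application makes this possible.

[Figure: partial_application_scheme]

We supply a subset of the concrete arguments and get back a function of lower arity.

Let's look at a concrete example.

The following function computes the area of a circular sector, with the angle theta given in radians.

[Figure: partial_application_circle_sector]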

double secArea(double theta, double radius){
    return 0.5*theta*pow(radius,2);
}

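Let's specialize this function to compute the area of a full circle. We need nested lambdas to express partial application.

// papply is short for Partial Application: p(artial) apply
auto papply = [](auto f, auto x) {
    return [=](auto y) {
        return f(x, y);
    };
};

To specialize, we only need to pass the function and its first argument.

auto op = papply(secArea, 2 * std::numbers::pi); // fix the first argument to the full angle 2π, i.e. a whole circle (std::numbers::pi: C++20, <numbers>)
auto val = op(3.0); // area of a circle with radius 3

[Figure: partial_application_circle_sector2]

In the case above, the specialization concerns the first parameter of the function.

double secArea(double theta, double radius);

Complete code: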

#include <cmath>    // std::pow
#include <numbers>  // C++20
#include <print>    // C++23

// papply is short for Partial Application: p(artial) apply
auto papply = [](auto f, auto x) {
    return [=](auto y) {
        return f(x, y);
    };
};

// Area of a circular sector with angle theta (in radians)
double secArea(double theta, double radius) {
    return 0.5 * theta * std::pow(radius, 2);
}

int main() {
    auto op  = papply(secArea, 2 * std::numbers::pi);  // fix the first argument to the full angle 2π, i.e. a whole circle
    auto val = op(3.0);                                // area of a circle with radius 3

    // Print the floating-point value with two decimal places
    std::println("{:.2f}", val);  // prints the area of a circle with radius 3: 28.27

    return 0;
}

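However, we often have to deal with the order of the arguments.

Argument Order (Partial Application)

[Figure: partial_application_pow]

double pow (double base, double exponent);

Take the C library function pow: its first parameter is the base and its second is the exponent, so the parameter we want to fix is not the first one. Note that in C++ <cmath> may declare pow as a set of overloads, in which case the bare name cannot be passed to a generic lambda; wrapping the call in a lambda, as in the later snippets, avoids this.

How do we specialize pow so that it returns the square of its base?

[Figure: partial_application_pow2]

Does the following specialization solve the problem?

[Figure: partial_application_pow3]

If we specialize the base, papply returns a function that computes arbitrary powers of 2.

auto op = papply(pow,2); // arbitrary powers of 2
auto val = op(3); // 2^3 = 8

This is not the result we want.

pow has to be specialized on its second parameter.

double pow (double base, double exponent);

This can be solved by swapping the arguments.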

auto swapArgs = [](auto f) {
    return [=](auto x, auto y) {
        return f(y, x);
    };
};
auto op = papply(swapArgs(pow), 2); // 2 is now the exponent, which solves the problem above
auto val = op(3); // 3^2 = 9

We can also solve it with a lambda written specifically for pow.

auto powS = [](auto exponent, auto base){
               return pow(base, exponent);
};
auto op = papply(powS, 2); 
auto val = op(3); // 3^2 = 9

Or use the more compact form below.

auto op = papply([](auto exponent, auto base){
                     return pow(base, exponent);}, 2);

auto val = op(3); // 3^2 = 9

Another option is the library function std::bind, declared in <functional>.

This solution avoids writing a lambda expression.

auto op = std::bind(pow, std::placeholders::_1, 2);
auto val = op(3); // 3^2 = 9
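When the argument to fix is the last one, an alternative to swapping is a helper that binds the trailing argument directly. The sketch below is only an illustration under stated assumptions: papplyLast and ipow are hypothetical names, and ipow is a simplified integer-exponent stand-in for pow so that the example has no library dependencies.

```cpp
// Hypothetical helper: fixes the LAST argument of f and leaves the leading ones open.
constexpr auto papplyLast = [](auto f, auto last) {
    return [=](auto... leading) { return f(leading..., last); };
};

// Simplified stand-in for pow: raises base to a non-negative integer exponent.
constexpr double ipow(double base, unsigned exponent) {
    double result = 1.0;
    for (unsigned i = 0; i < exponent; ++i) {
        result *= base;
    }
    return result;
}

// Fix the exponent to 2: square(x) == ipow(x, 2)
constexpr auto square = papplyLast(ipow, 2u);
```

Compared with swapArgs, this form does not reorder anything, so it also works for functions with more than two parameters.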

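Use Cases

Computing Age from Carbon Decay

Next, let's look at an example that determines age from radioactive carbon decay.

The age of an object containing organic material can be determined by radiometric dating.

[Figure: partial_application_radioactive2]

This is the general equation of radioactive decay:

[Figure: partial_application_radioactive3]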

double age(double remainingProportion, double halflife){
    return log(remainingProportion)*halflife / -log(2);
}
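The expression returned by age can be derived from the standard exponential decay law. In the sketch below, N_0 is the initial amount, N(t) the amount remaining after time t, T_{1/2} the half-life, and p the remaining proportion; these symbols are introduced here for illustration and do not come from the original figures.

```latex
\begin{aligned}
N(t) &= N_0 \left(\tfrac{1}{2}\right)^{t / T_{1/2}} \\
p &= \frac{N(t)}{N_0} = 2^{-t / T_{1/2}} \\
\ln p &= -\frac{t}{T_{1/2}} \ln 2 \\
t &= -\frac{T_{1/2} \, \ln p}{\ln 2}
\end{aligned}
```

The last line matches the return expression of age, with p as remainingProportion and T_{1/2} as halflife. For p = 0.4 and a half-life of 5730 years it gives about 7574.6 years.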

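Substitute the half-life of carbon-14 (C14), which is 5730 years.

auto op = papply(swapArgs(age),5730);

Question 1. How old is a fossil that contains 40% of the C14 found in a living sample?

auto val = op(0.4); // about 7575 years

Complete code: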

#include <cmath>  // std::log, std::ceil
#include <print>  // C++23

// papply is short for Partial Application: p(artial) apply
auto papply = [](auto f, auto x) {
    return [=](auto y) {
        return f(x, y);
    };
};

auto swapArgs = [](auto f) {
    return [=](auto x, auto y) {
        return f(y, x);
    };
};

double age(double remainingProportion, double halflife) {
    return std::log(remainingProportion) * halflife / -std::log(2);
}

int main() {
    auto op  = papply(swapArgs(age), 5730);                  // fix the half-life to that of C14, 5730 years
    auto val = op(0.4);                                      // age of a fossil containing 40% of the original C14
    std::println("{:d}", static_cast<int>(std::ceil(val)));  // prints 7575 (years)
    return 0;
}

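Specializing functions that work with regular expressions is also very useful.

Let's look at std::regex_match in particular.

std::regex_match determines whether the regular expression re matches the entire character sequence s.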

bool std::regex_match( const std::string& s,
                  const std::regex& re,
                  std::regex_constants::match_flag_type flags =
                  std::regex_constants::match_default);

How can we specialize std::regex_match to validate email addresses?

We use a purpose-built lambda to get the argument order we need.

auto op = papply([](auto re, auto str){
        return std::regex_match(str, re);},
        std::regex("(\\w+)(\\.|_)?(\\w*)@(\\w+)(\\.(\\w+))+"));
auto val1 = op("user@example.com"); // placeholder address; returns 1, i.e. true
auto val2 = op("test@cheungxiongweicom"); // returns 0, i.e. false

Complete code:

#include <print>  // C++23
#include <regex>  // C++11

// papply is short for Partial Application: p(artial) apply
auto papply = [](auto f, auto x) {
    return [=](auto y) {
        return f(x, y);
    };
};

int main() {
    // This example uses a purpose-built lambda to swap the arguments
    auto op   = papply([](auto re, auto str) { return std::regex_match(str, re); }, std::regex("(\\w+)(\\.|_)?(\\w*)@(\\w+)(\\.(\\w+))+"));
    auto val1 = op("user@example.com");        // placeholder address; returns true
    auto val2 = op("test@cheungxiongweicom");  // returns false

    std::println("{} {}", val1, val2);  // true false
    return 0;
}

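Let's move on to functions with more parameters.

Multiple Parameters (Partial Application)

This is the formula for the final velocity of a moving object.

[Figure: partial_application_velocity]

Note: this function takes three parameters, not two as in the earlier examples.

// Compute the final velocity
double velocity(double v0/*initial velocity*/, double a/*acceleration*/, double t/*acceleration time*/){
    return v0 + a*t;
}

How can we specialize this formula to solve free-fall problems?

We want to fix two parameters: the initial velocity and the acceleration.

We need nested lambdas and parameter packs.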

auto papply = [](auto f, auto... args) {
    return [=](auto... rargs) {
        return f(args..., rargs...);
    };
};

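The ... syntax denotes a parameter pack, which carries any number of arguments.

We apply it just as in the binary case.

[Figure: partial_application_velocity2]

auto op = papply(velocity, 0.0, 9.81);
auto val = op(4.5/*4.5 seconds of acceleration*/); // about 44.15 m/s

In this particular case, no swapping is needed.

Complete code: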

#include <print>  // C++23

// papply is short for Partial Application: p(artial) apply
auto papply = [](auto f, auto... args) {
    return [=](auto... rargs) {
        return f(args..., rargs...);
    };
};

double velocity(double v0 /*initial velocity*/, double a /*acceleration*/, double t /*acceleration time*/) {
    return v0 + a * t;
}

int main() {
    auto op  = papply(velocity, 0.0, 9.81);
    auto val = op(4.5 /*4.5 seconds of acceleration*/);  // about 44.15 m/s

    std::println("{:.2f} m/s", val);  // 44.15 m/s
    return 0;
}

Higher-Order Functions (Partial Application)

How do we specialize a higher-order function?

This function performs a left fold over a collection:

auto leftFold = [](auto col, auto op, auto init) {
    return std::accumulate(std::begin(col), std::end(col), 
           init, op);
};

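leftFold combines the elements of the collection col with the binary operation op, starting from the value init.

Specializing leftFold to compute a sum:

auto op = papply([](auto op, auto init, auto col){
                    return leftFold(col, op, init);},
                    std::plus<>(), 0.0);

This function computes the sum of the elements of a collection, starting from 0.0.

Specializing leftFold to compute the product of a collection: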

auto op = papply([](auto op, auto init, auto col){
                    return leftFold(col, op, init);},
                    std::multiplies<>(),1.0);

Complete code:

#include <functional>  // std::plus, std::multiplies
#include <numeric>     // std::accumulate
#include <print>       // C++23
#include <vector>

auto papply = [](auto f, auto... args) {
    return [=](auto... rargs) {
        return f(args..., rargs...);
    };
};

auto leftFold = [](auto col, auto op, auto init) {
    return std::accumulate(std::begin(col), std::end(col), init, op);
};

int main() {
    auto op_plus       = papply([](auto op, auto init, auto col) { return leftFold(col, op, init); }, std::plus<>(), 0.0);
    auto op_multiplies = papply([](auto op, auto init, auto col) { return leftFold(col, op, init); }, std::multiplies<>(), 1.0);

    std::vector<int> set = {1, 2, 3, 4, 5};

    auto val1 = op_plus(set);        // 1 + 2 + 3 + 4 + 5 = 15
    auto val2 = op_multiplies(set);  // 1 * 2 * 3 * 4 * 5 = 120

    std::println("{:d} {:d}", static_cast<int>(val1), static_cast<int>(val2)); // 15 120
    return 0;
}

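Conclusion

Currying and partial application are important concepts in functional programming. Implemented in modern C++ with function objects and lambda expressions, they open up more options for modular and flexible code. By fixing some of the arguments and producing a new function, they make functions more flexible and reusable.

In C++, currying and partial application are powerful tools for functional-style programming that can handle a variety of complex scenarios and improve the readability and maintainability of code.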