Reading Note of CppPrimer-Chapter10

发表于 2022-06-16 更新于 2024-08-06 分类于 C++ 阅读次数：本文字数： 16k 阅读时长 ≈ 15 分钟

Generic Algorithms

Common Algorithms

泛型算法 (generic algorithms) 本身不会执行容器的操作，他们只会运行在迭代器之上，执行迭代器的操作；算法不会改变容器的大小

迭代器参数

一些算法要从两个序列中读取元素，构成这两个序列的元素可以来自于不同类型的容器，如第一个来自 vector，第二个来自于 list；两个容器的元素类型也不一定要相同，只要求他们之间可以进行比较；

操作两个序列的算法之间的区别在于如何传递第二个序列：

有的函数接受三个迭代器，前两个代表第一个序列的范围，第三个代表第二个序列的开始元素；用一个单一的迭代器表示第二个序列都假定第二个序列和第一个序列一样长；
有的函数接受四个迭代器，前两个代表第一个序列的范围，后两个代表第二个序列的范围；

#include <algorithm>
std::equal(roster1.begin(), roster1.end(), roster2.begin());
std::copy(roster1.cbegin(), roster1.cend(), roster2.begin());
std::search(vec1.begin(), vec1.end(), vec2.begin()+1, vec2.begin()+5);

std::accumulate

std::accumulate 的第三个参数类型决定了函数中使用哪个加法运算符以及返回值的类型；序列中的元素必须与第三个参数类型相容

1
2
3

#include <numeric>
std::string sum_str = std::accumulate(vec.cbegin(), vec.cend(), std::string("")); // valid
std::string sum_str = std::accumulate(vec.cbegin(), vec.cend(), ""); // invalid, const char* undefine "+"

写容器的算法：向目的位置迭代器写入数据的算法假定目的位置足够大，能容纳要写入的元素：

std::fill(vec.begin(), vec.end(), 0);  //将每个元素置零
std::fill_n(vec.begin(), 10, 2);       // 将vec前10个数置为2

std::vector<int> vec;    //空vector
std::fill_n(vec.begin(), 10, 0);       // invalid，修改10个不存在的元素

vec.resize(10);
std::fill_n(vec.begin(), 10, 0);       // valid，resize后有10个值初始化的元素

int a1[] = {0,1,2,3,4,5,6,7,8,9};
int a2[sizeof(a1)/sizeof(*a1)] = {};
auto ret = std::copy(std::begin(a1), std::end(a1), a2);  // ret指向拷向a2的最后一个元素的后面一个元素

std::replace

有原地替换和拷贝替换两个版本

// in place replace
std::replace(vec.begin(), vec.end(), 42, 0);    // vec中的所有42替换为0

// copy and replace
std::list<int> vec1;
std::replace_copy(vec.begin(), vec.end(), std::back_inserter(vec1), 42, 0);

std::unique

重排容器，使得不重复的元素出现在容器的开始部分，返回指向最后一个不重复元素之后的元素的迭代器

std::vector<std::string> words = {"fox", "red", "slow", "red"};
// 按字典序排序
std::sort(words.begin(), words.end());       // default comparer, words: {"fox", "red", "red", "slow"}
// unique是的每个单词只在容器开始出现一次
auto end_unique = std::unique(words.begin(), words.end());//words: {"fox", "red", "slow",[end_unq] "red"}
// 删除重复元素，标准库算法是对迭代器操作而不是对容器直接操作
words.erase(end_unique, std::end(words));    // words: {"fox", "red", "slow"}

std::partition

接受一个 predicate，对容器内容进行划分，predicate 返回 true 的元素排列在容器的前半部分，返回为 false 的排列在容器的后半部分；返回值指向最后一个使 predicate 为 true 的元素的后面一个元素；

std::vector<int> v = {0,1,2,3,4,5,6,7,8,9};
std::cout << "Original vector:\n    ";
for(int elem : v) std::cout << elem << ' ';
 
auto it = std::partition(v.begin(), v.end(), [](int i){return i % 2 == 0;});
 
std::cout << "\nPartitioned vector:\n    ";
std::copy(std::begin(v), it, std::ostream_iterator<int>(std::cout, " ")); // usage of std::ostream_iterator
std::cout << " * " " ";
std::copy(it, std::end(v), std::ostream_iterator<int>(std::cout, " "));

// possible outputs:
// Original vector:
//    0 1 2 3 4 5 6 7 8 9 
// Partitioned vector:
//    0 8 2 6 4  *  5 3 7 1 9

std::find_if

接受一个迭代器范围和一个 predicate，返回第一个使 predicate 为 true 的迭代器；如果没有这样的值，返回尾迭代器

std::transform

std::transform applies the given function to a range and stores the result in another range

1
2
3

#include <algorithm>
std::vector<int> vec = {-3,-2,-1,0,1,2,3};
std::transform(vec.begin(), vec.end(), vec.begin(), [](int i){return i<0 ? -i : i;});    //求所有数的绝对值

特定容器算法

list 不支持随机访问，对于 list、forward_list，要优先使用他们的成员函数，而不是使用通用算法；链表自己特有的操作会改变容器，但是通用的不一定会；
1
2
3
4
5
6
7
8
list.sort(); // not std::sort(list.begin(), list.end());
list.sort(cmp);
list.unique();
list.unique(pred);
list.merge(lst2);
list.remove(val);
list.remove_if(pred);
list.reverse();

Customizing Operations

可调用对象：
- 函数，函数指针
- lambda 表达式
- 重载了函数调用运算符的类
严格弱序 (strict weak ordering)
A strict weak ordering is a binary relation < on a set S that is a strict partial order (a transitive relation that is irreflexive, or equivalently that is asymmetric) in which the relation “neither a < b nor b < a” is transitive. Therefore, a strict weak ordering has the following properties：
1. For all x in S, it is not the case that x < x (irreflexivity).
2. For all x, y in S, if x < y then it is not the case that y < x (asymmetry).
3. For all x, y, z in S, if x < y and y < z then x < z (transitivity).
4. For all x, y, z in S, if x is incomparable with y (neither x < y nor y < x hold), and y is incomparable with z, then x is incomparable with z (transitivity of incomparability).
This list of properties is somewhat redundant, as asymmetry follows readily from irreflexivity and transitivity.
注意定义中所说的 relation 就是我们上面说的 “比较规则”，数学上叫 relation, 是一个更加广的概念，不一定是比较规则

而且，定义中用到的 < 并不是小学数学中的小于号，它只是代表一个 relation；当然，普通的小于号 < 是最简单的满足 strict weak ordering 的 relation 了

容易看出，小于等于号 <= 和大于等于号 >= 并不满足 irreflexivity 的要求；
这里的一个比较细微的地方是 “相等” 的概念。在这里，” 相等”( equivelant ) 并不意味着 a 和 b 是一样 ( identical ) 的东西，而只是 !(a<b) && !(b<a) (incomparable)（注意 < 只是一个 relation，并不一定是我们常说的小于号）；也就是说，“相等” 在这里已经变成了 !(a<b) && !(b<a) ，而不是 == 号（而且我们根本就没有定义 == 这个 operator，你根本没法用这个运算符来看 a 和 b 是否相等）

Ｃ++ 语言中为什么要默认采用 < 来作为 std::sort() 和其他与排序相关的函数的比较符号呢？

必须定义一个默认的比较运算符，否则每次排序都要提供一个运算子，太麻烦了。

为什么不用其他运算符？

可以用其他运算符号，比如，可以用 > 。只要这个运算符能满足 strict weak ordering 即可。

为什么一定要满足 strict weak ordering?

因为，满足 strict weak ordering 的运算符能够表达其他所有的逻辑运算符（logical operator）：
<(a, b) : (a < b)
<=(a, b) : !(b < a)
==(a, b) : !(a < b) && !(b < a)
!=(a, b) : (a < b) || (b < a)

>(a, b) : (b < a)
>=(a, b) : !(a < b)
但是，不满足 strict weak ordering 的运算符就做不到。比如上文中所讲的 <= 号和 >= 号，当 a==b 时，!(a<=b)&&!(b<=a) 并不为真

谓词 (Predicate)：其返回结果是一个可以作为条件的值；常见的有一元谓词 unary predicate(接受一个参数) 和二元谓词 binary predicate(接受两个参数)；predicate 必须满足 strict weak ordering 的条件；

// binary predicate with each contains 1 item
bool mycomparison(double first, double second){
    return (int (first) <  int (second) );    
}
// Whenever `x < y` is a strict weak ordering then `x > y` is also a strict weak ordering, just with the reverse order. {1,2,3,5,7} vs {7,5,3,2,1}
bool mycomparison(double first, double second){
    return (int (first) >  int (second) );
}

// binary predicate with each contains 2 items
struct S {
    ThingA   a;
    ThingB   b;
};
bool operator<(S const& lhs, S const& rhs) {
    // Note: This assumes that thingA and thingB already implement strict weak ordering themselves.
    // std::tie: Creates a tuple of lvalue references to its arguments or instances of std::ignore.
    return std::tie(lhs.a, lhs.b) < std::tie(rhs.a, rhs.b);
}

// binary predicate with each contains 3 items
// This orders the elements by a1 being most siginificant and a3 least significant.
bool comparer(std::tuple<int,int,int> a, std::tuple<int, int, int> b){
    // a: {a1, a2, a3}
    // b: {b1, b2, b3}  
    if (a1 < b1)
        return true;
    if (b1 < a1)
        return false;
    // a1==b1: continue with element 2
    if (a2 < b2)
        return true;
    if (b2 < a2)
        return false;
    // a2 == b2: continue with element 3
    if (a3 < b3)
        return true;
    return false; // early out
}

lambda 表达式

我们可以忽略参数列表和返回值类型，但永远不能省略捕获列表和函数体 ;

忽略参数列表等价于指定一个空的额参数列表；lambda 表达式的参数列表不能有默认参数；

忽略返回值，lambda 将根据函数体中的代码推导出返回类型：

// capture list: lambda内部需要使用的任何lambda所在的函数体内的局部变量，都需要capture
[capture list](parameter list) -> return type {
    // function body
};

// for example
auto f = []() -> int {return 42;}    // complete
auto f = []{return 42};              // 忽略参数列表和返回值类型
std::cout << f() << std::endl;       // print 42

虽然一个 lambda 可以出现在一个函数中，使用这个函数的局部变量，但他只能使用哪些明确指明的变量；一个 lambda 通过将局部变量放在 capture list 里面来着指出将会使用哪些变量；变量之间使用逗号 , 分隔；

void func(){
    int min = 0;
    int max = 100;
    // capture min,max, both are local variables
    auto f = [min, max](std::string& str){
        return str.size() >= min and str.size() <= max;
    };
    // variable 'max' cannot be implicitly captured in a lambda with no capture-default specified
    //auto f = [](std::string& str){
    //    return str.size() >= min and str.size() <= max;
    //};
}

捕获列表只用于所在函数内的非静态局部 (non-static local) 变量；lambda 里面是可以直接使用静态局部 (static local) 变量和他所在函数之外的声明的全局变量 (global variable)；

int offset = 0;
void func(){
    static int min = 0;
    static int max = 100;
    // no need to capture local static variables and global vars;
    auto f = [](std::string& str){
        return str.size() >= (min+offset) and str.size() <=(max+offset);
    }
}

lambda 是函数对象

当我们编写一个 lambda 后，编译器将该表达式翻译为一个未命名类的未命名对象；在生成的类中有一个重载的函数调用运算符()；

默认情况下从 lambda 生成的类中都包含 capture list 里面的成员作为自己的数据成员，类似任何普通类的数据成员；

lambda 的数据成员也是在 lambda 对象在创建的时候初始化的；

捕获列表 (capture list)

类似于参数传递，lambda 的捕获分为值捕获和引用捕获：

值捕获 (by-copy capture)：捕获参数在 lambda 创建的时候拷贝的，不是在调用的时候初始化的；捕获变量随后的变化不再影响之前创建的 lambda
1
2
3
4
5
6
void func_copy{
size_t v1 = 42;
auto f = [v1]{return v1;}
v1 = 0;
std::cout << f() << std::endl; // print 42, not 0
}
引用捕获 (by-reference capture)：类似于传参的时候传引用，实际上生成的类中的数据成员是引用类型，绑定的是局部变量；如果使用引用捕获，必须确保在 lambda 执行的时候，这些通过引用捕获被引用的局部变量是存在的；比如我们可以从一个函数直接返回一个 lambda 表达式，该 lambda 引用捕获了这个函数的局部变量，当函数返回的时候，这些局部变量都会销毁，于是返回的 lambda 的引用数据成员也是未定义的；
1
auto f = [&os, c](std::string& s){os << s << c;}

隐式捕获：可以让编译器根据 lambda 函数体中的代码来推测我们要使用哪些变量；& and = :

& 表示默认引用捕获，
= 表示默认值捕获；
也可以使用混合捕获，即其中一部分变量使用指定的捕获方式，剩余的使用默认捕获方式；

1
2
3

auto f = [&](std::string& s){os << s << c;}       // 默认都引用捕获
auto f = [&, c](std::string& s){os << s << c;}    // c值捕获，os引用捕获
auto f = [=, &os](std::string& s){os << s << c;}  // os引用捕获,c值捕获，

lambda mutable capture by copy

通过值捕获的变量，lambda 类会为每个值捕获的变量建立对应的数据成员；默认情况下，不会改变他的值；即函数调用运算符是 const 的；如果希望 lambda 的调用可以修改值捕获变量的值，需要加上 mutable 关键字；而一个引用捕获的变量能否可以被修改取决于这个变量本身是不是 const；

why need mutable:

because by default, a function object should produce the same result every time it’s called. This is the difference between an object orientated function and a function using a global variable, effectively.

void func(){
    size_t v1 = 42;
    // error: cannot assign to a variable captured by copy in a non-mutable lambda
    //auto f1 = [v1](){return ++v1;};           // value capture
    auto f2 = [v1]() mutable {return ++v1;};    // mutable value capture
    auto f3 = [&v1]() mutable {return ++v1;};   // reference capture
    v1 = 0;
    std::cout << f1() << std::endl;        // print 42
    std::cout << f2() << std::endl;        // print 43
    std::cout << f3() << std::endl;        // print 43    
}

// unroll lambda expresion, the classes like belows:
class LambdaF1{
public:
    size_t operator()() const {return ++v1;} // const member func, modify member var is invalid    
private:
    size_t v1;
};

class LambdaF2{
public:
    size_t operator()() const {return ++v1;} // const member func, modify mutable var is valid    
private:
    mutable size_t v1; // mutable
};

lambda 返回值

如果没有指定尾置返回值类型，lambda 将根据函数体中的代码推导出返回类型 (function template auto deduction)；

1 2	[](int i) {if (i<0) return -i; else return i;}; // return type is int [](int i) -> int {if (i<0) return -i; else return i;}; // return type is int

不适合使用 lambda 的情况：
- 如果需要在很多地方使用同一个函数，就不太适合使用 lambda，可以定义函数；
- 类似的如果一个函数需要很多语句才能完成，则定义函数比使用 lambda 要好；
- 如果 lambda 的捕获列表为空，则可以直接使用函数替换；但是对于有捕获参数的 lambda，函数不是很直接替换了，这时候可以借助 std::bind 来捕获；
- 或者函数的参数不满足 predicate 的要求，这个时候也可以借助 std::bind 来调整参数列表；

std::bind

接受一个可调用对象，生成另一个可调用对象来适应参数列表数量的要求；std::placeholders::_1, std::placeholders::_2 这些是参数占位符；分别表示第一个参数和第二个参数；

1 2	#include <functional> auto newCallable = std::bind(callable, arg_list);

// funtion with 2 params, unqualified for unary predicate params list
bool check_size(std::string &s, std::string::size_type sz){
    return s.size() > sz;
}
// convert to function qualified for unary predicate, so that check6 can be used as a unary predicate
auto check6 = std::bind(check_size, std::placeholders::_1, 6);

使用 std::bind 可以将原函数的参数进行重新映射，改变顺序，设置默认值，当 std::bind 的 placeholder 的顺序不是递增的时候，原函数的参数顺序就被改变了；甚至可以通过 bind 简单实现函数的作用颠倒，对 lambda 表达式也可以使用 bind

// Pseudocode
void f(a, b, c, d, e){
}

// call g(X, Y) is equivalent to call f(A, Y, C, X, E);
auto g = std::bind(f, A, std::placeholder::_2, C, std::placeholder::_1, E);

// bind normal func to reverse parameters order
//sort words with length from short to long
std::sort(words.begin(), words.end(), isShorter);
//sort words with length from long to short
std::sort(words.begin(), words.end(), std::bind(isShorter, std::placeholder::_2, std::placeholder::_1));

// bind lambda to reverse parameters order
auto isShorter = [](std::string& a, std::string& b){ 
    return a.size() < b.size();
};
std::vector<std::string> strs = {"abc", "ab", "a", "asdf"};
std::sort(strs.begin(), strs.end(), isShorter);
for (auto str: strs){
    std::cout << str << std::endl;
}
std::sort(strs.begin(), strs.end(), std::bind(isShorter, std::placeholders::_2, std::placeholders::_1));
for (auto str: strs){
    std::cout << str << std::endl;
}

std::ref and std::cref

std::bind 默认情况下那些不是占位符的参数是拷贝到 bind 返回的可调用对象中的，但是如果 bind 希望通过引用传递给 bind 返回的可调用对象时，需要使用 std::ref 或者 std::cref

// lambda capture
[&os, c](std::string &s){ os << s << c;}

// function 
ostream& print(ostreams& os, const std::string &s, char c){
        return os << s << c;
}

// bind capture
auto bp = std::bind(print, os, std::placeholder::_1, "c");                  // invalid
auto bp = std::bind(print, std::ref(os), std::placeholder::_1, "c");        // valid

Function templates ref and cref are helper functions that generate an object of type std::reference_wrapper, using template argument deduction to determine the template argument of the result.
std::reference_wrapper is a class template that wraps a reference in a copyable, assignable object. It is frequently used as a mechanism to store references inside standard containers (like std::vector) which cannot normally hold references.

C++11 中引入 std::ref 用于取某个变量的引用，这个引入是为了解决一些传参问题。std::ref 用于包装按引用传递的值；std::cref 用于包装按 const 引用传递的值。我们知道 C++ 中本来就有引用的存在，为何 C++11 中还要引入一个 std::ref 了？

主要是考虑函数式编程（如 std::bind）在使用时，是对参数直接拷贝，而不是引用；传递引用参见上条
explicitly initialize the thread with a reference_wrapper by using std::ref
1
2
3
4
5
static void SimpleThread(int& a) {
cout << __PRETTY_FUNCTION__ << ":" << a << endl;
}
int a = 6;
auto thread1 = std::thread(SimpleThread, std::ref(a));
The arguments to the thread function are moved or copied by value. If a reference argument needs to be passed to the thread function, it has to be wrapped (e.g. with std::ref or std::cref).

Iterator

插入迭代器 (inserter iterator)

inserter：当调用 inserter(c, iter) 时，会得到一个插入迭代器，接下来每次使用的时候他将元素插入到 iter 原来指向的元素之前的位置 ;

auto it = std::inserter(vec, iter);
*it = val;

// equvialent to 
it = vec.insert(it, val);        // it指向新元素
++it;                            // 递增it使其指向原来的元素

front_inserter：元素总是插入到容器的第一个元素的前面，这点和 inserter 不同；

std::list<int> lst = {1,2,3,4};
std::list<int> lst1, lst2;
std::copy(lst.cbegin(), lst.cend(), std::front_inserter(lst1));          // lst1: {4,3,2,1}
std::copy(lst.cbegin(), lst.cend(), std::inserter(lst2, lst2.begin()));  // lst2: {1,2,3,4}

back_inserter：插入迭代器 (insert iterator) 是一种向容器中插入元素的容器迭代器，当我们向一个插入迭代器赋值时，一个与赋值号右端相等的元素被插入容器，即 push_back

#include <iterator>
std::vector<int> vec;      // vec: {}
auto back_it = std::back_iterator(vec);
*it = 42;                  // vec: {42};
std::fill_n(it, 10, 2);    // vec: {2,2,2,2,2,2,2,2,2,2}

std::istream_iterator

istream_iterator 使用 >> 来读取流，所以用 istream_iterator 读取的类型必须定义了输入运算符；我们可以将一个 istream_iterator 绑定到一个流，也可以默认初始化，这样得到的就是一个空的迭代器，可以作为尾后迭代器使用；

#include <iterator>
std::istream_iterator<int> int_it(std::cin);   // 从cin读取int序列
std::istream_iterator<int> eof;                // 默认初始化，相当于尾后迭代器
std::ifstream in("in_file");
std::istream_iterator<int> file_it(in);        // 从文件读取

// 从一个流中读取int序列存入vector
std::istream_iterator<int> it(std::cin);
std::istream_iterator<int> eof;
std::vector<int> vec;
auto back_it = std::back_inserter(vec);
while (it != eof){
    *back_it = *it++;
}

std::ostream_iterator

可以对任何具有输出运算符 << 的类型定义 ostream_iterator；必须将 ostream_iterator 绑定到一个输出流，不允许空的或者表示尾后位置的 ostream_iterator;

// 可以在ostream_iterator创建的时候传递第二个参数，这样每次输出的时候都会打印这样一个C风格的字符串
std::ostream_iterator<int> out_iter(std::cout, " ");
for (auto e : vec){
    *out_iter = e;                // 每次打印完e之后会带一个空格
}
// 通过std::copy来打印，比写for循环更简洁
std::ostream_iterator<int> out_iter(std::cout, " ");
std::copy(vec.begin(), vec.end(), out_iter);

流迭代器不支持递减操作，但是有递增操作；
- istream_iterator++, ++istream_iterator：按定义的 >> 运算法从输入流读取下一个值
- ostream_iterator++, ++ostream_iterator：运算符存在，但是不做任何事情，均返回 iterator 本身

反向迭代器:

在容器中从尾元素向首元素反向移动的迭代器；对于反向迭代器，递增递减的操作会颠倒过来；递增将向前移动，递减往后移动；

可以使用 riterator.base() 将反向迭代器转换为一个普通的迭代器；注意 riterator 和 riterator.base () 指向不同的位置，类似于 begin() 和 rend()、end() 和 rbegin() 一样，都错位了一下

// 正向迭代器: vec.begin() -> vec.end()        头->尾后 左闭右开
// 逆向迭代器：vec.rbegin() -> vec.rend()      尾->首前 右闭左开
std::vector<int> vec = {1,2,3,4,5,6,7,8,9,10};
for(auto iter = vec.crbegin(); iter!=vec.crend(), iter++){
    std::cout << *iter << std::endl;        // 逆向输出 10,9,8,7,6,5,4,3,2,1
}

// 查找最后一个xxx的时候，可以考虑使用逆序迭代器
auto rcomma = std::find(line.crbegin(), line.crend(), ",");        // 打印被,分割的最后一个单词

// 从行尾到行首的方向逆序打印
std::cout << std::string(line.crbegin(), comma) << std::endl;

// convert reverse iterator to a normal iterator: base()
auto comma = rcomma.base();
std::cout << std::string(comma, line.cend()) << std::endl;