Complex numbers matrix transformations

The Boost numeric uBLAS library provides a handful of function to operate transformations on matrix (or vector) of complex numbers.

Element transformation

Three of them are mere convenience functions that apply a transformation for single complex numbers to each element of the container:
ublas::vector<std::complex<double> > v1(6);
for(int i = 0; i < 6; ++i) { v1[i] = std::complex<double> (i, i * -1.0); }
std::cout << v1 << std::endl;

ublas::matrix<std::complex<double> > m1(6,3);
for(unsigned int i = 0; i < m1.size1(); ++i)
    for(unsigned int j = 0; j < m1.size2(); ++j)
        m1(i, j) = std::complex<double>(i, j * -1.0);
std::cout << m1 << std::endl;

std::cout << " conjugate vector: " << ublas::conj(v1) << std::endl; // 1
std::cout << " conjugate matrix: " << ublas::conj(m1) << std::endl;
std::cout << " real vector: " << ublas::real(v1) << std::endl; // 2
std::cout << " real matrix: " << ublas::real(m1) << std::endl;
std::cout << " imaginary vector: " << ublas::imag(v1) << std::endl; // 3
std::cout << " imaginary matrix: " << ublas::imag(m1) << std::endl;
1. The conjugate of a complex number is another complex with the same real part and the imaginary part with the changed sign. The uBLAS conj() function returns a new container where each element is the conjugate of the same element in the original container.
2. A complex number has two components, real and imaginary. The uBLAS real() function returns a container where each element is the real part of the correspondent complex element in the original container.
3. Counterpart of (2), here the container in out contains the imaginary parts of the original complex numbers.

Matrix transposition

The other two functions could still be used on both vectors or matrices, even though they make actual sense only on the latter.
std::cout << " transpose vector: " << ublas::trans(v1) << std::endl; // 1
std::cout << " transpose matrix: " << ublas::trans(m1) << std::endl;
std::cout << " hermitian vector: " << ublas::herm(v1) << std::endl; // 2
std::cout << " hermitian matrix: " << ublas::herm(m1) << std::endl;
1. A transpose of a matrix is another matrix where the rows of the first are the columns of the second. A vector is a matrix with just one row, so there is no change.
2. The Hermitian matrix is another matrix, transpose of the conjugate of the original one. When applied to a vector, the Hermitian generates the conjugated vector.

Go to the full post

Products with uBLAS

Using the vectors and matrices defined in the Boost uBLAS library, we can perform a number of basic linear algebraic operations. Here is a brief description of the products functionality available.

I am performing these tests using Boost 1.49 for MSVC 2010. Here is what I did to use uBLAS vector and matrix:
#include <boost/numeric/ublas/vector.hpp> // 1
#include <boost/numeric/ublas/matrix.hpp> // 2
#include <boost/numeric/ublas/io.hpp> // 3

namespace ublas = boost::numeric::ublas; // 4
1. For the uBLAS vector definition.
2. uBLAS matrix.
3. I/O uBLAS support, it makes output to console easier.
4. Synonym definition to the boost numeric ublas namespace.

Error handling

When running in debug mode, uBLAS performs a strict data check. Usually we need to remove all this checks from the production code for efficiency reasons, making the uBLAS code faster, but moving in this way to our code the responsibility of checking all the weirdness that could happen.

If the symbol NDEBUG is defined as preprocessor constant, the uBLAS code generated by the compiler is lean and mean. Otherwise is safer, but bigger and slower.

So, if we plan to keep in production code the safer uBLAS version, we should try/catch for std::exception our uBLAS code, to react adequately in case of anomalies. Otherwise, for NDEBUG generated code, no check is performed, no exception is thrown by uBLAS. Simply if we give garbage in, we get garbage out. It is our responsibility to ensure that this won't happen.

A couple of uBLAS vectors

To begin playing a bit around with uBLAS products, we need a few of vectors:
ublas::vector<double> v1(6); // 1
ublas::vector<double> v2(6);

for(int i = 0; i < 6; ++i)
    v1[i] = i;
    v2[i] = i + 1;

ublas::vector<double> vs(3);
for(int i = 0; i < 3; ++i) { vs[i] = i + 1; }

std::cout << v1 << v2 << vs << std::endl; // 2
1. The uBLAS vector behave slightly differently from STL vector. We can specify in the constructor its size, and the number of element passed is allocated without being initialized.
2. Dump to standard output console the generated uBLAS vectors:
Inner product

We can calculate the inner product between two vectors through the uBLAS function inner_prod(). It takes in input two vectors with the same size, and returns a scalar value:
std::cout << ublas::inner_prod(v1, v2) << std::endl; // 1
std::cout << std::inner_product(v1.begin(), v1.end(), v2.begin(), 0.0) << std::endl; // 2

try // 3
    double badValue = ublas::inner_prod(v1, vs); // 4
    std::cout << "Bad inner product: " << badValue << std::endl; // 5
catch(const std::exception& ex)
    std::cout << "Can't get inner product: " << ex.what() << std::endl; // 6
1. uBLAS inner product, in this case the expected result is 70.
2. We could use STL instead.
3. Remember that makes sense try/catching uBLAS functions only if NDEBUG has not been defined
4. The two vectors involved in an inner product should have the same size. Here this is not true, trouble should be expected.
5. This line is going to be executed only if NDEBUG has been defined, and in that case the output will be a meaningless value.
6. If we are not running in NDEBUG mode, the uBLAS check on the vectors size would fail, and an exception is thrown, and here it would be caught.

Outer product

The outer products of two vectors is a matrix where the x,y element is the product of the x element of the first vector by the y element of the second one, as shown by this very basic implementation:
namespace my
    ublas::matrix<double> outer_prod(const ublas::vector<double>& v1, const ublas::vector<double>& v2)
        ublas::matrix<double> m(v1.size(), v2.size());
        for(unsigned int i = 0; i < v1.size(); ++i)
            for(unsigned int j = 0; j < v2.size(); ++j)
                m(i, j) = v1[i] * v2[j];
        return m;
There is no constrain on the vector sizes. If they are the same, the result is a square matrix, otherwise we'll get a rectangular one:
std::cout << "uBLAS outer product /1: " << ublas::outer_prod(v1, v2) << std::endl;
std::cout << "uBLAS outer product /2: " << ublas::outer_prod(v2, vs) << std::endl;

ublas::vector<double> v0(0);
std::cout << "uBLAS outer product /3: " << ublas::outer_prod(v1, v0) << std::endl; // 1
1. It is not illegal multiply for a zero-sized vector. In this specific case we get a 6x0 matrix.

Left-right matrix-vector product

The product between matrix and vector is not commutative. The vector could be on the right only if its size matches the matrix size1(), and on the left if it matches size2():
ublas::matrix<double> m1(6,3); // 1

for(unsigned int i = 0; i < m1.size1(); ++i)
    for(unsigned int j = 0; j < m1.size2(); ++j)
        m1(i, j) = (i + j);

std::cout << ublas::prod(v1, m1) << std::endl; // 2
std::cout << ublas::prod(m1, vs) << std::endl; // 3

    std::cout << "Right " << ublas::prod(vs, m1) << std::endl; // 4
catch(const std::exception& ex)
    std::cout << "Can't get outer product: " << ex.what() << std::endl;
1. A 6x3 rectangular matrix.
2. Vector v1 size is 6, matching with matrix size1.
3. All right, vector vs size is 3.
4. Size mismatch! If NDEBUG is not defined, uBLAS here throws an exception. Otherwise, we get a meaningless result.

Matrix-matrix product

We can think to the vector-matrix product as a special case of a matrix to matrix product. So, here too we should ensure that first matrix size2 matches to second matrix size1. Otherwise we are bound to get an exception (no NDEBUG) or nonsensical output (if NDEBUG is defined).
ublas::matrix<double> m2(3,6); // 1

for(unsigned int i = 0; i < m2.size1(); ++i)
    for(unsigned int j = 0; j < m2.size2(); ++j)
        m2(i, j) = (10 - i - j);

std::cout << "Output matrix: " << ublas::prod(m1, m2) << std::endl; // 2

ublas::matrix<double> mq(6,6); // 3

for(unsigned int i = 0; i < mq.size1(); ++i)
    for(unsigned int j = 0; j < mq.size2(); ++j)
        mq(i, j) = i + j;

    std::cout << "Bad: " << ublas::prod(m1, mq) << std::endl; // 4
catch(const std::exception& ex)
    std::cout << "Can't get matrix-matrix product: " << ex.what() << std::endl;
1. A 3x6 rectangular matrix.
2. The first matrix is a 6x3 one, so they could be multiplied.
3. A 6x6 square matrix.
4. Here the no NDEBUG version throws an exception, if NDEBUG was defined, we should expect silly values in the resulting matrix.

Element product

The element product takes as operand two vectors, or matrices, having the same sizes, and produces as result a third vector (or matrix) with the same dimensions, and where each element contains a value that is the product of the original elements in the same position.
std::cout << ublas::element_prod(v1, v2) << std::endl; // 1

    std::cout << ublas::element_prod(v1, vs) << std::endl; // 2
catch(const std::exception& ex)
    std::cout << "Bad element product: " << ex.what() << std::endl;
1. The vectors v1 and v2 have the same size, we can perform an element_prod() on them.
2. Different sizes in the passed vectors result in an exception (if NDEBUG not defined) or in an inconsistent result (NDEBUG defined).
Same for matrices:
ublas::matrix<double> m3(6,3);

for(unsigned int i = 0; i < m3.size1(); ++i)
    for(unsigned int j = 0; j < m3.size2(); ++j)
        m3(i, j) = (7 - i - j);

std::cout << ublas::element_prod(m1, m3) << std::endl; // 1

    std::cout << ublas::element_prod(m1, m2) << std::endl; // 2
catch(const std::exception& ex)
    std::cout << "Bad element product: " << ex.what() << std::endl;
1. This should works fine.
2. A mismatch in the matrices' sizes leads to an exception (NDEBUG not defined) or garbage in the output matrix (NDEBUG defined).

Element division

Element division is about the same of element product:
std::cout << ublas::element_div(v1, v2) << std::endl; // 1
std::cout << ublas::element_div(v2, v1) << std::endl; // 2

    std::cout << ublas::element_div(v1, vs) << std::endl; // 3
catch(const std::exception& ex)
    std::cout << "Bad element product: " << ex.what() << std::endl;

std::cout << ublas::element_div(m1, m3) << std::endl; // 4

    std::cout << ublas::element_div(m1, m2) << std::endl; // 5
catch(const std::exception& ex)
    std::cout << "Bad element product: " << ex.what() << std::endl;
1. This works fine.
2. This works fine, too. But given that the first element in the v1 vector is zero, the first element in the resulting vector is not a valid double, represented as 1.#INF in Windows.
3. Vectors with different sizes, their element division leads to an exception (NDEBUG not defined) or to nonsensical result (NDEBUG defined).
4. Matrices with the same sizes, it works alright with the same caveat in (2), given that we have a zero divisor.
5. Like (3), mismatching sizes for matrices lead to troubles.

Go to the full post

Pointer to function, functor, lambda function

Let's see an example where the use of pointer to function, functor, or lambda function is just a matter of personal taste. I'm assuming you are working with a modern C++ compiler and you don't have any special constrain.

We have an array of doubles, and we want to know how many values in there satisfy a given condition, that could vary. To do that, we want to call the STL function count_if(), using a predicate that we can select accordingly to the user request.

We'll do some test on this data:
double values[] = { 41.97, 12.34, 95.43, 9, 44.51 };
const int SIZE = sizeof(values) / sizeof(double);
And we want just to count how many elements are less than 42.

Pointer to function

The STL count_if() function expects as input the delimiters for the sequence it has to operate on, and a predicate accepting in input a value, matching the collection base type, and returning a boolean. We can think to implement a first rough solution like this:
bool isLessThan42(double value) { return value < 42; }
// ...
std::cout << std::count_if(values, values + SIZE, isLessThan42) << std::endl;
The most painful limitation of this implementation is that the comparator is too inflexible. But we can easily relax it using a binder.

Pointer to function with STL binder

We had a problem in the first implementation, count_if() asks for a one-parameter function, but we want to use a two-parameter one. No problem, we can adapt our function:
bool isLessThan(double value, double limit)
    return value < limit;
// ...
const double LIMIT = 42;
std::cout << std::count_if(values, values + SIZE, std::bind(isLessThan, std::placeholders::_1, LIMIT)) << std::endl;
The STL bind adapter connects our isLessThan() function to count_if, using as first parameter (first placeholder) the one provided by count_if and adding LIMIT in the second place.


The second pointer to function implementation works fine, but why passing each time the limit for comparison? Wouldn't be enough to pass it once, and use it for all the iteration? We can do it using a functor:
class LessThan : public std::unary_function<double, bool> // 1
    argument_type limit_;
    LessThan(argument_type limit) : limit_(limit) {}
    result_type operator() (argument_type value) { return value < limit_; }
// ...
const double LIMIT = 42;
std::cout << std::count_if(values, values + SIZE, LessThan(LIMIT)) << std::endl; // 2
1. unary_function is an STL utility class provided to help keeping consistence for the creating (unary) functors. It is not mandatory to use it, but it is a good idea, even just for documentation. Reading the class header is enough to see that LessThan is a functor that gets in input a double (argument_type) and returns a boolean (result_type).
2. A LessThan object is created passing LIMIT to its ctor, and then passed to count_if(). Each iteration of count_if() calls the LessThan operator() function.

C++11 lambda

If your compiler implements the 2011 C++ standard lambda function, you could simplify the code in this way:
const double LIMIT = 42;
std::cout << std::count_if(values, values + SIZE, [LIMIT](double value){ return value < LIMIT; }) << std::endl;
The access to the LIMIT constant is granted to the lambda function body putting it in the capture section.

Boost lambda

If your compiler does not implement the standard C++ lambda, you could still use it through the Boost Libraries support:
std::cout << "Boost lambda: " << std::count_if(values, values + SIZE, boost::lambda::_1 < LIMIT ) << std::endl;
STL Functor

In such a simple case, is probably not worthy to create a custom functor. We could use a standard one, and adapt it to do it what we want. In this case, we have a ready made less<> functor (better categorized as an adaptable binary predicate) that could do:
std::cout << std::count_if(values, values + SIZE, std::bind2nd(std::less<double>(), LIMIT)) << std::endl;

Go to the full post

Pointer to function

A C++ programmer could use almost indifferently a pointer to function, a functor, or a lambda function to implement the same design request. But pointer to functions are so contrived, difficult to read and maintain, that are commonly used only when one can't avoid to do it. Namely when developing in pure C.

I guess you know what a pointer to function is, and how it works, but let's have a tiny example to refresh the matter, before comparing them with functor and lambda functions.

Say that we want a free function to behave polimorfically, it should run an adder or a multiplier, accordingly to a specific situation, on a couple of doubles, and then output the result. We have some obscure constrain (legacy code that uses our free function, probably) that forces us to think in terms of pointer to functions.

These are the functions that have to be called:
double sum(double a, double b) { return a + b; }
double multiply(double a, double b) { return a * b; }
We define what is the type of a pointer to these functions:
typedef double (*PDF_DD) (double, double);
As one would expect, we defined PDF_DD as a pointer to a function that gets in input two doubles and gives back another double.

This is a possible implementation for our function:
void dispatcher(double x, double y, PDF_DD f) // 1
    if(f == NULL) // 2
        std::cout << "Bad pointer to function!" << std::endl;
    double result = (*f)(x, y); // 3
//  double result = f(x, y); // 4
    std::cout << "Result is: " << result << std::endl;
1. The third input parameter is a pointer to function, as defined above.
2. Being a pointer (to function or data doesn't matter), f could be NULL. So we should check it, to avoid the risk of a NULL dereferencing crash.
3. The pointer to function is dereferenced, and the actual function is called. It is not mandatory explicitly dereferencing a pointer to function, the compiler is smart enough to do it implicitly.
4. Same as (2), but relying on the compiler to convert the pointer to function, before calling the referenced function.

The function should be called like this:
dispatcher(3, 2, sum);
dispatcher(3, 2, multiply);
dispatcher(3, 2, NULL); // 1
1. PDF_DD is a pointer, so this code is legal. Hence we should remember to check this parameter value before dereferencing it.

Go to the full post

Frobenius Norm

The Frobenius norm is the same concept of the Euclidean norm, but applied to matrices.
It is easy to write a pure C function calculating the Frobenius norm:
double frobeniusNorm(double* matrix, int size1, int size2)
    double result = 0.0;
    for(int i = 0; i < size1; ++i)
        for(int j = 0; j < size2; ++j)
            double value = *(matrix + (i*size2) + j);
            result += value * value;
    return sqrt(result);
This piece of code would look a bit counterintuitive to someone not used to how multidimensional arrays are managed in C, but once you get the point that they are flatted and seen just like a one dimensional array, the things start looking clearer.

We could have written the same function in a more friendly way, using the array notation, ending up in something like this:
double frobeniusNorm(double matrix[][MAT_SIZE_2], int size1)
    double result = 0.0;
    for(int i = 0; i < size1; ++i)
        for(int j = 0; j < MAT_SIZE_2; ++j)
            double value = matrix[i][j];
            result += value * value;
    return sqrt(result);
But the compiler has to know one matrix dimension, so that it could converts the matrix notation in the actual address of the memory cell containing its value, so we are forced to pass it (here is used an int constant, MAT_SIZE_2, whose definition is showed below).

In C, a 3x3 matrix would be typically defined in one of this two ways:
double dm[] = { // 1
    1, 2, 3,
    4, 5, 6,
    7, 8, 9
const int MAT_SIZE_2 = 3;
double dm2[][MAT_SIZE_2] = { // 2
    {1, 2, 3},
    {4, 5, 6},
    {7, 8, 9}
1. In this way we keep the representation close to what is the actual way the data are stored in memory, but the code could be a bit confusing: is this a matrix or a vector?
2. Here we are stating clearly that we are dealing with a two dimensional vector, but this is what we have to do to call our generic Frobenius norm calculator:
frobeniusNorm(&dm2[0][0], 3, 3)
We can't pass a bidimesional array when a bare pointer is expected, so we have to explicitly say to the compiler that it should get the address to the first element in the array.

We could save all this trouble using the Boost uBLAS matrix class:
// ...
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
namespace ublas = boost::numeric::ublas;
// ...
ublas::matrix<double> matrix(3, 3);
int k = 0;
for(unsigned int i = 0; i < matrix.size1(); ++i)
    for(unsigned int j = 0; j < matrix.size2(); ++j)
        matrix(i, j) = ++k;
// ...
double frobeniusNorm(const ublas::matrix<double>& matrix)
    double result = 0.0;
    for(unsigned int i = 0; i < matrix.size1(); ++i)
        for(unsigned int j = 0; j < matrix.size2(); ++j)
            double value = matrix(i, j);
            result += value * value;
    return sqrt(result);
Naturally, the above implementation of a Frobenius norm calculator is useful only to show how to work with an uBLAS matrix. In real code, it does not make much sense using uBLAS data structure and not its functions:
std::cout << ublas::norm_frobenius(matrix) << std::endl;

Go to the full post

Euclidean norm

The Euclidean norm is defined on a sequence of numbers as the square root of sum of square of each element or, equivalently, as the square root of the inner product of a vector and itself. Simple, isn't it? Actually it is. In C++ we could get it in a whiff, but walking through all the details would make it much more interesting.

We have to apply the square root on a value that could be get through a couple of recipes. Let's go for the first one.

Sum of squares

An intuitive function to calculate the sum of squares is:
double sumSquares(const std::vector<double>& v)
    double result = 0.0;
    for(unsigned int i = 0; i < v.size(); ++i)
        result += (v[i] * v[i]);
    return result;
With some extra-thinking we should ends up producing a more generic solution like:
template<typename C> // 1
typename C::value_type sumSquares(const C& c) // 2
    return std::accumulate(c.begin(), c.end(), static_cast<C::value_type>(0), // 3
        [](C::value_type sum, C::value_type value){return sum + value * value; }); // 4
1. This template function should be called for containers, so I used C as typename.
2. The return value type is defined as the passed container underlying value_type. We have to stress the fact that C::value_type is a type name, otherwise the compiler wouldn't understand what we meant and would signal its perplexity with a series of errors that on VC++2010 starts with this warning:
warning C4346: 'C::value_type' : dependent name is not a type
3. I use the STL accumulate() function, specifying beginning and end of the sequence, the initializing value for the sum, and (next line) the predicate that we want to use to generate the accumulation. The third parameter is not just a simple zero, is a zero for the right underlying type. Using a "simple" zero could lead to truncating issues, given that the type of variable used by accumulate() the store the sum is deducted by the type of this value.
4. As predicate I used a lambda function that gets as input parameter the generating sum and the value of the current collection element in evaluation. It calculates the square of the value, add it to the sum, and returns it.

Here is a pre-C++11 (no lambda) implementation of the same functionality:
template<typename T>
T square(T sum, T value)
    return sum  + value * value;

template<typename C>
typename C::value_type sumSquares(const C& c)
    return std::accumulate(c.begin(), c.end(), static_cast<C::value_type>(0), square<C::value_type>);
Having available a sumSquares() function, it is easy to calculate the Euclidean mean for a vector:
std::vector<double> v;
// ...
double euclideanNorm = sqrt(sumSquares(v));

Inner product

The inner product, also known as dot or scalar product, is an algebraic operation that given two vectors of the same size returns the sum of the product of each corresponding element in the vectors.

A trivial C/C++ implementation for inner product is:
double innerProduct(double* x, double* y, unsigned int size)
    double result = 0.0;
    for(unsigned int i = 0; i < size; ++i)
        result += x[i] * y[i];
    return result;
Major weakness of this implementation is that it lacks genericity: it is tailored just for arrays of double, any other container, and any other data type, would require another function.

A rapid browsing in the STL C++ library would give as a more interesting algorithm, std::inner_product(), that comes in a couple of flavors, the plain one, and one that uses whichever functors we like instead of the standard plus and multiply for the current datatype.

The simpler inner_product() version requires in input three iterators, two for delimiting the first sequence, one for the beginning of the second - remember that the two sequences should have the same size. A fourth parameter is required as initializer for the returned value, and usually is set to zero.

The testing code below calls my trivial implementation above, and then the standard inner_product(), so we can compare the two solutions:
double d1[] = { 1, 3 };
double d2[] = { 6, 2 };
std::cout << innerProduct(d1, d2, 2) << std::endl;
typedef std::vector<double> DVector;
DVector v1(d1, d1 + 2);
DVector v2(d2, d2 + 2);
std::cout << std::inner_product(d1, d1 + 2, d2, 0.0) << std::endl;
std::cout << std::inner_product(v1.begin(), v1.end(), v2.begin(), 0.0) << std::endl;
Notice that I passed the initializing value to sum as an explicit double. If I used a plain 0, the compiled would have inferred that I wanted to have the result as an int, truncating it as we asked.

The amazing result of all this chatting is that we can calculate a vector Euclidean norm writing a single line of code:
std::vector<double> v;
// ...
double euclideanNorm = sqrt(std::inner_product(v.begin(), v.end(), v.begin(), 0.0));

Still, we can do even better, at least from the point of view of the expressiveness and maintainability of our code, using an algebraic library.


The Boost Numeric uBLAS library provides a number of common algebraic function, and among them there is also the vector Euclidean norm. We only have to pay attention to the fact that we have to work with uBLAS vector and not STL ones:
// ...
#include <boost/numeric/ublas/vector.hpp>
#include <boost/numeric/ublas/io.hpp>

namespace ublas = boost::numeric::ublas;

// ...
ublas::vector<double> uv(2);
uv[0] = 6;
uv[1] = 2;

std::cout << "Euclidean Norm: " << ublas::norm_2(uv) << std::endl;

Go to the full post

uBLAS matrix

There is not such a beast like a matrix in standard C++. When we don't fallback to the raw bidimensional array, a real pain in the neck to work with and to maintain, the common approach is emulating it with a vector of vectors. It could be enough in some cases, but if our problem is not trivial, it is a good idea to look for alternative solutions provided by non standard libraries.

A popular C++ algebraic library is uBLAS, that implements the BLAS interface and it is part of the Boost Libraries.

A standard solution

If we want to manage a very simple case of a 4x4 matrix of integers that has to be just created with all its values set to zero, and then printed, we typically end up with code like this:
const int IM_SIZE_1 = 4; // 1
const int IM_SIZE_2 = 4;
IMatrix matrix(IM_SIZE_1); // 2
std::for_each(matrix.begin(), matrix.end(), [IM_SIZE_2](IVector& v) { v = IVector(IM_SIZE_2); }); // 3
dump(matrix); // 4
1. Specify the matrix dimensions, 4x4.
2. Despite its wishful name, IMatrix is not really a matrix, but a vector, see below for details.
3. Each element of the IMatrix vector has to be initialized by hand, specifying each line dimension. The IVector is passed to the lambda function by reference, so that the assignment to a new instance reflects to the actual element in IMatrix. Otherwise the assignment would have effect only on a local variable that would be discarded at the end of the lambda scope.
4. Now we can call an ad hoc function we have written to dump the matrix to the standard console.

Let's have a look of what IMatrix actually is, and how to implement the dump function:
#include <iostream>
#include <vector>
#include <algorithm>
typedef std::vector<int> IVector;
typedef std::vector<IVector> IMatrix; // 1

void dump(const IVector& vector) // 2
    std::cout << "( ";
    std::for_each(vector.begin(), vector.end(), [](int i){ std::cout << i << ' '; }); // 3
    std::cout << ") ";

void dump(const IMatrix& matrix)
    std::for_each(matrix.begin(), matrix.end(), [](const IVector& v){ dump(v); }); // 4
    std::cout << std::endl;
1. IMatrix is nothing more that a synonym for a vector of a vectors of ints.
2. To dump an IMatrix, we should know how to print its IVector components.
3. Pretty easy: for_each element in the vector, from begin() to end(), we call a tiny lambda function that does the dirty job.
4. Now printing an IMatrix is just a matter of calling (2) for each of its elements. The IVector is passed by reference just for performance reasons, we don't want to waste time to create a meaningless local copy.

uBLAS solution

The code above is sort of intuitive, and for such a simple example we won't normally ask for anything better, but let's see in any case how we can refactor it to use the uBLAS matrix concept:
// ...
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
namespace ublas = boost::numeric::ublas; // 1
ublas::matrix<int> matrix(4, 4); // 2
for(unsigned int i = 0; i < matrix.size1(); ++i)
    for(unsigned int j = 0; j < matrix.size2(); ++j)
        matrix(i, j) = 0; // 3
std::cout << matrix << std::endl; // 4
1. Define an alias to the uBLAS namespace, to avoid the nuisance of using the original longish namespace.
2. Instantiate a 4x4 uBLAS matrix object.
3. For efficiency, no implicit initialization of elements is performed by the matrix ctor. The matrix dimensions are available through the getters size1() and size2().
4. Sweetly, the operator "put to" for the standard output console has been oveloaded for the uBLAS matrix.

Go to the full post

Envelopes for PUB-SUB

ZeroMQ is quite low level. For instance no implicit envelope is wrapped around a message if there is no impelling architectural reason for it. This is good from a performance point of view, but it is a bit of a nuisance when we actually need an envelope for marking our message. Think about a typical PUB-SUB application. The publisher sends a wide number of messages, each of them identified by a key that marks it interesting only for some subscribers.

It is quite natural to put the key in a specific frame, and store the information in another one. Still we have to do it by hand, managing explicitly a multi-part message, taking care of the gory details.

The almost meaningless ØMQ 3.1 PUB-SUB application shown here is designed in this way: the publisher sends ten couples of messages, each of them composed by two frames, key and value; the client subscribes for a key, and gets a couple of messages before terminating. The sense of sending all those messages from the producer is that I haven't implemented any synchronization between PUB and SUB, and we should in some way giving enough time to the subscribers to enter in the game before the publisher ends its stream of messages.

This is the main loop in the producer:
for(int i = 0; i < 10; ++i)
    zmq_send(skPub, "A", 1, ZMQ_SNDMORE); // 1
    char* msgA = "An uninteresting message";
    zmq_send(skPub, msgA, strlen(msgA), 0); // 2
    std::cout << "Sent: " << msgA << std::endl;
    zmq_send(skPub, "B", 1, ZMQ_SNDMORE); // 3
    char* msgB = "An interesting message";
    zmq_send(skPub, msgB, strlen(msgB), 0); // 4
    std::cout << "Sent: " << msgB << std::endl;
    boost::this_thread::sleep(boost::posix_time::seconds(1)); // 5
1. Send the first frame in a message through the PUB socket. The flag is set to ZMQ_SNDMORE, at least one more frame should be sent to complete the message. This is the key of the message, and it is set to a one character string, "A". Right, one character, there is no '\0' in a 0MQ string.
2. Second frame for the first message, the flag is set to zero, so this is the tail of it.
3. A new messages starts here, having in its first frame a different one-character key, "B".
4. Payload for the second message.
5. The main reason of looping in the publisher is giving time to the subscriber. So we slow down the process with a sleep.

Each subscriber subscribes to the key that it likes, and then gets the relevant messages:
// ...
zmq_setsockopt(skSub, ZMQ_SUBSCRIBE, "B", 1); // 1
for(int i = 0; i < 2; ++i)
    char key[BUF_SIZE];
    int kLen = zmq_recv(skSub, key, BUF_SIZE, 0); // 2
    std::cout << "Key: ";
    std::for_each(key, key + kLen, [](char c){ std::cout << c; });
    std::cout << std::endl;
    char value[BUF_SIZE];
    int vLen = zmq_recv(skSub, value, BUF_SIZE, 0); // 3
    std::cout << "Value: ";
    std::for_each(value, value + vLen, [](char c){ std::cout << c; });
    std::cout << std::endl;
1. Here we want to get only the "B" messages.
2. Receive a frame. We should already know that this is just a key, and that the next frame is still part of the same message.
3. Second frame received. If the design changes, we are going to be in troubles, since there is no way to know if this is really the last part of the same message.

The post is based on the Pub-sub Message Envelopes section of the Z-Guide. I have already written a similar post, but in that case it was for version 2.1 and in C++; I ported it here to 3.1 and using the raw C interface.

Go to the full post

s_dump() for ZeroMQ version 3

Have been the ØMQ version 2 utility functions defined in zhelpers.h ported to version 3? If so, this post is not so useful as it could have been, but I guess it still has some interesting bit to show.

What I am doing here is rewriting the s_dump() function for ZeroMQ 3.1 on C++11, using some stuff like casting, manipulators, "auto" keyword in its new skin, lambda functions, a couple of STL algorithms. All in all a piece of fun code, I'd say.

What the s_dump() function does is dumping to standard output console a message pending on the specified socket. Here is my rewrite:
void dumpMessage(void* skCheck) // 1
    std::cout << "----------------------------------------" << std::endl;
    do {
        zmq_msg_t message;
        zmq_recvmsg(skCheck, &message, 0); // 2
        unsigned char* data = static_cast<unsigned char*>(zmq_msg_data(&message)); // 3
        int size = zmq_msg_size(&message);
        auto isBin = [](unsigned char uc){ return (uc < 32 || uc > 127) ? true : false; }; // 4
        bool isText = (std::find_if(data, data + size, isBin) == data + size) ? true : false; // 5
        std::cout << (isText ? "TXT" : "BIN") << " [" << std::setw(3) << std::setfill('0') << size << "] ";
        std::for_each(data, data + size, [isText](unsigned char c) // 6
                std::cout << std::setw(2) << std::hex << static_cast<int>(c); // 7
                std::cout << c;
        std::cout << std::endl;
    while(moreFrames(skCheck)); // 8
1. Just a matter of taste, I reckon this name is more descriptive. The input parameter is a socket, assumed to be "good", it is a very trusting piece of code.
2. The recv() function version 2 has been renamed recvmsg in version 3.
3. A message could be in text or binary format, so better using bytes (unsigned char for C/C++) as underlying type.
4. This line could have been included in the next one, but I suspected it was better to split them and make the code clearer. Here I define a lambda function, not bothering to specify its actual class type, falling back instead to the "auto" keyword, leaving to the compiler the burden of correctly identify it. It should be almost immediate its purpose, it gets in input a byte, checks if it is in the range of plain Latin characters and return true if it is considered binary.
5. Check if the message could be considered textual or should be interpreted as binary. It calls the STL algorithm find_if() to get the first binary byte in the data. That function returns "end" if the search fails, and in our case that is represented by the pointer to the array plus its size.
6. Another lambda function, here called as predicate for the STL algorithm for_each(), here we are using also the capture clause, since we want to use the boolean isText in this closure.
7. In case of binary we print each element as a number (that's way we cast it int, otherwise it would be interpreted as a character), in hexadecimal format, reserving two positions to each one. Being a number, the filling character is zero. In case of text a blank would have been the default choice.
8. Check this was a non-final part of a multi-part message. If so, we iterate till all the message frames are processed.
Here is the function to check if there are more frames for the current message:
bool moreFrames(void* skCheck)
    int more; // 1
    size_t more_size = sizeof(int);
    zmq_getsockopt(skCheck, ZMQ_RCVMORE, &more, &more_size);
    return more != 0; // 2
1. This is a difference between ZeroMQ version 2 and 3. Before it was a fixed sized variable (64 bit), now it is a plain int.
2. Returns true if the RCVMORE option in the specified socket is set.

Go to the full post

PUB-SUB coordination by REQ-REP: the subscriber

In the previous post we have seen a publisher that is coordinated to its subscribers, now it is time to show how the subscriber works.

Let's see the main part of the subscriber function:
void* context = zmq_init(1);

void* skSub = zmq_socket(context, ZMQ_SUB);
zmq_connect(skSub, "tcp://localhost:5561");
zmq_setsockopt(skSub, ZMQ_SUBSCRIBE, NULL, 0); // 1

zmq_recv(skSub, NULL, 0, 0); // 2
acknowledge(context); // 3

    int value;
    int len = zmq_recv(skSub, &value, sizeof(int), 0); // 4
    if(len != sizeof(int)) // 5
        // ...
    if(value == PING_FLAG) // 6
        // ...
    // 7

// ...
1. No filter on the subscription.
2. The subscriber waits till it sees the publisher.
3. See below, the subscriber lets the publisher know about it.
4. Receive the value, expected to be an int.
5. If it is not an int (typically, we got an error, or the empty message used as terminator) we terminate the loop.
6. Dummy value detected, the publisher is still looping waiting for more subscribers. The value is discarded, and we get to the next iteration.
7. The value is real, let's process it.

A request socket is used to communicate back to the publisher, so that it would know about a new subscriber connected to it:
void acknowledge(void* context)
    void* skSync = zmq_socket(context, ZMQ_REQ);
    zmq_connect(skSync, "tcp://localhost:5562");
    zmq_send(skSync, NULL, 0, 0);
    zmq_recv(skSync, NULL, 0, 0);
It is just a matter of sending and receiving an empty message. The socket is used just here, so we create and close it in the function body.

Go to the full post

PUB-SUB coordination by REQ-REP: the publisher

We can't use ZeroMQ PAIR sockets to synchronize processes. As seen in the previous posts, they are explicitly designed to work in multithreaded scope. For a ØMQ multiprocess application we use instead the REQ-REP pattern. Here I am going to take care of the publisher, next post is dedicated to the subscriber.

The code is a porting in C - ØMQ 3.1 of the similar example I have written in C++ for version 2.1. The original source is the Z-Guide, currently still referring to 2.1.

Before start publishing, the publisher waits that all the requested subscribers are connected:
void* context = zmq_init(1);
void* skPub = zmq_socket(context, ZMQ_PUB); // 1
zmq_bind(skPub, "tcp://*:5561");

void* skSync = zmq_socket(context, ZMQ_REP); // 2
zmq_bind(skSync, "tcp://*:5562");

std::cout << "Waiting for " << SUBS << " subscribers." << std::endl;
int subscribers = 0;
while(subscribers < SUBS) // 3
    ping(skPub); // 4
    if(handshake(skSync)) // 5

publishing(skPub, MSG_NBR); // 6
1. PUB socket, used to publish the messages to the subscribers.
2. REP socket, used by the subscribers to signal that they are ready to receive messages.
3. SUBS is the number of expected subscribers. The publisher won't start sending the real messages till all of them are connected.
4. A "ping" message is sent on the PUB socket to show the subscribers that the publisher is up and waiting for them.
5. The handshake() function checks for subscribers synchronization on the REP socket.
6. All the subscribers are there, the publisher can start publishing.

The ping() function is pretty simple, a dummy message, here a minus one integer, is sent on the PUB socket:
const int PING_FLAG = -1;

void ping(void* skPub)
    zmq_send(skPub, &PING_FLAG, sizeof(int), 0);
More interesting the polling on the REP socket:
bool handshake(void* skSync)
    boost::this_thread::sleep(boost::posix_time::seconds(1)); // 1

    if(zmq_recv(skSync, NULL, 0, ZMQ_DONTWAIT) == 0) // 2
        zmq_send(skSync, NULL, 0, 0); // 3
        std::cout << " handshake!" << std::endl;
        return true;

    std::cout << '.';
    return false;
1. Sleep one second, to give time to the subscribers to connect.
2. Check on the socket for an empty message.
3. Acknowledge the subscriber that the publisher has seen it.

The publishing itself is nothing fancy, a bunch of integers are sent on the PUB socket, followed by an empty message as terminator:
void publishing(void* skPub, const int MSG_NBR)
    std::cout << "sending messages" << std::endl;
    for(int i = 1; i <= MSG_NBR; ++i)
        zmq_send(skPub, &i, sizeof(int), 0);

    std::cout << "sending terminator" << std::endl;
    zmq_send(skPub, NULL, 0, 0);

Go to the full post

ZMQ_PAIR sockets for multithreading coordination

In a classic multithreading model, all this coordination is done through access to shared variables protected by mutexes and locks. In a messaging multithread model, synchronization is achieved through messages. Normally we use the same messaging pattern for both multiprocess and multithreading application, as we have seen for multithread broker in a previous post, but when we need to squeeze even the last drop of speed in our ØMQ multithread application, ZMQ_PAIR sockets are there for us. The other side of the story is that, if we use ZMQ_PAIR sockets in our multithread application, we make harder to rewrite it as a multiprocess application. ZMQ_PAIR are designed explicitly for multithreading use, and don't scale well in a multiprocess environment. They are designed to be used in a stable context, so they do not automatically reconnect.

Let's think to a three-step multithreaded application. The main thread creates a worker thread, and sits waiting for it to signaling the job is done. The worker creates a sub-worker, does some job, than puts its result together with the one from its sub, and signals the main thread that it could go on.

No locks and mutexes are used in the application, with the noticeable exception of printing to standard output, resource shared among all the threads:
boost::mutex mio;

void print(const char* message)
    boost::lock_guard<boost::mutex> lock(mio);

    std::cout << boost::this_thread::get_id() << ": " << message << std::endl;
Here is the code to be executed by the main thread:
print("main: start");
void* context = zmq_init(1);

print("main: create socket for answer");
void* skMain = zmq_socket(context, ZMQ_PAIR); // 1
zmq_bind(skMain, "inproc://main"); // 2

print("main: create thread for worker");
boost::thread t2(doWork, context); // 3

print("main: do some preparation job");
boost::this_thread::sleep(boost::posix_time::milliseconds(200)); // 4

print("main: wait for worker result");
zmq_recv(skMain, NULL, 0, 0); // 5
print("main: message received");

t2.join(); // 6
print("main: done");
1. Even though this socket is used only a few lines down, it has to be created here, before the worker thread. The reason lies in the nature of pair sockets, they have no automatic reconnection capability, so the server socket must be up before its client try to connect.
2. This is a server socket in a pair connection, it uses an in process protocol, and it is identified by the name "main".
3. The worker is started, see below the code for doWork(), but pay attention to the fact that we are passing to it the 0MQ context.
4. Simulating some work.
5. We don't want to get any data from the worker, so we get an empty message. Here a bit of error handling would be a good idea, for a production code.
6. Wait the worker to terminate and then shut down the socket and the context.

Here is the worker in all its splendor:
void doWork(void* context) // 1
    print("worker: part 1 cooperating with another thread");

    void* skSub = zmq_socket(context, ZMQ_PAIR); // 2
    zmq_bind(skSub, "inproc://sub");

    print("worker: create thread for subworker");
    boost::thread t(doSub, context); // 3

    print("worker: do something");

    print("worker: waiting for subworker");
    zmq_recv(skSub, NULL, 0, 0); // 4

    print("worker: subworker is done");

    // ---

    print("worker: part 2, answering to main");

    void* skMain = zmq_socket(context, ZMQ_PAIR); // 5
    zmq_connect(skMain, "inproc://main");

    print("worker: doing something else");

    print("worker: signal back to main");
    zmq_send(skMain, NULL, 0, 0);

    print("worker: done");
1. The ZMQ context is the only thread-safe object in this structure, so it could, and should, be passed around threads that want to be coordinated.
2. Another pair socket used to connect the worker to a its sub-worker.
3. See below the code for the sub-worker function.
4. Again, this thread wait a communication from its relative worker, and then close the socket.
5. This socket is used to connect this thread with the main one, it expects the server socket for this connection to be already up.

The code for the sub-worker should be no surprise:
void doSub(void* context)
    print("sub: start");

    void* skSub = zmq_socket(context, ZMQ_PAIR); // 1
    zmq_connect(skSub, "inproc://sub");

    print("sub: tell to worker we're ready");
    zmq_send(skSub, NULL, 0, 0); // 2

    print("sub: done");
1. Pair socket back to the worker.
2. An empty message to say that it is done.

I have written the code in this post expressly for ZeroMQ 3.1, and the standard C interface. It is basically a porting of my previous post that I wrote for the C++ official interface while reading the ZGuide (that currently still refers to version 2.1.x).

Go to the full post

Boost heap priority queue

When the C++ STL priority queue is not enough, we could check if the Boost Heap priority queue is a better match to our requirements.

For instance, we could be interested in traversing the elements in the priority queue without popping them. Boost Heap priority queue provides a constant iterator that doesn't allow us to modify the underlying data, but lets us fetch them:
#include <iostream>
#include <boost/heap/priority_queue.hpp>

// ...

typedef boost::heap::priority_queue<int> MyPriQue;

MyPriQue pq;

for(MyPriQue::const_iterator it = pq.begin(); it != pq.end();++it)
    std::cout << *it << ' ';
std::cout << std::endl;
The output is:
44 42 24 22

Go to the full post