Store credit code jam

I have found out that a common source of interview questions is Google Code Jam. You can practicing for this contest seeing what they did in the past editions, and there you can find interesting problems. The other day I have played with the first one proposed, part of the Qualification Round Africa 2010.

The core of the job is writing a function that gets in input an integer, representing an amount that we want to spend, and an array of integers, representing the cost of any single item that we could buy.

We know for sure that exists one and only one pair of values in the array respecting the condition:
total = first + second
Total is given, our task is returning the indexes in the array for first and second.

I avoid discussing the code around the function, that is pretty boring. It deals about reading from the input file provided by Code Jam and creating the relative output file in the expected format. Being C++ the language that I used, it was a matter of opening an ifstream, calling getline() on it a few times, converting strings to int (atoi sufficed), and extracting integers from a string (istringstream came handy). Then I created a ofstream, and put to it the results.

I should handle couples of values, so I am going to use std::pair, that I typedef'ed it Choice, to make it a bit more expressive. The function itself takes in input the total and the available values, and gets back the pair of indexes:
typedef std::pair<int, int> Choice;

Choice getChoice(int total, const std::vector<int>& input);
Before let my program work with the actual input data, I have written a few test cases, that I used to drive my code development. Here is just one of them:
TEST(TestCredit, Case1)
{
    const int TOTAL = 100;
    const int VALUES[] = {5, 75, 25};
    std::vector<int> values(VALUES, VALUES + sizeof(VALUES) / sizeof(int));

    Choice choice = getChoice(TOTAL, values);

    EXPECT_EQ(2, choice.first);
    EXPECT_EQ(3, choice.second);
}
And here is how I implemented the function:
Choice getChoice(int total, const std::vector<int>& input)
{
    typedef std::vector<int>::size_type ST;
    for(ST i = 0; i < input.size() - 1; ++i) // 1
    {
        if(input[i] >= total) // 2
            continue;

        for(ST j = i + 1; j < input.size(); ++j) // 3
        {
            if(input[i] + input[j] == total)
                return Choice(i+1, j+1);
        }
    }

    return Choice(); // 4
}
1. Admittedly, this implementation is quite brutal. Still, we didn't get tough requirements on time complexity, and we need to produce some working code quickly. So, I reckoned this plain Oh-n-square algorithm should be enough.
2. We need two values, so the first one should be smaller than the total. If this is not true, we skip to the next iteration.
3. Let's check all the values after the current one. If we found the happy couple, return it.
4. We should never get it, accordingly to the requirements. So I return the "bad" pair (0, 0). I could have thrown an exception, to make things more clear.

My test cases OK'd it, so I ran my application against the data provided by Code Jam (Oh-n-square performance was not a problem at all), submitted the results, and got the confirmation that all works alright.

In the comment to the post, Arthur passes a link to a github page with his two solutions to this problem. As Arthur says, one is quite similar to the one I have already presented, the other one it looks to me overly complicated, and probably an example of premature optimization (root of all evil!). I found more interesting the simple one, that makes uses of STL and C++11 features. On its dark side, it abuses of the C++ meaning for the auto keyword, and it loops by while, in a way that seems to me less natural than by for.

Using the Arthur suggestions, I'll rewrite the getChoice() in this way:
Choice getChoice(const int total, const std::vector<int>& values)
{
  typedef std::vector<int>::const_iterator IT;
  for(IT it = values.begin(); it != values.end() - 1; ++it) // 1
  {
    int first = *it;
    if(first >= total)
        continue;

    IT it2 = std::find_if(it + 1, values.end(), [first, total] (const int second) { // 2
      return first + second == total;
    });
    if(it2 != values.end()) // 3
      return { it - values.begin() + 1, it2 - values.begin() + 1 };
  }

  return { 0, 0 }; // 4
}
1. Instead of a plain old for loop, a loop on an iterator interval is more expressive.
2. The internal loop could be rendered with a call to the STL find_if() algorithm couple with a lambda. This line could be read: return the iterator to the element in the vector in the interval starting from the next to the current element, ending to the natural end of the range, that satisfies the requested relation (first plus second equals total).
3. If find_if() fails, it returns the end() iterator to the container.
4. Using the new C++11 notation to create a Choice (that is, a std::pair) could actually result more natural.

Go to the full post

Palindrome string

You know what a palindrome is, right? For instance, SOS by ABBA, is a song with a palindromic title played by a pop group with a palindromic name.

It is easy to write a function that detects if a string is a palindrome, so an interviewer sometimes asks it as a first question, just to break the ice.

Here is a no-frills implementation in C:
int cPalindrome(const char* input)
{
    if(input == NULL) // 1
        return 1;

    int len = strlen(input); // 2
    for(int i = 0; i < len / 2; ++i) // 3
    {
        if(input[i] != input[len - i - 1]) // 4
            return 0;
    }

    return 1; // 5
}
1. If the caller pass a NULL, should we consider it a palindrome or not? It is questionable, at it should be a matter of a little chat to decide what the expected behavior should actually be. I voted for the most permissive option.
2. Painfully, we have to loop on the string once, just to get its length. But we have no choice, we can't even start checking the string if we don't know where is its end.
3. We check each couple of elements, moving from the extremes to the center.
4. As soon as we detect a mismatch, we return a "No, sir, this is not a palindrome!".
5. All string checked (notice that it could be even an empty one), no difference spotted, we claim it is a palindrome.

We can rewrite it in C++, and the result is an amazing one-liner:
bool palindrome(const std::string& input)
{
    return std::equal(input.begin(), input.begin() + input.length() / 2, input.rbegin());
}
Right, one line, but an intense one. There are a few things to notice.

No check on NULL string. It is not possible to convert a NULL to a string object, so such test, when required, should be moved outside the function.

To compare the elements in the string, I used the STL equal() function (defined in the algorithm include file). First two parameters are the delimiter for the first sequence to be checked, third one is the begin of the second sequence. Notice that the third parameter is rbegin(), the "reverse begin" of the string, the iterator to the last valid element of the container (or, in this case, string) that, incremented, moves to the left, and not to the right, as we usually expect in an iterator.

A variation to this classical question, is asking to write it as a recursive function. Here is a possible C++ implementation:
bool rPalindrome(const std::string& input)
{
    if(input.length() < 2) // 1
        return true;
    if(input.front() == input.back()) // 2
        return rPalindrome(input.substr(1, input.length() - 2)); // 3
    return false; // 4
}
1. Empty strings, or single-character ones, are palindrome, without need of any other test.
2. The current string has more that two characters, let's check the one to the extreme left with the one to the extreme right. If they are the same, the string is a palindrome if and only if the internal string is (recursively) a palindrome.
3. So we check the internal string, that is extracted from the current one using the STL string::substr() method.
4. Otherwise the current string is not a palindrome.

Go to the full post

Reading about ZeroMQ

I have just got this book, ZeroMQ - Use ZeroMQ and learn how to apply different message patterns, written by Faruk Akgul for Packt Publishing, and I about to read it. Some more substantial feedback in the near future.

It is a bit under my level, since it is thought for C developers at their first experience with ZeroMQ, but it is a while I don't actually work with this fine framework, and I felt it would be nice to have a refresh of the basic concepts starting from a different perspective.

Browsing the index, you could see how the book is structured to provide an introduction to ZeroMQ, letting you know how to write a simple C client-server application, describing how it works. Chapter two is mainly dedicated to a couple of messaging patterns, pub-sub and pipeline, and there is also a section dedicated to Valgrind, and how to use it to detect memory leaks on 0MQ. Chapter three gives more details on what a ZeroMQ socket is, and how to use it. In its second part, an introduction to CZMQ, the high(er) level C wrapper to the standard ZeroMQ library, is given. Chapter four delves a bit more on some more advanced topics.

The code in the book has been written for ZMQ version 3.2, and CZMQ 1.3.1; for what I have seen, the building instruction are provided only for the GCC compiler (version 4.7.2 is cited, I quite confident the newest 4.8 would be alright).

On Windows, Visual C++ is the suggested compiler. I couldn't see any detail on how to set MSVC for ZeroMQ in the book, still I could assure you that it is very easy to build a ZMQ solution in that environment too. If you need an hint, have a look at this ancient post of mine, that should be still valid.

Go to the full post

Matching bracket

Think of a string (it could be very long) composed exclusively by round brackets (aka parentheses). We want to check if it is correct, in the sense that each open bracket should have a matching close one. So, a string starting with a ")" is considered wrong, while the "()()()" string is a good one. Nested brackets are allowed, so "((()())(()()))" is a good sequence.

It is not difficult to think to a solution, still, you should not forget that unharmful looking hint on the possible large input size.

I bumped into this problem when having a coffee-break with colleagues. We were talking about interview questions, and someone remembered this one. In a few minutes we envisioned a couple of possible solutions, neither of them looked satisfactory. After a bit more of talking, and we found out a third candidate, that looked quite good.

First attempt

Naively, we could iteratively remove each "()" from the input string, until we could not find anymore such pattern. After that, we simply check if the string is empty. The good thing about this solution is that is easy to describe and implement. Less convincing is that we should copy the input to a buffer, and then perform on it a lot of splitting and splicing of substrings.

Let's see a possible C++ implementation of this idea:
bool simpleChecker(const std::string& input)
{
  std::string buffer(input);

  if(buffer.empty()) // 1
    return true;

  if(buffer.length() % 2) // 2
    return false;

  while(true)
  {
    std::string::size_type pos = buffer.find("()"); // 3
    if(pos == std::string::npos)
      break;

    buffer.replace(pos, 2, ""); // 4
  }

  return buffer.empty(); // 5
}
1. The requisites do not say if an empty string should be considered valid or not. Let's assume it should.
2. If we have an odd number of elements, the string is obviously invalid.
3. We stop looping if we can't find a "()" element.
4. When we find a "()", we replace it with an empty string (i.e., we remove it).
5. If the string is empty, the validation succeeded.

This code works fine, even for moderately large strings. In the best case, a monotonous sequence of "()()()()", the standard string replace() function is so nicely written that on my machine it takes a handful of millisecs to check an input of thousands characters. On the other side, it is easy to spot a sequence that gives troubles to replace(). The test case here below takes me something around one second on a 50K string:
TEST(CPPSimple, FiftyKWorstCase)
{
  const int SIZE = 50 * 1024;

  char buffer[SIZE + 1];
  for(int i = 0; i < SIZE / 2; ++i)
    buffer[i] = '(';
  for(int i = SIZE / 2; i < SIZE; ++i)
    buffer[i] = ')';
  buffer[SIZE] = '\0';

  EXPECT_TRUE(simpleChecker(buffer));
}
And his bigger brother, 250K sized, takes something like half a minute.

Recursive approach

The main issue of the first try is that it modifies the string. A better approach would be checking the string for each matching parenthesis, without doing any change. We read the first character in the passed string, it should be an '(', if the next one is a ')', cool, we have completed a sequence, we can move to the third character. Otherwise, we call recursively the same function, to extract a subsequence. It should return the length of the found sequence, or zero, if the subsequence is not valid. The caller would check the returned value, and move in the string accordingly, to see if the next character closes its sequence, or if it has to call again itself to check if we have another subsequence.

As you can see, this solution is more contrived, but promises to be more performant. Still, it has a big issue. In the same worst case as seen above, we have such a huge number of recursive function calls that it could easily lead to corruption in the memory stack.

Just count them

Yes, just count the brackets. After all, what we want is having the same number of open and close parenthesis.

Actually, we have also to take care that we don't want to see a close bracket when there is nothing to close, but it is easy to take care of it. Each time we have an open bracket, we put it on a stack, and we pop out one of them as we get a close bracket. If we get a closing bracket before any opening one has been scanned, the stack is empty, and so we know the string is not correct.

Here is how I have implemented this solution:
bool stackChecker(const char* input)
{
  int tracker = 0; // 1

  for(int i = 0; input[i] != '\0'; ++i)
  {
    switch(input[i])
    {
    case '(': // 2
      ++tracker;
      break;
    case ')': // 3
      if(tracker)
        --tracker;
      else
        return false;
    } // 4
  }

  return tracker ? false : true; // 5
}
1. I use this integer as the stack I talked above. Since I am not interested in the actual position of the brackets, I don't need a real stack, I just need to keep track of how many open brackets have been read.
2. I see an open bracket in the string, put it in the stack, and go to the next character.
3. Close bracket. If the stack is not empty, it matches with the last open bracket in the stack, I pop it and I am ready to check the next one. If the stack is empty, the input string is not valid.
4. What if the string contains anything else? I assumed a "don't care" behavior, but possibly the string could be considered invalid. If that is the case, a default section should be added to the switch, to always return false.
5. The complete string has been scanned, if no open bracket is left in the stack, we consider it valid.

Go to the full post

Redis client testing

I am a TDD guy. If I don't write some test cases, even for simple stuff, I feel uneasy. So, when I wrote the unique and the shared Radis wrappers for the previous posts, I let some (very basic) tests drive my code development. I guess it could be useful to have a look to them. The full C++ code is on github, you would need Google Test, Redis and its hiredis plugin, and a C++11-compliant compiler to build it. It won't be difficult to refactor the code for a different test framework (better if xUnit-based), and for a less modern C++ compiler, using Boost (or another library) for smart pointer support.

Firstly, I have written a tiny wrapper class to Google Test, just to make the code less messy, in Tester.h:
class Tester
{
public:
    Tester(int argc, char** argv) { testing::InitGoogleTest(&argc, argv); } // 1

    int run() { return RUN_ALL_TESTS(); } // 2
};
1. Before running any test, we should initialize the environment. What a better place than a constructor to do that?
2. I am happy with a very basic usage of Google Test, I would simply run all the tests.

Then I have written a simple main function, that creates an object Tester, and run it:
Tester(argc, argv).run();
I don't even use the return value from run(), I rely on the output generated by Google Test to let the user understand how the testing behaved.

Finally, I put in a source file all the tests I want to be performed. Here are some of them:
TEST(TestConnectShared, BadPort) // 1
{
    TTR::SharedRedis redis(1234); // 2
    ASSERT_FALSE(redis.isConnected()); // 3
}

TEST(TestConnectShared, Vanilla) // 4
{
    TTR::SharedRedis redis;
    ASSERT_TRUE(redis.isConnected());
}

TEST(TestCopyShared, Copy) // 5
{
    TTR::SharedRedis redis;
    ASSERT_TRUE(redis.isConnected());
    TTR::SharedRedis r2 = redis;
    ASSERT_TRUE(r2.isConnected());
}
1. The TEST() macro requires a test case name (here is TestConnectShared) and a test function name (BadPort). These names should make clear what we are testing here. In this case, I want to check the shared Redis connection behavior when I pass a wrong port number. Notice that the prerequisite is that the Redis server is up and running on localhost, default port.
2. I have put SharedRedis in a namespace named TTR (short for ThisThread Redis), that's way I use that prefix here.
3. I expect isConnected() to return false, so I assert it. If, unexpectedly, I get a valid connection to Redis, this test would fail.
4. The vanilla test on a shared Redis connection would try to create a default connection to Redis. If isConnected() does not return true, I should assume the test has failed.
5. A simple test to check if I can actually copy a shared Redis connection, and if the copy is still connected to Redis.

If you run those tests when a Redis server is not running on your localhost (and accepting connections on its default port), you should expect a number of failures. Actually, it would (wrongly) succeed only the tests expecting a failure, as the shown TestConnectShared-BadPort. Otherwise I would expect an all green lights scenario.

Go to the full post

Connecting to multiple Redis servers

In the previous post, I have showed how I designed a simple C++ wrapper class that uses a standard C++11 unique_ptr to help working with the hiredis context in such a scenario. My solution has a few limitation, first of them, it doesn't allow to connect to more than a Redis server. Often this is "as designed", still sometimes it is just a nuisance. In any case it is not difficult to redesign the code to allow connections to many Redis servers.

I have written a class named SharedRedis that wraps the hiredis functionality, storing the Redis connection in a C++11 standard shared_ptr smart pointer. You could find the full include file on github.
typedef std::shared_ptr<redisContext> SharedContext; // 1

class SharedRedis
{
    enum { DEFAULT_PORT = 6379 }; // 2
public:
    SharedRedis(const std::string& server, int port = DEFAULT_PORT); // 3
    SharedRedis(int port = DEFAULT_PORT) : SharedRedis("localhost", port) {}

    // ...
private:
    // ...
    SharedContext spContext_;
};
1. To keep the code more readable, I typedef'ed the shared pointer to the Redis context.
2. In this implementation, the user should specify host and port for the Redis server, by default the standard Redis port is used.
3. The only substantial difference to the UniqueRedis implementation (beside the usage of a different smart pointer) is that the object is freely built by the user. No static initializer, but just normal ctors are called to create a new object.

There is not much to say about the implementation, that is very close to what we have seen for the UniqueRedis version. You could see the full source code on github. Maybe it could be worthy to spend a few words on the shared smart pointer construction/destruction.
SharedRedis::SharedRedis(const std::string& server, int port)
{
    // ...
    spContext_ = SharedContext(context, std::bind(&SharedRedis::free, this)); // 1
}

void SharedRedis::free() // 2
{
    redisFree(spContext_.get());
}
1. In the class constructor, the context smart pointer is initialized passing the raw context pointer (we have ensured before that it is a valid value) and the function that has be called on its destruction. Here I use a method of the same class, I could have passed the native radisFree() function, but I wanted to add some stuff to its implementation (actually, just some logging, but I guess it is interesting to show the idea). That's way I had to bind the SharedRedis::free() function to the this pointer, so that the SharedContext destructor could be able to call the right function on the right object.
2. Here is the function that we want to be called on the SharedContext destruction.

There is still something missing. Probably, if we have many Redis connection we want to play with, we should think of storing them in a collection (maybe a map, using as key the host/port), so to have a centralized place to manage them consistently. But I guess this could be seen in another post.

Go to the full post

A unique Redis connection

I need to connect to Redis from a C++ application, and I plan to use hiredis, the Redis official C client, to do it. I want to use just one Redis connection, so I designed my C++ wrapper as a singleton. I would connect once, as I get the first request to Redis, and I would disconnect at the end of the application run.

The environment setup is done in a previous post, including the Makefile description, that you could download from github.

The access to Redis is ruled through a class, UniqueRedis, that has a minimal public interface. There is a static method to get the unique connection instance, a utility method to check if currently the connection is available, and a getter/setter couple. The full source code for the header file is on github:
typedef std::unique_ptr<redisContext, std::function<void(redisContext*)>> UniqueContext; // 1

class UniqueRedis
{
public:
    static UniqueRedis& instance() // 2
    {
        static UniqueRedis redis; // 3
        return redis;
    }

    bool isConnected() { return spContext_ ? true : false; } // 4

    void set(const std::string& key, const std::string& value); // 5
    std::string get(const std::string& key);
private:
    UniqueRedis(); // 6
    void free(); // 7

    UniqueRedis(const UniqueRedis&) = delete; // 8
    const UniqueRedis& operator=(const UniqueRedis&) = delete;

    UniqueContext spContext_; // 9
};
1. I don't want to manage directly the Redis context, I delegate instead a C++11 smart pointer to do the dirty job for me instead. I have chosen unique_ptr because I don't want it to be copied around, and I am giong to use it in the flavor that let the user passing a deleter to it, so that the smart pointer could also take care of calling that cleanup function when needed.
2. The user of this Redis wrapper, should call this static method to gain access to its unique object.
3. If you are using a compiler supporting the C++11 standard, as GNU GCC, initializing a local static variable is implicitely thread-safe.
4. This method returns true if the private smart pointer has been initialized, meaning that a connection to Redis has been already estabilished.
5. Setter and getter. The code is shown below.
6. A UniqueRedis object could be created only through the static initializer defined above.
7. Utility function to cleanup the Redis connection.
8. Copy ctor and assignment operator are explicitely deleted from the class interface.
9. The standard unique_ptr for the Redis context.

The source code for the cpp file is on github, too. Here are some stripped down and commented parts of it:
const std::string REDIS_HOST = "localhost"; // 1
const int REDIS_PORT = 6379;

UniqueRedis::UniqueRedis()
{
    redisContext* context = redisConnect(REDIS_HOST.c_str(), REDIS_PORT);

    if(context->err) // 2
    {
        redisFree(context);
        return;
    }

    spContext_ = UniqueContext(context, std::bind(&UniqueRedis::free, this)); // 3
}

void UniqueRedis::set(const std::string& key, const std::string& value) // 4
{
    void* reply = redisCommand(spContext_.get(), "SET %b %b", key.c_str(), key.length(), value.c_str(), value.length());
    if(reply)
    {
        freeReplyObject(reply);
        return;
    }
}

std::string UniqueRedis::get(const std::string& key)
{
    redisReply* reply = static_cast<redisReply*>(redisCommand(spContext_.get(), "GET %b", key.c_str(), key.length()));
    if(!reply)
        return "";

    std::string result = reply->str ? reply->str : "";
    freeReplyObject(reply);
    return result;
}
1. It would be a smarter idea to have Redis host and port in a configuration file, and fetch the actual values from there. But for the moment having them defined as constants will do.
2. If the hiredis function redisConnect() can't create a good connection to Redis using the passed host and port, I simply cleanup the context and return. You could decide to implement a more aggressive behavior (typically, throwing an exception).
3. We store the valid Redis context in the class private smart pointer. We pass as deleter the address to the member function that would free the context (not shown here, it boils down to call redisFree() for the raw Redis context pointer).
4. The call to redisCommand() is wrapped by set() and get(). As you can see, they work in a very similar way. Notice that I use the binary flag (%b) so that I could pass whatever comes from the user without troubles.

Go to the full post