Hackerrank Heaps: Find the Running Median

Given in input a stream of integers, calculate their running median. As the name of this HackerRank problem suggests, we should find a solution based on heaps data structures.

However, it is difficult to avoid the temptation of trying a different approach. Since I am using Python 3 as implementation language, the insort() function from the bisect library, makes the naive solution of sorting the list of data each time a new element arrives less expensive, to the point that is accepted by HackerRank, even though is far from optimal.


Here is a possible implementation:
for i in range(n):  # 1
    item = int(input().strip())
    insort(data, item)  # 2
    pos = len(data) // 2  # 3
    result = float(data[pos])
    if len(data) % 2 == 0:
        result += data[pos - 1]
        result /= 2
1. I have previously read in n the number of elements that I should expect in input.
2. I insert in the list data each value I get using insort() so, keeping the list sorted. The cost of finding the right position in the list is logarithmic, hence the name bisect of the library, then we have to pay a linear cost to move to the right all the elements bigger than the current one. Still a Big-Oh n*log(n) algorithm, however, on average cheaper than sorting data each time, if we don't use an algorithm that knows it is dealing with a quasi-sorted collection.
3. Than we calculate the median, peaking the central element and, if the current size of the data list is even, also its left neighbor - in this case dividing by two.

It works find, but it sort of cheating.


Unfortunately, the Python standard library does not support heaps so strongly as other languages do. There is a good implementation for the min heap, but here we need to work also with a max heap. The common hack, that I am going to use here, is negating the values inserted in a heap, to let it work as a max heap. Not the cleanest thing to do, however it works fine here.

I have written a class, MedianHeap, that shield the intricacies of the min/max heaps, and offer a simple interface to my solution, that gets very easy:
def solution(values):
    heap = MedianHeap()

    result = []
    for value in values:  # 1

    return '\n'.join(result)
The MedianHeap constructor is trivial:
def __init__(self):
    self.left = []   # max heap
    self.right = []  # min heap
I am going to use two heaps. One the left, keeping the smaller values, that is going to be a max heap, and one on the right, a min heap that keeps the bigger values. I will be always only interested in the max value on the left and the min value on the right.

To push an element, I check if it is smaller that the max to the left, and in this case I am going to push it there, otherwise it will go to the right.
def push(self, value):
    if not self.left or value < -self.left[0]:  # 1
        heappush(self.left, -value)  # 2
        heappush(self.right, value)
1. Couple of nuisances. Before checking I must ensure the left heap is not empty. And, I have to remember my hack on the max heap, so I change the sign of the max element on it before comparing it to the current value.
2. Beside that, I simply use the heappush() function from the heapq python core library. Still, remember that the left heap is a max one, so change the sign of the value I pushing in.

After the push, we should ensure the size difference between the two heaps is minimal, zero or one, otherwise I have to balance them:
bigger = self.left if len(self.left) > len(self.right) else self.right
smaller = self.left if len(self.left) < len(self.right) else self.right
if len(bigger) - len(smaller) > 1:  # 1
    moved = -1 * heappop(bigger)  # 2
    heappush(smaller, moved)
1. If the unbalance is bigger than one, I move from the smaller heap to the bigger one. I care only of their size, left to right or the other way round is just the same.
2. Still, I should remember to swapping the sign of the element, since I am moving from a min to a max heap, or viceversa. Notice the use of the other heapq function at work here, heappop().

Getting the median element from my MedianHeap is easy:
def median(self):
    bigger = self.left if len(self.left) > len(self.right) else self.right
    smaller = self.left if len(self.left) < len(self.right) else self.right

    if bigger is smaller:
        return (self.right[0] - self.left[0]) / 2  # 1
        maxi = -1.0 if bigger is self.left else 1.0
        return maxi * bigger[0]
1. If the heaps have the same size, I should return the mean of their max/min elements. Instead of adding, I subtract them, for the hack on the max heap.
2. Otherwise I return the min/max element of the biggest heap. If the winner is the left heap, it is the max one, so I have to change its sign.

I have pushed both the full python script, the bisect insort and the heap based one, and the unit test I used for help me to develop the more complex second solution, to GitHub.

Go to the full post

Spring Prototype Bean

By default, the Spring framework creates beans as singleton. However we can instruct it to create a different bean each time a request is made specifying the prototype mode. Another couple of ways are available, to create a bean for each request or session.

I have written a simple JUnit test to see the difference between prototype and singleton.

I have created two empty classes, both of them implementing the empty interface Scoping:
public class UniqueThing implements Scoping {

public class Notepad implements Scoping {
The SCOPE_SINGLETON scope is the default, so usually you won't see it explicitly set in production code. Here is just to remark its existence.
The class Notepad, marked with SCOPE_PROTOTYPE is the interesting stuff.

This is what normally happens with beans:
public void testSingleton() {
    UniqueThing thing1 = context.getBean(UniqueThing.class);
    UniqueThing thing2 = context.getBean(UniqueThing.class);
    assertSame(thing1, thing2);
When I get a bean - here I get them through the Spring application context - the same bean is returned.

But if the bean is marked as SCOPE_PROTOTYPE, we have a different behavior:
public void testPrototype() {
    Notepad notepad1 = context.getBean(Notepad.class);
    Notepad notepad2 = context.getBean(Notepad.class);
    assertNotSame(notepad1, notepad2);
Each bean is a newly created one!

Reference: Advanced wiring, from Spring in Action, Fourth Edition by Craig Walls. Chapter three, section four, Scoping beans.

I pushed the code I created on GitHub, see the scoping package and the associated JUnit test case.

Go to the full post

CodeEval Reverse Words

Given a string of words, write a function that returns them in reversed order. This is CodeEval problem #8 that has a trivial Python solution.

This is what we want to get, as a python test:
def test_provided_1(self):
    self.assertEqual('World Hello', solution('Hello World'))
And the core of the solution is a mere one-liner:
' '.join(reversed(line.split()))
We split the line, getting a list of strings, reverse it, and join the result on a blank.

I pushed the complete python script to GitHub.

Go to the full post

HackerRank Trees: Is This a Binary Search Tree?

As the title suggest, the point of this hackerrank problem is checking if a tree is a BST.

Before being able to work on this problem, you should know what we are talking about. In a few words, a BST is a tree in which for each node is true that its left children are strictly smaller and the right one are strictly bigger.

In Python, if a node is represented in this way
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
We can check a tree passing its root to a recursive function like this one:
def is_bst(node, left, right):  # 1
    if node is None:  # 2
        return True
    if not left < node.data < right:  # 3
        return False
    return is_bst(node.left, left, node.data) and is_bst(node.right, node.data, right)  # 4
1. The first parameter, node, is an instance of the class Node. The other two parameters are the limit in which the value is supposed to be. 2. An empty tree is a BST. This comes handy to manage the leaves of the tree. 3. The node value should lie in the specified interval, otherwise this is not a BST. 4. If the current node passes the check, we explore the rest of the tree. On the left we restrict the interval cutting it to the right, on the right we cut it on the left. Couple of questions. What should we pass as left and right for the root? And, What if we don't want to accept an empty tree as a valid BST? I think that in Python a good solution could be having a starting call like is_bst(root) that do not check for an interval and define on its own what to do in case of None. And then it call the above is_bst() for all the other nodes.

Go to the full post

Autowiring ambiguity

I have a hierarchy of Spring Components rooted in the interface Dessert. Let's see what could go wrong when I try to autowire on them.

Here is the base interface:
public interface Dessert {
    default double getPrice() {
        return 0;
It has three empty implementing classes named Cake, Cookies, and IceCream. For instance, here is the third one:
public class IceCream implements Dessert {
In my test case, I autowire a Dessert that I am going to use in the tests, like this:
Dessert dessert;

public void test() {
    assertThat(dessert.getPrice(), is(0.0));
Spring has to know which Dessert to create and wire, and this could be done by class configuration. In this case, the tester should know it, by annotating as SpringApplicationConfiguration the class, that should be defined and annotated as a Spring Configuration file, like this:
@ComponentScan(basePackageClasses = Dessert.class)
public class DessertConfig {
Do not forget to specify the component scan annotation, otherwise Spring won't see the beans, and would complain something like:
 Error creating bean with name 'dd.sia.restaurant.DessertTest':
 Injection of autowired dependencies failed; nested exception is
Could not autowire field:
 dd.sia.restaurant.Dessert dd.sia.restaurant.DessertTest.dessert;
nested exception is
 No qualifying bean of type [dd.sia.restaurant.Dessert] found for dependency:
 expected at least 1 bean which qualifies as autowire candidate for this dependency.
 Dependency annotations:
However, activating the ComponentScan is only the first step. Since we have three Components based on Dessert, we should say to Spring which one to choose. If we don't we are going to have a different, albeit similar, exception:
 Error creating bean with name 'dd.sia.restaurant.DessertTest':
 Injection of autowired dependencies failed; nested exception is
Could not autowire field:
 dd.sia.restaurant.Dessert dd.sia.restaurant.DessertTest.dessert;
nested exception is 
 No qualifying bean of type [dd.sia.restaurant.Dessert] is defined:
 expected single matching bean but found 3: cake,cookies,iceCream
One way to solve this issue is annotating a Component as primary:
public class Cake implements Dessert {
In this way, Spring knows that Cake is the Dessert it should pick up when it is in doubt.

We can always bypass this default, specifying a default in the autowiring:
Dessert dessert;
Now Spring knows that we want cookies and won't pay attention to the primary annotation.
If we don't like the name generated by Spring for the component, that is, the class name but starting lowercase, we can set it as we please, using again the Qualifier annotation, this time on the Component itself. So, for instance, I can qualify Cookies as "yummy" and use this qualification when autowiring to ask Spring to select it.

Reference: Advanced wiring, of Spring in Action, Fourth Edition by Craig Walls. Chapter three, section three, Addressing ambiguity in autowiring.

I pushed the code I have written on this stuff to GitHub. See the package restaurant for the Java classes in the Spring application, and the DessertTest for the JUnit testcase.

Go to the full post

CodeEval Prefix Expressions

The CodeEval problem #7 asks us to write a toy calculator for simple arithmetic expressions written in Polish notation. We expect only addictions, multipications, and divisions. And no inner expressions are allowed, meaning that we can safely expect to have in input firstly all the operators then operands, as integer numbers. We should give an integer answer, even though the internal process should keep correctly track of real numbers generated by divisions.

As sample we are given a unique case, that I converted in a python test:
def test_provided_1(self):
    self.assertEqual(20, solution('* + 2 3 4'))
Since we have only to deal with binary operations, we can use the clever trick of starting from the middle of the input, were we can find the first operand, moving to the left to get the list of operators and to the right for the other operands. This lead naturally to think using zip() a python implementation:
i = len(data) // 2  # 1
result = float(data[i])  # 2
for op, value in zip(reversed(data[0:i]), data[i + 1:]):  # 3
    # ...
1. Getting the central element index in the input data list.
2. Convert the first value to real number, so that floating point arithmetic is applied to the expression.
3. I zip together the left and right part of the input data. First part reversed, so that I start to center and move to the edges fetching elements.

Now, what I should do in the loop. As C programmer my instinct went to a switch on the operators, to detect which operation I have to apply. But Python has no switch, so I had to think to something smarter. I put in a dictionary the mapping between each operator and the actual piece of code I need to run in that context, in form of a lambda function, and then I simply have to lookup in it to have the job done:
mapping = {'+': lambda r, v: r + v, '*': lambda r, v: r * v, '/': lambda r, v: r / v}

# ...

for # ...
    result = mapping[op](result, int(value))
Then, after the for loop, I just have to cast the result to integer and return its value.

On GitHub you can see the unit test and the full python script.

Go to the full post

Hackerrank Queues: A Tale of Two Stacks

This Hackerrank problem asks us to emulate a queue using two stacks. The rationale for this apparently weird data structure could be that we know for sure that we are going to have a data traffic in a sort of wave on the seaside fashion. We have a huge number of pushes with (about) no peek or pop, and then a huge number of peek or pop.
The architect should have calculate that it is not worthy to implement a proper queue, maybe memory comes at a premium on that environment. Or whatever.
Actually, the real sense of it is to see in an interview if a candidate knows about stacks and queue, and how to deal with them.

As usual, before starting writing my python code, I jotted down a few tests, so to clarify to myself what I am going to do. If I get on some soft spot during development, I can always get back here and fatten up the case.
def test_enqueue(self):
    queue = MyQueue()
    self.assertEqual(42, queue.peek())

def test_enqueue_2(self):
    queue = MyQueue()
    self.assertEqual(42, queue.peek())

def test_dequeue(self):
    queue = MyQueue()
    self.assertEqual(2, queue.peek())
As often happens in this kind of coding, we can blissfully ignore the possible exception. Here letting our MyQueue crash in case the user try to pop when it is empty is a non-issue.

I have to use to stacks. One would emulate the head of the queue, form which pop and peek would operate, the other the tail, where push will add its values. So, I found it natural to name them head and tail:
class MyQueue(object):
    def __init__(self):
        self.head = []
        self.tail = []
Being in python world, the most obvious choice is using lists as their data type.

The push method (here called "put", as suggested by the template provided) is a no-brainer. I just append the passed value to tail.
def put(self, value):

Peek and pop will substantially operate in the same way, getting the data from the other stack. There is only a question. How I should ensure that the head is populate with the correct data. Let's delegate it to another method, and just write their core functionality, for the moment.
def peek(self):
    return self.head[-1]

def pop(self):
    return self.head.pop()

And this is the point of the exercise, refill_head(). When we get the first call to pop or peek, we need to retrieve the first value stored in the tail queue. To do that, we can simply move all data stored in that stack to the head stack. As a bonus we have also all the other data ready to be used for pop and peek. Remember that the tail stack is accessed only for pushing data, so it won't mind at all of what there is inside it. We can safely empty it any time we need more data from head. But beware, we can move it only when head is empty, otherwise we would mess up with the elements' order.
def refill_head(self):
    if not self.head:  # 1
        while self.tail:  # 2
1. If the head is not empty we can (actually, we must) delay the refilling.
2. Otherwise we move all the data from tail to head. Being both stacks, the order is respected.

No need for extra testing, the solution got accepted, and I pushed unit test and python script to GitHub.

Go to the full post

Functional interfaces

In Java everything that is not a primitive is an object. OK, but what about functions? Introducing a support for functional programming in Java 8 led to the need of give an answer to this question, and the designers decided to use the ad-hoc solution of functional interfaces.

A functional interface is a Java interface that has only one abstract method. It is a good idea to annotate it as FunctionalInterface so that it's clear what it is mean for, and let the compiler to check if it is alright. However it is not mandatory.

We can define and use any interface we like for use it in this way, but we should save to ourselves, and to our code clients, some pain using when possible the standard one, define in the java.util.function package.

Let's see a few examples.


The predicate functional interface is defined as:
public interface Predicate<T> {
    boolean test(T t);

    // ...
I can use it if I have to write a method filter() that gets in input a list of data and filter them accordingly some requirements decided by the user.
public static <T> List<T> filter(List<T> data, Predicate<T> p) { // 1
    List<T> results = new ArrayList<>();
    for (T t: data) { // 2
        if (p.test(t)) {
    return results;
1. It takes as input a list of T and a predicate on T.
2. For each element in the list, if it pass the test defined by the caller, I add it to the list I return.

Here is a piece of code that uses the filter() method:
List<String> result = FunctionalInterfaces.filter(Arrays.asList("", "x", ""), (s) -> !s.isEmpty());
I create a list of strings and I pass it to filter() alongside a lambda function, that has to match with the expected Predicate, so the lambada input has to be a String - I could state it explicitly, but I can also leave it implicit - and it should return a boolean.
The result is a list containing just one string, namely "x".


It is defined as:
public interface Consumer<T> {
    void accept(T t);

    // ...

Here I use it in a forEach() method that let the user specify what it should do with each element of a List:
public static <T> void forEach(List<T> data, Consumer<T> c) {
    for(T t: data) {
I use this forEach() to keep calculate the length of all the strings in a list:
List<String> data = Arrays.asList("Functional", "Interfaces", "in", "Java 8");
consumed = 0;
FunctionalInterfaces.forEach(data, (s) -> consumed += s.length());
assertThat(consumed, is(28));
Again, note that here the type of the lambda parameter s is inferred from T, type of the list passed to forEach() and used to determine the actual Consumer functional interface to which the lambda should refer to.
Interesting to note how the consumed variable is captured by the lambda without the need of any explicit action by programmer. This mechanism works fine only for non-local variables. Local variables can be captured only if final (at least effectively so).

public interface Function<T, R> {
    R apply(T t);

    // ...

This map() method maps objects of a given type to another one, delegating to the caller the job of specifying how to do the conversion.
public static <T, R> List<R> map(List<T> data, Function<T, R> f) {
    List<R> result = new ArrayList<>();
    for(T t: data) {
    return result;

I use it here to convert a list of strings in a list of integers:
List<String> data = Arrays.asList("Functional", "Interfaces", "in", "Java 8");
List<Integer> result = FunctionalInterfaces.map(data, (s) -> s.length());
The compiler does a lot of job for us here. Since map() is templatized by T, the template type of the input list, and R, from the returned list, it figures out that s, passed to the lambda, is a String, so it should return an object of the class returned by the String method lenght(), that is an int, boxed to Integer.

public interface IntBinaryOperator {
    int applyAsInt(int left, int right);
In the example for Function, boxing the integers was an expected behavior. We need them as objects since we want them in a collection.
Other times, boxing and unboxing could be just a useless cost that we could avoid using a special functional interface like this one, that works with lambdas working on primitives.

This combine() method works on primitive integers and don't want to pay any extra costs for objects
public static int combine(int[] data, IntBinaryOperator op) {
    if(data == null || data.length != 2) {
        return 0;            
    return op.applyAsInt(data[0], data[1]);

I could use it to decide dinamically what kind of operation applying to a couple of integers, accordingly to the lambda passed.
int result = FunctionalInterfaces.combine(new int[] {6, 7}, (a, b) -> a * b);


public interface Supplier<T> {
    T get();

We don't have anything to pass to the lambda, and we blindly let it to create an object of the specified type.
public static <T> T aFactory(Supplier<T> s) {
    return s.get();

Here is a trivial example:
String result = FunctionalInterfaces.aFactory(() -> "Hello");

I pushed to GitHub a class FunctionalInterfaces with the static method having a reference to functional interfaces as parameter and a JUnit tester that shows how I called them.

Reference: Java 8 in action, section 3.4 Using functional interfaces.

Go to the full post

CodeEval Longest Common Subsequence

The CodeEval problem #6 is commonly used as an example to introduce to Dynamic Programming.
We have two strings, we want to know how many and which characters they have in common.

On CodeEval we are given a sample, I added a second test, shorter, that helped me better to develop my python script:
def test_provided_1(self):
    self.assertEqual('MJAU', solution('XMJYAUZ;MZJAWXU'))

def test_simple(self):
    self.assertEqual('HI', solution('AHOI;HI'))
Solving this problem with DP comes quite naturally. I have two strings, I put it one vertically, the other horizontally, I leave top row and left column set to zero, just to simplify the calculations, and then I scan the matrix from top-left to right, row by row, since I get to the right-bottom corner.

Cell by cell, I check if the characters in the two strings match or not. In case of matching, I get the value of the cell one step to the left, one step over, and increase it by one. Otherwise I get the values from the cell over, the cell to the left, and I keep the highest value between the two.

Here is the matrix for my simple test above:
-  -  H  I
- [0, 0, 0]
A [0, 0, 0]
H [0, 1, 1]
O [0, 1, 1]
I [0, 1, 2]
The bottom right cell tells us that there are two matches. The algorithm I am using makes a bit more laborious to get which are them. Well, not in this case, since the shorter string has size two, we already know that "HI" is the longest common subsequence. But generally speaking, we have to backtrack the job we have done, starting from the bottom-right, observing where there is a change of value in a cell compared to its neighbors, moving up to find the next border inside the matrix.

The resulting python is quite terse, so I think could help to better understand the thing.

First part, generating the matrix. In ver and hor I put the two strings. Think that ver is "AHOI" and hor is "HI", from the example above.
t = [[0 for j in range(len(hor)+1)] for i in range(len(ver)+1)]  # 1
for i in range(len(ver)):  # 2
    for j in range(len(hor)):
        t[i + 1][j + 1] = t[i][j] + 1 if ver[i] == hor[j] else max(t[i+1][j], t[i][j+1])
1. Create an all-zero bidimensional list sized +1 of the respective referenced string.
2. Loop on all the cells, and apply the algorithm that I described above. Notice that the indexes i,j are 0-based, so they are OK when referring to the strings in ver and hor. When working on the matrix, however, I adjust them adding one to refer to the current cell, row and column. I guess this is the most confusing part. Maybe following the debugger stepping in the code could help a better understanding.

Now I prepare for backtracking. Let's set a few variables:
i, j = len(ver), len(hor)  # 1
result = [None] * t[i][j]  # 2
cur = -1  # 3
1. I let i and j refer to the bottom-right cell.
2. I am going to put the result in this list. I know the size, it is stored in the bottom-left cell, I initialize each element to None, missing any better default.
3. Since I am backtracking. I am going to put the first character in the last position, and so on.

And this is the backtracking loop. Notice that I changed the convention for i, j. Here it looked to me more natural using i and j without adjustment to refer to the cells in the matrix.
while i > 0 and j > 0:
    if t[i][j] == t[i-1][j]:  # 1
        i -= 1
    elif t[i][j] == t[i][j-1]:  # 2
        j -= 1
        result[cur] = ver[i-1]  # 3
        cur -= 1
        i -= 1
        j -= 1
1. Compare the current cell with the one above. If they are the same, I can move upward.
2. I couldn't move upward, I try to move leftward.
3. I am on the verge of changing values. This means that the letter in the string (pay attention to the index, adjusted by subtracting one!) is part of the subsequence. Store it in the result, and move on.

Finally we have just to convert the result from list to a string, the usual pythonic way is a join on an empty string.

The script has been accepted with full marks. It was just a bit slow, and it consumed lot of memory. So I decided to port the code to C++, task that was easy and rewarding. Just a few minor changes, besides the obvious differences due to the languages grammars, for instance, I found easier in the backtracking to write the result backward and the reverse it by the STL function.

For full reference, I put the unit test and the python script on GitHub.

Go to the full post

Hackerrank Stacks: Balanced Brackets

This HackerRank problem is an interview classic. I have got in input a string containing only brackets, I have to ensure they are correctly balanced. It is very easy if you get you should use a stack to process them. Here I implement a solution in Python, that is interesting because we should make use here a couple of pythonic constructions.

Besides the provided samples, I added a missing important element to my unit test.
def test_provided_1(self):
    self.assertEqual(True, solution('{[()]}'))

def test_provided_2(self):
    self.assertEqual(False, solution('{[(])}'))

def test_provided_3(self):
    self.assertEqual(True, solution('{{[[(())]]}}'))

def test_too_close(self):
    self.assertEqual(False, solution('}}'))
In the last one I have only closing brackets, that's obviously a False. I feel that explicitly setting a test on this case helps to get sooner to the solution.

The idea is pushing the opening brackets on the stack and popping an element when we are examining a closing bracket. If there is no matching bracket in the stack or, worse, the stack is empty, the sequence has to be rejected.

Here is my implementation:
def solution(data):
    matches = { '(': ')', '[': ']', '{': '}' }  # 1

    stack = []  # 2
    for c in data:  # 3
        if c in matches.keys():  # 4
        elif not stack or c != matches[stack.pop()]:  # 5
            return False
    return True if not stack else False  # 6
1. I use a dictionary to store the opening brackets as keys and the closing ones as values. That helps keeping the code compact.
2. A plain list works a as a stack.
3. Looping on the characters in the input data. Notice that there is no error handling whatsoever. This is a typical interview code, robustness would be added on demand.
4. First use of the dict in (1), I compare the character against its keys to check if it is an opening bracket. If so, I push it in the stack.
5. Otherwise, I expect it to be a closing bracket. Before popping from the stack, I should ensure it is not empty, then I use the element in the stack as key for the dict in (1) to retrieve the matching bracket, and I compare it against the actual character.
6. I get to the end of the input data without detecting any mismatch. So I return true. But only if the stack is empty, i.e., all the opening brackets have been consumed.

Solution accepted by HakerRank, test case and python script pushed to GitHub.

Go to the full post

CodeEval Detecting Cycles

Getting in input a list of integers, we should identify where a cycle starts, and return it.
It's the CodeEval problem #5, and it is easier than it looks, because, even if it is not explicitly said, the passed samples suggest that the numbers are present only once, if their are not in a cycle. Maybe we should think of them as labels in a graph, and the sequence could represent the result of a traversal algorithm that enters in an infinite loop.

I have converted the samples in python tests, I added one more just because I felt bad of risking an exception if the detected cycle was interrupted in the input sequence.
def test_provided_1(self):
    self.assertEqual('6 3 1', solution('2 0 6 3 1 6 3 1 6 3 1'))

def test_provided_2(self):
    self.assertEqual('49', solution('3 4 8 0 11 9 7 2 5 6 10 1 49 49 49 49'))

def test_provided_3(self):
    self.assertEqual('1 2 3', solution('1 2 3 1 2 3 1 2 3'))

def test_tail(self):
    self.assertEqual('1 2', solution('1 2 3 1 2'))
For how the test case look, we simply have to search for the first repetition, and then count how many elements fit in the cycle. I assumed, see the last test, that we should return only the elements in the cycle we are sure are repeated.

The core of my solution is a triple (!) for loop:
for i in range(len(data)-1):
    for j in range(i+1, len(data)):
        if data[i] != data[j]:

        size = min(j - i, len(data) - j)  # 1
        for k in range(1, size):
            if data[i+k] != data[j+k]:  # 2
                size = k
        return ' '.join(data[i:i+size])
return ''
1. I loop until I find two matching numbers. At this point I ensure that this is really a loop, checking that there is a match between the following numbers on the left and on the right. I added the extra check on the list size to avoid falling off the end.
2. When I complete the check, maybe finding a spurious element, I stop looping and return the found cycle.

Solution accepted, unit test and python script pushed to GitHub.

Go to the full post

Execute around pattern and try-with-resources

Say that you have to implement in Java 8 a functionality where there is a lot of boilerplate. Instead of duplicating code in sibling methods, we can try to create a single method where the changing behavior is passed by parameter as a lambda function.

Reduced to the minimal term, here is the example of a method that we could refactor using the execute around pattern. The code is made more interesting by using the try-with-resources statement available in Java since version 7.
public static String processFileLimited(String filename) throws IOException {
    try (BufferedReader br = new BufferedReader(new FileReader(filename))) { // 1
        return br.readLine(); // 2
1. We try to create a BufferedReader. At the end of the block, even in case of an exception, Java takes care of closing the resource. Neat.
2. This is the behavior we want to be parametrized.

Lambda is here our friend. Only a minor nuisance, we can't use a standard functional interface because here we should let our lambdas throw an IO exception, and in Java this should specified in the method signature.
public interface BufferedReaderProcessor {
    public String process(BufferedReader br) throws IOException;
Given this interface, we can refactor the above method in this way:
public static String processFile(String filename, BufferedReaderProcessor p) throws IOException {
    try (BufferedReader br = new BufferedReader(new FileReader(filename))) {
        return p.process(br);
The boilerplate is there unchanged, the actual behavior has been externalized. It should be user resposibility to provide something sensed to be done in. Here is a couple of different ways processFile() could be called:
ExecuteAround.processFile(FNAME, (BufferedReader b) -> b.readLine());
ExecuteAround.processFile(FNAME, (BufferedReader b) -> b.readLine() + '-' + b.readLine());
As second parameter, I pass a lambda that in its body uses the BufferedReader to perform its job. A string is (implicitly) returned.

I have divided the code between the actual ExecuteAround class and its unit test. You can find both on GitHub.

Code and ideas are based on Java 8 in action, section 3.3 Putting lambdas into practice: the execute around pattern.

Go to the full post

Hackerrank Hash Tables: Ransom Note

Given two lists of words, we have to say if the second is a subset of the first. This HackerRank problem is very similar to the previous one.

The given sample, that I converted to a python test, should clarify the point of the problem:
def test_provided_1(self):
    self.assertEqual(True, solution(['give', 'me', 'one', 'grand', 'today', 'night'], ['give', 'one', 'grand', 'today']))
I don't bother to add more tests, being the problem so simple. And I don't even spend much time in thinking on the function design, since the problem name itself gives a huge spoiler on its solution. I would put the words in hash tables (dictionaries, since I am using python as implementation language) and I ensure all the words in the second collection are also present in the first one.
from collections import Counter  # 1

def solution(magazine, ransom):
    available = Counter(magazine)
    needed = Counter(ransom)  # 2

    for word in needed:
        if needed.get(word) > available.get(word, 0):  # 3
            return False
    return True
1. Since it is all about counting items, instead of using naked dictionaries I use Counters, making my script a bit shorter. See my solution to the previous problem to see how the code looks like using dictionaries.
2. Actually, the second dictionary is not strictly required, and a check as implemented in the previous problem would reduce slightly the cost of the algorithm. However, I am not here to write optimized code (otherwise I would have chosen C++ as implementation language), and in this way the code is probably more readable.
3. If I don't have enough of the current word, or even not at all, I can stop looping and return a False. Only if all the words in ransom are also in magazine I'll return True.

Solution accepted, unit test and python script pushed to GitHub.

Go to the full post

Hackerrank Strings: Making Anagrams

Given two strings in input, tell how many characters we should remove from both ones to leave just the same characters even if a different order.

This HackerRank problem is meant to be about strings. however, I solved it using Python, and in this case I ended up seeing the two strings not differently as they was lists of whatever elements.

One aspect I have overseen initially was that in a string there could be more than an occurrence of a character. To ensure that my code would work correctly I added a test to the one suggested by Hackerrank:
def test_provided_1(self):
    self.assertEqual(4, solution('cde', 'abc'))

def test_many_a(self):
    self.assertEqual(3, solution('aaaa', 'a'))
In the one provided, I have to remove 'd' and 'e' from the first string, plus 'a' and 'b' from the second one to get to the same 'c' in both of them, total, 4 eliminations.
In my test, I just have to remove 3 'a' from the first string.

I decided to implement my solution using a dictionary to store the letters in the first string and their numbers. I could have written a piece of code like this:
counter = {}
for ch in a:
    counter[ch] = counter.setdefault(ch, 0) + 1
The handy setdefault() function return the value for the specified key, or second parameter, instead of the default None, if there is not such key in the dictionary.

But the job of counting the number of elements in a collection is so common, that the standard python collections library give us a class, Counter, that works exactly in this way. So I saved same typing and I used it instead.
from collections import Counter

def solution(a, b):
    counter = Counter(a)
    # ...
Now I have to sort of compare this counting with the values store in the other string. I decided to loop on all the characters in it, and decrease the associated value if the key exists and the value is not zero, otherwise to increase a buffer variable to keep track of all the characters that are in the second string and not in the first one.
extra = 0  # 1
for ch in b:
    if counter.get(ch):  # 2
        counter[ch] -= 1
        extra += 1
1. Number of characters that are in the second string without a match in the first one.
2. Remember that get() on dictionary returns the associated value to the passed key or None. Here I decrease the value only if I get a value (not None) and it is different from zero.

Finally, I add up all the values in the counter dictionary, plus the extra count coming from the second string, and I return it.
return sum(counter.values()) + extra
I pushed both the unit test and the python script to GitHub for full reference.

Go to the full post

HackerRank Arrays: Left Rotation

Given a collection of integers and the left shift we want to apply to it, return a collection in the new configuration.

This is the first HackerRank problem in their Cracking the Coding Interview series.

Python solution is trivial, nevertheless I converted their sample to a test case:
def test_provided_1(self):
    self.assertEqual([5, 1, 2, 3, 4], solution([1, 2, 3, 4, 5], 5, 4))
Given a list, its length, and the required left shift. The result should be the left-shifted input list.

This is a one-liner solution:
def solution(values, size, shift):
    return values[shift:] + values[:shift]
Obviously the second parameter is not needed, I left it there as for problem requirement.

I ask Python to slice a first time the passed list starting from shift to the end, then a second time, from beginning to shift (remember that the interval is close to the left and open to the right), and finally join them.

I have spent more time setting up the repository on GitHub and the project than solving the problem itself. So, even if I don't think it adds much value, here is the link to the test case and the python script in there.

Go to the full post

JFreeChart and PDFBox

In a previous post, I have created a pie chart with JFreeChart and I saved it as a file in PNG format. Let's now put it in a PDF file, using the Apache PDFBox library, version 2. And I stress version 2 because it is still young and has a few changes that also impact right in this area.

First thing, I have added a dependency to PDFBox in my POM file - it's a Maven project, as you have guessed:
For Gradle fueled project, your build.gradle should have this line among its dependencies:
compile group: 'org.apache.pdfbox', name: 'pdfbox', version: '2.0.2'
I suggest you to check the current version on the Apache PDFBox web site. Actually, this post is already slightly outdated since version 2.0.4 has already been released.

Since large part of the code I am about to use here has been already written, I have extracted the JFreeChart creation to a method, createPiePlotSimple(), that would return the object as seen before, without saving it to file. Then I created a createSimpleOnPdf() method that does the more interesting stuff.

Firstly, it creates a PDF document, and add a page to it.
PDDocument doc = new PDDocument();
PDPage page = new PDPage();
Then I call the createPiePlotSimple() to get the JFreeChart object, and create from it a buffered image.
JFreeChart chart = createPiePlotSimple();
BufferedImage bi = chart.createBufferedImage(500, 500);
Finally, in a try-catch block, since there are lot of IOException that could be thrown around in this piece of code, I convert the buffered image to a PDImageXObject, specific for the PDF format, using a specialized factory class. The image is drawn passing through a PDPageContentStream object, and then I can finally save my PDF file.
PDPageContentStream cs = new PDPageContentStream(doc, page);
PDImageXObject ximage = LosslessFactory.createFromImage(doc, bi);
cs.drawImage(ximage, 100, 100);
doc.save(new File("dump/pie.pdf"));
Pay attention that PDImageXObject and LosslessFactory are new in version 2. Before that we had PDXObjectImage and PDJpeg, PDPixelMap, and PDCCitt. I suggest to have a look at the migration page if you need more information.

I pushed the changes to GitHub.

Go to the full post

CodeEval Number Operations

Given five integers in the close interval [1 .. 52], determine if it is possible, just applying addition, subtraction, and multiplication, to get as result 42.
I spent some time thinking to a smarter way to solve codeeval problem #190, in the end I fell back to a brute force approach. After all, I thought, 42 must be an hint that the answer is easy, but the question is difficult to get.

Here is how I implemented a Python 3 solution for it.

These are the tests provided by CodeEval, that I put in a python unit test:
def test_provided_1(self):
    self.assertEqual('NO', solution('44 6 1 49 47'))

def test_provided_2(self):
    self.assertEqual('YES', solution('34 1 49 2 21'))

def test_provided_3(self):
    self.assertEqual('NO', solution('31 38 27 51 18'))
Assume that "data" is a list containing the five integers we have to deal with. I need to check all the possible permutations of them, after all they are only five, and 5! is just 120.
However, we have to check all the different combinations of +, -, * over them. That means that, for each permutation, examining all the items in the Cartesian product of three elements with four repetitions, that is 3**4, 81.
That's a total of 120 * 81 = 9720 tests.

Not terrible, however I can't expect this code to be blazingly fast. Good for us, the standard python library itertools gives us permutations() that generates all the possible permutations for a collection, and product() that generates all the elements in the Cartesian product for the passed collection and a given size of the result. Just what we need here.
for numbers in permutations(data):  # 1
    for operations in product('*+-', repeat=4):  # 2
        result, *values = numbers  # 3
        for value, op in zip(values, operations):  # 4
            if op == '+':
                result += value
            elif op == '-':
                result -= value
                result *= value
        if result == 42:  # 5
            return 'YES'
return 'NO'
1. Loop on all the data permutations.
2. Loop on all the combinations of the available operations, knowing that we need four of them.
3. Split the numbers, that is tuple, in two parts, result, the first number, on its own, and all the other values in a list. Notice the use of the star operator that here works as a gatherer, bringing together all the other elements besides the first one, taken by the variable before the comma.
4. Here you see why I did that weird thing in the line above (3). Now I can zip() values and operations. Each component with the same index are extracted together and associated to the loop variables value and op, so that I can happily apply the operation to the current result and the next value in list.
5. If the current permutation with the current operations result as required, I return success. Only if I won't find any match, after looping on all the elements, I will say that, no, I could not find a way to get 42 from those integers.

As I expected, CodeEval accepted the solution, but didn't was too large in giving points for it. Anyway, I pushed test case and python script to GitHub.

Go to the full post

CodeEval Sum of Primes

The CodeEval problem #4 is similar to the previous one. Again prime numbers, this time we have to get the first one thousand of them, and return their sum.

No need for a test case, we have just a single possible outcome, and it should be 3682913.

We need a generic function to check if a number is prime, I implemented it in this way:
primes = [2, 3]  # 1

def is_prime(value):
    for prime in primes:
        if prime ** 2 > value:  # 2
        elif i % prime == 0:  # 3
            return False
    return True
1. In this list I am going to store the prime numbers. It make sense to store them somewhere instead of simply checking if a number is prime and then increase a counter, because we also rely on this list to see if a candidate is prime or not.
2. Extra condition to interrupt the for loop. We stop checking at the square root of the candidate. Or, other way round, cheaper to calculate, when the square of the current prime is bigger than the candidate.
3. If we find a divisor for the candidate, it is not a prime number.

And here is the body of my python script:
while len(primes) < 1000:  # 1
    for i in count(primes[-1] + 2, 2):  # 2
        if is_prime(i):  # 3
1. I won't stop calculating until I have one thousand primes.
2. I start looking from the next prime from the last one stored in the list, increased by two. And then I go on, stepping by two. This is because I am not interested in even numbers, evidently not prime. I use the itertool count() function because I don't have a (theoric) end in this loop, and the built-in range() would require a dummy end to work, making the code a tad less clear.
3. When I find the next prime, I push it to the list, break the for loop, and go on to the next iteration in the when loop.

Then I just have to print the sum of all the primes.
For a complete reference, I have pushed my python script to GitHub.

Go to the full post

CodeEval Prime Palindrome

Find the largest prime palindrome less than 1000. Such a simple request is CodeEval problem #3.

It doesn't make sense writing a test case in this case, we can just ensure that our code, in this case my python script, generates the expected solution, 929.

I have split the job in two parts. We are looking for a palindrome number of three ciphers. So, we want its first and last cipher to be the same:
for candidate in range(989, 100, -2):  # 1
    if (candidate // 100 == candidate % 10) and is_prime(candidate):  # 2
1. I should have started looping from 999. However, this number is evidently not prime. So I started from the second palindrome available. I step by 2, downward, since I don't care about even numbers, being them not prime.
2. If I divide the candidate number by one hundred, I get the third, most significative, cipher. This is Python 3, so I use the // to apply the integer division. The remainder of the division by ten gives us the less significative cipher. They should be the same.

But this is not enough, the candidate should also be a prime number. We all know how to check if a number is prime. We ensure that it has no divisor besides one and itself. A safe trick is stopping checking when we reach the square root of the tested number. This lead to this function:
PRIMES = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]  # 1

def is_prime(value):
    for prime in PRIMES:  # 2
        if prime**2 > value:  # 3
        if value % prime == 0:  # 4
            return False

    return True
1. Sort of cheating. Since I know that the biggest number I have to check is less than 1024, I need only prime numbers less than 32. It is easy to find out the list of them.
2. I am going to check the value against the prime numbers in the list.
3. It is cheaper to get the square of a number that a square root, so I inverted the check to determine if I can safely stop to loop on primes. Notice that I can't merge this check in the one in (2) even if it has to be performed on each cycle. That's Python, love it or leave it.
4. Only if no divisor is found, return true.

I guess you won't need to see the full python script. However I pushed the file on GitHub.

Go to the full post

First Pie with JFreeChart

I have to generate charts from Java. Among the available possibility, JFreeChart has been chosen.
In this post I start from nothing to save a PNG with a pie chart in it.

Well, actually, I have something more than nothing, namely a windows box with Java 8 on it and Eclipse Neon as IDE. I create a Maven project, and I added a few dependencies. The most relevant here is the JFreeChart one, however, don't forget a test library, I'm using JUnit 4:
I want to explore how to create pie charts, so I create a class Pie in a chart package with a static method in it, createSimple() that would return true if the process succeed.
package chart;

public class Pie {
    static boolean createSimple() {
        return false;
Then I create a test case, that is going to be quite trivial, but at least clearly split the concern about writing the code and testing it in two different places. The alternative, giving a main function to the Pie class would have been more regrettable.
public void testCreate() {
So, let's create a pie reporting a few elements in it, defined by a name and an associated value. We are going to use all the possible default provided, just to have a picture to display.
DefaultPieDataset dataset = new DefaultPieDataset(); // 1
dataset.setValue("Merlin", 2.50);
dataset.setValue("Tiger", 18.27);
dataset.setValue("Mustang", 41.33);
dataset.setValue("Dolphin", 72.02);

JFreeChart chart = ChartFactory.createPieChart("Versions", dataset); // 2
1. I create dataset, and I push a few description/value couples to it.
2. Then I create a pie chart going through the ChartFactory class. As parameter, this factory method requires just the name I want to give the chart and the data it should use to create it.

I want to put the image in a file, in the PNG format. This is easily done.
try {
    ChartUtilities.saveChartAsPNG(new File("dump/pie.png"), chart, 500, 500);
    return true;
} catch(IOException ioe) {
    return false;
Be careful of passing to saveChartAsPNG() a file that could be created. If the path does not exist, you are going to get an exception. The other parameters are the chart that has to be created, and its size, width x height. Here is the result I've got:
Nice. However, I'd like not to see the tooltips. From the documentation, I read that I can have a better control on the generated picture calling the overload of createPieChart() that accepts three boolean more, ruling the display of legend, tooltips, and URLs. The shorter version I used above is equivalent to:
// legend and tooltips
JFreeChart chart = ChartFactory.createPieChart("Languages", dataset, true, true, false);
So we just set the second boolean to false to get rid of the tooltips, right? Well, not. For same strange reason, that flag does not work. Fortunately there is a workaround, we can disable the label generator.
// disable tooltips not working!
JFreeChart chart = ChartFactory.createPieChart("Versions", dataset, true, false, false);

PiePlot pp = (PiePlot) chart.getPlot();
// force tooltips off
Try commenting the call to setLabelGenerator(), you will still see the tooltips on the pie, otherwise you'll have only the legend.
Say that I don't like the default color choice, and I want the Mustang slice in magenta. I can call setSectionPaint() on the PiePlot for that slice to change it:
pp.setSectionPaint("Mustang", Color.MAGENTA);
Notice that magenta replaced green for Mustang, but green is not out of the game, it has merely shifted to the next slice.

Last change I want to do here, let also the legend off:
JFreeChart chart = ChartFactory.createPieChart("Versions", dataset, false, false, false);

I have pushed the project on GitHub.

Go to the full post

CodeEval Consecutive Primes

Consider the sequence of natural number, up to a given even element. We have to say in how many ways we can rearrange these numbers so that any couple of neighbors added up gives a prime number. The first element should always be 1, and should go with the last one.
This is the CodeEval problem #187, and here I am giving a Python 3 solution that gets accepted with full marks but does not give any point for the rank, being too slow. Any suggestion for a better algorithm is welcome.

As usual, my first step is converting the given sample in a python test case.
def test_provided_1(self):
    self.assertEqual(1, solution('2'))

def test_provided_2(self):
    self.assertEqual(2, solution('4'))

def test_provided_3(self):
    self.assertEqual(0, solution('5'))

def test_sixteen(self):
    self.assertEqual(81024, solution('16'))

def test_eighteen(self):
    self.assertEqual(770144, solution('18'))
Do not even try the last one until you have a good implementation at hand, because it is going to take a long time to give you back an answer.
Notice that the third test violates the requirement stating that we should expect an even number in input. I interpretate it as a suggestion to explicitly take care of these cases, avoiding the main computation, as better described by the code below:
def solution(line):
    size = int(line)
    if size % 2:
        return 0
    return necklaces([x for x in range(1, size + 1)], 1)
If the passed size is odd, I just return zero. In any case, if you think about it, an odd number of numbers in the natural sequence can't be reorganized as requested by the problem.
I build on the fly, by list comprehension, the sequence I need, starting from 1 ending to size.
I called the function necklaces because, in the metaphor used in the problem, we want to count how many necklaces we can generate from the passed pebbles.

A brute force solution is absolutely out of scope, since generating all the possible pebbles permutations is a factorial algorithm. You could try it, using the itertools permutations function, but keep your input very low.

To save some CPU time, I decided to cache the prime numbers that my script should be aware of in a set:
PRIMES = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31}
Actually, this won't be a substantial help. Even for a simple prime checker function, the number involved are not big enough to cause performance problems.

After some thinking I came up with this recursive function:
def necklaces(beads, pos):  # 1
    if len(beads) == pos:  # 2
        return 1 if 1 + beads[- 1] in PRIMES else 0

    result = 0

    if beads[pos] + beads[pos-1] in PRIMES:  # 3
        result += necklaces(beads, pos + 1)

    for i in range(pos+2, len(beads), 2):  # 4
        if beads[pos-1] + beads[i] in PRIMES:  # 5
            beads[i], beads[pos] = beads[pos], beads[i]
            result += necklaces(beads, pos + 1)
            beads[i], beads[pos] = beads[pos], beads[i]

    return result
1. I pass to the function the list of beads, and the position we have to check. The first time I call it, from the solution() function, beads is the natural sequence from 1 to the number passed in input, and pos is 1, since we have as a requisite that the first one in the list, position 0, won't change.
2. We have reached the last pebble. Check the sum of its value, added to 1 for the first in the list. If this is a prime number, return 1.
3. Check the sum of the current pebble with the previous one. If it is a prime, accept it and call recursively the function on the next pebble.
4. We need to check all the other pebbles in this position. Actually, not really all of them, since we know that we are looking for a prime number. So, if the previous pebble was even, now we want an odd one, and viceversa. This means, we can safely skip by two in the loop.
5. If the current 'i' pebble makes a good pair with the previous one, we accept it and check for the other ones, just like as above in (3). With the variation that we have to put the 'i' pebble in the 'pos' position, and moving somewhere else the 'pos' pebble. Best place I could think of, was the one resulting from a swap between the two pebbles. This has the advantage that I can easily swap them back after the recursive call is done. An alternative would be using two different lists. One for the actual necklace, the other for the currently unused pebbles.

I couldn't think of a better algorithm, still, the only way to get marks from it on CodeEval was reimplementing it in C++. It was easy, just a few minor changes, and the reported time costs improved by a factor x20.

Even if a bit disappointed, I pushed test case and python script to GitHub.

Go to the full post

Conditional bean

We have a bean in our application, but we want it instantiated only if some condition applies. To do that, we can use the Spring Conditional annotation, and let the framework decide if it has to be created or not.

Firstly, I create a very simple class in the restfun package:
public class MagicBean {
    public String toString() {
        return "Magic Bean!";
Then I create a class that implements the Condition Spring interface defining its matches() method.
public class MagicExistsCondition implements Condition {
    public boolean matches(ConditionContext context, AnnotatedTypeMetadata metadata) {
        Environment env = context.getEnvironment();
        return !env.containsProperty("lack_of_magic");
I am saying that the condition hold when the environment does not contain the property named lack_of_magic.

And here is a configuration class that is putting all together in its method annotated as Conditional Bean, that specify the Condition class to verify if there is the condition for acting and that returns a new instance of the required class.
public class MagicConfig {
    public MagicBean magicBean() {
        return new MagicBean();
To check if my magic bean works as expected, I have created a test case where I get it through both the Spring application context and directly by autowiring the specific bean.
public void testBeanInContext() {
    MagicBean mb = (MagicBean) context.getBean("magicBean");
    assertThat(mb.toString(), is("Magic Bean!"));

public void testBeanWired() {
    assertThat(magic.toString(), is("Magic Bean!"));
I played a bit around this code, changing the Condition and the tests, and enjoing the JUnit green light.
Until I felt satisfied, and I pushed the code to GitHub. See the source package and the JUnit test class for details.

I have written this post while reading the third chapter, Advanced wiring, of Spring in Action, Fourth Edition by Craig Walls. I ported the original code to a Maven Spring Boot Web project on the STS IDE, using AspectJ annotations instead of using the classic xml configuration. You should expect a few other minor changes in this code, basically because I wanted to have some fun too.

Go to the full post

CodeEval Mersenne prime

CodeEval problem #240 asks to return a comma separated list of all the Marsenne prime numbers smaller than the smallish number (up to 3000) passed as input in a string.

Actually, I found out that the problem description was not crystal clear. A Marsenne prime is a prime number in the form 2**x - 1. However, when I wrote a first python 3 script and I submitted it to CodeEval, I had the surprise to have it partially rejected since I didn't put in the list 2047 when they expected it.

It is easy to see that 2047 is not a prime number, being factorizable in 23 * 89.

Accordingly with the new piece of information, I added an extra test to my test case, now including three tests:
def test_provided_1(self):
    self.assertEqual('3', solution('4'))

def test_provided_2(self):
    self.assertEqual('3, 7, 31, 127', solution('308'))

def test_three_tho(self):
    self.assertEqual('3, 7, 31, 127, 2047', solution('3000'))
I assumed that what they really want to get from us are the "plain" (opposite to prime) Marsenne numbers, in the variation where the index has to be a prime number. In this case, noticing that 2047 is 2**11 - 1, and its Marsenne index is 11, a prime number, also the third test make sense.

Solved this issue, let's see about the code.

Since I didn't want to do the same calculation on and on, I decided to cache some stuff, namely the already calculated Marsenne numbers, as a tuple, index and actual value. And to make my life easier, I have already initialize the cache with the first element in it:
marsennes = [(2, 3)]
The central part of my python script is this following function that ensure the cache contains all the required marsenne numbers for the input. If not, generates the missing ones, then return them.
def get_marsennes(value):
    if value <= marsennes[-1][1]:  # 1
        return trim(value)

    index = marsennes[-1][0]  # 2
    while True:
        index += 1
        if not is_prime(index):  # 3

        marsenne = 2**index - 1  # 4
        marsennes.append((index, marsenne))

        if marsenne < value:  # 5
        return marsennes if marsenne == value else marsennes[:-1]
1. If the Marsenne numbers have already been calculated, return them. Minor nuisance, I don't want to return too many of them, so I have written a tiny simple little function, trim(), that returns a trimmed copy of the cache. Not showed here, you can see it on github.
2. Get the index of the biggest Marsenne number in the cache, in the following loop it is increased and stored beside the related Marsenne number.
3. As we find out, we have to ensure the index, used as exponent in the number generation, is a prime. I have written another little function to do this job too.
4. Here is the juice. A new Marsenne number is created and shoveled in the cache.
5. If this Marsenne number is not enough to match the user request, go to the next iteration. Otherwise return the cache, or maybe a copy trimmed of its last element. This loop is one of the rare place where a do-while would have been a nice to have. Alas, Guido van Rossum didn't like it.

Basically, that's it. Ah, OK, we need to convert the list of tuples in a comma separated string, something like:
return ', '.join(map(str, [r[1] for r in result]))
This looks like what the CodeEval guys really wanted from this problem, since I got full marks. So I committed test case and python script to GitHub.

Go to the full post

CodeEval Burrows-Wheeler transform

The CodeEval problem #184 asks us to generate the original text from an input line representing the result of a Burrows-Wheeler transformation.

The CodeEval page explains in a few words what is that transformation and what it useful for. Even if could look a bit unclear at first, it is quite easy to write a function that operates the transformation. Here is my python 3 code for it:
def bw_transform(s):
    size = len(s)
    rots = sorted([s[i:size] + s[0:i] for i in range(size)])
    result = [rot[-1] for rot in rots]
    return ''.join(result)
The input line is rotated by one char at time, and all these rotations are stored in a list, that get sorted. Then I extract the last character in each rotation to another list that, joined, gives the result.

However our job is to reverse the result of the transformation, and that leads to a piece of code more complicated. Actually, I'm not sure the fuction I get could be considered a good result. It gets accepted by CodeEval, though.

Here is the test case I used to help me getting through the job:
def test_provided_1(self):
    self.assertEqual('Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo$',
        solution('oooooooo$  ffffffff     ffffffffuuuuuuuuaaaaaaaallllllllbbBbbBBb'))

def test_provided_2(self):
    self.assertEqual('James while John had had had had had had had had had had had a better effect on the teacher$',
        solution('edarddddddddddntensr$  ehhhhhhhhhhhJ aeaaaaaaaaaaalhtf thmbfe           tcwohiahoJ eeec t e '))

def test_provided_3(self):
    self.assertEqual('Neko no ko koneko, shishi no ko kojishi$', solution('ooooio,io$Nnssshhhjo  ee  o  nnkkkkkkii '))

def test_extra(self):
    self.assertEqual('easy-peasy$', solution('yyeep$-aass'))
As you can see also from the tests, in this implementation of the Burrows-Wheeler algorithm there is the trick of adding a marker, here a dollar sign, at the end of the line. That's why I defined a constant in my script:
EOL = '$'
Then I create a matrix, initialized as a list of empty lists, then, for each element, I repeatedly inserted at the beginning of each rotation each character in line.
But notice that, after each run of the inner for loop, I sort the rotations:
def solution(line):
    rots = [[] for _ in line]
    for _ in line:
        for ch, rot in zip(line, rots):
            rot.insert(0, ch)
So, in the end, rots contains all the rotations of the original string.

Now I should find out the right rotation. But that's easy, since I used the EOL marker to identify it.
for candidate in rots:
    if candidate[-1] == EOL:
        return ''.join(candidate)
It is just a matter of selecting the rotation that has it as its last character.

CodeEval took quite a longish time to give its positive response on this solution. I think I should work more on it to get a better result. Anyway, I put the test case and the python script I have written on GitHub.

Go to the full post

CodeEval Gronsfeld cipher

Given the vocabulary, a key, and an encrypted message, get back to the plain text, using the Gronsfeld cipher.
This is the CodeEval problem #181, and here is my Python 3 solution.

ALPHA is the vocabulary we have to use.
ALPHA = ' !"#$%&\'()*+,-./0123456789:<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
I have converted the sample provided to a Python test case, adding an extra test that shows how the decoding works at an extreme of the alphabet:
def test_provided_1(self):
    self.assertEqual('EXALTATION', solution('31415;HYEMYDUMPS'))

def test_provided_2(self):
    self.assertEqual('I love challenges!', solution('45162;M%muxi%dncpqftiix"'))

def test_provided_3(self):
    self.assertEqual('Test input.', solution('14586214;Uix!&kotvx3'))

def test_extra(self):
    self.assertEqual('xyz', solution('3; !"'))
Gronsfeld is a simple extension of the Caesar ciphering method. Instead of using a single number that shifts the plain text to its encrypted version, more numbers could be used. Since the message could be longer than the key, we repeat the key to get the shift we have to apply to a given character in the message.
As you can see, my extra test does not use the Gronsfeld extension capability, falling back to the Caesar schema. It's point is just to show what happens when I encrypt letters that are on the extreme right of the vocabulary.

The core of my solution is a loop that executes a single line converting each character back to its plain counterpart and appending it to the result.
result.append(ALPHA[ALPHA.index(data[1][i]) - data[0][i % len(data[0])]])
data[0] is a list of integers, representing the key.
data[1] is a string, the encrypted message.
The index in ALPHA for the plain character is given by the index in ALPHA of the encrypted one minus the key for that position. In this way it could happen that we get a negative index, but this is OK, since python manages them right how we need.

I pushed test case and solution python script to GitHub.

Go to the full post

CodeEval Justify the Text

For each string of text given in input, format it so that it cover the full size of the line, given as 80. The space between words should be always the same, the extra amount of blanks should go first to the right. We don't have to reformat the last part of the string.
This is CodeEval problem #177, and here you are going to see my Python 3 solution.

I have converted the given sample in a python unit test, adding an extra test, that gave me problems at my first solution and caused the partial rejection of my code:
def test_provided_1(self):
    self.assertEqual('Hello, World!', solution('Hello, World!'))

def test_provided_2(self):
    text_in =  'The precise 50-digits value of Pi is 3.14159265358979323846264338327950288419716939937510.'
    text_out = 'The         precise         50-digits        value        of        Pi        is\n' \
    self.assertEqual(text_out, solution(text_in))

def test_provided_3(self):
    text_in = 'But he who would be a great man ought to regard, not himself or his interests, but what is just,' \
              ' whether the just act be his own or that of another. Next as to habitations. Such is the tradition.'
    text_out = 'But  he  who would be a great man ought to regard, not himself or his interests,\n' \
               'but what is just, whether the just act be his own or that of another. Next as to\n' \
               'habitations. Such is the tradition.'
    self.assertEqual(text_out, solution(text_in))

def test_extra(self):
    text_in = 'This is the language of naval warfare, and is anything but worthy of extraordinary praise.'
    text_out = 'This  is  the  language  of  naval  warfare,  and  is  anything  but  worthy  of\n' \
               'extraordinary praise.'
    self.assertEqual(text_out, solution(text_in))
I approached the problem considering each time a chunk of string, the biggest one possible accordingly to the line size (80). Then I reformatted that chunk accordingly to the requirements, and put this line in a result list. Finally, if there are more words, I push them as last line in the result. And it is just a matter of joining the result to give it back to the caller:
def solution(text):
    result = []
    while len(text) > 80:
        pos = 0

        # ...

        text = text[pos + 1:]

    if text:
    return '\n'.join(result)
I left out the code in the while loop. I need to find where is the "pos" that I use to split the current chunk and so reducing the text given in input, and push to result the formatted line.

Finding where to split

Assuming that the separator between words is just a plain blank, it is easy to loop on the text looking for the one closer to the limit:
pos = 0
while True:
    next_pos = text.find(' ', pos + 1)
    if next_pos == -1 or next_pos > 80:
    pos = next_pos
line = text[0:pos]
Remember that the python string find() method returns the position where the passed token is found, or minus one for not found. As second parameter it is possible to pass where to start to search for it.
Looping I have found the splitting point, so I can extract "line" from "text".

Formatting the line

The general meaning of this part is not complicated, however I spent some time to get the current settings here and tweaking with the settings to get it right. Here a good set of test is an invaluable support.
words = line.split() # 1
blanks = 80 - len(line) + len(words) # 2
size = (blanks // (len(words) - 1)) # 3
extra = blanks % (len(words) - 1) - 1 # 4
for i in range(extra):
    words[i] += ' '
result.append((' ' * size).join(words))
1. This is easy. I get a list of all the words in the line.
2. I calculate how many blanks I need to add to the words to get the expected length.
3. I am going to add as separator between words a block of blanks. Here I am calculating its size.
4. A few words at the beginning of the line may have an extra blank. Here I see how many of them, and then I add a blank to each of them.

CodeEval accepted my solution with full marks. I pushed the test case and the python script to GitHub.

Go to the full post