Monthly Archives: November 2012

Why I love Scala – Part 1: Boxes

Here’s a small example of why I love Scala. First of all, if you don’t know what Scala is, it’s a relatively recent language that compiles to Java bytecode and runs on the Java Virtual Machine. It keeps the best from Java: the JVM is now very fast, very efficient and runs on pretty much every device you could want. Plus, all Java classes can be used directly from Scala. So as long as a library exists for Java, you can use it in Scala without waiting for someone to port it. The language itself hardly has anything to do with Java. Unlike Java, Scala is extremely concise and makes very frequent use of type inference and implicit declarations. In Scala, when you write something like val a = 5, you don’t need to tell Scala that a is an Int. Scala will figure it out for you. It is strictly statically typed, but you don’t need to state the obvious about types. Just program what you need.
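As a small illustration of that inference, here is a sketch in which the names a, greeting and doubled are placeholders chosen for this example, not taken from any library:

```scala
// The compiler infers each type from the right-hand side.
val a = 5                 // inferred as Int
val greeting = "hello"    // inferred as String

// Explicit annotations are still allowed, and are checked at compile time.
val doubled: Int = a * 2

// This line would be rejected by the compiler, because a is statically an Int:
// val wrong: String = a
```

The commented-out line is the interesting one: the types are never written down, yet they are still enforced.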

The Problem

In this post, I’m going to talk about a programming issue that Scala handles, I think, better than any programming language I’ve ever used: the special unassigned case.

Let’s say that you have a string, "abc", and you want to find the first occurrence of a specific character in that string. In Java, you’d use indexOf: "abc".indexOf("c") would evaluate to 2. The index of "a" would evaluate to 0. And that’s bad. That’s very bad. Because in any language that lets you test an integer as a condition, a check on that index comes out false, even though "a" definitely has an index. In many languages (C, C++, PHP and Python among them), 0 is false, even though it doesn’t always mean false. And how about the index of "w"?

Java uses the special value -1 as a way to indicate that the index was not found, and C++’s std::string::find returns npos, which is defined as -1 converted to an unsigned type. C doesn’t have a direct equivalent for indexOf, as it doesn’t even technically have strings but char arrays (strchr and strstr return a pointer, or NULL when nothing is found). But this clever little bit of code: <http://cboard.cprogramming.com/c-programming/100548-strpos-function-c.html> also uses -1 as the special “not found” value.

There are many problems with using -1 as meaning “not found”. First of all, -1 simply does not actually mean “not found.” Second, unlike 0, -1 evaluates to true wherever integers can be used as conditions. All in all, this is usable, but not ideal.
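Since Scala strings are Java strings, the sentinel behaviour is easy to reproduce from Scala. This is only a sketch: describe is a hypothetical helper written for this example.

```scala
// Scala strings are java.lang.String, so indexOf behaves exactly as in Java.
val found = "abc".indexOf("c")      // 2
val atStart = "abc".indexOf("a")    // 0, a perfectly valid index
val missing = "abc".indexOf("w")    // -1, the "not found" sentinel

// The caller has to remember to compare against -1 by hand;
// nothing in the Int type says that -1 is special.
def describe(s: String, sub: String): String = {
  val i = s.indexOf(sub)
  if (i == -1) "not found" else "found at " + i
}
```

Forgetting the comparison in describe would still compile, which is exactly the weakness of a sentinel value.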

Let’s look at the haphazard mess that is PHP. PHP gives us strpos(). That function will return 2 for strpos("abc", "c") and 0 for strpos("abc", "a"). And in PHP, 0 will also evaluate to false. But for strpos("abc", "w"), it will return a boolean false, which is equal to 0, but not identical. That means that even though all the programming teachers in the world have been, rightly, encouraging their students not to write explicit comparisons against false, PHP actually requires us to write something like strpos("abc", "w") === false, and then re-evaluate the whole thing or save the result in the first place. That’s PHP-style messiness.

Python gives us a much cleaner, safer, more elegant and semantically accurate way of dealing with the whole matter. Each string has a built-in index method, so "abc".index("c") evaluates to 2.

If the substring is not found, Python will raise an exception (a ValueError), which is perfectly fine in Python. Pythonic code means that it’s better to ask for forgiveness than permission, unlike the C, C++ and Java principle of “look before you leap.” So in Python, you’d call index() inside a try block and handle the missing case in the matching except clause.

Ta da! If you really want to get a -1 when the value is not found, you can always use the find() method instead of index(), which will do just that: return -1 if the substring is not found. But if you do that, it probably means that you just don’t understand Python. Ugh.

A Solution

The question now is: how does Scala do it? Well, here my argument sort of falls apart at first. Scala tries not to add too much to what Java already does well, which means it doesn’t implement strings at all and just uses Java’s Strings. By default, Scala uses Java’s bad way of handling that example. Let’s fix this. Let’s create a new class, a SuperString, that will actually deal with this situation in a more idiomatic way. (We’ll see the details on how to do that later.) Our new strings will have a method called find that will do things the Scala way!

What does find return? Well, since SuperString does things the Scala way, it won’t in fact return an Int. Scala has Ints, of course, but Ints aren’t really what we need here, because, precisely, there is a special case where we don’t get a result at all.

Scala provides us with a very powerful tool for that situation: the Option. David Pollak, creator of Scala’s killer app, the Lift Web Framework, thought that Options were great but that it was possible to do even better, so he did. He wrote his own take on the Option class: the Box.

A Box can contain one of three values.

If something went seriously wrong, the Box will contain the special Failure object. This is not for special cases like when a substring is not found, or Python’s ValueError or KeyError exceptions. This is for when something really bad happens.

If the Box has no value in it, it contains the special Empty object.

If the Box does have something in it, it contains the special Full object. Boxes and Full are both generic types. Unlike C++, C# and Java, where type arguments are surrounded by angle brackets, in Scala, generic arguments are surrounded by square brackets. So in our example, the find() method will return a Box[Int], that can contain a Failure, an Empty or a Full[Int].
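To make the three cases concrete, here is a deliberately simplified stand-in written for this post. MiniBox, MiniFull, MiniEmpty, MiniFailure and describeBox are hypothetical names; this is not Lift’s actual Box, which has a much richer API.

```scala
// A simplified stand-in for the three cases, NOT Lift's real Box.
sealed trait MiniBox[+A]
final case class MiniFull[+A](value: A) extends MiniBox[A]
case object MiniEmpty extends MiniBox[Nothing]
final case class MiniFailure(message: String) extends MiniBox[Nothing]

// Because the trait is sealed, the compiler can check that
// a pattern match covers every case.
def describeBox(box: MiniBox[Int]): String = box match {
  case MiniFull(i)      => "index " + i
  case MiniEmpty        => "nothing found"
  case MiniFailure(msg) => "something went wrong: " + msg
}
```

The square brackets in MiniBox[Int] are the generic arguments mentioned above: the same container type works for any contents.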

So now that we have our Box[Int], what can we do with it? As it turns out, quite a lot. First of all, we can check if it’s empty. If result is the Box[Int], result.isEmpty will be true if the Box contains an Empty or a Failure, false otherwise, and result.isDefined will be true if the Box contains a Full, false otherwise.
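The same two checks exist on the standard library’s Option, which can be tried without installing Lift. In this sketch Option is a stand-in for Box, an assumption made to keep the example self-contained: Some plays the role of Full and None the role of Empty, with no counterpart for Failure.

```scala
// Option is the standard library's two-case cousin of Box.
val result: Option[Int] = Some(0)    // as if "a" had been found in "abc"
val missing: Option[Int] = None      // as if "w" had not been found

val foundIsEmpty = result.isEmpty        // false
val foundIsDefined = result.isDefined    // true
val missingIsEmpty = missing.isEmpty     // true
val missingIsDefined = missing.isDefined // false
```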

But you usually won’t need to use those. You can, in theory, open a Box with open_!

But again, what’s the point of having a Box if you’re just going to open it? The very fact that there is an exclamation point in the name is meant to warn you that you’d better not be using it unless you have a good reason. If the Box contains anything other than Full, it’s going to throw an exception anyway when it’s opened. And if you check to see if it’s defined first, it’s just as clumsy as the C++, C# and Java ways. No. One of the things you’re likely to want to do with the Box is, for example, map it. As a functional language, Scala is very big on mapping.

What map does is that, if the Box is Full, it takes what’s inside of it, puts it in a variable, say r, lets you use it in a function (a lambda or otherwise), and then re-packages the result in a Box. If the original Box contained an Empty or a Failure, the new one will too. It’s that simple.

But wait! Scala has other cool tricks going on here. result is a Box[Int], right? So r has to be an Int. It can’t be anything else! No point in telling Scala about it. Scala can figure it out. It’s actually considered good practice to let Scala deal with it.

But wait! We’re still not done! We have one argument and we’re using it once. In that case, we don’t even have to name it. If all your arguments are only used once and in the order they were passed, Scala lets you refer to them with the placeholder _. So really, all we need here is result.map("The character was at position " + _ + ".").

At this point, we no longer have a Box[Int] but a Box[String]. Any type of object can be Boxed, including user-defined objects, of course.
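The three ways of writing the mapping function can be put side by side. This sketch again uses the standard library’s Option as a stand-in for Box (an assumption made so that it runs without Lift); the map calls have the same shape.

```scala
val result: Option[Int] = Some(0)

// 1. Fully spelled out: the parameter r is annotated as an Int.
val explicit = result.map((r: Int) => "The character was at position " + r + ".")

// 2. The type of r is inferred from Option[Int].
val inferred = result.map(r => "The character was at position " + r + ".")

// 3. The parameter is used once, so the placeholder _ is enough.
val shortest = result.map("The character was at position " + _ + ".")

// Mapping over the empty case just gives back the empty case.
val missing: Option[Int] = None
val stillMissing = missing.map("The character was at position " + _ + ".")
```

All three forms produce the same Option[String]; only the amount of ceremony changes.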

Now, if we need to do something with our String, we will at some point need to open the Box. But what if the character was not found and our Box is Empty? The Box lets us deal with that with the openOr method, which takes as an argument a default value to use when the Box is not Full, so that something like result.map("The character was at position " + _ + ".").openOr("The character was not found.") always gives us a String.

But wait! It’s not all! Scala lets us drop the dot and the parentheses when invoking a method that takes only one argument, so map and openOr can both be written as infix operators.

A few parentheses are remaining. We can get rid of most of them by using format, with a placeholder for the index, instead of those messy plus signs.

Either way, the expression returns either something like The character was at position 0. or The character was not found. Perfect. Clear. Simple.
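Here is how that progression might look with the standard library’s Option, where getOrElse plays the part that openOr plays on a Box. That substitution is an assumption made to keep the sketch runnable without Lift, and notFoundMessage is a name introduced here for readability.

```scala
val result: Option[Int] = Some(0)
val missing: Option[Int] = None
val notFoundMessage = "The character was not found."

// Dotted form: map, then fall back to a default value.
val dotted = result.map("The character was at position " + _ + ".").getOrElse(notFoundMessage)

// Infix form: no dots, and no parentheses around the single arguments
// except where the placeholder expression needs them.
val spaced = result map ("The character was at position " + _ + ".") getOrElse notFoundMessage

// format instead of string concatenation.
val formatted = result.map("The character was at position %d.".format(_)).getOrElse(notFoundMessage)

// The empty case falls through to the default message.
val notFound = missing.map("The character was at position %d.".format(_)).getOrElse(notFoundMessage)
```

Whichever spelling is used, the result is always a plain String, so the caller never has to branch on whether the character was found.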

But we can still do more! By which I mean less, in the long term.

I Really Shouldn’t Have to Say This

This only worked because we used the find method on our SuperString.

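Just to be explicit, the class SuperString is nothing more than a thin wrapper around a String, with a find method that hands the work over to indexOf.

Oops. A find that just passes along whatever indexOf returns does not actually use Boxes. Let’s add them: find should return a Full containing the index when indexOf finds the substring, and an Empty when indexOf returns -1.

(Yes, in Scala, classes take arguments. That means that the class declaration doubles as the primary constructor. Scala likes concision.)

Of course, a String can easily be converted to a SuperString with a one-line function that takes a String and returns a new SuperString wrapping it.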

So, what this would give us is:

  • A class called SuperString that has a find method,
  • A String class with which we could probably make good use of our find method, but that we can’t extend because it’s final,
  • A method that can totally convert an object of one class (String) into an object of the other (SuperString).

Couldn’t Scala figure out on its own how to combine these three things? Well, yes and no. Yes, it could, but that would be messy. What if we don’t want it to? Just because we can convert one type into another doesn’t mean we always want to. So what if we indicated that we do want to? Scala does let us tell it that with the almost magical implicit keyword. Let’s make that conversion function from before implicit by putting the implicit keyword in front of its definition.

So now, code like "abc".find("a") will return a Full(0) and "abc".find("w") will return an Empty. When Scala fails to find a suitable find method in the String, it looks for another type of object that has one and checks whether there is an implicit way to do the conversion. If there is, it applies it. And so, it looks like we’ve added a method to a final class, when in fact we’ve done nothing of the sort. We could also remove the dot and parentheses for the method call. Super cool.
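The whole mechanism fits in a few lines. This is a sketch under two stated assumptions: it returns the standard library’s Option instead of a Lift Box so that it runs on its own, and the method is called findIndex (a hypothetical name) rather than find, to sidestep any overlap with the find method that the standard library already offers on strings. stringToSuperString is likewise a name made up for this example.

```scala
import scala.language.implicitConversions

// A wrapper class; the constructor argument s is declared in the class header.
class SuperString(s: String) {
  // Some(index) when sub occurs in s, None otherwise (Option standing in for Box).
  def findIndex(sub: String): Option[Int] = {
    val i = s.indexOf(sub)
    if (i >= 0) Some(i) else None
  }
}

// With `implicit`, the compiler may apply this conversion on its own
// whenever a String is asked for a method it does not have.
implicit def stringToSuperString(s: String): SuperString = new SuperString(s)

val first = "abc".findIndex("a")    // Some(0)
val absent = "abc".findIndex("w")   // None
val spaced = "abc" findIndex "c"    // Some(2), dot and parentheses dropped
```

Without the implicit keyword, the three findIndex calls would not compile, and each would have to be written as new SuperString("abc").findIndex(...) instead.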

The Final Line

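So to get our index, format it and provide a default message if it fails, all we have to do now is chain find, map and openOr on a plain String, in a single line. Looking for "a" in "abc", this will evaluate to The character was at position 0. Whereas looking for "w" in "abc" evaluates to The character was not found.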

This is fully statically typed, extremely concise, clear and powerful. We don’t need a special case to check if the text is not found; it’s all handled by the Box. We don’t need a separate branch to produce the default output for the special case; the Box handles that as well.

Once you can wrap your head around the basic concepts, it’s all smooth sailing.

Final Note

This code uses libraries from the Lift web framework. To get Scala working with them, the easiest thing to do is to run it from the Lift console. Download Lift. Make sure you have Java installed. You don’t actually need Scala. Lift will get its own version of Scala for you. From a Lift project’s main directory, start the Scala console through sbt. Then, from the Scala console, import Lift’s net.liftweb.common package, which is where Box, Full, Empty and Failure are defined.

You’re set.