Category Archives: Erlang

Simple Lempel-Ziv Compression in Erlang

I decided to do a little bit of something useful with Erlang, both to have some code to show, and to get some sense of what it’s like writing something beyond a completely trivial example. Because the capabilities of Erlang shine when you get into low-level bit oriented things, I thought that writing a bit of data compression code would be make for a good example. I’m going to present it in two parts: first a simple but stupid version of the algorithm; and then the encoding part, which into bit twiddling, and potentially gets more interesting.

I’m going to use a simple version of Lempel-Ziv compression. Before I get into the detail
of how LZ compression works, I’ve got an interesting story about how I learned about it. My first summer in grad school, I was looking for a job. One of my college friends worked for a subsidiary of Travellers insurance, and he got me an internship with them.

Our group (3 of us) worked on programs that ran on salepeoples’ laptops. Since this
was 1991, laptops were still pretty primitive. We had to run in 128K of memory, with the best machines having 10 megabytes of hard disk, and the worst machines having two floppies. So memory use was always a huge issue.

Being an insurance company, naturally things were pretty bureaucratic. They hired me to write a prototype of a new program. I don’t remember the details of what I was supposed to write. But the way things worked, they wanted to build this new tool for the salespeople. But the company wouldn’t let them propose a new project without having staffing for it. So they hired me to work on it. But because they hadn’t proposed the project before they hired me,
I had nothing to do while they proposed it, and worked out their requirements. So I worked for them for a total of about 12 weeks; and it took them about 9 1/2 weeks to get to the point where they had an approved project for me to work on. So I had over two months with nothing
to do. So I kept pestering the other two guys about what I could do to help them.

One of the things they wanted was a way of doing a splash screen. They had a GIF image of the company logo, and they thought it would be nice to be able to splash it on the screen whenever the sales app loaded. But they didn’t have an image viewer that they could call from inside their program. So they asked me to write one. GIF images are encoded using LZ. So I coded that up the LZ decoder to get the bitmap out of a GIF in C, and they were happy. Then I decided, as long as I had an LZ decompressor, I should write an LZ compressor, and then we’d have some generic data compression code. So I went ahead and did that, and added a
nice, effective set of data compression routines to our libraries. But the manager was actually pissed at me: I’d added a feature to our library – data compression – without getting permission. The right way of doing things was to write a proposal, and pass it around the various levels of petty bureaucrats for two months while I sat and twiddled my thumbs on the payroll.

Anyway… Back to compression.

Continue reading

Records in Erlang

One of the things I discovered since writing part one of my Erlang introduction is that Erlang has grown a lot over the last few years. For example, the idiom of tagged tuple as a way of creating a record-like structure has been coded into the language. There is, unfortunately, a bit of a catch. They aren’t really added to the language. Instead, there’s a pre-processor in Erlang, and records
are defined by translation in the pre-processor. This to me typifies one of the less attractive
attributes of Erlang: much of Erlang has a very ad-hoc flavor to it. There are important high-level features – like record data and modules – which aren’t really entirely part of the language. Instead, they’re just sort of spliced in however it was easiest for the implementors to cram ’em in. And there are other things that were added to later versions of the language that, while first class, are awkward – like the “fun” prefix for first-class functions.

The effect of that kind of thing is mainly syntactic. The implementation is good enough that
even though things like records are really pre-processor constructs, they feel almost as
if they’re built-ins.

Anyway – let’s take a look at records.

Continue reading

Erlang: a Language for Functional Concurrency (Updated!)

Several commenters pointed out that I made several mistakes in my description of Erlang. It turns out that the main reference source I used for this post, an excerpt from a textbook on Erlang available on the Erlang website, is quite out of date. I didn’t realize this; I assumed that if they pointed towards that as a tutorial, that it would represent the current state of the language. My bad. As a result, several things that I said about Erlang – including some negative comments, were inaccurate. I’ve updated the article, and the changes are marked with comments.

As long-time readers will recall, one of my greatest interests in programming languages. A while back, I wrote a tutorial on Haskell, which is one of the most influential languages currently in the programming language research community. Haskell is a pure, lazy, functional language which gained a lot of attention in recent times for being incredibly expressive, while maintaining a solid theoretical basis with well-defined semantics. What made Haskell unusual was that it’s a completely pure functional language – meaning no true state at all – not for I/O, not for assignment, not for mutable data – but through the
use of a clever construct called a monad, you can create the effect of state without disturbing the functional semantics of the language. It’s quite a nice idea, although I must admit that I remain somewhat skeptical about how scaleable it might be.

One of the competitors of Haskell for mindshare in the community of people who are interested in functional programming languages is a language called Erlang. In many ways, Erlang is a polar opposite to Haskell. Erlang is, basically, a functional language, but it’s designers didn’t object to tossing in a bit of state when it made things easier. It’s dynamically typed,
and even for a dynamically typed language, it’s support for typing is weak. It’s gotten its attention for a couple of big reasons:

  1. Erlang was designed by Ericsson for implementing the software in their switches and routers. It’s the first functional language designed by a company for building production applications for extremely complex performance-critical low-level applications like switching and routing.
  2. Erlang is specifically designed for building distributed systems. As I’ve mentioned before, programming for distributed systems is incredibly difficult, and most programming languages are terrible at it.

Concurrency and distributed systems are a big deal. I’d argue that programming for concurrency and distribution are the biggest problems facing software developers today. Pretty much every computer that you can buy has multiple processors, and to take full advantage of their power, code needs to use concurrency. And the network is an unavoidable part of our environment: many of the applications that we’re writing today need to be aware of the internet, and need to interact with other systems. These are just simple facts of the modern computing world – and software developers need tools to deal with them.

Continue reading