It's Official: Everything is OO

Posted by Daniel Lyons Thu, 09 Aug 2007 05:59:00 GMT

My wonderful arch nemesis Kirit Sælensminde is up to his dastardly old tricks again. Erlang must be officially in the cool now, or else no one would be trying to demonstrate that their language of choice (or invention, to Kirit’s credit) maps perfectly onto it, or vice-versa.

Let’s talk about the semantic differences:

  1. Erlang messages are one-way. All OO languages have call-return semantics.
  2. Erlang’s functions and message passing systems are fully orthogonal.

Erlang is much, much weirder than your run-of-the-mill OO language. But of course we can dig out examples that are different from your run-of-the-mill language—my favorite example is Io. In Io, you can prefix your message with @ to get a transparent future, or @@ to send an asynchronous message altogether. (Of course, let’s forget for a second that Io is implemented with user threads, so it doesn’t scale anything remotely like what Erlang does on huge quantities of processors.) But in Erlang, if you want to return something, you have to send a new message back to the caller. Or two messages. Or none, or whatever.

While we’re thinking of languages with interesting semantics, find me one in which the methods work in a completely different way from the functions. In a decent OO language, you can’t even make a function that isn’t hooked up to a class or an instance.

Also, there are liberties you get with Erlang that you would never get with OO; because the state is just whatever you happen to pass through the listening function. You can call completely different functions when you’re done handling this request, transforming or dropping your state on the floor altogether. This is how you get live code replacement—because there is a boundary between your “behavior” and your “state,” unlike OO where the whole point is to mingle the two ideas.

Then of course we also have implementation differences:

  1. Starting an Erlang process is cheaper than whatever you’re doing (threads, fork/exec, whatever).
  2. Sending an Erlang message is cheaper than executing a method.

I think I’m really getting at two ideas simultaneously and I fear I may be muddying the water. The first is that there’s nothing surprising about computational equivalences between Erlang and OO language X. After all, in C++ all those fancy method invocations wind up being function calls inside the compiler, which in turn wind up being jumps or whatever in the binary. What is surprising and obnoxious is the thought that this computational equivalence implies that language X ought to be as great as Erlang (or whatever other language we’re talking about). We’re not ever comparing the totality of language X and Y, just whatever subsets of each that happen to make the discovery of this equivalence easy.

If there’s nothing new about Erlang’s message passing system, then why doesn’t every OO language have the parallelism benefits Erlang has? Joe asserts that the reason is because Erlang is a functional programming language free of destructive assignment, and, by extension, most of what you think of as being state in an imperative language. Isn’t the whole point of OO that your objects have state and behavior and can be carried around as a unit? If Erlang is OO, it somehow wound up there starting from completely different pieces.

Supposing that Erlang is OO but it got there a completely different way, what makes you think Erlang has any ramifications for language X? It seems possible (putting it mildly) that there is more to a programming language than simply what can be proven about it mathematically. That there are aesthetic concerns and pragmatic concerns well outside what can be proven with math. While we were talking about Erlang from an OO perspective, did we mention the utility of its bit syntax? The closest comparison I’ve seen to it is Haskell’s BitSyntax library and even that’s terrible in comparison. And don’t get me started about aesthetics. An anonymous object with a method on it might be “equivalent” to a lambda abstraction, but look at these two things and tell me with a straight face you wouldn’t prefer the OCaml version to the Java version:

// Java
addWindowListener( 
     new WindowAdapter() {
         public void windowClosing( WindowEvent e ) {
             System.exit(0); 
         } 
     });

(* OCaml *)
addWindowListener (fun e -> System.exit 0)

Erlang is here, now, alive and kicking, gaining in mindshare and popularity every day. It beats the crap out of basically everything else when it comes to concurrency. It has a reasonably large built-in library, it’s been field-tested and really delivered. So what’s with the rush to talk about taking its features to other languages? This is really puzzling to me because Erlang is among my top six languages for implementing damn near anything. I’d vastly prefer it to, say, JavaScript, which is the foundation for Mahlee. JavaScript is a potentially fine language but, as usual from the curly brace contingency, it eschews expressiveness for something like backwards compatibility with menial programmers. Of course semantically I love it, but I love Io so much more I can hardly think about JavaScript. I see it as a web development necessity made tolerable by Prototype, which really is just Ruby for JavaScript when you get down to it.

The rush to pirate the #1 interesting feature from Erlang into various other languages is indicative to me of two things:

  1. Everyone wants easy concurrency right now, and
  2. People do not fully appreciate either the mathematical, pragmatic or the artistic value of a programming language.

If people really were interested in a language as a mathematical principle, then they would only program in languages that map directly onto mathematical formalisms, and they’d chafe at the places where the mapping isn’t perfect. Instead, we rely on math when it’s convenient for us, and discard it the rest of the time.

If people really were interested in a language pragmatically, they would see the value of the language as being a holistic experience. I think there are some truly pragmatic programmers in this world using C, C#, C++, Python and Ruby simply because the combination of the language and the library is best for them. It’s extremely contrary to pragmatism, however, to look at other languages solely in terms of collections of ideas which can be stolen. Languages that are assembled this way are rarely beautiful and never mathematical. In general they come out looking like C++.

If people had a taste for languages as an art form, they would certainly find Erlang to be too beautiful in its own right to want to rush out and copy it. And even if they did, for some reason, desire to do that (perhaps because of some great idea which is impossible in Erlang or merits its own language) they would at least have the artistry to not make it another hard to read, hard to parse nightmare like a curly-brace language. I would, in all seriousness, expect it to look more like Haskell. I’d even take a Ruby/Python hybrid, personally, having the keywords and flexibility of Ruby combined with syntactically meaningful indentation/whitespace.

To return to the topic, it’s very easy to look at Erlang’s message passing and attribute the performance and reliability of Erlang to it. This is an error. Erlang’s performance is not a pure math result, but rather a combination of having a very strong theory supported by a very strong implementation and an environment that encourages coding to a certain aesthetic standard that maximizes the utility of the implementation and theory. Threading can be done correctly, beautifully, and so forth but nothing about threading itself or its implementation in threaded languages in any way encourages this—and threading today is considered expert knowledge, practiced very conservatively by most developers if at all, with the vast majority preferring to write unthreaded code or for systems such as web servers backed by relational databases which mitigate the complexity. Ruby, like Io, uses green threads internally yet by the grace of web servers and databases is used to serve millions of concurrently-generated pages daily by everyone using Rails. (And it’s worth pointing out that its user threading is much poorer than Io’s, which is actually quite good for what it is).

Erlang does everything in its power to help you and encourage you to write concurrent programs the right way. Some of this is syntactic—your messages can be of arbitrary complexity and pattern matching makes it quite simple to decide what to do with a given message. Some of this is semantic—you cannot assign to a variable that’s already bound. Some of it is in the standard library, like OTP, which makes it easy to write servers and monitor them. The whole infrastructure is conspiring with the programmer to the same end. It would be a lie to tell someone, “well, you’ve used OO, that’s message passing, that’s basically the same” because they haven’t experienced it in truth. From this perspective saying “Erlang is OO” is completely meaningless because it’s so much less and so much more than OO is. And simultaneously, to say “If you have asynchronous messages, then you have Erlang” would also be a lie, because it’s like pointing to the engine of a car and saying it’s all there is to it.

A language is so much more.

Tags , , , , ,  | 2 comments

Pattern Matching and Unification

Posted by Daniel Lyons Sat, 19 May 2007 03:58:00 GMT

For fun during my meeting with Baird the other day at the Flying Star on Central, I gave him a toy problem I read somewhere on the internet.

Problem: In Prolog, write a predicate len_three/1 which succeeds when its parameter is a list of three items.

A crummy answer looks like this:

len_three(X) :-
   length(X, Len),
   Len = 3.

Of course there are always many crummy answers. Here’s another crummy one:

len_three(X) :-
   length(X, 3).

The best answer is to let the unification algorithm do the work:

len_three([_,_,_]).

Erlang isn’t a far cry from Prolog syntactically, so it’s no surprise that this is valid Erlang:

len_three([_,_,_]) -> true.

To illustrate using the remainder of a list, I showed him how to write head and tail in various languages:

 -- Haskell
head (x:_) = x
tail (_:xs) = xs

(* OCaml *)
let head (x::_) = x  
let tail (_::xs) = xs

% Prolog
head([X|_], X).
tail([_|T], T).

% Erlang
head([X|_]) -> X.
tail([_|T]) -> T.

So you see these are all pretty analogous to each other, thanks to the pattern matching or unification functionality of the various languages.

So then Baird wanted to see an example of peeling some items off a list but also using the remainder. So I proposed pairs/1, which takes a list as a parameter and returns a list of lists, each sublist containing two items. Implementations:

-- Haskell
two (x:y:xs) = [x,y] : two xs
two [] = []

(* OCaml *)
let rec two = function
  | (x::y::xs) -> [x;y] :: two xs
  | []         -> []
;;

% Prolog
two([X, Y | Xs], [[X,Y] | Result]) :- two(Xs, Result).
two([], []).

% Erlang
two([X, Y | Xs]) -> [[X, Y] | two(Xs)];
two([]) -> [].

Some observations:

  1. Notice how much more obfuscation there is going on in the OCaml version than the Haskell version. Taste that sugary sweetness of Haskell’s syntax.
  2. In theory, the Prolog version should be reversible—in other words, you should be able to unify by calling two(X, [[1,2],[3,4]]) and getting X = [1,2,3,4] back. In practice, I have some trouble writing predicates that have that property, and I’m not sure what I’m doing wrong. I should study Prolog more carefully.
  3. Notice the Erlang version is shorter than the Prolog version. Shorter and perhaps slightly more readable. I have omitted the module and export declarations, however.
  4. The Erlang implementation does something funny if you give it 1-10:
    2> two:two([1,2,3,4,5,6,7,8,9,10]).
    [[1,2],[3,4],[5,6],[7,8],"\t\n"]

Tags , , , , , , ,  | 2 comments

Functional Programming

Posted by Daniel Lyons Thu, 17 May 2007 09:11:00 GMT

Functional programming is undergoing a sort of a collective raised eyebrow these days. Here’s why I think it’s cool and useful:

  1. Functional programs are more concise.

    This is because recursive definitions are closer to their mathematical counterparts than procedural ones.

  2. Functional data structures have interesting properties.

    For example, purely functional lists and trees work through an implicit copy-on-write, meaning older versions of the data structure can continue to exist very cheaply.

  3. Lazy languages improve performance through memoization.

    If all your functions have the same result for a given input, then the compiler can just substitute the answer for the whole computation after it has been computed once. This saves a lot of compute cycles at the cost of some memory.

  4. Lazy algorithms let us use high-level metaphors without sacrificing performance.

    It’s natural to talk about, for example, the Fibonacci sequence as an infinite list, but only lazy languages like Haskell let you really define it that way.

  5. Functional programs are more modular than procedural or object-oriented programs.

    Through algebraic data types, type classes, polymorphic functions and functors, you can reap almost all the benefits of OO software construction. But you also get pattern-matching and a wealth of function composition functions, and very cheap and pretty anonymous functions. Your glue code becomes small and transparent. The net effect is tremendous.

  6. Functional programs have fewer bugs.

    Functional programs depend on mathematical properties to a far greater extent than on intrinsic state. This means the method of computation is not hidden in the background in terms of correct sequencing of modifications to some binary chunk of data. Because the pieces are smaller and work deterministically with a given input, you are more confident of the result when they are combined. Higher-order functions (functions which combine other functions) are smaller and well-understood, and replace much of what would be hand-written glue code in a procedural language. All these attributes combine to make code that, in general, is more reliable than procedural or OO code.

Flame on.

Tags , ,  | no comments

Norvig's Spelling Corrector in Haskell

Posted by Daniel Lyons Mon, 09 Apr 2007 08:15:00 GMT

I caught wind of Norvig’s statistical spellchecker essay earlier today and decided to do a reimplementation in Haskell.

Right now, my version blows. It’s 61 lines of code compared to his 25, substantially slower and quite ugly to boot. Yay. :-P

In related news, I bought Programming Erlang. And I know someone who is using Oz which looks ugly and horrible but rather powerful at the same time, so I installed that too.

Tags , , , , ,  | no comments

Haskell Arrows

Posted by Daniel Lyons Sun, 08 Apr 2007 11:14:00 GMT

Just a quick post about my implementation of FizzBuzz with Arrows.

If you’re not familiar with arrows, they’re usually called a generalization of monads to arbitrary computations. In this case I realized I could use an arrow to handle the combination of both mod 3 and mod 5 tests. In reality, it isn’t much nicer than the ordinary function composition method, except that I get to write the tricky part in points-free form. Actually, most of the program is points-free.

The usual stuff at the beginning, and the Arrow library:

module Main where
import Control.Arrow

For simplicity, I defined some helper functions for testing divisibility and generating a test. I decided to use the Either monad to carry around either an integer if the function didn’t mangle the number into a string, or the string. Then I use the utility to define my functions three and five which handle their respective tests:

-- True if y divides x evenly
divides :: (Integral a) => a -> a -> Bool
x `divides` y  =  y `mod` x == 0

-- Generates an Either monad function for two types
genTest :: (a -> Bool) -> b -> a -> Either a b
genTest test result x = if test x then Right result else Left x

-- The fizzbuzz functions
three, five :: Integer -> Either Integer String
three = genTest (divides 3) "Fizz" 
five  = genTest (divides 5) "Buzz" 

The last piece of the FizzBuzz part of the problem is something which combines the result of the three and five functions. I wish there were a way to get rid of this but I’m not sure there is. At least by using pattern matching, I think the code is somewhat more aesthetically pleasing. Take a look:

-- A thing to combine the results of the fizzbuzz functions
combine :: Either Integer String -> Either Integer String -> String
combine (Left x)  (Left _)  = show x
combine (Right x) (Right y) = x ++ y
combine (Right x) _ = x
combine _ (Right y) = y

OK, just one thing left: actually sequencing the operations and doing the damn work! I want to wind up with a function which takes a number and returns the appropriate string:

-- OK, run three and five with the same input and combine the result
fizzbuzzArrow :: Integer -> String
fizzbuzzArrow = three &&& five >>> uncurry combine

This is the real meat. The arrow here is doing this computation (forgive my horrible ASCII art:

          val
           |
           |
           |
         (&&&) -- the fanout operator copies val
          / \
         /   \
        /     \
       /       \ 
      /         \
     |           |
     |           |
    three val    |    -- call three on the first copy
     |           |
     |      five val  -- call five on the second copy
     |           |
     |   (>>>)   |    -- send both values onward
     |           |
    (r1,        r2)   -- if r1 = three val and
     |           |       r2 = five val, this is the 
     |           |       tuple we're carrying around
     |          /
     |         /
     |        /
     |       /
     |      /
     |     /
    uncurry combine   -- uncurry unpacks a 2-tuple into
           |             two args for a function call
           |             thereby calling our "combine" 
           |
        result        -- this is the function's result 

I hope that’s clear!

Now that we have our fizzbuzzArrow function which takes a number and produces the right string, we just have to call it on the correct input and print out the result. This is pretty simple in Haskell:

-- The main: apply the function (with printing with a newline) to all of
-- the first hundred integers.
main = mapM_ (putStrLn . fizzbuzzArrow) [1..100]

Nothing tricky there; just using mapM_ to send the list of numbers between 1 and 100 through a function composition of putStrLn and fizzbuzzArrow. The composition has type Integer -> IO (), which is why mapM_ is necessary rather than map.

All this, just for points-free style?! How about the simpler, non-arrow function combining method! It looks almost the same:

-- fizzbuzz :: Integer -> String
fizzbuzz x = uncurry combine (three x, five x)

main = mapM_ (putStrLn . fizzbuzz) [1..100]

Yeah, yeah… well, hopefully the arrow is at least instructional!

Tags , , ,  | 1 comment

F-cking Fast

Posted by Daniel Lyons Wed, 20 Dec 2006 07:46:44 GMT

Raise your hand if you think this sounds cool:

...then my page rendering algorithm will look like me iterating through the list of regions producing the strings and slapping them together.

...actually, I’d like to parse the template into a list of closures, some which produce text immediately (the static part of the template), and others which perform a region content lookup.

...and then a page request will actually be a fold-right over the list of template functions, passing in ”” as the initial value and using (concatenate ‘string) as the fold function.

...and then we scream “bwa-ha-ha” over how fucking fast that puppy is going to be.

Tags , , ,  | 3 comments

Functional Voltaire

Posted by Daniel Lyons Wed, 13 Dec 2006 07:43:44 GMT

So often lately I think about how to make Voltaire functional. A key part of Voltaire is adding code for specific sites, both via plugins and into the regions themselves. This immediately disqualifies OCaml.

The problem of putting code into a template is basically a parsing problem. I already have parsers for VTemplates in OCaml and Haskell. It was a snap. I know of no parsing helper tools for Lisp. Bummer. Erlang has yecc and leex, which I might be learning soon, even though they’re very unorthodox.

OTOH, putting Lisp code into a region and running it later is a snap. eval is part of the REPL, after all. So it takes to dynamically inserted code like a dream. Erlang has a built-in library for evaluating strings, but it’s somewhat less convenient to use than Lisp’s eval. But it’s probably better too, it being Erlang. Haskell, I thought, has no eval. Actually it does, in the form of a library that doesn’t compile against GHC 6.6 (industry standard). It’s sort of wacky to me that you can implement eval as a library, but whatever.

I have in mind a good reason to use Lisp though: keeping data and functions together in the image. After all, the whole “regions table” could be kept in memory as a sort of infinitely nested hash or a-list, carrying around templates which have been compiled into simple functions which are applied to the request args or the requested path or whatever. In fact, it seems clever enough it would be fun to use. One drawback to such an implementation is that the code must then be kept in memory, rather than in a database (I think). So, I would have to supply a mechanism for changing the templates over the wire, and I’d have to keep around the template source, and worry about reading and writing this data structure to disk. Michael’s site is larger than Nina’s somehow and comes to a little under one meg of actual region data, though, so it may not be so bad performance-wise. But code would have to exist which currently doesn’t. On the plus side, if it were simply a nested a-list, the REPL could take care of the disk access, mostly. A hash table would be more trouble.

So, I have bits and pieces of Voltaire implemented in a half-dozen or so languages (several other chunks I haven’t mentioned here). As a family, I’d like to call them The Old Women, after the half-assed woman of Candide. If I eventually distill an implementation in one of them, I may name it Pangloss, since that name befits anything the result of such foolish optimism as I have.

...

I was told earlier today that my functional programming chops, combined with another person’s OO chops, “and Eastern Europe doesn’t stand a chance!” Rock on.

Tags , , , , ,  | no comments

More Haskell L99

Posted by Daniel Lyons Tue, 12 Dec 2006 08:13:36 GMT

I’m up to #23 and just now hit the first snag. Well, the second snag. An earlier problem depended on infinitely nested lists, which you can’t do in Haskell without making a new data type and I didn’t feel like doing it. (#11).

A couple observations:

  1. In general, the Haskell implementation is very short. The average length for these solutions is 3.7 lines of code (including types, when the compiler made me add them).
  2. In general, the Haskell solution is quite quick to arrive at. Things that would be difficult to figure out how to express in Lisp seem to come together more nicely because there’s less between you and your idea, syntactically (believe it).
  3. When you’re not sure what the hell you’re doing, it helps to add a type signature.
  4. If you added a type signature and it’s still not buying it, actually read the error message. All of the messages I’ve gotten from GHC have been extremely helpful, when I’ve stopped to actually, you know, read them.
  5. I’m a little surprised at how easy these problems are. There have only been two which struck me as intrinsically more difficult (#6 and #9). This is either because I’ve read the Little Lisper or because I am just more familiar with functional programming than I used to be.

I guess it could also be because I cheat left and right: in #18, I have this line of code: slice l 1 y = take (y-1) l, and my implementation of #22 is just range x y = [x..y]. But hey, Haskell is more expressive. What can I say?

I think I may post the file containing all my solutions when I’m done. It’s 160 lines of code right now (I didn’t include comments in the average above).

Tags , , ,  | no comments

Haskell: Implementing "if"

Posted by Daniel Lyons Mon, 11 Dec 2006 22:08:54 GMT

It just occurred to me, Haskell is the only language I know of in which one can implement if without using if (or cond, or whatever):

my_if True x _ = x
my_if False _ y = y

A test:

*Main> my_if True (putStrLn "Hello, World!") (putStrLn "Goodbye, World!")
Hello, World!
*Main> my_if False (putStrLn "Hello, World!") (putStrLn "Goodbye, World!")
Goodbye, World!
*Main> my_if (2 == 2) (putStrLn "Hello, World!") (putStrLn "Goodbye, World!")
Hello, World!
*Main> my_if (3 == 2) (putStrLn "Hello, World!") (putStrLn "Goodbye, World!")
Goodbye, World!

Neat!

Of course I’m sure Lisp could do this with a macro (and if or cond), but to have this in the very fabric of the language is kind of nice.

Tags , , ,  | 3 comments

A Haskell and Lisp Comparison

Posted by Daniel Lyons Mon, 11 Dec 2006 21:33:00 GMT

This struck me as I was working through the L-99 99 Lisp problems:

P09 (**) Pack consecutive duplicates of list elements into sublists.
If a list contains repeated elements they should be placed in separate sublists.

Example:
* (pack ‘(a a a a b c c a a d e e e e))
((A A A A) (B) (C C) (A A) (D) (E E E E))

Well, naturally I was using Haskell rather than Lisp, so I came up with this:

pack :: (Eq a) => [a] -> [[a]]
pack []     = []
pack (x:xs) = pack_it [x] xs where
    pack_it y []           = [y]
    pack_it (y:ys) (l:ls)  = if y == l 
        then pack_it (y:l:ys) ls
        else (y:ys) : pack (l:ls)

It’s not amazingly readable, but in plain English, what’s happening here is I’m using a function pack_it which takes two lists. If the first thing in the second list is the same as the first thing in the first list, it recurs, shifting the element over to the first list. If that isn’t the case, it produces the first list concatenated with the recursive invocation of the parent function pack with the remainder of the list. The parent function just calls pack_it with the first item of the list in its own list as the first list, and the remainder as the second list.

It’s kind of surprising to me that I’m able to write code such as this.

Anyway, for comparison, I thought I’d take a peek at the Lisp implementation on the same site. It looks like this:

(defun pack (lista)
  (if (eql lista nil) 
      nil
      (cons (pega lista) (pack (tira lista)))))

(defun pega (lista)
  (cond ((eql lista nil) nil)
    ((eql (cdr lista) nil) lista)
    ((equal (car lista) (cadr lista))
     (cons (car lista) (pega (cdr lista))))
    (t (list (car lista)))))

(defun tira (lista)
  (cond ((eql lista nil) nil)
    ((eql (cdr lista) nil) nil)
    ((equal (car lista) (cadr lista))
     (tira (cdr lista)))
    (t (cdr lista))))

How about that. 19 lines versus… 6. Or 7 if you include the type.

Edit: Justin Dressel suggests this implementation with guards and the @ pattern:

pack (x:xs) = pack_it [x] xs
  where
    pack_it y []  = [y]
    pack_it y@(y1:ys) l@(l1:ls) 
      | y1 == l1  = pack_it (l1:y) ls
      | otherwise = y : pack l

Beeyuteeful.

Edit #2:

My Haskell implementation of pack, translated into Lisp:

(defun pack (l)
  (if (null l) nil
      (let ((x (car l))
            (xs (cdr l)))
        (pack-it (list x) xs))))

(defun pack-it (h l)
  (let ((kind (car h))
        (hd (car l))
        (tl (cdr l)))
    (if (eq hd kind)
        (pack-it (cons hd h) tl)
        (cons h (pack l)))))

This is a mere 13 lines of code compared to the 19 of the tira/pega implementation. Which makes me think I may have done something cleverer.

Tags , , , ,  | 9 comments

Older posts: 1 2