Posted by Daniel Lyons
Mon, 12 May 2008 15:17:00 GMT
Over the weekend I wrote the same program in three (almost four) languages: Haskell, Common Lisp, Erlang and almost Prolog.
I have an informal result I want to share. The first one I wrote was in Haskell. Of the four, it was definitely the most fun to write. I did, however, avoid monadic computation when it probably would have been amenable, instead I opted to use interact and a small combinator of my own design. In a “real” program in a strict language I’m sure it would have been inefficient. I felt the freedom of Haskell, letting you design the solution in the most abstract, high-level way you can imagine it, and still make it quite small.
Next up was Common Lisp. I was surprised by several things. For one, it was shorter by a handful of lines than the Haskell version. I used essentially the same design as the Haskell and that seemed to work well enough, though Lisp lacks the combinators that Haskell comes with, so I wound up using a do loop for something I was just doing with map in Haskell. And of course I was annoyed by the effort it takes to produce an executable: (sb-ext:save-lisp-and-die "a.out" :toplevel 'main :purify t :executable t).
Then I started on Prolog. Of course, the logical part of the problem was short and sweet. Then I thought about what it would take to do line input/output, got the shivers, and decided to move over to Erlang. The syntax is fairly similar, though the execution strategy is shockingly different, but this application didn’t require any backtracking anyway.
Erlang wound up being shorter than Lisp or Haskell.
This fact stunned me, though I can explain it: Lisp requires a fair amount of schmuzz to do things that are free in Haskell and Erlang (namely, defun steals a line for itself) and Haskell requires a fair amount of schmuzz for human-assisted type checking. Erlang didn’t need either of those and has some of the strongest pattern matching around (though the limited guard support is obnoxious). Erlang did need Rubyesque and despicable list_to_integer and atom_to_list type conversions, but none of those cost me a line. You could omit the type declarations in Haskell, and I did add a line to make the type signatures more readable, but that’s really bad form.
I also found myself wondering how much I was buying myself by using Haskell’s data types for part of the computation. What’s the difference between reading a character or two to create a special instance of some data type and then using that everywhere, versus just using the character? In Haskell I’ll get a read error, whereas in Lisp and Erlang an ecase or a pattern match is going to fail later in the game, but not that much later. Does the anal retentiveness pay off in the end? I’m starting to wonder if, as Jonathan Edwards is proposing, formality is a cure that’s worse than the disease.
- Erlang was the shortest (least syntactic BS as far as writing functions are concerned).
- Erlang was the hardest to debug (the error messages are more machine-readable than human-readable).
- Haskell was the most fun to write (I guess I like function combinators and higher-order programming).
- Haskell took the longest to write (it was the first one I did).
- Lisp was the easiest to write and debug.
- Lisp was the hardest to get to a binary of those that could (I don’t think Erlang can ever give you back a binary directly).
- Lisp’s schmuzz was the most obnoxious to write (especially the hapless
destructuring-bind, which is substantially underpowered compared to Erlang’s or Haskell’s pattern matching facility).
I think I will try using Common Lisp’s bind library later this week to rewrite this and see if I like that better than destructuring-bind.
Tags erlang, haskell, lisp | no comments
Posted by Daniel Lyons
Mon, 05 May 2008 12:09:00 GMT
There is a general desire to keep code and data separate, but they are indivisible.
We use migrations in Rails to manage changes to the database, but some data is invariably essential to the proper working of the application. The database schema itself is data. As your understanding of the dataset of the application changes so does the database schema; data affecting data.
The code itself is utterly meaningless except as transformations to the data. Even esoteric languages like Haskell, J and Forth which do their best to avoid naming the data work only by computing with it. Nothing else happens other than examining and modifying data. Thus, much data is inline in the code: initial values, special values, error values.
Sometimes it feels like there is a hard distinction between data and code, but even then, it is an illusion. On the processor, there are just bytes; there’s no difference between moving bytes from your code into the registers to examine them or moving data from a file into registers to execute them. It’s all just bytes moving from point A to point B.
The source code itself is probably the most obsessively well-managed data on the planet. Nobody uses version control for anything else.
Consider TeX. Is it a document (data) or a transformation (program)? Both. Restricting your output to DVI or PDF is no different from restricting your output to plain text. You can perform arbitrary computation with TeX. And this is of course necessary to make a well-crafted document: code is needed to compute how to style the data properly.
On the web we try very hard to maintain a separation of “content” and “presentation,” but the CSS is served by the same web server that sends out the HTML. More and more people are using the same template systems to generate their CSS as to generate their HTML: CSS isn’t enough on its own. Of course it isn’t, no amount of built-in functionality can ever replace arbitrary code (until you become Turing-complete or close to it). Even then, you use inline styles and generate not particularly flexible HTML interfaces.
There is no real distinction between content and presentation either. These things are all the same. Data. Or code.
Be aware of the illusion you operate under for the sake of good style.
Tags philosophy, programming | 1 comment
Posted by Daniel Lyons
Wed, 06 Feb 2008 17:25:00 GMT
In the past three days I’ve gotten three comments on my blog from three different people I didn’t know. Here are their names, email addresses (partly censored) and IP addresses:
| Name |
Email |
Submitter’s IP |
| Hualing |
hualing_huaa@XXX |
124.106.129.106 |
| Carlo |
carloodriguez_rodd@XXX |
124.217.42.19 |
| Tim |
tarmo_timm@XXX |
124.106.128.178 |
All three of the comments have the same basic form, and seem to be aiming for my Lisp stories. The first and last make the most sense but the middle one is kind of a giveaway:
Carlo:
“I didn’t familiar with Lisp Dialects, maybe I’ll try it if not complicated.”
—
Lisp Dialects
The others, in case you’re interested:
Hualing:
“That should be complicated. Do you have to switch much since you have two different environments?”
—
The Bad Side of Scheme Shell
Tim:
“I find both really complicated! We should have a sort of community to help one another with the coding. It should be quite useful after getting a hang of it.”
—Lisp Dialects
The thing that’s really spooky to me is how similar their blogs are in content. Check ‘em out, they all have basically one random sentence with a paragraph behind the cut, not really talking about anything in particular. One of ‘em, I think Tim, talks about his wife and his girlfriend in two different entries. Also, the date and time on the entries on these blogs are fishy. All of them have an entry near June 7, 2007 and then the next one is near January 11th.
Anybody with a clue what’s going on here? They don’t seem to be pushing any goods or services here or at their blogs. I think maybe they’re just trying to accumulate PageRank, but what for?
Edit: Bill has a solution:
“I think (and I’ll post this as well) that this is a pagerank growing trick.
Those pages will get swapped out with spam pages and/or ads once they have some rank.”
He also points out the WHOIS data:
Registrant Name:philip llave
Registrant Organization:Cholilax
Registrant Street1:30 B Caldecott Broadcast Centre,
Registrant Street2:
Registrant Street3:
Registrant City:Singapore
Registrant State/Province:Singapore
Registrant Postal Code:508984
Registrant Country:SG
Registrant Phone:+65.78921258
Registrant Phone Ext.:
Registrant FAX:
Registrant FAX Ext.:
Registrant Email:philipllave@yahoo.com
Same one for all three domains. Definitely a spam thing.
Tags meta, robots, spammers, weirdness | 8 comments
Posted by Daniel Lyons
Wed, 06 Feb 2008 08:50:00 GMT
I have never used such a frustrating development environment as scsh.
As much as I desperately want to like it, and should like it, for all the right reasons,
- The interactive version is basically unusable. The REPL blows.
- The built-ins seem fragile, especially the very-cool
awk which I can’t seem to get to work.
- DrScheme has library functions which should be present in scsh, but aren’t (
fold-files comes to mind, probably because I’ve spent three hours trying to replicate the functionality in scsh and being unable to.)
- Scsh is built on Scheme-48, which seems to be not particularly widely used compared to PLT and Bigloo/Chicken, so it’s missing SRFIs. Plus, loading SRFIs seems to be wonky (
load doesn’t work, but the command line switch -l does, and it seems obnoxious that I’d have to use absolute paths for either.)
Scsh would be just perfect if someone had ever finished that port of it to PLT. As it stands right now, it sucks having two different Scheme environments on my machine: the really nice DrScheme one and then two different flavors of hell with Emacs, Hen mode for Chicken and run-scheme with scsh.
What to do, what to do.
Tags lisp, scheme, scsh | 2 comments
Posted by Daniel Lyons
Tue, 05 Feb 2008 04:35:00 GMT
“Lisp was not the product of a concerted design effort. Instead, it evolved informally in an experimental manner in response to users’ needs and to pragmatic implementation considerations. Lisp’s informal evolution has continued through the years, and the community of Lisp users has traditionally resisted attempts to promulgate any “official” definition of the language.”—Structure and Interpretation of Computer Programs, Abelson and Sussman, page 3.
The rocking sensation you probably did not feel a couple weeks ago was the release of Paul Graham’s Arc. Many people made many remarks about Arc being a massive letdown. I must admit I shared the sense of disappointment. After so many years of promises, and reading and respecting Paul Graham, and then, almost nothing, a tiny language atop a tiny Scheme. It hardly seems like anything to get worked up about.
The Lisp world is an interesting place these days. By and large, people have an opinion about Lisp based on experiences with Common Lisp and Scheme, which I think account for the vast majority of Lisp users. Yet, as Abelson and Sussman observe above, there never really was a Lisp consensus in the past, and today there are far more interesting variants of Lisp in the wild with few users. I think we expected Arc to represent a turning point in Lisp’s evolution towards a more Pythonic development model, the benevolent dictator, with a programming language aesthete in the driver’s seat. There are two important lessons here.
The first lesson is that we do not need Paul Graham to tell us what makes a compelling Lisp. We can mix-and-match our own stuff. Consider SRFI 26, Notation of Specializing Parameters without Currying:
(map (cut * 2 <>) '(1 2 3 4))
That’s almost as compact as this (hypothetical and untested) Arc code:
(map [* 2 _] '(1 2 3 4))
And of course it would be possible to create a reader macro for many Schemes which would achieve the same trick. But the SRFI-26 syntax also permits arbitrarily nested expressions with arbitrarily many variables—does Arc?
This SRFI has been around for six years and is supported by PLT (DrScheme), Kawa, Guile, Chicken, SISC and Gauche implementations, which is most of the ones you’d ever care to use.
Swindle has been around an equally astonishing amount of time, and is nothing but a bunch of Scheme sources that go atop any old Scheme (though bundled with DrScheme). And that gives you damn near 100% CLOS compatibility, based in part on Tiny-CLOS which is also very portable.
This is just what I’ve been looking at lately. There are a ton of compelling Lisps out there, far more fascinating than Arc; in addition to the ones I mentioned last time there’s also Nu, L# and Clojure.
My advice to you: do what Paul has done. Evolve your own Lisp. Pick Lisp or Scheme as your basis if you want to be able to share, or another language if you want to exploit some power it has. Try to keep your modifications portable or use a portable starting basis. Publish your changes.
The second lesson is, of course, to not be disappointed in Arc. We’re disappointed in Arc because we were expecting something “better” than Lisp, or something other than Lisp. Paul delivers a pile of macros. Well, that’s all you’re ever going to get with Lisp. It’s only a let down because we were expecting some kind of magic. But Paul doesn’t have a monopoly on this magic, we all have it. That’s part of the beauty of Lisp. Maybe it’s a disappointment that Paul really can’t do any better than we can, but remember it also means we can do as well as Paul.
Tags arc, lisp, scheme | 4 comments
Posted by Daniel Lyons
Thu, 31 Jan 2008 08:24:00 GMT
The world seems to be a little bit less optimistic today, because Arc has been out for 24 hours and has not yet changed the world.
In case you can’t tell, I like Lisp a lot. Seeing a Lisp machine work recently changed my opinion of Lisp a bit. I had been looking forward to Arc as a Lisp-2 with dirty macros and a lot of opinionatedness, basically as a tiny Scheme which celebrates all that’s gross about the other half of the Lisp family. But now that it’s here I’m not sure I see the point. It’s not that I don’t love Paul Graham, either. I just feel like something is awry.
There are, as it turns out, a number of minor Lisps out there which attempt to address some aspect of what makes Lisp annoying:
- Qi attempts to bring advances in functional programming back to Lisp.
- Lush tries to push Lisp into the multimedia/science sector.
- Scsh attempts to blend Unix and Scheme together.
- newLisp is a refresher and a shot at bringing Lisp to the web.
- Liskell is a Lisp syntax for Haskell with macros.
The problem, essentially, is that the Lisp machine was not a machine programmed in Lisp. It was a comprehensive environment which blurred the distinctions between language, operating system, hardware and editor. To use Emacs is to accept that the editor and the language/OS/hardware will be separate. To use Common Lisp on a modern OS is to accept that the language and the OS are separate. This goes against the nature of Lisp as fully realized on the Lisp machine. It feels inauthentic.
Of the five above, I have the most interest in the Scheme shell because I see it bringing back the tight integration between Lisp and the OS. Scsh not only brings a concise process spawning/piping and IO redirecting sublanguage to Scheme, it also bridges most of the POSIX APIs. It’s way more than just a shell, covering territory you’d traditionally need C for at the low level, the shell for process control, and a scripting language for logic. Instead you get all three for the price of one. I’m very intrigued.
This of course makes me wish I knew more Scheme. Which is an awkward position to be in as a Lisp partisan of so many years. This quote from Lambda the Ultimate helps:
An hygienic macro is one where the meanings of symbols that aren’t parameters to the macro are bound at the definition site rather than the expansion site.
Another user points out that you can accomplish variable capture with a hygienic macro as well as an unhygienic one, but there are problems you can run into with an unhygienic one that you can’t solve with gensym.
I wish I were less tired so I could be more coherent about all this, but it’s been a hell of a day, a hell of a week, and a hell of a month.
Tags lisp, scheme, scsh | 1 comment
Posted by Daniel Lyons
Sat, 19 Jan 2008 07:48:00 GMT
Reasons to consider switching to masturbation for your new vice instead of a MMORPG:
- It’s free
- It’s more fun
- When you’re done, you have something to show for it
- It requires less time. With this extra time you can (but don’t have to):
- Hold down a job (how about that drinking vice!)
- Spend more with non-imaginary friends
- Accomplish more than one task in a given 24-hour period
I’m having trouble coming up with good catch phrases for this new campaign, but I have a few ideas:
- “Be your own whore—not someone else’s!”
- “Human Contact: it’s not just for those who bathe!”
- “Aren’t you neglecting some of your other needs?”
- “You remember that part in the Matrix where they show the tanks with the people inside that are being used as batteries…?”
Tags bitching, complaining, gamers, morons, ranting, wow | 2 comments
Posted by Daniel Lyons
Sat, 19 Jan 2008 04:44:00 GMT
I somehow find myself participating in a game design group called “Etherware.” I’m not quite sure how, but at any rate, we’re discussing how to implement magic in a video game. Most video games implement magic as a list of pre-defined spells. This is very constricting.
My first reaction was, of course, that to make it better is to make it a language, and now we’re actually talking about making a programming language designed around a particular task and trying to find a compelling UI for it that doesn’t resemble programming.
We’ve identified so far four different “languages”: wand waving, rune cutting, verbal incantations and potion crafting. I haven’t read Harry Potter all the way through so I’m probably omitting several.
Jesse mentioned that he always prized the “true name” concept from various magic systems: that once you know someone or something’s true name, you have power over it.
A “true name” is just a string that can be used as an address. There are a number of ways to deal with this, probably the easiest one is just to make a hash and use that. It occurred to me that the number of objects in any game will be prohibitively large, and anyway, what do you do if the gamer creates new objects?
An easy solution to this problem is to use the memory address encoded in hex. But that’s not very user-friendly, or very pronounceable. Plus they tend to be quite large. On the other hand, you have encodings like base-64 which use the alphabet. Base-64 isn’t very pronounceable either, though. You could take the alphabet as-is and use it to encode numbers under base 26.
On the other hand, there are other types of orthographies, such as a syllabary, in which every possible syllable is enumerated. It wouldn’t work with English because of the vast quantity of dipthongs and consonant combinations, but it works quite well for Japanese, which has two syllabaries, katakana and hiragana. There happen to be around 115 distinct symbols in katakana. So, you can encode any number in Romanized katakana by removing any duplicate sounds, assigning an order to the orthographs, and converting the number to base 115. If you do, you’ll get something like this:
>> 12983.to_name
=> "ryavu"
>> 6.to_name
=> "ki"
>> 12983923239823.to_name
=> "megomeagyubika"
Converting back is easy if you’ve ever implemented atoi() in C:
>> "ryakujani".name_value
=> 33155983
>> "kyu".name_value
=> 53
>> "aeiouaeiou".name_value
=> 7158386439867648320
The latter example there is to illustrate the point that, just because it always works, doesn’t mean it always makes something that would be a word. I haven’t computed the probability, but it’s roughly the probability of a number showing up all 9’s. I expect it to be particularly rare because base 115 can represent numbers with a pretty high compression ratio. The set of three syllable words is 1,520,875 in size. There are 5 vowels out of 115, which means there are 60 ways to choose three vowels in a row, so 60 / 1,520,875 = 0.00395%, AKA not very likely. But still can happen.
Of course, there is already a swell system built into every language and every program of uniquely identifying an object. The memory address! And it happens to be a number. In Ruby, you can get at it with Object#object_id and convert back to the object with the please-don’t-use-me method ObjectSpace#_id2ref. Which means you could also have this:
>> h = {:this => "that", :name => 48}
=> {:this=>"that", :name=>48}
>> h.magic_name
=> "vachugyuku"
>> "vachugyuku".object
=> {:this=>"that", :name=>48}
Works with everything in Ruby, for free. Oh, is that a class?
>> Numeric.magic_name
=> "weseke"
>> "weseke".object
=> Numeric
Yeah, I think it is!
I’m pretty proud of the code. It’s BSD-licensed, so feel free to borrow it. Feedback appreciated!
Tags games, linguistics, programming, ruby | no comments
Posted by Daniel Lyons
Mon, 31 Dec 2007 18:06:00 GMT
** FizzBuzz in SNOBOL4
**
** Daniel Lyons <fusion@storytotell.org>
**
** Yes, I am insane, before you ask.
**
**
I = 1
LOOP FIZZBUZZ = ""
EQ(REMDR(I, 3), 0) :F(TRY_5)
FIZZBUZZ = FIZZBUZZ "FIZZ"
TRY_5 EQ(REMDR(I, 5), 0) :F(DO_NUM)
FIZZBUZZ = FIZZBUZZ "BUZZ"
DO_NUM IDENT(FIZZBUZZ, "") :F(SHOW)
FIZZBUZZ = I
SHOW OUTPUT = FIZZBUZZ
I = I + 1
LE(I, 100) :S(LOOP)
END
Tags languages, programming, snobol | 1 comment
Posted by Daniel Lyons
Sat, 15 Dec 2007 11:09:00 GMT
Programming is different because you give people a problem and a solution at the same time.
Somehow, when someone comes to you with a programming task, you distill from that what exactly their problem is. Then you decide what the solution to the problem will be. Finally, you deliver to the customer two things: your conception of the problem they brought you and your concrete solution to that problem. (There’s usually a third product of this, which is either money or laughter depending on how badly you screw it up.)
There’s really two problems here. The customer is actually bringing you a universe with some flaws. They want a universe without these flaws. There are actually many universes without these flaws, so your first job as a programmer is to understand the customer’s reality.
Because programming begins and ends in concrete terms it’s easy to fall under the impression that it is concrete the whole time. Nothing could be further from the truth.
The set of perfect universes for this customer is quite large. Each one consists of a problem definition and a solution implementation. For example, someone may come to you with the problem that they have no money. They want to live in a universe in which they have money. One perfect universe is the one in which they panhandle for an hour and earn $3. Presto, they are no longer broke.
This definition of problem considers being broke to be a one-time state of being. There are other solutions to this problem with different attributes. For example, you could also go pick up aluminum cans for two weeks and take them to the recycling center to earn $50. $50 is more than $3, but two weeks is a lot longer than 1 hour. On the other hand, panhandling is degrading and picking up cans is just boring.
The solution isn’t the only thing that can vary. There are other ways of defining the problem. With a little common sense it’s easy to see that being broke is probably not a one-time event. Better solutions are probably found by designing the problem to account for this insight, but coming up with this insight isn’t something anyone has an algorithm for. The result of this insight, however, is a formal problem and a formal solution.
Additionally, the solution you hand over to the customer comes with another problem built-in. The customer imagines a world in which they have money, not a world in which they pick up aluminum cans or panhandle. However, the customer can’t come to you and just have wealth appear. They have to do some work to create new wealth.
As we imagine perfected universes, we are in the act of formalizing a problem and conceiving of a formal solution to that problem. This means dividing the problem into a metaphor, a formalized problem and selecting and then implementing a formalized solution.
The Incompleteness Theorem comes into it at the stage where you divide the solution into the part you solve for the user and the part the user solves for themselves. Another way of looking at this is, the amount of work the user has to do to utilize the perfection. You can balance this equation such that the user has to do no work; this is the complete solution with no expressiveness. I think this balance is probably a continuum on the incomplete side and a point on the complete side.
Elegance is that the problem and the solution are both as small as possible without becoming complete in the incompleteness theorem sense. Only the minimum of what is needed to define the problem and solve it to perfect this universe are included. Nothing extraneous.
The uncharted territory is problem definition itself. This is what computing science has taken for granted, because sorting and searching are intrinsic and easily defined problems that require interesting solutions that vary only by how. This is also why “real world” programming is such a hassle—most of your time is spent defining the problem/solution space rather than working on the solution.
Let’s work an example. The customer wants data management, so this is the problem/solution space. They bring you disks as the imperfect world. You define two concepts, “files” and “directories” and conceive of how they will work and build FAT. The two concepts consist of the problem you formalize and FAT is the formal solution. You’ve also given the customer the work of using files and directories in an appropriate fashion. That’s their problem; you’re done.
Same example again. The customer wants data management. I look to set theory, and then to relational theory. I conceive of “relations,” “tuples,” “tables,” “columns,” and “queries” from relational theory. I define a formal language, SQL. I implement a database, PostgreSQL. I’ve now increased the complexity on virtually every front, but I’ve also increased the expressiveness on the client’s side. He also has to do more up-front work, in that he must learn how to interact with the system via SQL. If we were being fair, he would weigh the time to learn SQL against the time to learn filesystems and the time he would spend implementing against a filesystem that he wouldn’t spend by having access to SQL and understanding it, but this seldom is the case.
Inappropriate abstractions are simply problem conceptions which increase the amount of work the customer has to do in the supposedly perfected universe with the formally defined problem and selected solution. It is more effort to deploy a program using CORBA in the backend than to deploy a program using a custom-written protocol, so the abstraction isn’t really an abstraction, it buys you nothing.
Powerful abstractions reduce the expressiveness by some factor but make the system more coherent. Drupal may consider arbitrary chunks of content uniquely addressable from anywhere, but this makes the management problem so bad the front-end doesn’t even bother. Voltaire is a better abstraction because removing sibling-to-sibling transclusions aren’t common enough to warrant a design that accommodates them. Voltaire simplifies the problem and the solution, and both are elegant and quite powerful.
Carter’s juxtapositional thinking/focussed attention dichotomy is just pointing out that programming begins with the act of creating a good problem formalism and that is an artistic act. It’s process to him. Fred Brooks is describing the design itself when he talks about conceptual coherence. If you have islands of meaningful features that don’t have a common substrate, you’re just pushing complexity from the flawed universe into the formal problem rather than the formal solution.
Dijkstra is arguing that both the problem and the solution need to be simple and elegant or neither will be comprehensible. And I tend to agree.
I need new nouns for these things.
Tags cs | 1 comment