Relational Thinking in J
According to Aaron Hsu, the starting point for APL programming is the relational model.
I have mixed feelings about this, because I don’t think J actually has a natural concept of a table. Moreover, it seems like your code gets littered with head `{.` and constant-powered `{` lookups if you do represent data in a tabular format, with heterogeneous rows. I could be wrong, but it seems to work better in general when you have homogeneous arrays. Creating weird packets of data like we do in other languages just doesn’t seem to be the ticket here.
Suppose you design a table like this:
developer name | email |
---|---|
alice | a@example.com |
bob | b@example.com |
calvin | c@example.com |
delilah | d@example.com |
ellen | e@example.com |
You will probably wind up using `0 {` or `1 {` to take it apart to do different things with the different columns. So I would probably build this table in J like so:
developers =. 'alice'; 'bob'; 'calvin'; 'delilah'; 'ellen'
devemails =. 'a@example.com'; 'b@example.com'; 'c@example.com'; 'd@example.com'; 'e@example.com'
This is maybe a column-oriented view of the world. You can recover the table pretty easily though:
developers ,. devemails
┌───────┬─────────────┐
│alice  │a@example.com│
├───────┼─────────────┤
│bob    │b@example.com│
├───────┼─────────────┤
│calvin │c@example.com│
├───────┼─────────────┤
│delilah│d@example.com│
├───────┼─────────────┤
│ellen  │e@example.com│
└───────┴─────────────┘
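For contrast, here is a sketch of what working from the stitched table looks like: to get a column back you apply a constant-index `{` to each row, which is exactly the kind of lookup the column-oriented layout avoids. The names `table`, `names`, and `emails` are mine, chosen for illustration.

```j
developers =. 'alice'; 'bob'; 'calvin'; 'delilah'; 'ellen'
devemails  =. 'a@example.com'; 'b@example.com'; 'c@example.com'; 'd@example.com'; 'e@example.com'

NB. 5x2 boxed table, one row per developer
table =. developers ,. devemails

NB. constant-index lookups applied to each row (rank 1) pull the columns back out
names  =. 0 {"1 table
emails =. 1 {"1 table
```

Both results match the original column lists, so nothing is lost by keeping the columns separate and stitching only for display.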
Projection is sort of obvious now: you have to choose the columns you want, because you don’t have the table, as it were. Selection isn’t so bad; you compute a filter from one column and apply that filter to the other columns. Let’s find the developers with an “a” in their name:
developers #~ 'a' e."1> developers
┌─────┬──────┬───────┐
│alice│calvin│delilah│
└─────┴──────┴───────┘
The same selection works on the other column, and you can still stitch together columns to make a table:
devemails #~ 'a' e."1> developers
┌─────────────┬─────────────┬─────────────┐
│a@example.com│c@example.com│d@example.com│
└─────────────┴─────────────┴─────────────┘
(#&developers ,. #&devemails) 'a' e."1> developers
┌───────┬─────────────┐
│alice  │a@example.com│
├───────┼─────────────┤
│calvin │c@example.com│
├───────┼─────────────┤
│delilah│d@example.com│
└───────┴─────────────┘
Joins work by doing lookups by index. Let’s introduce another table:
project name | lead developer |
---|---|
alphago | calvin |
bitbucket | ellen |
cafeteria | delilah |
diffie | alice |
entryway | bob |
finality | alice |
grace | delilah |
homelab | calvin |
Following the earlier example we get this:
projects =. 'alphago'; 'bitbucket'; 'cafeteria'; 'diffie'; 'entryway'; 'finality'; 'grace'; 'homelab'
projdevs =. 'calvin'; 'ellen'; 'delilah'; 'alice'; 'bob'; 'alice'; 'delilah'; 'calvin'
projects ,. projdevs
┌─────────┬───────┐
│alphago  │calvin │
├─────────┼───────┤
│bitbucket│ellen  │
├─────────┼───────┤
│cafeteria│delilah│
├─────────┼───────┤
│diffie   │alice  │
├─────────┼───────┤
│entryway │bob    │
├─────────┼───────┤
│finality │alice  │
├─────────┼───────┤
│grace    │delilah│
├─────────┼───────┤
│homelab  │calvin │
└─────────┴───────┘
We can find the email of the developer for each project like so:
(developers i. projdevs) { devemails
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┐
│c@example.com│e@example.com│d@example.com│a@example.com│b@example.com│a@example.com│d@example.com│c@example.com│
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┘
This might be easier to read as a table, so let’s do that:
projects ,. projdevs ,. devemails {~ developers i. projdevs
┌─────────┬───────┬─────────────┐
│alphago  │calvin │c@example.com│
├─────────┼───────┼─────────────┤
│bitbucket│ellen  │e@example.com│
├─────────┼───────┼─────────────┤
│cafeteria│delilah│d@example.com│
├─────────┼───────┼─────────────┤
│diffie   │alice  │a@example.com│
├─────────┼───────┼─────────────┤
│entryway │bob    │b@example.com│
├─────────┼───────┼─────────────┤
│finality │alice  │a@example.com│
├─────────┼───────┼─────────────┤
│grace    │delilah│d@example.com│
├─────────┼───────┼─────────────┤
│homelab  │calvin │c@example.com│
└─────────┴───────┴─────────────┘
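One thing to watch with this style of join: `i.` returns the length of its left argument for anything it cannot find, so a lead who is missing from `developers` would produce an out-of-range index and `{` would fail with an index error. Here is a minimal sketch of one way around that, assuming you want a placeholder rather than an error. The `leads` list, the name `zed`, and the `'(none)'` placeholder are made up for illustration.

```j
developers =. 'alice'; 'bob'; 'calvin'; 'delilah'; 'ellen'
devemails  =. 'a@example.com'; 'b@example.com'; 'c@example.com'; 'd@example.com'; 'e@example.com'

NB. hypothetical list of leads; 'zed' is not in developers
leads =. 'calvin'; 'zed'; 'alice'

NB. i. gives #developers (here 5) for the name it cannot find
idx =. developers i. leads

NB. append one placeholder box so that the not-found index is valid
joined =. (devemails , <'(none)') {~ idx
```

This behaves roughly like a left outer join: every lead gets a row, and the unmatched one gets the placeholder instead of an email.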
So there’s some basic relational-type stuff in J. Is this the right approach? I don’t know.
Edit: Notes from the chatroom from elcaro:
You want to be a little bit careful when dealing with boxes, because unboxing a boxed array will create fills. When you do `'a' e."1> developers`, your right arg is a 5×7 character matrix:
quote"1 > developers
'alice  '
'bob    '
'calvin '
'delilah'
'ellen  '
If you looked for developers with a space in them, you’d match all except ‘delilah’. You can unbox each, and so compare each unboxed string by itself:
('a' e. ])@> developers
1 0 1 1 0
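To make the pitfall concrete, here is a small sketch comparing the two approaches on a search for a space. It uses only the `developers` list from above; the names `padded`, `viaMatrix`, and `viaBoxes` are mine.

```j
developers =. 'alice'; 'bob'; 'calvin'; 'delilah'; 'ellen'

NB. opening the boxes pads every name with spaces up to the longest one
padded =. > developers

NB. row-wise search on the padded matrix: the hits are fill spaces, not data
viaMatrix =. ' ' e."1 padded

NB. per-box search: each name is tested on its own, with no fills involved
viaBoxes =. (' ' e. ])@> developers
```

Here `viaMatrix` flags every name shorter than 'delilah', while `viaBoxes` flags none, because the per-box form only ever sees the real characters.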
Which is fine, but might be hard to put into a larger tacit expression, as you need to explicitly refer to the left arg in that inner expression. Something else you can do is use Spread (`S:`) or Level-At (`L:`) where appropriate. I don’t use these much, but `S:0` often pairs nicely with `e.` or `E.`, where you want to match some unboxed string to a list of boxed strings:
'a' e.S:0 developers
1 0 1 1 0
This new operator is often easier to put into a larger tacit expression (if desired). Spread and Level-At are a little like the Depth modifier in BQN `⚇` (and proposed in APL `⍥`) in that they modify operators to perform on nested data. The NuVoc pages are a little lacking, but play with them and you’ll get the hang of what they do:
Spread - NuVoc; Level-At - NuVoc; Depth (operator) - APL Wiki; Depth - BQN docs
Oh, one more thing… you can do
(#&developers ,. #&devemails) 1 0 1 1 0
as
(developers ,. devemails) #~ 1 0 1 1 0
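As a sketch of how these suggestions might combine, here is a small tacit mask verb built with `S:0` and applied to the stitched table. The names `hasA`, `mask`, and `rows` are hypothetical and not from the chatroom notes.

```j
developers =. 'alice'; 'bob'; 'calvin'; 'delilah'; 'ellen'
devemails  =. 'a@example.com'; 'b@example.com'; 'c@example.com'; 'd@example.com'; 'e@example.com'

NB. tacit verb: for each boxed string in y, does it contain 'a'?
hasA =. 'a'&(e.S:0)

NB. the same mask as in the notes above
mask =. hasA developers

NB. keep the table rows where the mask is 1
rows =. (developers ,. devemails) #~ mask
```

Because the mask is computed once from the boxed names, there is no unboxing and so no fills to worry about, and the same `mask` could be applied to any other column of the same length.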