On Partisan Politics and Gerrymandering
As a Baha’i, I had to rescind my party affiliation because it is an aspect of partisan politics, which is forbidden. That doesn’t mean I can’t or don’t vote; on the contrary, elections and democracy are as important to the Faith as avoiding partisan politics is. These seem like irreconcilable ideas, but together they amount to something quite simple: vote for whoever you think is going to do the best job at governance. Party affiliation is fundamentally a way of creating tribes, and if we want unity, we must stop creating or investing in tribal affiliations.
For some reason gerrymandering is on my mind. We can probably all agree that gerrymandering is fundamentally wrong. It occurred to me recently that the only thing that makes gerrymandering possible is party registration. Your vote is secret. You can draw a district and see what votes happen there, and perhaps even polling stations can be separated out (I don’t really know), but the factual basis that enables gerrymandering is party registration. So maybe don’t register, and you’ll see less (or at least less effective) gerrymandering.
The counterargument is that you want to vote in the primary. The last election showed that such votes are an absolute farce, since neither of the major parties actually asked the voters in a meaningful way who they wanted on the ballot.
In some states, anyone can vote in a primary. This is a good idea because it dilutes the problem of playing to your extremes during the primary and the middle during the general election. If you have to play to the middle during the primaries too, we’ll all have more moderate candidates to choose between in the general election.
In a situation where enough people refuse party registration, parties will voluntarily choose to allow anyone to vote in the primaries, simply because otherwise the party members will choose people completely unpalatable to the general population. Right now, about two thirds of the country is registered with one of the parties. If the situation were reversed, where one third were registered with one or the other, the independent voice would be more important than the party die-hards.
This seems to me a good strategy for de-escalating the partisan nature of American politics.
Causality
I recently read most of this famous article, Demystifying Dependence. This is a pretty transformative paper, in my opinion, as a Nix user who is interested in what “dependence” means, and as a software engineer. I also greatly appreciated the didactic method of providing nine stories and then investigating what we can learn about dependencies from them in the new framework.
The surprise in the paper is that dependence can be analyzed in terms of causality. The authors show this using a framework called “Halpern causation,” a definition of causation from a subdiscipline called “actual causality.” The treatment is really interesting. One of the big “a-ha” moments is realizing that you cannot discuss causality without talking about counterfactuals. To paraphrase the math:
- Create a model of the world in terms of input variables (“exogenous”) and derived variables (“endogenous”)
- Figure out the variable configuration for the event that happened
- Find a variable whose setting, if it were inverted, would lead to the opposite outcome
That variable is the “but-for” cause. It turns out not to be a perfect model of causality. They give an example: suppose we want to water some grass, but we don’t want to water on days when it rains. After a rain, we ask: why is the grass wet? According to the but-for definition, in a counterfactual world where it did not rain, we still have wet grass, because the sprinkler system would have come on; so rain fails the but-for test even though it plainly caused the wet grass. The authors then introduce a more sophisticated definition of causality called Halpern-Pearl causality which fixes this problem, using the idea of a contingency, which forces certain settings of variables. These settings are called witnesses to the contingency. The jargon alone is quite fun.
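The test is mechanical enough to sketch in code. Here is a minimal J sketch of the but-for test on the rain/sprinkler story; the encoding (booleans, names) is mine, not the paper’s:
NB. exogenous: rain (1 = it rained); endogenous: sprinkler, wet
sprinkler =. -.                         NB. the sprinkler runs exactly when it does not rain
wet =. dyad : 'x +. y'                  NB. the grass is wet if it rained or the sprinkler ran
outcome =. monad : 'y wet sprinkler y'  NB. wetness as a function of rain alone
outcome 1                               NB. actual world: it rained, the grass is wet
1
outcome 0                               NB. counterfactual: no rain; sprinkler runs, grass still wet
1
Inverting the rain variable does not invert the outcome, so rain is not a but-for cause of the wet grass, which is exactly the deficiency the Halpern-Pearl definition repairs.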
It does make me want to think about the Chornobyl accident in terms of actual causality. For instance, the roof being made of flammable materials made the disaster worse, but it was not a but-for cause of the explosion. But we could use these definitions of causality to answer questions like: was graphite-tipping the control rods an actual cause? It wasn’t an actual cause if (and only if), holding everything else about the accident the same, the accident plays out in exactly the same way without it.
A Few More Shavian Notes
A few other things occurred to me.
A major benefit: Tragedeigh naming is basically impossible. Is it Cate or Kate? It’s 𐑒𐑱𐑑. I also kind of like that you can see visually the irritating rhyming of my daughter’s friends’ names: ·𐑧𐑤𐑰-·𐑨𐑤𐑰-·𐑨𐑛𐑰, ·𐑧𐑤𐑰𐑨𐑯𐑩-·𐑭𐑮𐑰𐑨𐑯𐑩-·𐑭𐑛𐑰𐑨𐑯𐑩.
If you want to practice, there is a very nice Firefox addon that converts a page to Shavian, either in total or by replacing the N most common words (25, 50, 100, 200, … 500). I don’t recommend trying to read a Wikipedia page about phonetics with “auto translate” enabled, but otherwise it seems super useful.
In fact, a significant problem of learning another orthography like this is going to be that I have spent my entire life reading English text without sounding it out. My brain has a lot of experience. A more logical orthography might be a huge improvement, but beating decades of familiarity with another one is going to take time. This approach seems like a brilliant one, because you build up experience in a similar manner; as you get used to seeing common words mixed in with English text, hopefully you just get familiar quickly. Or maybe it backfires because your brain can fill in lazily from context. I guess time will tell.
The IPA/Shavian correspondences ʌ-𐑳, ʊ-𐑫 and u-𐑵 seem like a bit of a missed opportunity. I think IPA already had ʌ (Shavian 𐑳) before Read came up with the scheme, but I don’t know whether he was aware of it or cared. Not a huge deal either way.
There is actually a distinction which used to be phonemic that is not preserved in Shavian: the w/wh distinction (in IPA, w/ʍ). Sometimes people exaggerate it (“a hwale is in trouble!”), and the distinction evidently still exists in some places, though probably not where Read was working on Shavian. So you can’t distinguish witch and which in Shavian writing.
The Shavian Alphabet
I’ve been spending some time the last few days learning the Shavian alphabet. There’s a great learning application at shavian.app.
What’s the point of this? I mean, fun mainly. One point of this alphabet is to have an actually phonetic (or, to be precise, phonemic) representation. English has a lot of phonemes, but not a lot of agreement on what they are or how they’re realized. Wikipedia doesn’t give an exact number, but you can add 24 consonants to as few as 14-16 vowels (General American) or as many as 20-25 vowels (Received Pronunciation), yielding between 38 and 49 distinct phonemes; with only 26 letters, we are missing symbols for between 12 and 23 phonemes, depending on how you count. Are there benefits to writing what we actually say?
Before seeing this particular attempt, I sort of assumed it couldn’t possibly work because the sounds of Indian English differ substantially from American or British English. I am singling out these three because you can make a strong argument that any of these should form the basis of a new English spelling standard: Indian, because it has the most speakers; British, because it is the original recipe; American, because of cultural imperialism. None of these would be very satisfying.
Shavian addresses the problem by being overtly phonemic, tying each letter to the sound of a keyword rather than to any one accent. This idea is not far from the idea of lexical sets. The original definition from 1982 yields 27 sets, and this table shows the correspondence between them and Shavian:
Keyword | RP Phone | GA Phone | Shavian character | Character name | Examples |
---|---|---|---|---|---|
KIT | ɪ | ɪ | 𐑦 | if | ship, sick, bridge, milk, myth, busy |
DRESS | e | ɛ | 𐑧 | egg | step, neck, edge, shelf, friend, ready |
TRAP | æ | æ | 𐑨 | ash | tap, back, badge, scalp, hand, cancel |
LOT | ɒ | ɑ | 𐑪 | on | stop, sock, dodge, romp, possible, quality |
STRUT | ʌ | ʌ | 𐑳 | up | cup, suck, budge, pulse, trunk, blood |
FOOT | ʊ | ʊ | 𐑫 | wool | put, bush, full, good, look, wolf |
BATH | ɑː | æ | 𐑨 | ash | staff, brass, ask, dance, sample, calf |
CLOTH | ɒ | ɔ | 𐑪 | on | cough, broth, cross, long, Boston |
NURSE | ɜː | ɜr | 𐑻 | err | hurt, lurk, urge, burst, jerk, term |
FLEECE | iː | i | 𐑰 | eat | creep, speak, leave, feel, key, people |
FACE | eɪ | eɪ | 𐑱 | age | tape, cake, raid, veil, steak, day |
PALM | ɑː | ɑ | 𐑭 | ah | psalm, father, bra, spa, lager |
THOUGHT | ɔː | ɔ | 𐑷 | awe | taught, sauce, hawk, jaw, broad |
GOAT | əʊ | oʊ | 𐑴 | oak | soap, joke, home, know, so, roll |
GOOSE | uː | u | 𐑵, 𐑿 | ooze, yew | loop, shoot, tomb, mute, huge, view |
PRICE | aɪ | aɪ | 𐑲 | ice | ripe, write, arrive, high, try, buy |
CHOICE | ɔɪ | ɔɪ | 𐑶 | oil | adroit, noise, join, toy, royal |
MOUTH | aʊ | aʊ | 𐑬 | out | out, house, loud, count, crowd, cow |
NEAR | ɪə | ɪr | 𐑽 | ear | beer, sincere, fear, beard, serum |
SQUARE | ɛə | ɛr | 𐑺 | air | care, fair, pear, where, scarce, vary |
START | ɑː | ɑr | 𐑸 | are | far, sharp, bark, carve, farm, heart |
NORTH | ɔː | ɔr | 𐑹 | or | for, war, short, scorch, born, warm |
FORCE | ɔː | or | 𐑹 | or | four, wore, sport, porch, borne, story |
CURE | ʊə | ʊr | 𐑫𐑼 | wool, array | poor, tourist, pure, plural, jury |
happY | ɪ | ɪ | 𐑦 | if | copy, scampi, taxi, sortie, committee, hockey, Chelsea |
lettER | ə | ər | 𐑼 | array | paper, metre, calendar, stupor, succo(u)r, martyr |
commA | ə | ə | 𐑩 | ado | about, gallop, oblige, quota, vodka |
You can see from the chart that only a few distinctions are lost in Shavian, and having separate characters for rhotacized vowels seems clever to me, since they occur in relatively predictable ways but can be pronounced with or without the R sound. I think it’s worth appreciating the cleverness of this approach, which prefigured lexical sets by a couple of decades and yields an alphabet uniquely suited to English despite the plethora of local realizations of its large and fairly unusual phoneme inventory.
Using Shavian, I am made pretty aware of my accent and how it differs from Received Pronunciation. For instance, when I say the word “been” it rhymes with “bin” and not “bean.” The words “caught” and “cot” I pronounce the same, but if I imagine a British accent I can sort of imagine how the sounds differ. There are some standardized “spellings” for words that require one to think in RP, or at least in an accent with more vowels than my American accent affords, although whether or not to use the standard spellings seems not to be a contentious issue within the tiny Shavian user community.
I think the majority of L1 English speakers are probably unaware of some of the phonemes. The obvious ones that come to mind are θ/ð. It’s not easy for me to guess which one I am using in a given word, and there are not many minimal pairs for the two (ether/either seems to be one of the few). I think most L1 English speakers think of this as “the TH sound” and, without a finger to the throat, wouldn’t do much better than me at guessing which one a given word uses. Another example is the word “think,” which phonetically contains what we English speakers would call “the -ing sound”; in Shavian you have to encode that, but if you were thinking in terms of the normal Latin spelling you wouldn’t realize it’s there.
I have frequently said that English suffers from an overabundance of schwa sounds. I think now that isn’t technically correct. Shavian helps show that what English does have is an abundance of vowels, many of which are consequently not that different from one another and/or sit near the middle of the vowel space. It’s fun to click around the vowel chart on Wikipedia and try to find one that doesn’t have an entry for some form of English; I didn’t find one, but I didn’t do an exhaustive search.
Aesthetically speaking, it’s pretty good looking, especially at first blush. It’s distinctive, and it’s cool that there are 48 characters you can distinguish without picking up the pen. It doesn’t really look like anything else. Another plus is that the name would make one think it was designed by an Armenian, which it certainly wasn’t. It’s quite wise, in my opinion, to use certain letters as single-glyph words (𐑞 for “the,” 𐑑 for “to,” 𐑸 for “are,” 𐑓 for “for,” 𐑯 for “and,” and 𐑝 for “of”). The sound-shape correspondences are interesting but not necessarily super helpful for remembering sounds. Most alphabets (Inuktitut is an exception) don’t use every possible permutation of each shape. In my five-day-old opinion, some of the shapes are not easy to write. I have noticed, as others have, the funny fact that the glyphs for “h” and “ng” are in the voiced and unvoiced categories respectively, which seems backwards, although the decision was apparently intentional. I don’t think it’s worth litigating these things, curious as they are. By analogy, you’re better off learning Esperanto than Ido even though Ido is probably “better” in various ways: it’s better to just pick something and build a community around it than to constantly nit-pick, create minor refinements, and fracture the community along the way.
An interesting benefit is that text typically takes about a third less space in Shavian. Here’s an example:
Shavian | English |
---|---|
𐑞 𐑚𐑧𐑕𐑑 𐑚𐑦𐑤𐑳𐑝𐑩𐑛 𐑝 𐑷𐑤 𐑔𐑦𐑙𐑟 𐑦𐑯 𐑥𐑲 𐑕𐑲𐑑 𐑦𐑟 𐑡𐑳𐑕𐑑𐑦𐑕; | The best beloved of all things in My sight is Justice; |
𐑑𐑻𐑯 𐑯𐑪𐑑 𐑩𐑢𐑱 𐑞𐑺𐑓𐑮𐑪𐑥 𐑦𐑓 𐑞𐑬 𐑛𐑦𐑟𐑲𐑼𐑧𐑕𐑑 𐑥𐑰, | turn not away therefrom if thou desirest Me, |
𐑯 𐑯𐑦𐑜𐑤𐑧𐑒𐑑 𐑦𐑑 𐑯𐑪𐑑 𐑞𐑨𐑑 𐑲 𐑥𐑱 𐑒𐑩𐑯𐑓𐑲𐑛 𐑦𐑯 𐑞𐑰. | and neglect it not that I may confide in thee. |
𐑚𐑲 𐑦𐑑𐑕 𐑱𐑛 𐑞𐑬 𐑖𐑨𐑤𐑑 𐑕𐑰 𐑢𐑦𐑞 𐑞𐑲𐑯 𐑴𐑯 𐑲𐑟 | By its aid thou shalt see with thine own eyes |
𐑯 𐑯𐑪𐑑 𐑔𐑮𐑵 𐑞 𐑲𐑟 𐑝 𐑳𐑞𐑼𐑟, | and not through the eyes of others, |
𐑯 𐑖𐑨𐑤𐑑 𐑯𐑴 𐑝 𐑞𐑲𐑯 𐑴𐑯 𐑯𐑪𐑤𐑦𐑡 | and shalt know of thine own knowledge |
𐑯 𐑯𐑪𐑑 𐑔𐑮𐑵 𐑞 𐑯𐑪𐑤𐑦𐑡 𐑝 𐑞𐑲 𐑯𐑱𐑚𐑼. | and not through the knowledge of thy neighbor. |
𐑐𐑪𐑯𐑛𐑼 𐑞𐑦𐑕 𐑦𐑯 𐑞𐑲 𐑣𐑸𐑑; 𐑣𐑬 𐑦𐑑 𐑚𐑦𐑣𐑵𐑝𐑧𐑔 𐑞𐑰 𐑑 𐑚𐑰. | Ponder this in thy heart; how it behooveth thee to be. |
𐑝𐑧𐑮𐑦𐑤𐑦 𐑡𐑳𐑕𐑑𐑦𐑕 𐑦𐑟 𐑥𐑲 𐑜𐑦𐑓𐑑 𐑑 𐑞𐑰 | Verily justice is My gift to thee |
𐑯 𐑞 𐑕𐑲𐑯 𐑝 𐑥𐑲 𐑤𐑳𐑝𐑦𐑙-𐑒𐑲𐑯𐑛𐑯𐑩𐑕. | and the sign of My loving-kindness. |
𐑕𐑧𐑑 𐑦𐑑 𐑞𐑧𐑯 𐑚𐑦𐑓𐑹 𐑞𐑲𐑯 𐑲𐑟. | Set it then before thine eyes. |
— ·𐑚𐑭𐑣𐑭𐑵𐑤𐑭 | — Baha’u’llah |
Where does this reduction come from? I’m not a statistics person, but I’ll guess. Small words becoming one or two characters is certainly a big help. Another significant help is that many small words have a lot of silent letters, “through” being the best example (it becomes 𐑔𐑮𐑵, beating the slang “thru” by a character). On the other hand, there is no “x” character, so words like “exist” can actually become longer (“𐑦𐑜𐑟𐑦𐑕𐑑”). Probably in many cases, it’s simply digraphs like “sh” and “ch” or diphthongs like “ou” and “er” creating a small benefit across many words.
Thinking about Shavian reminded me of when my daughter was very young and very desperate to write. She would spell things “phonetically” wich yuzuly rezultid n smthin hahrd tu red. Would this have been easier? It’s twice as big, so maybe the alphabet would take twice as long to learn. But it is uniform—apart from the distinctions mentioned earlier—so after learning the alphabet, reading would be entirely a matter of practice, getting faster, and learning vocabulary. That’s got to be worth a year or two of education.
A downside which only a great speller like myself would point out is that there is tremendous historical information in the way we spell things. The first part of the words “function,” “funny” and “phonetic” sound the same, but you can tell by looking that one comes from Greek and the others do not, and that the latter shares some meaning with words like “phonograph” and “telephone.” This information is purely written. But it also tortures students and led us here, where fixing spelling is a significant benefit of using a computer.
Shavian is unlikely to unseat English’s Latin-based orthography, but it is fun, fairly easy to learn, unique looking, and has various advantages. Wide usage is not one of those advantages but perhaps it will increase! It’s probably a better spelling reform than just reintroducing ð and þ.
I18n Puzzles, Day 2
This is the kind of problem where you could load the whole thing into Postgres and get the answer in about five seconds. In fact, let’s try it:
postgres=# create table i18n_day2 (input timestamp with time zone);
CREATE TABLE
postgres=# \copy i18n_day2 from /i18np/input.txt
COPY 1758
postgres=# select input at time zone 'UTC', count(*) from i18n_day2 group by input at time zone 'UTC' having count(*) >= 4;
timezone | count
---------------------+-------
20XX-YY-ZZ HH:MM:SS | 4
(1 row)
I decided to censor the output a little bit and I didn’t handle the formatting properly there because that’s not really the point.
Suffice it to say, this is kind of a non-problem in languages with time zone support. Unfortunately, neither J nor APL has time zone support in the standard library, so we’ll have to figure it out on our own.
The first problem is that we have to parse these dates: 2019-06-05T08:15:00-04:00. These happen to be fixed-width. There are snappier ways of parsing, but I decided to narrow in on this element of it.
My plan here is to handle the date+time part first, since there is a library built-in for this (todayno and todate, which seem like they should be inverses of each other but are not, for some reason). We can parse the time-zone offset into a similar structure and expand it using expand (#inv). Then we add them, or actually subtract them (I realized this a little late).
I felt like I wanted to see a fork of the form f + g, since the idea is to parse most of the timestamp and then the offset. The amount of work to handle a fixed-width format was not insubstantial, but I came up with these functions:
dp =. (_ 2 1 $ 0 4 5 2 8 2 11 2 14 2 17 2) 0&".;.0 ]
tp =. 0 0 0 1 1 0 #inv (_ 2 1 $ 19 3 23 2) 0&".;.0 ]
These two functions parse the date part and the time-zone part (probably bad names). The key idea is that ;.0 takes a subarray of a given length at a given offset. Starting from 0 with length 4 gets us the year (that’s the 0 4); then we get the month from offset 5, length 2 (the 5 2 that comes next); and so on. All six of the chunks of data we need are specified by the 12 items in the list, which $ reshapes into an array of six 2×1 tables. That feeds the subarray verb ;.0. We’re adding in 0&". to parse numbers; regular monadic ". runs J code, but we just want the values.
The ever-friendly and wise elcaro on the J channel of the APL Farm Discord suggested these parsing verbs instead:
Nats =: '1234567890'&(i. ".@:{ ' ',~ [)
Nums =: '1234567890._ ' ".@:{~ '1234567890.-' i. ]
Which was really tempting since you can then do all the parsing with this kind of expression:
t =. '2024-08-01T08:15:03-03:15'
19 (Nats@{. , Nums@}.) t
2024 8 1 8 15 3 _3 15
Which is really hot, but I insisted on doing it the hard way for some reason.
Now my plan is to “normalize” the timestamp, by converting this from a 6 item array to an internal date and back, and then throwing it into a printing function. First the printing function:
require 'format/printf'
dt =. '%04d-%02d-%02dT%02d:%02d:%02d+00:00' vsprintf
Nothing interesting here. Now my goal is to avoid boxing and pass lines through a single function that does the work. That function performs the “normalization” I mentioned above:
norm =. [: dt 2 todate 2 todayno dp - tp
There’s the fork I was thinking about. I read another article (about ray tracing in J) which explained that the cap [: converts a dyadic verb into a monadic one inside forks:
The use of [: is a little weird to understand, but it is basically a no-operation left argument to ensure that the verb is evaluated as a function of one argument instead of two.
This seems like a decent explanation. So the idea here is to handle the time zone data with the date part, convert that to a day number, then convert that back to a date, then format it. The conversion handles the possibilities of negative times and whatnot.
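To make the cap concrete, a throwaway example of my own: counting distinct items with a capped fork, where [: forces # to apply monadically to the result of ~.:
([: # ~.) 3 1 4 1 5 9 2 6 5 3    NB. tally of the nub: # (~. y)
7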
Another approach would have been to convert the first number to a “day number” and then convert the hour and minute values to fractions of a day. When I tried that I saw odd behavior, so I settled on the approach above.
OK, so now we have the verb that will parse, but we still need to actually do the puzzle. The first piece is norm;._2 fread <filename>. Using norm with the cut ;._2 is how we avoid boxing: we get an array of normalized timestamps instead of boxed strings. But the puzzle question is to find the times that appear most frequently. This is not all that different from the word frequencies problem. So I wound up using key /. with tally # on the normalized timestamps, sorting by the counts, and applying that sort order to the nub ~. of the timestamps. Taking the first item of that list yields the timestamp we are interested in:
{. (~. nm) \: #/.~ nm =. norm;._2 fread 'test-input.txt'
And this is our solution. The entire thing is:
require 'format/printf'
dt =. '%04d-%02d-%02dT%02d:%02d:%02d+00:00' vsprintf
dp =. (_ 2 1 $ 0 4 5 2 8 2 11 2 14 2 17 2) 0&".;.0 ]
tp =. 0 0 0 1 1 0 #inv (_ 2 1 $ 19 3 23 2) 0&".;.0 ]
norm =. [: dt 2 todate 2 todayno dp - tp
{. (~. nm) \: #/.~ nm =. norm;._2 fread 'input.txt'
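To see the frequency idiom in isolation, here’s a toy run on boxed scratch data of my own (the real solution keeps unboxed rows, but the shape of the computation is the same):
nm =. 'pear';'apple';'pear';'plum';'pear';'apple'
#/.~ nm                  NB. key with tally: counts per distinct item
3 2 1
~. nm                    NB. the nub: distinct items, in order of first appearance
┌────┬─────┬────┐
│pear│apple│plum│
└────┴─────┴────┘
{. (~. nm) \: #/.~ nm    NB. sort the nub by descending count, take the head
┌────┐
│pear│
└────┘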
Relational Thinking in J
According to Aaron Hsu, the starting point for APL programming is the relational model.
I’m mixed on this, because I don’t think J actually has a natural concept of a table. Moreover, it seems like your code gets littered with head ({.) and constant-index ({) lookups if you do represent data in a tabular format with heterogeneous rows. I could be wrong, but it seems to work better in general when you have homogeneous arrays. Creating weird packets of data like we do in other languages just doesn’t seem to be the ticket here.
Suppose you design a table like this:
developer name | email |
---|---|
alice | a@example.com |
bob | b@example.com |
calvin | c@example.com |
delilah | d@example.com |
ellen | e@example.com |
You will probably wind up using 0 { or 1 { to take it apart to do different things with the different columns. So I would probably build this table in J like so:
developers =. 'alice'; 'bob'; 'calvin'; 'delilah'; 'ellen'
devemails =. 'a@example.com'; 'b@example.com'; 'c@example.com'; 'd@example.com'; 'e@example.com'
This is maybe a column-oriented view of the world. You can recover the table pretty easily though:
developers ,. devemails
┌───────┬─────────────┐
│alice │a@example.com│
├───────┼─────────────┤
│bob │b@example.com│
├───────┼─────────────┤
│calvin │c@example.com│
├───────┼─────────────┤
│delilah│d@example.com│
├───────┼─────────────┤
│ellen │e@example.com│
└───────┴─────────────┘
Projection is sort of obvious now: you choose the columns you want, because you don’t have the table, as it were. Selection isn’t so bad: you filter on a certain column and apply the same filter to the other column. Let’s find the developers with an “a” in their name:
developers #~ 'a' e."1> developers
┌─────┬──────┬───────┐
│alice│calvin│delilah│
└─────┴──────┴───────┘
The same selection works on the other column, and you can still stitch together columns to make a table:
devemails #~ 'a' e."1> developers
┌─────────────┬─────────────┬─────────────┐
│a@example.com│c@example.com│d@example.com│
└─────────────┴─────────────┴─────────────┘
(#&developers ,. #&devemails) 'a' e."1> developers
┌───────┬─────────────┐
│alice │a@example.com│
├───────┼─────────────┤
│calvin │c@example.com│
├───────┼─────────────┤
│delilah│d@example.com│
└───────┴─────────────┘
Joins work by doing lookups by index. Let’s introduce another table:
project name | lead developer |
---|---|
alphago | calvin |
bitbucket | ellen |
cafeteria | delilah |
diffie | alice |
entryway | bob |
finality | alice |
grace | delilah |
homelab | calvin |
Following the earlier example we get this:
projects =. 'alphago'; 'bitbucket'; 'cafeteria'; 'diffie'; 'entryway'; 'finality'; 'grace'; 'homelab'
projdevs =. 'calvin'; 'ellen'; 'delilah'; 'alice'; 'bob'; 'alice'; 'delilah'; 'calvin'
projects ,. projdevs
┌─────────┬───────┐
│alphago │calvin │
├─────────┼───────┤
│bitbucket│ellen │
├─────────┼───────┤
│cafeteria│delilah│
├─────────┼───────┤
│diffie │alice │
├─────────┼───────┤
│entryway │bob │
├─────────┼───────┤
│finality │alice │
├─────────┼───────┤
│grace │delilah│
├─────────┼───────┤
│homelab │calvin │
└─────────┴───────┘
We can find the email of the developer for each project like so:
(developers i. projdevs) { devemails
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┐
│c@example.com│e@example.com│d@example.com│a@example.com│b@example.com│a@example.com│d@example.com│c@example.com│
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┘
This might be easier to read as a table, so let’s do that:
projects ,. projdevs ,. devemails {~ developers i. projdevs
┌─────────┬───────┬─────────────┐
│alphago │calvin │c@example.com│
├─────────┼───────┼─────────────┤
│bitbucket│ellen │e@example.com│
├─────────┼───────┼─────────────┤
│cafeteria│delilah│d@example.com│
├─────────┼───────┼─────────────┤
│diffie │alice │a@example.com│
├─────────┼───────┼─────────────┤
│entryway │bob │b@example.com│
├─────────┼───────┼─────────────┤
│finality │alice │a@example.com│
├─────────┼───────┼─────────────┤
│grace │delilah│d@example.com│
├─────────┼───────┼─────────────┤
│homelab │calvin │c@example.com│
└─────────┴───────┴─────────────┘
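One more relational staple falls out of the same toolkit: aggregation. Here is a GROUP BY of my own devising, counting projects per lead developer by pairing key /. with tally #:
(~. projdevs) ,. <"0 #/.~ projdevs    NB. distinct developers stitched to their project counts
┌───────┬─┐
│calvin │2│
├───────┼─┤
│ellen  │1│
├───────┼─┤
│delilah│2│
├───────┼─┤
│alice  │2│
├───────┼─┤
│bob    │1│
└───────┴─┘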
So there’s some basic relational-type stuff in J. Is this the right approach? I don’t know.
Edit: Notes from the chatroom from elcaro:
You want to be a little bit careful when dealing with boxes, because unboxing a boxed array will create fills. When you do 'a' e."1 > developers, your right arg is a 5×7 character matrix:
quote"1 > developers
'alice  '
'bob    '
'calvin '
'delilah'
'ellen  '
If you looked for developers with a space in them, you’d match all except ‘delilah’. You can unbox each, and so compare each unboxed string by itself:
('a' e. ])@> developers
1 0 1 1 0
Which is fine, but might be hard to put into a larger tacit expression, as you need to explicitly refer to the left arg in that inner expression. Something else you can do is use Spread (S:) or Level-At (L:) where appropriate. I don’t use these much, but S:0 often pairs nicely with e. or E., where you want to match some unboxed string against a list of boxed strings:
'a' e.S:0 developers
1 0 1 1 0
This new operator is often easier to put into a larger tacit expression (if desired). Spread and Level-At are a little like the Depth modifier in BQN ⚇ (and proposed in APL, ⍥) in that they modify verbs to operate on nested data. The NuVoc pages are a little lacking, but play with them and you’ll get the hang of what they do: Spread - NuVoc, Level-At - NuVoc, Depth (operator) - APL Wiki, Depth - BQN docs.
Oh, one more thing… you can do (#&developers ,. #&devemails) 1 0 1 1 0 as (developers ,. devemails) #~ 1 0 1 1 0.