A Few More Shavian Notes
A few other things occurred to me.
A major benefit: Tragedeigh naming is basically impossible. Is it Cate or Kate? It's 𐑒𐑱𐑑. I also kind of like that you can see visually the irritating rhyming of my daughter's friends' names: ·𐑧𐑤𐑰-·𐑨𐑤𐑰-ยท๐จ๐๐ฐ, ·𐑧𐑤𐑰𐑨𐑯𐑩-·𐑭𐑮𐑰𐑨𐑯𐑩-ยท๐ญ๐๐ฐ๐จ๐ฏ๐ฉ.
If you want to practice, there is a very nice addon for Firefox to convert a page to Shavian, in total or by replacing the N (25, 50, 100, 200, … 500) most common words. I don't recommend trying to read a Wikipedia page about phonetics with "auto translate" enabled, but otherwise it seems to be super useful.
In fact, a significant problem of learning another orthography like this is going to be that I have spent my entire life reading English text without sounding it out. My brain has a lot of experience. A more logical orthography might be a huge improvement, but beating decades of familiarity with another one is going to take time. This approach seems like a brilliant one, because you build up experience in a similar manner; as you get used to seeing common words mixed in with English text, hopefully you just get familiar quickly. Or maybe it backfires because your brain can fill in lazily from context. I guess time will tell.
The IPA/Shavian correspondences ʌ-𐑳, ʊ-𐑫 and u-𐑵 seem like a bit of a missed opportunity. I think IPA already had ʌ (Shavian 𐑳) before Read came up with the scheme, but I don't know whether he was aware of it or cared. Not a huge deal either way.
There is actually a distinction which used to be phonemic that is not preserved in Shavian, which is the w/wh distinction (in IPA, w/ʍ). Sometimes people exaggerate this ("a hwale is in trouble!"), but the distinction evidently still exists in some places, though probably not where Read was working on Shavian. So you can't distinguish witch and which in Shavian writing.
The Shavian Alphabet
I've been spending some time the last few days learning the Shavian alphabet. There's a great learning application at shavian.app.
What's the point of this? I mean, fun mainly. One point of this alphabet is to have an actually phonetic (or phonemic, to be precise) representation. English has a lot of phonemes, but not a lot of agreement on what they are or how they're realized; Wikipedia doesn't give an exact number but lets you add 24 consonants to as few as 14-16 vowels for the General American dialect or as many as 20-25 vowels in the Received Pronunciation, yielding as few as 38 or as many as 49 distinct phonemes, meaning the 26-letter Latin alphabet is missing symbols for between 12 and 23 phonemes, depending on how you count. Are there benefits to writing what we actually say?
Before seeing this particular attempt, I sort of assumed it couldn't possibly work because the sounds of Indian English differ substantially from American or British English. I am singling out these three because you can make a strong argument that any of these should form the basis of a new English spelling standard: Indian, because it has the most speakers; British, because it is the original recipe; American, because of cultural imperialism. None of these would be very satisfying.
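The arithmetic behind those ranges is easy to check. The following is a minimal sketch in Python; the counts are the ones quoted above, and the names (`CONSONANTS`, `phoneme_range`) are made up for illustration.

```python
# Counts quoted in the paragraph above; all names here are illustrative.
CONSONANTS = 24
LATIN_LETTERS = 26

def phoneme_range(min_vowels: int, max_vowels: int) -> tuple:
    """Total phoneme count for the fewest and the most vowels counted."""
    return CONSONANTS + min_vowels, CONSONANTS + max_vowels

fewest, _ = phoneme_range(14, 16)  # General American, low end
_, most = phoneme_range(20, 25)    # Received Pronunciation, high end

# How many phonemes have no letter of their own in a 26-letter alphabet
missing = (fewest - LATIN_LETTERS, most - LATIN_LETTERS)
print(fewest, most, missing)
```

The low end pairs the smallest General American vowel count with the 24 consonants, and the high end pairs the largest Received Pronunciation count with them.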
Shavian addresses the problem by being overtly phonemic and basing pronunciation on words. This idea is not that far from the idea of lexical sets. The original definition from 1982 yields 27 phonemes, and this table shows the agreement between these and Shavian:
Keyword | RP Phone | GA Phone | Shavian character | Character name | Examples |
---|---|---|---|---|---|
KIT | ɪ | ɪ | 𐑦 | if | ship, sick, bridge, milk, myth, busy |
DRESS | e | ɛ | 𐑧 | egg | step, neck, edge, shelf, friend, ready |
TRAP | æ | æ | 𐑨 | ash | tap, back, badge, scalp, hand, cancel |
LOT | ɒ | ɑ | 𐑪 | on | stop, sock, dodge, romp, possible, quality |
STRUT | ʌ | ʌ | 𐑳 | up | cup, suck, budge, pulse, trunk, blood |
FOOT | ʊ | ʊ | 𐑫 | wool | put, bush, full, good, look, wolf |
BATH | ɑː | æ | 𐑨 | ash | staff, brass, ask, dance, sample, calf |
CLOTH | ɒ | ɔ | 𐑪 | on | cough, broth, cross, long, Boston |
NURSE | ɜː | ɜr | 𐑻 | err | hurt, lurk, urge, burst, jerk, term |
FLEECE | iː | i | 𐑰 | eat | creep, speak, leave, feel, key, people |
FACE | eɪ | eɪ | 𐑱 | age | tape, cake, raid, veil, steak, day |
PALM | ɑː | ɑ | 𐑭 | ah | psalm, father, bra, spa, lager |
THOUGHT | ɔː | ɔ | 𐑷 | awe | taught, sauce, hawk, jaw, broad |
GOAT | əʊ | oʊ | 𐑴 | oak | soap, joke, home, know, so, roll |
GOOSE | uː | u | 𐑵, 𐑿 | ooze, yew | loop, shoot, tomb, mute, huge, view |
PRICE | aɪ | aɪ | 𐑲 | ice | ripe, write, arrive, high, try, buy |
CHOICE | ɔɪ | ɔɪ | 𐑶 | oil | adroit, noise, join, toy, royal |
MOUTH | aʊ | aʊ | 𐑬 | out | out, house, loud, count, crowd, cow |
NEAR | ɪə | ɪr | 𐑽 | ear | beer, sincere, fear, beard, serum |
SQUARE | ɛə | ɛr | 𐑺 | air | care, fair, pear, where, scarce, vary |
START | ɑː | ɑr | 𐑸 | are | far, sharp, bark, carve, farm, heart |
NORTH | ɔː | ɔr | 𐑹 | or | for, war, short, scorch, born, warm |
FORCE | ɔː | or | 𐑹 | or | four, wore, sport, porch, borne, story |
CURE | ʊə | ʊr | 𐑫𐑼 | wool, array | poor, tourist, pure, plural, jury |
happY | ɪ | ɪ | 𐑦 | if | copy, scampi, taxi, sortie, committee, hockey, Chelsea |
lettER | ə | ər | 𐑼 | array | paper, metre, calendar, stupor, succo(u)r, martyr |
commA | ə | ə | 𐑩 | ado | about, gallop, oblige, quota, vodka |
You can see from the chart there are only a few lexical sets that are not distinguished by Shavian, and having separate characters for rhotacized vowels seems clever to me, since they should occur in relatively predictable ways but can be pronounced with or without the R sound. I think it's worth appreciating the cleverness of this approach, which prefigured lexical sets by a few decades, and yields an alphabet uniquely suited to English despite the plethora of local realizations of its large and fairly unique phonetic set.
Using Shavian, I am made pretty aware of my accent and how it differs from the Received Pronunciation. For instance, when I say the word "been" it rhymes with "bin" and not "bean." The words "caught" and "cot," I pronounce the same, but if I imagine a British accent I can sort of imagine how the sounds differ. There are some standardized "spellings" for words that require one to think in RP, or at least an accent with more vowels than my American accent affords, although using or not using standard spellings seems not to be a contentious issue within the tiny Shavian user community.
I think the majority of L1 English speakers are probably unaware of some of the phonemes. The obvious ones that come to mind are θ/ð. It's not easy for me to guess which one I am using in a given word, and as far as distinct phonemes go, there are not that many minimal pairs for this set (ether/either seems to be one of the few). I think most L1 English speakers think of this as "the TH sound" and wouldn't do much better than me at guessing, without a finger to their throat, which they are using in a given word. Another example might be the word "think," which phonetically contains what we English speakers would call "the -ing sound"; in Shavian you have to encode that, but if you were thinking in terms of the normal Latin spelling you wouldn't realize it.
I have frequently said that English suffers from an overabundance of schwa sounds. I think now that isn't technically correct. Shavian helps show that what English does have is an abundance of vowels, and consequently many are not that different, and/or located near the middle. It's fun to click around the vowel chart on Wikipedia and try to find one that doesn't have an entry for some form of English; I didn't find one but didn't do an exhaustive search.
Aesthetically speaking, it's pretty good looking, especially at first blush. It's distinctive, and it's cool that there are 48 characters we can distinguish without picking up the pen. It doesn't really look like anything else. Another plus is that the name would make one think it was designed by an Armenian, which it certainly wasn't. It's quite wise, in my opinion, to use certain letters as single-glyph words (for "the"-𐑞, "to"-𐑑, "are"-𐑸, "for"-𐑓, "and"-𐑯 and "of"-𐑝). The sound-shape correspondences are interesting but not necessarily super helpful for remembering sounds. Most alphabets (Inuktitut is an exception) don't have every possible permutation of each shape. In my five-day-old opinion, some of the shapes are not easy to write. I have noticed, as others have, the funny fact that the glyphs for "h" and "ng" are in the voiced and unvoiced categories respectively, which seems wrong, although the decision was apparently intentional. I don't think it's worth litigating these things, although they are curious. For instance, you're better off learning Esperanto than Ido even though Ido is probably "better" in various ways, because it's better to just pick something and get the community going around it rather than constantly nit-picking, creating minor refinements, and fracturing the community along the way.
An interesting benefit is that overall, text typically takes about a third less space to write in Shavian. Here's an example:
Shavian | English |
---|---|
𐑞 𐑚𐑧𐑕𐑑 𐑚𐑦𐑤𐑳𐑝𐑩𐑛 𐑝 𐑷𐑤 𐑔𐑦𐑙𐑟 𐑦𐑯 𐑥𐑲 𐑕𐑲𐑑 𐑦𐑟 𐑡𐑳𐑕𐑑𐑦𐑕; | The best beloved of all things in My sight is Justice; |
𐑑𐑻𐑯 𐑯𐑪𐑑 𐑩𐑢𐑱 𐑞𐑺𐑓𐑮𐑪𐑥 𐑦𐑓 𐑞𐑬 𐑛𐑦𐑟𐑲𐑼𐑧𐑕𐑑 𐑥𐑰, | turn not away therefrom if thou desirest Me, |
𐑯 𐑯𐑦𐑜𐑤𐑧𐑒𐑑 𐑦𐑑 𐑯𐑪𐑑 𐑞𐑨𐑑 𐑲 𐑥𐑱 𐑒𐑩𐑯𐑓𐑲𐑛 𐑦𐑯 𐑞𐑰. | and neglect it not that I may confide in thee. |
𐑚𐑲 𐑦𐑑𐑕 𐑱𐑛 𐑞𐑬 𐑖𐑨𐑤𐑑 𐑕𐑰 𐑢𐑦𐑞 𐑞𐑲𐑯 𐑴𐑯 𐑲𐑟 | By its aid thou shalt see with thine own eyes |
𐑯 𐑯𐑪𐑑 𐑔𐑮𐑵 𐑞 𐑲𐑟 𐑝 𐑳𐑞𐑼𐑟, | and not through the eyes of others, |
𐑯 𐑖𐑨𐑤𐑑 𐑯𐑴 𐑝 𐑞𐑲𐑯 𐑴𐑯 𐑯𐑪𐑤𐑦𐑡 | and shalt know of thine own knowledge |
𐑯 𐑯𐑪𐑑 𐑔𐑮𐑵 𐑞 𐑯𐑪𐑤𐑦𐑡 𐑝 𐑞𐑲 𐑯𐑱𐑚𐑼. | and not through the knowledge of thy neighbor. |
𐑐𐑪𐑯𐑛𐑼 𐑞𐑦𐑕 𐑦𐑯 𐑞𐑲 𐑣𐑸𐑑; 𐑣𐑬 𐑦𐑑 𐑚𐑦𐑣𐑵𐑝𐑧𐑔 𐑞𐑰 𐑑 𐑚𐑰. | Ponder this in thy heart; how it behooveth thee to be. |
𐑝𐑧𐑮𐑦𐑤𐑦 𐑡𐑳𐑕𐑑𐑦𐑕 𐑦𐑟 𐑥𐑲 𐑜𐑦𐑓𐑑 𐑑 𐑞𐑰 | Verily justice is My gift to thee |
𐑯 𐑞 𐑕𐑲𐑯 𐑝 𐑥𐑲 𐑤𐑳𐑝𐑦𐑙-𐑒𐑲𐑯𐑛𐑯𐑩𐑕. | and the sign of My loving-kindness. |
𐑕𐑧𐑑 𐑦𐑑 𐑞𐑧𐑯 𐑚𐑦𐑓𐑹 𐑞𐑲𐑯 𐑲𐑟. | Set it then before thine eyes. |
- ·𐑚𐑭𐑣𐑭𐑵𐑤𐑭 | - Baha'u'llah |
Where does this reduction come from? I'm not a statistics person but I'll guess. Small words becoming one character or two is certainly a big help. Another significant help is that many small words have a lot of silent letters, "through" being the best example (it becomes 𐑔𐑮𐑵, beating the slang "thru" by a character). On the other hand, there is no "x" character, so words like "exist" can actually become longer ("𐑦𐑜𐑟𐑦𐑕𐑑"). Probably in many cases, it's simply digraphs like "sh" and "ch" or diphthongs like "ou" and "er" creating a small benefit across many words.
Thinking about Shavian reminded me of when my daughter was very young and very desperate to write. She would spell things "phonetically" wich yuzuly rezultid n smthin hahrd tu red. Would this have been easier? It's twice as big, so maybe the alphabet would take twice as long to learn. But it is uniform, apart from the distinctions mentioned earlier, so after learning the alphabet, reading would be entirely a matter of practice, getting faster, and learning vocabulary. That's got to be worth a year or two of education.
A downside which only a great speller like myself would point out is that there is tremendous historical information in the way we spell things. The first part of the words "function," "funny" and "phonetic" sound the same, but you can tell by looking that one comes from Greek and the others do not, and that the latter shares some meaning with words like "phonograph" and "telephone." This information is purely written. But it also tortures students and led us here, where fixing spelling is a significant benefit of using a computer.
Shavian is unlikely to unseat English's Latin-based orthography, but it is fun, fairly easy to learn, unique looking, and has various advantages. Wide usage is not one of those advantages but perhaps it will increase! It's probably a better spelling reform than just reintroducing ð and þ.
I18n Puzzles, Day 2
This is the kind of problem where you could load the whole thing into Postgres and get the answer in about five seconds. In fact, let's try it:
postgres=# create table i18n_day2 (input timestamp with time zone);
CREATE TABLE
postgres=# \copy i18n_day2 from /i18np/input.txt
COPY 1758
postgres=# select input at time zone 'UTC', count(*) from i18n_day2 group by input at time zone 'UTC' having count(*) >= 4;
timezone | count
---------------------+-------
20XX-YY-ZZ HH:MM:SS | 4
(1 row)
I decided to censor the output a little bit, and I didn't handle the formatting properly there because that's not really the point.
Suffice to say, this is kind of a non-problem in languages with time zone support. Unfortunately, neither J nor APL has time zone support in the standard library, so we'll have to figure it out on our own.
The first problem is that we have to parse these dates: 2019-06-05T08:15:00-04:00. These happen to be fixed-width. There are snappier ways of parsing but I decided to narrow in on this element of it.
My plan here is to handle the date+time part first, since there is a library built-in for this (todayno and todate, which seem like they should be inverses of each other but are not, for some reason). We can parse the time zone offset into a similar structure and expand it using expand. Then we add them, or actually subtract them (I realized this a little late).
I felt like I wanted to see a fork of the form f + g since the idea is to parse most of the timestamp and then the offset. The amount of work to handle a fixed-width format was not insubstantial, but I came up with these functions:
dp =. (_ 2 1 $ 0 4 5 2 8 2 11 2 14 2 17 2) 0&".;.0 ]
tp =. 0 0 0 1 1 0 #inv (_ 2 1 $ 19 3 23 2) 0&".;.0 ]
These two functions parse the date-and-time part (dp) and the time zone offset part (tp). Probably bad names. The key idea about using ;.0 is to take a substring of a given length at a given offset. So starting from 0 with length 4 gets us the year, this is the 0 4; then we get the month from offset 5 length 2, which is the 5 2 that comes next. All six of the chunks of data we need are thus specified by the 12 items in the list; we convert these into an array of 6 2x1 arrays with $. This feeds the subarray ;.0 verb. We're adding in 0&". to parse numbers; regular ". runs J code, but we just want the values.
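For readers who don't read J, the offset-and-length idea can be sketched in Python. This is an illustration of the technique rather than a translation of the verbs above; `DATE_FIELDS` and `parse_fields` are made-up names, and the pairs are the same twelve numbers given to `$` in `dp`.

```python
# (offset, length) pairs: the same twelve numbers used in dp above.
# Names here are hypothetical, chosen for this sketch only.
DATE_FIELDS = [(0, 4), (5, 2), (8, 2), (11, 2), (14, 2), (17, 2)]

def parse_fields(stamp, fields=DATE_FIELDS):
    """Cut each (offset, length) substring out of stamp and read it as an integer."""
    return [int(stamp[start:start + length]) for start, length in fields]

print(parse_fields("2019-06-05T08:15:00-04:00"))
# [2019, 6, 5, 8, 15, 0]
```

Because the format is fixed-width, no separators need to be inspected at all; the positions alone carry the structure.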
The ever-friendly and wise elcaro on the J channel of the APL Farm Discord suggested using these predicates instead:
Nats =: '1234567890'&(i. ".@:{ ' ',~ [)
Nums =: '1234567890._ ' ".@:{~ '1234567890.-' i. ]
Which was really tempting since you can then do all the parsing with this kind of expression:
t =. '2024-08-01T08:15:03-03:15'
19 (Nats@{. , Nums@}.) t
2024 8 1 8 15 3 _3 15
Which is really hot, but I insisted on doing it the hard way for some reason.
Now my plan is to "normalize" the timestamp, by converting this from a 6 item array to an internal date and back, and then throwing it into a printing function. First the printing function:
require 'format/printf'
dt =. '%04d-%02d-%02dT%02d:%02d:%02d+00:00' vsprintf
Nothing interesting here. Now my goal is to avoid boxing and pass lines through a function which does the work here. That function will do the "normalization" I mentioned above:
norm =. [: dt 2 todate 2 todayno dp - tp
There's the fork I was thinking about. I read another article (about ray tracing in J) which explained that the cap [: is about converting a dyadic function to a monadic one for forks.
The use of [: is a little weird to understand, but it is basically a no-operation left argument to ensure that the verb is evaluated as a function of one argument instead of two.
This seems like a decent explanation. So the idea here is to handle the time zone data with the date part, convert that to a day number, then convert that back to a date, then format it. The conversion handles the possibilities of negative times and whatnot.
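The same normalization can be sketched with Python's standard `datetime` module, as a way to see the arithmetic without the J date library. This is a sketch under the assumption that every line has the fixed-width shape shown earlier; `normalize` is a made-up name and is not a translation of `norm`.

```python
from datetime import datetime, timedelta

def normalize(stamp):
    """Rewrite a fixed-width local timestamp as the same instant at +00:00."""
    local = datetime.strptime(stamp[:19], "%Y-%m-%dT%H:%M:%S")
    # The sign applies to the whole offset, hours and minutes alike.
    sign = -1 if stamp[19] == "-" else 1
    offset = timedelta(hours=int(stamp[20:22]), minutes=int(stamp[23:25]))
    # Local time is UTC plus the offset, so UTC is local minus the offset.
    utc = local - sign * offset
    return utc.strftime("%Y-%m-%dT%H:%M:%S") + "+00:00"

print(normalize("2019-06-05T08:15:00-04:00"))
print(normalize("2019-06-05T17:45:00+05:30"))
```

Both sample lines name the same instant, so they print the same normalized string. Letting the date type carry the subtraction is what handles rollover across midnight, month ends, and so on.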
Another approach would have been to instead convert the first number to a "day number" and then convert the hour and minute values to fractions of a day. In trying that, I saw odd behavior so I decided this might work alright.
OK, so now we have the verb that will parse, but we still need to actually do the puzzle. The first piece is to use norm;._2 fread <filename>. Using norm with ;._2 is how we're going to avoid boxing; we'll get an array of normalized timestamps instead of boxed strings or whatever. But the puzzle question is to find the times that appear most frequently. This is not all that different from the word frequencies problem. So I wound up using key /. with tally # on the normalized timestamps, sorting by that, and applying that sort order to the nub ~. of the timestamps. Taking the first item of that list yields the timestamp we are interested in:
{. (~. nm) \: #/.~ nm =. norm;._2 fread 'test-input.txt'
And this is our solution. The entire thing is:
require 'format/printf'
dt =. '%04d-%02d-%02dT%02d:%02d:%02d+00:00' vsprintf
dp =. (_ 2 1 $ 0 4 5 2 8 2 11 2 14 2 17 2) 0&".;.0 ]
tp =. 0 0 0 1 1 0 #inv (_ 2 1 $ 19 3 23 2) 0&".;.0 ]
norm =. [: dt 2 todate 2 todayno dp - tp
{. (~. nm) \: #/.~ nm =. norm;._2 fread 'input.txt'
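The nub, tally and sort steps in the last line can be mirrored one for one in Python. This is a rough sketch with made-up sample data and a hypothetical helper name, `most_frequent`; it takes already-normalized strings as input.

```python
def most_frequent(items):
    """Return the item that occurs most often (first seen wins ties)."""
    nub = list(dict.fromkeys(items))         # like ~.  : unique items, first-seen order
    counts = [items.count(x) for x in nub]   # like #/.~ : one tally per unique item
    # like \: : order the nub by descending count (sorted is stable)
    order = sorted(range(len(nub)), key=lambda i: -counts[i])
    return nub[order[0]]                     # like {.  : take the first item

# Made-up sample of normalized timestamps
stamps = [
    "2019-06-05T12:15:00+00:00",
    "2011-02-01T12:15:00+00:00",
    "2019-06-05T12:15:00+00:00",
    "2019-06-05T12:15:00+00:00",
    "2011-02-01T14:15:00+00:00",
    "2019-06-05T12:15:00+00:00",
]
print(most_frequent(stamps))
```

In everyday Python, `collections.Counter(stamps).most_common(1)` does the same job; the longer form is used here only to line up with the J primitives.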
Relational Thinking in J
According to Aaron Hsu, the starting point for APL programming is the relational model.
I'm mixed on this, because I don't think J has a natural concept of table, actually. Moreover, it seems like your code gets littered with head {. and constant-powered { lookups if you do represent data in a tabular format, with heterogeneous rows. I could be wrong, but it seems to work better in general when you have homogeneous arrays. Creating weird packets of data like we do in other languages just doesn't seem to be the ticket here.
Suppose you design a table like this:
developer name | email |
---|---|
alice | a@example.com |
bob | b@example.com |
calvin | c@example.com |
delilah | d@example.com |
ellen | e@example.com |
You will probably wind up using 0 { or 1 { to take it apart to do different things with the different columns. So I would probably build this table in J like so:
developers =. 'alice'; 'bob'; 'calvin'; 'delilah'; 'ellen'
devemails =. 'a@example.com'; 'b@example.com'; 'c@example.com'; 'd@example.com'; 'e@example.com'
This is maybe a column-oriented view of the world. You can recover the table pretty easily though:
developers ,. devemails
┌───────┬─────────────┐
│alice  │a@example.com│
├───────┼─────────────┤
│bob    │b@example.com│
├───────┼─────────────┤
│calvin │c@example.com│
├───────┼─────────────┤
│delilah│d@example.com│
├───────┼─────────────┤
│ellen  │e@example.com│
└───────┴─────────────┘
Projection is sort of obvious now: you have to choose the columns you want because you don't have the table, as it were. Selection isn't so bad; you are going to filter on a certain column and apply that filter on the other column. Let's find the developers with an "a" in their name:
developers #~ 'a' e."1> developers
┌─────┬──────┬───────┐
│alice│calvin│delilah│
└─────┴──────┴───────┘
The same selection works on the other column, and you can still stitch together columns to make a table:
devemails #~ 'a' e."1> developers
┌─────────────┬─────────────┬─────────────┐
│a@example.com│c@example.com│d@example.com│
└─────────────┴─────────────┴─────────────┘
(#&developers ,. #&devemails) 'a' e."1> developers
┌───────┬─────────────┐
│alice  │a@example.com│
├───────┼─────────────┤
│calvin │c@example.com│
├───────┼─────────────┤
│delilah│d@example.com│
└───────┴─────────────┘
Joins work by doing lookups by index. Let's introduce another table:
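The column-oriented selection above can be sketched in Python for comparison: compute one boolean per row, then apply the same mask to every column. The helper `select` is a made-up name for this sketch.

```python
developers = ["alice", "bob", "calvin", "delilah", "ellen"]
devemails = ["a@example.com", "b@example.com", "c@example.com",
             "d@example.com", "e@example.com"]

def select(mask, column):
    """Keep the items of column where mask is true (like mask # column in J)."""
    return [item for keep, item in zip(mask, column) if keep]

# One boolean per row: does the name contain an "a"?
mask = ["a" in name for name in developers]

# Apply the same mask to each column, then stitch the columns into rows.
rows = list(zip(select(mask, developers), select(mask, devemails)))
print(rows)
```

The mask is computed once from one column and reused on the others, which is the whole trick when the "table" only exists as parallel lists.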
project name | lead developer |
---|---|
alphago | calvin |
bitbucket | ellen |
cafeteria | delilah |
diffie | alice |
entryway | bob |
finality | alice |
grace | delilah |
homelab | calvin |
Following the earlier example we get this:
projects =. 'alphago'; 'bitbucket'; 'cafeteria'; 'diffie'; 'entryway'; 'finality'; 'grace'; 'homelab'
projdevs =. 'calvin'; 'ellen'; 'delilah'; 'alice'; 'bob'; 'alice'; 'delilah'; 'calvin'
projects ,. projdevs
┌─────────┬───────┐
│alphago  │calvin │
├─────────┼───────┤
│bitbucket│ellen  │
├─────────┼───────┤
│cafeteria│delilah│
├─────────┼───────┤
│diffie   │alice  │
├─────────┼───────┤
│entryway │bob    │
├─────────┼───────┤
│finality │alice  │
├─────────┼───────┤
│grace    │delilah│
├─────────┼───────┤
│homelab  │calvin │
└─────────┴───────┘
We can find the email of the developer for each project like so:
(developers i. projdevs) { devemails
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┐
│c@example.com│e@example.com│d@example.com│a@example.com│b@example.com│a@example.com│d@example.com│c@example.com│
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┘
This might be easier to read as a table, so let's do that:
projects ,. projdevs ,. devemails {~ developers i. projdevs
┌─────────┬───────┬─────────────┐
│alphago  │calvin │c@example.com│
├─────────┼───────┼─────────────┤
│bitbucket│ellen  │e@example.com│
├─────────┼───────┼─────────────┤
│cafeteria│delilah│d@example.com│
├─────────┼───────┼─────────────┤
│diffie   │alice  │a@example.com│
├─────────┼───────┼─────────────┤
│entryway │bob    │b@example.com│
├─────────┼───────┼─────────────┤
│finality │alice  │a@example.com│
├─────────┼───────┼─────────────┤
│grace    │delilah│d@example.com│
├─────────┼───────┼─────────────┤
│homelab  │calvin │c@example.com│
└─────────┴───────┴─────────────┘
So there's some basic relational-type stuff in J. Is this the right approach? I don't know.
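The join-by-index idea carries over directly to Python lists. This is a minimal sketch with a hypothetical helper named `lookup`; it uses `list.index`, which, like `i.`, finds the first match. Unlike `i.`, it raises `ValueError` for a missing key rather than returning an out-of-range index.

```python
developers = ["alice", "bob", "calvin", "delilah", "ellen"]
devemails = ["a@example.com", "b@example.com", "c@example.com",
             "d@example.com", "e@example.com"]
projects = ["alphago", "bitbucket", "cafeteria", "diffie",
            "entryway", "finality", "grace", "homelab"]
projdevs = ["calvin", "ellen", "delilah", "alice",
            "bob", "alice", "delilah", "calvin"]

def lookup(keys, key_column, value_column):
    """For each key, find its first position in key_column and fetch that row's value."""
    return [value_column[key_column.index(key)] for key in keys]

lead_emails = lookup(projdevs, developers, devemails)

# Stitch the three columns back into rows for display.
for row in zip(projects, projdevs, lead_emails):
    print(row)
```

For large columns a dictionary from key to position would avoid the repeated linear scans, at the cost of drifting further from the shape of the J expression.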
Edit: Notes from the chatroom from elcaro:
You want to be a little bit careful when dealing with boxes, because unboxing a boxed array will create fills. When you do 'a' e."1> developers, your right arg is a 5×7 character vector
quote"1 > developers
'alice  '
'bob    '
'calvin '
'delilah'
'ellen  '
If you looked for developers with a Space in them, you'd match all except "delilah". You can unbox each, and so compare each unbox string by itself
('a' e. ])@> developers
1 0 1 1 0
Which is fine, but might be hard to put into a larger tacit expression, as you need to explicitly refer to left arg in that inner expression. Something else you can do is use Spread (S:) or Level-At (L:) where appropriate. I don't use these much, but S:0 often pairs nicely with e. or E., where you want to match some unboxed string to a list of boxed strings
'a' e.S:0 developers
1 0 1 1 0
This new operator is often easier to put into a larger tacit expression (if desired). Spread and Level are a little like the Depth modifier in BQN ⚇ (and proposed in APL ⍥) in that it modifies operators to perform on nested data. The NuVoc pages are a little lacking, but play with them and you'll get a hang of what they do: Spread - NuVoc, Level-At - NuVoc, Depth (operator) - APL Wiki, Depth - BQN docs.
Oh one more thing… you can do
(#&developers ,. #&devemails) 1 0 1 1 0
as
(developers ,. devemails) #~ 1 0 1 1 0
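The fill pitfall elcaro describes can be imitated in Python by padding the names to a common width, which is roughly what unboxing does to ragged strings. This is only an analogy: Python lists of strings are not J boxes, and the variable names are made up.

```python
developers = ["alice", "bob", "calvin", "delilah", "ellen"]

# Pad every name to the longest length, as unboxing pads with fill spaces.
width = max(len(name) for name in developers)
padded = [name.ljust(width) for name in developers]

# Searching the padded rows finds fill spaces that were never in the data...
space_in_padded = [" " in row for row in padded]
# ...while searching each name on its own does not.
space_in_each = [" " in name for name in developers]

print(space_in_padded)
print(space_in_each)
```

Only "delilah" already has the full width, so it is the only padded row without a space, matching the chat note above.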
My thoughts on the CES Letter.
My friend Bill recommended I read CES Letter. I found it pretty hard to put down, and read the whole thing over the last couple days. I thought it was worth reflecting on the ideas in it from the point of view of a Baha'i.
I haven't mentioned it on my blog yet for various reasons, but I declared myself a Baha'i in October 2023. So I now belong to a Faith about the size of Mormonism (although much smaller in the US). I have been met with quite a bit of understanding and maybe a little puzzlement. But this seems like a good context to talk about it, because any faith should be able to withstand the kinds of questions Jeremy Runnells asks of his former faith. Why not start with the controversial stuff? At least you won't think I'm being fatuous.
Before going too far down this road, I want to make it clear I'm not going to bag on LDS believers. Baha'u'llah says "Consort with the followers of all religions in a spirit of friendliness and fellowship." Reading the CES Letter, I felt a lot of sympathy for Jeremy and others like him: good and decent people trying to adhere to God's wishes, who I have no doubt are worthy of and will receive an ample share of God's blessings. And furthermore, I'm just a random Baha'i and a guy who has read some books; I don't speak from authority about anything, let alone the Baha'i Faith.
The CES Letter is concerned with a number of topics: the verity of the foundational texts, the legitimacy of Joseph Smith Jr and the line of authority starting with him, the legitimacy of his witnesses, certain difficult or troubling teachings, and the Church's attitudes towards historicity, fact, belief, emotion and the general reconciliation of reason with, let's say, shortcomings in those categories.
There are analogs in some of these areas to the questions that are asked of the Baha'i Faith. In Mormonism, the question about the texts has to do with their historicity and legitimacy. In the Baha'i Faith, we have a large number of writings that are untranslated. Both religions have gone through succession crises and wound up with basically one real movement and a few tiny groups of sectarians. Neither is especially progressive about LGBTQ issues, and both have a kind of membership status that can be imperiled by transgressing certain behavioral standards including this. Both have a concept of infallibility, a word that makes my skin crawl a bit and probably yours too.
Probably the most basic difference is that the Baha'i Faith requires you to figure things out on your own. While you may be born into a Baha'i family and may get a Baha'i education, in the end, it's completely your personal decision. When you declare, at least in the US, you fill out an online form; if you withdraw, you click a button on the same site. You're not going to get harassed about it. You can attend Baha'i gatherings as a Baha'i or just a friend; declaring and undeclaring does not imperil this. Nothing is really at stake if you reject the Faith. The benefits of being a declared Baha'i are simply that you can donate money, participate in the electoral process and take part in the internal discussion part of the 19-day meeting ("feast"). Nobody believes that you are going to hell or otherwise going to suffer some kind of penalty in the afterlife. Similarly, the Baha'i behavior standards are for Baha'is. If you have not declared, they obviously do not apply to you, and nobody will think less of you for drinking or whatever. Anyone can use the Baha'i prayers, read the Baha'i books, believe in any part of it or none of it; it is for everyone equally.
Based on the CES Letter, it sounds like for Mormonism, the important thing is that you feel it is true, and this is called a testimony. Based on this, you're supposed to accept everything. The outcome for Jeremy was extremely difficult. It's not a good look for the LDS church, in my opinion.
Now, infallibility gives me and most Americans hives because we find it impossible to imagine that someone could never have made a mistake about anything. The combination of infallibility with two known-invalid translations and a third whose source is not available kind of beggars belief. For Baha'is, the combination of infallibility with a large corpus of untranslated text at first sounds like a minefield. I have only read a small portion of what is available, because despite it being a fraction of what exists, it's still enormous. In doing this reading, what I came to realize is that generally when something is said and I have trouble with it, when I ponder it, I come to see how it fits with everything else. Though there is a vast amount of writing, the majority of it closes in on the same themes repeatedly: the unity of mankind, the unity of religion, the unity of the world. It is tremendously optimistic. Having read as much as I have, my fears about what remains untranslated have reduced a lot. Baha'u'llah didn't sit around holding forth on every possible topic. He talked about the same things repeatedly with many different people, using different kinds of language.
In the light of unity, I see infallibility as primarily serving the role of bringing about unity by proscribing intense, schism-causing debate. In this perspective, Baha'u'llah gave leadership to Abdu'l-Baha' and declared his infallibility so that there would not be schism from different people trying to usurp the religion. (Many tried anyway.) The same story repeats with Shoghi Effendi. In neither case were they empowered to add new things to the religion, only to explain. Each of these was a small succession crisis in comparison to Shoghi Effendi's death and the formation of the UHJ. But even there, enough groundwork had been laid that the vast majority of people came along to the Baha'i Faith we have today, and only a small splinter group was created (mostly living in Roswell, NM, of all places). There are only a few examples people have found where Abdu'l-Baha' appears to say something contrary to science; the most prominent one is probably this one:
Question: What will be the food of the united people? Answer: As humanity progresses, meat will be used less and less, for the teeth of man are not carnivorous. For example, the lion is endowed with carnivorous teeth, which are intended for meat, and if meat be not found, the lion starves. The lion cannot graze; its teeth are of different shape. The digestive system of the lion is such that it cannot receive nourishment save through meat. The eagle has a crooked beak, the lower part shorter than the upper. It cannot pick up grain; it cannot graze; therefore, it is compelled to partake of meat. The domestic animals have herbivorous teeth formed to cut grass, which is their fodder. The human teeth, the molars, are formed to grind grain. The front teeth, the incisors, are for fruits, etc. It is, therefore, quite apparent according to the implements for eating that manโs food is intended to be grain and not meat. When mankind is more fully developed, the eating of meat will gradually cease.
On the one hand, we have an evolutionary explanation for why we have incisors: tearing meat, probably. But the question isnโt about where we have come from, itโs about where we are going, and virtually everyone can agree that eating more meat is probably worse for us. Evolution tells us where you came from, it doesnโt necessarily tell you where youโre going; giant pandas have incisors too, but only eat bamboo.
In the Bahaโi Faith, science and religion are complimentary because one tells you where you have been and the other where you are going; one tells you how things are and the other tells you why. As Bahaโuโllah says: โKnowledge is as wings to manโs life, and a ladder for his ascent. Its acquisition is incumbent upon everyone.โ While there is an emotional, experiential dimension to the Bahaโi Faith, it is not to override reason. This does not appear to be the case in the LDS church.
On the question of LGBTQ issues, we have about the same in the Bahaโi Faith as in the other major Abrahamic religions: a definition of marriage as being between a man and a woman and that sexual relations are to be limited to this union. Bahaโuโllah made a very oblique mention of pedophilia in the book of laws; Shoghi Effendi expanded this to homosexual relations. I canโt reconcile myself to Shoghi Effendiโs interpretation, but he certainly did not have the power to add something new to the writings or abrogate something completely in order to arrive at another solution. In the interest of unity, I can only acquiesce to it. There are LGBTQ Bahaโis, as well as Muslims, Catholics and Jews; I think Bahaโuโllah certainly did not want us to construct our identities on the basis of our preferences, and this is simply one of those tests for us, especially Americans who are accustomed to equating the satisfaction of our desires with the good. The matter is, in my opinion, somewhat overstated, because the Faith does not really permit us to categorize people into โgoodโ and โbadโ categories (or โin-groupโ/โout-groupโ or any other dichotomy) on any basis. If you are discriminated against by a Bahaโi, they have failed to fulfill their obligations to Bahaโuโllah to create unity.
Unlike Mormonism, we simply don't have questions about the accuracy or authenticity of our religious texts. Also unlike Mormonism, we get to have the benefit of most other religions' texts. The validity of our translations is sometimes questioned. The Arabic of The Báb and Baha'u'llah was not exactly normative; Modern Standard Arabic wasn't yet really formalized and their styles are idiosyncratic and have a lot of Persian influence. The Báb's writings were only intended to last a short period of time and are written in a very dense style for people highly acquainted with the Qur'an; translating the bulk of them into English has not been a high priority. Those interested in knowing more about His writings should probably read Gate of the Heart. I don't know the precise reason why much of Baha'u'llah's writings haven't yet been translated. I suspect that the reasons are simply that A) there is so much to translate, B) Shoghi Effendi translated what he considered important for us, and there is a general trust in his decisions, C) there is not much trust in our ability to do an equally good job, D) the funds are better spent in other ways, and E) a lot of Baha'u'llah's writings are direct correspondence with believers and others, so may be repetitive in toto or not particularly general. These are just my speculations.
The historicity claims of the LDS church are pretty integral to the entire enterprise. If Joseph Smith didn't really find hidden tablets, if the tablets aren't true, then the rest of his revelation, which is what provides the impetus for there being an LDS church at all, is probably false. Would the Book of Mormon be worth reading if it were an out-and-out forgery? The context surrounding it makes it difficult for non-Mormons to take seriously, and the Book of Abraham and Kinderhook Plates are obviously forged. Is there still value in reading and studying these books? I'm not in a position to say. Much of the Hebrew Bible is either unverifiable or false from a historical perspective; the majority of Jews don't see a problem with this and find that the ideas are still useful and worth studying. Most of Jesus's words are in the form of parables, which means we know that their "truth value" in the sense of logic is false, but their "truth value" in the sense of spiritual teachings is quite large. On the other hand, it is the truth value of the LDS's teachings that gave the LDS church racist teachings and promoted polygamy and conversion therapy, several of which were abrogated only recently.
The Baha'i Faith has some historicity problems of another sort. The obvious one is that if all religions are one, why are they so different from one another? The Baha'i perspective is that there are core teachings that are the same for all religions, which have been promoted by the Manifestations of God in all eras and all regions; the details differ because the locality and era might demand different emphasis, or perhaps just due to the passage of time and meddling by religious authorities (a favorite target of Baha'u'llah's). This seems pretty workable for the Abrahamic faiths that preceded it but raises problems about Eastern religions that do not have obvious solutions. Consider this question about Confucius and Buddha:
Buddha also established a new religion and Confucius renewed the ancient conduct and morals, but the original precepts have been entirely changed and their followers no longer adhere to the original pattern of belief and worship. The founder of Buddhism was a precious Being Who established the oneness of God, but later His original precepts were gradually forgotten and displaced by primitive customs and rituals, until in the end it led to the worship of statues and images.
It's quite difficult to look at the Buddhism we have today and see how Buddha could possibly have been talking about the oneness of God when there are no Buddhisms today that talk about God. Moojan Momen wrote a lengthy article about the interface between the Baha'i Faith and Buddhism centered around the 8-fold path. But it's unlikely that a Buddhist would be interested in hearing from us that they were originally monotheists. Buddhism is so old, and the original writings so long gone, that what we have here from Abdu'l-Baha' is basically untestable. His interpretation of Buddhism as worshipping images would be offensive to Western Buddhists. I'm not in a position to judge whether it is true elsewhere; my sense is that Buddhism in the West is rather different from Buddhism in the East, where people actually visit Buddhist shrines and have them in their homes. But back to the point.
Faced with a particular instance of an infallible person saying something at least untestable but perhaps false, I could respond by just giving up on it. I could respond to the non-acceptance of gay marriage by giving up on the Faith. I didn't, though.
- There is not some slightly-tweaked form of the Faith out there that would resolve all these problems; there's one Baha'i Faith.
- If there were, how would siding with some splinter group help working towards unity?
- There is not a place in the Faith where it says it is my job to judge people on the basis of their religion or sexuality or anything else. On the contrary, I am to accept everyone, show kindness to everyone, show violence to no one, speak ill of no one, etc.
- What we are trying to build is bigger; quibbling about this or that element creates more disunity rather than more unity.
- I can accept and show love to anyone, even if I can't change doctrine.
Whether Abdu'l-Baha' is right about the historical Buddha, we will probably never know, but I don't think it undermines my religion. We have what we have today. Building bridges between Buddhism and the Baha'i Faith, and moreover between Buddhists and Baha'is, is something that matters today. Whatever their beliefs are based on, I still think we can learn something from them today. And it's the same with the LDS church. We can learn things about God, humanity, how to be a better person and so forth from Mormons, from Buddhists, from LGBTQ people, from atheists. And we should, and we have to.
So this is my interpretation of infallibility: that it is more about unity than about being right about everything. Much like in a marriage, sometimes you have to set aside the question of who is right in order to be happy and have peace. Hopefully not all the time. The marriage is more important. It can be, anyway, ideally. But also like a marriage, it's hard to rebuild trust after a lot of lying. After having gone through the CES Letter, it feels like there is a little too much fabrication.
Overall, the CES letter is a great read. I recommend it to you whether you are an LDS member or not. I only disagree with one idea: "Each religion has believers who believe that their spiritual experiences are more authentic and powerful than those of the adherents of other religions. They cannot all be right together, if at all."
Internationalization Puzzles in J, Day One
I just found out about these i18n puzzles and figured I'd take a crack at one in J. The first one is pretty easy. I'm also trying to apply my learning about linear algebra to the domain.
The crux of this puzzle is the following observation about a string:
- For SMS, the byte count matters, and must be at most 160 bytes
- For Twitter, the character count matters, and must be at most 140 characters
The input format is just a list of strings, and your mission is to calculate the cost. I glanced over this at first and made a mistake in my zeal to apply some really rudimentary linear algebra. Basically, I thought, let's convert the input into a big matrix, we'll have a byte count and a Unicode length for each, and this will become our input matrix:
This will become a matrix of 0s and 1s, which we can then multiply by the cost vector with a dot product:
The entire problem should reduce to something like this:
If I had done this step on paper or something, I would probably have figured out the mistake, but I didn't until later.
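The matrices those three steps refer to do not appear above. As a sketch reconstructed from the numbers that show up later in this post (byte and character counts for the four test lines, limits of 160 and 140, prices of 11 for an SMS and 7 for a tweet), the idea would look roughly like this; the symbols $L$, $F$ and $c$ are names chosen here for illustration:

```latex
L = \begin{pmatrix} 162 & 143 \\ 138 & 136 \\ 253 & 140 \\ 147 & 141 \end{pmatrix}
\quad\longrightarrow\quad
F = \begin{pmatrix} 0 & 0 \\ 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{pmatrix},
\qquad
F c = F \begin{pmatrix} 11 \\ 7 \end{pmatrix}
    = \begin{pmatrix} 0 \\ 18 \\ 7 \\ 11 \end{pmatrix}
```

Each row of $F$ records whether a line fits in an SMS (first column) and in a tweet (second column). The second entry of the product comes out as 18 where the puzzle expects 13, because a plain dot product cannot express a discount for a line that fits both.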
Let's translate that to J. First we need to read the file, which we do with `input =. cutLF fread filename`. This gives us boxed strings, which is fine. Now we need to use `#` to get the length and `ucpcount` to count Unicode characters. We can throw these together as a train with `,` to get both at once:
(# , ucpcount) each input
┌───────┬───────┬───────┬───────┐
│162 143│138 136│253 140│147 141│
└───────┴───────┴───────┴───────┘
These are the very values in the problem page, so this appears to be on the right track.
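To see why the two counts differ, here is a minimal sketch on a made-up string (not taken from the puzzle input). It assumes the standard library's `ucpcount` and builds the non-ASCII character from its code point with `u:` so the example does not depend on the encoding of the script file:

```j
NB. 233 is the code point of e-acute; 4 u: makes the character,
NB. 8 u: encodes it as UTF-8 bytes (two of them).
s =. 'caf' , 8 u: 4 u: 233

# s                NB. byte count: 5
ucpcount s         NB. character count: 4
(# , ucpcount) s   NB. both at once: 5 4
```

The train `(# , ucpcount)` is a fork: it applies `#` and `ucpcount` to the same argument and joins the two results with `,`.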
Then I hit a little snag with trains, because I wanted to write it like `(160>:# , 140>:ucpcount)` but this does not do what it feels like it should: J has no precedence between verbs and groups a train from the right, so the `160>:` ends up applied to the whole list rather than just to the byte count. So I wrote it like so instead:
((160>:#) , 140>:ucpcount) each input
┌───┬───┬───┬───┐
│0 0│1 1│0 1│1 0│
└───┴───┴───┴───┘
>((160>:#) , 140>:ucpcount) each input
0 0
1 1
0 1
1 0
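The snag with the unparenthesized train can be reproduced on a single string. This is a small sketch on a made-up 162-byte ASCII string, chosen so that the two versions disagree:

```j
NB. Hypothetical test string: 162 ASCII bytes, so 162 characters too.
s =. 162 # 'a'

NB. Without inner parentheses the train groups from the right as
NB.   160 >: (# , (140 >: ucpcount))
NB. so it compares 160 against the list 162 0.
(160>:# , 140>:ucpcount) s      NB. 0 1

NB. With the parentheses, each comparison is its own fork:
NB.   (160 >: 162) , (140 >: 162)
((160>:#) , 140>:ucpcount) s    NB. 0 0
```

The first form claims the string fits in a tweet, which it does not; the second gives the intended pair of flags.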
Now we have exactly the matrix I expected to have, so let's try the dot product:
(11 7) +/ .* |: >((160>:#) , 140>:ucpcount) each input
0 18 7 11
This is supposed to be `0 13 7 11`? Oh right, in my excitement I forgot that I need to apply the discount when a message works for both SMS and Twitter. I thought about this for a second and thought, I would really like to be able to index an array by another array. I'm not sure how that would work. But I also remember reading about a trick where you convert the two-dimensional index into a scalar by using base `#.`. So instead of having a 2x2 table, we just have an array of length 4. In other words, 0 0 = 0, 0 1 = 1, 1 0 = 2, 1 1 = 3. Then I can encode the prices as 0 7 11 13, the price for nothing, a Tweet, an SMS, and both.
(0 7 11 13) {~ 2 #. > ((160>:#) , 140>:ucpcount) each input
0 13 7 11
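The base-2 trick can be sketched directly on the 0/1 table from earlier. J can also index an array by another array: boxing each row of flags makes `{` treat it as a (row, column) position in a 2x2 price table. The names `fits`, `prices` and `table` are chosen here for illustration:

```j
fits   =. 4 2 $ 0 0  1 1  0 1  1 0   NB. SMS flag, tweet flag per line
prices =. 0 7 11 13                  NB. nothing, tweet, SMS, both

NB. Read each row as a base-2 number, then use it as a flat index.
#. fits                NB. 0 3 1 2
(#. fits) { prices     NB. 0 13 7 11

NB. Alternative: keep the prices as a 2x2 table and index by boxed rows.
table =. 2 2 $ prices
(<"1 fits) { table     NB. 0 13 7 11
```

Both routes give the same prices; the flat version is shorter, while the boxed version keeps the table shape visible.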
Now we can just make the entire solution:
+/ (0 7 11 13) {~ 2 #. > ((160>:#) , 140>:ucpcount) each (cutLF fread'~/Downloads/test-input.txt')
31
And this solves the puzzle.
Edit: the helpful people on The APL Farm provided some advice. For starters, Time Melon points out that `#.` has a default left argument of 2, so we can simply remove the 2 there, and the parentheses around the prices can be removed, yielding this improvement:
+/ 0 7 11 13 {~ #. > ((160>:#) , 140>:ucpcount) each input
31
Elcaro points out that `each` is creating boxes I am then removing, so we can simplify to this:
+/ 0 7 11 13 {~ #. ((160>:#) , 140>:ucpcount) every input
31
or this; I'm undecided but leaning towards the shorter one because I didn't realize `every` was a thing:
+/ 0 7 11 13 {~ #. ((160>:#) , 140>:ucpcount) &> input
31
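The difference between `each` and `every` is easy to see on a small boxed list. This sketch uses made-up words rather than the puzzle input:

```j
words =. 'one';'three';'seventeen'

# each words      NB. three boxes, holding 3, 5 and 9
> # each words    NB. 3 5 9, after opening the boxes again
# every words     NB. 3 5 9, no intermediate boxes
# &> words        NB. 3 5 9, every is just a name for &>
```

`each` is `&.>`, which opens each box, applies the verb, and boxes the result again; `every` is `&>`, which skips the final boxing step.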
Elcaro also noticed that I'm missing out on the obvious fact that `>:` is repeated inside the major transformation, so we can simplify it further to this:
+/ 0 7 11 13 {~ #. (160 140 >: #,ucpcount) &> input
31
NB. Or directly
+/ 0 7 11 13 {~ #. (160 140 >: #,ucpcount) &> cutLF fread '~/Downloads/test-input.txt'
31
And this appears to me to be the final form!
Edit: Elcaro makes another suggestion, pointing out that `cutLF` is not that different from `<;._2`, and so we can actually remove the boxing altogether and simplify the solution slightly further to this:
+/ 0 7 11 13 {~ #. (160 140 >: #,ucpcount);._2 input
31
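A small sketch of what `;._2` does, on a made-up two-line string. It assumes the right argument is the raw text as `fread` returns it (each line ending in LF), not the boxed list that `cutLF` produces:

```j
NB. Hypothetical stand-in for raw file contents; each line ends in LF.
raw =. 'ab' , LF , 'cdef' , LF

<;._2 raw     NB. boxes 'ab' and 'cdef', much like cutLF raw
# ;._2 raw    NB. 2 4, applying # to each line with no boxes in between
```

The cut `u;._2` treats the last item (here LF) as the delimiter, splits the text at every occurrence of it, drops the delimiters, and applies `u` to each piece.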