What is proof to a non-mathematician?

I was talking to Alex this morning and the topic of geometric proof came up. This caused me to spend some time idly thinking about proof. I’ve written about this before but let me expand on it kind of dreamily.

I learned about Coq from a seminar taught by another friend of mine, Tyler Cecil. A key idea I learned from him is audience. The nice and beautiful thing about using a theorem prover like Coq or Lean is that it has the same excruciating standards for everyone. It entices the programmer by making you think that proving something is about the same as writing a program. This is formally true, thanks to the Curry-Howard Isomorphism but the skills needed to write a proof and the skills needed to write a program are curiously different. You wind up needing to know a bit of both to use a theorem prover well.

My friend Randy Van Why explained to me why mathematicians do not especially love theorem provers. In short, they are not solely interested in the truth-value of a proposition. They want to understand why a theorem is true. Input to a theorem prover tells you that a theorem is true, and in many cases it can do a proof-by-exhaustion that human could never pull off. But exhaustively testing every possibility to find out that the four-color theorem is true is not the same as the elegant proof of the five-color theorem. We want to learn something from a proof in addition to the truth value.

Parenthetically, Randy also informed me that the majority of mathematical communication happens over the chalkboard, not paper. So don’t feel bad if you have trouble self-studying math, because mathematicians mostly don’t.

One thing computer scientists like about theorem provers is that they can be restricted to constructive proof. This means certain avenues of proof are unavailable, and some things cannot be proven, but what you can prove, you can also construct—or compute.

However, when I am proving something on the side for fun, the only person I need to convince is myself. Kind of like playing chess with yourself, you have to train yourself a little in how to do it. However, the game is a bit more fun. Often if you don’t know how to do a proof from a book, it’s because you have forgotten an important definition from earlier. Sometimes you misunderstand the situation and make a bad proof you think is good. The time you spend shouting at the page, this is obvious! why can’t I prove it?” seems to be time well-spent.

Here’s an example from the book No Bullshit Guide to Linear Algebra:

Verify that v^=1|||| = 1.

Well, we have a few definitions we’ll need to consult:

v=(vx,vy)=vx2+vy2 || || = || (v_x,v_y) || =

and

v^=vv =

Knowing these, we can expand like so:

v^=vv=(vxv,vyv)=(vxv)2+(vyv)2=(vxvx2+vy2)2+(vyvx2+vy2)2=vx2vx2+vy2+vy2vx2+vy2=vx2+vy2vx2+vy2=1=1

This is a pretty simple example of a direct proof. In my opinion, this is enough detail. Someone newer to this subject matter might be confused by what happened between step 1 and 2, or someone with poor algebra might find the whole thing quite confusing. On the other hand, a more experienced mathematician would probably skip the penultimate step, or maybe even the step before that. The audience matters. But even if it’s just for you, I think doing these kinds of verifications is good for your understanding of the material.

May 14, 2025






Basic Stats in J

My son’s homework the other day involved the basic statistical functions of mean, median, mode and range. One of the most classic array language functions is, of course, calculating the mean, which is famously three functions: mean =. +/ % # in J.

The range is also pretty simple; you just want the difference between the largest and smallest values in the input, so you get range =. >./ - <./ in J.

The next two are not as clean. I’ll show you what I came up with and then we can compare to what’s going on in the stats/base library.

Median

Let’s start with the median. The idea is you take the middle value from the sorted list. Middle value” makes I don’t see an obvious trick for finding the middle value in J. I see basically two ways to do it: 1) calculate the midpoint and index it, or 2) use a recursive grossness to tear bits of the start and end. The latter feels more like a functional programming solution with linked lists, so I’m guessing the more array-ish way of doing it is to calculate. But there’s a second problem, which maybe the dear reader has already seen but I had to actually get halfway there before I realized, which is that even length arrays do not have a value at the midpoint. This invites some overthinking.

My son’s book suggests that in this situation you average the two midmost values. On the one hand, it feels to me like the whole point of the median is that it is a value from the set. On the other hand, only having a median 50% of the time feels unsatisfactory. Having two results for median 50% of the time also seems bad. I checked Wikipedia, and they give a definition as a partial function:

if nn is odd, med(x)=x(n+1)/2(x) = x_{(n+1)/2} if nn is even, med(x)=x(n/2)+x((n/2)+1)2(x) =

This suggests a direct translation that would involve using agenda @. on the divisibility of the length. Another approach would be to always get two values and then if they happen to be the same, average them anyway, which feels like less work to me. If the list is odd-length, we should obtain a midpoint with a half index in it, as in a list of length 7 producing a midpoint of 3.5. We can subtract 0.5 from that to get the actual index of 3. However, if the list has a length of 8, we will get a midpoint of 4. Subtracting 0.5 from that will produce 3.5. If we take the floor and ceiling of this, we will get 3 and 4, which are the indices of the values around the middle. So it feels like the right approach here is to find the midpoint minus half, take the floor and ceiling of that, take the values at those indexes in the sorted list and average them. Which is a lot of words to say find the median” but seems like the only way, going down this road. This yields the slightly unwieldy verb:

median =. {{ (+/ % #) q {~ (<.,>.) 0.5-~ -: # q =. /:~ y }}

So the idea here is take the argument, sort it, call it q because we’re going to need it again later, get the length of it, half it, take 0.5 from it, get the floor and ceiling of that number, take those from q, then average them. I’m not in love with this. Let’s see what we have in the stats/basic file:

min=: <./
max=: >./
midpt=: -:@<:@#
median=: -:@(+/)@((<. , >.)@midpt { /:~)

This is an interesting piece of code. We can see some similarity, though defining max/min and not using them is a choice. For midpoint, we have half of count minus one, which saves a few constants from my method. Same trick for getting the floor and ceiling in one: (<. , >.). Nice compositional style here, no need for the actual variables. Also no need for a real average, we just add up the two values and cut in half. All in all, this solution is shorter and makes much better use of the primitives than mine, which isn’t too surprising, and manages a tacit style nicely. I’m not sure how much value is obtained by moving midpoint out though.

Tentative conclusion: pay attention to constants like 1/2. If you know you have two items, you can use halve -:.

Mode

Now we have a real bummer: the mode. The mode is defined as the most frequent value in the set. We can safely assume that this is going to return a list of results because there can be many items with the most frequent value. I want to utilize key /.. as in a post from a few days ago to get the histogram, basically, of the input, and then we’ll need to utilize the two parts separately: find the largest count, and then use that to lookup the values with the largest count. For some reason I’m picturing this happening inside a verb that gets called dyadically, with the counts on the left and the groups of identical values on the right. It will be very interesting to see how this compares to the method in the stats/basics package.

So first, let’s get the counts with the values:

    (#,{.)/.~ q =. 9 1 8 16 42 9 1 2 8 7 7 5 4
2 9
2 1
2 8
1 16
1 42
1 2
2 7
1 5
1 4

Now this is slightly inconvenient, because I would really like to have the counts on one side and the values on the other so I can use insert / to put a function between them. We wind up needing to transpose this intermediate so we get the counts in the first entry and the values in the second:

   |: (#,{.)/.~ q
2 2 2 1 1 1 2 1 1
9 1 8 16 42 2 7 5 4

Now we need a function to find the greatest, which is simply >./. Now we have a selection problem, which demands filtering the input, which is a job for copy #. So we test for where the count on the left is greatest, find occurrences of that value in the left, and filter with copy against the right, and the overall solution is this:

   mode =. {{ (([ = >./@[) # ])/|: (#,{.)/.~ y }}
   mode q
9 1 8 7

This could probably be simplified; we could invert copy and remove some parens, for instance. I’m not entirely sure that we need to transpose the array.

Surprisingly, stats/base does not contain a definition for mode, but on the wiki I found this one: mode=: ~. {~ [: (i. >./) #/.~ which only returns the first value. But let’s look at how it works. First, we calculate the count over the self-groups, which gets us the frequencies in the histogram (but not their values). Then we have a capped fork. I don’t find these especially readable yet. Here it is now clear that we are finding the index of the largest value within the frequencies. Then we are inverting take and passing the nub of the original input to the right of take and the index of the largest value to the left. However, the use of i. here guarantees we get one result, no matter how many modes our set has. Thus, we get a pretty incomplete answer on our sample input:

   mode =. ~. {~ [: (i. >./) #/.~
   mode q
9

However, we know the answer we want will have #~ in the position where {~ is, and we know that we should have = where we have i., so let’s try that:

   mode =. ~. #~ [: (= >./) #/.~
   mode q
9 1 8 7

So that appears to have worked, which surprises me a little.

The cap here has actually done us a lot of good because not only has it saved us from writing @:, it has also saved us from quite a few parentheses, as this demonstrates:

   (~. #~ [: (= >./) #/.~) q
9 1 8 7
   (~. #~ (= >./) @: #/.~) q
9 1 8 16 42 2 7 5 4
   (~. #~ (= >./) @: (#/.~)) q
9 1 8 7

Tentative conclusion: /.. key dyad is very cute, but it’s easier to work with one piece of data at a time. Get better at capped forks, since they can be clearer than at @: when @: would require more parentheses. Forks are helpful.

May 8, 2025 j math






Dracula

We had to download a new app to watch the Superbowl back in February, Tubi. For some reason I was thinking about vampires back then, so I decided to watch the Hammer trilogy, The Vampire Lovers, Lust for a Vampire and Twins of Evil, which were all available on there, and then it recommended me another one, Vampyres, but I wouldn’t follow the IMDB link on that one if you’re at work. Somewhere in the middle I also watched Blood from the Mummy’s Tomb. These flicks are somewhere between horror and pornography. I’m not sure what happened, I think my brother had loved Hammer films when he was young, so they were kind of always on my to-watch list and I finally got around to a few, although I don’t think these are the ones he was endorsing exactly. Anyway, it was a vampire kick, these things happen. The movies are neither so great nor so bad that they deserve many words. But not ultimately satisfied, I decided I would watch a great vampire movie from my youth: Bram Stoker’s Dracula.

This should be a watershed movie. It has great effects, many great actors, great sets and was shot right around the time in my life when I feel like movies were really getting good. A certain scene near the beginning also activated certain neurons in your teenage author, the first movie that really imprinted on me for having done so. I had fond memories of this movie, so when I rewatched it with my wife, I was kind of awestruck at how bad it is. Not to dump on anyone in particular, but Keanu is surrounded by people speaking with accents yet his attempts are dismal, as are Winona’s. Lucy’s character is way oversexed. There’s a love plot device involving Dracula which seemed a little hackneyed. On the other hand, Gary Oldman and Anthony Hopkins are really great, but there’s too much happening for them to fix everything wrong here.

Now I started to wonder, is there something inherently not great about vampires? And just how accurate is this movie to the novel? Is the love story between Dracula and pants and/or Mina real? I had heard the novel was difficult to follow because it is told as a series of diary entries.

It took me over a month to get through this in audiobook but this book is great and it deserves its place as a classic. There is some sexiness but it is far from the principal element. There is a whole spooky ship scene which is rather great and doesn’t seem to be adapted often. Renfield is much more interesting in the novel.

The diary structure is not hard to follow at all, but it adds something wonderful to the atmosphere and the entire narrative structure. First, you get to see each character discover the horror in turn. When Jonathan first brushes off the supernatural, the foreboding is quite intense; later, each of the other characters have his same rationality at first and we get to see it stripped away one by one, a real horror moment. Second, it’s a useful device when in the middle of the novel, instead of narrating each character coming up to speed, we just have this exchange of diaries and then everyone is up to speed. Saves Bram from writing a lot of tedious dialogue about so-and-so telling so-and-so things we already know. Third, it allows him to play with our expectations. When we are getting diary entries from Mina in the final act and then we get one from Van Helsing instead, he creates doubt and tension about Mina right before the last entry. And finally, you get a bit of voyeuristic pleasure from the sensation that you’re rifling through a big pile of diaries containing something not quite believable. Indeed, the final words of the novel state this.

As an armchair language aficionado, I can also tell that Bram Stoker was one as well. He spends a great deal of time discussing the languages and accents of different characters. I don’t find his accents, particularly Van Helsing’s, especially believable, but I appreciate the attempt.

So that was great. I recommend reading or listening to the original. The films are pretty much trash. The Audible version with Alan Cumming and Tim Curry is very nice.

May 7, 2025 books movies horror






Gratitude

I’m trying to migrate my IT infrastructure to places where it won’t cost much but will be my problem, at least partly so that I don’t feel as much guilt when I see my friend Bill, and perhaps partly just to clean things up. The first of these was this blog, which I moved to Blot because it’s just ridiculous how easy it is to post here. I realized even having to commit and push to a repo is a barrier to writing for me. Which is, on the one hand, silly, but on the other hand, I don’t exactly revise blog posts over and over again trying to make them pass unit tests cleanly.

The second thing I started moving was my mail, which meant I had to go through ~45,000 emails going back to 2003ish. Yeah, 22 years of email. Let me tell you what I have learned.

My hobbies are pretty cyclical. I guess I always knew that, but would you believe my first exchange with Aaron Hsu was in 2008, on the topic of how to write Scheme on Plan 9? Then we were reintroduced in 2017 when I wrote APL Symbol Fetishism, and then I re-reintroduced myself to him a few weeks ago after hearing him talk on Episode 100 of the Array Cast. Funny to me! I doubt he remembered me from before, at least I hope not, because he found that essay pretty irritating and sent me quite a few corrections for it, which have not (as yet) been implemented.

The next thing I learned is that most of my email comes from my mom. Yeah! My mother is my most frequent email correspondent, and it’s not even close. After her it’s my brother. The frequency has tailed off in recent years, but there are months in my old mail that are just one after another from him, with no one else intervening.

I also have a lot of mail from my friends. And I have found quite a few emails apologizing for setting me off, or replying to my apologies for setting them off. I regret that my depression caused a lot of people to be on eggshells around me, that many people were sliced by shrapnel for the crime of trying to be my friend. I became pretty intolerant, and pretty intolerable, and lost a few dear friends to this. I blanket apology is probably insufficient. But in the compressed timeline, I can really see how much of an asshole I have been, to many, many people.

I was reminded of all the major moments of my life since college. Meeting my wife, having my kids, buying and selling houses, moving, deaths of friends and dear ones. Weird to think that email is this important. But it is.

Anyway, I’m grateful to you for reading this, and if you have emailed me in the past, I am even more grateful. Thank you!

May 7, 2025






Hey, remember this one? It turns out we can do it a little more simply:

   ({.;#)/.~ 'Mississippi'
┌─┬─┐
│M│1│
├─┼─┤
│i│4│
├─┼─┤
│s│4│
├─┼─┤
│p│2│
└─┴─┘

May 1, 2025 j






Some notes about J from today

Charting

   chart =. {{ l {~ >. (<: #l =. u: 16b20, 16b2581 + i.8) * y % >./ y }}
   ] dat =. 10 ? 20 
13 18 9 0 1 4 16 19 3 7
   chart dat
▆█▄ ▁▂▇█▂▃

tsvector in J

A tsvector in Postgres is the tokens of the input and their locations. We can compute this in J by cutting the input and associating it with the number of words of the input. Here is the Postgres example:

SELECT to_tsvector('english', 'a fat  cat sat on a mat - it ate a fat rats');
                  to_tsvector
-----------------------------------------------------
 'ate':9 'cat':3 'fat':2,11 'mat':7 'rat':12 'sat':4

In J:

   txt =. 'a fat  cat sat on a mat - it ate a fat rats'
   w ([,<@])/.. i. #(w =. ;: txt)
┌────┬──────┐
│a   │0 5 10│
├────┼──────┤
│fat │1 11  │
├────┼──────┤
│cat │2     │
├────┼──────┤
│sat │3     │
├────┼──────┤
│on  │4     │
├────┼──────┤
│mat │6     │
├────┼──────┤
│-   │7     │
├────┼──────┤
│it  │8     │
├────┼──────┤
│ate │9     │
├────┼──────┤
│rats│12    │
└────┴──────┘

There are some differences due to cleanup that Postgres does on the input which I’m not doing. But the idea is pretty much the same.

Collecting Data using /..

Considering the following:

   ]data =: _2 ]\ 'Fred';90;'John';75;'Fred';95;'John';85;'Susan';100;'John';86
┌─────┬───┐
│Fred │90 │
├─────┼───┤
│John │75 │
├─────┼───┤
│Fred │95 │
├─────┼───┤
│John │85 │
├─────┼───┤
│Susan│100│
├─────┼───┤
│John │86 │
└─────┴───┘

The documentation provides the following solution:

   ]s =: ({."1 data) <@;/. ({:"1 data)   NB. Use first column as keys; collect second-column values
┌─────┬────────┬───┐
│90 95│75 85 86│100│
└─────┴────────┴───┘
   (~. {."1 data) ,. s                   NB. Reinstall the name
┌─────┬────────┐
│Fred │90 95   │
├─────┼────────┤
│John │75 85 86│
├─────┼────────┤
│Susan│100     │
└─────┴────────┘

I found my own solution:

   (}:"1 data) ([,<@;@])/.. ({:"1 data) 
┌─────┬────────┐ 
│Fred │90 95   │ 
├─────┼────────┤ 
│John │75 85 86│ 
├─────┼────────┤ 
│Susan│100     │ 
└─────┴────────┘

The channel found this:

   (,<@;)/../|:data
┌─────┬────────┐
│Fred │90 95   │
├─────┼────────┤
│John │75 85 86│
├─────┼────────┤
│Susan│100     │
└─────┴────────┘

This notices that we have one column on the left and one on the right and inserts the calculation between them, reducing the amount of tearing apart we have to do by hand.

   (}:"1 (,<@;)/.. {:"1)

This is a tacit definition from someone else on the channel, which reduces parens by using a tacit definition.

April 22, 2025 j