Englishy "Phonetic" Translation of Hebrew
Posted by Daniel Lyons Mon, 20 Nov 2006 06:19:22 GMT
It occurred to me today that Hebrew pronunciation is very regular, and that it would not be particularly difficult to write a program that takes UTF-8 Hebrew text as input and transliterates it to English. For example:
“שלום עליכם” -> “shalom aleiḥem” (the ḥ should look like an h with a dot underneath it, in case you have a weird browser).
Naturally, I already have a graceful recursive algorithm in mind for doing this, but I actually have no idea which of my permitted languages supports Unicode properly.
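A recursive transliterator along these lines might look like the sketch below. It is not the algorithm the post has in mind (which is never spelled out), and Python is not one of the languages under consideration; it is used here only for illustration. The rule table `RULES` and the function `transliterate` are hypothetical names, the table covers only the characters needed for the example phrase, and the sketch assumes vowel-pointed input, since unpointed text does not carry the vowels.

```python
# Partial, illustrative rule table (an assumption, not a complete scheme).
# Keys are one or two code points; two-character rules are tried first.
RULES = {
    "\u05D5\u05B9": "o",    # vav + holam
    "\u05B5\u05D9": "ei",   # tsere + yod
    "\u05E9": "sh",         # shin
    "\u05C1": "",           # shin dot (already covered by "sh")
    "\u05DC": "l",          # lamed
    "\u05DE": "m",          # mem
    "\u05DD": "m",          # final mem
    "\u05E2": "",           # ayin, treated as silent
    "\u05DB": "\u1E25",     # kaf, written as h-with-dot to match the post
    "\u05B8": "a",          # qamats
    "\u05B2": "a",          # hataf patah
    "\u05B6": "e",          # segol
}


def transliterate(text):
    """Transliterate the head of the text, then recurse on the rest."""
    if not text:
        return ""
    # Prefer the longest matching rule (two code points, then one).
    for length in (2, 1):
        chunk = text[:length]
        if chunk in RULES:
            return RULES[chunk] + transliterate(text[len(chunk):])
    # Characters without a rule (spaces, punctuation) pass through unchanged.
    return text[0] + transliterate(text[1:])


# The example phrase with vowel points, written as escapes so the order of
# the combining marks is explicit.
phrase = (
    "\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD"
    " "
    "\u05E2\u05B2\u05DC\u05B5\u05D9\u05DB\u05B6\u05DD"
)
print(transliterate(phrase))  # shalom aleiḥem
```

Plain recursion is fine for short phrases; for long texts Python's recursion limit would force a loop instead, which is less of a concern in the functional languages listed below.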
Here’s my guess:
- Common Lisp: probably supported, but probably gross, like everything else in CL.
- OCaml: should be supported (Europeans), but isn’t native.
- Haskell: no clue, probably not native at least. Maybe Hugs98?
- Erlang: not supported at all, due to underlying string implementation (lists of integers). I could mess around with the binary directly, using binary pattern matching to grep out the Hebrew and produce, say, atoms. Hmm.
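The byte-level idea from the Erlang item can be sketched without any Unicode support from the language. In UTF-8, every code point in the Hebrew block (U+0590 to U+05FF) is encoded as two bytes: a lead byte of 0xD6 or 0xD7 followed by a continuation byte in the range 0x80 to 0xBF. The sketch below is in Python rather than Erlang, and `hebrew_code_points` is a hypothetical helper name; it assumes the input is valid UTF-8.

```python
def hebrew_code_points(data):
    """Pull Hebrew-block code points out of raw UTF-8 bytes."""
    points = []
    i = 0
    while i < len(data):
        lead = data[i]
        is_pair = (
            lead in (0xD6, 0xD7)
            and i + 1 < len(data)
            and 0x80 <= data[i + 1] <= 0xBF
        )
        if is_pair:
            # Two-byte UTF-8: 110xxxxx 10yyyyyy -> xxxxxyyyyyy
            code_point = ((lead & 0x1F) << 6) | (data[i + 1] & 0x3F)
            # Lead byte 0xD6 also covers U+0580..U+058F (Armenian); skip those.
            if code_point >= 0x0590:
                points.append(code_point)
            i += 2
        else:
            # Not a Hebrew sequence: skip one byte and keep scanning.
            i += 1
    return points


raw = "hi \u05E9\u05DC\u05D5\u05DD".encode("utf-8")
print([hex(cp) for cp in hebrew_code_points(raw)])
# ['0x5e9', '0x5dc', '0x5d5', '0x5dd']
```

The resulting integers could then be mapped to whatever representation the host language prefers, such as the atoms suggested above.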
I’ll have to look at this some more. It would be helpful to have in general, I think.
