<p>Back in 2014, when we implemented <a href="https://nqsb.io">TLS</a> in OCaml, at some point
I was bored with TLS. I usually need at least two projects (but not more than 5) at the same time to
procrastinate the one I should do with the other one - it is always more fun to
do what you're not supposed to do. I started to implement another security
protocol (<a href="https://otr.cypherpunks.ca/">Off-the-record</a>, which resulted in
<a href="https://hannesm.github.io/ocaml-otr/doc/Otr.html">ocaml-otr</a>) on my own,
applying what I learned while co-developing TLS with David. I was eager to
actually deploy our TLS stack: using it with a web server (see <a href="/Posts/nqsbWebsite">this post</a>) is fun, but it exercises only one half
of the state machine (the server side) and usually short-lived connections
(which uncovers lots of issues with connection establishment) - not the client side,
and no long-lived connections (which may uncover other kinds of issues, such as
memory leaks).</p>
<p>To use the stack, I needed to find an application I use on a daily basis (thus
I'm eager to get it up and running if it fails to work). A mail client or a web
client is just a bit too big for a spare-time project (maybe not ;). Another
communication protocol I use daily is jabber, or
<a href="https://en.wikipedia.org/wiki/Xmpp">XMPP</a>. Back then I used
<a href="https://mcabber.com">mcabber</a> inside a terminal, which is a curses based client
written in C.</p>
<p>I started to develop <a href="https://github.com/hannesm/jackline">jackline</a> (first
commit is 13th November 2014), a terminal based XMPP client in
<a href="/Posts/OCaml">OCaml</a>. This is a report of a
work-in-progress (unreleased, but publicly available!) software project. I'm
not happy with the code base, but nevertheless consider it to be a successful
project: dozens of friends are using it (no exact numbers), I got <a href="https://github.com/hannesm/jackline/graphs/contributors">contributions from other people</a>
(more than 25 commits from more than 8 individuals), I use it on a daily basis
<p>Authentication is done via a TLS channel (where your client should authenticate
the server), and via SASL, so that the server authenticates your client. I
<a href="https://berlin.ccc.de/~hannes/secure-instant-messaging.pdf">investigated in 2008</a> (in German)
which clients and servers use which authentication methods (I hope the state of
certificate verification improved in the last decade).</p>
<p>End-to-end encryption is achievable using OpenPGP (rarely used in my group of
friends) via XMPP, or <a href="https://otr.cypherpunks.ca/">Off-the-record</a>, which was
pioneered over XMPP, and is still in wide use - it gave rise to forward secrecy:
if your long-term (stored on disk) asymmetric keys get seized or stolen, they
are not sufficient to decrypt recorded sessions (you can't derive the session
key from the asymmetric keys) -- but the encrypted channel is still
authenticated (once you verified the public key via a different channel or a
shared secret, using the <a href="https://en.wikipedia.org/wiki/Socialist_millionaire">Socialist millionaires problem</a>).</p>
<p>OTR does not support offline messages (the session keys may already be destroyed
by the time the communication partner reconnects and receives the stored
messages), and thus recently <a href="https://conversations.im/omemo/">omemo</a> was
developed. Other messaging protocols (Signal, Threema) are not really open,
support no federation, but have good support for group encryption and offline
messaging. (There is a <a href="https://www.cypherpunks.ca/~iang/pubs/secmessaging-oakland15.pdf">nice overview of secure messaging and threats.</a>)</p>
<p>There is (AFAIK) no encrypted group messaging via XMPP; also the XMPP server
contains lots of sensitive data: your address book (buddy list), together with
offline messages, nicknames you gave to your buddies, subscription information,
and information every time you connect (research on privacy-preserving presence
protocols has been done, but the results are not widely used AFAIK,
e.g. <a href="http://cacr.uwaterloo.ca/techreports/2014/cacr2014-10.pdf">DP5</a>).</p>
<p>See <a href="https://en.wikipedia.org/wiki/Comparison_of_XMPP_clients">wikipedia</a> for an
extensive comparison (which does not mention jackline :P).</p>
<p>A more opinionated analysis is that you were free to choose between C - where
all code has to do manual memory management and bounds checking - with ncurses
(or GTK) and OpenSSL (or GnuTLS) using libpurple (or some other barely
maintained library which tries to unify all instant messaging protocols), or
Python - where you barely know upfront what it will do at runtime - with GTK and
some OpenSSL, or even JavaScript - where external scripts can dynamically modify
the prototype of everything at runtime (and thus modify code arbitrarily,
violating invariants) - calling out to C libraries (NSS, maybe libpurple, who
knows?).</p>
<p>Due to complex APIs of transport layer security, certificate verification is
<a href="https://pidgin.im/news/security/?id=91">still not always done correctly</a> (that's just
one example, you'll find more) - and even where it is, the client may not allow custom trust anchors
or certificate fingerprint based verification - which are crucial for
federated operation without a centralised trust authority.</p>
<p>Large old code bases usually gather dust and bitrot - and if you add
patch by patch from random people on the Internet, you have to deal with the most
common bug: insufficient checking of input (or output data, <a href="https://dev.gajim.org/gajim/gajim-plugins/issues/145">if you encrypt only the plain body, but not the marked up one</a>). In some
programming languages this easily <a href="https://pidgin.im/news/security/?id=64">leads to execution of remote code</a>, other programming languages steal the
work from programmers by deploying automated memory management (finally machines
take our work away! :)) - also named garbage collection, often used together
with automated bounds checking -- this doesn't mean that you're safe - there are
still logical flaws, and integer overflows (and funny things which happen at
<p>My upfront motivation was to write and use an XMPP client tailored to my needs.
I personally don't use many graphical applications (coding in emacs, mail via
thunderbird, firefox, mplayer, mupdf), but stick mostly to terminal
applications. I additionally don't use any terminal multiplexer (saw too many
active <code>screen</code> sessions on remote servers where people left root shells open).</p>
<p>The <a href="https://github.com/hannesm/jackline/commit/9322ceefa9a331fa92a6bf253e8d8f010da2229c">goal was from the beginning</a>
to write a "minimalistic graphical user interface for a secure (fail hard)
and trustworthy XMPP client". By <em>fail hard</em> I mean exactly that: if it can't
authenticate the server, don't send the password. If there is no
end-to-end encrypted session, don't send the message.</p>
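<p>A minimal sketch of what <em>fail hard</em> can look like in OCaml types. All names here
(<code>server_auth</code>, <code>e2e_session</code>, <code>send_password</code>, <code>send_message</code>) are
hypothetical and not taken from jackline; <code>encrypt</code> is a stand-in that only keeps the
sketch runnable. The idea is that the only way to obtain something sendable goes
through a check of the authentication or session state.</p>

```ocaml
(* Hypothetical types for illustration; none of these names come from jackline. *)
type server_auth =
  | Authenticated              (* the server's certificate was verified *)
  | Unauthenticated of string  (* reason why verification failed *)

type e2e_session =
  | Encrypted of string        (* opaque handle of an established session *)
  | Plaintext                  (* no end-to-end encrypted session *)

type outgoing =
  | Password of string
  | Ciphertext of string

(* Stand-in for real encryption, only here to keep the sketch self-contained. *)
let encrypt _session msg = "?OTR:" ^ msg

(* The password is only released if the server was authenticated. *)
let send_password auth password =
  match auth with
  | Authenticated -> Ok (Password password)
  | Unauthenticated reason -> Error ("refusing to send password: " ^ reason)

(* A message is only released if there is an encrypted session. *)
let send_message session msg =
  match session with
  | Encrypted s -> Ok (Ciphertext (encrypt s msg))
  | Plaintext -> Error "refusing to send message: no end-to-end encrypted session"
```

<p>In such a design the caller has to handle the <code>Error</code> case explicitly, so an
insecure fallback cannot happen silently.</p>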
<p>As a user of (unreleased) software, there is a single property which I like to
preserve: continue to support all data written to persistent storage. Even
during large refactorings, ensure that data on the user's disk will also be
correctly parsed. There is nothing worse than having to manually configure an
application after an update. The solution is straightforward: put a version in
every file you write, and keep readers for all versions ever written around.
My favourite marshalling format (human readable, structured) is still
S-expressions - luckily there is a
<a href="https://github.com/janestreet/sexplib">sexplib</a> in OCaml for handling these.
Additionally, once the initial configuration file has been created (e.g. interactively with the application), the application
does no further writes to the config file. Users can make arbitrary modifications to the file,
and restart the application (and they can make changes while the application is running).</p>
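<p>The versioning scheme can be sketched roughly as follows. This is not jackline's
code: to stay within the standard library it uses a made-up line-based format
instead of S-expressions, and the <code>config</code> record, its fields, and both
readers are assumptions for illustration. The first line of the data is the
version, which selects the reader.</p>

```ocaml
(* Hypothetical configuration record; the field names are illustrative only. *)
type config = { jid : string; port : int; priority : int }

(* Reader for version 1, which stored "jid port" and had no priority yet. *)
let read_v1 body =
  match String.split_on_char ' ' body with
  | [ jid; port ] ->
    Option.map (fun port -> { jid; port; priority = 0 }) (int_of_string_opt port)
  | _ -> None

(* Reader for version 2, which stores "jid port priority". *)
let read_v2 body =
  match String.split_on_char ' ' body with
  | [ jid; port; priority ] ->
    (match int_of_string_opt port, int_of_string_opt priority with
     | Some port, Some priority -> Some { jid; port; priority }
     | _ -> None)
  | _ -> None

(* Readers for every version ever written are kept around. *)
let readers = [ (1, read_v1); (2, read_v2) ]

let current_version = 2

(* The writer always emits the current version on the first line. *)
let write c =
  Printf.sprintf "%d\n%s %d %d" current_version c.jid c.port c.priority

(* The reader dispatches on the version found on the first line. *)
let read data =
  match String.index_opt data '\n' with
  | None -> None
  | Some i ->
    let version = String.sub data 0 i in
    let body = String.sub data (i + 1) (String.length data - i - 1) in
    match int_of_string_opt version with
    | None -> None
    | Some v ->
      match List.assoc_opt v readers with
      | None -> None
      | Some reader -> reader body
```

<p>Adding a version 3 then means adding one reader and bumping <code>current_version</code>;
files written by older versions still parse.</p>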
<p>I also appreciate another property of software: don't ever transmit any data or
open a network connection unless initiated by the user (this means no autoconnect on startup, and no "user is typing" indications). Don't be obviously
fingerprintable. A more mainstream demand is surely that software should not
phone home - that's why I don't know how many people are using jackline: estimates
based on friends' opinions are in the hundreds of users, and I personally know at least
several dozen.</p>
<p>As written <a href="/Posts/OperatingSystem">earlier</a>, I often take
a look at the trusted computing base of a computer system. Jackline's trusted
computing base consists of the client software itself, its OCaml dependencies
(including OTR, TLS, tty library, ...), then the OCaml runtime system, which
uses some parts of libc, and a whole UNIX kernel underneath -- one goal is to
have jackline running as a unikernel (then you connect via SSH or telnet and
TLS).</p>
<p>There are only a few features I need in an XMPP client: single account, strict
validation, delivery receipts, notification callback, being able to deal with
friends logged in multiple times with wrongly set priorities - and end-to-end
encryption. I don't need inline HTML, avatar images, my currently running
music, leaking timezone information, etc. I explicitly don't want to import any
private key material from other clients and libraries, because I want to ensure
that the key was generated by a good random number generator (read <a href="https://mirage.io/blog/mirage-entropy">David's blog article</a> on randomness and entropy).</p>
<p>The security story is crucial: always do strict certificate validation, fail
hard, make it noticeable to the user if they're doing insecure communication.
Only a few people are into reading out loud their OTR public key fingerprint, and
SMP is not trivial -- thus jackline records the known public keys together with
a set of resources used, a session count, and blurred timestamps (accuracy: day)
when the public key was initially used and when it was used the last time.</p>
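<p>A rough sketch of such a record, assuming a hypothetical <code>known_key</code> type and
helper names (<code>blur</code>, <code>record_session</code>) that are not taken from jackline.
Timestamps are seconds since the epoch and are rounded down to the start of the
day before they are stored.</p>

```ocaml
module Resources = Set.Make (String)

(* Hypothetical record of a known public key; field names are illustrative. *)
type known_key = {
  fingerprint : string;
  resources : Resources.t;  (* XMPP resources this key was seen on *)
  session_count : int;
  first_used : float;       (* blurred to day accuracy *)
  last_used : float;        (* blurred to day accuracy *)
}

let seconds_per_day = 86400.

(* Round a timestamp (seconds since the epoch) down to the start of its day. *)
let blur t = floor (t /. seconds_per_day) *. seconds_per_day

(* Record one more session for [fingerprint], adding the key if it is unknown. *)
let record_session ~now ~resource fingerprint known =
  let day = blur now in
  match List.partition (fun k -> String.equal k.fingerprint fingerprint) known with
  | [], rest ->
    { fingerprint;
      resources = Resources.singleton resource;
      session_count = 1;
      first_used = day;
      last_used = day } :: rest
  | k :: _, rest ->
    { k with
      resources = Resources.add resource k.resources;
      session_count = k.session_count + 1;
      last_used = day } :: rest
```

<p>Only the day survives in storage, so the record reveals when a key was in use but
not at which time of day.</p>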
<p>I'm pragmatic - if there is some server (or client) deployed out there which
violates (my interpretation of) the specification, I'm happy to <a href="https://github.com/hannesm/ocaml-otr/issues/10">implement workarounds</a>. Initially I
worked roughly one day a week on jackline.</p>
<p>To not release the software for some years was something I learned from the
<a href="https://common-lisp.net/project/slime/">slime</a> project (<a href="https://www.youtube.com/watch?v=eZDWJfB9XY4">watch Luke's presentation from 2013</a>) - if
there's someone complaining about an issue, fix it within 10 minutes and ask
them to update. This only works if each user compiles the git version anyways.</p>
<p>Initially I targeted GTK with OCaml, but that excursion only lasted <a href="https://github.com/hannesm/jackline/commit/17b674130f7b1fcf2542eb5e0911a40b81fc724e">two weeks</a>,
when I switched to a <a href="https://github.com/diml/lambda-term">lambda-term</a> terminal
one waiting for network input (<a href="https://github.com/hannesm/jackline/blob/ec8f8c01d6503bf52be263cd319ef21f2b62ff2e/cli/cli_state.ml#L202"><code>Connect</code>, including reconnecting timers</a>),
<p>Only recently I solved the copy and paste issue by <a href="https://github.com/hannesm/jackline/commit/cab34acab004023911997ec9aee8b00a976af7e4">delaying all redraws by 40ms</a>,
<a href="https://github.com/hannesm/jackline/issues/115">support for multiple accounts</a>
(tbh, these are all
things I'd like to have as well).</p>
<p>But there's some mess to clean up:</p>
<ol>
<li>
<p>The <a href="https://github.com/ermine/xmpp">XMPP library</a> makes heavy use of
functors (to abstract over the concrete IO, etc.), and embeds IO deep inside it.
I do prefer (see e.g. <a href="https://usenix15.nqsb.io">our TLS paper</a>, or <a href="/Posts/ARP">my ARP post</a>) these days to have a pure interface for
the protocol implementation, providing explicit input (state, event, data), and
output (state, action, potentially data to send on network, potentially data to
process by the application). The <a href="https://github.com/hannesm/xmpp/blob/eee18bd3dd343550169969c0b45548eafd51cfe1/src/sasl.ml">sasl implementation</a>
is partial and deeply embedded. The XML parser is as well deeply embedded (and
<a href="https://github.com/hannesm/jackline/issues/8#issuecomment-67773044">has some issues</a>).
The library needs to be torn apart (something I have been procrastinating on for more than
a year). Once it is pure, the application can have full control over when to
call IO (and especially use the same protocol implementation as well for registering a
new account - <a href="https://github.com/hannesm/jackline/issues/12">currently not supported</a>).</p>
</li>
<li>
<p>On the frontend side (the <code>cli</code> subfolder), there is too much knowledge of
XMPP. It should be more general, and be reusable (some bits and pieces are
notty utilities, such as wrapping a string to fit into a text box of specific
should be part of the earlier mentioned <code>cli_state</code>, also contacts should be a map, not a Hashtbl (took me some time to learn).</p>
</li>
<li>
<p>Having jackline self-hosted as a MirageOS unikernel. I've implemented a
<a href="https://github.com/hannesm/telnet">telnet</a> server, and there is a
<a href="https://github.com/pqwy/notty/tree/mirage">notty branch</a> to be used with the telnet
server. But there is (right now) no good story for persistent mutable storage.</p>
</li>
<li>
<p>Jackline predates some very elegant libraries, such as
<a href="http://erratique.ch/software/logs">logs</a> and
<a href="http://erratique.ch/software/astring">astring</a>, even
<a href="http://caml.inria.fr/pub/docs/manual-ocaml/libref/Pervasives.html#TYPEresult">result</a> - since 4.03 part of Pervasives - is not used.
Clearly, other libraries (such as TLS) do not yet use result.</p>
</li>
<li>
<p>After looking in more depth at the logs library, and at user interfaces - I
envision the graphical parts to be (mostly!?) a viewer of logs, and a command
shell (using a control interface, maybe
<a href="https://github.com/mirage/ocaml-9p/">9p</a>): Multiple layers (of a protocol),
slightly related (by <a href="http://erratique.ch/software/logs/doc/Logs.Tag.html">tags</a> - such as the OTR session), and have the layers be visible to users (see also
<a href="https://github.com/mirleft/tlstools">tlstools</a>), a slightly different interface
of similarly structured data. In jackline I'd like to e.g. see all messages of
a single OTR session (see <a href="https://github.com/hannesm/jackline/issues/111">issue</a>), or hide the presence messages in a multi-user chat,
investigate the high-level message, its XML encoded stanza, TLS encrypted
frames, the TCP flow, all down to the ethernet frames sent over the wire - also
viewable as sequence diagram and other suitable (terminal) presentations (TCP
window size maybe in a size over time diagram).</p>
</li>
<li>
<p>Once the API between the sources (contacts, hosts) and the UI (what to
display, where and how to trigger notifications, where and how to handle global
changes (such as reconnect)) is clear and implemented, commands need to be
reinvented (some, such as navigation commands and emacs keybindings, are generic
to the user interface, others are specific to XMPP and/or OTR): a new transport
(IRC) or end-to-end crypto protocol (omemo) - should be easy to integrate (with