Hannes Mehnert 2016-04-18 00:11:12 +01:00
parent bb87160923
commit c1efa3c5e1

@ -20,62 +20,62 @@ from designing over experimenting to debugging why it does not do what I want.
In the end, the computer is dumb and executes only what you (or code from In the end, the computer is dumb and executes only what you (or code from
someone else which you rely on) tell it to do. someone else which you rely on) tell it to do.
To not have to write assembly code manually, programming languages were To abstract from assembly code, which is not portable, programming languages were
developed as an abstraction. There exist different flavours which vary in developed. Different flavoured languages vary in
expressive power and static guarantees. Lots claim to be general purpose or expressive power and static guarantees. Many claim to be general purpose or
systems languages; whether it is convenient to develop in depends on the choices systems languages; depending on the choices of
the language designer made, and whether there is sufficient tooling around it. the language designer and tooling around the language, it is a language which lets you conveniently develop programs in.
A language design decides on the builtin abstraction mechanisms, each of which A language designer decides on the builtin abstraction mechanisms, each of which
is both a burden and a blessing. They might be interfering (bad design), is both a burden and a blessing: it might be interfering (which to use? `for` or `while`, `trait` or `object`),
orthogonal (composable), or even synergistic (interfere in a positive way, such as anonymous functions and higher order functions, or ADT and pattern matching). Another choice is whether the language includes a type orthogonal (one way to do it), or even synergistic (higher order functions and anonymous functions). Another choice is whether the language includes a type
system, and if the developer might cheat on it. A strong static type system system, and if the developer can cheat on it (by allowing arbitrary type casts, a *weak* type system). A strong static type system
allows a developer to encode invariants, without the need to defer to runtime allows a developer to encode invariants, without the need to defer to runtime
assertions. Type systems differ in their expressive power, the new kid on the assertions. Type systems differ in their expressive power ([dependent typing](https://en.wikipedia.org/wiki/Dependent_type) are the hot research area at the moment). Tooling depends purely
block is [dependent typing](https://en.wikipedia.org/wiki/Dependent_type), which on the community size, natural selection will prevail the useful tools
allows to encode values in types (list of length 3). Tooling depends purely (community size gives inertia to other factors: demand for libraries, package manager, activity on stack overflow, etc.).
on the community size, natural selection will prevail the useful tools (size gives rise to other factors such as inertia, activity on stack overflow).
## Why OCaml? ## Why OCaml?
As already mentioned in [other](https://hannes.nqsb.io/Posts/About) As already mentioned in [other](https://hannes.nqsb.io/Posts/About)
[articles](https://hannes.nqsb.io/Posts/OperatingSystem) here, it is a [articles](https://hannes.nqsb.io/Posts/OperatingSystem) here, it is a
combination of large enough community, runtime performance, modularity, combination of sufficiently large community, runtime stability and performance, modularity,
well-thought abstraction mechanisms, age (it recently turned 20), and functional features. carefully thought out abstraction mechanisms, maturity (OCaml recently turned 20), and functional features.
The latter is squishy, I'll try to explain it a bit: you define your concrete The latter is squishy, I'll try to explain it a bit: you define your concrete
*data types* as *products* (`int * int` for a pair of integers), *records* (`{ *data types* as *products* (`int * int`, a tuple of integers), *records* (`{
foo : int ; bar : int }` in case you want to name fields), variants (sum types, tagged union in C `type 'a list = Nil | Cons of 'a * 'a list`), and compose them by foo : int ; bar : int }` to name fields), sums (`type state = Initial | WaitingForKEX | Established`, or variants, or tagged union in C).
using [*algebraic data types*](https://en.wikipedia.org/wiki/Algebraic_data_type). Whenever you have a These are called [*algebraic data types*](https://en.wikipedia.org/wiki/Algebraic_data_type). Whenever you have a
state machine, you can encode the state as an algebraic data type and use a state machine, you can encode the state as a variant and use a
`match` to handle the cases. The compiler checks whether your match is complete pattern match to handle the different cases. The compiler checks whether your pattern match is complete
(contains a line for each member of the ADT). Another important aspect of (contains a line for each member of the variant). Another important aspect of
functional programming is that you can pass functions to other functions functional programming is that you can pass functions to other functions
(*higher-order functions*). Also, *recursion* is fundamental for functional (*higher-order functions*). Also, *recursion* is fundamental for functional
programming (there's no need for one or multiple programming language constructs programming: a function calls itself -- combined with a variant type (such as
to provide loops), instead functions call themselves (hopefully with some `type 'a list = Nil | Cons of 'a * 'a list`) it is trivial to show termination.
decreasing argument, thus they will terminate).
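The ideas in this paragraph (variants for state machines, exhaustive pattern matching, higher-order functions, and recursion over a variant) can be sketched in a few lines. This is only an illustration: the `state` constructors come from the text, while the `event` type, the `step` function, and the name `mylist` (renamed to avoid shadowing the built-in `list`) are assumptions made up for the example.

```ocaml
(* Connection state as a variant; the constructor names follow the text. *)
type state = Initial | WaitingForKEX | Established

(* Hypothetical events, invented for this sketch. *)
type event = Hello | KeyExchange | Close

(* One transition of the state machine. The last case names every state
   explicitly, so adding a new constructor to [state] makes the compiler
   warn that this match is no longer exhaustive. *)
let step state event =
  match state, event with
  | Initial, Hello -> WaitingForKEX
  | WaitingForKEX, KeyExchange -> Established
  | _, Close -> Initial
  | (Initial | WaitingForKEX | Established), _ -> state

(* A polymorphic list as a recursive variant. *)
type 'a mylist = Nil | Cons of 'a * 'a mylist

(* Every recursive call receives the strictly smaller tail, so the
   function terminates on any finite list. *)
let rec length = function
  | Nil -> 0
  | Cons (_, tl) -> 1 + length tl

(* A higher-order function: [f] is passed in as an argument. *)
let rec fold f acc = function
  | Nil -> acc
  | Cons (x, tl) -> fold f (f acc x) tl
```

For instance, `fold (+) 0 (Cons (1, Cons (2, Nil)))` sums the elements by passing the addition function to `fold`.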
A real program is boring without *side effects*, such as mutable state and *Side effects* make the program interesting, because they
input/output. These are the bits which make the program interesting by communicate with other systems or humans. Side effects should be isolated and
communicating with other systems or humans. They should be isolated and explicitly stated (in the type!). Algorithm and protocol
explicitly stated (e.g. in the type). Especially algorithm or protocol implementations should not deal with side effects internally, but leave this to an
implementations should not handle side effects internally, but leave this to an effectful layer on top of it. The internal pure functions
effectful layer on top of it, separating the concerns. Those pure functions (which receive arguments and return values, no other way of communication) inside
(which get arguments and return a value, no other way of communication) inside
preserve [*referential preserve [*referential
transparency*](https://en.wikipedia.org/wiki/Referential_transparency_%28computer_science%29). transparency*](https://en.wikipedia.org/wiki/Referential_transparency_%28computer_science%29).
Modularity helps to separate the concerns.
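A minimal sketch of this separation, assuming a made-up line-based counter protocol: `parse` and `handle` are pure (their results depend only on their arguments), and `loop` is the only place that touches channels. All names here are hypothetical, and the code assumes OCaml 4.08 or later for `Option.iter`.

```ocaml
(* Pure core: a tiny counter protocol, invented for this sketch. *)
type command = Incr | Get

(* parse : string -> command option *)
let parse line =
  match String.trim line with
  | "incr" -> Some Incr
  | "get" -> Some Get
  | _ -> None

(* handle : int -> command -> int * string option
   Returns the new counter and an optional reply; no I/O happens here. *)
let handle counter = function
  | Incr -> (counter + 1, None)
  | Get -> (counter, Some (string_of_int counter))

(* Effectful layer: all input and output lives in this function.
   The End_of_file exception raised by input_line is handled right at
   the boundary and never reaches the pure functions. *)
let rec loop ic oc counter =
  match input_line ic with
  | exception End_of_file -> ()
  | line ->
    (match parse line with
     | None ->
       output_string oc "error\n";
       loop ic oc counter
     | Some cmd ->
       let counter, reply = handle counter cmd in
       Option.iter (fun r -> output_string oc (r ^ "\n")) reply;
       loop ic oc counter)
```

Running `loop stdin stdout 0` would serve the protocol on the terminal, while `parse` and `handle` can be tested without any channel at all.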
The holy grail is [declarative programming](https://en.wikipedia.org/wiki/Declarative_programming), write *what* The holy grail is [declarative programming](https://en.wikipedia.org/wiki/Declarative_programming), write *what*
a program should achieve, not *how to* achieve it (like it is done imperatively). a program should achieve, not *how to* achieve it (like often done in an imperative language).
OCaml has an object and class system, which I do not use. OCaml also contains OCaml has an object and class system, which I do not use. OCaml also contains
exceptions (and annoyingly the standard library (e.g. `List.find`) is full of exceptions (and annoyingly the standard library (e.g. `List.find`) is full of
them), which I avoid, and libraries should not expose any exception (apart from out of memory). If your them), which I avoid as well. Libraries should not expose any exception (apart from out of memory, a really exceptional situation). If your
processing code might end up in an error state (common for parsers of input code might end up in an error state (common for parsers which process input
received via network), return a value of an algebraic data type with two from the network), return a variant type as value (`type ('a, 'b) result = Ok of 'a | Error of 'b`).
constructors, `Ok` and `Error`. In this way, the caller has to handle That way, the caller has to handle
both cases explicitly. both the success and failure case explicitly.
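As a sketch of this style, here is a small parser that returns a result instead of raising. It uses the `result` type with the `Ok` and `Error` constructors that ship with the standard library since OCaml 4.03, and `int_of_string_opt` from OCaml 4.05. The `error` type, `parse_port`, and `describe` are hypothetical names chosen for the example.

```ocaml
(* Hypothetical error cases for this sketch. *)
type error = Not_a_number of string | Out_of_range of int

(* parse_port : string -> (int, error) result
   No exception escapes: every failure is an ordinary value. *)
let parse_port s =
  match int_of_string_opt s with
  | None -> Error (Not_a_number s)
  | Some n when n < 0 || n > 65535 -> Error (Out_of_range n)
  | Some n -> Ok n

(* The caller has to deal with success and failure; leaving out a case
   triggers a non-exhaustive match warning. *)
let describe s =
  match parse_port s with
  | Ok port -> Printf.sprintf "port %d" port
  | Error (Not_a_number raw) -> Printf.sprintf "not a number: %S" raw
  | Error (Out_of_range n) -> Printf.sprintf "out of range: %d" n
```

Using a dedicated variant for the error side, rather than a plain string, lets the caller match on the individual failure causes.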
## Where to start? ## Where to start?
@ -84,12 +84,12 @@ tutorials](https://ocaml.org/learn/tutorials/) and examples, including
[introductory [introductory
material](https://ocaml.org/learn/tutorials/get_up_and_running.html) how to get material](https://ocaml.org/learn/tutorials/get_up_and_running.html) how to get
started with a new library. Editor integration (at least for emacs, vim, and started with a new library. Editor integration (at least for emacs, vim, and
atom) via [merlin](https://github.com/the-lambda-church/merlin/wiki) is atom) is via [merlin](https://github.com/the-lambda-church/merlin/wiki)
available. available.
There are also [programming There are also [programming
guidelines](https://ocaml.org/learn/tutorials/guidelines.html) available, which guidelines](https://ocaml.org/learn/tutorials/guidelines.html), best to re-read
is worth a read periodically. on a regular schedule.
A very good starting book is [OCaml from the very A very good starting book is [OCaml from the very
beginning](http://ocaml-book.com/) to learn the functional ideas in OCaml (also beginning](http://ocaml-book.com/) to learn the functional ideas in OCaml (also
@ -97,43 +97,44 @@ its successor [More
OCaml](http://ocaml-book.com/more-ocaml-algorithms-methods-diversions/)). OCaml](http://ocaml-book.com/more-ocaml-algorithms-methods-diversions/)).
Another good book is [real world OCaml](https://realworldocaml.org), though it Another good book is [real world OCaml](https://realworldocaml.org), though it
is focussed around the "core" library (which I do not recommend due to its is focussed around the "core" library (which I do not recommend due to its
size). huge size).
[Opam](https://opam.ocaml.org) is the OCaml package manager. [Opam](https://opam.ocaml.org) is the OCaml package manager.
The [opam repository](https://opam.ocaml.org/packages/) contains over 1000 The [opam repository](https://opam.ocaml.org/packages/) contains over 1000
libraries. The quality varies, I personally like the small libraries done by libraries. The quality varies, I personally like the small libraries done by
[Daniel Bünzli](http://erratique.ch/software), as well as our [Daniel Bünzli](http://erratique.ch/software), as well as our
[nqsb](https://nqsb.io) libraries (see [mirleft](https://github.com/mirleft)), [nqsb](https://nqsb.io) libraries (see [mirleft org](https://github.com/mirleft)),
[notty](https://github.com/pqwy/notty). A concise library (not much code), [notty](https://github.com/pqwy/notty). A concise library (not much code),
including tests, documentation, etc. is including tests, documentation, etc. is
[hkdf](https://github.com/hannesm/ocaml-hkdf). For testing I currently prefer [hkdf](https://github.com/hannesm/ocaml-hkdf). For testing I currently prefer
[alcotest](https://github.com/mirage/alcotest). For cooperative tasks, [alcotest](https://github.com/mirage/alcotest). For cooperative tasks,
[lwt](https://github.com/ocsigen/lwt) is decent (though a bit convoluted by [lwt](https://github.com/ocsigen/lwt) is decent (though it is a bit convoluted by
integrating too much). integrating too many features).
I try to stay away from big libraries such as ocamlnet, core, extlib, batteries. I try to stay away from big libraries such as ocamlnet, core, extlib, batteries.
When I develop a library I rather not force any use to depend on such a large When I develop a library I do not want to force anyone into using such large
code base. Since opam is widely used, distributing libraries became easier, code bases. Since opam is widely used, distributing libraries became easier,
thus the trend is towards small libraries (such as thus the trend is towards small libraries (such as
[astring](http://erratique.ch/software/astring) and [astring](http://erratique.ch/software/astring),
[ptime](http://erratique.ch/software/ptime)). [ptime](http://erratique.ch/software/ptime),
[PBKDF](https://github.com/abeaumont/ocaml-pbkdf), [scrypt](https://github.com/abeaumont/ocaml-scrypt-kdf)).
What is needed depends on your concrete use case or plan. There are lots of What is needed? This depends on your concrete goal. There are lots of
issues in lots of libraries, the MirageOS project also has a [list of issues in lots of libraries, the MirageOS project also has a [list of
projects](https://github.com/mirage/mirage-www/wiki/Pioneer-Projects) which Pioneer projects](https://github.com/mirage/mirage-www/wiki/Pioneer-Projects) which
would be useful. I personally would like to have a native [simple would be useful to have. I personally would like to have a native [simple
authentication and security layer (SASL)](https://tools.ietf.org/html/rfc4422) authentication and security layer (SASL)](https://tools.ietf.org/html/rfc4422)
implementation in OCaml (amongst other things, such as using an [ELF section for implementation in OCaml soon (amongst other things, such as using an [ELF section for
data](https://github.com/mirage/mirage/issues/489), and data](https://github.com/mirage/mirage/issues/489),
[strtod](https://github.com/mirage/mirage-platform/issues/118)). [strtod](https://github.com/mirage/mirage-platform/issues/118)).
A [dashboard](https://github.com/rudenoise/mirage-dashboard) for MirageOS is A [dashboard](https://github.com/rudenoise/mirage-dashboard) for MirageOS is
under development, which will hopefully ease tracking of MirageOS active under development, which will hopefully ease tracking of what is being actively
development. I set up an [atom developed within MirageOS. Because I'm impatient, I set up an [atom
feed](https://github.com/miragebot.private.atom?token=ARh4hnusZ1kC_bQ_Q6_HUzQteEEGTqy8ks61Fm2LwA==) feed](https://github.com/miragebot.private.atom?token=ARh4hnusZ1kC_bQ_Q6_HUzQteEEGTqy8ks61Fm2LwA==)
which watches several MirageOS-related repositories. which watches lots of MirageOS-related repositories.
I hope I gave some insight into OCaml. A longer read is our Usenix 2015 paper I hope I gave some insight into OCaml, and why I currently enjoy it. A longer read on applicability of OCaml is our Usenix 2015 paper
[Not-quite-so-broken TLS: lessons in re-engineering a security protocol [Not-quite-so-broken TLS: lessons in re-engineering a security protocol
specification and specification and
implementation](https://nqsb.io/nqsbtls-usenix-security15.pdf). I'm interested in feedback, either via implementation](https://nqsb.io/nqsbtls-usenix-security15.pdf). I'm interested in feedback, either via