Built from 3deddd702f
This commit is contained in:
parent
890055c942
commit
84a3091898
6 changed files with 117 additions and 38 deletions
|
@ -53,7 +53,7 @@ and eventually I decided to take a more rigorous approach to it.</p>
|
||||||
In OCaml a string is just a sequence of bytes.
|
In OCaml a string is just a sequence of bytes.
|
||||||
Any bytes, even <code>NUL</code> bytes.
|
Any bytes, even <code>NUL</code> bytes.
|
||||||
There is no concept of unicode in OCaml strings.<br>
|
There is no concept of unicode in OCaml strings.<br>
|
||||||
In Python there is the <code>str</code> type which is a sequence of Unicode code points[^python-bytes].
|
In Python there is the <code>str</code> type which is a sequence of Unicode code points<sup><a href="#fn-python-bytes" id="ref-1-fn-python-bytes" role="doc-noteref" class="fn-label">[1]</a></sup>.
|
||||||
I can recommend reading Daniel Bünzli's <a href="https://ocaml.org/p/uucp/13.0.0/doc/unicode.html#minimal">minimal introduction to Unicode</a>.
|
I can recommend reading Daniel Bünzli's <a href="https://ocaml.org/p/uucp/13.0.0/doc/unicode.html#minimal">minimal introduction to Unicode</a>.
|
||||||
Already here there is a significant gap in semantics between Python and OCaml.
|
Already here there is a significant gap in semantics between Python and OCaml.
|
||||||
For many practical purposes we can get away with using the OCaml <code>string</code> type and treating it as a UTF-8 encoded Unicode string.
|
For many practical purposes we can get away with using the OCaml <code>string</code> type and treating it as a UTF-8 encoded Unicode string.
|
||||||
|
@ -81,7 +81,7 @@ The string literal can optionally have a prefix character that modifies what typ
|
||||||
That means backslash escape sequences are not interpreted.
|
That means backslash escape sequences are not interpreted.
|
||||||
In my experiments they seem to be quasi-interpreted, however!
|
In my experiments they seem to be quasi-interpreted, however!
|
||||||
The string <code>r"\"</code> is considered unterminated!
|
The string <code>r"\"</code> is considered unterminated!
|
||||||
But <code>r"\""</code> is fine and is interpreted as <code>'\\"'</code>[^raw-escape-example].
|
But <code>r"\""</code> is fine and is interpreted as <code>'\\"'</code><sup><a href="#fn-raw-escape-example" id="ref-1-fn-raw-escape-example" role="doc-noteref" class="fn-label">[2]</a></sup>.
|
||||||
Why this is the case I have not found a good explanation for.</p>
|
Why this is the case I have not found a good explanation for.</p>
|
||||||
<p>The <code>b</code>-prefixed strings are <code>bytes</code> literals.
|
<p>The <code>b</code>-prefixed strings are <code>bytes</code> literals.
|
||||||
This is close to OCaml strings.</p>
|
This is close to OCaml strings.</p>
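<p>The raw string behaviour described above can be checked with a short experiment. This is only a sketch: the helper <code>is_valid_expression</code> is a name made up for this example, and it uses the built-in <code>compile()</code> to ask Python whether a piece of source text is a complete expression.</p>

```python
def is_valid_expression(source: str) -> bool:
    """Return True if Python accepts `source` as a complete expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# The source text r"\" : the backslash keeps the quote from closing the literal.
unterminated = is_valid_expression('r"\\"')

# The source text r"\"" : accepted, and the backslash stays in the value.
raw = r"\""

print(unterminated)  # False
print(len(raw))      # 2: a backslash followed by a double quote
print(repr(raw))     # '\\"'
```

<p>The value is the same as the regular literal <code>"\\\""</code>: the backslash is not removed, yet it still affected where the literal ended.</p>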
|
||||||
|
@ -247,7 +247,7 @@ Below is the output of <code>help(str.__repr__)</code>:</p>
|
||||||
</code></pre>
|
</code></pre>
|
||||||
<p>Language and (standard) library designers could consider whether the slightly nicer looking strings are worth the added complexity users eventually are going to rely on - inadvertently or not.
|
<p>Language and (standard) library designers could consider whether the slightly nicer looking strings are worth the added complexity users eventually are going to rely on - inadvertently or not.
|
||||||
I do think strings and bytes in Python are a bit too complex.
|
I do think strings and bytes in Python are a bit too complex.
|
||||||
It is not easy to get a language lawyer[^language-lawyer] level understanding.
|
It is not easy to get a language lawyer<sup><a href="#fn-language-lawyer" id="ref-1-fn-language-lawyer" role="doc-noteref" class="fn-label">[3]</a></sup> level understanding.
|
||||||
In my opinion it is a mistake to not at least print a warning if there are illegal escape sequences - especially considering there are escape sequences that are valid in one string literal but not another.</p>
|
In my opinion it is a mistake to not at least print a warning if there are illegal escape sequences - especially considering there are escape sequences that are valid in one string literal but not another.</p>
|
||||||
<p>Unfortunately it is often the case that to get a precise specification it is necessary to look at the implementation.
|
<p>Unfortunately it is often the case that to get a precise specification it is necessary to look at the implementation.
|
||||||
For testing your implementation hand-written tests are good.
|
For testing your implementation hand-written tests are good.
|
||||||
|
@ -261,10 +261,15 @@ It may be the last time I need to understand Python's <code>str.__repr__()</code
|
||||||
I have documented the code to make it more approachable and maintainable by others.
|
I have documented the code to make it more approachable and maintainable by others.
|
||||||
Hopefully it is not something that you need, but in case it is useful to you it is licensed under a permissive license.</p>
|
Hopefully it is not something that you need, but in case it is useful to you it is licensed under a permissive license.</p>
|
||||||
<p>If you have a project in OCaml or want to port something to OCaml and would like help from me and my colleagues at <a href="https://robur.coop/">Robur</a> please <a href="https://robur.coop/Contact">get in touch</a> with us and we will figure something out.</p>
|
<p>If you have a project in OCaml or want to port something to OCaml and would like help from me and my colleagues at <a href="https://robur.coop/">Robur</a> please <a href="https://robur.coop/Contact">get in touch</a> with us and we will figure something out.</p>
|
||||||
<p>[^python-bytes]: There is as well the <code>bytes</code> type which is a byte sequence like OCaml's <code>string</code>.
|
<section role="doc-endnotes"><ol>
|
||||||
|
<li id="fn-python-bytes">
|
||||||
|
<p>There is as well the <code>bytes</code> type which is a byte sequence like OCaml's <code>string</code>.
|
||||||
The Python code in question is using <code>str</code> however.</p>
|
The Python code in question is using <code>str</code> however.</p>
|
||||||
<p>[^raw-escape-example]: Note I use single quotes for the output. This is what Python would do. It would be equivalent to <code>"\\\""</code>.</p>
|
<span><a href="#ref-1-fn-python-bytes" role="doc-backlink" class="fn-label">↩︎︎</a></span></li><li id="fn-raw-escape-example">
|
||||||
<p>[^language-lawyer]: <a href="http://catb.org/jargon/html/L/language-lawyer.html">A person, usually an experienced or senior software engineer, who is intimately familiar with many or most of the numerous restrictions and features (both useful and esoteric) applicable to one or more computer programming languages. A language lawyer is distinguished by the ability to show you the five sentences scattered through a 200-plus-page manual that together imply the answer to your question “if only you had thought to look there”.</a></p>
|
<p>Note I use single quotes for the output. This is what Python would do. It would be equivalent to <code>"\\\""</code>.</p>
|
||||||
|
<span><a href="#ref-1-fn-raw-escape-example" role="doc-backlink" class="fn-label">↩︎︎</a></span></li><li id="fn-language-lawyer">
|
||||||
|
<p><a href="http://catb.org/jargon/html/L/language-lawyer.html">A person, usually an experienced or senior software engineer, who is intimately familiar with many or most of the numerous restrictions and features (both useful and esoteric) applicable to one or more computer programming languages. A language lawyer is distinguished by the ability to show you the five sentences scattered through a 200-plus-page manual that together imply the answer to your question “if only you had thought to look there”.</a></p>
|
||||||
|
<span><a href="#ref-1-fn-language-lawyer" role="doc-backlink" class="fn-label">↩︎︎</a></span></li></ol></section>
|
||||||
|
|
||||||
</article>
|
</article>
|
||||||
|
|
||||||
|
|
|
@ -30,7 +30,7 @@ In order to implement this side of the protocol I studied parts of the OpenVPN
|
||||||
Studying the OpenVPN™ implementation has led me to discover two security issues: CVE-2024-28882 and CVE-2024-5594.
|
Studying the OpenVPN™ implementation has led me to discover two security issues: CVE-2024-28882 and CVE-2024-5594.
|
||||||
In this article I will talk about the relevant parts of the protocol, and describe the security issues in detail.</p>
|
In this article I will talk about the relevant parts of the protocol, and describe the security issues in detail.</p>
|
||||||
<p>A VPN establishes a secure tunnel in which (usually) IP packets are sent.
|
<p>A VPN establishes a secure tunnel in which (usually) IP packets are sent.
|
||||||
The OpenVPN protocol establishes a TLS tunnel[^openvpn-tls] with which key material and configuration options are negotiated.
|
The OpenVPN protocol establishes a TLS tunnel<sup><a href="#fn-openvpn-tls" id="ref-1-fn-openvpn-tls" role="doc-noteref" class="fn-label">[1]</a></sup> with which key material and configuration options are negotiated.
|
||||||
Once established the TLS tunnel is used to exchange so-called control channel messages.
|
Once established the TLS tunnel is used to exchange so-called control channel messages.
|
||||||
They are NUL-terminated (well, more on that later) text messages sent in a single TLS record frame (mostly, more on that later).</p>
|
They are NUL-terminated (well, more on that later) text messages sent in a single TLS record frame (mostly, more on that later).</p>
|
||||||
<p>I will describe two (groups) of control channel messages (and a bonus control channel message):</p>
|
<p>I will describe two (groups) of control channel messages (and a bonus control channel message):</p>
|
||||||
|
@ -40,7 +40,7 @@ They are NUL-terminated (well, more on that later) text messages sent in a singl
|
||||||
<li>(<code>AUTH_FAILED</code>)</li>
|
<li>(<code>AUTH_FAILED</code>)</li>
|
||||||
</ul>
|
</ul>
|
||||||
<p>The <code>EXIT</code>, <code>RESTART</code>, and <code>HALT</code> messages share similarity.
|
<p>The <code>EXIT</code>, <code>RESTART</code>, and <code>HALT</code> messages share similarity.
|
||||||
They are all three used to signal to the client that it should disconnect[^disconnect] from the server.
|
They are all three used to signal to the client that it should disconnect<sup><a href="#fn-disconnect" id="ref-1-fn-disconnect" role="doc-noteref" class="fn-label">[2]</a></sup> from the server.
|
||||||
<code>HALT</code> tells the client to disconnect and suggests the client should terminate.
|
<code>HALT</code> tells the client to disconnect and suggests the client should terminate.
|
||||||
<code>RESTART</code> also tells the client to disconnect and suggests the client can reconnect either to the same server or the next server if multiple are configured depending on flags in the message.
|
<code>RESTART</code> also tells the client to disconnect and suggests the client can reconnect either to the same server or the next server if multiple are configured depending on flags in the message.
|
||||||
<code>EXIT</code> tells the <em>peer</em> that it is exiting and the <em>peer</em> should disconnect.
|
<code>EXIT</code> tells the <em>peer</em> that it is exiting and the <em>peer</em> should disconnect.
|
||||||
|
@ -67,7 +67,7 @@ The management interface is a text protocol to communicate with the OpenVPN serv
|
||||||
One command is the <code>client-kill</code> command.
|
One command is the <code>client-kill</code> command.
|
||||||
The documentation says to use this command to "[i]mmediately kill a client instance[...]".
|
The documentation says to use this command to "[i]mmediately kill a client instance[...]".
|
||||||
In practice it sends an exit message to the client (either a custom one or the default <code>RESTART</code>).
|
In practice it sends an exit message to the client (either a custom one or the default <code>RESTART</code>).
|
||||||
I learnt that it shares code paths with the exit control messages to schedule an exit (disconnect)[^kill-immediately].
|
I learnt that it shares code paths with the exit control messages to schedule an exit (disconnect)<sup><a href="#fn-kill-immediately" id="ref-1-fn-kill-immediately" role="doc-noteref" class="fn-label">[3]</a></sup>.
|
||||||
That is, <code>client-kill</code> schedules the same five second timer.</p>
|
That is, <code>client-kill</code> schedules the same five second timer.</p>
|
||||||
<p>Thus a malicious client can, instead of exiting on receiving an exit or <code>RESTART</code> message, repeatedly send <code>EXIT</code> back to the server to reset the five second timer.
|
<p>Thus a malicious client can, instead of exiting on receiving an exit or <code>RESTART</code> message, repeatedly send <code>EXIT</code> back to the server to reset the five second timer.
|
||||||
This way the client can indefinitely delay the exit/disconnect assuming a sufficiently stable and responsive network.
|
This way the client can indefinitely delay the exit/disconnect assuming a sufficiently stable and responsive network.
|
||||||
|
@ -86,7 +86,7 @@ The OpenVPN security@ mailing list took it seriously enough to assign it CVE-202
|
||||||
As the names suggest it's a request/response protocol.
|
As the names suggest it's a request/response protocol.
|
||||||
It is used to communicate configuration options from the server to the client.
|
It is used to communicate configuration options from the server to the client.
|
||||||
These options include routes, IP address configuration, and negotiated cryptographic algorithms.
|
These options include routes, IP address configuration, and negotiated cryptographic algorithms.
|
||||||
The client signals it would like to receive configuration options from the server by sending the <code>PUSH_REQUEST</code> control channel message[^proto-push-request].
|
The client signals it would like to receive configuration options from the server by sending the <code>PUSH_REQUEST</code> control channel message<sup><a href="#fn-proto-push-request" id="ref-1-fn-proto-push-request" role="doc-noteref" class="fn-label">[4]</a></sup>.
|
||||||
The server then sends a <code>PUSH_REPLY</code> message.</p>
|
The server then sends a <code>PUSH_REPLY</code> message.</p>
|
||||||
<p>The format of a <code>PUSH_REPLY</code> message is <code>PUSH_REPLY,</code> followed by a comma separated list of OpenVPN configuration directives terminated by a NUL byte as in other control channel messages.
|
<p>The format of a <code>PUSH_REPLY</code> message is <code>PUSH_REPLY,</code> followed by a comma separated list of OpenVPN configuration directives terminated by a NUL byte as in other control channel messages.
|
||||||
Note that this means pushed configuration directives cannot contain commas.</p>
|
Note that this means pushed configuration directives cannot contain commas.</p>
|
||||||
|
@ -94,8 +94,8 @@ Note that this means pushed configuration directives cannot contain commas.</p>
|
||||||
I learned some quirks of the configuration language which I find surprising and somewhat hard to implement.
|
I learned some quirks of the configuration language which I find surprising and somewhat hard to implement.
|
||||||
I will not cover all corners of the configuration language.</p>
|
I will not cover all corners of the configuration language.</p>
|
||||||
<p>In some sense you could say the configuration language of OpenVPN™ is line based.
|
<p>In some sense you could say the configuration language of OpenVPN™ is line based.
|
||||||
At least, the first step in parsing configuration directives the way OpenVPN 2.X does is to read one line at a time and parse it as one configuration directive[^inline-files].
|
At least, the first step in parsing configuration directives the way OpenVPN 2.X does is to read one line at a time and parse it as one configuration directive<sup><a href="#fn-inline-files" id="ref-1-fn-inline-files" role="doc-noteref" class="fn-label">[5]</a></sup>.
|
||||||
A line is whatever <code>fgets()</code> says it is - this includes the newline if not at the end of the file[^configuration-newlines].
|
A line is whatever <code>fgets()</code> says it is - this includes the newline if not at the end of the file<sup><a href="#fn-configuration-newlines" id="ref-1-fn-configuration-newlines" role="doc-noteref" class="fn-label">[6]</a></sup>.
|
||||||
This is how it is for configuration files.
|
This is how it is for configuration files.
|
||||||
However, if it is a <code>PUSH_REPLY</code> a <em>"line"</em> is the text string up to a comma or the end of file (or, importantly, a NUL byte).
|
However, if it is a <code>PUSH_REPLY</code> a <em>"line"</em> is the text string up to a comma or the end of file (or, importantly, a NUL byte).
|
||||||
This "line" tokenization is done by repeatedly calling OpenVPN™'s <code>buf_parse(buf, ',', line, sizeof(line))</code> function.</p>
|
This "line" tokenization is done by repeatedly calling OpenVPN™'s <code>buf_parse(buf, ',', line, sizeof(line))</code> function.</p>
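<p>The comma-based "line" splitting can be sketched as follows. This only illustrates the message format as described above; it is not a port of OpenVPN™'s <code>buf_parse()</code>. The function name <code>split_push_reply</code> is made up for this example, ignoring everything from the first NUL byte onwards is a simplifying assumption, and the pushed directives are sample values.</p>

```python
PREFIX = b"PUSH_REPLY,"


def split_push_reply(message: bytes) -> list[bytes]:
    """Split a PUSH_REPLY control channel message into its directives."""
    # Assumption for this sketch: the message ends at the first NUL byte
    # and whatever follows it is ignored.
    text, _, _ = message.partition(b"\x00")
    if not text.startswith(PREFIX):
        raise ValueError("not a PUSH_REPLY message")
    # Each "line" is the text up to the next comma or the end of the message.
    return text[len(PREFIX):].split(b",")


directives = split_push_reply(
    b"PUSH_REPLY,route 10.0.0.0 255.0.0.0,ping 10\x00"
)
print(directives)  # [b'route 10.0.0.0 255.0.0.0', b'ping 10']
```

<p>Because the comma is the separator, a directive containing a comma would be cut into two "lines", which is why pushed directives cannot contain commas.</p>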
|
||||||
|
@ -210,12 +210,20 @@ Either way, it's old and gone unnoticed for quite a while.</p>
|
||||||
<p>I think this shows that diversity in implementations is a great way to exercise corner cases, push forward (protocol) documentation efforts and get thorough code review by motivated peers.
|
<p>I think this shows that diversity in implementations is a great way to exercise corner cases, push forward (protocol) documentation efforts and get thorough code review by motivated peers.
|
||||||
This work was funded by <a href="https://nlnet.nl/project/MirageVPN/">the EU NGI Assure Fund through NLnet</a>.
|
This work was funded by <a href="https://nlnet.nl/project/MirageVPN/">the EU NGI Assure Fund through NLnet</a>.
|
||||||
In my opinion, this shows that funding one open source project can have a positive impact on other open source projects, too.</p>
|
In my opinion, this shows that funding one open source project can have a positive impact on other open source projects, too.</p>
|
||||||
<p>[^openvpn-tls]: This is not always the case. It is possible to use static shared secret keys, but it is mostly considered deprecated.
|
<section role="doc-endnotes"><ol>
|
||||||
[^disconnect]: I say "disconnect" even when the underlying transport is the connection-less UDP.
|
<li id="fn-openvpn-tls">
|
||||||
[^kill-immediately]: As the alert reader might have realized this is inaccurate. It does not kill the client "immediately" as it will wait five seconds after the exit message is sent before exiting. At best this will kill a cooperating client once it's received the kill message.
|
<p>This is not always the case. It is possible to use static shared secret keys, but it is mostly considered deprecated.</p>
|
||||||
[^proto-push-request]: There is another mechanism to request a <code>PUSH_REPLY</code> earlier with fewer roundtrips, but let's ignore that for now. The exact message is <code>PUSH_REQUEST&lt;NUL-BYTE&gt;</code> as messages need to be NUL-terminated.
|
<span><a href="#ref-1-fn-openvpn-tls" role="doc-backlink" class="fn-label">↩︎︎</a></span></li><li id="fn-disconnect">
|
||||||
[^inline-files]: An exception being inline files which can span multiple lines. They vaguely resemble XML tags with an open <code>&lt;tag&gt;</code> and close <code>&lt;/tag&gt;</code> each on their own line with the data in between. I doubt these are sent in <code>PUSH_REPLY</code>s, but without diving into the source code I can't rule out that it is possible to send inline files.
|
<p>I say "disconnect" even when the underlying transport is the connection-less UDP.</p>
|
||||||
[^configuration-newlines]: This results in the quirk that it is possible to sort-of escape a newline in a configuration directive. But since the line splitting is done <em>first</em> it's not possible to continue the directive on the next line! I believe this is mostly useless, but it is a way to inject line feeds in configuration options without modifying the OpenVPN source code.</p>
|
<span><a href="#ref-1-fn-disconnect" role="doc-backlink" class="fn-label">↩︎︎</a></span></li><li id="fn-kill-immediately">
|
||||||
|
<p>As the alert reader might have realized this is inaccurate. It does not kill the client "immediately" as it will wait five seconds after the exit message is sent before exiting. At best this will kill a cooperating client once it's received the kill message.</p>
|
||||||
|
<span><a href="#ref-1-fn-kill-immediately" role="doc-backlink" class="fn-label">↩︎︎</a></span></li><li id="fn-proto-push-request">
|
||||||
|
<p>There is another mechanism to request a <code>PUSH_REPLY</code> earlier with fewer roundtrips, but let's ignore that for now. The exact message is <code>PUSH_REQUEST&lt;NUL-BYTE&gt;</code> as messages need to be NUL-terminated.</p>
|
||||||
|
<span><a href="#ref-1-fn-proto-push-request" role="doc-backlink" class="fn-label">↩︎︎</a></span></li><li id="fn-inline-files">
|
||||||
|
<p>An exception being inline files which can span multiple lines. They vaguely resemble XML tags with an open <code>&lt;tag&gt;</code> and close <code>&lt;/tag&gt;</code> each on their own line with the data in between. I doubt these are sent in <code>PUSH_REPLY</code>s, but without diving into the source code I can't rule out that it is possible to send inline files.</p>
|
||||||
|
<span><a href="#ref-1-fn-inline-files" role="doc-backlink" class="fn-label">↩︎︎</a></span></li><li id="fn-configuration-newlines">
|
||||||
|
<p>This results in the quirk that it is possible to sort-of escape a newline in a configuration directive. But since the line splitting is done <em>first</em> it's not possible to continue the directive on the next line! I believe this is mostly useless, but it is a way to inject line feeds in configuration options without modifying the OpenVPN source code.</p>
|
||||||
|
<span><a href="#ref-1-fn-configuration-newlines" role="doc-backlink" class="fn-label">↩︎︎</a></span></li></ol></section>
|
||||||
|
|
||||||
</article>
|
</article>
|
||||||
|
|
||||||
|
|
|
@ -78,7 +78,7 @@ Then I got a hunch: I had read about <a href="https://en.m.wikipedia.org/wiki/GU
|
||||||
I had always thought it was optional and not needed in a new system such as Mirage that doesn't have to care too much about legacy code and operating systems.</p>
|
I had always thought it was optional and not needed in a new system such as Mirage that doesn't have to care too much about legacy code and operating systems.</p>
|
||||||
<p>So I started comparing the layout of MBR and tar.
|
<p>So I started comparing the layout of MBR and tar.
|
||||||
The V7 tar format only uses the first 257 bytes of the 512 byte block.
|
The V7 tar format only uses the first 257 bytes of the 512 byte block.
|
||||||
The V7 format is differentiated from the UStar, POSIX/pax and old GNU tar formats by not having the string <code>ustar</code> at byte offset 257[^tar-ustar].
|
The V7 format is differentiated from the UStar, POSIX/pax and old GNU tar formats by not having the string <code>ustar</code> at byte offset 257<sup><a href="#fn-tar-ustar" id="ref-1-fn-tar-ustar" role="doc-noteref" class="fn-label">[1]</a></sup>.
|
||||||
The master boot record format starts with the bootstrap code area.
|
The master boot record format starts with the bootstrap code area.
|
||||||
In the classic format it is the first 446 bytes.
|
In the classic format it is the first 446 bytes.
|
||||||
In the modern standard MBR format the first 446 bytes are mostly bootstrap code too, with the exception of a handful of bytes at offset 218 or so which are used for a timestamp or similar.
|
In the modern standard MBR format the first 446 bytes are mostly bootstrap code too, with the exception of a handful of bytes at offset 218 or so which are used for a timestamp or similar.
|
||||||
|
@ -131,7 +131,10 @@ If the sector size is greater than 512 we can use the remaining space in LBA 0 t
|
||||||
I may try this for a sector size of 4096, but I'm not happy that it doesn't work with sector size 512 which solo5 will default to.</li>
|
I may try this for a sector size of 4096, but I'm not happy that it doesn't work with sector size 512 which solo5 will default to.</li>
|
||||||
</ul>
|
</ul>
|
||||||
<p>If you have other ideas for what I can do please reach out!</p>
|
<p>If you have other ideas for what I can do please reach out!</p>
|
||||||
<p>[^tar-ustar]: This is somewhat simplified. There are some more nuances between the different formats, but for this purpose they don't matter much.</p>
|
<section role="doc-endnotes"><ol>
|
||||||
|
<li id="fn-tar-ustar">
|
||||||
|
<p>This is somewhat simplified. There are some more nuances between the different formats, but for this purpose they don't matter much.</p>
|
||||||
|
<span><a href="#ref-1-fn-tar-ustar" role="doc-backlink" class="fn-label">↩︎︎</a></span></li></ol></section>
|
||||||
|
|
||||||
</article>
|
</article>
|
||||||
|
|
||||||
|
|
|
@ -35,7 +35,7 @@ To better guide the performance engineering, we also developed <a href="https://
|
||||||
<h2 id="takeaway-of-performance-engineering"><a class="anchor" aria-hidden="true" href="#takeaway-of-performance-engineering"></a>Takeaway of performance engineering</h2>
|
<h2 id="takeaway-of-performance-engineering"><a class="anchor" aria-hidden="true" href="#takeaway-of-performance-engineering"></a>Takeaway of performance engineering</h2>
|
||||||
<p>The learnings of our performance engineering are in three areas:</p>
|
<p>The learnings of our performance engineering are in three areas:</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li>Formatting strings is computationally expensive -- thus if in an error case a hexdump is produced of a packet, its construction must be delayed until the error case is executed (we have <a href="https://github.com/robur-coop/miragevpn/pull/220">this PR</a> and <a href="https://github.com/robur-coop/miragevpn/pull/209">that PR</a>). Alain Frisch wrote a nice <a href="https://www.lexifi.com/blog/ocaml/note-about-performance-printf-and-format/#">blog post</a> at LexiFi about performance of <code>Printf</code> and <code>Format</code>[^lexifi-date].</li>
|
<li>Formatting strings is computationally expensive -- thus if in an error case a hexdump is produced of a packet, its construction must be delayed until the error case is executed (we have <a href="https://github.com/robur-coop/miragevpn/pull/220">this PR</a> and <a href="https://github.com/robur-coop/miragevpn/pull/209">that PR</a>). Alain Frisch wrote a nice <a href="https://www.lexifi.com/blog/ocaml/note-about-performance-printf-and-format/#">blog post</a> at LexiFi about performance of <code>Printf</code> and <code>Format</code><sup><a href="#fn-lexifi-date" id="ref-1-fn-lexifi-date" role="doc-noteref" class="fn-label">[1]</a></sup>.</li>
|
||||||
<li>Rethink allocations: fundamentally, only a single big buffer (to be sent out) for each incoming packet should be allocated, not a series of buffers that are concatenated (see <a href="https://github.com/robur-coop/miragevpn/pull/217">this PR</a> and <a href="https://github.com/robur-coop/miragevpn/pull/219">that PR</a>). Additionally, not zeroing out the just allocated buffer (if it is filled with data anyway) removes some further instructions (see <a href="https://github.com/robur-coop/miragevpn/pull/218">this PR</a>). And we figured that appending to an empty buffer nevertheless allocated and copied in OCaml, so we worked on <a href="https://github.com/robur-coop/miragevpn/pull/214">this PR</a>.</li>
|
<li>Rethink allocations: fundamentally, only a single big buffer (to be sent out) for each incoming packet should be allocated, not a series of buffers that are concatenated (see <a href="https://github.com/robur-coop/miragevpn/pull/217">this PR</a> and <a href="https://github.com/robur-coop/miragevpn/pull/219">that PR</a>). Additionally, not zeroing out the just allocated buffer (if it is filled with data anyway) removes some further instructions (see <a href="https://github.com/robur-coop/miragevpn/pull/218">this PR</a>). And we figured that appending to an empty buffer nevertheless allocated and copied in OCaml, so we worked on <a href="https://github.com/robur-coop/miragevpn/pull/214">this PR</a>.</li>
|
||||||
<li>Still an open topic is: we are in the memory-safe language OCaml, and we sometimes extract data out of a buffer (or set data in a buffer). Now, each operation leads to bounds checks (that we do not touch memory that is not allocated or not ours). However, if we just checked for the buffer being long enough (either by checking the length, or by allocating a specific amount of data), these bounds checks are superfluous. So far, we don't have an automated solution for this issue, but we are <a href="https://discuss.ocaml.org/t/bounds-checks-for-string-and-bytes-when-retrieving-or-setting-subparts-thereof/">discussing it in the OCaml community</a>, and are eager to find a solution to avoid unneeded computations.</li>
|
<li>Still an open topic is: we are in the memory-safe language OCaml, and we sometimes extract data out of a buffer (or set data in a buffer). Now, each operation leads to bounds checks (that we do not touch memory that is not allocated or not ours). However, if we just checked for the buffer being long enough (either by checking the length, or by allocating a specific amount of data), these bounds checks are superfluous. So far, we don't have an automated solution for this issue, but we are <a href="https://discuss.ocaml.org/t/bounds-checks-for-string-and-bytes-when-retrieving-or-setting-subparts-thereof/">discussing it in the OCaml community</a>, and are eager to find a solution to avoid unneeded computations.</li>
|
||||||
</ul>
|
</ul>
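<p>The first point, delaying the construction of a hexdump until the error case is actually reached, can be illustrated with a small sketch. It is written in Python rather than OCaml and is not taken from the linked PRs: the function names, the counter and the four byte minimum length are all assumptions made for this example.</p>

```python
stats = {"hexdumps": 0}  # counts how often the expensive formatting ran


def hexdump(packet: bytes) -> str:
    """Format a packet as space separated hex bytes (the costly step)."""
    stats["hexdumps"] += 1
    return packet.hex(" ")


def check_eager(packet: bytes) -> str:
    dump = hexdump(packet)  # formatted for every packet, even valid ones
    if len(packet) < 4:     # hypothetical minimum length
        return "short packet: " + dump
    return "ok"


def check_lazy(packet: bytes) -> str:
    if len(packet) < 4:     # hypothetical minimum length
        return "short packet: " + hexdump(packet)  # formatted only on error
    return "ok"


good, bad = b"\x00" * 8, b"\x01\x02"

for check in (check_eager, check_lazy):
    stats["hexdumps"] = 0
    results = [check(good), check(good), check(bad)]
    print(check.__name__, results[-1], stats["hexdumps"])
```

<p>Both variants report the same error, but the eager one formats all three packets while the lazy one formats only the packet that fails the check.</p>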
|
||||||
|
@ -43,7 +43,10 @@ To better guide the performance engineering, we also developed <a href="https://
|
||||||
<p>To conclude: we already achieved a factor of 25 in performance by adapting the code in various ways. We have ideas to improve the performance even more in the future - we also work on using OCaml string and bytes, instead of off-the-OCaml-heap-allocated bigarrays (see <a href="https://blog.robur.coop/articles/speeding-ec-string.html">our previous article</a>, which provided some speedups).</p>
|
<p>To conclude: we already achieved a factor of 25 in performance by adapting the code in various ways. We have ideas to improve the performance even more in the future - we also work on using OCaml string and bytes, instead of off-the-OCaml-heap-allocated bigarrays (see <a href="https://blog.robur.coop/articles/speeding-ec-string.html">our previous article</a>, which provided some speedups).</p>
|
||||||
<p>Don't hesitate to reach out to us on <a href="https://github.com/robur-coop/miragevpn/issues">GitHub</a>, or <a href="https://robur.coop/Contact">by mail</a> if you're stuck.</p>
|
<p>Don't hesitate to reach out to us on <a href="https://github.com/robur-coop/miragevpn/issues">GitHub</a>, or <a href="https://robur.coop/Contact">by mail</a> if you're stuck.</p>
|
||||||
<p>We want to thank <a href="https://nlnet.nl">NLnet</a> for their funding (via <a href="https://www.assure.ngi.eu/">NGI assure</a>), and <a href="https://eduvpn.org">eduVPN</a> for their interest.</p>
|
<p>We want to thank <a href="https://nlnet.nl">NLnet</a> for their funding (via <a href="https://www.assure.ngi.eu/">NGI assure</a>), and <a href="https://eduvpn.org">eduVPN</a> for their interest.</p>
|
||||||
<p>[^lexifi-date]: It has come to our attention that the blog post is rather old (2012) and that the implementation has been completely rewritten since then.</p>
|
<section role="doc-endnotes"><ol>
|
||||||
|
<li id="fn-lexifi-date">
|
||||||
|
<p>It has come to our attention that the blog post is rather old (2012) and that the implementation has been completely rewritten since then.</p>
|
||||||
|
<span><a href="#ref-1-fn-lexifi-date" role="doc-backlink" class="fn-label">↩︎︎</a></span></li></ol></section>
|
||||||
|
|
||||||
</article>
|
</article>
|
||||||
|
|
||||||
|
|
|
@ -41,12 +41,12 @@ The latter uses separate data & control channels where the control channel u
|
||||||
<p>Before diving into TLS mode and eventually tls-crypt-v2 it's worth briefly discussing why we spend time reimplementing the OpenVPN™ protocol.
|
<p>Before diving into TLS mode and eventually tls-crypt-v2 it's worth briefly discussing why we spend time reimplementing the OpenVPN™ protocol.
|
||||||
You may ask yourself: why not just use the existing tried and tested implementation?</p>
|
You may ask yourself: why not just use the existing tried and tested implementation?</p>
|
||||||
<p>OpenVPN™ community edition is implemented in the C programming language.
|
<p>OpenVPN™ community edition is implemented in the C programming language.
|
||||||
It heavily uses the OpenSSL library[^mbedtls], which is also written in C and has had some notable security vulnerabilities in the past.
|
It heavily uses the OpenSSL library<sup><a href="#fn-mbedtls" id="ref-1-fn-mbedtls" role="doc-noteref" class="fn-label">[1]</a></sup>, which is also written in C and has had some notable security vulnerabilities in the past.
|
||||||
Many vulnerabilities and bugs in C can be easily avoided in other languages due to bounds checking and stricter and more expressive type systems.
|
Many vulnerabilities and bugs in C can be easily avoided in other languages due to bounds checking and stricter and more expressive type systems.
|
||||||
The state machine of the protocol can be more easily expressed in OCaml, and some properties of the protocol can be encoded in the type system.</p>
|
The state machine of the protocol can be more easily expressed in OCaml, and some properties of the protocol can be encoded in the type system.</p>
|
||||||
<p>Another reason is <a href="https://mirage.io/">Mirage OS</a>, a library operating system implemented in OCaml.
|
<p>Another reason is <a href="https://mirage.io/">Mirage OS</a>, a library operating system implemented in OCaml.
|
||||||
We work on the Mirage project and write applications (unikernels) using Mirage.
|
We work on the Mirage project and write applications (unikernels) using Mirage.
|
||||||
In many cases it would be desirable to be able to connect to an existing VPN network[^vpn-network],
|
In many cases it would be desirable to be able to connect to an existing VPN network<sup><a href="#fn-vpn-network" id="ref-1-fn-vpn-network" role="doc-noteref" class="fn-label">[2]</a></sup>,
|
||||||
or be able to offer a VPN network to clients using OpenVPN™.</p>
|
or be able to offer a VPN network to clients using OpenVPN™.</p>
|
||||||
<p>Consider a VPN provider:
|
<p>Consider a VPN provider:
|
||||||
The VPN provider runs many machines that run an operating system in order to run the user-space OpenVPN™ service.
|
The VPN provider runs many machines that run an operating system in order to run the user-space OpenVPN™ service.
|
||||||
|
@ -105,8 +105,12 @@ For general instructions on running Mirage unikernels see our <a href="https://r
|
||||||
The unikernel will need a block device containing the OpenVPN™ configuration and a network device.
|
The unikernel will need a block device containing the OpenVPN™ configuration and a network device.
|
||||||
More detailed instructions Will Follow Soon™!
|
More detailed instructions Will Follow Soon™!
|
||||||
Don't hesitate to reach out to us on <a href="https://github.com/robur-coop/miragevpn/issues">GitHub</a>, <a href="https://robur.coop/Contact">by mail</a> or me personally <a href="https://bsd.network/@reynir">on Mastodon</a> if you're stuck.</p>
|
Don't hesitate to reach out to us on <a href="https://github.com/robur-coop/miragevpn/issues">GitHub</a>, <a href="https://robur.coop/Contact">by mail</a> or me personally <a href="https://bsd.network/@reynir">on Mastodon</a> if you're stuck.</p>
|
||||||
<p>[^mbedtls]: It is possible to compile OpenVPN™ community edition with Mbed TLS instead of OpenSSL which is written in C as well.</p>
|
<section role="doc-endnotes"><ol>
|
||||||
<p>[^vpn-network]: I use the term "VPN network" to mean the virtual private network itself. It is a bit odd because the 'N' in 'VPN' is 'Network', but without disambiguation 'VPN' could refer to the network itself, the software or the service.</p>
|
<li id="fn-mbedtls">
|
||||||
|
<p>It is possible to compile OpenVPN™ community edition with Mbed TLS instead of OpenSSL which is written in C as well.</p>
|
||||||
|
<span><a href="#ref-1-fn-mbedtls" role="doc-backlink" class="fn-label">↩︎︎</a></span></li><li id="fn-vpn-network">
|
||||||
|
<p>I use the term "VPN network" to mean the virtual private network itself. It is a bit odd because the 'N' in 'VPN' is 'Network', but without disambiguation 'VPN' could refer to the network itself, the software or the service.</p>
|
||||||
|
<span><a href="#ref-1-fn-vpn-network" role="doc-backlink" class="fn-label">↩︎︎</a></span></li></ol></section>
|
||||||
|
|
||||||
</article>
|
</article>
|
||||||
|
|
||||||
|
|
|
@ -50,18 +50,74 @@
|
||||||
<h2 id="performance-numbers"><a class="anchor" aria-hidden="true" href="#performance-numbers"></a>Performance numbers</h2>
|
<h2 id="performance-numbers"><a class="anchor" aria-hidden="true" href="#performance-numbers"></a>Performance numbers</h2>
|
||||||
<p>All numbers were gathered on a Lenovo X250 laptop with an Intel i7-5600U CPU @ 2.60GHz. We used OCaml 4.14.1 as the compiler. The baseline is OpenSSL 3.0.12. All numbers are in operations per second.</p>
|
<p>All numbers were gathered on a Lenovo X250 laptop with an Intel i7-5600U CPU @ 2.60GHz. We used OCaml 4.14.1 as the compiler. The baseline is OpenSSL 3.0.12. All numbers are in operations per second.</p>
|
||||||
<p>NIST P-256</p>
|
<p>NIST P-256</p>
|
||||||
<p>| op | 0.11.2 | PR#146 | speedup | OpenSSL | speedup |
|
<div role="region"><table>
|
||||||
| - | - | - | - | - | - |
|
<tr>
|
||||||
| sign | 748 | 1806 | 2.41x | 34392 | 19.04x |
|
<th>op</th>
|
||||||
| verify | 285 | 655 | 2.30x | 12999 | 19.85x |
|
<th>0.11.2</th>
|
||||||
| ecdh | 858 | 1785 | 2.08x | 16514 | 9.25x |</p>
|
<th>PR#146</th>
|
||||||
<p>Curve 25519</p>
|
<th>speedup</th>
|
||||||
<p>| op | 0.11.2 | PR#146 | speedup | OpenSSL | speedup |
|
<th>OpenSSL</th>
|
||||||
| - | - | - | - | - | - |
|
<th>speedup</th>
|
||||||
| sign | 10713 | 11560 | 1.08x | 21943 | 1.90x |
|
</tr>
|
||||||
| verify | 7600 | 8314 | 1.09x | 7081 | 0.85x |
|
<tr>
|
||||||
| ecdh | 12144 | 13457 | 1.11x | 26201 | 1.95x |</p>
|
<td>sign</td>
|
||||||
<p>Note: to re-create the performance numbers, you can run <code>openssl speed ecdsap256 ecdhp256 ed25519 ecdhx25519</code> - for the OCaml side, use <code>dune bu bench/speed.exe --rel</code> and <code>_build/default/bench/speed.exe ecdsa-sign ecdsa-verify ecdh-share</code>.</p>
|
<td>748</td>
|
||||||
|
<td>1806</td>
|
||||||
|
<td>2.41x</td>
|
||||||
|
<td>34392</td>
|
||||||
|
<td>19.04x</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>verify</td>
|
||||||
|
<td>285</td>
|
||||||
|
<td>655</td>
|
||||||
|
<td>2.30x</td>
|
||||||
|
<td>12999</td>
|
||||||
|
<td>19.85x</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>ecdh</td>
|
||||||
|
<td>858</td>
|
||||||
|
<td>1785</td>
|
||||||
|
<td>2.08x</td>
|
||||||
|
<td>16514</td>
|
||||||
|
<td>9.25x</td>
|
||||||
|
</tr>
|
||||||
|
</table></div><p>Curve 25519</p>
|
||||||
|
<div role="region"><table>
|
||||||
|
<tr>
|
||||||
|
<th>op</th>
|
||||||
|
<th>0.11.2</th>
|
||||||
|
<th>PR#146</th>
|
||||||
|
<th>speedup</th>
|
||||||
|
<th>OpenSSL</th>
|
||||||
|
<th>speedup</th>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>sign</td>
|
||||||
|
<td>10713</td>
|
||||||
|
<td>11560</td>
|
||||||
|
<td>1.08x</td>
|
||||||
|
<td>21943</td>
|
||||||
|
<td>1.90x</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>verify</td>
|
||||||
|
<td>7600</td>
|
||||||
|
<td>8314</td>
|
||||||
|
<td>1.09x</td>
|
||||||
|
<td>7081</td>
|
||||||
|
<td>0.85x</td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td>ecdh</td>
|
||||||
|
<td>12144</td>
|
||||||
|
<td>13457</td>
|
||||||
|
<td>1.11x</td>
|
||||||
|
<td>26201</td>
|
||||||
|
<td>1.95x</td>
|
||||||
|
</tr>
|
||||||
|
</table></div><p>Note: to re-create the performance numbers, you can run <code>openssl speed ecdsap256 ecdhp256 ed25519 ecdhx25519</code> - for the OCaml side, use <code>dune bu bench/speed.exe --rel</code> and <code>_build/default/bench/speed.exe ecdsa-sign ecdsa-verify ecdh-share</code>.</p>
|
||||||
<p>The performance improvements are up to 2.5 times compared to the latest mirage-crypto-ec release (look at the 4th column). In comparison to OpenSSL, we still lack a factor of 20 for the NIST curves, and up to a factor of 2 for 25519 computations (look at the last column).</p>
|
<p>The performance improvements are up to 2.5 times compared to the latest mirage-crypto-ec release (look at the 4th column). In comparison to OpenSSL, we still lack a factor of 20 for the NIST curves, and up to a factor of 2 for 25519 computations (look at the last column).</p>
|
||||||
<p>If you have ideas for improvements, let us know via an issue, email, or a pull request :) We started to <a href="https://github.com/mirage/mirage-crypto/issues/193">gather some</a> for 25519 by comparing our code with changes in BoringSSL over the last few years.</p>
|
<p>If you have ideas for improvements, let us know via an issue, email, or a pull request :) We started to <a href="https://github.com/mirage/mirage-crypto/issues/193">gather some</a> for 25519 by comparing our code with changes in BoringSSL over the last few years.</p>
|
||||||
<p>As a spoiler, for P-256 sign there's another improvement by a factor of around 4.5 with <a href="https://github.com/mirage/mirage-crypto/pull/191">Virgile's PR</a>, which uses pre-computed tables for NIST curves as well.</p>
|
<p>As a spoiler, for P-256 sign there's another improvement by a factor of around 4.5 with <a href="https://github.com/mirage/mirage-crypto/pull/191">Virgile's PR</a>, which uses pre-computed tables for NIST curves as well.</p>
|
||||||
|
|