OCamlPro Feed 2024-10-03T13:48:57Z OCamlPro contact@ocamlpro.com Copyright (c) 2011–2021 OCamlPro https://ocamlpro.com/blog/feed Alt-Ergo 2.6 is Out! https://ocamlpro.com/blog/2024_09_01_alt_ergo_2_6_0_released 2024-09-30T13:48:57Z 2024-09-30T13:48:57Z Basile Clément Pierre Villemot We are excited to announce the release of Alt-Ergo 2.6! Alt-Ergo is an open-source automated prover used for formal verification in software development. It is part of the arsenal behind static analysis frameworks such as TrustInSoft Analyzer and Frama-C, and is one of the solvers behind Why3, a pla... <p></p> <p> <div class="figure"> <p> <a href="/blog/assets/img/alt-ergo-8-colors-blank-bg.png"> <img alt="The Alt-Ergo 2.6 release comes with many enhancements!" src="/blog/assets/img/alt-ergo-8-colors-blank-bg.png"/> </a> <div class="caption"> The Alt-Ergo 2.6 release comes with many enhancements! </div> </p> </div> </p> <p><strong>We are excited to announce the release of Alt-Ergo 2.6!</strong></p> <p>Alt-Ergo is an open-source automated prover used for formal verification in software development. It is part of the arsenal behind static analysis frameworks such as TrustInSoft Analyzer and Frama-C, and is one of the solvers behind Why3, a platform for deductive program verification. The newly released version 2.6 brings new features and performance improvements.</p> <p>Development on Alt-Ergo has accelerated significantly this past year, thanks to the launch of the <a href="https://decysif.fr/en/">DéCySif</a> joint research project (i-Démo) with AdaCore, Inria, OCamlPro and TrustInSoft. The improvements to bit-vectors and algebraic data types in this release are sponsored by the Décysif project.</p> <p>The highlights of Alt-Ergo 2.6 are:</p> <ul> <li>Support for reasoning and model generation with bit-vectors </li> <li>Model generation for algebraic data types </li> <li>Optimization with <code>(maximize)</code> and <code>(minimize)</code> </li> <li>FPA support is enabled by default and available in SMT-LIB format </li> <li>Binary releases now on GitHub </li> </ul> <p>Alt-Ergo 2.6 also includes other improvements to the user interface (notably the <code>set-option</code> SMT-LIB command), use of Dolmen as the default frontend for SMT-LIB and native input, and many bug fixes.</p> <h3>Bit-vectors</h3> <p>In Alt-Ergo 2.5, we introduced built-in functions for the bit-vector primitives from the SMT-LIB standard, but only provided limited reasoning support. For Alt-Ergo 2.6, we set out to improve this reasoning support, and have developed a new and improved relational theory for bit-vectors. This new theory is based on an also new constraint propagation core that draws heavily on the architecture of the Colibri solver (as in <a href="https://cea.hal.science/cea-01795779">Sharpening Constraint Programming approaches for Bit-Vector Theory</a>), integrated into Alt-Ergo's existing normalizing Shostak solver.</p> <p>Bit-vectors are commonly used in verification of low-level code and in cryptography, so improved support significantly enhances Alt-Ergo’s applicability in these domains.</p> <p>There are still areas of improvements, so please share any issue you encounter with the bit-vector theory (or Alt-Ergo in general) via our <a href="https://github.com/ocamlpro/alt-ergo/issues">issue tracker</a>.</p> <p>To showcase improvements in Alt-Ergo 2.6, we compared it against the version 2.5 and industry-leading solvers Z3 and CVC5 on a dataset of bit-vector problems collected from our partners in the DéCySif project. 
The (no BV) variants for Alt-Ergo do not use the new bit-vector theory but instead an axiomatization of bit-vector primitives provided by Why3. The percentages represent the proportion of bit-vector problems solved successfully in each configuration.</p> <table class="table"> <thead> <tr class="table-light text-center"> <th scope="col"></th> <th scope="col" colspan="2">AE 2.5</th> <th scope="col" colspan="2">AE 2.6</th> <th scope="col">Z3 (4.12.5)</th> <th scope="col">CVC5 (1.1.2)</th> <th scope="col">Total</th> </tr> <tr> <th scope="row"></th> <td>(BV)</td> <td>(no BV)</td> <td>(BV)</td> <td>(no BV)</td> <td></td> <td></td> <td></td> </tr> </thead> <tbody> <tr> <th scope="row">#</th> <td>4128</td> <td>4870</td> <td>6265</td> <td>4940</td> <td>5482</td> <td>7415</td> <td>9038</td> </tr> <tr> <th scope="row">%</th> <td>46%</td> <td>54%</td> <td>69%</td> <td>54%</td> <td>61%</td> <td>82%</td> <td>100%</td> </tr> </tbody> </table> <p>As the table shows, Alt-Ergo 2.6 significantly outperforms version 2.5, and the new built-in bit-vector theory outperforms Why3's axiomatization. We even surpass Z3 on this benchmark, a testament to the new bit-vector theory in Alt-Ergo 2.6.</p> <h3>Model Generation</h3> <p>Bit-vector is not the only theory Alt-Ergo 2.6 improves upon. Model generation was introduced in Alt-Ergo 2.5 with support for booleans, integers, reals, arrays, enumerated types, and records. Alt-Ergo 2.6 extends this support to bit-vector and arbitrary algebraic data types, which means that model generation is now enabled for all the theories supported by Alt-Ergo.</p> <p>Model generation allows users to extract concrete examples or counterexamples, aiding in debugging and verification of their systems.</p> <p>Model generation is also more robust in Alt-Ergo 2.6, with numerous bug fixes and improvements for edge cases.</p> <h3>Optimization</h3> <p>Alt-Ergo 2.6 introduces optimization capabilities, available via SMT-LIB input using OptiSMT primitives such as <code>(minimize)</code> and <code>(maximize)</code> and compatible with Z3 and OptiMathSat. Optimization allows guiding the solver towards simpler and smaller counterexamples, helping users find more concrete and realistic scenarios to trigger a bug.</p> <p>See some <a href="https://ocamlpro.github.io/alt-ergo/latest/Optimization.html">examples</a> in the documentation.</p> <h3>SMT-LIB command support</h3> <p>Alt-Ergo 2.6 supports more SMT-LIB syntax and commands, such as:</p> <ul> <li>The <code>(get-info :all-statistics)</code> command to obtain information about the solver's statistics </li> <li>The <code>(reset)</code>, <code>(exit)</code> and <code>(echo)</code> commands </li> <li>The <code>(get-assignment)</code> command, as well as the <code>:named</code> attribute and <code>:produce-assignments</code> option </li> </ul> <p>See the <a href="https://smt-lib.org">SMT-LIB standard</a> for more details about these commands.</p> <h3>Floating-point theory</h3> <p>In this release, we have made Alt-Ergo's <a href="https://ocamlpro.github.io/alt-ergo/next/Alt_ergo_native/05_theories.html#floating-point-arithmetic">floating-point theory</a> enabled by default: there is no need to provide the <code>--enable-theories fpa</code> flag anymore. 
The theory can be disabled with <code>--disable-theories fpa,nra,ria</code> (the <code>nra</code> and <code>ria</code> theories were automatically enabled along with the <code>fpa</code> theory in Alt-Ergo 2.5).</p> <p>We have also made the floating-point primitives available in the SMT-LIB format as the indexed constant <code>ae.round</code> and the convenience <code>ae.float16</code>, <code>ae.float32</code>, <code>ae.float64</code> and <code>ae.float128</code> functions; see the <a href="https://ocamlpro.github.io/alt-ergo/v2.6.0/SMT-LIB_language/index.html#floating-point-arithmetic">documentation</a>.</p> <h3>Dolmen is the new default frontend</h3> <p>Introduced in Alt-Ergo 2.5, the Dolmen frontend has been rigorously tested for regressions and is now the default for both <code>.smt2</code> and <code>.ae</code> files; the <code>--frontend dolmen</code> flag that was introduced in Alt-Ergo 2.5 is no longer necessary.</p> <p>The Dolmen frontend is based on the <a href="https://github.com/gbury/dolmen">Dolmen</a> library developed by Guillaume Bury at OCamlPro. It provides excellent support for the SMT-LIB standard and is used to check validity of all new problems in the SMT-LIB benchmark collection, as well as the results of the annual SMT-LIB affiliated solver competition SMT-COMP.</p> <p>The preferred input format for Alt-Ergo is now the SMT-LIB format. The legacy <code>.ae</code> format is still supported, but is now deprecated and users are encouraged to migrate to the SMT-LIB format if possible. Please <a href="mailto:alt-ergo@ocamlpro.com">reach out</a> if you find any issue while migrating to the SMT-LIB format.</p> <p>As we announced when releasing Alt-Ergo 2.5, the legacy frontend (supports <code>.ae</code> files only) is deprecated in Alt-Ergo 2.6, but it can still be enabled with the <code>--frontend legacy</code> option. It will be removed entirely from Alt-Ergo 2.7.</p> <p>Parser extensions, such as the built-in AB-Why3 plugin, only work with the legacy frontend, and will no longer work with Alt-Ergo 2.7. We are not aware of any current users of either parser extensions or the AB-Why3 plugin: if you need these features, please reach out to us on <a href="https://github.com/ocamlpro/alt-ergo/issues">GitHub</a> or by <a href="mailto:alt-ergo@ocamlpro.com">email</a> so that we can figure out a path forward.</p> <h3>Use of <code>dune-site</code> for plugins</h3> <p>Starting with Alt-Ergo 2.6, we are using the plugin mechanism from <code>dune-site</code> to replace the custom plugin loading <code>Dynlink</code>. Plugins now need to be registered in the <code>(alt-ergo plugins)</code> site with the <a href="https://dune.readthedocs.io/en/stable/reference/dune/plugin.html"><code>plugin</code> stanza</a>.</p> <p>This does not impact users, but only impacts developers of Alt-Ergo plugins. See the <a href="https://github.com/OCamlPro/alt-ergo/blob/next/src/plugins/fm-simplex/dune">dune file</a> for Alt-Ergo's built-in FM-Simplex plugin for reference.</p> <h3>Binary releases on GitHub</h3> <p>Starting with Alt-Ergo 2.6, we will be providing binary releases on the <a href="https://github.com/ocamlpro/alt-ergo/releases">GitHub Releases</a> page for Linux (x86_64) and macOS (x86_64 and arm). These are released under the same <a href="https://ocamlpro.github.io/alt-ergo/latest/About/licenses/index.html">licensing conditions</a> as the Alt-Ergo source code.</p> <p>The binary releases are statically linked and have no dependencies, except for system dependencies on macOS. 
They do not support dynamically loading plugins.</p> <h3>Performance</h3> <p>For Alt-Ergo 2.6, our main focus of improvement in term of reasoning was on bit-vectors and algebraic data types. Other theories also benefit from broader performance improvements we made. On our internal problem dataset, Alt-Ergo 2.6 is about 5% faster than Alt-Ergo 2.5 on the goals they both prove.</p> <h3>And more!</h3> <p>This release also includes significant internal refactoring, notably a rewrite from scratch of the interval domain. This improves the accuracy of Alt-Ergo in handling interval arithmetic and facilitates mixed operations involving integers and bit-vectors, resulting in shorter and more reliable proofs.</p> <p>See the complete changelog <a href="https://ocamlpro.github.io/alt-ergo/v2.6.0/About/changes.html">here</a>.</p> <p>We encourage you to try out Alt-Ergo 2.6 and share your experience or any feedback on our <a href="https://github.com/OCamlPro/Alt-Ergo">GitHub</a> or by email at <a href="mailto:alt-ergo@ocamlpro.com">alt-ergo@ocamlpro.com</a>. Your input will help share future releases!</p> <h3>Acknowledgements</h3> <p>We thank the <a href="https://alt-ergo.ocamlpro.com/#club">Alt-Ergo Users' Club</a> members: AdaCore, the CEA, Thales, Mitsubishi Electric R&amp;D Center Europe (MERCE) and TrustInSoft.</p> <p>Special thanks to David Mentré and Denis Cousineau at MERCE for funding the initial optimization work. MERCE has been a Member of the Alt-Ergo Users' Club for four years. This partnership allowed Alt-Ergo to evolve and we hope that more users will join the Club on our journey to make Alt-Ergo a must-have tool.</p> <div class="figure"> <div class="card-light blog-logos"> <img alt="AdaCore logo" src="/assets/img/logo_adacore.svg"> <img alt="CEA list logo" src="/blog/assets/img/cealist.png"> <img alt="Thales logo" style="height: 24px;" src="/assets/img/logo_thales.svg"> <img alt="Mitsubishi Electric logo" src="/assets/img/logo_merce.png"> <img alt="TrustInSoft logo" style="height: 32px;" src="/assets/img/logo_trustinsoft.svg"> </div> <div class="caption">The dedicated members of our Alt-Ergo Club!</div> </div> Flambda2 Ep. 3: Speculative Inlining https://ocamlpro.com/blog/2024_08_09_the_flambda2_snippets_3 2024-08-09T13:48:57Z 2024-08-09T13:48:57Z Pierre Chambart Vincent Laviron Guillaume Bury Dario Pinto Nathanaëlle Courant Welcome to a new episode of The Flambda2 Snippets! The F2S blog posts aim at gradually introducing the world to the inner-workings of a complex piece of software engineering: The Flambda2 Optimising Compiler for OCaml, a technical marvel born from a 10 year-long effort in Research & Development and ... <p></p> <p> <div class="figure"> <p> <a href="/blog/assets/img/picture_egyptian_weighing_of_heart.jpg"> <img alt="A representation of Speculative Inlining through the famous Weighing Of The Heart of Egyptian Mythology. Egyptian God Anubis weighs his OCaml function, to see if it is worth inlining.<br />Credit: The Weighing of the Heart Ceremony, Ammit. Angus McBride (British, 1931-2007)" src="/blog/assets/img/picture_egyptian_weighing_of_heart.jpg"/> </a> <div class="caption"> A representation of Speculative Inlining through the famous Weighing Of The Heart of Egyptian Mythology. Egyptian God Anubis weighs his OCaml function, to see if it is worth inlining.<br />Credit: The Weighing of the Heart Ceremony, Ammit. 
Angus McBride (British, 1931-2007) </div> </p> </div> </p> <h3>Welcome to a new episode of The Flambda2 Snippets!</h3> <blockquote> <p>The <strong>F2S</strong> blog posts aim at gradually introducing the world to the inner-workings of a complex piece of software engineering: The <code>Flambda2 Optimising Compiler</code> for OCaml, a technical marvel born from a 10 year-long effort in Research &amp; Development and Compilation; with many more years of expertise in all aspects of Computer Science and Formal Methods.</p> </blockquote> <p>Today's article will serve as an introduction to one of the key design decisions structuring <code>Flambda2</code> that we will cover in the next episode in the series: <code>Upward and Downward Traversals</code>.</p> <p>See, there are interesting things to be said about how <code>inlining</code> is conducted inside of our compiler. <code>Inlining</code> in itself is rather ubiquitous in compilers. The goal here is to show how we approach <code>inlining</code>, and present what we call <code>Speculative Inlining</code>.</p> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#inliningingeneral">Inlining in general</a> </li> <li><a href="#detrimentalinlining">When inlining is detrimental</a> </li> <li><a href="#beneficialinlining">How to decide when inlining is beneficial</a> </li> <li><a href="#speculativeinlining">Speculative inlining</a> </li> <li><a href="#speculativeinlininginpractice">Speculative inlining in practice</a> </li> <li><a href="#summary">Summary</a> </li> <li><a href="#conclusion">Conclusion</a> </li> </ul> <p></div></p> <h2> <a id="inliningingeneral" class="anchor"></a><a class="anchor-link" href="#inliningingeneral">Inlining in general</a> </h2> <p>Given the way people write functional programs, <strong>inlining</strong> is an important part of the optimisation pipeline of such functional langages.</p> <p>What we call <strong>inlining</strong> in this series is the process of duplicating some code to specialise it to a specific context.</p> <p>Usually, this can be thought as copy-pasting the body of a function at its call site. A common misunderstanding is to think that the main benefit of this optimisation is to remove the cost of the function call. However, with modern computer architectures, this has become less and less relevant in the last decades. The actual benefit is to use the specific context to trigger further optimisations.</p> <p>Suppose we have the following <code>option_map</code> and <code>double</code> functions:</p> <pre><code class="language-ocaml">let option_map f x = match x with | None -&gt; None | Some x -&gt; Some (f x) let double i = i + i </code></pre> <p>Additionally, suppose we are currently considering the following function:</p> <pre><code class="language-ocaml">let stuff () = option_map double (Some 21) </code></pre> <p>In this short example, inlining the <code>option_map</code> function would perform the following transformation:</p> <pre><code class="language-ocaml">let stuff () = let f = double in let x = Some 21 in match x with | None -&gt; None | Some x -&gt; Some (f x) </code></pre> <p>Now we can inline the <code>double</code> function.</p> <pre><code class="language-ocaml">let stuff () = let x = Some 21 in match x with | None -&gt; None | Some x -&gt; Some (let i = x in i + i) </code></pre> <p>As you can see, inlining alone isn't that useful of an optimisation per se. 
In this context, applying <code>Constant Propagation</code> will optimise and simplify it to the following:</p> <pre><code class="language-ocaml">let stuff () = Some 42
</code></pre> <p>Although this is a toy example, combining small functions is a common pattern in functional programs. It's very convenient that using combinators is <strong>not</strong> significantly worse than writing this function by hand.</p> <h2> <a id="detrimentalinlining" class="anchor"></a><a class="anchor-link" href="#detrimentalinlining">When inlining is detrimental</a> </h2> <p>We cannot just go around and inline everything, everywhere... all at once.</p> <p>As we said, inlining is mainly code duplication, and doing it everywhere would drastically blow up the size of the compiled code. However, there is a sweet spot to be found, between absolute inlining and no inlining at all, but it is hard to find.</p> <p>Here's an example of exploding code at inlining time:</p> <pre><code class="language-ocaml">(* val h : int -&gt; int *)
let h n = (* Some non constant expression *)

(* val f : (int -&gt; int) -&gt; int -&gt; int *)
let f g x = g (g x)

(* 4 calls to f -&gt; 2^4 calls to h *)
let n = f (f (f (f h))) 42
</code></pre> <p>Following through with the inlining process will produce a very large binary relative to its source code. This contrived example highlights potential problems that might arise in ordinary codebases in the wild, even if this one is tailored to be <strong>quite nasty</strong> for inlining: notice the exponential blowup in the number of nested calls; every additional call to <code>f</code> doubles the number of calls to <code>h</code> after inlining.</p> <h2> <a id="beneficialinlining" class="anchor"></a><a class="anchor-link" href="#beneficialinlining">How to decide when inlining is beneficial</a> </h2> <p>Most compilers use a collection of heuristics to guide them in the decision making. A good collection of heuristics is hard to both design and fine-tune. They can also be quite specific to a programming style and unfit for other compilers to integrate. The take away is: <strong>there is no best way</strong>.</p> <blockquote> <p><strong>Side Note:</strong></p> <p>This topic would make for an interesting blog post but is, unfortunately, rather remote from the point of this article. If you are interested in going deeper into that subject right now, we have found references for you to explore until we get around to writing a comprehensive, and more digestible, explanation about the heuristic nature of inlining:</p> <ul> <li><a href="https://www.cambridge.org/core/services/aop-cambridge-core/content/view/8DD9A82FF4189A0093B7672193246E22/S0956796802004331a.pdf/secrets-of-the-glasgow-haskell-compiler-inliner.pdf"><strong>Secrets of the Glasgow Haskell Compiler inliner</strong>, <em>by SIMON PEYTON JONES and SIMON MARLOW, 2002</em></a>. </li> <li><a href="https://web.archive.org/web/20010615153947/https://www.cs.indiana.edu/~owaddell/papers/thesis.ps.gz"><strong>Extending the Scope of Syntactic Abstraction</strong>, <em>by OSCAR WADDELL, 1999. Section 4.4</em> (<strong>PDF Download link</strong>)</a>, for the case of Scheme. </li> <li><a href="https://dl.acm.org/doi/10.1145/182409.182489"><strong>Towards Better Inlining Decisions Using Inlining Trials</strong>, <em>by JEFFREY DEAN and CRAIG CHAMBERS, 1994</em></a>.
</li> <li><a href="https://ethz.ch/content/dam/ethz/special-interest/infk/ast-dam/documents/Theodoridis-ASPLOS22-Inlining-Paper.pdf"><strong>Understanding and Exploiting Optimal Function Inlining</strong>, <em>by THEODOROS THEODORIDIS, TOBIAS GROSSER, ZHENDONG SU, 2022</em></a>. </li> </ul> </blockquote> <p>Before we get to a concrete example, and break down <code>Speculative Inlining</code> for you, we would like to discuss the trade-offs of duplicating code.</p> <p>CPUs execute instructions one by one, or at least they pretend that they do. In order to execute an instruction, they need to load up into memory both code and data. In modern CPUs, most instructions take only a few cycles to execute and in practice, the CPUs often execute several at the same time. To put into perspective, loading memory, however, in the worst case, can take hundreds of CPU cycles... Most of the time it's not the case because CPUs have complex memory cache hierarchies such that loading from instruction cache can take just a few cycles, loading from level 2 caches may take dozens of them, and the worst case is loading from main memory which can take hundreds of cycles.</p> <p>The take away is, when executing a program, the cost of one instruction that has to be loaded from main memory can be <a href="https://norvig.com/21-days.html#answers">larger</a> than the cost of executing a hundred instructions in caches.</p> <p>There is a way to avoid the worst case scenario. Since caches are rather small in size, the main component to keeping from loading from main memory is to keep your program rather small, or at least the parts of it that are regularly executed.</p> <p>Keep these orders of magnitude in mind when we address the trade-offs between improving the number of instructions that we run and keeping the program to a reasonably small size.</p> <hr /> <p>Before explaining <code>Speculative Inlining</code> let's consider a piece of code.</p> <p>The following pattern is quite common in OCaml and other functional languages, let's see how one would go about inlining this code snippet.</p> <p><strong>Example 1:</strong> Notice the higher-order function <code>f</code>:</p> <pre><code class="language-ocaml">(* val f : (condition:bool -&gt; int -&gt; unit) -&gt; condition:bool -&gt; int -&gt; unit *) let f g ~condition n = for i = 0 to n do g ~condition i done let g_real ~condition i = if condition then (* small operation *) else (* big piece of code *) let condition = true let foo n = f g_real ~condition n </code></pre> <p>Even for such a small example we will see that the heuristics involved to finding the right solution can become quite complex.</p> <p>Keeping in mind the fact that <code>condition</code> is always <code>true</code>, the best set of inlining decisions would yield the following code:</p> <pre><code class="language-ocaml">(* All the code before [foo] is kept as is, from the previous codeblock *) let foo x = for i = 0 to x do (* small operation *) done </code></pre> <p>But if <code>condition</code> had been always <code>false</code>, instead of <code>small operation</code>, we would have had a big chunk of <code>g_real</code> duplicated in <code>foo</code> (i.e: <code>(* big piece of code *)</code>). Moreover it would have only spared us the running time of a few <code>call</code> instructions. 
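</p> <p>To make that trade-off concrete, here is a sketch of what <code>foo</code> would have looked like in that hypothetical case (our own illustration, written in the same placeholder style as <strong>Example 1</strong>, not code produced by the compiler):</p> <pre><code class="language-ocaml">(* Hypothetical outcome, assuming [condition] had been [false]:
   inlining [f] and then [g_real] would only have duplicated the
   large [else] branch inside [foo]. *)
let foo x =
  for i = 0 to x do
    (* big piece of code *)
  done
</code></pre> <p>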
Therefore, we would probably have preferred not to inline anything at all.</p> <p>Specifically, we would have liked to avoid inlining <code>g</code>, as well as <code>f</code>, because doing so would have needlessly increased the size of the code with no substantial benefit.</p> <p>However, if we want to be able to make an educated decision based on the value of <code>condition</code>, we will have to consider the entirety of the code relevant to that choice. Indeed, if we just look at the code for <code>f</code>, or its call site in <code>foo</code>, nothing would guide us to the right decision. In order to make the right decision, we need to understand that if the <code>~condition</code> parameter to the <code>g_real</code> function is <code>true</code>, then we can remove a <strong>large</strong> piece of code, namely: the <code>else</code> branch and the condition check as well.</p> <p>But to understand that the <code>~condition</code> in <code>g_real</code> is always <code>true</code>, we need to see it in the context of <code>f</code> in <code>foo</code>. This implies, again, that the choice of inlining is not based on a property of <code>g_real</code> but rather on a property of the context of its call.</p> <p>There exists a <strong>very large</strong> number of combinations of such difficult situations, each requiring <strong>different</strong> heuristics, which would be incredibly tedious to design, implement, and maintain.</p> <h2> <a id="speculativeinlining" class="anchor"></a><a class="anchor-link" href="#speculativeinlining">Speculative inlining</a> </h2> <p>We manage to circumvent the hurdle that this decision problem represents thanks to what we call <code>Speculative Inlining</code>. This strategy requires two properties from the compiler: the ability to inline and optimise at the same time, as well as being able to backtrack inlining decisions.</p> <p>Let's look at <strong>Example 1</strong> again and examine the <code>Speculative Inlining</code> strategy.</p> <pre><code class="language-ocaml">let f g ~condition n =
  for i = 0 to n do
    g ~condition i
  done

let g_real ~condition x =
  if condition
  then (* small operation *)
  else (* big piece of code *)

let condition = true

let foo x = f g_real ~condition x
</code></pre> <p>We will focus only on the traversal of the <code>foo</code> function.</p> <p>Before we try and inline anything, there are a couple of things we have to keep in mind about values and functions in OCaml:</p> <ol> <li><strong>Application arity may not match function arity</strong> </li> </ol> <p>To give you an idea, the function <code>foo</code> could also have been written in the following way:</p> <pre><code class="language-ocaml">let foo x =
  let f1 = f in
  let f2 = f1 g_real in
  let f3 = f2 ~condition in
  f3 x
</code></pre> <p>We expect the compiler to translate it as well as the original, but we cannot inline a function unless all its arguments are provided. To solve this, we need to handle partial applications precisely. Over-applications also present similar challenges.</p> <ol start="2"> <li><strong>Functions are values in OCaml</strong> </li> </ol> <p>We have to understand that the call to <code>f</code> in <code>foo</code> is <strong>not</strong> trivially a direct call to <code>f</code> in this context.
Indeed, at this point functions could instead be stored in pairs, or lists, or even hashtables, to be later retrieved and applied at will, and we call such functions <strong>general functions</strong>.</p> <p>Since our goal is to inline that function, we <strong>need</strong> to know its body. We call a function <strong>concrete</strong> when we have knowledge of its body. This entails <a href="https://en.wikipedia.org/wiki/Constant_folding"><code>Constant Propagation</code></a> in order to associate a <strong>concrete</strong> function with <strong>general</strong> function values and, consequently, be able to simplify it while inlining.</p> <p>Here's the simplest case to demonstrate the importance of <code>Constant Propagation</code>.</p> <pre><code class="language-ocaml">let foo_bar y =
  let pair = foo, y in
  (fst pair) (snd pair)
</code></pre> <p>In this case, we have to look inside the pair in order to find the function; this demonstrates that we sometimes have to do some amount of <strong>value analysis</strong> in order to proceed. It's quite common to come across such cases in OCaml programs because of the module system, and other functional languages present similar characteristics.</p> <p>There are many scenarios which also require a decent amount of context in order to identify which function should be called. For example, when a function passed as a parameter is called, we need to know the context of the caller function<strong>s</strong>, sometimes up to an arbitrarily large context. Analysing the relevant context will tell us which function is being called and thus help us make educated inlining decisions. This problem is specific to functional languages: functions in good old imperative languages are seldom ambiguous, even though such considerations would be relevant when function pointers are involved.</p> <p>This small code snippet shows us that we <strong>have</strong> to inline some functions in order to know whether we should have inlined them.</p> <h3> <a id="speculativeinlininginpractice" class="anchor"></a><a class="anchor-link" href="#speculativeinlininginpractice">Speculative inlining in practice</a> </h3> <p>In practice, <code>Speculative Inlining</code> means being able to quantify the benefits brought by a set of optimisations, which have to be applied after a given inlining decision, and to use these results to determine whether said inlining decision is in fact worth carrying out, all things considered.</p> <p>The criterion for accepting an inlining decision is that the resulting code <strong>should be</strong> faster than the original one. We use <em>&quot;should be&quot;</em> because program speed cannot be fully understood with absolutes.</p> <p>That's why we use a heuristic algorithm in order to compare the original and the optimised versions of the code. It roughly consists in counting the number of retired (executed) instructions and comparing it to the increase in code size introduced by inlining the body of that function. The value of that cut-off ratio is by definition heuristic, and different compilation options given to <code>ocamlopt</code> change it.</p> <p>As said previously, we cannot go around and evaluate each inlining decision independently, because there are cases where inlining a function allows for more of them to happen, and sometimes a given inlining choice validates another one.
We can see this in <strong>Example 1</strong>, where deciding <strong>not</strong> to inline function <code>g_real</code> would make the inlining of function <code>f</code> useless.</p> <p>Naturally, the combinations of inlining decisions cannot all be explored exhaustively. We can only explore a small subset of them, and for that we have another heuristic that was already used in <code>Flambda1</code>, although <code>Flambda2</code> does not yet implement it in full.</p> <p>It's quite simple: we choose to consider inlining decision relationships only when there are nested calls. As with any other heuristic, it does not cover every useful case, but not only is it the easiest to implement, we are also fairly confident that it covers the most important cases.</p> <p>Here's a small rundown of that heuristic:</p> <ul> <li><code>A</code> is a function which calls <code>B</code> <ul> <li><strong>Case 1</strong>: we evaluate the body of <code>A</code> at its definition, possibly inlining <code>B</code> in the process </li> <li><strong>Case 2</strong>: at a specific callsite of <code>A</code>, we evaluate <code>A</code> in the inlining context. <ul> <li><strong>Case 2.a</strong>: inlining <code>A</code> is beneficial no matter the decision on <code>B</code>, so we do it. </li> <li><strong>Case 2.b</strong>: inlining <code>A</code> is potentially detrimental, so we go and evaluate <code>B</code> before deciding to inline <code>A</code> for good. </li> </ul> </li> </ul> </li> </ul> <p>Keep in mind that case <strong>2.b</strong> is recursive and can go arbitrarily deep. This amounts to looking for the best leaf in the decision tree. Since we can't explore the whole tree, we do set a limit on the depth of the exploration.</p> <blockquote> <p><strong>Reminder for our fellow Cameleers</strong>: <code>Flambda1</code> and <code>Flambda2</code> have a flag you can pass on the CLI to generate a <code>.org</code> file detailing all the inlining decisions taken by the compiler. That flag is: <code>-inlining-report</code>. Note that <code>.org</code> files make it easy to visualise a decision tree inside the Emacs editor.</p> </blockquote> <h2> <a id="summary" class="anchor"></a><a class="anchor-link" href="#summary">Summary</a> </h2> <p>By now, you should have a better understanding of the intricacies inherent to <code>Speculative Inlining</code>. Prior to its inception, it was fair to question how feasible (and eligible, considering the many requirements for developing a compiler) such an algorithm would be in practice. Since then, it has demonstrated its usefulness in <code>Flambda1</code> and, consequently, its porting to <code>Flambda2</code> was called for.</p> <p>So before we move on to the next stop in the <a href="/blog/2024_03_18_the_flambda2_snippets_0#listing"><strong>F2S</strong></a> series, let's summarize what we know of <code>Speculative Inlining</code>.</p> <p>We learned that <strong>inlining</strong> is the process of copying the body of a function at its callsite. We also learned that it is not a very interesting transformation by itself, especially nowadays with how efficient modern CPUs are, but that its usefulness is found in how it <strong>facilitates other optimisations</strong> that take place later.</p> <p>We also learned about the <strong>heuristic</strong> nature of inlining and how it would be difficult to maintain finely-tailored heuristics in the long run, as many others have tried before us.
Actually, it is because <strong>there is no best way</strong> that we have come up with the need for an algorithm capable of simultaneously performing <strong>inlining</strong> and <strong>optimising</strong>, as well as <strong>backtracking</strong> when needed, which we called <code>Speculative Inlining</code>. In a nutshell, <code>Speculative Inlining</code> is one of the algorithms of the optimisation framework of <code>Flambda2</code> which enables other optimisations to take place.</p> <p>We have covered the constraints that the algorithm has to respect for it to hold ground in practice, such as <strong>performance</strong>. We value a fast compiler and aim to keep both the compiler itself and the code it generates fast. Take an optimisation such as <code>Constant Propagation</code> as an example. It would be a <em>naïve</em> approach to try and perform this transformation everywhere, because the resulting complexity of the compiler would amount to something like <code>size_of_the_code * number_of_inlinings_performed</code>, which is unacceptable to say the least. We aim at making the complexity of our compiler linear in the code size, which in turn entails plenty of <strong>logarithms</strong> whenever possible. Instead, we choose to apply any transformation only in the inlined parts of the code.</p> <p>With all these parameters in mind, can we imagine ways to tackle these <strong>multi-layered challenges</strong> all at the same time? There are solutions out there that do so in an imperative manner. In fact, the most intuitive way to implement such an algorithm may be fairly easily done with imperative code. You may want to read about <code>Equality Saturation</code> for instance, or even <a href="http://www-sop.inria.fr/members/Manuel.Serrano/publi/serrano-plilp97.ps.gz">download Manuel Serrano's paper about the Scheme Bigloo compiler</a> to learn more about it. However, we require backtracking, and the nested nature of these transformations (inlining, followed by different optimising transformations) <strong>would make backtracking bug-prone and tedious to maintain</strong> if it were to be written imperatively.</p> <p>It soon became evident to us that we were going to leverage one of the key characteristics of functional languages in order to make this whole ordeal easier to design, implement and maintain: <strong>purity of terms</strong>. Indeed, not only is it easier to support backtracking when manipulating <strong>pure</strong> code, but it also becomes impossible for us to introduce cascades of hard-to-detect nested bugs by avoiding transforming code <strong>in place</strong>. From this point on, we knew we had to perform all transformations at the same time, making our inlining function one that would return an <strong>optimised inlined function</strong>. This does introduce complexities that we have chosen over the hurdles of maintaining an imperative version of that same algorithm, which can be seen as pertaining to <code>graph traversal</code> and <code>tree rewriting</code> for all intents and purposes.</p> <p>Despite the density of this article, keep in mind that we aim at explaining <code>Flambda2</code> in the most comprehensive manner possible and that there are deliberate shortcuts taken throughout these snippets for all of this to make sense for the broader audience.
In time, these articles will go deep into the guts of the compiler and by then, hopefully, we will have done a good job at providing our readers with all necessary information for all of you to continue enjoying this rabbit-hole with us!</p> <p>Here's a pseudo-code snippet representing <code>Speculative Inlining</code>.</p> <pre><code class="language-ocaml">(* Pseudo-code to rpz the actual speculation *) let try_inlining f env args = let inlined_version_of_f = inline f env args in let benefit = compare inlined_version_of_f f in if benefit &gt; 0 then inlined_version_of_f else f </code></pre> <h2> <a id="conclusion" class="anchor"></a><a class="anchor-link" href="#conclusion">Conclusion</a> </h2> <p>As we said at the start of this article, this one is but an introduction to a major topic we will cover next, namely: <code>Upwards and Downwards Traversals</code>.</p> <p>We had to cover <code>Speculative Inlining</code> first. It is a reasonably approachable solution to a complex problem, and having an idea of all the requirements for its good implementation is half of the work done for understanding key design decisions such as how code traversal was designed for algorithms such as <code>Speculative Inlining</code> to hold out.</p> <hr /> <p><strong>Thank you all for reading! We hope that these articles will keep the community hungry for more!</strong></p> <p><strong>Until next time, keep calm and OCaml!</strong> <a href="https://egypt-museum.com/the-weighing-of-the-heart-ceremony/">⚱️🐫🏺📜</a></p> opam 2.2.0 release! https://ocamlpro.com/blog/2024_07_01_opam_2_2_0_releases 2024-07-01T13:48:57Z 2024-07-01T13:48:57Z Raja Boujbel - OCamlPro Kate Deplaix - Ahrefs David Allsopp - Tarides Feedback on this post is welcomed on Discuss! We are very pleased to announce the release of opam 2.2.0, and encourage all users to upgrade. Please read on for installation and upgrade instructions. NOTE: this article is cross-posted on opam.ocaml.org and ocamlpro.com, and published in discuss.ocaml... <p><em>Feedback on this post is welcomed on <a href="https://discuss.ocaml.org/t/ann-opam-2-2-0-is-out/14893">Discuss</a>!</em></p> <p>We are very pleased to announce the release of opam 2.2.0, and encourage all users to upgrade. 
Please read on for installation and upgrade instructions.</p> <blockquote> <p>NOTE: this article is cross-posted on <a href="https://opam.ocaml.org/blog/">opam.ocaml.org</a> and <a href="/blog">ocamlpro.com</a>, and published in <a href="https://discuss.ocaml.org/t/ann-opam-2-2-0-is-out/14893">discuss.ocaml.org</a>.</p> </blockquote> <h2>Try it!</h2> <p>In case you plan a possible rollback, you may want to first backup your <code>~/.opam</code> or <code>$env:LOCALAPPDATAopam</code> directory.</p> <p>The upgrade instructions are unchanged:</p> <ol> <li>Either from binaries: run </li> </ol> <p>For Unix systems</p> <pre><code class="language-shell-session">bash -c &quot;sh &lt;(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) --version 2.2.0&quot; </code></pre> <p>or from PowerShell for Windows systems</p> <pre><code class="language-shell-session">Invoke-Expression &quot;&amp; { $(Invoke-RestMethod https://raw.githubusercontent.com/ocaml/opam/master/shell/install.ps1) }&quot; </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.2.0">the Github &quot;Releases&quot; page</a> to your PATH.</p> <ol start="2"> <li>Or from source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.2.0#compiling-this-repo">README</a>. </li> </ol> <p>You should then run:</p> <pre><code class="language-shell-session">opam init --reinit -ni </code></pre> <h2>Changes</h2> <h3>Major change: Windows support</h3> <p>After 8 years' effort, opam and opam-repository now have official native Windows support! A big thank you is due to Andreas Hauptmann (<a href="https://github.com/fdopen">@fdopen</a>), whose <a href="https://github.com/fdopen/godi-repo">WODI</a> and <a href="https://fdopen.github.io/opam-repository-mingw/">OCaml for Windows</a> projects were for many years the principal downstream way to obtain OCaml on Windows, Jun Furuse (<a href="https://github.com/camlspotter">@camlspotter</a>) whose <a href="https://inbox.vuxu.org/caml-list/CAAoLEWsQK7=qER66Uixx5pq4wLExXovrQWM6b69_fyMmjYFiZA@mail.gmail.com/">initial experimentation with OPAM from Cygwin</a> formed the basis of opam-repository-mingw, and, most recently, Jonah Beckford (<a href="https://github.com/JonahBeckford">@jonahbeckford</a>) whose <a href="https://diskuv.com/dkmlbook/">DkML</a> distribution kept - and keeps - a full development experience for OCaml available on Windows.</p> <p>OCaml when used on native Windows requires certain tools from the Unix world which are provided by either <a href="https://cygwin.com">Cygwin</a> or <a href="https://msys2.org">MSYS2</a>. We have engineered <code>opam init</code> so that it is possible for a user not to need to worry about this, with <code>opam</code> managing the Unix world, and the user being able to use OCaml from either the Command Prompt or PowerShell. However, for the Unix user coming over to Windows to test their software, it is also possible to have your own Cygwin/MSYS2 installation and use native Windows opam from that. Please see the <a href="https://opam.ocaml.org/blog/opam-2-2-0-windows/">previous blog post</a> for more information.</p> <p>There are two &quot;ports&quot; of OCaml on native Windows, referred to by the name of provider of the C compiler. The mingw-w64 port is <a href="https://www.mingw-w64.org/">GCC-based</a>. 
opam's external dependency (depext) system works for this port (including providing GCC itself), and many packages are already well-supported in opam-repository, thanks to the previous efforts in <a href="https://github.com/fdopen/opam-repository-mingw">opam-repository-mingw</a>. The MSVC port is <a href="https://visualstudio.microsoft.com/">Visual Studio-based</a>. At present, there is less support in this ecosystem for external dependencies, though this is something we expect to work on both in opam-repository and in subsequent opam releases. In particular, it is necessary to install Visual Studio or Visual Studio BuildTools separately, but opam will then automatically find and use the C compiler from Visual Studio.</p> <h3>Major change: opam tree / opam why</h3> <p><code>opam tree</code> is a new command showing packages and their dependencies with a tree view. It is very helpful to determine which packages bring which dependencies in your installed switch.</p> <pre><code class="language-shell-session">$ opam tree cppo cppo.1.6.9 ├── base-unix.base ├── dune.3.8.2 (&gt;= 1.10) │ ├── base-threads.base │ ├── base-unix.base [*] │ └── ocaml.4.14.1 (&gt;= 4.08) │ ├── ocaml-base-compiler.4.14.1 (&gt;= 4.14.1~ &amp; &lt; 4.14.2~) │ └── ocaml-config.2 (&gt;= 2) │ └── ocaml-base-compiler.4.14.1 (&gt;= 4.12.0~) [*] └── ocaml.4.14.1 (&gt;= 4.02.3) [*] </code></pre> <p>Reverse-dependencies can also be displayed using the new <code>opam why</code> command. This is useful to examine how dependency versions get constrained.</p> <pre><code class="language-shell-session">$ opam why cmdliner cmdliner.1.2.0 ├── (&gt;= 1.1.0) b0.0.0.5 │ └── (= 0.0.5) odig.0.0.9 ├── (&gt;= 1.1.0) ocp-browser.1.3.4 ├── (&gt;= 1.0.0) ocp-indent.1.8.1 │ └── (&gt;= 1.4.2) ocp-index.1.3.4 │ └── (= version) ocp-browser.1.3.4 [*] ├── (&gt;= 1.1.0) ocp-index.1.3.4 [*] ├── (&gt;= 1.1.0) odig.0.0.9 [*] ├── (&gt;= 1.0.0) odoc.2.2.0 │ └── (&gt;= 2.0.0) odig.0.0.9 [*] ├── (&gt;= 1.1.0) opam-client.2.2.0~alpha │ ├── (= version) opam.2.2.0~alpha │ └── (= version) opam-devel.2.2.0~alpha ├── (&gt;= 1.1.0) opam-devel.2.2.0~alpha [*] ├── (&gt;= 0.9.8) opam-installer.2.2.0~alpha └── user-setup.0.7 </code></pre> <blockquote> <p>Special thanks to <a href="https://github.com/cannorin">@cannorin</a> for contributing this feature.</p> </blockquote> <h3>Major change: with-dev-setup</h3> <p>There is now a way for a project maintainer to share their project development tools: the <code>with-dev-setup</code> dependency flag. It is used in the same way as <code>with-doc</code> and <code>with-test</code>: by adding a <code>{with-dev-setup}</code> filter after a dependency. It will be ignored when installing normally, but it's pulled in when the package is explicitly installed with the <code>--with-dev-setup</code> flag specified on the command line.</p> <p>For example</p> <pre><code class="language-shell-session">opam-version: &quot;2.0&quot; depends: [ &quot;ocaml&quot; &quot;ocp-indent&quot; {with-dev-setup} ] build: [make] install: [make &quot;install&quot;] post-messages: [ &quot;Thanks for installing the package&quot; &quot;as well as its development setup. It will help with your future contributions&quot; {with-dev-setup} ] </code></pre> <h3>Major change: opam pin --recursive</h3> <p>When pinning a package using <code>opam pin</code>, opam looks for opam files in the root directory only. 
With recursive pinning, you can now instruct opam to look for <code>.opam</code> files in subdirectories as well, while maintaining the correct relationship between the <code>.opam</code> files and the package root for versioning and build purposes.</p> <p>Recursive pinning is enabled by the following options to <code>opam pin</code> and <code>opam install</code>:</p> <ul> <li>With <code>--recursive</code>, opam will look for <code>.opam</code> files recursively in all subdirectories. </li> <li>With <code>--subpath &lt;path&gt;</code>, opam will only look for <code>.opam</code> files in the subdirectory <code>&lt;path&gt;</code>. </li> </ul> <p>The two options can be combined: for instance, if your opam packages are stored as a deep hierarchy in the <code>mylib</code> subdirectory of your project you can try <code>opam pin . --recursive --subpath mylib</code>.</p> <p>These options are useful when dealing with a large monorepo-type repository with many opam libraries spread about.</p> <h3>New Options</h3> <ul> <li> <p><code>opam switch -</code>, inspired by <code>git switch -</code>, makes opam switch back to the previously selected global switch.</p> </li> <li> <p><code>opam pin --current</code> fixes a package to its current state (disabling pending reinstallations or removals from the repository). The installed package will be pinned to its current installed state, i.e. the pinned opam file is the one installed.</p> </li> <li> <p><code>opam pin remove --all</code> removes all the pinned packages from a switch.</p> </li> <li> <p><code>opam exec --no-switch</code> removes the opam environment when running a command. It is useful when you want to launch a command without opam environment changes.</p> </li> <li> <p><code>opam clean --untracked</code> removes untracked files interactively remaining from previous packages removal.</p> </li> <li> <p><code>opam admin add-constraint &lt;cst&gt; --packages pkg1,pkg2,pkg3</code> applies the given constraint to a given set of packages</p> </li> <li> <p><code>opam list --base</code> has been renamed into <code>--invariant</code>, reflecting the fact that since opam 2.1 the &quot;base&quot; packages of a switch are instead expressed using a switch invariant.</p> </li> <li> <p><code>opam install --formula &lt;formula&gt;</code> installs a formula instead of a list of packages. This can be useful if you would like to install one package or another one. For example <code>opam install --formula '&quot;extlib&quot; |&quot;extlib-compat&quot;'</code> will install either <code>extlib</code> or <code>extlib-compat</code> depending on what's best for the current switch.</p> </li> </ul> <h3>Miscellaneous changes</h3> <ul> <li>The UI now displays a status when extracting an archive or reloading a repository </li> <li>Overhauled the implementation of <code>opam env</code>, fixing many corner cases for environment updates and making the reverting of package environment variables precise. As a result, using <code>setenv</code> in an opam file no longer triggers a lint warning. 
</li> <li>Fix parsing pre-opam 2.1.4 switch import files containing extra-files </li> <li>Add a new <code>sys-ocaml-system</code> default global eval variable </li> <li>Hijack the <code>&quot;%{var?string-if-true:string-if-false-or-undefined}%&quot;</code> syntax to support extending the variables of packages with <code>+</code> in their name (<code>conf-c++</code> and <code>conf-g++</code> already exist) using <code>&quot;%{?pgkname:var:}%&quot;</code> </li> <li>Fix issues when using fish as shell </li> <li>Sandbox: Mark the user temporary directory (as returned by <code>getconf DARWIN_USER_TEMP_DIR</code>) as writable when TMPDIR is not defined on macOS </li> <li>Add Warning 69: Warn for new syntax when package name in variable in string interpolation contains several '+' (this is related to the &quot;hijack&quot; item above) </li> <li>Add support for Wolfi OS, treating it like Alpine family as it also uses apk </li> <li>Sandbox: <code>/tmp</code> is now writable again, restoring POSIX compliance </li> <li>Add a new <code>opam admin: new add-extrafiles</code> command to add/check/update the <code>extra-files:</code> field according to the files present in the <code>files/</code> directory </li> <li>Add a new <code>opam lint -W @1..9</code> syntax to allow marking a set of warnings as errors </li> <li>Fix bugs in the handling of the <code>OPAMCURL</code>, <code>OPAMFETCH</code> and <code>OPAMVERBOSE</code> environment variables </li> <li>Fix bugs in the handling of the <code>--assume-built</code> argument </li> <li>Software Heritage fallbacks is now supported, but is disabled-by-default for now. For more information you can read one of our <a href="https://opam.ocaml.org/blog/opam-2-2-0-alpha/#Software-Heritage-Binding">previous blog post</a> </li> </ul> <p>And many other general and performance improvements were made and bugs were fixed. You can take a look to previous blog posts. API changes and a more detailed description of the changes are listed in:</p> <ul> <li><a href="https://github.com/ocaml/opam/releases/tag/2.2.0-alpha">the release note for 2.2.0~alpha</a> </li> <li><a href="https://github.com/ocaml/opam/releases/tag/2.2.0-alpha2">the release note for 2.2.0~alpha2</a> </li> <li><a href="https://github.com/ocaml/opam/releases/tag/2.2.0-alpha3">the release note for 2.2.0~alpha3</a> </li> <li><a href="https://github.com/ocaml/opam/releases/tag/2.2.0-beta1">the release note for 2.2.0~beta1</a> </li> <li><a href="https://github.com/ocaml/opam/releases/tag/2.2.0-beta2">the release note for 2.2.0~beta2</a> </li> <li><a href="https://github.com/ocaml/opam/releases/tag/2.2.0-beta3">the release note for 2.2.0~beta3</a> </li> <li><a href="https://github.com/ocaml/opam/releases/tag/2.2.0-rc1">the release note for 2.2.0~rc1</a> </li> <li><a href="https://github.com/ocaml/opam/releases/tag/2.2.0">the release note for 2.2.0</a> </li> </ul> <p>This release also includes PRs improving the documentation and improving and extending the tests.</p> <p>Please report any issues to <a href="https://github.com/ocaml/opam/issues">the bug-tracker</a>.</p> <p>We hope you will enjoy the new features of opam 2.2! 📯</p> Flambda2 Ep. 2: Loopifying Tail-Recursive Functions https://ocamlpro.com/blog/2024_05_07_the_flambda2_snippets_2 2024-05-07T13:48:57Z 2024-05-07T13:48:57Z Nathanaëlle Courant Guillaume Bury Pierre Chambart Vincent Laviron Dario Pinto Welcome to a new episode of The Flambda2 Snippets! 
Today's topic is Loopify, one of Flambda2's many optimisation algorithms which specifically deals with optimising both purely tail-recursive and/or functions annotated with the [@@loop] attribute in OCaml. A lazy explanation for its utility would be... <p></p> <p> <div class="figure"> <p> <a href="/blog/assets/img/F2S_loopify_figure.png"> <img alt="Two camels are taking a break from crossing the desert, they know their path could not have been more optimised." src="/blog/assets/img/F2S_loopify_figure.png"/> </a> <div class="caption"> Two camels are taking a break from crossing the desert, they know their path could not have been more optimised. </div> </p> </div> </p> <h3>Welcome to a new episode of <strong>The Flambda2 Snippets</strong>!</h3> <p>Today's topic is <code>Loopify</code>, one of <code>Flambda2</code>'s many optimisation algorithms which specifically deals with optimising both <em>purely tail-recursive</em> and/or functions <em>annotated</em> with the <code>[@@loop]</code> attribute in OCaml.</p> <p>A lazy explanation for its utility would be to say that it simply aims at reducing the number of memory allocations in the context of <em>recursive</em> and <em>tail-recursive</em> function calls in OCaml. However, we will see that is just <strong>part</strong> of the point and thus we will tend to address the broader context: what are <em>tail-calls</em>, how they are optimised and how they fit in the functional programming world, what dilemma does <code>Loopify</code> nullify exactly and, in time, many details on how it's all implemented!</p> <p>If you happen to be stumbling upon this article and wish to get a bird's-eye view of the entire <strong>F2S</strong> series, be sure to refer to <a href="/blog/2024_03_18_the_flambda2_snippets_0">Episode 0</a> which does a good amount of contextualising as well as summarising of, and pointing to, all subsequent episodes.</p> <p><strong>All feedback is welcome, thank you for staying tuned and happy reading!</strong></p> <blockquote> <p>The <strong>F2S</strong> blog posts aim at gradually introducing the world to the inner-workings of a complex piece of software engineering: The <code>Flambda2 Optimising Compiler</code>, a technical marvel born from a 10 year-long effort in Research &amp; Development and Compilation; with many more years of expertise in all aspects of Computer Science and Formal Methods.</p> </blockquote> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#tco">Tail-Call Optimisation</a> </li> <li><a href="#tailcallsinocaml">Tail-Calls in OCaml</a> </li> <li><a href="#conundrum">The Conundrum of Reducing allocations Versus Writing Clean Code</a> </li> <li><a href="#loopify">Loopify</a> <ul> <li><a href="#concept">Concept</a> </li> <li><a href="#toloopifyornottoloopify">Deciding to Loopify or not</a> </li> <li><a href="#thetransformation">The nature of the transformation</a> </li> </ul> </li> <li><a href="#conclusion">Conclusion</a> </div> </li> </ul> <h2> <a id="tco" class="anchor"></a><a class="anchor-link" href="#tco">Tail-Call Optimisation</a> </h2> <p>As far as we know, Tail-Call optimisation (TCO) has been a reality since at least the 70s. Some LISP implementations used it and Scheme specified it into its language around 1975.</p> <p>The debate to support TCO happens regularly today still. Nowadays, it's a given that most functional languages support it (Scala, OCaml, Haskell, Scheme and so on...). Other languages and compilers have supported it for some time too. 
Either optionally, with some C compilers (gcc and clang) that support TCO in some specific compilation scenarios; or systematically, like Lua, which, despite not usually being considered a functional language, specifies that TCO occurs whenever possible (<a href="https://www.lua.org/manual/5.3/manual.html#3.4.10">you may want to read section 3.4.10 of the Lua manual here</a>).</p> <p><strong>So what exactly is Tail-Call Optimisation ?</strong></p> <p>A place to start would be the <a href="https://en.wikipedia.org/wiki/Tail-call_optimisation">Wikipedia page</a>. You may also find some precious insight about the link between the semantics of <code>GOTO</code> and tail calls <a href="https://www.college-de-france.fr/fr/agenda/cours/structures-de-controle-de-goto-aux-effets-algebriques/programmer-ses-structures-de-controle-continuations-et-operateurs-de-controle">here</a>, a course from Xavier Leroy at the <em>College de France</em>, which is in French.</p> <p>Additionally to these resources, here are images to help you visualise how TCO improves stack memory consumption. Assume that <code>g</code> is a recursive function called from <code>f</code>:</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/F2S_2_stack_no_tail_rec_call.svg"> <img alt="A representation of the textbook behaviour for recursive functions stackframe allocations. You can see here that the stackframes of non-tail-recursive functions are allocated sequentially on decreasing memory addresses which may eventually lead to a stack overflow." src="/blog/assets/img/F2S_2_stack_no_tail_rec_call.svg"/> </a> <div class="caption"> A representation of the textbook behaviour for recursive functions stackframe allocations. You can see here that the stackframes of non-tail-recursive functions are allocated sequentially on decreasing memory addresses which may eventually lead to a stack overflow. </div> </p> </div> </p> <p>Now, let's consider a tail-recursive implementation of the <code>g</code> function in a context where TCO is <strong>not</strong> supported. Tail-recursion means that the last thing <code>t_rec_g</code> does before returning is calling itself. The key is that we still have a frame for the caller version of <code>t_rec_g</code> but we know that it will only be used to return to the parent. The frame itself no longer holds any relevant information besides the return address and thus the corresponding memory space is therefore mostly wasted.</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/F2S_2_stack_tail_rec_call_no_tco.svg"> <img alt="A representation of the textbook behaviour for tail-recursive functions stackframe allocations without Tail Call Optimisation (TCO). When TCO is not implemented the behaviour for these allocations and the potential for a stack overflow are the same as with non-tail-recursive functions." src="/blog/assets/img/F2S_2_stack_tail_rec_call_no_tco.svg"/> </a> <div class="caption"> A representation of the textbook behaviour for tail-recursive functions stackframe allocations without Tail Call Optimisation (TCO). When TCO is not implemented the behaviour for these allocations and the potential for a stack overflow are the same as with non-tail-recursive functions. </div> </p> </div> </p> <p>And finally, let us look at the same function in a context where TCO <strong>is</strong> supported. 
It is now apparent that memory consumption is much improved by the fact that we reuse the space from the previous stackframe to allocate the next one all the while preserving its return address:</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/F2S_2_stack_tail_rec_call_tco.svg"> <img alt="A representation of the textbook behaviour for tail-recursive functions stackframe allocations with TCO. Since TCO is implemented, we can see that the stack memory consumption is now constant, and that the potential that this specific tail-recursive function will lead to a stack overflow is diminished." src="/blog/assets/img/F2S_2_stack_tail_rec_call_tco.svg"/> </a> <div class="caption"> A representation of the textbook behaviour for tail-recursive functions stackframe allocations with TCO. Since TCO is implemented, we can see that the stack memory consumption is now constant, and that the potential that this specific tail-recursive function will lead to a stack overflow is diminished. </div> </p> </div> </p> <h3> <a id="tailcallsinocaml" class="anchor"></a><a class="anchor-link" href="#tailcallsinocaml">Tail-Calls in OCaml</a> </h3> <p>The <code>List</code> data structure is fundamental to and ubiquitous in functional programming. Therefore, it's important to not have an arbitrary limit on the size of lists that one can manipulate. Indeed, most <code>List</code> manipulation functions are naturally expressed as recursive functions, and can most of the time be implemented as tail-recursive functions. Without guaranteed TCO, a programmer could not have the assurance that their program would not stack overflow at some point. That reasoning also applies to a lot of other recursive data structures that commonly occur in programs or libraries.</p> <p>In OCaml, TCO is guaranteed. Ever since its inception, Cameleers have unanimously agreed to guarantee the optimisation of tail-calls. While the compiler's support for TCO has been a thing from the beginning, <a href="https://v2.ocaml.org/manual/attributes.html#ss%3Abuiltin-attributes">an attribute</a>, <code>[@tailcall]</code> was later added to help users ensure that their calls are in tail position.</p> <p>Recently, TCO was also extended with the <a href="https://v2.ocaml.org/manual/tail_mod_cons.html"><code>Tail Mod Cons</code> optimisation</a> which allows to generate tail-calls in more cases.</p> <h3> <a id="conundrum" class="anchor"></a><a class="anchor-link" href="#conundrum">The Conundrum of Reducing Allocations Versus Writing Clean Code</a> </h3> <p>One would find one of the main purposes for the existence of <code>Loopify</code> in the following conversation: a Discuss Post about <a href="https://discuss.ocaml.org/t/how-to-speed-up-this-function/10286">the unboxing of floating-point values in OCaml</a> and performance.</p> <p><a href="https://discuss.ocaml.org/t/how-to-speed-up-this-function/10286/9">This specific comment</a> sparks a secondary conversation that you may want to read yourself but will find a quick breakdown of below and that will be a nice starting point to understand today's subject.</p> <p>Consider the following code:</p> <pre><code class="language-ocaml">let sum l = let rec loop s l = match l with | [] -&gt; s | hd :: tl -&gt; (* This allocates a boxed float *) let s = s +. hd in loop s tl in loop 0. l </code></pre> <p>This is a simple tail-recursive implementation of a <code>sum</code> function for a list of floating-point numbers. 
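</p> <p>Incidentally, the <code>[@tailcall]</code> attribute mentioned earlier lets the compiler check that claim for us: annotating the recursive call produces a warning if it is not, in fact, compiled as a tail call. Here is a purely illustrative variant of the same function with the annotation added:</p> <pre><code class="language-ocaml">let sum l =
  let rec loop s l =
    match l with
    | [] -&gt; s
    | hd :: tl -&gt;
      let s = s +. hd in
      (* The compiler warns here if this call is not a tail call. *)
      (loop [@tailcall]) s tl
  in
  loop 0. l
</code></pre> <p>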
However, this implementation is not as efficient as we would like it to be.</p> <p>Indeed, OCaml needs a uniform representation of its values in order to implement polymorphic functions. In the case of floating-point numbers, this means that the numbers are boxed whenever they need to be used as generic values.</p> <p>Besides, every time we call a function, all parameters have to be considered as generic values. We thus cannot avoid their allocation at each recursive call in this function.</p> <p>If we were to optimise it in order to get every last bit of performance out of it, we could try something like:</p> <p><strong>Warning: The following was coded by trained professionals, do NOT try this at home.</strong></p> <pre><code class="language-ocaml">let sum l = (* Local references *) let s = ref 0. in let cur = ref l in try while true do match !cur with | [] -&gt; raise Exit | hd :: tl -&gt; (* Unboxed floats -&gt; No allocation *) s := !s +. hd; cur := tl done; assert false with Exit -&gt; !s (* The only allocation *) </code></pre> <p>While in general references introduce one allocation and a layer of indirection, when the compiler can prove that a reference is strictly local to a given function, it will use mutable variables instead of reference cells.</p> <p>In our case, <code>s</code> and <code>cur</code> do not escape the function and are therefore eligible for this optimisation.</p> <p>After this optimisation, <code>s</code> is now a mutable variable of type <code>float</code> and so it can also trigger another optimisation: <em>float unboxing</em>.</p> <p>You can see more details <a href="https://www.lexifi.com/blog/ocaml/unboxed-floats-ocaml/#">here</a> but note that, in this specific example, all occurrences of boxing operations disappear except a single one at the end of the function.</p> <p><strong>We like to think that not forcing the user to write such code is a benefit, to say the least.</strong></p> <hr /> <h2> <a id="loopify" class="anchor"></a><a class="anchor-link" href="#loopify">Loopify</a> </h2> <h3> <a id="concept" class="anchor"></a><a class="anchor-link" href="#concept">Concept</a> </h3> <p>There is a general concept of transforming function-level control-flow into direct <strong>IR</strong> continuations to benefit from &quot;basic block-level&quot; optimisations. One such pattern is present in the local-function optimisation triggered by the <code>[@local]</code> attribute. <a href="https://github.com/ocaml/ocaml/pull/2143">Here's the link to the PR that implements it</a>.
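</p> <p>To give an intuition of the pattern this local-function optimisation targets, here is a small, purely illustrative sketch (the exact conditions under which the attribute applies are detailed in the PR linked above): a local function that is always fully applied and whose calls are all in tail position can have its calls compiled as static jumps rather than as genuine function calls.</p> <pre><code class="language-ocaml">let classify x =
  (* Hypothetical example: [fail] is local, always fully applied, and only
     called in tail position, so its calls can be turned into jumps to a
     single block of code instead of real function calls. *)
  let[@local] fail msg = (prerr_endline msg; None) in
  if x &lt; 0 then fail &quot;negative&quot;
  else if x &gt; 100 then fail &quot;too large&quot;
  else Some x
</code></pre> <p>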
<code>Loopify</code> is an attempt to extend the range of this kind of optimisation to proper (meaning <code>self</code>) tail-recursive calls.</p> <p>As you saw previously, in some cases (e.g.: numerical calculus), recursive functions sometimes hurt performances because they introduce some allocations.</p> <p>That lost performance can be recovered by hand-writing loops using local references however it's unfortunate to encourage non-functional code in a language such as OCaml.</p> <p>One of <code>Flambda</code> and <code>Flambda2</code>'s goals is to avoid situations such as those and allow for good-looking, functional code, to be as performant as code which is written and optimised by hand at the user-level.</p> <p>Therefore, we introduce a solution to the specific problem described above with <code>Loopify</code>, which, in a nutshell, transforms tail-recursive functions into non-recursive functions containing a loop, hence the name.</p> <h3> <a id="toloopifyornottoloopify" class="anchor"></a><a class="anchor-link" href="#toloopifyornottoloopify">Deciding to Loopify or not</a> </h3> <p>The decision to loopify a given function is made during the conversion from the <code>Lambda</code> <strong>IR</strong> to the <code>Flambda2</code> <strong>IR</strong>. The conversion is triggered in two cases:</p> <ul> <li>when a function is purely tail-recursive -- meaning all its uses within its body are <code>self-tail</code> calls, they are called <em>proper calls</em>; </li> <li>when an annotation is given by the user in the source code using the <code>[@loop]</code> attribute; </li> </ul> <p>Let's see two examples for them:</p> <pre><code class="language-ocaml">(* Not a tail-rec function: is not loopified *) let rec map f = function | [] -&gt; [] | x :: r -&gt; f x :: map f r (* Is tail-rec: is loopified *) let rec fold_left f acc = function | [] -&gt; acc | x :: r -&gt; fold_left f (f acc x) r </code></pre> <p>Here, the decision to <code>loopify</code> is automatic and requires no input from the user. Quite straightforward.</p> <hr /> <p>Onto the second case now:</p> <pre><code class="language-ocaml">(* Helper function, not recursive, nothing to do. *) let log dbg f arg = if dbg then print_endline &quot;Logging...&quot;; f arg [@@inline] (* Not tail-rec in the source, but may become tail-rec after inlining of the [log] function. At this point we can loopify, provided that the user specified a [@@loop] attribute. *) let rec iter_with_log dbg f = function | [] -&gt; () | x :: r -&gt; f x; log dbg (iter_with_log dbg f) r [@@loop] </code></pre> <p>The recursive function <code>iter_with_log</code>, is not initially purely tail-recursive.</p> <p>However after the inlining of the <code>log</code> function and then simplification, the new code for <code>iter_with_log</code> becomes purely tail-recursive.</p> <p>At that point we have the ability to <code>loopify</code> the function, but we keep from doing so unless the user specifies the <code>[@@loop]</code> attribute on the function definition.</p> <h3> <a id="thetransformation" class="anchor"></a><a class="anchor-link" href="#thetransformation">The nature of the transformation</a> </h3> <p>Onto the details of the transformation.</p> <p>First, we introduce a recursive continuation at the start of the function. 
Let's call it <code>self</code>.</p> <p>Then, at each tail-recursive call, we replace the function call with a continuation call to <code>self</code> with the same arguments as the original call.</p> <pre><code class="language-ocaml">let rec iter_with_log dbg f l = let_cont rec k_self dbg f l = match l with | [] -&gt; () | x :: r -&gt; f x; log dbg (iter_with_log dbg f) r in apply_cont k_self (dbg, f, l) </code></pre> <p>Then, we inline the <code>log</code> function:</p> <pre><code class="language-ocaml">let rec iter_with_log dbg f l = let_cont k_self dbg f l = match l with | [] -&gt; () | x :: r -&gt; f x; (* Here the inlined code starts *) (* We first start by binding the arguments of the original call to the parameters of the function's code *) let dbg = dbg in let f = iter_with_log dbg f in let arg = r in if dbg then print_endline &quot;Logging...&quot;; f arg in apply_cont k_self (dbg, f, l) </code></pre> <p>Following these transformations, we discover a <em>proper</em> tail-recursive call, which we replace with the corresponding continuation call.</p> <pre><code class="language-ocaml">let rec iter_with_log dbg f l = let_cont k_self dbg f l = match l with | [] -&gt; () | x :: r -&gt; f x; (* Here the inlined code starts *) (* Here, the let bindings have been substituted by the simplification. *) if dbg then print_endline &quot;Logging...&quot;; apply_cont k_self (dbg, f, r) in apply_cont k_self (dbg, f, l) </code></pre> <p>In this context, the benefit of transforming a function call into a continuation call is mainly about allowing other optimisations to take place. As shown in the previous section, one of these optimisations is <code>unboxing</code>, which can be important in some cases like numerical calculus. Such optimisations can take place because continuations are local to a function, while OCaml ABI-abiding function calls require a prior global analysis.</p> <p>One might think that a continuation call is intrinsically cheaper than a function call. However, the OCaml compiler already optimises self-tail-calls such that they are already as cheap as continuation calls (i.e., a single <code>jump</code> instruction).</p> <p>An astute reader could realise that this transformation can apply to any function and will result in one of three outcomes:</p> <ul> <li>if the function is not tail-recursive, or even not recursive at all, the transformation does nothing; </li> <li>if a function is purely tail-recursive, then all recursive calls will be replaced with a continuation call and the function after optimisation will no longer be recursive. This allows us to later inline it and even specialise some of its arguments. This happens precisely when we automatically decide to loopify a function; </li> <li>if a function is not purely tail-recursive, but contains some tail-recursive calls, then the transformation will rewrite those calls but not the other ones. This may result in better code but it's hard to be sure in advance.
In such cases (and cases where functions become purely tail-recursive only after <code>inlining</code>), users can force the transformation by using the <code>[@@loop]</code> attribute. </li> </ul> <h2> <a id="conclusion" class="anchor"></a><a class="anchor-link" href="#conclusion">Conclusion</a> </h2> <p>Here it is, the concept behind the <code>Loopify</code> optimisation pass as well as the general context and philosophy which led to its inception!</p> <p>It should be clear enough now that having to choose between writing clean <strong>or</strong> efficient code was always unsatisfactory to us. With <code>Loopify</code>, as well as with the rest of the <code>Flambda</code> and <code>Flambda2</code> compiler backends, we aim at making sure that users do <strong>not</strong> have to resort to imperative code to get the best performance: idiomatic functional code should end up just as efficient as its hand-optimised imperative counterpart. Ideally, any way of writing a given piece of code should be as efficient as the next.</p> <p>This article describes one of the very first user-facing optimisations of this series of snippets on <code>Flambda2</code>. We have not gotten into any of the neat implementation details yet. This is a topic for another time. The functioning of <code>Loopify</code> will be much clearer next time we talk about it.</p> <p><code>Loopify</code> is only applied automatically when the tail-recursive nature of a function is visible in the source from the get-go. However, the optimisations applied by <code>Loopify</code> can still very much be useful in other situations, as seen in <a href="#toloopifyornottoloopify">this section</a>. That is why we have the <code>[@loop]</code> attribute to enforce <em>loopification</em>. Good canonical examples for applying <code>Loopify</code> with the <code>[@loop]</code> attribute would be either of the following: a partially tail-recursive function (i.e., a function with only <em>some</em> tail-recursive paths), or a function which is not obviously tail-recursive in the source code but could become so after some optimisation steps.</p> <p>This transformation illustrates a core principle behind the <code>Flambda2</code> design: applying a somewhat naïve optimisation that is not transformative by itself, but changes the way the compiler can look at the code and trigger a whole lot of other useful ones. Conversely, it being triggered in the middle of the inlining phase can allow some non-obvious cases to become radically better. Coding a single optimisation that would discover the cases demonstrated in the examples above would be quite complex, while this one is rather simple thanks to these principles.</p> <p>Throughout the entire series of snippets, we will continue seeing these principles in action, starting with the next blog post that will introduce <code>Downward and Upward Traversals</code>.</p> <p><strong>Stay tuned, and thank you for reading, until next time, <em>see you Space Cowboy</em>. <a href="https://fr.wikipedia.org/wiki/Cowboy_Bebop">🤠</a></strong></p> Fixing and Optimizing the GnuCOBOL Preprocessor https://ocamlpro.com/blog/2024_04_30_fixing_and_optimizing_gnucobol 2024-04-30T13:48:57Z 2024-04-30T13:48:57Z Fabrice Le Fessant In this post, I will present some work that we did on the GnuCOBOL compiler, the only fully-mature open-source compiler for COBOL. It all started with a bug reported by one of our customers that we fixed by improving the preprocessing pass of the compiler. We later went on and optimised it to get bett...
<p></p> <p>In this post, I will present some work that we did on the GnuCOBOL compiler, the only fully-mature open-source compiler for COBOL. It all started with a bug reported by one of our customers that we fixed by improving the preprocessing pass of the compiler. We later went on and optimised it to get better performance than the initial version.</p> <blockquote> <p>Supporting the GnuCOBOL compiler has become one of our commercial activities. If you are interested in this project, we have a dedicated website on our <a href="https://get-superbol.com">SuperBOL offer</a>, a set of tools and services to ease deploying GnuCOBOL in a company to replace proprietary COBOL environments.</p> </blockquote> <p> <div class="figure"> <p> <a href="/blog/assets/img/craiyon-gnucobol-optimization.webp"> <img alt="At OCamlPro, we often favor correctness over performance. But in the end, our software is correct AND often faster than its competitors! Optimizing software is an art that often contradicts popular beliefs." src="/blog/assets/img/craiyon-gnucobol-optimization.webp"/> </a> <div class="caption"> At OCamlPro, we often favor correctness over performance. But in the end, our software is correct AND often faster than its competitors! Optimizing software is an art that often contradicts popular beliefs. </div> </p> </div> </p> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#replacing">Preprocessing and Replacements in COBOL</a> </li> <li><a href="#gnucobol">Preprocessing in the GnuCOBOL Compiler</a> </li> <li><a href="#standard">Conformance to the ISO Standard</a> </li> <li><a href="#automata">Preprocessing with Automata on Streams</a> </li> <li><a href="#issues">Some Performance Issues</a> </li> <li><a href="#allocations">Optimising Allocations</a> </li> <li><a href="#fastpaths">What about Fast Paths?</a> </li> <li><a href="#conclusion">Conclusion</a> </div> </li> </ul> <h2> <a id="replacing" class="anchor"></a><a class="anchor-link" href="#replacing">Preprocessing and Replacements in COBOL</a> </h2> <p>COBOL was born in 1959, at a time when the science of programming languages was just starting. If you had to design a new language for the same purpose today, the result would be very different: you would make different mistakes, but maybe not fewer. Actually, COBOL has proven to be particularly resilient to time, as it is still used, 65 years later! Though it has evolved over the years (the <a href="https://www.iso.org/fr/standard/74527.html">last ISO standard for COBOL</a> was released in January 2023), the kernel of the language is still the same, showing that most of the initial design choices were not perfect, but still got the job done.</p> <p>One of these choices, which would surely scare off young developers, is how COBOL favors code reusability and sharing, through replacements done in its preprocessor.</p> <p>Let's consider this COBOL code, which will be our example for the rest of this article:</p> <pre><code class="language-COBOL">DATA DIVISION. WORKING-STORAGE SECTION. 01 VAL1. COPY MY-RECORD REPLACING ==:XXX:== BY ==VAL1==. 01 VAL2. COPY MY-RECORD REPLACING ==:XXX:== BY ==VAL2==. 01 COUNTERS. 05 COUNTER-NAMES PIC 999 VALUE 0. 05 COUNTER-VALUES PIC 999 VALUE 0. </code></pre> <p>We are using the <em>free</em> format, a modern way of formatting code; the older <em>fixed</em> format would require leaving a margin of 7 characters on the left.
We are in the <code>DATA</code> division, the part of the program that defines the format of data, and specifically, in the <code>WORKING-STORAGE</code> section, where global variables are defined. In <em>standard</em> COBOL, there are no local variables, so the <code>WORKING-STORAGE</code> section usually contains all the variables of the program, even temporary ones.</p> <p>In COBOL, there are variables of basic types (integers and strings with specific lengths), and composite types (arrays and records). Records are defined using levels: global variables are at level <code>01</code> (such as <code>VAL1</code>, <code>VAL2</code> and <code>COUNTERS</code> in our example), whereas most other levels indicate inner fields: here, <code>COUNTER-NAMES</code> and <code>COUNTER-VALUES</code> are two direct fields of <code>COUNTERS</code>, as shown by their lower level <code>05</code> (both are actually integers of 3 digits, as specified by <code>PIC 999</code>). Moreover, COBOL programmers like to be able to access fields directly, by making them unique in the program: it is thus possible to use <code>COUNTER-NAMES</code> everywhere in the program, without referring to <code>COUNTERS</code> itself (note that if the field wasn't assigned a unique name, it would be possible to use <code>COUNTER-NAMES OF COUNTERS</code> to disambiguate them).</p> <p>On the other hand, in older versions of COBOL, there were no type definitions.</p> <p><strong>So how would one create two record variables with the same content?</strong></p> <p>One would use the preprocessor to include, several times, the same file describing the structure of the record in the program. One would also use that same file to describe the format of some data files storing such records. Actually, COBOL developers use external tools to manage data files and generate the descriptions, which are then included into COBOL programs in order to manipulate the files (<code>pacbase</code>, for example, is one such tool).</p> <p>In our example, there would be a file <code>MY-RECORD.CPY</code> (usually called a <em>copybook</em>), containing something like the following somewhere in the filesystem:</p> <pre><code class="language-COBOL">05 :XXX:-USERNAME PIC X(30). 05 :XXX:-BIRTHDATE. 10 :XXX:-BIRTHDATE-YEAR PIC 9999. 10 :XXX:-BIRTHDATE-MONTH PIC 99. 10 :XXX:-BIRTHDATE-MDAY PIC 99. 05 :XXX:-ADDRESS PIC X(100). </code></pre> <p>This code excerpt is actually not quite correct COBOL code, because identifiers cannot contain a <code>:XXX:</code> part. It is instead meant to be included <strong>and modified</strong> in other COBOL programs.</p> <p>Indeed, the following line will include the file and perform a replacement of a <code>:XXX:</code> partial token by <code>VAL1</code>:</p> <pre><code class="language-COBOL">COPY MY-RECORD REPLACING ==:XXX:== BY ==VAL1==. </code></pre> <p>So, in our main example, we now have two global record variables <code>VAL1</code> and <code>VAL2</code>, of the same format, but containing fields with unique names such as <code>VAL1-USERNAME</code> and <code>VAL2-USERNAME</code>.</p> <p>Allow me to repeat that, despite their peculiar nature, these features <strong>have</strong> stood the test of time.</p> <p>The journey continues.
Suppose now that you are in a specific part of your program, and that you wish to manipulate longer names; say, you would like the <code>:XXX:-USERNAME</code> variable to be of size <code>60</code> instead of <code>30</code>.</p> <p>Here is how you could do it:</p> <pre><code class="language-COBOL"> [...] REPLACE ==PIC X(30)== BY ==PIC X(60)==. 01 VAL1. COPY [...] REPLACE OFF. 01 COUNTERS. [...] </code></pre> <p>Here, we can replace a list of consecutive tokens <code>PIC X(30)</code> by another list of tokens <code>PIC X(60)</code>. The result is that the fields <code>VAL1-USERNAME</code> and <code>VAL2-USERNAME</code> are now <code>60</code> bytes long.</p> <p><code>REPLACE</code> and <code>COPY REPLACING</code> can both perform the same kinds of replacements, on parts of tokens (using the <code>LEADING</code> or <code>TRAILING</code> keywords) as well as on lists of tokens. COBOL programmers combine them to perform their daily job of building consistent software, by sharing formats using shared copybooks.</p> <p>Let's now see how GnuCOBOL deals with that.</p> <h2> <a id="gnucobol" class="anchor"></a><a class="anchor-link" href="#gnucobol">Preprocessing in the GnuCOBOL Compiler</a> </h2> <p>The GnuCOBOL compiler is a transpiler: it translates COBOL source code into C89 source code, which can then be compiled to executable code by a C compiler. It has two main benefits: <strong>high portability</strong>, as GnuCOBOL will work on any platform with any C compiler, including very old hardware and mainframes, and <strong>simplicity</strong>, as code generation is reduced to a minimum; most of the code of the compiler is its parser... which is actually still huge, as COBOL is a particularly rich language.</p> <p>GnuCOBOL implements many dialects (i.e., extensions of COBOL available on proprietary compilers such as IBM, MicroFocus, etc.) in order to provide a solution to the migration issues posed by proprietary platforms.</p> <blockquote> <p>The support of dialects is one of the most interesting features of GnuCOBOL: by natively supporting many extensions of proprietary compilers, it is possible to migrate applications from these compilers to GnuCOBOL without modifying the sources, making it possible to run the same code on the old platform and the new one throughout the migration.</p> <p>One of OCamlPro's main contributions to GnuCOBOL has been to create such a dialect for GCOS7, a former Bull mainframe still in use in some places.</p> </blockquote> <p> <div class="figure"> <p> <a href="/blog/assets/img/bull-dps7-gcos7.jpg"> <img alt="This is a Bull DPS-7 mainframe around 1980, running the GCOS7 operating system. Such systems are still used to run critical COBOL applications in some companies, though running on software emulators on PCs. GnuCOBOL is a mature solution to migrate such applications to modern Linux computers." src="/blog/assets/img/bull-dps7-gcos7.jpg"/> </a> <div class="caption"> This is a Bull DPS-7 mainframe around 1980, running the GCOS7 operating system. Such systems are still used to run critical COBOL applications in some companies, though running on software emulators on PCs. GnuCOBOL is a mature solution to migrate such applications to modern Linux computers.
</div> </p> </div> </p> <p>To perform its duty, GnuCOBOL processes COBOL source files in two passes: it preprocesses them during the first phase, generating a new temporary COBOL file with all inclusions and replacement done, and then parses this file and generates the corresponding C code.</p> <p>To do that, GnuCOBOL includes two pairs of lexers and parsers, one for each phase. The first pair only recognises a very limited set of constructions, such as <code>COPY... REPLACING</code>, <code>REPLACE</code>, but also some other ones like compiler directives.</p> <p>The lexer/parser for preprocessing directly works on the input file, and performed all these operations in a single pass before version <code>3.2</code>.</p> <p>The output can be seen using the <code>-E</code> argument:</p> <pre><code class="language-shell">$ cobc -E --free foo.cob #line 1 &quot;foo.cob&quot; DATA DIVISION. WORKING-STORAGE SECTION. 01 VAL1. #line 1 &quot;MY-RECORD.CPY&quot; 05 VAL1-USERNAME PIC X(60). 05 VAL1-BIRTHDATE. 10 VAL1-BIRTHDATE-YEAR PIC 9999. 10 VAL1-BIRTHDATE-MONTH PIC 99. 10 VAL1-BIRTHDATE-MDAY PIC 99. 05 VAL1-ADDRESS PIC X(100). #line 5 &quot;foo.cob&quot; 01 VAL2. #line 1 &quot;MY-RECORD.CPY&quot; 05 VAL2-USERNAME PIC X(60). 05 VAL2-BIRTHDATE. 10 VAL2-BIRTHDATE-YEAR PIC 9999. 10 VAL2-BIRTHDATE-MONTH PIC 99. 10 VAL2-BIRTHDATE-MDAY PIC 99. 05 VAL2-ADDRESS PIC X(100). #line 7 &quot;foo.cob&quot; 01 COUNTERS. 05 COUNTER-NAMES PIC 999 VALUE 0. 05 COUNTER-VALUES PIC 999 VALUE 0. </code></pre> <p>The <code>-E</code> option is particularly useful if you want to understand the final code that GnuCOBOL will compile. You can also get access to this information using the option <code>--save-temps</code> (save intermediate files), in which case <code>cobc</code> will generate a file with extension <code>.i</code> (<code>foo.i</code> in our case) containing the preprocessed COBOL code.</p> <p>You can see that <code>cobc</code> successfully performed both the <code>REPLACE</code> and <code>COPY REPLACING</code> instructions.</p> <p>The <a href="https://github.com/OCamlPro/gnucobol/blob/5ab722e656a25dc95ab99705ee1063562f2e5be5/cobc/pplex.l#L2049">corresponding code in version 3.1.2</a> is in file <code>cobc/pplex.l</code>, function <code>ppecho</code>. Fully understanding it is left as an exercice for the motivated reader.</p> <p>The general idea is that replacements defined by <code>COPY REPLACING</code> and <code>REPLACE</code> are added to the same list of active replacements.</p> <p>We show in the next section that such an implementation does not conform to the ISO standard.</p> <h2> <a id="standard" class="anchor"></a><a class="anchor-link" href="#standard">Conformance to the ISO Standard</a> </h2> <p>You may wonder if it is possible for <code>REPLACE</code> statements to perform replacements that would change a <code>COPY</code> statement, such as :</p> <pre><code class="language-COBOL">REPLACE ==COPY MY-RECORD== BY == COPY OTHER-RECORD==. COPY MY-RECORD. </code></pre> <p>You may also wonder what happens if we try to combine replacements by <code>COPY</code> and <code>REPLACE</code> on the same tokens, for example:</p> <pre><code class="language-COBOL">REPLACE ==VAL1-USERNAME PIC X(30)== BY ==VAL1-USERNAME PIC X(60)== </code></pre> <p>Such a statement only makes sense if we assume the <code>COPY</code> replacements have been performed before the <code>REPLACE</code> replacements are performed.</p> <p>Such ambiguities have been resolved in the ISO Standard for COBOL: in section <code>7.2.1. 
Text Manipulation &gt;&gt; General</code>, it is specified that preprocessing is executed in 4 phases on the streams of tokens:</p> <ol> <li><code>COPY</code> statements are performed, and the corresponding <code>REPLACING</code> replacements too; </li> <li>Conditional compiler directives are then performed; </li> <li><code>REPLACE</code> statements are performed; </li> <li><code>COBOL-WORDS</code> statements are performed (allowing one to enable or disable some keywords). </li> </ol> <p>So, a <code>REPLACE</code> cannot modify a <code>COPY</code> statement (and the opposite is also impossible, as <code>REPLACE</code> statements are not allowed in copybooks), but it can modify the same set of tokens that are being modified by the <code>REPLACING</code> part of a <code>COPY</code>.</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/standard-iso-cobol.jpg"> <img alt="The ISO standard specifies the different steps to preprocess COBOL files and perform replacements in a specific order." src="/blog/assets/img/standard-iso-cobol.jpg"/> </a> <div class="caption"> The ISO standard specifies the different steps to preprocess COBOL files and perform replacements in a specific order. </div> </p> </div> </p> <p>As described in the previous section, GnuCOBOL implements all phases 1, 2 and 3 in a single one, even mixing replacements defined by <code>COPY</code> and by <code>REPLACE</code> statements. Fortunately, this behavior is good enough for most programs. Unfortunately, there are still programs that combine <code>COPY</code> and <code>REPLACE</code> on the same tokens, leading to hard-to-debug errors, as the compiler does not conform to the specification.</p> <p>Such a difficult situation happened to one of our customers, and we promptly addressed it by patching a part of the compiler.</p> <h2> <a id="automata" class="anchor"></a><a class="anchor-link" href="#automata">Preprocessing with Automata on Streams</a> </h2> <p>Correctly implementing the specification written in the standard would make the preprocessing phase quite complicated. Indeed, we would have to implement a small parser for every one of the four steps of preprocessing. That's actually what we did for our <a href="https://github.com/OCamlPro/superbol-studio-oss/tree/master/src/lsp/cobol_preproc">COBOL parser in OCaml</a> used by the LSP (<a href="https://microsoft.github.io/language-server-protocol/">Language Server Protocol</a>) of our <a href="https://marketplace.visualstudio.com/items?itemName=OCamlPro.SuperBOL">SuperBOL Studio</a> COBOL plugin for VSCode.</p> <p>However, doing the same in GnuCOBOL is much harder: GnuCOBOL is written in C, and such a change would require a complete rewriting of the preprocessor, something that would take more time than we had on our hands. Instead, we opted for rewriting the replacement function, to split <code>COPY REPLACING</code> and <code>REPLACE</code> into two different replacement phases.</p> <p>The <a href="https://github.com/OCamlPro/gnucobol/blob/gnucobol-3.2/cobc/replace.c">corresponding C code</a> has been moved into a file <code>cobc/replace.c</code>. It implements an automaton that applies a list of replacements on a stream of tokens, returning another stream of tokens. The preprocessor is thus composed of two instances of this automaton, one for <code>COPY REPLACING</code> statements and another one for <code>REPLACE</code> statements.</p> <p>The second instance takes the stream of tokens produced by the first one as input.
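</p> <p>To give an intuition of this two-stage pipeline, here is a heavily simplified OCaml sketch (only a sketch: the names and types below are made up for this post, while the actual GnuCOBOL code is written in C and lives in <code>cobc/replace.c</code>). One replacement phase rewrites a list of tokens according to a list of rules, and the preprocessor simply chains two such phases, first with the <code>COPY REPLACING</code> rules and then with the <code>REPLACE</code> rules:</p> <pre><code class="language-ocaml">type token = string

(* A replacement rewrites one fixed sequence of tokens into another. *)
type replacement = { from : token list; into : token list }

let rec take n = function
  | x :: r when n &gt; 0 -&gt; x :: take (n - 1) r
  | _ -&gt; []

let rec drop n = function
  | _ :: r when n &gt; 0 -&gt; drop (n - 1) r
  | l -&gt; l

(* One replacement phase over a stream of tokens.  This naive version
   re-scans lists at every position; the real implementation instead keeps
   a bounded queue of pending tokens and partial-match state. *)
let rec replace_phase repls toks =
  match toks with
  | [] -&gt; []
  | tok :: rest -&gt;
    (match List.find_opt (fun r -&gt; take (List.length r.from) toks = r.from) repls with
     | Some r -&gt; r.into @ replace_phase repls (drop (List.length r.from) toks)
     | None -&gt; tok :: replace_phase repls rest)

(* The preprocessor chains two instances of the same machinery. *)
let preprocess ~copy_replacing ~replace toks =
  replace_phase replace (replace_phase copy_replacing toks)
</code></pre> <p>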
The automaton is implemented using recursive functions, which is particularly suitable to allow reasoning about its correctness. Actually, several bugs were found in the former C implementation while designing this automaton. Each automaton has an internal state, composed of a set of tokens which are queued (and waiting for a potential match) and a list of possible replacements of these tokens.</p> <p>Thanks to this design, it was possible to provide a working implementation in a very short delay, considering the complexity of that part of the compiler.</p> <p>We added several tests to the testsuite of the compiler for all the bugs that had been detected in the process to prevent regressions in the future, and the <a href="https://github.com/OCamlPro/gnucobol/pull/75">corresponding pull request</a> was reviewed by Simon Sobisch, the GnuCOBOL project leader, and later upstreamed.</p> <h3> <a id="issues" class="anchor"></a><a class="anchor-link" href="#issues">Some Performance Issues</a> </h3> <p>Unfortunately, it was not the end of the work: Simon performed some performance evaluations on this new implementation, and although it had improved the conformance of GnuCOBOL to the standard, it did affect the performance negatively.</p> <p>Compiler performance is not always critical for most applications, as long as you compile only individual COBOL source files. However, some source files can become very big, especially when part of the code is auto-generated. In COBOL, a typical case of that is the use of a pre-compiler, typically for SQL. Such programs contain <code>EXEC SQL</code> statements, that are translated by the SQL pre-compiler into much longer COBOL code, consisting mostly of <code>CALL</code> statements calling C functions into the SQL library to build and execute SQL requests.</p> <p>For such a generated program, of a whopping 700 kLines, Simon noticed an important degradation in compilation time, and profiling tools concluded that the new preprocessor implementation was responsible for it, as shown in the flamegraph below:</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/cobc-callgraph-pplex1.png"> <img alt="A flamegraph generated by <code>perf</code> stats visualised on <code>hotspot</code>: the horizontal axis is the total duration. We can see that <code>ppecho</code>, the function for replacements, takes most of the preprocessing time, with the two-automata replacement phases. Credit: Simon Sobisch" src="/blog/assets/img/cobc-callgraph-pplex1.png"/> </a> <div class="caption"> A flamegraph generated by <code>perf</code> stats visualised on <code>hotspot</code>: the horizontal axis is the total duration. We can see that <code>ppecho</code>, the function for replacements, takes most of the preprocessing time, with the two-automata replacement phases. Credit: Simon Sobisch </div> </p> </div> </p> <p>So we started investigating to fix the problem in <a href="https://github.com/OCamlPro/gnucobol/pull/142">a new pull-request</a>.</p> <h3> <a id="allocations" class="anchor"></a><a class="anchor-link" href="#allocations">Optimizing Allocations</a> </h3> <p>Our first intuition was that the main difference with the previous implementation came from allocating too many lists in the temporary state of the two automatons. This intuition was only partially right, as we will see.</p> <p>Mutable lists were used in the automaton (and also in the former implementation) to store a small part of the stream of tokens, while they were being matched with a replacement source. 
On a partial match, the list had to wait for additionnal tokens to check for a full match. Actually, these <strong>lists</strong> were used as <strong>queues</strong>, as tokens were always added at the end, while matched or un-matched tokens were removed from the top. Also, the size of these lists was bounded by the maximal replacement that was defined in the code, that would unlikely be more than a few dozen tokens.</p> <p>Our first idea was to replace these lists by real queues, that can be efficiently implemented using <a href="https://github.com/OCamlPro/gnucobol/blob/82100d64de35c89ad5980d1b2c8d1ffdd3563570/cobc/replace.c#L89">circular buffers and arrays</a>. Each and every allocation of a new list element would then be replaced by the single allocation of a circular buffer, granted with a few possible reallocations further down the road if the list of replacements was to grow bigger.</p> <p>The results were a bit disappointing: on the flamegraph, there was some improvement, but the replacement phase still took a lot of time:</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/cobc-callgraph-pplex2.png"> <img alt="The flamegraph is better, as shown by the disappearance of calls to <code>token_list_add</code>. But our work is not yet finished! Credit: Simon Sobisch" src="/blog/assets/img/cobc-callgraph-pplex2.png"/> </a> <div class="caption"> The flamegraph is better, as shown by the disappearance of calls to <code>token_list_add</code>. But our work is not yet finished! Credit: Simon Sobisch </div> </p> </div> </p> <p>Another intuition we had was that we had been a bit naive about allocating tokens: in the initial implementation of version <code>3.1.2</code>, tokens were allocated when copied from the lexer into the single queue for replacement; in our implementation, that job was also done, but twice, as they were allocated in both automata. So, we modified our implementation to only allocate tokens when they are first entered in the <code>COPY REPLACING</code> stream, and not anymore when entering the <code>REPLACE</code> stream. A simple idea, that reduced again the remaining allocations by a factor of 2.</p> <p>Yet, the new optimised implementation still didn't match the performance of the former <code>3.1.2</code> version, and we were running out of ideas on how the allocations performed by the automata could again be improved:</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/cobc-valgrind-pplex1.png"> <img alt="Using circular buffers instead of mutable lists for queues decreased allocations by a factor of 3. Removing the re-allocations between the two streams would also improve it by a factor of 2. A nice improvement, but not yet the performances of version 3.1.2" src="/blog/assets/img/cobc-valgrind-pplex1.png"/> </a> <div class="caption"> Using circular buffers instead of mutable lists for queues decreased allocations by a factor of 3. Removing the re-allocations between the two streams would also improve it by a factor of 2. 
A nice improvement, but not yet the performance of version 3.1.2 </div> </p> </div> </p> <h3> <a id="fastpaths" class="anchor"></a><a class="anchor-link" href="#fastpaths">What about Fast Paths?</a> </h3> <p>So we decided to study some of the code from <code>3.1.2</code> to understand what could cause such a difference, and it became immediately obvious: the former version had two fast paths that we had left out of our own implementation!</p> <p>The two fast paths that completely shortcut the replacement mechanisms are the following:</p> <p>The first one is when there are no replacements defined in the source. In COBOL, most replacements are only performed in the <code>DATA DIVISION</code>, and moreover, <code>COPY REPLACING</code> ones are only performed during copies. This means that a large part of the code that did not need to go through our two automata still did!</p> <p>The second fast path is for spaces: replacements always start and end with a non-space token in COBOL, so, if we check that we are not in the middle of a partial match (i.e., both internal token queues are empty), we can safely make the space token skip the automata. Again, given the frequency of space tokens (about half of all tokens, as there are very few other separators), this fast path is likely to be used very, very frequently.</p> <p>Implementing them was straightforward, and the results were the ones expected:</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/cobc-callgraph-pplex3.png"> <img alt="After implementing the same fast paths as in 3.1.2, the flamegraph is back to normal, with the time spent in the replacement function being barely noticeable. Credit: Simon Sobisch" src="/blog/assets/img/cobc-callgraph-pplex3.png"/> </a> <div class="caption"> After implementing the same fast paths as in 3.1.2, the flamegraph is back to normal, with the time spent in the replacement function being barely noticeable. Credit: Simon Sobisch </div> </p> </div> </p> <h3> <a id="conclusion" class="anchor"></a><a class="anchor-link" href="#conclusion">Conclusion</a> </h3> <p>As often with optimisations, intuitions do not always lead to the expected improvements: in our case, the real improvement came not from improving the algorithm, but from shortcutting it!</p> <p>Yet, we are still very pleased by the results: the new optimised implementation of replacements in GnuCOBOL makes it more conformant to the standard, and also more efficient than the former <code>3.1.2</code> version, as shown by the final results sent to us by Simon:</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/cobc-valgrind-pplex2.png"> <img alt="These results show that the new implementation is now a little better than 3.1.2. It comes from using the circular buffers instead of the mutable lists for queues, but the optimisation only happens when replacements are defined, which is a very small part of the source code." src="/blog/assets/img/cobc-valgrind-pplex2.png"/> </a> <div class="caption"> These results show that the new implementation is now a little better than 3.1.2. It comes from using the circular buffers instead of the mutable lists for queues, but the optimisation only happens when replacements are defined, which is a very small part of the source code.
</div> </p> </div> </p> OCaml Backtraces on Uncaught Exceptions https://ocamlpro.com/blog/2024_04_25_ocaml_backtraces_on_uncaught_exceptions 2024-04-25T13:48:57Z 2024-04-25T13:48:57Z Louis Gesbert Uncaught exception: Not_found This blog post probably won't teach anything new to OCaml veterans; but for the others, you might be glad to learn that this very basic, yet surprisingly little-known feature of OCaml will give you backtraces with source file positions on any uncaught exception. Since i... <p></p> <p> <div class="figure"> <p> <a href="/blog/assets/img/AIGEN_camel_catching_butterflies.jpeg"> <img alt="A mystical Camel using its net to catch all uncaught... Butterflies." src="/blog/assets/img/AIGEN_camel_catching_butterflies.jpeg"/> </a> <div class="caption"> A mystical Camel using its net to catch all uncaught... Butterflies. </div> </p> </div> </p> <h2> <a id="notfound" class="anchor"></a><a class="anchor-link" href="#notfound">Uncaught exception: Not_found</a> </h2> <p>This blog post probably won't teach anything new to OCaml veterans; but for the others, you might be glad to learn that this very basic, yet surprisingly little-known feature of OCaml will give you backtraces with source file positions on any uncaught exception.</p> <p>Since it can save hours of frustrating debugging, my intent is to give some publicity to this accidentally hidden feature.</p> <blockquote> <p>PSA: define <code>OCAMLRUNPARAM=b</code> in your environment.</p> </blockquote> <p>For those wanting to go further, I'll then go on with hints and guidelines for good exception management in OCaml.</p> <blockquote> <p>For the details, everything here is documented in <a href="https://caml.inria.fr/pub/docs/manual-ocaml/libref/Printexc.html">the Printexc module</a>.</p> </blockquote> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#notfound">Uncaught exception: Not_found</a> </li> <li><a href="#getyourstacktraces">Get your stacktraces!</a> </li> <li><a href="#improve">Improve your traces</a> <ul> <li><a href="#reraising">Properly Re-raising exceptions, and finalisers</a> </li> <li><a href="#holes">There are holes in my backtrace!</a> </li> </ul> </li> <li><a href="#guidelines">Guidelines for exception handling, and Control-C</a> <ul> <li><a href="#backtracesinocaml">Controlling the backtraces from OCaml</a> </div> </li> </ul> </li> </ul> <h2> <a id="getyourstacktraces" class="anchor"></a><a class="anchor-link" href="#getyourstacktraces">Get your stacktraces!</a> </h2> <p>Compile-time errors are good, but sometimes you just have to cope with run-time failures.</p> <p>Here is a simple (and buggy) program:</p> <pre><code class="language-ocaml">let dict = [ &quot;foo&quot;, &quot;bar&quot;; &quot;foo2&quot;, &quot;bar2&quot;; ] let rec replace = function | [] -&gt; [] | w :: words -&gt; List.assoc w dict :: words let () = let words = Array.to_list Sys.argv in List.iter print_endline (replace words) </code></pre> <blockquote> <p><strong>Side note</strong></p> <p>For purposes of the example, we use <code>List.assoc</code> here; this relies on OCaml's structural equality, which is often a bad idea in projects, as it can break in surprising ways when the matched type gets more complex. 
A more serious implementation would use <em>e.g.</em> <code>Map.Make</code> with an explicit comparison function.</p> </blockquote> <p>Here is the result of executing this program with no options:</p> <pre><code class="language-shell-session">$ ./foo Fatal error: exception Not_found </code></pre> <p>This isn't very helpful, but no need for a debugger, lots of <code>printf</code> or tedious debugging, just do the following:</p> <pre><code class="language-shell-session">$ export OCAMLRUNPARAM=b $ ./foo Fatal error: exception Not_found Raised at Stdlib__List.assoc in file &quot;list.ml&quot;, line 191, characters 10-25 Called from Foo.replace in file &quot;foo.ml&quot;, line 8, characters 18-35 Called from Foo in file &quot;foo.ml&quot;, line 12, characters 26-41 </code></pre> <p>Much more helpful! In most cases, this will be enough to find and fix the bug.</p> <p>If you still don't get the backtrace, you may need to recompile with <code>-g</code> (with dune, ensure your default profile is <code>dev</code> or specify <code>--profile=dev</code>)</p> <p>So, now we know where the failure occured... But not on what input. This is not a matter of backtraces: if that's an issue, define your own exceptions, with arguments, and raise that rather than the basic <code>Not_found</code>.</p> <blockquote> <p><strong>Hint</strong></p> <p>If you run the program directly from your editor, with a properly configured OCaml mode, the file positions in the backtrace should be parsed and become clickable, making navigation very quick and easy.</p> </blockquote> <h2> <a id="improve" class="anchor"></a><a class="anchor-link" href="#improve">Improve your traces</a> </h2> <p>The above works well in general, but depending on the complexity of the programs, there are some more advanced tricks that may be helpful, to preserve or improve the backtraces.</p> <h3> <a id="reraising" class="anchor"></a><a class="anchor-link" href="#reraising">Properly Re-raising exceptions, and finalisers</a> </h3> <p>It's pretty common to want a finaliser after some processing, here to remove a temporary file:</p> <pre><code class="language-ocaml">let with_temp_file basename (f: unit -&gt; 'a) : 'a = let filename = Filename.temp_file basename in match f filename with | result -&gt; Sys.remove filename; result | exception e -&gt; Sys.remove filename; raise e </code></pre> <p>In simple cases this will work, but if <em>e.g.</em> you are using the <code>Printf</code> module before re-raising, it will break the printed backtrace.</p> <ul> <li> <p><strong>Solution 1</strong>: use <code>Fun.protect ~finally f</code> that handles the backtrace properly.</p> </li> <li> <p><strong>Solution 2</strong>: manually, use raw backtrace access from the <code>Printexc</code> module:</p> <pre><code class="language-ocaml">| exception e -&gt; let bt = Printexc.get_raw_backtrace () in Sys.remove filename; Printexc.raise_with_backtrace e bt </code></pre> </li> </ul> <p>Re-raising exceptions after catching them should always be done in this way.</p> <h3> <a id="holes" class="anchor"></a><a class="anchor-link" href="#holes">There are holes in my backtrace!</a> </h3> <p>Indeed, it may appear that not all function calls show up in the backtrace.</p> <p>There are two main reasons for that:</p> <ul> <li>functions can get inlined by the compiler, so they don't actually appear in the concrete backtrace at runtime; </li> <li>tail-call optimisation also affects the stack, which can be visible here; </li> </ul> <p>Don't run and disable all optimisations though! 
Some effort has been put in recording useful debugging information even in these cases. The <a href="https://ocamlpro.com/blog/2024_03_18_the_flambda2_snippets_0/">Flambda pass of the compiler</a>, which does <strong>more</strong> inlining, also actually makes it <strong>more</strong> traceable.</p> <p>As a consequence, switching to Flambda will often give you more helpful backtraces with recursive functions and tail-calls. It can be done with <code>opam install ocaml-option-flambda</code> (this will recompile the whole opam switch).</p> <blockquote> <p><strong>Well, what if my program uses <code>lwt</code>?</strong></p> <p>Backtraces in this context are a complex matter -- but they can be simulated: a good practice is to use <code>ppx_lwt</code> and the <code>let%lwt</code> syntax rather than <code>let*</code> or <code>Lwt.bind</code> directly, because the ppx will insert calls that reconstruct &quot;fake&quot; backtrace information.</p> </blockquote> <h2> <a id="guidelines" class="anchor"></a><a class="anchor-link" href="#guidelines">Guidelines for exception handling, and Control-C</a> </h2> <p>Exceptions in OCaml can happen anywhere in the program: besides uses of <code>raise</code>, system errors can trigger them. In particular, if you want to implement clean termination on the user pressing <code>Control-C</code> without manually handling signals, you should call <code>Sys.catch_break true</code> ; you will then get a <code>Sys.Break</code> exception raised when the user interrupts the program.</p> <p>Anyway, this is one reason why you must never use <code>try .. with _ -&gt;</code></p> <pre><code class="language-ocaml">let find_opt x m = try Some (Map.find x m) with _ -&gt; None </code></pre> <p>The programmer was too lazy to write <code>with Not_found</code>. They may think this is OK since <code>Map.find</code> won't raise anything else. But if <code>Control-C</code> is pressed at the wrong time, this will catch it, and return <code>None</code> instead of stopping the program !</p> <pre><code class="language-ocaml">let find_debug x m = try Map.find x m with e -&gt; let bt = Printexc.get_raw_backtrace () in Printf.eprintf &quot;Error on %s!&quot; (to_string x); Printexc.raise_with_backtrace e bt </code></pre> <p>This version is OK since it re-raises the exception. If you absolutely need to catch all exceptions, a last resort is to explicitely re-raise &quot;uncatchable&quot; exceptions:</p> <pre><code class="language-ocaml">let this_is_a_last_resort = try .. with | (Sys.Break | Assert_failure _ | Match_failure _) as e -&gt; raise e | _ -&gt; .. </code></pre> <p>In practice, you'll finally want to catch exceptions from your main function (<code>cmdliner</code> already offers to do this, for example); catching <code>Sys.Break</code> at that point will offer a better message than <code>Uncaught exception</code>, give you control over finalisation and the final exit code (the convention is to use <code>130</code> for <code>Sys.Break</code>).</p> <h3> <a id="backtracesinocaml" class="anchor"></a><a class="anchor-link" href="#backtracesinocaml">Controlling the backtraces from OCaml</a> </h3> <p>Setting <code>OCAMLRUNPARAM=b</code> in the environment works from the outside, but the module <a href="https://caml.inria.fr/pub/docs/manual-ocaml/libref/Printexc.html">Printexc</a> can also be used to enable or disable them from the OCaml program itself.</p> <ul> <li><code>Printexc.record_backtrace: bool -&gt; unit</code> toggles the recording of backtraces. 
Forcing it <code>off</code> when running tests, or <code>on</code> when a debug flag is specified, can be good ideas; </li> <li><code>Printexc.backtrace_status: unit -&gt; bool</code> checks if recording is enabled. This can be used when finalising the program to print the backtraces when enabled; </li> </ul> <blockquote> <p><strong>Nota Bene</strong></p> <p>The <code>base</code> library turns <code>on</code> backtraces recording by default. While I salute an attempt to remedy the issue that this post aims to address, this can lead to surprises when just linking the library can change the output of a program (<em>e.g.</em> this might require specific code for cram tests not to display backtraces)</p> </blockquote> <p>The <code>Printexc</code> module also allows to register custom exception printers: if, following the advice above, you defined your own exceptions with parameters, use <code>Printexc.register_printer</code> to have that information available when they are uncaught.</p> Opam 102: Pinning Packages https://ocamlpro.com/blog/2024_03_25_opam_102_pinning_packages 2024-03-25T13:48:57Z 2024-03-25T13:48:57Z Dario Pinto Raja Boujbel Welcome, dear reader, to a new opam blog post! Today we take an additional step down the metaphorical rabbit hole with opam pin, the easiest way to catch a ride on the development version of a package in opam. We are aware that our readers are eager to see these blog posts venture on the developer s... <p></p> <p> <div class="figure"> <p> <a href="/blog/assets/img/opam102_pins.svg"> <img alt="Pins standout. They help us anchor interest points, thus helping us focus on what's important. They become the catalyst for experimentation and help us navigating the strong safety features that opam provides users with." src="/blog/assets/img/opam102_pins.svg"/> </a> <div class="caption"> Pins standout. They help us anchor interest points, thus helping us focus on what's important. They become the catalyst for experimentation and help us navigating the strong safety features that opam provides users with. </div> </p> </div> </p> <p>Welcome, dear reader, to a new opam blog post!</p> <p>Today we take an additional step down the metaphorical rabbit hole with <code>opam pin</code>, the easiest way to catch a ride on the development version of a package in <code>opam</code>.</p> <p>We are aware that our readers are eager to see these blog posts venture on the developer side of the <code>opam</code> experience, and so are we, but we need to spend just a bit little more time on the beginner and user-side of it for now so please, bear with us! 🐻</p> <blockquote> <p>This tutorial is the second one in this on-going series about the OCaml package manager <code>opam</code>. Be sure to read <a href="https://ocamlpro.com/blog/2024_01_23_opam_101_the_first_steps/">the first one</a> to get up to speed. 
Also, check out each article's <code>tags</code> to get an idea of the entry level required for the smoothest read possible!</p> </blockquote> <blockquote> <p><strong>New to the expansive OCaml sphere?</strong> As said on the official opam website, <a href="https://opam.ocaml.org/about.html#A-little-bit-of-History"><code>opam</code></a> has been a game changer for the OCaml distribution, since it first saw the day of light here, almost a decade ago.</p> </blockquote> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#opampincontext">Tutorial context</a> </li> <li><a href="#opampinusecase">Use-case for <code>opam pin</code></a> <ul> <li><a href="#opampindev">Pinning a released package development version: <code>opam pin add --dev-repo</code></a> </li> <li><a href="#opampinurl">Pinning an unreleased package development version: <code>opam pin add &lt;url&gt;</code></a> </li> </ul> </li> <li><a href="#opampinoptions">Dig into opam pin, find spicy features</a> <ul> <li><a href="#noaction">Add a pin without installing with <code>--no-action</code></a> </li> <li><a href="#updatepins">Update your pinned packages</a> </li> <li><a href="#unpin">Unpin packages</a> <ul> <li><a href="#releasedpins">Released packages</a> </li> <li><a href="#unreleasedpins">Unreleased packages</a> </li> <li><a href="#unpinnoaction">Unpin but do no action</a> </li> </ul> </li> <li><a href="#multiple">One URL to pin them all: handling a multi-package repository</a> </li> <li><a href="#version">Setting arbitrary version numbers, toying with fire</a> </li> <li><a href="#morefire">Setting multiple arbitrary version numbers</a> </li> </ul> </li> <li><a href="#conclusion">Conclusion</a> </div> </li> </ul> <h2> <a id="opampincontext" class="anchor"></a><a class="anchor-link" href="#opampincontext">Tutorial context and basis</a> </h2> <p>As far as context goes for this article, we will consider that you already are familiar with the concepts introduced in our tutorial <a href="https://ocamlpro.com/blog/2024_01_23_opam_101_the_first_steps/">opam 101</a>.</p> <p>Your current environment should thus be somewhat similar to the one we had by the end of that tutorial. Meaning: your version of <code>opam</code> is a least <code>2.1.5</code> (all outputs were generated with this version), you have already launched <code>opam init</code>, created a global switch <code>my-switch</code> and, possibly, you have even populated it with a few packages with a few calls to the <code>opam install</code> command.</p> <p>Furthermore, keep in mind that, in this blog post, we are approaching this subject from the perspective of a developer who is looking into integrating new packages to his current workload, not from the perspective of someone who is looking into sharing a project or publishing a new software.</p> <p><code>opam pin</code> is a feature that will quickly become necessary for you to use as you continue your exploration of <code>opam</code>. 
It allows the user to <strong>pin</strong> a given package to a specific version, or even to change the source from which said package is pulled, installed, and synchronised, from within the currently active <code>switch</code>.</p> <p>This feature shines the most in contexts such as:</p> <ul> <li>when doing ordinary <code>switch</code> management; </li> <li>for incorporating external, <em>still under-construction</em>, libraries into your own current workload; </li> <li>when designing a specific <code>switch</code>: pinning a specific package version will make it the main compatibility constraint for that switch, thus tailoring the environment around it in the process. </li> </ul> <blockquote> <p><strong>Reminder</strong></p> <p>Remember that <code>opam</code>'s command-line interface is beginner friendly. You can, at any point of your exploration, use the <code>--help</code> option to have every command and subcommand explained. You may also check out the <a href="https://ocamlpro.github.io/ocaml-cheat-sheets/ocaml-opam.pdf">opam cheat-sheet</a> that was released a while ago and still holds some precious insights on opam's CLI.</p> </blockquote> <h2> <a id="opampinusecase" class="anchor"></a><a class="anchor-link" href="#opampinusecase">Use-case for <code>opam pin</code></a> </h2> <p>Now, onto today's use-cases for <code>opam pin</code>. The premise is as follows:</p> <p>The package on which your current development depends has just had a major update on its <em>development</em> branch. This package is available on the opam <code>repository</code> and its name is <a href="https://ocaml.org/p/hc/latest"><code>hc</code></a>.</p> <p>That update introduced a new feature that you would very much like to experiment with for your own ongoing project.</p> <p>However, that feature is still very much a <em>work-in-progress</em> and the maintainers of <code>hc</code> are <strong>not</strong> about to release their package anytime soon...</p> <p>That's when <code>opam pin</code> comes in.
In this article, we will cover two similar use-cases for <code>opam pin</code>: pinning a version of a package that is already available on the <a href="https://opam.ocaml.org/packages/">opam <code>repository</code></a>, and pinning a version of an unreleased package, directly from its public URL.</p> <p>After all the basics have been laid out, we will eventually cover some of the more underground ⛏ and dangerous 🔥 features available when pinning packages.</p> <blockquote> <p><strong>Important Notice</strong></p> <p>For the sake of convenience and brevity, we will break down the <code>opam pin</code> command, and some of its options, by dealing only with addresses that obey the classic definition of the word <strong>URL</strong>.</p> <p>However, do keep in mind that <code>opam</code> uses <a href="https://opam.ocaml.org/doc/Manual.html#URLs">a broader definition</a> for that word, going as far as to consider a filesystem path to be a valid string for a <strong>URL</strong> argument, thus allowing all <code>opam pin</code> calls and options to be valid when manipulating <code>opam</code> packages inside a local filesystem or local network instead of <strong>just</strong> on the web.</p> </blockquote> <h3> <a id="opampindev" class="anchor"></a><a class="anchor-link" href="#opampindev">Pinning the dev version of a released package: <code>opam pin add --dev-repo</code></a> </h3> <p>Picking up from the base context: our project depends on <code>hc</code>, and <code>hc</code> has just received an update. The first option available for us to access this fresh update on the <code>hc</code> repository is to use the <code>opam pin add --dev-repo &lt;pkg&gt;</code> command.</p> <pre><code class="language-shell-session">$ opam pin add --dev-repo hc [hc.0.3] synchronised (git+https://git.zapashcanon.fr/zapashcanon/hc.git) hc is now pinned to git+https://git.zapashcanon.fr/zapashcanon/hc.git (version 0.3) The following actions will be performed: ∗ install dune 3.14.0 [required by hc] ∗ install hc 0.3* ===== ∗ 2 ===== Do you want to continue? [Y/n] y &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ⬇ retrieved hc.0.3 (no changes) ⬇ retrieved dune.3.14.0 (https://opam.ocaml.org/cache) ∗ installed dune.3.14.0 ∗ installed hc.0.3 Done.
</code></pre> <hr /> <h2>So what exactly did <code>opam pin</code> do here?</h2> <pre><code class="language-shell-session">$ opam pin add --dev-repo hc [hc.0.3] synchronised (git+https://git.zapashcanon.fr/zapashcanon/hc.git) </code></pre> <p>When you feed a package name to the <code>opam pin add --dev-repo</code> command, it will first retrieve the package definition found inside the <a href="https://github.com/ocaml/opam-repository/blob/master/packages/hc/hc.0.3/opam"><code>opam file</code></a> in the directory of the corresponding package on the <a href="https://github.com/ocaml/opam-repository">the Official OCaml opam <code>repository</code></a> or any other opam <code>repositories</code> that your local <code>opam</code> installation happens to be synchronised with.</p> <p>You can inspect said package definition directly yourself with the <code>opam show &lt;pkg&gt;</code> command.</p> <p>Let's take a look at the package definition for <code>hc</code>:</p> <pre><code class="language-shell-session">$ opam show hc &lt;&gt;&lt;&gt; hc: information on all versions &gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; name hc all-versions 0.0.1 0.2 0.3 &lt;&gt;&lt;&gt; Version-specific details &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; version 0.3 repository default url.src &quot;https://git.zapashcanon.fr/zapashcanon/hc/archive/0.3.tar.gz&quot; url.checksum &quot;sha256=61b443056adec3f71904c5775b8521b3ac8487df618a8dcea3f4b2c91bedc314&quot; &quot;sha512=a1d213971230e9c7362749d20d1bec6f5e23af191522a65577db7c0f9123ea4c0fc678e5f768418d6dd88c1f3689a49cf564b5c744995a9db9a304f4b6d2c68a&quot; homepage &quot;https://git.zapashcanon.fr/zapashcanon/hc&quot; doc &quot;https://doc.zapashcanon.fr/hc/&quot; bug-reports &quot;https://git.zapashcanon.fr/zapashcanon/hc/issues&quot; dev-repo &quot;git+https://git.zapashcanon.fr/zapashcanon/hc.git&quot; authors &quot;Léo Andrès &lt;contact@ndrs.fr&gt;&quot; maintainer &quot;Léo Andrès &lt;contact@ndrs.fr&gt;&quot; license &quot;ISC&quot; depends &quot;dune&quot; {&gt;= &quot;3.0&quot;} &quot;ocaml&quot; {&gt;= &quot;4.14&quot;} &quot;odoc&quot; {with-doc} synopsis Hashconsing library description hc is an OCaml library for hashconsing. It provides easy ways to use hashconsing, in a type-safe and modular way and the ability to get forgetful memoïzation. </code></pre> <p>Here, you can see the <code>dev-repo</code> field which contains the URL of the development repository of that package. Opam will use that information to retrieve package sources for you.</p> <hr /> <pre><code class="language-shell-session">hc is now pinned to git+https://git.zapashcanon.fr/zapashcanon/hc.git (version 0.3) </code></pre> <p>Once it has retrieved <code>hc</code> sources, opam will then store the status of the pin internally, which is that <code>hc</code> is <em>git pinned</em> to url <code>git.zapashcanon.fr/zapashcanon/hc</code> at version <code>0.3</code>.</p> <pre><code class="language-shell-session">$ opam pin list hc.0.3 git git+https://git.zapashcanon.fr/zapashcanon/hc.git </code></pre> <blockquote> <p><strong>Did you know?</strong> The default behaviour for <code>opam pin</code> is the <code>list</code> option. 
That is, calling <code>opam pin</code> without any argument lists all the packages pinned in the currently active switch.</p> <p>On the other hand, the default behaviour for the <code>opam pin &lt;target&gt;</code> command is the <code>add</code> option. Keep it in mind if you happen to grow tired of typing <code>opam pin add &lt;target&gt;</code> every time.</p> </blockquote> <hr /> <p>Opam will then analyse <code>hc</code>'s dependencies and compute a solution that respects the dependency constraints and the state of your current switch (i.e. the compatibility constraints between the packages currently installed in your switch).</p> <p>If it manages to do so, it will come forth with a prompt to install the pinned package and its dependencies.</p> <pre><code class="language-shell-session">The following actions will be performed: ∗ install dune 3.14.0 [required by hc] ∗ install hc 0.3* ===== ∗ 2 ===== Do you want to continue? [Y/n] y </code></pre> <p>Pressing <code>Enter</code> or <code>y + Enter</code> will perform the installation.</p> <blockquote> <p>Notice that sometimes a <code>*</code> character is found next to some package actions? It's the shorthand signal that the package is pinned: if you know what to look for, you can get that information at a quick glance whenever <code>opam</code> outputs the actions it is about to perform.</p> </blockquote> <pre><code class="language-shell-session">&lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ⬇ retrieved hc.0.3 (no changes) ⬇ retrieved dune.3.14.0 (https://opam.ocaml.org/cache) ∗ installed dune.3.14.0 ∗ installed hc.0.3 Done. </code></pre> <p>Congratulations, you now have a pinned <em>development</em> version of the <code>hc</code> package. You can now start exploring the neat feature you have been looking forward to!</p> <h3> <a id="opampinurl" class="anchor"></a><a class="anchor-link" href="#opampinurl">Pinning the dev version of an unreleased package: <code>opam pin add &lt;url&gt;</code></a> </h3> <p>Every once in a while on your OCaml journey, you will come across unreleased software.</p> <p>These OCaml programs and libraries can still very much have active repositories, but their maintainers have not yet gone as far as to release them in order to distribute their work through <code>opam</code> to the rest of the OCaml ecosystem.</p> <p>Yet, you might still want to have seamless access to these software solutions in your local <code>opam</code> installation for your own personal enjoyment and development. That's when <code>opam pin add &lt;url&gt;</code> comes in handy.</p> <p>Modern OCaml projects will most often have one or several <code>opam files</code> in their tree, which <code>opam</code> can operate with.</p> <pre><code class="language-shell-session">$ opam pin git+https://github.com/rjbou/opam-otopop Package opam-otopop does not exist, create as a NEW package? [Y/n] y opam-otopop is now pinned to git+https://github.com/rjbou/opam-otopop (version 0.1) The following actions will be performed: ∗ install opam-client 2.0.10 [required by opam-otopop] ∗ install opam-otopop 0.1* ===== ∗ 2 ===== Do you want to continue?
[Y/n] y &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ⬇ retrieved opam-client.2.0.10 (https://opam.ocaml.org/cache) ∗ installed opam-client.2.0.10 ∗ installed opam-otopop.0.1 Done. </code></pre> <p>As you can see, the course of an <code>opam pin add &lt;url&gt;</code> call is very close to that of an <code>opam pin add --dev-repo &lt;pkg&gt;</code>, the only exception being the following line:</p> <pre><code class="language-shell-session">Package opam-otopop does not exist, create as a NEW package? [Y/n] y </code></pre> <p>Since the package is unavailable on the opam <code>repositories</code> that your <code>opam</code> installation is synchronised with, <code>opam</code> doesn't know about it.</p> <p>That's why it will ask you if you want to <code>create it as a NEW package</code>.</p> <p>Once pinned, that package is available in your switch as any other ordinarily available <code>repository</code> package.</p> <hr /> <p>You can see here that <code>opam</code> has pinned the <code>opam-otopop</code> package to a specific <code>0.1</code> version.</p> <pre><code class="language-shell-session">opam-otopop is now pinned to git+https://github.com/rjbou/opam-otopop (version 0.1) </code></pre> <p>The reason for that is found inside the <a href="https://github.com/rjbou/opam-otopop/blob/master/opam-otopop.opam#L2"><code>opam file</code></a> at the root of the source repository for that package:</p> <pre><code class="language-shell-session">version: &quot;0.1&quot; </code></pre> <p>In any instance where this specific field is not found in the <code>opam file</code>, the version name would then be pinned to the verbatim <code>~dev</code> version.</p> <h2> <a id="opampinoptions" class="anchor"></a><a class="anchor-link" href="#opampinoptions">Dig into opam pin, find spicy features</a> </h2> <h3> <a id="noaction" class="anchor"></a><a class="anchor-link" href="#noaction">Add a pin without installing with <code>--no-action</code></a> </h3> <p>Here are the two main use-cases for a call to <code>opam pin</code> with the <code>--no-action</code> option:</p> <ul> <li>You <strong>don't</strong> want to install a package immediately, but <strong>do</strong> want to inform <code>opam</code> of its existence to allow <code>opam</code> to keep the compatibility constraints of that specific package in the equation whenever you are undertaking operations that would require such calculations; </li> <li>You just want to be assured that your package will be synchronised with the right sources; </li> </ul> <p><code>--no-action</code> will only perform the first actions of an <code>opam pin</code> call and will quit <strong>before</strong> installing the package, it can be used with all pin subcommands.</p> <pre><code class="language-shell-session">$ opam pin add hc --dev-repo --no-action [hc.0.3] synchronised (git+https://git.zapashcanon.fr/zapashcanon/hc.git) hc is now pinned to git+https://git.zapashcanon.fr/zapashcanon/hc.git (version 0.3) $ </code></pre> <h3> <a id="updatepins" class="anchor"></a><a class="anchor-link" href="#updatepins">Update your pinned packages</a> </h3> <p>There are two ways to go about updating and upgrading your pinned packages. 
They are the same no matter if you used the <code>--dev-repo</code> option, or <code>&lt;url&gt;</code> argument, or any other method for pinning them.</p> <p>The first one you may consider is to either install, or reinstall the specific package(s). The reason is that <code>opam</code> will always first synchronise with the linked source, and then proceed to recompiling.</p> <pre><code class="language-shell-session">$ opam install opam-otopop &lt;&gt;&lt;&gt; Synchronising pinned packages &gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; [opam-otopop.0.1] synchronised (git+https://github.com/rjbou/opam-otopop#master) The following actions will be performed: ↻ recompile opam-otopop 0.1* &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ⊘ removed opam-otopop.0.1 ∗ installed opam-otopop.0.1 Done. </code></pre> <p>In the above code block, <code>opam-otopop</code> has been upgraded by that <code>opam install</code> call.</p> <p>The second method is to use the specific <code>opam update</code> and <code>opam upgrade</code> mechanisms. These commands are very common in an <code>opam</code> abiding workflow. Their general usage was briefly mentioned in our article <a href="https://ocamlpro.com/blog/2024_01_23_opam_101_the_first_steps/#packages">opam 101</a>.</p> <p>By default, <code>opam update</code> updates the state of your opam <code>repositories</code>, for you to have access to the most recent version of your packages. If you add the <code>--development</code> flag to it, it will also update the source code of your pinned packages internally.</p> <pre><code class="language-shell-session">$ opam update --development &lt;&gt;&lt;&gt; Synchronising development packages &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; [opam-otopop.0.1] synchronised (git+https://github.com/rjbou/opam-otopop#master) Now run 'opam upgrade' to apply any package updates. </code></pre> <p>Then you run <code>upgrade</code> as you would in any other package upgrade scenario.</p> <pre><code class="language-shell-session">$ opam upgrade The following actions will be performed: ↻ recompile opam-otopop 0.1* [upstream or system changes] &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ⊘ removed opam-otopop.0.1 ∗ installed opam-otopop.0.1 Done. 
</code></pre> <h3> <a id="unpin" class="anchor"></a><a class="anchor-link" href="#unpin">Unpin packages</a> </h3> <p>When you are done with your experimentation and wish to remove a pinned package, you can simply call the <code>remove</code> subcommand.</p> <blockquote> <p>Keep in mind that <code>opam unpin</code> is an alias for <code>opam pin remove</code>.</p> </blockquote> <p>The behaviour of <code>opam unpin</code> is slightly different between released and unreleased packages.</p> <h4> <a id="releasedpins" class="anchor"></a><a class="anchor-link" href="#releasedpins">Released packages</a> </h4> <p>If the pinned package is released, by default, <code>opam</code> will retrieve and install the released version of the package instead of removing that package altogether.</p> <pre><code class="language-shell-session">$ opam pin list hc.0.3 git git+https://git.zapashcanon.fr/zapashcanon/hc.git </code></pre> <pre><code class="language-shell-session">$ opam list hc # Packages matching: name-match(hc) &amp; (installed | available) # Package # Installed # Synopsis hc.0.3 0.3 pinned to version 0.3 at git+https://git.zapashcanon.fr/zapashcanon/hc.git </code></pre> <pre><code class="language-shell-session">$ opam pin remove hc Ok, hc is no longer pinned to git+https://git.zapashcanon.fr/zapashcanon/hc.git (version 0.3) The following actions will be performed: ↻ recompile hc 0.3 Do you want to continue? [Y/n] y &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ⬇ retrieved hc.0.3 (https://opam.ocaml.org/cache) ⊘ removed hc.0.3 ∗ installed hc.0.3 Done. </code></pre> <pre><code class="language-shell-session">$ opam list hc # Packages matching: name-match(hc) &amp; (installed | available) # Package # Installed # Synopsis hc.0.3 0.3 Hashconsing library </code></pre> <p>As we can see in the details:</p> <pre><code class="language-shell-session">⬇ retrieved hc.0.3 (https://opam.ocaml.org/cache) </code></pre> <p><code>opam</code> has retrieved the sources from the archive that is specified in the <code>opam file</code> of the relevant opam <code>repository</code>, thus pulling <code>hc</code> back down to its latest available, <em>current-switch compatible</em>, release.</p> <blockquote> <p>Notice the absence of the <code>*</code> character next to the package action? 
It means the package is no longer pinned.</p> </blockquote> <h4> <a id="unreleasedpins" class="anchor"></a><a class="anchor-link" href="#unreleasedpins">Unreleased packages</a> </h4> <p>On the other hand, for an unreleased package, the pin itself <strong>is</strong> its only definition source, meaning both the <strong>location of its source code</strong> and <strong>all the information required for <code>opam</code> to operate</strong>, found in the corresponding <code>opam file</code>. In that case, <code>opam</code> has no other choice than to offer to remove the package for you.</p> <pre><code class="language-shell-session">$ opam pin list opam-otopop.0.1 git git+https://github.com/rjbou/opam-otopop#master </code></pre> <p>In this case, <code>opam unpin &lt;package-name&gt;</code> (or, equivalently, <code>opam pin remove &lt;package-name&gt;</code>) launches an <code>opam remove</code> action:</p> <pre><code class="language-shell-session">$ opam pin remove opam-otopop Ok, opam-otopop is no longer pinned to git+https://github.com/rjbou/opam-otopop#master (version 0.1) The following actions will be performed: ⊘ remove opam-otopop 0.1 Do you want to continue? [Y/n] y &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ⊘ removed opam-otopop.0.1 Done. </code></pre> <h4> <a id="unpinnoaction" class="anchor"></a><a class="anchor-link" href="#unpinnoaction">Unpin but do no action</a> </h4> <p>Just like with the <code>opam pin add</code> command, the <code>--no-action</code> option is available when removing pins. It will <strong>only</strong> unpin the package, without removing or recompiling it.</p> <pre><code class="language-shell-session">$ opam pin remove opam-otopop --no-action Ok, opam-otopop is no longer pinned to git+https://github.com/rjbou/opam-otopop#master (version 0.1) $ opam list opam-otopop # Packages matching: name-match(opam-otopop) &amp; (installed | available) # Package # Installed # Synopsis opam-otopop.0.1 0.1 An opam-otopop package </code></pre> <p>You may use it to remove the <code>pin</code> from a package while keeping it installed in your <code>switch</code>, or to replace it by the version defined in the opam <code>repository</code>.</p> <p>The resulting package remains linked to its URL, but it is no longer considered pinned, so there will be no update or automatic syncing to follow the changes of the upstream branch.</p> <p>You may also use this feature as a temporary state, to prepare a specific action. For example, you could unpin several packages in a row, and then proceed to recompiling the whole batch in one go.</p> <h3> <a id="multiple" class="anchor"></a><a class="anchor-link" href="#multiple">One URL to pin them all: handling a multi-package repository</a> </h3> <p>Every example seen so far had but one <code>opam file</code> at the root of its respective work tree (sometimes in a specific <code>opam/</code> directory).</p> <p>Yet it is possible for some projects to have several packages distributed by a single repository. An example of this would be the <a href="https://github.com/ocaml/opam">opam project source repository itself</a>.
If that is the case, and you pin that URL, the default behaviour is that all the packages defined at that address will be pinned.</p> <p>Let's take <a href="https://github.com/OCamlPro/ocp-index">this project</a>.</p> <p>You can see that several packages are defined: <code>ocp-index</code> and <code>ocp-browser</code>.</p> <p>Here's how a <code>pin</code> action behaves when given that URL:</p> <pre><code class="language-shell-session">$ opam pin add git+https://github.com/OCamlPro/ocp-index This will pin the following packages: ocp-browser, ocp-index. Continue? [Y/n] y ocp-browser is now pinned to git+https://github.com/OCamlPro/ocp-index (version 1.3.6) ocp-index is now pinned to git+https://github.com/OCamlPro/ocp-index (version 1.3.6) The following actions will be performed: ∗ install ocp-indent 1.8.1 [required by ocp-index] ∗ install ocp-index 1.3.6* ∗ install ocp-browser 1.3.6* ===== ∗ 3 ===== Do you want to continue? [Y/n] y &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ⬇ retrieved ocp-indent.1.8.1 (https://opam.ocaml.org/cache) ∗ installed ocp-indent.1.8.1 ∗ installed ocp-index.1.3.6 ∗ installed ocp-browser.1.3.6 Done. </code></pre> <p>As you can see, this process is exactly the same as before, but with 3 packages in one go.</p> <p><strong>What if I do not want to pin every package in that repository?</strong></p> <p>Easy: if you just need one of the packages found at that URL, you can just feed that package name to the <code>opam pin add &lt;package-name&gt; &lt;url&gt;</code> CLI call, just like we did at the beginning of this tutorial!</p> <pre><code class="language-shell-session">$ opam pin add ocp-index git+https://github.com/OCamlPro/ocp-index [ocp-index.1.3.6] synchronised (git+https://github.com/OCamlPro/ocp-index) ocp-index is now pinned to git+https://github.com/OCamlPro/ocp-index (version 1.3.6) The following actions will be performed: ∗ install ocp-indent 1.8.1 [required by ocp-index] ∗ install ocp-index 1.3.6* ===== ∗ 2 ===== Do you want to continue? [Y/n] y &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ⬇ retrieved ocp-indent.1.8.1 (cached) ∗ installed ocp-indent.1.8.1 ∗ installed ocp-index.1.3.6 Done. 
</code></pre> <p>If you do not know the exact names of these different packages, you may also consider using the very handy <code>opam pin scan</code> command, which will look up the contents of the repository at that URL and list its <code>opam</code> packages for you:</p> <pre><code class="language-shell-session">$ opam pin scan git+https://github.com/OCamlPro/ocp-index # Name # Version # Url ocp-index 1.3.6 git+https://github.com/OCamlPro/ocp-index ocp-browser 1.3.6 git+https://github.com/OCamlPro/ocp-index </code></pre> <h3> <a id="version" class="anchor"></a><a class="anchor-link" href="#version">Setting arbitrary version numbers, toying with fire</a> </h3> <p>As demonstrated <a href="#opampinurl">earlier</a>, <code>opam</code> will choose a version of the pinned package according to the contents of the <code>opam file</code>.</p> <p>The important thing to take away is that, in most usual scenarios, the contents of the <code>opam file</code> are paramount to how <code>opam</code> calculates compatibility constraints in a given <code>switch</code>.</p> <p>It is <strong>from</strong> the information that is hardcoded <strong>inside</strong> the <code>opam file</code> that <code>opam</code> is able to make educated decisions whenever changes to the state of your current <code>switch</code> are to be made. There is, however, a way to circumvent that behaviour which we want to tell you about, even if it calls for a bit of precaution.</p> <blockquote> <p>Naturally, directly tinkering with a key stability feature like <code>compatibility constraint solving</code> does require you to <strong>tread carefully</strong>. We will see together some of the pitfalls, and the things to do that will keep you from finding yourself in confusing situations with regard to the state of your <code>switch</code> and the dependencies within it.</p> </blockquote> <p><strong>Ready? Let's get acquainted with our first slightly <em>dangerous</em> <code>opam</code> feature:</strong></p> <p>You are allowed to append an <strong>arbitrary</strong> version number to the name of the pinned package for <code>opam</code> to incorporate in its calculations, as seen in the following code block:</p> <pre><code class="language-shell-session">$ opam pin add directories.1.0 git+https://github.com/ocamlpro/directories --no-action [directories.1.0] synchronised (git+https://github.com/ocamlpro/directories) directories is now pinned to git+https://github.com/ocamlpro/directories (version 1.0) </code></pre> <p>In this specific example, package <a href="https://github.com/ocaml/opam-repository/blob/master/packages/directories/directories.0.5/opam"><code>directories</code></a> is available in the opam <a href="https://ocaml.org/p/directories/latest"><code>repository</code></a> that our <code>opam</code> installation is synchronised with. However, there is no such <code>1.0</code> version in that <code>repository</code>.
Not a single reference to such a version number can be found at that address, neither in the <code>tags</code> nor in the <code>releases</code> of the repository, and not even in the <a href="https://github.com/OCamlPro/directories/blob/master/directories.opam"><code>opam file</code></a>.</p> <pre><code class="language-shell-session">$ opam show directories --field all-versions 0.1 0.2 0.3 0.4 0.5 </code></pre> <p>What we have done here is effectively tell <code>opam</code> that <code>directories</code> is at a different version number than the one it <strong>actually</strong> carries, strictly technically speaking...</p> <p><strong>But why would we want to do such a thing?</strong></p> <hr /> <p>Let's consider a reasonable use-case for <code>opam pin add &lt;package&gt;.&lt;my-version-number&gt; &lt;url&gt;</code>:</p> <p>You have been working on a project called <code>my-project</code> for some time and you are using a package named <code>fst-dep</code> for your development.</p> <p>Below, you will find an excerpt of the <code>fst-dep.opam</code> file, specifically its dependencies:</p> <pre><code class="language-shell-session">depends: [ &quot;dep-to-try&quot; { &lt;= &quot;3.0.0&quot; } &quot;other-dep&quot; ] </code></pre> <p>All three packages (<code>fst-dep</code>, <code>dep-to-try</code> and <code>other-dep</code>) are installed in your current switch and are available on your favourite opam <code>repository</code>.</p> <p>One day you go about checking the repository for each dependency, and you find that <code>dep-to-try</code> has just had one of its main features <strong>reimplemented</strong>, improved and optimised, and that its maintainers are preparing to release a <code>4.0.0</code> version soon.</p> <p>See, these changes would have been available for you to fetch directly from its <em>development</em> repository had you been working with it directly, but you are not. It is up to the maintainers of <code>fst-dep</code> to do that work.</p> <p>Since you have no ownership over any of these dependencies, you have no way of changing any of the version constraints in this tiny dependency tree that ranges from <code>fst-dep</code> and upwards.</p> <p><strong>Here are the three mainstream solutions to this problem:</strong></p> <ul> <li>Wait for both packages to publish new releases. A new official release from the <code>dep-to-try</code> team, which would ship said reimplementation, and another from the <code>fst-dep</code> team, which would update its dependency tree to include <code>dep-to-try</code>'s latest version. Needless to say, this could take an arbitrary amount of time, which is unsatisfying at best. </li> <li>Another suboptimal solution would be to copy the current state of the entire opam <code>repository</code> relevant to your package distribution, go to the corresponding directory for <code>fst-dep</code> inside that <code>repository</code>, relax the hard dependency <code>&quot;dep-to-try&quot; { &lt;= &quot;3.0.0&quot; }</code> and reinstall all the packages that are directly or indirectly affected by that change. A very time-consuming task for such a small edit to the global dependency tree. </li> <li>The last option would be to pin <code>fst-dep</code>, then go about manually editing the dependencies of <code>fst-dep</code> with the <code>opam pin --edit</code> option to relax the dependency.
The only pitfall with this solution is that, in a context where <code>dep-to-try</code> is a <strong>key</strong> package in the OCaml distribution, and many other packages depend on it as well, you might have to do <strong>a lot</strong> of editing to make your <code>switch</code> a stable environment with all dependency constraints met... </li> </ul> <p>So none of these solutions fits our needs. They are all unsatisfactory at best and even counter-productive at worst.</p> <p><strong>That's when <code>arbitrary version pinning</code> shines.</strong></p> <p>The main benefit of this feature is that it allows for added flexibility in navigating and tweaking the compatibility tree of any opam <code>repository</code> at the <code>switch</code> level. It provides the user with ways to circumvent all tasks pertaining to a larger operation on the global graph of packages.</p> <pre><code class="language-shell-session">$ opam pin dep-to-try.3.0.0 git+https://github.com/OCamlPro/dep-to-try [dep-to-try.3.0.0] synchronised (file:///home/rjbou/ocamlpro/opam_bps_examples/dep-to-try) dep-to-try is now pinned to git+https://github.com/OCamlPro/dep-to-try#master (version 3.0.0) </code></pre> <p><code>opam</code> will still think that <code>dep-to-try</code>'s version is valid (<code>{ &lt;= &quot;3.0.0&quot;}</code>), even if you are synchronised with the state of its <em>development</em> branch, thus giving you access to the latest changes with the minimal amount of manual editing required. Pretty neat, right?</p> <p>Now, onto the pitfalls that you should keep in mind when tinkering with your dependencies like that.</p> <p><strong>What kind of predicament awaits you?</strong></p> <ol> <li>You could introduce unforeseen behaviours. This could be anything from errors at compile-time, if <code>dep-to-try</code>'s interfaces have changed significantly, to runtime crashes if you're unlucky. </li> <li>Another source of confusion could arise if you happen to use the <code>opam unpin dep-to-try --no-action</code> command on such a package. After unpinning it, there's a chance that you would later forget it used to be pinned to a <em>development</em> version. There would be little to no way for you to remember which package it was that you had experimented with at some point. You would either have to inspect all your installed packages or even remake a <code>switch</code> from scratch, which would not be affected by your reckless <code>arbitrary version pinning</code> and would work just fine after that. </li> </ol> <p>Our advice is rather simple: use this feature with discretion and avoid unpinning such packages unless it is to reinstall or remove them altogether. If you follow these instructions, you <strong>should</strong> be safe...</p> <h3> <a id="morefire" class="anchor"></a><a class="anchor-link" href="#morefire">Setting multiple arbitrary version numbers</a> </h3> <p>One last bit of black magic for you to play around with.</p> <p>Instead of pinning <code>package-name.my-version-number</code>, you may use the <code>--with-version</code> option to pin packages at that URL to an arbitrary version. A key detail is that it also works when the URL contains multiple <code>opam files</code>...
Just keep in mind that all the pitfalls mentioned previously apply here too, only with multiple packages at once, which could make it more confusing.</p> <p>Below, you can see that we are setting <strong>all</strong> the packages found in that repository to the same version:</p> <pre><code class="language-shell-session">$ opam pin add git+https://github.com/OCamlPro/ocp-index --with-version 2.0.0 This will pin the following packages: ocp-browser, ocp-index. Continue? [Y/n] y ocp-browser is now pinned to git+https://github.com/OCamlPro/ocp-index (version 2.0.0) ocp-index is now pinned to git+https://github.com/OCamlPro/ocp-index (version 2.0.0) The following actions will be performed: ∗ install ocp-indent 1.8.1 [required by ocp-index] ∗ install ocp-index 2.0.0* ∗ install ocp-browser 2.0.0* ===== ∗ 3 ===== Do you want to continue? [Y/n] y &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ⬇ retrieved ocp-indent.1.8.1 (cached) ⬇ retrieved ocp-index.2.0.0 (no changes) ⬇ retrieved ocp-browser.2.0.0 (no changes) ∗ installed ocp-indent.1.8.1 ∗ installed ocp-index.2.0.0 ∗ installed ocp-browser.2.0.0 Done. </code></pre> <p>You can see that all these packages are pinned to <code>2.0.0</code> now.</p> <pre><code class="language-shell-session">$ opam pin list ocp-browser.2.0.0 git git+https://github.com/OCamlPro/ocp-index ocp-index.2.0.0 git git+https://github.com/OCamlPro/ocp-index </code></pre> <h2> <a id="conclusion" class="anchor"></a><a class="anchor-link" href="#conclusion">Conclusion</a> </h2> <p>Here it is, the <code>opam pin</code> command in most of its glory.</p> <p>If you have managed to stick this long to read this article, you should no longer feel confused about pinning projects and should now have another of <code>opam</code>'s most commonly used feature in your arsenal when tackling your own development challenges!</p> <p>So it is that we have learned about pinning both released and unreleased packages. Additionally, we showcased several features for orthogonal use-cases: from the more <em>quality of life</em>-oriented calls such as <code>opam show</code> and <code>opam pin scan</code>, to obscure features like arbitrary version pinning as well as ordinary options like <code>--no-action</code>, <code>--dev-repo</code> and subcommands like <code>opam unpin</code>.</p> <p>We are steadily approaching a level of familiarity with <code>opam</code> that will allow us to get into some really neat features soon.</p> <p>Be sure to stay tuned with our blog, the journey into the rabbit hole has only started and <code>opam</code> is a deep one indeed!</p> <hr /> <p>Thank you for reading,</p> <p>From 2011, with love,</p> <p>The OCamlPro Team</p> Flambda2 Ep. 1: Foundational Design Decisions https://ocamlpro.com/blog/2024_03_19_the_flambda2_snippets_1 2024-03-19T13:48:57Z 2024-03-19T13:48:57Z Pierre Chambart Vincent Laviron Guillaume Bury Nathanaëlle Courant Dario Pinto Welcome to The Flambda2 Snippets! In this first post of The Flambda2 Snippets, we dive into the powerful CPS-based internal representation used within the Flambda2 optimizer, which was one of the main motivation to move on from the former Flambda optimizer. Credit goes to Andrew Kennedy's paper Comp... 
<h3>Welcome to <strong>The Flambda2 Snippets</strong>!</h3> <p>In this first post of <a href="/blog/2024_03_18_the_flambda2_snippets_0/">The Flambda2 Snippets</a>, we dive into the powerful CPS-based internal representation used within the <a href="https://github.com/ocaml-flambda/flambda-backend/tree/main/middle_end/flambda2">Flambda2 optimizer</a>, which was one of the main motivations to move away from the former Flambda optimizer.</p> <p><strong>Credit goes to Andrew Kennedy's paper <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2007/10/compilingwithcontinuationscontinued.pdf"><em>Compiling with Continuations, Continued</em></a> for pointing us in this direction.</strong></p> <blockquote> <p>The <strong>F2S</strong> blog posts aim to gradually introduce the world to the inner workings of a complex piece of software engineering: the <code>Flambda2 Optimising Compiler</code>, a technical marvel born from a 10-year-long effort in Research &amp; Development and Compilation, backed by many more years of expertise in all aspects of Computer Science and Formal Methods.</p> </blockquote> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#cps">CPS (Continuation Passing Style)</a> </li> <li><a href="#double-barrelled">Double Barrelled CPS</a> </li> <li><a href="#term">The Flambda2 Term Language</a> </li> <li><a href="#roadmap">Following up</a> </div> </li> </ul> <hr /> <h2> <a id="cps" class="anchor"></a><a class="anchor-link" href="#cps">CPS (Continuation Passing Style)</a> </h2> <p>Terms in the <code>Flambda2</code> IR are represented in CPS style, so let us briefly explain what that means.</p> <p>Some readers may already be familiar with what we call <em>First-Class CPS</em>, where continuations are represented using functions of the language:</p> <pre><code class="language-ocaml">(* Non-tail-recursive implementation of map *) let rec map f = function | [] -&gt; [] | x :: r -&gt; f x :: map f r (* Tail-recursive CPS implementation of map *) let rec map_cps f l k = match l with | [] -&gt; k [] | x :: r -&gt; let fx = f x in map_cps f r (fun r -&gt; k (fx :: r)) </code></pre> <p>This kind of transformation is useful to make a recursive function tail-recursive, and sometimes to avoid allocations for functions returning multiple values (we give a small aside on that second point just below).</p> <p>In <code>Flambda2</code>, we use <em>Second-Class CPS</em> instead, where continuations are <strong>control-flow constructs in the Intermediate Language</strong>.</p>
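<p>As promised, a quick aside on the multiple-return-values point: here is a small, self-contained OCaml sketch of our own making (plain OCaml in First-Class CPS style, not <code>Flambda2</code> IR; the function names are ours). Instead of allocating a pair to return two results, the CPS variant hands both results to its continuation as separate arguments. Whether the allocation is actually avoided in the end depends on how the compiler inlines the continuation, so take this as an illustration rather than a guarantee.</p> <pre><code class="language-ocaml">(* Direct style: every call allocates a pair to return both results. *)
let min_max_pair x y = if x &lt;= y then (x, y) else (y, x)

(* First-Class CPS style: both results are passed to the continuation [k]
   as separate arguments, so no intermediate tuple is built here. *)
let min_max_cps x y k = if x &lt;= y then k x y else k y x

let () =
  let lo, hi = min_max_pair 3 1 in
  Printf.printf "pair: lo=%d hi=%d\n" lo hi;
  min_max_cps 3 1 (fun lo hi -&gt; Printf.printf "cps:  lo=%d hi=%d\n" lo hi)
</code></pre> <p>Back to <code>Flambda2</code> and its Second-Class continuations.</p>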
In practice, this is equivalent to an explicit representation of a control-flow graph.</p> <p>Here's an example using some <strong>hopefully</strong> intuitive syntax for the <code>Flambda2</code> IR.</p> <pre><code class="language-ocaml">let rec map f = function | [] -&gt; [] | x :: r -&gt; f x :: map f r (* WARNING: FLAMBDA2 PSEUDO-SYNTAX INBOUND *) let rec map ((f : &lt;whatever_type1&gt; ), (param : &lt;whatever_type2&gt;)) {k_return_continuation : &lt;return_type&gt;} { let_cont k_empty () = k_return_continuation [] in let_cont k_nonempty x r = let_cont k_after_f fx = let_cont k_after_map map_result = let result = fx :: map_result in k_return_continuation result in Apply (map f r {k_after_map}) in Apply (f x {k_after_f}) in match param with | [] -&gt; k_empty () | x :: r -&gt; k_nonempty x r } </code></pre> <p>Every <code>let_cont</code> binding declares a new sequence of instructions in the control-flow graph, which can be terminated either by:</p> <ul> <li>calling a continuation (for example, <code>k_return_continuation</code>) which takes a fixed number of parameters; </li> <li>applying an OCaml function (<code>Apply</code>), this function takes as a special parameter the continuation which it must jump to at the end of its execution. Unlike continuations, OCaml functions can take a number of arguments that does not match the number of parameters at their definition; </li> <li>branching constructions like <code>match _ with</code> and <code>if _ then _ else _</code>, in these cases each branch is a call to a (potentially different) continuation; </li> </ul> <p> <div class="figure"> <p> <a href="/blog/assets/img/flambda2_snippets2_ep1_figure1.png"> <img alt="This image shows the previous code represented as a graph." src="/blog/assets/img/flambda2_snippets2_ep1_figure1.png"/> </a> <div class="caption"> This image shows the previous code represented as a graph. 
</div> </p> </div> </p> <blockquote> <p>Notice that some boxes are nested to represent scoping relations: variables defined in the outer boxes are available in the inner ones.</p> </blockquote> <p>To demonstrate the kinds of optimisations that such control-flow graphs allow us to perform, see the following simple example:</p> <p><strong>Original Program:</strong></p> <pre><code class="language-ocaml">let f cond = let v = if cond then raise Exit else 0 in v, v </code></pre> <p>We then represent the same program using CPS in two steps: the first is the direct translation of the original program, the second an equivalent program represented in a more compact form.</p> <p><strong>Minimal CPS transformation, using pseudo-syntax</strong></p> <pre><code class="language-ocaml">(* showing only the body of f *) (* STEP 1 - Before graph compaction *) let_cont k_after_if v = let result = v, v in k_return_continuation result in let_cont k_then () = k_raise_exception Exit in let_cont k_else () = k_after_if 0 in if cond then k_then () else k_else () </code></pre> <p>which becomes after inlining <code>k_after_if</code>:</p> <pre><code class="language-ocaml">(* STEP 2 - After graph compaction *) let_cont k_then () = k_raise_exception Exit in let_cont k_else () = let v = 0 in let result = v, v in k_return_continuation result in if cond then k_then () else k_else () </code></pre> <p>This allows us, by using the translation to CPS and back, to transform the original program into the following:</p> <p><strong>Optimized original program</strong></p> <pre><code class="language-ocaml">let f cond = if cond then raise Exit else 0, 0 </code></pre> <p>As you can see, the original program is simpler now. The changes operated on the code are in fact not tied to a particular optimisation, but rather to the nature of the CPS transformation itself. Moreover, we do want to actively perform optimisations, and to that extent, having an intermediate representation that is equivalent to a control-flow graph allows us to benefit from the huge amount of literature on the static analysis of imperative programs, which are often represented as control-flow graphs.</p> <p>To be fair, in the previous example, we have cheated in how we have translated the <code>raise</code> primitive. Indeed, we used a simple continuation (<code>k_raise_exception</code>) that we had not defined anywhere prior. This is possible because of our use of Double Barrelled CPS.</p> <h2> <a id="double-barrelled" class="anchor"></a><a class="anchor-link" href="#double-barrelled">Double Barrelled CPS</a> </h2> <p>In OCaml, all functions can not only return normally (Barrel 1) but also raise exceptions (Barrel 2). This corresponds to two different paths in the control-flow, and we need the ability to represent both in our own control-flow graph.</p> <p>Hence the name <code>Double Barrelled CPS</code>, which we took from <a href="https://web.archive.org/web/20210420165356/https://www.cs.bham.ac.uk/~hxt/research/HOSC-double-barrel.pdf">this paper</a> by Hayo Thielecke. In practice, this only has consequences in four places:</p> <ol> <li>the function definitions must have two special parameters instead of one: the exception continuation (<code>k_raise_exception</code>) in addition to the normal return continuation (<code>k_return_continuation</code>); </li> <li>the function applications must have two special arguments, reciprocally; </li> <li><code>try ...
with</code> terms are translated using regular continuations with the exception handler (the <code>with</code> path of the construct) compiled to a continuation handler (<code>let_cont</code>); </li> <li><code>raise</code> terms are translated into continuation calls, to either the current function exception continuation (e.g. in case of uncaught exception) or the enclosing <code>try ... with</code> handler continuation. </li> </ol> <h2> <a id="term" class="anchor"></a><a class="anchor-link" href="#term">The Flambda2 Term Language</a> </h2> <p>This CPS form has directed the concrete implementation of the FL2 language.</p> <p>We can see that the previous IRs have very descriptive representations, with about 20 constructors for <code>Clambda</code> and 15 for <code>Flambda</code> while <code>Flambda2</code> has regrouped all these features into only 6 categories which are sorted by how they affect the control-flow.</p> <pre><code class="language-ocaml">type expr = | Let of let_expr | Let_cont of let_cont_expr | Apply of apply | Apply_cont of apply_cont | Switch of switch | Invalid of { message : string } </code></pre> <p>The main benefits we reap from such a strong design choice are that:</p> <ul> <li>Code organisation is better: dealing with control-flow is only done when matching on full expressions and dealing with specific features of the language is done at a lower level; </li> <li>Reduce code duplication: features that behave in a similar way will have their common code shared by design; </li> </ul> <h2> <a id="roadmap" class="anchor"></a><a class="anchor-link" href="#roadmap">Following up</a> </h2> <p>The goal of this article was to show a fundamental design choice in <code>Flambda2</code> which is using a CPS-based representation. This design is felt throughout the <code>Flambda2</code> architecture and will be mentioned and strengthened again in later posts.</p> <p><code>Flambda2</code> takes the <code>Lambda</code> IR as input, then performs <code>CPS conversion</code>, followed by <code>Closure conversion</code>, each of them worth their own blog post, and this produces the terms in the <code>Flambda2</code> IR.</p> <p>From there, we have our main optimisation pass that we call <code>Simplify</code> which first performs static analysis on the term during a single <code>Downwards Traversal</code>, and then rebuilds an optimised term during the <code>Upwards Traversal</code>.</p> <p>Once we have an optimised term, we can convert it to the <code>CMM</code> IR and feed it to the rest of the backend. 
This part is mostly CPS elimination, but with added original and interesting work that we will detail in a specific snippet.</p> <p>The single-pass design allows us to consider all the interactions between optimisations.</p> <p>Some examples of optimisations performed during <code>Simplify</code>:</p> <ul> <li>Inlining of function calls; </li> <li>Constant propagation; </li> <li>Dead code elimination; </li> <li>Loopification, that is transforming tail-recursive functions into loops; </li> <li>Unboxing; </li> <li>Specialisation of polymorphic primitives; </li> </ul> <p>Most of the following snippets will detail one or several parts of these optimisations.</p> <p><strong>Stay tuned, and thank you for reading!</strong></p> Behind the Scenes of the OCaml Optimising Compiler Flambda2: Introduction and Roadmap https://ocamlpro.com/blog/2024_03_18_the_flambda2_snippets_0 2024-03-18T13:48:57Z 2024-03-18T13:48:57Z Pierre Chambart Vincent Laviron Dario Pinto Introducing our Flambda2 snippets At OCamlPro, the main ongoing task on the OCaml Compiler is to improve the high-level optimisation. This is something that we have been doing for quite some time now. Indeed, we are the authors behind the Flambda optimisation pass and today we would like to introduc... <p></p> <h2> <a id="introduction" class="anchor"></a><a class="anchor-link" href="#introduction">Introducing our Flambda2 snippets</a> </h2> <blockquote> <p>At OCamlPro, the main ongoing task on the OCaml Compiler is to improve the high-level optimisation. This is something that we have been doing for quite some time now. Indeed, we are the authors behind the <code>Flambda</code> optimisation pass and today we would like to introduce the series of blog snippets showcasing the direct successor to it, the creatively named <code>Flambda2</code>.</p> </blockquote> <p>This series of blog posts will cover everything about <code>Flambda2</code>, a new optimising backend for the OCaml native compiler. This introductory episode will provide you with some context and history about <a href="https://github.com/ocaml-flambda/flambda-backend"><code>Flambda2</code></a> but also about its predecessor <code>Flambda</code> and, of course, the OCaml compiler!</p> <p>This work may be considered as a complement to an ongoing documentation effort at OCamlPro, as well as to the many talks we gave last year on the subject, two of which you can watch online: <a href="https://www.youtube.com/watch?v=eI5GBpT2Brs">OCaml Workshop</a> ( <a href="https://cambium.inria.fr/seminaires/transparents/20230626.Vincent.Laviron.pdf">slideshow</a> ), <a href="https://www.youtube.com/watch?v=PRb8tRfxX3s">ML Workshop</a> ( <a href="https://cambium.inria.fr/seminaires/transparents/20230828.Vincent.Laviron.pdf">slideshow</a> ).</p> <p><strong>This work was developed in collaboration with, and funded by Jane Street.
Warm thanks to Mark Shinwell for shepherding the Flambda project and to Ron Minsky for his support.</strong></p> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#introduction">Introduction</a> </li> <li><a href="#compiling">Compiling OCaml</a> </li> <li><a href="#roadmap">Snippets Roadmap</a> </li> <li><a href="#listing">The F2S Series!</a> </div> </li> </ul> <h2> <a id="compiling" class="anchor"></a><a class="anchor-link" href="#compiling">Compiling OCaml</a> </h2> <p>The compiling of OCaml is done through a multitude of passes (see simplified representation below), and the bulk of high-level optimisations happens between the <code>Lambda</code> IR (Intermediate Representation) and <code>CMM</code> (which stands for <em>C--</em>). This set of optimisations will be the main focus of this series of snippets.</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/flambda2_snippets_ep0_figure3_1.png"> <img alt="The different passes of the OCaml compilers, from sources to executable code, before the addition of <code>Flambda</code>." src="/blog/assets/img/flambda2_snippets_ep0_figure3_1.png"/> </a> <div class="caption"> The different passes of the OCaml compilers, from sources to executable code, before the addition of <code>Flambda</code>. </div> </p> </div> </p> <p>Indeed, that part of the compiler is quite crowded. Originally, after the frontend has type-checked the sources, the <code>Closure</code> pass was in charge of transforming the <code>Lambda</code> IR <a href="https://github.com/ocaml/ocaml/blob/34cf5aafcedc2f7895c7f5f0ac27c7e58e4f4adf/lambda/lambda.mli#L279">(see source code)</a> into the <code>Clambda</code> IR <a href="https://github.com/ocaml/ocaml/blob/cce52acc7c7903e92078e9fe40745e11a1b944f0/middle_end/clambda.mli#L57">(see source code)</a>. This transformation handles <a href="https://en.wikipedia.org/wiki/Constant_folding"><em>Constant Propagation</em></a>, some <a href="https://en.wikipedia.org/wiki/Inline_expansion"><em>inlining</em></a>, and some <em>Constant Lifting</em> (moving constant structures to static allocation). Then, a subsequent pass (called <code>Cmmgen</code>) transforms the <code>Clambda</code> IR into the <code>CMM</code> IR <a href="https://github.com/ocaml/ocaml/blob/cce52acc7c7903e92078e9fe40745e11a1b944f0/asmcomp/cmm.mli#L168">(see source code)</a></b> and handles some <a href="https://en.wikipedia.org/wiki/Peephole_optimization">peep-hole optimisations</a> and <a href="https://en.wikipedia.org/wiki/Boxing_(computer_science)"><em>unboxing</em></a>. This final representation will be used by architecture-specific backends to produce assembler code.</p> <p>Before we get any further into the <strong>hairy</strong> details of <code>Flambda2</code> in the upcoming snippets, it is important that we address some context.</p> <p>We introduced the <code>Flambda</code> framework which was <a href="https://blog.janestreet.com/flambda/">released with <code>OCaml 4.03</code></a>. 
This was a success in improving <em>inlining</em> and related optimisations, and has been stable ever since, with very few bug reports.</p> <p>We kept both <code>Closure</code> and <code>Flambda</code> alive together because some users cared a lot about the compilation speed of OCaml: <code>Flambda</code> is indeed a bit slower than <code>Closure</code>.</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/flambda2_snippets_ep0_figure3_2.png"> <img alt="<code>Flambda</code> provides an alternative to the classic <code>Closure</code> transformation, with additional optimizations." src="/blog/assets/img/flambda2_snippets_ep0_figure3_2.png"/> </a> <div class="caption"> <code>Flambda</code> provides an alternative to the classic <code>Closure</code> transformation, with additional optimizations. </div> </p> </div> </p> <p>Now it is time to introduce an alternative to both <code>Flambda</code> and <code>Closure</code>: <code>Flambda2</code>, which is meant to eventually replace <code>Flambda</code> and potentially <code>Closure</code> as well. In fact, Jane Street has been gradually moving from <code>Closure</code> and <code>Flambda</code> to <code>Flambda2</code> over the past year and, to this day, has no more systems relying on <code>Closure</code> or <code>Flambda</code>.</p> <blockquote> <p>You can read more about the transition from staging to production-level workloads of <code>Flambda2</code> right <a href="https://ocamlpro.com/blog/2023_06_30_2022_at_ocamlpro/#flambda">here</a>.</p> </blockquote> <p><code>Flambda</code> is still maintained and will be for the foreseeable future. However, we have noticed some limitations that prevented us from performing certain kinds of optimisations, and on which we will elaborate in the following episodes of <em>The Flambda2 Snippets</em> series.</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/flambda2_snippets_ep0_figure3.png"> <img alt="<code>Flambda2</code> provides a much extended alternative to Flambda, from <code>Lambda</code> IR to <code>CMM</code>." src="/blog/assets/img/flambda2_snippets_ep0_figure3.png"/> </a> <div class="caption"> <code>Flambda2</code> provides a much extended alternative to Flambda, from <code>Lambda</code> IR to <code>CMM</code>. </div> </p> </div> </p> <p>One obvious difference to notice is that <code>Flambda2</code> translates directly to <code>CMM</code>, circumventing the <code>Clambda</code> IR, allowing us to lift some limitations inherent to <code>Clambda</code> itself.</p> <p>Furthermore, after releasing <code>Flambda</code>, we experimented with the aim of incrementally improving it and adding new optimisations. We tried to improve its internal representation and noticed that we could gain a lot by doing so, but also that it required deeper changes, and that is what led us to <code>Flambda2</code>.</p> <h2> <a id="roadmap" class="anchor"></a><a class="anchor-link" href="#roadmap">Snippets Roadmap</a> </h2> <p>This is but the zeroth snippet of the series.
It aims to provide you with history and context for <code>Flambda2</code>.</p> <p>You can expect the rest of the snippets to alternate between deep dives into the technical aspects of <code>Flambda2</code>, and user-facing descriptions of the new optimisations that we enable.</p> <h2> <a id="listing" class="anchor"></a><a class="anchor-link" href="#listing">The F2S Series!</a> </h2> <ul> <li> <p><a href="/blog/2024_01_31_the_flambda2_snippets_1">Episode 1: Foundational Design Decisions in Flambda2</a></p> <p>The first snippet covers the characteristics and benefits of a CPS-based internal representation for the optimisation of the OCaml language. It was already covered in part <a href="https://icfp23.sigplan.org/details/ocaml-2023-papers/8/Efficient-OCaml-compilation-with-Flambda-2">at the OCaml Workshop</a> in 2023, and we go deeper into the subject in these blog posts.</p> </li> <li> <p><a href="/blog/2024_05_07_the_flambda2_snippets_2">Episode 2: Loopifying Tail-Recursive Functions</a></p> <p><code>Loopify</code> is the first optimisation algorithm that we introduce in the <strong>F2S</strong> series. In this post, we break down the transformation of tail-recursive functions, in the context of reducing memory allocations inside the <code>Flambda2</code> compiler. We start by giving broader context around tail recursion and tail-recursion optimisation, before diving into how this transformation is both simple and representative of the philosophy behind all the optimisations conducted by the <code>Flambda2</code> compiler.</p> </li> <li> <p><a href="/blog/2024_08_09_the_flambda2_snippets_3">Episode 3: Speculative Inlining</a></p> <p>This article introduces <code>Speculative Inlining</code>, the algorithm responsible for computing and inlining optimised function code inside <code>Flambda2</code>. We cover how quickly we are faced with complex questions that only have heuristic answers when it comes to making an optimal inlining choice. <code>Speculative Inlining</code> is also the best demonstration of how we traverse code in our compilation pipeline.</p> </li> <li> <p>Episode 4: Upward and Downward Traversals</p> <blockquote> <p>Coming soon...</p> </blockquote> </li> </ul> <p>Stay tuned, and thank you for reading!</p> Lean 4: When Sound Programs become a Choice https://ocamlpro.com/blog/2024_03_07_lean4_when_sound_programs_become_a_choice 2024-03-07T13:48:57Z 2024-03-07T13:48:57Z Adrien Champion Dario Pinto Monitoring Edge Technical Endeavours As a company specialized in strongly-typed programming languages with strong static guarantees, OCamlPro closely monitors the ongoing trend of bringing more and more of these elements into mainstream programming languages. Rust is a relatively recent example of t... <h2> <a id="watch" class="anchor"></a><a class="anchor-link" href="#watch">Monitoring Edge Technical Endeavours</a> </h2> <p>As a company specialized in strongly-typed programming languages with strong static guarantees, OCamlPro closely monitors the ongoing trend of bringing more and more of these elements into mainstream programming languages.
Rust is a relatively recent example of this trend; another one is the very recent <a href="https://leanprover-community.github.io/index.html">Lean 4 language</a>.</p> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#watch">Monitoring Edge Technical Endeavours</a> </li> <li><a href="#lean4">Lean 4, the Promising Future of Proven Software</a> </li> <li><a href="#leanpro">Pushing for a Future of Trustworthy Software</a> </div> </li> </ul> <h3> <a id="lean4" class="anchor"></a><a class="anchor-link" href="#lean4">Lean 4, the Promising Future of Proven Software</a> </h3> <p>Lean 4 builds on the shoulders of giants like the Coq proof assistant, and languages such as OCaml and Haskell, to put programmers in a world where they can write elegant programs, express their specification with the full power of modern logics, and prove that these programs are correct with respect to their specification. Doing all this in the same language is crucial, as it can streamline the certification process: once Lean 4 is trusted (audits, certification...), then programs, specifications, and proofs are also trusted. This contrasts with having a programming language, a specification language, and a separate verification/certification tool, and then having to argue about the trustworthiness of each of them, and that the glue linking all of them together makes sense. This is extremely interesting in the context of critical embedded systems in particular, and in qualified/certified &quot;high-trust&quot; development in general.</p> <p>While admittedly not as mainstream as Rust, Lean 4 has recently seen an explosion in interest from the media, developers, mathematicians, and (some) industrial actors. Quanta now <a href="https://www.quantamagazine.org/tag/computer-assisted-proofs">routinely publishes articles about or mentioning Lean 4</a>; Fields medalist Terry Tao is increasingly vocal about (and productive with) his use of Lean 4, see <a href="https://terrytao.wordpress.com/2023/11/18/formalizing-the-proof-of-pfr-in-lean4-using-blueprint-a-short-tour">here</a> and <a href="https://terrytao.wordpress.com/2023/12/05/a-slightly-longer-lean-4-proof-tour">here</a> for (very technical) examples. On the industrial side, Leonardo de Moura (Lean 4's lead designer) recently went from a position at Microsoft Research to Amazon Web Services, which was followed by a fast and still ongoing expansion of the infrastructure around Lean 4.</p> <h3> <a id="leanpro" class="anchor"></a><a class="anchor-link" href="#leanpro">Pushing for a Future of Trustworthy Software</a> </h3> <p>OCamlPro has been closely monitoring Lean 4's progress by regularly developing in-house prototypes in Lean 4. Getting involved in the community and in Lean 4's development effort is also part of our culture. This is to give back to the community, but also to closely follow the evolution of Lean 4 and sharpen our skills.</p> <p>There are a few notable and public examples of our involvement. As part of our in-house prototyping, we discovered a <a href="https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/case.20in.20dependent.20match.20not.20triggering.20.28.3F.29/near/288328239">&quot;major bug&quot; in Lean 4's dependent pattern-matching</a>; later, we contributed to <a href="https://github.com/leanprover/lean4/pull/1811">improving aspects of the by notation</a> (used to construct proofs), which then ricocheted into <a href="https://github.com/leanprover/lean4/pull/1844">fixing problems in the calc tactic</a>.
More recently, we contributed on various fronts such as <a href="https://github.com/leanprover/lean4/issues/2988">improving the ecosystem's ergonomics</a>, <a href="https://github.com/leanprover/std4/pull/233">adding useful lemmas to Lean 4's standard library</a>, <a href="https://github.com/leanprover/lean4/pull/2167">contributing to the documentation effort</a>...</p> <p>Lean 4 is not of industrial-strength yet, but it gets closer and closer. Quickly enough for us to think that now's a reasonable time to spend some time exploring it.</p> Opam 101: The First Steps https://ocamlpro.com/blog/2024_01_23_opam_101_the_first_steps 2024-01-23T13:48:57Z 2024-01-23T13:48:57Z Dario Pinto Raja Boujbel Welcome, dear reader, to a new series of blog posts! This series will be about everything opam. Each article will cover a specific aspect of the package manager, and make sure to dissipate any confusion or misunderstandings on this keystone of the OCaml distribution! Each technical article will be t... <p></p> <p> <div class="figure"> <p> <a href="/blog/assets/img/opam-banniere-e1600868011587.png"> <img alt="Opam is like a magic box that allows people to be tidy when they share their work with the world, thus making the environment stable and predictable for everybody!" src="/blog/assets/img/opam-banniere-e1600868011587.png"/> </a> <div class="caption"> Opam is like a magic box that allows people to be tidy when they share their work with the world, thus making the environment stable and predictable for everybody! </div> </p> </div> </p> <p>Welcome, dear reader, to a new series of blog posts!</p> <p>This series will be about everything <code>opam</code>. Each article will cover a specific aspect of the package manager, and make sure to dissipate any confusion or misunderstandings on this keystone of the OCaml distribution!</p> <p>Each technical article will be tailored for specific levels of engineering -- everyone, be they beginners, intermediate or advanced in the <em>OCaml Arts</em> will find answers to some questions about <code>opam</code> right here.</p> <blockquote> <p>Checkout each article's <code>tags</code> to get an idea of the entry level required for the smoothest read possible!</p> </blockquote> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#onboarding">Walking the path of opam, treading on solid ground</a> </li> <li><a href="#install">First step: installing opam</a> </li> <li><a href="#opaminit">Second step: initialisation</a> </li> <li><a href="#opamenv">Acclimating to the environment</a> </li> <li><a href="#switch">Switches, tailoring your workspace to your vision</a> <ul> <li><a href="#createaswitch">Creating a global switch</a> </li> <li><a href="#switchlocal">Creating a local switch</a> </li> </ul> </li> <li><a href="#opamrepo">The official opam-repository, the safe for all your packages</a> </li> <li><a href="#packages">Installing packages in your current switch</a> </li> <li><a href="#conclusion">Conclusion</a> </div> </li> </ul> <blockquote> <p>New to the expansive OCaml sphere? 
As said on the official opam website, <a href="https://opam.ocaml.org/about.html#A-little-bit-of-History">opam</a> has been a game changer for the OCaml distribution since it first saw the light of day, almost a decade ago.</p> </blockquote> <h2> <a id="onboarding" class="anchor"></a><a class="anchor-link" href="#onboarding">Walking the path of opam, treading on solid ground</a> </h2> <p>We are aware that it can be quite a daunting task to get on-board with the OCaml distribution, be it because of its decentralised nature: a plethora of different tools, a variety of sometimes clashing <em>modi operandi</em> and practices, poorly documented edge use-cases, the many ways to go about setting up a working environment, or many a different reason...</p> <p>We have been thinking about making it easier for everyone, even the more seasoned Cameleers, by releasing a set of blog posts progressively detailing the depths at which <code>opam</code> can go.</p> <p>Be sure to read these articles from the start if you are new to the beautiful world of OCaml and, if you are already familiar, use them as trustworthy documentation kept on speed-dial... You never know when you will have to set up an opam installation while off-the-grid, do you?</p> <p>Are you ready to dive in?</p> <h2> <a id="install" class="anchor"></a><a class="anchor-link" href="#install">First step: installing opam</a> </h2> <p>First, let's talk about installing opam.</p> <blockquote> <p>DISCLAIMER: In this tutorial, we will only be addressing a fresh install of <code>opam</code> on Linux and Mac. For more information about a Windows installation, stay tuned with this blog!</p> </blockquote> <p>One would expect to have to interact with the package manager of one's favourite distribution in order to install <code>opam</code>, and, to some extent, one would be correct. However, we cannot guarantee that the version of opam you have at your disposal through these means is indeed the one expected by this tutorial, and every subsequent one for that matter.</p> <p>You can check that <a href="https://opam.ocaml.org/doc/Distribution.html">here</a>; make sure the version available to you is <code>2.1.5</code> or above.</p> <p>Thus, in order for us to guarantee that we are on the same version, we will use the installation method found <a href="https://opam.ocaml.org/doc/Install.html">here</a> and add an option to specify the version of opam we will be working with from now on.</p> <p>Note that if you <strong>don't</strong> add the <code>--version 2.1.5</code> option to the following command line, the script will download and install the <strong>latest</strong> opam release. The CLI of <code>opam</code> is made to remain consistent between versions so, unless you have a very old version, or if you read this article in the very distant future, you should not have problems even if you do not use the <strong>exact</strong> same version as we do. For the sake of consistency though, I will use this specific version.</p> <pre><code class="language-shell-session">$ bash -c &quot;sh &lt;(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) --version 2.1.5&quot; </code></pre> <p>This script will download the necessary binaries for a proper installation of <code>opam</code>.
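</p> <p>Before going further, a quick sanity check never hurts: make sure the freshly installed binary is reachable from your <code>PATH</code> and reports the version you asked for.</p> <pre><code class="language-shell-session">$ opam --version
2.1.5
</code></pre> <p>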
Once done, you can move on to the nitty gritty of having a working <code>opam</code> environment with <code>opam init</code>.</p> <h2> <a id="opaminit" class="anchor"></a><a class="anchor-link" href="#opaminit">Second step: initialisation</a> </h2> <p>The first command to launch, after the initial <code>opam</code> binaries have been downloaded and <code>opam</code> has been installed on your system, is <code>opam init</code>.</p> <p>This is when you step into the OCaml distribution for the first time.</p> <p><code>opam init</code> does several crucial things for you when you launch it, and the rest of this article will detail what exactly these crucial things are and what they mean:</p> <ul> <li>it checks some required and recommended tools; </li> <li>it syncs with the official OCaml <strong>opam-repository</strong>, which you can find <a href="https://github.com/ocaml/opam-repository">here</a>; </li> <li>it sets up the <strong>opam environment</strong> in your <code>*rc</code> files; </li> <li>it creates a <strong>switch</strong> and installs an <strong>ocaml-compiler</strong> for you; </li> </ul> <p>Lets take a step-by-step look at the output of that command:</p> <pre><code class="language-shell-session">$ opam init No configuration file found, using built-in defaults. Checking for available remotes: rsync and local, git, mercurial, darcs. Perfect! &lt;&gt;&lt;&gt; Fetching repository information &gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; [default] Initialised &lt;&gt;&lt;&gt; Required setup - please read &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; In normal operation, opam only alters files within ~/.opam. However, to best integrate with your system, some environment variables should be set. If you allow it to, this initialisation step will update your bash configuration by adding the following line to ~/.profile: test -r ~/.opam/opam-init/init.sh &amp;&amp; . ~/.opam/opam-init/init.sh &gt; /dev/null 2&gt; /dev/null || true Otherwise, every time you want to access your opam installation, you will need to run: eval $(opam env) You can always re-run this setup with 'opam init' later. Do you want opam to modify ~/.profile? [N/y/f] (default is 'no', use 'f' to choose a different file) y User configuration: Updating ~/.profile. [NOTE] Make sure that ~/.profile is well sourced in your ~/.bashrc. &lt;&gt;&lt;&gt; Creating initial switch 'default' (invariant [&quot;ocaml&quot; {&gt;= &quot;4.05.0&quot;}] - initially with ocaml-base-compiler) &lt;&gt;&lt;&gt; Installing new switch packages &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; Switch invariant: [&quot;ocaml&quot; {&gt;= &quot;4.05.0&quot;}] &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ∗ installed base-bigarray.base ∗ installed base-threads.base ∗ installed base-unix.base ∗ installed ocaml-options-vanilla.1 ⬇ retrieved ocaml-base-compiler.5.1.0 (https://opam.ocaml.org/cache) ∗ installed ocaml-base-compiler.5.1.0 ∗ installed ocaml-config.3 ∗ installed ocaml.5.1.0 ∗ installed base-domains.base ∗ installed base-nnp.base Done. 
</code></pre> <p>The main result of an <code>opam init</code> call is to set up what is called your <code>opam root</code>. It does so by creating a <code>~/.opam</code> directory to operate inside of. <code>opam</code> modifies and writes in this location <strong>only</strong> as a default.</p> <hr /> <p>First, <code>opam</code> checks that there is at least one required tool for syncing to the <code>opam-repository</code>. Then it checks what backends are available in your system. Here all are available: <code>rsync, git, mercurial, and darcs</code>. They will be used to sync repositories or packages.</p> <pre><code class="language-shell-session">$ opam init No configuration file found, using built-in defaults. Checking for available remotes: rsync and local, git, mercurial, darcs. Perfect! </code></pre> <p>Then, <code>opam</code> fetches the default opam repository: <code>opam.ocaml.org</code>.</p> <pre><code class="language-shell-session">&lt;&gt;&lt;&gt; Fetching repository information &gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; [default] Initialised </code></pre> <hr /> <p>Secondly, <code>opam</code> requires your input in order to configure your shell for the smoothest possible experience. For more details about the opam environment, refer to the next section.</p> <blockquote> <p>Something interesting to remember for later is that, in the excerpt below, we grant opam permission to edit the <code>~/.profile</code> file. This is part of the quality-of-life features that make everyday use of an <code>opam</code> environment smoother, and we will detail how below.</p> </blockquote> <pre><code class="language-shell-session">&lt;&gt;&lt;&gt; Required setup - please read &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; In normal operation, opam only alters files within ~/.opam. However, to best integrate with your system, some environment variables should be set. If you allow it to, this initialisation step will update your bash configuration by adding the following line to ~/.profile: test -r ~/.opam/opam-init/init.sh &amp;&amp; . ~/.opam/opam-init/init.sh &gt; /dev/null 2&gt; /dev/null || true Otherwise, every time you want to access your opam installation, you will need to run: eval $(opam env) You can always re-run this setup with 'opam init' later. Do you want opam to modify ~/.profile? [N/y/f] (default is 'no', use 'f' to choose a different file) y User configuration: Updating ~/.profile. [NOTE] Make sure that ~/.profile is well sourced in your ~/.bashrc.
</code></pre> <hr /> <p>The next action is the installation of your very first <code>switch</code> alongside a version of the OCaml compiler, by default a compiler &gt;= <code>4.05.0</code> to be exact.</p> <p>For more information about what is a <code>switch</code> be sure to read <a href="#switch">the rest of the article</a>.</p> <pre><code class="language-shell-session">&lt;&gt;&lt;&gt; Creating initial switch 'default' (invariant [&quot;ocaml&quot; {&gt;= &quot;4.05.0&quot;}] - initially with ocaml-base-compiler) &lt;&gt;&lt;&gt; Installing new switch packages &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; Switch invariant: [&quot;ocaml&quot; {&gt;= &quot;4.05.0&quot;}] &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ∗ installed base-bigarray.base ∗ installed base-threads.base ∗ installed base-unix.base ∗ installed ocaml-options-vanilla.1 ⬇ retrieved ocaml-base-compiler.5.1.0 (https://opam.ocaml.org/cache) ∗ installed ocaml-base-compiler.5.1.0 ∗ installed ocaml-config.3 ∗ installed ocaml.5.1.0 ∗ installed base-domains.base ∗ installed base-nnp.base Done. </code></pre> <p><strong>Great! So let's focus on the actions performed by the <code>opam init</code> call!</strong></p> <h2> <a id="opamenv" class="anchor"></a><a class="anchor-link" href="#opamenv">Acclimating to the environment</a> </h2> <p>Well, as said previously, the first action was to setup an <code>opam root</code> in your <code>$HOME</code> directory, (i.e: <code>~/.opam</code>). This is where <code>opam</code> will operate. <code>opam</code> will never modify other locations in your filesystem without notifying you first.</p> <p>An <code>opam</code> root is made to resemble a linux-like architecture. You will find inside it directories such as <code>/usr</code>, <code>/etc</code>, <code>/bin</code> and so on. This is by default where <code>opam</code> will store everything relative to your system-wide installation. Config files, packages and their configurations, and also binaries.</p> <p>This leads us to the need for an <code>eval $(opam env)</code> call.</p> <p>Indeed, in order to make your binaries and such accessible as system-wide tools, you need to update all the relevant environment variables (<code>PATH</code>, <code>MANPATH</code>, etc.) with all the locations for all of your everyday OCaml tools.</p> <p>To see what variables are exported when evaluating the <code>opam env</code> command, you can check the following codeblock:</p> <pre><code class="language-shell-session">$ opam env OPAM_SWITCH_PREFIX='~/.opam/default'; export OPAM_SWITCH_PREFIX; CAML_LD_LIBRARY_PATH='~/.opam/default/lib/stublibs:~/.opam/default/lib/ocaml/stublibs:~/.opam/default/lib/ocaml'; export CAML_LD_LIBRARY_PATH; OCAML_TOPLEVEL_PATH='~/.opam/default/lib/toplevel'; export OCAML_TOPLEVEL_PATH; MANPATH=':~/.opam/default/man'; export MANPATH; PATH='~/.opam/default/bin:$PATH'; export PATH; </code></pre> <p>Remember when we granted <code>opam init</code> with the permission to edit the <code>~/.profile</code> file, earlier in this tutorial ? That comes in handy now: it keeps us from having to use the <code>eval $(opam env)</code> more than necessary.</p> <p>Indeed, you would otherwise have to call it every time you launch a new shell among other things. 
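</p> <p>Concretely, in a shell where the hook is not set up yet, that manual step looks like the following (the exact paths depend on your machine and on the current switch):</p> <pre><code class="language-shell-session"># Before: the switch's binaries are not visible in this shell yet
$ eval $(opam env)
# After: the current switch's bin directory is now at the front of PATH
$ which ocaml
~/.opam/default/bin/ocaml
</code></pre>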
<p>What it does instead is add a hook at the prompt level that keeps the <code>opam</code> environment synced, updating it every time the user presses <code>Enter</code>. Very handy indeed.</p> <h2> <a id="switch" class="anchor"></a><a class="anchor-link" href="#switch">Switches, tailoring your workspace to your vision</a> </h2> <p>The second task accomplished by <code>opam init</code> was installing the first <code>switch</code> inside your fresh installation.</p> <p>A <code>switch</code> is one of opam's core operational concepts. Its definition can vary depending on your exact use-case, but in the case of OCaml, a <code>switch</code> is a <strong>named pair</strong>:</p> <ul> <li>an arbitrary version of the OCaml compiler </li> <li>a list of packages available for that specific version of the compiler. </li> </ul> <p>In our example, we see that the only packages installed in the process were the dependencies for the OCaml compiler version <code>5.1.0</code> inside the <code>switch</code> named <code>default</code>.</p> <pre><code class="language-shell-session">&lt;&gt;&lt;&gt; Creating initial switch 'default' (invariant [&quot;ocaml&quot; {&gt;= &quot;4.05.0&quot;}] - initially with ocaml-base-compiler) &lt;&gt;&lt;&gt; Installing new switch packages &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; Switch invariant: [&quot;ocaml&quot; {&gt;= &quot;4.05.0&quot;}] &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ∗ installed base-bigarray.base ∗ installed base-threads.base ∗ installed base-unix.base ∗ installed ocaml-options-vanilla.1 ⬇ retrieved ocaml-base-compiler.5.1.0 (https://opam.ocaml.org/cache) ∗ installed ocaml-base-compiler.5.1.0 ∗ installed ocaml-config.3 ∗ installed ocaml.5.1.0 ∗ installed base-domains.base ∗ installed base-nnp.base Done. </code></pre> <p>You can create an arbitrary number of parallel <code>switches</code> in opam. This allows users to manage parallel, independent OCaml environments for their developments.</p> <p>There are two types of <code>switches</code>:</p> <ul> <li><code>global switches</code> have their packages, binaries and tools available anywhere on your computer. They are useful when you consider a given <code>switch</code> to be your default and most adequate environment for your everyday use of <code>opam</code> and OCaml. </li> <li><code>local switches</code>, on the other hand, are only available in a given directory. Their packages and binaries are local to that <strong>specific</strong> directory. This allows users to give specific projects their own self-contained working environments. The local switch is automatically selected by <code>opam</code> as the current one when you are located inside the appropriate directory. More details on local switches below.
</li> </ul> <p>The default behaviour for <code>opam</code> when creating a <code>switch</code> at init-time is to make it global and name it <code>default</code>.</p> <pre><code class="language-shell-session">$ opam switch show default $ opam switch # switch compiler description → default ocaml.5.1.0 default </code></pre> <p>Now that you have a general understanding of what exactly a <code>switch</code> is and how it is used, let's get into how you can go about manually creating your first <code>switch</code>.</p> <h3> <a id="createaswitch" class="anchor"></a><a class="anchor-link" href="#createaswitch">Creating a global switch</a> </h3> <blockquote> <p>NB: Remember that <code>opam</code>'s command-line interface is beginner friendly. You can, at any point of your exploration, use the <code>--help</code> option to have every command and subcommand explained. You may also check out the <a href="https://ocamlpro.github.io/ocaml-cheat-sheets/ocaml-opam.pdf">opam cheat-sheet</a> that was released a while ago and might still hold some precious insights on opam's CLI.</p> </blockquote> <p>So how does one create a <code>switch</code>? The short answer is bafflingly straightforward:</p> <pre><code class="language-shell-session"># Installs a switch named &quot;my-switch&quot; based on an OCaml compiler version &gt;= 4.05.0 # Here 4.05.0 is the default lower bound opam selects when left unspecified $ opam switch create my-switch </code></pre> <p>Easy, right? Now let's imagine that you would like to specify a <strong>later</strong> version of the OCaml compiler. The first thing you would want to know is which versions are available for you to specify, and you can use <code>opam list</code> for that.</p> <p>Other commands can be used to the same effect, but we prefer introducing you to this specific one as it may also be used for any other package available via <code>opam</code>.</p> <p>So, for <code>ocaml</code> just as for any other package, <code>opam list</code> will give you all the available versions of that package for your currently active <code>switch</code>. Since we don't yet have an OCaml compiler installed, it will list all of them so that we may pick and choose our favourite to use for the <code>switch</code> we are making.</p> <pre><code class="language-shell-session">$ opam list ocaml # Packages matching: name-match(ocaml) &amp; (installed | available) # Package # Installed # Synopsis ocaml.3.07 -- The OCaml compiler (virtual package) ocaml.3.07+1 -- The OCaml compiler (virtual package) ocaml.3.07+2 -- The OCaml compiler (virtual package) ocaml.3.08.0 -- The OCaml compiler (virtual package) (...) ocaml.4.13.1 -- The OCaml compiler (virtual package) ocaml.4.13.2 -- The OCaml compiler (virtual package) (...) ocaml.5.2.0 -- The OCaml compiler (virtual package) </code></pre> <p>Let's use it for a switch:</p> <pre><code class="language-shell-session"># Installs a switch named &quot;my-switch&quot; based on OCaml compiler version 4.13.1 $ opam switch create my-switch ocaml.4.13.1 </code></pre> <p>That's it! For the first time, you have manually created your own <code>global switch</code> tailored to your specific needs, congratulations!</p> <blockquote> <p>NB: Creating a switch can be a fairly time-consuming task depending on whether or not the compiler version you have queried from <code>opam</code> is already installed on your machine, typically in a previously created <code>switch</code>.
Every time you ask <code>opam</code> to install a version of the compiler, it will first scour your installation for a locally available version of that compiler to save you the time necessary for downloading, compiling and installing a brand new one.</p> </blockquote> <p>Now, onto <code>local switches</code>.</p> <h3> <a id="switchlocal" class="anchor"></a><a class="anchor-link" href="#switchlocal">Creating a local switch</a> </h3> <p>As said previously, the use of a <code>local switch</code> is to constrain a specific OCaml environment to a specific location on your workstation.</p> <p>Let's imagine you are about to start a new development called <code>my-project</code>.</p> <p>While preparing all necessary pre-requisites for it, you notice something problematic: your global <code>default</code> switch is drastically incompatible with the dependencies of your project. In this imaginary situation, you have a <code>default</code> global switch that is useful for most of your other tasks but now have only one project that differs from your usual usage of OCaml.</p> <p>To remedy this situation, you could go about creating another global switch for your upcoming dev requirements on <code>my-project</code> and proceed to install all relevant packages and remake a full <code>switch</code> from scratch for that specific project. However this would require you to always keep track of which one is your currently active <code>switch</code>, while possibly having to regularly oscillate between your global <code>default</code> switch and your alternative global <code>my-project</code> switch which you could understandably find to be suboptimal and tedious to incorporate to your workflow on the long run.</p> <p>That's when <code>local switches</code> come in handy because they allow you to leave the rest of your OCaml dev environment unaffected by whatever out-of-bounds or specific workload you're undertaking. Additionally, the fact that <code>opam</code> automatically selects your <code>local switch</code> as your current active one as soon as you step inside the relevant directory makes the developers's context switch seemless.</p> <p>Let's examine how you can create such a <code>switch</code>:</p> <pre><code class="language-shell-session"># Hop inside the directory of your project $ cd my-project # We consider your project already has an opam file describing only # its main dependency: ocaml.4.14.1 $ opam switch create . &lt;&gt;&lt;&gt; Installing new switch packages &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; Switch invariant: [&quot;ocaml&quot; {&gt;= &quot;4.05.0&quot;}] &lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt; ∗ installed base-bigarray.base ∗ installed base-threads.base ∗ installed base-unix.base ∗ installed ocaml-system.4.14.1 ∗ installed ocaml-config.2 ∗ installed ocaml.4.14.1 Done. $ opam switch # switch compiler description → /home/ocp/my-project ocaml.4.14.1 /home/ocp/my-project default ocaml.5.1.0 default my-switch ocaml.4.13.1 my-switch [NOTE] Current switch has been selected based on the current directory. The current global system switch is default. 
</code></pre> <p>Here it is: you can now hop into your local switch <code>/home/ocp/my-project</code> whenever you have time to deviate from your global environment.</p> <h2> <a id="opamrepo" class="anchor"></a><a class="anchor-link" href="#opamrepo">The official opam-repository, the safe for all your packages</a> </h2> <p>Among all the things that <code>opam init</code> did when it was executed, there is still one detail we have yet to explain, and that's the first action of the process: retrieving package specifications from the official OCaml <code>opam-repository</code>.</p> <p>Explaining what exactly an <code>opam-repository</code> is requires a slightly deeper understanding of how <code>opam</code> works than this article assumes, so you will have to wait for us to go deeper into that subject in another blog post when the time is ripe.</p> <p>What we <strong>will</strong> do now though is explain what the official OCaml <code>opam-repository</code> is and how it relates to our use of <code>opam</code> in this blog post.</p> <p><a href="https://github.com/ocaml/opam-repository">The Official OCaml opam-repository</a> is an open-source project where all the released software of the OCaml distribution is <strong>referenced</strong>. It holds different compilers, basic tools, thousands of libraries, approximately 4500 packages in total as of today, and is configured to be the default repository for <code>opam</code> to sync to. You may add your own repositories for your own use of <code>opam</code>, but again, that's a subject for another time.</p> <p>In case the repository itself is not what you are looking for, know that all packages available throughout the entire OCaml distribution may be browsed directly on <a href="https://ocaml.org/packages">ocaml.org</a>.</p> <p>It is essentially a collection of <code>opam packages</code> described in <code>opam file</code> format. Check out <a href="https://opam.ocaml.org/doc/Manual.html#opam">the manual</a> for more information about the <code>opam file</code> format.</p> <p>In short, an <code>opam package</code> file holds all the information necessary for <code>opam</code> to operate and to provide the package.
The file lists all of the package's direct dependencies, where to find its source code, the names and emails of maintainers and authors, the checksums for each released archive, and the list goes on.</p> <p>Here's a quick example for you to have an idea of what it looks like:</p> <pre><code class="language-shell-session">opam-version: &quot;2.0&quot; synopsis: &quot;OCaml bindings to Zulip API&quot; maintainer: [&quot;Dario Pinto &lt;dario.pinto@ocamlpro.com&gt;&quot;] authors: [&quot;Mohamed Hernouf &lt;mohamed.hernouf@ocamlpro.com&gt;&quot;] license: &quot;LGPL-2.1-only WITH OCaml-LGPL-linking-exception&quot; homepage: &quot;https://github.com/OCamlPro/ozulip&quot; doc: &quot;https://ocamlpro.github.io/ozulip&quot; bug-reports: &quot;https://github.com/OCamlPro/ozulip/issues&quot; dev-repo: &quot;git+https://github.com/OCamlPro/ozulip.git&quot; tags: [&quot;zulip&quot; &quot;bindings&quot; &quot;api&quot;] depends: [ &quot;ocaml&quot; {&gt;= &quot;4.10&quot;} &quot;dune&quot; {&gt;= &quot;2.0&quot;} &quot;ez_api&quot; {&gt;= &quot;2.0.0&quot;} &quot;re&quot; &quot;base64&quot; &quot;json-data-encoding&quot; {&gt;= &quot;1.0.0&quot;} &quot;logs&quot; &quot;lwt&quot; {&gt;= &quot;5.4.0&quot;} &quot;ez_file&quot; {&gt;= &quot;0.3.0&quot;} &quot;cohttp-lwt-unix&quot; &quot;yojson&quot; &quot;logs&quot; ] build: [ &quot;dune&quot; &quot;build&quot; &quot;-p&quot; name &quot;-j&quot; jobs &quot;@install&quot; ] url { src: &quot;https://github.com/OCamlPro/ozulip/archive/refs/tags/0.1.tar.gz&quot; checksum: [ &quot;md5=4173fefee440773dd0f8d7db5a2e01e5&quot; &quot;sha512=cb53870eb8d41f53cf6de636d060fe1eee6c39f7c812eacb803b33f9998242bfb12798d4922e7633aa3035cf2ab98018987b380fb3f380f80d7270e56359c5d8&quot; ] } </code></pre> <p>Okay so now, how do we go about populating a <code>switch</code> with packages and really get started?</p> <h2> <a id="packages" class="anchor"></a><a class="anchor-link" href="#packages">Installing packages in your current switch</a> </h2> <p>It's elementary. This simple command will do the trick of <em>trying</em> to install a package, <strong>and its dependencies</strong>, in your currently active <code>switch</code>.</p> <pre><code class="language-shell-session">$ opam install my-package </code></pre> <p>I say <em>trying</em> because <code>opam</code> will notify you whether the package version you are querying, and its dependencies, are compatible with the current state of your <code>switch</code>. It will also offer you solutions that make the compatibility constraints between packages satisfiable: it may suggest upgrading some of your packages, or even removing them entirely.</p> <p>The key thing about this process is that <code>opam</code> is designed to solve compatibility constraints in the global graph of dependencies that the OCaml packages form. This design is what makes <code>opam</code> the average Cameleer's best friend.</p>
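<p>If you would rather see what the solver intends to do before letting it modify your <code>switch</code>, a dry run comes in handy (the package name below is just a placeholder):</p> <pre><code class="language-shell-session"># Simulate the installation without performing any change
$ opam install --dry-run my-package
</code></pre>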
<p>It will highlight inconsistencies within dependencies, and it will figure out a way for your specific query to be satisfied, saving you <strong>a lot</strong> of head-scratching, that is, if you are willing to get used to it a little.</p> <p>The next command allows you to uninstall a package from your currently active <code>switch</code> <strong>as well as</strong> the packages that depend on it:</p> <pre><code class="language-shell-session">$ opam remove my-package </code></pre> <p>And the two following commands will <code>update</code> the state of the repositories <code>opam</code> is synchronised with, and <code>upgrade</code> the installed packages while <strong>always</strong> keeping package compatibility in mind.</p> <pre><code class="language-shell-session">$ opam update $ opam upgrade </code></pre> <h2> <a id="conclusion" class="anchor"></a><a class="anchor-link" href="#conclusion">Conclusion</a> </h2> <p>Here it is: you should now be knowledgeable enough about <code>opam</code> to jump right into your discovery of OCaml!</p> <p>Today we learned everything elementary about <code>opam</code>: from installation to initialisation, along with explanations of the core concepts of the <code>opam</code> environment, <code>switches</code>, packages and the Official OCaml <code>opam-repository</code>.</p> <p>Be sure to stay tuned with our blog: the journey into the rabbit hole has only started, and <code>opam</code> is a deep one indeed!</p> <hr /> <p>Thank you for reading,</p> <p>From 2011, with love,</p> <p>The OCamlPro Team</p> Maturing Learn-OCaml to version 1.0: Gateway to the OCaml World https://ocamlpro.com/blog/2023_12_13_learn_ocaml_gateway_to_the_ocaml_world 2023-12-13T13:48:57Z 2023-12-13T13:48:57Z Dario Pinto From the very start OCamlPro has been trying to help ease the learning of the OCaml language. OCaml has been used around the world to teach about a variety of Computer Science domains, from algorithmics to calculus, or functional programming and compilation. The language had been long taught in Acade... <p></p> <p> <div class="figure"> <p> <a href="/blog/assets/img/dalle_camel_on_the_road.png"> <img alt="Camels are known to be able to walk long distances. They have adapted to an inhospitable environment and help Humanity daily." src="/blog/assets/img/dalle_camel_on_the_road.png"/> </a> <div class="caption"> Camels are known to be able to walk long distances. They have adapted to an inhospitable environment and help Humanity daily. </div> </p> </div> </p> <p>From the very start OCamlPro has been trying to help ease the learning of the OCaml language. OCaml has been used around the world to teach about a variety of Computer Science domains, from algorithmics to calculus, or functional programming and compilation.</p> <p>The language had long been taught in Academia when initiatives arose to offer simple web tools to write and compile OCaml code directly in a web browser. We launched the <a href="https://try.ocaml.pro/">TryOCaml</a> web editor for OCaml all the way back in 2012.
We were then appointed in 2015 by Roberto Di Cosmo from the French University Paris-Diderot, to create the <a href="https://www.fun-mooc.fr/fr/cours/introduction-to-functional-programming-in-ocaml/">OCaml FUN MOOC</a> platform - and helped write the exercises used as pedagogical resources for the <code>Introduction to Functional Programming</code>.</p> <p>That is how the <a href="https://github.com/ocaml-sf/learn-ocaml">Learn-OCaml open source learning platform</a> was born, created then maintained at OCamlPro until 2018. Its steering was then transferred to the <a href="http://ocaml-sf.org/actions/">OCaml Software Foundation</a> in 2019 and the project steadily grew into a fully fledged tool used by teachers and students around the world to this day.</p> <p><em>Kudos to all OCaml teachers around the world, and to the LearnOCaml team, shepherded by Louis Gesbert</em></p> <h2> <a id="loc" class="anchor"></a><a class="anchor-link" href="#loc">Learn-OCaml v1.0</a> </h2> <blockquote> <p><strong>What is Learn-OCaml today?</strong></p> <p><a href="https://github.com/ocaml-sf/learn-ocaml">Learn-OCaml</a> is a web platform for orchestrating exercises for OCaml programming, with automated grading. The interface features a code editor and client-side evaluation and grading; it can be served statically, but if running the bundled server there are also server-side saves, facilities for teachers to follow the progress of students, give assignments, get grades, etc.</p> <p>We are thrilled to announce that the steady work that has been accomplished over the years on <code>Learn-OCaml</code> is finally bearing its fruits in the form of a long-awaited soon-to-be-released <code>v1.0</code>!</p> </blockquote> <hr /> <p>For all details relative to the upcoming <code>1.0</code> release, do refer to <a href="https://discuss.ocaml.org/t/learn-ocaml-1-0-approaching-call-for-testers/13621/1">Louis' post on OCaml Discuss</a>.</p> <p>For all historical intents and purposes, do refer to the <a href="https://files.ocamlpro.com/uploads/ocaml-2016-learn-ocaml.pdf">original 2016 OCaml Workshop paper</a> on Learn-OCaml which kickstarted a long stream of updates and improvements to the platform and its <a href="https://github.com/ocaml-sf/learn-ocaml-corpus">public corpus exercices</a>.</p> <p><strong>The maintenance and development work on the platform is now funded by the OCaml Software Foundation.</strong></p> The latest release of Alt-Ergo version 2.5.1 is out, with improved SMT-LIB and bitvector support! https://ocamlpro.com/blog/2023_09_18_release_of_alt_ergo_2_5_1 2023-09-18T13:48:57Z 2023-09-18T13:48:57Z Pierre Villemot We are happy to announce a new release of Alt‑Ergo (version 2.5.1). Alt-Ergo is a cutting-edge automated prover designed specifically for mathematical formulas, with a primary focus on advancing program verification. This powerful tool is instrumental in the arsenal of static analysis solutions su... 
<p> <div class="figure"> <p> <a href="/blog/assets/img/ae-251-is-out.png"> <img alt="Alt‑Ergo: An Automated SMT Solver for Program Verification" src="/blog/assets/img/ae-251-is-out.png"/> </a> <div class="caption"> Alt‑Ergo: An Automated SMT Solver for Program Verification </div> </p> </div> </p> <p><strong>We are happy to announce a new release of Alt‑Ergo (version 2.5.1).</strong></p> <blockquote> <p>Alt-Ergo is a cutting-edge automated prover designed specifically for mathematical formulas, with a primary focus on advancing program verification.</p> <p>This powerful tool is instrumental in the arsenal of static analysis solutions such as Trust-In-Soft Analyzer and Frama-C. It accompanies other major solvers like CVC5 and Z3, and is part of the solvers used behind Why3, a platform renowned for deductive program verification.</p> <p><strong>Find out more about Alt‑Ergo and how to join the Alt-Ergo Users' Club <a href="https://alt-ergo.ocamlpro.com/#about">here</a>!</strong></p> </blockquote> <p>This release includes the following new features and improvements:</p> <ul> <li>support for bit-vectors in the SMT-LIB format; </li> <li>new SMT-LIB parser and typechecker; </li> <li>improved bit-vector reasoning; </li> <li>partial support for SMT-LIB commands <code>set-option</code> and <code>get-model</code>; </li> <li>simplified options to enable floating-point arithmetic theory; </li> <li>various bug fixes. </li> </ul> <h3>Update for bug fixes</h3> <p>Since writing this blog post, we have released Alt-Ergo version 2.5.2 which fixes an incorrect implementation of the <code>(distinct)</code> SMT-LIB operator when applied to more than two arguments, and a (rare) crash in model generation. We strongly advise users interested in SMT-LIB or model generation support upgrade to version 2.5.2 on OPAM.</p> <h2>Better SMT-LIB Support</h2> <p>This release includes a better support of the <a href="https://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.6-r2021-05-12.pdf">SMT-LIB standard v2.6</a>. More precisely, the release contains:</p> <ul> <li>built-in primitives for the <a href="https://smtlib.cs.uiowa.edu/theories-FixedSizeBitVectors.shtml">FixedSizeBitVectors</a>; </li> <li><a href="https://smtlib.cs.uiowa.edu/theories-Reals_Ints.shtml">Reals_Ints</a> theories and the <a href="https://smtlib.cs.uiowa.edu/logics-all.shtml#QF_BV">QF_BV</a> logic; </li> <li>new fully-featured parsers and type-checkers for SMT-LIB and native Alt-Ergo languages; </li> <li>specific and meaningful messages for syntax and typing errors. </li> </ul> <p>These features are powered by the <a href="https://github.com/Gbury/dolmen">Dolmen Library</a> through a new frontend alongside the legacy one. Dolmen, developed by our own Guillaume Bury, is also used by the SMT community to check the conformity of the <a href="https://smtlib.cs.uiowa.edu/benchmarks.shtml">SMT-LIB benchmarks</a>.</p> <p><strong>Important</strong>: In this release, the legacy frontend is still the default. If you want to enable the new Dolmen frontend, use the option <code>--frontend dolmen</code>. 
<p>We encourage you to try it and report any bugs on our <a href="https://github.com/OCamlPro/alt-ergo/issues">issue tracker</a>.</p> <p><strong>Note</strong>: We plan to deprecate the legacy frontend and make Dolmen the default frontend in version <code>2.6.0</code>, and to fully remove the legacy frontend in version <code>2.7.0</code>.</p> <h3>Support For Bit-Vector Primitives</h3> <p>Alt-Ergo has had support for bit-vectors in its native language for a long time, but bit-vectors were not supported by the old SMT-LIB parser, and hence not available in the SMT-LIB format. This has changed with the new Dolmen frontend, and support for bit-vectors in the SMT-LIB format is now available starting with Alt-Ergo 2.5.1!</p> <p>The SMT-LIB theories for bit-vectors, <code>BV</code> and <code>QF_BV</code>, have more primitives than those previously available in Alt-Ergo. Alt-Ergo 2.5.1 supports all the primitives of the <code>BV</code> and <code>QF_BV</code> theories when using the Dolmen frontend. Alt-Ergo's reasoning capabilities on the new primitives are limited, and will be gradually improved in future releases.</p> <h3>Built-in Primitives For Mixed Integer And Real Problems</h3> <p>In this release, we add support for the primitives <code>to_real</code>, <code>to_int</code> and <code>is_int</code> of the SMT-LIB theory <a href="https://smtlib.cs.uiowa.edu/theories-Reals_Ints.shtml">Reals_Ints</a>. Notice that this support is only available through the Dolmen frontend.</p> <h3>Example</h3> <p>For instance, the input file <code>input.smt2</code>:</p> <pre><code class="language-shell-session">(set-logic ALL) (declare-const x Int) (declare-const y Real) (declare-fun f (Int Int) Int) (assert (= (f x y) 0)) (check-sat) </code></pre> <p>with the command:</p> <pre><code class="language-shell-session">alt-ergo --frontend dolmen input.smt2 </code></pre> <p>produces the limpid error message:</p> <pre><code class="language-shell-session">File &quot;input.smt2&quot;, line 5, character 11-18: 5 | (assert (= (f x y) 0)) ^^^^^^^ Error The term: `y` has type `real` but was expected to be of type `int` </code></pre> <h2>Model Generation</h2> <p>Generating models (also known as counterexamples) is highly appreciated by users of SMT solvers. Indeed, most built-in theories in common SMT solvers are incomplete. As a consequence, solvers can fail to discharge goals and, without models, the SMT solver behaves as a black box, outputting laconic answers: <code>sat</code>, <code>unsat</code> or <code>unknown</code>.</p> <p>Providing best-effort counterexamples helps developers understand why the solver failed to validate goals. If the goal isn't valid, the solver should, as much as it can, output a correct counterexample that helps users fix their specifications. If the goal is actually valid, the generated model is wrong, but it can still help the SMT solver's maintainers understand why their solver didn't manage to discharge the goal.</p> <p>Model generation for the <code>LIA</code> and <code>enum</code> theories is available in Alt-Ergo. The feature for other theories is either in the testing phase or still being implemented. If you run into wrong models, please report them on our <a href="https://github.com/OCamlPro/alt-ergo/issues">GitHub repository</a>.</p> <h3>Usage</h3> <p>The present release provides more convenient ways to invoke model generation.</p>
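<p>As a rough sketch, a toy problem requesting a model in pure SMT-LIB syntax looks like the following; depending on your configuration, you may also need to enable model generation explicitly, as described in the documentation linked below:</p> <pre><code class="language-shell-session">$ cat model_example.smt2
(set-logic ALL)
(set-option :produce-models true)
(declare-const x Int)
(declare-const y Int)
(assert (and (&lt; 0 x) (&lt; x y) (&lt; y 10)))
(check-sat)
(get-model)
$ alt-ergo --frontend dolmen model_example.smt2
</code></pre>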
<p>Note that model invocation has changed since the post <a href="https://ocamlpro.com/blog/2022_11_16_alt-ergo-models/">Alt-Ergo: the SMT solver with model generation</a>, which covered model generation on the <code>next</code> development branch.</p> <p>Check out the <a href="https://ocamlpro.github.io/alt-ergo/Usage/index.html#generating-models">documentation</a> for more details.</p> <h2>Floating Point Support</h2> <p>In version 2.5.1, the options to enable support for unbounded floating-point arithmetic have been simplified. The options <code>--use-fpa</code> and <code>--prelude fpa-theory-2019-10-08-19h00.ae</code> are gone: floating-point arithmetic is now treated as a built-in theory and can be enabled with <code>--enable-theories fpa</code>. We plan on enabling support for the FPA theory by default in a future release.</p> <h3>Usage</h3> <p>To turn on the <code>fpa</code> theory, use the new option <code>--enable-theories fpa</code> as follows:</p> <pre><code class="language-shell-session">alt-ergo --enable-theories fpa input.smt2 </code></pre> <h2>About Alt-Ergo 2.5.0</h2> <p>Version 2.5.0 should not be used, as it contains a soundness bug with the new <code>bvnot</code> primitive that slipped through the cracks. The bug was found immediately after the release, and version 2.5.1 was released with a fix.</p> <h2>Acknowledgements</h2> <p>We thank members of the <a href="https://alt-ergo.ocamlpro.com/#club">Alt-Ergo Users' Club</a>: Thales, Trust-in-Soft, AdaCore, MERCE and the CEA.</p> <p>We especially thank David Mentré and Denis Cousineau at Mitsubishi Electric R&amp;D Center Europe for funding the initial work on model generation. Note that MERCE has been a member of the Alt-Ergo Users' Club for three years. This partnership allowed Alt-Ergo to evolve, and we hope that more users will join the Club on our journey to make Alt-Ergo a must-have tool.</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/clubAE.png"> <img alt="The dedicated members of our Alt-Ergo Club!" src="/blog/assets/img/clubAE.png"/> </a> <div class="caption"> The dedicated members of our Alt-Ergo Club! </div> </p> </div> </p> 2022 at OCamlPro https://ocamlpro.com/blog/2023_06_30_2022_at_ocamlpro 2023-06-30T13:48:57Z 2023-06-30T13:48:57Z Dario Pinto OCamlPro For 12 years now, OCamlPro has been empowering a large range of customers, allowing them to harness state-of-the-art technologies and languages like OCaml and Rust. Our not-so-small-anymore company steadily grew into a team of highly-skilled and passionate engineers, experts in Computer Science, fro... <p> <div class="figure"> <p> <a href="/blog/assets/img/ocp_beach_2023.png"> <img alt="Clear skies on OCamlPro's way of life." src="/blog/assets/img/ocp_beach_2023.png"/> </a> <div class="caption"> Clear skies on OCamlPro's way of life. </div> </p> </div> </p> <p>For 12 years now, OCamlPro has been empowering a large range of customers, allowing them to harness state-of-the-art technologies and languages like OCaml and Rust.
Our not-so-small-anymore company steadily grew into a team of highly-skilled and passionate engineers, experts in Computer Science, from Compilation and Software Analysis to Domain Specific Languages design and Formal Methods.</p> <p>In this article, as every year (see <a href="https://ocamlpro.com/blog/2022_01_31_2021_at_ocamlpro/">last year's post</a>) - albeit later than we do usually, we review some of the work we did during 2022, in many different worlds as shows the wide range of the missions we achieved.</p> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <p><a href="#people">Newcomers at OCamlPro</a></p> <p><a href="#apps">Modernizing Core Parts of Real Life Applications</a></p> <ul> <li><a href="#mlang">MLANG, keystone of the French citizens' Income Tax Calculation</a> </li> <li><a href="#cobol">Contributing to GnuCOBOL, the Free Open-Source COBOL Alternative</a> </li> </ul> <p><a href="#rust">Rust Expertise and Developments</a></p> <ul> <li><a href="#ecore">Ecore, a heart of Rust for EMF</a> </li> <li><a href="#osource">Open-Source Rust Contributions</a> <ul> <li><a href="#lean4">Contributions to Lean4 Language</a> </li> <li><a href="#matla">Matla, TLA+ Projects Manager</a> </li> <li><a href="#agnos">Agnos, for Let's Encrypt Wildcard Certificates</a> </li> </ul> </li> </ul> <p><a href="#wasm">The WebAssembly Garbage Collection Working-Group</a></p> <p><a href="#formal-methods">Tooling for Formal Methods</a></p> <ul> <li><a href="#prover">The Alt-Ergo Theorem Prover</a> <ul> <li><a href="#club">The Alt-Ergo Users' Club</a> </li> <li><a href="#alt-ergo">Developing Alt-Ergo</a> </li> </ul> </li> <li><a href="#dolmen">Dolmen Library for Automated Deduction Languages</a> </li> </ul> <p><a href="#ocaml">Contributions to OCaml</a></p> <ul> <li><a href="#opam">About opam, the OCaml Package Manager</a> </li> <li><a href="#flambda">The Flambda2 Optimizing Compiler</a> </li> </ul> <p><a href="#meetups">Organizing OCaml Meetups</a></p> <ul> <li><a href="#oups">OCaml Users in PariS (OUPS)</a> </li> <li><a href="#octo">OCaml Meet-Up in Toulouse</a> </li> </ul> <p><a href="#confs">Participation to External Events</a></p> <ul> <li><a href="#ocamlworkshop">The OCaml Workshop 2022 - ICFP Ljubljana</a> </li> <li><a href="#jfla2022">Journées Francophones Langages Applicatif 2022</a> </li> </ul> <p></div></p> <h2> <a id="people" class="anchor"></a><a class="anchor-link" href="#people">Newcomers at OCamlPro</a> </h2> <p>OCamlPro is not just a R&amp;D software company, we like to think about it more as a team of people who like to work together. So, we are proud to introduce you the incredible human beings that joined us in 2022:</p> <ul> <li> <p><em>Pierre Villemot</em> joined us in June. After three years of research at the Weizmann Institute on transcendental measures in Arithmetical Geometry, he was recruited and became the main maintainer of the Alt-Ergo Theorem Prover.</p> </li> <li> <p><em>Milàn Martos</em> joined us in July. He studied Chemistry and Computer Science at ENS, and he holds an MBA. He joined the Team as a Presales Engineer and as a Junior OCaml Web Developer.</p> </li> <li> <p><em>Nathanaëlle Courant</em> joined us in September. She holds a Master's degree from École Normale Supérieure in Paris, and is finishing her Ph.D. on efficient and verified reduction and convertibility tests for theorem provers. She joined OCamlPro in 2022 and works on the OCaml optimizer, in the Flambda team.</p> </li> <li> <p><em>Arthur Carcano</em> also joined us in September. 
Arthur is a Rust developer interested in performance optimization, software design, and crafting powerful and user-friendly tools. After completing his M.Sc. in Computer Science at ENS Ulm, he obtained a Ph.D. in Mathematics and Computer Science from Université de Paris.</p> </li> <li> <p><em>Emilien Lemaire</em> joined us in December 2022. After an internship on typechecking COBOL statements, he will be working with our COBOL team on creating a studio of modern tools for COBOL.</p> </li> </ul> <h2> <a id="apps" class="anchor"></a><a class="anchor-link" href="#apps">Modernizing Core Parts of Real Life Applications</a> </h2> <p>We love to harness our IT expertise to give a competitive advantage to our clients by modernizing core chunks of key infrastructures. For example, we are working with the French Public Finances General Directorate on two of their modernization projects, to reimplement the language used for the computation of the Income Tax (<a href="#mlang">MLang</a>) and to provide support on the GnuCOBOL compiler used by the MedocDB application (<a href="#cobol">COBOL</a>).</p> <h3> <a id="mlang" class="anchor"></a><a class="anchor-link" href="#mlang">MLANG, keystone of the French citizens' Income Tax Calculation</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/dgfip_2023_at_ocp.jpg"> <img alt="The M language, designed in the 80s to compute the French Income Tax, is still being rewritten in OCaml!" src="/blog/assets/img/dgfip_2023_at_ocp.jpg"/> </a> <div class="caption"> The M language, designed in the 80s to compute the French Income Tax, is still being rewritten in OCaml! </div> </p> </div> </p> <p>In 2022, our work on MLANG passed a significant milestone: it may no longer be considered a prototype! Code generation is now behaviourally compliant with the upstream compiler. David focused on rewriting the C architecture, which has been of great aid in iterating through each version of this new implementation of MLANG.</p> <p>As far as testing goes, we were allowed to compare the results of our implementation against those of the upstream calculator, on real-life inputs too. We are talking about calculations of immense scale, which makes this a highly performance-dependent project. Naturally, we managed to produce something of equivalent performance, which was very important for our contractors, who have since voiced their satisfaction. It is always great for us to feel appreciated for our work.</p> <p>The next step is to make a production-level language by the end of 2023, so stay tuned if you are interested in this great project.</p> <blockquote> <p>Wondering what MLANG is? Be sure to read <a href="https://ocamlpro.com/blog/2022_01_31_2021_at_ocamlpro/#mlang">last year's post</a> on the matter.</p> </blockquote> <h3> <a id="cobol" class="anchor"></a><a class="anchor-link" href="#cobol">Contributing to GnuCOBOL, the Free Open-Source COBOL Alternative</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/COBOL_DEFENSE_2.jpg"> <img alt="COBOL is run in the gargantuan infrastructures of many insurance companies and banks across the globe." src="/blog/assets/img/COBOL_DEFENSE_2.jpg"/> </a> <div class="caption"> COBOL is run in the gargantuan infrastructures of many insurance companies and banks across the globe.
</div> </p> </div> </p> <p>In 2022, we started contributing to <a href="https://github.com/ocamlpro/gnucobol">the GnuCOBOL project</a>: the GnuCOBOL compiler is, today, the only free, open-source, industrial-grade alternative to proprietary compilers by IBM and Micro-Focus. A cornerstone feature of GnuCOBOL is its ability to understand proprietary extensions to COBOL (dialects), to decrease the migration work of developers.</p> <blockquote> <p><a href="https://ocamlpro.com/blog/2022_01_31_2021_at_ocamlpro/#cobol">Last year's <code>at OCamlPro</code></a> presented our gradual introduction to the <a href="https://wikipedia.org/wiki/COBOL">COBOL</a> Universe as one of our latest technical endeavours. In the beginning, our main objective was to wrap our heads around the state of the environment for COBOL developers.</p> </blockquote> <p>Our main contribution for now is to add support for the GCOS7 dialect, to ease migration from obsolete GCOS Bull mainframes to a cluster of PCs running GnuCOBOL for our first COBOL customer, the French <a href="https://fr.wikipedia.org/wiki/Direction_g%C3%A9n%C3%A9rale_des_Finances_publiques">DGFIP</a> (<em>Public Finances General Directorate</em>). We also contributed a few fixes and small useful general features. Our contributions are gradually upstreamed in the official version of GnuCOBOL.</p> <p>The other part of our COBOL strategy is to progressively develop our <a href="https://get-superbol.com/">SuperBOL Studio</a>, a set of modern tools for COBOL, and especially GnuCOBOL, based on an OCaml parser for COBOL that we have been improving during the year to support the full COBOL standard. More on this next year, hopefully !</p> <h2> <a id="rust" class="anchor"></a><a class="anchor-link" href="#rust">Rust Expertise and Developments</a> </h2> <p> <div class="figure"> <p> <a href="/blog/assets/img/florian_gilcher_ferrous_praises_ocp.png"> <img alt="Kind words sent our way by Florian Gilcher (skade), managing director at Ferrous Systems!" src="/blog/assets/img/florian_gilcher_ferrous_praises_ocp.png"/> </a> <div class="caption"> Kind words sent our way by Florian Gilcher (skade), managing director at Ferrous Systems! </div> </p> </div> </p> <p>OCamlPro's culture is one of human values and appeal for everything scientific.</p> <p>Programming languages of all nature have caught our attention at some point in time. As a consequence of years of expertise in all kinds of languages, we have grown fond of strongly-typed, memory-safe ones. Eventually gravitating towards Rust, we have since then invested significantly in adding this state-of-the-art language to our toolsets, one of which being the <a href="https://training.ocamlpro.com/">trainings we deliver to industrial actors</a> of various backgrounds to help them grasp at such technological marvels.</p> <p>Our trainers are qualified engineers, some of which have more than ten years of experience in the industry in R&amp;D, Formal Methods and embedded systems alike, seven of which being solely in Rust.</p> <p>Strong of our collective experiences, 2022 was indeed the stage for many contributions and missions, some of which we will share with you right now.</p> <h3> <a id="ecore" class="anchor"></a><a class="anchor-link" href="#ecore">Ecore, a heart of Rust for EMF</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/EMF_ARCHITECTURE.png"> <img alt="Ecore is the code generator at the heart of the EMF Architecture." 
src="/blog/assets/img/EMF_ARCHITECTURE.png"/> </a> <div class="caption"> Ecore is the code generator at the heart of the EMF Architecture. </div> </p> </div> </p> <p>In 2022, we have seized the opportunity to work at the threshold between Java and Rust for our clients and academic partners of the CEA (Commissariat aux Énergies Atomiques et aux Énergies Alternatives). The product was a Rust-written and Rust-encoded Java class hierarchy code generator.</p> <p>Ecore is the core metamodel at the heart of the <a href="https://en.wikipedia.org/wiki/Eclipse_Modeling_Framework">Eclipse Modeling Framework (EMF)</a>, which is used internally at the CEA. Ecore is a superset of <a href="https://en.wikipedia.org/wiki/Unified_Modeling_Language">UML</a> and allows for the engineers of the CEA to express a Java class hierarchy through a graphical interface. In practice, this allows for the generation of basic Java models for the engineers to then build upon.</p> <p>Our mission consisted in writing, in Rust, a new model generator for them to use in their workflows and to make it capable of generating Rust code instead of Java.</p> <p>The cost for harnessing the objective qualities of a new implementation in Rust was to have us tackle the scientific challenges pertaining to the inherent structural differences between both languages. Our goal was to find a way to encode, in Rust, a way to express the semantics of the hierarchy of classes of Java, hence merging the worlds of Rust and Java on the way there.</p> <p>Eventually, our partners were convinced the challenges were worth the improved speed at which models were generated. Furthermore, the now embedded-programming compliant platform, the runtime safety and even Rust's broader WebAssembly-ready toolchain have cleared a new path for the future iterations of their internal projects.</p> <h3> <a id="osource" class="anchor"></a><a class="anchor-link" href="#osource">Open-Source Rust Contributions</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/ferris_stained_glass_2023.png"> <img alt="Ferris the Crab is the mascot of the Rust Language. No wonder why we converged as well!" src="/blog/assets/img/ferris_stained_glass_2023.png"/> </a> <div class="caption"> Ferris the Crab is the mascot of the Rust Language. No wonder why we converged as well! </div> </p> </div> </p> <p>As we continue scouring the market for more and more Rust projects, and whenever the opportunity shows up, we still actively contribute to the open-source community, here are some of this year's OS work:</p> <h4> <a id="lean4" class="anchor"></a><a class="anchor-link" href="#lean4">Lean4</a> </h4> <p>Here's a project suited for all who, like us, are Formal Methods, functional programming and formal methods enthousiasts: <a href="https://leanprover.github.io/about/">Lean</a>:</p> <blockquote> <p>Lean is a functional programming language that makes it easy to write correct and maintainable code. You can also use Lean as an interactive theorem prover. Lean programming primarily involves defining types and functions. 
This allows your focus to remain on the problem domain and manipulating its data, rather than the details of programming.</p> </blockquote> <p>The list of our contributions to the <a href="https://github.com/leanprover/lean4">repository of lean4</a>:</p> <ul> <li>Detection of a major <a href="https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/case.20in.20dependent.20match.20not.20triggering.20.28.3F.29/near/288328239">dependent pattern matching bug</a> </li> <li>Some QA with <a href="https://github.com/leanprover/lean4/pull/1844">unintuitive <code>calc</code> indentation</a> </li> <li>And some more with <a href="https://github.com/leanprover/lean4/pull/1811">strict indentation in nested <code>by</code>-s requirement</a> </li> </ul> <h4> <a id="matla" class="anchor"></a><a class="anchor-link" href="#matla">Matla, TLA+ Projects Manager</a> </h4> <p><a href="https://ocamlpro.com/blog/2022_01_31_2021_at_ocamlpro/#matla">Last year, we shared a sneakpeek of Matla</a>, introducing its use-case and the motivations for implementing such manager for TLA+ projects. As we tinkered with TLA+, sometimes <a href="https://github.com/tlaplus/tlaplus/issues/732">finding a bug</a>, we continued our development of Matla on the side.</p> <p>The tool, although still a work-in-progress, has since then undergone a few changes and <a href="https://github.com/OCamlPro/matla/releases">releases</a>:</p> <ul> <li><a href="https://github.com/OCamlPro/matla/pull/10">Implemented user feedback</a> </li> <li><a href="https://github.com/OCamlPro/matla/pull/8">Clap builder overhaul</a> </li> <li><a href="https://github.com/OCamlPro/matla/pull/1">Fixed a bug in temporal (lasso) cex parsing</a> </li> <li><a href="https://github.com/OCamlPro/matla/pull/7">Documentation efforts</a> </li> <li><a href="https://github.com/OCamlPro/matla/pull/6">Fix double quote parsing and no JRE error</a> </li> </ul> <p>You are welcome to <a href="https://github.com/OCamlPro/matla">contribute</a> if you happen to find yourself in the same situation we were in when we started the project.</p> <h4> <a id="agnos" class="anchor"></a><a class="anchor-link" href="#agnos">Agnos, for Let's Encrypt Wildcard Certificates</a> </h4> <p>Agnos is a single-binary program allowing you to easily obtain certificates (including wildcards) from Let's Encrypt using DNS-01 challenges. It answers Let's Encrypt DNS queries on its own, bypassing the need for API calls to your DNS provider, and strives to offer a user-friendly and easy configuration.</p> <p>Often, the best contributions are of a practical nature, which is the case for <a href="https://github.com/krtab/agnos">Agnos</a>.</p> <p>If that sounds interesting to you, you can learn more about it by reading <a href="https://ocamlpro.com/blog/2022_10_05_agnos_0.1.0-beta/">this article</a>.</p> <p>Make sure to give us some feedback when you end up using it!</p> <h2> <a id="wasm" class="anchor"></a><a class="anchor-link" href="#wasm">The WebAssembly Garbage Collection Working-Group</a> </h2> <p> <div class="figure"> <p> <a href="/blog/assets/img/wasm.png"> <img alt="WebAssembly is used to compile many languages to an efficient portable code for web-browsers." src="/blog/assets/img/wasm.png"/> </a> <div class="caption"> WebAssembly is used to compile many languages to an efficient portable code for web-browsers. 
</div> </p> </div> </p> <p>Late 2022 was finally time for us to put into practice the knowledge we had acquired about <a href="https://webassembly.org/">WebAssembly</a> over the years, by writing and presenting the first compiler for a real-world functional language targeting the WasmGC proposal.</p> <p>Although it is a <em>relatively</em> new technology, its great design, huge potential, and already very tangible and interesting use cases have not escaped our notice, and we are very happy to have kept a sharp eye on it.</p> <blockquote> <p>WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a <strong>portable compilation target</strong> for programming languages, enabling deployment on the web for client and server applications.</p> </blockquote> <p><a href="https://github.com/WebAssembly/gc">WasmGC</a> is the name of the ongoing working group and proposal towards eventually adding support for garbage collection to Wasm. December 2022 saw a significant amount of work accomplished by both Léo Andrès (whose thesis work is directed by Pierre and Jean-Christophe Filliâtre) and Pierre Chambart towards finding viable compilation strategies from OCaml to WasmGC. The goal was threefold: make a prototype compiler to demonstrate the soundness of the proposal, show that our compilation strategies were viable and, finally, convince the committee of the significance of the Wasm <code>i31ref</code> for OCaml.</p> <p>Succeeding on these three distinct points was paramount for OCaml, and for other languages that depend on the presence of <code>i31ref</code>, in order to one day benefit from having WebAssembly as a natively supported compilation target for Web-bound applications.</p> <p>Here's a short list of the work they accomplished to that end; a small illustration of why <code>i31ref</code> matters to OCaml follows the list. Rest assured, <a href="https://discuss.ocaml.org/t/announcing-the-ocaml-wasm-organisation/12676/3">more detailed explanations</a> are to be expected on this very blog in the future, so stay tuned!</p> <ul> <li><a href="https://github.com/WebAssembly/meetings/blob/main/gc/2023/GC-01-10.md">Introducing Wasocaml to the Wasm-GC Group</a> and demonstrating OCaml's dependency on Wasm keeping <code>i31ref</code> in the GC proposal. </li> <li><a href="https://github.com/OCamlPro/wasocaml">Wasocaml</a>, an OCaml compiler to Wasm. Wasocaml is also the first compiler for a real-world functional language to Wasm-GC. </li> <li><a href="https://github.com/OCamlPro/owi">owi</a>, an OCaml toolchain to work with Wasm. It provides an interpreter as an executable and a library. </li> </ul>
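<p>To make the point about <code>i31ref</code> slightly more concrete, here is a tiny OCaml sketch of our own (an illustration, not code from Wasocaml): in OCaml's uniform value representation, integers and constant constructors are unboxed immediates rather than heap blocks, and <code>i31ref</code> is what allows a Wasm runtime to keep representing such values without boxing them.</p> <pre><code class="language-ocaml">(* Our own illustration: immediates vs. heap blocks in OCaml's value
   representation, observed with the [Obj] module of the standard library. *)
let () =
  assert (Obj.is_int (Obj.repr 42));              (* ints are immediates *)
  assert (Obj.is_int (Obj.repr None));            (* constant constructors too *)
  assert (not (Obj.is_int (Obj.repr (Some 1))));  (* non-constant constructors are heap blocks *)
  print_endline &quot;unboxed immediates are everywhere in OCaml&quot;
</code></pre>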
<h2> <a id="formal-methods" class="anchor"></a><a class="anchor-link" href="#formal-methods">Tooling for Formal Methods</a> </h2> <p>Programming language theory is closely tied to the idea of proper mathematical formalisation. Hence the strong scientific background in Formal Methods that we draw on, both for language design and for formal verification for cybersecurity.</p> <h3> <a id="prover" class="anchor"></a><a class="anchor-link" href="#prover">The Alt-Ergo Theorem Prover</a> </h3> <p>OCamlPro develops and maintains <a href="https://alt-ergo.ocamlpro.com/">Alt-Ergo</a>, an automatic solver of mathematical formulas designed for program verification and based on Satisfiability Modulo Theories (SMT) technology. Alt-Ergo was initially created within the <a href="https://vals.lri.fr/">VALS</a> team at <a href="https://www.universite-paris-saclay.fr/en">University of Paris-Saclay</a>.</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/Blackboard_with_formulas_and_geometry.jpg"> <img alt="Alt-Ergo proves mathematical formulas corresponding to software program properties." src="/blog/assets/img/Blackboard_with_formulas_and_geometry.jpg"/> </a> <div class="caption"> Alt-Ergo proves mathematical formulas corresponding to software program properties. </div> </p> </div> </p> <h4> <a id="club" class="anchor"></a><a class="anchor-link" href="#club">The Alt-Ergo Users' Club</a> </h4> <p>The Alt-Ergo Users' Club was launched in 2019. Its 4th annual meeting was held in late March 2022.</p> <p>These meetings allow us to keep track of our members' needs and demands as well as keep them informed of the latest changes to the SMT solver; they are the lifeline of our Club and help us guarantee that the project lives on, despite the enormous task it represents.</p> <p>This is a good time to appreciate the scope of the project: Alt-Ergo is the fruit of more than 10 years' worth of Research &amp; Development. It is currently maintained by Pierre Villemot, whom we will introduce in the next section, as a full-time R&amp;D engineer.</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/clubAE.png"> <img alt="The dedicated members of the Club!" src="/blog/assets/img/clubAE.png"/> </a> <div class="caption"> The dedicated members of the Club! </div> </p> </div> </p> <p>This is the reason why we would like to thank our partners from the <a href="https://alt-ergo.ocamlpro.com/#club">Alt-Ergo Users’ Club</a> for their trust: Thales, Trust-in-Soft, AdaCore, MERCE (Mitsubishi Electric R&amp;D Centre Europe) and the CEA. Their support allows us to maintain our tool.</p> <p>Congratulations and many thanks to our partners at Trust-In-Soft, who have upgraded their subscription to the Club to Gold-tier, allowing this great open-source tool to live on!</p> <h4> <a id="alt-ergo" class="anchor"></a><a class="anchor-link" href="#alt-ergo">Developing Alt-Ergo</a> </h4> <p>In 2022, the Alt-Ergo team welcomed Pierre Villemot as full-time maintainer of the project! His recruitment shows our commitment to the project's long-term maintenance and evolution. We are looking forward to seeing him take it to new heights in future releases! Speaking of releases, 2022 was also the stage for Alt-Ergo's v2.4.2 minor release, which introduced an update of the <code>lablgtk</code> library to version 3 and a set of bug fixes.</p> <p>Now onto the more substantial changes to Alt-Ergo: the integration into <code>next</code> of all the following:</p> <ul> <li>Integration of the SMT-LIB2 format parser <a href="https://github.com/Gbury/dolmen">Dolmen</a> into Alt-Ergo's frontend; </li> <li>Improvement and testing of model generation; </li> <li>Addition of mutually recursive functions for the legacy frontend <strong>and</strong> Dolmen alike; </li> <li>Significant amounts of documentation and code cleaning; </li> <li>Implementation of systematic benchmarks on the SMT-LIB for regression prevention; </li> <li>Prototype Dockerisation. </li> </ul> <p>These are significant improvements to the user experience and overall ergonomics of the tool. You can already benefit from these changes by using Alt-Ergo's <code>dev</code> version.</p> <p>Finally, let us inform you that our candidacy for the DECYSIF project was approved.
Indeed, we and our partners at Adacore, Trust-In-Soft and the <em>Laboratoire Méthodes Formelles</em> have been selected to conduct this funded research project as consultant in Formal Methods. Now, we hope to be part of collaborative R&amp;D projects to further fund core Alt-Ergo developments. This should allow us to deepen collaboration with old partners like the Why3 team at the Formal Methods Lab (LMF) and the ProofinUse consortium members. Stay tuned!</p> <h3> <a id="dolmen" class="anchor"></a><a class="anchor-link" href="#dolmen">Dolmen Library for Automated Deduction Languages</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/dolmen_2023.jpg"> <img alt="Dolmens are Neolithic megalithic structures composed of menhirs and they can range from a few centimeters to several meters high!" src="/blog/assets/img/dolmen_2023.jpg"/> </a> <div class="caption"> Dolmens are Neolithic megalithic structures composed of menhirs and they can range from a few centimeters to several meters high! </div> </p> </div> </p> <p><a href="https://github.com/Gbury/dolmen">Dolmen</a> is an OCaml Library developed by Guillaume Bury as part of our Research and Development processes around Formal Methods and our development efforts for our automated SMT-Solver <a href="https://alt-ergo.ocamlpro.com/">Alt-Ergo</a>.</p> <p>Dolmen is a testimony of our push towards standardised input languages for SMT-Solvers. Indeed, it provides flexible Menhir parsers and typecheckers for several languages used in automated deduction such as: smtlib, ae (Alt-Ergo), dimacs, iCNF, tptp and zf (zipperposition). And so, Dolmen aims at encompassing the largest amount of input languages for automated deduction as possible and provides the OCaml community with a centralised solution for their input parsing and typechecking, hence keeping them from having to reimplement them each time.</p> <p>Furthermore, the Dolmen binary is used by the maintainers of the SMTLIB in order to assert that newly submitted benchmarks to the SMTLIB are compliant with the specification which makes Dolmen its <em>de facto</em> reference implementation. In time, Dolmen will become the default frontend for Alt-Ergo and, we hope, any other OCaml written SMT-Solver from now on.</p> <h2> <a id="ocaml" class="anchor"></a><a class="anchor-link" href="#ocaml">Contributions to OCaml</a> </h2> <p>Last but not least, OCamlPro’s DNA built itself on one of the most powerful and elegant programming languages ever, born from more than 30 years of French public Research in Computer Science, and widely used in safety critical industries. OCaml’s traits pervasively inspired many new languages (like F#). We are proud to be part of this great community of researchers and computer scientists.</p> <h3> <a id="opam" class="anchor"></a><a class="anchor-link" href="#opam">About opam, the OCaml Package Manager</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/opam-banniere-e1600868011587.png"> <img alt="opam, the OCaml Package Manager, remains one of OCamlPro's greatest achievements!" src="/blog/assets/img/opam-banniere-e1600868011587.png"/> </a> <div class="caption"> opam, the OCaml Package Manager, remains one of OCamlPro's greatest achievements! 
</div> </p> </div> </p> <p>2022 was the theatre of a sustained and continuous effort from the opam team.</p> <p>The fruits of their labor were compiled into an <a href="https://opam.ocaml.org/blog/opam-2-2-0-alpha/">alpha release of version 2.2.0</a> by June 28th, 2023, so here is a taste of what should make the final <code>2.2.0</code> version of opam a treat for its users:</p> <ul> <li>Windows support: opam 2.2 comes with native Windows compatibility. You can now use opam from your preferred Windows terminal! </li> <li>Recursive pinning: allows opam to look up opam files in subdirectories. </li> <li>Software Heritage binding: opam now integrates a fallback to Software Heritage archive retrieval, based on SWHID. If an SWHID URL is present in an opam file, the fallback can be activated. </li> <li>Enhanced features for developers: such as the development tools variable to share a development setup, the <code>opam tree</code> command to get a better overview of dependencies, new pinning subcommands, and so on. </li> </ul> <p>That being said, 2022 was a very special year for opam. Indeed, 10 years prior, on the 26th of June 2012, OCamlPro birthed version <a href="https://github.com/ocaml/opam/releases/tag/0.1"><code>0.1</code></a> of what was to become the official OCaml Package Manager, the cornerstone of the OCaml environment.</p> <p>It was no small feat to make opam what it is today. It took approximately 5 years to bring <a href="https://opam.ocaml.org/blog/opam-1-0-0-released/"><code>1.0.0</code></a> up to <a href="https://opam.ocaml.org/blog/opam-2-0-0/"><code>2.0.0</code></a> and another 3 to reach <a href="https://opam.ocaml.org/blog/opam-2-1-0/"><code>2.1.0</code></a>, all the while ensuring changes were compliant with the current ecosystem (opam repository, OCaml tooling) and the public's feedback and vision.</p> <p><em><strong>This work is made possible thanks to Jane Street's funding.</strong></em></p> <h3> <a id="flambda" class="anchor"></a><a class="anchor-link" href="#flambda">The Flambda2 Optimizing Compiler</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/flambda_camel_2023.png"> <img alt="Flambda2 is a powerful code optimizer for the OCaml compiler, built on many years of R&amp;D." src="/blog/assets/img/flambda_camel_2023.png"/> </a> <div class="caption"> Flambda2 is a powerful code optimizer for the OCaml compiler, built on many years of R&amp;D. </div> </p> </div> </p> <p>OCamlPro is proud to be working on Flambda2, an ambitious OCaml optimizing compiler project, initiated with Mark Shinwell from Jane Street, our long-term partner and client. Flambda2 builds upon its predecessor: Flambda, which focused on reducing the runtime cost of abstractions and removing as many short-lived allocations as possible (a small illustration follows below). Thus, Flambda2 not only shines with the maturity and experience its architects acquired through years' worth of R&amp;D and dev-time for Flambda, but it improves upon them.</p>
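<p>To give a rough idea of what removing short-lived allocations means in practice, here is a minimal OCaml sketch of our own (not code from Flambda2 itself): the intermediate tuple below is exactly the kind of temporary heap allocation such an optimizer aims to make disappear after inlining. Whether it is actually removed depends on the compiler version and flags.</p> <pre><code class="language-ocaml">(* Our own minimal example: [pair] is a short-lived heap allocation that an
   optimizer in the Flambda lineage aims to eliminate, so that [average]
   compiles down to plain integer arithmetic. *)
let average x y =
  let pair = (x, y) in
  (fst pair + snd pair) / 2

let () = Printf.printf &quot;average 6 10 = %d\n&quot; (average 6 10)
</code></pre>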
<p>In 2022, Flambda2 was for the first time used for production workloads and has been ever since! Indeed, we can officially say that Flambda2 has left the realm of the prototype to enter that of real-life, production-tested software, for which we continue to provide development and support as we have for years now.</p> <p>This achievement comes along with our engineers taking a more and more active role in maintaining the OCaml compiler. Being part of the OCaml Core-Team is an honour.</p> <p>Finally, in 2022, the Flambda Team welcomed a new member: Nathanaëlle Courant will be joining forces with Pierre Chambart, Damien Doligez, Vincent Laviron and Guillaume Bury to tackle the challenges inherent in maintaining Flambda2 and those of the Core-Team.</p> <p>If you are interested in all things Flambda2, stay tuned to our blog: there should be a series of very interesting articles coming up in the not-so-distant future!</p> <p><em><strong>This work is made possible thanks to Jane Street's funding.</strong></em></p> <p>In other OCaml compiler news, 2022 was also the year of the official release of OCaml 5.0.0, also known as Multicore, on the 16th of December. This major release introduced a new runtime system with support for shared-memory parallelism and effect handlers! This fabulous milestone is brought to us by the joint work of the amazing people of the OCaml Core-Team, among them some of our own.</p> <p>Many thanks to all who take part in uncovering the yet untrodden paths of the OCaml distribution!</p> <p>What a time to be an OCaml enthusiast!</p> <h2> <a id="meetups" class="anchor"></a><a class="anchor-link" href="#meetups">Organizing Meetups for the OCaml Community</a> </h2> <h3> <a id="oups" class="anchor"></a><a class="anchor-link" href="#oups">OCaml Users in PariS (OUPS)</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/camels_going_to_oups.jpg"> <img alt="Camels going to one of the several OUPS meet-ups held every year." src="/blog/assets/img/camels_going_to_oups.jpg"/> </a> <div class="caption"> Camels going to one of the several OUPS meet-ups held every year. </div> </p> </div> </p> <p>Just under 10 years ago, Fabrice Le Fessant initiated the <a href="https://www.meetup.com/fr-FR/ocaml-paris/events/99222322/">very first OCaml Users in Paris</a>.</p> <p>This event allowed OCaml users in Paris, professionals and amateurs alike, to meet and discuss OCaml novelties. This is still the case, and the organising crew now includes several people of diverse affiliations, maintaining the purpose of this friendly event.</p> <p>Every two months or so, the organisers reach out to the community, call for volunteers and select presentations of ongoing work. When the time and place are settled, the <code>ocaml-paris</code> Meetup members are informed by various means. The OCaml Users in PariS meetup is the place to enthusiastically share knowledge and a few pizzas. It is supported by the <a href="https://ocaml-sf.org/">OCaml Software Foundation</a>, which graciously pays for the pizzas.</p> <blockquote> <p><strong>You can join the OCaml Users in PariS (OUPS) meetup group <a href="https://www.meetup.com/ocaml-paris/">here</a></strong>.</p> </blockquote> <p>Here are all the relevant links to the talks that happened in Paris in 2022:</p> <ul> <li><a href="https://www.meetup.com/ocaml-paris/events/284313963/">10th March 2022</a> </li> <li><a href="https://www.meetup.com/ocaml-paris/events/285435718/">12th May 2022</a> </li> <li><a href="https://www.meetup.com/ocaml-paris/events/288520108/">29th September 2022</a> </li> <li><a href="https://www.meetup.com/ocaml-paris/events/289909374/">8th December 2022</a> </li> </ul> <h3> <a id="octo" class="anchor"></a><a class="anchor-link" href="#octo">OCaml Meet-Up in Toulouse</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/Hopital_de_la_Grave-Toulouse-2012-06-23.jpg"> <img alt="Toulouse also has its set of enthusiastic OCaml supporters."
src="/blog/assets/img/Hopital_de_la_Grave-Toulouse-2012-06-23.jpg"/> </a> <div class="caption"> Toulouse also has its set of enthousiastic OCaml supporters. </div> </p> </div> </p> <p>Fortunately for OCaml Users that live in the French South-West, <a href="https://www.meetup.com/ocaml-toulouse/events/288464047/">a new Meet-up is now available</a> to them. On the 11th of October 2022, the first OCaml meet-up in <a href="https://en.wikipedia.org/wiki/Toulouse">Toulouse</a> happened.</p> <p>The first occurence of the OCaml Users in Toulouse Meetup kicked off with Erik Martin-Dorel (OCaml Software Foundation) presenting <a href="https://ocaml-sf.org/learn-ocaml/"><code>Learn-OCaml</code></a> who was then followed by David Declerck (OCamlPro) presenting his <a href="https://github.com/ocamlpro/ocaml-canvas"><code>OCaml-Canvas</code></a> graphics library for OCaml.</p> <blockquote> <p><strong>You can register to the OCaml Meet-Up in Toulouse group <a href="https://www.meetup.com/ocaml-toulouse/">here</a></strong>.</p> </blockquote> <p>Here's to sharing a slice or two with you soon!</p> <h2> <a id="confs" class="anchor"></a><a class="anchor-link" href="#confs">Participation to External Events</a> </h2> <h3> <a id="ocamlworkshop" class="anchor"></a><a class="anchor-link" href="#ocamlworkshop">The OCaml Workshop 2022 - ICFP Ljubljana</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/ljubjana_slovenie_icfp_2022.jpg"> <img alt="ICFP 2022 took place in the beautiful town of Ljubjana, Slovenia." src="/blog/assets/img/ljubjana_slovenie_icfp_2022.jpg"/> </a> <div class="caption"> ICFP 2022 took place in the beautiful town of Ljubjana, Slovenia. </div> </p> </div> </p> <p>The OCaml Workshop is an international conference that focuses on everything OCaml and is part of the ICFP (International Conference on Functional Programming).</p> <p>We attended many of these and have presented numerous papers throughout the years.</p> <p>In 2022, a paper co-authored by the maintainers of opam, the OCaml Package Manager, was submitted and approved for presentation: &quot;Supporting a decade of opam&quot;.</p> <p>You can find the textual references of the talk <a href="https://icfp22.sigplan.org/details/ocaml-2022-papers/11/Supporting-a-decade-of-opam">here</a> and a replay of the presentation <a href="https://watch.ocaml.org/w/1rWj4jYyaDkmMjdH4KNcv6">there</a>.</p> <p>You can expect more papers and interesting talks coming from us in upcoming editions of the conference!</p> <h3> <a id="jfla2022" class="anchor"></a><a class="anchor-link" href="#jfla2022">Journées Francophones Langages Applicatifs 2022</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/picture_jfla2022_domaine_essendieras.jpg"> <img alt="the JFLA'2022 took place in the beautiful Domaine d'Essendiéras in Périgord, France." src="/blog/assets/img/picture_jfla2022_domaine_essendieras.jpg"/> </a> <div class="caption"> the JFLA'2022 took place in the beautiful Domaine d'Essendiéras in Périgord, France. </div> </p> </div> </p> <p>Among the many scientific conferences we attend on an annual basis, the <a href="https://jfla.inria.fr/">JFLA</a> (<em>Journée Francophones des Langages Applicatifs</em> or <em>French-Speaking annual gathering on Application Programming Languages</em>, mainly Functional Languages) is the one we feel most at home since 2016.</p> <p>Ever since have we remained among their faithful supporters and participants. 
This gathering of many of our fellow French computer scientists and industrial actors alike has been our go-to conference to catch-up with and present our work. The 2022 edition was no exception!</p> <p>We submitted and presented the following papers:</p> <ul> <li><a href="https://ocamlpro.com/blog/2021_10_14_verification_for_dummies_smt_and_induction/">Mikino, formal verification made accessible (link to dedicated blogpost)</a>; </li> <li>Connecting Software Heritage with the OCaml ecosystem; </li> <li>Alt-Ergo-Fuzz, hunting the bugs of the bug hunter; </li> </ul> <p>You can find a more detailed recounting of our JFLA2022 submissions in <a href="https://ocamlpro.com/blog/2022_07_12_ocamlpro_at_the_jfla2022/">this blog post</a> as well as the links to the actual (french-written) submitted papers.</p> <h2> <a id="conclusion" class="anchor"></a><a class="anchor-link" href="#conclusion">Conclusion</a> </h2> <p>As always, we warmly thank all our clients, partners, and friends, for their support and collaboration throughout the year,</p> <p>And to you, dear reader, thank you for tagging along,</p> <p>Since 2011 with love,</p> <p>The OCamlPro Team</p> Autofonce, GNU Autotests Revisited https://ocamlpro.com/blog/2023_03_18_autofonce 2023-06-27T13:48:57Z 2023-06-27T13:48:57Z Fabrice Le Fessant Since 2022, OCamlPro has been contributing to GnuCOBOL, the only fully open-source compiler for the COBOL language. To speed-up our contributions to the compiler, we developed a new tool, autofonce, to be able to easily run and modify the testsuite of the compiler, originally written as a GNU Autoco... <p></p> <p>Since 2022, OCamlPro has been contributing to GnuCOBOL, the only fully open-source compiler for the COBOL language. To speed-up our contributions to the compiler, we developed a new tool, <code>autofonce</code>, to be able to easily run and modify the testsuite of the compiler, originally written as a GNU Autoconf testsuite. This article describes this tool, that could be useful for other project testsuites.</p> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#introduction">Introduction</a> </li> <li><a href="#gnucobol">The Gnu Autoconf Testsuite of GnuCOBOL</a> </li> <li><a href="#autofonce">Main Features of Autofonce</a> </li> <li><a href="#conclusion">Conclusion</a> </div> </li> </ul> <p> <div class="figure"> <p> <a href="/blog/assets/img/autofonce-2023.png"> <img alt="Autofonce is a modern runner for GNU Autoconf Testsuite" src="/blog/assets/img/autofonce-2023.png"/> </a> <div class="caption"> Autofonce is a modern runner for GNU Autoconf Testsuite </div> </p> </div> </p> <h2> <a id="introduction" class="anchor"></a><a class="anchor-link" href="#introduction">Introduction</a> </h2> <p>Since 2022, OCamlPro has been involved in a big modernization project for the French state: the goal is to move a large COBOL application, running on a former Bull mainframe (<a href="https://fr.wikipedia.org/wiki/General_Comprehensive_Operating_System">GCOS</a>) to a cluster of Linux computers. The choice was made to use the most open-source compiler, <a href="https://gnucobol.sourceforge.io/">GnuCOBOL</a>, that had already been used in such projects.</p> <p>One of the main problems in such migration projects is that most COBOL proprietary compilers provide extensions to the COBOL language standard, that are not supported by other compilers. Fortunately, GnuCOBOL has good support for several mainstream COBOL dialects, such as IBM or Micro-Focus ones. 
Unfortunately, GnuCOBOL had no support at the time for the GCOS COBOL dialect developed by Bull for its mainframes.</p> <p>As a consequence, OCamlPro got involved in the project to extend GnuCOBOL with the support for the GCOS dialect needed for the application. This work implied a lot of (sometimes very deep) <a href="https://github.com/OCamlPro/gnucobol/pulls?q=is%3Apr+is%3Aclosed">modifications</a> of the compiler and its runtime library, both of them written in the C language. And of course, our modifications had first to pass the large existing testsuite of COBOL examples, and then extend it with new tests, so that the new dialect would continue to work in the future.</p> <p>This work lead us to develop <a href="https://github.com/OCamlPro/autofonce"><code>autofonce</code>, a modern open-source runner</a> for GNU Autoconf Testsuites, the framework used in GnuCOBOL to manage its testsuite. Our tool is available on Github, with Linux and Windows binaries on the <a href="https://github.com/OCamlPro/autofonce/releases">release page</a>.</p> <h2> <a id="gnucobol" class="anchor"></a><a class="anchor-link" href="#gnucobol">The GNU Autoconf Testsuite of GnuCOBOL</a> </h2> <p><a href="https://www.gnu.org/software/autoconf/">GNU Autoconf</a> is a set of powerful tools, developed to help developers of open-source projects to manage their projects, from configuration steps to testing and installation. As a very old technology, GNU Autoconf relies heavily on <a href="https://www.gnu.org/software/m4/manual/m4.html">M4 macros</a> both as its own development language, and as its extension language, typically for tests.</p> <p>In GnuCOBOL, the testsuite is in a <a href="https://github.com/OCamlPro/gnucobol/tree/gcos4gnucobol-3.x/tests">sub-directory <code>tests/</code></a>, containing a file <a href="https://github.com/OCamlPro/gnucobol/blob/gcos4gnucobol-3.x/tests/testsuite.at"><code>testsuite.at</code></a>, itself including other files from a sub-directory <a href="https://github.com/OCamlPro/gnucobol/blob/gcos4gnucobol-3.x/tests/testsuite.src"><code>testsuite.src/</code></a>.</p> <p>As an example, a typical test from <a href="https://github.com/OCamlPro/gnucobol/blob/gcos4gnucobol-3.x/tests/testsuite.src/syn_misc.at">syn_copy.at</a> looks like:</p> <pre><code class="language-COBOL">AT_SETUP([INITIALIZE constant]) AT_KEYWORDS([misc]) AT_DATA([prog.cob], [ IDENTIFICATION DIVISION. PROGRAM-ID. prog. DATA DIVISION. WORKING-STORAGE SECTION. 01 CON CONSTANT 10. 01 V PIC 9. 78 C78 VALUE 'A'. PROCEDURE DIVISION. INITIALIZE CON. INITIALIZE V. INITIALIZE V, 9. INITIALIZE C78, V. ]) AT_CHECK([$COMPILE_ONLY prog.cob], [1], [], [prog.cob:10: error: invalid INITIALIZE statement prog.cob:12: error: invalid INITIALIZE statement prog.cob:13: error: invalid INITIALIZE statement ]) AT_CLEANUP </code></pre> <p>Actually, we were quite pleased by the syntax of tests, it is easy to generate test files (using <code>AT_DATA</code> macro) and to test the execution of commands (using <code>AT_CHECK</code> macro), checking its exit code, its standard output and error output separately. It is even possible to combine checks to run additional checks in case of error or success. 
In general, the testsuite is easy to read and complete.</p> <p>However, there were still some issues:</p> <ul> <li> <p>At every update of the code or the tests, the testsuite runner has to be recompiled;</p> </li> <li> <p>Running the testsuite requires to be in the correct sub-directory, typically within the <code>_build/</code> sub-directory;</p> </li> <li> <p>By default, tests are ran sequentially, even when many cores are available.</p> </li> <li> <p>The output is pretty verbose, showing all tests that have been executed. Failed tests are often lost in the middle of other successful tests, and you have to wait for the end of the run to start investigating them;</p> <pre><code class="language-shell">## -------------------------------------------- ## ## GnuCOBOL 3.2-dev test suite: GnuCOBOL Tests. ## ## -------------------------------------------- ## General tests of used binaries 1: compiler help and information ok 2: compiler warnings ok 3: compiler outputs (general) ok 4: compiler outputs (file specified) ok 5: compiler outputs (path specified) ok 6: compiler outputs (assembler) ok 7: source file not found ok 8: temporary path invalid ok 9: use of full path for cobc ok 10: C Compiler optimizations ok 11: invalid cobc option ok 12: cobcrun help and information ok 13: cobcrun validation ok 14: cobcrun -M DSO entry argument ok 15: cobcrun -M directory/ default ok [...] </code></pre> </li> <li> <p>There is no automatic way to update tests, when their output has changed. Every test has to be updated manually.</p> </li> <li> <p>In case of error, it is not always easy to rerun a specific test within its directory.</p> </li> </ul> <p>With <code>autofonce</code>, we tried to solve all of these issues...</p> <h2> <a id="autofonce" class="anchor"></a><a class="anchor-link" href="#autofonce">Main Features of Autofonce</a> </h2> <p><code>autofonce</code> is written in a modern language, OCaml, so that it can handle a large testsuite much faster than GNU Autoconf. Since we do not expect users to have an OCaml environment set up, we provide binary versions of <code>autofonce</code> for both Linux (static executable) and Windows (cross-compiled executable) on Github.</p> <p><code>autofonce</code> does not use <code>m4</code>, instead, it has a limited support for a small set of predefined m4 macros, typically supporting m4 escape sequences (quadrigraphs), but not the addition of new m4 macros, and neither the execution of shell commands outside of these macros (yes, testsuites in GNU Autoconf are actually <code>sh</code> shell scripts with m4 macros...). In the case of GnuCOBOL, we were lucky enough that the testsuite was well written and avoided such problems (we had to fix only a few of them, such as including shell commands into <code>AT_CHECK</code> macros). The syntax of tests is <a href="https://ocamlpro.github.io/autofonce/sphinx/format.html">documented here</a>.</p> <p>Some interesting features of <code>autofonce</code> are :</p> <ul> <li> <p><code>autofonce</code> executes the tests in parallel by default, using as many cores as available. Only failed tests are printed, so that the developer can immediately start investigating them;</p> </li> <li> <p><code>autofonce</code> can be run from any directory in the project. 
A <a href="https://github.com/OCamlPro/gnucobol/blob/gcos4gnucobol-3.x/.autofonce"><code>.autofonce</code> file</a> has to be present at the root of the project, to describe where the tests are located and in which environment they should be executed;</p> </li> <li> <p><code>autofonce</code> makes it easy to re-execute a specific test that failed, by generating, within the test sub-directory, a script for every step of the test;</p> </li> <li> <p><code>autofonce</code> provides many options to filter which tests should be executed. Tests can be specified by number, range of numbers, keywords, or negative keywords. The complete list of options is easily printable using <code>autofonce run --help</code> for example;</p> </li> </ul> <p>Additionnally, <code>autofonce</code> implements a powerful promotion mechanism to update tests, with the <a href="https://ocamlpro.github.io/autofonce/sphinx/commands.html#autofonce-promote"><code>autofonce promote</code> sub-command</a>. For example, if you update a warning message in the compiler, you would like all tests where this message appears to be modified. With <code>autofonce</code>, it is as easy as:</p> <pre><code class="language-shell"># Run all tests at least once autofonce run # Print the patch that would be applied in case of promotion autofonce promote # Apply the patch above autofonce promote --apply # Combine running and promotion 10 times: autofonce run --auto-promote 10 </code></pre> <p>The last command iterates promotion 10 times: indeed, since a test may have multiple checks, and only the first failed check of the test will be updated during one iteration (because the test aborts at the first failed check), as many iterations as the maximal number of failed checks within a test may be needed.</p> <p>Also, as for GNU Autoconf, <code>autofonce</code> generates a final log file containing the results with a full log of errors and files needed to reproduce the error. This file can be uploaded into the artefacts of a CI system to easily debug errors after a CI failure.</p> <h2> <a id="conclusion" class="anchor"></a><a class="anchor-link" href="#conclusion">Conclusion</a> </h2> <p>During our work on GnuCOBOL, <code>autofonce</code> improved a lot our user experience of running the testsuite, especially using the auto-promotion feature to update tests after modifications.</p> <p>We hope <code>autofonce</code> could be used for other open-source projects that already use the GNU Autoconf testsuite. Of course, it requires that the testsuite does not make heavy use of shell features and mainly relies on standard m4 macros.</p> <p>We found that the format of GNU Autoconf tests to be quite powerful to easily check exit codes, standard outputs and error outputs of shell commands. <code>autofonce</code> could be used to help using this format in projects, that do not want to rely on an old tool like GNU Autoconf, and are looking for a much more modern test framework.</p> Sub-single-instruction Peano to machine integer conversion https://ocamlpro.com/blog/2023_01_23_Pea_No_Op 2023-01-23T13:48:57Z 2023-01-23T13:48:57Z Arthur Carcano It is a rainy end of January in Paris, morale is getting soggier by the day, and the bulk of our light exposure needs are now fulfilled by our computer screens as the sun seems to have definitively disappeared behind a continuous stream of low-hanging clouds. But, all is not lost, the warm rays of c... 
<p></p> <p><img src="/blog/assets/img/forgive_me_father.png" alt="" /></p> <p>It is a rainy end of January in Paris, morale is getting soggier by the day, and the bulk of our light exposure needs are now fulfilled by our computer screens as the sun seems to have definitively disappeared behind a continuous stream of low-hanging clouds. But, all is not lost, the warm rays of comradeship pierce through the bleak afternoons, and our joyful <a href="https://ocamlpro.com/team">party</a> of adventurers once again embarked on an adventure of curiosity and rabbit-hole depth-first-searching.</p> <p>Last week's quest led us to a treasure coveted by a mere handful of enlightened connoisseurs, but a treasure of tremendous nerdy-beauty, known to the academics as &quot;Sub-single-instruction Peano to machine integer conversion&quot; and to the locals as &quot;How to count how many nested <code>Some</code> there are very very fast by leveraging druidic knowledge about unspecified, undocumented, and unstable behavior of the Rust compiler&quot;.</p> <h1>Our quest begins</h1> <p>Our whole quest started when we wanted to learn more about discriminant elision. Discriminant elision in Rust is part of what makes it practical to use <code>Option&lt;&amp;T&gt;</code> in place of <code>*const T</code>. More precisely it is what allows <code>Option&lt;&amp;T&gt;</code> to fit in as much memory as <code>*const T</code>, and not twice as much. To understand why, let's consider an <code>Option&lt;u64&gt;</code>. An <code>u64</code> is 8 bytes in size. An <code>Option&lt;u64&gt;</code> should have at least one more bit, to indicate whether it is a <code>None</code>, or a <code>Some</code>. But bits are not very practical to deal with for computers, and hence this <em>discriminant</em> value -- indicating which of the two variants (<code>Some</code> or <code>None</code>) the value is -- should take up at least one byte. Because of <a href="https://doc.rust-lang.org/reference/type-layout.html#the-default-representation">alignment requirements</a> (and because the size is always a multiple of the alignment) it actually ends up taking 8 bytes as well, so that the whole <code>Option&lt;u64&gt;</code> occupies twice the size of the wrapped <code>u64</code>.</p> <p>In languages like C, it is very common to pass around pointers, and give them a specific meaning if they are null. Typically, a function like <a href="https://linux.die.net/man/3/lfind"><code>lfind</code></a> which searches for an element in a array will return a pointer to the matching element, and this pointer will be null if no such element was found. In Rust however fallibility is expected to be encoded in the type system. Hence, functions like <a href="https://doc.rust-lang.org/core/iter/trait.Iterator.html#method.find"><code>find</code></a> returns a reference, wrapped in a <code>Option</code>. Because this kind of API is so ubiquitous, it would have been a real hindrance to Rust adoption if it took twice as much space as the C version.</p> <p>This is why discriminant elision exists. In our <code>Option&lt;&amp;T&gt;</code> example Rust can leverage the same logic as C: <code>&amp;T</code> references in Rust are guaranteed to be -- among other things -- non-null. Hence Rust can choose to encode the <code>None</code> variant as the null value of the variable. Transparently to the user, our <code>Option&lt;&amp;T&gt;</code> now fits on 8 bytes, the same size as a simple <code>&amp;T</code>. 
But Rust discriminant elision mechanism goes beyond <code>Option&lt;&amp;T&gt;</code> and works for any general type if:</p> <ol> <li>The option-like value has one fieldless variant and one single-field variant </li> <li>The wrapped type has so-called niche values, that is values that are statically known to never be valid for said type. </li> </ol> <p>Discriminant elision remains under-specified, but more information can be found in the <a href="https://rust-lang.github.io/unsafe-code-guidelines/layout/enums.html#discriminant-elision-on-option-like-enums">FFI guidelines</a>. Note that other unspecified situations seem to benefit from niche optimization (e.g. <a href="https://github.com/rust-lang/rust/pull/94075/">PR#94075</a>).</p> <h1>Too many options</h1> <p>Out of curiosity, we wanted to investigate how the Rust compiler represents a series of nested <code>Option</code>. It turns out that up to 255 nested options can be stored into a byte, which is also the theoretical limit. Because this mechanism is not limited to <code>Option</code>, we can use it with (value-level) <a href="https://en.wikipedia.org/wiki/Peano_axioms">Peano integers</a>. Peano integers are a theoretical encoding of integer in &quot;unary base&quot;, but it is enough for this post to consider them a fun little gimmick. If you want to go further, know that Peano integers are more often used at the type-level, to try to emulate type-level arithmetic.</p> <p>In our case, we are mostly interested in Peano-integers at the value level. We define them as follows:</p> <pre><code class="language-rust">#![recursion_limit = &quot;512&quot;] #![allow(dead_code)] /// An empty enum, a type without inhabitants. /// Cf: https://en.wikipedia.org/wiki/Bottom_type enum Null {} /// PeanoEncoder&lt;Null&gt; is a Peano-type able to represent integers up to 0. /// If T is a Peano-type able to represent integers up to n /// PeanoEncoder&lt;T&gt; is a Peano-type able to represent integers up to n+1 #[derive(Debug)] enum PeanoEncoder&lt;T&gt; { Successor(T), Zero, } macro_rules! times2 { ($peano_2x:ident, $peano_x:ident ) =&gt; { type $peano_2x&lt;T&gt; = $peano_x&lt;$peano_x&lt;T&gt;&gt;; }; } times2!(PeanoEncoder2, PeanoEncoder); times2!(PeanoEncoder4, PeanoEncoder2); times2!(PeanoEncoder8, PeanoEncoder4); times2!(PeanoEncoder16, PeanoEncoder8); times2!(PeanoEncoder32, PeanoEncoder16); times2!(PeanoEncoder64, PeanoEncoder32); times2!(PeanoEncoder128, PeanoEncoder64); times2!(PeanoEncoder256, PeanoEncoder128); type Peano0 = PeanoEncoder&lt;Null&gt;; type Peano255 = PeanoEncoder256&lt;Null&gt;; </code></pre> <p>Note that we cannot simply go for</p> <pre><code class="language-rust">enum Peano { Succesor(Peano), Zero, } </code></pre> <p>like in <a href="https://wiki.haskell.org/Peano_numbers">Haskell</a> or OCaml because without indirection the type has <a href="https://doc.rust-lang.org/error_codes/E0072.html">infinite size</a>, and adding indirection would break discriminant elision. 
What we really have is that we are actually using a <em>type-level</em> Peano-encoding of integers to create a type <code>Peano256</code> that contains <em>value-level</em> Peano-encoding of integers up to 255, as a byte would.</p> <p>We can define the typical recursive pattern matching based way of converting our Peano integer to a machine integer (a byte).</p> <pre><code class="language-rust">trait IntoU8 { fn into_u8(self) -&gt; u8; } impl IntoU8 for Null { fn into_u8(self) -&gt; u8 { match self {} } } impl&lt;T: IntoU8&gt; IntoU8 for PeanoEncoder&lt;T&gt; { fn into_u8(self) -&gt; u8 { match self { PeanoEncoder::Successor(x) =&gt; 1 + x.into_u8(), PeanoEncoder::Zero =&gt; 0, } } } </code></pre> <p>Here, according to <a href="https://godbolt.org/z/hfdKdxe19">godbolt</a>, <code>Peano255::into_u8</code> gets compiled to more than 900 lines of assembly, which resembles a binary decision tree with jump-tables at the leaves.</p> <p>However, we can inspect a bit how rustc represents a few values:</p> <pre><code class="language-rust">println!(&quot;Size of Peano255: {} byte&quot;, std::mem::size_of::&lt;Peano255&gt;()); for x in [ Peano255::Zero, Peano255::Successor(PeanoEncoder::Zero), Peano255::Successor(PeanoEncoder::Successor(PeanoEncoder::Zero)), ] { println!(&quot;Machine representation of {:?}: {}&quot;, x, unsafe { std::mem::transmute::&lt;_, u8&gt;(x) }) } </code></pre> <p>which gives</p> <pre><code>Size of Peano255: 1 byte Machine representation of Zero: 255 Machine representation of Successor(Zero): 254 Machine representation of Successor(Successor(Zero)): 253 </code></pre> <p>A pattern seems to emerge. Rustc chooses to represent <code>Peano255::Zero</code> as 255, and each successor as one less.</p> <p>As a brief detour, let's see what happens for <code>PeanoN</code> with other values of N.</p> <pre><code class="language-rust">let x = Peano1::Zero; println!(&quot;Machine representation of Peano1::{:?}: {}&quot;, x, unsafe { std::mem::transmute::&lt;_, u8&gt;(x) }); for x in [ Peano2::Successor(PeanoEncoder::Zero), Peano2::Zero, ] { println!(&quot;Machine representation of Peano2::{:?}: {}&quot;, x, unsafe { std::mem::transmute::&lt;_, u8&gt;(x) }) } </code></pre> <p>gives</p> <pre><code>Machine representation of Peano1::Zero: 1 Machine representation of Peano2::Successor(Zero): 1 Machine representation of Peano2::Zero: 2 </code></pre> <p>Notice that the representation of Zero is not the same for each <code>PeanoN</code>. What we actually have -- and what is key here -- is that the representation for <code>x</code> of type <code>PeanoN</code> is the same as the representation of <code>Succesor(x)</code> of type <code>PeanoEncoder&lt;PeanoN&gt;</code>, which implies that the machine representation of an integer <code>k</code> in the type <code>PeanoN</code> is <code>n-k</code>.</p> <p>That detour being concluded, we refocus on <code>Peano255</code> for which we can write a very efficient conversion function</p> <pre><code class="language-rust">impl Peano255 { pub fn transmute_u8(x: u8) -&gt; Self { unsafe { std::mem::transmute(u8::MAX - x) } } } </code></pre> <p>Note that this function mere existence is very wrong and a sinful abomination to the eye of anything that is holy and maintainable. But provided you run the same compiler version as me on the very same architecture, you may be ok using it. 
Please don't use it.</p> <p>In any case <code>transmute_u8</code> gets compiled to</p> <pre><code>movl %edi, %eax notb %al retq </code></pre> <p>that is a simple function that applies a binary not to its argument register. And in most use cases, this function would actually be inlined and combined with operations above, making it run in less than one processor operation!</p> <p>And because 255 is so small, we can exhaustively check that the behavior is correct for all values! Take that formal methods!</p> <pre><code class="language-rust">for i in 0_u8..=u8::MAX { let x = Peano255::transmute_u8(i); if i % 8 == 0 { print!(&quot;{:3} &quot;, i) } else if i % 8 == 4 { print!(&quot; &quot;) } let c = if x.into_u8() == i { '✓' } else { '✗' }; print!(&quot;{}&quot;, c); if i % 8 == 7 { println!() } } </code></pre> <pre><code> 0 ✓✓✓✓ ✓✓✓✓ 8 ✓✓✓✓ ✓✓✓✓ 16 ✓✓✓✓ ✓✓✓✓ 24 ✓✓✓✓ ✓✓✓✓ 32 ✓✓✓✓ ✓✓✓✓ 40 ✓✓✓✓ ✓✓✓✓ 48 ✓✓✓✓ ✓✓✓✓ 56 ✓✓✓✓ ✓✓✓✓ 64 ✓✓✓✓ ✓✓✓✓ 72 ✓✓✓✓ ✓✓✓✓ 80 ✓✓✓✓ ✓✓✓✓ 88 ✓✓✓✓ ✓✓✓✓ 96 ✓✓✓✓ ✓✓✓✓ 104 ✓✓✓✓ ✓✓✓✓ 112 ✓✓✓✓ ✓✓✓✓ 120 ✓✓✓✓ ✓✓✓✓ 128 ✓✓✓✓ ✓✓✓✓ 136 ✓✓✓✓ ✓✓✓✓ 144 ✓✓✓✓ ✓✓✓✓ 152 ✓✓✓✓ ✓✓✓✓ 160 ✓✓✓✓ ✓✓✓✓ 168 ✓✓✓✓ ✓✓✓✓ 176 ✓✓✓✓ ✓✓✓✓ 184 ✓✓✓✓ ✓✓✓✓ 192 ✓✓✓✓ ✓✓✓✓ 200 ✓✓✓✓ ✓✓✓✓ 208 ✓✓✓✓ ✓✓✓✓ 216 ✓✓✓✓ ✓✓✓✓ 224 ✓✓✓✓ ✓✓✓✓ 232 ✓✓✓✓ ✓✓✓✓ 240 ✓✓✓✓ ✓✓✓✓ 248 ✓✓✓✓ ✓✓✓✓ </code></pre> <p>Isn't computer science fun?</p> <p><em>Note:</em> The code for this blog post is available <a href="https://github.com/OCamlPro/PeaNoOp">here</a>.</p> Statically guaranteeing security properties on Java bytecode: Paper presentation at VMCAI 23 https://ocamlpro.com/blog/2023_01_12_vmcai_popl 2023-01-12T13:48:57Z 2023-01-12T13:48:57Z Nicolas Berthier We are excited to announce that Nicolas will present a paper at the International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI) the 16th and 17th of January. This year, VMCAI is co-located with the Symposium on Principles of Programming Languages (POPL) conference, ... <p></p> <p>We are excited to announce that Nicolas will present a paper at the <a href="https://popl23.sigplan.org/home/VMCAI-2023">International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI)</a> the 16th and 17th of January.</p> <p>This year, VMCAI is co-located with the <a href="https://popl23.sigplan.org/">Symposium on Principles of Programming Languages (POPL)</a> conference, which, as its name suggests, is a flagship conference in the Programming Languages domain.</p> <p>What's more, for its 50th anniversary edition, POPL will return back where its first edition took place: Boston! It is thus in the vicinity of the MIT and Harvard that we will meet with prominent figures of computer science research.</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/popl2023.jpg"> <img alt="This paper will be presented at VMCAI'2023, colocated with POPL'2023 at Boston!" src="/blog/assets/img/popl2023.jpg"/> </a> <div class="caption"> This paper will be presented at VMCAI'2023, colocated with POPL'2023 at Boston! </div> </p> </div> </p> <!-- ## A sound technique to statically guarantee non-interference --> <h2>A sound technique to statically guarantee security properties on Java bytecode</h2> <p>Nicolas will be presenting a novel static program analysis technique dedicated to the discovery of information flows in Java bytecode. 
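</p> <p>To make the notion of an information flow concrete, here is a tiny sketch of our own (written in OCaml rather than Java, since the idea is language-agnostic): even though the secret value is never printed directly, one bit of information about it reaches a public channel through the branch, an <em>implicit</em> flow that this kind of analysis is designed to report.</p> <pre><code class="language-ocaml">(* Our own illustration of an implicit information flow: [secret] is
   confidential, while [print_endline] writes to a public channel. The
   public output depends on the secret even though the secret itself is
   never printed. *)
let check_pin (secret : int) (guess : int) : unit =
  if guess = secret then
    print_endline &quot;access granted&quot;
  else
    print_endline &quot;access denied&quot;

let () = check_pin 1234 42
</code></pre>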
By automatically discovering such flows, the new technique allows developers and users of Java libraries to assess key security properties on the software they run.</p> <p>Two prominent examples of such properties are <em>confidentiality</em> (stating that no single bit of secret information may be inadvertently revealed by the software), and its dual, <em>integrity</em> (stating that no single bit of trusted information may be tampered with via untrusted data).</p> <p>The technique is proven <em>sound</em> (i.e. it cannot miss a flow of information), and achieves <em>state-of-the-art precision</em> (i.e. it does not raise too many false alarms) according to evaluations using the <a href="https://pp.ipd.kit.edu/uploads/publikationen/ifspec18nordsec.pdf">IFSpec benchmark suite</a>.</p> <h2>Try it out!</h2> <p>In addition to being supported by a proof, the technique has also been implemented in a tool called <a href="http://nberth.space/symmaries">Guardies</a>.</p> <p>We believe this static analysis tool will naturally complement the taint tracking and dynamic analysis techniques that are usually employed to assess software security.</p> <h2>Reading more about it</h2> <p>You may already access the full paper <a href="https://arxiv.org/abs/2211.03450">here</a>.</p> <p>Nicolas developed this contribution while working at the University of Liverpool, in collaboration with Narges Khakpour, from the University of Newcastle.</p> Release of ocplib-simplex, version 0.5 https://ocamlpro.com/blog/2022_11_25_ocplib-simplex-0.5 2023-01-05T13:48:57Z 2023-01-05T13:48:57Z Steven de Oliveira Pierre Villemot Hichem Rami Ait El Hara Guillaume Bury Last November, we released version 0.5 of ocplib-simplex, a generic library implementing the Simplex Algorithm in OCaml. It is a key component of the Alt-Ergo automatic theorem prover that we keep developing at OCamlPro. The Simplex Algorithm. What Changed in 0.5? The simplex algorithm. The S... <p></p> <p>Last November, we released <a href="https://opam.ocaml.org/packages/ocplib-simplex/">version 0.5</a> of <a href="https://github.com/OCamlPro/ocplib-simplex">ocplib-simplex</a>, a generic library implementing the <a href="https://en.wikipedia.org/wiki/Simplex_algorithm">Simplex Algorithm</a> in OCaml. It is a key component of the <a href="https://alt-ergo.ocamlpro.com">Alt-Ergo</a> automatic theorem prover that we keep developing at OCamlPro.</p> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#simplex">The Simplex Algorithm</a> </li> <li><a href="#changes">What Changed in 0.5?</a> </div> </li> </ul> <p> <div class="figure"> <p> <a href="/blog/assets/img/ocplib-simplex.jpg"> <img alt="Try ocplib-simplex before implementing your own library!" src="/blog/assets/img/ocplib-simplex.jpg"/> </a> <div class="caption"> Try ocplib-simplex before implementing your own library! </div> </p> </div> </p> <h2> <a id="simplex" class="anchor"></a><a class="anchor-link" href="#simplex">The simplex algorithm</a> </h2> <p>The <a href="https://en.wikipedia.org/wiki/Simplex_algorithm">Simplex Algorithm</a> is well known among linear optimization enthusiasts. Let's say you own a workshop producing two kinds of chairs: the first kind is cheap, you make a small profit on each one, but they are quick to produce; the second kind is fancier, you make a bigger profit on it, but it takes a lot of time to build. You have a limited amount of wood and time. How many cheap and fancy chairs should you produce to optimize your profits?</p> <p>You can represent this problem with a set of mathematical constraints (more precisely, linear inequalities), which is exactly the scope of the simplex algorithm. Given a set of linear inequalities, it computes a solution maximizing a given value (in our example, the total profit). If you are interested in the details of the algorithm, you should definitely watch <a href="https://www.youtube.com/watch?v=jh_kkR6m8H8">this video</a>.</p>
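<p>To make the chair example concrete, here is one possible formalisation as a linear program. The profits and resource budgets below are made-up numbers, chosen only for illustration:</p> <pre><code class="language-latex">\begin{gather*}
\text{maximize } 10\,x_{\mathrm{cheap}} + 25\,x_{\mathrm{fancy}} \quad \text{(total profit)}\\
\text{subject to } x_{\mathrm{cheap}} + 3\,x_{\mathrm{fancy}} \le 40 \quad \text{(available wood)}\\
2\,x_{\mathrm{cheap}} + 5\,x_{\mathrm{fancy}} \le 60 \quad \text{(available time)}\\
x_{\mathrm{cheap}} \ge 0, \quad x_{\mathrm{fancy}} \ge 0
\end{gather*}
</code></pre> <p>The simplex algorithm walks along the vertices of the region delimited by these inequalities until it reaches one where the profit is maximal.</p>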
<p>The simplex algorithm is known to have a high worst-case <a href="https://en.wikipedia.org/wiki/Computational_complexity">complexity</a>: while the base algorithm is exponential in the worst case, it is generally very efficient in practice.</p> <h2> <a id="changes" class="anchor"></a><a class="anchor-link" href="#changes">What Changed in 0.5?</a> </h2> <p>Among the main changes in this new version of <a href="https://github.com/OCamlPro/ocplib-simplex">ocplib-simplex</a>:</p> <ul> <li> <p>The library's API is more generic and easier to use (see the <a href="https://github.com/OCamlPro/ocplib-simplex/blob/v0.5/tests/standalone_minimal.ml">System Solving Example</a> or the <a href="https://github.com/OCamlPro/ocplib-simplex/blob/v0.5/tests/standalone_minimal_maximization.ml">Linear Optimization Example</a>);</p> </li> <li> <p>All the modules are better documented in their <code>.mli</code> interfaces (see <a href="https://github.com/OCamlPro/ocplib-simplex/blob/v0.5/src/coreSig.mli">coreSig.mli</a> for example);</p> </li> <li> <p>The build system has been switched to <code>dune</code>.</p> </li> </ul> <p>We hope this simplification work will help you integrate the library more easily into your projects!</p> <p>If you want to follow this project, report an issue or contribute, you can find it on <a href="https://github.com/OCamlPro/ocplib-simplex">GitHub</a>.</p> <p>Please do not hesitate to contact us at OCamlPro: <a href="mailto:alt-ergo@ocamlpro.com">alt-ergo@ocamlpro.com</a>.</p> The Growth of the OCaml Distribution https://ocamlpro.com/blog/2023_01_02_ocaml_distribution 2023-01-02T13:48:57Z 2023-01-02T13:48:57Z Fabrice Le Fessant We recently worked on a project to build a binary installer for OCaml, inspired by RustUp for Rust. We had to build binary packages of the distribution for every OCaml version since 4.02.0, and we were surprised to discover that their (compressed) size grew from 18 MB to about 200 MB. This post gi... <p></p> <p>We recently worked on a project to build a binary installer for OCaml, inspired by <a href="https://rustup.rs">RustUp</a> for Rust. We had to build binary packages of the distribution for every OCaml version since 4.02.0, and we were surprised to discover that their (compressed) size grew from 18 MB to about 200 MB.
This post gives a survey of our findings.</p> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <ul> <li><a href="#introduction">Introduction</a> </li> <li><a href="#trends">General Trends</a> </li> <li><a href="#changes">Causes and Consequences</a> </li> <li><a href="#distribution">Inside the OCaml Installation</a> </li> <li><a href="#conclusion">Conclusion</a> </div> </li> </ul> <h2> <a id="introduction" class="anchor"></a><a class="anchor-link" href="#introduction">Introduction</a> </h2> <p>One of the strengths of Rust is the ease with which it gets installed on a new computer in user space: with a simple command copy-pasted from a website into a terminal, you get everything you need to start building Rust projects in a few seconds. <a href="https://rustup.rs">Rustup</a>, and a set of prebuilt packages for many architectures, is the project that makes all this possible.</p> <p>OCaml, on the other hand, is a bit harder to install: you need to find in the documentation the proper way to install <code>opam</code> for your operating system, figure out how to create a switch with a compiler version, and then wait for the compiler to be built and installed. This usually takes much more time.</p> <p>As a winter holiday project, we worked on a tool similar to Rustup, providing binary packages for most OCaml distribution versions. It builds upon our experience of <code>opam</code> and <a href="https://ocamlpro.github.io/opam-bin/"><code>opam-bin</code></a>, our plugin to build and share binary packages for <code>opam</code>.</p> <p>While building binary packages for most versions of the OCaml distribution, we were surprised to discover that the size of the binary archive grew from 18 MB to about 200 MB in 10 years. Though this is not a problem on high-bandwidth connections, it might become one far from big cities (and fortunately, we designed our tool to be able to install from sources in such a case, trading a smaller download for a longer installation).</p> <p>We decided it was worth investigating this growth in more detail, and this post is about our early findings.</p> <h2> <a id="trends" class="anchor"></a><a class="anchor-link" href="#trends">General Trends</a> </h2> <p> <div class="figure"> <p> <a href="/blog/assets/img/ocaml-binary-growth-2022.svg"> <img alt="In 10 years, the OCaml Distribution binary archive grew by a factor 10, from 18 MB to 198 MB, corresponding to a growth from 73 MB to 522 MB after installation, and from 748 to 2433 installed files." src="/blog/assets/img/ocaml-binary-growth-2022.svg"/> </a> <div class="caption"> In 10 years, the OCaml Distribution binary archive grew by a factor 10, from 18 MB to 198 MB, corresponding to a growth from 73 MB to 522 MB after installation, and from 748 to 2433 installed files. </div> </p> </div> </p> <p>So, let's have a look at the evolution of the size of the binary OCaml distribution in more detail. Between version 4.02.0 (Aug 2014) and version 5.0.0 (Dec 2022):</p> <ul> <li> <p>The size of the compressed binary archive grew from 18 MB to 198 MB</p> </li> <li> <p>The size of the installed binary distribution grew from 73 MB to 522 MB</p> </li> <li> <p>The number of installed files grew from 748 to 2433</p> </li> </ul> <p> <div class="figure"> <p> <a href="/blog/assets/img/ocaml-sources-growth-2022.svg"> <img alt="The OCaml Distribution source archive was much more stable, with a global growth smaller than 2."
src="/blog/assets/img/ocaml-sources-growth-2022.svg"/> </a> <div class="caption"> The OCaml Distribution source archive was much more stable, with a global growth smaller than 2. </div> </p> </div> </p> <p>On the other hand, the source distribution itself was much more stable:</p> <ul> <li> <p>The size of the compressed source archive grew only from 3 MB to 5 MB</p> </li> <li> <p>The size of the sources grew from 14 MB to 26 MB</p> </li> <li> <p>The number of source files grew from 2355 to 4084</p> </li> </ul> <p>For our project, this evolution makes the source distribution a good alternative to binary distributions for low-bandwidth settings, especially as OCaml is much faster than Rust at building itself. For the record, version 5.0.0 takes about 1 minute to build on a 16-core 64GB-RAM computer.</p> <p>Interestingly, if we plot the total size of the binary distribution together with the total size of only those files that were already present in the previous version, we can see that the growth is mostly caused by the increase in size of these existing files, and not by the addition of new files:</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/ocaml-binary-size-2022.svg"> <img alt="The growth is mostly caused by the increase in size of existing files, and not by the addition of new files." src="/blog/assets/img/ocaml-binary-size-2022.svg"/> </a> <div class="caption"> The growth is mostly caused by the increase in size of existing files, and not by the addition of new files. </div> </p> </div> </p> <h2> <a id="changes" class="anchor"></a><a class="anchor-link" href="#changes">Causes and Consequences</a> </h2> <p>We tried to identify the main causes of this growth: it is linear most of the time, with sharp increases (and decreases) at some versions. We plotted the difference in size for the total size, the new files, the deleted files and the same files, i.e. the files that made it from one version to the next:</p> <p> <div class="figure"> <p> <a href="/blog/assets/img/ocaml-binary-size-diff-2022.svg"> <img alt="The difference in size between two versions is not big most of the time, but some versions exhibit huge increases or decreases." src="/blog/assets/img/ocaml-binary-size-diff-2022.svg"/> </a> <div class="caption"> The difference in size between two versions is not big most of the time, but some versions exhibit huge increases or decreases. </div> </p> </div> </p> <p>Let's have a look at the versions with the highest increases in size:</p> <ul> <li> <p>+86 MB for 4.08.0: though there are a lot of new files (+307), they only account for 3 MB of additional storage. Most of the difference comes from an increase in size of both compiler libraries (probably in relation to the use of Menhir for parsing) and of some binaries.
In particular:</p> <ul> <li>+13 MB for <code>bin/ocamlobjinfo.byte</code> (2_386_046 -&gt; 16_907_776) </li> <li>+12 MB for <code>bin/ocamldep.byte</code> (2_199_409 -&gt; 15_541_022) </li> <li>+6 MB for <code>bin/ocamldebug</code> (1_092_173 -&gt; 7_671_300) </li> <li>+6 MB for <code>bin/ocamlprof.byte</code> (630_989 -&gt; 7_043_717) </li> <li>+6 MB for <code>lib/ocaml/compiler-libs/parser.cmt</code> (2_237_513 -&gt; 9_209_256) </li> </ul> </li> <li> <p>+74 MB for 4.03.0: again, though there are a lot of new files (+475, mostly in <code>compiler-libs</code>), they only account for 11 MB of additional storage, and a large part is compensated by the removal of <code>ocamlbuild</code> from the distribution, causing a gain of 7 MB.</p> <p>Indeed, most of the increase in size is probably caused by compilation with debug information (option <code>-g</code>), which considerably increases the size of all executables, for example:</p> <ul> <li>+12 MB for <code>bin/ocamlopt</code> (2_016_697 -&gt; 15_046_969) </li> <li>+9 MB for <code>bin/ocaml</code> (1_833_357 -&gt; 11_574_555) </li> <li>+8 MB for <code>bin/ocamlc</code> (1_748_717 -&gt; 11_070_933) </li> <li>+8 MB for <code>lib/ocaml/expunge</code> (1_662_786 -&gt; 10_672_805) </li> <li>+7 MB for <code>lib/ocaml/compiler-libs/ocamlcommon.cma</code> (1_713_947 -&gt; 8_948_807) </li> </ul> </li> <li> <p>+72 MB for 4.11.0: again, the increase almost only comes from existing files. For example:</p> <ul> <li>+16 MB for <code>bin/ocamldebug</code> (8_170_424 -&gt; 26_451_049) </li> <li>+6 MB for <code>bin/ocamlopt.byte</code> (21_895_130 -&gt; 28_354_131) </li> <li>+5 MB for <code>lib/ocaml/extract_crc</code> (659_967 -&gt; 6_203_791) </li> <li>+5 MB for <code>bin/ocaml</code> (17_074_577 -&gt; 22_388_774) </li> <li>+5 MB for <code>bin/ocamlobjinfo.byte</code> (17_224_939 -&gt; 22_523_686) </li> </ul> <p>Again, the increase is probably related to adding more debug information in the executables (there is a specific PR on <code>ocamldebug</code> for that, and for all executables more debug info is available for each allocation);</p> </li> <li> <p>+48 MB for 5.0.0: a big difference in storage is not surprising for a change in a major version, but actually half of the difference just comes from an increase of 23 MB of <code>bin/ocamldoc</code>;</p> </li> <li> <p>+34 MB for 4.02.3: this one is worth noting, as it comes at a minor version change. The increase is mostly caused by the addition of 402 new files, corresponding to <code>cmt/cmti</code> files for the <code>stdlib</code> and <code>compiler-libs</code>.</p> </li> </ul> <p>We could of course study some other versions, but understanding the root causes of most of these changes would require going deeper than we can in such a blog post. Yet, these figures give experts good hints on which versions to start investigating.</p> <h2> <a id="distribution" class="anchor"></a><a class="anchor-link" href="#distribution">Inside the OCaml Installation</a> </h2> <p>Before concluding, it might also be worth studying which parts of the OCaml installation take most of the space. 5.0.0 is a good candidate for such a study, as libraries have been moved to separate directories, instead of all being directly stored in <code>lib/ocaml</code>.</p>
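<p>As an aside, here is a rough OCaml sketch of how such a per-directory breakdown can be computed: the switch prefix below is just a placeholder, and the script (which needs the <code>unix</code> library) simply walks the tree and sums file sizes.</p> <pre><code class="language-ocaml">(* Rough sketch: compute a size breakdown of an OCaml installation by
   summing file sizes under each top-level directory of its prefix.
   The prefix is a placeholder; compile with the unix library. *)
let rec size path =
  let st = Unix.lstat path in
  match st.Unix.st_kind with
  | Unix.S_REG -&gt; st.Unix.st_size
  | Unix.S_DIR -&gt;
    Array.fold_left
      (fun acc name -&gt; acc + size (Filename.concat path name))
      0 (Sys.readdir path)
  | _ -&gt; 0

let () =
  let prefix = &quot;/path/to/opam/switch&quot; in
  Sys.readdir prefix
  |&gt; Array.iter (fun dir -&gt;
         Printf.printf &quot;%-15s %4d MB\n&quot; dir
           (size (Filename.concat prefix dir) / 1_000_000))
</code></pre>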
<p>Here is a decomposition of the OCaml Installation:</p> <ul> <li>Total: 529 MB <ul> <li><code>share</code>: 1 MB </li> <li><code>man</code>: 4 MB </li> <li><code>bin</code>: 303 MB </li> <li><code>lib/ocaml</code>: 223 MB <ul> <li><code>compiler-libs</code>: 134 MB </li> <li><code>expunge</code>: 20 MB </li> </ul> </li> </ul> </li> </ul> <p>As we can see, a large majority of the space is used by executables. For example, all of the following are above 10 MB:</p> <ul> <li>28 MB <code>ocamldoc</code> </li> <li>26 MB <code>ocamlopt.byte</code> </li> <li>25 MB <code>ocamldebug</code> </li> <li>21 MB <code>ocamlobjinfo.byte</code>, <code>ocaml</code> </li> <li>20 MB <code>ocamldep.byte</code>, <code>ocamlc.byte</code> </li> <li>19 MB <code>ocamldoc.opt</code> </li> <li>18 MB <code>ocamlopt.opt</code> </li> <li>15 MB <code>ocamlobjinfo.opt</code> </li> <li>14 MB <code>ocamldep.opt</code>, <code>ocamlc.opt</code>, <code>ocamlcmt</code> </li> </ul> <p>There are both bytecode and native code executables in this list.</p> <h2> <a id="conclusion" class="anchor"></a><a class="anchor-link" href="#conclusion">Conclusion</a> </h2> <p>Our installer project would benefit from a smaller binary OCaml distribution, but most OCaml users would also benefit from it: after a few years of using OCaml, developers usually end up with huge <code>$HOME/.opam</code> directories, because every <code>opam</code> switch often takes more than 1 GB of space, and the OCaml distribution takes a big part of that. <code>opam-bin</code> partially solves this problem by sharing identical files between several switches (when the <code>--enable-share</code> configuration option has been used).</p> <p>Here is a short list of ideas to test to decrease the size of the binary OCaml distribution:</p> <ul> <li> <p>Use the same executable for multiple programs (<code>ocamlc.opt</code>, <code>ocamlopt.opt</code>, <code>ocamldep.opt</code>, etc.), using the first command argument to select the desired behaviour (a minimal sketch of this multi-call trick is given after this list). Rustup, for example, installs only one binary in <code>$HOME/.cargo/bin</code> for <code>cargo</code>, <code>rustc</code>, <code>rustup</code>, etc., and our tool actually uses the same trick to share a single binary between itself, <code>opam</code>, <code>opam-bin</code>, <code>ocp-indent</code> and <code>drom</code>.</p> </li> <li> <p>Split installed files into separate <code>opam</code> packages, of which only one would be installed as the compiler distribution. For example, most <code>cmt</code> files of <code>compiler-libs</code> are not needed by most users; they might only be useful for compiler/tooling developers, and even then only in very rare cases. They could be installed as another <code>opam</code> package.</p> </li> <li> <p>Remove the <code>-linkall</code> flag on <code>ocamlcommon.cm[x]a</code> libraries. In general, such a flag should only be set when building an executable that is expected to use plugins, because otherwise the executable will contain all the modules of the library, even the ones that are not useful for its specific purpose.</p> </li> </ul>
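<p>For the first idea, here is a minimal OCaml sketch of what such a multi-call binary could look like; the tool names and the dispatch logic are purely illustrative and do not reflect the actual layout of the OCaml tools:</p> <pre><code class="language-ocaml">(* Illustrative multi-call binary: one executable selects its behaviour
   from the name it was invoked under, falling back to the first
   command-line argument.  The tool names below are made up. *)
let dispatch tool args =
  match tool with
  | &quot;hello&quot; -&gt; print_endline &quot;Hello!&quot;
  | &quot;echo&quot; -&gt; print_endline (String.concat &quot; &quot; args)
  | other -&gt; Printf.eprintf &quot;unknown tool: %s\n&quot; other; exit 1

let () =
  let name = Filename.basename Sys.argv.(0) in
  let args = List.tl (Array.to_list Sys.argv) in
  match name, args with
  | (&quot;main&quot; | &quot;main.exe&quot;), tool :: rest -&gt; dispatch tool rest
  | _ -&gt; dispatch name args
</code></pre> <p>Installing a single file plus a few symbolic links named after each tool would then be enough to expose several commands, which is where the size savings would come from.</p>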
WebAssembly/Wasm and OCaml https://ocamlpro.com/blog/2022_12_14_wasm_and_ocaml 2022-12-14T13:48:57Z 2022-12-14T13:48:57Z Léo Andrès Pierre Chambart In this first post about WebAssembly (Wasm) and OCaml, we introduce the work we have been doing for quite some time now, though without much publicity, on our participation in the Garbage-Collection (GC) Working Group for Wasm, and two related development projects in OCaml. WebAssembly, a fast and por... <p></p> <div class="figure"> <p> <img alt="" src="/blog/assets/img/dalle_dragon_camel.png"/> <div class="caption"> The Dragon-Camel is raging at the sight of all the challenges we overcome! </div> </p> </div> <p>In this first post about <a href="https://webassembly.org/">WebAssembly</a> (Wasm) and OCaml, we introduce the work we have been doing for quite some time now, though without much publicity, on our participation in the Garbage-Collection (GC) Working Group for Wasm, and two related development projects in OCaml.</p> <h2>WebAssembly, a fast and portable bytecode</h2> <blockquote> <p>WebAssembly is a low-level, binary format that allows compiled code to run efficiently in the browser. Its roadmap is decided by Working Groups from multiple organizations and companies, including Microsoft, Google, and Mozilla. These groups meet regularly to discuss and plan the development of WebAssembly, working with the broader community of developers, academics, and other interested parties to gather feedback and ideas for the future of WebAssembly.</p> </blockquote> <p>There are multiple projects in OCaml related to Wasm, notably <a href="https://github.com/remixlabs/wasicaml">Wasicaml</a>, a production-ready port of the OCaml bytecode interpreter to Wasm. However, these projects don't tackle the domain we would like to address, and for good reasons: they target the <strong>existing</strong> version of Wasm, which is basically a very simple programming language with no data structures, but with access to a large memory array. Almost anything can of course be compiled to something like that, but there is a big restriction: the resulting program can interact with the outside world only through the aforementioned memory buffer. This is perfectly fine if you write Command-Line Interface (CLI) tools, or workers to be deployed in a Content Delivery Network (CDN). However, this kind of interaction can become quite tedious if you need to deal with abstract objects provided by your environment, for example DOM objects in a browser to manipulate webpages. In such cases, you will need to write some wrapper access functions in JavaScript (or OCaml with <code>js_of_ocaml</code> of course), and you will have to be very careful about the lifetime of those objects to avoid memory leaks.</p> <p>Hence the shiny new proposals to extend Wasm with various useful features that can be very convenient for OCaml. In particular, three extensions crucially matter to us, functional programmers: the
In particular, three extensions crucially matter to us, functional programmers: the <a href="https://github.com/WebAssembly/gc/blob/main/proposals/gc/MVP.md">Garbage Collection</a>, <a href="https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md">Exceptions</a> and <a href="https://github.com/WebAssembly/tail-call/blob/main/proposals/tail-call/Overview.md">Tail-Call</a> proposals.</p> <h2>Our involvement in the GC-related Working Group</h2> <p>The Wasm committee has already worked on these proposals for a few years, and the Exceptions and Tail-Call proposals are now quite satisfying. However, this is not yet the case for the GC proposal. Indeed, finding a good API for a GC that is compatible with all the languages in the wild, that can be implemented efficiently, and can be used to run a program you don't trust, is all but an easy task. Multiple attempts by strong teams, for different virtual machines, have exposed limitations of past proposals. But, we must now admit that the current proposal has reached a state where it is quite impressive, being both simple <strong>and</strong> generic.</p> <p>The proposal is now getting close to a feature freeze status. Thanks to the hard work of many people on the committee, including us, the particularities of functional typed languages were not forgotten in the design, and we are convinced that there should be no problem for OCaml. Now is the time to test it for real!</p> <h2>Targetting Wasm from the OCaml Compiler</h2> <p>Adding a brand new backend to a compiler to target something that is quite different from your usual assembly can be a huge work, and only a few language developers actively work on making a prototype for Wasm+GC. Yet, we think that it is important for the committee, to have as many examples as possible to validate the proposal and move it to the next step.</p> <p>That's the reason why we decided to contribute to the proposal, by prototyping a backend for Wasm to the OCaml compiler.</p> <h2>Our experimental Wasm interpreter in OCaml</h2> <p>In parallel, we are also working on the development of our own Wasm Virtual Machine in OCaml, to be able to easily experiment both on the OCaml side and Wasm side, while waiting for most official Wasm VM to fully implement the new proposals.</p> <p>These experimental projects and related discussions are very important design steps, although obviously far from production-ready status.</p> <p>As our current work focuses on OCaml 4.14, effect handlers are left for future work. The current <a href="https://github.com/WebAssembly/stack-switching/blob/main/proposals/stack-switching/Overview.md">proposal</a> that would make it possible to compile effect handlers to Wasm nicely is still in its earlier stages. We hope to be able to prototype it too on our Wasm VM.</p> <p>Note that we are looking for sponsors to fund this work. If supporting Wasm in OCaml may impact your business, you can contact us to discuss how we can use your help!</p> <p>Our next blog post in January will provide more technical details on our two prototyping efforts.</p> Alt-Ergo: the SMT solver with model generation https://ocamlpro.com/blog/2022_11_16_alt-ergo-models 2022-11-16T13:48:57Z 2022-11-16T13:48:57Z Steven de Oliveira Pierre Villemot Hichem Rami Ait El Hara Guillaume Bury The Alt-Ergo automatic theorem prover developed at OCamlPro has just been released with a major update : counterexample model can now be generated. 
This is now available on the next branch, and will officially be part of the 2.5.0 release, coming this year ! Alt-Ergo at a Glance Alt-Ergo is an open ... <p>The Alt-Ergo automatic theorem prover developed at OCamlPro has just been released with a major update : counterexample model can now be generated. This is now available on the next branch, and will officially be part of the 2.5.0 release, coming this year !</p> <h3>Alt-Ergo at a Glance</h3> <p><a href="https://alt-ergo.ocamlpro.com">Alt-Ergo</a> is an open source automatic theorem prover based on the <a href="https://en.wikipedia.org/wiki/Satisfiability_Modulo_Theories">SMT</a> technology. It was born at the <a href="https://www.lri.fr">Laboratoire de Recherche en Informatique</a>, <a href="https://www.inria.fr/centre/saclay">Inria Saclay Ile-de-France</a> and <a href="https://www.cnrs.fr/index.php">CNRS</a> in 2006 and has been maintained and developed by OCamlPro since 2013.</p> <p></p> <p>It is capable of reasoning in a combination of several built-in theories such as:</p> <ul> <li>uninterpreted equality; </li> <li>integer and rational arithmetic; </li> <li>arrays; </li> <li>records; </li> <li>algebraic data types; </li> <li>bit vectors. </li> </ul> <p>It also is able to deal with commutative and associative operators, quantified formulas and has a polymorphic first-order native input language. Alt-Ergo is written in <a href="https://caml.inria.fr/ocaml/index.fr.html">OCaml</a>. Its core has been formally proved in the <a href="https://coq.inria.fr">Coq proof assistant</a>.</p> <p>Alt-Ergo has been involved in a qualification process (DO-178C) by <a href="http://www.airbus.com">Airbus Industrie</a>. During this process, a qualification kit has been produced. It was composed of a technical document with tool requirements (TR) that gives a precise description of each part of the prover, a companion document (~ 450 pages) of tests, and an instrumented version of the tool with a TR trace mechanism.</p> <h3>Model Generation</h3> <p>When a property is false, generating a counterexample is a key that many state-of-the-art SMT-solvers should include by default. However, this is a complex problem in the first place.</p> <p>The first obstacle is the decidability of the theories manipulated by the SMT solvers. In general, the complexity class (i.e. the classification of algorithmic problems) is between &quot;NP-Hard&quot; (for the linear arithmetic theory on integers for example) and &quot;Undecidable&quot; (for the polynomial arithmetic on integers for example). Then comes the quantified properties, i.e. properties prefixed with <code>forall</code>s and <code>exists</code>, adding an additional layer of complexity and undecidability. Another challenge was the core algorithm behind Alt-Ergo which does not natively support model generation. At last, an implementation of the models have to take care of Alt-Ergo's support of polymorphism.</p> <h3>How to use Model Generation in Alt-Ergo</h3> <p>There are two ways to activate model generation on Alt-Ergo.</p> <ul> <li>Basic usage: simply add the option <code>--model</code> to your command (<code>$ alt-ergo file --model</code>) </li> <li>Advanced usage: three options mainly impact the model generation. <ul> <li> <p><code>--interpretation</code>: sets the model generation strategy. 
It can either be <code>none</code> for no model generation; <code>first</code> for generating the very first interpretation computed only; <code>every</code> for generating a model after each decision and <code>last</code> only generating a model when <code>alt-ergo</code> concludes on the formula satisfiability.</p> </li> <li> <p><code>--sat-solver</code>: only the 'Tableaux-CDCL' sat solver is compatible with the interpretation feature</p> </li> <li> <p><code>--instantiation-heuristic</code>: when set to <code>normal</code>, <code>alt-ergo</code> generates model faster. This is an experimental feature that sometimes generates incorrect models.</p> <p>Example:</p> <p><code>$ alt-ergo file --interpretation every --sat-solver Tableaux-CDCL --instantiation-heuristic auto</code></p> </li> </ul> </li> </ul> <p><em>Warning:</em> only the linear arithmetic and the enum model generation have been tested. Other theories are either not implemented (ADTs) or experimental (risk of crash or unsound models). We are currently still heavily testing the feature, so feel free to join us on <a href="github.com/OcamlPro/alt-ergo">Alt-Ergo's GitHub repository</a> if you have questions or issues with this new feature. Note that the models generated are best-effort models; Alt-Ergo does not answer <code>Sat</code> when it outputs a model. In a future version, we will add a mechanism that automatically checks the model generated.</p> <p>Godspeed!</p> <h3>Acknowledgements</h3> <p>We want to thank David Mentré and Denis Cousineau at <a href="https://www.mitsubishielectric-rce.eu/merce-in-france/">Mitsubishi Electric R&amp;D Center Europe</a> for funding the initial work on counterexample.</p> <p>Note that MERCE has been a Member of the Alt-Ergo Users’ Club for 3 years. This partnership allowed Alt-Ergo to evolve and we hope that more users will join the Club on our journey to make Alt-Ergo a must-have tool. Please do not hesitate to contact the Alt-Ergo team at OCamlPro: <a href="mailto:alt-ergo@ocamlpro.com">alt-ergo@ocamlpro.com</a>.</p> Let's Encrypt Wildcard Certificates Made Easy with Agnos https://ocamlpro.com/blog/2022_10_05_agnos_0.1.0-beta 2022-10-05T13:48:57Z 2022-10-05T13:48:57Z Arthur Carcano OCamlPro It is with great pleasure that we announce the first beta release of Agnos. A former personal project of our new recruit, Arthur, Agnos development is now hosted at and sponsored by OCamlPro's Rust division, Red Iron. A white lamb with a blue padlock and blue stars. He is clearly to be trusted with ... <p></p> <p>It is with great pleasure that we announce the first beta release of <a href="https://github.com/krtab/agnos">Agnos</a>. A former personal project of our new recruit, Arthur, Agnos development is now hosted at and sponsored by OCamlPro's Rust division, <a href="https://red-iron.eu/">Red Iron</a>.</p> <p><img src="/blog/assets/img/agnos-banner.png" alt="A white lamb with a blue padlock and blue stars. He is clearly to be trusted with your certificate needs. A text reads: Agnos, wildcard Let's Encrypt certificates, no DNS-provider API required." /></p> <blockquote> <p><strong>TL;DR:</strong> If you are familiar with ACME providers like Let's Encrypt, DNS-01 and the challenges relating to wildcard certificates, simply know that Agnos touts itself as a single-binary, API-less, provider-agnostic dns-01 client, allowing you to easily obtain wildcard certificates without having to interface with your DNS provider. 
To do so, it offers a user-friendly configuration and answers Let's Encrypt DNS-01 challenges on its own, bypassing the need for API calls to edit DNS zones. You may want to jump to the last <a href="#agnos-as-the-best-of-both-worlds">section</a> of this post, or directly join us on Agnos's <a href="https://github.com/krtab/agnos">github</a>.</p> </blockquote> <p>Agnos was born from the observation that even though wildcard certificates are in many cases more convenient and useful than their fully qualified counterparts, they are not often used in practice. As of today, it is not uncommon to see certificates with multiple <a href="https://en.wikipedia.org/wiki/Subject_Alternative_Name">Subject Alternative Names</a> (SAN) for multiple subdomains, which can become <a href="https://discuss.httparchive.org/t/san-certificates-how-many-alt-names-are-too-many/1867">problematic</a> and weaken infrastructure. While some situations indeed require foregoing wildcard certificates, this choice is too often still a default one.</p> <p>At OCamlPro, we believe that technical difficulties should not stand in the way of optimal decision making, and that compromises should only be made in the face of unsolvable challenges. By releasing this first beta of Agnos, we hope that your feedback will help us build a tool truly useful to the community, and that together we can open a path towards seamless wildcard certificate issuance, leaving the issues and pain points previously encountered as a thing of the past.</p> <p>This blog post describes the different ACME challenges, why DNS provider APIs have so far been hindering DNS-01 adoption, and how Agnos solves this issue. If you are already curious and want to run some code, let's meet on Agnos's <a href="https://github.com/krtab/agnos">github</a>.</p> <h2>Let's Encrypt's mechanism and ACME challenges</h2> <p>The Automatic Certificate Management Environment (ACME) is the protocol behind automated certificate authority services like Let's Encrypt. At its core, this protocol requires the client asking for a certificate to provide evidence that they control a resource by having said resource display some authority-determined token.</p> <p>The easiest way to do so is to serve a file on a web server. For example, serving a file containing the token at <code>my-domain.example</code> would prove that I control the web server that the <strong>fully qualified domain name</strong> <code>my-domain.example</code> is pointing to. This, under normal circumstances, proves that I somewhat control this fully qualified domain. This process is illustrated below.</p> <p>The ACME client initiates the certificate issuance process and is challenged to serve the token via HTTP at the domain address. The ACME client and HTTP server can be, and often are, on the same machine. The token can be quickly provisioned, and the ACME client can ask the ACME server to validate the challenge and issue the certificate.</p> <p><img src="/blog/assets/img/http-01-schema.png" alt="Schematic illustration of the HTTP-01 challenge." /></p> <p>However, demonstrating that one controls an HTTP server pointed to by <code>my-domain.example</code> is not deemed enough by Let's Encrypt to demonstrate <strong>full</strong> control of the <code>my-domain.example</code> domain and all its subdomains. Hence, the user cannot be issued a wildcard certificate through this method.</p> <p>To obtain a wildcard certificate, one must rely on the DNS-01 type of challenge, illustrated below.
The ACME client initiates the certificate issuance process and is challenged to serve the token via a DNS TXT record. Because DNS management is often delegated to a DNS provider, the DNS server is rarely on the same machine, and the token must be provisioned via a call to the DNS provider API, if there is any. Moreover, DNS providers virtually always use multiple servers, and the new record must be propagated to all of them. The ACME client must then wait and check for the propagation to be finished before asking the ACME server to validate the challenge and issue the certificate.</p> <p><img src="/blog/assets/img/dns-01-schema.png" alt="Schematic illustration of the DNS-01 challenge." /></p> <p>The pros and cons of each of these two challenge type are summarized by Let's Encrypt's <a href="https://letsencrypt.org/docs/challenge-types/">documentation</a> as follow:</p> <blockquote> <h4>HTTP-01</h4> <h5>Pros</h5> <ul> <li>It’s easy to automate without extra knowledge about a domain’s configuration. </li> <li>It allows hosting providers to issue certificates for domains CNAMEd to them. </li> <li>It works with off-the-shelf web servers. </li> </ul> <h5>Cons</h5> <ul> <li>It doesn’t work if your ISP blocks port 80 (this is rare, but some residential ISPs do this). </li> <li>Let’s Encrypt doesn’t let you use this challenge to issue wildcard certificates. </li> <li>If you have multiple web servers, you have to make sure the file is available on all of them. </li> </ul> <h4>DNS-01</h4> <h5>Pros</h5> <ul> <li>You can use this challenge to issue certificates containing wildcard domain names. </li> <li>It works well even if you have multiple web servers. </li> </ul> <h5>Cons</h5> <ul> <li>Keeping API credentials on your web server is risky. </li> <li>Your DNS provider might not offer an API. </li> <li>Your DNS API may not provide information on propagation times. </li> </ul> </blockquote> <h2>Agnos as the best of both worlds</h2> <p>By using NS records to delegate the DNS-01 challenge to Agnos itself, we can virtually remove all of DNS-01 cons. Indeed by serving its own DNS answers, Agnos:</p> <ul> <li>Nullifies the need for API and API credentials </li> <li>Nullifies all concerns regarding propagation times </li> </ul> <p>In more details, Agnos proceeds as follows (and as illustrated below). Before any ACME transaction takes place (and only once), the ACME client user manually updates their DNS zone to delegate ACME specific subdomains to Agnos. Note that the rest of DNS functionality is still assumed by the DNS provider. To carry out the ACME transaction, the ACME client initiates the certificate issuance process and is challenged to serve the token via a DNS TXT record. Agnos does so using its own DNS functionality (leveraging <a href="https://trust-dns.org/">Trust-dns</a>). The ACME client can immediately ask the ACME server for validation. The ACME server asks the DNS provider for the TXT record and is replied to that the ACME specific subdomain is delegated to Agnos. The ACME server then asks Agnos-as-a-DNS-server for the TXT record which Agnos provides. Finally the certificate is issued and stored by Agnos on the client machine.</p> <p><img src="/blog/assets/img/dns-01-agnos-schema.png" alt="Schematic illustration of the DNS-01 challenge when using Agnos." 
/></p> <h2>Taking Agnos for a ride</h2> <p>In conclusion, we hope that by switching to Agnos, or more generally to provider-agnostic DNS-01 challenge solving, individuals and organizations will benefit from the full power of DNS-01 and wildcard certificates, without having to take API-related concerns into account when choosing their DNS provider.</p> <p>If this post has piqued your interest and you want to help us develop Agnos further by trying the beta out, let's meet on our <a href="https://github.com/krtab/agnos">github</a>. We would very much appreciate any feedback and bug reports, so we tried our best to streamline and well document the installation process to facilitate new users. On ArchLinux for example, getting started can be as easy as:</p> <p>Adding two records to your DNS zone using your provider web GUI:</p> <pre><code>agnos-ns.doma.in A 1.2.3.4 _acme-challenge.doma.in NS agnos-ns.doma.in </code></pre> <p>and running on your server</p> <pre><code class="language-bash"># Install the agnos binary yay -S agnos # Allow agnos to bind the priviledge 53 port sudo setcap 'cap_net_bind_service=+ep' /usr/bin/agnos # Download the example configuration file curl 'https://raw.githubusercontent.com/krtab/agnos/v.0.1.0-beta.1/config_example.toml' &gt; agnos_config.toml # Edit it to suit your need vim agnos_config.toml # Launch agnos 🚀 agnos agnos_config.toml </code></pre> <p>Until then, happy hacking!</p> opam 2.1.3 is released! https://ocamlpro.com/blog/2022_08_12_opam_2.1.3_release 2022-08-12T13:48:57Z 2022-08-12T13:48:57Z Raja Boujbel OCamlPro Feedback on this post is welcomed on Discuss! We are pleased to announce the minor release of opam 2.1.3. This opam release consists of backported fixes: Fix opam init and opam init --reinit when the jobs variable has been set in the opamrc or the current config. (#5056) opam var no longer fails if ... <p><em>Feedback on this post is welcomed on <a href="https://discuss.ocaml.org/t/ann-opam-2-1-3/10299">Discuss</a>!</em></p> <p>We are pleased to announce the minor release of <a href="https://github.com/ocaml/opam/releases/tag/2.1.3">opam 2.1.3</a>.</p> <p>This opam release consists of <a href="https://github.com/ocaml/opam/issues/5000">backported</a> fixes:</p> <ul> <li>Fix <code>opam init</code> and <code>opam init --reinit</code> when the <code>jobs</code> variable has been set in the opamrc or the current config. 
(<a href="https://github.com/ocaml/opam/issues/5056">#5056</a>) </li> <li><code>opam var</code> no longer fails if no switch is set (<a href="https://github.com/ocaml/opam/issues/5025">#5025</a>) </li> <li>Setting a variable with option <code>--switch &lt;sw&gt;</code> fails instead of writing an invalid <code>switch-config</code> file (<a href="https://github.com/ocaml/opam/issues/5027">#5027</a>) </li> <li>Handle external dependencies when updating switch state pin status (all pins), instead as a post pin action (only when called with <code>opam pin</code> (<a href="https://github.com/ocaml/opam/issues/5046">#5046</a>) </li> <li>Remove windows double printing on commands and their output (<a href="https://github.com/ocaml/opam/issues/4940">#4940</a>) </li> <li>Stop Zypper from upgrading packages on updates on OpenSUSE (<a href="https://github.com/ocaml/opam/issues/4978">#4978</a>) </li> <li>Clearer error message if a command doesn't exist (<a href="https://github.com/ocaml/opam/issues/4112">#4112</a>) </li> <li>Actually allow multiple state caches to co-exist (<a href="https://github.com/ocaml/opam/issues/4554">#4554</a>) </li> <li>Fix some empty conflict explanations (<a href="https://github.com/ocaml/opam/issues/4373">#4373</a>) </li> <li>Fix an internal error on admin repository upgrade from OPAM 1.2 (<a href="https://github.com/ocaml/opam/issues/4965">#4965</a>) </li> </ul> <p>and improvements:</p> <ul> <li>When inferring a 2.1+ switch invariant from 2.0 base packages, don't filter out pinned packages as that causes very wide invariants for pinned compiler packages (<a href="https://github.com/ocaml/opam/issues/4501">#4501</a>) </li> <li>Some optimisations to <code>opam list --installable</code> queries combined with other filters (<a href="https://github.com/ocaml/opam/issues/4311">#4311</a>) </li> <li>Improve performance of some opam list combinations (e.g. <code>--available</code>, <code>--installable</code>) (<a href="https://github.com/ocaml/opam/issues/4999">#4999</a>) </li> <li>Improve performance of <code>opam list --conflicts-with</code> when combined with other filters (<a href="https://github.com/ocaml/opam/issues/4999">#4999</a>) </li> <li>Improve performance of <code>opam show</code> by as much as 300% when the package to show is given explicitly or is unique (<a href="https://github.com/ocaml/opam/issues/4997">#4997</a>)(<a href="https://github.com/ocaml/opam/issues/4172">#4172</a>) </li> <li>When a field is defined in switch and global scope, try to determine the scope also by checking switch selection (<a href="https://github.com/ocaml/opam/issues/5027">#5027</a>) </li> </ul> <p>You can also find API changes in the <a href="https://github.com/ocaml/opam/releases/tag/2.1.3">release note</a>.</p> <hr /> <p>Opam installation instructions (unchanged):</p> <ol> <li> <p>From binaries: run</p> <pre><code class="language-shell-session">$ bash -c &quot;sh &lt;(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) --version 2.1.3&quot; </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.1.3">the Github &quot;Releases&quot; page</a> to your PATH. 
In this case, don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update you sandbox script.</p> </li> <li> <p>From source, using opam:</p> <pre><code class="language-shell-session">$ opam update; opam install opam-devel </code></pre> <p>(then copy the opam binary to your PATH as explained, and don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update your sandbox script)</p> </li> <li> <p>From source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.1.3#compiling-this-repo">README</a>.</p> </li> </ol> <p>We hope you enjoy this new minor version, and remain open to <a href="https://github.com/ocaml/opam/issues">bug reports</a> and <a href="https://github.com/ocaml/opam/issues">suggestions</a>.</p> OCamlPro at the JFLA2022 Conference https://ocamlpro.com/blog/2022_07_12_ocamlpro_at_the_jfla2022 2022-07-12T13:48:57Z 2022-07-12T13:48:57Z OCamlPro Dario Pinto In today's article, we share our contributions to the 2022 JFLAs, the French-Speaking annual gathering on Application Programming Languages, mainly Functional Languages such as OCaml (Journées Francophones des Langages Applicatifs). This much awaited event is organised by Inria, the French National... <p></p> <div class="figure"> <p> <img alt="" src="/blog/assets/img/picture_jfla2022_domaine_essendieras.jpg"/> <div class="caption"> <a href="https://www.essendieras.fr/" target="_blank"> Domaine d'Essendiéras </a>, located in French Region <em>Perigord</em>, where the JFLA2022 took place! </div> </p> </div> <p>In today's article, we share our contributions to the 2022 <a href="http://jfla.inria.fr/">JFLA</a>s, the French-Speaking annual gathering on Application Programming Languages, mainly Functional Languages such as OCaml (<em>Journées Francophones des Langages Applicatifs</em>).</p> <p>This much awaited event is organised by <a href="https://www.inria.fr/fr">Inria</a>, the French National Institute for Research in Science and Digital Technologies.</p> <p>This is always the best occasion for us to directly share our latest works and contributions with this diverse community of researchers, professors, students and industrial actors alike. Moreover, it allows us to meet up with all our long known peers and get in contact with an ever changing pool of actors in the fields of functional languages in general, formal methods and everything OCaml!</p> <p>This year the three papers we submitted were selected, and this is what this article is about!</p> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <p><a href="#mikino">Mikino, formal verification made accessible</a></p> <p><a href="#SWH">Connecting Software Heritage with the OCaml ecosystem</a></p> <p><a href="#alt-ergo">Alt-Ergo-Fuzz, hunting the bugs of the bug hunter</a> </div></p> <h2> <a id="mikino" class="anchor"></a><a class="anchor-link" href="#mikino">Mikino, formal verification made accessible</a> </h2> <p><em>Mikino and all correlated content mentionned in this article was made by Adrien Champion</em></p> <p>If you follow our Blog, you may have already read our <a href="https://ocamlpro.com/blog/2021_10_14_verification_for_dummies_smt_and_induction">Mikino blogpost</a>, but if you haven't here's a quick breakdown and a few pointers... In case you wish to play around or maybe contribute to the project. 
;)</p> <p>So what is Mikino ?</p> <blockquote> <p>Mikino is a simple induction engine over transition systems. It is written in Rust, with a strong focus on ergonomics and user-friendliness.</p> </blockquote> <p>Depending on what your needs are, you could either be interested in the <a href="https://crates.io/crates/mikino_api">Mikino Api</a> or the <a href="https://crates.io/crates/mikino">Mikino Binary</a> or just, for purely theoretical reasons, want to undergo our <a href="https://ocamlpro.github.io/verification_for_dummies/">Verification for Dummies: SMT and Induction</a> tutorial which is specifically tailored to appeal to the newbies of formal verification!</p> <p>Have a go at it, learn and have fun!</p> <p>For further reading: <a href="https://hal.inria.fr/hal-03626850/">OCamlPro's paper for the JFLA2022 (Mikino) </a> (French-written article describing the entire project).</p> <h2> <a id="SWH" class="anchor"></a><a class="anchor-link" href="#SWH">Connecting Software Heritage with the OCaml ecosystem</a> </h2> <p><em>The archiving of OCaml packages into the SWH architecture, the release of <a href="https://github.com/OCamlPro/swhid/">swhid</a> library and the integration of SWH into opam was done by Léo Andrès, Raja Boujbel, Louis Gesbert and Dario Pinto</em></p> <p>Once again, if you follow our Blog, you must have seen <a href="https://www.softwareheritage.org/?lang=fr">Software Heritage</a> (SWH) mentioned in our <a href="https://ocamlpro.com/blog/2022_01_31_2021_at_ocamlpro#free_software">yearly review article</a>.</p> <p>Now you can also look at <a href="https://hal.archives-ouvertes.fr/hal-03626845/">SWH paper by OCamlPro for the JFLA2022 (French)</a> if you are looking for a detailed explanation of how important Software Heritage is to free software as a whole, and in what manner OCamlPro contributed to this gargantuan long-term endeavour of theirs.</p> <p>This great collaboration was one of the highlights of last year from which arose an OCaml library called <a href="https://github.com/OCamlPro/swhid/">swhid</a> and the guaranteed perennity of all the packages found on opam.</p> <p>The work we did to achieve this was to:</p> <ul> <li>add a few modules to the SWH architecture in order to store all the OCaml packages found on opam in the Library of Alexandria of open source software. 
</li> <li>release a library used for computing SWH identifiers </li> <li>add support in opam in order to allow a fallback on SWH architecture if a given package is missing from the <a href="https://github.com/ocaml/opam-repository">opam repository</a> </li> <li>patch the opam repository in order to detect already missing packages </li> </ul> <h2> <a id="alt-ergo" class="anchor"></a><a class="anchor-link" href="#alt-ergo">Alt-Ergo-Fuzz, hunting the bugs of the bug hunter</a> </h2> <p><em>The fuzzing of the SMT-Solver Alt-Ergo was done by Hichem Rami Ait El Hara, Guillaume Bury and Steven de Oliveira</em></p> <p>As the last entry of OCamlPro's papers that have made it to this year's JFLA: a rundown of Hichem's work, guided by Guillaume and Steven, on developping a Fuzzer for <a href="https://github.com/OCamlPro/alt-ergo">Alt-Ergo</a>.</p> <p>When it comes to critical systems, and industry-borne software, there are no limits to the requirements in safety, correctness, testing that would prove a program's reliability.</p> <p>This is what SMT (Satisfiability Modulo Theory)-Solvers like Alt-Ergo are for: they use a complex mix of theory and implementation in order to prove, given a set of input theories, whether a program is acceptable... But SMT-Solvers, like any other program in the world, has to be searched for bugs or unwanted behaviours - this is the harsh reality of development.</p> <p>With that in mind, Hichem sought to provide a fuzzer for Alt-Ergo to help <em>hunt the bugs of the bug hunter</em>: <a href="https://github.com/hra687261/alt-ergo-fuzz">Alt-Ergo-Fuzz</a>.</p> <p>This tool has helped identify several bugs of unsoundness and crashes:</p> <ul> <li><a href="https://github.com/OCamlPro/alt-ergo/issues/474">#474</a> - Crash </li> <li><a href="https://github.com/OCamlPro/alt-ergo/issues/475">#475</a> - Crash </li> <li><a href="https://github.com/OCamlPro/alt-ergo/issues/476">#476</a> - Unsoundness </li> <li><a href="https://github.com/OCamlPro/alt-ergo/issues/477">#477</a> - Unsoundness </li> <li><a href="https://github.com/OCamlPro/alt-ergo/issues/479">#479</a> - Unsoundness </li> <li><a href="https://github.com/OCamlPro/alt-ergo/issues/481">#481</a> - Crash </li> <li><a href="https://github.com/OCamlPro/alt-ergo/issues/482">#482</a> - Crash </li> </ul> <p>More details in <a href="https://hal.inria.fr/hal-03626861/">OCamlPro's paper for the JFLA2022 (Alt-Ergo-Fuzz)</a>.</p> 2021 at OCamlPro https://ocamlpro.com/blog/2022_01_31_2021_at_ocamlpro 2022-01-31T13:48:57Z 2022-01-31T13:48:57Z Muriel OCamlPro OCamlPro was created in 2011 to advocate the adoption of the OCaml language and Formal Methods in general in the industry. 2021 was a very special year as we celebrated our 10th anniversary! While building a team of highly-skilled engineers, we navigated through our expertise domains, programming la... <p> <div class="figure"> <p> <a href="/blog/assets/img/2021_ocamlpro.jpeg"> <img alt="Passing from one year to another is a great time to have a look back!" src="/blog/assets/img/2021_ocamlpro.jpeg"/> </a> <div class="caption"> Passing from one year to another is a great time to have a look back! </div> </p> </div> </p> <p>OCamlPro was created in 2011 to advocate the adoption of the OCaml language and Formal Methods in general in the industry. 2021 was a very special year as we celebrated our 10th anniversary! 
While building a team of highly-skilled engineers, we navigated through our expertise domains, programming languages design, compilation and analysis, advanced developer tooling, formal methods, blockchains and high-value software prototyping.</p> <p>In this article, as every year (see <a href="/blog/2021_02_02_2020_at_ocamlpro">last year's post</a>), we review some of the work we did during 2021, in many different worlds.</p> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <p><a href="#people">Newcomers at OCamlPro</a></p> <p><a href="#apps">Real Life Modern Applications</a></p> <ul> <li><a href="#mlang">Modernizing the French Income Tax System</a> </li> <li><a href="#cobol">A First Step in the COBOL Universe</a> </li> <li><a href="#geneweb">Auditing a High-Scale Genealogy Application</a> </li> <li><a href="#mosaic">Improving an ecotoxicology platform</a> </li> </ul> <p><a href="#ocaml">Contributions to OCaml</a></p> <ul> <li><a href="#flambda">Flambda Code Optimizer</a> </li> <li><a href="#opam">Opam Package Manager</a> </li> <li><a href="#community">LearnOCaml and TryOCaml</a> </li> <li><a href="#tooling">OCaml Documentation Hub</a> </li> <li><a href="#free_software">Plugging Opam into Software Heritage</a> </li> </ul> <p><a href="#formal-methods">Tooling for Formal Methods</a></p> <ul> <li><a href="#alt-ergo">Alt-Ergo Development</a> </li> <li><a href="#club">Alt-Ergo Users’ Club and R&amp;D Projects</a> </li> <li><a href="#dolmen">Dolmen Library for Automated Deduction Languages</a> </li> </ul> <p><a href="#rust">Rust Developments</a></p> <ul> <li><a href="#mikino">SMT, Induction and Mikino</a> </li> <li><a href="#matla">Matla, a Project Manager for TLA+/TLC</a> </li> <li><a href="#rust-training">Rust Training at Onera</a> </li> <li><a href="#rust-audit">Audit of a Rust Blockchain Node</a> </li> </ul> <p><a href="#blockchains">Scaling and Verifying Blockchains</a></p> <ul> <li><a href="#everscale">From Dune Network to FreeTON/EverScale</a> </li> <li><a href="#solidity">A Why3 Framework for Solidity</a> </li> </ul> <p><a href="#events">Participations in Public Events</a></p> <ul> <li><a href="#osxp2021">Open Source Experience 2021</a> </li> <li><a href="#ow2021">OCaml Workshop at ICFP 2021</a> </li> <li><a href="#why3consortium">Joining the Why3 Consortium at the ProofInUse Seminar</a> </li> </ul> <p><a href="#next">Towards 2022</a> </div></p> <p>As always, we warmly thank all our clients, partners, and friends, for their support and collaboration during this peculiar year!</p> <h2> <a id="people" class="anchor"></a><a class="anchor-link" href="#people">Newcomers at OCamlPro</a> </h2> <p> <div class="figure"> <p> <a href="/blog/assets/img/mini-team-2022-02-14.jpg"> <img alt="Some of the new and old members of the team: Pierre Chambart, Dario Pinto, Léo Andrès, Fabrice Le Fessant, Louis Gesbert, Artemiy Rozovyk, Muriel Shan Sei Fan, Nicolas Berthier, Vincent Laviron, Steven De Oliveira and Keryan Didier." src="/blog/assets/img/mini-team-2022-02-14.jpg"/> </a> <div class="caption"> Some of the new and old members of the team: Pierre Chambart, Dario Pinto, Léo Andrès, Fabrice Le Fessant, Louis Gesbert, Artemiy Rozovyk, Muriel Shan Sei Fan, Nicolas Berthier, Vincent Laviron, Steven De Oliveira and Keryan Didier. </div> </p> </div> </p> <p>A company is nothing without its employees. This year, we have been delighted to welcome a great share of newcomers:</p> <ul> <li><em>Hichem Rami Ait El Hara</em> recently completed his master's degree in Computer Science. 
After an internship at OCamlPro, during which he developed a fuzzer for Alt-Ergo, he joined OCamlPro to work on Alt-Ergo and the verification of smart contracts. He will soon start a PhD on SMT solving. </li> <li><em>Nicolas Berthier</em> holds a PhD on synchronous programming for resource-constrained systems. With many years of experience in model-checking, abstract interpretation, and software analysis, he joined OCamlPro to work on programming language compilation and analysis. </li> <li><em>Julien Blond</em> is a senior OCaml developer with strong experience in the formal verification of security software. He joined OCamlPro as both a project manager and a Coq expert. </li> <li><em>Keryan Didier</em> joined the team as an R&amp;D engineer. He holds a PhD from University Pierre et Marie Curie, during which he developed an automated implementation method for hard real-time applications. Previously, he studied functional programming and language design at University Paris-Diderot. Keryan has been involved in the MLang project as well as the Flambda2 project within OCamlPro's Compilation team. </li> <li><em>Mohamed Hernouf</em> recently completed his master's degree in Computer Science. After an internship at OCamlPro, working on the <a href="https://docs.ocaml.pro">OCaml Documentation Hub</a>, he joined OCamlPro and continues to work on the documentation hub and other OCaml applications. </li> <li><em>Dario Pinto</em> is a student at the <a href="https://42.fr/en/homepage/">42Paris</a> School of Computer Science. He joined OCamlPro on a two-year work-study contract. </li> <li><em>Artemiy Rozovyk</em> recently completed his master's degree in Computer Science. He joined OCamlPro to work on the development of applications for the EverScale and Avalanche blockchains. </li> </ul> <h2> <a id="apps" class="anchor"></a><a class="anchor-link" href="#apps">Real Life Modern Applications</a> </h2> <h3> <a id="mlang" class="anchor"></a><a class="anchor-link" href="#mlang">Modernizing the French Income Tax System</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/income-tax.jpg"> <img alt="The M language, designed in the 80s for the Income Tax, is now being rewritten and extended in OCaml." src="/blog/assets/img/income-tax.jpg"/> </a> <div class="caption"> The M language, designed in the 80s for the Income Tax, is now being rewritten and extended in OCaml. </div> </p> </div> </p> <p>The M language is a very old programming language developed by the French tax administration to compute income taxes. Recently, Denis Merigoux and Raphael Monat have implemented a <a href="https://github.com/MLanguage/mlang">new compiler in OCaml</a> for the M language. This new compiler offers better performance, clearer semantics and greater maintainability than the former one. OCamlPro is now involved in strengthening this new compiler, to put it in production and eventually compute the taxes of more than 30 million French families.</p> <h3> <a id="cobol" class="anchor"></a><a class="anchor-link" href="#cobol">A First Step in the COBOL Universe</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/cobol.jpg"> <img alt="Recent studies still estimate that COBOL has the highest number of lines of code running." src="/blog/assets/img/cobol.jpg"/> </a> <div class="caption"> Recent studies still estimate that COBOL has the highest number of lines of code running.
</div> </p> </div> </p> <p>Born more than 60 years ago, <a href="https://wikipedia.org/wiki/COBOL">COBOL</a> is still said to be the most used language in the world, in terms of the number of lines running in computers, though many people forecast that it would disappear at the turn of the 21st century. With more than 300 reserved keywords, it is also one of the most complex languages to parse and analyse. That is not enough to scare the developers at OCamlPro: while helping one of the biggest COBOL users in France to migrate its programs to the <a href="https://gnucobol.sourceforge.io/">GNUCobol open-source compiler</a>, OCamlPro built strong expertise in COBOL and mainframes, and developed a powerful COBOL parser that will help us bring modern development tools to COBOL developers.</p> <h3> <a id="geneweb" class="anchor"></a><a class="anchor-link" href="#geneweb">Auditing a High-Scale Genealogy Application</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/genealogie.jpg"> <img alt="Geneweb was developed in the 90s to manage family trees... and is still managing them!" src="/blog/assets/img/genealogie.jpg"/> </a> <div class="caption"> Geneweb was developed in the 90s to manage family trees... and is still managing them! </div> </p> </div> </p> <p><a href="https://geneweb.tuxfamily.org/wiki/GeneWeb/fr">Geneweb</a> is one of the most powerful pieces of software for managing and sharing genealogical data to date. Written in OCaml more than 20 years ago, it contains a web server and complex algorithms to compute information on family trees. It is used by <a href="https://en.geneanet.org/">Geneanet</a>, which is one of the leading companies in the genealogy field, to store more than 800,000 family trees and more than 7 billion names of ancestors. OCamlPro is now working with Geneanet to improve Geneweb and make it scale to even larger data sets.</p> <h3> <a id="mosaic" class="anchor"></a><a class="anchor-link" href="#mosaic">Improving an ecotoxicology platform</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/labo.jpg"> <img alt="Mosaic is used by ecotoxicologists and regulators to obtain advanced and innovative methods for environmental risk assessment." src="/blog/assets/img/labo.jpg"/> </a> <div class="caption"> Mosaic is used by ecotoxicologists and regulators to obtain advanced and innovative methods for environmental risk assessment. </div> </p> </div> </p> <p>The <a href="https://mosaic.univ-lyon1.fr/">Mosaic</a> platform helps researchers, industrial actors and regulators in the field of ecotoxicology by providing an easy way to run various statistical analyses. All the user has to do is enter some data in the web interface; computations are then run on the server and the results are displayed. The platform is fully written in OCaml and takes care of calling the mathematical model, which is written in R. OCamlPro modernised the project in order to ease maintenance and new contributions. In the process, we discovered <a href="https://github.com/pveber/morse/issues/286">bugs introduced by new R versions</a> (without any kind of warning). We then developed a new interface for data input, similar to a spreadsheet and much more convenient than having to write raw CSV.
During this work, we had the opportunity to contribute to other OCaml packages such as <a href="https://github.com/mfp/ocaml-leveldb">leveldb</a>, and to write new ones such as <a href="https://github.com/OCamlPro/agrid">agrid</a>.</p> <h2> <a id="ocaml" class="anchor"></a><a class="anchor-link" href="#ocaml">Contributions to OCaml</a> </h2> <h3> <a id="flambda" class="anchor"></a><a class="anchor-link" href="#flambda">Flambda Code Optimizer</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/flambda_2021.jpeg"> <img alt="Flambda2 is a powerful code optimizer for the OCaml compiler." src="/blog/assets/img/flambda_2021.jpeg"/> </a> <div class="caption"> Flambda2 is a powerful code optimizer for the OCaml compiler. </div> </p> </div> </p> <p>OCamlPro is proud to be working on Flambda2, an ambitious OCaml optimizing compiler project, initiated with Mark Shinwell from Jane Street, our long-term partner and client. Flambda focuses on reducing the runtime cost of abstractions and removing as many short-lived allocations as possible. Jane Street has launched large-scale testing of Flambda2, and on our side, we have documented the design of some key algorithms. In 2021, the Flambda team grew bigger with the arrival of Keryan. Along with a considerable amount of fixes and improvements, this will allow us to publish <a href="https://github.com/ocaml-flambda/flambda-backend">Flambda2</a> in the coming months!</p> <p>In other OCaml compiler news, 2021 saw the long-awaited merge of the multicore branch into the official development branch. This was thanks to the amazing work of many people, including our own Damien Doligez. This is far from the end of the story though, and we're looking forward to both further contributing to the compiler (fixing bugs, re-enabling support for all platforms) and making use of the features in our own programs.</p> <p><em>This work is made possible thanks to Jane Street’s funding.</em></p> <h3> <a id="opam" class="anchor"></a><a class="anchor-link" href="#opam">Opam Package Manager</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/opam_2021.jpg"> <img alt="A large set of new features have been implemented in Opam in 2021." src="/blog/assets/img/opam_2021.jpg"/> </a> <div class="caption"> A large set of new features have been implemented in Opam in 2021. </div> </p> </div> </p> <p><a href="https://opam.ocaml.org">Opam</a> is the OCaml source-based package manager. The first specification draft was written <a href="https://opam.ocaml.org/about.html">in early 2012</a> and went on to become OCaml’s official package manager — though it may be used for other languages and projects, since Opam is language-agnostic! If you need to install, upgrade and manage your compiler(s), tools and libraries easily, Opam is meant for you.
It supports multiple simultaneous compiler installations, flexible package constraints, and a Git-friendly development workflow.</p> <p>Opam development and maintenance is a collaboration between OCamlPro, with Raja &amp; Louis, and OCamlLabs, with David Allsopp &amp; Kate Deplaix.</p> <p><a href="https://github.com/ocaml/opam/releases">Our 2021 work on opam</a> led to the final release of the long-awaited opam 2.1, three versions of opam 2.0 and two versions of opam 2.1 with small fixes.</p> <p>Opam 2.1 introduced several new features:</p> <ul> <li>Integration of system dependencies (formerly the opam-depext plugin) </li> <li>Creation of lock files for reproducible installations (formerly the opam-lock plugin) </li> <li>Switch invariants, replacing the &quot;base packages&quot; in opam 2.0 and allowing for easier compiler upgrades </li> <li>Improved option configurations </li> <li>CLI versioning, allowing cleaner deprecations for opam now, as well as future improvements to semantics without breaking backwards compatibility </li> <li>opam root readability by newer and older versions, even if the format changed </li> <li>Performance improvements to opam update, conflict messages, and many other areas </li> </ul> <p>Take a stroll through the <a href="https://opam.ocaml.org/blog//opam-2-1-0">blog post</a> for a closer look.</p> <p>In 2021, we also prepared the soon-to-be alpha release of opam 2.2. It will provide better handling of the Windows ecosystem, integration of a Software Heritage <a href="#foundation">archive fallback</a>, better UI for user interactions, recursive pinning of projects, fetching optimisations, etc.</p> <p><em>This work is greatly helped by Jane Street’s funding and support.</em></p> <h3> <a id="community" class="anchor"></a><a class="anchor-link" href="#community">LearnOCaml and TryOCaml</a> </h3> <p>We have also been active in the maintenance of <a href="https://github.com/ocaml-sf/learn-ocaml">Learn-ocaml</a>. What was originally designed as the platform for the <a href="https://www.fun-mooc.fr/en/courses/introduction-functional-programming-ocaml/">OCaml MOOC</a> is now a tool in the hands of OCaml teachers worldwide, managed and funded by <a href="http://ocaml-sf.org/">the OCaml Foundation</a>.</p> <p>The work included a long-overdue port to OCaml 4.12; generation of portable executables (automatic through CI) for much easier deployment and use of the command-line client; as well as many quality-of-life and usability improvements stemming from two-way conversations with many teachers.</p> <p>On a related matter, we also reworked our on-line OCaml editor and toplevel <a href="https://try.ocaml.pro">TryOCaml</a>, improving its design and adding features like code snippet sharing. We were glad to see that, in these difficult times, these tools proved useful to both teachers and students, and look forward to improving them further.</p> <h3> <a id="tooling" class="anchor"></a><a class="anchor-link" href="#tooling">OCaml Documentation Hub</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/ocaml_2021.jpg"> <img alt="The OCaml Documentation Hub includes browsable documentation and sources for more than 2000 Opam packages." src="/blog/assets/img/ocaml_2021.jpg"/> </a> <div class="caption"> The OCaml Documentation Hub includes browsable documentation and sources for more than 2000 Opam packages.
</div> </p> </div> </p> <p>As one of the biggest users of OCaml, OCamlPro aims at facilitating the daily use of OCaml by developing a lot of open-source tooling.</p> <p>One of our main contributions to the OCaml ecosystem in 2021 was probably the OCaml Documentation Hub at <a href="https://docs.ocaml.pro">docs.ocaml.pro</a>.</p> <p>The OCaml Documentation Hub is a website that provides documentation for more than 2000 OPAM packages, among which of course the most popular ones, with inter-package documentation links! The website also contains browsable sources for all these packages, and a search engine to discover useful OCaml functions, modules, types and classes.</p> <p>All this documentation is generated using our custom tool <a href="https://github.com/OCamlPro/digodoc">Digodoc</a>. Though it's not worth a specific section, we also kept maintaining <a href="https://github.com/OCamlPro/drom">Drom</a>, our layer on Dune and Opam that most of our recent projects use.</p> <h3> <a id="free_software" class="anchor"></a><a class="anchor-link" href="#free_software">Plugging Opam into Software Heritage</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/Svalbard_seed_vault.jpg"> <img alt="Svalbard Global Seed Vault in Norway." src="/blog/assets/img/Svalbard_seed_vault.jpg"/> </a> <div class="caption"> Svalbard Global Seed Vault in Norway. </div> </p> </div> </p> <p>Last year also saw the long-awaited collaboration between Software Heritage and OCamlPro happen.</p> <p>Thanks to a grant by the <a href="https://www.softwareheritage.org/2021/04/20/connecting-ocaml/">Alfred P. Sloan Foundation</a>, OCamlPro has been able to collaborate with our partners at Software Heritage and managed to further expand the coverage of this gargantuan endeavour of theirs by archiving 3516 opam packages. In effect, the main benefits of this Open Source collaboration have been:</p> <ul> <li>The addition of several modules to the Software Heritage architecture, allowing the archiving of said opam packages; </li> <li>The publication of an OCaml library for working with <a href="https://github.com/OCamlPro/swhid">SWHID</a>s (see the sketch just after this list); </li> <li>An implementation of a possible fallback onto Software Heritage if a given package on opam is no longer available; </li> <li>A fix for the official opam repository in order to identify already missing packages. </li> </ul>
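<p>To make these identifiers a little more concrete, here is a minimal OCaml sketch computing the SWHID of a single file's contents. It follows the SWHID specification, where a content identifier is the SHA-1 of the Git blob encoding of the bytes, but it does not use the swhid library's actual API; the use of Digestif for hashing is just one convenient choice.</p> <pre><code class="language-ocaml">(* Minimal sketch: compute a &quot;swh:1:cnt:...&quot; identifier for the contents
   of a file. Per the SWHID specification, a content identifier is the
   SHA-1 of the Git blob encoding: &quot;blob &quot;, the size in bytes, a NUL byte,
   then the bytes themselves. Illustration only, not the swhid library's
   API; requires the digestif package. *)
let swhid_of_contents contents =
  let blob = Printf.sprintf &quot;blob %d\000%s&quot; (String.length contents) contents in
  &quot;swh:1:cnt:&quot; ^ Digestif.SHA1.(to_hex (digest_string blob))

let read_file path =
  let ic = open_in_bin path in
  let s = really_input_string ic (in_channel_length ic) in
  close_in ic;
  s

let () = print_endline (swhid_of_contents (read_file Sys.argv.(1)))
</code></pre> <p>Directory, revision and release identifiers are defined similarly on top of Git's tree, commit and tag encodings, which is what makes the archive content-addressable.</p>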
<p>Not long after Software was at last acknowledged by Unesco as part of the World Heritage, we were thrilled to be part of this great and meaningful initiative. We could feel how true passion remained throughout our interactions and long after the work was done.</p> <h2> <a id="formal-methods" class="anchor"></a><a class="anchor-link" href="#formal-methods">Tooling for Formal Methods</a> </h2> <p> <div class="figure"> <p> <a href="/blog/assets/img/pure-mathematics-formulae-blackboard.jpg"> <img alt="Avionics, blockchains, cyber-security, cloud, etc... formal methods are spreading in the computer industry." src="/blog/assets/img/pure-mathematics-formulae-blackboard.jpg"/> </a> <div class="caption"> Avionics, blockchains, cyber-security, cloud, etc... formal methods are spreading in the computer industry. </div> </p> </div> </p> <h3> <a id="alt-ergo" class="anchor"></a><a class="anchor-link" href="#alt-ergo">Alt-Ergo Development</a> </h3> <p>OCamlPro develops and maintains <a href="https://alt-ergo.ocamlpro.com/">Alt-Ergo</a>, an automatic solver of mathematical formulas designed for program verification and based on Satisfiability Modulo Theories (SMT). Alt-Ergo was initially created within the <a href="https://vals.lri.fr/">VALS</a> team at <a href="https://www.universite-paris-saclay.fr/en">University of Paris-Saclay</a>.</p> <p>In 2021, we continued to focus on the maintainability of our solver. We released versions 2.4.0 and <a href="https://github.com/OCamlPro/alt-ergo/releases/tag/2.4.1">2.4.1</a> in January and July respectively, with 2.4.1 containing a bugfix as well as some performance improvements.</p> <p>In order to increase our test coverage, we instrumented Alt-Ergo so that we could run it using <a href="https://github.com/google/AFL">afl-fuzz</a>. Although this is a proof of concept, and has yet to be integrated into Alt-Ergo's continuous integration, it has already helped us find a few bugs, such as <a href="https://github.com/OCamlPro/alt-ergo/pull/489">this one</a>.</p> <h3> <a id="club" class="anchor"></a><a class="anchor-link" href="#club">Alt-Ergo Users’ Club and R&amp;D Projects</a> </h3> <p>We thank our partners from the <a href="https://alt-ergo.ocamlpro.com/#club">Alt-Ergo Users’ Club</a>, Adacore, CEA List, MERCE (Mitsubishi Electric R&amp;D Centre Europe), Thalès, and Trust-In-Soft, for their trust. Their support allows us to maintain our tool.</p> <p>The club was launched in 2019 and the third annual meeting of the Alt-Ergo Users’ Club was held in early April 2021. Our annual meeting is the perfect place to review each partner’s needs regarding Alt-Ergo. This year, we had the pleasure of receiving our partners to discuss the roadmap for future Alt-Ergo features and enhancements. If you want to join us for the next meeting (coming soon), contact us!</p> <p>Finally, we will be able to merge into the main branch of Alt-Ergo some of the work we did in 2020. Thanks to our partner MERCE (Mitsubishi Electric R&amp;D Centre Europe), we worked on SMT model generation. Alt-Ergo is now (partially) able to output a model in the SMT-LIB 2 format. Thanks to the <a href="http://why3.lri.fr/">Why3 team</a> from University of Paris-Saclay, we hope that this work will be available in the Why3 platform to help users in their program verification efforts. OCamlPro was very happy to join the <a href="https://proofinuse.gitlabpages.inria.fr/">Why3 Consortium</a> this year, for even more collaborations to come!</p> <p><em>This work is funded in part by the FUI R&amp;D Project LCHIP, MERCE, Adacore and with the support of the <a href="https://alt-ergo.ocamlpro.com/#club">Alt-Ergo Users’ Club</a>.</em></p> <h3> <a id="dolmen" class="anchor"></a><a class="anchor-link" href="#dolmen">Dolmen Library for Automated Deduction Languages</a> </h3> <p><a href="https://github.com/Gbury/dolmen">Dolmen</a> is a powerful library providing flexible parsers and typecheckers for many languages used in automated deduction.</p> <p>The ongoing work on using the Dolmen library as a frontend for Alt-Ergo has progressed considerably, both on the Dolmen side, which has been extended to support Alt-Ergo's native language in <a href="https://github.com/Gbury/dolmen/pull/89">this PR</a>, and on Alt-Ergo's side, where Dolmen was added as a frontend that can be chosen in <a href="https://github.com/OCamlPro/alt-ergo/pull/491">this PR</a>.
Once these are merged, Alt-Ergo will be able to read input problems in new languages, such as <a href="http://www.tptp.org/">TPTP</a>!</p> <h2> <a id="rust" class="anchor"></a><a class="anchor-link" href="#rust">Rust Developments</a> </h2> <p> <div class="figure"> <p> <a href="/blog/assets/img/logo_rust.jpg"> <img alt="Rust is a very good complement to OCaml for performance-critical applications." src="/blog/assets/img/logo_rust.jpg"/> </a> <div class="caption"> Rust is a very good complement to OCaml for performance-critical applications. </div> </p> </div> </p> <h3> <a id="mikino" class="anchor"></a><a class="anchor-link" href="#mikino">SMT, Induction and Mikino</a> </h3> <p>A few months ago, we published a series of posts: <a href="/blog/2021_10_14_verification_for_dummies_smt_and_induction"><em>verification for dummies: SMT and induction</em></a>. These posts introduce and discuss SMT solvers, the notion of induction and that of invariant strengthening. They rely on <a href="https://github.com/OCamlPro/mikino_bin"><em>mikino</em></a>, a simple tool we wrote that can analyze simple transition systems and perform SMT-based induction checks (as well as BMC, <em>i.e.</em> bug-finding). We wrote mikino in Rust with readability and ergonomics in mind: mikino showcases the basics of writing an SMT-based model checker performing induction. The posts are very hands-on and leverage mikino's high-quality output to discuss induction and invariant strengthening, with examples that readers can run and edit themselves.</p> <h3> <a id="matla" class="anchor"></a><a class="anchor-link" href="#matla">Matla, a Project Manager for TLA+/TLC</a> </h3> <p>During 2021 we ended up using the TLA+ language and its associated TLC verification engine in several completely unrelated projects. TLC is an amazing tool, but it is not suited to handling a TLA+ project with many modules (files), regression tests, <em>etc.</em> In particular, TLA+ is not a typed language. This means that TLA+ code tends to have many <em>checks</em> (dynamic assertions) checking that quantities have the expected type. This is fine to some extent, albeit a bit tedious, but as the code grows bigger the analysis conducted by TLC becomes very, very expensive. Eventually it is no longer reasonable to assert-type-check everything, since it makes TLC's analysis explode.</p> <p>As TLA+/TLC users, we are currently developing <code>matla</code>, which <code>ma</code>nages <code>TLA</code>+ projects. Written in Rust, matla is heavily inspired by the Rust ecosystem, in particular <a href="https://doc.rust-lang.org/cargo">cargo</a>. Matla has not been publicly released yet as we are waiting for more feedback from early users. We do use it internally, however, as its various features make our TLA+ projects much simpler:</p> <ul> <li>handling the TLA toolchain (download, <code>PATH</code>, updates...) for the user; </li> <li>providing a <code>Matla</code> module with <em>&quot;debug assertions&quot;</em> helpers: these assertions are active in <em>debug</em> mode, which is the default when running <code>matla run</code>.
Passing <code>--release</code> to matla's run mode, however, compiles all debug assertions away; this allows type-checking everything when debugging while making sure release runs do not pay the price of these checks; </li> <li>handling <em>integration</em> testing: matla projects have a <code>tests</code> directory where users can write tests (TLA files, with a <code>.tla</code> and a <code>.cfg</code> file) and specify whether they are expected to succeed or fail (and how); </li> <li>understanding and transforming TLC's output to improve user feedback, in particular when TLC yields an error (this is not good enough yet, and is the reason we have not released matla publicly); matla also parses and prettifies TLC's counterexample traces by formatting values, formatting states (aggregations of values), and rendering traces of states graphically using ASCII art. </li> </ul> <h3> <a id="rust-training" class="anchor"></a><a class="anchor-link" href="#rust-training">Rust Training at Onera</a> </h3> <p>The ongoing pandemic is undoubtedly impacting our professional training activities. Still, we had the opportunity to set up a Rust training session with applied researchers at ONERA during the summer. The session spanned over a week (about seven hours a day) and was our first fully remote Rust training session. We still believe on-site training (when possible) is better, but full remote offers some flexibility (spreading out the training over several weeks, for instance), and our experience with ONERA shows that it can work in practice with the right technology. Interestingly, it turns out that some aspects of the session actually work better with remote: hands-on exercises and projects, for instance, benefit from screen sharing. Discussing code with one participant is done with screen sharing, meaning all participants can follow along if they so choose.</p> <p>Long story short, fully remote training is something we now feel confident proposing to our clients as a flexible alternative to on-site training.</p> <h3> <a id="rust-audit" class="anchor"></a><a class="anchor-link" href="#rust-audit">Audit of a Rust Blockchain Node</a> </h3> <p>We participated in a contest aiming at writing a high-level specification of the compiler for the TON VM assembler, in particular its instructions and how they are compiled. This contest was a first step towards applying Formal Methods, and in particular formal verification, to the TON VM. We are happy to report that we finished first in this contest, and are looking forward to future contests pushing Formal Methods further in the Everscale blockchain.</p> <h2> <a id="blockchains" class="anchor"></a><a class="anchor-link" href="#blockchains">Scaling and Verifying Blockchains</a> </h2> <p> <div class="figure"> <p> <a href="/blog/assets/img/chain.jpg"> <img alt="OCamlPro is involved in several projects with high-throughput blockchains, such as EverScale and Avalanche." src="/blog/assets/img/chain.jpg"/> </a> <div class="caption"> OCamlPro is involved in several projects with high-throughput blockchains, such as EverScale and Avalanche. </div> </p> </div> </p> <h3> <a id="everscale" class="anchor"></a><a class="anchor-link" href="#everscale">From Dune Network to FreeTON/EverScale</a> </h3> <p>In 2019-2020, we concentrated our blockchain development efforts on adding new programming languages to the <a href="https://dune.network">Dune Network</a> ecosystem, in collaboration with Origin Labs.
You can read more about <a href="/blog/2020_06_09_a_dune_love_story_from_liquidity_to_love">Love</a> and <a href="https://medium.com/dune-network/deploy-your-first-solidity-contract-on-dune-network-a96a53169a91">Solidity for Dune</a>.</p> <p>At the end of 2020, it became clear that high throughput was becoming a major requirement for blockchain adoption in real applications, and that the Tezos-based technology behind Dune Network could not compete with high-performance blockchains such as <a href="https://solana.com">Solana</a> or <a href="https://www.avax.network">Avalanche</a>. Following this observation, the Dune Network community decided to merge with the FreeTON community early in 2021. Initially developed by Telegram, the TON project was stopped under legal threats, but another company, <a href="https://tonlabs.io/main">TONLabs</a>, restarted the project from its open-source code under the FreeTON name, and the blockchain was launched mid-2020. FreeTON, now renamed <a href="https://everscale.network/">EverScale</a>, is today the fastest blockchain in the world, with around 55,000 transactions per second on an open network, sustained over several days.</p> <p>EverScale uses a unique community-driven development process: contests are organized by thematic sub-governances (subgov) to improve the ecosystem, and contestants win prizes in tokens to reward their high-quality work. During 2021, OCamlPro got involved in several of these sub-governances, both as a jury member, in the Formal Methods subgov and the Developer Experience subgov, and as a contestant, winning multiple prizes for the development of smart contracts (<a href="https://medium.com/ocamlpro-blockchain-fr/zk-snarks-freeton-et-ocamlpro-796adc323351">zksnarks use-cases</a>, <a href="https://github.com/OCamlPro/freeton_auctions">auctions</a> and <a href="https://github.com/OCamlPro/devex-27-recurring-payments">recurring payments</a>), the audit of several smart contracts (<a href="https://github.com/OCamlPro/formet-17-true-nft-audit">TrueNFT audit</a>, <a href="https://github.com/OCamlPro/formet-14-rsquad-smv-audit">Smart Majority Voting audit</a> and <a href="https://github.com/OCamlPro/formet-13-radiance-dex-audit">a DEX audit</a>), and the specification of some Rust components in the node (the <a href="https://formet.gov.freeton.org/submission?proposalAddress=0%3A91a2ecea35ee9405ccb572c577cb6ba139491b493d86191e8e46a30fdd4b01e5&amp;submissionId=5">Assembler module</a>).</p> <p>This work in the EverScale ecosystem gave us the opportunity to develop some interesting OCaml contributions:</p> <ul> <li>We improved our <a href="https://github.com/OCamlPro/ocaml-solidity">ocaml-solidity</a> parser to support all the extensions of the <a href="https://solidity.readthedocs.io/en/v0.6.8/">Solidity</a> language required to parse EverScale contracts; </li> <li>We developed an <a href="https://github.com/OCamlPro/freeton_ocaml_sdk">OCaml binding</a> for the EverScale Rust SDK; </li> <li>We developed a command-line <a href="https://github.com/OCamlPro/freeton_wallet">wallet called <code>ft</code></a> to help developers easily deploy contracts and interact with them; </li> <li>We developed a <a href="https://gitlab.com/dune-network/ton-merge">bridge</a> between Dune Network and EverScale to swap DUN tokens into EVER tokens.
</li> </ul> <p><em>This work was funded by the EverScale community through contests.</em></p> <h3> <a id="solidity" class="anchor"></a><a class="anchor-link" href="#solidity">A Why3 Framework for Solidity</a> </h3> <p>Our most recent work on the EverScale blockchain has been targeted at the development of a <a href="http://why3.lri.fr/">Why3 framework</a> to formally verify EverScale Solidity contracts. At the same time, we have been involved in the specification of several big smart contract projects, and we plan to use this framework in practice on these projects as soon as their formal verification starts.</p> <p>We hope to be able to extend this work to EVM-based Solidity contracts, as available on Ethereum, Avalanche and many other blockchains. Compared with other frameworks that work directly on the EVM bytecode, this work, focused directly on the Solidity language, should make the verification much higher-level and thus more straightforward.</p> <h2> <a id="events" class="anchor"></a><a class="anchor-link" href="#events">Participations in Public Events</a> </h2> <h3> <a id="osxp2021" class="anchor"></a><a class="anchor-link" href="#osxp2021">Open Source Experience 2021</a> </h3> <p> <div class="figure"> <p> <a href="/blog/assets/img/ospx_1.jpg"> <img alt="Stéfane Fermigier (Abilian) and Pierre Baudracco (BlueMind) from Systematic Open Source Hub meet Amélie de Montchalin (French Minister of Public Service) in front of OCamlPro's booth." src="/blog/assets/img/ospx_1.jpg"/> </a> <div class="caption"> Stéfane Fermigier (Abilian) and Pierre Baudracco (BlueMind) from Systematic Open Source Hub meet Amélie de Montchalin (French Minister of Public Service) in front of OCamlPro's booth. </div> </p> </div> </p> <p>We were present at the new edition of the <a href="https://www.opensource-experience.com/">Open Source Experience</a> in Paris! Our booth welcomed our visitors to discuss tailor-made software solutions. Fabrice had the opportunity to give a presentation on FreeTON (now EverScale) <a href="https://www.youtube.com/watch?v=EEtE4YpWbjw">(watch the video)</a>, the high-speed blockchain he is working on. We were delighted to meet the open source community. Moreover, Amélie de Montchalin, French Minister of Transformation and Public Service, was present at the Open Source Experience to thank all the free software actors. It was a very nice experience for us; we can't wait to be back in 2022!</p> <h3> <a id="ow2021" class="anchor"></a><a class="anchor-link" href="#ow2021">OCaml Workshop at ICFP 2021</a> </h3> <p>We participated in the programming competition organized by the International Conference on Functional Programming (ICFP), and three talks we submitted to the OCaml Workshop were accepted!</p> <ul> <li>Fabrice, Mohamed and Louis presented <a href="https://github.com/OCamlPro/digodoc">Digodoc</a>, our new tool that builds a graph of an opam switch, associating files, libraries and opam packages into a cyclic graph of inclusions and dependencies; </li> <li>Fabrice spoke about <a href="https://github.com/OCamlPro/opam-bin">Opam-bin</a>, a plugin that builds binary opam packages on the fly; </li> <li>Lastly, Steven and David presented <strong>Love</strong>, a smart contract language embedded in the Dune Network blockchain. It was an opportunity to present our tools and projects, and above all to exchange with the OCaml community. We're delighted to take part in this adventure every year!
</li> </ul> <h3> <a id="why3consortium" class="anchor"></a><a class="anchor-link" href="#why3consortium">Joining the Why3 Consortium at the ProofInUse seminar</a> </h3> <p>We were very happy to join the Why3 Consortium while participating in the ProofinUse joint lab <a href="https://proofinuse.gitlabpages.inria.fr/meeting-2021oct21/">seminar on counterexamples</a> on October the 1st. Many thanks to Claude Marché for his role as scientific shepherd.</p> <h2> <a id="next" class="anchor"></a><a class="anchor-link" href="#next">Towards 2022</a> </h2> <p> <div class="figure"> <p> <a href="/blog/assets/img/towards_2022.jpeg"> <img alt="Though 2022 is just starting, it already sounds like a great year with many new interesting and innovative projects for OCamlPro." src="/blog/assets/img/towards_2022.jpeg"/> </a> <div class="caption"> Though 2022 is just starting, it already sounds like a great year with many new interesting and innovative projects for OCamlPro. </div> </p> </div> </p> <p>After a phase of adaptation to the health context in 2020 and a year of growth in 2021, we are motivated to start 2022 with new and very enriching projects and new professional encounters, leading to the growth of our teams. If you want to be part of a passionate team, we would love to hear from you! We are currently actively hiring. Check the available job positions and follow the application instructions!</p> <p>All our amazing achievements are the result of incredible people and teamwork; kudos to Fabrice, Pierre, Louis, Vincent, Damien, Raja, Steven, Guillaume, David, Adrien, Léo, Keryan, Mohamed, Hichem, Dario, Julien, Artemiy, Nicolas, Elias, Marla, Aurore and Muriel.</p> Verification for Dummies: SMT and Induction https://ocamlpro.com/blog/2021_10_14_verification_for_dummies_smt_and_induction 2021-10-14T13:48:57Z 2021-10-14T13:48:57Z Adrien Champion Adrien Champion adrien.champion@ocamlpro.com This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. These posts broadly discuss induction as a formal verification technique, which here really means formal program verification. I will use concrete, runnabl... <ul> <li>Adrien Champion <a href="mailto:adrien.champion@ocamlpro.com">adrien.champion@ocamlpro.com</a> </li> <li>This work is licensed under a <a href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>. </li> </ul> <p>These posts broadly discuss <em>induction</em> as a <em>formal verification</em> technique, which here really means <em>formal program verification</em>. I will use concrete, runnable examples whenever possible. Some of them can run directly in a browser, while others require running small, easy-to-retrieve tools locally. Such is the case for pretty much all examples dealing directly with induction.</p> <p>The next chapters discuss the following notions:</p> <ul> <li>formal logics and formal frameworks; </li> <li>SMT-solving: modern, <em>low-level</em> verification building blocks; </li> <li>declarative transition systems; </li> <li>transition system unrolling; </li> <li>BMC and induction proofs over transition systems (see the toy sketch just after this list); </li> <li>candidate strengthening. </li> </ul>
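<p>Before diving in, here is a small, self-contained OCaml toy that makes the two proof obligations of induction concrete on a tiny counter system. It is only an illustration under strong simplifications: the state space is enumerated over a small bounded range, whereas mikino and real model checkers discharge these obligations symbolically with an SMT solver.</p> <pre><code class="language-ocaml">(* Toy illustration of the two induction obligations on a tiny counter
   system. Brute-force enumeration over a bounded state space stands in
   for the SMT queries a real tool such as mikino performs. *)
type state = { cnt : int; reset : bool }

(* Initial states and transition relation. *)
let init s = s.cnt = 0
let trans s s' = if s.reset then s'.cnt = 0 else s'.cnt = s.cnt + 1

(* Candidate invariant: the counter never becomes negative. *)
let candidate s = s.cnt &gt;= 0

(* A small bounded slice of the state space, for illustration only. *)
let states =
  List.concat_map
    (fun cnt -&gt; [ { cnt; reset = true }; { cnt; reset = false } ])
    (List.init 50 (fun i -&gt; i - 25))

let () =
  (* Base case: every initial state satisfies the candidate. *)
  let base = List.for_all (fun s -&gt; not (init s) || candidate s) states in
  (* Inductive step: from any state satisfying the candidate, every
     successor satisfies it as well. *)
  let step =
    List.for_all
      (fun s -&gt;
        not (candidate s)
        || List.for_all (fun s' -&gt; not (trans s s') || candidate s') states)
      states
  in
  Printf.printf &quot;base: %b, step: %b\n&quot; base step
</code></pre> <p>When both obligations hold, the candidate is an invariant of the system; when the inductive step fails, strengthening the candidate is precisely the topic of the later chapters.</p>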
<p>The approach presented here is far from being the only one when it comes to program verification. It happens to be relatively simple to understand, and I believe that familiarity with the notions discussed here makes understanding other approaches significantly easier.</p> <p>This book thus hopes to serve both as a relatively deep dive into the specific technique of SMT-based induction, and as an example of the technical challenges inherent to both developing and using automated proof engines.</p> <p>Some chapters contain a few pieces of Rust code, usually to provide a runnable version of a system under discussion, or to serve as an example of actual code that we want to encode and verify. Some notions of Rust could definitely help in places, but this is not mandatory (probably).</p> <p>Read more here: <a href="https://ocamlpro.github.io/verification_for_dummies/">https://ocamlpro.github.io/verification_for_dummies/</a></p> Generating static and portable executables with OCaml https://ocamlpro.com/blog/2021_09_02_generating_static_and_portable_executables_with_ocaml 2021-09-02T13:48:57Z 2021-09-02T13:48:57Z Louis Gesbert Distributing OCaml software on opam is great (if I dare say so myself), but sometimes you need to provide your tools to an audience outside of the OCaml community, or just without recompilations or in a simpler way. However, just distributing the locally generated binaries requires that the users ha... <blockquote> <p>Distributing OCaml software on opam is great (if I dare say so myself), but sometimes you need to provide your tools to an audience outside of the OCaml community, or just without recompilations or in a simpler way.</p> <p>However, just distributing the locally generated binaries requires that the users have all the needed shared libraries installed, and a compatible libc. It's not something you can assume in general, and even if you don't need any C shared library or are confident enough it will be installed everywhere, the libc issue will arise for anyone using a distribution of a different kind, or one a little older than the one you used to build.</p> <p>There is no built-in support for generating static executables in the OCaml compiler, and it may seem a bit tricky, but it's not in fact too complex to do by hand, something you may be ready to do for a release that will be published. So here are a few tricks, recipes and advice that should enable you to generate truly portable executables with no external dependency whatsoever.
Both Linux and macOS will be treated, but the examples will be based on Linux unless otherwise specified.</p> </blockquote> <h2>Example</h2> <p>I will take as an example a trivial HTTP file server based on <a href="https://github.com/aantron/dream">Dream</a>.</p> <blockquote> <details> <summary>Sample code</summary> <h5>fserv.ml</h5> <pre><code class="language-ocaml">let () = Dream.(run @@ logger @@ static &quot;.&quot;) </code></pre> <h5>fserv.opam</h5> <pre><code class="language-python">opam-version: &quot;2.0&quot; depends: [&quot;ocaml&quot; &quot;dream&quot;] </code></pre> <h5>dune-project</h5> <pre><code class="language-lisp">(lang dune 2.8) (name fserv) </code></pre> </details> </blockquote> <p>The relevant part of our <code>dune</code> file is just:</p> <pre><code class="language-lisp">(executable (public_name fserv) (libraries dream)) </code></pre> <p>This is how we check the resulting binary:</p> <pre><code class="language-shell-session">$ dune build fserv.exe ocamlc .fserv.eobjs/byte/dune__exe__Fserv.{cmi,cmo,cmt} ocamlopt .fserv.eobjs/native/dune__exe__Fserv.{cmx,o} ocamlopt fserv.exe $ file _build/default/fserv.exe _build/default/fserv.exe: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=1991bb9f1d67807411c93f6fb6ec46b4a0ee8ed5, for GNU/Linux 3.2.0, with debug_info, not stripped $ ldd _build/default/fserv.exe linux-vdso.so.1 (0x00007ffe97690000) libssl.so.1.1 =&gt; /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007fd6cc636000) libcrypto.so.1.1 =&gt; /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007fd6cc342000) libev.so.4 =&gt; /usr/lib/x86_64-linux-gnu/libev.so.4 (0x00007fd6cc330000) libpthread.so.0 =&gt; /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fd6cc30e000) libm.so.6 =&gt; /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd6cc1ca000) libdl.so.2 =&gt; /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd6cc1c4000) libc.so.6 =&gt; /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd6cbffd000) /lib64/ld-linux-x86-64.so.2 (0x00007fd6cced7000) </code></pre> <p>(on macOS, replace <code>ldd</code> with <code>otool -L</code>; dune output is obtained with <code>(display short)</code> in <code>~/.config/dune/config</code>)</p> <p>So let's see how to change this result. Basically, here, <code>libev</code>, <code>libssl</code> and <code>libcrypto</code> are required shared libraries that may not be installed on every system, while all the others are part of the core system:</p> <ul> <li><code>linux-vdso</code>, <code>libdl</code> and <code>ld-linux</code> are concerned with the dynamic loading of shared objects ; </li> <li><code>libm</code> and <code>libpthread</code> are extensions of the core <code>libc</code> that are tightly bound to it, and always installed. </li> </ul> <h2>Statically linking the libraries</h2> <p>In simple cases, static linking can be turned on as easily as passing the <code>-static</code> flag to the C compiler: through OCaml you will need to pass <code>-cclib -static</code>. We can add that to our <code>dune</code> file:</p> <pre><code class="language-lisp">(executable (public_name fserv) (flags (:standard -cclib -static)) (libraries dream)) </code></pre> <p>... 
which gives:</p> <pre><code class="language-shell-session">$ dune build fserv.exe ocamlc .fserv.eobjs/byte/dune__exe__Fserv.{cmi,cmo,cmt} ocamlopt .fserv.eobjs/native/dune__exe__Fserv.{cmx,o} ocamlopt fserv.exe /usr/bin/ld: /usr/lib/gcc/x86_64-linuxgnu/10/../../../x86_64-linux-gnu/libcrypto.a(dso_dlfcn.o): in function `dlfcn_globallookup': (.text+0x13): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking /usr/bin/ld: ~/.opam/4.11.0/lib/ocaml/libunix.a(initgroups.o): in function `unix_initgroups': initgroups.c:(.text.unix_initgroups+0x1f): warning: Using 'initgroups' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking [...] $ file _build/default/fserv.exe _build/default/fserv.exe: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=9ee3ae1c24fbc291d1f580bc7aaecba2777ee6c2, for GNU/Linux 3.2.0, with debug_info, not stripped $ ldd _build/default/fserv.exe not a dynamic executable </code></pre> <p>The executable was generated... and the result <em>seems</em> OK, but we shouldn't skip all these <code>ld</code> warnings. Basically, what <code>ld</code> is telling us is that you shouldn't statically link <code>glibc</code> (it internally uses dynlinking, to libraries that also need <code>glibc</code> functions, and will therefore <strong>still</strong> need to dynlink a second version from the system 🤯).</p> <p>Indeed here, we have been statically linking a dynamic linking engine, among other things. Don't do it.</p> <h3>Linux solution: linking with musl instead of glibc</h3> <p>The easiest workaround at this point, on Linux, is to compile with <a href="http://musl.libc.org/">musl</a>, which is basically a glibc replacement that can be statically linked. There are some OCaml and gcc variants to automatically use musl (comments welcome if you have been successful with them!), but I have found the simplest option is to use a tiny Alpine Linux image through a Docker container. Here we'll use OCamlPro's <a href="https://hub.docker.com/r/ocamlpro/ocaml">minimal Docker images</a> but anything based on musl should do.</p> <pre><code class="language-shell-session">$ docker run --rm -it ocamlpro/ocaml:4.12 [...] ~/fserv $ sudo apk add openssl-libs-static (1/1) Installing openssl-libs-static (1.1.1l-r0) OK: 161 MiB in 52 packages ~/fserv $ opam switch create . --deps ocaml-system [...] ~/fserv $ eval $(opam env) ~/fserv $ dune build fserv.exe ~/fserv $ file _build/default/fserv.exe _build/default/fserv.exe: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, with debug_info, not stripped ~/fserv $ ldd _build/default/fserv.exe /lib/ld-musl-x86_64.so.1 (0x7ff41353f000) </code></pre> <p>Almost there! We see that we had to install extra packages with <code>apk add</code>: the static libraries might not be already installed and in this case are in a separate package (you would get <code>bin/ld: cannot find -lssl</code>). The last remaining dynamic loader in the output of <code>ldd</code> is because static PIE executable were not supported <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81498#c1">until recently</a>. 
To get rid of it, we just need to add <code>-cclib -no-pie</code> (note: a previous revision of this blog post mentioned <code>-static-pie</code> instead, which may work with recent compilers, but didn't seem to give reliable results):</p> <pre><code class="language-lisp">(executable (public_name fserv) (flags (:standard -cclib -static -cclib -no-pie)) (libraries dream)) </code></pre> <p>And we are good!</p> <pre><code class="language-shell-session">~/fserv $ file _build/default/fserv.exe _build/default/fserv.exe: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, with debug_info, not stripped ~/fserv $ ldd _build/default/fserv.exe /lib/ld-musl-x86_64.so.1: _build/default/fserv.exe: Not a valid dynamic program </code></pre> <blockquote> <p><strong>Trick</strong>: short script to compile through a Docker container</p> <p>Passing the context to a Docker container and getting the artefacts back can be bothersome and often causes file ownership issues, so I use the following snippet to pipe them to/from it using <code>tar</code>:</p> <pre><code class="language-bash">git ls-files -z | xargs -0 tar c | docker run --rm -i ocamlpro/ocaml:4.12 sh -uexc '{ tar x &amp;&amp; opam switch create . ocaml-system --deps-only --locked &amp;&amp; opam exec -- dune build --profile=release @install; } &gt;&amp;2 &amp;&amp; tar c -hC _build/install/default/bin .' | tar vx </code></pre> </blockquote> <h3>The other cases: turning to manual linking</h3> <p>Sometimes you can't use the above: the automatic linking options may need to be tweaked for static libraries, your app may still need dynlinking support at some point, or you may not have the musl option. On macOS, for example, the libc doesn't have a static version at all (and the <code>-static</code> option of <code>ld</code> is explicitly &quot;only used building the kernel&quot;). Let's get our hands dirty and see how to use a mixed static/dynamic linking scheme. First, we examine how OCaml does the linking usually:</p> <p>The linking options are passed automatically by OCaml, using information that is embedded in the <code>cm(x)a</code> files, for example:</p> <pre><code class="language-shell-session">$ ocamlobjinfo $(opam var lwt:lib)/unix/lwt_unix.cma |head File ~/.opam/4.11.0/lib/lwt/unix/lwt_unix.cma Force custom: no Extra C object files: -llwt_unix_stubs -lev -lpthread Extra C options: Extra dynamically-loaded libraries: -llwt_unix_stubs Unit name: Lwt_features Interfaces imported: c21c5d26416461b543321872a551ea0d Stdlib 1372e035e54f502dcc3646993900232f Lwt_features 3a3ca1838627f7762f49679ce0278ad1 CamlinternalFormatBasics </code></pre> <p>Now the linking flags, here <code>-llwt_unix_stubs -lev -lpthread</code>, let the C compiler choose the best way to link; in the case of stubs, they will be static (using the <code>.a</code> files — unless you make a special effort to use dynamic ones), but <code>-lev</code> will let the system linker select the shared library, because it is generally preferred.
Gathering these flags by hand would be tedious: my preferred trick is to just add the <code>-verbose</code> flag to OCaml (for the lazy, you can just set — temporarily — <code>OCAMLPARAM=_,verbose=1</code>):</p> <pre><code class="language-lisp">(executable (public_name fserv) (flags (:standard -verbose)) (libraries dream)) </code></pre> <pre><code class="language-shell-session">$ dune build ocamlc .fserv.eobjs/byte/dune__exe__Fserv.{cmi,cmo,cmt} ocamlopt .fserv.eobjs/native/dune__exe__Fserv.{cmx,o} + as -o '.fserv.eobjs/native/dune__exe__Fserv.o' '/tmp/build8eb7e5.dune/camlasm91a0b9.s' ocamlopt fserv.exe + as -o '/tmp/build8eb7e5.dune/camlstartupc9267f.o' '/tmp/build8eb7e5.dune/camlstartup1d9915.s' + gcc -O2 -fno-strict-aliasing -fwrapv -Wall -Wdeclaration-after-statement -fno-common -fexcess-precision=standard -fno-tree-vrp -ffunction-sections -D_FILE_OFFSET_BITS=64 -D_REENTRANT -DCAML_NAME_SPACE -Wl,-E -o 'fserv.exe' '-L~/.opam/4.11.0/lib/bigstringaf' '-L~/.opam/4.11.0/lib/ocaml' '-L~/.opam/4.11.0/lib/ocaml' '-L~/.opam/4.11.0/lib/ocaml' '-L~/.opam/4.11.0/lib/lwt/unix' '-L~/.opam/4.11.0/lib/cstruct' '-L~/.opam/4.11.0/lib/mirage-crypto' '-L~/.opam/4.11.0/lib/mirage-crypto-rng/unix' '-L~/.opam/4.11.0/lib/mtime/os' '-L~/.opam/4.11.0/lib/digestif/c' '-L~/.opam/4.11.0/lib/bigarray-overlap/stubs' '-L~/.opam/4.11.0/lib/ocaml' '-L~/.opam/4.11.0/lib/ssl' '-L~/.opam/4.11.0/lib/ocaml' '/tmp/build8eb7e5.dune/camlstartupc9267f.o' '~/.opam/4.11.0/lib/ocaml/std_exit.o' '.fserv.eobjs/native/dune__exe__Fserv.o' '~/.opam/4.11.0/lib/dream/dream.a' '~/.opam/4.11.0/lib/dream/sql/dream__sql.a' '~/.opam/4.11.0/lib/dream/http/dream__http.a' '~/.opam/4.11.0/lib/dream/websocketaf/websocketaf.a' '~/.opam/4.11.0/lib/dream/httpaf-lwt-unix/httpaf_lwt_unix.a' '~/.opam/4.11.0/lib/dream/httpaf-lwt/httpaf_lwt.a' '~/.opam/4.11.0/lib/dream/h2-lwt-unix/h2_lwt_unix.a' '~/.opam/4.11.0/lib/dream/h2-lwt/h2_lwt.a' '~/.opam/4.11.0/lib/dream/h2/h2.a' '~/.opam/4.11.0/lib/psq/psq.a' '~/.opam/4.11.0/lib/dream/httpaf/httpaf.a' '~/.opam/4.11.0/lib/dream/hpack/hpack.a' '~/.opam/4.11.0/lib/dream/gluten-lwt-unix/gluten_lwt_unix.a' '~/.opam/4.11.0/lib/lwt_ssl/lwt_ssl.a' '~/.opam/4.11.0/lib/ssl/ssl.a' '~/.opam/4.11.0/lib/dream/gluten-lwt/gluten_lwt.a' '~/.opam/4.11.0/lib/faraday-lwt-unix/faraday_lwt_unix.a' '~/.opam/4.11.0/lib/faraday-lwt/faraday_lwt.a' '~/.opam/4.11.0/lib/dream/gluten/gluten.a' '~/.opam/4.11.0/lib/faraday/faraday.a' '~/.opam/4.11.0/lib/dream/localhost/dream__localhost.a' '~/.opam/4.11.0/lib/dream/graphql/dream__graphql.a' '~/.opam/4.11.0/lib/ocaml/str.a' '~/.opam/4.11.0/lib/graphql-lwt/graphql_lwt.a' '~/.opam/4.11.0/lib/graphql/graphql.a' '~/.opam/4.11.0/lib/graphql_parser/graphql_parser.a' '~/.opam/4.11.0/lib/re/re.a' '~/.opam/4.11.0/lib/dream/middleware/dream__middleware.a' '~/.opam/4.11.0/lib/yojson/yojson.a' '~/.opam/4.11.0/lib/biniou/biniou.a' '~/.opam/4.11.0/lib/easy-format/easy_format.a' '~/.opam/4.11.0/lib/magic-mime/magic_mime_library.a' '~/.opam/4.11.0/lib/fmt/fmt_tty.a' '~/.opam/4.11.0/lib/multipart_form/lwt/multipart_form_lwt.a' '~/.opam/4.11.0/lib/dream/pure/dream__pure.a' '~/.opam/4.11.0/lib/hmap/hmap.a' '~/.opam/4.11.0/lib/multipart_form/multipart_form.a' '~/.opam/4.11.0/lib/rresult/rresult.a' '~/.opam/4.11.0/lib/pecu/pecu.a' '~/.opam/4.11.0/lib/prettym/prettym.a' '~/.opam/4.11.0/lib/bigarray-overlap/overlap.a' '~/.opam/4.11.0/lib/bigarray-overlap/stubs/overlap_stubs.a' '~/.opam/4.11.0/lib/base64/rfc2045/base64_rfc2045.a' '~/.opam/4.11.0/lib/unstrctrd/parser/unstrctrd_parser.a' 
'~/.opam/4.11.0/lib/unstrctrd/unstrctrd.a' '~/.opam/4.11.0/lib/uutf/uutf.a' '~/.opam/4.11.0/lib/ke/ke.a' '~/.opam/4.11.0/lib/fmt/fmt.a' '~/.opam/4.11.0/lib/base64/base64.a' '~/.opam/4.11.0/lib/digestif/c/digestif_c.a' '~/.opam/4.11.0/lib/stdlib-shims/stdlib_shims.a' '~/.opam/4.11.0/lib/dream/graphiql/dream__graphiql.a' '~/.opam/4.11.0/lib/dream/cipher/dream__cipher.a' '~/.opam/4.11.0/lib/mirage-crypto-rng/lwt/mirage_crypto_rng_lwt.a' '~/.opam/4.11.0/lib/mtime/os/mtime_clock.a' '~/.opam/4.11.0/lib/mtime/mtime.a' '~/.opam/4.11.0/lib/duration/duration.a' '~/.opam/4.11.0/lib/mirage-crypto-rng/unix/mirage_crypto_rng_unix.a' '~/.opam/4.11.0/lib/mirage-crypto-rng/mirage_crypto_rng.a' '~/.opam/4.11.0/lib/mirage-crypto/mirage_crypto.a' '~/.opam/4.11.0/lib/eqaf/cstruct/eqaf_cstruct.a' '~/.opam/4.11.0/lib/eqaf/bigstring/eqaf_bigstring.a' '~/.opam/4.11.0/lib/eqaf/eqaf.a' '~/.opam/4.11.0/lib/cstruct/cstruct.a' '~/.opam/4.11.0/lib/caqti-lwt/caqti_lwt.a' '~/.opam/4.11.0/lib/lwt/unix/lwt_unix.a' '~/.opam/4.11.0/lib/ocaml/threads/threads.a' '~/.opam/4.11.0/lib/ocplib-endian/bigstring/ocplib_endian_bigstring.a' '~/.opam/4.11.0/lib/ocplib-endian/ocplib_endian.a' '~/.opam/4.11.0/lib/mmap/mmap.a' '~/.opam/4.11.0/lib/ocaml/bigarray.a' '~/.opam/4.11.0/lib/ocaml/unix.a' '~/.opam/4.11.0/lib/logs/logs_lwt.a' '~/.opam/4.11.0/lib/lwt/lwt.a' '~/.opam/4.11.0/lib/caqti/caqti.a' '~/.opam/4.11.0/lib/uri/uri.a' '~/.opam/4.11.0/lib/angstrom/angstrom.a' '~/.opam/4.11.0/lib/bigstringaf/bigstringaf.a' '~/.opam/4.11.0/lib/bigarray-compat/bigarray_compat.a' '~/.opam/4.11.0/lib/stringext/stringext.a' '~/.opam/4.11.0/lib/ptime/ptime.a' '~/.opam/4.11.0/lib/result/result.a' '~/.opam/4.11.0/lib/logs/logs.a' '~/.opam/4.11.0/lib/ocaml/stdlib.a' '-lssl_stubs' '-lssl' '-lcrypto' '-lcamlstr' '-loverlap_stubs_stubs' '-ldigestif_c_stubs' '-lmtime_clock_stubs' '-lrt' '-lmirage_crypto_rng_unix_stubs' '-lmirage_crypto_stubs' '-lcstruct_stubs' '-llwt_unix_stubs' '-lev' '-lpthread' '-lthreadsnat' '-lpthread' '-lunix' '-lbigstringaf_stubs' '~/.opam/4.11.0/lib/ocaml/libasmrun.a' -lm -ldl </code></pre> <p>There is a lot of noise, but the interesting part is at the end, the <code>-l*</code> options before the standard <code>ocaml/libasmrun -lm -ldl</code>:</p> <pre><code class="language-shell-session"> '-lssl_stubs' '-lssl' '-lcrypto' '-lcamlstr' '-loverlap_stubs_stubs' '-ldigestif_c_stubs' '-lmtime_clock_stubs' '-lrt' '-lmirage_crypto_rng_unix_stubs' '-lmirage_crypto_stubs' '-lcstruct_stubs' '-llwt_unix_stubs' '-lev' '-lpthread' '-lthreadsnat' '-lpthread' '-lunix' '-lbigstringaf_stubs' </code></pre> <h4>Manually linking with glibc (Linux)</h4> <p>To link these statically, but the glibc dynamically:</p> <ul> <li>we disable the automatic generation of linking flags by OCaml with <code>-noautolink</code> </li> <li>we pass directives to the linker through OCaml and the C compiler, using <code>-cclib -Wl,xxx</code>. 
<code>-Bstatic</code> makes static linking the preferred option </li> <li>we escape the linking flags we extracted above through <code>-cclib</code> </li> </ul> <pre><code class="language-lisp">(executable (public_name fserv) (flags (:standard -noautolink -cclib -Wl,-Bstatic -cclib -lssl_stubs -cclib -lssl -cclib -lcrypto -cclib -lcamlstr -cclib -loverlap_stubs_stubs -cclib -ldigestif_c_stubs -cclib -lmtime_clock_stubs -cclib -lrt -cclib -lmirage_crypto_rng_unix_stubs -cclib -lmirage_crypto_stubs -cclib -lcstruct_stubs -cclib -llwt_unix_stubs -cclib -lev -cclib -lthreadsnat -cclib -lunix -cclib -lbigstringaf_stubs -cclib -Wl,-Bdynamic -cclib -lpthread)) (libraries dream)) </code></pre> <p>Note that <code>-lpthread</code> and <code>-lm</code> are tightly bound to the libc and can't be static in this case, so we moved <code>-lpthread</code> to the end, outside of the static section. The part between the <code>-Bstatic</code> and the <code>-Bdynamic</code> is what will be statically linked, leaving the defaults and the libc dynamic. Result:</p> <pre><code class="language-shell-session">$ dune build fserv.exe &amp;&amp; ldd _build/default/fserv.exe ocamlc .fserv.eobjs/byte/dune__exe__Fserv.{cmi,cmo,cmt} ocamlopt .fserv.eobjs/native/dune__exe__Fserv.{cmx,o} ocamlopt fserv.exe $ file _build/default/fserv.exe _build/default/fserv.exe: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=31c93085284da5d74002218b1d6b61c0efbdefe4, for GNU/Linux 3.2.0, with debug_info, not stripped $ ldd _build/default/fserv.exe linux-vdso.so.1 (0x00007ffe207c5000) libpthread.so.0 =&gt; /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f49d5e56000) libm.so.6 =&gt; /lib/x86_64-linux-gnu/libm.so.6 (0x00007f49d5d12000) libdl.so.2 =&gt; /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f49d5d0c000) libc.so.6 =&gt; /lib/x86_64-linux-gnu/libc.so.6 (0x00007f49d5b47000) /lib64/ld-linux-x86-64.so.2 (0x00007f49d69bf000) </code></pre> <p>The remaining are the base of the dynamic linking / shared object systems, but we got away with <code>libssl</code>, <code>libcrypto</code> and <code>libev</code>, which were the ones possibly absent from target systems. 
The resulting executable should work on any glibc-based Linux distribution that is recent enough; on older ones you will likely get missing <code>GLIBC</code> symbols.</p> <p>If you need to distribute that way, it's a good idea to compile on an old release (like Debian 'oldstable' or 'oldoldstable') for maximum portability.</p> <h4>Manually linking on macOS</h4> <p>Unfortunately, the linker on macOS doesn't seem to have options to select the static versions of the libraries; the only solution is to get our hands even dirtier, and link directly to the <code>.a</code> files, instead of using <code>-l</code> arguments.</p> <p>Most of the flags just link with stubs, we can keep them as is: <code>-lssl_stubs</code> <code>-lcamlstr</code> <code>-loverlap_stubs_stubs</code> <code>-ldigestif_c_stubs</code> <code>-lmtime_clock_stubs</code> <code>-lmirage_crypto_rng_unix_stubs</code> <code>-lmirage_crypto_stubs</code> <code>-lcstruct_stubs</code> <code>-llwt_unix_stubs</code> <code>-lthreadsnat</code> <code>-lunix</code> <code>-lbigstringaf_stubs</code></p> <p>That leaves us with: <code>-lssl</code> <code>-lcrypto</code> <code>-lev</code> <code>-lpthread</code></p> <ul> <li><code>lpthread</code> is built-in, we can ignore it </li> <li>for the others, we need to lookup the <code>.a</code> file: I use <em>e.g.</em> <pre><code class="language-shell-session">$ echo $(pkg-config libssl --variable libdir)/libssl.a ~/brew/Cellar/openssl@1.1/1.1.1k/lib/libcrypto.a </code></pre> </li> </ul> <p>Of course you don't want to hardcode these paths, but let's test for now:</p> <pre><code class="language-lisp">(executable (public_name fserv) (flags (:standard -noautolink -cclib -lssl_stubs -cclib -lcamlstr -cclib -loverlap_stubs_stubs -cclib -ldigestif_c_stubs -cclib -lmtime_clock_stubs -cclib -lmirage_crypto_rng_unix_stubs -cclib -lmirage_crypto_stubs -cclib -lcstruct_stubs -cclib -llwt_unix_stubs -cclib -lthreadsnat -cclib -lunix -cclib -lbigstringaf_stubs -cclib ~/brew/Cellar/openssl@1.1/1.1.1k/lib/libssl.a -cclib ~/brew/Cellar/openssl@1.1/1.1.1k/lib/libcrypto.a -cclib ~/brew/Cellar/libev/4.33/lib/libev.a)) (libraries dream)) </code></pre> <pre><code class="language-shell-session">$ dune build fserv.exe ocamlc .fserv.eobjs/byte/dune__exe__Fserv.{cmi,cmo,cmt} ocamlopt .fserv.eobjs/native/dune__exe__Fserv.{cmx,o} ocamlopt fserv.exe $ file _build/default/fserv.exe _build/default/fserv.exe: Mach-O 64-bit executable x86_64 $ otool -L _build/default/fserv.exe _build/default/fserv.exe: /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1292.60.1) </code></pre> <p>This is as good as it will get!</p> <h2>Cleaning up the build system</h2> <p>We have until now been adding the linking flags manually in the <code>dune</code> file; you probably don't want to do that and be restricted to static builds only! Not counting the non-portable link options we have been using...</p> <h3>The quick&amp;dirty way</h3> <p>Don't use this in your build system! But for quick testing you can conveniently pass flags to the OCaml compilers using the <code>OCAMLPARAM</code> variable. Combined with the tar/docker snippet above, we get a very simple static-binary generating command:</p> <pre><code class="language-bash">git ls-files -z | xargs -0 tar c | docker run --rm -i ocamlpro/ocaml:4.12 sh -uexc '{ tar x &amp;&amp; sudo apk add openssl-libs-static &amp;&amp; opam switch create . 
ocaml-system --deps-only --locked &amp;&amp; OCAMLPARAM=_,cclib=-static,cclib=-no-pie opam exec -- dune build --profile=release @install; } &gt;&amp;2 &amp;&amp; tar c -hC _build/install/default/bin .' | tar vx </code></pre> <p>Note that, for releases, you may also want to <code>strip</code> the generated binaries.</p> <h3>Making it an option of the build system (with dune)</h3> <p>For something you will want to commit, I recommend to generate the flags in a separate file <code>linking-flags-fserv.sexp</code>:</p> <pre><code class="language-lisp">(executable (public_name fserv) (flags (:standard (:include linking-flags-fserv.sexp))) (libraries dream)) </code></pre> <p>The linking flags will depend on the chosen linking mode and on the OS. For the OS, it's easiest to generate them through a script ; for the linking mode, I use an environment variable to optionally turn static linking on.</p> <pre><code class="language-lisp">(rule (with-stdout-to linking-flags-fserv.sexp (run ./gen-linking-flags.sh %{env:LINKING_MODE=dynamic} %{ocaml-config:system}))) </code></pre> <p>This will use the following <code>gen-linking-flags.sh</code> script to generate the file, passing it the value of <code>$LINKING_MODE</code> and defaulting to <code>dynamic</code>. Doing it this way also ensures that <code>dune</code> will properly recompile when the value of the environment variable changes.</p> <pre><code class="language-bash">#!/bin/sh set -ue LINKING_MODE=&quot;$1&quot; OS=&quot;$2&quot; FLAGS= CCLIB= case &quot;$LINKING_MODE&quot; in dynamic) ;; # No extra flags needed static) case &quot;$OS&quot; in linux) # Assuming Alpine here CCLIB=&quot;-static -no-pie&quot;;; macosx) FLAGS=&quot;-noautolink&quot; CCLIB=&quot;-lssl_stubs -lcamlstr -loverlap_stubs_stubs -ldigestif_c_stubs -lmtime_clock_stubs -lmirage_crypto_rng_unix_stubs -lmirage_crypto_stubs -lcstruct_stubs -llwt_unix_stubs -lthreadsnat -lunix -lbigstringaf_stubs&quot; LIBS=&quot;libssl libcrypto libev&quot; for lib in $LIBS; do CCLIB=&quot;$CCLIB $(pkg-config $lib --variable libdir)/$lib.a&quot; done;; *) echo &quot;No known static compilation flags for '$OS'&quot; &gt;&amp;2 exit 1 esac;; *) echo &quot;Invalid linking mode '$LINKING_MODE'&quot; &gt;&amp;2 exit 2 esac echo '(' for f in $FLAGS; do echo &quot; $f&quot;; done for f in $CCLIB; do echo &quot; -cclib $f&quot;; done echo ')' </code></pre> <p>Then you'll only have to run <code>LINKING_MODE=static dune build fserv.exe</code> to generate the static executable (wrapped in the Docker script above, in the case of Alpine), and can include that in your CI as well.</p> <p>For real-world examples, you can check <a href="https://github.com/ocaml-sf/learn-ocaml/blob/master/scripts/static-build.sh">learn-ocaml</a> or <a href="https://github.com/ocaml/opam/blob/master/release/Makefile">opam</a>.</p> <blockquote> <h2>Related topics</h2> <ul> <li><a href="https://github.com/ocaml/opam/releases/download/2.1.0/opam-2.1.0-x86_64-macos">reproducible builds</a> should be a goal when you intend to distribute pre-compiled binaries. </li> <li><a href="https://github.com/AltGr/opam-bundle">opam-bundle</a> is a different, heavy-weight approach to distributing opam software to non-OCaml developers, that retains the &quot;compile all from source&quot; policy but provides one big package that bootstraps OCaml, opam and all the dependencies with a single command.- </li> </ul> </blockquote> opam 2.1.0 is released! 
https://ocamlpro.com/blog/2021_08_04_opam_2.1.0_is_released 2021-08-04T13:48:57Z 2021-08-04T13:48:57Z David Allsopp (OCamlLabs) Raja Boujbel Louis Gesbert Feedback on this post is welcomed on Discuss! We are happy to announce the release of opam 2.1.0. Many new features made it in (see the pre-release changelogs or release notes for the details), but here are a few highlights. What's new in opam 2.1? Integration of system dependencies (formerly the op... <p><em>Feedback on this post is welcomed on <a href="https://discuss.ocaml.org/t/ann-opam-2-1-0/8255">Discuss</a>!</em></p> <p>We are happy to announce the release of opam 2.1.0.</p> <p>Many new features made it in (see the <a href="https://github.com/ocaml/opam/blob/2.1.0/CHANGES">pre-release changelogs</a> or <a href="https://github.com/ocaml/opam/releases">release notes</a> for the details), but here are a few highlights.</p> <h2>What's new in opam 2.1?</h2> <ul> <li>Integration of system dependencies (formerly the opam-depext plugin), increasing their reliability as it integrates the solving step </li> <li>Creation of lock files for reproducible installations (formerly the opam-lock plugin) </li> <li>Switch invariants, replacing the &quot;base packages&quot; in opam 2.0 and allowing for easier compiler upgrades </li> <li>Improved options configuration (see the new <code>option</code> and expanded <code>var</code> sub-commands) </li> <li>CLI versioning, allowing cleaner deprecations for opam now and also improvements to semantics in future without breaking backwards-compatibility </li> <li>opam root readability by newer and older versions, even if the format changed </li> <li>Performance improvements to opam-update, conflict messages, and many other areas </li> </ul> <h3>Seamless integration of System dependencies handling (a.k.a. &quot;depexts&quot;)</h3> <p>opam has long included the ability to install system dependencies automatically via the <a href="https://github.com/ocaml-opam/opam-depext">depext plugin</a>. This plugin has been promoted to a native feature of opam 2.1.0 onwards, giving the following benefits:</p> <ul> <li>You no longer have to remember to run <code>opam depext</code>, opam always checks depexts (there are options to disable this or automate it for CI use). Installation of an opam package in a CI system is now as easy as <code>opam install .</code>, without having to do the dance of <code>opam pin add -n/depext/install</code>. Just one command now for the common case! </li> <li>The solver is only called once, which both saves time and also stabilises the behaviour of opam in cases where the solver result is not stable. It was possible to get one package solution for the <code>opam depext</code> stage and a different solution for the <code>opam install</code> stage, resulting in some depexts missing. </li> <li>opam now has full knowledge of depexts, which means that packages can be automatically selected based on whether a system package is already installed. For example, if you have <em>neither</em> MariaDB nor MySQL dev libraries installed, <code>opam install mysql</code> will offer to install <code>conf-mysql</code> and <code>mysql</code>, but if you have the MariaDB dev libraries installed, opam will offer to install <code>conf-mariadb</code> and <code>mysql</code>. 
</li> </ul> <p><em>Hint: You can set <code>OPAMCONFIRMLEVEL=unsafe-yes</code> or <code>--confirm-level=unsafe-yes</code> to launch non interactive system package commands.</em></p> <h3>opam lock files and reproducibility</h3> <p>When opam was first released, it had the mission of gathering together scattered OCaml source code to build a <a href="https://github.com/ocaml/opam-repository">community repository</a>. As time marches on, the size of the opam repository has grown tremendously, to over 3000 unique packages with over 19500 unique versions. opam looks at all these packages and is designed to solve for the best constraints for a given package, so that your project can keep up with releases of your dependencies.</p> <p>While this works well for libraries, we need a different strategy for projects that need to test and ship using a fixed set of dependencies. To satisfy this use-case, opam 2.0.0 shipped with support for <em>using</em> <code>project.opam.locked</code> files. These are normal opam files but with exact versions of dependencies. The lock file can be used as simply as <code>opam install . --locked</code> to have a reproducible package installation.</p> <p>With opam 2.1.0, the creation of lock files is also now integrated into the client:</p> <ul> <li><code>opam lock</code> will create a <code>.locked</code> file for your current switch and project, that you can check into the repository. </li> <li><code>opam switch create . --locked</code> can be used by users to reproduce your dependencies in a fresh switch. </li> </ul> <p>This lets a project simultaneously keep up with the latest dependencies (without lock files) while providing a stricter set for projects that need it (with lock files).</p> <p><em>Hint: You can export the full configuration of a switch with <code>opam switch export</code> new options, <code>--full</code> to have all packages metadata included, and <code>--freeze</code> to freeze all VCS to their current commit.</em></p> <h3>Switch invariants</h3> <p>In opam 2.0, when a switch is created the packages selected are put into the “base” of the switch. These packages are not normally considered for upgrade, in order to ease pressure on opam's solver. This was a much bigger concern early on in opam 2.0's development, but is less of a problem with the default mccs solver.</p> <p>However, it's a problem for system compilers. opam would detect that your system compiler version had changed, but be unable to upgrade the ocaml-system package unless you went through a slightly convoluted process with <code>--unlock-base</code>.</p> <p>In opam 2.1, base packages have been replaced by switch invariants. The switch invariant is a package formula which must be satisfied on every upgrade and install. All existing switches' base packages could just be expressed as <code>package1 &amp; package2 &amp; package3</code> etc. but opam 2.1 recognises many existing patterns and simplifies them, so in most cases the invariant will be <code>&quot;ocaml-base-compiler&quot; {= &quot;4.11.1&quot;}</code>, etc. This means that <code>opam switch create my_switch ocaml-system</code> now creates a <em>switch invariant</em> of <code>&quot;ocaml-system&quot;</code> rather than a specific version of the <code>ocaml-system</code> package. If your system OCaml package is updated, <code>opam upgrade</code> will seamlessly switch to the new package.</p> <p>This also allows you to have switches which automatically install new point releases of OCaml. 
For example:</p> <pre><code class="language-shell-session">opam switch create ocaml-4.11 --formula='&quot;ocaml-base-compiler&quot; {&gt;= &quot;4.11.0&quot; &amp; &lt; &quot;4.12.0~&quot;}' --repos=old=git+https://github.com/ocaml/opam-repository#a11299d81591
opam install utop </code></pre> <p>This creates a switch with OCaml 4.11.0 (the <code>--repos=</code> was just to select a version of opam-repository from before 4.11.1 was released). Now issue:</p> <pre><code class="language-shell-session">opam repo set-url old git+https://github.com/ocaml/opam-repository
opam upgrade </code></pre> <p>and opam 2.1 will automatically offer to upgrade to OCaml 4.11.1 along with a rebuild of the switch. There's not yet a clean CLI for specifying the formula, but we intend to iterate further on this with future opam releases so that there is an easier way of saying “install OCaml 4.11.x”.</p> <p><em>Hint: You can set up a default invariant that will apply for all new switches, via a specific <code>opamrc</code>. The default one is <code>ocaml &gt;= 4.05.0</code>.</em></p> <h3>Configuring opam from the command-line</h3> <p>Configuring opam is not a simple task: you need to use an <code>opamrc</code> at init stage, or hack the global/switch config files, or use <code>opam config var</code> for additional variables. To ease that step, and permit more consistent opam config tweaking, a new command was added: <code>opam option</code>.</p> <p>For example:</p> <ul> <li><code>opam option download-jobs</code> gives the global <code>download-jobs</code> value (as it exists only in the global configuration) </li> <li><code>opam option jobs=6 --global</code> will set the number of parallel build jobs opam is allowed to run (along with the associated <code>jobs</code> variable) </li> <li><code>opam option depext-run-commands=false</code> disables the use of <code>sudo</code> for handling system dependencies; it will be replaced by a prompt to run the installation commands </li> <li><code>opam option depext-bypass=m4 --global</code> bypasses the <code>m4</code> system package check globally, while <code>opam option depext-bypass=m4 --switch myswitch</code> will only bypass it in the selected switch </li> </ul> <p>The command <code>opam var</code> is extended with the same format, acting on switch and global variables.</p> <p><em>Hint: to revert your changes use <code>opam option &lt;field&gt;=</code>; the field will go back to its default value.</em></p> <h3>CLI Versioning</h3> <p>A new <code>--cli</code> switch was added to the first beta release, but it's only now that it's being widely used. opam is a complex enough system that sometimes bug fixes need to change the semantics of some commands. For example:</p> <ul> <li><code>opam show --file</code> needed to change behaviour </li> <li>The addition of new controls for setting global variables means that <code>opam config</code> was becoming cluttered and some things needed to move to <code>opam var</code> </li> <li><code>opam switch install 4.11.1</code> still works in opam 2.0, but it's really OPAM 1.2.2 syntax. </li> </ul> <p>Changing the CLI is exceptionally painful since it can break scripts and tools which themselves need to drive <code>opam</code>. CLI versioning is our attempt to solve this. 
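In practice, a script or tool just states the CLI version it was written against, either with the new flag or through the environment. A minimal sketch (not taken from the announcement; both spellings are explained below):</p> <pre><code class="language-shell-session">$ opam var --cli=2.1 --global        # explicitly request the 2.1 CLI
$ OPAMCLI=2.0 opam config list       # spelling that also works for opam 2.0 scripts
</code></pre> <p>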
The feature is inspired by the <code>(lang dune ...)</code> stanza in <code>dune-project</code> files which has allowed the Dune project to rename variables and alter semantics without requiring every single package using Dune to upgrade their <code>dune</code> files on each release.</p> <p>Now you can specify which version of opam you expect the command to be run against. In day-to-day use of opam at the terminal, you wouldn't specify it, and you'll get the latest version of the CLI. For example: <code>opam var --global</code> is the same as <code>opam var --cli=2.1 --global</code>. However, if you issue <code>opam var --cli=2.0 --global</code>, you will be told that <code>--global</code> was added in 2.1 and so is not available to you. You can see similar things with the renaming of <code>opam upgrade --unlock-base</code> to <code>opam upgrade --update-invariant</code>.</p> <p>The intention is that <code>--cli</code> should be used in scripts, user guides (e.g. blog posts), and in software which calls opam. The only decision you have to take is the <em>oldest</em> version of opam which you need to support. If your script is using a new opam 2.1 feature (for example <code>opam switch create --formula=</code>) then you simply don't support opam 2.0. If you need to support opam 2.0, then you can't use <code>--formula</code> and should use <code>--packages</code> instead. opam 2.0 does not have the <code>--cli</code> option, so for opam 2.0 instead of <code>--cli=2.0</code> you should set the environment variable <code>OPAMCLI</code> to <code>2.0</code>. As with <em>all</em> opam command line switches, <code>OPAMCLI</code> is simply the equivalent of <code>--cli</code>, which opam 2.1 will pick up but opam 2.0 will quietly ignore (and, as with other options, the command line takes precedence over the environment).</p> <p>Note that opam 2.1 sets <code>OPAMCLI=2.0</code> when building packages, so on the rare occasions where you need to use the <code>opam</code> command in a <em>package</em> <code>build:</code> command (or in your build system), you <em>must</em> specify <code>--cli=2.1</code> if you're using new features.</p> <p>Since 2.1.0~rc2, CLI versioning applies to opam environment variables. The previous behavior was to ignore unknown or wrongly set environment variables; now you will get a warning letting you know that the variable won't be handled by this version of opam.</p> <p>To avoid breaking compatibility for some widely used deprecated options, a <em>default</em> CLI is introduced: when no CLI is specified, those deprecated options are accepted. This concerns the <code>opam exec</code> and <code>opam var</code> subcommands.</p> <p>There's even more detail on this feature <a href="https://github.com/ocaml/opam/wiki/Spec-for-opam-CLI-versioning">in our wiki</a>. We're hoping that this feature will make it much easier in future releases for opam to make required changes and improvements to the CLI without breaking existing set-ups and tools.</p> <p><em>Note: for users of the opam libraries, since 2.1 environment variables are no longer loaded by the libraries, only by the opam client. You need to load them explicitly.</em></p> <h3>opam root portability</h3> <p>The opam root format changes during opam's life-cycle: new fields are added or removed, new files are added; an older opam version sometimes can no longer read an upgraded or newly created opam root. 
opam root format has been updated to allow new versions of opam to indicate that the root may still be read by older versions of the opam libraries. A plugin compiled against the 2.0.9 opam libraries will therefore be able to read information about an opam 2.1 root (plugins and tools compiled against 2.0.8 are unable to load opam 2.1.0 roots). It is a <em>read-only</em> best effort access, any attempt to modify the opam root fails.</p> <p><em>Hint: for opam libraries users, you can safely load states with <a href="https://github.com/ocaml/opam/blob/master/src/state/opamStateConfig.mli"><code>OpamStateConfig</code></a> load functions.</em></p> <!-- _ change to the opam root format which allows new versions of opam to indicate that the root may still be read by older versions of the opam libraries. A plugin compiled against the 2.0.9 opam libraries will therefore be able to read information about an opam 2.1 root (plugins and tools compiled against 2.0.8 are unable to load opam 2.1.0 roots). _ --> <p><strong>Tremendous thanks to all involved people, who've developed, tested &amp; retested, helped with issue reports, comments, feedback...</strong></p> <h2>Try it!</h2> <p>In case you plan a possible rollback, you may want to first backup your <code>~/.opam</code> directory.</p> <p>The upgrade instructions are unchanged:</p> <ol> <li>Either from binaries: run </li> </ol> <pre><code class="language-shell-session">bash -c &quot;sh &lt;(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) --version 2.1.0&quot; </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.1.0">the Github &quot;Releases&quot; page</a> to your PATH.</p> <ol start="2"> <li>Or from source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.1.0#compiling-this-repo">README</a>. </li> </ol> <p>You should then run:</p> <pre><code class="language-shell-session">opam init --reinit -ni </code></pre> opam 2.0.9 release https://ocamlpro.com/blog/2021_08_03_opam_2.0.9_release 2021-08-03T13:48:57Z 2021-08-03T13:48:57Z Raja Boujbel Louis Gesbert Feedback on this post is welcomed on Discuss! We are pleased to announce the minor release of opam 2.0.9. This new version contains some back-ported fixes. New features Back-ported ability to load upgraded roots read-only; allows applications compiled with opam-state 2.0.9 to load a root which has b... 
<p><em>Feedback on this post is welcomed on <a href="https://discuss.ocaml.org/t/ann-opam-2-1-0/8255">Discuss</a>!</em></p> <p>We are pleased to announce the minor release of <a href="https://github.com/ocaml/opam/releases/tag/2.0.9">opam 2.0.9</a>.</p> <p>This new version contains some <a href="https://github.com/ocaml/opam/pull/4547">back-ported</a> fixes.</p> <h2>New features</h2> <ul> <li>Back-ported ability to load upgraded roots read-only; allows applications compiled with opam-state 2.0.9 to load a root which has been upgraded to opam 2.1 [<a href="https://github.com/ocaml/opam/issues/4636">#4636</a>] </li> <li>macOS sandbox now supports <code>OPAM_USER_PATH_RO</code> for adding a custom read-only directory to the sandbox [<a href="https://github.com/ocaml/opam/issues/4589">#4589</a>, <a href="https://github.com/ocaml/opam/issues/4609">#4609</a>] </li> <li><code>OPAMROOT</code> and <code>OPAMSWITCH</code> now reflect the <code>--root</code> and <code>--switch</code> parameters in the package build [<a href="https://github.com/ocaml/opam/issues/4668">#4668</a>] </li> <li>When built with opam-file-format 2.1.3+, opam-format 2.0.x displays better errors for newer opam files [<a href="https://github.com/ocaml/opam/issues/4394">#4394</a>] </li> </ul> <h2>Bug fixes</h2> <ul> <li>Linux sandbox now mounts <em>host</em> <code>$TMPDIR</code> read-only, then sets the <em>sandbox</em> <code>$TMPDIR</code> to a new separate tmpfs. <strong>Hardcoded <code>/tmp</code> access no longer works if <code>TMPDIR</code> points to another directory</strong> [<a href="https://github.com/ocaml/opam/issues/4589">#4589</a>] </li> <li>Stop clobbering <code>DUNE_CACHE</code> in the sandbox script [<a href="https://github.com/ocaml/opam/issues/4535">#4535</a>, fixing <a href="https://github.com/ocaml/dune/issues/4166">ocaml/dune#4166</a>] </li> <li>Ctrl-C now correctly terminates builds with bubblewrap; sandbox now requires bubblewrap 0.1.8 or later [<a href="https://github.com/ocaml/opam/issues/4400">#4400</a>] </li> <li>Linux sandbox script no longer makes <code>PWD</code> read-write on remove actions [<a href="https://github.com/ocaml/opam/issues/4589">#4589</a>] </li> <li>Lint W59 and E60 no longer trigger for packages flagged <code>conf</code> [<a href="https://github.com/ocaml/opam/issues/4549">#4549</a>] </li> <li>Reduce the length of temporary file names for pin caching to ease pressure on Windows [<a href="https://github.com/ocaml/opam/issues/4590">#4590</a>] </li> <li>Security: correct quoting of arguments when removing switches [<a href="https://github.com/ocaml/opam/issues/4707">#4707</a>] </li> <li>Stop advertising the removed option <code>--compiler</code> when creating local switches [<a href="https://github.com/ocaml/opam/issues/4718">#4718</a>] </li> <li>Pinning no longer fails if the archive's opam file is malformed [<a href="https://github.com/ocaml/opam/issues/4580">#4580</a>] </li> <li>Fish: stop using deprecated <code>^</code> syntax to fix support for Fish 3.3.0+ [<a href="https://github.com/ocaml/opam/issues/4736">#4736</a>] </li> </ul> <hr /> <p>Installation instructions (unchanged):</p> <ol> <li>From binaries: run </li> </ol> <pre><code class="language-shell-session">bash -c &quot;sh &lt;(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) --version 2.0.9&quot; </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.0.9">the Github &quot;Releases&quot; page</a> to your PATH. 
In this case, don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update your sandbox script.</p> <ol start="2"> <li>From source, using opam: </li> </ol> <pre><code class="language-shell-session">opam update; opam install opam-devel </code></pre> <p>(then copy the opam binary to your PATH as explained, and don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update your sandbox script)</p> <ol start="3"> <li>From source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.0.9#compiling-this-repo">README</a>. </li> </ol> <p>We hope you enjoy this new minor version, and remain open to <a href="https://github.com/ocaml/opam/issues">bug reports</a> and <a href="https://github.com/ocaml/opam/issues">suggestions</a>.</p> Detecting identity functions in Flambda https://ocamlpro.com/blog/2021_07_16_detecting_identity_functions_in_flambda 2021-07-16T13:48:57Z 2021-07-16T13:48:57Z Leo Boitel In some discussions among OCaml developers around the empty type (PR#9459), some people mused about the possibility of annotating functions with an attribute telling the compiler that the function should be trivial, and always return a value strictly equivalent to its argument.Curious about the feas... <blockquote> <p>In some discussions among OCaml developers around the empty type (<a href="https://github.com/ocaml/ocaml/issues/9459">PR#9459</a>), some people mused about the possibility of annotating functions with an attribute telling the compiler that the function should be trivial, and always return a value strictly equivalent to its argument.<br>Curious about the feasibility of implementing this feature, we advertised an internship with our compiler team aimed at exploring this subject.<br>We welcomed Léo Boitel for three months to work on this topic, with Vincent Laviron as mentor, and we're proud to let him show off what he has achieved in this post.</p> </blockquote> <h3>The problem at hand</h3> <p>OCaml's strong typing system is one of its main perks: it makes it possible to write safe code thanks to the abstraction it provides. Most basic design mistakes will directly result in a typing error, and the user cannot mess up the memory as it is automatically handled by the compiler and runtime.</p> <p>However, these perks keep a power-user from implementing some optimizations, in particular those linked to memory representation, as it cannot be accessed directly.</p> <p>A good example would be this piece of code:</p> <pre><code class="language-Ocaml">type return = Ok of int | Failure

let id = function
  | Some x -&gt; Ok x
  | None -&gt; Failure </code></pre> <p>In terms of memory representation, this function is indeed the identity function. <code>Some x</code> and <code>Ok x</code> share the same representation (and so do <code>None</code> and <code>Failure</code>). However, this identity is invisible to the user. Even if the user knows the representation is the same, they would need to use this function to avoid a typing error.</p> <p>Another good example would be this one:</p> <pre><code class="language-Ocaml">type record = { a:int; b:int }

let id (x,y) = { a = x; b = y } </code></pre> <p>Even if those functions are the identity, they come with a cost: not only do they cost a function call, they reallocate the result instead of just returning their argument directly. 
Detecting those functions would allow us to produce interesting optimizations.</p> <h3>Hurdles</h3> <p>If we want to detect identities, we quickly hit the problem of recursive functions: how does one recognize identity in those cases? Can a function be an identity if it doesn't always terminate, or if it never does?</p> <p>Once we have a good definition of what exactly an identity function is, we still need to prove that an existing function fits the definition. Indeed, we want to assure the user that this optimization will not change the observable behavior of the program.</p> <p>We also want to avoid breaking type safety. Consider, for example, the following function:</p> <pre><code class="language-Ocaml">let rec fake_id = function
  | [] -&gt; 0
  | t::q -&gt; fake_id (t::q) </code></pre> <p>A naive induction proof would allow us to replace this function with the identity, as <code>[]</code> and <code>0</code> share the same memory representation. However, this is unsafe as applying this to a non-empty list would return a list even if this function has an <code>int</code> type (we'll talk more about it later).</p> <p>To tackle those challenges, we started the internship with a theoretical study that took up three fourths of the allocated time, and then implemented a practical solution in the Flambda representation of the compiler.</p> <h3>Theoretical results</h3> <p>We worked on extensions of lambda-calculus (implemented in OCaml) in order to gradually test our ideas in a simpler framework than the full Flambda.</p> <h4>Pairs</h4> <p>We started with a lambda-calculus to which we only added the concept of pairs. To prove identities, every function has to be annotated as identity or not. We then prove these annotations by β-reducing the function bodies. After each recursive reduction, we apply a rule that states that a pair made of the first and second projection of a variable is equal to that variable. We do not reduce applications, but we replace them by the argument if the function concerned is annotated as an identity.</p> <p>Using this method, we maintain a reasonable complexity compared to a full β-reduction which would be unrealistic on a big program.</p> <p>We then add higher-order capabilities by allowing annotations of the form <code>Annotation → Annotation</code>. Functions such as <code>List.map</code> can thus be abstracted as <code>Id → Id</code>. Even though this solution doesn't cover every case, most real-world uses are recognized by these patterns.</p> <h4>Tuple reconstruction</h4> <p>We then move from just pairs to tuples of arbitrary size. This adds a new problem: if we make a pair out of the first two fields of a variable, this is no longer necessarily that variable, as it may have more than two fields.</p> <p>We then have two solutions: we can first annotate every projection with the size of the involved tuple to know if we are indeed reconstructing the entire variable. As an example, if we make a pair from the fields of a triplet, we know there is no way to simplify this reconstruction.</p> <p>Another solution, more ambitious, is to adopt a less restrictive definition of equality and to allow the replacement of <code>(x,y)</code> by <code>(x,y,z)</code>. Indeed, if the variable was typed as a pair, we are guaranteed that the third field will never be accessed. 
The behavior of the program will therefore never be affected by this extension.</p> <p>Though this avoids a lot of allocations, it may also increase memory usage in some cases: if the triplet ceases to be used, it won't be deallocated by the Garbage Collector (GC) and the field <code>z</code> will be kept in memory as long as <code>(x,y)</code> is still accessible.</p> <p>This approach remains interesting to us, as long as it is manually enabled by the user for some specific blocks.</p> <h4>Recursion</h4> <p>We now add recursive definitions to our language, through the use of a fixpoint operator.</p> <p>To prove that a recursive function is the identity, we have to use induction. The main difficulty is to prove that the function indeed terminates to ensure the validity of the induction.</p> <p>We can separate this into three different levels of safety. The first option is to not prove termination at all, and let the user state which functions they know terminate. We can then assume the function is the identity and simplify its body under that hypothesis. This approach is enough for most practical cases, but its main problem lies in the fact that it allows writing unsafe code (as we've already seen).</p> <p>Our second option is to limit our induction hypothesis to recursive applications on &quot;smaller&quot; elements than the argument. An element is defined as smaller if it is a projection of the argument or a projection of a smaller element. This is not enough to prove that the function will terminate (the argument might be cyclic, for example) but is enough to ensure type safety. The reason is that any possibly returned value is constructed (as it cannot directly come from a recursive call) and therefore has a defined type. Typing would fail if the function was to return a value that cannot be identified with its argument.</p> <p>Finally, we may want to establish a perfect equivalence between the function and the identity function before simplifying it. In that case, we propose to create a special annotation for functions that are the identity when applied to a non-cyclical object. We can prove they have this property with the already described induction. The difficulty now lies in applying the simplification only to valid applications: if an object is immutable, wasn't recursively defined and is made of components that also have that property, we can declare that object inductive and simplify applications on it. The inductive state of variables can be propagated during our recursive pass of optimization.</p> <h3>Block reconstruction</h3> <p>The representation of blocks in Flambda provides interesting challenges in terms of equality detection, which is often crucial to prove an identity. It is very hard to detect an identical block reconstruction.</p> <h4>Blocks in Flambda</h4> <h5>Variants</h5> <p>The blocks in Flambda come from the existence of variants in OCaml: one type may have several different constructors, as we can see in:</p> <pre><code class="language-Ocaml">type choice = A of int | B of int </code></pre> <p>When OCaml is compiled to Flambda, the information about which constructor was used is lost and replaced by a tag. The tag is a number between 0 and 255, stored in the header of the object's memory representation, that indicates which constructor was used. 
As an example, an element of type <code>choice</code> would have tag <code>0</code> for the <code>A</code> constructor, and <code>1</code> for <code>B</code>.</p> <p>That tag will be kept at runtime, which makes it possible, for example, to implement pattern matching as a simple switch in Flambda that performs comparisons on the tag to know which branch to execute next.</p> <p>This system complicates our task, as Flambda's typing doesn't tell us which constructor a variant contains, and therefore keeps us from easily knowing whether two variants are indeed equal.</p> <h5>Tag generalization</h5> <p>To complicate things, tags are actually used for any block, meaning tuples, modules or functions (as a matter of fact, almost anything but constant constructors and integers). If the object is not a variant, it will usually have tag 0. This tag is never read (as there are no variants to differentiate) but keeps us from simply comparing two tuples, because Flambda will simply see two blocks of unknown tag.</p> <h5>Inlining</h5> <p>Finally, this system is optimized by inlining tuples: if a variant has a shape <code>Pair of int * int</code>, it will often be flattened into a tuple <code>(tag Pair, int, int)</code>.</p> <p>This also means that variants can have an arbitrary size, which is also unknown in Flambda.</p> <h4>Existing approach</h4> <p>A partial solution to the problem already existed in a Pull Request (PR) you can read <a href="https://github.com/ocaml/ocaml/pull/8958">here</a>.</p> <p>The chosen approach in this PR is the natural one: we use the switch to gain information on the tag of a block, depending on the branch taken. The PR also makes it possible to know the mutability and size of the block in each branch, starting from OCaml (where this information is known as it is explicit in the pattern matching) and propagating the knowledge to Flambda.</p> <p>This lets us register every block on which a switch is performed, along with its tag, size and mutability. We can then detect if one of them is reconstructed with the use of a <code>Pmakeblock</code> primitive.</p> <p>Unfortunately, this path has its limits as there are numerous cases where the tag and size could be known without performing a switch on the value. As an example, this doesn't allow the simplification of tuple reconstruction.</p> <h4>New solution</h4> <p>Our new solution will have to propagate more information from OCaml into Flambda. This propagation is based on two PRs that already existed for Flambda 2, which annotated each projection (<code>Pfield</code>) in the lambda representation with typing information. We add <a href="https://github.com/ocaml-flambda/ocaml/commit/fa5de9e64ff1ef04b596270a8107d1f9dac9fb2d">block mutability</a> and <a href="https://github.com/ocaml-flambda/ocaml/pull/53">tag and finally size</a>.</p> <p>Our first contribution was to translate these PRs to Flambda 1, and to propagate the information from lambda to Flambda correctly.</p> <p>We then had access to all the information necessary to detect and prove block reconstruction: not only do we have a list of blocks that were pattern-matched, we can make a list of partially immutable blocks, meaning blocks for which we know that some fields are immutable.</p> <p>Here's how we use it:</p> <h6>Block discovery</h6> <p>As soon as we find a projection, we verify whether it is done on an immutable block of known size. If so, we add that block to the list of partial blocks. 
We verify that the information we have on the tag and size are compatible with the already known projections. If all of the fields of the block are known, the block is added to the list of simplifiable blocks.</p> <p>Of course, we also keep track of known blocks though switches.</p> <h6>Simplification</h6> <p>This part is similar to the original PR: when an immutable block is met, we check whether this block is known as simplifiable. In that case we avoid a reallocation.</p> <p>Compared to the original approach, we also reduced the asymptotic complexity (from quadratic to linear) by registering the association of every projection variable to its index and original block. We also modified some implementation details that could have triggered a bug when associated with our PR.</p> <h4>Example</h4> <p>Let's consider this function:</p> <pre><code class="language-Ocaml">type typ1 = A of int | B of int * int type typ2 = C of int | D of {x:int; y:int} let id = function | A n -&gt; C n | B (x,y) -&gt; D {x; y} </code></pre> <p>The current compiler would produce the resulting Flambda output:</p> <pre><code>End of middle end: let_symbol (camlTest__id_21 (Set_of_closures ( (set_of_closures id=Test.8 (id/5 = fun param/7 -&gt; (switch*(0,2) param/7 case tag 0: (let (Pmakeblock_arg/11 (field 0&lt;{../../test.ml:4,4-7}&gt; param/7) Pmakeblock/12 (makeblock 0 (int)&lt;{../../test.ml:4,11-14}&gt; Pmakeblock_arg/11)) Pmakeblock/12) case tag 1: (let (Pmakeblock_arg/15 (field 1&lt;{../../test.ml:5,4-11}&gt; param/7) Pmakeblock_arg/16 (field 0&lt;{../../test.ml:5,4-11}&gt; param/7) Pmakeblock/17 (makeblock 1 (int,int)&lt;{../../test.ml:5,17-23}&gt; Pmakeblock_arg/16 Pmakeblock_arg/15)) Pmakeblock/17))) free_vars={ } specialised_args={}) direct_call_surrogates={ } set_of_closures_origin=Test.1]))) (camlTest__id_5_closure (Project_closure (camlTest__id_21, id/5))) (camlTest (Block (tag 0, camlTest__id_5_closure))) End camlTest </code></pre> <p>Our optimization allows to detect that this function reconstructs a similar block and therefore can simplify it:</p> <pre><code>End of middle end: let_symbol (camlTest__id_21 (Set_of_closures ( (set_of_closures id=Test.7 (id/5 = fun param/7 -&gt; (switch*(0,2) param/7 case tag 0 (1): param/7 case tag 1 (2): param/7)) free_vars={ } specialised_args={}) direct_call_surrogates={ } set_of_closures_origin=Test.1]))) (camlTest__id_5_closure (Project_closure (camlTest__id_21, id/5))) (camlTest (Block (tag 0, camlTest__id_5_closure))) End camlTest </code></pre> <h4>Possible improvements</h4> <h5>Equality relaxation</h5> <p>We can use observational equality studied in the theoretical part for block equality in order to avoid more allocations. The implementation is simple:</p> <p>When a block is created, to know if it will be allocated, the normal course of action is to check if all of its fields are the known projections of another block, with the same index, and if the block sizes are the same. We can just remove that last check.</p> <p>Implementing this was a bit more tricky because of several practical details. First, we want that optimization to be only triggered on user-annotated blocks, we had to propagate that annotation to Flambda.</p> <p>Additionally, if we only implement that optimization, numerous optimization cases will be ignored because unused variables are simplified before our optimization pass. 
As an example, if a function looks like</p> <pre><code class="language-Ocaml">let loose_id (a,b,c) = (a,b) </code></pre> <p>The <code>c</code> variable will be simplified away before reaching Flambda, and there will be no way to prove that <code>(a,b,c)</code> is immutable as its third field could not be. This problem is being solved on Flambda2 thanks to a PR that propagates mutability information for every block, but we didn't have the time necessary to migrate it on Flambda 1.</p> <h3>Detecting recursive identities</h3> <p>Now that we can detect block reconstruction, we're left with solving the problem of recursive functions.</p> <h4>Unsafe approach</h4> <p>We began the implementation of a pass that contains no termination proof. The idea is to add the proof later, or to authorize non-terminating functions to be simplified as long as they type correctly (see previously in the theory part).</p> <p>For now, we trust the user to verify these properties manually.</p> <p>Hence, we modified the function simplification procedure: when a function with a single argument is modified, we first assume that this function is the identity before simplifying its body. We then check whether the result is equivalent to an identity by recursively going through it, so as to cover as many cases as possible (for example in conditional branchings). If it is the case, the function will be replaced by the identity ; otherwise, we go back to a normal simplification, without the induction hypothesis.</p> <h4>Constant propagation</h4> <p>We took some time to improve our code that checks whether the body of a function is an identity or not, so as to handle constant values. It propagates identity information we have on an argument during conditional branching.</p> <p>This way, on a function like</p> <pre><code class="language-Ocaml">type truc = A | B | C let id = function | A -&gt; A | B -&gt; B | C -&gt; C </code></pre> <p>or even</p> <pre><code class="language-Ocaml">let id x = if x=0 then 0 else x </code></pre> <p>We can successfully detect identity.</p> <h4>Examples</h4> <h5>Recursive functions</h5> <p>We can now detect recursive identities:</p> <pre><code class="language-Ocaml">let rec listid = function | t::q -&gt; t::(listid q) | [] -&gt; [] </code></pre> <p>Used to compile to:</p> <pre><code>End of middle end: let_rec_symbol (camlTest__listid_5_closure (Project_closure (camlTest__set_of_closures_20, listid/5))) (camlTest__set_of_closures_20 (Set_of_closures ( (set_of_closures id=Test.11 (listid/5 = fun param/7 -&gt; (if param/7 then begin (let (apply_arg/13 (field 1&lt;{../../test.ml:9,4-8}&gt; param/7) apply_funct/14 camlTest__listid_5_closure Pmakeblock_arg/15 *(apply*&amp;#091;listid/5]&lt;{../../test.ml:9,15-25}&gt; apply_funct/14 apply_arg/13) Pmakeblock_arg/16 (field 0&lt;{../../test.ml:9,4-8}&gt; param/7) Pmakeblock/17 (makeblock 0&lt;{../../test.ml:9,12-25}&gt; Pmakeblock_arg/16 Pmakeblock_arg/15)) Pmakeblock/17) end else begin (let (const_ptr_zero/27 Const(0a)) const_ptr_zero/27) end)) free_vars={ } specialised_args={}) direct_call_surrogates={ } set_of_closures_origin=Test.1]))) let_symbol (camlTest (Block (tag 0, camlTest__listid_5_closure))) End camlTest </code></pre> <p>But is now detected as being the identity:</p> <pre><code>End of middle end: let_symbol (camlTest__set_of_closures_20 (Set_of_closures ( (set_of_closures id=Test.13 (listid/5 = fun param/7 -&gt; param/7) free_vars={ } specialised_args={}) direct_call_surrogates={ } set_of_closures_origin=Test.1]))) (camlTest__listid_5_closure 
(Project_closure (camlTest__set_of_closures_20, listid/5))) (camlTest (Block (tag 0, camlTest__listid_5_closure))) End camlTest </code></pre> <h5>Unsafe example</h5> <p>However, we can use the unsafety of the feature to go around the typing system and access a memory address as if it was an integer:</p> <pre><code class="language-Ocaml">type bugg = A of int*int | B of int let rec bug = function | A (a,b) -&gt; (a,b) | B x -&gt; bug (B x) let (a,b) = (bug (B 42)) let _ = print_int b </code></pre> <p>This function will be simplified to the identity even though the <code>bugg</code> type is not compatible with tuples; trying to project on the second field of variant b will access an undefined part of memory:</p> <pre><code>$ ./unsafe.out 47423997875612 </code></pre> <h4>Possible improvements - short term</h4> <h5>Function annotation</h5> <p>A theoretically simple thing to add would be to let the choice of applying unsafe optimizations to the user. We lacked the time to do the work of propagating the information to Flambda, but it would not be hard to implement.</p> <h5>Order on arguments</h5> <p>For a safer optimization, we could use the idea developed in the theoretical part to make the optimization correct on non-cyclical objects and more importantly give us typing guarantees to avoid the problem we just saw.</p> <p>To get this guarantee, we would have to change the simplification pass by adding an optional pair of function-argument to the environment. When this option exists, the pair indicates that we are in the body in the process of simplification and that applications on smaller elements can be simplified as identity. Of course, the pass would need to be modified to remember which elements are not smaller than the previous argument.</p> <h4>Possible improvements - long term</h4> <h5>Exclusion of cyclical objects</h5> <p>As described in the theoretical part, we could recursively deduce which objects are cyclical and attempt to remove them from our optimization. The problem is then that instead of having to replace functions by the identity, we need to add a special annotation that represents <code>IdRec</code>.</p> <p>This amounts to a lot of added implementation complexity when compiling over several files, as we need access to the interface of already compiled files to know when the optimization can be used.</p> <p>A possibility would be to use .cmx files to store this information when the file is compiled, but that kind of work would have taken too long to be achieved during the internship. Moreover, the practicality of that choice is far from obvious: it would complexify the optimization pass for a small improvement with respect to a version that would be correct on non-cyclical objects and activated through annotations.</p> Détection de fonctions d’identité dans Flambda https://ocamlpro.com/blog/2021_07_15_fr_detection_de_fonctions_didentite_dans_flambda 2021-07-15T13:48:57Z 2021-07-15T13:48:57Z Leo Boitel Au cours de discussions parmi les développeurs OCaml sur le type vide (PR#9459), certains caressaient l’idée d’annoter des fonctions avec un attribut indiquant au compilateur que la fonction devrait être triviale, et toujours renvoyer une valeur strictement équivalente à son argument. Nous ... 
<blockquote> <p>Au cours de discussions parmi les développeurs OCaml sur le type vide (<a href="https://github.com/ocaml/ocaml/issues/9459">PR#9459</a>), certains caressaient l’idée d’annoter des fonctions avec un attribut indiquant au compilateur que la fonction devrait être triviale, et toujours renvoyer une valeur strictement équivalente à son argument. Nous étions curieux de voir si l’implémentation d’une telle fonctionnalité serait possible et nous avons publié une offre de stage pour explorer ce sujet. L’équipe Compilation d’OCamlPro a ainsi accueilli Léo Boitel durant trois mois pour se consacrer à ce sujet, avec Vincent Laviron pour encadrant. Nous sommes fiers des résultats auxquels Léo a abouti !</p> <p>Voici ce que Léo en a écrit 🙂</p> </blockquote> <h3>Description du problème</h3> <p>Le typage fort d’OCaml est un de ses grands avantages : il permet d’écrire du code plus sûr grâce à la capacité d’abstraction qu’il offre. La plupart des erreurs de conception se traduiront directement en erreur de typage, et l’utilisateur ne peut pas faire d’erreur avec la manipulation de la mémoire puisqu’elle est entièrement gérée par le compilateur.</p> <p>Cependant, ces avantages empêchent l’utilisateur de faire certaines optimisations lui-même, en particulier celles liées aux représentations mémoires puisqu’il n’y accède pas directement.</p> <p>Un cas classique serait le suivant :</p> <pre><code class="language-Ocaml">type return = Ok of int | Failure let id = function | Some x -&gt; Ok x | None -&gt; Failure </code></pre> <p>Cette fonction est une identité, car la représentation mémoire de <code>Some x</code> et de <code>Ok x</code> est la même (idem pour <code>None</code> et <code>Failure</code>). Cependant, l’utilisateur ne le voit pas, et même s’il le voyait, il aurait besoin de cette fonction pour conserver un typage correct.</p> <p>Un autre exemple serait le suivant: Another good example would be this one:</p> <pre><code class="language-Ocaml">type record = { a:int; b:int } let id (x,y) = { a = x; b = y } </code></pre> <p>Même si ces fonctions sont des identités, elles ont un coût : en plus de nous coûter un appel, elles réallouent le résultat au lieu de nous retourner leur argument directement. C’est pourquoi leur détection permettrait des optimisations intéressantes.</p> <h3>Difficultés</h3> <p>Si on veut pouvoir détecter les identités, on se heurte rapidement au problème des fonctions récursives : comment définir l’identité pour ces dernières ? Est-ce qu’une fonction peut-être l’identité si elle ne termine pas toujours, voire jamais ?</p> <p>Une fois qu’on a défini l’identité, le problème est la preuve qu’une fonction est bien l’identité. En effet, on veut garantir à l’utilisateur que cette optimisation ne changera pas le comportement observable du programme.</p> <p>On veut aussi éviter d’ajouter des failles de sûreté au typage. Par exemple, si on a une fonction de la forme suivante:</p> <pre><code class="language-Ocaml">let rec fake_id = function | [] -&gt; 0 | t::q -&gt; fake_id (t::q) </code></pre> <p>Une preuve naïve par induction nous ferait remplacer cette fonction par l’identité, car <code>[]</code> et <code>0</code> ont la même représentation mémoire. 
C’est dangereux car le résultat d’une application à une liste non-vide sera une liste alors qu’il est typé comme un entier (voir exemples plus bas).</p> <p>Pour résoudre ces problèmes, nous avons commencé par une partie théorique qui a occupé les trois quarts du stage, pour finir par une partie pratique d’implémentation dans Flambda.</p> <h3>Résultats théoriques</h3> <p>Pour cette partie, nous avons travaillé sur des extensions de lambda-calcul, implémentées en OCaml, pour pouvoir tester nos idées au fur et à mesure dans un cadre plus simple que Flambda.</p> <h4>Paires</h4> <p>Nous avons commencé par un lambda calcul auquel on ajoute seulement des paires. Pour effectuer nos preuves, on annote toutes les fonctions comme des identités ou non. On prouve ensuite ces annotations en β-réduisant le corps des fonctions. Après chaque réduction récursive, on applique une règle qui dit qu’une paire composée des deux projections d’une variable est égale à la variable. On ne réduit pas les applications, mais on les remplace par l’argument si la fonction est annotée comme une identité.</p> <p>On garde ainsi une complexité raisonnable par rapport à une β-réduction complète qui serait évidemment irréaliste pour de gros programmes.</p> <p>On passe ensuite à l’ordre supérieur en permettant des annotations de la forme <code>Annotation → Annotation</code>. Les fonctions comme <code>List.map</code> peuvent donc être représentées comme <code>Id → Id</code>. Bien que cette solution ne soit pas complète, elle couvre la grande majorité des cas d’utilisation.</p> <h4>Reconstruction de tuples</h4> <p>On passe ensuite des paires aux tuples de taille arbitraire. Cela complexifie le problème : si on construit une paire à partir des projections des deux premiers champs d’une variable, ce n’est pas forcément la variable, puisqu’elle peut avoir plus de champs.</p> <p>On a alors deux solutions : tout d’abord, on peut annoter les projections avec la taille du tuple pour savoir si on reconstruit la variable en entier. Par exemple, si on reconstruit une paire avec deux projections d’un triplet, on sait qu’on ne peut pas simplifier cette reconstruction.</p> <p>L’autre solution, plus ambitieuse, est d’adopter une définition moins stricte de l’égalité, et de dire qu’on peut remplacer, par exemple, <code>(x,y)</code> par <code>(x,y,z)</code>. En effet, si la variable a été typée comme une paire, on a la garantie qu’on accédera jamais au champ <code>z</code> de toute façon. Le comportement du programme sera donc le même si on étend la variable avec des champs supplémentaires.</p> <p>Utiliser l’égalité observationnelle permet d’éviter beaucoup d’allocations, mais elle peut utiliser plus de mémoire dans certains cas : si le triplet cesse d’être utilisé, il ne sera pas désalloué par le Garbage Collector (GC), et le champ <code>z</code> restera donc en mémoire pour rien tant que <code>(x,y)</code> est utilisé.</p> <p>Cette approche reste intéressante, au moins si on donne la possibilité à l’utilisateur de l’activer manuellement pour certains blocs.</p> <h4>Récursion</h4> <p>On ajoute maintenant les définitions récursives à notre langage, par le biais d’un opérateur de point fixe.</p> <p>Pour prouver qu’une fonction récursive est l’identité, on doit procéder par induction. 
La difficulté est alors de prouver que la fonction termine, pour que l’induction soit correcte.</p> <p>On peut distinguer trois niveaux de preuve : la première option est de ne pas prouver la terminaison, et de laisser l’utilisateur choisir les fonctions dont il est sûr qu’elles terminent. On suppose donc que la fonction est l’identité, et on simplifie son corps avec cette hypothèse. Cette approche est suffisante pour la plupart des cas pratiques, mais son problème principal est qu’elle autorise à écrire du code qui casse la sûreté du typage, comme discuté ci-dessus.</p> <p>La seconde option est de faire notre hypothèse d’induction uniquement sur des applications de la fonction sur des éléments plus “petits” que l’argument. Un élément est défini comme tel s’il est une projection de l’argument, ou une projection d’un élément plus petit. Cela n’est pas suffisant pour prouver que la fonction termine (par exemple si l’argument est cyclique), mais c’est assez pour avoir un typage sûr. En effet, cela implique que toutes les valeurs de retour possibles de la fonction sont construites (puisqu’elles ne peuvent provenir directement d’un appel récursif), et ont donc un type défini. Le typage échouerait donc si la fonction pouvait renvoyer une valeur qui n’est pas identifiable à son argument.</p> <p>Finalement, on peut vouloir une équivalence observationnelle parfaite entre la fonction et l’identité pour la simplifier. Dans ce cas, la solution que nous proposons est de créer une annotation spéciale pour les fonctions qui sont l’identité quand elles sont appliquées à un objet non cyclique. On peut prouver qu’elles ont cette propriété avec l’induction décrite ci-dessus. La difficulté est ensuite de faire la simplification sur les bonnes applications : si un objet est immutable, n’est pas défini récursivement, et que tous ses sous-objets satisfont cette propriété, on le dit inductif et on peut simplifier les applications sur lui. On propage le statut inductif des objets lors de notre passe récursive d’optimisation.</p> <p>###Reconstruction de blocs</p> <p>La représentation des blocs dans Flambda pose des problèmes intéressants pour détecter leur égalité, ce qui est souvent nécessaire pour prouver une identité. En effet, il est difficile de détecter la reconstruction d’un bloc à l’identique.</p> <h4>Blocs dans Flambda</h4> <h5>Variants</h5> <p>The blocks in Flambda come from the existence of variants in OCaml: one type may have several different constructors, as we can see in</p> <pre><code class="language-Ocaml">type choice = A of int | B of int </code></pre> <p>Quand OCaml est compilé vers Flambda, l’information du constructeur utilisé par un objet est perdue, et est remplacée par un tag. Le tag est un nombre contenu dans un entête de la représentation mémoire de l’objet, et est un nombre entre <code>0</code> et <code>255</code> représentant le constructeur de l’objet. 
<p>The tag is thus present in memory at run time, which makes it possible, for instance, to implement OCaml pattern matching as a switch in Flambda, performing simple comparisons on the tag to decide which branch to take.</p> <p>This system complicates our task, since Flambda's typing does not tell us which constructor a variant contains, and therefore prevents us from easily deciding whether two variants are equal.</p> <h5>Generalisation of tags</h5> <p>To make things more complex, tags are in fact used for all blocks, that is, tuples, modules, functions (indeed almost every value except integers and constant constructors). When the object is not a variant, it is generally given tag 0. This tag is then never read afterwards (since we never match on the object), but it prevents us from simply comparing two tuples, because in Flambda we will just see two objects of unknown tag.</p> <h5>Inlining</h5> <p>Finally, this scheme is optimised by inlining tuples: if we have a variant of type <code>Pair of int*int</code>, instead of being represented as the Pair tag and a memory address pointing to a pair (that is, a tag 0 block and the two integers), the pair is inlined and the object has the shape <code>(tag Pair, integer, integer)</code>.</p> <p>This implies that variants have an arbitrary size, which is also unknown in Flambda.</p> <h4>Existing approach</h4> <p>A partial solution to the problem already existed in a Pull Request (PR) available <a href="https://github.com/ocaml/ocaml/pull/8958">here</a>.</p> <p>The approach taken there is a natural one: switches are used to gain information about the tag of a block, depending on the branch taken. The PR also makes it possible to know the mutability and the size of the block in each branch, starting from OCaml (where the information is known, since the constructor is explicit in the match) and propagating it down to Flambda.</p> <p>This allows every block that has been switched on to be recorded in the environment, together with its tag, size and mutability. We can then detect whether one of them is being rebuilt with the <code>Pmakeblock</code> primitive.</p> <p>This approach is unfortunately limited, since there are many cases where the tag and size of a block could be known without switching on it. For example, a tuple reconstruction can never be simplified with this solution.</p>
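<p>As a concrete illustration (our own example, not one taken from the PR), the following function rebuilds its argument without ever matching on it, so a switch-based analysis learns nothing about <code>p</code>:</p> <pre><code class="language-Ocaml">(* No switch is ever generated on p, so its tag and size remain unknown to a
   switch-based analysis, even though (fst p, snd p) rebuilds p exactly. *)
let reconstruct (p : int * int) = (fst p, snd p)
</code></pre>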
<h4>New approach</h4> <p>Our new approach therefore starts by propagating more information from OCaml. The propagation is based on two PRs that already existed for Flambda 2, which annotate each projection (<code>Pfield</code>) in lambda with information derived from OCaml typing. One adds the <a href="https://github.com/ocaml-flambda/ocaml/commit/fa5de9e64ff1ef04b596270a8107d1f9dac9fb2d">mutability of the block</a> and the other <a href="https://github.com/ocaml-flambda/ocaml/pull/53">its tag and its size</a>.</p> <p>Our first contribution was to adapt these PRs to Flambda 1 and to propagate the information correctly from lambda to Flambda.</p> <p>We then have the information needed to detect block reconstructions: in addition to keeping a list of blocks that have been switched on, we build a list of partially immutable blocks, that is, blocks for which we know that some fields are immutable.</p> <p>It is used as follows:</p> <h6>Block discovery</h6> <p>Whenever we see a projection, we check whether it is performed on an immutable block of known size. If so, we add the corresponding block to the partial blocks. We check that the information we have about its tag and size is compatible with that of the projections of this block seen previously. If we now know all the fields of the block, we add it to our list of known blocks on which simplifications can be performed.</p> <p>We also keep the information about the blocks known thanks to switches.</p> <h6>Simplification</h6> <p>This part is similar to the original PR: when an immutable block is built, we check whether we already know it, and if so we do not reallocate it.</p> <p>Compared to the original approach, we also reduced the complexity of the original PR (from quadratic to linear) by recording, for each projection variable, its index and original block. We also changed some details of the original implementation that could have caused a bug when combined with our PR.</p> <h4>Example</h4> <p>Consider this function:</p> <pre><code class="language-Ocaml">type typ1 = A of int | B of int * int type typ2 = C of int | D of {x:int; y:int} let id = function | A n -&gt; C n | B (x,y) -&gt; D {x; y} </code></pre> <p>The current compiler would produce the following Flambda:</p> <pre><code>End of middle end: let_symbol (camlTest__id_21 (Set_of_closures ( (set_of_closures id=Test.8 (id/5 = fun param/7 -&gt; (switch*(0,2) param/7 case tag 0: (let (Pmakeblock_arg/11 (field 0&lt;{../../test.ml:4,4-7}&gt; param/7) Pmakeblock/12 (makeblock 0 (int)&lt;{../../test.ml:4,11-14}&gt; Pmakeblock_arg/11)) Pmakeblock/12) case tag 1: (let (Pmakeblock_arg/15 (field 1&lt;{../../test.ml:5,4-11}&gt; param/7) Pmakeblock_arg/16 (field 0&lt;{../../test.ml:5,4-11}&gt; param/7) Pmakeblock/17 (makeblock 1 (int,int)&lt;{../../test.ml:5,17-23}&gt; Pmakeblock_arg/16 Pmakeblock_arg/15)) Pmakeblock/17))) free_vars={ } specialised_args={}) direct_call_surrogates={ } set_of_closures_origin=Test.1]))) (camlTest__id_5_closure (Project_closure (camlTest__id_21, id/5))) (camlTest (Block (tag 0, camlTest__id_5_closure))) End camlTest </code></pre> <p>Our improvement detects that this function rebuilds similar blocks, and therefore simplifies it:</p> <pre><code>End of middle end: let_symbol (camlTest__id_21 (Set_of_closures ( (set_of_closures id=Test.7 (id/5 = fun param/7 -&gt; (switch*(0,2) param/7 case tag 0 (1): param/7 case tag 1 (2): param/7)) free_vars={ } specialised_args={}) direct_call_surrogates={ } set_of_closures_origin=Test.1]))) (camlTest__id_5_closure (Project_closure (camlTest__id_21, id/5))) (camlTest (Block (tag 0, camlTest__id_5_closure))) End camlTest </code></pre> <h4>Avenues for improvement</h4> <h5>Relaxing equality</h5>
<p>We can use the observational equality studied in the theoretical part for block equality, in order to avoid even more allocations. The implementation is simple:</p> <p>When we create a block, to decide whether it really needs to be allocated, the normal approach is to check whether each of its fields is a known projection of another block, at the same index, and whether the two blocks have the same size. We can simply drop that last check.</p> <p>The implementation turned out to be a bit harder than expected because of practical details. First of all, we want to apply this optimization only to certain blocks annotated by the user. The annotation therefore has to be propagated down to Flambda.</p> <p>Moreover, if we simply implement the optimization as is, many cases will be missed, because unused variables are simplified away before our pass. For example, take a function of the following form:</p> <pre><code class="language-Ocaml">let loose_id (a,b,c) = (a,b) </code></pre> <p>The variable <code>c</code> will be simplified away before reaching Flambda, and we will therefore no longer be able to prove that <code>(a,b,c)</code> is immutable, since its third field might not be. This problem is about to be solved in Flambda 2 thanks to a PR that propagates mutability information for all blocks, but we did not have the time needed to adapt it to Flambda 1.</p> <h3>Detecting recursive identities</h3> <p>Now that we can detect block reconstructions, the problem of recursive functions remains to be solved.</p> <h4>Approach without guarantees</h4> <p>We started by implementing an approach that does not include any termination proof. The idea is to add the proof later, or to allow functions that do not always terminate to be simplified, as long as they are correct with respect to typing (see the three levels of proof discussed in the theoretical part).</p> <p>Here, we trust the user to check these properties manually.</p> <p>We therefore modified function simplification: when simplifying a function with a single argument, we start by assuming that this function is the identity before simplifying its body. We then check whether the result is equivalent to an identity by traversing it recursively, in order to cover as many cases as possible (for example conditional branches). If it is, the function is replaced by the identity; otherwise, we fall back to a classic simplification, without the induction hypothesis.</p> <h4>Constant propagation</h4> <p>We then improved the function that determines whether the body of a function is an identity or not, so that it can handle constants.
It propagates the equality information we learn about the argument in conditional branches.</p> <p>Thus, with a function of the form</p> <pre><code class="language-Ocaml">type truc = A | B | C let id = function | A -&gt; A | B -&gt; B | C -&gt; C </code></pre> <p>or even</p> <pre><code class="language-Ocaml">let id x = if x=0 then 0 else x </code></pre> <p>we correctly detect that it is the identity.</p> <h4>Examples</h4> <h5>Recursive functions</h5> <p>We can now detect recursive identities:</p> <pre><code class="language-Ocaml">let rec listid = function | t::q -&gt; t::(listid q) | [] -&gt; [] </code></pre> <p>used to compile to:</p> <pre><code>End of middle end: let_rec_symbol (camlTest__listid_5_closure (Project_closure (camlTest__set_of_closures_20, listid/5))) (camlTest__set_of_closures_20 (Set_of_closures ( (set_of_closures id=Test.11 (listid/5 = fun param/7 -&gt; (if param/7 then begin (let (apply_arg/13 (field 1&lt;{../../test.ml:9,4-8}&gt; param/7) apply_funct/14 camlTest__listid_5_closure Pmakeblock_arg/15 *(apply*[listid/5]&lt;{../../test.ml:9,15-25}&gt; apply_funct/14 apply_arg/13) Pmakeblock_arg/16 (field 0&lt;{../../test.ml:9,4-8}&gt; param/7) Pmakeblock/17 (makeblock 0&lt;{../../test.ml:9,12-25}&gt; Pmakeblock_arg/16 Pmakeblock_arg/15)) Pmakeblock/17) end else begin (let (const_ptr_zero/27 Const(0a)) const_ptr_zero/27) end)) free_vars={ } specialised_args={}) direct_call_surrogates={ } set_of_closures_origin=Test.1]))) let_symbol (camlTest (Block (tag 0, camlTest__listid_5_closure))) End camlTest </code></pre> <p>We now detect that it is the identity:</p> <pre><code>End of middle end: let_symbol (camlTest__set_of_closures_20 (Set_of_closures ( (set_of_closures id=Test.13 (listid/5 = fun param/7 -&gt; param/7) free_vars={ } specialised_args={}) direct_call_surrogates={ } set_of_closures_origin=Test.1]))) (camlTest__listid_5_closure (Project_closure (camlTest__set_of_closures_20, listid/5))) (camlTest (Block (tag 0, camlTest__listid_5_closure))) End camlTest </code></pre> <h5>Unsafe example</h5> <p>On the other hand, one can take advantage of the absence of guarantees to work around the type system and access a memory address as if it were an integer:</p> <pre><code class="language-Ocaml">type bugg = A of int*int | B of int let rec bug = function | A (a,b) -&gt; (a,b) | B x -&gt; bug (B x) let (a,b) = (bug (B 42)) let _ = print_int b </code></pre> <p>This function will be simplified to the identity even though the type <code>bugg</code> is not compatible with the tuple type; when we try to project the second field out of the variant to bind <code>b</code>, we access undefined memory:</p> <pre><code>$ ./unsafe.out 47423997875612 </code></pre> <h4>Avenues for improvement – short term</h4> <h5>Function annotations</h5> <p>One improvement, simple in theory, would be to let the user choose the functions to which these not-always-correct optimizations should be applied. We did not have the time to do the work of propagating this information down to Flambda, but there should be no real implementation difficulty.</p>
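<p>Such an opt-in could plausibly take the form of an attribute on the definition. The attribute name below is invented for this article and is not recognised by the compiler (unknown attributes are simply ignored), so this is only a sketch of what the user-facing side might look like:</p> <pre><code class="language-Ocaml">(* Hypothetical opt-in annotation: [@unsafe_assume_identity] is an invented
   name, not an existing attribute. The user takes responsibility for the
   termination and type safety of the annotated function. *)
let[@unsafe_assume_identity] rec listid = function
  | t :: q -&gt; t :: listid q
  | [] -&gt; []
</code></pre>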
<h5>An order on arguments</h5> <p>For a safer optimization, we would like to use the idea developed in the theoretical part, which makes the optimization correct on non-cyclic objects, and above all gives us back the typing guarantees, avoiding the problem seen in the example above.</p> <p>To obtain this guarantee, we want to change the simplification pass so that its environment contains an optional function–argument pair. When this option is present, the pair indicates that we are inside the body of a function, in the process of simplifying it, and therefore that applications of this function to elements smaller than the argument can be simplified into an identity. Of course, the pass should also be modified to remember which elements are not smaller than the argument.</p> <h4>Avenues for improvement – long term</h4> <h5>Excluding cyclic objects</h5> <p>As described in the theoretical part, we could recursively infer which objects are cyclic and try to exclude them from our optimization. The problem is then that, instead of replacing the functions by the identity, we must have a special annotation representing <code>IdRec</code>.</p> <p>This becomes much more complex to implement when compiling across several files, since we then need this information in the interface of the already compiled files in order to perform the optimization when needed.</p> <p>One possibility would be to use the .cmx files to record this information when a file is compiled, but that kind of implementation was too long to be carried out during the internship. Moreover, it is not even clear that it would be a good practical choice: it would make the optimization much more complex for a small gain compared to a version that is correct on non-cyclic objects and enabled by a user annotation.</p> opam 2.1.0~rc2 released https://ocamlpro.com/blog/2021_06_23_opam_2.1.0_rc2_released 2021-06-23T13:48:57Z 2021-06-23T13:48:57Z David Allsopp (OCamlLabs) Feedback on this post is welcomed on Discuss! The opam team has great pleasure in announcing opam 2.1.0~rc2! The focus since beta4 has been preparing for a world with more than one released version of opam (i.e. 2.0.x and 2.1.x). The release candidate extends CLI versioning further and, under the ho... <p><em>Feedback on this post is welcomed on <a href="https://discuss.ocaml.org/t/ann-opam-2-1-0-rc2/8042">Discuss</a>!</em></p> <p>The opam team has great pleasure in announcing opam 2.1.0~rc2!</p> <p>The focus since beta4 has been preparing for a world with more than one released version of opam (i.e. 2.0.x and 2.1.x). The release candidate extends CLI versioning further and, under the hood, includes a big change to the opam root format which allows new versions of opam to indicate that the root may still be read by older versions of the opam libraries. A plugin compiled against the 2.0.9 opam libraries will therefore be able to read information about an opam 2.1 root (plugins and tools compiled against 2.0.8 are unable to load opam 2.1.0 roots).</p> <p>Please do take this release candidate for a spin!
It is available in the Docker images at ocaml/opam on <a href="https://hub.docker.com/r/ocaml/opam/tags">Docker Hub</a> as the opam-2.1 command (or you can <code>sudo ln -f /usr/bin/opam-2.1 /usr/bin/opam</code> in your <code>Dockerfile</code> to switch to it permanently). The release candidate can also be tested via our installation script (see the <a href="https://github.com/ocaml/opam/wiki/How-to-test-an-opam-feature#from-a-tagged-release-including-pre-releases">wiki</a> for more information).</p> <p>Thank you to anyone who noticed the unannounced first release candidate and tried it out. Between tagging and what would have been announcing it, we discovered an issue with upgrading local switches from earlier alpha/beta releases, and so fixed that for this second release candidate.</p> <p>Assuming no showstoppers, we plan to release opam 2.1.0 next week. The improvements made in 2.1.0 will allow for a much faster release cycle, and we look forward to posting about the 2.2.0 plans soon!</p> <h2>Try it!</h2> <p>In case you plan a possible rollback, you may want to first backup your <code>~/.opam</code> directory.</p> <p>The upgrade instructions are unchanged:</p> <ol> <li>Either from binaries: run </li> </ol> <pre><code class="language-shell-session">bash -c &quot;sh &lt;(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) --version 2.1.0~rc2&quot; </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.1.0-rc2">the Github &quot;Releases&quot; page</a> to your PATH.</p> <ol start="2"> <li>Or from source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.1.0-rc2#compiling-this-repo">README</a>. </li> </ol> <p>You should then run:</p> <pre><code class="language-shell-session">opam init --reinit -ni </code></pre> <p>We hope there won't be any, but please report any issues to <a href="https://github.com/ocaml/opam/issues">the bug-tracker</a>. Thanks for trying it out, and hoping you enjoy!</p> Tutorial: Format Module of OCaml https://ocamlpro.com/blog/2021_05_06_tutorial_format_module_of_ocaml 2021-05-06T13:48:57Z 2021-05-06T13:48:57Z OCamlPro ... <p>The <a href="http://caml.inria.fr/pub/docs/manual-ocaml/libref/Format.html">Format</a> module of OCaml is an extremely powerful but unfortunately often poorly used module. </p> <p>It combines two distinct elements:</p> <ul><li>pretty-print boxes</li><li>semantic tags</li></ul> <p>This tutorial aims to demystify much of this module and explain the range of things that you can do with it.</p> <p><a href="/blog/2020_06_01_fr_tutoriel_format">Read more (in French)</a></p> Réunion annuelle du Club des utilisateurs d’Alt-Ergo 2021 https://ocamlpro.com/blog/2021_04_29_reunion_annuelle_du_club_des_utilisateurs_dalt_ergo_2021 2021-04-29T13:48:57Z 2021-04-29T13:48:57Z OCamlPro The third annual meeting of the Alt-Ergo Users' Club took place on April 1st! This annual meeting is the ideal place to review each partner's needs regarding Alt-Ergo. We had the pleasure of welcoming our partners to discuss the roadmap for... <p>The third annual meeting of the Alt-Ergo Users' Club took place on April 1st! This annual meeting is the ideal place to review each partner's needs regarding Alt-Ergo.
We had the pleasure of welcoming our partners to discuss the roadmap for future Alt-Ergo developments and improvements.</p> <blockquote> <p>Alt-Ergo is an automatic prover of mathematical formulas, created at the <a href="https://www.lri.fr/">LRI</a> and developed by OCamlPro since 2013. To learn more or to join the Club, visit <a href="https://alt-ergo.ocamlpro.com">https://alt-ergo.ocamlpro.com</a>.</p> </blockquote> <p>Our Club has several objectives. Its main objective is to guarantee the long-term future of Alt-Ergo by fostering collaboration between the members of the Club and by building ties with users of formal methods, such as the Why3 community. One of our priorities is to identify the needs of users of constraint solvers by extending Alt-Ergo to new domains such as Model Checking, while competing with the other state-of-the-art solvers in international competitions. Finally, the last objective of the Club is to find new projects or contracts for the development of long-term features.</p> <p>We would like to thank all our members for their support: Mitsubishi Electric R&amp;D Centre Europe, AdaCore and CEA List. We also want to highlight the <a href="http://why3.lri.fr/">Why3</a> development team, with whom we are working to improve our tools.</p> <p>This year, new points of interest were raised by our members. First, model generation, added to Alt-Ergo since the last edition, has been useful to the majority of the Club's members. The technical points now requested are the ability to refine constraints and to study how to propagate them. Then came the presentation of Dolmen, the parser/typechecker that will make it possible to type SMT2 files only once and to be ready for SMT3. Its integration into Alt-Ergo is in progress, and the Club's members are enthusiastic about the future contributions of the Dolmen tool to the SMT solver community!</p> <p>These features are now our main priorities; you can find <a href="https://gitlab.ocamlpro.com/OCamlPro/club-alt-ergo_ext/-/blob/master/Planche_Club_Alt-Ergo_Edition2021.pdf?inline=false">the slides</a> presented at the 2021 edition of the Club meeting. To follow our progress and the latest news, do not hesitate to read the <a href="/blog/category/formal_methods">articles</a> on our blog.</p> New Try-Alt-Ergo https://ocamlpro.com/blog/2021_03_29_new_try_alt_ergo 2021-03-29T13:48:57Z 2021-03-29T13:48:57Z Albin Coquereau Have you heard about our Try-Alt-Ergo website? Created in 2014 (see our blogpost), the first objective was to facilitate access to our performant SMT Solver Alt-Ergo. Try-Alt-Ergo allows you to write and run your problems in your browser without any server computation. This playground website has be... <p><img src="/blog/assets/img/screenshot_ask_altergo.jpg" alt="" /></p> <p>Have you heard about our <a href="https://alt-ergo.ocamlpro.com/try.html">Try-Alt-Ergo</a> website? Created in 2014 (see <a href="/blog/2014_07_15_try_alt_ergo_in_your_browser">our blogpost</a>), the first objective was to facilitate access to our performant SMT Solver <a href="https://alt-ergo.ocamlpro.com/">Alt-Ergo</a>.
<em>Try-Alt-Ergo allows you to write and run your problems in your browser without any server computation.</em></p> <p>This playground website has been maintained by OCamlPro for many years, and it's high time to bring it back to life with new updates. We are therefore pleased to announce the new version of the <a href="https://try-alt-ergo.ocamlpro.com/">Try-Alt-Ergo</a> website! In this article, we will first explain what has changed in the back end, and what you can use if you are interested in running your own version of Alt-Ergo on a website, or in an application! And then we will focus on the new front-end of our website, from its interface to its features through its tutorial about the program.* *</p> <h2><a href="/blog/2021_03_29_new_try_alt_ergo">Try-Alt-Ergo 2014</a></h2> <p><img src="/blog/assets/img/screenshot_from_2021_03_29.png" alt="" /></p> <p><a href="https://alt-ergo.ocamlpro.com/try.html">Try-Alt-Ergo</a> was designed to be a powerful and simple tool to use. Its interface was minimalist. It offered three panels, one panel (left) with a text area containing the problem to prove. The centered panel was composed of a button to run Alt-Ergo, load examples, set options. The right panel showed these options, examples and other information. This design lacked some features that have been added to our solver through the years. Features such as models (counter-examples), unsat-core, more options and debug information was missing in this version.</p> <p>Try-Alt-Ergo did not offer a proper editor (with syntax coloration), a way to save the file problem nor an option to limit the run of the solver with a time limit. Another issue was about the thread. When the solver was called the webpage froze, that behavior was problematic in case of the long run because there was no way to stop the solver.</p> <h2><a href="/blog/2021_03_29_new_try_alt_ergo">Alt-Ergo 1.30</a></h2> <p>The 1.30 version of Alt-Ergo was the version used in the back-end to prove problems. Since this version, a lot of improvements have been done in Alt-Ergo. To learn more about these improvements, see our <a href="https://ocamlpro.github.io/alt-ergo/About/changes.html">changelog</a> in the documentation.</p> <p>Over the years we encountered some difficulties to update the Alt-Ergo version used in Try-Alt-Ergo. We used <a href="https://ocsigen.org/js_of_ocaml/latest/manual/overview">Js_of_ocaml</a> to compile the OCaml code of our solver to be runnable as a JavaScript code. Some libraries were not available in JavaScript and we needed to manually disable them. The lack of automatism leads to a lack of time to update the JavaScript version of Alt-Ergo in Try-Alt-Ergo.</p> <p>In 2019 we switched our build system to <a href="https://dune.readthedocs.io/en/latest/overview.html">dune</a> which opens the possibility to ease the cross-compilation of Alt-Ergo in JavaScript.</p> <h2><a href="/blog/2021_03_29_new_try_alt_ergo">New back-end</a></h2> <p>With some simple modification, we were able to compile Alt-Ergo in JavaScript. This modification is simple enough that this process is now automated in our continuous integration. This will enable us to easily provide a JavaScript version of our Solver for each future version.</p> <p>Two ways of using our solver in JavaScript are available:</p> <ul> <li><code>alt-ergo.js</code>, a JavaScript version of the Alt-Ergo CLI. It can be runned with <code>node</code>: <code>node alt-ergo.js &lt;options&gt; &lt;file&gt;</code>. 
Note that this code is slower than the natively compiled CLI of Alt-Ergo. In our effort to open the SMT world to more people, an npm package is the next step of this work. </li> <li><code>alt-ergo-worker.js</code>, a web worker of Alt-Ergo. This web worker takes JSON as input, to feed the problem file and options into Alt-Ergo, and returns its answers in the same way: <ul> <li>Options are sent as a list of <em>name:value</em> pairs, like <code>{&quot;debug&quot;:true,&quot;input_format&quot;:&quot;Native&quot;,&quot;steps_bound&quot;:100,&quot;sat_solver&quot;: &quot;Tableaux&quot;,&quot;file&quot;:&quot;test-file&quot;}</code>. You can specify every option used in Alt-Ergo. If some options are missing, the worker uses their default values; for example, if <code>debug</code> is not specified, the worker uses its default value <em>false</em>. </li> <li>The input file is sent as a list of strings, with the following format: <code>{ &quot;content&quot;: [ &quot;goal g: true&quot;] }</code> </li> <li>Alt-Ergo's answers can be composed of its results, debug information, errors, warnings… for example <code>{ &quot;results&quot;: [ &quot;File &quot;test-file&quot;, line 1, characters 9-13: Valid (0.2070) (0 steps) (goal g) ], &quot;debugs&quot;: [ &quot;[Debug][Sat_solver]&quot;, &quot;use Tableaux-like solver&quot;] }</code>. As with the options, if a result value such as <code>debugs</code> does not contain anything, <code>&quot;debugs&quot;: [...]</code> is not returned. </li> <li>See the Alt-Ergo <a href="https://ocamlpro.github.io/alt-ergo/Usage/index.html#js-worker">web-worker documentation</a> to learn more about how to use it. </li> </ul> </li> </ul> <h2><a href="/blog/2021_03_29_new_try_alt_ergo">New Front-end</a></h2> <p><img src="/blog/assets/img/screenshot_new_altergo_interface.jpg" alt="" /></p> <p>The <a href="https://try-alt-ergo.ocamlpro.com">Try-Alt-Ergo</a> website has been completely reworked and we have added some features:</p> <ul> <li>The left panel still consists of an editor and an answers area. <ul> <li>The <a href="https://ace.c9.io/">Ace editor</a>, with custom syntax highlighting (both native and smt-lib2), is now used to make it more pleasant to write your problems. </li> </ul> </li> <li>A top panel contains the following buttons: <ul> <li><code>Ask Alt-Ergo</code>, which retrieves the content from the editor and the options, launches the web worker and prints answers in the defined areas. </li> <li><code>Load</code> and <code>Save</code> files. </li> <li><code>Documentation</code>, which sends users to the newly added <a href="https://ocamlpro.github.io/alt-ergo/Input_file_formats/Native/index.html">native syntax documentation</a> of Alt-Ergo. </li> <li><code>Tutorial</code>, which opens an interactive <a href="https://try-alt-ergo.ocamlpro.com/tuto/tutorial.html">tutorial</a> to introduce you to Alt-Ergo's native syntax and program verification. </li> </ul> </li> </ul> <p><img src="/blog/assets/img/screenshot_welcome_to_altergo_tutorial.png" alt="" /></p> <ul> <li>A right panel composed of tabs: <ul> <li><code>Start</code> and <code>About</code>, which contain general information about Alt-Ergo, Try-Alt-Ergo and how to use them. </li> <li><code>Outputs</code> prints more information than the basic answer area under the editor. In these tabs you can find (long) debug outputs, unsat cores or models (counter-examples) generated by Alt-Ergo. </li> <li><code>Options</code> contains every option you can use, such as the time limit, the steps limit or setting the format of the input file to prove.
</li> <li><code>Statistics</code> is still a basic tab that only output axioms used to prove the input problem. </li> <li><code>Examples</code> contains some basic examples showing the capabilities of our solver. </li> </ul> </li> </ul> <p>We hope you will enjoy this new version of Try-Alt-Ergo, we can't wait to read your feedback!</p> <p><em>This work was done at OCamlpro.</em></p> opam 2.0.8 release https://ocamlpro.com/blog/2021_02_08_opam_2.0.8_release 2021-02-08T13:48:57Z 2021-02-08T13:48:57Z Raja Boujbel Louis Gesbert We are pleased to announce the minor release of opam 2.0.8. This new version contains some backported fixes: Critical for fish users! Don't add . to PATH. [#4078] Fix sandbox script for newer ccache versions. [#4079 and #4087] Fix sandbox crash when ~/.cache is a symlink. [#4068] User modifications ... <p>We are pleased to announce the minor release of <a href="https://github.com/ocaml/opam/releases/tag/2.0.8">opam 2.0.8</a>.</p> <p>This new version contains some <a href="https://github.com/ocaml/opam/pull/4425">backported</a> fixes:</p> <ul> <li><strong>Critical for fish users!</strong> Don't add <code>.</code> to <code>PATH</code>. [<a href="https://github.com/ocaml/opam/issues/4078">#4078</a>] </li> <li>Fix sandbox script for newer <code>ccache</code> versions. [<a href="https://github.com/ocaml/opam/issues/4079">#4079</a> and <a href="https://github.com/ocaml/opam/pull/4087">#4087</a>] </li> <li>Fix sandbox crash when <code>~/.cache</code> is a symlink. [<a href="https://github.com/ocaml/opam/issues/4068">#4068</a>] </li> <li>User modifications to the sandbox script are no longer overwritten by <code>opam init</code>. [<a href="https://github.com/ocaml/opam/pull/4092">#4020</a> &amp; <a href="https://github.com/ocaml/opam/pull/4092">#4092</a>] </li> <li>macOS sandbox script always mounts <code>/tmp</code> read-write, regardless of <code>TMPDIR</code> [<a href="https://github.com/ocaml/opam/pull/3742">#3742</a>, addressing <a href="https://github.com/ocaml/opam-repository/issues/13339">ocaml/opam-repository#13339</a>] </li> <li><code>pre-</code> and <code>post-session</code> hooks can now print to the console [<a href="https://github.com/ocaml/opam/issues/4359">#4359</a>] </li> <li>Switch-specific pre/post sessions hooks are now actually run [<a href="https://github.com/ocaml/opam/issues/4472">#4472</a>] </li> <li>Standalone <code>opam-installer</code> now correctly builds from sources [<a href="https://github.com/ocaml/opam/issues/4173">#4173</a>] </li> <li>Fix <code>arch</code> variable detection when using 32bit mode on ARM64 and i486 [<a href="https://github.com/ocaml/opam/pull/4462">#4462</a>] </li> </ul> <p>A more complete <a href="https://github.com/ocaml/opam/releases/tag/2.0.8">release note</a> is available.</p> <hr /> <p>Installation instructions (unchanged):</p> <ol> <li>From binaries: run </li> </ol> <pre><code class="language-shell-session">$~ bash -c &quot;sh &lt;(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) --version 2.0.8&quot; </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.0.8">the Github &quot;Releases&quot; page</a> to your PATH. 
In this case, don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update your sandbox script.</p> <ol start="2"> <li>From source, using opam: </li> </ol> <pre><code class="language-shell-session">$~ opam update; opam install opam-devel </code></pre> <p>(then copy the opam binary to your PATH as explained, and don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update your sandbox script)</p> <ol start="3"> <li>From source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.0.8#compiling-this-repo">README</a>. </li> </ol> <p>We hope you enjoy this new minor version, and remain open to <a href="https://github.com/ocaml/opam/issues">bug reports</a> and <a href="https://github.com/ocaml/opam/issues">suggestions</a>.</p> <blockquote> <p>NOTE: this article is cross-posted on <a href="https://opam.ocaml.org/blog/">opam.ocaml.org</a> and <a href="/blog">ocamlpro.com</a>, and published in <a href="https://discuss.ocaml.org/t/ann-opam-2-0-8-release/7242">discuss.ocaml.org</a>.</p> </blockquote> 2020 at OCamlPro https://ocamlpro.com/blog/2021_02_02_2020_at_ocamlpro 2021-02-02T13:48:57Z 2021-02-02T13:48:57Z Muriel OCamlPro 2020 at OCamlPro OCamlPro was created in 2011 to advocate the adoption of the OCaml language and formal methods in general in the industry. While building a team of highly-skilled engineers, we navigated through our expertise domains, delivering works on the OCaml language and tooling, training comp... <p><img src="/blog/assets/img/logo_2020_at_ocamlpro.png" alt="2020 at OCamlPro" /></p> <p>OCamlPro was created in 2011 to advocate the adoption of the OCaml language and formal methods in general in the industry.
While building a team of highly-skilled engineers, we navigated through our expertise domains, delivering works on the OCaml language and tooling, training companies to the use of strongly-typed languages like OCaml but also Rust, tackling formal verification challenges with formal methods, maintaining <a href="https://alt-ergo.ocamlpro.com">the SMT solver Alt-Ergo</a>, designing languages and tools for smart contracts and blockchains, and more!</p> <p>In this article, as every year (see <a href="/blog/2020_02_04_2019_at_ocamlpro">2019 at OCamlPro</a> for last year's post), we review some of the work we did during 2020, in many different worlds.</p> <p><div id="tableofcontents"> <strong>Table of contents</strong></p> <p> <a href="#ocaml">In the World of OCaml</a></p> <ul> <li><a href="#flambda">Flambda &amp; Compilation Team</a> </li> <li><a href="#opam">Opam, the OCaml Package Manager</a> </li> <li><a href="#community">Encouraging OCaml Adoption: Trainings and Resources for OCaml</a> </li> <li><a href="#tooling">Open Source Tooling and Libraries for OCaml</a> </li> <li><a href="#foundation">Supporting the OCaml Software Foundation</a> </li> <li><a href="#events">Events</a> </li> </ul> <p><a href="#formal-methods">In the World of Formal Methods</a></p> <ul> <li><a href="#alt-ergo">Alt-Ergo Development</a> </li> <li><a href="#club">Alt-Ergo Users’ Club and R&amp;D Projects</a> </li> <li><a href="#roadmap">Alt-Ergo’s Roadmap</a> </li> </ul> <p><a href="#rust">In the World of Rust</a></p> <p><a href="#blockchains">In the World of Blockchain Languages</a> </div></p> <p>We warmly thank all our partners, clients and friends for their support and collaboration during this peculiar year!</p> <p>The first lockdown was a surprise and we took advantage of this special moment to go over our past contributions and sum it up in a timeline that gives an overview of the key events that made OCamlPro over the years. The <a href="https://timeline.ocamlpro.com">timeline format</a> is amazing to reconnect with our history and to take stock in our accomplishments.</p> <p>Now this will turn into a generic timeline edition tool on the Web, stay tuned if you are interested in our internal project to be available to the general public! If you think that a timeline would fit your needs and audience, <a href="https://timelines.cc/">we designed a simplistic tool</a>, tailored for users who want complete control over their data.</p> <h2> <a id="ocaml" class="anchor"></a><a class="anchor-link" href="#ocaml">In the World of OCaml</a> </h2> <h3> <a id="flambda" class="anchor"></a><a class="anchor-link" href="#flambda">Flambda &amp; Compilation Team</a> </h3> <p><em>Work by Pierre Chambart, Vincent Laviron, Guillaume Bury, Pierrick Couderc and Louis Gesbert</em></p> <p><img src="/blog/assets/img/picture_cpu.jpg" alt="flambda" /></p> <p>OCamlPro is proud to be working on Flambda2, an ambitious work on an OCaml optimizing compiler, in close collaboration with Mark Shinwell from our long-term partner and client Jane Street. Flambda focuses on reducing the runtime cost of abstractions and removing as many short-lived allocations as possible. 
In 2020, the Flambda team worked on a considerable number of fixes and improvements, transforming Flambda2 from an experimental prototype to a version ready for testing in production!</p> <p>This year also marked the conclusion of our work on the pack rehabilitation (see our two recent posts <a href="/blog/2020_09_24_rehabilitating_packs_using_functors_and_recursivity_part_1">Part 1</a> and <a href="/blog/2020_09_30_rehabilitating_packs_using_functors_and_recursivity_part_2">Part 2</a>, and a much simpler <a href="/blog/2011_08_10_packing_and_functors">Version</a> in 2011). Our work aimed to give them a new youth and utility by adding the possibility to generate functors or recursive packs. This improvement allows programmers to define big functors, functors that are split among multiple files, resulting in what we can view as a way to implement some form of parameterized libraries.</p> <p><em>This work is allowed thanks to Jane Street’s funding.</em></p> <h3> <a id="opam" class="anchor"></a><a class="anchor-link" href="#opam">Opam, the OCaml Package Manager</a> </h3> <p><em>Work by Raja Boujbel, Louis Gesbert and Thomas Blanc</em></p> <p><img src="/blog/assets/img/picture_containers.jpg" alt="opam" /></p> <p><a href="https://opam.ocaml.org/">Opam</a> is the OCaml source-based package manager. The first specification draft was written <a href="https://opam.ocaml.org/about.html">in early 2012</a> and went on to become OCaml’s official package manager — though it may be used for other languages and projects, since Opam is language-agnostic! If you need to install, upgrade and manage your compiler(s), tools and libraries easily, Opam is meant for you. It supports multiple simultaneous compiler installations, flexible package constraints, and a Git-friendly development workflow.</p> <p><a href="https://github.com/ocaml/opam/releases">Our 2020 work on Opam</a> led to the release of two versions of opam 2.0 with small fixes, and the release of three alphas and two betas of Opam 2.1!</p> <p>Opam 2.1.0 will soon go to release candidate and will introduce a seamless integration of depexts (system dependencies handling), dependency locking, pinning sub-directories, invariant-based definition for Opam switches, the configuration of Opam from the command-line without the need for a manual edition of the configuration files, and the CLI versioning for better handling of CLI evolutions.</p> <p><em>This work is greatly helped by Jane Street’s funding and support.</em></p> <h3> <a id="community" class="anchor"></a><a class="anchor-link" href="#community">Encouraging OCaml Adoption: Trainings and Resources for OCaml</a> </h3> <p><em>Work by Pierre Chambart, Vincent Laviron, Adrien Champion, Mattias, Louis Gesbert and Thomas Blanc</em></p> <p><img src="/blog/assets/img/picture_ocaml_library.jpg" alt="trainings" /></p> <p>OCamlPro is also a training centre. We organise yearly training sessions for programmers from multiple companies in our offices: from OCaml to OCaml tooling to Rust! We can also design custom and on-site trainings to meet specific needs.</p> <p>We released a brand new version of TryOCaml, a tool born from our work on <a href="https://ocaml-sf.org/learn-ocaml/">Learn-OCaml</a>! <a href="https://try.ocamlpro.com">Try OCaml</a> has been highly praised by professors at the beginning of the Covid lockdown. Even if it can be used as a personal sandbox, it’s also possible to adapt its usage for classes. 
TryOCaml is a hassle-free tool that significantly lowers the barriers to start coding in OCaml, as no installation is required.</p> <p>We regularly release cheat sheets for developers: in 2020, we shared <a href="/blog/2020_01_10_opam_2.0_cheat_sheet">the long-awaited Opam 2.0 cheat sheet</a>, with a new theme! In just two pages, you’ll have in one place the everyday commands you need as an Opam user. We also shine some light on unsung features which may just change your coding life.</p> <p>2020 was also an important year for the OCaml language itself: we were pleased to welcome <a href="https://ocaml.org/releases/4.10.0.html">OCaml 4.10</a>! One of the highlights of the release was the “Best-fit” Garbage Collector Strategy. We had <a href="/blog/2020_03_23_in_depth_look_at_best_fit_gc">an in-depth look</a> at this exciting change.</p> <p><em>This work is self-funded by OCamlPro as part of its effort to ease the adoption of OCaml.</em></p> <h3> <a id="tooling" class="anchor"></a><a class="anchor-link" href="#tooling">Open Source Tooling and Libraries for OCaml</a> </h3> <p><em>Work by Fabrice Le Fessant, Léo Andrès and David Declerck</em></p> <p><img src="/blog/assets/img/picture_tools.jpg" alt="tooling" /></p> <p>OCamlPro has a long history of developing open source tooling and libraries for the community. 2020 was no exception!</p> <p><a href="https://github.com/OCamlPro/drom">drom</a> is a simple tool to create new OCaml projects that will use best OCaml practices, i.e. Opam, Dune and tests. Its goal is to provide a cargo-like user experience and to help onboard new developers into the community. drom is available in the official opam repository.</p> <p><a href="https://github.com/OCamlPro/directories">directories</a> is a new OCaml Library that provides configuration, cache and data paths (and more!). The library follows the suitable conventions on Linux, MacOS and Windows.</p> <p><a href="https://ocamlpro.github.io/opam-bin/">opam-bin</a> is a framework to create and use binary packages with Opam. It enables you to create, use and share binary packages easily with opam, and to create as many local switches as you want while spending no time and no disk space! If you often use Opam, opam-bin is a must-have!</p> <p>We also released a number of libraries, focused on making things easy for developers… so we named them with an <code>ez_</code> prefix: <a href="https://github.com/OCamlPro/ez_cmdliner">ez_cmdliner</a> provides an Arg-like interface for cmdliner, <a href="https://github.com/OCamlPro/ez_file">ez_file</a> provides simple functions to read and write files, <a href="https://github.com/OCamlPro/ez_subst">ez_subst</a> provides easily configurable string substitutions for shell-like variable syntax, <a href="https://github.com/OCamlPro/ez_config">ez_config</a> provides abstract options stored in configuration files with an OCaml syntax. There are also a lot of <a href="https://github.com/OCamlPro?q=ezjs">ezjs-*</a> libraries, that are bindings to Javascript libraries that we used in some of our js_of_ocaml projects.</p> <p><em>This work was self-funded by OCamlPro as part of its effort to improve the OCaml ecosystem.</em></p> <h3> <a id="foundation" class="anchor"></a><a class="anchor-link" href="#foundation">Supporting the OCaml Software Foundation</a> </h3> <p>OCamlPro was proud and happy to initiate the <a href="https://www.dropbox.com/s/omba1d8vhljnrcn/OCaml-user-survey-2020.pdf?dl=0">OCaml User Survey 2020</a> as part of the mission of the OCaml Software Foundation.
The goal of the survey was to better understand the community and its needs. The final results have not yet been published by the Foundation, we are looking forward to reading them soon!</p> <h3> <a id="events" class="anchor"></a><a class="anchor-link" href="#events">Events</a> </h3> <p>Though the year took its toll on our usual tour of the world conferences and events, OCamlPro members still took part in the annual 72-hour team programming competition organised by the International Conference on Functional Programming (ICFP). Our joint team “crapo on acid” went <a href="https://icfpcontest2020.github.io/#/scoreboard#final">through the final</a>!</p> <h2> <a id="formal-methods" class="anchor"></a><a class="anchor-link" href="#formal-methods">In the World of Formal Methods</a> </h2> <ul> <li><em>Work by Albin Coquereau, Mattias, Sylvain Conchon, Guillaume Bury and Louis Rustenholz</em> </li> </ul> <p><img src="/blog/assets/img/altergo-meeting.jpeg" alt="formal methods" /></p> <p><a href="/blog/2020_06_05_interview_sylvain_conchon_joins_ocamlpro">Sylvain Conchon joined OCamlPro</a> as Formal Methods Chief Scientific Officer in 2020!</p> <h3> <a id="alt-ergo" class="anchor"></a><a class="anchor-link" href="#alt-ergo">Alt-Ergo Development</a> </h3> <p>OCamlPro develops and maintains <a href="https://alt-ergo.ocamlpro.com/">Alt-Ergo</a>, an automatic solver of mathematical formulas designed for program verification and based on Satisfiability Modulo Theories (SMT). Alt-Ergo was initially created within the <a href="https://vals.lri.fr/">VALS</a> team at <a href="https://www.universite-paris-saclay.fr/en">University of Paris-Saclay</a>.</p> <p>In 2020, we focused on the maintainability of our solver. The first part of this work was to maintain and fix issues within the already released version. The 2.3.0 (released in 2019) had some issues that needed to be fixed <a href="https://ocamlpro.github.io/alt-ergo/About/changes.html#version-2-3-2-march-23-2020">minor releases</a>.</p> <p>The second part of the maintainability work on Alt-Ergo contains more major features. All these features were released in the new <a href="https://alt-ergo.ocamlpro.com/#releases">version 2.4.0</a> of Alt-Ergo. The main goal of this release was to focus on the user experience and the documentation. This release also contains bug fixes and many other improvements. Alt-Ergo is on its way towards a new <a href="https://ocamlpro.github.io/alt-ergo/index.html">documentation</a> and in particular a new documentation on its <a href="https://ocamlpro.github.io/alt-ergo/Input_file_formats/Native/index.html">native syntax</a>.</p> <p>We also tried to improve the command line experience of our tools with the use of the <a href="https://erratique.ch/software/cmdliner">cmdliner library</a> to parse Alt-Ergo options. This library allows us to improve the manpage of our tool. We tried to harmonise the debug messages and to improve all of Alt-Ergo’s outputs to make it clearer for the users.</p> <h3> <a id="club" class="anchor"></a><a class="anchor-link" href="#club">Alt-Ergo Users’ Club and R&amp;D Projects</a> </h3> <p>We thank our partners from the <a href="https://alt-ergo.ocamlpro.com/#club">Alt-Ergo Users’ Club</a>, Adacore, CEA List, MERCE (Mitsubishi Electric R&amp;D Centre Europe) and Trust-In-Soft, for their trust. Their support allows us to maintain our tool.</p> <p>The club was launched in 2019 and the second annual meeting of the Alt-Ergo Users’ Club was held in mid-February 2020. 
Our annual meeting is the perfect place to review each partner’s needs regarding Alt-Ergo. This year, we had the pleasure of receiving our partners to discuss the roadmap for future Alt-Ergo developments and enhancements. If you want to join us for the next meeting (coming soon), contact us!</p> <p>We also want to thank our partners from the FUI R&amp;D Project LCHIP. Thanks to this project, we were able to add a new major feature in Alt-Ergo: the support for incremental commands (<code>push</code>, <code>pop</code> and <code>check-sat-assuming</code>) from the <a href="https://alt-ergo.ocamlpro.com/#releases">smt-lib2 standard</a>.</p> <h3> <a id="roadmap" class="anchor"></a><a class="anchor-link" href="#roadmap">Alt-Ergo’s Roadmap</a> </h3> <p>Some of the work we did in 2020 is not yet available. Thanks to our partner MERCE (Mitsubishi Electric R&amp;D Centre Europe), we worked on the SMT model generation. Alt-Ergo is now (partially) able to output a model in the smt-lib2 format. Thanks to the <a href="http://why3.lri.fr/">Why3 team</a> from University of Paris-Saclay, we hope that this work will be available in the Why3 platform to help users in their program verification efforts.</p> <p>Another project was launched in 2020 but is still in early development: the complete rework of our Try-Alt-Ergo website with new features such as model generation. The <a href="https://alt-ergo.ocamlpro.com/try.html">current version</a> of Try Alt-Ergo allows users to use Alt-Ergo directly from their browsers (Firefox, Chromium) without the need of a server for computations.</p> <p>This work needed a JavaScript-compatible version of Alt-Ergo. We have done some work to build our solver in two versions, one compatible with Node.js and another as a webworker. We hope that this work can make it easier to use our SMT solver in web applications.</p> <p><em>This work is funded in part by the FUI R&amp;D Project LCHIP, MERCE, Adacore and with the support of the <a href="https://alt-ergo.ocamlpro.com/#club">Alt-Ergo Users’ Club</a>.</em></p> <h2> <a id="rust" class="anchor"></a><a class="anchor-link" href="#rust">In the World of Rust</a> </h2> <p><em>Work by Adrien Champion</em></p> <p><img src="/blog/assets/img/logo_rust.jpg" alt="rust" /></p> <p>As OCaml-ians, we naturally saw in the Rust language a beautiful complement to our approach. One opportunity to explore this state-of-the-art language has been to pursue our work on ocp-memprof and build <a href="https://github.com/OCamlPro/memthol">Memthol</a>, a visualizer and analyzer to profile OCaml programs. It works on memory dumps containing information about the size and (de)allocation date of part of the allocations performed by some execution of a program.</p> <p>Between lockdowns, we’ve also been able to hold <a href="https://training.ocamlpro.com/">our Rust training</a>. It’s designed as a highly-modular vocational course, from 1 to 4 days.
The training covers a beginner introduction to Rust’s basic features, crucial features and libraries for real-life development, and advanced features, all through complex use-cases one would find in real life.</p> <p><em>This work was self-funded by OCamlPro as part of our exploration of other statically and strongly typed functional languages.</em></p> <h2> <a id="blockchains" class="anchor"></a><a class="anchor-link" href="#blockchains">In the World of Blockchain Languages</a> </h2> <p><em>Work by David Declerck and Steven de Oliveira</em></p> <p><img src="/blog/assets/img/logo_blockchain.jpg" alt="Blockchain languages" /></p> <p>One of our favourite activities is to develop new programming languages, specialized for specific domains, but with nice properties like clear semantics, strong typing, static typing and functional features. In 2020, we applied our skills in the domain of blockchains and smart contracts, with the creation of a new language, Love, and work on a well-known language, Solidity.</p> <p>In 2020, our blockchain experts released <a href="https://dune.network/docs/dune-node-next/love-doc/introduction.html">Love</a>, a type-safe language with an ML syntax and suited for formal verification. In a few words, Love is designed to be expressive for fast development, efficient in execution time and cheap in storage, and readable in terms of smart contract auditability. Yet, it has a clear and formal semantics and a strong type system to detect bugs. It allows contracts to use other contracts as libraries, and to call viewers on other contracts. Contracts developed in Love can also be formally verified.</p> <p>We also released a <a href="https://solidity.readthedocs.io/en/v0.6.8/">Solidity</a> parser and printer written in OCaml using Menhir, and used it to implement a full interpreter directly in a blockchain. Solidity is probably the most widely used language for smart contracts; it was first born on Ethereum, but many other blockchains provide it as a way to easily onboard new developers coming from the Ethereum ecosystem. In the future, we plan to extend this work with formal verification of Solidity smart contracts.</p> <p><em>This is a joint effort with <a href="https://www.origin-labs.com/">Origin Labs</a>, the company created to tackle blockchain-related challenges.</em></p> <h2>Towards 2021</h2> <p><img src="/blog/assets/img/picture_towards.jpg" alt="towards" /></p> <p>Adaptability and continuous improvement, that’s what 2020 brought to OCamlPro!</p> <p>We will remember 2020 as a complicated year, but one that allowed us to surpass ourselves and challenge our projects. We are very proud of our team who all continued to grow, learn, and develop our projects in this particular context. We are more motivated than ever for the coming year, which marks our tenth anniversary! We’re excited to continue sharing our knowledge of the OCaml world and to accompany you in your own projects.</p> Release of Alt-Ergo 2.4.0 https://ocamlpro.com/blog/2021_01_22_release_of_alt_ergo_2_4_0 2021-01-22T13:48:57Z 2021-01-22T13:48:57Z Albin Coquereau A new release of Alt-Ergo (version 2.4.0) is available. You can get it from Alt-Ergo's website. The associated opam package will be published in the next few days. This release contains some major novelties: Alt-Ergo supports incremental commands (push/pop) from the smt-lib standard. We switched co...
<p>A new release of Alt-Ergo (version 2.4.0) is available.</p> <p>You can get it from <a href="https://alt-ergo.ocamlpro.com/">Alt-Ergo's website</a>. The associated opam package will be published in the next few days.</p> <p>This release contains some major novelties:</p> <ul> <li>Alt-Ergo supports incremental commands (push/pop) from the <a href="https://smtlib.cs.uiowa.edu/">smt-lib </a>standard. </li> <li>We switched command line parsing to use <a href="https://erratique.ch/software/cmdliner">cmdliner</a>. You will need to use <code>--&lt;option name&gt;</code> instead of <code>-&lt;option name&gt;</code>. Some options have also been renamed, see the manpage or the documentation. </li> <li>We improved the online documentation of your solver, available <a href="https://ocamlpro.github.io/alt-ergo/">here</a>. </li> </ul> <p>This release also contains some minor novelties:</p> <ul> <li><code>.mlw</code> and <code>.why</code> extension are depreciated, the use of <code>.ae</code> extension is advised. </li> <li>Add <code>--input</code> (resp <code>--output</code>) option to manually set the input (resp output) file format </li> <li>Add <code>--pretty-output</code> option to add better debug formatting and to add colors </li> <li>Add exponentiation operation, <code>**</code> in native Alt-Ergo syntax. The operator is fully interpreted when applied to constants </li> <li>Fix <code>--steps-count</code> and improve the way steps are counted (AdaCore contribution) </li> <li>Add <code>--instantiation-heuristic</code> option that can enable lighter or heavier instantiation </li> <li>Reduce the instantiation context (considered foralls / exists) in CDCL-Tableaux to better mimic the Tableaux-like SAT solver </li> <li>Multiple bugfixes </li> </ul> <p>The full list of changes is available <a href="https://ocamlpro.github.io/alt-ergo/About/changes.html">here</a>. As usual, do not hesitate to report bugs, to ask questions, or to give your feedback!</p> opam 2.1.0~beta4 released https://ocamlpro.com/blog/2021_01_13_opam_2.1.0_beta4_released 2021-01-13T13:48:57Z 2021-01-13T13:48:57Z David Allsopp (OCamlLabs) Feedback on this post is welcomed on Discuss! On behalf of the opam team, it gives me great pleasure to announce the third beta release of opam 2.1. Don’t worry, you didn’t miss beta3 - we had an issue with a configure script that caused beta2 to report as beta3 in some instances, so we skipped ... <p><em>Feedback on this post is welcomed on <a href="https://discuss.ocaml.org/t/ann-opam-2-1-0-beta4/7252">Discuss</a>!</em></p> <p>On behalf of the opam team, it gives me great pleasure to announce the third beta release of opam 2.1. Don’t worry, you didn’t miss beta3 - we had an issue with a configure script that caused beta2 to report as beta3 in some instances, so we skipped to beta4 to avoid any further confusion!</p> <p>We encourage you to try out this new beta release: there are instructions for doing so in <a href="https://github.com/ocaml/opam/wiki/How-to-test-an-opam-feature">our wiki</a>. The instructions include taking a backup of your <code>~/.opam</code> root as part of the process, which can be restored in order to wind back. <em>Please note that local switches which are written to by opam 2.1 are upgraded and will need to be rebuilt if you go back to opam 2.0</em>. 
This can either be done by removing <code>_opam</code> and repeating whatever you use in your build process to create the switch, or you can use <code>opam switch export switch.export</code> to backup the switch to a file before installing new packages. Note that opam 2.1 <em>shouldn’t</em> upgrade a local switch unless you upgrade the base packages (i.e. the compiler).</p> <h2>What’s new in opam 2.1?</h2> <ul> <li>Switch invariants </li> <li>Improved options configuration (see the new <code>option</code> and expanded <code>var</code> sub-commands) </li> <li>Integration of system dependencies (formerly the opam-depext plugin), increasing their reliability as it integrates the solving step </li> <li>Creation of lock files for reproducible installations (formerly the opam-lock plugin) </li> <li>CLI versioning, allowing cleaner deprecations for opam now and also improvements to semantics in future without breaking backwards-compatibility </li> <li>Performance improvements to opam-update, conflict messages, and many other areas </li> <li>New plugins: opam-compiler and opam-monorepo </li> </ul> <h3>Switch invariants</h3> <p>In opam 2.0, when a switch is created the packages selected are put into the “base” of the switch. These packages are not normally considered for upgrade, in order to ease pressure on opam’s solver. This was a much bigger concern early on in opam 2.0’s development, but is less of a problem with the default mccs solver.</p> <p>However, it’s a problem for system compilers. opam would detect that your system compiler version had changed, but be unable to upgrade the ocaml-system package unless you went through a slightly convoluted process with <code>--unlock-base</code>.</p> <p>In opam 2.1, base packages have been replaced by switch invariants. The switch invariant is a package formula which must be satisfied on every upgrade and install. All existing switches’ base packages could just be expressed as <code>package1 &amp; package2 &amp; package3</code> etc. but opam 2.1 recognises many existing patterns and simplifies them, so in most cases the invariant will be <code>&quot;ocaml-base-compiler&quot; {= 4.11.1}</code>, etc. This means that <code>opam switch create my_switch ocaml-system</code> now creates a <em>switch invariant</em> of <code>&quot;ocaml-system&quot;</code> rather than a specific version of the <code>ocaml-system</code> package. If your system OCaml package is updated, <code>opam upgrade</code> will seamlessly switch to the new package.</p> <p>This also allows you to have switches which automatically install new point releases of OCaml. For example:</p> <pre><code class="language-shell-session">$~ opam switch create ocaml-4.11 --formula='&quot;ocaml-base-compiler&quot; {&gt;= &quot;4.11.0&quot; &amp; &lt; &quot;4.12.0~&quot;}' --repos=old=git+https://github.com/ocaml/opam-repository#a11299d81591 $~ opam install utop </code></pre> <p>Creates a switch with OCaml 4.11.0 (the <code>--repos=</code> was just to select a version of opam-repository from before 4.11.1 was released). Now issue:</p> <pre><code class="language-shell-session">$~ opam repo set-url old git+https://github.com/ocaml/opam-repository $~ opam upgrade </code></pre> <p>and opam 2.1 will automatically offer to upgrade OCaml 4.11.1 along with a rebuild of the switch. 
There’s not yet a clean CLI for specifying the formula, but we intend to iterate further on this with future opam releases so that there is an easier way of saying “install OCaml 4.11.x”.</p> <h3>opam depext integration</h3> <p>opam has long included the ability to install system dependencies automatically via the <a href="https://github.com/ocaml-opam/opam-depext">depext plugin</a>. This plugin has been promoted to a native feature of opam 2.1.0 onwards, giving the following benefits:</p> <ul> <li>You no longer have to remember to run <code>opam depext</code>, opam always checks depexts (there are options to disable this or automate it for CI use). Installation of an opam package in a CI system is now as easy as <code>opam install .</code>, without having to do the dance of <code>opam pin add -n/depext/install</code>. Just one command now for the common case! </li> <li>The solver is only called once, which both saves time and also stabilises the behaviour of opam in cases where the solver result is not stable. It was possible to get one package solution for the <code>opam depext</code> stage and a different solution for the <code>opam install</code> stage, resulting in some depexts missing. </li> <li>opam now has full knowledge of depexts, which means that packages can be automatically selected based on whether a system package is already installed. For example, if you have <em>neither</em> MariaDB nor MySQL dev libraries installed, <code>opam install mysql</code> will offer to install <code>conf-mysql</code> and <code>mysql</code>, but if you have the MariaDB dev libraries installed, opam will offer to install <code>conf-mariadb</code> and <code>mysql</code>. </li> </ul> <h3>opam lock files and reproducibility</h3> <p>When opam was first released, it had the mission of gathering together scattered OCaml source code to build a <a href="https://github.com/ocaml/opam-repository">community repository</a>. As time marches on, the size of the opam repository has grown tremendously, to over 3000 unique packages with over 18000 unique versions. opam looks at all these packages and is designed to solve for the best constraints for a given package, so that your project can keep up with releases of your dependencies.</p> <p>While this works well for libraries, we need a different strategy for projects that need to test and ship using a fixed set of dependencies. To satisfy this use-case, opam 2.0.0 shipped with support for <em>using</em> <code>project.opam.locked</code> files. These are normal opam files but with exact versions of dependencies. The lock file can be used as simply as <code>opam install . --locked</code> to have a reproducible package installation.</p> <p>With opam 2.1.0, the creation of lock files is also now integrated into the client:</p> <ul> <li><code>opam lock</code> will create a <code>.locked</code> file for your current switch and project, that you can check into the repository. </li> <li><code>opam switch create . --locked</code> can be used by users to reproduce your dependencies in a fresh switch. </li> </ul> <p>This lets a project simultaneously keep up with the latest dependencies (without lock files) while providing a stricter set for projects that need it (with lock files).</p> <h3>CLI Versioning</h3> <p>A new <code>--cli</code> switch was added to the first beta release, but it’s only now that it’s being widely used. opam is a complex enough system that sometimes bug fixes need to change the semantics of some commands. 
For example:</p> <ul> <li><code>opam show --file</code> needed to change behaviour </li> <li>The addition of new controls for setting global variables means that the <code>opam config</code> command was becoming cluttered and some things wanted to move to <code>opam var</code> </li> <li><code>opam switch install 4.11.1</code> still works in opam 2.0, but it’s really an OPAM 1.2.2 syntax. </li> </ul> <p>Changing the CLI is exceptionally painful since it can break scripts and tools which themselves need to drive <code>opam</code>. CLI versioning is our attempt to solve this. The feature is inspired by the <code>(lang dune ...)</code> stanza in <code>dune-project</code> files which has allowed the Dune project to rename variables and alter semantics without requiring every single package using Dune to upgrade their <code>dune</code> files on each release.</p> <p>Now you can specify which version of opam you expect the command to be run against. In day-to-day use of opam at the terminal, you wouldn’t specify it, and you’ll get the latest version of the CLI. For example: <code>opam var --global</code> is the same as <code>opam var --cli=2.1 --global</code>. However, if you issue <code>opam var --cli=2.0 --global</code>, you will be told that <code>--global</code> was added in 2.1 and so is not available to you. You can see similar things with the renaming of <code>opam upgrade --unlock-base</code> to <code>opam upgrade --update-invariant</code>.</p> <p>The intention is that <code>--cli</code> should be used in scripts, user guides (e.g. blog posts), and in software which calls opam. The only decision you have to take is the <em>oldest</em> version of opam which you need to support. If your script is using a new opam 2.1 feature (for example <code>opam switch create --formula=</code>) then you simply don’t support opam 2.0. If you need to support opam 2.0, then you can’t use <code>--formula</code> and should use <code>--packages</code> instead. opam 2.0 does not have the <code>--cli</code> option, so for opam 2.0 instead of <code>--cli=2.0</code> you should set the environment variable <code>OPAMCLI</code> to <code>2.0</code>. As with <em>all</em> opam command line switches, <code>OPAMCLI</code> is simply the equivalent of <code>--cli</code>, which opam 2.1 will pick up but opam 2.0 will quietly ignore (and, as with other options, the command line takes precedence over the environment).</p> <p>Note that opam 2.1 sets <code>OPAMCLI=2.0</code> when building packages, so in the rare instances where you need to use the <code>opam</code> command in a <em>package</em> <code>build:</code> command (or in your build system), you <em>must</em> specify <code>--cli=2.1</code> if you’re using new features.</p> <p>There’s even more detail on this feature <a href="https://github.com/ocaml/opam/wiki/Spec-for-opam-CLI-versioning">in our wiki</a>. 
We’re still finalising some details on exactly how <code>opam</code> behaves when <code>--cli</code> is not given, but we’re hoping that this feature will make it much easier in future releases for opam to make required changes and improvements to the CLI without breaking existing set-ups and tools.</p> <h2>What’s new since the last beta?</h2> <ul> <li>opam now uses CLI versioning (<a href="https://github.com/ocaml/opam/pull/4385">#4385</a>) </li> <li>opam now exits with code 31 if all failures were during fetch operations (<a href="https://github.com/ocaml/opam/issues/4214">#4214</a>) </li> <li><code>opam install</code> now has a <code>--download-only</code> flag (<a href="https://github.com/ocaml/opam/issues/4036">#4036</a>), allowing opam’s caches to be primed </li> <li><code>opam init</code> now advises the correct shell-specific command for <code>eval $(opam env)</code> (<a href="https://github.com/ocaml/opam/pull/4427">#4427</a>) </li> <li><code>post-install</code> hooks are now allowed to modify or remove installed files (<a href="https://github.com/ocaml/opam/pull/4388">#4388</a>) </li> <li>New package variable <code>opamfile-loc</code> with the location of the installed package opam file (<a href="https://github.com/ocaml/opam/pull/4402">#4402</a>) </li> <li><code>opam update</code> now has <code>--depexts</code> flag (<a href="https://github.com/ocaml/opam/issues/4355">#4355</a>), allowing the system package manager to update too </li> <li>depext support NetBSD and DragonFlyBSD added (<a href="https://github.com/ocaml/opam/pull/4396">#4396</a>) </li> <li>The format-preserving opam file printer has been overhauled (<a href="https://github.com/ocaml/opam/issues/3993">#3993</a>, <a href="https://github.com/ocaml/opam/pull/4298">#4298</a> and <a href="https://github.com/ocaml/opam/pull/4302">#4302</a>) </li> <li>pins are now fetched in parallel (<a href="https://github.com/ocaml/opam/issues/4315">#4315</a>) </li> <li><code>os-family=ubuntu</code> is now treated as <code>os-family=debian</code> (<a href="https://github.com/ocaml/opam/pull/4441">#4441</a>) </li> <li><code>opam lint</code> now checks that strings in filtered package formulae are booleans or variables (<a href="https://github.com/ocaml/opam/issues/4439">#4439</a>) </li> </ul> <p>and many other bug fixes as listed <a href="https://github.com/ocaml/opam/releases/tag/2.1.0-beta4">on the release page</a>.</p> <h2>New Plugins</h2> <p>Several features that were formerly plugins have been integrated into opam 2.1.0. We have also developed some <em>new</em> plugins that satisfy emerging workflows from the community and the core OCaml team. They are available for use with the opam 2.1 beta as well, and feedback on them should be directed to the respective GitHub trackers for those plugins.</p> <h3>opam compiler</h3> <p>The <a href="https://github.com/ocaml-opam/opam-compiler"><code>opam compiler</code></a> plugin can be used to create switches from various sources such as the main opam repository, the ocaml-multicore fork, or a local development directory. It can use Git tag names, branch names, or PR numbers to specify what to install.</p> <p>Once installed, these are normal opam switches, and one can install packages in them. 
To iterate on a compiler feature and try opam packages at the same time, it supports two ways to reinstall the compiler: either a safe and slow technique that will reinstall all packages, or a quick way that will just overwrite the compiler in place.</p> <h3>opam monorepo</h3> <p>The <a href="https://github.com/ocamllabs/opam-monorepo"><code>opam monorepo</code></a> plugin lets you assemble standalone dune workspaces with your projects and all of their opam dependencies, letting you build it all from scratch using only Dune and OCaml. This satisfies the “monorepo” workflow which is commonly requested by large projects that need all of their dependencies in one place. It is also being used by projects that need global cross-compilation for all aspects of a codebase (including C stubs in packages), such as the MirageOS unikernel framework.</p> <h2>Next Steps</h2> <p>This is anticipated to be the final beta in the 2.1 series, and we will be moving to release candidate status after this. We could really use your help testing this release in your infrastructure and projects; please let us know if you run into any blockers. If you have feature requests, please also report them on <a href="https://github.com/ocaml/opam/issues">our issue tracker</a> -- we will be planning the next release cycle once we ship opam 2.1.0 shortly.</p> Memthol: exploring program profiling https://ocamlpro.com/blog/2020_12_01_memthol_exploring_program_profiling 2020-12-01T13:48:57Z 2020-12-01T13:48:57Z Adrien Champion Memthol is a visualizer and analyzer for program profiling. It works on memory dumps containing information about the size and (de)allocation date of part of the allocations performed by some execution of a program. For information regarding building memthol, features, browser compatibility… refer... <p><img src="/blog/assets/img/banner_memprof_banniere_blue.png" alt="" /></p> <p><em>Memthol</em> is a visualizer and analyzer for program profiling. It works on memory <em>dumps</em> containing information about the size and (de)allocation date of part of the allocations performed by some execution of a program.</p> <blockquote> <p>For information regarding building memthol, features, browser compatibility… refer to the <a href="https://github.com/OCamlPro/memthol">memthol github repository</a>. <em>Please note that Memthol, as a side project, is a work in progress that remains in beta status for now.</em></p> </blockquote> <p><img src="https://raw.githubusercontent.com/OCamlPro/memthol/master/rsc/example.png" alt="" /></p> <h4>Memthol's background</h4> <p>The Memthol work was started more than a year ago (we had published a short introductory paper at the <a href="https://jfla.inria.fr/jfla2020.html">JFLA2020</a>). The whole idea was to use the previous work originally achieved on <a href="https://memprof.typerex.org/">ocp-memprof</a>, and look for some extra funding to achieve a usable and industrial version. Then came the excellent <a href="https://blog.janestreet.com/finding-memory-leaks-with-memtrace/">memtrace profiler</a> by Jane Street's team (congrats!). Memthol is a self-funded side project that we think is still worth giving to the OCaml community. Its approach is valuable, and can be complementary. 
It is released under the free GPL licence v3.</p> <h4>Memthol's versatility: supporting memtrace's dump format</h4> <p>The memtrace format is nicely designed and polished enough to be considered a future standard for other tools. This is why Memthol supports Jane Street's <em>dumper</em> format, instead of our own dumper library's.</p> <h4>Why choose Rust to implement Memthol?</h4> <p>We've been exploring the Rust language for more than a year now. The Memthol work was the opportunity to further explore this state-of-the-art language. <em>We are open to extra funding, to deepen the Memthol work should industrial users be interested.</em></p> <h4>Memthol's How-to</h4> <blockquote> <p>The following steps are from the <a href="https://ocamlpro.github.io/memthol/mini_tutorial/">Memthol Github howto</a>.</p> <ul> <li><strong>1.</strong> <a href="https://ocamlpro.github.io/memthol/mini_tutorial/">Introduction</a> </li> <li><strong>2.</strong> <a href="https://ocamlpro.github.io/memthol/mini_tutorial/basics.html">Basics</a> </li> <li><strong>3.</strong> <a href="https://ocamlpro.github.io/memthol/mini_tutorial/charts.html">Charts</a> </li> <li><strong>4.</strong> <a href="https://ocamlpro.github.io/memthol/mini_tutorial/global_settings.html">Global Settings</a> </li> <li><strong>5.</strong> <a href="https://ocamlpro.github.io/memthol/mini_tutorial/callstack_filters.html">Callstack Filters</a> </li> </ul> </blockquote> <h2>Introduction</h2> <p>This tutorial deals with the BUI (<strong>B</strong>rowser <strong>U</strong>ser <strong>I</strong>nterface) aspect of the profiling. How the dumps are generated is outside of the scope of this document. Currently, memthol accepts memory dumps produced by <a href="https://blog.janestreet.com/finding-memory-leaks-with-memtrace"><em>Memtrace</em></a> (github repository <a href="https://github.com/janestreet/memtrace">here</a>). A memtrace dump for a program execution is a single <a href="https://diamon.org/ctf"> <strong>C</strong>ommon <strong>T</strong>race <strong>F</strong>ormat</a> (CTF) file.</p> <p>This tutorial uses CTF files from the memthol repository. All paths mentioned in the examples are from its root.</p> <p>Memthol is written in Rust and is composed of</p> <ul> <li>a server, written in pure Rust, and </li> <li>a client, written in Rust and compiled to web assembly. </li> </ul> <p>The server contains the client, which it will serve at some address on some port when launched.</p> <h3>Running Memthol</h3> <p>Memthol must be given a path to a CTF file generated by memtrace.</p> <pre><code class="language-shell-session">&gt; ls rsc/dumps/ctf/flamba.ctf rsc/dumps/ctf/flamba.ctf &gt; memthol rsc/dumps/ctf/flamba.ctf |===| Starting | url: http://localhost:7878 | target: `rsc/dumps/ctf/flamba.ctf` |===| </code></pre> <h2>Basics</h2> <p>Our running example in this section will be <code>rsc/dumps/ctf/mini_ae.ctf</code>:</p> <pre><code class="language-shell-session">❯ memthol --filter_gen none rsc/dumps/ctf/mini_ae.ctf |===| Starting | url: http://localhost:7878 | target: `rsc/dumps/ctf/mini_ae.ctf` |===| </code></pre> <p>Notice the odd <code>--filter_gen none</code> passed to memthol. 
Ignore it for now, it will be discussed later in this section.</p> <p>Once memthol is running, <code>http://localhost:7878/</code> will lead you to memthol's BUI, which should look something like this:</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/basics_pics/default.png" alt="" /></p> <p>Click on the orange <strong>everything</strong> tab at the bottom left of the screen.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/basics_pics/three_parts.png" alt="" /></p> <p>Memthol's interface is split into three parts:</p> <ul> <li>the central, main part displays charts. There is only one here, showing the evolution of the program's total memory size over time based on the memory dump. </li> <li>the header gives statistics about the memory dump and handles general settings. There is currently only one, the <em>time window</em>. </li> <li>the footer controls your <em>filters</em> (there is only one here), which we are going to discuss right now. </li> </ul> <h3>Filters</h3> <p><em>Filters</em> allow you to split allocations and display them separately. A filter is essentially a set of allocations. Memthol has two built-in filters. The first one is the <strong>everything</strong> filter. You cannot really do anything with it except for changing its name and color using the filter settings in the footer.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/basics_pics/everything_name_color.png" alt="" /></p> <p>Notice that when a filter is modified, two buttons appear in the top-left part of the footer. The first reverts the changes while the second one saves them. Let's save these changes.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/basics_pics/everything_saved.png" alt="" /></p> <p>The <strong>everything</strong> filter always contains all allocations in the memory dump. It cannot be changed besides the cosmetic changes we just did. These changes are reverted in the rest of the section.</p> <h3>Custom Filters</h3> <p>Let's create a new filter using the <code>+</code> add button in the top-right part of the footer.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/basics_pics/new_filter.png" alt="" /></p> <p>Notice that, unlike <strong>everything</strong>, the settings for our new filter have a <strong>Catch allocation if …</strong> (empty) section with a <code>+</code> add button. Let's click on that.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/basics_pics/new_sub_filter.png" alt="" /></p> <p>This adds a criterion to our filter. Let's modify it so that our filter catches everything of size greater than zero machine words, rename the filter, and save these changes.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/basics_pics/new_filter_1.png" alt="" /></p> <p>The tab for our filter now shows <strong>(3)</strong> next to its name, indicating that this filter catches 3 allocations, which is all the allocations of the (tiny) dump.</p> <p>Now, create a new filter and modify it so that it catches allocations made in file <code>weak.ml</code>. 
This requires</p> <ul> <li>creating a filter, </li> <li>adding a criterion to that filter, </li> <li>switching it from <code>size</code> to <code>callstack</code>, </li> <li>removing the trailing <code>**</code> (anything) by erasing it, </li> <li>writing <code>weak.ml</code> as the last file that should appear in the callstack. </li> </ul> <p>After saving it, you should get the following.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/basics_pics/new_filter_2.png" alt="" /></p> <p>Sadly, this filter does not match anything, although some allocations fit this filter. This is because a <strong>custom filter</strong> <code>F</code> “catches” an allocation if</p> <ul> <li>all of the criteria of <code>F</code> are true for this allocation, and </li> <li>the allocation is not caught by any <strong>custom</strong> filter at the left of <code>F</code> (note that the <strong>everything</strong> filter is not a <strong>custom filter</strong>). </li> </ul> <p>In other words, all allocations go through the list of custom filters from left to right, and are caught by the first filter such that all of its criteria are true for this allocation. As such, it is similar to switch/case and pattern matching.</p> <p>Let's move our new filter to the left by clicking the left arrow next to it, and save the change.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/basics_pics/new_filter_3.png" alt="" /></p> <p>Nice.</p> <p>You can remove a filter by selecting it and clicking the <code>-</code> remove button in the top-right part of the footer, next to the <code>+</code> add filter button. This only works for <strong>custom</strong> filters; you cannot remove built-in filters.</p> <p>Now, remove the first filter we created (size ≥ 0), which should give you this:</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/basics_pics/new_filter_4.png" alt="" /></p> <p>Out of nowhere, we get the second and last built-in filter: <strong>catch-all</strong>. When some allocations are not caught by any of your filters, they will end up in this filter. <strong>Catch-all</strong> is not visible when it does not catch any allocation, which is why it was (mostly) not visible until now. The filters we wrote previously were catching all the allocations.</p> <blockquote> <p>In the switch/case analogy, <strong>catch-all</strong> is the <code>else</code>/<code>default</code> branch. In pattern matching, it would be a trailing wildcard <code>_</code>.</p> </blockquote> <p>So, <code>weak.ml</code> only catches one of the three allocations: <strong>catch-all</strong> appears and indicates it matches the remaining two.</p> <blockquote> <p>It is also possible to write filter criteria over allocations' callstacks. This is discussed in the <a href="https://ocamlpro.github.io/memthol/mini_tutorial/callstack_filters.html">Callstack Filters Section</a>.</p> </blockquote> <h3>Filter Generation</h3> <p>When we launched this section's running example, we passed <code>--filter_gen none</code> to memthol. This is because, by default, memthol will run <em>automatic filter generation</em> which scans allocations and generates filters. 
The default (and currently only) one creates one filter per allocation-site file.</p> <blockquote> <p>For more details, in particular filter generation customization, run <code>memthol --filter_gen help</code>.</p> </blockquote> <p>If we relaunch the example without <code>--filter_gen none</code></p> <pre><code class="language-shell-session">❯ memthol rsc/dumps/ctf/mini_ae.ctf |===| Starting | url: http://localhost:7878 | target: `rsc/dumps/ctf/mini_ae.ctf` |===| </code></pre> <p>we get something like this (actual colors may vary):</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/basics_pics/filter_gen.png" alt="" /></p> <h2>Charts</h2> <p>This section uses the same running example as the last section.</p> <pre><code class="language-shell-session">❯ memthol rsc/dumps/ctf/mini_ae.ctf |===| Starting | url: http://localhost:7878 | target: `rsc/dumps/ctf/mini_ae.ctf` |===| </code></pre> <h3>Filter Toggling</h3> <p>The first way to interact with a chart is to (de)activate filters. Each chart has its own filter tabs allowing you to toggle filters on/off.</p> <p>From the initial settings</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/charts_pics/init.png" alt="" /></p> <p>click on all filters but <strong>everything</strong> to toggle them off.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/charts_pics/only_everything.png" alt="" /></p> <p>Let's create a new chart. The only kind of chart that can be constructed currently is total size over time, so click on <strong>create chart</strong> below our current, lone chart.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/charts_pics/two_charts_1.png" alt="" /></p> <p>Deactivate <strong>everything</strong> in the second chart.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/charts_pics/two_charts_2.png" alt="" /></p> <p>Nice. We now have the overall total size over time in the first chart, and the details for each filter in the second one.</p> <p>Next, notice that both charts have, on the left of their title, a down (first chart) and up (second chart) arrow. This moves the charts up and down.</p> <p>On the right of the title, we have a settings <code>...</code> button which is discussed <a href="https://ocamlpro.github.io/memthol/mini_tutorial/charts.html#chart-settings">below</a>. The next button collapses the chart. If we click on the <em>collapse</em> button of the first chart, it collapses and the button turns into an <em>expand</em> button.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/charts_pics/collapsed.png" alt="" /></p> <p>The last button in the chart header removes the chart.</p> <h3>Chart Settings</h3> <p>Clicking the settings <code>...</code> button in the header of any chart displays its settings. (Clicking on the button again hides them.)</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/charts_pics/settings_1.png" alt="" /></p> <p>Currently, these chart settings only allow renaming the chart and changing its <strong>display mode</strong>.</p> <h4>Display Mode</h4> <p>In memthol, a chart can be displayed in one of three ways:</p> <ul> <li>normal, the one we used so far, </li> <li>stacked area, where the values of each filter are displayed on top of each other, and </li> <li>stacked area percent, same as stacked area but values are displayed as percents of the total. 
</li> </ul> <p>Here is the second chart from our example displayed as stacked area for instance:</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/charts_pics/settings_stacked.png" alt="" /></p> <h2>Global Settings</h2> <p>This section uses the same running example as the last section.</p> <pre><code class="language-shell-session">❯ memthol rsc/dumps/ctf/mini_ae.ctf |===| Starting | url: http://localhost:7878 | target: `rsc/dumps/ctf/mini_ae.ctf` |===| </code></pre> <p>There is currently only one global setting: the <em>time window</em>.</p> <h3>Time Window</h3> <p>The <em>time window</em> global setting controls the time interval displayed by all the charts.</p> <p>In our example,</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/global_settings_pics/init.png" alt="" /></p> <p>not much is happening before (roughly) <code>0.065</code> seconds. Let's have the time window start at that point:</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/global_settings_pics/time_window_1.png" alt="" /></p> <p>Similar to filter editing, we can apply or cancel this change using the two buttons that appeared in the bottom-left corner of the header.</p> <p>Saving these changes yields</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/global_settings_pics/time_window_2.png" alt="" /></p> <p>Here is the same chart but with the time window upper-bound set at <code>0.074</code>.</p> <p><img src="https://ocamlpro.github.io/memthol/mini_tutorial/global_settings_pics/time_window_3.png" alt="" /></p> <h2>Callstack Filters</h2> <p>Callstack filters are filters operating over allocation properties that are sequences of strings (potentially with some other data). Currently, this means <strong>allocation callstacks</strong>, where the strings are file names with line/column information.</p> <h3>String Filters</h3> <p>A string filter can have three shapes: an actual <em>string value</em>, a <em>regex</em>, or a <em>match anything</em> / <em>wildcard</em> filter represented by the string <code>&quot;...&quot;</code>. This wildcard filter is discussed in <a href="https://ocamlpro.github.io/memthol/mini_tutorial/callstack_filters.html#the-wildcard-filter">its own section</a> below.</p> <p>A string value is simply given as a value. To match precisely the string <code>&quot;file_name&quot;</code>, one only needs to write <code>file_name</code>. So, a filter that matches precisely the list of strings <code>[ &quot;file_name_1&quot;, &quot;file_name_2&quot; ]</code> will be written</p> <table> <thead> <tr> <th> </th> <th> </th> <th> </th> </tr> </thead> <tbody> <tr> <td>string list</td> <td>contains</td> <td>`[ file_name_1 file_name_2 ]`</td> </tr> </tbody> </table> <p>A <em>regex</em> on the other hand has to be written between <code>#&quot;</code> and <code>&quot;#</code>. If we want the same filter as above, but want to relax the first string description to be <code>file_name_&lt;i&gt;</code> where <code>&lt;i&gt;</code> is a single digit, we write the filter as</p> <table> <thead> <tr> <th> </th> <th> </th> <th> </th> </tr> </thead> <tbody> <tr> <td>string list</td> <td>contains</td> <td>`[ #"file_name_[0-9]"# file_name_2 ]`</td> </tr> </tbody> </table> <h3>The Wildcard Filter</h3> <p>The wildcard filter, written <code>...</code>, <strong>lazily</strong> (in general, see below) matches a repetition of any string-like element of the list. 
To break this definition down, let us separate two cases: the first one is when <code>...</code> is not followed by another string-like filter, and the second one is when it is followed by another filter.</p> <p>In the first case, <code>...</code> simply matches everything. Consider for instance the filter</p> <table> <thead> <tr> <th> </th> <th> </th> <th> </th> </tr> </thead> <tbody> <tr> <td>string list</td> <td>contain</td> <td>`[ #"file_name_[0-9]"# ... ]`</td> </tr> </tbody> </table> <p>This filter matches any list of strings that starts with a string accepted by the first regex filter. The following lists of strings are all accepted by the filter above.</p> <ul> <li><code>[ file_name_0 ]</code> </li> <li><code>[ file_name_7 anything at all ]</code> </li> <li><code>[ file_name_3 file_name_7 ]</code> </li> </ul> <p>Now, there is one case when <code>...</code> is not actually lazy: when the <code>n</code> string-filters <em>after</em> it are not <code>...</code>. In this case, all elements of the list but the <code>n</code> last ones will be skipped, leaving them for the <code>n</code> last string filters.</p> <p>For this reason</p> <table> <thead> <tr> <th> </th> <th> </th> <th> </th> </tr> </thead> <tbody> <tr> <td>string list</td> <td>contain</td> <td>`[ ... #"file_name_[0-9]"# ]`</td> </tr> </tbody> </table> <p>does work as expected. For example, on the string list</p> <pre><code class="language-shell-session">[ &quot;some_file_name&quot; &quot;file_name_7&quot; &quot;another_file_name&quot; &quot;file_name_0&quot; ] </code></pre> <p>a lazy behavior would not match. First, <code>...</code> would match anything up to and excluding a string recognized by <code>#&quot;file_name_[0-9]&quot;#</code>. So <code>...</code> would match <code>some_file_name</code>, but that's it since <code>file_name_7</code> is a match for <code>#&quot;file_name_[0-9]&quot;#</code>. Hence the filter would reject this list of strings, because there should be nothing left after the match for <code>#&quot;file_name_[0-9]&quot;#</code>. But there are still <code>another_file_name</code> and <code>file_name_0</code> left.</p> <p>Instead, the filter works as expected. <code>...</code> discards all elements but the last one, <code>file_name_0</code>, which is accepted by <code>#&quot;file_name_[0-9]&quot;#</code>.</p> <h3>Callstack (Location) Filters</h3> <p>Allocation callstack information is a list of tuples containing:</p> <ul> <li>the name of the file, </li> <li>the line in the file, </li> <li>a column range. </li> </ul> <p>Currently, the range information is ignored. The line in the file is not, and one can specify a line constraint while writing a callstack filter. The <em>normal</em> syntax is</p> <pre><code class="language-shell-session">&lt;string-filter&gt;:&lt;line-filter&gt; </code></pre> <p>Now, a line filter has two basic shapes</p> <ul> <li><code>_</code>: anything, </li> <li><code>&lt;number&gt;</code>: an actual value. </li> </ul> <p>It can also be a range:</p> <ul> <li><code>[&lt;basic-line-filter&gt;, &lt;basic-line-filter&gt;]</code>: a potentially open range. 
</li> </ul> <h4>Line Filter Examples</h4> <table> <thead> <tr> <th> </th> <th> </th> </tr> </thead> <tbody> <tr> <td>`_`</td> <td>matches any line at all</td> </tr> <tr> <td>`7`</td> <td>matches line 7</td> </tr> <tr> <td>`[50, 102]`</td> <td>matches any line between `50` and `102`</td> </tr> <tr> <td>`[50, _]`</td> <td>matches any line greater than `50`</td> </tr> <tr> <td>`[_, 102]`</td> <td>matches any line less than `102`</td> </tr> <tr> <td>`[_, _]`</td> <td>same as `_` (matches any line)</td> </tr> </tbody> </table> <h4>Callstack Filter Examples</h4> <p>Whitespaces are inserted for readability but are not needed:</p> <table> <thead> <tr> <th> </th> <th> </th> </tr> </thead> <tbody> <tr> <td>`src/main.ml : _`</td> <td>matches any line of `src/main.ml`</td> </tr> <tr> <td>`#".*/main.ml"# : 107`</td> <td>matches line 107 of any `main.ml` file regardless of its path</td> </tr> </tbody> </table> Rehabilitating Packs using Functors and Recursivity, part 2. https://ocamlpro.com/blog/2020_09_30_rehabilitating_packs_using_functors_and_recursivity_part_2 2020-09-30T13:48:57Z 2020-09-30T13:48:57Z Pierrick Couderc This blog post and the previous one about functor packs cover two RFCs currently developed by OCamlPro and Jane Street. We previously introduced functor packs, a new feature adding the possibility to compile packs as functors, allowing the user to implement functors as multiple source files or even ... <p><img src="/blog/assets/img/train.jpg" alt="" /></p> <p>This blog post and the previous one about <a href="/blog/2020_09_24_rehabilitating_packs_using_functors_and_recursivity_part_1">functor packs</a> cover two RFCs currently developed by OCamlPro and Jane Street. We previously introduced functor packs, a new feature adding the possibility to compile packs as functors, allowing the user to implement functors as multiple source files or even parameterized libraries.</p> <p>In this blog post, we will cover the other aspect of the packs rehabilitation: allowing anyone to implement recursive compilation units using packs (as described formally in the <a href="https://github.com/ocaml/RFCs/pull/20">RFC#20</a>). Our previous post introduced briefly how packs were compiled and why we needed some bits of closure conversion to effectively implement big functors. Once again, to implement recursive packs we will need to encode modules through this technique; as such, we advise the reader to check at least the introduction and the compilation part of functor packs.</p> <h2>Recursive modules through recursive packs</h2> <p>Recursive modules are a feature long available in the compiler, but restricted to modules, not compilation units. As such, it is impossible to write two files that depend on each other, except by using scripts that tie up these modules into a single compilation file. Due to the internal representation of recursive modules, it would be difficult to implement recursive (and mutually recursive) compilation units. However, we could use packs to implement these.</p> <p>One common example of recursive modules is trees whose nodes are represented by sets. 
To implement such a data structure with the standard library we need recursive modules: <code>Set</code> is a functor that takes as parameter a module describing the values embedded in the set, but in our case the type needs the already applied functor.</p> <pre><code class="language-Ocaml">module rec T : sig type t = Leaf of int | Node of TSet.t val compare : t -&gt; t -&gt; int end = struct type t = Leaf of int | Node of TSet.t let compare t1 t2 = match t1, t2 with Leaf v1, Leaf v2 -&gt; Int.compare v1 v2 | Node s1, Node s2 -&gt; TSet.compare s1 s2 | Leaf _, Node _ -&gt; -1 | Node _, Leaf _ -&gt; 1 end and TSet : Set.S with type elt = T.t = Set.Make(T) </code></pre> <p>With recursive packs, we can simply put <code>T</code> and <code>TSet</code> into their respective files (<code>t.ml</code> and <code>tSet.ml</code>), and tie them into one module (let's name it <code>P</code>). Signatures of recursive modules cannot be inferred, so we also need to define <code>t.mli</code> and <code>tSet.mli</code>. Both must be compiled simultaneously since they refer to each other. The result of the compilation is the following:</p> <pre><code class="language-shell-session">ocamlopt -c -for-pack P -recursive t.mli tSet.mli ocamlopt -c -for-pack P -pack-is-recursive P t.ml ocamlopt -c -for-pack P -pack-is-recursive P tSet.ml ocamlopt -o p.cmx -recursive-pack t.cmx tSet.cmx </code></pre> <p>We have three new compilation options:</p> <ul> <li><code>-recursive</code> indicates to the compiler to typecheck all the given <code>mli</code>s simultaneously, as recursive modules. </li> <li><code>-pack-is-recursive</code> indicates which pack(s) in the hierarchy are meant to be recursive. This is necessary since it determines how the module must be compiled (<em>i.e.</em> whether we will need to apply closure conversion). </li> <li><code>-recursive-pack</code> generates a pack that deals with the initialization of its modules, as for recursive modules. </li> </ul> <h3>Recursive modules compilation</h3> <p>One may be wondering why we need packs to compile recursive modules. Let's take a look at how they are encoded. We will craft a naive example that is simple enough once compiled:</p> <pre><code class="language-Ocaml">module rec Even : sig val test: int -&gt; bool end = struct let test i = if i-1 &lt;= 0 then false else Odd.test (i-1) end and Odd : sig val test: int -&gt; bool end = struct let test i = if i-1 &lt;= 0 then true else Even.test (i-1) end </code></pre> <p>It defines two modules <code>Even</code> and <code>Odd</code> that both test whether an integer is even or odd, and if that is not the case call the test function from the other module. Not a really interesting use of recursive modules obviously. The compilation schema for recursive modules is the following:</p> <ul> <li>First, it allocates empty blocks for each module according to its <strong>shape</strong> (how many values are bound and what size they need in the block, if the module is a functor and what are its values, etc). </li> <li>Then these blocks are filled with the implementation. </li> </ul> <p>In our case, in a pseudo-code that is a bit higher order than Lambda (the first intermediate language of OCaml) it would translate as:</p> <pre><code class="language-Ocaml">module Even = &lt;allocation of the shape of even.cmx&gt; module Odd = &lt;allocation of the shape of odd.cmx&gt; Even := &lt;struct let test = .. end&gt; Odd := &lt;struct let test = .. 
end&gt; </code></pre> <p>This ensures that every reference to <code>Even</code> in <code>Odd</code> (and vice-versa) is a valid pointer. To respect this schema, we will use packs to tie recursive modules together. Without packs, this means we would generate this code when linking the units into an executable, which can be tricky. The pack can simply do it as initialization code.</p> <h3>Compiling modules for recursive pack</h3> <p>If we tried to compile these modules naively, we would end up in the same situation as for the functor pack: the compilation units would refer to identifiers that do not exist at the time they are generated. Moreover, the initialization part needs to know the shape of the compilation unit to be able to allocate precisely the block that will contain the recursive module. In order to implement recursive compilation units into packs, we extend the compilation units in two ways:</p> <ul> <li>The shape of the unit is computed and stored in the <code>cmo</code> (or <code>cmx</code>). </li> <li>As for functor packs, we apply closure conversion on the free variables that are modules from the same pack or from packs above in the hierarchy as long as they are recursive. </li> </ul> <p>As an example, we will reuse our <code>Even</code> / <code>Odd</code> example above and split it into two units <code>even.ml</code> and <code>odd.ml</code>, and compile them into a recursive pack <code>P</code>. Both have the same shape: a module with a single value. <code>Even</code> refers to a free variable <code>Odd</code>, which is in the same recursive pack, and vice-versa. The result of the closure conversion is a function that will take the pointer resulting from the initialization. Since the module is also recursive itself, it takes its own pointer resulting from its initialization. The result will look something like:</p> <pre><code class="language-Ocaml">(* even.cmx *) module Even_rec (Even: &lt;even.mli&gt;)(Odd: &lt;odd.mli&gt;) = .. (* odd.cmx *) module Odd_rec (Odd: &lt;odd.mli&gt;)(Even: &lt;even.mli&gt;) = .. (* p.cmx *) module Even = &lt;allocation of the shape of even.cmx&gt; module Odd = &lt;allocation of the shape of odd.cmx&gt; Even := Even_rec(Even)(Odd) Odd := Odd_rec(Odd)(Even) </code></pre> <h2>Rejuvenating packs</h2> <p>Under the hood, these new features come with some refactoring in the pack implementation which follows work done for the RFC on the <a href="https://github.com/ocaml/RFCs/pull/13">representation of symbols</a> in the middle-end of the compiler. Packs were not really used anymore and were deprecated by module aliases; this work makes them relevant again. These RFCs improve the OCaml ecosystem in multiple ways:</p> <ul> <li>Compilation units are now on par with modules, since they can be functors. </li> <li>Functor packs allow developers to implement parameterized libraries, without having to rely on scripts to produce multiple libraries linked with different <em>backends</em> (for example, Cohttp can use Lwt or Async as backend, and provides two libraries, one for each of these). </li> <li>Recursive packs allow the implementation of recursive modules into separate files. </li> </ul> <p>We hope that such improvements will benefit the users and library developers.</p> 
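<p>As a quick illustration of that last point, here is a minimal sketch (hypothetical client code, not part of the original post) showing how the recursive units of the <code>Even</code> / <code>Odd</code> pack <code>P</code> built above could then be used, like ordinary packed sub-modules:</p> <pre><code class="language-Ocaml">(* Hypothetical client code: assumes the recursive pack P has been built
   from even.ml and odd.ml as shown above, both units exposing
   val test : int -&gt; bool. *)
let () =
  (* With the naive definitions above, these two calls evaluate to
     true and false respectively. *)
  assert (P.Even.test 10);
  assert (not (P.Even.test 7));
  print_endline &quot;recursive pack P behaves as expected&quot;
</code></pre>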
<p>Having a way to implement parameterized libraries without having to describe big functors by hand, or use mutually recursive compilation units without using scripts to generate a unique <code>ml</code> file, will certainly introduce new workflows.</p> Rehabilitating Packs using Functors and Recursivity, part 1. https://ocamlpro.com/blog/2020_09_24_rehabilitating_packs_using_functors_and_recursivity_part_1 2020-09-24T13:48:57Z 2020-09-24T13:48:57Z Pierrick Couderc OCamlPro has a long history of dedicated efforts to support the development of the OCaml compiler, through sponsorship or direct contributions from Flambda Team. An important one is the Flambda intermediate representation designed for optimizations, and in the future its next iteration Flambda 2. Th... <p><img src="/blog/assets/img/train.jpg" alt="" /></p> <p>OCamlPro has a long history of dedicated efforts to support the development of the OCaml compiler, through sponsorship or direct contributions from <em>Flambda Team</em>. An important one is the Flambda intermediate representation designed for optimizations, and in the future its next iteration Flambda 2. This work is funded by JaneStreet.</p> <p>Packs in the OCaml ecosystem are kind of an outdated concept (options <code>-pack</code> and <code>-for-pack</code> in the <a href="https://caml.inria.fr/pub/docs/manual-ocaml/comp.html">OCaml manual</a>), and their main utility has been overtaken by the introduction of <a href="https://caml.inria.fr/pub/docs/manual-ocaml/modulealias.html">module aliases</a> in OCaml 4.02. What if we tried to redeem them and give them new life and utility by adding the possibility to generate functors or recursive packs?</p> <p>This blog post covers <a href="https://github.com/ocaml/RFCs/pull/11">functor units and functor packs</a>, while the next one will be centered around <a href="https://github.com/ocaml/RFCs/pull/20">recursive packs</a>. Both RFCs are currently developed by JaneStreet and OCamlPro. This idea was initially introduced by <a href="/blog/2011_08_10_packing_and_functors">functor packs</a> (Fabrice Le Fessant) and later generalized by <a href="https://ocaml.org/meetings/ocaml/2014/ocaml2014_8.pdf">functorized namespaces</a> (Pierrick Couderc et al.).</p> <h2>Packs for the masses</h2> <p>First of all, let's take a look at what packs are, and how they fixed some issues that arose when the ecosystem started to grow and the number of libraries got quite large.</p> <p>One common problem in any programming language is how names are treated and disambiguated. For example, look at this small piece of code:</p> <pre><code class="language-Ocaml">let x = &quot;something&quot; let x = &quot;something else&quot; </code></pre> <p>We declare two variables <code>x</code>, but actually the first one is shadowed by the second, and is now unavailable for the rest of the program. It is perfectly valid in OCaml. Let's try to do the same thing with modules:</p> <pre><code class="language-Ocaml">module M = struct end module M = struct end </code></pre> <p>The compiler rejects it with the following error:</p> <pre><code class="language-shell-session">File &quot;m.ml&quot;, line 3, characters 0-21: 3 | module M = struct end ^^^^^^^^^^^^^^^^^^^^^ Error: Multiple definition of the module name M. Names must be unique in a given structure or signature. </code></pre> <p>This also applies to programs linking two compilation units of the same name. 
Imagine you are using two libraries (here <code>lib_a</code> and <code>lib_b</code>) that both define a module named <code>Misc</code>.</p> <pre><code class="language-shell-session">ocamlopt -o prog.asm -I lib_a -I lib_b lib_a.cmxa lib_b.cmxa prog.ml File &quot;prog.ml&quot;, line 1: Error: The files lib_a/a.cmi and lib_b/b.cmi make inconsistent assumptions over interface Misc </code></pre> <p>At link time, the compiler will reject your program since you are trying to link two modules with the same name but different implementations. The compiler is unable to differentiate the two compilation units since they define some identical symbols, and as such cannot link the program. Enforcing unique module names in the same namespace (<em>i.e.</em> a signature) is consistent with the inability to link two modules of the same name in the same program.</p> <p>However, <code>Misc</code> is a common name for a module in any project. How can we avoid that? As a user of the libraries there is nothing you can do, since you cannot rename the modules (you will eventually need to link two files named <code>misc.cmx</code>). As the developer, you need to ensure that your module names are unique enough to be used alongside any other libraries. One solution would be to use prefixes for each of your compilation units, for example by naming your files <code>mylib_misc.ml</code>, with the drawback that you will need to use those long module names inside your library. Another solution is packing your units.</p> <p>A pack is simply a generated module that appends all your compilation units into one. For example, suppose you have two files <code>a.ml</code> and <code>b.ml</code>; you can generate a pack (<em>i.e.</em> a module) <code>mylib.cmx</code> that is equivalent to:</p> <pre><code class="language-Ocaml">module A = struct &lt;content of a.ml&gt; end module B = struct &lt;content of b.ml&gt; end</code></pre> <p>As such, <code>A</code> and <code>B</code> can retain their original module names, and be accessed from the outside as <code>Mylib.A</code> and <code>Mylib.B</code>. It uses the namespacing induced by the module system. A developer can simply generate a pack for their library, assuming the library name will be unique enough to be linked with other modules without the risk of name clashing. However it has one big downside: suppose you use a library with many modules but only need one. Without packs the compiler will only link the necessary compilation units from this library, but since the pack is one big compilation unit this means your program embeds the complete library.</p> <p>This problem is fixed using module aliases and the compiler option <code>-no-alias-deps</code> since OCaml 4.02, and the result for the user is equivalent to a pack, making packs more or less deprecated.</p> <h2>Functorizing packs, or how to parameterize a library</h2> <p>Packs being modules representing libraries, a useful feature would be to be able to produce libraries that take modules as parameters, just like functors. Another usage would be to split a huge functor into multiple files. In other words, we want our pack <code>Mylib</code> to be compiled as:</p> <pre><code class="language-Ocaml">functor (P : sig .. 
end) -&gt; struct module A = struct &lt;content of a.ml&gt; end module B = struct &lt;content of b.ml&gt; end end </code></pre> <p>while <code>A</code> and <code>B</code> would use the parameter <code>P</code> as a module, and <code>Mylib</code> would be instantiated later as</p> <pre><code class="language-Ocaml">module Mylib = Mylib(Some_module_with_sig_compatible_with_P) </code></pre> <p>One can notice that our pack is indeed a functor, and not simply a module that binds a functor. To be able to do that, we also extend classical compilation units to be compiled as functors. Such functors are not expressed in the language; we do not provide a syntax for them, as they are a matter of options at compile time. For example:</p> <pre><code class="language-shell-session">ocamlopt -c -parameter P m.ml </code></pre> <p>will compile <code>m.ml</code> as a functor that has a parameter <code>P</code> whose interface is described in <code>p.cmi</code> in the compilation path. Similarly, our pack <code>Mylib</code> can be produced by the following compilation steps:</p> <pre><code class="language-shell-session">ocamlopt -c -parameter-of Mylib p.mli ocamlopt -c -for-pack &quot;Mylib(P)&quot; a.ml ocamlopt -c -for-pack &quot;Mylib(P)&quot; b.ml ocamlopt -pack -o mylib.cmx -parameter P a.cmx b.cmx </code></pre> <p>In detail:</p> <ul> <li>The parameter is compiled with the flag <code>-parameter-of Mylib</code>, as such it won't be used as the interface of an implementation. </li> <li>The two modules packed are compiled with the flag <code>-for-pack &quot;Mylib(P)&quot;</code>. Expressing the parameter of the pack is mandatory since <code>P</code> must be known as a functor parameter (we will see why in the next section). </li> <li>The pack is compiled with <code>-parameter P</code>, which will indeed produce a functorized compilation unit. </li> </ul> <p>Functors are not limited to a unique parameter; they can be compiled with multiple <code>-parameter</code> options and multiple arguments in <code>-for-pack</code>. This implementation being on the build system side, it does not need to change the syntax of the language. We expect build tools like dune to provide support for this feature, perhaps making it easier to use. Moreover, it makes compilation units on par with modules, which can have a functor type. One downside however is that we cannot express type equalities between two parameters or with the functor body type as we would do with substitutions in module types.</p> <h3>Functor packs under the hood</h3> <p>In terms of implementation, packs should be seen as a concatenation of the compilation units followed by a rebinding of each of them in the newly created one. 
For example, a pack <code>P</code> of two units <code>m.cmx</code> and <code>n.cmx</code> is actually compiled as something like:</p> <pre><code class="language-Ocaml">module P__M = &lt;code of m.cmx&gt; module P__N = &lt;code of n.cmx&gt; module M = P__M module N = P__N </code></pre> <p>According to this representation, if we tried to naively implement our previous functor pack <code>Mylib(P)</code> we would end up with a functor looking like this:</p> <pre><code class="language-Ocaml">module Mylib__A = &lt;code of a.cmx, with references to P&gt; module Mylib__B = &lt;code of b.cmx, with references to P&gt; functor (P : &lt;signature of p.cmi&gt;) -&gt; struct module A = Mylib__A module B = Mylib__B end </code></pre> <p>Unfortunately, this encoding of functor packs is wrong: <code>P</code> is free in <code>a.cmx</code> and <code>b.cmx</code> and its identifier cannot correspond to the one generated for the functor retrospectively. The solution is actually quite simple and relies on a transformation known as <strong><a href="https://en.wikipedia.org/wiki/Lambda_lifting">closure conversion</a></strong>. In other words, we will transform our modules into functors that take as parameters their free variables, which in our case are the parameters of the functor pack and the dependencies from the same pack. Let's do it on a concrete functor equivalent to Mylib:</p> <pre><code class="language-Ocaml">module Mylib' (P : P_SIG) = struct module A = struct .. &lt;references to P&gt; end module B = struct .. &lt;references to P&gt; &lt;references to A&gt; end end </code></pre> <p>Our goal here is to move <code>A</code> and <code>B</code> outside of the functor, and thus out of the scope of <code>P</code>, which is done by transforming those two modules into functors that take a parameter <code>P'</code> with the same signature as <code>P</code>:</p> <pre><code class="language-Ocaml">module A_funct (P' : P_SIG) = struct .. &lt;references to P as P'&gt; end module B_funct (P' : P_SIG) = struct module A' = A_funct(P') .. &lt;references to P as P'&gt; &lt;references to A as A'&gt; end module Mylib' (P : P_SIG) = struct module A = A_funct(P) module B = B_funct(P) end </code></pre> <p>While this code compiles, it is not semantically equivalent. <code>A_funct</code> is instantiated twice, so its side effects are computed twice: the first time when instantiating <code>A</code> in the functor, and the second when instantiating <code>B</code>. The solution is simply to go further with closure conversion and make the result of applying <code>A_funct</code> to <code>P</code> an argument of <code>B_funct</code>.</p> <pre><code class="language-Ocaml">module A_funct (P' : P_SIG) = struct .. &lt;references to P as P'&gt; end module B_funct (P' : P_SIG)(A': module type of A_funct(P'))= struct .. &lt;references to P as P'&gt; &lt;references to A as A'&gt; end module Mylib' (P : P_SIG) = struct module A = A_funct(P) module B = B_funct(P)(A) end </code></pre> <p>This represents exactly how our functor pack <code>Mylib</code> is encoded. Since we need to compile modules in a specific way if they belong to a functor pack, the compiler has to know in the <code>-for-pack</code> argument that the pack is a functor, and what its parameters are.</p> <h3>Functor packs applied to <code>ocamlopt</code></h3> <p>What we described is a functional prototype of functor packs, implemented on OCaml 4.10, as described in <a href="https://github.com/ocaml/RFCs/pull/11">RFC#11</a>. 
<p>In practice, we already have one usage that we could benefit from in the future: cross-compilation of native code. At the moment, the compiler is configured to target the architecture on which it is compiled. The modules for the current architecture are symbolically linked into the backend folder, and the backend is compiled as if it supported only one architecture. One downside of this approach is that changes to the backend interface that require modifications in every architecture are only detected at compile time for the current one: you need to reconfigure the OCaml compiler and rebuild it to check whether another architecture still compiles. One interesting property is that each architecture backend defines the same set of modules with compatible interfaces. In other words, these modules could simply be parameters of a functor that is instantiated for a given architecture.</p> <p>Following this idea, we implemented a prototype of the native compiler whose backend is indeed packed into a functor and instantiated at the initialization of the compiler. With this approach, we can easily switch the targeted architecture, and moreover we can be sure that every architecture compiles, since any change in the backend interface now forces the necessary refactoring in each of them. Implementing such a functor is mainly a matter of adapting the build system to produce a functor pack, writing a few signatures for the functor and its parameters, and instantiating the backend at the right time.</p> <p>This proof of concept shows how functor packs can simplify complicated build systems and allow new workflows.</p> <h2>Making packs useful again</h2> <p>Packs are an old concept that was largely made obsolete by module aliases. They were not practical, as they are monolithic libraries shaped into a single module containing submodules. While they make perfect use of the module system for its namespacing properties, using them forces the compiler to link an entire library even if only one module is actually used. This improvement allows programmers to define big functors, split among multiple files, resulting in what can be seen as a way to implement a form of parameterized library.</p> <p>In the second part, we will cover another aspect of the rehabilitation of packs: using packs to implement mutually recursive compilation units.</p> <h1>Comments</h1> <p>François Bobot (25 September 2020 at 9 h 16 min):</p> <blockquote> <p>I believe there is a typo</p> </blockquote> <pre><code class="language-ocaml">module Mylib’ (P : P_SIG) = struct
  module A = A_funct(P)
  module B = A_funct(P)
end </code></pre> <blockquote> <p>The last must be <code>B_funct(P)</code>, the next example as also the same typo.</p> </blockquote> <p>Pierrick Couderc (25 September 2020 at 10 h 31 min):</p> <blockquote> <p>Indeed, thank you!</p> </blockquote> <p>Cyrus Omar (8 February 2021 at 3 h 49 min):</p> <blockquote> <p>This looks very useful! Any updates on this work? I’d like to be able to use it from dune.</p> </blockquote> A Dune Love story: From Liquidity to Love https://ocamlpro.com/blog/2020_06_09_a_dune_love_story_from_liquidity_to_love 2020-06-09T13:48:57Z 2020-06-09T13:48:57Z Steven De Oliveira By OCamlPro & Origin Labs Writing smart contracts may often be a burdensome task, as you need to learn a new language for each blockchain you target. 
In the Dune Network team, we are willing to provide as many possibilities as possible for developers to thrive in an accessible and secure framework. T... <div align="center"> <a href="/blog/2020_06_09_a_dune_love_story_from_liquidity_to_love"> <img width="900" height="900" alt="Liquidity & Love" title="A Dune Love story: From Liquidity to Love" src="/blog/assets/img/liq-love-1.png"> </a> </div> <p><em>By OCamlPro &amp; Origin Labs</em></p> <p>Writing smart contracts may often be a burdensome task, as you need to learn a new language for each blockchain you target. In the Dune Network team, we want to provide as many possibilities as possible for developers to thrive in an accessible and secure framework.</p> <p>There are two kinds of languages on a blockchain: “native” languages that are directly understood by the blockchain, but with some difficulty by the developers, and “compiled” languages that are more transparent to developers, but need to be translated to a native language to run on the blockchain. For example, Solidity is a developer-friendly language, compiled to the native EVM language on the Ethereum blockchain.</p> <p>Dune Network supports multiple native languages:</p> <ul> <li><a href="https://medium.com/dune-network/love-a-new-smart-contract-language-for-the-dune-network-a217ab2255be"><strong>Love</strong></a>, a type-safe language with an ML syntax, suited for formal verification </li> <li><a href="https://dune.network/docs/dune-node-mainnet/whitedoc/michelson.html"><strong>Michelson</strong></a>, inherited from <a href="https://tezos.com">Tezos</a>, also type-safe but much more difficult to read </li> <li><a href="https://en.wikipedia.org/wiki/Solidity"><strong>Solidity</strong></a>, the Ethereum language, of which we are currently implementing the interpreter after releasing <a href="https://medium.com/dune-network/a-solidity-parser-in-ocaml-with-menhir-e1064f94e76b">its parser in OCaml</a> a few weeks ago </li> </ul> <p>On the side of compiled languages, Dune Network supports:</p> <ul> <li><a href="https://www.liquidity-lang.org/"><strong>Liquidity</strong></a>, a type-safe ML language suited for formal verification, which compiles to Michelson (and allows developers to decompile Michelson for auditing) </li> <li><a href="https://reasonml.github.io/"><strong>ReasonML</strong></a>, a JavaScript-like syntax designed by Facebook, which compiles down to Michelson through Liquidity </li> <li>All other Tezos languages that compile to Michelson (for example <a href="https://ligolang.org/"><strong>Ligo</strong></a>, <a href="https://smartpy.io/"><strong>SmartPy</strong></a>, <a href="https://albert-lang.io/"><strong>Albert</strong></a>...) </li> </ul> <p>Though Liquidity and Love are both part of the ML family, Liquidity is much more developer-friendly: types are inferred, whereas in Love they have to be explicit, and Liquidity supports the ReasonML JavaScript syntax while Love is bound to its ML syntax.</p> <p>For all these reasons, we are pleased to announce a wedding: Liquidity now supports the Love language!</p> <p><img src="/blog/assets/img/liq-love-2.png" alt="Liquidity &amp; Love" /></p> <p><em>Liquidity now supports generating Love smart contracts</em></p> <p>This is great news for Love, as Liquidity is easier to use and comes with an online web editor, <a href="https://www.liquidity-lang.org/edit/">Try-Liquidity</a>. 
Liquidity is also being targeted by the <a href="https://arxiv.org/pdf/1907.10674.pdf">ConCert project</a>, aiming at <strong>verifying smart contracts</strong> with the formal verification framework Coq.</p> <p><img src="/blog/assets/img/dune-compilers.png" alt="Dune Languages" /></p> <p><em>The Smart Contract Framework on the Dune Network</em></p> <p>Compiling Liquidity contracts to Love has several benefits compared to compiling them to Michelson. First, Love contracts are about 60% smaller than Michelson contracts, hence they are <strong>60% cheaper</strong> to deploy. Also, the compiler outputs a Love contract that can be easily read and audited.</p> <p>The Love compiler is part of the <a href="https://github.com/OCamlPro/liquidity">Liquidity project</a>. It works as follows:</p> <ul> <li><strong>The Liquidity contract is type-checked by the Liquidity compiler.</strong> The strong type system of Liquidity enforces structural &amp; semantic properties on the data. </li> <li><strong>The typed Liquidity contract is compiled to a typed Love contract.</strong> During this step, the Liquidity contract is scanned to check whether it complies with the Love requirements (correct use of operators, no reentrancy, etc.). </li> <li><strong>The Love contract is type-checked.</strong> Once this step is completed, the contract is ready to be deployed on the chain! </li> </ul> <p>Want to try it out? Check the <a href="https://www.liquidity-lang.org/edit/">Try-Liquidity</a> website: you can now compile and deploy your Liquidity contracts in Love from the online editor directly to the Mainnet and Testnet using <a href="https://metal.dune.network">Dune Metal</a>!</p> <hr /> <p>These are some of the resources you might find interesting when building your own smart contracts:</p> <ul> <li><strong>The Love Language Documentation</strong>: https://dune.network/docs/dune-dev-docs/love-doc/introduction.html </li> <li><strong>Try-Liquidity:</strong> https://www.liquidity-lang.org/edit/ </li> <li><strong>The Liquidity Website:</strong> https://www.liquidity-lang.org/ </li> <li><strong>The Dune Network Website:</strong> https://dune.network </li> </ul> <h2>About Origin Labs</h2> <p>Origin Labs is a company founded in 2019 by the former blockchain team at OCamlPro. At Origin Labs, they have been developing Dune Network, a fork of the Tezos blockchain, its ecosystem, and applications over the Dune Network platform. At OCamlPro, they developed TzScan, the most popular block explorer at the time, and Liquidity, a smart contract language, and were involved in the development of the core protocol and node. Feel free to reach out by email: contact@origin-labs.com.</p> [Interview] Sylvain Conchon joins OCamlPro https://ocamlpro.com/blog/2020_06_06_interview_sylvain_conchon_joins_ocamlpro 2020-06-06T13:48:57Z 2020-06-06T13:48:57Z Aurore Dombry In April 2020, Sylvain Conchon joined the OCamlPro team as our Chief Scientific Officer on Formal Methods. Sylvain is a professor at University Paris-Saclay, and he has also been teaching OCaml in universities for about 20 years. He is the co-author of Apprendre à programmer avec OCaml with Jean-Christ... <p><img src="/blog/assets/img/picture_sylvainconchon.jpg" alt="" /></p> <p><strong>In April 2020, <a href="https://www.lri.fr/~conchon/">Sylvain Conchon</a> joined the OCamlPro team as our Chief Scientific Officer on Formal Methods</strong>. Sylvain is a professor at University Paris-Saclay, and he has also been teaching OCaml in universities for about 20 years. 
He is the co-author of <em><a href="https://www.eyrolles.com/Informatique/Livre/apprendre-a-programmer-avec-ocaml-9782212136784/">Apprendre à programmer avec OCaml</a></em> with Jean-Christophe Filliâtre, a book for students in the French preparatory classes (classes préparatoires). His field of expertise is automated deduction for program verification and model checking of parameterized systems. He is also the co-creator of <a href="https://alt-ergo.ocamlpro.com">Alt-Ergo</a>, our <a href="https://en.wikipedia.org/wiki/Satisfiability_modulo_theories">SMT solver</a> dedicated to program verification, used by Airbus and qualified for the <a href="https://en.wikipedia.org/wiki/DO-178C">DO-178C</a> avionics standard, as well as of <a href="http://cubicle.lri.fr/">Cubicle</a> and the very useful <a href="https://opam.ocaml.org/packages/ocamlgraph/">OCamlgraph</a> library.</p> <h4>Research and Industry</h4> <h4>Sylvain, you’ve been involved in the industrial world for a long time, what do you think about the interactions between industry and research labs?</h4> <p>I’ve always found interactions with industry professionals to be very rewarding. During my studies, I worked for several years in an IT services company (SSII), and as a university professor, I have supervised students during their internships or apprenticeships in tech companies or at large industrial companies every year. I also take part in research projects that involve industrial partners, and I spent some time at Intel in Portland, which allowed me to discover the computer hardware industry from the inside.</p> <h4>How do you establish a fruitful collaboration between academia and industry?</h4> <p>It’s primarily a question of mutual understanding. You can see it clearly during collaborative research projects that involve both academics and industrial partners. Tools resulting from research, no matter what they are, have to be relevant to real industrial problems. Once that’s taken care of, the software also needs to be usable by industry professionals without them needing to understand its inner workings (for instance, they shouldn’t have to specify the 50 options necessary for its use, or interpret its results, or its absence of results!).</p> <p>This requires a significant engineering effort geared towards the end user, and this task is not part of the usual research activity. So, we first need to really understand the problems and needs of the industrial partner, and then determine whether our technologies and tools can be adapted or used to prototype a relevant solution.</p> <h4>You’ve just joined OCamlPro, what are your first thoughts?</h4> <p>I am very happy to be joining such a dynamic company full of talented, motivated, friendly people, where they do both high-level engineering and top-quality research! Several of my former PhD students are also working at OCamlPro, such as Albin Coquereau, David Declerck and Mattias Roux. 
With Mohamed Iguernlala and Alain Mebsout at our partner Origin Labs, and with the other OCP team members, it makes our team rock-solid in formal methods tooling development.</p> <blockquote> <p><em>“Tools resulting from research, no matter what they are, have to satisfy real industry needs.”</em></p> </blockquote> <h4>OCaml, a Cutting-Edge Language</h4> <h4>You are well known in the OCaml community, and some of your students became fans of OCaml (and of your teaching)… What do you say to your students who are just discovering OCaml?</h4> <p>I tend to summarize it with one phrase: “With OCaml, you’re not learning the computer programming of the last 10 years, you’re learning the programming of <em>the 10 coming years</em>”. This has proven true numerous times, because a good number of OCaml’s features were to be found in mainstream languages years later. That being said, all my years of teaching this language have led me to think that some modifications to its syntax would make the language easier to tackle for some beginners.</p> <h4>How did you personally discover OCaml?</h4> <p>During my master’s thesis <em>(maîtrise)</em> at university: one of my teachers pointed this language to me; they believed it would help me write a compiler for another programming language. So, I discovered OCaml by myself, by reading the manual and going through examples. It wasn’t until my MASt <em>(DEA)</em> that I discovered the theoretical foundations of this fantastic language (semantics, typing, compilation).</p> <h4>Would you say OCaml is an industrial programming language?</h4> <p>The question needs to be clarified: what <em>is</em> an industrial programming language? If by industrial language you mean one that is used by industry professionals, then I’d say that OCaml needs to be used more widely to be classified as such. If the question is whether OCaml is at the same level as languages used in industry, then it <em>absolutely</em> is. But maybe the question is more about the OCaml ecosystem and how developed the available tooling is: certain improvements undoubtedly need to be made in order to reach the level of a widespread industrial programming language. But we’re on the right track, especially thanks to companies like OCamlPro and its projects like <a href="https://opam.org">Opam</a> and <a href="https://try.ocamlpro.com">Try-OCaml</a> for example.</p> <h4>Formal Methods as an Industrial Technique, and the Example of the Alt-Ergo Solver</h4> <h4>Formal methods being one of OCamlPro’s areas of expertise, in what way do you think OCaml is suited for the SMT domain?</h4> <p>Tools like SMT solvers are mainly symbolic data manipulation software that allow you to analyze, transform, and reason about logical formulas. OCaml is made for that. There is also a more “computational” side to these tools, which requires precise programming of data structures as well as efficient memory management. OCaml, with its extremely <a href="/blog/2020_03_23_in_depth_look_at_best_fit_gc">efficient garbage collector</a> (GC), is particularly suited for this kind of development. SMT solvers are tools that also need to be very reliable because errors are difficult to find and are potentially very harmful. 
OCaml’s type system contributes to the reliability of these tools.</p> <blockquote> <p>“<em>SMT solvers are nowadays essential in software engineering</em>”</p> </blockquote> <h4>Can you describe Alt-Ergo in a few words?</h4> <p>Alt-Ergo is a tool for proving logical formulas automatically (without human intervention), that is, for determining whether a formula is true or false. Alt-Ergo belongs to a family of automated provers called SMT (Satisfiability Modulo Theories). It was designed to be integrated into program verification platforms. These platforms (like <a href="https://why3.lri.fr/">Why3</a>, <a href="https://frama-c.com/">Frama-C</a>, <a href="https://www.adacore.com/about-spark">Spark</a>…) generate logical formulas that need to be proven in order to guarantee that a program is safe. Proving these formulas by hand would be very tedious (there are sometimes tens of thousands of formulas to prove). An SMT solver such as Alt-Ergo is there to do that job in a completely automated way. It is what allows these verification platforms to be used at an industrial level.</p> <h4>In what way does developing this software in OCaml benefit Alt-Ergo over its competitors?</h4> <p>It makes it more reliable, since an SMT solver, like any program, can have bugs. Most of Alt-Ergo is written in a purely functional programming style, i.e. only using immutable data structures. One of the advantages of this programming style is that it allowed us to formally prove the main components of Alt-Ergo (for example, its kernel was formalized using the Coq proof assistant, which would have been impossible with a language like C++) without sacrificing efficiency, thanks to a very good garbage collector and OCaml’s very powerful persistent data structure library. We made use of OCaml’s module system, particularly functors and recursive modules, to design very modular code, making it maintainable and easily extensible. OCaml allowed us to create <a href="/blog/2019_07_09_alt_ergo_participation_to_the_smt_comp_2019">an SMT solver just as efficient as CVC4 or Z3 for program verification</a>, but with a total number of lines of code divided by three or four. This obviously does not guarantee that Alt-Ergo has zero bugs, but it really helps us fix them when they are found.</p> <h4>What is your opinion on SMT solvers and the current state of the art of SMT?</h4> <p>Today, SMT solvers are essential in software engineering. They can be found in various tools for proving, testing, model checking, abstract interpretation, and typing. The main reason for this success is that they are becoming increasingly efficient and the underlying theories are becoming more and more expressive. It is a very competitive area of research among the world’s best universities and research labs, as well as large IT companies. But there is still a lot of room for improvement, particularly in the nonlinear arithmetic domain, where user demand is growing. For now, one of my research objectives is to combine Model Checking tools with program verification tools. These two types of tools are based on SMT and should complement each other to offer even more automation to verification tools.</p> <h4>What applications can SMT techniques and Alt-Ergo have in industry?</h4> <p>SMT techniques can be used wherever formal methods are useful, including, but not limited to, verifying the safety of critical software in embedded systems, finding security vulnerabilities in computer systems, or solving planning problems. 
They can also be found in domains of artificial intelligence, where it is crucial to guarantee neural network stability and produce formal explanations of their results.</p> <h4>You ended up working on Model Checking, can you tell us about how Model Checking is connected to SMT and how it is currently used?</h4> <p>Model Checking consists of verifying that all possible states of a system respect certain properties, regardless of the input data. This is a difficult problem because some systems (like microprocessors for example) can have hundreds of millions of states. To reach that scale, model checkers implement extremely sophisticated algorithms to visit these states quickly by storing them in a compact manner. That said, this technique reaches its limits when the input values are unbounded or when the number of system components is unknown. Imagine Internet routing algorithms where you don’t know how many machines are connected. These algorithms must be correct no matter the number of machines. This is where SMT solvers come into play. By using logical formulas, we’re able to represent sets of states of arbitrary sizes. Visiting system states becomes calculating the formulas that represent the states satisfying the desired properties, etc. Therefore, everything in Model Checking is based on logical formulas, and SMT solvers are of course there to reason about these formulas.</p> [Interview] Sylvain Conchon rejoint OCamlPro https://ocamlpro.com/blog/2020_06_05_fr_interview_sylvain_conchon_rejoint_ocamlpro 2020-06-05T13:48:57Z 2020-06-05T13:48:57Z Aurore Dombry Sylvain Conchon vient de rejoindre OCamlPro en tant que Chief Scientific Officer Méthodes Formelles. Professeur à l’Université Paris-Saclay, il travaille dans le domaine de la démonstration automatique pour la preuve de programmes et le model checking pour systèmes paramétrés. Il est aussi ... <p><img src="/blog/assets/img/picture_sylvainconchon.jpg" alt="" /></p> <blockquote> <p>Sylvain Conchon vient de rejoindre OCamlPro en tant que Chief Scientific Officer Méthodes Formelles. Professeur à l’Université Paris-Saclay, il travaille dans le domaine de la démonstration automatique pour la preuve de programmes et le model checking pour systèmes paramétrés. Il est aussi le co-créateur d’Alt-Ergo.</p> </blockquote> <h3>Recherche et industrie</h3> <p><strong>Sylvain, tu fréquentes de longue date le monde industriel, que penses-tu des interactions entre les industriels et les laboratoires de recherche ?</strong></p> <p>J’ai toujours trouvé très enrichissantes les interactions avec les industriels. Pendant mes études, j’ai travaillé plusieurs années en SSII, et je suis mes étudiants en stage ou en apprentissage dans des sociétés technologiques ou chez de grands industriels. Je participe également à des projets de recherche qui impliquent des industriels,et j’ai passé quelques temps chez Intel à Portland, ce qui m’a permis de découvrir l’industrie du hardware.</p> <p><strong>Comment parvenir à établir des relations fructueuses entre le monde académique et les industriels ?</strong></p> <p>C’est beaucoup une histoire de rencontre. On le voit lors des montages de projets de recherche collaboratifs qui réunissent académiques et industriels. Les outils issus de la recherche, quels qu’ils soient, doivent avant tout répondre à un besoin réel des industriels. 
Si c’est le cas, il faut aussi que le logiciel soit utilisable par des ingénieurs du métier sans qu’il leur soit nécessaire de comprendre son fonctionnement interne (par exemple, pour positionner les 50 options nécessaires à son utilisation, interpréter ses résultats ou ses absences de résultats!). Cela nécessite à l’évidence un travail d’ingénierie important, tourné vers l’utilisateur final et souvent éloigné des activités des chercheurs. Il faut donc comprendre les problèmes et les besoins des industriels, et ensuite déterminer si les technologies et les outils que l’on maîtrise peuvent être adaptés ou utilisés pour réaliser un prototype qui réponde à certains de ces besoins.</p> <p><strong>Tu viens de rejoindre OCamlPro, quelles sont tes premières impressions ?</strong></p> <p>Je suis heureux d’avoir rejoint une entreprise très dynamique, pleine de gens talentueux, motivés et sympathiques, où l’on fait à la fois de l’ingénierie de haut niveau et de la recherche de qualité !</p> <blockquote> <p><em>“ Les outils issus de la recherche, quels qu’ils soient, doivent avant tout répondre à un besoin réel des industriels.”</em></p> </blockquote> <h3>OCaml, un langage de pointe</h3> <p><strong>Tu es connu dans la communauté OCaml, et certains de tes étudiants sont devenus des fans d’OCaml (et de ton enseignement)… que dis-tu à tes étudiants qui découvrent OCaml ?</strong></p> <p>J’ai tendance à résumer en disant ceci : <em>« avec OCaml, vous n’apprenez pas la programmation des 10 dernières années, mais celle des 10 prochaines années »</em>. Cette affirmation s’est toujours vérifiée car bon nombre de traits du langage OCaml se sont retrouvés dans les langages <em>mainstream</em>, avec plusieurs années de décalage. Cela dit, mes années d’expérience dans l’enseignement de ce langage me laissent penser que quelques modifications dans sa syntaxe permettraient une approche plus aisée pour certains débutants.</p> <p><strong>Et toi, comment as-tu découvert OCaml ?</strong></p> <p>Pendant mes études à l’Université lors de mon projet de fin de maîtrise : un de mes enseignants m’avait orienté vers ce langage pour m’aider à réaliser un compilateur pour un langage de programmation concurrente. J’ai donc découvert ce langage par moi-même, en lisant le manuel et les exemples. Ce n’est que pendant mon DEA que j’ai découvert les fondements théoriques de ce beau langage (sémantique, typage, compilation).</p> <p><strong>OCaml, un langage industriel ou pas encore ?</strong></p> <p>Il convient de préciser la question : qu’est-ce qu’un langage industriel ? Si c’est un langage utilisé par les industriels, alors OCaml n’est hélas pas encore suffisamment utilisé dans l’industrie pour être qualifié ainsi. Si la question est de savoir s’il a le niveau des langages utilisés dans l’industrie, alors la réponse est oui, sans hésiter. 
Mais peut-être la question porte-t-elle davantage sur l’écosystème OCaml et la maturité de l’outillage: il y a sûrement des progrès à faire pour atteindre le niveau d’un langage très répandu dans l’industrie, mais c’est en bonne voie, en particulier grâce à des entreprises telles qu’OCamlPro.</p> <h3>Les méthodes formelles comme technique industrielle, et l’exemple du solveur Alt-Ergo</h3> <p><strong>Les méthodes formelles sont l’un des domaines d’expertise d’OCamlPro, en quoi penses-tu qu’OCaml est adapté au domaine des SMT ?</strong></p> <p>Les outils comme les solveurs SMT sont principalement des logiciels de manipulation symbolique des données qui permettent d’analyser, de transformer et de raisonner sur des formules logiques. OCaml est fait pour ce genre de traitements. Il y a aussi une partie plus « calculatoire » dans ces outils qui nécessite une programmation fine des structures de données ainsi qu’une gestion efficace de la mémoire. OCaml est particulièrement adapté pour ce genre de développements, surtout avec son ramasse-miettes (GC) extrêmement performant. Enfin, les solveurs SMT sont des outils qui doivent avoir un grand niveau de fiabilité car les erreurs dans ces logiciels sont difficiles à trouver et leur présence peut être très préjudiciable. Le système de types d’OCaml contribue à la fiabilité de ces outils.</p> <blockquote> <p><em>“Les solveurs SMT sont aujourd’hui incontournables dans le domaine de l’ingénierie du logiciel.”</em></p> </blockquote> <p><strong>Peux-tu nous parler d’Alt-Ergo en quelques mots ?</strong></p> <p>C’est un logiciel utilisé pour prouver automatiquement (sans intervention humaine) des formules logiques, c’est-à-dire savoir si ces formules sont vraies ou fausses. Alt-Ergo appartient à une famille de démonstrateurs automatiques appelée SMT (pour Satisfiabilité Modulo Théories). Il a été conçu pour être intégré dans des plate-formes de vérification de programmes. Ces outils (comme Why3, Frama-C, Spark,…) génèrent des formules logiques qu’il est nécessaire de prouver afin de garantir qu’un programme est sûr. Faire la preuve de ces formules à la main serait très fastidieux (il y a parfois plusieurs dizaines de milliers de formules à prouver). Un solveur SMT comme Alt-Ergo est là pour faire ce travail, de manière complètement automatique. C’est ce qui permet à ces plateformes de vérification d’être utilisables au niveau industriel.</p> <p><strong>En quoi le développement d’Alt-Ergo en OCaml peut-il être un avantage par rapport aux concurrents ?</strong></p> <p>Cela lui confère une plus grande sûreté, car un solveur SMT, comme n’importe quel programme peut aussi avoir des bugs. La plus grande partie d’Alt-Ergo est programmée dans un style purement fonctionnel, c’est-à-dire uniquement avec l’utilisation de structures de données immuables. L’un des avantages de ce style de programmation est qu’il nous a permis de prouver formellement ses principaux composants (par exemple, son noyau a été formalisé à l’aide de l’assistant à la preuve Coq, ce qui serait impossible à faire dans un langage comme C++), sans sacrifier son efficacité grâce au très bon ramasse-miettes et à la bibliothèque de structures de données persistantes très performantes d’OCaml. Enfin, nous avons largement bénéficié du système de modules d’OCaml, en particulier les foncteurs et les modules récursifs, pour concevoir un code très modulaire, maintenable et facilement extensible. 
Au final, OCaml nous a permis de concevoir un solveur SMT aussi performant que CVC4 ou Z3 pour la preuve de programmes, mais avec un nombre de lignes de code divisé par trois ou quatre. Bien sûr, cela ne garantit pas que Alt-Ergo ait zéro bugs, mais cela nous aide beaucoup à mettre le doigt dessus quand quelqu’un en trouve.</p> <p><em>“OCaml nous a permis de concevoir un solveur SMT aussi performant que CVC4 ou Z3 pour la preuve de programmes, mais avec un nombre de lignes de code divisé par trois ou quatre.“</em></p> <p><strong>Quel est ton avis sur les solveurs SMT et l’état de l’art SMT actuel ?</strong></p> <p>Les solveurs SMT sont aujourd’hui incontournables dans le domaine de l’ingénierie du logiciel. On les trouve aussi bien dans des outils de preuve, de test, de model checking, d’interprétation abstraite ou encore de typage. La principale raison de ce succès est qu’ils sont de plus en plus efficaces et les théories sous-jacentes sont très expressives. C’est un domaine de recherche très concurrentiel entre les meilleures universités ou laboratoires du monde et de grandes entreprises en informatique. Mais la marge de progression de ces outils est encore très grande, en particulier dans le domaine de l’arithmétique non linéaire où la demande des utilisateurs est de plus en plus forte. Pour le moment, un de mes objectifs en recherche est de combiner les outils de Model Checking avec ceux de preuve de programmes. Ces deux familles d’outils reposent sur les SMT et elles devraient se compléter pour offrir des outils de vérification encore plus automatiques.</p> <p><strong>Quelles applications les techniques SMT et Alt-Ergo peuvent-elles avoir dans l’industrie ?</strong></p> <p>Les techniques SMT peuvent être utilisées partout où les méthodes formelles peuvent être utiles. Par exemple (mais cette liste est loin d’être exhaustive), pour vérifier la sûreté de logiciels critiques dans le domaine de l’embarqué, pour trouver des failles de sécurité dans les systèmes informatiques ou pour résoudre des problèmes de planification. On les trouve également dans le domaine de l’intelligence artificielle où il est crucial de garantir la stabilité des réseaux de neurones mais aussi de produire des explications formelles sur leurs résultats.</p> <p><strong>Tu as été amené à travailler sur le Model Checking, peux-tu nous parler des liens entre Model Checking et SMT et de son utilisation actuelle ?</strong></p> <p>Le Model Checking consiste à vérifier que tous les états possibles d’un système respectent bien certaines propriétés, et ce quelles que soient les données en entrée. C’est un problème difficile car certains systèmes (microprocesseurs par ex.) peuvent avoir des centaines de millions d’états. Pour passer à l’échelle, les model checkers implémentent des algorithmes très perfectionnés pour visiter ces états rapidement, en les stockant d’une manière très compacte. Cependant, cette technique atteint ses limites quand les valeurs prises en entrée sont non bornées ou quand le nombre de composants du système n’est pas connu. Pensez aux algorithmes de routage d’Internet où on ne connaît pas le nombre de machines sur le réseau, ces algorithmes doivent être corrects, quel que soit ce nombre de machines. C’est là que les solveurs SMT entrent en jeu. En utilisant des formules logiques, on peut représenter des ensembles d’états de taille arbitraire. Visiter les états d’un système consiste alors à calculer les formules qui représentent ces états. 
Vérifier que les états respectent une propriété revient à prouver que les formules qui représentent des états impliquent la propriété voulue, etc. Tout dans le Model Checking repose donc sur des formules logiques et les solveurs SMT sont évidemment là pour raisonner sur ces formules.</p> Tutoriel Format https://ocamlpro.com/blog/2020_06_01_fr_tutoriel_format 2020-06-01T13:48:57Z 2020-06-01T13:48:57Z OCamlPro Article écrit par Mattias. Le module Format d’OCaml est un module extrêmement puissant mais malheureusement très mal utilisé. Il combine notamment deux éléments distincts : les boîtes d’impression élégante les tags sémantiques Le présent article vise à démystifier une grande partie ... <p><em>Article écrit par Mattias.</em></p> <p>Le module <a href="http://caml.inria.fr/pub/docs/manual-ocaml/libref/Format.html">Format</a> d’OCaml est un module extrêmement puissant mais malheureusement très mal utilisé. Il combine notamment deux éléments distincts :</p> <ul> <li>les boîtes d’impression élégante </li> <li>les tags sémantiques </li> </ul> <p>Le présent article vise à démystifier une grande partie de ce module afin de découvrir l’ensemble des choses qu’il est possible de faire avec.</p> <p>Si tout va bien vous devriez passer de</p> <p><img src="/blog/assets/img/error1-output.png" alt="sortie triviale" /></p> <p>à</p> <p><img src="/blog/assets/img/ocaml-output.png" alt="sortie OCaml" /></p> <p>(En réalité nous arriverons à un résultat légèrement différent car l’auteur de ce tutoriel n’aime pas tous les choix faits pour afficher les messages d’erreur en OCaml mais les différences n’auront pas de grande importance)</p> <h2>I. Introduction générale : <code>fprintf fmt &quot;%a&quot; pp_error e</code></h2> <p>Si vous ne comprenez pas ce que le code dans le titre doit faire, je vous invite à lire attentivement ce qui va suivre. Sinon vous pouvez directement sauter à la deuxième partie.</p> <h3>I.1. Rappels sur <code>printf</code></h3> <p>Pour rappel, la fonction <code>printf</code> est une fonction variadique (c’est-à-dire qu’elle peut prendre un nombre variable de paramètres).</p> <ul> <li> <p>Le premier paramètre est une chaîne de formattage composée de caractères et de spécificateurs de format.</p> <ul> <li>Les <strong>caractères</strong> sont affichés tels quels. <code>printf &quot;abc&quot;</code> affichera <code>abc</code>. </li> <li>Les <strong>spécificateurs de caractère</strong> sont des caractères précédés du caractère <code>% </code>(syntaxe héritée du C). Ils sont remplacés à l’exécution par un des paramètres fournis après la chaîne de formattage à la fonction et servent à indiquer de quel type doit être la valeur qui sera affichée (ainsi que d’autres informations dont les détails peuvent être trouvés dans la documentation du module <a href="https://caml.inria.fr/pub/docs/manual-ocaml/libref/Printf.html">Printf</a>. <code>printf &quot;Test: %d&quot;</code> attend un entier signé et affichera <code>Test: &lt;d&gt;</code> avec <code>&lt;d&gt;</code> remplacé par l’entier fourni. </li> </ul> </li> <li> <p>Les paramètres suivants sont les valeurs fournies à <code>printf</code> pour remplacer les spécificateurs de format</p> <ul> <li><code>printf &quot;%d %s %c&quot; 3 s 'a'</code> affichera l’entier signé 3, une espace insécable, le contenu de la variable <code>s</code> qui doit être une chaîne de caractères, une autre espace insécable et finalement le caractère ‘a’. 
</li> <li>On remarque aussi qu’ici le nombre de paramètres supplémentaires fournis en plus de la chaîne de formattage correspond au nombre de spécificateurs et que ceux-ci ne peuvent être intervertis. <code>printf &quot;%d %c&quot; 'a' 3</code> ne pourra pas être compilé/exécuté car <code>%d</code> attend un entier signé et le premier paramètre est un caractère. Les spécificateurs qui n’attendent qu’un argument sont des spécificateurs que j’appelle <strong>unaires</strong> et sont extrêmement faciles à utiliser, il faut seulement savoir quel caractère correspond à quel type et les donner dans le bon ordre comme illustré dans la figure ci-dessous (le chevron représentant la sortie standard) </li> </ul> </li> </ul> <p><img src="/blog/assets/img/printf-base-out-dark.png" alt="Fonctionnement basique de printf" /></p> <h3>I.2. Afficher un type défini par l’utilisateur</h3> <p>Arrive alors ce moment où vous commencez à définir vos propres structures de données et, malheureusement, il n’y a aucun moyen d’afficher votre expression avec les spécificateurs par défaut (ce qui semble normal). Définissons donc notre propre type et affichons-le avec les techniques déjà vues :</p> <pre><code class="language-OCaml">type error = | Type_Error of string * string | Apply_Non_Function of string let pp_error = function | Type_Error (s1, s2) -&gt; printf &quot;Type is %s instead of %s&quot; s1 s2 | Apply_Non_Function s -&gt; printf &quot;Type is %s, this is not a function&quot; s </code></pre> <p>Supposons maintenant que nous ayons une liste d’erreurs et que nous souhaitions les afficher en les séparant par une ligne horizontale. Une première solution serait la suivante :</p> <pre><code class="language-OCaml">let pp_list l = List.iter (fun e -&gt; pp_error e; printf &quot;\n&quot; ) l </code></pre> <p>Cette façon de faire a plusieurs inconvénients (qui vont être magiquement réglés par la fonction du titre).</p> <h3>I.3. Afficher sur un <code>formatter</code> abstrait</h3> <p>Le premier inconvénient est que <code>printf</code> envoie son résultat vers la sortie standard alors qu’on peut vouloir l’envoyer vers un fichier ou vers la sortie d’erreur, par exemple.</p> <p>La solution est <code>fprintf</code> (il serait de bon ton de feindre la surprise ici).</p> <p><code>fprintf</code> prend un paramètre supplémentaire avant la chaîne de formattage appelé <strong><code>formatter</code> abstrait</strong>. Ce paramètre est du type <code>formatter</code> et représente un imprimeur élégant (ou <em>pretty-printer</em>)</p> <p>c’est-à-dire l’objet vers lequel le résultat devra être envoyé. L’énorme avantage qui en découle est qu’on peut transformer beaucoup de choses en <code>formatter</code>. Un fichier, un buffer, la sortie standard etc. 
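</p> <p>Par exemple, voici une esquisse minimale (noms de fichier et de variables hypothétiques) qui fabrique des <code>formatter</code> à partir d’un buffer, d’un fichier et de la sortie d’erreur, à l’aide des fonctions <code>Format.formatter_of_buffer</code> et <code>Format.formatter_of_out_channel</code> :</p> <pre><code class="language-OCaml">(* Esquisse : un message est envoyé vers trois formatters différents. *)
let () =
  (* vers un buffer en mémoire *)
  let buf = Buffer.create 64 in
  let fmt_buf = Format.formatter_of_buffer buf in
  (* vers un fichier *)
  let oc = open_out &quot;erreurs.log&quot; in
  let fmt_fichier = Format.formatter_of_out_channel oc in
  (* vers la sortie d'erreur, via le formatter prédéfini err_formatter *)
  Format.fprintf Format.err_formatter &quot;Erreur %d&quot; 1;
  Format.fprintf fmt_buf &quot;Erreur %d&quot; 2;
  Format.fprintf fmt_fichier &quot;Erreur %d&quot; 3;
  (* vider les formatters avant de lire le buffer ou de fermer le fichier *)
  Format.pp_print_flush fmt_buf ();
  Format.pp_print_flush fmt_fichier ();
  close_out oc;
  print_endline (Buffer.contents buf) </code></pre> <p>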
À vrai dire, <code>printf</code> est implémenté comme <code>let printf = fprintf std_formatter</code></p> <p>Pour l’utiliser on va donc modifier <code>pp_error</code> et lui donner un paramètre supplémentaire :</p> <pre><code class="language-OCaml">let pp_error fmt = function | Type_Error (s1, s2) -&gt; fprintf fmt &quot;Type is %s instead of %s&quot; s1 s2 | Apply_Non_Function s -&gt; fprintf fmt &quot;Type is %s, this is not a function&quot; s </code></pre> <p>Puis on réécrit <code>pp_list</code> pour prendre cela en compte :</p> <pre><code class="language-OCaml">let pp_list fmt l = List.iter (fun e -&gt; pp_error fmt e; fprintf fmt &quot;\n&quot; ) l </code></pre> <p>Comme on peut le voir dans la figure ci-dessous, <code>fprintf</code> imprime dans le <code>formatter</code> qui lui est fourni en paramètre et non plus sur la sortie standard.</p> <p><img src="/blog/assets/img/fprintf-base-out-dark.png" alt="Fontionnement basique de fprintf" /></p> <p>Si on veut maintenant afficher le résultat sur la sortie standard il suffira simplement de donner <code>pp_list std_formatter</code> comme <code>formatter</code> à <code>fprintf</code>. Cette façon de faire n’a, en réalité, que des avantages, puisqu’elle permet d’être beaucoup plus fexible quant au <code>formatter</code> qui sera utilisé à l’exécution du programme.</p> <h3>I.4. Afficher des types complexes avec <code>%a</code></h3> <p>Le deuxième problème arrivera bien assez vite si nous continuons avec cette méthode. Pour bien le comprendre, reprenons <code>pp_error</code>. Dans le cas de <code>Type_error of string * string</code> on veut écrire <code>Type is s1 instead of s2</code> et on fournit donc à <code>fprintf</code> la chaîne de formattage <code>&quot;Type is %s instead of %s&quot;</code> avec <code>s1</code> et <code>s2</code> en paramètres supplémentaires. Comment devrions-nous faire si <code>s1</code> et <code>s2</code> étaient des types définis par l’utilisateur avec chacun leur fonction d’affichage <code>pp_s1`` : formatter -&gt; s1 -&gt; unit</code> et <code>pp_s2 : formatter -&gt; s2 -&gt; unit</code> ? En suivant la logique de notre solution jusqu’ici, nous écririons le code suivant :</p> <pre><code class="language-OCaml">let pp_error fmt = function | Type_Error (s1, s2) -&gt; fprintf fmt &quot;Type is &quot;; pp_s1 fmt s1; fprintf fmt &quot;instead of &quot;; pp_s2 fmt s2 | Apply_non_function s -&gt; fprintf fmt &quot;Type is %s, this is not a function&quot; s </code></pre> <p>Il est assez facile de se rendre compte rapidement que plus nous devrons manipuler des types complexes, plus cette syntaxe s’alourdira. Tout cela parce que les spécificateurs de caractère unaires ne permettent de manipuler que les types de base d’OCaml.</p> <p>C’est là qu’entre en jeu <code>%a</code>. Ce spécificateur de caractère est, lui, binaire (ternaire en réalité mais un de ses paramètres est déjà fourni). Ses paramètres sont :</p> <ul> <li>Une fonction d’affichage de type <code>formatter -&gt; 'a -&gt; unit</code> (premier paramètre devant être fourni) </li> <li>Le <code>formatter</code> dans lequel il doit afficher son résultat (qui ne doit pas être fourni en plus) </li> <li>La valeur qu’on souhaite afficher </li> </ul> <p>Il appliquera ensuite le <code>formatter</code> et la valeur à la fonction fournie comme premier argument et lui donner la main pour qu’elle affiche ce qu’elle doit dans le <code>formatter</code> qui lui a été fourni en paramètre. Lorsqu’elle aura terminé, l’impression continuera. 
L’exemple suivant montre le fonctionnement (avec une impression sur la sortie standard, <code>fmt</code> ayant été remplacé par <code>std-formatter</code></p> <p><img src="/blog/assets/img/fprintfpa-base-out-dark.png" alt="" /></p> <p>Dans notre cas nous avions déjà transformé nos fonctions d’affichage pour qu’elles prennent un <code>formatter</code> abstrait et nous n’avons donc presque rien à modifier :</p> <pre><code class="language-OCaml">let pp_error fmt = function | Type_Error (s1, s2) -&gt; fprintf fmt &quot;Type is %s instead of %s&quot; s1 s2 | Apply_Non_Function s -&gt; fprintf fmt &quot;Type is %s, this is not a function&quot; s let pp_list fmt l = List.iter (fun e -&gt; fprintf fmt &quot;%a\n&quot; pp_error e; ) l </code></pre> <p>Et, bien sûr, si <code>s1</code> et <code>s2</code> avaient eu leurs propres fonctions d’affichage :</p> <pre><code class="language-ocaml">let pp_error fmt = function | Type_Error (s1, s2) -&gt; fprintf fmt &quot;Type is %a instead of %a&quot; pp_s1 s1 pp_s2 s2 | Apply_Non_Function s -&gt; fprintf fmt &quot;Type is %s, this is not a function&quot; s </code></pre> <p>Arrivé-e-s ici vous devriez être à l’aise avec les notions de <code>formatter</code> abstrait et de spécificateur de caractère binaire et vous devriez donc pouvoir afficher n’importe quelle structure de donnée, même récursive, sans aucun soucis. Je recommande vivement cette façon de faire afin que tout changement qui devrait succéder ne nécessite pas de changer l’intégralité du code.</p> <h2>II. Les boîtes d’impression élégante</h2> <p>Et pour justement avoir des changements qui ne nécessitent pas de tout modifier, il va falloir s’intéresser un minimum aux boîtes d’impression élégante.</p> <p>Aussi appelées <em>pretty-print boxes</em>, je les appellerai “boîtes” dorénavant, un <a href="https://ocaml.org/learn/tutorials/format.fr.html">tutoriel</a> existe déjà, fait par l’équipe de la bibliothèque standard.</p> <p>L'idée derrière les boîtes est tout simple :</p> <blockquote> <p>À mon niveau je m’occupe correctement de comment afficher mes éléments et je n’impose rien au-dessus.</p> </blockquote> <p>Reprenons, par exemple, la fonction permettant d’afficher les <code>error</code>:</p> <pre><code class="language-ocaml">let pp_error fmt = function | Type_Error (s1, s2) -&gt; fprintf fmt &quot;Type is %s instead of %s&quot; s1 s2 | Apply_Non_Function s -&gt; fprintf fmt &quot;Type is %s, this is not a function&quot; s </code></pre> <p>Si on ajoutait un retour à la ligne on imposerait à toute fonction nous appelant ce saut de ligne or ce n’est pas à nous d’en décider. Cette fonction, en l’état, fait parfaitement ce qu’elle doit faire.</p> <p>Regardons, par contre, la fonction affichant une liste d’erreur :</p> <pre><code class="language-ocaml">let pp_list fmt l = List.iter (fun e -&gt; fprintf fmt &quot;%a\n&quot; pp_error e; ) l </code></pre> <p>A l’issue de celle-ci un saut à ligne provenant du dernier élément est forcé. Non seulement il n’est pas recommandé d’utiliser <code>n</code> (ou <code>@n</code> ou même <code>@.</code>) car ce ne sont pas à proprement parler des directives de <code>Format</code> mais des directives systèmes qui vont donc chambouler le reste de l’impression.</p> <blockquote> <p>Malheureusement bien trop de développeurs et développeuses ont découvert <code>@.</code> en même temps que <code>Format</code> et s’en servent sans restriction. Au risque de me répéter souvent : n’utilisez pas <code>@.</code> !</p> </blockquote> <h3>II.1. 
Le spécificateur <code>@</code></h3> <p>On l’avait vu, une chaîne de formattage est composée de caractères et de spécificateurs de caractères commençant par <code>%</code> Les spécificateurs sont des caractères qui ne sont pas affichés et qui seront remplacés avant l’affichage final.</p> <p><code>Format</code> ajoute son propre spécificateur de caractère : <code>@</code>.</p> <h4>II.1.a. Le vidage (<em>flush</em>)</h4> <p>La première spécification qu’on a vue est donc celle qu’il ne faut presque jamais utiliser (ce qui pose la question de l’avoir mentionnée en premier lieu) : <code>@.</code>. Cette spécification indique seulement au moteur d’impression qu’à ce niveau là il faut sauter une ligne et vider l’imprimeur. Les deux autres spécifications semblables sont <code>@n</code> qui n’indique que le saut de ligne et <code>@?</code> qui n’indique que le vidage de l’imprimeur. L’inconvénient de ces trois spécificateurs est qu’ils sont trop puissants et chamboulent donc le bon fonctionnement du reste de l’impression. Je n’ai personnellement jamais utilisé <code>@n</code> (autant utiliser une boîte avec un spécificateur de coupure comme nous le verrons immédiatement après) et n’utilise <code>@. </code>que lorsque je sais qu’il ne reste rien à imprimer.</p> <h4>II.1.b. Les indications de coupure ou d’espace</h4> <p>Important :</p> <ul> <li>Une indication de coupure saute à la ligne s’il le faut sinon elle ne fait rien </li> <li>Une indication d’espace sécable saute à la ligne s’il le faut, sinon elle affiche une espace </li> </ul> <p>Les deux sont donc des indications de saut de ligne si nécessaire, il n’existe pas d’indication d’espace par défaut ou rien s’il n’y a pas assez d’espace (utiliser <code> </code> affichera toujours une espace).</p> <p>Les indications sont au nombre de trois (et leur fonctionnement sera bien plus clair lorsque vous verrez les boîtes) :</p> <ul> <li><code>@,</code> : indication de coupure (c’est-à-dire rien prioritairement ou un saut à la ligne s’il le faut) </li> <li><code>@⎵</code> : indique une espace sécable (c’est-à-dire une espace prioritairement ou un saut à la ligne s’il le faut) (Il faut bien évidemment comprendre le caractère <code>⎵</code> comme l’espace blanc habituel) </li> <li><code>@;&lt;n o&gt;</code> : indique <code>n</code> espaces sécables ou une coupure indentée de <code>o</code> (c’est-à-dire <code>n</code> espaces sécables prioritairement ou un saut à la ligne <strong>avec une indentation supplémentaire de <code>o</code></strong> s’il le faut) </li> </ul> <p>D’après ce que je viens d’écrire il devrait être évident maintenant que le caractère est une espace insécable qui ne provoquera donc pas de saut à la ligne quand bien même on dépasserait les limites de celle-ci. Contrairement à nos espaces de traitement de texte habituel qui sont des espaces sécables (pouvant provoquer des sauts de ligne), il faut spécifier quels espaces sont sécables lorsqu’on utilise Format.</p> <p>On écrira par exemple <code>fprintf fmt &quot;let rec f =@ %a&quot; pp_expr e</code> car on ne veut pas que <code>let rec f =</code> soit séparé en plusieurs lignes mais on met bien <code>@⎵</code> avant <code>%a</code> car l’expression sera soit sur la même ligne si suffisament petite soit à la ligne suivante si trop grande (on devrait même écrire <code>@;&lt;1 2&gt;</code> pour que l’expression soit indentée si on saute à la ligne suivante mais, on va le voir immédiatement, c’est là que les boîtes nous permettent d’automatiser ce genre de comportement)</p> <h4>II.1.c. 
Les boîtes</h4> <p>La deuxième spécification est celle permettant d’ouvrir et de fermer des boîtes.</p> <p>Une boîte se commence par <code>@[</code> et se termine par <code>@]</code>. Entre ces deux bornes, on fait ce qu’on veut (<strong>sauf utiliser <code>@.</code>, <code>@?</code> ou <code>@\n</code> !</strong>). Tout ce qui se passe à l’intérieur de la boîte reste (et doit rester) à l’intérieur de celle-ci. Indentation, coupures, boîtes verticales, horizontales, les deux, l’une ou l’autre, toutes ces options sont accessibles une fois qu’une boîte a été ouverte. Voyons-les rapidement (pour rappel, la version détaillée est disponible dans le <a href="https://ocaml.org/learn/tutorials/format.fr.html">tutoriel</a>.</p> <p>Une fois qu’une boîte a été ouverte on peut préciser entre deux chevrons le comportement qu’on veut qu’elle ait en cas d’indication de coupure, en voici un rapide aperçu :</p> <ul> <li><code>&lt;v&gt;</code> : Toute indication de coupure entraîne un saut à la ligne </li> <li><code>&lt;h&gt;</code> : Toute indication d’espace entraîne une espace, les indications de coupure n’ont aucun effet </li> <li><code>&lt;hv&gt;</code>: Si toute la boîte peut être imprimée sur la même ligne alors seules les indications d’espace sont prises en compte sinon seules les indications de coupure le sont et chaque élément est imprimé sur sa propre ligne </li> <li><code>&lt;hov&gt;</code> : Tant que des éléments peuvent être imprimés sur une ligne ils le sont avec leurs indications d’espace. Les indications de coupure sont utilisées lorsqu’il faut sauter une ligne. </li> </ul> <p>Chacun de ces comportements peut se voir attribuer une valeur supplémentaire, sa valeur d’indentation, qui indique l’indentation par rapport au début de la boîte qui devra être ajoutée à chaque saut de ligne.</p> <p>Soit le code suivant permettant d’afficher une liste d’items séparés soit par une indication de coupure <code>@,</code>, soit par une indication d’espace <code>@⎵</code> soit par une indication d’espace ou de coupure indentée <code>@;&lt;2 3&gt;</code> (2 espaces ou une coupure indentée de trois espaces) :</p> <pre><code class="language-ocaml">open Format let l = [&quot;toto&quot;; &quot;tata&quot;; &quot;titi&quot;] let pp_item fmt s = fprintf fmt &quot;%s&quot; s let pp_cut fmt () = fprintf fmt &quot;@,&quot; let pp_spc fmt () = fprintf fmt &quot;@ &quot; let pp_brk fmt () = fprintf fmt &quot;@;&lt;2 3&gt;&quot; let pp_list pp_sep fmt l = pp_print_list pp_item ~pp_sep fmt l </code></pre> <p>Voici un récapitulatif des différents comportements de boîtes en fonction des indications de coupure/espace rencontrées :</p> <pre><code class="language-ocaml">(* Boite verticale (tout est coupure) *) printf &quot;------------@.&quot;; printf &quot;v@.&quot;; printf &quot;------------@.&quot;; printf &quot;@[&lt;v 2&gt;[%a]@]@.&quot; (pp_list pp_cut) l; printf &quot;@[&lt;v 2&gt;[%a]@]@.&quot; (pp_list pp_spc) l; printf &quot;@[&lt;v 2&gt;[%a]@]@.&quot; (pp_list pp_brk) l; (* Sortie attendue: ------------ v ------------ [toto tata titi] [toto tata titi] [toto tata titi] *) (* Boîte horizontale (pas de coupure) *) printf &quot;------------@.&quot;; printf &quot;h@.&quot;; printf &quot;------------@.&quot;; printf &quot;@[&lt;h 2&gt;[%a]@]@.&quot; (pp_list pp_cut) l; printf &quot;@[&lt;h 2&gt;[%a]@]@.&quot; (pp_list pp_spc) l; printf &quot;@[&lt;h 2&gt;[%a]@]@.&quot; (pp_list pp_brk) l; (* Sortie attendue: ------------ h ------------ [tototatatiti] [toto tata titi] [toto tata titi] *) (* Boîte horizontale-verticale 
(Affiche tout sur une ligne si possible sinon boîte verticale) *) printf &quot;------------@.&quot;; printf &quot;hv@.&quot;; printf &quot;------------@.&quot;; printf &quot;@[&lt;hv 2&gt;[%a]@]@.&quot; (pp_list pp_cut) l; printf &quot;@[&lt;hv 2&gt;[%a]@]@.&quot; (pp_list pp_spc) l; printf &quot;@[&lt;hv 2&gt;[%a]@]@.&quot; (pp_list pp_brk) l; (* Sortie attendue: ------------ hv ------------ [toto tata titi] [toto tata titi] [toto tata titi] *) (* Boîte horizontale ou verticale tassante (Affiche le maximum possible sur une ligne avant de sauter à la ligne suivante et recommencer) *) printf &quot;------------@.&quot;; printf &quot;hov@.&quot;; printf &quot;------------@.&quot;; printf &quot;@[&lt;hov 2&gt;[%a]@]@.&quot; (pp_list pp_cut) l; printf &quot;@[&lt;hov 2&gt;[%a]@]@.&quot; (pp_list pp_spc) l; printf &quot;@[&lt;hov 2&gt;[%a]@]@.&quot; (pp_list pp_brk) l; (* Sortie attendue: ------------ hov ------------ [tototata titi] [toto tata titi] [toto tata titi] *) (* Boîte horizontale ou verticale structurelle (Même fonctionnement que la boîte tassante sauf pour le dernier retour à la ligne qui tente de favoriser une indentation de niveau 0) *) printf &quot;------------@.&quot;; printf &quot;b@.&quot;; printf &quot;------------@.&quot;; printf &quot;@[&lt;b 2&gt;[%a]@]@.&quot; (pp_list pp_cut) l; printf &quot;@[&lt;b 2&gt;[%a]@]@.&quot; (pp_list pp_spc) l; printf &quot;@[&lt;b 2&gt;[%a]@]@.&quot; (pp_list pp_brk) l (* Sortie attendue: ------------ b ------------ [tototata titi] [toto tata titi] [toto tata titi] *) </code></pre> <p>Petite précision sur l’utilisation ici des <code>@.</code> alors qu’il est recommandé de ne jamais les utiliser. Il ne faut en réalité pas <strong>jamais</strong> les utiliser, il faut seulement les utiliser lorsqu’on est sûr de n’être dans aucune boîte. Ici, par exemple, on souhaite marquer distinctement les différentes impressions de boîtes, il est donc tout à fait correct d’utiliser <code>@.</code> étant donné qu’on est sûr d’être au dernier niveau d’impression (rien au-dessus) et de ne pas casser une passe d’impression élégante. Il serait donc bien plus précis de dire</p> <blockquote> <p>Il ne faut pas utiliser <code>@.</code>, <code>@n</code> et <code>@?</code> dans des impressions qui sont ou seront potentiellement imbriquées</p> </blockquote> <p>Mais il est bien plus simple pour commencer de ne jamais les utiliser quitte à les rajouter après.</p> <p>Le comportement de la boîte <code>b</code> (boîte structurelle) semble être le même que celui de la boîte <code>hov</code> (boîte tassante) mais il se trouve des cas où les deux diffèrent (généralement lorsqu’un saut de ligne réduit l’indentation courante, la boîte structurelle saute à la ligne même s’il reste de la place sur la ligne courante). Je vous invite à consulter le <a href="https://ocaml.org/learn/tutorials/format.fr.html">tutoriel</a> pour plus de précisions (je dois aussi avouer que leur fonctionnement est très proche de ce qu’on pourrait appeler “opaque” étant donné qu’en fonction de la taille de marge le comportement attendu aura lieu ou non. L’auteur de ce tutoriel tient à préciser qu’il utilise plutôt des boîtes verticales avec une indentation nulle s’il lui arrive de vouloir obtenir le comportement des boîtes structurelles, un exemple est fourni lors de l’affichage en HTML à la fin de ce document).</p> <h3>II.2. 
Récapitulatif</h3> <ul> <li>Il faut utiliser des boîtes </li> <li>Les indications de vidage fermant toutes les boîtes, il ne faut surtout pas les utiliser dans des fonctions d’affichage internes, il faut se limiter aux indications de coupure et d’espace </li> <li>Il faut vraiment utiliser des boîtes </li> </ul> <p>Vous voilà armé-e-s pour utiliser Format dans sa version la plus simple, avec des boîtes, de l’indentation, des indications de coupure et d’espace.</p> <p>Reprenons notre affichage d’erreur :</p> <pre><code class="language-ocaml">let pp_error fmt = function | Type_Error (s1, s2) -&gt; fprintf fmt &quot;@[&lt;hov 2&gt;Type is %s@ instead of %s@]&quot; s1 s2 | Apply_non_function s -&gt; fprintf fmt &quot;@[&lt;hov 2&gt;Type is %s,@ this is not a function@]&quot; s let pp_list fmt l = pp_print_list pp_error fmt l </code></pre> <p>On a encapsulé l’affichage des deux erreurs dans des boîtes <code>hov</code> avec une indication d’espace sécable au milieu et utilisé la fonction <code>pp_print_list</code> du module <a href="https://caml.inria.fr/pub/docs/manual-ocaml/libref/Format.html">Format</a></p> <p>Si je tente maintenant d’afficher une liste d’erreurs dans deux environnements, un de 50 colonnes et l’autre de 25 colonnes de largeur avec le code suivant :</p> <pre><code class="language-ocaml">let () = let e1 = Type_Error (&quot;int&quot;, &quot;bool&quot;) in let e2 = Apply_non_function (&quot;int&quot;) in let e3 = Type_Error (&quot;int&quot;, &quot;float&quot;) in let e4 = Apply_non_function (&quot;bool&quot;) in let el = [e1; e2; e3; e4] in pp_set_margin std_formatter 50; fprintf std_formatter &quot;--------------------------------------------------@.&quot;; fprintf std_formatter &quot;@[&lt;v 0&gt;%a@]@.&quot; pp_list el; pp_set_margin std_formatter 25; fprintf std_formatter &quot;-------------------------@.&quot;; fprintf std_formatter &quot;@[&lt;v 0&gt;%a@]@.&quot; pp_list el; </code></pre> <p>J’obtiens le résultat suivant :</p> <pre><code class="language-ocaml">-------------------------------------------------- Type is int instead of bool Type is int, this is not a function Type is int instead of float Type is bool, this is not a function ------------------------- Type is int instead of bool Type is int, this is not a function Type is int instead of float Type is bool, this is not a function </code></pre> <p>Ce qu’on rajoute en verbosité on le gagne en élégance. Et en parlant d’élégance, ça manque de couleurs.</p> <h2>III. Les tags sémantiques</h2> <p>Cette partie n’est pas présente dans le tutoriel mais dans un <a href="https://hal.archives-ouvertes.fr/hal-01503081/file/format-unraveled.pdf">article tutoriel</a> qui l’explique assez rapidement.</p> <p>La troisième spécification, donc (après celles de coupure et de boîtes), est la spécification de tag sémantique : <code>@{</code> pour en ouvrir un et <code>@}</code> pour le fermer.</p> <h3>III.1. Marquer son texte</h3> <p>Mais avant de comprendre leur fonctionnement, cherchons à comprendre leur intérêt. Que vous souhaitiez afficher dans un terminal, dans une page html ou autre, il y a de fortes chances que cette sortie accepte les marques de texte comme l’italique, la coloration etc. 
Utilisateur d’emacs et d’un <a href="https://en.wikipedia.org/wiki/ANSI_escape_code">terminal ANSI</a>, je peux modifier l’apparence de mon texte grâce aux codes ANSI :</p> <p><img src="/blog/assets/img/exemple-ansiterm.png" alt="Exemple de marquage de texte dans un terminal ANSI" /></p> <p>Si je crée un programme OCaml qui affiche cette chaîne de caractères et que je l’exécute directement dans mon terminal, je devrais obtenir le même résultat :</p> <p><img src="/blog/assets/img/exemple-ansiocaml-bad.png" alt="Exemple de marquage de texte dans un terminal ANSI depuis un programme OCaml qui se passe mal" /></p> <p>Naturellement, ça ne fonctionne pas, si l’informatique était standardisée et si tout le monde savait communiquer ça se saurait. Il s’avère que le caractère <code>\033</code> est interprété en octal par les terminaux ANSI mais en décimal par OCaml (ce qui semble être l’interprétation normale). OCaml permet de représenter un <a href="https://caml.inria.fr/pub/docs/manual-ocaml/lex.html#sss:character-literals">caractère</a> selon plusieurs séquences d’échappement différentes :</p> <table><thead><tr><th>Séquence</th><th>Caractère résultant</th></tr></thead><tbody><tr><td><code>\DDD</code></td><td>le caractère correspondant au code ASCII <code>DDD</code> en décimal</td></tr><tr><td><code>\xHH</code></td><td>le caractère correspondant au code ASCII <code>HH</code> en hexadécimal</td></tr><tr><td><code>\oOOO</code></td><td>le caractère correspondant au code ASCII <code>OOO</code> en octal</td></tr></tbody></table> <p>On peut donc écrire au choix</p> <pre><code class="language-ocaml">let () = Format.printf &quot;\027[36mBlue Text \027[0;3;30;47mItalic WhiteBG Black Text&quot; let () = Format.printf &quot;\x1B[36mBlue Text \x1B[0;3;30;47mItalic WhiteBG Black Text&quot; let () = Format.printf &quot;\o033[36mBlue Text \o033[0;3;30;47mItalic WhiteBG Black Text&quot; </code></pre> <p>Dans tous les cas, on obtient le résultat suivant :</p> <p><img src="/blog/assets/img/exemple-ansiocaml-good.png" alt="Exemple de marquage de texte dans un terminal ANSI depuis un programme OCaml qui se passe bien" /></p> <p>Que se passe-t-il, par contre, si j’exécute une de ces lignes dans un terminal non ANSI ? En testant sur <a href="https://try.ocamlpro.com/">TryOCaml</a> :</p> <p><img src="/blog/assets/img/tryocaml-ansi.png" alt="Exemple de marquage de texte dans un navigateur depuis TryOCaml" /></p> <p>On ne veut surtout pas que ce genre d’affichage puisse arriver. Il faudrait donc pouvoir s’assurer que le marquage du texte soit actif uniquement quand on le décide. L’idée de créer deux chaînes de formatage en fonction de notre capacité ou non à afficher du texte marqué n’est clairement pas une bonne pratique de programmation (changer une formulation demande de changer deux chaînes de formatage, le code est difficilement maintenable). Il faudrait donc un outil qui puisse faire un pré-traitement de notre chaîne de formatage pour lui ajouter des décorations.</p> <p>Cet outil est déjà fourni par Format, ce sont les tags sémantiques.</p> <h3>III.2 Les tags sémantiques</h3> <p>Introduits par <code>@{</code> et fermés par <code>@}</code>, comme les boîtes ils sont paramétrés par la construction <code>&lt;t&gt;</code> pour indiquer l’ouverture (et la fermeture) du tag <code>t</code>.
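Par exemple (esquisse minimale, le nom de tag <code>important</code> est arbitraire), un tel tag s’écrit ainsi :</p> <pre><code class="language-ocaml">(* Esquisse : ouverture puis fermeture d'un tag sémantique nommé important. *)
let () = Format.printf &quot;Du texte @{&lt;important&gt;marqué@} !@.&quot;
</code></pre> <p>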
Contrairement aux boîtes, les tags n’ont aucune signification pour l’imprimeur (on peut faire l’analogie avec les types de base d’OCaml que sont <code>int</code>, <code>bool</code>, <code>float</code> etc. et les types définis par le programmeur ou la programmeuse (<code>type t = A | B</code>, par exemple). Les types de base ont déjà une quantité de fonctions qui leur sont associées alors que les types définis ne signifient rien tant qu’on n’écrit pas les fonctions qui les manipuleront). L’avantage premier de ces tags est donc que, n’ayant aucune signification, ils sont tout simplement ignorés par l’imprimeur lors de l’affichage de notre chaîne de caractères finale :</p> <p><img src="/blog/assets/img/exemple-stag-tryocaml.png" alt="Exemple de marquage avec un tag sémantique de texte dans un navigateur depuis TryOCaml" /></p> <p>Par défaut, l’imprimeur ne traite pas les tags sémantiques (ce qui permet d’avoir un comportement d’affichage aussi simple que possible par défaut). Le traitement des tags sémantiques peut être activé pour chaque <code>formatter</code> indépendamment avec les fonctions <code>val pp_set_tags : formatter -&gt; bool -&gt; unit</code>, <code>val pp_set_print_tags : formatter -&gt; bool -&gt; unit</code> et <code>val pp_set_mark_tags : formatter -&gt; bool -&gt; unit</code> dont on verra les effets immédiatement. Voyons déjà ce qui se passe avec la fonction générale <code>pp_set_tags</code> qui combine les deux suivantes :</p> <p><img src="/blog/assets/img/exemple-stag-actif-tryocaml.png" alt="Exemple de traitement du marquage avec un tag sémantique de texte dans un navigateur depuis TryOCaml" /></p> <p>Que s’est-il passé ?</p> <p>Une fois que le traitement des tags sémantiques est activé, quatre opérations vont être effectuées à chaque ouverture et fermeture de tag :</p> <ul> <li><code>print_open_stag</code> suivie de <code>mark_open_stag</code> pour chaque tag <code>t</code> ouvert avec <code>@{&lt;t&gt;</code> </li> <li><code>mark_close_stag</code> suivie de <code>print_close_stag</code> pour chaque tag <code>t</code> fermé avec <code>@}</code> correspondant à la dernière ouverture <code>@{&lt;t&gt;</code> </li> </ul> <p>Regardons les signatures de ces quatre opérations :</p> <pre><code class="language-ocaml">type formatter_stag_functions = { mark_open_stag : stag -&gt; string; mark_close_stag : stag -&gt; string; print_open_stag : stag -&gt; unit; print_close_stag : stag -&gt; unit; } </code></pre> <p>Les fonctions <code>mark_*_stag</code> prennent un tag sémantique en paramètre et renvoient une chaîne de caractères tandis que les fonctions <code>print_*_stag</code> prennent le même paramètre mais ne renvoient rien. La raison derrière ce choix est en réalité toute simple :</p> <ul> <li>Les fonctions de marquage écrivent directement dans la cible d’affichage (le terminal, le fichier ou autre) </li> <li>Les fonctions d’affichage écrivent dans le <code>formatter</code> qui les traite comme des chaînes de caractères normales qui peuvent donc entraîner des sauts de ligne, des coupures, de nouvelles boîtes, etc. </li> </ul> <p>Une indication de couleur pour un terminal ANSI n’apparaît pas à l’affichage, le texte se retrouve coloré, il semble donc naturel de ne pas vouloir que cette indication ait un effet sur l’impression élégante.
En revanche, si on voulait avoir une sortie vers un fichier LaTeX ou HTML, cette indication de couleur apparaîtrait et devrait donc avoir une influence sur l’impression élégante.</p> <p>Il est donc assez simple de savoir dans quel cas on veut utiliser <code>print_*_stag</code> ou <code>mark_*_stag</code> :</p> <ul> <li>Si le tag doit avoir un impact immédiat sur l’apparence du texte affiché (couleur, taille, décorations…) et non pas son contenu, il faut utiliser <code>mark_*_stag</code> </li> <li>Si le tag doit avoir un impact sur le contenu du texte affiché et non pas sur son apparence, il faut utiliser <code>print_*_stag</code> </li> <li>Si le tag doit avoir un impact à la fois sur le contenu et l’apparence du texte affiché alors il faut utiliser les deux en séparant bien entre contenu géré par <code>print_*_stag</code> et apparence gérée par <code>mark_*_stag</code> </li> </ul> <p>Ces quatre fonctions ont chacune un comportement par défaut que voici :</p> <pre><code class="language-ocaml">let mark_open_stag = function | String_tag s -&gt; &quot;&lt;&quot; ^ s ^ &quot;&gt;&quot; | _ -&gt; &quot;&quot; let mark_close_stag = function | String_tag s -&gt; &quot;&lt;/&quot; ^ s ^ &quot;&gt;&quot; | _ -&gt; &quot;&quot; let print_open_stag = ignore let print_close_stag = ignore </code></pre> <p>Le type <code>stag</code> est un type somme extensible (introduit dans <a href="https://ocaml.org/releases/4.02.html">OCaml 4.02.0</a>), c’est-à-dire qu’il est défini de la sorte :</p> <pre><code class="language-ocaml">type stag = .. type stag += String_tag of string </code></pre> <p>Par défaut, seuls les <code>String_tag of string</code> sont donc reconnus comme des tags sémantiques (ce sont aussi les seuls qui peuvent être obtenus par la construction <code>@{&lt;t&gt; ... @}</code>, ici <code>t</code> sera traité comme <code>String_tag t</code>) ce qui est illustré par le comportement par défaut de <code>mark_open_stag</code> et <code>mark_close_stag</code>. Ce comportement par défaut nous permet aussi de comprendre ce qui est arrivé ici :</p> <p><img src="/blog/assets/img/exemple-stag-actif-tryocaml_002.png" alt="Exemple de traitement du marquage avec un tag sémantique de texte dans un navigateur depuis TryOCaml" /></p> <p>N’ayant pas personnalisé les opérations de manipulation des tags, leur comportement par défaut a été exécuté, ce qui revient à afficher directement le tag entre chevrons sans passer par le <code>formatter</code>.
Il faut donc définir les comportements voulus pour nos tags (attention, ne manipulant que des chaînes de caractère, toute erreur est conséquemment difficile à identifier et corriger, il vaut mieux donc éviter les célèbres <code>| _ -&gt; ()</code> — il faudrait en réalité les éviter tout le temps si possible mais c’est une autre histoire).</p> <p>Commençons donc par définir nos tags et ce à quoi on veut qu’ils correspondent :</p> <pre><code class="language-ocaml">open Format type style = | Normal | Italic | Italic_off | FG_Black | FG_Blue | FG_Default | BG_White | BG_Default let close_tag = function | Italic -&gt; Italic_off | FG_Black | FG_Blue | FG_Default -&gt; FG_Default | BG_White | BG_Default -&gt; BG_Default | _ -&gt; Normal let style_of_tag = function | String_tag s -&gt; begin match s with | &quot;n&quot; -&gt; Normal | &quot;italic&quot; -&gt; Italic | &quot;/italic&quot; -&gt; Italic_off | &quot;fg_black&quot; -&gt; FG_Black | &quot;fg_blue&quot; -&gt; FG_Blue | &quot;fg_default&quot; -&gt; FG_Default | &quot;bg_white&quot; -&gt; BG_White | &quot;bg_default&quot; -&gt; BG_Default | _ -&gt; raise Not_found end | _ -&gt; raise Not_found </code></pre> <p>Maintenant que chaque tag possible est géré, il nous faut les associer à leur valeur (ANSI dans ce cas) et implémenter nos propres fonctions de marquages (et pas d’affichage car a priori ces tags n’ont aucun effet sur le contenu du texte affiché) :</p> <pre><code class="language-ocaml">(* See https://en.wikipedia.org/wiki/ANSI_escape_code#SGR_parameters for some values *) let to_ansi_value = function | Normal -&gt; &quot;0&quot; | Italic -&gt; &quot;3&quot; | Italic_off -&gt; &quot;23&quot; | FG_Black -&gt; &quot;30&quot; | FG_Blue -&gt; &quot;34&quot; | FG_Default -&gt; &quot;39&quot; | BG_White -&gt; &quot;47&quot; | BG_Default -&gt; &quot;49&quot; let ansi_tag = Printf.sprintf &quot;\x1B[%sm&quot; let start_mark_ansi_stag t = ansi_tag @@ to_ansi_value @@ style_of_tag t let stop_mark_ansi_stag t = ansi_tag @@ to_ansi_value @@ close_tag @@ style_of_tag t </code></pre> <p>On se le rappelle, l’ouverture d’un tag ANSI se fait avec la séquence d’échappement <code>x1B</code> suivie de une ou plusieurs valeurs de tags séparées par <code>;</code> entre <code>[</code> et <code>m</code>. Dans notre cas chaque tag n’est associé qu’à une valeur mais il serait tout à fait possible d’avoir un <code>Error -&gt; &quot;1;4;31&quot;</code> qui imposerait un affichage gras, souligné et en rouge. 
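À titre d’illustration, voici une esquisse indépendante du code précédent (noms purement indicatifs) qui envoie une telle valeur composite au terminal :</p> <pre><code class="language-ocaml">(* Esquisse : une valeur SGR composite n'est qu'une suite de paramètres
   séparés par des points-virgules. *)
let ansi_of_params params = Printf.sprintf &quot;\x1B[%sm&quot; params

let () =
  print_string (ansi_of_params &quot;1;4;31&quot;);  (* gras, souligné, rouge *)
  print_string &quot;Erreur !&quot;;
  print_string (ansi_of_params &quot;0&quot;);       (* retour au style normal *)
  print_newline ()
</code></pre> <p>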
Tant que la chaîne de caractères renvoyée au terminal correspond bien à une séquence de marquage ANSI, tout est possible.</p> <p>Il faut ensuite faire en sorte que ces fonctions soient celles utilisées par le <code>formatter</code> lors de leur traitement :</p> <pre><code class="language-ocaml">let add_ansi_marking formatter = let open Format in pp_set_mark_tags formatter true; let old_fs = pp_get_formatter_stag_functions formatter () in pp_set_formatter_stag_functions formatter { old_fs with mark_open_stag = start_mark_ansi_stag; mark_close_stag = stop_mark_ansi_stag } </code></pre> <p>On utilise la fonction <code>pp_set_mark_tags</code> (au lieu de <code>pp_set_tags</code>) car on ne se sert pas de <code>print_*_stags</code> et on associe aux fonctions <code>mark_*_stag</code> les fonctions <code>*_ansi_stag</code>.</p> <p>Il ne nous reste plus qu’à faire en sorte que les tags sémantiques soient traités, et traités avec nos fonctions, avant d’afficher notre chaîne de caractères :</p> <pre><code class="language-ocaml">let () = add_ansi_marking std_formatter; Format.printf &quot;@{&lt;fg_blue&gt;Blue Text @}@{&lt;italic&gt;@{&lt;bg_white&gt;@{&lt;fg_black&gt;Italic WhiteBG BlackFG Text@}@}@}&quot; </code></pre> <p>Et l’affichage dans le terminal sera bien celui voulu :</p> <p><img src="/blog/assets/img/ansi-color-term-stag.png" alt="Exemple de marquage avec la gestion des tags sémantiques par Format dans un terminal ANSI" /></p> <p>Si le programme doit être affiché dans un terminal non ANSI, il suffit simplement d’enlever la ligne <code>add_ansi_marking std_formatter;</code> :</p> <p><img src="/blog/assets/img/ansi-color-try-stag.png" alt="Exemple de marquage avec la gestion des tags sémantiques par Format dans un terminal ANSI" /></p> <p>On pourrait aussi faire en sorte que notre texte puisse être envoyé vers un document HTML.</p> <p>Il faut déjà changer les valeurs associées aux tags (on voit ici l’utilisation de boîtes verticales à indentation nulle mentionnée lors du paragraphe sur les boîtes structurelles) :</p> <pre><code class="language-ocaml">let to_html_value fmt = let fg_color c = Format.fprintf fmt {|@[&lt;v 0&gt;@[&lt;v 2&gt;&lt;span style=&quot;color:%s;&quot;&gt;@,|} c in let bg_color c = Format.fprintf fmt {|@[&lt;v 0&gt;@[&lt;v 2&gt;&lt;span style=&quot;background-color:%s;&quot;&gt;@,|} c in let close_span () = Format.fprintf fmt &quot;@]@,&lt;/span&gt;@]&quot; in let default = Format.fprintf fmt in fun t -&gt; match t with | Normal -&gt; () | Italic -&gt; default &quot;&lt;i&gt;&quot; | Italic_off -&gt; default &quot;&lt;/i&gt;&quot; | FG_Black -&gt; fg_color &quot;black&quot; | FG_Blue -&gt; fg_color &quot;blue&quot; | FG_Default -&gt; close_span () | BG_White -&gt; bg_color &quot;white&quot; | BG_Default -&gt; close_span () </code></pre> <p>La construction <code>{| ... |}</code> permet d’avoir des chaînes de caractères sans les caractères spéciaux <code>&quot;</code> et <code>\</code>, ce qui permet d’écrire <code>{|&quot;This is a nice &quot;|}</code> sans échapper ces caractères.</p> <p>De même, la construction</p> <pre><code class="language-ocaml">let fonction arg1 ... argn = let expr1 = ... in ... let exprn = ... in fun argn1 ... argnm -&gt; </code></pre> <p>permet de définir des expressions internes à une fonction qui dépendent des arguments fournis avant et donc, dans le cas d’une application partielle, de calculer cet environnement une seule fois.
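Un petit exemple générique de ce motif (esquisse indépendante du tutoriel, noms arbitraires) :</p> <pre><code class="language-ocaml">(* Esquisse : l'environnement (ici la mise en majuscules du préfixe)
   n'est calculé qu'une seule fois par application partielle. *)
let make_printer prefix =
  let prefix = String.uppercase_ascii prefix in
  fun msg -&gt; print_endline (prefix ^ &quot;: &quot; ^ msg)

let warn = make_printer &quot;attention&quot;
let () = warn &quot;premier message&quot;; warn &quot;second message&quot;
</code></pre> <p>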
Dans le cas de la fonction <code>to_html_value</code>, je pourrais donc créer la nouvelle application partielle <code>let to_html_value_std = to_html_value std_formatter</code> qui contiendra donc directement les implémentations de <code>fg_color</code>, <code>bg_color</code>, <code>close_span</code> et <code>default</code> pour <code>std_formatter</code>.</p> <p>Contrairement au cas du terminal ANSI, ce qui changera sera le contenu et non pas l’apparence du texte, nous utiliserons donc les fonctions <code>print_*_stag</code>. C’est pourquoi nos fonctions doivent directement écrire dans le <code>formatter</code> et non pas renvoyer une chaîne de caractères.</p> <p>Les fonctions d’ouverture et de fermeture ne changent pas énormément :</p> <pre><code class="language-ocaml">let start_print_html_stag fmt t = to_html_value fmt @@ style_of_tag t let stop_print_html_stag fmt t = to_html_value fmt @@ close_tag @@ style_of_tag t </code></pre> <p>On associe ensuite ces fonctions aux fonctions <code>print_*_stag</code> :</p> <pre><code class="language-ocaml">let add_html_printings formatter = let open Format in pp_set_mark_tags formatter false; pp_set_print_tags formatter true; let old_fs = pp_get_formatter_stag_functions formatter () in pp_set_formatter_stag_functions formatter { old_fs with print_open_stag = start_print_html_stag formatter; print_close_stag = stop_print_html_stag formatter} </code></pre> <p>On en profite pour désactiver le marquage sur le formatter passé en paramètre. Cela évite d’avoir de mauvaises surprises au cas où il aurait été activé précédemment (il aurait fallu faire de même lors du marquage pour le terminal ANSI).</p> <p>Finalement, l’appel à :</p> <pre><code class="language-ocaml">let () = add_html_printings std_formatter; Format.printf &quot;@[&lt;v 0&gt;@{&lt;fg_blue&gt;Blue Text @}@,@{&lt;italic&gt;@{&lt;bg_white&gt;@{&lt;fg_black&gt;Italic WhiteBG BlackFG Text@}@}@}@]@.&quot; </code></pre> <p>nous donne le résultat attendu :</p> <pre><code class="language-html">&lt;span style=&quot;color:blue;&quot;&gt; Blue Text &lt;/span&gt; &lt;i&gt; &lt;span style=&quot;background-color:white;&quot;&gt; &lt;span style=&quot;color:black;&quot;&gt; Italic WhiteBG BlackFG Text &lt;/span&gt; &lt;/span&gt; &lt;/i&gt; </code></pre> <h2>Conclusion</h2> <p>Nous voici arrivés à la fin de ce tutoriel qui, je l’espère, vous permettra d’appréhender le module Format avec bien plus de sérénité.</p> <p>Dans les possibilités non présentées ici mais qu’il est intéressant d’avoir en mémoire :</p> <ul> <li>Possibilité de redéfinir intégralement toutes les fonctions d’affichage définies dans l’enregistrement : </li> </ul> <pre><code class="language-ocaml">type formatter_out_functions = { out_string : string -&gt; int -&gt; int -&gt; unit; out_flush : unit -&gt; unit; out_newline : unit -&gt; unit; out_spaces : int -&gt; unit; out_indent : int -&gt; unit; } </code></pre> <ul> <li>Possibilité de transformer n’importe quelle sortie en un formatter pour écrire directement dedans sans avoir à passer par des chaînes de caractères intermédiaires (notamment la fonction <code>val formatter_of_buffer : Buffer.t -&gt; formatter</code> qui permet d’écrire directement dans un buffer) </li> <li>L’impression élégante symbolique, qui imprime de façon symbolique et permet donc de voir directement quelles directives seront envoyées au <code>formatter</code> à l’impression. Très utile pour déboguer en cas d’impression cacophonique mais aussi extrêmement puissant pour effectuer une phase de post-traitement (par exemple si on veut ajouter un symbole à chaque début de ligne) </li> <li>Les fonctions utiles qu’il ne faut pas oublier d’utiliser (je sais que les devs OCaml aiment réinventer la roue mais il existe déjà des fonctions pour afficher des listes, des options et les résultats <code>Ok _ | Error _</code>) : </li> </ul> <pre><code class="language-ocaml">val pp_print_list : ?pp_sep:(formatter -&gt; unit -&gt; unit) -&gt; (formatter -&gt; 'a -&gt; unit) -&gt; formatter -&gt; 'a list -&gt; unit (* Affiche une liste dont chaque élément est séparé par le séparateur par défaut `@,` ou celui fourni *) val pp_print_option : ?none:(formatter -&gt; unit -&gt; unit) -&gt; (formatter -&gt; 'a -&gt; unit) -&gt; formatter -&gt; 'a option -&gt; unit (* Affiche le contenu d’une option en cas de Some contenu, et rien par défaut (ou l’affichage fourni) en cas de None *) val pp_print_result : ok:(formatter -&gt; 'a -&gt; unit) -&gt; error:(formatter -&gt; 'e -&gt; unit) -&gt; formatter -&gt; ('a, 'e) result -&gt; unit (* Affiche le contenu d’un result. Les arguments ne sont ici pas optionnels et conditionnent l’affichage en cas de Ok _ et de Error _ *) </code></pre> <ul> <li>Enfin, une pelletée de fonctions à la <code>printf</code>, telles que : </li> <li><code>fprintf</code> que nous avons déjà vue </li> <li><code>dprintf</code> qui permet de retarder l'évaluation de l'impression et donc de ne pas calculer des impressions qui ne seront jamais faites </li> <li><code>ifprintf</code> qui n'affiche rien (utile lorsqu'on veut avoir la même signature que <code>fprintf</code> mais en étant sûr que rien ne sera fait) </li> </ul> <p>Sources :</p> <ul> <li> <p>Tutoriel du site OCaml</p> </li> <li> <p>Richard Bonichon, Pierre Weis. Format Unraveled. 28ièmes Journées Francophones des Langages Applicatifs, Jan 2017, Gourette, France.
hal-01503081</p> </li> </ul> <p>Codes sources :</p> <p>Code LaTeX correspondant à <code>printf</code></p> <pre><code class="language-latex">\documentclass[tikz,border=10pt]{standalone} \usepackage{tikz} \usetikzlibrary{math} \usetikzlibrary{tikzmark} \usepackage{xcolor} \pagecolor[rgb]{0,0,0} \color[rgb]{1,1,1} \colorlet{color1}{blue!50!white} \colorlet{color2}{red!50!white} \colorlet{color3}{green!50!black} \begin{document} \begin{tikzpicture}[remember picture] \node [align=left,font=\ttfamily] at (0,0) { let s = &quot;toto&quot; in\\[2em] printf &quot;{\color{color1}\tikzmarknode{scd}{\%d}} {\color{color2}\tikzmarknode{scc}{\%c}} {\color{color3}\tikzmarknode{scs}{\%s}}&quot; {\color{color1}\tikzmarknode{d}{3}} {\color{color2}\tikzmarknode{c}{'c'}} {\color{color3}\tikzmarknode{s}{s}}\\[2em] &gt; &quot;3 c toto&quot; }; \draw[&lt;-, color1] (scd.north) -- ++(0,0.5) -| (d); \draw[&lt;-, color2] (scc.south) -- ++(0,-0.4) -| (c); \draw[&lt;-, color3] (scs.north) -- ++(0,0.4) -| (s); \end{tikzpicture} \end{document} </code></pre> <p>Code LaTeX correspondant à <code>fprintf</code> :</p> <pre><code class="language-latex">\documentclass[tikz,border=10pt]{standalone} \usepackage{tikz} \usetikzlibrary{math} \usetikzlibrary{decorations.pathreplacing,tikzmark} \usepackage{xcolor} \pagecolor[rgb]{0,0,0} \color[rgb]{1,1,1} \colorlet{color1}{blue!50!white} \colorlet{color2}{red!50!white} \colorlet{color3}{green!50!black} \begin{document} \begin{tikzpicture}[remember picture] \node [align=left,font=\ttfamily] at (0,0) { let s = &quot;toto&quot; in\\[2em] fprintf \tikzmarknode{fmt}{fmt} \tikzmarknode{str}{&quot;{\color{color1}\tikzmarknode{scd}{\%d}} {\color{color2}\tikzmarknode{scc}{\%c}} {\color{color3}\tikzmarknode{scs}{\%s}}&quot;} {\color{color1}\tikzmarknode{d}{3}} {\color{color2}\tikzmarknode{c}{'c'}} {\color{color3}\tikzmarknode{s}{s}}\\[2em] &gt; \\ (* fmt &lt;- &quot;3 c toto&quot; *) }; \draw[&lt;-, color1] (scd.north) -- ++(0,0.5) -| (d); \draw[&lt;-, color2] (scc.south) -- ++(0,-0.3) -| (c); \draw[&lt;-, color3] (scs.north) -- ++(0,0.4) -| (s); \draw[decorate,decoration={brace, amplitude=5pt, raise=10pt},yshift=-2cm] (str.south east) -- (str.south west) node[midway, yshift=-13pt](a){} ; \draw[-&gt;, white] (a.south) -- ++(0,-0.1) -| (fmt); \end{tikzpicture} \end{document} </code></pre> <p>Code LaTeX correspondant à <code>fprintf</code> avec utilisation de <code>%a</code></p> <pre><code class="language-latex">\documentclass[tikz,border=10pt]{standalone} \usepackage{tikz} \usetikzlibrary{math} \usetikzlibrary{decorations.pathreplacing,tikzmark} \usepackage{xcolor} \pagecolor[rgb]{0,0,0} \color[rgb]{1,1,1} \colorlet{color1}{blue!50!white} \colorlet{color2}{red!50!white} \colorlet{color3}{green!50!black} \begin{document} \begin{tikzpicture}[remember picture] \node [align=left,font=\ttfamily] at (0,0) { let s = &quot;toto&quot; in\\[2em] type expr = \{i: int; j: int\}\\ let pp\_expr fmt {i; j} = fprintf fmt &quot;&lt;\%d, \%d&gt; i j&quot; in\\[2em] fprintf \tikzmarknode{fmt}{std\_formatter} \tikzmarknode{str}{&quot;{\color{color1}\tikzmarknode{scd}{\%d}} {\color{color2}\tikzmarknode{sca}{\%a}} {\color{color3}\tikzmarknode{scs}{\%s}}&quot;} {\color{color1}\tikzmarknode{d}{3}} {\color{color2}\tikzmarknode{ppe}{pp\_expr}} {\color{color2}\tikzmarknode{e}{\{i=1; j=2\}}} {\color{color3}\tikzmarknode{s}{s}}\\[2em] &gt; &quot;3 &lt;1, 2&gt; toto&quot; }; \draw[&lt;-, color1] (scd.north) -- ++(0,0.5) -| (d); \draw[&lt;-, color2] (sca.south) -- ++(0,-0.3) -| (ppe); \draw[&lt;-, color2] (sca.65) -- ++(0,0.3) -|
(e); \draw[-&gt;, color2] (fmt.north) -- ++(0,0.2) -| (sca.115); \draw[&lt;-, color3] (scs.south) -- ++(0,-0.4) -| (s); \draw[decorate,decoration={brace, amplitude=5pt, raise=12pt},yshift=-2cm] (str.south east) -- (str.south west) node[midway, yshift=-13pt](a){} ; \draw[-&gt;, white] (a.south) -- ++(0,-0.1) -| (fmt); \end{tikzpicture} \end{document} </code></pre> A Solidity parser in OCaml with Menhir https://ocamlpro.com/blog/2020_05_19_ocaml_solidity_parser_with_menhir 2020-05-19T13:48:57Z 2020-05-19T13:48:57Z David Declerck This article is cross-posted on Origin Labs’ Dune Network blog We are happy to announce the first release of our Solidity parser, written in OCaml using Menhir. This is a joint effort with Origin Labs, the company dedicated to blockchain challenges, to implement a full interpreter for the Solidity... <p align="center" > <a href="/blog/2020_05_19_ocaml_solidity_parser_with_menhir"> <img width="420" height="420" alt="Solidity Logo" title="A Solidity parser in OCaml with Menhir" src="/blog/assets/img/solidity-cover.png"> </a> </p> <br /> <blockquote> <p>This article is cross-posted on Origin Labs’ Dune Network <a href="https://medium.com/dune-network/a-solidity-parser-in-ocaml-with-menhir-e1064f94e76b">blog</a></p> </blockquote> <p>We are happy to announce the first release of <a href="https://github.com/OCamlPro/ocaml-solidity">our Solidity parser</a>, written in OCaml using <a href="http://gallium.inria.fr/~fpottier/menhir/">Menhir</a>. This is a joint effort with <a href="https://www.origin-labs.com/">Origin Labs</a>, the company dedicated to blockchain challenges, to implement a full interpreter for the <a href="https://solidity.readthedocs.io/en/v0.6.8/">Solidity language</a> directly in a blockchain.</p> <p><img src="/blog/assets/img/logo_solidity_title.png" alt="Solidity Logo" /></p> <p>Solidity is probably the most popular language for smart-contracts, small pieces of code triggered when accounts receive transactions on a blockchain. Solidity is an object-oriented strongly-typed language with a JavaScript-like syntax.</p> <p><img src="/blog/assets/img/logo_ethereum_title.png" alt="Ethereum Logo" /></p> <p>Solidity was first implemented for the <a href="https://ethereum.org/">Ethereum</a> blockchain, with a compiler to the EVM, the Ethereum Virtual Machine.</p> <p><img src="/blog/assets/img/logo_dune_title.png" alt="Dune Network Logo" /></p> <p>Dune Network takes a different approach, as Solidity smart-contracts will be executed natively, after type-checking. Solidity will be the third native language on Dune Network, with <a href="https://dune.network/docs/dune-node-mainnet/whitedoc/michelson.html">Michelson</a>, a low-level strongly-typed language inherited from Tezos, and <a href="https://dune.network/docs/dune-node-mainnet/love-doc/introduction.html">Love</a>, a higher-level strongly-typed language, also implemented jointly by OCamlPro and Origin Labs.</p> <p>A first step has been accomplished, with the completion of the Solidity parser and printer, written in OCaml with Menhir.</p> <p>This parser (and its printer companion) is now available as a standalone library under the LGPLv3 license with Linking Exception, allowing its integration in all projects.
The source code is available at https://gitlab.com/o-labs/solidity-parser-ocaml.</p> <p>Our parser should support all of Solidity 0.6, with the notable exception of inline assembly (which may be added in a future release).</p> <h2>Example contract</h2> <p>Here is an example of a very simple contract that stores an integer value and allows the contract’s owner to add an arbitrary value to this value, and any other contract to read this value:</p> <pre><code class="language-solidity">pragma solidity &gt;=0.6.0 &lt;0.7.0; contract C { address owner; int x; constructor() public { owner = msg.sender; x = 0; } function add(int d) public { require(msg.sender == owner); x += d; } function read_x() public view returns(int) { return x; } } </code></pre> <h2>Parser Usage</h2> <h3>Executable</h3> <p>Our parser comes with a small executable that demonstrates the library usage. Simply run:</p> <pre><code class="language-bash">./solp contract.sol </code></pre> <p>This will parse the file <code>contract.sol</code> and reprint it on the terminal.</p> <h3>Library</h3> <p>To use our parser as a library, add it to your program’s dependencies and use the following function:</p> <pre><code class="language-ocaml">Solidity_parser.parse_contract_file : string -&gt; Solidity_parser.Solidity_types.module_ </code></pre> <p>It takes a filename and returns a Solidity AST.</p> <p>If you wish to print this AST, you may turn it into its string representation by sending it to the following function:</p> <pre><code class="language-ocaml">Solidity_parser.Printer.string_of_code : Solidity_parser.Solidity_types.module_ -&gt; string </code></pre> <h2>Conclusion</h2> <p>Of course, all of this is Work In Progress, but we are quite happy to share it with the OCaml community. We think there is tremendous work to be done around blockchains for experts in formal methods. Do not hesitate to contact us if you want to use this library!</p> <h2>About Origin Labs</h2> <p>Origin Labs is a company founded in 2019 by the former blockchain team at OCamlPro. At Origin Labs, they have been developing Dune Network, a fork of the Tezos blockchain, its ecosystem, and applications over the Dune Network platform. At OCamlPro, they developed TzScan, the most popular block explorer at the time, Liquidity, a smart contract language, and were involved in the development of the core protocol and node. Do not hesitate to reach out by email: contact@origin-labs.com.</p> opam 2.1.0 alpha is here! https://ocamlpro.com/blog/2020_04_22_opam_2.1.0_alpha_is_here 2020-04-22T13:48:57Z 2020-04-22T13:48:57Z Raja Boujbel Louis Gesbert We are happy to announce an alpha for opam 2.1.0, one year and a half in the making after the release of 2.0.0. Many new features made it in (see the complete changelog or release note for the details), but here are a few highlights of this release. Release highlights The two following features have ... <p>We are happy to announce an alpha for opam 2.1.0, one year and a half in the making after the release of 2.0.0.</p> <p>Many new features made it in (see the <a href="https://github.com/ocaml/opam/blob/2.1.0-alpha/CHANGES">complete changelog</a> or <a href="https://github.com/ocaml/opam/releases/tag/2.1.0-alpha">release note</a> for the details), but here are a few highlights of this release.</p> <h2>Release highlights</h2> <p>The two following features have been around for a while as plugins and are now completely integrated in the core of opam.
No extra installs needed anymore, and a smoother experience.</p> <h3>Seamless integration of System dependencies handling (a.k.a. &quot;depexts&quot;)</h3> <p>A number of opam packages depend on tools or libraries installed on the system, which are outside the scope of opam itself. Previous versions of opam added a <a href="http://opam.ocaml.org/doc/Manual.html#opamfield-depexts">specification format</a>, and opam 2.0 already handled checking the OS and extracting the required system package names.</p> <p>However, the workflow generally involved letting opam fail once, then installing the dependencies and retrying, or explicitly using the <a href="https://github.com/ocaml/opam-depext">opam-depext plugin</a>, which was invaluable for CI but still incurred extra steps.</p> <p>With opam 2.1.0, <em>depexts</em> are seamlessly integrated, and you basically won't have to worry about them ahead of time:</p> <ul> <li>Before applying its course of actions, opam 2.1.0 checks that external dependencies are present, and will prompt you to install them. You are free to let it do it using <code>sudo</code>, or just run the provided commands yourself. </li> <li>It is resilient to <em>depexts</em> getting removed or out of sync. </li> <li>Opam 2.1.0 detects packages that depend on stuff that is not available on your OS version, and automatically avoids them. </li> </ul> <p>This is all fully configurable, and can be bypassed without tricky commands when you need it (<em>e.g.</em> when you compiled a dependency yourself).</p> <h3>Dependency locking</h3> <p>To share a project for development, it is often necessary to be able to reproduce the exact same environment and dependency settings — as opposed to allowing a range of versions as opam encourages you to do for releases.</p> <p>For some reason, most other package managers call this feature &quot;lock files&quot;. Opam can handle those in the form of <code>[foo.]opam.locked</code> files, and the <code>--locked</code> option.</p> <p>With 2.1.0, you no longer need a plugin to generate these files: just running <code>opam lock</code> will create them for existing <code>opam</code> files, enforcing the exact version of all dependencies (including locally pinned packages).</p> <p>If you check in these files, new users just have to run <code>opam switch create . --locked</code> on a fresh clone to get a local switch ready to build the project.</p> <h3>Pinning sub-directories</h3> <p>This one is completely new: fans of the <em>Monorepo</em>, rejoice, opam is now able to handle projects in subtrees of a repository.</p> <ul> <li>Using <code>opam pin PROJECT_ROOT --subpath SUB_PROJECT</code>, opam will look for <code>PROJECT_ROOT/SUB_PROJECT/foo.opam</code>. This will behave as a pinning to <code>PROJECT_ROOT/SUB_PROJECT</code>, except that the version-control handling is done in <code>PROJECT_ROOT</code>. </li> <li>Use <code>opam pin PROJECT_ROOT --recursive</code> to automatically look up all sub-trees with opam files and pin them. </li> </ul> <h3>Opam switches are now defined by invariants</h3> <p>Previous versions of opam defined switches based on <em>base packages</em>, which typically included a compiler, and were immutable. Opam 2.1.0 instead defines them in terms of an <em>invariant</em>, which is a generic dependency formula.</p> <p>This removes a lot of the rigidity <code>opam switch</code> commands had, with few changes to the existing commands.
For example, <code>opam upgrade ocaml</code> commands are now possible; you could also define the invariant as <code>ocaml-system</code> and have its version change along with the version of the OCaml compiler installed system-wide.</p> <h3>Configuring opam from the command-line</h3> <p>The new <code>opam option</code> command allows you to configure several options, without requiring manual editing of the configuration files.</p> <p>For example:</p> <ul> <li><code>opam option jobs=6 --global</code> will set the number of parallel build jobs opam is allowed to run (along with the associated <code>jobs</code> variable) </li> <li><code>opam option depext-run-commands=false</code> disables the use of <code>sudo</code> for handling system dependencies; it will be replaced by a prompt to run the installation commands. </li> </ul> <p>The command <code>opam var</code> is extended with the same format, acting on switch and global variables.</p> <h2>Try it!</h2> <p>In case you plan a possible rollback, you may want to first back up your <code>~/.opam</code> directory.</p> <p>The upgrade instructions are unchanged:</p> <ol> <li>Either from binaries: run </li> </ol> <pre><code class="language-shell-session">$~ bash -c &quot;sh &lt;(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) --version 2.1.0~alpha&quot; </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.1.0-alpha">the Github &quot;Releases&quot; page</a> and put it in your PATH.</p> <ol start="2"> <li>Or from source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.1.0-alpha#compiling-this-repo">README</a>. </li> </ol> <p>You should then run:</p> <pre><code class="language-shell-session">opam init --reinit -ni </code></pre> <p>This is still an alpha, so a few glitches or regressions are to be expected. Please report them to <a href="https://github.com/ocaml/opam/issues">the bug-tracker</a>. Thanks for trying it out, and we hope you enjoy it!</p> <blockquote> <p>NOTE: this article is cross-posted on <a href="https://opam.ocaml.org/blog/">opam.ocaml.org</a> and <a href="/blog">ocamlpro.com</a>.</p> </blockquote> opam 2.0.7 release https://ocamlpro.com/blog/2020_04_21_opam_2.0.7_release 2020-04-21T13:48:57Z 2020-04-21T13:48:57Z Raja Boujbel Louis Gesbert We are pleased to announce the minor release of opam 2.0.7. This new version contains backported small fixes: Escape Windows paths on manpages [#4129 @AltGr @rjbou] Fix opam installer opam file [#4058 @rjbou] Fix various warnings [#4132 @rjbou @AltGr - fix #4100] Fix dune 2.5.0 promote-install-files...
<p>We are pleased to announce the minor release of <a href="https://github.com/ocaml/opam/releases/tag/2.0.7">opam 2.0.7</a>.</p> <p>This new version contains <a href="https://github.com/ocaml/opam/pull/4143">backported</a> small fixes:</p> <ul> <li>Escape Windows paths on manpages [<a href="https://github.com/ocaml/opam/pull/4129">#4129</a> <a href="https://github.com/AltGr">@AltGr</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> <li>Fix opam installer opam file [<a href="https://github.com/ocaml/opam/pull/4058">#4058</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> <li>Fix various warnings [<a href="https://github.com/ocaml/opam/pull/4132">#4132</a> <a href="https://github.com/rjbou">@rjbou</a> <a href="https://github.com/AltGr">@AltGr</a> - fix <a href="https://github.com/ocaml/opam/issues/4100">#4100</a>] </li> <li>Fix dune 2.5.0 promote-install-files duplication [<a href="https://github.com/ocaml/opam/pull/4132">#4132</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> </ul> <hr /> <p>Installation instructions (unchanged):</p> <ol> <li>From binaries: run </li> </ol> <pre><code class="language-shell-session">bash -c &quot;sh &lt;(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) --version 2.0.7&quot; </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.0.7">the Github &quot;Releases&quot; page</a> to your PATH. In this case, don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update you sandbox script.</p> <ol start="2"> <li>From source, using opam: </li> </ol> <pre><code class="language-shell-session">opam update; opam install opam-devel </code></pre> <p>(then copy the opam binary to your PATH as explained, and don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update you sandbox script)</p> <ol start="3"> <li>From source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.0.7#compiling-this-repo">README</a>. </li> </ol> <p>We hope you enjoy this new minor version, and remain open to <a href="https://github.com/ocaml/opam/issues">bug reports</a> and <a href="https://github.com/ocaml/opam/issues">suggestions</a>.</p> <blockquote> <p>NOTE: this article is cross-posted on <a href="https://opam.ocaml.org/blog/">opam.ocaml.org</a> and <a href="/blog">ocamlpro.com</a>.</p> </blockquote> Le nouveau GC d’OCaml 4.10 : premier aperçu de la stratégie best-fit https://ocamlpro.com/blog/2020_03_24_fr_le_nouveau_gc_docaml_4.10_premier_apercu_de_la_strategie_best_fit 2020-03-24T13:48:57Z 2020-03-24T13:48:57Z Thomas Blanc An in-depth Look at OCaml’s new "Best-fit" Garbage Collector Strategy Le GC d’OCaml oeuvre discrètement à l’efficacité de vos allocations mémoire. Tel un héros de l’ombre, il reste méconnu de la plupart des hackers OCaml. Avec l’arrivée d’OCaml 4.10, il s’enrichit d’une nouvel... <p><a href="/blog/2020_03_23_in_depth_look_at_best_fit_gc"><img src="/blog/assets/img/logo_round_ocaml_search.png" alt="An in-depth Look at OCaml’s new &quot;Best-fit&quot; Garbage Collector Strategy" /></a></p> <p>Le GC d’OCaml oeuvre discrètement à l’efficacité de vos allocations mémoire. Tel un héros de l’ombre, il reste méconnu de la plupart des hackers OCaml. 
Avec l’arrivée d’OCaml 4.10, il s’enrichit d’une nouvelle stratégie apparue dans le <a href="https://ocaml.org/releases/4.10.0.html#Changes">changelog</a>, signée de Damien Doligez.</p> <p>Dans cet article nous commençons à explorer la nouvelle stratégie baptisée *best-fit *du nouveau Glaneur de Cellules dans OCaml 4.10.</p> <blockquote> <p>En savoir plus : <a href="/2020/03/23/ocaml-new-best-fit-garbage-collector/">article en anglais</a>.</p> </blockquote> An in-depth Look at OCaml’s new “Best-fit” Garbage Collector Strategy https://ocamlpro.com/blog/2020_03_23_in_depth_look_at_best_fit_gc 2020-03-23T13:48:57Z 2020-03-23T13:48:57Z Thomas Blanc An in-depth Look at OCaml’s new "Best-fit" Garbage Collector Strategy The Garbage Collector probably is OCaml’s greatest unsung hero. Its pragmatic approach allows us to allocate without much fear of efficiency loss. In a way, the fact that most OCaml hackers know little about it is a good sign:... <p><a href="/blog/2020_03_23_in_depth_look_at_best_fit_gc"><img src="/blog/assets/img/logo_round_ocaml_search.png" alt="An in-depth Look at OCaml’s new &quot;Best-fit&quot; Garbage Collector Strategy" /></a></p> <p>The Garbage Collector probably is OCaml’s greatest unsung hero. Its pragmatic approach allows us to allocate without much fear of efficiency loss. In a way, the fact that most OCaml hackers know little about it is a good sign: you want a runtime to gracefully do its job without having to mind it all the time.</p> <p>But as OCaml 4.10.0 has now hit the shelves, a very exciting feature is <a href="https://ocaml.org/releases/4.10.0.html#Changes">in the changelog</a>:</p> <blockquote> <p>#8809, #9292: Add a best-fit allocator for the major heap; still experimental, it should be much better than current allocation policies (first-fit and next-fit) for programs with large heaps, reducing both GC cost and memory usage. This new best-fit is not (yet) the default; set it explicitly with OCAMLRUNPARAM=&quot;a=2&quot; (or Gc.set from the program). You may also want to increase the <code>space_overhead</code> parameter of the GC (a percentage, 80 by default), for example OCAMLRUNPARAM=&quot;o=85&quot;, for optimal speed. (Damien Doligez, review by Stephen Dolan, Jacques-Henri Jourdan, Xavier Leroy, Leo White)</p> </blockquote> <p>At OCamlPro, some of the tools that we develop, such as the package manager <a href="https://opam.ocaml.org/">opam</a>, the <a href="https://alt-ergo.ocamlpro.com/">Alt-Ergo</a> SMT solver or the Flambda optimizer, can be quite demanding in memory usage, so we were curious to better understand the properties of this new allocator.</p> <h2>Minor heap and Major heap: the GC in a nutshell</h2> <p>Not all values are allocated equal. Some will only be useful for the span of local calculations, some will last as long as the program lives. To handle those two kinds of values, the runtime uses a <em>Generational Garbage Collector</em> with two spaces:</p> <ul> <li>The minor heap uses the <a href="https://en.wikipedia.org/wiki/Tracing_garbage_collection#Copying_vs._mark-and-sweep_vs._mark-and-don.27t-sweep">Stop-and-copy</a> principle. It is fast but has to stop the computation to perform a full iteration. </li> <li>The major heap uses the <a href="https://en.wikipedia.org/wiki/Tracing_garbage_collection#Na%C3%AFve_mark-and-sweep">Mark-and-sweep</a> principle. It has the perk of being incremental and behaves better for long-lived data. 
</li> </ul> <p>Allocation in the minor heap is straightforward and efficient: values are stored sequentially, and when there is no space left, the space is emptied, surviving values get allocated in the major heap while dead values are just forgotten for free. However, the major heap is a bit more tricky, since we will have random allocations and deallocations that will eventually produce scattered memory. This is called <a href="https://en.wikipedia.org/wiki/Fragmentation_(computing)">fragmentation</a>, and this means that you’re using more memory than necessary. Thankfully, the GC has two strategies to counter that problem:</p> <ul> <li>Compaction: a heavyweight reallocation of everything that will remove those holes in our heap. OCaml’s compactor is cleverly written to work in constant space, and would be worth its own specific article! </li> <li>Free-list Allocation: allocating newly incoming data in the holes (the free-list) in memory, de-scattering it in the process. </li> </ul> <p>Of course, asking the GC to be smarter about how it allocates data makes the GC slower. Coding a good GC is a subtle art: you need to have something smart enough to avoid fragmentation but simple enough to run as fast as possible.</p> <h2>Where and how to allocate: the 3 strategies</h2> <p>OCaml used to offer two free-list allocation strategies: <em>next-fit</em>, the default, and <em>first-fit</em>. Version 4.10 of OCaml introduces the new <em>best-fit</em> strategy. Let’s compare them:</p> <h3>Next-fit, the original and remaining champion</h3> <p>OCaml’s original (and default) “next-fit” allocation strategy is pretty simple:</p> <ol> <li>Keep a (circular) list of every hole in memory ordered by increasing addresses; </li> <li>Have a pointer to an element of that list; </li> <li>When an allocation is needed, if the currently pointed-at hole is big enough, allocate in it; </li> <li>Otherwise, try the next hole and so on. </li> </ol> <p>This strategy is extremely efficient, but a big hole might be fragmented with very small data while small holes stay unused. In some cases, the GC would trigger costly compactions that would have been avoidable.</p> <h3>First-fit, the unsuccessful contender</h3> <p>To counteract that problem, the “first-fit” strategy was implemented in 2008 (OCaml 3.11.0):</p> <ul> <li>Same idea as next-fit, but with an extra allocation table. </li> <li>Put the pointer back at the beginning of the list for each allocation. </li> <li>Use the allocation table to skip some parts of the list. </li> </ul> <p>Unfortunately, that strategy is slower than the previous one. This is an example of how making the GC smarter ends up making it slower. It does, however, reduce fragmentation. It was still useful to have this strategy at hand for the case where compaction would be too costly (on a 100GB heap, for instance). An application that requires low latency might want to disable compaction and use that strategy.</p> <h3>Best-fit: a new challenger enters!</h3> <p>This leads us to the brand new “best-fit” strategy. This strategy is actually composite and will have different behaviors depending on the size of the data you’re trying to allocate.</p> <ul> <li>On small data (up to 32 words), <a href="https://github.com/ocaml/ocaml/blob/trunk/runtime/freelist.c#L868">segregated free lists</a> will allow allocation in (mostly) constant time. </li> <li>On big data, a general best-fit allocator based on <a href="https://en.wikipedia.org/wiki/Splay_tree">splay trees</a>.
</li> </ul> <p>This allows for the best of both worlds, as you can easily allocate your numerous small blocks in the small holes in your memory while you take a bit more time to select a good place for your big arrays.</p> <p>How will best-fit fare? Let’s find out!</p> <h2>Try it!</h2> <p>First, let us remind you that this is still an experimental feature, which from the OCaml development team means “We’ve tested it thoroughly on different systems, but only for months and not on a scale as large as the whole OCaml ecosystem”.</p> <p>That being said, we’d advise you not to use it in production code yet.</p> <h3>Why you should try it</h3> <p>Making benchmarks of this new strategy could be beneficial for you and the language at large: the dev team is hoping for feedback, and the more quality feedback <strong>you</strong> give, the more the future GC will be tuned to your needs.</p> <p>In 2008, the first-fit strategy was released with the hope of improving memory usage by reducing fragmentation. However, the lack of feedback meant that the developers were not aware that it didn’t meet the users’ needs. If more feedback had been given, it’s possible that work on improving the strategy or on better strategies would have happened sooner.</p> <h3>Choosing the allocator strategy</h3> <p>Now, there are two ways to control the GC behavior: through the code or through environment variables.</p> <h4>First method: Adding instructions in your code</h4> <p>This method should be used by those of us who have code that already does some GC fine-tuning. As early as possible in your program, you want to execute the following lines:</p> <pre><code class="language-ocaml">let () = Gc.(set { (get()) with allocation_policy = 2; (* Use the best-fit strategy *) space_overhead = 100; (* Let the major GC work a bit less since it's more efficient *) }) </code></pre> <p>You might also want to add <code>verbose = 0x400;</code> or <code>verbose = 0x404;</code> in order to get some GC debug information. See <a href="https://caml.inria.fr/pub/docs/manual-ocaml/libref/Gc.html">here</a> for more details on how to use the <code>Gc</code> module.</p> <p>Of course, you’ll need to recompile your code, and this will apply only after the runtime has initialized itself, triggering a compaction in the process. Also, since you might want to easily switch between different allocation policies and overhead specifications, we suggest you use the second method.</p> <h4>Second method: setting <code>$OCAMLRUNPARAM</code></h4> <p>At OCamlPro, we develop and maintain a program that any OCaml developer should want to run smoothly. It’s called <a href="https://opam.ocaml.org/">Opam</a>, maybe you’ve heard of it? Though most commands take a few seconds, some <a href="https://opam.ocaml.org/doc/man/opam-admin-check.html">administrative-heavy</a> commands can be a strain on our computer.
In other words: those are perfect for a benchmark.</p> <p>Here’s what we did to benchmark Opam:</p> <pre><code class="language-shell-session">$ opam update $ opam switch create 4.10.0 $ opam install opam-devel # or build your own code $ export OCAMLRUNPARAM='b=1,a=2,o=100,v=0x404' $ cd my/local/opam-repository $ perf stat ~/.opam/4.10.0/lib/opam-devel/opam admin check --installability # requires right to execute perf, time can do the trick </code></pre> <p>If you want to compile and run your own benchmarks, here are a few details on <code>OCAMLRUNPARAM</code>:</p> <ul> <li><code>b=1</code> means “print the backtrace in case of uncaught exception” </li> <li><code>a=2</code> means “use best-fit” (default is <code>0</code>, first-fit is <code>1</code>) </li> <li><code>o=100</code> means “do less work” (default is <code>80</code>, lower means more work) </li> <li><code>v=0x404</code> means “have the GC be verbose” (<code>0x400</code> is “print statistics at exit”, <code>0x4</code> is “print when changing heap size”) </li> </ul> <p>See the <a href="https://caml.inria.fr/pub/docs/manual-ocaml/runtime.html#s%3Aocamlrun-options">manual</a> for more details on <code>OCAMLRUNPARAM</code>.</p> <p>You might want to compare how your code fares on all three different GC strategies (and fiddle a bit with the overhead to find your best configuration).</p> <h2>Our results on opam</h2> <p>Our contribution in this article is to benchmark <code>opam</code> with the different allocation strategies:</p> <figure><table><thead><tr><td>Strategy:</td><td>Next-fit</td><td>First-fit</td><td colspan="3" scope="colgroup">Best-fit</td></tr><tr><td>Overhead:</td><td>80</td><td>80</td><td>80</td><td>100</td><td>120</td></tr><tr><td>Cycles used (Gcycle)</td><td>2,040</td><td>3,808</td><td>3,372</td><td>2,851</td><td>2,428</td></tr><tr><td>Maximum heap size (kB)</td><td>793,148</td><td>793,148</td><td>689,692</td><td>689,692</td><td>793,148</td></tr><tr><td>User time (s)</td><td>674</td><td>1,350</td><td>1,217</td><td>1,016</td><td>791</td></tr></thead></table></figure> <p>A quick word on these results. Most of <code>opam</code>’s calculations are done by <a href="http://www.mancoosi.org/software/">dose</a> and rely heavily on small interconnected blocks. We don’t really have big chunks of data we want to allocate, so the new strategy won’t give us the boost you might see elsewhere, as this workload falls squarely into the best-case scenario of the next-fit strategy. As a matter of fact, for every strategy, we didn’t see a single GC compaction happen. However, Best-fit still allows for a lower memory footprint!</p> <h2>Conclusions</h2> <p>If memory usage is a concern for your software, you should definitely try the new Best-fit strategy and stay tuned for its future development. If your software requires good performance, knowing whether it performs better with Best-fit (and giving feedback on that) might help you in the long run.</p> <p>The different strategies are:</p> <ul> <li>Next-fit: generally good and fast, but has very bad worst cases with big heaps. </li> <li>First-fit: mainly useful for very big heaps that must avoid compaction as much as possible. </li> <li>Best-fit: almost the best of both worlds, with a small performance hit for programs that fit well with next-fit. </li> </ul> <p>Remember that whatever works best for you, it’s still better than having to <code>malloc</code> and <code>free</code> by hand.
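If you want to quickly compare strategies on your own program, a minimal sketch (relying only on the standard <code>Gc</code> module) is to dump the allocator statistics when the program exits, then run it under different <code>OCAMLRUNPARAM</code> settings:</p> <pre><code class="language-ocaml">(* Minimal sketch: print GC statistics at exit, so that runs with
   a=0, a=1 and a=2 can be compared. *)
let () = at_exit (fun () -&gt; Gc.print_stat stdout)
</code></pre> <p>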
Happy allocating!</p> <h1>Comments</h1> <p>gasche (23 March 2020 at 17 h 50 min):</p> <blockquote> <p>What about higher overhead values than 120, like 140, 160, 180 and 200?</p> </blockquote> <p>Thomas Blanc (23 March 2020 at 18 h 17 min):</p> <blockquote> <p>Because 100 was the overhead value Leo advised in the PR discussion I decided to put it in the results. As 120 got the same maximum heap size as next-fit I found it worth putting it in. Higher overhead values lead to faster execution time but a bigger heap.</p> <p>I don’t have my numbers at hand right now. You’re probably right that they are relevant (to you and Damien at least) but I didn’t want to have a huge table at the end of the post.</p> </blockquote> <p>nbbb (24 March 2020 at 11 h 18 min):</p> <blockquote> <p>Higher values would allow us to see if best-fit can reproduce the performance characteristics of next-fit, for some value of the overhead.</p> </blockquote> <p>nbbb (24 March 2020 at 16 h 51 min):</p> <blockquote> <p>I just realized that 120 already has a heap as bit as next-fit — so best-fit can’t get as good as next-fit in this example, and higher values of the overhead are not quite as informative. Should have read more closely the first time.</p> </blockquote> <p>Thomas Blanc (24 March 2020 at 16 h 55 min):</p> <blockquote> <p>Sorry that it wasn’t as clear as it could be.</p> <p>Note that opam and dose are in the best-case scenario of best-fit. Your own code would probably produce a different result and I encourage you to test it and communicate about it.</p> </blockquote> New version of TryOCaml in beta! https://ocamlpro.com/blog/2020_03_16_new_version_of_try_ocaml_in_beta 2020-03-16T13:48:57Z 2020-03-16T13:48:57Z Louis Gesbert We are happy to announce that our venerable "TryOCaml" service is being retired and replaced by a new, modern version based upon our work on Learn-OCaml. → Try it here ← The new interface provides an editor panel besides the familiar top-level, error and warning positions highlighting, the lates... <p><img src="/blog/assets/img/picture_new_tryocaml.jpeg" alt="" /></p> <p>We are happy to announce that our venerable &quot;TryOCaml&quot; service is being retired and replaced by a new, modern version based upon our work on <a href="https://github.com/ocaml-sf/learn-ocaml">Learn-OCaml</a>.</p> <p>→ <a href="https://try.ocamlpro.com">Try it here</a> ←</p> <p>The new interface provides an editor panel besides the familiar top-level, error and warning positions highlighting, the latest OCaml release (4.10.0), local storage of your session, and more.</p> <blockquote> <p>The service is still in beta, so it would be helpful if you could tell us about any hiccups you may encounter <a href="https://discuss.ocaml.org/t/ann-try-ocaml-2-0-beta">on the Discuss thread</a>.</p> </blockquote> <p>Let's read the testimony of Sylvain Conchon about our new version of TryOCaml:</p> <blockquote> <p>“TryOCaml saved our lives in Paris Saclay in these times of social distancing. I teach functional programming with OCaml to my Y2 Bachelor’s Degree students. With the quarantine in place, we weren’t able to host the practical assignment in the machine room as usual, so we decided the students would do the exam at home. However, many of our students use Windows on which setting up OCaml is a hassle, or otherwise encountered problems while setting up the OCaml environment. We invited our students to use try-ocaml instead! 
Many have and the exam went really smoothly.”</p> </blockquote> Réunion annuelle du Club des utilisateurs d’Alt-Ergo https://ocamlpro.com/blog/2020_03_03_reunion_annuelle_du_club_des_utilisateurs_dalt_ergo 2020-03-03T13:48:57Z 2020-03-03T13:48:57Z Aurore Dromby Alt-Ergo meeting Logo Alt-Ergo La deuxième réunion annuelle du Club des utilisateurs d’Alt-Ergo a eu lieu à la mi-février ! Notre réunion annuelle est l’endroit idéal pour passer en revue les besoins de chaque partenaire concernant Alt-Ergo. Cette année, nous avons eu le plaisir de recevo... <p><img src="/blog/assets/img/altergo-meeting.jpeg" alt="Alt-Ergo meeting" /> <img src="/assets/img/logo_altergo.png" alt="Logo Alt-Ergo" /></p> <p>La deuxième réunion annuelle du Club des utilisateurs d’Alt-Ergo a eu lieu à la mi-février ! Notre réunion annuelle est l’endroit idéal pour passer en revue les besoins de chaque partenaire concernant Alt-Ergo. Cette année, nous avons eu le plaisir de recevoir nos partenaires pour discuter de la feuille de route concernant les développements et les améliorations futures d’Alt-Ergo.</p> <blockquote> <p>Alt-Ergo est un démonstrateur automatique de formules mathématiques, créé au <a href="https://www.lri.fr/">LRI</a> et développé par OCamlPro depuis 2013. Pour en savoir plus ou rejoindre le Club, visitez le site <a href="https://alt-ergo.ocamlpro.com">https://alt-ergo.ocamlpro.com/</a>.</p> </blockquote> <p>Notre Club a plusieurs objectifs, le premier étant de garantir la pérennité d’Alt-Ergo en favorisant la collaboration entre les membres du Club et en renforçant la collaboration avec les communautés de méthodes formelles telles que Why3. L’une de nos priorités est d’augmenter le nombre d’utilisateurs de notre outil en l’étendant à de nouveaux domaines tels que le Model Checking, la participation à des compétitions internationales étant également un moyen de gagner en visibilité. Enfin, le dernier objectif du Club est de trouver de nouveaux projets ou contrats pour le développement de fonctionnalités à long terme.</p> <p>Nous remercions tous nos membres pour leur soutien et souhaitons la bienvenue à Mitsubishi Electric R&amp;D Centre Europe qui rejoint AdaCore et le CEA List en tant que membre du Club cette année. Nous souhaitons également mettre en lumière l’équipe de développement <a href="http://why3.lri.fr/">Why3</a> avec laquelle nous travaillons pour améliorer nos outils.</p> <p>Nos membres sont particulièrement intéressés par les points suivants :</p> <p>– Une meilleure génération de modèles et de contre-exemples</p> <p>– L’ajout de la théorie des séquences</p> <p>– L’amélioration du support de l’arithmétique non linéaire dans Alt-Ergo</p> <p>Ces fonctionnalités sont maintenant nos principales priorités. Pour suivre nos avancement et les nouveautés, n’hésitez pas à lire nos <a href="category/formal_methods">articles</a> sur ce blog.</p> 2019 chez OCamlPro https://ocamlpro.com/blog/2020_02_05_fr_2019_chez_ocamlpro 2020-02-05T13:48:57Z 2020-02-05T13:48:57Z OCamlPro 2019 at OCamlPro OCamlPro a pour ambition d’aider les industriels dans leur adoption du langage OCaml et des méthodes formelles. L’entreprise est passée d’1 à 21 personnes et est restée fidèle à cet objectif. L’année 2019 chez OCamlPro a été très animée, et le nombre de réalisati... <p><img src="/blog/assets/img/logo_ocp_2019.png" alt="2019 at OCamlPro" /></p> <p>OCamlPro a pour ambition d’aider les industriels dans leur adoption du langage OCaml et des méthodes formelles. 
L’entreprise est passée d’1 à 21 personnes et est restée fidèle à cet objectif. L’année 2019 chez OCamlPro a été très animée, et le nombre de réalisations impressionnant, d’abord dans le monde OCaml (flambda2 &amp; optimisations du compilateur, opam 2, notre interface Rust pour memprof, des outils comme tryOCaml, ocp-indent, et le soutien à la OCaml Software Foundation), et dans le monde des méthodes formelles (nouvelles versions de notre solveur SMT Alt-Ergo, lancement du Club des utilisateurs Alt-Ergo,lancement du langage Love, etc.)</p> <p><a href="/2020/02/04/2019-at-ocamlpro/">Lire la suite (en anglais)</a></p> 2019 at OCamlPro https://ocamlpro.com/blog/2020_02_04_2019_at_ocamlpro 2020-02-04T13:48:57Z 2020-02-04T13:48:57Z Muriel OCamlPro 2019 at OCamlPro OCamlPro was created to help OCaml and formal methods spread into the industry. We grew from 1 to 21 engineers, still strongly sharing this ambitious goal! The year 2019 at OCamlPro was very lively, with fantastic accomplishments all along! Let's quickly review the past years' works... <p><img src="/blog/assets/img/logo_ocp_2019.png" alt="2019 at OCamlPro" /></p> <p>OCamlPro was created to help OCaml and formal methods spread into the industry. We grew from 1 to 21 engineers, still strongly sharing this ambitious goal! The year 2019 at OCamlPro was very lively, with fantastic accomplishments all along!</p> <p>Let's quickly review the past years' works, first in the world of <a href="#ocaml">OCaml</a> (<a href="#compilation">flambda2</a> &amp; compiler optimisations, <a href="#opam">opam</a> 2, our <a href="#rust">Rust-based</a> UI for <a href="#memthol">memprof</a>, tools like tryOCaml, ocp-indent, and supporting the <a href="#ocsf">OCaml Software Foundation</a>), then in the world of <a href="#formalmethods">formal methods</a> (new versions of our SMT Solver <a href="#altergo">Alt-Ergo</a>, launch of the <a href="#altergoclub">Alt-Ergo Users' Club</a>, the <a href="#love">Love language</a>, etc.).</p> <h2>In the World of OCaml</h2> <p><img src="/blog/assets/img/logo_ocaml.png" alt="ocaml" /></p> <h3>Flambda/Compilation Team</h3> <p><em>Work by Pierre Chambart, Vincent Laviron, Guillaume Bury and Pierrick Couderc</em></p> <p>Pierre and Vincent's considerable work on Flambda 2 (the optimizing intermediate representation of the OCaml compiler – on which inlining occurs), in close cooperation with Jane Street (Mark Shinwell, Leo White and their team) aims at overcoming some of flambda's limitations. We have continued our work on making OCaml programs always faster: internal types are clearer, more concise, and possible control flow transformations are more flexible. Overall a precious outcome for industrial users. In 2019, the major breakthrough was to go from the initial prototype to a complete compiler, which allowed us to compile simple examples first and then to bootstrap it.</p> <p>On the OCaml compiler side, we also worked with Leo on two new features: functorized compilation units and functorized packs, and recursive packs. The former will allow any developer to implement <code>.ml</code> files as if they were functors and not simply modules, and more importantly generate packs that are real functors. As such, this allows to split big functors into several files or to parameterize libraries on other modules. 
The latter allows two distinct usages: recursive interfaces, to implement recursive types into distinct <code>.mli</code> files as long as they do not need any implementation; and recursive packs, whose components are typed and compiled as recursive modules.</p> <ul> <li>These new features are described on the new <a href="https://github.com/ocaml/RFCs/pull/11">RFC repository</a> for OCaml (a <a href="https://github.com/ocaml/ocaml/issues/5283">similar idea</a> was suggested and implemented in 2011 by Fabrice Le Fessant). </li> <li>The implementation is available on GitHub for both <a href="https://github.com/OCamlPro-Couderc/ocaml/tree/functorized-packs">functorized packs</a> and <a href="https://github.com/OCamlPro-Couderc/ocaml/tree/recursive-units+pack-cleanup">recursive packs</a>. Be aware that both are based on an old version of OCaml for now, but should be in sync with the current trunk in the near future. </li> <li>See also Vincent's <a href="/blog/2019_08_30_ocamlpros_compiler_team_work_update">OCamlPro’s compiler team work update</a> of August 2019. </li> </ul> <p><em>This work is allowed thanks to Jane Street's funding.</em></p> <h3>Work on a formalized type system for OCaml</h3> <p><em>Work of Pierrick Couderc</em></p> <p>At the end of 2018, Pierrick defended his PhD on &quot;<a href="https://pastel.archives-ouvertes.fr/tel-02100717/">Checking type inference results of the OCaml language</a>&quot;, leading to a formalized type system and semantics for a large subset of OCaml, or at least its unique typed intermediate language: the Typedtree. This work led us to work on new aspects of the OCaml compiler, such as the recursive and functorized packs described earlier, and we hope it proves useful in the future for the evolution of the language.</p> <h3>The OPAM package manager</h3> <p><em>Work of Raja Boujbel and Louis Gesbert</em></p> <p><img src="/blog/assets/img/logo_opam_300_261.png" alt="opam" /></p> <p><a href="https://opam.ocaml.org">OPAM</a> is maintained and developed at OCamlPro by Louis and Raja. Thanks to their thorough efforts, the opam 2.1 first release candidate is soon to be published!</p> <p>Back in 2018, the long-awaited opam 2.0 version was finally released. It embedded many changes, in opam itself as well as for the community. The opam file format was redefined to simplify it and add new features. With the close collaboration of OCamlLabs and the opam repository maintainers, we were able to manage a smooth transition of the repository and the whole ecosystem from the opam 1.2 format to the new – and richer – opam 2.0 format. Other emblematic features include the practically integrated mccs solver, sandboxed builds for better security (we care about your filesystem!), a reworked <code>pin</code> command for usability, etc.</p> <p>While the 2.1.0 version is in preparation, the 2.0.0 version is still updated with minor releases to fix issues. The latest 2.0.6 release is fresh from January.</p> <p>In the meantime, we continued to improve opam by integrating some opam plugins (opam lock, opam depext), recursively discovering opam files in the file tree when pinning, adding a new definition of a switch compiler, the possibility to use the z3 backend instead of mccs, etc.</p> <p>All these new features – among others – will be integrated in the 2.1.0 release, whose beta is planned for February.
The best is yet to come!</p> <ul> <li>More details: on <a href="https://opam.ocaml.org">https://opam.ocaml.org</a> </li> <li>Releases on Releases on <a href="https://github.com/ocaml/opam/releases">https://github.com/ocaml/opam/releases</a> &amp; <a href="https://opam.ocaml.org/blog/opam-2-0-6/">our blog</a> </li> </ul> <p><em>This work is allowed thanks to Jane Street's funding.</em></p> <h3>Encouraging OCaml adoption</h3> <h4>OCaml Expert trainings for professional programmers</h4> <p>We proposed in 2019 some <a href="/course_ocaml_expert">OCaml expert training</a> specially designed for developers who want to use advanced features and master all the open-source tools and libraries of OCaml.</p> <blockquote> <p>The &quot;Expert&quot; OCaml course is for already experienced OCaml programmers to better understand advanced type system possibilities (objects, GADTs), discover GC internals, write &quot;compiler-optimizable&quot; code. These sessions are also an opportunity to come discuss with our OPAM &amp; Flambda lead developers and core contributors in Paris.</p> </blockquote> <p>Next session: 3-4 March 2020, Paris <a href="https://www.ocamlpro.com/pre-inscription-a-une-session-de-formation-inter-entreprises/">(registration)</a></p> <h4>Our cheat-sheets on OCaml, the stdlib and opam</h4> <p><em>Work of Thomas Blanc, Raja Boujbel and Louis Gesbert</em></p> <p>Thomas announced the release of our up-to-date cheat-sheets for the <a href="/blog/2019_09_13_updated_cheat_sheets_language_stdlib_2">OCaml language, standard library</a> and <a href="https://ocamlpro.github.io/ocaml-cheat-sheets/ocaml-opam.pdf">opam</a>. Our original cheat-sheets were dating back to 2011. This was an opportunity to update them after the <a href="/blog/2019_09_13_updated_cheat_sheets_language_stdlib_2">many changes</a> in the language, library and ecosystem overall.</p> <blockquote> <p>Cheat-sheets are helpful to refer to, as an overview of the documentation when you are programming, especially when you’re starting in a new language. They are meant to be printed and pinned on your wall, or to be kept in handy on a spare screen. <em>They come in handy when your <a href="https://rubberduckdebugging.com/">rubber duck</a> is rubbish at debugging your code!</em></p> </blockquote> <p>More details on <a href="/blog/2019_09_13_updated_cheat_sheets_language_stdlib_2">Thomas' blog post</a></p> <h4>Open Source Tooling and Web IDEs</h4> <p>And let's not forget the other tools we develop and maintain! We have tools for education such as our interactive editor OCaml-top and <a href="https://try.ocamlpro.com/new.html">Try-OCaml</a> (from the previous work on the learn-OCaml platform for the OCaml Fun MOOC) which you can use to code in your browser. Developers will appreciate tools like our indentation tool ocp-indent, and ocp-index which gives you easy access to the interface information of installed OCaml libraries for editors like Emacs and Vim.</p> <h3>Supporting the OCaml Software Foundation</h3> <p>OCamlPro was proud to be one of the first supporters of the new Inria's <a href="https://ocaml-sf.org/">OCaml Software Foundation.</a> We keep committed to the adoption of OCaml as an industrial language:</p> <blockquote> <p>&quot;[…] As a long-standing supporter of the OCaml language, we have always been committed to helping spread OCaml's adoption and increase the accessibility of OCaml to beginners and students. 
[…] We value close and friendly collaboration with the main actors of the OCaml community, and are proud to be contributing to the OCaml language and tooling.&quot; (August 2019, Advisory Board of the OCSF, ICFP Berlin)</p> </blockquote> <p>More information on the <a href="https://ocaml-sf.org/">OCaml Software Foundation</a></p> <h2>In the World of Formal Methods</h2> <p><em>By Mohamed Iguernlala, Albin Coquereau, Guillaume Bury</em></p> <p>In 2018, we welcomed five new engineers with a background in formal methods. They consolidate the department of formal methods at OCamlPro, in particular help develop and maintain our SMT solver Alt-Ergo.</p> <h3>Release of Alt-Ergo 2.3.0, and version 2.0.0 (free)</h3> <p>After the release of <a href="/blog/2018_04_23_release_of_alt_ergo_2_2_0">Alt-Ergo 2.2.0</a> (with a new front-end that supports the SMT-LIB 2 language, extended prenex polymorphism, implemented as a standalone library) came the version 2.3.0 in 2019 with new features : dune support, ADT / algebraic datatypes, improvement of the if-then-else and let-in support, improvement of the data types.</p> <ul> <li>More information on the <a href="https://alt-ergo.ocamlpro.com/">Alt-Ergo SMT Solver</a> </li> <li>Albin Coquereau defended his PhD thesis in Decembre 2019 &quot;Improving performance of the SMT solver Alt-Ergo with a better integration of efficient SAT solver&quot; </li> <li>We participated in the SMT-COMP 2019 during the 22nd SAT conference. The results of the competition are detailed <a href="/blog/2019_07_09_alt_ergo_participation_to_the_smt_comp_2019">here.</a> </li> </ul> <h3>The launch of the Alt-Ergo Users' Club</h3> <p>Getting closer to our users, gathering industrial and academic supporters, collecting their needs into the Alt-Ergo roadmap is key to Alt-Ergo's development and sustainability.</p> <p>The <a href="https://alt-ergo.ocamlpro.com/#club">Alt-Ergo Users' Club</a> was officially launched beginning of 2019. The first yearly meeting took place in February 2019. We were happy to welcome our first members <a href="https://www.adacore.com">Adacore</a>, <a href="https://www-list.cea.fr/en/">CEA List</a>, <a href="https://trust-in-soft.com">Trust-In-Soft</a>, and now Mitsubishi MERCE.</p> <p>More information on the <a href="https://alt-ergo.ocamlpro.com/#club">Alt-Ergo Users' Club</a></p> <p><img src="/blog/assets/img/logo_love_couleur.png" alt="Love-language" /></p> <h2>Harnessing our language-design expertise: Love</h2> <p><em>Work by David Declerck &amp; Steven de Oliveira</em></p> <p>Following the launch of Dune network, the Love language for smart-contracts was born from the collaboration of OCamlPro and Origin Labs. This new language, whose syntax is inspired from OCaml and Liquidity, is an alternative to the Dune native smart contract language Michelson. Love is based on system-F, a type system requiring no type inference and allowing polymorphism. 
The language has successfully been integrated on the network and the first smart contracts are being written.</p> <p><a href="https://medium.com/dune-network/love-a-new-smart-contract-language-for-the-dune-network-a217ab2255be">LOVE: a New Smart Contract Language for the Dune Network</a> <a href="https://medium.com/dune-network/the-love-smart-contract-language-introduction-key-features-part-i-949d8a4e73c3">The Love Smart Contract Language: Introduction &amp; Key Features — Part I</a></p> <h2>Rust-related activities</h2> <p>The OCaml &amp; Rust combo <em>should</em> be a candidate for any ambitious software project!</p> <ul> <li>A Rust-based UI for memprof: we started in 2019 to work in collaboration with the memprof developer team on a Rust based UI for memprof. See Pierre and Albin's exposé at the <a href="https://jfla.inria.fr/jfla2020.html">JFLA2020</a>'s &quot;Gardez votre mémoire fraiche avec Memthol&quot; (Pierre Chambart , Albin Coquereau and Jacques-Henri Jourdan) </li> <li><a href="/course_rust_vocational_training">Rust training</a> : <em>Rust borrows heavily from functional programming languages to provide very expressive abstraction mechanisms. Because it is a systems language, these mechanisms are almost always zero-cost. For instance, polymorphic code has no runtime cost compared to a monomorphic version.This concern for efficiency also means that Rust lets developers keep a very high level of control and freedom for optimizations. Rust has no Garbage Collection or any form of runtime memory inspection to decide when to free, allocate or re-use memory. But because manual memory management is almost universally regarded as dangerous, or at least very difficult to maintain, the Rust compiler has a borrow-checker which is responsible for i) proving that the input program is memory-safe (and thread-safe), and ii) generating a safe and “optimal” allocation/deallocation strategy. All of this is done at compile-time.</em> </li> <li>Next sessions: April 20-24th 2020 <a href="https://www.ocamlpro.com/pre-inscription-a-une-session-de-formation-inter-entreprises/">(registration)</a> </li> </ul> <h2>OCamlPro around the world</h2> <p>OCamlPro's team members attended many events throughout the world:</p> <ul> <li><a href="https://icfp19.sigplan.org/">ICFP 2019</a> (Berlin) </li> <li>The <a href="https://dpt-info.u-strasbg.fr/~magaud/JFLA2019/lieu.html">JFLA’2019</a> (Les Rousses, Haut-Jura) </li> <li>The<a href="https://www.opensourcesummit.paris/"> POSS'2019 </a>(Paris) </li> <li><a href="https://retreat.mirage.io/">MirageOS Retreat</a> (Marrakech) </li> </ul> <p>As a committed member of the OCaml ecosystem's animation, we've organized OCaml meetups too (see the famous <a href="https://www.meetup.com/fr-FR/ocaml-paris/">OUPS</a> meetups in Paris!).</p> <p>Now let's jump into the new year 2020, with a team keeping expanding, and new projects ahead: keep posted!</p> <h3>Past projects: blockchain-related achievements (2018-beginning of 2019)</h3> <p>Many people ask us about what happened in 2018! That was an incredibly active year on blockchain-related achievements, and at that time we were hoping to attract clients that would be interested in our blockchain expertise.</p> <p>But that is <a href="https://files.ocamlpro.com/Flyer_Blockchains_OSIS2017ok.pdf">history</a> now! Still interested? 
Check the <a href="https://www.origin-labs.com/">Origin Labs</a> team and their partner <a href="https://www.thegara.ge/">The Garage</a> on <a href="https://dune.network">Dune Network</a>!</p> <p>For the <a href="/blog/2019_04_29_blockchains_at_ocamlpro_an_overview">record</a>:</p> <ul> <li>(April 2019) We had started Techelson: a testing framework for Michelson and Liquidity </li> <li>(Nov 2018) <a href="/blog/2018_11_21_an_introduction_to_tezos_rpcs_signing_operations">An Introduction to Tezos RPCs: Signing Operations</a> / <a href="/blog/2018_11_15_an-introduction_to_tezos_rpcs_a_basic_wallet">An Introduction to Tezos RPCs: a Basic Wallet</a> / <a href="/blog/2018_11_06_liquidity_tutorial_a_game_with_an_oracle_for_random_numbers">Liquidity Tutorial: A Game with an Oracle for Random Numbers</a> / <a href="/blog/2018_11_08_first_open_source_release_of_tzscan">First Open-Source Release of TzScan</a> </li> <li>(Oct 2018) <a href="/blog/2018_10_17_ocamlpros_tzscan_grant_proposal_accepted_by_the_tezos_foundation_joint_press_release">OCamlPro’s TZScan grant proposal accepted by the Tezos Foundation – joint press release</a> </li> <li>(Jul 2018) <a href="/blog/2018_07_20_new_updates_on_tzscan_2">OCamlPro’s Tezos block explorer TzScan’s last updates</a> </li> <li>(Feb 2018) <a href="/blog/2018_02_14_release_of_a_first_version_of_tzscan_io_a_tezos_block_explorer">Release of a first version of TzScan.io, a Tezos block explorer</a> / <a href="/blog/2018_11_06_liquidity_tutorial_a_game_with_an_oracle_for_random_numbers">OCamlPro’s Liquidity-lang demo at JFLA2018 – a smart-contract design language</a> . We were developing <a href="https://www.liquidity-lang.org/">Liquidity</a>, a high level smart contract language, human-readable, purely functional, statically-typed, which syntax was very close to the OCaml syntax. </li> <li>To garner interest and adoption, we also developed the online editor <a href="https://www.liquidity-lang.org/edit">Try Liquidity</a>. Smart-contract developers could design contracts interactively, directly in the browser, compile them to Michelson, run them and deploy them on the alphanet network of Tezos. Future plans included a full-fledged web-based IDE for Liquidity. Worth mentioning was a neat feature: decompiling a Michelson program back to its Liquidity version, whether it was generated from Liquidity code or not. </li> </ul> opam 2.0.6 release https://ocamlpro.com/blog/2020_01_16_opam_2.0.6_release 2020-01-16T13:48:57Z 2020-01-16T13:48:57Z Raja Boujbel Louis Gesbert We are pleased to announce the minor release of opam 2.0.6. This new version contains some small backported fixes and build update: Don't remove git cache objects that may be used [#3831 @AltGr] Don't include .gitattributes in index.tar.gz [#3873 @dra27] Update FAQ uri [#3941 @dra27] Lock: add warni... 
<p>We are pleased to announce the minor release of <a href="https://github.com/ocaml/opam/releases/tag/2.0.6">opam 2.0.6</a>.</p> <p>This new version contains some small <a href="https://github.com/ocaml/opam/pull/3973">backported</a> fixes and build update:</p> <ul> <li>Don't remove git cache objects that may be used [<a href="https://github.com/ocaml/opam/pull/3831">#3831</a> <a href="https://github.com/AltGr">@AltGr</a>] </li> <li>Don't include .gitattributes in index.tar.gz [<a href="https://github.com/ocaml/opam/pull/3873">#3873</a> <a href="https://github.com/dra27">@dra27</a>] </li> <li>Update FAQ uri [<a href="https://github.com/ocaml/opam/pull/3941">#3941</a> <a href="https://github.com/dra27">@dra27</a>] </li> <li>Lock: add warning in case of missing locked file [<a href="https://github.com/ocaml/opam/pull/3939">#3939</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> <li>Directory tracking: fix cached entries retrieving with precise tracking [<a href="https://github.com/ocaml/opam/pull/4038">#4038</a> <a href="https://github.com/hannesm">@hannesm</a>] </li> <li>Build: <ul> <li>Add sanity checks [<a href="https://github.com/ocaml/opam/pull/3934">#3934</a> <a href="https://github.com/dra27">@dra27</a>] </li> <li>Build man pages using dune [<a href="https://github.com/ocaml/opam/issues/3902">#3902</a> ] </li> <li>Add patch and bunzip check for make cold [<a href="https://github.com/ocaml/opam/pull/4006">#4006</a> <a href="https://github.com/rjbou">@rjbou</a> - fix <a href="https://github.com/ocaml/opam/issues/3842">#3842</a>] </li> </ul> </li> <li>Shell: <ul> <li>fish: add colon for fish manpath [<a href="https://github.com/ocaml/opam/pull/3886">#3886</a> <a href="https://github.com/rjbou">@rjbou</a> - fix <a href="https://github.com/ocaml/opam/issues/3878">#3878</a>] </li> </ul> </li> <li>Sandbox: <ul> <li>Add dune cache as rw [<a href="https://github.com/ocaml/opam/pull/4019">#4019</a> <a href="https://github.com/rjbou">@rjbou</a> - fix <a href="https://github.com/ocaml/opam/issues/4012">#4012</a>] </li> <li>Do not fail if $HOME/.ccache is missing [<a href="https://github.com/ocaml/opam/pull/3957">#3957</a> <a href="https://github.com/mseri">@mseri</a>] </li> </ul> </li> <li>opam-devel file: avoid copying extraneous files in opam-devel example [<a href="https://github.com/ocaml/opam/pull/3999">#3999</a> <a href="https://github.com/maroneze">@maroneze</a>] </li> </ul> <p>As <strong>sandbox scripts</strong> have been updated, don't forget to run <code>opam init --reinit -ni</code> to update yours.</p> <blockquote> <p>Note: To homogenise macOS name on system detection, we decided to keep <code>macos</code>, and convert <code>darwin</code> to <code>macos</code> in opam. For the moment, to not break jobs &amp; CIs, we keep uploading <code>darwin</code> &amp; <code>macos</code> binaries, but from the 2.1.0 release, only <code>macos</code> ones will be kept.</p> </blockquote> <hr /> <p>Installation instructions (unchanged):</p> <ol> <li>From binaries: run </li> </ol> <pre><code class="language-sheel-session">bash -c &quot;sh &lt;(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) --version 2.0.6&quot; </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.0.6">the Github &quot;Releases&quot; page</a> to your PATH. 
In this case, don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update you sandbox script.</p> <ol start="2"> <li>From source, using opam: </li> </ol> <pre><code class="language-shell-session">opam update; opam install opam-devel </code></pre> <p>(then copy the opam binary to your PATH as explained, and don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update you sandbox script)</p> <ol start="3"> <li>From source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.0.6#compiling-this-repo">README</a>. </li> </ol> <p>We hope you enjoy this new minor version, and remain open to <a href="https://github.com/ocaml/opam/issues">bug reports</a> and <a href="https://github.com/ocaml/opam/issues">suggestions</a>.</p> <blockquote> <p>NOTE: this article is cross-posted on <a href="https://opam.ocaml.org/blog/">opam.ocaml.org</a> and <a href="/blog">ocamlpro.com</a>.</p> </blockquote> The Opam 2.0 cheatsheet, with a new theme! https://ocamlpro.com/blog/2020_01_10_opam_2.0_cheat_sheet 2020-01-10T13:48:57Z 2020-01-10T13:48:57Z Thomas Blanc The Opam 2.0 cheatsheet, with a new theme! Earlier, we dusted-off our Language and Stdlib cheatsheets, for teachers and students. With more time, we managed to design an Opam 2.0 cheat-sheet we are proud of. It is organized into two pages: The everyday average Opam use: Installation, Configuration, ... <p><a href="/blog/2020_01_10_opam_2.0_cheat_sheet"><img src="/blog/assets/img/logo_opam_blue.png" alt="The Opam 2.0 cheatsheet, with a new theme!" /></a></p> <p><a href="/blog/2019_09_13_updated_cheat_sheets_language_stdlib_2">Earlier</a>, we dusted-off our Language and Stdlib cheatsheets, for teachers and students. With more time, we managed to design an Opam 2.0 cheat-sheet we are proud of. It is organized into two pages:</p> <ul> <li>The everyday average Opam use: <ul> <li>Installation, Configuration, Switches, Allowed URL formats, Packages, Exploring, Package pinning, Working with local pins, Sharing a dev setup, Configuring remotes. </li> </ul> </li> <li>Peculiar advanced use cases (opam-managed project, publishing, repository maintenance, etc.): <ul> <li>Package definition files, Some optional fields, Expressions, External dependencies, Publishing, Repository administration. </li> </ul> </li> </ul> <p>Moreover, with the help of listings, we tried the use of colors for better readability. And we left some blank space for your own peculiar commands. Two versions are available (PDF):</p> <ul> <li>The Opam cheatsheet in <a href="https://ocamlpro.github.io/ocaml-cheat-sheets/ocaml-opam-bw.pdf">black &amp; white</a> </li> <li>The Opam cheatsheet in <a href="https://ocamlpro.github.io/ocaml-cheat-sheets/ocaml-opam.pdf">colour</a>. </li> </ul> <p>In any case do not hesitate to send us your suggestions on <a href="https://github.com/OCamlPro/ocaml-cheat-sheets">github</a>:</p> <ul> <li>Louis and Raja, the lead Opam developers, designed this cheatsheet so as to shed light on some important features (some I even discovered even though I speak daily with them!). If a command <em>you</em> find useful is not mentioned, let us know and we’ll add it. Feel free to ask for clarification and/or expansion of the manual! </li> </ul> <p>Happy hacking!</p> <blockquote> <p>Note: If you come to one of our <a href="https://training.ocamlpro.com/">training sessions</a>, you’ll get a free cheatsheet! 
Isn’t that a bargain?</p> </blockquote> Des nouvelles de la part de l'équipe compilateur d'OCamlPro https://ocamlpro.com/blog/2019_09_30_fr_travaux_sur_le_compilateur_ocaml_dernieres_nouvelles 2019-09-30T13:48:57Z 2019-09-30T13:48:57Z Vincent Laviron Nous sommes heureux de présenter certains travaux en cours sur le compilateur OCaml, travaux menés en étroite collaboration avec notre partenaire et client Janestreet. Un travail conséquent a été fait pour aboutir à un nouveau framework d’optimisation du compilateur, appelé Flambda2, dont ... <p><img src="/blog/assets/img/picture_cpu_compiler.jpeg" alt="" /></p> <p>Nous sommes heureux de présenter certains travaux en cours sur le compilateur OCaml, travaux menés en étroite collaboration avec notre partenaire et client Janestreet.</p> <p>Un travail conséquent a été fait pour aboutir à un nouveau framework d’optimisation du compilateur, appelé Flambda2, dont nous espérons qu’il corrigera certains défauts apparus dans Flambda. En parallèle, l’équipe a mené à bien certaines améliorations immédiates sur Flambda, ainsi que des modifications du compilateur qui seront utiles pour Flambda2.</p> <p>Voir (en anglais) : <a href="/2019/08/30/ocamlpros-compiler-team-work-update/">OCamlPro’s compiler team work update</a></p> Formations OCaml par OCamlPro : 5-6 et 7-8 novembre 2019 https://ocamlpro.com/blog/2019_09_26_fr_formations_ocaml_par_ocamlpro_5_6_et_7_8_novembre_2019 2019-09-26T13:48:57Z 2019-09-26T13:48:57Z OCamlPro OCamlPro lance un cycle de formations régulières à OCaml, en français, dans ses locaux parisiens (métro Alésia). La première session aura lieu début novembre 2019, avec 2 formations: Formation débutant : passer à OCaml (5-6 novembre) Formation expert : approfondir sa maîtrise du langage (... <p><img src="/blog/assets/img/trainings_2019.png" alt="" /></p> <p>OCamlPro lance un cycle de formations régulières à OCaml, en français, dans ses locaux parisiens (métro Alésia). La première session aura lieu début novembre 2019, avec 2 formations:</p> <ul> <li>Formation débutant : <a href="/formation-passer-a-ocaml/">passer à OCaml</a> (5-6 novembre) </li> <li>Formation expert : <a href="/formation-expert-ocaml/">approfondir sa maîtrise du langage</a> (7-8 novembre). </li> </ul> <p>La formation expert sera l’occasion pour des programmeurs OCaml ayant déjà une certaine expérience de mieux comprendre les possibilités avancées du typage (objets, GADTs), de découvrir en détail le fonctionnement du GC et d’écrire du code optimisable par le compilateur.</p> <p>Ces formations sont aussi une occasion de venir discuter avec les lead développeurs et contributeurs d’OPAM et Flambda chez OCamlPro.</p> <blockquote> <p>Des formations en anglais peuvent aussi être organisées sur demande à contact@ocamlpro.com</p> </blockquote> OCaml expert and beginner training by OCamlPro (in French): Nov. 5-6 & 7-8 https://ocamlpro.com/blog/2019_09_25_ocaml_expert_and_beginner_training_by_ocamlpro_in_french_nov_5_6_7_8 2019-09-25T13:48:57Z 2019-09-25T13:48:57Z OCamlPro In our endeavour to encourage professional programmers to understand and use OCaml, OCamlPro will be giving two training sessions, in French, in our Paris offices: OCaml Beginner course for professional programmers (5-6 Nov) OCaml Expertise (7-8 Nov). The "Expert" OCaml course is for already experie... 
<p><img src="/blog/assets/img/trainings_2019.png" alt="" /></p> <p>In our endeavour to encourage professional programmers to understand and use OCaml, OCamlPro will be giving two training sessions, in French, in our Paris offices:</p> <ul> <li><a href="https://ocamlpro.com/course-ocaml-development/">OCaml Beginner course</a> for professional programmers (5-6 Nov) </li> <li><a href="https://ocamlpro.com/course-ocaml-expert/">OCaml Expertise</a> (7-8 Nov). </li> </ul> <p>The &quot;Expert&quot; OCaml course is for already experienced OCaml programmers to better understand advanced type system possibilities (objects, GADTs), discover GC internals, write &quot;compiler-optimizable&quot; code.</p> <p>These sessions are also an opportunity to come discuss with OCamlPro's OPAM &amp; Flambda lead developers and core contributors in Paris.</p> <blockquote> <p>Training in English can also be organized, on-demand.</p> </blockquote> <p>Register link: http://ocamlpro.com/forms/preinscriptions-formation-ocaml/</p> <blockquote> <p><em>This complements the excellent <a href="https://www.fun-mooc.fr/courses/course-v1:parisdiderot+56002+session04/about">OCaml MOOC</a> from Université Paris-Diderot and the <a href="https://ocaml.foundation/learn-ocaml">learn-OCaml platform</a> of the OCaml Software Foundation.</em></p> </blockquote> A look back on OCaml since 2011 https://ocamlpro.com/blog/2019_09_20_look_back_ocaml_since_2011 2019-09-20T13:48:57Z 2019-09-20T13:48:57Z Thomas Blanc A look back on OCaml since 2011 As you already know if you’ve read our last blogpost, we have updated our OCaml cheat sheets starting with the language and stdlib ones. We know some of you have students to initiate in September and we wanted these sheets to be ready for the start of the school yea... <p><a href="/blog/2019_09_20_look_back_ocaml_since_2011"><img src="/blog/assets/img/ocaml-2011-e1600870731841.jpeg" alt="A look back on OCaml since 2011" /></a></p> <p>As you already know if you’ve read <a href="/blog/2019_09_13_updated_cheat_sheets_language_stdlib_2">our last blogpost</a>, we have updated our OCaml cheat sheets starting with the language and stdlib ones. We know some of you have students to initiate in September and we wanted these sheets to be ready for the start of the school year! We’re working on more sheets for OCaml tools like opam or Dune and important libraries such as ~~Obj~~ Lwt or Core. Keep an eye on our blog or the <a href="https://github.com/OCamlPro/ocaml-cheat-sheets">repo on GitHub</a> to follow all the updates.</p> <p>Going through the documentation was a journey to the past: we have looked back on 8 years of evolution of the OCaml language and library. New feature after new feature, OCaml has seen many changes. Needless to say, upgrading our cheat sheets to OCaml 4.08.1 was a trip down memory lane. We wanted to share our throwback experience with you!</p> <h2>2011</h2> <p>Fabrice Le Fessant first published our cheat sheets in 2011, the year OCamlPro was created! At the time, OCaml was in its 3.12 version and just <a href="https://inbox.ocaml.org/caml-list/E49008DC-30C0-4B22-9939-85827134C8A6@inria.fr/">got its current name</a> agreed upon. 
<a href="https://caml.inria.fr/pub/docs/manual-ocaml/manual028.html">First-class modules</a> were the new big thing, Camlp4 and Camlp5 were battling for the control of the syntax extension world and Godi and Oasis were the packaging rage.</p> <h2>2012</h2> <p>Right after 3.12 came the switch to OCaml 4.00 which brought a major change: <a href="https://caml.inria.fr/pub/docs/manual-ocaml/manual033.html">GADTs</a> (generalized algebraic data types). Most of OCaml’s developers don’t use their almighty typing power, but the possibilities they provide are really helpful in some cases, most notably the format overhaul. They’re also a fun way to troll a beginner asking how to circumvent the typing system on Stack Overflow. Since most of us might lose track of their exact syntax, GADTs deserve their place in the updated sheet (if you happen to be OCamlPro’s CTO, <em>of course</em> the writer of this blogpost remembers how to use GADTs at all times).</p> <p>On the standard library side, the big change was the switch of <code>Hashtbl</code> to Murmur 3 and the support for seeded randomization<a href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-0839">.</a></p> <h2>2013</h2> <p>With OCaml 4.01 came <a href="https://github.com/ocaml/ocaml/issues/5759">constructor disambiguation</a>, but there isn’t really a way to add this to the sheet. This feature allows you to avoid misguided usage of polymorphic variants, but that’s a matter of personal taste (there’s a well-known rule that if you refresh the comments section enough times, someone —usually called Daniel— will appear to explain polymorphic variants’ superiority to you). <code>-ppx</code> rewriters were introduced in this version as well.</p> <p>The standard library got a few new functions. Notably, <code>Printexc.get_callstack</code> for stack inspection, the optimized application operators <code>|&gt;</code> and <code>@@</code> and <code>Format.asprintf</code>.</p> <h2>2014</h2> <p><em>Gabriel Scherer, on the Caml-list, end of January:</em></p> <blockquote> <p>TL;DR: During the six next months, we will follow pull requests (PR) posted on the github mirror of the OCaml distribution, as an alternative to the mantis bugtracker. This experiment hopes to attract more people to participate in the extremely helpful and surprisingly rewarding activity of patch reviews.</p> </blockquote> <p>Can you guess which change to the cheat-sheets came with 4.02? It’s a universally-loved language feature added in 2014. Still don’t know? It is <em>exceptional</em>! Got it?</p> <p>Drum roll… it is the <code>match with exception</code> <a href="https://caml.inria.fr/pub/docs/manual-ocaml/patterns.html#sec131">construction</a>! It made our codes simpler, clearer and in some cases more efficient. A message to people who want to improve the language: please aim for that.</p> <p>This version also added the <code>{quoted|foo|quoted}</code> <a href="https://caml.inria.fr/pub/docs/manual-ocaml/lex.html#string-literal">syntax</a> (which broke comments), generative functors, attributes and <a href="https://caml.inria.fr/pub/docs/manual-ocaml/manual036.html">extension nodes</a>, extensible data types, module aliases and, of course, immutable strings (which was optional at the time). Immutable strings is the one feature that prompted us to <em>remove</em> a line from the cheat sheets. More space is good. Camlp4 and Labltk moved out of the distribution.</p> <p>In consequence of immutable strings, <code>Bytes</code> and <code>BytesLabel</code> were added to the library. 
For the great pleasure of optimization addicts, <code>raise_notrace</code> popped up. Under the hood, the <code>format</code> type was re-implemented using GADTs.</p> <h2>2015</h2> <p>This release was so big that 4.02.2 feels like a release in itself, with the adding of <code>nonrec</code> and <code>#...</code> operators.</p> <p>The standard library was spared by this bug-fix themed release. Note that this is the last comparatively slow year of OCaml as the transition to GitHub would soon make features multiply, as hindsight teaches us.</p> <h2>2016</h2> <p>Speaking of a major release, we’re up to OCaml 4.03! It introduced <a href="https://caml.inria.fr/pub/docs/manual-ocaml/manual040.html">inline records</a>, a GADT exhaustiveness check on steroids (with <code>-&gt; .</code> to denote unreachability) and standard attributes like <code>warning</code>, <code>inlined</code>, <code>unboxed</code> or <code>immediate</code>. Colors appeared in the compiler and last but not least, it was the dawn of a new option called <a href="http://ocamlpro.com/tag/flambda2-en/">Flambda</a>.</p> <p>The library saw a lot of useful new functions coming in: lots of new iterators for <code>Array</code>, an <code>equal</code> function in most basic type modules, <code>Uchar</code>, the <code>*_ascii</code> alternatives and, of course, <code>Ephemeron</code>.</p> <p>4.04 was much more restrained, but it was the second release in a single year. Local opening of module with the <code>M.{}</code> syntax was added along with the <code>let exception ...</code> in construct. <code>String.split_on_char</code> was notably added to the stdlib which means we don’t have to rewrite it anymore.</p> <h2>2017</h2> <p>We now get to 4.05… which did not change the language. Not that the development team wasn’t busy, OCaml just got better without any change to the syntax.</p> <p>On the library side however, much happened, with the adding of <code>*_opt</code> functions pretty much everywhere. If you’re using the OCaml compiler from <a href="https://packages.debian.org/sid/ocaml">Debian</a>, this is where you might think the story ends. You’d be wrong…</p> <p>…because 4.06 added a lot! My own favorite feature from this release has to be user-defined <a href="https://caml.inria.fr/pub/docs/manual-ocaml/manual042.html">indexing operators</a>. This is also when <code>safe-string</code> became the default, giving worthwhile work to every late maintainer in the community. This release also added one awesome function in the standard library: <code>Map.update</code>.</p> <h2>2018</h2> <p>4.07 was aimed towards solidifying the language. It added empty variants and type-based selection of GADT constructors to the mix.</p> <p>On the library side, one old and two new modules were added, with the integration of <code>Bigarray</code>, <code>Seq</code> and <code>Float</code>.</p> <h2>2019</h2> <p>And here we are with 4.08, in the present day! We can now put exceptions under or-patterns, which is the only language change from this release we propagated to the sheet. Time will tell if we need to add custom <a href="https://caml.inria.fr/pub/docs/manual-ocaml/manual046.html">binding operators</a> or <code>[@@alert]</code>. <code>Pervasives</code> is now deprecated in profit of <code>Stdlib</code> and new modules are popping up (<code>Int</code>, <code>Bool</code>, <code>Fun</code>, <code>Result</code>… did we miss one?) 
while <code>Sort</code> made its final deprecation warning.</p> <p>We did not add 4.09 to this journey to the past, as this release is still solidly in the <em>now</em> at the time of this blogpost. Rest assured, we will see much more awesome features in OCaml in the future! In the meantime, we are working on updating more cheat sheets: keep posted!</p> <h1>Comments</h1> <p>Micheal Bacarella (23 September 2019 at 18 h 17 min):</p> <blockquote> <p>For a blog-post from a company called OCaml PRO this seems like a rather tone-deaf PR action.</p> <p>I wanted to read this and get hyped but instead I’m disappointed and I continue to feel like a chump advocating for this language.</p> <p>Why? Because this is a rather underwhelming summary of <em>8 years</em> of language activity. Perhaps you guys didn’t intend for this to hit the front of Hacker News, and maybe this stuff is really exciting to programming language PhDs, but I don’t see how the average business OCaml developer would relate to many of these changes at all. It makes OCaml (still!) seem like an out-of-touch academic language where the major complaints about the language are ignored (multicore, Windows support, programming-in-the-large, debugging) while ivory tower people fiddle with really nailing type-based selection in GADTs.</p> <p>I expect INRIA not to care about the business community but aren’t you guys called OCaml PRO? I thought you <em>liked</em> money.</p> <p>You clearly just intended this to be an interesting summary of changes to your cheatsheet but it’s turned into a PR release for the language and leaves normals with the continued impression that this language is a joke.</p> </blockquote> <p>Thomas Blanc (24 September 2019 at 14 h 57 min):</p> <blockquote> <p>Yes, latency can be frustrating even in the OCaml realm. Thanks for your comment, it is nice to see people caring about it and trying to remedy through contributions or comments.</p> <p>Note that we only posted on discuss.ocaml.org expecting to get one or two comments. The reason for this post was that while updating the CS we were surprised to see how much the language had changed and decided to write about it.</p> <p>You do raise some good points though. We did work on a full windows support back in the day. The project was discontinued because nobody was willing to buy it. We also worked on memory profiling for the debugging of memory leaks (before other alternatives existed). We did not maintain it because the project had no money input. I personally worked on compile-time detection of uncaught exception until the public funding of that project ran out. We also had a proposal for namespaces in the language that would have facilitated programming-in-the-large (no funding) and worked on multicore (funding for one man for one year).</p> </blockquote> Mise à jour des Cheat Sheets : OCaml Language et OCaml Standard Library https://ocamlpro.com/blog/2019_09_14_fr_mise_a_jour_des_cheat_sheets_ocaml_language_et_ocaml_standard_library 2019-09-14T13:48:57Z 2019-09-14T13:48:57Z Thomas Blanc Les mémentos (cheat-sheets) OCaml lang et OCaml stdlib partagés par OCamlPro en 2011 ont été mis à jour pour OCaml 4.08. Le langage OCaml OCaml Standard Library Si vous souhaitez contribuer des améliorations: sources sur GitHub. En savoir plus : Updated Cheat Sheets: OCaml Language and OCaml S... 
<p>Les mémentos (cheat-sheets) OCaml lang et OCaml stdlib partagés par OCamlPro en 2011 ont été mis à jour pour OCaml 4.08.</p> <ul> <li><a href="https://ocamlpro.github.io/ocaml-cheat-sheets/ocaml-lang.pdf">Le langage OCaml</a> </li> <li><ul> <li><a href="https://ocamlpro.github.io/ocaml-cheat-sheets/ocaml-stdlib.pdf">OCaml Standard Library</a> </li> </ul> </li> </ul> <p>Si vous souhaitez contribuer des améliorations: <a href="https://github.com/OCamlPro/ocaml-cheat-sheets">sources sur GitHub</a>.</p> <p>En savoir plus : <a href="/2019/09/13/updated-cheat-sheets-ocaml-language-and-ocaml-standard-library/">Updated Cheat Sheets: OCaml Language and OCaml Standard Library</a></p> Updated Cheat Sheets: OCaml Language and OCaml Standard Library https://ocamlpro.com/blog/2019_09_13_updated_cheat_sheets_language_stdlib_2 2019-09-13T13:48:57Z 2019-09-13T13:48:57Z Thomas Blanc In 2011, we shared several cheat sheets for OCaml. Cheat sheets are helpful to refer to, as an overview of the documentation when you are programming, especially when you’re starting in a new language. They are meant to be printed and pinned on your wall, or to be kept in handy on a spare screen. ... <p>In 2011, we shared several cheat sheets for OCaml. Cheat sheets are helpful to refer to, as an overview of the documentation when you are programming, especially when you’re starting in a new language. They are meant to be printed and pinned on your wall, or to be kept in handy on a spare screen. We hope they will help you out when your rubber duck is rubbish at debugging your code!</p> <p>Since we first shared them, OCaml and its related tools have evolved. We decided to refresh them and started with the two most-used cheat sheets—our own contribution to the start of the school year!</p> <p>Download the revised version:</p> <ul> <li><a href="http://ocamlpro.com/wp-content/uploads/2019/09/ocaml-lang.pdf">OCaml Language (lang)</a> (PDF) </li> <li><a href="http://ocamlpro.com/wp-content/uploads/2019/09/ocaml-stdlib.pdf">OCaml Standard Library (stdlib)</a> (PDF) </li> </ul> <p>You can also find <a href="https://github.com/OCamlPro/ocaml-cheat-sheets">the sources on GitHub</a>. We welcome contributions, feel free to send patches if you see room for improvement! We’re working on other cheat sheets: keep an eye on our blog to see updates and brand new cheat sheets.</p> <p>While we were updating them, we realized how much OCaml had evolved in the last eight years. We’ll tell you everything about our trip down memory lane very soon in another blogpost!</p> OCamlPro’s compiler team work update https://ocamlpro.com/blog/2019_08_30_ocamlpros_compiler_team_work_update 2019-08-30T13:48:57Z 2019-08-30T13:48:57Z Vincent Laviron The OCaml compiler team at OCamlPro is happy to present some of the work recently done jointly with JaneStreet's team. A lot of work has been done towards a new framework for optimizations in the compiler, called Flambda2, aiming at solving the shortcomings that became apparent in the Flambda optimi... <p><img src="/blog/assets/img/picture_cpu_compiler.jpeg" alt="" /></p> <p>The OCaml compiler team at OCamlPro is happy to present some of the work recently done jointly with JaneStreet's team.</p> <p>A lot of work has been done towards a new framework for optimizations in the compiler, called Flambda2, aiming at solving the shortcomings that became apparent in the Flambda optimization framework (see below for more details). 
While that work is in progress, the team also worked on some more short-term improvements, notably on the current Flambda optimization framework, as well as some compiler modifications that will benefit Flambda2.</p> <blockquote> <p><em>This work is funded by JaneStreet :D</em></p> </blockquote> <h3>Short-term improvements</h3> <h4>Recursive values compilation</h4> <p>OCaml supports quite a large range of recursive definitions. In addition to recursive (and mutually-recursive) functions, one can also define regular values recursively, as for the infinite list <code>let rec l = 0 :: l</code>.</p> <p>Not all recursive constructions are allowed, of course. For instance, the definition <code>let rec x = x</code> is rejected because there is no way to actually build a value that would behave correctly.</p> <p>The basic rule for deciding whether a definition is allowed or not is made under the assumption that recursive values (except for functions, mostly) are compiled by first allocating space in the heap for the recursive values, binding the recursively defined variables to the allocated (but not yet initialized) values. The defining expressions are then evaluated, yielding new values (that can contain references the non-initialized values). Finally, the fields of these new values are copied one-by-one into the corresponding fields of the initial values.</p> <p>For this approach to work, some restrictions need to apply:</p> <ul> <li>the compiler needs to be able to compute the size of the values beforehand (these values must be allocated values, in order to avoid defining an integer recursively), </li> <li>and since during the evaluation of the defining expressions their fields are not valid, one cannot write any code that may read these fields, like pattern-matching on the value, or passing the value to some function (or storing it in a mutable field of some record). </li> </ul> <p>All of those restrictions have recently been reworked and formalized based on work from Alban Reynaud during an internship at Inria, reviewed and completed by Gabriel Scherer and Jeremy Yallop.</p> <p>Unfortunately, this work only covers checking whether the recursive definitions are allowed or not; actual compilation is done later in the compiler, in one place for bytecode and another for native code, and these pieces of code have not been linked with the new check so there have been a few cases where the check allowed code that wasn't actually compiled correctly.</p> <p>Since we didn't want to deal with it directly in our new version of Flambda, we had started working on a patch to move the compilation of recursive values up in the compilation pipeline, before the split between bytecode and native code. After some amount of hacking (we discovered that compilation of classes creates recursive value bindings that would not pass the earlier recursive check…), we have a patch that is mostly ready for review and will soon start engaging with the rest of the compiler team with the aim of integrating it into the compiler.</p> <h4>Separate compilation of recursive modules, compilation units as functors</h4> <p>Some OCaml developers like to encapsulate each type definition in its own module, with an interface that can expose the needed types and functions, while abstracting away as much of the actual implementation as possible. 
It is then common to have each of these modules in its own file, to simplify management and avoid unseemly big files.</p> <p>However, this breaks down when one needs to define several types that depend on each other. The usual solutions are either to use recursive modules, which have the drawback of requiring all the modules to be in the same compilation unit, leading to very big files (we have seen a real case of a more than 10,000-lines file), or make each module parametric in the other modules, translating them into functors, and then instantiate all the functors when building the outwards-facing interface.</p> <p>To address these issues, we have been working on two main patches to improve the life of developers facing these problems.</p> <p>The first one allows compiling several different files as mutually recursive modules, reusing the approach used to compile regular recursive modules. In practice, this will allow developers using recursive modules extensively to properly separate not only the different modules from each other, but also the implementation and interfaces into a <code>.ml</code> and <code>.mli</code> files. This would of course need some additional support from the different build tools, but we're confident we can get at least <code>dune</code> to support the feature.</p> <p>The second one allows compiling a single compilation unit as a functor instead of a regular module. The arguments of the functor would be specified on the command line, their signature taken from their corresponding interface file. This can be useful not only to break recursive dependencies, like the previous patch (though in a different way), but also to help developers relying on multiple implementations of a same <code>.mli</code> interface functorize their code with minimal effort.</p> <p>These two improvements will also benefit packs, whereas recursive compilation units could be packed in a single module and packs could be functorized themselves.</p> <h4>Small improvements to Flambda</h4> <p>We are still committed to maintain the Flambda part of the compiler. Few bugs have been found, so we concentrate our efforts on small features that either yield overall performance gains or allow naive code patterns to be compiled as efficiently as their equivalent but hand-optimized versions.</p> <p>As an example, one optimization that we should be able to submit soon looks for cases where an immutable block is allocated but an immutable block with the same exact fields and tag already exists.</p> <p>This can be demonstrated with the following example:</p> <pre><code class="language-ocaml">let result_bind f = function | Ok x -&gt; f x | Error e -&gt; Error e </code></pre> <p>The usual way to avoid the extra allocation of <code>Error e</code> is to write the clause as <code>| (Error e) as r -&gt; r</code>. With this new patch, the redundant allocation will be detected and removed automatically! This can be even more interesting with inlining:</p> <pre><code class="language-ocaml">let my_f x = if (* some condition *) then Ok x else (* something else *) let _ = (* ... *) let r = result_bind my_f (* some argument *) in (* ... *) </code></pre> <p>In this example, inlining <code>result_bind</code> then <code>my_f</code> can match the allocation <code>Ok x</code> in <code>my_f</code> with the pattern matching in <code>result_bind</code>. This removes an allocation that would be very hard to remove otherwise. 
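</p> <p>For comparison, here is a sketch of the hand-optimized clause mentioned above, i.e. the version of <code>result_bind</code> that this optimization should make unnecessary to write by hand:</p> <pre><code class="language-ocaml">(* Manually reusing the existing immutable block instead of
   re-allocating an identical Error constructor. *)
let result_bind_by_hand f = function
  | Ok x -&gt; f x
  | Error _ as r -&gt; r
</code></pre> <p>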
We expect these patterns to occur quite often with some programming styles relying on a great deal of abstraction and small independent functions.</p> <h3>Flambda 2.0</h3> <p>We are building on the work done for Flambda and the experience of its users to develop Flambda 2.0, the next optimization framework.</p> <p>Our goal is to build a framework for analyzing the costs and benefits of code transformations. The framework focuses on reducing the runtime cost of abstractions and removing as many short-lived allocations as possible.</p> <p>The aim of Flambda 2.0 is roughly the same as the original Flambda. So why did we decide to write a new framework instead of patching the existing one? Several points led us to this decision.</p> <ul> <li>An invariant on the representation of closures that ensured that every closure had a unique identifier, which was convenient for a number of reasons, turned out to be quite expensive to maintain and prevented some optimizations. </li> <li>The internal representation of Flambda terms included too many different cases that were either redundant or not relevant to the optimizations we were interested in, making a lot of code more complicated than necessary. </li> <li>The ANF-like representation we used was not perfect. We wanted an easier way to do control flow optimizations, which led us to choose a CPS-like representation for Flambda 2.0. </li> <li>Finally, the original Flambda was thought of as an alternative to the closure conversion and inlining algorithms performed by the <code>Closure</code> module of the compiler, translating from the <code>Lambda</code> representation to <code>Clambda</code>. However, a number of optimizations (most importantly unboxing) are done during the next phase of compilation, <code>Cmmgen</code>, which translates to the <code>Cmm</code> representation. The original Flambda had trouble to estimate correctly which optimizations would trigger and what would their benefit be. It may be noted that correctly estimating benefit is a key in Flambda's algorithms, and we know of a number of cases where Flambda is not as good as it could be because it couldn't predict the unboxing opportunities that inlining would have allowed. Flambda 2.0 will go from <code>Lambda</code> to <code>Cmm</code>, and will handle all transformations done in both <code>Closure</code> and <code>Cmmgen</code> in a single framework. </li> </ul> <p>These improvements are still very much a work in progress. We have not reached the point where other developers can try out the new framework on their codebases yet.</p> <p>This does not mean there are no news to enjoy before our efforts show on the mainstream compiler! While working on Flambda 2.0, we did deploy a number of patches on the compiler both before and after the Flambda stage. We proposed all the changes independant enough to be proposed on their own. Some of these fixes have been merged already. Others are still under discussion and some, like the recursive values patch mentioned above, are still waiting for cleanup or documentation before submission.</p> <h1>Comments</h1> <p>Jon Harrop (30 August 2019 at 20 h 11 min):</p> <blockquote> <p>What is the status of multicore OCaml?</p> </blockquote> <p>Vincent Laviron (2 September 2019 at 16 h 22 min):</p> <blockquote> <p>OCamlPro is not working on multicore OCaml. It is still being worked on elsewhere, with efforts concentrated around OCaml Labs, but I don’t have more information than what is publicly available. 
All of the work we described here is not expected to interfere with multicore.</p> </blockquote> <p>Lindsay (25 September 2020 at 20 h 20 min):</p> <blockquote> <p>Thanks for your continued work on the compiler and tooling! Am curious if there is any news regarding the item “Separate compilation of recursive modules”.</p> </blockquote> Release d’opam 2.0.5 https://ocamlpro.com/blog/2019_07_23_fr_release_dopam_2.0.5 2019-07-23T13:48:57Z 2019-07-23T13:48:57Z Raja Boujbel Louis Gesbert Nous sommes fiers d’annoncer la release (mineure) d’ opam 2.0.5. Cette nouvelle version contient des mises à jours de build et correctifs. Plus d’information... <p>Nous sommes fiers d’annoncer la release (mineure) d’ <a href="https://github.com/ocaml/opam/releases/tag/2.0.5">opam 2.0.5</a>. Cette nouvelle version contient des mises à jours de build et correctifs.</p> <blockquote> <p><a href="/2019/07/11/opam-2-0-5-release/">Plus d’information</a></p> </blockquote> opam 2.0.5 release https://ocamlpro.com/blog/2019_07_11_opam_2.0.5_release 2019-07-11T13:48:57Z 2019-07-11T13:48:57Z Raja Boujbel Louis Gesbert We are pleased to announce the minor release of opam 2.0.5. This new version contains build update and small fixes: Bump src_ext Dune to 1.6.3, allows compilation with OCaml 4.08.0. [#3887 @dra27] Support Dune 1.7.0 and later [#3888 @dra27 - fix #3870] Bump the ocaml_mccs lib-ext, to include latest ... <p>We are pleased to announce the minor release of <a href="https://github.com/ocaml/opam/releases/tag/2.0.5">opam 2.0.5</a>.</p> <p>This new version contains build update and small fixes:</p> <ul> <li>Bump src_ext Dune to 1.6.3, allows compilation with OCaml 4.08.0. [<a href="https://github.com/ocaml/opam/pull/3887">#3887</a> <a href="https://github.com/dra27">@dra27</a>] </li> <li>Support Dune 1.7.0 and later [<a href="https://github.com/ocaml/opam/pull/3888">#3888</a> <a href="https://github.com/dra27">@dra27</a> - fix <a href="https://github.com/ocaml/opam/issues/3870">#3870</a>] </li> <li>Bump the ocaml_mccs lib-ext, to include latest changes [<a href="https://github.com/ocaml/opam/pull/3896">#3896</a> <a href="https://github.com/AltGr">@AltGr</a>] </li> <li>Fix cppo detection in configure [<a href="https://github.com/ocaml/opam/pull/3917">#3917</a> <a href="https://github.com/dra27">@dra27</a>] </li> <li>Read jobs variable from OpamStateConfig [<a href="https://github.com/ocaml/opam/pull/3916">#3916</a> <a href="https://github.com/dra27">@dra27</a>] </li> <li>Linting: <ul> <li>add check upstream option [<a href="https://github.com/ocaml/opam/pull/3758">#3758</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> <li>add warning for with-test in run-test field [<a href="https://github.com/ocaml/opam/pull/3765">#3765</a>, <a href="https://github.com/ocaml/opam/pull/3860">#3860</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> <li>fix misleading <code>doc</code> filter warning [<a href="https://github.com/ocaml/opam/pull/3871">#3871</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> </ul> </li> <li>Fix typos [<a href="https://github.com/ocaml/opam/pull/3891">#3891</a> <a href="https://github.com/dra27">@dra27</a>, <a href="https://github.com/mehdid">@mehdid</a>] </li> </ul> <blockquote> <p>Note: To homogenise macOS name on system detection, we decided to keep <code>macos</code>, and convert <code>darwin</code> to <code>macos</code> in opam. 
For the moment, to not break jobs &amp; CIs, we keep uploading <code>darwin</code> &amp; <code>macos</code> binaries, but from the 2.1.0 release, only <code>macos</code> ones will be kept.</p> </blockquote> <hr /> <p>Installation instructions (unchanged):</p> <ol> <li>From binaries: run </li> </ol> <pre><code class="language-shell-session">sh &lt;(curl -sL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.0.5">the Github &quot;Releases&quot; page</a> to your PATH. In this case, don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update you sandbox script.</p> <ol start="2"> <li>From source, using opam: </li> </ol> <pre><code class="language-shell-sesiion">opam update; opam install opam-devel </code></pre> <p>(then copy the opam binary to your PATH as explained, and don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update you sandbox script)</p> <ol start="3"> <li>From source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.0.5#compiling-this-repo">README</a>. </li> </ol> <p>We hope you enjoy this new minor version, and remain open to <a href="https://github.com/ocaml/opam/issues">bug reports</a> and <a href="https://github.com/ocaml/opam/issues">suggestions</a>.</p> <blockquote> <p>NOTE: this article is cross-posted on <a href="https://opam.ocaml.org/blog/">opam.ocaml.org</a> and <a href="/blog">ocamlpro.com</a>.</p> </blockquote> Résultats de la SMT-Comp 2019 pour Alt-Ergo https://ocamlpro.com/blog/2019_07_10_results_smt_comp_2019 2019-07-10T13:48:57Z 2019-07-10T13:48:57Z Albin Coquereau Les résultats de la compétition SMT-COMP 2019 ont été publiés au whorkshop SMT de la 22e conférence SAT. Nous étions fiers d’y participer pour la deuxième année consécutive, surtout depuis qu’Alt-Ergo prend en charge le standard SMT-LIB 2. Alt-Ergo est un SAT solveur open-source mainte... <p>Les résultats de la compétition SMT-COMP 2019 ont été publiés au whorkshop SMT de la <a href="http://smt2019.galois.com/">22e conférence SAT</a>. Nous étions fiers d’y participer pour la deuxième année consécutive, surtout depuis qu’Alt-Ergo <a href="2019_02_11_whats-new-for-alt-ergo-in-2018-here-is-a-recap">prend en charge</a> le standard <a href="http://smtlib.cs.uiowa.edu/">SMT-LIB 2</a>.</p> <blockquote> <p>Alt-Ergo est un SAT solveur open-source maintenu et distribué par OCamlPro, et financé entre autres grâce à plusieurs projets de R&amp;D collaborative (BWare, SOPRANO, Vocal, LChip).</p> <p>Si vous êtes un utilisateur d’Alt-Ergo, songez à rejoindre le <a href="https://alt-ergo.ocamlpro.com/#club">Club des Utilisateurs d’Alt-Ergo</a>! L’histoire de ce logiciel remonte à 2006, où il est né de recherches académiques conjointes entre Inria et le CNRS dans le laboratoire du LRI. 
Il est depuis septembre 2013 maintenu, développé et distribué par OCamlPro (voir l’historique des <a href="https://alt-ergo.ocamlpro.com/#releases">versions passées</a>).</p> <p><em>Si vous êtes curieux des activités d’OCamlPro dans le domaine des méthodes formelles, vous pouvez lire le court témoignage d’un <a href="http://ocamlpro.com/clients-partners/#mitsubishi-merce">client heureux</a></em></p> </blockquote> <p>Voir <a href="2019_07_09_alt-ergo-participation-to-the-smt-comp-2019">/blog/alt-ergo-participation-to-the-smt-comp-2019</a></p> The Alt-Ergo SMT Solver’s results in the SMT-COMP 2019 https://ocamlpro.com/blog/2019_07_09_alt_ergo_participation_to_the_smt_comp_2019 2019-07-09T13:48:57Z 2019-07-09T13:48:57Z Albin Coquereau The results of the SMT-COMP 2019 were released a few days ago at the SMT workshop during the 22nd SAT conference. We were glad to participate in this competition for the second year in a row, especially as Alt-Ergo now supports the SMT-LIB 2 standard. Alt-Ergo is an open-source SMT solver maintaine... <p>The results of the SMT-COMP 2019 were released a few days ago at the SMT workshop during the <a href="http://smt2019.galois.com/">22nd SAT conference</a>. We were glad to participate in this competition for the second year in a row, especially as Alt-Ergo <a href="2019_02_11_whats-new-for-alt-ergo-in-2018-here-is-a-recap">now supports</a> the SMT-LIB 2 standard.</p> <blockquote> <p>Alt-Ergo is an open-source SMT solver maintained and distributed by OCamlPro and partially funded by R&amp;D projects. If you’re interested, please consider joining the <a href="https://alt-ergo.ocamlpro.com/#club">Alt-Ergo User’s Club</a>! Its history goes back to 2006, with early academic research conducted jointly at the Inria &amp; CNRS “LRI” lab, followed by maintenance and development work by OCamlPro since September 2013 (see the <a href="https://alt-ergo.ocamlpro.com/#releases">past releases</a>).</p> <p>If you’re curious about OCamlPro’s other activities in Formal Methods, see a happy client’s <a href="/#mitsubishi-merce">feedback</a></p> </blockquote> <h2>SMT-COMP 2018</h2> <p>Our goal last year was to challenge ourselves on the community benchmarks. We wanted to compare Alt-Ergo to state-of-the-art SMT solvers. We thus selected categories close to “deductive program verification”, as Alt-Ergo is primarily tuned for formulas coming from this application domain. Specifically, we took part in four Main Track categories: ALIA, AUFLIA, AUFLIRA, AUFNIRA. 
These categories are a combination of theories such as Arrays, Uninterpreted Function and Linear and Non-linear arithmetic over Integers and Reals.</p> <h3>Alt-Ergo’s Results at SMT-COMP 2018</h3> <p>For its first participation in SMT-COMP, Alt-Ergo showed that it was a competitive solver comparing to state of the art solvers such as CVC4, Vampire, VeriT or Z3.</p> <figure class="wp-block-table"> <table> <tbody> <tr> <td>Main Track Categories (number of participants)</td> <td>Sequential Perfs</td> <td>Parallel Perfs</td> </tr> <tr> <td><a href="http://smtcomp.sourceforge.net/2018/results-ALIA.shtml?v=1531410683">ALIA</a> (4)</td> <td><img src="/blog/assets/img/icon_silver.png" alt="2nd place" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> </tr> <tr> <td><a href="http://smtcomp.sourceforge.net/2018/results-AUFLIA.shtml?v=1531410683">AUFLIA</a> (4)</td> <td><img src="/blog/assets/img/icon_silver.png" alt="2nd place" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_silver.png" alt="2nd place" width="24" height="24"></td> </tr> <tr> <td><a href="http://smtcomp.sourceforge.net/2018/results-AUFLIRA.shtml?v=1531410683">AUFLIRA</a> (4)</td> <td><img src="/blog/assets/img/icon_silver.png" alt="2nd place" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> </tr> <tr> <td><a href="http://smtcomp.sourceforge.net/2018/results-AUFNIRA.shtml?v=1531410683">AUFNIRA</a> (3)</td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> </tr> </tbody> </table> </figure> <p>The global results of the competition are available <a href="http://smtcomp.sourceforge.net/2018/results-toc.shtml">here</a>.</p> <h2>SMT-COMP 2019</h2> <p>Since last year’s competition, we made some improvements on Alt-Ergo, specifically over our data structures and the support of algebraic datatypes (see <a href="http://ocamlpro.com/2019/02/11/whats-new-for-alt-ergo-in-2018-here-is-a-recap">post</a>).</p> <p>A few changes can be noted for this year’s competition:</p> <ul> <li>A distinction between SAT and UNSAT in the scoring scheme allowed us to compete in more categories, as Alt-Ergo doesn’t send back SAT.</li><li>The aim of the 24s Scoring is to reward solvers which solve problems quickly. </li> <li>The number of benchmarks in each category has changed. For each category, only the benchmarks which were not proven by every solver last year are used. For example: in the division AUFLIRA, 20011 benchmarks were used last year, of which 1683 remained this year.</li> </li> </ul> <p>Alt-Ergo only competed in the Single Query Track. We selected the same categories as last year and added UF, UFLIA, UFLRA and UFNIA. We also decided to compete over categories supporting algebraic DataTypes to test our newly support of this theory. Alt-Ergo’s expertise is over quantified problems, but we wanted to test our hand in the solver theories over some Quantifier-free categories.</p> <h3>Alt-Ergo’s Results at SMT-COMP 2019</h3> <p>We were proud to see Alt-Ergo performs within a reasonable margin on Quantifier Free problems comparing to other solvers over the UNSAT problems, even though these problems are not our solver’s primary goal. 
And we were happy with the performance of our solver in Datatype categories, as the support of this theory is new.</p> <p>For the last categories, Alt-Ergo managed to reproduce last year’s performance, close to CVC4 (2018 and 2019 winner) and Vampire.</p> <figure class="wp-block-table"> <table> <tbody> <tr> <td>Single Query Categories<br>(number of participants)</td> <td>Sequential</td> <td>Parallel</td> <td>Unsat</td> <td>24s</td> </tr> <tr> <td><a href="https://smt-comp.github.io/2019/results/alia-single-query">ALIA</a> (8)</td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_gold.png" alt="" width="33" height="33"></td> <td><img src="/blog/assets/img/icon_silver.png" alt="" width="24" height="24"></td> </tr> <tr> <td><a href="https://smt-comp.github.io/2019/results/auflia-single-query">AUFLIA</a> (8)</td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> </tr> <tr> <td><a href="https://smt-comp.github.io/2019/results/auflira-single-query">AUFLIRA</a> (8)</td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> </tr> <tr> <td><a href="https://smt-comp.github.io/2019/results/aufnira-single-query">AUFNIRA</a> (5)</td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_silver.png" alt="" width="24" height="24"></td> </tr> <tr> <td><a href="https://smt-comp.github.io/2019/results/uf-single-query">UF</a> (8)</td> <td><img src="http://ocamlpro.com/wp-content/uploads/2019/07/Copper.png" alt="" width="24" height="24"></td> <td><img src="http://ocamlpro.com/wp-content/uploads/2019/07/Copper.png" alt="" width="24" height="24"></td> <td><img src="http://ocamlpro.com/wp-content/uploads/2019/07/Copper.png" alt="" width="24" height="24"></td> <td><img src="http://ocamlpro.com/wp-content/uploads/2019/07/Copper.png" alt="" width="24" height="24"></td> </tr> <tr> <td><a href="https://smt-comp.github.io/2019/results/uflia-single-query">UFLIA</a> (8)</td> <td><img src="http://ocamlpro.com/wp-content/uploads/2019/07/Copper.png" alt="" width="24" height="24"></td> <td><img src="http://ocamlpro.com/wp-content/uploads/2019/07/Copper.png" alt="" width="24" height="24"></td> <td><img src="http://ocamlpro.com/wp-content/uploads/2019/07/Copper.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> </tr> <tr> <td><a href="https://smt-comp.github.io/2019/results/uflra-single-query">UFLRA</a> (8)</td> <td><img src="/blog/assets/img/icon_gold.png" alt="" width="33" height="33"></td> <td><img src="/blog/assets/img/icon_gold.png" alt="" width="33" height="33"></td> <td><img 
src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_gold.png" alt="" width="33" height="33"></td> </tr> <tr> <td><a href="https://smt-comp.github.io/2019/results/ufnia-single-query">UFNIA</a> (8)</td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> <td><img src="/blog/assets/img/icon_bronze.png" alt="" width="24" height="24"></td> </tr> </tbody> </table> </figure> <p>This year results are available <a href="https://smt-comp.github.io/2019/results.html">here</a>. These results do not include Par4 a portfolio solver.</p> <p>Alt-Ergo is constantly evolving, as well as our support of the SMT-LIB standard. For next year’s participation, we will try to compete in more categories and hope to cover more tracks, such as the UNSAT-Core track.</p> <p><img src="/assets/img/logo_altergo.png" alt="" /></p> Blockchains @ OCamlPro: an Overview https://ocamlpro.com/blog/2019_04_29_blockchains_at_ocamlpro_an_overview 2019-04-29T13:48:57Z 2019-04-29T13:48:57Z Fabrice Le Fessant OCamlPro started working on blockchains in 2014, when Arthur Breitman came to us with an initial idea to develop the Tezos ledger. The idea was very challenging with a lot of innovations. So, we collaborated with him to write a specification, and to turn the specification into OCaml code. Since then... <p>OCamlPro started working on blockchains in 2014, when Arthur Breitman came to us with an initial idea to develop the Tezos ledger. The idea was very challenging with a lot of innovations. So, we collaborated with him to write a specification, and to turn the specification into OCaml code. Since then, we continually improved our skills in this domain, trained more engineers, introduced the technology to students and to professionals, advised a dozen projects, developed tools and libraries, made some improvements and extensions to the official Tezos node, and conducted several private deployments of the Tezos ledger.</p> <blockquote> <p>For an overview of OCamlPro’s blockchain activities see <a href="/blog/category/blockchains">here</a></p> </blockquote> <h2>TzScan: A complete Block Explorer for Tezos</h2> <p><a href="https://tzscan.io">TzScan</a> is considered today to be the best block explorer for Tezos. It’s made of three main components:</p> <ul> <li>an indexer that queries the Tezos node and fills a relational database, </li> <li>an API server that queries the database to retrieve various informations, </li> <li>a web based user interface (a Javascript application) </li> </ul> <p>We deployed the indexer and API to freely provide the community with an access to all the content of the Tezos blockchain, already used by many websites, wallets and apps. In addition, we directly use this API within our TzScan.io instance. Our deployment spans on multiple Tezos nodes, multiple API servers and a distributed database to scale and reply to millions of queries per day. We also regularly release open source versions under the GPL license, that can be easily deployed on private Tezos networks. TzScan’s development has been initiated in September 2017. 
It represents today an enormous investment, that the Tezos Foundation helped partially fund in July 2018.</p> <blockquote> <p>Contact us for support, advanced features, advertisement, or if you need a private deployment of the TzScan infrastructure.</p> </blockquote> <h2>Liquidity: a Smart Contract Language for Tezos</h2> <p><a href="https://www.liquidity-lang.org">Liquidity</a> is the first high-level language for Tezos over Michelson. Its development began in April 2017, a few months before the Tezos fundraising in July 2017. It is today the most advanced language for Tezos: it offers OCaml-like and ReasonML-like syntaxes for writing smart contracts, compilation and de-compilation to/from Michelson, multiple-entry points, static type-checking à la ML, etc. Its <a href="https://www.liquidity-lang.org/edit">online editor</a> allows to develop smart contracts and to deploy them directly into the alphanet or mainnet. Liquidity has been used before the mainnet launch to de-compile the Foundation’s vesting smart contracts in order to review them. This smart contract language represents more than two years of work, and is fully funded by OCamlPro. It has been developed with formal verification in mind, formal verification being one of the selling points of Tezos. We have elaborated a detailed roadmap mixing model-checking and deductive program verification to investigate this feature. We are now searching for funding opportunities to keep developing and maintaining Liquidity.</p> <blockquote> <p>See our <a href="https://www.liquidity-lang.org/edit">online editor</a> to get started ! Contact us if you need support, training, writing or in-depth analysis of your smart contracts.</p> </blockquote> <h2>Techelson: a testing framework for Michelson and Liquidity</h2> <p><a href="https://ocamlpro.github.io/techelson/">Techelson</a> is our newborn in the set of tools for the Tezos blockchain. It is a test execution engine for the functional properties of Michelson and Liquidity contracts. Techelson is still in its early development stage. The user documentation <a href="https://ocamlpro.github.io/techelson/user_doc/">is available here</a>. An example on how to use it with Liquidity is detailed in <a href="https://adrienchampion.github.io/blog/tezos/techelson/with_liquidity/index.html">this post</a>.</p> <blockquote> <p>Contact us to customize the engine to suit your own needs!</p> </blockquote> <h2>IronTez: an optimized Tezos node by OCamlPro</h2> <p>IronTez is a tailored node for private (and public) deployments of Tezos. Among its additional features, the node adds some useful RPCs, improves storage, enables garbage collection and context pruning, allows an easy configuration of the private network, provides additional Michelson instructions (GET_STORAGE, CATCH…). One of its nice features is the ability to enable adaptive baking in private / proof-of-authority setting (eg. baking every 5 seconds in presence of transactions and every 10 minutes otherwise, etc.).</p> <p>A simplified version of IronTez has already been made public to allow testing its <a href="/blog/2019_02_04_improving_tezos_storage_gitlab_branch_for_testers">improved storage system, Ironmin</a>, showing a 10x reduction in storage. Some TzScan.io nodes are also using versions of IronTez. We’ve also successfully deployed it along with TzScan for a big foreign company to experiment with private blockchains. 
We are searching for projects and funding opportunities to keep developing and maintaining this optimized version of the Tezos node.</p> <blockquote> <p>Don’t hesitate to contact us if you want to deploy a blockchain with IronTez, or for more information !</p> </blockquote> <h1>Comments</h1> <p>Kristen (3 May 2019 at 0 h 30 min):</p> <blockquote> <p>I really wanted to keep using IronTez but I ran into bugs that have not yet been fixed, the code is out of date with upstream, and there is no real avenue for support/assistance other than email.</p> </blockquote> opam 2.0.4 release https://ocamlpro.com/blog/2019_04_10_opam_2.0.4_release 2019-04-10T13:48:57Z 2019-04-10T13:48:57Z Raja Boujbel Louis Gesbert We are pleased to announce the release of opam 2.0.4. This new version contains some backported fixes: Sandboxing on macOS: considering the possibility that TMPDIR is unset [#3597 @herbelin - fix #3576] display: Fix opam config var display, aligned on opam config list [#3723 @rjbou - rel. #3717] pin... <p>We are pleased to announce the release of <a href="https://github.com/ocaml/opam/releases/tag/2.0.4">opam 2.0.4</a>.</p> <p>This new version contains some <a href="https://github.com/ocaml/opam/pull/3805">backported fixes</a>:</p> <ul> <li>Sandboxing on macOS: considering the possibility that TMPDIR is unset [<a href="https://github.com/ocaml/opam/pull/3597">#3597</a> <a href="https://github.com/herbelin">@herbelin</a> - fix <a href="https://github.com/ocaml/opam/issues/3576">#3576</a>] </li> <li>display: Fix <code>opam config var</code> display, aligned on <code>opam config list</code> [<a href="https://github.com/ocaml/opam/pull/3723">#3723</a> <a href="https://github.com/rjbou">@rjbou</a> - rel. <a href="https://github.com/ocaml/opam/issues/3717">#3717</a>] </li> <li>pin: <ul> <li>update source of (version) pinned directory [<a href="https://github.com/ocaml/opam/pull/3726">#3726</a> <a href="https://github.com/rjbou">@rjbou</a> - <a href="https://github.com/ocaml/opam/issues/3651">#3651</a>] </li> <li>fix <code>--ignore-pin-depends</code> with autopin [<a href="https://github.com/ocaml/opam/pull/3736">#3736</a> <a href="https://github.com/AltGr">@AltGr</a>] </li> <li>fix pinnings not installing/upgrading already pinned packages (introduced in 2.0.2) [<a href="https://github.com/ocaml/opam/pull/3800">#3800</a> <a href="https://github.com/AltGr">@AltGr</a>] </li> </ul> </li> <li>opam clean: Ignore errors trying to remove directories [<a href="https://github.com/ocaml/opam/pull/3732">#3732</a> <a href="https://github.com/kit">@kit-ty-kate</a>] </li> <li>remove wrong &quot;mismatched extra-files&quot; warning [<a href="https://github.com/ocaml/opam/pull/3744">#3744</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> <li>urls: fix hg opam 1.2 url parsing [<a href="https://github.com/ocaml/opam/pull/3754">#3754</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> <li>lint: update message of warning 47, to avoid confusion because of missing <code>synopsis</code> field internally inferred from <code>descr</code> [<a href="https://github.com/ocaml/opam/pull/3753">#3753</a> <a href="https://github.com/rjbou">@rjbou</a> - fix <a href="https://github.com/ocaml/opam/issues/3738">#3738</a>] </li> <li>system: <ul> <li>lock &amp; signals: don't interrupt at non terminal signals [<a href="https://github.com/ocaml/opam/pull/3541">#3541</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> <li>shell: fix fish manpath setting [<a href="https://github.com/ocaml/opam/pull/3728">#3728</a> <a 
href="https://github.com/gregory">@gregory-nisbet</a>] </li> <li>git: use <code>diff.noprefix=false</code> config argument to overwrite user defined configuration [<a href="https://github.com/ocaml/opam/pull/3788">#3788</a> <a href="https://github.com/rjbou">@rjbou</a>, <a href="https://github.com/ocaml/opam/pull/3628">#3628</a> <a href="https://github.com/Blaisorblade">@Blaisorblade</a> - fix <a href="https://github.com/ocaml/opam/issues/3627">#3627</a>] </li> </ul> </li> <li>dirtrack: fix precise tracking mode [<a href="https://github.com/ocaml/opam/pull/3796">#3796</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> <li>fix some mispellings [<a href="https://github.com/ocaml/opam/pull/3731">#3731</a> <a href="https://github.com/MisterDA">@MisterDA</a>] </li> <li>CI enhancement &amp; fixes [<a href="https://github.com/ocaml/opam/pull/3706">#3706</a> <a href="https://github.com/dra27">@dra27</a>, <a href="https://github.com/ocaml/opam/pull/3748">#3748</a> <a href="https://github.com/rjbou">@rjbou</a>, <a href="https://github.com/ocaml/opam/pull/3801">#3801</a> <a href="https://github.com/rjbou">@rjbou</a>] </li> </ul> <blockquote> <p>Note: To homogenise macOS name on system detection, we decided to keep <code>macos</code>, and convert <code>darwin</code> to <code>macos</code> in opam. For the moment, to not break jobs &amp; CIs, we keep uploading <code>darwin</code> &amp; <code>macos</code> binaries, but from the 2.1.0 release, only <code>macos</code> ones will be kept.</p> </blockquote> <hr /> <p>Installation instructions (unchanged):</p> <ol> <li>From binaries: run </li> </ol> <pre><code class="language-shell-session">sh &lt;(curl -sL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.0.4">the Github &quot;Releases&quot; page</a> to your PATH. In this case, don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update you sandbox script.</p> <ol start="2"> <li>From source, using opam: </li> </ol> <pre><code class="language-shell-session">opam update; opam install opam-devel </code></pre> <p>(then copy the opam binary to your PATH as explained, and don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update you sandbox script)</p> <ol start="3"> <li>From source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.0.4#compiling-this-repo">README</a>. </li> </ol> <p>We hope you enjoy this new minor version, and remain open to <a href="https://github.com/ocaml/opam/issues">bug reports</a> and <a href="https://github.com/ocaml/opam/issues">suggestions</a>.</p> <blockquote> <p>NOTE: this article is cross-posted on <a href="https://opam.ocaml.org/blog/">opam.ocaml.org</a> and <a href="/blog">ocamlpro.com</a>.</p> </blockquote> opam 2.0 tips https://ocamlpro.com/blog/2019_03_12_opam_2.0_tips 2019-03-12T13:48:57Z 2019-03-12T13:48:57Z Louis Gesbert This blog post looks back on some of the improvements in opam 2.0, and gives tips on the new workflows available. Package development environment management Opam 2.0 has been vastly improved to handle locally defined packages. Assuming you have a project ~/projects/foo, defining two packages foo-lib... 
<p>This blog post looks back on some of the improvements in opam 2.0, and gives tips on the new workflows available.</p> <h2>Package development environment management</h2> <p>Opam 2.0 has been vastly improved to handle locally defined packages. Assuming you have a project <code>~/projects/foo</code>, defining two packages <code>foo-lib</code> and <code>foo-bin</code>, you would have:</p> <pre><code class="language-shell-session">~/projects/foo |-- foo-lib.opam |-- foo-bin.opam `-- src/ ... </code></pre> <p>(See also about <a href="../opam-extended-dependencies/#Computed-versions">computed dependency constraints</a> for handling multiple package definitions with mutual constraints)</p> <h3>Automatic pinning</h3> <p>The underlying mechanism is the same, but this is an interface improvement that replaces most of the opam 1.2 workflows based on <code>opam pin</code>.</p> <p>The usual commands (<code>install</code>, <code>upgrade</code>, <code>remove</code>, etc.) have been extended to support specifying a directory as argument. So when working on project <code>foo</code>, just write:</p> <pre><code class="language-shell-session">cd ~/projects/foo opam install . </code></pre> <p>and both <code>foo-lib</code> and <code>foo-bin</code> will get automatically pinned to the current directory (using git if your project is versioned), and installed. You may prefer to use:</p> <pre><code class="language-shell-session">opam install . --deps-only </code></pre> <p>to just get the package dependencies ready before you start hacking on it. <a href="#Reproducing-build-environments">See below</a> for details on how to reproduce a build environment more precisely. Note that <code>opam depext .</code> will not work at the moment, which will be fixed in the next release when the external dependency handling is integrated (opam will still list you the proper packages to install for your OS upon failure).</p> <p>If your project is versioned and you made changes, remember to either commit, or add <code>--working-dir</code> so that your uncommitted changes are taken into account.</p> <h2>Local switches</h2> <blockquote> <p>Opam 2.0 introduced a new feature called &quot;local switches&quot;. This section explains what it is about, why, when and how to use them.</p> </blockquote> <p>Opam <em>switches</em> allow to maintain several separate development environments, each with its own set of packages installed. This is particularly useful when you need different OCaml versions, or for working on projects with different dependency sets.</p> <p>It can sometimes become tedious, though, to manage, or remember what switch to use with what project. Here is where &quot;local switches&quot; come in handy.</p> <h3>How local switches are handled</h3> <p>A local switch is simply stored inside a <code>_opam/</code> directory, and will be selected automatically by opam whenever your current directory is below its parent directory.</p> <blockquote> <p>NOTE: it's highly recommended that you enable the new <em>shell hooks</em> when using local switches. Just run <code>opam init --enable-shell-hook</code>: this will make sure your PATH is always set for the proper switch.</p> <p>You will otherwise need to keep remembering to run <code>eval $(opam env)</code> every time you <code>cd</code> to a directory containing a local switch. 
See also <a href="http://opam.ocaml.org/doc/Tricks.html#Display-the-current-quot-opam-switch-quot-in-the-prompt">how to display the current switch in your prompt</a></p> </blockquote> <p>For example, if you have <code>~/projects/foo/_opam</code>, the switch will be selected whenever you are in project <code>foo</code>, allowing you to tailor what it has installed for the needs of your project.</p> <p>If you remove the switch dir, or your whole project, opam will forget about it transparently. Be careful not to move it around, though, as some packages still contain hardcoded paths and don't handle relocation well (we're working on that).</p> <h3>Creating a local switch</h3> <p>This can generally start with:</p> <pre><code class="language-shell-session">cd ~/projects/foo
opam switch create . --deps-only
</code></pre> <p>Local switch handles are just their paths, instead of raw names. Additionally, the above will detect package definitions present in <code>~/projects/foo</code>, pick a compatible version of OCaml (if you didn't explicitly mention any), and automatically install all the local package dependencies.</p> <p>Without <code>--deps-only</code>, the packages themselves would also get installed in the local switch.</p> <h3>Using an existing switch</h3> <p>If you just want an already existing switch to be selected automatically, without recompiling one for each project, you can use <code>opam switch link</code>:</p> <pre><code class="language-shell-session">cd ~/projects/bar
opam switch link 4.07.1
</code></pre> <p>will make sure that switch <code>4.07.1</code> is chosen whenever you are in project <code>bar</code>. You could even link to <code>../foo</code> here, to share <code>foo</code>'s local switch between the two projects.</p> <h2>Reproducing build environments</h2> <h4>Pinnings</h4> <p>If your package depends on development versions of some dependencies (e.g. you had to push a fix upstream), add to your opam file:</p> <pre><code class="language-shell-session">depends: [ &quot;some-package&quot; ] # Remember that pin-depends are depends too
pin-depends: [
  [ &quot;some-package.version&quot; &quot;git+https://gitfoo.com/blob.git#mybranch&quot; ]
]
</code></pre> <p>This will have no effect when your package is published in a repository, but when it gets pinned to its dev version, opam will first make sure to pin <code>some-package</code> to the given URL.</p> <h4>Lock-files</h4> <p>Dependency constraints are sometimes too wide, and you don't want to explore all the versions of your dependencies while developing. For this reason, you may want to reproduce a known-working set of dependencies. If you use:</p> <pre><code class="language-shell-session">opam lock .
</code></pre> <p>opam will check which versions of the dependencies are installed in your current switch, and write them explicitly in <code>*.opam.locked</code> files. <code>opam lock</code> is a plugin at the moment, but will get automatically installed when needed.</p> <p>Then, assuming you checked these files into version control, any user can do</p> <pre><code class="language-shell-session">opam install . --deps-only --locked
</code></pre> <p>to instruct opam to reproduce the same build environment (the <code>--locked</code> option is also available to <code>opam switch create</code>, to make things easier).</p> <p>The generated lock-files will also contain added constraints to reproduce the presence/absence of optional dependencies, and reproduce the appropriate dependency pins using <code>pin-depends</code>. 
Add the <code>--direct-only</code> option if you don't want to enforce the versions of all recursive dependencies, but only direct ones.</p> Release : Liquidity version 1.0 ! https://ocamlpro.com/blog/2019_03_09_release_liquidity_v1_smart_contracts_language 2019-03-09T13:48:57Z 2019-03-09T13:48:57Z Çagdas Bozman Nous sommes fiers d'annoncer la release de la première version majeure de Liquidity, le langage de smart contracts et son outillage. Parmi les fonctions phares : multiples points d'entrée, système de contrats modulaire, polymorphisme et inférence de type, syntaxe ReasonML pour une plus grande ad... <p>Nous sommes fiers d'annoncer la release de la première version majeure de Liquidity, le langage de smart contracts et son outillage. Parmi les fonctions phares : multiples points d'entrée, système de contrats modulaire, polymorphisme et inférence de type, syntaxe ReasonML pour une plus grande adoption, etc.</p> <p>Voir <a href="/blog/2019_03_08_announcing_liquidity_version_1_0">cet article !</a></p> Announcing Liquidity version 1.0 https://ocamlpro.com/blog/2019_03_08_announcing_liquidity_version_1_0 2019-03-08T13:48:57Z 2019-03-08T13:48:57Z Alain Mebsout Liquidity version 1.0 We are pleased to announce the release of the first major version of the Liquidity smart-contract language and associated tools. Some of the highlights of this version are detailed below. Multiple Entry Points In the previous versions of Liquidity, smart contracts were limited ... <h1>Liquidity version 1.0</h1> <p>We are pleased to announce the release of the first major version of the Liquidity smart-contract language and associated tools.</p> <p>Some of the highlights of this version are detailed below.</p> <h3>Multiple Entry Points</h3> <p>In the previous versions of Liquidity, smart contracts were limited to a single entry point (named <code>main</code>). But in practice, a smart contract's execution paths depend strongly on its parameter, and in most cases they are completely distinct.</p> <p>Having different entry points makes it possible to separate pieces of code that do not overlap and that usually accomplish vastly different tasks. Previously, encoding entry points with complex pattern-matching constructs was tedious and made the code less readable. This new feature improves readability and lets you call contracts in a natural way.</p> <p>Internally, entry points are encoded with sum types and pattern matching, so that you keep the strong typing guarantees that carry over from Michelson. This means that you cannot call a typed smart contract with the wrong entry point or the wrong parameter (this is enforced statically by both the Liquidity typechecker and the Michelson typechecker).</p> <h3>Modules and Contract System</h3> <p>Organizing, encapsulating and sharing code is not always easy when you need to write thousand-line files. Liquidity now lets you write modules (which contain types and values/functions) and contracts (which additionally define entry points).</p>
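<p>As a rough illustration (a minimal sketch of ours, not taken from the original announcement), a contract is essentially a collection of entry points, each with its own name and parameter type:</p> <pre><code class="language-ocaml">type storage = int

(* Two entry points instead of a single [main]; each one can be
   called directly with its own parameter. *)
let%entry add (x : int) storage =
  ([] : operation list), storage + x

let%entry sub (x : int) storage =
  ([] : operation list), storage - x
</code></pre>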
<p>Types and non-private values of contracts and modules in scope can be accessed by other modules and contracts.</p> <p>You can even compile several files at once with the command line compiler, so that you can organize your smart contract projects into libraries and files.</p> <h3>Polymorphism and Type Inference</h3> <p>Thanks to a new and powerful type inference algorithm, you can now get rid of almost all type annotations in the smart contracts.</p> <p>Instead of writing something like</p> <pre><code class="language-ocaml">let%entry main (parameter : bool) (storage : int) =
  let ops = ([] : operation list) in
  let f (c : bool) = if not c then 1 else 2 in
  ops, f parameter
</code></pre> <p>you can now write</p> <pre><code class="language-ocaml">let%entry main parameter _ =
  let ops = [] in
  let f c = if not c then 1 else 2 in
  ops, f parameter
</code></pre> <p>And type inference works with polymorphism (also a new feature of this release), so you can now write generic and reusable functions:</p> <pre><code class="language-ocaml">type 'a t = { x : 'a set; y : 'a }

let mem_t v = Set.mem v.y v.x
</code></pre> <p>Inference also works with contract types and entry points.</p> <h3>ReasonML Syntax</h3> <p>We originally used a modified version of the OCaml syntax for the Liquidity language. This made the language accessible, almost for free, to all OCaml and functional language developers. The typing discipline one needs is quite similar to that of other strongly typed functional languages, so this was a natural fit.</p> <p>However, this is not the best fit for everyone. We want to bring the power of Liquidity and Tezos to the masses, so adopting a syntax that looks familiar to most people can help a lot. With this new version of Liquidity, you can now write your smart contracts in either an OCaml-like syntax or a <a href="https://reasonml.github.io">ReasonML</a>-like one. The latter is a lot closer to JavaScript on the surface, making it accessible to people who already know that language or who write smart contracts for other platforms like Solidity/Ethereum.</p> <p>You can see the full changelog as well as download the latest release and binaries <a href="https://github.com/OCamlPro/liquidity/releases">at this address</a>.</p> <p>Don't forget that you can also try all these new cool features and more directly in your browser with our <a href="https://www.liquidity-lang.org/edit/">online editor</a>.</p> Release de Techelson, moteur de tests pour Michelson et Liquidity https://ocamlpro.com/blog/2019_03_07_fr_release_de_techelson_moteur_de_tests_pour_michelson_et_liquidity 2019-03-07T13:48:57Z 2019-03-07T13:48:57Z Adrien Champion Nous sommes fiers d'annoncer la première release de Techelson, moteur d'exécution de tests pour Michelson. Les programmeurs Liquidity peuvent également l'utiliser. Voir Techelson, a test execution engine for Michelson.... <p>Nous sommes fiers d'annoncer la première release de Techelson, moteur d'exécution de tests pour Michelson. Les programmeurs Liquidity peuvent également l'utiliser.</p> <p>Voir <a href="/2019/03/05/techelson-a-test-execution-engine-for-michelson/">Techelson, a test execution engine for Michelson</a>.</p> Techelson, a test execution engine for Michelson https://ocamlpro.com/blog/2019_03_06_techelson_a_test_execution_engine_for_michelson 2019-03-06T13:48:57Z 2019-03-06T13:48:57Z Adrien Champion We are pleased to announce the first release of Techelson, available here. Techelson is a Test Execution Engine for Michelson. 
It aims at testing functional properties of Michelson smart contracts. Make sure to check the user documentation to get a sense of Techelson's workflow and features. For Liq... <p>We are pleased to announce the first release of <a href="https://ocamlpro.github.io/techelson/">Techelson,</a> available <a href="https://github.com/OCamlPro/techelson/releases/tag/v0.7.0">here</a>.</p> <p>Techelson is a Test Execution Engine for Michelson. It aims at testing functional properties of Michelson smart contracts. Make sure to check the <a href="https://ocamlpro.github.io/techelson/user_doc">user documentation</a> to get a sense of Techelson's workflow and features.</p> <p>For Liquidity programmers interested in Techelson, take a look at <a href="https://adrienchampion.github.io/blog/tezos/techelson/with_liquidity/index.html">this blog post</a> discussing how to write tests in Liquidity and run them using Techelson.</p> <p>Techelson is still young: if you have problems, suggestions or feature requests please <a href="https://github.com/OCamlPro/techelson/issues">open an issue on the repository</a>.</p> Signing Data for Smart Contracts https://ocamlpro.com/blog/2019_03_05_signing_data_for_smart_contracts 2019-03-05T13:48:57Z 2019-03-05T13:48:57Z Çagdas Bozman Smart contracts calls already provide a built-in authentication mechanism as transactions (i.e. call operations) are cryptographically signed by the sender of the transaction. This is a guarantee on which programs can rely. However, sometimes you may want more involved or flexible authentication sch... <p>Smart contracts calls already provide a built-in authentication mechanism as transactions (i.e. call operations) are cryptographically signed by the sender of the transaction. This is a guarantee on which programs can rely.</p> <p>However, sometimes you may want more involved or flexible authentication schemes. The ones that rely on signature validity checking can be implemented in Michelson, and Liquidity provide a built-in instruction to do so. (You still need to keep in mind that you cannot store unencrypted confidential information on the blockchain).</p> <p>This instruction is <code>Crypto.check</code> in Liquidity. Its type can be written as:</p> <pre><code class="language-ocaml">Crypto.check: key -&gt; signature -&gt; bytes -&gt; bool </code></pre> <p>Which means that it takes as arguments a public key, a signature and a sequence of bytes and returns a Boolean. 
<code>Crypto.check pub_key signature message</code> is <code>true</code> if and only if the signature <code>signature</code> was obtained by signing the Blake2b hash of <code>message</code> using the private key corresponding to the public key <code>pub_key</code>.</p> <p>A small smart contract snippet which implements a signature check (against a predefined public key kept in the smart contract's storage) <a href="https://liquidity-lang.org/edit?source=type+storage+%3D+key%0A%0Alet%25entry+main+%28%28message+%3A+string%29%2C+%28signature+%3A+signature%29%29+key+%3D%0A++let+bytes+%3D+Bytes.pack+message+in%0A++if+not+%28Crypto.check+key+signature+bytes%29+then%0A++++failwith+%22Wrong+signature%22%3B%0A++%28%5B%5D+%3A+operation+list%29%2C+key%0A">can be tested online here.</a></p> <pre><code class="language-ocaml">type storage = key let%entry main ((message : string), (signature : signature)) key = let bytes = Bytes.pack message in if not (Crypto.check key signature bytes) then failwith &quot;Wrong signature&quot;; ([] : operation list), key </code></pre> <p>This smart contract fails if the string <code>message</code> was not signed with the private key corresponding to the public key <code>key</code> stored. Otherwise it does nothing.</p> <p>This signature scheme is more flexible than the default transaction/sender one, however it requires that the signature can be built outside of the smart contract. (And more generally outside of the toolset provided by Liquidity and Tezos). On the other hand, signing a transaction is something you get for free if you use the tezos client or any tezos wallet (as is it essentially their base function).</p> <p>The rest of this blog post will focus on various ways to sign data, and on getting signatures that can be used in Tezos and Liquidity directly.</p> <h3>Signing Using the Tezos Client</h3> <p>One (straightforward) way to sign data is to use the Tezos client directly. You will need to be connected to a Tezos node though as the client makes RPCs to serialize data (this operation is protocol dependent). We can only sign sequences of bytes, so the first thing we need to do is to serialize whichever data we want to sign. This can be done with the command <code>hash data</code> of the client.</p> <pre><code class="language-shell-session">$ ./tezos-client -A alphanet-node.tzscan.io -P 80 hash data '&quot;message&quot;' of type string Raw packed data: 0x0501000000076d657373616765 Hash: exprtXaZciTDGatZkoFEjE1GWPqbJ7FtqAWmmH36doxBreKr6ADcYs Raw Blake2b hash: 0x01978930fd2d04d0db8c2e4ef8a3f5d63b8e732177c8723135ed0dc7d99ebed3 Raw Sha256 hash: 0x32569319f6517036949bcead23a761bfbfcbf4277b010355884a86ba09349839 Raw Sha512 hash: 0xdfa4ea9f77db3a98654f101be1d33d56898df40acf7c2950ca6f742140668a67fefbefb22b592344922e1f66c381fa2bec48aa47970025c7e61e35d939ae3ca0 Gas remaining: 399918 units remaining </code></pre> <p>This command gives the result of hashing the data using various algorithms but what we're really interested in is the first item <code>Raw packed data</code> which is the serialized version of our data (<code>&quot;message&quot;</code>) : <code>0x0501000000076d657373616765</code>.</p> <p>We can now sign these bytes using the Tezos client as well. This step can be performed completely offline, for that we need to use the option <code>-p</code> of the client to specify the protocol we want to use (the <code>sign bytes</code> command will not be available without first selecting a valid protocol). 
Here we use protocol 3, designated by its hash <code>PsddFKi3</code>.</p> <pre><code class="language-shell-session">$ ./tezos-client -p PsddFKi3 sign bytes 0x0501000000076d657373616765 for my_account Signature: edsigto9QHtXMyxFPyvaffRfFCrifkw2n5ZWqMxhGRzieksTo8AQAFgUjx7WRwqGPh4rXTBGGLpdmhskAaEauMrtM82T3tuxoi8 </code></pre> <p>The account <code>my_account</code> can be any imported account in the Tezos client. In particular, it can be an encrypted key pair (you will need to enter a password to sign) or a hardware Ledger (you will need to confirm the signature on the Ledger). The obtained signature can be used as is with Liquidity or Michelson. This one starts with <code>edsig</code> because it was obtained using an Ed25519 private key, but you can also get signatures starting with <code>spsig1</code> or <code>p2sig</code> depending on the cryptographic curve that you use.</p> <h3>Signing Manually</h3> <p>In this second section we detail the necessary steps and provide a Python script to sign string messages using an Ed25519 private key. This can be easily adapted for other signing schemes.</p> <p>These are the steps that will need to be performed in order to sign a string:</p> <ul> <li>Assuming that the value you want to sign is a string, you first need to convert its ASCII version to hexa, for the string <code>&quot;message&quot;</code> that is <code>6d657373616765</code>. </li> <li>You need to produce the packed version of the corresponding Michelson expression. The binary representation can vary depending on the types of the values you want to pack but for strings it is: </li> </ul> <pre><code class="language-michelson">| 0x | 0501 | [size of the string on 4 bytes] | [ascii string in hexa] | </code></pre> <p>for <code>&quot;message&quot;</code> (of length 7), it is</p> <pre><code class="language-michelson">| 0x | 0501 | 00000007 | 6d657373616765 | </code></pre> <p>or <code>0x0501000000076d657373616765</code>.</p> <ul> <li>Hash this value using <a href="https://en.wikipedia.org/wiki/BLAKE_(hash_function)">Blake2b</a> (<code>01978930fd2d04d0db8c2e4ef8a3f5d63b8e732177c8723135ed0dc7d99ebed3</code>) which is 32 bytes long. </li> <li>Depending on your public key, you then need to sign it with the corresponding curve (ed25519 for edpk keys), the signature is 64 bytes: </li> </ul> <pre><code class="language-michelson">753e013b8515a7d47eaa5424de5efa2f56620ac8be29d08a6952ae414256eac44b8db71f74600275662c8b0c226f3280e9d24e70a5fa83015636b98059b5180c </code></pre> <ul> <li>Optionally convert to base58check. This is not needed because Liquidity and Michelson allow signatures (as well as keys and key hashes) to be given in hex format with a 0x: </li> </ul> <pre><code class="language-michelson">0x753e013b8515a7d47eaa5424de5efa2f56620ac8be29d08a6952ae414256eac44b8db71f74600275662c8b0c226f3280e9d24e70a5fa83015636b98059b5180c </code></pre> <p>The following Python (3) script will do exactly this, entirely offline. Note that this is just an toy example, and should not be used in production. 
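</p> <p>As a side note, the packing step described in the list above (the <code>0x0501</code> prefix, the length on 4 big-endian bytes, then the raw bytes of the string) can also be written down directly; here is a rough OCaml sketch, with a helper name of our own choosing:</p> <pre><code class="language-ocaml">(* Pack a string as described above: 0x05 0x01, then the length on
   4 bytes (big-endian), then the raw bytes of the string. *)
let pack_string (s : string) : string =
  let len = String.length s in
  let buf = Buffer.create (len + 6) in
  Buffer.add_string buf &quot;\x05\x01&quot;;
  for shift = 3 downto 0 do
    Buffer.add_char buf (Char.chr ((len lsr (8 * shift)) land 0xff))
  done;
  Buffer.add_string buf s;
  Buffer.contents buf

(* pack_string &quot;message&quot; gives the bytes 0x0501000000076d657373616765,
   as in the example above. *)
</code></pre> <p>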
Note also that the script takes your private key on the command line, so this might not be secure if the machine you run it on is not secure.</p> <pre><code class="language-shell-session">$ pip3 install base58check pyblake2 ed25519
&gt; python3 ./sign_string.py &quot;message&quot; edsk2gL9deG8idefWJJWNNtKXeszWR4FrEdNFM5622t1PkzH66oH3r
0x753e013b8515a7d47eaa5424de5efa2f56620ac8be29d08a6952ae414256eac44b8db71f74600275662c8b0c226f3280e9d24e70a5fa83015636b98059b5180c
</code></pre> <h4><code>sign_string.py</code></h4> <pre><code class="language-python">from pyblake2 import blake2b
import base58check
import ed25519
import sys

message = sys.argv[1]
seed_b58 = sys.argv[2]

# 0x0501: prefix for a packed Michelson string (see the byte layout above)
prefix = b'\x05\x01'
len_bytes = (len(message)).to_bytes(4, byteorder='big')

# Blake2b hash (32 bytes) of the packed data
h = blake2b(digest_size=32)
b = bytearray()
b.extend(message.encode())
h.update(prefix + len_bytes + b)
digest = h.digest()

# Decode the base58check-encoded seed and sign the digest with Ed25519
seed = base58check.b58decode(seed_b58)[4:-4]
sk = ed25519.SigningKey(seed)
sig = sk.sign(digest)

print(&quot;0x&quot; + sig.hex())
</code></pre> What's new for Alt-Ergo in 2018? Here is a recap! https://ocamlpro.com/blog/2019_02_11_whats_new_for_alt_ergo_in_2018_here_is_a_recap 2019-02-11T13:48:57Z 2019-02-11T13:48:57Z Mohamed Iguernlala After the hard work done on the integration of floating-point arithmetic reasoning two years ago, 2018 is the year of polymorphic SMT2 support and efficient SAT solving for Alt-Ergo. In this post, we recap the main novelties last year, and we announce the first Alt-Ergo Users’ Club meeting. An SMT... <p>After the hard work done on the integration of floating-point arithmetic reasoning two years ago, 2018 is the year of polymorphic SMT2 support and efficient SAT solving for Alt-Ergo. In this post, we recap the main novelties of last year, and we announce the first Alt-Ergo Users’ Club meeting.</p> <h2>An SMT2 front-end with prenex polymorphism</h2> <p>As you may know, Alt-Ergo’s native input language is not compliant with the SMT-LIB 2 input language standard, and translating formulas from SMT-LIB 2 to Alt-Ergo’s syntax (or vice-versa) is not immediate. Besides its extension with polymorphism, this native language diverges from SMT-LIB’s by distinguishing terms of type <code>boolean</code> from formulas (that are <code>propositions</code>). This distinction makes it hard, for instance, to efficiently translate <code>let-in</code> and <code>if-then-else</code> constructs that are ubiquitous in SMT-LIB 2 benchmarks.</p> <p>In order to work closely with the SMT community, we designed a conservative extension of the SMT-LIB 2 standard with <code>prenex polymorphism</code> and implemented it as a new frontend in Alt-Ergo 2.2. This work has been published in the 2018 edition of the SMT-Workshop. An online version of the paper is <a href="https://hal.inria.fr/hal-01960203">available here</a>. Experimental results showed that polymorphism is really important for Alt-Ergo, as it improves both resolution rate and resolution time (see Figure 5 in the paper for more details).</p> <h2>Improved SAT solvers</h2> <p>We also worked on improving SAT-solving in Alt-Ergo last year. The main direction towards this goal was to extend our CDCL-based SAT solver to mimic some desired behaviors of the native Tableaux-like SAT engine. Generally speaking, this allows a better management of the context during proof search, which prevents overwhelming the theories and instantiation engines with useless facts. 
A comparison of this solver with Alt-Ergo’s old Tableaux-like solver is also done in our SMT-Workshop paper.</p> <h2>SMT-Comp and SMT-Workshop 2018</h2> <p>As emphasized above, we published our work regarding polymorphic SMT2 and SAT solving in SMT-Workshop 2018. More generally, this was an occasion for us to write the first tool paper about Alt-Ergo, and to highlight the main features that make it different from other state-of-the-art SMT solvers like CVC4, Z3 or Yices.</p> <p>Thanks to our new SMT2 frontend, we were able to participate in the SMT-Competition last year. Naturally, we selected categories that are close to “deductive program verification”, as Alt-Ergo is primarily tuned for formulas coming from this application domain.</p> <p>Although Alt-Ergo <a href="http://smtcomp.sourceforge.net/2018/results-summary.shtml?v=1531410683">did not rank first</a>, it was a positive experience and this encourages us to go ahead. Note that Alt-Ergo’s brother, Ctrl-Ergo, was not far from winning <a href="http://smtcomp.sourceforge.net/2018/results-QF_LIA.shtml">the QF-LIA category</a> of the competition. This performance is partly due to the improvements in the CDCL SAT solver that were also integrated in Ctrl-Ergo.</p> <h2>Alt-Ergo for Atelier-B</h2> <p><a href="https://www.atelierb.eu/en/">Atelier-B</a> is a framework that allows developing formally verified software using <a href="https://www.methode-b.com/en/b-method/">the B Method</a>. The framework rests on an automatic reasoner that discharges thousands of mathematical formulas extracted from B models. If a formula is not discharged automatically, it is proved interactively. <a href="https://www.clearsy.com/en/">ClearSy</a> (the company behind the development of Atelier-B) has recently added a new backend to produce verification conditions in Why3’s logic, in order to target more automatic provers and increase the automation rate. For certifiability reasons, we extended Alt-Ergo with a new frontend that is able to directly parse these verification conditions without relying on Why3.</p> <h2>Improved hash-consed data-structures</h2> <p>As said above, Alt-Ergo makes a clear distinction between Boolean terms and Propositions. This distinction prevents us from doing some rewriting and simplifications, in particular on expressions involving <code>let-in</code> and <code>if-then-else</code> constructs. This is why we decided to merge <code>Term</code>, <code>Literal</code>, and <code>Formula</code> into a new <code>Expr</code> data-structure, and remove this distinction. This allowed us to implement some additional simplification steps, and we immediately noticed performance improvements, in particular on SMT2 benchmarks. For instance, Alt-Ergo 2.3 proves 19548 formulas of the AUFLIRA category in ~350 minutes, while version 2.2 proves 19535 formulas in ~1450 minutes (the time limit was set to 20 minutes per formula).</p> <h2>Towards the integration of algebraic datatypes</h2> <p>Last Autumn, we also started working on the integration of algebraic datatypes reasoning in Alt-Ergo. In this first iteration, we extended Alt-Ergo’s native language to be able to declare (mutually recursive) algebraic datatypes, to write expressions with pattern matching, to handle selectors, … We then extended the typechecker accordingly and implemented a (not that) basic theory reasoner. Of course, we also handle SMT2’s algebraic datatypes. 
Here is an example in Alt-Ergo’s native syntax:</p> <pre><code class="language-OCaml">type ('a, 'b) t = A of {a_1 : 'a} | B of {b_11 : 'a ; b12 : 'b} | C | D | E

logic e : (int, real) t
logic n : int

axiom ax_n : n &gt;= 9

axiom ax_e: e = A(n) or e = B(n*n, 0.) or e = E

goal g:
  match e with
  | A(u) -&gt; u &gt;= 8
  | B (u,v) -&gt; u &gt;= 80 and v = 0.
  | E -&gt; true
  | _ -&gt; false
  end
  and 3 &lt;= 2+2
</code></pre> <h2>What is planned in 2019 and beyond: the Alt-Ergo Users’ Club is born!</h2> <p>In 2018, we welcomed a lot of new engineers with a background in formal methods: Steven (De Oliveira) holds a PhD in formal verification from the Paris-Saclay University and the French Atomic Energy Commission (CEA). He has a master’s degree in cryptography and worked in the Frama-C team, developing open-source tools for verifying C programs. David (Declerck) obtained a PhD from Université Paris-Saclay in 2018, during which he extended the Cubicle model checker to support weak memory models and wrote a compiler from a subset of the x86 assembly language to Cubicle. Guillaume (Bury) holds a PhD from Université Sorbonne Paris Cité. He studied the integration of rewriting techniques inside SMT solvers. Albin (Coquereau) is working as a PhD student between OCamlPro, LRI and ENSTA, focusing on improving the Alt-Ergo SMT solver. Adrien is interested in verification of safety properties over software and embedded systems. He worked on higher-order functional program verification at the University of Tokyo, and on the Kind 2 model checker at the University of Iowa. All these people will consolidate the department of formal methods at OCamlPro, which will be beneficial for Alt-Ergo.</p> <p>In 2019, we just launched the Alt-Ergo Users’ Club, in order to get closer to our users, collect their needs, and integrate them into the Alt-Ergo roadmap, but also to ensure sustainable funding for the development of the project. We are happy to announce that the very first member of the Club is <a href="https://www.adacore.com">AdaCore</a>, very soon to be followed by <a href="https://trust-in-soft.com">TrustInSoft</a> and <a href="http://www-list.cea.fr/en/">CEA List</a>. Thanks for your early support!</p> <blockquote> <p>Interested in joining? Contact us: contact@ocamlpro.com</p> </blockquote> Optimizing storage in Tezos: a test branch on Gitlab https://ocamlpro.com/blog/2019_02_05_fr_optimisation_du_stockage_dans_tezos_une_branche_de_test_sur_gitlab 2019-02-05T13:48:57Z 2019-02-05T13:48:57Z Fabrice Le Fessant This third article about improving storage in Tezos follows the announcement of the availability of a docker image for beta testers who want to try our storage system and garbage collector. See Improving Tezos Storage: Gitlab branch for testers... <p>This third article about improving storage in Tezos follows the announcement of the availability of a docker image for beta testers who want to try our storage system and garbage collector.</p> <p>See <a href="/2019/02/04/improving-tezos-storage-gitlab-branch-for-testers/">Improving Tezos Storage : Gitlab branch for testers</a></p> Improving Tezos Storage : Gitlab branch for testers https://ocamlpro.com/blog/2019_02_04_improving_tezos_storage_gitlab_branch_for_testers 2019-02-04T13:48:57Z 2019-02-04T13:48:57Z Fabrice Le Fessant This article is the third post of a series of posts on improving Tezos storage.
In our previous post, we announced the availability of a docker image for beta testers, wanting to test our storage and garbage collector. Today, we are glad to announce that we rebased our code on the latest version of ... <p>This article is the third post of a series of posts on improving Tezos storage. In <a href="http://ocamlpro.com/2019/01/30/improving-tezos-storage-update-and-beta-testing/">our previous post</a>, we announced the availability of a docker image for beta testers, wanting to test our storage and garbage collector. Today, we are glad to announce that we rebased our code on the latest version of <code>mainnet-staging</code>, and pushed a branch <code>mainnet-staging-irontez</code> on our <a href="https://gitlab.com/tzscan/tezos/commits/mainnet-staging-irontez">public Gitlab repository</a>.</p> <p>The only difference with the previous post is a change in the name of the RPCs : <code>/storage/context/gc</code> will trigger a garbage collection (and terminate the node afterwards) and <code>/storage/context/revert</code> will migrate the database back to Irmin (and terminate the node afterwards).</p> <p>Enjoy and send us feedback !!</p> <h1>Comments</h1> <p>AppaDude (10 February 2019 at 15 h 12 min):</p> <blockquote> <p>I must be missing something. I compiled and issued the required rpc trigger:</p> <p>/storage/context/gc with the command</p> <p>~/tezos/tezos-client rpc get /storage/context/gc But I just got an empty JSON response of {} and the size of the .tezos-node folder is unchanged. Any advice is much appreciated. Thank you!</p> </blockquote> <p>Fabrice Le Fessant (10 February 2019 at 15 h 47 min):</p> <blockquote> <p>By default, garbage collection will keep 9 cycles of blocks (~36000 blocks). If you have fewer blocks, or if you are using Irontez on a former Tezos database, and fewer than 9 cycles have been stored in Irontez, nothing will happen. If you want to force a garbage collection, you should tell Irontez to keep fewer block (but more than 100, that’s the minimum that we enforce):</p> <p>~/tezos/tezos-client rpc get ‘/storage/context/gc?keep=120’</p> <p>should trigger a GC if the node has been running on Irontez for at least 2 hours.</p> </blockquote> <p>AppaDude (10 February 2019 at 16 h 04 min):</p> <blockquote> <p>I think it did work. I was confused because the total disk space for the .tezos-node folder remained unchanged. Upon closer inspection, I see these contents and sizes:</p> <p>These are the contents of .tezos-node, can I safely delete context.backup?</p> <p>4.0K config.json 269M context 75G context.backup 4.0K identity.json 4.0K lock 1.4M peers.json 5.4G store 4.0K version.json</p> </blockquote> <blockquote> <p>Is it safe to delete context.backup if I do not plan to revert? (/storage/context/revert)</p> </blockquote> <p>Fabrice Le Fessant (10 February 2019 at 20 h 51 min):</p> <blockquote> <p>Yes, normally. Don’t forget it is still under beta-testing…</p> <p>Note that <code>/storage/context/revert</code> works even if you remove <code>context.backup</code>.</p> </blockquote> <p>Jack (23 February 2019 at 0 h 24 min):</p> <blockquote> <p>Have there been any issues reported with missing endorsements or missing bakings with this patch? We have been using this gc version (https://gitlab.com/tezos/tezos/merge_requests/720) for the past month and ever since we switched we have been missing endorsements and missing bakings. 
The disk space savings is amazing, but if we keep missing ends/bakes, it’s going to hurt our reputation as a baking service.</p> </blockquote> <p>Fabrice Le Fessant (23 February 2019 at 6 h 58 min):</p> <blockquote> <p>Hi,</p> <p>I am not sure what you are asking for. Are you using our version (https://gitlab.com/tzscan/tezos/commits/mainnet-staging-irontez), or the one on the Tezos repository? Our version is very different, so if you are using the other one, you should contact them directly on the merge request. On our version, we got a report last week, and the branch has been fixed immediately (but not yet the docker images, that should be done in the next days).</p> </blockquote> <p>Jack (25 February 2019 at 15 h 53 min):</p> <blockquote> <p>I was using the 720MR and experiencing issues with baking/endorsing. I understand that 720MR and IronTez are different. I was simply asking if your version has had any reports of baking/endorsing troubles.</p> </blockquote> <p>Jack (25 February 2019 at 15 h 51 min):</p> <blockquote> <p>Is there no way to convert a “standard node” to IronTez? I was running the official tezos-node, and my datadir is around 90G. I compiled IronTez and started it up on that same dir, then ran <code>rpc get /storage/context/gc</code> and nothing is happening. I thought this was supposed to convert my datadir to irontez? If not, what is the RPC to do this? Or must I start from scratch to be 100% irontez?</p> </blockquote> <p>Fabrice Le Fessant (25 February 2019 at 16 h 24 min):</p> <blockquote> <p>There are two ways to get a full Irontez DB:</p> <ul> <li>Start a node from scratch and wait for one or two days… </li> <li>Use an existing node, run Irontez on it for 2 hours, and then call <code>rpc get /storage/context/gc?keep=100</code>. 100 is the number of blocks to be kept. After 2 hours, the last 120 blocks should be stored in the IronTez DB, so the old DB will not be used anymore. Note that Irontez will not delete the old DB, just rename it. You should go there and remove the file to recover the disk space. </li> </ul> </blockquote> <p>Jack (27 February 2019 at 1 h 24 min):</p> <blockquote> <p>Where do we send feedback/get help? Email? Slack? Reddit?</p> </blockquote> <p>Banjo E. (3 March 2019 at 2 h 40 min):</p> <blockquote> <p>There is a major problem for bakers who want to use the irontez branch. After garbage collection, the baker application will not start because the baker requests an RPC call for the genesis block information. That genesis block information is gone after the garbage collection. Please address this issue soon. Thank you!</p> </blockquote> <p>Fabrice Le Fessant (6 March 2019 at 21 h 44 min):</p> <blockquote> <p>I pushed a new branch with a tentative fix: https://gitlab.com/tzscan/tezos/tree/mainnet-staging-irontez-fix-genesis . Unfortunately, I could not test it (I am far away from work for two weeks), so feedback is really welcome, before pushing it in the irontez branch.</p> </blockquote> Tezos and OCamlPro https://ocamlpro.com/blog/2019_01_31_fr_tezos_et_ocamlpro 2019-01-31T13:48:57Z 2019-01-31T13:48:57Z Fabrice Le Fessant Today, Tezos is an open-source project, an international network developed by teams across more than five continents. In the genesis of the project, the French company OCamlPro, which still develops many Tezos-related projects today (TZscan, Liquidity, etc.), played a... <p>Today, Tezos is an open-source project, an international network developed by teams across more than five continents.
In the genesis of the project, the French company OCamlPro, which still develops many Tezos-related projects today (TZscan, Liquidity, etc.), played a particularly important role. It was indeed within OCamlPro that research engineers laid the first stones of the code base, in close collaboration with Arthur Breitman, the architect of the project, and DLS, over several years. Today, we are delighted to see how much the project has grown.</p> <p>Arthur and OCamlPro (joint publication)</p> Improving Tezos Storage : update and beta-testing https://ocamlpro.com/blog/2019_01_30_improving_tezos_storage_update_and_beta_testing 2019-01-30T13:48:57Z 2019-01-30T13:48:57Z Fabrice Le Fessant In a previous post, we presented some work that we did to improve the quantity of storage used by the Tezos node. Our post generated a lot of comments, in which upcoming features such as garbage collection and pruning were introduced. It also motivated us to keep working on this (hot) topic, and we ... <p>In a <a href="http://ocamlpro.com/2019/01/15/improving-tezos-storage/">previous post</a>, we presented some work that we did to improve the quantity of storage used by the Tezos node. Our post generated a lot of comments, in which upcoming features such as garbage collection and pruning were introduced. It also motivated us to keep working on this (hot) topic, and we present here our new results and current state. IronTez3 is a new version of our storage system, which we tested both on real traces and on real nodes. We implemented a garbage collector for it, which is triggered by an RPC on our node (we want the user to be able to choose when it happens, especially for bakers who might risk losing a baking slot), and automatically every 16 cycles in our traces.</p> <p>In the following graph, we present the size of the context database during a full trace execution (~278 000 blocks):</p> <p><img src="/blog/assets/img/plot_sizes-2.png" alt="plot_size-2.png" /></p> <p>There is definitely quite some improvement brought to the current Tezos implementation based on Irmin+LMDB, which we reimplemented as IronTez0. IronTez0 allows an IronTez node to read a database generated by the current Tezos and switch to the IronTez3 database. At the bottom of the graph, IronTez3 increases very slowly (about 7 GB at the end), and the garbage collector makes it even less expensive (about 2-3 GB at the end). Finally, we executed a trace where we switched from IronTez0 to IronTez3 at block 225 000. The graph shows that, after the switch, the size immediately grows much more slowly, and finally, after a garbage collection, the storage is reduced to what it would have been with IronTez3.</p> <p>Now, let’s compare the speed of the different storages:</p> <p><img src="/blog/assets/img/plot_times-2.png" alt="plot_times-2.png" /></p> <p>The graph shows that IronTez3 is about 4-5 times faster than Tezos/IronTez0. Garbage collections have an obvious impact on the speed, but it is clearly negligible compared to the current performance of Tezos. On the computer used for the traces, a Xeon with an SSD disk, the longest garbage collection takes between 1 and 2 minutes, even when the database was about 40 GB at the beginning.</p> <p>In the former post, we didn’t check the amount of memory used by our storage system.
It might be expected that the performance improvement could be associated with a more costly use of memory… but such is not the case:</p> <p><img src="/blog/assets/img/plot_mem.png" alt="plot_mem.png" /></p> <p>At the top of the graph is our IronTez0 implementation of the current storage: it uses a little more memory than the current Tezos implementation (about 6 GB), maybe because it shares data structures with IronTez3, with fields that are only used by IronTez3 and could be removed in a specialized version. IronTez3 and IronTez3 with garbage collection are at the bottom, using about 2 GB of memory. Somewhat surprisingly, the cost of garbage collections turns out to be very limited.</p> <p>On our current running node, we get the following storage:</p> <pre><code class="language-shell-session">$ du
1.4G ./context
4.9G ./store
6.3G .
</code></pre> <p>Now, if we use our new RPC to revert the node to Irmin (taking a little less than 8 minutes on our computer), we get:</p> <pre><code class="language-shell-session">$ du
14.3G ./context
4.9G ./store
19.2G .
</code></pre> <h2>Beta-Testing with Docker</h2> <p>If you are interested in these results, it is now possible to test our node: we created a docker image, similar to the ones of Tezos. It is available on Docker Hub (one image that works for both Mainnet and Alphanet). Our script mainnet.sh (http://tzscan.io/irontez/mainnet.sh) can be used similarly to the alphanet.sh script of Tezos to manage the container. It can be run on an existing Tezos database; it will switch it to IronTez3. Note that such a change is not irreversible; still, it might be a good idea to back up your Tezos node directory before, as (1) migrating back might take some time, (2) this is a beta-testing phase, meaning the code might still hide nasty bugs, and (3) the official node might introduce a new incompatible format.</p> <h2>New RPCs</h2> <p>Both of these RPCs will make the node TERMINATE once they have completed. You should restart the node afterwards.</p> <p>The RPC <code>/ocp/storage/gc</code> triggers a garbage collection. By default, this RPC will keep only the contexts from the last 9 cycles. It is possible to change this value by using the <code>?keep</code> argument to specify another number of contexts to keep (beware that if this value is too low, you might end up with a non-working Tezos node, so we have set a minimum value of 100). No garbage collection will happen if the oldest context to keep was stored in the Irmin database.</p> <p>The RPC <code>/ocp/storage/revert</code> triggers a migration of the database from IronTez3 back to Irmin. If you have been using IronTez for a while, and want to go back to the official node, this is the way. After calling this RPC, you should not run IronTez again; otherwise, it will restart using the IronTez3 format, and you will need to revert again. This operation can take a lot of time, depending on the quantity of data to move between the two formats. A small example script for calling these RPCs is shown at the end of this post.</p> <h2>Following Steps</h2> <p>We are now working with the team at Nomadic Labs to include our work in the public Tezos code base. We will inform you as soon as our Pull Request is ready for more testing! If all testing and review goes well, we hope it can be merged in the next release!</p>
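<p>As an aside (this snippet is not part of the original post), the RPCs described above can also be called from a small script rather than through <code>curl</code> or <code>tezos-client</code>. Here is a minimal Python sketch; it assumes the node answers RPCs on <code>127.0.0.1:8732</code>, a commonly used default, so adjust the address to your own setup:</p> <pre><code class="language-python">import urllib.request

# Assumed RPC address of the node; change it if your node listens elsewhere.
NODE = 'http://127.0.0.1:8732'

def gc(keep=120):
    # Ask the node to keep only the last `keep` contexts (minimum enforced: 100).
    # The node terminates once the garbage collection is done.
    with urllib.request.urlopen(NODE + '/ocp/storage/gc?keep=%d' % keep) as resp:
        return resp.read()

def revert():
    # Migrate the database back to Irmin; the node terminates afterwards.
    with urllib.request.urlopen(NODE + '/ocp/storage/revert') as resp:
        return resp.read()

if __name__ == '__main__':
    print(gc())
</code></pre>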
<h1>Comments</h1> <p>Jack (30 January 2019 at 15 h 30 min):</p> <blockquote> <p>Please release this as a MR on gitlab so those of us not using docker can start testing the code.</p> </blockquote> <p>Fabrice Le Fessant (10 February 2019 at 15 h 39 min):</p> <blockquote> <p>That was done: <a href="/2019/02/04/improving-tezos-storage-gitlab-branch-for-testers/">here</a></p> </blockquote> Tezos and OCamlPro https://ocamlpro.com/blog/2019_01_29_tezos_and_ocamlpro 2019-01-29T13:48:57Z 2019-01-29T13:48:57Z Arthur Breitman A reflection on the new year… Today, Tezos is a global network and an open source project with developers spanning over five continents. In the inception of this project, the French company OCamlPro which, to this day, still develops numerous projects around Tezos, played a particularly important... <p>A reflection on the new year… Today, Tezos is a global network and an open source project with developers spanning over five continents. In the inception of this project, the French company OCamlPro which, to this day, still develops numerous projects around Tezos, played a particularly important role. Indeed, it was the first home of the research engineers who laid down the cornerstone of the code base, in tight collaboration with Arthur Breitman, the architect of the project, and DLS. We take some time today to remember those early days and celebrate the flourishing of this once small project.</p> <p>(cross-post with Arthur Breitman, Founder of the Tezos project)</p> opam 2.0.3 release https://ocamlpro.com/blog/2019_01_28_opam_2.0.3_release 2019-01-28T13:48:57Z 2019-01-28T13:48:57Z Raja Boujbel Louis Gesbert We are pleased to announce the release of opam 2.0.3. This new version contains some backported fixes: Fix manpage remaining $ (OPAMBESTEFFORT) Fix OPAMROOTISOK handling Regenerate missing environment file Installation instructions (unchanged): From binaries: run or download manually from the Github... <p>We are pleased to announce the release of <a href="https://github.com/ocaml/opam/releases/tag/2.0.3">opam 2.0.3</a>.</p> <p>This new version contains some <a href="https://github.com/ocaml/opam/pull/3715">backported fixes</a>:</p> <ul> <li>Fix manpage remaining $ (OPAMBESTEFFORT) </li> <li>Fix OPAMROOTISOK handling </li> <li>Regenerate missing environment file </li> </ul> <hr /> <p>Installation instructions (unchanged):</p> <ol> <li>From binaries: run </li> </ol> <pre><code class="language-shell-session">sh &lt;(curl -sL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.0.3">the Github &quot;Releases&quot; page</a> to your PATH. In this case, don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update your sandbox script.</p> <ol start="2"> <li>From source, using opam: </li> </ol> <pre><code class="language-shell-session">opam update; opam install opam-devel </code></pre> <p>(then copy the opam binary to your PATH as explained, and don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update your sandbox script)</p> <ol start="3"> <li>From source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.0.3#compiling-this-repo">README</a>.
</li> </ol> <p>We hope you enjoy this new major version, and remain open to <a href="https://github.com/ocaml/opam/issues">bug reports</a> and <a href="https://github.com/ocaml/opam/issues">suggestions</a>.</p> <blockquote> <p>NOTE: this article is cross-posted on <a href="https://opam.ocaml.org/blog/">opam.ocaml.org</a> and <a href="/blog">ocamlpro.com</a>.</p> </blockquote> Improving Tezos Storage https://ocamlpro.com/blog/2019_01_15_improving_tezos_storage 2019-01-15T13:48:57Z 2019-01-15T13:48:57Z Fabrice Le Fessant Running a Tezos node currently costs a lot of disk space, about 59 GB for the context database, the place where the node stores the states corresponding to every block in the blockchain, since the first one. Of course, this is going to decrease once garbage collection is integrated, i.e. removing ve... <p>Running a Tezos node currently costs a lot of disk space, about 59 GB for the context database, the place where the node stores the states corresponding to every block in the blockchain, since the first one. Of course, this is going to decrease once garbage collection is integrated, i.e. removing very old information, that is not used and cannot change anymore (<a href="https://gitlab.com/tezos/tezos/merge_requests/720#note_125296853">PR720</a> by Thomas Gazagnaire, Tarides, some early tests show a decrease to 14GB ,but with no performance evaluation). As a side note, this is different from pruning, i.e. transmitting only the last cycles for “light” nodes (<a href="https://gitlab.com/tezos/tezos/merge_requests/663">PR663</a> by Thomas Blanc, OCamlPro). Anyway, as Tezos will be used more and more, contexts will keep growing, and we need to keep decreasing the space and performance cost of Tezos storage.</p> <p>As one part of our activity at OCamlPro is to allow companies to deploy their own private Tezos networks, we decided to experiment with new storage layouts. We implemented two branches: our branch <code>IronTez1</code> is based on a full LMDB database, as Tezos currently, but with optimized storage representation ; our branch <code>IronTez2</code> is based on a mixed database, with both LMDB and file storage.</p> <p>To test these branches, we started a node from scratch, and recorded all the accesses to the context database, to be able to replay it with our new experimental nodes. The node took about 12 hours to synchronize with the network, on which about 3 hours were used to write and read in the context database. 
We then replayed the trace, either only the writes or with both reads and writes.</p> <p>Here are the results:</p> <p><img src="/blog/assets/img/plot_sizes.png" alt="plot_sizes.png" /></p> <p>The mixed storage is the most interesting: it uses half the storage of a standard Tezos node !</p> <p><img src="/blog/assets/img/plot_times-1.png" alt="plot_times-1.png" /></p> <p>Again, the mixed storage is the most efficient : even with reads and writes, <code>IronTez2</code> is five time faster than the current Tezos storage.</p> <p>Finally, here is a graph that shows the impact of the two attacks that happened in November 2018, and how it can be mitigated by storage improvement:</p> <p><img src="/blog/assets/img/plot_diffs.png" alt="plot_diffs.png" /></p> <p>The graph shows that, using mixed storage, it is possible to restore the storage growth of Tezos to what it was before the attack !</p> <p>Interestingly, although these experiments have been done on full traces, our branches are completely backward-compatible : they could be used on an already existing database, to store the new contexts in our optimized format, while keeping the old data in the ancient format.</p> <p>Of course, there is still a lot of work to do, before this work is finished. We think that there are still more optimizations that are possible, and we need to test our branches on running nodes for some time to get confidence (TzScan might be the first tester !), but this is a very encouraging work for the future of Tezos !</p> opam 2.0.2 release https://ocamlpro.com/blog/2018_12_12_opam_2.0.2_release 2018-12-12T13:48:57Z 2018-12-12T13:48:57Z Raja Boujbel Louis Gesbert We are pleased to announce the release of opam 2.0.2. As sandbox scripts have been updated, don't forget to run opam init --reinit -ni to update yours. This new version contains mainly backported fixes: Doc: update man page add message for deprecated options reinsert removed ones to print a deprecat... <p>We are pleased to announce the release of <a href="https://github.com/ocaml/opam/releases/tag/2.0.2">opam 2.0.2</a>.</p> <p>As <strong>sandbox scripts</strong> have been updated, don't forget to run <code>opam init --reinit -ni</code> to update yours.</p> <p>This new version contains mainly <a href="https://github.com/ocaml/opam/pull/3669">backported fixes</a>:</p> <ul> <li>Doc: <ul> <li>update man page </li> <li>add message for deprecated options </li> <li>reinsert removed ones to print a deprecated message instead of fail (e.g. <code>--alias-of</code>) </li> <li>deprecate <code>no-aspcud</code> </li> </ul> </li> <li>Pin: <ul> <li>on pinning, rebuild updated <code>pin-depends</code> packages reliably </li> <li>include descr &amp; url files on pinning 1.2 opam files </li> </ul> </li> <li>Sandbox: <ul> <li>handle symlinks in bubblewrap for system directories such as <code>/bin</code> or <code>/lib</code> (<a href="https://github.com/ocaml/opam/pull/3661">#3661</a>). Fixes sandboxing on some distributions such as CentOS 7 and Arch Linux. 
</li> <li>allow use of unix domain sockets on macOS (<a href="https://github.com/ocaml/opam/issues/3659">#3659</a>) </li> <li>change one-line conditional to if statement which was incompatible with set -e </li> <li>make /var readonly instead of empty and rw </li> </ul> </li> <li>Path: resolve default opam root path </li> <li>System: suffix .out for read_command_output stdout files </li> <li>Locked: check consistency with opam file when reading lock file to suggest regeneration message </li> <li>Show: remove pin depends messages </li> <li>Cudf: Fix closure computation in the presence of cycles to have a complete graph if a cycle is present in the graph (typically <code>ocaml-base-compiler</code> ⇄ <code>ocaml</code>) </li> <li>List: Fix some cases of listing coinstallable packages </li> <li>Format upgrade: extract archived source files of version-pinned packages </li> <li>Core: add is_archive in OpamSystem and OpamFilename </li> <li>Init: don't fail if empty compiler given </li> <li>Lint: fix light_uninstall flag for error 52 </li> <li>Build: partial port to dune </li> <li>Update cold compiler to 4.07.1 </li> </ul> <hr /> <p>Installation instructions (unchanged):</p> <ol> <li>From binaries: run </li> </ol> <pre><code class="language-shell-session">sh &lt;(curl -sL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh) </code></pre> <p>or download manually from <a href="https://github.com/ocaml/opam/releases/tag/2.0.2">the Github &quot;Releases&quot; page</a> to your PATH. In this case, don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update your sandbox script.</p> <ol start="2"> <li>From source, using opam: </li> </ol> <pre><code class="language-shell-session">opam update; opam install opam-devel </code></pre> <p>(then copy the opam binary to your PATH as explained, and don't forget to run <code>opam init --reinit -ni</code> to enable sandboxing if you had version 2.0.0~rc manually installed or to update you sandbox script)</p> <ol start="3"> <li>From source, manually: see the instructions in the <a href="https://github.com/ocaml/opam/tree/2.0.2#compiling-this-repo">README</a>. </li> </ol> <p>We hope you enjoy this new minor version, and remain open to <a href="https://github.com/ocaml/opam/issues">bug reports</a> and <a href="https://github.com/ocaml/opam/issues">suggestions</a>.</p> <blockquote> <p>NOTE: this article is cross-posted on <a href="https://opam.ocaml.org/blog/">opam.ocaml.org</a> and <a href="/blog">ocamlpro.com</a>.</p> </blockquote> An Introduction to Tezos RPCs: Signing Operations https://ocamlpro.com/blog/2018_11_21_an_introduction_to_tezos_rpcs_signing_operations 2018-11-21T13:48:57Z 2018-11-21T13:48:57Z Fabrice Le Fessant In a previous blogpost, we presented the RPCs used by tezos-client to send a transfer operation to a tezos-node. We were left with two remaining questions: How to forge a binary operation, for signature How to sign a binary operation In this post, we will reply to these questions. We are still assum... <p>In a <a href="http://ocamlpro.com/2018/11/15/an-introduction-to-tezos-rpcs-a-basic-wallet/">previous blogpost</a>, we presented the RPCs used by tezos-client to send a transfer operation to a tezos-node. We were left with two remaining questions:</p> <ul> <li> <p>How to forge a binary operation, for signature</p> </li> <li> <p>How to sign a binary operation</p> </li> </ul> <p>In this post, we will reply to these questions. 
We are still assuming a node running and waiting for RPCs on address 127.0.0.1:9731. Since we will ask this node to forge a request, we really need to trust it, as a malicious node could send back a different binary transaction from the one we sent it.</p> <p>Let’s take back our first operation:</p> <pre><code class="language-json">{
  &quot;branch&quot;: &quot;BMHBtAaUv59LipV1czwZ5iQkxEktPJDE7A9sYXPkPeRzbBasNY8&quot;,
  &quot;contents&quot;: [
    {
      &quot;kind&quot;: &quot;transaction&quot;,
      &quot;source&quot;: &quot;tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx&quot;,
      &quot;fee&quot;: &quot;50000&quot;,
      &quot;counter&quot;: &quot;3&quot;,
      &quot;gas_limit&quot;: &quot;200&quot;,
      &quot;storage_limit&quot;: &quot;0&quot;,
      &quot;amount&quot;: &quot;100000000&quot;,
      &quot;destination&quot;: &quot;tz1gjaF81ZRRvdzjobyfVNsAeSC6PScjfQwN&quot;
    }
  ]
}
</code></pre> <p>So, we need to translate this operation into a binary format, more amenable to signature. For that, we use a new RPC to forge operations. Under Linux, we can use the tool <code>curl</code> to send the request to the node:</p> <pre><code class="language-shell-session">$ curl -v -X POST http://127.0.0.1:9731/chains/main/blocks/head/helpers/forge/operations -H &quot;Content-type: application/json&quot; --data '{
  &quot;branch&quot;: &quot;BMHBtAaUv59LipV1czwZ5iQkxEktPJDE7A9sYXPkPeRzbBasNY8&quot;,
  &quot;contents&quot;: [
    {
      &quot;kind&quot;: &quot;transaction&quot;,
      &quot;source&quot;: &quot;tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx&quot;,
      &quot;fee&quot;: &quot;50000&quot;,
      &quot;counter&quot;: &quot;3&quot;,
      &quot;gas_limit&quot;: &quot;200&quot;,
      &quot;storage_limit&quot;: &quot;0&quot;,
      &quot;amount&quot;: &quot;100000000&quot;,
      &quot;destination&quot;: &quot;tz1gjaF81ZRRvdzjobyfVNsAeSC6PScjfQwN&quot;
    }
  ]
}'
</code></pre> <p>Note that we use a POST request (a request with content), with a <code>Content-type</code> header indicating that the content is in JSON format. We get the following body in the reply:</p> <pre><code class="language-json">&quot;ce69c5713dac3537254e7be59759cf59c15abd530d10501ccf9028a5786314cf08000002298c03ed7d454a101eb7022bc95f7e5f41ac78d0860303c8010080c2d72f0000e7670f32038107a59a2b9cfefae36ea21f5aa63c00&quot;
</code></pre> <p>This is the binary representation of our operation, in hexadecimal format, exactly what we were looking for to be able to include operations on the blockchain. However, this representation is not yet complete, since we also need the operation to be signed by the manager.</p>
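<p>As a side note (this snippet is not part of the original walkthrough), the same forge request can be sent directly from Python using only the standard library; a minimal sketch, assuming the same node address as above:</p> <pre><code class="language-python">import json
import urllib.request

# The same transaction as above; we still have to trust the node to forge it faithfully.
operation = {
    'branch': 'BMHBtAaUv59LipV1czwZ5iQkxEktPJDE7A9sYXPkPeRzbBasNY8',
    'contents': [{
        'kind': 'transaction',
        'source': 'tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx',
        'fee': '50000',
        'counter': '3',
        'gas_limit': '200',
        'storage_limit': '0',
        'amount': '100000000',
        'destination': 'tz1gjaF81ZRRvdzjobyfVNsAeSC6PScjfQwN',
    }],
}

req = urllib.request.Request(
    'http://127.0.0.1:9731/chains/main/blocks/head/helpers/forge/operations',
    data=json.dumps(operation).encode(),
    headers={'Content-type': 'application/json'},
)
with urllib.request.urlopen(req) as resp:
    # The reply is a JSON string containing the operation in hexadecimal.
    print(json.load(resp))
</code></pre>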
<p>To sign this operation, we will first use <code>tezos-client</code>. That’s something that we can do if we want, for example, to sign an operation offline, for better security. Assuming that we have saved the content of the string (<code>ce69...3c00</code>, without the quotes) in a file <code>operation.hex</code>, we can ask <code>tezos-client</code> to sign it with:</p> <pre><code class="language-shell-session">$ tezos-client --addr 127.0.0.1 --port 9731 sign bytes 0x03$(cat operation.hex) for bootstrap1
</code></pre> <p>The <code>0x03$(cat operation.hex)</code> is the concatenation of the <code>0x03</code> prefix and the hexadecimal content of <code>operation.hex</code>, which is equivalent to <code>0x03ce69...3c00</code>. The prefix is used (1) to indicate that the representation is hexadecimal (<code>0x</code>), and (2) that it should start with <code>03</code>, which is a watermark for operations in Tezos.</p> <p>We get the following reply in the console:</p> <pre><code class="language-shell-session">Signature: edsigtkpiSSschcaCt9pUVrpNPf7TTcgvgDEDD6NCEHMy8NNQJCGnMfLZzYoQj74yLjo9wx6MPVV29CvVzgi7qEcEUok3k7AuMg
</code></pre> <p>Wonderful, we have a signature, in <code>base58check</code> format! We can use this signature in the <code>run_operation</code> and <code>preapply</code> RPCs… but not in the <code>injection</code> RPC, which requires a binary format. So, to inject the operation, we need to convert the signature to its hexadecimal version. For that, we will use the <code>base58check</code> package of Python (we could do it in OCaml, but then, we could just use <code>tezos-client</code> all along, no?):</p> <pre><code class="language-shell-session">$ pip3 install base58check
$ python
&gt;&gt;&gt; import base58check
&gt;&gt;&gt; base58check.b58decode(b'edsigtkpiSSschcaCt9pUVrpNPf7TTcgvgDEDD6NCEHMy8NNQJCGnMfLZzYoQj74yLjo9wx6MPVV29CvVzgi7qEcEUok3k7AuMg').hex()
'09f5cd8612637e08251cae646a42e6eb8bea86ece5256cf777c52bc474b73ec476ee1d70e84c6ba21276d41bc212e4d878615f4a31323d39959e07539bc066b84174a8ff0de436e3a7'
</code></pre> <p>All signatures in Tezos start with <code>09f5cd8612</code>, which is used to generate the <code>edsig</code> prefix. Also, the last 4 bytes are used as a checksum (<code>e436e3a7</code>). Thus, the signature itself is after this prefix and before the checksum: <code>637e08251cae64...174a8ff0d</code>.</p> <p>Finally, we just need to append the binary signature to the binary operation, put them into a string, and send that to the server for injection. If we have stored the hexadecimal representation of the signature in a file <code>signature.hex</code>, then we can use:</p> <pre><code class="language-shell-session">$ curl -v -H &quot;Content-type: application/json&quot; 'http://127.0.0.1:9731/injection/operation?chain=main' --data '&quot;'$(cat operation.hex)$(cat signature.hex)'&quot;'
</code></pre> <p>and we receive the hash of this new operation:</p> <pre><code class="language-json">&quot;oo1iWZDczV8vw3XLunBPW6A4cjmdekYTVpRxRh77Fd1BVv4HV2R&quot;
</code></pre> <p>Again, we cheated a little by using <code>tezos-client</code> to generate the signature. Let’s try to do it in Python, too!</p> <p>First, we will need the secret key of bootstrap1. We can export it from <code>tezos-client</code> to use it directly:</p> <pre><code class="language-shell-session">$ tezos-client show address bootstrap1 -S
Hash: tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx
Public Key: edpkuBknW28nW72KG6RoHtYW7p12T6GKc7nAbwYX5m8Wd9sDVC9yav
Secret Key: unencrypted:edsk3gUfUPyBSfrS9CCgmCiQsTCHGkviBDusMxDJstFtojtc1zcpsh
</code></pre> <p>The secret key is exported on the last line by using the <code>-S</code> argument, and it usually starts with <code>edsk</code>. Again, it is in <code>base58check</code>, so we can use the same trick to extract its binary value:</p> <pre><code class="language-shell-session">$ python3
&gt;&gt;&gt; import base58check
&gt;&gt;&gt; base58check.b58decode(b'edsk3gUfUPyBSfrS9CCgmCiQsTCHGkviBDusMxDJstFtojtc1zcpsh').hex()[8:72]
'8500c86780141917fcd8ac6a54a43a9eeda1aba9d263ce5dec5a1d0e5df1e598'
</code></pre> <p>This time, we directly extracted the key, by removing the first 8 hex characters and keeping only 64 hex characters (using <code>[8:72]</code>), since the key is 32 bytes long. Let’s suppose that we save this value in a file <code>bootstrap1.hex</code>.</p>
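<p>The same slicing trick works for any base58check value we have met so far: drop the type prefix at the front (4 bytes for an <code>edsk</code> seed, 5 bytes for an <code>edsig</code> signature, as noted above) and the 4-byte checksum at the end. As a small aside (not in the original post), this can be wrapped in a helper based on the same <code>base58check</code> package:</p> <pre><code class="language-python">import base58check

def b58_payload(value, prefix_len):
    # Decode, then drop the type prefix at the front and the 4-byte checksum at the end.
    raw = base58check.b58decode(value)
    return raw[prefix_len:-4]

# The bootstrap1 seed extracted above (same 32 bytes as the [8:72] slice):
seed = b58_payload(b'edsk3gUfUPyBSfrS9CCgmCiQsTCHGkviBDusMxDJstFtojtc1zcpsh', 4)
print(seed.hex())
</code></pre>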
<p>Now, we will use the following script to compute the signature:</p> <pre><code class="language-python">import binascii
from pyblake2 import blake2b
import ed25519

# Read the hexadecimal files (dropping the trailing newlines) and convert them to binary.
operation = binascii.unhexlify(open(&quot;operation.hex&quot;, &quot;rb&quot;).readline()[:-1])
seed = binascii.unhexlify(open(&quot;bootstrap1.hex&quot;, &quot;rb&quot;).readline()[:-1])

# Hash the watermarked operation with Blake2b (32-byte digest).
h = blake2b(digest_size=32)
h.update(b'\x03' + operation)
digest = h.digest()

# Turn the seed into an Ed25519 signing key and sign the digest.
sk = ed25519.SigningKey(seed)
sig = sk.sign(digest)
print(sig.hex())
</code></pre> <p>The <code>binascii</code> module is used to read the files in hexadecimal (after removing the newlines), to get the binary representation of the operation and of the Ed25519 seed. Ed25519 is an elliptic curve used in Tezos to manage <code>tz1</code> addresses, i.e. to sign data and check signatures.</p> <p>The <code>blake2b</code> module is used to hash the message before signing it. Again, we add a watermark to the operation, i.e. <code>\x03</code>, before hashing. We also have to specify the size of the hash, i.e. <code>digest_size=32</code>, because the Blake2b hashing function can generate hashes of different sizes.</p> <p>Finally, we use the <code>ed25519</code> module to transform the seed (private/secret key) into a signing key, and use it to sign the hash, which we print in hexadecimal. We obtain:</p> <pre><code class="language-json">637e08251cae646a42e6eb8bea86ece5256cf777c52bc474b73ec476ee1d70e84c6ba21276d41bc212e4d878615f4a31323d39959e07539bc066b84174a8ff0d
</code></pre> <p>This result is exactly the same as what we got using tezos-client!</p> <p><img src="/blog/assets/img/SignTransaction-791x1024.jpg" alt="SignTransaction-791x1024.jpg" /></p> <p>We now have a complete wallet, i.e. the ability to create transactions and sign them without tezos-client. Of course, there are several limitations to this work: first, we have exposed the private key in the clear, which is usually not a very good idea for security; also, Tezos supports three types of keys, <code>tz1</code> for Ed25519 keys, <code>tz2</code> for Secp256k1 keys (same as Bitcoin/Ethereum) and <code>tz3</code> for P256 keys; finally, a realistic wallet would probably use cryptographic chips, on a mobile phone or an external device (Ledger, etc.).</p> <h1>Comments</h1> <p>Anthony (28 November 2018 at 2 h 01 min):</