OCamlPro Highlights: November 2013
New Team Members
We are pleased to welcome three new members in our OCamlPro team since the beginning of November:
- Benjamin Canou started working at OCamlPro on the Richelieu project,
an effort to bring better safety and performance to the
Scilab language. He is in charge of a
type inference algorithm that will serve both as a developper tool
and in coordination with a JIT. He spent his first month
understanding the darkest corners of the language, and then writing
a versatile AST with a parser to build it. Actually, this is not an
easy task, because the language gives different statuses to
characters (including spaces) depending on the context, leading to
non-trivial lexing. But the real source of problems is the fact that
the original lexparser is intermingled with the interpreter inside a
big bunch of venerable FORTRAN code. This old fellow makes parsing
choices depending on the dynamic typing context, allows its users to
catch syntax errors at runtime, among other fun things. The new
OCaml lexer and parser is handwritten in around a thousand lines,
has performance comparable to a
Lex
andYacc
generated one, and is resilient to errors so it could be integrated into an IDE to detect errors on the fly without stopping on the first one. Once again, it’s OCaml to the rescue of the weak and elderly!An example of the kind of code that can be written in Scilab:
if return = while then [ 12..
34.. … .. …
56 } ; else ‘”‘”
end
which is parsed into:
— parsed in 0.000189–
(script (if (== !return !while) (matrix (row 123456)) “‘”))
— messages
1.10:1.11: use of deprecated operator ‘=’
— end
-
Gregoire Henry started working at OCamlPro on the Bware project. He is tackling the optimization of memory performance of automatic provers written in OCaml, in collaboration with Cagdas Bozman. One of his first contributions after joining us was to exhume his internship work of 2004, an implementation of Graphics for Mac OS X that we are going to use for our online OCaml IDE!
-
Thomas Blanc started a PhD at OCamlPro after his summer internship with us. He is going to continue his work on whole-program analysis, especially as a way to detect uncaught exceptions. We hope his tool will be a good replacement for the ocamlexn tool written by Francois Pessaux.
Compiler Updates
On the compiler optimization front, Pierre Chambart got direct access
to the OCaml SVN, so that he will soon upload his work directly into
an SVN branch, easier for reviewing and integration into the official
compiler. A current set of optimizations is already scheduled for the
new branch, and is now working on inlining recursive functions, such
List.map
, by inlining the function definition at the call site, when
at least one of its arguments is invariant during recursion.
A function that can benefit a lot from that transformation is:
let f l = List.fold_left (+) 0 l
camlTest__f_1013:
.L102:
movq %rax, %rdi
movq $1, %rbx
jmp camlTest__fold_left_1017@PLT
camlTest__fold_left_1017:
.L101:
cmpq $1, %rdi
je .L100
movq 8(%rdi), %rsi
movq (%rdi), %rdi
leaq -1(%rbx, %rdi), %rbx
movq %rsi, %rdi
jmp .L101
.align 4
.L100:
movq %rbx, %rax
ret
Development Tools
Release of OPAM 1.1
After lots of testing and fixing, the official version 1.1.0 of OPAM
has finally been released. It features lots of stability improvements,
and a reorganized and cleaner repo now hosted at
https://opam.ocaml.org. Work goes on on OPAM
as we’ll release opam-installer
soon, a small script that enables
using and testing .install
files. This is a step toward a better
integration of OPAM with existing build tools, and are experimenting
with ways to ease usage for Coq packages, to generate binary packages,
and to enhance portability.
Binary Packages for OPAM
We also started to experiment with binary packages. We developed a
very small tool, ocp-bin
, that monitors the compilation of every OPAM
package, takes a snapshot of OPAM files before and after the
compilation and installation, and generates a binary archive for the
package. The next time the package is re-installed, with the same
dependencies, the archive is used instead of compiling the package
again.
For a typical package, the standard OPAM file:
build: [
[ “./configure” “–prefix” “%{prefix}%”]
[ “make]
[ make “install”]
]
remove: [
[ make “uninstall” ]
]
has to be modified in:
build: [
[ “ocp-bin” “begin” “%{package}%” “%{version}%” “%{compiler}%” “%{prefix}%”
“-opam” “-depends” “%{depends}%” “-hash” “%{hash}%”
“-nodeps” “ocamlfind.” ]
[ “ocp-bin” “–” “./configure” “–prefix” “%{prefix}%”]
[ “ocp-bin” “–” make]
[ “ocp-bin” “–” make “install”]
[ “ocp-bin” “end” ]
]
remove: [
[ “!” “ocp-bin” “uninstall”
“%{package}%” “%{version}%” “%{compiler}%” “%{prefix}%” ]
Such a transformation would be automated in the future by adding a
field ocp-bin: true
. Note that, since ocp-bin
takes care of the
deinstallation of the package, it would ensure a complete and correct
deinstallation of all packages.
We also implemented a client-server version of ocp-bin
, to be able
to share binary packages between users. The current limitation with
this approach is that many binary packages are not relocatable: if
packages are compiled by Bob to be installed in
/home/bob/.opam/4.01.0
, the same packages will only be usable on a
different computer by a user with the same home path! Although it can
still be useful for a user with several computers, we plan to
investigate now on how to build relocatable packages for OCaml.
Stable Release of ocp-index
Always looking for a way to provide better tools to the OCaml
programmer, we are happy to announce the first stable release of
ocp-index
, which provides quick access to the installed interfaces
and documentation as well as some source-browsing features (lookup
ident definition, search for uses of an ident, etc).
Profiling Alt-Ergo with ocp-memprof
: The Killer App
One of the most exciting events this month is the use of the
ocp-memprof
tool to profile an execution of Alt-Ergo on a big formula
generated by the Cubicle model checker. The story is the following:
The formula was generated from a transition system modeling the FLASH coherence cache protocol, plus additional information computed by Cubicle during the verification of FLASH’s safety. It contains a sub-formula made of nested conjunctions of 999 elements. Its proof requires reasoning in the combination of the free theory of equality, enumerated data types and quantifiers. Alt-Ergo was able to discharge it in only 10 seconds. However, Alain Mebsout — who is doing his Phd thesis on Cubicle — noticed that Alt-Ergo allocates more than 60 MB during its execution.
In order to localize the source of this abnormal memory consumption, we installed the OCaml Memory Profiler runtime, version 4.00.1+memprof (available in the private OPAM repository of OCamlPro) and compiled Alt-Ergo using -bin-annot option in order to dump .cmt files. We then executed the prover on Alain’s example as shown below, without any instrumentation of Alt-Ergo’s code.
$ OCAMLRUNPARAM=m ./alt-ergo.opt formula.mlw
This execution caused the modified OCaml compiler to dump a snapshot
of the typed heap at every major collection of the GC. The names of
dumped files are of the form
memprof.<PID>.<DUMP-NAME>.<image-number>.dump
, where PID is a natural
number that identifies the set of dumped files during a particular
execution.
Dumped files were then fed to the ocp-memprof
tool (available in the
TypeRex-Pro toolbox) using the syntax below. The synthesis of this
step (.hp file) was then converted to a .ps file thanks to hp2ps
command. At the end, we obtained the diagram shown in the figure
below.
$ ./ocp-memprof -loc -sizes PID
From the figure above, one can extract the following information:
-
there were 15 major collections of OCaml’s GC during the above execution (the x-axis),
-
Alt-Ergo allocated more than 60 MB during its execution (the y-axis),
-
Some function in file
src/preprocess/why_typing.ml
is allocating a lot of data of typeParsed.pp_desc
at line 868 (the first square of the legend).
The third point corresponds to a piece of code used in a recursive function that performs alpha renaming on parsed formulas to avoid variable captures. This code is the following:
let rec alpha_renaming_b s f =
…
| PPinfix(f1, op, f2) -> (* ‘op’ may be the AND operator *)
let ff1 = alpha_renaming_b s f1 in
let ff2 = alpha_renaming_b s f2 in
PPinfix(ff1, op, ff2) (* line 868 *)
…
Actually, in 99% there are no capture problems and the function just
reconstructs a new value PPinfix(ff1, op, ff2)
that is structurally
equal (=) to its argument f
. In case of very big formulas (recall that
Alain’s formula contains a nested conjunction of 999 elements), this
causes Alt-Ergo to allocate a lot.
Fixing this behavior was straightforward. We only had to check whether
recursive calls to alpha renaming function returned modified values
using physical equality ==
. If not, no renaming is performed and we
safely return the formula given in the argument. This way, the
function will never allocate for formulas without capture issues. For
instance, the piece of code given above is fixed as follows:
let rec alpha_renaming_b s f =
…
| PPinfix(f1, op, f2) ->
let ff1 = alpha_renaming_b s f1 in
let ff2 = alpha_renaming_b s f2 in
if ff1 == f1 && ff2 == f2 then f (* no renaming performed by recursive calls ? *)
else PPinfix(ff1, op, ff2)
…
Once we applied the patch on the hole function alpha_renaming_b,
Alt-Ergo
only needed 2 seconds and less than 2.2MB memory to prove our
formula. Profiling an execution of patched version of the prover with
OCaml 4.00.1+memprof and ocp-memprof
produced the diagram below. The
difference with the first drawing was really impressive.
Other R&D Projects
Scilint, the Scilab Style-Checker
This month our work on Richelieu was mainly focused on improving Scilint. After some discussions with Scilab knowledgeable users, we chose a new set of warnings to implement. Among other things those warnings analyze primitive fonctions and their arguments as well as loop variables. Another important thing was to allow SciNotes, Scilab’s editor, to display our warnings. This has been done by implementing support for Firehose. Finally some minor bugs were also fixed.
Au sujet d'OCamlPro :
OCamlPro développe des applications à haute valeur ajoutée depuis plus de 10 ans, en utilisant les langages les plus avancés, tels que OCaml et Rust, visant aussi bien rapidité de développement que robustesse, et en ciblant les domaines les plus exigeants (méthodes formelles, cybersécurité, systèmes distribués/blockchain, conception de DSLs). Fort de plus de 20 ingénieurs R&D, avec une expertise unique sur les langages de programmation, aussi bien théorique (plus de 80% de nos ingénieurs ont une thèse en informatique) que pratique (participation active au développement de plusieurs compilateurs open-source, prototypage de la blockchain Tezos, etc.), diversifiée (OCaml, Rust, Cobol, Python, Scilab, C/C++, etc.) et appliquée à de multiples domaines. Nous dispensons également des [formations sur mesure certifiées Qualiopi sur OCaml, Rust, et les méthodes formelles] (https://training.ocamlpro.com/) Pour nous contacter : contact@ocamlpro.com.
Articles les plus récents
2024
- opam 2.3.0 release!
- Optimisation de Geneweb, 1er logiciel français de Généalogie depuis près de 30 ans
- Alt-Ergo 2.6 is Out!
- Flambda2 Ep. 3: Speculative Inlining
- opam 2.2.0 release!
- Flambda2 Ep. 2: Loopifying Tail-Recursive Functions
- Fixing and Optimizing the GnuCOBOL Preprocessor
- OCaml Backtraces on Uncaught Exceptions
- Opam 102: Pinning Packages
- Flambda2 Ep. 1: Foundational Design Decisions
- Behind the Scenes of the OCaml Optimising Compiler Flambda2: Introduction and Roadmap
- Lean 4: When Sound Programs become a Choice
- Opam 101: The First Steps
2023
- Maturing Learn-OCaml to version 1.0: Gateway to the OCaml World
- The latest release of Alt-Ergo version 2.5.1 is out, with improved SMT-LIB and bitvector support!
- 2022 at OCamlPro
- Autofonce, GNU Autotests Revisited
- Sub-single-instruction Peano to machine integer conversion
- Statically guaranteeing security properties on Java bytecode: Paper presentation at VMCAI 23
- Release of ocplib-simplex, version 0.5