OCamlPro’s Contributions to OCaml 4.00.0
OCaml 4.00.0 has been released on July 27, 2012. For the first time, the new OCaml includes some of the work we have been doing during the last year. In this article, I will present our main contributions, mostly funded by Jane Street and Lexifi.
Binary Annotations for Advanced Development Tools
OCaml 4.00.0 has a new option -bin-annot
(undocumented, for now, as
it is still being tested). This option tells the compiler to dump in
binary format a compressed version of the typed tree (an abstract
syntax tree with type annotations) to a file (with the .cmt
extension for implementation files, and .cmti
for interface
files). This file can then be used by development tools to provide new
features, based on the full knowledge of types in the sources. One of
the first tools to use it is the new version of ocamlspotter
, by Jun
Furuse.
This new option will probably make the old option -annot
obsolete
(except, maybe, in specific contextes where you don’t want to depend
on the internal representation of the typedtree, for example when you
are modifying this representation !). Generated files are much smaller
than with the -annot
option, and much faster to write (during
compilation) and to read (for analysis).
New Options for ocamldep
As requested on the bug tracker, we implemented a set of new options for ocamldep:
-
-all
will print all the dependencies, i.e. not only on .cmi, .cmo and .cmx files, but also on source files, and for .o files. In this mode also, no proxying is performed: if there is no interface file, a bytecode dependency will still appear against the .cmi file, and not against the .cmo file as it would before; -
-one-line
will not break dependencies on several lines; -
-sort
will print the arguments of ocamldep (filenames) in the order of dependencies, so that the following command should work when all source files are in the same directory:
ocamlopt -o my_program `ocamldep -sort *.ml *.mli
CFI Directives for Debugging
OCaml tries to make the best use of available registers and stack space, and consequently, its layout on the stack is much different from the one of C functions. Also, function names are mangled to make them local to their module. As a consequence, debugging native code OCaml programs has long been a problem with previous versions of OCaml:, since the debugger cannot print correctly the backtrace of the stack, nor put breakpoints on OCaml functions.
In OCaml 4.00.0, we worked on a patch submitted on the bug tracker to improve the situation: x86 and amd64 backends now emit more debugging directives, such as the locations in the source corresponding to functions in the assembly (so that you can put breakpoints at function entry), and CFI directives, indicating the correct stack layout, for the debugger to correctly unwind the stack. These directives are part of the DWARF debugging standard.
Unfortunately, line by line stepping is not yet available, but here is an example of session that was not possible with previous versions:
let f x y = List.map ( (+) x ) y
let _ = f 3 [1;2;3;4]
$ ocamlopt -g toto.ml
$ gdb ./a.out
(gdb) b toto.ml:1
Breakpoint 1 at 0x4044f4: file toto.ml, line 1.
(gdb) run
Starting program: /home/lefessan/ocaml-4.00.0-example/a.out
Breakpoint 1, 0x00000000004044f4 in camlToto__f_1008 () at toto.ml:1
1 let f x y = List.map ( (+) x ) y
(gdb) bt
0 0x00000000004044f4 in camlToto__f_1008 () at toto.ml:1
1 0x000000000040456c in camlToto__entry () at toto.ml:2
2 0x000000000040407d in caml_program ()
3 0x0000000000415fe6 in caml_start_program ()
4 0x00000000004164b5 in caml_main (argv=0x7fffffffe3f0) at startup.c:189
5 0x0000000000408cdc in main (argc=<optimized out>, argv=<optimized out>)
at main.c:56
(gdb)
Optimisation of Partial Function Applications
Few people know that partial applications with multiple arguments are not very efficient. For example, do you know how many closures are dynamically allocated in in the following example ?
let f x y z = x + y + z
let sum_list_offsets orig list = List.fold_left (f orig) 0 list
let sum = sum_list_offsets 10 [1;2;3]
Most programmers would reply one, f orig
, but that’s not all
(indeed, f and sum_list_offsets are allocated statically, not
dynamically, as they have no free variables). Actually, three more
closures are allocated, when List.fold_left
is executed on the list,
one closure per element of the list.
The reason for this is that Ocaml has only two modes to execute functions: either all arguments are present, or just one argument. Prior to 4.00.0, when a function would enter the second mode (as f in the previous example), then it would remain in that mode, meaning that the two other arguments would be passed one by one, creating a partial closure between them.
In 4.00.0, we implemented a simple optimization, so that whenever all the remaining expected arguments are passed at once, no partial closure is created and the function is immediatly called with all its arguments, leading to only one dynamic closure creation in the example.
Optimized Pipe Operators
It is sometimes convenient to use the pipe notation in OCaml programs, for example:
let (|>) x f = f x;;
let (@@) f x = f x;;
[1;2;3] |> List.map (fun x -> x + 2) |> List.map print_int;;
List.map print_int @@ List.map (fun x -> x + 1 ) @@ [1;2;3];;
However, such |>
and @@
operators are currently not optimized: for
example, the last line will be compiled as:
let f1 = List.map print_int;;
let f2 = List.map (fun x -> x + 1);;
let x = f2 [1;2;3;];;
f1 x;;
Which means that partial closures are allocated every time a function is executed with multiple arguments.
In OCaml 4.00.0, we optimized these operators by providing native operators, for which no partial closures are generated:
external (|>) : ‘a -> (‘a -> ‘b) -> ‘b = "%revapply";;
external ( @@ ) : (‘a -> ‘b) -> ‘a -> ‘b = "%apply"
Now, the previous example is equivalent to:
List.map print_int (List.map ( (+) 1 ) [1;2;3])
Bug Fixing
Of course, a lot of our contributions are not always as visible as the previous ones. We also spent a lot of time fixing small bugs. Although it doesn’t sound very fun, fixing bugs in OCaml is also fun, because bugs are often challenging to understand, and even more challenging to remove without introducing new ones !
About OCamlPro:
OCamlPro is a R&D lab founded in 2011, with the mission to help industrial users benefit from experts with a state-of-the-art knowledge of programming languages theory and practice.
- We provide audit, support, custom developer tools and training for both the most modern languages, such as Rust, Wasm and OCaml, and for legacy languages, such as COBOL or even home-made domain-specific languages;
- We design, create and implement software with great added-value for our clients. High complexity is not a problem for our PhD-level experts. For example, we helped the French Income Tax Administration re-adapt and improve their internally kept M language, we designed a DSL to model and express revenue streams in the Cinema Industry, codename Niagara, and we also developed the prototype of the Tezos proof-of-stake blockchain from 2014 to 2018.
- We have a long history of creating open-source projects, such as the Opam package manager, the LearnOCaml web platform, and contributing to other ones, such as the Flambda optimizing compiler, or the GnuCOBOL compiler.
- We are also experts of Formal Methods, developing tools such as our SMT Solver Alt-Ergo (check our Alt-Ergo Users' Club) and using them to prove safety or security properties of programs.
Please reach out, we'll be delighted to discuss your challenges: contact@ocamlpro.com or book a quick discussion.
Most Recent Articles
2024
- opam 2.3.0 release!
- Optimisation de Geneweb, 1er logiciel français de Généalogie depuis près de 30 ans
- Alt-Ergo 2.6 is Out!
- Flambda2 Ep. 3: Speculative Inlining
- opam 2.2.0 release!
- Flambda2 Ep. 2: Loopifying Tail-Recursive Functions
- Fixing and Optimizing the GnuCOBOL Preprocessor
- OCaml Backtraces on Uncaught Exceptions
- Opam 102: Pinning Packages
- Flambda2 Ep. 1: Foundational Design Decisions
- Behind the Scenes of the OCaml Optimising Compiler Flambda2: Introduction and Roadmap
- Lean 4: When Sound Programs become a Choice
- Opam 101: The First Steps
2023
- Maturing Learn-OCaml to version 1.0: Gateway to the OCaml World
- The latest release of Alt-Ergo version 2.5.1 is out, with improved SMT-LIB and bitvector support!
- 2022 at OCamlPro
- Autofonce, GNU Autotests Revisited
- Sub-single-instruction Peano to machine integer conversion
- Statically guaranteeing security properties on Java bytecode: Paper presentation at VMCAI 23
- Release of ocplib-simplex, version 0.5