wxOCaml, camlidl and Class Modules
A few months ago, a memory leak in the Scanf.fscanf function of OCaml’s standard library was reported on the OCaml mailing list. The following “minimal” example reproduces this misbehavior:
for i = 0 to 100_000 do
  let ic = open_in "some_file.txt" in
  Scanf.fscanf ic "%s" (fun _s -> ());
  close_in ic
done;;
read_line ();;
Let us see how to identify the origin of the leak and fix it with our OCaml memory profiler.
Installing the OCaml Memory Profiler
We first install our modified OCaml compiler and the memory profiling tool with the following opam commands:
$ opam remote add memprof http://memprof.typerex.org/opam
$ opam update
$ opam switch 4.01.0+ocp1-20150202
$ opam install ocp-memprof
$ eval `opam config env`
That’s all! Installation takes only five opam commands.
Compiling and Executing the Example
The second step is to compile the example above and profile it, using the following commands:
$ ocamlopt scanf_leak.ml -o scanf.x
$ ocp-memprof --exec scanf.x
You may notice that no instrumentation of the source is needed to enable profiling.
Visualizing the Results
In the last command above, scanf.x dumps a lot of information about memory occupation during its execution. Our “OCaml Memory Profiler” then analyzes these dumps and generates a “human readable” graph that shows the evolution of memory consumption after each OCaml garbage collection. Concretely, this yields the graph below (the interactive graph generated by ocp-memprof is available here). As you can see, memory consumption grows abnormally and exceeds 240 MB! Note that we stopped scanf.x after 90 seconds.
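To give a rough idea of what "memory consumption after each garbage collection" means, here is a minimal sketch that samples the live heap by hand with the standard Gc module. It is not how ocp-memprof works internally; the helper name live_bytes and the simulated workload are assumptions made for illustration only.

```ocaml
(* Live heap size in bytes, measured right after a full major collection.
   Gc.full_major and Gc.stat come from the standard library; the helper
   name live_bytes is hypothetical. *)
let live_bytes () =
  Gc.full_major ();
  let st = Gc.stat () in
  st.Gc.live_words * (Sys.word_size / 8)

let () =
  let before = live_bytes () in
  (* Simulated workload: keep 100_000 small strings reachable. *)
  let retained = List.init 100_000 (fun i -> string_of_int i) in
  let after = live_bytes () in
  Printf.printf "before: %d bytes, after: %d bytes\n" before after;
  (* Use the list after the second measurement so that it is still live
     when live_bytes runs. *)
  ignore (Sys.opaque_identity (List.length retained))
```

Unlike this manual sampling, the profiler also attributes the retained memory to modules and functions, which is what the following sections rely on.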
Playing With (Some of) ocp-memprof Capabilities
ocp-memprof can group and display the data contained in the graph according to several criteria. For instance, data are grouped by “Modules” in the capture below. This allows us to deduce that most allocations are performed in the Scanf and Buffer modules.
In addition to aggregation capabilities, the interactive graph generated by ocp-memprof also lets you “zoom” in on particular data. For instance, by looking at Scanf, we obtain the graph below, which shows the functions that allocate in this module. We observe that the function that allocates the most is Scanf.Scanning.from_ic. Let us have a look at this function.
From Profiling Graphs to Source Code
The function from_ic, which is responsible for most of the allocation in Scanf, is called from memo_from_ic, whose code is the following:
let memo_from_ic =
  let memo = ref [] in
  (fun scan_close_ic ic ->
    try
      List.assq ic !memo
    with
    | Not_found ->
      let ib = from_ic scan_close_ic (From_channel ic) ic in
      memo := (ic, ib) :: !memo;
      ib)
;;
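It looks like the leak is caused by the memo list, which associates a lookahead buffer, resulting from the call to from_ic, with each input channel. Entries are only ever added to this list, so every channel and its buffer remain reachable even after the channel has been closed.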
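The retention pattern can be reproduced in miniature without Scanf. The sketch below is a hypothetical stand-in (the names memo and memo_len are not from the standard library): it memoizes a value per key with List.assq, exactly like the code above, and each fresh key adds an entry that is never removed.

```ocaml
(* Hypothetical miniature of the memoization pattern: one entry is added
   per distinct key (compared physically by List.assq) and nothing ever
   removes it. *)
let memo : (Bytes.t * int) list ref = ref []

let memo_len key =
  try List.assq key !memo
  with Not_found ->
    let v = Bytes.length key in
    memo := (key, v) :: !memo;
    v

let () =
  for _i = 1 to 1_000 do
    (* A fresh key at each iteration, like a freshly opened channel. *)
    ignore (memo_len (Bytes.create 16))
  done;
  (* Every key is still reachable from the list. *)
  Printf.printf "entries retained: %d\n" (List.length !memo)
```

Since the list holds a strong reference to each key, the garbage collector can never reclaim the keys or the values associated with them.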
Patching the Code
Benoit Vaugon quickly sent a patch based on weak pointers that seems to solve the problem. He modified the code as follows:
- he put the key in a weak set (IcMemo) in order to test if it is gone;
- he created a pair that stores the key and the associated value;
- he put this pair in a weak set (PairMemo), where it will be reclaimed at the next GC;
- he added a finalizer on the pair that adds the pair to the weak set again at each GC, as long as the key is still in the first weak set.
let memo_from_ic =
  (* Weak set of the input channels seen so far (the keys). *)
  let module IcMemo = Weak.Make (
    struct
      type t = Pervasives.in_channel
      let equal ic1 ic2 = ic1 = ic2
      let hash ic = Hashtbl.hash ic
    end)
  in
  (* Weak set of (channel, buffer) pairs, compared and hashed on the channel only. *)
  let module PairMemo = Weak.Make (
    struct
      type t = Pervasives.in_channel * in_channel
      let equal (ic1, _) (ic2, _) = ic1 = ic2
      let hash (ic, _) = Hashtbl.hash ic
    end)
  in
  let ic_memo = IcMemo.create 16 in
  let pair_memo = PairMemo.create 16 in
  (* When a pair is about to be collected, put it back in the weak set
     as long as its channel is still present in ic_memo. *)
  let rec finaliser ((ic, _) as pair) =
    if IcMemo.mem ic_memo ic then (
      Gc.finalise finaliser pair;
      PairMemo.add pair_memo pair) in
  (fun scan_close_ic ic ->
    (* Only the first component of the lookup pair matters for equality. *)
    try snd (PairMemo.find pair_memo (ic, stdin)) with
    | Not_found ->
      let ib = from_ic scan_close_ic (From_channel ic) ic in
      let pair = (ic, ib) in
      IcMemo.add ic_memo ic;
      Gc.finalise finaliser pair;
      PairMemo.add pair_memo pair;
      ib)
;;
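For readers unfamiliar with Weak.Make, here is a minimal sketch of how a weak set behaves on its own. It only illustrates the basic mechanism the patch builds on, not the patch itself (in particular, it has no finalizer); the names StrSet, kept and add_temporary are hypothetical.

```ocaml
(* A weak set of strings built with the standard Weak.Make functor. *)
module StrSet = Weak.Make (struct
  type t = string
  let equal = String.equal
  let hash = Hashtbl.hash
end)

let set = StrSet.create 16

(* A heap-allocated string that stays strongly referenced by this
   top-level binding for the whole program. *)
let kept = String.concat "-" ["kept"; "alive"]

(* Add a freshly allocated string; the only remaining reference to it
   after this call is the weak one held by the set. *)
let add_temporary i = StrSet.add set (string_of_int i ^ "-tmp")

let () =
  StrSet.add set kept;
  for i = 1 to 1_000 do add_temporary i done;
  Gc.full_major ();
  (* kept must still be in the set; temporary entries may have been
     reclaimed by the collection, so the count is not predictable. *)
  Printf.printf "kept present: %b, entries left: %d\n"
    (StrSet.mem set kept) (StrSet.count set)
```

A weak set alone would drop a memoized pair as soon as nothing else refers to it, which is why the patch above also registers a finalizer that puts the pair back while its channel is still alive.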
Checking the Fixed Version
Curious to see the memory behavior after applying this patch? The graph below shows the memory consumption of the patched version of the Scanf module. Again, the interactive version is available here. After each iteration of the for-loop, the memory is released as expected, and memory consumption does not exceed 2.1 MB during each for-loop iteration.
Conclusion
This example is online in our gallery of examples if you want to see and explore the graphs (with the leak and without the leak).
Do not hesitate to use ocp-memprof on your applications. Of course, all feedback and suggestions on using ocp-memprof are welcome: just send us an email!
More information:
- Homepage: http://memprof.typerex.org/
- Usage: http://memprof.typerex.org/free-version.php
- Support: http://memprof.typerex.org/report-a-bug.php
- Gallery of examples: http://memprof.typerex.org/gallery.php
- Commercial: http://memprof.typerex.org/commercial-version.php
About OCamlPro:
OCamlPro has been developing high-value-added applications for more than 10 years, using the most advanced languages, such as OCaml and Rust, aiming at both speed of development and robustness, and targeting the most demanding domains (formal methods, cybersecurity, distributed systems/blockchain, DSL design). We have more than 20 R&D engineers with unique expertise in programming languages, both theoretical (more than 80% of our engineers hold a PhD in computer science) and practical (active participation in the development of several open-source compilers, prototyping of the Tezos blockchain, etc.), diversified (OCaml, Rust, Cobol, Python, Scilab, C/C++, etc.), and applied to multiple domains. We also provide custom, Qualiopi-certified training courses on OCaml, Rust, and formal methods (https://training.ocamlpro.com/). To contact us: contact@ocamlpro.com.