Detecting this problem has been a bit of detective work. I started from within python3, where I had wrapped liblept for use with libtesseract through a python3 C foreign function interface (cffi).
Background context:
From within python3, I detected what looked like a perfect memory leak: 100% (or even more) of the data pushed into the tesseract/lept interface failed to be freed, regardless of code changes.
I first isolated the python3 interface down to a bare-minimum C-interface call that involved libtesseract and liblept working together. The memory leak was present, with plateaus. So either cffi, libtesseract, or liblept had a leak, or I was misusing one of the libraries through cffi and the leak was my own error.
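For reference, that bare-minimum tesseract+lept call looked roughly like the sketch below. This is a reconstruction, not the exact code: the tesseract functions are declared from its C API (capi.h, since cffi can only call unmangled C symbols), and the image path and cycle count are placeholders.

import gc

from cffi import FFI

ffi = FFI()
ffi.cdef("""
    typedef struct Pix Pix;
    typedef struct TessBaseAPI TessBaseAPI;

    Pix * pixRead(const char * filename);
    void pixDestroy(Pix ** pix);

    TessBaseAPI * TessBaseAPICreate(void);
    int TessBaseAPIInit3(TessBaseAPI * handle, const char * datapath, const char * language);
    void TessBaseAPISetImage2(TessBaseAPI * handle, Pix * pix);
    char * TessBaseAPIGetUTF8Text(TessBaseAPI * handle);
    void TessDeleteText(char * text);
    void TessBaseAPIEnd(TessBaseAPI * handle);
    void TessBaseAPIDelete(TessBaseAPI * handle);
""")
lept = ffi.dlopen("lept")
tess = ffi.dlopen("tesseract")

api = tess.TessBaseAPICreate()
tess.TessBaseAPIInit3(api, ffi.NULL, b"eng")
for _ in range(100):
    # read, OCR, and tear down the pix each cycle; memory still climbed
    pix = lept.pixRead(b"/path/to/some-image.png")
    tess.TessBaseAPISetImage2(api, pix)
    text = tess.TessBaseAPIGetUTF8Text(api)
    tess.TessDeleteText(text)
    _pix = ffi.new("Pix **")
    _pix[0] = pix
    lept.pixDestroy(_pix)
    gc.collect()
tess.TessBaseAPIEnd(api)
tess.TessBaseAPIDelete(api)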
To rule out libtesseract, at least to some extent, I set up a C++ API example and tested it for leakage. There was no leakage, even though liblept was used as part of that test.
With that done, I dialed in on liblept as the possible source of the leakage. When isolating the liblept calls alone, I was able to reproduce a perfect memory leak with no plateaus.
At this point, I figured the error was most likely in cffi, since cffi is the most complex component (it bridges two languages), and I filed a bug there: CFFI Issue #527.
Current Issue
Per the advice of one of the developers there, I set up a C program to duplicate the python3 minimal example of the leakage.
Somewhere between these two pieces of code there is a flaw, and it lies in either liblept, cffi, or one of the supporting python3 libraries -- though the last seems unlikely to me, and does not appear to be the case.
Starting from the python3 side, I'll try to pinch the issue between pieces of working code. Most likely it falls to either lept or cffi, but it is still not clear to me exactly where the bug is. I'll develop my position while writing this issue -- hopefully valgrind will give me some clear information on the C side.
Starting with the detection of the problem in python3:
import gc

import click
import cv2
import numpy
from cffi import FFI
from PIL import Image as ImageModule
from PIL.Image import Image

DEFAULT_DPI = 300


def read_img(img_file: str):
    img = cv2.imread(img_file)
    # favoring numpy's memory management over opencv-python
    img_arr = numpy.copy(img)
    del img
    return img_arr


def cffi_memtest_pix(img: numpy.ndarray, cycles: int = 1):
    """Memory test the image cffi transaction with a standalone cffi interface.

    Run this for an arbitrary number of cycles in a docker container while monitoring the image's memory to confirm
    leakage.

    Args:
        img: the image to test
        cycles: the number of times to load and delete images
    """
    hdr = """
    typedef struct Pix Pix;

    char * getLeptonicaVersion();
    Pix * pixRead(const char* filename);
    Pix * pixCreate(int cols, int rows, int channels);
    Pix * pixSetData(Pix* pix, unsigned int * buffer);
    int pixSetResolution(Pix * pix, int xres, int yres);
    void pixDestroy(Pix ** pix);
    Pix * pixEndianByteSwap(Pix * pix);
    """
    global DEFAULT_DPI
    ffi = FFI(backend=None)
    ffi.cdef(hdr)
    lept = ffi.dlopen('lept')
    for i in range(cycles):
        pil: Image = ImageModule.fromarray(img).convert("RGBA")
        cols, rows = pil.size
        pil_img_buf = pil.tobytes("raw", "RGBA")
        pix = lept.pixCreate(cols, rows, 32)
        lept.pixSetData(pix, ffi.from_buffer("unsigned int[]", pil_img_buf))
        # disabled for minimal test
        # lept.pixSetResolution(pix, DEFAULT_DPI, DEFAULT_DPI)
        pix = lept.pixEndianByteSwap(pix)  # no-op on big-endian, fixes the problem on little-endian
        _pix = ffi.new("Pix **")
        _pix[0] = pix
        lept.pixDestroy(_pix)
        ffi.release(_pix)
        del pix
        del _pix
        gc.collect()
    ffi.dlclose(lept)


def py_memtest_pix(img: numpy.ndarray, cycles: int = 1):
    """Memory test the python component of the image cffi transaction with a standalone cffi interface.

    Run this for an arbitrary number of cycles in a docker container while monitoring the image's memory to confirm
    leakage.

    Args:
        img: the image to test
        cycles: the number of times to load and delete images
    """
    hdr = """
    typedef struct Pix Pix;

    char * getLeptonicaVersion();
    Pix * pixRead(const char* filename);
    Pix * pixCreate(int cols, int rows, int channels);
    Pix * pixSetData(Pix* pix, unsigned int * buffer);
    int pixSetResolution(Pix * pix, int xres, int yres);
    void pixDestroy(Pix ** pix);
    Pix * pixEndianByteSwap(Pix * pix);
    """
    global DEFAULT_DPI
    ffi = FFI(backend=None)
    ffi.cdef(hdr)
    lept = ffi.dlopen('lept')
    for i in range(cycles):
        pil: Image = ImageModule.fromarray(img).convert("RGBA")
        pil_img_buf = pil.tobytes("raw", "RGBA")
        gc.collect()
    ffi.dlclose(lept)


@click.command()
@click.argument('img', type=str)
@click.argument('cycles', type=int)
def run(img: str, cycles: int):
    img_ndarr = read_img(img)
    cffi_memtest_pix(img_ndarr, cycles)


if __name__ == "__main__":
    run()
My python3 memory-test procedure drops valgrind, since I found it problematic for benchmarking python3 memory usage. Instead, I ran the python3 code inside a python3 eval loop in a docker container and plotted the container's memory usage per second with gnuplot. The x-axis is time in seconds, the y-axis is memory usage in GB.
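The per-second sampling was done with something along these lines (a minimal sketch, not the exact script; the function name, output file, and unit conversion are illustrative):

import subprocess
import time


def sample_container_memory_gb(container: str, out_file: str, interval: float = 1.0):
    """Append "seconds memory_gb" rows for a running container, suitable for plotting with gnuplot."""
    start = time.time()
    with open(out_file, "w") as fh:
        while True:
            # `docker stats` reports e.g. "1.234GiB / 15.57GiB"; keep the usage half
            raw = subprocess.check_output(
                ["docker", "stats", "--no-stream", "--format", "{{.MemUsage}}", container],
                text=True,
            )
            usage = raw.split("/")[0].strip()
            if usage.endswith("GiB"):
                gb = float(usage[:-3]) * 1.073741824
            elif usage.endswith("MiB"):
                gb = float(usage[:-3]) * 1.073741824 / 1024
            else:
                gb = 0.0  # unexpected unit; skip rather than guess
            fh.write(f"{time.time() - start:.0f} {gb:.3f}\n")
            fh.flush()
            time.sleep(interval)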
Python3-only memtest result (PIL + numpy/opencv)

Python3 with CFFI(lept)

C-side analysis (in progress, checking in this issue's content so far):
I'll be starting with the code below and will try to reproduce the leakage. If the leak is not in this library, I'll use this issue as a reference for my issue in cffi:
#include "leptonica/allheaders.h"

int main(){
    struct Pix * r_pix = pixRead("/.../my-large-image.png");
    for (int i = 0; i < 1000; i++){
        // the pix that receives the data appears to free it on destruction, which would
        // leave the original pix pointing at freed data; copying first avoids a segfault
        struct Pix * r_pix_cp = pixCopy(NULL, r_pix);
        l_uint32 * pix_data = pixExtractData(r_pix_cp);
        // analog for the python3 conversion from PIL
        struct Pix * pix = pixCreate(r_pix->w, r_pix->h, r_pix->d);
        pixSetData(pix, pix_data);
        pixEndianByteSwap(pix);
        pixDestroy(&pix);
        // free the rest of the copied struct
        pixDestroy(&r_pix_cp);
    }
    // free the original pix
    pixDestroy(&r_pix);
    return 0;
}
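Once this compiles, I'll run it under valgrind, roughly along the lines of gcc lept_memtest.c $(pkg-config --cflags --libs lept) -o lept_memtest followed by valgrind --leak-check=full ./lept_memtest (the file name is a placeholder and the exact flags may differ); I'll check the results into this issue when I have them.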