DATA INTO - Memory Leak Issues

Discussions relating to the ScottKlement.com port of the open source YAJL JSON Reader/Generator. This includes the YAJL tool as well as the YAJLR4, YAJLGEN, YAJLINTO and YAJLDTAGEN add-ons from ScottKlement.com. http://www.scottklement.com/yajl/
Post Reply
anchitmb
Posts: 2
Joined: Wed Jan 12, 2022 1:38 pm

DATA INTO - Memory Leak Issues

Post by anchitmb »

hi Scott,
I have a very simple program that will parse a JSON document using DATA-INTO and YAJLINTO. We receive the JSON request from other systems via MQ. We have a never ending job that will be reading the MQ. This job will run for months in the system and processes about about million requests per day . What we are seeing is that when the temp storage of the job is increasing every day.

Here in the sample program, I am running the job for 10,00 cycles in a loop. You will see that the memory of the job keeps on increasing. At the end of the job the job would have used about 27 MB of temp storage. In production, we are running with much larger json payload and over a million cycles per day. We are seeing the temp storage in GB.

I tried debugging it but was not able to point out the issue. Can you please help us identify the issue or if there is a way to clean up the memory after using YAJLINTO explicitly using a subprocedure.


Edit1: Few more data points.
1. We are using the TERASPACE version.
2. In the PEX report when TERASPACE = (*YES)
yajl_internal_free() function calls, there never appear to be any corresponding calls to _C_TS_free().
3. In the PEX report when TERASPACE = (*NO)
We see the standard C library free() functions in the QC2UTIL1 *SRVPGM from the QC2ALLOC.do_free_default__FPv() procedure.


Thanks in Advance !

Code: Select all


**free                                                                     
       ctl-opt DFTACTGRP(*NO) ACTGRP('SGP') BNDDIR('YAJL') DECEDIT('0.');  
                                                                           
       /define YAJL_C_PROTOTYPES                                           
       /include YAJL_H                                                     
       /include IFSIO_H                                                    
       /include ICONV_H                                                    
                                                                           
                                                                           
       dcl-s json_string   char(20000);                                    
       dcl-s errMsg       varchar(500);                                    
       dcl-s i            int(5)      ;                                    
                                                                           
       dcl-s  num_sample int(5);                                           
       dcl-ds sample qualified dim(005);                                   
          id           char(30);                                           
          type         char(30);                                           
          name         char(30);                                           
          ppu          char(9);                                            
          dcl-ds batters;                                                          
             dcl-ds batter dim(100);                                               
                id           char(30);                                             
                type         char(30);                                             
             end-ds;                                                               
          end-ds;                                                                  
          dcl-ds topping dim(100);                                                 
                id           char(30);                                             
                type         char(30);                                             
          end-ds;                                                                  
       end-ds;                                                                     
                                                                                   
       for i = 1 to 10000;                                                         
       json_string = '                                                         +   
       [{"id":"0001","type":"donut","name":"Cake","ppu":"0.55","batters":        + 
       {"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":          +   
       "Chocolate"},{"id":"1003","type":"Blueberry"},{"id":"1004","type":      +   
       "DevilsFood"}]},"topping":[{"id":"5001","type":"None"},                 +   
       {"id":"5002","type":"Glazed"},{"id":"5005","type":"Sugar"},             +   
       {"id":"5007","type":"PowderedSugar"},{"id":"5006","type"                +   
       :"ChocolatewithSprinkles"},{"id":"5003","type":"Chocolate"},{"id":      +   
       "5004","type":"Maple"}]},{"id":"0002","type":"donut","name":"Raised",   +   
       "ppu":"0.55","batters":{"batter":[{"id":"1001","type":"Regular"}]},       + 
       "topping":[{"id":"5001","type":"None"},{"id":"5002","type":"Glazed"},   +   
       {"id":"5005","type":"Sugar"},{"id":"5003","type":"Chocolate"},          +   
       {"id":"5004","type":"Maple"}]},{"id":"0003","type":"donut","name":      +   
       "OldFashioned","ppu":"0.55","batters":{"batter":[{"id":"1001","type":     + 
       "Regular"},{"id":"1002","type":"Chocolate"}]},"topping":[{"id":"5001",  +   
       "type":"None"},{"id":"5002","type":"Glazed"},{"id":"5003","type":       +   
       "Chocolate"},{"id":"5004","type":"Maple"}]}]';                              
                                                                                   
          data-into sample  %DATA(json_string                                      
                :'doc=string case=any allowmissing=yes +                           
                allowextra=yes trim=none countprefix=num_')                        
          %PARSER('YAJLINTO');                                                     
                                 
       endfor;                                                                     
                                                                                   
       *inlr = *on;     
Scott Klement
Site Admin
Posts: 652
Joined: Sun Jul 04, 2021 5:12 am

Re: DATA INTO - Memory Leak Issues

Post by Scott Klement »

Hello Anchit,

I'm looking into this problem and trying to determine where the memory leak is.

To do that, I modified my copy of YAJLINTO so that each time it allocates memory with RPG's %ALLOC BIF, or via the various yajl functions such as yajl_alloc, yajl_iconv_dyn, etc it adds 1 to a counter. Each time it frees up the memory for those things, it subtracts one from the counter. At the end, if all is well, the counter should be zero (meaning that it freed everything that it allocated.)

Indeed, after making these modifications, the counters are all zero. Everything allocated in YAJLINTO is being freed up properly. So the problem must be somewhere else, either it is inside YAJLR4, or it is somewhere in the C code. I have not yet determined where the problem is.

I am still looking -- I just wanted to give you a status update.
Scott Klement
Site Admin
Posts: 652
Joined: Sun Jul 04, 2021 5:12 am

Re: DATA INTO - Memory Leak Issues

Post by Scott Klement »

Hello again!

I completed my analysis of the problem, and the memory leak does not appear to be in YAJL. My best guess is that it is in the RPG runtime, perhaps the code that IBM added to RPG for the DATA-INTO opcode.

To determine that, I added some "memory status" features to YAJL to monitor the number of times it allocated memory or file descriptors (as I described in my previous post.) I ran it and got this report:

Code: Select all

YAJLINTO OBJECTS IN USE:
-------------------------------
parsers           = 0
trees             = 0
ccsid converters  = 0
converted strings = 0
error structures  = 0
heap buffers      = 0

YAJLR4 OBJECTS IN USE:
------------------------------
file descriptors  = 0
parsers           = 0
generators        = 0
trees             = 0
ccsid converters  = 0
converted strings = 0
error structures  = 0
heap buffers      = 0
expanding buffers = 0
The fact that this shows nothing but zeroes tells me that it is successfully freeing up all of the allocations that it makes. Now of course loading YAJL and YAJLINTO into memory is going to use some memory -- the program itself takes up memory after all -- but this memory won't "grow" with repeated usage. It is completely normal, and not a memory leak.

To verify this, I looked at the temporary storage before running your program:

Storage prior to running your program:

Code: Select all

                                                              Temporary
Opt  Subsystem/Job  User        Number  Type  CPU %  Threads   Storage
     QINTER         QSYS        059609  SBS      .0       22         6
       QPADEV002N   SKLEMENT    090283  INT      .0        1        30
As you can see, my job is using 30mb of temporary storage before having used YAJL or YAJLINTO. I took the source code that you posted at the start of this thread and compiled it into a program named ANCHIT. (I named it after you! I didn't know what else to call it...) And then I ran the program, and checked the memory usage afterwards.

Code: Select all

CALL PGM(ANCHIT)
                                                               Temporary
Opt  Subsystem/Job  User        Number  Type  CPU %  Threads   Storage
     QINTER         QSYS        059609  SBS      .0       22         6
       QPADEV002N   SKLEMENT    090283  INT      .0        1        54
Notice that there is 24mb of additional temporary storage now in use. That is the problem that you noticed! It is using extra storage that it is not freeing up! That isn't good.

Next I ran the program and stopped it (using a breakpoint) when the YAJL service program is in memory. This let me display the call stack to see all of the programs that are loaded and which activation groups they are in.

Code: Select all

                               ------Activation Group------  Control 
Type  Program                  Name        Number            Boundary
      DETGENIE   SKLEMENT      *DFTACTGRP  0000000000000002  Yes     
   1  QCMD       QSYS          *DFTACTGRP  0000000000000001  No      
      ANCHIT     SKLEMENT      SGP         000000000000004F  Yes     
      ANCHIT     SKLEMENT      SGP         000000000000004F  No      
      QRNXUTIL   QSYS          SGP         000000000000004F  No      
      QRNXUTIL   QSYS          SGP         000000000000004F  No      
      QRNXUTIL   QSYS          SGP         000000000000004F  No      
      YAJLINTO   SKLEMENT      YAJL        0000000000000050  Yes     
      YAJLINTO   SKLEMENT      YAJL        0000000000000050  No      
      YAJL       SKLEMENT      YAJL        0000000000000050  No      
As you can see, aside from QCMD (the IBM command line) and DETGENIE (an unrelated program of mine that always runs when I sign on) the only programs in memory are your program (named ANCHIT) the RPG runtime (QRNXUTIL) and the YAJL programs (YAJLINTO and YAJL). The ANCHIT and QRNXUTIL code is loaded into the SGP activation group. YAJL and YAJLINTO are in the YAJL activation group.

To determine which ones are using memory, we can reclaim each activation group. First, I will reclaim YAJL, since that's where we thought the memory leak was:

Code: Select all

RCLACTGRP YAJL
                                                              Temporary
Opt  Subsystem/Job  User        Number  Type  CPU %  Threads   Storage
     QINTER         QSYS        059609  SBS      .0       22         6
       QPADEV002N   SKLEMENT    090283  INT      .0        1        53
Notice that this only reduced the temporary storage by 1mb -- a very small amount. This is just because the YAJL and YAJLINTO code is no longer activated in memory -- so is completely normal. There is still 23mb of extra temporary storage even without YAJL!

Next, I tried reclaiming your SGP activation group:

Code: Select all

RCLACTGRP SGP
                                                              Temporary
Opt  Subsystem/Job  User        Number  Type  CPU %  Threads   Storage 
     QINTER         QSYS        059609  SBS      .0       22         6 
       QPADEV002N   SKLEMENT    090283  INT      .0        1        30 
Now it is back to where I started at 30mb of temporary storage. The entire 23mb of extra storage was in the SGP activation group. About 1mb of that is normal memory usage from the program being activated, and the remaining 22mb is from memory leaking for the 10000 iterations of data-into . We know that only ANCHIT and QRNXUTIL were in this activation group, and since ANCHIT is very simple it seems likely that it is QRNXUTIL has the problem.

You mentioned that your memory usage kept growing larger the more times you ran DATA-INTO. To test that, I changed your program to loop 20000 times (instead of the original 10000) and tried the same experiment.

Code: Select all

CALL PGM(ANCHIT)
                                                              Temporary
Opt  Subsystem/Job  User        Number  Type  CPU %  Threads   Storage
     QINTER         QSYS        059609  SBS      .0       22         6
       QPADEV002N   SKLEMENT    090283  INT      .0        1        76
With 20000 iterations, it uses 76mb. We started with 30mb in use, so that's 46mb additional. Assuming that 2mb of that is normal (1mb for ANCHIT/QRNXUTIL, 1mb for YAJL/YAJLINTO) that leaves 44mb of "leaked" memory -- exactly twice as much as we had with 10000 iterations, so that makes perfect sense.

Let's try reclaiming YAJL's memory again:

Code: Select all

RCLACTGRP YAJL
                                                              Temporary
Opt  Subsystem/Job  User        Number  Type  CPU %  Threads   Storage
     QINTER         QSYS        059609  SBS      .0       22         6
       QPADEV002N   SKLEMENT    090283  INT      .0        1        75
Notice that reclaiming YAJL once again freed up the "normal" 1mb of memory, exactly the same as before.

Now let's try reclaiming SGP:

Code: Select all

RCLACTGRP SGP
                                                              Temporary
Opt  Subsystem/Job  User        Number  Type  CPU %  Threads   Storage
     QINTER         QSYS        059609  SBS      .0       22         6
       QPADEV002N   SKLEMENT    090283  INT      .0        1        30
It is now back to the starting point of 30mb of temporary storage.

It seems very clear to me that the problem is not in YAJL!!

At the moment the code that adds the "YAJL objects in use" report is only on my test copy of YAJL. If you'd like to try it yourself, please let me know. But, clearly, the problem is not in YAJL.
Scott Klement
Site Admin
Posts: 652
Joined: Sun Jul 04, 2021 5:12 am

Re: DATA INTO - Memory Leak Issues

Post by Scott Klement »

Compiling with teraspace enabled is the default. In fact, I don't know anyone who isn't using the teraspace option.

YAJLINTO doesn't ever call _C_TS_malloc64 or _C_TS_free, it uses RPG's %ALLOC BIF and DEALLOC opcode. YAJLR4 does use teraspace allocations, but only if you call particular subprocedures -- and you are not calling any of those in your example code.

See my previous post. My guess is that the RPG runtime is calling _C_TS_malloc and not calling _C_TS_free.
anchitmb
Posts: 2
Joined: Wed Jan 12, 2022 1:38 pm

Re: DATA INTO - Memory Leak Issues

Post by anchitmb »

Thank you Scott!
I never thought about the RCLACTGRP option. We tried with the other JSON parser (one provided by IBM which they say is not prod grade) and we see the same memory leak issue. This definitely rules out the YAJL parser. I am also surprised - how efficient it is - just use 1MB of memory.

We really appreciate all you help here. YAJL was the first open source code at our shop and this gives us lot more confidence with the open source projects.

If you can please share the code for "YAJL objects in use" report - i will run more tests in our environment.

I am bit surprised that data-into would have a memory leak issue and i did not see any one else complain about it yet. May be it is the way we are using it in a never-ending job and our volume that we found out about it so soon.

Again, thank you very much and appreciate all your help.

Hope to meet you in person again - hopefully at omni user group this year.


Thanks
Scott Klement
Site Admin
Posts: 652
Joined: Sun Jul 04, 2021 5:12 am

Re: DATA INTO - Memory Leak Issues

Post by Scott Klement »

Anchit,

I have updated the YAJL on my web site:
https://www.scottklement.com/yajl/

Please download the latest copy from there. (If needed, clear your browser cache so you don't have the old one cached.) Once you have it installed, you may do this in the job where you will be testing YAJL:

Code: Select all

ADDENVVAR ENVVAR(YAJL_MEMORY_STATUS) VALUE('/tmp/yajl_memory_log.txt')
Note: environment variables are case sensitive, please make sure you match upper/lowercase exactly.

When that environment variable is set, each time YAJLINTO runs, it will report how many objects it has allocated in memory when its done parsing each document. If the IFS file doesn't exist, it will create it. If it does already exist, it will append the report to the end of the file. (So, for example, in your sample program that parsed the document 10000 times, the report will be added 10000 times to the file.)

If there is a leak, you will see the numbers increasing with each call to YAJLINTO... but if there is none, the numbers will be the same. If it has nothing in memory, they'll all be zero.

Hope that helps!
Post Reply