Running JuMP using KNITRO: killed

question

#1

I am running a bunch of optimization problems in a row with different parameters using JuMP and KNITRO. After about 5-7 hours the process simply prints Killed and I am returned to the terminal with no additional information. Might this be a small memory leak? Should I somehow clear the RAM after each run?


#2

I also encountered a similar problem with a long-running script. The memory usage gradually builds up until the process gets killed. I do not have an MWE yet.

I am not using JuMP though, so this could be a Julia memory leak bug.

There are also some memory leak issues about distributed computation in the Julia repo. I don't think I am using it, but I am not sure about the packages I depend on. I did use @async though.


#3

The Killed message means the process ran out of memory: the kernel triggered its out-of-memory (OOM) killer. You can see the logs at /var/log/messages or equivalent.
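
Which file counts as the "equivalent" depends on the distribution; on Debian/Ubuntu systems kernel messages typically land in /var/log/kern.log and /var/log/syslog rather than /var/log/messages. A minimal sketch for searching whichever of these exist (the function name `find_oom_kills` is made up for illustration):

```shell
# Search the given log files for OOM-killer records.
# Files that do not exist or are not readable are skipped silently.
find_oom_kills() {
    for oom_log in "$@"; do
        if [ -r "$oom_log" ]; then
            # -H prefixes each match with the file it came from.
            grep -i -H -E 'out of memory|killed process' "$oom_log" || true
        fi
    done
}

# Example:
#   find_oom_kills /var/log/messages /var/log/syslog /var/log/kern.log
```

Rotated archives such as kern.log.2.gz are compressed, so they would need zgrep instead of grep.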


#4

@bbrunaud thank you for the reply! There are many different files here, which ones should I look at:

febbo@febbo-HP-Z220-SFF-Workstation:/var/log$ ls
alternatives.log        alternatives.log.7.gz  apport.log.5.gz  auth.log.2.gz  dist-upgrade    dpkg.log.5.gz   gpu-manager.log  lightdm            syslog.6.gz          Xorg.0.log.old
alternatives.log.1      alternatives.log.8.gz  apport.log.6.gz  auth.log.3.gz  dmesg           dpkg.log.6.gz   installer        speech-dispatcher  syslog.7.gz          Xorg.1.log
alternatives.log.10.gz  alternatives.log.9.gz  apport.log.7.gz  auth.log.4.gz  dpkg.log        dpkg.log.7.gz   kern.log         syslog             teamviewer12         Xorg.1.log.old
alternatives.log.2.gz   apport.log             apt              boot.log       dpkg.log.1      dpkg.log.8.gz   kern.log.1       syslog.1           unattended-upgrades  Xorg.2.log
alternatives.log.3.gz   apport.log.1           aptitude         bootstrap.log  dpkg.log.10.gz  dpkg.log.9.gz   kern.log.2.gz    syslog.2.gz        upstart              Xorg.2.log.old
alternatives.log.4.gz   apport.log.2.gz        aptitude.1.gz    btmp           dpkg.log.2.gz   faillog         kern.log.3.gz    syslog.3.gz        wtmp                 Xorg.3.log
alternatives.log.5.gz   apport.log.3.gz        auth.log         btmp.1         dpkg.log.3.gz   fontconfig.log  kern.log.4.gz    syslog.4.gz        wtmp.1
alternatives.log.6.gz   apport.log.4.gz        auth.log.1       cups           dpkg.log.4.gz   fsck            lastlog          syslog.5.gz        Xorg.0.log

Also if I do:

dmesg -T| grep -E -i -B5 'killed process'

And look at the lines just before it is killed I get:

febbo@febbo-HP-Z220-SFF-Workstation:~/Documents//results/pp$ dmesg -T| grep -E -i -B5 'killed process'
[Tue Feb 20 04:24:24 2018] [ 6212]  1000  6212   353344      577      83       5      966             0 TVGuiSlave.64
[Tue Feb 20 04:24:24 2018] [ 6213]  1000  6213    33381      115      38       2     1824             0 TVGuiDelegate
[Tue Feb 20 04:24:24 2018] [ 6275]  1000  6275  8016419  3941396   15269      33  3601973             0 julia
[Tue Feb 20 04:24:24 2018] [ 6778]  1000  6778   137757      280     120       3      940             0 unity-panel-ser
[Tue Feb 20 04:24:24 2018] Out of memory: Kill process 6275 (julia) score 933 or sacrifice child
[Tue Feb 20 04:24:24 2018] Killed process 6275 (julia) total-vm:32065676kB, anon-rss:15765584kB, file-rss:0kB
--
[Wed Feb 21 04:55:33 2018] [ 6213]  1000  6213    33381       60      38       2     1928             0 TVGuiDelegate
[Wed Feb 21 04:55:33 2018] [19292]  1000 19292   165148       42     140       4     2339             0 gnome-terminal-
[Wed Feb 21 04:55:33 2018] [19299]  1000 19299     5845        1      18       3      670             0 bash
[Wed Feb 21 04:55:33 2018] [19456]  1000 19456  7863062  3953755   15203      33  3547061             0 julia
[Wed Feb 21 04:55:33 2018] Out of memory: Kill process 19456 (julia) score 928 or sacrifice child
[Wed Feb 21 04:55:33 2018] Killed process 19456 (julia) total-vm:31452248kB, anon-rss:15815020kB, file-rss:0kB
--
[Thu Feb 22 03:09:35 2018] [ 5770]  1000  5770   184534        1     154       4      666             0 gvfsd-smb-brows
[Thu Feb 22 03:09:35 2018] [ 5779]  1000  5779    90429        0      45       3      252             0 gvfsd-dnssd
[Thu Feb 22 03:09:35 2018] [ 5865]  1000  5865  7852110  3946446   15138      33  3571519             0 julia
[Thu Feb 22 03:09:35 2018] [ 7250]  1000  7250   137749      259     121       4     1473             0 unity-panel-ser
[Thu Feb 22 03:09:35 2018] Out of memory: Kill process 5865 (julia) score 930 or sacrifice child
[Thu Feb 22 03:09:35 2018] Killed process 5865 (julia) total-vm:31408440kB, anon-rss:15784840kB, file-rss:944kB
--
[Fri Feb 23 02:15:37 2018] [ 5779]  1000  5779    90429       25      45       3      211             0 gvfsd-dnssd
[Fri Feb 23 02:15:37 2018] [31248]  1000 31248   178707        0     198       4    14099             0 update-manager
[Fri Feb 23 02:15:37 2018] [ 2457]  1000  2457  7795624  3945875   15051      33  3504405             0 julia
[Fri Feb 23 02:15:37 2018] [ 6061]  1000  6061   137759      273     124       3      947             0 unity-panel-ser
[Fri Feb 23 02:15:37 2018] Out of memory: Kill process 2457 (julia) score 922 or sacrifice child
[Fri Feb 23 02:15:37 2018] Killed process 2457 (julia) total-vm:31182496kB, anon-rss:15783500kB, file-rss:0kB
--
[Sun Feb 25 01:38:21 2018] [31248]  1000 31248   178707        0     198       4    14098             0 update-manager
[Sun Feb 25 01:38:21 2018] [31641]  1000 31641   158830    11162     243       2    88624             0 TeamViewer_Desk
[Sun Feb 25 01:38:21 2018] [31842]  1000 31842  7898497  3921800   14943      33  3464435             0 julia
[Sun Feb 25 01:38:21 2018] [ 4229]  1000  4229   137758      257     119       4      950             0 unity-panel-ser
[Sun Feb 25 01:38:21 2018] Out of memory: Kill process 31842 (julia) score 914 or sacrifice child
[Sun Feb 25 01:38:21 2018] Killed process 31842 (julia) total-vm:31593988kB, anon-rss:15686432kB, file-rss:768kB
--
[Mon Feb 26 09:14:39 2018] [ 5779]  1000  5779    90429        0      45       3      269             0 gvfsd-dnssd
[Mon Feb 26 09:14:39 2018] [31248]  1000 31248   178707       68     198       4    13986             0 update-manager
[Mon Feb 26 09:14:39 2018] [14615]  1000 14615  7775460  3937516   15002      32  3464151             0 julia
[Mon Feb 26 09:14:39 2018] [17523]  1000 17523   137742      258     126       3      958             0 unity-panel-ser
[Mon Feb 26 09:14:39 2018] Out of memory: Kill process 14615 (julia) score 916 or sacrifice child
[Mon Feb 26 09:14:39 2018] Killed process 14615 (julia) total-vm:31101840kB, anon-rss:15750064kB, file-rss:0kB
--
[Wed Feb 28 18:26:52 2018] [ 6945]  1000  6945    51965        5      35       3      277             0 oosplash
[Wed Feb 28 18:26:52 2018] [ 6963]  1000  6963   449680     5575     321       5     7391             0 soffice.bin
[Wed Feb 28 18:26:52 2018] [16733]  1000 16733  7751156  3931379   14923      33  3511692             0 julia
[Wed Feb 28 18:26:52 2018] [16745]  1000 16745   137769      255     123       4      964             0 unity-panel-ser
[Wed Feb 28 18:26:52 2018] Out of memory: Kill process 16733 (julia) score 921 or sacrifice child
[Wed Feb 28 18:26:52 2018] Killed process 16733 (julia) total-vm:31004624kB, anon-rss:15725416kB, file-rss:100kB
febbo@febbo-HP-Z220-SFF-Workstation:~/Documents/workspace/PhD/papers/RTPP/results/pp$ 
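
The `Killed process` lines report memory in kB, which is hard to read at a glance. As a rough aid, the sketch below (the function name `oom_rss_gib` is made up for illustration) pulls out the anon-rss value and converts it to GiB by dividing by 1048576 (1024^2). For the first record above, 15765584 kB comes out at about 15.0 GiB of resident memory held by julia at the moment it was killed.

```shell
# Read kernel log lines on stdin and, for every "Killed process" record,
# print the process name, pid, and anon-rss converted from kB to GiB.
oom_rss_gib() {
    sed -n 's/.*Killed process \([0-9]*\) (\([^)]*\)).*anon-rss:\([0-9]*\)kB.*/\1 \2 \3/p' |
    awk '{ printf "%s (pid %s): %.1f GiB resident\n", $2, $1, $3 / 1048576 }'
}

# Example:
#   dmesg -T | grep -i 'killed process' | oom_rss_gib
```

Every record in the output above lands at roughly the same resident size, which suggests memory grows steadily until the machine runs out rather than one particular run being unusually large.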

Is there a way to clear the cache between runs? Everything works fine right after I restart Julia, but once it has been running for a while I need to restart it again. I have tried:

 clear!(:n)

where n is the object that holds all of the data. My feeling is that KNITRO and IPOPT need to be cleared or reset. Is there a way to do this? I guess that I could open up a new julia after each run…
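
One way to get a new julia for each run is to drive the runs from the shell, so that every problem is solved in its own short-lived process and the operating system reclaims all of its memory on exit. This is only a sketch under assumptions: `solve_one.jl` is a hypothetical script that reads one parameter from its command-line arguments and solves a single problem, and `run_each_fresh` and `JULIA_BIN` are names invented here. It sidesteps the leak rather than fixing it, and each run pays Julia's startup and compilation time again.

```shell
# Run SCRIPT once per parameter, each time in a brand-new process.
# JULIA_BIN may be set to choose the executable; it defaults to "julia".
# Returns 1 if any run failed (an OOM kill shows up as exit status 137).
run_each_fresh() {
    fresh_script=$1
    shift
    fresh_failed=0
    for fresh_param in "$@"; do
        fresh_status=0
        "${JULIA_BIN:-julia}" "$fresh_script" "$fresh_param" || fresh_status=$?
        if [ "$fresh_status" -ne 0 ]; then
            echo "parameter $fresh_param failed with exit status $fresh_status" >&2
            fresh_failed=1
        fi
    done
    return "$fresh_failed"
}

# Example (solve_one.jl is hypothetical):
#   run_each_fresh solve_one.jl 0.1 0.2 0.5
```

Because a failed run is only reported and not fatal, one killed process does not stop the remaining parameters from being tried.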


Killing current julia and launching a new one programmatically