I am running a bunch of optimization problems in a row with different parameters using JuMP and KNITRO, and after about 5-7 hours the process simply says Killed
and I am sent back to the terminal with no additional information. Might this be a small memory leak? Should I somehow clear the RAM after each run?
I also encountered a similar problem with a long-running script. The memory usage gradually builds up and the process finally gets killed. I do not have an MWE yet.
I am not using JuMP though, so this could be a Julia memory leak bug.
There are also some memory leak issues about distributed computation in the Julia repo, but I don't think I am using that, although I am not sure about other dependent packages. I do use @async though.
The Killed message means the process ran out of memory: the kernel triggered the out-of-memory (OOM) killer. You can see the logs at /var/log/messages or the equivalent for your distribution.
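As a rough sketch of what "or equivalent" can mean in practice: on Debian/Ubuntu systems the kernel messages usually end up in /var/log/kern.log and /var/log/syslog rather than /var/log/messages. The helper name find_oom_kills below is made up for illustration, and which of the files exist depends on the distribution.

```shell
# Hypothetical helper: print OOM-killer entries found in the given log files.
# Missing or unreadable files are skipped silently (errors go to /dev/null).
find_oom_kills() {
    grep -E -i -h 'out of memory|killed process' "$@" 2>/dev/null
}

# Typical locations; not every system has all of them.
find_oom_kills /var/log/messages /var/log/kern.log /var/log/syslog \
    || echo "no OOM-killer entries found in these files"
```

Rotated logs with a .gz suffix are compressed, so they would need zgrep instead of grep.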
@bbrunaud thank you for the reply! There are many different files here, which ones should I look at:
febbo@febbo-HP-Z220-SFF-Workstation:/var/log$ ls
alternatives.log alternatives.log.7.gz apport.log.5.gz auth.log.2.gz dist-upgrade dpkg.log.5.gz gpu-manager.log lightdm syslog.6.gz Xorg.0.log.old
alternatives.log.1 alternatives.log.8.gz apport.log.6.gz auth.log.3.gz dmesg dpkg.log.6.gz installer speech-dispatcher syslog.7.gz Xorg.1.log
alternatives.log.10.gz alternatives.log.9.gz apport.log.7.gz auth.log.4.gz dpkg.log dpkg.log.7.gz kern.log syslog teamviewer12 Xorg.1.log.old
alternatives.log.2.gz apport.log apt boot.log dpkg.log.1 dpkg.log.8.gz kern.log.1 syslog.1 unattended-upgrades Xorg.2.log
alternatives.log.3.gz apport.log.1 aptitude bootstrap.log dpkg.log.10.gz dpkg.log.9.gz kern.log.2.gz syslog.2.gz upstart Xorg.2.log.old
alternatives.log.4.gz apport.log.2.gz aptitude.1.gz btmp dpkg.log.2.gz faillog kern.log.3.gz syslog.3.gz wtmp Xorg.3.log
alternatives.log.5.gz apport.log.3.gz auth.log btmp.1 dpkg.log.3.gz fontconfig.log kern.log.4.gz syslog.4.gz wtmp.1
alternatives.log.6.gz apport.log.4.gz auth.log.1 cups dpkg.log.4.gz fsck lastlog syslog.5.gz Xorg.0.log
Also if I do:
dmesg -T| grep -E -i -B5 'killed process'
And look at the lines just before it is killed I get:
febbo@febbo-HP-Z220-SFF-Workstation:~/Documents//results/pp$ dmesg -T| grep -E -i -B5 'killed process'
[Tue Feb 20 04:24:24 2018] [ 6212] 1000 6212 353344 577 83 5 966 0 TVGuiSlave.64
[Tue Feb 20 04:24:24 2018] [ 6213] 1000 6213 33381 115 38 2 1824 0 TVGuiDelegate
[Tue Feb 20 04:24:24 2018] [ 6275] 1000 6275 8016419 3941396 15269 33 3601973 0 julia
[Tue Feb 20 04:24:24 2018] [ 6778] 1000 6778 137757 280 120 3 940 0 unity-panel-ser
[Tue Feb 20 04:24:24 2018] Out of memory: Kill process 6275 (julia) score 933 or sacrifice child
[Tue Feb 20 04:24:24 2018] Killed process 6275 (julia) total-vm:32065676kB, anon-rss:15765584kB, file-rss:0kB
--
[Wed Feb 21 04:55:33 2018] [ 6213] 1000 6213 33381 60 38 2 1928 0 TVGuiDelegate
[Wed Feb 21 04:55:33 2018] [19292] 1000 19292 165148 42 140 4 2339 0 gnome-terminal-
[Wed Feb 21 04:55:33 2018] [19299] 1000 19299 5845 1 18 3 670 0 bash
[Wed Feb 21 04:55:33 2018] [19456] 1000 19456 7863062 3953755 15203 33 3547061 0 julia
[Wed Feb 21 04:55:33 2018] Out of memory: Kill process 19456 (julia) score 928 or sacrifice child
[Wed Feb 21 04:55:33 2018] Killed process 19456 (julia) total-vm:31452248kB, anon-rss:15815020kB, file-rss:0kB
--
[Thu Feb 22 03:09:35 2018] [ 5770] 1000 5770 184534 1 154 4 666 0 gvfsd-smb-brows
[Thu Feb 22 03:09:35 2018] [ 5779] 1000 5779 90429 0 45 3 252 0 gvfsd-dnssd
[Thu Feb 22 03:09:35 2018] [ 5865] 1000 5865 7852110 3946446 15138 33 3571519 0 julia
[Thu Feb 22 03:09:35 2018] [ 7250] 1000 7250 137749 259 121 4 1473 0 unity-panel-ser
[Thu Feb 22 03:09:35 2018] Out of memory: Kill process 5865 (julia) score 930 or sacrifice child
[Thu Feb 22 03:09:35 2018] Killed process 5865 (julia) total-vm:31408440kB, anon-rss:15784840kB, file-rss:944kB
--
[Fri Feb 23 02:15:37 2018] [ 5779] 1000 5779 90429 25 45 3 211 0 gvfsd-dnssd
[Fri Feb 23 02:15:37 2018] [31248] 1000 31248 178707 0 198 4 14099 0 update-manager
[Fri Feb 23 02:15:37 2018] [ 2457] 1000 2457 7795624 3945875 15051 33 3504405 0 julia
[Fri Feb 23 02:15:37 2018] [ 6061] 1000 6061 137759 273 124 3 947 0 unity-panel-ser
[Fri Feb 23 02:15:37 2018] Out of memory: Kill process 2457 (julia) score 922 or sacrifice child
[Fri Feb 23 02:15:37 2018] Killed process 2457 (julia) total-vm:31182496kB, anon-rss:15783500kB, file-rss:0kB
--
[Sun Feb 25 01:38:21 2018] [31248] 1000 31248 178707 0 198 4 14098 0 update-manager
[Sun Feb 25 01:38:21 2018] [31641] 1000 31641 158830 11162 243 2 88624 0 TeamViewer_Desk
[Sun Feb 25 01:38:21 2018] [31842] 1000 31842 7898497 3921800 14943 33 3464435 0 julia
[Sun Feb 25 01:38:21 2018] [ 4229] 1000 4229 137758 257 119 4 950 0 unity-panel-ser
[Sun Feb 25 01:38:21 2018] Out of memory: Kill process 31842 (julia) score 914 or sacrifice child
[Sun Feb 25 01:38:21 2018] Killed process 31842 (julia) total-vm:31593988kB, anon-rss:15686432kB, file-rss:768kB
--
[Mon Feb 26 09:14:39 2018] [ 5779] 1000 5779 90429 0 45 3 269 0 gvfsd-dnssd
[Mon Feb 26 09:14:39 2018] [31248] 1000 31248 178707 68 198 4 13986 0 update-manager
[Mon Feb 26 09:14:39 2018] [14615] 1000 14615 7775460 3937516 15002 32 3464151 0 julia
[Mon Feb 26 09:14:39 2018] [17523] 1000 17523 137742 258 126 3 958 0 unity-panel-ser
[Mon Feb 26 09:14:39 2018] Out of memory: Kill process 14615 (julia) score 916 or sacrifice child
[Mon Feb 26 09:14:39 2018] Killed process 14615 (julia) total-vm:31101840kB, anon-rss:15750064kB, file-rss:0kB
--
[Wed Feb 28 18:26:52 2018] [ 6945] 1000 6945 51965 5 35 3 277 0 oosplash
[Wed Feb 28 18:26:52 2018] [ 6963] 1000 6963 449680 5575 321 5 7391 0 soffice.bin
[Wed Feb 28 18:26:52 2018] [16733] 1000 16733 7751156 3931379 14923 33 3511692 0 julia
[Wed Feb 28 18:26:52 2018] [16745] 1000 16745 137769 255 123 4 964 0 unity-panel-ser
[Wed Feb 28 18:26:52 2018] Out of memory: Kill process 16733 (julia) score 921 or sacrifice child
[Wed Feb 28 18:26:52 2018] Killed process 16733 (julia) total-vm:31004624kB, anon-rss:15725416kB, file-rss:100kB
febbo@febbo-HP-Z220-SFF-Workstation:~/Documents/workspace/PhD/papers/RTPP/results/pp$
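A note on reading the table above, assuming the usual column layout of the OOM-killer dump on 4.x kernels (pid, uid, tgid, total_vm, rss, nr_ptes, nr_pmds, swapents, oom_score_adj, name) with sizes counted in 4 kB pages. Under that assumption the first julia row has rss = 3941396 pages, and 3941396 × 4 kB = 15765584 kB, which matches the anon-rss value in the corresponding "Killed process" line. The helper name pages_to_gib is made up for this sketch.

```shell
# Hypothetical helper: convert a page count from the OOM-killer table to GiB.
# Assumes 4 KiB pages (check with: getconf PAGESIZE).
pages_to_gib() {
    awk -v pages="$1" 'BEGIN { printf "%.1f\n", pages * 4 / 1024 / 1024 }'
}

pages_to_gib 3941396   # rss column of the first julia row (resident memory)
pages_to_gib 3601973   # swapents column of the same row (swapped-out memory)
```

So julia was holding roughly 15 GiB of RAM plus a large amount of swap each time it was killed, which points to memory growing steadily over the run rather than to one oversized problem.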
Is there a way that I can clear the cache in between runs? It works fine right after a restart, but once it has been running for a while I need to restart again. I have tried:
clear!(:n)
where n is the object that holds all of the data. My feeling is that KNITRO and IPOPT need to be cleared or reset. Is there a way to do this? I guess that I could open up a new julia after each run…
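The last idea, a new julia per run, can be sketched as a small shell loop. When a process exits the operating system reclaims all of its memory, so a leak cannot accumulate across runs. The function name run_all and the script name run_one.jl are hypothetical; the script is assumed to read one parameter from its command-line arguments and solve a single problem.

```shell
# Hypothetical driver: run a command once per parameter, each in a fresh process.
# $1 is the command (split on spaces), the remaining arguments are the parameters.
run_all() {
    cmd=$1
    shift
    for param in "$@"; do
        # A failed or killed run is reported, and the loop moves on to the next one.
        $cmd "$param" || echo "run failed for parameter: $param" >&2
    done
}

# Demo with echo so the sketch runs anywhere; with Julia it might look like:
#   run_all "julia run_one.jl" 0.1 0.2 0.5
run_all "echo would run with" 0.1 0.2 0.5
```

The cost is that Julia's startup and compilation time is paid again for every run, which may or may not matter compared to the solve time.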