[ANN] ChoosyDataLoggers.jl: choosy practices for logging data in numerical experiments

mkschleg · January 23, 2023, 6:24pm

Hello! I think many people may find a use for the package ChoosyDataLoggers.jl, especially those running Reinforcement Learning or Machine Learning experiments.

The ChoosyDataLoggers package is used to log various groups and variables in a large code base. The data can be sent to an array sink, which populates a dictionary keyed by the group and variable being logged. This feature is simple but powerful, and revolves around the @data macro. When choosing what to log, you can also define pre-processors in your user code and select them dynamically at run time.
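To make the "dictionary keyed by group and variable" idea concrete, here is a minimal sketch built only on Julia's standard `Logging` library. It is not the ChoosyDataLoggers API: `DictSink` is a hypothetical name, and plain `@info` with the `_group` keyword stands in for the package's `@data` macro.

```julia
using Logging

# Hypothetical sink (not the package's logger): values end up in data[group][name].
struct DictSink <: AbstractLogger
    data::Dict{Symbol, Dict{Symbol, Vector{Any}}}
end
DictSink() = DictSink(Dict{Symbol, Dict{Symbol, Vector{Any}}}())

# Accept every message at Info level or above.
Logging.min_enabled_level(::DictSink) = Logging.Info
Logging.shouldlog(::DictSink, level, _module, group, id) = true
Logging.catch_exceptions(::DictSink) = false

# Each key=value pair attached to a log message is appended under its group.
function Logging.handle_message(sink::DictSink, level, message, _module, group, id,
                                file, line; kwargs...)
    vars = get!(sink.data, Symbol(group)) do
        Dict{Symbol, Vector{Any}}()
    end
    for (name, value) in kwargs
        push!(get!(vars, name, Any[]), value)
    end
    return nothing
end

sink = DictSink()
with_logger(sink) do
    for step in 1:3
        loss = 1.0 / step
        @info "data" _group=:agent loss=loss
        @info "data" _group=:env reward=step
    end
end
```

After the loop, `sink.data[:env][:reward]` holds the three logged rewards and `sink.data[:agent][:loss]` the three losses, so an experiment's results can be read back by group and variable name.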

Code bases often get quite complicated. If you want to figure out what you can log without going through the entire code base, ChoosyDataLoggers supports automatic registration of the uses of @data, which can be handy when working with an experiment a partner has written.

I now use this package quite frequently, especially in conjunction with Reproduce.jl.

PRs, issues, and comments welcome!

oschulz · January 23, 2023, 6:45pm

This looks very interesting - is ChoosyDataLoggers thread-safe? And does it work across (remote) processes/workers?

mkschleg · January 23, 2023, 7:47pm

Currently it isn’t explicitly thread-safe or multi-process safe. It can be used in an experiment framework which sends jobs to various self-contained processes (i.e. each of which constructs its own logger internally), but if you want to log across processes and threads you will likely have trouble (you can look at mkschleg/ActionRNNs.jl on GitHub for how I use it).

This would all come down to the ArrayLogger at the root of the data loggers. There is likely quite a bit of work involved in merging the data logged from other threads as well, but this would probably only change how the data logger itself works, and all the macro code should keep working.
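As a rough illustration of the kind of change this would mean at the sink level, here is a minimal sketch of a lock-guarded store. It is not the package's ArrayLogger: `LockedSink` and `record!` are hypothetical names, and a single lock around every write is only one possible design (per-thread buffers merged at the end would be another).

```julia
using Base.Threads

# Hypothetical store (not the package's ArrayLogger): one lock guards all writes.
struct LockedSink
    lk::ReentrantLock
    data::Dict{Symbol, Vector{Any}}
end
LockedSink() = LockedSink(ReentrantLock(), Dict{Symbol, Vector{Any}}())

# Append a value under `name`; the lock serializes concurrent pushes.
function record!(sink::LockedSink, name::Symbol, value)
    lock(sink.lk) do
        push!(get!(sink.data, name, Any[]), value)
    end
    return nothing
end

sink = LockedSink()
@threads for i in 1:100
    record!(sink, :step, i)
end
```

With the lock in place no write is lost, but the order of entries depends on thread scheduling, so anything that relies on step order would need to log the step index alongside the value.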

If you have a specific use case in mind I would be happy to take a look or look through a PR! Or if you have any ideas on how to do this from an interface point of view, that would be great (writing the actual code should be straightforward though).

oschulz · January 24, 2023, 9:25am

I’m not sure how to tackle this myself. The logging system does transport across distributed worker boundaries though, and sends everything to process 1, right?

mkschleg · February 2, 2023, 8:53pm

I’m pretty sure the logging system is contained within each process when doing distributed compute. For threads, I’m not exactly sure how it works.
