One of the issues with my setup for running R code in production is that it usually means a quick-and-dirty Rscript with no proper logging. Within my Python applications, logging is set up so that the logger emits info and warning messages about the state of the application. This makes debugging much easier and also helps me figure out whether certain scripts ran.
The issue with R is that I can't integrate my Python logging module into it (at least not easily). I have recently discovered packages like reticulate that let you call Python from within R and vice versa. The problem is that my logging is set up so that whenever I call log.info('info message'), the message is automatically grabbed and inserted into the database as a JSON object, along with various datetime metadata.
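A minimal sketch of that kind of handler, to show the shape of the setup: a custom logging.Handler serialises each record to JSON and does a parameterised insert. The table name log_messages, the column layout, and the use of sqlite3 are all assumptions for illustration; in production the connection would be a Postgres one (e.g. via psycopg2, with a jsonb column and %s placeholders), but sqlite3 keeps the sketch self-contained.

```python
import datetime
import json
import logging
import sqlite3  # stand-in for a Postgres connection, so the sketch runs anywhere


class DBJSONHandler(logging.Handler):
    """Logging handler that inserts each record into a table as a JSON object."""

    def __init__(self, conn):
        super().__init__()
        self.conn = conn

    def emit(self, record):
        # serialise the log record plus some metadata as JSON
        payload = json.dumps({
            "message": record.getMessage(),
            "level": record.levelname,
            "logger": record.name,
        })
        created = datetime.datetime.fromtimestamp(record.created).isoformat()
        # parameterised insert: the driver handles quoting and escaping
        self.conn.execute(
            "INSERT INTO log_messages (payload, created_at) VALUES (?, ?)",
            (payload, created),
        )
        self.conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_messages (payload TEXT, created_at TEXT)")

log = logging.getLogger("app")
log.addHandler(DBJSONHandler(conn))
log.setLevel(logging.INFO)

log.info("info message")
row = conn.execute("SELECT payload FROM log_messages").fetchone()
print(row[0])
```

After the log.info call, the table holds one row whose payload is the JSON-encoded message and metadata.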
Raw SQL inserting from R
So the first thing I tried was to write a raw SQL insert, putting a JSON object into a Postgres table. Unfortunately my attempt stalled when I found that the database package I was using didn't support parameterised queries with JSON objects. I could have built the query string by hand with a big paste() call, and it (probably) would have worked, but that didn't seem like much fun to implement.
Calling Rscript from Python
My other idea was to use subprocess in Python and call the Rscript. That way I could have everything run from within the Python application instead of having one-off Rscripts running. This was much easier to implement and only required some reading of the Python documentation for subprocess.
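A minimal sketch of that idea: run the external script with subprocess.run, capture its output, and pipe it through the existing logger. The script name analysis.R is hypothetical, and the Python interpreter stands in for Rscript here so the sketch is self-contained and runnable.

```python
import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("app")


def run_script(cmd):
    """Run an external script and route its stdout/stderr through the logger."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.stdout:
        log.info("stdout from %s: %s", cmd[0], result.stdout.strip())
    if result.stderr:
        log.warning("stderr from %s: %s", cmd[0], result.stderr.strip())
    return result.returncode


# In production this would be something like ["Rscript", "analysis.R"];
# sys.executable stands in so the example works without R installed.
code = run_script([sys.executable, "-c", "print('pretend R output')"])
print(code)
```

Anything the R script prints or warns about then lands in the same log stream (and hence the same database table) as the rest of the application.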
I really didn’t spend too much time exploring the different options, but it would be neat to have a logging setup that captures both R and Python application information.