Built 2024-11-01 using NMdata 0.1.8.
This vignette is still under development. Please make sure to see the latest version available.
NMdata comes with a configuration tool that can be used to tailor default behaviour to the user’s system configuration and preferences.
For most users, installing NMdata from the package archive you are already using is the preferred way:
install.packages("NMdata")
library(NMdata)
To get the development version of NMdata, do the following from within R:
In a production environment, use a GitHub release.
To install the NMdata X.Y.Z release, do (notice “@v”):
This example is not automatically updated to the latest available release. Please make sure to check for the latest version on https://github.com/nmautoverse/NMdata/releases
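A minimal sketch of both GitHub installs, assuming the remotes package is installed (remotes is not mentioned in the text above). The repository name is taken from the releases URL; the helper names release_spec and install_nmdata_github are hypothetical and only for illustration. The version is kept as a parameter, so substitute a tag that actually exists on the releases page.

```r
# Assumption: the 'remotes' package is available (install.packages("remotes")).
nmdata_repo <- "nmautoverse/NMdata"

# Build "owner/repo@vX.Y.Z"; the "v" prefix is the "@v" noted above.
release_spec <- function(version) paste0(nmdata_repo, "@v", version)

# Hypothetical convenience wrapper: no version means the development version.
install_nmdata_github <- function(version = NULL) {
  spec <- if (is.null(version)) nmdata_repo else release_spec(version)
  remotes::install_github(spec)
}

## development version
# install_nmdata_github()
## a specific release; replace X.Y.Z with a tag from the releases page
# install_nmdata_github("X.Y.Z")
```

The install calls are left commented out so that sourcing the snippet does not change the library by accident.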
If you need to archive a script that is dependent on a feature or bug fix which is only in the master branch, you can request a release.
NMdata depends on data.table only, and data.table does not have any dependencies. R 3.0 or later is required, that’s all (yes, you can run NMdata on pretty old R installations without any difference).
Other tools exist for reading data from Nonmem. However, they most often have requirements for how the dataset or the Nonmem control stream is written. NMdata aims at assuming as little as possible about how you work while still doing (and checking) as much as possible for you.
Tools in NMdata do not assume you use other tools in NMdata. If you like just one function in NMdata and want to integrate that in your workflow, that is perfectly possible. NMdata aims at being able to integrate with anything else and still do the job.
While many other available tools provide plotting functions, NMdata focuses on getting the data ready for a smooth Nonmem experience (by checking as much as possible first), getting the resulting data out of Nonmem, and a little processing.
If you are building applications that process data from Nonmem, you may find very valuable tools in NMdata. I have not seen as generic and flexible a Nonmem data reader elsewhere.
The data creation tools in NMdata may still be interesting. If another tool is working well for a project, you may not have any reason to use NMscanData (the Nonmem data reader in NMdata) for it. However, a lot of the time those tools have certain requirements for how datasets are constructed and especially how data is exported from Nonmem ($TABLE). If a Nonmem run does not live up to those requirements, and you want to look at the run results, NMscanData will let you do that without you having to “correct” your $TABLE statements and wait for Nonmem to run again.
Also, in other cases NMscanData can save you a lot of time and frustration even if you have your own routines for these tasks. It will allow you to easily read your old Nonmem runs that were done a little differently, or to read someone else’s work. A meta-analysis of a large number of models implemented by different people over the years? That’s a candy store for NMscanData. Here, NMdata can save you many hours of frustration.
NMscanData does all that based only on the Nonmem output list file
NMscanData reads the names of the input and output tables from the list file, the path to the input data from the control stream (which it knows where to find, or you can change the method it uses to find it), and then it checks several properties of these files and their contents before combining all the info. It is quite a lot of checking and bookkeeping, but the key steps are simple.
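The two lookups described above can be illustrated with a deliberately simplified sketch in base R. This is not how NMscanData is implemented: it skips all the checks and reads both pieces of information from one made-up text. The control stream lines, file names, and variable names below are invented for the example.

```r
# A made-up control stream, one element per line
ctl <- c(
  "$PROBLEM demo",
  "$DATA ../data/pk.csv IGNORE=@",
  "$INPUT ID TIME DV",
  "$TABLE ID TIME PRED NOPRINT FILE=run001_res.txt",
  "$TABLE ID CL V NOPRINT FILE=run001_par.txt"
)

# Path to the input data: the first word after $DATA (or $INFILE)
data_line <- grep("^\\$(DATA|INFILE)", ctl, value = TRUE, perl = TRUE)
data_args <- sub("^\\$(DATA|INFILE)\\s+", "", data_line, perl = TRUE)
input_path <- strsplit(data_args, "\\s+", perl = TRUE)[[1]][1]

# Output table files: the FILE= option of each $TABLE statement
table_lines <- grep("^\\$TABLE", ctl, value = TRUE, perl = TRUE)
table_files <- sub(".*FILE=(\\S+).*", "\\1", table_lines, perl = TRUE)

input_path   # "../data/pk.csv"
table_files  # "run001_res.txt" "run001_par.txt"
```

A real control stream can spread statements over several lines, use abbreviations, or contain comments, which is part of the bookkeeping that this sketch leaves out.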
NMdata is generally fast, all thanks to the incredible data.table package. If you don’t use data.table already, NMdata may even seem extremely fast. If you are noticing some circumstances where NMdata seems slow, I am very curious to hear about it.
NMdata definitely modifies variables by reference internally, but this will not affect your workspace (the only exception is NMstamp()). It might improve speed to go all the way and modify by reference in the user workspace, but using NMdata must be easy for R users at all levels of experience. If you don’t understand what this is all about, you’re fine.
Every function in NMdata has an argument called as.fun. You can provide a function in this argument which will automatically be applied to results.
You can get a tibble if you work with dplyr/tidyverse tools:
NMscanData(..., as.fun = tibble::as_tibble)
Under the hood, NMdata is implemented in data.table, and in fact the default is to convert to data.frame before returning the output. So if you want to have data.tables back, use as.fun="data.table" (notice: a character string, not a function) to avoid the conversion:
NMscanData(..., as.fun = "data.table")
Using as.fun=as.data.table (a function this time) would work but is not recommended, because that would do an unnecessary copy of data.
If you want to change the behaviour generally for the session and omit the as.fun argument, you can do so with this option:
## for tibbles
NMdataConf(as.fun = tibble::as_tibble)
## for data.table
NMdataConf(as.fun = "data.table")
All NMdata functions (that return data) should now return tibbles or data.tables. If you use NMdataConf to control this, for reproducibility please do so explicitly in your script and not in your .Rprofile or similar.
Absolutely. data.tables and tibbles are data.frames, and data.frame in the documentation refers to any of these structures (that technically inherit from the data.frame class).
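This can be checked directly with is.data.frame(). A small sketch: the example data is made up, and the data.table and tibble parts only run if those packages happen to be installed.

```r
# A plain data.frame with made-up values
df <- data.frame(ID = c(1, 1, 2), DV = c(0.5, 1.2, 0.8))
is.data.frame(df)

# data.table: the class vector includes "data.frame"
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::as.data.table(df)
  print(class(dt))
  print(is.data.frame(dt))
}

# tibble: the class vector includes "data.frame" as well
if (requireNamespace("tibble", quietly = TRUE)) {
  tb <- tibble::as_tibble(df)
  print(class(tb))
  print(is.data.frame(tb))
}
```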
Please open an issue on github.
No. See the question above about dependencies. If NMdata were extended with plotting features, it would become dependent on a number of packages, and backward compatibility would be very difficult to ensure.
The only way to provide plotting features for output from NMdata functions would be to launch a separate package. I rarely need much code to get the plots I want based on the format returned by NMscanData and maybe a call to findCovs or findVars.
NMscanData needs the path to the output control stream (technically, the input control stream will work too, but this is not recommended). It has no requirements for the naming of files.
NMdata by default assumes a PSN setup. The reason this makes a difference is a little technical but briefly, the path to the input data is not available in the list file when using PSN. If you don’t use PSN, all information will probably be available in the output control stream. Enough talking, just do this:
NMdataConf(file.mod = identity)
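To see what this setting changes: as the call above suggests, file.mod takes a function that maps the path of the list file to the path where the input control stream is looked up. The sketch below contrasts identity with a function that swaps the extension for .mod. That the PSN-style default behaves like the second function is an assumption here, and lst_to_mod and the file path are made-up names.

```r
lst <- "models/run001.lst"   # hypothetical list file path

# Assumed PSN-style lookup: same file name, extension .mod
lst_to_mod <- function(file) sub("\\.lst$", ".mod", file)
lst_to_mod(lst)   # "models/run001.mod"

# identity() returns its argument unchanged, so the list file itself is used
identity(lst)     # "models/run001.lst"
```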
NMscanData takes the name of the model from the file name and adds it to a column in the returned data. If the name of the directory is the model name, you can do the following as well:
NMdataConf(modelname = function(file) basename(dirname(normalizePath(file))))
Feel free to modify the function to omit parts of the string or add to it.
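As an illustration of such modifications, the function can be tried on a throwaway file in base R before handing it to the configuration. The directory layout, the file name, and the names name_from_dir and name_short are made up for this sketch.

```r
# Hypothetical layout: one directory per model, e.g. <tempdir>/run101/model.lst
model_dir <- file.path(tempdir(), "run101")
dir.create(model_dir, showWarnings = FALSE)
lst_file <- file.path(model_dir, "model.lst")
invisible(file.create(lst_file))

# The function from above: use the directory name as the model name
name_from_dir <- function(file) basename(dirname(normalizePath(file)))
name_from_dir(lst_file)   # "run101"

# A variant that drops a leading "run" from the directory name
name_short <- function(file) sub("^run", "", name_from_dir(file))
name_short(lst_file)      # "101"
```

Either function could then be passed as the modelname argument in the configuration call above.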
$DATA or $INFILE.
$TABLE statements.
Notice especially, “output data” does not refer to any of the files automatically written by Nonmem such as .phi, .ext, .cov etc.