Variables/Parameters/Observations terminology & API Cleanup #199

@alyst

As part of #193 I have already made some of these changes, so I would like feedback from the maintainers on them.
There are also a few other changes in the same direction that I could integrate into #193, so I mention them here too.

  • Parameters. Sometimes they are called parameters, sometimes identifiers (in the ParTable).
    I propose to standardize on param (intuitive, yet short):
    • param in the ParTable
    • params() to get the vector of parameters
    • nparams() to get the number of parameters (called n_par() now)
  • Variables. Sometimes called vars, sometimes colnames, sometimes nodes.
    Observed variables are sometimes called observed, sometimes manifested.
    I propose to consolidate these into vars (short, but intuitive), which can be either observed (more intuitive than manifested) or latent:
    • vars() to get the vector of variables from ParTable, RAMMatrices (matching the order of A columns)
    • nvars() to get the number of variables
    • observed_vars() to get the observed variables matching the order of rows/cols in obs_cov and rows of RAMMatrices.F
      Alternatively, it could be obs_vars(), which would match obs_cov() and obs_mean() (if observed_vars is chosen, then obs_cov also needs to be renamed to observed_cov for consistency).
    • nobserved_vars() to get the number of observed vars (replaces n_man, which in this short form is a little bit confusing).
    • latent_var_indices()/observed_var_indices() to get the indices of vars() that match the latent/observed variables
      (i-th index of observed_var_indices() is for the i-th variable of observed_vars())
    • latent_vars() is a shortcut to vars()[latent_var_indices()]
    • Also, for missing data, I propose the terms measured/missing (the code currently uses observed/missing, but observed clashes with observed/latent), with nmeasured_vars()/nmissing_vars() to get their counts
  • Observations. Also referred to as rows. To disambiguate from observed_vars, I propose to refer to them as samples (row is confusing because SEM operates on so many matrices).
    • samples to access the individual samples (sometimes referred to as rows or rowwise).
    • nsamples() to get the number of samples (n_obs() now)
  • Relations (between the variables, i.e. <- or <->). Currently the ParTable has them in the param_type column, which is confusing, because sometimes it is constant.
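
To make the proposed naming concrete, here is a minimal sketch of how the accessors could fit together. It assumes a hypothetical ToySpec container invented purely for illustration; it is not the package's ParTable or RAMMatrices, and the field names and the toy model are assumptions.

```julia
# Hypothetical toy container for illustration only (not the package's actual types).
struct ToySpec
    vars::Vector{Symbol}      # all variables, in the order of the columns of A
    observed::Vector{Symbol}  # observed variables, in the order of obs_cov rows/cols
    params::Vector{Symbol}    # parameter labels
    data::Matrix{Float64}     # one row per sample, one column per observed variable
end

# Parameters
params(s::ToySpec) = s.params
nparams(s::ToySpec) = length(params(s))

# Variables
vars(s::ToySpec) = s.vars
nvars(s::ToySpec) = length(vars(s))
observed_vars(s::ToySpec) = s.observed
nobserved_vars(s::ToySpec) = length(observed_vars(s))

# The i-th index is the position of the i-th observed variable within vars().
observed_var_indices(s::ToySpec) =
    [findfirst(==(v), vars(s)) for v in observed_vars(s)]
# Latent variables are all variables that are not observed.
latent_var_indices(s::ToySpec) =
    [i for (i, v) in enumerate(vars(s)) if !(v in observed_vars(s))]
latent_vars(s::ToySpec) = vars(s)[latent_var_indices(s)]

# Samples (rows of the data matrix)
samples(s::ToySpec) = eachrow(s.data)
nsamples(s::ToySpec) = size(s.data, 1)

# One latent factor measured by three indicators, five samples of dummy data.
spec = ToySpec([:f1, :x1, :x2, :x3], [:x1, :x2, :x3],
               [:lambda2, :lambda3, :theta1], zeros(5, 3))
```

With this layout, vars(spec)[observed_var_indices(spec)] reproduces observed_vars(spec), which is the ordering guarantee described for the index accessors above.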
