-
-
Notifications
You must be signed in to change notification settings - Fork 24
Weighted rvars #331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Weighted rvars #331
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #331 +/- ##
==========================================
+ Coverage 95.31% 95.80% +0.49%
==========================================
Files 50 51 +1
Lines 3840 3979 +139
==========================================
+ Hits 3660 3812 +152
+ Misses 180 167 -13 ☔ View full report in Codecov by Sentry. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Currently, everything else than
Pinging @n-kall , too |
For reference: Equations 6 (MCSE) and 7 (ESS) in preprint v6
Any thoughts on how the two sets of diagnostics should be presented in summarise_draws? |
|
I'm currently working on updating the |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
updates to weighted diagnostics
Weighted diagnostics
|
Are there any remaining blockers for this PR @mjskay @paul-buerkner ? It would be really nice to get this merged and start on changes that would depend on this. Happy to help out if there is something specific |
|
Would you have time to resolve the merge conflicts? PR is good from my side, I think. @mjskay can you confirm we can merge your PR? |
|
Merge conflicts should be manageable, I'll take a look |
merge latest changes to rvar_weights branch
fix merge conflicts for rvar_weights
Summary
This PR aims to address (at least part of) #184 by implementing weighted
rvars.Currently,
rvars cannot contain weights, and weighting of them can only be done by putting them in adraws_rvarsobject that itself contains a".log_weight"rvarcontaining the weights. This leads to counterintuitive behavior, like the default output of thervar(showing mean and sd) using unweighted versions of those statistics.This PR addresses that issue in the following ways:
rvarweights as a"log_weights"attribute directly on thervar, just like the"nchains"attribute is used to store chain count.draws_rvarsno longer use a".log_weight"variable to store weights, instead storing them directly on eachrvarthey contain, and requiring allrvars they contain to have the same weights (the same way it handles"nchains").log_weights()function fordrawsandrvars is added, which is a lower-level version ofweights(x, log = TRUE, normalize = FALSE)that just returns the log weights stored in the object without transformation. I initially did not have this, but found it greatly eased programming with weights.weight_draws(x, NULL)is now allowed as the canonical way to remove weights from adrawsobject, sinceremove_variables(x, ".log_weight")does not work ondraws_rvarsobjects anymore.rvars have been updated to incorporate weights (with a couple of exceptions I haven't gotten to yet, see TODOs and Questions below).rvarinternals are becoming (even more) complicated, I have added an "rvarInternals" section to?rvarthat hopefully will help in case others need to touch the code ;).Demo
You can't combine two rvars with different weights:
The check for equality of weights is done on the internal weights using
identical(), which should be fast, especially in cases where the two weight vectors are actually pointers to the same vector in memory (in which case the comparison is constant time). This does mean the weights vectors must be exactly the same (no tolerance for floating point error), but I suspect in most cases when weighting happens the exact same weight vector is being applied to many objects. In any case, if someone did encounter this issue they could simply assign the log weights from object to the other.If one rvar is weighted and another is not, the weights of the weighted rvar are inherited, which I believe covers the use case of (weighted draws from some model) + (unweighted draws, e.g. used to simulate predictions):
If you install the dev version of {ggdist}:
It supports weighted rvars in all functions (densities, CDFs, quantiles, all interval types and all point summaries):
Without weights:
With weights:
Weights should work basically everywhere:
TODOs and Questions
TODOs:
density(<rvar>),cdf(<rvar>), andquantile(<rvar>)/quantile2(<rvar>). The first two are straightforward. For weighted quantiles, I have an implementation in {ggdist} that I can port over, but I may want to update it first; some thoughts on weighted quantiles are here and feedback is welcome. (In fact, since writing that document my thinking has changed a bit---I originally thought the way I suggested implementing weighted quantiles in that document is an improvement on ggdist's current implementation, but after further investigation I might be leaning back towards how I did it in ggdist originally...).vignette("rvar")Questions:
R/convergence.Rshould be modified for weighted rvars. @avehtari?Would love for folks to kick the tires. I think once this is in we could also start thinking about what a successor to
summarise_draws()might look like that supports weights (and solves the various other open issues onsummarise_draws()).Copyright and Licensing
By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the following licenses: