Skip to content

Commit 0723cfc

Browse files
committed
Finisheed first draft of fBiSSE tutorial. Will leave fHiSSE code and .md there to be completed in a later date.
1 parent 02597d1 commit 0723cfc

15 files changed

+106
-22779
lines changed

tutorials/sse/bisse

Whitespace-only changes.

tutorials/sse/fhisse.md

Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
Saving the beginnings of a serially-sampled HiSSE tutorial...
2+
3+
{% section Setting up a serially-sampled HiSSE model | sec_fHiSSE %}
4+
5+
The hidden SSE (HiSSE) model {% cite Beaulieu2016 %} was introduced to explicitly accommodate the role of
6+
unobserved effects on diversification rates.
7+
This alleviated some of the more complicated issues in BiSSE, as it was shown to falsely identify trait-associated
8+
hetereogeneity when the source of heterogeneity was not included in the model {% cite Rabosky2015 %}.
9+
See [the HiSSE tutorial]({{ base.url }}/tutorials/sse/hisse) for more information on the model.
10+
11+
We only need to make minor modifications to our BiSSE script to make it into a HiSSE model.
12+
The easiest way to proceed would therefore be to make a copy of your BiSSE script, renaming it as appropriate,
13+
and make the changes we mention below.
14+
If a part of the script is not mentioned, assume it remains unchanged (e.g. the lines to read `tree` and `data`
15+
remain the same, since we are using the same fixed tree and same observed characters).
16+
17+
First, in addition to our `num_states` variables as before, we must create new variables to
18+
hold the number of hidden states, and the total number of states.
19+
20+
{{ hisse_script | snippet:"block#", "4-5" }}
21+
22+
We are using only two hidden states, which we call A and B.
23+
While it is possible to include more hidden states, since these are unobserved states for which we have no data,
24+
one should proceed with caution including more parameters.
25+
26+
Then, we must expand our `data` object to include the hidden states
27+
28+
{{ hisse_script | snippet:"block#", "8" }}
29+
30+
Now the character in `data_exp` has four states.
31+
By default, RevBayes sets up state numbers to follow the order of observed, then hidden.
32+
So here, states 1 and 2 correspond to 0A and 1A, and states 3 and 4 to 0B and 1B.
33+
34+
To set up our speciation, extinction, and fossil sampling rates, we follow the exact same procedure as in the
35+
BiSSE case, but with `num_rates` rates instead.
36+
37+
{{ hisse_script | snippet:"block#", "9-13" }}
38+
39+
Note that this sets each rate to be independent.
40+
While that is the most flexible set up, some analyses might run into "label-swapping" issues.
41+
Label-swapping occurs because the likelihood of the model does not change if the states A and B are switched,
42+
since there is no data associated with those states.
43+
As such, the assignment of each state might switch throughout the MCMC, making it so that each rate parameter's
44+
posterior distribution is actually a mixture between the corresponding rates.
45+
If you encounter such problems in your analyses, see [the HiSSE tutorial]({{ base.url }}/tutorials/sse/hisse) for
46+
a way to set rates that guarantees there will be no label-swapping (sacrificing a small amount of flexibility in
47+
the process).
48+
49+
To set up the rate matrix, we can create variables for `q_obs` and `q_hidden`, corresponding to $q_{01}$ and \
50+
$q_{10}$, and $q_{AB}$ and $q_{BA}$ respectively.
51+
We can then use `fnHiddenStateRateMatrix` to build the full rate matrix.
52+
53+
{{ hisse_script | snippet:"block#", "14-18" }}
54+
55+
Note that this set up assumes that transitions between rate categories with the same observed state occur at the
56+
same rate (i.e. $q_{0A->0B} = q_{1A->1B} = q_{AB}$), and similarly for categories with the same hidden state.
57+
While this is often a useful assumption (and one that was
58+
684 KB
Loading
152 KB
Loading

tutorials/sse/fossils.md

Lines changed: 48 additions & 56 deletions
Original file line numberDiff line numberDiff line change
@@ -16,13 +16,11 @@ include_files:
1616
- data/canidae_tree.nex
1717
- scripts/mcmc_fBiSSE.Rev
1818
- scripts/mcmc_fBiSSE_tp.Rev
19-
- scripts/mcmc_fHiSSE.Rev
2019
index: true
2120
---
2221

2322
{% assign bisse_script = "mcmc_fBiSSE.Rev" %}
2423
{% assign bisse_tp_script = "mcmc_fBiSSE_tp.Rev" %}
25-
{% assign hisse_script = "mcmc_fHiSSE.Rev" %}
2624

2725
{% section Introduction | introduction %}
2826

@@ -273,13 +271,6 @@ posteriori (MAP) state and the posterior probabilities for each state at a given
273271

274272
{{ bisse_script | snippet:"block#", "32" }}
275273

276-
To visualize the ancestral state estimation using RevGadgets {% cite Tribble2022 %}, see
277-
[the BiSSE tutorial]({{ base.url }}/tutorials/sse/bisse).
278-
279-
Before we analyze our BiSSE results, we will take some time to set up a HiSSE analysis of the same dataset.
280-
This will allow us to consider whether any trait-dependent extinction signal we detect is truly due to the trait,
281-
or due to unobserved effects.
282-
283274
{% aside Using TensorPhylo %}
284275

285276
TensorPhylo {% cite May2022 %} is a plugin that introduced an alternative general SSE function, the generalized
@@ -312,58 +303,59 @@ to change the output, checkpointing, and ancestral state filenames if you run bo
312303

313304
{% endaside %}
314305

315-
{% section Setting up a serially-sampled HiSSE model | sec_fHiSSE %}
316-
317-
The hidden SSE (HiSSE) model {% cite Beaulieu2016 %} was introduced to explicitly accommodate the role of
318-
unobserved effects on diversification rates.
319-
This alleviated some of the more complicated issues in BiSSE, as it was shown to falsely identify trait-associated
320-
hetereogeneity when the source of heterogeneity was not included in the model {% cite Rabosky2015 %}.
321-
See [the HiSSE tutorial]({{ base.url }}/tutorials/sse/hisse) for more information on the model.
322-
323-
We only need to make minor modifications to our BiSSE script to make it into a HiSSE model.
324-
The easiest way to proceed would therefore be to make a copy of your BiSSE script, renaming it as appropriate,
325-
and make the changes we mention below.
326-
If a part of the script is not mentioned, assume it remains unchanged (e.g. the lines to read `tree` and `data`
327-
remain the same, since we are using the same fixed tree and same observed characters).
328-
329-
First, in addition to our `num_states` variables as before, we must create new variables to
330-
hold the number of hidden states, and the total number of states.
306+
{% section Summarizing and interpreting results | secanalysis %}
331307

332-
{{ hisse_script | snippet:"block#", "4-5" }}
308+
To visualize our rate and ancestral state estimates, we will make use of RevGadgets {% cite Tribble2022 %}, similar
309+
to the [BiSSE tutorial]({{ base.url }}/tutorials/sse/bisse). We will not copy that code here to avoid redundancy,
310+
since the procedure is the same.
333311

334-
We are using only two hidden states, which we call A and B.
335-
While it is possible to include more hidden states, since these are unobserved states for which we have no data,
336-
one should proceed with caution including more parameters.
312+
First, let us take a look at our ancestral state results
337313

338-
Then, we must expand our `data` object to include the hidden states
314+
{% figure fBiSSE_anc_states %}
315+
<img src="figures/fBiSSE_anc_states_diet.png" width="75%">
316+
{% figcaption %}
317+
A visualization of the ancestral states estimated under the BiSSE model. Code for this plot can be found in [the BiSSE tutorial]({{ base.url }}/tutorials/sse/bisse).
318+
{% endfigcaption %}
319+
{% endfigure %}
339320

340-
{{ hisse_script | snippet:"block#", "8" }}
321+
These ancestral states largely agree with the ancestral state estimation in {% cite Slater2015 %}, which is a good
322+
sign. The clusters of hypercarnivory are sensible considering the tree in question, and there are not a lot of
323+
confounding nodes.
341324

342-
Now the character in `data_exp` has four states.
343-
By default, RevBayes sets up state numbers to follow the order of observed, then hidden.
344-
So here, states 1 and 2 correspond to 0A and 1A, and states 3 and 4 to 0B and 1B.
325+
Note that there are reasonably few transitions to and from hypercarnivory (state 0), which should spell caution
326+
when interpreting the results of the model. This illustrates an important point about the signal for state-dependent
327+
diversification: the sample size for these analyses is not the number of species in a tree, but the number of state
328+
transitions present in that group's history {% cite Maddison2006 %}. This should be considered, together with our
329+
remarks on sampled ancestors in the introduction, by any researcher hoping to implement a serially-sampled SSE
330+
analysis for their data. We reiterate that this tutorial should be seen as illustrative only, and the conclusions
331+
drawn herein about the history of diet (or the effect of diet on diversification) in Canidae should not be
345332

346-
To set up our speciation, extinction, and fossil sampling rates, we follow the exact same procedure as in the
347-
BiSSE case, but with `num_rates` rates instead.
333+
All that said, let us take a look at our rate estimates. Note that given the naming convention in our tutorial,
334+
and our desire to also plot fossil sampling rate `psi`, the call to `processSSE` should be something like
348335

349-
{{ hisse_script | snippet:"block#", "9-13" }}
350-
351-
Note that this sets each rate to be independent.
352-
While that is the most flexible set up, some analyses might run into "label-swapping" issues.
353-
Label-swapping occurs because the likelihood of the model does not change if the states A and B are switched,
354-
since there is no data associated with those states.
355-
As such, the assignment of each state might switch throughout the MCMC, making it so that each rate parameter's
356-
posterior distribution is actually a mixture between the corresponding rates.
357-
If you encounter such problems in your analyses, see [the HiSSE tutorial]({{ base.url }}/tutorials/sse/hisse) for
358-
a way to set rates that guarantees there will be no label-swapping (sacrificing a small amount of flexibility in
359-
the process).
360-
361-
To set up the rate matrix, we can create variables for `q_obs` and `q_hidden`, corresponding to $q_{01}$ and \
362-
$q_{10}$, and $q_{AB}$ and $q_{BA}$ respectively.
363-
We can then use `fnHiddenStateRateMatrix` to build the full rate matrix.
364-
365-
{{ hisse_script | snippet:"block#", "14-18" }}
336+
```
337+
pdata <- processSSE(bisse_file,
338+
speciation = "lambda",
339+
extinction = "mu",
340+
  rates = c("lambda", "mu", "diversification", "psi"))
341+
```
366342

367-
Note that this set up assumes that transitions between rate categories with the same observed state occur at the
368-
same rate (i.e. $q_{0A->0B} = q_{1A->1B} = q_{AB}$), and similarly for categories with the same hidden state.
369-
While this is often a useful assumption (and one that was
343+
Using the `plotMuSSE` function, we then get our posterior density plots.
344+
345+
{% figure fBiSSE_rates %}
346+
<img src="figures/fBiSSE_div_rates_diet.png" width="75%">
347+
{% figcaption %}
348+
Visualizing posterior samples of diversification and sampling rates associated with diet in Canidae. Code for this plot can be found in [the BiSSE tutorial]({{ base.url }}/tutorials/sse/bisse).
349+
{% endfigcaption %}
350+
{% endfigure %}
351+
352+
We do not recover a strong signal of trait-dependent diversification, but the diversification rate for
353+
non-hypercarnivorous canids (state $1$) seems to be slightly higher (posterior probability of
354+
$\lambda_1 - \mu_1 > \lambda_2 - \mu_2$ is around 0.65). So there is no support one way or the other for
355+
the hypothesis that hypercarnivorous canids have higher extinction and speciation rates. A more complete
356+
dataset is likely to provide better grounds for testing this and other complex patterns.
357+
358+
In this tutorial, you learned how to set up, run, and visualize the results of a serially-sampled BiSSE
359+
analysis. While we only explored the BiSSE model, you can make use of the many SSE functions in RevBayes
360+
to apply any SSE model to your dataset by modifying the code in the appropriate tutorial to reflect the
361+
addition of fossil sampling rates.

0 commit comments

Comments
 (0)