Compute Branchwise Rate Maps Across Runs

Summarize fitted regime-specific rates across a list of stochastic-map-aware model fits or completed bifrost_search results. By default, rateMap() follows bifrost's branch-level framing: the first retained tree is used as the plotting scaffold, all retained trees must match in topology and branch lengths, each branch receives one summarized log-rate, and runs are averaged with equal weight.

Usage

rateMap(
  fits,
  weights = c("equal", "ic"),
  uncertainty = FALSE,
  summary = c("branch", "interval"),
  log = TRUE,
  value_summary = c("mean", "median"),
  target_tree = NULL,
  workers = NULL,
  progress = TRUE,
  control = rateMapControl()
)

Arguments

fits: A completed run, fitted model object, or non-empty list of these objects. Supported shapes are bifrost_search objects, mvgls objects, list(model = <mvgls>), and scratch-style lists with variables$tree plus param. Single supported objects are wrapped automatically as a convenience for one-fit plotting and inspection.
weights: Fit-level weighting mode. "equal" gives every retained fit equal weight. "ic" computes standard IC weights from optimal_ic and requires all retained fits to have the same IC_used. A numeric vector is also accepted and treated as custom fit weights.
uncertainty: Logical; if TRUE, compute and return across-fit uncertainty summaries for every summary row: whole branches when summary = "branch" and depth-grid intervals when summary = "interval".
summary: Character; "branch" computes one length-weighted value per edge, while "interval" slices branches on a global depth grid.
log: Logical; if TRUE (the default), transform extracted rate parameters with log() before branch, interval, and across-fit averaging. Weighted means are then mean log-rates, equivalent to the log of weighted geometric mean rates. Set log = FALSE to summarize rates in their original units.
value_summary: Character; central estimate stored in the summary table column intervals$value. "mean" uses the weighted mean. "median" uses the weighted median. When log = TRUE, both summaries are computed on the log-rate scale.
target_tree: Optional explicit target tree used as the summary and plotting scaffold. This may be any target or summary tree with the same topology and tip labels as the inputs. It does not need to contain stochastic maps because rateMap() replaces maps with display maps in the returned object. Branch lengths in target_tree define the geometry of the returned and plotted tree.
workers: Optional number of future workers to use. If NULL, the current future::plan() is used as-is.
progress: Logical; if TRUE, display a text progress bar via progressr::with_progress().
control: A "rateMap_control" object from rateMapControl(), or a named list of advanced control options.

Value

An object of class "rateMap" with components:

tree: A SIMMAP-style tree whose mapped segments encode color-bin indices in continuous mode, or named rate categories in category mode.
cols: The resolved color palette. In continuous mode this has length ncolors; in category mode it has one color per exact value or category bin.
lims: Numeric length-2 vector giving the plotted value range.
breaks: Numeric vector of palette bin boundaries in continuous mode, or category boundaries/values in category mode.
values: List of plotted-row central values by edge before color binning.
intervals: Plotted summary table. With summary = "branch", this has one row per branch. With summary = "interval", this has one row per plotted depth-grid interval. When branch-level rate diagnostics are enabled, this table also includes rate_for_flagging, rate_flag, rate_flag_source, is_near_zero, and is_high_outlier.
rate_categories: Data frame describing discrete rate categories when color_mode = "category"; otherwise NULL. With summary = "branch", this table also includes bin-level summaries of the plotted branch values, including n_branches, value_mean, value_median, value_min, value_max, value_sd, and total_branch_length.
run_values: When uncertainty = TRUE, list of numeric matrices containing run-level values for each edge. Matrix rows match the plotted rows for that edge and columns are retained fits. Otherwise NULL.
clade_key: Character descendant-tip key for each target-tree edge.
edge_matches: Integer matrix mapping target-tree edge rows to matched source-tree edge rows for each retained fit.
summary: The summary mode used, "interval" or "branch".
uncertainty: Logical indicating whether uncertainty summaries were computed.
value_summary: Central estimate used for the plotted summary table column intervals$value.
quantile_probs: Quantile probabilities used for uncertainty summaries.
highest_density_interval_prob: Highest-density interval mass used for uncertainty summaries.
rate_diagnostics: List summarizing rate-flag settings, counts, detected tail cutoffs, and fold-rate ranges with and without flagged branches.
rate_flags: The normalized "rateMap_rate_flags" control object used for rate diagnostics.
rate_flag_source: Character name of the rate-valued column used to compute or preserve rate_flag metadata, or NA when diagnostics are disabled.
plot_value: Current interval column mapped to branch colors.
target: Target-tree selection mode used.
check: Tree compatibility check mode used.
weights: Normalized fit weights used for aggregation.
weight_mode: Weighting mode used: "equal", "ic", or "custom".
weight_table: Data frame linking retained input indices, weights, and IC values when available.
palette: Original palette specification.
reverse_palette: Logical indicating whether the palette was reversed.
ncolors: Stored continuous-ramp resolution used when recoloring with color_mode = "continuous" and no explicit ncolors.
color_mode: Coloring mode used for the current tree.
n_categories: Category count target used when color_mode = "category".
category_breaks: Category breaks or exact category values used when color_mode = "category".
category_labels: Category labels used when color_mode = "category".
category_bin_method: Automatic category-binning method used when color_mode = "category" and category_breaks = NULL.
title: Legend title used for plotting.
n_fits: Number of fits used after validation or omission.
omitted: Integer indices of omitted fits when na_action = "omit".

Details

Everyday workflow. rateMap() is a compute function: call it to build a reusable "rateMap" object, then call plot(x, ...) to choose what to display. Common compute controls are fits, weights, uncertainty, summary, log, and value_summary. Common display controls belong in plot() or rateMapView(), including value, type, palette, color_mode, n_categories, show_tip_labels, and legend sizing. Controls such as target, check, res, tree_fun, param_fun, and the future_* arguments live in rateMapControl() because they are mainly advanced tools for same-topology tree samples, interval summaries, custom fit objects, and larger run sets.

Algorithmic provenance. rateMap() is inspired by phytools::densityMap(), which summarizes a set of stochastic maps by slicing mapped branches on a shared depth grid and coloring a SIMMAP-style tree by an averaged branchwise quantity. Here the averaged quantity is not a posterior probability of a mapped state; instead, each mapped state is translated through the corresponding fitted regime-rate parameter from each run. The plotting interface likewise follows the phytools density-map family by returning a colored SIMMAP tree and drawing it with phytools::plotSimmap() plus either a segmented rate-category color bar or a continuous color-bar legend.

Although rateMap() is designed for summarizing multiple fitted maps, it also accepts a single completed bifrost search or supported fit. Single-fit input is a convenience for inspecting one fitted model before scaling up to multi-run summaries.

rateMap() can also summarize same-topology posterior or sensitivity samples where branch lengths differ. In that case, usually supply an explicit target_tree and use control = list(check = "topology"). The target can be any tree with the same topology and tip labels as the inputs; an MCC tree is only one possible choice. The target tree supplies the plotted topology and branch lengths, while rates are matched from each input tree by descendant-tip clade keys rather than by edge order.

Summary modes. With the default summary = "branch" and log = TRUE, each edge receives one length-weighted average log-rate from each run before the across-run summary is computed. This is the natural default for bifrost searches because shifts are placed at nodes, so a bifrost branch is not expected to change regimes internally. With summary = "interval", each target-tree branch is subdivided by the global depth grid controlled by control = list(res = ...). If source branch lengths differ from the target branch length, source stochastic-map segments are projected onto target intervals by relative position along the matched branch. Interval mode is useful for general stochastic maps that can genuinely change state along a branch.

Log-rate averaging. When log = TRUE, rate parameters are transformed before branch-level, interval-level, and across-fit summaries are computed. With weights w_i, the default plotted mean is therefore sum(w_i * log(rate_i)), which is the log of a weighted geometric mean. It is not log(sum(w_i * rate_i)), the log of a weighted arithmetic mean. Use log = FALSE when downstream interpretation requires arithmetic summaries on the original rate scale. Raw fitted rate parameters must be strictly positive in either mode. Negative plotted values are therefore valid in log-rate maps whenever the positive raw fitted rates are less than one.

Tree checks and targets. In rateMapControl(), check = TRUE is equivalent to check = "full" and requires topology and branch lengths to match the target tree. Use check = "topology" when all inputs have the same topology and tip labels but may have different branch lengths. check = FALSE or check = "none" skips the upfront ape::all.equal.phylo() check, but every target branch must still be recoverable in every input tree by descendant-tip set. This function is not a mixed-topology posterior summarizer; clades absent from an input tree are treated as an error rather than being marginalized over topology.

When target_tree is supplied, it is used directly as the plotting scaffold and need not contain SIMMAP maps. When target_tree = NULL, rateMapControl(target = "first") uses the first retained input tree, and rateMapControl(target = "mcc") chooses the retained input tree with the highest sum of log clade credibilities. For truly same-topology inputs, the MCC score is usually tied, so "mcc" commonly resolves to the first retained tree. MCC target selection does not make rateMap() a mixed-topology summarizer: every target branch must still be present in every retained input. If you already have a preferred consensus, chronogram, MCC, maximum-likelihood, or otherwise curated target tree, pass it with target_tree.

Fit weights. weights = "equal" assigns the same weight to each retained fit. weights = "ic" computes standard information-criterion weights from each retained fit's optimal_ic, requiring all retained fits to share the same non-missing IC_used. A numeric weights vector can be supplied for custom weighting. Weights are subset to retained fits after rateMapControl(na_action = "omit") and are normalized to sum to one. IC weights are descriptive fit-level weights for comparable retained searches; they do not choose a formal threshold or make incomparable searches comparable.

Uncertainty summaries. The returned intervals data frame is the plotted summary table. With summary = "branch", it has one row per branch. With summary = "interval", it has one row per plotted depth-grid interval. When uncertainty = TRUE, this table includes across-fit summaries for each plotted row: weighted mean, weighted median, weighted standard deviation, quantiles, highest-density interval bounds, quantile and highest-density interval widths, coefficient of variation, and the number of finite run-level values. The run-level values are also returned in run_values as one matrix per edge. These are weighted empirical summaries of run-level values, not posterior uncertainty or model-internal uncertainty unless the retained inputs themselves have that interpretation. For posterior tree samples, use weights = "equal" if each retained run represents one posterior draw. value_summary controls whether the plotted value column uses the weighted mean or weighted median. These summaries are computed on the log-rate scale when log = TRUE. Highest-density intervals use the same shortest empirical interval calculation for both equal and unequal fit weights. For weights = "equal", sd is the ordinary sample standard deviation of retained run-level values. For unequal weights, sd is the square root of the normalized weighted variance, sum(w * (x - mu)^2), with weights normalized to sum to one.

Display views. Returned objects include a default category-style display mapping so they can be plotted immediately. Palette, category-bin, legend title, and alternative-value choices are intentionally controlled by plot(x, ...) or rateMapView(), not by rateMap(). For a single bifrost search with log = FALSE, category display is the closest formal analogue to the illustrative generateViridisColorScale() plot. For multi-run summaries, displayed categories are bins for summarized branch values; they should not be read as newly inferred bifrost regimes. When summary = "branch" and color_mode = "category", the rate_categories table also reports branch-level summaries for the plotted values assigned to each bin. These summaries are recomputed by rateMapView() or plot() whenever the plotted value or category breaks change. They are not computed for summary = "interval" because interval rows depend on the plotting grid rather than on discrete biological branch units.

Rate diagnostics. Branch-summary maps compute rate diagnostics through rateMapControl(rate_flags = rateMapRateFlags()). The default rateMapRateFlags() setting records the fitted-rate range and fold range but applies no special near-zero or high-outlier rule. Supplying zero_floor, equivalently rateMapRateFlags(method = "floor", zero_floor = ...), flags finite rates at or below an explicit manual floor. Use rateMapRateFlags(method = "tail_cluster") to apply a deterministic, Otsu-style guarded two-class split on sorted log rates when the near-zero values form a broader separated lower-tail cluster rather than one extreme adjacent gap. The split is kept only when gap, fold-reduction, tail-fraction, and minimum-count guardrails are satisfied. These flags never remove branches and never alter intervals$value; they add rate_flag metadata, rate_flag_source provenance, object-level rate_diagnostics, and, in category display mode, special display categories such as "near-zero". Special categories are colored outside the ordered palette, so the remaining regular categories still use the full low-to-high palette range. In category legends, diagnostic cutoffs mark where special categories end and regular bins begin. Set rate_flags = NULL to turn off rate-flag metadata and diagnostics entirely.

References

Paradis, E., and Schliep, K. (2019). ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics, 35, 526-528. doi:10.1093/bioinformatics/bty633

Clavel, J., Aristide, L., and Morlon, H. (2019). A penalized likelihood framework for high-dimensional phylogenetic comparative methods and an application to new-world monkeys brain evolution. Systematic Biology, 68, 93-116. doi:10.1093/sysbio/syy045

Revell, L. J. (2013). Two new graphical methods for mapping trait evolution on phylogenies. Methods in Ecology and Evolution, 4, 754-759.

Revell, L. J. (2024). phytools 2.0: an updated R ecosystem for phylogenetic comparative methods (and other things). PeerJ, 12, e16505. doi:10.7717/peerj.16505

Examples

if (FALSE) { # \dontrun{
# A list of completed bifrost searches can be summarized directly:
rm_obj <- rateMap(list(search_a, search_b, search_c))
plot(rm_obj, type = "arc", show_tip_labels = FALSE)

# A single completed bifrost search can be inspected with the same API:
one_search_map <- rateMap(search_a)
plot(one_search_map, type = "arc", show_tip_labels = FALSE)

# Scratch-style lists remain supported:
scratch_fit <- list(
  variables = list(tree = mapped_tree),
  param = c("0" = 0.12, "1" = 0.45)
)
rateMap(list(scratch_fit))

# Use bifrost model-level IC weights across comparable sensitivity runs:
rm_ic <- rateMap(list(search_a, search_b), weights = "ic")

# Branch-level, discrete-category maps are the bifrost default:
branch_rates <- rateMap(list(search_a, search_b))
branch_rates$rate_categories

# To mimic the old jaw-shape preview style for a single search, use raw
# fitted rates and category colors:
jaw_view <- rateMapView(rateMap(search_a, log = FALSE), palette = viridis::viridis)

# Category bins use pretty breaks by default. Use equal-width bins or custom
# boundaries when those are easier to compare across figures:
equal_bin_rates <- rateMapView(
  branch_rates,
  n_categories = 5,
  category_bin_method = "equal"
)
custom_bin_rates <- rateMapView(
  branch_rates,
  category_breaks = c(-4, -2, 0, 2),
  category_labels = c("slow", "middle", "fast")
)

# Continuous interval maps remain available for stochastic maps with
# along-branch changes:
interval_rates <- rateMap(
  list(search_a, search_b),
  summary = "interval",
  control = list(res = 200)
)
plot(interval_rates, color_mode = "continuous")

# Custom fit weights are normalized internally:
custom_weighted <- rateMap(
  list(search_a, search_b, search_c),
  weights = c(2, 1, 1),
  summary = "branch"
)

# Posterior trees with the same topology but different branch lengths can be
# summarized on any explicit target or summary tree:
posterior_target_rates <- rateMap(
  posterior_fit_list,
  target_tree = summary_tree,
  summary = "branch",
  weights = "equal",
  uncertainty = TRUE,
  control = list(check = "topology")
)
plot(posterior_target_rates, value = "sd", type = "arc")

# Or choose the retained input tree with the highest summed log clade
# credibility as the plotting scaffold:
posterior_mcc_rates <- rateMap(
  posterior_fit_list,
  summary = "branch",
  control = list(check = "topology", target = "mcc")
)

# Large sensitivity sets can be explicitly subsampled before plotting.
# By default, rates are mapped on the log scale:
set.seed(1)
idx <- sample(seq_along(fit_list), size = 1000)
rm_sub <- rateMap(
  fit_list[idx],
  workers = 8,
  control = list(res = 100, future_strategy = "multisession")
)

# Use log = FALSE only when a raw-rate scale is preferred:
rm_raw <- rateMap(fit_list[idx], log = FALSE)
plot(
  rm_sub,
  type = "arc",
  show_tip_labels = FALSE,
  lwd = 1,
  palette = c("lightblue", "blue", "pink", "red")
)
} # }