R for Finance Exercises: 25 Real-World Practice Problems
Twenty-five practice problems that mirror real desk work in quant research, risk, and portfolio analytics: returns, rolling volatility, VaR, drawdowns, portfolio construction, Sharpe, CAPM, Fama-French, and end-to-end risk reports. Solutions are hidden behind reveal blocks so you can struggle first.
Section 1. Returns and price transformations (5 problems)
Exercise 1.1: Compute simple and log returns from a daily price series
Task: A quant analyst is auditing the daily price tape for ticker AAPL and needs both simple and log returns side by side to compare the two definitions before passing the series downstream. From the inline price tibble below, compute both return types and save the resulting tibble (columns date, price, simple_ret, log_ret) to ex_1_1.
Expected result:
#> # A tibble: 6 x 4
#> date price simple_ret log_ret
#> <date> <dbl> <dbl> <dbl>
#> 1 2024-01-02 185. NA NA
#> 2 2024-01-03 184. -0.00754 -0.00756
#> 3 2024-01-04 181. -0.0125 -0.0126
#> 4 2024-01-05 181. -0.00400 -0.00401
#> 5 2024-01-08 186. 0.0240 0.0237
#> 6 2024-01-09 185. -0.00432 -0.00433
Difficulty: Beginner
Both return definitions compare each price to the one immediately before it, so the very first row has no prior price and must come out missing.
Add two columns inside mutate(); use lag(price) for the prior close, then price / lag(price) - 1 for the simple return and log(price / lag(price)) for the log return.
Explanation: Simple returns are price-ratio-minus-one and aggregate nicely across assets at a single point in time (a portfolio's simple return is the weighted sum). Log returns are differences of log-prices and aggregate nicely across time (multi-period log return is the sum). Most risk and time-series models prefer log returns because they are roughly symmetric around zero and, for small moves, match simple returns to first order.
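The two definitions can be sketched in base R as follows. This is a minimal illustration, not the page's own solution: the price vector and the object name `px` are hypothetical, and plain vectors stand in for dplyr's mutate() and lag() so the sketch runs without packages.

```r
# Hypothetical closing prices; not the exercise's AAPL data.
px <- data.frame(
  date  = as.Date("2024-01-02") + c(0, 1, 2, 3, 6, 7),
  price = c(100, 102, 99, 99, 103, 101)
)

# Prior close: shift the series down by one and pad the first row with NA.
prev <- c(NA, head(px$price, -1))

px$simple_ret <- px$price / prev - 1      # price ratio minus one
px$log_ret    <- log(px$price / prev)     # difference of log prices

# The two definitions are linked by log_ret = log(1 + simple_ret).
all.equal(px$log_ret, log1p(px$simple_ret))

# Log returns add across time: their sum is the full-period log return.
all.equal(sum(px$log_ret, na.rm = TRUE), log(101 / 100))
```

The first row is NA in both columns because there is no prior close to compare against.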
Exercise 1.2: Build the cumulative wealth curve for a $10,000 starting balance
Task: A retail brokerage dashboard needs to show what $10,000 invested at day zero would be worth at each subsequent close. Given the daily simple-return vector below for a 10-day window, compute the running wealth path (no contributions, no fees) and save the resulting numeric vector to ex_1_2.
Expected result:
#> [1] 10000.00 10075.00 10025.62 9925.36 10084.16 10185.00 10134.07 10215.14 10306.07 10193.71
Difficulty: Beginner
Wealth grows multiplicatively, so each day's balance is the previous balance scaled by one plus that day's return.
Use cumprod() on 1 + daily_ret, and prepend a 1 with c() so the running path starts at the $10,000 opening balance.
Explanation: Wealth compounds multiplicatively, so the path is the running product of (1 + r_t) factors. Prepending 1 ensures the first element equals the starting balance and the vector length equals returns plus one. Using log returns instead, the same path would be 10000 * exp(cumsum(log_ret)), which is numerically more stable for very long horizons.
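A short sketch of both routes to the wealth path, assuming a made-up five-day return vector (the exercise's own vector is not reproduced here, so `daily_ret` below is hypothetical):

```r
# Hypothetical daily simple returns; not the exercise's data.
daily_ret <- c(0.01, -0.005, 0.02, -0.015, 0.0075)
start_bal <- 10000

# Running product of growth factors, with a leading 1 for the opening balance.
wealth <- start_bal * c(1, cumprod(1 + daily_ret))

# Same path from log returns: sum the logs, then exponentiate.
log_ret    <- log1p(daily_ret)
wealth_log <- start_bal * c(1, exp(cumsum(log_ret)))

all.equal(wealth, wealth_log)
length(wealth) == length(daily_ret) + 1
```

Both constructions agree up to floating-point error; the cumsum() form simply moves the accumulation into log space.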
Exercise 1.3: Aggregate daily returns to monthly returns by compounding
Task: The performance team reports monthly P&L to the investment committee, so daily returns must be compounded inside each calendar month rather than summed. From the inline two-month daily-return tibble, produce a monthly tibble with columns month and monthly_ret and save it to ex_1_3.
Expected result:
#> # A tibble: 2 x 2
#> month monthly_ret
#> <date> <dbl>
#> 1 2024-01-01 0.0142
#> 2 2024-02-01 -0.0098
Difficulty: Intermediate
Returns inside a calendar month combine multiplicatively, not by addition, so each month's figure is the compounded growth across all of its days.
Derive a month key with as.Date(format(date, "%Y-%m-01")) so the column stays a date (format() alone returns a character string), then group_by() it and summarise() with prod(1 + ret) - 1.
Explanation: Compounding inside a bucket is prod(1 + r) - 1, which is the correct aggregation for simple returns. Summing daily returns is wrong because it ignores the cross-product terms, which matter more when daily moves are large or the horizon is long. For log returns the bucket aggregation is just sum(), which is one of the main reasons risk models work in log space.
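The gap between compounding and summing can be seen in a small base R sketch. The five dated returns are hypothetical, and tapply() is used here in place of dplyr's group_by() and summarise() so the example needs no packages.

```r
# Hypothetical daily returns spanning two months; not the exercise's data.
d <- data.frame(
  date = as.Date(c("2024-01-29", "2024-01-30", "2024-01-31",
                   "2024-02-01", "2024-02-02")),
  ret  = c(0.010, -0.020, 0.015, 0.005, -0.010)
)

# Character month key (first of month); ISO format sorts chronologically.
key <- format(d$date, "%Y-%m-01")

# Compound inside each month: product of growth factors minus one.
comp <- tapply(d$ret, key, function(r) prod(1 + r) - 1)

# Convert the key back to a real Date so the result column is a date.
monthly <- data.frame(month       = as.Date(names(comp)),
                      monthly_ret = as.numeric(comp))

# The naive sum differs from the compounded figure by the cross-product terms.
naive <- as.numeric(tapply(d$ret, key, sum))
monthly$monthly_ret - naive
```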
Exercise 1.4: Reshape wide OHLC bars into long format for plotting
Task: A junior analyst exported daily bars in a wide format with separate columns for open, high, low, and close, but ggplot wants a long table to facet by series. Pivot the inline OHLC tibble to columns date, series, value keeping the four series in a sensible order, and save the result to ex_1_4.
Expected result:
#> # A tibble: 12 x 3
#> date series value
#> <date> <chr> <dbl>
#> 1 2024-01-02 open 185.
#> 2 2024-01-02 high 187.
#> 3 2024-01-02 low 184.
#> 4 2024-01-02 close 186.
#> # 8 more rows hidden
Difficulty: Beginner
Plotting wants one row per observation, so the four price columns should collapse into a single value column tagged by which series each value came from.
Use pivot_longer() with names_to = "series" and values_to = "value", then set a factor() with explicit levels to keep open/high/low/close in order.
Explanation: Long format is the canonical shape for ggplot2 because the grammar of graphics maps a single column to each aesthetic. The factor ordering matters: ggplot would otherwise alphabetize the legend (close, high, low, open) which is jarring on a price chart where the natural order is open, high, low, close. pivot_longer() replaced the legacy gather() in tidyr 1.0.
Exercise 1.5: Flag daily returns that exceed three standard deviations
Task: The trade-surveillance team scans daily returns for anomalous moves that should be reviewed for fat-finger errors or news events. Given a 30-day return vector, add a logical column is_outlier that is TRUE when the absolute return exceeds three sample standard deviations of the series, then save the tibble (columns day, ret, is_outlier) to ex_1_5.
Expected result:
#> # A tibble: 30 x 3
#> day ret is_outlier
#> <int> <dbl> <lgl>
#> 1 1 0.00755 FALSE
#> 2 2 -0.0103 FALSE
#> # 27 more rows hidden
#> 30 30 -0.0782 TRUE
#>
#> n_outliers: 1
Difficulty: Intermediate
An anomalous move is one that sits far from the typical spread of the whole series, measured in multiples of that spread.
Inside mutate(), compare abs(ret) against 3 * sd(ret) to produce the logical is_outlier column.
Explanation: Three-sigma is a fast first-pass screen, not a formal anomaly test, because daily returns are leptokurtic (fat-tailed) and the threshold misclassifies real market moves more often than a normal-theory calculation suggests. Production surveillance usually layers in a robust scale (MAD instead of sd()) and a rolling window so the threshold adapts to volatility regimes. The same idea generalizes to any z-score filter.
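A sketch of the screen and of the MAD-based variant mentioned above, on simulated data. The series is hypothetical (29 Normal draws plus one injected shock), and the column name `is_outlier_mad` is an illustrative addition, not part of the exercise.

```r
set.seed(1)
# Hypothetical 30-day series: 29 ordinary days plus one injected shock.
ret  <- c(rnorm(29, mean = 0, sd = 0.01), -0.08)
surv <- data.frame(day = seq_along(ret), ret = ret)

# Classic three-sigma screen on the full-sample standard deviation.
surv$is_outlier <- abs(surv$ret) > 3 * sd(surv$ret)

# Robust variant: centre on the median and scale by MAD, which the shock
# itself barely moves (mad() is rescaled to be comparable to sd under Normality).
surv$is_outlier_mad <- abs(surv$ret - median(surv$ret)) > 3 * mad(surv$ret)

sum(surv$is_outlier)
sum(surv$is_outlier_mad)
```

Because the shock inflates sd() itself, the plain three-sigma threshold is wider than the MAD-based one, so the robust screen can flag days the classic screen lets through.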
Section 2. Risk metrics: volatility, VaR, drawdown (5 problems)
Exercise 2.1: Rolling 30-day annualized volatility for an equity book
Task: The risk team needs a daily report of 30-day annualized realized volatility for the equity book. Given the inline 100-day return tibble, compute the trailing 30-day standard deviation and annualize by multiplying by sqrt(252), then save the result (columns day, ret, vol_30d_ann) to ex_2_1.
Expected result:
#> # A tibble: 100 x 3
#> day ret vol_30d_ann
#> <int> <dbl> <dbl>
#> 1 1 0.0114 NA
#> 2 2 -0.00533 NA
#> # 27 more rows hidden
#> 30 30 -0.00207 0.171
#> 31 31 0.0150 0.173
#> # 69 more rows hidden
Difficulty: Intermediate
Volatility on any given day should look only at a fixed trailing window of recent returns, and lifting a daily figure to a yearly one uses the square root of the number of trading days.
Use zoo::rollapplyr() with width = 30, FUN = sd, and fill = NA, then multiply the result by sqrt(252).
Explanation: rollapplyr (right-aligned) is the convention for trailing windows: the volatility at day t uses returns from t-29 through t and is therefore strictly backward-looking, which matters because forward-looking windows leak future information into the metric. The sqrt(252) factor assumes 252 trading days a year and i.i.d. returns; the i.i.d. assumption is wrong (vol clusters) but the convention is universal so reports remain comparable across desks.
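To make the right-aligned window explicit, here is a base R sketch that reproduces what a trailing 30-day window computes. The return series is simulated and hypothetical, and sapply() stands in for zoo::rollapplyr() so the example runs without zoo installed.

```r
set.seed(42)
# Hypothetical 100-day return series; not the exercise's data.
ret   <- rnorm(100, mean = 0.0005, sd = 0.011)
width <- 30

# Trailing (right-aligned) window: day t uses returns t-29 .. t only.
# This mirrors zoo::rollapplyr(ret, width = 30, FUN = sd, fill = NA).
vol_30d <- sapply(seq_along(ret), function(t) {
  if (t < width) NA_real_ else sd(ret[(t - width + 1):t])
})

# Annualize with the sqrt(252) convention.
vol_30d_ann <- vol_30d * sqrt(252)

sum(is.na(vol_30d_ann))   # the first 29 days have no full window
```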
Exercise 2.2: Historical 95% VaR for a daily P&L vector
Task: A risk officer needs to report 1-day 95% historical Value-at-Risk for a long equity position with a $1,000,000 notional. From the inline 250-day return vector, compute VaR as the negative of the 5th percentile of returns scaled by notional, returning a single numeric (positive number, dollars at risk), and save it to ex_2_2.
Expected result:
#> [1] 18233.42
Difficulty: Advanced
Historical VaR reads a loss threshold straight off the empirical distribution of past returns and reports it as a positive number of dollars.
Take quantile(ret, probs = 0.05), negate it, and multiply by the notional.
Explanation: Historical VaR is a non-parametric, distribution-free estimate: take the empirical 5% quantile and flip its sign so VaR is reported as a positive loss number. It does not assume normality and naturally captures the empirical left tail. The weakness is that with only 250 daily observations the 5th-percentile estimator has high variance, which is why many desks layer in parametric or filtered-historical methods (next exercise) and stress overlays.
Exercise 2.3: Parametric Normal VaR at 99% confidence
Task: The same risk officer wants a parametric Normal VaR overlay at the 99% level for a $5,000,000 position to compare against the historical number from the previous exercise. Compute parametric VaR assuming returns are Normal with sample mean and sample sd, then scale by notional, and save the single numeric to ex_2_3.
Expected result:
#> [1] 126541.6
Difficulty: Advanced
Parametric VaR assumes a Normal shape, so it needs only the average return, the spread of returns, and a confidence multiplier drawn from that distribution.
Compute mean() and sd() of the returns, get the multiplier with qnorm(0.99), and form -(mu - z * sig) before scaling by the notional.
Explanation: Parametric VaR multiplies the volatility by a Normal quantile (qnorm(0.99) is about 2.33). It is fast and easy to scale across many books, but it understates left-tail risk because asset returns are fatter-tailed than Normal. In practice teams use a Student-t or filtered-historical version for the same compute cost. The sign convention -(mu - z*sig) returns VaR as a positive loss; some teams drop mu entirely because mean drift is small on a 1-day horizon.
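The sign convention is easy to get wrong, so a small sketch may help. The return series is simulated and hypothetical; `var99_zero` is an illustrative name for the zero-mean variant described above.

```r
set.seed(7)
# Hypothetical 250-day return series; the notional comes from the task.
ret      <- rnorm(250, mean = 0.0004, sd = 0.011)
notional <- 5e6

mu  <- mean(ret)
sig <- sd(ret)
z   <- qnorm(0.99)                       # about 2.33

# Loss at the 1% left-tail quantile of N(mu, sig^2), as a positive number.
var99 <- -(mu - z * sig) * notional

# Common simplification on a 1-day horizon: ignore the mean drift.
var99_zero <- z * sig * notional

# Cross-check against the Normal quantile function directly.
all.equal(var99, -qnorm(0.01, mean = mu, sd = sig) * notional)
```

The two variants differ by exactly mu * notional, which is why dropping the mean changes little over one day.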
Exercise 2.4: Maximum drawdown and the date it bottomed
Task: A portfolio manager presenting performance to allocators must show the deepest peak-to-trough loss the fund experienced during the back-test. From the inline wealth-curve tibble, compute the maximum drawdown (as a negative number, e.g. -0.18 for an 18% loss) and the date it occurred, returning a one-row tibble (columns max_dd, dd_date), and save to ex_2_4.
Expected result:
#> # A tibble: 1 x 2
#> max_dd dd_date
#> <dbl> <date>
#> 1 -0.124 2024-03-15
Difficulty: Intermediate
Drawdown measures how far below its running high-water mark the portfolio has fallen, so you first need that running peak at every point in time.
Build the peak with cummax(wealth) and the drawdown as wealth / peak - 1, then slice_min() on it and transmute() the two output columns.
Explanation: Drawdown at time t is wealth_t / max_{s<=t}(wealth_s) - 1, so cummax() is the right primitive: it tracks the running peak. slice_min then picks the day with the worst drawdown. Two extensions matter in practice: recovery time (days from trough back to the prior peak) and the ulcer index (RMS of drawdowns), both of which read more honestly than max drawdown alone, which is a single worst-case observation.
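A compact base R sketch of the running-peak construction, using a hypothetical seven-day wealth curve (which.min() plays the role of slice_min() here):

```r
# Hypothetical wealth curve; not the exercise's data.
curve <- data.frame(
  date   = as.Date("2024-03-11") + 0:6,
  wealth = c(100, 104, 101, 95, 98, 106, 103)
)

peak <- cummax(curve$wealth)          # running high-water mark
dd   <- curve$wealth / peak - 1       # zero at a new peak, negative below it

i_min  <- which.min(dd)               # first day with the deepest drawdown
result <- data.frame(max_dd = dd[i_min], dd_date = curve$date[i_min])

# Ulcer index as described above: root mean square of the drawdown series.
ulcer <- sqrt(mean(dd^2))
```

In this toy curve the trough is the fourth day, where wealth of 95 sits below the running peak of 104.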
Exercise 2.5: Conditional VaR (Expected Shortfall) at 99% from historical returns
Task: Modern regulatory frameworks such as FRTB base market-risk capital on Expected Shortfall rather than VaR because ES penalizes fat tails that VaR ignores. From the inline 500-day return vector, compute 99% historical ES as the mean of all returns at or below the 1% quantile, flip its sign so the answer is a positive loss percentage, scale by a $2,000,000 notional, and save the dollar number to ex_2_5.
Expected result:
#> [1] 73428.41
Difficulty: Advanced
Expected Shortfall averages only the returns living in the extreme tail beyond the cutoff, rather than reading a single threshold off the distribution.
Find the 1% point with quantile(ret, 0.01), take mean() of the returns at or below it, negate, and scale by the notional.
Explanation: ES (also called CVaR or TailVaR) averages the losses in the tail beyond the VaR cutoff, so it reports what you expect to lose conditional on a bad day. With heavy-tailed Student-t returns ES will be materially larger than the equal-confidence VaR, which is exactly the point of using ES under FRTB. The estimator has high variance for small samples, so production systems use parametric overlays or extreme-value tail fits when only a few hundred observations are available.
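The relationship between the VaR cutoff and the tail average can be sketched on simulated heavy-tailed data. The scaled Student-t series below is an assumption chosen for illustration, not the exercise's input.

```r
set.seed(99)
# Hypothetical 500-day heavy-tailed returns (scaled Student-t, 4 d.f.).
ret      <- 0.008 * rt(500, df = 4)
notional <- 2e6

cutoff <- quantile(ret, probs = 0.01, names = FALSE)   # 1% empirical quantile
tail_r <- ret[ret <= cutoff]                           # returns at or below it

var99 <- -cutoff * notional            # historical VaR, positive dollars
es99  <- -mean(tail_r) * notional      # Expected Shortfall, positive dollars

length(tail_r)    # how few observations the ES estimate rests on
es99 >= var99     # the tail average is never milder than the cutoff
```

With 500 observations only a handful of returns fall at or below the 1% quantile, which is the small-sample variance problem noted above.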
Section 3. Portfolio construction and analysis (5 problems)
Exercise 3.1: Equal-weight portfolio returns from four ticker return streams
Task: A long-only fund runs an equal-weight benchmark across four tech tickers and needs the daily portfolio return series for performance attribution. From the inline tibble of daily simple returns for AAPL, MSFT, GOOG, NVDA, compute the daily equal-weight portfolio return as a numeric vector (one element per day) and save it to ex_3_1.
Expected result:
#> [1] 0.00518 -0.00865 0.01035 0.00115 -0.00533
Difficulty: Beginner
An equal-weight portfolio earns the plain average of its holdings' returns on each day.
Use rowMeans() across the four ticker columns, picking them out first with select().
Explanation: For an equal-weight portfolio the daily return is just the row mean of the asset returns. rowMeans() on the four return columns is faster and clearer than a manual sum-divided-by-4. For arbitrary weights you would build a numeric weight vector w and compute as.matrix(rets[ , tickers]) %*% w, which generalizes to thousands of assets without rewriting the code.
Exercise 3.2: Portfolio variance from a 4x4 covariance matrix and weight vector
Task: A risk model produces a daily covariance matrix of returns for four assets and the PM wants the realized portfolio variance at a given target weight allocation. From the inline covariance matrix and weight vector, compute portfolio variance as the quadratic form w' Sigma w, returning a single numeric and saving it to ex_3_2.
Expected result:
#> [1] 0.000135
Difficulty: Advanced
Portfolio variance is not a simple weighted sum of variances; it folds in every pairwise covariance through a quadratic combination of the weights.
Form the quadratic t(w) %*% Sigma %*% w and wrap it in as.numeric() to get a single scalar.
Explanation: Portfolio variance is the quadratic form w' Sigma w. Replacing it with a weighted sum of standalone risk figures is the most common reason linear weighting of risk metrics gives wrong answers, because the correlations are missing. Annualize by multiplying by 252 if Sigma is built from daily returns; take a square root to get portfolio volatility. For large universes the covariance matrix becomes ill-conditioned and shrinkage (Ledoit-Wolf) or factor models are used to stabilize it before any optimization runs.
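A minimal sketch of the quadratic form next to the naive variance-only sum. The volatilities, the flat 0.3 correlation, and the weights are all hypothetical inputs made up for illustration.

```r
# Hypothetical daily covariance matrix and weights; not the exercise's inputs.
vols  <- c(0.010, 0.012, 0.008, 0.020)           # daily volatilities
corr  <- matrix(0.3, 4, 4)                       # flat 0.3 correlation
diag(corr) <- 1
Sigma <- diag(vols) %*% corr %*% diag(vols)
w     <- c(0.3, 0.3, 0.2, 0.2)

port_var <- as.numeric(t(w) %*% Sigma %*% w)     # quadratic form w' Sigma w

# Weighting the variances alone drops every covariance term.
naive_var <- sum(w^2 * diag(Sigma))

port_vol_ann <- sqrt(port_var * 252)             # annualized volatility
```

With positive weights and positive correlations, `naive_var` understates `port_var` because every off-diagonal term it omits is positive.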
Exercise 3.3: Monthly rebalance to target weights with drift between months
Task: A balanced fund rebalances back to fixed target weights at the start of each month and lets the portfolio drift during the month. Given the inline monthly drift tibble (asset returns within each month for two assets) and target weights of 60% equity and 40% bonds, compute the post-rebalance weights at the start of month 2 (after applying month 1 drift) and save the named numeric vector to ex_3_3.
Expected result:
#> equity bonds
#> 0.6000000 0.4000000
Difficulty: Advanced
A rebalance trades the portfolio all the way back to its fixed targets, so wherever the weights drifted to during the month does not affect the post-rebalance answer.
The result is simply target_w itself; you can compute the drifted weights as target_w * (1 + month1_drift) normalized by their sum() to see the trade list.
Explanation: The rebalance trades the portfolio back to the target weights, so the post-rebalance vector is exactly the targets regardless of how far the drifted weights had moved. The intermediate drifted_w is what you would feed into a turnover or transaction-cost calculation: the difference between drifted and target weights is the trade list. In real systems you would round to lot sizes and check trading-cost thresholds before rebalancing tiny drifts.
Exercise 3.4: Annualized Sharpe ratio with a 2% risk-free rate
Task: The PM is filing a fund factsheet and needs the annualized Sharpe ratio of the daily portfolio return stream. Given the inline 250-day return vector and an annual risk-free rate of 2%, compute the annualized Sharpe ratio (mean excess return over the daily rf, scaled to annual, divided by annualized volatility) and save the single numeric to ex_3_4.
Expected result:
#> [1] 0.97
Difficulty: Intermediate
Sharpe compares the average return earned above the risk-free rate to the volatility of those excess returns, then lifts the daily figure to an annual one.
Convert the annual rate with rf_annual / 252, subtract it from the returns, and compute mean(excess) / sd(excess) * sqrt(252).
Explanation: The Sharpe ratio annualization assumes i.i.d. daily returns, so the numerator scales by 252 and the denominator by sqrt(252), netting to a single sqrt(252) factor on the daily Sharpe. The risk-free rate is converted from annual to daily by simple division because the magnitude is tiny. Modified Sharpe ratios that incorporate skewness and kurtosis are used in hedge-fund reporting where return distributions are far from Normal.
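The netting of the two annualization factors can be checked directly. The return series below is simulated and hypothetical; `long_way` is an illustrative name.

```r
set.seed(3)
# Hypothetical 250-day portfolio returns; rf_annual comes from the task.
ret       <- rnorm(250, mean = 0.0006, sd = 0.010)
rf_annual <- 0.02

excess <- ret - rf_annual / 252                  # daily excess return

sharpe_daily <- mean(excess) / sd(excess)
sharpe_ann   <- sharpe_daily * sqrt(252)

# Same number built the long way: annualize numerator and denominator separately.
long_way <- (mean(excess) * 252) / (sd(excess) * sqrt(252))
all.equal(sharpe_ann, long_way)
```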
Exercise 3.5: Marginal and component risk contribution by asset
Task: The risk team wants to attribute total portfolio risk to each asset using component contribution to risk (CCR), which sums to total portfolio variance. From the same Sigma and w as exercise 3.2, compute the component contributions w * (Sigma %*% w) as a numeric vector (one element per asset) summing to portfolio variance, and save it to ex_3_5.
Expected result:
#> [1] 4.110e-05 3.480e-05 2.180e-05 3.760e-05
#> sum: 0.0001352
Difficulty: Advanced
Total portfolio risk can be split so each asset owns a share of it, and those shares add back up to the whole portfolio variance.
Compute the marginal term Sigma %*% w, multiply it elementwise by w, and coerce the result with as.numeric().
Explanation: The decomposition is Euler's theorem for the homogeneous function sigma^2(w) = w' Sigma w: each asset contributes w_i * (Sigma w)_i and the contributions sum to total variance. The fourth asset contributes more than the third despite having the same weight because its variance is much higher (0.04% vs 0.014%). Risk-parity targets equal CCR per asset; minimum-variance ignores CCR.
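A short sketch confirming that the contributions add back to the total. The covariance matrix and weights are hypothetical stand-ins, and `share` is an illustrative extra that expresses each contribution as a fraction of total variance.

```r
# Hypothetical inputs; any symmetric Sigma and weight vector w would do.
vols  <- c(0.010, 0.012, 0.008, 0.020)
corr  <- matrix(0.3, 4, 4)
diag(corr) <- 1
Sigma <- diag(vols) %*% corr %*% diag(vols)
w     <- c(0.3, 0.3, 0.2, 0.2)

marginal <- as.numeric(Sigma %*% w)    # (Sigma w)_i, one entry per asset
ccr      <- w * marginal               # component contribution to variance

port_var <- as.numeric(t(w) %*% Sigma %*% w)
all.equal(sum(ccr), port_var)          # contributions add back to the total

share <- ccr / port_var                # each asset's share of total variance
```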
Section 4. Performance and benchmarking (4 problems)
Exercise 4.1: Information ratio of a strategy against its benchmark
Task: A long-short equity strategy reports its performance against the S&P 500 daily total return, and the allocator wants the annualized information ratio (excess return over benchmark divided by tracking error). Given the inline 252-day strategy and benchmark return vectors, compute the annualized IR and save the single numeric to ex_4_1.
Expected result:
#> [1] 0.41
Difficulty: Intermediate
Information ratio is a Sharpe-style figure where the yardstick is the benchmark rather than cash, built from the return earned over that benchmark.
Form the active return strategy_ret - bench_ret, then take its mean() over its sd(), scaled by sqrt(252).
Explanation: Information ratio is the Sharpe-like metric where the comparison is the benchmark instead of cash: numerator is the mean active return, denominator is the standard deviation of active returns (tracking error). It is the metric of choice for actively managed long-only mandates where the manager is paid for beating a benchmark rather than absolute return. A persistent IR above 0.5 is considered strong; above 1.0 is exceptional.
Exercise 4.2: Tracking error in basis points and average active return
Task: A passive index replication desk is monitoring how closely a tracker fund follows its underlying index, and the regulator requires monthly tracking error reports. From the same 252-day strategy and benchmark vectors, compute annualized tracking error in basis points (1 unit = 0.01%) and the annualized active return in basis points, returning a one-row tibble (columns te_bps, active_bps), and save to ex_4_2.
Expected result:
#> # A tibble: 1 x 2
#> te_bps active_bps
#> <dbl> <dbl>
#> 1 2143 89.5
Difficulty: Advanced
Tracking error is the annualized spread of the active return stream, and reporting it in basis points just rescales a small decimal into a readable unit.
From active, compute sd(active) * sqrt(252) * 10000 and mean(active) * 252 * 10000 inside a one-row tibble().
Explanation: Basis points are the universal unit on fixed-income and ETF tracking desks because percentage points are too coarse. A 21% annualized tracking error on a passive replicator would be a disaster; on an active fund it is normal. Multiplying by 10000 converts decimals to bps. Reporting tracking error and active return as a pair lets the reader compute the implied IR without the analyst telling them what to think.
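The implied IR mentioned above falls straight out of the two reported figures. A sketch on simulated, hypothetical strategy and benchmark series:

```r
set.seed(11)
# Hypothetical 252-day strategy and benchmark returns.
bench_ret    <- rnorm(252, mean = 0.0004, sd = 0.010)
strategy_ret <- bench_ret + rnorm(252, mean = 0.0001, sd = 0.002)

active <- strategy_ret - bench_ret

te_bps     <- sd(active) * sqrt(252) * 1e4     # annualized tracking error, bps
active_bps <- mean(active) * 252 * 1e4         # annualized active return, bps

# The information ratio is the ratio of the two annualized figures,
# which equals the daily ratio scaled by sqrt(252).
ir <- active_bps / te_bps
all.equal(ir, mean(active) / sd(active) * sqrt(252))
```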
Exercise 4.3: Win rate and average win-to-loss ratio for a trading strategy
Task: A discretionary trader is reviewing their trade blotter to size positions for next quarter and needs two simple statistics: the win rate (share of profitable trades) and the ratio of the average winning trade to the average losing trade. From the inline trade-P&L vector, compute both as a named numeric vector and save it to ex_4_3.
Expected result:
#> win_rate win_loss_ratio
#> 0.6000 2.0833
Difficulty: Beginner
Split the trades into winners and losers, then summarise how often you win and how big a typical win is relative to a typical loss.
Subset pnl[pnl > 0] and pnl[pnl < 0], then build a named vector using length() ratios and mean() of each side, taking abs() of the average loss so the ratio comes out positive.
Explanation: Expectancy of a strategy is win_rate * avg_win - (1 - win_rate) * avg_loss, with avg_loss taken as a positive magnitude, so a high win rate alone is meaningless without the magnitude ratio. A trend-following system might have a 35% win rate but a 3:1 win-to-loss ratio and be highly profitable; a mean-reversion strategy might have a 65% win rate and a 0.5:1 ratio and bleed slowly (0.65 * 0.5 - 0.35 * 1 is negative). Both numbers belong on every blotter review.
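The expectancy formula can be checked against the plain average trade. The ten-trade blotter below is hypothetical.

```r
# Hypothetical trade P&L in dollars; not the exercise's blotter.
pnl <- c(250, -120, 400, -80, 150, 300, -100, 90, -140, 210)

wins   <- pnl[pnl > 0]
losses <- pnl[pnl < 0]

win_rate <- length(wins) / length(pnl)
avg_win  <- mean(wins)
avg_loss <- abs(mean(losses))          # magnitude of the average losing trade

stats <- c(win_rate = win_rate, win_loss_ratio = avg_win / avg_loss)

# Expectancy per trade: probability-weighted win minus probability-weighted loss.
expectancy <- win_rate * avg_win - (1 - win_rate) * avg_loss
all.equal(expectancy, mean(pnl))       # holds when no trade is exactly zero
```

Expectancy is just the mean P&L per trade split into its win and loss components, which is why the two numbers agree here.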
Exercise 4.4: Sortino ratio using downside deviation against a zero target
Task: The Sharpe ratio penalizes upside volatility, which clients dislike for funds that report mostly positive returns with rare large drawups. From the inline 250-day return vector, compute the annualized Sortino ratio using a zero target return (downside deviation = sqrt of mean of squared negative excess returns), and save the single rounded numeric to ex_4_4.
Expected result:
#> [1] 1.32
Difficulty: Advanced
Sortino swaps Sharpe's two-sided spread for a one-sided one that counts only the volatility of below-target outcomes.
Use pmin(ret - target, 0) to keep just the downside, take sqrt(mean(downside^2)) for the deviation, then mean(ret - target) / dd * sqrt(252).
Explanation: Sortino ratio replaces Sharpe's standard deviation with a one-sided downside deviation, capturing only the volatility of unwanted (negative) outcomes. pmin(x, 0) is the idiomatic R way to zero out the positive side. The annualization uses sqrt(252) on the downside deviation under the same i.i.d. assumption Sharpe uses. The metric reads more favorably for positively skewed strategies, which is exactly the marketing reason allocators ask for it.
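A sketch of the downside-deviation calculation on a simulated, hypothetical return series. The `sharpe_like` line is an illustrative comparison that uses the two-sided sd() on the same series.

```r
set.seed(5)
# Hypothetical 250-day return series; the zero target comes from the task.
ret    <- rnorm(250, mean = 0.0008, sd = 0.010)
target <- 0

downside <- pmin(ret - target, 0)        # positive excess returns become 0
dd       <- sqrt(mean(downside^2))       # downside deviation over all days

sortino <- mean(ret - target) / dd * sqrt(252)

# For comparison: Sharpe-style ratio on the same series with a two-sided sd.
sharpe_like <- mean(ret - target) / sd(ret) * sqrt(252)
```

Note that mean(downside^2) divides by the full number of days, not only the losing days, which matches the definition given in the task.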
Section 5. Factor models and regression (3 problems)
Exercise 5.1: CAPM beta from market and stock daily returns
Task: A junior quant onboarding to the equity strategy desk needs to compute the CAPM beta of a stock against the market: beta is the slope coefficient from regressing stock excess returns on market excess returns. From the inline 252-day market and stock return tibble (assume the risk-free rate is zero for simplicity), fit lm() and extract beta as a single numeric saved to ex_5_1.
Expected result:
#> [1] 1.20
Difficulty: Intermediate
Beta is the sensitivity of the stock to the market, which is exactly the slope of a straight-line fit of one return series on the other.
Fit lm(stock ~ mkt) and pull the mkt slope out of coef().
Explanation: CAPM beta is the population covariance of stock and market returns divided by the market variance, which lm() estimates by ordinary least squares. The intercept is alpha, the standard error of beta is read from the coefficient table of summary(), and the R-squared is the share of variance explained by the market factor. Beta is sensitive to the regression window: a 60-day rolling beta will swing far more than a 5-year monthly beta, and product disclosures must state which one.
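The equivalence between the lm() slope and the covariance ratio can be sketched on simulated data. The series below is hypothetical, built with a true beta of 1.2 purely for illustration.

```r
set.seed(21)
# Hypothetical 252-day data with a true beta of 1.2 baked in.
mkt   <- rnorm(252, mean = 0.0004, sd = 0.010)
stock <- 0.0001 + 1.2 * mkt + rnorm(252, sd = 0.008)
capm  <- data.frame(mkt = mkt, stock = stock)

fit  <- lm(stock ~ mkt, data = capm)
beta <- unname(coef(fit)["mkt"])

# The OLS slope is the sample covariance over the sample market variance.
all.equal(beta, cov(capm$mkt, capm$stock) / var(capm$mkt))

# Alpha, the standard error of beta, and R-squared from the fitted model.
alpha   <- unname(coef(fit)["(Intercept)"])
se_beta <- summary(fit)$coefficients["mkt", "Std. Error"]
r2      <- summary(fit)$r.squared
```

The estimated beta will sit near, but not exactly at, 1.2 because of the simulated noise.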
Exercise 5.2: Fama-French 3-factor regression on monthly excess returns
Task: A long-only mutual fund is being benchmarked against the Fama-French 3-factor model (market, size SMB, value HML) to back out style-adjusted alpha. From the inline 60-month tibble of fund excess returns and three factor returns, fit a 3-factor regression and return a one-row tibble (columns alpha, beta_mkt, beta_smb, beta_hml) with rounded coefficients, saved to ex_5_2.
Expected result:
#> # A tibble: 1 x 4
#> alpha beta_mkt beta_smb beta_hml
#> <dbl> <dbl> <dbl> <dbl>
#> 1 0.0008 0.98 0.31 -0.15
Difficulty: Advanced
The three-factor model explains fund returns with three drivers at once, and the leftover intercept is the style-adjusted skill term.
Fit lm(fund_x ~ mkt_rf + smb + hml, data = ff), then read the intercept and three slopes from coef() into a one-row tibble.
Explanation: Fama-French extends CAPM by adding two long-short factor portfolios: SMB (small minus big, capturing the size premium) and HML (high minus low book-to-market, capturing the value premium). Alpha after the regression is what's left over and is closer to a manager's "skill" coefficient than raw CAPM alpha. The newer Carhart 4-factor adds momentum (MOM/UMD), and the FF 5-factor adds profitability (RMW) and investment (CMA). The estimation procedure is identical: just add columns to the regression.
Exercise 5.3: Rolling 60-day beta of a stock against the market
Task: A portfolio risk dashboard plots the time-varying beta of every holding so the PM can see when a name's sensitivity to the market drifts up or down. From the inline 250-day market and stock return tibble, compute a rolling 60-day OLS beta using zoo::rollapplyr and return the tibble augmented with a beta_60d column, saved to ex_5_3.
Expected result:
#> # A tibble: 250 x 4
#> day mkt stock beta_60d
#> <int> <dbl> <dbl> <dbl>
#> 1 1 0.0073 0.0095 NA
#> # 58 more rows hidden
#> 60 60 0.0011 0.00170 1.18
#> 61 61 0.0028 0.00380 1.19
#> # 189 more rows hidden
Difficulty: Advanced
A time-varying beta recomputes the same stock-versus-market sensitivity over each trailing window as that window slides forward.
Pass an index vector to zoo::rollapplyr() with width 60 and a helper that returns cov(mkt, stock) / var(mkt) for the slice.
Explanation: Rolling beta uses the closed-form cov(x, y) / var(x) formula rather than calling lm() 191 times, which is roughly 50x faster for long histories. Passing the index vector to rollapplyr is a common idiom for rolling regressions: the helper function looks up the slice itself, which lets you carry extra columns through unchanged. The right-aligned window means today's beta is computed from the last 60 days inclusive, preserving causality.
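The index-passing idiom can be sketched in base R. The data is simulated and hypothetical, `beta_at` is an illustrative helper name, and sapply() stands in for zoo::rollapplyr() so the example runs without zoo.

```r
set.seed(8)
# Hypothetical 250-day market and stock returns.
n     <- 250
mkt   <- rnorm(n, sd = 0.010)
stock <- 1.2 * mkt + rnorm(n, sd = 0.006)
width <- 60

# Helper receives the window's last index and looks up its own slice,
# the same idea as passing an index vector to zoo::rollapplyr().
beta_at <- function(t) {
  idx <- (t - width + 1):t
  cov(mkt[idx], stock[idx]) / var(mkt[idx])
}

beta_60d <- rep(NA_real_, n)
beta_60d[width:n] <- sapply(width:n, beta_at)

sum(!is.na(beta_60d))    # 250 - 60 + 1 windows
```

Each non-missing value matches the slope lm() would return on the same 60-day slice, without fitting a model per window.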
Section 6. End-to-end workflows (3 problems)
Exercise 6.1: Build a one-row daily risk report for the equity book
Task: Every morning the risk team posts a one-row summary to the trading floor: closing P&L in dollars, 30-day annualized volatility, 95% historical VaR in dollars, and current drawdown from running peak. Given the inline 90-day return tibble and a $1,000,000 notional, build a one-row tibble (columns as_of, pnl_dollar, vol_30d_ann, var95_dollar, dd_from_peak) using the most recent day and save it to ex_6_1.
Expected result:
#> # A tibble: 1 x 5
#> as_of pnl_dollar vol_30d_ann var95_dollar dd_from_peak
#> <date> <dbl> <dbl> <dbl> <dbl>
#> 1 2024-04-01 -2104. 0.182 17456 -0.043
Difficulty: Advanced
A daily risk line derives the running metrics across the whole history first, then reports only the figures as of the most recent day.
Build columns with mutate() (rollapplyr for vol, cumprod/cummax for drawdown), compute VaR with quantile(), then slice_tail(n = 1) and transmute().
Explanation: This is what a one-line risk summary looks like in production: a chained mutate that derives wealth, peak, and drawdown columns; a separate quantile for VaR that uses the full history rather than the most recent point; and a slice/transmute that picks the most recent day and rounds for human-readable output. Real desks add stress overlays (rates +100bps, equity -20%), open positions broken out by sector, and an exception flag when any metric breaches a hard limit.
Exercise 6.2: Decompose the worst trading day into per-asset P&L contributors
Task: A multi-asset book had a bad day and the CIO wants a one-page debrief listing each holding's dollar P&L contribution on the worst day, sorted from worst to best. From the inline tibble of daily returns for four positions with their dollar weights, find the day with the most negative portfolio P&L and return a tibble (columns asset, weight_usd, ret, pnl_usd) of the four positions on that day sorted ascending by pnl_usd, saved to ex_6_2.
Expected result:
#> # A tibble: 4 x 4
#> asset weight_usd ret pnl_usd
#> <chr> <dbl> <dbl> <dbl>
#> 1 NVDA 500000 -0.045 -22500
#> 2 AAPL 400000 -0.024 -9600
#> 3 MSFT 300000 -0.012 -3600
#> 4 GOOG 200000 0.005 1000
Difficulty: Advanced
First collapse the book down to one figure per day to locate the worst day, then return to the line items belonging to that single day.
group_by(date) and summarise(sum(weight_usd * ret)), slice_min() then pull() the date, then filter() back to it and arrange() by pnl_usd.
Explanation: The pattern is a two-step pipeline: first reduce to one row per day to find the worst day, then filter back to the line items on that single day and compute contributions. NVDA dominates the loss because it pairs the worst return with the largest dollar weight, which is the standard way concentration risk creates outsized debrief lines. Wider books extend this with sector, currency, and factor cuts of the same per-day P&L.
Exercise 6.3: Detect weight drift versus model targets and flag rebalance candidates
Task: A passive-tilt strategy maintains target weights but tolerates 200 bps of drift before triggering a rebalance trade to control transaction costs. From the inline tibble of current and target weights for six holdings, compute drift in basis points, flag holdings beyond +/- 200bps, and return only the flagged rows (columns ticker, current_w, target_w, drift_bps, action) where action is "BUY" or "SELL", saved to ex_6_3.
Expected result:
#> # A tibble: 2 x 5
#> ticker current_w target_w drift_bps action
#> <chr> <dbl> <dbl> <dbl> <chr>
#> 1 NVDA 0.245 0.20 450 SELL
#> 2 BND 0.130 0.17 -400 BUY
Difficulty: Advanced
Each holding's gap from its target is converted into basis points, and only the holdings that breach the tolerance band are kept.
Compute (current_w - target_w) * 10000, assign BUY/SELL with case_when() against the +/- 200 thresholds, and filter() out the unflagged rows.
Explanation: Tolerance-banded rebalancing is standard in passive and risk-parity strategies because transaction costs make daily rebalancing of every tiny drift uneconomic. Drift in basis points is the natural unit because traders think in bps; converting weight differences to bps and applying a single threshold is faster to reason about than working in decimals. The same pattern extends to factor exposures (drift from target factor loading) and dollar-neutral books (cash drift from target gross or net exposure).
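A base R sketch of the banded flagging logic. The tickers and weights are hypothetical placeholders, nested ifelse() stands in for case_when(), and the round() call is an added safeguard against floating-point noise at the band edge.

```r
# Hypothetical holdings; tickers and weights are made up for illustration.
book <- data.frame(
  ticker    = c("AAA", "BBB", "CCC", "DDD"),
  current_w = c(0.28, 0.24, 0.30, 0.18),
  target_w  = c(0.25, 0.25, 0.25, 0.25)
)
band_bps <- 200

# Drift in basis points, rounded to avoid floating-point noise.
book$drift_bps <- round((book$current_w - book$target_w) * 1e4)

# Overweight beyond the band means sell; underweight beyond it means buy.
book$action <- ifelse(book$drift_bps >  band_bps, "SELL",
               ifelse(book$drift_bps < -band_bps, "BUY", NA))

flagged <- book[!is.na(book$action), ]
```

Here BBB drifts by only -100 bps, so it stays inside the band and drops out of `flagged`.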
What to do next
You have just worked through 25 problems mirroring real desk work. Suggested next steps:
- R Tutorial for the broader foundation in base R that ties returns, vectors, and data frames together.
- dplyr Exercises in R to deepen the data-manipulation idioms used heavily here (rolling, group-by, summarise).
- tidyr Exercises in R for the wide-to-long pivot work that comes up constantly in factor and panel data.
- ggplot2 Exercises in R for charting price paths, drawdown curves, and rolling betas.