generate_piecewise_its_data
- causalpy.data.simulate_data.generate_piecewise_its_data(N=100, interruption_times=None, baseline_intercept=10.0, baseline_slope=0.1, level_changes=None, slope_changes=None, noise_sigma=1.0, seed=None)
Generate piecewise Interrupted Time Series data with known ground truth parameters.
This function creates synthetic data for testing and demonstrating piecewise ITS / segmented regression models. The data follows the model:
y_t = β₀ + β₁t + Σₖ(level_k · I_k(t) + slope_k · R_k(t)) + ε_t
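Where:
- I_k(t) = 1 if t >= T_k else 0 (step function for level change)
- R_k(t) = max(0, t - T_k) (ramp function for slope change)
- T_k is the k-th entry of interruption_times, and level_k and slope_k are the matching entries of level_changes and slope_changes
- ε_t is Gaussian noise with standard deviation noise_sigma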
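The model above can be sketched directly in NumPy to show how the step and ramp terms combine. This is a minimal illustration of the documented equation, not CausalPy's implementation: the function name `simulate_piecewise_its`, the tuple defaults, and the use of `np.random.default_rng` are assumptions made for this example.

```python
import numpy as np


def simulate_piecewise_its(
    N=100,
    interruption_times=(50,),
    baseline_intercept=10.0,
    baseline_slope=0.1,
    level_changes=(5.0,),
    slope_changes=(0.0,),
    noise_sigma=1.0,
    seed=None,
):
    """Hypothetical stand-in that follows the documented equation."""
    rng = np.random.default_rng(seed)  # assumption: the library may use another RNG
    t = np.arange(N)

    # Baseline trend: beta_0 + beta_1 * t (what would happen with no interruption)
    counterfactual = baseline_intercept + baseline_slope * t

    # Sum of level and slope contributions over all interruptions
    effect = np.zeros(N)
    for T_k, level_k, slope_k in zip(interruption_times, level_changes, slope_changes):
        step = (t >= T_k).astype(float)  # I_k(t)
        ramp = np.maximum(0, t - T_k)    # R_k(t)
        effect += level_k * step + slope_k * ramp

    y_true = counterfactual + effect
    y = y_true + rng.normal(0.0, noise_sigma, size=N)  # add epsilon_t
    return t, y, y_true, counterfactual, effect
```

With a level change of 5.0 and a slope change of 0.2 at T = 50, the effect is zero before t = 50, equals the level change at t = 50 (the ramp is still zero there), and then grows by 0.2 per time step.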
- Parameters:
N (int, default=100) – Number of time points in the series.
interruption_times (list[int], optional) – List of time indices where interruptions occur. Defaults to [50].
baseline_intercept (float, default=10.0) – The intercept (β₀) of the baseline trend.
baseline_slope (float, default=0.1) – The slope (β₁) of the baseline trend.
level_changes (list[float], optional) – List of level changes at each interruption. Length must match interruption_times. If None, defaults to [5.0] for a single interruption.
slope_changes (list[float], optional) – List of slope changes at each interruption. Length must match interruption_times. If None, defaults to [0.0] (no slope change).
noise_sigma (float, default=1.0) – Standard deviation of the Gaussian noise.
seed (int, optional) – Random seed for reproducibility.
- Returns:
df (pd.DataFrame) – DataFrame with columns:
- 't': time index (0 to N-1)
- 'y': observed outcome with noise
- 'y_true': outcome without noise (ground truth)
- 'counterfactual': baseline trend without intervention effects
- 'effect': true causal effect at each time point
params (dict) – Dictionary containing the true parameters:
- 'baseline_intercept': β₀
- 'baseline_slope': β₁
- 'level_changes': list of level changes
- 'slope_changes': list of slope changes
- 'interruption_times': list of interruption times
- 'noise_sigma': noise standard deviation
- Return type:
tuple[pd.DataFrame, dict]
Examples
>>> from causalpy.data.simulate_data import generate_piecewise_its_data
>>> # Single interruption with level and slope change
>>> df, params = generate_piecewise_its_data(
...     N=100,
...     interruption_times=[50],
...     level_changes=[5.0],
...     slope_changes=[0.2],
...     seed=42,
... )
>>> df.shape
(100, 5)
>>> # Multiple interruptions
>>> df, params = generate_piecewise_its_data(
...     N=150,
...     interruption_times=[50, 100],
...     level_changes=[3.0, -2.0],
...     slope_changes=[0.1, -0.15],
...     seed=42,
... )
>>> len(params["interruption_times"])
2
>>> # Level change only (no slope change)
>>> df, params = generate_piecewise_its_data(
...     N=100,
...     interruption_times=[50],
...     level_changes=[5.0],
...     slope_changes=[0.0],
...     seed=42,
... )
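Because the true parameters are known, estimates from a fitted model can be checked against them. The following is a minimal sketch of that check, assuming plain ordinary least squares via `numpy.linalg.lstsq` rather than a CausalPy model. The helper `fit_segmented_ols` is hypothetical, and the data is a stand-in built from the documented equation so that the block runs without CausalPy; in practice `t` and `y` would come from `df` and the reference values from `params`.

```python
import numpy as np


def fit_segmented_ols(t, y, interruption_times):
    """Hypothetical helper: OLS fit of the piecewise ITS model."""
    # Design matrix columns: intercept, time, one step per interruption,
    # then one ramp per interruption
    cols = [np.ones_like(t, dtype=float), t.astype(float)]
    for T_k in interruption_times:
        cols.append((t >= T_k).astype(float))
    for T_k in interruption_times:
        cols.append(np.maximum(0, t - T_k).astype(float))
    X = np.column_stack(cols)

    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    K = len(interruption_times)
    return {
        "baseline_intercept": coef[0],
        "baseline_slope": coef[1],
        "level_changes": coef[2 : 2 + K],
        "slope_changes": coef[2 + K :],
    }


# Stand-in data following the documented model (assumption: not generated
# by CausalPy, so the values differ from the function's seeded output)
rng = np.random.default_rng(42)
t = np.arange(100)
y = (
    10.0
    + 0.1 * t
    + 5.0 * (t >= 50)
    + 0.2 * np.maximum(0, t - 50)
    + rng.normal(0.0, 1.0, size=100)
)

estimates = fit_segmented_ols(t, y, interruption_times=[50])
print(estimates["level_changes"], estimates["slope_changes"])
```

The estimates should land near the true level change (5.0) and slope change (0.2), with the gap shrinking as noise_sigma decreases or N grows.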