Mode-Mixing due to PCA

In this notebook, we give a synopsis of the mode-mixing effect described in Chen 2025, and how to use meer21cm to generate mode-mixing effects that you can apply to your power spectrum modelling.

[42]:

from meer21cm import MockSimulation
import numpy as np
import matplotlib.pyplot as plt
from meer21cm.transfer import analytic_transfer_function
from meer21cm.util import redshift_to_freq, pca_clean

Let us first generate some simple HI simulation in a rectangular box to demonstrate this effect. To capture the survey volume more or less, consider a 50 x 50 deg^2 area from z~0.5-1.5:

[8]:

z_min = 0.5
z_max = 1.5
nu_max = redshift_to_freq(z_min)
nu_min = redshift_to_freq(z_max)
nu = np.linspace(nu_min,nu_max,100)

mock = MockSimulation(
    nu=nu,
    tracer_bias_1=1.5,
    mean_amp_1='average_hi_temp',
    omega_hi=5e-4,
)
# roughly 50 deg
box_xy = 50 * np.pi / 180 * mock.comoving_distance(mock.z_ch.mean()).value #Mpc
box_z = (mock.comoving_distance(mock.z_ch.max()) - mock.comoving_distance(mock.z_ch.min())).value #Mpc
mock._box_len = np.array([box_xy,box_xy,box_z])
mock._box_ndim = np.array([100,100,100])
mock.propagate_field_k_to_model()

meer21cm can generate some HI box using lognormal simulations:

[10]:

mock.mock_tracer_field_1;
plt.imshow(mock.mock_tracer_field_1[0])

[10]:

<matplotlib.image.AxesImage at 0x1125790f0>

You can pass the lognormal simulation back to meer21cm for power spectrum estimation:

[21]:

mock.field_1 = mock.mock_tracer_field_1
mock.k1dbins = np.geomspace(5e-3, 0.2, 21)
pmodel_1d, keff, _ = mock.get_1d_power(mock.auto_power_tracer_1_model)
pmock_1d, _, _ = mock.get_1d_power(mock.auto_power_3d_1)
plt.plot(keff,pmodel_1d,label='model')
plt.plot(keff,pmock_1d,label='mock')
plt.legend()
plt.xscale('log')
plt.yscale('log')
plt.xlabel('k [Mpc-1]')
plt.ylabel('P(k) [K^2 Mpc^3]')

[21]:

Text(0, 0.5, 'P(k) [K^2 Mpc^3]')

Let us run 100 realisations to retrieve some covariance:

[27]:

pmock_arr = []
for seed in range(100):
    mock.seed = seed
    # trigger a reset of cache to rerun mock
    mock.tracer_bias_1 = 1.5
    mock.field_1 = mock.mock_tracer_field_1
    pmock_1d, _, _ = mock.get_1d_power(mock.auto_power_3d_1)
    pmock_arr.append(pmock_1d)

pmock_arr = np.array(pmock_arr)

[33]:

plt.errorbar(keff,pmock_arr.mean(axis=0),yerr=pmock_arr.std(axis=0),label='mock',)
plt.plot(keff,pmodel_1d,label='model',ls='--',lw=4)
plt.legend()
plt.xscale('log')
plt.yscale('log')
plt.xlabel('k [Mpc-1]')
plt.ylabel('P(k) [K^2 Mpc^3]')

[33]:

Text(0, 0.5, 'P(k) [K^2 Mpc^3]')

../_images/cookbook_mode_mixing_10_1.png

[37]:

cov_mock = np.cov(pmock_arr.T)
corr_mock = np.corrcoef(pmock_arr.T)

The correlation between the k-bins is strong at large k. This is a feature of lognormal mocks:

[39]:

plt.imshow(corr_mock,origin='lower')
plt.colorbar()

[39]:

<matplotlib.colorbar.Colorbar at 0x14546fb80>

../_images/cookbook_mode_mixing_13_1.png

Now let us do some naive PCA. We will simply at a uniform fg with constant spectral index. To illustrate the effect, let’s also remove a large number of modes (15):

[76]:

fg = 20 * (nu[None,None,:]/408/1e6)**(-2.7) * np.ones(mock.box_ndim)
N_fg = 15
res_map, A_mat = pca_clean(mock.mock_tracer_field_1+fg, N_fg, return_A=True)
plt.imshow(mock.mock_tracer_field_1[0],origin='lower')
plt.colorbar()
plt.figure()
plt.imshow(res_map[0],origin='lower')
plt.colorbar()

[76]:

<matplotlib.colorbar.Colorbar at 0x1476127d0>

../_images/cookbook_mode_mixing_15_1.png

../_images/cookbook_mode_mixing_15_2.png

We can compare the cleaned ps with original:

[77]:

mock.field_1 = mock.mock_tracer_field_1
pmock_orig, keff, _ = mock.get_1d_power(mock.auto_power_3d_1)
mock.field_1 = res_map
pmock_pca, keff, _ = mock.get_1d_power(mock.auto_power_3d_1)

[78]:

plt.plot(keff,pmock_pca,label='pca',ls='-',lw=2)
plt.plot(keff,pmock_orig,label='original',ls='--',lw=2)
plt.legend()
plt.xscale('log')
plt.yscale('log')
plt.xlabel('k [Mpc-1]')
plt.ylabel('P(k) [K^2 Mpc^3]')

[78]:

Text(0, 0.5, 'P(k) [K^2 Mpc^3]')

../_images/cookbook_mode_mixing_18_1.png

You can see that there is a “signal loss” (which is in fact both signal loss and mode-mixing).

Following the derivation in Chen 2025, meer21cm provides a functionality to calculate the window function and convolve it to the model:

[79]:

R_pca = np.eye(A_mat.shape[0]) - A_mat @ A_mat.T
tf, wab = analytic_transfer_function(R_pca, R_pca)
hab = np.diag(1/tf) @ wab
pmodel_pca_corrected, keff, _ = mock.get_1d_power(
    np.einsum('ij, abj->abi',hab, mock.auto_power_tracer_1_model)
)

[83]:

plt.plot(keff,pmock_pca,label='pca mock',ls='-',lw=2)
plt.plot(keff,pmodel_1d,label='uncorrected model',ls='--',lw=2)
plt.plot(keff,pmodel_pca_corrected,label='pca corrected model',ls=':',lw=2)

plt.legend()
plt.xscale('log')
plt.yscale('log')
plt.xlabel('k [Mpc-1]')
plt.ylabel('P(k) [K^2 Mpc^3]')

[83]:

Text(0, 0.5, 'P(k) [K^2 Mpc^3]')

../_images/cookbook_mode_mixing_21_1.png

Note again, that there is not just a signal loss but also a mode-mixing. Let us rerun the covariance calculation:

[81]:

pmock_pca_arr = []
for seed in range(100):
    mock.seed = seed
    # trigger a reset of cache to rerun mock
    mock.tracer_bias_1 = 1.5
    res_map = pca_clean(mock.mock_tracer_field_1 + fg, N_fg)
    mock.field_1 = res_map
    pmock_1d, _, _ = mock.get_1d_power(mock.auto_power_3d_1)
    pmock_pca_arr.append(pmock_1d)

pmock_pca_arr = np.array(pmock_pca_arr)

[84]:

plt.errorbar(keff,pmock_pca_arr.mean(axis=0),yerr=pmock_pca_arr.std(axis=0),label='pca mock',ls='-',lw=2)
plt.plot(keff,pmodel_pca_corrected,label='pca corrected model',ls='--',lw=4)

plt.legend()
plt.xscale('log')
plt.yscale('log')
plt.xlabel('k [Mpc-1]')
plt.ylabel('P(k) [K^2 Mpc^3]')

[84]:

Text(0, 0.5, 'P(k) [K^2 Mpc^3]')

../_images/cookbook_mode_mixing_24_1.png

[85]:

corr_pca_mock = np.corrcoef(pmock_pca_arr.T)

[90]:

fig,axes = plt.subplots(1,2,figsize=(10,5))
axes[0].pcolormesh(mock.k1dbins, mock.k1dbins, corr_mock,vmin=-1,vmax=1,cmap='RdBu_r')
axes[0].set_xscale('log')
axes[0].set_yscale('log')
axes[0].set_title('mock')
axes[1].pcolormesh(mock.k1dbins, mock.k1dbins, corr_pca_mock,vmin=-1,vmax=1,cmap='RdBu_r')
axes[1].set_xscale('log')
axes[1].set_yscale('log')
axes[1].set_title('pca mock')

[90]:

Text(0.5, 1.0, 'pca mock')

../_images/cookbook_mode_mixing_26_1.png

You can see that there is visible correlation at relatively large scales, induced by PCA.