%load_ext autoreload
%autoreload 2
preprocess_bold_fmri
preprocessed a BOLD Nifti and returns a numpy ndarray of the optionally masked and preprocessed fMRI data.
test = [1,2,3]
Example
make_X_Y
allows you to align the (preprocessed) fMRI and stimulus data by specifying fMRI TR
and stimulus stim_TR
, as well as the lag_time
(how long a stimulus window should be in seconds to predict a single fMRI TR) and potential stimulus offsets.
Since we potentially want to preprocess and concatenate multiple runs, both fmri
and stimuli
are supposed to be lists. To process only a single run, you can use a list of one element.
Let's look at an example, where the stimulus is sample every 100 ms and fMRI every 2s, i.e. every fMRI sample corresponds to 20 stimulus samples.
stim_TR, TR = 0.1, 2
Now create a simulated stimulus
object of 80 samples.
stimulus = np.tile(np.arange(80)[:, None], (1, 1))
print(stimulus.shape)
And an according fmri
object of 4 samples and one voxel (since we TRs differ).
fmri = np.tile(np.arange(0, 4)[:, None], (1, 1))
print(fmri.shape)
Let's first align fMRI
and stimulus
without any offset or lag:
X, y = make_X_Y([stimulus], [fmri], TR, stim_TR, lag_time=None, offset_stim=0, start_times=[0])
assert X.shape == (4, 20)
assert y.shape == (4, 1)
We keep the original number of samples in fMRI, but represent stimulus (and hence X) by the number of samples per fmri TR: stimulus thus becomes a (4, 20) array.
Lagging the stimulus
We can now call make_X_Y
with the stimulus and fMRI TRs and a specified lag_time
.
Here we want to use 4 seconds of the stimulus to predict fMRI, but do not want to shift fmri
relative to stimulus
(offset_stim
is 0.).
This means that our encoding model can approximate a hemodynamic response function (HRF) by estimating a finite impulse response (FIR) that is 4 seconds long.
X, y = make_X_Y([stimulus], [fmri], TR, stim_TR, lag_time=4, offset_stim=0, start_times=[0])
assert X.shape == (3, 40)
assert y.shape == (3, 1)
Shifting the stimulus
We could also shift fmri
relative to stimulus
, to account for the delayed onset of the hemodynamic response - this is different than estimating the hemodynamic response from the window given by lag_time
.
In practice this means we estimate an hemodynamic response function (HRF) by a FIR in the time period from -6s to -2s before each fMRI sample.
X, y = make_X_Y([stimulus], [fmri], TR, stim_TR, lag_time=4, offset_stim=2, start_times=[0])
assert X.shape == (2, 40)
assert y.shape == (2, 1)
Handling out-of-recording data
Because of our shift we "lose" one sample, because by default fill_value
fills values that lie outside the recording interval by NaNs and by default remove_nans
specifies that all samples with NaNs are dropped.
To check that behavior, we see what we get when we don't remove NaNs:
X, y = make_X_Y([stimulus], [fmri], TR, stim_TR, lag_time=4, offset_stim=2, start_times=[0], remove_nans=False)
assert X.shape == (4, 40)
assert y.shape == (4, 1)
We keep the original number of samples, but some are filled with NaNs now:
assert np.isnan(X).sum() == 60
print(X)
We can see that the first samples completely consists of NaNs, because by lagging and offsetting we assume that the fMRI sample at time point t can be predicted by the time period in the stimulus of t-6s to t-2s. However, we don't have any stimulus presented in that time! In the second sample we can see that the first half of the stimulus still consists of NaNs: that's because for t=2s, the time period in the stimulus from t-6s to t-2s has only data for t=0s but not t=4s. Keep in mind that the stimulus at t=0s corresponds to the first 2s of the stimulus (because we reshaped the stimulus TR to correspond to the 2s fmri TR).
generate_lagged_stimulus
takes care of aligning fMRI and stimulus data, it is used internally by make_X_Y
.