Example 3

This example reproduces Figure 5a-c from the main text. We load two sequences of models optimized on two independent data samples generated from the same ramping model. We then calculate the feature complexity of each model and perform feature consistency analysis to select the optimal model. The analysis is performed for non-stationary data.

[1]:
import neuralflow
from neuralflow.utilities.FC_nonstationary import JS_divergence_tdp, FeatureComplexities, FeatureComplexity
import numpy as np
import matplotlib.pyplot as plt, matplotlib.gridspec as gridspec

Step 1: Load data and calculate Feature complexities (FC)

In this example, data will be loaded from npz files. To obtain these files, we generated 400 trials of data from the ramping dynamics with the following parameters: 'peq_model': {"model": "linear_pot", "params": {"slope": -5}}, 'p0_model': {'model': 'single_well', 'params': {'miu': 200, 'xmin': -0.3}}, 'D0': 0.1. We then split the data into two independent halves, \(D=D_1+D_2\), where \(D_1\) contains the first 200 trials and \(D_2\) contains the remaining 200 trials. We fit a separate model to each dataset by performing 10,000 gradient-descent iterations. Here we only fit the potential function and fix the rest of the parameters (performing 10,000 iterations can take ~5-10 hours, so in this example we provide the precomputed optimization results for loading).
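
Throughout this example, the quantity being fitted is the equilibrium distribution \(p_{\rm eq}(x)\); the potentials shown in the figures are obtained from it via the standard relation (this is exactly what the plotting code in Step 3 computes with -np.log(peq)):

\[
\Phi(x) = -\ln p_{\rm eq}(x), \qquad p_{\rm eq}(x) \propto e^{-\Phi(x)},
\]

so the ground-truth ramping model defined by linear_pot corresponds to a linear potential \(\Phi(x)\).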

Fitting results were saved in two data files (one for each optimization). Each data file contains the following entries (a short inspection snippet follows the list):

iter_num: array of integers, iteration numbers on which peqs were recorded,

peqs: 2D array of fitted peqs (only recorded at iterations specified by iter_num array),

logliks: negative training loglikelihoods recorded at each iteration,

logliksCV: negative validation loglikelihoods recorded at each iteration (not used in this example).
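
Before running the analysis, the contents of these files can be inspected directly. A minimal sketch (using the same file path as in the cell below):

import numpy as np
data1 = dict(np.load('data/Ex3_datasample1.npz', allow_pickle=True))
#Expected keys: 'iter_num', 'peqs', 'logliks', 'logliksCV'
print(list(data1.keys()))
#'peqs' stores one fitted peq per recorded iteration; 'iter_num' gives the
#corresponding iteration numbers, so their last dimensions should match.
print(data1['iter_num'].shape, data1['peqs'].shape, data1['logliks'].shape)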

  1. Initialize an instance of the EnergyModel class. This is needed for the feature complexity calculation.

  2. Load data files with the fitting results and convert them into dictionaries.

  3. To reduce the computation time, calculate feature complexity only for a selected subset of iterations (i.e., a subset of the models obtained with GD). For these selected models, calculate the feature complexities and the eigenvalues and eigenvectors of the operator H0.

[2]:
EnergyModelParams = {'pde_solve_param':{'method':{'name': 'SEM', 'gridsize': {'Np': 8, 'Ne': 16}}},
               'Nv': 111,
               'peq_model':{'model': 'uniform', 'params': {}},
               'p0_model': {'model': 'single_well', 'params': {'miu': 200, 'xmin': -0.3}},
               'D0': 0.1,
               'boundary_mode':'absorbing',
               'num_neuron':1,
               'firing_model':[{'model': 'linear','params': {'r_slope': 50, 'r_bias': 60}}],
               'verbose':True
               }

#An instance of EnergyModel class that will be used for FC analysis.
em_fitting = neuralflow.EnergyModel(**EnergyModelParams)

#Load data
data1 = dict(np.load('data/Ex3_datasample1.npz',allow_pickle=True))
data2 = dict(np.load('data/Ex3_datasample2.npz',allow_pickle=True))

#Only consider the models at selected iterations (log-spaced between 1 and 10,000, plus iteration 0).
number_of_models = 100
iterations_selected=np.unique(np.concatenate((np.array([0]),np.logspace(0,4,number_of_models))).astype(int))
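#For each selected iteration, find the index of the closest recorded iteration in iter_num.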
iteration_indeces = np.argmin(np.abs(np.subtract.outer(data1['iter_num'],iterations_selected)),axis=0)

#For each model calculate feature complexity, and the eigenvalues and eigenvectors of the operator H0
FCs_array1, lQ_array1, Qx_array1=FeatureComplexities(data1,em_fitting,iteration_indeces)
FCs_array2, lQ_array2, Qx_array2=FeatureComplexities(data2,em_fitting,iteration_indeces)
  File "/var/folders/xt/nql8j4c959l6b4s54h4l91340000gn/T/ipykernel_59170/1492902737.py", line 18
    peq_gt = neuralflow.peq_models.linear_pot(em_fitting.x_d_,em_fitting.w_d_,=2.65)
                                                                              ^
SyntaxError: invalid syntax

Step 2: Calculate JSD for each level of Feature complexity

Here we follow the algorithm described in the Methods of Genkin, Hughes, Engel, Nat. Commun. 2021. For each model in FCs_array1, we consider \(2\,\mathrm{FC_{stride}}+1\) models in FCs_array2 with similar feature complexities. We calculate the JS divergences between the first model and all of the considered models from FCs_array2 and select the minimal value. We repeat this for each model in FCs_array1. We also save the indices of the optimal models for each level of feature complexity (min_inds1 and min_inds2, which allow us to find the corresponding pair of models in the data1/data2 arrays for each level of feature complexity).
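
For reference, the Jensen-Shannon divergence between two distributions \(P\) and \(Q\) is

\[
\mathrm{JS}(P\|Q) = \tfrac{1}{2}\,\mathrm{KL}\Big(P\,\Big\|\,\tfrac{P+Q}{2}\Big) + \tfrac{1}{2}\,\mathrm{KL}\Big(Q\,\Big\|\,\tfrac{P+Q}{2}\Big),
\]

where \(\mathrm{KL}\) is the Kullback-Leibler divergence. JS_divergence_tdp evaluates its time-dependent generalization between the latent-trajectory distributions of the two non-stationary models (see the paper's Methods for the exact form).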

[3]:
FC_stride = 5

# Preallocate the arrays
JS = np.zeros(FCs_array1.size)
min_inds1=np.zeros_like(JS)
min_inds2=np.zeros_like(JS)

for i in range(FCs_array1.size):
    # We only consider 1 model from the sequence of models optimized on data sample 1.
    FC1_ind = np.array([i])
    # Find the index of the model in the second sequence whose FC is closest to that of model i.
    ind2 = np.argmin(np.abs(FCs_array1[i]-FCs_array2))
    # Consider up to 2*FC_stride+1 models from the second sequence with similar FC by selecting
    # the models around the index ind2 (making sure the indices stay in the valid range).
    FC2_ind = np.array(np.arange(max(0,ind2-FC_stride),min(FCs_array2.size-1,ind2+1+FC_stride)))
    #Compute JSD for each pair of models selected from the sequences 1 and 2.
    JS_cur = np.zeros((FC1_ind.size,FC2_ind.size))
    for i1,ind1 in enumerate(FC1_ind):
        for i2, ind2 in enumerate(FC2_ind):
            peq_ind1=iteration_indeces[ind1]
            peq_ind2=iteration_indeces[ind2]
            JS_cur[i1,i2]=JS_divergence_tdp(data1['peqs'][...,peq_ind1],em_fitting.D_,em_fitting.p0_,
                                            data2['peqs'][...,peq_ind2],em_fitting.D_,em_fitting.p0_,
                                            em_fitting.w_d_, lQ_array1[ind1,:], Qx_array1[ind1,:],
                                            lQ_array2[ind2,:], Qx_array2[ind2,:],1,10)
    #Find the minimal JSD and assign it to the current feature complexity
    m1,m2=np.unravel_index(np.argmin(JS_cur, axis=None), JS_cur.shape)
    min_inds1[i] = iteration_indeces[FC1_ind[m1]]
    min_inds2[i] = iteration_indeces[FC2_ind[m2]]
    JS[i] = np.min(JS_cur)

Step 3: Visualize the results.

To visualize the results, we first choose a JS divergence threshold and find the index at which the JSD first exceeds this threshold. This index corresponds to the pair of models with the optimal feature complexity, which we select and compare against the ground-truth model. To illustrate the process of model selection, we also plot a pair of models at an early feature complexity (FC_early) and a late feature complexity (FC_late). As expected, at small feature complexity both potentials tightly overlap but are underfitted and do not overlap with the ground truth. At large feature complexity the potentials are overfitted and contain features not present in the ground-truth model.
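
The threshold-crossing logic used in the cell below can be illustrated on a toy JSD array: np.where(JS > JS_thres)[0][0] returns the first index at which the JSD exceeds the threshold, and subtracting one selects the last model before the crossing. A minimal sketch with made-up numbers:

import numpy as np
JS_toy = np.array([1e-5, 2e-4, 8e-4, 3e-3, 5e-2])   #made-up JSD values
JS_thres = 0.001
optimal_ind = np.where(JS_toy > JS_thres)[0][0] - 1
print(optimal_ind)   #2: the last index where the JSD is still below the threshold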

[4]:
JS_thres=0.001
#Index of the optimal models that corresponds to a maximum FC before exceeding the threshold
optimal_ind=np.where(JS>JS_thres)[0][0]-1
#Optimal FC
FC_opt = FCs_array1[optimal_ind]
#For illustration purposes, choose early and late FCs.
FC_late=3.5
late_ind = np.where(FCs_array1>FC_late)[0][0]
FC_early=1.8
early_ind = np.where(FCs_array1>FC_early)[0][0]

#Create a ground-truth model, calculate its Feature Complexity.
EnergyModelParams = {'pde_solve_param':{'method':{'name': 'SEM', 'gridsize': {'Np': 8, 'Ne': 16}}},
               'Nv': 111,
               'peq_model':{"model": "linear_pot", "params": {"slope": -5}},
               'p0_model': {'model': 'single_well', 'params': {'miu': 200, 'xmin': -0.3}},
               'D0': 0.1,
               'boundary_mode':'absorbing',
               'num_neuron':1,
               'firing_model':[{'model': 'linear','params': {'r_slope': 50, 'r_bias': 60}}],
               'verbose':True
               }
em_gt = neuralflow.EnergyModel(**EnergyModelParams)
FC_gt = FeatureComplexity(em_gt)

#Visualization
fig=plt.figure(figsize=(20,7));
gs=gridspec.GridSpec(2,6,height_ratios=[3,2],hspace=0.5, wspace=0.5);
line_colors = [[239/255, 48/255, 84/255], [0, 127/255, 1], [0.5, 0.5, 0.5]]
dot_colors = [[0.6,0.6,0.6], [1, 169/255, 135/255],  [147/255, 192/255, 164/255]]

ax = plt.subplot(gs[0,:3])
ax.set_title('Feature complexity grows with iteration number',fontsize=18)
ax.plot(data1['iter_num'][iteration_indeces],FCs_array1,color=line_colors[0],linewidth=3,label='Data sample 1')
ax.plot(data2['iter_num'][iteration_indeces],FCs_array2,color=line_colors[1],linewidth=3,label='Data sample 2')
ax.hlines(FC_gt,data1['iter_num'][iteration_indeces][0],data1['iter_num'][iteration_indeces][-1],color=line_colors[2],linewidth=2,label='Ground truth')
ax.legend(fontsize=12)
plt.xscale('log')
plt.xlabel('Iteration number', fontsize=14)
plt.ylabel('Feature complexity', fontsize=14)

ax=plt.subplot(gs[0,3:])
ax.set_title('JS divergence as a function of feature complexity',fontsize=18)
ax.plot(FCs_array1,JS, color = [0.47, 0.34, 0.66],linewidth=3)
ax.plot(FC_early,JS[np.argmin(np.abs(FCs_array1-FC_early))],'.',markersize=20,color=dot_colors[0])
ax.plot(FC_opt,JS[np.argmin(np.abs(FCs_array1-FC_opt))],'.',markersize=20,color=dot_colors[1])
ax.plot(FC_late,JS[np.argmin(np.abs(FCs_array1-FC_late))],'.',markersize=20,color=dot_colors[2])
ax.hlines(JS_thres,FCs_array1[0],FCs_array1[-1],linestyles='dashed',color=line_colors[2],linewidth=2,label='JSD threshold')
plt.xlabel('Feature complexity',fontsize=14)
plt.ylabel('JS divergence', fontsize=14)


ax=plt.subplot(gs[1,:2])
ax.set_title(r'$\mathcal{M}<\mathcal{M}^*$', fontsize=18)
ax.plot(em_gt.x_d_,-np.log(data1['peqs'][...,int(min_inds1[early_ind])]),color=line_colors[0],linewidth=3)
ax.plot(em_gt.x_d_,-np.log(data2['peqs'][...,int(min_inds2[early_ind])]),color=line_colors[1],linewidth=3)
ax.plot(em_gt.x_d_,-np.log(em_gt.peq_),color=[0.5, 0.5, 0.5],linewidth=2)
plt.xlabel(r'Latent state, $x$', fontsize=14)
plt.ylabel(r'Potential, $\Phi(x)$', fontsize=14)

ax=plt.subplot(gs[1,2:4])
ax.set_title(r'$\mathcal{M}=\mathcal{M}^*$', fontsize=18)
ax.plot(em_gt.x_d_,-np.log(data1['peqs'][...,int(min_inds1[optimal_ind])]),color=line_colors[0],linewidth=3)
ax.plot(em_gt.x_d_,-np.log(data2['peqs'][...,int(min_inds2[optimal_ind])]),color=line_colors[1],linewidth=3)
ax.plot(em_gt.x_d_,-np.log(em_gt.peq_),color=[0.5, 0.5, 0.5],linewidth=2)
plt.xlabel(r'Latent state, $x$', fontsize=14)
plt.ylabel(r'Potential, $\Phi(x)$', fontsize=14)

ax=plt.subplot(gs[1,4:])
ax.set_title(r'$\mathcal{M}>\mathcal{M}^*$', fontsize=18)
ax.plot(em_gt.x_d_,-np.log(data1['peqs'][...,int(min_inds1[late_ind])]),color=line_colors[0],linewidth=3)
ax.plot(em_gt.x_d_,-np.log(data2['peqs'][...,int(min_inds2[late_ind])]),color=line_colors[1],linewidth=3)
ax.plot(em_gt.x_d_,-np.log(em_gt.peq_),color=[0.5, 0.5, 0.5],linewidth=2)
plt.xlabel(r'Latent state, $x$', fontsize=14)
plt.ylabel(r'Potential, $\Phi(x)$', fontsize=14)


The code above should produce a figure reproducing Figure 5a-c of the main text: feature complexity as a function of iteration number for the two data samples, JS divergence as a function of feature complexity with the selection threshold, and the fitted potentials at early, optimal, and late feature complexity compared with the ground truth.