Neuroscience and brain-computer interfaces (BCIs) rely heavily on the recording and analysis of neural activity [1]. Capturing neural signals from the motor cortex of non-human primates during tasks such as playing video games generates a substantial volume of data [2]. For instance, the N1 implant generates data at a rate of about 200 Mbps (1024 electrodes sampled at 20 kHz with 10-bit resolution), but it can only transmit at 1 Mbps wirelessly [3]. This discrepancy necessitates a compression ratio exceeding 200x to enable real-time data transmission. Additionally, the compression algorithm must process data within 1 millisecond (ms) and consume less than 10 milliwatts (mW) of power, including radio transmission [4].
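As a quick sanity check on these figures, a back-of-the-envelope calculation using only the implant parameters quoted above gives the raw data rate and the required compression ratio:

# Back-of-the-envelope check of the raw data rate and required compression ratio,
# using the implant parameters quoted above (1024 electrodes, 20 kHz, 10-bit samples, ~1 Mbps link).
ELECTRODES = 1024
SAMPLE_RATE_HZ = 20_000
BITS_PER_SAMPLE = 10
WIRELESS_BUDGET_BPS = 1_000_000

raw_rate_bps = ELECTRODES * SAMPLE_RATE_HZ * BITS_PER_SAMPLE   # 204,800,000 bps (~205 Mbps)
required_ratio = raw_rate_bps / WIRELESS_BUDGET_BPS            # ~205x

print(f"Raw data rate: {raw_rate_bps / 1e6:.1f} Mbps")
print(f"Required compression ratio: {required_ratio:.0f}x")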
Current neural data compression techniques include lossy and lossless methods. Lossless techniques preserve the original data exactly but often achieve lower compression ratios [5]. Lossy techniques, such as Discrete Cosine Transform (DCT) and wavelet-based methods, achieve higher compression ratios at the expense of data fidelity [6]. Recent advancements include the use of machine learning models such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) for neural data compression [7]. However, these methods often face challenges such as mode collapse (in the case of GANs) and high computational requirements. The proposed VQ-CAE method seeks to address these issues by combining Vector Quantized Variational Autoencoders (VQ-VAEs) with Convolutional Autoencoders (CAEs) to achieve a high compression ratio while maintaining the integrity of neural data.
The primary objective of this study is to propose a novel hybrid compression technique named "VQ-CAE: Vector Quantized Convolutional Autoencoder" to compress neural recordings from the motor cortex of a non-human primate. The goal is to achieve a high compression ratio while preserving the integrity of the neural data for subsequent analysis and interpretation.
The VQ-VAE consists of an encoder, a codebook, and a decoder. The encoder maps the input data x to a latent space representation z_e:
import torch
import torch.nn as nn


class VQVAE(nn.Module):
    def __init__(self, input_dim, latent_dim, num_embeddings, embedding_dim, commitment_cost=0.25):
        super(VQVAE, self).__init__()
        # Encoder: maps input samples x to a latent representation z_e.
        # Note: latent_dim must equal embedding_dim so z_e can be compared to codebook vectors.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 512),
            nn.ReLU(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, latent_dim)
        )
        # Codebook of discrete embedding vectors used for quantization.
        self.codebook = nn.Embedding(num_embeddings, embedding_dim)
        self.codebook.weight.data.uniform_(-1 / num_embeddings, 1 / num_embeddings)
        # Decoder: reconstructs the input from the quantized latent code.
        self.decoder = nn.Sequential(
            nn.Linear(embedding_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, input_dim),
            nn.Tanh()
        )
        self.commitment_cost = commitment_cost

    def encode(self, x):
        return self.encoder(x)

    def decode(self, z):
        return self.decoder(z)

    def quantize(self, encoding_indices):
        return self.codebook(encoding_indices)

    def forward(self, x):
        z = self.encode(x)
        # Nearest-neighbour lookup: index of the closest codebook vector for each latent vector.
        encoding_indices = torch.argmin(
            torch.sum((z.unsqueeze(1) - self.codebook.weight) ** 2, dim=2), dim=1)
        z_q = self.quantize(encoding_indices)
        # Commitment loss keeps the encoder output close to the selected codebook vector.
        commitment_loss = self.commitment_cost * torch.mean((z_q.detach() - z) ** 2)
        # Straight-through estimator: gradients bypass the non-differentiable quantization step.
        z_q = z + (z_q - z).detach()
        x_recon = self.decode(z_q)
        return x_recon, z, encoding_indices, commitment_loss
Here, the encoder converts the input data to a latent representation, which is then quantized using the codebook. The decoder reconstructs the data from the quantized latent code.
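As a quick smoke test, the module above can be exercised on a dummy batch. The dimensions below are illustrative placeholders, not values fixed by the design; note that latent_dim and embedding_dim are chosen equal so the nearest-neighbour lookup is well defined.

# Illustrative smoke test with placeholder dimensions.
import torch

model = VQVAE(input_dim=1024, latent_dim=64, num_embeddings=512, embedding_dim=64)
x = torch.randn(8, 1024)  # batch of 8 flattened neural-signal windows

x_recon, z, indices, commit = model(x)
print(x_recon.shape, z.shape, indices.shape)  # (8, 1024), (8, 64), (8,)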
The CAE consists of an encoder and a decoder, both implemented using convolutional layers. The CAE encoder further compresses the reconstructed data x̂ from the VQ-VAE into a lower-dimensional latent space representation z_c:
class CAE(nn.Module):
    def __init__(self, input_dim, latent_dim):
        super(CAE, self).__init__()
        # Encoder: three stride-2 convolutions reduce the temporal length by a factor of 8.
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv1d(32, latent_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU()
        )
        # Decoder: mirrored transposed convolutions upsample back to the original length.
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(latent_dim, 32, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(32, 16, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(16, 1, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.Tanh()
        )

    def forward(self, x):
        z = self.encoder(x)
        x_recon = self.decoder(z)
        return x_recon
The CAE encoder compresses the data further, and the decoder reconstructs it back to the original data space.
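For intuition, a minimal shape check is shown below; the signal length of 1024 is an illustrative placeholder, and any length divisible by 8 round-trips to the same size.

# Illustrative shape check: three stride-2 convolutions downsample the signal length by 8x.
import torch

cae = CAE(input_dim=1024, latent_dim=8)  # input_dim is unused internally; kept for interface symmetry
x = torch.randn(8, 1, 1024)              # (batch, channels, signal length)
z = cae.encoder(x)
x_recon = cae(x)
print(z.shape, x_recon.shape)            # torch.Size([8, 8, 128]) torch.Size([8, 1, 1024])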
The VQ-CAE hybrid model is trained using a custom loss function that combines the CAE and VQ-VAE reconstruction losses with the codebook (vector-quantization) loss and the commitment loss, with the VQ-VAE terms scaled by a weighting factor alpha:
class HybridModel(nn.Module):
    def __init__(self, input_dim, latent_dim_vqvae, latent_dim_cae, num_embeddings, embedding_dim, commitment_cost=0.25):
        super(HybridModel, self).__init__()
        self.vqvae = VQVAE(input_dim, latent_dim_vqvae, num_embeddings, embedding_dim, commitment_cost)
        self.cae = CAE(latent_dim_vqvae, latent_dim_cae)  # the CAE's first argument is not used internally

    def forward(self, x):
        # Stage 1: the VQ-VAE reconstructs the input from a discrete latent code.
        x_recon_vqvae, z_vqvae, encoding_indices, commitment_loss = self.vqvae(x)
        # Stage 2: the CAE further compresses the VQ-VAE reconstruction (channel dim added for Conv1d).
        z_cae = self.cae.encoder(x_recon_vqvae.unsqueeze(1))
        x_recon_cae = self.cae.decoder(z_cae)
        return x_recon_cae.squeeze(1), x_recon_vqvae, z_vqvae, encoding_indices, commitment_loss


def hybrid_loss(model, x, x_recon_cae, x_recon_vqvae, z_vqvae, encoding_indices, commitment_loss, alpha=1.0):
    # Reconstruction terms for both stages.
    recon_loss_vqvae = nn.functional.mse_loss(x_recon_vqvae, x)
    recon_loss_cae = nn.functional.mse_loss(x_recon_cae, x)
    # Codebook (VQ) loss: pulls the selected codebook vectors toward the encoder outputs.
    z_q = model.vqvae.quantize(encoding_indices)
    vq_loss = torch.mean((z_vqvae.detach() - z_q.float()) ** 2)
    commitment_loss = torch.mean(commitment_loss)
    # CAE reconstruction is the primary objective; the VQ-VAE terms are weighted by alpha.
    total_loss = recon_loss_cae + alpha * (recon_loss_vqvae + vq_loss + commitment_loss)
    return total_loss
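For orientation, a single training step might look like the following sketch; the dimensions, batch size, and learning rate are illustrative placeholders, and the full walkthrough follows below.

# Illustrative single training step; all hyperparameters below are placeholders.
import torch

model = HybridModel(input_dim=1024, latent_dim_vqvae=64, latent_dim_cae=8,
                    num_embeddings=512, embedding_dim=64)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(64, 1024) * 2 - 1  # dummy batch of [-1, 1]-scaled signal windows
x_recon_cae, x_recon_vqvae, z_vqvae, indices, commit = model(x)
loss = hybrid_loss(model, x, x_recon_cae, x_recon_vqvae, z_vqvae, indices, commit, alpha=1.0)

optimizer.zero_grad()
loss.backward()
optimizer.step()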
Let's walk through the steps to implement the hybrid compression technique using Python code.