Abstract
Deep neural networks (DNNs) remain the dominant AI models for many visual tasks, and the leading models of biological vision, making it crucial to better understand the internal representations and operations undergirding their successes and failures, and to carefully compare these processing stages to those found in the brain. PyTorch has emerged as the leading framework for building DNN models; it would therefore be highly desirable to have a method for easily and exhaustively extracting and characterizing the results of the internal operations of any arbitrary PyTorch model. Here we introduce Torchlens, a new open-source Python package for extracting and characterizing hidden-layer activations from PyTorch models. Uniquely among existing approaches to this task, Torchlens has the following features: 1) it exhaustively extracts the results of all intermediate operations, not just those associated with PyTorch module objects, yielding a full record of every step in the model's computational graph; 2) in addition to logging the output of each operation, it encodes metadata about each computational step in a model's forward pass, both facilitating further analysis and enabling an automatic, intuitive visualization (in rolled or unrolled format) of the model's complete computational graph; 3) it contains a built-in validation procedure to algorithmically verify the accuracy of all saved hidden-layer activations; and 4) its approach can be applied automatically to any arbitrary PyTorch model with no modifications, including models with conditional (if-then) logic in their forward pass, recurrent models, branching models, and models with internally generated tensors (e.g., models that add random noise). Furthermore, Torchlens requires minimal user-facing code, making it easy to incorporate into existing pipelines for model development and analysis, to use as a pedagogical aid when teaching deep learning concepts, and, more broadly, to accelerate the process of understanding the internal operating principles of DNNs trained on visual tasks.
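To illustrate the claim of minimal user-facing code, the sketch below shows one plausible usage pattern based on the package's public entry points (log_forward_pass and show_model_graph); the specific model, input tensor, and argument values are arbitrary examples chosen for illustration rather than prescriptions from this abstract.

    import torch
    import torchvision
    import torchlens as tl

    # An arbitrary, unmodified PyTorch model and a sample input tensor.
    model = torchvision.models.alexnet()
    x = torch.rand(1, 3, 224, 224)

    # Run a forward pass while logging every intermediate operation;
    # the returned object stores saved activations plus per-step metadata.
    model_history = tl.log_forward_pass(model, x, layers_to_save='all')
    print(model_history)  # summary of every step in the computational graph

    # Render an automatic visualization of the full computational graph.
    tl.show_model_graph(model, x)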