Haiping Huang (huanghp7@mail.sysu.edu.cn), PMI Lab, School of Physics, Sun Yat-sen University, Guangzhou 510275, People’s Republic of China
(June 21, 2024)
Abstract
A good theory of mathematical beauty is more practical than any current observation, as new predictions about physical reality can be self-consistently verified. This belief applies to the current status of understanding deep neural networks, including large language models, and even biological intelligence. Toy models provide a metaphor of physical reality, allowing that reality to be formulated mathematically (i.e., the so-called theory), which can be updated as more conjectures are justified or refuted. One does not need to pack all details into a model; rather, more abstract models are constructed, as complex systems like brains or deep networks have many sloppy dimensions but far fewer stiff dimensions that strongly impact macroscopic observables. This kind of bottom-up mechanistic modeling remains promising in the modern era of understanding natural or artificial intelligence. Here, we shed light on eight challenges in developing a theory of intelligence following this theoretical paradigm. These challenges are representation learning, generalization, adversarial robustness, continual learning, causal learning, the internal model of the brain, next-token prediction, and finally the mechanics of subjective experience.
I Introduction
The brain is one of the most challenging subjects to understand. It is complex, with many levels of temporal and spatial organization[1], allowing for coarse-grained descriptions at different levels, especially in theoretical studies. More abstract models lose the ability to generate predictions about low-level details, but bring the conceptual benefit of explaining precisely how the system works, and the mathematical description may be universal, independent of details (or sloppy variables)[2]. One seminal example is the Hopfield model[3], where the mechanism underlying the associative memory observed in the brain was precisely isolated[4, 5]. There has been a resurgence of research interest in Hopfield networks in recent years due to large language models[6, 7].
In Marr’s viewpoint[8], understanding a neural system can be divided into three levels: computation (which task the brain solves), algorithms (how the brain solves the task, i.e., the information-processing level), and implementation (the neural-circuit level). Following the first two levels, researchers designed artificial neural networks to solve challenging real-world problems, such as powerful deep learning[9, 10]. However, biological details are also being incorporated into models of neural networks[11, 12, 13, 14], and even used to design new learning rules[15]. Indeed, neuroscience studies of the biological mechanisms of perception, cognition, memory and action have already provided a variety of fruitful insights inspiring the empirical or scientific studies of artificial neural networks, which in turn inspire neuroscience researchers to design mechanistic models to understand the brain[16, 17, 18]. Therefore, it is promising to integrate physics, statistics, computer science, psychology, neuroscience and engineering to reveal the inner workings of deep (biological) networks and even intelligence with testable predictions[19], rather than using one black box (e.g., deep artificial neural networks) to understand another black box (e.g., the brain or mind). In fact, artificial intelligence may follow principles different from those of natural intelligence, but each can inspire the other, which may lead to the establishment of a coherent mathematical physics foundation for either artificial or biological intelligence.
The goal of providing a unified framework for neural computation is very challenging and perhaps even impossible. Due to re-boosted interest in neural networks, many important yet unsolved scientific questions have appeared. We shall detail these challenging questions below (most of them were roughly posed in the book on the statistical mechanics of neural networks[45]; here we give a significantly expanded version), and provide our personal viewpoints toward a statistical mechanics theory solving these fundamental questions, based on first principles in physics. These open scientific questions toward a theory of intelligence are summarized in Figure 1.
II Challenge I—Representation learning
Given raw data (or input-output pairs in supervised learning), one can ask what a good representation is and how a meaningful representation is achieved in deep neural networks. We do not yet have satisfactory answers to these questions. A promising argument is that entangled manifolds at earlier layers of a deep hierarchy are gradually disentangled into linearly separable features at output layers[20, 21, 22, 23, 24]. This manifold-separation perspective is also promising in systems neuroscience studies of associative learning by separating overlapping patterns of neural activities[25]. However, an analytic theory of the manifold transformation is still lacking, prohibiting us from fully understanding which key network parameters control the geometry of the manifold, and even how learning reshapes the manifold. For example, correlation among synapses (e.g., arising during learning) will attenuate the decorrelation process along the network depth, but encourage dimension reduction compared to orthogonal counterparts[26, 23]. This result is derived using a mean-field approximation and coincides with empirical observations[Huang-2022]. In addition, there may exist other biologically plausible factors, such as normalization, attention, and homeostatic control, impacting the manifold transformation[27, 28], which can be incorporated into a toy model in the future to test the manifold-transformation hypothesis.
Another argument from information-theoretic viewpoints demonstrates that the input information is maximally compressed into a hidden representation whose task-related information should be maximally retrieved at the output layers, according to the information bottleneck theory[29, 30]. In this sense, an optimal representation must be invariant to nuisance variability, and its components must be maximally independent, which may be related to causal factors (latent causes) explaining the sensory inputs (see the fifth challenge below). In physics language, a coarse-grained (or more abstract) representation is formed in deeper layers compared to the fine-grained representation in shallower layers. How microscopic interactions among synapses determine this representation transformation remains elusive and thus deserves future study; a few recent works started to address the clustering structure in the deep hierarchy[31, 32, 33, 34]. To conclude, bottom-up mechanistic modeling would be fruitful in dissecting the mechanisms of representation transformation.
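As a minimal numerical illustration of how depth reshapes representation geometry, one can propagate random inputs through random ReLU layers and track the participation ratio of the layer covariance spectrum, a standard proxy for the effective dimension of a neural manifold. This is a toy sketch: the widths, depth, and i.i.d. Gaussian weights below are illustrative choices, not the setting of the cited works.

```python
import numpy as np

def participation_ratio(X):
    """Effective dimension of representations X (samples x neurons):
    PR = (sum_i l_i)^2 / sum_i l_i^2 over covariance eigenvalues l_i.
    Ranges from ~1 (one dominant direction) to the full layer width."""
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    return lam.sum() ** 2 / np.sum(lam ** 2)

rng = np.random.default_rng(0)
width, n_samples, depth = 200, 1000, 5
X = rng.standard_normal((n_samples, width))

dims = [participation_ratio(X)]
for _ in range(depth):
    W = rng.standard_normal((width, width)) / np.sqrt(width)  # i.i.d. weights
    X = np.maximum(X @ W, 0.0)                                # ReLU layer
    dims.append(participation_ratio(X))
# 'dims' traces how the effective dimension of the manifold evolves with depth
```

Replacing the i.i.d. weights with correlated or orthogonal ones lets one probe, in the same few lines, how weight statistics reshape the dimension profile across layers.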
III Challenge II—Generalization
Studying any neural learning system must consider three ingredients: data, network and algorithm (the DNA of neural learning). Generalization refers to the ability of a trained network to apply the learned rule to unseen examples. Intelligence can therefore be considered, to some extent, as the ability to generalize, especially given very few examples for learning, and generalization is also a hot topic in current studies of deep learning. Traditional statistical learning theory claims that over-fitting effects should be strong when the number of examples is much less than the number of parameters to learn, and thereby cannot explain the current success of deep learning. A promising perspective is to study the causal connection between the loss landscape and the generalization properties[35, 36, 37, 38]. For a single-layered perceptron, a statistical mechanics theory can be systematically derived, revealing a discontinuous transition from poor to perfect generalization[39, 40]. In contrast to the classical bias-variance trade-off (the U-shaped curve of test error versus increasing model complexity)[41], modern deep learning achieves state-of-the-art performance in the over-parameterized regime[42, 37], where the number of parameters is much larger than the training data size. However, providing an analytic argument about over-fitting effects across different parameterization regimes (e.g., under-, over- and even super-parameterization) for this empirical observation is a non-trivial task[43]. A recent study of one-hidden-layer networks shows that a first transition occurs at the interpolation point, where perfect fitting becomes possible. This transition reflects the properties of hard-to-sample typical solutions. Increasing the model complexity, a second transition occurs with the discontinuous appearance of atypical solutions, which are wide minima with good generalization properties.
This second transition sets an upper bound for the effectiveness of learning algorithms[44]. This statistical mechanics analysis focuses on the average case (averaged over all realizations of data, network and algorithm), rather than the worst case. The worst case determines the computational complexity class, while the average case reveals the universal properties of learning; statistical mechanics links the computational hardness to a few order parameters in physics[45], and previous works provide strong evidence for this[35, 36, 37, 46, 44].
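The non-monotonic behavior of test error around the interpolation point can be reproduced in a few lines with a random-feature regression toy model. This is a hypothetical setup chosen purely for illustration (it is not the one-hidden-layer model analyzed in [44]): a frozen random ReLU projection followed by a minimum-norm least-squares readout, with the number of features swept through the interpolation point.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_train, n_test = 20, 100, 1000
teacher = rng.standard_normal(d) / np.sqrt(d)      # ground-truth linear rule

def make_data(n):
    X = rng.standard_normal((n, d))
    return X, X @ teacher

Xtr, ytr = make_data(n_train)
Xte, yte = make_data(n_test)

test_errors = {}
for m in [20, 60, 100, 400]:                       # m == n_train at m = 100
    W = rng.standard_normal((d, m)) / np.sqrt(d)   # frozen random projection
    Ftr, Fte = np.maximum(Xtr @ W, 0), np.maximum(Xte @ W, 0)
    a, *_ = np.linalg.lstsq(Ftr, ytr, rcond=None)  # min-norm least squares
    test_errors[m] = np.mean((Fte @ a - yte) ** 2)
# the test error typically peaks near the interpolation point m == n_train
# and decreases again deep in the over-parameterized regime
```

Because the ReLU features only approximate the linear teacher, the model misspecification plays the role of noise, producing the characteristic peak at interpolation without any label noise.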
For an infinitely wide neural network, there exists a lazy learning regime, where the overparameterized network can be well approximated by a linear model corresponding to a first-order Taylor expansion around the initialization, and the complex learning dynamics reduces to training a kernel machine[47]. However, in practical training, the dynamics is prone to escape the lazy regime, for which no satisfactory theory exists yet. Therefore, clarifying whether the lazy-learning (neural tangent kernel) limit or the feature-learning (mean-field) limit explains the success of deep supervised learning remains open and challenging[48, 49, 50]. The mean-field limit can be studied in a field-theoretic framework, characterizing how the learned solution deviates from the initialization through a systematic perturbation of the action[51]. Another related challenge is out-of-distribution generalization, which can also be studied using statistical mechanics; e.g., a recent work analyzed kernel regression in this setting[52]. In addition, the field-theoretic method is also promising for recasting the learning problem of out-of-distribution prediction in terms of propagating correlations and responses[51].
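The statement that a linearized network trains like a kernel machine can be checked directly on a toy model: a gradient step on the first-order Taylor expansion of the network moves its predictions exactly as kernel gradient descent with the tangent kernel K = J Jᵀ. The finite-difference sketch below uses an arbitrary tiny network and learning rate; it verifies the algebraic equivalence, not the infinite-width limit itself.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, width = 8, 3, 20

def net(theta, X):
    # tiny one-hidden-layer ReLU network, parameters packed in a flat vector
    W1 = theta[:d * width].reshape(d, width)
    w2 = theta[d * width:]
    return np.maximum(X @ W1, 0) @ w2 / np.sqrt(width)

def jacobian(theta, X, eps=1e-5):
    # finite-difference Jacobian of network outputs wrt all parameters
    J = np.zeros((len(X), len(theta)))
    for i in range(len(theta)):
        tp, tm = theta.copy(), theta.copy()
        tp[i] += eps
        tm[i] -= eps
        J[:, i] = (net(tp, X) - net(tm, X)) / (2 * eps)
    return J

X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
theta0 = rng.standard_normal(d * width + width)

f0 = net(theta0, X)
J = jacobian(theta0, X)
lr = 0.1

# one gradient-descent step on the LINEARIZED model under squared loss
theta1 = theta0 - lr * J.T @ (f0 - y)
f_lin = f0 + J @ (theta1 - theta0)

# the same update written as kernel gradient descent with K = J J^T
K = J @ J.T
f_kernel = f0 - lr * K @ (f0 - y)
# f_lin and f_kernel coincide up to numerical precision
```

In the lazy regime the Jacobian barely moves during training, so this one-step identity extends to the whole trajectory; in the feature-learning regime it breaks down, which is exactly the gap the field-theoretic methods aim to quantify.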
IV Challenge III—Adversarial vulnerability
Adversarial examples are inputs with human-imperceptible modifications that nevertheless lead to unexpected errors in a deep learning decision-making system. The test accuracy drops as the perturbation grows; the perturbation can either rely on the trained network or be an independent noise[53, 54, 55]. Current deep learning is argued to learn predictive yet non-robust features in the data[56]. This adversarial vulnerability of deep neural networks poses a significant challenge to practical applications in both real-world problems and AI4S (artificial intelligence for science) studies. Adversarial training remains the most effective solution to the problem[57], in contrast to human learning; however, it sacrifices standard discrimination accuracy. A recent work applied the physics principle that the hidden representation is clustered, in analogy with replica symmetry breaking in spin glass theory[58], which leads to a contrastive learning scheme that is local and adversarially robust, resolving the trade-off between standard accuracy and adversarial robustness[34]. Furthermore, the adversarial robustness can be theoretically explained in terms of a cluster separation distance. In physics, systems with a huge number of degrees of freedom can be captured by a low-dimensional macroscopic description, such as the Ising ferromagnet. Explaining the layered computation in terms of geometry may finally help to crack the mysterious susceptibility of networks to adversarial examples[59, 60, 32]. Although some recent efforts were devoted to this direction[61, 60], more exciting results are expected in future works.
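The basic construction of adversarial examples can be sketched with the fast gradient sign method on a logistic-regression toy problem. The model and data below are illustrative assumptions; for a linear classifier the attack provably lowers every margin by exactly ε‖w‖₁, so vulnerability can be read off analytically.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(4)
d, n = 20, 500
mu = np.ones(d) / np.sqrt(d)                       # class means at +/- mu
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * mu + 0.3 * rng.standard_normal((n, d))

# train logistic regression by plain gradient descent on the logistic loss
w = np.zeros(d)
for _ in range(500):
    grad = -(X * (y * sigmoid(-y * (X @ w)))[:, None]).mean(axis=0)
    w -= 1.0 * grad

clean_acc = np.mean(np.sign(X @ w) == y)

# FGSM: perturb each input along the sign of the input-gradient of its loss;
# for logistic regression this direction is -y * sign(w)
eps = 0.5
X_adv = X - eps * y[:, None] * np.sign(w)
adv_acc = np.mean(np.sign(X_adv @ w) == y)
# every margin shrinks by eps * ||w||_1, so accuracy can only drop
```

Even this linear toy shows the core geometric point of the section: susceptibility is governed by the distance between class clusters relative to the perturbation budget, not by anything exotic in the learning algorithm.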
V Challenge IV—Continual learning
A biological brain is good at adapting knowledge acquired from similar tasks to the domains of new tasks, even if only a handful of examples are available in the new-task domain. This kind of learning is called continual learning or multi-task learning[62, 63], the ability to learn many tasks in sequence, while transfer learning refers to the process of exploiting previously acquired knowledge from a source task to improve the generalization performance on a target task[64]. However, stable adaptation to changing environments, the essence of lifelong learning, remains a significant challenge for modern artificial intelligence[64]. More precisely, neural networks are in general poor at multi-task learning, although impressive progress has been achieved in recent years. For example, during learning, a diagonal Fisher information term can be computed to measure the importance of each weight for previous tasks (rapid changes are then penalized for important weights)[63]. A later refinement allowed synapses to accumulate task-relevant information over time[65]. More machine learning techniques for reducing catastrophic forgetting are summarized in the review[64]. However, we still do not know the exact mechanisms for mitigating catastrophic forgetting in a principled way, which calls for theoretical studies of deep learning in terms of adaptation to domain-shifted training, i.e., how connection weights trained as a solution to one task are transformed to benefit learning on a related task.
Using asymptotic analysis, a recent work studying transfer learning identified a phase transition in the quality of knowledge transfer[66]. This work reveals how the related knowledge contained in a source task can be effectively transferred to boost the performance on a target task. Other recent theoretical studies interpreted continual learning within a statistical mechanics framework using the Franz-Parisi potential[67] or as an on-line mean-field dynamics of weight updates[68]. The Franz-Parisi potential is a thermodynamic potential used to study the glass transition[69]. The recent work assumes that the knowledge from the previous task behaves as a reference configuration[67], where the previously acquired knowledge serves as an anchor for learning new knowledge. This framework also connects to elastic weight consolidation[63], heuristic weight-uncertainty modulation[70], and neuroscience-inspired metaplasticity[71], providing a theory-grounded method for real-world multi-task learning with deep networks.
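The elastic-weight-consolidation idea mentioned above can be illustrated on two toy regression tasks, where the diagonal Fisher information reduces to per-parameter curvature and everything is solvable in closed form. This is a hypothetical construction for intuition, not the setting of [63] or [67].

```python
import numpy as np

rng = np.random.default_rng(5)
d, n = 10, 200

def make_task(teacher):
    # noiseless linear regression task with a given teacher vector
    X = rng.standard_normal((n, d))
    return X, X @ teacher

wA_true, wB_true = rng.standard_normal(d), rng.standard_normal(d)
XA, yA = make_task(wA_true)   # task A (learned first)
XB, yB = make_task(wB_true)   # task B (learned second)

loss = lambda w, X, y: np.mean((X @ w - y) ** 2)

# task A solution and its diagonal Fisher (per-parameter curvature) estimate
wA = np.linalg.lstsq(XA, yA, rcond=None)[0]
fisher = np.mean(XA ** 2, axis=0)          # diagonal of X_A^T X_A / n

# plain retraining on task B ignores task A and forgets it
wB_plain = np.linalg.lstsq(XB, yB, rcond=None)[0]

# EWC: penalize movement of the weights that were important for task A,
# i.e. minimize loss_B(w) + lam * (w - wA)^T diag(fisher) (w - wA)
lam = 50.0
F = np.diag(lam * fisher)
wB_ewc = np.linalg.solve(XB.T @ XB / n + F, XB.T @ yB / n + F @ wA)
# wB_ewc retains task A performance far better than wB_plain
```

The closed-form solve makes the mechanism transparent: the Fisher term acts as a per-weight spring anchored at the old solution, exactly the "anchor" role the Franz-Parisi reference configuration plays in the statistical mechanics framework.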
VI Challenge V—Causal learning
Deep learning is criticized as being nothing but a fancy curve-fitting tool, making a naive association between inputs and outputs. In other words, this tool cannot distinguish correlation from causation. What a deep network learns is not a concept but merely a statistical correlation, prohibiting the network from counterfactual inference (a hallmark ability of intelligence). A human-like AI must be good at retrieving causal relationships among feature components in sensory inputs, thereby carving relevant information from a sea of irrelevant noise[72, 73, 74]. Therefore, understanding cause and effect in deep learning systems is particularly important for the next generation of artificial intelligence. The question of whether current deep learning algorithms are able to do causal reasoning remains open. Hence, designing a learning system that can infer the effect of an intervention becomes key to addressing this question, although it would be very challenging to make deep learning extract the causal structure behind observations by applying simple physics principles, due to both architecture and learning complexities. This challenge is now intimately related to the astonishing performance of large language models (see the seventh challenge below), and the key question is whether the self-attention mechanism is sufficient for capturing the causal relationships in the training data.
VII Challenge VI—Internal model of the brain
The brain is argued to learn to build an internal model of the outside world, reflected by spontaneous neural activities serving as a reservoir for computing (e.g., sampling)[75]. The agreement between spontaneous activity and stimulus-evoked activity increases during development, especially for natural stimuli[76], while the spontaneous activity outlines the regime of evoked neural responses[77]. The relationship between spontaneous fluctuations and task-evoked responses has attracted recent interest in studies of brain dynamics[78]. This relationship can be formulated via the fluctuation-dissipation theorem in physics, and its violation can serve as a measure of the deviation from equilibrium, even when a non-equilibrium stationary state exists.
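Concretely, the fluctuation-dissipation theorem relates the linear response R of the system to the derivative of its spontaneous correlation function C, and a standard way to quantify the departure from equilibrium is through the fluctuation-dissipation ratio. The expressions below are written in the usual statistical-physics notation as a sketch of the formalism, not as a fitted model of cortical data.

```latex
% Equilibrium FDT: the response to a weak perturbation applied at time t'
% equals the time derivative of the spontaneous correlation (beta = 1/T)
R(t, t') = \beta \, \frac{\partial C(t, t')}{\partial t'}, \qquad t > t'.

% Out of equilibrium, the fluctuation-dissipation ratio X measures the
% violation; X = 1 recovers equilibrium, and T_eff generalizes temperature
X(t, t') = \frac{T \, R(t, t')}{\partial C(t, t') / \partial t'},
\qquad T_{\mathrm{eff}} = \frac{T}{X}.
```

In this language, comparing measured spontaneous correlations with stimulus-evoked responses amounts to estimating X, and a systematic deviation of X from one diagnoses a non-equilibrium stationary state.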
In addition, stimuli were shown to carve a clustered neural space[79, 80]. An interesting question is then what the spontaneous neural space looks like, and how that space dynamically evolves, especially in adaptation to changing environments. Furthermore, how sensory inputs combined with ongoing asynchronous cortical activity determine animal behavior remains open and challenging. If reward-mediated learning is considered, reinforcement learning can be used to build world models of structured environments[81]. In reinforcement learning, observations drive actions, which are evaluated based on the reward signals the agent receives from the environment after taking those actions. It is thus interesting to reveal which kinds of internal models the agent establishes through learning from interactions with the environment. This can be connected to the aforementioned representation and generalization challenges. Moreover, a recent work showed a connection between reinforcement learning and statistical physics[82], suggesting that a statistical mechanics theory could potentially be established to understand how an optimal policy is found to maximize the long-term accumulated reward, with an additional potential impact on studying reward-based neural computations in the brain[83].
Another angle from which to look at the internal model of the brain is through the lens of neural dynamics[84, 85, 86, 87], which is placed onto a low-dimensional surface, robust to variations in the detailed properties of individual neurons or circuits. The representation of stimuli, tasks or contexts can be retrieved for deriving experimentally testable hypotheses[88]. Although previous theoretical studies were carried out in recurrent rate or spiking neural networks[89, 90], a challenging issue remains: how neural activity and synaptic plasticity interact with each other to yield a low-dimensional internal representation for cognitive functions. A recent development of synaptic plasticity combining connection probability, local synaptic noise and neural activity can realize a dynamic network in adaptation to time-dependent inputs[91]. This work interprets learning as a variational inference problem, making optimal learning under uncertainty possible in a local circuit. Both learning and neural activity are placed on low-dimensional subspaces. Future works must include more biologically plausible factors to test the hypothesis in neurophysiological experiments. Another recent exciting achievement is the use of dynamical mean-field theory to uncover rich dynamical regimes of coupled neuronal-synaptic dynamics[92].
Brain states can be considered as an ensemble of dynamical attractors[93]. The key challenge is how learning shapes the stable attractor landscape. One can interpret learning as Bayesian inference, e.g., in an unsupervised way rather than in an autoregressive manner (see the next section). The learning can then be driven by synaptic weight symmetry breaking[94, 46], separating two phases: recognizing the network itself and recognizing the rule hidden in sensory inputs. It would be very interesting to see whether this picture still holds in recurrent learning supporting neural trajectories on dynamical attractors, and even in predictive learning minimizing a free energy of beliefs and synaptic weights (the beliefs lead to error neurons)[95]. New methods must be developed, e.g., based on the recently proposed quasi-potential method for studying non-equilibrium steady neural dynamics[96], or dynamical mean-field theory for learning[97].
VIII Challenge VII—Large language models
The impressive problem-solving capabilities of ChatGPT (GPT stands for generative pre-trained transformer) are leading the fourth industrial revolution. ChatGPT is based on large language models (LLMs)[98], which represent linguistic information as vectors in a high-dimensional state space, trained on a large text corpus in an autoregressive way (in analogy to the hypothesis that the brain is a prediction machine[99]), resulting in a complex statistical model of how the tokens in the training data correlate[100]. The computational model thus shows strong formal linguistic competence[101]. An LLM is also a few-shot or even zero-shot learner[102, 103], i.e., the language model can perform a wide range of computationally challenging tasks with prompting alone (e.g., chain-of-thought prompting[104]). Remarkably, LLMs display a qualitative leap in capability as the model complexity and sample complexity are both scaled up[105], akin to phase transitions in thermodynamic systems.
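The autoregressive constraint at the heart of these models is easy to state in code: a causal mask restricts self-attention so that the representation of token t depends only on tokens at positions ≤ t. The following is a minimal single-head sketch in plain numpy; the dimensions and random weights are arbitrary illustrative choices.

```python
import numpy as np

def causal_self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product attention with a causal mask:
    position t attends only to positions <= t (the autoregressive constraint)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    T, dk = Q.shape
    scores = Q @ K.T / np.sqrt(dk)
    scores[np.triu(np.ones((T, T), dtype=bool), k=1)] = -np.inf  # mask future
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)                            # softmax rows
    return A @ V

rng = np.random.default_rng(6)
T, d = 10, 8
X = rng.standard_normal((T, d))                  # token embeddings
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

Y = causal_self_attention(X, Wq, Wk, Wv)
# changing a future token must leave all earlier outputs untouched
X2 = X.copy()
X2[-1] += 10.0
Y2 = causal_self_attention(X2, Wq, Wk, Wv)
```

Perturbing the last token changes only the last output row, which is precisely the structural property that makes next-token training and generation consistent.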
In contrast to the formal linguistic competence, the functional linguistic competence is argued to be weak[101]. This raises the fundamental question of what the nature of intelligence is, or whether single next-token conditional prediction is a standard model of artificial general intelligence[106, 107, 108]. Human reasoning capabilities in real-world problems rely on non-linguistic information as well; e.g., one cannot predict when a creative idea will come to a scientist facing a challenging problem, which relies on reasoning about implications along a sequence of thought. In a biological implementation, the language modules are separated from the other modules involved in high-level cognition[101]. The LLM explains hierarchical correlations among word pieces in the training corpora, rather than hidden causal dependencies. In other words, the neural network has not constructed a mental model of the world, which would require heterogeneous modular networks, and is thereby unlike humans. Therefore, the LLM does not know what it generates (as a generative model). Even if some key patterns of statistical regularity are absent in the training data, the model can generate perfect texts in terms of syntax; however, the texts may be far from the truth. Knowing what they know is a crucial hallmark of intelligent systems[108]. In this sense, the inner workings of the LLM are largely opaque, requiring a great effort to mathematically formulate the formal linguistic competence, and further to identify key elements that must be included to develop a robust model of the world. Mechanisms behind currently observed false positives such as hallucination[109] could then be revealed, which may be related to interpolation between modes of token distributions. A recent work interpreting the attention used in transformer-based LLMs as a generalized Potts model in physics seems inspiring[110], i.e., treating tokens as Potts spin vectors.
Most importantly, we currently do not have any knowledge about how to build an additional network that is able to connect performance and awareness[111], which is linked to what makes us conscious (see the last challenge). Following Marr’s framework, both the computational and neural correlates of consciousness remain unknown[112, 113, 114]. One current physics approach is to consider a Lyapunov function governing the complex neural computation underlying LLMs[6, 7]. In this way, the Lyapunov function perspective opens the door to many degrees of freedom for controlling how information is distilled, via not only self-attention but also other potential gating mechanisms, based on dynamical systems theory.
IX Challenge VIII—Theory of consciousness
One of the most controversial questions is the origin of consciousness—whether consciousness is an emergent behavior of highly heterogeneous and modular brain circuits with various carefully designed regions (e.g., a vast number of connections in the human brain and many functionally specific modular structures, such as the prefrontal cortex, hippocampus, cerebellum, etc.[115, 116]). The subjectivity of conscious experience is in contradiction with the objectivity of a scientific explanation. According to Damasio’s model[117], the ability to identify one’s self in the world and its relationship with the world is considered to be a central characteristic of the conscious state. Whether a machine algorithm can achieve self-awareness remains elusive. The self-monitoring ability (or meta-cognition[118]) may endow machines (such as LLMs) with knowing what they generate. It may be important to clarify how the model of self is related to the internal model of the brain (e.g., through recurrent or predictive processing[119]). For example, Karl Friston argued that conscious processing can be interpreted as a statistical inference problem of inferring the causes of sensory observations. Therefore, minimizing surprise (the negative log-probability of an event) may lead to self-consciousness[120], consistent with the hypothesis that the brain is a prediction machine[99, 108].
There are currently two major cognitive theories of consciousness. One is the global workspace framework[121], which relates consciousness to the widespread and sustained propagation of cortical neural activities, demonstrating that consciousness arises from an ignition that leads to global information broadcast among brain regions. This computational functionalism was recently leveraged to discuss the possibility of consciousness in non-organic artificial systems[122, 123]. The other is the integrated information theory, which provides a quantitative characterization of conscious states by integrated information[124]. In this second theory, unconscious states have a low information content, while conscious states bear a high information content. The second theory emphasizes the phenomenal properties of consciousness[125], i.e., subjective experience is not reducible to the functions performed by the brain. Both theories follow a top-down approach, in stark contrast to the statistical mechanics approach, which follows a bottom-up manner, building the bridge from microscopic interactions to macroscopic behavior. These hypotheses are still under intensive criticism despite some cognitive experiments they can explain[126]. We remark that conscious states may be an emergent property of neural activities, lying at a higher level than the neural activities themselves. It is currently unknown how to connect these two levels, for which a new statistical mechanics theory is required. An exciting route is to link the spontaneous fluctuation to the stimulus-evoked response; a maximal response was revealed in a recurrent computational model[96], which can be thought of as a necessary condition for consciousness, as the information-richness of cortical electrodynamics was also observed to peak at the edge of chaos (marginal stability of dynamics)[127].
This peak thus distinguishes conscious from unconscious brain states. From an information-theoretic argument, the conscious state may require a diverse range of configurations of interactions between brain networks, which can be linked to the entropy concept in physics[128]. A large entropy leads to optimal segregation and integration of information[129].
Taken together, whether consciousness can be created from an interaction of local dynamics within a complex neural substrate remains unsolved[130]. A statistical mechanics theory, if possible, is always promising in the sense that one can make theoretical predictions from just a few physics parameters[45], which may be possible at a high degree of abstraction, such that a universal principle could be expected.
X Conclusion
To sum up, in this viewpoint we provide some naive thoughts about fundamentally important questions related to neural networks, for which building a good theory is far from complete. Traditional research on the statistical physics of neural networks bifurcates into two main streams: one toward the engineering side, developing theory-grounded algorithms; the other toward the neuroscience side, formulating brain computation with mathematical models solved by physics methods. In physics, we have the principle of least action, from which we can deduce the laws of classical mechanics or electrodynamics. We are not sure whether in the physics of neural networks (and even the brain) there exist general principles that can be expressed in a concise mathematical form. It is exciting yet challenging to promote the interplay between physics theory and neural computation along the eight open problems discussed in this perspective. The advances will undoubtedly lead to a human-interpretable understanding of the underlying mechanisms of artificial intelligent systems, the brain and the mind, especially in the era of big experimental data in brain science and rapid progress in AI research.
Acknowledgements.
We would like to thank all PMI members for discussions lasting for five years. This perspective also benefits from discussions with students during the on-line course on the statistical mechanics of neural networks (from September 2022 to June 2023). We are also grateful to the invited speakers in the INTheory on-line seminar during the COVID-19 pandemic. We enjoyed many interesting discussions with Adriano Barra, Yan Fyodorov, Sebastian Goldt, Pulin Gong, Moritz Helias, Kamesh Krishnamurthy, Yi Ma, Alexander van Meegen, Remi Monasson, Srdjan Ostojic, and Riccardo Zecchina. This research was supported by the National Natural Science Foundation of China (grant number 12122515).
References
- [1]RichardNaud WulframGerstner, Werner M.Kistler and Liam Paninski.Neuronal Dynamics: From single neurons to networks and models ofcognition.Cambridge University Press, United Kingdom, 2014.
- [2] Daniel Levenstein, Veronica A. Alvarez, Asohan Amarasingham, Habiba Azab, Zhe S. Chen, Richard C. Gerkin, Andrea Hasenstaub, Ramakrishnan Iyer, Renaud B. Jolivet, Sarah Marzen, Joseph D. Monaco, Astrid A. Prinz, Salma Quraishi, Fidel Santamaria, Sabyasachi Shivkumar, Matthew F. Singh, Roger Traub, Farzan Nadim, Horacio G. Rotstein, and A. David Redish. On the role of theory and modeling in neuroscience. Journal of Neuroscience, 43(7):1074–1088, 2023.
- [3] J. J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA, 79:2554, 1982.
- [4] Daniel J. Amit, Hanoch Gutfreund, and H. Sompolinsky. Statistical mechanics of neural networks near saturation. Annals of Physics, 173(1):30–67, 1987.
- [5] M. Griniasty, M. V. Tsodyks, and Daniel J. Amit. Conversion of temporal correlations between stimuli to spatial correlations between attractors. Neural Computation, 5(1):1–17, 1993.
- [6] Hubert Ramsauer, Bernhard Schäfl, Johannes Lehner, Philipp Seidl, Michael Widrich, Thomas Adler, Lukas Gruber, Markus Holzleitner, Milena Pavlović, Geir Kjetil Sandve, Victor Greiff, David Kreil, Michael Kopp, Günter Klambauer, Johannes Brandstetter, and Sepp Hochreiter. Hopfield networks is all you need. arXiv:2008.02217, 2020. In ICLR 2021.
- [7] Dmitry Krotov and John Hopfield. Large associative memory problem in neurobiology and machine learning. arXiv:2008.06996, 2020. In ICLR 2021.
- [8] D. Marr. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. MIT Press, Cambridge, MA, 1982.
- [9] Jürgen Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, 2015.
- [10] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
- [11] L. F. Abbott, Brian DePasquale, and Raoul-Martin Memmesheimer. Building functional networks of spiking model neurons. Nature Neuroscience, 19(3):350–355, 2016.
- [12] Adam H. Marblestone, Greg Wayne, and Konrad P. Kording. Toward an integration of deep learning and neuroscience. Frontiers in Computational Neuroscience, 10:94, 2016.
- [13] Blake A. Richards, Timothy P. Lillicrap, Philippe Beaudoin, Yoshua Bengio, Rafal Bogacz, Amelia Christensen, Claudia Clopath, Rui Ponte Costa, Archy de Berker, Surya Ganguli, Colleen J. Gillon, Danijar Hafner, Adam Kepecs, Nikolaus Kriegeskorte, Peter Latham, Grace W. Lindsay, Kenneth D. Miller, Richard Naud, Christopher C. Pack, Panayiota Poirazi, Pieter Roelfsema, João Sacramento, Andrew Saxe, Benjamin Scellier, Anna C. Schapiro, Walter Senn, Greg Wayne, Daniel Yamins, Friedemann Zenke, Joel Zylberberg, Denis Therien, and Konrad P. Kording. A deep learning framework for neuroscience. Nature Neuroscience, 22(11):1761–1770, 2019.
- [14] Timothy P. Lillicrap, Adam Santoro, Luke Marris, Colin J. Akerman, and Geoffrey Hinton. Backpropagation and the brain. Nature Reviews Neuroscience, 21(6):335–346, 2020.
- [15] Samuel Schmidgall, Jascha Achterberg, Thomas Miconi, Louis Kirsch, Rojin Ziaei, S. Pardis Hajiseyedrazi, and Jason Eshraghian. Brain-inspired learning in artificial neural networks: a review. arXiv:2305.11252, 2023.
- [16] Daniel L. K. Yamins and James J. DiCarlo. Using goal-driven deep learning models to understand sensory cortex. Nature Neuroscience, 19(3):356–365, 2016.
- [17] Andrew Saxe, Stephanie Nelli, and Christopher Summerfield. If deep learning is the answer, then what is the question? Nature Reviews Neuroscience, 22:55, 2020.
- [18] Demis Hassabis, Dharshan Kumaran, Christopher Summerfield, and Matthew Botvinick. Neuroscience-inspired artificial intelligence. Neuron, 95(2):245–258, 2017.
- [19] Yi Ma, Doris Tsao, and Heung-Yeung Shum. On the principles of parsimony and self-consistency for the emergence of intelligence. Frontiers of Information Technology & Electronic Engineering, 23(9):1298–1323, 2022.
- [20] James J. DiCarlo and David D. Cox. Untangling invariant object recognition. Trends in Cognitive Sciences, 11(8):333–341, 2007.
- [21] Yoshua Bengio, Aaron Courville, and Pascal Vincent. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8):1798–1828, 2013.
- [22] P. P. Brahma, D. Wu, and Y. She. Why deep learning works: A manifold disentanglement perspective. IEEE Transactions on Neural Networks and Learning Systems, 27(10):1997–2008, 2016.
- [23] Haiping Huang. Mechanisms of dimensionality reduction and decorrelation in deep neural networks. Phys. Rev. E, 98:062313, 2018.
- [24] Uri Cohen, SueYeon Chung, Daniel D. Lee, and Haim Sompolinsky. Separability and geometry of object manifolds in deep neural networks. Nature Communications, 11(1):1–13, 2020.
- [25] N. Alex Cayco-Gajic and R. Angus Silver. Re-evaluating circuit mechanisms underlying pattern separation. Neuron, 101(4):584–602, 2019.
- [26] Jianwen Zhou and Haiping Huang. Weakly-correlated synapses promote dimension reduction in deep neural networks. Phys. Rev. E, 103:012315, 2021.
- [27] Gina G. Turrigiano and Sacha B. Nelson. Homeostatic plasticity in the developing nervous system. Nature Reviews Neuroscience, 5(2):97–107, 2004.
- [28] John H. Reynolds and David J. Heeger. The normalization model of attention. Neuron, 61(2):168–185, 2009.
- [29] Ravid Shwartz-Ziv and Naftali Tishby. Opening the black box of deep neural networks via information. arXiv:1703.00810, 2017.
- [30] Alessandro Achille and Stefano Soatto. A separation principle for control in the age of deep learning. arXiv:1711.03321, 2017.
- [31] Chan Li and Haiping Huang. Learning credit assignment. Phys. Rev. Lett., 125:178301, 2020.
- [32] Chan Li and Haiping Huang. Emergence of hierarchical modes from deep learning. Phys. Rev. Res., 5:L022011, 2023.
- [33] Francesco Alemanno, Miriam Aquaro, Ido Kanter, Adriano Barra, and Elena Agliari. Supervised Hebbian learning. Europhysics Letters, 141(1):11001, 2023.
- [34] Mingshan Xie, Yuchen Wang, and Haiping Huang. Fermi-Bose machine. arXiv:2404.13631, 2024.
- [35] Haiping Huang and Yoshiyuki Kabashima. Origin of the computational hardness for learning with binary synapses. Physical Review E, 90:052813, 2014.
- [36] Carlo Baldassi, Christian Borgs, Jennifer T. Chayes, Alessandro Ingrosso, Carlo Lucibello, Luca Saglietti, and Riccardo Zecchina. Unreasonable effectiveness of learning neural networks: From accessible states and robust ensembles to basic algorithmic schemes. Proceedings of the National Academy of Sciences, 113(48):E7655–E7662, 2016.
- [37] S. Spigler, M. Geiger, S. d'Ascoli, L. Sagun, G. Biroli, and M. Wyart. A jamming transition from under- to over-parametrization affects generalization in deep learning. Journal of Physics A: Mathematical and Theoretical, 52:474001, 2019.
- [38] Wenxuan Zou and Haiping Huang. Data-driven effective model shows a liquid-like deep learning. Phys. Rev. Res., 3:033290, 2021.
- [39] G. Györgyi. First-order transition to perfect generalization in a neural network with binary synapses. Physical Review A, 41(12):7097–7100, 1990.
- [40] H. Sompolinsky, N. Tishby, and H. S. Seung. Learning from examples in large neural networks. Physical Review Letters, 65:1683–1686, 1990.
- [41] Pankaj Mehta, Marin Bukov, Ching-Hao Wang, Alexandre G. R. Day, Clint Richardson, Charles K. Fisher, and David J. Schwab. A high-bias, low-variance introduction to machine learning for physicists. Physics Reports, 810:1–124, 2019.
- [42] Mikhail Belkin, Daniel Hsu, Siyuan Ma, and Soumik Mandal. Reconciling modern machine-learning practice and the classical bias-variance trade-off. Proceedings of the National Academy of Sciences of the United States of America, 116(32):15849–15854, 2019.
- [43] Ben Adlam and Jeffrey Pennington. The neural tangent kernel in high dimensions: Triple descent and a multi-scale theory of generalization. In ICML 2020: 37th International Conference on Machine Learning, 2020.
- [44] Carlo Baldassi, Clarissa Lauditi, Enrico M. Malatesta, Rosalba Pacelli, Gabriele Perugini, and Riccardo Zecchina. Learning through atypical phase transitions in overparameterized neural networks. Phys. Rev. E, 106:014116, 2022.
- [45] Haiping Huang. Statistical Mechanics of Neural Networks. Springer, Singapore, 2022.
- [46] Tianqi Hou and Haiping Huang. Statistical physics of unsupervised learning with prior knowledge in neural networks. Phys. Rev. Lett., 124:248302, 2020.
- [47] Mikhail Belkin. Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation. arXiv:2105.14368, 2021.
- [48] Arthur Jacot, Franck Gabriel, and Clément Hongler. Neural tangent kernel: Convergence and generalization in neural networks. In Advances in Neural Information Processing Systems, volume 31, pages 8571–8580, 2018.
- [49] Cong Fang, Hanze Dong, and Tong Zhang. Mathematical models of overparameterized neural networks. Proceedings of the IEEE, 109(5):683–703, 2021.
- [50] Peter L. Bartlett, Andrea Montanari, and Alexander Rakhlin. Deep learning: a statistical viewpoint. arXiv:2103.09177, 2021.
- [51] Kai Segadlo, Bastian Epping, Alexander van Meegen, David Dahmen, Michael Krämer, and Moritz Helias. Unified field theoretical approach to deep and recurrent neuronal networks. Journal of Statistical Mechanics: Theory and Experiment, 2022(10):103401, 2022.
- [52] Abdulkadir Canatar, Blake Bordelon, and Cengiz Pehlevan. Out-of-distribution generalization in kernel regression. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. S. Liang, and J. Wortman Vaughan, editors, Advances in Neural Information Processing Systems, volume 34, pages 12600–12612. Curran Associates, Inc., 2021.
- [53] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In ICLR 2014: International Conference on Learning Representations, 2014.
- [54] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In ICLR 2015: International Conference on Learning Representations, 2015.
- [55] Zijian Jiang, Jianwen Zhou, and Haiping Huang. Relationship between manifold smoothness and adversarial vulnerability in deep learning with local errors. Chin. Phys. B, 30:048702, 2021.
- [56] Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, and Felix A. Wichmann. Shortcut learning in deep neural networks. Nature Machine Intelligence, 2(11):665–673, 2020.
- [57] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.
- [58] M. Mézard, G. Parisi, and M. A. Virasoro. Spin Glass Theory and Beyond. World Scientific, Singapore, 1987.
- [59] Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S. Schoenholz, Maithra Raghu, Martin Wattenberg, and Ian Goodfellow. Adversarial spheres. arXiv:1801.02774, 2018.
- [60] Luca Bortolussi and Guido Sanguinetti. Intrinsic geometric vulnerability of high-dimensional artificial intelligence. arXiv:1811.03571, 2018.
- [61] Richard Kenway. Vulnerability of deep learning. arXiv:1803.06111, 2018.
- [62] Michael McCloskey and Neal J. Cohen. Catastrophic interference in connectionist networks: The sequential learning problem. Psychology of Learning and Motivation, 24:109–165, 1989.
- [63] James Kirkpatrick, Razvan Pascanu, Neil C. Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, Demis Hassabis, Claudia Clopath, Dharshan Kumaran, and Raia Hadsell. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences of the United States of America, 114(13):3521–3526, 2017.
- [64] German I. Parisi, Ronald Kemker, Jose L. Part, Christopher Kanan, and Stefan Wermter. Continual lifelong learning with neural networks: A review. Neural Networks, 113:54–71, 2019.
- [65] Friedemann Zenke, Ben Poole, and Surya Ganguli. Continual learning through synaptic intelligence. In Proceedings of the 34th International Conference on Machine Learning, volume 70, pages 3987–3995, 2017.
- [66] Oussama Dhifallah and Yue M. Lu. Phase transitions in transfer learning for high-dimensional perceptrons. Entropy, 23:400, 2021.
- [67] Chan Li, Zhenye Huang, Wenxuan Zou, and Haiping Huang. Statistical mechanics of continual learning: Variational principle and mean-field potential. Phys. Rev. E, 108:014309, 2023.
- [68] Sebastian Lee, Sebastian Goldt, and Andrew Saxe. Continual learning in the teacher-student setup: Impact of task similarity. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 6109–6119. PMLR, 2021.
- [69] Silvio Franz and Giorgio Parisi. Recipes for metastable states in spin glasses. Journal de Physique I, 5(11):1401–1415, 1995.
- [70] Sayna Ebrahimi, Mohamed Elhoseiny, Trevor Darrell, and Marcus Rohrbach. Uncertainty-guided continual learning with Bayesian neural networks. In International Conference on Learning Representations, 2020.
- [71] Axel Laborieux, Maxence Ernoult, Tifenn Hirtzlin, and Damien Querlioz. Synaptic metaplasticity in binarized neural networks. Nature Communications, 12(1):2549, 2021.
- [72] Bernhard Schölkopf. Causality for machine learning. arXiv:1911.10500, 2019.
- [73] Judea Pearl and Dana Mackenzie. The Book of Why: The New Science of Cause and Effect. Basic Books, New York, NY, 2018.
- [74] Bernhard Schölkopf, Francesco Locatello, Stefan Bauer, Nan Rosemary Ke, Nal Kalchbrenner, Anirudh Goyal, and Yoshua Bengio. Toward causal representation learning. Proceedings of the IEEE, 109(5):612–634, 2021.
- [75] Dario L. Ringach. Spontaneous and driven cortical activity: implications for computation. Current Opinion in Neurobiology, 19(4):439–444, 2009.
- [76] Pietro Berkes, Gergo Orban, Mate Lengyel, and Jozsef Fiser. Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science, 331:83, 2011.
- [77] Artur Luczak, Peter Bartho, and Kenneth D. Harris. Spontaneous events outline the realm of possible sensory responses in neocortical populations. Neuron, 62:413, 2009.
- [78] Gustavo Deco, Christopher W. Lynn, Yonatan Sanz Perl, and Morten L. Kringelbach. Violations of the fluctuation-dissipation theorem reveal distinct nonequilibrium dynamics of brain states. Phys. Rev. E, 108:064410, 2023.
- [79] Haiping Huang and Taro Toyoizumi. Clustering of neural code words revealed by a first-order phase transition. Phys. Rev. E, 93:062416, 2016.
- [80] Michael J. Berry and Gašper Tkačik. Clustering of neural activity: A design principle for population codes. Frontiers in Computational Neuroscience, 14:20, 2020.
- [81] David Ha and Jürgen Schmidhuber. World models. arXiv:1803.10122, 2018.
- [82] Jad Rahme and Ryan P. Adams. A theoretical connection between statistical physics and reinforcement learning. arXiv:1906.10228, 2019.
- [83] Emre O. Neftci and Bruno B. Averbeck. Reinforcement learning in artificial and biological systems. Nature Machine Intelligence, 1(3):133–143, 2019.
- [84] Gustavo Deco, Edmund T. Rolls, and Ranulfo Romo. Stochastic dynamics as a principle of brain function. Progress in Neurobiology, 88(1):1–16, 2009.
- [85] David Sussillo and L. F. Abbott. Generating coherent patterns of activity from chaotic neural networks. Neuron, 63(4):544–557, 2009.
- [86] Dean V. Buonomano and Wolfgang Maass. State-dependent computations: spatiotemporal processing in cortical networks. Nature Reviews Neuroscience, 10(2):113–125, 2009.
- [87] Saurabh Vyas, Matthew D. Golub, David Sussillo, and Krishna V. Shenoy. Computation through neural population dynamics. Annual Review of Neuroscience, 43(1):249–275, 2020.
- [88] Mehrdad Jazayeri and Srdjan Ostojic. Interpreting neural computations by examining intrinsic and embedding dimensionality of neural activity. Current Opinion in Neurobiology, 70:113–120, 2021.
- [89] H. Sompolinsky, A. Crisanti, and H. J. Sommers. Chaos in random neural networks. Physical Review Letters, 61(3):259–262, 1988.
- [90] Nicolas Brunel. Dynamics of sparsely connected networks of excitatory and inhibitory spiking neurons. Journal of Computational Neuroscience, 8(3):183–208, 2000.
- [91] Wenxuan Zou, Chan Li, and Haiping Huang. Ensemble perspective for understanding temporal credit assignment. Physical Review E, 107(2):024307, 2023.
- [92] David G. Clark and L. F. Abbott. Theory of coupled neuronal-synaptic dynamics. Phys. Rev. X, 14:021001, 2024.
- [93] Christoph von der Malsburg. Concerning the neural code. arXiv:1811.01199, 2018.
- [94] Tianqi Hou, K. Y. Michael Wong, and Haiping Huang. Minimal model of permutation symmetry in unsupervised learning. Journal of Physics A: Mathematical and Theoretical, 52(41):414001, 2019.
- [95] Linxing Preston Jiang and Rajesh P. N. Rao. Dynamic predictive coding: A model of hierarchical sequence learning and prediction in the neocortex. PLOS Computational Biology, 20(2):1–30, 2024.
- [96] Junbin Qiu and Haiping Huang. An optimization-based equilibrium measure describes non-equilibrium steady state dynamics: application to edge of chaos. arXiv:2401.10009, 2024.
- [97] Wenxuan Zou and Haiping Huang. Introduction to dynamical mean-field theory of randomly connected neural networks with bidirectionally correlated couplings. SciPost Phys. Lect. Notes, page 79, 2024.
- [98] OpenAI. GPT-4 technical report. arXiv:2303.08774, 2023.
- [99] Andy Clark. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36(3):181–204, 2013.
- [100] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, pages 6000–6010, Red Hook, NY, USA, 2017. Curran Associates Inc.
- [101] Kyle Mahowald, Anna A. Ivanova, Idan A. Blank, Nancy Kanwisher, Joshua B. Tenenbaum, and Evelina Fedorenko. Dissociating language and thought in large language models. Trends in Cognitive Sciences, 28(6):517–540, 2024.
- [102] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language models are few-shot learners. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 1877–1901. Curran Associates, Inc., 2020.
- [103] Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, and Yusuke Iwasawa. Large language models are zero-shot reasoners. arXiv:2205.11916, 2022. In NeurIPS 2022.
- [104] Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Huai hsin Chi, F. Xia, Quoc Le, and Denny Zhou. Chain of thought prompting elicits reasoning in large language models. arXiv:2201.11903, 2022. In NeurIPS 2022.
- [105] Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models. arXiv:2001.08361, 2020.
- [106] Terrence J. Sejnowski. Large language models and the reverse Turing test. Neural Computation, 35(3):309–342, 2023.
- [107] Brenden M. Lake, Tomer D. Ullman, Joshua B. Tenenbaum, and Samuel J. Gershman. Building machines that learn and think like people. Behavioral and Brain Sciences, 40:e253, 2017.
- [108] Marcel van Gerven. Computational foundations of natural intelligence. Frontiers in Computational Neuroscience, 11:112, 2017.
- [109] Noam Chomsky, Ian Roberts, and Jeffrey Watumull. Noam Chomsky: The false promise of ChatGPT. The New York Times, March 8, 2023.
- [110] Riccardo Rende, Federica Gerace, Alessandro Laio, and Sebastian Goldt. Mapping of attention mechanisms to a generalized Potts model. Phys. Rev. Res., 6:023057, 2024.
- [111] Axel Cleeremans. Connecting conscious and unconscious processing. Cognitive Science, 38(6):1286–1315, 2014.
- [112] Francis Crick and Christof Koch. A framework for consciousness. Nature Neuroscience, 6(2):119–126, 2003.
- [113] Lenore Blum and Manuel Blum. A theory of consciousness from a theoretical computer science perspective: Insights from the conscious Turing machine. Proceedings of the National Academy of Sciences of the United States of America, 119(21):e2115934119, 2022.
- [114] Abhilash Dwarakanath, Vishal Kapoor, Joachim Werner, Shervin Safavi, Leonid A. Fedorov, Nikos K. Logothetis, and Theofanis I. Panagiotaropoulos. Bistability of prefrontal states gates access to consciousness. Neuron, 111(10):1666–1683, 2023.
- [115] K. Harris and G. Shepherd. The neocortical circuit: themes and variations. Nature Neuroscience, 18:170–181, 2015.
- [116] Liqun Luo. Architectures of neuronal circuits. Science, 373(6559):eabg7285, 2021.
- [117] A. Damasio. Fundamental feelings. Nature, 413:781, 2001.
- [118] Stanislas Dehaene, Hakwan Lau, and Sid Kouider. What is consciousness, and could machines have it? Science, 358(6362):486–492, 2017.
- [119] Johan F. Storm, P. Christiaan Klink, Jaan Aru, Walter Senn, Rainer Goebel, Andrea Pigorini, Pietro Avanzini, Wim Vanduffel, Pieter R. Roelfsema, Marcello Massimini, Matthew E. Larkum, and Cyriel M. A. Pennartz. An integrative, multiscale view on neural theories of consciousness. Neuron, 112(10):1531–1552, 2024.
- [120] Karl Friston. Am I self-conscious? (Or does self-organization entail self-consciousness?). Frontiers in Psychology, 9:579, 2018.
- [121] Stanislas Dehaene, Michel Kerszberg, and Jean-Pierre Changeux. A neuronal model of a global workspace in effortful cognitive tasks. Proceedings of the National Academy of Sciences of the United States of America, 95(24):14529–14534, 1998.
- [122] Yoshua Bengio. The consciousness prior. arXiv:1709.08568, 2017.
- [123] Patrick Butlin, Robert Long, Eric Elmoznino, Yoshua Bengio, Jonathan Birch, Axel Constant, George Deane, Stephen M. Fleming, Chris Frith, Xu Ji, Ryota Kanai, Colin Klein, Grace Lindsay, Matthias Michel, Liad Mudrik, Megan A. K. Peters, Eric Schwitzgebel, Jonathan Simon, and Rufin VanRullen. Consciousness in artificial intelligence: Insights from the science of consciousness. arXiv:2308.08708, 2023.
- [124] Giulio Tononi. An information integration theory of consciousness. BMC Neuroscience, 5(1):42, 2004.
- [125] Larissa Albantakis, Leonardo Barbosa, Graham Findlay, Matteo Grasso, Andrew M. Haun, William Marshall, William G. P. Mayner, Alireza Zaeemzadeh, Melanie Boly, Bjørn E. Juel, Shuntaro Sasai, Keiko Fujii, Isaac David, Jeremiah Hendren, Jonathan P. Lang, and Giulio Tononi. Integrated information theory (IIT) 4.0: Formulating the properties of phenomenal existence in physical terms. PLOS Computational Biology, 19(10):1–45, 2023.
- [126] Christof Koch, Marcello Massimini, Melanie Boly, and Giulio Tononi. Neural correlates of consciousness: progress and problems. Nature Reviews Neuroscience, 17(5):307–321, 2016.
- [127] Daniel Toker, Ioannis Pappas, Janna D. Lendner, Joel Frohlich, Diego M. Mateos, Suresh Muthukumaraswamy, Robin Carhart-Harris, Michelle Paff, Paul M. Vespa, Martin M. Monti, et al. Consciousness is supported by near-critical slow cortical electrodynamics. Proceedings of the National Academy of Sciences, 119(7):e2024455119, 2022.
- [128] R. Guevara Erra, D. M. Mateos, R. Wennberg, and J. L. Perez Velazquez. Statistical mechanics of consciousness: Maximization of information content of network is associated with conscious awareness. Physical Review E, 94(5):052402, 2016.
- [129] David W. Zhou, David D. Mowrey, Pei Tang, and Yan Xu. Percolation model of sensory transmission and loss of consciousness under general anesthesia. Phys. Rev. Lett., 115:108103, 2015.
- [130] Patrick Krauss and Andreas Maier. Will we ever have conscious machines? Frontiers in Computational Neuroscience, 14:556544, 2020.