Physiological Artifacts in EEG Signals: A Comprehensive Guide for Biomedical Research and Clinical Applications

Charles Brooks, Dec 02, 2025

Abstract

This article provides a comprehensive overview of physiological artifacts in electroencephalography (EEG), a critical challenge for researchers and clinicians in neuroscience and drug development. It details the origins, characteristics, and impacts of common artifacts like ocular, muscle, cardiac, and sweat artifacts. The content systematically explores established and emerging artifact detection and removal methodologies, from regression and blind source separation to advanced deep learning models. Furthermore, it offers practical troubleshooting guidance for data optimization and presents a comparative analysis of technique efficacy across different research scenarios, aiming to enhance data integrity and interpretation in both experimental and clinical settings.

Understanding Physiological EEG Artifacts: Origins, Characteristics, and Impact on Signal Integrity

Electroencephalography (EEG) records the brain's spontaneous electrical activity, providing crucial insights into brain function for clinical diagnosis, neuroscience research, and drug development [1] [2]. However, the recorded signals are invariably contaminated by physiological artifacts—electrical potentials originating from non-neural sources within the subject's body [1] [3]. These artifacts can significantly distort the EEG, leading to misinterpretation of brain activity, compromised research findings, and potentially erroneous clinical conclusions [1] [4]. In pharmaco-EEG studies, for instance, the choice of ocular artifact removal technique can influence the resulting pharmacokinetic-pharmacodynamic (PK-PD) models and the assessment of drug effects on the brain [4]. Understanding the nature, characteristics, and sources of these non-neural signals is therefore a fundamental prerequisite for any rigorous EEG research or analysis.

The challenge is particularly acute in emerging applications using wearable EEG devices, which operate in uncontrolled environments with dry electrodes and reduced channel counts, making them more susceptible to signal quality degradation from subject mobility and environmental noise [5]. This technical guide provides an in-depth examination of physiological artifacts, framing them within the broader context of EEG signal quality assurance. We detail their defining characteristics, present methodologies for their systematic identification and removal, and discuss the implications for research and drug development.

Categorization and Characteristics of Major Physiological Artifacts

Physiological artifacts arise from various bodily sources, each with distinct spatial, temporal, and spectral signatures [5] [3]. Accurate identification is the first critical step toward effective mitigation. The table below summarizes the key characteristics of the most common physiological artifacts.

Table 1: Characteristics of Major Physiological Artifacts in EEG Recordings

| Artifact Type | Primary Source | Spectral Characteristics | Spatial Distribution | Morphology & Key Identifiers |
| --- | --- | --- | --- | --- |
| Ocular Artifacts | Eye movements and blinks; cornea-retina dipole [3] | Slow, delta range (< 4 Hz) [1] [6] | Primarily frontal and frontopolar regions (Fp1, Fp2, F7, F8) [3] | High-amplitude, slow deflections; symmetric for blinks, asymmetric for lateral movements [3] |
| Muscle Artifacts (EMG) | Contraction of head, neck, and jaw muscles [3] | Broad spectrum (0 to >200 Hz), predominantly high-frequency (> 13 Hz) [1] | Widespread, but most prominent over temporal and frontal muscles [3] | High-frequency, spike-like, irregular patterns; can be rhythmic in movement disorders [3] |
| Cardiac Artifacts | Electrical activity of the heart (ECG) or pulse [3] | ~1.2 Hz for pulse; characteristic ECG waveform [1] | Widespread, but often most evident in referential montages using earlobe references [3] | Highly rhythmic, recurring sharp transients synchronized with the QRS complex on the ECG channel [3] |
| Glossokinetic Artifact | Tongue movement (tip acts as a negative dipole) [3] | Delta range, variable [3] | Broad field, maximal inferiorly; drops from frontal to occipital [3] | Slow delta waves occurring synchronously with speech or swallowing [3] |
| Pulse Artifact | Pulsation of blood vessels beneath an electrode [3] | Slow, rhythmic [3] | Localized to a single electrode over a pulsating vessel [3] | Slow waves with a fixed delay (~200-300 ms) after the QRS complex [3] |
| Respiration Artifact | Body movement from breathing or electrode impedance changes [3] | Slow, rhythmic [3] | Can be global or localized to electrodes the patient is lying on [3] | Slow, rhythmic baseline sways synchronous with the respiratory cycle [3] |
| Skin/Sweat Artifact | Changes in electrode impedance due to sweat [3] | Very slow, often < 1 Hz [3] | Often widespread, particularly at high-impedance sites [3] | Very slow baseline drifts or "sways" [3] |

Methodologies for Artifact Detection and Removal

A wide array of techniques has been developed to manage physiological artifacts, ranging from traditional statistical approaches to modern deep-learning models. The choice of method often depends on the artifact type, available channel density, and the specific requirements of the application (e.g., real-time processing vs. offline analysis).

Classical Signal Processing Approaches

Regression Methods are traditional approaches, particularly for ocular artifacts [1] [4]. They operate on the assumption that each EEG channel is a linear combination of pure brain activity and a weighted fraction of the artifact recorded from a reference channel, such as the electrooculogram (EOG) [1]. The method estimates propagation factors (e.g., α and β for vertical and horizontal EOG) and subtracts the weighted artifact from the contaminated EEG signal: EEG_corrected = EEG_raw - α*VEOG - β*HEOG [4]. A significant limitation is the bidirectional contamination problem; since EOG channels also contain cerebral activity, regression risks removing genuine neural signals along with the artifact [4].
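
The subtraction described above can be sketched in a few lines of numpy. The signals below are simulated stand-ins (blink-like VEOG bursts, a square-wave HEOG, and a synthetic "brain" trace are illustrative assumptions, not real recordings), and the propagation factors are estimated by ordinary least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 250
t = np.arange(10 * fs) / fs                        # 10 s at 250 Hz

# Illustrative simulated signals (not real recordings)
brain = np.sin(2 * np.pi * 10 * t) + 0.3 * rng.standard_normal(t.size)
veog = 50 * np.exp(-((t % 2 - 1) ** 2) / 0.005)    # blink-like burst every 2 s
heog = 20 * np.sign(np.sin(2 * np.pi * 0.25 * t))  # slow lateral-gaze pattern

# Frontal channel contaminated with known propagation factors
alpha_true, beta_true = 0.4, 0.2
eeg_raw = brain + alpha_true * veog + beta_true * heog

# Estimate alpha and beta by least squares, then subtract:
# EEG_corrected = EEG_raw - alpha*VEOG - beta*HEOG
X = np.column_stack([veog, heog])
coef, *_ = np.linalg.lstsq(X, eeg_raw, rcond=None)
alpha_hat, beta_hat = coef
eeg_corrected = eeg_raw - alpha_hat * veog - beta_hat * heog
```

Note that this toy example sidesteps the bidirectional contamination problem discussed above: here the EOG channels are pure artifact by construction, whereas real EOG also carries cerebral activity.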

Blind Source Separation (BSS), particularly Independent Component Analysis (ICA), is a widely used and effective alternative [5] [1] [4]. BSS decomposes the multi-channel EEG signal into statistically independent components (ICs). The underlying assumption is that artifacts and neural signals originate from physiologically independent processes [4]. An expert then visually identifies and removes ICs that represent artifacts (e.g., those with topographies and time courses typical of eye blinks or muscle activity) before reconstructing the EEG signal from the remaining components [4]. Studies have shown that BSS-based techniques can preserve brain activity more effectively than regression, especially in anterior brain regions, and can lead to more accurate PK-PD modeling in pharmaco-EEG studies [4].

Wavelet Transform is a powerful tool for analyzing non-stationary signals like EEG. It decomposes a signal into different frequency components at different time points, allowing for the identification of localized artifacts. This makes it highly suitable for managing ocular and muscular artifacts [5]. Artifactual components in the wavelet domain can be thresholded or zeroed out before the signal is reconstructed.
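
A minimal illustration of this idea, assuming a hand-rolled Haar transform (real pipelines typically use a dedicated wavelet library and more carefully chosen wavelets and thresholds): detail coefficients whose magnitude exceeds a robust MAD-based threshold are zeroed before reconstruction, suppressing a large transient while leaving an oscillatory background largely intact.

```python
import numpy as np

def haar_dwt(x):
    """One analysis level of the Haar wavelet transform -> (approx, detail)."""
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def haar_idwt(a, d):
    """Inverse of one Haar analysis level."""
    x = np.empty(2 * a.size)
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def wavelet_artifact_suppress(x, levels=4, k=4.0):
    """Zero abnormally large detail coefficients, level by level."""
    approx, details = np.asarray(x, dtype=float), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        sigma = np.median(np.abs(d)) / 0.6745        # robust scale estimate
        details.append(np.where(np.abs(d) > k * sigma, 0.0, d))
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx

# Illustrative use: a 10 Hz sinusoid with a large transient "artifact"
t = np.arange(1024) / 256.0
clean = np.sin(2 * np.pi * 10 * t)
contaminated = clean.copy()
contaminated[500:503] += 30.0
restored = wavelet_artifact_suppress(contaminated)
```

The Haar wavelet and the threshold multiplier k=4 are illustrative choices; as Table 2 notes, the choice of wavelet and threshold is itself a subjective step.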

Automatic Artifact Subspace Reconstruction (ASR) is an adaptive, data-driven method that is becoming increasingly popular, especially for handling ocular, movement, and instrumental artifacts in wearable EEG [5]. ASR works by first calibrating a "clean" segment of the data. It then continuously identifies and removes components in the EEG that deviate significantly from this clean reference, interpolating the removed data from surrounding clean channels.
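
ASR proper projects out artifactual directions in a principal-component subspace before interpolating; the sketch below shows only the calibrate-then-flag half of that logic, using per-window RMS statistics learned from a clean calibration segment. The window length and threshold multiplier k are illustrative assumptions, not ASR's actual defaults.

```python
import numpy as np

def calibrate(clean, fs, win_s=0.5):
    """Per-window RMS statistics from an artifact-free calibration segment."""
    step = int(win_s * fs)
    rms = np.array([np.sqrt(np.mean(clean[i:i + step] ** 2))
                    for i in range(0, clean.size - step + 1, step)])
    return rms.mean(), rms.std()

def flag_windows(x, fs, mu, sigma, k=5.0, win_s=0.5):
    """Start indices of windows whose RMS deviates from the calibration stats."""
    step = int(win_s * fs)
    return [i for i in range(0, x.size - step + 1, step)
            if np.sqrt(np.mean(x[i:i + step] ** 2)) > mu + k * sigma]

# Illustrative use on synthetic single-channel data
rng = np.random.default_rng(3)
fs = 250
clean = rng.standard_normal(10 * fs)      # calibration segment
mu, sigma = calibrate(clean, fs)
x = rng.standard_normal(10 * fs)
x[1000:1125] += 20.0                      # large transient artifact
bad = flag_windows(x, fs, mu, sigma)
```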

Emerging Deep Learning and Automated Methods

Deep Learning Models represent the cutting edge of artifact removal. Models like AnEEG, which uses a Long Short-Term Memory (LSTM)-based Generative Adversarial Network (GAN), have demonstrated promising results [6]. In this architecture, a generator network learns to produce clean EEG from artifact-contaminated input, while a discriminator network tries to distinguish the generated signal from a ground-truth clean signal. This adversarial training process enables the model to learn complex, non-linear relationships between artifacts and neural signals, effectively suppressing a wide range of contaminants while preserving underlying brain activity [6].

Automated Detection based on Signal Properties offers a computationally simpler alternative suitable for large datasets, such as all-night sleep EEG. One effective method uses Hjorth parameters—activity, mobility, and complexity—which are simple statistical measures of the signal's properties [7]. Artifactual epochs are identified as statistical outliers in the distribution of these parameters across the recording. Studies have shown that such simple automatic detectors can achieve results comparable to visual scoring for calculating all-night average power spectral density (PSD), facilitating the processing of large-scale datasets [7].
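
The Hjorth parameters and the outlier-based epoch flagging can be implemented directly. The MAD-based robust z-score and the threshold k below are illustrative choices for this sketch, not values prescribed by [7]:

```python
import numpy as np

def hjorth(x):
    """Hjorth activity, mobility, and complexity of a 1-D epoch."""
    dx, ddx = np.diff(x), np.diff(x, n=2)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

def flag_outlier_epochs(epochs, k=6.0):
    """Indices of epochs whose Hjorth parameters lie far from the median."""
    p = np.array([hjorth(e) for e in epochs])           # (n_epochs, 3)
    med = np.median(p, axis=0)
    mad = np.median(np.abs(p - med), axis=0) / 0.6745   # robust scale
    z = np.abs(p - med) / mad
    return np.where(np.any(z > k, axis=1))[0]

# Illustrative use: 30 synthetic epochs, two deliberately artifactual
rng = np.random.default_rng(4)
t = np.arange(500) / 500.0
epochs = [np.sin(2 * np.pi * 10 * t) + 0.2 * rng.standard_normal(500)
          for _ in range(30)]
epochs[5] = 10.0 * epochs[5]                    # amplitude (activity) outlier
epochs[12] = 3.0 * rng.standard_normal(500)     # broadband (mobility) outlier
flagged = flag_outlier_epochs(epochs)
```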

Table 2: Comparison of Common Artifact Removal Techniques

| Methodology | Primary Applications | Key Advantages | Key Limitations |
| --- | --- | --- | --- |
| Regression | Ocular artifact removal [1] [4] | Simple, computationally efficient [1] | Requires reference channels; bidirectional contamination removes neural signals [4] |
| ICA/BSS | Ocular, muscular, and cardiac artifacts [5] [1] [4] | Does not require reference channels; effective separation of sources [4] | Requires multi-channel EEG; computationally intensive; subjective component selection [5] |
| Wavelet Transform | Ocular and muscular artifacts [5] | Good for non-stationary and transient artifacts; preserves temporal information | Choice of wavelet and threshold can be subjective |
| ASR | Ocular, movement, and instrumental artifacts [5] | Adaptive, works well with low-density wearable EEG; operates in real time [5] | Requires a clean data segment for calibration |
| Deep Learning (e.g., GANs) | All artifact types, particularly muscular and motion [5] [6] | Can model complex non-linear relationships; no need for manual feature engineering [6] | Requires large amounts of training data; "black box" nature; computationally intensive to train |
| Hjorth Parameters | General artifact detection in sleep EEG [7] | Computationally simple; suitable for large datasets and automatic pipelines [7] | May not capture all complex artifact morphologies |

Experimental Protocols for Method Comparison

To illustrate how these methods are empirically validated, consider a protocol for comparing ocular artifact removal techniques, as described in [4].

Objective: To assess the impact of regression versus BSS (Second Order Blind Identification - SOBI) ocular filtering on the conclusions drawn from a pharmaco-EEG trial.

Design:

  • Subjects & Drugs: 20 healthy volunteers receive single oral doses of haloperidol (3 mg), risperidone (1 mg), olanzapine (5 mg), and placebo in a randomized, double-blind, cross-over design.
  • Signal Acquisition: 19-channel EEG (10-20 system) is recorded alongside vertical and horizontal EOG. Vigilance-controlled, eyes-closed EEG is recorded at baseline and serially for up to 12 hours post-drug administration.
  • Preprocessing: The same automatic artifact rejection process is applied post-ocular filtering to both method outputs.
  • Ocular Filtering:
    • Regression: Propagation factors (α, β) are calculated for each subject and electrode using data segments with high EOG activity. The corrected EEG is computed as EEG_corr = EEG_raw - α*VEOG - β*HEOG [4].
    • BSS (SOBI): The BSS model x = A*s is solved, where x is the matrix of raw EOG and EEG signals, s is the matrix of source signals, and A is the mixing matrix. Ocular-related components are identified and removed before signal reconstruction [4].
  • Outcome Measures: Drug-induced effects are evaluated using:
    • Time & Frequency Analysis: Spectral variables (delta, theta, alpha, beta power) in individual channels.
    • Topographic Brain Mapping: Significance probability maps of spectral variables.
    • Tomographic Analysis: Low-Resolution Electromagnetic Tomography (LORETA).
    • PK-PD Modeling: Correlation between drug plasma concentrations and EEG spectral variables.

Conclusion: While both methods showed similar results in topographic maps for most spectral variables, the BSS-based procedure led to higher PK-PD correlations and more neurophysiologically plausible tomographic maps, demonstrating that the filtering choice can critically influence study conclusions [4].

Visualization of Workflows and Signaling Pathways

The following diagrams illustrate a generalized artifact management workflow and the physiological basis of a common artifact.

[Workflow: Raw EEG acquisition (multi-channel) → preprocessing (bandpass filter, 50/60 Hz notch) → artifact detection, via manual inspection (expert identification) and/or automatic detection (e.g., Hjorth parameters, machine learning) → artifact removal by regression (requires EOG), blind source separation (e.g., ICA, SOBI), or deep learning (e.g., GAN, LSTM) → clean EEG signal for analysis and modeling.]

Diagram 1: A generalized workflow for managing artifacts in EEG signals, incorporating both manual and automatic detection approaches alongside various removal methodologies.

[Diagram: eyeball dipole (cornea positive, retina negative) → eye movement or blink causes dipole rotation → large-amplitude alternating potential field → field propagation to scalp EEG electrodes (Fp1, Fp2, F7, F8) → contaminated EEG signal (slow, high-amplitude deflection).]

Diagram 2: The generation of ocular artifacts. The eyeball acts as an electric dipole. Its rotation during movement or blinking generates a large electrical field that propagates to and is detected by nearby scalp electrodes, contaminating the EEG trace.

The Scientist's Toolkit: Essential Reagents and Materials

Table 3: Key Research Reagent Solutions for EEG Artifact Management

| Item Name | Function/Application | Technical Notes |
| --- | --- | --- |
| Multi-Channel EEG System with EOG/EMG | Records brain activity and reference signals for artifacts (e.g., eye movements, muscle activity). | Essential for regression methods and for validating BSS component identification [4]. |
| Dry or Semi-Dry Electrodes | Enable rapid setup for wearable EEG acquisition outside clinical settings. | Prone to higher impedance and motion artifacts compared to wet electrodes [5]. |
| Inertial Measurement Units (IMUs) | Monitor subject head movement and acceleration. | Underutilized but promising for enhancing motion artifact detection in ecological conditions [5]. |
| Software with ICA/BSS Algorithms | Decomposes multi-channel EEG into independent components for artifact identification and removal. | A core tool in modern EEG preprocessing pipelines (e.g., EEGLAB, MNE-Python) [5] [4]. |
| Artifact Subspace Reconstruction (ASR) | An adaptive, data-driven method for removing large-amplitude, transient artifacts. | Particularly useful for cleaning continuous EEG data in wearable and real-time systems [5]. |
| Deep Learning Frameworks (e.g., TensorFlow, PyTorch) | Provide an environment for developing and training custom artifact removal models like GANs and LSTMs. | Enable state-of-the-art performance but require significant computational resources and expertise [6]. |

Physiological artifacts are an inherent and formidable challenge in EEG signal interpretation. Their diverse origins and overlapping characteristics with neural signals necessitate a meticulous and informed approach to artifact management. As EEG technology expands into wearable, real-world applications and its role in quantitative biomarker discovery and drug development grows, the demand for robust, automated, and computationally efficient artifact handling strategies will only intensify. The future lies in the development of adaptive, intelligent pipelines that can selectively suppress artifacts while faithfully preserving the integrity of the underlying neural information, thereby ensuring the reliability of insights derived from the brain's electrical symphony.

Electroencephalography (EEG) is a fundamental tool in neuroscience and clinical diagnostics, providing a non-invasive method for recording the brain's spontaneous electrical activity with high temporal resolution. However, the interpretation of EEG signals and event-related potentials (ERPs) is critically hampered by contamination from physiological artifacts—unwanted signals originating from the participant's own body [1]. Among these, ocular artifacts represent a predominant source of contamination, capable of severely distorting the EEG recording by generating electrical potentials several times larger than those arising from neural activity [8] [9]. This in-depth technical guide explores the biophysical mechanisms, scalp topography, and correction methodologies for ocular artifacts, framing this discussion within the broader context of physiological artifact research in EEG. A precise understanding of these artifacts is essential for researchers, scientists, and drug development professionals to ensure the validity of their data in both basic research and clinical applications, such as the assessment of neuropharmacological agents.

Biophysical Origin and Mechanisms

The fundamental source of ocular artifacts lies in the existence of a steady corneoretinal potential (also known as the corneofundal potential). This potential difference arises from the metabolic activity of the retinal pigment epithelium, creating a dipole field across the eyeball in which the cornea is positively charged (approximately +13 mV relative to the forehead) and the retina is negatively charged [8] [10]. This system can be modeled as an equivalent dipole located in the eye.

The manifestation of this dipole as a measurable artifact on the scalp depends on ocular kinematics:

  • Eyeblinks: During a blink, the eyelids slide over the positively charged cornea. This movement results in a rotation of the corneoretinal dipole, producing a large, low-frequency potential shift that is most prominent over the frontal and prefrontal electrode sites [8] [10]. The artifact waveform is characterized by a positive deflection.
  • Vertical Eye Movements: Similar to blinks, vertical eye movements cause a change in the orientation of the corneoretinal dipole, leading to a potential shift that contaminates the EEG [8].
  • Lateral Eye Movements: When the eyes move laterally, the cornea (positive pole) moves towards one side of the head. For instance, a rightward gaze brings the cornea closer to the right temporal electrode (F8), causing a positive waveform at F8 and a corresponding negative waveform at the left temporal electrode (F7) [10].

The amplitude of these ocular artifacts is generally an order of magnitude larger (often in the hundreds of microvolts) than the background EEG activity (typically tens of microvolts), making them a significant source of contamination [10].

Topographical Distribution and Propagation

The propagation of the ocular artifact from its source in the eyes to the scalp electrodes is governed by volume conduction through the head's tissues. The scalp distribution of these artifacts is not uniform and can be quantitatively described using propagation factors, defined as the fraction of the electrooculogram (EOG) signal recorded at periocular electrodes that is detected at a specific scalp location [8].

These propagation factors exhibit systematic variations:

  • Spatial Gradient: The amplitude of the ocular artifact is highest at electrodes closest to the eyes, such as the frontal and prefrontal sites (e.g., Fp1, Fp2, F7, F8). The amplitude attenuates with increasing distance from the eyes, with central, parietal, and occipital electrodes showing less contamination [8] [10].
  • Differential Propagation: Critically, the propagation factors for blinks and upward eye movements are significantly different, indicating that the volume conduction effects are not identical for all types of ocular activity. This necessitates careful consideration when applying correction algorithms [8].

The following diagram illustrates the core mechanism of ocular artifact generation and its pathway to contaminating the EEG signal.

[Diagram: steady corneoretinal potential (cornea positive, ~+13 mV; retina negative) → eye movement or blink → rotation of the ocular dipole → change in the electric field → propagation via volume conduction → contamination of scalp EEG.]

Table 1: Characteristics of Major Physiological Artifacts in EEG

| Artifact Type | Source | Typical Amplitude | Typical Frequency | Primary Topography |
| --- | --- | --- | --- | --- |
| Ocular (Blink) | Corneoretinal potential dipole movement | Hundreds of µV [10] | Low-frequency (< 4 Hz) [6] | Prefrontal/Frontal [8] |
| Ocular (Movement) | Change in dipole orientation | Hundreds of µV [10] | Low-frequency (< 4 Hz) | Frontal/Temporal [10] |
| Muscle (EMG) | Muscle contractions (head, face, jaw) | Variable | Broadband (> 30 Hz) [10] | Widespread, temporal region [1] |
| Cardiac (ECG) | Electrical activity of the heart | Low amplitude | ~1.2 Hz (pulse) [1] | Left hemisphere, near blood vessels [1] |

Methodologies for Ocular Artifact Management

A range of techniques has been developed to manage ocular artifacts, each with its own advantages and limitations. The choice of method depends on the research question, the experimental paradigm, and the available data.

Traditional and Classical Approaches

  • Artifact Rejection: This is the simplest and most conservative approach. It involves identifying and manually discarding EEG epochs contaminated by ocular (or other large) artifacts. While straightforward, a major drawback is the significant data loss, which can bias the resulting data sample, especially in populations or tasks prone to frequent eye movements [9] [11].
  • Regression Methods: These time-domain or frequency-domain techniques use simultaneously recorded EOG channels as a reference. Regression analysis estimates the propagation factors (weights) of the EOG artifact into each EEG channel and subtracts a scaled version of the EOG from the EEG [1] [9]. A key limitation is that the method treats the EOG reference as artifact-only; in practice the EOG signal is itself contaminated by neural activity, so the subtraction can over-correct and remove genuine cerebral signals [1].
  • Blind Source Separation (BSS): This is a widely used modern approach, with Independent Component Analysis (ICA) being the most prominent algorithm. ICA decomposes the multi-channel EEG data into statistically independent components (ICs). Ocular artifacts, due to their strong, stereotyped, and focal origin, are often segregated into specific ICs. These artifactual ICs can then be manually identified and removed from the data before reconstructing the clean EEG [1] [10]. ICA is considered highly effective but requires careful implementation and is computationally intensive.
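
The separation step can be demonstrated end to end on a toy two-channel mixture with a minimal FastICA (deflation scheme, tanh nonlinearity). The sources, mixing matrix, and iteration count below are illustrative; production pipelines should use a maintained implementation such as those in EEGLAB or MNE-Python.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20000
# Two independent non-Gaussian "sources": a sawtooth and a sparse spike train
s1 = np.mod(np.arange(n) * 0.01, 1.0) - 0.5
s2 = (rng.random(n) < 0.01) * 5.0 + 0.1 * rng.standard_normal(n)
S = np.vstack([s1, s2])
A = np.array([[1.0, 0.8],
              [0.4, 1.0]])                  # unknown mixing ("electrodes")
X = A @ S

# Whitening: zero-mean, identity covariance
Xc = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(Xc @ Xc.T / n)
Z = (E @ np.diag(d ** -0.5) @ E.T) @ Xc

# FastICA, deflation scheme with tanh nonlinearity
W = np.zeros((2, 2))
for i in range(2):
    w = rng.standard_normal(2)
    w /= np.linalg.norm(w)
    for _ in range(200):
        g = np.tanh(w @ Z)                              # g(w^T z), shape (n,)
        w = (Z * g).mean(axis=1) - (1.0 - g ** 2).mean() * w
        w -= W[:i].T @ (W[:i] @ w)    # decorrelate from components already found
        w /= np.linalg.norm(w)
    W[i] = w

recovered = W @ Z            # estimated sources (up to sign and scale)
```

Each row of `recovered` should match one of the original sources up to sign and scaling, which is the inherent ambiguity of BSS.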

Emerging Deep Learning Approaches

Recent advances have demonstrated the potential of deep learning models for effective artifact removal.

  • Generative Adversarial Networks (GANs): GAN-based frameworks have been successfully applied to "denoise" EEG signals. In this architecture, a Generator network learns to transform artifact-contaminated EEG into clean EEG, while a Discriminator network tries to distinguish the generated signal from the true, clean ground-truth signal. This adversarial training process forces the Generator to produce increasingly realistic, artifact-free signals [6].
  • Hybrid Models (e.g., AnEEG): Newer models like AnEEG enhance GANs by integrating Long Short-Term Memory (LSTM) layers into the Generator. LSTMs are adept at capturing temporal dependencies, making them well-suited for modeling the dynamic nature of both EEG signals and ocular artifacts, thereby improving the quality of the reconstructed signal [6].
  • Subject-Specific Frameworks (e.g., Motion-Net): While initially designed for motion artifacts, the principle of subject-specific deep learning models like Motion-Net, which uses a U-Net architecture, shows promise for handling various artifact types. These models are trained on individual subjects' data, allowing them to learn personalized artifact features, which can be particularly beneficial when working with smaller datasets [12].

The workflow below generalizes the experimental protocol for implementing and validating a deep learning-based artifact removal method.

[Workflow: EEG/EOG data acquisition → data preprocessing (filtering, segmentation) → deep learning model with a Generator (produces "clean" EEG) and a Discriminator (judges signal authenticity) → adversarial training, with model updates driven by performance evaluation → artifact-corrected EEG output.]

The Scientist's Toolkit: Essential Research Reagents and Materials

Table 2: Key Materials and Tools for Ocular Artifact Research

| Item | Function / Explanation |
| --- | --- |
| High-Density EEG System | Multi-channel amplifier and electrode cap for recording scalp potentials. Essential for capturing the spatial distribution of artifacts and for methods like ICA that require many channels. |
| Electrooculogram (EOG) Electrodes | Dedicated electrodes placed near the eyes (vertical and lateral) to record reference signals for eye movements and blinks. Critical for regression-based correction and for validating other removal methods [9]. |
| ICA Software (e.g., EEGLAB) | Interactive MATLAB toolbox for performing Blind Source Separation, particularly Independent Component Analysis. Allows visualization, manual identification, and removal of artifact-related components [10]. |
| GAN/LSTM Deep Learning Models (e.g., AnEEG) | Advanced computational frameworks for automated, data-driven artifact removal. The generator creates clean EEG while the discriminator ensures fidelity, often enhanced with LSTM layers to model temporal context [6]. |
| Synchronized Stimulus Presentation Software (e.g., PsychoPy) | Software to present visual or auditory stimuli and send precise event markers (triggers) to the EEG recording system. Crucial for time-locking EEG segments to events for ERP analysis and subsequent artifact correction [13]. |

Experimental Protocols for Method Validation

Validating the efficacy of an ocular artifact removal technique requires a rigorous experimental and analytical protocol. Below is a detailed methodology adapted from current literature.

Protocol: Benchmarking Deep Learning for Artifact Removal

1. Objective: To quantitatively evaluate the performance of a deep learning model (e.g., a GAN-LSTM hybrid) against classical methods (e.g., Regression, ICA) in removing ocular artifacts while preserving neural signal integrity.

2. Data Acquisition and Preparation:

  • Participants & Recording: Record simultaneous EEG and EOG from a cohort of healthy participants using a standard electrode montage (e.g., international 10-20 system). Include tasks that naturally induce blinks and saccades, as well as event-related paradigms (e.g., oddball) to assess neural preservation [6] [13].
  • Ground Truth Generation: A significant challenge is obtaining a "perfect" ground truth. Common solutions include:
    • Semi-Simulated Data: Artificially add recorded EOG signals or simulated ocular artifacts to clean, resting-state EEG recorded during periods of minimal eye movement [6].
    • Expert-Cleaned Data: Use data segments where ocular artifacts have been meticulously removed by expert manual rejection or via state-of-the-art ICA cleaning to serve as a reference [6].

3. Data Preprocessing:

  • Filter the raw data (e.g., 0.5-70 Hz bandpass, 50/60 Hz notch filter).
  • Segment the data into epochs time-locked to events of interest.
  • Normalize the data, for example, using energy threshold-based normalization to handle signal range limitations [6].
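
The filtering steps above map directly onto standard scipy.signal calls. The sampling rate and the synthetic trace below are illustrative assumptions for the sketch:

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

fs = 250.0
rng = np.random.default_rng(1)
t = np.arange(int(10 * fs)) / fs
# Hypothetical raw trace: 10 Hz alpha + 50 Hz line noise + slow drift + noise
raw = (np.sin(2 * np.pi * 10 * t)
       + 2.0 * np.sin(2 * np.pi * 50 * t)
       + 5.0 * np.sin(2 * np.pi * 0.05 * t)
       + 0.1 * rng.standard_normal(t.size))

# 0.5-70 Hz zero-phase bandpass
b, a = butter(4, [0.5, 70.0], btype="bandpass", fs=fs)
x = filtfilt(b, a, raw)

# 50 Hz notch (Q = 30); use 60 Hz where the mains frequency requires it
bn, an = iirnotch(50.0, 30.0, fs=fs)
x = filtfilt(bn, an, x)
```

Zero-phase filtering via filtfilt avoids introducing phase distortion into epochs that will later be time-locked to events.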

4. Implementation of Correction Methods:

  • Deep Learning Model: Implement the chosen architecture (e.g., AnEEG). The generator typically comprises LSTM layers to model temporal dynamics. The discriminator is often a convolutional network. Train the model using an adversarial loss function and potentially a supplementary loss (e.g., mean-squared-error) to ensure the generated signal matches the ground truth [6].
  • Classical Methods: Perform regression in the time or frequency domain, subtracting a scaled EOG signal from each EEG channel [1]. Alternatively, run ICA, manually identify and remove components representing ocular artifacts, and reconstruct the signal.

5. Quantitative Performance Metrics: Compare the corrected output of each method against the ground truth using the following standard metrics [6]:

  • Normalized Mean Square Error (NMSE): Lower values indicate better agreement with the original signal.
  • Root Mean Square Error (RMSE): Lower values indicate less error.
  • Correlation Coefficient (CC): Higher values indicate stronger linear agreement with the ground truth.
  • Signal-to-Noise Ratio (SNR) / Signal-to-Artifact Ratio (SAR): Higher values indicate better artifact suppression and signal preservation.
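
The metrics listed above are straightforward to implement in numpy. The definitions below follow the standard formulas; exact conventions (e.g., normalization in NMSE) vary across papers:

```python
import numpy as np

def nmse(clean, est):
    """Normalized mean square error: lower is better."""
    return np.sum((est - clean) ** 2) / np.sum(clean ** 2)

def rmse(clean, est):
    """Root mean square error: lower is better."""
    return np.sqrt(np.mean((est - clean) ** 2))

def cc(clean, est):
    """Pearson correlation coefficient: higher is better."""
    return np.corrcoef(clean, est)[0, 1]

def snr_db(clean, est):
    """Signal-to-noise ratio in dB, treating (est - clean) as noise."""
    return 10 * np.log10(np.sum(clean ** 2) / np.sum((est - clean) ** 2))
```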

Ocular artifacts, stemming from the fundamental electrophysiology of the eye, present a persistent and significant challenge in EEG research. A thorough understanding of their biophysical basis—the corneoretinal dipole and its movement—is paramount for correctly interpreting scalp topographies and selecting appropriate correction methodologies. While classical techniques like rejection and regression remain in use, the field is rapidly advancing towards sophisticated computational approaches, including ICA and, more recently, deep learning models like GANs and LSTM networks. These data-driven methods show immense promise for automated, robust, and effective artifact removal, which is crucial for enhancing the signal quality and reliability of EEG in both experimental and clinical settings, including the critical domain of pharmaceutical development and neurotherapeutic assessment. The ongoing development and rigorous validation of these tools ensure that EEG will continue to be a powerful window into brain function.

Within the context of physiological artifacts in electroencephalography (EEG) research, electromyogenic (EMG) artifacts pose a significant and unique challenge to inferential validity. Unlike other biological artifacts, muscle activity is neither small nor rare. Peak cranial EMG can be 1–2 orders of magnitude larger than typical mean differences in the EEG (75–400µV vs. <10µV), meaning even modest contamination can severely distort findings [14]. The particular risk stems from the fact that facial EMG is sensitive to a variety of cognitive and affective processes, making it temporally confounded with experimental manipulations, especially in studies of ongoing, induced, or evoked EEG in the frequency-domain [14]. This technical guide details the spectral and topographical properties of these artifacts and methodologies for their characterization, providing a crucial resource for researchers, scientists, and drug development professionals.

Spectral and Topographical Characteristics of EMG Artifacts

The difficulty in separating EMG from neurogenic signals arises from their overlap across key dimensions: temporal, anatomical, and spectral [14].

Spectral Profile

Muscle artifacts exhibit a broad spectral signature that extensively overlaps with and can mask neural signals of interest. The power-frequency spectrum of EMG artifacts ranges from 2 Hz to 100 Hz [15]. Critically, even weak EMG activity is detectable across the scalp in frequencies as low as the alpha band (8–13 Hz) [14]. This wide band easily obscures the typical EEG bands of interest, including delta, theta, alpha, and beta, complicating the study of various cognitive and sensory states.
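To see how a broadband myogenic floor coexists with narrowband neural rhythms, one can inspect the power spectrum with Welch's method. The sketch below uses simulated data, a 10 Hz alpha rhythm plus noise band-limited to roughly 20–100 Hz; all amplitudes and band edges are illustrative assumptions, not recorded values:

```python
import numpy as np
from scipy.signal import butter, filtfilt, welch

# Simulate 10 s of "EEG": a 10 Hz alpha rhythm plus broadband
# EMG-like noise band-limited to roughly 20-100 Hz.
fs = 250
rng = np.random.default_rng(1)
t = np.arange(0, 10, 1/fs)
alpha = 2.0 * np.sin(2 * np.pi * 10 * t)

b, a = butter(4, [20, 100], btype="bandpass", fs=fs)
emg_like = filtfilt(b, a, rng.standard_normal(t.size))

# Welch power spectral density of the contaminated trace
f, pxx = welch(alpha + emg_like, fs=fs, nperseg=512)

# The alpha peak sits on a broadband floor that spans all classic bands
peak_hz = f[np.argmax(pxx)]
emg_band = (f >= 20) & (f <= 100)
print(f"spectral peak at {peak_hz:.1f} Hz; "
      f"EMG-band share of total power: {pxx[emg_band].sum() / pxx.sum():.0%}")
```

Because the myogenic floor is broadband, no single band-stop filter removes it without also attenuating neural activity, which is why the text above emphasizes that canonical spectral filters are insufficient.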

Table 1: Spectral Characteristics of EMG Artifacts in EEG

| Feature | Description | Implication for EEG Research |
|---|---|---|
| Frequency Range | 2 Hz to 100 Hz [15] | Overlaps with all classic EEG frequency bands (Delta, Theta, Alpha, Beta) |
| Low-Frequency Penetration | Detectable in the Alpha band (8–13 Hz) [14] | Can contaminate rhythms associated with relaxation and idle states |
| Spectral Variability | Signature varies with different muscle groups and contraction intensity [14] | Prevents the use of simple, canonical spectral filters |

Topographical Distribution

The topographical distribution of EMG artifacts is broad and anatomically complex. EMG arises from spatially distributed, functionally independent muscle groups across the cranium, including the face, neck, and head [15] [14]. Due to volume conduction, the electrical activity from these muscles is detectable across the entire scalp [14]. This is in contrast to ocular (EOG) or cardiogenic (ECG) artifacts, which have more localized origins.

Intramuscular topographical studies, such as those on the masseter muscle, reveal that activation patterns shift significantly with different functional tasks. For instance, the power maximum can move from the inferior third of the masseter during biting to the posterosuperior third when compensating for ipsilaterally applied forces [16] [17]. This illustrates the dynamic and task-dependent nature of EMG topographies.

Table 2: Topographical Characteristics of EMG Artifacts

| Feature | Description | Contrast with Other Artifacts |
|---|---|---|
| Spatial Distribution | Broad, detectable across the entire scalp [14] | EOG and ECG are more spatially localized [15] |
| Source Muscles | Multiple, independent groups (face, jaw, neck, head) [14] | Arises from fixed sources (e.g., heart, eyes) [14] |
| Distribution Pattern | Can manifest as a broad fringe or rim distribution on the scalp [14] | - |

Experimental Protocols for EMG Characterization

Understanding these characteristics requires robust experimental methodologies. The following protocols are employed to systematically study EMG artifacts.

Protocol for Intramuscular Topographical Analysis

This protocol, adapted from Schumann et al. (1994), is designed to map activation patterns within a specific muscle [16] [17].

  • Subject Population: 20 healthy subjects.
  • Electrode Setup: 16-channel surface electromyograms (EMGs) recorded over the muscle of interest (e.g., the masseter muscle).
  • Functional Conditions: Recordings are taken under various conditions:
    • Mandible in postural position.
    • Compensation for forces applied from ipsilateral, contralateral, and frontal directions.
    • Force-constant biting on a unilaterally placed force transducer.
  • Signal Processing:
    • Artefact Elimination: Remove obvious non-EMG noise from the raw signals.
    • Spectral Calculation: Compute EMG power spectra from the original curves using Fast Fourier Transformation (FFT).
    • Map Generation: Compute spectral EMG maps using an interpolation algorithm and an imaging procedure to visualize topographical power distribution.
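The map-generation step can be sketched with a standard scatter-to-grid interpolation. The electrode coordinates and band-power values below are invented for illustration, and the original study's interpolation and imaging procedure may differ:

```python
import numpy as np
from scipy.interpolate import griddata

# Per-electrode band power is interpolated onto a regular grid to form a
# topographic "EMG map". Coordinates and powers here are made up.
rng = np.random.default_rng(2)
electrode_xy = rng.uniform(-1, 1, size=(16, 2))   # 16-channel layout
band_power = rng.uniform(0.5, 5.0, size=16)       # e.g. FFT band power per site

# Regular 50x50 grid spanning the electrode area
gx, gy = np.mgrid[-1:1:50j, -1:1:50j]
emg_map = griddata(electrode_xy, band_power, (gx, gy), method="linear")

# Points outside the electrodes' convex hull are NaN (no extrapolation)
print(emg_map.shape)
```

Linear interpolation never exceeds the measured range, so apparent hot spots in such a map always trace back to at least one real electrode.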

Protocol for Validating EMG Correction Techniques

This protocol uses scripted data to quantitatively establish the sensitivity and specificity of EMG correction tools like the General Linear Model (GLM) or Independent Component Analysis (ICA) [14].

  • Subject Population: A reasonably large and varied sample (e.g., n=17).
  • Experimental Design: A factorial design that independently varies neurogenic and myogenic activation.
    • Neurogenic Manipulation: An alpha-blocking task (e.g., eyes open vs. eyes closed).
    • Myogenic Manipulation: A low-intensity muscle activation task (e.g., tensing vs. quiescence).
  • Data Acquisition: High-density EEG data (e.g., 125-channel) is acquired during the crossed conditions.
  • Data Analysis:
    • Gross Artifact Rejection: Remove large artifacts before correction to create a realistic contamination level.
    • Correction Application: Apply the EMG correction technique to the data.
    • Validation:
      • Sensitivity: Compare corrected EMG-contaminated data to uncorrected EMG-free data in an anterior, myogenic region of interest (ROI). A good technique will show equivalence.
      • Specificity: Similarly, compare corrected and uncorrected data in a posterior, neurogenic (alpha-blocking) ROI. A good technique will preserve the neurogenic effect.
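The logic of the sensitivity and specificity comparisons can be sketched with paired tests on simulated per-subject ROI power values. All numbers below are illustrative assumptions; the published validation worked on real recordings and framed "no residual difference" as an equivalence question rather than a simple significance test:

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(3)
n = 17  # subjects

# Sensitivity: corrected EMG-contaminated vs. uncorrected EMG-free data in
# the myogenic ROI -- a good technique leaves no residual difference.
corrected_contaminated = 10 + rng.normal(0, 1, n)
uncorrected_clean = 10 + rng.normal(0, 1, n)
t_sens, p_sens = ttest_rel(corrected_contaminated, uncorrected_clean)

# Specificity: the alpha-blocking effect (eyes closed > eyes open) in the
# neurogenic ROI should survive correction.
eyes_closed_corrected = 15 + rng.normal(0, 1, n)
eyes_open_corrected = 10 + rng.normal(0, 1, n)
t_spec, p_spec = ttest_rel(eyes_closed_corrected, eyes_open_corrected)

print(f"sensitivity p={p_sens:.2f} (want large), "
      f"specificity p={p_spec:.2e} (want small)")
```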

Subject Recruitment → Experimental Design (cross alpha-blocking [eyes open/closed] with muscle activation [tense/quiescent]) → Acquire High-Density EEG → Preprocessing (gross artifact rejection) → Apply EMG Correction Technique (e.g., GLM, ICA) → Validation Analysis → Sensitivity Test (corrected vs. uncorrected in myogenic ROI) and Specificity Test (corrected vs. uncorrected in neurogenic ROI) → Assess Technique Performance

Figure 1: Workflow for validating EMG correction techniques using scripted data, testing both sensitivity and specificity [14].

The Scientist's Toolkit: Key Research Reagents and Materials

Successfully conducting research in this field requires a suite of specialized tools and algorithms.

Table 3: Essential Research Tools for EMG Artifact Analysis

| Tool Category | Specific Examples | Function/Purpose |
|---|---|---|
| Signal Acquisition | High-Density EEG System (e.g., 125-channel) [14] | Captures detailed spatial distribution of artifacts |
| Signal Acquisition | Multi-channel Surface EMG Array (e.g., 16-channel) [16] | Records topographical activity from multiple muscle sites |
| Core Analysis Algorithms | Fast Fourier Transform (FFT) [16] | Converts time-domain signals to power spectra for frequency analysis |
| Core Analysis Algorithms | Independent Component Analysis (ICA) [18] [14] | Blind source separation to identify and isolate artifact components |
| Core Analysis Algorithms | General Linear Model (GLM) [14] | Removes variance in a neurogenic band predicted by an EMG band |
| Core Analysis Algorithms | Wavelet Packet Decomposition (WPD) [15] | Provides time-frequency analysis for non-stationary signals like EMG |
| Advanced Processing Techniques | Non-Local Means (NLM) Filter [15] | Denoising algorithm that can be optimized for artifact correction |
| Advanced Processing Techniques | Meta-heuristic Optimization Algorithms [15] | Automatically optimize parameters for filters and other algorithms |

Analytical and Visualization Techniques

The complex data generated from these experiments requires sophisticated processing and visualization.

From Raw Signal to EMG Maps

The process of creating topographical EMG maps involves a defined sequence of steps to transform raw electrical signals into interpretable spatial maps [16].

Raw Multi-channel EMG Signal → Artefact Elimination → Spectral Calculation (Fast Fourier Transform) → Spatial Interpolation Algorithm → Spectral EMG Map

Figure 2: An analytical workflow transforms raw EMG signals into topographical spectral maps for visualization [16].

Techniques for EMG Artifact Identification and Removal

Multiple algorithmic approaches exist to manage EMG artifacts, especially in EEG data. The choice of technique often depends on the number of available EEG channels.

  • For Multi-channel EEG:

    • Independent Component Analysis (ICA): A widely used blind source separation method that decomposes EEG signals into independent components, which can be manually or algorithmically classified as neural or artifactual (e.g., EMG). The artifactual components are then discarded before signal reconstruction [14].
    • General Linear Model (GLM): An intra-individual method that uses regression to remove variance in a neurogenic band (e.g., alpha) that is predicted by activity in a high-frequency EMG band (e.g., 70-80 Hz). This is effective for ongoing or induced, but not phase-locked, spectral changes [14].
  • For Single-channel or Few-channel EEG:

    • Wavelet Transform-Based Methods: These are well-suited for non-stationary signals like EMG. A novel approach combines Wavelet Packet Decomposition (WPD) with a modified Non-Local Means (NLM) filter. The corrupted EEG is decomposed, the wavelet coefficients are corrected by the optimized NLM filter, and the signal is reconstructed [15].
    • Hybrid Methods: Cascading multiple algorithms (e.g., WPD + NLM) can suppress artifacts in stages, achieving a higher degree of robustness, which is particularly important for the challenging task of single-channel EMG removal [15].
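The core of the GLM approach, removing the alpha-band variance predicted by a high-frequency EMG regressor, reduces to ordinary least squares across epochs. A minimal sketch with simulated per-epoch power values (contamination strength and distributions are invented for illustration):

```python
import numpy as np

# Per-epoch alpha-band power is regressed on power in a high-frequency
# EMG band (e.g. 70-80 Hz); the EMG-predicted variance is subtracted.
rng = np.random.default_rng(4)
n_epochs = 200
emg_power = rng.gamma(2.0, 1.0, n_epochs)          # EMG-band regressor
neural_alpha = rng.normal(5.0, 1.0, n_epochs)      # true neural alpha power
observed_alpha = neural_alpha + 0.8 * emg_power    # contaminated measurement

# Ordinary least squares: observed_alpha ~ intercept + slope * emg_power
X = np.column_stack([np.ones(n_epochs), emg_power])
beta, *_ = np.linalg.lstsq(X, observed_alpha, rcond=None)

# Remove only the EMG-predicted part, keeping the mean level (intercept)
corrected_alpha = observed_alpha - beta[1] * emg_power

r_before = np.corrcoef(observed_alpha, emg_power)[0, 1]
r_after = np.corrcoef(corrected_alpha, emg_power)[0, 1]
print(f"correlation with EMG band: {r_before:.2f} -> {r_after:.2f}")
```

By construction the corrected values are uncorrelated with the EMG regressor, which is why the technique suits ongoing or induced spectral changes but cannot help with phase-locked activity.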

In conclusion, muscle artifacts represent a critical challenge in EEG research due to their broad spectral characteristics, complex topographical distribution, and sensitivity to psychological variables. A thorough understanding of their properties, combined with rigorous experimental protocols and a growing toolkit of analytical techniques, is essential for ensuring the validity of neuroscientific and clinical findings, including those in drug development.

Electroencephalography (EEG) is a powerful, non-invasive tool for monitoring brain activity, but its utility is often challenged by the presence of physiological artifacts. These artifacts are signals recorded by EEG that do not originate from neural activity and can significantly contaminate the data [19]. While ocular and muscular artifacts receive considerable attention, other physiological sources—specifically sweat, respiration, and glossokinetic artifacts—pose distinct and often complex challenges for researchers and clinicians. Effectively identifying and mitigating these artifacts is not merely a technical exercise; it is a critical prerequisite for ensuring the validity of neural data analysis, particularly in drug development and clinical research where data integrity directly impacts diagnostic accuracy and therapeutic assessment [19] [20]. This guide provides an in-depth technical examination of these three artifact types, detailing their origins, characteristics, and advanced methodologies for their management.

Physiological Artifacts in EEG: A Primer

Physiological artifacts in EEG are signals generated by the body's own biological processes. As noted by Bitbrain, "Physiological artifacts originate from the patient" and can distort or mask genuine neural signals, potentially leading to clinical misdiagnosis or biased research conclusions [19]. The low-amplitude nature of EEG signals (measured in microvolts) makes them highly susceptible to such contamination [19]. Traditional and modern approaches to artifact management range from blind source separation methods like Independent Component Analysis (ICA) to emerging deep learning models, such as those combining CNN and LSTM architectures [5] [19] [20]. However, the effective application of these techniques requires a deep understanding of the specific temporal, spectral, and spatial signatures of each artifact type.

In-Depth Analysis of Target Artifacts

Sweat Artifact

  • Origin and Mechanism: Sweat artifacts arise from the activity of sweat glands, which modifies the local electrode-skin impedance and creates slow electrochemical potential shifts. This is particularly problematic during long-duration recordings, physical activity, or in high-temperature environments [19].
  • Impact on EEG Signal: The presence of sweat can introduce slow baseline drifts and may even create short circuits between closely spaced electrodes. This fundamentally alters the electrical contact properties, compromising signal integrity across multiple channels [19].
  • Characteristic Signatures:
    • Time-Domain Effect: Manifested as very slow potential shifts that are apparent over long epochs [19].
    • Frequency-Domain Effect: Predominantly contaminates the delta (0.5–4 Hz) and theta (4–8 Hz) frequency bands. This overlap is particularly problematic for studies focusing on sleep staging or low-frequency cognitive assessments, as it can mimic or obscure genuine neural rhythms [19].

Respiration Artifact

  • Origin and Mechanism: This artifact is caused by the mechanical movements of the chest and head during the breathing cycle. These movements can subtly alter the electrode-skin contact and, in some cases, create motion-related potentials [19].
  • Impact on EEG Signal: The effect is most pronounced in sleep studies where subjects are recumbent, but it can also affect recordings in relaxed, seated participants. It introduces a rhythmic, low-frequency modulation of the EEG signal [19].
  • Characteristic Signatures:
    • Time-Domain Effect: Appears as slow, sinusoidal waveforms that are synchronized with the respiration rate (typically 12–20 cycles per minute) [19].
    • Frequency-Domain Effect: The spectral energy is concentrated at the fundamental respiration frequency and its harmonics, which primarily fall within the delta band and can encroach on the theta band, potentially confounding the analysis of endogenous slow-wave activity [19].

Glossokinetic Artifact

  • Origin and Mechanism: The glossokinetic artifact is generated by tongue movements. The tongue possesses a significant tip-to-root electrical potential. Movement of the tongue within the oral cavity shifts this electrical field, which can be volume-conducted to scalp electrodes [19].
  • Impact on EEG Signal: This artifact is a classic example of a non-cephalic biological potential that contaminates EEG recordings. It is commonly associated with swallowing, speaking, or restless patients, and can be particularly challenging to distinguish from cerebral activity due to its distribution [19].
  • Characteristic Signatures:
    • Time-Domain Effect: Often presents as a slow, lateralized shift that is most prominent over the frontal and temporal electrodes. The polarity and distribution can change depending on the direction and nature of the tongue movement [19].
    • Frequency-Domain Effect: Like sweat and respiration, its spectral content is primarily in the delta and theta bands. However, the spatial pattern (fronto-temporal emphasis) is a key differentiator [19].

Table 1: Summary of Characteristic Features of Sweat, Respiration, and Glossokinetic Artifacts

| Feature | Sweat Artifact | Respiration Artifact | Glossokinetic Artifact |
|---|---|---|---|
| Biological Origin | Sweat gland activity | Chest/head movement | Tongue movement (electric potential) |
| Primary Time-Domain Signature | Very slow baseline drift | Slow, rhythmic waveforms synchronized with breath | Slow, lateralized voltage shifts |
| Primary Frequency-Domain Signature | Delta/Theta band power increase | Peak at respiration frequency (e.g., ~0.2–0.3 Hz) | Delta/Theta band power increase |
| Spatial Distribution | Widespread, often maximal at forehead | Variable, can be global or channel-specific | Predominantly frontal and temporal |
| Common Triggers | Heat, stress, long recordings, physical exertion | Deep breathing, sleep, relaxed state | Swallowing, talking, patient restlessness |

Experimental Protocols for Artifact Investigation

Protocol for Simultaneous EEG and Respiration Monitoring

This protocol is designed to capture and characterize respiration artifacts for subsequent analysis or model training.

  • Participant Setup: Apply a standard EEG cap according to the international 10-20 system. Simultaneously, attach a respiratory belt transducer around the participant's abdomen or chest to record a reference respiration signal.
  • Data Acquisition:
    • Record EEG in a resting state with eyes open for 5 minutes, instructing the participant to breathe normally.
    • Follow with a 5-minute session where the participant is instructed to perform deep, paced breathing (e.g., 6 breaths per minute guided by a metronome). This exaggerates the artifact for clearer identification.
    • Finally, record a 5-minute session where the participant holds their breath for short periods (e.g., 20 seconds) interspersed with normal breathing. This creates a dynamic contrast.
  • Data Analysis:
    • Synchronize the EEG and respiration reference data streams.
    • Perform coherence analysis between the respiration signal and each EEG channel to identify channels most affected by respiratory rhythms.
    • Use the deep breathing and breath-hold epochs to train or validate algorithms, such as adaptive filters or deep learning models like CLEnet, which integrates temporal and morphological feature extraction [20].
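The coherence step in the analysis above can be sketched with standard spectral tools. The signals below are simulated, and a breathing rate of 0.25 Hz (15 breaths per minute) plus the contamination strength are assumed values:

```python
import numpy as np
from scipy.signal import coherence

# Find channels that share power with the respiration belt signal
# at the breathing frequency. All data simulated.
fs = 100
rng = np.random.default_rng(5)
t = np.arange(0, 300, 1/fs)                         # 5 minutes
respiration = np.sin(2 * np.pi * 0.25 * t)          # belt reference

# A channel contaminated by respiration, plus background noise
eeg_channel = 0.5 * respiration + rng.standard_normal(t.size)

f, cxy = coherence(respiration, eeg_channel, fs=fs, nperseg=4096)

breath_bin = np.argmin(np.abs(f - 0.25))
print(f"coherence at ~0.25 Hz: {cxy[breath_bin]:.2f}")
```

Channels whose coherence peaks at the breathing frequency (and its harmonics) are the ones to prioritize for correction or exclusion.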

Protocol for Inducing and Characterizing Glossokinetic Artifacts

This protocol systematically elicits tongue movements to map their EEG manifestations.

  • Participant Setup: Apply a high-density EEG cap (e.g., 32+ channels) for better spatial localization.
  • Task Design:
    • Baseline: 2 minutes of rest with tongue still.
    • Lateral Movements: Participant performs paced left-to-right tongue movements (e.g., touching cheeks) for 2 minutes.
    • Swallowing: Participant swallows on cue every 15 seconds for 2 minutes.
    • Articulation: Participant silently repeats specific syllables or words to engage subtle tongue motions.
  • Data Analysis:
    • Calculate the average voltage topography for each task and subtract the baseline average to highlight the artifact's spatial distribution.
    • Employ Blind Source Separation methods like ICA to isolate the glossokinetic component. The component's topography (fronto-temporal focus) and time-course (locked to movement cues) are key identifiers.
    • For single-channel or low-density wearable systems, deep learning approaches that are trained on such task data can be more effective than traditional methods [5].
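The ICA isolation step can be sketched with a generic FastICA implementation on simulated mixtures. The sources and mixing weights below are invented to mimic a fronto-temporal emphasis; real pipelines operate on full multi-channel recordings and reject the artifact component before reconstruction:

```python
import numpy as np
from sklearn.decomposition import FastICA

# Mix a slow "glossokinetic-like" source with a neural-like rhythm
# across 4 channels, then recover the sources with ICA.
rng = np.random.default_rng(6)
fs = 100
t = np.arange(0, 20, 1/fs)
tongue = np.sign(np.sin(2 * np.pi * 0.5 * t))        # slow lateralized shifts
alpha = np.sin(2 * np.pi * 10 * t)                   # neural-like rhythm
sources = np.vstack([tongue, alpha])

mixing = np.array([[1.0, 0.3],                       # fronto-temporal emphasis
                   [0.8, 0.5],
                   [0.2, 1.0],
                   [0.1, 0.9]])
eeg = mixing @ sources + 0.05 * rng.standard_normal((4, t.size))

ica = FastICA(n_components=2, random_state=0)
components = ica.fit_transform(eeg.T).T              # shape (2, n_samples)

# The recovered component most correlated with the tongue source is the
# artifact component to reject before reconstructing the EEG.
corrs = [abs(np.corrcoef(c, tongue)[0, 1]) for c in components]
print(f"max |correlation| with tongue source: {max(corrs):.2f}")
```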

The following workflow diagram illustrates the core steps for investigating these artifacts.

Experimental Setup → Apply EEG Cap & Reference Sensors → Record Baseline (Rest) → Induce Artifact (Stimulus/Task) → Record Contaminated EEG → Pre-process & Synchronize Data → Extract Features (Temporal, Spectral, Spatial) → Apply Analysis Method (ICA, Deep Learning, Filtering) → Validate Against Reference → Characterize/Remove Artifact → Clean EEG Signal

Methodologies for Detection and Removal

Traditional and Modern Signal Processing Approaches

A variety of algorithms are available for managing artifacts in EEG signals.

  • Filtering: High-pass filtering with a very low cutoff (e.g., 0.5 Hz or 1 Hz) can attenuate the slowest drifts caused by sweat. However, this approach is often ineffective for respiration and glossokinetic signals, as their frequency content overlaps critically with neural delta/theta activity, and can distort the genuine neural signal [19] [20].
  • Blind Source Separation (BSS): Methods like Independent Component Analysis (ICA) are widely used. ICA projects multi-channel EEG data into a space of statistically independent components, which can then be manually or automatically inspected and rejected if they represent an artifact [5] [19]. A key limitation is that BSS methods typically require a sufficient number of channels (e.g., >16) for effective decomposition, which can be a constraint for low-density wearable systems [5] [20].
  • Regression Methods: These rely on a recorded reference signal (e.g., from a respiration belt) to model and subtract the artifact's influence from the EEG. While effective, the need for additional hardware can increase complexity and cost [20].
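The high-pass option for slow sweat drift can be sketched with a zero-phase Butterworth filter at the 0.5 Hz cutoff mentioned above. The drift and alpha components are simulated, and the amplitudes are illustrative:

```python
import numpy as np
from scipy.signal import butter, filtfilt

# A sweat-like slow drift superimposed on a 10 Hz neural rhythm,
# attenuated with a zero-phase 0.5 Hz high-pass filter.
fs = 250
t = np.arange(0, 30, 1/fs)
drift = 50 * np.sin(2 * np.pi * 0.05 * t)            # slow drift, ~50 uV
alpha = 5 * np.sin(2 * np.pi * 10 * t)               # rhythm of interest
contaminated = drift + alpha

b, a = butter(2, 0.5, btype="highpass", fs=fs)
filtered = filtfilt(b, a, contaminated)              # zero-phase filtering

residual_drift = np.std(filtered - alpha)
print(f"drift std before: {np.std(drift):.1f}, after: {residual_drift:.3f}")
```

This works precisely because the drift sits well below the cutoff; as the text notes, the same filter fails for respiration or glossokinetic activity, whose power overlaps genuine delta/theta rhythms.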

Emerging Deep Learning Pipelines

Deep learning (DL) represents a paradigm shift in artifact handling, overcoming several limitations of traditional methods.

  • Capabilities: DL models, such as the dual-branch CLEnet which integrates dual-scale CNN and LSTM with an attention mechanism, can learn to separate artifacts from brain signals in an end-to-end manner without requiring reference signals or manual component selection [20]. These models are particularly promising for wearable EEG, where channel count is low and artifacts have specific features due to dry electrodes and subject mobility [5].
  • Performance: Studies show that models like CLEnet outperform traditional methods and other DL architectures in tasks involving the removal of mixed and unknown artifacts, demonstrating higher Signal-to-Noise Ratio (SNR) and lower temporal and spectral errors [20].
  • Workflow: The following diagram illustrates the typical workflow of a deep learning model for artifact removal.

Contaminated EEG Input → Feature Extraction (dual-scale CNN extracts morphological features) → Temporal Modeling (LSTM captures time dependencies) → Feature Enhancement (attention mechanism weights important features) → Signal Reconstruction (fully connected layers reconstruct clean EEG) → Cleaned EEG Output

Table 2: Performance Metrics of Different Artifact Removal Techniques on a Standardized Task (e.g., Mixed Artifact Removal)

| Technique | Category | Key Strength | Key Limitation | Reported SNR (dB) | Reported CC |
|---|---|---|---|---|---|
| High-Pass Filtering | Traditional | Simple to implement | Removes neural slow waves; ineffective for rhythmic artifacts | - | - |
| ICA | Traditional | Effective for many physiological artifacts | Requires many channels; often needs manual inspection | - | - |
| 1D-ResCNN [20] | Deep Learning | Multi-scale feature extraction | May not fully capture temporal context | 9.048* | 0.905* |
| NovelCNN [20] | Deep Learning | Optimized for specific artifacts (e.g., EMG) | Performance may drop for other artifact types | 10.108* | 0.906* |
| DuoCL [20] | Deep Learning | Combines CNN and LSTM for temporal features | Potential disruption of original temporal features | 11.224* | 0.899* |
| CLEnet [20] | Deep Learning | End-to-end; handles multi-channel/unknown artifacts; best all-around performance | Computational complexity | 11.498* | 0.925* |

* Example values from a mixed artifact (EMG+EOG) removal task on a semi-synthetic dataset. CC: average correlation coefficient [20].

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Materials and Tools for Advanced EEG Artifact Research

| Item / Reagent | Function in Research | Application Context |
|---|---|---|
| Dry Electrode EEG Systems [5] [21] | Enables EEG acquisition in real-world, mobile settings; reduces setup time but may be more susceptible to motion and sweat artifacts | Studying artifacts in ecological conditions; long-term monitoring |
| Auxiliary Sensors (IMU, Respiration Belt) [5] | Provides reference signals for motion (Inertial Measurement Unit) and respiration, crucial for validating detection algorithms | Ground truth data collection for model training and evaluation |
| Semi-Synthetic Benchmark Datasets [20] | Provides a controlled, ground-truthed environment by adding known artifacts to clean EEG, enabling fair algorithm comparison | Training and quantitative evaluation of deep learning models |
| ICA Software Packages (e.g., EEGLAB) | Implements traditional Blind Source Separation methods for component-based artifact rejection | Standard pipeline for artifact removal in research-grade, multi-channel EEG |
| Deep Learning Frameworks (e.g., TensorFlow, PyTorch) | Provides the environment to build, train, and deploy models like CNNs, LSTMs, and Transformers for artifact removal | Developing next-generation, automated artifact removal pipelines |

Sweat, respiration, and glossokinetic artifacts represent significant, yet manageable, obstacles in EEG research. A comprehensive understanding of their distinct biophysical origins and characteristic signatures is the foundation for effective artifact management. While traditional signal processing methods remain useful, the field is rapidly advancing toward sophisticated, data-driven solutions, particularly deep learning models. These modern pipelines offer the promise of robust, automated, and practical artifact handling, which is indispensable for leveraging the full potential of EEG in critical applications like clinical drug development and reliable brain-computer interfaces. As wearable EEG technology continues to expand into new real-world domains, the development of artifact management techniques that are specifically tailored to the challenges of low-density, mobile acquisition will be paramount.

In electroencephalography (EEG) research, an artifact is defined as any recorded signal that does not originate from neural activity within the brain [22] [19]. These unwanted signals represent a fundamental challenge for researchers, scientists, and drug development professionals, as they can severely compromise data integrity and lead to significant clinical misinterpretations. Artifacts are legion and pervasive in EEG recordings, and the interpreter must always beware of the possibility that a waveform in question may be non-cerebral in origin [23]. The core problem stems from the inherent nature of EEG signals, which are typically measured in microvolts and are consequently highly susceptible to contamination from various physiological and non-physiological sources [19]. This contamination can distort or mask genuine neural signals, reducing data quality and potentially leading to erroneous conclusions in both research and clinical settings.

Artifacts are broadly categorized into two main types: physiological artifacts, which originate from the patient's own body (such as eye movements, muscle activity, or cardiac signals), and non-physiological artifacts, which result from external electrical phenomena or recording devices in the environment [23] [19]. The following diagram illustrates the primary artifact categories and their respective sources:

Figure 1: Classification of Common EEG Artifacts

  • Physiological Artifacts: Eye Movements, Muscle Activity, Cardiac Activity, Sweat/Respiration
  • Non-Physiological Artifacts: Electrode Issues, Environmental Interference, Movement Artifacts, Device Artifacts

The identification and management of artifacts becomes particularly crucial in the context of wearable EEG systems, which are increasingly used in both clinical monitoring and pharmaceutical trials. These systems face specific challenges including dry electrodes, reduced scalp coverage, and subject mobility, which can exacerbate artifact-related issues [5]. Furthermore, the expansion of EEG into novel applications such as exergaming—where participants engage in physical activity while EEG is recorded—introduces additional complexities from large body movements that can severely impede signal quality [24].

Physiological Artifacts: Origins and Characteristics

Ocular Artifacts

Ocular artifacts represent one of the most common sources of contamination in EEG recordings. These artifacts arise from the corneo-retinal potential difference, where the cornea is positively charged relative to the negatively charged retina, creating an electric dipole [22] [19]. During eye blinks—governed by Bell's Phenomenon—the eyes roll upward, bringing the corneal positive charge closer to frontal electrodes (Fp1 and Fp2), which consequently record a positive deflection [22]. These blinks manifest as high-amplitude waveforms in the bifrontal regions with a typical amplitude of 100-200 µV, often an order of magnitude larger than genuine EEG signals [22] [19].

Lateral eye movements produce a different signature characterized by opposing polarities in the F7 and F8 leads. When looking to the right, the right cornea moves closer to F8 (producing a positive charge), while the left retina moves closer to F7 (producing a negative charge) [22]. The reverse pattern occurs when looking to the left. In bipolar montages, this creates characteristic phase reversals that can be identified by trained interpreters. Critically, ocular artifacts should be confined to frontal regions without significant spread to posterior areas, helping distinguish them from cerebral activity such as frontal spike and waves [22].
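Given the amplitude disparity described above, even a crude amplitude threshold on a frontal channel detects blinks reasonably well. The sketch below uses simulated data with invented blink shapes and an arbitrary 75 µV threshold; production pipelines use more robust criteria (ICA, template matching):

```python
import numpy as np

# Simple threshold-based blink detection on a simulated frontal channel:
# background EEG ~10 uV, blink deflections ~150 uV.
fs = 250
rng = np.random.default_rng(7)
t = np.arange(0, 10, 1/fs)
eeg = 10 * rng.standard_normal(t.size)               # background activity

# Insert three blink-like deflections (~150 uV peak, ~300 ms wide)
blink = 150 * np.hanning(int(0.3 * fs))
blink_onsets = [500, 1200, 2000]
for onset in blink_onsets:
    eeg[onset:onset + blink.size] += blink

# Light smoothing suppresses chattering around the threshold crossing
smooth = np.convolve(eeg, np.ones(20) / 20, mode="same")
threshold = 75                                       # uV, between EEG and blink
above = np.abs(smooth) > threshold

# Count contiguous supra-threshold runs (rising edges) as detected blinks
n_blinks = int(np.sum(np.diff(above.astype(int)) == 1))
print(f"detected blinks: {n_blinks}")
```

The method works here only because blinks dwarf the background; lateral eye movements and smaller saccadic potentials require the topographic criteria described in the text.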

Muscle and Movement Artifacts

Muscle artifacts (EMG) originate from contractions of various muscle groups, particularly the frontalis and temporalis muscles, producing high-frequency, broadband noise that overlaps with important EEG rhythms [22] [19]. This artifact typically appears as high-frequency, often low-amplitude activity overlying normal cerebral rhythms, most prominent in awake states [22]. Muscle artifacts are especially problematic as they dominate the beta (13-30 Hz) and gamma (>30 Hz) frequency ranges, potentially masking important cognitive and motor activity signals [19].

Chewing artifact represents a specific form of muscle artifact originating from the temporalis muscle, characterized by sudden onset, intermittent bursts of generalized very fast activity [22]. Similarly, hypoglossal (tongue) movement artifact appears as slower, diffuse delta frequency activity that is reproducible and can be elicited by asking the patient to say "la la la" or perform other lingual movements [22]. These movements can be distinguished from true cerebral activity by their highly organized, reproducible nature and lack of evolutionary patterns characteristic of seizures.

Other Physiological Artifacts

Cardiac artifacts include both ECG artifact, marked by waveforms time-locked to the QRS complex (often more prominent on the left side due to heart position), and cardioballistic artifact, where EEG electrodes placed near arteries pick up pulsatile motion artifacts [22]. These artifacts present as rhythmic waveforms recurring at the heart rate, often in central or neck-adjacent channels [19].

Sweat artifact results from the sodium chloride in sweat carrying a charge that is detected by EEG electrodes, producing very slow (typically <0.5 Hz), relatively low-amplitude activity that can be bilateral, unilateral, or focal [22] [19]. This artifact contaminates the delta and theta bands, potentially impairing sleep studies and low-frequency cognitive assessments [19]. Respiration artifacts arise from chest and head movements during breathing, creating slow waveforms synchronized with respiration rate (typically 12-20 breaths per minute) that mainly affect low-frequency bands [19].

Consequences for Data Analysis and Research Applications

Impact on Data Quality and Analytical Outcomes

The presence of artifacts in EEG data introduces significant challenges for quantitative analysis and can severely compromise research outcomes, particularly in drug development and clinical trials. Artifacts reduce the signal-to-noise ratio (SNR) of EEG recordings, potentially obscuring genuine neural signals of interest and introducing spurious findings [19]. This is particularly problematic when investigating drug effects on neural oscillations, where artifact contamination can mimic or mask true pharmacological effects on brain activity.

In wearable EEG systems, which are increasingly used in ecological monitoring and pharmaceutical trials, artifacts exhibit specific features due to dry electrodes, reduced scalp coverage, and subject mobility [5]. The table below summarizes the quantitative impacts of artifacts on EEG data quality and analysis:

Table 1: Impact of Artifacts on EEG Data Analysis in Research Settings

| Artifact Type | Frequency Range Affected | Amplitude Range | Impact on Data Analysis |
| --- | --- | --- | --- |
| Ocular Artifacts | Delta/Theta (0.5-8 Hz) | 100-200 µV | Masks cognitive processes in low frequencies; corrupts frontal channels |
| Muscle Artifacts | Beta/Gamma (13-300 Hz) | Variable, often high | Obscures cognitive/motor activity; reduces validity of connectivity measures |
| Cardiac Artifacts | Multiple bands | Relatively low | Introduces rhythmic confounds; affects heart-rate variability correlations |
| Sweat Artifacts | Delta (<0.5 Hz) | Low amplitude | Compromises slow potential studies; affects sleep and resting-state analysis |
| Electrode Pop | Broadband | High amplitude | Creates channel-specific outliers; disrupts topographic mapping |

The challenges are particularly pronounced in emerging research applications such as exergaming studies, where motion artifacts from large body movements can lead to significant data loss if not properly addressed [24]. In such paradigms, accurately quantifying data loss due to artifacts becomes essential because large portions of EEG data may be discarded, leading to reduced sample sizes or biased results [24].

Methodological Implications for Artifact Management

Current approaches to artifact management in research settings typically integrate detection and removal phases, though these stages are rarely separated when assessing performance metrics [5]. The most frequently used techniques include wavelet transforms, Independent Component Analysis (ICA), and thresholding methods, with deep learning approaches emerging as promising solutions, particularly for muscular and motion artifacts [5]. A systematic review of artifact detection methods found that accuracy (reported in 71% of studies) and selectivity (63%) are the most commonly reported performance metrics when a clean signal is available as a reference [5].

Recent advances in unsupervised artifact detection have demonstrated the potential for patient- and task-specific approaches that extract clinically relevant features and apply ensemble outlier detection algorithms to identify artifacts unique to a given task and subject [25]. Such methods have shown relative improvements of up to 10% in classification performance when compared to non-corrected data [25]. The following workflow illustrates a modern, automated approach to EEG artifact detection and correction:

[Figure 2: Automated Artifact Detection and Correction Workflow. Raw EEG Data → Feature Extraction (58 Clinical Features) → Ensemble Outlier Detection → Artifact Identification. Clean EEG epochs pass directly into the final processed EEG; artifact-corrupted epochs are reconstructed by a deep encoder-decoder network, and the corrected data are then merged into the final processed EEG.]

A significant challenge in artifact management is that the definition of what constitutes an "artifact" often depends on the specific research task at hand. A given EEG segment may be considered an artifact if it impacts the performance of downstream analytical methods by manifesting as uncorrelated noise in a feature space relevant to those methods [25]. For instance, muscle movement signatures may confound coma-prognostic classification but serve as useful features for sleep stage identification [25].
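As a concrete illustration of the ensemble idea described above, the sketch below has three scikit-learn outlier detectors vote over a per-epoch feature matrix. The synthetic feature values, contamination rate, and detector choices are assumptions for demonstration only, not the published 58-feature pipeline:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.covariance import EllipticEnvelope
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
# Per-epoch feature matrix standing in for clinically motivated features
# (e.g. variance, line length, kurtosis): 200 clean epochs + 10 artifacts
clean = rng.normal(0.0, 1.0, size=(200, 5))
artifacts = rng.normal(6.0, 1.0, size=(10, 5))
features = np.vstack([clean, artifacts])

# Ensemble of unsupervised outlier detectors; an epoch is flagged as an
# artifact when a majority (2 of 3) of detectors votes "outlier" (-1)
detectors = [
    IsolationForest(contamination=0.05, random_state=0),
    EllipticEnvelope(contamination=0.05, random_state=0),
    LocalOutlierFactor(n_neighbors=20, contamination=0.05),
]
votes = np.zeros(len(features), dtype=int)
for det in detectors:
    votes += det.fit_predict(features) == -1
flagged = np.where(votes >= 2)[0]   # indices of artifact-corrupted epochs
```

The majority vote makes the decision robust to any single detector's failure mode, which is the core appeal of ensembles when no labeled ground truth exists.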

Clinical Consequences and Misinterpretation Risks

Diagnostic Challenges and Misinterpretation

In clinical settings, EEG artifacts present substantial risks for misinterpretation, potentially leading to false diagnoses and inappropriate treatments. Artifacts can mimic true epileptiform abnormalities or seizures, particularly for less experienced interpreters [22] [26]. The consequences can be severe, including unnecessary administration of antiseizure medications, extended hospital stays, and inappropriate escalation of care.

In intensive care unit (ICU) settings, where continuous video EEG (cvEEG) is increasingly used for seizure detection in critically ill patients, physiological artifacts and device-related artifacts can closely mimic epileptic seizures [26]. One study demonstrated that only 27% of abnormal motor events in critically ill patients were true seizures, with the remainder being tremor-like movements, myoclonus without electrographic changes, or other abnormal movements [26]. The following table outlines common artifact types and their potential clinical misinterpretations:

Table 2: Clinical Misinterpretation Risks of Common EEG Artifacts

| Artifact Type | Typical EEG Appearance | Potential Misinterpretation | Clinical Risk |
| --- | --- | --- | --- |
| Eye Blinks | High-amplitude frontal positive deflections | Frontal spike and waves, anterior predominant generalized spike and waves | False diagnosis of epilepsy; inappropriate medication |
| Chewing Muscle Artifact | Bursts of generalized very fast activity | Generalized periodic fast activity, ictal patterns | Misdiagnosis of seizure activity; treatment escalation |
| Lateral Eye Movements | Phase reversals at F7/F8 | Focal temporal seizure activity | Incorrect lateralization of seizure focus |
| ECG Artifact | Rhythmic waveforms time-locked to QRS complex | Periodic discharges, epileptiform activity | False positive for ictal patterns; unnecessary intervention |
| Electrode Pop | Sudden discharge with steep upslope in single electrode | Focal epileptiform discharge | Incorrect localization of epileptogenic zone |
| Pacemaker/Device Artifact | Highly periodic, stereotyped waveforms | Electrographic seizures, periodic discharges | Misdiagnosis of nonconvulsive status epilepticus |

Device-related artifacts present particular challenges in hospital environments. Implantable devices such as vagus nerve stimulators (VNS), deep brain stimulators (DBS), and responsive neurostimulators (RNS) can produce rhythmic, highly periodic patterns that may be mistaken for electrographic seizures [26]. These artifacts often display features that can help distinguish them from true cerebral activity, including perfect periodicity, highly stereotyped or monomorphic waveforms, absence of a physiological electric field, and failure to localize to physiologic brain regions [26].

Case Examples from Clinical Practice

In clinical practice, distinguishing artifacts from true cerebral activity requires careful attention to contextual factors and EEG characteristics. A case series from ICU settings highlights several instructive examples [26]. One patient with B-cell lymphoma presented with altered mental status and 1.5-2 Hz generalized periodic discharges (GPDs) on EEG, raising concern for non-convulsive status epilepticus. However, the patient also exhibited continuous large-amplitude rhythmic movements in the left upper extremity that were not consistently time-locked to the GPDs. After benzodiazepine administration, the GPDs resolved but the movements persisted, indicating they represented non-epileptic movements rather than seizure activity [26].

Another case involved a patient with drug-resistant epilepsy and neuromodulation devices who displayed rhythmic activity in posterior regions occurring in a highly periodic pattern every 5 minutes. While initially concerning for breakthrough seizure activity, further analysis revealed the pattern was stereotyped, monomorphic, lacked consistent spatial field, and showed no temporal or spatial evolution—all features suggesting an artifact from the patient's neuromodulation devices rather than true seizure activity [26].

These cases underscore the importance of maintaining a broad differential diagnosis and avoiding diagnostic anchoring when interpreting EEG studies, particularly in complex clinical environments like the ICU where multiple artifact sources coexist [26]. Video correlation can be invaluable in such scenarios, as the artifact source (chewing, rhythmic patting, chest percussion) is often visible and time-locked with suspicious EEG discharges [26].

Methodologies and Experimental Protocols for Artifact Management

Detection and Identification Protocols

Effective artifact management begins with robust detection methodologies. Current approaches range from manual visual inspection to automated computational methods. Visual inspection by experienced EEG technologists and interpreters remains a common practice, particularly in clinical settings, where identification relies on recognizing characteristic waveforms, distributions, and timing patterns [22] [23]. However, this approach is time-consuming and subject to interpreter variability.

Quantitative evaluation protocols are critical for developing algorithms that optimally remove artifacts from real EEG data [27]. One novel approach proposes a "rating-by-detection" protocol that computes average artifact duration, measuring the recovered EEG's deviation from modeled background activity with a single score [27]. This method enables reliable comparisons between artifact filtering configurations despite the missing ground-truth neural signals [27].

For wearable EEG systems, artifact detection pipelines must address specific challenges including low-density configurations and motion-related artifacts [5]. Wavelet transforms and Independent Component Analysis (ICA), often using thresholding as a decision rule, are among the most frequently used techniques for managing ocular and muscular artifacts [5]. Meanwhile, Artifact Subspace Reconstruction (ASR)-based pipelines are widely applied for ocular, movement, and instrumental artifacts [5].

Removal and Correction Techniques

Once detected, multiple strategies exist for addressing artifacts in EEG data. Simple rejection involves removing contaminated epochs from analysis, though this approach can lead to significant data loss, particularly in paradigms with frequent artifacts [24]. More sophisticated correction techniques aim to preserve neural signals while removing artifactual components.

Independent Component Analysis (ICA) remains a popular method for artifact removal that separates EEG signals into statistically independent components, allowing for identification and removal of artifactual sources [24] [25]. However, ICA has limitations, particularly when the number of channels is low, as it can only extract as many independent components as there are channels [25]. Additionally, ICA typically requires manual review by experts to classify components as signal or noise [25].

Regression-based techniques predict and subtract the contribution of artifacts to the signal using mathematical models, particularly effective for ocular artifacts [24]. Deep learning approaches are emerging as powerful alternatives, especially for muscular and motion artifacts, with promising applications in real-time settings [5]. These include convolutional auto-encoder approaches that learn task- and subject-specific interpolation in a self-supervised manner without human annotation [25].

The Research Toolkit: Essential Solutions for Artifact Management

Table 3: Research Reagent Solutions for EEG Artifact Management

| Solution Type | Specific Examples | Function | Application Context |
| --- | --- | --- | --- |
| Signal Processing Algorithms | Independent Component Analysis (ICA), Wavelet Transforms | Separate neural signals from artifactual sources | Research settings with sufficient channel density |
| Automated Detection Tools | Ensemble outlier detection, Deep CNN-LSTM models | Identify artifacts based on feature anomalies | High-throughput studies; wearable EEG systems |
| Reference Sensors | EOG, ECG, EMG sensors | Provide reference signals for artifact regression | Controlled research environments; detailed mechanism studies |
| Source Separation | Principal Component Analysis (PCA), ICA | Decompose signals into neural and non-neural components | Preprocessing pipeline for quantitative EEG analysis |
| Hardware Solutions | Shielded cables, active electrodes, impedance monitoring | Reduce environmental interference and electrode artifacts | Mobile EEG; studies in electrically noisy environments |
| Validation Tools | Simultaneous EEG-fMRI, intracranial recordings | Provide ground truth for artifact removal validation | Method development; validation studies |

Recent advances in unsupervised artifact detection and correction provide flexible end-to-end frameworks that can be applied to novel EEG data without expert supervision [25]. These methods extract numerous clinically relevant features and apply ensembles of unsupervised outlier detection algorithms to identify EEG artifacts unique to a given task and subject [25]. The identified artifact segments can then be processed through deep encoder-decoder networks for unsupervised artifact correction, framing the problem as a "frame-interpolation" task where missing or corrupted segments are reconstructed from clean surrounding data [25].

A critical consideration in selecting artifact management approaches is the trade-off between preserving brain signals and removing noise [24]. This balance depends on the specific research questions, the types of artifacts present, and the analytical methods being employed. For instance, in studies focusing on high-frequency neural activity, more aggressive muscle artifact removal might be necessary, while in studies of slow cortical potentials, different approaches would be prioritized.

EEG artifacts represent a fundamental challenge with far-reaching consequences for both data analysis and clinical interpretation. These non-cerebral signals can profoundly impact data quality, potentially leading to erroneous research conclusions and clinical misdiagnoses. The risks are particularly pronounced in complex environments such as intensive care units and in emerging applications like wearable EEG and exergaming, where artifact sources are abundant and varied.

Effective artifact management requires a multifaceted approach combining sophisticated detection methodologies with appropriate correction techniques tailored to specific research or clinical contexts. While automated methods show increasing promise, the critical role of expert interpretation remains, particularly in distinguishing subtle cerebral patterns from sophisticated artifacts. As EEG technology continues to evolve and expand into new applications, developing robust, validated approaches to artifact management will remain essential for ensuring the validity and reliability of both research findings and clinical diagnoses.

For researchers, scientists, and drug development professionals, a thorough understanding of artifact types, their consequences, and management strategies is not merely technical detail but fundamental to producing rigorous, reproducible science and ensuring patient safety in clinical applications.

EEG Artifact Removal Techniques: From Traditional Algorithms to State-of-the-Art Deep Learning

Electroencephalography (EEG) signals are invariably contaminated by potentials of non-cerebral origin, with electrooculographic (EOG) and electrocardiographic (ECG) artifacts representing two of the most pervasive challenges in neurophysiological data analysis. These artifacts originate from biological sources: EOG artifacts arise from eye movements and blinks due to the corneo-retinal dipole, while ECG artifacts are generated by the electrical activity of the heart muscle. Their high amplitude relative to cortical signals and broad spectral overlap with neural activity of interest make them particularly problematic for EEG interpretation and analysis.

Regression-based methods represent a foundational approach for correcting these artifacts by leveraging separately recorded reference channels. These techniques operate on a simple but powerful principle: record the artifact source directly using dedicated EOG/ECG electrodes, mathematically model its propagation to EEG electrodes, and subtract this modeled contamination from the recorded signals. The robustness and computational efficiency of these methods have maintained their relevance despite the development of more complex approaches like Independent Component Analysis (ICA), particularly in contexts with limited channel counts or requirements for real-time processing.

Theoretical Foundations and Mathematical Formulation

Core Regression Model

The fundamental assumption underlying regression-based artifact correction is that the recorded EEG signal represents a linear superposition of true cerebral activity and propagated artifact signals. This relationship is mathematically expressed as:

Y(t, ch) = S(t, ch) + A(t) × B(ch)

Where:

  • Y(t, ch) is the recorded value of EEG channel ch at time t
  • S(t, ch) is the true cerebral source signal without artifact contamination
  • A(t) represents the artifact source time series (from EOG/ECG reference channels)
  • B(ch) represents the weighting coefficients (regression coefficients) quantifying how strongly the artifact propagates to each EEG channel

The primary goal of regression correction is to obtain an accurate estimate of B(ch), then compute the cleaned signal as: S(t, ch) = Y(t, ch) - A(t) × B(ch).
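This estimation-and-subtraction step can be sketched directly with ordinary least squares in numpy; the mixing weights and noise level below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_times, n_ch = 5000, 4
# Synthetic example of the generative model Y = S + A x B
artifact = rng.standard_normal(n_times)              # A(t): EOG/ECG trace
true_b = np.array([0.9, 0.5, 0.2, 0.05])             # B(ch): propagation
brain = 0.3 * rng.standard_normal((n_times, n_ch))   # S(t, ch): neural part
eeg = brain + np.outer(artifact, true_b)             # Y(t, ch): recorded EEG

# Least-squares estimate of B(ch): regress each channel on the artifact
A = artifact[:, np.newaxis]
b_hat, *_ = np.linalg.lstsq(A, eeg, rcond=None)      # shape (1, n_ch)

# Cleaned signal: S_hat(t, ch) = Y(t, ch) - A(t) x B_hat(ch)
cleaned = eeg - A @ b_hat
```

Because the estimate is ordinary least squares, the cleaned residual is orthogonal to the artifact trace by construction; accuracy of `b_hat` then depends on the reference channel being clean and the mixing being stationary.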

Key Physiological and Physical Assumptions

The validity of regression-based correction rests on several critical assumptions about the nature of physiological artifacts:

  • Linearity: The volume conduction of artifacts from source to recording electrodes follows a linear model, which is generally valid for electrical signals propagating through biological tissues.
  • Stationarity: The weighting coefficients B(ch) remain constant throughout the recording session, implying stable electrical properties of tissues and fixed spatial relationships between artifact sources and EEG electrodes.
  • Adequate Reference Signals: The EOG/ECG reference channels must capture the essential spatial dimensions of the artifact. For EOG, this typically requires three spatial components (horizontal, vertical, and radial) to fully characterize ocular artifacts, while ECG generally requires a single reference channel adequately capturing the cardiac electrical activity.

Table 1: Spatial Characteristics of Physiological Artifacts

| Artifact Type | Primary Sources | Spatial Distribution on Scalp | Recommended Reference Channels |
| --- | --- | --- | --- |
| EOG Artifacts | Corneo-retinal dipole movement (blinks, saccades) | Primarily frontal regions, attenuating with distance | Horizontal EOG (bipolar outer canthi), Vertical EOG (bipolar above/below eye), Radial EOG |
| ECG Artifacts | Cardiac electrical activity (QRS complex) | Variable distribution, often posterior or temporal regions | Single bipolar ECG channel (e.g., lead II) |

Experimental Implementation and Protocols

Data Acquisition Requirements

Successful regression-based correction begins with proper experimental setup and data acquisition:

  • Electrode Placement: For EOG correction, a minimum of three EOG electrodes is recommended to capture the complete spatial profile of ocular artifacts. For ECG, standard limb leads or chest placements provide adequate reference signals.
  • Synchronized Recording: All EEG and reference channels must be recorded simultaneously on the same acquisition system with precise temporal alignment.
  • Impedance Management: Maintain low and stable electrode impedances (<50 kΩ for high-density EEG systems) to ensure high-quality signals and stable regression estimates.
  • Recording Duration: Include dedicated calibration periods (2-3 minutes) where subjects perform standardized eye movements (blinks, saccades) to facilitate robust estimation of regression coefficients.

Core Regression Workflow

The following diagram illustrates the complete workflow for regression-based artifact correction:

[Workflow diagram: Raw EEG + Reference Signals → Preprocessing (0.3-40 Hz filtering, re-referencing, baseline correction) → Estimate Regression Coefficients (B) → Apply Correction S(t) = Y(t) - A(t)×B → Validation & Quality Control → Artifact-Reduced EEG.]

Regression-Based Artifact Correction Workflow

Practical Implementation Using MNE-Python

The MNE-Python ecosystem provides robust implementations of regression-based correction methods. The following code example demonstrates the essential steps:

This implementation demonstrates the standard approach, but several methodological variations exist that can enhance performance:

  • Gratton et al. Method: Computing regression coefficients on epoch data with the evoked response subtracted to focus on noise components.
  • Croft & Barry Method: Using dedicated blink-onset epochs to create an evoked blink response for regression, amplifying the artifact signal relative to neural activity.

Table 2: Regression Method Variations and Applications

| Method Variation | Key Innovation | Best Suited Applications |
| --- | --- | --- |
| Standard Regression | Direct estimation from continuous data | General-purpose artifact correction |
| Gratton Method | Evoked response subtraction before regression | Event-related potential studies |
| Croft & Barry Method | Regression on evoked blink/saccade responses | Data with pronounced ocular artifacts |

Performance Validation and Quantitative Results

Efficacy Metrics and Expert Validation

Regression-based methods have been quantitatively validated through both automated metrics and expert evaluation:

  • Expert Blind Scoring: In a rigorous validation study, independent expert scorers identified EOG artifacts in 5.9% of raw data segments, with regression correction successfully addressing 4.7% of these contaminated segments. Post-correction, experts identified only 1.9% of data as containing residual artifacts that went undetected in uncorrected data [28].

  • Artifact Reduction Rate: The same study reported an 80% overall reduction in EOG artifacts following regression-based correction, demonstrating substantial cleanup of contaminated segments while preserving cerebral activity [28].

  • Spectral Preservation: Performance can be quantified using changes in power spectral density (ΔPSD) across standard frequency bands after artifact suppression. Lower ΔPSD values indicate less distortion of underlying cerebral activity [29].

Comparative Performance Analysis

Regression methods are most effective for EOG artifacts, which propagate to EEG electrodes through volume conduction in a manner well-captured by linear models. However, their performance for ECG artifacts is more limited because the cardiac vector represents a rotating dipole whose temporal dynamics are not adequately captured by a single reference channel [30]. For ECG contamination, alternative approaches like ICA or SSP generally yield superior results.

Table 3: Quantitative Performance of Regression Methods

| Performance Metric | EOG Artifact Reduction | ECG Artifact Reduction | Data Loss |
| --- | --- | --- | --- |
| Expert-Rated Efficacy | 80% artifact reduction [28] | Limited, not recommended [30] | None |
| Spectral Distortion (ΔPSD) | Minimal when properly applied [29] | Moderate to high | None |
| Temporal Signal Integrity | High preservation of neural dynamics | Variable, often poor | None |
| Comparative Performance | Superior for low-channel counts [28] | Inferior to ICA/SSP [30] | Superior to rejection methods |

Integration with Broader Artifact Correction Framework

The Scientist's Toolkit: Essential Research Materials

Table 4: Essential Research Reagents and Solutions for Regression-Based Methods

| Item Name | Specifications | Function in Experiment |
| --- | --- | --- |
| EEG Recording System | 64+ channels, 24-bit resolution, synchronized auxiliary inputs | Simultaneous acquisition of EEG and reference signals |
| EOG Electrodes | 3+ dedicated channels (horizontal, vertical, radial) | Capture spatial profile of ocular artifacts |
| ECG Electrodes | Single bipolar channel (lead II configuration) | Record cardiac electrical activity |
| Electrode Cap | Standard 10-20 or extended 10-5 system | Consistent scalp electrode placement |
| Conductive Gel/Paste | Low impedance, long-term stability | Ensure high-quality signal acquisition |
| Calibration Stimuli | Visual targets for saccades, blink prompts | Generate artifact-rich data for coefficient estimation |

Comparison with Alternative Artifact Correction Methods

Regression-based approaches occupy a specific niche in the broader ecosystem of artifact correction methods. The following diagram situates regression in relation to other common approaches:

[Diagram: taxonomy of artifact correction methods. Regression-Based (uses reference signals; linear subtraction; preserves data), Independent Component Analysis (blind source separation; identifies artifact components; requires many channels), Signal Space Projection (spatial projection; removes artifact components; reduces rank), Wavelet-Based Methods (time-frequency decomposition; targeted coefficient removal; single-channel capability), Adaptive Filtering (reference-based; dynamic coefficient adjustment; higher computational complexity).]

Positioning Regression Among Artifact Correction Methods

Each method presents distinct advantages and limitations. Regression excels in scenarios with limited channel counts, requirements for computational efficiency, or needs for complete data preservation. However, it depends critically on high-quality reference signals and assumes a linear propagation model. In contrast, ICA can separate artifacts without reference signals but requires higher channel counts and may inadvertently remove neural activity when discarding components.

Limitations and Best Practices

Methodological Constraints and Considerations

While regression-based methods offer significant advantages, researchers must consider several limitations:

  • Bidirectional Contamination: EOG reference channels also contain cerebral activity, particularly from frontal regions, potentially leading to over-correction and removal of genuine neural signals [29].

  • Stationarity Assumption: While generally valid for short recordings, regression coefficients may drift during extended sessions, requiring periodic re-estimation.

  • Inadequate ECG Modeling: The rotating dipole nature of cardiac activity limits regression effectiveness for ECG artifacts compared to spatial methods like ICA or SSP [30].

  • Reference Signal Quality: Method efficacy depends entirely on clean, well-recorded reference signals free from other contaminating sources.

Recommendations for Implementation

Based on empirical validation and practical experience:

  • Always inspect reference channels for quality before applying regression correction.
  • Include dedicated calibration periods with directed eye movements to improve coefficient estimation.
  • Validate correction efficacy through both visual inspection and quantitative metrics for each dataset.
  • Apply appropriate filtering (0.3-40 Hz bandpass) consistently to both EEG and reference channels before regression.
  • Consider hybrid approaches for ECG artifacts, where regression may serve as an initial processing step before more sophisticated methods.
  • Re-apply baseline correction after regression to account for potential DC shifts introduced during the correction process.

When properly implemented with attention to these considerations, regression-based methods provide a computationally efficient, robust approach for physiological artifact reduction that preserves data integrity and maintains statistical power by avoiding data rejection.

The interpretation of Electroencephalography (EEG) data is fundamentally complicated by the presence of physiological artifacts. These unwanted signals, originating from non-cerebral sources such as eye movements, muscle activity, and cardiac rhythms, can obscure genuine brain activity and lead to erroneous conclusions in both clinical and research settings [1]. In the context of pharmacological research, or pharmaco-EEG, the accurate assessment of drug effects on the central nervous system is highly dependent on clean EEG data [4]. Blind Source Separation (BSS) has emerged as a powerful framework for addressing this challenge. As a special case of BSS, Independent Component Analysis (ICA) provides a computational method for isolating and removing artifacts by separating a multivariate signal into additive, statistically independent subcomponents [31] [32]. This technical guide details the principles of ICA and provides a comprehensive overview of its application for artifact correction in physiological EEG research.

Core Mathematical Principles of ICA

The Generative Model

ICA is based on a linear generative model. It assumes that the observed multi-channel EEG data, represented as a vector x = [x₁(t), x₂(t), …, xₙ(t)]ᵀ, is a linear mixture of underlying source signals s = [s₁(t), s₂(t), …, sₙ(t)]ᵀ [31] [32]. The model is expressed as:

x = A s

Here:

  • x is the n × 1 vector of observed EEG signals at time t.
  • A is the n × n mixing matrix, which is unknown and represents the conductivity properties of the head volume conductor.
  • s is the n × 1 vector of underlying independent components (ICs), which include both cerebral and artifactual sources [31].

The goal of ICA is to find an unmixing matrix W such that:

s = W x

This equation yields the estimated independent components s. When successful, W is approximately the inverse of A (W ≈ A⁻¹) [32].
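The mixing and unmixing relationship can be demonstrated with scikit-learn's `FastICA` on synthetic non-Gaussian sources; the sources and mixing matrix below are invented for illustration, and recovery holds only up to permutation and scaling:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n = 20000
# Two non-Gaussian sources s: a Laplacian series and a uniform series
s = np.vstack([rng.laplace(size=n), rng.uniform(-1.0, 1.0, size=n)])
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])          # mixing matrix, unknown in practice
x = A @ s                           # observed "EEG": x = A s

ica = FastICA(n_components=2, whiten='unit-variance', max_iter=1000,
              random_state=0)
s_hat = ica.fit_transform(x.T).T    # estimated components: s_hat = W x
W = ica.components_                 # estimated unmixing matrix

# Up to permutation and sign/scale, each recovered component should match
# one true source; check via absolute correlations
corr = np.abs(np.corrcoef(np.vstack([s, s_hat]))[:2, 2:])
```

Note that the row order and signs of `s_hat` are arbitrary, which is exactly the permutation and scale ambiguity discussed above.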

Statistical Assumptions and Identifiability

The identifiability of the true source signals relies on two key statistical assumptions [31] [32]:

  • Statistical Independence: The components sᵢ are mutually statistically independent. This means the value of any one component provides no information about the value of any other.
  • Non-Gaussianity: The components sᵢ must have non-Gaussian (non-normal) probability distributions; at most one source may be Gaussian.

These assumptions are crucial because, for Gaussian distributions, uncorrelatedness implies independence. Since methods like Principal Component Analysis (PCA) only decorrelate data, they are insufficient for blind source separation. ICA uses higher-order statistics to achieve independence, which is a stronger condition than uncorrelatedness [31]. The model is identifiable under these conditions, albeit with unavoidable ambiguities: the order (permutation) and the scale (amplitude and sign) of the recovered sources cannot be uniquely determined [32].

Preprocessing: Centering and Whitening

For numerical stability and efficiency, ICA is typically preceded by two preprocessing steps:

  • Centering: Subtracting the mean from each channel to create a zero-mean signal.
  • Whitening (or Sphering): Linearly transforming the data so that its components become uncorrelated and have unit variance. If Z is the whitened data with T samples, its sample covariance satisfies Z Zᵀ / T = I, where I is the identity matrix [31] [32]. Whitening reduces the number of parameters to be estimated by constraining the unmixing matrix to be orthogonal.
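These two preprocessing steps can be sketched in a few lines of numpy on synthetic two-channel data (the sources and mixing are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10000
# Two mixed channels built from non-Gaussian sources
s = np.vstack([rng.laplace(size=n), rng.uniform(-1.0, 1.0, size=n)])
x = np.array([[1.0, 0.5], [0.3, 1.0]]) @ s

# Centering: subtract each channel's mean
x = x - x.mean(axis=1, keepdims=True)

# Whitening: eigendecomposition of the sample covariance matrix
cov = x @ x.T / n
evals, evecs = np.linalg.eigh(cov)
whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
z = whitener @ x                     # whitened data: z z^T / n = I
```

After whitening, the remaining unmixing transformation is a rotation, which is why the step shrinks the ICA search space.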

ICA for EEG Artifact Removal: A Practical Workflow

The following section outlines a standard protocol for using ICA to remove artifacts from continuous EEG data.

Experimental Setup and Data Acquisition

Objective: To acquire EEG data suitable for ICA decomposition and subsequent artifact removal. Materials: An EEG system with appropriate amplifiers and electrodes [33]. Protocol:

  • Apply electrodes according to the international 10-20 system or other relevant montages. A sufficient number of channels (e.g., 19 or more) is recommended for effective ICA [34].
  • Record EEG data under the desired experimental conditions (e.g., resting state, task-based). Include dedicated electrooculography (EOG) and electrocardiography (ECG) channels if possible, as they serve as valuable references for identifying artifact-related components [4] [34].
  • Export the continuous, multi-channel EEG data for preprocessing.

Signal Preprocessing

Objective: To prepare the raw EEG data for ICA by reducing noise and standardizing the signal.

Protocol:

  • Band-pass filtering: Apply a filter (e.g., 1-40 Hz) to remove slow drifts and high-frequency noise that falls outside the typical range of cerebral activity [35].
  • Notch filtering: Optionally, apply a notch filter at 50/60 Hz to suppress line noise from power mains [34].
  • Re-referencing: Re-reference the data to a common average reference to reduce the effect of common-mode noise [35].
  • Segmentation: For continuous data, segment the data into epochs. A sliding window technique with overlapping epochs can be used for ongoing, real-time correction [36].
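The filtering and re-referencing steps above can be sketched with SciPy (a minimal illustration on synthetic data; the 250 Hz sampling rate, filter orders, and notch Q-factor are illustrative assumptions, not prescriptions):

```python
import numpy as np
from scipy import signal

fs = 250.0  # assumed sampling rate in Hz
rng = np.random.default_rng(1)
eeg = rng.standard_normal((19, 5000))  # 19 channels x 20 s of toy data

# Band-pass 1-40 Hz (zero-phase, 4th-order Butterworth)
sos = signal.butter(4, [1.0, 40.0], btype="bandpass", fs=fs, output="sos")
eeg = signal.sosfiltfilt(sos, eeg, axis=1)

# Optional 50 Hz notch for line noise
b, a = signal.iirnotch(w0=50.0, Q=30.0, fs=fs)
eeg = signal.filtfilt(b, a, eeg, axis=1)

# Common average reference: subtract the cross-channel mean at each sample
eeg = eeg - eeg.mean(axis=0, keepdims=True)
```

In a real pipeline a dedicated toolbox (EEGLAB, MNE-Python) would handle these steps along with channel metadata and epoching.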

ICA Decomposition and Component Classification

Objective: To decompose the EEG data into independent components and identify those representing artifacts.

Protocol:

  • Run ICA: Perform ICA decomposition on the preprocessed data using an algorithm such as FastICA, Infomax, or JADE [32]. This results in (a) the unmixing matrix W and (b) the time courses and topographies of all independent components.
  • Component Inspection: Analyze the components to identify those corresponding to artifacts. This can be done manually or automated using algorithms. Key features for identification include [33] [36]:
    • Temporal Features: The component's time course is highly correlated with signals from EOG or ECG reference channels [4] [34].
    • Spatial Features: The component's scalp topography shows a characteristic pattern, such as strong frontal focus for ocular artifacts [4].
    • Spectral Features: The component's power spectrum shows high power in frequency bands atypical for cerebral activity (e.g., high-frequency muscle artifacts) [36].
  • Label Artifactual Components: Flag the components identified as artifacts for removal.
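A minimal sketch of decomposition plus correlation-based flagging, using scikit-learn's FastICA on a toy two-source mixture (the mixing matrix, the blink-like reference signal, and the 0.8 correlation threshold are illustrative assumptions):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(2)
n = 2000
t = np.arange(n) / 250.0
brain = np.sin(2 * np.pi * 10 * t)                                # 10 Hz "alpha" source
eog = (np.abs(np.sin(2 * np.pi * 0.3 * t)) > 0.99).astype(float)  # blink-like spike train
A = np.array([[1.0, 0.8], [0.6, 1.0], [0.9, 0.2]])                # mixing onto 3 channels
X = (A @ np.vstack([brain, eog])).T                               # samples x channels

ica = FastICA(n_components=2, random_state=0)
S = ica.fit_transform(X)  # component time courses, samples x components

# Flag components whose time course correlates strongly with the EOG reference
corrs = [abs(np.corrcoef(S[:, k], eog)[0, 1]) for k in range(S.shape[1])]
bad = [k for k, c in enumerate(corrs) if c > 0.8]
```

Real pipelines additionally use the spatial topographies and power spectra of the components, as listed above.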

Table 1: Characteristics of Major EEG Artifact Types and Their ICA Identification.

| Artifact Type | Physiological Origin | Key Identifying Features in ICA |
| --- | --- | --- |
| Ocular Artifact | Eye movements and blinks [1] | High correlation with EOG channels; fronto-polar scalp topography; high-amplitude, low-frequency transient peaks in the time course [4] |
| Muscle Artifact (EMG) | Contraction of head and neck muscles [1] | High-frequency activity in the power spectrum (>20 Hz); diffuse or temporalis/occipital topography [35] [36] |
| Cardiac Artifact (ECG) | Electrical activity of the heart [1] | Regular, periodic pattern time-locked to the QRS complex; high correlation with the ECG channel; topography often maximal at electrodes over blood vessels [36] |

Signal Reconstruction and Validation

Objective: To reconstruct the EEG signal without the contaminating artifacts and validate the results.

Protocol:

  • Reconstruction: Set the columns of the mixing matrix A that correspond to the artifactual components to zero. Alternatively, exclude these components and reconstruct the EEG data using the remaining components and the mixing matrix: X_corrected = A_clean * s_clean [33].
  • Validation: Compare the corrected data to the original raw data.
    • Quantitative Metrics: Calculate the percentage of artifact reduction. One study reported success rates of 81% for ocular, 84% for cardiac, and 98% for muscle artifacts using an automated BSS algorithm [36].
    • Qualitative Inspection: Visually inspect the corrected data to ensure artifact removal and preservation of neural signals.
    • Downstream Analysis: In pharmaco-EEG, compare the results of analyses (e.g., topographic mapping, PK-PD modeling) performed on data corrected with ICA versus other methods like regression [4].
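The reconstruction step can be sketched in NumPy; here W is a stand-in unmixing matrix rather than the output of a real ICA fit, and the artifact index is chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((3, 1000))  # channels x samples (stand-in for EEG)
W = rng.standard_normal((3, 3))     # stand-in unmixing matrix from ICA
S = W @ X                           # component time courses
A = np.linalg.inv(W)                # mixing matrix (A = W^-1 for square W)

bad = [1]                           # indices of artifactual components
S_clean = S.copy()
S_clean[bad, :] = 0.0               # zero the artifact time courses
X_corrected = A @ S_clean           # back-project the remaining components

# Equivalently: zero the corresponding columns of A and use the full S
A_clean = A.copy()
A_clean[:, bad] = 0.0
```

The equivalence of the two formulations follows because a zeroed component contributes nothing to the back-projection either way.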

The Scientist's Toolkit: Essential Research Reagents

Table 2: Key Software and Computational Tools for ICA in EEG Research.

| Tool / Resource | Function / Purpose | Example Use Case / Note |
| --- | --- | --- |
| EEGLAB | An open-source MATLAB toolbox for processing EEG data [33] | Provides a graphical interface and functions for running ICA, component inspection, and artifact removal. |
| MNE-Python | An open-source Python package for EEG/MEG data analysis [34] | Used for the full preprocessing workflow, ICA decomposition, and automated component labeling (e.g., find_bads_eog). |
| FastICA Algorithm | A computationally efficient algorithm for performing ICA [32] | Commonly used due to its speed and robustness; available in toolboxes like EEGLAB and MNE-Python. |
| TUH EEG Corpus | A large, publicly available database of clinical EEG recordings [35] | Serves as a benchmark dataset for developing and validating new artifact detection and removal algorithms. |
| Autoreject | A Python tool for automated artifact rejection [34] | Can be used as an alternative or complement to ICA for handling artifacts in EEG data. |

Advanced Topics and Current Advances

Cross-Validation and Model Validation

Determining the correct number of independent components is a critical step. Cross-validation and jack-knifing procedures can be applied to ICA to estimate uncertainties for the component loadings and to determine how many components are statistically significant. This helps prevent overfitting and improves the stability of the ICA model [37].

Comparison with Other Methods

ICA is often compared to regression-based methods, which subtract a scaled version of EOG/ECG reference signals from the EEG. A key advantage of ICA is that it does not require a pure artifact reference signal. Regression methods assume EOG channels contain only ocular activity, which is often false as these channels are also contaminated by cerebral signals. This "bidirectional contamination" can lead to the unwanted removal of brain activity [4]. Studies have shown that ICA can lead to more neurophysiologically sound results and better PK-PD relationships in drug trials compared to regression [4].
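For contrast, the regression approach can be sketched in a few lines of NumPy (a toy example with a known propagation factor of 0.4; in real data the EOG reference also carries cerebral signal, which is exactly the bidirectional-contamination problem noted above):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
brain = rng.standard_normal(n)   # stand-in neural signal
eog = rng.standard_normal(n)     # EOG reference (assumed artifact-only)
eeg = brain + 0.4 * eog          # contaminated EEG channel

# Propagation coefficient: least-squares fit of the EOG reference onto the channel
b = np.dot(eeg, eog) / np.dot(eog, eog)
eeg_clean = eeg - b * eog

# b recovers the true propagation factor (about 0.4) only when the EOG
# reference is artifact-only; any brain activity in the reference channel
# would be subtracted from the EEG as well.
```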

Real-Time and Automated Artifact Removal

Recent advances focus on making ICA and BSS practical for online systems, such as brain-computer interfaces and continuous epilepsy monitoring. New algorithms use a sliding window technique with overlapping epochs and automated component classification based on spatial, temporal, and frequency features. These methods have demonstrated high artifact removal rates with computation times fast enough for online application [36].

Integration with Deep Learning

While ICA remains a cornerstone technique, deep learning approaches are emerging. Studies have developed specialized Convolutional Neural Networks (CNNs) for detecting specific artifact classes in EEG. These models have been shown to outperform traditional rule-based methods and can be optimized for different artifact types using specific temporal window lengths (e.g., 20s for eye movements, 5s for muscle activity) [35]. These methods represent a complementary, data-driven approach to the problem of artifact handling.

Visual Workflows

The following diagrams illustrate the core conceptual and experimental workflows of ICA.

Independent Sources (s₁: Brain, s₂: Eyes, s₃: Heart) → Mixing Process (x = A s) → Observed EEG Signals (x₁, x₂, x₃) → Blind Separation (s = W x) → Recovered Components (IC₁, IC₂, IC₃)

ICA Core Concept: Illustrates the fundamental principle of blind source separation, where observed signals are mixtures of independent sources.

Record Multi-channel EEG (include EOG/ECG references) → Filter, Re-reference, and Segment Data → Perform ICA Decomposition → Inspect Components (temporal, spatial, spectral features) → Label Artifactual Components → Reconstruct EEG (exclude artifact components) → Validate Cleaned Data (quantitative and qualitative checks)

EEG Artifact Removal Workflow: Outlines the end-to-end experimental protocol for using ICA to clean EEG data, from acquisition to validation.

The electroencephalogram (EEG) provides a crucial window into brain function, capturing voltage fluctuations generated by the synchronous activity of millions of cortical neurons [38]. However, this neural signal is highly susceptible to contamination by physiological artifacts—unwanted signals originating from the subject's own body. These artifacts, which include activity from ocular, muscular, and cardiac sources, often exhibit amplitudes orders of magnitude greater than cerebral signals, potentially obscuring neural information and leading to misinterpretation [1] [10]. Effective artifact management is therefore a prerequisite for valid analysis in both clinical and research EEG applications.

Filtering represents a fundamental preprocessing step for mitigating these contaminants. By selectively attenuating specific frequency bands associated with known artifacts, filters can enhance the signal-to-noise ratio of underlying neural activity. However, filtering is not a panacea; inappropriate application can introduce significant distortions, altering the temporal and spectral characteristics of the EEG [39] [40]. This guide provides an in-depth examination of three principal filter classes—high-pass, band-pass, and notch filters—detailing their optimal use cases, parameter configurations, and the inherent pitfalls associated with their application in the context of physiological artifact management.

Characterization of Major Physiological Artifacts

A strategic approach to filtering first requires a clear understanding of the spectral and topographic characteristics of common physiological artifacts. The table below summarizes the primary artifacts and their properties.

Table 1: Characteristics of Major Physiological Artifacts in EEG

| Artifact Type | Source | Spectral Characteristics | Topographic Distribution | Amplitude Range |
| --- | --- | --- | --- | --- |
| Ocular Artifacts | Eye blinks and movements [1] | Low-frequency (< 4 Hz) [6] | Primarily frontal and pre-frontal regions [10] | Up to hundreds of microvolts [10] |
| Muscle Artifacts (EMG) | Head, face, and neck muscle contraction [1] [10] | Broadband, high-frequency (> 30 Hz) [10] | Widespread, but focused near muscle groups (temporal areas) [1] | Varies with contraction force [10] |
| Cardiac Artifacts | Electrical activity of the heart (ECG) [1] | ~1.2 Hz (pulse); characteristic QRS complex [1] | Left-side electrodes, or over pulsating vessels [1] [10] | Low amplitude at scalp [10] |

High-Pass and Band-Pass Filtering for Drift and Ocular Artifacts

Rationale and Target Artifacts

High-pass filters (HPF) attenuate low-frequency components below a specified cutoff frequency. They are primarily employed to remove slow baseline drift and the very low-frequency components of ocular artifacts, such as those from eye blinks [41]. Band-pass filters combine a high-pass filter with a low-pass filter to restrict the EEG signal to a specific frequency range of interest (e.g., 0.5-40 Hz), thereby removing both very low and very high-frequency noise.

The Critical Role of Cutoff Frequency and Filter Choice

A key consideration in high-pass filtering is the selection of an appropriate cutoff frequency. Overly aggressive high-pass filtering (e.g., using cutoffs of 0.3 Hz and above) is a common source of significant distortion in event-related potential (ERP) data [39].

Table 2: Impact of High-Pass Filter Cutoff on ERP Components (based on [39])

| High-Pass Filter Cutoff | Effect on Simulated P600 Amplitude | Induced Artifactual Activity | Recommendation |
| --- | --- | --- | --- |
| 0.1 Hz and lower | Minimal attenuation | Negligible | Safe for use; recommended for preserving slow cortical potentials. |
| 0.3 Hz | Linear reduction in amplitude | Significant negative peak preceding the true P600, resembling an N400 | Can lead to false conclusions about component engagement. |
| 1.0 Hz | Severe attenuation | Pronounced artifactual peaks; delays apparent onset latency by ~200 ms | Not recommended for ERP studies; risks severe waveform distortion. |

As illustrated in Table 2, high-pass filters with cutoffs as low as 0.3 Hz can introduce artifactual peaks of opposite polarity before the genuine ERP component. This can mislead researchers into concluding that an experimental manipulation affects multiple components when it actually impacts only one [39]. For instance, in a language processing paradigm, a 0.3 Hz HPF can create a spurious N400 effect preceding a P600 in syntactic violation conditions [39].

Experimental Protocol for Establishing High-Pass Filter Parameters

Objective: To determine the optimal high-pass filter cutoff that minimizes baseline drift and low-frequency ocular artifacts without distorting endogenous ERP components.

  • Data Acquisition: Record EEG data using a standard paradigm known to elicit the ERP components of interest (e.g., an oddball task for P300). Include a sufficient number of trials to ensure a high signal-to-noise ratio.
  • Preprocessing: Apply conservative artifact rejection routines to remove major ocular and muscle artifacts. Do not apply any high-pass filter at this stage, or use an ultra-low cutoff (e.g., 0.01-0.05 Hz).
  • Filtering and Analysis: Process the unfiltered data through multiple parallel pipelines, applying high-pass FIR filters with different cutoff frequencies (e.g., 0.1 Hz, 0.3 Hz, 0.5 Hz). Use a consistent, non-causal (zero-phase) filter type like the default in EEGLAB to avoid phase shifts [41] [40].
  • Comparison and Validation:
    • Visual Inspection: Overlay the grand-average ERPs from each pipeline. Look for the emergence of deflections with opposite polarity preceding the components of interest (e.g., a negative peak before a P300 or P600) as the cutoff increases.
    • Quantitative Analysis: Measure the peak amplitude and latency of the key components across conditions. A significant reduction in amplitude or a systematic delay in onset latency with increasing cutoff frequency indicates filtering-induced distortion.
    • Residual Analysis: For data with strong low-frequency drifts, assess whether the lowest cutoff (e.g., 0.1 Hz) effectively stabilizes the baseline without introducing the artifacts seen at higher cutoffs.

The general recommendation is to use a high-pass filter cutoff between 0.01 Hz and 0.1 Hz for studies focusing on slow cortical potentials like the P300, N400, or LPP [39] [40].
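The attenuation effect in Table 2 can be reproduced qualitatively with SciPy on a simulated slow component (the Gaussian pulse, the 250 Hz sampling rate, and the filter order are illustrative assumptions):

```python
import numpy as np
from scipy import signal

fs = 250.0
t = np.arange(-2, 4, 1 / fs)
# Simulated slow positivity peaking at 600 ms (Gaussian pulse, sigma = 150 ms)
erp = np.exp(-((t - 0.6) ** 2) / (2 * 0.15 ** 2))

def highpass(x, cutoff):
    """Zero-phase (non-causal) Butterworth high-pass, as in the protocol above."""
    sos = signal.butter(2, cutoff, btype="highpass", fs=fs, output="sos")
    return signal.sosfiltfilt(sos, x)

peak = erp.max()
peak_01 = highpass(erp, 0.1).max()  # 0.1 Hz cutoff: barely touches the component
peak_10 = highpass(erp, 1.0).max()  # 1.0 Hz cutoff: attenuates it and adds undershoot
```

Plotting the two filtered waveforms also shows the artifactual negative deflection that a 1.0 Hz cutoff introduces before the genuine peak.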

Notch Filtering for Power Line Interference

Rationale and Target Artifacts

Notch filters are sharp band-stop filters designed to attenuate a very narrow frequency band. Their primary use in EEG is to remove 50 Hz or 60 Hz power line interference [42]. This artifact is pervasive, especially in unshielded environments or mobile EEG setups [42].

Pitfalls and Superior Alternative Methods

While conceptually simple, standard notch filters (e.g., Butterworth IIR filters) carry a high risk of causing time-domain distortions, including ringing artifacts (pre- and post-oscillations) due to the Gibbs phenomenon [42]. This occurs because of the sharp and narrow stopband in the frequency response, which translates to oscillatory behavior in the time domain. Consequently, some methodologies recommend avoiding traditional notch filters in ERP research [42].

Fortunately, several alternative methods have been developed that are often superior:

  • Spectrum Interpolation: This method involves transforming the time-domain signal into the frequency domain via a Discrete Fourier Transform (DFT), interpolating the amplitude spectrum at the interference frequency using neighboring frequencies, and then transforming the data back into the time domain [42]. This approach effectively removes line noise while introducing less distortion in the time domain compared to a standard notch filter [42].
  • Discrete Fourier Transform (DFT) Filter: This method fits sine and cosine waves at the interference frequency to the signal and subtracts them. It performs well when line noise amplitude is stationary but may fail if the amplitude fluctuates significantly over the data segment [42].
  • CleanLine: This is a regression-based method that uses a sliding window and Slepian multitapers to estimate and subtract the line noise component adaptively. It is designed to remove only deterministic line components while preserving background neural spectral energy [41] [42].
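Spectrum interpolation as described above can be sketched in NumPy (the 49–51 Hz target band and the 3 Hz neighbouring bands are illustrative choices):

```python
import numpy as np

fs = 250.0
n = 2500
t = np.arange(n) / fs
rng = np.random.default_rng(5)
eeg = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(n)
contaminated = eeg + 2.0 * np.sin(2 * np.pi * 50 * t)  # strong 50 Hz line noise

spec = np.fft.rfft(contaminated)
freqs = np.fft.rfftfreq(n, 1 / fs)

# Replace the amplitude in a narrow band around 50 Hz with the mean amplitude
# of the neighbouring bins, keeping the original phase
target = (freqs > 49.0) & (freqs < 51.0)
neighbours = ((freqs > 46.0) & (freqs <= 49.0)) | ((freqs >= 51.0) & (freqs < 54.0))
amp = np.abs(spec)
amp[target] = amp[neighbours].mean()
spec = amp * np.exp(1j * np.angle(spec))

cleaned = np.fft.irfft(spec, n)
```

Because only the amplitude spectrum is altered, this avoids the time-domain ringing of a sharp IIR notch filter.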

Experimental Protocol for Comparing Notch Filtering Methods

Objective: To evaluate the efficacy of different line-noise removal techniques in preserving the integrity of the original EEG signal.

  • Data Simulation: Generate a clean, line-noise-free ground truth signal. This could be a simulated ERP waveform (e.g., a Gaussian-shaped pulse) or a real MEG/EEG recording obtained in a perfectly shielded room [42].
  • Artifact Introduction: Add simulated 50/60 Hz power line noise with non-stationary properties (e.g., fluctuating amplitude, abrupt on-/offsets) to the clean signal.
  • Application of Methods: Apply the following methods to the contaminated signal:
    • Traditional IIR Notch Filter (e.g., Butterworth)
    • Spectrum Interpolation
    • DFT Filter
    • CleanLine
  • Performance Quantification: Compare the processed signals to the original ground truth using multiple metrics:
    • Visual Inspection: Plot time-domain signals to identify ringing or waveform distortions.
    • Quantitative Metrics: Calculate Normalized Mean Square Error (NMSE) or Root Mean Square Error (RMSE) to assess overall agreement, and Signal-to-Noise Ratio (SNR) to measure noise suppression [6].
    • Frequency Analysis: Examine the power spectrum to confirm the removal of the line noise and check for the introduction of spectral holes or distortions at other frequencies.

Studies have shown that spectrum interpolation outperforms the DFT filter and CleanLine for non-stationary line noise and introduces less distortion than a traditional notch filter [42].

A Strategic Workflow for Filter Selection and Application

The following diagram synthesizes the information above into a logical workflow for selecting and applying filters to address specific physiological artifacts.

Raw EEG Signal → Identify Dominant Artifact → [Ocular Artifact (low frequency, < 4 Hz): High-Pass Filter, 0.01–0.1 Hz cutoff | Muscle Artifact (high frequency, > 30 Hz): Low-Pass Filter, 30–40 Hz cutoff | Power Line Noise (50/60 Hz): Spectrum Interpolation or CleanLine] → Validate Signal Integrity → if an artifact remains, return to artifact identification; if the signal is valid, Cleaned EEG Signal

Diagram 1: A strategic workflow for applying filters to remove physiological artifacts from EEG signals, emphasizing the choice of filter type based on the artifact's spectral properties and the recommendation of modern alternatives to traditional notch filtering.

Table 3: Key Software Tools and Analytical Resources for EEG Filtering Research

| Tool/Resource | Type | Primary Function in Filtering | Application Note |
| --- | --- | --- | --- |
| EEGLAB [41] | MATLAB Toolbox | Provides a high-level environment for applying FIR filters, ICA, and other preprocessing steps. | The default "Basic FIR filter" uses filtfilt for zero-phase distortion [41]. Its CleanLine plugin is useful for line noise. |
| FieldTrip [42] | MATLAB Toolbox | Offers advanced filtering and analysis functions, including DFT filtering and spectrum interpolation. | Default IIR Butterworth filter can be applied with zero phase; contains implementations for method comparisons [42]. |
| FIR Filter [43] [40] | Filter Type | Finite Impulse Response filter with linear phase. | Preferred for its stability and predictable time-domain properties. Can be implemented in various environments [43]. |
| Bartlett Window [43] | FIR Window Function | Shapes the filter kernel to balance roll-off and side-lobe levels. | One study found it provided optimal response times for filtering various EEG rhythms [43]. |
| Independent Component Analysis (ICA) [1] | Blind Source Separation | Identifies and removes artifact-related components from the data before filtering. | Highly effective for separating and removing ocular and muscle artifacts; often used as a complement to filtering [1]. |

Electroencephalography (EEG) is a fundamental tool in neuroscience and clinical diagnostics, prized for its exceptional temporal resolution and non-invasive nature. However, the interpretation of EEG signals is profoundly complicated by the presence of physiological artifacts—signal contaminants originating from non-cerebral sources within the human body. These artifacts often exhibit amplitudes that significantly exceed genuine neural activity, potentially obscuring brain rhythms of interest and leading to misinterpretation in both research and clinical settings.

The most prevalent physiological artifacts include those stemming from ocular movements (eye blinks and saccades), muscle activity (electromyographic or EMG artifacts from facial, jaw, or neck muscles), cardiac activity (electrocardiographic or ECG artifacts), and movements related to swallowing or respiration [44] [45]. The primary challenge in addressing these artifacts lies in their spectral overlap with genuine neural signals; for instance, eye blinks manifest in the low-frequency delta band (below 4 Hz), while muscle artifacts occupy the high-frequency beta and gamma ranges (above 13 Hz) [6]. This overlap renders simple frequency-based filtering ineffective, as it would inevitably remove valuable neural information alongside the artifacts.

Consequently, advanced signal processing techniques capable of separating signal components based on properties beyond frequency—such as statistical independence or temporal characteristics—have become indispensable. Among these, decomposition techniques like the Wavelet Transform and Empirical Mode Decomposition (EMD) have emerged as powerful tools for isolating and removing physiological artifacts while preserving the integrity of the underlying brain activity [46] [45].

Theoretical Foundations of Decomposition Techniques

Wavelet Transform

The Wavelet Transform is a time-frequency analysis technique that overcomes the fixed resolution limitation of traditional Fourier methods. It decomposes a signal into a set of basis functions called wavelets, which are localized in both time and frequency. This multi-resolution analysis is particularly suited to non-stationary signals like EEG, as it can capture transient features and localize artifacts precisely.

  • Discrete Wavelet Transform (DWT): DWT employs a dyadic filter bank to decompose a signal into approximation coefficients (low-frequency components) and detail coefficients (high-frequency components) at multiple resolution levels. This efficient decomposition is well-suited for denoising and artifact removal, as artifacts can often be isolated to specific wavelet coefficients [47].
  • Stationary Wavelet Transform (SWT): Unlike DWT, SWT is translation-invariant, as it does not employ downsampling at each decomposition level. This characteristic makes it particularly effective for artifact removal tasks, as it avoids introducing aliasing artifacts and provides a more accurate signal reconstruction. Research has demonstrated its utility in frameworks that combine it with machine learning classifiers for identifying and cleansing artifactual components identified by blind source separation methods [45].
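In practice a library such as PyWavelets would be used for multi-level decomposition; the principle of isolating artifacts in detail coefficients can be illustrated with a one-level Haar DWT and soft thresholding in plain NumPy (the universal-threshold formula is a common denoising heuristic, not the method of the cited studies):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1024
t = np.arange(n) / 256.0
slow = np.sin(2 * np.pi * 5 * t)          # "neural" low-frequency content
x = slow + 0.5 * rng.standard_normal(n)   # contaminated by broadband noise

# One-level Haar DWT: approximation (low-pass) and detail (high-pass) coefficients
approx = (x[0::2] + x[1::2]) / np.sqrt(2)
detail = (x[0::2] - x[1::2]) / np.sqrt(2)

# Soft-threshold the detail coefficients, where broadband artifacts concentrate
thr = np.median(np.abs(detail)) / 0.6745 * np.sqrt(2 * np.log(n))  # universal threshold
detail_t = np.sign(detail) * np.maximum(np.abs(detail) - thr, 0.0)

# Inverse Haar transform
y = np.empty(n)
y[0::2] = (approx + detail_t) / np.sqrt(2)
y[1::2] = (approx - detail_t) / np.sqrt(2)
```

A multi-level decomposition applies the same split recursively to the approximation coefficients, giving the dyadic frequency bands described above.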

Empirical Mode Decomposition (EMD) and its Variants

EMD is a fully data-driven, adaptive technique designed for analyzing non-linear and non-stationary signals. Unlike wavelet transforms, EMD does not require predefined basis functions. Instead, it decomposes a signal into a collection of Intrinsic Mode Functions (IMFs) adaptively derived from the data itself.

  • Standard EMD: The algorithm iteratively sifts the signal to extract IMFs, functions whose upper and lower envelopes are symmetric about zero and whose numbers of zero-crossings and extrema differ by at most one. Although effective, standard EMD can suffer from mode mixing, where a single IMF contains oscillations of dramatically different scales, or a single oscillation is split across multiple IMFs. This can leave artifacts incompletely separated from neural activity [46].
  • Fixed Frequency Empirical Wavelet Transform (FF-EWT): FF-EWT is a more recent hybrid approach that combines the adaptability of EMD with the frequency precision of wavelets. It creates adaptive wavelet-like filters tailored to the specific frequency components present in the signal. This method has shown superior performance in targeting fixed frequency ranges associated with specific artifacts, such as EOG, providing more focused and accurate removal while preserving non-artifact content [46].
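The sifting idea behind EMD can be sketched with SciPy splines (a deliberately crude, fixed-iteration version; production EMD implementations add stopping criteria and boundary handling):

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift(x, n_iter=10):
    """Crude EMD sifting: repeatedly subtract the mean of the extrema envelopes."""
    h = x.copy()
    t = np.arange(len(x))
    for _ in range(n_iter):
        maxima = argrelextrema(h, np.greater)[0]
        minima = argrelextrema(h, np.less)[0]
        if len(maxima) < 4 or len(minima) < 4:
            break  # too few extrema for stable spline envelopes
        upper = CubicSpline(maxima, h[maxima])(t)
        lower = CubicSpline(minima, h[minima])(t)
        h = h - (upper + lower) / 2.0
    return h  # candidate IMF; the residue x - h would be sifted for the next IMF

t = np.linspace(0, 1, 1000)
x = np.sin(2 * np.pi * 25 * t) + np.sin(2 * np.pi * 3 * t)  # fast + slow oscillation
imf1 = sift(x)  # should capture mainly the 25 Hz component
```

Repeating the procedure on the residue yields successively slower IMFs, which is where mode mixing can appear when scales overlap.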

Table 1: Comparison of Core Decomposition Techniques for EEG Artifact Removal

| Technique | Core Principle | Adaptivity | Key Strength | Primary Limitation |
| --- | --- | --- | --- | --- |
| Discrete Wavelet Transform (DWT) | Dyadic multi-resolution analysis using pre-defined wavelets | Non-adaptive | Computational efficiency; clear frequency band separation | Lack of translation invariance can cause artifacts |
| Stationary Wavelet Transform (SWT) | Translation-invariant multi-resolution analysis | Non-adaptive | Superior reconstruction quality; avoids aliasing | Higher computational complexity than DWT |
| Empirical Mode Decomposition (EMD) | Data-driven sifting to extract Intrinsic Mode Functions (IMFs) | Fully adaptive | No need for basis functions; handles non-stationarity well | Susceptible to mode mixing and boundary effects |
| Fixed Frequency EWT (FF-EWT) | Builds adaptive wavelet filters based on the signal's Fourier spectrum | Fully adaptive | Combines adaptivity with precise frequency separation | Parameter selection (e.g., number of modes) can be complex |

Quantitative Performance Analysis

Evaluating the performance of artifact removal techniques requires a set of standardized quantitative metrics. These metrics are typically calculated by comparing the processed signal against a known ground truth, often using semi-simulated data where clean EEG is artificially contaminated with artifacts.

  • Relative Root Mean Square Error (RRMSE): A lower RRMSE indicates a smaller error between the cleaned signal and the ground truth, signifying better artifact removal performance. For instance, the novel GCTNet model, which integrates decomposition concepts with deep learning, achieved an 11.15% reduction in RRMSE compared to other methods [6].
  • Correlation Coefficient (CC): This measures the linear agreement between the cleaned signal and the original, clean EEG. A CC value closer to 1.0 indicates that the cleaned signal better preserves the original neural information. Deep learning models incorporating decomposition principles have reported high CC values, demonstrating strong linear agreement with ground truth signals [6].
  • Signal-to-Artifact Ratio (SAR) and Signal-to-Noise Ratio (SNR): SAR and SNR quantify the level of desired signal relative to the residual artifact or noise. An increase in these ratios post-processing denotes a more effective denoising outcome. Studies have shown that wavelet-based methods can achieve specific SAR improvements, and deep learning models like AnEEG have demonstrated significant enhancements in both SNR and SAR values [47] [6].
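The metrics above are straightforward to compute against a ground-truth clean signal; a minimal NumPy implementation (the function names are our own):

```python
import numpy as np

def rrmse(clean, denoised):
    """Relative root mean square error against the ground-truth clean signal."""
    return np.sqrt(np.mean((denoised - clean) ** 2)) / np.sqrt(np.mean(clean ** 2))

def cc(clean, denoised):
    """Pearson correlation coefficient between denoised and clean signals."""
    return np.corrcoef(clean, denoised)[0, 1]

def snr_db(clean, denoised):
    """Signal-to-noise ratio of the denoised signal, in dB."""
    noise = denoised - clean
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

rng = np.random.default_rng(7)
clean = np.sin(2 * np.pi * 10 * np.arange(1000) / 250.0)
denoised = clean + 0.1 * rng.standard_normal(1000)  # toy "cleaned" output
```

This is why semi-simulated datasets are so valuable: without a known clean signal, none of these metrics can be computed exactly.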

Table 2: Quantitative Performance Metrics from Key Studies

| Study & Method | Artifact Type | Key Performance Metrics | Reported Outcome |
| --- | --- | --- | --- |
| Wavelet & Regression [47] | Galvanic Vestibular Stimulation (GVS) | Signal-to-Artifact Ratio (SAR) | Achieved a higher SAR of -1.625 dB, outperforming ICA and adaptive filters |
| FF-EWT + GMETV [46] | Ocular (EOG) | RRMSE, Correlation Coefficient (CC) | Lower RRMSE and higher CC on synthetic data compared to other techniques |
| AnEEG (Deep Learning) [6] | Muscle, Ocular, Environmental | NMSE, RMSE, CC, SNR, SAR | Lower NMSE/RMSE, higher CC, and improved SNR/SAR versus wavelet decomposition |
| SWT & Machine Learning [45] | Biological (EOG, EMG) | Mean Square Error (MSE), Accuracy | ~2% MSE in reconstruction; 98% accuracy in detecting artifactual components |

Experimental Protocols and Methodologies

Protocol for Wavelet-Based GVS Artifact Removal

A clearly defined protocol for removing Galvanic Vestibular Stimulation (GVS) artifacts using a wavelet-based method demonstrates the application of this technique [47]:

  • Data Acquisition: EEG is recorded using a standard system (e.g., NeuroScan SynAmps2 with 20 electrodes) at a sampling frequency of 1 kHz. The GVS stimulus (e.g., zero-mean pink noise, 0.1–10 Hz) is applied via a bipolar current stimulator, with its amplitude kept below the feeling threshold (e.g., 100–800 μA). The delivered stimulus current and voltage are recorded concurrently.
  • Signal Decomposition: The recorded EEG signal and the recorded GVS current signal are projected into various frequency sub-bands using multiresolution analysis, specifically the Discrete Wavelet Transform (DWT) or the Stationary Wavelet Transform (SWT).
  • Regression Modeling: Within each wavelet-decomposed frequency band, a time-series regression model (e.g., discrete-time polynomials, nonlinear Hammerstein-Wiener, or state-space models) is used to estimate the specific contribution of the GVS current to the EEG signal.
  • Artifact Subtraction & Reconstruction: The estimated GVS artifact is subtracted from the recorded EEG in each frequency sub-band. The clean sub-bands are then reconstructed using the inverse wavelet transform to produce the final artifact-free EEG signal.

EEG & GVS Data Acquisition → Wavelet Decomposition (DWT/SWT) into Sub-bands → Band-wise Regression Modeling (Polynomial, State-Space) → Subtract Estimated Artifact → Inverse Wavelet Transform (Signal Reconstruction) → Clean EEG Signal

Wavelet-Based GVS Artifact Removal Workflow

Protocol for Single-Channel EOG Removal with FF-EWT

For single-channel EEG systems, where techniques like ICA are less effective, an automated protocol using FF-EWT has been developed [46]:

  • Decomposition: The single-channel EEG signal contaminated with EOG artifacts is decomposed into six Intrinsic Mode Functions (IMFs) using the Fixed Frequency Empirical Wavelet Transform (FF-EWT). This method creates adaptive filters based on the signal's Fourier spectrum.
  • Component Identification: The decomposed IMFs are analyzed using quantitative metrics—Kurtosis (KS), Dispersion Entropy (DisEn), and Power Spectral Density (PSD)—to automatically identify which IMFs are dominated by EOG artifacts based on pre-determined threshold values.
  • Artifact Filtering: The artifact-dominated IMFs identified in the previous step are processed by a cascaded Generalized Moreau Envelope Total Variation (GMETV) filter. This filter is specifically designed to suppress the eyeblink event while minimizing signal distortion.
  • Signal Reconstruction: The cleaned IMFs (both unmodified and filtered) are summed together to reconstruct the final, artifact-free single-channel EEG signal.
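Step 2's threshold-based identification can be illustrated for the kurtosis feature alone (the threshold value here is illustrative; the cited work derives its own thresholds and additionally uses DisEn and PSD):

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(8)
n = 2000
t = np.arange(n) / 250.0
imf_neural = np.sin(2 * np.pi * 10 * t)       # oscillatory mode, low kurtosis
blink = np.zeros(n)
blink[500:560] = np.hanning(60)               # transient, eyeblink-like event
imf_artifact = blink + 0.05 * rng.standard_normal(n)

# Flag IMFs whose excess kurtosis exceeds a threshold (spiky, blink-dominated modes)
KS_THRESHOLD = 3.0  # illustrative value only
flags = [kurtosis(imf) > KS_THRESHOLD for imf in (imf_neural, imf_artifact)]
```

Kurtosis works here because a blink concentrates energy in a short transient, producing a heavy-tailed amplitude distribution, whereas an ongoing oscillation does not.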

Single-Channel EEG Input → FF-EWT Decomposition (Extract 6 IMFs) → Artifactual IMF Identification via KS, DisEn, PSD Thresholds → Apply GMETV Filter to Artifactual IMFs → Reconstruct Signal from Filtered IMFs → Clean Single-Channel EEG

Single-Channel EOG Artifact Removal Workflow
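The metric-based component identification in the protocol above can be approximated with generic statistics. The sketch below uses kurtosis and a low-frequency power ratio as illustrative stand-ins for the published KS/DisEn/PSD thresholds; the 5.0 and 0.6 cutoffs are assumptions for demonstration, not values from the cited study.

```python
import numpy as np
from scipy.signal import welch
from scipy.stats import kurtosis

def flag_artifact_imfs(imfs, fs, kurt_thresh=5.0, lowfreq_ratio=0.6):
    """Flag IMFs that look EOG-dominated: heavy-tailed (blink transients
    give high kurtosis) with most spectral power below ~5 Hz."""
    flags = []
    for imf in imfs:
        k = kurtosis(imf)  # excess kurtosis; blinks produce large values
        f, pxx = welch(imf, fs=fs, nperseg=min(256, len(imf)))
        low_power = pxx[f < 5.0].sum() / pxx.sum()  # fraction of power < 5 Hz
        flags.append(bool(k > kurt_thresh and low_power > lowfreq_ratio))
    return flags
```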

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Materials and Tools for EEG Artifact Removal Research

Item / Tool | Function / Description | Example Use Case
--- | --- | ---
NeuroScan SynAmps2 | High-performance EEG acquisition system with a high sampling rate (e.g., 1 kHz). | Used in controlled studies for acquiring high-fidelity EEG data during stimulation [47].
Digitimer DS5 Stimulator | Isolated bipolar current stimulator for applying precise electrical stimuli (e.g., GVS, tES). | Generating Galvanic Vestibular Stimulation artifacts in EEG recordings [47].
Dry/Semi-Wet Electrodes | Electrodes for rapid setup in wearable EEG systems, but prone to motion artifacts. | Used in wearable EEG research, presenting specific artifact removal challenges [18].
Auxiliary Sensors (IMU, EOG) | Inertial Measurement Units (IMUs) and EOG electrodes for recording non-EEG reference signals. | Provide reference signals for motion and ocular artifacts to enhance detection [18].
EEGLAB Toolbox | Open-source MATLAB toolbox providing a standard platform for processing EEG data. | Includes implementations of algorithms like SOBI and ICA for artifact removal [45].
Semi-Simulated Datasets | Datasets created by mixing clean EEG with artificially generated artifacts. | Enable controlled and rigorous evaluation of artifact removal methods with a known ground truth [48] [6].

Decomposition techniques, including Wavelet Transforms and Empirical Mode Decomposition, provide powerful and flexible frameworks for addressing the persistent challenge of physiological artifacts in EEG research. The Wavelet Transform offers a robust multi-resolution analysis that can be effectively combined with regression models to isolate and remove structured artifacts like GVS. In contrast, EMD and its advanced variants like FF-EWT offer a fully data-driven, adaptive approach that is particularly valuable for non-stationary artifacts and challenging recording contexts, such as single-channel wearable systems. The quantitative evidence and detailed experimental protocols outlined in this guide demonstrate that these methods can achieve high performance in artifact suppression while preserving the integrity of the underlying neural signals. As EEG applications continue to expand into real-world, mobile, and clinical settings, these decomposition techniques will remain cornerstone methodologies, often integrated with machine learning and deep learning approaches, to ensure the reliability and interpretability of brain activity data.

Electroencephalography (EEG) is a fundamental tool in neuroscience and clinical diagnostics, valued for its non-invasive nature and high temporal resolution for capturing brain activity. However, a primary challenge in EEG analysis is the pervasive presence of physiological artifacts—interfering signals originating from non-neuronal biological sources [49] [10]. These artifacts can significantly distort the EEG recording, leading to misinterpretation of brain activity and potentially erroneous conclusions in both research and clinical settings, such as misdiagnosis of neurological disorders [23] [10].

Physiological artifacts are broadly categorized based on their source. Ocular artifacts arise from eye movements and blinks, generating slow, large-amplitude waveforms most prominent in frontal electrodes due to the corneo-retinal dipole [10] [22]. Myogenic (muscle) artifacts result from the contraction of head, face, or neck muscles (e.g., from jaw clenching, chewing, or talking), producing high-frequency, low-amplitude activity that can propagate across the scalp [49] [10]. Cardiac artifacts include electrical activity from the heart (ECG), visible as waveforms time-locked to the heartbeat, and pulse artifacts caused by electrodes placed over pulsating blood vessels [23] [22]. Other sources include glossokinetic artifacts from tongue movement and respiratory artifacts [23] [22].

The core challenge in artifact removal lies in the frequency overlap between these artifacts and genuine neural signals. For instance, eye blinks contain low-frequency components that obscure delta waves, while muscle activity has high-frequency components that overlap with and can mask beta rhythms [6]. This makes simple filtering ineffective, as it would also remove valuable neural information. Consequently, advanced signal processing and deep learning techniques are required to disentangle these mixed signals and recover clean brain activity data.

Conventional and Deep Learning-Based Removal Methods

The pursuit of clean EEG signals has led to the development of numerous artifact removal methodologies, which can be broadly divided into conventional techniques and modern deep learning-based approaches.

Conventional Artifact Removal Techniques

Conventional methods often rely on specific statistical or signal processing assumptions about the nature of the artifacts and the EEG signal.

  • Blind Source Separation (BSS): Techniques like Independent Component Analysis (ICA) assume that the recorded EEG is a linear mixture of statistically independent source signals, including artifacts from the eyes, heart, and muscles [49] [50]. ICA decomposes the signal into these components, allowing for the manual or semi-automatic identification and removal of artifact-related components before signal reconstruction [51]. A variant, Constrained ICA (cICA), incorporates prior knowledge to improve separation [49]. Canonical Correlation Analysis (CCA) is another BSS method that separates signals based on their autocorrelation, effectively isolating muscle artifacts, which typically have low autocorrelation, from brain signals, which have higher autocorrelation [49].
  • Regression Methods: Linear regression requires a dedicated reference channel (e.g., EOG for ocular artifacts) to model and subtract the artifact's contribution from the EEG signal [49]. While effective, its need for additional hardware is a limitation. Methods like REBLINCA have been developed to operate without a dedicated EOG channel by using specific EEG channels as templates for blink correction [49].
  • Filtering and Decomposition: Adaptive filters, Wiener filters, and Kalman filters dynamically estimate and remove noise [50]. Wavelet Transform decomposes the signal into time-frequency components, allowing for the thresholding or removal of coefficients associated with artifacts before reconstruction [50]. Similarly, Empirical Mode Decomposition (EMD) and its variants adaptively decompose non-stationary signals for artifact removal [50].
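The autocorrelation-based CCA idea can be sketched compactly: using the signal and a one-sample-delayed copy as the two views, canonical components come out ranked by lag-one autocorrelation, and the least-autocorrelated (muscle-like) ones are zeroed before reconstruction. This is a bare-bones NumPy illustration under those assumptions, not a production BSS implementation.

```python
import numpy as np

def cca_muscle_clean(eeg, n_remove=1):
    """CCA between the EEG and a one-sample-delayed copy; components with
    the lowest canonical correlation (lowest autocorrelation, muscle-like)
    are removed. eeg: (n_channels, n_samples); returns centred, cleaned
    data of shape (n_channels, n_samples - 1)."""
    X = eeg[:, 1:].T - eeg[:, 1:].T.mean(axis=0)    # current samples
    Y = eeg[:, :-1].T - eeg[:, :-1].T.mean(axis=0)  # delayed samples
    Qx, Rx = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    U, s, _ = np.linalg.svd(Qx.T @ Qy)  # s: canonical correlations, descending
    M = np.linalg.solve(Rx, U)          # unmixing matrix: sources = X @ M
    S = X @ M
    S[:, S.shape[1] - n_remove:] = 0    # zero least-autocorrelated components
    return (S @ np.linalg.inv(M)).T     # remix back to channel space
```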

While useful, these conventional methods have limitations, including a reliance on linear assumptions, the potential for removing neural signals along with artifacts ("brain signal loss"), and often requiring expert supervision [50].
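To make the regression approach concrete, a minimal least-squares EOG subtraction might look as follows. It assumes a recorded EOG reference channel; the estimator is the classic single-regressor slope, sketched for illustration rather than reproducing any specific published pipeline.

```python
import numpy as np

def regress_out_eog(eeg, eog):
    """Estimate each channel's ocular propagation coefficient by least
    squares against an EOG reference, then subtract its contribution.
    eeg: (n_channels, n_samples); eog: (n_samples,)."""
    eog0 = eog - eog.mean()            # centre the reference
    b = (eeg @ eog0) / (eog0 @ eog0)   # per-channel regression weights
    return eeg - np.outer(b, eog0)     # remove the estimated ocular part
```

Because the correction reuses a single propagation weight per channel, it is cheap enough for online use, which is one reason regression remains popular despite the extra hardware requirement.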

The Shift to Deep Learning Models

Deep learning models offer a data-driven alternative, capable of learning complex, non-linear relationships between contaminated and clean EEG signals without requiring strong a priori assumptions [49] [50]. Their ability to automatically extract features from large datasets makes them highly adaptable. Early deep learning approaches for EEG denoising included:

  • Autoencoders, which learn to compress input data and reconstruct a cleaned version [50].
  • Generative Adversarial Networks (GANs), where a generator creates denoised signals and a discriminator distinguishes them from real clean EEG, driving the generator to produce more realistic outputs [6].
  • Recurrent Neural Networks (RNNs), particularly Long Short-Term Memory (LSTM) networks, which are well-suited for modeling the temporal dependencies in sequential data like EEG [51] [6].
  • Convolutional Neural Networks (CNNs), which excel at extracting spatial features from multi-channel EEG data or temporal patterns from single channels [50].

Advanced Deep Learning Architectures for Artifact Removal

Hybrid CNN-LSTM Architecture

The hybrid CNN-LSTM architecture leverages the strengths of both convolutional and recurrent networks for spatiotemporal feature learning. One advanced approach uses simultaneous facial and neck EMG recordings as additional inputs to guide the removal of muscle artifacts [49].

Table 1: Key Components of the Hybrid CNN-LSTM Model

Component | Function | Architecture Details
--- | --- | ---
Input | Takes contaminated EEG and reference EMG signals. | Raw time-series data from EEG and EMG channels.
CNN Stage | Extracts local spatial and temporal features from the input signals. | Multiple convolutional layers with ReLU activation and pooling.
LSTM Stage | Models long-range temporal dependencies in the feature sequence. | One or more LSTM layers with a hidden state.
Fusion Layer | Integrates features from the EEG and auxiliary EMG streams. | Concatenation or attention-based fusion.
Output Layer | Reconstructs the artifact-free EEG signal. | Fully connected layer with linear activation.

The model is trained in a supervised manner using a dataset of concurrent EEG and EMG recordings. The EMG signal provides a direct reference of muscle activity, allowing the network to learn a mapping from the contaminated EEG and its corresponding EMG artifact to the underlying clean EEG. This method has demonstrated excellent performance in removing strong muscle artifacts induced by jaw clenching while preserving sensitive neural responses like Steady-State Visual Evoked Potentials (SSVEPs) [49]. A key evaluation metric is the improvement in the Signal-to-Noise Ratio (SNR) of the SSVEP response after cleaning, which quantitatively confirms that noise is reduced while the signal of interest is retained [49].
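A common way to compute such an SSVEP SNR is to compare spectral power at the stimulation frequency against neighbouring bins. The definition below (Welch PSD, ±0.5 Hz signal band, 3 Hz noise neighbourhood) is one illustrative convention; the exact windows vary between papers.

```python
import numpy as np
from scipy.signal import welch

def ssvep_snr_db(eeg, fs, target_hz, band=1.0, noise_span=3.0):
    """SNR (dB) of an SSVEP response: PSD at the stimulation frequency
    relative to the mean PSD of surrounding bins."""
    f, pxx = welch(eeg, fs=fs, nperseg=2 * fs)  # 0.5 Hz resolution
    sig = pxx[np.abs(f - target_hz) <= band / 2].mean()
    noise = (np.abs(f - target_hz) > band / 2) & (np.abs(f - target_hz) <= noise_span)
    return 10 * np.log10(sig / pxx[noise].mean())
```

Computing this metric before and after cleaning quantifies whether the response of interest survived the denoising step.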

Contaminated EEG & Reference EMG → CNN Blocks (Spatio-temporal Feature Extraction) → LSTM Layers (Long-term Dependency Modeling) → Feature Fusion → Artifact-Free EEG

Convolutional Neural Network (CNN) Architectures

Pure CNN-based models provide a powerful framework for artifact removal by leveraging convolutional layers to extract hierarchical features. A novel CNN architecture was specifically designed for the simultaneous removal of ocular and myogenic artifacts [50]. This model uses a series of convolutional layers with ReLU activation, average pooling, and a fully connected output layer to reconstruct the clean signal. It integrates the Adam optimizer for efficient training. The model's strength lies in its ability to capture the spatial features of different artifact types directly from the contaminated EEG, without requiring auxiliary reference signals. It reported a low Root Relative Mean Squared Error (RRMSE) of 0.35 and a high cross-correlation coefficient of 0.94 with ground-truth EEG, outperforming other architectures like U-Net and MultiResUNet3+ across a range of SNR values [50].
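The two figures of merit quoted here are straightforward to compute from a denoised signal and its ground truth; a minimal sketch:

```python
import numpy as np

def rrmse(denoised, truth):
    """Relative root mean squared error (lower is better)."""
    return np.sqrt(np.mean((denoised - truth) ** 2) / np.mean(truth ** 2))

def cc(denoised, truth):
    """Pearson correlation coefficient between signals (higher is better)."""
    return float(np.corrcoef(denoised, truth)[0, 1])
```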

State Space Models (SSM)

State Space Models (SSMs) represent an advanced approach for processing sequential data, showing particular promise in handling complex, non-stationary artifacts. A multi-modular SSM network (M4) was benchmarked against other methods for removing artifacts induced by Transcranial Electrical Stimulation (tES)—a particularly challenging noise source that overlaps with EEG in both time and frequency domains [48]. SSMs excel at modeling long-range dependencies and the complex dynamics of tES artifacts (including tACS, tDCS, and tRNS). The study found that while a Complex CNN performed best for tDCS noise, the SSM-based M4 model was superior for removing the more complex artifacts from tACS and tRNS [48]. This highlights that model performance is highly dependent on the stimulation (artifact) type, and that SSMs are a leading choice for handling sophisticated interference.

Quantitative Performance Comparison

The performance of these advanced models is quantitatively evaluated using a range of metrics that assess the fidelity of the reconstructed signal and the effectiveness of artifact removal.

Table 2: Quantitative Performance of Advanced Deep Learning Models

Model / Architecture | Primary Artifact Target | Key Performance Metrics | Reported Results
--- | --- | --- | ---
Hybrid CNN-LSTM [49] | Muscle artifact (with EMG reference) | SSVEP signal-to-noise ratio (SNR) | Significant SNR increase, outperforming ICA and linear regression.
Novel CNN model [50] | Simultaneous ocular & myogenic | RRMSE, cross-correlation (CC) | RRMSE: 0.35; CC: 0.94 with ground truth.
LSTM-based GAN (AnEEG) [6] | Multiple biological artifacts | NMSE, RMSE, CC, SNR, SAR | Lower NMSE/RMSE, higher CC/SNR/SAR vs. wavelet techniques.
Multi-modular SSM (M4) [48] | tES artifacts (tACS, tRNS) | RRMSE, correlation coefficient (CC) | Best performance for tACS and tRNS artifact removal.
Complex CNN [48] | tES artifacts (tDCS) | RRMSE, correlation coefficient (CC) | Best performance for tDCS artifact removal.

Detailed Experimental Protocols

To ensure reproducibility and provide a clear framework for implementation, this section outlines the core experimental methodologies common to evaluating deep learning models for EEG artifact removal.

Data Preparation and Pre-processing Protocol

  • Data Acquisition: Record EEG data using a multi-electrode system according to the international 10-20 system. For methods requiring reference signals, simultaneously record auxiliary data such as EMG from facial/neck muscles for myogenic artifacts or EOG for ocular artifacts [49]. The sampling frequency should be sufficiently high (e.g., ≥200 Hz) to capture relevant neural and artifact dynamics [6].
  • Semi-Synthetic Data Generation: For controlled model training and evaluation, create a semi-synthetic dataset by adding realistic artifacts to clean EEG recordings [48] [6]. This provides a known ground truth for validation.
    • Artifact Modeling: Artifacts can be recorded in isolation from subjects (e.g., forced blinking, jaw clenching) or synthetically generated based on physiological models for tES [48].
    • Mixing Procedure: Artifacts are linearly or non-linearly mixed with the clean EEG signals at varying signal-to-noise ratios to simulate different contamination levels [6].
  • Data Segmentation and Augmentation: Segment the continuous EEG and artifact data into short, overlapping epochs (e.g., 1-2 seconds). Apply data augmentation techniques such as scaling, shifting, or adding minor noise to increase the size and diversity of the training dataset, which improves model generalization [49].
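The linear mixing step above can be sketched as scaling the artifact template to hit a target contamination level; the non-linear case is study-specific and omitted here.

```python
import numpy as np

def mix_at_snr(clean, artifact, snr_db):
    """Add an artifact template to clean EEG, scaled so that the ratio of
    clean power to artifact power equals the requested SNR in dB."""
    scale = np.sqrt(np.mean(clean ** 2) /
                    (np.mean(artifact ** 2) * 10 ** (snr_db / 10)))
    return clean + scale * artifact
```

Sweeping `snr_db` (e.g., from -7 dB to +2 dB) produces the range of contamination levels used to stress-test denoising models against a known ground truth.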

Model Training and Validation Protocol

  • Data Splitting: Partition the dataset into training (e.g., 70%), validation (e.g., 15%), and testing (e.g., 15%) sets. Ensure that data from the same subject is not spread across different sets to prevent data leakage and ensure robust subject-independent evaluation.
  • Loss Function Definition: Select an appropriate loss function to guide the model training. Common choices include:
    • Mean Squared Error (MSE): Measures the average squared difference between the denoised output and the ground-truth clean EEG [6].
    • Spectral Loss: Measures the difference in the frequency domain (e.g., using Power Spectral Density) to ensure key neural oscillations are preserved [6].
    • Composite Loss: A weighted combination of multiple loss functions (e.g., temporal + spectral) often yields the best results [6].
  • Training Loop: Train the model using the Adam optimizer [50] for a fixed number of epochs or until convergence. Monitor the loss on the validation set to apply early stopping and prevent overfitting.
  • Performance Evaluation: Evaluate the final model on the held-out test set using the metrics listed in Table 2 (RRMSE, CC, SNR, etc.). Perform statistical tests to compare model performance against benchmark methods.
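For illustration, a composite temporal-plus-spectral loss might be defined as below. This is a pure NumPy sketch for clarity; a real training loop would express it in a differentiable framework, and the log-PSD spectral term and 0.5 weighting are assumptions, not values from the cited studies.

```python
import numpy as np

def composite_loss(denoised, clean, alpha=0.5):
    """Weighted sum of a temporal MSE term and a spectral term comparing
    log power spectra, so that neural oscillations are preserved."""
    mse = np.mean((denoised - clean) ** 2)
    log_psd_d = np.log1p(np.abs(np.fft.rfft(denoised)) ** 2)
    log_psd_c = np.log1p(np.abs(np.fft.rfft(clean)) ** 2)
    spectral = np.mean((log_psd_d - log_psd_c) ** 2)
    return alpha * mse + (1 - alpha) * spectral
```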

Data Collection (EEG + Auxiliary Signals) → Pre-processing & Artifact Injection → Data Segmentation & Augmentation → Train-Test Split (Subject-Independent) → Model Training (with Validation) → Performance Evaluation on Test Set → Model Comparison & Statistical Analysis

The Scientist's Toolkit: Research Reagents & Materials

Table 3: Essential Research Materials for Deep Learning-Based EEG Denoising

Item / Solution | Function / Purpose
--- | ---
High-Density EEG System | Records scalp potentials with multiple electrodes (e.g., 32, 64, or more channels) for capturing detailed spatial neural information.
Auxiliary Biosignal Amplifiers | Record reference signals for artifacts, such as EMG from facial muscles or EOG from around the eyes, to guide supervised denoising models [49].
Conductive Electrode Gel/Paste | Ensures high-quality, low-impedance electrical contact between electrodes and the scalp, minimizing noise at the source.
EEG/EMG Caps with Integrated Electrodes | Provide a standardized and stable platform for positioning recording electrodes.
Stimulation Equipment | Presents controlled stimuli (e.g., visual for SSVEP [49], transcranial electrical for tES [48]) to evoke brain responses for functional validation of denoising.
Computational Hardware (GPUs) | Provides the necessary processing power for training complex deep learning models (CNNs, LSTMs, SSMs) on large EEG datasets.
Software Libraries (Python, TensorFlow/PyTorch, EEGLAB) | Offer the programming environment and specialized toolboxes for implementing models, processing data, and comparing against conventional methods like ICA [50] [51].

Electroencephalography (EEG) provides a non-invasive, cost-effective method for recording brain activity with superior temporal resolution, making it invaluable for clinical diagnosis, neuroscience research, and brain-computer interfaces (BCIs) [52]. However, the accurate interpretation of neural signals is persistently challenged by physiological artifacts—contaminations in the EEG signal originating from non-neural biological sources [53]. These artifacts include signals from ocular movements, cardiac activity, muscle contractions, and motion, which can obscure or mimic neurogenic activity, potentially leading to erroneous conclusions in both research and clinical settings [53] [54].

The management of these artifacts is particularly crucial in emerging applications such as wearable EEG systems, which operate in uncontrolled environments and are more susceptible to signal quality issues [5]. Traditional single-method approaches often prove insufficient for addressing the complex, non-stationary, and multidimensional nature of physiological artifacts [53]. This paper explores how hybrid and emerging frameworks, which combine multiple computational techniques, are advancing the state of artifact management and EEG signal classification, thereby enhancing the reliability and performance of EEG-based systems.

Defining Physiological Artifacts: A Taxonomy and Mechanisms

Physiological artifacts in EEG can be systematically categorized based on their biological sources and mechanisms of contamination. Understanding this taxonomy is fundamental to developing effective countermeasures.

Table: Taxonomy of Key Physiological Artifacts in EEG Research

Artifact Category | Biological Source | Primary Characteristics | Impact on EEG Signal
--- | --- | --- | ---
Ocular Artifacts | Eye movements & blinks | High-amplitude, frontal dominance, slow dynamics [53] | Obscures frontal lobe activity; mimics slow-wave activity
Cardiac Artifacts | Heartbeat & blood flow | Rhythmic, correlated with pulse, ~1-2 Hz [53] | Introduces regular, pulse-synchronous distortions
Myogenic Artifacts | Muscle activity | High-frequency, broadband, location-specific [5] | Masks high-frequency neural oscillations (e.g., gamma)
Motion Artifacts | Head & body movement | Transient, high-amplitude, non-stationary [5] [54] | Causes abrupt signal shifts and broadband noise

A critical insight is that these artifacts are not merely additive noise but often involve complex interactions with the underlying neural signals. For instance, during transcranial Direct Current Stimulation (tDCS), physiological processes can cause impedance changes that dynamically modulate the stimulation current itself, creating artifacts that are dose-specific and inseparable from neurogenic activity via conventional filtering [53]. These artifacts are high-dimensional, non-stationary, and spectrally overlap with neurogenic frequencies, making them particularly challenging to remove [53].

Hybrid Frameworks for Enhanced Performance

The CNN-LSTM Architecture for Motor Imagery Classification

The integration of Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks represents a powerful hybrid framework for improving Brain-Computer Interface (BCI) performance. This architecture was rigorously evaluated using the "PhysioNet EEG Motor Movement/Imagery Dataset" [55].

In this framework, each component addresses a distinct aspect of the EEG signal:

  • The CNN component excels at extracting spatial features from the multi-channel EEG data, identifying local patterns related to motor imagery tasks across different electrode locations.
  • The LSTM component captures temporal dependencies, modeling the evolution of these spatial patterns over time, which is crucial for decoding the dynamic nature of neural signals.

The performance superiority of this hybrid approach is demonstrated in the comparative results below.

Table: Performance Comparison of Classifiers on Motor Imagery EEG Data [55]

Model Type | Specific Classifier | Reported Accuracy
--- | --- | ---
Traditional Machine Learning | Random Forest (RF) | 91.00%
Traditional Machine Learning | Support Vector Classifier (SVC) | Information missing in source
Traditional Machine Learning | k-Nearest Neighbors (KNN) | Information missing in source
Deep Learning | Convolutional Neural Network (CNN) | 88.18%
Deep Learning | Long Short-Term Memory (LSTM) | 16.13%
Hybrid Framework | CNN-LSTM (Proposed) | 96.06%

This table shows that the hybrid CNN-LSTM model achieved exceptional accuracy of 96.06%, significantly outperforming both the best traditional classifier (Random Forest at 91%) and individual deep learning models [55]. The remarkably low performance of the standalone LSTM (16.13%) highlights its limitations in processing raw EEG data without complementary spatial feature extraction, a shortcoming effectively addressed by the hybrid architecture.

The CNN-Transformer Architecture for Global and Local Context

Another emerging hybrid framework combines CNNs with Transformer models, particularly beneficial for applications like emotion recognition from EEG signals [52]. This architecture addresses a fundamental limitation: while CNNs excel at detecting local spatial patterns, they struggle with long-range dependencies. Transformers, with their self-attention mechanisms, capture global context but may overlook fine-grained local relationships [52].

In this hybrid model:

  • CNN layers extract hierarchical spatial features from specific brain regions.
  • Transformer layers model interactions between distributed brain areas through self-attention.
  • A novel fusion mechanism hierarchically integrates these local and global features, preserving both spatial and temporal relationships.

When evaluated on the DEAP dataset for emotion recognition, this hybrid CNN-Transformer architecture achieved 87% accuracy, outperforming pure CNN models like AlexNet (83.50%) and VGG-16 (85.00%), as well as pure Transformer approaches (84.7%) [52]. This demonstrates the framework's enhanced capability to capture the complex neural signatures of emotional states.

Complementary Feature Extraction and Data Augmentation

Beyond architectural innovations, hybrid frameworks often incorporate advanced feature extraction and data augmentation techniques to further enhance performance. One study combined Wavelet Transform and Riemannian Geometry to capture both time-frequency characteristics and the intrinsic geometric structure of EEG data [55]. To address the challenge of limited data, Generative Adversarial Networks (GANs) were utilized to generate synthetic EEG data, helping to balance datasets and improve model generalization [55]. The training process was also optimized, with the hybrid model reaching peak accuracy within just 30-50 epochs when each epoch was limited to 5 seconds, highlighting its computational efficiency [55].

Experimental Protocols and Methodologies

Standardized EEG Data Collection Protocol

Robust EEG research begins with meticulous data collection. The following protocol, derived from large-scale EEG studies, ensures consistency and quality [56]:

  • Team Structure: Establish three dedicated teams:

    • Data Collection Team: Responsible for acquiring EEG data, proper data backup, and documenting remarkable session events.
    • Data Preprocessing Team: Trained to perform consistent basic EEG preprocessing across all datasets.
    • EEG Supervisory Team: Provides oversight, troubleshoots technical issues, conducts quality control, and trains other teams [56].
  • Pre-collection Setup:

    • Conduct thorough pilot testing of all experimental tasks and scripts.
    • Verify identical equipment setup across all recording sites, especially in multi-site studies.
    • Develop and disseminate formal protocol documents to ensure consistent implementation [56].
  • Quality Control Implementation:

    • Perform deep inspection of initial datasets to identify errors early.
    • Hold regular quality control meetings to review data quality and address issues.
    • Designate an experienced researcher to be on-call during recording sessions for immediate troubleshooting [56].

Artifact Management Workflow

The systematic approach to handling physiological artifacts involves detection, categorization, and removal, with techniques tailored to specific artifact properties [5].

Raw EEG Signal → Signal Preprocessing (band-pass filtering; ICA for initial cleanup) → Artifact Detection (Wavelet Transform; threshold-based methods) → Artifact Categorization (identify specific sources: ocular, cardiac, myogenic) → Artifact Removal (ICA/PCA; ASR-based pipelines; deep learning approaches) → Clean EEG Signal

This workflow emphasizes the importance of artifact categorization—identifying whether contamination stems from ocular, cardiac, myogenic, or motion sources—as a critical step that enables targeted removal strategies optimized for each artifact's specific characteristics [5].
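As an example of the threshold-based detection stage in this workflow, a minimal epoch screener is sketched below. The 100 µV peak-to-peak and 50 µV gradient limits are common defaults rather than values from the cited sources, and should be tuned per recording setup.

```python
import numpy as np

def detect_artifact_epochs(epochs, ptp_limit=100.0, grad_limit=50.0):
    """Flag epochs whose peak-to-peak amplitude or maximum sample-to-sample
    step (both in microvolts) exceeds the given limits.
    epochs: array of shape (n_epochs, n_samples)."""
    ptp = epochs.max(axis=-1) - epochs.min(axis=-1)
    grad = np.abs(np.diff(epochs, axis=-1)).max(axis=-1)
    return (ptp > ptp_limit) | (grad > grad_limit)
```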

The Scientist's Toolkit: Essential Research Reagents and Materials

Table: Key Resources for Hybrid EEG Artifact Management Research

Resource Category | Specific Tool/Technique | Function in Research
--- | --- | ---
Computational Frameworks | Hybrid CNN-LSTM Model [55] | Extracts spatial features and captures temporal dependencies for MI classification
Computational Frameworks | Hybrid CNN-Transformer [52] | Captures both local spatial patterns and global dependencies in EEG signals
Computational Frameworks | Generative Adversarial Networks [55] | Generates synthetic EEG data to balance datasets and improve model generalization
Signal Processing Tools | Wavelet Transform [55] | Provides time-frequency analysis of non-stationary EEG signals
Signal Processing Tools | Riemannian Geometry [55] | Captures the intrinsic geometric structure of covariance matrices from EEG
Signal Processing Tools | Independent Component Analysis | Separates mixed signals into statistically independent components for artifact isolation
Reference Datasets | PhysioNet EEG Motor Movement/Imagery Dataset [55] | Benchmark dataset for evaluating motor imagery classification algorithms
Reference Datasets | DEAP Dataset [52] | Standardized dataset for emotion analysis using physiological signals
Validation Metrics | Classification Accuracy [55] [52] | Primary metric for evaluating model performance on specific tasks
Validation Metrics | Selectivity [5] | Assesses algorithm performance with respect to preserving the physiological signal

Hybrid and emerging frameworks represent a paradigm shift in addressing the persistent challenge of physiological artifacts in EEG research. By strategically combining complementary methods—such as CNNs with LSTMs or Transformers—these approaches achieve synergistic effects that surpass the capabilities of individual techniques. The integration of advanced feature extraction methods and data augmentation strategies further enhances the robustness and generalizability of these systems.

As EEG applications expand into wearable devices and real-world environments, the effective management of physiological artifacts becomes increasingly critical. The frameworks discussed herein, which combine spatial and temporal modeling with sophisticated artifact characterization, offer promising pathways toward more reliable, accurate, and clinically viable EEG technologies. Future research should focus on developing more interpretable models, optimizing computational efficiency for real-time applications, and creating standardized benchmarking frameworks to accelerate the translation of these hybrid approaches into both clinical and consumer domains.

Troubleshooting EEG Artifacts: Proactive Prevention and Data Cleaning Strategies

Electroencephalography (EEG) research provides unparalleled insights into neural dynamics, but its utility is critically dependent on signal integrity. Physiological artifacts—electrical signals of non-cerebral origin—represent a fundamental challenge, potentially confounding experimental results and leading to erroneous conclusions. While numerous post-processing algorithms exist for artifact removal, a paradigm shift toward proactive control is essential for data quality preservation. This technical guide details evidence-based strategies for optimizing experimental setup and subject instruction, framing them within a comprehensive approach to managing physiological artifacts. These artifacts are not merely noise but are inherent to the measurement process during interventions like transcranial Direct Current Stimulation (tDCS), where they introduce dose-specific contamination that scales with applied current and confounds conventional controls [57]. By implementing rigorous protocols before data acquisition, researchers can mitigate these artifacts at their source, thereby preserving the fidelity of neural signals and the validity of scientific findings.

Understanding the Adversary: A Taxonomy of Physiological Artifacts

A proactive strategy begins with a precise understanding of the artifacts themselves. Physiological artifacts can be categorized by their origin, characteristics, and susceptibility to experimental control. A foundational distinction exists between inherent physiological artifacts, which result from interactions between stimulation-induced voltage and the body, and methodology-related artifacts, which arise from non-ideal equipment or conditions [57]. The former are particularly pernicious as they are present regardless of hardware performance.

Cardiac artifacts manifest in the EEG as rhythmic, periodic fluctuations linked to the heartbeat. These artifacts arise from the electrical field of the heart and the associated pulsatile blood flow, which can modulate scalp potentials. Ocular artifacts are primarily generated by eye blinks and eye movements. The corneo-retinal potential difference creates a robust electric field that moves with the eyes, producing high-amplitude, low-frequency deflections in frontal EEG channels. Myogenic artifacts, or electromyographic (EMG) signals, originate from the contraction of cranial, facial, neck, and jaw muscles. These artifacts are typically high-frequency, non-stationary, and can be localized or diffuse, depending on the muscle group involved [58].

The challenge is compounded during concurrent neuromodulation and recording, such as EEG-tDCS. Here, physiological processes like heartbeat and eye movements cause biological source-specific body impedance changes. This leads to incremental changes in scalp DC voltage that are significantly larger than real neural signals. Because these artifacts modulate the DC voltage and scale with applied current, they are dose-specific, meaning their contamination cannot be accounted for by conventional experimental controls like differing stimulation montage or current [57].

Table 1: Taxonomy and Characteristics of Key Physiological Artifacts in EEG

Artifact Type | Biological Source | Primary EEG Manifestation | Susceptibility to Proactive Control
Ocular | Cornea-retina potential; eye movement [58] | High-amplitude, low-frequency deflections (esp. frontal) | High (via instruction and setup)
Myogenic (Muscle) | Head, neck, jaw muscle contractions [58] | High-frequency, non-stationary, broadband activity | Moderate to High (via instruction, task design, and setup)
Cardiac | Electrical activity of the heart (ECG) [57] | Rhythmic, periodic fluctuations linked to heartbeat | Low (Inherent)
Motion | Head or body movement; cable sway [59] | Large transients or slow, high-amplitude oscillations | High (via instruction and setup)

Optimizing the Experimental Setup

The physical experimental environment and hardware configuration are the first lines of defense against artifact contamination.

Environmental and Hardware Configuration

Creating a controlled recording environment is paramount. The setup should minimize distractions that could prompt unnecessary subject movement, startling, or excessive ocular activity. Furthermore, a meticulous approach to electrode and amplifier setup is required to reduce methodology-related artifacts.

  • Electrode Application and Impedance Management: Consistent, low-impedance electrode-skin contact is critical. Before recording begins, impedances should be stabilized and balanced across all channels to below 5-10 kΩ for active electrodes, or as specified by the amplifier manufacturer; this reduces motion artifacts and baseline drift. Careful skin preparation and abrasive electrolyte gels are essential. For protocols involving significant movement, consider mechanically stabilizing the electrode cap with chin straps or other supports to minimize cable sway and electrode displacement [58].
  • Montage and Referencing: Select a montage appropriate for the research question. A high-density electrode array (e.g., 64+ channels) significantly improves the efficacy of subsequent blind source separation techniques like Independent Component Analysis (ICA) by providing a richer spatial map for source localization [59]. For studies targeting specific brain regions, the use of a Laplacian montage can provide a more localized signal, which can also be beneficial for certain real-time processing algorithms [60].
  • Concurrent Physiological Monitoring: The integration of auxiliary biosignal recordings is a powerful proactive measure. Simultaneous Electrooculography (EOG), Electrocardiography (ECG), and Electromyography (EMG) provide critical data streams to inform and validate artifact removal pipelines [57] [58]. EOG channels are indispensable for characterizing and regressing out ocular artifacts. Likewise, ECG provides a clear cardiac reference, and EMG from the neck or trapezius muscles can help identify periods of generalized muscle tension.

The Scientist's Toolkit: Essential Research Reagent Solutions

Table 2: Key Materials and Equipment for an Optimized EEG Setup

Item | Function & Importance
High-Density EEG System (64+ channels) | Enables superior spatial sampling and more effective ICA decomposition for artifact separation [59].
Abrasive Electrolyte Gels | Ensure stable, low-impedance electrical contact between electrode and skin, reducing noise and drift.
Active Electrode Systems | Minimize cable motion artifacts and environmental interference; beneficial for mobile protocols.
Auxiliary Biosignal Amplifiers | Allow concurrent recording of EOG, ECG, and EMG to provide ground-truth data for artifact identification [57] [58].
Comfortable, Stabilizing Headgear | Reduces gross head movements and electrode shifts, especially in mobile or long-duration studies [58].
ICA Software (e.g., AMICA) | Provides advanced blind source separation to isolate and remove artifactual components from neural data [59].

Optimizing Subject Instruction and Preparation

The human subject is the most dynamic variable in EEG research. Proactive engagement and clear instruction are as crucial as technical setup.

Pre-Experimental Briefing and Training

A comprehensive briefing sets expectations and empowers the subject to be an active participant in data quality control.

  • Explain the "Why": Briefly explain what artifacts are and how specific behaviors (blinking, clenching jaw, frowning) introduce noise that masks the brain's signal. This transforms subject compliance from a passive following of rules to an active collaboration.
  • Provide a "Practice" Period: Before the official recording begins, allow the subject to sit in the setup and practice the task. Instruct them to perform a series of deliberate artifacts (e.g., a few large blinks, looking left and right, clenching their jaw) while the experimenter monitors the data stream. This serves two purposes: it familiarizes the subject with the environment, and it provides the experimenter with a clear, labeled baseline of what artifacts look like for that specific subject, which can later inform cleaning pipelines [58].
  • Define and Practice "Rest" States: Clearly specify what is meant by "rest" in the context of the experiment. For example, "Please keep your eyes open, fixated on the cross, with a relaxed face and jaw, and minimize blinking during the stimulus presentation." A well-defined rest state is critical for establishing a clean baseline.

Strategic Instruction During Recording

Instructions must be tailored to the specific artifact profile of the experimental paradigm.

  • Managing Ocular Artifacts: For tasks requiring visual attention, instruct subjects to minimize blinking during critical trial epochs (e.g., stimulus presentation). Instead, designate specific periods, such as the inter-trial interval, as "free blink" periods. This strategic timing confines the majority of high-amplitude blink artifacts to data segments that can be more easily discounted or rejected [58].
  • Minimizing Myogenic Artifacts: Provide explicit, repeated reminders for subjects to relax their facial, jaw, and neck muscles. Instruct them to keep their tongue relaxed and not pressed against the roof of the mouth, and to ensure their jaw is slightly parted. For protocols involving physical movement (MoBI, sports science), this is more challenging, but subjects can still be coached to minimize unnecessary tension and to maintain consistent, smooth movements to reduce jerk-related EMG bursts [59] [58].
  • General Posture and Movement: Instruct the subject on optimal seating posture and the importance of remaining as still as possible, barring any required movements for the task. Ensure cables are secured and routed in a way that minimizes pulling and swaying.

The following workflow diagram synthesizes these proactive measures into a coherent, step-by-step experimental protocol.

  • Phase 1, Subject Briefing & Training: comprehensive pre-experimental briefing → supervised artifact practice period → define and practice task and rest states.
  • Phase 2, Hardware & Environment Setup: skin prep and electrode application → impedance stabilization and check → configure montage and auxiliary sensors → optimize the physical environment.
  • Phase 3, Strategic In-Task Instruction: ocular artifact control (e.g., blink timing) → myogenic artifact control (e.g., relax jaw/face) → reinforce posture and minimize motion → commence high-quality EEG recording.

A Proactive Experimental Protocol for EEG-tDCS

Combining EEG with transcranial electrical stimulation like tDCS presents unique challenges due to the induction of inherent physiological artifacts. A proactive protocol for such studies must be exceptionally rigorous [57].

  • Extended Pre-Stimulation Baseline: Incorporate a longer resting-state baseline recording (e.g., 5-10 minutes) with the tDCS setup in place but without current flow. This helps characterize the subject's natural artifact profile in the full experimental context.
  • Stimulation Parameter Awareness: Recognize that artifacts are dose-specific. Proactive measures should be intensified with higher current densities or specific montages known to be sensitive to physiological interference.
  • Robust Real-Time Monitoring: During simultaneous tDCS-EEG, use the auxiliary EOG and ECG channels to continuously monitor for cardiac and ocular distortion. Because these artifacts are non-stationary, high-dimensional, and overlap with neurogenic frequencies, their real-time identification is crucial, though their complete removal post-hoc may not be possible with conventional filters without significant signal degradation [57].
  • Post-Stimulation Control: Continue recording after the stimulation ends to monitor the persistence of artifacts, which can inform the analysis of neural after-effects.

In the pursuit of unambiguous neural signals, a proactive stance is not merely beneficial—it is imperative. By integrating a thorough understanding of physiological artifacts with a meticulous approach to experimental setup and subject instruction, researchers can significantly enhance the signal-to-noise ratio at its source. This guide outlines a comprehensive strategy, from the initial subject briefing to the final data acquisition command, designed to fortify data integrity against the pervasive challenge of physiological artifacts. The implementation of these measures, particularly when framed within the broader context of inherent physiological noise, will yield more reliable, interpretable, and valid EEG data, thereby accelerating discovery in neuroscience, clinical neurophysiology, and drug development.

Electroencephalography (EEG) is designed to record cerebral activity, but it invariably captures electrical activities arising from other sources, which are termed artifacts [3]. The accurate identification of these artifacts, particularly physiological artifacts that originate from the patient's own body, is a fundamental challenge in EEG research and clinical practice. These artifacts can significantly distort the EEG signal, potentially leading to misinterpretation of brain activity [23] [10]. For instance, eye flutters may be wrongly identified as epileptic discharges due to similarities in their appearance on EEG [10]. The proliferation of wearable EEG systems for use in real-world environments has intensified these challenges, as uncontrolled settings and the use of dry electrodes make the signals more susceptible to contamination [5] [61]. This guide provides an in-depth technical framework for the real-time monitoring and identification of physiological artifacts during data acquisition, a critical step for ensuring the validity of neurophysiological data in research and drug development.

Classification and Characteristics of Major Physiological Artifacts

Physiological artifacts are generated from the patient's body from sources other than the brain [3]. The most prevalent include ocular activity, muscle activity, and cardiac activity [10]. Each type exhibits distinct spatial, temporal, and spectral characteristics, which are summarized in Table 1 below. Recognizing these signatures is the first step toward their effective management.

Table 1: Characteristics of Common Physiological Artifacts in EEG

Artifact Type | Typical Source | Spectral Profile | Spatial Distribution on Scalp | Morphology
Eye Blink/Movement | Eyeball dipole (cornea-retina); orbicularis oculi muscle [3] [10] | Slow frequency (delta range) [3] | Maximal at frontal and frontopolar electrodes (Fp1, Fp2, F7, F8) [3] [10] | High-amplitude, smooth deflections; blinks cause downward deflection in frontal channels [3]
Muscle (EMG) Activity | Frontalis, temporalis, jaw, and neck muscles [3] | High-frequency (>30 Hz) [10] | Widespread, but often localized over muscle groups (e.g., temporal regions) [3] | High-frequency, spiky, irregular pattern [3]
Cardiac (ECG) Artifact | Electrical activity of the heart (QRS complex) [3] | Corresponds to heart rate (~1-2 Hz) [3] | Often more prominent on left-side electrodes; best seen with earlobe references [3] | Sharp, rhythmic transients synchronous with the QRS complex on ECG [3]
Pulse Artifact | Pulsation of cranial arteries beneath an electrode [3] | Slow frequency (delta range) [3] | Highly localized to a single electrode [3] | Slow, rhythmic waves with a fixed delay (~200-300 ms) after the QRS complex [3]
Glossokinetic Artifact | Tongue movement (tip of tongue is negative) [3] | Delta range [3] | Broad field, maximal at inferior and frontal electrodes [3] | Slow, rhythmic waves synchronous with tongue movement [3]
Sweat Artifact | Skin impedance changes from sweat [23] [3] | Very slow (<0.5 Hz) [23] | Widespread, often anterior [23] | Very slow baseline drifts or sways [23]

The following diagram illustrates the logical workflow for identifying these primary physiological artifacts during real-time monitoring, based on their key characteristics.

Observe the potential artifact, then check its spatial distribution:
  • Frontal/frontopolar: check spectral content. Low-frequency (delta) content indicates an ocular artifact; high-frequency (>30 Hz) content indicates a muscle artifact.
  • Localized to a single electrode: check signal morphology. Rhythmic sharp waves indicate a cardiac artifact; slow rhythmic waves indicate a pulse artifact.
  • Widespread or source-like: further investigation is needed.

Methodologies for Real-Time Artifact Monitoring

Real-time artifact monitoring requires a combination of hardware configurations, signal processing techniques, and automated detection algorithms. The move towards wearable EEG systems demands that these methods be fully automatable and capable of adapting to dynamic environments [61].

Hardware and Acquisition Setup

The foundation of effective artifact management is a high-quality acquisition setup. Modern approaches utilize actively driven ground systems to sense and cancel out common-mode interference, which is crucial for wearable applications [61]. Furthermore, the use of auxiliary sensors is highly recommended to provide reference signals for artifact identification [5]. These include:

  • Electrooculogram (EOG) electrodes: Placed above, below, and lateral to the eyes to isolate electrical activity from eye movements [10].
  • Electrocardiogram (ECG) electrodes: Placed on the torso or limbs to provide a precise reference of the cardiac rhythm [3].
  • Electromyogram (EMG) electrodes: Placed on relevant muscles (e.g., masseter, sternocleidomastoid) to capture muscle activity [61].
  • Inertial Measurement Units (IMUs): To monitor head movement, which is a common source of motion artifacts [5].

Signal Processing and Automated Detection Pipelines

Automated algorithms are essential for real-time artifact monitoring. These pipelines often integrate detection and removal phases, and their performance is typically assessed using metrics like accuracy and selectivity [5]. Common techniques include:

  • Adaptive Artifact Rejection: Methods like Artifact Subspace Reconstruction (ASR) are used in real-time frameworks to automatically identify and remove components of the data that represent artifacts, which is particularly useful for handling high-amplitude, transient artifacts [61].
  • Threshold Rejection: This simple yet effective method involves setting bounding values (e.g., ±75 µV) for the EEG signal. If the data from selected electrodes exceed these thresholds, the corresponding epoch is marked for rejection [62].
  • Trend and Improbability Detection: Algorithms can detect abnormal linear drifts in the data or identify trials with statistically improbable data distributions, which often indicate the presence of artifacts [62].
  • Source-Based Techniques: Advanced methods like Signal-Space Projection-Source-Informed Reconstruction (SSP-SIR) leverage forward head models to separate neural signals from artifact components, such as TMS-evoked muscle artifacts, in the source space [63].
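To make the first three approaches concrete, the sketch below combines threshold, trend, and improbability checks into a single epoch-screening routine. It is a minimal NumPy illustration under assumed parameters (±75 µV amplitude limit, 50 µV maximum drift per epoch, 3-standard-deviation variance bound); the function and variable names are illustrative, not taken from a specific toolbox.

```python
import numpy as np

def screen_epochs(epochs, fs, amp_uv=75.0, max_drift_uv=50.0, z_limit=3.0):
    """Flag artifact epochs. epochs shape: (n_epochs, n_channels, n_samples), in µV."""
    n_epochs, n_channels, n_samples = epochs.shape
    t = np.arange(n_samples) / fs

    # 1. Threshold rejection: any sample beyond ±amp_uv.
    over_amp = np.abs(epochs).max(axis=(1, 2)) > amp_uv

    # 2. Trend rejection: worst-channel least-squares drift over the epoch.
    drifts = np.array([
        max(abs(np.polyfit(t, ch, 1)[0]) * t[-1] for ch in ep) for ep in epochs
    ])
    over_trend = drifts > max_drift_uv

    # 3. Improbability: epoch variance far from the distribution across epochs.
    log_var = np.log(epochs.var(axis=2).mean(axis=1))
    z = (log_var - log_var.mean()) / log_var.std()
    improbable = np.abs(z) > z_limit

    return over_amp | over_trend | improbable

# Example: 21 one-second epochs; epoch 5 carries a 200 µV blink-like transient.
rng = np.random.default_rng(0)
fs = 250
epochs = rng.normal(0.0, 10.0, size=(21, 4, fs))
epochs[5, 0, 100:140] += 200.0
bad = screen_epochs(epochs, fs)   # epoch 5 is flagged by the amplitude check
```

In practice the three detectors are often run in parallel, as in the workflow described below, with the union of their flags marking epochs for rejection or visual confirmation.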

Table 2: Quantitative Thresholds and Parameters for Common Artifact Detection Methods

Detection Method | Key Parameters | Typical Threshold Values / Settings | Primary Artifact Targets
Threshold Rejection [62] | Amplitude limit | ±75 µV (for 32 channels); adjust based on subject and channel count [62] [64] | High-amplitude events (eye blinks, movement)
Trend Rejection [62] | Maximum allowed slope; R-square fit | e.g., slope < 50 µV over epoch duration [62] | Slow drifts, sweat artifacts
Improbable Data Rejection [62] | Standard deviation limits for probability | Single channel: 5 SD; all channels: 3 SD [62] | Unusual, non-Gaussian signals
Channel Statistics [62] | Kurtosis, skewness; Kolmogorov-Smirnov test | p < 0.05 for Gaussian test [62] | "Bad" channels with non-Gaussian noise
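The channel-statistics approach in the last row can be sketched as a kurtosis screen: channels whose sample distributions deviate strongly from Gaussian are flagged as candidate "bad" channels. The following is an illustrative SciPy implementation; the robust z-score formulation and the cutoff value are choices made for this example, not parameters from the cited protocol.

```python
import numpy as np
from scipy import stats

def flag_channels_by_kurtosis(data, z_cutoff=5.0):
    """data: (n_channels, n_samples) array; returns a boolean mask of suspect channels.

    Fisher (excess) kurtosis is ~0 for Gaussian noise; a channel whose kurtosis
    is an outlier relative to the other channels is flagged. Median and MAD are
    used so that a single bad channel does not distort the reference statistics.
    """
    k = stats.kurtosis(data, axis=1)                       # per-channel excess kurtosis
    z = (k - np.median(k)) / (stats.median_abs_deviation(k) + 1e-12)
    return np.abs(z) > z_cutoff

rng = np.random.default_rng(1)
data = rng.normal(0, 10, size=(8, 5000))
data[3, ::50] += 300.0      # inject spiky, heavy-tailed noise into channel 3
mask = flag_channels_by_kurtosis(data)   # channel 3 stands out
```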

The following diagram outlines a comprehensive real-time processing workflow that integrates several of these techniques.

Raw EEG data stream → pre-processing (bandpass filter, e.g., 0.3-50 Hz) → parallel artifact detection modules (threshold method, trend detection, improbability detection, adaptive methods such as ASR) → mark epochs for rejection (automatic rejection or visual confirmation) → cleaned data output for analysis.

Experimental Protocols for Artifact Assessment

To ensure rigorous and reproducible research, implementing standardized experimental protocols for artifact assessment is crucial. The following methodologies are cited in the literature.

Protocol for Epoch-Based Artifact Rejection

This protocol, adapted from bio-protocol, details a multi-stage process for artifact rejection in epoched data [64]:

  • Noisy Channel Substitution: Detect and substitute consistently noisy individual channels. The noisy channels are replaced with the average signals of the six nearest electrodes surrounding the noisy electrode. After this, re-reference the EEG to the common average of all electrodes.
  • Gross Artifact Rejection: To reject data recorded during coordinated muscle movements or blinks, exclude 1-second-long epochs for all electrodes if signals from more than 5% (e.g., 7 out of 128) of the electrodes exceed a set threshold amplitude (e.g., 60–520 µV; median, 100 µV) at any point during the epoch.
  • Remove Data Segments: Exclude the first and last 1 second of the recording from data analysis to avoid initial instabilities and termination effects.
  • Final Epoch Scrubbing: Finally, exclude 1-second epochs from individual electrodes if more than 10% of the epoch samples exceed a set amplitude limit (e.g., ±30 µV) [64].
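Steps 2 and 4 of this protocol lend themselves to a direct implementation. The NumPy sketch below marks 1-second epochs for rejection when more than 5% of channels exceed a gross threshold, then scrubs per-electrode epochs in which more than 10% of samples exceed ±30 µV. Threshold values follow the examples above; the function names are my own.

```python
import numpy as np

def gross_reject(data, fs, gross_thresh_uv=100.0, max_bad_frac=0.05):
    """Return indices of 1-s epochs in which >max_bad_frac of channels exceed threshold.

    data: (n_channels, n_samples) in µV; fs: sampling rate (samples per 1-s epoch).
    """
    n_channels, n_samples = data.shape
    bad_epochs = []
    for e in range(n_samples // fs):
        seg = data[:, e * fs:(e + 1) * fs]
        n_bad = np.sum(np.abs(seg).max(axis=1) > gross_thresh_uv)
        if n_bad > max_bad_frac * n_channels:
            bad_epochs.append(e)
    return bad_epochs

def scrub_epochs(data, fs, limit_uv=30.0, max_sample_frac=0.10):
    """Boolean mask (n_channels, n_epochs): True where an electrode-epoch is
    rejected because >max_sample_frac of its samples exceed ±limit_uv."""
    n_channels, n_samples = data.shape
    n_epochs = n_samples // fs
    segs = data[:, :n_epochs * fs].reshape(n_channels, n_epochs, fs)
    frac_over = (np.abs(segs) > limit_uv).mean(axis=2)
    return frac_over > max_sample_frac

# Example: 128 channels, 10 s; a blink-like 400 µV event hits 10 frontal
# channels during epoch 3.
rng = np.random.default_rng(2)
fs = 200
data = rng.normal(0, 5, size=(128, 10 * fs))
data[:10, 3 * fs:4 * fs] += 400.0
bad = gross_reject(data, fs)      # → [3]
mask = scrub_epochs(data, fs)
```

Ten affected channels exceed the 5% criterion (6.4 of 128 channels), so epoch 3 is rejected globally; the scrubbing pass additionally flags only the individual electrode-epochs carrying the transient.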

Real-Time Monitoring System for Wearable EEG

A study designing a real-time system for cerebral palsy rehabilitation exemplifies a hardware-software co-design approach [65]:

  • Hardware Design: The system uses a main control chip (ESP32) to integrate data from a dry-electrode EEG sensor module, a muscle electrical (EMG) sensor module, and a blood-oxygen/heart-rate acquisition module (MAX30102). The EEG module integrates hardware and software filtering, including a 50 Hz notch filter to suppress environmental interference [65].
  • Software Design: The software performs data receiving, processing, storage, and visualization. It enables the visual monitoring of EEG and other physiological signals in real-time, allowing for immediate adjustment of rehabilitation training [65].

The Scientist's Toolkit: Research Reagent Solutions

This table details key hardware and software solutions used in advanced EEG artifact research and monitoring systems.

Table 3: Essential Research Tools for Real-Time EEG Artifact Monitoring

Tool / Material | Specification / Function | Research Application
High-Density Dry EEG Headset [61] | 64-channel dry-electrode system with wireless data streaming and active noise cancellation | Enables high-quality EEG acquisition in mobile, real-world environments, forming the basis for real-time analysis
Auxiliary Biosensors (EOG, ECG, EMG, IMU) [5] [61] | Sensors to record eye movement, heart electrical activity, muscle activity, and head motion | Provide reference signals for identifying and separating physiological artifacts from cerebral activity
Artifact Subspace Reconstruction (ASR) [61] | An adaptive, online-capable method for identifying and removing artifact components from the data | Used in real-time pipelines for cleaning high-amplitude, transient artifacts without manual intervention
Source-Based Cleaning (SSP-SIR, SOUND) [63] | Algorithms that use forward head models to separate neural and artifact signals in the source space | Particularly effective for suppressing muscle and other structured artifacts in TMS-EEG and other paradigms
Independent Component Analysis (ICA) [10] | A blind source separation technique implemented in toolboxes like EEGLAB | Used post hoc or in near-real-time to isolate and remove artifact components (e.g., blink, cardiac) from continuous data

The accurate identification of physiological artifacts during real-time EEG monitoring is a non-trivial challenge that is critical for the integrity of neuroscientific research and clinical applications. As EEG technology evolves toward wearable, real-world use, the nature of artifacts becomes more complex, necessitating advanced and adaptive solutions. A successful strategy involves a multi-layered approach: a robust hardware setup with auxiliary sensors, the implementation of automated, quantitative detection algorithms, and a thorough understanding of the characteristic signatures of different artifact types. While techniques like adaptive filtering, source-based reconstruction, and machine learning offer promising paths forward, researchers must be aware that artifacts are "legion and pervasive" [23]. Continuous vigilance and refinement of these monitoring protocols are essential to ensure that the signals analyzed truly reflect cortical activity rather than extracerebral contamination.

Electroencephalography (EEG) is a vital tool in neuroscience research, clinical diagnosis, and drug development. However, the accurate interpretation of neural signals is fundamentally compromised by physiological artifacts—unwanted signals originating from the subject's own body. These artifacts, which include ocular, muscular, and cardiac activities, can obscure genuine brain activity, leading to biased analyses and erroneous conclusions. The effective removal of these artifacts is not a one-size-fits-all process; it requires a strategic selection of techniques tailored to the specific artifact type, data characteristics, and research objectives. This guide provides an in-depth technical framework for matching artifact types with optimal removal strategies, enabling researchers to enhance data integrity and reliability in EEG research.

Understanding Physiological Artifacts in EEG

Physiological artifacts are signals recorded by EEG that do not originate from cerebral activity. Their amplitude is often significantly larger than that of neural signals, sometimes by an order of magnitude, which can severely reduce the signal-to-noise ratio and mask the brain's electrical activity [19]. A foundational knowledge of their origin and characteristics is the first step toward their effective removal.

  • Definition and Impact: An EEG artifact is any recorded signal not generated by the brain. In the context of a research thesis, it is crucial to recognize that these artifacts can mimic true epileptiform abnormalities, seizures, or other pathological or cognitive rhythms, posing a significant risk of clinical misdiagnosis or biased research findings [22] [19].
  • The Challenge of Overlap: A primary difficulty in artifact removal is the substantial overlap in the frequency spectra of artifacts and genuine EEG signals. For instance, ocular artifacts dominate the low-frequency delta and theta bands, while muscle artifacts are broadband, affecting beta and gamma ranges essential for studying cognitive and motor processes [19] [66]. This spectral overlap renders simple filtering techniques often ineffective or detrimental, necessitating more sophisticated approaches.
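The spectral-overlap point can be made concrete with a quick power spectral density comparison. The sketch below is a toy example with simulated signals, not real EEG: it uses Welch's method from SciPy to show that a blink-like low-frequency transient deposits power directly into the theta range occupied by a neural rhythm, which is why a simple frequency cut cannot separate the two. All signal parameters here are invented for illustration.

```python
import numpy as np
from scipy import signal

fs = 250
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(3)

# Toy "neural" signal: a 6 Hz theta rhythm plus background noise.
eeg = 5 * np.sin(2 * np.pi * 6 * t) + rng.normal(0, 2, t.size)

# Toy blink artifact: large, slow Gaussian bumps (~150 µV) recurring every 2 s.
blink = np.zeros_like(t)
for onset in np.arange(0.5, 10, 2.0):
    blink += 150 * np.exp(-((t - onset) ** 2) / (2 * 0.05 ** 2))

f, p_eeg = signal.welch(eeg, fs=fs, nperseg=1024)
_, p_mix = signal.welch(eeg + blink, fs=fs, nperseg=1024)

# Theta-band (4-8 Hz) power is strongly inflated: the blink's spectrum
# overlaps the neural band instead of staying below it.
theta = (f >= 4) & (f <= 8)
inflation = p_mix[theta].sum() / p_eeg[theta].sum()
```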

Classification and Characteristics of Major Artifacts

The table below summarizes the key physiological artifacts, their properties, and their impact on the EEG signal.

Table 1: Characteristics of Major Physiological EEG Artifacts

Artifact Type | Origin | Main Topography | Time-Domain Signature | Frequency-Domain Signature | Amplitude Range
Ocular (EOG) | Corneo-retinal dipole (eye blinks, movements) [19] | Bifrontal (Fp1, Fp2) [22] | Sharp, high-amplitude deflections [19] | Delta/theta bands (<8 Hz) [66] | 100-200 µV [19]
Muscular (EMG) | Muscle contractions (jaw, neck, face) [1] | Frontal, temporal regions [66] | High-frequency, chaotic activity [22] | Broadband, beta/gamma (>13 Hz) [19] | Varies with contraction
Cardiac (ECG/Pulse) | Heart electrical activity or arterial pulsation [1] | Central regions; near neck vessels [19] | Rhythmic, recurring waveforms [19] | Overlaps multiple EEG bands [19] | Low, but visible
Sweat/Skin Potentials | Changes in skin impedance due to sweat [66] | Variable, can be generalized [22] | Very slow baseline drifts (<0.5 Hz) [22] [66] | Very low frequencies (<1 Hz) [66] | Low amplitude, slow shifts

The following workflow diagram outlines the logical process for identifying these common physiological artifacts during EEG review.

Start the EEG review, then check in sequence:
  • Frontal channels: high-amplitude slow waves indicate an ocular artifact (eye blink/movement).
  • High-frequency noise: very fast, "fuzzy" activity indicates a muscular (EMG) artifact.
  • Rhythmic patterns: regular spikes locked to the heartbeat indicate a cardiac (ECG/pulse) artifact.
  • Slow baselines: very slow baseline wander indicates a sweat/skin-potential artifact.

A range of techniques from traditional signal processing to modern deep learning is available for artifact removal. Each has distinct strengths, weaknesses, and optimal use cases.

Traditional and Blind Source Separation (BSS) Techniques

  • Regression Methods: These are traditional methods that use a linear model to estimate and subtract the artifact contribution from the EEG channels based on a reference signal (e.g., EOG). A significant limitation is the requirement for a separate reference channel and the risk of "over-subtraction" due to bidirectional interference, where the EEG signal also contaminates the reference channel [1].
  • Blind Source Separation (BSS): BSS methods, such as Independent Component Analysis (ICA), are among the most frequently used techniques [1]. They work by decomposing the multi-channel EEG signal into statistically independent components. The artifactual components are then identified and removed, and the remaining components are projected back to the sensor space. ICA is highly effective for ocular and, to some extent, muscular artifacts but requires multi-channel data and often involves manual component selection, which can be time-consuming [5] [67].
  • Wavelet Transform: This technique is powerful for analyzing non-stationary signals like EEG. It decomposes the signal into different frequency bands at different points in time, allowing for the targeted removal of artifactual coefficients before reconstruction. It is often applied for managing ocular and muscular artifacts and can be effective for single-channel data [5].
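A minimal regression-based correction, as described above, amounts to an ordinary least-squares fit of each EEG channel onto the EOG reference followed by subtraction of the fitted contribution. The sketch below is a generic NumPy illustration on simulated data with illustrative names, and it inherits the limitation noted above: any neural signal leaking into the EOG channel is subtracted along with the artifact.

```python
import numpy as np

def regress_out_eog(eeg, eog):
    """Subtract the least-squares EOG contribution from each EEG channel.

    eeg: (n_channels, n_samples); eog: (n_samples,). Returns corrected EEG.
    """
    eog = eog - eog.mean()
    # Propagation coefficient b_i = cov(eeg_i, eog) / var(eog) for each channel.
    b = (eeg - eeg.mean(axis=1, keepdims=True)) @ eog / (eog @ eog)
    return eeg - np.outer(b, eog)

# Simulated example: 3 channels, each contaminated by a scaled copy of the EOG.
rng = np.random.default_rng(4)
n = 2000
eog = 100 * np.convolve(rng.normal(0, 1, n), np.ones(25) / 25, mode="same")
brain = rng.normal(0, 10, size=(3, n))
coupling = np.array([0.8, 0.4, 0.1])        # frontal channels couple strongest
eeg = brain + np.outer(coupling, eog)

clean = regress_out_eog(eeg, eog)
# By construction of the least-squares fit, the residual is uncorrelated
# with the reference channel.
resid_corr = np.corrcoef(clean[0], eog)[0, 1]
```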

Emerging Deep Learning (DL) Techniques

Deep learning represents a paradigm shift in artifact removal, moving towards automated, end-to-end solutions.

  • Core Principle: DL models, such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks, learn to map artifact-contaminated EEG signals to their clean counterparts in a supervised manner using large datasets [20]. They can jointly extract spatial (morphological) and temporal features, making them highly adaptable.
  • Architectural Innovations: Modern architectures are designed to handle specific challenges. For instance, CLEnet integrates dual-scale CNNs with LSTM and an attention mechanism to extract features at multiple scales and capture long-range temporal dependencies, showing superior performance in removing mixed and unknown artifacts from multi-channel data [20]. Other models, like State Space Models (SSMs), have shown excellence in removing complex, structured artifacts such as those induced by transcranial Electrical Stimulation (tES) [48].
  • Advantages: DL methods overcome key limitations of traditional techniques: they do not require reference channels, can be fully automated, and are capable of learning complex, non-linear relationships between artifacts and neural signals [20].

Strategy Selection: Matching Techniques to Artifacts

Selecting the optimal artifact removal strategy depends on a careful consideration of the artifact type, available data, and research context. The following diagram provides a high-level decision pathway for this selection.

Start by identifying the dominant artifact, then:
  • Has a clean reference channel (e.g., EOG) been recorded? If yes, use a regression-based technique.
  • If not, is it a multi-channel EEG setup? If yes, use ICA or another BSS method (ocular and some muscle artifacts).
  • If single-channel: is the artifact complex or unknown, or is automation key? If yes, use deep learning (e.g., CNN-LSTM); if the artifact type is known, use the wavelet transform.

Table 2: Strategic Matching of Removal Techniques to Artifact Types

Artifact Type | Highly Recommended Techniques | Alternative Techniques | Key Considerations & Experimental Protocol
Ocular (EOG) | ICA [5] [66] | Regression (with EOG reference) [1]; deep learning (CNN-LSTM) [20] | Protocol for ICA: 1. Apply a high-pass filter (e.g., 1 Hz). 2. Run ICA (e.g., Infomax algorithm). 3. Identify components with large frontal topography, low frequency, and high correlation with EOG. 4. Remove components and reconstruct the signal.
Muscular (EMG) | Deep learning (e.g., NovelCNN, CLEnet) [20]; wavelet transform [5] | ICA (for persistent, localized artifacts) [66]; artifact rejection | Protocol for DL: 1. Use a pre-trained model (e.g., CLEnet) on a semi-synthetic dataset. 2. Input raw multi-channel epochs. 3. The model outputs clean EEG. Performance is evaluated via SNR and correlation coefficient.
Cardiac (ECG/Pulse) | ICA [66]; template subtraction (with ECG reference) | Filtering (if frequency is distinct) | Protocol for template subtraction: 1. Record simultaneous ECG. 2. Detect QRS complexes. 3. Create an average pulse-artifact template. 4. Subtract the time-locked template from the EEG.
Sweat/Skin Potentials | High-pass filtering (e.g., 0.5 Hz cutoff) [66] | - | Protocol: Use a zero-phase high-pass filter to remove slow drifts without distorting the timing of subsequent event-related potentials (ERPs).
Motion & Complex Artifacts | Deep learning (multi-modular SSM/CNN) [20] [48]; ASR (Artifact Subspace Reconstruction) [5] | ICA (for specific movement types) [67] | Protocol for ASR: 1. Define a clean segment of initial data as a calibration baseline. 2. Set a threshold (e.g., 3 SD). 3. Reconstruct data portions that exceed the threshold using a mixing matrix.
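The high-pass protocol for slow sweat drifts can be sketched with SciPy's forward-backward (zero-phase) filtering. A zero-phase filter matters here because a causal filter would shift ERP latencies; `sosfiltfilt` applies the filter twice, once in each direction, cancelling the phase delay. The cutoff follows the 0.5 Hz figure above, while the filter order and the simulated signal are example choices.

```python
import numpy as np
from scipy import signal

def highpass_zero_phase(data, fs, cutoff_hz=0.5, order=4):
    """Zero-phase Butterworth high-pass along the last axis (time)."""
    sos = signal.butter(order, cutoff_hz, btype="highpass", fs=fs, output="sos")
    return signal.sosfiltfilt(sos, data, axis=-1)

# Example: a 10 Hz alpha rhythm riding on a large, very slow sweat-like drift.
fs = 250
t = np.arange(0, 40, 1 / fs)
drift = 80 * np.sin(2 * np.pi * 0.05 * t)   # slow baseline sway
alpha = 10 * np.sin(2 * np.pi * 10 * t)
filtered = highpass_zero_phase(drift + alpha, fs)
# The drift is suppressed while the 10 Hz component survives essentially intact.
```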

The Scientist's Toolkit: Essential Research Reagents and Materials

Successful artifact management extends beyond software algorithms to include critical hardware and data resources.

Table 3: Essential Research Reagents and Materials for EEG Artifact Research

| Item | Function & Application |
| --- | --- |
| High-Density EEG System (64+ channels) | Provides high spatial resolution, which is crucial for the efficacy of source separation techniques like ICA. The number of channels is a key factor in their performance [5]. |
| Auxiliary Reference Sensors (EOG, EMG, ECG) | Provide a dedicated, clean recording of physiological activity for use in regression-based methods or for validating the output of automated removal techniques [1]. |
| Active Electrode Systems | Amplify the signal at the electrode site, which reduces susceptibility to cable movement artifacts and environmental interference [66]. |
| Semi-Synthetic Benchmark Datasets | Datasets in which clean EEG is artificially contaminated with known artifacts. These are essential for training, validating, and benchmarking artifact removal algorithms, especially deep learning models [20]. |
| Public Datasets with Real Artifacts | Real-world EEG data with annotated artifacts, critical for testing the ecological validity and generalization of artifact removal pipelines outside controlled, semi-synthetic conditions [5]. |

In EEG research, the strategic selection of artifact removal techniques is paramount for data integrity. As this guide illustrates, the optimal strategy is contingent on a clear identification of the artifact type and a nuanced understanding of the available methodological arsenal. While established techniques like ICA remain powerful for specific, well-defined artifacts like those from ocular sources, the field is increasingly moving towards sophisticated, automated deep learning solutions. These DL methods offer unparalleled promise for handling complex, mixed, and unknown artifacts, especially in the challenging and ecologically valid environments that characterize modern research, including wearable EEG and drug development studies. By adopting a deliberate, evidence-based strategy for artifact management, researchers can ensure the fidelity of their neural data and the robustness of their scientific conclusions.

Electroencephalography (EEG) is a fundamental tool in neuroscience and clinical diagnostics, prized for its high temporal resolution and non-invasive nature [6]. However, a significant challenge inherent to EEG recording is contamination by physiological artifacts—unwanted signals originating from the patient's own body that do not stem from cerebral cortical activity [19] [68]. These artifacts can obscure genuine neural signals, potentially leading to misinterpretation of brain activity and, in clinical settings, to misdiagnosis [19] [10]. For researchers and drug development professionals, accurate artifact handling is not merely a technical preprocessing step but a critical component in ensuring the validity of neurophysiological biomarkers and treatment efficacy assessments.

Physiological artifacts are traditionally categorized by their biological source. The most common and challenging include:

  • Ocular Artifacts: Generated by eye blinks, saccades, and movements of the eyelids and eyeball [19] [69].
  • Muscle Artifacts (EMG): Produced by contractions of head, face, neck, and jaw muscles [68] [10].
  • Cardiac Artifacts: Arising from the electrical activity of the heart (ECG) or pulsation of blood vessels near electrodes (pulse artifact) [19] [10].

This guide focuses specifically on the complex scenarios involving myogenic (muscle), pulse, and persistent ocular artifacts, providing an in-depth technical analysis of their characteristics, detection, and removal within the context of modern EEG research.

Characterization of Challenging Artifacts

Understanding the spatial, temporal, and spectral signatures of these artifacts is the first step toward their effective mitigation.

Myogenic (Muscle) Artifacts

Muscle contractions generate electrical signals known as electromyography (EMG). Because myogenic activities from the head, face, and neck muscles are conducted through the entire scalp, they can be monitored across most EEG electrodes [68].

  • Origin: Facial, jaw, neck, and scalp muscle contractions [19] [10].
  • Impact: EMG signals are broadband and high-frequency, introducing significant noise that overlaps with and can mask important cognitive and motor EEG rhythms [19].
  • Spectral Signature: The frequency range of EMG activity is wide, being maximal at frequencies higher than 30 Hz, dominating the beta (13–30 Hz) and gamma (>30 Hz) bands [19] [68].
  • Spatial Distribution: Generalized activity across the scalp, with amplitude dependent on the muscle type and contraction force [6] [10]. Studies using ear-EEG have shown that jaw-related artifacts can be even more pronounced in the ear compared to scalp electrodes [70].

Pulse (Cardiac Ballistic) Artifacts

Cardiac-related artifacts manifest in EEG in two primary forms: the electrical signal from the heart (ECG) and the pulse artifact.

  • Origin: The pulse artifact, or cardio-ballistic artifact, occurs when an EEG electrode is placed directly over a pulsating blood vessel [10].
  • Impact: This artifact resembles a slow, rhythmic, pulse-synchronous baseline shift and can be mistaken for genuine slow-wave brain activity [10]. Its morphology is not identical to the QRS complex of the ECG.
  • Spectral Signature: Rhythmic waveforms recurring at the heart rate (typically 0.8-2 Hz or 48-120 BPM), primarily affecting the delta band [19].
  • Spatial Distribution: Typically localized to a single electrode or a small cluster of electrodes positioned over a superficial artery [10]. It is more likely to be present on the left side of the scalp due to the heart's position [10].

Persistent Ocular Artifacts

Eye movements and blinks are a major source of contamination, especially for frontal electrodes.

  • Origin: The corneo-retinal dipole (charge difference between the cornea and retina), eyelid movement over the cornea, and extraocular muscle activity [19] [69].
  • Impact: Ocular artifacts have amplitudes (100–200 µV) that are an order of magnitude larger than background EEG activity, overwhelming the signal [19] [69]. Their bandwidth (3–15 Hz) critically overlaps with the EEG theta and alpha bands [69].
  • Spectral Signature: Dominant in low frequencies, particularly delta (0.5–4 Hz) and theta (4–8 Hz) bands [19].
  • Spatial Distribution: Greatest influence over frontal and prefrontal electrodes. Lateral eye movements most affect electrodes near the temples (F7, F8) [19] [10].
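The low-frequency dominance described above can be checked numerically on a frontal channel by comparing delta/theta band power against beta-band power. The following is a minimal periodogram sketch on a synthetic channel; the blink amplitude, width, and timing are illustrative assumptions, not values from the cited studies.

```python
import numpy as np

def band_power(x, fs, f_lo, f_hi):
    """Mean periodogram power of x in the [f_lo, f_hi) Hz band."""
    spec = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    band = (freqs >= f_lo) & (freqs < f_hi)
    return spec[band].mean()

fs = 250
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(7)
background = rng.standard_normal(t.size)            # broadband background EEG
blinks = 150 * np.exp(-((t % 2 - 1) ** 2) / 0.01)   # ~150 uV blink every 2 s
frontal = background + blinks
# Delta/theta (0.5-8 Hz) vs. beta (8-30 Hz) power: blinks inflate the low band
low_high_ratio = band_power(frontal, fs, 0.5, 8) / band_power(frontal, fs, 8, 30)
```

A ratio far above 1 on frontal channels, absent on posterior channels, is a simple screen for ocular contamination.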

Table 1: Quantitative Characterization of Challenging Physiological Artifacts

| Artifact Type | Spectral Band | Amplitude Range | Spatial Topography | Temporal Signature |
| --- | --- | --- | --- | --- |
| Myogenic (EMG) | Beta/Gamma (>13 Hz) [19] [68] | Variable, proportional to contraction force [10] | Generalized, but focused near muscle groups (temples, neck) [70] [10] | High-frequency, non-stationary bursts [19] |
| Pulse (Cardiac) | Delta (0.5–4 Hz) [19] | Low amplitude, but significant for baseline | Localized to electrodes over vessels [10] | Slow, rhythmic, pulse-synchronous waves [10] |
| Persistent Ocular | Delta/Theta (0.5–8 Hz) [19] [69] | High (100–200 µV) [19] [69] | Frontal, prefrontal (Fp1, Fp2, F7, F8) [19] [10] | Sharp, high-amplitude deflections from blinks; smoother from saccades [19] |

Experimental Protocols for Artifact Investigation

Robust experimentation is required to quantify the impact of artifacts and validate removal techniques. The following protocols are commonly used in controlled studies.

Protocol for Quantifying Artifact Impact on Steady-State Responses

This method is effective for quantifying how artifacts degrade the signal-to-noise ratio (SNR) of a known neurophysiological response [70].

  • Stimulus Presentation: Administer a steady-state stimulus, such as a 40 Hz amplitude-modulated auditory tone, to elicit an Auditory Steady-State Response (ASSR). This response is stable and does not interact with most artifacts [70].
  • EEG Acquisition: Record EEG from scalp and/or ear electrodes under two conditions: a relaxed baseline and an artifact condition (e.g., jaw clenching for EMG, forced blinking for ocular, or normal rest for pulse).
  • Signal Processing: Calculate the SNR for the ASSR in both conditions. The SNR is defined as the ratio between the power at the stimulus frequency (e.g., 40 Hz) and the average power in the surrounding frequency bins, excluding harmonics [70].
  • Quantitative Analysis: Compute the Signal-to-Noise Ratio Deterioration (SNRD) as the difference in SNR between the relaxed and artifact conditions: SNRD = SNR_relaxed - SNR_artifact [70]. A positive SNRD indicates the artifact has degraded the signal quality.
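The SNR and SNRD computations in steps 3 and 4 can be sketched as follows, assuming a single-channel recording and a 40 Hz ASSR; the exact number of side bins and the exclusion of the bins immediately adjacent to the stimulus frequency are illustrative choices, not the precise procedure from [70].

```python
import numpy as np

def assr_snr_db(signal, fs, f_stim=40.0, n_side=10):
    """SNR at the stimulus frequency: power in the stimulus bin divided by
    the mean power of neighbouring bins (nearest bins skipped)."""
    spec = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    k = int(np.argmin(np.abs(freqs - f_stim)))
    side = np.r_[spec[k - n_side:k - 1], spec[k + 2:k + n_side + 1]]
    return 10 * np.log10(spec[k] / side.mean())

fs = 500
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(1)
assr = np.sin(2 * np.pi * 40 * t)                    # 40 Hz steady-state response
relaxed = assr + 0.5 * rng.standard_normal(t.size)   # baseline condition
clench = assr + 5.0 * rng.standard_normal(t.size)    # EMG-like broadband noise
snrd = assr_snr_db(relaxed, fs) - assr_snr_db(clench, fs)  # SNRD > 0 => degradation
```

With a 10 s window the frequency resolution is 0.1 Hz, so the 40 Hz response falls exactly on a bin and spectral leakage into the side bins is avoided.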

Protocol for Machine Learning-Based Blink Detection

This protocol outlines the steps for building a supervised classifier to identify eye-blink events [71].

  • Data Acquisition & Labeling: Collect EEG data during a paradigm where subjects are cued to blink at intervals. Expert reviewers or synchronized EOG recording are used to label epochs as "blink" or "non-blink."
  • Feature Extraction: From the labeled EEG epochs, compute a set of potential features. Comparative studies have evaluated features including:
    • Scalp Topography: Spatial distribution of voltage across electrodes [71].
    • Statistical Features: Variance, kurtosis, skewness.
    • Time-Frequency Features: Wavelet coefficients.
    • Spectral Features: Band power in delta, theta, alpha, etc.
  • Classifier Training: Partition the data into training and testing sets. Train multiple machine learning classifiers (e.g., Artificial Neural Networks, Support Vector Machines) using the extracted features.
  • Model Validation: Evaluate classifier performance on the held-out test set using metrics such as accuracy, precision, recall, and F1-score [71]. Research has found that a combination of scalp topography features and an Artificial Neural Network classifier can achieve superior performance [71].
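A compressed version of this pipeline on semi-synthetic data might look as follows. For brevity, a nearest-centroid rule stands in for the ANN/SVM training step, and the feature set is the variance/kurtosis/skewness triple listed above; all signal parameters are illustrative assumptions.

```python
import numpy as np

def epoch_features(epoch):
    """Statistical features per epoch: variance, kurtosis, skewness."""
    x = epoch - epoch.mean()
    s = x.std()
    return np.array([x.var(), (x ** 4).mean() / s ** 4, (x ** 3).mean() / s ** 3])

# Semi-synthetic frontal epochs: blinks are high-amplitude transients
rng = np.random.default_rng(2)
t = np.arange(256)
blink_shape = 80 * np.exp(-((t - 128) ** 2) / 400.0)
blink_epochs = [rng.standard_normal(256) + blink_shape for _ in range(20)]
clean_epochs = [rng.standard_normal(256) for _ in range(20)]

X = np.array([epoch_features(e) for e in blink_epochs + clean_epochs])
y = np.array([1] * 20 + [0] * 20)

# Nearest-centroid rule standing in for the ANN/SVM training step
c_blink, c_clean = X[y == 1].mean(axis=0), X[y == 0].mean(axis=0)
pred = np.array([int(np.linalg.norm(f - c_blink) < np.linalg.norm(f - c_clean))
                 for f in X])
accuracy = (pred == y).mean()
```

In practice the data would be split into training and test sets and the held-out metrics (accuracy, precision, recall, F1) reported, as described in step 5.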

Workflow: data acquisition (EEG + EOG reference) → expert epoch labeling (blink vs. non-blink) → feature extraction → classifier training (e.g., ANN, SVM) → model validation (accuracy, F1-score) → deployable blink detection model.

Figure 1: Machine Learning Workflow for Blink Detection

Advanced Artifact Removal Methodologies

Conventional techniques like simple filtering are often insufficient for the targeted artifacts due to spectral overlap with neural signals. Advanced methods are required.

Deep Learning-Based Approaches

Deep learning models have emerged as powerful, data-driven tools for end-to-end artifact removal.

  • Generative Adversarial Networks (GANs): Models like AnEEG use an LSTM-based GAN architecture. The generator takes artifact-contaminated EEG and attempts to produce clean EEG, while the discriminator tries to distinguish between the generated signal and a ground-truth clean signal. This adversarial training forces the generator to produce realistic, artifact-free data [6].
  • Hybrid CNN-LSTM Models: Networks such as CLEnet integrate Convolutional Neural Networks (CNNs) to extract spatial/morphological features and Long Short-Term Memory (LSTM) networks to capture temporal dependencies in the EEG signal. An attention mechanism (e.g., EMA-1D) can be incorporated to enhance feature selection, leading to improved performance in removing mixed artifacts (EMG+EOG) from multi-channel data [20].

These deep learning methods have been shown to outperform traditional techniques like wavelet decomposition, achieving lower relative root mean square error (RRMSE) and higher signal-to-noise ratio (SNR) and correlation coefficient (CC) with the ground-truth signal [6] [20].
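The benchmark metrics named here (RRMSE, SNR, CC) are straightforward to compute when ground-truth clean EEG is available, as it is in semi-synthetic datasets. A minimal sketch, with an illustrative signal and residual-error level:

```python
import numpy as np

def denoising_metrics(ground_truth, denoised):
    """RRMSE (lower is better), SNR in dB and correlation coefficient
    (both higher are better) against the known clean signal."""
    err = denoised - ground_truth
    rrmse = np.sqrt((err ** 2).mean() / (ground_truth ** 2).mean())
    snr_db = 10 * np.log10((ground_truth ** 2).mean() / (err ** 2).mean())
    cc = np.corrcoef(ground_truth, denoised)[0, 1]
    return rrmse, snr_db, cc

rng = np.random.default_rng(3)
t = np.arange(0, 2, 1 / 250)
ground_truth = np.sin(2 * np.pi * 10 * t)                    # 10 Hz alpha-like signal
denoised = ground_truth + 0.1 * rng.standard_normal(t.size)  # small residual error
rrmse, snr_db, cc = denoising_metrics(ground_truth, denoised)
```

Note that SNR in dB and RRMSE carry the same information (SNR = -20·log10(RRMSE)), so papers typically report one of the two alongside CC.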

Table 2: Performance Comparison of Advanced Artifact Removal Techniques

| Method | Underlying Principle | Best For Artifact Type | Reported Performance (Example) |
| --- | --- | --- | --- |
| Regression (Time-Domain) | Linear subtraction of EOG template [69] | Ocular (with reference EOG) | Similar performance to frequency-domain regression [69] |
| Independent Component Analysis (ICA) | Blind source separation & component rejection [72] [69] | Ocular, Cardiac (high-density EEG) | Requires manual inspection; degrades with low channel count [5] |
| Artifact Subspace Reconstruction (ASR) | Statistical detection & reconstruction of artifact subspaces [69] | Ocular, Motion, Instrumental | Suitable for real-time processing [5] |
| GAN (e.g., AnEEG) | Adversarial learning to generate clean EEG [6] | Muscular, Ocular, Mixed | Lower NMSE/RMSE, higher CC & SNR vs. wavelet methods [6] |
| Hybrid CNN-LSTM (e.g., CLEnet) | Spatial feature extraction + temporal modeling [20] | EMG, EOG, Unknown, Multi-channel | SNR: 11.50 dB, CC: 0.925 for mixed artifact removal [20] |

Specific Techniques for Challenging Scenarios

  • For Muscle Artifacts: Because muscle signals are widespread and broadband, methods like ICA can struggle. Adaptive filtering and deep learning approaches that learn the non-stationary characteristics of EMG are often more effective [20] [10]. CLEnet has demonstrated particular efficacy in removing EMG artifacts by leveraging its dual-scale CNN to capture morphological features of the contamination [20].
  • For Pulse Artifacts: Removal is challenging due to the lack of a simple template. One approach is to use a simultaneously recorded ECG channel as a reference for adaptive filtering or ICA [10]. In the absence of ECG, algorithmic detection of the pulse waveform from the contaminated EEG channel, followed by subtraction or interpolation, may be necessary.
  • For Persistent Ocular Artifacts: Beyond standard regression and ICA, advanced methods like EEGENet (a GAN-based framework) have been developed specifically for ocular artifact removal under various conditions (blinks, vertical/horizontal movements) and can operate without a separate EOG reference by learning from pre-processed "clean" targets [6].
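The ECG-referenced template subtraction mentioned for pulse artifacts can be sketched as below, assuming QRS markers have already been detected from a parallel ECG channel; the beat timing and pulse waveform are synthetic illustrations.

```python
import numpy as np

def subtract_pulse_template(eeg, qrs_samples, half_width):
    """Build a beat-averaged pulse-artifact template from windows around
    each QRS marker and subtract it time-locked at every beat."""
    cleaned = eeg.copy()
    windows = np.array([eeg[q - half_width:q + half_width] for q in qrs_samples])
    template = windows.mean(axis=0)
    for q in qrs_samples:
        cleaned[q - half_width:q + half_width] -= template
    return cleaned

fs = 250
rng = np.random.default_rng(4)
eeg = rng.standard_normal(10 * fs)          # background EEG, one channel
pulse = 5 * np.hanning(50)                  # stereotyped pulse waveform
beats = np.arange(fs, 9 * fs, fs)           # ~60 BPM markers from a parallel ECG
for b in beats:
    eeg[b - 25:b + 25] += pulse
cleaned = subtract_pulse_template(eeg, beats, 25)
```

Averaging across beats cancels uncorrelated neural activity, so the template approximates the stereotyped artifact; variable heart rate or a drifting pulse waveform would call for a sliding or weighted template instead.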

Architecture: raw EEG input feeds two parallel branches: a feature extraction branch (dual-scale CNN followed by EMA-1D attention) and a temporal modeling branch (a fully connected layer for dimensionality reduction followed by an LSTM). The branch outputs are fused to produce the reconstructed clean EEG.

Figure 2: Hybrid CNN-LSTM (CLEnet) Architecture

The Scientist's Toolkit: Research Reagents & Materials

Table 3: Essential Materials and Computational Tools for Artifact Research

| Item / Tool | Function / Application | Technical Notes |
| --- | --- | --- |
| High-Density EEG System (64+ channels) | Provides sufficient spatial sampling for source separation techniques like ICA [69]. | Essential for validating artifact topographies and source localization. |
| Active Electrodes (Ag/AgCl) | Improve signal quality and reduce motion-related cable artifacts [70]. | Reduce impedance, minimizing environmental interference. |
| Electrooculogram (EOG) Electrodes | Record horizontal and vertical eye movements as a reference for ocular artifact removal [69]. | Placed above/below the eye and lateral to the outer canthi. |
| Electrocardiogram (ECG) Electrode | Provides a reference signal for cardiac artifact removal [10]. | Typically placed on the chest or limbs. |
| Conductive Gel & Abrasive Skin Prep Gel | Ensure a stable, low-impedance connection between electrode and skin [70]. | Critical for signal quality and reducing baseline noise. |
| EEGLAB (MATLAB Toolbox) | Interactive environment for implementing ICA, regression, and other preprocessing pipelines [10]. | Widely used standard with a large user community and plugin ecosystem. |
| Python (MNE, TensorFlow, PyTorch) | Flexible programming environment for implementing custom deep learning models (e.g., GANs, CNN-LSTM) and signal processing [6] [20]. | Enables development and testing of novel algorithms. |
| Public Datasets (e.g., EEGdenoiseNet) | Benchmark datasets for training and validating artifact removal algorithms [20]. | Contain clean EEG and artifact signals for creating semi-synthetic data. |

Effectively handling muscle, pulse, and persistent ocular artifacts is a non-trivial challenge that is central to ensuring data integrity in EEG research. While traditional methods like regression and ICA remain useful in specific, controlled contexts, the field is rapidly advancing toward sophisticated, automated solutions. Deep learning approaches, particularly GANs and hybrid CNN-LSTM models, show significant promise in addressing the non-stationary and spectrally overlapping nature of these artifacts, even in multi-channel and real-world scenarios. For researchers in academia and drug development, adopting and refining these advanced methodologies is paramount for extracting robust and reliable neural signals from contaminated recordings, thereby strengthening the validity of neuroscientific findings and clinical conclusions.

Electroencephalography (EEG) records the brain's spontaneous electrical activity, but these neural signals (typically ranging from 0.5 to 100 μV) are exceptionally vulnerable to contamination by physiological artifacts—extraneous signals originating from non-cerebral sources within the body [1] [19]. These artifacts present a fundamental challenge to data integrity because they can mimic genuine neural activity, obscure true brain signals, and introduce spurious findings that compromise scientific validity and clinical interpretation [10] [22]. For instance, eye blinks may be misinterpreted as frontal epileptiform discharges, and muscle artifacts can mask beta and gamma frequency oscillations crucial for understanding cognitive processes [1] [22]. Effective management of these artifacts is therefore not merely a technical preprocessing step but a critical component of rigorous EEG research, particularly in drug development where accurate biomarker identification is essential.

Physiological artifacts are broadly categorized by their biological origin. The most prevalent include ocular artifacts from eye blinks and movements, myogenic artifacts from muscle activity, and cardiac artifacts from heart electrical activity and pulse pulsations [1] [10]. A particularly critical insight from recent research is that during concurrent brain stimulation and EEG recording (e.g., tDCS-EEG), these artifacts become "inherent"—they result from physical interactions between the applied current and the body and are therefore unavoidable regardless of equipment performance [72]. These stimulation-induced artifacts are especially problematic because they are high-dimensional, non-stationary, and overlap with neurogenic frequencies, making them resistant to conventional removal techniques [72]. This review provides an in-depth technical guide to the two primary paradigms for managing these contaminants: segment rejection and artifact correction, framing them within a comprehensive data quality control strategy.

Classification and Characteristics of Major Physiological Artifacts

Table 1: Major Physiological Artifacts in EEG Recordings

| Artifact Type | Biological Source | Typical Amplitude | Spectral Characteristics | Spatial Distribution |
| --- | --- | --- | --- | --- |
| Ocular Artifacts | Corneo-retinal dipole movement from blinks and saccades | 100–200 μV [10] | Delta/Theta bands (0.5–8 Hz) [19] | Frontal maxima (Fp1, Fp2); polarity varies with eye movement direction [22] |
| Muscle Artifacts (EMG) | Head, face, neck muscle contractions | Variable (depends on contraction force) | Broadband (20–300 Hz), dominates Beta/Gamma [1] [19] | Widespread, but particularly prominent in frontal and temporal regions [22] |
| Cardiac Artifacts | Electrical activity of the heart (ECG) or arterial pulsation | Low amplitude (varies with electrode placement) | ~1.2 Hz for pulse; broader for ECG [1] | Left hemisphere predominance (proximity to heart); pulse artifact can be focal [22] |
| Pulse Artifact | Vascular pulsation beneath electrodes | Variable | ~1.2 Hz [1] | Focal to electrodes overlying blood vessels [10] |
| Sweat Artifact | Electrolyte shifts from perspiration | Slow drifts | Very low frequency (<0.5 Hz) [22] | Diffuse, often bilateral [19] |
| Respiration Artifact | Chest/head movement during breathing | Slow oscillations | Delta band (0.1–0.3 Hz) [19] | Diffuse, varies with body position |

Understanding these artifact signatures is essential for selecting appropriate mitigation strategies. For example, the high-amplitude, low-frequency nature of ocular artifacts makes them particularly amenable to certain correction methods, while the broadband characteristics of muscle artifacts present distinct challenges [1] [22]. During concurrent tDCS-EEG, these physiological artifacts manifest as modulations of the scalp DC voltage that scale with applied current, creating dose-specific contamination that cannot be accounted for by conventional experimental controls [72].

Taxonomy: physiological artifacts divide into ocular, muscle, cardiac, and other artifacts. Ocular artifacts comprise eye blinks (frontal dominance, high amplitude, low frequency) and lateral eye movements (frontal dominance, phase reversal at F7/F8). Muscle artifacts comprise facial muscle activity (broadband noise in the beta/gamma range) and neck muscle activity (broadband noise, posterior distribution). Cardiac artifacts comprise the ECG artifact (rhythmic pattern, left hemisphere) and the pulse artifact (rhythmic pattern, focal distribution). Other artifacts include sweat and respiration, both producing slow drifts at very low frequencies.

Figure 1: Taxonomy of Physiological Artifacts and Their Characteristics

Artifact Rejection Approaches

Principles and Methodology

Artifact rejection operates on a simple principle: complete removal of data segments contaminated by artifacts, preserving only "clean" data for analysis. This approach is conceptually straightforward and ensures that no residual artifactual content remains in the analyzed data [73]. The most common implementation involves establishing amplitude thresholds (typically ±100 μV) and rejecting any epochs where voltage deflections exceed these limits in any channel [73] [56]. This method is particularly effective for large, infrequent artifacts such as gross head movements, electrode pops, or sudden muscle contractions that create extreme voltage deflections [22].

The primary advantage of rejection is certainty—by completely removing contaminated segments, researchers avoid the risk of introducing new artifacts or leaving residual contamination through imperfect correction [73]. This is particularly valuable when artifact morphology closely resembles neural signals of interest, creating potential for misinterpretation. However, this approach carries the significant disadvantage of data loss, which can substantially reduce statistical power, especially in populations with high artifact prevalence (e.g., patient groups, children) or in paradigms where artifacts are systematically related to experimental conditions [73]. Additionally, strict rejection criteria may create biased datasets if artifacts correlate with specific behaviors or cognitive states.

Experimental Protocols for Threshold-Based Rejection

Implementing effective artifact rejection requires systematic procedures:

  • Amplitude Thresholding: Establish voltage thresholds (e.g., ±100 μV) based on pilot data and the specific EEG components under investigation. More conservative thresholds (e.g., ±50 μV) may be necessary for components with low amplitude, while more liberal thresholds may be acceptable for robust, high-amplitude components [73] [56].

  • Gradient-Based Rejection: Implement additional criteria based on maximum voltage step between consecutive samples (e.g., >50 μV) to identify sudden jumps characteristic of movement artifacts or electrode pops [56].

  • Channel-Specific Criteria: Apply stricter thresholds to channels known to be particularly vulnerable to specific artifacts (e.g., frontal channels for ocular artifacts, temporal channels for muscle artifacts) [22].

  • Protocol Documentation: Clearly document all rejection criteria and procedures in study protocols to ensure consistency across sessions and researchers, particularly in large-scale or multi-site studies [56].
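Steps 1 and 2 of this protocol reduce to a few lines of array code. A minimal numpy sketch, with the ±100 μV amplitude and 50 μV gradient thresholds as illustrative defaults and synthetic violations inserted for demonstration:

```python
import numpy as np

def reject_epochs(epochs, amp_thresh=100.0, grad_thresh=50.0):
    """Flag epochs whose absolute amplitude or sample-to-sample gradient
    exceeds threshold in any channel; return retained epochs and the mask.

    epochs: (n_epochs, n_channels, n_samples) in microvolts."""
    amp_bad = np.abs(epochs).max(axis=(1, 2)) > amp_thresh
    grad_bad = np.abs(np.diff(epochs, axis=2)).max(axis=(1, 2)) > grad_thresh
    keep = ~(amp_bad | grad_bad)
    return epochs[keep], keep

rng = np.random.default_rng(5)
epochs = 5 * rng.standard_normal((6, 4, 500))    # mostly clean epochs
epochs[1, 0, 250] = 150.0                        # electrode pop: amplitude violation
epochs[4, 2, 100:102] = [0.0, 80.0]              # sudden jump: gradient violation
retained, keep = reject_epochs(epochs)
```

Channel-specific thresholds (step 3) follow the same pattern by broadcasting a per-channel threshold vector instead of a scalar.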

Artifact Correction Approaches

Principles and Methodology

Artifact correction aims to identify and remove artifactual components from contaminated data while preserving underlying neural signals. Rather than discarding data, correction techniques attempt to separate neural and artifactual components mathematically, subtract the artifactual elements, and retain the cleaned neural signals [1]. This approach is particularly valuable when artifacts are frequent, systematically related to experimental conditions, or when data retention is critical for statistical power.

The most widely used correction approach is Independent Component Analysis (ICA), a blind source separation technique that decomposes EEG data into statistically independent components [73] [1]. ICA operates on the principle that artifacts and neural signals originate from different sources and have distinct spatial, temporal, and spectral characteristics. Once separated, artifact-related components can be removed, and the remaining components can be reconstructed back into channel space [73]. Regression-based methods represent another correction approach, particularly for ocular artifacts, where EOG recordings are used as reference signals to estimate and remove artifact contributions from EEG channels [1]. However, regression approaches have limitations due to potential bidirectional contamination (where neural signals contaminate EOG references) and assumptions of linearity [1].

Experimental Protocols for ICA-Based Correction

A standardized protocol for ICA-based artifact correction includes:

  • Data Preparation: Band-pass filter data (typically 1–40 Hz) and segment into epochs. Some approaches recommend high-pass filtering up to 2 Hz before ICA to improve decomposition quality [74].

  • ICA Decomposition: Apply ICA algorithms (e.g., Extended Infomax, SOBI) to the preprocessed data to separate independent components. Different algorithms may yield comparable results for artifact removal [74].

  • Component Classification: Identify artifact-related components based on their temporal, spectral, and spatial characteristics [73]:

    • Ocular artifacts: Frontal topography, time-locked to blinks or saccades, high low-frequency power
    • Muscle artifacts: Broad spectral power, focal temporal topography, high high-frequency power
    • Cardiac artifacts: Periodic waveform matching ECG, left-sided topography
  • Component Removal and Reconstruction: Remove identified artifact components and project remaining components back to sensor space.

  • Validation: Compare data quality before and after correction using quantitative metrics (e.g., signal-to-noise ratio, standardized measurement error) [73].
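The removal-and-reconstruction step (step 4) is linear algebra once the decomposition exists. The sketch below assumes the mixing matrix is already known; a hand-picked 2x2 matrix stands in for an ICA estimate, and the two sources are synthetic illustrations. It shows how zeroing an artifact component and projecting back recovers only the neural contribution at each sensor:

```python
import numpy as np

def remove_components(data, mixing, bad_idx):
    """Project sensor data into component space, zero the artifact
    components, and reconstruct back to sensor space (ICA step 4)."""
    sources = np.linalg.pinv(mixing) @ data
    sources[bad_idx, :] = 0.0
    return mixing @ sources

n = 1000
neural = np.sin(2 * np.pi * 10 * np.arange(n) / 250)   # 10 Hz alpha source
ocular = 50.0 * (np.arange(n) % 400 < 40)              # periodic blink source
S = np.vstack([neural, ocular])
A = np.array([[1.0, 0.8],                              # hand-picked mixing matrix
              [0.6, 0.1]])                             # standing in for an ICA estimate
X = A @ S                                              # simulated sensor recordings
cleaned = remove_components(X, A, bad_idx=[1])         # drop the ocular component
```

In practice the quality of the result rests entirely on how well the estimated mixing matrix separates the sources, which is why component classification (step 3) and post-hoc validation (step 5) are essential.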

Comparative Analysis: Performance and Applications

Table 2: Comparative Analysis of Artifact Rejection vs. Correction Approaches

| Parameter | Artifact Rejection | Artifact Correction |
| --- | --- | --- |
| Primary Mechanism | Complete removal of contaminated epochs | Mathematical separation and removal of artifactual components from data |
| Data Preservation | Low (direct data loss) | High (preserves data continuity) |
| Residual Artifact Risk | None (if properly implemented) | Possible (incomplete separation or removal) |
| Best Applications | Large, infrequent artifacts; movement artifacts; electrode pops; studies with abundant trials [73] | Frequent artifacts (blinks, cardiac); small sample sizes; artifacts with stable topography [73] [1] |
| Limitations | Reduces trials available for analysis; may introduce bias if artifacts are condition-related [73] | May leave residual artifacts or remove neural signals; requires expertise for component identification [1] |
| Impact on SNR | Improves SNR by removing noisy trials but reduces trial count [73] | Can improve SNR without reducing trial count when successful [73] |
| Automation Potential | High (algorithmic thresholding) | Moderate to low (often requires manual component verification) |
| Computational Demand | Low | High (especially for ICA decomposition) |

Recent large-scale evaluations demonstrate that a combined approach often yields optimal results. Specifically, applying ICA-based correction for structured artifacts with stable topographies (e.g., ocular artifacts) followed by rejection of trials with remaining extreme values addresses both structured and unstructured artifacts effectively [73]. This hybrid approach has been shown to minimize artifact-related confounds while maintaining acceptable data retention rates across multiple ERP components (P3b, N400, N170, MMN, ERN) [73].

For multivariate pattern analysis (MVPA) and decoding approaches, evidence suggests that artifact correction may be sufficient without additional trial rejection. A comprehensive study found that while the combination of artifact correction and rejection did not significantly improve decoding performance in most cases, correction alone was recommended to minimize potential artifact-related confounds that might artificially inflate decoding accuracy [75].

Workflow: after EEG data acquisition and preprocessing, the artifact management decision weighs three questions: Are artifacts frequent? Are trials limited? Is the artifact topography stable? A "yes" answer favors ICA-based correction; a "no" answer favors threshold rejection. The analysis method also matters: ERP analyses typically use a combined approach, whereas MVPA/decoding relies on correction alone. All paths converge on clean EEG data.

Figure 2: Decision Framework for Artifact Management Strategies

Advanced Considerations and Research Reagents

The Scientist's Toolkit: Essential Research Reagents and Solutions

Table 3: Essential Tools for EEG Artifact Management

| Tool/Resource | Function | Application Notes |
| --- | --- | --- |
| Independent Component Analysis (ICA) | Blind source separation to isolate artifactual components | Most effective for ocular, cardiac, and some muscle artifacts; requires appropriate preprocessing [73] [1] |
| EEGLAB Toolbox | Interactive MATLAB toolbox for EEG processing | Provides ICA implementation and visualization tools for component review [10] |
| EOG/ECG Reference Channels | Record vertical/horizontal EOG and ECG for reference-based correction | Essential for regression methods; helpful for validating ICA components [1] [76] |
| Standardized Measurement Error (SME) | Metric for assessing data quality after processing | Directly relates to effect sizes and statistical power; useful for optimizing pipelines [73] |
| Automated Classification Algorithms | Machine learning approaches for component classification | Reduce manual labor in identifying artifact components; improve consistency [76] |
| High-Density EEG Systems | Increased spatial sampling for better source separation | Improve ICA performance and spatial localization of artifacts [72] |

Special Considerations for Combined tDCS-EEG Studies

Combining transcranial direct current stimulation (tDCS) with EEG introduces unique challenges for artifact management. During concurrent tDCS-EEG, physiological processes (cardiac, ocular) modulate body impedance, creating dynamic artifacts that scale with stimulation current and are inherently non-stationary [72]. These "inherent physiological artifacts" are particularly problematic because:

  • They are high-dimensional and overlap with neurogenic frequencies
  • They cannot be eliminated by equipment improvements
  • Conventional signal processing techniques (high-pass filtering, ICA) have limited effectiveness
  • Their dynamics vary with stimulation parameters (montage, polarity, current) [72]

In these challenging scenarios, advanced approaches such as Generalized Singular Value Decomposition (GSVD) may be necessary, though complete artifact removal may significantly degrade signal integrity [72]. For tDCS-EEG studies, careful experimental design with appropriate control conditions and computational modeling of current flow may be necessary to disambiguate true neural effects from stimulation-induced artifacts [72].

Effective management of physiological artifacts through segment rejection and correction approaches is fundamental to EEG data quality control. The choice between these strategies involves tradeoffs between data retention and contamination risk, with the optimal approach depending on artifact characteristics, experimental paradigm, and analysis goals. For most conventional ERP research, a combined approach—using ICA-based correction for structured artifacts with stable topographies followed by trial rejection for remaining extreme values—provides an effective balance [73]. For MVPA/decoding applications, evidence suggests that artifact correction alone may be sufficient [75]. In specialized applications such as tDCS-EEG, where artifacts are inherent and particularly challenging, advanced specialized approaches are necessary [72]. As EEG continues to play a crucial role in basic neuroscience and drug development, rigorous implementation of these artifact management strategies remains essential for generating valid, interpretable, and reproducible findings.

In electroencephalography (EEG) research, physiological artifacts—unwanted signals originating from non-neural biological sources—represent a fundamental challenge to data integrity. These contaminants, which include ocular, muscular, and cardiac activities, can obscure genuine neural signals and lead to spurious research findings if not properly addressed [19] [1]. The removal of these artifacts is particularly crucial within applied contexts such as drug development and clinical neuroscience, where accurate interpretation of brain activity informs critical decisions. Despite advanced artifact removal algorithms becoming increasingly accessible, their improper application persists, introducing significant errors that compromise study validity and reproducibility [77] [78]. This guide details the most common and impactful pitfalls in EEG artifact removal, provides evidence-based protocols for their mitigation, and outlines their potential consequences on data interpretation, thereby supporting the advancement of reliable physiological artifacts research.

Fundamental Concepts: Defining Physiological Artifacts

Physiological artifacts are signals recorded by EEG that do not originate from cerebral cortical activity [19]. Unlike non-physiological artifacts (e.g., power line interference, electrode pops), these contaminants arise from the subject's own body, making them inherently difficult to avoid completely. Their key characteristic is the substantial overlap in frequency and amplitude with neurogenic signals, rendering simple filtering approaches often ineffective and necessitating more sophisticated processing techniques [1].

Table 1: Major Types of Physiological Artifacts in EEG Research

Artifact Type Biological Source Spectral Characteristics Spatial Distribution Key Challenges for Removal
Ocular (EOG) Eye blinks and movements [1] Dominant in delta/theta bands (0.5–4 Hz, 4–8 Hz) [19] Primarily frontal electrodes (Fp1, Fp2) [19] High amplitude (100–200 µV); bidirectional interference with EEG [1]
Muscle (EMG) Facial, jaw, neck muscle contractions [1] Broadband, dominating beta/gamma (>13 Hz) [19] Widespread, often temporal regions [1] Extensive spectral overlap with neural signals [1]
Cardiac (ECG/Pulse) Heart electrical activity or pulsation [1] Overlaps multiple EEG bands [19] Central or neck-adjacent channels [19] Rhythmic, can be mistaken for neural oscillations [1]
Perspiration Sweat gland activity [19] Very low frequency (delta band) [19] Diffuse, often frontal Causes slow baseline drifts and impedance changes [19]
Respiration Chest/head movement during breathing [19] Low frequency (delta/theta) [19] Variable Synchronized with respiration rate [19]

Common Pitfalls and Their Impacts on Data Integrity

Pitfall 1: Misapplication of Blind Source Separation (BSS) Techniques

A frequent error in artifact removal is the inappropriate use of Blind Source Separation (BSS) methods, such as Independent Component Analysis (ICA), without regard for their underlying assumptions and limitations. ICA operates on the principle of separating statistically independent sources, an assumption that may be violated in low-density EEG systems or specific artifact types [5] [78]. This pitfall is exacerbated when researchers apply ICA as a universal solution without validating its suitability for their specific experimental setup.

Impact on Data: The misapplication of ICA can lead to two critical errors: (1) Incomplete Artifact Removal, where residual contaminations persist in the data, and (2) Over-Correction, where genuine neural signals are mistakenly identified as artifacts and removed [78]. This is particularly problematic in wearable EEG systems with limited channel counts (often below 16 channels), where spatial resolution is insufficient for effective ICA performance [5]. The consequence is a distorted representation of brain activity that can mimic or obscure genuine neurophysiological phenomena of interest.

Pitfall 2: Inadequate Handling of Non-Stationary and Complex Artifacts

Conventional artifact removal techniques often assume stationarity in the signal, a condition frequently violated in real-world EEG recordings. This is especially evident in two scenarios: movement-related artifacts in mobile EEG studies and physiological artifacts during concurrent brain stimulation (e.g., tDCS-EEG) [72] [53]. A critical error is treating these dynamic artifacts with static removal algorithms.

Impact on Data: Research has identified that inherent physiological artifacts during concurrent tDCS-EEG, specifically cardiac and ocular motor distortions, are non-stationary, high-dimensional, and scale with applied current [72] [53]. Applying conventional high-pass filtering or standard ICA to these artifacts fails because the contaminants overlap highly with neurogenic frequencies and are not spatially stationary [53]. The resulting data contains residual, stimulation-induced physiological noise that can be misinterpreted as neuromodulatory effects of stimulation, fundamentally compromising conclusions about intervention efficacy.

Pitfall 3: Over-reliance on Single-Method Approaches and Lack of Validation

Many studies persist in using a single artifact removal method in isolation, despite evidence that hybrid approaches consistently outperform singular techniques [1] [78]. This pitfall stems from a tendency toward methodological convenience rather than optimal signal processing. A related error is the failure to quantitatively validate the artifact removal process against ground-truth data or known standards.

Impact on Data: Single-method approaches are inherently limited because different artifacts have distinct spatial, temporal, and spectral characteristics [5]. For instance, while wavelet transforms may be effective for certain ocular artifacts, they might perform poorly for muscular artifacts that require different decomposition strategies [5]. Without rigorous validation using metrics like Signal-to-Noise Ratio (SNR), correlation coefficients (CC), or relative root mean square error (RRMSE) [20], researchers cannot quantify the performance of their chosen method, leading to uncontrolled and unmeasured signal distortion that introduces uncertainty in all downstream analyses.

Pitfall 4: Neglecting Reproducibility and Methodological Transparency

A profound yet common oversight is the failure to document artifact processing pipelines with sufficient detail to enable replication. This includes incomplete reporting of algorithm parameters, decision thresholds, and component rejection criteria [77]. This pitfall extends to the underutilization of auxiliary sensors (e.g., IMU, EOG, ECG) that could enhance artifact detection under ecological conditions [5].

Impact on Data: The lack of reproducibility documentation makes it impossible for other researchers to verify findings or build upon established work. Studies have shown that over 50% of variables required for reproducibility are inadequately documented in computational research [77]. This not only undermines the credibility of individual studies but also hinders field-wide progress by preventing meaningful comparison across methodologies and datasets. The impact is a literature filled with potentially significant findings that cannot be independently verified or reliably translated into practical applications.

Table 2: Quantitative Performance Metrics for Artifact Removal Algorithms

Algorithm Reported SNR Improvement Reported CC Values Reported RRMSE Reduction Best-Suited Artifact Types Key Limitations
CLEnet (Deep Learning) [20] 11.50 dB (mixed artifacts) 0.925 (mixed artifacts) RRMSE (temporal): 0.300, RRMSE (frequency): 0.319 EMG, EOG, Mixed, Unknown artifacts Requires large training datasets; computational intensity
ICA-based (SOBI) [78] Varies by study Varies by study Varies by study Ocular, Cardiac Requires sufficient channels; assumes statistical independence
ASR-based Pipelines [5] Not specified Not specified Not specified Ocular, Movement, Instrumental Parameters require careful tuning
Wavelet Transform [5] Not specified Not specified Not specified Ocular, Muscular Choice of mother wavelet and thresholds is critical
Regression Methods [1] Performance decreases without reference Not specified Not specified Ocular Requires reference channels; bidirectional contamination

Protocol for a Robust, Multi-Stage Artifact Removal Pipeline

A single-algorithm approach is insufficient for comprehensive artifact removal. The following multi-stage protocol, synthesized from current literature, provides a more robust framework:

  • Preprocessing and Initial Filtering: Apply a high-pass filter (e.g., 1 Hz cutoff) to remove slow drifts and a notch filter (e.g., 50/60 Hz) to eliminate line noise. This step addresses non-physiological artifacts that can interfere with subsequent analysis [1].

  • Multi-Method Artifact Identification: Implement a hybrid approach combining:

    • Blind Source Separation (BSS): Use ICA (e.g., SOBI or Extended InfoMax) [78] to decompose the signal. For low-density EEG (<16 channels), consider alternative BSS methods or use ICA with extreme caution [5].
    • Auxiliary Sensor Integration: Incorporate data from EOG, ECG, or IMU sensors to inform the identification of artifact components [5]. This is particularly valuable for distinguishing cardiac and motion artifacts.
    • Automated Detection: For specific artifact types like muscle activity, leverage validated deep learning models (e.g., CLEnet [20] or AnEEG [79]) that can extract morphological and temporal features to separate EEG from artifacts.
  • Targeted Component Rejection/Correction: Based on the hybrid identification in Step 2, proceed with component rejection. Utilize validated criteria such as spatial patterns, spectral characteristics, and correlation with auxiliary signals rather than relying solely on visual inspection.

  • Validation and Quality Control: Quantify the performance of the artifact removal process using standardized metrics. Calculate SNR, CC, and RRMSE [20] on a representative subset of data where ground truth can be approximated (e.g., using clean segments or semi-synthetic data). This step is non-negotiable for establishing processing reliability.
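The quality-control metrics in the final step have standard definitions and are simple to implement. Below is a minimal NumPy sketch; the function names and the semi-synthetic check are illustrative, not taken from a specific toolbox:

```python
import numpy as np

def snr_db(clean, denoised):
    """Signal-to-noise ratio in dB: ground-truth power over residual-error power."""
    noise = clean - denoised
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

def correlation_coefficient(clean, denoised):
    """Pearson correlation (CC) between ground truth and denoised output."""
    return np.corrcoef(clean, denoised)[0, 1]

def rrmse(clean, denoised):
    """Relative root mean square error (temporal domain)."""
    return np.sqrt(np.mean((clean - denoised) ** 2)) / np.sqrt(np.mean(clean ** 2))

# Semi-synthetic check: a 10 Hz "neural" signal with a small residual error,
# standing in for a clean segment and a pipeline output
rng = np.random.default_rng(0)
t = np.arange(0, 2, 1 / 250)                     # 2 s at 250 Hz
clean = np.sin(2 * np.pi * 10 * t)
denoised = clean + 0.05 * rng.standard_normal(t.size)

print(snr_db(clean, denoised), correlation_coefficient(clean, denoised), rrmse(clean, denoised))
```

Higher SNR and CC, and lower RRMSE, indicate better artifact removal with less distortion of the underlying signal.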

Protocol for Addressing Artifacts in Concurrent tDCS-EEG

The combination of tDCS with EEG introduces unique physiological artifacts that demand specialized handling [72] [53]. Standard processing pipelines are insufficient.

  • Pre-stimulation Baseline: Record a high-quality baseline EEG prior to stimulation onset. This helps characterize individual physiological patterns.
  • Comprehensive Physiological Monitoring: Simultaneously record EOG, ECG, and EMG throughout the experiment. This data is essential for identifying and modeling artifact dynamics specific to the stimulation context [53].
  • Advanced Modeling and Processing: Recognize that conventional high-pass filtering and ICA are inadequate. Employ techniques that account for the current-dose-specific, non-stationary nature of the artifacts. Spatial filtering techniques like Generalized Singular Value Decomposition (GSVD) may be considered, though with caution as they may degrade signal integrity [53].
  • Dose-Response Analysis: Analyze artifact magnitude as a function of stimulation parameters (current, montage). This can help distinguish stimulation-induced physiological modulations from true neurophysiological changes.
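As a toy illustration of such a dose-response check: fitting artifact magnitude against stimulation current reveals whether a signal feature scales with the stimulation dose. The current levels and artifact amplitudes below are invented for the example, not measured values:

```python
import numpy as np

# Hypothetical measurements: residual artifact RMS amplitude at each tDCS current.
# Values are illustrative only; real data would come from the monitored EOG/ECG.
currents = np.array([0.5, 1.0, 1.5, 2.0])          # stimulation current, mA
artifact_rms = np.array([12.0, 23.5, 36.2, 47.8])  # artifact RMS, microvolts

# Linear dose-response fit: artifact_rms ~ slope * current + intercept
slope, intercept = np.polyfit(currents, artifact_rms, 1)

# A slope far from zero indicates the signal feature scales with current,
# a hallmark of stimulation-induced contamination rather than a neural effect.
print(f"slope = {slope:.2f} uV/mA, intercept = {intercept:.2f} uV")
```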

Workflow: raw multi-channel EEG → high-pass filter (1 Hz cutoff) → notch filter (50/60 Hz) → multi-method artifact identification in parallel via blind source separation (ICA/SOBI), auxiliary sensor integration (EOG/ECG/IMU), and deep learning models (e.g., CLEnet, AnEEG) → component rejection/correction based on spatial/spectral traits → cleaned EEG → validation via SNR, CC, and RRMSE. The common pitfalls map onto specific stages: Pitfall 1 (using ICA with low-density EEG) at source separation, Pitfall 2 (applying static methods to non-stationary artifacts) at component rejection, Pitfall 3 (relying on a single-method approach) at identification, and Pitfall 4 (skipping quantitative validation metrics) at validation.

Diagram: A workflow for a robust, multi-stage artifact removal protocol, highlighting where common pitfalls typically occur in the process.

The Scientist's Toolkit: Essential Research Reagent Solutions

Table 3: Key Resources for Advanced Artifact Removal Research

Tool / Resource Category Primary Function Example Use Case
Public Datasets (e.g., EEGdenoiseNet) [20] Data Provides benchmark semi-synthetic & real EEG data with artifacts Training & validation of deep learning models; algorithm comparison
Independent Component Analysis (ICA) [78] Algorithm Blind source separation for isolating artifact components Ocular and cardiac artifact identification in research-grade EEG
ASR (Artifact Subspace Reconstruction) [5] Algorithm Statistical method for real-time artifact removal Online artifact correction in mobile EEG and BCI applications
CLEnet & AnEEG [20] [79] Deep Learning Model End-to-end artifact removal using CNN-LSTM & GAN architectures Handling unknown artifacts and multi-channel EEG correction
Auxiliary Sensors (EOG, ECG, IMU) [5] [53] Hardware Provides reference signals for physiological artifacts Ground-truth for artifact identification in concurrent tDCS-EEG

Effective management of physiological artifacts in EEG is not merely a technical preprocessing step but a fundamental determinant of data quality and research validity. The pitfalls detailed in this guide—including the misapplication of BSS techniques, inadequate handling of non-stationary artifacts, over-reliance on single methods, and neglect of reproducibility—represent significant sources of error that can systematically bias research outcomes. By adopting the recommended multi-stage protocols, leveraging hybrid methods that combine classical and deep learning approaches, and rigorously validating all processing steps with quantitative metrics, researchers can significantly enhance the reliability of their EEG data. The path toward robust artifact removal requires a shift from convenient, one-size-fits-all solutions to carefully validated, context-specific processing pipelines that acknowledge the complex nature of physiological contaminants. Through such rigorous approaches, the neuroscience community can advance more reproducible and trustworthy research on physiological artifacts in EEG signals.

Comparative Analysis of EEG Artifact Removal Methods: Performance, Validation, and Best Practices

Electroencephalography (EEG) provides direct, millisecond-resolution access to human neuronal activity, making it indispensable for clinical trials and neuroscience research [80]. However, the utility of EEG is often compromised by physiological artifacts—non-neural signals originating from the participant's body. These include artifacts from eye blinks and movements (ocular), muscle activity (electromyographic), cardiac activity (electrocardiographic), and sweat (galvanic skin response). Effective identification and removal of these artifacts is paramount, as residuals can distort neural signals, leading to flawed interpretations in both scientific and clinical contexts. This necessitates rigorous benchmarking of artifact removal efficacy using standardized metrics and protocols.

Evaluating how well an algorithm or pipeline removes these artifacts requires a framework that quantitatively assesses both the preservation of neural signals and the elimination of artifactual components. This guide details the key metrics, experimental protocols, and analytical tools essential for this benchmarking process, providing researchers with a standardized approach for rigorous method evaluation.

Core Quantitative Metrics for Removal Efficacy

The performance of an artifact removal pipeline is quantified through metrics that evaluate its impact on both the artifact and the underlying neural signal.

Table 1: Key Metrics for Evaluating Artifact Removal Efficacy

Metric Category Specific Metric Description Interpretation
Artifact Attenuation Average Event Duration [27] Measures the average duration of detected artifactual events remaining after processing. A lower score indicates more effective suppression of artifacts.
Framewise Displacement (FD) Correlation [81] Quantifies correlation between artifact topography presence and motion parameters (from fMRI or accelerometry). A strong correlation suggests residual motion-related artifacts.
Signal Fidelity Global Explained Variance (GEV) [81] Measures how well the cleaned signal's microstates explain the original data's variance. Higher GEV indicates better preservation of brain-generated signal topography.
Power Spectrum Deviation Compares spectral power in clean vs. artifact-removed data across frequency bands. Smaller deviations indicate better preservation of oscillatory neural content.
Task-Based Performance Evoked Potential Amplitude/Latency [80] Assesses changes in key features (e.g., P300) after processing. Preserved amplitudes and latencies indicate neural signal integrity.
Signal-to-Noise Ratio (SNR) Measures the ratio of task-related neural signal power to the power of the remaining noise. A higher SNR indicates a more successful isolation of the neural signal of interest.

Different artifact types and research goals necessitate a focus on specific metrics. For instance, in studies of resting-state EEG microstates, the appearance of a Vertical Topography (VT)—a topography with a straight line dividing positive and negative values from nasion to inion—has been strongly linked to motion artifacts. Its spatiotemporal characteristics and correlation with framewise displacement serve as a key benchmark for motion artifact removal [81]. Conversely, for event-related potentials (ERPs), the critical metrics are the amplitude and latency of components like the P300, which must be adequately captured in the cleaned data [80].
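Extracting those ERP features for benchmarking is straightforward. A small sketch with a synthetic Gaussian "P300"; the search window and waveform parameters are illustrative, not standard values:

```python
import numpy as np

def p300_peak(erp, times, window=(0.25, 0.5)):
    """Peak amplitude and latency of the largest positive deflection inside a
    search window -- the ERP features that must survive artifact removal."""
    mask = (times >= window[0]) & (times <= window[1])
    idx = np.argmax(erp[mask])
    return erp[mask][idx], times[mask][idx]

# Synthetic ERP: a Gaussian "P300" peaking at 320 ms with 8 uV amplitude
times = np.arange(-0.2, 0.8, 0.002)        # -200 to 800 ms at 500 Hz
erp = 8.0 * np.exp(-((times - 0.32) ** 2) / (2 * 0.04 ** 2))

amp, lat = p300_peak(erp, times)
print(f"P300 amplitude {amp:.2f} uV at {lat * 1000:.0f} ms")
```

Comparing these two numbers before and after cleaning, and across removal methods, quantifies whether the component of interest was preserved.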

Experimental Protocols for Benchmarking

A robust benchmark requires a structured experimental design that tests artifact removal methods under controlled and realistic conditions.

Data Acquisition and Ground Truth Establishment

The foundation of any benchmark is a high-quality dataset. Prof. Steve Luck's adage that "there is no substitute for clean data" underscores that all subsequent processing depends on initial recording quality [82]. Key steps include:

  • Pilot Testing: Conduct pilot sessions to verify all equipment, stimuli, and procedures are functioning correctly before full-scale data collection [82].
  • Multi-Modal Recording: Simultaneously record data from auxiliary sensors, such as electrooculography (EOG) for eye movements, electrocardiography (ECG) for cardiac activity, and electromyography (EMG) for muscle activity. These signals provide reference channels that are critical for both designing and validating artifact removal algorithms [82] [83].
  • Experimental Paradigms: Data should encompass a range of tasks relevant to the intended application. For a comprehensive benchmark, include:
    • Resting-state recordings (eyes open and closed).
    • Event-related potentials (ERPs) like the P300, which are common biomarkers in clinical trials [80].
    • Tasks that induce artifacts, such as instructed blinks, head movements, or jaw clenching, to challenge the removal pipeline.

The Rating-by-Detection Protocol

A principled offline evaluation protocol, termed "Rating-by-Detection," uses a detector to score the presence of artifacts in the corrected EEG without requiring a ground-truth neural signal. The core metric is the Average Event Duration of detected artifacts [27].

This protocol's workflow provides a standardized method for comparative evaluation.

Workflow: start with real EEG data → preprocessing and artifact removal → configure artifact detector → apply detector to the cleaned EEG → calculate Average Event Duration (AED) → compare AED scores across configurations.

Diagram 1: Rating by Detection Workflow

This method enables reliable comparisons between multiple artifact removal configurations by providing a single, quantitative score reflecting the cleaned data's quality [27].
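The AED score itself is simple to compute from a detector's output. A sketch, assuming `flags` is a boolean mask marking samples the detector labeled as artifactual at sampling rate `fs`:

```python
import numpy as np

def average_event_duration(flags, fs):
    """Average duration (s) of contiguous artifact events in a boolean mask.
    A lower AED after cleaning indicates more effective artifact suppression."""
    flags = np.asarray(flags, dtype=int)
    edges = np.diff(np.concatenate(([0], flags, [0])))  # event boundaries
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    if starts.size == 0:
        return 0.0                        # no events detected
    return float(np.mean(ends - starts)) / fs

# Detector output at 100 Hz: two events, 30 and 10 samples long
flags = np.zeros(200, dtype=bool)
flags[20:50] = True                       # 0.30 s event
flags[120:130] = True                     # 0.10 s event
print(average_event_duration(flags, fs=100))
```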

Benchmarking in Clinical and Real-World Contexts

When evaluating methods for use in clinical trials or real-world settings, additional practical factors must be measured:

  • Temporal Efficiency: Measure set-up and clean-up times. Dry-electrode systems, for example, can halve set-up time compared to standard EEG, significantly reducing site burden in trials [80].
  • Participant Comfort: Use structured questionnaires to track perceived comfort over time, as this impacts data quality and participant retention [80].
  • Generalization Gap: Evaluate performance across diverse populations and conditions. The EEG-FM-Bench framework highlights that models often fail to generalize to novel tasks, a critical consideration for artifact removal pipelines intended for broad use [84].

The Scientist's Toolkit: Essential Research Reagents and Materials

Successful experimentation in this field relies on a suite of software, hardware, and methodological "reagents."

Table 2: Essential Research Reagents for Artifact Removal Benchmarking

Tool Category Specific Tool / Method Function in Benchmarking
Software & Algorithms Independent Component Analysis (ICA) [81] [82] A blind source separation technique used to identify and remove artifactual components from EEG data.
EEG-Cleanse [83] A fully automated, modular pipeline for cleaning EEG recorded during full-body movement, combining motion-adaptive methods.
Wiener Filter [27] A configurable, generic artifact removal method often used to validate the reliability of new evaluation protocols.
Hardware & Sensors Dry-Electrode EEG Systems [80] EEG devices that speed up set-up and improve comfort; their performance must be benchmarked against standard wet EEG.
Auxiliary Biosensors (EOG, EMG, ECG) [82] Provide ground-truth signals for major physiological artifacts, enabling validation of removal accuracy.
MR-Compatible EEG Systems [81] Allow for investigation of artifacts specific to simultaneous EEG/fMRI acquisition.
Data & Frameworks EEG-FM-Bench [84] A comprehensive benchmark suite with standardized datasets and protocols for evaluating EEG foundation models, including their robustness to artifacts.
Phantom Head Measurements [81] Used to isolate and study non-physiological artifacts (e.g., from cap movement) in a controlled environment.

Visualization and Qualitative Analysis

Quantitative metrics should be supplemented with qualitative visualization to build a complete picture of a method's performance and potential failure modes.

  • Topographical Maps: Visual inspection of topographies, such as identifying the non-physiological Vertical Topography (VT), is crucial. VT's presence can distort the shape and dynamics of other microstate topographies [81].
  • Representation Visualization: Techniques like t-SNE and Integrated Gradients can be used to qualitatively analyze the feature space learned by models, helping to identify if artifacts have been effectively separated from neural signals in a latent representation [84].
  • ERP Waveforms: Overlaying ERP waveforms before and after processing, and across different methods, allows researchers to visually assess the preservation of key components like the P300 and the attenuation of noise [80] [85].

The following workflow integrates these qualitative checks with quantitative scoring.

Workflow: input EEG data → apply cleaning method → in parallel, calculate quantitative metrics (AED, GEV, SNR) and generate qualitative visualizations (topoplots, t-SNE, ERPs) → compare scores and visual outputs → final performance ranking.

Diagram 2: Integrated Evaluation Workflow

Benchmarking the efficacy of physiological artifact removal is a multi-faceted process that extends beyond a single metric. A comprehensive evaluation must integrate quantitative scores like Average Event Duration, qualitative visual assessments of topographies and waveforms, and practical measures of speed and comfort. As EEG foundation models and automated pipelines like EEG-cleanse become more prevalent, standardized benchmarks such as EEG-FM-Bench will be critical for ensuring these tools perform reliably and robustly across the diverse contexts of modern neuroscience and clinical neurology. By adopting the structured metrics and protocols outlined in this guide, researchers can systematically advance the field, ensuring that EEG data supports valid and impactful scientific conclusions.

Electroencephalography (EEG) is a fundamental tool in neuroscience research and clinical diagnostics, prized for its non-invasive nature and millisecond-scale temporal resolution. However, a central challenge in EEG analysis stems from the vulnerability of these microvolt-level signals to contamination by physiological artifacts—unwanted signals originating from the patient's own body rather than cerebral activity [19]. These artifacts can profoundly distort data interpretation, potentially leading to inaccurate conclusions in both basic research and applied settings such as pharmaceutical efficacy studies. The most prevalent and disruptive physiological artifacts include those from ocular activity (eye blinks and movements), muscle activity (from jaw, face, and neck muscles), and cardiac activity (heartbeat signals) [69] [19].

The core problem is that these artifacts often exhibit spectral and temporal overlap with genuine neural signals of interest. Ocular artifacts, dominated by low-frequency content in the 3–15 Hz range, obscure informative EEG features in the theta and alpha bands [69]. Muscle artifacts present as broadband noise that can mask higher-frequency beta and gamma oscillations crucial for understanding cognitive processes [19]. With amplitudes that can reach hundreds of microvolts—an order of magnitude larger than background EEG—these artifacts can easily swamp genuine neural signals, making robust artifact removal a prerequisite for reliable analysis [33].
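This overlap is easy to demonstrate numerically: a blink-like transient, though "slow," spreads its energy across the same low-frequency bands as ongoing theta and alpha activity. A sketch using SciPy's Welch estimator, with purely synthetic signals chosen for illustration:

```python
import numpy as np
from scipy.signal import welch

fs = 250
t = np.arange(0, 10, 1 / fs)
alpha = np.sin(2 * np.pi * 10 * t)                      # 10 Hz alpha rhythm
blink = np.exp(-((t % 2 - 1) ** 2) / (2 * 0.05 ** 2))   # blink-like transient every 2 s
mixed = alpha + 5 * blink                               # artifact much larger than EEG

def bandpower(x, lo, hi):
    """Integrated Welch power in the [lo, hi] Hz band."""
    f, pxx = welch(x, fs=fs, nperseg=1024)
    band = (f >= lo) & (f <= hi)
    return pxx[band].sum() * (f[1] - f[0])

# The blink's energy leaks across the delta/theta range and up toward alpha,
# which is why a simple low-cut filter cannot remove it without harming EEG.
print(bandpower(mixed, 0.5, 8), bandpower(alpha, 0.5, 8))
```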

Within this context, researchers have developed numerous algorithmic approaches to purify EEG data. This whitepaper provides a comprehensive comparative analysis of three foundational families of techniques: Independent Component Analysis (ICA), Regression-based methods, and emerging Deep Learning (DL) algorithms. We evaluate their underlying principles, practical implementation, efficacy against different artifact types, and suitability for various research scenarios.

Methodological Foundations and Comparative Performance

Independent Component Analysis (ICA)

Principles and Workflow: ICA is a blind source separation technique that decomposes multi-channel EEG recordings into statistically independent components [33]. The fundamental assumption is that the recorded EEG data matrix (X) represents a linear mixture of underlying independent sources (S), such that X = A×S, where A is the mixing matrix. The algorithm solves for an unmixing matrix W that maximizes the statistical independence of the output components, yielding S = W×X [33] [59]. The subsequent crucial step is component classification, where an expert researcher or automated algorithm identifies components corresponding to artifacts based on their temporal, spectral, and topographic characteristics [86]. Finally, signal reconstruction occurs by projecting only the brain-related components back to the sensor space, effectively excluding the artifactual contributions.

Experimental Protocol for ICA:

  • Data Preparation: Apply a high-pass filter (e.g., 1 Hz cutoff) to remove slow drifts. Bad channels should be removed or interpolated.
  • Dimensionality Reduction: Techniques like Principal Component Analysis (PCA) may be applied to reduce computational complexity.
  • Algorithm Selection and Execution: Choose an ICA algorithm (e.g., Adaptive Mixture ICA (AMICA) [59], Infomax) and apply it to the data.
  • Component Classification: Visually inspect components for archetypal artifact signatures:
    • Ocular: Large, low-frequency deflections maximal at frontal sites.
    • Muscle: High-frequency, broadband activity with a focal scalp topography.
    • Cardiac: Regular, pulsatile waveforms time-locked to the QRS complex.
  • Signal Reconstruction: Create a cleaned dataset by back-projecting all components except those identified as artifacts.
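The decompose–classify–reconstruct steps above can be sketched with scikit-learn's FastICA on synthetic two-channel data. Everything here is illustrative (the sources, mixing matrix, and correlation-based "classification" are stand-ins, not the procedure of any cited study); in real EEG work the classification step uses topography, spectrum, and expert judgment as described above.

```python
# Illustrative ICA decomposition and artifact rejection on synthetic data.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)

# Two synthetic sources: a "neural" oscillation and a blink-like artifact.
neural = np.sin(2 * np.pi * 10 * t)                 # 10 Hz alpha-like rhythm
blink = np.exp(-((t % 2) - 1) ** 2 / 0.005)         # periodic blink-shaped bursts
S = np.c_[neural, blink]

A = np.array([[1.0, 0.3],    # mixing matrix: channel 1 weights
              [0.8, 1.5]])   # channel 2 is strongly blink-contaminated
X = S @ A.T + 0.01 * rng.standard_normal((len(t), 2))   # X = A×S (sample-major)

ica = FastICA(n_components=2, random_state=0)
components = ica.fit_transform(X)                   # estimated sources S = W×X

# "Classify" the artifact component by correlation with the known blink
# source, zero it out, and back-project only the remaining component.
corr = [abs(np.corrcoef(components[:, i], blink)[0, 1]) for i in range(2)]
components[:, int(np.argmax(corr))] = 0.0
X_clean = ica.inverse_transform(components)

# The heavily contaminated channel should now track the neural source.
print(abs(np.corrcoef(X_clean[:, 1], neural)[0, 1]))
```

Note the inherent sign and scale ambiguity of ICA: components are only recovered up to scaling, which is why classification relies on correlations and topographies rather than raw amplitudes.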

[Workflow diagram] Multi-channel Raw EEG → Data Preparation (Filtering, Bad Channel Removal) → ICA Decomposition (X = A × S) → Component Classification → Signal Reconstruction (Exclude Artifact Components) → Artifact-Reduced EEG

ICA-based Artifact Removal Workflow

Regression-Based Methods

Principles and Workflow: Regression-based techniques operate on the principle of subtracting a scaled template of the artifact from the contaminated EEG signal [69]. These methods assume a linear and time-invariant relationship where the raw signal is the sum of true brain activity and the artifact, expressed as RawEEG(n) = EEG(n) + artifacts(n) [69]. The critical requirement is an artifact reference signal, which can be a dedicated Electrooculography (EOG) channel or an EEG channel most strongly affected by the artifact (e.g., Fp1 for blinks) [69]. A calibration phase is used to estimate regression coefficients (β) that define the magnitude of the artifact's influence on each EEG channel. Finally, these coefficients are applied to scale the reference signal, which is then subtracted from each EEG channel in the correction phase.

Experimental Protocol for Regression:

  • Reference Signal Acquisition: Record a simultaneous EOG signal or identify a frontal EEG channel that robustly captures the artifact.
  • Calibration Phase: During the experiment or a dedicated calibration run, collect data containing known artifacts. For each EEG channel i, compute the weight β_i that minimizes the difference between the recorded signal and the scaled reference artifact.
  • Correction Phase: Apply the correction throughout the data: Clean_EEG_i(n) = Raw_EEG_i(n) - β_i × Reference_Artifact(n).
  • Validation: Inspect the corrected data to ensure artifact reduction without over-subtraction of neural signals.
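The calibration and correction phases can be sketched in a few lines of NumPy. The data and the propagation vector `true_beta` are synthetic and purely illustrative; the least-squares estimate below is the single-reference form of the regression described above.

```python
# Regression correction sketch: calibration estimates beta_i per channel by
# least squares against the EOG reference; correction subtracts the scaled
# reference from each channel.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
brain = rng.standard_normal((n, 3))          # "neural" signal on 3 EEG channels
eog = 50 * rng.standard_normal(n)            # large-amplitude ocular reference
true_beta = np.array([0.9, 0.4, 0.1])        # hypothetical propagation weights
raw = brain + np.outer(eog, true_beta)       # RawEEG(n) = EEG(n) + artifacts(n)

# Calibration phase: beta_i = <raw_i, eog> / <eog, eog> (least squares)
beta = raw.T @ eog / (eog @ eog)

# Correction phase: Clean_EEG_i(n) = Raw_EEG_i(n) - beta_i × Reference(n)
clean = raw - np.outer(eog, beta)
print(np.round(beta, 2))                     # recovered propagation weights
```

If the reference itself contained neural activity, this same subtraction would remove that activity from every channel — the over-subtraction risk noted above.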

Deep Learning Approaches

Principles and Workflow: Deep learning models represent a paradigm shift, learning a complex, non-linear mapping function f_θ that transforms a noisy input signal directly into a clean output: f_θ(y) ≈ x, where y is the noisy EEG and x is the clean target [87]. These are end-to-end models trained in a supervised manner, typically using a loss function like Mean Squared Error (MSE) between the model's output and a ground-truth clean signal [87]. The architecture variety is extensive, including Convolutional Neural Networks (CNNs) that extract spatial-temporal features, Long Short-Term Memory (LSTM) networks that model long-range temporal dependencies, Generative Adversarial Networks (GANs) where a generator creates denoised signals and a discriminator critiques them, and hybrid models like CLEnet that combine CNNs and LSTMs to capture both morphological and temporal features [6] [20].

Experimental Protocol for Deep Learning:

  • Data Preparation and Standardization: Resample signals to a uniform rate, apply bandpass filtering, and normalize amplitude across channels [35].
  • Dataset Creation: For supervised learning, create a dataset of paired data: (noisy EEG input, clean EEG target). This often requires semi-synthetic data, where clean EEG is artificially contaminated with known artifacts, or the use of expertly cleaned data as the target [48] [6] [20].
  • Model Selection and Training: Choose an appropriate architecture (e.g., CNN, GAN, CLEnet). Train the model by iteratively presenting input-target pairs and using an optimizer (e.g., Adam) to minimize the loss function.
  • Inference: Apply the trained model to new, unseen noisy EEG data to generate the cleaned output.
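As a toy illustration of this supervised setup, the sketch below trains a linear FIR "model" by gradient descent on an MSE loss over paired (noisy, clean) samples. The linear map is a deliberate simplification standing in for a CNN/LSTM, but the training loop — forward pass, loss, gradient update — has the same structure as the deep learning protocol above.

```python
# Toy end-to-end denoiser: learn parameters theta of a mapping f_theta(y) ≈ x
# by minimizing MSE between model output and clean targets.
import numpy as np

rng = np.random.default_rng(2)
n, taps = 4000, 9
clean = np.convolve(rng.standard_normal(n + 50), np.ones(10) / 10, "same")[:n]
noisy = clean + 0.5 * rng.standard_normal(n)     # paired (noisy, clean) data

# Lagged design matrix: each output sample sees a window of noisy samples.
Y = np.stack([np.roll(noisy, k) for k in range(-(taps // 2), taps // 2 + 1)],
             axis=1)

theta, lr = np.zeros(taps), 0.1
for _ in range(500):                             # gradient descent on MSE loss
    pred = Y @ theta
    theta -= lr * 2 * Y.T @ (pred - clean) / n

mse_before = np.mean((noisy - clean) ** 2)
mse_after = np.mean((Y @ theta - clean) ** 2)
print(mse_before, mse_after)                     # denoised MSE should be lower
```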

[Workflow diagram] Noisy EEG Input → Deep Learning Model (Non-linear Function f_θ) → Cleaned EEG Output; during training, the output is compared against the Clean Target EEG via a loss (e.g., MSE), and model parameters are updated via backpropagation.

Deep Learning Training and Inference Process

Quantitative Performance Comparison

Table 1: Comparative Performance of Artifact Removal Methods Across Different Artifact Types

| Artifact Type | Method | Key Performance Metrics | Advantages | Limitations |
| --- | --- | --- | --- | --- |
| Ocular (EOG) | Regression | Similar performance to ICA for time-domain correction [69] | Simple, computationally efficient [69] | Requires reference channel; risks over-subtraction [69] |
| Ocular (EOG) | ICA | Considered a top-performing approach for high-density EEG [69] [59] | No reference needed; separates neural & artifactual sources [33] | Requires many channels (>40 ideal); manual component inspection [69] [59] |
| Ocular (EOG) | Deep Learning | CLEnet: CC = 0.925, RRMSEt = 0.300 (mixed artifacts) [20] | End-to-end; no manual intervention; preserves signal [20] | Requires large, labeled datasets for training [87] |
| Muscle (EMG) | ICA | Effective, but performance decreases with low channel counts [69] | Can separate focal EMG artifacts from neural signals [19] | Muscle ICs can be numerous and hard to classify completely [19] |
| Muscle (EMG) | Deep Learning | NovelCNN/CLEnet excel at EMG removal (SNR: 11.498 dB for mixed) [20] | Superior at handling broadband, overlapping noise [20] [87] | Model performance is artifact-specific (e.g., NovelCNN for EMG) [20] |
| Cardiac (ECG) | ICA | Can identify and remove periodic cardiac components [19] | Effective if ECG is statistically independent from EEG | May not fully remove pulse artifact due to its non-neural origin |
| Cardiac (ECG) | Deep Learning | CLEnet: 5.13% SNR increase, 8.08% RRMSEt decrease vs. DuoCL [20] | Learns complex patterns without strict statistical assumptions | Limited published results specifically for ECG removal |
| Mixed/Unknown | ICA | Quality degrades with increased participant movement [59] | Robust for lab data; AMICA algorithm is particularly powerful [59] | Decomposition quality drops in highly mobile settings [59] |
| Mixed/Unknown | Deep Learning | CLEnet: 2.45% SNR, 2.65% CC improvement in multi-channel tasks [20] | Generalizes to remove unknown artifacts in multi-channel data [20] | Computationally complex; "black box" nature reduces interpretability [87] |

Abbreviations: CC (Correlation Coefficient), RRMSEt (Relative Root Mean Square Error in temporal domain), SNR (Signal-to-Noise Ratio).

Table 2: Essential Resources for EEG Artifact Removal Research

| Resource Category | Specific Tool / Algorithm | Primary Function in Research |
| --- | --- | --- |
| Software & Libraries | EEGLAB (with AMICA plugin) [59] | Provides a complete environment for running ICA and other preprocessing steps, including the powerful AMICA algorithm. |
| Software & Libraries | RELAX Pipeline [86] | An EEGLAB plugin implementing targeted artifact reduction to minimize false positives and source localization biases. |
| Software & Libraries | MNE-Python [33] | A Python package for EEG/MEG data analysis, featuring implementations of ICA, filtering, and other preprocessing tools. |
| Benchmark Datasets | EEGdenoiseNet [20] | A semi-synthetic benchmark dataset with clean EEG, EOG, and EMG signals, essential for training and evaluating DL models. |
| Benchmark Datasets | Temple University Hospital (TUH) EEG Corpus [35] | A large-scale clinical EEG dataset with expert artifact annotations, used for developing and validating detection algorithms. |
| Deep Learning Models | CLEnet [20] | A hybrid CNN-LSTM model with an attention mechanism for removing various artifacts from multi-channel EEG. |
| Deep Learning Models | AnEEG (GAN with LSTM) [6] | A generative model for producing artifact-free EEG signals. |
| Deep Learning Models | Complex CNN / M4 Network [48] | DL architectures benchmarked for removing tES artifacts, showing performance dependent on stimulation type. |

The comparative analysis reveals that the optimal choice for artifact removal is not universal but is dictated by the specific research context. ICA remains the gold standard for well-controlled laboratory studies with high-density EEG systems, particularly for ocular and cardiac artifacts, offering a robust balance of performance and interpretability [69] [33] [59]. Regression-based methods provide a simple, computationally efficient solution when a clean artifact reference is available, though they carry the risk of removing neural signals along with artifacts [69]. Deep Learning approaches represent the frontier of artifact removal, demonstrating superior performance in handling complex artifacts like EMG and in challenging scenarios such as mobile EEG, at the cost of computational complexity and reduced interpretability [20] [87].

Future advancements are likely to focus on hybrid methodologies that leverage the strengths of multiple approaches. These may include DL models that automate the classification of ICA components or architectures specifically designed for real-time, low-latency processing in clinical monitoring and brain-computer interfaces. Furthermore, the development of standardized benchmarking datasets and a greater emphasis on model interpretability will be critical for the translation of these advanced methods from research labs into routine clinical and pharmaceutical applications.

Electroencephalography (EEG) records the brain's spontaneous electrical activity, representing postsynaptic potentials of pyramidal neurons with high temporal resolution. [88] However, EEG signals are highly susceptible to contamination from undesired sources, broadly categorized as physiological artifacts (originating from the subject's own body) and non-physiological artifacts (from external sources). [10] [89] Physiological artifacts include cardiac activity, eye movements/blinks, muscle activity (EMG), glossokinetic signals, and respiratory movements. [10] [89] Non-physiological artifacts can arise from monitoring devices, infusion pumps, or environmental electrical equipment. [89]

Simultaneous EEG recording during transcranial electrical stimulation (tES) presents a unique challenge. The stimulation currents introduce massive stimulation artifacts that can dominate the EEG trace, obscuring the underlying neural signals. [90] During transcranial Alternating Current Stimulation (tACS), for instance, the gross artifact manifests as a large sinusoidal signal at the stimulation frequency, often with a Signal-to-Noise Ratio (SNR) as low as -33 dB for 1 mA stimulation. [90] These artifacts are problematic because they occur within the same frequency band (5-40 Hz) as many endogenous brain rhythms of interest, making simple filtering ineffective. [90] This technical guide details the nature of these artifacts and provides methodologies for their effective removal, a critical capability for developing closed-loop neuromodulation systems. [90] [91]

Physiological and Stimulation Artifacts in EEG

Characterizing Physiological Artifacts

A proper understanding of artifact removal begins with recognizing common physiological contaminants.

  • Ocular Artifacts: Eye blinks and movements produce large electrical potentials due to the cornea-retina dipole. Blinks typically cause slow, large-amplitude deflections (hundreds of microvolts) maximal in frontal electrodes, while lateral eye movements create positive-negative waveforms at F7 and F8. [10]
  • Muscle Artifacts (EMG): Contractions of head, face, or neck muscles generate high-frequency, low-amplitude activity that can propagate via volume conduction, contaminating most EEG channels. [10]
  • Cardiac Artifacts: The heart's electrical activity (ECG) can appear in EEG recordings, often most prominent in electrodes on the left side of the scalp. A related pulse artifact can occur when an electrode is placed over a pulsating blood vessel. [10]

Table 1: Common Physiological Artifacts in EEG Recordings

| Artifact Type | Typical Manifestation in EEG | Primary Source |
| --- | --- | --- |
| Ocular (Blinks) | High-amplitude, low-frequency waves in frontal leads | Cornea-retina dipole, Bell's phenomenon [10] |
| Muscle (EMG) | High-frequency, low-amplitude fast activity | Head, face, neck muscle contraction [10] |
| Cardiac (ECG) | Periodic QRS-like complexes, left-side prominence | Electrical activity of the heart muscle [10] |
| Glossokinetic | Low-frequency potential shifts | Tongue movement creating electrical field [89] |
| Respiratory | Slow, rhythmic baseline oscillations | Chest movement altering electrical properties [89] |

The Nature of tES Stimulation Artifacts

Transcranial electrical stimulation introduces distinct artifacts that differ between modalities.

  • tDCS Artifacts: During transcranial Direct Current Stimulation, artifacts typically present as low-frequency noise. [90]
  • tACS Artifacts: The gross tACS artifact is a large sinusoidal signal at the stimulation frequency. However, it is not a pure sinusoid; it often contains amplitude modulations (e.g., a ~100 µV ripple) due to impedance changes from factors like blood circulation, electrode drying, or muscle movements. [90] The stimulator itself, while maintaining constant current, can also introduce non-linear artifacts. [90]
  • The Challenge of Overlap: A primary difficulty is that tACS is typically applied at frequencies overlapping endogenous EEG rhythms (5-40 Hz). This means a simple notch filter at the stimulation frequency would remove a substantial portion of the neural signal of interest. [90]

Methodologies for Artifact Removal

Effective artifact removal requires a combination of hardware solutions, signal processing techniques, and experimental design. The following workflow outlines a general approach for recovering neural signals from artifact-contaminated EEG data during tES.

[Workflow diagram] Raw EEG + Stimulation Artifact → Preprocessing & Referencing → Apply Artifact Removal Algorithm (options: Superposition of Moving Averages (SMA), Adaptive Filtering (AF), Independent Component Analysis (ICA)) → Validate Cleaned Signal → Analyzed Neural Signal

Standard Preprocessing for Physiological Artifacts

Before addressing stimulation-specific artifacts, standard EEG preprocessing is crucial.

  • Filtering: Band-pass filtering (e.g., 0.5-100 Hz) and notch filtering at line noise frequency (e.g., 50/60 Hz) are typical first steps. [92]
  • Ocular Artifact Removal: Techniques include regression in the time domain and Blind Source Separation methods like Independent Component Analysis (ICA), which can identify and remove components correlated with blinks and eye movements. [10]
  • Muscle Artifact Removal: ICA can also be effective for muscle artifacts. Other approaches include filtering, linear regression, source decomposition, and even neural networks. [10]

Core Algorithms for tES Artifact Removal

Two advanced algorithms have shown significant promise for removing the gross tACS artifact.

Superposition of Moving Averages (SMA)

The SMA method is a low-complexity, channel-count independent technique. [90]

  • Principle: It uses the collected EEG data to build a time-localized template of the current artifact and subtracts this from the data. [90]
  • Procedure: The EEG data for each channel is split into non-overlapping segments whose length matches the period of the stimulation frequency. An artifact template is created by averaging the data from a moving window of segments (e.g., the previous N segments). This dynamic template is then subtracted from the current data segment. [90]
  • Advantages: Low computational cost, does not require a high channel count, and is time-localized to adapt to changing artifact profiles. [90]
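The segment-averaging procedure can be sketched in NumPy under idealized assumptions: a perfectly periodic artifact, a stimulation period that is an integer number of samples, and illustrative amplitudes (these values are not from the cited work).

```python
# Idealized SMA sketch: build a moving artifact template from the previous
# n_avg stimulation-period segments and subtract it from the current segment.
import numpy as np

fs, f_stim, n_avg = 1000, 10, 20            # Hz, Hz, segments in moving window
period = fs // f_stim                        # samples per stimulation period
t = np.arange(60 * period) / fs

neural = 5 * np.sin(2 * np.pi * 23 * t)              # small endogenous rhythm
artifact = 1000 * np.sin(2 * np.pi * f_stim * t)     # gross tACS artifact
raw = (neural + artifact).reshape(-1, period)        # non-overlapping segments

clean = raw.copy()
for i in range(n_avg, raw.shape[0]):
    template = raw[i - n_avg:i].mean(axis=0)  # time-localized artifact template
    clean[i] = raw[i] - template

# Once the moving window is full, the residual should track the neural rhythm.
clean_tail = clean[n_avg:].ravel()
neural_tail = neural.reshape(-1, period)[n_avg:].ravel()
print(np.corrcoef(clean_tail, neural_tail)[0, 1])
```

Because the template is an average over recent segments, slow amplitude modulations of the artifact can be tracked, while neural activity that is not phase-locked to the stimulation period averages out of the template.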

Adaptive Filtering (AF)

The Adaptive Filter technique is a powerful parametric approach.

  • Principle: It uses a known reference signal—the injected stimulation current—to model and subtract the artifact. This is similar to noise-canceling headphones. [90]
  • Procedure: The known stimulation waveform is used as the reference input to an adaptive filter (e.g., a Recursive Least Squares filter). The filter continuously adjusts its weights to minimize the difference between its output and the artifact-contaminated EEG signal. The output of the filter is then a best-estimate of the artifact, which can be subtracted from the original signal. [90]
  • Advantages: Highly effective at tracking and removing non-stationary artifacts and suitable for real-time, closed-loop applications. [90]

Table 2: Comparison of tACS Artifact Removal Algorithms

| Feature | Superposition of Moving Averages (SMA) | Adaptive Filtering (AF) |
| --- | --- | --- |
| Core Principle | Time-localized template subtraction via segment averaging [90] | Parametric subtraction using known reference signal [90] |
| Computational Load | Low [90] | Higher |
| Channel Count Dependence | Independent; works with low channel counts [90] | Processes each channel |
| Suitability for Real-Time | Good | Excellent [90] |
| Key Requirement | Data to build a moving template | Accurate recording of the stimulation waveform [90] |

Validation of Artifact Removal

Robust validation is essential. A multi-stage strategy is recommended over relying on a single metric. [90]

  • Head Phantom Testing: Provides a controlled environment to analyze performance without neural signals. [90]
  • Detection of Biological Signals: The cleaned signal should allow for the detection of known EEG phenomena, such as alpha activity in the occipital lobe when eyes are closed, or event-related potentials (ERPs). [90]
  • Comparison of Descriptive Statistics: Basic statistics (mean, variance, kurtosis) of the cleaned EEG should be comparable to those of clean, resting EEG. [90]

Experimental Protocols and The Scientist's Toolkit

Example Experimental Protocol for tACS+EEG

The following protocol, based on current research, outlines a method for studying tACS effects with simultaneous EEG. [91]

  • Participant Preparation: Fit the participant with a high-density EEG cap (e.g., 64 electrodes according to the 10-10 system). Use Ag/AgCl electrodes and apply conductive gel to ensure good impedance (< 10 kΩ). The reference electrode is typically placed on the tip of the nose. [91]
  • Stimulation Setup: Place tACS electrodes according to the experimental montage. For a study targeting the central executive network (CEN) and default mode network (DMN), electrodes might be positioned over frontal and parietal sites. [91]
  • Stimulation Parameters: Apply cross-frequency coupled tACS (CFC-tACS). An example is theta/alpha-gamma phase-amplitude coupled stimulation. Parameters might include a 6-second stimulation duration during a cognitive task, with phase-lag conditions (e.g., 45° vs 180°) between networks. [91]
  • Data Acquisition: Record EEG continuously at a sampling rate of 500 Hz or higher. Ensure the amplifier can handle the dynamic range of the stimulation artifact without saturation. [91]
  • Online Artifact Removal: In a closed-loop design, implement a real-time capable algorithm (like AF or SMA) to remove the gross stimulation artifact from the EEG stream. [90]
  • Offline Processing:
    • Apply the chosen artifact removal algorithm (SMA or AF) to the raw data.
    • Perform standard EEG preprocessing on the cleaned data: band-pass filtering (e.g., 4-50 Hz), bad channel removal, and re-referencing. [91]
    • Apply ICA to remove any residual physiological artifacts (ocular, cardiac).
    • Epoch the data relative to task events and perform time-frequency or ERP analysis.
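The band-pass filtering step of the offline pipeline can be sketched with SciPy's Butterworth design and zero-phase filtering. The 4-50 Hz band and 500 Hz sampling rate follow the protocol above; the test trace is synthetic, with one in-band and one out-of-band component.

```python
# Band-pass step of the offline pipeline: zero-phase Butterworth filtering.
import numpy as np
from scipy.signal import butter, filtfilt, welch

fs = 500                                      # sampling rate from the protocol
rng = np.random.default_rng(4)
t = np.arange(20 * fs) / fs
eeg = (np.sin(2 * np.pi * 10 * t)             # in-band alpha-like rhythm
       + np.sin(2 * np.pi * 1 * t)            # out-of-band slow drift
       + 0.3 * rng.standard_normal(len(t)))

b, a = butter(4, [4, 50], btype="bandpass", fs=fs)
filtered = filtfilt(b, a, eeg)                # filtfilt: zero phase distortion

f, p_raw = welch(eeg, fs=fs, nperseg=2048)
_, p_filt = welch(filtered, fs=fs, nperseg=2048)
idx1, idx10 = np.argmin(abs(f - 1)), np.argmin(abs(f - 10))
ratio_out = p_filt[idx1] / p_raw[idx1]        # out-of-band power: suppressed
ratio_in = p_filt[idx10] / p_raw[idx10]       # in-band power: preserved
print(ratio_out, ratio_in)
```

Zero-phase (forward-backward) filtering matters here because subsequent time-frequency and ERP analyses depend on the timing of neural responses relative to task events.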

The Scientist's Toolkit: Essential Research Reagents and Materials

Table 3: Essential Materials for tES-EEG Research

| Item | Specification / Example | Function in Research |
| --- | --- | --- |
| EEG Amplifier & Cap | 64-channel Ag/AgCl system (e.g., BrainAmp, actiCAP) [91] | Records scalp electrical potentials with high temporal resolution. |
| Transcranial Stimulator | Programmable tES device (e.g., DC-STIMULATOR) [93] | Generates precise tDCS, tACS, or tRNS currents. |
| Electrodes & Conductive Gel | Ag/AgCl pellet electrodes; high-chloride abrasive gel | Ensures stable, low-impedance electrical contact with the scalp. |
| Electrode Paste/Skin Prep | Abrasive paste (e.g., NuPrep) | Prepares the skin surface to reduce impedance at the electrode-skin interface. |
| Head Phantom Model | Conductive head-shaped phantom (e.g., kappa carrageenan with NaCl) [94] | Provides a controlled, biomimetic environment for testing and validation. |
| Signal Processing Software | MATLAB with toolboxes (EEGLAB, FieldTrip), Python (MNE) | Implements artifact removal algorithms and general EEG analysis. |

Discussion and Future Directions

Removing stimulation artifacts is a critical enabling step for advancing tES research. Clean simultaneous EEG allows researchers to move beyond simplistic before/after comparisons and observe the direct neural effects of stimulation in real-time. [90] This capability is the foundation for developing closed-loop neuromodulation systems, where stimulation parameters (e.g., frequency, phase, intensity) are dynamically adjusted based on the subject's immediate brain state. [90] [91] Deep learning approaches are now being explored to decode the type of stimulation applied directly from task-based EEG signals, further blurring the line between recording and stimulation. [91]

Future developments will likely involve the refinement of hybrid artifact removal methods that combine the strengths of SMA, AF, and ICA. Furthermore, as new stimulation techniques like Temporal Interference Stimulation (TIS)—which uses high-frequency fields to stimulate deep brain structures—move toward human applications, novel artifact challenges and removal strategies will undoubtedly emerge. [95] The ongoing collaboration between biomedical engineering and clinical neuroscience will continue to drive this field forward, ultimately leading to more effective and personalized neuromodulation therapies.

In electroencephalography (EEG) research, physiological artifacts represent non-cerebral signals originating from biological sources that significantly contaminate neural data. These artifacts, which include activities from ocular, muscular, and cardiac systems, exhibit amplitude ranges often exceeding genuine brain activity by orders of magnitude, thereby complicating neurological assessment and interpretation [96]. The establishment of robust validation frameworks for artifact detection algorithms necessitates comprehensive ground truth datasets where artifact occurrences are precisely annotated. This foundation enables rigorous benchmarking of algorithmic performance against known contamination events, ensuring that automated detection methods meet the stringent requirements of both clinical and research applications. The fundamental challenge in constructing these frameworks lies in the accurate identification and labeling of diverse artifact types within EEG recordings, a process that traditionally relies heavily on expert visual inspection [97].

Physiological Artifact Typology and Characterization

Physiological artifacts in EEG signals originate from various biological sources, each possessing distinct temporal, spectral, and spatial characteristics that facilitate their identification. Understanding this typology is essential for developing effective validation frameworks and algorithmic detection strategies.

Table 1: Characteristics of Major Physiological Artifact Types in EEG Research

| Artifact Type | Biological Origin | Spectral Characteristics | Spatial Distribution | Amplitude Range |
| --- | --- | --- | --- | --- |
| Ocular Artifacts (EOG) | Eye movements, blinking | Low frequency (delta/theta bands) | Primarily frontal regions | 100-200 μV [96] |
| Muscle Artifacts (EMG) | Muscle contraction | High frequency (beta/gamma bands) | Temporal/frontal regions | Varies with contraction strength [96] |
| Cardiac Artifacts (ECG) | Heart electrical activity | Overlaps with EEG bands | Diffuse, often lateralized | Low amplitude [96] |
| Sweat Artifacts | Skin sweat glands | Very low frequency (<0.5 Hz) | Variable distribution | Slow baseline shifts [96] [44] |
| Respiratory Artifacts | Chest/head movement during breathing | Low frequency (delta/theta bands) | Diffuse | Slow rhythmic waves [96] |

The spatial distribution patterns of these artifacts provide critical features for algorithmic detection. Ocular artifacts predominantly manifest in frontal electrodes with characteristic dipole patterns, while muscle artifacts typically localize to temporal regions and electrode sites overlaying cranial muscles [96] [98]. Cardiac artifacts may appear as rhythmic patterns time-locked to QRS complexes, often with lateralized presentation depending on individual anatomy [44]. These distinctive spatial signatures, combined with temporal and spectral features, enable the creation of multi-dimensional ground truth annotations essential for validating detection algorithms.

Validation Framework Architectures for Algorithm Benchmarking

Ground Truth Establishment Methodologies

Establishing reliable ground truth represents the foundational step in validating EEG artifact detection algorithms, with approaches spanning manual, semi-automated, and fully automated paradigms:

  • Expert Visual Annotation: The historical gold standard involves trained electroencephalographers visually identifying artifacts based on morphological characteristics in temporal and spectral domains [97]. This method leverages human pattern recognition capabilities but suffers from inter-rater variability and limited scalability for large datasets.

  • Reference Sensor Approaches: Physiological recordings from dedicated sensors provide objective ground truth measures. Electrooculography (EOG) electrodes placed near eyes capture ocular artifacts, while electrocardiography (ECG) leads record cardiac signals [99]. These hardware-based methods offer temporal precision but require additional equipment and setup complexity.

  • Independent Component Analysis (ICA) with Expert Verification: ICA decomposes EEG signals into spatially fixed and temporally independent components [100]. Experts then classify components as neural or artifactual based on topography, time course, and spectral properties [101]. This hybrid approach combines computational efficiency with expert validation.

  • Multimodal Fusion Frameworks: Advanced frameworks integrate multiple verification sources (expert annotation, reference sensors, component classification) to create high-confidence ground truth labels [97]. This approach mitigates limitations inherent in any single method through data fusion techniques.

Performance Metrics for Algorithm Validation

Quantitative evaluation of artifact detection algorithms requires comprehensive metrics that capture various dimensions of performance:

Table 2: Key Performance Metrics for EEG Artifact Detection Algorithm Validation

| Metric Category | Specific Metrics | Calculation | Interpretation |
| --- | --- | --- | --- |
| Detection Accuracy | Sensitivity, Specificity, Precision, F1-score | TP/(TP+FN); TN/(TN+FP); TP/(TP+FP); 2×(Precision×Recall)/(Precision+Recall) | Measures correctness of artifact identification against ground truth |
| Temporal Precision | Mean absolute error, onset/offset detection delay | Average time difference between detected and actual artifact events | Quantifies temporal alignment precision |
| Spatial Accuracy | Topographic correlation, localization error | Spatial correlation between actual and detected artifact topography | Assesses accuracy in identifying spatial distribution |
| Computational Efficiency | Processing time, memory usage | Time/memory required to process a standard dataset | Determines practical feasibility for real-time applications |
| Robustness | Performance variance across subjects/conditions | Standard deviation of performance metrics across datasets | Evaluates consistency across diverse recording scenarios |

These metrics collectively provide a comprehensive assessment framework, enabling direct comparison between different algorithmic approaches and establishing performance benchmarks for specific application contexts, from clinical diagnostics to brain-computer interfaces [97] [100].
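The detection-accuracy metrics in Table 2 follow directly from a confusion matrix. The epoch labels below are hypothetical, chosen only to make the arithmetic concrete.

```python
# Confusion-matrix metrics for a hypothetical artifact detector's output
# against hypothetical ground-truth epoch labels (1 = artifact epoch).
import numpy as np

truth = np.array([1, 1, 0, 0, 1, 0, 1, 0, 0, 1])
pred  = np.array([1, 0, 0, 0, 1, 1, 1, 0, 0, 1])

tp = int(np.sum((pred == 1) & (truth == 1)))
tn = int(np.sum((pred == 0) & (truth == 0)))
fp = int(np.sum((pred == 1) & (truth == 0)))
fn = int(np.sum((pred == 0) & (truth == 1)))

sensitivity = tp / (tp + fn)                 # a.k.a. recall
specificity = tn / (tn + fp)
precision = tp / (tp + fp)
f1 = 2 * precision * sensitivity / (precision + sensitivity)
print(tp, tn, fp, fn, sensitivity, specificity, precision, f1)
```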

Experimental Protocols for Ground Truth Generation

Protocol 1: Manual Annotation and ICA-Based Labeling

The most established protocol for generating high-quality ground truth involves systematic manual annotation with ICA decomposition:

  • Data Acquisition and Preprocessing: Acquire high-density EEG recordings (≥64 channels recommended) with simultaneous reference signals (EOG, ECG) [100]. Apply bandpass filtering (0.5-70 Hz) and notch filtering (50/60 Hz) to minimize technical artifacts while preserving physiological signals.

  • Independent Component Analysis: Perform ICA decomposition using extended Infomax or similar algorithm to separate EEG data into statistically independent components [100]. Each component comprises a fixed spatial topography and associated time course.

  • Component Classification: Expert reviewers evaluate components based on multiple criteria:

    • Topographic patterns (e.g., frontal focus for ocular artifacts)
    • Temporal characteristics (e.g., pulse synchronization for cardiac artifacts)
    • Spectral properties (e.g., high-frequency content for muscle artifacts)
    • Relationship to reference signals (e.g., correlation with EOG/ECG)
  • Ground Truth Annotation: Label artifact-contaminated epochs in original data based on classified components, specifying artifact type, temporal extent, and spatial distribution.

This protocol benefits from leveraging the human visual system's sophisticated pattern recognition capabilities while utilizing ICA to isolate artifact sources, making it particularly effective for establishing reference standards [101] [100].

[Workflow diagram] Protocol 1: Manual Annotation and ICA-Based Labeling — Data Acquisition (High-density EEG + Reference Signals) → Preprocessing (Bandpass + Notch Filtering) → ICA Decomposition (Extended Infomax Algorithm) → Expert Component Review (Topography, Time Course, Spectrum) → Ground Truth Annotation (Type, Temporal Extent, Spatial Distribution) → Cross-Validation (Multiple Expert Consensus)

Protocol 2: Unsupervised Anomaly Detection Framework

For applications requiring scalability to large datasets without extensive manual labeling, unsupervised approaches provide an alternative ground truth establishment method:

  • Feature Extraction: Compute comprehensive feature set from EEG epochs including:

    • Temporal features (variance, amplitude, entropy)
    • Spectral features (band power across standard frequency bands)
    • Spatial features (topographic distribution, hemispheric asymmetry)
    • Statistical features (kurtosis, skewness, outlier metrics)
  • Multi-Algorithm Ensemble Detection: Apply diverse unsupervised outlier detection algorithms including:

    • Isolation Forests for detecting global outliers
    • Local Outlier Factor for identifying local density anomalies
    • One-class SVM for modeling normative feature distribution
  • Consensus Labeling: Aggregate outputs from multiple detectors using voting schemes or statistical fusion to identify high-confidence artifact segments [97].

  • Expert Verification: Subsampled consensus outputs undergo expert review to validate detection accuracy and refine algorithm parameters.

This protocol offers advantages in scalability and objectivity while reducing reliance on extensive manual annotation efforts, particularly valuable for large-scale datasets where comprehensive expert review is impractical [97].
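The ensemble-and-consensus step can be sketched with scikit-learn's three detectors operating on per-epoch feature vectors. The feature values, contamination/nu settings, and majority-vote threshold below are illustrative, not tuned values from the cited framework.

```python
# Ensemble consensus sketch: three unsupervised detectors vote on epoch
# features; epochs flagged by a majority are labeled high-confidence artifacts.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(5)
normal = rng.normal(0, 1, size=(200, 4))        # features of clean epochs
deviant = rng.normal(8, 1, size=(10, 4))        # grossly contaminated epochs
X = np.vstack([normal, deviant])

votes = np.zeros(len(X), dtype=int)
for det in (IsolationForest(contamination=0.05, random_state=0),
            LocalOutlierFactor(contamination=0.05),
            OneClassSVM(nu=0.05)):
    labels = det.fit_predict(X)                 # -1 = outlier, +1 = inlier
    votes += (labels == -1).astype(int)

consensus = votes >= 2                          # majority vote across detectors
print(consensus[-10:].sum(), consensus[:200].sum())
```

In the framework described above, the high-confidence consensus set would then be subsampled for expert verification rather than accepted outright.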

[Workflow diagram] Protocol 2: Unsupervised Anomaly Detection Framework — Multi-Domain Feature Extraction (Temporal, Spectral, Spatial, Statistical) → Ensemble Outlier Detection (Isolation Forest, LOF, One-Class SVM) → Consensus Labeling (Statistical Fusion of Multiple Detectors) → Targeted Expert Verification (Subsampled High-Confidence Detections) → Validated Ground Truth (With Confidence Estimates)

Advanced Computational Approaches for Artifact Detection

Deep Learning Architectures for Automated Detection

Convolutional Neural Networks (CNNs) applied to Independent Component topographies have demonstrated state-of-the-art performance in automated artifact recognition:

  • Architecture Design: An optimized framework of three CNNs classifies IC topographic maps (Topoplots) into four classes: three artifact types (ocular, muscular/cardiac, muscular/impedance fluctuations) and useful brain signal [100].

  • Performance Metrics: These systems achieve overall accuracy, sensitivity, and specificity greater than 98%, processing 32 Topoplots in approximately 1.4 seconds on standard computing hardware [100].

  • Scalability Advantages: The scalable architecture accommodates varying sensor configurations and emerging artifact patterns without structural redesign, crucial for real-world applications where recording conditions frequently change.

End-to-End Unsupervised Correction Frameworks

Recent approaches extend beyond detection to include artifact correction using representation learning:

  • Feature-Based Detection: Extraction of 58 clinically relevant features with application of unsupervised outlier detection algorithms to identify task- and subject-specific artifacts [97].

  • Deep Encoder-Decoder Correction: Artifact segments processed through deep encoder-decoder networks for unsupervised correction, framed as a temporal interpolation task rather than simple removal [97].

  • Performance Validation: Classification models trained on corrected EEG data demonstrate approximately 10% relative performance improvement compared to uncorrected data, validating the efficacy of this approach [97].

Table 3: Essential Research Tools for EEG Artifact Validation Research

| Tool/Resource | Type | Primary Function | Application Context |
|---|---|---|---|
| FieldTrip Toolbox [101] | Software Library | EEG/MEG analysis with artifact detection functions | Manual and automated artifact rejection, including visual and statistical methods |
| BrainBeats EEGLAB Plugin [102] | Software Plugin | Joint analysis of EEG and cardiovascular signals | Extraction of cardiac artifacts, HEP assessment, heart-brain interaction studies |
| ICA Topoplot CNN Framework [100] | Deep Learning Model | Automated classification of IC topographies | Fast artifact recognition for online BCI applications |
| Unsupervised Artifact Correction Pipeline [97] | Machine Learning Framework | Automated detection and correction without manual labeling | Scalable preprocessing for large EEG datasets |
| Sweat Sensor Integration [99] | Hardware-Software System | Direct measurement and removal of sweat artifacts | Mobile EEG applications where sweat artifacts are prevalent |

These tools collectively provide researchers with comprehensive capabilities for establishing ground truth and validating artifact detection algorithms across diverse experimental contexts. The selection of appropriate tools depends on specific research requirements including dataset scale, artifact types of interest, available computational resources, and application constraints (e.g., real-time processing needs) [101] [97] [102].

The establishment of robust validation frameworks for EEG artifact detection represents a critical methodological foundation for advancing cognitive neuroscience, clinical neurology, and brain-computer interface research. As computational approaches evolve from supervised methods requiring extensive manual annotation to increasingly sophisticated unsupervised and deep learning techniques, the importance of standardized benchmarking against comprehensive ground truth becomes ever more essential. Future progress in this domain will depend on continued development of shared validation datasets, standardized performance metrics, and modular frameworks that can adapt to emerging recording technologies and analysis paradigms. Only through such rigorous validation approaches can the field overcome the persistent challenge of physiological artifacts and unlock the full potential of EEG for understanding brain function and dysfunction.

In electroencephalography (EEG) research, the accurate identification and removal of physiological artifacts is paramount to ensuring data integrity. However, the computational methods employed for this purpose exist within a constrained design space where increasing model complexity to improve accuracy often incurs significant processing speed penalties. This technical guide examines the fundamental trade-offs between model sophistication and computational efficiency within the context of physiological EEG artifact research. We synthesize current methodologies, from traditional signal processing to advanced deep learning architectures, and provide structured analysis of their performance characteristics. For researchers and drug development professionals, optimizing this balance is crucial for enabling real-time applications and managing computational costs in large-scale studies.

Electroencephalography (EEG) records electrical activity generated by the brain, but this sensitive measurement is highly vulnerable to contamination from undesired physiological sources [10]. These physiological artifacts originate from the patient's body but not from cerebral activity, and they represent a significant challenge for data analysis and interpretation [3]. Unlike non-physiological artifacts from external sources like equipment or environment, physiological artifacts are inherent to the recording situation and can be difficult to isolate and remove without affecting neural signals of interest.

The most common physiological artifacts include:

  • Ocular artifacts: Generated by eye movements and blinks due to the dipole between cornea (positive) and retina (negative) [3] [10]
  • Muscle artifacts: Produced by tension in head, face, or neck muscles (electromyographic activity) [3]
  • Cardiac artifacts: Arising from electrical activity of the heart (ECG) or pulse effects from scalp blood vessels [3]
  • Glossokinetic artifacts: Resulting from tongue movements that create electrical potentials [3]
  • Respiratory artifacts: Caused by rhythmic body movements during breathing [3]

These artifacts can mimic cerebral activity and lead to misinterpretation of EEG data, potentially resulting in clinical diagnostic errors or invalid research findings [10]. For instance, eye flutters may be wrongly identified as interictal discharges indicative of epilepsy [10]. The amplitude of these artifacts often far exceeds that of background EEG activity—eye blinks, for example, can produce signals in the hundreds of microvolts compared to cerebral signals typically measuring just a few to tens of microvolts [10].

Computational Methods for Artifact Handling

Traditional Signal Processing Approaches

Traditional methods for artifact handling typically rely on mathematical models of signal properties and are generally less computationally demanding.

Filtering techniques represent the most computationally efficient approach, applying frequency-based exclusion of artifact-prone bands. Muscle artifacts, predominantly high-frequency (>30 Hz), are often addressed with low-pass filtering, while slow drift artifacts may be removed with high-pass filtering [10]. While highly efficient, filtering risks removing neurologically relevant signals sharing frequency bands with artifacts.
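As a concrete sketch of this trade-off, a low-pass Butterworth filter suppresses the EMG-dominated band while leaving an alpha-band rhythm intact; the sampling rate, filter order, and cutoff below are illustrative choices, not recommendations.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 250.0                          # assumed sampling rate (Hz)
t = np.arange(0, 2, 1 / fs)
# Toy channel: 10 Hz "alpha" rhythm plus 50 Hz muscle-band contamination
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)

# 4th-order low-pass at 30 Hz, applied forward-backward (zero phase shift)
b, a = butter(4, 30, btype="low", fs=fs)
filtered = filtfilt(b, a, eeg)

spec = np.abs(np.fft.rfft(filtered))
freqs = np.fft.rfftfreq(filtered.size, 1 / fs)
```

The 50 Hz component is strongly attenuated while the 10 Hz rhythm passes nearly unchanged; had the "artifact" overlapped the alpha band, the same filter would have removed neural signal too.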

Regression methods use reference signals (e.g., electrooculogram EOG) to model and subtract artifact components from EEG channels. These methods require moderate computational resources, primarily for parameter estimation, but performance depends heavily on the quality of reference signals [10].
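The core of the regression approach reduces to a few lines: estimate the EOG-to-EEG propagation coefficient by least squares, then subtract the scaled reference. The signals and the single global coefficient below are synthetic simplifications; real pipelines fit per channel and often per frequency band.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
eog = rng.normal(0, 1, n)                       # reference EOG channel
brain = np.sin(np.linspace(0, 20 * np.pi, n))   # "true" neural signal
eeg = brain + 0.8 * eog                         # contaminated frontal channel

# Least-squares estimate of the propagation coefficient, then subtraction
b = np.dot(eog, eeg) / np.dot(eog, eog)
cleaned = eeg - b * eog
```

The estimate recovers the true coefficient (0.8) closely here because the reference is clean; a noisy or neurally contaminated EOG channel would bias `b` and over- or under-correct the EEG.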

Blind Source Separation (BSS) techniques, particularly Independent Component Analysis (ICA), have become standard for artifact removal in research settings. ICA decomposes multichannel EEG into statistically independent components, allowing researchers to identify and remove artifact-related components before reconstructing the signal [81] [10]. This approach is particularly effective for ocular, cardiac, and muscle artifacts but requires significant computational resources, especially with high-density EEG systems.
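A toy two-source illustration with scikit-learn's FastICA: the sources, the 3-channel mixing matrix, and the kurtosis-based selection of the blink-like component are all invented for the example (kurtosis is one common component-selection heuristic among several).

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(2)
n = 2000
t = np.linspace(0, 8, n)
neural = np.sin(2 * np.pi * 10 * t)               # oscillatory "brain" source
blink = (np.abs(t % 2 - 1) < 0.05).astype(float)  # transient "blink" source
sources = np.c_[neural, blink]

# Toy forward model: mix the two sources into three channels
A = np.array([[1.0, 0.9], [0.8, 0.5], [0.6, 0.1]])
X = sources @ A.T + 0.01 * rng.normal(size=(n, 3))

ica = FastICA(n_components=2, random_state=0)
S = ica.fit_transform(X)          # estimated independent components

# Identify the spiky blink-like component (highest kurtosis),
# zero it out, and reconstruct artifact-free channels.
kurt = ((S - S.mean(0)) ** 4).mean(0) / S.var(0) ** 2 - 3
S[:, np.argmax(kurt)] = 0.0
X_clean = ica.inverse_transform(S)
```

In practice the decomposition runs on dozens to hundreds of channels, and component selection is done by visual inspection of topographies and time courses or by automated classifiers, which is where the computational and manual cost of ICA accumulates.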

Table 1: Computational Characteristics of Traditional Artifact Handling Methods

| Method | Computational Complexity | Primary Artifacts Addressed | Advantages | Limitations |
|---|---|---|---|---|
| Filtering | Low (O(n)) | Muscle (high-frequency), slow drifts | Fast, minimal processing requirements | Risks removing neural signals; ineffective for overlapping frequencies |
| Regression | Medium (O(n²)) | Ocular, cardiac | Effective with good reference signals | Requires additional recordings; may over-correct |
| ICA/BSS | High (O(n³)) | Ocular, cardiac, muscle | No reference signals needed; handles multiple artifacts | Computationally intensive; requires manual component inspection |

Machine Learning and Deep Learning Approaches

Modern machine learning approaches offer increasingly sophisticated artifact detection capabilities but with varied computational demands.

Supervised machine learning models including Support Vector Machines (SVM), Random Forests (RF), and gradient boosting methods (XGBoost, LightGBM, CatBoost) have been applied for automated artifact classification [103]. These models can achieve high accuracy when trained on sufficiently large datasets with proper feature engineering, with computational load varying significantly by algorithm.

Deep Learning architectures—particularly Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs)—represent the most computationally intensive approach [104]. CNNs can automatically learn relevant features from raw EEG signals or time-frequency representations, while RNNs (including LSTM networks) effectively model temporal dependencies in EEG time series. These models have demonstrated classification accuracy exceeding 90% in some studies but require substantial computational resources for both training and inference [104].

The primary computational bottleneck for deep learning models in real-time applications is not merely processing speed but the trade-off between accuracy and efficiency [104]. More complex architectures with higher parameter counts generally achieve better performance but incur prohibitive computational costs that limit practical implementation, particularly for real-time processing [104].

Quantitative Analysis of Complexity-Speed Trade-offs

Performance Metrics Comparison

Research indicates a consistent pattern where computational demands increase disproportionately with model sophistication while delivering diminishing returns in accuracy.

Table 2: Performance Comparison of Artifact Handling Methods

| Method Type | Reported Accuracy | Processing Speed | Hardware Requirements | Suitability for Real-Time |
|---|---|---|---|---|
| Digital Filtering | Low-Moderate (varies by artifact) | Very fast | Minimal | Excellent |
| ICA | Moderate-High | Slow (minutes to hours) | Moderate-High | Poor |
| Traditional ML (SVM, RF) | Moderate-High (75-85%) | Fast (seconds to minutes) | Moderate | Good with optimization |
| Deep Learning (CNN/RNN) | High (>90% in some studies) | Very slow (training); moderate (inference) | High (GPUs recommended) | Limited to optimized models |

The "high computational cost" of deep learning models presents a "prohibitive" barrier for many real-world applications, creating a fundamental tension between classification performance and practical implementability [104]. This is particularly relevant for drug development studies involving longitudinal monitoring or multi-site trials with standardized processing pipelines.

Memory and Processing Requirements

Computational complexity in artifact processing manifests in both time and space complexity:

  • Time complexity ranges from O(n) for simple filtering to O(n³) for ICA decomposition of n-channel EEG
  • Space complexity varies with model parameter count, from minimal for filtering to extensive for deep learning models
  • Processing latency is critical for real-time applications, where batch processing approaches may be unacceptable

The integration of EEG with other data modalities (facial expressions, physiological sensors) further compounds these computational challenges, though multimodal approaches have demonstrated improved classification accuracy [104].

Experimental Protocols for Method Evaluation

Standardized Evaluation Framework

To objectively assess the trade-offs between model complexity and processing speed, researchers should implement standardized evaluation protocols:

Data Acquisition Specifications

  • Use high-density EEG systems (e.g., 256-channel EGI GES 400MR) [81]
  • Maintain impedance below 50 kΩ for all electrodes [81]
  • Apply bandpass filtering (1-40 Hz) using 8th-order Butterworth filters [81]
  • Implement proper grounding to minimize 60-Hz AC interference [3]

Artifact Induction Protocol

  • Record deliberate artifact conditions: eye blinks, lateral eye movements, jaw clenching, head rotation
  • Obtain corresponding reference signals (EOG, EMG, ECG) for validation
  • Include resting state segments for baseline comparison

Processing Pipeline

  • Preprocessing: Filtering, bad channel detection/interpolation
  • Artifact Detection: Apply method under evaluation
  • Component Removal: For ICA-based methods, remove identified artifact components
  • Signal Reconstruction: Reconstruct clean EEG
  • Validation: Compare with reference signals and expert ratings

Performance Assessment Metrics

Computational Efficiency Measures

  • Execution time per epoch (seconds)
  • Memory utilization (MB/GB)
  • CPU/GPU utilization percentage
  • Scaling behavior with channel count
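The first of these measures, execution time per epoch, can be obtained with a simple wall-clock loop; the filter parameters and epoch geometry below are arbitrary stand-ins chosen only to make the measurement concrete.

```python
import time
import numpy as np
from scipy.signal import butter, filtfilt

fs, n_channels, epoch_len = 250, 32, 2 * 250   # assumed recording geometry
b, a = butter(4, 30, btype="low", fs=fs)
epoch = np.random.default_rng(3).normal(size=(n_channels, epoch_len))

# Average wall-clock time per epoch over repeated runs
t0 = time.perf_counter()
for _ in range(100):
    filtfilt(b, a, epoch, axis=1)
per_epoch = (time.perf_counter() - t0) / 100
print(f"filtering: {per_epoch * 1e3:.2f} ms per {n_channels}-channel epoch")
```

The same harness, applied to each candidate method with the channel count varied, also yields the scaling-behavior measure directly.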

Artifact Handling Performance

  • Sensitivity and specificity for artifact detection
  • Signal-to-noise ratio improvement
  • Preservation of neural signals in clean segments
  • Inter-rater reliability with expert annotations

Implementation Strategies for Efficiency Optimization

Algorithmic Optimization Techniques

Several strategies can help balance the complexity-efficiency trade-off:

Dimensionality Reduction techniques such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) can substantially improve computational efficiency while maintaining performance [103]. Studies demonstrate that even poorly performing models like Gaussian Naive Bayes show "substantially increased performance after dimension reduction" [103].
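A small synthetic illustration of this strategy: PCA compresses 64 correlated features into 5 components before classification, shrinking per-epoch inference cost while keeping the class-relevant subspace. The data geometry and component count are invented for the example; the cited studies used real EEG features.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(4)
# 300 epochs x 64 correlated features; the class label depends only
# on a 3-dimensional latent subspace.
latent = rng.normal(size=(300, 3))
y = (latent[:, 0] > 0).astype(int)
X = latent @ rng.normal(size=(3, 64)) + 0.5 * rng.normal(size=(300, 64))

X_red = PCA(n_components=5).fit_transform(X)    # 64 -> 5 features

acc_full = cross_val_score(GaussianNB(), X, y, cv=5).mean()
acc_red = cross_val_score(GaussianNB(), X_red, y, cv=5).mean()
```

Because PCA decorrelates the inputs, it also repairs the independence assumption that Gaussian Naive Bayes makes, which is consistent with the performance gains reported after dimension reduction [103].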

Feature Selection approaches that identify the most discriminative EEG features (e.g., frontal asymmetry, spectral power bands, connectivity metrics) can reduce input dimensionality without significantly compromising accuracy [104].

Model Compression techniques including pruning, quantization, and knowledge distillation can reduce deep learning model size and computational requirements while preserving functionality.

Hybrid Approaches that combine efficient traditional methods with targeted machine learning can optimize the balance. For example, using ICA for initial component separation followed by a lightweight classifier for automated component labeling.

Hardware and Parallelization Strategies

GPU Acceleration dramatically improves performance for parallelizable operations in ICA and deep learning models.

Cloud Computing resources enable scaling for large datasets without local infrastructure investment.

Edge Computing approaches optimize models for deployment in resource-constrained environments, such as wearable EEG systems.

Research Reagent Solutions and Essential Materials

Table 3: Essential Research Materials for EEG Artifact Research

| Item | Function | Specification Considerations |
|---|---|---|
| High-Density EEG System | Signal acquisition | 256-channel systems (e.g., EGI GES 400MR) provide better spatial resolution for artifact identification [81] |
| MR-Compatible EEG Systems | Simultaneous EEG/fMRI research | Required for studying artifacts specific to MR environments [81] |
| Active Electrodes | Signal quality improvement | Reduce interference in non-shielded environments [105] |
| Reference Recording Equipment | Artifact validation | EOG, EMG, ECG for ground-truth validation [10] |
| Faraday Cage/Shielded Room | Environmental control | Minimizes external electromagnetic interference [105] |
| Computational Hardware | Signal processing | GPU acceleration recommended for deep learning and ICA [104] |
| Software Toolboxes | Analysis implementation | EEGLAB, Cartool, BrainVision Analyzer provide standardized implementations [81] |

Visualizing Method Selection and Workflow

[Decision diagram — method selection. Raw EEG → preprocessing (filtering, bad-channel interpolation) → application context: clinical/real-time applications use filtering + regression (low complexity, fast); research applications prioritizing accuracy use ICA + manual inspection (medium complexity and speed) or, given sufficient computational resources, deep learning (high complexity, slow) → clean EEG data.]

Figure 1: Artifact Processing Method Selection

[Diagram — complexity-speed relationship. As model complexity increases, processing speed falls: filtering is fast; regression methods and traditional ML are moderate; ICA/BSS and deep learning are slow.]

Figure 2: Complexity-Speed Relationship

The trade-off between computational model complexity and processing speed represents a fundamental consideration in physiological EEG artifact research. While advanced deep learning methods offer impressive accuracy, their computational demands frequently preclude real-time application and large-scale implementation. Future research directions should focus on developing adaptive, real-time processing algorithms that maintain sufficient accuracy while operating within practical computational constraints [104].

Optimization techniques that reduce model size without significant performance loss, combined with hardware acceleration strategies, offer promising pathways to bridge this gap. Additionally, standardized protocols for emotion elicitation and artifact benchmarking would enhance comparability across studies and improve generalizability of findings [104].

For researchers and drug development professionals, the optimal balance point depends on specific application requirements: real-time clinical applications may prioritize efficiency, while post-hoc research analysis may justify more computationally intensive approaches. By thoughtfully navigating these trade-offs, the field can advance toward more robust, scalable, and clinically applicable EEG artifact handling methods that maintain both scientific rigor and practical utility.

Electroencephalography (EEG) is a powerful, non-invasive tool for investigating brain function, boasting high temporal resolution and portability that make it invaluable in fields ranging from clinical neurology and psychology to cognitive neuroscience and pharmaceutical development [5] [19]. However, the recorded EEG signal is notoriously susceptible to contamination by unwanted non-neural signals, known as artifacts. These artifacts can obscure genuine brain activity and compromise data integrity, leading to misinterpretation and flawed conclusions.

Physiological artifacts, which originate from the participant's own body, represent a particularly pervasive challenge. Unlike non-physiological artifacts (e.g., line noise, electrode pops), physiological artifacts often exhibit spectral and temporal properties that overlap with those of neural signals of interest, making them difficult to isolate and remove [19]. Effectively managing these artifacts is not a one-size-fits-all endeavor; it requires a deliberate, evidence-based selection of methodologies tailored to the specific artifact type, research context, and available equipment. This guide provides a structured framework for researchers to navigate this complex methodological landscape, offering actionable recommendations for optimizing EEG data quality and reliability.

Classification and Characteristics of Major Physiological Artifacts

A foundational step in artifact management is the accurate identification of the contaminant. Different physiological artifacts have distinct origins and signatures in the EEG signal. The table below summarizes the key characteristics of the most common physiological artifacts.

Table 1: Characteristics of Major Physiological EEG Artifacts

| Artifact Type | Biological Origin | Typical Causes | Key Features in Time Domain | Key Features in Frequency Domain |
|---|---|---|---|---|
| Ocular (EOG) | Corneo-retinal dipole (eye) [19] | Blinks, saccades, lateral gaze [19] | High-amplitude, slow deflections, maximal over frontal sites (e.g., Fp1, Fp2) [19] | Dominant in delta (0.5-4 Hz) and theta (4-8 Hz) bands [19] |
| Muscle (EMG) | Muscle fiber contractions [19] | Jaw clenching, swallowing, talking, frowning [19] | High-frequency, low-voltage "spiky" activity [19] | Broadband noise; dominates beta (13-30 Hz) and gamma (>30 Hz) ranges [19] |
| Cardiac (ECG) | Electrical activity of the heart [19] | Heartbeat (pulse artifact) [19] | Rhythmic, sharp waveforms recurring at heart rate, often in central/temporal channels [19] | Overlaps multiple EEG bands; peak at heart rate (~1-1.7 Hz) [19] |
| Movement | Disruption of electrode-skin interface [24] | Head turns, walking, postural shifts [19] | High-amplitude, low-frequency drifts or sudden, non-stationary bursts [19] | Can introduce low-frequency drift and broadband noise [5] |

A Structured Framework for Artifact Management Method Selection

The selection of an artifact management strategy should be guided by the specific research context. The following decision framework outlines a recommended pipeline, from data acquisition to final processing, highlighting the most effective techniques for different scenarios.

Core Workflow for Artifact Management

The diagram below visualizes the step-by-step, evidence-based workflow for managing physiological artifacts in EEG research, from preparation to final processing.

[Workflow diagram — artifact management pipeline. Experimental design & acquisition: pre-acquisition strategy and hardware selection; for in-motion contexts, add auxiliary sensors (IMU, EOG, EMG). Data inspection & preprocessing: filtering and visual inspection. Targeted processing: ICA or regression for ocular artifacts; wavelet transform or ASR for muscle/motion artifacts; broad-spectrum or deep-learning methods when the artifact type is unknown. Validation & reporting: quantify performance (SNR, CC, RRMSE) and report data loss/retention.]

Pre-Acquisition and Hardware Considerations

The optimal approach to artifacts begins before data collection. Proactive strategies can significantly reduce contamination at the source.

  • Auxiliary Sensors: For experiments involving significant movement (e.g., exergaming, ambulatory monitoring), auxiliary sensors are strongly recommended. As highlighted in a systematic review, inertial measurement units (IMUs) are underutilized despite their high potential for enhancing motion artifact detection under ecological conditions [5]. Simultaneous recording of EOG and EMG provides reference signals that drastically improve the identification and removal of ocular and muscular artifacts.
  • Electrode and System Choice: The choice between traditional wet-electrode systems and modern dry-electrode wearable systems carries trade-offs. Wearable EEG systems, often using dry electrodes, are prone to specific artifacts due to reduced scalp coverage and subject mobility [5]. Researchers must select hardware appropriate for the experimental context, acknowledging that relaxed constraints often compromise signal quality.

Selection of Processing Methods Based on Artifact Type

Once data is acquired, the choice of processing method should be guided by the nature of the dominant artifacts, as illustrated in the workflow.

  • For Ocular Artifacts: Independent Component Analysis (ICA) is among the most frequently used and effective techniques [5] [24] [19]. ICA is a blind source separation method that decomposes the EEG signal into statistically independent components. Components with topography, time course, and spectral profile characteristic of eye blinks or movements can be manually or automatically identified and removed before signal reconstruction [19]. Regression-based techniques offer an alternative, though they may perform poorly without a dedicated reference channel [20].
  • For Muscle and Motion Artifacts: A combination of techniques is often required. Wavelet transforms are highly effective for managing muscular artifacts due to their ability to localize transient, high-frequency features in both time and frequency domains [5]. For scenarios with continuous or gross movement, Artifact Subspace Reconstruction (ASR)-based pipelines are widely applied [5]. ASR functions as a powerful, adaptive filter that removes high-variance signal components indicative of large artifacts.
  • For Broad-Spectrum or Unknown Artifacts: In cases where artifacts are mixed or not easily categorized, deep learning (DL) approaches are emerging as powerful, versatile tools. A 2025 study proposed CLEnet, a novel architecture integrating dual-scale Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks [20]. This model is designed to extract both the morphological and temporal features of EEG, enabling the separation of clean neural data from various artifacts, even in multi-channel contexts with "unknown" noise sources [20]. These models are particularly promising for real-time settings and can adapt to a wider range of artifact types than traditional algorithms [5].
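Full ASR reconstructs the artifactual signal subspace from a PCA of clean calibration data; the numpy sketch below shows only the simpler calibration-and-threshold detection idea behind it. The cutoff multiple `k`, the window size, and the synthetic burst are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
fs, n_ch, dur = 250, 8, 10
X = rng.normal(0, 1, (n_ch, fs * dur))              # baseline-like activity
X[:, 1000:1100] += rng.normal(0, 15, (n_ch, 100))   # injected motion burst

# Calibrate on an artifact-free segment, then flag windows whose RMS
# exceeds a multiple of the calibration RMS.
calib_rms = X[:, :500].std()
k, win = 4.0, 50
bad = np.zeros(X.shape[1], dtype=bool)
for start in range(0, X.shape[1] - win + 1, win):
    if X[:, start:start + win].std() > k * calib_rms:
        bad[start:start + win] = True
```

Where this sketch would simply reject the flagged windows, ASR instead projects them onto the calibration subspace and reconstructs the removed variance, preserving continuous data for downstream analysis.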

Table 2: Evidence-Based Method Selection for Common Research Contexts

| Research Context | Dominant Artifact Types | Recommended Methods | Performance Considerations |
|---|---|---|---|
| Resting-state / sedentary | Ocular, cardiac | ICA, regression | High accuracy for ocular artifact removal; commonly assessed via accuracy (reported in 71% of studies) and selectivity (63%) against a clean reference signal [5] |
| Ambulatory / exergaming | Motion, muscle | ASR, wavelet transform, IMU-assisted detection | Effective for high-intensity motion; deep learning is emerging for muscular and motion artifacts [5] [24] |
| High-channel-count (>32) EEG | Ocular, muscle, cardiac | ICA, PCA | Leverages high spatial resolution; performance impaired in low-density setups [5] |
| Low-channel-count / wearable EEG | Mixed, motion | Deep learning (e.g., CNN-LSTM), ASR | Adapts to low spatial resolution; CLEnet improved SNR by 2.45% and CC by 2.65% on 32-channel data [20] |
| Event-related potential (ERP) studies | Ocular, muscle | ICA, wavelet transform | Preserves trial-to-trial latency; visual inspection is common but time-consuming [106] [107] |

Experimental Protocols and Validation Metrics

Example Protocol: Validating a Deep Learning Artifact Removal Model

A 2025 study on the CLEnet model provides a robust protocol for developing and validating a DL-based artifact removal tool [20].

  • Dataset Curation: Create multiple datasets for training and evaluation.
    • Dataset I (Semi-synthetic EMG/EOG): Artificially mix clean, single-channel EEG with recorded EMG and EOG signals at known signal-to-noise ratios [20].
    • Dataset II (Semi-synthetic ECG): Mix clean EEG with Electrocardiogram (ECG) data from a public database like MIT-BIH Arrhythmia Database [20].
    • Dataset III (Real, Multi-channel Unknown Artifacts): Collect a bespoke dataset (e.g., 32-channel EEG from participants performing a cognitive task like a 2-back test) containing real, unknown physiological artifacts [20].
  • Network Architecture & Training: Design a dual-branch neural network (e.g., CLEnet) that uses CNN blocks to extract morphological features and LSTM networks to capture temporal dependencies. An attention mechanism (e.g., EMA-1D) can be incorporated to enhance feature selection. Train the model in a supervised manner using mean squared error (MSE) between the model's output and the known clean EEG as the loss function [20].
  • Performance Validation: Quantify model performance using multiple metrics on the test datasets. Key metrics include:
    • Signal-to-Noise Ratio (SNR) [20]
    • Correlation Coefficient (CC) between cleaned and clean EEG [20]
    • Relative Root Mean Square Error in both temporal (RRMSEt) and frequency (RRMSEf) domains [20].
  • Comparative and Ablative Analysis: Benchmark the model's performance against established mainstream models (e.g., 1D-ResCNN, NovelCNN). Conduct ablation studies (e.g., removing the EMA-1D module) to confirm the contribution of each network component to the overall performance [20].
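The validation metrics from step 3 reduce to a few lines of numpy. The synthetic "denoised" signal below is a stand-in for real model output, used only to exercise the metric functions.

```python
import numpy as np

def snr_db(clean, denoised):
    """Signal-to-noise ratio of the reconstruction, in dB."""
    return 10 * np.log10(np.sum(clean ** 2) / np.sum((clean - denoised) ** 2))

def rrmse(clean, denoised):
    """Relative root-mean-square error, temporal domain (RRMSEt)."""
    return np.sqrt(np.mean((clean - denoised) ** 2)) / np.sqrt(np.mean(clean ** 2))

def rrmse_spectral(clean, denoised):
    """RRMSE on magnitude spectra, frequency domain (RRMSEf)."""
    pc, pd = np.abs(np.fft.rfft(clean)), np.abs(np.fft.rfft(denoised))
    return np.sqrt(np.mean((pc - pd) ** 2)) / np.sqrt(np.mean(pc ** 2))

rng = np.random.default_rng(6)
t = np.arange(0, 4, 1 / 250.0)
clean = np.sin(2 * np.pi * 10 * t)
denoised = clean + 0.1 * rng.normal(size=t.size)  # stand-in model output
cc = np.corrcoef(clean, denoised)[0, 1]           # correlation coefficient
```

Reporting all four quantities together is what allows reconstruction quality to be compared in both temporal and spectral domains, as the protocol requires.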

Standard Validation Metrics and Reporting

Regardless of the method chosen, rigorous validation is essential. Researchers should consistently report quantitative performance metrics and data retention statistics to allow for cross-study comparison and reproducibility.

  • Primary Metrics: The most common metrics, derived from having a ground-truth clean signal, are Accuracy (reported in 71% of studies) and Selectivity (reported in 63% of studies) [5].
  • Advanced Metrics: As seen in DL research, SNR, CC, and RRMSE provide a more granular view of the reconstruction quality in both temporal and spectral domains [20].
  • Data Reporting: It is critical to report the proportion of data discarded due to artifacts, as large-scale rejection can bias results and reduce statistical power [24].

The Scientist's Toolkit: Essential Research Reagents and Solutions

The following table details key hardware, software, and methodological "reagents" essential for effective EEG artifact research.

Table 3: Essential Toolkit for EEG Artifact Research

| Tool / Solution | Category | Primary Function | Example Application / Note |
|---|---|---|---|
| Auxiliary Sensors (EOG, EMG, IMU) | Hardware | Provide reference signals for specific artifacts; motion tracking | Critical for improving artifact detection in mobile and real-world settings [5] |
| ICA (e.g., in EEGLAB) | Algorithm | Blind source separation for isolating neural and non-neural components | Gold standard for ocular artifact removal; requires multiple channels and manual inspection [5] [19] |
| Wavelet Transform | Algorithm | Time-frequency analysis for isolating transient signals | Highly effective for identifying and removing myogenic (muscle) artifacts [5] |
| Artifact Subspace Reconstruction (ASR) | Algorithm | Adaptive, statistical method for removing high-variance signal components | Suitable for online and real-time processing of motion and other large artifacts [5] |
| Deep Learning Models (e.g., CNN-LSTM) | Algorithm | Automated, adaptive artifact removal from learned features | Emerging for multi-artifact removal; CLEnet is an example for multi-channel data [20] |
| Semi-Synthetic Benchmark Datasets | Data | Provide ground truth for training and validating new algorithms | e.g., EEGdenoiseNet; enables supervised learning and fair model comparison [20] |

The landscape of EEG artifact management is evolving, moving from traditional, often manual methods toward increasingly automated and adaptive computational approaches. The most effective strategy is not to seek a single universal solution, but to implement a structured, context-aware pipeline. This begins with proactive experimental design, leverages auxiliary data where possible, and applies evidence-based processing methods—from established tools like ICA and wavelet transforms for well-defined artifacts to sophisticated deep learning models for complex, multi-channel, and real-world scenarios. By adhering to these guidelines and rigorously validating their workflows, researchers can significantly enhance the fidelity of their EEG data, thereby solidifying the foundation for their neuroscientific, clinical, and pharmacological discoveries.

Conclusion

Effectively managing physiological artifacts is not merely a preprocessing step but a fundamental requirement for ensuring the validity of EEG-based research and clinical applications. A one-size-fits-all approach is inadequate; the optimal strategy depends on the specific artifact type, research context, and available computational resources. While traditional methods like ICA and regression remain highly valuable, emerging deep learning and state space models like Complex CNN and M4 offer superior performance for complex, non-stationary artifacts, especially in specialized applications like simultaneous tES-EEG. Future directions should focus on developing standardized validation frameworks, enhancing the real-time capabilities and generalizability of deep learning models, and creating integrated, automated preprocessing pipelines. These advancements will be crucial for unlocking the full potential of EEG in translational research, neuromodulation studies, and the development of robust biomarkers for neurological and psychiatric drug development.

References