High-density electroencephalography (hd-EEG), with its high spatial resolution, is indispensable for modern neuroscience research and clinical applications. However, the immense data volume from hundreds of channels complicates the critical preprocessing step of artifact removal. This article provides a comprehensive analysis of the unique challenges in hd-EEG artifact remediation, exploring a spectrum of solutions from semi-automatic routines and traditional blind source separation to cutting-edge deep learning models. Tailored for researchers and drug development professionals, the content details methodological applications, offers troubleshooting guidance for optimization, and presents a comparative validation of prevalent techniques. The synthesis aims to equip practitioners with the knowledge to enhance data integrity, thereby ensuring the reliability of neural signatures in biomedical research.
The evolution from conventional low-density electroencephalography (EEG) to high-density (hd-EEG) systems represents a fundamental shift in neuroimaging capabilities. While traditional systems typically employ 19-21 electrodes, modern hd-EEG configurations utilize 64 to 256 channels or more, dramatically increasing spatial resolution and providing a more complete representation of the brain's electrical field [1]. This technological advancement enables researchers to capture neural dynamics with unprecedented detail, but it simultaneously introduces profound challenges for artifact removal that differ substantially from those encountered in conventional EEG. The core problem is that artifact removal in hd-EEG is not merely a matter of scaling up existing methods, but requires a complete re-evaluation of approaches due to fundamental differences in how artifacts manifest, propagate, and interact with neural signals across dense electrode arrays.
The implications of these challenges extend across multiple domains of neuroscience research and clinical applications. For drug development professionals studying pharmaco-EEG biomarkers, the integrity of neural signals is paramount for accurately assessing drug effects on brain dynamics. Similarly, basic researchers investigating functional connectivity or neural oscillations rely on artifact-free data to draw valid conclusions about brain function. Understanding why hd-EEG demands specialized artifact handling approaches is therefore essential for advancing research across multiple neuroscientific disciplines.
In hd-EEG, artifacts manifest in fundamentally different patterns compared to conventional systems due to the dense sampling of the scalp's electrical field. Where traditional EEG might show a simplified artifact pattern, hd-EEG reveals the complete spatial distribution of artifactual fields, including both positive and negative potential areas and the precise transition boundaries between them [1].
For example, a simple eye blink clearly illustrates this difference: where conventional montages capture only a frontopolar positive transient, hd-EEG resolves the full dipolar field, with positivity above the eyes, negativity below them, and a sharp inversion line between the two (Table 1).
This detailed spatial information enables more precise artifact identification and source localization, but simultaneously complicates removal by revealing the complex, distributed nature of artifacts that simple regression methods cannot adequately address.
Table 1: Comparative Analysis of Artifact Manifestation in Conventional vs. High-Density EEG
| Characteristic | Conventional EEG (19-21 channels) | High-Density EEG (128-256 channels) |
|---|---|---|
| Spatial Sampling | Sparse coverage, especially in inferior regions | Comprehensive coverage including cheeks and inferior temporal areas |
| Artifact Field Visualization | Partial, fragmented patterns | Complete dipolar fields with clear polarity inversions |
| Eye Blink Manifestation | "V"-shaped positive transient in frontopolar electrodes | Positive field above eyes, negative field below, with clear inversion line |
| Lateral Eye Movement | Difficult to distinguish from frontal slowing | Clear contralateral positivity and ipsilateral negativity |
| Complex Artifact Resolution | Limited ability to detect combined artifacts | Can differentiate combined events (e.g., blink + lateral eye movement) |
Paradoxically, the same characteristics that make hd-EEG artifacts challenging also make them more informative. The dense spatial sampling allows researchers to visualize the complete topography of artifactual fields, enabling more precise identification of their biological origins [1]. For instance, a pure lateral eye movement in hd-EEG shows a distinctive pattern with positive potentials on the cheek toward which the eyes are moving and negative potentials on the opposite cheek, with a nearly vertical inversion line between them. When a blink accompanies lateral eye movement, the inversion line becomes diagonal, revealing the composite nature of the artifact [1].
This level of detail fundamentally changes the artifact removal problem from one of simply identifying and deleting contaminated periods to one of carefully separating overlapping neural and non-neural signals while preserving the integrity of both. The spatial complexity that makes artifacts more challenging to remove also provides the necessary information to remove them more precisely than ever before possible with conventional systems.
The transition to hd-EEG exposes significant limitations in traditional artifact removal methods, necessitating specialized approaches. Blind source separation (BSS) methods like Independent Component Analysis (ICA) face particular challenges in the hd-EEG context, where the assumption of statistical independence between neural signals and artifacts may not hold true in practical applications [2] [3].
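To make the BSS principle concrete, the following minimal Python sketch separates a blink-like transient from an ongoing oscillation using scikit-learn's FastICA. The two-source mixture, four synthetic "channels", and correlation-based component labeling are illustrative assumptions, not an hd-EEG pipeline; in practice component classification is the hard part.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.arange(0, 10, 1 / 250)                    # 10 s at 250 Hz

# Two latent sources: a 10 Hz "neural" oscillation and a blink-like artifact.
neural = np.sin(2 * np.pi * 10 * t)
blink = np.zeros_like(t)
blink[::500] = 1.0                               # one event every 2 s
blink = np.convolve(blink, np.hanning(75), mode="same")  # smooth blink shape

S = np.c_[neural, blink]
A = rng.normal(size=(4, 2))                      # mixing into 4 "channels"
X = S @ A.T + 0.05 * rng.normal(size=(len(t), 4))

ica = FastICA(n_components=2, random_state=0)
sources = ica.fit_transform(X)                   # estimated components

# Label the blink component by correlation with the (here, known) artifact,
# zero it, and project back to channel space.
corr = [abs(np.corrcoef(sources[:, k], blink)[0, 1]) for k in range(2)]
sources[:, int(np.argmax(corr))] = 0.0
X_clean = ica.inverse_transform(sources)
```

The same zero-and-reconstruct step is what component rejection does in real ICA-based pipelines; only the labeling criterion changes.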
Critical limitations of conventional methods include the questionable independence assumption underlying ICA, the computational burden of decomposing hundreds of channels, and designs that presuppose offline processing.
The online processing requirement for brain-computer interface (BCI) and neurofeedback applications presents particular challenges for hd-EEG, as most artifact removal methods were designed for offline analysis without strict temporal constraints [2]. This limitation is especially relevant for drug development studies incorporating real-time neurofeedback or for clinical applications requiring immediate processing.
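For intuition on what online processing demands, the sketch below (using scipy; the 1-40 Hz band and 100 ms chunk size are arbitrary illustrative choices) shows that a causal IIR filter applied in small chunks, with its state carried across calls, reproduces the one-shot result sample for sample — the kind of property a real-time hd-EEG stage must preserve.

```python
import numpy as np
from scipy.signal import butter, lfilter, lfilter_zi

fs = 250
b, a = butter(4, [1, 40], btype="bandpass", fs=fs)    # 1-40 Hz band-pass

rng = np.random.default_rng(1)
x = rng.normal(size=fs * 10)                          # 10 s of one channel

# Offline reference: filter the entire record in one call.
zi0 = lfilter_zi(b, a) * x[0]
offline, _ = lfilter(b, a, x, zi=zi0)

# Online simulation: 100 ms chunks, carrying filter state between calls.
zi = zi0.copy()
chunks = []
for start in range(0, x.size, 25):                    # 25 samples = 100 ms
    y, zi = lfilter(b, a, x[start:start + 25], zi=zi)
    chunks.append(y)
online = np.concatenate(chunks)
```

Filtering is the easy case; decomposition-based cleaners need analogous incremental formulations, which most offline methods lack.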
A fundamental concern in hd-EEG artifact removal is ensuring that denoising methods preserve the true brain dynamics underlying the recorded signals. Without appropriate validation, even methods that successfully remove artifacts may distort genuine neural activity, leading to erroneous scientific conclusions or clinical interpretations [3].
Microstate analysis has emerged as a promising validation approach, representing global brain dynamics as sequences of a few scalp potential topographies that remain stable for brief intervals (60-120 ms) [3]. Studies comparing automated artifact removal methods (optimized fingerprint method and ARCI approach) against expert visual classification have demonstrated close agreement between the automatically and manually cleaned data.
These findings confirm that automated methods can effectively remove physiological artifacts while preserving global brain dynamics, addressing a critical concern in hd-EEG research.
The growing field of Mobile Brain/Body Imaging (MoBI), which combines hd-EEG with motion capture during naturalistic movement, presents particularly complex artifact scenarios. Traditional artifact removal methods fail completely in these contexts due to the non-stationary, high-amplitude artifacts generated by whole-body movements [4] [6].
Successful approaches for movement artifact removal instead employ multi-stage processing pipelines, in which artifact template regression (see Table 2) is combined with additional cleaning steps rather than relying on any single decomposition.
This sophisticated approach has demonstrated efficacy in removing severe movement artifacts during walking and running while preserving cognitive event-related potentials (ERPs) during simultaneous visual oddball tasks [4]. The method significantly reduces EEG spectral power in the 1.5-8.5 Hz frequency range during locomotion without evidence of overcorrection.
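A core building block of such pipelines, time-locked artifact template regression, can be sketched in a few lines of numpy. This is synthetic single-channel data: the gait-cycle length, artifact amplitude, and per-cycle least-squares scaling are illustrative assumptions, not the published MoBI method.

```python
import numpy as np

rng = np.random.default_rng(2)
step_len, n_steps = 125, 40        # 0.5 s gait cycle at 250 Hz, 40 cycles

# Simulate one channel: neural background plus a stereotyped gait artifact.
neural = rng.normal(scale=1.0, size=step_len * n_steps)
artifact_shape = 8.0 * np.hanning(step_len)          # high-amplitude, repeating
eeg = neural + np.tile(artifact_shape, n_steps)

# Epoch time-locked to step onsets; the across-cycle average estimates
# the artifact template (neural activity averages toward zero).
epochs = eeg.reshape(n_steps, step_len)
template = epochs.mean(axis=0)

# Subtract the template from every cycle, scaled per cycle by least squares
# to absorb step-to-step amplitude variation.
scale = epochs @ template / (template @ template)
cleaned = (epochs - scale[:, None] * template[None, :]).ravel()
```

Real gait data adds jitter in cycle length, which is why published pipelines follow template subtraction with source-separation stages.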
Deep learning (DL) approaches represent a promising frontier in hd-EEG artifact removal, potentially overcoming limitations of traditional methods. The CLEnet architecture exemplifies this direction, integrating dual-scale convolutional neural networks (CNN) with Long Short-Term Memory (LSTM) networks and an improved EMA-1D (One-Dimensional Efficient Multi-Scale Attention Mechanism) [7].
This approach addresses key challenges in hd-EEG artifact removal, notably the handling of multi-channel inputs and of artifact types not represented in the training data.
Experimental results demonstrate CLEnet's superiority over mainstream models, achieving 2.45-5.13% improvements in signal-to-noise ratio (SNR) and 0.75-2.65% improvements in correlation coefficients (CC) across different artifact types, while reducing temporal and frequency domain errors by 3.30-8.08% [7].
Robust experimental protocols are essential for validating hd-EEG artifact removal methods. The following methodologies represent current best practices:
Simultaneous Inside-Outside Scanner Recordings This validation approach involves collecting EEG data simultaneously inside and outside the MRI scanner environment [8]. The outside-scanner recording serves as a benchmark against which artifact reduction applied to the inside-scanner data is quantified.
Carbon-Wire Loop (CWL) Reference Systems The CWL method uses six carbon-wire loops placed on the head but isolated from the scalp to exclusively capture MR-induced artifacts [8]. This provides a reference signal uncontaminated by neural activity, enabling the artifact contribution to be estimated and regressed out of each EEG channel.
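The reference-regression step this enables can be sketched as follows. The data are synthetic and the assumption of a purely linear, instantaneous coupling between loops and EEG channels is a simplification of real CWL correction.

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_eeg, n_cwl = 5000, 8, 6       # samples, EEG channels, carbon-wire loops

# Reference loops record only the induced artifact, never neural signal.
cwl = rng.normal(size=(n, n_cwl))
mixing = rng.normal(size=(n_cwl, n_eeg))
neural = rng.normal(size=(n, n_eeg))
eeg = neural + cwl @ mixing        # EEG = brain activity + induced artifact

# Least-squares fit of each EEG channel on the CWL references,
# then subtraction of the fitted artifact contribution.
beta, *_ = np.linalg.lstsq(cwl, eeg, rcond=None)
cleaned = eeg - cwl @ beta
```

Because the references contain no neural activity, the regression cannot remove genuine brain signal beyond chance correlation — the key advantage over reference-free methods.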
Semi-Automatic Graphical User Interface (GUI) Approaches For sleep hd-EEG, specialized routines like High-Density-SleepCleaner provide semi-automatic artifact removal through interactive visualization of data quality metrics [5], combining automated detection with expert oversight in a workflow tailored to overnight recordings.
Table 2: Research Reagent Solutions for Hd-EEG Artifact Management
| Tool/Method | Primary Function | Application Context | Key Advantages |
|---|---|---|---|
| Carbon-Wire Loops (CWL) | Reference-based artifact capture | EEG-fMRI recordings | Records MR artifacts without neural contamination |
| High-Density-SleepCleaner | Semi-automatic artifact identification | Sleep hd-EEG | Specialized for overnight recordings with dynamic GUI |
| CLEnet | Deep learning artifact separation | General hd-EEG | Handles unknown artifacts and multi-channel inputs |
| Optimized Fingerprint Method | Automated IC classification | General hd-EEG | Reference-free with high accuracy for physiological artifacts |
| ARCI Approach | Cardiac artifact removal | General hd-EEG | Specialized for pulse and cardiac-related interference |
| MoBI Template Regression | Movement artifact removal | Mobile brain/body imaging | Addresses gait and whole-body movement artifacts |
Different research applications impose unique constraints on artifact removal method selection, requiring careful consideration of trade-offs between accuracy, speed, reliability, and ease of use [2]:
- Clinical Diagnostic Applications (e.g., epilepsy, Alzheimer's disease)
- Brain-Computer Interface (BCI) and Neurofeedback
- Neuromarketing and Ecological Studies
Artifact removal in high-density EEG presents challenges that differ fundamentally from those in conventional EEG systems. These differences stem from the complex spatial manifestation of artifacts across dense electrode arrays, the computational intensity of processing hundreds of channels, and the methodological limitations of approaches designed for lower-density systems. The specialized requirements of emerging applications like Mobile Brain/Body Imaging (MoBI) and sleep studies further complicate the artifact landscape, necessitating tailored solutions that account for movement, environmental interference, and recording duration.
Future progress in hd-EEG artifact management will likely focus on deep learning approaches that can adapt to unknown artifact types, real-time processing algorithms for BCI and neurofeedback applications, and standardized validation frameworks using microstate analysis and other brain dynamics preservation metrics. For researchers and drug development professionals, understanding these fundamental differences is crucial for designing robust studies, selecting appropriate artifact handling strategies, and accurately interpreting hd-EEG data in both basic and applied contexts.
Electroencephalography (EEG) is a fundamental tool in clinical and neuroscience research, providing non-invasive measurement of brain activity with high temporal resolution. A significant challenge in EEG analysis is the contamination of the neural signal by artifacts—extraneous electrical potentials originating from non-cerebral sources. These artifacts can obscure or mimic neurophysiological patterns, compromising the validity of scientific and clinical conclusions. This guide characterizes the primary categories of artifacts—ocular, muscular, cardiac, and motion-related—within the context of the specific challenges posed by high-density EEG systems. Effective artifact management is a critical preliminary step, as the choice of removal strategy involves significant trade-offs between signal fidelity, data integrity, and decoding performance [2] [9].
Artifacts in EEG signals are broadly categorized as physiological (originating from the subject's body) or non-physiological (originating from external sources). The following sections detail the primary physiological artifacts.
Table 1: Characteristics of Common Physiological Artifacts in EEG
| Artifact Type | Primary Sources | Spectral Band | Topographical Distribution | Amplitude Range | Morphology |
|---|---|---|---|---|---|
| Ocular | Eye blinks, vertical and horizontal eye movements [2] | Mainly low-frequency (< 4 Hz) [10] | Primarily anterior regions (frontal, prefrontal) [10] | High (50-100 μV for blinks) | Slow, monophasic (blinks) or diphasic (saccades) waves |
| Muscular | Muscle activity from jaw (chewing), head/neck movement, forehead (frowning) [11] [2] | High-frequency (> 20 Hz), can extend to 100+ Hz [10] | Focal, depends on muscle group; common in temporal and frontal regions [10] | Variable, often very high | High-frequency, spiky, non-rhythmic patterns |
| Cardiac | Electrical activity of the heart (ECG) [2] | ~1-2 Hz (pulse) | Widespread, but often prominent in channels near blood vessels | Low to moderate | Periodic, complex waveform synchronized with heartbeat |
| Motion | Head/body movement, cable sway, electrode displacement [11] | Broadband | Global or channel-specific | Very high, transient | Sudden, large-amplitude jumps or slow drifts |
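As a concrete illustration of how the spectral signatures in Table 1 translate into detection rules, the sketch below uses scipy's Welch estimator to flag an epoch as muscle-contaminated when most of its power lies above the EMG band edge. The 20 Hz cutoff and 0.5 threshold are illustrative choices, not validated parameters.

```python
import numpy as np
from scipy.signal import welch

def high_freq_ratio(x, fs, cutoff=20.0):
    """Fraction of spectral power above `cutoff` Hz (EMG pushes this up)."""
    f, pxx = welch(x, fs=fs, nperseg=fs)
    return pxx[f >= cutoff].sum() / pxx.sum()

rng = np.random.default_rng(4)
fs = 250
t = np.arange(0, 4, 1 / fs)

clean = np.sin(2 * np.pi * 10 * t) + 0.2 * rng.normal(size=t.size)  # alpha-dominated
emg = clean + 2.0 * rng.normal(size=t.size)                         # broadband burst

# A simple screen: flag epochs whose high-frequency power ratio is excessive.
flagged = [high_freq_ratio(x, fs) > 0.5 for x in (clean, emg)]
```

Real pipelines combine such spectral criteria with the topographical cues listed in the table, since isolated high-frequency power can also reflect genuine gamma activity.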
Rigorous experimental design and data preprocessing are prerequisites for reliable artifact characterization and removal. The following protocols are essential for robust analysis.
Standardized data preparation ensures consistency and reproducibility. A recommended workflow, adapted from studies using public datasets like the Temple University Hospital (TUH) EEG Corpus, involves several key stages of signal conditioning and annotation [10].
A multiverse analysis of preprocessing choices reveals that these decisions have a profound impact on downstream decoding performance [9].
These findings highlight a critical trade-off: preprocessing steps that maximize decoding performance may do so by allowing the classifier to exploit structured noise, thereby threatening the interpretability and validity of the model [9].
High-Density EEG Artifact Management Workflow
Table 2: Essential Tools for EEG Artifact Research
| Tool/Solution | Function | Example Use-Case |
|---|---|---|
| Public Datasets | Provides expert-annotated, real-world data for algorithm development and benchmarking. | Temple University Hospital (TUH) EEG Corpus [10] |
| Independent Component Analysis (ICA) | A blind source separation method that identifies statistically independent components, which can be manually or automatically classified as neural or artifactual. | Isolation and removal of ocular and cardiac artifacts [11] |
| Automated Statistical Methods (e.g., FASTER) | Provides a rule-based framework for automatic artifact detection in multi-channel datasets. | Rapid, automated screening of epochs for multiple artifact types [10] |
| Artifact Subspace Reconstruction (ASR) | An online-capable method that removes high-variance signal components exceeding statistical thresholds from the data. | Handling of motion and instrumental artifacts in wearable EEG [11] |
| Convolutional Neural Networks (CNNs) | Deep learning models that can be trained to detect specific artifact classes from raw EEG signals with high sensitivity and specificity. | Specialized detection of eye movement, muscle, and non-physiological artifacts [10] |
| Auxiliary Sensors (EOG, EMG, ECG) | Provide reference signals for physiological artifacts, enhancing the performance of regression and adaptive filtering techniques. | Direct recording of eye (EOG) and muscle (EMG) activity for use as a noise reference [2] |
The effective characterization and management of ocular, muscular, cardiac, and motion artifacts are paramount for the integrity of high-density EEG research. Each artifact type possesses distinct spatial, temporal, and spectral signatures, necessitating tailored identification and removal strategies. While advanced techniques like ICA and deep learning offer powerful solutions, researchers must critically evaluate the trade-offs involved, particularly the risk that enhancing decoding performance may come at the cost of biological interpretability. A rigorous, systematic approach to artifact management, as outlined in this guide, is therefore an indispensable component of the EEG research workflow.
The advent of high-density electroencephalography (HD-EEG), particularly 256-channel systems, has revolutionized neuroscientific research and clinical diagnostics by offering unprecedented spatial resolution for mapping brain dynamics. This technological advancement enables researchers to capture nuanced cortical activity that lower-density systems might miss [12]. However, the transition to dense electrode arrays, especially in long-duration overnight recordings, generates a data deluge of exceptional magnitude, introducing profound computational and practical challenges that threaten to outpace current analytical capabilities.
When deployed for overnight sleep studies, these systems produce massive datasets that strain storage infrastructure, processing pipelines, and analytical methods. The core challenge lies not merely in handling the data volume but in effectively distinguishing neural signals from the complex artifact contamination that inevitably accumulates during extended recording sessions [13]. This article examines the specific hurdles posed by 256-channel overnight recordings and details the advanced methodologies being developed to transform this data wealth into neuroscientific insight.
The data generation capacity of 256-channel EEG systems operating continuously through an 8-hour sleep period creates unprecedented computational demands. Understanding these fundamental scaling relationships is crucial for planning and implementing successful research infrastructure.
Table 1: Data Generation and Computational Demands of 256-Channel Overnight EEG
| Parameter | Specification | Practical Implication |
|---|---|---|
| Recording Duration | 8 hours (overnight) | Captures full sleep architecture with sufficient cycles for analysis |
| Typical Sampling Rate | 256 - 1000 Hz | Balances temporal resolution with manageable file sizes |
| Estimated Data Volume | ~150 - 600 GB per recording | Requires substantial storage solutions and efficient data transfer protocols |
| Peak Memory Usage (Processing) | ~128 GB RAM (for 251-ch, 250 Hz) [13] | Necessitates high-performance computing (HPC) nodes for analysis |
| Processing Runtime (Cleaning) | ~45 minutes (scaling with channel count and recording length) [13] | Limits iterative analysis and demands efficient, automated pipelines |
The computational burden extends beyond simple storage. Processing and analyzing these datasets requires specialized hardware and software architectures capable of handling the high-dimensional data structures inherent to HD-EEG. For instance, a 251-channel recording sampled at 250 Hz can require approximately 128 GB of RAM for processing, with runtimes around 45 minutes for cleaning operations on a standard 4-core machine—figures that scale roughly proportionally with channel count and recording length [13]. This creates a significant bottleneck for researchers needing to process multiple datasets.
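The arithmetic behind these figures is straightforward. The snippet below computes the raw in-memory footprint of the 251-channel example as a single float64 copy; pipelines that additionally hold filtered versions, decomposition matrices, and intermediate buffers multiply this several times over, which is how peak usage can approach 128 GB.

```python
# Raw in-memory footprint of one overnight hd-EEG recording held as float64.
channels, rate_hz, hours = 251, 250, 8
samples = rate_hz * hours * 3600                 # samples per channel
gb = channels * samples * 8 / 1e9                # 8 bytes per float64 value

print(f"{gb:.1f} GB for a single copy of the data")   # prints "14.5 GB ..."
```

Doubling either the channel count or the sampling rate doubles this figure, which is why the table's scaling caveats matter for study planning.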
The uncontrolled environment of overnight sleep studies, combined with the high channel count, introduces artifact types with specific features that complicate the cleaning process. These artifacts exhibit distinct spatial, temporal, and spectral characteristics requiring tailored detection strategies [14].
Table 2: Characteristic Artifacts in 256-Channel Overnight EEG Recordings
| Artifact Category | Specific Types | Key Characteristics & Challenges |
|---|---|---|
| Physiological | Eye movements/blinks, sweat artifacts, muscle twitches, large body movements, arousals, cardiac/pulse activity, respiration, swallowing [13] | Frequency overlap with neural signals; spatially evolving patterns; myogenic artifacts from head/neck muscles; pulsatile artifacts from cardiac cycle. |
| Technical/Environmental | Electrode popping, signal discontinuities, amplifier saturation/disconnection, electrolyte evaporation/bridging [13] | Often channel-specific, requiring localized detection rather than global rejection. Can be intermittent and hard to distinguish from neural bursts. |
| Motion-Related | Head shifts, gross body movements, electrode displacement [14] | High-amplitude, broadband signals affecting multiple channels. Particularly problematic in wearable HD-EEG with dry electrodes. |
A critical challenge is that artifacts are often expressed in only a subset of channels or for limited time periods, making the complete rejection of channels or epochs wasteful and scientifically costly [13]. This reality necessitates channel-wise and time-resolved artifact handling approaches that preserve valuable neural data.
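One simple way to implement such channel- and time-resolved handling is to score every (channel, epoch) cell against that channel's own variance distribution. The numpy sketch below uses synthetic data with an artifact injected into one channel, and an arbitrary robust z threshold; it flags only the affected cells rather than discarding whole channels or epochs.

```python
import numpy as np

rng = np.random.default_rng(5)
n_ch, n_epochs, epoch_len = 16, 100, 500
data = rng.normal(size=(n_ch, n_epochs, epoch_len))

# Inject a localized artifact: one channel, three epochs, 8x amplitude.
data[3, 10:13] *= 8.0

# Robust z-score of log epoch variance, computed per channel, so each
# channel is judged against its own baseline rather than a global one.
logvar = np.log(data.var(axis=-1))                       # shape (n_ch, n_epochs)
med = np.median(logvar, axis=1, keepdims=True)
mad = np.median(np.abs(logvar - med), axis=1, keepdims=True)
z = (logvar - med) / (1.4826 * mad)

bad = z > 5.0      # (channel, epoch) cells to repair, e.g. by interpolation
```

The resulting mask maps directly onto the repair-by-interpolation strategy used by sleep toolboxes, preserving the unflagged majority of the data.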
Novel deep learning architectures are showing remarkable success in tackling artifact removal in multichannel EEG, overcoming limitations of traditional methods like Independent Component Analysis (ICA), which often require manual intervention and perform poorly with low channel counts [14] [15].
CLEnet: This dual-branch network integrates dual-scale Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, supplemented by an improved attention mechanism (EMA-1D). It simultaneously extracts morphological features and temporal features from EEG signals, enabling effective separation of neural data from various artifacts, including unknown types in multi-channel data. CLEnet has demonstrated performance improvements of 2.45% in SNR and 2.65% in cross-correlation over other models [15].
ART (Artifact Removal Transformer): Leveraging transformer architecture, this end-to-end model captures transient millisecond-scale dynamics characteristic of EEG signals. Trained on pseudo clean-noisy data pairs generated via ICA, ART effectively removes multiple artifact sources simultaneously and has shown superior performance in restoring multichannel EEG signals, significantly improving Brain-Computer Interface (BCI) performance [16].
TCN-Based IED Detectors: For specific applications like epilepsy research, Temporal Convolutional Network (TCN)-based systems combined with novel decision mechanisms have been developed to identify Interictal Epileptiform Discharges (IEDs) while distinguishing them from artifacts. These systems achieve high precision with low false-positive rates (0.194/min), which is crucial for clinical applications [17].
Dedicated software toolboxes are emerging to address the practical needs of processing overnight HD-EEG data, offering automated cleaning with extensive customization options.
SleepTrip, a free Matlab-based toolbox, provides a flexible approach for automated cleaning of multichannel sleep recordings. Its key functionality includes channel-wise detection of various artifact types, channel- and time-resolved marking of data segments for repair through interpolation, and visualization options to review performance. As part of the FieldTrip ecosystem, it repurposes established functions while adding sleep-specific capabilities, supporting efficient processing of large-scale overnight datasets [13].
Other tools like Luna and High-Density-SleepCleaner offer complementary approaches, with Luna providing fully automated epoch- and channel-resolved flagging of outliers, and High-Density-SleepCleaner offering important visualization and epoch-wise interpolation options, though it requires more user interaction [13].
HD-EEG Processing Pipeline
Rigorous validation of artifact removal techniques requires specialized experimental protocols and benchmark datasets. The following methodologies represent current best practices:
This approach involves combining clean EEG recordings with real artifact signals (EOG, EMG, ECG) in controlled ratios to create datasets with known ground truth. Zhang et al. developed a semi-synthetic benchmark dataset specifically for removing EMG and EOG artifacts, enabling standardized comparison of different algorithms [15]. Protocols typically involve scaling each artifact segment to a prescribed signal-to-noise ratio before adding it to the clean recording.
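The controlled-ratio mixing step can be sketched as follows. The sinusoidal "clean EEG" and Gaussian "EMG" stand-ins are placeholders for real recorded segments; the λ scaling follows the usual decibel SNR definition.

```python
import numpy as np

def contaminate(clean, noise, snr_db):
    """Mix a clean EEG segment with an artifact at a prescribed SNR (dB)."""
    lam = np.std(clean) / (np.std(noise) * 10 ** (snr_db / 20))
    return clean + lam * noise

rng = np.random.default_rng(6)
clean = np.sin(2 * np.pi * 10 * np.arange(0, 2, 1 / 250))  # stand-in "EEG"
emg = rng.normal(size=clean.size)                          # stand-in artifact

noisy = contaminate(clean, emg, snr_db=-3.0)               # heavy contamination

# The realized SNR can be verified against the target because the
# ground-truth clean segment is known by construction.
realized = 20 * np.log10(np.std(clean) / np.std(noisy - clean))
```

Sweeping `snr_db` over a range (e.g., -7 to +2 dB) yields the graded benchmark conditions under which denoising algorithms are typically compared.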
For evaluating performance on genuine overnight recordings, researchers employ multi-expert annotation protocols, in which independently scored artifact markings are consolidated into a reference standard.
Quantitative evaluation employs multiple complementary metrics covering both artifact suppression and preservation of the underlying neural signal.
Successfully managing 256-channel overnight recordings requires a suite of specialized software and hardware solutions.
Table 3: Essential Tools for HD-EEG Artifact Research
| Tool Category | Specific Examples | Function & Application |
|---|---|---|
| Specialized Software Toolboxes | SleepTrip [13], Luna [13], High-Density-SleepCleaner [13] | Provide automated, channel-wise artifact detection and repair functions specifically designed for sleep EEG. |
| Deep Learning Frameworks | CLEnet [15], ART (Artifact Removal Transformer) [16] | End-to-end artifact removal using advanced neural architectures for multi-channel data. |
| Benchmark Datasets | EEGdenoiseNet [15], MLSPred-Bench [18] | Standardized datasets for training and validating artifact removal algorithms and seizure prediction models. |
| High-Performance Computing Infrastructure | HPC clusters with >128 GB RAM per node [13] | Essential for processing large overnight HD-EEG datasets within feasible timeframes. |
HD-EEG System Architecture
The data deluge from 256-channel overnight EEG recordings presents a formidable but surmountable challenge. Success requires integrated approaches combining advanced computational infrastructure, sophisticated algorithms, and specialized software tools. Emerging deep learning architectures show particular promise for handling the complexity and scale of this data, offering automated, end-to-end solutions for artifact management.
Future progress will depend on developing more efficient computational methods, creating standardized benchmarking datasets, and improving the accessibility of specialized toolboxes. As these technologies mature, they will unlock the full potential of HD-EEG for understanding brain function during sleep, ultimately advancing both basic neuroscience and clinical applications in epilepsy, sleep disorders, and cognitive research. The artifact removal challenges in high-density EEG research are not merely technical obstacles but opportunities for innovation that will shape the next generation of neuroscientific discovery.
In high-density electroencephalography (HD-EEG), the intricate interplay between neural signals and artifacts presents a fundamental analytical challenge. The core of this problem lies in the spectral and spatial overlap where artifactual components occupy the same frequency bands and topographic distributions as neurophysiologically relevant brain activity [11]. This convergence severely complicates the process of distinguishing brain-derived signals from non-neural contaminants, thereby undermining the reliability of both clinical and research applications.
Artifacts in HD-EEG originate from multiple sources. Physiological artifacts, such as ocular movements (EOG), muscle activity (EMG), and cardiac rhythms (ECG), exhibit characteristic signatures that often blend with neural oscillations [7] [19]. Conversely, non-physiological artifacts include motion-related disturbances, electrode displacement, and environmental electromagnetic interference, which become particularly pronounced in mobile and real-world recording scenarios [11] [20]. The transition towards wearable EEG systems with dry electrodes and reduced scalp coverage further intensifies these challenges by increasing susceptibility to motion artifacts and limiting the effectiveness of traditional source separation techniques that rely on high channel counts [11].
Addressing this overlap is not merely a technical exercise but a prerequisite for advancing HD-EEG applications in neurological disorder diagnosis, cognitive neuroscience, and pharmaceutical drug development. The following sections provide a technical examination of the artifact removal landscape, evaluating traditional and modern computational approaches, detailing experimental protocols, and presenting quantitative performance comparisons to guide methodological selection.
Traditional methodologies often rely on blind source separation (BSS) to leverage the spatial resolution of HD-EEG. These algorithms project multi-channel recordings into components that are maximally independent, enabling the manual or semi-automated identification and rejection of artifactual sources.
Table 1: Performance Metrics of Traditional Blind Source Separation (BSS) Techniques
| Method | Key Principle | Best For | Limitations | Reported Performance (SCC/ED) |
|---|---|---|---|---|
| ICA | Statistical independence of sources | Ocular, cardiac artifacts in high-density systems | Fails with high-amplitude motion artifacts; requires many channels | N/A [11] |
| VMD-BSS | Signal decomposition into intrinsic mode functions | Handling non-stationary signals, ocular artifacts | Parameter selection (K-modes) is critical for performance | SCC: 0.82, ED: 704.04 [21] |
| DWT-BSS | Time-frequency decomposition via wavelets | Muscle and ocular artifacts with distinct spectral signatures | Choice of mother wavelet impacts outcomes | SCC: 0.82, ED: 703.64 [21] |
| ASR | PCA-based rejection of high-variance components | Real-time motion artifact removal in mobile EEG | Sensitive to calibration data; risk of over/under-cleaning | Improved ICA dipolarity at k=20 [20] |
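To illustrate the variance-thresholding principle behind ASR, the following is a deliberately simplified, offline sketch: real ASR operates on sliding windows and reconstructs rather than zeroes components, and the synthetic calibration data here stand in for a manually selected clean baseline. The k = 20 cutoff echoes the calibration value cited in Table 1.

```python
import numpy as np

rng = np.random.default_rng(7)
n_ch = 8

# Calibration: artifact-free data defines per-component variance baselines.
calib = rng.normal(size=(5000, n_ch))
cov = np.cov(calib, rowvar=False)
evals, evecs = np.linalg.eigh(cov)
baseline_std = np.sqrt(evals)

# Test segment: clean background plus a large artifact in one spatial direction.
segment = rng.normal(size=(500, n_ch))
artifact_dir = evecs[:, -1]
segment += 30.0 * np.outer(rng.normal(size=500), artifact_dir)

# ASR-style rule (simplified): project onto calibration components and zero
# any component whose amplitude exceeds k times its calibration baseline.
k = 20.0
proj = segment @ evecs
bad = proj.std(axis=0) > k * baseline_std
proj[:, bad] = 0.0
cleaned = proj @ evecs.T
```

The method's noted sensitivity to calibration data is visible here: everything hinges on `baseline_std` being estimated from genuinely clean segments.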
Deep Learning (DL) models represent a paradigm shift, learning to map artifact-contaminated EEG to clean signals in an end-to-end fashion, thereby overcoming many limitations of BSS.
Table 2: Quantitative Performance of Deep Learning Artifact Removal Models
| Model | Architecture | Artifact Types | Key Metrics | Reported Results |
|---|---|---|---|---|
| CLEnet [7] | Dual-scale CNN + LSTM + EMA-1D | EMG, EOG, ECG, Mixed, Unknown | SNR (dB), CC, RRMSEt, RRMSEf | SNR: 11.50, CC: 0.93, RRMSEt: 0.30 (Mixed) |
| AnEEG [19] | LSTM-based GAN | Ocular, Muscle, Environmental | NMSE, RMSE, CC, SNR, SAR | Lower NMSE/RMSE, higher CC/SNR/SAR vs. Wavelet |
| GCTNet [19] | GAN + CNN + Transformer | EMG, EOG | RRMSE, SNR | RRMSE ↓ 11.15%, SNR ↑ 9.81 dB |
| iCanClean [20] | Canonical Correlation Analysis | Motion (Gait, Running) | ICA Dipolarity, P300 Effect | More dipolar components; restored P300 congruency effect |
| 1D-ResCNN [7] | 1D Residual CNN | EMG, EOG | SNR (dB), CC | SNR: ~10.50, CC: ~0.90 (Benchmark) |
Motion artifacts present a unique challenge due to their high amplitude and complex, non-stationary characteristics. Direct comparisons of iCanClean and ASR during overground running highlight their specialized utility. iCanClean, especially when used with pseudo-reference noise signals, was somewhat more effective than ASR at reducing power at the gait frequency and its harmonics and was the only method to successfully identify the expected P300 congruency effect in an adapted Flanker task [20] [22]. This makes it a superior choice for cognitive neuroscience studies involving whole-body movement.
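The shared-subspace idea underlying such reference-aided cleaning can be sketched with plain numpy. This is a generic CCA-plus-regression illustration on synthetic data, not the published iCanClean algorithm; the 0.8 correlation cutoff, channel counts, and linear leakage model are all arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(8)
n, n_eeg, n_ref = 4000, 8, 3

motion = rng.normal(size=(n, n_ref))                 # latent motion sources
refs = motion + 0.1 * rng.normal(size=(n, n_ref))    # imperfect noise references
neural = rng.normal(size=(n, n_eeg))
eeg = neural + 5.0 * motion @ rng.normal(size=(n_ref, n_eeg))

def canonical_variates(X, Y):
    """EEG-side canonical variates and canonical correlations via SVD."""
    Ux, _, _ = np.linalg.svd(X - X.mean(0), full_matrices=False)
    Uy, _, _ = np.linalg.svd(Y - Y.mean(0), full_matrices=False)
    A, r, _ = np.linalg.svd(Ux.T @ Uy, full_matrices=False)
    return Ux @ A, r

u, r = canonical_variates(eeg, refs)
shared = u[:, r > 0.8]                    # components the references explain well
beta, *_ = np.linalg.lstsq(shared, eeg, rcond=None)
cleaned = eeg - shared @ beta             # regress the shared subspace out
```

Because only components strongly correlated with the noise references are removed, neural activity uncorrelated with the references is largely untouched — the property that lets such methods preserve ERP effects during locomotion.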
Rigorous evaluation of artifact removal pipelines requires datasets where the ground-truth, clean EEG is known. A common and robust protocol involves creating semi-synthetic datasets [7] [19].
For evaluating motion artifacts during running, a protocol adapted from [20] involves:
The field employs a standardized set of metrics to compare algorithms objectively [7] [21] [19].
Table 3: Key Metrics for Evaluating Artifact Removal Efficacy
| Metric | Formula / Principle | Interpretation | Ideal Value |
|---|---|---|---|
| Correlation Coefficient (CC) | ( CC = \frac{\text{cov}(S, \hat{S})}{\sigma_S \, \sigma_{\hat{S}}} ) | Linearity between cleaned (Ŝ) and ground-truth (S) signal | Closer to 1.0 |
| Relative RMSE (Temporal) | ( \text{RRMSE}_t = \sqrt{\frac{\sum(S - \hat{S})^2}{\sum S^2}} ) | Magnitude of temporal reconstruction error | Closer to 0 |
| Relative RMSE (Spectral) | ( \text{RRMSE}_f = \sqrt{\frac{\sum(P_S - P_{\hat{S}})^2}{\sum P_S^2}} ) | Magnitude of spectral distortion | Closer to 0 |
| Signal-to-Noise Ratio (SNR) | ( \text{SNR} = 10 \log_{10}\left(\frac{P_{\text{signal}}}{P_{\text{noise}}}\right) ) | Ratio of signal power to noise power | Higher is better |
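The metrics in Table 3 translate directly into code. A minimal NumPy sketch follows, assuming 1-D ground-truth and cleaned signals; taking the residual S − Ŝ as the noise term for SNR and the FFT periodogram as the power spectrum are assumptions of this sketch, since papers vary in those details.

```python
import numpy as np

def cc(s, s_hat):
    """Correlation coefficient between ground truth and cleaned signal."""
    return np.corrcoef(s, s_hat)[0, 1]

def rrmse_t(s, s_hat):
    """Relative RMSE in the time domain."""
    return np.sqrt(np.sum((s - s_hat) ** 2) / np.sum(s ** 2))

def rrmse_f(s, s_hat):
    """Relative RMSE between power spectra (FFT periodogram here)."""
    p_s = np.abs(np.fft.rfft(s)) ** 2
    p_hat = np.abs(np.fft.rfft(s_hat)) ** 2
    return np.sqrt(np.sum((p_s - p_hat) ** 2) / np.sum(p_s ** 2))

def snr_db(s, s_hat):
    """SNR in dB, treating the residual (s - s_hat) as noise."""
    return 10 * np.log10(np.sum(s ** 2) / np.sum((s - s_hat) ** 2))
```

For a perfect reconstruction, `cc` is 1 and both RRMSE values are 0; a residual equal to 10% of the signal yields exactly 20 dB SNR.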
Table 4: Key Resources for Advanced EEG Artifact Removal Research
| Resource Category | Specific Example / Tool | Primary Function in Research |
|---|---|---|
| Public Datasets | EEGdenoiseNet [7] | Provides semi-synthetic data with ground truth for benchmarking model performance on EMG, EOG, and ECG artifacts. |
| Public Datasets | NeurIPS 2025 EEG Foundation Dataset [23] | Large-scale, high-density (128-channel) dataset from 3,000+ participants across six cognitive tasks for cross-task validation. |
| Software & Algorithms | ICALabel [20] | Automated classification of ICA components into brain, ocular, muscle, and other sources. |
| Software & Algorithms | Artifact Subspace Reconstruction (ASR) [20] | Real-time, PCA-based bad segment removal; available in EEGLAB plugins. |
| Software & Algorithms | iCanClean Routines [20] | Code packages for implementing CCA-based motion artifact correction with pseudo-reference signals. |
| Deep Learning Models | CLEnet, AnEEG, GCTNet [7] [19] | Pre-trained or open-source architectures for end-to-end artifact removal, providing a starting point for transfer learning. |
| Hardware Solutions | Dry Electrode Headsets [11] [24] | Enable recordings in ecological settings but introduce specific artifacts that algorithms must address. |
| Hardware Solutions | Dual-Layer Noise Sensors [20] | Provide ideal reference noise signals for algorithms like iCanClean, though pseudo-references can be derived from EEG. |
The challenge of spectral and spatial overlap in HD-EEG is being met with increasingly sophisticated computational strategies. While traditional BSS methods like ICA and wavelet transforms remain effective for specific, well-defined artifacts in controlled settings, the field is rapidly advancing toward data-driven deep learning models such as CLEnet and GANs. These models show superior performance in handling complex, unknown, and mixed artifacts, especially in the low-density configurations common in wearable devices [11] [7].
Future progress hinges on several key developments: the creation of larger, standardized, and publicly available benchmark datasets [23]; the refinement of hybrid models that combine the interpretability of BSS with the power of DL; and the optimization of algorithms for real-time processing in clinical monitoring and brain-computer interfaces. For researchers in drug development and cognitive neuroscience, adopting these advanced artifact removal protocols is no longer optional but essential for ensuring that the neural signals underlying cognitive processes and therapeutic effects are accurately isolated and measured.
Electroencephalography (EEG), particularly high-density EEG (HD-EEG) with dozens to hundreds of electrodes, is a vital tool for non-invasive brain monitoring in clinical and research settings [25] [26]. The analysis of these signals relies heavily on the quality of the recorded data, as the presence of unwanted artifacts poses a significant threat to the validity of both automated feature extraction and final clinical interpretation. Artifacts are any recorded signals that do not originate from neural activity, and because EEG signals are inherently weak (measured in microvolts), they are highly susceptible to contamination from various sources [27]. These artifacts can distort or entirely obscure genuine neural signals, leading to inaccurate feature extraction in automated analysis pipelines and, ultimately, to clinical misdiagnosis or flawed research conclusions [28] [27]. This guide examines the risks artifacts pose to downstream analysis, framed within the broader challenge of artifact removal in HD-EEG research.
EEG artifacts are broadly categorized by their origin. The following table summarizes the major types, their characteristics, and their direct impact on downstream analysis.
Table 1: Physiological Artifacts and Their Impact on Analysis
| Artifact Type | Origin & Cause | Key Features in Signal | Impact on Feature Extraction & Clinical Interpretation |
|---|---|---|---|
| Ocular (EOG) | Eye blinks and movements creating a corneo-retinal potential [27]. | High-amplitude, slow deflections, maximal over frontal electrodes [27]. | Feature Risk: Inflates power in delta/theta bands, mimicking cognitive processes [27]. Clinical Risk: Misinterpreted as frontal slow waves, indicative of encephalopathy or epileptiform activity. |
| Muscle (EMG) | Contractions of facial, jaw, or neck muscles [27]. | High-frequency, broadband (20-300 Hz), non-stationary noise [27]. | Feature Risk: Obscures genuine beta/gamma oscillations, crucial for motor and cognitive studies [29] [27]. Clinical Risk: Masks spike-wave complexes or is misinterpreted as epileptic high-frequency oscillations. |
| Cardiac (ECG/BCG) | Electrical activity of the heart or pulsatile head movement in EEG-fMRI [30] [27]. | Rhythmic, spike-like waveforms synchronized with the heartbeat. | Feature Risk: Introduces periodic, non-neural spikes that corrupt time-domain features. Clinical Risk: Misidentified as epileptic spikes, leading to false lateralization or focus localization. |
| Perspiration/Respiratory | Sweat altering impedance; chest/head movement during breathing [27]. | Very slow, drifting baselines; rhythmic waveforms at respiration rate. | Feature Risk: Contaminates low-frequency delta bands, vital for sleep and coma studies. Clinical Risk: Obscures pathological slow activity or induces false "burst-suppression" patterns. |
Table 2: Non-Physiological (Technical) Artifacts and Their Impact on Analysis
| Artifact Type | Origin & Cause | Key Features in Signal | Impact on Feature Extraction & Clinical Interpretation |
|---|---|---|---|
| Electrode "Pop" | Sudden change in electrode-skin impedance [27]. | Abrupt, high-amplitude transient, often isolated to a single channel [27]. | Feature Risk: Creates extreme outliers that skew statistical features and machine learning models. Clinical Risk: Closely mimics a true epileptic spike, leading to false positive identification of interictal epileptiform discharges (IEDs) [27]. |
| Cable Movement | Motion of electrode cables causing electromagnetic interference [27]. | High-amplitude, irregular deflections or rhythmic waveforms if movement is periodic. | Feature Risk: Generates high-power, non-stationary noise across a broad frequency range. Clinical Risk: Rhythmic movement can mimic alpha or mu rhythms; irregular bursts can be mistaken for seizure activity. |
| AC Power Line | Electromagnetic interference from mains electricity (50/60 Hz) [27]. | Persistent, high-amplitude narrowband noise at 50/60 Hz and its harmonics. | Feature Risk: Dominates the gamma range and higher frequencies, rendering them uninterpretable. Clinical Risk: Can obscure high-frequency activity of interest, such as fast ripples in epilepsy. |
| Poor Reference | Incorrect placement or high impedance at the reference electrode [27]. | Drift, noise, or abnormal signals present across all recording channels. | Feature Risk: Renders all recorded potentials invalid, as the fundamental measurement reference is corrupted. Clinical Risk: Creates a globally abnormal recording, preventing any reliable clinical interpretation. |
Robust artifact handling requires a multi-stage pipeline, from acquisition to advanced signal processing. The following experimental protocols are critical for ensuring data integrity.
The first line of defense against artifacts is high-quality data acquisition [31]. This includes:
Preprocessing is the next critical step. Bandpass filtering (e.g., 0.5–70 Hz) is standard to remove slow drifts and high-frequency noise. A *notch filter* at 50/60 Hz can be applied to suppress line noise, though it may distort data, making advanced techniques like Zapline sometimes preferable [29].
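The bandpass-plus-notch step described above can be sketched with SciPy. The specific parameters here (a sampling rate of 500 Hz, a 4th-order Butterworth design, and a notch quality factor Q = 30) are illustrative assumptions, not values prescribed by the text.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

fs = 500.0  # sampling rate in Hz (assumed for this example)

# 4th-order Butterworth bandpass, 0.5-70 Hz
b_bp, a_bp = butter(4, [0.5, 70.0], btype="bandpass", fs=fs)

# 50 Hz notch for line noise (use 60 Hz where the mains frequency differs)
b_n, a_n = iirnotch(50.0, Q=30.0, fs=fs)

def preprocess(x):
    """Zero-phase bandpass, then notch, as in the standard pipeline."""
    return filtfilt(b_n, a_n, filtfilt(b_bp, a_bp, x))
```

`filtfilt` applies each filter forward and backward so the result is zero-phase, avoiding latency shifts in event-related analyses; as the text notes, the notch may distort nearby frequencies, which is why spatial-filter alternatives like Zapline are sometimes preferred.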
For the complex task of separating artifacts from brain signals, advanced decomposition methods are employed.
Wavelet Transform: This method is particularly effective for non-stationary artifacts like muscle pops and electrode noise. It decomposes the signal into different frequency bands at multiple resolutions, allowing for the targeted removal of artifacts in specific time-frequency regions without affecting the entire signal [29] [28]. The protocol involves:
Empirical Mode Decomposition (EMD) and Variants: EMD adaptively decomposes non-linear and non-stationary signals like EEG into intrinsic mode functions (IMFs). Artifact-affected IMFs can be identified and removed. Advanced variants like Ensemble EMD (EEMD) and Self-Adaptive Multivariate EMD (SA-MEMD) have been developed to overcome mode mixing and improve performance for multi-channel EEG [28].
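The wavelet protocol above can be illustrated with a minimal, hand-rolled single-level Haar transform; this is a sketch only, since real pipelines typically use multi-level decompositions (e.g., via PyWavelets) with data-driven thresholds and carefully chosen mother wavelets.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar transform: approximation and detail coefficients."""
    x = x[: len(x) // 2 * 2]                # truncate to even length
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def haar_idwt(a, d):
    """Inverse one-level Haar transform (perfect reconstruction)."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def wavelet_denoise(x, thresh):
    """Soft-threshold the detail coefficients, then reconstruct.

    Transient artifacts (pops, muscle bursts) concentrate in the detail
    band, so shrinking those coefficients attenuates them while the
    slowly varying signal passes through the approximation band."""
    a, d = haar_dwt(x)
    d = np.sign(d) * np.maximum(np.abs(d) - thresh, 0.0)
    return haar_idwt(a, d)
```

With a zero threshold the transform reconstructs the input exactly, which is the property that lets thresholding act only on the targeted time-frequency regions.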
Table 3: Comparison of Key Artifact Removal Methods
| Method | Underlying Principle | Best For | Key Advantages | Key Limitations |
|---|---|---|---|---|
| ICA | Statistical independence of sources [28]. | Ocular, cardiac, and persistent EMG artifacts. | Does not require reference channels; preserves neural activity well. | Requires many channels; struggles with non-stationary, sporadic artifacts; risk of removing neural data. |
| Wavelet Transform | Time-frequency decomposition [29]. | Short-duration, transient artifacts (e.g., pops). | Excellent for localizing artifacts in time and frequency. | Choice of wavelet and threshold is critical and can be subjective. |
| EMD/EEMD | Adaptive, data-driven decomposition into IMFs [28]. | Non-linear and non-stationary signals. | Fully data-driven, no pre-defined basis required. | Prone to mode mixing (standard EMD); computationally intensive. |
After processing, validation is essential. This involves:
The following diagram illustrates a comprehensive artifact mitigation workflow.
Table 4: Essential Research Tools for HD-EEG Artifact Management
| Tool / Solution | Category | Primary Function | Application in Workflow |
|---|---|---|---|
| Ag/AgCl Electrodes | Hardware | High-fidelity signal acquisition from scalp. | Data Acquisition: Provides stable, low-noise electrical contact [29]. |
| Auxiliary EOG/ECG Electrodes | Hardware | Records eye movement and heart signals. | Data Acquisition: Provides reference channels for physiological artifact removal [27]. |
| Digitizer (e.g., Fastrak) | Hardware | Records 3D electrode positions. | Data Acquisition: Enables precise co-registration with MRI for source localization [25]. |
| Boundary Element Model (BEM) | Computational Model | Models electrical conductivity of head tissues. | Source Analysis: Creates a realistic forward model for Electrical Source Imaging (ESI) [25]. |
| Independent Component Analysis (ICA) | Algorithm | Blind source separation. | Artifact Decomposition: Identifies and isolates artifactual sources from neural data [28] [27]. |
| Wavelet Toolbox (e.g., DWT) | Algorithm | Time-frequency analysis and denoising. | Artifact Removal: Targets and removes transient artifacts in specific time-frequency bins [29] [28]. |
| sLORETA/eLORETA | Algorithm | Distributed source localization. | Downstream Analysis: Estimates the origin of neural activity from scalp potentials [25]. |
| Bidirectional LSTM (BiLSTM) | Algorithm | Deep learning for sequence modeling. | Downstream Analysis: Classifies brain states (e.g., stress, sleep stages) from temporal EEG features [32]. |
The path from raw HD-EEG recording to reliable clinical interpretation is fraught with challenges posed by artifacts. These unwanted signals directly threaten the integrity of automated feature extraction and can lead to profound clinical misdiagnosis, such as the confusion of a simple electrode pop for an epileptic spike [27]. Mitigating these risks requires a rigorous, multi-layered methodology that combines meticulous data acquisition with sophisticated signal processing techniques like ICA and wavelet analysis. As HD-EEG continues to grow in clinical and research importance, the development and rigorous application of robust, validated artifact handling protocols will be paramount to ensuring that downstream analyses and interpretations are based on genuine brain activity, not deceptive contaminants.
High-density electroencephalography (hdEEG) provides unparalleled insight into human brain dynamics, yet its interpretation is fundamentally hampered by biological and non-biological artifacts. These unwanted signals—from ocular movements, muscle activity, cardiac rhythms, and motion—often overshadow neural sources of interest, particularly in naturalistic experimental paradigms [33]. The challenge is especially pronounced in mobile brain/body imaging (MoBI) studies where head motion during whole-body movements produces artifacts that contaminate the EEG and reduce the quality of subsequent analysis [20]. Blind Source Separation (BSS) approaches, particularly Independent Component Analysis (ICA), have emerged as powerful computational strategies for attenuating these artifacts while preserving neural information [33] [34]. This technical guide examines the core principles, methodological implementations, and experimental validation of ICA as a cornerstone technique for addressing the critical challenge of artifact removal in hdEEG research.
Independent Component Analysis is a specific embodiment of Blind Source Separation that operates on the principle of statistical independence [34]. The fundamental model assumes that recorded EEG signals represent linear mixtures of underlying source activities. Mathematically, this relationship is expressed as:
X = AS
Where X is the recorded data matrix (electrodes × time points), A is the mixing matrix (representing how sources project to sensors), and S contains the time courses of the independent sources [35]. The computational goal of ICA is to find an unmixing matrix W such that:
WX = S
The ideal outcome is that the estimated sources S contain maximally independent components, which can then be classified as neural signals or artifacts [35]. The independence criterion in ICA is stronger than mere uncorrelatedness; it requires that the joint probability distribution of the components factorizes into the product of their marginal distributions, encompassing all moments of the distributions, not just covariance [35]. This is typically achieved by optimizing measures of non-Gaussianity such as kurtosis, under the assumption that neural sources generate signals with non-Gaussian distributions [35].
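The non-Gaussianity rationale can be checked numerically: under the model X = AS, linear mixtures are closer to Gaussian than the underlying sources (a central-limit-theorem effect), so their excess kurtosis shrinks toward zero. The sources and mixing matrix below are hypothetical stand-ins chosen for illustration.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(42)
n = 20000

# Two non-Gaussian sources: a spiky Laplacian source (positive excess
# kurtosis, like transient artifacts) and a uniform source (negative
# excess kurtosis).
S = np.vstack([rng.laplace(size=n), rng.uniform(-1, 1, size=n)])

# Forward model from the text, X = A S, with an arbitrary mixing matrix
A = np.array([[0.8, 0.6], [0.4, 0.9]])
X = A @ S

# Mixing pushes the distributions toward Gaussian (excess kurtosis
# closer to 0) -- the property ICA exploits when it maximizes
# non-Gaussianity to recover S.
src_kurt = [kurtosis(s) for s in S]   # roughly [3, -1.2] in expectation
mix_kurt = [kurtosis(x) for x in X]
```

This is why maximizing a non-Gaussianity measure of `w.T @ X` over unmixing vectors `w` recovers the original sources: the extrema of kurtosis-like contrasts occur at the unmixed directions.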
Successful application of ICA to hdEEG data relies on several key assumptions:
Violations of these assumptions, particularly non-stationarities introduced by significant head movement, can limit ICA's effectiveness and necessitate specialized approaches or preprocessing steps [20] [33].
The standard pipeline for ICA-based artifact removal involves a sequence of well-defined stages, from raw data preparation to cleaned signal reconstruction.
Figure 1: ICA-Based Artifact Removal Workflow
The process begins with appropriate preprocessing of hdEEG data, which may include filtering, bad channel removal, and re-referencing [35]. ICA decomposition then separates the recorded signals into statistically independent components. Each component is characterized by both a time course (activation pattern) and a spatial map (topographic distribution) [35]. Critically, components are classified as brain-based or artifactual using algorithms like ICLabel or by assessing component properties such as dipolarity—the extent to which their scalp projections resemble those of a single neural generator [20]. Artifactual components are removed, and the remaining brain components are back-projected to sensor space to reconstruct clean EEG data.
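The final back-projection step reduces to a few lines of linear algebra. The sketch below (the helper name `remove_components` is hypothetical) assumes a square, invertible unmixing matrix W, as in a complete ICA decomposition.

```python
import numpy as np

def remove_components(X, W, artifact_idx):
    """Reconstruct sensor data with artifactual ICA components removed.

    X            : channels x samples recorded data
    W            : unmixing matrix (components x channels), e.g. from ICA
    artifact_idx : indices of components classified as artifactual
    """
    S = W @ X                       # component activations (WX = S)
    S_clean = S.copy()
    S_clean[artifact_idx, :] = 0.0  # zero out artifact components
    A = np.linalg.inv(W)            # mixing matrix for back-projection
    return A @ S_clean              # cleaned sensor-space data
```

Removing an empty set of components returns the data unchanged, and the retained brain components are untouched in component space; only the artifact subspace is subtracted.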
Recent advancements have moved beyond standard ICA implementations to address specific limitations in artifact removal:
Multi-Step BSS Approaches: Recognizing that different artifacts have distinct properties, Zhao et al. [33] developed a multi-step BSS approach that uses specific methods and parameters optimized for different artifact types (ocular, movement-related, myogenic). This methodology yielded lower residual noise and permitted retrieval of stronger, more reliable neural activity modulations compared to single-step approaches [33].
ICA with Complementary Techniques: ICA is increasingly combined with other signal processing methods to enhance artifact removal. For instance, the Four Class Iterative Filtering (FCIF) technique combines iterative filtering with ICA to identify and remove artifact-related components while preserving neural information [36]. Similarly, hybrid approaches integrating ICA with wavelet transform methods have demonstrated effectiveness in protecting important neural information during artifact removal [37].
Online Artifact Correction: Recent work has focused on developing fast automatic algorithms for ongoing correction of artifacts in continuous EEG, using sliding window techniques with overlapping epochs and features in spatial, temporal, and frequency domains to detect and correct various artifact types [38]. These approaches achieve high artifact reduction rates (81-100% across artifact types) with computation times suitable for online applications [38].
Researchers employ multiple quantitative metrics to evaluate the performance of ICA and other artifact removal approaches, particularly in the context of mobile EEG studies.
Table 1: Performance Metrics for ICA and Alternative Artifact Removal Methods
| Method | Key Metric | Performance Value | Experimental Context | Reference |
|---|---|---|---|---|
| iCanClean with pseudo-reference | ICA dipolarity | High (most dipolar brain components) | Running Flanker task | [20] |
| Artifact Subspace Reconstruction (ASR) | Power reduction at gait frequency | Significant reduction | Human locomotion during running | [20] |
| Multi-step BSS approach | Residual noise in hdEEG | Lowest among compared methods | Standing, slow-walking, fast-walking | [33] |
| Fast automatic BSS algorithm | Overall artifact reduction rate | 88% (2035 marked artifacts) | Continuous EEG with marked artifacts | [38] |
| DWT-LMM approach | Average correlation coefficient | 0.9369 | Ocular artifacts removal | [37] |
Studies evaluating artifact removal during locomotion employ standardized protocols to quantify effectiveness. A representative protocol from recent research includes:
Task Design: Participants perform cognitive tasks (e.g., Flanker task) under both static (standing) and dynamic (jogging) conditions, enabling comparison to a low-motion baseline [20].
Data Acquisition: hdEEG is recorded alongside motion capture data to precisely identify gait cycle timing and head movement parameters [20] [33].
Algorithm Application: Multiple artifact removal approaches (e.g., ICA, iCanClean, ASR) are applied to the same dataset using standardized parameters [20].
Effectiveness Assessment:
This comprehensive validation approach ensures that artifact removal methods not only reduce noise but also preserve functionally relevant neural signatures.
Table 2: Key Resources for ICA and Artifact Removal Research
| Resource Category | Specific Tool/Solution | Function/Purpose | Example Implementation |
|---|---|---|---|
| Software Toolboxes | EEGLAB | MATLAB toolbox implementing ICA and component analysis | Delorme & Makeig, 2004 [39] |
| Classification Algorithms | ICLabel | Automated component classification using trained dataset | [20] |
| Reference-Based Methods | iCanClean | Leverages noise references for artifact subspace identification | [20] |
| Component Rejection Tools | DIPFIT | Localizes components using dipole modeling | James et al., 2005 [34] |
| Automated Cleaning | Artifact Subspace Reconstruction (ASR) | Identifies and removes artifact subspaces using PCA | [20] |
| Hybrid Methods | DWT-LMM | Wavelet-based artifact removal with local thresholding | [37] |
| Validation Datasets | BCI Competition IV Dataset 2a & 2b | Standardized data for method benchmarking | [36] |
Choosing an appropriate artifact removal strategy requires consideration of multiple experimental factors and methodological trade-offs. The following decision pathway provides a structured approach for selecting and implementing ICA-based methods in hdEEG research.
Figure 2: Algorithm Selection Decision Pathway
Each artifact removal approach involves specific trade-offs that researchers must consider when designing analysis pipelines:
ICA-Based Approaches:
iCanClean:
Artifact Subspace Reconstruction (ASR):
Multi-Step BSS:
The field of artifact removal in hdEEG continues to evolve, with several promising directions emerging. Integration of machine learning approaches for more accurate component classification shows particular promise, as does the development of real-time artifact correction systems for brain-computer interface applications [38] [36]. There is also growing recognition that different experimental scenarios may require specialized artifact removal strategies—a one-size-fits-all approach is often suboptimal [33].
ICA remains a cornerstone technique for EEG artifact removal due to its principled mathematical foundation and proven effectiveness across diverse experimental contexts. However, optimal application requires careful consideration of experimental parameters, appropriate method selection, and rigorous validation. As research progresses, hybrid approaches that combine ICA's blind source separation capabilities with complementary signal processing techniques and domain knowledge will likely provide the most robust solutions to the persistent challenge of artifacts in high-density EEG research.
The advent of high-density electroencephalography (hd-EEG) with up to 256 electrodes has revolutionized sleep research by providing unprecedented spatial resolution for mapping brain activity during sleep. However, this technological advancement introduces a significant data quality challenge: the vast amount of data generated by overnight hd-EEG recordings complicates the removal of artifacts [5] [40]. Unlike brief wake-EEG recordings, overnight sleep studies capture hours of data across hundreds of channels, creating a massive dataset where artifacts can obscure crucial neurophysiological information. These artifacts stem from diverse sources including muscle activity during arousals, cardiac signals, electrode disconnection, and perspiration, each requiring detection and removal to ensure data integrity [41].
Existing fully automated artifact removal methods often fall short for hd-EEG sleep data because they typically target shorter wake EEG recordings and may lack the precision needed for sleep-specific neurophysiological phenomena [5]. The research community consequently faces a critical methodological gap: how to efficiently process hd-EEG sleep data without sacrificing analytical precision. This whitepaper examines how semi-automatic, graphical user interface (GUI)-based solutions address this challenge by combining computational efficiency with researcher expertise, creating a targeted, transparent cleaning approach suitable for the rigorous demands of both academic research and clinical drug development.
The "High-Density-SleepCleaner" represents a specialized methodological innovation specifically designed to address the unique challenges of hd-EEG sleep data [5] [40] [42]. This open-source, semi-automatic artifact removal routine employs a GUI that enables researchers to assess data epochs according to four sleep quality markers (SQMs), which evaluate key characteristics of the sleep EEG signal. Through dynamic visualization of both topography and underlying EEG signals, the interface allows users to identify and remove artifactual values while preserving neurologically genuine activity [5].
The methodology requires the researcher to possess fundamental knowledge of both typical (patho-)physiological EEG patterns and common artifactual signals, ensuring that cleaning decisions incorporate neurophysiological expertise rather than relying solely on statistical thresholds [40]. The algorithm produces a binary matrix (channels × epochs) marking artifactual sections, with the valuable capability to restore channels in afflicted epochs using epoch-wise interpolation—a function included in the online repository. Implementation results demonstrate that between 95% and 100% of bad epochs can be effectively restored using this interpolation approach, significantly preserving data integrity despite the presence of artifacts [5].
Table: High-Density-SleepCleaner Performance Metrics
| Metric | Performance | Context |
|---|---|---|
| Application Scope | 54 overnight sleep hd-EEG recordings | Demonstrated robustness across multiple datasets [5] |
| Epoch Restoration Rate | 95-100% of bad epochs | Using epoch-wise interpolation function [5] |
| Output Format | Binary matrix (channels × epochs) | Enables precise identification of artifactual sections [40] |
| Topographical Validation | Expected delta power topography and cyclic pattern | Confirmed in cases with both few and many artifacts [5] |
The High-Density-SleepCleaner methodology follows a structured protocol to ensure systematic artifact identification and removal:
Data Loading and Initialization: Import hd-EEG data (up to 256 channels) from overnight sleep recordings into the MATLAB-based environment.
Sleep Quality Marker (SQM) Calculation: The algorithm computes four key SQMs that quantify different aspects of signal quality across all channels and epochs.
GUI-Based Review Process: Researchers interact with the dynamic, multi-functional GUI to visualize SQM topography and underlying EEG signals simultaneously, assessing data quality while scrolling through epochs.
Artifact Identification and Marking: Based on topographic patterns and signal characteristics, users manually mark artifactual segments, leveraging their expertise in EEG pattern recognition.
Binary Matrix Generation: The system produces a comprehensive binary matrix marking all artifactual sections across the recording (channels × epochs).
Epoch-Wise Interpolation: For channels marked as artifactual in specific epochs, the algorithm employs interpolation from surrounding clean channels to restore signal integrity.
Quality Validation: Researchers verify output quality by examining standard EEG metrics (e.g., delta power topography and cyclic patterns) to ensure expected physiological patterns emerge post-processing [5].
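The binary-matrix and interpolation steps above can be sketched as follows. This is a simplified stand-in: the published routine interpolates from electrode geometry (e.g., spherical splines), whereas this illustration (hypothetical function `interpolate_bad_epochs`) simply averages clean neighboring channels within the epoch.

```python
import numpy as np

def interpolate_bad_epochs(data, bad, neighbors):
    """Restore channels flagged as artifactual, epoch by epoch.

    data      : channels x epochs x samples array
    bad       : binary matrix (channels x epochs), 1 = artifactual
    neighbors : dict mapping each channel to a list of neighbor channels
    (The published routine uses spherical-spline interpolation from
    electrode positions; a plain neighbor average is used here.)
    """
    clean = data.copy()
    n_ch, n_ep = bad.shape
    for ep in range(n_ep):
        for ch in range(n_ch):
            if bad[ch, ep]:
                good = [n for n in neighbors[ch] if not bad[n, ep]]
                if good:  # restore from clean neighbors in this epoch
                    clean[ch, ep] = data[good, ep].mean(axis=0)
    return clean
```

Because a channel is only interpolated in the specific epochs where it is flagged, the rest of its recording is preserved, which is what allows 95-100% of bad epochs to be restored rather than discarded.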
Recent research systematically compares traditional visual artifact detection with emerging automatic approaches, revealing critical insights for hd-EEG processing methodologies. A 2025 study examining sleep EEG recordings from 252 healthy volunteers found that while visual and automatic detections show only moderate agreement on which specific data segments contain artifacts, the resulting all-night average power spectrum density (PSD) estimates are remarkably similar across methods [41]. This finding challenges the long-held assumption that extensive visual inspection is indispensable for accurate spectral analysis.
The automatic detection method evaluated in this research utilizes Hjorth parameters—computationally simple indicators of statistical signal properties including activity (signal variance), mobility (average slope relative to amplitude), and complexity (deviation from pure sine wave) [41]. Despite their algorithmic simplicity, these parameters effectively identify the minority of highly anomalous artifacts that cause most distortions in EEG spectra, particularly in beta/gamma frequencies and NREM delta. Crucially, PSD estimates derived from this automatic method successfully recovered the known correlations with age and sex, performing equally well as visually cleaned data in identifying established biological relationships [41].
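The three Hjorth parameters have closed-form definitions in terms of signal variance and its derivatives, so they are cheap to compute at scale; a NumPy sketch (using finite differences for the derivatives):

```python
import numpy as np

def hjorth(x, fs=1.0):
    """Hjorth parameters of a 1-D signal.

    activity   : signal variance
    mobility   : std of the first derivative relative to the signal's std
    complexity : mobility of the derivative relative to the signal's
                 mobility (approximately 1 for a pure sine wave)
    """
    dx = np.diff(x) * fs            # first derivative (finite difference)
    ddx = np.diff(dx) * fs          # second derivative
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity
```

For a pure sine the mobility equals its angular frequency and the complexity is 1; broadband artifacts such as EMG raise the complexity well above 1, which is what makes these simple statistics effective anomaly flags.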
Table: Artifact Detection Method Comparison
| Method Characteristic | Visual Detection | Automatic Hjorth Parameters | Semi-Automatic GUI Approach |
|---|---|---|---|
| Basis of Decision | Expert pattern recognition | Statistical thresholds (variance, mobility, complexity) | Hybrid: Algorithm pre-screening + expert validation |
| Processing Time | High (impractical for large datasets) | Low (suitable for big data) | Moderate (efficient for hd-EEG) |
| Agreement with Gold Standard | Gold standard | Moderate for epochs, high for PSD outcomes | High (incorporates gold standard) |
| Required Expertise | Advanced EEG interpretation | Minimal technical implementation | Intermediate (EEG knowledge essential) |
| Data Recovery Capability | Limited to exclusion | Primarily exclusion-based | Epoch-wise interpolation (95-100% recovery) |
| Best Application Context | Small-scale studies | Large database processing | High-density EEG with limited artifacts |
Implementing effective artifact removal strategies requires specific methodological tools and computational resources. The following table outlines key solutions mentioned in recent literature:
Table: Research Reagent Solutions for EEG Artifact Management
| Resource | Type | Function | Application Context |
|---|---|---|---|
| High-Density-SleepCleaner | Software routine | Semi-automatic artifact identification via GUI | hd-EEG sleep recordings [5] |
| Hjorth Parameters | Algorithmic feature set | Statistical detection of anomalous epochs | Large-scale sleep EEG datasets [41] |
| Epoch-Wise Interpolation | Signal processing method | Restoration of artifactual channels using spatial information | Recovery of hd-EEG epochs with limited artifacts [5] |
| Sleep Quality Markers (SQMs) | Quantitative metrics | Multi-dimensional assessment of signal quality | GUI-based artifact review process [5] |
| Luna | Open-source software tool | Large-scale sleep EEG analysis platform | Processing big EEG datasets from repositories [41] |
The development of specialized semi-automatic cleaning routines carries significant implications for both basic research and pharmaceutical development. For sleep researchers, these methodologies enable more efficient processing of high-density datasets while maintaining the precision required for detecting subtle neural phenomena. The transparent nature of GUI-based approaches—where cleaning decisions are documented and reviewable—addresses growing concerns about reproducibility in neuroscience research [41].
For drug development professionals, semi-automatic artifact removal offers particular advantages in clinical trials where EEG may serve as a biomarker for treatment efficacy. The method ensures consistent processing across multiple study sites and timepoints while preserving data integrity through its interpolation capabilities. This is especially valuable when working with patient populations where artifact prevalence may be higher, yet data retention is critical for statistical power. Furthermore, the ability to maintain expected topographical patterns of key sleep waveforms (such as delta power) after cleaning provides confidence in subsequent quantitative analyses [5].
Semi-automatic, GUI-based artifact removal routines represent a methodological advance that effectively balances the competing demands of efficiency and precision in hd-EEG research. By leveraging computational preprocessing while retaining expert oversight, approaches like the High-Density-SleepCleaner address the unique challenges posed by overnight sleep studies with high channel counts. The transparent nature of these methods—coupled with robust interpolation techniques that preserve data integrity—makes them particularly valuable for both academic research and clinical applications. As sleep EEG continues to gain prominence as a source of biomarkers for neurological and psychiatric conditions, these targeted cleaning solutions will play an increasingly vital role in ensuring data quality and analytical reproducibility.
Electroencephalography (EEG) is a fundamental tool in neuroscience and clinical diagnostics, prized for its high temporal resolution. However, a persistent challenge in high-density EEG research is the vulnerability of recordings to various artifacts—signals of non-cerebral origin that can obscure genuine brain activity. These artifacts, which include those from eye movements, muscle activity, cardiac signals, and motion, often exhibit amplitudes significantly larger than cortical signals, leading to biased analysis and misinterpretation of neural data [43]. The problem is particularly acute in high-density systems, where the sheer number of channels can amplify the complexity of artifact identification and removal. Traditional filtering methods are often insufficient as the spectral patterns of artifacts frequently overlap with those of neural signals of interest, resulting in the unwanted suppression of informative brain signatures [43].
In response to these challenges, advanced signal processing techniques have been developed. Among the most prominent are Artifact Subspace Reconstruction (ASR) and Canonical Correlation Analysis (CCA), which represent powerful blind source separation approaches. ASR is an automated, component-based method designed for the rapid removal of non-stationary, high-amplitude artifacts from multi-channel EEG data [44] [45]. Concurrently, CCA is a statistical method that leverages the autocorrelation properties of signals to separate brain activity from artifacts, proving particularly effective for muscle and other biological contaminants [43] [46]. This whitepaper provides an in-depth technical guide to these two core methodologies, detailing their underlying principles, algorithmic workflows, and performance characteristics within the challenging context of high-density EEG artifact removal.
Artifact Subspace Reconstruction (ASR) is an adaptive, component-based method designed for the online or offline correction of artifacts in multi-channel EEG recordings. Its core principle is the statistical identification and reconstruction of data segments containing high-amplitude, non-stationary artifacts, based on the statistics of clean "reference" data [20] [44].
The algorithm operates via a sliding window that moves through the continuous EEG data. For each window, the following steps are executed:
1. The data within the window are decomposed into principal components via Principal Component Analysis (PCA).
2. The root-mean-square (RMS) amplitude of each component is compared against a rejection threshold Γ, defined by the user-defined parameter k and the statistics of the calibration data [44]:

   Γi = μi + k · σi

3. Components whose RMS exceeds the threshold are deemed artifactual; the affected subspace is reconstructed from the statistics of the clean calibration data, and the repaired window is projected back into channel space.

Here, μi and σi are the mean and standard deviation of the RMS for the i-th component calculated from the clean reference data. A lower k value results in a more aggressive cleaning strategy [44].

A critical step in ASR is the selection of the calibration data. This clean reference dataset is used to compute the μi and σi for the RMS values of the principal components. Users can supply their own calibration data (e.g., a resting-state recording) or allow the algorithm to automatically extract clean segments from the contaminated data itself [46] [44].
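As a concrete illustration of the threshold rule, the following numpy sketch computes per-component thresholds from calibration RMS statistics and flags exceedances. It omits ASR's reconstruction step, and the value of k is illustrative:

```python
import numpy as np

def asr_thresholds(calib_rms, k=20):
    """Per-component rejection thresholds Gamma_i = mu_i + k * sigma_i.

    calib_rms: array of shape (n_windows, n_components) holding RMS values
    of principal components computed on clean calibration data.
    k trades cleaning aggressiveness against retention of brain activity.
    """
    mu = calib_rms.mean(axis=0)
    sigma = calib_rms.std(axis=0)
    return mu + k * sigma

def flag_components(window_rms, thresholds):
    """Boolean mask of components whose RMS exceeds their threshold."""
    return window_rms > thresholds
```

Lowering k lowers every threshold, so more components are flagged per window; this is the mechanism behind the "over-cleaning" risk discussed later for small k.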
Canonical Correlation Analysis (CCA) is a blind source separation technique that exploits the differential autocorrelation properties of brain signals and artifacts. The fundamental premise is that brain signals typically exhibit higher autocorrelation over short time lags compared to many artifacts, such as muscle activity, which are more random and thus have weaker autocorrelation [43] [46].
The mathematical procedure for CCA-based artifact removal is as follows:
1. The multichannel EEG signal is arranged as X(t) = [x1(t), x2(t), …, xM(t)]^T, where M is the number of channels and N is the number of samples. A time-lagged version of the signal is created as Y(t) = X(t−1) [43].
2. Two canonical variates, U and V, are defined as linear combinations of the components in X and Y:

   U(t) = wx^T X(t), V(t) = wy^T Y(t)

   where wx and wy are the weight vectors to be determined [43].
3. CCA finds the weight vectors wx and wy that maximize the correlation ρ between U and V [43]:

   max{wx,wy} ρ(U,V) = (wx^T Cxy wy) / ( sqrt( wx^T Cxx wx ) · sqrt( wy^T Cyy wy ) )

   Here, Cxx and Cyy are the within-set covariance matrices, and Cxy is the between-sets covariance matrix.
4. This maximization reduces to the eigenvalue problem

   Cxx^-1 Cxy Cyy^-1 Cyx wx = ρ^2 wx

   whose eigenvectors wx are the CCA components and whose eigenvalues ρ^2 are the squared canonical correlations, indicating the autocorrelation strength of each component.
5. The estimated source signals Ŝ(t) = U(t) = wx^T X(t) are sorted by their autocorrelation coefficients. Components with the lowest correlation (e.g., high-frequency muscle artifacts) are considered artifactual and removed. The cleaned EEG signals are reconstructed by back-projecting only the brain-related components using the corrected mixing matrix [43].

The efficacy of ASR and CCA has been evaluated in various studies, ranging from simulated phantom head experiments to real-world human locomotion tasks. The table below summarizes key quantitative findings from recent research, highlighting the performance of each method under different artifact conditions.
Table 1: Quantitative Performance Comparison of ASR and CCA-based Methods
| Method | Experimental Condition | Key Performance Metric | Result | Citation |
|---|---|---|---|---|
| iCanClean (CCA-based) | Phantom head with all artifacts (Brain + Eyes + Muscles + Motion) | Data Quality Score (0-100%, correlation with ground-truth) | 55.9% (from 15.7% pre-cleaning) | [46] |
| ASR | Phantom head with all artifacts (Brain + Eyes + Muscles + Motion) | Data Quality Score (0-100%, correlation with ground-truth) | 27.6% (from 15.7% pre-cleaning) | [46] |
| Auto-CCA | Phantom head with all artifacts (Brain + Eyes + Muscles + Motion) | Data Quality Score (0-100%, correlation with ground-truth) | 27.2% (from 15.7% pre-cleaning) | [46] |
| ASR | Human running (Flanker task) | Reduction in power at gait frequency & harmonics | Significant reduction | [20] |
| iCanClean (CCA-based) | Human running (Flanker task) | Recovery of expected P300 ERP congruency effect | Successful identification | [20] |
| CCA | Controlled artifacts (blinks, head movement, chewing) | Preservation of temporal/spectral features in VEP/SSVEP | Effective preservation | [43] |
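The time-lagged CCA procedure described above can be sketched in a few lines of numpy. This is a minimal illustration (the lag, channel count, and synthetic test data are arbitrary), not the iCanClean implementation:

```python
import numpy as np

def lagged_cca(X, lag=1):
    """CCA between X(t) and X(t-lag); components sorted by decreasing rho^2.

    X: array of shape (n_channels, n_samples). Solves the eigenproblem
    Cxx^-1 Cxy Cyy^-1 Cyx w = rho^2 w from the whitepaper's derivation.
    Returns (rho^2 values, weight vectors as columns).
    """
    Xa = X[:, lag:] - X[:, lag:].mean(axis=1, keepdims=True)
    Xb = X[:, :-lag] - X[:, :-lag].mean(axis=1, keepdims=True)
    n = Xa.shape[1]
    Cxx = Xa @ Xa.T / n          # within-set covariance of X
    Cyy = Xb @ Xb.T / n          # within-set covariance of lagged X
    Cxy = Xa @ Xb.T / n          # between-sets covariance
    M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cxy.T)
    evals, evecs = np.linalg.eig(M)
    order = np.argsort(evals.real)[::-1]  # strongest autocorrelation first
    return evals.real[order], evecs.real[:, order]
```

Components at the bottom of the returned ordering (weakest autocorrelation, e.g., EMG-like noise) would be zeroed before back-projection to obtain the cleaned signals.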
Beyond quantitative metrics, the choice of parameters significantly influences performance. For ASR, the cutoff parameter k is critical. Research indicates that a k between 20 and 30 serves as a good compromise between removing non-brain signals and retaining brain activity [44]. A lower k (e.g., 10) leads to more aggressive cleaning and a higher percentage of data modification, which risks "over-cleaning" and removing neural signals of interest [20] [44].
For CCA-based methods like iCanClean, performance is influenced by the criterion for rejecting noise components (the R² threshold). Studies on human locomotion data suggest that an R² of 0.65 with a sliding window of 4 seconds produces optimal results in terms of yielding the most dipolar brain components from a subsequent Independent Component Analysis (ICA) [20].
To ensure the reproducibility of research involving ASR and CCA, this section outlines detailed protocols based on cited experiments.
This protocol is adapted from a study investigating visual-evoked potentials (VEP) and steady-state visual-evoked potentials (SSVEP) [43].
This protocol is drawn from a study comparing ASR and iCanClean during running [20].
While powerful individually, ASR and CCA are often integrated with other methods like Independent Component Analysis (ICA) to form robust processing pipelines for real-world EEG. A particularly effective strategy is the ASRICA pipeline, where ASR is applied before ICA [45].
Diagram: ASRICA Pipeline for EEG Artifact Removal
In this workflow, ASR first removes high-amplitude, non-stationary motion artifacts that violate ICA's assumption of stationarity. This initial cleaning enhances the subsequent ICA decomposition, leading to the identification of more brain-related and dipolar components [45]. This pipeline has been successfully used to extract single-trial brain activity during highly dynamic activities like skateboarding on a half-pipe ramp [45].
Future directions in artifact removal research include:
The following table details key hardware and software solutions used in advanced EEG artifact removal research.
Table 2: Essential Research Reagents and Tools for Artifact Removal
| Item Name | Type | Function/Benefit | Citation |
|---|---|---|---|
| High-Density Ag/AgCl EEG Cap | Hardware | Standard wet-electrode setup providing high-quality signal and low impedance for laboratory-grade recordings. | [43] |
| Dual-Layer EEG Sensors | Hardware | A secondary sensor layer detects motion artifacts not in contact with the scalp, providing reference noise for advanced algorithms like iCanClean. | [20] [46] |
| Mobile EEG Amplifier | Hardware | Portable device enabling data collection during whole-body movement and locomotion studies. | [20] |
| Inertial Measurement Unit (IMU) | Hardware | Motion sensor to track head acceleration and movement, providing reference signals for motion artifact correction. | [14] |
| EEGLAB | Software | A dominant open-source MATLAB toolbox offering implementations of ASR, ICA, and other preprocessing functions. | [20] [48] |
| ICLabel | Software | An EEGLAB plugin for automated classification of independent components into brain, muscle, eye, heart, and other categories. | [20] |
| iCanClean Algorithm | Algorithm/Software | A generalized CCA-based framework for removing multiple artifact types in real-time, usable with or without reference noise signals. | [20] [46] |
Electroencephalography (EEG) is a fundamental tool in neuroscience research, clinical diagnosis, and brain-computer interface (BCI) development, prized for its non-invasive nature and high temporal resolution [49]. The advent of high-density EEG (hd-EEG), utilizing up to 256 electrodes, has provided researchers with unparalleled spatial detail of brain dynamics, particularly valuable in sleep studies and cognitive task monitoring [5]. However, the microvolt-range amplitudes of neural signals are highly susceptible to contamination from both physiological and non-physiological artifacts [49]. Physiological artifacts, including ocular movements (EOG), muscle activity (EMG), and cardiac signals (ECG), often exhibit spectral and temporal overlap with genuine brain activity, while non-physiological sources like power line interference and electrode pop further degrade signal quality [49]. These artifacts can be ten times greater in amplitude than the neural signals of interest, severely hindering accurate analysis and interpretation [49]. Traditional artifact removal methods, such as regression, blind source separation (BSS), and wavelet transforms, often rely on linear assumptions, manual parameter tuning, or require reference signals, limiting their effectiveness and generalizability across diverse hd-EEG datasets [49] [15] [14]. The deep learning revolution is overcoming these limitations by providing models capable of learning complex, non-linear mappings between noisy and clean EEG signals in an end-to-end manner, dramatically advancing the state of artifact removal in hd-EEG research [49] [19].
Deep learning models approximate a function that maps a noisy EEG signal ( \mathbf{y} ) to an estimate of the underlying clean signal ( \mathbf{x} ), where ( \mathbf{y} = \mathbf{x} + \mathbf{z} ) and ( \mathbf{z} ) represents artifact contamination [49]. The network learns parameters ( \mathbf{\theta} ) (weights and biases) by minimizing a loss function, often the Mean Squared Error (MSE), between the estimated clean signal ( f_{\mathbf{\theta}}(\mathbf{y}) ) and the ground truth ( \mathbf{x} ) [49]. The following architectures have proven most effective.
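Before turning to the architectures, the mapping and MSE objective can be made concrete with a toy example: fitting a single-parameter linear "denoiser" by gradient descent in numpy. This is purely didactic; real models replace the scalar theta with millions of network weights:

```python
import numpy as np

def train_linear_denoiser(Y, X, lr=0.01, epochs=500):
    """Fit f_theta(y) = theta * y by minimizing MSE ||f_theta(Y) - X||^2.

    Y: noisy observations (y = x + z), X: clean targets,
    both arrays of shape (n_examples, n_samples).
    A scalar theta stands in for a deep network's parameter vector.
    """
    theta = 0.0
    n = Y.size
    for _ in range(epochs):
        grad = 2.0 * np.sum((theta * Y - X) * Y) / n  # d(MSE)/d(theta)
        theta -= lr * grad
    return theta
```

For independent unit-variance signal and noise, the MSE-optimal scalar is var(x) / (var(x) + var(z)) = 0.5, which the gradient descent recovers; deep architectures generalize this idea to non-linear, signal-dependent mappings.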
Convolutional Neural Networks (CNNs) excel at extracting local spatial and temporal features from EEG signals through their kernel-based filtering operations. Their hierarchical structure allows them to identify artifact patterns at multiple scales. For instance, 1D-ResCNN uses residual connections and multiple convolutional kernels of different sizes to extract and reconstruct EEG features from contaminated data effectively [15]. CNNs are particularly strong at removing artifacts with distinct morphological signatures, such as EOG and EMG [50] [15].
Long Short-Term Memory Networks (LSTMs) are a type of Recurrent Neural Network (RNN) designed to model temporal dependencies and contextual information in sequential data [19]. Their gated memory cells allow them to learn long-range patterns in EEG time series, making them well-suited for capturing the dynamic properties of both neural signals and artifacts [19] [15]. They are often integrated with CNNs to jointly model temporal and morphological features [15].
Generative Adversarial Networks (GANs) employ an adversarial training framework between a generator and a discriminator [19]. The generator creates denoised EEG signals from noisy inputs, while the discriminator judges their authenticity against clean EEG data [19] [51]. This adversarial process drives the generator to produce highly realistic, artifact-free signals. Models like EEGNet and AnEEG have used GANs, sometimes augmented with LSTM layers, to successfully remove ocular and muscle artifacts [19] [51].
Transformers leverage a self-attention mechanism to weigh the importance of all time points in a sequence when processing a given time point [52]. This allows them to capture global, long-range dependencies in EEG data more effectively than RNNs or CNNs [51] [52]. Architectures like EEGDNet have demonstrated the Transformer's capability to handle complex artifacts, including those induced by transcranial electrical stimulation (tES) [50] [52].
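The self-attention operation at the heart of these models reduces to a few matrix operations. A minimal numpy sketch with identity query/key/value projections follows (real Transformers learn these projection weights; the sketch only shows how every time point attends to all others):

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a sequence.

    X: array of shape (seq_len, d_model), e.g., embedded EEG time points.
    Returns (attended output, attention weight matrix).
    """
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                  # pairwise similarity
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ X, weights
```

Because each output row is a weighted sum over the entire sequence, attention captures long-range dependencies in a single step, in contrast to the local receptive fields of CNNs and the sequential recurrence of LSTMs.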
To overcome the limitations of individual architectures, recent research focuses on sophisticated hybrid models that integrate the strengths of multiple approaches.
Dual-Branch Hybrid CNN-Transformer (DHCT-GAN): This model uses one branch to learn clean EEG features and another to learn artifact features, fusing this information through an adaptive gating network [51]. It combines CNNs for local feature extraction and Transformers for long-term dependency modeling, stabilized by a multi-discriminator GAN framework [51].
Artifact-Aware Denoising Model (A²DM): This framework incorporates an artifact-aware module (AAM) that first identifies the type of artifact present (e.g., EOG or EMG) and generates an artifact representation [53]. This prior knowledge is then fused into a denoising network featuring a hard attention-based Frequency Enhancement Module (FEM) to selectively remove artifact-specific frequency components, followed by a Time-domain Compensation Module (TCM) to recover any lost neural information [53].
CLEnet: This network integrates dual-scale CNNs with LSTMs and an improved one-dimensional Efficient Multi-Scale Attention mechanism (EMA-1D) [15]. The CNN extracts multi-scale morphological features, the LSTM captures temporal dependencies, and the attention mechanism enhances critical features, enabling the model to handle unknown artifacts and multi-channel EEG inputs effectively [15].
Table 1: Performance Comparison of Deep Learning Models for EEG Denoising
| Model | Architecture Type | Primary Artifacts Targeted | Key Performance Metrics (Typical Range) | Strengths |
|---|---|---|---|---|
| 1D-ResCNN [15] | CNN | EOG, EMG | SNR: >11.5 dB, CC: >0.92 [15] | Strong on morphological features, computationally efficient |
| AnEEG [19] | GAN (with LSTM) | Ocular, Muscle | Improved SNR & SAR vs. wavelet methods [19] | Effective temporal modeling via adversarial training |
| EEGDNet [50] | Transformer | EOG, EMG, tES | Superior RRMSE & CC for tACS/tRNS [50] | Excels at capturing long-range dependencies |
| DHCT-GAN [51] | Hybrid (CNN+Transformer+GAN) | EMG, EOG, ECG, Mixed | Outperforms state-of-the-art across 6 metrics [51] | Dual-branch learning, stable multi-discriminator training |
| A²DM [53] | Hybrid (CNN with Attention) | EOG, EMG (Interleaved) | ~12% CC improvement over NovelCNN [53] | Unified artifact removal using artifact-type prior knowledge |
| CLEnet [15] | Hybrid (CNN+LSTM) | EMG, EOG, ECG, Unknown | SNR: 11.50 dB, CC: 0.925 (Mixed artifacts) [15] | Handles unknown artifacts, multi-channel input |
Figure 1: High-Level Workflow of a Hybrid Deep Learning Model for EEG Denoising
Robust experimental evaluation requires carefully curated datasets, often combining semi-synthetic and real EEG data.
Semi-Synthetic Data Generation: This approach involves adding recorded or simulated artifacts to clean EEG segments, providing a known ground truth for controlled performance evaluation [50] [15]. For example, EEGDenoiseNet provides a benchmark dataset where clean EEG is artificially contaminated with EOG and EMG artifacts at specific signal-to-noise ratios [15]. Similarly, studies on tES artifacts create synthetic datasets by combining clean EEG with simulated tDCS, tACS, and tRNS artifacts [50].
Real and Task-Specific Datasets: Models are also validated on real EEG recordings that contain inherent artifacts. These include overnight sleep hd-EEG recordings [5], data from subjects performing cognitive tasks (e.g., the 2-back task) [15], and recordings from wearable EEG devices in ecological settings [14]. These datasets capture the full complexity of real-world artifacts but lack a perfect ground truth.
Standard preprocessing steps include band-pass filtering (e.g., 1-100 Hz), notch filtering for power line noise, and normalization. For hd-EEG, bad channel detection and interpolation are often necessary [5].
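These steps can be sketched with scipy.signal. The 4th-order Butterworth design, notch quality factor, and 50 Hz line frequency below are illustrative assumptions (use 60 Hz where applicable), not parameters from the cited studies:

```python
import numpy as np
from scipy import signal

def preprocess(eeg, fs, line_freq=50.0):
    """Band-pass 1-100 Hz, notch at the line frequency, z-score per channel.

    eeg: array of shape (n_channels, n_samples); fs: sampling rate in Hz
    (must exceed 200 Hz for the 100 Hz upper cutoff).
    """
    # Zero-phase band-pass, 4th-order Butterworth in second-order sections
    sos = signal.butter(4, [1.0, 100.0], btype="bandpass", fs=fs, output="sos")
    out = signal.sosfiltfilt(sos, eeg, axis=-1)
    # Notch filter for power-line interference
    b, a = signal.iirnotch(line_freq, Q=30.0, fs=fs)
    out = signal.filtfilt(b, a, out, axis=-1)
    # Per-channel z-score normalization
    out = (out - out.mean(axis=-1, keepdims=True)) / out.std(axis=-1, keepdims=True)
    return out
```

Zero-phase (forward-backward) filtering is used so that filtering does not shift event-related features in time; bad-channel detection and interpolation for hd-EEG would precede this stage.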
A multi-faceted assessment using complementary metrics is essential to gauge both artifact removal efficacy and neural signal preservation.
Temporal Domain Metrics: Root Relative Mean Squared Error (RRMSEt) and Correlation Coefficient (CC) measure the similarity between the denoised and clean signal in the time domain [50] [15]. Lower RRMSEt and higher CC indicate better performance.
Spectral Domain Metrics: Relative Root Mean Squared Error in the Frequency Domain (RRMSEf) assesses the accuracy of the reconstructed power spectral density [15].
Signal Quality Metrics: Signal-to-Noise Ratio (SNR) and Signal-to-Artifact Ratio (SAR) quantify the level of noise suppression and signal preservation [19] [15].
Table 2: Standardized Evaluation Metrics for EEG Denoising Models
| Metric Category | Specific Metric | Formula / Principle | Interpretation |
|---|---|---|---|
| Temporal Fidelity | Correlation Coefficient (CC) | ( \rho = \frac{\text{cov}(X_{\text{clean}}, X_{\text{denoised}})}{\sigma_{X_{\text{clean}}} \sigma_{X_{\text{denoised}}}} ) | Higher is better (max 1) |
| Temporal Fidelity | Relative RMSE (Temporal) | ( \text{RRMSE}_t = \frac{ \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (X_{\text{clean},i} - X_{\text{denoised},i})^2 } }{ \sigma_{X_{\text{clean}}} } ) | Lower is better |
| Spectral Fidelity | Relative RMSE (Spectral) | ( \text{RRMSE}_f = \frac{ \sqrt{ \frac{1}{K} \sum_{j=1}^{K} (P_{\text{clean},j} - P_{\text{denoised},j})^2 } }{ \sigma_{P_{\text{clean}}} } ) | Lower is better |
| Signal Quality | Signal-to-Noise Ratio (SNR) | ( \text{SNR} = 10 \log_{10}\left( \frac{P_{\text{signal}}}{P_{\text{noise}}} \right) ) | Higher is better |
| Signal Quality | Signal-to-Artifact Ratio (SAR) | ( \text{SAR} = 10 \log_{10}\left( \frac{P_{\text{signal}}}{P_{\text{artifact}}} \right) ) | Higher is better |
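These metrics translate directly into numpy. A minimal sketch follows; for the spectral variant, pass power spectral density estimates instead of time series:

```python
import numpy as np

def cc(x_clean, x_denoised):
    """Pearson correlation coefficient between clean and denoised signals."""
    xc = x_clean - x_clean.mean()
    xd = x_denoised - x_denoised.mean()
    return np.sum(xc * xd) / (np.linalg.norm(xc) * np.linalg.norm(xd))

def rrmse(x_clean, x_denoised):
    """Relative RMSE; temporal or spectral depending on the input domain."""
    return np.sqrt(np.mean((x_clean - x_denoised) ** 2)) / x_clean.std()

def snr_db(x_signal, x_noise):
    """Signal-to-noise (or signal-to-artifact) ratio in decibels."""
    return 10.0 * np.log10(np.sum(x_signal ** 2) / np.sum(x_noise ** 2))
```

Reporting CC and RRMSE together guards against degenerate solutions: a denoiser that outputs a flat line can achieve low amplitude error while destroying the waveform shape that CC penalizes.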
Table 3: Key Resources for Deep Learning-Based EEG Denoising Research
| Resource Category | Specific Tool / Dataset | Function and Utility in Research |
|---|---|---|
| Benchmark Datasets | EEGDenoiseNet [15] | Semi-synthetic dataset with clean EEG, EOG, and EMG; enables standardized model benchmarking. |
| Benchmark Datasets | MIT-BIH Arrhythmia Database [19] [15] | Source of ECG signals for creating semi-synthetic datasets to evaluate cardiac artifact removal. |
| Benchmark Datasets | Public tES-EEG Datasets [50] | Datasets containing EEG recordings with transcranial electrical stimulation artifacts. |
| Software & Libraries | TensorFlow / PyTorch [49] | Core deep learning frameworks for implementing and training CNN, LSTM, GAN, and Transformer models. |
| Software & Libraries | HOMER2 (for fNIRS) [54] | A prevalent toolbox for fNIRS data processing; illustrates cross-domain application of denoising principles. |
| Software & Libraries | PRISMA Guidelines [49] [14] | Systematic review guidelines for comprehensive literature search and study selection. |
| Hardware Considerations | High-Density EEG Systems (256ch) [5] | Acquisition systems providing high spatial resolution data, crucial for evaluating spatial denoising. |
| Hardware Considerations | Wearable EEG with Dry Electrodes [14] | Devices for ecological monitoring; present unique artifacts from motion and reduced electrode contact. |
The field of deep learning-based EEG denoising is rapidly evolving. Future research will focus on enhancing model generalizability across diverse subjects, recording setups, and artifact types [49] [15]. Self-supervised learning and federated learning are emerging paradigms to address data scarcity and privacy concerns, respectively [49]. Furthermore, the development of lightweight, computationally efficient models is critical for real-time applications such as closed-loop BCIs and clinical neurofeedback [49] [51]. The integration of auxiliary signals (e.g., from IMU sensors) holds promise for better identification of motion artifacts in wearable hd-EEG [14]. Finally, improving model interpretability will be key for building trust and facilitating the adoption of these methods in clinical practice [49].
In conclusion, deep learning models have irrevocably transformed the landscape of artifact removal in high-density EEG research. From CNNs and LSTMs to the sophisticated hybrid models of today, these approaches have demonstrated a remarkable capacity to separate complex, non-linear artifacts from genuine neural signals in an end-to-end, data-driven manner. As these architectures continue to mature, they will unlock deeper insights from hd-EEG data, accelerating progress in neuroscience, neuromedicine, and brain-inspired computing.
Figure 2: Evolution of Deep Learning Models for EEG Denoising
Electroencephalography (EEG) remains a cornerstone technique for investigating functional brain dynamics with millisecond temporal precision in both clinical and research settings [55]. However, EEG data are frequently contaminated by numerous biological and environmental artifacts which, if not adequately removed, can obscure underlying neural signals and compromise data integrity [55]. This challenge is particularly pronounced in high-density EEG systems, where the complex interplay of neural sources and artifacts demands sophisticated processing pipelines. Artifacts in EEG exhibit specific spatial, temporal, and spectral characteristics that require tailored detection and removal strategies [11]. Without clear classification and targeted processing, pipelines risk applying overly generic solutions that may prove ineffective or even compromise neurophysiological components of interest [11].
The proliferation of wearable EEG technology has further complicated artifact management, as relaxed constraints of acquisition setups often compromise signal quality through factors including dry electrodes, reduced scalp coverage, and subject mobility [11]. This technical guide provides a comprehensive framework for implementing an effective artifact cleaning pipeline from raw data to cleaned output, with specific consideration for the challenges inherent in high-density EEG research.
The following diagram illustrates the core stages of a robust EEG artifact cleaning pipeline, integrating both established and emerging methodological approaches.
This workflow implements a sequential processing structure where each stage addresses specific aspects of artifact contamination:
Table 1: Performance metrics for major artifact removal approaches in wearable EEG applications
| Method Category | Primary Techniques | Effectiveness Metrics | Optimal Application Context | Computational Demand |
|---|---|---|---|---|
| Blind Source Separation | Independent Component Analysis (ICA), Principal Component Analysis (PCA) | Accuracy: 71%, Selectivity: 63% [11] | Ocular and muscular artifacts in multi-channel setups [11] | High (especially with high channel counts) |
| Spatial Filtering | Artifact Subspace Reconstruction (ASR) | Significantly reduces power at gait frequency; improves ICA dipolarity [56] | Motion artifacts during locomotion; general artifact correction [11] [56] | Medium to High |
| Deep Learning | Complex CNN, M4 (State Space Models) | RRMSE: 0.15-0.25 (temporal); 0.18-0.30 (spectral) [50] | tES-induced artifacts; muscular and motion artifacts [11] [50] | Very High (GPU acceleration recommended) |
| Adaptive Filtering | iCanClean with pseudo-reference signals | Reduces gait frequency power; enables P300 ERP congruency detection [56] | Motion artifacts during running; mobile brain imaging [56] | Medium |
| Wavelet-Based Methods | Wavelet-enhanced ICA (wICA) | Strong performance across multiple artifact types [55] | Ocular and muscular artifacts; pediatric EEG [55] | Medium |
Table 2: Specialized pipelines for developmental EEG populations
| Pipeline Name | Core Methodology | Target Population | Key Adaptations | Performance Highlights |
|---|---|---|---|---|
| RELAX-Jr | Multi-channel Wiener Filtering (MWF) + wICA + adjusted-ADJUST | Children (4-12 years) | PICARD algorithm; sensitive to increased noise; accounts for lower alpha peaks [55] | Strong artifact reduction while preserving neural signals; effective for high-motion data [55] |
| MADE | Automated preprocessing with ICA | Developmental populations | Optimized for movement-rich data | Effective for large-scale developmental studies [55] |
| HAPPE | ICA-based artifact removal | Developmental and clinical populations | Enhanced bad channel detection | Maintains data integrity in compromised recordings [55] |
The RELAX-Jr pipeline represents a fully automated approach specifically adapted for cleaning EEG data from children, who typically exhibit more pronounced movement and muscle artifacts [55]. The protocol implements these key stages:
Preprocessing Configuration:
Artifact Removal Core Processing:
Validation and Quality Metrics:
For EEG recorded during locomotion, specialized protocols are required to address movement artifacts:
iCanClean with Pseudo-Reference Signals:
Artifact Subspace Reconstruction (ASR):
Table 3: Critical computational tools and algorithms for EEG artifact management
| Tool/Algorithm | Function | Implementation Considerations |
|---|---|---|
| Independent Component Analysis (ICA) | Blind source separation to identify and isolate artifact components | Effectiveness decreases with low-channel-count systems (<16 channels); requires sufficient data length for convergence [11] |
| Artifact Subspace Reconstruction (ASR) | Automated removal of high-variance artifact components using statistical thresholding | Particularly effective for motion and ocular artifacts; requires calibration data [11] [56] |
| Multi-channel Wiener Filter (MWF) | Spatial filtering technique that estimates and removes artifacts using signal statistics | Does not require reference channels; effective for various artifact types [55] |
| Complex CNN | Deep learning approach for temporal and spectral artifact removal | Superior performance for tDCS artifacts; requires substantial training data [50] |
| State Space Models (M4) | Multi-modular network for complex artifact patterns | Excels at removing tACS and tRNS artifacts; high computational demands [50] |
| Wavelet-Enhanced ICA | Hybrid approach combining wavelet thresholding with ICA | Effective for ocular and muscle artifacts; preserves neural signal integrity [55] |
Deep learning approaches represent the cutting edge of artifact removal methodology, particularly for complex artifact patterns that challenge traditional techniques. Recent benchmarks demonstrate that method performance is highly dependent on stimulation type and artifact characteristics [50]. For tDCS artifacts, convolutional networks (Complex CNN) deliver superior performance, while multi-modular networks based on State Space Models (SSMs) yield optimal results for tACS and tRNS artifacts [50].
Semi-synthetic datasets with known ground truth enable controlled and rigorous model evaluation, providing reliable benchmarks for method selection in real-time neurophysiological monitoring applications [50]. These approaches are particularly valuable for clinical and neuroimaging applications where artifact removal must be both effective and efficient.
Future developments in artifact management will likely focus on real-time processing capabilities, improved adaptation to individual differences in artifact characteristics, and enhanced preservation of neural signals during the cleaning process. The integration of auxiliary sensors (e.g., IMUs, EOG, EMG) remains underutilized despite significant potential for enhancing artifact detection under ecological conditions [11]. As wearable EEG systems continue to evolve, artifact removal pipelines must simultaneously address the competing demands of computational efficiency, processing accuracy, and practical implementation across diverse research and clinical contexts.
In high-density electroencephalography (EEG) research, the process of artifact removal presents a fundamental paradox: how to eliminate contaminating noise while preserving the integrity of underlying neural signals. Over-cleaning can systematically remove or alter genuine neurophysiological data, potentially leading to erroneous conclusions in both basic research and clinical drug development. This challenge has intensified with the rapid adoption of wearable EEG systems and dry electrodes, which, while offering unprecedented access to brain activity in ecological settings, introduce new types of artifacts and signal quality concerns [24] [11]. The expansion of EEG into new domains—including neuropharmacology, neuromarketing, and real-world cognitive monitoring—demands rigorous methodologies that balance cleaning efficacy with neural information preservation.
Artifacts in EEG originate from multiple sources, broadly categorized as physiological (e.g., ocular, muscular, cardiac) and non-physiological (e.g., environmental noise, electrode movement) [11] [57]. Traditional artifact removal approaches, developed for controlled laboratory settings with gel-based systems, often prove inadequate for the dynamic environments where modern high-density EEG is deployed. The core challenge lies in the significant spectral and temporal overlap between artifacts and neural signals of interest, making their separation particularly difficult without advanced processing techniques [15]. For researchers in drug development, where quantitative EEG biomarkers may serve as critical endpoints in clinical trials, preserving signal fidelity is not merely methodological but essential for valid scientific inference.
Over-cleaning occurs when artifact removal algorithms inadvertently discard or distort genuine neural signals, resulting in a loss of neurophysiologically meaningful information. This phenomenon manifests through several measurable indicators: excessive attenuation of signal amplitude in specific frequency bands, reduced complexity of the neural time series, introduction of spurious correlations between channels, and elimination of event-related potentials or high-frequency oscillations [11] [15]. In pharmacological EEG studies, over-cleaning can obscure dose-dependent changes in spectral power or connectivity metrics, potentially masking therapeutic effects or creating false positive findings.
The risk of over-cleaning is particularly pronounced in high-density EEG systems due to their increased sensitivity to subtle neural processes and the computational complexity of processing numerous channels simultaneously. Artifact removal techniques that perform adequately with low-channel count systems may become over-aggressive when applied to high-density arrays, as they might misinterpret spatial patterns of neural activity as artifacts [57]. This problem is exacerbated in dry EEG systems, where the absence of conductive gel increases impedance and movement artifacts, creating a more challenging signal environment that tempts researchers toward increasingly aggressive cleaning pipelines [57].
Different artifact removal approaches carry distinct risks for over-cleaning. Table 1 summarizes the primary limitations of common techniques when applied to high-density EEG research.
Table 1: Artifact Removal Methods and Their Associated Over-Cleaning Risks
| Method Category | Specific Techniques | Over-Cleaning Manifestations | Neural Information Most at Risk |
|---|---|---|---|
| Spatial Filtering | PCA, ICA, SPHARA | Over-component rejection, spatial smoothing that blurs localized activity | High-frequency oscillations, focal pathological patterns (e.g., epileptiform discharges) |
| Temporal Filtering | High-pass/Low-pass filters, Notch filters | Ringing artifacts, phase distortion, abolition of transient signals | Evoked potentials, cross-frequency coupling, phase-amplitude relationships |
| Regression-Based | EOG/ECG regression, Surface Laplacian | Over-correction, introduction of negative power | Frontal theta activity, genuine frontal signals correlated with ocular movements |
| Wavelet Transform | Thresholding techniques | Over-thresholding of high-frequency components | Gamma-band activity, sleep spindles |
| Deep Learning | CNN-LSTM models (e.g., CLEnet) | Over-fitting to training data, removal of unknown neural patterns | Novel cognitive states, individual-specific signatures |
Independent Component Analysis (ICA), while powerful for separating neural from non-neural sources, requires careful manual inspection to avoid rejecting components containing neural information. Automated ICA rejection algorithms frequently misclassify components containing genuine brain activity, particularly from frontal regions where neural and ocular sources exhibit similar spatial distributions [11]. Similarly, regression-based methods that use reference signals from electrooculography (EOG) or electromyography (EMG) can create "over-correction" artifacts, particularly when the reference channels themselves contain neural signals [15].
Deep learning approaches represent a promising advancement but introduce new challenges. Models like CLEnet, which combines convolutional neural networks (CNN) with long short-term memory (LSTM) networks, demonstrate improved capability for removing unknown artifacts while preserving neural information [15]. However, these models may over-fit to their training data and remove novel neural patterns not represented in the training set. As noted in recent research, "network structures capable of removing various types of artifacts and performing artifacts removal on multi-channel EEG have broader prospects for development" [15], highlighting the need for more adaptable architectures.
Evaluating artifact removal performance requires multiple complementary metrics to assess both noise reduction and neural preservation. No single metric adequately captures this balance, necessitating a multidimensional assessment framework. Table 2 presents key quantitative metrics used in recent studies to evaluate cleaning efficacy without over-removal.
Table 2: Quantitative Metrics for Assessing Cleaning Efficacy and Neural Preservation
| Metric Category | Specific Metrics | Optimal Range | Interpretation in Balance Context |
|---|---|---|---|
| Time-Domain Accuracy | RRMSEt (Relative Root Mean Square Error) | Lower values preferred (<0.35) | Measures temporal distortion; values >0.4 suggest significant shape alteration |
| Frequency-Domain Accuracy | RRMSEf (Relative Root Mean Square Error) | Lower values preferred (<0.35) | Assesses spectral preservation; elevated values indicate unwanted frequency manipulation |
| Signal Quality | SNR (Signal-to-Noise Ratio) | Higher values preferred (>11 dB) | Indicates noise reduction but can be misleading if neural signals are also removed |
| Temporal Structure | CC (Correlation Coefficient) | Higher values preferred (>0.9) | Measures waveform preservation; values <0.8 suggest important features lost |
| Spatial Integrity | RMSD (Root Mean Square Deviation) | Context-dependent | Assesses topographic preservation; critical for source localization |
Recent research by Du et al. (2025) demonstrates the application of these metrics for evaluating their CLEnet model, which achieved a correlation coefficient of 0.925 and a signal-to-noise ratio of 11.498 dB in removing mixed artifacts while maintaining RRMSEt at 0.300 and RRMSEf at 0.319 [15]. These values indicate successful artifact reduction with minimal signal distortion, representing the target balance researchers should seek.
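These four accuracy metrics are straightforward to compute whenever a ground-truth clean signal is available, as in semi-synthetic benchmarks. The following is a minimal sketch assuming single-channel NumPy arrays; the exact spectral estimator varies between studies, so the periodogram used here is an illustrative choice, not a prescribed standard:

```python
import numpy as np

def rrmse_t(clean, denoised):
    """Relative root-mean-square error in the time domain (lower is better)."""
    return np.sqrt(np.mean((clean - denoised) ** 2)) / np.sqrt(np.mean(clean ** 2))

def rrmse_f(clean, denoised):
    """Relative RMSE between power spectra (periodogram-based, illustrative)."""
    psd_c = np.abs(np.fft.rfft(clean)) ** 2
    psd_d = np.abs(np.fft.rfft(denoised)) ** 2
    return np.sqrt(np.mean((psd_c - psd_d) ** 2)) / np.sqrt(np.mean(psd_c ** 2))

def snr_db(clean, denoised):
    """SNR of the denoised signal relative to the residual error, in dB."""
    return 10 * np.log10(np.sum(clean ** 2) / np.sum((clean - denoised) ** 2))

def cc(clean, denoised):
    """Pearson correlation coefficient between the two waveforms."""
    return np.corrcoef(clean, denoised)[0, 1]
```

Evaluating all four together, rather than SNR alone, guards against the misleading case noted in Table 2 where aggressive cleaning inflates SNR while distorting the waveform.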
For pharmacological studies, additional validation is necessary to ensure that cleaning methods preserve drug-induced EEG changes. This typically requires establishing test-retest reliability in controlled conditions and demonstrating sensitivity to known drug effects before applying methods to novel compounds.
Robust validation of artifact removal pipelines requires carefully designed experimental protocols that quantify both artifact reduction and neural preservation. The following methodologies represent current best practices:
Semi-Synthetic Data Benchmarking: This approach involves adding real artifacts (e.g., EMG, EOG) to clean EEG baseline recordings or using simultaneously recorded artifact-free data as ground truth. Zhang et al. established a semi-synthetic benchmark dataset specifically for evaluating EMG and EOG artifact removal [15]. The protocol involves: (1) recording clean EEG during resting state with minimal artifact contamination; (2) separately recording artifact signals (e.g., eye blinks, muscle activity); (3) mathematically combining these signals with specific signal-to-noise ratios; and (4) applying artifact removal methods with the original clean EEG as ground truth for quantitative comparison.
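Step (3) of this protocol hinges on mixing clean EEG and recorded artifacts at a controlled signal-to-noise ratio. A minimal sketch of that mixing step, assuming single-channel NumPy arrays and the common definition SNR = 10·log10(P_signal / P_artifact); the scaling factor is derived so the target ratio holds exactly:

```python
import numpy as np

def mix_at_snr(clean_eeg, artifact, snr_db):
    """Scale an artifact segment and add it to clean EEG so that the
    clean-signal-to-artifact power ratio equals snr_db."""
    p_sig = np.mean(clean_eeg ** 2)
    p_art = np.mean(artifact ** 2)
    # choose lam so that 10*log10(p_sig / (lam**2 * p_art)) == snr_db
    lam = np.sqrt(p_sig / (p_art * 10 ** (snr_db / 10)))
    return clean_eeg + lam * artifact
```

Because the original clean recording is retained as ground truth (step 4), any denoising method can then be scored with the RRMSEt/RRMSEf/SNR/CC metrics described earlier.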
Real-World Task Paradigms: Studies incorporating naturalistic movements provide critical validation for ecological applications. Fiedler et al. (2025) employed a motor execution paradigm where participants performed movements with left hand, right hand, feet, and tongue while dry EEG was recorded [57]. This protocol enables assessment of artifact removal during known neural activation patterns (movement-related desynchronization/synchronization) in central regions, providing a biological reference for evaluating neural preservation.
Multi-Method Comparison: Gorjan et al. (2025) implemented a comparative protocol applying multiple cleaning pipelines (Fingerprint + ARCI, SPHARA, and their combination) to the same dataset [57]. Performance was quantified using standard deviation (SD), signal-to-noise ratio (SNR), and root mean square deviation (RMSD), with a generalized linear mixed effects (GLME) model identifying significant differences between methods.
Validation Workflow for balanced artifact removal method evaluation.
Emerging research demonstrates that combining multiple artifact removal techniques in structured pipelines yields superior results compared to any single method. These hybrid approaches leverage the complementary strengths of different algorithms while minimizing their individual limitations. Fiedler et al. (2025) reported that combining ICA-based methods (Fingerprint + ARCI) with spatial harmonic analysis (SPHARA) significantly outperformed either approach alone in dry EEG recordings [57].
The sequential pipeline implemented in their study achieved remarkable improvements: grand average values of standard deviation improved from 9.76 μV (reference preprocessed EEG) to 6.15 μV, while signal-to-noise ratio increased from 2.31 to 5.56 dB [57]. This demonstrates how carefully orchestrated multi-stage approaches can simultaneously enhance noise reduction and preserve neural information. The improved SPHARA version included an additional step of zeroing artifactual jumps in single channels before spatial filtering, highlighting how targeted pre-processing can optimize subsequent stages.
Another promising development is the EEG-cleanse pipeline, a modular and fully automated preprocessing system designed specifically for EEG recorded during full-body movement [58]. This pipeline combines motion-adaptive preprocessing methods with a hybrid strategy for labeling artifacts and preserves neural signals through structured logging and integration of open-source tools. Its modular design allows researchers to customize the sequence based on their specific artifact challenges and neural signals of interest.
Deep learning approaches represent the cutting edge of balanced artifact removal, with architectures specifically designed to separate neural and non-neural components while minimizing information loss. Du et al. (2025) developed CLEnet, which integrates dual-scale CNN and LSTM with an improved EMA-1D (One-Dimensional Efficient Multi-Scale Attention Mechanism) [15]. This architecture specifically addresses the limitation of previous models that disrupted temporal features during morphological feature extraction.
CLEnet operates through three specialized stages: (1) morphological feature extraction and temporal feature enhancement using two convolutional kernels of different scales; (2) temporal feature extraction using LSTM after dimensionality reduction; and (3) EEG reconstruction through fully connected layers [15]. This structured approach enables the model to capture both spatial and temporal characteristics of genuine neural activity, resulting in superior performance across multiple artifact types. On multi-channel EEG data containing unknown artifacts, CLEnet achieved 2.45% and 2.65% improvements in SNR and correlation coefficient respectively compared to the next best model, while reducing temporal and frequency domain errors by 6.94% and 3.30% [15].
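The three-stage data flow described above can be illustrated with a toy, untrained NumPy forward pass. This is emphatically not the published CLEnet implementation: the dual-scale convolutions use random kernels, and a simple leaky recurrence stands in for the LSTM. It only shows how morphological features at two temporal scales feed a temporal stage before a fully connected reconstruction:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernel):
    """Same-length 1-D convolution."""
    return np.convolve(x, kernel, mode="same")

def clenet_like_forward(x, k_small=3, k_large=11, hidden=8):
    """Toy forward pass mirroring CLEnet's staged structure on one channel:
    (1) dual-scale convolution, (2) a leaky recurrence standing in for the
    LSTM, (3) linear reconstruction back to the input length."""
    # Stage 1: morphological features at two temporal scales
    f_small = conv1d(x, rng.standard_normal(k_small) / k_small)
    f_large = conv1d(x, rng.standard_normal(k_large) / k_large)
    feats = np.stack([f_small, f_large])            # shape (2, T)

    # Stage 2: simple temporal integration (LSTM stand-in)
    h = np.zeros((hidden, feats.shape[1]))
    w_in = rng.standard_normal((hidden, 2)) * 0.1
    for t in range(feats.shape[1]):
        prev = h[:, t - 1] if t > 0 else np.zeros(hidden)
        h[:, t] = np.tanh(w_in @ feats[:, t] + 0.5 * prev)

    # Stage 3: reconstruction through a fully connected readout
    w_out = rng.standard_normal(hidden) * 0.1
    return w_out @ h                                 # shape (T,)

x = rng.standard_normal(256)
y = clenet_like_forward(x)
```

The design point the sketch preserves is that temporal context (stage 2) is applied after, not instead of, multi-scale morphological feature extraction (stage 1), which is the limitation of earlier single-stage models that CLEnet targets.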
Deep learning architecture for balanced artifact removal.
Table 3: Essential Research Tools for Balanced EEG Artifact Removal
| Tool Category | Specific Solutions | Function in Balanced Cleaning | Implementation Considerations |
|---|---|---|---|
| Reference Datasets | EEGdenoiseNet, MIT-BIH Arrhythmia Database | Provide ground truth for method development and validation | Critical for training supervised algorithms; enables benchmarking |
| Spatial Processing | SPHARA (Spatial Harmonic Analysis) | Reduces noise while preserving spatial patterns | Particularly effective combined with ICA; improves source localization |
| Source Separation | Independent Component Analysis (ICA) | Separates neural and non-neural sources | Requires careful component selection; risk of neural component rejection |
| Temporal Modeling | LSTM Networks | Captures long-range dependencies in neural signals | Preserves temporal structure; essential for event-related potentials |
| Feature Attention | EMA-1D (Efficient Multi-Scale Attention) | Enhances relevant features across time scales | Improves artifact detection without aggressive removal |
| Hybrid Frameworks | EEG-cleanse Pipeline | Modular automated cleaning with structured logging | Customizable for specific research needs; promotes reproducibility |
| Multi-Method Platforms | Fingerprint + ARCI + SPHARA | Combined physiological artifact reduction and denoising | Complementary approaches yield superior balanced performance |
The pursuit of balanced artifact removal in high-density EEG research requires both technical sophistication and philosophical discipline. The most advanced algorithms must be guided by a fundamental principle: cleaning should be driven by the specific requirements of the neural signals of interest rather than the wholesale elimination of all non-neural components. As EEG applications expand into real-world environments and pharmacological studies, the balance between cleaning efficacy and neural preservation becomes increasingly critical for scientific validity.
Future developments will likely focus on context-aware cleaning systems that adapt their parameters based on the experimental context, expected neural signals, and individual subject characteristics. The integration of auxiliary sensors (e.g., IMUs, EOG, EMG) remains underutilized despite their potential to enhance artifact detection under ecological conditions [11]. As deep learning approaches evolve, greater emphasis should be placed on explainable AI that provides transparency in which components are removed and why, enabling researchers to make informed decisions about the tradeoffs between signal cleanliness and neural integrity.
For the drug development community, establishing standardized validation protocols specifically designed for pharmacological EEG applications represents an urgent priority. Such standards would ensure that artifact removal methods preserve the subtle signal changes induced by neuroactive compounds, ultimately enhancing the reliability of EEG biomarkers in clinical trials. Through continued methodological innovation and rigorous validation, the field can overcome the balancing act challenge, unlocking the full potential of high-density EEG as a window into brain function and therapeutic response.
High-density electroencephalography (hd-EEG) has become essential in both clinical and research settings, providing unparalleled spatial resolution for analyzing brain dynamics. However, the vast data complexity from 256-channel setups introduces significant artifact removal challenges that directly impact interpretation validity [5]. The core challenge in hd-EEG artifact management lies in optimizing three parameter categories: (1) thresholds for statistical decision rules in artifact detection, (2) k-values for dimensionality reduction and fractal analysis, and (3) model hyperparameters for deep learning architectures. These parameters collectively determine the balance between preserving neural signals and removing contaminants—a balance particularly crucial in high-density configurations where traditional artifact rejection methods become computationally prohibitive [11] [5].
Current research reveals a troubling sensitivity in EEG decoding pipelines, where performance fluctuates significantly based on preprocessing choices and random initialization [59]. This systematic guide addresses the pressing need for standardized parameter optimization by synthesizing evidence from recent methodological advances, providing researchers with actionable frameworks for tuning the key parameters that govern hd-EEG artifact removal efficacy.
Threshold parameters serve as critical decision boundaries in both traditional and machine learning-based artifact detection pipelines. These values determine whether a signal component is classified as neural activity or artifact, making their optimization fundamental to analysis integrity.
Table 1: Key Threshold Parameters in EEG Artifact Management
| Parameter Type | Typical Range | Function | Impact of Improper Tuning |
|---|---|---|---|
| ICA Correlation Threshold | 0.7-0.9 (for ocular artifacts) | Identifies ICA components correlated with reference EOG/EMG | Under-correction (low threshold) leaves artifacts; over-correction (high threshold) removes neural signals |
| ASR Burst Criteria | 3-20 standard deviations | Defines threshold for identifying unusual activity in Artifact Subspace Reconstruction | Too conservative: insufficient artifact removal; too liberal: neural signal distortion |
| Fractal Dimension Threshold | Varies by baseline HFD | Separates artifact-contaminated segments from clean EEG | Task-dependent; requires baseline establishment for each cognitive state |
| Amplitude Rejection Threshold | ±50-100 μV | Identifies extreme amplitude deviations | High values miss subtle artifacts; low values reject excessive neural data |
Independent Component Analysis (ICA) remains widely used despite hd-EEG challenges, with correlation thresholds between ICA components and reference signals requiring careful optimization. Studies indicate that thresholds between 0.7-0.9 for ocular artifacts balance specificity and sensitivity, though these values must be adjusted based on signal-to-noise ratio and research objectives [11]. For Artifact Subspace Reconstruction (ASR), the burst criterion—typically set between 3-20 standard deviations—defines the threshold for identifying unusual activity worthy of correction [11].
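The ICA correlation-thresholding rule described above reduces to flagging any component whose time course correlates with a reference channel beyond the chosen cutoff. A minimal sketch, assuming component time courses have already been extracted (e.g., with MNE-Python or EEGLAB); the blink-like reference in the test is purely synthetic:

```python
import numpy as np

def flag_ocular_components(ic_sources, eog_ref, threshold=0.8):
    """Return indices of ICA component time courses whose absolute Pearson
    correlation with a reference EOG channel meets or exceeds `threshold`
    (a value in the 0.7-0.9 range discussed above).
    ic_sources: array of shape (n_components, n_samples)."""
    flagged = []
    for i, src in enumerate(ic_sources):
        r = np.corrcoef(src, eog_ref)[0, 1]
        if abs(r) >= threshold:
            flagged.append(i)
    return flagged
```

Raising the threshold toward 0.9 trades sensitivity for specificity: fewer genuinely frontal neural components are rejected, at the cost of occasionally retaining mild ocular contamination.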
The emergence of nonlinear measures has introduced additional threshold considerations. Higuchi Fractal Dimension (HFD) analysis has demonstrated exceptional sensitivity in detecting state changes in EEG signals, with thresholds for artifact identification requiring establishment of baseline HFD values for each cognitive state [60]. Comparative studies have found HFD to be 11 times more likely to detect consciousness-state differences than the best-performing linear methods, highlighting its sensitivity but also its threshold optimization challenges [60].
The k-value represents a particularly nuanced parameter in Higuchi Fractal Dimension analysis, controlling the time series segmentation approach for fractal dimension calculation. This parameter directly influences the balance between computational efficiency and measurement accuracy in quantifying signal complexity.
In HFD analysis, the k-value (kmax) defines the maximum time interval for constructing signal subsets. Optimal k-values are dataset-specific and depend on sampling rate, with common values ranging from 8-25 for EEG signals sampled at 128-1000 Hz [60]. Higher k-values provide more accurate fractal dimension estimates but increase computational burden, while lower values may undersample the signal's fractal properties. Research indicates that k-values should be set to approximately one-quarter to one-third of the time series length for robust HFD calculation [60].
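The role of kmax can be made concrete with a direct implementation of Higuchi's algorithm. The following is a minimal sketch of the standard curve-length formulation; normalization details vary slightly between published implementations, so treat this as illustrative rather than canonical:

```python
import numpy as np

def higuchi_fd(x, kmax=10):
    """Higuchi Fractal Dimension of a 1-D signal.
    kmax is the k-value discussed above: the maximum interval used when
    constructing the downsampled subsets of the series."""
    n = len(x)
    log_lk, log_inv_k = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):                      # one subset per start offset m
            idx = np.arange(m, n, k)
            if len(idx) < 2:
                continue
            # normalized curve length of the subset x[m], x[m+k], x[m+2k], ...
            lm = np.sum(np.abs(np.diff(x[idx]))) * (n - 1) / (len(idx) - 1) / k
            lengths.append(lm / k)
        log_lk.append(np.log(np.mean(lengths)))
        log_inv_k.append(np.log(1.0 / k))
    # fractal dimension = slope of log L(k) against log(1/k)
    slope, _ = np.polyfit(log_inv_k, log_lk, 1)
    return slope
```

A straight line yields HFD near 1 and white noise near 2, which is why baseline HFD values must be established per cognitive state before a rejection threshold is meaningful.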
Beyond fractal analysis, k-values appear in dimensionality reduction techniques, where k determines the number of components to retain. In PCA-based artifact removal, the k parameter defines how many principal components to remove as potential artifacts—a delicate balance that requires both statistical and domain knowledge to optimize [11].
Deep learning approaches have introduced a new category of parameters requiring optimization, with architecture-specific hyperparameters dramatically influencing artifact removal performance across diverse EEG contexts.
Table 2: Key Hyperparameters in Deep Learning EEG Denoising
| Hyperparameter | Influence on Model Performance | Optimization Strategies |
|---|---|---|
| Learning Rate | Controls parameter update steps; critical for training stability | Cyclical learning rates (0.001-0.1) often outperform fixed values; impacts convergence speed and final performance |
| Batch Size | Affects gradient estimation and generalization | Smaller batches (16-32) often better for non-stationary EEG data; limited by hardware constraints |
| Network Depth/Width | Determines model capacity and feature abstraction capability | Deeper networks better for temporal dependencies; width increases feature diversity; requires balance to prevent overfitting |
| Loss Function Weights | Balances multiple objectives in denoising | In GAN architectures, discriminator/generator balance crucial; task-specific weighting improves targeted artifact removal |
The AnEEG model exemplifies hyperparameter sensitivity, utilizing Long Short-Term Memory (LSTM) layers within a Generative Adversarial Network (GAN) framework. The generator employs a two-layered LSTM architecture with 50 hidden units each, requiring careful tuning of learning rates and loss function weights to maintain the discriminator/generator balance [19]. The A²DM framework introduces artifact representation as prior knowledge, with hyperparameters controlling the fusion of time-frequency domain information and the hard attention mechanism in its Frequency Enhancement Module [53].
Comprehensive protocol validation on 9 datasets with 204 participants demonstrated that automatic hyperparameter search encompassing the entire pipeline—not just network parameters—consistently outperformed baseline state-of-the-art pipelines [59]. The optimal protocol employed a 2-step hyperparameter search via an informed search algorithm, with final training and evaluation performed using 10 random initializations for reliability [59].
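The 2-step search with multiple random initializations can be sketched with a toy objective standing in for an actual training run. The score surface and its optimum (learning rate 1e-3, batch size 32) are purely illustrative assumptions, not values from the cited protocol:

```python
import numpy as np

def evaluate(lr, batch, seed):
    """Toy stand-in for training and scoring a decoder: a noisy 'accuracy'
    that (by construction) peaks near lr=1e-3, batch=32."""
    rng = np.random.default_rng(seed)
    score = (1.0
             - (np.log10(lr) + 3) ** 2 * 0.05
             - (np.log2(batch) - 5) ** 2 * 0.02)
    return score + rng.normal(0, 0.01)

def two_step_search(seeds=range(10)):
    """2-step search: coarse grid, then refinement around the coarse winner,
    each candidate scored as the mean over multiple random initializations."""
    coarse = [(lr, b) for lr in (1e-4, 1e-3, 1e-2) for b in (16, 32, 64)]
    best = max(coarse, key=lambda p: np.mean([evaluate(*p, s) for s in seeds]))
    lr0, b0 = best
    fine = [(lr0 * f, b0) for f in (0.5, 1.0, 2.0)]
    return max(fine, key=lambda p: np.mean([evaluate(*p, s) for s in seeds]))
```

Averaging over seeds before comparing candidates is the key discipline: it prevents a lucky initialization from winning the search, mirroring the 10-seed evaluation reported in the protocol.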
Recent research establishes a comprehensive protocol for reliable hyperparameter optimization in EEG decoding pipelines, validated across multiple datasets and deep learning models [59].
Materials and Setup
Methodology
Key Parameters Optimized
The Artifact-Aware Denoising Model (A²DM) presents a unified framework for removing multiple artifact types through artifact representation fusion and specialized modules [53].
Materials and Setup
Methodology
Key Parameters
A²DM Architecture: Artifact-Aware Denoising Model workflow
A multiverse approach systematically evaluates how preprocessing parameter choices influence decoding performance across seven EEG experiments [9].
Materials and Setup
Methodology
Key Findings
Multiverse Preprocessing Analysis: Evaluating parameter influences
Table 3: Essential Research Tools for hd-EEG Parameter Optimization
| Tool/Resource | Function | Application Context |
|---|---|---|
| EEGdenoiseNet | Benchmark dataset for artifact removal | Training and evaluating denoising algorithms; contains EOG and EMG artifacts |
| MNE-Python | Open-source Python package for EEG analysis | Implementing preprocessing pipelines; multiverse analysis |
| Artifact Subspace Reconstruction (ASR) | Statistical method for burst artifact removal | Real-time artifact correction in wearable EEG; parameter: burst criterion |
| Higuchi Fractal Dimension (HFD) | Nonlinear measure of signal complexity | Detecting state changes and artifacts; parameter: k-value |
| ICA | Blind source separation for artifact isolation | Ocular and muscular artifact removal; parameter: correlation threshold |
| Autoreject | Automated artifact rejection pipeline | Handling bad channels and epochs; parameter: consensus threshold |
| ERP CORE | Stimulus set for eliciting core ERPs | Standardized paradigm for methodological studies |
The parameter optimization principles established in this guide share common foundations while requiring modality-specific adaptations. Three key principles emerge across studies:
First, informed automation outperforms manual selection for hyperparameter optimization. The comprehensive protocol validating 2-step hyperparameter search with informed algorithms demonstrated consistent performance improvements across 9 datasets and multiple models [59]. This approach reduces researcher bias while systematically exploring the parameter space.
Second, context determines optimal values for many critical parameters. The multiverse analysis revealed that while some preprocessing parameters (like high-pass filter cutoff) showed consistent directional effects, others were experiment-specific [9]. This underscores the importance of domain knowledge in parameter optimization rather than universal presets.
Third, validation rigor must match optimization effort. The demonstrated practice of using multiple random initializations (10 seeds) provides more stable performance estimates, addressing the sensitivity of deep learning models to initialization [59]. Similarly, the multiverse approach provides comprehensive sensitivity analysis rather than single-pipeline reporting [9].
Future directions point toward increasing integration of artifact-specific knowledge into parameter selection, as demonstrated by A²DM's use of artifact representation to guide denoising strategy [53]. This artifact-aware approach represents a promising middle ground between fully automated and manually tuned pipelines, potentially offering the robustness of automation with the precision of expert knowledge.
Parameter optimization in hd-EEG artifact management remains both a challenge and a necessity. As evidence accumulates regarding the profound influence of thresholds, k-values, and hyperparameters on decoding outcomes, the field moves toward more systematic, transparent optimization approaches. The protocols and parameters detailed in this guide provide a foundation for trustworthy EEG analysis—one that balances computational efficiency with methodological rigor, and automated search with domain expertise. Through continued refinement of these optimization strategies, the EEG research community can advance toward more reproducible, valid neural decoding across diverse applications from basic neuroscience to clinical translation.
The application of high-density electroencephalography (hd-EEG) has traditionally been confined to controlled laboratory environments, where stationary equipment and restricted participant movement minimize signal contamination. However, the growing demand for neuroimaging in naturalistic settings—ranging from real-world cognitive monitoring to at-home therapeutic interventions—has driven the development of wearable hd-EEG systems. This transition from the lab to the real world introduces significant challenges, primarily concerning the management of artifacts introduced by subject movement, environmental noise, and the limitations of mobile hardware [61] [14]. Artifact removal, therefore, transforms from a primarily offline preprocessing step into a critical constraint that determines the viability of real-time applications such as brain-computer interfaces (BCIs), neurofeedback, and closed-loop neuromodulation [62].
Within the broader context of challenges in artifact removal for hd-EEG research, this technical guide addresses the specific strategies required to overcome the constraints imposed by wearable systems and real-time processing demands. The core challenge lies in the fact that artifacts in mobile EEG are more frequent, more intense, and inherently non-stereotypical, while the computational resources for processing are often limited [14]. Furthermore, traditional artifact removal methods like Independent Component Analysis (ICA), which often require manual inspection and offline processing, are ill-suited for these new paradigms [63]. This document provides an in-depth analysis of contemporary hardware and software strategies designed to overcome these hurdles, offering researchers and drug development professionals a framework for implementing robust and reliable real-world hd-EEG applications.
The pursuit of high-fidelity hd-EEG outside the laboratory is fraught with technical obstacles that directly impact data quality and interpretability.
The foundation for clean signal acquisition is laid at the hardware level. Strategic choices in sensor technology and system design can preemptively mitigate certain types of artifacts.
A powerful hardware-level approach involves integrating auxiliary sensors to provide reference signals for artifact removal.

Table 1: Key Auxiliary Sensors for Wearable hd-EEG Artifact Removal
| Sensor Type | Primary Function | Application in Artifact Removal |
|---|---|---|
| Inertial Measurement Units (IMUs) | Track head acceleration and rotational velocity. | Detect and characterize motion artifacts caused by head movements for subsequent regression or rejection [14]. |
| Electrooculography (EOG) | Record electrical potentials from eye movements. | Provide a reference signal for regression-based removal of ocular artifacts (blinks, saccades) [63]. |
| Photoplethysmography (PPG) | Measure blood volume changes optically. | Identify cardiac-related artifacts (ballistocardiogram) in the EEG signal [61]. |
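Once such reference signals are recorded, the simplest way to exploit them is linear regression: remove from each EEG channel the component that is linearly predictable from the auxiliary sensors. A minimal least-squares sketch, assuming time-aligned, equally sampled arrays; real pipelines typically use adaptive or windowed variants to track non-stationary coupling between motion and signal:

```python
import numpy as np

def regress_out_reference(eeg, refs):
    """Subtract from each EEG channel its least-squares projection onto
    reference sensors (e.g., IMU axes, EOG, or PPG).
    eeg:  array of shape (n_channels, n_samples)
    refs: array of shape (n_refs, n_samples)"""
    R = refs.T                                          # (n_samples, n_refs)
    beta, *_ = np.linalg.lstsq(R, eeg.T, rcond=None)    # (n_refs, n_channels)
    return eeg - (R @ beta).T
```

Note the over-correction risk flagged earlier in this article: if the reference channels themselves contain neural signal, the regression removes that neural contribution too, so reference placement and quality matter as much as the algorithm.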
Software-based artifact removal strategies have evolved significantly, with a clear trend towards automated, real-time-capable algorithms that can function with the constraints of wearable hd-EEG.
Deep learning models represent a paradigm shift, learning to map artifact-contaminated EEG to clean EEG in an end-to-end fashion, often outperforming traditional methods in handling complex and unknown artifacts.

Table 2: Performance Comparison of Deep Learning Models for Artifact Removal
| Model Name | Architecture Core | Key Performance Metrics | Best For |
|---|---|---|---|
| CLEnet [15] | Dual-scale CNN + LSTM with improved EMA-1D attention. | SNR: 11.498 dB, CC: 0.925 (mixed artifacts); 2.45% SNR increase on real unknown artifacts [15]. | Multi-channel EEG with mixed/unknown artifacts. |
| AnEEG [19] | GAN with LSTM layers. | Lower NMSE/RMSE, higher CC, SNR, and SAR vs. wavelet methods [19]. | Generating artifact-free EEG signals. |
| GCTNet [19] | GAN-guided parallel CNN & Transformer. | 11.15% reduction in RRMSE, 9.81 improvement in SNR [19]. | Capturing global and temporal dependencies. |
| 1D-ResCNN [15] | Multi-scale 1D Convolutional Neural Network. | Effective for feature extraction and reconstruction at multiple scales [15]. | Scale-invariant feature learning. |
These models, such as CLEnet, are designed to overcome the limitations of algorithms tailored to specific artifact types. By integrating Convolutional Neural Networks (CNNs) for extracting morphological features and Long Short-Term Memory (LSTM) networks for capturing temporal dependencies, they can handle a wide variety of artifacts simultaneously, including those not well-defined a priori [15]. The incorporation of attention mechanisms (e.g., EMA-1D) further enhances their ability to focus on relevant signal features [15].
For researchers seeking to implement or validate these strategies, the following protocols provide a detailed methodological roadmap.
Objective: To assess the signal quality and artifact susceptibility of an in-ear EEG device against a conventional scalp hd-EEG system.
Objective: To compare the performance of different online artifact removal methods (e.g., ASR, Online EMD, a deep learning model) in a BCI-like task.
Benchmarking workflow for comparing online artifact removal methods.
Successful implementation of real-time wearable hd-EEG requires a suite of hardware, software, and data resources.

Table 3: Essential Research Toolkit for Wearable hd-EEG Applications
| Category / Item | Specification / Example | Primary Function in Research |
|---|---|---|
| Wearable hd-EEG System | 16+ channel headset with dry electrodes; e.g., BioWolf [64]. | Mobile neural data acquisition platform for real-world studies. |
| Auxiliary Sensors | Tri-axial IMU, EOG electrodes, PPG sensor [61] [14]. | Provide reference signals for motion, ocular, and cardiac artifacts. |
| Benchmark Datasets | EEGdenoiseNet [15] [19], MIT-BIH Arrhythmia Database [15] [19]. | Provide standardized, semi-synthetic data for training and validating artifact removal algorithms. |
| Software Libraries | EEGLAB, Python (MNE, TensorFlow, PyTorch). | Provide implementations of standard preprocessing, ICA, and deep learning models. |
| Artifact Removal Algorithms | Artifact Subspace Reconstruction (ASR), CLEnet, AnEEG. | Core software components for automated, real-time signal cleaning. |
The transition of hd-EEG into real-world, wearable applications is critically dependent on robust strategies for managing artifacts under real-time constraints. No single solution exists; rather, a combined approach is necessary. This involves selecting appropriate hardware with stable electrode interfaces and integrated auxiliary sensors, coupled with the implementation of sophisticated, computationally efficient algorithms. The emergence of deep learning models offers a powerful, end-to-end solution for handling complex, mixed, and unknown artifacts, often surpassing the capabilities of classical methods. As these technologies continue to mature, they will unlock the full potential of hd-EEG, enabling unprecedented insights into brain function in naturalistic environments and paving the way for more effective clinical diagnostics and therapeutic interventions in neurology and drug development.
High-density electroencephalography (hd-EEG), utilizing 64 to 256 or more electrodes, has become essential in cognitive neuroscience and clinical research for its superior spatial resolution [65] [25]. However, the vast data volume from overnight or long-term recordings significantly complicates artifact removal [5]. Artifacts originating from ocular movements, muscle activity, cardiac signals, sweating, or electrode pops can profoundly distort neural signals, compromising data integrity and leading to erroneous conclusions in both academic research and drug development studies. Traditional artifact rejection methods, which simply discard contaminated epochs or channels, result in substantial data loss, reduced statistical power, and potential biases, especially in clinical trials where data retention is critical.
Within this challenging landscape, epoch-wise interpolation has emerged as an advanced preprocessing technique that enables researchers to recover and preserve valuable data. This method involves identifying artifactual periods in specific channels and reconstructing the corrupted signals using information from spatially adjacent, clean electrodes within the same epoch. Unlike whole-channel rejection or deletion approaches, epoch-wise interpolation operates on a fine-grained temporal scale, allowing for the precise restoration of neural signals while minimizing the loss of biological information. This technical guide explores the methodology, efficacy, and implementation of epoch-wise interpolation as a crucial tool for addressing the persistent challenge of artifact contamination in hd-EEG research.
Epoch-wise interpolation is a semi-automatic artifact removal routine specifically designed for the complexities of sleep hd-EEG and other long-duration recordings [5]. The methodology operates on a fundamental premise: when artifacts affect specific channels transiently rather than throughout an entire recording, the clean signals from surrounding electrodes within the same temporal epoch can be used to reconstruct the corrupted data. This approach generates a binary matrix (channels × epochs) that identifies artifactual values, enabling targeted interpolation only where necessary while preserving original data elsewhere.
The technique is particularly valuable for addressing localized artifacts such as electrode "pops" resulting from abrupt impedance changes, which typically affect single channels briefly rather than entire electrode arrays [66]. By leveraging the high spatial sampling of hd-EEG systems, where electrodes are positioned in dense arrays (often 128 or 256 channels), the method capitalizes on the strong spatial correlations between neighboring sensors to accurately reconstruct missing or artifactual data points.
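As an illustration of the core idea — a binary (channels × epochs) matrix driving targeted spatial reconstruction — the following sketch implements a simple inverse-distance-weighted interpolation. The function name and weighting scheme are illustrative assumptions; the published routine [5] uses its own interpolation implementation (and spherical-spline interpolation is common in practice).

```python
import numpy as np

def interpolate_bad_epochs(data, bad, positions, n_neighbors=4):
    """Epoch-wise spatial interpolation (illustrative sketch).

    data:      (n_channels, n_epochs, n_samples) epoched EEG
    bad:       (n_channels, n_epochs) boolean matrix flagging artifacts
    positions: (n_channels, 3) electrode coordinates
    Bad channel-epochs are rebuilt as an inverse-distance-weighted
    average of the nearest clean channels in the same epoch; clean
    channel-epochs are left untouched.
    """
    data = data.copy()
    dist = np.linalg.norm(positions[:, None] - positions[None, :], axis=-1)
    for ch, ep in zip(*np.nonzero(bad)):
        clean = np.nonzero(~bad[:, ep])[0]
        clean = clean[clean != ch]
        # Nearest clean neighbours of the flagged channel.
        order = clean[np.argsort(dist[ch, clean])][:n_neighbors]
        w = 1.0 / (dist[ch, order] + 1e-12)
        data[ch, ep] = (w[:, None] * data[order, ep]).sum(0) / w.sum()
    return data
```

Because only flagged channel-epoch pairs are rewritten, the original signal is preserved everywhere the binary matrix marks the data as clean.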
Applied across 54 overnight sleep hd-EEG recordings, this approach has demonstrated exceptional recovery capabilities, with the proportion of recoverable bad epochs depending strongly on the number of channels required to be artifact-free [5]. Results from large-scale validation studies confirm its effectiveness:
Table 1: Recovery Performance of Epoch-Wise Interpolation
| Metric | Performance | Contextual Notes |
|---|---|---|
| Epoch Recovery Rate | 95% to 100% of bad epochs restored | Depends on number of channels required to be artifact-free [5] |
| Topographic Preservation | Expected delta power topography maintained | Post-recovery patterns match physiological expectations [5] |
| Cyclic Pattern Preservation | Normal cyclic patterns preserved | Demonstrated in extreme cases with both few and many artifacts [5] |
| Comparative Advantage | Reduces effect size inflation | Compared to component subtraction methods [67] |
The restoration of between 95% and 100% of bad epochs represents a substantial improvement in data retention compared to traditional rejection methods [5]. Furthermore, after artifact removal using this approach, the topography and cyclic pattern of neural oscillations such as delta power appear as expected, confirming that the method preserves fundamental physiological properties of the EEG signal [5].
The successful implementation of epoch-wise interpolation requires a systematic approach that begins with data acquisition and proceeds through artifact detection, validation, and reconstruction.
[Workflow diagram: EEG Artifact Removal and Data Recovery Pipeline]
The initial critical phase involves comprehensive artifact identification through a graphical user interface (GUI) that enables researchers to assess epochs based on four sleep quality markers (SQMs) or analogous vigilance state indicators [5]. This semi-automatic approach requires the operator to have foundational knowledge of both physiological EEG patterns and common artifactual contamination. The detection process leverages multiple complementary strategies:
Table 2: Artifact Detection Methodologies
| Method Category | Specific Approach | Key Advantages |
|---|---|---|
| Feature-Based Detection | Extraction of 58 clinically relevant EEG features with unsupervised outlier detection [66] | Adaptable to various artifact types without predefined templates |
| Targeted Artifact Reduction | Independent Component Analysis (ICA) with period/frequency-specific cleaning [67] | Reduces effect size inflation and source localization biases |
| Deep Learning Approaches | Transformer architectures (ART) capturing millisecond-scale dynamics [16] | Holistic removal of multiple artifact types simultaneously |
| Hybrid Methods | Dual-scale CNN and LSTM networks (CLEnet) with attention mechanisms [7] | Effective for unknown artifacts and multi-channel data |
The semi-automatic routine produces a binary matrix (channels × epochs) that flags artifactual periods while preserving clean segments, enabling highly targeted intervention rather than wholesale channel rejection [5]. This precise identification is crucial for minimizing data loss and maintaining signal integrity.
Once artifacts are identified, several computational approaches can be employed for the actual interpolation process:
Spatiotemporal interpolation leverages the dense spatial sampling of hd-EEG systems, using algorithms that weight contributions from surrounding channels based on distance and signal correlation. This method is particularly effective for localized artifacts affecting single channels or small channel clusters.
Deep learning-based reconstruction represents a more recent advancement, with encoder-decoder networks trained to reconstruct corrupted segments using information from both spatial and temporal dimensions [66]. These approaches can be particularly effective for artifacts that simultaneously affect multiple channels.
Ensemble methods combine multiple outlier detection algorithms with reconstruction networks, framing the problem as a "frame-interpolation" task where artifactual segments are identified and then corrected through representation learning [66]. This approach has demonstrated approximately 10% relative improvement in downstream classification performance compared to non-corrected data.
Successfully implementing epoch-wise interpolation requires both computational tools and methodological rigor. The following table summarizes key resources mentioned in recent literature:
Table 3: Research Reagents and Computational Tools
| Tool/Resource | Function/Purpose | Implementation Notes |
|---|---|---|
| RELAX Pipeline | Targeted artifact reduction focusing on artifact periods of eye movement components and artifact frequencies of muscle components [67] | Freely available as EEGLAB plugin; reduces effect size inflation common in ICA |
| ART (Artifact Removal Transformer) | End-to-end denoising model employing transformer architecture to capture transient EEG dynamics [16] | Effectively removes multiple artifact sources simultaneously; improves BCI performance |
| CLEnet | Dual-branch neural network integrating CNN and LSTM with improved attention mechanisms [7] | Specifically designed for unknown artifacts and multi-channel EEG data |
| High-Density-SleepCleaner | Semi-automatic artifact removal routine with GUI for SQM assessment [5] | Includes epoch-wise interpolation function in online repository |
| Unsupervised Detection Framework | Ensemble of unsupervised outlier detection algorithms for patient- and task-specific artifact identification [66] | Does not require manual annotation; adaptable to novel EEG data |
The logical relationship between detection and correction methodologies follows a sequential decision process.
[Decision diagram: Artifact Correction Decision Framework]
Rigorous validation of epoch-wise interpolation requires carefully designed experiments that quantify both artifact removal efficacy and neural signal preservation. Recent literature provides several methodological paradigms:
Semi-synthetic dataset validation involves adding known artifacts (EMG, EOG, ECG) to clean EEG baselines, enabling precise quantification of removal performance through signal-to-noise ratio (SNR), correlation coefficients (CC), and temporal/frequency-domain error metrics [7]. Studies utilizing this approach have demonstrated that advanced denoising methods can achieve SNR improvements of 11.498 dB and correlation coefficients of 0.925 for mixed artifact removal [7].
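For reference, these benchmark metrics can be computed as in the minimal numpy sketch below. The exact conventions vary between papers (e.g., which signal appears in the SNR numerator), so treat these definitions as one common choice rather than a standard:

```python
import numpy as np

def denoise_metrics(clean, denoised):
    """Common semi-synthetic evaluation metrics for 1-D signals.

    SNR (dB): power of the clean signal over residual-error power.
    CC:       Pearson correlation between clean and denoised signals.
    RRMSEt:   temporal relative root-mean-square error.
    """
    err = denoised - clean
    snr_db = 10.0 * np.log10(np.sum(clean ** 2) / np.sum(err ** 2))
    cc = np.corrcoef(clean, denoised)[0, 1]
    rrmse_t = np.sqrt(np.mean(err ** 2)) / np.sqrt(np.mean(clean ** 2))
    return snr_db, cc, rrmse_t
```

Because the clean baseline is known by construction in a semi-synthetic dataset, all three metrics can be computed exactly, which is precisely what real-world recordings cannot offer.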
Real-world performance assessment applies these methods to experimentally collected hd-EEG data during cognitive tasks (e.g., 2-back working memory paradigms) with unknown artifact compositions, testing robustness under realistic conditions where no ground-truth reference signal is available [7].
Clinical validation examines how artifact removal impacts downstream analyses such as source localization accuracy [25], with studies demonstrating that targeted methods reduce biases common in conventional approaches [67].
Recent benchmarking studies provide quantitative comparisons between various artifact removal approaches:
Table 4: Performance Comparison of Artifact Removal Methods
| Method | SNR Improvement | Correlation Coefficient | Temporal Error (RRMSEt) | Best Use Case |
|---|---|---|---|---|
| CLEnet [7] | 11.498 dB | 0.925 | 0.300 | Mixed artifacts (EMG+EOG) in multi-channel EEG |
| DuoCL [7] | - | 0.901 | 0.322 | Temporal feature preservation |
| Targeted ICA Cleaning [67] | - | - | - | Reducing effect size inflation |
| 1D-ResCNN [7] | - | 0.917 | 0.304 | Single-channel focus |
| Epoch-Wise Interpolation [5] | - | - | - | Localized artifacts in hd-EEG |
These quantitative comparisons highlight the significant advances achieved by contemporary methods, particularly for complex artifact types and multi-channel EEG data. The integration of epoch-wise interpolation within broader processing pipelines represents a robust approach to maximizing data quality while preserving valuable experimental data that would otherwise be lost to artifact contamination.
Epoch-wise interpolation has emerged as a powerful technique within the artifact removal arsenal for high-density EEG research, addressing the critical challenge of balancing rigorous artifact correction with maximal data preservation. By enabling precise, spatially-informed reconstruction of transient artifactual periods rather than wholesale rejection of channels or epochs, this approach maintains the statistical power and ecological validity of hd-EEG studies while ensuring signal integrity. When integrated with complementary methods ranging from targeted ICA cleaning to sophisticated deep learning architectures, epoch-wise interpolation forms part of a comprehensive framework for addressing the persistent challenge of artifacts in electrophysiological research. As hd-EEG continues to expand its role in both basic neuroscience and applied drug development, these advanced preprocessing methodologies will play an increasingly vital role in ensuring the reliability, validity, and translational impact of brain connectivity and dynamics research.
In high-density electroencephalography (hd-EEG) research, particularly in the challenging domain of artifact removal, ensuring reproducibility is a fundamental requirement for scientific progress. Reproducibility enables the verification and validation of study findings, facilitates the identification or reduction of errors, and allows for accurate comparison of newly developed methodologies [68]. Within the specific context of artifact removal, the complexity of distinguishing neural signals from contaminants such as ocular movements, muscle activity, and environmental interference creates a critical point where methodological transparency becomes essential.
Despite its importance, a significant reproducibility crisis permeates scientific research. A Nature survey revealed that 70% of researchers could not reproduce another researcher's experiments, while over 50% could not reproduce their own research [68]. In EEG research, this crisis is exacerbated by the vast analytical flexibility available to researchers, with numerous methodological options and tools to be selected at each step of the research workflow [69]. This high analytical flexibility introduces substantial variability in research outcomes, particularly in artifact removal where methods range from traditional regression-based approaches to advanced deep learning techniques [27] [19]. The standardization of documentation and workflow practices presented in this whitepaper addresses these challenges directly, providing a framework for producing reliable, reproducible research in hd-EEG artifact removal and beyond.
Inter-dataset variability in EEG studies can originate from numerous sources throughout the research lifecycle. The Canadian Biomarker Integration Network in Depression (CAN-BIND) EEG working group has identified ten primary categories where errors or differences can introduce bias and variability [70]. Understanding these sources is the first step in controlling their impact on research outcomes, particularly in multi-site studies where integration of hd-EEG data is planned.
Table 1: Key Sources of Variability in Multi-Site EEG Research
| Category | Specific Sources of Variability | Impact on Reproducibility |
|---|---|---|
| Study Design | Sequence of data collection, time of day, participant instructions | Affects state-dependent EEG components including artifact prevalence |
| Equipment & Setup | Make/model of equipment, electrode types, amplifier systems | Introduces technical differences in signal acquisition and noise profiles |
| Acquisition Parameters | Sampling rate, filter settings, reference placement | Creates fundamental differences in raw data characteristics |
| Data Collection Monitoring | Standardized operating procedures, quality control checks | Affects consistency of data quality across sites and sessions |
| Quality Control | Criteria for rejecting channels/epochs, artifact detection methods | Leads to different inclusion/exclusion of data segments |
| Data Pre-processing | Filtering algorithms, artifact removal techniques, parameter choices | Directly impacts the cleaned dataset available for analysis |
| Feature Extraction | Algorithm selection, mathematical approaches, time/frequency parameters | Affects the final features used for statistical testing |
| Statistical Frameworks | Analytical approaches, correction methods, software tools | Influences interpretation of results and significance testing |
| Data Archiving | Format, metadata completeness, documentation | Impacts ability to reanalyze data with alternative methods |
| Knowledge Translation | Reporting completeness, methodological transparency | Determines whether other researchers can understand and replicate methods |
Implementing rigorous standardization protocols across research sites is essential for producing comparable, reproducible hd-EEG data. The CAN-BIND initiative established comprehensive guidelines that address critical phases of the research lifecycle [70]:
Temporal Standardization: Consistent timing of data collection across sites and within subjects controls for fluctuations in circadian rhythms that impact functional data. Documentation of exact collection times enables post-hoc assessment of "time of day" effects on outcomes.
Participant Instruction Protocols: Development of standard operating procedures (SOPs) to instruct participants about sleep hygiene, caffeine intake, smoking, and alcohol consumption before EEG sessions. These protocols aim to decrease state-dependent noise while promoting participants' honest reporting of deviations.
Comprehensive Data Annotation: Establishment of clear, consistent naming conventions strictly followed across sites, particularly important for studies with multiple tasks, conditions, groups, and longitudinal time points. Annotation should include technical details, demographic information, and participant state variables.
The impact of equipment variation must be specifically addressed in multi-site studies. Research indicates that different software packages (EEGLAB, Brainstorm, FieldTrip) applied to the same dataset with aligned preprocessing methods can produce considerable variability in the magnitude of absolute voltage observed at particular channels and time instants [69]. This underscores the necessity of reporting software versions and parameters when documenting artifact removal methodologies.
For research involving machine learning approaches to artifact removal or EEG analysis, the Cross-Industry Standard Process for Data Mining (CRISP-DM) provides a robust framework for structuring reproducible workflows [68]. This methodology organizes the research process into interconnected phases that ensure systematic documentation and transparency:
Business Understanding: Clearly define the research question, experimental hypotheses, and specific artifact types targeted for removal. Document domain knowledge about expected neural signals and potential contaminants.
Data Understanding: Comprehensive description of dataset characteristics, including participant demographics, acquisition parameters, and initial assessment of artifact prevalence and types. This phase should include exploratory analysis to identify common artifacts in the specific research context.
Data Preparation: Detailed documentation of all preprocessing and artifact removal steps. This is the most critical phase for reproducibility in hd-EEG artifact removal, requiring explicit parameter reporting and algorithmic descriptions.
Modeling: For computational artifact removal approaches, complete specification of model architectures, training parameters, and implementation details. This includes random seed reporting for stochastic algorithms.
Evaluation: Transparent reporting of evaluation metrics, statistical tests, and comparison methodologies. Documentation should include both quantitative metrics and qualitative assessments of artifact removal effectiveness.
Deployment: Sharing of code, data (where possible), and detailed methodologies to enable independent verification and application to new datasets.
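As a concrete instance of the seed-reporting practice named in the Modeling phase, the sketch below fixes the stochastic state that most Python EEG/ML pipelines depend on. It covers only the standard library and numpy; framework-specific calls (e.g., torch.manual_seed, tf.random.set_seed) should be added when those libraries are used.

```python
import os
import random

import numpy as np

def set_global_seed(seed: int = 42) -> None:
    """Fix the stochastic state for reproducible analyses.

    Note: PYTHONHASHSEED only affects hash randomization if set before
    the interpreter starts; it is recorded here for completeness.
    """
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)

set_global_seed(42)
draw_a = np.random.rand(3)
set_global_seed(42)
draw_b = np.random.rand(3)
# Identical seeds yield identical draws, enabling exact replication.
assert np.allclose(draw_a, draw_b)
```

Reporting the seed value alongside software versions and model parameters allows an independent group to reproduce stochastic steps bit-for-bit.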
Adopting the FAIR data principles (Findable, Accessible, Interoperable, and Reusable) is essential for enhancing reproducibility in developmental EEG research and beyond [71]. The Brain Imaging Data Structure (BIDS) provides a standardized framework for organizing EEG data according to these principles, significantly enhancing the reusability and shelf life of research data beyond the original study.
Implementation of BIDS includes standardized naming conventions for files and directories, consistent metadata reporting, and clear documentation of preprocessing steps. This standardization is particularly valuable for artifact removal methodologies, as it enables direct comparison of different approaches across datasets and laboratories. When combined with detailed workflow documentation, BIDS-compliant data sharing creates a foundation for truly reproducible hd-EEG research.
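To illustrate the BIDS naming convention concretely, the helper below builds a BIDS-style EEG file path. The function itself is hypothetical and purely illustrative; in practice, validated tooling such as MNE-BIDS should be used to write compliant datasets.

```python
from pathlib import Path

def bids_eeg_path(root, sub, ses, task, suffix="eeg", ext=".edf"):
    """Build a BIDS-style EEG file path (illustrative helper only;
    prefer validated tools such as MNE-BIDS for real datasets)."""
    name = f"sub-{sub}_ses-{ses}_task-{task}_{suffix}{ext}"
    return Path(root) / f"sub-{sub}" / f"ses-{ses}" / "eeg" / name

p = bids_eeg_path("/data/study", "01", "01", "rest")
# -> /data/study/sub-01/ses-01/eeg/sub-01_ses-01_task-rest_eeg.edf
```

The key point is that subject, session, and task labels are encoded in both the directory hierarchy and the filename, so any BIDS-aware tool can locate and interpret the recording without bespoke documentation.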
The "High-Density-SleepCleaner" protocol represents a specialized approach to artifact removal tailored to the challenges of high-density sleep EEG [5] [40]. This method addresses the substantial data volume resulting from 256-channel overnight recordings through a semi-automatic routine combining computational detection with expert validation.
Table 2: Performance Metrics of Artifact Removal Methods
| Method | Application Context | Key Metrics | Performance Results |
|---|---|---|---|
| High-Density-SleepCleaner [5] [40] | Sleep hd-EEG (256 channels) | Proportion of bad epochs restored | 95-100% of bad epochs restored using epoch-wise interpolation |
| AnEEG (Deep Learning) [19] | General EEG artifact removal | NMSE, RMSE, CC, SNR, SAR | Lower NMSE/RMSE, higher CC values vs. wavelet decomposition |
| Channel-based + ICA Template Regression [4] | EEG during locomotion | Spectral power reduction (1.5-8.5 Hz) | Significant reduction in movement artifact during walking and running |
| GAN-Guided Approaches [19] | Ocular artifact removal | Signal-to-Noise Ratio improvement | 9.81% improvement in SNR reported in GCTNet implementation |
The protocol employs a graphical user interface (GUI) that enables researchers to assess epochs regarding four sleep quality markers (SQMs). Based on their topography and underlying EEG signal, users can remove artifactual values while preserving neural data of interest. This method requires users to have basic knowledge of typical (patho-)physiological EEG patterns as well as artifactual EEG [40]. The final output consists of a binary matrix (channels × epochs) identifying artifactual components, with affected channels restored in afflicted epochs using epoch-wise interpolation.
For hd-EEG recordings during motor activities, specialized protocols are required to address movement artifacts that can be an order of magnitude larger than underlying brain signals [4]. A two-step approach has been developed specifically for these challenging recording environments:
Channel-Based Template Regression: This initial step removes stride phase-locked mechanical artifact using a moving time-window averaging of stride phase-locked data to compute artifact templates for each stride and channel. The method addresses step-to-step fluctuations in phase and amplitude through regression of artifact template signals from each EEG channel.
Component-Based Template Regression: Following initial cleaning, adaptive independent component analysis (ICA) decomposes EEG signals into maximally independent component processes. The template regression procedure is then applied to these IC processes, with reversed time-warping to produce artifact-reduced ICs before applying the ICA mixing matrix to recover artifact-reduced EEG signals.
This combined approach has been shown to significantly reduce EEG spectral power in the 1.5-8.5 Hz frequency range during walking and running, while preserving event-related potentials that remain nearly identical to those recorded during standing conditions [4].
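The first, channel-based step can be sketched as follows. This simplified version assumes strides are already time-aligned (the published method additionally time-warps strides to handle phase fluctuations); `template_regress` and its parameters are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def template_regress(sig, stride_onsets, stride_len, n_avg=5):
    """Channel-wise stride-locked artifact template regression (sketch).

    For each stride, the artifact template is the average of the n_avg
    neighbouring strides (a moving time window); the template is then
    scaled by least squares and subtracted, absorbing stride-to-stride
    amplitude fluctuations.
    """
    sig = sig.copy()
    strides = np.array([sig[s:s + stride_len] for s in stride_onsets])
    half = n_avg // 2
    for i, s in enumerate(stride_onsets):
        lo, hi = max(0, i - half), min(len(stride_onsets), i + half + 1)
        template = strides[lo:hi].mean(0)
        seg = sig[s:s + stride_len]
        # Least-squares scaling of the template to this stride.
        beta = np.dot(template, seg) / np.dot(template, template)
        sig[s:s + stride_len] = seg - beta * template
    return sig
```

In the full method, the same regression is applied a second time to ICA component activations, which targets residual artifact that is not phase-locked identically across channels.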
Advanced deep learning architectures represent the frontier of automated artifact removal methodologies. The AnEEG model exemplifies this approach, leveraging Long Short-Term Memory (LSTM) networks within a Generative Adversarial Network (GAN) framework to effectively capture temporal dependencies in EEG data while removing artifacts [19].
The experimental protocol for deep learning approaches typically includes training on benchmark semi-synthetic datasets such as EEGdenoiseNet, followed by quantitative evaluation against the known ground-truth signals using metrics such as NMSE, RMSE, CC, SNR, and SAR [19].
These automated approaches show particular promise for standardizing artifact removal across research sites, potentially reducing the inter-rater variability introduced by manual or semi-automatic methods.
Table 3: Essential Tools for Reproducible EEG Artifact Removal Research
| Tool/Category | Specific Examples | Function in Reproducible Research |
|---|---|---|
| Open-Source Software Toolboxes | EEGLAB, Brainstorm, FieldTrip, MNE [69] [68] | Provide standardized implementations of preprocessing and artifact removal algorithms |
| Standardized Data Structures | BIDS (Brain Imaging Data Structure) [71] | Ensure consistent data organization and metadata documentation across studies |
| Artifact Removal Algorithms | High-Density-SleepCleaner [5], ICA-based approaches [4], AnEEG [19] | Offer specialized methods for different artifact types and recording contexts |
| Reproducibility Checklists | CRISP-DM framework [68], Machine Learning Reproducibility Checklist [68] | Guide comprehensive documentation of methodologies and parameters |
| Data Sharing Platforms | Brain-CODE [70], Donders Repository [71], OSF | Enable data accessibility and verification of published results |
| Computational Resources | MATLAB, Python, Containerization (Docker/Singularity) | Ensure consistent computational environments for analysis replication |
Implementing a consistent workflow for artifact removal in hd-EEG research, integrating both standardized preprocessing and specialized artifact removal techniques, is fundamental to ensuring reproducibility.
[Workflow diagram: standardized hd-EEG preprocessing and artifact removal pipeline]
Rigorous quantitative assessment is essential for evaluating the performance of different artifact removal methodologies. The metrics presented in Table 2 provide a foundation for comparative analysis, but researchers must consider the context-specific appropriateness of each method.
For semi-automatic approaches like High-Density-SleepCleaner, the 95-100% restoration rate of bad epochs through epoch-wise interpolation demonstrates exceptional data preservation capabilities [5]. This is particularly valuable in sleep research where overnight recordings represent significant investment and participant burden.
Deep learning approaches show strong performance on quantitative metrics, with the AnEEG model achieving lower NMSE (Normalized Mean Square Error) and RMSE (Root Mean Square Error) values alongside higher CC (Correlation Coefficient) values compared to traditional wavelet decomposition techniques [19]. These metrics indicate better agreement with original signals and stronger linear agreement with ground truth data.
The implementation of standardization protocols has measurable effects on research reproducibility. Studies examining the impact of different software tools on EEG analysis outcomes have found that while there is generally a good degree of convergence in ERP waveform profiles, peak latencies, and effect size estimates, considerable variability exists in the magnitude of absolute voltage observed with each software package [69]. This variability manifests as statistical differences at particular channels and time instants, highlighting the necessity of reporting software versions and processing parameters.
The adoption of standardized data structures like BIDS enhances reproducibility by ensuring consistent organization of data and metadata [71]. When combined with comprehensive documentation of preprocessing workflows, this standardization enables independent verification of research findings and facilitates meta-analytic approaches across multiple studies.
Ensuring reproducibility in hd-EEG research, particularly in the methodologically challenging domain of artifact removal, requires systematic implementation of standardized documentation and workflow practices. The frameworks, protocols, and tools presented in this whitepaper provide researchers with a comprehensive roadmap for enhancing the transparency, reliability, and verifiability of their research outputs.
By adopting the CRISP-DM framework, implementing FAIR data principles through BIDS standardization, selecting appropriate artifact removal methodologies for specific research contexts, and comprehensively documenting all methodological decisions, researchers can significantly advance the reproducibility of hd-EEG research. These practices are particularly crucial as the field moves toward increasingly complex analytical approaches, including deep learning and multi-site collaborations.
The continued development and adoption of standardized practices will not only address the current reproducibility crisis but also accelerate scientific discovery in hd-EEG research by creating a solid foundation of verifiable, buildable knowledge. As research in artifact removal methodologies advances, maintaining commitment to these reproducible research practices will ensure that new developments rest upon a trustworthy foundation of prior work.
Electroencephalography (EEG), particularly high-density EEG (HD-EEG) systems utilizing 64, 128, or 256 electrodes, provides unparalleled temporal resolution for monitoring brain activity [72]. However, the fidelity of these neural signatures is persistently compromised by physiological and non-physiological artifacts, presenting a fundamental challenge in both clinical and research settings. Artifacts originating from ocular movements (EOG), muscle activity (EMG), cardiac rhythms (ECG), and head motion can masquerade as or obscure genuine brain signals, complicating data interpretation and analysis [7]. The pursuit of robust artifact removal methodologies is therefore not merely a technical exercise but a prerequisite for scientific validity, especially in high-stakes domains like drug development and neurological disorder diagnosis.
The core obstacle in developing and validating these artifact removal techniques has been the establishment of ground truth—a known, uncontaminated neural signal against which the performance of any cleaning algorithm can be objectively measured [73]. In response to this challenge, the neuroscience community has increasingly relied on two parallel approaches: semi-synthetic datasets, where artifact-free EEG is deliberately contaminated with known artifacts, and meticulously curated real-world datasets, which capture the full complexity of in-situ neural recordings. This technical guide examines the roles, construction, applications, and limitations of these two dataset paradigms, providing a framework for their use in advancing HD-EEG research.
Semi-synthetic datasets solve the ground truth problem by artificially creating contaminated EEG signals where the underlying, clean brain signal is known. This enables direct, quantitative comparison of how different artifact removal algorithms perform.
The creation of a semi-synthetic dataset follows a rigorous experimental design to ensure physiological realism and typically involves several key stages. In a foundational study, artifact-free EEG signals were obtained from 27 healthy subjects during eyes-closed sessions using 19 electrodes placed according to the 10-20 International System [73]. Simultaneously, EOG signals were recorded from the same subjects during an eyes-opened condition to capture genuine ocular artifacts [73]. The critical step is the contamination phase, where the clean EEG is artificially contaminated using a biophysical model. A common approach uses a linear addition model:
ContaminatedEEGᵢ,ⱼ = PureEEGᵢ,ⱼ + aⱼ·VEOG + bⱼ·HEOG
where ContaminatedEEGᵢ,ⱼ and PureEEGᵢ,ⱼ are the contaminated and artifact-free signals from subject i at electrode j, VEOG and HEOG are the vertical and horizontal EOG recordings, and aⱼ and bⱼ are contamination coefficients calculated for each electrode via linear regression against an eyes-opened baseline session [73]. This method produces a dataset containing both the pre-contamination EEG and the artificially contaminated signals, providing a complete benchmark for objective algorithm assessment.
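Under this linear addition model, the contamination procedure can be sketched as follows. This is a minimal numpy illustration: `contaminate` is a hypothetical helper, and the regression step assumes a separate eyes-open baseline recording from which the per-electrode coefficients are estimated.

```python
import numpy as np

def contaminate(pure_eeg, veog, heog, baseline_eeg):
    """Semi-synthetic EOG contamination (sketch of the linear model).

    Contamination coefficients a_j, b_j are estimated per electrode by
    ordinary least squares of a baseline eyes-open recording on the EOG
    channels, then applied to the clean EEG.
    pure_eeg, baseline_eeg: (n_channels, n_samples); veog/heog: (n_samples,)
    """
    X = np.vstack([veog, heog]).T                  # (n_samples, 2)
    # Per-channel least-squares fit: baseline ≈ a*VEOG + b*HEOG.
    coef, *_ = np.linalg.lstsq(X, baseline_eeg.T, rcond=None)
    a, b = coef                                    # each (n_channels,)
    return pure_eeg + np.outer(a, veog) + np.outer(b, heog)
```

Because the pure EEG is retained alongside its contaminated counterpart, any removal algorithm can be scored directly against the known ground truth.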
[Workflow diagram: standard protocol for creating a semi-synthetic EOG-contaminated dataset]
Semi-synthetic datasets enable rigorous testing of various artifact removal algorithms. The table below summarizes quantitative performance comparisons of established methods, serving as a reference for expected outcomes in benchmark studies.
Table 1: Performance Comparison of Artifact Removal Methods on Semi-Synthetic Data
| Method | Artifact Type | Key Metric | Reported Performance | Limitations |
|---|---|---|---|---|
| REG-ICA [73] | EOG | Component Separation | Effective hybrid method | Requires multiple channels |
| 1D-ResCNN [7] | Mixed (EMG+EOG) | Signal-to-Noise Ratio (SNR) | 11.498 dB | Network tailored to specific artifacts |
| CLEnet [7] | Mixed (EMG+EOG) | Average Correlation Coefficient (CC) | 0.925 | Complex architecture |
| ICA [20] | Motion | Dipolar Components | Improved with preprocessing | Sensitive to high-amplitude artifacts |
| Artifact Subspace Reconstruction (ASR) [20] | Motion | Power Reduction at Gait Frequency | Significant reduction | Risk of "over-cleaning" |
While semi-synthetic datasets provide controlled benchmarks, real-world datasets capture the full complexity of artifacts encountered in ecological settings, from motion during locomotion to unpredictable physiological noise.
High-quality real-world datasets are characterized by large sample sizes, multiple recording sessions, and well-documented experimental paradigms. For instance, a comprehensive Motor Imagery (MI) dataset collected from 62 healthy participants across three recording sessions includes both two-class (left vs. right hand-grasping) and three-class (adding foot-hooking) tasks, providing extensive data for studying cross-session and cross-subject variability [74]. Such datasets are invaluable for evaluating how artifact removal techniques perform under realistic and challenging conditions.
Collecting real-world data for movement-related studies requires a protocol that balances experimental control with ecological validity. The following workflow illustrates the steps for a multi-session motor imagery dataset collection:
In a specific implementation, each participant completes three recording sessions on different days. Each session includes eyes-open (60 s) and eyes-closed (60 s) baseline periods, followed by five blocks of motor imagery tasks [74]. A single trial lasts 7.5 seconds, beginning with visual and auditory cues (1.5 s), followed by the MI period (4 s), during which participants mentally perform the cued action without physical movement, and ending with a break period (2 s) [74]. This structured yet flexible protocol ensures the collection of robust, multi-session data while accounting for participant fatigue through optional breaks.
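The trial timing above translates directly into an epoching step. The sketch below slices the 4 s MI period out of each 7.5 s trial; the 250 Hz sampling rate and back-to-back trial onsets are illustrative assumptions, not specifications from the dataset description [74].

```python
import numpy as np

# Trial structure from the protocol: 1.5 s cue + 4 s MI + 2 s break = 7.5 s.
FS = 250                                   # assumed sampling rate (Hz)
CUE_S, MI_S, BREAK_S = 1.5, 4.0, 2.0
TRIAL_S = CUE_S + MI_S + BREAK_S

def extract_mi_epochs(eeg, trial_onsets_s, fs=FS):
    """Return an (n_trials, n_channels, n_mi_samples) array of MI periods."""
    mi_len = int(MI_S * fs)
    epochs = []
    for onset in trial_onsets_s:
        start = int((onset + CUE_S) * fs)  # MI begins after the cue
        epochs.append(eeg[:, start:start + mi_len])
    return np.stack(epochs)

# Continuous 64-channel recording with 5 hypothetical back-to-back trials.
n_trials = 5
eeg = np.random.randn(64, int(n_trials * TRIAL_S * FS))
onsets = [i * TRIAL_S for i in range(n_trials)]
epochs = extract_mi_epochs(eeg, onsets)
print(epochs.shape)   # (5, 64, 1000)
```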
Real-world datasets enable the validation of artifact removal methods in clinically relevant scenarios. The table below summarizes the specifications and performance benchmarks of a representative large-scale real-world dataset.
Table 2: Specifications of a Representative Real-World Motor Imagery EEG Dataset
| Parameter | Specification | Research Value |
|---|---|---|
| Participants | 62 healthy subjects | Enables subject-independent studies |
| Sessions | 3 per subject | Allows cross-session variability analysis |
| EEG Channels | 64 electrodes | Provides high spatial density |
| Tasks | 2-class and 3-class MI | Supports complex discrimination tasks |
| Trial Count | 200-300 per session | Ensures statistical power |
| Accuracy | 85.32% (2-class) [74] | Sets performance benchmark |
| Data Type | Raw and preprocessed | Flexible for different research needs |
The choice between semi-synthetic and real-world datasets depends on the research phase and specific objectives. Each approach offers distinct advantages and suffers from particular limitations that must be considered in study design.
Table 3: Strategic Comparison of Dataset Approaches for EEG Artifact Removal Research
| Dimension | Semi-Synthetic Datasets | Real-World Datasets |
|---|---|---|
| Ground Truth | Known and precisely defined | Unknown, must be inferred |
| Primary Use Case | Algorithm development and benchmarking | Validation and ecological testing |
| Artifact Control | Exact timing and amplitude known | Uncontrolled and variable |
| Complexity | Isolated, single artifact types | Multiple co-occurring artifacts |
| Scalability | Easily expanded computationally | Costly and time-consuming to collect |
| Limitations | May oversimplify real-world conditions | Lack of objective ground truth |
| Ideal Application | Initial algorithm validation and comparison | Clinical translation studies |
Semi-synthetic datasets provide an unmatched benchmark for objective evaluation because the uncontaminated neural signal is known. This enables direct computation of performance metrics like signal-to-noise ratio improvement and correlation coefficient with ground truth [73] [7]. However, the primary limitation is that the contamination process may not fully capture the complex, non-stationary nature of artifacts in real-world settings, potentially leading to algorithms that perform well on benchmarks but fail in practice.
Conversely, real-world datasets capture the full complexity and unpredictability of artifacts encountered in clinical and ecological environments, including motion artifacts during locomotion [20] and composite artifacts from multiple physiological sources [7]. These datasets are essential for testing algorithmic robustness but lack precise ground truth, forcing researchers to rely on indirect validation measures such as task classification accuracy or the reasonableness of extracted neural components [74].
Implementing rigorous artifact removal research requires specific computational tools, algorithms, and data resources. The following table catalogs key "research reagents" essential for working with semi-synthetic and real-world HD-EEG datasets.
Table 4: Essential Research Reagents for EEG Artifact Removal Studies
| Resource | Type | Function | Example Implementation |
|---|---|---|---|
| Semi-Synthetic Data | Benchmark Dataset | Provides ground truth for validation | Artificially contaminated EEG with pre-contamination signals [73] |
| Real-World MI Data | Experimental Dataset | Tests ecological performance | 62-subject, 64-channel motor imagery data [74] |
| ICA Algorithms | Software Tool | Separates neural and artifactual sources | ICLabel for component classification [20] |
| Deep Learning Models | Algorithm | End-to-end artifact removal | CLEnet (CNN-LSTM with attention) [7] |
| Motion Correction | Preprocessing Tool | Handles locomotion artifacts | iCanClean with pseudo-reference signals [20] |
| Reference Signals | Hardware/Software | Captures pure artifact signatures | Carbon-Wire Loops (CWL) for MR artifacts [75] |
| Performance Metrics | Analytical Framework | Quantifies algorithm performance | SNR, CC, RRMSEt, RRMSEf [7] |
The field of EEG artifact removal is rapidly evolving, driven by advances in deep learning and the growing availability of large-scale datasets. Modern approaches like CLEnet, which integrates dual-scale CNN with LSTM and an improved attention mechanism, demonstrate the shift toward end-to-end models capable of handling both known and unknown artifacts across multiple channels [7]. These architectures address limitations of traditional methods by automatically learning feature representations without requiring manual component selection or reference channels.
Concurrently, sophisticated artifact removal techniques like iCanClean and Artifact Subspace Reconstruction (ASR) are being optimized for challenging real-world scenarios such as motion artifact correction during running and other whole-body movements [20]. The integration of reference signals from dedicated hardware, like carbon-wire loops, provides an additional dimension for capturing artifact signatures, leading to improved signal recovery in both temporal and spectral domains [75]. As these methodologies mature, the synergistic use of semi-synthetic datasets for development and real-world datasets for validation will become increasingly crucial for translating laboratory breakthroughs into clinical applications, particularly in drug development and personalized medicine.
The establishment of reliable ground truth through semi-synthetic and real-world datasets represents a cornerstone of rigorous HD-EEG research. Semi-synthetic datasets provide the controlled benchmarks necessary for objective algorithm development and comparison, while real-world datasets capture the ecological complexity essential for clinical validation. The strategic integration of both approaches—using semi-synthetic data for initial benchmarking and real-world data for performance verification—enables a comprehensive evaluation pathway for artifact removal techniques. As the field advances toward more sophisticated deep learning approaches and larger-scale data collection, this dual-dataset framework will continue to be indispensable for developing robust, clinically applicable tools that enhance the signal fidelity of high-density EEG, ultimately advancing neuroscience research and therapeutic development.
In high-density electroencephalography (EEG) research, the process of artifact removal presents a fundamental paradox: the very techniques used to eliminate non-neural contaminants can inadvertently distort or remove genuine brain signals, potentially leading to misinterpretations of neural activity [67]. The challenge is particularly acute in clinical and pharmacological applications, where the integrity of neural data directly impacts diagnostic conclusions and treatment development [76] [77]. Without rigorous validation, artifact removal can create a false impression of clean data while introducing new forms of distortion, sometimes artificially inflating effect sizes in event-related potentials and functional connectivity analyses [67].
To address these challenges, the field has converged on a set of core performance metrics that collectively provide a multidimensional assessment of artifact removal efficacy. These metrics—Signal-to-Noise Ratio (SNR), Correlation Coefficient (CC), Root Mean Square Error (RMSE), and Component Dipolarity—form an essential validation framework that enables researchers to quantify both the removal of artifacts and the preservation of neural information [78] [79] [7]. This technical guide examines each metric's theoretical foundation, computational methodology, and interpretive significance within the context of high-density EEG research, providing experimental protocols and analytical frameworks for their application in cutting-edge neuroscience research.
Theoretical Foundation: SNR quantifies the relative power between the desired neural signal and residual noise or artifacts following processing. It is particularly valuable for assessing how effectively an algorithm suppresses high-amplitude artifacts (e.g., ocular blinks, muscle activity) while preserving underlying brain rhythms [78] [7]. In pharmacological EEG applications, SNR improvements are crucial for detecting drug-induced changes in brain rhythms, such as the increased gamma power and decreased alpha power associated with ketamine-like antidepressants [76].
Methodology: SNR is typically calculated in the frequency domain after applying artifact removal algorithms to contaminated EEG data. The calculation compares the power of the desired signal to the power of the residual noise (the difference between the processed and reference signals), conventionally expressed in decibels as SNR = 10·log₁₀(P_signal / P_noise).
Higher SNR values indicate superior artifact suppression. For instance, the CLEnet algorithm demonstrated SNR improvements of 2.45% over competing methods when removing unknown artifacts from multi-channel EEG data [7].
Theoretical Foundation: The Correlation Coefficient measures the linear relationship between processed and ground-truth signals, evaluating how well the temporal dynamics of original neural activity are preserved through the artifact removal process [78] [7]. This metric is especially sensitive to waveform distortion that can occur with aggressive filtering or inappropriate component rejection.
Methodology: CC is computed in the time domain as the Pearson correlation between the processed signal x̂ and a reference clean signal x: CC = cov(x, x̂) / (σₓ σ_x̂).
The AnEEG model achieved higher CC values, indicating stronger linear agreement with ground truth signals [78], while CLEnet reached a CC of 0.925 in removing mixed artifacts, demonstrating excellent temporal structure preservation [7].
Theoretical Foundation: RMSE provides a comprehensive measure of overall difference between processed and ideal signals, capturing the cumulative effect of both artifact residue and signal distortion [78] [7]. It is particularly sensitive to large, localized errors that might be introduced by incomplete artifact removal or neural signal loss.
Methodology: RMSE is calculated as the square root of the average squared difference between processed and reference signals: RMSE = √((1/N) Σₙ (x̂ₙ − xₙ)²).
Lower RMSE values indicate better overall agreement. The AnEEG model achieved lower RMSE values compared to wavelet decomposition techniques, reflecting superior reconstruction fidelity [78]. Relative RMSE (RRMSE) variants in temporal (RRMSEt) and frequency (RRMSEf) domains provide additional domain-specific insights, with CLEnet reducing these metrics by 6.94% and 3.30% respectively compared to other models [7].
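Assuming ground-truth clean signals are available, as in semi-synthetic benchmarks, the three metrics above can be computed directly. The definitions below follow the standard conventions; the 10 Hz test signal is a synthetic stand-in, not data from the cited studies.

```python
import numpy as np

def snr_db(clean, denoised):
    """Signal-to-noise ratio in dB of the residual left after cleaning."""
    noise = denoised - clean
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

def corr_coef(clean, denoised):
    """Pearson correlation coefficient (CC)."""
    return np.corrcoef(clean, denoised)[0, 1]

def rmse(clean, denoised):
    """Root mean square error."""
    return np.sqrt(np.mean((denoised - clean) ** 2))

def rrmse_t(clean, denoised):
    """Relative RMSE in the temporal domain (RMSE over the signal's RMS)."""
    return rmse(clean, denoised) / np.sqrt(np.mean(clean ** 2))

rng = np.random.default_rng(1)
t = np.linspace(0, 2, 1000)
clean = np.sin(2 * np.pi * 10 * t)                     # alpha-band stand-in
denoised = clean + 0.1 * rng.standard_normal(t.size)   # imperfect cleaning

metrics = (snr_db(clean, denoised), corr_coef(clean, denoised),
           rmse(clean, denoised), rrmse_t(clean, denoised))
```

A perfect reconstruction drives RMSE to zero and CC to 1, while SNR grows without bound, which is why the three are reported together rather than singly.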
Theoretical Foundation: Component Dipolarity assesses the physiological plausibility of independent components derived from source separation techniques like Independent Component Analysis (ICA) [79] [80]. This metric is grounded in the biophysical principle that coherent neural activity originating from a compact cortical source produces a scalp potential topography that can be explained by an equivalent current dipole.
Methodology: Dipolarity is quantified through dipole fitting: an equivalent current dipole is fitted to each independent component's scalp topography, and dipolarity is reported as the percentage of topographic variance explained by the dipole (100% minus the residual variance of the fit).
Components with dipolarity >90% are considered physiologically plausible neural sources [80]. In ESI validation, the 4LCNN method significantly improved dipole localization accuracy for subcortical sources, reducing errors to 5.9 mm at SNR=30 dB [79].
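A hedged numerical sketch of the dipolarity computation: fit a dipole moment at each candidate position by least squares and keep the best fit. The random leadfield here is a stand-in; real pipelines use a head-model leadfield (e.g., EEGLAB's DIPFIT) and nonlinear position optimization.

```python
import numpy as np

rng = np.random.default_rng(2)
n_channels, n_positions = 64, 200

# leadfield[:, p, :] maps a dipole moment (3-vector) at candidate position p
# to a scalp topography. Random stand-in for a head-model leadfield.
leadfield = rng.standard_normal((n_channels, n_positions, 3))

def dipolarity(topo, leadfield):
    """Percentage of topographic variance explained by the best-fitting dipole."""
    best_rv = 1.0
    for p in range(leadfield.shape[1]):
        L = leadfield[:, p, :]                       # (n_channels, 3)
        moment, *_ = np.linalg.lstsq(L, topo, rcond=None)
        resid = topo - L @ moment
        rv = np.sum(resid ** 2) / np.sum(topo ** 2)  # residual variance
        best_rv = min(best_rv, rv)
    return 100.0 * (1.0 - best_rv)

# A topography generated by a true dipole scores near 100%; a spatially
# incoherent, artifact-like pattern scores far lower.
true_topo = leadfield[:, 42, :] @ np.array([1.0, -0.5, 2.0])
noisy_topo = rng.standard_normal(n_channels)
scores = (dipolarity(true_topo, leadfield), dipolarity(noisy_topo, leadfield))
```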
Table 1: Summary of Key Performance Metrics for EEG Artifact Removal
| Metric | Theoretical Basis | Computational Approach | Interpretation | Ideal Value |
|---|---|---|---|---|
| SNR | Power ratio of signal to noise | Ratio of signal variance to noise variance | Higher values indicate better artifact rejection | Maximize |
| Correlation Coefficient (CC) | Linear dependence between signals | Covariance normalized by product of standard deviations | Higher values indicate better signal preservation | Close to 1 |
| RMSE | Cumulative difference between signals | Root of average squared differences | Lower values indicate better reconstruction fidelity | Minimize |
| Component Dipolarity | Physiological plausibility of sources | Variance explained by equivalent current dipole | Higher values indicate more plausible neural sources | >90% |
Protocol Objective: To establish controlled validation of artifact removal performance using ground-truth data.
Methodology: Semi-synthetic datasets are created by systematically adding artifact recordings to clean EEG baseline data [7]:
Contaminated_EEG = Clean_EEG + β × Artifact

This approach enables precise quantification of performance, as demonstrated in studies validating the GCTNet and CLEnet models, which showed an 11.15% reduction in RRMSE and a 9.81 improvement in SNR compared to other methods [78] [7].
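One common way to choose β is to target a specific contamination SNR. The convention below, scaling the artifact so the RMS-amplitude ratio yields the requested SNR, follows EEGDenoiseNet-style practice and is an illustrative assumption rather than the exact procedure of the cited studies.

```python
import numpy as np

def contaminate(clean_eeg, artifact, snr_db):
    """Return Clean_EEG + beta * Artifact with beta set to hit snr_db."""
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    # SNR(dB) = 20 * log10(rms(clean) / rms(beta * artifact))
    beta = (rms(clean_eeg) / rms(artifact)) / (10 ** (snr_db / 20))
    return clean_eeg + beta * artifact, beta

rng = np.random.default_rng(3)
clean = np.sin(2 * np.pi * 10 * np.linspace(0, 2, 1000))   # clean stand-in
artifact = 5.0 * rng.standard_normal(1000)                 # EMG-like burst

contaminated, beta = contaminate(clean, artifact, snr_db=-3.0)

# Verify the achieved SNR matches the request.
achieved = 20 * np.log10(np.sqrt(np.mean(clean ** 2)) /
                         np.sqrt(np.mean((beta * artifact) ** 2)))
print(round(achieved, 2))   # -3.0
```

Sweeping β across a range of target SNRs is how benchmark suites build test sets of graded contamination severity.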
Protocol Objective: To validate the impact of artifact removal on electrophysiological source imaging (ESI).
Methodology: For deep learning-based ESI approaches like 4LCNN, validation compares estimated source locations against simulated ground-truth sources across a range of SNR conditions, quantifying localization error for both cortical and subcortical generators.
This protocol revealed that 4LCNN achieved significantly lower localization errors (5.9 mm at SNR=30 dB) for subcortical sources compared to traditional methods like eLORETA and LCMV beamformer [79].
Protocol Objective: To assess artifact removal performance on experimental data without ground truth.
Methodology: When clean reference signals are unavailable, validation relies on indirect indicators such as the physiological plausibility of extracted components and the stability of known experimental effects.
This approach forms the basis of the RELAX pipeline, which reduces artificial inflation of effect sizes while minimizing source localization biases [67].
Diagram 1: Performance metric validation workflow for EEG artifact removal algorithms.
Comprehensive validation requires understanding the relationships and potential trade-offs between different performance metrics. In practice, no single artifact removal method excels across all metrics, requiring researchers to select methods based on their specific analytical priorities.
SNR-RMSE Trade-off: Algorithms that aggressively remove artifacts often improve SNR but may increase RMSE if they distort genuine neural signals. For example, traditional ICA component subtraction can improve SNR but artificially inflate effect sizes, introducing a different form of error [67].
CC-Dipolarity Relationship: Methods that preserve temporal dynamics (high CC) typically also maintain physiologically plausible sources (high dipolarity). The 4LCNN approach demonstrated this relationship by achieving both accurate temporal reconstruction and precise source localization [79].
Domain-Specific Performance: Some algorithms perform differently across temporal and frequency domains. CLEnet showed balanced improvement across both domains, with RRMSEt decreasing by 6.94% and RRMSEf by 3.30% [7].
Table 2: Performance Comparison of Advanced Artifact Removal Methods
| Method | Architecture | SNR Improvement | CC Performance | RMSE Reduction | Application Context |
|---|---|---|---|---|---|
| AnEEG [78] | LSTM-based GAN | Significant | Higher CC values | Lower NMSE/RMSE | General artifact removal |
| CLEnet [7] | Dual-scale CNN + LSTM with EMA-1D | 2.45% increase vs. benchmarks | 0.925 with mixed artifacts | 6.94% RRMSEt reduction | Multi-channel, unknown artifacts |
| ART [16] | Transformer | Improved | Enhanced | Significant reduction | Multichannel EEG, multiple artifacts |
| 4LCNN [79] | Four-layer CNN | Optimized for 5-30 dB conditions | Preserved temporal dynamics | Minimal spatial dispersion | Cortical and subcortical source localization |
| RELAX [67] | Targeted ICA reduction | N/A | N/A | Reduced effect size inflation | Preserving neural signals in ERP/connectivity |
Different research applications prioritize different metric combinations based on analytical goals:

Event-Related Potential Studies: prioritize CC and RMSE to preserve waveform morphology and guard against artificial inflation of effect sizes [67].

Functional Connectivity Research: emphasize CC together with Component Dipolarity, since residual mixing of artifactual sources can distort connectivity estimates [67].

Source Localization Applications: weight Component Dipolarity most heavily, as physiologically plausible components are a prerequisite for accurate source imaging [79].

Pharmaco-EEG and Biomarker Development: prioritize SNR in order to resolve drug-induced changes in band power, such as shifts in gamma and alpha rhythms [76].
Diagram 2: Metric selection framework for different EEG research applications.
Table 3: Key Computational Tools and Datasets for EEG Artifact Removal Validation
| Resource | Type | Primary Function | Application Context |
|---|---|---|---|
| EEGdenoiseNet [7] | Benchmark Dataset | Provides semi-synthetic EEG with ground truth | Algorithm development and validation |
| RELAX Pipeline [67] | Software Toolbox | Implements targeted artifact reduction | ERP and connectivity studies |
| 4LCNN Model [79] | Deep Learning Architecture | Cortical and subcortical source localization | Source imaging validation |
| SMICA Algorithm [80] | Source Separation | ICA with noise modeling for M/EEG | Artifact rejection and source identification |
| BNA Platform [76] | Commercial Analytics | Objective treatment efficacy measurement | Pharmaco-EEG and drug development |
The rigorous validation of artifact removal techniques in high-density EEG research requires a multifaceted approach that addresses both artifact elimination and neural information preservation. No single metric provides a comprehensive assessment, necessitating the strategic combination of SNR, Correlation Coefficient, RMSE, and Component Dipolarity based on specific research objectives. The emerging generation of deep learning approaches—including LSTM-GAN hybrids, transformer architectures, and specialized CNNs—demonstrates promising advancements across these metrics, but also highlights the persistent trade-offs between different aspects of performance. As EEG applications expand into more complex domains including pharmacological biomarker development and real-world neuroimaging, this comprehensive metric framework will play an increasingly critical role in ensuring the validity and interpretability of neuroscientific findings.
This technical guide provides a comprehensive analysis of four predominant artifact removal methodologies in high-density electroencephalography (EEG) research: Independent Component Analysis (ICA), Artifact Subspace Reconstruction (ASR), iCanClean, and emerging deep learning models. With the expansion of EEG into mobile brain-body imaging and real-world applications, effective artifact removal has become increasingly critical for data integrity. The following comparison synthesizes current evidence to guide researchers and drug development professionals in selecting optimal preprocessing pipelines for specific experimental conditions.
Table 1: High-Level Method Comparison for EEG Artifact Removal
| Method | Core Principle | Primary Artifacts Addressed | Hardware/Data Requirements | Computational Load | Implementation Context |
|---|---|---|---|---|---|
| ICA | Blind source separation to maximize statistical independence | Ocular, cardiac, line noise [20] | High-density EEG (100+ channels); 30+ minutes of data [81] [46] | Very high (hours to days) [81] [46] | Offline analysis |
| ASR | Principal component analysis to identify and remove high-variance signal bursts | Motion, muscular, ocular [20] [14] | Requires clean calibration data segment [20] [81] | Low to moderate (real-time capable) [81] | Real-time or offline |
| iCanClean | Canonical correlation analysis with reference noise signals | Motion, muscle, ocular, line noise [20] [81] [46] | Dual-layer noise sensors or pseudo-reference signals [20] | Moderate (real-time capable) [81] | Real-time or offline |
| Deep Learning | Neural networks trained to map contaminated EEG to clean signals | All types (performance varies by model) [19] [15] | Large labeled datasets for training [19] [10] | High for training, variable for inference | Primarily offline (some real-time) |
ICA is a blind source separation technique that linearly decomposes multi-channel EEG data into maximally statistically independent components [20]. The underlying assumption is that artifacts and neural signals originate from distinct sources that mix linearly at the electrodes.
Experimental Protocol: For effective decomposition, studies recommend recording at least 30 minutes of high-density EEG (100+ channels) at a sampling frequency ≥500 Hz [81] [46]. Components are typically classified using automated algorithms like ICLabel, though these have not been trained specifically on mobile EEG data [20]. The quality of ICA decomposition is often evaluated using component dipolarity, where brain sources should exhibit a single scalp topography consistent with a single neural generator [20].
Limitations in Mobile Settings: During whole-body movements like running, head motion produces artifacts that contaminate the EEG and significantly reduce ICA decomposition quality [20]. The continued presence of large motion artifacts impairs ICA's ability to identify maximally independent sources, making it suboptimal for dynamic movement paradigms.
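The component-removal step that all ICA pipelines share can be illustrated with a toy linear mixing model. No actual ICA decomposition is performed here; the mixing matrix, sources, and flagged component index are synthetic stand-ins for what an algorithm like ICLabel would supply.

```python
import numpy as np

rng = np.random.default_rng(4)
n_channels, n_samples = 8, 2000

# ICA's generative assumption: observed EEG X is a linear mix A @ S of
# statistically independent sources S.
S = rng.standard_normal((n_channels, n_samples))
S[0] *= 20.0                                       # source 0: blink-like artifact
A = rng.standard_normal((n_channels, n_channels))  # mixing matrix
X = A @ S                                          # observed EEG

# Components flagged as artifactual (in practice, by ICLabel or an expert).
artifact_idx = [0]
keep = [i for i in range(n_channels) if i not in artifact_idx]

# Back-project only the retained components onto the scalp channels.
X_clean = A[:, keep] @ S[keep]

# What was removed is exactly the artifact component's scalp projection.
assert np.allclose(X - X_clean, A[:, [0]] @ S[[0]])
```

This linearity is also why poor decompositions hurt so much: if artifact and neural activity share a component, removing it subtracts neural signal channel-wide.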
ASR employs a sliding-window principal component analysis (PCA) to identify and remove high-variance signal components indicative of artifacts [20] [14]. The method compares incoming data to a calibration period of clean baseline data.
Algorithm Details: First, the root mean squares (RMS) of sliding 1-second EEG segments are calculated and converted to z-scores using a condensed Gaussian distribution [20]. Data segments with z-scores between -3.5 and 5.0 for at least 92.5% of electrodes comprise the reference data. A sliding-window PCA then derives principal components from both reference and non-reference data. Components in non-reference data are identified as artifactual if their standard deviation of RMS exceeds a user-defined threshold ("k"), and the signal is reconstructed using the calibration data [20].
Parameter Optimization: The "k" parameter (typically 10-30) controls sensitivity, with lower values producing more aggressive cleaning [20]. Studies recommend k=10 for human locomotion data to avoid "overcleaning" and inadvertent manipulation of neural signals [20]. Recent research has identified limitations in ASR's reference period algorithm, which may explain why higher k values sometimes fail to address high-amplitude motion artifacts [20].
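The calibration-selection step described above can be sketched as follows. This simplified version uses plain z-scores rather than the condensed-Gaussian fit of the full ASR algorithm, but the window length, z-score bounds, and 92.5% electrode criterion follow the text.

```python
import numpy as np

def select_reference_windows(eeg, fs, z_lo=-3.5, z_hi=5.0, frac=0.925):
    """Indices of 1 s windows eligible as ASR calibration (reference) data."""
    n_channels, n_samples = eeg.shape
    win = fs                                          # 1-second windows
    n_win = n_samples // win
    segs = eeg[:, :n_win * win].reshape(n_channels, n_win, win)
    rms = np.sqrt(np.mean(segs ** 2, axis=2))         # (n_channels, n_win)
    # Plain per-channel z-scores (stand-in for the condensed Gaussian fit).
    z = (rms - rms.mean(axis=1, keepdims=True)) / rms.std(axis=1, keepdims=True)
    ok = (z > z_lo) & (z < z_hi)
    # Keep windows where at least `frac` of electrodes are within bounds.
    return np.where(ok.mean(axis=0) >= frac)[0]

fs = 250
rng = np.random.default_rng(5)
eeg = rng.standard_normal((32, 60 * fs))              # 60 s of synthetic data
eeg[:, 10 * fs:11 * fs] *= 50.0                       # one huge artifact burst
clean_windows = select_reference_windows(eeg, fs)
print(10 not in clean_windows)   # True: the burst window is excluded
```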
iCanClean leverages canonical correlation analysis (CCA) to detect and correct noise-based subspaces using reference noise signals [20] [81]. The algorithm identifies subspaces of scalp EEG that are correlated with noise subspaces based on a user-selected correlation criterion (R²).
Implementation Variants: The optimal implementation uses dual-layer sensors with mechanically coupled noise electrodes that only capture motion artifacts [20] [81]. When dedicated noise sensors are unavailable, iCanClean can create "pseudo-reference" noise signals by temporarily applying a notch filter to identify noise within the EEG (e.g., below 3 Hz) [20].
Performance Optimization: In human locomotion studies during walking, parameters of R²=0.65 and a sliding window of 4 seconds produced the most dipolar brain components from subsequent ICA [20]. After identifying noise components correlated with reference signals exceeding the R² threshold, these components are projected back onto EEG channels using a least-squares solution and subtracted from the scalp EEG [20] [81].
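The core reference-based cleaning idea can be sketched with a from-scratch CCA. This is not the published iCanClean implementation: it processes a single window rather than sliding 4 s windows, and the dual-layer noise channels are simulated; only the R² criterion and the least-squares removal step follow the description above.

```python
import numpy as np

def cca_clean(eeg, ref, r2_thresh=0.65):
    """Remove EEG subspaces whose squared canonical correlation with the
    reference noise channels exceeds r2_thresh."""
    X = eeg - eeg.mean(axis=0)                   # (n_samples, n_eeg)
    Y = ref - ref.mean(axis=0)                   # (n_samples, n_ref)
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    # Singular values of Qx'Qy are the canonical correlations.
    U, s, _ = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    W = Qx @ U                                   # canonical variates of the EEG
    bad = W[:, s ** 2 > r2_thresh]               # noise-correlated subspace
    if bad.shape[1] == 0:
        return eeg
    # Least-squares projection onto the bad subspace, subtracted from the EEG.
    coeffs, *_ = np.linalg.lstsq(bad, X, rcond=None)
    return eeg - bad @ coeffs

rng = np.random.default_rng(6)
n = 4000
brain = rng.standard_normal((n, 8))              # neural background
motion = rng.standard_normal((n, 2))             # shared motion artifact
eeg = brain + motion @ rng.standard_normal((2, 8)) * 5.0
ref = motion + 0.05 * rng.standard_normal((n, 2))   # dual-layer noise channels

cleaned = cca_clean(eeg, ref)
# Correlation between the cleaned EEG and the reference drops essentially to zero.
```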
Deep learning approaches represent a paradigm shift in artifact removal, using neural networks trained to directly map artifact-contaminated EEG to clean signals in an end-to-end manner [19] [15].
Architecture Diversity: Proposed models span convolutional networks (1D-ResCNN [7]), dual-scale CNN-LSTM hybrids with attention (CLEnet [15]), LSTM-based generative adversarial networks (AnEEG [19]), and transformer architectures (ART [16]).
Training Requirements: These models require extensive labeled datasets, often created semi-synthetically by combining clean EEG with recorded artifacts [19] [15]. For example, EEGDenoiseNet provides a benchmark dataset combining clean EEG with EMG and EOG artifacts [15].
Table 2: Quantitative Performance Metrics Across Methodologies
| Method | Data Quality Score Improvement | Component Dipolarity | Power Reduction at Gait Frequency | Signal-to-Noise Ratio (SNR) Improvement | Computational Time |
|---|---|---|---|---|---|
| ICA | Not quantified in studies | Reduced quality during motion [20] | Limited reduction [20] | Not quantified | 5+ hours for high-density data [81] [46] |
| ASR | 27.6% (from 15.7% baseline) [81] | Improved with optimal k=10 [20] | Significant reduction [20] | Not quantified | Minutes (real-time capable) [81] |
| iCanClean | 55.9% (from 15.7% baseline) [81] | Greatest improvement [20] [81] | Significant reduction [20] | Not quantified | Minutes (real-time capable) [81] |
| Deep Learning (CLEnet) | Not quantified | Not quantified | Not quantified | 11.498 dB for mixed artifacts [15] | High for training, faster inference |
The data quality score represents the average correlation between known brain sources and EEG channels in phantom head testing [81]. iCanClean demonstrated superior performance, improving data quality from 15.7% to 55.9% in conditions with all artifacts simultaneously present, compared to 27.6% for ASR [81]. In running studies, both ASR and iCanClean significantly reduced power at the gait frequency and its harmonics and enabled identification of ERP components similar to those in stationary conditions [20].
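A sketch of the phantom-head data quality score described above: the average correlation between known ground-truth source signals and the EEG channels. Using absolute correlations and a plain mean is an illustrative choice; the cited study's exact aggregation may differ.

```python
import numpy as np

def data_quality_score(sources, eeg):
    """Mean absolute correlation between each known source and each channel."""
    n_src = sources.shape[0]
    c = np.corrcoef(np.vstack([sources, eeg]))[:n_src, n_src:]
    return float(np.mean(np.abs(c)))

rng = np.random.default_rng(7)
sources = rng.standard_normal((2, 3000))            # known phantom sources
mix = rng.standard_normal((16, 2))                  # source-to-channel mixing
artifact = 5.0 * rng.standard_normal((16, 3000))    # strong motion artifact
eeg_noisy = mix @ sources + artifact
eeg_clean = mix @ sources + 0.2 * rng.standard_normal((16, 3000))

q_noisy = data_quality_score(sources, eeg_noisy)
q_clean = data_quality_score(sources, eeg_clean)    # cleaning raises the score
```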
A 2025 study established a rigorous protocol for comparing artifact removal methods during dynamic motor tasks [20].
A 2023 study established objective performance benchmarks using a phantom head with known ground-truth brain signals [81] [46].
Recent studies have established standardized protocols for training deep learning models for artifact removal [19] [15].
The following diagram illustrates the decision process for selecting an appropriate artifact removal method based on experimental conditions and research objectives:
Table 3: Key Research Materials and Computational Tools for EEG Artifact Removal Research
| Resource Category | Specific Tool/Platform | Function/Purpose | Accessibility |
|---|---|---|---|
| Software Libraries | EEGLAB [20] [58] | MATLAB toolbox providing ICA, ASR, and preprocessing pipelines | Open source |
| Benchmark Datasets | EEGDenoiseNet [15] | Semi-synthetic dataset with clean EEG and artifacts for training/testing | Publicly available |
| Benchmark Datasets | TUH EEG Artifact Corpus [10] | Clinical EEG with expert artifact annotations for development/validation | Publicly available |
| Phantom Platforms | Conductive Phantom Head [81] [46] | Hardware with known brain sources for objective algorithm validation | Research institutions |
| Deep Learning Models | CLEnet [15] | Dual-branch CNN-LSTM with attention for multi-channel artifact removal | Open source code |
| Deep Learning Models | AnEEG [19] | LSTM-based GAN for generating artifact-free EEG from contaminated data | Open source code |
| Mobile EEG Systems | Dual-layer EEG sensors [20] [81] | Hardware with dedicated noise channels for optimal motion artifact removal | Commercial purchase |
The field of EEG artifact removal is rapidly evolving, with several emerging trends poised to shape future research and clinical applications. Deep learning approaches show remarkable potential but face challenges in generalizability across diverse populations and recording conditions [10]. The development of specialized convolutional neural networks optimized for specific artifact classes represents a promising direction, with studies demonstrating that eye movement, muscle, and non-physiological artifacts each require distinct temporal window sizes for optimal detection [10].
Integration of auxiliary sensors, particularly inertial measurement units (IMUs), remains underutilized despite significant potential for enhancing motion artifact detection under real-world conditions [14]. Future pipelines will likely combine multiple approaches, such as using iCanClean for initial motion artifact removal followed by deep learning for residual artifact correction.
For researchers and drug development professionals, method selection must align with experimental constraints and objectives. For stationary paradigms with high-density systems, ICA remains viable. For mobile brain imaging during whole-body movement, iCanClean currently demonstrates superior performance, with deep learning approaches rapidly closing the gap. As wearable EEG continues to displace traditional lab-based systems [24], robust artifact removal methodologies will become increasingly critical for maintaining data integrity in real-world neuroscience research and clinical applications.
High-density electroencephalography (hd-EEG) serves as a critical tool in neuroscience research and clinical applications, from investigating sleep architecture to monitoring neurological disorders. However, the analysis of hd-EEG data is persistently challenged by the presence of various artifacts that obscure genuine neural signals. These artifacts are particularly problematic in two key scenarios: during sleep studies, where biological processes and prolonged recording durations introduce unique contaminants, and in mobile settings, where motion introduces severe, non-stationary noise. Effective artifact removal is therefore not merely a preprocessing step but a fundamental necessity for ensuring the validity of neuroscientific findings and the reliability of clinical biomarkers.
The challenges are magnified in high-density systems due to the increased complexity of separating neural signals from artifacts across many channels. As highlighted in a systematic review, artifacts in wearable EEG exhibit specific features due to dry electrodes, reduced scalp coverage, and subject mobility, yet only a few studies explicitly address these peculiarities [11]. This case study examines the performance of various artifact removal methodologies within the context of a broader thesis on hd-EEG analysis, focusing specifically on sleep EEG and motion-contaminated data. We provide a quantitative evaluation of existing techniques, detail experimental protocols for performance validation, and visualize the core workflows, aiming to establish a framework for robust artifact management in sensitive research and clinical applications.
The efficacy of artifact removal methods is quantified using a standard set of metrics that evaluate both the fidelity of the cleaned signal and the degree of artifact suppression. The following tables summarize the performance of various contemporary techniques across different artifact types and experimental conditions.
Table 1: Performance of Deep Learning & Signal Processing Models on Motion Artifact Removal
| Method | Architecture/Approach | Key Performance Metrics | Artifact Type | Context |
|---|---|---|---|---|
| Motion-Net [82] | CNN-based (U-Net) with Visibility Graph features | Artifact reduction (η): 86% ± 4.13; SNR improvement: 20 ± 4.47 dB; MAE: 0.20 ± 0.16 | Motion Artifacts | Mobile EEG, Subject-specific |
| AnEEG [19] | GAN with LSTM layers | Improved NMSE, RMSE, CC, SNR, and SAR over wavelet techniques | Muscle, Ocular, Environmental | General Artifact Removal |
| FF-EWT + GMETV [83] | Fixed Frequency Empirical Wavelet Transform & GMETV filter | Lower RRMSE, higher CC on synthetic data; Improved SAR and MAE on real data | Ocular (EOG) Artifacts | Single-Channel EEG |
Table 2: Performance of Reference-Based and Blind Source Separation Methods
| Method | Category | Key Performance Metrics / Findings | Artifact Type | Context |
|---|---|---|---|---|
| iCanClean [20] | Reference-Based (CCA) | Produced most dipolar ICA components; Enabled identification of P300 congruency effect during running. | Motion Artifacts | Mobile EEG (Running) |
| Artifact Subspace Reconstruction (ASR) [20] | Statistical Subspace Reconstruction | Improved ICA dipolarity; Reduced power at gait frequency; Less aggressive cleaning sufficed (cutoff k = 20-30 recommended). | Motion Artifacts | Mobile EEG (Running) |
| IMU-Enhanced LaBraM [84] | Multi-modal Deep Learning (Fine-tuned Transformer) | Outperformed ASR-ICA benchmark; Improved robustness under diverse motion scenarios. | Motion Artifacts | Mobile EEG with IMU reference |
| ICA & Autoreject [9] | Blind Source Separation & Statistical Rejection | Generally decreased decoding performance, partly because they removed signal features useful for classification. | Ocular & Muscle | ERP Decoding |
Table 3: Simple Automatic Detection for Sleep EEG
| Method | Basis | Key Findings | Artifact Type | Context |
|---|---|---|---|---|
| Hjorth Parameters [41] | Activity, Mobility, Complexity | Achieved highly similar all-night average Power Spectral Density (PSD) to visual detections; Effectively recovered correlations of PSD with age and sex. | Myogenic, Cardiac, Electrode Pops | Sleep EEG |
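The Hjorth-based screening above lends itself to a compact implementation. The sketch below (plain NumPy) computes the three parameters per epoch; the median-absolute-deviation rule and threshold `k=3.0` in `flag_epochs` are illustrative assumptions, not the exact criterion from [41]:

```python
import numpy as np

def hjorth_parameters(x):
    """Hjorth Activity, Mobility, and Complexity of a 1-D epoch."""
    dx = np.diff(x)
    ddx = np.diff(dx)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

def flag_epochs(epochs, k=3.0):
    """Flag epochs whose Hjorth parameters deviate more than k median
    absolute deviations from the across-epoch median (illustrative rule)."""
    feats = np.array([hjorth_parameters(e) for e in epochs])  # (n_epochs, 3)
    med = np.median(feats, axis=0)
    mad = np.median(np.abs(feats - med), axis=0) + 1e-12
    return np.any(np.abs(feats - med) > k * mad, axis=1)
```

For a pure sinusoidal epoch, Mobility is proportional to its frequency and Complexity is close to 1; broadband myogenic noise or electrode pops push all three parameters away from the channel's typical values, which is what the flagging rule exploits.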
To ensure the validity and comparability of artifact removal techniques, standardized experimental protocols and benchmarking procedures are essential. The following section details methodologies for generating and evaluating performance on sleep hd-EEG and motion-contaminated data.
Evaluating methods on data with real-world motion artifacts requires a robust experimental design that can simulate realistic conditions while allowing for ground-truth comparisons.
For sleep EEG, the focus often shifts to reliable artifact detection with minimal data loss, given the long recording durations.
Benchmark recordings are commonly drawn from public sleep repositories (e.g., sleepdata.org). Data is typically segmented into standard epochs (e.g., 4 seconds) [41]. For deep learning-based approaches like Motion-Net and AnEEG, a standardized training and testing framework is necessary.
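The segmentation step can be sketched in a few lines of NumPy. The 250 Hz sampling rate, 64-channel montage, and synthetic data are assumptions for illustration; the 4-second epoch length follows [41]:

```python
import numpy as np

fs = 250          # sampling rate in Hz (assumed for illustration)
epoch_len = 4     # epoch length in seconds, as in [41]
rng = np.random.default_rng(0)
eeg = rng.standard_normal((64, fs * 60))   # 64 channels, 60 s of synthetic data

samples_per_epoch = fs * epoch_len
n_epochs = eeg.shape[1] // samples_per_epoch
# Truncate any trailing partial epoch, then reshape to (channels, epochs, samples)
epochs = eeg[:, : n_epochs * samples_per_epoch].reshape(
    eeg.shape[0], n_epochs, samples_per_epoch
)
print(epochs.shape)  # (64, 15, 1000)
```

Per-epoch detectors such as the Hjorth screening above then operate along the last axis, channel by channel.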
The following diagrams, generated using Graphviz, illustrate the logical workflows of the key methodologies discussed in this case study.
Motion Artifact Removal Workflow
Sleep EEG Artifact Detection Workflow
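The original Graphviz figures are not reproduced here. As a stand-in, the snippet below emits DOT source for a linear pipeline; the listed stages are a plausible reconstruction of a motion-artifact workflow, not the original diagram:

```python
def workflow_dot(title, steps):
    """Emit Graphviz DOT source for a linear processing pipeline."""
    lines = [f'digraph "{title}" {{', "  rankdir=LR;", "  node [shape=box];"]
    lines += [f'  "{a}" -> "{b}";' for a, b in zip(steps, steps[1:])]
    lines.append("}")
    return "\n".join(lines)

# Hypothetical stage names, assembled from the methods discussed above
motion_pipeline = [
    "Raw hd-EEG + IMU reference",
    "Band-pass filter",
    "Reference-based cleaning (e.g., iCanClean)",
    "ICA decomposition",
    "Artifactual component rejection",
    "Cleaned EEG",
]
print(workflow_dot("Motion Artifact Removal", motion_pipeline))
```

Rendering the emitted source with `dot -Tpng` reproduces a left-to-right box-and-arrow diagram of the kind referenced in this section.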
This section details key hardware, software, and data resources essential for conducting rigorous research in hd-EEG artifact removal.
Table 4: Key Research Reagent Solutions for hd-EEG Artifact Research
| Item / Resource | Function & Application | Relevance to hd-EEG Research |
|---|---|---|
| Mobile BCI Dataset [84] | A public dataset containing synchronized EEG and IMU data from participants standing, walking, and running. | Serves as a critical benchmark for developing and validating motion artifact removal algorithms under realistic, ecologically valid conditions. |
| High-Density EEG Systems (64+ channels) [85] | Scalp electrode systems providing high spatial resolution for source localization and improved blind source separation. | Essential for studying brain connectivity and for applying techniques like ICA, which benefit from a high channel count. |
| Inertial Measurement Units (IMUs) [84] | Wearable sensors measuring acceleration, rotation, and orientation. | Provide a direct, hardware-based reference signal for motion artifacts, enabling powerful reference-based removal methods like iCanClean and adaptive filtering. |
| ERP CORE Dataset [9] | A public resource containing EEG data from seven classic Event-Related Potential (ERP) experiments. | Useful for systematically evaluating how artifact removal pipelines affect downstream decoding performance and the recovery of known neural responses. |
| Software Toolboxes (MNE-Python, EEGLAB, Brainstorm) [85] [9] | Open-source software platforms providing standardized implementations of preprocessing, source localization, and artifact removal algorithms (e.g., ICA, ASR). | Ensure reproducibility, provide community-vetted methods, and facilitate the construction of complex analysis pipelines. |
| Dual-Layer EEG Electrodes [20] | Specialized electrode setups where a second layer of electrodes is mechanically coupled but not in contact with the scalp, recording only noise. | Provide an ideal noise reference for algorithms like iCanClean, significantly improving motion artifact separation from brain signals. |
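Several of the resources above (IMUs, dual-layer electrodes) supply a noise reference that can be regressed out of the EEG. The sketch below uses ordinary least squares on synthetic data as a minimal stand-in; note that iCanClean itself uses canonical correlation analysis rather than plain regression:

```python
import numpy as np

def regress_out_reference(eeg, ref):
    """Least-squares removal of the component of each EEG channel that is
    linearly explained by noise-reference channels (IMU or dual-layer)."""
    # eeg: (n_channels, n_samples); ref: (n_ref, n_samples)
    W, *_ = np.linalg.lstsq(ref.T, eeg.T, rcond=None)
    return eeg - (ref.T @ W).T

# Synthetic demonstration: brain signal plus linearly mixed motion noise
rng = np.random.default_rng(0)
brain = rng.standard_normal((4, 5000))    # stand-in for neural signal
motion = rng.standard_normal((2, 5000))   # stand-in for IMU noise reference
mixing = rng.standard_normal((4, 2))
contaminated = brain + mixing @ motion

cleaned = regress_out_reference(contaminated, motion)
```

Because the least-squares residual is orthogonal to the reference channels, the cleaned signal carries essentially no linear trace of the measured motion; CCA-based methods generalize this by first finding maximally correlated subspaces between the two recordings.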
Electroencephalography (EEG) remains a cornerstone technique for non-invasive monitoring of brain activity, playing an increasingly vital role in both neuroscientific research and clinical applications such as brain-computer interfaces (BCIs), neurological disorder diagnosis, and cognitive monitoring. However, the transition of EEG-based machine learning (ML) models from research environments to real-world clinical and commercial applications faces a fundamental obstacle: the generalizability challenge. This challenge refers to the frequent performance degradation of models when applied to data that differs from their training sets in aspects such as participant demographics, recording equipment, experimental protocols, or artifact profiles.
The problem is particularly acute in high-density EEG research, where artifact removal is a critical preprocessing step. Models that demonstrate exceptional performance on controlled, homogeneous datasets often fail to maintain this performance when confronted with the inherent variability of real-world data. This whitepaper examines the roots of the generalizability challenge, assesses current methodological approaches to address it, and provides a quantitative framework for evaluating model performance across diverse datasets, with a specific focus on implications for artifact removal in high-density EEG research.
A critical but often overlooked aspect of EEG data collection is its hierarchical structure. EEG datasets are typically composed of recordings from multiple participants, with each recording segmented into numerous samples for analysis. This structure creates a fundamental tension between the overall sample size and participant diversity. While sample size can be artificially inflated through segmentation, true participant diversity—the number of unique individuals contributing data—remains a fixed constraint.
Recent empirical research has demonstrated that participant distribution shifts significantly impact model generalizability. One large-scale study systematically investigated this effect across multiple datasets (TUAB, CAUEEG, PhysioNet) and tasks (EEG normality prediction, dementia diagnosis, sleep staging). The findings revealed that model performance scaling is severely constrained when participant diversity is limited, even with large overall sample sizes [86]. This occurs because models trained on data from few participants may learn participant-specific features that do not generalize to new individuals.
Beyond participant diversity, EEG data exhibits multiple dimensions of heterogeneity that challenge model generalizability:
Pathological Diversity: Models trained on homogeneous pathological conditions may struggle with the varied presentations found in real clinical populations. One study introducing a massive EEG corpus of 55,787 recordings from 39 hospitals highlighted that heterogeneous datasets containing diverse pathological conditions, recording protocols, and labeling standards present significantly greater challenges for model performance compared to homogeneous datasets [87].
Experimental Paradigms: The HBN-EEG dataset, used in the 2025 EEG Foundation Challenge, illustrates this diversity with six distinct cognitive tasks including resting state, surround suppression, movie watching, contrast change detection, sequence learning, and symbol search [88]. Models must generalize across these varied paradigms.
Acquisition Parameters: Differences in electrode placement, recording equipment, sampling rates, and preprocessing pipelines introduce additional domain shifts that can degrade model performance.
Several neural architectures have shown promise for improving generalization in EEG analysis:
Transformer and Attention-Based Models: These approaches have demonstrated superior performance, particularly when dealing with large, heterogeneous datasets. Their self-attention mechanisms enable better modeling of long-range dependencies in EEG signals and greater robustness to domain shifts. Studies have found that transformer and attention-based networks performed best, especially when combined with gradient-boosted ensembles [87].
Hybrid Architectures for Artifact Removal: The CLEnet model exemplifies this trend, integrating dual-scale CNN and LSTM with an improved EMA-1D (One-Dimensional Efficient Multi-Scale Attention Mechanism). This design enables simultaneous extraction of morphological features and temporal dependencies from EEG signals, achieving state-of-the-art performance in removing various artifacts including EMG, EOG, and unknown artifacts across multiple datasets [7].
State Space Models (SSMs): For challenging artifact removal tasks such as those encountered in Transcranial Electrical Stimulation (tES), SSMs have demonstrated exceptional performance. A comprehensive benchmark study found that a multi-modular network based on SSMs yielded the best results for removing complex tACS and tRNS artifacts, outperforming conventional approaches [50].
Data Augmentation: Specific augmentation techniques including AmplitudeScaling, FrequencyShift, and PhaseRandomisation have been systematically evaluated for their ability to improve model robustness. Research shows these augmentations are particularly valuable in data-limited regimes, though their effectiveness varies across tasks and model architectures [86].
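Two of the augmentations named above can be sketched directly in NumPy. The gain range in `amplitude_scaling` is an illustrative choice, and `phase_randomisation` follows the standard Fourier-surrogate construction rather than any specific implementation from [86]:

```python
import numpy as np

rng = np.random.default_rng(42)

def amplitude_scaling(x, low=0.8, high=1.2):
    """Multiply an epoch by a random gain (range is an illustrative choice)."""
    return x * rng.uniform(low, high)

def phase_randomisation(x):
    """Randomise Fourier phases while keeping the amplitude spectrum,
    yielding a surrogate epoch with the same power distribution."""
    spec = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.shape)
    surrogate_spec = np.abs(spec) * np.exp(1j * phases)
    return np.fft.irfft(surrogate_spec, n=len(x))
```

Both transforms preserve the spectral content a downstream model should rely on while perturbing nuisance properties, which is why they help most in the data-limited regimes noted in [86].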
Self-Supervised Learning (SSL): Methods like masked token prediction using transformer architectures operating on channel-wise EEG segments have emerged as powerful approaches for learning generalizable representations. The LaBraM model exemplifies this trend, demonstrating that SSL pre-training can enhance performance across different data regimes, particularly when participant diversity is limited [86].
Meta-Learning: The Curriculum Model-Agnostic Meta-Learning (CMAML) framework integrates meta-learning with curriculum learning to impart knowledge of variable artifact complexity. This approach enables models to adaptively learn restoration of multiple artifacts during training, demonstrating better generalization to unseen artifact types and improved performance on composite artifacts (scans with multiple artifacts) compared to conventional training approaches [89].
A standardized set of metrics is essential for comparing artifact removal methods across studies:
Table 1: Key Performance Metrics for EEG Artifact Removal
| Metric | Description | Interpretation |
|---|---|---|
| SNR (Signal-to-Noise Ratio) | Ratio of signal power to noise power | Higher values indicate better artifact suppression |
| CC (Correlation Coefficient) | Linear correlation between processed and clean signals | Values closer to 1 indicate better preservation of original signal |
| RRMSEt (Relative Root Mean Square Error, Temporal) | Normalized error in time domain | Lower values indicate better performance |
| RRMSEf (Relative Root Mean Square Error, Frequency) | Normalized error in frequency domain | Lower values indicate better spectral preservation |
| PSNR (Peak Signal-to-Noise Ratio) | Ratio of maximum possible power to corrupting noise | Higher values indicate better quality reconstruction |
| SSIM (Structural Similarity Index) | Perceived quality comparison between signals | Values closer to 1 indicate better structural preservation |
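The first four metrics in Table 1 have straightforward NumPy implementations. The spectral variant below compares amplitude spectra, which is one common convention (papers differ on whether amplitude or PSD is compared):

```python
import numpy as np

def snr_db(clean, denoised):
    """SNR (dB) of a denoised signal against its clean reference."""
    noise = clean - denoised
    return 10.0 * np.log10(np.sum(clean**2) / np.sum(noise**2))

def cc(clean, denoised):
    """Pearson correlation coefficient between reference and output."""
    return np.corrcoef(clean, denoised)[0, 1]

def rrmse_t(clean, denoised):
    """Relative root mean square error in the time domain."""
    return np.sqrt(np.mean((denoised - clean) ** 2)) / np.sqrt(np.mean(clean**2))

def rrmse_f(clean, denoised):
    """Relative RMSE between amplitude spectra (one common convention)."""
    Ac = np.abs(np.fft.rfft(clean))
    Ad = np.abs(np.fft.rfft(denoised))
    return np.sqrt(np.mean((Ad - Ac) ** 2)) / np.sqrt(np.mean(Ac**2))
```

Note that these metrics require a clean reference signal, which is why semi-synthetic benchmarks with ground truth (e.g., EEGDenoiseNet, discussed below) are central to quantitative comparisons.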
Recent studies have provided quantitative comparisons of artifact removal approaches:
Table 2: Performance Comparison of Deep Learning Artifact Removal Methods
| Method | Architecture | Best For | SNR Improvement | RRMSEt Reduction | Key Strength |
|---|---|---|---|---|---|
| CLEnet [7] | Dual-scale CNN + LSTM + EMA-1D | Multi-artifact removal | 2.45-5.13% | 6.94-8.08% | Handles unknown artifacts in multi-channel EEG |
| CMAML [89] | Meta-learning with curriculum | Unseen and multiple MRI artifacts | - | - | Better generalization to unseen artifacts in 83% of cases |
| SSM (M4) [50] | State Space Models | tACS and tRNS artifacts | - | - | Superior for complex stimulation artifacts |
| Complex CNN [50] | Convolutional Neural Network | tDCS artifacts | - | - | Best for specific stimulation types |
| PA OmniNet [90] | Modified U-Net | Sparse sampling reconstruction | 1.55 dB PSNR increase | 11.6% RMSE reduction | System configuration generalization |
Research on EEG-based Autism Spectrum Disorder (ASD) detection provides valuable insights into preprocessing choices:
Table 3: Performance of Preprocessing Techniques on ASD EEG Data
| Method | SNR (Normal) | SNR (ASD) | MAE | MSE | Key Strength |
|---|---|---|---|---|---|
| ICA [91] | 86.44 | 78.69 | Moderate | Moderate | Superior denoising capability |
| DWT [91] | Lower than ICA | Lower than ICA | 4785.08 | 309,690 | Optimal feature preservation |
| Butterworth [91] | Moderate | Moderate | Higher than DWT | Higher than DWT | Balanced approach |
A robust methodology for assessing generalizability involves cross-dataset validation:
Diagram 1: Cross-Dataset Validation Workflow
To properly evaluate participant-independent generalization, data splitting must occur at the participant level rather than at the sample level:
Diagram 2: Participant-Centric Data Splitting
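Participant-level splitting can be done in a few lines of NumPy. This is a minimal sketch: `test_frac` and the seeded shuffle are illustrative choices, and in practice scikit-learn's `GroupShuffleSplit` serves the same purpose:

```python
import numpy as np

def participant_split(participant_ids, test_frac=0.2, seed=0):
    """Hold out whole participants so no individual appears in both sets."""
    rng = np.random.default_rng(seed)
    ids = np.unique(participant_ids)
    rng.shuffle(ids)
    n_test = max(1, int(round(test_frac * len(ids))))
    test_ids = set(ids[:n_test].tolist())
    idx = np.arange(len(participant_ids))
    mask = np.array([p in test_ids for p in participant_ids])
    return idx[~mask], idx[mask]
```

Splitting at the sample level instead would let segments from the same recording leak into both sets, inflating accuracy estimates: this is exactly the participant-specific overfitting described above.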
The 2025 EEG Foundation Challenge has established a standardized protocol for assessing cross-task generalization [88].
Table 4: Essential Resources for EEG Generalization Research
| Resource | Type | Key Features | Application |
|---|---|---|---|
| HBN-EEG Dataset [88] | Dataset | 3,000+ participants, 6 cognitive tasks, psychopathology dimensions | Cross-task and cross-subject generalization |
| Elmiko Dataset [87] | Dataset | 55,787 recordings, 39 hospitals, diverse pathologies | Large-scale heterogeneity studies |
| WBCIC-MI Dataset [74] | Dataset | 62 participants, 3 sessions, 2-3 class motor imagery | Cross-session and cross-subject BCI research |
| CLEnet [7] | Algorithm | Dual-scale CNN + LSTM + EMA-1D | Multi-artifact removal in multi-channel EEG |
| CMAML [89] | Framework | Meta-learning with curriculum | Generalization to unseen artifact types |
| EEGDenoiseNet [7] | Benchmark | Semi-synthetic dataset with ground truth | Controlled artifact removal evaluation |
| LaBraM [86] | Foundation Model | Self-supervised pre-training | Transfer learning for data-limited scenarios |
The generalizability challenge represents a critical bottleneck in the translation of EEG-based machine learning models from research to clinical practice. Our assessment reveals that participant diversity, rather than overall sample size, is frequently the limiting factor in model performance. This understanding necessitates a paradigm shift in how we collect EEG data, develop models, and assess their performance.
Promising directions for future research include the development of foundation models for EEG that can adapt to new tasks and domains with minimal fine-tuning, increased focus on explainable AI techniques to understand what features generalize across domains, and the establishment of standardized benchmarking protocols that explicitly measure generalizability rather than just within-dataset performance.
For researchers and drug development professionals, prioritizing participant diversity during data collection, incorporating cross-dataset validation as a standard evaluation practice, and selectively applying generalization-enhancing techniques such as meta-learning and self-supervised pre-training will be essential for building EEG-based tools that deliver reliable performance in real-world settings. The methodologies and metrics outlined in this whitepaper provide a framework for these efforts, moving the field toward more robust and generalizable EEG analysis systems.
The endeavor of artifact removal in high-density EEG is a complex but surmountable challenge, central to extracting valid and reliable neural insights. A successful strategy is not one-size-fits-all; it requires a nuanced understanding of the artifact types, a carefully selected methodological toolkit blending established and emerging techniques, and rigorous validation tailored to the research context. The field is poised for significant advancement through the development of more robust, generalizable deep learning architectures, the creation of standardized, high-quality public datasets for benchmarking, and a stronger focus on real-time, automated solutions for clinical and translational environments. For researchers and drug development professionals, mastering these artifact removal challenges is not merely a technical exercise—it is a fundamental prerequisite for ensuring the fidelity of the neural biomarkers and endpoints that underpin modern neuroscience and therapeutic development.