EOG vs EMG Artifacts in EEG: A Comprehensive Guide for Biomedical Research and Signal Processing

Nathan Hughes Dec 02, 2025

Abstract

This article provides a comprehensive analysis of electrooculogram (EOG) and electromyogram (EMG) artifacts, two predominant physiological contaminants in electroencephalography (EEG) data. Tailored for researchers and drug development professionals, it explores the distinct origins, spatial-topographical profiles, and temporal-spectral signatures of these artifacts. The scope extends from foundational principles to advanced methodologies for detection, correction, and removal, including traditional techniques like Independent Component Analysis (ICA) and wavelet transforms, as well as emerging deep learning and real-time processing approaches. It further offers practical troubleshooting guidance for optimizing pipelines in clinical and mobile settings and presents a comparative evaluation of artifact management strategies on downstream data analysis, empowering scientists to enhance EEG signal quality and reliability in biomedical research.

Understanding the Adversaries: A Deep Dive into EOG and EMG Artifact Origins and Characteristics

Electroencephalography (EEG) provides a non-invasive window into the brain's spontaneous electrical activity, playing a crucial role in both clinical diagnostics and neuroscience research. However, the recorded electrical activity is invariably contaminated with artifacts—undesired signals that can fundamentally compromise signal interpretation and analysis [1]. Among these, physiological artifacts originating from the body itself present particularly formidable challenges. Electrooculographic (EOG) artifacts from eye movements and blinks, and electromyographic (EMG) artifacts from muscle activity represent two of the most significant sources of contamination due to their amplitude, frequency characteristics, and spatial distribution [1] [2] [3]. Within the context of a broader research thesis comparing EOG and EMG artifacts, this technical guide examines the physiological origins, defining characteristics, and methodological approaches for understanding and mitigating these pervasive signal contaminants, with particular relevance for researchers and drug development professionals requiring clean neural signals for accurate analysis.

Physiological Origins and Signal Characteristics

The Nature of EEG and the Problem of Artifacts

EEG measures voltage fluctuations resulting from ionic currents within the neurons of the brain, typically recorded via electrodes placed on the scalp. These signals are characterized by low amplitude (typically tens of microvolts at the scalp) and specific frequency rhythms that are categorized into distinct bands, each associated with different brain states [1].

Table 1: Primary EEG Rhythm Characteristics

Band Name	Frequency Range (Hz)	Associated Brain State
Delta	<4	Deep sleep
Theta	4-8	Relaxed state, meditation
Alpha	8-13	Relaxed wakefulness
Beta	13-30	Active thinking, focus
Gamma	>30	Higher cognitive processing

The fundamental challenge arises because the amplitude of physiological artifacts often dwarfs genuine cortical activity. EOG and EMG artifacts can be orders of magnitude larger than the underlying neural signals, making their elimination essential for accurate data interpretation [1] [3].

EOG Artifacts: The Ocular Component

Ocular artifacts originate from two primary sources: eye blinks and eye movements. Both phenomena alter the orientation of the corneo-retinal dipole, the electrical potential between the positively charged cornea and the negatively charged retina [1] [3]. During a blink, the upward rotation of the eyeballs (Bell's phenomenon) changes this potential difference, generating a positive waveform that is most prominent in frontal electrodes [3]. The amplitude of EOG artifacts is typically many times greater than background EEG activity, often reaching hundreds of microvolts [1] [3]. Because their frequency content falls mainly in the delta and theta bands, and because of their high amplitude and characteristic spike-like morphology in the time domain, they are particularly disruptive for event-related potential studies and spectral analyses in lower frequency ranges [1] [4].

EMG Artifacts: The Muscular Component

Muscle artifacts arise from the electrical activity associated with muscle contraction, which can be generated by various activities including talking, chewing, swallowing, and facial movements [1] [2]. Unlike EOG artifacts, EMG contamination exhibits a broad frequency distribution ranging from 0 Hz to over 200 Hz, with significant energy in the beta and gamma ranges that often overlaps with neural oscillations associated with higher cognitive processing [1] [2]. The amplitude and waveform of EMG artifacts vary considerably based on the force of contraction and the specific muscle groups involved [3]. A critical distinguishing feature is that EMG artifacts lack stereotypical spatial topographies, as they can originate from numerous muscle groups around the head and neck, each with different volume conduction pathways to the recording electrodes [2] [5].

Table 2: Comparative Characteristics of EOG and EMG Artifacts

Feature	EOG Artifacts	EMG Artifacts
Origin	Eye blinks and movements	Muscle contraction (face, head, neck)
Amplitude	Very high (hundreds of µV) [3]	Variable, depends on contraction force [3]
Frequency Range	Similar to EEG (0.1-15 Hz) [1]	Broad spectrum (0 to >200 Hz) [1]
Primary EEG Band Overlap	Delta, Theta [1] [4]	Beta, Gamma [1] [2]
Spatial Distribution	Primarily anterior regions [3]	Widespread, depends on muscle group [2]
Waveform Morphology	Stereotyped, slow potentials [1]	Stochastic, burst-like [2]

Methodological Approaches for Artifact Management

Experimental Design and Pre-processing Considerations

Effective artifact management begins during experimental design. For EOG artifacts, subjects can be instructed to minimize blinks and eye movements, though this approach increases cognitive load and is not always feasible, particularly in clinical populations or lengthy recordings [3]. The use of auxiliary EOG electrodes placed above, below, and lateral to the eyes provides reference signals that can significantly enhance subsequent artifact removal algorithms [1] [3]. For EMG artifacts, minimizing head and neck movement through proper subject positioning and instruction is helpful, though often insufficient for complete artifact elimination [2].

In pre-processing, simple filtering approaches have limited utility because of the substantial spectral overlap between artifacts and neural signals. High-pass filtering can attenuate very slow EOG drifts, but risks distorting genuine low-frequency neural activity [1]. Notch filters (e.g., 50/60 Hz) effectively remove line noise but do not address physiological artifacts [1].
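The two filters mentioned above can be sketched as follows. This is a minimal illustration using SciPy, not a recommended pipeline: the function name `highpass_and_notch`, the 0.5 Hz cutoff, the filter order, the notch quality factor, and the synthetic test signal are all assumptions chosen for the example.

```python
import numpy as np
from scipy import signal


def highpass_and_notch(eeg, fs, hp_cutoff=0.5, line_freq=50.0, notch_q=30.0):
    """Zero-phase high-pass plus mains notch for a 1-D trace (illustrative defaults)."""
    # 4th-order Butterworth high-pass, as second-order sections for numerical stability
    sos = signal.butter(4, hp_cutoff, btype="highpass", fs=fs, output="sos")
    out = signal.sosfiltfilt(sos, eeg)
    # Narrow IIR notch centred on the mains frequency
    b, a = signal.iirnotch(line_freq, notch_q, fs=fs)
    return signal.filtfilt(b, a, out)


fs = 250.0
t = np.arange(0, 20, 1 / fs)
alpha = 10.0 * np.sin(2 * np.pi * 10 * t)     # 10 Hz rhythm standing in for neural activity (µV)
drift = 100.0 * np.sin(2 * np.pi * 0.05 * t)  # slow EOG-like drift (µV)
mains = 20.0 * np.sin(2 * np.pi * 50 * t)     # line noise (µV)
filtered = highpass_and_notch(alpha + drift + mains, fs)
```

In this toy case the drift and line noise sit outside the band of the 10 Hz rhythm, so filtering works well. A blink or a muscle burst would not be removed this way, because its energy lies inside the bands that carry neural information.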

Core Algorithmic Approaches for Artifact Removal

Regression-Based Methods Traditional regression methods model the relationship between reference artifact channels (EOG) and EEG signals using transmission factors, then subtract the estimated artifact component from the contaminated signal [1] [6]. While straightforward to implement, these methods assume a linear relationship and stationary mixing, and they risk removing neural signals that are correlated with the reference channels [1] [6].
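A time-domain version of this idea can be sketched with ordinary least squares. The function name `regress_out_eog`, the simulated signals, and the transmission factors in `true_b` are hypothetical and serve only to illustrate the estimate-then-subtract step; published regression methods differ in how they estimate the coefficients.

```python
import numpy as np


def regress_out_eog(eeg, eog):
    """Subtract the least-squares projection of EOG reference channels from EEG.

    eeg: array (n_eeg_channels, n_samples); eog: array (n_eog_channels, n_samples).
    Returns the mean-centred, corrected EEG and the estimated transmission factors.
    """
    eeg_c = eeg - eeg.mean(axis=1, keepdims=True)
    eog_c = eog - eog.mean(axis=1, keepdims=True)
    # Solve eog_c.T @ B ~= eeg_c.T for B, shape (n_eog_channels, n_eeg_channels)
    B, *_ = np.linalg.lstsq(eog_c.T, eeg_c.T, rcond=None)
    return eeg_c - B.T @ eog_c, B.T


rng = np.random.default_rng(0)
n = 5000
brain = rng.standard_normal((3, n)) * 10.0         # stand-in for neural activity (µV)
eog = rng.standard_normal((2, n)).cumsum(axis=1)   # slow, large reference signals
true_b = np.array([[0.8, 0.1], [0.4, 0.05], [0.1, 0.02]])  # made-up transmission factors
contaminated = brain + true_b @ eog
cleaned, est_b = regress_out_eog(contaminated, eog)
```

The simulation assumes the reference channels contain no brain signal. In real recordings they do, which is exactly why the subtraction can also remove neural activity correlated with the reference.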

Blind Source Separation (BSS) Techniques BSS approaches, particularly Independent Component Analysis (ICA), have become a standard tool for artifact removal. ICA decomposes multichannel EEG into statistically independent components, which can be manually or automatically classified as neural or artifactual [1] [7] [2]. ICA is particularly effective for EOG artifacts, which typically separate into distinct components due to their stereotyped topography and temporal structure [7]. For EMG artifacts, Canonical Correlation Analysis (CCA), a BSS variant that orders sources by their autocorrelation, often outperforms ICA: because muscle activity has low autocorrelation compared to rhythmic brain activity, it concentrates in the least autocorrelated components, which can then be discarded [5].
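The autocorrelation-based separation can be sketched in NumPy. As a simplifying assumption, the code below whitens the data and eigendecomposes the symmetrised lag-1 covariance, which yields mutually uncorrelated sources ordered by lag-1 autocorrelation in the spirit of BSS-CCA; it is not the implementation used in the cited work. The names `bss_cca` and `remove_low_autocorr`, the mixing matrix, and the simulated sources are illustrative.

```python
import numpy as np


def bss_cca(x):
    """Decompose x (n_channels, n_samples) into sources ordered by lag-1 autocorrelation."""
    x = x - x.mean(axis=1, keepdims=True)
    # Whiten the channels
    cov = x @ x.T / x.shape[1]
    evals, evecs = np.linalg.eigh(cov)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
    z = whitener @ x
    # Symmetrised lag-1 covariance of the whitened data
    c1 = z[:, 1:] @ z[:, :-1].T / (z.shape[1] - 1)
    c1 = (c1 + c1.T) / 2
    rho, v = np.linalg.eigh(c1)
    order = np.argsort(rho)[::-1]          # highest autocorrelation first
    unmix = v[:, order].T @ whitener
    return unmix @ x, unmix, rho[order]


def remove_low_autocorr(x, n_remove=1):
    """Zero the n_remove least-autocorrelated sources and project back to channels."""
    mean = x.mean(axis=1, keepdims=True)
    sources, unmix, _ = bss_cca(x)
    sources[-n_remove:] = 0.0
    return np.linalg.pinv(unmix) @ sources + mean


rng = np.random.default_rng(1)
fs, n = 250, 5000
t = np.arange(n) / fs
rhythm = np.vstack([np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 6 * t + 1.0)])
muscle = rng.standard_normal(n)            # white noise standing in for EMG
mixing = np.array([[1.0, 0.5, 0.8], [0.4, 1.0, 0.6], [0.7, 0.3, 1.0]])
clean = mixing[:, :2] @ rhythm
contaminated = clean + 2.0 * np.outer(mixing[:, 2], muscle)
denoised = remove_low_autocorr(contaminated, n_remove=1)
```

How many components to discard is the key design choice. Here it is fixed at one because the simulation contains exactly one muscle-like source; with real data a threshold on the autocorrelation values is a more plausible criterion.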

Hybrid and Advanced Methods Recent approaches combine decomposition methods with other signal processing techniques to enhance artifact removal. The wavelet-enhanced ICA (wICA) method applies wavelet transform to independent components containing EOG artifacts, selectively thresholding coefficients corresponding to artifacts while preserving neural information [7]. Similarly, Singular Spectrum Analysis (SSA) combined with CCA has demonstrated superior performance for removing EMG artifacts from multichannel EEG compared to traditional methods [5]. For single-channel configurations, which are common in portable EEG systems, techniques combining k-means clustering with SSA have been developed to identify and remove eye-blink artifacts without modifying uncontaminated signal regions [4].
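To make the Singular Spectrum Analysis step concrete, the sketch below shows only the basic decomposition of a single channel: embedding into a trajectory matrix, singular value decomposition, and diagonal averaging. The function name `ssa_components`, the window length, and the synthetic signal are assumptions. Deciding which components to keep or reject (for example with k-means, as in the cited work) is deliberately left out.

```python
import numpy as np


def ssa_components(x, window):
    """Split a 1-D series into additive components via singular spectrum analysis."""
    n = len(x)
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j is x[j : j + window]
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    comps = np.empty((len(s), n))
    for i in range(len(s)):
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging (Hankelisation) maps the rank-1 matrix back to a series
        comps[i] = [elem[::-1].diagonal(d).mean() for d in range(-(window - 1), k)]
    return comps


fs = 100
t = np.arange(400) / fs
rhythm = 10.0 * np.sin(2 * np.pi * 10 * t)        # stand-in for ongoing EEG (µV)
blink = 100.0 * np.exp(-((t - 2.0) / 0.2) ** 2)   # smooth blink-like transient (µV)
comps = ssa_components(rhythm + blink, window=40)
reconstructed = comps.sum(axis=0)                 # summing every component returns the input
```

Because the components are additive, artifact removal amounts to summing only the components judged to be neural. A pure sinusoid occupies exactly two components, which is one reason rhythmic activity tends to separate cleanly from slow transients.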

Deep Learning Approaches Emerging deep learning architectures show promise for handling complex artifact removal challenges. Networks such as CLEnet integrate convolutional neural networks (CNN) with long short-term memory (LSTM) layers to simultaneously capture spatial and temporal features of artifacts, enabling effective removal even of unknown artifact types without requiring reference channels [8]. These approaches are particularly valuable for real-world applications where artifacts may not conform to stereotypical patterns.

Experimental Protocols for Method Validation

Semi-Synthetic Data Generation To quantitatively evaluate artifact removal algorithms, researchers often create semi-synthetic data by combining clean EEG recordings with real or simulated artifacts [4] [8] [5]. This approach provides ground truth for validation. The standard protocol involves:

Identifying artifact-free EEG epochs from lengthy recordings through visual inspection [4].
Recording actual EOG/EMG artifacts or constructing them to maintain real morphology (e.g., using MATLAB smoothing functions on identified artifact segments) [4].
Mixing artifacts with clean EEG at controlled signal-to-noise ratios using linear mixing models [8] [5].

Performance Metrics Algorithm performance is quantified using multiple metrics calculated between processed signals and ground truth:

Relative Root Mean Squared Error (RRMSE): Measures temporal domain reconstruction accuracy [8].
Correlation Coefficient (CC): Assesses waveform similarity [8].
Signal-to-Noise Ratio (SNR): Quantifies artifact suppression [8].
Spectral Distortion Measures: Evaluate frequency domain preservation [9].
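The first three metrics can be written in a few lines. The definitions below follow common usage in the denoising literature and are an assumption here, since the cited works may normalize slightly differently; the function names and the toy signals are illustrative.

```python
import numpy as np


def rrmse(estimate, truth):
    """Relative root mean squared error: RMS of the error over RMS of the ground truth."""
    return np.sqrt(np.mean((estimate - truth) ** 2)) / np.sqrt(np.mean(truth ** 2))


def correlation_coefficient(estimate, truth):
    """Pearson correlation between the processed signal and the ground truth."""
    return np.corrcoef(estimate, truth)[0, 1]


def snr_db(estimate, truth):
    """Ratio of ground-truth power to residual-error power, in decibels."""
    return 10 * np.log10(np.sum(truth ** 2) / np.sum((estimate - truth) ** 2))


t = np.arange(1000) / 1000
truth = np.sin(2 * np.pi * 5 * t)
estimate = truth + 0.1 * np.cos(2 * np.pi * 5 * t)  # residual error at 10% of signal amplitude

print(rrmse(estimate, truth))     # approximately 0.1
print(snr_db(estimate, truth))    # approximately 20 dB
```

Lower RRMSE and higher CC and SNR indicate better reconstruction. The three can disagree, which is why they are usually reported together.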

The Researcher's Toolkit: Essential Methods and Reagents

Table 3: Research Reagent Solutions for Artifact Management

Tool Category	Specific Method/Algorithm	Primary Function	Application Context
Reference-Based Methods	Regression (Time/Frequency Domain)	Estimates and subtracts artifact using reference EOG/ECG	Laboratory settings with reference channels [1] [6]
Blind Source Separation	Independent Component Analysis (ICA)	Separates sources by statistical independence	Multichannel EEG, especially effective for EOG [1] [7]
Blind Source Separation	Canonical Correlation Analysis (CCA)	Separates sources based on autocorrelation	Multichannel EEG, particularly effective for EMG [5]
Decomposition Techniques	Wavelet Transform	Time-frequency decomposition for selective filtering	Component enhancement or single-channel processing [7]
Decomposition Techniques	Singular Spectrum Analysis (SSA)	Decomposes time series based on covariance	Single-channel or hybrid multichannel methods [4] [5]
Hybrid Methods	wICA [7], SSA-CCA [5]	Combines strengths of multiple approaches	Challenging artifacts in multichannel EEG
Machine Learning	k-means Clustering + SSA [4]	Identifies artifact segments using time-domain features	Single-channel EEG without reference signals
Deep Learning	CLEnet (CNN + LSTM) [8]	End-to-end artifact removal using learned features	Complex or unknown artifacts in any configuration
Validation Tools	Semi-synthetic datasets with ground truth	Quantitative algorithm evaluation	Method development and benchmarking [8] [5]

EOG and EMG artifacts represent fundamental challenges to EEG signal integrity due to their distinct physiological origins, characteristic properties, and complex interactions with neural signals. EOG artifacts, with their stereotyped topography and low-frequency predominance, can mimic pathological slow waves or distort event-related potentials. EMG artifacts, with their broad spectral distribution and variable topography, can obscure high-frequency neural oscillations and generate misleading topographic patterns. Effective management requires a methodological approach tailored to the specific artifact type, available channels, and research context. While traditional regression and BSS methods remain valuable, emerging hybrid and deep learning approaches show increasing promise for handling the complex artifact profiles encountered in real-world research settings, particularly with the growing use of wearable EEG systems. For researchers in drug development and clinical applications, where signal fidelity is paramount, selecting appropriate artifact handling strategies is not merely a preprocessing step but a fundamental methodological consideration that can significantly impact study outcomes and interpretations.

Within electrophysiological research, the accurate isolation of neural signals is paramount. The electrooculogram (EOG), a record of the eye's corneo-retinal standing potential, represents a significant confounding artifact in electroencephalography (EEG) and a source of cross-talk in electromyogram (EMG) studies. This technical guide details the biophysical origins of EOG artifacts, differentiating the distinct signals generated by blinks and saccades. A core challenge in the broader context of EOG vs. EMG artifact research lies in their discrimination; while EMG artifacts from facial muscles are typically high-frequency and burst-like, EOG artifacts are characterized by their slow, large-amplitude waveforms, though both can co-occur [10]. Understanding these ocular artifacts is critical for researchers and drug development professionals to ensure the integrity of neurophysiological data in clinical trials and experimental studies.

The Biophysics of the Corneo-Retinal Potential

The fundamental source of the EOG signal is the corneo-retinal standing potential, a bioelectrical phenomenon where the retina is electrically negative relative to the cornea [10] [11]. This establishes a steady dipole field across the eye, with an amplitude typically in the tens of millivolts [12].

The Dipole Model: This dipole can be modeled as a simple current source and sink aligned with the line of sight. When the eyes are stationary and gazing straight ahead, this potential field remains relatively constant.
Signal Generation via Dipole Rotation: The key principle for EOG artifact generation is the rotation of this dipole during eye movements. As the eyeball rotates, the cornea moves toward one electrode while the retina moves toward the opposing one. This change in the dipole's orientation relative to fixed facial electrodes results in a measurable change in the electric potential at the scalp surface [11]. The recorded potential shift is linearly related to the angle of eyeball rotation within a range of approximately ±30 degrees [10].
Volume Conduction: The potential changes generated by the ocular dipole are propagated throughout the head via volume conduction [13]. This mechanism causes the electrical activity originating in the eyes to contaminate recordings from distant electrodes, most prominently those in the frontal and prefrontal scalp regions, but with significant impact on central and parietal areas as well [14] [12].
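The linear relationship between EOG voltage and gaze angle is what makes a simple calibration possible. The sketch below fits a line to fixation data and inverts it; the function names, the 16 µV per degree sensitivity, the 5 µV offset, and the noise level are invented for illustration and are not values from the cited studies.

```python
import numpy as np


def fit_eog_calibration(angles_deg, voltages_uv, max_angle=30.0):
    """Least-squares line through calibration fixations inside the assumed linear range."""
    angles_deg = np.asarray(angles_deg, dtype=float)
    voltages_uv = np.asarray(voltages_uv, dtype=float)
    keep = np.abs(angles_deg) <= max_angle   # discard targets outside roughly +/-30 degrees
    gain, offset = np.polyfit(angles_deg[keep], voltages_uv[keep], 1)
    return gain, offset


def voltage_to_angle(voltage_uv, gain, offset):
    """Invert the calibration line to estimate gaze angle from EOG voltage."""
    return (np.asarray(voltage_uv, dtype=float) - offset) / gain


rng = np.random.default_rng(2)
targets = np.array([-30.0, -15.0, 0.0, 15.0, 30.0])            # fixation targets (degrees)
measured = 16.0 * targets + 5.0 + rng.normal(0.0, 2.0, 5)      # hypothetical HEOG readings (µV)
gain, offset = fit_eog_calibration(targets, measured)
estimated_angle = voltage_to_angle(16.0 * 10.0 + 5.0, gain, offset)  # close to 10 degrees
```

Because the standing potential changes with illumination and over time, a gain fitted this way should be treated as valid only for the session and lighting conditions in which it was measured.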

Diagram Title: The Biophysical Pathway of EOG Signal Generation.

Characterizing Blinks and Saccades

Although both blink and saccade artifacts originate from the corneo-retinal potential, their underlying mechanisms and resulting signal characteristics are distinct. The table below summarizes the key differences.

Table 1: Characteristic Differences Between Blink and Saccade Artifacts

Feature	Blink Artifact	Saccade Artifact
Primary Mechanism	Eyelid movement over the cornea, altering the electrical field and causing a "sliding electrode effect" [13].	Physical rotation of the eyeball, changing the orientation of the corneo-retinal dipole [11].
Main EOG Component	Vertical EOG (VEOG) and Radial EOG (REOG). Blinks involve a strong REOG component due to the eye rolling upwards and backwards (Bell's phenomenon) [14].	Horizontal EOG (HEOG) for lateral saccades; VEOG for vertical saccades.
Typical Amplitude	Very high (hundreds of microvolts), often the largest artifact in EEG recordings [10].	Lower than blinks, proportional to saccade amplitude [10].
Waveform Morphology	Smooth, large, positive-going spike in frontal channels [10].	Rapid, step-like potential shift corresponding to eye movement velocity [11].
Spectral Content	Predominantly very low-frequency (< 4 Hz) [15].	Broader frequency content, but still primarily low-frequency.
Propagation on Scalp	Widespread, maximal at Fp1/Fp2, but significant propagation to central sites [14] [12].	More directional propagation. Horizontal saccades affect temporal sites; vertical saccades affect frontal and parietal sites [12].

The Critical Role of the Radial EOG Component

A key finding in advanced artifact research is the necessity of the Radial EOG (REOG) component for accurate blink correction. The REOG captures voltage changes perpendicular to the plane of vertical and horizontal movements [14]. Blinks involve a strong radial component due to Bell's phenomenon, where the eyeballs rotate upwards and back into the orbit during a blink. This radial activity is highly correlated with VEOG during blinks (r² ≈ 0.98), making it difficult to separate them using traditional regression methods that only employ VEOG and HEOG [14]. Consequently, correction coefficients (B values) derived from saccade data, which lacks substantial radial activity, correct blink data poorly, and vice versa [14].

Experimental Protocols for EOG Artifact Research

To systematically study these artifacts, robust experimental protocols are required. The following methodology, synthesized from recent studies, provides a framework for acquiring high-quality EOG data.

Data Acquisition and Calibration

Electrode Placement: A minimum of three electrodes is recommended to capture the three spatial dimensions of eye movements: HEOG (outer canthi), VEOG (above and below one eye), and REOG (referenced to capture activity perpendicular to the others) [14] [13]. For comprehensive research, a 7-electrode setup surrounding the eyes provides robust data [12]. Skin must be cleaned with substances like petroleum ether to ensure optimal electrode contact and signal quality [16].
Equipment and Settings: Use a high-input impedance differential amplifier. A sampling rate of ≥250 Hz is sufficient [11] [16], but higher rates (500 Hz) are beneficial for analyzing saccadic dynamics [11]. Hardware filters (e.g., a 50 Hz low-pass filter) can help reduce high-frequency noise [16].
Calibration Procedure: Participants should sit in a stable chair, often using a chin-rest to minimize head movements [12]. The calibration involves fixating on a series of targets (e.g., a 5x5 grid on a monitor) [11]. It is critical to maintain constant ambient illumination (e.g., 200 lux) as the corneo-retinal potential is light-sensitive [16]. A 15-minute stabilization period after electrode placement is recommended before calibration [16].

Experimental Paradigms

Saccade Paradigms: The "gap" and "overlap" paradigms are classic methods for eliciting saccades with different latencies [11]. In the gap paradigm, the central fixation point disappears 200 ms before a peripheral target appears, resulting in shorter latency saccades. In the overlap paradigm, the fixation point remains visible after the target appears, producing longer latency saccades.
Systematic Viewing Area Tasks: To investigate the influence of corneal-retinal dipole orientation, researchers can employ tasks that segregate eye movements into different screen zones (e.g., left, middle, right for horizontal saccades; top, middle, bottom for vertical saccades) [12]. This reveals that EOG influence on EEG is direction and viewing-area sensitive.

Diagram Title: Experimental Workflow for EOG Artifact Investigation.

The Scientist's Toolkit: Key Research Reagents and Materials

Table 2: Essential Research Materials for EOG Artifact Studies

Item	Function & Specification
Ag/AgCl Electrodes	Disposable electrodes (e.g., 30x22 mm) are standard for stable signal acquisition and subject comfort [16].
Differential Amplifier	A multi-channel amplifier with high resolution (e.g., 20-bit ADC, 0.1 μV) and adjustable gain (~3,200) to capture the full range of EOG signals [16].
Signal Processing Software	Open-source platforms like BioSig provide validated, fully automated implementations of artifact correction methods for reproducible research [13].
Stimulus Presentation Software	Software like MATLAB with Psychophysics Toolbox allows for precise control over calibration targets and experimental paradigms [12].
Calibrated Display	A monitor with a known refresh rate (e.g., 85-100 Hz) for presenting fixation targets and controlling visual stimuli [11] [12].
Ophthalmic Chin Rest	Stabilizes the head to prevent movement artifacts and ensure consistent eye movement angles during calibration and tasks [12].

Advanced Correction Methodologies

Given the distinct nature of blink and saccade artifacts, advanced correction techniques have been developed.

Revised Aligned-Artefact Average (RAAA): This method addresses the failure of traditional regression when using saccade-derived coefficients to correct blinks. The RAAA calculates VEOG and HEOG coefficients from saccade data, then uses these to correct a blink average, forming a residual. The REOG coefficient is then calculated from this de-correlated residual, finally allowing a single set of coefficients to correct both blinks and saccades effectively [14].
Stationary Subspace Analysis (SSA): Not to be confused with Singular Spectrum Analysis, which shares the abbreviation, Stationary Subspace Analysis is a blind source separation technique that does not require source independence. It identifies components based on their non-stationarity (changes in mean and covariance over epochs), making it effective for highly non-stationary EOG artifacts and situations with limited EEG channels [17].
Single-Channel and Automated Methods: For wearable EEG systems, methods like Fixed Frequency Empirical Wavelet Transform (FF-EWT) combined with specialized filters can automatically identify and remove EOG artifacts from a single channel by targeting their characteristic low-frequency content [15]. Fully automated regression-based methods that use multiple EOG channels have also been validated to reduce EOG artifacts by up to 80% [13].

The corneo-retinal potential is a robust and predictable source of bioelectrical artifacts. A deep understanding of its distinct manifestations during blinks and saccades is not merely a technical concern but a foundational element for data integrity in neurophysiological research. By employing appropriate multi-channel recording setups, systematic experimental paradigms, and modern correction algorithms that account for the radial component, researchers can effectively mitigate these pervasive artifacts. This ensures the accuracy of neural signal interpretation, which is critical in fundamental neuroscience and applied fields like pharmaceutical drug development, where clean electrophysiological endpoints are essential for assessing treatment efficacy.

Within electrophysiological research, particularly in studies utilizing electroencephalography (EEG) and brain-computer interfaces (BCI), the accurate separation of neural signals from non-cerebral artifacts is a fundamental challenge. This guide focuses on a specific and pervasive source of contamination: the electromyogram (EMG) artifact generated by the facial, jaw, and neck musculature. In the broader context of artifact research, EMG artifacts are often contrasted with electrooculogram (EOG) artifacts. While EOG artifacts, caused by eye blinks and movements, are typically lower frequency (0.5-12 Hz) and more localized frontally, EMG artifacts present a distinctly different and often more problematic profile [15] [18]. The myogenic signals from pericranial muscles are characterized by their broadband nature, high amplitude, and widespread anatomical distribution, which allows them to masquerade as, or fundamentally alter, genuine neurogenic activity across virtually the entire EEG spectrum [19]. This contamination poses a profound threat to the validity of research interpreting oscillatory brain activity, especially with the rising use of portable, single-channel EEG systems in both clinical and real-world settings [15] [20]. Understanding the core characteristics and methodologies for investigating these artifacts is therefore critical for researchers and drug development professionals relying on clean neural data.

Core Characteristics of Pericranial EMG Artifacts

The EMG signal originates from the electrical activity of muscle fibers during contraction. When recorded from the scalp surface, these signals are volume-conducted, leading to a complex interference pattern that contaminates the EEG. The artifacts from facial (e.g., frontalis, orbicularis oculi), jaw (e.g., temporalis, masseter), and neck muscles are particularly challenging due to their proximity to recording electrodes.

Spectral and Spatial Properties

The most defining feature of these EMG artifacts is their broadband frequency signature. Unlike the relatively narrow-band oscillations of EEG, EMG artifacts possess a remarkably wide spectral distribution.

Spectral Breadth: EMG power is detectable from very low frequencies (~2 Hz) to well over 100 Hz [19]. This broad range significantly overlaps with all standard EEG frequency bands of interest (Delta, Theta, Alpha, Beta, Gamma), making simple frequency-based filtering ineffective as it would remove crucial neural information [19] [21].
Spectral Peaks and Variation: The power spectrum of EMG artifacts does not peak in the standard EEG bands. Instead, different muscle groups have distinct spectral signatures. For instance, frontalis muscle activity can peak around 25 Hz, while temporalis activity has a lower peak near 20 Hz and a broad plateau between 40-80 Hz [19]. Furthermore, the spectral composition is not static; it can vary with factors such as contraction intensity and muscle fatigue [19].

Spatially, myogenic artifacts are exceptionally widespread. Due to volume conduction, activity from muscles across the head, face, and neck can be detected anywhere on the scalp [19]. This lack of a focal point, combined with significant individual differences in the topographic and spectral profile of myogenic activity, complicates the use of canonical spatial templates for their removal [19].

Table 1: Key Characteristics of EMG Artifacts from Pericranial Muscles

Characteristic	Description	Research Implication
Frequency Range	Broadband, from ~2 Hz to >200 Hz [19] [21] [22].	Overlaps with all neural oscillations; renders basic band-pass filtering destructive.
Spectral Peaks	Varies by muscle group (e.g., ~20 Hz for temporalis, ~25 Hz for frontalis) [19].	Requires sophisticated, adaptive denoising methods tailored to the contaminating source.
Amplitude	High amplitude, often much larger than underlying EEG [19] [18].	Can easily swamp genuine neurogenic signals and create spurious effects.
Spatial Distribution	Widespread across the entire scalp due to volume conduction [19].	Not confined to specific electrode locations; requires whole-scalp analysis.
Sensitivity to Context	Sensitive to cognitive load, emotional state, and performance motivation [19].	Creates confounds where changes in neurogenic and myogenic activity are correlated.

Contrasting EMG and EOG Artifacts

Framing EMG characteristics within the broader EOG vs. EMG research context clarifies their distinct challenges.

Table 2: Contrasting EMG and EOG Artifacts in EEG Recordings

Property	EMG Artifact	EOG Artifact
Origin	Contraction of head, face, jaw, and neck muscles [18].	Eye blinks and movements (cornea-retinal dipole) [18] [21].
Primary Frequency	Broadband (0 to >200 Hz); high-frequency content [19] [21].	Primarily low-frequency (0.5-12 Hz) [15].
Typical Morphology	Short-duration, spike-like motor unit potentials [18] [22].	Slow, large-amplitude deflections [18].
Primary Scalp Topography	Widespread; can be focal (e.g., temporalis) or general [19].	Primarily frontal and frontopolar [18].
Removal Complexity	High; lacks stereotypy and requires advanced methods [19] [21].	Moderate; more stereotyped, allowing regression or template-based removal [21].

Experimental Methodologies for EMG Artifact Analysis

Investigating the nature and impact of EMG artifacts requires robust experimental protocols. The following methodologies are foundational to the field.

Protocol 1 is designed to collect clean EMG artifact data, either for direct analysis or for building semi-synthetic datasets.

Participant Preparation: Apply standard EEG preparation procedures (skin abrasion, cleaning) to minimize electrode impedance and non-physiological noise [23]. For a comprehensive study, additionally place surface EMG electrodes over target muscles (e.g., frontalis, temporalis, sternocleidomastoid) following SENIAM recommendations to obtain reference signals.
Artifact Elicitation Tasks: Instruct participants to perform a series of controlled, isolated muscle contractions. Standard tasks include:
- Jaw Clenching/Tightening: To elicit artifacts from temporalis and masseter muscles [18] [22].
- Frowning/Raising Eyebrows: To activate the frontalis muscle.
- Swallowing or Jaw Grinding: To simulate common, involuntary artifacts.
- Head Turning Against Resistance: To engage neck muscles (e.g., sternocleidomastoid) [19].
- Each task should be performed for short, timed epochs (e.g., 5-10 seconds) with adequate rest between trials to avoid fatigue.
Data Acquisition: Simultaneously record high-density EEG (64+ channels is ideal) and reference EMG signals. A high sampling rate (≥1000 Hz) is necessary to accurately capture the high-frequency content of the EMG without aliasing. Synchronize all data streams.
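The sampling-rate requirement follows from the Nyquist criterion and can be checked with a small calculation. The helper `alias_frequency` is a hypothetical name, and the example deliberately ignores the anti-aliasing filter that a real amplifier would apply before digitization.

```python
def alias_frequency(f_signal, fs):
    """Apparent frequency (Hz) of a sinusoid at f_signal Hz after sampling at fs Hz."""
    return abs(f_signal - fs * round(f_signal / fs))


# A 180 Hz EMG component sampled at 250 Hz folds back into the EEG range.
print(alias_frequency(180, 250))    # 70
# Sampled at 1000 Hz, the same component is represented at its true frequency.
print(alias_frequency(180, 1000))   # 180
```

In the first case the muscle activity would appear as spurious 70 Hz power in the gamma band, indistinguishable after the fact from activity that was really at that frequency.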

Protocol 2: Creating Semi-Synthetic Contaminated EEG

This method allows for quantitative validation of artifact removal algorithms by combining clean EEG with known artifacts [24].

Source Signal Acquisition:
- Obtain "clean" EEG data from a resting-state paradigm with eyes open, or use publicly available datasets.
- Obtain "pure" EMG artifact data from Protocol 1.
Signal Preprocessing: Band-pass filter the clean EEG and pure EMG signals to the same frequency range (e.g., 1-100 Hz). Ensure both signals are mean-centered.
Linear Mixing: Generate a contaminated signal s_contaminated by linearly mixing the clean EEG signal s_EEG with the pure EMG signal s_EMG at a specific Signal-to-Noise Ratio (SNR):
- s_contaminated = s_EEG + γ * s_EMG
- The scaling factor γ is calculated based on the desired SNR (in dB) and the powers of the source signals. This creates a ground-truth dataset where the clean EEG is perfectly known, enabling precise performance metrics for artifact removal algorithms.
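One common way to obtain γ, assumed here, is to define the SNR as the ratio of clean-EEG power to scaled-artifact power, SNR_dB = 10 log10(P_EEG / (γ² P_EMG)), which gives γ = sqrt(P_EEG / (P_EMG · 10^(SNR_dB/10))). The function name `mix_at_snr` and the random stand-in signals are illustrative.

```python
import numpy as np


def mix_at_snr(s_eeg, s_emg, snr_db):
    """Return s_EEG + gamma * s_EMG with gamma chosen to hit the requested SNR (dB)."""
    s_eeg = s_eeg - s_eeg.mean()          # mean-centre both signals first
    s_emg = s_emg - s_emg.mean()
    p_eeg = np.mean(s_eeg ** 2)
    p_emg = np.mean(s_emg ** 2)
    gamma = np.sqrt(p_eeg / (p_emg * 10 ** (snr_db / 10)))
    return s_eeg + gamma * s_emg, gamma


rng = np.random.default_rng(4)
s_eeg = rng.standard_normal(2000).cumsum()      # stand-in for a clean EEG epoch
s_emg = rng.standard_normal(2000) * 3.0         # stand-in for a pure EMG epoch
s_contaminated, gamma = mix_at_snr(s_eeg, s_emg, snr_db=-3.0)
```

Negative SNR values model the realistic case in which the muscle artifact carries more power than the underlying EEG. Sweeping `snr_db` over a range produces a graded benchmark from mild to severe contamination.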

Protocol 3: Quantitative Analysis of Artifact Impact

This protocol assesses how EMG artifacts distort standard EEG analysis metrics.

Feature Extraction: From both clean and contaminated EEG epochs (from Protocol 2), extract standard features:
- Time-Domain: Root Mean Square (RMS), Mean Absolute Value (MAV).
- Frequency-Domain: Power Spectral Density (PSD) across standard bands (Delta, Theta, Alpha, Beta, Gamma), Median Frequency.
- Time-Frequency-Domain: Spectrograms using methods like Wavelet Transform.
Comparative Statistics: Perform paired statistical tests (e.g., paired t-tests) to compare the features extracted from clean vs. contaminated conditions. The significant inflation of power, particularly in Beta and Gamma bands, is a key expected outcome demonstrating the artifactual effect.
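The feature-extraction and comparison steps of this protocol might look like the sketch below. It is an illustration under stated assumptions: the band edges are one common convention, band power comes from a plain periodogram rather than a smoothed estimator, and the "EMG" is simulated as differenced white noise so that its power rises with frequency. The helper name `band_powers` is hypothetical.

```python
import numpy as np
from scipy import stats

# Band edges in Hz; exact boundaries vary between labs (assumed here).
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 100)}

def band_powers(epochs, fs):
    """Mean periodogram power per band; epochs is (n_epochs, n_samples)."""
    n = epochs.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(epochs, axis=1)) ** 2 / (fs * n)
    return {name: psd[:, (freqs >= lo) & (freqs < hi)].mean(axis=1)
            for name, (lo, hi) in BANDS.items()}

fs = 500
rng = np.random.default_rng(1)
clean = rng.standard_normal((40, 2 * fs))                  # toy clean epochs
# Toy EMG: differenced white noise, i.e. power rising with frequency.
emg = np.diff(rng.standard_normal((40, 2 * fs + 1)), axis=1)
contaminated = clean + 2.0 * emg

p_clean = band_powers(clean, fs)
p_cont = band_powers(contaminated, fs)

# Paired t-test per band: same epochs, clean vs. contaminated.
results = {name: stats.ttest_rel(p_cont[name], p_clean[name]) for name in BANDS}
for name, res in results.items():
    print(f"{name:>5}: t = {res.statistic:6.2f}, p = {res.pvalue:.3g}")
```

With real data, band power is often heavily skewed, so a log transform or a non-parametric paired test may be more appropriate than the raw t-test used here.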

Visualization of Analysis Workflows

The following diagrams illustrate the logical flow of key experimental and processing pipelines described in this guide.

Diagram 1: Experimental Protocol for EMG Artifact Study

Diagram 2: Semi-Synthetic Dataset Creation & Validation

The Researcher's Toolkit: Essential Reagents & Materials

Successful research into EMG artifacts relies on a suite of methodological tools and computational solutions.

Table 3: Key Research Reagent Solutions for EMG Artifact Studies

Category / Item	Specific Examples	Function & Application
Data Acquisition	High-density EEG systems (e.g., 64+ channels); Bipolar surface EMG electrodes	Captures the spatial distribution of artifacts and provides reference signals for method validation [19].
Signal Decomposition Algorithms	Fixed Frequency Empirical Wavelet Transform (FF-EWT) [15]; Empirical Mode Decomposition (EMD) [15]; Independent Component Analysis (ICA) [19] [21]	Decomposes the single- or multi-channel signal into constituent components for identification and isolation of artifactual sources.
Advanced Filtering & Denoising	Generalized Moreau Envelope Total Variation (GMETV) filter [15]; Wavelet Denoising; Wiener Filtering	Removes or suppresses identified artifact components while aiming to preserve the underlying neural signal.
Deep Learning Architectures	CLEnet (CNN + LSTM) [24]; 1D-ResCNN [24]; Transformer-based models (EEGDNet) [24]	Provides end-to-end, automated artifact removal from contaminated EEG signals, capable of handling unknown artifacts and multi-channel data.
Validation Metrics	Correlation Coefficient (CC) [15] [24]; Signal-to-Artifact Ratio (SAR) [15]; Relative Root Mean Square Error (RRMSE) [15] [24]	Quantifies the performance of artifact removal algorithms in terms of signal fidelity and error.

The EMG artifacts generated by facial, jaw, and neck muscles represent a significant and complex challenge in electrophysiological research. Their broadband spectral profile, lack of stereotypy, and sensitivity to psychologically relevant states necessitate a move beyond simple filtering techniques. A sophisticated toolkit, combining advanced signal decomposition, targeted filtering, and emerging deep learning approaches, is required to mitigate their pervasive effects. As research moves increasingly into real-world settings with wearable EEG, the development of robust, automated, and computationally efficient methods for handling these myogenic artifacts will be paramount for ensuring the validity and reliability of neural data interpretation in both scientific and clinical domains.

Electrooculogram (EOG) and electromyogram (EMG) artifacts represent two of the most pervasive and disruptive sources of contamination in electroencephalography (EEG) signals. Their presence significantly compromises signal integrity across diverse applications, from clinical diagnostics to brain-computer interface (BCI) systems and cognitive neuroscience research [25] [20]. The challenge of distinguishing these artifacts from neural signals of interest is compounded by their overlapping characteristics and the uncontrolled conditions of modern wearable EEG systems [20]. Within this context, a precise understanding of the temporal, spectral, and spatial signatures of EOG and EMG artifacts is not merely an academic exercise but a fundamental prerequisite for developing effective artifact mitigation strategies. This whitepaper provides an in-depth comparative analysis of these signatures, equipping researchers and drug development professionals with the knowledge necessary to enhance the reliability of EEG-based analyses.

Fundamental Characteristics and Underlying Physiology

The fundamental differences between EOG and EMG artifacts stem from their distinct physiological origins. EOG artifacts arise from the movement of the eyeball, which acts as an electric dipole between the positively charged cornea and negatively charged retina. Eye movements and blinks cause a shift in this dipole, generating a potential difference that is measured as a slow, high-amplitude deflection on scalp EEG recordings [26] [7]. In contrast, EMG artifacts are generated by the electrical activity of muscle fibers during contraction. These artifacts result from the summation of action potentials from numerous motor units, producing a high-frequency, burst-like signal that can contaminate a wide range of scalp locations, particularly from the head, neck, and jaw muscles [25] [27].

The table below summarizes the core physiological mechanisms and primary sources of these artifacts.

Table 1: Physiological Origins of EOG and EMG Artifacts

Feature	EOG Artifact	EMG Artifact
Biological Source	Corneo-retinal dipole (eye)	Motor unit action potentials (muscle)
Primary Trigger	Eye blinks, saccades, smooth pursuit movements	Head, jaw, neck, and facial muscle contractions
Typical Contamination Area	Predominantly frontal and prefrontal electrodes	Widespread, but often temporal, frontal, and occipital regions

Figure 1: Physiological Origins of EEG Artifacts

Comparative Signatures: A Multi-Domain Analysis

A robust differentiation between EOG and EMG artifacts requires analysis across temporal, spectral, and spatial domains. Each domain offers distinct, complementary signatures that can be leveraged for artifact identification and removal.

Temporal Domain Signatures

In the time domain, EOG and EMG artifacts exhibit markedly different morphologies and amplitudes. EOG artifacts, particularly from blinks, are characterized by slow, monophasic or biphasic deflections with a high amplitude that can be an order of magnitude greater than background EEG [26] [7]. These deflections are typically smooth and have a characteristic duration that matches the blink or eye movement. Conversely, EMG artifacts manifest as rapid, spiky, and burst-like activity. Their amplitude is highly variable but generally lower than that of a full blink, and their morphology is irregular and non-stationary, reflecting the firing patterns of underlying motor units [25] [27].

Spectral Domain Signatures

Spectral analysis provides one of the most reliable means for distinguishing these artifacts. The power spectral density (PSD) of each artifact reveals its core frequency characteristics.

Table 2: Spectral and Spatial Characteristics of EOG and EMG Artifacts

Feature	EOG Artifact	EMG Artifact
Dominant Spectral Range	Low-frequency (< 4 Hz) [28]	Broadband, mid-to-high frequency (20-200 Hz+) [28]
Power Spectral Density (PSD) Shape	Peaks in low-frequency range; steep roll-off [26]	"White" spectral profile; flatter, more uniform power distribution across frequencies [27] [29]
Primary Spatial Topography	Maximum amplitude over frontal regions; amplitude decreases with distance from eyes [26]	Highly localized or diffuse depending on muscle source; common in temporal, frontal, and neck regions [30] [27]
Spatial Propagation	Widespread due to volume conduction, but strongest anteriorly [7]	Can be focal or widespread; does not follow a simple distance decay from a single source [27]

EOG artifacts are predominantly low-frequency phenomena. Their power is concentrated below 4 Hz, with a steep roll-off at higher frequencies [28]. This concentration arises because the EOG signal is generated by the relatively slow movement of the entire eyeball. EMG artifacts, in contrast, exhibit a broadband spectral profile concentrated in the mid-to-high frequency range (20 Hz to over 200 Hz) [28]. Studies of spatial spectra have shown that EMG power in the temporal PSD is more "white" than the 1/f^α scaling typically seen in genuine EEG, meaning it has a more uniform power distribution across a wide frequency band [27] [29]. This makes EMG a primary confound for studies investigating beta and gamma band neural oscillations.
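The spectral contrast described above can be made concrete with a single summary number: the fraction of spectral power below 4 Hz. The sketch below is a toy illustration, not a validated detector. The blink is modeled as a Gaussian bump and the EMG as white noise, both of which are simplifying assumptions, and `low_freq_fraction` is a hypothetical helper.

```python
import numpy as np

def low_freq_fraction(x, fs, cutoff_hz=4.0):
    """Fraction of (non-DC) spectral power that lies below cutoff_hz."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2
    keep = freqs > 0
    return power[keep & (freqs < cutoff_hz)].sum() / power[keep].sum()

fs = 500
t = np.arange(0, 10, 1.0 / fs)
rng = np.random.default_rng(2)

# Toy blink: one smooth, high-amplitude bump about 0.4 s wide.
blink_like = 100.0 * np.exp(-0.5 * ((t - 5.0) / 0.1) ** 2)
# Toy EMG: white noise as an idealized flat-spectrum signal.
emg_like = 20.0 * rng.standard_normal(t.size)

frac_eog = low_freq_fraction(blink_like, fs)
frac_emg = low_freq_fraction(emg_like, fs)
print(f"power below 4 Hz: blink-like {frac_eog:.3f}, EMG-like {frac_emg:.3f}")
```

On real recordings the two values will be less extreme, because background EEG contributes its own low-frequency power to every segment.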

Spatial Domain Signatures

The spatial distribution, or topography, of an artifact on the scalp is a direct consequence of its source origin and volume conduction. EOG artifacts have a characteristic and relatively stable topography, with the highest amplitude recorded over the frontal and prefrontal electrodes closest to the eyes [26]. The amplitude attenuates systematically as a function of distance from the ocular source, though volume conduction ensures the artifact is visible across most of the scalp [7]. The topography of EMG artifacts is more variable and complex. It depends entirely on the specific muscle group active. For example, temporalis muscle tension produces focal artifacts over the temporal lobes, while frontalis muscle activity contaminates frontal channels, and neck muscle contractions can affect occipital electrodes [30] [27]. This lack of a single, predictable spatial pattern makes EMG particularly challenging to identify based on topography alone.

Experimental Protocols for Characterization and Removal

Accurate characterization of artifacts is a critical first step toward their removal. This section outlines standard and emerging experimental methodologies.

Data Acquisition and Preprocessing

High-quality data acquisition is paramount. While traditional research uses high-density EEG arrays (e.g., 64 channels) with wet electrodes to precisely map spatial topographies [27] [29], the rise of wearable EEG has shifted focus to systems with fewer channels (often ≤16) and dry electrodes [20]. For definitive artifact identification, it is considered best practice to record simultaneous reference EOG and EMG signals. Horizontal and vertical EOG channels are placed around the eyes, while EMG references can be placed on the neck, jaw, or temple muscles [26] [7]. Preprocessing typically involves band-pass filtering (e.g., 0.5-100 Hz) and notch filtering (50/60 Hz) to remove line noise.
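A minimal version of this preprocessing step, using SciPy, is sketched below. The filter order, the notch quality factor, and the choice of zero-phase filtering are assumptions for illustration, as are the helper names `preprocess` and `amplitude_at`; the line frequency should be set to 50 or 60 Hz depending on region.

```python
import numpy as np
from scipy import signal

def preprocess(x, fs, band=(0.5, 100.0), line_hz=50.0, q=30.0):
    """Zero-phase band-pass followed by a zero-phase notch at line_hz."""
    sos = signal.butter(4, band, btype="bandpass", fs=fs, output="sos")
    x = signal.sosfiltfilt(sos, x, axis=-1)
    b, a = signal.iirnotch(line_hz, q, fs=fs)
    return signal.filtfilt(b, a, x, axis=-1)

def amplitude_at(x, fs, f_hz):
    """Single-sided FFT amplitude at the bin closest to f_hz."""
    spectrum = np.abs(np.fft.rfft(x)) * 2.0 / x.size
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - f_hz))]

fs = 1000
t = np.arange(0, 10, 1.0 / fs)
# Toy signal: a 10 Hz "alpha" rhythm plus stronger 50 Hz line noise.
raw = np.sin(2 * np.pi * 10 * t) + 2.0 * np.sin(2 * np.pi * 50 * t)
filtered = preprocess(raw, fs)

amp_10 = amplitude_at(filtered, fs, 10.0)
amp_50 = amplitude_at(filtered, fs, 50.0)
print(f"10 Hz amplitude: {amp_10:.2f}, 50 Hz amplitude: {amp_50:.3f}")
```

Note that a 100 Hz upper cutoff discards much of the EMG spectrum, which is acceptable for cleaning but not for studies that aim to characterize the artifact itself.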

Core Methodologies for Analysis and Removal

ICA is a cornerstone technique for artifact removal. It decomposes multi-channel EEG data into statistically independent components (ICs) [26] [7]. The underlying assumption is that artifacts and neural signals originate from statistically independent sources.

Figure 2: ICA-Based Artifact Removal Workflow

Experimental Protocol for ICA:

Decomposition: Apply an ICA algorithm (e.g., Infomax, FastICA) to the multi-channel EEG data to obtain a set of ICs and a corresponding unmixing matrix.
Component Classification: Identify artifact-laden ICs. This can be done manually by inspecting component topography and time-course, or automatically by calculating correlation coefficients with reference EOG/EMG channels [7] or using machine learning classifiers that detect specific temporal or spatial features [20].
Artifact Removal: Two primary approaches exist:
- Component Rejection: The entire artifact IC is set to zero before signal reconstruction. This is effective but risks losing neural data present in the same component [7].
- Component Correction: Only the artifactual sections within an IC are corrected. This is often achieved by applying a secondary method, like wavelet transformation, to identify and remove artifact peaks within the IC while preserving neural segments [7].
Reconstruction: The remaining (and/or corrected) ICs are projected back to the sensor space using the mixing matrix, resulting in a cleaned EEG signal.
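The classification, rejection, and reconstruction steps can be sketched with plain NumPy. One loud assumption: the decomposition step is not performed here. The unmixing matrix `W` would normally be estimated by an ICA algorithm such as Infomax or FastICA; in this toy example the pseudo-inverse of a known synthetic mixing matrix stands in for it, so that the rest of the pipeline can be checked against ground truth. The 0.8 correlation threshold is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(3)
fs, n = 250, 5000
t = np.arange(n) / fs

# Three toy sources: two "neural", one blink train (Gaussian bumps every 2 s).
neural_1 = np.sin(2 * np.pi * 10 * t)
neural_2 = rng.standard_normal(n)
blinks = sum(np.exp(-0.5 * ((t - c) / 0.1) ** 2) for c in np.arange(1, 20, 2))
S = np.vstack([neural_1, neural_2, 50.0 * blinks])

A_true = rng.standard_normal((4, 3))       # toy mixing into 4 channels
X = A_true @ S
eog_ref = S[2] + rng.standard_normal(n)    # noisy reference EOG channel

# ASSUMPTION: W would normally come from an ICA algorithm (Infomax, FastICA).
# Here the pseudo-inverse of the known mixing matrix stands in for it.
W = np.linalg.pinv(A_true)                 # unmixing matrix, shape (3, 4)
A = np.linalg.pinv(W)                      # mixing matrix for back-projection

ics = W @ X                                # independent component time courses

# Component classification: correlate each IC with the reference EOG channel.
r = np.array([abs(np.corrcoef(ic, eog_ref)[0, 1]) for ic in ics])
artifact_idx = np.flatnonzero(r > 0.8)     # 0.8 is an illustrative threshold

# Component rejection: zero the flagged ICs, then project back to sensors.
ics_clean = ics.copy()
ics_clean[artifact_idx] = 0.0
X_clean = A @ ics_clean
print("rejected components:", artifact_idx.tolist())
```

Component correction would replace the zeroing line with a step that modifies only the artifactual segments of the flagged components before back-projection.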

Deep Learning-Based Approaches

Recent advances have introduced deep learning models that perform end-to-end artifact removal. These models learn the complex, non-linear mappings from contaminated EEG to clean EEG.

Table 3: The Researcher's Toolkit: Key Materials and Algorithms

Tool/Reagent	Function/Description	Utility in EOG/EMG Research
High-Density EEG Array	A 64+ channel electrode system with close spacing (e.g., 3 mm to 1 cm) [27].	Enables detailed spatial spectral analysis and optimal source separation via ICA.
Reference EOG/EMG Electrodes	Dedicated sensors placed to record pure artifact signals.	Provides ground truth for validating artifact detection and removal algorithms [7].
Independent Component Analysis (ICA)	A blind source separation algorithm.	The gold standard for isolating and removing artifact components from multi-channel data [26] [7].
Wavelet Transform	A mathematical tool for analyzing transient, non-stationary signals.	Used to identify and remove sharp, transient artifacts like blinks and EMG bursts within ICA components [7].
Long Short-Term Memory (LSTM) Network	A type of recurrent neural network (RNN).	Models temporal dependencies in EEG data; effective for estimating EOG signals from contaminated EEG [26] [8].
Convolutional Neural Network (CNN)	A neural network designed for spatial feature extraction.	Extracts morphological features from EEG; effective in distinguishing signal from artifact [28] [8].
Artifact-Aware Denoising Model (A²DM)	A unified DL framework that uses artifact representation as prior knowledge [28].	Dynamically removes multiple interleaved artifact types (EOG+EMG) by leveraging their frequency-domain signatures.

Experimental Protocol for Deep Learning:

Dataset Preparation: Use a semi-synthetic benchmark dataset like EEGdenoiseNet [28] [8], where clean EEG is artificially contaminated with recorded EOG and EMG at known signal-to-noise ratios. This provides a ground truth for model training and evaluation.
Model Architecture & Training: Design a neural network suitable for time-series data. Popular architectures include:
- Hybrid CNN-LSTM Networks (e.g., CLEnet): CNNs extract local morphological features, while LSTMs capture long-range temporal dependencies [8].
- Artifact-Aware Models (e.g., A²DM): These models first identify the type of artifact present and then apply a targeted removal strategy, such as a hard attention mechanism in the frequency domain to filter out artifact-specific components [28].
Model Evaluation: The performance of the denoising model is quantified using metrics such as the Correlation Coefficient (CC), Signal-to-Noise Ratio (SNR), and Root Mean Square Error (RMSE) between the cleaned signal and the ground-truth clean EEG [28] [8].
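The evaluation metrics named above are simple to compute once a ground-truth signal is available. The definitions below are common conventions rather than the exact formulas of any cited study: RRMSE is normalized by the RMS of the clean signal, and SNR is the ratio of clean-signal power to residual-error power. The test signals are synthetic placeholders.

```python
import numpy as np

def correlation_coefficient(clean, denoised):
    """Pearson correlation between ground truth and denoised output."""
    return float(np.corrcoef(clean, denoised)[0, 1])

def rmse(clean, denoised):
    """Root mean square error in the units of the signal."""
    return float(np.sqrt(np.mean((denoised - clean) ** 2)))

def rrmse(clean, denoised):
    """RMSE relative to the RMS of the ground-truth signal."""
    return rmse(clean, denoised) / float(np.sqrt(np.mean(clean ** 2)))

def snr_db(clean, denoised):
    """Output SNR: clean-signal power over residual-error power, in dB."""
    residual_power = np.mean((denoised - clean) ** 2)
    return float(10 * np.log10(np.mean(clean ** 2) / residual_power))

rng = np.random.default_rng(4)
clean = np.sin(2 * np.pi * 10 * np.arange(1000) / 250)     # toy ground truth
denoised = clean + 0.1 * rng.standard_normal(1000)         # toy model output

cc = correlation_coefficient(clean, denoised)
rel_err = rrmse(clean, denoised)
snr = snr_db(clean, denoised)
print(f"CC = {cc:.3f}, RRMSE = {rel_err:.3f}, SNR = {snr:.1f} dB")
```

Reporting these metrics separately in the time domain and on the power spectrum helps reveal methods that preserve waveform shape but distort band power, or the reverse.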

Discussion and Future Directions

The comparative analysis underscores that while EOG artifacts are relatively predictable in their low-frequency and frontal-dominant nature, EMG artifacts present a more formidable challenge due to their broadband spectral characteristics and variable spatial topography. Traditional methods like ICA are powerful but can be compromised in low-channel count wearable systems and often require manual intervention [20]. The emergence of deep learning signifies a paradigm shift towards automated, robust, and unified artifact removal systems. Models like A²DM, which leverage prior knowledge of artifact type to inform the denoising process, represent the cutting edge [28]. Future research must focus on developing methods that are adaptable to the unique constraints of wearable EEG, including limited channels and the presence of unknown or complex artifact mixtures encountered in real-world settings [20] [8]. Furthermore, the creation of standardized, publicly available datasets of real-world artifacts will be crucial for benchmarking and advancing the field.

Electroencephalography (EEG) provides a non-invasive window into brain dynamics, serving as a critical tool in clinical diagnostics, cognitive neuroscience, and neuropharmacology. However, the accurate interpretation of neural signals is perpetually threatened by contamination from extra-cerebral sources, primarily electrooculographic (EOG) and electromyographic (EMG) artifacts. These artifacts represent a significant challenge to inferential validity, particularly as EEG applications expand into real-world, wearable monitoring and brain-computer interfaces. EOG artifacts, generated by eye movements and blinks, introduce high-amplitude, low-frequency deflections that dominate frontal EEG channels. EMG artifacts, originating from cranial muscle activity, exhibit high-amplitude, broad-spectrum properties that can masquerade as genuine neural effects. Within the context of EOG vs. EMG artifacts research, this assessment delineates the mechanisms by which these artifacts obscure neural signals, quantifies their impact on data interpretation, and evaluates contemporary mitigation strategies to preserve data integrity in both academic and applied settings.

The Nature and Scope of the Problem

Fundamental Characteristics of EOG and EMG Artifacts

EOG and EMG artifacts possess distinct physiological origins and signal properties, necessitating differentiated approaches for their identification and correction.

EOG Artifacts: These arise from movement of the corneo-retinal dipole and from eyelid activity. The eyeball effectively acts as an electric dipole whose orientation aligns with the line of sight [13]. The resulting potential shifts are volume-conducted across the scalp, presenting as large-amplitude, slow-wave deflections most prominent over the frontal lobes. Their morphology differs between blinks (typically broader) and saccades (sharper), but both are generated by the same underlying dipole mechanism [13]. The spectral content of EOG artifacts predominantly overlaps the critical low-frequency EEG bands (delta and theta), complicating simple filtering approaches [7].
EMG Artifacts: These originate from the electrical activity of spatially distributed cranial muscle groups (e.g., frontalis, temporalis, masseter). A key challenge is that even weak EMG is detectable across the entire scalp due to volume conduction [31]. Its amplitude can be 1-2 orders of magnitude larger than mean EEG differences (75–400 µV vs. <10 µV), and its spectral profile is exceptionally broad, contaminating frequencies from the alpha band (8–13 Hz) upwards to high gamma [31]. Unlike the more stereotyped EOG, EMG exhibits substantial variability in its spatial and spectral signature across individuals and muscle groups, rendering canonical filters ineffective [31].

Comparative Impact on Data Interpretation

The following table summarizes the core characteristics and primary risks these artifacts pose to neural signal interpretation.

Table 1: Comparative Analysis of EOG and EMG Artifact Impact

Characteristic	EOG Artifacts	EMG Artifacts
Spectral Overlap	Dominates low frequencies (Delta, Theta) [15]	Broad spectrum, contaminating Alpha to Gamma bands [31]
Spatial Topography	Maximal over frontal electrodes [13]	Widespread across scalp, variable by muscle group [31]
Amplitude	High-amplitude, slow deflections [7]	Very high amplitude (75-400 µV) [31]
Primary Statistical Risk	Reduced signal-to-noise ratio, obscuring ERPs [32]	Inferential validity risk; can masquerade as genuine spectral effects [31]
Confounding with Cognitive Processes	Blinks may be time-locked to stimuli or cognitive load [33]	Facial EMG is sensitive to cognitive/affective processes (e.g., stress, effort) [31]

Quantitative Impact on Decoding and Analysis

The presence of artifacts directly compromises the performance of advanced analytical techniques, including multivariate pattern analysis (MVPA) and machine learning models used for decoding neural states.

A comprehensive 2025 study assessed the impact of artifact correction on support vector machine (SVM) and linear discriminant analysis (LDA)-based decoding across seven common event-related potential paradigms. The key finding was that the combination of artifact correction and rejection did not significantly enhance decoding performance in the vast majority of cases [32]. This suggests that decoders can, to some extent, learn to ignore consistent artifact patterns. However, the study strongly recommended retaining artifact correction, specifically through methods like Independent Component Analysis (ICA), to minimize the risk of artifact-related confounds artificially inflating decoding accuracy, which would lead to incorrect conclusions about neural representations [32].

The quantitative impact of artifact removal can be visualized through the following experimental workflow, which outlines a standard pipeline for validation.

Methodologies for Artifact Detection and Correction

A wide array of experimental protocols and signal processing techniques have been developed to combat EOG and EMG artifacts. These range from classical statistical approaches to modern deep learning models.

Classical and Component-Based Methods

Regression-Based Methods: This classical approach models EOG artifacts as a linear combination of recorded EOG channels. The method calculates weighting coefficients (e.g., via least-squares estimation) to subtract the EOG contribution from each EEG channel [13]. While simple and robust, a primary criticism is that it may oversubtract and remove genuine neural activity that is correlated with the EOG reference. Studies have shown it can reduce EOG artifacts by up to 80% in a fully automated manner [13].
Independent Component Analysis (ICA): ICA is a blind source separation technique that decomposes multi-channel EEG into statistically independent components. Artifactual components (identified by their temporal, spectral, and spatial features) can be removed before signal reconstruction [31] [7]. A significant limitation is that removing entire components risks losing neural information contained within them. Validation tests reveal that ICA does not represent a panacea for EMG contamination, though it remains a highly popular and effective tool for EOG removal [31].
Wavelet-Enhanced ICA (wICA): This hybrid method improves upon standard ICA by applying a discrete wavelet transform to the artifact-laden independent components. High-amplitude coefficients corresponding to artifacts (e.g., EOG peaks) are thresholded and zeroed out, allowing for the retention of the non-artifactual parts of the component before reconstruction [7]. An improved, fully automatic wavelet-based method corrects EOG components selectively within EOG activity regions only, leaving other parts of the component untouched and better preserving neural data [7].
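The regression-based method at the top of this list reduces to one least-squares fit per recording. The sketch below is a minimal illustration on synthetic data, not the automated implementation cited in [13]; the function name `regress_out_eog` and the toy propagation weights are hypothetical.

```python
import numpy as np

def regress_out_eog(eeg, eog):
    """Subtract the least-squares EOG contribution from each EEG channel.

    eeg: (n_channels, n_samples); eog: (n_eog, n_samples).
    Returns (cleaned_eeg, weights) with weights shaped (n_channels, n_eog).
    """
    eeg = eeg - eeg.mean(axis=1, keepdims=True)
    eog = eog - eog.mean(axis=1, keepdims=True)
    # Solve eog.T @ coef ~= eeg.T for coef, shape (n_eog, n_channels).
    coef, *_ = np.linalg.lstsq(eog.T, eeg.T, rcond=None)
    return eeg - coef.T @ eog, coef.T

rng = np.random.default_rng(5)
n = 5000
neural = rng.standard_normal((3, n))            # toy neural activity
eog = 20.0 * rng.standard_normal((2, n))        # toy vertical/horizontal EOG
true_w = np.array([[0.8, 0.3], [0.4, 0.1], [0.1, 0.05]])
eeg = neural + true_w @ eog

cleaned, est_w = regress_out_eog(eeg, eog)
print("estimated weights:\n", np.round(est_w, 3))
```

In this toy case the EOG channels contain no brain activity, so the fit is nearly exact. Real EOG electrodes also pick up frontal EEG, which is the source of the oversubtraction criticism noted above.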

Modern Data-Driven and Adaptive Methods

General Linear Model (GLM) for EMG: Intra-individual GLM-based methods have been validated for correcting ongoing or induced (but not evoked) spectral changes caused by EMG. This technique uses regression to remove variance in a neurogenic band (e.g., alpha) predicted by activity in a high-frequency EMG band (e.g., 70–80 Hz) [31]. It has been shown to be sensitive and specific for sensor-level data, though it struggles with source-localized data [31].
Deep Learning Approaches (LSTM & BiLSTM): Emerging methods combine Long Short-Term Memory (LSTM) networks with ICA to estimate EOG signals directly from contaminated EEG, even when no EOG reference is available [26]. The LSTM network is trained to learn the features of EOG artifacts, which are then separated via ICA. Other models, like a Bidirectional LSTM (BiLSTM) with an attention mechanism, are designed to denoise neural signals while preserving the precise shape of spikes, demonstrating high signal-to-noise ratio improvements even under high noise levels [34]. These models are data-driven and require no prior assumptions about signal independence, offering strong robustness [34].
Fixed Frequency Empirical Wavelet Transform (FF-EWT): This recent, automated method for single-channel EEG decomposes the signal and identifies EOG-contaminated components using kurtosis, dispersion entropy, and power spectral density metrics. These components are then cleaned using a finely-tuned Generalized Moreau Envelope Total Variation (GMETV) filter, effectively suppressing artifacts while preserving low-frequency EEG information [15].
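The GLM idea for EMG in the first item of this list can be illustrated at the level of per-epoch band power. The sketch below is a simplified reading of that description, not the validated procedure from [31]: it fits a single linear regressor across epochs for one sensor and removes the EMG-predicted variance while keeping the mean level. All names and the simulated leakage coefficient are hypothetical.

```python
import numpy as np

def glm_emg_correct(neural_band_power, emg_band_power):
    """Remove the part of per-epoch band power linearly predicted by EMG power.

    Both inputs are 1-D arrays with one value per epoch.
    Returns (corrected_power, slope).
    """
    emg_centered = emg_band_power - emg_band_power.mean()
    design = np.column_stack([np.ones_like(emg_centered), emg_centered])
    beta, *_ = np.linalg.lstsq(design, neural_band_power, rcond=None)
    # Subtract only the EMG-related term; the intercept (mean level) is kept.
    corrected = neural_band_power - beta[1] * emg_centered
    return corrected, beta[1]

rng = np.random.default_rng(6)
n_epochs = 200
emg_power = rng.gamma(shape=2.0, scale=1.0, size=n_epochs)   # toy 70-80 Hz power
true_alpha = 10.0 + rng.standard_normal(n_epochs)            # toy neural alpha power
observed_alpha = true_alpha + 0.5 * emg_power                # EMG leaks into alpha

corrected_alpha, slope = glm_emg_correct(observed_alpha, emg_power)
residual_r = np.corrcoef(corrected_alpha, emg_power)[0, 1]
print(f"estimated slope = {slope:.3f}, residual correlation = {residual_r:.2e}")
```

Because the correction removes everything that covaries with high-frequency power, any genuine neural effect that happens to covary with muscle tension is removed as well.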

Table 2: Performance Metrics of Various Artifact Removal Techniques

Method	Best For	Key Performance Metrics	Reported Limitations
Regression [13]	EOG with reference channels	80% EOG reduction; Validated via blind expert scoring [13]	May oversubtract cerebral activity
ICA [31] [7]	Multi-channel EOG/EMG	Widely adopted; Good spatial separation	Manual component inspection; Neural data loss in rejected components
wICA [7]	EOG preservation	Outperforms component rejection in time & spectral accuracy	Computational cost; Threshold selection critical
GLM (Intra-individual) [31]	Induced EMG spectral changes	Adequate sensitivity & specificity at sensor level	Fails for evoked activity & source-localized data
LSTM-ICA [26]	EOG without reference	Low MSE/MAE vs. ground truth; Superior to other DL methods	LSTM architecture optimization can be trial-and-error
BiLSTM-Attention [34]	Spike denoising in high noise	Maintains SNR >27 dB; Pearson ~0.91 at high noise [34]	Requires large, high-quality training datasets
FF-EWT+GMETV [15]	Single-channel EOG	High SAR; Low RRMSE & MAE on synthetic & real data	Performance may drop with mode mixing

The Scientist's Toolkit: Essential Research Reagents and Materials

Successful experimentation in this field relies on a suite of specialized tools, from software libraries to hardware configurations.

Table 3: Key Research Reagent Solutions for Artifact Research

Tool / Material	Type	Primary Function	Example/Reference
BioSig Library	Software Library	Open-source signal processing for offline/online artifact correction [13]	Implements automated regression method [13]
High-Density EEG	Hardware	64+ channels improve spatial resolution for source separation (ICA, PCA) [31]	125-channel system used in GLM validation [31]
Dry/Semi-Wet Electrodes	Hardware	Enable wearable, long-term EEG; but can increase artifact prevalence [35]	Common in portable, single-channel systems [15]
Reference EOG/EMG Electrodes	Hardware	Provide dedicated recordings of artifact sources for regression/validation	3 EOG electrodes for vertical, horizontal, radial components [13]
FieldTrip Toolbox	Software Library	MATLAB toolbox for M/EEG analysis, including comprehensive artifact handling [33]	Used for visual/automatic artifact rejection and ICA [33]
MNE-Python	Software Library	Python package for neurophysiology data exploration and artifact detection [36]	Includes tools for automated heartbeat/blink detection & ICA [36]

EOG and EMG artifacts present a multifaceted and persistent threat to the integrity of EEG data interpretation. Their capacity to obscure neural signals stems from their high amplitude, broad spectral contamination, and—most insidiously—their sensitivity to psychological processes, creating latent confounds. While traditional methods like regression and ICA provide substantial mitigation, they are not without limitations, including the risk of neural signal loss and incomplete correction, especially for complex EMG artifacts. The emerging generation of techniques, including wavelet-enhanced ICA, data-driven deep learning models, and adaptive single-channel approaches, offers promising pathways toward more precise and automated artifact removal. As the field moves toward wearable EEG applications in real-world environments, the development and rigorous validation of robust, transparent artifact handling pipelines will be paramount. Ensuring the fidelity of neural signals is not merely a technical pre-processing step but a foundational requirement for valid inference in basic neuroscience and applied drug development.

From Theory to Practice: Advanced Detection and Removal Techniques for EOG and EMG

In the field of neuroscience and biomedical signal processing, the analysis of clean neural signals is paramount for both clinical diagnostics and brain-computer interface (BCI) applications. Electroencephalogram (EEG) signals, which record the brain's electrical activity, are consistently contaminated by physiological artifacts, with electrooculogram (EOG) and electromyogram (EMG) representing the most significant interfering sources [21]. These artifacts originate from eye movements/blinks and muscle activity respectively, and possess amplitudes that can be 10-200 times greater than the underlying brain signals, posing a substantial challenge for accurate signal interpretation [37].

Within this context, Blind Source Separation (BSS) has emerged as a powerful framework for isolating artifacts from desired neural signals without prior knowledge of the mixing process. Among BSS techniques, Independent Component Analysis (ICA) and Principal Component Analysis (PCA) have established enduring roles as fundamental tools for artifact isolation. This whitepaper examines the technical principles, comparative performance, and methodological applications of ICA and PCA within a structured research framework focused on EOG and EMG artifact separation, providing researchers with both theoretical foundations and practical experimental protocols.

Fundamental Principles of BSS in Biomedical Signal Processing

Blind Source Separation operates on the principle that observed biomedical signals are linear mixtures of statistically independent source signals. The fundamental model can be represented as:

X = AS

Where X is the matrix of observed signals from EEG electrodes, A is the unknown mixing matrix determined by volume conduction through head tissues, and S contains the underlying source signals including both neural activity and artifacts [38]. The objective of BSS is to estimate a separating matrix W that approximates the original sources: Ŝ = WX.

PCA contributes to artifact isolation through dimensionality reduction and decorrelation by identifying orthogonal directions of maximum variance in the data. It effectively separates components based on their power, making it particularly useful for removing high-variance artifacts like eye blinks during preliminary processing [38].
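A variance-based removal of this kind can be sketched with a singular value decomposition. The example is a toy illustration: the blink is simulated with much higher variance than the background, which is the situation in which PCA works well, and the function name `pca_remove_top` and the frontal-to-posterior weights are hypothetical.

```python
import numpy as np

def pca_remove_top(X, n_remove=1):
    """Project out the n_remove highest-variance principal components.

    X: (n_channels, n_samples). Returns (X_clean, explained_variance_ratio).
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)   # columns of U: spatial PCs
    ratio = s ** 2 / np.sum(s ** 2)
    U_keep = U[:, n_remove:]                           # orthogonal complement
    return U_keep @ (U_keep.T @ Xc), ratio

rng = np.random.default_rng(7)
fs, n = 250, 5000
t = np.arange(n) / fs
# Toy blink train: Gaussian bumps every 2 s, far larger than background EEG.
blink = 50.0 * sum(np.exp(-0.5 * ((t - c) / 0.1) ** 2) for c in np.arange(1, 20, 2))
weights = np.linspace(1.0, 0.1, 8)                     # toy frontal-to-posterior decay
X = rng.standard_normal((8, n)) + np.outer(weights, blink)

X_clean, ratio = pca_remove_top(X, n_remove=1)
residual_r = max(abs(np.corrcoef(ch, blink)[0, 1]) for ch in X_clean)
print(f"variance explained by PC1: {ratio[0]:.3f}, residual blink correlation: {residual_r:.4f}")
```

The removed component also carries whatever neural activity lies along the same spatial direction, which is why PCA is usually treated as a preliminary step rather than a final correction.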

ICA extends this capability by searching for components that are statistically independent rather than merely uncorrelated. This enables separation of sources with overlapping frequency content but different generative processes—precisely the scenario encountered with EOG/EMG artifacts that overlap with neural signals in the frequency domain [39] [38]. The independence criterion (e.g., maximization of non-Gaussianity through FastICA algorithms) allows ICA to effectively isolate artifactual components based on their distinctive temporal structures and spatial distributions.
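The rationale for using non-Gaussianity as a separation criterion is that a mixture of independent sources tends to be closer to Gaussian than the sources themselves. The short sketch below demonstrates this with excess kurtosis on two synthetic Laplace-distributed sources; it illustrates the principle only and is not an ICA implementation.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis (0 for a Gaussian)."""
    z = (x - np.mean(x)) / np.std(x)
    return float(np.mean(z ** 4) - 3.0)

rng = np.random.default_rng(8)
n = 100_000
s1 = rng.laplace(size=n)                   # toy super-Gaussian sources
s2 = rng.laplace(size=n)
mixture = (s1 + s2) / np.sqrt(2.0)         # equal-weight linear mixture

k_sources = (excess_kurtosis(s1), excess_kurtosis(s2))
k_mixture = excess_kurtosis(mixture)
print(f"source kurtosis: {k_sources[0]:.2f}, {k_sources[1]:.2f}; mixture: {k_mixture:.2f}")
```

An algorithm that rotates the data to maximize a non-Gaussianity measure such as kurtosis therefore moves away from mixtures and toward the underlying sources.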

Comparative Performance Analysis of ICA and PCA for EOG and EMG Isolation

Quantitative Performance Metrics

Table 1: Comparative performance of ICA and PCA for artifact removal based on independent validation studies

Metric	ICA Performance	PCA Performance	Evaluation Context
Overall Artifact Removal Rate	88% (2035 artifacts) [39]	Lower than ICA [38]	Continuous EEG with marked artifacts
Ocular (EOG) Artifact Removal	81% [39]	Incomplete separation [38]	Publicly available EEG datasets
Muscle (EMG) Artifact Removal	98% [39]	Limited effectiveness [38]	Publicly available EEG datasets
Cardiac Artifact Removal	84% [39]	Not specifically reported	Publicly available EEG datasets
Powerline Noise Removal	100% [39]	Effective for high-power noise	Publicly available EEG datasets
Computational Efficiency	Fast automatic algorithm [39]	Generally faster	Real-time processing capability

Technical Advantages and Limitations

ICA demonstrates superior performance for EOG and EMG isolation due to its ability to separate sources on the basis of statistical independence rather than just orthogonality [38]. This is particularly valuable for muscle (EMG) artifacts, which have a broad frequency distribution (0 to >200 Hz) that significantly overlaps with neural signals [21]. ICA can effectively isolate these artifacts based on their characteristic temporal patterns and spatial topographies, achieving a reported 98% removal rate for EMG contamination [39].

However, ICA requires multiple channels for effective separation and depends on the statistical independence of sources, which can be compromised when artifacts and neural signals occur synchronously [2]. The methodology also necessitates subsequent component classification to identify artifactual sources before rejection, adding a step to the processing pipeline.

PCA provides a computationally efficient approach for removing high-variance artifacts like eye blinks through its variance-maximization principle [38]. This makes it valuable for preliminary processing or in resource-constrained environments. However, because PCA is restricted to orthogonal, uncorrelated components, it cannot fully separate sources whose scalp projections are not orthogonal, even when the sources themselves are statistically independent. The result is incomplete artifact separation and potential loss of neural signal when artifactual components are discarded [38].

Experimental Protocols for ICA/PCA-Based Artifact Removal

Standardized EEG Acquisition Protocol

For reproducible artifact isolation research, the following acquisition parameters are recommended:

Electrode Placement: Follow the international 10-20 system with additional EOG electrodes (vertical: above/below eye; horizontal: outer canthi) and EMG electrodes on relevant facial/neck muscles [21]
Reference Channels: Include dedicated EOG (for ocular artifacts) and ECG (for cardiac artifacts) reference channels to facilitate validation [40]
Sampling Rate: Minimum 256 Hz (Nyquist limit 128 Hz), which covers the 0.5-100 Hz passband below; capturing EMG frequency content up to 200 Hz [21] requires a sampling rate above 400 Hz
Filter Settings: Bandpass filter 0.5-100 Hz with notch filter at 50/60 Hz for powerline interference [41]
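The relationship between sampling rate and usable bandwidth in these settings can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the 1.25 headroom factor are assumptions, not values taken from the cited protocols.

```python
def nyquist_hz(sampling_rate_hz):
    """Highest frequency that can be represented without aliasing."""
    return sampling_rate_hz / 2.0


def min_sampling_rate_hz(highest_freq_hz, margin=1.25):
    """Sampling rate needed for a given top frequency.

    `margin` is a hypothetical headroom factor for the anti-aliasing
    filter roll-off; 1.0 would be the bare Nyquist minimum.
    """
    return 2.0 * highest_freq_hz * margin


def check_settings(sampling_rate_hz, bandpass_high_hz, notch_hz):
    """Report whether the filter edges sit below the Nyquist limit."""
    limit = nyquist_hz(sampling_rate_hz)
    return {
        "nyquist_hz": limit,
        "bandpass_ok": bandpass_high_hz < limit,
        "notch_ok": notch_hz < limit,
    }


# Settings from the list above: 256 Hz sampling, 0.5-100 Hz bandpass, 60 Hz notch.
report = check_settings(256, 100, 60)
print(report)
print("Rate needed for 200 Hz content:", min_sampling_rate_hz(200), "Hz")
```

At 256 Hz the 100 Hz band edge and the notch frequency both fit below the 128 Hz Nyquist limit, whereas content at 200 Hz does not.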

ICA Processing Workflow for EOG/EMG Isolation
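A minimal sketch of this workflow, under stated assumptions: the data are centered and whitened, unmixed with a compact NumPy re-implementation of symmetric FastICA (tanh nonlinearity), the component with the highest kurtosis is treated as ocular, and the channels are back-projected without it. The simulated sources, mixing matrix, and function names are hypothetical; in practice a toolbox implementation (EEGLAB, MNE) and a component classifier would replace the hand-written parts.

```python
import numpy as np


def sym_decorrelate(w):
    """Make the rows of w orthonormal: W <- (W W^T)^(-1/2) W."""
    eigval, eigvec = np.linalg.eigh(w @ w.T)
    return eigvec @ np.diag(eigval ** -0.5) @ eigvec.T @ w


def fastica(x, n_iter=200, tol=1e-8, seed=0):
    """Return an unmixing matrix for channels-by-samples data x."""
    xc = x - x.mean(axis=1, keepdims=True)
    eigval, eigvec = np.linalg.eigh(np.cov(xc))
    whitener = np.diag(eigval ** -0.5) @ eigvec.T  # PCA whitening
    z = whitener @ xc
    n, t = z.shape
    w = sym_decorrelate(np.random.default_rng(seed).standard_normal((n, n)))
    for _ in range(n_iter):
        g = np.tanh(w @ z)
        # Fixed-point update: E[z g(w.z)] - E[g'(w.z)] w, for all rows at once
        w_new = (g @ z.T) / t - np.diag((1.0 - g ** 2).mean(axis=1)) @ w
        w_new = sym_decorrelate(w_new)
        converged = np.max(np.abs(np.abs(np.sum(w_new * w, axis=1)) - 1.0)) < tol
        w = w_new
        if converged:
            break
    return w @ whitener


def kurtosis(s):
    """Pearson kurtosis (a Gaussian signal gives about 3)."""
    s = s - s.mean()
    return np.mean(s ** 4) / np.mean(s ** 2) ** 2


def remove_peaky_component(x):
    """Zero the highest-kurtosis component and back-project to channels."""
    unmixing = fastica(x)
    mean = x.mean(axis=1, keepdims=True)
    sources = unmixing @ (x - mean)
    artifact = int(np.argmax([kurtosis(s) for s in sources]))
    kept = sources.copy()
    kept[artifact] = 0.0
    return np.linalg.inv(unmixing) @ kept + mean, sources[artifact]


# Simulated example: alpha rhythm, blink train, and background activity
fs = 256
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(1)
alpha = np.sin(2 * np.pi * 10 * t)
blink = sum(np.exp(-((t - c) ** 2) / (2 * 0.08 ** 2)) for c in (1.0, 3.2, 5.1, 7.4, 9.0))
background = rng.uniform(-1, 1, t.size)
mixing = np.array([[0.2, 1.0, 0.5], [0.6, 0.5, 0.7], [1.0, 0.1, 0.6]])
x = mixing @ np.vstack([alpha, 5 * blink, background])

cleaned, blink_ic = remove_peaky_component(x)
print("blink/IC correlation:", abs(np.corrcoef(blink_ic, blink)[0, 1]))
```

Rejecting the whole component is the traditional choice; any neural activity that leaked into that component is discarded with it.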

PCA Processing Workflow for Preliminary Artifact Reduction
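A minimal sketch of this workflow, assuming the artifact dominates the variance: the channels are centered, decomposed with an SVD, and reconstructed without the leading principal component(s). The simulated topographies, amplitudes, and the function name `pca_remove` are hypothetical.

```python
import numpy as np


def pca_remove(x, n_remove=1):
    """Drop the n_remove highest-variance principal components.

    x is a channels-by-samples array. Returns the reconstructed data and
    the fraction of variance explained by each component.
    """
    mean = x.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(x - mean, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    s_kept = s.copy()
    s_kept[:n_remove] = 0.0          # discard the leading components
    cleaned = (u * s_kept) @ vt + mean
    return cleaned, explained


# Simulated 4-channel recording (microvolts): large frontal blinks,
# smaller posterior alpha, and unit-variance sensor noise
fs = 256
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
blink = sum(np.exp(-((t - c) ** 2) / (2 * 0.08 ** 2)) for c in (1.0, 3.2, 5.1, 7.4, 9.0))
alpha = np.sin(2 * np.pi * 10 * t)
blink_topo = np.array([1.0, 0.8, 0.3, 0.1])
alpha_topo = np.array([0.1, 0.3, 0.8, 1.0])
x = (100 * np.outer(blink_topo, blink)
     + 10 * np.outer(alpha_topo, alpha)
     + rng.standard_normal((4, t.size)))

cleaned, explained = pca_remove(x, n_remove=1)
print("variance explained by PC1:", round(float(explained[0]), 3))
```

Because the blink and alpha topographies in this example are not orthogonal, the discarded component also carries some alpha activity, which illustrates the neural signal loss noted above.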

Advanced Hybrid Methodologies and Emerging Approaches

Integration with Complementary Techniques

While ICA and PCA provide a powerful foundation for artifact isolation, recent research has demonstrated enhanced performance through hybrid methodologies that combine these BSS approaches with complementary signal processing techniques:

Wavelet-ICA Fusion: An improved multi-layer wavelet transform combined with FastICA has shown superior performance for removing ECG artifacts from EMG signals by leveraging both temporal and statistical properties [41]
CNN-LSTM Architectures: Hybrid deep learning approaches incorporating simultaneous EMG recordings can precisely eliminate muscle artifacts from EEG while preserving neurologically relevant components like steady-state visual evoked potentials (SSVEPs) [40]
Riemannian Geometry Integration: Modification of the Artifact Subspace Reconstruction (ASR) algorithm by incorporating Riemannian geometry has shown improved artifact removal from EEG signals while preserving essential bioelectrical patterns [40]

Frequency Domain Characteristics of Artifacts and Neural Signals

Table 2: Frequency characteristics of neural signals and major artifacts informing BSS separation strategies

Signal Type	Frequency Range	Key Characteristics	Spatial Distribution
Delta Waves	0.5-4 Hz	High amplitude, slow waves	Frontal in adults, posterior in children
Theta Waves	4-8 Hz	Related to creativity, meditation	Temporal, parietal
Alpha Waves	8-13 Hz	Prominent during relaxation	Posterior regions, eyes closed
Beta Waves	13-30 Hz	Associated with active thinking	Frontal, parietal
Gamma Waves	30-100 Hz	Cognitive processing, perception	Somatosensory cortex
EOG Artifacts	0-20 Hz	High amplitude, frontal dominance	Primarily frontal electrodes
EMG Artifacts	0 to >200 Hz	Broadband, high frequency	Temporal, frontal, neck regions
ECG Artifacts	0-100 Hz	Periodic pattern, ~1.2 Hz pulse	Variable, near blood vessels

The Researcher's Toolkit: Essential Materials and Reagents

Table 3: Essential research reagents and materials for BSS artifact isolation studies

Item	Specification	Research Function
EEG Acquisition System	High-density (64+ channels) with sync capability	Records raw neural data with sufficient spatial sampling
Ag/AgCl Electrodes	Standard 8-12 mm diameter, wet gel	Ensures optimal skin contact and signal quality [42]
Electrode Gel	NaCl-based, high conductivity	Reduces skin-electrode impedance [42]
Reference Sensors	EOG (vertical/horizontal), ECG, EMG	Provides ground truth for artifact validation [40]
Skin Prep Kit	Abrasive paste, alcohol wipes, measuring tape	Standardizes electrode placement and impedance [42]
Signal Processing Suite	MATLAB with EEGLAB, Python MNE	Implements ICA/PCA algorithms and visualization
Component Classifier	ICLabel, ADJUST, MARA	Automates identification of artifactual components
Validation Dataset	Publicly available EEG corpus (e.g., EEGMMIDB)	Enables method benchmarking and comparison

ICA and PCA continue to provide the methodological foundation for artifact isolation in biomedical signal processing despite the emergence of alternative approaches. ICA's statistical independence criterion makes it particularly effective for separating physiologically distinct sources like EOG and EMG from neural signals, with validated removal rates exceeding 80% across artifact types [39]. PCA offers complementary strengths through its computational efficiency and variance-based separation, making it valuable for preliminary processing and resource-constrained applications.

The enduring utility of these BSS techniques is increasingly enhanced through integration with wavelet analysis, deep learning architectures, and Riemannian geometry, creating hybrid frameworks that address the limitations of any single approach [40] [41]. As the field advances toward real-time artifact removal in clinical and BCI applications, the computational efficiency of these algorithms—with recent implementations achieving rapid processing suitable for online systems—ensures their continued relevance [39]. Future developments will likely focus on adaptive implementations that maintain the statistical principles of ICA and PCA while increasing automation and integration with multimodal data streams.

Electroencephalogram (EEG) signals are routinely contaminated by unwanted artifacts originating from extra-cerebral sources, with electrooculographic (EOG) and electromyographic (EMG) artifacts representing two of the most significant challenges in both clinical and research settings. These artifacts introduce high-amplitude, non-neural signals that can severely distort EEG measurements, potentially rendering entire datasets unusable for accurate analysis [7] [18]. EOG artifacts generated by eye blinks and movements manifest as large amplitude peaks that dominate the frontal regions, while EMG artifacts from muscle activity exhibit broad spectral distributions and variable topographical patterns that overlap with crucial EEG frequencies [25] [43]. The fundamental problem stems from the spectral overlap between artifact and neural signals, which renders simple filtering approaches ineffective and necessitates more sophisticated signal processing techniques [7].

In Brain-Computer Interface (BCI) systems and pharmacological research, the presence of these artifacts poses a particularly critical challenge as they may alter neurological phenomena or even be mistakenly interpreted as control signals or drug effects [25] [2]. Despite the recognized importance of this issue, surveys of BCI literature reveal that most studies do not adequately report whether or how they have addressed EOG and EMG artifacts, with only a small percentage implementing automated methods for their rejection or removal [25]. This comprehensive technical guide focuses specifically on advanced wavelet-enhanced independent component analysis (ICA) methodologies that enable selective artifact correction while preserving underlying neural information, with particular emphasis on their application within the broader context of EOG versus EMG artifact research.

Fundamental Principles: EOG vs. EMG Artifact Characteristics

Understanding the distinct properties of EOG and EMG artifacts is essential for developing effective removal strategies. The table below summarizes their key characteristics:

Table 1: Characteristic Differences Between EOG and EMG Artifacts

Characteristic	EOG Artifacts	EMG Artifacts
Origin	Eye blinks and movements [7]	Muscle activity (face, neck, jaw) [25]
Spectral Range	Low-frequency (0.5-12 Hz) [15]	Wide spectrum (2-100 Hz) [44]
Amplitude	Very high amplitude peaks [7]	High amplitude, variable [18]
Spatial Distribution	Primarily frontal regions [7]	Broad anatomical distribution [44]
Temporal Pattern	Transient, synchronized with blinks [7]	Burst-like or sustained with muscle contraction [18]
Removal Challenge	Overlap with EEG delta/theta rhythms [7]	Spectral overlap with beta/gamma rhythms [44]

The eyeball functions as a dipole with the cornea positive relative to the retina, so eye movements and blinks change the electric field at nearby electrodes and produce large-amplitude potentials [18]. In contrast, EMG artifacts originate from multiple muscle groups with distinct topographic and spectral signatures that vary with contraction intensity [44]. This fundamental difference in generation mechanisms necessitates tailored approaches within the wavelet-ICA framework, particularly regarding component identification and threshold selection.

The Evolution of ICA-Based Artifact Removal

Independent Component Analysis has emerged as a cornerstone technique for artifact removal due to its ability to separate mixed signals into statistically independent sources without requiring reference channels [7]. The traditional ICA approach for artifact removal involves:

Decomposing EEG signals into independent components (ICs)
Identifying artifact-related components through visual inspection or automated metrics
Rejecting entire identified components before signal reconstruction [7]

However, this method suffers from a critical limitation: neural information loss. Since ocular sources are not completely separated from neural sources, rejecting entire components inevitably discards valuable EEG data present within those components [7] [45]. This limitation is particularly problematic for research and clinical applications where preserving the integrity of original neural signals is paramount, such as in drug development studies investigating subtle pharmacological effects on brain activity.

The emergence of wavelet-enhanced ICA methodologies addressed this limitation by introducing an intermediate processing step that selectively targets artifact components while preserving neural information within the same component [7] [45]. This paradigm shift from component rejection to component correction represents a significant advancement in artifact removal technology, particularly for EOG artifacts which often contain valuable low-frequency neural activity intermixed with blink-related potentials.

Wavelet-Enhanced ICA: Core Methodology and Mechanisms

Theoretical Foundation

The wavelet-enhanced ICA framework combines the blind source separation capability of ICA with the multi-resolution analysis of Discrete Wavelet Transform (DWT) to achieve selective artifact correction [7]. The fundamental innovation lies in processing individual independent components containing artifacts at the wavelet coefficient level rather than rejecting entire components. This approach leverages the key property that wavelet coefficients of artifact components typically exhibit higher amplitudes than those of cerebral activity components, enabling their selective identification and correction [7].

The mathematical foundation of DWT involves decomposing a signal using basis functions from wavelet families (Symlets, Coiflets, Haar, etc.) through dilation and translation operations [46]. The transformation progressively refines the signal at multiple scales, enabling simultaneous time and frequency localization, a crucial advantage for analyzing non-stationary EEG signals [46]. The DWT of a signal (f(t)) is represented as:

[W_{\psi}f(j,k) = \int f(t) \psi_{j,k}^*(t) dt]

where (\psi_{j,k}) represents the wavelet basis function at scale (j) and translation (k) [46].
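To make the coefficient-level correction concrete, the sketch below decomposes one artifact-bearing component with a multi-level Haar DWT, zeroes the coefficients whose magnitude exceeds a threshold, and reconstructs the remainder. Several choices here are assumptions made for brevity rather than details from the cited work: the Haar wavelet, the universal threshold with a median-based noise estimate, the white-noise model of background activity, and all function names.

```python
import math
import random
import statistics

SQRT2 = math.sqrt(2.0)


def haar_step(x):
    """One DWT level: approximation and detail coefficients (even length)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one DWT level."""
    x = []
    for a, d in zip(approx, detail):
        x.extend(((a + d) / SQRT2, (a - d) / SQRT2))
    return x


def wavedec(x, levels):
    """Return [approx_L, detail_L, ..., detail_1]."""
    coeffs, approx = [], list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        coeffs.append(detail)
    coeffs.append(approx)
    return coeffs[::-1]


def waverec(coeffs):
    """Rebuild the signal from [approx_L, detail_L, ..., detail_1]."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        approx = haar_inverse_step(approx, detail)
    return approx


def correct_component(ic, levels=5):
    """Zero high-amplitude (artifact) coefficients, keep the rest."""
    coeffs = wavedec(ic, levels)
    # Noise scale from the finest details (median absolute value / 0.6745)
    sigma = statistics.median(abs(c) for c in coeffs[-1]) / 0.6745
    threshold = sigma * math.sqrt(2.0 * math.log(len(ic)))
    kept = [[c if abs(c) <= threshold else 0.0 for c in band] for band in coeffs]
    return waverec(kept), threshold


# Simulated component: unit-variance background plus one large blink
rng = random.Random(0)
fs, n = 128, 512
t = [i / fs for i in range(n)]
background = [rng.gauss(0.0, 1.0) for _ in t]
blink = [50.0 * math.exp(-((ti - 2.0) ** 2) / (2 * 0.1 ** 2)) for ti in t]
component = [b + k for b, k in zip(background, blink)]

corrected, threshold = correct_component(component)
print("peak before:", max(abs(v) for v in component))
print("peak after: ", max(abs(v) for v in corrected))
```

The signal length must be divisible by 2 raised to the number of levels. A wavelet library such as PyWavelets would normally supply the transform and smoother wavelets than Haar.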

Workflow Architecture

The following diagram illustrates the comprehensive workflow for wavelet-enhanced ICA methodology:

Figure 1: Workflow of Wavelet-Enhanced ICA Methodology

Component Identification Strategies

Accurate identification of artifact-related independent components is crucial for the success of wavelet-enhanced methods. The following approaches are commonly employed:

Kurtosis-based identification: EOG artifacts often exhibit high kurtosis due to their transient, peak-like nature [47]
Dispersion entropy and power spectral density: Effective for identifying EOG-related components in single-channel applications [15]
Spatial and temporal features: Used in automated toolboxes like ADJUST without requiring reference electrodes [7]
Sample entropy: Applied after CEEMDAN decomposition to identify EOG-related components in single-channel EEG [46]

For EMG artifacts, identification becomes more challenging due to their broad spectral characteristics and variable topographical distribution [44]. Advanced approaches combine multiple metrics including temporal, spectral, and spatial features to improve identification accuracy.
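Two of the criteria above, kurtosis for ocular components and high-frequency power for muscle components, can be combined in a simple rule. The sketch below is a toy labeling function: the 30 Hz cutoff, the 0.5 power-ratio threshold, and the names are assumptions, and spatial (topographic) criteria are omitted entirely.

```python
import numpy as np


def component_features(ic, fs, hf_cutoff_hz=30.0):
    """Return (kurtosis, fraction of power at or above hf_cutoff_hz)."""
    ic = np.asarray(ic, dtype=float)
    ic = ic - ic.mean()
    kurt = np.mean(ic ** 4) / np.mean(ic ** 2) ** 2   # Gaussian is about 3
    power = np.abs(np.fft.rfft(ic)) ** 2
    freqs = np.fft.rfftfreq(ic.size, d=1.0 / fs)
    hf_ratio = power[freqs >= hf_cutoff_hz].sum() / power.sum()
    return kurt, hf_ratio


def label_component(ic, fs, kurt_thresh=3.0, hf_thresh=0.5):
    """Label a component as 'emg', 'eog', or 'neural' (toy rule)."""
    kurt, hf_ratio = component_features(ic, fs)
    if hf_ratio > hf_thresh:      # broadband, high-frequency dominated
        return "emg"
    if kurt > kurt_thresh:        # sparse, peak-like time course
        return "eog"
    return "neural"


# Simulated components
fs = 256
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
blink = sum(np.exp(-((t - c) ** 2) / (2 * 0.08 ** 2)) for c in (1.0, 3.2, 5.1, 7.4, 9.0))
emg_like = rng.standard_normal(t.size)   # white noise as a crude EMG stand-in
alpha = np.sin(2 * np.pi * 10 * t)

for name, ic in [("blink", blink), ("emg_like", emg_like), ("alpha", alpha)]:
    print(name, "->", label_component(ic, fs))
```

The high-frequency test runs first because the kurtosis of broadband noise sits close to 3 and could fall on either side of the ocular threshold.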

Advanced Hybrid Frameworks and Performance Evaluation

Extended Hybrid Methodologies

Recent research has explored several hybrid frameworks that build upon the basic wavelet-ICA foundation:

EEMD-ICA (EICA): Combines Ensemble Empirical Mode Decomposition with ICA, where EOG-related ICs are processed using EEMD to discriminate and eliminate intrinsic mode functions linked to EOG [47]
WPTEMD and WPTICA: Combine Wavelet Packet Transform with either EMD or ICA, showing superior performance for highly contaminated data [43]
DWT-CEEMDAN-ICA: Addresses overcompleteness and mode-mixing (mode aliasing) problems in single-channel EEG by integrating Complete Ensemble EMD with Adaptive Noise [46]
ICA-WNN: Combines ICA with Wavelet Neural Networks to correct contaminated components while minimizing data loss [45]

These advanced frameworks demonstrate the ongoing evolution toward fully automatic artifact removal systems that require no manual intervention or a priori knowledge of artifact characteristics [7] [43].

Quantitative Performance Metrics

The performance of wavelet-enhanced ICA methods is typically evaluated using multiple quantitative metrics:

Table 2: Performance Metrics for Artifact Removal Algorithms

Metric	Formula/Definition	Interpretation
Root Mean Square Error (RMSE)	(\sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_{clean}(i)-x_{reconstructed}(i))^2})	Lower values indicate better reconstruction [7] [43]
Signal-to-Artifact Ratio (SAR)	(10\log_{10}\left(\frac{P_{signal}}{P_{artifact}}\right))	Higher values indicate better artifact suppression [15]
Correlation Coefficient (CC)	(\frac{\sum_{i=1}^{N}(x_{clean}(i)-\bar{x}_{clean})(x_{reconstructed}(i)-\bar{x}_{reconstructed})}{N\,\sigma_{clean}\sigma_{reconstructed}})	Measures waveform preservation [15]
Artifact to Signal Ratio (ASR)	Analogous to SNR but for artifacts [43]	Lower values indicate better performance [43]
Mutual Information (MI)	(I(X;Y) = \sum_{y\in Y}\sum_{x\in X}p(x,y)\log\left(\frac{p(x,y)}{p(x)p(y)}\right))	Higher values indicate better information preservation [44]
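Three of the metrics in the table translate directly into code. The sketch below uses only the Python standard library; the function names are illustrative, signal power is taken as the mean square, and mutual information is left out because it also requires a choice of probability estimator.

```python
import math


def rmse(clean, reconstructed):
    """Root mean square error between clean and reconstructed signals."""
    n = len(clean)
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(clean, reconstructed)) / n)


def correlation(clean, reconstructed):
    """Pearson correlation coefficient between the two waveforms."""
    n = len(clean)
    mc, mr = sum(clean) / n, sum(reconstructed) / n
    num = sum((c - mc) * (r - mr) for c, r in zip(clean, reconstructed))
    den = math.sqrt(sum((c - mc) ** 2 for c in clean)
                    * sum((r - mr) ** 2 for r in reconstructed))
    return num / den


def sar_db(signal, artifact):
    """Signal-to-artifact ratio in dB, with power as mean square."""
    p_signal = sum(v * v for v in signal) / len(signal)
    p_artifact = sum(v * v for v in artifact) / len(artifact)
    return 10.0 * math.log10(p_signal / p_artifact)


clean = [1.0, 2.0, 3.0, 4.0]
reconstructed = [1.0, 2.0, 3.0, 5.0]
print("RMSE:", rmse(clean, reconstructed))
print("CC:  ", correlation(clean, reconstructed))
print("SAR: ", sar_db([2.0, 2.0], [1.0, 1.0]), "dB")
```

These metrics assume a known clean reference, so they apply to simulated or semi-simulated data where the ground truth is available.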

Comparative Performance Analysis

Studies directly comparing wavelet-enhanced methods with other approaches demonstrate their superior performance:

Table 3: Comparative Performance of Artifact Removal Methods

Method	Artifact Type	Performance Advantages	Limitations
Wavelet-Enhanced ICA	EOG	51.88% better recovery than component rejection methods [7]	Threshold selection critical [7]
wICA	EOG	Reduces neural data loss compared to full component rejection [7]	May not handle movement artifacts effectively [43]
EICA (EEMD-ICA)	EOG	Optimal performance with highest SNR increase and RMSE decrease [47]	Computationally intensive [47]
WPTEMD	General artifacts	Best for highly contaminated data; outperforms wICA and FASTER [43]	Parameter selection important [43]
Regression-Based ADF	EOG	Does not require component identification [47]	Assumes linear mixing; cross-contamination issue [47]
Traditional ICA	EOG/EMG	Established method; no reference channels needed [7]	Significant neural information loss [7]

The improved wavelet-based component correction method described in [7] demonstrates particularly strong performance by correcting EOG components selectively within EOG activity regions only, leaving other parts of the component untouched. This approach outperforms both component rejection methods and earlier wavelet-based EOG removal methods in accuracy across both time and spectral domains [7].

Experimental Protocols and Implementation Guidelines

Standard Experimental Protocol for Method Validation

Implementing and validating wavelet-enhanced ICA methods requires a structured experimental approach:

Data Acquisition:
- For EOG artifacts: Record EEG with simultaneous eye blink tasks (voluntary blinks at 3-5 second intervals)
- For EMG artifacts: Incorporate jaw clenching, head movement, or talking tasks
- Use standard electrode placement (10-20 system) with sampling rate ≥256 Hz [7] [18]
Signal Preprocessing:
- Apply bandpass filtering (0.5-45 Hz) to remove slow drifts and high-frequency noise
- Re-reference to average reference or mastoids
- Identify artifact-contaminated segments using amplitude thresholding (exceeding ±100 µV)
ICA Decomposition:
- Use established ICA algorithms (Infomax, FastICA, SOBI)
- Reduce data dimensionality (e.g., with PCA) where needed so that the number of components suits the channel count and data length
Component Identification:
- For EOG: Use kurtosis (>3) combined with frontal topography
- For EMG: Use high-frequency power ratio and spatial distribution
- Employ multiple criteria to reduce false positives
Wavelet Processing:
- Select appropriate mother wavelet (Symlet, Coiflet, or Daubechies families)
- Determine optimal decomposition level based on signal length
- Apply adaptive thresholding to wavelet coefficients
Validation:
- Compare with simultaneously recorded artifact-free segments
- Use multiple quantitative metrics (RMSE, SAR, CC)
- Perform visual inspection by experienced EEG reviewers
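The amplitude-thresholding step in the preprocessing stage above can be sketched as a scan for runs of samples beyond ±100 µV. The function name and the half-open (start, end) index convention are assumptions made for illustration.

```python
def find_artifact_segments(samples_uv, threshold_uv=100.0):
    """Return (start, end) index pairs where |amplitude| exceeds the threshold.

    `end` is exclusive, so samples_uv[start:end] is the flagged run.
    """
    segments = []
    start = None
    for i, value in enumerate(samples_uv):
        if abs(value) > threshold_uv:
            if start is None:
                start = i                    # a new run begins
        elif start is not None:
            segments.append((start, i))      # the run ended at the previous sample
            start = None
    if start is not None:                    # run continues to the end of the data
        segments.append((start, len(samples_uv)))
    return segments


trace_uv = [10, -20, 150, 160, 30, -120, 5]
print(find_artifact_segments(trace_uv))
```

In practice the flagged runs are usually widened by a margin on each side so that the rising and falling edges of a blink or muscle burst are included.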

Research Reagent Solutions

The following table outlines essential computational tools and their functions for implementing wavelet-enhanced artifact removal:

Table 4: Essential Research Reagent Solutions for Wavelet-ICA Implementation

Tool/Category	Specific Examples	Function in Workflow
ICA Algorithms	Infomax, FastICA, SOBI	Blind source separation of EEG signals [7] [46]
Wavelet Families	Symlets, Coiflets, Daubechies	Multi-resolution signal analysis [7] [46]
Artifact Identification Toolboxes	ADJUST, FASTER	Automated component classification [7]
Decomposition Techniques	EEMD, CEEMDAN, VMD	Signal decomposition for single-channel applications [15] [46]
Performance Metrics	RMSE, SAR, MI	Quantitative evaluation of artifact removal [7] [44]
Programming Environments	MATLAB, Python, EEGLAB	Implementation platform and visualization [7] [46]

Applications in EOG vs. EMG Research and Future Directions

Differential Applications

The application of wavelet-enhanced ICA methods demonstrates important differences between EOG and EMG artifact removal:

For EOG artifact research, the focus has been on selective region correction approaches that identify and process only the specific temporal segments containing ocular events [7]. This precision correction is feasible due to the transient nature of eye blinks and the characteristic frontal topography of EOG artifacts. The wavelet enhancement primarily targets the high-amplitude peaks while preserving surrounding neural activity in the same independent component [7] [45].

For EMG artifact research, the approach must address the broader spectral and temporal distribution of muscle artifacts. Techniques often incorporate additional decomposition methods like EEMD or CEEMDAN to handle the non-stationary characteristics of EMG [44]. The combination of wavelet transform with optimized non-local means filters has shown promise for EMG removal while preserving EEG information [44].

Emerging Trends and Future Developments

Future research directions in wavelet-enhanced artifact removal include:

Complete automation of artifact identification and removal without requiring manual intervention [7]
Deep learning integration with wavelet analysis to improve artifact detection and removal [44]
Real-time implementation for BCI and clinical monitoring applications [43]
Personalized artifact removal adapting to individual subject characteristics [47]
Multi-modal frameworks combining wavelet-ICA with other signal processing techniques for improved performance [46] [47]

The continuing evolution of these methods represents a crucial advancement toward high-fidelity EEG analysis in both research and clinical applications, particularly for drug development studies where precise quantification of neurological phenomena is essential.

Wavelet-enhanced ICA methodologies represent a significant advancement over traditional artifact removal approaches by enabling selective correction of artifactual components while maximizing preservation of neural information. The integration of DWT with ICA creates a powerful framework that leverages the strengths of both techniques: the blind source separation capability of ICA and the multi-resolution analytical power of wavelet transforms. For EOG artifacts, these methods have demonstrated superior performance compared to component rejection approaches, while for EMG artifacts, more sophisticated hybrid approaches continue to evolve. As EEG research expands into more naturalistic settings and mobile applications, the development of fully automated, robust artifact removal methods will remain essential for both basic neuroscience research and applied clinical applications.

Electroencephalography (EEG) is a cornerstone of non-invasive brain monitoring, prized for its high temporal resolution. The advent of wearable, mobile EEG systems has unlocked potential for real-world brain-computer interface (BCI) applications in neuroscience, clinical monitoring, and pharmaceutical research. However, this transition from controlled laboratory settings to dynamic environments introduces a significant challenge: increased vulnerability to physiological and motion artifacts. Among these, artifacts generated by eye movements (electrooculogram, EOG) and muscle activity (electromyogram, EMG) are particularly pervasive and problematic, as they can obscure neural signals and be mistakenly used as a source of control in BCI systems [2].

This technical guide explores the strategic integration of Inertial Measurement Units (IMUs) and other reference channels to mitigate these artifacts. By providing a direct, independent measure of the artifact sources, these auxiliary sensors enable more robust and accurate artifact removal. This is crucial for developing reliable mobile BCIs, whose performance must be robust to the presence of artifacts for practical application [2] [48]. We will delve into the specific characteristics of EOG and EMG artifacts, detail advanced methodologies for leveraging auxiliary sensors, and provide a practical toolkit for researchers.

Physiological Artifacts: A Comparative Analysis of EOG and EMG

Artifacts in EEG are undesired signals that introduce significant changes in brain recordings. EOG artifacts, caused by eye blinks and movements, are characterized by their high amplitude and slow, low-frequency nature (typically 0.5–12 Hz), and they propagate widely across the scalp through volume conduction [1] [15]. In contrast, EMG artifacts originate from the contraction of various muscle groups (e.g., jaw, neck, forehead) and have a broad frequency spectrum that can overlap with key neural rhythms, making them exceptionally difficult to separate from genuine brain activity [1] [2].

Table 1: Characteristics and Mitigation of Key Physiological Artifacts

Artifact Type	Source	Spectral Range	Spatial Distribution on Scalp	Primary Removal Challenge
EOG / Ocular	Eye blinks & movements [1]	0.5 - 12 Hz [15]	Primarily frontal, but widespread due to volume conduction [1]	Frequency overlap with Delta/Theta bands; high amplitude can swamp neural signals [1]
EMG / Muscular	Muscle activity (head, neck, jaw) [1]	0 to >200 Hz (overlaps Beta/Gamma) [1]	Localized near muscle groups (e.g., temporal for jaw) but can be widespread [1]	Broad spectral overlap with EEG; high statistical independence from neural signals [1] [2]
Motion	Head movement, cable sway [48]	Broadband	Global, affecting all channels	Directly correlated with physical motion dynamics; non-stationary [49]

The Auxiliary Sensor Arsenal: IMUs and Reference Channels

Auxiliary sensors provide a direct, independent measure of artifact sources, moving beyond purely mathematical source separation. The core principle is multi-modal fusion, where data from these sensors informs the cleaning of the EEG signal.

Inertial Measurement Units (IMUs)

IMUs are compact sensors that measure tri-axial acceleration, angular velocity, and orientation [49]. In mobile EEG, they are typically mounted on the head to directly quantify the kinematics of head motion, which is a primary source of movement artifacts. An IMU-based pipeline was shown to improve robustness under diverse motion scenarios compared to single-modality approaches like Artifact Subspace Reconstruction (ASR) [49].

Dedicated Physiological Reference Channels

Beyond IMUs, other physiological signals can be recorded as references for specific artifacts:

Electrooculogram (EOG) Channels: Electrodes placed around the eyes specifically record eye movements and blinks, providing a clean reference signal for EOG artifact removal [1].
Electrocardiogram (ECG) Channels: A single channel can capture the heartbeat, which is useful for removing pulse and cardiac artifacts [1].
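One common way to use such a reference channel is least-squares regression: estimate how strongly the reference appears in an EEG channel and subtract that scaled copy. The sketch below handles a single reference with standard-library Python; the function name and toy data are hypothetical, and it assumes the reference itself carries no neural activity, which real EOG channels only approximate.

```python
def regress_out_reference(eeg, reference):
    """Remove the least-squares projection of `reference` from `eeg`.

    Both inputs are equal-length sequences. Returns the mean-removed,
    cleaned EEG and the estimated propagation gain.
    """
    n = len(eeg)
    eeg_mean = sum(eeg) / n
    ref_mean = sum(reference) / n
    eeg_c = [v - eeg_mean for v in eeg]
    ref_c = [v - ref_mean for v in reference]
    # Gain that minimizes the squared error of eeg_c - gain * ref_c
    gain = sum(e * r for e, r in zip(eeg_c, ref_c)) / sum(r * r for r in ref_c)
    cleaned = [e - gain * r for e, r in zip(eeg_c, ref_c)]
    return cleaned, gain


# Toy data: the EEG channel contains 30% of the EOG reference
eog = [0, 0, 10, 10, 0, 0, 10, 10]
neural = [1, -1, 1, -1, 1, -1, 1, -1]
eeg = [s + 0.3 * e for s, e in zip(neural, eog)]

cleaned, gain = regress_out_reference(eeg, eog)
print("estimated gain:", gain)
print("cleaned:", cleaned)
```

Because neural activity also reaches the reference electrodes, regression can subtract some genuine EEG along with the artifact.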

IMU-Enhanced Deep Learning

Recent state-of-the-art approaches involve fine-tuning large, pre-trained neural network models to integrate IMU data directly for EEG motion artifact removal. For instance, the LaBraM (Large Brain Model) framework uses a transformer-based architecture to encode motion-contaminated EEG segments and IMU signals into a shared latent space [49]. A correlation attention mapping mechanism then identifies and gates out motion-related components from the EEG based on their relationship with the IMU features [49].

Table 2: Comparison of Multi-Sensor Fusion Methodologies

Methodology	Core Principle	Best Suited For	Key Advantage	Example Performance
IMU-LaBraM (Deep Learning) [49]	Projects EEG & IMU into shared latent space; uses attention to gate artifacts	Motion artifacts during walking, running	High robustness in diverse motion scenarios; leverages pre-trained models	Outperformed ASR+ICA pipeline under varying motion intensities [49]
Mixture-of-Experts (MoE) [50]	Trains multiple "expert" networks, each specialized for a specific noise level or artifact subtype	EMG artifacts in high-noise settings	Superior lower-bound performance in high-noise conditions; modular design	Achieved competitive overall performance and better performance in high noise on EEGdenoiseNet [50]
Fixed Frequency EWT + GMETV [15]	Decomposes single-channel EEG; uses metrics (kurtosis, entropy) to identify and filter artifact components	EOG artifacts in single-channel systems	Effective for portable, low-density systems without need for multiple EEG channels	Improved Signal-to-Artifact Ratio (SAR) and lower Mean Absolute Error (MAE) on real EEG [15]
D-S Evidence Theory [51]	Fuses heterogeneous data (e.g., IMU motion intention, LiDAR obstacles) for path planning	Fusing motion intent with environmental perception in assistive robotics	Explicitly handles uncertainty and conflict, ideal for incomplete or conflicting sensor data	Achieved 98% average accuracy in head-motion directional control for wheelchairs [51]

Statistical Frameworks and Multi-Sensor Fusion

Other powerful methods include:

Statistical Mixture-of-Experts (MoE): This framework partitions EMG artifacts into quantifiable subtypes and trains specialized "expert" networks (e.g., CNNs and RNNs) for different signal-to-noise ratio (SNR) ranges. A gating network then selects the most appropriate expert for a given EEG segment, optimizing denoising performance, especially in high-noise settings [50].
D-S Evidence Theory for Sensor Fusion: In practical systems like intelligent wheelchairs, D-S evidence theory is used to fuse head motion intention from an IMU with environmental data from LiDAR and ultrasonic sensors. This allows the system to resolve conflicts (e.g., a user commanding a turn into a wall) by dynamically optimizing the path for safety [51].

Experimental Protocols for Validation

To validate the efficacy of auxiliary sensors, researchers employ rigorous experimental paradigms:

Mobile BCI Protocols: As used in the "Mobile BCI dataset" [49], these involve participants performing BCI tasks (e.g., Event-Related Potentials or Steady-State Visually Evoked Potentials) under varying movement conditions: standing, slow walking (0.8 m/s), fast walking (1.6 m/s), and running (2.0 m/s). This design systematically tests artifact removal algorithms across a range of motion intensities.
Assistive Robotics Navigation: Studies evaluating shared control for smart wheelchairs test navigation in simulated indoor environments with static and dynamic obstacles. Performance is measured by classification accuracy of neural commands (e.g., ErrP or SSVEP) and the success rate of obstacle avoidance, comparing systems with and without integrated human feedback via BCI [52].

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Materials and Tools for Multi-Sensor EEG Research

Tool / Reagent	Specification / Function	Application in Research
Science-Grade Mobile EEG	32+ Ag/AgCl electrodes; 500+ Hz sampling [53]	High-fidelity neural data acquisition in mobile settings.
9-Axis IMU Module	3-axis accelerometer, gyroscope, magnetometer; 128+ Hz sampling [49]	Captures head kinematics for motion artifact reference.
EOG Electrodes	Disposable Ag/AgCl electrodes	Placed near eyes to record pure EOG reference signals [1].
LaBraM Model	Pre-trained transformer-based encoder for EEG [49]	Provides a powerful foundation for transfer learning in artifact removal.
iCanClean Algorithm	Canonical Correlation Analysis with IMU references [49]	A benchmark algorithm for robust, calibration-light EEG cleaning.
EEGdenoiseNet	Public benchmark dataset with 67 subjects [50]	Standardized dataset for training and comparing EMG denoising models.
Particle Filter Gmapping	SLAM algorithm for environmental mapping [51]	Fuses LiDAR & odometry data to create a context for motion intention in assistive devices.

The integration of auxiliary sensors like IMUs and dedicated reference channels represents a paradigm shift in mobile EEG research. By moving beyond purely mathematical artifact separation and incorporating direct, physical measurements of noise sources, these methods significantly enhance the robustness and reliability of EEG in real-world conditions. This is indispensable for the future of BCI in dynamic applications, from assistive robotics and neuro-pharmaceutical trials to sports science and consumer technology. As deep learning and sensor fusion techniques continue to evolve, the potential for fully transparent, real-time artifact suppression will unlock new frontiers in our ability to decode brain activity in the wild.

The accurate analysis of electroencephalogram (EEG) signals is fundamental to advancements in neuroscience, clinical diagnostics, and brain-computer interfaces (BCIs). However, the utility of these signals, particularly those from portable, single-channel systems increasingly used in home-based healthcare and wellness tracking, is often compromised by physiological artifacts [15] [35]. Among these, electrooculogram (EOG) and electromyogram (EMG) artifacts pose significant challenges due to their high amplitude and spectral overlap with neural signals of interest. EOG artifacts, originating from eye movements and blinks, are characterized by their low-frequency, high-amplitude nature, while EMG artifacts from muscle activity, such as from the face, neck, or scalp, often manifest as high-frequency noise [15] [28] [54]. The distinction between these artifact types is critical, as their differing spatial, temporal, and spectral characteristics demand tailored removal strategies to prevent misinterpretation of brain activity and to preserve the integrity of underlying neural signals [35] [20].

Traditional artifact removal techniques like Independent Component Analysis (ICA) are inherently designed for multi-channel EEG recordings and are less effective for the single-channel configurations common in wearable devices [15] [55]. This limitation has spurred the development and adoption of data-driven decomposition techniques, including Empirical Mode Decomposition (EMD), Empirical Wavelet Transform (EWT), and Singular Spectrum Analysis (SSA). These methods are capable of analyzing monovariate, non-stationary signals like EEG by adaptively decomposing them into constituent components, thereby facilitating the isolation and removal of artifact sources such as EOG and EMG without the need for reference signals or a multi-electrode setup [15] [55] [56]. This technical guide provides an in-depth examination of these three core decomposition methods, framing their application within the specific research context of discriminating and removing EOG versus EMG artifacts from single-channel EEG.

Core Decomposition Techniques: Principles and Workflows

Empirical Mode Decomposition (EMD)

Principles: EMD is a fully data-driven technique that adaptively decomposes a non-linear and non-stationary signal into a collection of Intrinsic Mode Functions (IMFs) and a residue. The decomposition is based on the local characteristics of the signal's timescale, making it highly suitable for EEG analysis. A key challenge in using EMD is mode mixing, where oscillations of similar scales are distributed across different IMFs or a single IMF contains oscillations of dramatically different scales, which can complicate the isolation of artifacts [15].

Workflow for Artifact Removal: The general workflow for using EMD in artifact removal involves decomposition, component identification, and reconstruction.

Decomposition: The contaminated single-channel EEG signal ( x(t) ) is decomposed into ( n ) IMFs ( (IMF_1, IMF_2, ..., IMF_n) ) and a residue ( r_n ), such that ( x(t) = \sum_{i=1}^{n} IMF_i + r_n ).
Component Identification: Artifact-laden components (e.g., those containing EOG or EMG) are identified. Because EMD extracts the fastest oscillations first, high-frequency EMG activity tends to reside in the first few IMFs, whereas low-frequency, high-amplitude EOG activity is usually captured in the later (higher-index) IMFs and the residue. Identification can be based on statistical metrics like kurtosis, entropy, or power spectral density [15].
Reconstruction: The clean EEG signal is reconstructed by summing the remaining IMFs, excluding those identified as containing artifacts: ( x_{clean}(t) = \sum_{i \notin A} IMF_i ), where ( A ) is the set of artifact-related IMF indices.
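The identification and reconstruction steps can be sketched as follows, assuming the IMFs have already been produced by some EMD implementation (the sifting itself is not shown). EMD yields IMFs ordered from fast to slow oscillations. The function names, the 4 Hz cut-off, and the use of dominant frequency as the only selection criterion are illustrative assumptions, and the sinusoids below merely stand in for real IMFs.

```python
import numpy as np

def dominant_frequency(component, fs):
    """Return the frequency (Hz) of the largest spectral peak after mean removal."""
    spectrum = np.abs(np.fft.rfft(component - component.mean()))
    freqs = np.fft.rfftfreq(component.size, d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

def reconstruct_without_artifacts(imfs, fs, low_cut_hz=4.0):
    """Sum the IMFs whose dominant frequency is at or above low_cut_hz.

    imfs: array of shape (n_imfs, n_samples), ordered from fast to slow.
    Returns (clean_signal, artifact_indices); artifact_indices plays the
    role of the set A in the reconstruction formula.
    """
    artifact_idx = [i for i, imf in enumerate(imfs)
                    if dominant_frequency(imf, fs) < low_cut_hz]
    keep = [i for i in range(len(imfs)) if i not in artifact_idx]
    clean = imfs[keep].sum(axis=0)
    return clean, artifact_idx

fs = 256  # assumed sampling rate in Hz
t = np.arange(0, 4, 1 / fs)
# Stand-ins for IMFs: 40 Hz, 10 Hz, and a slow, large 1 Hz ocular-like wave.
imfs = np.vstack([0.5 * np.sin(2 * np.pi * 40 * t),
                  1.0 * np.sin(2 * np.pi * 10 * t),
                  5.0 * np.sin(2 * np.pi * 1 * t)])
clean, artifact_idx = reconstruct_without_artifacts(imfs, fs)
```

In practice the frequency criterion would be combined with the statistical metrics mentioned above, since slow neural activity (e.g., delta rhythms) also falls below a simple low-frequency cut-off.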

Empirical Wavelet Transform (EWT)

Principles: EWT constructs an adaptive wavelet filter bank based on the information contained in the signal's Fourier spectrum. It segments the spectrum and builds a set of wavelet filters, specifically an empirical scaling function and empirical wavelet functions, to extract the corresponding modulated modes [15]. This approach combines the advantages of the spectral separation capability of wavelets with the adaptivity of empirical methods.

Workflow for Artifact Removal (Fixed Frequency EWT): Recent advancements, such as the Fixed Frequency EWT (FF-EWT), tailor the method specifically for EOG artifact removal by focusing on fixed frequency ranges associated with these artifacts (typically 0.5-12 Hz) [15].

Spectrum Analysis: The Fast Fourier Transform (FFT) is applied to the EEG signal to obtain its spectrum over the normalized frequency range ( [0, \pi] ) radians.
Boundary Detection & Filter Bank Construction: The Fourier spectrum is segmented, and boundaries are identified. Empirical wavelet and scaling functions are defined as bandpass filters on each segment. The scaling function ( \upsilon_l(\theta) ) and wavelet function ( \gamma_l(\theta) ) are defined as per Equations (1) and (2) in the foundational literature [15].
Component Extraction & Filtering: The signal is decomposed into several sub-band signals (SBSs). EOG-related SBSs are identified using metrics like kurtosis (KS), dispersion entropy (DisEn), and power spectral density (PSD). A specialized filter, such as the Generalized Moreau Envelope Total Variation (GMETV) filter, is then applied to these components to suppress artifacts while preserving low-frequency EEG information [15].
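The idea of splitting the spectrum at fixed boundaries can be illustrated with a deliberately simplified sketch. It uses ideal (brick-wall) FFT masks rather than the empirical scaling and wavelet functions of EWT, and it does not implement the GMETV filter; the function name, the boundary values, and the test signal are assumptions made for illustration only.

```python
import numpy as np

def fixed_band_split(x, fs, edges_hz):
    """Split x into sub-band signals using ideal FFT masks.

    edges_hz: increasing boundaries in Hz, e.g. [0.5, 12.0]. The bands are
    [0, e0), [e0, e1), ..., [e_last, fs/2]. Because the masks partition the
    spectrum, the sub-band signals sum back to x.
    """
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    bounds = [0.0, *edges_hz, fs / 2 + 1.0]  # upper bound includes Nyquist
    bands = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        bands.append(np.fft.irfft(spectrum * mask, n=x.size))
    return np.array(bands)

fs = 256  # assumed sampling rate in Hz
t = np.arange(0, 4, 1 / fs)
eeg_like = np.sin(2 * np.pi * 20 * t)        # stand-in for neural activity
eog_like = 4.0 * np.sin(2 * np.pi * 2 * t)   # stand-in for slow ocular activity
bands = fixed_band_split(eeg_like + eog_like, fs, edges_hz=[0.5, 12.0])
# bands[1] covers 0.5-12 Hz, the range the text associates with EOG artifacts.
```

Simply discarding the 0.5-12 Hz band would also remove genuine theta and alpha activity, which is why the workflow above filters the EOG-related sub-bands instead of zeroing them.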

Singular Spectrum Analysis (SSA)

Principles: SSA is a non-parametric subspace-based technique that decomposes a time series into a set of independent components, such as trend, oscillatory components, and noise, without requiring prior knowledge of the signal's structure. Its effectiveness in isolating oscillatory noise, including EOG artifacts, has been well-documented [55] [56]. A key variant is Circulant SSA (CiSSA), which uses circulant matrices to enhance the separation of oscillatory components [15].

Workflow for Artifact Removal: The SSA procedure consists of four key steps: embedding, singular value decomposition (SVD), grouping, and diagonal averaging.

Embedding: The one-dimensional EEG signal ( x(t) ) of length ( N ) is mapped into a trajectory matrix ( \mathbf{X} ) of dimension ( L \times K ) by forming lagged vectors, where ( L ) is the window length and ( K = N - L + 1 ).
Singular Value Decomposition (SVD): The trajectory matrix ( \mathbf{X} ) is decomposed via SVD into a sum of rank-one elementary matrices: ( \mathbf{X} = \sum_{i=1}^{d} \sqrt{\lambda_i} U_i V_i^T ), where ( \sqrt{\lambda_i} ) are the singular values, and ( U_i ) and ( V_i ) are the left and right singular vectors.
Grouping: The indices ( \{1, ..., d\} ) are partitioned into ( m ) disjoint subsets ( I_1, I_2, ..., I_m ). Components corresponding to artifacts are identified and grouped separately. For single-channel analysis, SSA can be coupled with ICA by using the decomposed components to create a multivariate dataset, an approach known as Higher-Order L-moment SSA (HOL-SSA) [55].
Reconstruction (Diagonal Averaging): Each grouped set of elementary matrices is transformed back into a reconstructed time series component. The clean signal is obtained by reconstructing from the groups excluding the artifact-related components.
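The four steps above can be sketched in NumPy as follows. This is a minimal basic-SSA illustration, not the HOL-SSA or CiSSA variants; the function name, the window length, and the rule of treating the two leading components as the artifact group are assumptions chosen for a synthetic example.

```python
import numpy as np

def ssa_components(x, window):
    """Return one reconstructed series per singular triple, plus the singular values."""
    n = x.size
    k = n - window + 1
    # Embedding: L x K trajectory matrix whose columns are lagged vectors.
    traj = np.column_stack([x[i:i + window] for i in range(k)])
    # SVD: traj = sum_i s[i] * outer(u[:, i], vt[i])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    comps = np.empty((s.size, n))
    for i in range(s.size):
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: entries with row + col == j all estimate sample j.
        # Flipping the rows turns those anti-diagonals into ordinary diagonals.
        flipped = elem[::-1]
        comps[i] = [flipped.diagonal(j - (window - 1)).mean() for j in range(n)]
    return comps, s

fs = 128  # assumed sampling rate in Hz
t = np.arange(512) / fs
slow = 5.0 * np.sin(2 * np.pi * 1 * t)    # stand-in for a large ocular-like wave
fast = 1.0 * np.sin(2 * np.pi * 10 * t)   # stand-in for neural activity
x = slow + fast

comps, s = ssa_components(x, window=128)
artifact = comps[:2].sum(axis=0)          # grouping: the two leading components
clean = x - artifact                      # reconstruction without that group
```

A pure sinusoid occupies two singular triples, which is why the leading pair is grouped here. With real EEG the grouping would be decided by criteria such as singular-value ratios or kurtosis rather than by position alone.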

SSA Decomposition Process: This diagram illustrates the four-stage workflow of Singular Spectrum Analysis (SSA) for single-channel EEG artifact removal, from embedding the signal to reconstructing the cleaned time series.

Comparative Analysis of Techniques

The choice of decomposition technique significantly impacts the efficacy of artifact removal. The table below provides a structured, quantitative comparison of EMD, EWT, and SSA, highlighting their performance and characteristics in the context of EOG and EMG artifact removal.

Table 1: Comparative Analysis of EMD, EWT, and SSA for Single-Channel EEG Artifact Removal

Feature	Empirical Mode Decomposition (EMD)	Empirical Wavelet Transform (EWT)	Singular Spectrum Analysis (SSA)
Core Principle	Data-driven, adaptive decomposition into Intrinsic Mode Functions (IMFs)	Adaptive wavelet filter bank based on signal's Fourier spectrum	Subspace method using Singular Value Decomposition (SVD) of trajectory matrix
Key Strength	High adaptability to non-stationary signals; no need for basis functions	Combines spectral precision of wavelets with empirical adaptivity	Effective isolation of oscillatory components; strong for EOG artifacts [55]
Primary Limitation	Susceptible to mode mixing	Performance depends on accurate spectrum segmentation	Requires careful selection of components for grouping
Effectiveness (EOG)	Moderate. Can separate EOG but may suffer from mode mixing [15]	High (especially FF-EWT). Effectively targets fixed low-frequency EOG bands [15]	High. Identified as superior for EOG removal, preserving SNR [55]
Effectiveness (EMG)	Moderate. Can capture high-frequency EMG but separation is challenging	High. Can target specific high-frequency EMG bands	Moderate to High, especially when combined with CCA (SSA-CCA) [55]
Reported Performance (EOG)	-	Lower RRMSE, higher CC, improved SAR & MAE vs. EMD/SSA [15]	Highest improvement in SNR, lower RMSE and correlation coefficient [55]
Reported Performance (EMG)	-	-	SSA-CCA outperforms EEMD-CCA for multichannel EMG [55]
Computational Load	Moderate	Moderate to High (depends on spectrum analysis)	Moderate

Experimental Protocols and Methodologies

To ensure the validity and reproducibility of research employing these decomposition techniques, a rigorous experimental protocol must be followed. This section outlines a detailed methodology for a comparative study on EOG and EMG artifact removal.

Data Acquisition and Preprocessing

EEG Datasets: Experiments should utilize both benchmark and real-world datasets.
- Synthetic Data: Artificially contaminate clean EEG recordings with recorded EOG and EMG signals at known Signal-to-Noise Ratios (SNRs). This provides a ground truth for validation [15] [8].
- Real EEG Data: Use publicly available datasets like the EEGdenoiseNet [28] [8] or the motor imagery High-Gamma Dataset [55]. For EOG-specific research, ensure the dataset contains marked eye-blink events.
Preprocessing: Apply band-pass filtering (e.g., 0.5-45 Hz) to remove DC offset and high-frequency noise outside the range of interest. Resample all signals to a uniform sampling rate (e.g., 256 Hz). For single-channel analysis, select channels most affected by the target artifact (e.g., FP1 for EOG).
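For the synthetic-data step, a contaminated signal at a prescribed SNR can be generated by scaling the artifact before adding it. The sketch below assumes a power-based SNR definition, ( SNR = 10 \log_{10}(P_{clean} / P_{artifact}) ); conventions differ between papers and datasets, so the definition, the function name, and the random stand-in signals are all assumptions.

```python
import numpy as np

def mix_at_snr(clean, artifact, snr_db):
    """Return (clean + lam * artifact, lam), with lam chosen so that
    10 * log10(P_clean / P_scaled_artifact) equals snr_db."""
    p_clean = np.mean(clean ** 2)
    p_art = np.mean(artifact ** 2)
    lam = np.sqrt(p_clean / (p_art * 10 ** (snr_db / 10)))
    return clean + lam * artifact, lam

rng = np.random.default_rng(0)
clean = rng.standard_normal(1024)            # stand-in for a clean EEG segment
artifact = 3.0 * rng.standard_normal(1024)   # stand-in for a recorded EOG/EMG segment
noisy, lam = mix_at_snr(clean, artifact, snr_db=-3.0)

# Check the SNR that was actually achieved.
achieved_db = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

Because the clean segment is retained, it serves as the ground truth against which any removal method can later be scored.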

Implementation of Decomposition and Artifact Removal

Parameter Selection:
- EMD: Use the standard algorithm to decompose the signal until a monotonic residue is obtained. No preset basis functions are needed.
- EWT: For FF-EWT, set the frequency range of interest to 0.5-12 Hz for EOG artifacts. The number of scales and the filter bank parameters should be optimized for the signal [15].
- SSA: The window length ( L ) is a critical parameter. A common choice is ( L = N/2 ), where ( N ) is the signal length. For automated component selection, use criteria like the ratio of singular values or kurtosis thresholds [55].
Component Identification: Automate the identification of artifact components using quantitative metrics.
- For EOG: Components with high kurtosis (indicating peaky, high-amplitude blinks) and high power in the low-frequency band (e.g., < 4 Hz) are likely EOG-related [15] [55].
- For EMG: Components with high power in the high-frequency band (e.g., 20-100 Hz) and characteristics identified by dispersion entropy (DisEn) can be targeted [15].
Signal Reconstruction: Reconstruct the signal by excluding the components identified as artifacts. In hybrid methods like SSA-ICA or EWT-GMETV, apply the secondary algorithm to the selected components before reconstruction [15] [55].
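The component-identification rules above can be illustrated with a small rule-based labeller. The thresholds (excess kurtosis above 3, more than half the power in the target band), the function names, and the synthetic test components are assumptions for illustration; dispersion entropy is not implemented here.

```python
import numpy as np

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (0 for a Gaussian)."""
    x = x - x.mean()
    return np.mean(x ** 4) / np.mean(x ** 2) ** 2 - 3.0

def band_power_fraction(x, fs, f_lo, f_hi):
    """Fraction of total (mean-removed) power that lies in [f_lo, f_hi) Hz."""
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs < f_hi)
    return power[band].sum() / power.sum()

def label_component(x, fs, kurt_thr=3.0, frac_thr=0.5):
    """Label a decomposed component as 'EOG-like', 'EMG-like', or 'keep'."""
    if (band_power_fraction(x, fs, 0.0, 4.0) > frac_thr
            and excess_kurtosis(x) > kurt_thr):
        return "EOG-like"
    if band_power_fraction(x, fs, 20.0, 100.0) > frac_thr:
        return "EMG-like"
    return "keep"

fs = 256  # assumed sampling rate in Hz
t = np.arange(0, 8, 1 / fs)
# Two blink-shaped bumps: peaky (high kurtosis) and slow (low frequency).
blink = np.exp(-((t - 2.0) / 0.1) ** 2) + np.exp(-((t - 6.0) / 0.1) ** 2)
# Stand-in for muscle activity: energy concentrated above 20 Hz.
muscle = sum(np.sin(2 * np.pi * f * t) for f in (30, 55, 80))
alpha = np.sin(2 * np.pi * 10 * t)  # stand-in for a neural rhythm

labels = [label_component(c, fs) for c in (blink, muscle, alpha)]
```

Fixed thresholds of this kind usually need tuning per dataset and per decomposition method.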

Performance Evaluation and Validation

The performance of the artifact removal pipeline must be quantified using established metrics.

Table 2: Key Performance Metrics for Artifact Removal Validation

Metric	Formula / Principle	Interpretation
Correlation Coefficient (CC)	( CC = \frac{\text{cov}(x_{clean}, x_{reconstructed})}{\sigma_{x_{clean}} \sigma_{x_{reconstructed}}} )	Measures the linear similarity between the clean and processed signal. Higher values (closer to 1) are better.
Signal-to-Artifact Ratio (SAR)	( SAR = 10 \log_{10}\left(\frac{P_{signal}}{P_{artifact}}\right) )	Measures the power ratio of the desired signal to the residual artifact. Higher values indicate better artifact suppression.
Relative Root Mean Square Error (RRMSE)	( RRMSE = \frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_{clean} - x_{reconstructed})^2}}{\sigma_{x_{clean}}} )	Quantifies the relative error introduced by the processing. Lower values are better.
Mean Absolute Error (MAE)	( MAE = \frac{1}{N}\sum_{i=1}^{N} |x_{clean} - x_{reconstructed}| )	Measures the average magnitude of errors. Lower values are better.
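The metrics in Table 2 translate directly into a few lines of NumPy. The sketch below follows the table's formulas; the one added assumption is that, for SAR, the residual artifact is taken to be the difference between the reconstructed and clean signals, which is only possible when a ground-truth clean signal exists (as with semi-synthetic data). Function names are illustrative.

```python
import numpy as np

def correlation_coefficient(clean, recon):
    """Pearson correlation between the clean and reconstructed signals."""
    return np.corrcoef(clean, recon)[0, 1]

def rrmse(clean, recon):
    """RMS error divided by the standard deviation of the clean signal."""
    return np.sqrt(np.mean((clean - recon) ** 2)) / np.std(clean)

def mae(clean, recon):
    """Mean absolute error."""
    return np.mean(np.abs(clean - recon))

def sar_db(clean, recon):
    """SAR in dB, treating (recon - clean) as the residual artifact (assumption)."""
    return 10 * np.log10(np.mean(clean ** 2) / np.mean((recon - clean) ** 2))

fs = 256  # assumed sampling rate in Hz
t = np.arange(0, 2, 1 / fs)
clean = np.sin(2 * np.pi * 5 * t)
recon = clean + 0.1   # a reconstruction with a constant 0.1 offset left in

scores = {
    "CC": correlation_coefficient(clean, recon),
    "RRMSE": rrmse(clean, recon),
    "MAE": mae(clean, recon),
    "SAR_dB": sar_db(clean, recon),
}
```

The example also shows why several metrics are reported together: a constant offset leaves CC at 1 while RRMSE, MAE, and SAR all register the error.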

Artifact Removal Experiment: This diagram outlines the end-to-end experimental protocol for validating decomposition-based artifact removal methods, from data acquisition to performance evaluation.

Successful experimentation in this field relies on a suite of computational tools and datasets. The following table details key resources that form the foundation for research on EOG and EMG artifact removal.

Table 3: Essential Research Reagents and Resources for EEG Artifact Removal Research

Tool/Resource	Type	Function in Research
EEGdenoiseNet [28] [8]	Benchmark Dataset	Provides clean EEG segments and artificially contaminated EEG with EOG and EMG, serving as a standard for training and benchmarking denoising algorithms.
MNE-Python [57]	Software Library	An open-source Python package for exploring, visualizing, and analyzing human neurophysiological data; essential for preprocessing, ICA, and standard time-frequency analysis.
High-Gamma Dataset [55]	Public Dataset	A 128-channel EEG dataset for motor imagery tasks, useful for validating artifact removal methods in a BCI context.
Functional Link Neural Network (FLNN) [54]	Algorithmic Component	Used in conjunction with adaptive filters (e.g., ANFIS) to enhance nonlinear approximation capabilities for removing ocular and muscular artifacts.
Canonical Correlation Analysis (CCA)	Statistical Method	Used in hybrid pipelines (e.g., SSA-CCA, WPD-CCA) to separate artifact components from neural signals by maximizing correlation across multiple channels or derived components [55] [56].
Online Recursive ICA (ORICA) [55]	Algorithm	An adaptive, real-time variant of ICA suitable for processing streaming data, often used after decomposition methods like HOL-SSA for source separation.

The effective removal of EOG and EMG artifacts from single-channel EEG is a critical preprocessing step that directly influences the quality and reliability of subsequent brain activity analysis. Data-driven decomposition techniques (EMD, EWT, and SSA) provide powerful, flexible frameworks for tackling this challenge without relying on multi-channel setups. While each method has its distinct strengths, the emerging trend points toward the superiority of tailored and hybrid approaches. Fixed Frequency EWT demonstrates exceptional performance for targeted EOG removal, while SSA-based hybrids (e.g., HOL-SSA with ORICA, SSA-CCA) offer robust solutions for both EOG and EMG artifacts. The choice of method should be guided by the specific artifact profile of the data, computational constraints, and the requirement for signal preservation. Future research will continue to refine these techniques and be shaped by the growing influence of deep learning models, such as A²DM and CLEnet, which show great promise in handling the complex, interleaved nature of artifacts in real-world EEG signals [28] [8].

Electroencephalogram (EEG) signals provide a non-invasive window into brain activity, serving as crucial tools for diagnosing neurological disorders, conducting neuroscientific research, and developing brain-computer interfaces (BCIs) [58] [59]. However, their low amplitude and non-stationary nature make them highly susceptible to contamination by physiological artifacts, primarily originating from eye movements (electrooculogram, EOG) and muscle activity (electromyogram, EMG) [2]. These artifacts introduce extraneous signals that can obscure genuine brain activity, leading to misinterpretation of data and reduced reliability in both clinical and research settings [40] [2].

The fundamental challenge in EEG artifact removal lies in the distinct characteristics of EOG and EMG artifacts. EOG artifacts caused by eye blinks and movements are typically low-frequency, high-amplitude signals that predominantly affect frontal EEG channels [15]. In contrast, EMG artifacts generated by muscle contractions in the face, jaw, or neck are broadband, high-frequency signals that can affect multiple brain regions and often overlap spectrally with neural signals of interest [40] [2]. This heterogeneity has traditionally necessitated specialized algorithms for each artifact type, complicating the development of unified denoising solutions.

Recent advances in deep learning have transformed this landscape, enabling the development of sophisticated hybrid architectures that simultaneously address both temporal dependencies and spatial features in EEG data. This technical guide explores two cutting-edge approaches: CNN-LSTM hybrids that leverage the complementary strengths of convolutional and recurrent networks, and the artifact-aware A²DM framework that incorporates explicit artifact representation for unified denoising.

Core Architectural Frameworks

CNN-LSTM Hybrid Networks: Principles and Implementation

CNN-LSTM hybrid architectures represent a significant advancement in EEG artifact removal by synergistically combining spatial feature extraction and temporal sequence modeling. The convolutional neural network (CNN) component excels at identifying local morphological patterns and spatial hierarchies in EEG signals through its layered filter architecture, effectively extracting features across different frequency scales [40] [59]. The long short-term memory (LSTM) component processes these extracted features as temporal sequences, capturing long-range dependencies and contextual information through its gated memory cells, which is crucial for modeling the non-stationary nature of EEG and artifact dynamics [40] [24].

Several innovative implementations of this hybrid approach have demonstrated remarkable efficacy in handling both EOG and EMG artifacts:

Dual-Scale CLEnet: This architecture employs two parallel CNN branches with different kernel sizes to capture both fine-grained and broad morphological features from contaminated EEG. An embedded Efficient Multi-Scale Attention mechanism (EMA-1D) enhances relevant features while suppressing noise. The extracted features are then processed by LSTM layers to model temporal dependencies before final reconstruction of clean EEG [24].
Multi-Scale CNN with Bidirectional GRU (MSCGRU): Utilizing a generative adversarial network (GAN) framework, this approach incorporates a multi-scale CNN module with channel attention to extract frequency-specific features, followed by a bidirectional gated recurrent unit (BiGRU) to capture forward and backward temporal dependencies. The discriminator network evaluates the similarity between generated and clean EEG, further refining denoising performance [59].
CNN-LSTM with EMG Reference: A specialized implementation for muscle artifact removal incorporates simultaneously recorded EMG signals as additional inputs. The CNN processes both EEG and EMG inputs to identify artifact representations, while the LSTM models their temporal evolution, enabling precise subtraction of EMG artifacts while preserving neural information [40].

A²DM: Artifact-Aware Denoising Model

The A²DM framework introduces a paradigm shift in unified artifact removal by explicitly incorporating artifact type representation as prior knowledge into the denoising process [60]. This approach addresses the fundamental challenge of heterogeneous artifact distributions in the time-frequency domain by enabling a single model to adapt its denoising strategy based on the specific artifact type present.

The A²DM architecture operates through three sophisticated components:

Artifact Representation Module: A pre-trained artifact classification model analyzes the input EEG and generates an artifact type representation vector indicating the presence and characteristics of EOG, EMG, or other artifacts [60].
Frequency Enhancement Module: This component employs a hard attention mechanism that leverages the artifact representation to identify and suppress frequency bands disproportionately affected by the specific artifact type. For EOG artifacts, it primarily targets low-frequency components (0.5-12 Hz), while for EMG artifacts, it focuses on broadband high-frequency interference [60] [15].
Time-Domain Compensation Module: To mitigate potential information loss during frequency-domain processing, this module captures global temporal patterns and reconstructs potentially suppressed genuine neural activity, ensuring preservation of clinically relevant EEG components [60].

This explicit artifact awareness enables A²DM to achieve state-of-the-art performance across multiple artifact types while maintaining the computational efficiency necessary for real-world applications.

Performance Analysis: Quantitative Comparisons

Table 1: Performance Metrics of Deep Learning Models for EMG Artifact Removal

Model Architecture	Relative RMSE	Correlation Coefficient	Signal-to-Noise Ratio (dB)	Primary Artifact Target
MSCGRU [59]	0.277 ± 0.009	0.943 ± 0.004	12.857 ± 0.294	EMG
CLEnet [24]	0.300 (temporal)	0.925	11.498	Mixed (EMG+EOG)
CNN-LSTM with EMG [40]	-	-	Significant improvement reported	EMG
A²DM [60]	-	12% improvement over baseline CNN	-	Unified (EOG+EMG)

Table 2: Performance Comparison for EOG Artifact Removal

Model Architecture	Correlation Coefficient	Signal-to-Artifact Ratio	Mean Absolute Error	Notable Features
FF-EWT + GMETV [15]	High improvement	Substantial improvement	Significant reduction	Automated component identification
LSTM with Eye-Tracking [61]	-	-	-	High specificity for EOG
EEGDNet (Transformer) [59]	-	-	-	Specialized for EOG
A²DM [60]	Superior to conventional CNN	-	-	Unified architecture

Table 3: Multi-Artifact Performance Comparison

Model Architecture	Artifact Types Handled	Key Advantage	Computational Complexity
CLEnet [24]	EMG, EOG, ECG, Unknown	Excellent on multi-channel with unknown artifacts	Moderate
A²DM [60]	EOG, EMG (unified)	Artifact-type awareness	Moderate to High
Traditional ICA [2]	EOG, EMG	Well-established	Low
Hybrid EEMD-CCA [59]	EOG, EMG	No reference channel required	Moderate

Experimental Protocols and Methodologies

Protocol for Evaluating CNN-LSTM Hybrids

The validation of CNN-LSTM hybrid models typically follows a rigorous protocol involving both semi-synthetic and real EEG datasets to comprehensively evaluate performance under controlled and realistic conditions:

Data Preparation and Augmentation: For semi-synthetic datasets, clean EEG recordings are artificially contaminated with measured EOG and EMG artifacts at varying signal-to-noise ratios. Data augmentation techniques such as sliding window segmentation, additive noise injection, and amplitude scaling are employed to enhance dataset diversity and model robustness [40] [24].
Architecture Implementation: The CNN component typically employs 1D convolutional layers with varying kernel sizes (e.g., 3, 5, 7) to capture multi-scale features. The LSTM component is configured with either unidirectional or bidirectional layers, depending on whether only past context or both past and future context is needed. Specific implementations like CLEnet incorporate dual-scale CNN branches with kernel sizes 5 and 9, followed by three LSTM layers with 64, 32, and 1 units, respectively [24].
Training Configuration: Models are trained using mean squared error (MSE) or related loss functions between denoised and clean EEG signals. Optimization is typically performed with Adam or RMSprop optimizers with learning rates between 0.001 and 0.0001. Training employs early stopping based on validation loss to prevent overfitting [59] [24].
Validation Framework: Performance is quantified using multiple metrics including relative root mean square error (RRMSE) in both temporal and frequency domains, correlation coefficient (CC) between cleaned and clean EEG, and signal-to-noise ratio (SNR) improvement. Statistical significance is assessed through repeated measures with different data splits [40] [24].
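The augmentation techniques named in the data-preparation step (sliding-window segmentation, amplitude scaling, additive noise) can be sketched as below. The window length, hop size, scaling range, and noise level are illustrative assumptions, not values taken from the cited studies, and random noise stands in for an EEG recording.

```python
import numpy as np

def sliding_windows(x, win, hop):
    """Cut a 1-D recording into overlapping segments of length win."""
    starts = range(0, x.size - win + 1, hop)
    return np.stack([x[s:s + win] for s in starts])

def augment(segments, rng, scale_range=(0.8, 1.2), noise_std=0.01):
    """Apply a random amplitude scale per segment and add Gaussian noise."""
    scales = rng.uniform(*scale_range, size=(segments.shape[0], 1))
    noise = rng.normal(0.0, noise_std, size=segments.shape)
    return segments * scales + noise

rng = np.random.default_rng(0)
recording = rng.standard_normal(2560)    # 10 s at an assumed 256 Hz
segments = sliding_windows(recording, win=512, hop=256)   # 2 s windows, 50% overlap
augmented = augment(segments, rng)
```

When overlapping windows are used, the train/validation split is best made on the recording level before segmentation, so that near-duplicate segments do not end up on both sides of the split.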

Protocol for A²DM Framework Evaluation

The experimental validation of the A²DM framework emphasizes its unified approach to multiple artifact types:

Artifact Representation Learning: A convolutional classifier is pre-trained on labeled artifact examples to generate informative artifact representation vectors. This model learns distinctive features of EOG (low-frequency, frontal dominance) and EMG (broadband, multi-region) artifacts through supervised training on diverse examples [60].
Integration with Denoising Network: The artifact representation is fused into a U-Net inspired architecture through feature-wise linear modulation (FiLM) layers, which scale and shift intermediate feature maps based on the artifact type. This enables the network to adapt its processing strategy dynamically [60].
Frequency-Specific Processing: The frequency enhancement module applies artifact-specific attention weights to time-frequency representations (e.g., spectrograms or wavelet transforms) of the input EEG, selectively suppressing frequencies most affected by the identified artifact type [60].
Cross-Artifact Generalization Testing: The model is systematically evaluated on datasets containing EOG artifacts, EMG artifacts, and mixed artifacts to verify its unified denoising capability. Performance is compared against both specialized single-artifact models and other unified approaches [60].
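Feature-wise linear modulation, as used in the integration step above, scales and shifts each feature channel with coefficients derived from a conditioning vector. The NumPy sketch below shows only that operation. The single linear projection for the coefficients, the shapes, and the random (untrained) weights are assumptions; the actual A²DM layers may be structured differently.

```python
import numpy as np

def film(features, artifact_vec, w_gamma, b_gamma, w_beta, b_beta):
    """Feature-wise linear modulation of a feature map.

    features:     (channels, time) intermediate feature map
    artifact_vec: (d,) artifact representation vector
    w_*, b_*:     linear projections producing one gamma and beta per channel
    Returns gamma[:, None] * features + beta[:, None].
    """
    gamma = w_gamma @ artifact_vec + b_gamma   # per-channel scale
    beta = w_beta @ artifact_vec + b_beta      # per-channel shift
    return gamma[:, None] * features + beta[:, None]

rng = np.random.default_rng(0)
channels, time_steps, d = 8, 100, 3
features = rng.standard_normal((channels, time_steps))
eog_vec = np.array([1.0, 0.0, 0.0])   # hypothetical one-hot code for "EOG"

# Random weights stand in for parameters that would be learned in training.
w_gamma = rng.standard_normal((channels, d))
w_beta = rng.standard_normal((channels, d))
modulated = film(features, eog_vec, w_gamma, np.ones(channels),
                 w_beta, np.zeros(channels))

# With zero projection weights the layer reduces to the identity.
zeros = np.zeros((channels, d))
unchanged = film(features, eog_vec, zeros, np.ones(channels),
                 zeros, np.zeros(channels))
```

Because a different artifact vector yields different scales and shifts, the same downstream filters can behave differently for EOG and EMG contamination, which is the adaptation the text describes.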

Visualization of Architectures and Workflows

Figure: CNN-LSTM Hybrid Architecture for EEG Denoising

Figure: A²DM Unified Artifact Removal Framework

Table 4: Key Research Resources for EEG Artifact Removal Studies

Resource Category	Specific Tool/Platform	Primary Function	Application Context
Reference Datasets	EEGdenoiseNet [24]	Provides semi-synthetic EEG with controlled artifacts	Algorithm training and benchmarking
	MIT-BIH Arrhythmia Database [24]	Source of ECG artifacts	ECG artifact removal studies
	Multimodal EEG+Eye-Tracking Data [61]	Simultaneous EEG and eye movement recordings	EOG artifact analysis and removal
Software Frameworks	TensorFlow/PyTorch	Deep learning implementation	Model development and training
	EEGLAB/BCILAB	Traditional EEG processing	Baseline method implementation
	ICA-based Toolboxes (ICLabel) [40]	Automatic component classification	Comparison with traditional methods
Hardware Requirements	High-density EEG Systems (32+ channels) [24]	Data acquisition with spatial information	Multi-channel artifact removal
	Simultaneous EMG Recording Setup [40]	Reference muscle activity recording	EMG-informed artifact removal
	Eye-Tracking Systems [61]	Reference eye movement recording	EOG-informed artifact removal
Evaluation Metrics	Signal-to-Noise Ratio (SNR) [40]	Quantifies noise reduction	Performance assessment
	Correlation Coefficient (CC) [59] [24]	Measures waveform preservation	Quality of clean EEG reconstruction
	Relative RMSE [59] [24]	Quantifies reconstruction error	Overall denoising accuracy

Future Directions and Clinical Translation

The evolution of deep learning approaches for EEG artifact removal continues to advance along several promising trajectories. Interpretability and transparency remain significant challenges, with current research focusing on incorporating explainable AI techniques such as gradient-weighted class activation mapping (Grad-CAM) and Shapley values to elucidate model decision processes [62]. Federated learning frameworks are emerging as solutions for privacy-preserving model training across multiple clinical institutions, addressing data governance concerns while expanding training diversity [62].

For real-world clinical adoption, future research must prioritize computational efficiency and robustness to domain shift between laboratory and clinical environments. The integration of these advanced artifact removal systems with wearable EEG platforms represents a particularly promising direction for extending neurological monitoring beyond clinical settings into daily life, potentially revolutionizing the management of epilepsy, sleep disorders, and neurodegenerative conditions [15] [58].

The convergence of CNN-LSTM hybrids with artifact-aware architectures like A²DM points toward a future where unified, adaptive models can handle the full spectrum of physiological artifacts while preserving the delicate neural signatures essential for both clinical diagnosis and neuroscientific discovery.

Optimizing Your Pipeline: Troubleshooting Common Challenges in Real-World Settings

The evolution of electroencephalography (EEG) toward wearable devices marks a transformative shift in neurophysiological monitoring, enabling unprecedented access to brain activity in real-world environments. This transition from controlled laboratory settings to ambulatory use in healthcare, neuroscience, and consumer applications is driven by technological advances in dry electrodes, wireless connectivity, and miniaturized electronics [63] [35]. Wearable EEG platforms promise to revolutionize clinical trials, neurology, and personalized health monitoring by facilitating extended-duration recordings in ecological settings [64] [65].

However, this paradigm shift introduces significant technical challenges that impact data quality and interpretability. The core obstacles stem from three interconnected factors: the use of dry electrodes without conductive gel, subject mobility during recordings, and reduced electrode counts compared to traditional high-density systems [66] [35]. These factors collectively exacerbate the presence and complexity of artifacts in EEG signals. The relaxed constraints of the acquisition setup often compromise signal quality, with artifacts exhibiting specific features due to dry electrodes, reduced scalp coverage, and subject mobility [66]. Within this context, managing electrooculogram (EOG) and electromyogram (EMG) artifacts becomes particularly critical, as these physiological contaminants present distinct characteristics that require specialized detection and removal strategies, especially in low-channel-count configurations [15] [67].

This technical guide examines the core challenges of wearable EEG systems, with a specific focus on the comparative analysis of EOG and EMG artifacts. It provides researchers and drug development professionals with experimentally validated methodologies for addressing these challenges, supported by quantitative performance data and standardized experimental protocols.

Core Technical Challenges in Wearable EEG

Dry Electrode Technology and Signal Quality

Dry-electrode EEG systems substantially reduce patient and site burden in clinical applications, with set-up times approximately 50% faster than standard EEG systems [65]. However, this operational efficiency comes with specific signal quality considerations. Quantitative performance varies significantly across different neural metrics, with resting-state EEG and P300 evoked activity being adequately captured by dry-electrode systems, while low-frequency activity (<6 Hz) and induced gamma activity (40–80 Hz) present notable challenges [65].

Dry-electrode systems rely on amplifiers with ultra-high input impedance (>47 GΩ) that tolerate contact impedances of up to 1-2 MΩ, producing signal quality comparable to wet electrodes in optimal conditions [63]. The practical advantages include an average set-up time of just 4.02 minutes compared to 6.36 minutes for wet electrode systems, with acceptable comfort ratings during extended 4-8 hour recordings [63]. Unlike wet electrodes, whose signal deteriorates as the conductive gel dries, dry electrodes maintain stable signal quality over longer periods, though they are more susceptible to motion artifacts and impedance fluctuations from poor skin contact [66] [63].
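Why such a high amplifier input impedance matters can be seen from a simple voltage-divider model, in which the contact impedance and the amplifier input form a divider across the scalp potential. The sketch below is a purely resistive approximation that ignores capacitance, noise, and impedance mismatch between electrodes; the 10 MΩ comparison value is a hypothetical figure chosen for contrast and does not come from the cited studies.

```python
def fraction_retained(z_in_ohm, z_contact_ohm):
    """Fraction of the source voltage that appears at the amplifier input."""
    return z_in_ohm / (z_in_ohm + z_contact_ohm)

z_contact = 2e6                      # 2 MΩ, upper end of the range quoted above

# Amplifier input impedance of 47 GΩ, as quoted above.
loss_high_z = (1 - fraction_retained(47e9, z_contact)) * 100   # percent lost

# Hypothetical 10 MΩ input impedance, for contrast.
loss_low_z = (1 - fraction_retained(10e6, z_contact)) * 100    # percent lost
```

Under this model the 47 GΩ input loses well under 0.01% of the signal amplitude at a 2 MΩ contact, whereas the hypothetical 10 MΩ input would lose about one sixth of it.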

Motion Artifacts in Ambulatory Environments

Wearable EEG operation in everyday environments limits the experimenter's ability to mitigate environmental noise and exposes the system to high-intensity motion, significantly increasing artifact contamination [66] [35]. Motion artifacts manifest as high-amplitude, low-frequency signals that can obscure neural activity of interest, particularly in mobile scenarios.

In clinical validation studies, the false detection rate in ambulatory monitoring environments (0.290 false positives/hour) was notably higher than in controlled epilepsy monitoring units (0.136 false positives/hour), indicating clear room for improvement in unrestricted at-home monitoring [64]. These artifacts are especially problematic in pediatric populations and in patients with movement disorders, for whom complete stillness is not feasible [64] [35].

Low-Channel-Count Constraints

Traditional artifact removal techniques like Independent Component Analysis (ICA) and Principal Component Analysis (PCA) require sufficient spatial sampling for effective source separation, making them less effective for wearable systems with typically sixteen or fewer channels [66] [35]. This limitation necessitates development of specialized single-channel and reduced-channel artifact management pipelines that can operate without the spatial information available in high-density systems.

Table 1: Impact of Wearable EEG Constraints on Signal Quality

Technical Challenge	Primary Effects on Signal Quality	Common Artifact Types	Typical Performance Metrics
Dry Electrodes	Higher baseline impedance, reduced signal stability, susceptibility to movement	Motion artifacts, electrode pop, impedance fluctuations	Set-up time: 4.02 min (dry) vs 6.36 min (wet) [63]
Subject Mobility	Low-frequency drift, muscle artifact contamination, non-stationary noise	Motion artifacts, EMG, EOG (from head movement)	False detection rate: 0.290/hr (ambulatory) vs 0.136/hr (EMU) [64]
Low Channel Count	Limited spatial information, reduced effectiveness of source separation	All artifact types, but with fewer remediation options	Accuracy: 71% (when clean signal is reference); Selectivity: 63% [66]

EOG vs EMG Artifacts: Comparative Analysis

Artifacts originating from ocular (EOG) and muscular (EMG) activity represent the most common physiological contaminants in wearable EEG. Understanding their distinct characteristics is fundamental to developing effective artifact management strategies.

Origin and Physiological Mechanisms

EOG artifacts arise from eye movements and blinks, generating electrical potentials through the corneo-retinal dipole. These artifacts primarily affect prefrontal and frontal regions in wearable EEG configurations [15] [68]. The primary generators are:

Eye blinks: Voluntary or involuntary eyelid closure producing biphasic waveforms
Saccades: Rapid eye movements during visual tracking
Vertical and horizontal eye movements: Slow ocular drifts

EMG artifacts originate from facial, scalp, and neck muscle activity during head movement, speech, or expression. These artifacts manifest as high-frequency bursts across multiple channels, with specific patterns depending on the muscle groups involved [66] [67]. In wearable EEG, common sources include:

Frontalis muscle activity: Affecting forehead electrodes
Temporalis muscle contraction: Impacting temporal electrodes
Sternocleidomastoid activation: From head turning or flexion

Temporal and Spectral Characteristics

EOG artifacts exhibit low-frequency, high-amplitude characteristics, typically dominating the 0.5-12 Hz range [15]. This spectral overlap with delta, theta, and alpha neural oscillations makes filtering particularly challenging without neural signal loss. EMG artifacts, in contrast, occupy higher frequencies (roughly 20-300 Hz) and overlap with gamma-band activity [67]. The table below summarizes the key differentiating features:

Table 2: Characteristic Differences Between EOG and EMG Artifacts

Characteristic	EOG Artifacts	EMG Artifacts
Frequency Range	0.5-12 Hz [15]	20-300 Hz [67]
Spectral Overlap	Delta, theta, alpha bands	Gamma band and higher frequencies
Amplitude	High (often 5-10× neural signals)	Variable (can exceed 10× neural signals)
Duration	100-400 ms (blinks), variable (movements)	Bursts (50-500 ms) or sustained
Topography	Prefrontal/frontal dominance [68]	Widespread, muscle-dependent
Morphology	Smooth, biphasic (blinks)	Irregular, spike-like
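The spectral contrast in Table 2 can be illustrated with a short Python sketch that measures how much of a segment's power falls in the EOG range (0.5-12 Hz) versus the EMG range (20-300 Hz). The function names, the 60% decision fraction, and the synthetic test signals are assumptions introduced for this example; they are not taken from the cited studies, and a real classifier would also use topography and morphology.

```python
import numpy as np

def band_power_fraction(x, fs, f_lo, f_hi):
    """Fraction of the non-DC spectral power of x that lies in [f_lo, f_hi] Hz."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                                  # remove DC offset
    power = np.abs(np.fft.rfft(x)) ** 2               # unnormalised periodogram
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    total = power[freqs > 0].sum()
    if total == 0:
        return 0.0
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    return float(power[in_band].sum() / total)

def label_segment(x, fs, min_fraction=0.6):
    """Label a segment by where most of its power sits (illustrative rule only)."""
    low = band_power_fraction(x, fs, 0.5, 12.0)                   # EOG range, Table 2
    high = band_power_fraction(x, fs, 20.0, min(300.0, fs / 2))   # EMG range, capped at Nyquist
    if low >= min_fraction:
        return "EOG-like"
    if high >= min_fraction:
        return "EMG-like"
    return "unclear"

fs = 1000                                        # Hz, assumed sampling rate
t = np.arange(2 * fs) / fs                       # 2 s of samples
slow_wave = 100 * np.sin(2 * np.pi * 2 * t)      # 2 Hz, blink-like frequency
fast_burst = 30 * np.sin(2 * np.pi * 80 * t)     # 80 Hz, muscle-like frequency
print(label_segment(slow_wave, fs), label_segment(fast_burst, fs))
```

A segment with comparable power in both ranges is returned as "unclear", which reflects the mixed-contamination case where spectral cues alone are not sufficient.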

Impact on Wearable EEG Applications

The relative impact of EOG and EMG artifacts depends on the application. For clinical trials focusing on resting-state quantitative EEG or P300 evoked responses, EOG artifacts present greater challenges due to their spectral overlap with these signals [65]. In contrast, for motor imagery BCIs or active movement paradigms, EMG contamination becomes the primary concern [67] [35].

Artifact Management Strategies

Detection and Identification Algorithms

Modern artifact management pipelines employ diverse detection strategies tailored to wearable EEG constraints:

EOG-Specific Detection Methods:

Fixed Frequency Empirical Wavelet Transform (FF-EWT): Automatically identifies EOG-contaminated components using kurtosis, dispersion entropy, and power spectral density metrics [15]
Variational Mode Extraction (VME): Targets specific frequency bands associated with ocular artifacts [68]
Threshold-based decision rules: Apply amplitude and frequency thresholds to identify blink-related peaks [66]
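A threshold-based decision rule of the kind listed above can be sketched as follows. This is a minimal illustration, assuming a 100 µV amplitude threshold and the 100-400 ms blink duration from Table 2; the function name, the thresholds, and the synthetic signal are hypothetical and do not reproduce the detector of any cited study.

```python
import numpy as np

def detect_blinks(x, fs, amp_thresh_uv=100.0, min_dur_s=0.1, max_dur_s=0.4):
    """Return (start, end) sample indices of supra-threshold runs with blink-like duration.

    `end` is exclusive. Runs shorter or longer than the duration window are ignored.
    """
    above = np.abs(np.asarray(x, dtype=float)) > amp_thresh_uv
    # Pad with False so that every run has both a rising and a falling edge.
    padded = np.concatenate(([False], above, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    events = []
    for s, e in zip(starts, ends):
        duration = (e - s) / fs
        if min_dur_s <= duration <= max_dur_s:
            events.append((int(s), int(e)))
    return events

fs = 250                                          # Hz, assumed sampling rate
t = np.arange(5 * fs) / fs
background = 20 * np.sin(2 * np.pi * 10 * t)      # alpha-like activity, 20 uV
blink = np.zeros_like(t)
n_blink = int(0.3 * fs)                           # 300 ms blink-like deflection
blink[2 * fs:2 * fs + n_blink] = 300 * np.sin(np.pi * np.arange(n_blink) / n_blink)
events = detect_blinks(background + blink, fs)
print(events)
```

The duration window is what separates blinks from brief electrode pops and from long drifts; with amplitude alone, all three would be flagged.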

EMG-Specific Detection Methods:

Dispersion Entropy: Effective for identifying non-linear characteristics of muscle activity [66]
Wavelet Transform Analysis: Captures time-frequency properties of transient muscle bursts [66] [67]
Deep Learning Approaches: Emerging methods showing promise for muscular and motion artifacts, with applications in real-time settings [66]
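Dispersion entropy, mentioned above as a muscle-activity marker, can be computed in a few lines. The sketch below follows the commonly used definition (normal-CDF mapping to c classes, embedding dimension m, Shannon entropy of the resulting patterns, normalised by ln(c^m)); the default parameters and the synthetic signals are assumptions for illustration, not settings reported in [66].

```python
import math
import numpy as np

def dispersion_entropy(x, m=2, c=6, delay=1):
    """Normalised dispersion entropy in [0, 1] of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    sigma = x.std()
    if sigma == 0:
        return 0.0                                # constant signal: a single pattern
    mu = x.mean()
    # 1) Map samples to (0, 1) with the normal cumulative distribution function.
    y = np.array([0.5 * (1 + math.erf((v - mu) / (sigma * math.sqrt(2)))) for v in x])
    # 2) Quantise to integer classes 1..c.
    z = np.clip(np.rint(c * y + 0.5), 1, c).astype(int)
    # 3) Build embedding vectors of length m with the given delay.
    n_vec = x.size - (m - 1) * delay
    patterns = np.stack([z[k * delay:k * delay + n_vec] for k in range(m)], axis=1)
    # 4) Relative frequency of each observed dispersion pattern.
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    # 5) Shannon entropy, normalised by the maximum ln(c**m).
    return float(-(p * np.log(p)).sum() / (m * math.log(c)))

fs = 250
t = np.arange(8 * fs) / fs
rng = np.random.default_rng(0)
irregular = rng.standard_normal(t.size)           # broadband, EMG-like irregularity
smooth = np.sin(2 * np.pi * 2 * t)                # slow, regular wave
de_irregular = dispersion_entropy(irregular)
de_smooth = dispersion_entropy(smooth)
print(de_irregular > de_smooth)
```

Irregular broadband activity visits nearly all class patterns and scores close to 1, whereas a smooth slow wave only moves between neighbouring classes and scores lower.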

Auxiliary Sensor Integration: Inertial Measurement Units (IMUs) and dedicated EOG sensors provide reference signals to enhance artifact detection, though they remain underutilized despite their potential in ecological conditions [66]. The StimEMG system exemplifies this approach with a stimulation artifact generation (SAG) circuit and Recursive Least Squares (RLS) adaptive filter for real-time removal of time-varying electrical stimulation artifacts [69].
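The idea of using a reference signal with a Recursive Least Squares filter can be sketched generically. The code below is a textbook RLS noise canceller written for illustration; it is not the StimEMG implementation, and the tap count, forgetting factor, initialisation constant, and synthetic signals are all assumptions.

```python
import numpy as np

def rls_cancel(primary, reference, n_taps=4, lam=0.995, delta=100.0):
    """Remove the part of `primary` that is linearly predictable from `reference`.

    Returns the artifact-reduced signal and the final filter weights.
    """
    primary = np.asarray(primary, dtype=float)
    reference = np.asarray(reference, dtype=float)
    w = np.zeros(n_taps)                      # adaptive filter weights
    P = np.eye(n_taps) * delta                # inverse correlation matrix estimate
    cleaned = np.zeros_like(primary)
    for n in range(primary.size):
        k = min(n_taps, n + 1)
        u = np.zeros(n_taps)
        u[:k] = reference[n - k + 1:n + 1][::-1]   # newest reference sample first
        Pu = P @ u
        gain = Pu / (lam + u @ Pu)
        err = primary[n] - w @ u              # a priori error = cleaned output sample
        w = w + gain * err
        P = (P - np.outer(gain, Pu)) / lam
        cleaned[n] = err
    return cleaned, w

rng = np.random.default_rng(1)
fs, n = 1000, 4000
t = np.arange(n) / fs
wanted = np.sin(2 * np.pi * 10 * t)                       # signal of interest
reference = rng.standard_normal(n)                        # artifact reference channel
delayed = np.concatenate(([0.0], reference[:-1]))
artifact = 5 * (0.8 * reference + 0.3 * delayed)          # artifact as seen in the recording
cleaned, weights = rls_cancel(wanted + artifact, reference)
residual_rms = np.sqrt(np.mean((cleaned[n // 2:] - wanted[n // 2:]) ** 2))
print(weights.round(2), residual_rms)
```

A forgetting factor below 1 lets the weights track a slowly changing artifact path, which is the property that makes RLS attractive for time-varying artifacts; the first few hundred samples are less well cleaned because the filter has not yet converged.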

Removal and Filtering Techniques

EOG-Specific Removal Methods:

FF-EWT + GMETV Filter: Combines Fixed Frequency Empirical Wavelet Transform with a Generalized Moreau Envelope Total Variation filter, demonstrating substantial improvements with lower Relative Root Mean Square Error (RRMSE) and higher Correlation Coefficient (CC) on synthetic data [15]
VME-GMETV Model: Integrates Variational Mode Extraction with GMETV filtering, achieving averaged RRMSE of 0.1557 and CC of 0.9695 on real EEG data [68]
Single-Channel EOG Removal: Automated methods requiring no reference input, addressing the limitations of wearable EEG's reduced channel counts [15]

EMG-Specific Removal Methods:

ASR-based Pipelines: Effectively manage muscular artifacts through automated sensor noise suppression [66]
StimEMG System: Implements SAG-RLS method, achieving higher correlation (R² = 0.98±0.0044) between denoised EMG and clean EMG compared to traditional methods (R² = 0.65±0.3217) [69]
Real-time Adaptive Filtering: Particularly valuable for functional electrical stimulation applications where stimulation artifacts vary over time [69]

Table 3: Performance Comparison of Artifact Removal Techniques

Method	Artifact Type	Performance Metrics	Limitations/Requirements
FF-EWT + GMETV [15]	EOG	Improved SAR and MAE on real EEG	Requires parameter tuning
VME-GMETV [68]	EOG	RRMSE: 0.1557, CC: 0.9695	Less complex than EMD/EEMD approaches
SAG-RLS [69]	EMG (Stimulation)	R²: 0.98±0.0044, SNR: 12.83±2.1745	Requires hardware integration
Wavelet Transform + ICA [66]	Ocular and Muscular	Accuracy: 71%, Selectivity: 63%	Limited effectiveness with low channel count
Deep Learning Approaches [66]	Muscular and Motion	Promising for real-time applications	Requires extensive training data

Emerging Approaches and Future Directions

Deep learning approaches are emerging as powerful tools for artifact management, particularly for muscular and motion artifacts where traditional methods face limitations in low-channel-count configurations [66]. These data-driven methods can learn complex, non-linear relationships between artifact-contaminated signals and underlying neural activity without explicit feature engineering.

The 2025 EEG Foundation Challenge focuses on cross-task transfer learning and subject-invariant representation, aiming to develop robust models that generalize across different subjects and cognitive paradigms [70]. Such approaches hold promise for creating more adaptive artifact management pipelines that maintain performance across diverse usage scenarios.

Future research directions include improved integration of auxiliary sensors, development of standardized benchmarking datasets, and creation of modular pipelines that automatically select optimal processing strategies based on artifact type and signal characteristics [66] [35].

Experimental Protocols and Methodologies

Protocol for Validating Dry-Electrode Performance

Objective: Benchmark dry-electrode EEG devices against standard wet-EEG systems for clinical trial applications [65].

Equipment:

Dry-electrode EEG devices (e.g., DSI-24, Quick-20R, zEEG)
Standard wet-EEG system (e.g., QuikCap Neo Net with Grael amplifier)
Timing apparatus for set-up and clean-up duration measurements
Participant comfort rating scales (0-10 visual analog scales)

Procedure:

Recruit 32+ healthy participants representing target demographic
Perform recordings on two separate days with device order randomized
Measure set-up time (preparation to recording start)
Measure clean-up time (recording end to fully cleaned set-up)
Administer structured questionnaires to participants and technicians after each session
Record resting state and task-based EEG (P300, auditory/visual paradigms)

Validation Metrics:

Quantitative EEG parameters (resting state power, P300 amplitude/latency)
Set-up and clean-up time comparisons
Comfort ratings across devices and over time
Technician ease-of-use ratings (0-10 scale)

This protocol revealed that while dry-electrode EEG sped up experiments (set-up time approximately 50% faster), standard EEG remained the most comfortable option, and dry-electrode performance varied significantly across different neural metrics [65].

Protocol for EOG Artifact Removal Validation

Objective: Evaluate performance of EOG artifact removal algorithms on single-channel EEG [15] [68].

Equipment:

Portable EEG system with prefrontal placement
Simultaneous EOG recording for ground truth (optional)
Synthetic EEG dataset with controlled EOG contamination

Procedure:

Record or generate EEG data with EOG artifacts
Apply proposed algorithm (e.g., FF-EWT + GMETV or VME-GMETV)
Compare output with clean reference signal
Calculate performance metrics across multiple trials

Validation Metrics:

Relative Root Mean Square Error (RRMSE) = RMS(clean EEG - denoised EEG)/RMS(clean EEG)
Correlation Coefficient (CC) = covariance(clean EEG, denoised EEG)/(σclean × σdenoised)
Mean Absolute Error (MAE) of power spectrum
Signal-to-Artifact Ratio (SAR) improvement

Using this protocol, the VME-GMETV method achieved RRMSE of 0.1557 and CC of 0.9695, outperforming existing techniques [68].
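The RRMSE and CC definitions listed under Validation Metrics translate directly into code. The following is a minimal sketch; the function names and the toy signals are assumptions for illustration, and the printed values are unrelated to the results reported in [68].

```python
import numpy as np

def rms(x):
    """Root mean square of a 1-D signal."""
    return float(np.sqrt(np.mean(np.square(x))))

def rrmse(clean, denoised):
    """Relative RMSE: RMS(clean - denoised) / RMS(clean). Lower is better."""
    clean = np.asarray(clean, dtype=float)
    denoised = np.asarray(denoised, dtype=float)
    return rms(clean - denoised) / rms(clean)

def correlation_coefficient(clean, denoised):
    """Pearson correlation: cov(clean, denoised) / (sigma_clean * sigma_denoised)."""
    c = np.asarray(clean, dtype=float) - np.mean(clean)
    d = np.asarray(denoised, dtype=float) - np.mean(denoised)
    return float(np.sum(c * d) / np.sqrt(np.sum(c ** 2) * np.sum(d ** 2)))

t = np.arange(1000) / 250.0
clean = np.sin(2 * np.pi * 10 * t)
attenuated = 0.9 * clean              # toy "denoised" output that loses 10% amplitude
print(rrmse(clean, attenuated), correlation_coefficient(clean, attenuated))
```

The toy case shows why both metrics are reported together: a uniformly attenuated output keeps a perfect correlation while still incurring a 10% relative error.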

Protocol for Reduced-Channel Seizure Detection Validation

Objective: Validate automated seizure detection algorithms for reduced-channel wearable EEG [64].

Equipment:

Reduced-channel wearable EEG system (e.g., 4-channel REMI sensors)
Simultaneous full-montage wired EEG for ground truth
Automated detection algorithm (e.g., Epitel's REMI Vigilenz AI)

Procedure:

Collect data from 50+ participants (31+ with seizures)
Place reduced-channel sensors bilaterally between F7-Fp1, F8-Fp2, T5-T3, T6-T4
Record simultaneous reduced-channel and full-montage EEG
Determine ground truth seizure annotations via three-expert consensus
Apply automated detection algorithm to reduced-channel data only
Compare algorithm outputs with expert consensus

Validation Metrics:

Event-level sensitivity = true positives/(true positives + false negatives)
False detection rate = false positives per hour
Participant-level mean sensitivity and false detection rate
Performance variation by seizure type (focal, generalized, focal evolving to generalized)

This validation demonstrated 86.2% event-level sensitivity with a false detection rate of 0.162 per hour, comparable to clinical software for full-montage EEG [64].
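The event-level and participant-level metrics defined above can be computed as in the sketch below. The per-participant counts are hypothetical values chosen only to exercise the formulas; they are not data from [64].

```python
def event_sensitivity(true_positives, false_negatives):
    """Sensitivity = TP / (TP + FN). Undefined when there are no seizures."""
    return true_positives / (true_positives + false_negatives)

def false_detection_rate(false_positives, recording_hours):
    """False detections per hour of recording."""
    return false_positives / recording_hours

# Hypothetical per-participant tallies (illustrative only).
participants = [
    {"tp": 9, "fn": 1, "fp": 3, "hours": 24.0},
    {"tp": 4, "fn": 1, "fp": 2, "hours": 20.0},
]

# Event-level: pool all events across participants before dividing.
pooled_sensitivity = event_sensitivity(
    sum(p["tp"] for p in participants), sum(p["fn"] for p in participants)
)
pooled_fdr = false_detection_rate(
    sum(p["fp"] for p in participants), sum(p["hours"] for p in participants)
)

# Participant-level: compute each metric per participant, then average.
mean_sensitivity = sum(event_sensitivity(p["tp"], p["fn"]) for p in participants) / len(participants)
mean_fdr = sum(false_detection_rate(p["fp"], p["hours"]) for p in participants) / len(participants)

print(pooled_sensitivity, pooled_fdr, mean_sensitivity, mean_fdr)
```

Pooled and participant-averaged values generally differ because participants with many events dominate the pooled figure, which is why both levels are worth reporting.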

The Scientist's Toolkit: Research Reagent Solutions

Table 4: Essential Tools for Wearable EEG Artifact Research

Tool/Category	Specific Examples	Function/Application	Key Characteristics
Dry-Electrode EEG Systems	DSI-24 (Wearable Sensing), Quick-20R (CGX), zEEG (Zeto) [65]	Signal acquisition with minimal set-up	Ultra-high impedance amplifiers (>47 GOhms), hair-penetrating electrodes
Artifact Removal Algorithms	FF-EWT + GMETV [15], VME-GMETV [68], SAG-RLS [69]	Targeted removal of specific artifact types	Reference-free operation, single-channel capability
Validation Datasets	HBN-EEG Dataset [70], Synthetic EEG with controlled artifacts [15]	Algorithm benchmarking and comparison	Includes diverse tasks, ground truth annotations
Auxiliary Sensors	IMU sensors, EOG electrodes, PPG sensors [66] [63]	Reference signals for artifact detection	Multi-modal data synchronization
Performance Metrics	RRMSE, CC, MAE, SAR [15] [68], Sensitivity, False Detection Rate [64]	Quantitative algorithm assessment	Standardized comparison across studies

Workflow Diagrams

Generalized Artifact Management Pipeline

Artifact Management Workflow: This diagram illustrates the comprehensive pipeline for managing artifacts in wearable EEG systems, from data acquisition through final analysis, highlighting the critical role of artifact classification in determining appropriate removal strategies.

EOG-Specific Removal Method

EOG Artifact Removal Process: This diagram details the VME-GMETV method for EOG artifact removal, showing how artifact segments are identified and processed separately from neural signals before reconstruction.

Addressing the challenges of dry electrodes, motion artifacts, and low channel counts in wearable EEG requires specialized approaches that account for the distinct characteristics of EOG and EMG artifacts. The methodologies and experimental protocols presented in this guide provide researchers with validated tools for managing these challenges while maintaining signal integrity.

Future progress in wearable EEG technology will depend on continued development of adaptive algorithms that can dynamically respond to changing artifact profiles in real-world environments. The integration of multimodal sensing, machine learning, and standardized validation frameworks will be essential for achieving the full potential of wearable EEG in both clinical and research applications.

The analysis of electroencephalography (EEG) signals is fundamentally constrained by the presence of physiological artifacts, primarily electrooculographic (EOG) and electromyographic (EMG) contaminants. These artifacts introduce large-amplitude, non-neural signals that can severely compromise data integrity, leading to misinterpretation of brain activity and reducing the validity of neuroscientific findings and brain-computer interface (BCI) applications [2]. Effective artifact management is therefore a critical prerequisite for any EEG-based research or clinical application. Within a broader thesis on EOG versus EMG artifacts research, this guide provides an in-depth examination of the three principal strategic approaches for handling these contaminants: artifact rejection, which involves completely removing contaminated data segments; artifact correction, which aims to remove artifacts while preserving neural signals; and robust decoding, which utilizes analysis methods resistant to artifact interference [20] [2].

The strategic selection between these approaches carries significant implications for data quality, interpretability, and practical utility. A systematic review of EEG artifact management reveals that most processing pipelines integrate detection and removal phases but rarely separate their impact on performance metrics [20]. Furthermore, the properties of EOG and EMG artifacts differ substantially, necessitating tailored approaches. EOG artifacts, generated by eye movements and blinks, typically manifest as low-frequency, high-amplitude frontal signals. In contrast, EMG artifacts originating from muscle activity are characterized by high-frequency components with a broad spatial distribution [2] [24]. These distinctions fundamentally influence optimal strategy selection across different research contexts and application domains.

Characterizing EOG and EMG Artifacts

A comprehensive understanding of artifact origins and characteristics forms the foundation for selecting appropriate management strategies. EOG and EMG artifacts exhibit distinct spatial, temporal, and spectral properties that necessitate different handling approaches.

EOG Artifacts primarily arise from two sources: eye blinks and eye movements. The corneo-retinal potential (approximately 100 mV) creates an electrical field that changes with eye position and lid movement, generating potentials that spread across the scalp and contaminate EEG recordings [2]. These artifacts are characterized by their large amplitude (often exceeding EEG amplitudes by 10-100 times), frontal dominance with maximum amplitude at frontal electrodes, and low-frequency content typically below 4 Hz, though blinks can contain higher frequencies [71] [2]. The stereotypical morphology of eye blinks presents a recognizable pattern that facilitates detection and separation.

EMG Artifacts originate from the electrical activity of cranial, facial, neck, and shoulder muscles during contraction. Unlike EOG artifacts, EMG contaminants display markedly different characteristics: they typically manifest as high-frequency activity (20-200 Hz) with a broad spectrum that significantly overlaps with neural gamma oscillations, irregular morphology with rapid spikes, and variable spatial distribution depending on the activated muscle groups [67] [2]. Facial expressions, jaw clenching, head movements, and swallowing can all generate EMG artifacts that locally or broadly contaminate EEG signals.

Table 1: Comparative Characteristics of EOG and EMG Artifacts

Characteristic	EOG Artifacts	EMG Artifacts
Origin	Corneo-retinal potential; eye movements/blinks	Muscle fiber contractions (face, neck, jaw)
Spectral Properties	Low-frequency (0.5-12 Hz) [71]	Broad-spectrum, high-frequency (20-200 Hz) [67]
Spatial Distribution	Primarily frontal electrodes	Widespread, muscle-dependent
Amplitude	High-amplitude, slow waves	Variable, often high-amplitude spikes
Typical Morphology	Stereotypical, monophasic/biphasic	Irregular, polyphasic spikes
Main Challenges	Overlap with delta/theta bands; volume conduction	Spectral overlap with gamma activity; unpredictable patterns

Artifact Rejection Strategies

Principles and Methodologies

Artifact rejection operates on a straightforward principle: complete removal of data segments identified as contaminated by artifacts. This all-or-nothing approach ensures that only data of the highest quality undergo further analysis, eliminating the risk of residual artifact contamination influencing results. The fundamental strength of this strategy lies in its simplicity and certainty—by completely excluding contaminated epochs, researchers avoid the complex signal separation problems inherent in correction approaches [2].

The technical implementation of artifact rejection typically involves a multi-stage process. First, algorithms scan continuous EEG data to identify segments that exceed predefined amplitude thresholds (e.g., ±100 μV), display abnormal statistical properties (kurtosis, skewness), or show atypical spectral content. Advanced detection methods may employ machine learning classifiers trained on annotated artifact datasets to improve detection accuracy [20]. For EOG artifacts, the stereotypical morphology of blinks facilitates pattern-based detection, while EMG artifacts often require power-based detection in specific frequency bands due to their less predictable morphology [2].
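The amplitude and statistical criteria described above can be combined in a simple epoch-rejection sketch. The ±100 µV threshold comes from the text; the kurtosis cutoff of 5, the function names, and the synthetic epochs are assumptions made for illustration.

```python
import numpy as np

def excess_kurtosis(x):
    """Excess kurtosis (0 for a Gaussian) of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.mean(x ** 2)
    if var == 0:
        return 0.0
    return float(np.mean(x ** 4) / var ** 2 - 3.0)

def keep_mask(epochs, amp_thresh_uv=100.0, kurt_thresh=5.0):
    """Boolean mask over epochs (n_epochs, n_samples): True means the epoch is kept."""
    epochs = np.asarray(epochs, dtype=float)
    too_large = np.max(np.abs(epochs), axis=1) > amp_thresh_uv       # amplitude criterion
    too_peaky = np.array([excess_kurtosis(e) > kurt_thresh for e in epochs])  # statistical criterion
    return ~(too_large | too_peaky)

fs = 250
t = np.arange(fs) / fs                               # 1 s epochs
base = 10 * np.sin(2 * np.pi * 10 * t)               # clean alpha-like epoch, 10 uV
epochs = np.tile(base, (5, 1))
epochs[2, 100:175] += 300 * np.sin(np.pi * np.arange(75) / 75)   # blink-like deflection
epochs[4, 125] += 80                                 # brief spike below the amplitude threshold
keep = keep_mask(epochs)
clean_epochs = epochs[keep]
print(keep, clean_epochs.shape)
```

The spike in the last epoch stays under the amplitude threshold and is caught only by the kurtosis criterion, which illustrates why the two tests are usually applied together.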

Experimental Protocols and Applications

A standardized protocol for automated artifact rejection utilizing the AUTOREJECT toolbox implements the following methodology. First, EEG data is segmented into epochs based on experimental events. Second, amplitude thresholds are automatically optimized across the dataset using cross-validation, typically set at ±100-150 μV for adult human EEG. Third, statistical outlier detection is performed based on joint probabilities across channels and epochs. Finally, rejected epochs are completely excluded from subsequent analysis, and only clean data is retained for decoding or ERP analysis [72].

The effectiveness of rejection strategies varies significantly between artifact types. Research demonstrates that epoch rejection performs optimally for large-amplitude, transient artifacts like eye blinks, which typically affect limited time periods. However, this approach proves less effective for sustained EMG artifacts, which may persist throughout recordings and lead to excessive data loss when applied stringently [72] [2]. This fundamental limitation has motivated the development of alternative approaches, particularly for applications where data preservation is critical.

Artifact Correction Approaches

Technical Foundations

Artifact correction methodologies aim to remove contaminating artifacts while preserving underlying neural signals, operating on the principle that artifactual and neural components can be separated within contaminated data. Unlike rejection strategies, correction approaches preserve data continuity and volume, making them particularly valuable when data retention is paramount or when artifacts pervade significant portions of the recording [7].

The correction landscape encompasses both traditional and emerging computational approaches. Regression methods use reference EOG or EMG channels to estimate and subtract artifact contributions from EEG signals, though these require additional recording hardware and assume reference independence from neural signals [2]. Blind source separation (BSS) techniques, particularly independent component analysis (ICA), have gained prominence for their ability to separate neural and artifactual sources based on statistical independence without requiring reference signals [72] [7]. Recently, deep learning approaches have demonstrated remarkable capability in learning complex artifact features directly from data, showing particular promise for handling unknown artifact types and multi-channel configurations [24].
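The regression approach described above can be sketched with ordinary least squares: propagation coefficients from a reference EOG channel to each EEG channel are estimated and the scaled reference is subtracted. The function name, channel gains, and synthetic data are assumptions for illustration; as noted in the text, the method presumes that the reference channel carries no neural activity.

```python
import numpy as np

def regress_out(eeg, ref):
    """Subtract the least-squares projection of reference channels from EEG.

    eeg: (n_channels, n_samples), ref: (n_refs, n_samples).
    Returns the corrected (mean-centred) EEG and the (n_channels, n_refs) coefficients.
    """
    eeg = np.asarray(eeg, dtype=float)
    ref = np.asarray(ref, dtype=float)
    eeg_c = eeg - eeg.mean(axis=1, keepdims=True)
    ref_c = ref - ref.mean(axis=1, keepdims=True)
    # Solve ref_c.T @ B ~= eeg_c.T for B in the least-squares sense.
    B, *_ = np.linalg.lstsq(ref_c.T, eeg_c.T, rcond=None)
    coef = B.T
    return eeg_c - coef @ ref_c, coef

rng = np.random.default_rng(0)
fs = 250
t = np.arange(8 * fs) / fs
eog = 100 * np.sin(2 * np.pi * 0.5 * t)                 # slow, high-amplitude ocular signal
neural = 10 * rng.standard_normal((2, t.size))          # stand-in for brain activity
true_gains = np.array([[0.8], [0.2]])                   # frontal channel picks up more EOG
eeg = neural + true_gains * eog
corrected, coef = regress_out(eeg, eog[np.newaxis, :])
print(coef.ravel().round(3))
```

Because any neural activity present in the reference would be subtracted as well, the estimated coefficients are only as trustworthy as the independence assumption behind them.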

Advanced Correction Methodologies

ICA-based correction implements a multi-stage protocol beginning with data decomposition via algorithms like Infomax or FastICA, which separate multichannel EEG into statistically independent components. Subsequent component classification identifies artifactual components based on spatial, temporal, and spectral features—EOG components typically show frontal topography and low-frequency dominance, while EMG components display high-frequency content and muscle-specific topographies [7]. Finally, artifact-free EEG is reconstructed by excluding artifactual components from the inverse transformation. A significant advancement, wavelet-enhanced ICA (wICA), improves upon standard ICA by applying discrete wavelet transform to artifact components, thresholding wavelet coefficients to remove only artifactual elements while preserving neural data within the component, thus minimizing information loss [7].
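The final two stages of the protocol above, component exclusion and back-projection, can be sketched with plain linear algebra. The example assumes the decomposition has already been obtained; here the mixing matrix is known by construction rather than estimated with Infomax or FastICA, and the peak-to-peak rule used to flag the artifact component is a hypothetical stand-in for a proper classifier.

```python
import numpy as np

def remove_components(eeg, mixing, artifact_idx):
    """Zero the listed components and back-project to channel space.

    eeg: (n_channels, n_samples); mixing: (n_channels, n_components).
    """
    unmixing = np.linalg.pinv(mixing)
    components = unmixing @ eeg                  # component time courses
    components[list(artifact_idx), :] = 0.0      # exclude artifactual components
    return mixing @ components                   # inverse transformation

fs = 250
t = np.arange(4 * fs) / fs
sources = np.vstack([
    10 * np.sin(2 * np.pi * 10 * t),                       # alpha-like source
    8 * np.sin(2 * np.pi * 6 * t),                         # theta-like source
    200 * np.exp(-((t - 2.0) ** 2) / (2 * 0.1 ** 2)),      # blink-like pulse
])
# Hypothetical mixing matrix: rows are channels (frontal, central, occipital).
mixing = np.array([[0.4, 0.2, 1.0],
                   [1.0, 0.3, 0.4],
                   [0.3, 1.0, 0.1]])
eeg = mixing @ sources

components = np.linalg.pinv(mixing) @ eeg
peak_to_peak = components.max(axis=1) - components.min(axis=1)
artifact_idx = int(np.argmax(peak_to_peak))      # illustrative flagging rule
cleaned = remove_components(eeg, mixing, [artifact_idx])
print(artifact_idx, np.abs(cleaned).max())
```

In practice the mixing matrix is estimated from the data, so component separation is imperfect and a flagged component may still carry some neural activity, which is the motivation for the wICA refinement described above.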

Deep learning approaches represent the frontier of artifact correction technology. The CLEnet architecture exemplifies modern solutions, implementing a dual-branch network combining convolutional neural networks (CNN) for morphological feature extraction and long short-term memory (LSTM) networks for temporal dependency modeling [24]. This architecture incorporates an improved EMA-1D (One-Dimensional Efficient Multi-Scale Attention Mechanism) to capture cross-dimensional interactions, enabling superior performance in removing both known and unknown artifacts from multi-channel EEG data. Experimental validation demonstrates that CLEnet achieves signal-to-noise ratio improvements of 2.45-5.13% over competing methods while reducing temporal and frequency domain errors by 3.30-8.08% across diverse artifact types [24].

Table 2: Performance Metrics of Artifact Correction Methods Across Artifact Types

Correction Method	Artifact Type	Performance Metrics	Advantages	Limitations
ICA + Component Rejection	EOG, EMG	Component classification accuracy: 85-92% [7]	No reference channels needed; preserves neural components	Risk of neural data loss; requires multiple channels
Wavelet-Enhanced ICA (wICA)	EOG	Signal-to-Error Ratio: 12.8 dB; Artifact-to-Residue Ratio: 2.1 [7]	Selective artifact removal; minimal neural loss	Complex parameter optimization
Ci_SSA + Wavelet + Clustering	EOG (single-channel)	Correlation Coefficient: 0.94; RRMSE: 0.15 [71]	Effective for single-channel setups; preserves non-artifact regions	Limited validation for EMG artifacts
Deep Learning (CLEnet)	EOG, EMG, Mixed	SNR: 11.50 dB; CC: 0.925; RRMSEt: 0.300 [24]	Handles unknown artifacts; multi-channel capability	Computational intensity; requires large training datasets
SAG-RLS Adaptive Filter	EMG (stimulation artifacts)	R²: 0.98±0.0044; SNR: 12.83±2.17 [69]	Real-time capability; effective for time-varying artifacts	Requires artifact reference signal

Figure 1: Artifact Correction Method Selection Workflow

Robust Decoding Methods

Fundamental Concepts

Robust decoding represents a paradigm shift in artifact management—rather than removing artifacts from data, these methods employ analysis techniques that remain effective despite artifact presence. This approach acknowledges that in many practical applications, complete artifact elimination may be impossible, and instead focuses on developing decoders insensitive to artifact interference [72]. The fundamental principle involves designing classification or regression algorithms that leverage features minimally affected by common artifacts while maintaining sensitivity to neural signals of interest.

Research demonstrates that preprocessing choices significantly influence decoding robustness. A systematic multiverse analysis revealed that higher high-pass filter cutoffs (e.g., 1 Hz vs. 0.1 Hz) consistently increase decoding performance across both neural networks and time-resolved classifiers, likely by attenuating low-frequency EOG artifacts [72]. Conversely, artifact correction steps generally reduce decoding performance, suggesting that some artifacts may provide discriminative information that classifiers exploit for task decoding. This raises critical questions about whether improved decoding accuracy reflects genuine neural processing or merely systematic artifact patterns [72].

Implementation Frameworks

EEGNet, a compact convolutional neural network architecture, demonstrates inherent robustness to artifact interference through its design principles. The network incorporates temporal convolution to learn frequency filters, depthwise convolution to learn spatial filters, and separable convolution to combine features across time and space [72]. This multi-scale feature learning enables the network to prioritize neural patterns while discounting transient artifacts. Experimental evidence shows EEGNet maintains higher performance with minimal preprocessing compared to traditional methods, achieving median accuracies of 0.85 on error-related negativity (ERN) tasks even with unprocessed data [72].

Time-resolved logistic regression classifiers employ a different robustness strategy: a separate classification is performed at each time point, using only the electrode signals from that time point. This approach minimizes the impact of transient artifacts confined to specific time periods, as the classifier can leverage clean data from other time points [72]. For this framework, linear detrending and lower low-pass filter cutoffs significantly increase decoding performance, contrasting with the requirements for EEGNet and highlighting how robustness strategies must be tailored to specific decoding architectures [72].

Strategic Selection Framework

Decision Matrix for Artifact Management

The selection of an optimal artifact management strategy depends on multiple interrelated factors, including artifact type, research objectives, data characteristics, and analytical requirements. A systematic decision framework ensures alignment between methodological choices and experimental goals while maintaining analytical rigor.

Table 3: Decision Matrix for Artifact Management Strategy Selection

Scenario	Recommended Strategy	Rationale	Implementation Considerations
Low artifact prevalence (<10% data affected)	Artifact Rejection	Minimal data loss; maximum signal preservation	Use amplitude/statistical thresholding; validate with visual inspection
High-density EEG (>32 channels)	ICA-Based Correction	Spatial information enables effective source separation	Combine with wavelet enhancement to minimize neural data loss
Single-channel or low-density EEG	Advanced Decomposition Methods (Ci_SSA, EMD, DWT)	Effective without spatial information [71]	Apply clustering for automatic component selection
Real-time BCI applications	Robust Decoding (EEGNet)	No latency for correction; maintains continuous operation [72]	Optimize high-pass filters; avoid artifact correction steps
Unknown or multiple artifact types	Deep Learning Correction (CLEnet)	Adapts to diverse artifacts without prior specification [24]	Requires substantial training data; computationally intensive
Quantitative EEG analysis	Correction Approaches	Preserves data continuity and spectral properties	Validate with artifact-free benchmarks when available
Stimulation artifacts in EMG	SAG-RLS Adaptive Filter	Specifically designed for time-varying stimulation artifacts [69]	Requires artifact reference signal
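The decision matrix in Table 3 can be expressed as a small rule function. This is a sketch only: the table does not rank its scenarios, so the order in which the rules are checked, the function name, and the parameter names are assumptions introduced here, and the quantitative EEG row is omitted because it does not name a single method.

```python
def recommend_strategy(n_channels, artifact_fraction, real_time=False,
                       artifact_type_known=True, stimulation_artifact=False):
    """Map a recording scenario to a Table 3 strategy (rule order is an assumption)."""
    if stimulation_artifact:
        return "SAG-RLS Adaptive Filter"
    if real_time:
        return "Robust Decoding (EEGNet)"
    if artifact_fraction < 0.10:                 # low artifact prevalence
        return "Artifact Rejection"
    if not artifact_type_known:
        return "Deep Learning Correction (CLEnet)"
    if n_channels > 32:                          # high-density EEG
        return "ICA-Based Correction"
    return "Advanced Decomposition Methods (Ci_SSA, EMD, DWT)"

print(recommend_strategy(n_channels=64, artifact_fraction=0.25))
print(recommend_strategy(n_channels=1, artifact_fraction=0.25))
print(recommend_strategy(n_channels=8, artifact_fraction=0.05))
```

Encoding the matrix this way makes the implicit priorities explicit, so a laboratory can adjust the rule order to match its own constraints.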

Experimental Design Considerations

Strategic artifact management begins during experimental design, where proactive measures can significantly reduce artifact burden. For EOG artifacts, implementing controlled fixation intervals, minimizing visually demanding tasks, and providing explicit blink instructions during inter-trial intervals can dramatically reduce ocular contamination [2]. For EMG artifacts, proper participant positioning, head stabilization, and explicit instructions to relax facial, jaw, and neck muscles can substantially decrease muscle-related contamination.

The research context fundamentally influences strategy selection. In clinical diagnostics, where interpretability and signal fidelity are paramount, artifact correction approaches that preserve neural signal integrity are generally preferred [20]. For BCI applications prioritizing classification accuracy, robust decoding methods that maintain continuous operation often prove most effective [72] [2]. In fundamental neuroscience research examining precise neural dynamics, artifact rejection may be preferred when feasible to ensure unambiguous interpretation of neural signals.

Table 4: Essential Research Tools for Artifact Management Research

Tool/Resource	Type	Primary Function	Application Context
MNE-Python	Software Library	EEG processing and analysis	Implementation of preprocessing pipelines, ICA, and machine learning
EEGdenoiseNet	Benchmark Dataset	Model training and validation	Developing and testing deep learning approaches for artifact removal [24]
AUTOREJECT	Software Tool	Automated epoch rejection	Statistical threshold optimization for artifact rejection [72]
ICA Algorithms (Infomax, FastICA)	Computational Method	Blind source separation	Identification and isolation of artifact components [7]
Wavelet Toolboxes	Software Library	Time-frequency decomposition	Enhanced artifact removal using multi-resolution analysis [71] [7]
EEGNet Architecture	Deep Learning Model	Robust decoding	Classification resistant to artifact interference [72]
CLEnet Architecture	Deep Learning Model	Multi-artifact correction	Removal of known and unknown artifacts from multi-channel EEG [24]
ERP CORE Dataset	Benchmark Dataset	Method validation	Standardized evaluation using common ERP components [72]

Figure 2: Artifact Management Strategy Decision Framework

The strategic selection between artifact rejection, correction, and robust decoding approaches represents a critical methodological decision with profound implications for EEG research validity and application effectiveness. Rather than a one-size-fits-all solution, optimal artifact management requires careful consideration of artifact characteristics, research objectives, and practical constraints. EOG and EMG artifacts demand distinct handling approaches due to their fundamentally different spatial, temporal, and spectral properties. The emerging evidence suggests that while artifact correction methods—particularly advanced deep learning approaches—offer powerful solutions for complex contamination scenarios, robust decoding methods may provide superior performance in applied settings where maximum classification accuracy is paramount. Critically, researchers must remain vigilant that strategies which improve decoding performance may do so by exploiting systematic artifact patterns rather than genuine neural signals, potentially compromising interpretability. As wearable EEG applications continue to expand into real-world environments, developing and validating appropriate artifact management strategies tailored to specific research contexts and artifact challenges remains an essential frontier in electrophysiological research.

Tackling Simultaneous Multi-Artifact Contamination with Hybrid Frameworks

Electroencephalogram (EEG) signals are perpetually susceptible to contamination by physiological artifacts, primarily electrooculogram (EOG) from eye movements and electromyogram (EMG) from muscle activity. The research challenge intensifies when these artifacts co-occur, creating a complex, mixed-noise scenario that can severely obscure neural information and compromise data integrity in both clinical and research settings. Traditional single-source artifact removal techniques often fail under these conditions, as they are not designed to handle the overlapping spectral and temporal characteristics of simultaneous contaminants [73] [20].

This whitepaper posits that hybrid frameworks—strategically combining signal processing, machine learning, and auxiliary data—represent the most promising avenue for robust artifact suppression. Framed within a broader thesis contrasting EOG and EMG artifact properties, this guide provides an in-depth technical examination of modern methodologies capable of addressing multi-artifact contamination, with a focus on quantitative performance and reproducible experimental protocols.

The Distinctive Challenges of EOG and EMG Artifacts

Effective removal of simultaneous artifacts first requires an understanding of their distinct properties. EOG artifacts, caused by eye blinks and movements, are typically high-amplitude, low-frequency events (0.5–12 Hz) that predominantly affect frontal EEG channels [15] [71]. In contrast, EMG artifacts, stemming from facial, jaw, or neck muscle tension, are characterized by their broad-spectrum, high-frequency nature (≥20 Hz) and can affect virtually all EEG channels [74] [40]. The core challenge in simultaneous contamination lies in their spatiotemporal overlap, where the low-frequency EOG content and the broad-spectrum EMG energy interfere with the underlying neural signals across multiple frequency bands, making disentanglement exceptionally difficult [73] [20].

Conventional approaches like Independent Component Analysis (ICA) often struggle with this separation. ICA assumes statistical independence of sources, a condition frequently violated when multiple artifacts and brain signals overlap in time and space [74]. This limitation is particularly pronounced in low-density, wearable EEG systems where the number of channels is insufficient for reliable source separation [20].

Quantitative Performance of Hybrid Frameworks

Recent research demonstrates that hybrid frameworks significantly outperform conventional methods in handling multi-artifact scenarios. The table below summarizes the quantitative performance of several advanced hybrid approaches as reported in recent literature.

Table 1: Performance Comparison of Hybrid Multi-Artifact Handling Frameworks

Framework Name	Core Methodology	Artifacts Addressed	Reported Performance Metrics	Key Advantages
Hybrid Spectral-Temporal ML Framework [73]	Low-pass filtering + PSD analysis + PCA feature fusion + lightweight MLP	Simultaneous EOG, EMG, White Noise	>96% classification accuracy (simultaneous); >90% accuracy at SNR 4 dB	High accuracy under low SNR; 97% faster training than CNNs; real-time capable
MultiResUNet3+ [75]	1D Convolutional Neural Network (Deep Learning)	Simultaneous EOG & EMG	94.82% temporal EOG reduction; 83.21% spectral EMG reduction	Superior EOG removal; handles concurrent artifacts; robust & generalizable
Ci_SSA with Wavelet & Clustering [71]	Circular Singular Spectrum Analysis + Morlet Wavelet + Unsupervised Clustering	EOG (Eye-Blink)	High Correlation Coefficient (CC); Low RRMSE & MAE on synthetic data	Preserves non-artifact regions; automated; suitable for single-channel EEG
FF-EWT + GMETV Filter [15]	Fixed Frequency Empirical Wavelet Transform + Generalized Moreau Envelope TV Filter	EOG	Low Relative Root Mean Square Error (RRMSE); High CC; Improved SAR & MAE	Effective low-frequency preservation; validated on real & synthetic data
CNN-LSTM with EMG Reference [40]	Hybrid Convolutional & Recurrent Neural Network using auxiliary EMG	EMG	Improved SSVEP Signal-to-Noise Ratio (SNR) after cleaning	Preserves evoked potentials (SSVEP); uses auxiliary EMG for guided denoising
ERASE (Modified ICA) [74]	ICA with additional real/simulated EMG reference channels	EMG	~75% artifact removal (real EMG reference); ~63% (simulated reference)	"Forces" EMG into fewer components; outperforms conventional ICA

Detailed Experimental Protocols for Key Hybrid Frameworks

Protocol 1: Hybrid Spectral-Temporal ML Framework for Simultaneous Contamination

This protocol is designed for real-time detection and classification of multiple concurrent artifacts in single-channel EEG [73].

Data Preparation & Simulation:
- Clean EEG Database: Obtain clean EEG segments from a public database (e.g., the dataset used in [75]).
- Semi-Synthetic Contamination: Artificially contaminate clean EEG epochs with recorded or simulated EOG and EMG signals at varying Signal-to-Noise Ratios (SNRs), including conditions of simultaneous contamination (EOG+EMG+white noise).
Feature Engineering (The Hybrid Core):
- Time-Domain Processing: Apply a low-pass filter to the contaminated signal to capture low-frequency EOG components.
- Frequency-Domain Processing: Calculate the Power Spectral Density (PSD) of the signal to capture the broad-spectrum characteristics of EMG.
- Feature Fusion & Reduction: Fuse the extracted temporal and spectral features. Use Principal Component Analysis (PCA) to minimize redundancy and create a compact, discriminative feature vector.
Model Training & Validation:
- Classifier: Train a lightweight Multi-Layer Perceptron (MLP) on the fused feature set.
- Validation: Perform cross-validation, reporting accuracy, precision, and recall across different SNR levels. Compare performance against more complex models like CNNs and RNNs.
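
The semi-synthetic contamination step of this protocol can be sketched as follows. This is a minimal illustration, not the procedure of [73]: the sine "EEG", the Gaussian blink template, the white-noise stand-in for EMG, and the function names `mix_at_snr` and `achieved_snr_db` are all assumptions chosen to keep the example self-contained. The artifact is scaled so that the mixture reaches a requested SNR.

```python
import numpy as np

def mix_at_snr(clean, artifact, snr_db):
    """Add `artifact` to `clean`, scaled so the mixture has the requested SNR (dB)."""
    p_clean = np.mean(clean ** 2)
    p_artifact = np.mean(artifact ** 2)
    # SNR_dB = 10*log10(p_clean / (scale**2 * p_artifact)), solved for scale.
    scale = np.sqrt(p_clean / (p_artifact * 10 ** (snr_db / 10)))
    return clean + scale * artifact

def achieved_snr_db(clean, noisy):
    """SNR actually present in `noisy`, given the known clean signal."""
    noise = noisy - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

rng = np.random.default_rng(0)
fs = 256                                   # assumed sampling rate in Hz
t = np.arange(2 * fs) / fs                 # one 2-second epoch
clean = np.sin(2 * np.pi * 10 * t)         # toy stand-in for a clean EEG epoch
blink = np.exp(-((t - 1.0) ** 2) / (2 * 0.1 ** 2))   # slow, blink-like bump
muscle = rng.standard_normal(t.size)       # broadband stand-in (real EMG is not white)
white = rng.standard_normal(t.size)

# Simultaneous contamination: EOG + EMG + white noise in one artifact signal.
combined = blink / blink.std() + muscle + 0.5 * white

for target in (-7, 4):
    noisy = mix_at_snr(clean, combined, target)
    print(target, round(achieved_snr_db(clean, noisy), 2))
```

In practice the toy signals would be replaced by clean EEG epochs and recorded EOG/EMG segments, and the loop would sweep the full SNR range used for validation.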

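The workflow for this hybrid framework follows the three stages listed above: data preparation and simulation, hybrid feature engineering, and model training and validation.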

Protocol 2: Ci_SSA with Wavelet Analysis for EOG-Rich Contamination

This protocol is an automated, single-channel approach ideal for preserving low-frequency brain activity while removing eye-blink artifacts [71].

Data Acquisition:
- Acquire single-channel EEG data. For validation, use real EEG contaminated with natural blinks or create semi-synthetic data by adding clean EOG templates to artifact-free EEG.
Signal Decomposition:
- Apply Circular Singular Spectrum Analysis (Ci_SSA). This subspace-based technique decomposes the single-channel signal into a set of reconstructed components (RCs) that capture different oscillations, including those related to EOG.
Artifact Component Identification:
- Transform the RCs using a Morlet wavelet transform.
- Use unsupervised clustering (e.g., K-means) on features derived from the wavelet coefficients to automatically identify and group components corresponding to eye-blink artifacts, without manual thresholding.
Reconstruction:
- Subtract the artifact-related components identified by the clustering process.
- Reconstruct the clean EEG signal from the remaining components.
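
The decompose, identify, subtract structure of this protocol can be illustrated with basic SSA. Note the substitutions: this sketch uses standard SSA (Hankel embedding plus SVD) rather than the circulant variant, and a simple dominant-frequency rule stands in for the Morlet wavelet and clustering step of [71]. The 4 Hz cutoff, the window length, and the toy signals are assumptions.

```python
import numpy as np

def ssa_components(x, window):
    """Basic SSA: one reconstructed component (RC) per singular value."""
    n = x.size
    k = n - window + 1
    # Trajectory (Hankel) matrix of shape (window, k): column j is x[j:j+window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    rcs = []
    for i in range(s.size):
        elem = s[i] * np.outer(u[:, i], vt[i])
        flipped = elem[::-1]  # anti-diagonals of elem become diagonals
        # Diagonal averaging maps the rank-1 matrix back to a time series.
        rcs.append([flipped.diagonal(d).mean() for d in range(-(window - 1), k)])
    return np.array(rcs)

def dominant_frequency(x, fs):
    spectrum = np.abs(np.fft.rfft(x))
    return np.fft.rfftfreq(x.size, d=1 / fs)[np.argmax(spectrum)]

fs = 128                                     # assumed sampling rate in Hz
t = np.arange(3 * fs) / fs
clean = np.sin(2 * np.pi * 10 * t)           # toy stand-in for clean EEG
blink = 5 * np.exp(-((t - 1.5) ** 2) / (2 * 0.1 ** 2))
x = clean + blink

rcs = ssa_components(x, window=40)
# Stand-in for wavelet + clustering: flag RCs dominated by slow activity.
artifact_idx = [i for i, rc in enumerate(rcs) if dominant_frequency(rc, fs) < 4.0]
cleaned = x - rcs[artifact_idx].sum(axis=0)

def rms(v):
    return float(np.sqrt(np.mean(v ** 2)))

print("error before:", rms(x - clean), "after:", rms(cleaned - clean))
```

A frequency rule like this removes slow activity everywhere in the epoch, which is the loss the clustering step in the protocol is meant to avoid.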

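The workflow for this automated EOG removal process follows the four stages listed above: data acquisition, signal decomposition, artifact component identification, and reconstruction.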

Protocol 3: Deep Learning with Auxiliary EMG Reference

This protocol uses a hybrid CNN-LSTM model and auxiliary EMG signals to specifically target muscle artifacts while preserving neural responses like SSVEPs [40].

Experimental Setup:
- Record simultaneous EEG and EMG signals. EMG should be captured from facial or neck muscles (e.g., masseter, trapezius).
- Paradigm: Present subjects with a steady-state visual evoked potential (SSVEP) stimulus (e.g., a flashing LED). Instruct subjects to perform jaw clenching to induce strong EMG artifact.
Data Preparation for Deep Learning:
- Data Augmentation: Generate augmented training samples by adding scaled, recorded EMG segments to clean EEG epochs. This creates a large, diverse dataset for robust model training.
- Alignment: Precisely time-align the auxiliary EMG recordings with the contaminated EEG signals.
Model Architecture & Training:
- Hybrid CNN-LSTM: Design a network where Convolutional Neural Network (CNN) layers extract spatial features from the input (EEG + EMG), and Long Short-Term Memory (LSTM) layers capture temporal dependencies.
- Training: Train the model to map the contaminated EEG and concurrent EMG input to a clean EEG output.
Validation Metric:
- Evaluate performance by calculating the Signal-to-Noise Ratio (SNR) of the SSVEP response in the frequency domain before and after cleaning. A successful cleaning process will show a significant increase in SNR.
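
One common way to compute the validation metric is to compare spectral power at the stimulus frequency with the mean power of neighbouring bins. The sketch below assumes that definition (the exact formula used in [40] is not given here), a 12 Hz stimulus, and toy signals; the "cleaned" trace is simply the ground truth without the EMG burst, standing in for the output of a trained model.

```python
import numpy as np

def ssvep_snr_db(x, fs, f_stim, n_neighbors=5):
    """Power at the stimulus bin relative to the mean of nearby bins, in dB."""
    spectrum = np.abs(np.fft.rfft(x * np.hanning(x.size))) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1 / fs)
    k = int(np.argmin(np.abs(freqs - f_stim)))
    # Skip the bin directly adjacent on each side (window leakage).
    lower = spectrum[k - n_neighbors - 1:k - 1]
    upper = spectrum[k + 2:k + n_neighbors + 2]
    noise = np.mean(np.concatenate([lower, upper]))
    return float(10 * np.log10(spectrum[k] / noise))

rng = np.random.default_rng(0)
fs = 250                                    # assumed sampling rate in Hz
t = np.arange(4 * fs) / fs
f_stim = 12.0                               # assumed flicker frequency
ssvep = np.sin(2 * np.pi * f_stim * t)
background = 0.5 * rng.standard_normal(t.size)
jaw_clench = 3.0 * rng.standard_normal(t.size)   # crude broadband EMG stand-in

contaminated = ssvep + background + jaw_clench
cleaned = ssvep + background                # idealised output of the denoiser

snr_before = ssvep_snr_db(contaminated, fs, f_stim)
snr_after = ssvep_snr_db(cleaned, fs, f_stim)
print("SNR before:", round(snr_before, 1), "dB, after:", round(snr_after, 1), "dB")
```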

Table 2: Key Research Reagents and Computational Tools for Hybrid Artifact Removal

Item / Resource	Type	Primary Function in Research	Exemplary Use Case
Auxiliary EMG Sensors	Hardware	Provides a dedicated reference signal for muscle artifact sources.	Used in the ERASE [74] and CNN-LSTM [40] protocols to guide artifact separation.
Public EEG/EOG/EMG Datasets	Data	Provides clean segments of neural and artifactual signals for creating semi-synthetic data.	Training and benchmarking deep learning models like MultiResUNet3+ [75].
Fixed Frequency EWT (FF-EWT)	Algorithm	Adaptively decomposes a signal into modes for targeted analysis of specific frequency bands.	Isolating EOG-related components in single-channel EEG for subsequent filtering [15].
Principal Component Analysis (PCA)	Algorithm	Reduces dimensionality of feature space, minimizing redundancy and computational cost.	Optimizing feature fusion in the hybrid spectral-temporal framework [73].
Lightweight Multi-Layer Perceptron (MLP)	Software (Model)	A simple, fast neural network for classification; suitable for real-time applications.	Classifying artifact types from fused spectral-temporal features [73].
Convolutional Neural Network (CNN)	Software (Model)	Automatically extracts salient features from raw or preprocessed signal data.	Used in MultiResUNet3+ [75] and hybrid CNN-LSTM [40] for feature learning.
Long Short-Term Memory (LSTM)	Software (Model)	Models long-range temporal dependencies in sequential data like EEG.	Capturing the time-course of artifacts and brain signals in the CNN-LSTM hybrid [40].

The convergence of EOG and EMG artifacts in EEG data presents a formidable challenge that monolithic algorithms cannot adequately solve. As evidenced by the frameworks discussed, the path forward lies in strategic hybridization. Combining the strengths of signal processing techniques like EWT and SSA with the adaptive power of machine learning, and supplementing them with auxiliary data where possible, makes it possible to disentangle complex, simultaneous contaminants. The quantitative results are compelling: the hybrid spectral-temporal framework reports over 90% accuracy in artifact classification, including under simultaneous contamination [73], and the other frameworks report significant improvements in signal quality metrics, all while paving the way for real-time, clinically applicable systems. For researchers and drug development professionals, adopting these hybrid frameworks is critical for ensuring the integrity of neural data, ultimately leading to more reliable diagnostics and therapeutic outcomes.

In electrophysiological research, particularly in studies contrasting electrooculogram (EOG) and electromyogram (EMG) artifacts, precise parameter tuning is not merely a preprocessing step but a fundamental requirement for data integrity. EOG and EMG artifacts manifest with distinct spatial, temporal, and spectral characteristics that often overlap with neural signals of interest, complicating analysis in both clinical and research settings. EOG artifacts, originating from eye movements and blinks, are typically low-frequency, high-amplitude signals, whereas EMG artifacts, arising from muscle activity, are generally broadband and high-frequency [35] [15]. The relaxed constraints of modern wearable acquisition setups, which often employ dry electrodes and reduced scalp coverage, further amplify these challenges by compromising signal quality and introducing additional noise sources [35]. This guide provides a systematic framework for parameter tuning to address these specific artifact properties, enabling researchers to extract cleaner signals and draw more reliable conclusions in drug development and neuroscientific studies.

Core Technical Principles: EOG vs. EMG Artifact Characteristics

Effective parameter tuning begins with a thorough understanding of the fundamental differences between EOG and EMG artifacts. The table below summarizes their key characteristics, which should guide the selection of initial processing parameters.

Table 1: Characteristic Profiles of EOG and EMG Artifacts

Characteristic	EOG Artifacts	EMG Artifacts
Spectral Domain	Dominant in low frequencies (0.5 - 12 Hz) [15] [71]	Broadband, but dominant in high frequencies (20 - 450 Hz) [42] [76]
Amplitude	High-amplitude [15]	Variable, typically lower amplitude than EOG
Source	Corneo-retinal potential; eye blinks and movements [77]	Electrical activity from muscle motor units [42]
Primary Impact	Obscures low-frequency brain rhythms (e.g., Delta, Theta)	Masks high-frequency brain activity (e.g., Beta, Gamma)
Common Detection/Removal Techniques	Source separation (ICA, CCA), wavelet transforms, SSA [35] [71]	High-pass filtering, ICA, wavelet transforms, adaptive filtering [35] [78]

Parameter Tuning Methodologies for Artifact Management

Tuning for Electrooculogram (EOG) Artifact Handling

Fixed Frequency Empirical Wavelet Transform (FF-EWT) with GMETV Filtering

For single-channel EOG artifact removal, an advanced method combines FF-EWT with a Generalized Moreau Envelope Total Variation (GMETV) filter. The process involves decomposing the contaminated signal into six Intrinsic Mode Functions (IMFs) using FF-EWT. The critical tuning step lies in identifying EOG-related IMFs using a feature threshold based on kurtosis (KS), dispersion entropy (DisEn), and power spectral density (PSD) metrics. IMFs identified as artifacts are then processed by the finely tuned GMETV filter to suppress the artifact while preserving underlying neural information [15]. Performance on real EEG recordings is assessed via Signal-to-Artifact Ratio (SAR) and Mean Absolute Error (MAE), with the combined method demonstrating substantial improvement in these metrics [15].

Circular Singular Spectrum Analysis (Ci-SSA) with Wavelet and Clustering

Another robust, automated framework for single-channel EOG artifact removal integrates Ci-SSA, Morlet wavelet transform, and unsupervised clustering. The Ci-SSA step decomposes the signal to isolate oscillating low-frequency noise components. A key automation feature is the use of unsupervised clustering (e.g., K-means) on the Ci-SSA output to group artifact-dominated components without manual threshold setting. This method specifically aims to preserve the low-frequency content in non-artifact regions, a common pitfall of other techniques [71]. Performance is quantified using Relative Root Mean Square Error (RRMSE), Correlation Coefficient (CC), and Artifact-to-Residue Ratio (AReR) on both synthetic and real data [71].

Tuning for Electromyogram (EMG) Artifact Handling

Adaptive Filtering with Metaheuristic Optimization

Adaptive Noise Cancellation (ANC) filters are highly effective for separating volitional EMG from noise or other artifacts, such as power line interference and ECG coupling. The performance of an ANC filter is directly tied to the optimization of its weights. The Gray Wolf Optimization (GWO) algorithm has been employed for this purpose, mimicking the social hierarchy and hunting behavior of gray wolves to find optimal filter parameters [78]. Tuning involves configuring the GWO's population (wolf pack) size and iteration count to minimize error measures such as the mean square error (reported as SMSE) and the maximum error (reported as SME). This approach has shown a 28 dB improvement in output SNR and an 81% reduction in MSE compared to RLS-based ANC filters [78].
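
To make the tuning loop concrete, the sketch below uses a small Gray Wolf Optimization routine to search for FIR weights of a noise canceller by minimizing the mean square of the error signal. It is an illustration under assumptions, not the configuration of [78]: the three-tap noise path, pack size, iteration count, search bound, and toy signals are all hypothetical, and the three leaders are kept as the best solutions seen so far.

```python
import numpy as np

def anc_mse(weights, primary, reference):
    """Mean square of the ANC error signal e = primary - FIR(reference)."""
    estimate = np.convolve(reference, weights)[:primary.size]
    return float(np.mean((primary - estimate) ** 2))

def gwo_minimize(cost, dim, n_wolves=20, n_iter=100, bound=1.0, seed=0):
    rng = np.random.default_rng(seed)
    wolves = rng.uniform(-bound, bound, size=(n_wolves, dim))
    scores = np.array([cost(w) for w in wolves])
    order = np.argsort(scores)[:3]
    leaders, leader_scores = wolves[order].copy(), scores[order].copy()
    for it in range(n_iter):
        a = 2.0 * (1 - it / n_iter)          # exploration factor decays 2 -> 0
        moved = np.zeros_like(wolves)
        for leader in leaders:               # alpha, beta, delta
            A = 2 * a * rng.random(wolves.shape) - a
            C = 2 * rng.random(wolves.shape)
            D = np.abs(C * leader - wolves)
            moved += leader - A * D
        wolves = np.clip(moved / 3.0, -bound, bound)
        scores = np.array([cost(w) for w in wolves])
        # Keep the three best solutions seen so far as the new leaders.
        pool = np.vstack([leaders, wolves])
        pool_scores = np.concatenate([leader_scores, scores])
        order = np.argsort(pool_scores)[:3]
        leaders, leader_scores = pool[order].copy(), pool_scores[order].copy()
    return leaders[0], float(leader_scores[0])

rng = np.random.default_rng(1)
n = 2000
reference = rng.standard_normal(n)            # reference noise input
noise_path = np.array([0.8, -0.3, 0.1])       # hypothetical coupling into the primary
wanted = 0.5 * np.sin(2 * np.pi * 0.02 * np.arange(n))
primary = wanted + np.convolve(reference, noise_path)[:n]

baseline_mse = anc_mse(np.zeros(3), primary, reference)
best_w, best_mse = gwo_minimize(lambda w: anc_mse(w, primary, reference), dim=3)
print("weights:", np.round(best_w, 3), "MSE:", round(best_mse, 4), "baseline:", round(baseline_mse, 4))
```

Pack size and iteration count trade run time against how closely the weights approach the noise path; for a quadratic cost like this one, LMS or RLS would also work, and the metaheuristic is shown only because it is the method under discussion.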

Feature Selection with Hybrid Filter-Wrapper Methods

For EMG classification tasks, high-dimensional feature sets necessitate robust selection methods. A hybrid approach combining Neighborhood Component Analysis (NCA), a filter method, with metaheuristic wrappers like Gray Wolf Optimization (GWO) or Equilibrium Optimization (EO) has proven effective [76]. The process starts with NCA, which ranks features based on a regularization parameter (λ) that must be tuned. The top-ranked features are then passed to the wrapper method (e.g., GWO), which uses a multi-objective fitness function to evaluate feature subsets, aiming to maximize classification accuracy while minimizing the number of selected features [76]. This hybrid method achieves high accuracy with a reduced feature subset, crucial for real-time applications.
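
The two-stage selection logic can be sketched in a few lines. The weighted-sum form of the fitness, the 0.9 weight, and the helper names `top_k_mask` and `subset_fitness` are assumptions for illustration; the exact fitness function of [76] is not reproduced here, and the feature weights below are made-up numbers standing in for an NCA ranking.

```python
def top_k_mask(feature_weights, k):
    """Filter stage: keep the k features with the largest ranking weights."""
    ranked = sorted(range(len(feature_weights)),
                    key=lambda i: feature_weights[i], reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(feature_weights))]

def subset_fitness(error_rate, mask, weight_error=0.9):
    """Wrapper stage: lower is better; trades error against subset size."""
    n_selected = sum(mask)
    if n_selected == 0:
        return float("inf")              # an empty subset cannot classify
    size_ratio = n_selected / len(mask)
    return weight_error * error_rate + (1 - weight_error) * size_ratio

# Hypothetical ranking weights for six features (stand-in for NCA output).
weights = [0.02, 0.91, 0.40, 0.05, 0.77, 0.10]
candidate = top_k_mask(weights, k=3)
print(candidate)

# Same classification error, fewer features: the smaller subset scores better.
full = subset_fitness(0.05, [1, 1, 1, 1, 1, 1])
reduced = subset_fitness(0.05, candidate)
print(full, reduced)
```

In a full pipeline the error rate would come from cross-validating a classifier on the masked features, and the wrapper (GWO or EO) would search over masks to minimize this fitness.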

Table 2: Performance Metrics for EOG and EMG Parameter Tuning Methods

Method	Key Tunable Parameters	Primary Performance Metrics	Reported Performance
FF-EWT + GMETV (EOG) [15]	KS, DisEn, PSD thresholds; GMETV filter coefficients	SAR, MAE, RRMSE, CC	Lower RRMSE, higher CC on synthetic data; improved SAR/MAE on real data.
Ci-SSA + Wavelet + Clustering (EOG) [71]	Ci-SSA window length; clustering parameters (e.g., k)	RRMSE, MAE, CC, ARR, SER, AReR	Improved metrics on synthetic/real data; enhanced driver fatigue detection accuracy.
ANC with GWO (EMG) [78]	GWO population size; number of iterations; error target (SMSE, SME)	Output SNR, SMSE, SME, Correlation (Sr)	28 dB SNR improvement, 81% MSE reduction vs. RLS; 7 dB SNR improvement vs. ABC-MR.
NCA-GWO/EO (EMG) [76]	NCA regularization (λ); fitness function weights; GWO/EO population	Classification Accuracy, Kappa, F-score, MSE	High accuracy (>95% in some cases), reduced feature subset size, improved computational efficiency.

Experimental Protocols for Validation

To ensure the robustness of tuned parameters, rigorous experimental validation is required. Below are detailed protocols adapted from the cited research.

Protocol for Validating EOG Artifact Removal

This protocol is designed to test algorithms under controlled, realistic conditions [15] [71].

Data Acquisition & Synthesis:
- Real Data: Collect EEG data using a wearable single-channel system (e.g., a portable EEG headband). Instruct participants to perform periodic voluntary blinks according to a visual cue, interspersed with periods of rest.
- Synthetic Data: Create a contaminated signal y by mixing a clean EEG baseline (s) from a public database with a synthetic eye-blink signal (x) using the formula: y = s + p * x, where p is a proportion constant used to control the artifact severity [71].
Algorithm Application: Apply the parameter-tuned algorithm (e.g., FF-EWT+GMETV or Ci-SSA+Wavelet+Clustering) to both the synthetic and real contaminated datasets.
Performance Quantification:
- For synthetic data, where the clean ground truth (s) is known, calculate RRMSE and CC between the cleaned signal and s.
- For real data, where a ground truth is unavailable, use metrics like SAR and AReR to quantify the amount of artifact remaining in the signal [15] [71].
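
The synthetic-data branch of this protocol reduces to a few formulas, sketched below. The mixing rule y = s + p * x comes from the protocol; RRMSE is taken here as RMS(error) / RMS(clean) and CC as the Pearson correlation, which are the usual definitions but should be checked against [15] and [71]. The toy signals are assumptions.

```python
import numpy as np

def contaminate(s, x, p):
    """Semi-synthetic mixture y = s + p * x (p controls artifact severity)."""
    return s + p * x

def rrmse(clean, estimate):
    """Relative RMSE: RMS of the residual divided by RMS of the clean signal."""
    return float(np.sqrt(np.mean((clean - estimate) ** 2)) / np.sqrt(np.mean(clean ** 2)))

def cc(clean, estimate):
    """Pearson correlation coefficient between clean and estimated signals."""
    return float(np.corrcoef(clean, estimate)[0, 1])

fs = 128                                   # assumed sampling rate in Hz
t = np.arange(4 * fs) / fs
s = np.sin(2 * np.pi * 10 * t)             # toy clean EEG baseline
x = np.exp(-((t - 2.0) ** 2) / (2 * 0.1 ** 2))   # toy eye-blink template

for p in (1.0, 2.0, 4.0):
    y = contaminate(s, x, p)
    # Scoring the uncorrected mixture gives the baseline a cleaning method must beat.
    print(p, round(rrmse(s, y), 3), round(cc(s, y), 3))
```

After running an artifact removal algorithm, the same two functions are applied to its output in place of `y`; lower RRMSE and higher CC than the uncorrected baseline indicate a net benefit.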

Protocol for Validating EMG Processing and Classification

This protocol validates parameter tuning for both denoising and classification tasks [76] [78].

Data Collection:
- Setup: Follow SENIAM standards for electrode placement. For a neck/shoulder study, place sEMG electrodes on muscles like the right/left trapezius (RT/LT) and sternocleidomastoid (RSCM/LSCM) [76].
- Task: Participants perform a set of defined movements (e.g., shoulder shrug, lateral flexion). Each movement should be performed voluntarily for 3 seconds and repeated 10 times.
Signal Pre-processing: Acquire raw EMG at a 1 kHz sampling rate. Apply a notch filter (e.g., 50/60 Hz) to remove power line interference, followed by a bandpass filter (20–450 Hz). Rectify the signals for analysis [76].
Feature Extraction & Selection: Extract a comprehensive set of 23 time and frequency domain features (e.g., MAV, WL, ZC, RMS, MNF). Use the tuned hybrid feature selection method (e.g., NCA-GWO) to reduce the feature dimensionality.
Model Training & Testing: Split the data into training (70%) and testing (30%) sets. Train a classifier (e.g., Random Forest, SVM) using the selected feature subset. Validate performance using metrics like accuracy, sensitivity, specificity, and F1-score [76].
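
Five of the named features (MAV, WL, ZC, RMS, MNF) have simple closed forms, sketched below under standard textbook definitions; the full 23-feature set of [76] is not reproduced. The zero-crossing threshold and the test signals are assumptions. Note that ZC and MNF are computed on the band-passed, non-rectified signal, since rectification removes sign changes.

```python
import numpy as np

def emg_features(x, fs, zc_threshold=0.0):
    """Common time- and frequency-domain sEMG features for one analysis window."""
    x = np.asarray(x, dtype=float)
    step = np.diff(x)
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1 / fs)
    sign_change = x[:-1] * x[1:] < 0
    return {
        "MAV": float(np.mean(np.abs(x))),            # mean absolute value
        "RMS": float(np.sqrt(np.mean(x ** 2))),      # root mean square
        "WL": float(np.sum(np.abs(step))),           # waveform length
        "ZC": int(np.sum(sign_change & (np.abs(step) > zc_threshold))),
        "MNF": float(np.sum(freqs * power) / np.sum(power)),  # mean frequency
    }

fs = 1000                                  # 1 kHz, as in the protocol
toy = emg_features([1.0, -2.0, 3.0, -4.0], fs)
print(toy)

# A 50 Hz tone should have its mean frequency at 50 Hz.
t = np.arange(fs) / fs
tone = emg_features(np.sin(2 * np.pi * 50 * t), fs)
print(round(tone["MNF"], 3))
```

A nonzero `zc_threshold` is commonly used so that low-level noise around zero does not inflate the zero-crossing count.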

Essential Research Reagent Solutions and Materials

The following table details key components required for establishing a pipeline for EOG/EMG artifact research.

Table 3: Essential Research Reagents and Materials for EOG/EMG Artifact Research

Item Name	Function/Application	Technical Specifications & Alternatives
Wearable EEG/EOG Headband	Acquires single or few-channel neural data in ecological settings.	Dry or semi-dry electrodes; often ≤16 channels [35]. Alternative: Research-grade wet electrode systems (e.g., Ag/AgCl).
Biopotential Amplifier (e.g., AD8232)	Amplifies weak physiological signals like EOG and EMG for measurement.	Integrated circuit (IC) solution for portable designs [79].
Smart Eyewear (e.g., JINS MEME)	Provides a platform for unobtrusive EOG signal acquisition in real-world settings.	Integrated electrodes; wireless data transmission [80].
Surface EMG Sensors	Non-invasively capture muscle activation signals.	Dry, wet, or textile-based electrodes [42]. Textile electrodes are promising for long-term comfort [42].
High-Density sEMG (HD-sEMG) System	Captures high-resolution spatial muscle activity patterns for advanced pattern recognition.	Grid of closely spaced electrodes; higher computational cost [42].
Inertial Measurement Units (IMUs)	Captures motion data to assist in identifying movement-related artifacts.	Accelerometers, gyroscopes. Noted as underutilized in artifact detection [35].

Visual Workflows for EOG and EMG Processing Pipelines

The following diagrams illustrate the standard workflows for processing EOG and EMG signals, highlighting key stages where parameter tuning is most critical.

Figure 1: EOG artifact processing workflow.

Figure 2: EMG processing and classification workflow.

Mastering parameter tuning for EOG and EMG artifacts is a critical competency for researchers aiming to ensure data fidelity in electrophysiological studies. As this guide has detailed, best practices involve a methodical approach: understanding the distinct spectral and temporal signatures of each artifact type, applying tailored algorithms (from wavelet transforms to metaheuristic optimization), and rigorously validating tuned parameters against standardized performance metrics. The emerging trends highlighted in recent literature, including the hybridization of filter and wrapper methods for feature selection and the use of advanced decomposition techniques like FF-EWT and Ci-SSA, point towards more automated, robust, and computationally efficient pipelines. By adhering to these structured methodologies and experimental protocols, scientists and drug development professionals can significantly enhance the reliability of their findings, thereby accelerating progress in neuroscience and therapeutic development.

The advancement of Brain-Computer Interfaces (BCIs) toward real-world, point-of-care applications hinges on overcoming a fundamental challenge: the reliable, real-time distinction between neurological signals and physiological artifacts. Electrooculogram (EOG) and electromyogram (EMG) artifacts represent two of the most pervasive and disruptive noise sources in electroencephalography (EEG) data. EOG artifacts, caused by eye movements and blinks, manifest as low-frequency, high-amplitude signals that can obscure underlying brain activity [15]. In contrast, EMG artifacts, generated by muscle contractions in the face, neck, or scalp, introduce high-frequency, broad-spectrum noise that can be mistaken for neural oscillations [25] [73]. Within the context of a broader thesis on EOG vs. EMG artifacts, this whitepaper examines the critical role of lightweight computational models in enabling real-time artifact handling. These models are essential for transitioning BCI systems from controlled laboratory environments to dynamic point-of-care settings, where they can support applications in neurorehabilitation, patient monitoring, and assistive technologies [81] [82].

The limitations of conventional artifact removal techniques become acutely apparent in real-time scenarios. Traditional methods, such as Independent Component Analysis (ICA), are often computationally intensive and ill-suited for single-channel EEG systems, which are common in wearable devices [15]. Furthermore, the presence of simultaneous multi-source contamination (e.g., EOG and EMG artifacts occurring together) presents a complex challenge that many existing approaches fail to address adequately [73]. This paper provides an in-depth technical guide to the latest lightweight models and experimental protocols designed to overcome these hurdles, thereby enhancing the robustness and practicality of next-generation BCI systems.

Lightweight Methodologies for Real-Time Artifact Management

Implementing effective artifact processing in real-time requires a careful balance between computational efficiency and signal fidelity. The following section details cutting-edge methodologies tailored for point-of-care devices.

A Hybrid Spectral-Temporal Framework for Single-Channel EEG

A novel framework for real-time detection and classification of EOG, EMG, and white noise artifacts in single-channel EEG has demonstrated remarkable efficacy. This method is particularly valuable for its computational efficiency, a prerequisite for wearable BCIs. The protocol involves a multi-stage process [73]:

Time-Domain Low-Pass Filtering: This initial stage specifically targets the low-frequency components characteristic of EOG artifacts.
Frequency-Domain Power Spectral Density (PSD) Analysis: Concurrently, a PSD analysis captures the broad-spectrum energy signature of EMG artifacts.
PCA-Optimized Feature Fusion: The features extracted from the temporal and spectral analyses are fused, and Principal Component Analysis (PCA) is applied to minimize redundancy while preserving discriminative information.
Lightweight Multi-Layer Perceptron (MLP) Classification: The optimized feature set is fed into a compact MLP for final artifact classification.

This hybrid approach has been validated to achieve 99% accuracy at low Signal-to-Noise Ratios (SNR of -7 dB) and to maintain >90% accuracy in moderate noise conditions (SNR 4 dB). Critically, it addresses the challenging scenario of simultaneous multi-source contamination, maintaining 96% classification accuracy even with overlapping EOG, EMG, and white noise. The training time for this model is approximately 30 seconds, which is reported to be 97% faster than comparable CNN models, making it highly suitable for real-time applications [73].
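
The first three stages of this pipeline (temporal features from a low-passed trace, spectral band powers, and PCA on the fused vector) can be sketched with NumPy alone. Everything specific here is an assumption rather than the design of [73]: the moving-average low-pass, the band edges, the three temporal statistics, and the toy epochs. The MLP stage is omitted; the reduced features would be its input.

```python
import numpy as np

def lowpass_moving_average(x, width):
    """Crude low-pass filter (stand-in for the framework's time-domain filter)."""
    return np.convolve(x, np.ones(width) / width, mode="same")

def band_powers(x, fs, edges):
    """Periodogram power summed inside each [lo, hi) band."""
    power = np.abs(np.fft.rfft(x)) ** 2 / x.size
    freqs = np.fft.rfftfreq(x.size, d=1 / fs)
    return np.array([power[(freqs >= lo) & (freqs < hi)].sum()
                     for lo, hi in zip(edges[:-1], edges[1:])])

def epoch_features(x, fs):
    slow = lowpass_moving_average(x, width=fs // 8)
    temporal = np.array([slow.max() - slow.min(), slow.std(),
                         np.max(np.abs(np.diff(slow)))])
    spectral = np.log10(band_powers(x, fs, [0.5, 4, 8, 13, 30, 45, fs / 2]) + 1e-12)
    return np.concatenate([temporal, spectral])   # fused feature vector

def pca_fit_transform(features, n_components):
    z = features - features.mean(axis=0)
    z = z / (z.std(axis=0) + 1e-12)               # standardise each feature
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[:n_components].T, (s ** 2) / np.sum(s ** 2)

rng = np.random.default_rng(1)
fs = 128                                          # assumed sampling rate in Hz
t = np.arange(2 * fs) / fs
rows = []
for i in range(40):
    eeg = np.sin(2 * np.pi * 10 * t + rng.uniform(0, 2 * np.pi))
    eeg = eeg + 0.3 * rng.standard_normal(t.size)
    if i < 20:   # EOG-like epochs: slow, large bump
        eeg = eeg + 5 * np.exp(-((t - rng.uniform(0.5, 1.5)) ** 2) / (2 * 0.1 ** 2))
    else:        # EMG-like epochs: broadband noise burst
        eeg = eeg + 2 * rng.standard_normal(t.size)
    rows.append(epoch_features(eeg, fs))

features = np.array(rows)
reduced, explained = pca_fit_transform(features, n_components=3)
print(features.shape, reduced.shape, np.round(explained[:3], 3))
```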

Fixed Frequency EWT with GMETV Filtering for EOG Removal

For the targeted removal of EOG artifacts, an automated method using Fixed Frequency Empirical Wavelet Transform (FF-EWT) combined with a Generalized Moreau Envelope Total Variation (GMETV) filter has been developed. This protocol is designed to separate artifact components from the single-channel EEG signal without damaging the underlying neural data [15]. The experimental procedure is as follows:

Decomposition: The contaminated EEG signal is decomposed into six Intrinsic Mode Functions (IMFs) using FF-EWT.
Identification: EOG-related IMFs are automatically identified from the set of IMFs using a feature threshold based on kurtosis, dispersion entropy, and power spectral density metrics.
Filtering: The identified artifact components are processed and removed using a finely tuned GMETV filter.
Reconstruction: The cleaned IMFs are reconstructed into the final artifact-reduced EEG signal.

Validation on both synthetic and real EEG datasets confirmed the method's capability to suppress EOG artifacts while preserving essential low-frequency EEG information. Quantitative metrics showed substantial improvements, including lower Relative Root Mean Square Error (RRMSE) and higher Correlation Coefficient (CC) on synthetic data, and improved Signal-to-Artifact Ratio (SAR) on real recordings [15].
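
The decomposition and identification steps can be approximated with a fixed-boundary FFT filter bank and a kurtosis screen. This is a simplified stand-in, not FF-EWT or the method of [15]: the six band edges, the kurtosis threshold of 5, and the restriction to modes at or below 13 Hz are assumptions, and the dispersion entropy, PSD criteria, and GMETV filtering stage are not implemented.

```python
import numpy as np

def fixed_band_modes(x, fs, edges):
    """Split x into one mode per [lo, hi) band using ideal FFT masks."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1 / fs)
    modes = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        modes.append(np.fft.irfft(spectrum * mask, n=x.size))
    return np.array(modes)

def kurtosis(x):
    """Non-excess kurtosis (equals 3 for Gaussian data)."""
    c = x - x.mean()
    return float(np.mean(c ** 4) / np.mean(c ** 2) ** 2)

rng = np.random.default_rng(0)
fs = 128                                         # assumed sampling rate in Hz
t = np.arange(4 * fs) / fs
eeg = np.sin(2 * np.pi * 10 * t) + 0.2 * rng.standard_normal(t.size)
blink = 5 * np.exp(-((t - 2.0) ** 2) / (2 * 0.1 ** 2))
x = eeg + blink

# Six fixed bands; the last edge sits above Nyquist so every bin is covered once.
edges = [0, 4, 8, 13, 30, 45, fs / 2 + 1]
modes = fixed_band_modes(x, fs, edges)

kurt = [kurtosis(m) for m in modes]
# Flag low-frequency modes with impulsive (high-kurtosis) content as EOG candidates.
flags = [hi <= 13 and k > 5.0 for hi, k in zip(edges[1:], kurt)]
print(np.round(kurt, 2), flags)
```

Because each frequency bin belongs to exactly one band, the modes sum back to the input, so any mode that is filtered rather than discarded can be recombined with the untouched modes at the reconstruction step.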

Deep Learning with DenoiseMamba for Multi-Artifact Suppression

For scenarios requiring high-performance denoising across multiple artifact types, the DenoiseMamba model represents a significant innovation. This deep learning approach addresses the limitation of many existing models that struggle to efficiently capture the long-term temporal dependencies in EEG signals [83]. The model's architecture is based on two key components:

Structured State-Space Duality (SSD): This mechanism allows the model to effectively model global temporal dependencies and long-range interactions within the EEG signal.
Convolutional Neural Networks (CNNs): These are integrated to capture local contextual and spatial features.

The integration of these elements in the ConvSSD module enables DenoiseMamba to capture both local and global spatiotemporal features, leading to more effective suppression of EOG, EMG, and electrocardiographic (ECG) artifacts while preserving critical EEG signal details. The model has been validated on semi-simulated datasets, demonstrating superior EEG reconstruction accuracy compared to existing methods [83].
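
As background for how state-space layers carry long-range context, the sketch below runs a scalar linear state-space model in two equivalent ways: as a step-by-step recurrence and as a single convolution with its impulse response. This is not the DenoiseMamba architecture or the SSD algorithm of [83]; the scalar state and the parameter values are assumptions chosen only to show the underlying recurrence.

```python
import numpy as np

def ssm_recurrent(x, a, b, c):
    """h[t] = a*h[t-1] + b*x[t];  y[t] = c*h[t], computed step by step."""
    h = 0.0
    y = np.empty(x.size)
    for t, xt in enumerate(x):
        h = a * h + b * xt
        y[t] = c * h
    return y

def ssm_convolution(x, a, b, c):
    """Same model written as a causal convolution with kernel c*b*a**k."""
    kernel = c * b * a ** np.arange(x.size)
    return np.convolve(x, kernel)[:x.size]

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
a, b, c = 0.98, 0.5, 1.0        # a close to 1 gives a slowly decaying memory

y_rec = ssm_recurrent(x, a, b, c)
y_conv = ssm_convolution(x, a, b, c)
print(np.max(np.abs(y_rec - y_conv)))
```

With `a` near 1 the kernel decays slowly, so samples many steps in the past still influence the output, which is the kind of long-range dependency the text refers to; convolutional layers with short kernels complement this by capturing local structure.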

A Hybrid EMG-EEG Interface with Adaptive Fusion

Moving beyond simple artifact removal, some advanced BCI systems proactively leverage multiple biosignals. A hybrid EMG-EEG interface for rehabilitation robotics demonstrates a fatigue-adaptive control strategy, which is highly relevant for point-of-care therapeutic devices. This system uses a Bayesian fusion strategy to combine the strengths of peripheral EMG and central EEG signals for robust intention detection [82]. The experimental protocol involves:

Parallel Signal Acquisition and Classification: EMG signals (from an armband) and EEG signals (from a motor cortex headset) are acquired and classified independently using Support Vector Machine (SVM) models.
Real-Time Fatigue Estimation: A k-Nearest Neighbors (k-NN) model computes a continuous fatigue score based on EMG spectral features.
Adaptive Bayesian Fusion: The probabilities from the EMG and EEG classifiers are combined in a Bayesian fusion module. The weighting of the EEG modality is dynamically adjusted based on the real-time fatigue score f using the function α(f) = 0.2 + 0.6 × f, so that a higher fatigue level increases the reliance on the EEG signal.

This adaptive system achieved an overall classification accuracy of 94.5% for elbow flexion and extension tasks, outperforming EMG-only classification (88.5%). Most importantly, during high-fatigue conditions, the adaptive fusion maintained a high accuracy of 91.4%, compared to a drop to 83.1% for the EMG-only system, thereby demonstrating enhanced robustness for prolonged use [82].
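
The fatigue-dependent weighting can be sketched as follows. The weight function 0.2 + 0.6 × f is taken from the protocol; everything else is an assumption for illustration: the fatigue score is assumed to lie in [0, 1], the combination rule is a weighted log-linear pooling of the two classifiers' class probabilities (the exact fusion rule of [82] is not specified here), and the probabilities are made-up numbers.

```python
import math

def eeg_weight(fatigue):
    """alpha(f) = 0.2 + 0.6 * f, with f clipped to the assumed range [0, 1]."""
    f = min(max(fatigue, 0.0), 1.0)
    return 0.2 + 0.6 * f

def fuse(p_emg, p_eeg, fatigue):
    """Weighted log-linear pooling of per-class probabilities (assumed rule)."""
    alpha = eeg_weight(fatigue)
    scores = [alpha * math.log(pe + 1e-12) + (1 - alpha) * math.log(pm + 1e-12)
              for pm, pe in zip(p_emg, p_eeg)]
    top = max(scores)
    unnormalised = [math.exp(s - top) for s in scores]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

# Classes: [flexion, extension]. The two classifiers disagree.
p_emg = [0.4, 0.6]   # EMG classifier leans towards extension
p_eeg = [0.8, 0.2]   # EEG classifier leans towards flexion

rested = fuse(p_emg, p_eeg, fatigue=0.0)     # EEG weight 0.2: EMG dominates
fatigued = fuse(p_emg, p_eeg, fatigue=1.0)   # EEG weight 0.8: EEG dominates
print([round(p, 3) for p in rested], [round(p, 3) for p in fatigued])
```

With these numbers the fused decision follows the EMG classifier when the user is rested and switches to the EEG classifier under high fatigue, which mirrors the behaviour described for the adaptive system.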

Comparative Performance Analysis of Lightweight Models

The quantitative performance of the various models and techniques discussed provides critical insight for researchers selecting an appropriate methodology. The data below summarizes key findings from the cited research for direct comparison.

Table 1: Performance Metrics of Lightweight Artifact Handling Models

Model / Technique	Primary Target	Key Performance Metric	Result	Computational Efficiency
Hybrid Spectral-Temporal MLP [73]	EOG, EMG, White Noise	Classification Accuracy (SNR -7 dB)	99%	Training time: ~30 seconds (97% faster than CNNs)
Hybrid Spectral-Temporal MLP [73]	Simultaneous EOG+EMG+Noise	Classification Accuracy	96%	Maintains real-time performance
FF-EWT + GMETV Filter [15]	EOG Artifacts	Signal-to-Artifact Ratio (SAR) & Correlation Coefficient (CC)	Substantial Improvement	Validated for single-channel portable EEG
Hybrid EMG-EEG Adaptive Fusion [82]	Robust Intention Detection	Overall Classification Accuracy	94.5%	End-to-end latency <500 ms
Hybrid EMG-EEG Adaptive Fusion [82]	Intention Detection under Fatigue	Classification Accuracy	91.4% (vs. 83.1% EMG-only)	Real-time fatigue estimation & weighting

The Scientist's Toolkit: Research Reagents & Essential Materials

Translating these methodologies from concept to implementation requires a specific set of hardware and software tools. The following table details essential components for building and testing real-time BCI systems with advanced artifact handling capabilities.

Table 2: Essential Research Materials for Real-Time BCI Development

Item / Technology	Function / Application	Specific Examples / Notes
Textile-Based Electrodes [42]	Non-invasive, long-term EMG/EEG signal acquisition; improved user comfort for wearable devices.	Comparable to wet electrodes; ideal for integration into wearable garments for rehabilitation.
Portable EEG Headsets [84] [81]	Consumer-grade brain signal acquisition for research; enables wearable BCI prototyping.	Emotiv EPOC (14 channels), OpenBCI Ultracortex (configurable); allow wireless, mobile data collection.
Wearable EMG Armbands [82]	Capture peripheral neuromuscular activity for intention detection and fatigue monitoring.	Used in hybrid systems; typically placed on limb muscles (e.g., biceps, triceps).
Lightweight MLP Classifier [73]	Core processing unit for real-time artifact classification; offers high speed and accuracy.	Outperforms deeper CNNs/RNNs in noisy scenarios; ideal for low-power, point-of-care devices.
Fixed Frequency EWT [15]	Signal decomposition technique for targeted isolation of artifact components in single-channel EEG.	Effective for separating EOG artifacts in the 0.5-12 Hz frequency range.
Bayesian Fusion Framework [82]	Adaptive algorithm for combining multiple biosignal modalities (e.g., EEG and EMG).	Dynamically weights input signals based on real-time reliability estimates (e.g., fatigue).
Structured State-Space (SSD) Models [83]	Deep learning component for modeling long-term temporal dependencies in EEG signals.	Core of DenoiseMamba; effective for capturing global context in signal denoising tasks.

Experimental Protocol for Real-Time Artifact Classification

To empirically validate a lightweight artifact classification model, researchers can adopt the following detailed protocol, based on the hybrid spectral-temporal framework [73].

Data Acquisition and Simulation:
- EEG Recording: Collect clean EEG data using a standard EEG system (e.g., a research-grade amplifier or a portable headset such as OpenBCI) from healthy participants in a resting state with eyes open and closed.
- Artifact Simulation: To create a ground-truthed dataset, artificially contaminate the clean EEG epochs with recorded EOG (from electrodes near the eyes) and EMG (from forehead or temple muscles) artifacts. This allows for systematic variation of the Signal-to-Noise Ratio (SNR).
Signal Preprocessing and Feature Extraction:
- Apply a time-domain low-pass filter (e.g., cutoff ~12 Hz) to emphasize components for EOG artifact detection [73] [15].
- Simultaneously, compute the Power Spectral Density (PSD) of the signal in overlapping windows to capture the broad-spectrum features of EMG artifacts.
- Fuse the temporal and spectral features into a unified feature vector.
- Apply Principal Component Analysis (PCA) to this vector to reduce dimensionality and minimize feature redundancy.
Model Training and Validation:
- Partition the dataset into training, validation, and testing sets, ensuring no data leakage.
- Design a Multi-Layer Perceptron (MLP) architecture with input nodes matching the PCA-reduced feature dimension, one or two hidden layers, and an output layer with nodes corresponding to the artifact classes (e.g., clean, EOG, EMG, mixed).
- Train the MLP using the training set and optimize hyperparameters (learning rate, number of nodes) based on performance on the validation set.
- Evaluate the final model on the held-out test set, reporting accuracy, precision, recall, and F1-score for each artifact class and at different SNR levels.
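
The preprocessing and feature-extraction steps of this protocol can be sketched as follows. This is a hedged illustration, not the implementation from [73]: the specific temporal descriptors, the filter order, the Welch window length, and the function names `extract_features` and `pca_reduce` are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, welch

def extract_features(epochs, fs, lp_cutoff=12.0):
    """Hybrid temporal + spectral features for epochs of shape (n_epochs, n_samples)."""
    # Temporal branch: low-pass (~12 Hz) to emphasize slow EOG-like deflections.
    sos = butter(4, lp_cutoff, btype="low", fs=fs, output="sos")
    low = sosfiltfilt(sos, epochs, axis=-1)
    temporal = np.column_stack([
        np.ptp(low, axis=-1),                          # peak-to-peak amplitude
        low.std(axis=-1),                              # slow-wave variability
        np.abs(np.diff(low, axis=-1)).max(axis=-1),    # steepest slope
    ])
    # Spectral branch: Welch PSD over overlapping windows for broadband EMG.
    nperseg = int(fs // 2)
    _, psd = welch(epochs, fs=fs, nperseg=nperseg, noverlap=nperseg // 2, axis=-1)
    spectral = np.log10(psd + 1e-20)
    # Fuse both views into one feature vector per epoch.
    return np.hstack([temporal, spectral])

def pca_reduce(features, n_components):
    """Standardize features and project them onto the leading principal components."""
    z = features - features.mean(axis=0)
    scale = z.std(axis=0)
    scale[scale == 0] = 1.0
    z = z / scale
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[:n_components].T

# Placeholder data: 40 two-second epochs at 250 Hz (random noise, not real EEG).
rng = np.random.default_rng(0)
epochs = rng.standard_normal((40, 500))
reduced = pca_reduce(extract_features(epochs, fs=250), n_components=10)
print(reduced.shape)
```

The PCA-reduced matrix would then be split into training, validation, and test sets and passed to an MLP classifier (for example scikit-learn's `MLPClassifier`). To avoid leakage, the PCA projection should be fitted on the training partition only.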

System Architecture and Signaling Pathways

The integration of hardware, signal processing, and adaptive control logic in a modern BCI can be visualized through its system architecture. The following diagram illustrates the workflow of a hybrid EMG-EEG interface, showcasing the pathway from signal acquisition to final actuation.

Hybrid EMG-EEG System Workflow

This architecture highlights the parallel processing of central (EEG) and peripheral (EMG) biosignals. The critical innovation is the fatigue estimation module, which directly influences the fusion logic, making the system adaptive to the user's physiological state [82].

For a more granular view of the artifact classification process, the following diagram details the workflow of a single-channel denoising model, such as the hybrid spectral-temporal framework.

Single-Channel Artifact Classification Process

This workflow demonstrates the efficient hybrid approach that combines different analytical perspectives to achieve robust artifact identification with minimal computational overhead [73].

The implementation of lightweight models is a cornerstone for the future of real-time BCIs in point-of-care settings. The techniques detailed in this whitepaper, from hybrid spectral-temporal frameworks and targeted EOG removal to adaptive multimodal fusion, demonstrate that it is possible to achieve high-fidelity artifact handling without prohibitive computational cost. The quantitative data shows that these models can achieve accuracies above 90% with low latency, making them viable for clinical and assistive applications. As research in this field progresses, the fusion of domain-informed feature engineering, efficient machine learning architectures, and adaptive control systems will continue to bridge the gap between laboratory prototypes and dependable, everyday medical technology, ultimately enhancing the synergy between human and machine.

Benchmarking Performance: Validating and Comparing Artifact Management Strategies

In electrophysiological research on electrooculogram (EOG) and electromyogram (EMG) contamination of electroencephalogram (EEG) signals, establishing robust ground truth is paramount for validating artifact detection and removal algorithms. EOG artifacts, originating from eye movements and blinks, and EMG artifacts, generated by muscle activity, represent two of the most pervasive and challenging sources of signal contamination in brain-computer interface (BCI) systems and clinical neurophysiology [25]. The heterogeneous nature of these artifacts, with EOG typically manifesting as high-amplitude, low-frequency deflections and EMG as broad-spectrum, high-frequency noise, necessitates specialized performance metrics that quantify how well removal algorithms separate artifact from neural signal [28]. This technical guide examines three core performance metrics, Correlation Coefficient (CC), Signal-to-Artifact Ratio (SAR), and Relative Root Mean Square Error (RRMSE), within the broader context of EOG vs. EMG artifact research. Computed against a known ground truth, these metrics serve as the common yardstick across experimental paradigms, enabling researchers to objectively compare diverse artifact removal methodologies, from traditional blind source separation techniques to emerging deep learning approaches [85] [28].

Performance Metrics: Theoretical Foundations and Computational Methods

The quantitative assessment of artifact removal algorithms relies on metrics that evaluate both the preservation of underlying neural information and the effectiveness of artifact suppression. The following section details the theoretical foundations and computational methodologies for three principal metrics.

Table 1: Core Performance Metrics for Artifact Removal Validation

Metric	Mathematical Formula	Interpretation	Optimal Value
Correlation Coefficient (CC)	\( CC = \frac{\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i - \bar{x})^2 \sum_{i=1}^{N}(y_i - \bar{y})^2}} \)	Measures the linear similarity between the cleaned and pure EEG signal.	+1 (Perfect positive correlation)
Signal-to-Artifact Ratio (SAR)	\( SAR = 10 \log_{10}\left(\frac{P_{signal}}{P_{artifact}}\right) \)	Quantifies the power ratio of the desired signal to the residual artifact.	Higher values (indicating greater artifact suppression)
Relative Root Mean Square Error (RRMSE)	\( RRMSE = \sqrt{\frac{\sum_{i=1}^{N}(x_i - y_i)^2}{\sum_{i=1}^{N} x_i^2}} \)	Evaluates the normalized magnitude of error between the cleaned and pure signal.	0 (No error)

Correlation Coefficient (CC)

The Correlation Coefficient is a statistical measure that quantifies the degree of linear relationship between the ground-truth, artifact-free signal and the processed signal after artifact removal [86]. A CC value close to +1 indicates that the cleaning process has successfully preserved the temporal dynamics and morphology of the original neural signal, with minimal distortion. This is particularly crucial for applications relying on event-related potentials (ERPs) or other time-locked neural phenomena, where maintaining signal fidelity is essential [71]. For instance, in a recent study utilizing Circulant Singular Spectrum Analysis with Discrete Wavelet Transform (CiSSA-DWT) for EOG removal, the algorithm achieved a CC value of 0.9883, demonstrating excellent reconstruction of the original EEG [86].

Signal-to-Artifact Ratio (SAR)

The Signal-to-Artifact Ratio is a power-based metric that measures the effectiveness of an algorithm in suppressing artifact components. It is calculated as the ratio of the power of the desired neural signal to the power of the residual artifact contamination remaining after processing [86]. A higher SAR denotes a cleaner output signal. This metric is especially valuable for quantifying the performance of algorithms targeting specific artifacts, such as the low-frequency, high-amplitude characteristics of EOG artifacts [85] or the broad-spectrum nature of EMG noise [87]. The CiSSA-DWT method, for example, reported a SAR of 1.4525, indicating significant artifact power reduction [86].

Relative Root Mean Square Error (RRMSE)

The Relative Root Mean Square Error provides a normalized measure of the overall difference or error between the clean ground-truth signal and the artifact-cleaned signal [85] [86]. Unlike standard RMSE, RRMSE is normalized by the energy of the reference signal, making it more suitable for comparing performance across different datasets or subjects. A lower RRMSE value indicates superior performance, with zero representing a perfect reconstruction. This metric is sensitive to both large, abrupt artifacts like eye blinks and smaller, more persistent artifacts like muscle noise. Advanced methods like the Fixed Frequency Empirical Wavelet Transform with a Generalized Moreau Envelope Total Variation (FF-EWT+GMETV) filter have been shown to achieve a low RRMSE, confirming their accuracy in signal reconstruction [85].
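
To make the three definitions concrete, the following sketch computes CC, SAR, and RRMSE for a pure reference signal and a cleaned signal of equal length. It assumes that the residual artifact in SAR is the difference between the cleaned and pure signals, which is only computable when ground truth is available. Individual studies may define the artifact power differently, so this is one plausible reading rather than the formula used in any specific cited paper.

```python
import numpy as np

def correlation_coefficient(pure, cleaned):
    """Pearson correlation between the pure reference and the cleaned signal."""
    x = pure - pure.mean()
    y = cleaned - cleaned.mean()
    return float(np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)))

def rrmse(pure, cleaned):
    """Root mean square error normalized by the energy of the pure reference."""
    return float(np.sqrt(np.sum((pure - cleaned) ** 2) / np.sum(pure ** 2)))

def sar_db(pure, cleaned):
    """Signal-to-artifact ratio in dB.

    Assumption: residual artifact = cleaned - pure (requires ground truth).
    """
    residual = cleaned - pure
    return float(10.0 * np.log10(np.sum(pure ** 2) / np.sum(residual ** 2)))

# Toy example: a 10 Hz "neural" rhythm and an imperfectly cleaned version of it.
t = np.arange(0, 2, 1 / 250)
pure = np.sin(2 * np.pi * 10 * t)
cleaned = pure + 0.1 * np.sin(2 * np.pi * 1 * t)  # small leftover slow drift
print(correlation_coefficient(pure, cleaned), rrmse(pure, cleaned), sar_db(pure, cleaned))
```

Reporting all three together is useful because they can disagree: a rescaled but otherwise perfect reconstruction keeps CC at 1 while RRMSE and SAR both degrade.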

Understanding the distinct characteristics of EOG and EMG artifacts is fundamental to developing and validating effective removal strategies. The following section provides a detailed comparison of these two primary artifact types, highlighting their differing impacts on the EEG signal and the consequent implications for performance metric interpretation.

Table 2: Comparative Analysis of EOG and EMG Artifacts

Characteristic	EOG Artifacts	EMG Artifacts
Physiological Origin	Eye blinks and eye movements [25]	Muscle activity from head, neck, jaw, and face [25] [87]
Spectral Domain	Dominantly low-frequency (0.5 - 12 Hz) [85] [71]	Broad-spectrum, overlapping with beta and gamma EEG bands (up to 200 Hz) [87] [28]
Temporal Signature	High-amplitude, slow, stereotypical deflections [28] [7]	Bursty, high-frequency, stochastic noise [87]
Primary Metric Challenge	Preserving low-frequency EEG content (e.g., delta waves) during artifact removal [86]	Differentiating high-frequency neural oscillations (e.g., gamma) from EMG noise [87] [88]
Exemplary Removal Method	CiSSA-DWT [86], wICA [7]	ERASE-ICA [88], EEMD-CCA with EMG array [87]

The disparate nature of EOG and EMG artifacts, as summarized in Table 2, necessitates tailored removal approaches and influences the interpretation of performance metrics. For example, a high CC value following EOG artifact removal strongly suggests that the algorithm has successfully preserved the low-frequency components of the neural signal [71]. Conversely, for EMG artifact removal, a significant improvement in SAR, particularly in the high-frequency band, is a more direct indicator of successful muscle noise suppression without necessarily attesting to the fidelity of gamma oscillations [87]. Furthermore, the spatial distribution of artifacts differs: EOG artifacts are most prominent over frontal electrodes, while EMG artifacts from head and neck muscles can contaminate virtually all EEG channels, presenting a more generalized problem for source separation algorithms like ICA [88]. This widespread contamination often makes EMG artifacts more challenging to isolate and remove completely compared to the more spatially constrained EOG artifacts.

Experimental Protocols for Metric Validation

To ensure the reliability and comparability of performance metrics (CC, SAR, RRMSE), rigorous experimental protocols must be employed. These protocols typically involve the use of hybrid datasets where ground truth is known, allowing for direct computation of these metrics.

Synthetic Data Generation and Contamination Models

A standard validation approach involves contaminating clean, artifact-free EEG recordings with well-characterized EOG or EMG signals [85] [86]. The clean EEG serves as the ground truth \(x(t)\), and the EOG/EMG signal is added at a controlled amplitude to generate a contaminated signal \(y(t) = x(t) + p \cdot \mathrm{artifact}(t)\), where \(p\) is a scaling factor that controls the contamination level [71]. This model allows for the precise calculation of CC, SAR, and RRMSE after processing \(y(t)\) to obtain the cleaned signal \(\hat{x}(t)\). For EMG artifacts, more complex models involving the addition of real or simulated EMG channels as reference artifacts have been developed to improve the performance of ICA-based methods, a technique known as ERASE [88]. Another study enhanced the Ensemble Empirical Mode Decomposition with Canonical Correlation Analysis (EEMD-CCA) method by incorporating an array of EMG sensors, demonstrating that performance improved as the number of EMG reference channels increased from 2 to 16 [87].
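
A minimal sketch of the additive contamination model y(t) = x(t) + p · artifact(t) follows. It picks the scaling factor p so that the mixture reaches a requested signal-to-noise ratio, using the common convention SNR = 20 log10(RMS(x) / RMS(p · artifact)). The convention, the helper names, and the synthetic blink-like pulse are assumptions for illustration only.

```python
import numpy as np

def rms(signal):
    """Root mean square amplitude."""
    return float(np.sqrt(np.mean(np.square(signal))))

def contaminate(clean, artifact, snr_db):
    """Return (contaminated signal, p) such that the mixture has the target SNR in dB."""
    p = rms(clean) / (rms(artifact) * 10.0 ** (snr_db / 20.0))
    return clean + p * artifact, p

# Stand-in signals: white noise as "clean EEG", a Gaussian pulse as a blink-like artifact.
rng = np.random.default_rng(0)
fs = 250
t = np.arange(0, 4, 1 / fs)
clean = rng.standard_normal(t.size)
blink = np.exp(-((t - 2.0) ** 2) / (2 * 0.1 ** 2))

for snr_db in (-7, 0, 5):
    contaminated, p = contaminate(clean, blink, snr_db)
    achieved = 20 * np.log10(rms(clean) / rms(contaminated - clean))
    print(snr_db, round(p, 3), round(achieved, 3))
```

Because the clean signal is retained, every metric can afterwards be computed against it directly.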

Real-Data Validation with Semi-Simulated Approaches

In the absence of a perfect ground truth, semi-simulated approaches are used. One common method is the "leave-one-in" protocol, where a single, identifiable artifact (e.g., an eye blink) is retained in an otherwise clean segment of data. The algorithm's performance is then assessed based on its ability to remove that specific artifact while preserving the surrounding neural activity [7]. Metrics like the Artifact Rejection Ratio (ARR) and Signal-to-Error Ratio (SER) can also be used alongside CC and RRMSE in these scenarios [71]. Furthermore, the performance of artifact removal can be validated through downstream applications. For instance, one study demonstrated that after applying their proposed EOG removal framework, the detection accuracy of driver fatigue was noticeably enhanced compared to the pre-filtering stage, providing functional validation of the method's efficacy [71].

Visualization of Methodologies and Workflows

The following diagrams illustrate the core workflows for validating artifact removal techniques and the logical relationships between EOG/EMG characteristics and performance metrics.

Workflow for Validating Artifact Removal Algorithms

Metric Selection Logic for EOG vs. EMG Artifacts

The development and validation of high-performance artifact removal methods rely on a suite of computational tools, algorithms, and datasets. The following table catalogs key resources referenced in contemporary literature.

Table 3: Research Toolkit for EOG/EMG Artifact Investigation

Tool/Resource	Type	Primary Function	Relevance to Metrics
Independent Component Analysis (ICA)	Algorithm	Blind source separation to isolate neural and artifactual components [7] [88].	Basis for calculating CC/SAR/RRMSE on reconstructed signals.
Discrete Wavelet Transform (DWT)	Algorithm	Multi-resolution analysis to decompose signals into frequency sub-bands [86] [7].	Enables targeted artifact removal in specific bands, improving SAR.
Circulant SSA (CiSSA)	Algorithm	Subspace-based decomposition of single-channel time series [86] [71].	Effective for isolating oscillatory EOG artifacts; improves CC and RRMSE.
Fixed Frequency EWT (FF-EWT)	Algorithm	Adaptive wavelet transform targeting predefined frequency bands [85].	Precisely separates artifact components, optimizing all three metrics.
EMG Array (Reference Signals)	Experimental Setup	Multiple EMG sensors to provide statistical artifact information [87] [88].	Provides prior knowledge to "force" EMG into fewer ICs, dramatically improving SAR.
Deep Learning Models (e.g., A²DM, Motion-Net)	Algorithm	End-to-end artifact removal using CNN-based architectures [89] [28].	Uses an artifact representation as a prior to remove EOG and EMG in a unified model, boosting CC.
Public Datasets (e.g., BCI Competition IV)	Data	Standardized EEG data with known or labeled artifacts [86].	Provides a common benchmark for fair calculation and comparison of metrics.

The rigorous establishment of ground truth through the quantitative application of Correlation Coefficient (CC), Signal-to-Artifact Ratio (SAR), and Relative Root Mean Square Error (RRMSE) is indispensable for advancing the field of electrophysiological artifact removal. As research progresses, the development of unified models capable of handling interleaved EOG and EMG artifacts, such as the deep learning-based A²DM framework, represents a promising frontier [28]. Furthermore, the trend toward subject-specific, adaptive algorithms like Motion-Net highlights the growing importance of personalized medicine in neurotechnology [89]. Future work must continue to refine these performance metrics and validation protocols, ensuring they remain robust and informative in the face of increasingly complex algorithms and diverse real-world application scenarios, from clinical brain monitoring to agile brain-computer interfaces.

The analysis of neural signals is fundamentally compromised by the presence of artifacts, with electrooculogram (EOG) and electromyogram (EMG) representing two of the most pervasive and challenging categories. EOG artifacts, originating from eye movements and blinks, manifest as high-amplitude, low-frequency fluctuations, while EMG artifacts, stemming from muscle activity, are characterized by high-frequency, broadband signatures. Their distinct physiological origins and signal properties necessitate tailored removal strategies. Within this context, a methodological showdown has emerged among advanced artifact removal algorithms. Independent Component Analysis (ICA) and Artifact Subspace Reconstruction (ASR) have established themselves as classical staples, while iCanClean and various deep learning (DL) approaches represent a new generation of cleaning tools. This whitepaper provides an in-depth, technical guide to these core algorithms, framing their comparative analysis within the specific demands of EOG vs. EMG artifact research for an audience of scientists, researchers, and drug development professionals. The objective is to furnish a clear framework for algorithm selection based on empirical performance, computational efficiency, and suitability for specific artifact types and experimental conditions.

Core Algorithm Mechanics and Comparative Performance

This section dissects the operational principles of each major algorithm and presents a synthesized comparison of their performance based on current literature.

Algorithm Workflows and Technical Specifics

Independent Component Analysis (ICA): ICA is a blind source separation technique that decomposes multi-channel EEG data into statistically independent components (ICs). The core assumption is that recorded EEG is a linear mixture of underlying brain and non-brain sources. The algorithm then identifies and allows for the removal of ICs correlated with artifacts. Different ICA variants exist, with studies highlighting performance differences:
- Infomax ICA: Known for high decomposition yield, effectively identifying a large number of motor units in EMG analysis [90].
- FastICA: Offers high yield and shorter computation times, making it practical, though it may show lower accuracy under challenging signal conditions like low signal-to-noise ratio [90].
- RobustICA: Consistently demonstrates high accuracy across various conditions, particularly with low signal quality, though it may have a lower yield at high contraction levels [90].
Artifact Subspace Reconstruction (ASR): ASR is an online-capable method that uses a sliding-window Principal Component Analysis (PCA) to identify and remove high-variance, high-amplitude artifacts. It requires a short segment of clean "calibration" data to establish a baseline covariance matrix. When the variance in a data segment exceeds a user-defined threshold (typical k values range from 10 to 30), ASR reconstructs the artifact-contaminated segment by projecting it onto the baseline subspace [91] [92]. A lower k value leads to more aggressive cleaning but risks "overcleaning" and removing brain activity [91].
iCanClean: This is a generalized framework that leverages canonical correlation analysis (CCA) to detect and subtract noise subspaces from the EEG. It can operate with dedicated noise sensors (e.g., electrodes not in contact with the scalp) or by creating "pseudo-reference" noise signals from the EEG itself (e.g., by applying a temporary notch filter). iCanClean identifies subspaces of the scalp EEG that are correlated with noise subspaces based on a user-selected R² threshold and then subtracts these components [91] [92]. It has been validated as effective for motion, muscle, eye, and line-noise artifacts [92].
Deep Learning (DL) Approaches: DL models learn a non-linear mapping from artifact-contaminated EEG to clean EEG. Recent advances include:
- Artifact-aware Denoising Model (A²DM): This model first uses a pre-trained classifier to identify the type of artifact (EOG/EMG) and fuses this "artifact representation" into a denoising network. It employs a frequency enhancement module with a hard attention mechanism to selectively filter artifact-specific frequency components, demonstrating a 12% improvement in Correlation Coefficient over previous methods on a benchmark dataset [28].
- AnEEG: An LSTM-based Generative Adversarial Network (GAN) designed to generate artifact-free EEG signals. The generator produces cleaned signals, while the discriminator judges their authenticity against ground-truth clean data [93].
- GCTNet: Integrates a GAN with a parallel CNN and transformer network to capture both local and global temporal dependencies in the EEG for denoising [93].
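
To illustrate the CCA-and-subtract idea behind reference-based cleaning such as iCanClean, the sketch below finds EEG subspaces that are strongly correlated with a noise reference and projects them out. It is a simplified, whole-recording illustration: the published method works on sliding windows and has further options. The function name, the default threshold, and the choice to remove the EEG-side canonical variates are assumptions of this example.

```python
import numpy as np

def cca_noise_subtract(eeg, noise_ref, r2_threshold=0.65):
    """Remove EEG components whose squared canonical correlation with the
    noise reference exceeds r2_threshold.

    eeg: (n_samples, n_channels); noise_ref: (n_samples, n_refs).
    Returns the mean-centered cleaned EEG and all squared canonical correlations.
    """
    X = eeg - eeg.mean(axis=0)
    N = noise_ref - noise_ref.mean(axis=0)
    # Orthonormal bases of the two signal spaces; the SVD of their cross-product
    # yields the canonical correlations (QR-based CCA).
    Qx, _ = np.linalg.qr(X)
    Qn, _ = np.linalg.qr(N)
    U, s, _ = np.linalg.svd(Qx.T @ Qn, full_matrices=False)
    r2 = s ** 2
    # Time courses of the EEG-side canonical variates flagged as noise.
    variates = Qx @ U[:, r2 > r2_threshold]
    # Least-squares projection of those variates onto the channels, then subtract.
    cleaned = X - variates @ (variates.T @ X)
    return cleaned, r2

# Synthetic demo: 8 channels of "brain" activity plus 2 shared noise sources.
rng = np.random.default_rng(0)
n = 2000
brain = rng.standard_normal((n, 8))
noise_sources = rng.standard_normal((n, 2))
eeg = brain + noise_sources @ (3.0 * rng.standard_normal((2, 8)))
reference = noise_sources + 0.1 * rng.standard_normal((n, 2))
cleaned, r2 = cca_noise_subtract(eeg, reference)
print(np.round(r2, 3))
```

Lowering the threshold removes more components and cleans more aggressively. This mirrors the parameter-tuning trade-off noted for the R² threshold.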

Quantitative Performance Comparison

The following tables summarize the key performance characteristics of these algorithms as reported in recent evaluations.

Table 1: Overall Algorithm Performance and Characteristics

Algorithm	Best For Artifact Type	Key Strength	Key Weakness	Computational Speed
RobustICA	EMG (under low SNR) [90]	High decomposition accuracy [90]	Lower yield with many active sources [90]	Medium (similar to FastICA) [90]
FastICA	EMG (high yield) [90]	High decomposition yield [90]	Lower accuracy in challenging conditions [90]	Fast [90]
ASR	Ocular, Motion, Muscular [91] [35]	Effective for large-amplitude bursts [91]	Risk of over-cleaning with low k [91]	Fast (real-time capable) [92]
iCanClean	Motion, Muscle, Eye, Line-Noise [92]	All-in-one solution; preserves brain activity [91] [92]	Requires parameter tuning (R²) [91]	Fast (real-time capable) [92]
Deep Learning	Multiple, Interleaved Artifacts [93] [28]	Handles complex, overlapping artifacts [28]	Requires large datasets for training [93]	Slow training, variable inference [73]

Table 2: Quantitative Performance Metrics from Key Studies

Algorithm	Reported Metric	Performance	Experimental Context
iCanClean	Data Quality Score	Improved from 15.7% to 55.9%	Phantom head with all artifact types [92]
ASR	Data Quality Score	Improved from 15.7% to 27.6%	Phantom head with all artifact types [92]
A²DM (DL)	Correlation Coefficient	12% improvement over NovelCNN	EEGdenoiseNet dataset [28]
Lightweight ML	Classification Accuracy	>96% accuracy with multi-source artifacts	Single-channel EEG, mixed EOG/EMG/White Noise [73]
ICA (RobustICA)	Decomposition Accuracy	85-100% for motor unit timings	Simulated HD EMG [90]

The data reveals a clear trade-off. ICA variants, particularly RobustICA, offer high precision for specific analyses like EMG decomposition but may struggle with yield. ASR and iCanClean are robust, online-capable methods, with iCanClean demonstrating superior performance in direct comparisons, especially for complex artifact mixtures [92]. Deep learning approaches show immense promise for handling heterogeneous artifacts in a unified model but at the cost of complexity and data hunger [28].

Algorithm Selection Workflow

The following diagram maps the logical decision process for selecting an appropriate artifact removal algorithm based on research requirements.

Algorithm Selection Workflow: A decision tree to guide researchers in selecting the most suitable artifact removal method based on their data characteristics and research constraints.

Experimental Protocols for Algorithm Evaluation

Robust evaluation of artifact removal algorithms requires rigorous, standardized protocols. Below are detailed methodologies adapted from key studies to benchmark performance against a known ground truth.

Phantom Head Validation

Objective: To quantitatively compare the artifact removal efficacy of ICA, ASR, iCanClean, and DL algorithms in a setup where the true brain signals are known.
Protocol (as in [92]):

Apparatus: Utilize an electrically conductive phantom head embedded with multiple "brain source" antennae (e.g., 10 sources) that generate ground-truth signals. Contaminating sources (e.g., for eyes, neck muscles, facial muscles, motion) are also integrated.
Data Acquisition: Record high-density EEG (e.g., 100+ channels) from the phantom under several conditions:
- Condition 1 (Baseline): Brain sources only.
- Conditions 2-5 (Single Artifact): Brain sources + one type of contaminating source (e.g., Eyes, Neck EMG, etc.).
- Condition 6 (Mixed Artifact): Brain sources + all contaminating sources simultaneously.
Processing: Apply each algorithm (ICA, ASR, iCanClean, DL) to the contaminated data from Conditions 2-6. Use identical preprocessing steps (e.g., band-pass filtering) before applying the target algorithm.
Performance Metric: Calculate a Data Quality Score defined as the average correlation coefficient between the known ground-truth brain source signals and the cleaned EEG channels. A higher score indicates better preservation of brain signals and more effective artifact removal.
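
One way to compute such a score is sketched below. The text defines it only as an average correlation between ground-truth sources and cleaned channels. Pairing each source with its best-matching channel by absolute correlation, and the function name `data_quality_score`, are assumptions of this example and may differ from the procedure in [92].

```python
import numpy as np

def data_quality_score(sources, channels):
    """Average, over ground-truth sources, of the best absolute correlation
    with any cleaned EEG channel.

    sources: (n_sources, n_samples); channels: (n_channels, n_samples).
    Returns a value in [0, 1]; multiply by 100 to express it as a percentage.
    """
    n_src = sources.shape[0]
    # Cross-correlation block between sources (rows) and channels (columns).
    r = np.corrcoef(sources, channels)[:n_src, n_src:]
    return float(np.abs(r).max(axis=1).mean())

# Toy phantom: 3 sources; channels are noisy copies plus pure-noise channels.
rng = np.random.default_rng(0)
sources = rng.standard_normal((3, 5000))
clean_channels = np.vstack([sources + 0.2 * rng.standard_normal((3, 5000)),
                            rng.standard_normal((2, 5000))])
noisy_channels = clean_channels + 3.0 * rng.standard_normal(clean_channels.shape)
print(data_quality_score(sources, clean_channels),
      data_quality_score(sources, noisy_channels))
```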

Semi-Synthetic Benchmarking

Objective: To evaluate algorithm performance on real EEG data with controlled, added artifacts.
Protocol (as in [93] [28]):

Clean EEG Database: Obtain a dataset of high-quality, clean EEG recordings from public repositories (e.g., EEGdenoiseNet [28]). These are treated as ground truth.
Artifact Synthesis: Generate or record realistic EOG and EMG artifacts. EOG can be simulated as low-frequency, high-amplitude pulses, while EMG can be modeled as broadband, Gaussian-like noise or taken from real recordings.
Signal Mixing: Linearly mix the clean EEG with the artifact signals at varying Signal-to-Noise Ratios (SNRs) to create a semi-synthetic, contaminated dataset: EEG_contaminated = EEG_clean + γ * Artifact, where γ controls the contamination level.
Processing and Evaluation: Apply the algorithms to the contaminated data and compare the output EEG_denoised to the original EEG_clean. Use metrics like:
- Correlation Coefficient (CC): Measures the linear relationship between the cleaned and pure signal.
- Relative Root Mean Square Error (RRMSE): Measures the relative error.
- Signal-to-Artifact Ratio (SAR): Measures the residual artifact content.

Objective: To assess algorithm performance in the absence of a ground truth by leveraging well-established neural correlates.
Protocol (as in [91]):

Experimental Paradigm: Collect EEG data during a task with a known neural signature, such as the P300 event-related potential (ERP) elicited by a Flanker task. Conduct the task in both static (low-artifact) and dynamic (e.g., running, high-artifact) conditions.
Data Acquisition: Record mobile EEG from participants performing the task.
Processing: Apply each artifact removal algorithm to the data from the dynamic condition.
Performance Metric: Compare the ERPs extracted from the cleaned dynamic data to the ERPs from the static condition. A successful algorithm will yield P300 components with similar latency, amplitude, and topography (e.g., showing the expected congruency effect) as the static baseline, indicating effective artifact removal without distorting the neural signal of interest.
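
The ERP comparison in the last step can be sketched for a single channel as follows. The 250-500 ms search window for the P300 peak, the function names, and the simulated epochs are illustrative assumptions. Topography and congruency effects would need multi-channel, condition-labeled data and are not covered here.

```python
import numpy as np

def erp(epochs):
    """Average time-locked epochs of shape (n_trials, n_samples) into an ERP."""
    return epochs.mean(axis=0)

def peak(erp_wave, times, tmin=0.25, tmax=0.50):
    """Amplitude and latency of the largest positive peak in [tmin, tmax] seconds."""
    mask = (times >= tmin) & (times <= tmax)
    idx = np.argmax(erp_wave[mask])
    return float(erp_wave[mask][idx]), float(times[mask][idx])

def compare_erps(static_epochs, cleaned_epochs, times):
    """Compare the cleaned dynamic-condition ERP against the static baseline."""
    s, c = erp(static_epochs), erp(cleaned_epochs)
    amp_s, lat_s = peak(s, times)
    amp_c, lat_c = peak(c, times)
    return {
        "amplitude_diff": amp_c - amp_s,
        "latency_diff_s": lat_c - lat_s,
        "waveform_r": float(np.corrcoef(s, c)[0, 1]),
    }

# Simulated data: a P300-like bump at 350 ms buried in trial-by-trial noise.
rng = np.random.default_rng(0)
times = np.arange(-0.2, 0.8, 1 / 250)
p300 = 5.0 * np.exp(-((times - 0.35) ** 2) / (2 * 0.05 ** 2))
static = p300 + 2.0 * rng.standard_normal((100, times.size))
cleaned_dynamic = p300 + 3.0 * rng.standard_normal((100, times.size))
print(compare_erps(static, cleaned_dynamic, times))
```

Small amplitude and latency differences together with a high waveform correlation suggest that cleaning preserved the component. Large deviations point to residual artifact or over-cleaning.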

The Scientist's Toolkit: Essential Research Reagents & Materials

Table 3: Key Resources for Experimental Research in Artifact Removal

Resource Name / Category	Function / Application in Research	Specific Example / Note
High-Density EEG Systems	Acquisition of neural data with high spatial resolution; essential for spatial methods like ICA.	Systems with 64+ channels; often wet electrode setups for highest fidelity [90].
Wearable / Mobile EEG	Enables research in ecological settings and study of motion-related artifacts.	Systems with dry or semi-wet electrodes; typically fewer channels (e.g., <16) [35].
Phantom Head Apparatus	Provides ground-truth signals for controlled, quantitative validation of algorithms.	Conductive head model with embedded brain and artifact sources [92].
Reference Noise Sensors	Provides noise references for algorithms like iCanClean and Adaptive Filtering.	Dual-layer EEG electrodes (mechanically coupled but not in scalp contact) or EOG/EMG electrodes [92].
Public Datasets	Benchmarking and training data for algorithm development and comparison.	EEGdenoiseNet: Contains clean EEG and various artifacts for semi-synthetic mixing [28]. PhysioNet: Includes motor/imagery EEG data [93].
Computational Framework	Software environment for implementing and testing algorithms.	MATLAB with EEGLAB: Supports ICA, ASR, and plugins [91] [94]. Python with PyTorch/TensorFlow: For developing and running deep learning models like A²DM [93] [28].

Implementation and Workflow Integration

Successfully integrating an artifact removal algorithm into a research pipeline requires careful consideration of the entire workflow, from data acquisition to final analysis.

Standardized Processing Pipeline

A generalized, recommended workflow for applying these algorithms is depicted below.

EEG Processing Pipeline: A standard workflow showing the crucial placement of the artifact removal step after initial preprocessing and before final data analysis.

Practical Implementation Notes

Preprocessing is Critical: All advanced artifact removal methods require properly preprocessed data. Essential steps include band-pass filtering (e.g., 1-100 Hz), notch filtering for line noise, and the identification and interpolation of bad channels. This ensures algorithm stability and performance.
Parameter Optimization is Non-Trivial: The performance of ASR (k value), iCanClean (R² threshold, window size), and ICA (algorithm variant, rejection criteria) is highly sensitive to parameter settings. These should be optimized for a specific dataset and then held constant for a given study to ensure consistency. For instance, an ASR k value of 20-30 is often a safe starting point, while a lower k of 10 may be used for more aggressive cleaning of motion artifacts [91].
Validation is Mandatory: Never assume an algorithm has worked perfectly. Always include a validation step, such as visualizing the data before and after cleaning, examining the removed components (in the case of ICA), or checking the integrity of expected neural signatures (e.g., ERPs, oscillatory power in canonical bands).
Consider the Channel Count: ICA's effectiveness diminishes with low-channel-count EEG (e.g., <16 channels) [35]. In such cases, single-channel methods (e.g., FF-EWT [15]) or algorithms like iCanClean with pseudo-reference signals or ASR may be more appropriate.
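
The preprocessing step described in the first note can be sketched as follows. This is a minimal illustration using SciPy, not a prescribed pipeline: the function name `preprocess`, the filter order, the notch quality factor, and the 50 Hz mains frequency are assumptions chosen for the example (use 60 Hz where applicable), and bad-channel interpolation is not shown.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch, sosfiltfilt


def preprocess(eeg, fs, band=(1.0, 100.0), line_freq=50.0, notch_q=30.0):
    """Zero-phase band-pass and notch filtering of a (channels, samples) array.

    All defaults are illustrative assumptions, not recommended settings.
    """
    # 4th-order Butterworth band-pass, expressed as second-order sections
    # for numerical stability.
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, eeg, axis=-1)
    # Narrow IIR notch centred on the assumed mains frequency.
    b, a = iirnotch(line_freq, notch_q, fs=fs)
    return filtfilt(b, a, filtered, axis=-1)
```

Calling `preprocess(data, fs=500.0)` on a channels-by-samples array returns an array of the same shape. Forward-backward filtering is used here so that the filters add no phase shift, which matters when ERP latencies are analyzed afterwards.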

Electroencephalography (EEG) provides a non-invasive window into brain dynamics, enabling the study of cognitive processes through event-related potentials (ERPs) and spectral power analyses. However, the fidelity of these neural signatures is profoundly threatened by physiological artifacts, primarily electrooculographic (EOG) and electromyographic (EMG) signals. Within the broader context of EOG vs. EMG artifacts research, this whitepaper examines their distinct and compounding impacts on two critical end-game analyses: ERP decoding and spectral power estimation. EOG artifacts, originating from eye blinks and movements, introduce high-amplitude, low-frequency drifts that can obscure cognitive ERPs like the P300 or N400 [95]. In contrast, EMG artifacts, generated by pericranial muscle activity, constitute a broadband, high-frequency noise that can masquerade as or mask genuine neural oscillations in the beta and gamma ranges [19]. A systematic evaluation is paramount, as preprocessing choices aimed at artifact mitigation can themselves significantly alter decoding outcomes and power estimates, sometimes deleteriously affecting interpretability [96]. This document provides a technical guide for researchers, summarizing quantitative impacts, detailing robust experimental protocols, and presenting essential analytical tools to safeguard the validity of neuroscientific and neuropharmacological findings.

EOG and EMG artifacts possess distinct etiologies and signal properties, leading to different interference patterns with the neural signal of interest.

EOG Artifacts: The cornea-retina dipole in the eye creates a steady electrical potential. Blinks and eye movements transiently alter this field, generating high-amplitude, slow voltage changes recorded across the scalp. These artifacts are most prominent over frontal sites, but volume conduction carries them even to posterior channels, contaminating the low-frequency bands (delta/theta) crucial for many ERPs [95]. The artifact shape is somewhat stereotyped, which facilitates certain correction methods.
EMG Artifacts: These originate from the summation of motor unit action potentials from head, face, neck, and jaw muscles. EMG signals are characterized by a broad spectral distribution (from 2 Hz up to 300 Hz), low voltage but high frequency, and a lack of stereotypy [19]. Crucially, their spectral profile overlaps with all classic EEG frequency bands, so they can inflate power estimates across the entire spectrum. Pericranial muscle activity is also exquisitely sensitive to psychologically interesting processes, so it can covary with experimental conditions and create a potent confound [19].

Comparative Table of Artifact Properties

Table 1: Characteristic Properties of EOG and EMG Artifacts

Property	EOG Artifacts	EMG Artifacts
Origin	Cornea-retinal dipole; eye blinks & movements [95]	Pericranial muscle activity (face, jaw, neck, forehead) [19]
Spectral Profile	Predominantly low-frequency (Delta, Theta) [95]	Broadband, up to 300 Hz; peaks ~100 Hz [19]
Topography	Maximal at frontal sites, declines posteriorly [95]	Widespread, can be detected anywhere on the scalp [19]
Amplitude	High (can exceed 100 µV) [95]	Variable, can be low-amplitude but persistent [19]
Key Impact on ERP	Obscures low-frequency ERP components (e.g., P300, N400) [95]	Can add high-frequency noise to ERP waveform [19]
Key Impact on Spectral Power	Inflates low-frequency power estimates [95]	Inflates power estimates across all bands, especially Beta/Gamma [19]

Quantitative Impact of Preprocessing on Decoding

Preprocessing decisions, including those for artifact handling, are not neutral; they directly and systematically shape the outcome of decoding analyses.

Table 2: Impact of Preprocessing Choices on EEG Decoding Performance (Based on [96])

Preprocessing Step	Variation	Impact on Decoding Performance
Artifact Correction	Applying ICA, Autoreject (AR)	↓ Decrease in performance across experiments and models. May remove neural information or structured noise used by the classifier [96].
High-Pass Filtering	Higher cutoff (e.g., 1 Hz vs. 0.1 Hz)	↑ Increase in performance. Effectively removes slow drifts, including those from EOG [96].
Low-Pass Filtering	Lower cutoff (e.g., 30 Hz vs. 40 Hz)	↑ Increase for time-resolved classifiers; effect less clear for EEGNet. Reduces high-frequency noise, including EMG [96].
Baseline Correction	Applying vs. Not Applying	↑ Increase for EEGNet. Helps center data and improve generalization [96].
Linear Detrending	Applying vs. Not Applying	↑ Increase for time-resolved classifiers. Removes linear drifts [96].

A critical finding is that while artifact correction steps often reduce decoding accuracy, this may be because the classifier is learning to exploit the structured artifactual noise rather than the neural signal. Therefore, maximizing raw performance without careful artifact handling can come at the expense of interpretability and model validity [96].

Experimental Protocols for Artifact Handling

A Multiverse Approach to Preprocessing

Given the significant impact of preprocessing, a "multiverse" approach—systematically evaluating multiple pipeline permutations—is recommended over a single, fixed pipeline [96]. The following protocols outline standard methods for EOG and EMG mitigation within such a framework.
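
As a rough illustration of what a multiverse entails, the sketch below enumerates every combination of a small option grid. The option names and values are hypothetical; they mirror the variations listed in Table 2 and are not the exact grid used in [96].

```python
from itertools import product

# Hypothetical option grid; values mirror the variations in Table 2.
OPTIONS = {
    "high_pass_hz": [0.1, 1.0],
    "low_pass_hz": [30.0, 40.0],
    "artifact_correction": ["none", "ica"],
    "baseline_correction": [False, True],
    "linear_detrend": [False, True],
}


def enumerate_pipelines(options):
    """Yield one dict per combination of preprocessing choices."""
    keys = list(options)
    for values in product(*(options[key] for key in keys)):
        yield dict(zip(keys, values))


pipelines = list(enumerate_pipelines(OPTIONS))
print(len(pipelines))  # 2 * 2 * 2 * 2 * 2 = 32 pipeline variants
```

Each dictionary would then parameterize one full preprocessing and decoding run, so that the effect of every choice can be compared across the whole set rather than judged from a single pipeline.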

Workflow: Generalized EEG Preprocessing for ERP and Spectral Analysis

The following diagram outlines the key decision points in a comprehensive preprocessing pipeline, highlighting stages specific to EOG and EMG handling.

Protocol 1: Handling EOG Artifacts using Wavelet-Enhanced ICA

The standard method for EOG correction is Independent Component Analysis (ICA). However, simply rejecting artifact-related components can discard residual neural information. Advanced methods like Wavelet-Enhanced ICA (wICA) offer a more nuanced correction [7].

Procedure:

Filter & Preprocess: High-pass filter the continuous data at a low cutoff (e.g., 1 Hz) to remove slow drifts and improve ICA stability.
ICA Decomposition: Run an ICA algorithm (e.g., Infomax or Extended Infomax) on the filtered continuous data to decompose it into statistically independent components (ICs).
Identify Ocular ICs: Identify components related to blinks and eye movements using:
- Topography: Strong frontal, bipolar focus typical of EOG.
- Time-course: High-amplitude, transient deflections corresponding to blinks/saccades.
- Spectrum: Dominated by low frequencies.
Wavelet-Based Correction (wICA):
- Instead of rejecting the entire component, apply a Discrete Wavelet Transform (DWT) to the time-course of the EOG-contaminated IC.
- Identify and threshold wavelet coefficients with anomalously high amplitudes (corresponding to the artifact peaks).
- Reconstruct the "cleaned" IC time-course using the inverse DWT. This removes the high-amplitude EOG spikes while preserving lower-amplitude neural activity within the same component [7].
Reconstruct Data: Project the corrected components back to the sensor space using the inverse ICA transform.
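
The wavelet-based correction step can be sketched for a single component time-course as shown below. This is a simplified illustration, not the published wICA implementation: it uses a hand-written Haar transform to stay dependency-free, a universal threshold estimated from the finest detail level, and assumes a zero-mean component whose length is divisible by 2 to the power of the number of levels. In practice a library such as PyWavelets (`pywt.wavedec` and `pywt.waverec`) and a smoother wavelet would normally be used.

```python
import numpy as np


def haar_dwt(x, levels):
    """Multilevel orthonormal Haar DWT; returns (approx, [detail_1, ..., detail_L])."""
    approx = np.asarray(x, dtype=float)
    if approx.size % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return approx, details


def haar_idwt(approx, details):
    """Inverse of haar_dwt: rebuild the signal from the coarsest level upward."""
    for detail in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + detail) / np.sqrt(2.0)
        out[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = out
    return approx


def wavelet_clean_component(ic, levels=6):
    """Zero anomalously large wavelet coefficients of one IC time-course."""
    ic = np.asarray(ic, dtype=float)
    approx, details = haar_dwt(ic, levels)
    # Robust estimate of the background scale from the finest detail level.
    sigma = np.median(np.abs(details[0])) / 0.6745
    threshold = sigma * np.sqrt(2.0 * np.log(ic.size))

    def suppress(coeffs):
        # Coefficients above the threshold are treated as artifact and removed.
        return np.where(np.abs(coeffs) > threshold, 0.0, coeffs)

    return haar_idwt(suppress(approx), [suppress(d) for d in details])
```

The cleaned time-course replaces the original ocular component before the inverse ICA projection. Because only supra-threshold coefficients are removed, low-amplitude activity in the same component is left in place, which is the property that distinguishes this approach from rejecting the component outright.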

Protocol 2: Addressing EMG Artifacts

EMG artifact correction is more challenging due to its non-stereotyped, broadband nature. A combination of filtering and component rejection is currently the most viable approach.

Procedure:

Visual Inspection: Plot the data and note periods of sustained high-frequency, low-amplitude activity, particularly over temporal sites (from temporalis muscles) or the neck (affecting mastoids).
Spectral Inspection: Calculate the power spectral density. Unexplained elevations in power above 20 Hz, particularly a "shoulder" in the 20-60 Hz range, can indicate EMG contamination [19].
ICA for EMG:
- Perform ICA on the continuous data (as in the EOG protocol).
- Identify myogenic components using:
  - Spectrum: Broad, high-frequency content without a clear peak in the alpha band.
  - Topography: Focal projections over muscle groups (e.g., temporal, neck) [19].
- Given the difficulty of perfectly separating neural from myogenic activity in components and the risk of neural signal loss, a conservative rejection strategy is often applied: only ICs unambiguously classified as EMG are removed.
Leverage Filtering: Applying a low-pass filter with a cutoff of 30-40 Hz during preprocessing can attenuate the highest-frequency EMG content. However, this also removes genuine high-frequency neural activity (Gamma band), so its use must be justified by the research question [96].
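
The spectral inspection step can be turned into a simple screening statistic, sketched below under stated assumptions. The function name `emg_power_ratio`, the 1-20 Hz reference band, and the two-second Welch window are illustrative choices; the 20-60 Hz band follows the "shoulder" range mentioned in the protocol. The ratio has no universal cutoff and is only meaningful when compared across channels, epochs, or conditions within the same recording setup.

```python
import numpy as np
from scipy.signal import welch


def emg_power_ratio(signal, fs, emg_band=(20.0, 60.0), ref_band=(1.0, 20.0)):
    """Ratio of power in an EMG-prone band to power in a lower reference band."""
    # Welch PSD with two-second segments (0.5 Hz resolution).
    freqs, psd = welch(signal, fs=fs, nperseg=int(2 * fs))
    resolution = freqs[1] - freqs[0]

    def band_power(low, high):
        mask = (freqs >= low) & (freqs < high)
        return np.sum(psd[mask]) * resolution

    return band_power(*emg_band) / band_power(*ref_band)
```

A channel or epoch whose ratio is far above that of its neighbours is a candidate for closer visual inspection, not an automatic rejection.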

The Scientist's Toolkit: Essential Research Reagents & Tools

Table 3: Key Tools and Software for EEG Artifact Research

Tool / Material	Type	Primary Function in Artifact Research
MNE-Python [96]	Software Library	A comprehensive open-source Python package for exploring, visualizing, and analyzing human neurophysiological data. It provides implementations for filtering, ICA, and other core preprocessing steps.
MNE-ICA (`mne.preprocessing.ICA`)	Algorithm	The ICA implementation within MNE-Python (the `mne.preprocessing.ICA` class), a cornerstone technique for identifying and removing EOG and EMG artifacts.
BrainVision Analyzer [95]	Commercial Software	A widely used commercial platform for EEG/ERP analysis that includes integrated tools for artifact detection, correction, and rejection.
EEGNet [96]	Neural Network	A compact convolutional neural network for EEG-based decoding. Useful for evaluating the functional impact of different artifact handling pipelines on end-game classification performance.
Wavelet Toolbox (MATLAB) / PyWavelets (`pywt`, Python)	Software Library	Provides functions for Discrete Wavelet Transform (DWT), essential for implementing advanced artifact correction methods like wICA [7] or the FF-EWT method [15].
FieldTrip	Software Toolbox	An open-source MATLAB toolbox for advanced analysis of MEG, EEG, and other electrophysiological data, offering alternative implementations for time-frequency and component analysis.

Advanced Methods & Future Directions

While ICA is the current standard, research into fully automated and single-channel methods is advancing rapidly, which is crucial for portable EEG and brain-computer interface (BCI) applications.

Fixed Frequency EWT with GMETV Filtering: This is a novel, fully automated method designed for single-channel EEG. It uses Fixed Frequency Empirical Wavelet Transform (FF-EWT) to decompose the signal and automatically identify artifact-dominated components using kurtosis, dispersion entropy, and power spectral density. These components are then cleaned with a Generalized Moreau Envelope Total Variation (GMETV) filter, which effectively suppresses EOG artifacts while preserving the integrity of the underlying EEG [15]. This method shows promise for clinical and mobile settings where multi-channel setups are impractical.
Multiscale Entropy (MSE) for Signal Quality Assessment: Beyond removing artifacts, it is crucial to assess the quality of the cleaned signal. MSE quantifies the complexity of a time series across multiple temporal scales. It is sensitive to both linear and nonlinear dynamics and can potentially be used to detect residual artifact contamination or signal distortion caused by overzealous preprocessing that standard metrics might miss [97].
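
A compact sketch of multiscale entropy is given below to make the idea concrete. It assumes the common formulation in which the signal is coarse-grained by non-overlapping averaging and sample entropy is computed at each scale with a tolerance fixed from the original signal; the parameter defaults (m = 2, r = 0.2) are conventional choices, not values taken from [97]. The pairwise distance matrix makes it suitable only for short segments.

```python
import numpy as np


def sample_entropy(x, m, tolerance):
    """Sample entropy with embedding dimension m and an absolute tolerance."""
    x = np.asarray(x, dtype=float)
    n_templates = x.size - m

    def count_matches(length):
        templates = np.array([x[i:i + length] for i in range(n_templates)])
        # Chebyshev distance between every pair of templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        return np.sum(dist <= tolerance) - n_templates  # drop self-matches

    matches_m, matches_m1 = count_matches(m), count_matches(m + 1)
    if matches_m == 0 or matches_m1 == 0:
        return np.inf
    return -np.log(matches_m1 / matches_m)


def multiscale_entropy(x, scales=(1, 2, 3, 4, 5), m=2, r=0.2):
    """Sample entropy of the coarse-grained signal at each scale."""
    x = np.asarray(x, dtype=float)
    tolerance = r * np.std(x)  # fixed from the original signal
    values = []
    for scale in scales:
        n = x.size // scale
        coarse = x[:n * scale].reshape(n, scale).mean(axis=1)
        values.append(sample_entropy(coarse, m, tolerance))
    return np.array(values)
```

Comparing the entropy-versus-scale curve before and after cleaning is one way to look for distortion: a marked change at scales where no artifact was expected would warrant a closer look at the pipeline.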

Decision Process: Selecting an Artifact Handling Strategy

The choice of method depends on data characteristics and research goals, guided by the following decision workflow:

The analysis of electroencephalography (EEG) signals is fundamental to neuroscience research, clinical diagnostics, and the development of neuropharmaceuticals. However, a significant challenge persists: the pervasive contamination of these signals by non-cerebral artifacts, primarily electrooculographic (EOG) and electromyographic (EMG) sources. EOG artifacts, generated by eye movements and blinks, manifest as high-amplitude, low-frequency peaks that can obscure underlying neural activity [7] [98]. EMG artifacts, originating from muscle contractions in the face, neck, and scalp, present as broad-spectrum, high-frequency noise that overlaps with and can mask crucial EEG rhythms [31]. The central dilemma in preprocessing this data lies in choosing between two fundamental strategies: the outright rejection of contaminated data segments or components, which risks losing valuable neural information, or the correction of these artifacts, which may introduce signal distortion and alter the spectral characteristics of the EEG [7] [54]. This trade-off is particularly critical in drug development, where precise measurement of neural signals can be essential for assessing a compound's neurophysiological effects. The choice between rejection and correction is not merely technical but philosophical, balancing the preservation of data integrity against the completeness of the dataset.

Physiological Artifacts: A Comparative Analysis of EOG and EMG

Understanding the distinct characteristics of EOG and EMG artifacts is a prerequisite for selecting an appropriate removal strategy. While both are physiological artifacts, their origins, spatial distributions, and spectral properties differ significantly, which in turn influences the optimal approach for their mitigation.

EOG Artifacts: The eye acts as an electric dipole due to the charge difference between the cornea and retina. Eye blinks and movements shift this dipole, generating an electric field measurable on the scalp [98]. These artifacts are characterized by their high amplitude (often 100–200 µV, which is an order of magnitude larger than typical EEG signals) and their predominance in the low-frequency range (delta and theta bands) [7] [98]. Spatially, they are most prominent over frontal electrodes (e.g., Fp1, Fp2), and their morphology consists of slow, large-amplitude deflections [98].
EMG Artifacts: These originate from the electrical activity of muscle contractions during jaw clenching, talking, or frowning [31] [98]. In contrast to EOG, EMG artifacts are broadband and high-frequency, with significant energy in the beta (13–30 Hz) and gamma (>30 Hz) ranges, directly overlapping with frequencies crucial for studying cognitive and motor processes [31]. EMG is detectable across the entire scalp due to volume conduction, with peak amplitudes that can be 75–400 µV, vastly exceeding typical EEG differences [31]. A critical challenge is that facial EMG is sensitive to cognitive and affective processes, creating a high risk of temporal confounding in experimental manipulations [31].

Table 1: Characteristics of EOG and EMG Artifacts

Feature	EOG Artifact	EMG Artifact
Origin	Corneo-retinal dipole (eye movements/blinks) [98]	Muscle contractions (face, neck, jaw) [31]
Spectral Profile	Low-frequency (Delta/Theta bands) [7]	Broadband, high-frequency (Beta/Gamma bands) [31]
Spatial Topography	Maximal over frontal electrodes (e.g., Fp1, Fp2) [98]	Widespread, detectable across scalp [31]
Amplitude	100–200 µV [98]	75–400 µV [31]
Key Challenge	Large amplitude peaks corrupting neural signals [7]	Spectral overlap with neural signals; sensitive to psychology [31]

Methodological Approaches: From Classic Rejection to Advanced Correction

The evolution of artifact handling methodologies reflects a steady movement from simple, destructive approaches toward complex, reconstructive ones. The following workflow outlines the key decision points and processes involved in a modern, hybrid artifact handling pipeline.

Figure 1: Workflow for handling EOG/EMG artifacts, showing the decision point between rejection and correction strategies.

The Rejection Paradigm

The simplest and most conservative approach is artifact rejection. This can be applied at the level of data epochs (removing entire time segments contaminated by an artifact) or at the level of independent components (after applying a blind source separation algorithm like ICA) [7] [25].

Epoch Rejection: This method involves visually or automatically identifying and discarding segments of data that contain artifacts. While straightforward, it is labor-intensive and can lead to a significant loss of experimental trials. In event-related potential (ERP) studies, this can critically reduce the signal-to-noise ratio after averaging [7].
Component Rejection: Independent Component Analysis (ICA) decomposes the EEG signal into statistically independent components (ICs) [7]. Artifactual components are identified and entirely removed before signal reconstruction. The primary risk is that neural information present within the artifactual component is permanently lost, as ocular sources are not always entirely separated from neural sources [7] [25]. A survey of BCI literature found that many studies do not adequately report how, or if, they handle artifact components [25].

The Correction Paradigm

Correction methods aim to remove the artifactual content while preserving the neural information within the signal. These techniques represent a more sophisticated but computationally complex approach.

Wavelet-Enhanced ICA (wICA): This hybrid method improves upon simple component rejection. After ICA identifies an EOG-artifactual component, the Discrete Wavelet Transform (DWT) is applied to it. Wavelet coefficients corresponding to the high-amplitude artifact are identified via thresholding and set to zero, while coefficients representing neural activity are retained. The component is then reconstructed and used in the inverse ICA transform [7]. An improved version performs this correction selectively, only within the identified EOG activity regions, further minimizing neural data loss [7].
Adaptive Neural Fuzzy Inference Systems (ANFIS): This approach combines the learning capability of neural networks with the fuzzy logic reasoning of IF-THEN rules. It can be configured as an adaptive filter to model and subtract EOG and EMG artifacts from EEG signals. To enhance its performance, it can be combined with a Functional Link Neural Network (FLNN), which improves its nonlinear approximation capability, leading to better artifact removal compared to standalone ANFIS or RBF-ANFIS systems [54].
Deep Learning (LSTM-ICA): For situations where reference EOG recordings are unavailable, a Long Short-Term Memory (LSTM) network can be trained to estimate the horizontal and vertical EOG signals directly from the contaminated EEG data. These estimated EOG signals are then used alongside the EEG in an ICA procedure to facilitate the separation and removal of the artifactual components [26].
Single-Channel Techniques (FF-EWT + GMETV): For portable, single-channel EEG systems, methods like Fixed Frequency Empirical Wavelet Transform (FF-EWT) are used. The signal is decomposed into components, and features like kurtosis and power spectral density identify artifact-laden components. A Generalized Moreau Envelope Total Variation (GMETV) filter then removes the artifacts while preserving the low-frequency EEG information [15].
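
To illustrate the adaptive-filter configuration that reference-based correction methods build on, the sketch below uses a plain normalized LMS filter. This is deliberately a much simpler, linear stand-in and is not ANFIS or FLNN-ANFIS: it only shows the shared principle of estimating the artifact from a reference channel and subtracting it. The function name, tap count, and step size are assumptions made for the example.

```python
import numpy as np


def lms_cancel(primary, reference, n_taps=8, mu=0.2, eps=1e-8):
    """Normalized LMS noise canceller.

    primary   : contaminated EEG channel (neural signal plus artifact)
    reference : artifact reference (e.g., an EOG channel)
    Returns the primary channel minus the adaptively estimated artifact.
    """
    primary = np.asarray(primary, dtype=float)
    reference = np.asarray(reference, dtype=float)
    weights = np.zeros(n_taps)
    cleaned = primary.copy()  # first n_taps - 1 samples pass through unchanged
    for n in range(n_taps - 1, primary.size):
        window = reference[n - n_taps + 1:n + 1][::-1]  # most recent sample first
        artifact_estimate = weights @ window
        cleaned[n] = primary[n] - artifact_estimate
        # Normalized update: step size scaled by the reference energy.
        weights += mu * cleaned[n] * window / (window @ window + eps)
    return cleaned
```

A linear filter of this kind can only remove artifact content that is linearly related to the reference, which is the limitation that motivates the nonlinear approximators described above [54].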

Quantitative Comparison: Weighing the Evidence

The theoretical trade-off between rejection and correction is best evaluated through empirical data. The following table summarizes key performance metrics from various studies, providing a basis for comparing the effectiveness of different methodologies.

Table 2: Performance Comparison of Artifact Handling Methods

Method	Artifact Type	Key Performance Metrics	Reported Advantages	Reported Limitations
Component Rejection (ICA)	EOG, EMG	N/A (Component loss) [7]	Simple, widely used [25]	Loss of neural data in rejected component; distorts spectral features [7]
Wavelet-Enhanced ICA (wICA)	EOG	Outperforms component rejection in time/spectral accuracy [7]	Reduces neural data loss; fully automatic [7]	Threshold selection is crucial [7]
FLNN-ANFIS Filter	EOG, EMG	Better performance than ANFIS, RBF-ANFIS, adaptive FL-BPNN [54]	Handles non-stationary, random EEG signals; good nonlinear approximation [54]	Computational expense can be high [54]
LSTM-ICA	EOG	Lower Mean Squared Error (MSE) and Mean Absolute Error (MAE) vs. other deep learning methods [26]	Does not require reference EOG channels [26]	LSTM setup can be trial-and-error; needs quality datasets [26]
FF-EWT + GMETV	EOG (Single-Channel)	Lower RRMSE, higher CC (synthetic data); improved SAR, MAE (real data) [15]	Effective for single-channel EEG; preserves low-frequency info [15]	Performance dependent on accurate decomposition [15]
Fast Automatic BSS	Ocular, Cardiac, Muscle	88% overall artifact removal (Ocular: 81%, Cardiac: 84%, Muscle: 98%) [39]	Fast computation; suitable for online/real-time correction [39]	Not all artifacts are removed (e.g., 19% of ocular artifacts remain) [39]

Table 3: Research Reagent Solutions for EOG/EMG Artifact Studies

Tool / Solution	Function in Research	Application Context
Independent Component Analysis (ICA)	Blind source separation to decompose multi-channel EEG into independent components for artifact identification [7] [39]	Foundational step for both rejection and correction paradigms in multi-channel studies [25]
Discrete Wavelet Transform (DWT)	Time-frequency decomposition to isolate and remove artifactual coefficients from signals or components [7]	Core to correction methods like wICA; allows selective removal of high-amplitude artifact features [7]
Adaptive Neural Fuzzy Inference System (ANFIS)	Adaptive filter that combines neural network learning and fuzzy logic to model and subtract artifacts [54]	Used in nonlinear filtering of EOG and EMG artifacts, often enhanced with FLNN [54]
Long Short-Term Memory (LSTM) Network	Type of recurrent neural network that learns temporal sequences to estimate EOG signals from EEG [26]	Enables artifact correction without dedicated EOG reference channels [26]
Fixed Frequency EWT (FF-EWT)	Adaptive signal decomposition method that creates filters tailored to the input signal's spectral components [15]	Key for artifact isolation in single-channel EEG recordings where BSS is not applicable [15]
Semi-Simulated Datasets	Data constructed by adding real EOG/EMG recordings to clean EEG, providing a ground truth for validation [26] [15]	Essential for quantitative validation and comparison of new artifact removal algorithms [31]

Experimental Protocols for Method Validation

Robust validation is critical for assessing the sensitivity and specificity of any artifact handling method. The following protocols are commonly used in the field.

Protocol for Validation with Scripted Data

This protocol uses participant instruction to systematically generate artifact-contaminated and clean data [31].

Participant Preparation: Fit participants with a high-density EEG cap (e.g., 125-channel). Ensure proper electrode placement and impedance.
Experimental Design: Employ a cross-factorial design where neurogenic and myogenic activation are independently varied. For example:
- Alpha-blocking manipulation: Eyes open vs. eyes closed.
- Muscle activation: Tensing facial/neck muscles vs. muscle quiescence.
Data Recording: Record EEG data across all conditions. Instruct participants to tense or relax muscles in blocks, with adequate rest periods.
Data Analysis:
- Sensitivity Analysis: Apply the correction method (e.g., GLM, ICA) and compare the "corrected" EMG-contaminated data to uncorrected EMG-free data in an anterior, myogenic region of interest (ROI). Use equivalence tests to confirm the correction effectively removes EMG.
- Specificity Analysis: Similarly, compare corrected and uncorrected data in a posterior, neurogenic ROI (e.g., showing alpha-blocking) to ensure the correction preserves genuine neural signals.
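
The equivalence tests mentioned in the analysis step can be sketched as a paired two one-sided tests (TOST) procedure. The function name `paired_tost` and the idea of one value per participant (for example, log band power in the ROI) are assumptions for illustration; the equivalence bound must be chosen and justified by the researcher and is not specified in the protocol above.

```python
import numpy as np
from scipy import stats


def paired_tost(a, b, bound):
    """Paired TOST: p-value for equivalence of mean(a - b) within +/- bound."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    n = diff.size
    standard_error = diff.std(ddof=1) / np.sqrt(n)
    # Test 1, H0: mean difference <= -bound (reject for large t).
    t_lower = (diff.mean() + bound) / standard_error
    p_lower = stats.t.sf(t_lower, df=n - 1)
    # Test 2, H0: mean difference >= +bound (reject for small t).
    t_upper = (diff.mean() - bound) / standard_error
    p_upper = stats.t.cdf(t_upper, df=n - 1)
    # Equivalence is supported only if both one-sided tests reject.
    return max(p_lower, p_upper)
```

A small returned p-value supports the claim that corrected and reference data differ by less than the bound, which is a stronger statement than merely failing to find a significant difference.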

Protocol for Validation with Semi-Simulated Data

This protocol is advantageous because the ground truth (clean EEG) is definitively known [26] [15].

Data Collection:
- Record clean EEG data from participants in a state that minimizes artifacts (e.g., closed eyes, relaxed muscles).
- Separately, record pure EOG and EMG artifacts.
Data Contamination: Artificially add the recorded EOG/EMG signals to the clean EEG data at varying amplitudes to create a contaminated dataset. The original clean EEG serves as the ground truth.
Method Application: Apply the artifact removal algorithm (e.g., LSTM-ICA, FF-EWT+GMETV) to the contaminated dataset.
Performance Quantification: Calculate error metrics between the algorithm's output and the ground truth clean EEG. Common metrics include:
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
- Correlation Coefficient (CC)
- Signal-to-Artifact Ratio (SAR)
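
The contamination and quantification steps can be sketched as below. The helper names are hypothetical, and the mixing rule (scaling the artifact so that the clean-to-artifact power ratio matches a requested value in dB) is one common convention for semi-simulated data, assumed here for illustration. SAR is omitted because its definition varies between studies.

```python
import numpy as np


def rms(values):
    """Root mean square of a 1-D array."""
    return np.sqrt(np.mean(np.square(values)))


def contaminate(clean, artifact, snr_db):
    """Add `artifact` to `clean`, scaled to give the requested SNR in dB."""
    scale = rms(clean) / (rms(artifact) * 10 ** (snr_db / 20))
    return clean + scale * artifact


def error_metrics(estimate, clean):
    """MSE, MAE, and correlation coefficient against the ground truth."""
    error = estimate - clean
    return {
        "mse": float(np.mean(error ** 2)),
        "mae": float(np.mean(np.abs(error))),
        "cc": float(np.corrcoef(estimate, clean)[0, 1]),
    }
```

Sweeping `snr_db` over a range of values yields a family of contaminated signals, so that an algorithm can be characterized across mild and severe contamination instead of at a single operating point.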

The debate between data rejection and signal correction is not a binary choice but a strategic decision that must be informed by the research context. Rejection methods, such as ICA component removal, offer simplicity but carry an inherent risk of discarding valuable neural information, potentially biasing results in studies with limited trials or where the artifact and neural signal are poorly separated [7] [25]. In contrast, correction methods like wICA, ANFIS, and deep learning hybrids strive to preserve data integrity at the cost of greater algorithmic complexity and the potential for introducing subtle distortions if not properly validated [7] [54] [26].

The future of artifact handling lies in the development of intelligent, context-aware systems. There is a clear trend towards fully automatic, computationally efficient algorithms that are suitable for online applications like brain-computer interfaces and clinical monitoring [39] [15]. Deep learning approaches show immense promise in estimating and removing artifacts without the need for reference channels [26]. Furthermore, as mobile EEG applications grow, robust single-channel techniques will become increasingly important [15]. For the researcher and drug development professional, the path forward involves a careful, validated approach. The selection of an artifact handling strategy should be guided by the specific artifact type (EOG vs. EMG), the experimental design, the available data density, and, most critically, a rigorous validation protocol that quantitatively demonstrates both the sensitivity of artifact removal and the specificity of neural signal preservation.

The rigorous comparison of artifact removal techniques for electrooculogram (EOG) and electromyogram (EMG) contamination in electrophysiological signals is a cornerstone of reliable neuroscience and clinical research. The absence of standardized resources can lead to challenges in reproducing results, validating new algorithms, and translating research findings into practical tools, particularly for drug development professionals assessing neurotherapeutics. This whitepaper provides an in-depth guide to publicly available datasets and benchmarking tools specifically curated for the standardized evaluation of EOG and EMG artifact research. By framing these resources within the context of a broader thesis on EOG vs. EMG artifacts, this document aims to empower researchers with the tools necessary for conducting reproducible, comparable, and robust analyses, thereby accelerating progress in the field.

The Critical Need for Standardization in Artifact Research

Artifacts pose a significant threat to the validity of electrophysiological data interpretation. EOG artifacts, originating from eye movements and blinks, and EMG artifacts, from muscle activity, exhibit overlapping spectral characteristics with genuine neural signals, making their separation particularly challenging [8]. Traditional artifact removal methods, such as regression, independent component analysis (ICA), and filtering, have inherent limitations, including the need for reference channels or manual inspection [8] [99].

The emergence of deep learning (DL) models has transformed the field, offering automated, end-to-end artifact removal. For instance, models like CLEnet, which integrates dual-scale CNN and LSTM with an attention mechanism, demonstrate superior performance in removing mixed and unknown artifacts from multi-channel EEG data [8]. However, the proliferation of novel algorithms necessitates standardized benchmarks to ensure these methods are evaluated fairly and consistently. Without such standards, it becomes difficult to determine whether performance improvements are due to algorithmic advances or simply variations in evaluation datasets and metrics. Furthermore, for drug development, where EEG may be used as a biomarker for drug efficacy or safety, consistent artifact handling is paramount to avoid misinterpretation of a drug's effect on the central nervous system.

A Curated Repository of Public Datasets

Below is a synthesis of key public datasets that facilitate standardized research into EOG and EMG artifacts. These resources are critical for training, testing, and benchmarking artifact removal algorithms.

Table 1: Public Datasets for EOG and EMG Artifact Research

Dataset Name	Modality	Key Features & Artifacts	Subject Count & Tasks	Access Information
EEGdenoiseNet [8]	EEG (Semi-synthetic)	Includes clean EEG, EMG, and EOG; allows controlled generation of contaminated signals for benchmarking.	N/A (Pre-recorded signal segments)	[8]
NeBULA [100]	hd-EEG, sEMG	Simultaneous EEG-EMG during standardized reaching tasks; real movement artifacts.	40 healthy subjects	Formatted in BIDS [100]
TUH EEG Artifact Corpus (TUAR) [101]	Clinical EEG	Real-world clinical EEG with annotations for various artifacts, including EMG and EOG.	Vast corpus (thousands of recordings)	Requires registration [101]
CHB-MIT Scalp EEG Database [102]	Clinical EEG	Long-term EEG from pediatric patients with seizures; contains natural artifacts.	22 subjects	Open Access via PhysioNet [102]
EMG for Gesture Recognition with Arm Translation [103]	EMG	Focus on EMG variability across different arm positions; useful for studying motion artifacts.	8 subjects performing 6 hand gestures	[103]

These datasets cater to different research needs. Semi-synthetic datasets like EEGdenoiseNet are ideal for initial algorithm development and controlled comparison because the ground-truth clean EEG is known [8]. In contrast, real-world datasets like NeBULA and the TUH Corpora provide ecological validity, capturing the complex, multi-source artifact environments encountered in actual experiments or clinical settings [100] [101]. The NeBULA dataset is particularly noteworthy for its adherence to the Brain Imaging Data Structure (BIDS) standard, which promotes FAIR (Findable, Accessible, Interoperable, and Reusable) principles and simplifies data sharing and reuse [100].

Benchmarking Tools and Experimental Protocols

The ABOT Benchmarking Tool

To address the challenge of comparing myriad machine learning-based artifact removal methods, the Artefact removal Benchmarking Online Tool (ABOT) was developed. ABOT compiles key characteristics from over 120 articles, creating a knowledge base that allows users to search and compare methods based on criteria such as the signal modality (EEG, MEG, ECoG), artifact type, and specific ML model used [104]. This tool is vital for researchers to quickly identify state-of-the-art approaches appropriate for their specific artifact-related challenges.

Standardized Experimental Protocols

Reproducibility hinges on detailed and standardized experimental methodologies. The following workflow, derived from reviewed literature, outlines a robust protocol for evaluating an artifact removal algorithm.

Workflow for Benchmarking Artifact Removal Algorithms

The protocol for a study validating a novel deep learning network, CLEnet, serves as an exemplar [8]:

Dataset Curation & Synthesis: The study utilized three distinct datasets. Dataset I and II were semi-synthetic, created by mixing clean single-channel EEG from EEGdenoiseNet with recorded EMG, EOG, and ECG signals. Dataset III consisted of real 32-channel EEG collected from healthy participants during a cognitive task, containing unknown and mixed artifacts [8].
Model Training & Architecture: CLEnet was designed with a dual-branch network. One branch used dual-scale convolutional kernels to extract morphological features at different scales, while an incorporated attention mechanism (EMA-1D) preserved temporal information. These features were then passed to an LSTM network to model temporal dependencies before final EEG reconstruction [8].
Performance Metrics: The model was evaluated using a suite of quantitative metrics to provide a multi-faceted assessment:
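- Signal-to-Noise Ratio (SNR): Measures signal quality after denoising, as the power of the clean signal relative to the power of the residual error; higher is better.
- Correlation Coefficient (CC): Quantifies how well the denoised output preserves the morphology of the clean reference EEG; values closer to 1 are better.
- Relative Root Mean Square Error (RRMSE): Assesses reconstruction error relative to the clean reference in both the temporal (t) and frequency (f) domains; lower is better [8].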
Benchmarking: CLEnet was compared against other mainstream models (e.g., 1D-ResCNN, NovelCNN, DuoCL) across all datasets and artifact types, and the study reports that it outperformed them on the metrics listed above [8].
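
The semi-synthetic mixing step and the metrics in this protocol can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the implementation used in [8]. It uses the conventional power-based SNR in dB, a Pearson correlation for CC, and a simple periodogram as the spectral estimate for the frequency-domain RRMSE. The exact formulas and spectral estimator in [8] may differ. All function names and the toy signals (a 10 Hz sine standing in for clean EEG, white noise standing in for EMG) are hypothetical.

```python
import numpy as np


def mix_at_snr(clean, artifact, target_snr_db):
    """Return clean + lam * artifact, with lam chosen to reach target_snr_db."""
    rms_clean = np.sqrt(np.mean(clean ** 2))
    rms_artifact = np.sqrt(np.mean(artifact ** 2))
    lam = rms_clean / (rms_artifact * 10 ** (target_snr_db / 20))
    return clean + lam * artifact


def output_snr_db(clean, estimate):
    """Clean-signal power over residual-error power, in dB."""
    residual = estimate - clean
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(residual ** 2))


def cc(clean, estimate):
    """Pearson correlation between the clean reference and the estimate."""
    return np.corrcoef(clean, estimate)[0, 1]


def rrmse_t(clean, estimate):
    """Time-domain RMS error relative to the RMS of the clean signal."""
    return np.sqrt(np.mean((estimate - clean) ** 2)) / np.sqrt(np.mean(clean ** 2))


def _periodogram(x):
    # Assumption: a plain periodogram stands in for the PSD estimate.
    return np.abs(np.fft.rfft(x)) ** 2 / len(x)


def rrmse_f(clean, estimate):
    """Spectral RMS error relative to the RMS of the clean spectrum."""
    p_clean, p_est = _periodogram(clean), _periodogram(estimate)
    return np.sqrt(np.mean((p_est - p_clean) ** 2)) / np.sqrt(np.mean(p_clean ** 2))


# Toy example: contaminate a stand-in "EEG" at 0 dB and score the raw mixture.
rng = np.random.default_rng(0)
fs = 256                                   # sampling rate in Hz (assumed)
t = np.arange(2 * fs) / fs                 # two seconds of samples
clean = np.sin(2 * np.pi * 10 * t)         # stand-in for clean EEG
artifact = rng.standard_normal(t.size)     # stand-in for broadband EMG
noisy = mix_at_snr(clean, artifact, target_snr_db=0.0)

for name, fn in [("SNR (dB)", output_snr_db), ("CC", cc),
                 ("RRMSE_t", rrmse_t), ("RRMSE_f", rrmse_f)]:
    print(f"{name}: {fn(clean, noisy):.3f}")
```

Scoring the uncorrected mixture, as done here, gives the baseline that any denoiser has to beat. Passing a model's output in place of `noisy` gives the corresponding post-correction scores. All four metrics need a clean reference, so they apply directly only to semi-synthetic data.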

The Scientist's Toolkit: Essential Research Reagents

This section details key computational tools and resources that form the essential "reagent solutions" for modern artifact research.

Table 2: Essential Research Reagents for Artifact Investigation

Tool/Resource	Type	Primary Function in Artifact Research
ABOT [104]	Online Benchmarking Tool	Allows search and comparison of ML-based artifact removal methods compiled from over 120 articles.
BIDS Standard [100] [101]	Data Format Standard	Ensures electrophysiology data is organized in a consistent, reproducible manner.
EEGdenoiseNet [8]	Benchmark Dataset	Provides a semi-synthetic benchmark with ground truth for EMG and EOG artifact removal.
ICA & Variants [99]	Algorithm	A family of blind source separation methods used to isolate and remove artifact components.
Wavelet Transform [35]	Algorithm	Provides time-frequency analysis useful for detecting and filtering non-stationary artifacts.
Deep Learning Models (e.g., CNN, LSTM, Transformer) [8]	Algorithm	Enable end-to-end, automated removal of complex and unknown artifacts from raw signals.

The transition from traditional methods like ICA and wavelet transforms to deep learning represents a significant shift in the field. While ICA is a powerful technique for decomposing signals into independent sources, it often requires manual component selection and performs poorly with low channel counts, a common feature of wearable EEG [35]. Deep learning models, such as the CNN-LSTM hybrid in CLEnet, address these limitations by learning to separate artifacts from neural signals in an end-to-end fashion, and they show particular promise for handling unknown artifacts in challenging, real-world conditions [8].

Visualization of Artifact Management Pipelines

The following diagram illustrates a generalized, modular pipeline for managing EOG and EMG artifacts in electrophysiological data, highlighting the decision points and algorithmic choices at each stage.

Figure: Modular Pipeline for EOG/EMG Artifact Management

This modular view underscores that there is no one-size-fits-all solution. The choice of pipeline depends on the artifact type, signal modality, and available computational resources. Research indicates that hybrid methods like EEMD-CCA can be particularly effective as a post-processing step for challenging hybrid artifacts, such as those evoked by prefrontal transcranial magnetic stimulation [99]. Meanwhile, deep learning approaches offer a unified framework that can adapt to a variety of artifacts without requiring manual pipeline adjustments [8].
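
The decision logic described above can be written as a small rule-based selector. The sketch below is purely illustrative. The rules paraphrase the qualitative guidance in this section: ICA struggles at low channel counts, EEMD-CCA can serve as a post-processing step for hybrid artifacts, and deep learning suits unknown or mixed artifacts when compute allows. The `Recording` fields, the strategy names, and the 16-channel cut-off are hypothetical assumptions, not values taken from the cited studies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Recording:
    n_channels: int
    artifact: str                     # "eog", "emg", "mixed", or "unknown"
    can_run_deep_model: bool = False  # enough compute for a trained denoiser


def choose_strategy(rec: Recording, min_ica_channels: int = 16) -> List[str]:
    """Return an ordered list of processing stages (illustrative rules only)."""
    # Unknown or mixed contamination: prefer a unified learned model if feasible.
    if rec.artifact in ("unknown", "mixed") and rec.can_run_deep_model:
        return ["deep_learning_denoiser"]

    stages = []
    # Assumed cut-off: ICA needs enough channels to separate sources.
    if rec.n_channels >= min_ica_channels:
        stages.append("ica")
    else:
        stages.append("wavelet_denoising")

    # Hybrid artifacts: add a hybrid method as a post-processing step.
    if rec.artifact == "mixed":
        stages.append("eemd_cca_postprocessing")
    return stages


print(choose_strategy(Recording(n_channels=32, artifact="eog")))
print(choose_strategy(Recording(n_channels=4, artifact="emg")))
print(choose_strategy(Recording(n_channels=32, artifact="mixed")))
```

Keeping the selection in one function makes the assumptions explicit and easy to change when a new method or a different channel threshold is validated on benchmark data.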

The advancement of research comparing EOG and EMG artifacts is closely tied to the adoption of standardized, publicly available datasets and benchmarking tools. Resources such as EEGdenoiseNet, the NeBULA dataset, the TUH Corpora, and the ABOT tool provide a foundation for conducting reproducible research, enabling fair comparisons between algorithms, and driving innovation in artifact removal techniques. As the field moves toward more complex, real-world applications, especially in wearable neurotechnology and clinical drug development, leveraging these resources will be critical for developing robust, automated methods that ensure the integrity of electrophysiological data and the validity of scientific conclusions drawn from it.

Conclusion

Effectively distinguishing and managing EOG and EMG artifacts is not a mere preprocessing step but a critical determinant of success in EEG-based biomedical research. A clear understanding of their distinct characteristics enables the selection of tailored methodologies, from robust traditional techniques like wavelet-enhanced ICA to promising unified deep learning models. The field is moving decisively towards automated, real-time solutions that can handle the complexities of mobile, low-density wearable systems and the challenge of simultaneous multi-artifact contamination. Future progress hinges on the development of sophisticated, artifact-aware algorithms and their rigorous validation using public benchmarks. For drug development and clinical research, these advancements promise more reliable neural biomarkers, enhanced BCI robustness, and ultimately, greater translational impact by ensuring that observed effects are driven by neural activity, not artifact.
