This article provides a definitive guide for researchers and biomedical professionals on tuning the two critical parameters in the iCanClean algorithm, the R2 correlation threshold and the window length, for processing mobile electroencephalography (EEG) data. We synthesize findings from recent peer-reviewed studies to establish foundational concepts, detail methodological applications for various experimental conditions, and offer troubleshooting strategies for optimization. A comparative analysis validates iCanClean's performance against other artifact removal methods like Artifact Subspace Reconstruction (ASR). The guide concludes with key takeaways and future directions for employing iCanClean in clinical and drug development research to achieve high-fidelity brain source identification during movement.
FAQ 1: What makes motion artifacts particularly problematic for ICA? Motion artifacts are highly problematic for Independent Component Analysis (ICA) because they violate the algorithm's core assumption of statistical independence between sources. The large amplitude and non-stationary nature of motion-induced signals can dominate the mixture, leading to an inaccurate decomposition. This often results in brain and artifact sources being improperly merged into single components, a failure known as "over-mixing," which makes it difficult to isolate and remove artifacts without also discarding neural data [1].
FAQ 2: How does movement intensity affect my ICA decomposition? Research shows that within individual studies, increased movement intensity significantly decreases ICA decomposition quality. The greater the motion, the more the recorded data is dominated by artifactual signals, which reduces the algorithm's ability to identify maximally independent brain components effectively [1].
FAQ 3: Can't I just use ICLabel to identify and remove motion artifact components after ICA? While ICLabel is a valuable tool, it has a significant limitation for mobile EEG data: its underlying classifier was not trained on mobile EEG data containing substantial motion artifacts. The presence of large motion artifacts can contaminate the ICA's ability to separate sources in the first place, meaning ICLabel may be working with already flawed components. Therefore, relying solely on post-ICA correction is often insufficient [2] [3].
FAQ 4: What is the fundamental difference between a method like iCanClean and traditional ICA? Traditional ICA is a blind source separation technique applied to the mixed EEG signals. In contrast, iCanClean is a preprocessing algorithm that uses reference noise signals (from dedicated sensors or created from the EEG itself) to identify and subtract noise subspaces before ICA is run. It leverages canonical correlation analysis (CCA) to find and remove components in the EEG that are highly correlated with known noise, thereby cleaning the data so that ICA can perform a more effective decomposition [4] [5] [2].
Your independent component analysis (ICA) yields few brain-like components, components that are a mixture of brain and artifact, or components that are poorly fit by a single dipole (high residual variance).
This is a classic sign that motion artifacts are hindering the ICA decomposition. The large amplitude and complex nature of motion artifacts (from head movement, cable sway, electrode pops) are dominating the signal mixture, preventing the algorithm from cleanly separating brain sources.
The most effective strategy is to reduce the motion artifact burden before performing ICA. The table below compares two prominent methods for this purpose.
Table 1: Comparison of Pre-ICA Motion Artifact Removal Methods
| Method | How It Works | Key Parameters | Optimal Use Case |
|---|---|---|---|
| Artifact Subspace Reconstruction (ASR) | Uses a sliding-window PCA to identify and remove high-variance signal subspaces that deviate from a clean baseline recording [2] [3]. | k: Standard deviation threshold for artifact rejection. A lower k is more aggressive. A k between 10 and 20 is often used for locomotion data to avoid over-cleaning [2]. | Effective for removing large, transient motion artifacts when a clean baseline period is available. |
| iCanClean | Leverages canonical correlation analysis (CCA) and reference noise signals (from dual-layer electrodes or created as pseudo-references) to subtract noise subspaces from the EEG [4] [5]. | R²: Correlation threshold (aggressiveness). Window Length: Segment size for CCA. Optimal parameters for walking data are often an R² of 0.65 and a 4-second window [4] [5]. | Superior performance when dedicated noise references are available. Also effective with pseudo-reference signals, making it highly versatile [6] [2]. |
The following tables consolidate key performance metrics from recent studies to aid in the evaluation and selection of artifact removal methods.
Table 2: iCanClean Performance Metrics on Walking EEG Data
| Study | Key Metric | Performance with iCanClean | Performance Baseline |
|---|---|---|---|
| Gonsisko et al. (2023) [4] [5] | Number of "good" ICA components (dipolar, brain-like) | 13.2 | 8.4 |
| Gonsisko et al. (2023) [4] [5] | Effective number of noise channels | 16-64 channels maintained good performance (12.0-12.7 good components) | - |
Table 3: Comparative Performance in Running EEG Paradigm (Ledwidge et al., 2025) [6] [2]
| Method | ICA Component Dipolarity | Reduction at Gait Frequency | P300 ERP Congruency Effect Recovered? |
|---|---|---|---|
| iCanClean (pseudo-reference) | Most effective | Significant reduction | Yes |
| Artifact Subspace Reconstruction (ASR) | Improved | Significant reduction | No |
| Standard Preprocessing | Least effective | No significant reduction | No |
Researchers validating artifact removal methods often use the following experimental paradigms and metrics.
This protocol tests whether neural markers can be recovered after artifact removal in a dynamic setting [6] [2].
This methodology is used to determine the optimal settings for a cleaning algorithm like iCanClean [4] [5].
Table 4: Essential Materials for Advanced Mobile EEG Research
| Item / Technique | Function in Research |
|---|---|
| Dual-Layer EEG Cap | A cap with two layers of electrodes: scalp electrodes that record brain signal + noise, and mechanically coupled outward-facing electrodes that record only environmental and motion noise. This provides an ideal reference for noise-cancellation algorithms like iCanClean [5]. |
| High-Density EEG Systems (64+ channels) | Provides adequate spatial resolution (~1 cm) for source-level analysis and improves the performance of blind source separation techniques like ICA by providing more spatial information [5]. |
| iCanClean Algorithm | A dedicated preprocessing algorithm that uses canonical correlation analysis (CCA) to remove motion and other artifacts by leveraging reference noise signals, thereby improving subsequent ICA decomposition [4] [5]. |
| AMICA Algorithm | An adaptive mixture ICA algorithm considered one of the most powerful implementations for decomposing EEG data, including data from mobile paradigms [1]. |
| ICLabel | A convolutional neural network-based classifier that automatically labels independent components (e.g., as brain, muscle, eye, noise). It is a standard tool for post-ICA evaluation [4] [5]. |
What is the iCanClean R2 parameter? The R2 parameter is a correlation threshold between 0 and 1 that determines the cleaning aggressiveness of the iCanClean algorithm. It works by using Canonical Correlation Analysis (CCA) to identify and remove noisy subspaces from EEG data that are correlated with reference noise recordings [7] [3]. A lower R2 value means the algorithm will remove components that have even a weak correlation with noise, resulting in more aggressive cleaning. A higher R2 value (closer to 1) protects more of the signal by only removing components with a very strong correlation to noise, resulting in less aggressive cleaning [7].
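To make the role of the threshold concrete, here is a minimal NumPy sketch of CCA-based subspace removal on a single data window. It is not the iCanClean implementation: the function names `canonical_r2` and `clean_window` and the simple projection step are illustrative assumptions, and the code assumes full-rank inputs with more samples than channels.

```python
import numpy as np


def canonical_r2(eeg, noise):
    """Return squared canonical correlations and the EEG-side canonical variates.

    eeg:   array of shape (n_samples, n_eeg_channels)
    noise: array of shape (n_samples, n_noise_channels)
    """
    x = eeg - eeg.mean(axis=0)
    y = noise - noise.mean(axis=0)
    qx, _ = np.linalg.qr(x)  # orthonormal basis of the EEG channel space
    qy, _ = np.linalg.qr(y)  # orthonormal basis of the noise channel space
    # Singular values of qx.T @ qy are the canonical correlations.
    u, s, _ = np.linalg.svd(qx.T @ qy, full_matrices=False)
    r2 = np.clip(s, 0.0, 1.0) ** 2
    variates = qx @ u  # orthonormal time courses within the EEG space
    return r2, variates


def clean_window(eeg, noise, r2_threshold=0.65):
    """Project out EEG-side canonical variates whose R^2 exceeds the threshold.

    Returns the mean-centered, cleaned window and the number of removed
    components. A lower threshold removes more components (more aggressive).
    """
    r2, variates = canonical_r2(eeg, noise)
    bad = variates[:, r2 > r2_threshold]
    centered = eeg - eeg.mean(axis=0)
    cleaned = centered - bad @ (bad.T @ centered)
    return cleaned, int(bad.shape[1])
```

With a threshold of 1.0 nothing is removed, because no squared correlation can exceed 1. Lowering the threshold lets progressively weaker noise-correlated components through to the removal step.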
What are the optimal R2 settings for mobile EEG data? Based on a systematic parameter sweep, the optimal setting for cleaning mobile EEG data during walking was determined to be an R2 value of 0.65 combined with a 4-second window length [7]. This configuration significantly improved the quality of the Independent Component Analysis (ICA) decomposition.
How does the R2 parameter interact with the window length? The R2 threshold is applied within a specific time window that slides through the data. The window length determines the segment of data used to calculate the local correlation between cortical electrodes and noise electrodes [7]. Shorter windows (e.g., 1-2 seconds) can adapt to rapidly changing noise but may be less stable, while the recommended 4-second window provides a good balance for capturing the structure of motion artifacts during walking [7].
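As a rough illustration of how a window length translates into data segments, the sketch below computes sample boundaries for consecutive windows. The helper name `window_bounds` is hypothetical, and the use of non-overlapping windows is a simplifying assumption; the actual algorithm may step through the data differently.

```python
def window_bounds(n_samples, srate_hz, window_sec=4.0):
    """Return (start, stop) sample indices of consecutive, non-overlapping windows.

    Assumption: windows do not overlap and the final window may be shorter.
    """
    if window_sec == float("inf"):
        # "Infinite" window: the whole recording is treated as one segment.
        return [(0, n_samples)]
    win = int(round(window_sec * srate_hz))
    if win <= 0:
        raise ValueError("window_sec * srate_hz must be at least one sample")
    return [(start, min(start + win, n_samples))
            for start in range(0, n_samples, win)]


# Example: 10 s of data at 500 Hz split into 4-second windows.
print(window_bounds(5000, 500, 4.0))
# [(0, 2000), (2000, 4000), (4000, 5000)]
```

Each segment would then be cleaned with its own correlation estimate, which is why shorter windows track changing noise more closely but rest on fewer samples.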
What happens if I set the R2 value too low or too high? Setting the R2 value too low (e.g., below 0.5) risks over-cleaning, where genuine brain signal components may be mistakenly identified as noise and removed from the data [7] [3]. Conversely, setting the R2 value too high (e.g., above 0.8) leads to under-cleaning, leaving a significant amount of motion artifact in the data, which can hinder subsequent source analysis with ICA [7].
Can I use iCanClean effectively with a standard EEG system? Yes, while iCanClean is ideally used with a dual-layer EEG cap that has dedicated noise sensors, it can also generate "pseudo-reference" noise signals from standard EEG data. This is often done by applying a temporary notch filter to the raw EEG to isolate noise in a specific frequency band, such as below 3 Hz, which is then used for the CCA-based cleaning [3].
The following quantitative data on R2 parameter tuning is drawn from a foundational study that performed a comprehensive parameter sweep [7].
Table 1: Parameter Sweep Results for R2 and Window Length. This table shows how the number of "good" independent components (ICs), defined as being well-localized (Residual Variance < 15%) and having a high brain probability (ICLabel > 50%), changes with different settings.
| R2 Value | Window Length | Avg. Number of Good ICs | Performance Notes |
|---|---|---|---|
| 0.65 | 4 seconds | 13.2 | Optimal performance [7] |
| Varied (0.05-1.0) | 1 second | < 13.2 | Less stable cleaning [7] |
| Varied (0.05-1.0) | 2 seconds | < 13.2 | Less stable cleaning [7] |
| Varied (0.05-1.0) | Infinite | < 13.2 | Less adaptive to non-stationary noise [7] |
| Baseline (No cleaning) | N/A | 8.4 | Performance before iCanClean processing [7] |
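The "good" IC criterion used in this table can be written as a small counting function. This is a sketch: the function name `count_good_ics` and the input layout (one `(residual_variance, brain_probability)` pair per component, both as fractions) are assumptions for illustration, not part of any toolbox.

```python
def count_good_ics(components, max_rv=0.15, min_brain_prob=0.50):
    """Count components with residual variance < 15% and ICLabel brain probability > 50%.

    components: iterable of (residual_variance, brain_probability) pairs,
                both expressed as fractions between 0 and 1.
    """
    return sum(
        1
        for rv, brain_prob in components
        if rv < max_rv and brain_prob > min_brain_prob
    )


# Example with made-up component values (not data from any study):
example = [(0.05, 0.90), (0.20, 0.95), (0.10, 0.40), (0.14, 0.51)]
print(count_good_ics(example))  # 2
```

Only the first and last example components pass both tests; the second is poorly localized and the third is unlikely to be brain activity.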
Table 2: Performance with a Reduced Number of Noise Channels This table demonstrates that iCanClean remains effective even when the number of available noise channels is reduced, which is relevant for systems with fewer reference sensors. The data was obtained using the optimal R2 value of 0.65 and a 4-second window [7].
| Number of Noise Channels | Avg. Number of Good ICs |
|---|---|
| 120 (Full Set) | 13.2 |
| 64 | 12.7 |
| 32 | 12.2 |
| 16 | 12.0 |
Experimental Methodology from Cited Studies
The primary study that established the optimal R2 value involved 45 participants across three groups: young adults, high-functioning older adults, and low-functioning older adults [7]. The key experimental steps were:
A subsequent 2025 study validated these findings in a running paradigm, confirming that iCanClean (with both dual-layer and pseudo-reference signals) improved ICA dipolarity and enabled the recovery of expected event-related potential components, outperforming other common methods like Artifact Subspace Reconstruction (ASR) [3].
Table 3: Key Materials and Software for iCanClean Research
| Item | Function / Description |
|---|---|
| Dual-Layer EEG Cap | A specialized cap where scalp electrodes are mechanically coupled with outward-facing noise electrodes. The noise electrodes record motion artifacts without brain signals, providing an ideal reference for iCanClean [7] [3]. |
| High-Density EEG System | An EEG system with 64 or more channels, providing the spatial resolution needed for effective source separation using ICA, which is crucial for validating cleaning outcomes [7]. |
| MATLAB | The primary computational environment used for implementing iCanClean and associated data processing scripts in the cited studies [7]. |
| EEGLAB | An interactive MATLAB toolbox used for processing, analyzing, and visualizing EEG data. It is essential for performing ICA, dipole fitting, and using the ICLabel classifier [7] [3]. |
| ICLabel | An EEGLAB plugin that uses a trained convolutional neural network to automatically classify independent components as brain, muscle, eye, heart, line noise, or other. It is a key metric for evaluating cleaning success [7] [3]. |
| DIPFIT (EEGLAB Plugin) | A toolbox within EEGLAB used to localize the neural sources of independent components by fitting an equivalent dipole model. A low residual variance (< 15%) is a marker of a high-quality, brain-like component [7]. |
Q1: What is the "window length" parameter in the iCanClean algorithm? The window length is a key user-selectable parameter in the iCanClean algorithm that determines the duration of the data segment used to calculate local correlations between cortical EEG electrodes and reference noise electrodes. It controls the timescale over which motion artifacts are identified and removed [7] [5]. Shorter windows (e.g., 1-2 seconds) can capture rapidly changing noise, while longer windows provide a more global noise estimate.
Q2: How does window length interact with the R² cleaning aggressiveness setting? Window length and the R² threshold work together to determine cleaning performance. The R² threshold (ranging from 0 to 1) defines the correlation level above which data subspaces are considered noise and removed, with lower values being more aggressive [7]. The optimal window length provides the temporal framework for calculating these correlations. Research has found that a 4-second window paired with an R² value of 0.65 provides optimal results for gait-related motion artifacts [7].
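Q3: What happens if I choose a window length that is too short or too long? An improperly chosen window length can reduce cleaning efficacy [7]. In the parameter sweep, short windows (1-2 seconds) gave less stable cleaning, while an infinite window (the entire recording) was less adaptive to non-stationary noise; both yielded fewer "good" components than the 4-second window.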
Q4: What is the empirically determined optimal window length for mobile EEG during walking? A comprehensive parameter sweep demonstrated that a 4-second window was optimal for cleaning high-density mobile EEG data collected during walking on various terrains [7]. This timescale effectively captured the noise structure associated with gait and other whole-body movements.
Q5: Can I use fewer noise channels than the 120 used in the original study? Yes, iCanClean maintains good performance even with reduced noise channels. Testing with 64, 32, and 16 noise channels showed only a gradual decline in the number of "good" independent components identified after cleaning [7]. This makes the method applicable to systems with fewer available reference channels.
Description: After running iCanClean, the subsequent Independent Component Analysis (ICA) yields few brain-like components, as determined by dipole localization and ICLabel classification.
Potential Causes and Solutions:
Cause: Overly Conservative Cleaning (R² too high)
Cause: Suboptimal Window Length
Cause: Insufficient Number of Noise Reference Channels
Description: The cleaned EEG data appears over-processed, with attenuated event-related potentials or a loss of high-frequency brain activity.
Potential Causes and Solutions:
Cause: Overly Aggressive Cleaning (R² too low)
Cause: Mismatch Between Noise and EEG Channels
This protocol is derived from the methodology used to establish optimal iCanClean settings [7].
1. Objective: To systematically determine the optimal window length and R² threshold for cleaning mobile EEG data in a specific experimental context.
2. Materials and Setup:
3. Procedure:
4. Outcome Measures:
Table 1: Performance of iCanClean at Optimal vs. Baseline Settings [7]
| Condition | Window Length | R² Value | Average Number of "Good" ICs | Performance Change |
|---|---|---|---|---|
| Basic Preprocessing (Baseline) | Not Applicable | Not Applied | 8.4 | Baseline |
| iCanClean (Optimal) | 4 seconds | 0.65 | 13.2 | +57% |
| iCanClean (Reduced Channels) | | | | |
| ... with 64 Noise Channels | 4 seconds | 0.65 | 12.7 | +51% |
| ... with 32 Noise Channels | 4 seconds | 0.65 | 12.2 | +45% |
| ... with 16 Noise Channels | 4 seconds | 0.65 | 12.0 | +43% |
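The "Performance Change" column is a relative change against the 8.4-component baseline. The short sketch below reproduces that arithmetic from the values in the table; the helper name `percent_change` is an illustrative choice.

```python
BASELINE_GOOD_ICS = 8.4  # average "good" ICs with basic preprocessing only


def percent_change(value, baseline=BASELINE_GOOD_ICS):
    """Relative change versus baseline, rounded to the nearest whole percent."""
    return round((value - baseline) / baseline * 100)


for label, good_ics in [("120 channels", 13.2), ("64 channels", 12.7),
                        ("32 channels", 12.2), ("16 channels", 12.0)]:
    print(f"{label}: +{percent_change(good_ics)}%")
# 120 channels: +57%
# 64 channels: +51%
# 32 channels: +45%
# 16 channels: +43%
```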
Table 2: Key Research Reagents and Solutions for iCanClean Protocol [7] [5]
| Item | Function / Relevance in the Protocol |
|---|---|
| Dual-Layer EEG Cap | Provides mechanically coupled noise reference electrodes essential for the iCanClean algorithm. The original study used a 120+120 electrode configuration [7]. |
| iCanClean Algorithm | The core cleaning algorithm that uses Canonical Correlation Analysis (CCA) to remove motion artifacts based on correlations with reference noise signals [7]. |
| AMICA (Adaptive Mixture ICA) | An implementation of Independent Component Analysis used for source separation after cleaning. It was identified as a high-performing ICA algorithm [7]. |
| ICLabel Classifier | A convolutional neural network-based tool for automatically classifying independent components into categories like brain, muscle, eye, and noise [7]. |
| Dipole Fitting Tool (e.g., DIPFIT) | Used to localize the source of an independent component and calculate its Residual Variance (RV), a measure of how well its topography is explained by a single dipole [7]. |
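For readers unfamiliar with residual variance, the sketch below shows the usual definition: the fraction of a component's scalp-map energy that the dipole model fails to explain. The function name is hypothetical and the code assumes the measured and modeled maps are given as equal-length sequences of channel values; toolbox implementations may differ in detail.

```python
def residual_variance(measured_map, modeled_map):
    """Fraction of the measured scalp map left unexplained by the dipole model.

    RV = sum((measured - modeled)^2) / sum(measured^2)
    A value below 0.15 corresponds to the "< 15%" criterion.
    """
    if len(measured_map) != len(modeled_map):
        raise ValueError("maps must have the same number of channels")
    unexplained = sum((m - p) ** 2 for m, p in zip(measured_map, modeled_map))
    total = sum(m ** 2 for m in measured_map)
    if total == 0:
        raise ValueError("measured map has zero energy")
    return unexplained / total
```

A perfect fit gives 0, and a model that predicts zero everywhere gives 1.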
Diagram Title: iCanClean Parameter Optimization and Evaluation Workflow
Diagram Title: Logic for Selecting an Initial Window Length
Q1: What is dual-layer EEG and how does its hardware setup function? A dual-layer EEG system employs two layers of electrodes: a scalp layer that records a mixture of brain signals and motion artifacts, and a noise layer with electrically isolated electrodes that record primarily motion and non-biological artifacts [8]. These two layers are mechanically joined using 3D-printed couplers, and their wires are secured together with tape, ensuring both sets of cables experience nearly identical motion, which is a primary source of artifact [8] [9].
Q2: What are the critical steps for setting up a dual-layer EEG system for a mobile experiment?
Q3: Why are my noise channels not correlating well with motion artifacts in the scalp data? This is often due to improper mechanical coupling. Ensure that the scalp and its corresponding noise electrode are firmly joined with a 3D-printed coupler and that their wires are taped together along their entire length. If the cables move independently, the noise channel will not accurately capture the artifact profile affecting the scalp channel [8].
Problem: Poor quality Independent Components (ICs) after ICA decomposition.
Problem: Persistent motion artifacts despite using a dual-layer system.
Problem: Muscle artifacts (EMG) contaminating the signal.
The following table summarizes key quantitative findings from research using dual-layer EEG and the iCanClean algorithm, which are critical for informing parameter tuning in mobile EEG studies.
| Study Focus | Key Metric | Performance Before Processing | Performance After Processing | Recommended Parameters / Conditions |
|---|---|---|---|---|
| iCanClean Parameter Sweep (Walking) [4] | Number of "good" ICA components | 8.4 components | 13.2 components (+57%) | Window Length: 4 seconds; R² Aggressiveness: 0.65 |
| iCanClean on Phantom Head (All Artifacts) [11] | Data Quality Score (correlation with ground truth) | 15.7% | 55.9% | iCanClean outperformed ASR, Auto-CCA, and Adaptive Filtering. |
| Noise Channel Reduction Test [4] | Number of "good" ICA components | - | 12.7 (64 channels); 12.2 (32 channels); 12.0 (16 channels) | Performance maintained even with a reduced set of noise channels. |
This protocol, adapted from Studnicki et al. (2022), provides a robust methodology for testing dual-layer EEG hardware and processing pipelines in a high-motion environment [8].
1. Objective: To characterize and remove motion artifacts in EEG data during a discrete, responsive, whole-body task (table tennis) and to identify optimal processing strategies.
2. Participant Preparation:
3. Data Acquisition & Synchronization:
4. Experimental Tasks:
5. Data Processing & Analysis:
| Item | Function / Explanation |
|---|---|
| Dual-Layer EEG Cap | A cap with mechanically coupled scalp and noise electrodes (e.g., 120+120 channels). The noise layer provides a dedicated reference for motion artifacts [8] [9]. |
| Portable Amplifiers | Lightweight, battery-powered amplifiers (e.g., LiveAmp) that enable mobile brain imaging in real-world settings [8]. |
| Conductive Fabric | Acts as an artificial skin circuit to bridge noise electrodes, completing the electrical pathway for artifact recording [8]. |
| Inertial Measurement Units (IMUs) | Placed on the participant, equipment, and environment to synchronize motion kinematics with brain data and mark behavioral events [8] [9]. |
| iCanClean Algorithm | A cleaning algorithm that uses canonical correlation analysis on scalp and noise electrodes to find and reject artifact subspaces, improving subsequent ICA [4] [11]. |
| Neck EMG Electrodes | Repurposed EEG electrodes placed on neck muscles to provide a reference signal for myogenic artifacts, which can be used by cleaning algorithms like iCanClean [8] [11]. |
| 3D-Printed Cable Couplers & Cases | Custom components to mechanically join scalp/noise electrodes and securely house amplifiers, which are critical for system integrity and portability [8]. |
iCanClean Artifact Removal Flow
Experimental Workflow for Validation
Q1: What are the established starting parameters for cleaning mobile EEG data collected during walking? Based on a comprehensive parameter sweep using high-density, dual-layer EEG data from participants walking on a treadmill, the optimal parameters for the iCanClean algorithm are an R2 value of 0.65 and a window length of 4 seconds [7]. These settings were determined to maximize the number of "good" independent components recovered after ICA decomposition.
Q2: Why is an R2 value of 0.65 recommended, and what happens if I use a more or less aggressive value? The R2 threshold controls the cleaning aggressiveness. A higher R2 value (closer to 1) results in less cleaning, while a lower value (closer to 0) results in more aggressive cleaning [7]. The parameter sweep found that an R2 of 0.65 optimally balances the removal of motion artifacts with the preservation of underlying brain signals. Straying significantly from this value may result in either insufficient cleaning (leaving too much noise) or over-cleaning (removing brain activity).
Q3: Can I use these parameters with a standard EEG system, or do I need a special cap? The original validation was performed using a dual-layer EEG cap with 120 scalp electrodes and 120 dedicated noise electrodes [7]. However, the study also demonstrated that good performance can be maintained with a reduced set of noise channels. The parameters are effective with 64, 32, or even 16 noise channels, though the number of high-quality brain components identified may decrease slightly as noise channels are reduced [7].
Q4: How much improvement in data quality can I expect using these parameters? In the validation study, using the optimal parameters (R2=0.65, 4-s window) improved the average number of "good" independent components (ICs) from 8.4 to 13.2, an increase of 57% [7]. "Good" components were defined as those well-localized by a dipole model (residual variance < 15%) and with a high probability of being brain activity (ICLabel > 50%).
Q5: Are these parameters specific to certain populations? The study tested these parameters across three different participant groups: young adults, high-functioning older adults, and low-functioning older adults [7]. The optimal parameters were consistent across these groups, suggesting they are a robust starting point for neurotypical adult populations.
| Problem | Possible Cause | Solution |
|---|---|---|
| Poor ICA decomposition after cleaning. | Over-cleaning; R2 value is too low, removing brain signals along with noise. | Increase the R2 value (e.g., to 0.7 or 0.75) to make the algorithm less aggressive [7]. |
| Residual motion artifacts are still visible in the data. | Under-cleaning; R2 value is too high, leaving too much noise in the data. | Decrease the R2 value (e.g., to 0.55 or 0.6) to increase cleaning aggressiveness [7]. |
| Cleaning performance is lower than expected with a reduced number of noise channels. | The algorithm has insufficient reference information to isolate noise subspaces effectively. | Ensure noise channels are evenly spaced around the scalp. Use the loc_subsets function in EEGLAB to select the most geometrically representative channels [7]. |
| The algorithm struggles with very high-frequency noise (e.g., muscle artifacts). | The 4-second window may be too long to capture non-stationary, high-frequency bursts. | Shorten the window length (e.g., to 2 seconds) to better capture and remove local, high-frequency artifacts [7]. |
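The first two rows of the table amount to a simple adjustment rule, sketched below. The function name, the symptom labels, and the 0.05 step size are assumptions chosen to match the example values in the table; they are not settings prescribed by the cited study.

```python
def suggest_r2(current_r2, symptom, step=0.05):
    """Nudge the R^2 threshold one step in the direction the table suggests.

    symptom: "over_cleaned"  -> raise R^2 (less aggressive cleaning)
             "under_cleaned" -> lower R^2 (more aggressive cleaning)
    The result is kept within the valid range [0, 1].
    """
    if symptom == "over_cleaned":
        candidate = current_r2 + step
    elif symptom == "under_cleaned":
        candidate = current_r2 - step
    else:
        raise ValueError("symptom must be 'over_cleaned' or 'under_cleaned'")
    return round(min(1.0, max(0.0, candidate)), 2)


print(suggest_r2(0.65, "over_cleaned"))   # 0.7
print(suggest_r2(0.65, "under_cleaned"))  # 0.6
```

After each adjustment, re-run ICA and re-check the decomposition before moving further from the 0.65 starting point.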
Methodology for Parameter Optimization
The following workflow was used to establish the optimal R2 and window length parameters [7]:
Quantitative Performance of iCanClean Parameters
Table 1: Improvement in ICA Decomposition Quality with Optimal iCanClean Settings [7]
| Condition | Average Number of "Good" ICs | Change from Baseline |
|---|---|---|
| Before iCanClean (Baseline) | 8.4 | - |
| After iCanClean (R2=0.65, 4-s window) | 13.2 | +57% |
Table 2: Performance with a Reduced Number of Noise Channels (using R2=0.65, 4-s window) [7]
| Number of Noise Channels | Average Number of "Good" ICs |
|---|---|
| 120 (Full Set) | 13.2 |
| 64 | 12.7 |
| 32 | 12.2 |
| 16 | 12.0 |
Table 3: Key Materials and Software for iCanClean Experiments
| Item | Function / Description | Role in the Protocol |
|---|---|---|
| Dual-Layer EEG Cap | A specialized cap with scalp electrodes and mechanically coupled, outward-facing noise electrodes [7]. | Provides concurrent recordings of brain signals (EEG channels) and reference noise (noise channels) essential for the iCanClean algorithm. |
| High-Density EEG System | An EEG system capable of recording from 64+ channels, often 120+ channels in validation studies [7]. | Ensures adequate spatial sampling for effective source separation using ICA after cleaning. |
| iCanClean EEGLAB Plugin | The software implementation of the iCanClean algorithm, available as a plugin for the EEGLAB environment [12]. | Performs the core cleaning function using Canonical Correlation Analysis (CCA) to remove artifact subspaces. |
| Independent Component Analysis (ICA) | A blind source separation algorithm (e.g., AMICA, Infomax) to decompose cleaned EEG data into independent components [7]. | Used after iCanClean processing to isolate brain and non-brain sources for analysis. |
| ICLabel | A classifier that automatically labels independent components based on their type (e.g., brain, muscle, eye) [7]. | Provides a quantitative metric (brain probability) for identifying "good" brain components post-ICA. |
| Dipolar Source Localization | A method for fitting an equivalent current dipole to an independent component's scalp topography [7]. | Provides a quantitative metric (Residual Variance) for identifying well-localized, "good" brain components. |
Parameter Decision Workflow
The following diagram outlines the logic for adjusting parameters based on your cleaning goals and data quality:
What is a parameter sweep and why is it critical for iCanClean research? A parameter sweep is a systematic research approach where multiple parameters of an algorithm are varied across a defined range of values to determine the optimal combination for a specific goal [13]. For iCanClean, a novel algorithm for cleaning motion artifacts from mobile EEG data, conducting a parameter sweep is essential because its performance is highly dependent on two key user-defined parameters: the R² threshold (cleaning aggressiveness) and the window length (the segment of data used for local correlation calculations) [5]. Empirical optimization ensures the algorithm removes noise without accidentally degrading the underlying brain signals of interest [14] [5].
What are the optimal parameter values for iCanClean when processing walking data? Research involving high-density EEG recorded during walking has identified an optimal window length of 4 seconds and an optimal R² threshold of 0.65 [5] [2]. This combination significantly improved the quality of the subsequent independent component analysis (ICA), increasing the average number of "good" brain components extracted from the data by 57% [5].
Can I use iCanClean if I don't have a dual-layer EEG system with dedicated noise sensors? Yes, iCanClean can still be implemented using "pseudo-reference" noise signals derived from the raw EEG data itself [2]. This is typically done by applying a temporary notch filter to the EEG to isolate noise within a specific frequency band (e.g., below 3 Hz for motion artifacts), which then serves as the reference for the canonical correlation analysis (CCA) [2].
Problem: Overcleaning: Suspected Loss of Brain Signal After iCanClean
Problem: Ineffective Cleaning: Motion Artifacts Persist After Processing
Problem: Inconsistent Results Across Participants or Experimental Conditions
This protocol is adapted from published research that successfully optimized iCanClean for mobile EEG data [5].
1. Define the Parameter Space Create a table of the parameters and the values you will test.
| Parameter | Description | Values to Test |
|---|---|---|
| R² Threshold | Controls cleaning aggressiveness. Lower values remove more components. | Test a range from 0.05 to 1.00 in increments of 0.05 [5]. |
| Window Length | Duration of the data segment used for local correlation analysis. | Test 1, 2, and 4 seconds, and potentially the entire recording ("infinite") [5]. |
2. Select a Sweep Strategy For this type of discrete parameter optimization, an Exhaustive Sweep is the most straightforward strategy, as it evaluates every possible combination of the listed R² and window length values [13]. This ensures you do not miss the optimal configuration.
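An exhaustive sweep over the parameter space described in step 1 can be enumerated with the standard library. This is a sketch: the variable and function names are illustrative, and representing the "infinite" window as `math.inf` is an assumption.

```python
import itertools
import math

# R^2 thresholds from 0.05 to 1.00 in increments of 0.05 (20 values).
R2_VALUES = [round(0.05 * i, 2) for i in range(1, 21)]

# Window lengths in seconds; math.inf stands in for the whole recording.
WINDOW_LENGTHS_SEC = [1.0, 2.0, 4.0, math.inf]


def parameter_grid(r2_values=R2_VALUES, window_lengths=WINDOW_LENGTHS_SEC):
    """Return every (r2, window_length) combination to evaluate."""
    return list(itertools.product(r2_values, window_lengths))


grid = parameter_grid()
print(len(grid))  # 80
```

With 20 thresholds and 4 window lengths the sweep has 80 configurations per dataset, which is why parallel processing is worth considering.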
3. Prepare the EEG Data
4. Execute the Parameter Sweep
5. Evaluate the Results For each cleaned dataset, run an ICA decomposition and calculate the following quality metrics:
6. Identify the Optimal Configuration The optimal parameter set is the one that maximizes the number of "good" brain components while also successfully reducing gait-related power and preserving expected ERPs [5] [2].
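Selecting the best configuration can be sketched as a search over scored parameter pairs. In this sketch `evaluate` is a hypothetical user-supplied callback that would clean the data, run ICA, and return a single score such as the number of "good" components; folding gait-power reduction and ERP preservation into that score is left to the user.

```python
def run_sweep(grid, evaluate):
    """Score every (r2, window_length) pair and return the best one.

    grid:     iterable of (r2, window_length) tuples
    evaluate: hypothetical callback, evaluate(r2, window_length) -> score,
              where a higher score means a better decomposition
    Returns (best_params, scores), with scores keyed by parameter pair.
    """
    scores = {params: evaluate(*params) for params in grid}
    if not scores:
        raise ValueError("grid is empty")
    best_params = max(scores, key=scores.get)
    return best_params, scores
```

Keeping the full `scores` dictionary, rather than only the winner, makes it possible to check how flat the optimum is before committing to one setting.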
The diagram below outlines the logical flow for designing and executing a parameter sweep for iCanClean.
The following table details key materials and computational tools required for implementing the iCanClean parameter sweep.
| Item | Function in the Experiment | Specification / Note |
|---|---|---|
| High-Density EEG System | Records scalp potentials containing mixtures of brain signal and noise. | Systems with 64+ channels are recommended for effective ICA [5]. |
| Dual-Layer EEG Cap | Provides dedicated noise reference signals. Outer-layer electrodes are mechanically coupled to scalp electrodes but record only noise [5] [2]. | Ideal for iCanClean. A 120+120 channel configuration was used in foundational studies [5]. |
| Pseudo-Reference Signals | An alternative noise reference when a dual-layer cap is not available. Created by filtering the raw EEG to isolate noise bands [2]. | For motion, a notch filter below 3 Hz can be used to create these signals [2]. |
| Computing Environment | Runs the parameter sweep and iCanClean processing. | MATLAB with EEGLAB is commonly used. Parallel processing is recommended to reduce computation time [5] [15]. |
| ICA Algorithm | Decomposes cleaned EEG into independent sources for quality assessment. | AMICA is recommended for high-quality decompositions in mobile settings [5]. |
| ICLabel Classifier | Automatically classifies independent components as "brain", "muscle", "eye", etc. | A trained neural network used to quantify the number of "good" brain components post-cleaning [5]. |
How does the number of noise channels affect iCanClean's performance?
The number of reference noise channels directly impacts the cleaning efficacy of the iCanClean algorithm. However, the system is robust and shows good performance even with reduced noise channel sets. Research has demonstrated that when using an optimal r2 value of 0.65 and a 4-second window, iCanClean maintained strong performance as noise channels were reduced [5].
What is the minimum number of noise channels required?
While a dual-layer setup with 120 noise channels was used in the foundational research, subsequent testing showed that good performance could be maintained with as few as 16 noise channels [5]. The key is to adjust the r2 parameter to compensate for having fewer noise references.
Should I change the window length if I have fewer noise channels?
The primary research indicates that the 4-second window length is optimal across different numbers of noise channels [5]. Your focus should be on tuning the r2 value, as it is the parameter most sensitive to changes in the noise channel setup.
The following table summarizes the key experimental findings for different numbers of noise channels. The baseline performance before cleaning was 8.4 good components, established using a 120-channel dual-layer EEG setup [5].
Table 1: Performance of iCanClean with Different Noise Channel Setups
| Number of Noise Channels | Number of "Good" ICs Retained After Cleaning | Performance Relative to Baseline (8.4 Good ICs) |
|---|---|---|
| 64 | 12.7 | +51% improvement |
| 32 | 12.2 | +45% improvement |
| 16 | 12.0 | +43% improvement |
Note: The data above were obtained using the identified optimal parameters of a 4-second window length and an r2 value of 0.65 [5].
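The percentages in Table 1 follow directly from the good-component counts. A short check, using only the values reported above (the function name `relative_improvement` is ours):

```python
baseline_good_ics = 8.4  # good ICs before cleaning [5]
good_ics_by_noise_channels = {64: 12.7, 32: 12.2, 16: 12.0}


def relative_improvement(after, before):
    """Percent change of `after` relative to `before`."""
    return 100.0 * (after - before) / before


for n_channels, good_ics in good_ics_by_noise_channels.items():
    percent = round(relative_improvement(good_ics, baseline_good_ics))
    print(f"{n_channels} noise channels: +{percent}%")
```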
The r2 threshold is the most critical parameter to adjust when your noise channel setup changes. Here is a methodology for determining the optimal value for your specific system, based on established research practices [5].
1. Sweep the r2 parameter from a low value (e.g., 0.05) to 1.0 in increments of 0.05. Keep the window length fixed at 4 seconds [5].
2. The optimal r2 value for your setup is the one that yields the highest number of "good" ICs. Using an overly aggressive (low) r2 will remove brain activity, while a too-conservative (high) r2 will leave noise in the data.

Table 2: Essential Materials for iCanClean Research
| Item | Function in the Context of iCanClean |
|---|---|
| High-Density EEG System (64+ channels) | Records the mixture of brain signals and artifacts from the scalp. Essential for high-fidelity source separation with ICA [5]. |
| Dual-Layer EEG Cap or Separate Noise Sensors | Provides the reference noise recordings. The outer layer of electrodes is mechanically coupled to the scalp electrodes but records primarily non-brain noise, which is crucial for iCanClean's operation [5] [11]. |
| Portable EEG Amplifier | Enables data collection during mobile, whole-body movement tasks, which are the primary use-case for iCanClean [5]. |
| Electrical Phantom Head | A validation tool containing embedded "brain" sources. It allows for quantitative testing of iCanClean's performance with known ground-truth signals, free from biological variability [11]. |
The following diagram illustrates the logical process for adapting iCanClean's core parameters to your specific hardware setup, particularly the number of noise channels.
| Problem | Possible Cause | Solution |
|---|---|---|
| Too few "good" brain components after cleaning. | r2 value is too aggressive (too low), removing brain signals along with noise. | Increase the r2 value to be more conservative (e.g., try 0.7 or 0.75) and re-run the analysis. |
| Excessive noise remains in the data after cleaning. | r2 value is too conservative (too high), failing to remove enough noise. | Decrease the r2 value to be more aggressive (e.g., try 0.6 or 0.55). Also, verify the quality of your noise channel recordings. |
| Cleaning performance is poor with a low number of noise channels. | The algorithm lacks sufficient noise reference information. | If possible, increase the number of noise channels. If not, you may need to accept a more aggressive r2 setting, acknowledging a potential for minor loss of brain signal. |
| Inconsistent cleaning across the recording. | The default "infinite" window may not capture non-stationary noise well. | Ensure you are using a shorter, sliding window (e.g., 4 seconds) to adapt to changing noise conditions during movement [5]. |
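
To make the interplay of window length and r2 threshold in the table above concrete, the sketch below splits a recording into 4-second windows and flags each window whose squared correlation with a noise reference exceeds the threshold. This is a deliberately simplified single-channel stand-in: iCanClean itself applies canonical correlation analysis across many EEG and noise channels, and the function names here are hypothetical.

```python
import math


def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def flag_windows(signal, noise, fs, window_s=4.0, r2_threshold=0.65):
    """One boolean per non-overlapping window: True if r² exceeds the threshold."""
    step = int(window_s * fs)
    flags = []
    for start in range(0, len(signal) - step + 1, step):
        r2 = squared_correlation(signal[start:start + step],
                                 noise[start:start + step])
        flags.append(r2 > r2_threshold)
    return flags


# Toy data: the first 4 s window tracks the noise reference, the second does not
fs = 50
noise = [math.sin(2 * math.pi * 1.5 * t / fs) for t in range(8 * fs)]
other = [math.sin(2 * math.pi * 10 * t / fs + 0.3) for t in range(8 * fs)]
signal = noise[:4 * fs] + other[4 * fs:]
print(flag_windows(signal, noise, fs))
```

Lowering `r2_threshold` flags more windows (more aggressive cleaning); raising it flags fewer, which mirrors the direction of adjustment given in the table.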
Q1: What are the primary iCanClean parameters I need to tune for mobile EEG studies? The two primary user-selectable parameters for the iCanClean algorithm are the window length and the r² cleaning aggressiveness threshold [5]. The window length determines the segment of data used to calculate correlations between cortical and noise electrodes. The r² threshold (ranging from 0 to 1) determines which correlated components are removed, with a lower value resulting in more aggressive cleaning [4] [5].
Q2: I am analyzing walking data. What are the recommended starting parameters? For EEG data corrupted by walking motion artifacts, research has identified an optimal window length of 4 seconds and an r² value of 0.65 [4] [5]. This configuration improved the average number of "good" independent components (ICs), defined as well-localized dipoles with high brain probability, from 8.4 to 13.2, a 57% increase [4].
Q3: Can I use iCanClean effectively with a reduced number of dedicated noise sensors? Yes, performance can be maintained with a reduced set of noise channels. One study found that using 64, 32, and 16 noise channels still yielded 12.7, 12.2, and 12.0 good components, respectively, demonstrating robust performance even with fewer reference channels [4].
Q4: How does iCanClean compare to other common artifact removal methods? iCanClean has been shown to consistently outperform other real-time-capable methods like Artifact Subspace Reconstruction (ASR), Auto-CCA, and Adaptive Filtering, regardless of the type or number of artifacts present [14]. In a phantom head study with all artifact types simultaneously present, iCanClean improved the Data Quality Score from 15.7% to 55.9%, whereas ASR, Auto-CCA, and Adaptive Filtering only improved it to 27.6%, 27.2%, and 32.9%, respectively [14].
Q5: Is preprocessing necessary before applying iCanClean? Basic preprocessing is recommended. A typical pipeline includes high-pass filtering (e.g., 1 Hz cutoff) and average re-referencing of both EEG and noise channels separately [5]. A basic channel rejection step to remove large-amplitude channels (e.g., those with amplitudes greater than 3 times the median) is also commonly applied before running iCanClean [5].
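Two of the preprocessing steps named in Q5 can be sketched without any signal-processing library. The example below assumes that "amplitude" is measured as each channel's standard deviation (the source does not specify the measure) and omits the 1 Hz high-pass filter, which would normally be applied first with a dedicated filtering routine. The function names and toy values are illustrative.

```python
from statistics import median, pstdev


def reject_high_amplitude(channels, factor=3.0):
    """Keep channels whose standard deviation is at most `factor` x the median."""
    spreads = {name: pstdev(samples) for name, samples in channels.items()}
    limit = factor * median(spreads.values())
    return {name: samples for name, samples in channels.items()
            if spreads[name] <= limit}


def average_reference(channels):
    """Subtract the across-channel mean from every channel at each time point."""
    names = list(channels)
    n_samples = len(channels[names[0]])
    mean_trace = [sum(channels[n][t] for n in names) / len(names)
                  for t in range(n_samples)]
    return {n: [channels[n][t] - mean_trace[t] for t in range(n_samples)]
            for n in names}


# Toy EEG: three ordinary channels and one with a very large amplitude
eeg = {
    "Cz": [1.0, -1.0, 2.0, -2.0],
    "Pz": [0.5, -0.5, 1.5, -1.5],
    "Fz": [1.5, -1.5, 1.0, -1.0],
    "T7": [80.0, -90.0, 85.0, -95.0],
}
kept = reject_high_amplitude(eeg)
referenced = average_reference(kept)
print(sorted(kept))
```

In a dual-layer recording the same two functions would be applied once to the scalp channels and once to the noise channels, since the two sets are re-referenced separately [5].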
The following methodology outlines how key parameters for iCanClean were established, providing a template for validating parameters in new experimental scenarios.
1. Data Collection Setup
2. Data Processing and Parameter Sweep
The table below summarizes quantitative findings from published studies to guide parameter selection.
| Use Case Scenario | Optimal Window Length | Optimal R² Value | Key Performance Outcome | Source |
|---|---|---|---|---|
| General Walking | 4 seconds | 0.65 | Increased good ICs by 57% (from 8.4 to 13.2) | [4] [5] |
| Phantom Head (All Artifacts) | Not Specified | Not Specified | Data Quality Score improved from 15.7% to 55.9% | [14] |
| Item | Function in iCanClean Research |
|---|---|
| Dual-Layer EEG Cap | A cap with paired scalp and noise electrodes provides the reference noise recordings essential for the iCanClean algorithm to function [4] [5]. |
| High-Density EEG System | Systems with 64+ channels are recommended for mobile brain imaging to provide adequate spatial resolution for source localization after ICA [5]. |
| Motion Platform & Phantom Head | An electrically conductive phantom head with embedded sources provides known ground-truth brain signals, enabling quantitative validation of cleaning performance against motion, muscle, and other artifacts [14]. |
| MATLAB & EEGLAB | A standard software environment for implementing custom iCanClean scripts, performing parameter sweeps, and conducting ICA with toolboxes like ICLabel [5]. |
This diagram illustrates the logical workflow for determining the optimal iCanClean parameters for a mobile EEG experiment.
This diagram outlines the decision-making process for selecting and adjusting the key iCanClean parameters based on your data and research goals.
A critical challenge in mobile electroencephalography (EEG) research is balancing effective artifact removal with the preservation of neural signals. Over-cleaning, often resulting from inappropriate parameter selection in algorithms like iCanClean, can lead to the accidental removal of brain components, compromising data integrity and experimental outcomes [4] [14]. This guide provides troubleshooting and solutions for this common issue within the context of parameter tuning for iCanClean.
Q1: What are the primary indicators that I have over-cleaned my mobile EEG data using iCanClean? A significant drop in the number of brain components identified post-ICA is a key indicator. Brain components are typically characterized by a dipolar topography (Residual Variance < 15%) and a high brain probability (ICLabel > 50%) [4] [5]. If the count of such "good" components decreases substantially after cleaning, over-cleaning is likely. A reduction in the Data Quality Score, which measures the correlation between preserved signals and known brain sources, also suggests over-aggressive cleaning [14].
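The component criteria in Q1 translate into a simple count. The sketch below assumes residual variance and ICLabel brain probability are already available per component as fractions; the `Component` class and the `overcleaning_suspected` rule (no increase in good ICs after cleaning) are illustrative choices, not part of any toolbox.

```python
from dataclasses import dataclass


@dataclass
class Component:
    residual_variance: float  # dipole-fit residual variance as a fraction (0.15 = 15%)
    brain_probability: float  # ICLabel "Brain" probability, 0 to 1


def count_good(components, rv_max=0.15, brain_min=0.50):
    """Count components with RV < 15% and brain probability > 50%."""
    return sum(1 for c in components
               if c.residual_variance < rv_max and c.brain_probability > brain_min)


def overcleaning_suspected(before, after):
    """Flag over-cleaning when cleaning fails to raise the good-IC count."""
    return count_good(after) <= count_good(before)


# Toy values for illustration only
before = [Component(0.10, 0.80), Component(0.40, 0.90), Component(0.05, 0.30)]
after = [Component(0.10, 0.80), Component(0.12, 0.70), Component(0.05, 0.30)]
print(count_good(before), count_good(after), overcleaning_suspected(before, after))
```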
Q2: Which iCanClean parameters most directly influence cleaning aggressiveness and the risk of brain signal loss? The two most critical parameters are the R² correlation threshold and the window length [4] [5].
Q3: What are the empirically optimized iCanClean parameter settings to avoid over-cleaning? Research on mobile EEG data collected during walking recommends an R² threshold of 0.65 and a window length of 4 seconds as optimal starting points [4]. This combination significantly improved the number of "good" brain components from 8.4 to 13.2 (a 57% increase) without evidence of over-cleaning [4] [5].
Table 1: Diagnostic Metrics for Identifying Over-Cleaned Data
| Metric | Well-Cleaned Data | Over-Cleaned Data | Measurement Method |
|---|---|---|---|
| Number of "Good" ICs | Increased (>50% improvement reported) [4] | Decreased or unchanged | ICA decomposition followed by ICLabel and dipole fitting [4] |
| Data Quality Score | Significantly improved [14] | Worsened or not improved | Correlation between cleaned EEG channels and ground-truth brain sources [14] |
| Spectral Content | Preserved brain oscillatory patterns (e.g., alpha, beta, theta) [16] | Attenuated or absent brain oscillations | Power spectral density analysis [16] |
Table 2: Performance Comparison of EEG Cleaning Algorithms
| Cleaning Method | Data Quality Score (All Artifacts Condition) | Requires Clean Calibration Data? | Computational Efficiency |
|---|---|---|---|
| Uncleaned Data | 15.7% [14] | N/A | N/A |
| iCanClean (Optimal) | 55.9% [14] | No | High [14] [17] |
| Artifact Subspace Reconstruction (ASR) | 27.6% [14] | Yes [14] | Moderate [14] |
| Auto-CCA | 27.2% [14] | No | High [14] |
| Adaptive Filtering | 32.9% [14] | Requires reference noise signals [14] | High [14] |
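
The scores in Table 2 are easier to compare as gains over the uncleaned baseline. A short calculation using only the reported values:

```python
uncleaned_score = 15.7  # Data Quality Score before cleaning, in percent [14]
cleaned_scores = {
    "iCanClean": 55.9,
    "ASR": 27.6,
    "Auto-CCA": 27.2,
    "Adaptive Filtering": 32.9,
}

# Gain in percentage points over the uncleaned baseline
gains = {method: round(score - uncleaned_score, 1)
         for method, score in cleaned_scores.items()}
ranking = sorted(gains, key=gains.get, reverse=True)

for method in ranking:
    print(f"{method}: +{gains[method]} percentage points")
```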
The following diagram illustrates the systematic workflow for diagnosing over-cleaning and identifying optimal R² and window length parameters.
Aim: To quantitatively evaluate iCanClean parameters and prevent over-cleaning by comparing cleaned data to ground-truth brain signals or established component quality metrics [4] [14].
Materials:
Methodology:
Table 3: Essential Materials and Tools for iCanClean Research
| Item | Function / Description | Example Use in Protocol |
|---|---|---|
| Dual-Layer EEG Cap | A cap with inner (scalp) and outer (noise) electrode layers. Provides reference noise recordings mechanically coupled to the EEG sensors [5] [14]. | Provides the reference noise signals required for the iCanClean algorithm to separate motion artifacts from brain activity [5]. |
| iCanClean Algorithm | A novel cleaning algorithm that uses Canonical Correlation Analysis (CCA) and reference noise to remove artifact subspaces from EEG data [14] [17]. | The core method for removing motion, muscle, eye, and line-noise artifacts while preserving brain activity prior to ICA [4]. |
| ICLabel | A convolutional neural network for automatically classifying independent components from ICA [4] [5]. | Used to calculate the "brain probability" of each component, which is a key metric for identifying "good" components and diagnosing over-cleaning [4]. |
| Dipolar Source Localization | A method for fitting an equivalent current dipole to the scalp topography of an independent component [4] [5]. | Used to calculate the Residual Variance (RV), helping to identify components that are likely of brain origin (RV < 15%) [4]. |
| Artifact Subspace Reconstruction (ASR) | An alternative cleaning method based on principal component analysis, useful for comparative benchmarking [14]. | Serves as a performance benchmark when validating iCanClean's effectiveness and tuning parameters [14]. |
This is a classic symptom of under-cleaning, often caused by suboptimal parameter selection. The iCanClean algorithm has two primary parameters that directly impact cleaning aggressiveness: the r² correlation threshold and the window length.
Diagnosis: Residual motion artifacts typically indicate that your r² threshold is set too high (too conservative) or your window length is too short to capture the complete structure of the motion artifacts.
Recommended Solution: Based on systematic parameter sweeps, adjust your parameters as follows:
Table 1: Optimal iCanClean Parameters for Motion Artifact Removal
| Parameter | Suboptimal Setting (Causes Under-Cleaning) | Recommended Optimal Setting | Impact of Adjustment |
|---|---|---|---|
| r² Threshold | High (e.g., > 0.8) | 0.65 [4] [5] | Increases cleaning aggressiveness by removing more correlated noise subspaces. |
| Window Length | Short (e.g., 1s or 2s) | 4 seconds [4] [5] | Allows the algorithm to better capture the temporal structure of motion artifacts like those from walking. |
The recommended parameters are not theoretical; they are derived from controlled, empirical studies. The key validation experiments are summarized below.
Experimental Protocol 1: Validation on Human Mobile EEG Data
Experimental Protocol 2: Validation on a Phantom Head with Ground Truth
iCanClean has been shown to consistently outperform other real-time-capable methods, especially when multiple artifact types are present simultaneously. The following table summarizes a quantitative comparison from a phantom head study.
Table 2: Performance Comparison of Artifact Removal Methods on Phantom EEG Data ("Brain + All Artifacts" Condition)
| Cleaning Method | Data Quality Score After Cleaning | Key Requirement / Limitation |
|---|---|---|
| Uncleaned Data | 15.7% [11] [14] | Baseline for comparison. |
| iCanClean | 55.9% [11] [14] | Requires reference noise signals (e.g., from a dual-layer cap). |
| Adaptive Filtering | 32.9% [11] [14] | Assumes noise projects identically to EEG and noise sensors; can struggle with motion artifacts [11] [14]. |
| Artifact Subspace Reconstruction (ASR) | 27.6% [11] [14] | Requires clean calibration data [11] [14]. |
| Auto-CCA | 27.2% [11] [14] | Risks removing brain activity, which is also low-frequency/high-correlation [11] [14]. |
For the best performance, iCanClean is designed to work with reference noise recordings. The most effective setup is a dual-layer EEG cap, where outward-facing noise electrodes are mechanically coupled to the scalp electrodes to provide ideal noise references [7] [5]. However, the algorithm can still function with reduced noise channels. Research shows that good cleaning performance can be maintained with 64, 32, or even 16 properly spaced noise channels [4] [5].
The following diagram illustrates the step-by-step troubleshooting process for addressing residual motion artifacts in your data.
Parameter Optimization Workflow
Table 3: Key Materials and Software for iCanClean Experimentation
| Item | Function / Relevance | Example from Literature |
|---|---|---|
| Dual-Layer EEG Cap | Provides dedicated reference noise electrodes that are mechanically coupled to scalp electrodes to record artifact signals without brain activity. Essential for optimal performance [7] [5]. | 120 scalp electrodes + 120 outward-facing noise electrodes [7] [5]. |
| High-Density EEG System | Enables sufficient spatial sampling for effective source separation via ICA after cleaning with iCanClean. | Systems with 64+ channels; studies used 120-channel setups [4] [5]. |
| Mobile/Portable EEG Amplifier | Allows for data collection during whole-body movement, which is the primary scenario where motion artifacts are encountered. | Used for recordings during treadmill walking [4] [7]. |
| Signal Processing Software with EEGLAB | Provides the environment for implementing preprocessing, running iCanClean, and performing subsequent ICA and ICLabel analysis. | Custom MATLAB scripts with EEGLAB toolbox [7] [5]. |
| ICLabel Classifier | A machine learning-based tool for automatically classifying independent components (ICs) after ICA. Used to quantify "brain" vs. "non-brain" components [4] [5]. | Components with >50% brain probability classified as "good" [4] [5]. |
Problem Statement: After running ICA, my group-level Independent Component (IC) scalp topographies are inconsistent, and the Event-Related Potential (ERP) amplitudes appear weakened, suggesting polarity cancellation issues within IC clusters.
Background: A fundamental property of ICA is polarity indeterminacy [18]. This means that for any given independent component, the sign of the scalp topography and its time course (e.g., an ERP) can be arbitrarily flipped without affecting the decomposition's validity. When performing group-level analysis and clustering ICs from multiple participants, this can lead to components within the same cluster having mixed polarities. When averaged, these opposing polarities cancel each other out, reducing the amplitude and sensitivity of the resulting ERP and potentially obscuring true effects [18].
Solution: Implement a polarity alignment method during the group-level clustering stage. The default method in toolboxes like EEGLAB is often Iterative Correlation Maximization, which aligns polarities based on the scalp topographies. However, for studies prioritizing ERP fidelity, the Covariance Maximization method is recommended.
Methodology:
Decision Flowchart: The following diagram illustrates the logical process for diagnosing and resolving polarity-related issues in your ICA results.
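The polarity cancellation described in the Background above, and the effect of aligning signs before averaging, can be demonstrated with a toy cluster. This is a single-pass simplification that flips any ERP whose covariance with the first cluster member is negative; it is not EEGLAB's implementation nor the full Covariance Maximization procedure of [18], and all names and values are illustrative.

```python
def covariance(a, b):
    """Covariance between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def align_polarity(erps):
    """Flip any ERP whose covariance with the first ERP is negative."""
    reference = erps[0]
    return [erp if covariance(erp, reference) >= 0 else [-v for v in erp]
            for erp in erps]


def grand_average(erps):
    """Sample-by-sample mean across ERPs."""
    return [sum(values) / len(values) for values in zip(*erps)]


cluster_erps = [
    [0.0, 1.0, 2.0, 1.0, 0.0],
    [0.0, -1.0, -2.0, -1.0, 0.0],  # same waveform with flipped polarity
    [0.0, 1.2, 1.8, 0.9, 0.0],
]
raw_peak = max(grand_average(cluster_erps))
aligned_peak = max(grand_average(align_polarity(cluster_erps)))
print(raw_peak, aligned_peak)
```

In the unaligned average the flipped member cancels one of the others, so the peak shrinks to roughly a third of its aligned value.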
Problem Statement: When tuning the window size that iCanClean uses together with its R² threshold, I face a trade-off. A smaller window improves the preservation of ERP fidelity but may leave more non-dipolar artifacts. A larger window enhances dipolarity (a key marker of a neural source) but risks distorting or smearing genuine neural signals in the ERP.
Background: The R² threshold in iCanClean sets how strongly a component must correlate with the noise reference before it is removed, and the window size sets the data segment over which that correlation is calculated [4] [5]. Both settings influence the dipolarity of the independent components obtained afterwards; a highly dipolar component has a scalp topography consistent with a single neural generator within the brain [19]. Your research must decide which outcome to prioritize: the quality of the source separation (dipolarity) or the fidelity of the time-domain signal (ERP).
Solution: The optimal window size is experiment-dependent. There is no universal value. The choice should be guided by the primary research question and the subsequent analysis plans.
Methodology for Parameter Optimization:
Quantitative Comparison Table: The table below summarizes the trade-offs and provides a framework for evaluating different window size parameters.
| Window Size | IC Dipolarity | ERP Peak Amplitude (e.g., P3, µV) | ERP Peak Latency (e.g., P3, ms) | Residual Noise Level | Recommended Use Case |
|---|---|---|---|---|---|
| Small (e.g., 100 ms) | Lower | Higher Fidelity | More Accurate | Higher | Primary: ERP Analysis. Use when precise timing/amplitude of cognitive events is critical. |
| Medium (e.g., 250 ms) | Moderate | Moderate | Slightly Shifted | Moderate | Balanced Approach. Suitable for studies investigating both sources and ERPs. |
| Large (e.g., 500 ms) | Higher | Reduced/Smeared | Delayed/Blurred | Lower | Primary: Source Analysis. Use when localization and dipolarity are the main goals. |
FAQ 1: What is the single most important factor for ensuring reliable ICA results in individual differences research?
Answer: The psychometric reliability of your EEG measures is the most critical factor. High internal consistency and test-retest reliability are prerequisites for any analysis seeking to differentiate individuals [19]. Regardless of how well you optimize ICA parameters, if the underlying EEG measures (e.g., power, ERP amplitudes) are not stable and reliable over time, your study will lack the power to detect meaningful correlations with behavior or traits. It is essential to consult reliability profiles for your specific EEG measures (e.g., power, ERPs, functional connectivity) and employ denoising techniques and data quality metrics to improve the reliability of your individual differences analyses [19].
FAQ 2: My primary goal is to enhance the signal-to-noise ratio of my ERPs for drug development studies. Should I prioritize ICA dipolarity?
Answer: Not primarily. In the context of drug development, where detecting a subtle change in a cognitive ERP (like the P3) between a drug and placebo group is often the goal, you should prioritize ERP Fidelity [20]. While ensuring a baseline level of data cleanliness via dipolarity is good practice, an over-emphasis on maximizing R2 values with large window sizes can smear and distort the temporal dynamics of the ERP. This distortion can mask the very drug effects you are trying to measure. Your optimization strategy should favor parameters that best preserve the timing and amplitude of your ERP components of interest.
FAQ 3: How can I formally optimize multiple ICA and post-processing parameters at once?
Answer: For complex optimization problems involving multiple interacting parameters, we recommend using a structured Design of Experiments (DoE) approach [21]. The general workflow is:
| Item Name | Function / Explanation |
|---|---|
| Human Neocortical Neurosolver (HNN) | A biophysical modeling software used to simulate the cellular and circuit-level mechanisms that generate scalp-recorded EEG/ERP signals. It helps generate testable predictions for interpreting ICA components and ERPs [20]. |
| EEGLAB Toolbox | A foundational open-source MATLAB toolbox for processing EEG data. It provides the core environment and functions for performing ICA, component clustering, and calculating ERPs [20]. |
| iCanClean Plugin | An EEGLAB plugin designed to denoise EEG data using canonical correlation analysis (CCA) with reference noise signals. Its key parameters, the R² threshold and the window size, are central to the trade-off between dipolarity and ERP fidelity. |
| Design of Experiments (DoE) Software | Software that assists in designing efficient parameter optimization studies. It helps systematically vary multiple parameters (like window size and threshold) to find the optimal combination for a desired outcome [21]. |
| Covariance Maximization Script | A custom or toolbox-integrated script for resolving polarity indeterminacy in group-level ICA analyses. It is essential for maximizing the sensitivity of IC-clustered ERPs [18]. |
Q1: What is ICLabel, and why is it a superior tool for classifying Independent Components (ICs) in EEG research? ICLabel is an automated, publicly available IC classifier for EEG data that assigns components to source categories like Brain, Muscle, Eye, and Line Noise [22]. It improves upon prior methods by offering enhanced computational efficiency and label accuracy, performing comparably to or better than other public classifiers while computing labels ten times faster [22]. This speed and accuracy are crucial for parameter validation in methods like iCanClean, where rapid, objective assessment of signal sources is needed.
Q2: What does Residual Variance (RV) measure, and how should I interpret its value for a component? Residual Variance (RV) is a measure of how well an IC's scalp topography can be fit by an equivalent current dipole (ECD) model [23]. A lower RV indicates a more "dipolar" component, which is often characteristic of a true brain source. The ICLabel tutorial notes that while a two-dipole fit will almost always yield a lower RV than a one-dipole fit, most components are not perfectly dipolar, and of those that are, the majority require only one dipole [23]. Therefore, RV is one piece of evidence to be weighed alongside other features like the power spectrum and time series.
Q3: My ICA decomposition seems poor, with many non-brain components having high power. What preprocessing steps are critical? A key preprocessing step for a stable ICA decomposition is high-pass filtering the data. It is recommended to apply a high-pass filter with a 1-Hz pass-band edge (equivalent to a 0.5 Hz cutoff) before running ICA [24]. This removes slow baseline drift, which can bias ICA toward high-amplitude, low-frequency artifacts, allowing the algorithm to better isolate brain signals in the 3-13 Hz range [24].
Q4: How can ICLabel and RV be used together to objectively validate cleaning parameters? ICLabel provides a probabilistic classification of a component's origin. RV quantitatively measures its fit to a generative brain source model. Used in tandem, they offer a multi-faceted validation metric. For instance, when tuning a parameter like the window size in iCanClean, you can objectively track its effect on the resulting ICs. A successful parameter set should yield a higher proportion of components labeled "Brain" by ICLabel and, for those brain components, a lower average RV, indicating more physiologically plausible sources.
| Observed Problem | Potential Causes | Solutions and Validation Checks |
|---|---|---|
| High proportion of components classified as Muscle, Eye, or Channel Noise by ICLabel [22]. | Insufficient data length or data quality for ICA. Inadequate high-pass filtering. Noisy or bad channels included in the ICA computation. | Ensure sufficient clean data is available (e.g., ≥ 30 min for high-density mobile EEG) [11]. Apply a 1-Hz high-pass filter before ICA [24]. Identify and remove bad channels before running ICA. |
| A component with a high "Brain" probability from ICLabel has a high Residual Variance (RV). | The underlying source may not be well-modeled by a single equivalent dipole. The component may represent a valid but non-dipolar brain source. | Do not rely on a single metric. Examine the component's power spectrum and time series for brain-like characteristics [23]. Consider the two-dipole RV; a much lower value may indicate a bilateral source. |
| Components that appear neurogenic based on topography and spectrum have unexpectedly high RV. | Forward head model inaccuracies. Imperfections in the ICA decomposition itself. The brain source is genuinely non-dipolar. | Consult the ICLabel tutorial, which cautions that "most components are not dipolar" [23]. Use RV as a relative measure for comparison between parameter sets (e.g., different iCanClean windows), not as an absolute indicator of quality. |
Objective: To determine the optimal window size parameter for the iCanClean algorithm by objectively assessing the quality of resulting independent components.
Methodology:
Expected Outcome: The optimal window size will maximize the percentage of Brain components, minimize the average RV of those components, and achieve the highest Data Quality Score, indicating superior artifact removal and signal preservation.
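The first two criteria in the Expected Outcome can be computed per parameter setting as follows. The sketch assumes each component is available as an (ICLabel class, residual variance) pair and ranks settings by percentage of Brain components first and mean RV second; that ranking rule, the function names, and the toy values are all assumptions made for illustration.

```python
def summarize(components):
    """Return (% Brain components, mean RV of Brain components).

    `components` is a list of (iclabel_class, residual_variance) tuples.
    """
    brain_rvs = [rv for label, rv in components if label == "Brain"]
    percent_brain = 100.0 * len(brain_rvs) / len(components)
    mean_rv = sum(brain_rvs) / len(brain_rvs) if brain_rvs else float("inf")
    return percent_brain, mean_rv


def best_setting(results):
    """Highest % Brain wins; ties go to the lowest mean RV (assumed rule)."""
    def sort_key(name):
        percent_brain, mean_rv = summarize(results[name])
        return (-percent_brain, mean_rv)
    return min(results, key=sort_key)


# Toy values for illustration only
results = {
    "window A": [("Brain", 0.08), ("Muscle", 0.40), ("Brain", 0.12), ("Eye", 0.30)],
    "window B": [("Brain", 0.10), ("Brain", 0.05), ("Brain", 0.20), ("Muscle", 0.50)],
}
for name, components in results.items():
    print(name, summarize(components))
print(best_setting(results))
```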
The following tools are critical for conducting and validating EEG artifact removal research.
| Tool / Material | Function in Research | Application in Parameter Validation |
|---|---|---|
| ICLabel Classifier | An automated classifier to label Independent Components (ICs) from EEG data into categories like Brain, Muscle, and Eye [22]. | Provides a primary, objective metric (% Brain components) for evaluating the output of iCanClean under different parameters. |
| iCanClean Algorithm | A novel framework for removing motion, muscle, eye, and line-noise artifacts from EEG in real-time capable scenarios [11]. | The algorithm whose parameters (e.g., R² threshold and window size) are the subject of the tuning and validation process. |
| Phantom Head Apparatus | A physical model with embedded brain signal sources and contaminating artifact sources [11]. | Provides ground-truth data with known brain signals, enabling quantitative calculation of a Data Quality Score to validate cleaning efficacy. |
| Residual Variance (RV) | A quantitative measure of how well an IC's scalp map is fit by an equivalent current dipole model [23]. | Serves as a secondary, objective metric to assess the neurophysiological plausibility of components labeled as "Brain". |
| High-Pass Filter (1 Hz) | A preprocessing step to remove slow baseline drift from the EEG signal prior to ICA [24]. | A critical, standardized step to ensure a stable and meaningful ICA decomposition, which is foundational for all subsequent validation. |
Problem: After running ICA on mobile EEG data, you find too few brain components, or the components have poor dipolarity (high residual variance).
Explanation: Large motion artifacts can overwhelm the ICA algorithm, preventing it from effectively separating brain signals from noise [3] [5]. The goal of preprocessing is to reduce these large-amplitude artifacts to enable a successful decomposition.
Solutions:
A very low ASR k parameter (e.g., below 10) can over-clean the data and remove neural signals. For locomotion data, a k value of 20-30 is often recommended; values below 10 can reduce component dipolarity [3].
Problem: After cleaning, you notice a strong peak in the power spectrum at the step frequency and its harmonics, suggesting motion artifact remains.
Explanation: Motion artifacts from activities like running are often time-locked to the gait cycle, producing rhythmic, high-amplitude signals that can be difficult to fully remove [3].
Solutions:
Problem: You are running a task during movement (e.g., a Flanker task while jogging) but cannot recover the expected ERP components, like the P300.
Explanation: Motion artifacts can obscure stimulus-locked neural responses. The cleaning method must be aggressive enough to remove noise without distorting or removing the underlying cognitive signal [3].
Solutions:
FAQ 1: I don't have a dual-layer EEG system. Can I still use iCanClean?
Answer: Yes. While iCanClean is most effective with true dual-layer noise sensors, it can be configured to use "pseudo-reference" noise signals derived from your existing EEG data. The algorithm creates these by temporarily filtering the data to isolate noise-dominated frequency bands (e.g., very low frequencies for motion artifacts), which are then used as the reference for cleaning [3] [11].
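As a rough illustration of the pseudo-reference idea, the sketch below low-pass filters a copy of the EEG to keep only a noise-dominated low-frequency band. The 3 Hz cutoff, the Butterworth design, and the function name `pseudo_reference` are assumptions for this example; the published implementation may construct its reference signals differently.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def pseudo_reference(eeg, fs, cutoff_hz=3.0, order=4):
    """Return a low-passed copy of the EEG to serve as a noise reference.

    The original EEG is left untouched; only this temporary copy is filtered.
    """
    sos = butter(order, cutoff_hz, btype="lowpass", fs=fs, output="sos")
    return sosfiltfilt(sos, eeg, axis=-1)

# Synthetic single channel: slow 1 Hz "motion" component plus a 10 Hz rhythm.
fs = 250.0
t = np.arange(0, 20, 1 / fs)
motion = 40.0 * np.sin(2 * np.pi * 1.0 * t)
rhythm = 5.0 * np.sin(2 * np.pi * 10.0 * t)
eeg = (motion + rhythm)[np.newaxis, :]

noise_ref = pseudo_reference(eeg, fs)

# Away from the edges, the reference tracks the slow component only.
mid = slice(1000, -1000)
print(np.max(np.abs(noise_ref[0, mid] - motion[mid])))
```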
FAQ 2: How does iCanClean differ from traditional Adaptive Filtering?
Answer: They are fundamentally different. Adaptive Filtering assumes a simple, linear relationship between a reference noise signal and the corruption in the EEG signal. In contrast, iCanClean uses Canonical Correlation Analysis (CCA) to identify and remove entire subspaces of the EEG data that are correlated with noise subspaces. This lets it handle more complex mixing scenarios, and it consistently outperforms Adaptive Filtering, especially for motion artifacts [11] [14].
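The subspace idea can be illustrated with a minimal NumPy sketch. This is a simplified, single-window illustration of CCA-based cleaning, not the published iCanClean implementation: the function name `cca_clean`, the QR/SVD route to the canonical correlations, and the choice to project out the EEG-side canonical time courses are all assumptions made for this example. It also assumes more samples than channels and full-rank data.

```python
import numpy as np

def cca_clean(eeg, noise, r2_threshold=0.65):
    """Remove EEG subspaces whose canonical R^2 with the noise exceeds the threshold.

    eeg: (n_eeg, n_samples); noise: (n_noise, n_samples).
    Returns the mean-centered cleaned EEG and the canonical R^2 values.
    """
    X = (eeg - eeg.mean(axis=1, keepdims=True)).T      # samples x EEG channels
    N = (noise - noise.mean(axis=1, keepdims=True)).T  # samples x noise channels
    Qx, _ = np.linalg.qr(X)  # orthonormal basis for the EEG time courses
    Qn, _ = np.linalg.qr(N)  # orthonormal basis for the noise time courses
    U, s, _ = np.linalg.svd(Qx.T @ Qn, full_matrices=False)
    r2 = np.clip(s, 0.0, 1.0) ** 2  # squared canonical correlations
    flagged = r2 > r2_threshold
    Z = Qx @ U[:, flagged]          # orthonormal time courses of flagged components
    X_clean = X - Z @ (Z.T @ X)     # least-squares removal from every EEG channel
    return X_clean.T, r2

# Synthetic data: 3 "brain" sources plus one artifact shared with the noise sensors.
rng = np.random.default_rng(0)
n = 2000
brain = rng.standard_normal((3, n))
artifact = rng.standard_normal(n)
eeg = rng.standard_normal((4, 3)) @ brain + np.outer([5.0, 4.0, 3.0, 2.0], artifact)
noise = np.outer([1.0, 0.8], artifact) + 0.1 * rng.standard_normal((2, n))

cleaned, r2 = cca_clean(eeg, noise)
before = abs(np.corrcoef(eeg[0], artifact)[0, 1])
after = abs(np.corrcoef(cleaned[0], artifact)[0, 1])
print(f"R^2 values: {np.round(r2, 3)}, artifact correlation {before:.2f} -> {after:.3f}")
```

In this toy case only one canonical component exceeds the threshold, so a single dimension is removed from the EEG while the remaining subspace is left intact.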
FAQ 3: Which method is best for real-time processing?
Answer: iCanClean, ASR, and Adaptive Filtering are all capable of real-time implementation [11] [14]. ICA, however, is computationally intensive and generally not suitable for real-time applications due to slow decomposition times [11].
FAQ 4: Can Auto-CCA remove low-frequency motion artifacts?
Answer: Theoretically, yes, by rejecting components with high autocorrelation. However, exercise caution: brain activity is also low-frequency and highly autocorrelated, so there is a risk of removing neural signals along with the artifact [11] [14].
The table below summarizes key performance metrics from controlled studies, providing a basis for method selection.
Table 1: Performance Comparison of Artifact Removal Methods
| Method | Data Quality Score (Brain + All Artifacts) [11] | Good ICA Components (Before/After) [5] [7] | Optimal Parameters | Key Strength |
|---|---|---|---|---|
| iCanClean | 55.9% (from 15.7%) | 13.2 (from 8.4) | R²=0.65, 4-s window [5] | Effectively removes multiple artifact types without calibrated data [11]. |
| ASR | 27.6% (from 15.7%) | Information Missing | k=20-30 [3] | Good for burst-like artifacts; integrated into EEGLAB. |
| Auto-CCA | 27.2% (from 15.7%) | Information Missing | Information Missing | Computationally efficient; no reference signals needed. |
| Adaptive Filtering | 32.9% (from 15.7%) | Information Missing | Information Missing | Effective for simple, linear artifacts like eye blinks. |
Table 2: Impact of iCanClean Noise Channels on Performance
| Number of Noise Channels | Average Good ICA Components | Performance Note |
|---|---|---|
| 120 (Full Set) | 13.2 | Baseline optimal performance [5] [7]. |
| 64 | 12.7 | Good performance maintained [5] [7]. |
| 32 | 12.2 | Moderate performance loss [5] [7]. |
| 16 | 12.0 | Performance remains acceptable [5] [7]. |
This protocol is designed to systematically identify the optimal settings for your specific dataset.
Methodology:
Using a phantom head with known ground-truth brain signals provides the most rigorous validation of any cleaning method.
Methodology:
Algorithm Selection Workflow
Table 3: Essential Materials for Mobile EEG Artifact Research
| Item | Function in Research |
|---|---|
| Dual-Layer EEG Cap | The key hardware for iCanClean. Outer-layer electrodes act as noise references, mechanically coupled to scalp electrodes to record only environmental and motion artifacts [5] [11]. |
| Conductive Phantom Head | A physical model with embedded "brain" signal sources. Provides known ground-truth signals to quantitatively validate and compare the performance of cleaning algorithms [11]. |
| Portable EEG System with Active Electrodes | Enables data collection during whole-body movement. Active electrodes help mitigate motion artifacts by amplifying signals close to the source before transmission [11]. |
| High-Performance Computing (HPC) or Workstation | Necessary for running computationally intensive processes like ICA on high-density, mobile EEG datasets, which can require hours of computation time [11]. |
| Software Platform (EEGLAB/BCILAB) | Standard software environments that provide implementations of ASR and a framework for integrating and testing other algorithms like iCanClean [5] [11]. |
Q1: What is a phantom head, and why is it critical for validating EEG processing techniques like iCanClean?
A phantom head is a physical model of the human head, engineered to simulate the electrical conductivity of biological tissues and skull structures [25]. It contains embedded antennae that deliver known "ground-truth" signals. For algorithms like iCanClean that are designed to remove motion artifacts from mobile EEG data, phantom heads provide an indispensable validation tool. They allow researchers to test whether their processing steps can accurately recover known signals that have been corrupted by real-world volume conduction and motion artifacts, a scenario that computer simulations alone cannot fully replicate [25] [26].
Q2: My iCanClean-processed data still shows high residual noise. Which parameter should I adjust first?
The primary parameter to adjust is the r² threshold, which controls the cleaning aggressiveness [4] [5]. A lower r² value (e.g., 0.3) results in more aggressive noise removal, while a higher value (e.g., 0.8) is more conservative. If high noise persists, try lowering the r² value incrementally. A parameter sweep study identified an optimal r² value of 0.65 for data collected during walking, which serves as an excellent starting point for tuning [4] [5].
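To make the effect of the threshold concrete, the sketch below counts how many candidate components would be removed at the three thresholds mentioned above. The R² values are invented for illustration, and the rule "remove a component when its R² exceeds the threshold" is an assumption consistent with the description here rather than a statement about the algorithm's internals.

```python
def n_removed(r2_values, threshold):
    """Count components whose R^2 with the noise reference exceeds the threshold."""
    return sum(1 for r2 in r2_values if r2 > threshold)

# Hypothetical R^2 values for seven candidate components in one window.
r2_values = [0.95, 0.82, 0.71, 0.55, 0.40, 0.22, 0.10]

for threshold in (0.3, 0.65, 0.8):
    print(f"threshold {threshold}: {n_removed(r2_values, threshold)} components removed")
```

Lowering the threshold from 0.8 to 0.3 more than doubles the number of removed components in this example, which is why small incremental steps are advisable.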
Q3: How does the choice of "window length" in iCanClean affect the cleaning of my data?
The window length determines the segment of data over which the algorithm calculates the correlation between cortical electrodes and noise electrodes [5]. Using shorter windows (e.g., 1-2 seconds) allows iCanClean to adapt to rapidly changing noise, which is ideal for highly dynamic tasks. Longer windows (e.g., 4 seconds or the entire recording) can be more effective for stabilizing the decomposition when noise is more consistent. Research suggests a 4-second window paired with an r² of 0.65 provides a robust configuration for mobile data [5].
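As a small illustration of what the window length means in samples, the sketch below splits a recording into consecutive windows. Non-overlapping windows and the helper name `window_bounds` are assumptions for this example; an actual implementation may use overlapping or sliding windows.

```python
def window_bounds(n_samples, fs, window_s):
    """Return (start, stop) sample indices for consecutive, non-overlapping windows."""
    step = int(round(window_s * fs))
    if step < 1:
        raise ValueError("window_s * fs must be at least one sample")
    return [(start, min(start + step, n_samples))
            for start in range(0, n_samples, step)]

# A 10 s recording at 250 Hz, cut into 4 s windows; the last window is shorter.
print(window_bounds(n_samples=2500, fs=250, window_s=4))
```

Each window would then be cleaned on its own, so shorter windows track changing noise more closely while giving the correlation estimate fewer samples to work with.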
Q4: What are the key metrics for quantifying the performance of iCanClean after using a phantom for validation?
When using a ground-truth phantom, two primary quantitative metrics are:
Problem: After processing mobile EEG data with iCanClean, subsequent ICA fails to produce a sufficient number of brain-like components.
Solution:
Problem: iCanClean performs well on data from young adults but seems less effective on data from older adult populations.
Solution:
Problem: After aggressive cleaning with iCanClean, the resulting EEG signals appear overly smoothed, and evoked responses are attenuated.
Solution:
This protocol outlines how to use a phantom head to empirically determine the optimal iCanClean parameters for your specific experimental setup.
1. Materials and Setup
2. Ground-Truth Signal Generation Generate complex, physiologically relevant signals for the antennae using a Neural Mass Model (NMM). Create signals with peak frequencies in different EEG bands (e.g., theta: 6.5 Hz, alpha: 10 Hz, gamma: 41 Hz). Incorporate intermittent, known connections between these signals to serve as a ground truth for connectivity analysis [25].
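For quick bench checks before a full Neural Mass Model is available, a much simpler stand-in can be useful. The sketch below is explicitly not an NMM: it generates a noisy sinusoid at each of the example peak frequencies named above and confirms the spectral peak. The function names, noise level, sampling rate, and duration are assumptions, and no inter-signal connectivity is modeled.

```python
import numpy as np

def surrogate_source(peak_hz, fs, duration_s, rng, noise_sd=0.5):
    """Noisy sinusoid at peak_hz; a simple stand-in, not a neural mass model."""
    t = np.arange(int(fs * duration_s)) / fs
    return np.sin(2 * np.pi * peak_hz * t) + noise_sd * rng.standard_normal(t.size)

def peak_frequency(signal, fs):
    """Frequency (Hz) of the largest magnitude bin in the one-sided spectrum."""
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(signal.size, d=1 / fs)
    return float(freqs[np.argmax(spectrum)])

rng = np.random.default_rng(42)
fs, duration_s = 500.0, 20.0
peaks_hz = {"theta": 6.5, "alpha": 10.0, "gamma": 41.0}
sources = {band: surrogate_source(f, fs, duration_s, rng) for band, f in peaks_hz.items()}

for band, signal in sources.items():
    print(band, peak_frequency(signal, fs))
```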
3. Data Collection & Processing
4. Performance Quantification For each parameter combination, calculate the following metrics:
The table below summarizes key performance metrics from published studies to serve as a benchmark.
Table 1: Benchmark Performance Metrics for iCanClean and Phantom Validation
| Metric | Target Performance | Context / Conditions |
|---|---|---|
| Optimal iCanClean Parameters | Window: 4-s, r²: 0.65 [4] [5] | Mobile EEG data from walking; improves good ICs from 8.4 to 13.2 (+57%) [5] |
| Cross-correlation with Ground-Truth | Primarily > 0.8 [25] | Phantom head validation after ICA source separation [25] |
| Signal-to-Noise Ratio (SNR) | ~10 dB (maintained with iCanClean) [25] | During fast walking speeds; compared to ~2 dB in raw scalp data [25] |
| Good Independent Components (Residual Variance <15%, ICLabel >50%) | 13.2 components on average [5] | After iCanClean processing with optimal parameters [5] |
| Scanner Instability/Noise in Phantom Data | 6-18% of total noise [27] | Measured as multiplicative noise contribution in "best-case" fMRI scanners [27] |
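Two of the benchmark metrics in the table, cross-correlation with the ground truth and SNR, could be computed along the following lines. The exact definitions used in the cited studies are not given here, so both formulas are assumptions: the cross-correlation is the peak absolute normalized value over all lags, and the SNR treats the difference between the recovered and ground-truth signals as noise, which requires both to be on the same scale.

```python
import numpy as np

def peak_cross_correlation(recovered, truth):
    """Maximum absolute normalized cross-correlation over all lags."""
    r = (recovered - recovered.mean()) / recovered.std()
    g = (truth - truth.mean()) / truth.std()
    xcorr = np.correlate(r, g, mode="full") / r.size
    return float(np.max(np.abs(xcorr)))

def snr_db(truth, recovered):
    """SNR in dB, treating (recovered - truth) as the noise term."""
    residual = recovered - truth
    return float(10.0 * np.log10(np.mean(truth**2) / np.mean(residual**2)))

# Synthetic check: a 10 Hz ground truth plus noise scaled for roughly 10 dB SNR.
rng = np.random.default_rng(1)
fs = 250.0
t = np.arange(2000) / fs
truth = np.sin(2 * np.pi * 10.0 * t)
recovered = truth + np.sqrt(0.05) * rng.standard_normal(t.size)

print(peak_cross_correlation(recovered, truth), snr_db(truth, recovered))
```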
For labs without a commercial phantom, this protocol provides a method for creating a basic validation tool.
1. Phantom Fabrication
2. Basic System Validation
Table 2: Key Materials for Phantom Fabrication and Validation
| Item | Function / Application | Example Use Case |
|---|---|---|
| Polyvinyl Chloride (PVC) Plastisol | Tissue-mimicking material for simulating brain tissue in phantoms [28] [29] | Used as a primary filler in 3D-printed skull models to create realistic head phantoms for transcranial ultrasound [28]. |
| Polylactic Acid (PLA) | Filament for Fused Deposition Modeling (FDM) 3D printing; creates acoustically opaque structures like the skull [28] [29] | 3D printing the structural components of a head phantom, such as the skull bone [28]. |
| Photopolymer Resin | Material for LCD 3D printing; used to create parts with specific acoustic properties, like temporal acoustic windows [28] [29] | Printing the "acoustic windows" in a head phantom to simulate areas of the skull that allow ultrasound to pass [28]. |
| Dental Plaster Mixture | Conductive medium that simulates realistic tissue conductance for EEG phantoms [25] | Combined with sodium propionate and water to fill a mannequin head for EEG electrode testing [25]. |
| Neural Mass Model (NMM) | Computational model that generates complex, physiologically relevant signals with known interconnectivity [25] | Providing the "ground-truth" brain signals that are played into the antennae of a phantom head during validation [25]. |
| dSPACE MicroLabBox | Input/output interface hardware for delivering precise, predefined signals to antennae within a phantom [25] | Used to feed NMM-generated signals into the antennae of an EEG phantom head [25]. |
The following diagram illustrates the complete experimental workflow for phantom head validation, from setup to quantitative analysis.
Experimental Workflow for Phantom Head Validation
This guide provides targeted support for researchers optimizing the iCanClean algorithm for mobile brain imaging, focusing on the critical parameters of R² threshold and window size.
The table below summarizes the key parameters for the iCanClean algorithm, their functions, and recommended values based on systematic testing.
| Parameter | Function & Effect | Recommended Value | Performance Impact |
|---|---|---|---|
| R² Threshold | Controls cleaning aggressiveness; lower values remove more data components [7]. | 0.65 (for walking motion artifacts) [7] [5] | Increased good ICA components from 8.4 to 13.2 (+57%) at this setting [7]. |
| Window Length | Duration of data segments cleaned in one analysis cycle; affects noise correlation detection [7]. | 4 seconds (for walking motion artifacts) [7] [5] | Balances local noise correlation tracking with sufficient data for stable cleaning [7]. |
| Noise Channels | Number of reference noise electrodes used for artifact detection [7]. | 16-64 channels (from 120 available) [7] | Performance maintained with reduced channels (12.0-12.7 good components vs. 13.2 with 120 channels) [7]. |
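The percentages in the table follow directly from the reported component counts. The short check below reproduces the +57% gain and shows how much of the full-set yield is retained with fewer noise channels; the helper names are illustrative only.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return 100.0 * (after - before) / before

def percent_retained(reduced, full):
    """Share of the full-set yield kept by a reduced configuration, in percent."""
    return 100.0 * reduced / full

# Good ICA components before and after cleaning at R² = 0.65 with a 4-s window.
print(f"Gain: {percent_change(8.4, 13.2):.1f}%")

# Good components with reduced noise-channel counts vs. the 120-channel set.
for channels, good in [(64, 12.7), (32, 12.2), (16, 12.0)]:
    print(f"{channels} channels: {percent_retained(good, 13.2):.1f}% of full-set yield")
```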
This protocol was designed to quantitatively compare iCanClean against other artifact removal methods using a known ground truth [14].
This protocol established optimal parameters for cleaning motion artifacts during walking [7] [5].
The following diagram illustrates the logical flow of the iCanClean algorithm and its key parameters based on the described research.
| Item | Function in Experiment |
|---|---|
| Dual-Layer EEG Cap | A specialized cap with 120 scalp electrodes and 120 mechanically coupled but electrically isolated noise electrodes. It provides the reference noise signals essential for the iCanClean algorithm [7] [5]. |
| Electrical Phantom Head | A bench-test apparatus with embedded brain and artifact sources. It provides known ground-truth signals for validating cleaning algorithms without human subjects [14]. |
| INDIP Reference System | A multi-sensor system (INertial modules, DIstance sensors, Pressure insoles) used as a gold standard for validating real-world gait and mobility digital outcomes [30]. |
| GaitPy Algorithm | An open-source method for analyzing gait from a single lumbar-worn accelerometer, enabling validation of gait speed in naturalistic environments [31]. |
| ICLabel Classifier | A convolutional neural network that automatically classifies Independent Components from ICA, helping researchers identify "good" brain components post-cleaning [7] [5]. |
Q1: My data is from a seated task with eye and muscle artifacts, not walking. Should I use the same R²=0.65 setting? A: The R²=0.65 and 4-second window were optimized specifically for cleaning motion artifacts during walking [7]. For other artifact types (e.g., pure eye-blink or EMG noise), a different parameter combination might be more effective. It is recommended to run a small parameter sweep on a representative segment of your data to find the optimal setting for your specific task and artifact profile.
Q2: I have fewer than 16 noise channels available. Can I still use iCanClean effectively? A: The research shows that performance declines gracefully as noise channels are reduced [7]. While 16-64 noise channels are recommended for best results, the algorithm can still function with fewer, though cleaning efficacy may be lower. Prioritize the most evenly spaced subset of your available noise channels.
Q3: How does iCanClean's performance compare to traditional methods like ICA alone? A: iCanClean is not a replacement for ICA but a powerful preprocessing step. The study showed that cleaning data with iCanClean before running ICA significantly improved the results of the ICA decomposition, increasing the number of identifiable, high-quality brain components [7] [5].
Q4: What is considered a successful outcome after using iCanClean? A: Success depends on your downstream analysis. For source-level analysis with ICA, success is measured by an increase in the number of "good" brain components that are dipolar and have high brain probability [7]. For other analyses, success could be a higher Data Quality Score or improved signal-to-noise ratio in event-related potentials.
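Counting "good" components for a before/after comparison can be sketched as below, using the criteria quoted elsewhere in this guide (residual variance below 15% and ICLabel brain probability above 50%). The input values are invented, and the function name is illustrative.

```python
def count_good_components(residual_variance, brain_probability,
                          rv_max=0.15, p_brain_min=0.50):
    """Count ICs that are both dipolar (low RV) and likely brain (high ICLabel probability)."""
    return sum(
        1
        for rv, p in zip(residual_variance, brain_probability)
        if rv < rv_max and p > p_brain_min
    )

# Hypothetical values for five components (fractions, not percentages).
rv = [0.05, 0.12, 0.30, 0.10, 0.14]
p_brain = [0.90, 0.40, 0.80, 0.60, 0.51]

print(count_good_components(rv, p_brain))
```

Running the same count on the decomposition obtained without cleaning gives the baseline against which the improvement is judged.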
Q1: What are the key performance metrics used to validate iCanClean's effectiveness? The primary metrics for evaluating iCanClean are:
Q2: How do the r² threshold and window length parameters affect cleaning performance?
These are iCanClean's core tuning parameters [5]:
r² Threshold (Cleaning Aggressiveness): Controls which correlated noise subspaces are removed. A lower r² value results in more aggressive cleaning (more components removed), while a value near 1.0 results in less cleaning [5] [7].
Q3: What are the optimal parameter settings for iCanClean? Based on a systematic parameter sweep with mobile EEG data during walking, the optimal settings were found to be [5] [7]:
r² Threshold: 0.65
Window Length: 4 seconds
Q4: How does iCanClean performance change with fewer noise reference channels? Performance remains robust even with a reduced set of noise channels. After finding the optimal parameters, a subsequent test showed that the number of "good" ICs only slightly decreased with fewer channels [5] [7]:
| Number of Noise Channels | Average Good ICs |
|---|---|
| 120 (Full Set) | 13.2 |
| 64 | 12.7 |
| 32 | 12.2 |
| 16 | 12.0 |
Q5: How does iCanClean compare to other artifact removal methods? In a phantom head study with known ground-truth signals, iCanClean consistently outperformed other real-time-capable methods, especially when multiple artifacts were present simultaneously. Starting from a Data Quality Score of 15.7% (before cleaning) in the "Brain + All Artifacts" condition, results were [14] [11]:
| Cleaning Method | Data Quality Score After Cleaning |
|---|---|
| iCanClean | 55.9% |
| Adaptive Filtering | 32.9% |
| Artifact Subspace Reconstruction (ASR) | 27.6% |
| Auto-Canonical Correlation Analysis (Auto-CCA) | 27.2% |
Objective: To determine the optimal r² threshold and window length for iCanClean to maximize the number of high-quality brain components extracted via ICA from mobile EEG data [5] [7].
r² Threshold: Tested from 0.05 to 1.00 in increments of 0.05.
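A sweep of this kind can be organized as a simple grid search. In the sketch below, the r² grid matches the range stated above, while the window lengths and the scoring function are placeholders: `toy_score` stands in for the real pipeline (clean, run ICA, count good components) and is shaped only so that the example has a clear maximum.

```python
from itertools import product

# r² thresholds from 0.05 to 1.00 in steps of 0.05 (20 values).
r2_grid = [round(0.05 * i, 2) for i in range(1, 21)]

# Candidate window lengths in seconds; hypothetical, not taken from the study.
window_grid = [1.0, 2.0, 4.0]

def toy_score(r2, window_s):
    """Placeholder for 'number of good ICs after cleaning'; not real data."""
    return -abs(r2 - 0.65) - 0.1 * abs(window_s - 4.0)

def run_sweep(score_fn, r2_values, window_values):
    """Evaluate every (r², window) pair and return the best pair with all scores."""
    scores = {(r2, w): score_fn(r2, w) for r2, w in product(r2_values, window_values)}
    best = max(scores, key=scores.get)
    return best, scores

best, scores = run_sweep(toy_score, r2_grid, window_grid)
print(f"{len(scores)} combinations tested; best = {best}")
```

Replacing `toy_score` with a function that runs the actual cleaning and ICA steps turns this into the full sweep, at the cost of one ICA decomposition per combination.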
| Item | Function in iCanClean Research |
|---|---|
| Dual-Layer EEG Cap | A specialized cap with inner-layer (scalp) electrodes recording brain + noise and outer-layer electrodes recording only reference noise. Mechanically coupled but electrically isolated [5] [7]. |
| Electrical Phantom Head | A physical model with embedded antennas to simulate known "brain" signals. Allows for controlled injection of artifacts and provides ground-truth for validating cleaning algorithms [14] [11]. |
| ICLabel | A convolutional neural network-based classifier that automatically labels independent components from ICA by estimating the probability that a component comes from a specific source (e.g., brain, muscle, eye) [5] [7]. |
| Dipolar Source Localization | A method to fit an equivalent current dipole to an ICA component's scalp topography. A low residual variance (<15%) indicates a component that is physically plausible and likely originates from a compact brain source [5]. |
| Canonical Correlation Analysis (CCA) | The core statistical engine of iCanClean. It identifies linear subspaces of the cortical EEG data that are maximally correlated with subspaces in the reference noise data, which are then removed [5] [14]. |
The systematic tuning of iCanClean's R2 value and window size is paramount for unlocking reliable mobile brain imaging. The consensus from current research points to an R2 threshold of 0.65 and a 4-second window as a robust starting point for human walking studies, significantly improving the yield of high-quality, dipolar brain components. Successful application requires a careful balance; an overly aggressive R2 can suppress neural signals, while an overly conservative one leaves artifacts. iCanClean has consistently demonstrated superiority over methods like ASR in both phantom and human studies, particularly in preserving brain signals while removing complex motion and muscle artifacts. For biomedical and clinical research, these optimized cleaning protocols enable more sensitive detection of electrocortical biomarkers during dynamic behaviors, paving the way for deeper investigations into neurological mechanisms of mobility, more objective assessment in neurorehabilitation, and potentially sharper endpoints for clinical trials in drug development for neurological disorders. Future work should focus on developing fully automated parameter selection and adapting these guidelines for a wider range of populations and real-world activities.