
MethodsX: Latest Articles

WAYVision: A hybrid deep learning approach for recognizing handwritten Kannada Braille using wavelet transformation and attention based YOLOv5
IF 1.6 Q2 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-12-01 Epub Date: 2025-06-13 DOI: 10.1016/j.mex.2025.103440
Bipin Nair B J , Niranjan , Saketh P , Shobha Rani N
Handwritten Braille character recognition presents a significant challenge in the field of assistive technology, especially when it must cover diverse linguistic scripts such as Kannada. The dataset is uniquely curated, combining ground-truth data from Kaggle with real-world samples collected from blind schools, segmented into vowels and consonants. The proposed system demonstrates exceptional performance in feature extraction, classification accuracy, and addressing spatial misalignments in Braille dots. Comparative analysis against state-of-the-art methods confirms the efficiency of the proposed model in overcoming the limitations of conventional techniques. The system was trained with two train-test splits, 70:30 and 80:20, which achieved accuracies of 97.9 % and 98.7 %, respectively. This study aims to contribute significantly to the empowerment of visually impaired communities through advancements in automated Braille recognition systems.
  • The study addresses the challenge of handwritten Kannada Braille recognition using a uniquely curated dataset from Kaggle and blind schools, divided into vowels and consonants.
  • The proposed system achieves high accuracy (97.9 % for the 70:30 split and 98.7 % for the 80:20 split), showing superior feature extraction and handling of spatial misalignments in Braille dots.
  • Comparative analysis of state-of-the-art methods confirms the model’s efficiency in overcoming limitations of conventional techniques, contributing to assistive technology for visually impaired communities.
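The two train-test ratios reported above can be reproduced with a simple shuffled split. The following is a minimal sketch, not the authors' code: the function name `split_dataset`, the fixed seed, and the placeholder image identifiers are all assumptions made for illustration.

```python
import random


def split_dataset(samples, train_fraction, seed=0):
    """Shuffle a copy of `samples` and split it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical stand-ins for labelled Braille character images.
samples = [f"img_{i:03d}" for i in range(100)]
for fraction in (0.7, 0.8):
    train, test = split_dataset(samples, fraction)
    print(f"{fraction:.0%} split: {len(train)} train / {len(test)} test")
```

With vowel and consonant classes of different sizes, a per-class (stratified) split would be a reasonable refinement; the abstract does not say whether one was used.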
MethodsX, Volume 15, Article 103440 (Journal Article).
Citations: 0
A blockchain-enabled healthcare system for cervical cancer risk prediction using enhanced metaheuristic optimised graph convolutional attention based GRU
IF 1.9 Q2 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-12-01 Epub Date: 2025-08-16 DOI: 10.1016/j.mex.2025.103564
Anusha R, Srinivas Prasad
Cervical cancer is a serious health concern worldwide that entails high risks for individuals because of delayed detection and treatment. Formal screening for the condition is challenging in developing countries due to several factors, including medical costs, access to healthcare facilities, and delayed symptom manifestation. A blockchain-enabled healthcare system for cervical cancer risk prediction ensures data security, privacy, and accurate risk assessment. This system uses blockchain to provide decentralised, tamper-proof storage and access control over sensitive patient data, ensuring that only authorized entities can interact with the information. An improved spotted hyena optimization algorithm is employed for cervical cancer risk prediction, fine-tuning a Graph Convolutional Network (GCN) integrated with an Attention Mechanism and a Gated Recurrent Unit (GRU). The GCN captures complex relationships between medical attributes and patients, while the attention mechanism dynamically assigns weights to features based on relevance, improving predictive accuracy. The GRU processes sequential data, such as medical history, to model temporal dependencies in the risk factors. The metaheuristic optimization further enhances the model by finding the optimal parameters, boosting performance.
  • Introduces a blockchain-enabled system for secure and decentralized medical data management
  • Applies an intelligent model for predicting cervical cancer risk using patient health records
  • Demonstrates improved accuracy, privacy, and reliability over traditional diagnostic methods
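The tamper-evidence property mentioned above can be illustrated with a plain hash chain. This is a deliberately simplified sketch under stated assumptions: it has no consensus protocol, networking, or access control, and the record fields (`patient`, `risk_score`) and function names are hypothetical rather than taken from the paper.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first block


def block_hash(index, record, prev_hash):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "record": record, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, record):
    """Append a record, linking it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "record": record,
        "prev_hash": prev_hash,
        "hash": block_hash(index, record, prev_hash),
    })


def is_valid(chain):
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = GENESIS_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, block["record"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_block(chain, {"patient": "P001", "risk_score": 0.12})
append_block(chain, {"patient": "P002", "risk_score": 0.81})
print(is_valid(chain))  # True

chain[0]["record"]["risk_score"] = 0.99  # simulate tampering with stored data
print(is_valid(chain))  # False
```

Because each block's hash covers the previous block's hash, editing any stored record invalidates the chain from that point on, which is what makes silent modification detectable.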
MethodsX, Volume 15, Article 103564 (Journal Article).
Citations: 0
TARGETFLOW:An automated literature mining pipeline for accelerating the discovery of high-potential targets in rare and mature diseases
IF 1.9 Q2 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-12-01 Epub Date: 2025-11-26 DOI: 10.1016/j.mex.2025.103735
Shuhan Guo, Tianxiang Shang, Yuzhu Pan, Yingjia Wu, Yan Li, Mohan Zhou
To expedite the early stages of drug development for diseases lacking established target databases, and to enhance knowledge updating in well-studied disease domains, this paper introduces TARGETFLOW, an automated literature-mining pipeline. The workflow begins by automatically retrieving literature, downloading relevant abstracts, and constructing a comprehensive database. After performing selective text cleaning and data preprocessing, it leverages large language models (LLMs) to conduct intelligent literature screening, followed by code-based whitespace tokenization. Subsequently, rule-based filtering is applied to extract high-potential therapeutic targets for the specified disease. To validate the effectiveness of this pipeline, three hypotheses were formulated: (1) An effective pipeline should be capable of identifying high-potential therapeutic targets for the given disease; (2) For diseases with established target databases, the pipeline should be able to detect novel and emerging targets not yet included in existing databases; and (3) The pipeline should also be applicable to rare or emerging diseases that lack mature target databases. Then, rheumatoid arthritis (RA), a common disease, and idiopathic pulmonary fibrosis (IPF), a rare disease, were selected as case studies. The results demonstrated the method’s reliability (high-potential target validation rate: 56 %), innovativeness (new target validation pass rate: 100 %), and generalizability (IPF target literature support rate: 88.9 %).
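The tokenization and rule-based filtering stages described above might look roughly like the sketch below. The abstract does not state the actual rules, so everything here is an assumption for illustration: the gene-symbol pattern `GENE_LIKE`, the `STOPLIST`, the `min_abstracts` threshold, and the sample abstracts are all hypothetical, and the LLM screening step is omitted.

```python
import re
from collections import Counter

# Hypothetical rule: gene-like tokens are 2-10 upper-case letters, digits or hyphens.
GENE_LIKE = re.compile(r"^[A-Z][A-Z0-9-]{1,9}$")
# Hypothetical stoplist of acronyms that match the pattern but are not targets.
STOPLIST = {"RA", "IPF", "DNA", "RNA", "LLM"}


def tokenize(text):
    """Whitespace tokenization, trimming surrounding punctuation from each token."""
    return [token.strip(".,;:()[]") for token in text.split()]


def candidate_targets(abstracts, min_abstracts=2):
    """Return gene-like tokens that appear in at least `min_abstracts` abstracts."""
    support = Counter()
    for abstract in abstracts:
        # A set counts each token at most once per abstract.
        seen = {
            token
            for token in tokenize(abstract)
            if GENE_LIKE.match(token) and token not in STOPLIST
        }
        support.update(seen)
    return sorted(token for token, n in support.items() if n >= min_abstracts)


abstracts = [
    "Inhibition of JAK1 reduced inflammation in RA models.",
    "JAK1 and TNF signalling were elevated in synovial tissue.",
    "TNF blockade remains a mainstay of RA therapy (DNA damage was unaffected).",
]
print(candidate_targets(abstracts))  # ['JAK1', 'TNF']
```

Counting support per abstract rather than per mention keeps a single paper that repeats one symbol many times from dominating the ranking.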
MethodsX, Volume 15, Article 103735 (Journal Article).
Citations: 0
A method to modelling oil spill using combination of logistic regression and cellular automata
IF 1.6 Q2 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-12-01 Epub Date: 2025-06-27 DOI: 10.1016/j.mex.2025.103474
Yihan Zhang, Shanshan Li
This study proposes a logistic regression-integrated cellular automata (CA) model for oil spill simulation, addressing challenges in parameter determination of traditional CA models. The method involves data preprocessing (geospatial alignment, resampling, normalization), Monte Carlo sampling for training data, logistic regression-based weight assignment to impact factors, neighborhood function and stochastic term computation, and iterative oil spill simulation. The model can be calibrated through sensitivity analyses of sampling ratios, spatial scales, and neighborhood structures. Finally, it was validated using DeepSpill experimental data. Results show optimal accuracy (97.40 %) under 22 % sampling ratio, 12.61 % oil area proportion, 6 m spatial scale, and 7 × 7 Moore neighborhood.
  • Innovative Model Integration & Calibration: Merged logistic regression with CA to objectively quantify environmental drivers (currents, wind, salinity) and optimize parameters (sampling, scale and neighborhood) in oil simulation.
  • Dynamic Optimization & Scale Sensitivity: Peak accuracy (96.41 %) can be obtained at 22 % sampling rate and 12.61 % oil area. 97.32 % accuracy at 6 m resolution balances resolution and boundary roughness.
  • Neighborhood-Driven Diffusion Enhancement: 7 × 7 Moore neighborhood boosts accuracy to 97.40 % (vs. 3 × 3), proving neighborhood size critically shapes diffusion dynamics.
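One update step of a logistic-regression-driven cellular automaton can be sketched as follows. This is a minimal illustration under assumptions, not the published model: the way suitability, neighbourhood fraction, and noise are multiplied, the noise range, the threshold, and the placeholder weights are all hypothetical. In the paper, the weights come from a logistic regression fitted on Monte Carlo samples.

```python
import math
import random


def logistic(z):
    """Standard logistic function mapping a linear score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neighbourhood_fraction(grid, row, col, radius):
    """Fraction of Moore-neighbourhood cells (centre excluded) that hold oil."""
    oiled = total = 0
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            if (r, c) == (row, col):
                continue
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                total += 1
                oiled += grid[r][c]
    return oiled / total if total else 0.0


def ca_step(grid, drivers, weights, bias, radius=1, threshold=0.5, rng=None):
    """Return the grid after one synchronous update of all cells."""
    rng = rng or random.Random(0)
    new_grid = [row[:] for row in grid]
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if grid[r][c] == 1:
                continue  # simplification: oiled cells stay oiled
            z = bias + sum(w * x for w, x in zip(weights, drivers[r][c]))
            suitability = logistic(z)
            omega = neighbourhood_fraction(grid, r, c, radius)
            noise = 1.0 + 0.1 * (rng.random() - 0.5)  # stochastic term in [0.95, 1.05)
            if suitability * omega * noise >= threshold:
                new_grid[r][c] = 1
    return new_grid


grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1  # single oiled cell in the centre
# Hypothetical normalised drivers per cell: (current speed, wind speed).
drivers = [[(1.0, 1.0)] * 5 for _ in range(5)]
weights, bias = (1.0, 1.0), 0.0  # placeholders for fitted regression coefficients
after = ca_step(grid, drivers, weights, bias, radius=1, threshold=0.05)
print(sum(map(sum, after)))  # 9: the slick grows to the 3 x 3 block around the source
```

In this sketch `radius=1` gives the 3 × 3 Moore neighbourhood and `radius=3` gives the 7 × 7 neighbourhood that the sensitivity analysis found most accurate.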
MethodsX, Volume 15, Article 103474 (Journal Article).
Citations: 0
CISCS: Classification of inter-class similarity based medicinal plant species groups with machine learning
IF 1.9 Q2 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-12-01 Epub Date: 2025-09-30 DOI: 10.1016/j.mex.2025.103652
N. Shobha Rani , Bhavya K R , I. Jeena Jacob , Pushpa B. R , Bipin Nair BJ , Akshatha Prabhu
The reliable classification of medicinal plant species plays a vital role in ensuring their quality, authenticity, and safe use in healthcare. However, existing methods often face difficulties when species exhibit strong visual similarities or when datasets are imbalanced, which limits their effectiveness in practice. Although deep learning models such as ResNet18 and VGG16 have proven influential in image recognition tasks, our experiments showed that they tended to overfit, with validation losses reaching 42.99 % and test accuracy falling to 73.99 % in certain groups. To overcome these challenges, we introduce a multi-level fusion feature model that combines 3D normalized color histograms, extended uniform Local Binary Patterns (LBP with P = 24, R = 3), multi-orientation Gabor filters, and Histogram of Oriented Gradients (HOG). This approach captures a richer set of visual cues by bringing together global color statistics, detailed textures, frequency-domain patterns, and shape descriptors. We further incorporate SMOTE-based synthetic augmentation to address class imbalance, which helps balance feature distributions across categories. We employ a soft-voting ensemble of machine learning classifiers for classification and use cosine similarity metrics to better capture inter-class relationships. Tests on Indian medicinal plant datasets show that our model consistently outperforms deep learning baselines, reaching 100 % accuracy in Group 1, 95.82 % in Group 3, and over 90 % in other groups. These results suggest that the proposed model offers a more robust and computationally efficient solution for plant species classification, particularly under conditions of high inter-class similarity and dataset imbalance.
  • The proposed domain-specific model can be applied explicitly to Indian plant species groups exhibiting high inter-class visual similarities through a novel feature fusion strategy.
  • The proposed multi-level feature fusion method integrates 3D normalized color histograms, extended uniform LBP (P = 24, R = 3), multi-orientation Gabor filters, and HOG features to capture color, texture, and shape characteristics.
  • The proposed work offers a scalable ensemble framework for inter-class similarity analysis by combining SMOTE-based class balancing, feature normalization, and a soft-voting ensemble of diverse classifiers that support biodiversity and ecological studies.
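Two of the building blocks named above, soft voting and cosine similarity, are small enough to sketch directly. The classifier probabilities below are made up for illustration, and the function names are assumptions; the abstract does not list which classifiers or weights the ensemble uses.

```python
import math


def soft_vote(probabilities, weights=None):
    """Average per-class probabilities across classifiers; return (label, averages)."""
    weights = weights or [1.0] * len(probabilities)
    total_weight = sum(weights)
    n_classes = len(probabilities[0])
    averaged = [
        sum(w * p[k] for w, p in zip(weights, probabilities)) / total_weight
        for k in range(n_classes)
    ]
    return averaged.index(max(averaged)), averaged


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical class probabilities from three classifiers for one leaf image.
probabilities = [
    [0.90, 0.10, 0.00],  # classifier A is confident in class 0
    [0.40, 0.60, 0.00],  # classifier B leans towards class 1
    [0.45, 0.55, 0.00],  # classifier C leans towards class 1
]
label, averaged = soft_vote(probabilities)
print(label)  # 0: a majority (hard) vote would have picked class 1
```

The example shows why soft voting can differ from majority voting: one confident classifier outweighs two marginal ones. Cosine similarity between mean feature vectors of two species is one plausible way to quantify how visually close those classes are.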
MethodsX, Volume 15, Article 103652 (Journal Article).
Citations: 0
Deep learning approach with ConvNeXt-SE-attn model for in vitro oral squamous cell carcinoma and chemotherapy analysis
IF 1.6 Q2 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-12-01 Epub Date: 2025-07-17 DOI: 10.1016/j.mex.2025.103519
Abhay Nath , Om Roy , Priyanka Silveri , Sanskruti Patel
Oral squamous cell carcinoma (OSCC) remains a major worldwide healthcare problem because patients have poor survival outcomes and frequent disease recurrence. Globocan estimates that OSCC accounted for 389,846 new cases and 188,438 deaths globally in 2022, with an extremely poor 5-year survival rate of about 50%. Our method combines residual connections with Squeeze-and-Excitation blocks, hybrid attention, enhanced activation functions, and optimization algorithms to improve gradient flow throughout feature extraction. Compared against established conventional CNN backbones (VGG16, ResNet50, DenseNet121, and more), the proposed ConvNeXt-SE-Attn model outperformed them in all aspects of discrimination and calibration, including precision 97.88% (vs. ≤94.2%), sensitivity 96.82% (vs. ≤92.5%), specificity 95.94% (vs. ≤93.1%), F1 score 97.31% (vs. ≤93.8%), AUC 0.9644 (vs. ≤0.945), and MCC 0.9397 (vs. ≤0.910). These findings reflect the architecture's increased feature-representation power and classification robustness.
  • The proposed architecture employs a ConvNeXt backbone with SE blocks and hybrid attention to extract essential details within class boundaries which standard models usually miss.
  • The activation through Gaussian-based GReLU incorporates Swish activation together with DropPath regularization for producing smooth gradient patterns which lead to generalizable features across imbalanced datasets.
  • Grad-CAM enhances interpretability by showing which image sections lead to predictions in order to enable clinical decisions.
  • The model demonstrates its capability as an effective detection method for minimal variations in oral cells which supports precise non-invasive treatment approaches for OSCC.
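The reported precision, sensitivity, specificity, F1 score, and MCC all derive from a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and chosen only to make the arithmetic easy to check, not taken from the paper.

```python
import math


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = safe_div(tp, tp + fp)
    sensitivity = safe_div(tp, tp + fn)  # also called recall
    specificity = safe_div(tn, tn + fp)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    mcc = safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts: 40 true positives, 10 false positives,
# 40 true negatives, 10 false negatives.
for name, value in binary_metrics(tp=40, fp=10, tn=40, fn=10).items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the confusion matrix, so it stays informative on imbalanced data where accuracy or F1 alone can look optimistic. For the counts above it is 0.600 while the other four metrics are 0.800.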
MethodsX, Volume 15, Article 103519 (Journal Article).
Citations: 0
Simulating forage plantain growth and defoliation in APSIM
IF 1.9 Q2 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-12-01 Epub Date: 2025-07-23 DOI: 10.1016/j.mex.2025.103530
Rogerio Cichota, Xiumei Yang
This article presents the methodology used to develop and test a crop system model that simulates growth and defoliation management of plantain forage (Plantago lanceolata L.) in the Agricultural Production Systems Simulator (APSIM) framework. The model was developed primarily from New Zealand data, but it should be applicable in other regions and can be extended to account for new cultivars released on the market. It can simulate pure forage plantain stands and can also be used in mixed swards with other forage species within the APSIM framework. Validation tests at the current stage focused on pure forage plantain stands in New Zealand; the model performed well when predicting biomass accumulation over a growing season and nitrogen content (NSE > 0.8), although predictions for individual defoliation events were less accurate (NSE between -1.0 and 0.6). More data are needed to improve how the model describes biomass partitioning among plant organs and how this is affected by environmental conditions and defoliation. This model contributes to ongoing efforts to explore alternative crops and improve management of grazing systems, aiming to reduce environmental impacts while maintaining productivity.
  • Data to describe various aspects of plantain forage were gathered from literature and field trials
  • Model built to simulate plant growth and re-growth after defoliations within APSIM
  • The model can be used in monoculture and mixed swards
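The NSE values quoted in the abstract are read here as Nash-Sutcliffe efficiency, the usual meaning of the acronym in crop and hydrological model evaluation; the abstract does not expand it, so that reading is an assumption. Under that assumption, a minimal sketch of the statistic follows. The function name and the sample numbers are hypothetical and are not data from the paper.

```python
from typing import Sequence


def nash_sutcliffe_efficiency(observed: Sequence[float],
                              simulated: Sequence[float]) -> float:
    """Return NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect fit, 0.0 means the model is no better than the
    observed mean, and negative values mean it is worse than the mean.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    residual_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    total_ss = sum((o - mean_obs) ** 2 for o in observed)
    if total_ss == 0:
        raise ValueError("NSE is undefined when all observations are identical")
    return 1.0 - residual_ss / total_ss


# Hypothetical values for illustration only (not from the study).
observed = [1.0, 2.0, 3.0, 4.0]
good_fit = [1.1, 1.9, 3.2, 3.9]
poor_fit = [4.0, 1.0, 4.0, 1.0]

nse_good = nash_sutcliffe_efficiency(observed, good_fit)  # about 0.986
nse_poor = nash_sutcliffe_efficiency(observed, poor_fit)  # -3.0
```

Read this way, the reported NSE above 0.8 for seasonal biomass would indicate predictions much better than the observed mean, while values near -1.0 for single defoliation events would indicate predictions worse than simply using that mean.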
Citations: 0
Automating content analysis of scientific abstracts using ChatGPT: A methodological protocol and use case
IF 1.6 Q2 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-12-01 Epub Date: 2025-06-13 DOI: 10.1016/j.mex.2025.103431
Adrián Domínguez-Diaz , Manuel Goyanes , Luis de-Marcos
This paper presents a protocol for using ChatGPT to perform content analysis. The protocol converts a codebook (which outlines categories, descriptions, coding rules, and possible values) into a structured prompt that guides ChatGPT's analysis. The protocol was validated by analyzing 980 research articles to identify their research approaches and data collection methods. ChatGPT performed well in identifying data collection methods but faced challenges with poorly defined or underrepresented categories, particularly in mixed methods research. Overall, while it scored well for quantitative (0.96) and qualitative (0.82) studies, it struggled with mixed methods (0.60), highlighting the need for clear methodological definitions.
  • The protocol enhances coding efficiency and demonstrates the feasibility of using AI for content analysis, potentially streamlining the coding process in research.
  • Challenges arose in categories that were not clearly defined (big data), underrepresented (ethnography), or hierarchically related (Interview & Discourse/Textual analysis).
  • Interrater metrics indicated a substantial level of agreement, reinforcing the potential of ChatGPT in content analysis while emphasizing the importance of clear methodological definitions.
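The codebook-to-prompt step described in the abstract can be illustrated with a short sketch. This is not the authors' prompt or code: the `CodebookEntry` fields, the `build_coding_prompt` function, the prompt wording, and the sample codebook are all hypothetical, chosen only to mirror the four codebook elements the abstract names (categories, descriptions, coding rules, possible values). No ChatGPT API call is made; the sketch only builds the prompt text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CodebookEntry:
    """One codebook row; the field names are illustrative assumptions."""
    category: str
    description: str
    rule: str
    values: List[str]


def build_coding_prompt(codebook: List[CodebookEntry], abstract: str) -> str:
    """Render the codebook and one abstract as a single structured prompt."""
    lines = [
        "Code the abstract below using only the codebook categories.",
        "Return one value per category.",
        "",
    ]
    for number, entry in enumerate(codebook, start=1):
        lines.append(f"Category {number}: {entry.category}")
        lines.append(f"  Description: {entry.description}")
        lines.append(f"  Coding rule: {entry.rule}")
        lines.append(f"  Possible values: {', '.join(entry.values)}")
        lines.append("")
    lines.append("Abstract:")
    lines.append(abstract)
    return "\n".join(lines)


# Hypothetical codebook and abstract, for illustration only.
codebook = [
    CodebookEntry(
        category="Research approach",
        description="Overall methodological approach of the study.",
        rule="Choose 'mixed' only if both kinds of data are analyzed.",
        values=["quantitative", "qualitative", "mixed"],
    ),
]
abstract_text = "We surveyed 500 respondents and ran regression models."
prompt = build_coding_prompt(codebook, abstract_text)
```

Keeping the coding rule next to each category's allowed values is one way to make category boundaries explicit, which the abstract identifies as the main source of error for poorly defined categories.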
Citations: 0
Data preparation method for machine learning-based breast cancer risk prediction: A Cuban case study
IF 1.9 Q2 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-12-01 Epub Date: 2025-10-28 DOI: 10.1016/j.mex.2025.103688
Jose Manuel Valencia-Moreno, Everardo Gutierrez-Lopez, Jose Angel Gonzalez-Fraga, Rodolfo Alan Martinez Rodriguez, Olivia Denisse Victoria Mejia, Alma Alejandra Soberano Serrano
This article presents a dataset of breast cancer risk factors collected from 1697 Cuban women between 2001 and 2018, intended as a resource to support the design, development, and validation of public-health predictive models for breast cancer risk. A reproducible methodology for quality control and variable enrichment was implemented to ensure data integrity and compatibility with machine learning techniques.
• Reproducible preprocessing methodology to ensure data quality and traceability.
• Open breast cancer risk factor dataset for epidemiological studies and risk assessment using machine learning.
• Consistent prediction model performance across multiple metrics after data preprocessing.
Citations: 0
Missing data imputation of climate time series: A review
IF 1.6 Q2 MULTIDISCIPLINARY SCIENCES Pub Date : 2025-12-01 Epub Date: 2025-06-19 DOI: 10.1016/j.mex.2025.103455
Lizette Elena Alejo-Sanchez , Aldo Márquez-Grajales , Fernando Salas-Martínez , Anilu Franco-Arcega , Virgilio López-Morales , Otilio Arturo Acevedo-Sandoval , César Abelardo González-Ramírez , Ramiro Villegas-Vega
Missing data in climate time series is a significant problem because it complicates the monitoring and prediction of climatic phenomena. The primary objective of this review is to describe the most relevant imputation methods for missing data in the climate context over the last decade. Results reveal a higher concentration of publications on imputation methods for climate time series in Asia and Europe, with notable examples from Malaysia, China, and Italy. Brazil and Australia accounted for a high number of studies in America and Oceania, respectively. Temperature and precipitation were the most frequently employed climate variables, and monitoring networks were the most commonly used data source in almost all the studies. Conventional statistical techniques used for imputing missing data included mean-based methods, simple and multiple linear regression, interpolation, and Principal Component Analysis (PCA). Artificial neural networks demonstrated the ability to identify complex patterns in the data. Finally, Generative Adversarial Networks outperformed other deep learning methods in the imputation of missing climate data.
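Two of the conventional techniques the review names, mean imputation and interpolation, can be sketched in a few lines. This is a generic illustration under stated assumptions, not a method taken from any reviewed study: the function names, the use of `None` to mark gaps, the nearest-value fill at the series edges, and the sample temperatures are all hypothetical.

```python
from typing import List, Optional

Series = List[Optional[float]]  # None marks a missing observation


def impute_mean(series: Series) -> List[float]:
    """Replace every gap with the mean of the observed values."""
    known = [v for v in series if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in series]


def impute_linear(series: Series) -> List[float]:
    """Fill interior gaps by linear interpolation between neighbours.

    Leading and trailing gaps are filled with the nearest observed
    value, which is an assumption of this sketch.
    """
    known_idx = [i for i, v in enumerate(series) if v is not None]
    if not known_idx:
        raise ValueError("series has no observed values")
    result = list(series)
    first, last = known_idx[0], known_idx[-1]
    for i in range(first):
        result[i] = series[first]
    for i in range(last + 1, len(series)):
        result[i] = series[last]
    for left, right in zip(known_idx, known_idx[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = series[left] + step * (i - left)
    return result


# Hypothetical daily temperatures in degrees Celsius with gaps.
temperatures = [20.0, None, None, 26.0, None, 24.0]
by_mean = impute_mean(temperatures)
by_line = impute_linear(temperatures)  # [20.0, 22.0, 24.0, 26.0, 25.0, 24.0]
```

Mean imputation flattens every gap to one value and so ignores trend and seasonality, whereas interpolation follows the local trend but cannot recover variability inside long gaps; regression, PCA, and neural approaches are typically used to address those limits.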
Citations: 0