1.
Int. j. morphol ; 42(4): 970-976, Aug. 2024. illus, tab
Article in English | LILACS | ID: biblio-1569272

ABSTRACT

SUMMARY: Since machine learning algorithms give more reliable results, they have been used in the field of health in recent years. Orbital variables give very successful results in classifying sex correctly. This research focused on sex determination using certain variables obtained from orbital computed tomography (CT) images by using machine learning (ML) algorithms. In this study, 12 variables determined on 600 orbital images of 300 individuals (150 men and 150 women) were tested with different ML algorithms. The Decision Tree (DT), K-Nearest Neighbour (KNN), Logistic Regression (LR), Random Forest (RF), Linear Discriminant Analysis (LDA), and Naive Bayes (NB) algorithms were used for supervised learning. Statistical analyses of the variables were conducted with the Minitab® 21.2 (64-bit) program. The accuracy (ACC) of the NB, DT, KNN, and LR algorithms was 83%, while the ACC of the LDA and RF algorithms was 85%. According to SHAP analysis, the variable with the greatest effect was BOW. The study determined sex with high accuracy (0.83 and 0.85) using variables from orbital CT images, and the related morphometric data of the population in question were acquired, emphasizing racial variation.




Subjects
Humans , Male , Female , Orbit/diagnostic imaging , Tomography, X-Ray Computed , Sex Determination by Skeleton , Machine Learning , Orbit/anatomy & histology , Algorithms , Logistic Models , Forensic Anthropology , Imaging, Three-Dimensional
2.
Parasit Vectors ; 17(1): 329, 2024 Aug 02.
Article in English | MEDLINE | ID: mdl-39095920

ABSTRACT

BACKGROUND: Identifying mosquito vectors is crucial for controlling diseases. Automated identification studies using convolutional neural networks (CNNs) have been conducted for some urban mosquito vectors but not yet for sylvatic mosquito vectors that transmit yellow fever. We evaluated the ability of the AlexNet CNN to identify four mosquito species: Aedes serratus, Aedes scapularis, Haemagogus leucocelaenus and Sabethes albiprivus and whether there is variation in AlexNet's ability to classify mosquitoes based on pictures of four different body regions. METHODS: The specimens were photographed using a cell phone connected to a stereoscope. Photographs were taken of the full-body, pronotum and lateral view of the thorax, which were pre-processed to train the AlexNet algorithm. The evaluation was based on the confusion matrix, the accuracy (ten pseudo-replicates) and the confidence interval for each experiment. RESULTS: Our study found that AlexNet can identify mosquito pictures of the genera Aedes, Sabethes and Haemagogus with over 90% accuracy. Furthermore, the algorithm's performance did not change according to the body region submitted. It is worth noting that the state of preservation of the mosquitoes, which were often damaged, may have affected the network's ability to differentiate between these species, and thus accuracy rates could have been even higher. CONCLUSIONS: Our results support the idea of applying CNNs for artificial intelligence (AI)-driven identification of mosquito vectors of tropical diseases. This approach can potentially be used in the surveillance of yellow fever vectors by health services and the population as well.
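The evaluation described in this abstract (confusion matrix, accuracy, confidence interval) can be illustrated with a minimal sketch. The labels below are hypothetical and are not data from the study, and the normal-approximation (Wald) interval is an assumption, since the abstract does not say how the confidence interval was computed.

```python
import math
from collections import Counter

def confusion_matrix(true_labels, predicted_labels):
    """Count how often each (true, predicted) label pair occurs."""
    return Counter(zip(true_labels, predicted_labels))

def accuracy_with_ci(true_labels, predicted_labels, z=1.96):
    """Accuracy with a normal-approximation (Wald) interval, clipped to [0, 1]."""
    n = len(true_labels)
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    acc = correct / n
    half_width = z * math.sqrt(acc * (1 - acc) / n)
    return acc, max(0.0, acc - half_width), min(1.0, acc + half_width)

# Hypothetical genus labels, for illustration only.
y_true = ["Aedes", "Aedes", "Sabethes", "Haemagogus", "Sabethes", "Haemagogus"]
y_pred = ["Aedes", "Sabethes", "Sabethes", "Haemagogus", "Sabethes", "Haemagogus"]

cm = confusion_matrix(y_true, y_pred)
acc, ci_low, ci_high = accuracy_with_ci(y_true, y_pred)
```

With so few samples the Wald interval is very wide; an exact or Wilson interval would be a more careful choice for small test sets.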


Subjects
Aedes , Mosquito Vectors , Neural Networks, Computer , Yellow Fever , Animals , Mosquito Vectors/classification , Yellow Fever/transmission , Aedes/classification , Aedes/physiology , Algorithms , Image Processing, Computer-Assisted/methods , Culicidae/classification , Artificial Intelligence
3.
Data Brief ; 55: 110678, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39100781

ABSTRACT

In recent years, there has been significant growth in the development of Machine Learning (ML) models across various fields, such as image and sound recognition and natural language processing. They need to be trained with a large enough data set, ensuring predictions or results are as accurate as possible. When it comes to models for audio recognition, specifically the detection of car horns, the datasets are generally not built considering the specificities of the different scenarios that may exist in real traffic, being limited to collections of random horns, whose sources are sometimes collected from audio streaming sites. There are benefits associated with an ML model trained on data tailored for horn detection. One notable advantage is the potential implementation of horn detection in smartphones and smartwatches equipped with embedded models to aid hearing-impaired individuals while driving and alert them in potentially hazardous situations, thus promoting social inclusion. Given these considerations, we developed a dataset specifically for car horns. This dataset has 1,080 one-second-long .wav audio files categorized into two classes: horn and not horn. The data collection followed a carefully established protocol designed to encompass different scenarios in a real traffic environment, considering diverse relative positions between the involved vehicles. The protocol defines ten distinct scenarios, incorporating variables within the car receiving the horn, including the presence of internal conversations, music, open or closed windows, engine status (on or off), and whether the car is stationary or in motion. Additionally, there are variations in scenarios associated with the vehicle emitting the horn, such as its relative position (behind, alongside, or in front of the receiving vehicle) and the types of horns used, which may include a short honk, a prolonged one, or a rhythmic pattern of three quick honks. 
The data collection process started with simultaneous audio recordings on two smartphones positioned inside the receiving vehicle, capturing all scenarios in a single audio file on each device. A 400-meter route was defined in a controlled area, so the audio recordings could be carried out safely. For each established scenario, the route was covered with emissions of different types of horns in distinct positions between the vehicles, and then the route was restarted in the next scenario. After the collection phase, the data preprocessing involved manually cutting each horn sound into multiple one-second windows, saving them in PCM stereo .wav files with a 16-bit depth and a 44.1 kHz sampling rate. For each horn clip, a corresponding non-horn clip in close proximity was extracted, ensuring a balanced dataset. This dataset was designed for use with various machine learning algorithms, whether for detecting horns with the binary labels, or classifying different patterns of horns by rearranging labels according to the file nomenclature. In technical validation, classifications were performed using a convolutional neural network trained with spectrograms from the dataset's audio, achieving an average accuracy of 89% across 100 trained models.
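The one-second windowing and the stated audio format (stereo PCM, 16-bit, 44.1 kHz) can be made concrete with a short sketch. It assumes non-overlapping windows and drops any trailing partial window; the abstract describes manual cutting, so this is only an illustration of the arithmetic, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 44_100  # sampling rate stated in the abstract
BIT_DEPTH = 16           # bits per sample, stated in the abstract
CHANNELS = 2             # stereo, stated in the abstract

def window_bounds(n_frames, sample_rate=SAMPLE_RATE_HZ, window_s=1.0):
    """Return (start, end) frame indices of consecutive full windows.

    Assumption: windows do not overlap, and a trailing partial window is dropped.
    """
    frames_per_window = int(sample_rate * window_s)
    last_start = n_frames - frames_per_window
    return [(start, start + frames_per_window)
            for start in range(0, last_start + 1, frames_per_window)]

def pcm_bytes_per_window(sample_rate=SAMPLE_RATE_HZ, channels=CHANNELS,
                         bit_depth=BIT_DEPTH, window_s=1.0):
    """Size in bytes of the raw PCM payload of one window (header excluded)."""
    return int(sample_rate * window_s) * channels * (bit_depth // 8)

# A hypothetical recording slightly longer than three seconds.
bounds = window_bounds(3 * SAMPLE_RATE_HZ + 100)
payload = pcm_bytes_per_window()
```

Under these assumptions each one-second clip carries 44,100 frames x 2 channels x 2 bytes = 176,400 bytes of audio data.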

4.
Front Plant Sci ; 15: 1373318, 2024.
Article in English | MEDLINE | ID: mdl-39086911

ABSTRACT

Coffee breeding programs have traditionally relied on observing plant characteristics over years, a slow and costly process. Genomic selection (GS) offers a DNA-based alternative for faster selection of superior cultivars. Stacking Ensemble Learning (SEL) combines multiple models for potentially even more accurate selection. This study explores the potential of SEL in coffee breeding, aiming to improve prediction accuracy for important traits [yield (YL), total number of fruits (NF), leaf miner infestation (LM), and cercosporiosis incidence (Cer)] in Coffea arabica. We analyzed data from 195 individuals genotyped for 21,211 single-nucleotide polymorphism (SNP) markers. To comprehensively assess model performance, we employed a cross-validation (CV) scheme. Genomic Best Linear Unbiased Prediction (GBLUP), multivariate adaptive regression splines (MARS), Quantile Random Forest (QRF), and Random Forest (RF) served as base learners. For the meta-learner within the SEL framework, various options were explored, including Ridge Regression, RF, GBLUP, and Single Average. The SEL method was able to predict important traits in Coffea arabica, as measured by predictive ability (PA). SEL presented higher PA than all base learner methods. The gains in PA in relation to GBLUP were 87.44% (the ratio between the PA obtained from the best Stacking model and the GBLUP), 37.83%, 199.82%, and 14.59% for YL, NF, LM and Cer, respectively. Overall, SEL presents a promising approach for GS. By combining predictions from multiple models, SEL can potentially enhance the PA of GS for complex traits.
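As a rough illustration of stacking with an averaging meta-learner, the sketch below combines the predictions of two hypothetical base learners and scores the result. Two points are assumptions not confirmed by the abstract: predictive ability is taken to be the Pearson correlation between predicted and observed values, and the gain is computed as a relative percentage difference. All numbers are made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def average_stack(base_predictions):
    """Averaging meta-learner: mean of the base-learner predictions per individual."""
    return [sum(column) / len(column) for column in zip(*base_predictions)]

def relative_gain_pct(pa_new, pa_reference):
    """Relative gain of pa_new over pa_reference, in percent (assumed definition)."""
    return 100.0 * (pa_new - pa_reference) / pa_reference

# Hypothetical observed phenotypes and base-learner predictions.
observed = [1.0, 2.0, 3.0, 4.0]
learner_a = [1.5, 1.0, 3.5, 3.0]
learner_b = [0.5, 3.0, 2.5, 5.0]

stacked = average_stack([learner_a, learner_b])
pa_stacked = pearson(stacked, observed)
pa_a = pearson(learner_a, observed)
gain = relative_gain_pct(pa_stacked, pa_a)
```

An averaging meta-learner needs no training; trained meta-learners such as ridge regression are normally fitted on out-of-fold base predictions to avoid leakage.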

5.
Curr Med Chem ; 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39092736

ABSTRACT

BACKGROUND: Computational assessment of the energetics of protein-ligand complexes is a challenge in the early stages of drug discovery. Previous comparative studies on computational methods to calculate the binding affinity showed that targeted scoring functions outperform universal models. OBJECTIVE: The goal here is to review the application of a simple physics-based model to estimate the binding. The focus is on a mass-spring system developed to predict binding affinity against cyclin-dependent kinase. METHOD: Publications in PubMed were searched to find mass-spring models to predict binding affinity. Crystal structures of cyclin-dependent kinases found in the protein data bank and two web servers to calculate affinity based on the atomic coordinates were employed. RESULTS: One recent study showed how a simple physics-based scoring function (named Taba) could contribute to the analysis of protein-ligand interactions. Taba methodology outperforms robust physics-based models implemented in docking programs such as AutoDock4 and Molegro Virtual Docker. Predictive metrics of 27 scoring functions and energy terms highlight the superior performance of the Taba scoring function for cyclin-dependent kinase. CONCLUSION: The recent progress of machine learning methods and the availability of these techniques through free libraries boosted the development of more accurate models to address protein-ligand interactions. Combining a naïve mass-spring system with machine-learning techniques generated a targeted scoring function with superior predictive performance to estimate pKi.

6.
JMIR Res Protoc ; 13: e55466, 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39133913

ABSTRACT

BACKGROUND: The use of technologies has had a significant impact on patient safety and the quality of care and has increased globally. In the literature, it has been reported that people die annually due to adverse events (AEs), and various methods exist for investigating and measuring AEs. However, some methods are limited in scope and data extraction and require data standardization. In Brazil, there are few studies on the application of trigger tools, and this study is the first to create automated triggers in ambulatory care. OBJECTIVE: This study aims to develop a machine learning (ML)-based automated trigger for outpatient health care settings in Brazil. METHODS: A mixed methods study will be conducted within a design thinking framework, and its principles will be applied in creating the automated triggers, following the stages of (1) empathize and define the problem, involving observations and inquiries to comprehend both the user and the challenge at hand; (2) ideation, where various solutions to the problem are generated; (3) prototyping, involving the construction of a minimal representation of the best solutions; (4) testing, where user feedback is obtained to refine the solution; and (5) implementation, where the refined solution is tested, changes are assessed, and scaling is considered. Furthermore, ML methods will be adopted to develop automated triggers, tailored to the local context in collaboration with an expert in the field. RESULTS: This protocol describes a research study in its preliminary stages, prior to any data gathering and analysis. The study was approved by the members of the organizations within the institution in January 2024 and by the ethics board of the University of São Paulo and the institution where the study will take place in May 2024. As of June 2024, stage 1 commenced with data gathering for qualitative research. 
A separate paper focused on explaining the method of ML will be considered after the outcomes of stages 1 and 2 in this study. CONCLUSIONS: After the development of automated triggers in the outpatient setting, it will be possible to prevent and identify potential risks of AEs more promptly, providing valuable information. This technological innovation not only promotes advances in clinical practice but also contributes to the dissemination of techniques and knowledge related to patient safety. Additionally, health care professionals can adopt evidence-based preventive measures, reducing costs associated with AEs and hospital readmissions, enhancing productivity in outpatient care, and contributing to the safety, quality, and effectiveness of care provided. Additionally, in the future, if the outcome is successful, there is the potential to apply it in all units, as planned by the institutional organization. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): PRR1-10.2196/55466.


Subjects
Ambulatory Care , Machine Learning , Humans , Brazil , Patient Safety
7.
Sensors (Basel) ; 24(15)2024 Jul 31.
Article in English | MEDLINE | ID: mdl-39124011

ABSTRACT

Load recognition has not been comprehensively explored in Home Energy Management Systems (HEMSs). There are gaps in current approaches to load recognition, such as enhancing appliance identification and increasing the overall performance of the load-recognition system through more robust models. To address this issue, we propose a novel approach based on the Analysis of Variance (ANOVA) F-test combined with SelectKBest and gradient-boosting machines (GBMs) for load recognition. The proposed approach improves the feature selection and consequently aids inter-class separability. Further, we optimized GBM models, such as the histogram-based gradient-boosting machine (HistGBM), light gradient-boosting machine (LightGBM), and XGBoost (extreme gradient boosting), to create a more reliable load-recognition system. Our findings reveal that the ANOVA-GBM approach achieves greater efficiency in training time, even when compared to Principal Component Analysis (PCA) and a higher number of features. ANOVA-XGBoost is approximately 4.31 times faster than PCA-XGBoost, ANOVA-LightGBM is about 5.15 times faster than PCA-LightGBM, and ANOVA-HistGBM is 2.27 times faster than PCA-HistGBM. The general performance results expose the impact on the overall performance of the load-recognition system. Some of the key results show that the ANOVA-LightGBM pair reached 96.42% accuracy, 96.27% F1, and a Kappa index of 0.9404; the ANOVA-HistGBM combination achieved 96.64% accuracy, 96.48% F1, and a Kappa index of 0.9434; and the ANOVA-XGBoost pair attained 96.75% accuracy, 96.64% F1, and a Kappa index of 0.9452; these results outperform rival methods from the literature. In addition, the accuracy gain of the proposed approach is prominent when compared directly with its competitors. The highest accuracy gains were 13.09, 13.31, and 13.42 percentage points (pp) for the pairs ANOVA-LightGBM, ANOVA-HistGBM, and ANOVA-XGBoost, respectively. 
These significant improvements highlight the effectiveness and refinement of the proposed approach.
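The feature-selection step named in this abstract (ANOVA F-test with k-best selection) can be sketched in plain Python. This is the textbook one-way ANOVA F-statistic computed per feature, not the authors' pipeline; the appliance names and readings are hypothetical, and the sketch assumes every feature has nonzero within-class variance.

```python
def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    # Assumption: ss_within > 0, otherwise the ratio is undefined.
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def select_k_best(rows, labels, k):
    """Return the indices of the k features with the highest F-statistic."""
    n_features = len(rows[0])
    scores = [anova_f([row[j] for row in rows], labels)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores

# Hypothetical two-feature readings for two appliance classes.
labels = ["kettle", "kettle", "fridge", "fridge"]
rows = [[10.0, 5.0], [11.0, 1.0], [1.0, 4.0], [2.0, 2.0]]

selected, scores = select_k_best(rows, labels, k=1)
```

Feature 0 separates the two classes cleanly and gets a large F-statistic, while feature 1 has identical class means and scores zero, so only feature 0 is kept. Unlike PCA, this keeps original features rather than building new components.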

8.
Diagnostics (Basel) ; 14(15)2024 Jul 27.
Article in English | MEDLINE | ID: mdl-39125499

ABSTRACT

Type 2 diabetes mellitus (T2DM) is one of the most common metabolic diseases in the world and poses a significant public health challenge. Early detection and management of this metabolic disorder is crucial to prevent complications and improve outcomes. This paper aims to find core differences in male and female markers to detect T2DM by their clinical and anthropometric features, seeking out ranges in potential biomarkers identified to provide useful information as a pre-diagnostic tool while excluding glucose-related biomarkers using machine learning (ML) models. We used a dataset containing clinical and anthropometric variables from patients diagnosed with T2DM and patients without T2DM as control. We applied feature selection with three different techniques to identify relevant biomarker models: an improved recursive feature elimination (RFE) evaluating each set from all the features to one feature with the Akaike information criterion (AIC) to find optimal outputs; Least Absolute Shrinkage and Selection Operator (LASSO) with glmnet; and Genetic Algorithms (GA) with GALGO and forward selection (FS) applied to GALGO output. We then used these for comparison with the AIC to measure the performance of each technique and collect the optimal set of global features. Then, an implementation and comparison of five different ML models was carried out to identify the most accurate and interpretable one, considering the following models: logistic regression (LR), artificial neural network (ANN), support vector machine (SVM), k-nearest neighbors (KNN), and nearest centroid (Nearcent). The models were then combined in an ensemble to provide a more robust approximation. The results showed that potential biomarkers such as systolic blood pressure (SBP) and triglycerides are together significantly associated with T2DM. 
This approach also identified triglycerides, cholesterol, and diastolic blood pressure as biomarkers with differences between male and female subjects that have not been previously reported in the literature. The most accurate ML model used RFE feature selection with random forest (RF) as the estimator, improved with the AIC, and achieved an accuracy of 0.8820. In conclusion, this study demonstrates the potential of ML models in identifying potential biomarkers for early detection of T2DM, excluding glucose-related biomarkers, as well as differences between male and female anthropometric and clinical profiles. These findings may help to improve early detection and management of T2DM by accounting for differences between male and female subjects in terms of anthropometric and clinical profiles, potentially reducing healthcare costs and improving personalized patient attention. Further research is needed to validate these potential biomarker ranges in other populations and clinical settings.
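The role of the Akaike information criterion in comparing candidate feature sets can be shown with a small worked example. It uses the standard definition AIC = 2k - 2 ln(L), where k is the number of fitted parameters and ln(L) the maximized log-likelihood. The feature sets and log-likelihood values are hypothetical and are not results from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical candidates: feature set -> (maximized log-likelihood, parameter count).
candidates = {
    ("SBP",): (-130.0, 2),
    ("SBP", "triglycerides"): (-120.0, 3),
    ("SBP", "triglycerides", "cholesterol"): (-119.5, 4),
}

aic_by_set = {features: aic(ll, k) for features, (ll, k) in candidates.items()}
best_set = min(aic_by_set, key=aic_by_set.get)
```

In this toy case the third feature raises the log-likelihood by only 0.5, which does not offset the penalty of 2 for the extra parameter, so the two-feature set is preferred.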

9.
BMC Public Health ; 24(1): 2131, 2024 Aug 06.
Article in English | MEDLINE | ID: mdl-39107721

ABSTRACT

BACKGROUND: The temporal relationships across cardiometabolic diseases (CMDs) were recently conceptualized as the cardiometabolic continuum (CMC), a sequence of cardiovascular events that stem from gene-environmental interactions, unhealthy lifestyle influences, and metabolic diseases such as diabetes and hypertension. While the physiological pathways linking metabolic and cardiovascular diseases have been investigated, sex and population differences in the CMC have not yet been described. METHODS: We present a machine learning approach to model the CMC and investigate sex and population differences in two distinct cohorts: the UK Biobank (17,700 participants) and the Brazilian Longitudinal Study of Adult Health (ELSA-Brasil) (7162 participants). We consider the following CMDs: hypertension (Hyp), diabetes (DM), heart diseases (HD: angina, myocardial infarction, or heart failure), and stroke (STK). For the identification of the CMC patterns, individual trajectories with the time of disease occurrence were clustered using k-means. Based on clinical, sociodemographic, and lifestyle characteristics, we built multiclass random forest classifiers and used the SHAP methodology to evaluate feature importance. RESULTS: Five CMC patterns were identified across both sexes and cohorts: EarlyHyp, FirstDM, FirstHD, Healthy, and LateHyp, named according to prevalence and disease occurrence time that depicted around 95%, 78%, 75%, 88% and 99% of individuals, respectively. Within the UK Biobank, more women were classified in the Healthy cluster and more men in all others. In the EarlyHyp and LateHyp clusters, isolated hypertension occurred earlier among women. Smoking habits and education had high importance and clear directionality for both sexes. For ELSA-Brasil, more men were classified in the Healthy cluster and more women in the FirstDM. The diabetes occurrence time when followed by hypertension was lower among women. 
Education and ethnicity had high importance and clear directionality for women, while for men these features were smoking, alcohol, and coffee consumption. CONCLUSIONS: There are clear sex differences in the CMC that varied across the UK and Brazilian cohorts. In particular, disadvantages regarding incidence and the time to onset of diseases were more pronounced for women in Brazil. The results show the need to strengthen public health policies to prevent and control the time course of CMD, with an emphasis on women.
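The clustering step in this abstract (k-means over individual disease-onset trajectories) can be sketched with a minimal implementation of Lloyd's algorithm. Encoding each trajectory as a tuple of onset ages is an assumption made for illustration, as are the ages and the fixed initial centroids; the abstract does not describe the actual encoding.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))

def kmeans(points, initial_centroids, n_iter=10):
    """Lloyd's algorithm with caller-supplied initial centroids (deterministic)."""
    centroids = list(initial_centroids)
    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest(point, centroids)].append(point)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, [nearest(point, centroids) for point in points]

# Hypothetical (age at hypertension onset, age at diabetes onset) pairs.
trajectories = [(40, 60), (42, 62), (65, 45), (67, 47)]
centroids, assignments = kmeans(trajectories, [(40, 60), (65, 45)])
```

Each resulting centroid summarizes a typical ordering and timing of disease onset, which is how named patterns of the kind reported in the abstract could be read off the clusters.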


Subjects
Cardiovascular Diseases , Machine Learning , Adult , Aged , Female , Humans , Male , Middle Aged , Brazil/epidemiology , Cardiometabolic Risk Factors , Cardiovascular Diseases/epidemiology , Cohort Studies , Longitudinal Studies , Sex Factors , UK Biobank , United Kingdom/epidemiology
10.
Ann Hepatol ; 29(6): 101540, 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39151891

ABSTRACT

INTRODUCTION AND OBJECTIVES: The increasing incidence of hepatocellular carcinoma (HCC) in China is an urgent issue, necessitating early diagnosis and treatment. This study aimed to develop personalized predictive models by combining machine learning (ML) technology with demographic, medical history, and noninvasive biomarker data. These models can enhance the decision-making capabilities of physicians for HCC in hepatitis B virus (HBV)-related cirrhosis patients with low serum alpha-fetoprotein (AFP) levels. PATIENTS AND METHODS: A total of 6,980 patients treated between January 2012 and December 2018 were included. Pre-treatment laboratory tests and clinical data were obtained. The significant risk factors for HCC were identified, and the relative risk of each variable affecting its diagnosis was calculated using ML and univariate regression analysis. The data set was then randomly partitioned into validation (20%) and training (80%) sets to develop the ML models. RESULTS: Twelve independent risk factors for HCC were identified using Gaussian naïve Bayes, extreme gradient boosting (XGBoost), random forest, and least absolute shrinkage and selection operator regression models. Multivariate analysis revealed that male sex, age >60 years, alkaline phosphatase >150 U/L, AFP >25 ng/mL, carcinoembryonic antigen >5 ng/mL, and fibrinogen >4 g/L were the risk factors, whereas hypertension, calcium <2.25 mmol/L, potassium ≤3.5 mmol/L, direct bilirubin >6.8 µmol/L, hemoglobin <110 g/L, and glutamic-pyruvic transaminase >40 U/L were the protective factors in HCC patients. Based on these factors, a nomogram was constructed, showing an area under the curve (AUC) of 0.746 (sensitivity = 0.710, specificity = 0.646), which was significantly higher than the AFP AUC of 0.658 (sensitivity = 0.462, specificity = 0.766). 
Compared with several ML algorithms, the XGBoost model had an AUC of 0.832 (sensitivity = 0.745, specificity = 0.766) and an independent validation AUC of 0.829 (sensitivity = 0.766, specificity = 0.737), making it the top-performing model in both sets. The external validation results have proven the accuracy of the XGBoost model. CONCLUSIONS: The proposed XGBoost demonstrated a promising ability for individualized prediction of HCC in HBV-related cirrhosis patients with low-level AFP.
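The three metrics reported in this abstract (AUC, sensitivity, specificity) can be computed from predicted scores as in the sketch below. The labels and scores are hypothetical, the 0.5 decision threshold is an assumption, and the AUC is computed through its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half.

```python
def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for t, s in zip(y_true, y_score) if t == 1 and s >= threshold)
    fn = sum(1 for t, s in zip(y_true, y_score) if t == 1 and s < threshold)
    tn = sum(1 for t, s in zip(y_true, y_score) if t == 0 and s < threshold)
    fp = sum(1 for t, s in zip(y_true, y_score) if t == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, y_score):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = HCC, 0 = no HCC) and model scores.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

sens, spec = sensitivity_specificity(y_true, y_score)
area = auc(y_true, y_score)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why abstracts of this kind report all three.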
