A machine learning approach using partitioning around medoids clustering and random forest classification to model groups of farms in regard to production parameters and bulk tank milk antibody status of two major internal parasites in dairy cows.

Fasciola hepatica and Ostertagia ostertagi are internal parasites of cattle compromising physiology, productivity, and well-being. Parasites are complex in their effect on hosts, sometimes making it difficult to identify clear directions of associations between infection and production parameters. T...

Bibliographic Details
Main Authors: Andreas W Oehm, Andrea Springer, Daniela Jordan, Christina Strube, Gabriela Knubben-Schweizer, Katharina Charlotte Jensen, Yury Zablotski
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2022-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0271413
author Andreas W Oehm
Andrea Springer
Daniela Jordan
Christina Strube
Gabriela Knubben-Schweizer
Katharina Charlotte Jensen
Yury Zablotski
author_sort Andreas W Oehm
collection DOAJ
description Fasciola hepatica and Ostertagia ostertagi are internal parasites of cattle compromising physiology, productivity, and well-being. Parasites are complex in their effect on hosts, sometimes making it difficult to identify clear directions of associations between infection and production parameters. Therefore, unsupervised approaches not assuming a structure reduce the risk of introducing bias to the analysis. They may provide insights which cannot be obtained with conventional, supervised methodology. An unsupervised, exploratory cluster analysis approach using the k-mode algorithm and partitioning around medoids detected two distinct clusters in a cross-sectional data set of milk yield, milk fat content, milk protein content as well as F. hepatica or O. ostertagi bulk tank milk antibody status from 606 dairy farms in three structurally different dairying regions in Germany. Parasite-positive farms grouped together with their respective production parameters to form separate clusters. A random forests algorithm characterised clusters with regard to external variables. Across all study regions, co-infections with F. hepatica or O. ostertagi, respectively, farming type, and pasture access appeared to be the most important factors discriminating clusters (i.e. farms). Furthermore, farm level lameness prevalence, herd size, BCS, stage of lactation, and somatic cell count were relevant criteria distinguishing clusters. This study is among the first to apply a cluster analysis approach in this context and potentially the first to implement a k-medoids algorithm and partitioning around medoids in the veterinary field. The results demonstrated that biologically relevant patterns of parasite status and milk parameters exist between farms positive for F. hepatica or O. ostertagi, respectively, and negative farms. Moreover, the machine learning approach confirmed results of previous work and shed further light on the complex setting of associations between parasitic diseases, milk yield and milk constituents, and management practices.
first_indexed 2024-12-11T16:31:36Z
format Article
id doaj.art-977a75624300468cb1a6f20c42a6a091
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-11T16:31:36Z
publishDate 2022-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
title A machine learning approach using partitioning around medoids clustering and random forest classification to model groups of farms in regard to production parameters and bulk tank milk antibody status of two major internal parasites in dairy cows.
url https://doi.org/10.1371/journal.pone.0271413
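
Illustrative sketch (not part of the catalogued article): the description names partitioning around medoids as the clustering step applied to milk yield, milk fat content, milk protein content, and bulk tank milk antibody status. The record does not state which software or dissimilarity measure the authors used, so the following is only a minimal sketch under stated assumptions: a Gower-style dissimilarity for mixed numeric and categorical fields, a greedy BUILD phase followed by a SWAP phase, and six invented farms. The function names `gower_distance`, `total_cost`, and `pam` are hypothetical, and the random forest step is not covered.

```python
from itertools import product

def gower_distance(a, b, ranges):
    """Gower-style dissimilarity between two mixed records (assumed measure).

    Numeric fields contribute |a - b| / range; categorical fields contribute
    0 when equal and 1 otherwise. ranges[i] is None for categorical fields.
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            total += 0.0 if x == y else 1.0
        else:
            total += abs(x - y) / r if r > 0 else 0.0
    return total / len(ranges)

def total_cost(dist, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)

def pam(dist, k):
    """Partitioning around medoids on a precomputed distance matrix."""
    n = len(dist)
    # BUILD: add, one at a time, the point that lowers the total cost most.
    medoids = []
    while len(medoids) < k:
        best = min((i for i in range(n) if i not in medoids),
                   key=lambda i: total_cost(dist, medoids + [i]))
        medoids.append(best)
    # SWAP: exchange a medoid for a non-medoid while that lowers the cost.
    improved = True
    while improved:
        improved = False
        current = total_cost(dist, medoids)
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(dist, trial)
            if cost < current:
                medoids, current, improved = trial, cost, True
    # Assign every point to its nearest medoid.
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels

# Invented example data: (milk yield in kg, fat %, protein %, antibody status).
farms = [
    (8500, 4.0, 3.4, "negative"),
    (8700, 4.1, 3.5, "negative"),
    (9000, 3.9, 3.4, "negative"),
    (6900, 4.3, 3.3, "positive"),
    (7100, 4.4, 3.2, "positive"),
    (7000, 4.2, 3.3, "positive"),
]

# Range of each numeric column; None marks the categorical column.
ranges = []
for col in zip(*farms):
    if isinstance(col[0], str):
        ranges.append(None)
    else:
        ranges.append(max(col) - min(col))

dist = [[gower_distance(a, b, ranges) for b in farms] for a in farms]
medoids, labels = pam(dist, k=2)
print("medoid farms:", medoids)
print("cluster labels:", labels)
```

Because each medoid is an actual observation, the farms at the indices in `medoids` can be read as the representative farm of each cluster. On real data the number of clusters would normally be chosen with a criterion such as silhouette width rather than fixed in advance.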