datasetpapers

Datasetpaper · microbial genomics

How metabolically reduced is the Bodo endosymbiont? A quantified test of host-symbiont complementarity

Version: ark:/99999/dp-bodo-endosymbiont.v1
Concept: ark:/99999/dp-bodo-endosymbiont
Source dataset: Single-cell sequencing of Bodo spp. flagellates and their bacterial endosymbionts

A compiled view of a research object (RO-Crate). The narrative is rendered from the object, not hand-edited.

A datasetpaper generated by protocol secondary-analysis 0.1 on Figshare dataset 10.6084/m9.figshare.31362613 (CC-BY-4.0).

Software agent: claude-opus-4-8. Data creators: Warring, McGowan, Kilias, Lipscombe, Alacid, Barker, Catchpole, Gharbi, McTaggart, Richards, Swarbreck, Hall. Question asked by: M. Hahnel. Trust distance: 0 (analysis on the raw source tables).

Summary

Endosymbiotic bacteria are widely expected to be metabolically reduced relative to their hosts, retaining a subset of pathways and depending on the host for the rest. We tested this quantitatively for the recently described endosymbiont Candidatus Bodocryptus vickermanii and its Bodo host (Bodo saltans), using the KEGG module completeness tables the authors deposited (Supplementary Files 12 and 13). Across the 106 KEGG modules present in both tables, the endosymbiont is only modestly less complete than the host (mean completeness 35.0% versus 41.1%), and the expected asymmetry, though directionally consistent and reproducible across single cells, does not reach statistical significance. The endosymbiont's metabolic repertoire is closer to the host's than a strong reductive-complementarity expectation would predict.

Provenance and methods

Inputs were the two KEGG module completeness tables from the source dataset, pinned by md5 and downloaded from Figshare: Supplementary File 12 (endosymbiont C.bv plus seven single-cell copies) and Supplementary File 13 (host genomes plus trypanosomatid outgroups). The two tables were joined on KEGG module_accession, yielding 106 shared modules. Completeness is a percentage from 0 to 100.
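The join described above can be sketched as follows. This is a minimal illustration, not the code in analysis.py: the `completeness` column name, the `shared_modules` helper, and the two placeholder accessions `M99991` and `M99992` are assumptions made for the example, while the M00034 and M00045 values are taken from tbl-1.

```python
import csv
import io


def shared_modules(symbiont_rows, host_rows, key="module_accession"):
    """Inner-join two lists of row dicts on `key`, keeping only shared modules."""
    host_by_key = {row[key]: row for row in host_rows}
    joined = []
    for row in symbiont_rows:
        match = host_by_key.get(row[key])
        if match is not None:  # module present in both tables
            joined.append({
                key: row[key],
                "symbiont": float(row["completeness"]),  # assumed column name
                "host": float(match["completeness"]),    # assumed column name
            })
    return joined


# Tiny stand-ins for Supplementary Files 12 and 13 (not the real files).
symbiont_csv = "module_accession,completeness\nM00034,25.0\nM00045,0.0\nM99991,10.0\n"
host_csv = "module_accession,completeness\nM00034,75.0\nM00045,100.0\nM99992,40.0\n"

symbiont_rows = list(csv.DictReader(io.StringIO(symbiont_csv)))
host_rows = list(csv.DictReader(io.StringIO(host_csv)))
joined = shared_modules(symbiont_rows, host_rows)
print(len(joined))  # modules present in both stand-in tables
```

An inner join drops modules that appear in only one table, which is why the shared set can be smaller than either input.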

Three hypotheses were pre-registered before execution: that host completeness is systematically higher than the endosymbiont's (H1), that host-only modules greatly outnumber symbiont-only modules (H2), and that the pattern is consistent across the single-cell pairs (H3). Three tests were declared: a paired Wilcoxon signed-rank test of host versus endosymbiont completeness across the shared modules (T1); a thresholded classification of each module into both, host-only, symbiont-only, or neither, with an exact McNemar test on the discordant modules and robustness across thresholds of 50, 67, and 80 percent (T2); and a per-cell repeat of the classification across the seven single-cell host-symbiont pairs (T3). All computation is deterministic; the code is in analysis.py.
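The thresholded classification in T2 might look like the sketch below. Two things here are assumptions rather than facts from the source: that a module counts as complete when its completeness is greater than or equal to the threshold (the source does not say whether the comparison is strict), and the label strings `both` and `symbiont_only`, which extend the `host_only` and `neither` labels visible in tbl-1. The example values are rows from tbl-1.

```python
from collections import Counter


def classify(host, symbiont, threshold=67.0):
    """Assign a module to one of four categories at a completeness threshold.

    Assumption: "complete" means completeness >= threshold.
    """
    host_ok = host >= threshold
    symbiont_ok = symbiont >= threshold
    if host_ok and symbiont_ok:
        return "both"
    if host_ok:
        return "host_only"
    if symbiont_ok:
        return "symbiont_only"
    return "neither"


# (host, symbiont) completeness pairs copied from tbl-1.
pairs = {
    "M00038": (85.71, 14.29),
    "M00035": (50.0, 25.0),
    "M00526": (16.67, 50.0),
}

for threshold in (50.0, 67.0, 80.0):
    counts = Counter(classify(h, s, threshold) for h, s in pairs.values())
    print(threshold, dict(counts))
```

The host-only and symbiont-only counts are the discordant cells that feed the McNemar test; modules classified as both or neither do not enter it.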

Data records

Three figures and one derived table were produced. fig-1 is the paired completeness scatter of host versus endosymbiont across the 106 shared modules. fig-2 is the module category breakdown at each threshold. fig-3 is the per-cell host-only and symbiont-only fractions. tbl-1 is the per-module classification table (module, pathway class, host and symbiont completeness, difference, and category at the 67 percent threshold), with a Frictionless Data Package descriptor.

Technical validation

The two source files were verified by md5 against the pinned manifest before use. The 106 shared modules were confirmed by an inner join on module_accession (123 modules in File 12, 201 in File 13). The primary test result is negative: the paired Wilcoxon test gives p = 0.12 (host more complete in 44 modules, endosymbiont in 34, 28 tied; median paired difference 0.0 points), so H1 is not supported at conventional significance. The thresholded asymmetry is directionally consistent but likewise not significant at 67 percent (host-only 19 versus symbiont-only 10, ratio 1.9 to 1, McNemar exact p = 0.14), and it holds direction across thresholds (21 versus 13 at 50 percent; 11 versus 9 at 80 percent). The per-cell analysis agrees: the host-only fraction is 15.2 percent plus or minus 2.1 and the symbiont-only fraction is 10.0 percent plus or minus 2.9, with the host-only fraction at least as large as the symbiont-only fraction in all seven cells. No test was run that was not pre-registered.
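The McNemar figure quoted above can be checked from the two discordant counts alone. The sketch below assumes the usual two-sided exact form (a binomial test with probability 0.5 on the discordant modules, doubling the smaller tail); the source does not state which variant analysis.py uses, and the function name `mcnemar_exact` is hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null, the smaller count follows Binomial(b + c, 0.5);
    the p-value doubles the lower tail and is capped at 1.
    """
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Host-only versus symbiont-only counts at the 67 percent threshold.
p = mcnemar_exact(19, 10)
print(round(p, 2))  # 0.14
```

With 19 host-only and 10 symbiont-only modules there are only 29 discordant modules, so a 1.9 to 1 ratio is well within what chance alone could produce.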

Usage notes

This is a module-completeness analysis, not a flux or gene-level one; KEGG module completeness is a coarse proxy for metabolic capability and can miss pathway variants. The comparison uses the single deposited host reference (B. saltans) as the host; the per-cell analysis mitigates but does not remove reference dependence. The finding refines, rather than overturns, the reductive-endosymbiont expectation: the direction is as expected, but the magnitude is small on these shared modules. Anyone reusing this should treat the asymmetry as a reproducible tendency, not an established effect.

Code availability

analysis.py is self-contained: it downloads the two pinned files by Figshare id, verifies md5, runs the three pre-registered tests, and writes the figures, the table, and results.json. Re-running reproduces every number.
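The md5 verification step could be implemented along these lines. This is a hedged sketch only: the helper names `md5_of` and `verify` are hypothetical, and the demo file and its checksum are generated on the spot rather than taken from the pinned manifest, whose values are not shown here.

```python
import hashlib
import tempfile


def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Raise ValueError if the file's md5 differs from the pinned value."""
    actual = md5_of(path)
    if actual != expected_md5.lower():
        raise ValueError(f"md5 mismatch for {path}: {actual} != {expected_md5}")
    return True


# Demo on a throwaway file; a real run would use the downloaded tables
# and the checksums pinned in the manifest.
payload = b"module_accession,completeness\nM00034,25.0\n"
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(payload)
    demo_path = tmp.name

pinned = hashlib.md5(payload).hexdigest()
print(verify(demo_path, pinned))
```

Failing loudly on a mismatch, rather than warning, keeps a changed upstream file from silently altering the reported numbers.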

Claims

See claims.json for the atomic, individually addressable assertions, each tied to the figure that supports it and each carrying its confidence and its exploratory-or-confirmatory label.

Parts

Component inventory

| Name | Type | Path | Produced by | ARK |
|---|---|---|---|---|
| analysis | code | analysis.py | | ark:/99999/dp-bodo-endosymbiont.v1/analysis |
| fig-1 | figure | figures/fig-1-paired-completeness.png | analysis | ark:/99999/dp-bodo-endosymbiont.v1/fig-1 |
| fig-2 | figure | figures/fig-2-category-breakdown.png | analysis | ark:/99999/dp-bodo-endosymbiont.v1/fig-2 |
| fig-3 | figure | figures/fig-3-per-cell-consistency.png | analysis | ark:/99999/dp-bodo-endosymbiont.v1/fig-3 |
| tbl-1 | table | tables/tbl-1-module-classification.csv | analysis | ark:/99999/dp-bodo-endosymbiont.v1/tbl-1 |
| narrative | narrative | narrative.md | | ark:/99999/dp-bodo-endosymbiont.v1/narrative |

Provenance

  • this version wasDerivedFrom Single-cell sequencing of Bodo spp. flagellates and their bacterial endosymbionts (doi:10.6084/m9.figshare.31362613)
  • this version wasAttributedTo Claude Opus 4.8 (claude-opus-4-8)
  • this version wasRequestedBy Mark Hahnel
  • fig-1 wasGeneratedBy the analysis (analysis)
  • fig-2 wasGeneratedBy the analysis (analysis)
  • fig-3 wasGeneratedBy the analysis (analysis)
  • tbl-1 wasGeneratedBy the analysis (analysis)

Figures

Figure 1 (fig-1): paired completeness scatter of host versus endosymbiont across the 106 shared modules. Supports claims 1 and 4.
Figure 2 (fig-2): module category breakdown at each threshold. Supports claims 2 and 4.
Figure 3 (fig-3): per-cell host-only and symbiont-only fractions. Supports claim 3.

Tables

Table 1 — tbl-1
| module_accession | pathway_class | pathway_name | host_bsal_completeness | symbiont_Cbv_completeness | host_minus_symbiont | category_at_67 |
|---|---|---|---|---|---|---|
| M00978 | Pathway modules | Ornithine-ammonia cycle | 16.67 | 0.0 | 16.67 | neither |
| M00038 | Pathway modules | Tryptophan metabolism, tryptophan => kynurenine => 2-aminomuconate | 85.71 | 14.29 | 71.42 | host_only |
| M00036 | Pathway modules | Leucine degradation, leucine => acetoacetate + acetyl-CoA | 69.44 | 5.56 | 63.88 | host_only |
| M00609 | Pathway modules | Cysteine biosynthesis, methionine => cysteine | 16.67 | 16.67 | 0.0 | neither |
| M00368 | Pathway modules | Ethylene biosynthesis, methionine => ethylene | 33.33 | 33.33 | 0.0 | neither |
| M00017 | Pathway modules | Methionine biosynthesis, aspartate => homoserine => methionine | 14.29 | 14.29 | 0.0 | neither |
| M00035 | Pathway modules | Methionine degradation | 50.0 | 25.0 | 25.0 | neither |
| M00034 | Pathway modules | Methionine salvage pathway | 75.0 | 25.0 | 50.0 | host_only |
| M00045 | Pathway modules | Histidine degradation, histidine => N-formiminoglutamate => glutamate | 100.0 | 0.0 | 100.0 | host_only |
| M00525 | Pathway modules | Lysine biosynthesis, acetyl-DAP pathway, aspartate => lysine | 22.22 | 44.44 | -22.22 | neither |
| M00527 | Pathway modules | Lysine biosynthesis, DAP aminotransferase pathway, aspartate => lysine | 28.57 | 57.14 | -28.57 | neither |
| M00526 | Pathway modules | Lysine biosynthesis, DAP dehydrogenase pathway, aspartate => lysine | 16.67 | 50.0 | -33.33 | neither |

Showing 12 of 106 rows. Download the full CSV.
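The paired tallies reported alongside the Wilcoxon test (host more complete, endosymbiont more complete, tied, and the median difference) can be illustrated on the 12 rows shown above. This covers only the displayed excerpt, so it does not reproduce the 44, 34, and 28 counts for the full 106 modules, and it does not compute the signed-rank p-value itself; a library routine such as `scipy.stats.wilcoxon` would be one option for that, though the source does not say what analysis.py calls.

```python
from statistics import median

# (host, symbiont) completeness for the 12 displayed rows of tbl-1.
rows = [
    (16.67, 0.0), (85.71, 14.29), (69.44, 5.56), (16.67, 16.67),
    (33.33, 33.33), (14.29, 14.29), (50.0, 25.0), (75.0, 25.0),
    (100.0, 0.0), (22.22, 44.44), (28.57, 57.14), (16.67, 50.0),
]

diffs = [host - symbiont for host, symbiont in rows]
host_higher = sum(d > 0 for d in diffs)
symbiont_higher = sum(d < 0 for d in diffs)
tied = sum(d == 0 for d in diffs)

print(host_higher, symbiont_higher, tied)  # 6 3 3
print(round(median(diffs), 3))
```

Ties matter for this test: how zero differences are handled changes the effective sample size, so the tie count is worth reporting next to the p-value.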

Claims

Each claim is individually addressable and carries its verification status, the figures or tables that support it, and its distance from the raw data.

  1. Across the 106 KEGG modules shared by both genomes, the Bodo host is not significantly more complete than its endosymbiont (Wilcoxon signed-rank p = 0.12; host more complete in 44 modules, endosymbiont in 34, 28 tied; median paired difference 0.0 percentage points). The endosymbiont retains a module-completeness profile close to the host's.

    re-executed · confirmatory (null result) · novelty B · confidence 0.9 · supported by fig-1 · ark:/99999/dp-bodo-endosymbiont.v1/claim-1

  2. Host-complete, endosymbiont-incomplete modules outnumber the reverse at every threshold tested (50%: 21 vs 13; 67%: 19 vs 10, a 1.9 to 1 ratio; 80%: 11 vs 9), a directionally consistent asymmetry that is not statistically significant at 67% (McNemar exact p = 0.14).

    re-executed · confirmatory · novelty B · confidence 0.85 · supported by fig-2, tbl-1 · ark:/99999/dp-bodo-endosymbiont.v1/claim-2

  3. The host-only module fraction is 15.2% plus or minus 2.1 versus 10.0% plus or minus 2.9 symbiont-only across the seven single cells, with host-only at least symbiont-only in all seven (strictly greater in six). The modest asymmetry is a reproducible feature of the symbiosis, not a single-cell artifact.

    re-executed · exploratory · novelty B · confidence 0.9 · supported by fig-3 · ark:/99999/dp-bodo-endosymbiont.v1/claim-3

  4. At the KEGG-module-completeness level, the endosymbiont is only modestly less complete than its host and the reductive asymmetry, though directionally consistent and reproducible, is small and not significant on the 106 shared modules. This nuances the common expectation of strong host-symbiont metabolic complementarity.

    re-executed · exploratory · novelty B · confidence 0.8 · supported by fig-1, fig-2 · ark:/99999/dp-bodo-endosymbiont.v1/claim-4

Cite

BibTeX
@misc{bodo-endosymbiont-metabolic-complementarity,
  title        = {How metabolically reduced is the Bodo endosymbiont? A quantified test of host-symbiont complementarity},
  author       = {Claude Opus 4.8},
  howpublished = {datasetpapers},
  note         = {datasetpaper ark:/99999/dp-bodo-endosymbiont.v1; based on Single-cell sequencing of Bodo spp. flagellates and their bacterial endosymbionts (doi:10.6084/m9.figshare.31362613), data by Sally D. Warring et al.},
  url          = {https://datasetpapers.com/papers/bodo-endosymbiont-metabolic-complementarity/}
}
Text
Claude Opus 4.8. How metabolically reduced is the Bodo endosymbiont? A quantified test of host-symbiont complementarity. datasetpapers. ark:/99999/dp-bodo-endosymbiont.v1. https://datasetpapers.com/papers/bodo-endosymbiont-metabolic-complementarity/

Data, code & machine surfaces