Cocoa genome sequencing – sciencedaily

The production of high-quality chocolate, and the farmers who grow the cocoa for it, will benefit from the recent sequencing and assembly of the chocolate tree genome, according to an international team led by Claire Lanaud of CIRAD, France, with Mark Guiltinan of Penn State, and including scientists from 18 other institutions.

The team sequenced the DNA of a variety of Theobroma cacao considered to produce the best chocolate in the world. The Mayans domesticated this variety, Criollo, about 3,000 years ago in Central America, and it is one of the oldest domesticated tree crops. Today, many growers prefer hybrid cocoa trees, which produce inferior chocolate but are more resistant to disease.

“Fine cocoa production is estimated at less than 5 percent of global cocoa production due to its low productivity and susceptibility to disease,” said Guiltinan, professor of plant molecular biology.

The researchers report in the current issue of Nature Genetics: “Consumers have shown an increased interest in high-quality chocolate made from good-quality cocoa and in dark chocolate, containing a higher percentage of cocoa, while also taking into account the environmental and ethical criteria of cocoa production.”

Currently, most cocoa farmers earn around $2 a day, but fine cocoa farmers earn more. Increasing the productivity and ease of cultivation of cocoa can help develop a sustainable cocoa economy. Cocoa trees are now also considered an environmentally beneficial crop, because they grow better in the shade of forests, allowing land rehabilitation and enrichment of biodiversity.

The team’s work has identified a variety of gene families that could have a future impact on improving cocoa trees and fruits, either by improving their attributes or by providing protection against the fungal diseases and insects that affect them.

“Our analysis of the Criollo genome uncovered the genetic basis of pathways leading to the most important quality traits in chocolate – the biosynthesis of oil, flavonoids and terpenes,” said Siela Maximova, associate professor of horticulture at Penn State and a member of the research team. “It has also led to the discovery of hundreds of genes potentially involved in resistance to pathogens, all of which can be used to accelerate the development of elite varieties of cocoa in the future.”

Because Criollo trees are self-pollinating, they are generally highly homozygous, possessing two identical forms of each gene, which makes this particular variety a good choice for accurate genome assembly.

The researchers assembled 84 percent of the genome, identifying 28,798 genes that code for proteins. They assigned 88 percent, or 23,529, of these protein-coding genes to one of the 10 chromosomes of the Criollo cocoa tree. They also looked at microRNAs, short non-coding RNAs that regulate genes, and found that Criollo microRNAs are likely major regulators of gene expression.

“Interestingly, only 20% of the genome was made up of transposable elements, one of the natural pathways through which genetic sequences change,” Guiltinan said. “They do it by moving around chromosomes, changing the order of genetic material. Smaller amounts of transposons than those found in other plant species could lead to slower evolution of the chocolate plant, which was found to have a relatively simple evolutionary history in terms of genome structure.”

Guiltinan and his colleagues are interested in specific gene families that may be linked to specific qualities of cocoa or to disease resistance. They hope that mapping these gene families will lead to a source of genes directly involved in plant variations that will be useful in speeding up plant breeding programs.

The researchers identified two types of disease resistance genes in the Criollo genome. They compared the locations of these genes with previously identified regions on chromosomes that correlate with disease resistance, known as quantitative trait loci (QTLs), and found that many of the resistance genes lie at or near those QTLs. The team suggests that a functional genomics approach, which examines what genes do, is needed to confirm potential disease resistance genes in the Criollo genome.

Hidden in the genome, the researchers also discovered 84 genes that code for the production of cocoa butter, a substance highly prized in chocolate making, confectionery, pharmaceuticals and cosmetics. Most cocoa beans already contain around 50 percent fat, but these genes control not only the quantity but also the quality of the cocoa butter.

Other genes were found that influence the production of flavonoids, which are natural antioxidants, and of terpenoids, which include hormones, pigments and flavors. Altering the genes for these chemicals could produce chocolate with better flavors and aromas, and even healthier chocolate.

Penn State researchers involved in this study include Guiltinan and Maximova; Yufan Zhang and Zi Shi, graduate students, plant biology; Stephen Schuster, Department of Biochemistry and Molecular Biology; John E. Carlson, School of Forest Resources; and MJ Axtell and Z. Ma, Department of Biology.

The other researchers involved were from CIRAD; National Institute of Agronomic Research UMR; Genoscope; Scientific Research National Center; National Genotyping Center; University of Evry; INRA-CNRS LIPM Laboratory of Plant Micro-Organism Interactions; University of Perpignan; Biometrics and Artificial Intelligence Unit; Institute of Plant Sciences; and Chocolaterie Valrhona, all in France.

Also included are researchers from the University of Arizona; Cold Spring Harbor Laboratory; National Center for Agronomic Research, Côte d’Ivoire; CEPLAC, Brazil; and Centro Nacional de Biotecnologia Agricola, Instituto de Estudios Avanzados, Venezuela.

CIRAD, the Agropolis Foundation, the Languedoc Roussillon Region, the National Research Agency (ANR), Valrhona, the Venezuelan Ministry of Science, Technology and Industry, Hershey Corp., the American Cocoa Research Institute Endowment and the National Science Foundation supported this work.

The Theobroma cacao genome sequences are deposited in the EMBL/GenBank/DDBJ databases under the accession numbers CACC01000001-CACC01025912. A genome browser and additional information on the project are available online.
