-Cancerseek (fig3), based on the detection of circulating tumor cells , which for the most common types of cancer (Colon, Breast, lung) does not exceed 0.65. The classification of cancers is done in this study by the complementary analysis of the circulating DNA. At stage 1, the CancerSeek study in particular only has a sensitivity of 0.43(1)
-CCGA CCGA (Circulating Cell-free Genome Atlas), which can detect 50 types of cancer by analyzing circulating DNA, but with a sensitivity of 0.7 (2) at all stages, and 0.73 at stage 2. Liquid biopsy by searching for circulating DNA, for example, in the Thunder study only gives a sensitivity of 59.8% up to stage III (4).
While for example in 2012, 46% of cancers in the United Kingdom were only discovered from stage 3 (5), it would indeed be very effective to be able to detect with good sensitivity and at a stage at most equal to 2 , most types, from the same sample. Indeed, in the case of gastric cancer for example, the 5-year survival rate goes from 65% if the cancer is detected at stages 1-2 to 25% at later stages (6) .
This is the reason why we are building a new model , similar to the one already established, but which is limited to sequencing data from patients without cancer, or with cancer at most stage 2.
A model limited, to date, to the detection of gastric cancer (PRJNAxxx-Ncbi study) has shown in phases 1 and 2 of cancer, a sensitivity of 0.88 and a specificity of 0.80, for logistic regression. However, the statistical reliability of this model, which involves 36 patients, is not optimal since a minimum of 100+2*50=200 (7) would be needed to get the bast confidence interval.
Another model applied to the detection of colon cancer (study PRJNAxxx-Ncbi) has shown in phases 1 and 2, a sensitivity of 1 and a specificity of 1, for more than 200 patients.
The cost of individual sequencing is at least 300Eur (8). If we had to redo a complete study, only for gastric cancer, the cost would be at least 200*300=60,000Eur. The detection examination for this type of cancer is quite restrictive, since it is most often an upper digestive endoscopy (gastroscopy, 9), while the symptoms are often late. The cost of sequencing will be all the more competitive, for example in comparison with a gastroscopy (260 Eur, 10), as it will make it possible to detect several types of cancer in a single examination. Liquid biopsy is on average less expensive than gastroscopy because it is not linked to a day off work, but on the contrary does not allow the removal of potentially cancerous polyps (11).
We are therefore currently seeking the provision of ngs sequencing data already carried out, of patients listed at stages 1 or 2, to train another Machine Learning models to the detection of another types of cancer.