Composition of the Cohort

The MEC consists of men and women primarily from five ethnic groups (Caucasians, Japanese Americans, Native Hawaiians, African Americans, and Latinos). At baseline, the final sample consisted of more than 215,000 participants. The great majority of the African Americans and Latinos in the cohort are from CA, whereas the great majority of the Japanese Americans, Caucasians, and Native Hawaiians are from HI. The cohort encompasses a broad spectrum of the ethnic groups sampled: when we compared the distributions of educational level and marital status in the cohort sample with the 1990 census data for HI and Los Angeles County, the distributions were remarkably similar, suggesting that study findings should be broadly generalizable to these populations. Furthermore, cancer incidence rates within the cohort are similar to those for the state of HI and for southern California. Descriptions of the cohort have been published (1,2); the baseline distribution by sex, age, and ethnicity is shown in Table 1.

Table 1. Composition of the Multiethnic Cohort at baseline
                                     Age at Entry
Ethnicity           Sex         45-54    55-64    65-75    Total
Japanese American   Male        8,065    7,667   11,232   26,964
                    Female      8,809    9,299   11,849   29,957
Caucasian           Male        8,718    6,672    7,467   22,857
                    Female     10,216    7,937    8,349   26,502
Native Hawaiian     Male        2,779    1,969    1,372    6,120
                    Female      3,753    2,426    1,672    7,851
African American    Male        2,899    3,503    6,449   12,851
                    Female      6,136    6,315    9,805   22,256
Latino              Male        6,376    9,453    6,989   22,818
                    Female      7,767   10,481    6,372   24,620
Other¹              Male        1,744    1,801    1,655    5,200
                    Female      2,949    2,495    1,811    7,255
Total                          70,211   70,018   75,022  215,251
¹ Includes Filipinos, Samoans, Koreans, and small numbers of other ethnic groups.
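
As a quick consistency check on Table 1, the following Python sketch recomputes the age-group totals and the share of each ethnic group from the cell counts. The counts are transcribed from the table; the names `TABLE1`, `column_totals`, `ethnicity_totals`, and `ethnicity_shares` are illustrative and are not part of any MEC software.

```python
# Cell counts transcribed from Table 1:
# (ethnicity, sex) -> counts for ages 45-54, 55-64, 65-75 at entry.
TABLE1 = {
    ("Japanese American", "Male"): (8065, 7667, 11232),
    ("Japanese American", "Female"): (8809, 9299, 11849),
    ("Caucasian", "Male"): (8718, 6672, 7467),
    ("Caucasian", "Female"): (10216, 7937, 8349),
    ("Native Hawaiian", "Male"): (2779, 1969, 1372),
    ("Native Hawaiian", "Female"): (3753, 2426, 1672),
    ("African American", "Male"): (2899, 3503, 6449),
    ("African American", "Female"): (6136, 6315, 9805),
    ("Latino", "Male"): (6376, 9453, 6989),
    ("Latino", "Female"): (7767, 10481, 6372),
    ("Other", "Male"): (1744, 1801, 1655),
    ("Other", "Female"): (2949, 2495, 1811),
}


def column_totals(table):
    """Sum each age-at-entry column over all ethnicity/sex rows."""
    return tuple(sum(column) for column in zip(*table.values()))


def ethnicity_totals(table):
    """Sum both sexes and all age groups for each ethnicity."""
    totals = {}
    for (ethnicity, _sex), counts in table.items():
        totals[ethnicity] = totals.get(ethnicity, 0) + sum(counts)
    return totals


def ethnicity_shares(table):
    """Return each ethnicity's fraction of the whole baseline cohort."""
    grand_total = sum(sum(counts) for counts in table.values())
    return {eth: n / grand_total for eth, n in ethnicity_totals(table).items()}


if __name__ == "__main__":
    print(column_totals(TABLE1))
    for ethnicity, share in ethnicity_shares(TABLE1).items():
        print(f"{ethnicity}: {share:.1%}")
```

The recomputed column totals agree with the Total row of the table, and the three age groups sum to the 215,251 participants reported.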


Comprehensive baseline information was collected using an extensive, 26-page questionnaire (Qx.1) that was mailed to all participants. The initial instrument included information on demographics, medical and reproductive histories, medication use (including hormone replacement therapy), family history of various cancers, smoking history, and physical activity, together with an extensive quantitative food frequency questionnaire (QFFQ). This QFFQ was developed from 3-day measured food records collected on a representative sample of ~300 adults representing each of the five main ethnic groups in the two areas. These records were used to identify the minimum set of food items contributing >85% (for each ethnic group) of the intakes of nutrients of particular interest, and unique items consumed primarily by one or more ethnic groups. The measured records were also used to establish common portion sizes. For several food groups that are difficult to estimate quantitatively, the questionnaire included photographs showing the foods in three different portion sizes. Based on our prior experience developing dietary assessment instruments, we were able to develop a single diet questionnaire for all five ethnic groups, which optimized standardization in the data collection. A Spanish version was developed for Latinos who preferred to complete the questionnaire in Spanish; 50% of the MEC Latinos chose this version.

Our dietary instrument was validated in a calibration study, which showed very satisfactory correlations of nutrient intakes between the QFFQ and the 24-hour recalls for each sex-ethnic group, especially after energy adjustment (3). The calibration study collected three 24-hour recalls (on random days about 1 month apart, with weekday and weekend representation) and a repeat QFFQ on a random sample (stratified by sex, ethnic group, and age group) of over 2,500 subjects to permit estimation of and adjustment for measurement error in the QFFQ.
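
The food-list step described above (finding the minimum set of items that contributes >85% of a nutrient's intake in each ethnic group) can be sketched as a cumulative ranking. This is only a minimal illustration under stated assumptions, not the procedure actually used to build the QFFQ: the function names, the food names, and the intake values are all hypothetical, and a single nutrient is shown.

```python
def minimum_food_set(contributions, threshold=0.85):
    """Smallest list of foods whose combined share of intake exceeds `threshold`.

    `contributions` maps food name -> total intake of one nutrient from that
    food (hypothetical values). Ranking foods from largest to smallest and
    taking the shortest prefix that passes the threshold gives a set of
    minimum size for a single nutrient.
    """
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("total intake must be positive")
    selected, running = [], 0.0
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    for food, amount in ranked:
        selected.append(food)
        running += amount
        if running / total > threshold:
            break
    return selected


def food_list_for_groups(by_group, threshold=0.85):
    """Union of the per-group minimum sets, so every group passes the threshold."""
    items = set()
    for contributions in by_group.values():
        items.update(minimum_food_set(contributions, threshold))
    return sorted(items)


if __name__ == "__main__":
    # Toy intake values for two made-up groups; not MEC data.
    toy = {
        "group_a": {"rice": 40, "bread": 30, "beans": 20, "tofu": 6, "poi": 4},
        "group_b": {"poi": 50, "rice": 40, "bread": 10},
    }
    print(food_list_for_groups(toy))
```

Taking the union across groups is what allows a single questionnaire to serve all groups: an item enters the list if it is needed to pass the threshold in any one of them.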

We have augmented the baseline questionnaire data with additional follow-up questionnaires in 1998-2002 (Qx.2), 2003-2008 (Qx.3), 2008-2012 (Qx.4), 2012-2016 (Qx.5) and 2018-2022 (Qx.6).


Biospecimen collection began in 1995 with samples from new cases of breast, prostate, and colorectal cancer, and from a cross-section of participants, for genomic analyses using a nested case-control design. Subsequently (2001-2006), we created a prospective biorepository. The purposes were: a) to enable us to examine biomarkers (nutritional, hormonal, etc.) related to etiology and early detection that require pre-diagnostic samples; and b) to enhance the genetic studies by including additional, particularly rapidly fatal, cancer sites (e.g., pancreas, ovary, lung) for which blood cannot be easily collected after diagnosis. Both locations used the same collection protocol, except that overnight urines were collected in HI, whereas a first morning void was obtained in CA, where the overnight collection was not feasible. The prospective specimen collections yielded a subcohort of ~70,000 participants, with 40 cc of blood preserved for each subject in multiple 0.4 cc aliquots of serum, plasma, buffy coat, and red cells stored in liquid nitrogen. Urine (20 ml per subject) is stored in multiple aliquots at -80°C. Mouthwash (MW) was requested from subjects who refused the blood sample (4).

Biospecimen collection

In HI, subjects were requested to fast from 9:00 pm the previous evening, and to collect their first morning urine (plus any passed during the night) on frozen blue ice in a container provided. At USC, most samples were collected at the subjects' homes; the subjects were requested to fast overnight and to collect a first morning urine sample. To standardize for circadian rhythms, samples were collected between 7:00 am and 10:00 am. A short survey administered before the blood draw asked about the time of the subject’s last meal and, in HI, the starting and ending times of the urine collection. Four 10 ml evacuated glass tubes (two heparin and two dry) were collected. Samples were transported on ice to the UH or USC laboratory for registration and processing. A total of 3,703,039 aliquots were stored, with 55% collected in Hawaii and 45% at USC, to form the MEC Biorepository.

Number and quality of the samples collected

Table 2 provides the numbers of MEC participants in HI and CA on whom samples were collected, by type of specimen and ethnicity. The total number of participants in the biorepository was 75,928. The vast majority of blood samples were collected in a fasting state, with 92% fasting for ≥8 hrs, 89% for ≥10 hrs, and 69% for ≥12 hrs. In HI, the median interval between blood collection and freezing was 5.5 hrs overall (9.4 hrs for the neighbor islands and 4.9 hrs for Oahu). At USC, 93% of the samples were processed and frozen within 4 hrs of collection.

Table 2. Number of MEC Members on whom samples were collected by Type of Sample and Race¹
Type of Sample        Japanese  Caucasian  Native Hawaiian  African American   Latino    Total
Blood                   23,225     13,703            5,273            10,419   16,979   70,068
Urine                   23,285     13,736            5,286             9,902   16,101   68,749
Mouthwash                1,456        927              319               715      938    4,379
Saliva                     278        538               61               372      444    1,870
Viable Lymphocytes       7,292      5,542            2,279                 0        0   15,113
Any                     25,129     15,468            5,893            11,636   18,443   77,252
¹ As of September 2023

Cancer Ascertainment

The MEC was specifically designed to include participants living in areas covered by cancer registries of the Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute (NCI), to make possible the ready identification of incident cancers through computer linkages. Because of the high quality of SEER registries, the identification of cases still residing in the catchment areas is virtually complete and the information is highly accurate. Out-migration from the MEC to states other than HI or CA is minimal. Based on extensive tracking of a random sample of the cohort, we found that the out-migration rate was 3.7% after 7 years of follow-up. The most recent Medicare linkage, covering 88% of MEC members, provided us with updated residential status; after updating or confirming the addresses of these members, the last known address of 4.0% of the original cohort was outside of HI and CA, of whom 74% have died. Cohort members who have out-migrated are still followed through questionnaires, our efforts to update addresses, and linkages to Medicare and the National Death Index (NDI).

Our main means of identifying incident cancers is linkage to two state-wide SEER registries: the Hawaiʻi Tumor Registry and the California State Cancer Registry. Deaths are identified by linkage to the state death certificate files in HI and CA, and to the NDI for deaths occurring in other states. Table 3 shows the number of invasive cancer cases currently available for each main cancer site, as well as the numbers projected by the end of 2027. It also shows the numbers of cases with biospecimens (in total and collected pre-diagnostically [pre-dx]).

Table 3. Incident Cases in the MEC
                                   Cases Available in 2021*              Cases Available in 2027*
Cancer Site                      Total    All with  With Pre-Dx        Total    All with  With Pre-Dx
                                         Specimens    Specimens                Specimens    Specimens
Total                           48,064      21,408       10,957       55,317      27,468       13,627
Breast (F)                       7,968       4,729        1,746        9,150       5,850        2,171
Prostate                        10,962       6,748        2,780       12,106       8,206        3,195
Colorectum                       6,675       3,167        1,392        7,736       4,118        1,811
Lung                             7,378       2,118        1,723        8,971       3,142        2,327
Stomach                          1,860         585          458        2,222         853          612
Bladder                          1,345         521          367        1,708         841          564
Kidney                           1,888         817          537        2,354       1,139          712
Pancreas                         2,206         715          657        2,788       1,085          884
Non-Hodgkin's lymphoma (NHL)     2,756       1,070          777        3,432       1,546        1,046
Endometrium                      1,629         853          370        1,886       1,087          466
Ovary                              742         289          139          859         379          193
* Represents cancer cases diagnosed up to 18 months earlier due to reporting lag.
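
To make the Table 3 figures easier to compare across sites, the short Python sketch below derives two quantities from the published counts: the fraction of cases at a site that have a pre-diagnostic specimen, and the projected gain in such cases between 2021 and 2027. The counts are transcribed from the table; `SiteCounts`, `TABLE3`, `pre_dx_fraction`, and `projected_pre_dx_gain` are illustrative names only.

```python
from typing import NamedTuple


class SiteCounts(NamedTuple):
    total: int           # all incident cases
    with_specimens: int  # cases with any biospecimen
    pre_dx: int          # cases with a pre-diagnostic specimen


# Counts transcribed from Table 3, keyed by cancer site and year.
TABLE3 = {
    "Breast (F)": {2021: SiteCounts(7968, 4729, 1746), 2027: SiteCounts(9150, 5850, 2171)},
    "Prostate": {2021: SiteCounts(10962, 6748, 2780), 2027: SiteCounts(12106, 8206, 3195)},
    "Colorectum": {2021: SiteCounts(6675, 3167, 1392), 2027: SiteCounts(7736, 4118, 1811)},
    "Lung": {2021: SiteCounts(7378, 2118, 1723), 2027: SiteCounts(8971, 3142, 2327)},
    "Stomach": {2021: SiteCounts(1860, 585, 458), 2027: SiteCounts(2222, 853, 612)},
    "Bladder": {2021: SiteCounts(1345, 521, 367), 2027: SiteCounts(1708, 841, 564)},
    "Kidney": {2021: SiteCounts(1888, 817, 537), 2027: SiteCounts(2354, 1139, 712)},
    "Pancreas": {2021: SiteCounts(2206, 715, 657), 2027: SiteCounts(2788, 1085, 884)},
    "NHL": {2021: SiteCounts(2756, 1070, 777), 2027: SiteCounts(3432, 1546, 1046)},
    "Endometrium": {2021: SiteCounts(1629, 853, 370), 2027: SiteCounts(1886, 1087, 466)},
    "Ovary": {2021: SiteCounts(742, 289, 139), 2027: SiteCounts(859, 379, 193)},
}


def pre_dx_fraction(site, year):
    """Fraction of all cases at `site` that have a pre-diagnostic specimen."""
    counts = TABLE3[site][year]
    return counts.pre_dx / counts.total


def projected_pre_dx_gain(site):
    """Projected increase in pre-diagnostic cases from 2021 to 2027."""
    return TABLE3[site][2027].pre_dx - TABLE3[site][2021].pre_dx


if __name__ == "__main__":
    for site in TABLE3:
        print(f"{site}: {pre_dx_fraction(site, 2021):.1%} pre-dx in 2021, "
              f"+{projected_pre_dx_gain(site)} cases projected by 2027")
```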


  1. Kolonel LN, Henderson BE, Hankin JH, Nomura AM, Wilkens LR, Pike MC, Stram DO, Monroe KR, Earle ME, Nagamine FS. A multiethnic cohort in Hawaii and Los Angeles: baseline characteristics. American journal of epidemiology. 2000;151(4):346-57. Epub 2000/03/01. PubMed PMID: 10695593.
  2. Kolonel LN, Altshuler D, Henderson BE. The multiethnic cohort study: exploring genes, lifestyle and cancer risk. Nature reviews Cancer. 2004;4(7):519-27. doi: 10.1038/nrc1389. PubMed PMID: 15229477.
  3. Stram DO, Hankin JH, Wilkens LR, Pike MC, Monroe KR, Park S, Henderson BE, Nomura AM, Earle ME, Nagamine FS, Kolonel LN. Calibration of the dietary questionnaire for a multiethnic cohort in Hawaii and Los Angeles. American journal of epidemiology. 2000;151(4):358-70. PubMed PMID: 10695594; PubMed Central PMCID: 4482461.
  4. Le Marchand L, Lum-Jones A, Saltzman B, Visaya V, Nomura AM, Kolonel LN. Feasibility of collecting buccal cell DNA by mail in a cohort study. Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology. 2001;10(6):701-3. PubMed PMID: 11401922.