Fundamentals of Research


(P) Population of interest

(I) Intervention being studied. 

(C) Comparison group (what the intervention is being compared with).

(O) Outcome (dependent variable) of interest. 

(T) Time for follow-up 

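Types of Errors

Two potential errors are commonly recognized when testing a hypothesis: a Type 1 error (rejecting a true null hypothesis, a false positive) and a Type 2 error (failing to reject a false null hypothesis, a false negative).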


A drug manufacturer is testing a new calcium channel blocker against placebo to detect a difference in effect, but the study is powered at only 60%. The study did not show a statistically significant difference in the outcome versus placebo (p-value = 0.16). Because power = 1 - β, β = 1 - 0.6 = 0.4. In other words, if the new calcium channel blocker truly has an effect of the size the study was designed to detect, there was a 40% chance of failing to demonstrate it (a Type 2 error). The risk of this type of error can be reduced by increasing the sample size or the duration of the study.
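The relationship between power and β used in the example above can be checked in a few lines of Python. This is a minimal sketch; the function name `beta_from_power` is illustrative and not part of the original notes.

```python
def beta_from_power(power: float) -> float:
    """Return beta (Type 2 error probability), using power = 1 - beta."""
    if not 0.0 <= power <= 1.0:
        raise ValueError("power must be between 0 and 1")
    return 1.0 - power


# The study in the example is powered at 60%.
beta = beta_from_power(0.60)
print(f"beta = {beta:.2f}")  # prints "beta = 0.40"
```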

Internal and External Validity in Research

The validity of a study concerns whether imperfections in the study design, data collection, or methods of data analysis might distort the conclusions about an exposure-disease relationship. Validity can be classified as internal or external.

When evaluating bias in a study, it is essential to assess its source, strength, and direction. Bias can pull an estimate towards or away from the null. The most common types of bias are selection bias and information bias. Misclassification (measurement error) is the most common type of information bias and occurs during data collection.

Random error is a chance difference between the observed and true values of something (e.g., a researcher misreads a weighing scale and records an incorrect measurement). Random error mainly affects precision, which is how reproducible the same measurement is under equivalent circumstances. In contrast, systematic error affects the accuracy of a measurement, or how close the observed value is to the true value. Factors that contribute to random error include observer variability, imprecise definitions, instrument variability, lack of instrument sensitivity, and sampling error.
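The difference between the two kinds of error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true weight, the noise levels, and the bias are hypothetical values chosen for illustration, and random error is modelled as Gaussian noise.

```python
import random
import statistics

TRUE_WEIGHT_KG = 70.0  # hypothetical true value being measured


def simulate_readings(n, random_sd, systematic_bias, seed=0):
    """Simulate n scale readings with Gaussian noise plus a fixed offset."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        TRUE_WEIGHT_KG + systematic_bias + rng.gauss(0.0, random_sd)
        for _ in range(n)
    ]


# Mostly random error: readings scatter widely but centre on the true value.
noisy = simulate_readings(n=1000, random_sd=2.0, systematic_bias=0.0)
# Mostly systematic error: readings are tightly grouped but shifted.
biased = simulate_readings(n=1000, random_sd=0.2, systematic_bias=1.5)

for label, readings in (("random error", noisy), ("systematic error", biased)):
    offset = statistics.mean(readings) - TRUE_WEIGHT_KG
    spread = statistics.stdev(readings)
    print(f"{label}: mean offset = {offset:+.2f} kg, SD = {spread:.2f} kg")
```

The first series has a large standard deviation (poor precision) but a mean close to the true value; the second has a small standard deviation but a mean that is consistently off (poor accuracy).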


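Confounder(s)

Confounder(s): A variable that is associated with both the exposure and the outcome and is not on the causal pathway between them, so it can distort the observed exposure-outcome association.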

Residual confounding: Confounding that remains even after many confounding variables have been controlled. Reasons for residual confounding: uncollected data, mismeasurement of a confounder, and/or persistent differences in risk within a category of a confounder.

Confounder(s) criteria:

Assessment of confounding:

Magnitude of confounding = [(crude estimate - adjusted estimate) / adjusted estimate] × 100

Example: Suppose we are studying the association between coffee and lung cancer.

Crude estimate, coffee-lung cancer = 5 (not adjusted for smoking).

Adjusted estimate, coffee-lung cancer = 1.2 (adjusted for smoking).

Applying the equation, magnitude of confounding = [(5 - 1.2) / 1.2] × 100 = 316.7%. Therefore, smoking is considered a significant confounder, and we should adjust for it.
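The coffee and lung cancer calculation can be reproduced as follows. This is a minimal sketch; the function name `confounding_magnitude` is an illustrative choice, not an established API.

```python
def confounding_magnitude(crude: float, adjusted: float) -> float:
    """Percent change between crude and adjusted estimates, relative to the adjusted one."""
    if adjusted == 0:
        raise ValueError("adjusted estimate must be non-zero")
    return (crude - adjusted) / adjusted * 100.0


# Values from the example: crude = 5, adjusted for smoking = 1.2.
magnitude = confounding_magnitude(crude=5.0, adjusted=1.2)
print(f"Magnitude of confounding = {magnitude:.1f}%")  # prints "Magnitude of confounding = 316.7%"
```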

Categorical versus Continuous Variables

Categorical Variables

Categorical variables can be divided into nominal or ordinal:

Continuous (Parametric) Variables

Continuous variables can be divided into ratio or interval scales:

Correlation & Causation

1. Strength of association.

2. Consistency.

3. Specificity.

4. Temporality.

5. Dose-response relationship (gradient).

6. Plausibility (agrees with currently accepted understanding of pathological processes).

7. Coherence (compatible with existing theory and knowledge).

8. Experimental evidence.

9. Consideration of alternative explanations.

Central tendency

Mean: Equals the sum of observations divided by the number of observations. 

Median: Equals the middle observation when all observations are ordered from smallest to largest and the number of observations is odd. If the number of observations is even, the median is the mean of the two middle data points. The median is the value with 50% of the data above it and 50% of the data below it.

Mode: Equals the observation that occurs most frequently.

For example, the mean, median, and mode for the data 5, 6, 9, 5, 6, 7, 2, 3 are as follows. Mean = (5+6+9+5+6+7+2+3)/8 = 43/8 = 5.375, or about 5.4. Median: ordering the data from smallest to largest gives 2, 3, 5, 5, 6, 6, 7, 9; because the number of observations is even, the median is (5+6)/2 = 5.5. Mode: the data are bimodal (5 and 6).
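The worked example above can be reproduced with Python's standard `statistics` module. This is a small sketch; `multimode` (available from Python 3.8) is used because it returns every value tied for most frequent, which suits bimodal data. The exact mean is 43/8 = 5.375.

```python
import statistics

data = [5, 6, 9, 5, 6, 7, 2, 3]

mean = statistics.mean(data)        # sum of observations / number of observations
median = statistics.median(data)    # mean of the two middle values for an even count
modes = statistics.multimode(data)  # every value tied for the highest frequency

print(mean, median, modes)  # prints "5.375 5.5 [5, 6]"
```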


Dependent versus Independent Variables

Prevalence versus Incidence

Rate versus Proportion


For example, in a study where 10% of patients treated with drug A progressed vs. 15% of patients treated with drug B, there is a 5% absolute risk reduction (ARR) in disease progression with drug A compared with drug B: ARR = 15% - 10% = 5%. Using the same example, the risk of progression is reduced by 33.3% in relative terms with drug A compared with drug B: relative risk reduction (RRR) = (15 - 10)/15 = 5/15 = 33.3%.
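The ARR and RRR arithmetic for the drug A versus drug B example can be sketched in Python. The function and variable names are illustrative, and drug B is treated as the comparison (control) arm, as in the example.

```python
def absolute_risk_reduction(control_risk: float, treatment_risk: float) -> float:
    """ARR: difference in event risk between the comparison and treatment arms."""
    return control_risk - treatment_risk


def relative_risk_reduction(control_risk: float, treatment_risk: float) -> float:
    """RRR: the ARR expressed as a fraction of the comparison arm's risk."""
    return (control_risk - treatment_risk) / control_risk


risk_drug_b = 0.15  # 15% progressed on drug B (comparison arm)
risk_drug_a = 0.10  # 10% progressed on drug A

arr = absolute_risk_reduction(risk_drug_b, risk_drug_a)
rrr = relative_risk_reduction(risk_drug_b, risk_drug_a)
print(f"ARR = {arr:.1%}, RRR = {rrr:.1%}")  # prints "ARR = 5.0%, RRR = 33.3%"
```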

Odds ratio (OR) 

The odds ratio (OR) compares the odds of an outcome among those treated with a medication (or exposed to something) with the odds among those not treated (or not exposed). Odds ratio (OR) = (A × D) / (B × C), where A, B, C, and D are the four cells of the 2 × 2 exposure-by-outcome table.

Relative risk (RR) differs from the odds ratio (OR) in the quantity being compared. For example, if we are studying drug X and mortality, the risk of death = [number of deaths] / [all outcomes (deaths + survivors)], and the RR is the ratio of this risk between the two groups. The odds of death = [number of deaths] / [number of non-deaths, i.e., survivors], and the OR is the ratio of these odds between the two groups.

For interpretation of the odds ratio, an OR of 0.5 suggests that patients exposed to a variable of interest had 50% lower odds of developing a specific outcome compared to the control group. Similarly, an OR of 1.7 suggests that the odds were increased by 70%.
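The distinction between RR and OR can be made concrete with a small 2 × 2 table. This is a sketch under stated assumptions: the counts are hypothetical, and the cell layout (a = exposed with outcome, b = exposed without outcome, c = unexposed with outcome, d = unexposed without outcome) is an assumed convention that matches OR = (A × D) / (B × C).

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """RR: risk of the outcome in the exposed divided by risk in the unexposed."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR: (a/b) / (c/d), which simplifies to (a * d) / (b * c)."""
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only.
a, b, c, d = 20, 80, 10, 90

rr = risk_ratio(a, b, c, d)
odds_ratio_value = odds_ratio(a, b, c, d)
print(f"RR = {rr:.2f}, OR = {odds_ratio_value:.2f}")  # prints "RR = 2.00, OR = 2.25"
```

With these counts the outcome is fairly common (20% and 10%), so the OR (2.25) is noticeably further from 1 than the RR (2.00); the two measures move closer together as the outcome becomes rarer.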

Hazard ratio (HR) 

Hazard ratio (HR) is interpreted analogously to an odds ratio (OR). Thus, a hazard ratio of 5 means that the group exposed to a specific risk factor has 5 times the risk of developing the outcome at any given point in time compared with the unexposed group. As further examples, an HR of 2 means that the risk is doubled, while an HR of 0.5 means that the risk is halved (a protective effect).

Number needed to treat (NNT) 

For example, if the NNT is 9, its interpretation can be illustrated by the following sentence: "This study suggests that we need to treat 9 patients to get the desired outcome for 1 patient." On the other hand, if the number needed to harm (NNH) for acute kidney injury (AKI) is 9, its interpretation will be as follows: "This study suggests that if we treat 9 patients, 1 patient will experience the adverse effect (AKI)."
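As a brief sketch of where such a number comes from: the NNT is conventionally calculated as the reciprocal of the absolute risk reduction, rounded up to a whole patient. That formula is the standard definition rather than something stated in the text above, and the function name is illustrative. The input below reuses the 5% ARR from the drug A versus drug B example.

```python
import math


def number_needed_to_treat(arr: float) -> int:
    """NNT = 1 / ARR, rounded up to the next whole patient (ARR as a proportion)."""
    if arr <= 0:
        raise ValueError("ARR must be positive to compute an NNT")
    # round() guards against floating-point noise before rounding up
    return math.ceil(round(1.0 / arr, 9))


nnt = number_needed_to_treat(0.05)  # ARR of 5%
print(f"NNT = {nnt}")  # prints "NNT = 20"
```

The same calculation applied to an absolute risk increase gives the NNH.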

Confidence interval (CI)